A collection of tools Standard Ebooks uses to produce its ebooks, including basic setup of ebooks, text processing, and build tools.
Ubuntu 18.04 (Bionic) users
    # Install some pre-flight dependencies
    # lxml requires the following packages for its pip build process: python3-dev libxml2-dev libxslt1-dev zlib1g-dev
    sudo apt install -y python3-pip python3-dev libxml2-dev libxslt1-dev zlib1g-dev libxml2-utils librsvg2-bin libssl-dev libimage-exiftool-perl imagemagick epubcheck default-jre inkscape calibre curl git

    # Clone the tools repo
    git clone https://github.com/standardebooks/tools.git

    # Install required fonts
    mkdir -p ~/.local/share/fonts/
    curl -s -o ~/.local/share/fonts/LeagueSpartan-Bold.otf "https://raw.githubusercontent.com/theleagueof/league-spartan/master/LeagueSpartan-Bold.otf"
    curl -s -o ~/.local/share/fonts/OFLGoudyStM.otf "https://raw.githubusercontent.com/theleagueof/sorts-mill-goudy/master/OFLGoudyStM.otf"
    curl -s -o ~/.local/share/fonts/OFLGoudyStM-Italic.otf "https://raw.githubusercontent.com/theleagueof/sorts-mill-goudy/master/OFLGoudyStM-Italic.otf"

    # Refresh the local font cache
    sudo fc-cache -fv

    # Install python dependencies
    pip3 install -r ./tools/requirements.txt
macOS users
These instructions were tested on macOS 10.12 and 10.13. Your mileage may vary. Corrections and fixes to these steps are welcome, as the SE maintainers don’t have access to Macs.
Install the Homebrew package manager or, if you already have it installed, make sure it’s up to date by running brew update. Then run the following:
    # Install some pre-flight dependencies
    brew install python epubcheck imagemagick libmagic librsvg exiftool git

    # Clone the tools repo
    git clone https://github.com/standardebooks/tools.git

    # Install required applications
    brew cask install java calibre xquartz inkscape

    # Install required fonts
    curl -s -o ~/Library/Fonts/LeagueSpartan-Bold.otf "https://raw.githubusercontent.com/theleagueof/league-spartan/master/LeagueSpartan-Bold.otf"
    curl -s -o ~/Library/Fonts/OFLGoudyStM.otf "https://raw.githubusercontent.com/theleagueof/sorts-mill-goudy/master/OFLGoudyStM.otf"
    curl -s -o ~/Library/Fonts/OFLGoudyStM-Italic.otf "https://raw.githubusercontent.com/theleagueof/sorts-mill-goudy/master/OFLGoudyStM-Italic.otf"

    # Install python dependencies
    pip3 install -r ./tools/requirements.txt
Help and pull requests are welcomed!
Move some legacy scripts like hyphenate into appropriate libraries/scripts.
Some tool functionality should be moved into the SeEpub class. Suggestions on how to better organize SE code into packages/classes are welcome.
Try to convert British quote style to American quote style in DIRECTORY/src/epub/text/.
Quotes must already be typogrified before running this script. The script isn’t perfect; proofreading is required, especially where closing quotes sit next to em-dashes.
Build an ebook from a Standard Ebooks source directory.
Build ebook cover and titlepage images in a Standard Ebooks source directory and place the output in DIRECTORY/src/epub/images/.
Prettify and canonicalize individual XHTML or SVG files, or all XHTML and SVG files in a source directory. Note that this only prettifies the source code; it doesn’t perform typography changes.
Use Firefox to render and compare XHTML files in an ebook repository. Run on a dirty repository to visually compare the repository’s dirty state with its clean state.
Create a skeleton of a new Standard Ebook.
Convert a decimal number to a Roman numeral.
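As an illustration of the conversion described above, here is a minimal sketch of decimal-to-Roman conversion. It is not the tool's actual code; the function name `to_roman` and the 1 to 3999 range limit are assumptions made for this example.

```python
def to_roman(number: int) -> str:
    """Convert an integer from 1 to 3999 to a Roman numeral (illustrative sketch)."""
    if not 1 <= number <= 3999:
        raise ValueError("number must be between 1 and 3999")

    # Value/symbol pairs in descending order, including subtractive forms.
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]

    parts = []
    for value, symbol in numerals:
        # Take as many copies of this symbol as fit, then carry the remainder.
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


print(to_roman(1994))  # MCMXCIV
```

Listing the subtractive pairs (CM, XC, IV, and so on) in the table keeps the loop a plain greedy division with no special cases.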
Extract an EPUB, MOBI, or AZW3 ebook into ./FILENAME.extracted/ or a target directory.
Find words with mismatched diacritics in Standard Ebooks source directories. For example, cafe in one file and café in another.
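One way to detect this kind of mismatch is to strip diacritics from every word and group the original spellings under the stripped form. The sketch below assumes the file contents have already been read into a dictionary; the function names and the input shape are hypothetical and do not reflect the tool's real interface.

```python
import re
import unicodedata
from collections import defaultdict


def strip_diacritics(word: str) -> str:
    """Remove combining marks, so that 'café' becomes 'cafe'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def find_mismatched_diacritics(texts: dict) -> dict:
    """Map each diacritic-free word to its spellings, keeping only words with more than one.

    `texts` maps a file name to that file's plain text (an assumed input shape).
    """
    variants = defaultdict(set)
    for text in texts.values():
        # Normalize to composed form so accented letters match as single letters.
        normalized = unicodedata.normalize("NFC", text).lower()
        for word in re.findall(r"[^\W\d_]+", normalized):
            variants[strip_diacritics(word)].add(word)
    return {base: sorted(forms) for base, forms in variants.items() if len(forms) > 1}


sample = {
    "chapter-1.xhtml": "We met at the cafe.",
    "chapter-2.xhtml": "The café was closed.",
}
print(find_mismatched_diacritics(sample))
```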
Insert soft hyphens at syllable breaks in an XHTML file.
A macro for calling Vim to interactively search and replace a regex on a list of files.
Check for various Standard Ebooks style errors.
Make a string URL-safe.
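A common approach to making a string URL-safe is to fold it to ASCII, lowercase it, and replace everything else with hyphens. The sketch below shows that approach; the exact rules the tool applies are not stated here, so the function name `make_url_safe` and its behavior are assumptions.

```python
import re
import unicodedata


def make_url_safe(text: str) -> str:
    """Return a lowercase, hyphen-separated ASCII version of `text` (illustrative sketch)."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of characters other than a-z and 0-9 into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    return slug.strip("-")


print(make_url_safe("Les Misérables"))  # les-miserables
```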
Modernize the spelling of some archaic words, and replace words that may be archaically compounded with a dash with a more modern spelling. For example, replace
Calculate work word count, insert release date if not yet set, and update modified date and revision number.
Generate a <spine> tag for content.opf based on the passed Standard Ebooks source directory and print it to standard output.
Calculate the Flesch reading ease for a Standard Ebooks source directory.
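For reference, the standard Flesch reading ease formula is 206.835 - 1.015 × (words / sentences) - 84.6 × (syllables / words). The sketch below applies it to plain text. The syllable counter is a rough vowel-group heuristic chosen for this example, and the sentence and word splitting rules are likewise assumptions, so scores may differ from what the tool reports.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables by counting groups of consecutive vowels (rough heuristic)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text: str) -> float:
    """Compute the Flesch reading ease score of plain English text."""
    # Treat runs of ., ! or ? as sentence boundaries (a simplification).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    syllables = sum(count_syllables(word) for word in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


print(round(flesch_reading_ease("The cat sat on the mat."), 3))
```

Higher scores indicate easier text; short sentences of one-syllable words, as in the sample, score above 100.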
Recompose a Standard Ebooks source directory into a single HTML5 file, and print to standard output.
Increment the specified endnote and all following endnotes by 1.
Convert a Roman numeral to a decimal number.
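A minimal sketch of Roman-to-decimal conversion follows. It is not the tool's actual code: the function name `from_roman` is an assumption, and the sketch only rejects unknown characters rather than validating that the numeral is well formed.

```python
def from_roman(numeral: str) -> int:
    """Convert a Roman numeral string to an integer (illustrative sketch)."""
    values = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

    total = 0
    largest_seen = 0
    # Walk right to left: a symbol smaller than one already seen to its
    # right is subtractive (the I in IV), otherwise it is added.
    for char in reversed(numeral.upper()):
        if char not in values:
            raise ValueError(f"invalid Roman numeral character: {char!r}")
        value = values[char]
        if value < largest_seen:
            total -= value
        else:
            total += value
            largest_seen = value
    return total


print(from_roman("MCMXCIV"))  # 1994
```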
Apply some scriptable semantics rules from the Standard Ebooks semantics manual to a Standard Ebooks source directory.
Split an XHTML file into many files at all instances of <!--se:split-->, and include a header template for each file.
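The splitting step can be pictured as cutting the source string at each marker and prefixing every piece with the header. The sketch below returns the pieces as strings instead of writing files; the function name and the header text are hypothetical placeholders, not the template the tool uses.

```python
def split_at_markers(xhtml: str, header: str, marker: str = "<!--se:split-->") -> list:
    """Split `xhtml` at each marker and prepend `header` to every non-empty piece."""
    sections = [section.strip() for section in xhtml.split(marker)]
    return [header + "\n" + section for section in sections if section]


source = "<p>Chapter one.</p><!--se:split--><p>Chapter two.</p>"
# Placeholder header for illustration only.
for piece in split_at_markers(source, "<!-- header template goes here -->"):
    print(piece)
    print("---")
```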
Convert a string to titlecase.
Apply some scriptable typography rules from the Standard Ebooks typography manual to a Standard Ebooks source directory.
Display Unicode code points, descriptions, and links to more details for each character in a string. Useful for differentiating between different flavors of spaces, dashes, and invisible characters like word joiners.
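Python's standard `unicodedata` module is enough to sketch this kind of lookup. The example below lists the code point and official name of each character; the function name `describe_characters` and the output shape are assumptions, and the links to further details that the tool provides are left out.

```python
import unicodedata


def describe_characters(text: str) -> list:
    """Return a (code point, Unicode name) pair for each character in `text`."""
    rows = []
    for char in text:
        code_point = f"U+{ord(char):04X}"
        # Some characters, such as control codes, have no name in the database.
        name = unicodedata.name(char, "<unnamed>")
        rows.append((code_point, name))
    return rows


# A letter, a word joiner, a hair space, and an em dash, written as escapes
# because several of them are invisible or easy to confuse on screen.
for code_point, name in describe_characters("a\u2060\u200a\u2014"):
    print(code_point, name)
```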
Count the number of words in an HTML file and optionally categorize by length.
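A rough sketch of such a count, using only the standard library, might look like the following. The definition of a word (letters, optionally joined by apostrophes or hyphens) and the function name `count_words` are assumptions for this example, and the tool's own counting rules may differ.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text between tags, ignoring the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def count_words(html: str):
    """Return (total word count, {word length: count}) for an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with spaces keeps adjacent block elements apart, at the cost of
    # splitting a word that has an inline tag in the middle of it.
    text = " ".join(parser.chunks)
    words = re.findall(r"[^\W\d_]+(?:['’-][^\W\d_]+)*", text)
    return len(words), dict(Counter(len(word) for word in words))


total, by_length = count_words("<p>The <b>quick</b> brown fox.</p>")
print(total, by_length)
```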
What a Standard Ebooks source directory looks like
Many of these tools act on Standard Ebooks source directories. Such directories have a consistent minimal structure:
    .
    |-images/
    |--cover.jpg
    |--cover.source.jpg
    |--cover.svg
    |--titlepage.svg
    |-src/
    |--epub/
    |---css/
    |----core.css
    |----local.css
    |---images/
    |----cover.svg
    |----titlepage.svg
    |---text/
    |----colophon.xhtml
    |----imprint.xhtml
    |----titlepage.xhtml
    |----uncopyright.xhtml
    |---content.opf
    |---onix.xml
    |---toc.xhtml
    |--META-INF/
    |---container.xml
    |--mimetype
    |-LICENSE.md
./images/ contains source images for the cover and titlepages, as well as ebook-specific source images. Source images should be in their maximum available resolution, then compressed and placed in ./src/epub/images/ for distribution.
./src/epub/ contains the actual epub files.
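The minimal structure listed above can be checked mechanically. The sketch below reports which of those paths are missing from a directory; `REQUIRED_PATHS` and `missing_paths` are hypothetical names introduced for this example and are not part of the tools.

```python
from pathlib import Path

# Paths taken from the minimal structure listed in this section.
REQUIRED_PATHS = [
    "images/cover.jpg",
    "images/cover.source.jpg",
    "images/cover.svg",
    "images/titlepage.svg",
    "src/epub/css/core.css",
    "src/epub/css/local.css",
    "src/epub/images/cover.svg",
    "src/epub/images/titlepage.svg",
    "src/epub/text/colophon.xhtml",
    "src/epub/text/imprint.xhtml",
    "src/epub/text/titlepage.xhtml",
    "src/epub/text/uncopyright.xhtml",
    "src/epub/content.opf",
    "src/epub/onix.xml",
    "src/epub/toc.xhtml",
    "src/META-INF/container.xml",
    "src/mimetype",
    "LICENSE.md",
]


def missing_paths(directory: str) -> list:
    """Return the required paths that do not exist under `directory`."""
    root = Path(directory)
    return [relative for relative in REQUIRED_PATHS if not (root / relative).exists()]


if __name__ == "__main__":
    # Check the current directory and list anything that is absent.
    for relative in missing_paths("."):
        print("missing:", relative)
```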