A collection of tools Standard Ebooks uses to produce its ebooks, including basic setup of ebooks, text processing, and build tools.
# Install some pre-flight dependencies
# lxml requires the following packages for its pip build process: python3-dev libxml2-dev libxslt1-dev zlib1g-dev
sudo apt install -y python3-pip python3-dev libxml2-dev libxslt1-dev zlib1g-dev libxml2-utils librsvg2-bin libimage-exiftool-perl imagemagick epubcheck default-jre inkscape calibre curl git
# Clone the tools repo
git clone https://github.com/standardebooks/tools.git
# Install required fonts
mkdir -p ~/.fonts/
curl -s -o ~/.fonts/LeagueSpartan-Bold.otf "https://raw.githubusercontent.com/theleagueof/league-spartan/master/LeagueSpartan-Bold.otf"
curl -s -o ~/.fonts/OFLGoudyStM.otf "https://raw.githubusercontent.com/theleagueof/sorts-mill-goudy/master/OFLGoudyStM.otf"
curl -s -o ~/.fonts/OFLGoudyStM-Italic.otf "https://raw.githubusercontent.com/theleagueof/sorts-mill-goudy/master/OFLGoudyStM-Italic.otf"
# Refresh the local font cache
sudo fc-cache -fv
# Install python dependencies
pip3 install -r ./tools/requirements.txt
# Install hyphenation dictionaries for the pyhyphen library
python3 -c "exec(\"from hyphen import dictools\\ndictools.install('en_GB')\\ndictools.install('en_US')\")"
These instructions were tested on macOS 10.12 and 10.13. Your mileage may vary. Corrections and fixes to these steps are welcomed, as the SE maintainers don't have access to Macs.
-
Install the Homebrew package manager. Or, if you already have it installed, make sure it's up to date:
brew update
-
Install dependencies:
# Install some pre-flight dependencies
brew install python epubcheck imagemagick librsvg exiftool git
# Clone the tools repo
git clone https://github.com/standardebooks/tools.git
# Install required applications
brew cask install java calibre xquartz inkscape
# Install required fonts
curl -s -o ~/Library/Fonts/LeagueSpartan-Bold.otf "https://raw.githubusercontent.com/theleagueof/league-spartan/master/LeagueSpartan-Bold.otf"
curl -s -o ~/Library/Fonts/OFLGoudyStM.otf "https://raw.githubusercontent.com/theleagueof/sorts-mill-goudy/master/OFLGoudyStM.otf"
curl -s -o ~/Library/Fonts/OFLGoudyStM-Italic.otf "https://raw.githubusercontent.com/theleagueof/sorts-mill-goudy/master/OFLGoudyStM-Italic.otf"
# Install python dependencies
# *********IMPORTANT NOTE*********: PyHyphen currently fails to install on Mac OS when you run this command.
# Don't worry, you can safely ignore PyHyphen's failure, as long as all the other pip packages install correctly.
pip3 install -r ./tools/requirements.txt
Help and pull requests are welcomed!
-
Move some legacy scripts like hyphenate into appropriate libraries/scripts.
-
Some tool functionality should be moved into the SeEpub class. Suggestions on how to better organize SE code into packages/classes are welcome.
-
Try to convert British quote style to American quote style in DIRECTORY/src/epub/text/.
Quotes must already be "typogrified"--i.e. curly.
This script isn't perfect; proofreading is required, especially where closing quotes fall near em-dashes.
-
Build an ebook from a Standard Ebooks source directory.
-
Build ebook cover and titlepage images in a Standard Ebooks source directory and place the output in DIRECTORY/src/epub/images/.
-
Prettify and canonicalize individual XHTML or SVG files, or all XHTML and SVG files in a source directory. Note that this only prettifies the source code; it doesn't perform typography changes.
-
Create a skeleton of a new Standard Ebook.
-
Convert a decimal number to a Roman numeral.
-
Extract an EPUB, MOBI, or AZW3 ebook into ./FILENAME.extracted/ or a target directory.
-
Find words with mismatched diacritics in Standard Ebooks source directories. For example, 'cafe' in one file and 'café' in another.
-
Insert soft hyphens at syllable breaks in an XHTML file.
-
A macro for calling Vim to interactively search and replace a regex on a list of files.
-
Check for various Standard Ebooks style errors.
-
Make a string URL-safe.
-
Modernize the spelling of some archaic words, and update archaic hyphenated compounds to their modern spelling. For example, replace "ash-tray" with "ashtray".
-
Calculate work word count, insert release date if not yet set, and update modified date and revision number.
-
Create a <manifest> and <spine> tag for content.opf based on the passed Standard Ebooks source directory and print to standard output.
-
Calculate the Flesch reading ease for a Standard Ebooks source directory.
-
Recompose a Standard Ebooks source directory into a single HTML5 file, and print to standard output.
-
Increment the specified endnote and all following endnotes by 1.
-
Convert a Roman numeral to a decimal number.
-
Apply some scriptable semantics rules from the Standard Ebooks semantics manual to a Standard Ebooks source directory.
-
Split an XHTML file into many files at all instances of <!--se:split-->, and include a header template for each file.
-
Convert a string to titlecase.
-
Apply some scriptable typography rules from the Standard Ebooks typography manual to a Standard Ebooks source directory.
-
Display Unicode code points, descriptions, and links to more details for each character in a string. Useful for differentiating between different flavors of spaces, dashes, and invisible characters like word joiners.
-
Count the number of words in an HTML file and optionally categorize by length.
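To illustrate the two Roman numeral conversions described in the list above, here is a minimal Python sketch. It is not the tools' own code: the names dec_to_roman, roman_to_dec and ROMAN_PAIRS are hypothetical, and the 1 to 3999 range limit is an assumption made for this example.

```python
# Hypothetical sketch; not taken from the Standard Ebooks tools.
# Value/symbol pairs in descending order, including the subtractive forms.
ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def dec_to_roman(number: int) -> str:
    """Convert an integer from 1 to 3999 to a Roman numeral."""
    if not 0 < number < 4000:
        raise ValueError("number must be between 1 and 3999")
    result = []
    for value, symbol in ROMAN_PAIRS:
        # Emit each symbol as many times as its value fits in what remains.
        count, number = divmod(number, value)
        result.append(symbol * count)
    return "".join(result)


def roman_to_dec(numeral: str) -> int:
    """Convert a Roman numeral to an integer, rejecting non-canonical input."""
    text = numeral.upper()
    total = 0
    position = 0
    for value, symbol in ROMAN_PAIRS:
        # Consume symbols left to right in the same order they are generated.
        while text.startswith(symbol, position):
            total += value
            position += len(symbol)
    # Round-trip check: the parsed value must regenerate the same numeral.
    if position != len(text) or not text or dec_to_roman(total) != text:
        raise ValueError(f"not a canonical Roman numeral: {numeral!r}")
    return total
```

For example, dec_to_roman(1994) gives "MCMXCIV" and roman_to_dec("xiv") gives 14. The round-trip check is a design choice that rejects forms such as "IIII".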
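The mismatched-diacritics check described in the list above can be approximated with Unicode normalization. The sketch below is an assumption about one way to do it, not the tool's actual method, and the function names are hypothetical. It works on plain strings rather than on files in a source directory.

```python
import re
import unicodedata
from collections import defaultdict


def strip_diacritics(word: str) -> str:
    """Decompose the word (NFD) and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def find_mismatched_diacritics(texts):
    """Map each plain word to its accented spellings, when both forms occur.

    `texts` is any iterable of strings, for example the text of each file.
    """
    accented = defaultdict(set)
    plain_seen = set()
    for text in texts:
        # Compose first so an accented letter is one character for the regex.
        normalized = unicodedata.normalize("NFC", text).lower()
        for word in re.findall(r"[^\W\d_]+", normalized):
            base = strip_diacritics(word)
            if base == word:
                plain_seen.add(word)
            else:
                accented[base].add(word)
    return {base: forms for base, forms in accented.items() if base in plain_seen}
```

Given one text containing "cafe" and another containing "café", the result maps "cafe" to the set of accented variants found. Letters that have no Unicode decomposition are treated as distinct letters, not as diacritic variants.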
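The Flesch reading ease score mentioned in the list above is defined as 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words). The sketch below applies that formula to a plain string. The syllable counter is a rough vowel-group heuristic chosen for this example, and the function names are hypothetical; how the actual tool counts syllables and sentences is not stated here.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count runs of consecutive vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Apply the standard Flesch reading ease formula to plain text."""
    # Treat runs of ., ! or ? as sentence boundaries (an approximation).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(word) for word in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
```

Higher scores mean easier text. Because the syllable heuristic is crude, results will differ from those of a dictionary-based counter.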
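For the Unicode inspection described in the list above, Python's standard unicodedata module already provides character names. This is a minimal sketch with a hypothetical function name. It returns code points and names only and leaves out the links to further details that the tool description mentions.

```python
import unicodedata


def describe_characters(text: str):
    """Return (character, 'U+XXXX', name) tuples for each character in text."""
    rows = []
    for ch in text:
        code_point = f"U+{ord(ch):04X}"
        # Some characters, such as control characters, have no name.
        name = unicodedata.name(ch, "<no name>")
        rows.append((ch, code_point, name))
    return rows


if __name__ == "__main__":
    # A letter, a word joiner, and an em dash, written as escapes.
    for ch, code_point, name in describe_characters("a\u2060\u2014"):
        print(code_point, name)
```

Printing the code point and name is what makes a hair space, a no-break space and a word joiner distinguishable, since they look alike or invisible on screen.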
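Counting words in an HTML file, as the last item above describes, amounts to stripping the markup and counting word-like tokens. The sketch below uses only the standard library. The function names are hypothetical, and the category thresholds in particular are illustrative assumptions, not values confirmed by this document.

```python
import re
from html.parser import HTMLParser

# Illustrative thresholds only (assumed for this sketch): upper bound, label.
LENGTH_CATEGORIES = [(7500, "short story"), (17500, "novelette"), (40000, "novella")]


class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def count_words(html: str) -> int:
    """Count words in the text content of an HTML string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    # Keep apostrophes and hyphens inside a word, so "don't" counts once.
    return len(re.findall(r"\w+(?:['\u2019-]\w+)*", text))


def categorize(word_count: int) -> str:
    """Label a word count using the assumed thresholds above."""
    for upper_bound, label in LENGTH_CATEGORIES:
        if word_count < upper_bound:
            return label
    return "novel"
```

Joining text nodes with a space keeps words in adjacent block elements apart, at the cost of splitting a word that has an inline tag in the middle of it.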
Many of these tools act on Standard Ebooks source directories. Such directories have a consistent minimal structure:
.
|-images/
|--cover.jpg
|--cover.source.jpg
|--cover.svg
|--titlepage.svg
|-src/
|--epub/
|---css/
|----core.css
|----local.css
|---images/
|----cover.svg
|----titlepage.svg
|---text/
|----colophon.xhtml
|----imprint.xhtml
|----titlepage.xhtml
|----uncopyright.xhtml
|---content.opf
|---onix.xml
|---toc.xhtml
|--META-INF/
|---container.xml
|--mimetype
|-LICENSE.md
./images/ contains source images for the cover and titlepages, as well as ebook-specific source images. Source images should be in their maximum available resolution, then compressed and placed in ./src/epub/images/ for distribution.
./src/epub/ contains the actual epub files.
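Because many tools assume this layout, a quick check before running them can save time. The sketch below tests a directory against a subset of the tree above. The REQUIRED_PATHS list and the missing_paths function are hypothetical and are not part of the tools. The files directly under ./images/ are left out on the assumption that some of them are produced later in the workflow.

```python
from pathlib import Path

# Paths taken from the tree above, relative to the source directory root.
REQUIRED_PATHS = [
    "images",
    "src/epub/css/core.css",
    "src/epub/css/local.css",
    "src/epub/images",
    "src/epub/text/colophon.xhtml",
    "src/epub/text/imprint.xhtml",
    "src/epub/text/titlepage.xhtml",
    "src/epub/text/uncopyright.xhtml",
    "src/epub/content.opf",
    "src/epub/onix.xml",
    "src/epub/toc.xhtml",
    "src/META-INF/container.xml",
    "src/mimetype",
    "LICENSE.md",
]


def missing_paths(source_dir):
    """Return the required paths that do not exist under source_dir."""
    root = Path(source_dir)
    return [path for path in REQUIRED_PATHS if not (root / path).exists()]
```

An empty list means every listed file and directory is present, and any other result names what is missing.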