Automated HTML conditioning tool for a Microsoft Word -> Kindle conversion pipeline

An HTML conditioning and splitting tool that aims to make writing books for the Kindle in Microsoft Word a little easier.

  • Strips out superfluous formatting, styling and unused tags written out by Word's HTML exporter
  • Provides a system for circumventing the compression/resizing of embedded images during HTML export, allowing the use of the original source artwork
  • Auto-splits a single document into 'Welcome', 'ToC' and 'Content' files, and sets up simple OPF and NCX files for proper MOBI file generation
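
As a rough illustration of the auto-splitting step, the sketch below cuts one HTML body into the three parts. The `<!-- SPLIT:TOC -->` and `<!-- SPLIT:CONTENT -->` markers and the `split_document` function are hypothetical names invented for this example; the tool's actual split points may be detected quite differently.

```python
import re

# Hypothetical split markers, assumed for this sketch only.
SPLIT_MARKER = re.compile(r"<!--\s*SPLIT:(TOC|CONTENT)\s*-->")

def split_document(body_html):
    """Split one HTML body into 'welcome', 'toc' and 'content' parts."""
    parts = {"welcome": "", "toc": "", "content": ""}
    current = "welcome"  # everything before the first marker is the welcome page
    pos = 0
    for match in SPLIT_MARKER.finditer(body_html):
        # Text up to this marker belongs to the part we are currently in.
        parts[current] += body_html[pos:match.start()]
        current = match.group(1).lower()
        pos = match.end()
    # Whatever follows the last marker belongs to the final part.
    parts[current] += body_html[pos:]
    return {name: text.strip() for name, text in parts.items()}
```

Each returned part could then be wrapped in its own HTML file and referenced from the OPF and NCX files.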

In reality it's basically a big pile of regexes. An example manuscript in Word 2010 format is included; it contains the ideal layout, styles and test content.
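
To give a flavour of what such a regex pile might look like, here is a minimal sketch in Python that strips a few kinds of markup commonly written out by Word's HTML exporter. The rule list and the `strip_word_markup` name are assumptions made for illustration, not the tool's actual rules.

```python
import re

# Illustrative cleanup rules (assumed, not taken from the tool itself).
CLEANUP_RULES = [
    # Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    (re.compile(r"<!--\[if.*?<!\[endif\]-->", re.DOTALL | re.IGNORECASE), ""),
    # Office namespace tags such as <o:p> and </o:p>
    (re.compile(r"</?o:[^>]*>", re.IGNORECASE), ""),
    # Inline style attributes, double- or single-quoted
    (re.compile(r"\s+style=(\"[^\"]*\"|'[^']*')", re.IGNORECASE), ""),
    # Word's built-in class names, e.g. class=MsoNormal
    (re.compile(r"\s+class=\"?Mso\w+\"?", re.IGNORECASE), ""),
    # Attribute-less spans left behind by the rules above (non-nested only)
    (re.compile(r"<span>(.*?)</span>", re.DOTALL | re.IGNORECASE), r"\1"),
]

def strip_word_markup(html):
    """Apply each cleanup rule in order and return the conditioned HTML."""
    for pattern, replacement in CLEANUP_RULES:
        html = pattern.sub(replacement, html)
    return html
```

The order of the rules matters: style attributes are removed before the span rule runs, so spans that only carried styling become bare and can be unwrapped.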