fanfic2ebook is a simple tool with a simple purpose: to make it as easy as possible for me to copy a fanfic to my pocket eBook reader and run out the door. (At the moment, I’m too busy to apply my usual care for other users, but I’m picky enough that it should be simple and fairly usable.) If you skip the format conversion, it also works as a simple tool for archiving fanfiction.
The calibre integration (used for LRF and ePub output) has only been tested on Linux, but a friend of mine successfully uses the basic downloading and bundling functionality as input for his Windows-based text-to-speech system. Theoretically, it should work on any operating system where the dependencies are met (Windows, Linux, OS X, etc.).
Basic example usage (For detailed instructions, see the output of
fanfic2lrf http://www.fanfiction.net/s/2830860/1/ http://www.fanfiction.net/s/1744410/1/
fanfic2ebook is licensed under the GNU GPL 2.0 or later and is available as a Zip or Tar archive or by cloning the git repository. Ready-to-run Windows executables are also available. Please report any bugs in the issue tracker.
Features:
- Downloads entire stories from Fanfiction.net, Twisting the Hellmouth, or FicWad when given only arbitrary chapter URLs.
- Stores downloaded chapters in folders named after the story title (for human-friendliness) and won’t re-download pages when you re-run it, aside from the page referenced by the URL you gave. That page is used to check for new chapters, and cache-control headers are respected if you have httplib2.
- Strips away the site template so it doesn’t cause problems on pocket reading devices and can merge the chapters into a single HTML file with a hyperlinked table of contents.
- Can automatically call calibre’s html2lrf or html2epub commands with the proper arguments so your generated eBooks will have appropriate metadata.
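As a rough illustration of the chapter-merging feature, the sketch below combines already-cleaned chapters into one HTML document with a hyperlinked table of contents. It is not fanfic2ebook’s actual code: the function name `merge_chapters`, the `chapter-N` anchor scheme, and the input shape (a list of title/body pairs) are all assumptions made for the example, and it is written for Python 3 rather than the Python 2.x the tool targets.

```python
from html import escape


def merge_chapters(story_title, chapters):
    """Merge (chapter_title, body_html) pairs into one HTML document.

    Hypothetical helper for illustration only. body_html is assumed to
    have had the site template stripped away already.
    """
    toc_items = []
    sections = []
    for number, (title, body_html) in enumerate(chapters, start=1):
        anchor = "chapter-%d" % number
        safe_title = escape(title)
        # Each table-of-contents entry links to its chapter's anchor.
        toc_items.append('<li><a href="#%s">%s</a></li>' % (anchor, safe_title))
        sections.append('<h2 id="%s">%s</h2>\n%s' % (anchor, safe_title, body_html))

    return "\n".join([
        "<html><head><title>%s</title></head><body>" % escape(story_title),
        "<h1>%s</h1>" % escape(story_title),
        "<ol>",
        "\n".join(toc_items),
        "</ol>",
        "\n".join(sections),
        "</body></html>",
    ])


if __name__ == "__main__":
    print(merge_chapters("Example Story", [
        ("Prologue", "<p>Once upon a time.</p>"),
        ("Chapter 1", "<p>The story continues.</p>"),
    ]))
```

Plain in-page anchors are used here on the assumption that they are the simplest linking mechanism for a converter to follow.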
Dependencies:
- Python 2.x (I forget which version, but I’m generally conservative enough that the system default will do fine.)
- httplib2 (Optional but recommended. Provides HTTP caching and compression support.)
- calibre (Optional. Only required for conversion to non-HTML formats.)
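An optional dependency like httplib2 is commonly handled with a guarded import and a standard-library fallback. The sketch below shows that pattern under stated assumptions: the `fetch` function and the `.cache` directory name are hypothetical, and the code targets Python 3 rather than the Python 2.x the tool itself uses, so it should not be read as how fanfic2ebook actually does it.

```python
import urllib.request

try:
    import httplib2  # optional: provides HTTP caching and compression
except ImportError:
    httplib2 = None


def fetch(url, cache_dir=".cache"):
    """Return the response body for url as bytes (hypothetical helper)."""
    if httplib2 is not None:
        # Passing a directory name enables httplib2's on-disk cache,
        # which honours the server's cache-control headers.
        response, content = httplib2.Http(cache_dir).request(url)
        return content
    # Fallback without caching when httplib2 is not installed.
    with urllib.request.urlopen(url) as reply:
        return reply.read()
```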
Known Bugs and Caveats:
- The FicWad scraper crashes if you give it any URL other than that of the story’s last chapter.
- ePub output hasn’t been tested. (My Sony Reader PRS-505 works best with LRF, so there was no urgent need to test ePub.)
To-Do:
- Build a proper website with proper pseudo-screenshot usage examples and a sample of the HTML-format output.
- Add a scraper for MediaMiner.
- Implement character substitution to compensate for missing glyphs in the PRS-505’s default font. (The PRS-505 only has accented Latin glyphs for Western European languages. Glyphs like ō are used in some anime fanfiction.)
- Implement conversion of foo</p><p>bar to work around a shortcoming in html2lrf. (It doesn’t produce a paragraph break the way browser renderers do in such situations.)
- Extend the metadata extraction so site categories can be used to populate collections on my Reader.
- Extend the scraper for Twisting the Hellmouth so it detects title images and uses them as book covers for html2lrf and html2epub.
- Support gzip/bzip2 compression on stored fics. (I have a lot of fics and they’re part of my nightly incremental backups.)
- Support storing single-chapter fics as a single file rather than a folder, and automatically upgrading a fic from single-chapter (file) to multi-chapter (folder).
- Set up a test suite.
- Finish re-architecting the code to meet my usual standard. (It was initially written in a hurry.)
- Add support for plaintext output using either an internal converter or lynx -dump.
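The planned character substitution for missing glyphs could be approached along these lines. This is a minimal sketch, not an implemented feature: the function name is hypothetical, and Latin-1 is used only as a stand-in for "glyphs the PRS-505 font has", which is an assumption about the font rather than a verified fact. Characters outside that set are decomposed and have their combining marks dropped, so ō becomes o. The code is written for Python 3.

```python
import unicodedata


def substitute_missing_glyphs(text, supported_encoding="latin-1"):
    """Replace characters outside supported_encoding with unaccented forms.

    Hypothetical helper; supported_encoding only approximates the set of
    glyphs the reader's font can display.
    """
    result = []
    for char in text:
        try:
            char.encode(supported_encoding)
        except UnicodeEncodeError:
            # Decompose into base character(s) plus combining marks,
            # then drop the marks.
            decomposed = unicodedata.normalize("NFKD", char)
            stripped = "".join(
                c for c in decomposed if not unicodedata.combining(c)
            )
            # Keep the original character if nothing usable remains.
            result.append(stripped or char)
        else:
            # The font is assumed to have this glyph; keep it unchanged.
            result.append(char)
    return "".join(result)


if __name__ == "__main__":
    print(substitute_missing_glyphs("Sh\u014dnen caf\u00e9"))
```

Characters with no decomposition, such as curly quotes or CJK text, pass through unchanged, so a real implementation would probably still need an explicit substitution table for those.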