Download, convert and organize Gutenberg books for eBook Readers
Latest commit 0a99fe7 Mar 26, 2016 @motoom Updated README and TODO


This is a set of Python scripts that download all
Dutch ebooks from Project Gutenberg, rename them to
human-readable filenames, format them so they display well
on my ebook reader, and sort them into subdirectories for
easier navigation.

Written by Michiel Overtoom,

How to use:

- Run to download the raw texts from a mirror of Project Gutenberg's eBook archive.
- Run to reformat and rename the raw texts.
- Run to distribute them over subdirectories.
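
The rename and distribute steps could look roughly like the sketch below. This is not the code from the scripts themselves: the function names (readable_filename, subdirectory_for, distribute), the "Author - Title.txt" naming pattern and the one-letter subdirectory scheme are all assumptions made for illustration.

```python
import re
import unicodedata
from pathlib import Path


def readable_filename(author: str, title: str, max_length: int = 80) -> str:
    """Build a filesystem-friendly 'Author - Title.txt' name (assumed pattern)."""
    name = f"{author} - {title}"
    # Store accented letters in composed form so names compare consistently.
    name = unicodedata.normalize("NFC", name)
    # Replace characters that are awkward or illegal in filenames.
    name = re.sub(r'[\\/:*?"<>|]+', " ", name)
    # Collapse the runs of whitespace left over from the replacement.
    name = re.sub(r"\s+", " ", name).strip()
    return name[:max_length].rstrip() + ".txt"


def subdirectory_for(filename: str) -> str:
    """Pick a one-letter subdirectory from the first character (assumed scheme)."""
    first = filename[:1].upper()
    return first if first.isalpha() else "_"


def distribute(source_dir: Path, target_dir: Path) -> None:
    """Move every .txt file in source_dir into a per-letter subdirectory."""
    for path in sorted(source_dir.glob("*.txt")):
        bucket = target_dir / subdirectory_for(path.name)
        bucket.mkdir(parents=True, exist_ok=True)
        path.rename(bucket / path.name)
```

With these assumptions, a book by Multatuli would end up under a subdirectory named M, and a title starting with a digit would land in the catch-all "_" directory.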

After that, upload them to your eBook reader, and enjoy!

In March 2016 I reworked this program since it's no longer allowed to scrape
from Gutenberg's main web site. This newer version:

- downloads from a mirror instead of scraping Gutenberg's main web site
- lets you specify the language
- detects the input encoding more reliably
- outputs UTF-8 encoded text files
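
One simple way to approach encoding detection and UTF-8 output is sketched below. The scripts may well use a different method; the candidate encoding list and the names decode_best_effort and convert_to_utf8 are assumptions for illustration only.

```python
from pathlib import Path

# Encodings to try in order (an assumed list). "utf-8-sig" also accepts
# plain UTF-8 and strips a leading byte order mark if one is present.
CANDIDATE_ENCODINGS = ("utf-8-sig", "cp1252")


def decode_best_effort(raw: bytes) -> str:
    """Decode raw bytes with the first candidate encoding that fits."""
    for encoding in CANDIDATE_ENCODINGS:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so it never fails.
    return raw.decode("latin-1")


def convert_to_utf8(source: Path, target: Path) -> None:
    """Read a text file in an unknown encoding and write it back as UTF-8."""
    text = decode_best_effort(source.read_bytes())
    # Normalize Windows line endings while we are at it.
    text = text.replace("\r\n", "\n")
    target.write_bytes(text.encode("utf-8"))
```

Trying strict UTF-8 first works because text in a legacy single-byte encoding that contains accented letters is rarely valid UTF-8, so a successful UTF-8 decode is a fairly strong signal.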