Download, convert and organize Gutenberg books for eBook Readers
Latest commit 0a99fe7 Mar 26, 2016 @motoom Updated README and TODO
README Updated README and TODO Mar 26, 2016
TODO Updated README and TODO Mar 26, 2016
bulkdownload.py Adaptation for eBook mirror archive Mar 25, 2016
gutenberg.py Enhanced encoding Mar 26, 2016
toss.py Handle subdirectories Mar 26, 2016

README

This is a set of Python scripts that downloads all
Dutch ebooks from Project Gutenberg, renames them to
human-readable filenames, formats them so they display well
on my ebook reader, and tosses them into subdirectories for
easier navigation.

Written by Michiel Overtoom, motoom@xs4all.nl

How to use:

- Run bulkdownload.py to download the raw texts from a mirror of Project Gutenberg's eBook archive.
- Run gutenberg.py to reformat and rename the raw texts.
- Run toss.py to distribute them over subdirectories.

After that, upload them to your eBook reader, and enjoy!
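
If you prefer a single command, the three steps can be chained from a small driver script. This is only a sketch: it assumes Python 3.5 or later, that all three scripts sit in the current directory, and that each runs without command-line arguments (the README does not say whether they take any). The names build_commands and run_pipeline are made up for this example.

```python
import subprocess
import sys

# Script names come from the README; running them without arguments is an assumption.
STEPS = ["bulkdownload.py", "gutenberg.py", "toss.py"]


def build_commands(steps=STEPS, python=sys.executable):
    """Return one command line per step, using the given interpreter."""
    return [[python, step] for step in steps]


def run_pipeline(commands):
    """Run the commands in order and stop at the first one that fails."""
    for cmd in commands:
        print("Running:", " ".join(cmd))
        # check=True raises CalledProcessError on a non-zero exit status,
        # so a failed download is never followed by the reformat step.
        subprocess.run(cmd, check=True)


# Usage (hypothetical): run_pipeline(build_commands())
```

Stopping at the first failure keeps a half-finished download from being reformatted and tossed.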

In March 2016 I reworked this program, because scraping Gutenberg's main
web site is no longer allowed. This newer version:

- downloads from a mirror instead of scraping Gutenberg's main web site
- lets you specify the language
- detects the input encoding more reliably
- outputs UTF-8 encoded text files
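
To illustrate the last two points, here is a minimal sketch of one way to guess an input encoding and write UTF-8. It is not the code from gutenberg.py: the candidate list (utf-8, then cp1252, then latin-1 as a fallback) and the function names decode_text and convert_to_utf8 are assumptions made for this example, and it targets Python 3.

```python
def decode_text(raw, candidates=("utf-8", "cp1252")):
    """Decode bytes with the first candidate encoding that accepts them.

    Returns a (text, encoding_used) tuple. latin-1 maps every byte to a
    character, so it serves as a fallback that cannot fail.
    """
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    return raw.decode("latin-1"), "latin-1"


def convert_to_utf8(src_path, dst_path):
    """Read a raw text file, guess its encoding, and rewrite it as UTF-8."""
    with open(src_path, "rb") as f:
        text, used = decode_text(f.read())
    # newline="" keeps the original line endings untouched.
    with open(dst_path, "w", encoding="utf-8", newline="") as f:
        f.write(text)
    return used
```

Trying utf-8 first works because text in a legacy 8-bit encoding that contains accented characters is rarely valid UTF-8, so a successful strict decode is a strong hint.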