Download, convert and organize Gutenberg books for eBook Readers
README
TODO
bulkdownload.py
gutenberg.py
toss.py

README

This is a set of Python scripts which download all
Dutch ebooks from Project Gutenberg, rename them to
human-readable filenames, format them so they display well
on my ebook reader, and toss them into subdirectories for
easier navigation.

Written by Michiel Overtoom, motoom@xs4all.nl

How to use:

- Run bulkdownload.py to download the raw texts from a mirror of Project Gutenberg's eBook archive.
- Run gutenberg.py to reformat and rename the raw texts.
- Run toss.py to distribute them over subdirectories.
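
The three steps above can be chained from a small driver script. This is only a sketch: the script names come from this README, but running them without arguments from the current directory is an assumption, and the helper names build_commands and run_pipeline are made up for illustration.

```python
import subprocess
import sys

# Script names as listed in this README. Running them without
# arguments is an assumption; check each script before relying on it.
STEPS = ["bulkdownload.py", "gutenberg.py", "toss.py"]


def build_commands(steps=STEPS):
    """Return one command line per step, using the current interpreter."""
    return [[sys.executable, step] for step in steps]


def run_pipeline(steps=STEPS, runner=subprocess.run):
    """Run each step in order and stop at the first one that fails."""
    for command in build_commands(steps):
        # check=True raises CalledProcessError on a non-zero exit status,
        # so a failed download is never followed by reformatting.
        runner(command, check=True)


# To run all three steps, call:
#     run_pipeline()
```

Stopping at the first failure keeps later steps from working on an incomplete set of files.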

After that, upload them to your eBook reader, and enjoy!

In March 2016 I reworked this program, since scraping Gutenberg's main
web site is no longer allowed. This newer version:

- downloads from a mirror instead of scraping from Gutenberg's main web site
- lets you specify the language
- detects the input encoding more reliably
- outputs UTF-8 encoded text files
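
One common way to handle the last two points is to try a few candidate encodings in order and write the result back as UTF-8. The sketch below is not necessarily how gutenberg.py does it; the candidate list and the function names decode_text and convert_to_utf8 are assumptions chosen for illustration.

```python
# Candidate encodings, tried in order. This list is an assumption, not
# taken from gutenberg.py. "utf-8-sig" accepts UTF-8 with or without a
# byte order mark; "latin-1" accepts any byte sequence, so it goes last.
CANDIDATE_ENCODINGS = ("utf-8-sig", "cp1252", "latin-1")


def decode_text(raw, candidates=CANDIDATE_ENCODINGS):
    """Decode bytes with the first candidate that fits.

    Returns a (text, encoding_name) tuple.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings fit")


def convert_to_utf8(source_path, target_path):
    """Rewrite a text file as UTF-8 and return the encoding that was guessed."""
    with open(source_path, "rb") as source:
        text, encoding = decode_text(source.read())
    # newline="" leaves the original line endings untouched.
    with open(target_path, "w", encoding="utf-8", newline="") as target:
        target.write(text)
    return encoding
```

Trying the strictest encoding first matters: valid UTF-8 rarely decodes wrongly as UTF-8, whereas latin-1 never fails and would hide a bad guess if it were tried earlier.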