Download, convert and organize Gutenberg books for eBook Readers http://www.michielovertoom.com/python…
This is a set of Python scripts that download all Dutch ebooks from Project Gutenberg, rename them to human-readable filenames, format them so they display well on my ebook reader, and toss them into subdirectories for easier navigation.

Written by Michiel Overtoom, firstname.lastname@example.org

How to use:

- Run bulkdownload.py to download the raw texts from a mirror of Project Gutenberg's eBook archive.
- Run gutenberg.py to reformat and rename the raw texts.
- Run toss.py to distribute them over subdirectories.

After that, upload them to your eBook reader, and enjoy!

In March 2016 I reworked this program since it's no longer allowed to scrape from Gutenberg's main web site. This newer version:

- downloads from a mirror instead of scraping from Gutenberg's main web site
- lets you specify the language
- has better input encoding detection
- outputs UTF-8 encoded text files
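
To illustrate the encoding detection and UTF-8 output mentioned above, here is a minimal sketch. It is not the code from gutenberg.py or toss.py: the function names, the list of candidate encodings, and the first-letter subdirectory scheme are all assumptions made for the example.

```python
# Illustrative sketch only; names and strategy are hypothetical,
# not taken from the actual scripts.

# Candidate encodings, tried in order. latin-1 accepts every byte,
# so it acts as a last-resort fallback. (Assumed list.)
CANDIDATES = ("utf-8", "cp1252", "latin-1")


def decode_best_effort(raw, candidates=CANDIDATES):
    """Return (text, encoding) using the first candidate that decodes cleanly."""
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the input")


def reencode_to_utf8(raw):
    """Decode raw bytes with the best-effort guess and return UTF-8 bytes."""
    text, _ = decode_best_effort(raw)
    return text.encode("utf-8")


def subdir_for(filename):
    """Pick a subdirectory name from the first character of the filename.

    Assumed scheme: one directory per initial letter, '_' for anything else.
    """
    first = filename[:1].upper()
    return first if first.isalpha() else "_"
```

For example, `reencode_to_utf8(b"\xe9\xe9n")` treats the input as cp1252 (it is not valid UTF-8) and returns the UTF-8 bytes for "één". Trying strict UTF-8 first is a common heuristic because text in a legacy single-byte encoding rarely happens to be valid UTF-8.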