Download, convert and organize Gutenberg books for eBook Readers
motoom/gutenberg-ebook-scraping
This is a set of Python scripts that download all Dutch ebooks from Project Gutenberg, rename them to human-readable filenames, format them so they display well on my ebook reader, and toss them into subdirectories for easier navigation.

Written by Michiel Overtoom, motoom@xs4all.nl

How to use:

- Run bulkdownload.py to download the raw texts from a mirror of Project Gutenberg's eBook archive.
- Run gutenberg.py to reformat and rename the raw texts.
- Run toss.py to distribute them over subdirectories.

After that, upload them to your eBook reader, and enjoy!

In March 2016 I reworked this program, since scraping Gutenberg's main web site is no longer allowed. This newer version:

- downloads from a mirror instead of scraping Gutenberg's main web site
- lets you specify the language
- detects input encodings more reliably
- outputs UTF-8 encoded text files
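
The encoding detection and UTF-8 output mentioned above can be illustrated with a minimal sketch. This is not the code from gutenberg.py. The function names `decode_bytes` and `convert_to_utf8`, and the list of candidate encodings, are assumptions made for illustration. The idea is to try strict decoders first and fall back to Latin-1, which accepts any byte sequence.

```python
from pathlib import Path

# Hypothetical candidate list, tried in order; the real scripts may
# detect encodings differently.
CANDIDATE_ENCODINGS = ("utf-8-sig", "cp1252")


def decode_bytes(raw):
    """Return (text, encoding_used) for the raw bytes of a text file."""
    for encoding in CANDIDATE_ENCODINGS:
        try:
            # "utf-8-sig" also strips a leading byte order mark, if any.
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Latin-1 maps every byte to a character, so this never fails.
    return raw.decode("latin-1"), "latin-1"


def convert_to_utf8(source, destination):
    """Read a text file in an unknown encoding and rewrite it as UTF-8."""
    text, encoding = decode_bytes(Path(source).read_bytes())
    Path(destination).write_text(text, encoding="utf-8")
    return encoding
```

Calling `convert_to_utf8("raw/book.txt", "out/book.txt")` (both paths are placeholders) would write a UTF-8 copy and return the name of the encoding that succeeded. Trying UTF-8 before the single-byte encodings matters because text in an older single-byte encoding rarely happens to be valid UTF-8, while the reverse mistake is easy to make.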