
liaspas/epub_scraper


epub_scraper

A simple script to extract, clean and format .epub text.
Paragraphs are separated with a single line break \n, and chapters with a chapter break ***.
Curly quotes are replaced with straight quotes, and the Unicode ellipsis character (…) with three periods (...).
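
The formatting rules above can be sketched roughly as follows. This is a minimal illustration and not the code in epub_scraper.py: the names `normalize_text` and `join_chapters` are hypothetical, the replacement table is an assumption about which characters are handled, and placing `***` on its own line between chapters is also an assumption.

```python
from typing import List

# Hypothetical replacement table: curly quotes and the Unicode ellipsis
# mapped to plain ASCII equivalents (an assumption, not the script's own table).
REPLACEMENTS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2026": "...",  # horizontal ellipsis
}
_TABLE = str.maketrans(REPLACEMENTS)


def normalize_text(text: str) -> str:
    """Replace curly quotes and the Unicode ellipsis with ASCII forms."""
    return text.translate(_TABLE)


def join_chapters(chapters: List[List[str]]) -> str:
    """Join paragraphs with a single line break and chapters with ***.

    `chapters` is a list of chapters, each a list of paragraph strings.
    Empty paragraphs are skipped.
    """
    bodies = []
    for paragraphs in chapters:
        cleaned = [normalize_text(p.strip()) for p in paragraphs if p.strip()]
        bodies.append("\n".join(cleaned))
    # Assumption: the chapter break sits on its own line.
    return "\n***\n".join(bodies)
```

For example, `join_chapters([["One.", "Two."], ["Three."]])` would produce four lines: the two paragraphs of the first chapter, a `***` line, and the paragraph of the second chapter.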

Run with:

python3 epub_scraper.py -i {input_dir} -o {output_dir}

If a directory is not specified, the script looks for .epub files in ./input and writes the .txt files to ./output.
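
One way the `-i` and `-o` options and their defaults could be handled is sketched below. It is an assumption-laden illustration rather than the script's actual argument handling: `parse_args` and `plan_outputs` are hypothetical names, and giving each output file the same stem as its input is a guess.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse -i/-o, falling back to ./input and ./output when omitted."""
    parser = argparse.ArgumentParser(
        description="Extract, clean and format .epub text."
    )
    parser.add_argument("-i", dest="input_dir", default="./input",
                        help="folder containing .epub files")
    parser.add_argument("-o", dest="output_dir", default="./output",
                        help="folder for the generated .txt files")
    return parser.parse_args(argv)


def plan_outputs(input_dir, output_dir):
    """Pair each .epub in input_dir with a .txt path in output_dir.

    Assumption: the output keeps the input file's stem.
    """
    in_dir, out_dir = Path(input_dir), Path(output_dir)
    return [(p, out_dir / (p.stem + ".txt"))
            for p in sorted(in_dir.glob("*.epub"))]
```

Calling `parse_args([])` yields the default folders, while `parse_args(["-i", "books", "-o", "text"])` overrides both.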
