Epublifier

Converts websites into ePub books.

A tool that lets you extract a list of HTML pages from a website and compile them into an ePub book that you can import into your eReader of choice.

Advanced users who can write JavaScript can add parser definitions to customize parsing for any site.

Check out the wiki for usage.

Currently supports the following sites:

Novel Updates
Wuxia World
Most sites from awesome-read-the-docs
Custom sites whose page links can be found through UL/OL elements used as a table of contents, a regex on the link text, or a query selector
Custom web apps with a predefined title (header) element and a clickable Next button
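
As a rough illustration of the "regex on link text" option above, the sketch below filters a list of links down to chapter pages. It is not Epublifier's actual parser API: the function name selectChapterLinks, the { text, href } link shape, and the example URLs are all assumptions made for this example.

```javascript
// Hypothetical helper, NOT part of Epublifier's real API.
// links:   array of plain objects such as { text: "Chapter 1", href: "ch-1" }
// pattern: RegExp tested against the trimmed link text
// baseUrl: page URL used to resolve relative hrefs
function selectChapterLinks(links, pattern, baseUrl) {
  const seen = new Set();
  const chapters = [];
  for (const link of links) {
    const title = link.text.trim();
    // Reset lastIndex so patterns with the g or y flag behave predictably.
    pattern.lastIndex = 0;
    if (!pattern.test(title)) continue;
    // Resolve relative links against the page URL.
    const url = new URL(link.href, baseUrl).href;
    // Skip duplicates, e.g. the same chapter linked from two menus.
    if (seen.has(url)) continue;
    seen.add(url);
    chapters.push({ title, url });
  }
  return chapters;
}
```

In a browser, the input list could be gathered from a table of contents with something like Array.from(document.querySelectorAll("ul a, ol a"), a => ({ text: a.textContent, href: a.getAttribute("href") })), and the selector could be swapped for any other query selector that matches the site's links.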

Installation

Example Usage

Extracting list of pages

Traversing Webapp through next button
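
The general idea of walking a web app by its title element and Next button can be sketched as a simple loop. This is only an illustration under stated assumptions, not how the extension is actually implemented: the page object and its getTitle, getContent, and clickNext methods are hypothetical, as is the maxPages safety limit.

```javascript
// Hypothetical page interface (an assumption, not Epublifier's API):
//   getTitle()   -> string, text of the title (header) element
//   getContent() -> string, HTML of the page currently shown
//   clickNext()  -> Promise<boolean>, resolves to false when no Next button is left
async function collectPages(page, maxPages = 500) {
  const chapters = [];
  // maxPages guards against web apps whose Next button loops forever.
  for (let i = 0; i < maxPages; i++) {
    chapters.push({ title: page.getTitle(), html: page.getContent() });
    // Wait for the click (and any page update) before reading again.
    const moved = await page.clickNext();
    if (!moved) break;
  }
  return chapters;
}
```

In a real browser, clickNext would have to click the chosen button and wait for the new content to render before resolving; how long to wait depends on the site.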

Extracting other documentation

Why build a scraper as an extension instead of a command-line tool, Calibre plugin, etc.?

This is for ad hoc generation of ePub books from websites that don't scrape well with traditional scrapers (think standard request-based command-line scripts, or other Chrome extensions that scrape based on open tabs/windows), for several reasons:

Command-line scrapers and other extensions usually work only for a predefined set of sites; this one is for sites outside that set.
Or they require nontrivial configuration and/or code.
Some sites use JavaScript to dynamically generate or retrieve the text, in which case you need the browser to run the JS. This was the biggest gap for me.
This one runs in the browser, so it may be less likely to be detected and blocked.

I also don't intend this scraper to be robust or to run repeatedly as a scheduled background job; that's why there's a UI for selecting the key elements to scrape. It's meant to be general enough that you can scrape a site fairly easily with a few mouse clicks, even without a configuration for that site.

If the site you're scraping is already handled by other programs or extensions, this one won't perform better, since those are specifically configured for that site. Otherwise, this extension gives you the tools to scrape something once or twice without spending much time coding or configuring.

I don't find myself sticking to the same site a lot, so I wrote this.

Building

Build Environment

Windows 10
NPM version 8.1.2

Build Steps

Install NPM
Run npm install in the base directory
Run npm run build_ff for Firefox
Run npm run build for Chrome

CI/CD

Acknowledgements

jEpub

Donate

PayPal
