This repository was archived by the owner on May 21, 2026. It is now read-only.

veckencshtein/novel-reader

Novel Reader

A Python toolkit for scraping web novel chapters and generating self-contained offline HTML readers with compressed content.

Overview

This project consists of two main components:

  1. Scraper (scraper.py) — Scrapes novel chapters from web novel sites using Selenium with stealth capabilities to bypass anti-bot protection. The scraper navigates to a novel's web page, extracts chapter titles and content, then follows "next chapter" links automatically. Chapters are saved to a local SQLite database.

  2. Reader Generator (generate_reader.py) — Loads chapters from the database, gzip-compresses each chapter's content, and embeds them as base64-encoded data inside a Jinja2-rendered HTML file. The resulting single-file HTML reader uses the Compression Streams API to decompress chapters on the fly in the browser — making it lightweight and fully offline.
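The compression step described for the reader generator can be sketched roughly as follows. This is a minimal illustration using only the standard library; the function names `encode_chapter` and `decode_chapter` are hypothetical and are not taken from generate_reader.py. The decode half is written in Python only to show the round trip; in the generated reader the equivalent work would be done in the browser with the Compression Streams API.

```python
import base64
import gzip


def encode_chapter(text: str) -> str:
    """Gzip-compress chapter text and return it as an ASCII base64 string.

    Hypothetical helper: the real generator may structure this differently.
    """
    compressed = gzip.compress(text.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


def decode_chapter(payload: str) -> str:
    """Reverse encode_chapter: base64-decode, then gunzip, then decode UTF-8."""
    return gzip.decompress(base64.b64decode(payload)).decode("utf-8")


# Repetitive prose compresses well, so the embedded payload is much smaller
# than the original text even after the base64 overhead.
chapter = "It was a dark and stormy night. " * 200
payload = encode_chapter(chapter)
assert decode_chapter(payload) == chapter
print(len(chapter), "->", len(payload))
```

Base64 adds roughly a third to the compressed size, but it lets the binary gzip data sit safely inside an HTML or JavaScript string literal.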

Requirements

  • Python 3.8+
  • Google Chrome (for scraping)

Python Dependencies

Jinja2==3.1.6
selenium==4.40.0
selenium_stealth==1.0.6
webdriver_manager==4.0.2

Installation

git clone https://github.com/veckencshtein/novel-reader-fork.git
cd novel-reader-fork
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Usage

Scraping Chapters

# Using command-line arguments
python scraper.py --title "my_novel" --url "https://example.com/novel/123"

# Resume from a specific chapter
python scraper.py -t "my_novel" -u "https://..." -s 50

# Debug mode (prints content without saving)
python scraper.py -t "my_novel" -u "https://..." --debug

# Use default values defined in the script
python scraper.py

Flag            Description
-t, --title     Novel title (used as the database table name)
-u, --url       Starting URL of the novel
-s, --start     Chapter number to resume from (default: 0)
-d, --db        Database file path (default: novel.db)
--debug         Print scraped content without saving
-v, --verbose   Enable debug-level logging
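
For readers who want to see how the scraper flags listed above might map onto `argparse`, here is a minimal sketch. It mirrors only the flag names and defaults documented in this README; `build_parser` is a hypothetical name, and the actual scraper.py may define its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented scraper flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="Scrape web novel chapters.")
    parser.add_argument("-t", "--title",
                        help="Novel title (used as the database table name)")
    parser.add_argument("-u", "--url", help="Starting URL of the novel")
    parser.add_argument("-s", "--start", type=int, default=0,
                        help="Chapter number to resume from")
    parser.add_argument("-d", "--db", default="novel.db",
                        help="Database file path")
    parser.add_argument("--debug", action="store_true",
                        help="Print scraped content without saving")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable debug-level logging")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is
# reproducible when run as a script.
args = build_parser().parse_args(
    ["-t", "my_novel", "-u", "https://example.com/novel/123", "-s", "50"]
)
print(args.title, args.start, args.db, args.debug)
```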

Generating the Reader

# Interactive mode — pick a novel and configure options
python generate_reader.py

# Non-interactive with all options
python generate_reader.py -d novel.db -t "my_novel" -r pre -o output

# Split output by file size (default 10MB per file)
python generate_reader.py -t "my_novel" -s

# Split with a custom size limit (5MB)
python generate_reader.py -t "my_novel" -s 5

# Split by chapter count (100 chapters per file)
python generate_reader.py -t "my_novel" --split-by-chapters 100

Generated HTML files are saved to the output/ directory by default.
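
The two splitting modes shown above (by file size and by chapter count) can be illustrated with a simple greedy grouping. This is a sketch under assumptions: `split_chapters` is a hypothetical function, the sizes are taken to be per-chapter byte counts after compression, and the real generator may also account for template overhead or choose boundaries differently.

```python
from typing import List, Optional


def split_chapters(sizes: List[int],
                   max_bytes: Optional[int] = None,
                   max_chapters: Optional[int] = None) -> List[List[int]]:
    """Group chapter indices into consecutive output files.

    A new file starts when adding the next chapter would push the current
    file past max_bytes, or when the current file already holds
    max_chapters chapters. A single oversized chapter still gets its own file.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    current_bytes = 0
    for index, size in enumerate(sizes):
        too_big = (max_bytes is not None and len(current) > 0
                   and current_bytes + size > max_bytes)
        too_many = max_chapters is not None and len(current) >= max_chapters
        if too_big or too_many:
            groups.append(current)
            current, current_bytes = [], 0
        current.append(index)
        current_bytes += size
    if current:
        groups.append(current)
    return groups


# Five chapters of 4 bytes each, split by a 10-byte limit and by count.
print(split_chapters([4, 4, 4, 4, 4], max_bytes=10))
print(split_chapters([4, 4, 4, 4, 4], max_chapters=2))
```

Keeping chapters in consecutive groups means each output file covers a contiguous range, which keeps "next chapter" navigation within a file simple.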

Features

Scraper

  • Stealth mode to bypass anti-bot protection
  • Automatic ChromeDriver management
  • Resumable scraping from any chapter
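
As a rough illustration of how resumable scraping can sit on top of SQLite, the sketch below stores chapters in a table named after the novel and looks up the last saved chapter number. The schema (`number`, `title`, `content`) and all function names are assumptions made for this example; they are not the `DatabaseManager` API from utils.py.

```python
import sqlite3


def quote_identifier(name: str) -> str:
    """Quote a table name for SQLite; identifiers cannot be bound as parameters."""
    return '"' + name.replace('"', '""') + '"'


def ensure_table(conn: sqlite3.Connection, table: str) -> None:
    """Create the per-novel table if it does not exist (hypothetical schema)."""
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {quote_identifier(table)} "
        "(number INTEGER PRIMARY KEY, title TEXT, content TEXT)"
    )


def save_chapter(conn: sqlite3.Connection, table: str,
                 number: int, title: str, content: str) -> None:
    """Insert a chapter, overwriting any earlier copy with the same number."""
    ensure_table(conn, table)
    conn.execute(
        f"INSERT OR REPLACE INTO {quote_identifier(table)} "
        "(number, title, content) VALUES (?, ?, ?)",
        (number, title, content),
    )
    conn.commit()


def last_saved_chapter(conn: sqlite3.Connection, table: str) -> int:
    """Return the highest saved chapter number, or 0 if nothing is stored yet."""
    ensure_table(conn, table)
    row = conn.execute(
        f"SELECT MAX(number) FROM {quote_identifier(table)}"
    ).fetchone()
    return row[0] or 0


conn = sqlite3.connect(":memory:")
save_chapter(conn, "my_novel", 1, "Chapter 1", "First chapter text")
save_chapter(conn, "my_novel", 2, "Chapter 2", "Second chapter text")
print(last_saved_chapter(conn, "my_novel"))
```

A value like this could be passed to `--start` to continue an interrupted run, though whether the scraper looks it up automatically is not stated here.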

Reader Generator

  • File splitting by size or chapter count for upload-limited platforms
  • Dark mode with theme persistence
  • Adjustable font size (12px–32px)
  • Searchable chapter dropdown with keyboard navigation
  • Chapter prefetching for faster navigation
  • Mobile-friendly responsive design

Project Structure

novel-reader-fork/
├── scraper.py              # Web scraper for novel chapters
├── generate_reader.py      # HTML reader generator
├── utils.py                # Shared utilities (DatabaseManager, Chapter)
├── requirements.txt        # Python dependencies
└── templates/
    ├── reader.html.j2      # Jinja2 HTML template
    ├── script.js           # Reader JavaScript (navigation, decompression, UI)
    └── styles.css          # Reader styles

License

This project is currently not licensed.

About

A project that scrapes novels and generates an offline HTML reader.
