PDF to Web: A Python-based project that converts PDF documents into organized, web-friendly HTML pages. This tool is designed to facilitate easier reading and navigation of eBooks and other PDF files by breaking them into chapters with a clean, responsive design. The generated web pages include features such as a collapsible navigation menu, etc.

nthnwllm/PDFtoWeb

PDF to Web

Overview

PDF to Web is a Python project that transforms PDF documents into web-friendly HTML pages. This tool is particularly useful for converting eBooks and other large PDF files into a series of organized, navigable web pages. Each chapter of the PDF is converted into its own HTML page, complete with a chapter header, navigation buttons, and a collapsible side menu for easy access to other chapters.

Features

  • Chapter-based Conversion: Breaks down the PDF into chapters, each represented as a separate HTML page.
  • Responsive Design: Clean and mobile-friendly design using modern web technologies.
  • Hamburger Navigation Menu: Collapsible side menu for easy navigation between chapters.
  • Header and Footer: Each page includes a customizable header and footer with book details.
  • JavaScript Integration: Includes a script.js file for dynamic functionalities such as the hamburger menu.

Directory Structure

PDFtoWeb/
│
├── books/
│   ├── book1/
│   │   ├── chapter_1.html
│   │   ├── chapter_2.html
│   │   └── ...
│   │
│   └── book2/
│       ├── chapter_1.html
│       ├── chapter_2.html
│       └── ...
│
├── driver.py        # Main Python script for converting PDFs to HTML
├── index.html       # Home page listing all books
├── style.css        # Main stylesheet for the chapters
├── index.css        # Separate stylesheet for the home page
└── script.js        # JavaScript file for dynamic functionalities
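
As an illustration of how this layout could feed the home page, here is a minimal sketch that scans books/ and builds a list of links to each book's first chapter. The README does not say whether index.html is generated by driver.py or maintained by hand, so the helper names `list_books` and `render_book_links` are hypothetical and not part of the project.

```python
from html import escape
from pathlib import Path
from urllib.parse import quote


def list_books(books_dir):
    """Return the names of sub-folders that contain a chapter_1.html file."""
    root = Path(books_dir)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and (entry / "chapter_1.html").exists()
    )


def render_book_links(book_names, books_dir_name="books"):
    """Build an HTML list linking to the first chapter of each book."""
    items = "\n".join(
        # quote() makes the folder name URL-safe; escape() makes it HTML-safe.
        f'  <li><a href="{books_dir_name}/{quote(name)}/chapter_1.html">'
        f"{escape(name)}</a></li>"
        for name in book_names
    )
    return f"<ul>\n{items}\n</ul>"
```

Calling `render_book_links(list_books("books"))` from the project root would produce markup that could be pasted into, or written to, the home page.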

Prerequisites

  • Python 3.x
  • PyMuPDF (pip install PyMuPDF)
  • BeautifulSoup4 (pip install beautifulsoup4)

Installation

  1. Clone the repository:

    git clone https://github.com/nthnwllm/PDFtoWeb.git
  2. Navigate to the project directory:

    cd PDFtoWeb
  3. Create and activate a virtual environment:

    python -m venv PDFtoWebEnv
    source PDFtoWebEnv/bin/activate  # On Windows: PDFtoWebEnv\Scripts\activate
  4. Install the required dependencies (PyMuPDF and BeautifulSoup4, as listed under Prerequisites):

    pip install -r requirements.txt

Usage

  1. Place your PDF: Move your PDF file into the project directory (e.g., D:/projects/python_projects/PDFtoWeb/).

  2. Configure and run the conversion script:

    Edit driver.py so that it points to your PDF file, adjust settings such as the number of pages per chapter, and then run:

    python driver.py
  3. Access the generated HTML files:

    The generated HTML files will be placed in a folder inside books/ with the same name as the PDF. You can open index.html in your browser to see the list of books and start reading.
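
To make the "pages per chapter" setting concrete, the following is a minimal sketch of how a PDF's pages could be grouped into fixed-size chapters. It is not the code in driver.py: the function names `read_page_texts` and `split_into_chapters` are hypothetical, and the sketch assumes that chapters are simply consecutive runs of pages. Only `read_page_texts` needs PyMuPDF, which it imports lazily.

```python
from typing import List


def read_page_texts(pdf_path: str) -> List[str]:
    """Return the plain text of every page in the PDF (requires PyMuPDF)."""
    import fitz  # PyMuPDF; imported here so the helper below works without it

    doc = fitz.open(pdf_path)
    try:
        return [page.get_text() for page in doc]
    finally:
        doc.close()


def split_into_chapters(page_texts: List[str], pages_per_chapter: int) -> List[str]:
    """Join consecutive pages into chapters of at most pages_per_chapter pages."""
    if pages_per_chapter < 1:
        raise ValueError("pages_per_chapter must be at least 1")
    return [
        "\n".join(page_texts[start:start + pages_per_chapter])
        for start in range(0, len(page_texts), pages_per_chapter)
    ]
```

With five pages and `pages_per_chapter=2`, this grouping yields three chapters, the last of which holds the single leftover page.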

Customization

  • Style: Modify style.css to change the appearance of the chapter pages. Use index.css for the home page styling.
  • Script: Update script.js for any additional JavaScript functionality you wish to add.
  • Book Details: Customize the footer information in the convert_text_to_html function within driver.py.
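
For orientation when customizing, here is a hedged sketch of what a chapter page template with a header, navigation buttons, and a footer might look like. It is not the project's convert_text_to_html function: the name `render_chapter_page`, its parameters, and the relative paths to style.css and script.js (inferred from the directory structure above) are all assumptions.

```python
from html import escape


def render_chapter_page(book_title, chapter_number, total_chapters, body_text, footer_text):
    """Build one chapter page as an HTML string (hypothetical helper)."""
    # Treat blank-line-separated blocks as paragraphs and escape their text.
    paragraphs = "\n".join(
        f"      <p>{escape(block.strip())}</p>"
        for block in body_text.split("\n\n")
        if block.strip()
    )
    # Only link to neighbouring chapters that actually exist.
    nav_links = []
    if chapter_number > 1:
        nav_links.append(f'<a href="chapter_{chapter_number - 1}.html">Previous</a>')
    if chapter_number < total_chapters:
        nav_links.append(f'<a href="chapter_{chapter_number + 1}.html">Next</a>')
    nav = "\n      ".join(nav_links)
    title = escape(book_title)
    return f"""<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title>{title} - Chapter {chapter_number}</title>
    <link rel="stylesheet" href="../../style.css">
  </head>
  <body>
    <header><h1>Chapter {chapter_number}</h1></header>
    <main>
{paragraphs}
    </main>
    <nav>
      {nav}
    </nav>
    <footer>{escape(footer_text)}</footer>
    <script src="../../script.js"></script>
  </body>
</html>
"""
```

Changing the footer in a template like this only requires passing a different `footer_text`; the markup and class names in the real driver.py may differ.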

Contributing

Feel free to submit issues, fork the repository, and send pull requests. Contributions are welcome and appreciated!

License

This project is licensed under the MIT License. See the LICENSE file for more details.
