PDF to Web is a Python project that transforms PDF documents into web-friendly HTML pages. This tool is particularly useful for converting eBooks and other large PDF files into a series of organized, navigable web pages. Each chapter of the PDF is converted into its own HTML page, complete with a chapter header, navigation buttons, and a collapsible side menu for easy access to other chapters.
- Chapter-based Conversion: Breaks down the PDF into chapters, each represented as a separate HTML page.
- Responsive Design: Clean and mobile-friendly design using modern web technologies.
- Hamburger Navigation Menu: Collapsible side menu for easy navigation between chapters.
- Header and Footer: Each page includes a customizable header and footer with book details.
- JavaScript Integration: Includes a script.js file for dynamic functionality such as the hamburger menu.
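To make the side menu feature concrete, here is a minimal sketch of how the list of chapter links might be generated. The function name build_side_menu and the CSS class names are assumptions for illustration, not the code in driver.py; the chapter_N.html link pattern follows the file names shown in the project layout.

```python
def build_side_menu(chapter_count, current_chapter):
    """Return an HTML <nav> with one link per chapter.

    Hypothetical helper: the class names "side-menu" and "active"
    are assumptions, not taken from the project's style.css.
    """
    items = []
    for number in range(1, chapter_count + 1):
        # Mark the chapter being viewed so it can be highlighted in CSS.
        css_class = ' class="active"' if number == current_chapter else ""
        items.append(
            f'  <li{css_class}><a href="chapter_{number}.html">'
            f"Chapter {number}</a></li>"
        )
    return '<nav class="side-menu">\n<ul>\n' + "\n".join(items) + "\n</ul>\n</nav>"


if __name__ == "__main__":
    print(build_side_menu(chapter_count=3, current_chapter=2))
```

Showing and hiding the menu would still be handled on the client side by script.js.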
PDFtoWeb/
│
├── books/
│   ├── book1/
│   │   ├── chapter_1.html
│   │   ├── chapter_2.html
│   │   └── ...
│   │
│   └── book2/
│       ├── chapter_1.html
│       ├── chapter_2.html
│       └── ...
│
├── driver.py     # Main Python script for converting PDFs to HTML
├── index.html    # Home page listing all books
├── style.css     # Main stylesheet for the chapters
├── index.css     # Separate stylesheet for the home page
└── script.js     # JavaScript file for dynamic functionalities
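The layout above implies a simple mapping from a PDF file name to its output files. The sketch below shows one way to compute those paths with pathlib; the function name chapter_output_paths and the books_dir parameter are hypothetical and only illustrate the naming scheme.

```python
from pathlib import Path


def chapter_output_paths(pdf_path, chapter_count, books_dir="books"):
    """Return the HTML path for each chapter of one book.

    Mirrors the layout above:
    books/<PDF name without extension>/chapter_N.html
    """
    book_dir = Path(books_dir) / Path(pdf_path).stem
    return [book_dir / f"chapter_{n}.html" for n in range(1, chapter_count + 1)]


if __name__ == "__main__":
    for path in chapter_output_paths("book1.pdf", chapter_count=2):
        print(path.as_posix())
```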
- Python 3.x
- PyMuPDF (pip install PyMuPDF)
- BeautifulSoup4 (pip install beautifulsoup4)
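Before running the converter, it can help to confirm that both packages are importable. This is a small optional check, not part of the project; the names REQUIRED_MODULES and missing_packages are made up for this example. Note that the import names differ from the pip names: PyMuPDF is imported as fitz and beautifulsoup4 as bs4.

```python
from importlib.util import find_spec

# Maps the pip package name to the module name used in "import ...".
REQUIRED_MODULES = {"PyMuPDF": "fitz", "beautifulsoup4": "bs4"}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, install with: pip install " + " ".join(missing))
    else:
        print("All required packages are installed.")
```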
- Clone the repository:
  git clone https://github.com/yourusername/your-repository-name.git
- Navigate to the project directory:
  cd PDFtoWeb
- Create and activate a virtual environment:
  python -m venv PDFtoWebEnv
  source PDFtoWebEnv/bin/activate # On Windows: PDFtoWebEnv\Scripts\activate
- Install the required dependencies:
  pip install -r requirements.txt
- Place your PDF: Move your PDF file into the project directory (e.g., D:/projects/python_projects/PDFtoWeb/).
- Run the conversion script: Modify the driver.py script to point to your PDF file and adjust settings such as the number of pages per chapter, then run:
  python driver.py
- Access the generated HTML files: The generated HTML files are placed in a folder inside books/ with the same name as the PDF. Open index.html in your browser to see the list of books and start reading.
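The "pages per chapter" setting mentioned in the steps above can be pictured as a fixed-size grouping of pages. The following is a minimal sketch under that assumption; split_into_chapters and extract_page_texts are hypothetical names, and driver.py may split chapters differently. The PyMuPDF import sits inside the extraction function so the splitting logic can be tried without the library installed.

```python
def split_into_chapters(page_texts, pages_per_chapter):
    """Group a list of page texts into chapters of a fixed page count."""
    if pages_per_chapter < 1:
        raise ValueError("pages_per_chapter must be at least 1")
    return [
        page_texts[start:start + pages_per_chapter]
        for start in range(0, len(page_texts), pages_per_chapter)
    ]


def extract_page_texts(pdf_path):
    """Read the plain text of every page using PyMuPDF."""
    import fitz  # PyMuPDF; imported lazily so the splitter works without it

    doc = fitz.open(pdf_path)
    try:
        return [page.get_text() for page in doc]
    finally:
        doc.close()


if __name__ == "__main__":
    pages = ["page 1", "page 2", "page 3", "page 4", "page 5"]
    for number, chapter in enumerate(split_into_chapters(pages, 2), start=1):
        print(f"Chapter {number}: {len(chapter)} page(s)")
```

With five pages and two pages per chapter, the last chapter holds the single leftover page rather than being dropped.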
- Style: Modify style.css to change the appearance of the chapter pages. Use index.css for the home page styling.
- Script: Update script.js for any additional JavaScript functionality you wish to add.
- Book Details: Customize the footer information in the convert_text_to_html function within driver.py.
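As a rough picture of where the footer text ends up, here is a sketch of a page template with a header, navigation links, and a footer. It is not the project's convert_text_to_html function, whose signature is not shown here; render_chapter_page and all of its parameters are assumptions for illustration. The relative paths to style.css and script.js assume the chapter lives two levels below the project root, as in the layout shown earlier.

```python
from html import escape


def render_chapter_page(book_title, chapter_number, chapter_count, body_text,
                        footer_text):
    """Wrap chapter text in a page with a header, navigation, and a footer."""
    # Treat blank-line-separated blocks as paragraphs and escape their text.
    paragraphs = "\n".join(
        f"<p>{escape(block.strip())}</p>"
        for block in body_text.split("\n\n")
        if block.strip()
    )
    # Only link to neighbouring chapters that actually exist.
    links = []
    if chapter_number > 1:
        links.append(f'<a href="chapter_{chapter_number - 1}.html">Previous</a>')
    if chapter_number < chapter_count:
        links.append(f'<a href="chapter_{chapter_number + 1}.html">Next</a>')
    nav_links = " ".join(links)
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        f"<title>{escape(book_title)} - Chapter {chapter_number}</title>\n"
        '<link rel="stylesheet" href="../../style.css">\n'
        "</head>\n<body>\n"
        f"<header><h1>Chapter {chapter_number}</h1></header>\n"
        f"<main>\n{paragraphs}\n</main>\n"
        f"<nav>{nav_links}</nav>\n"
        f"<footer>{escape(footer_text)}</footer>\n"
        '<script src="../../script.js"></script>\n'
        "</body>\n</html>"
    )


if __name__ == "__main__":
    print(render_chapter_page("My Book", 1, 2, "First paragraph.\n\nSecond.",
                              "My Book, by A. Author"))
```

Keeping the footer text as a parameter means book details can change without touching the template itself.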
Feel free to submit issues, fork the repository, and send pull requests. Contributions are welcome and appreciated!
This project is licensed under the MIT License. See the LICENSE file for more details.