Convert any PDF into an audiobook with this Python application! It reads text-based PDFs directly and falls back to OCR for scanned documents.
- 📚 Read any PDF file (text-based or scanned)
- 🔍 Automatic OCR for scanned documents
- 🗣️ Text-to-Speech in Spanish
- 🧹 Smart text cleaning for better audio output
- 📱 Simple GUI for file selection
- 🎯 Page-by-page reading
Make sure you have Python installed and the following dependencies:
pip install pdfplumber pytesseract pdf2image gtts playsound
You'll also need:
- Tesseract OCR installed on your system (for scanned documents)
- poppler-utils (for PDF to image conversion)
- Clone this repository or download the files
- Create a virtual environment:
python -m venv venv 
- Activate the virtual environment:
- Windows:
.\venv\Scripts\activate 
- Linux/Mac:
source venv/bin/activate
 
- Install the required packages:
pip install -r requirements.txt 
- Run the script:
python main.py 
- Select your PDF file using the file dialog
- Wait while the program processes each page
- Listen to your document being read aloud!
- The program first attempts to extract text directly from the PDF
- If no text is found (scanned document), it automatically switches to OCR
- Text is cleaned and processed to remove unwanted characters and formatting
- Each page is converted to speech and played sequentially
- Temporary audio files are automatically cleaned up after playback
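As a rough illustration of the steps above, here is a minimal sketch. It is not the actual contents of `main.py`: the function names (`clean_text`, `extract_page_text`, `speak_page`) and the specific cleaning rules are assumptions made for this example, and `lang="spa"` assumes the Spanish language data for Tesseract is installed.

```python
import os
import re
import tempfile


def clean_text(raw: str) -> str:
    """Normalize extracted text so it sounds more natural when spoken."""
    # Rejoin words that were hyphenated across a line break.
    text = re.sub(r"-\s*\n\s*", "", raw)
    # Replace symbols a speech engine would stumble over (bullets, pipes, etc.).
    text = re.sub(r"[^\w\s.,;:!?¿¡()-]", " ", text)
    # Collapse newlines and repeated spaces into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def extract_page_text(pdf_path: str, page_index: int) -> str:
    """Try direct extraction first; fall back to OCR if the page has no text."""
    import pdfplumber  # third-party, imported lazily

    with pdfplumber.open(pdf_path) as pdf:
        text = pdf.pages[page_index].extract_text() or ""
    if text.strip():
        return text

    # No embedded text: render the page to an image and run Tesseract on it.
    import pytesseract
    from pdf2image import convert_from_path

    images = convert_from_path(
        pdf_path, first_page=page_index + 1, last_page=page_index + 1
    )
    return pytesseract.image_to_string(images[0], lang="spa")


def speak_page(text: str, lang: str = "es") -> None:
    """Convert one page of text to speech, play it, then delete the audio file."""
    from gtts import gTTS
    from playsound import playsound

    fd, path = tempfile.mkstemp(suffix=".mp3")
    os.close(fd)
    try:
        gTTS(text=text, lang=lang).save(path)
        playsound(path)
    finally:
        os.remove(path)  # clean up even if playback fails
```

With these helpers, reading one page would look like `speak_page(clean_text(extract_page_text("book.pdf", 0)))`, where `book.pdf` is a placeholder file name.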
- The current version supports Spanish-language text-to-speech
- For OCR to work, make sure Tesseract is properly installed
- Large PDFs may take longer to process, especially when OCR is needed
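If OCR does not seem to work, a quick way to check the system dependencies is to look for their executables on your `PATH`. The snippet below is a small sketch and not part of this project; `missing_tools` is a hypothetical helper name, and it assumes that `pdftoppm` (one of the poppler command-line tools) is a good indicator that poppler is installed.

```python
import shutil


def missing_tools(tools=("tesseract", "pdftoppm")):
    """Return the names of required command-line tools not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("Tesseract and poppler were found.")
```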
Feel free to:
- Open issues
- Submit pull requests
- Suggest improvements
- Report bugs
This project is licensed under the MIT License. See the LICENSE file for details.