A tiny Python script (main.py) that opens a file picker, lets you choose a PDF file, and reads the text out loud using your system TTS engine.
This repository contains a single script that demonstrates a simple PDF-to-speech utility using PyPDF2 to extract text and pyttsx3 for text-to-speech.
- Opens a file dialog to select a PDF file.
- Extracts text from each page using PyPDF2.
- Speaks each page's extracted text using pyttsx3.
The core logic lives in main.py:
- Uses tkinter.filedialog.askopenfilename() to let you pick a PDF.
- Uses PyPDF2.PdfReader to read pages and page.extract_text() to get text.
- Uses pyttsx3 to speak the extracted text.
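
The extraction step can be separated from the speech step, which makes it easier to check what a PDF actually yields before anything is spoken. The helper below is a minimal sketch and not part of main.py: the name extract_page_texts is hypothetical, and it assumes only that the reader exposes .pages and that each page has extract_text(), as PyPDF2.PdfReader does.

```python
def extract_page_texts(reader):
    """Return (texts, empty_pages) for a PyPDF2-style reader.

    texts: one string per page, with None normalized to "".
    empty_pages: 1-based numbers of pages that yielded no text.
    """
    texts, empty_pages = [], []
    for number, page in enumerate(reader.pages, start=1):
        # extract_text() may give None or whitespace for image-only pages.
        text = (page.extract_text() or "").strip()
        texts.append(text)
        if not text:
            empty_pages.append(number)
    return texts, empty_pages
```

With a real file this would be called as extract_page_texts(PdfReader(path)). A long empty_pages list is a hint that the PDF is scanned and has no text layer.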
- Python 3.7+ (tested on Windows)
- The following Python packages:
  - pyttsx3
  - PyPDF2
Note: tkinter is part of the standard library on most official Python builds for Windows; if you used a minimal or custom build that lacks tkinter, install a Python distribution that includes it.
Open PowerShell and run:

    pip install pyttsx3 PyPDF2

(If you use a virtual environment, activate it first.)
Run the script from the project folder:

    python main.py

A file selection dialog will open. Choose a PDF file. The script will extract text page by page and speak it aloud.
Stopping the script:
- You can stop the reading by focusing the terminal and pressing Ctrl+C.
- Closing the terminal or stopping Python will halt the speech.
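
As written, Ctrl+C ends the script with a traceback. One way to make the stop cleaner is to catch KeyboardInterrupt around the reading loop and call the engine's stop() method. This is a sketch, not the current behavior of main.py: speak_pages is a hypothetical helper, and how quickly the interrupt is delivered while speech is playing depends on the TTS backend.

```python
def speak_pages(engine, texts):
    """Speak each non-empty text in order; return how many pages were spoken.

    `engine` is expected to behave like a pyttsx3 engine
    (say, runAndWait, stop).
    """
    spoken = 0
    try:
        for text in texts:
            if not text:
                continue  # skip pages with no extractable text
            engine.say(text)
            engine.runAndWait()  # blocks until this page has been spoken
            spoken += 1
    except KeyboardInterrupt:
        # Ctrl+C lands here; ask the engine to drop anything still queued.
        engine.stop()
        print(f"Stopped after {spoken} page(s).")
    return spoken
```

In the script this would be called with pyttsx3.init() as the engine and the per-page results of page.extract_text() as texts.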
- Text extraction depends on the PDF contents. If a PDF contains scanned pages (images), PyPDF2's extract_text() will likely return None or empty strings; in that case you need OCR (for example using pytesseract) to extract text from images.
- The script does not currently handle errors such as selecting a non-PDF file, missing read permissions, or a canceled file dialog. Consider adding try/except blocks and basic validation.
- pyttsx3 uses your platform's TTS backend (SAPI5 on Windows). If you want different voices or to adjust rate/volume, you can modify the script to set properties on the engine (see pyttsx3 docs). Example:

    engine = pyttsx3.init()
    engine.setProperty('rate', 150)  # words per minute
    voices = engine.getProperty('voices')
    engine.setProperty('voice', voices[0].id)  # choose a voice

Possible improvements:
- Add CLI options to choose file path from the command line (for automation) and to adjust TTS voice/rate/volume.
- Add error handling for file selection failures and PDF read errors.
- Support OCR fallback (using pytesseract) for scanned PDFs.
- Provide a simple GUI with play/pause/stop controls instead of blocking the terminal.
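
The first two items could start from something like the sketch below, which uses only the standard library. The function names (parse_args, check_pdf_path) and the --rate and --volume flags are hypothetical and do not exist in main.py today. The empty-path check relies on askopenfilename() returning an empty value when the dialog is canceled.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse an optional PDF path plus optional speech settings."""
    parser = argparse.ArgumentParser(description="Read a PDF aloud.")
    parser.add_argument("pdf", nargs="?", default=None,
                        help="path to a PDF; omit to use the file dialog")
    parser.add_argument("--rate", type=int, default=None,
                        help="speech rate in words per minute")
    parser.add_argument("--volume", type=float, default=None,
                        help="volume from 0.0 to 1.0")
    return parser.parse_args(argv)


def check_pdf_path(path):
    """Return an error message for an unusable path, or None if it looks fine."""
    if not path:
        return "No file selected."  # e.g. the dialog was canceled
    if not path.lower().endswith(".pdf"):
        return f"Not a PDF file: {path}"
    if not os.path.isfile(path):
        return f"File not found: {path}"
    if not os.access(path, os.R_OK):
        return f"No read permission: {path}"
    return None
```

A script using this would fall back to askopenfilename() when args.pdf is None, print the message and exit if check_pdf_path returns one, and still wrap PdfReader(path) in try/except, since a file with a .pdf extension can be corrupt.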
The current main.py is small and looks like this:

    import pyttsx3
    from PyPDF2 import PdfReader
    from tkinter.filedialog import askopenfilename

    book = askopenfilename()
    pdfreader = PdfReader(book)
    pages = len(pdfreader.pages)
    player = pyttsx3.init()
    for num in range(pages):
        page = pdfreader.pages[num]
        text = page.extract_text()
        if text:
            player.say(text)
            player.runAndWait()

This README and any added small scripts are provided under the MIT License.