Pdf2text

Description

Extracts the text from a PDF using OCR (pytesseract).
Converts the extracted text to an MP3 audio file (gTTS).

Requirements

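pip install Pillow pdf2image pytesseract typer rich click_spinner gtts

pytesseract also requires the Tesseract OCR engine, and pdf2image requires Poppler. Both are installed separately from pip.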

Extract

Reads the PDF at input_path and writes the recognized text to output_path.

python pdf2text.py extract "input_path" "output_path"
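For readers curious how this step might work, here is a minimal sketch of OCR-based extraction with pdf2image and pytesseract. It is not taken from pdf2text.py: the function names `ocr_pages` and `extract`, and the injectable `ocr` parameter, are assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable, Iterable, Optional


def ocr_pages(pages: Iterable, ocr: Optional[Callable] = None) -> str:
    """Run OCR on each page image and join the results with newlines."""
    if ocr is None:
        # Imported lazily; needs the Tesseract binary on the system.
        import pytesseract
        ocr = pytesseract.image_to_string
    return "\n".join(ocr(page) for page in pages)


def extract(input_path: str, output_path: str) -> None:
    """Hypothetical equivalent of the extract command."""
    # Imported lazily; needs Poppler on the system.
    from pdf2image import convert_from_path

    # convert_from_path returns a list of PIL images, one per PDF page.
    pages = convert_from_path(input_path)
    Path(output_path).write_text(ocr_pages(pages), encoding="utf-8")
```

Keeping the OCR function injectable makes it possible to try the page-joining logic without Tesseract installed, for example `ocr_pages(["a", "b"], ocr=str.upper)`.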

Generate

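Reads the text file at input_path and writes the spoken audio to output_path. language is a language code such as en.

python pdf2text.py generate "input_path" "output_path" language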
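The text-to-speech step could look roughly like the sketch below, which uses gTTS. It is an illustration and not the script's actual code: the function name `generate`, the `tts_factory` parameter, and the empty-input check are assumptions. gTTS needs network access when it saves the file.

```python
from pathlib import Path
from typing import Callable, Optional


def generate(input_path: str, output_path: str, language: str = "en",
             tts_factory: Optional[Callable] = None) -> None:
    """Hypothetical equivalent of the generate command."""
    text = Path(input_path).read_text(encoding="utf-8")
    if not text.strip():
        # Assumed behaviour: refuse to synthesize an empty file.
        raise ValueError(f"{input_path} contains no text")
    if tts_factory is None:
        # Imported lazily so the function can be exercised without gtts.
        from gtts import gTTS
        tts_factory = gTTS
    # gTTS(text=..., lang=...) builds the request; save() writes the MP3.
    tts_factory(text=text, lang=language).save(output_path)
```

Calling `generate("test.txt", "test.mp3", "en")` would mirror the command-line example in this README.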

Example

python pdf2text.py extract "test.pdf" "text.txt"

python pdf2text.py generate "test.txt" "test.mp3" en