Praj-17/PDF-Assessment-Using-ChatGPT

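How to set up the code

Step-1: Set Up the Environment

Create a Python environment, preferably Python 3.9.0, and install the requirements listed in requirements.txt using the following command. The repository's file listing spells this file requirments.txt, so use that name if pip cannot find requirements.txt.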

pip install -r requirements.txt

Step-2: Install Tesseract

Tesseract is a required dependency for performing OCR.

Download and install Tesseract-OCR from the following link:

https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-5.3.3.20231005.exe

Step-3: Set path for Tesseract

Navigate to C:\Program Files\Tesseract-OCR, copy the path, and add it to your system PATH environment variable.

Step-4: Install Poppler

Unzip the Release 23.11.0 folder and move the extracted Poppler 23.11.0 folder to C:\Program Files.

Step-5: Set path for Poppler

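Navigate to C:\Program Files\poppler-23.10.0\Library\bin, copy the path, and add it to your system PATH environment variable. The version in this path (23.10.0) differs from the release named in Step-4 (23.11.0), so use the folder name that actually exists on your machine.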
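The two PATH steps can be checked before running the project. The following is a minimal sketch, not part of the repository: the helper name find_tools is hypothetical, and it assumes the executables of interest are tesseract (the Tesseract CLI) and pdftoppm (one of the utilities shipped with Poppler).

```python
import shutil


def find_tools(names):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # "tesseract" and "pdftoppm" are assumptions about which executables matter here.
    for name, path in find_tools(["tesseract", "pdftoppm"]).items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

Run it from a newly opened terminal, since terminals that were already open do not pick up changes to environment variables.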

Running the code

Navigate to the repository directory and run:

python main.py

By default, it takes a PDF from the data folder and performs OCR using both Tesseract and EasyOCR. You can customize this in the code.
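The default flow described above might look roughly like the sketch below. This is an illustration under assumptions, not the code in main.py: it assumes the pdf2image, pytesseract, easyocr, and numpy packages are installed, and the function names first_pdf and ocr_pdf, along with their parameters, are hypothetical.

```python
from pathlib import Path


def first_pdf(folder="data"):
    """Return the alphabetically first *.pdf in folder, or None if there is none."""
    pdfs = sorted(Path(folder).glob("*.pdf"))
    return pdfs[0] if pdfs else None


def ocr_pdf(pdf_path, use_tesseract=True, use_easyocr=True, dpi=300):
    """Return {engine name: [text of each page]} for the given PDF."""
    # Third-party imports are local so this module loads even if an engine is missing.
    from pdf2image import convert_from_path  # relies on Poppler being on PATH

    pages = convert_from_path(str(pdf_path), dpi=dpi)  # one PIL image per page
    results = {}
    if use_tesseract:
        import pytesseract  # relies on Tesseract being on PATH

        results["tesseract"] = [pytesseract.image_to_string(p) for p in pages]
    if use_easyocr:
        import easyocr
        import numpy as np

        reader = easyocr.Reader(["en"])  # English only; an assumption
        results["easyocr"] = [
            "\n".join(reader.readtext(np.array(p), detail=0)) for p in pages
        ]
    return results


if __name__ == "__main__":
    pdf = first_pdf("data")
    if pdf is not None:
        for engine, texts in ocr_pdf(pdf).items():
            print(engine, "extracted", len(texts), "page(s)")
```

Passing use_tesseract=False or use_easyocr=False restricts the run to a single engine, which is one way the default behavior could be customized.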

A speech recognition module is also integrated and runs after the OCR code. By default, it accepts voice as input and does not accept .mp3 or any other kind of audio file.
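For orientation, voice input of this kind is often captured as in the sketch below. It assumes the third-party SpeechRecognition package (imported as speech_recognition) with PyAudio for microphone access; the README does not confirm which library main.py uses, and the function name transcribe_from_microphone is hypothetical.

```python
def transcribe_from_microphone(language="en-US"):
    """Record one phrase from the default microphone and return its transcript."""
    # Assumption: the SpeechRecognition package; Microphone also requires PyAudio.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Sample background noise for one second to calibrate the energy threshold.
        recognizer.adjust_for_ambient_noise(source, duration=1)
        audio = recognizer.listen(source)
    try:
        # Sends the audio to Google's Web Speech API, so it needs network access.
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        return ""  # the speech could not be understood


if __name__ == "__main__":
    print(transcribe_from_microphone())
```

If file input were added later with this library, its AudioFile class reads WAV, AIFF, and FLAC files, so an .mp3 would need converting first.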

Examples

Refer to the output folder (spelled ouput in the repository) to check the OCR extraction output. The extraction quality is not very good as of now.
