
This project is a versatile Optical Character Recognition (OCR) and translation tool that processes both images and PDF documents, uses NLP to translate the extracted text into different languages, and gives users the option to summarize a document's contents in their preferred language.


RahulJ15/image-pdf-ocr-translation-app


Character Recognition, Translation and Summarization Model

https://adaptxt.streamlit.app/

Overview:

This project is an end-to-end Optical Character Recognition (OCR) and translation model designed to extract text from images and PDFs, and then translate it into multiple languages. The implementation is done in Python using the Tesseract OCR engine, OpenCV for image processing, Googletrans for translation, and Streamlit for the frontend.

Key Features:

  1. OCR Processing:

    • Utilizes Tesseract OCR to extract text from images and PDFs.
    • Implements image preprocessing techniques, including conversion to grayscale, removal of table lines, and noise reduction, to enhance OCR accuracy.
  2. Translation:

    • Translates extracted text into five different languages (Hindi, French, Spanish, Mandarin, English) using the Googletrans API.
  3. PDF Support:

    • Supports PDF extraction, recognizing text from each page and translating it.
  4. User Interaction:

    • Allows the user to choose the target language for translation.

Frontend with Streamlit:

  • Implements a user-friendly interface using Streamlit for easy interaction with the OCR and translation model.
  • Users can upload images or PDFs, and the application displays the recognized text along with translation options.
  • Provides a dropdown menu for selecting the target language, enhancing user customization.
  • Streamlit simplifies the deployment process, making the application accessible through a web browser.
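
As an illustration of the interface described above, here is a minimal Streamlit sketch with a file uploader and a language dropdown. The names `LANGUAGE_CODES`, `is_pdf`, and `render_app` are hypothetical, the language codes are the ones googletrans is assumed to use, and the OCR and translation steps are passed in as callbacks rather than implemented here.

```python
# Display name -> language code. The codes are assumptions of this sketch.
LANGUAGE_CODES = {
    "English": "en",
    "Hindi": "hi",
    "French": "fr",
    "Spanish": "es",
    "Mandarin": "zh-cn",
}


def is_pdf(filename: str) -> bool:
    """Decide by file extension whether an upload should go down the PDF path."""
    return filename.lower().endswith(".pdf")


def render_app(extract_text, translate_text):
    """Draw the page. Both arguments are caller-supplied callables:
    extract_text(data: bytes, pdf: bool) -> str
    translate_text(text: str, dest: str) -> str
    """
    # Imported lazily so the helpers above can be used without Streamlit.
    import streamlit as st

    st.title("OCR and Translation")
    uploaded = st.file_uploader(
        "Upload an image or PDF", type=["png", "jpg", "jpeg", "pdf"]
    )
    language = st.selectbox("Target language", list(LANGUAGE_CODES))

    if uploaded is None:
        return  # Nothing to process until the user uploads a file.

    text = extract_text(uploaded.getvalue(), is_pdf(uploaded.name))
    st.subheader("Recognized text")
    st.text_area("OCR output", text, height=200)

    st.subheader(f"Translation ({language})")
    st.write(translate_text(text, LANGUAGE_CODES[language]))
```

To try it, a script would call `render_app(...)` with real OCR and translation functions at module level and be launched with `streamlit run` followed by the script's file name.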

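Dependencies:

  • Pillow, pytesseract, opencv-python, googletrans, PyPDF2, Streamlit, and NLTK (installed with pip as listed under How to Run), plus a system installation of Tesseract OCR.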

Usage:

  1. Access the OCR and translation model through the Streamlit web interface.
  2. Upload images or PDFs using the provided file upload functionality.
  3. The application processes the input, displays the recognized text, and allows translation into the user-selected language.
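
The flow in the steps above (recognize text, then translate it into the chosen language) can be sketched as a small page-by-page loop. This is an assumption about how the pieces fit together, not the project's code: `process_pages` is a hypothetical helper, and the OCR and translation functions are injected so the loop does not depend on any particular library version.

```python
from typing import Any, Callable, Iterable, List, Tuple


def process_pages(
    pages: Iterable[Any],
    ocr: Callable[[Any], str],
    translate: Callable[[str, str], str],
    dest: str,
) -> List[Tuple[str, str]]:
    """Run OCR on each page and translate the result into language code `dest`.

    Returns one (recognized_text, translated_text) pair per page, in order.
    """
    results = []
    for page in pages:
        text = ocr(page).strip()
        # Skip translation for blank pages so the translator never sees "".
        translated = translate(text, dest) if text else ""
        results.append((text, translated))
    return results
```

With the libraries listed in this README, `ocr` might wrap `pytesseract.image_to_string` and `translate` might wrap a googletrans `Translator`. Note that the googletrans interface differs between releases, so the wrapper should be checked against the installed version.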

How to Run:

  • Ensure the required libraries are installed using:
    pip install pillow
    pip install pytesseract
    pip install opencv-python
    pip install googletrans
    pip install PyPDF2
    pip install streamlit
    pip install nltk
  • Make sure Tesseract OCR is properly installed on your system.
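
One way to check the Tesseract requirement before starting the app is to look for the executable on the system PATH. The helper below is a small sketch using only the standard library. The name `find_tesseract` is hypothetical, and the default executable name `tesseract` is an assumption that holds for typical installations.

```python
import shutil
from typing import Optional, Sequence


def find_tesseract(candidates: Sequence[str] = ("tesseract",)) -> Optional[str]:
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    location = find_tesseract()
    if location is None:
        print("Tesseract was not found on PATH; install it before running the app.")
    else:
        print(f"Using Tesseract at {location}")
```

If Tesseract is installed somewhere that is not on PATH, pytesseract can be pointed at it by assigning the executable's path to `pytesseract.pytesseract.tesseract_cmd`.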
