GitHub - skald1311/Summarize-Snap: Transform images with text into a concise summary using Tesseract OCR and Google's Pegasus model

Summarize Snap

Transform images with text into a concise summary using Tesseract OCR and Google's Pegasus model
VIDEO DEMO

Report Bug · Request Feature

Table of Contents

About The Project
- Built With
Getting Started
- Installation
Contributing
License
Contact

About The Project

Summarize Snap turns images that contain text into concise written summaries. It first extracts the text from the image and then summarizes it, so a snapshot of a magazine article, a wiki page, or any other image containing text can be condensed into a short summary quickly.

Key Features:

Image-to-Text Conversion: Summarize Snap uses Tesseract OCR (Optical Character Recognition) to convert images containing text into editable text. This first step extracts the textual content from the image and provides the input for summarization.
Advanced Text Summarization: Summarize Snap uses Google's Pegasus model to summarize the extracted text. Pegasus is an abstractive summarization model trained on large amounts of data, and it condenses lengthy passages into concise and coherent summaries. The model used in this project was trained on the cnn_dailymail dataset.
User-Friendly Interface: Summarize Snap boasts an intuitive and user-friendly interface, making it accessible to both tech-savvy users and newcomers. Simply upload an image with text, and the tool takes care of the rest, ensuring a seamless user experience from start to finish.
Versatility and Application: From students seeking to grasp the main ideas of dense academic texts to professionals needing quick insights from business documents, Summarize Snap finds application across various domains and sectors.
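The two steps described above (OCR, then summarization) can be sketched roughly as follows. This is a minimal illustration, not the code in `src`: the function names are hypothetical, the Hugging Face checkpoint name `google/pegasus-cnn_dailymail` is an assumption based on the dataset mentioned above, and the heavy libraries are imported lazily so the text-cleaning helper can be tried without them.

```python
import re
from typing import Callable


def normalize_ocr_text(raw: str) -> str:
    """Collapse the line breaks and extra spaces that OCR output usually contains."""
    # Rejoin words hyphenated across a line break, e.g. "summari-\nzation".
    joined = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Treat every remaining run of whitespace (including newlines) as one space.
    return re.sub(r"\s+", " ", joined).strip()


def ocr_image(image_path: str) -> str:
    """Extract text with Tesseract (needs pytesseract, Pillow, and a Tesseract install)."""
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(image_path))


def summarize(text: str, model_name: str = "google/pegasus-cnn_dailymail") -> str:
    """Summarize text with Pegasus (needs transformers, torch, sentencepiece).

    The checkpoint name is an assumption, not taken from the project source.
    """
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name)
    # Truncate long inputs to the model's maximum input length.
    batch = tokenizer([text], truncation=True, padding="longest", return_tensors="pt")
    summary_ids = model.generate(**batch)
    summary = tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0]
    # This checkpoint tends to mark sentence breaks with a literal "<n>".
    return summary.replace("<n>", " ")


def summarize_image(
    image_path: str,
    ocr: Callable[[str], str] = ocr_image,
    summarizer: Callable[[str], str] = summarize,
) -> str:
    """Run OCR on an image, clean the text, and return its summary."""
    text = normalize_ocr_text(ocr(image_path))
    if not text:
        raise ValueError("no text found in image")
    return summarizer(text)
```

Calling `summarize_image("page.png")` would run the full pipeline on a local image (the file name is only an example). Passing the OCR and summarizer steps as parameters keeps each stage replaceable, which is also convenient for trying the flow without downloading the model.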

Experience the future of text summarization with Summarize Snap. Whether you're a researcher, a student, a professional, or simply someone looking to extract valuable information from images, this project offers a revolutionary solution at your fingertips. Embrace the synergy of Tesseract OCR and Google's Pegasus model for an unparalleled summarization experience.

Unlock the potential of images as a source of succinct knowledge with Summarize Snap today. Transform visual content into actionable insights effortlessly and elevate your information processing game.

(back to top)

Built With

(back to top)

Installation

A live demo isn't available because, unfortunately, I couldn't get Tesseract to install properly on Render.com. The deployment doesn't work, but the link is here regardless: NOT WORKING

Video demo here

The local version works fine. The instructions are below.

To get a local copy up and running, follow these steps.

Click the green Code button on the repository page

Download ZIP

Extract the file

Make sure all of the files are in the same folder

Install Tesseract manually

Latest installer for Windows: https://github.com/UB-Mannheim/tesseract/wiki

For other OS: https://tesseract-ocr.github.io/tessdoc/Installation.html

Search for "Edit the system environment variables" and open it -> Environment Variables -> Path -> New -> add the path to Tesseract-OCR (usually C:\Program Files\Tesseract-OCR) -> OK

In Environment Variables -> New -> Variable name: TESSDATA_PREFIX | Variable value: C:\Program Files\Tesseract-OCR\tessdata -> OK
Open cmd -> change directory to the "src" folder -> create a virtual environment (the commands below are for Windows)
```
py -3 -m venv .venv
.venv\Scripts\activate
```
Install all the dependencies

pip install -r requirements.txt

If this doesn't work, install the packages directly instead:

pip install transformers torch sentencepiece pytesseract Flask Flask-Reuploaded Flask-WTF

Run the command below in the terminal, from the "src" folder with the virtual environment activated
```
flask --app app run
```
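If the app starts but OCR fails, the Tesseract steps above are the usual suspects. The following is a small, hypothetical helper (not part of the repository) that checks the two things those steps configure: that a `tesseract` executable is reachable through PATH and that `TESSDATA_PREFIX` points to an existing folder. It uses only the Python standard library.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def check_tesseract_setup(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
    isdir: Callable[[str], bool] = os.path.isdir,
) -> List[str]:
    """Return a list of problems found with the local Tesseract configuration."""
    problems = []
    # The PATH step should make the "tesseract" executable discoverable.
    if which("tesseract") is None:
        problems.append("tesseract executable not found on PATH")
    # The TESSDATA_PREFIX step should point at the tessdata folder.
    tessdata = env.get("TESSDATA_PREFIX")
    if not tessdata:
        problems.append("TESSDATA_PREFIX is not set")
    elif not isdir(tessdata):
        problems.append(f"TESSDATA_PREFIX points to a missing folder: {tessdata}")
    return problems


if __name__ == "__main__":
    issues = check_tesseract_setup()
    print("Tesseract setup looks fine" if not issues else "\n".join(issues))
```

Environment variable changes on Windows only apply to terminals opened afterwards, so it may help to open a new cmd window before running the check.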

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request
