VisionParse AI

AI-powered OCR web application for extracting, processing, and analyzing text from images and PDF documents.

🚀 Overview

VisionParse AI is a full-stack OCR (Optical Character Recognition) application, built with React and FastAPI, that uses Tesseract OCR to extract text from images and PDFs.

The application includes smart text processing, background OCR tasks, PDF support, export functionality, and a modern responsive UI.

✨ Features

📤 Drag & Drop File Upload
🖼️ Image OCR Processing
📄 PDF Text Extraction
⚡ Background OCR Processing
📊 OCR Confidence Scores
🔍 Search Within Extracted Text
🧠 Smart Text Cleanup & Processing
📑 TXT / PDF / DOCX / JSON Export
🌙 Dark Mode UI
📱 Responsive Design
🔐 JWT Authentication
📋 History Dashboard
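
As a rough illustration of the confidence score feature, the sketch below averages per-word confidences from a Tesseract-style result. It assumes the backend reads Tesseract output through pytesseract's `image_to_data(..., output_type=Output.DICT)`, which returns parallel `text` and `conf` lists; the README does not say how scores are actually computed, and `mean_confidence` is a hypothetical helper name.

```python
from statistics import mean
from typing import Optional


def mean_confidence(ocr_data: dict) -> Optional[float]:
    """Average the per-word confidences in a Tesseract-style result dict.

    Assumes parallel "text" and "conf" lists (hypothetical input shape,
    modeled on pytesseract's dictionary output).
    """
    scores = []
    for text, conf in zip(ocr_data["text"], ocr_data["conf"]):
        value = float(conf)  # confidences may arrive as strings or numbers
        # Tesseract reports -1 for layout boxes that contain no word.
        if value >= 0 and str(text).strip():
            scores.append(value)
    return round(mean(scores), 2) if scores else None


# Hand-written sample standing in for real OCR output.
sample = {"text": ["", "Invoice", "2024", " "], "conf": ["-1", "96", "88.5", "-1"]}
print(mean_confidence(sample))  # → 92.25
```

Skipping the `-1` entries matters: they mark page, block, and line boxes rather than words, and would drag the average down if included.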

🛠️ Tech Stack

Frontend

React 18
Vite
Tailwind CSS

Backend

FastAPI
Python
SQLAlchemy

OCR & Processing

Tesseract OCR
OpenCV
pdf2image
Poppler

Authentication

JWT Authentication
Passlib / bcrypt

📂 Project Structure

OCR/
├── frontend/
├── backend/
├── README.md
└── .gitignore

⚙️ Installation & Setup

1. Clone Repository

git clone https://github.com/YOUR_USERNAME/VisionParseAI.git
cd VisionParseAI

🔧 Backend Setup

cd backend

python -m venv venv

# Windows
venv\Scripts\activate

# macOS / Linux
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run backend server
uvicorn main:app --reload

Backend runs on:

http://localhost:8000

API Docs:

http://localhost:8000/docs

🎨 Frontend Setup

cd frontend

npm install

npm run dev

Frontend runs on:

http://localhost:5173

🔑 Environment Variables

Example .env configuration:

TESSERACT_PATH=C:/Program Files/Tesseract-OCR/tesseract.exe
POPPLER_PATH=C:/poppler/Library/bin
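
One way the backend might consume these two variables is sketched below. It assumes the values are already present in the process environment (for example, loaded from `.env` at startup); `load_ocr_paths` and `missing_paths` are hypothetical helper names, not functions from this project.

```python
import os
from pathlib import Path


def load_ocr_paths(env=os.environ) -> dict:
    """Return the configured tool paths, using None for unset variables."""
    paths = {}
    for name in ("TESSERACT_PATH", "POPPLER_PATH"):
        raw = env.get(name, "").strip()
        paths[name] = Path(raw) if raw else None
    return paths


def missing_paths(paths: dict) -> list:
    """Names whose path is unset or does not exist on this machine."""
    return [name for name, p in paths.items() if p is None or not p.exists()]


# Report configuration problems once at startup instead of on the first upload.
problems = missing_paths(load_ocr_paths())
if problems:
    print("Check these settings:", ", ".join(problems))
```

Checking the paths at startup gives a clear message when Tesseract or Poppler is not installed where `.env` says it is.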

📸 Screenshots

Dashboard

(Add Screenshot Here)

Upload & OCR Processing

(Add Screenshot Here)

OCR Result View

(Add Screenshot Here)

🔌 Main API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | `/api/upload` | Upload document |
| GET | `/api/ocr/result/{id}` | Get OCR result |
| GET | `/api/history` | Document history |
| GET | `/api/export/{id}` | Export extracted text |
| POST | `/api/auth/login` | User login |

🔒 Security Features

File validation
JWT authentication
UUID-based file naming
Secure environment variables
Restricted CORS configuration
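
File validation and UUID-based naming can be combined in one step, as in this minimal sketch. The extension allow-list and the 10 MiB size limit are assumed values chosen for illustration, and `storage_name` is a hypothetical helper; the project's actual rules may differ.

```python
import uuid
from pathlib import Path

ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".pdf"}  # assumed allow-list
MAX_BYTES = 10 * 1024 * 1024  # assumed 10 MiB limit


def storage_name(original_name: str, size_bytes: int) -> str:
    """Validate an upload and return a UUID-based name to store it under."""
    ext = Path(original_name).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {ext or 'none'}")
    if not 0 < size_bytes <= MAX_BYTES:
        raise ValueError("file is empty or too large")
    # The client-supplied name is discarded, so it cannot steer the storage path.
    return f"{uuid.uuid4().hex}{ext}"


print(storage_name("../../Scan.PDF", 2048))  # a random 32-character hex name ending in .pdf
```

Because only the validated extension survives, a name such as `../../Scan.PDF` cannot cause a path traversal or overwrite another user's file. An extension check alone does not prove the file content matches, so a stricter version would also inspect the file's leading bytes.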

📌 Project Status

✅ Core OCR pipeline completed
✅ Frontend and backend integration completed
✅ Local OCR processing functional

Future improvements:

Advanced AI text analysis
Cloud deployment
OCR optimization
Enhanced UI/UX

👨‍💻 Author

Gokul Nath

📄 License

MIT License
