sarangoki/VisionParseAI

VisionParse AI

AI-powered OCR web application for extracting, processing, and analyzing text from images and PDF documents.


🚀 Overview

VisionParse AI is a full-stack OCR (Optical Character Recognition) application built with React and FastAPI. It extracts text from images and PDFs using Tesseract OCR.

The application includes smart text processing, background OCR tasks, PDF support, export functionality, and a modern responsive UI.


✨ Features

  • 📤 Drag & Drop File Upload
  • 🖼️ Image OCR Processing
  • 📄 PDF Text Extraction
  • ⚡ Background OCR Processing
  • 📊 OCR Confidence Scores
  • 🔍 Search Within Extracted Text
  • 🧠 Smart Text Cleanup & Processing
  • 📑 TXT / PDF / DOCX / JSON Export
  • 🌙 Dark Mode UI
  • 📱 Responsive Design
  • 🔐 JWT Authentication
  • 📋 History Dashboard
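
To give a rough idea of what the text cleanup and search features listed above might involve, here is a minimal sketch using only the Python standard library. The function names `clean_ocr_text` and `find_matches` are hypothetical and are not taken from the project's code; the actual cleanup rules in the backend may differ.

```python
import re
from typing import List


def clean_ocr_text(raw: str) -> str:
    """Normalize raw OCR output into readable paragraphs (illustrative only)."""
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words hyphenated across a line break: "charac-\nter" -> "character".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    cleaned = []
    # Blank lines are treated as paragraph breaks, single newlines as soft wraps.
    for paragraph in re.split(r"\n\s*\n", text):
        # Collapse runs of whitespace (including soft wraps) into single spaces.
        flat = re.sub(r"\s+", " ", paragraph).strip()
        if flat:
            cleaned.append(flat)
    return "\n\n".join(cleaned)


def find_matches(text: str, query: str) -> List[int]:
    """Return the start offset of every case-insensitive match of query."""
    if not query:
        return []
    pattern = re.escape(query)  # treat the query as literal text, not a regex
    return [m.start() for m in re.finditer(pattern, text, flags=re.IGNORECASE)]


if __name__ == "__main__":
    sample = "Optical charac-\nter  recognition\nworks.\n\n\nSecond   paragraph."
    tidy = clean_ocr_text(sample)
    print(tidy)
    print(find_matches(tidy, "PARAGRAPH"))
```

Escaping the query keeps characters such as `.` or `(` in a user's search from being interpreted as regular expression syntax.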

🛠️ Tech Stack

Frontend

  • React 18
  • Vite
  • Tailwind CSS

Backend

  • FastAPI
  • Python
  • SQLAlchemy

OCR & Processing

  • Tesseract OCR
  • OpenCV
  • pdf2image
  • Poppler
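
As an illustration of how a confidence score could be derived from Tesseract output, the sketch below averages per-word confidences from a dictionary shaped like the one returned by `pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT)`. The helper name `average_confidence` is hypothetical, and whether the project aggregates confidence this way is an assumption.

```python
from typing import Any, Dict, List


def average_confidence(data: Dict[str, List[Any]]) -> float:
    """Mean confidence (0-100) over recognized words in a Tesseract-style result.

    In a real pipeline `data` would typically come from:
        pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)
    Here it is passed in directly so the function has no OCR dependency.
    """
    scores = []
    for text, conf in zip(data["text"], data["conf"]):
        score = float(conf)  # conf may arrive as int, float, or str
        # Tesseract reports -1 for layout boxes that hold no recognized word.
        if score >= 0 and str(text).strip():
            scores.append(score)
    return round(sum(scores) / len(scores), 1) if scores else 0.0


if __name__ == "__main__":
    sample = {"text": ["", "Invoice", "Total", " "], "conf": ["-1", 96, 88, 95]}
    print(average_confidence(sample))
```

Entries with empty text are skipped so that structural boxes (pages, blocks, lines) do not drag the average down.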

Authentication

  • JWT Authentication
  • Passlib / bcrypt

📂 Project Structure

OCR/
├── frontend/
├── backend/
├── README.md
└── .gitignore

⚙️ Installation & Setup

1. Clone Repository

git clone https://github.com/YOUR_USERNAME/VisionParseAI.git
cd VisionParseAI

🔧 Backend Setup

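cd backend

# Create a virtual environment
python -m venv venv

# Activate it (Windows)
venv\Scripts\activate

# Activate it (macOS / Linux)
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run backend server
uvicorn main:app --reload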

Backend runs on:

http://localhost:8000

API Docs:

http://localhost:8000/docs

🎨 Frontend Setup

cd frontend

npm install

npm run dev

Frontend runs on:

http://localhost:5173

🔑 Environment Variables

Example .env configuration:

TESSERACT_PATH=C:/Program Files/Tesseract-OCR/tesseract.exe
POPPLER_PATH=C:/poppler/Library/bin
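
The sketch below shows one way the backend could read these two variables. It assumes they are already present in the process environment (for example, loaded from `.env` by whatever mechanism the project uses); the `OcrSettings` class and `load_ocr_settings` function are hypothetical names, not part of the project.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class OcrSettings:
    tesseract_path: Optional[str]  # None means "find tesseract on PATH"
    poppler_path: Optional[str]    # None means "find poppler on PATH"


def load_ocr_settings(env: Mapping[str, str] = os.environ) -> OcrSettings:
    """Read the OCR tool locations, treating unset or blank values as None."""
    def read(name: str) -> Optional[str]:
        value = env.get(name, "").strip()
        return value or None

    return OcrSettings(read("TESSERACT_PATH"), read("POPPLER_PATH"))


if __name__ == "__main__":
    settings = load_ocr_settings()
    print(settings)
    # With the OCR libraries installed, the values would typically be applied as:
    #   pytesseract.pytesseract.tesseract_cmd = settings.tesseract_path
    #   pdf2image.convert_from_path(pdf_file, poppler_path=settings.poppler_path)
```

On macOS and Linux, where Tesseract and Poppler are usually installed on the system `PATH`, both variables can often be left unset.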

📸 Screenshots

Dashboard

(Add Screenshot Here)

Upload & OCR Processing

(Add Screenshot Here)

OCR Result View

(Add Screenshot Here)


🔌 Main API Endpoints

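| Method | Endpoint             | Description           |
|--------|----------------------|-----------------------|
| POST   | /api/upload          | Upload document       |
| GET    | /api/ocr/result/{id} | Get OCR result        |
| GET    | /api/history         | Document history      |
| GET    | /api/export/{id}     | Export extracted text |
| POST   | /api/auth/login      | User login            |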
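
As a rough sketch of how a client might address these endpoints, the helpers below build (but do not send) requests with the standard library. Several details are assumptions not confirmed by this README: that login accepts a JSON body with `username` and `password` fields, that the JWT is sent as a `Bearer` token in the `Authorization` header, and the helper names themselves. The interactive docs at `/docs` describe the actual request shapes.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # backend address from the setup section


def login_request(username: str, password: str) -> urllib.request.Request:
    # Assumed payload shape; check /docs for the real login schema.
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def authed_get(path: str, token: str) -> urllib.request.Request:
    # Assumes the API expects the JWT as a Bearer token.
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": "Bearer " + token},
        method="GET",
    )


def result_request(document_id: str, token: str) -> urllib.request.Request:
    return authed_get("/api/ocr/result/" + document_id, token)


def history_request(token: str) -> urllib.request.Request:
    return authed_get("/api/history", token)


if __name__ == "__main__":
    req = result_request("abc123", "example-token")
    print(req.get_method(), req.full_url)
    # To actually send it against a running backend:
    #   with urllib.request.urlopen(req) as response:
    #       print(json.load(response))
```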

🔒 Security Features

  • File validation
  • JWT authentication
  • UUID-based file naming
  • Secure environment variables
  • Restricted CORS configuration
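
File validation and UUID-based naming can be combined in a few lines. The following is a minimal sketch, assuming an extension allow-list of common image types plus PDF; the set `ALLOWED_EXTENSIONS` and the function `stored_filename` are hypothetical, and the project's real checks may be stricter (for example, inspecting file contents).

```python
import uuid
from pathlib import PurePath

# Assumed allow-list; the project's actual accepted types are not stated here.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".pdf"}


def stored_filename(original_name: str) -> str:
    """Validate the upload's extension and return a random, collision-resistant name."""
    suffix = PurePath(original_name).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValueError("unsupported file type: " + (suffix or "(none)"))
    # Only the extension survives; the user-supplied name and any directory
    # components are discarded, which rules out path traversal via the filename.
    return uuid.uuid4().hex + suffix


if __name__ == "__main__":
    print(stored_filename("../../etc/Scan.PNG"))
```

Because nothing from the original name other than its extension reaches the filesystem, two uploads with the same name cannot overwrite each other.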

📌 Project Status

  • ✅ Core OCR pipeline completed
  • ✅ Frontend and backend integration completed
  • ✅ Local OCR processing functional

Future improvements:

  • Advanced AI text analysis
  • Cloud deployment
  • OCR optimization
  • Enhanced UI/UX

👨‍💻 Author

Gokul Nath


📄 License

MIT License
