SmartDoc AI is an automated document processing system for small businesses. It extracts structured data (Date, Total, Items) from receipts and invoices using OCR, with grayscale and thresholding preprocessing applied to the image first.
- Upload Image/PDF: Supports common receipt and invoice formats.
- Modern Interface: Professional glassmorphism design with smooth transitions.
- Automated Parsing: Extracts Date, Total Amount, and individual Line Items.
- Image Preprocessing: Built-in grayscale and thresholding to handle noisy or low-quality documents.
- Data Export: One-click export to CSV and Microsoft Excel formats.
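
To make the preprocessing step concrete, here is a minimal, dependency-free sketch of what grayscale conversion and thresholding do to a scanned image. The project performs this step with OpenCV; the plain-Python functions `to_grayscale` and `binarize`, the fixed threshold of 128, and the sample pixels are all illustrative assumptions, not the application's actual code.

```python
# Illustrative only: the project uses OpenCV (cv2) for this step.
# An image here is a list of rows; each pixel is an (R, G, B) tuple, 0-255.

def to_grayscale(image):
    """Convert rows of (R, G, B) pixels to rows of luminance values (0-255)."""
    # ITU-R BT.601 luma weights, the standard RGB-to-gray weighting.
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]


def binarize(gray, threshold=128):
    """Map each gray value to 255 (white) if above the threshold, else 0 (black)."""
    return [[255 if value > threshold else 0 for value in row] for row in gray]


# A tiny 2x3 "scan": dark ink on a slightly noisy light background.
sample = [
    [(250, 248, 245), (30, 30, 35), (240, 235, 238)],
    [(20, 25, 22), (200, 205, 198), (15, 18, 20)],
]

clean = binarize(to_grayscale(sample))
for row in clean:
    print(row)
```

After thresholding, every pixel is either pure black or pure white, which removes faint background noise before the image is handed to the OCR engine.
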
Download and install Tesseract-OCR for Windows. The application is pre-configured to automatically detect common Tesseract installation paths on Windows.
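
Path detection of this kind can be sketched roughly as follows. This is an assumption about the general approach, not the application's actual logic: the function name `find_tesseract` and the entries in `CANDIDATE_PATHS` are hypothetical, and the paths the application really checks may differ.

```python
import os
import shutil

# Hypothetical candidate list; the paths the application actually checks may differ.
CANDIDATE_PATHS = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
]


def find_tesseract(candidates=CANDIDATE_PATHS, exists=os.path.isfile, which=shutil.which):
    """Return the first usable Tesseract executable path, or None if none is found."""
    # Prefer an executable that is already on PATH.
    on_path = which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the known installation locations, in order.
    for path in candidates:
        if exists(path):
            return path
    return None


tesseract_path = find_tesseract()
if tesseract_path is None:
    print("Tesseract not found; install it or set the path manually.")
else:
    print(f"Using Tesseract at {tesseract_path}")
```

With Pytesseract, a path found this way would typically be assigned to `pytesseract.pytesseract.tesseract_cmd` before running OCR.
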
Install the dependencies with `pip install -r requirements.txt`, start the server with `python main.py`, then open your browser at http://localhost:8000.
- `/api`: FastAPI route handlers and request processing.
- `/ocr`: Text extraction and image cleaning logic.
- `/parser`: Data transformation logic for mapping text to structured JSON.
- `/static`: Frontend application assets.
- `main.py`: Main entry point and server configuration.
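
As an illustration of the kind of text-to-JSON mapping the parser performs, the sketch below pulls a date, a total, and line items out of raw OCR text with regular expressions. Everything in it is an assumption made for the example: the function name `parse_receipt`, the patterns, the output keys, and the sample receipt are hypothetical and do not reflect the project's actual parsing rules.

```python
import json
import re

# Hypothetical patterns; real receipts vary widely and need more robust rules.
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{2,4})\b")
TOTAL_RE = re.compile(r"^\s*total\b[^\d]*(\d+(?:\.\d{2})?)\s*$", re.IGNORECASE)
SUMMARY_RE = re.compile(r"^\s*(subtotal|tax)\b", re.IGNORECASE)
ITEM_RE = re.compile(r"^\s*(.+?)\s+(\d+\.\d{2})\s*$")


def parse_receipt(text):
    """Map raw OCR text to a dict with 'date', 'total', and 'items' keys."""
    result = {"date": None, "total": None, "items": []}
    for line in text.splitlines():
        # Take the first date-like token found anywhere in the text.
        if result["date"] is None:
            date_match = DATE_RE.search(line)
            if date_match:
                result["date"] = date_match.group(1)
                continue
        total_match = TOTAL_RE.match(line)
        if total_match:
            result["total"] = float(total_match.group(1))
            continue
        # Skip summary rows so they are not mistaken for purchased items.
        if SUMMARY_RE.match(line):
            continue
        item_match = ITEM_RE.match(line)
        if item_match:
            result["items"].append(
                {"name": item_match.group(1), "price": float(item_match.group(2))}
            )
    return result


sample = """CORNER STORE
Date: 2024-03-15
Coffee beans 12.50
Oat milk 3.25
Subtotal 15.75
TOTAL $15.75
"""

parsed = parse_receipt(sample)
print(json.dumps(parsed, indent=2))
```

Keeping the parser as a pure function from text to a dictionary makes it easy to test separately from the OCR step and to serialize directly as a JSON response.
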
- Backend: Python, FastAPI, Uvicorn, Pydantic, Pandas.
- OCR Engine: Pytesseract / Tesseract.
- Image Processing: OpenCV (cv2).
- Frontend: HTML5, Vanilla CSS, Lucide Icons.
Designed and Developed by Au Amores.
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
