mayankkuthar/GenAI_Image-API

Financial Document Processing API

This FastAPI service sends financial document images to Google's Gemini AI model and returns the extracted fields as JSON. It supports PT Sheets (old and new formats), Daybooks, and One Time Info Sheets.

Features

  • Image upload and processing with Gemini AI
  • Support for multiple document types:
    • PT Sheet (Old and New formats)
    • Daybook
    • One Time Info Sheet
  • Robust JSON correction and validation
  • CORS enabled for cross-origin requests
  • Ready for Vercel deployment
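
This README does not show how the JSON correction step works. As a rough illustration only, the sketch below shows one common way to recover JSON from model output, assuming the model sometimes wraps its answer in a Markdown code fence, adds prose around it, or leaves trailing commas. The helper name `extract_json` is hypothetical and is not taken from this repository.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code-fence marker, built here to avoid typing it literally


def extract_json(raw_text: str) -> dict:
    """Best-effort parse of a JSON object embedded in model output (hypothetical helper)."""
    text = raw_text.strip()

    # Unwrap a surrounding Markdown code fence (optionally tagged "json") if present.
    pattern = "^" + re.escape(FENCE) + r"(?:json)?\s*(.*?)\s*" + re.escape(FENCE) + "$"
    fenced = re.match(pattern, text, flags=re.DOTALL)
    if fenced:
        text = fenced.group(1)

    # Keep only the outermost {...} span, ignoring any prose around it.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    text = text[start : end + 1]

    # Remove trailing commas before a closing brace or bracket.
    # Limitation: this would also alter a string value that contains ",}" or ",]".
    text = re.sub(r",\s*([}\]])", r"\1", text)

    return json.loads(text)
```

A call such as `extract_json('Result: {"Cash Sale": 5000.0,}')` would return a plain dictionary; anything that still fails to parse raises `json.JSONDecodeError`, which a caller can turn into an error response.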

Prerequisites

  • Python 3.8 or higher
  • Google Cloud API key for Gemini AI

Setup

  1. Clone the repository:
git clone <your-repo-url>
cd <your-repo-name>
  2. Create and activate a virtual environment:
python -m venv venv
# On Windows:
.\venv\Scripts\activate
# On Unix or MacOS:
source venv/bin/activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Create a .env file in the project root:
GOOGLE_API_KEY=your_gemini_api_key_here
  5. Run the API locally:
uvicorn main:app --reload

The API will be available at http://localhost:8000

API Documentation

GET /

  • Welcome message and API information
  • Returns: List of available endpoints and their descriptions

POST /process-image/

Process an image and extract information based on document type.

Parameters:

  • file: Image file (form-data)
  • document_type: (Optional) Type of document to process
    • Options: "PT Sheet Old", "PT Sheet New", "Daybook", "One Time Info Sheet"
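
Because document_type accepts only the four values listed above, a client may want to check the value before uploading. The following is a minimal sketch of such a check; the names `DOCUMENT_TYPES` and `check_document_type` are hypothetical, and it assumes the server expects the option strings exactly as written here.

```python
from typing import Optional

# The four options listed in this README.
DOCUMENT_TYPES = ("PT Sheet Old", "PT Sheet New", "Daybook", "One Time Info Sheet")


def check_document_type(value: Optional[str]) -> Optional[str]:
    """Return a cleaned document type, or None when the optional field is omitted."""
    if value is None:
        return None
    cleaned = value.strip()
    if cleaned not in DOCUMENT_TYPES:
        options = ", ".join(DOCUMENT_TYPES)
        raise ValueError(f"unknown document_type {value!r}; expected one of: {options}")
    return cleaned
```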

Example using cURL:

curl -X POST "http://localhost:8000/process-image/" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@your_image.jpg" \
  -F "document_type=Daybook"

Example using Python requests:

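import requests

url = "http://localhost:8000/process-image/"
data = {"document_type": "Daybook"}

# Open the image in a context manager so the file handle is closed after the upload.
with open("your_image.jpg", "rb") as image:
    response = requests.post(url, files={"file": image}, data=data)

response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
print(response.json())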

Example Responses

Daybook Document

{
  "Day 1": {
    "Date": "15/10/2025",
    "Cash Sale": 5000.0,
    "Credit Sale": 2000.0,
    "Cash Collection": 1500.0,
    "Cash Purchase": 3000.0,
    "Credit Purchase": 1000.0,
    "Cash Paid to Suppliers": 2500.0,
    "Transportation": 200.0,
    "Owner's Wages": 500.0,
    "Workers' Wages": 1000.0,
    "Electricity": 300.0,
    "Repairs": 150.0,
    "Other Cost": 100.0,
    "Other Income": 50.0,
    "Loan": 0.0,
    "Interest": 0.0,
    "Amount": 750.0
  }
}
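
To show how a client might consume this response, here is a small sketch that parses each day's date and adds up its sales. It assumes the shape shown in the example above, that dates are in DD/MM/YYYY form (inferred from "15/10/2025"), and that "total sales" means Cash Sale plus Credit Sale. The function name `summarize_daybook` is hypothetical.

```python
from datetime import date, datetime


def summarize_daybook(response: dict) -> dict:
    """Map each day label to its parsed date and total sales (hypothetical helper)."""
    summary = {}
    for day_label, entry in response.items():
        # Dates are assumed to be day-first, as in the example response.
        parsed_date = datetime.strptime(entry["Date"], "%d/%m/%Y").date()
        # Assumption: total sales is the sum of cash and credit sales.
        total_sales = entry["Cash Sale"] + entry["Credit Sale"]
        summary[day_label] = {"date": parsed_date, "total_sales": total_sales}
    return summary
```

For the example response, "Day 1" would carry a total of 5000.0 + 2000.0 = 7000.0.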

Deploy to Vercel

  1. Install Vercel CLI:
npm i -g vercel
  2. Deploy:
vercel
  3. Set up environment variables in Vercel:
    • Go to your project settings
    • Add the GOOGLE_API_KEY environment variable

Testing the API

  1. Start the API locally:
uvicorn main:app --reload
  2. Open the interactive Swagger documentation:
    • Visit http://localhost:8000/docs in your browser
    • Test the endpoints using the interactive interface
  3. Try different document types:
    • Upload sample images for each document type
    • Verify the JSON output matches the expected structure
    • Check error handling by uploading invalid files

Error Handling

The API returns an error response in each of the following cases:

  • Invalid file types
  • Malformed images
  • Invalid document types
  • AI processing errors
  • JSON parsing errors

Error responses include detailed messages to help identify and resolve issues.
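
This README does not show the exact shape of an error body. The sketch below assumes FastAPI's default format, where the JSON body has a "detail" key holding either a string or a list of validation errors with "loc" and "msg" fields. The helper name `describe_error` is hypothetical.

```python
from typing import Any


def describe_error(status_code: int, body: Any) -> str:
    """Turn an error response body into a one-line message (hypothetical helper)."""
    detail = body.get("detail") if isinstance(body, dict) else None

    if isinstance(detail, str):
        message = detail
    elif isinstance(detail, list):
        # Validation errors: join "location: message" for each item.
        parts = []
        for item in detail:
            if isinstance(item, dict):
                location = ".".join(str(part) for part in item.get("loc", []))
                text = str(item.get("msg", ""))
                parts.append(f"{location}: {text}" if location else text)
            else:
                parts.append(str(item))
        message = "; ".join(parts)
    else:
        # Unknown shape: fall back to the raw body.
        message = str(body)

    return f"HTTP {status_code}: {message}"
```

With the requests library, this could be called as `describe_error(response.status_code, response.json())` whenever `response.ok` is false.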

CORS

CORS is enabled for all origins by default. Modify the CORS settings in main.py if you need to restrict access.
