Shreyas191/FinSight-AI

FinSight AI - Intelligent Financial Report Analyzer

A production-ready RAG (Retrieval-Augmented Generation) application for analyzing financial documents using React, FastAPI, LangChain, ChromaDB, and Google's Gemini AI.

Features

  • Document Ingestion: Upload and process financial PDFs (10-Ks, earnings calls, etc.)
  • RAG Q&A: Ask questions about documents with source citations
  • KPI Extraction: Automatically extract key financial metrics
  • Sentiment Analysis: Analyze document sentiment (positive/neutral/negative)
  • Multi-Document Comparison: Compare KPIs and trends across documents
  • PDF Report Generation: Export professional analytical reports

Tech Stack

Backend

  • Framework: FastAPI (Python 3.11+)
  • AI/RAG: LangChain with Google Gemini (gemini-1.5-pro)
  • Vector Store: ChromaDB
  • PDF Processing: pypdf, pdfplumber
  • Report Generation: ReportLab
  • Database: SQLite (for document registry)

Frontend

  • Framework: React 18 with Vite
  • State Management: TanStack Query (React Query)
  • Routing: React Router v6
  • Styling: Tailwind CSS
  • Charts: Recharts
  • Icons: Lucide React

Architecture

The backend follows a clean (hexagonal) architecture, with dependencies pointing inward from interfaces and infrastructure toward the domain:

backend/
├── src/
│   ├── domain/           # Core entities (Document, KPI, Query, etc.)
│   ├── application/      # Use cases/services
│   ├── infrastructure/   # External integrations
│   │   ├── rag/         # LangChain chains (QA, KPI, sentiment, comparison)
│   │   ├── pdf/         # PDF processing and report generation
│   │   └── database/    # SQLAlchemy models
│   └── interfaces/       # FastAPI routes and schemas
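
As a rough illustration of how these layers relate, the sketch below defines a domain entity, a port that the application layer depends on, and an in-memory adapter standing in for the infrastructure layer. All names here (`Document`, `DocumentRepository`, `InMemoryDocumentRepository`, `DocumentService`) are hypothetical and are not taken from the FinSight codebase.

```python
from dataclasses import dataclass
from typing import Protocol


# domain/: a plain entity with no framework imports (hypothetical fields)
@dataclass(frozen=True)
class Document:
    id: str
    filename: str


# application/: the port (interface) that use cases depend on
class DocumentRepository(Protocol):
    def add(self, document: Document) -> None: ...
    def list_all(self) -> list[Document]: ...


# infrastructure/: one possible adapter; a SQLAlchemy-backed adapter
# would expose the same two methods
class InMemoryDocumentRepository:
    def __init__(self) -> None:
        self._items: dict[str, Document] = {}

    def add(self, document: Document) -> None:
        self._items[document.id] = document

    def list_all(self) -> list[Document]:
        return list(self._items.values())


# application/: the use case only knows about the port, not the adapter
class DocumentService:
    def __init__(self, repository: DocumentRepository) -> None:
        self._repository = repository

    def register(self, doc_id: str, filename: str) -> Document:
        document = Document(id=doc_id, filename=filename)
        self._repository.add(document)
        return document

    def list_documents(self) -> list[Document]:
        return self._repository.list_all()
```

Because the service receives its repository through the constructor, the storage backend can be swapped (for example in tests) without touching the use case.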

Prerequisites

  • Python 3.11 or higher
  • Node.js 18 or higher
  • Google API Key (for Gemini)

Setup Instructions

1. Clone the Repository

git clone <repository-url>
cd finsight-ai

2. Backend Setup

cd backend

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Create .env file
cp .env.example .env

Edit .env and add your Google API key:

GOOGLE_API_KEY=your_google_api_key_here

Get your API key from Google AI Studio (formerly MakerSuite): https://makersuite.google.com/app/apikey

3. Frontend Setup

cd ../frontend

# Install dependencies
npm install

4. Running the Application

Terminal 1 - Backend:

cd backend
source venv/bin/activate  # On Windows: venv\Scripts\activate
python src/main.py

Backend will run on: http://localhost:8000

API docs available at: http://localhost:8000/docs

Terminal 2 - Frontend:

cd frontend
npm run dev

Frontend will run on: http://localhost:3000

Usage Workflow

1. Upload Documents

  1. Go to the Documents page
  2. Drag & drop or select a financial PDF
  3. Wait for processing (text extraction, chunking, embedding)
  4. Document appears in the list
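
The chunking step mentioned in the processing stage can be pictured as a sliding window over the extracted text. The function below is a minimal character-based sketch, not the project's actual splitter (which may well be a LangChain text splitter). The name `chunk_text` is hypothetical, and the defaults mirror the `CHUNK_SIZE` and `CHUNK_OVERLAP` values from the configuration section.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap means the tail of one chunk is repeated at the head of the next, so a sentence that straddles a boundary still appears whole in at least one chunk.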

2. Query Documents

  1. Go to the Query page
  2. Optionally filter by specific documents
  3. Ask questions like:
    • "What was the revenue in Q3?"
    • "What are the main risk factors?"
    • "How did operating margins change?"
  4. View answers with source citations
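
The same query flow can be driven from a script through the `POST /query` endpoint. The sketch below uses only the standard library. The JSON field names `question` and `document_ids` are assumptions, since this README does not document the request schema; check the interactive docs at http://localhost:8000/docs for the real one.

```python
import json
import urllib.request


def build_query_request(question, document_ids=None, base_url="http://localhost:8000"):
    """Build (but do not send) a POST /query request.

    The payload keys are assumed, not confirmed by the README.
    """
    payload = {"question": question}
    if document_ids:
        payload["document_ids"] = list(document_ids)
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, document_ids=None):
    """Send the request to a running backend and return the decoded JSON body."""
    request = build_query_request(question, document_ids)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running, `ask("What are the main risk factors?")` would return whatever JSON structure the API defines for answers and citations.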

3. Analytics

  1. Go to the Analytics page
  2. Select a document
  3. View extracted KPIs (Revenue, Net Income, EPS, etc.)
  4. See sentiment analysis results
  5. Download PDF report

4. Compare Documents

  1. Go to the Compare page
  2. Select 2+ documents to compare
  3. View KPI comparison table
  4. Read AI-generated comparison summary
  5. Download comparison report
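
One way to think about the KPI comparison table is as a pivot: one row per metric, one column per document, with gaps where a document did not report a metric. The sketch below shows that shape under the assumption that each document's KPIs are available as a simple name-to-value mapping. The function name and input format are hypothetical.

```python
def build_comparison_table(kpis_by_document):
    """Pivot {document: {metric: value}} into rows of [metric, value_per_document...]."""
    # Collect metric names in first-seen order so the table layout is stable
    metrics = []
    for kpis in kpis_by_document.values():
        for name in kpis:
            if name not in metrics:
                metrics.append(name)

    documents = list(kpis_by_document)
    rows = [["Metric", *documents]]
    for metric in metrics:
        # None marks a metric that a given document did not report
        rows.append([metric, *(kpis_by_document[doc].get(metric) for doc in documents)])
    return rows
```

Keeping missing values as `None` rather than zero avoids implying that a document reported a metric when it did not.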

API Endpoints

Documents

  • POST /documents/upload - Upload PDF
  • GET /documents - List all documents
  • GET /documents/{id} - Get document info
  • DELETE /documents/{id} - Delete document

Analysis

  • POST /query - RAG Q&A
  • GET /documents/{id}/kpis - Extract KPIs
  • GET /documents/{id}/sentiment - Analyze sentiment
  • GET /documents/{id}/report - Generate PDF report

Comparison

  • POST /compare - Compare documents
  • POST /compare/report - Generate comparison report

Health

  • GET /health - Health check
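
The health endpoint is handy for scripts that need to wait until the backend has finished starting. The sketch below polls it with the standard library and treats any HTTP 200 as healthy. The response body format is not documented here, so it is deliberately ignored. The helper names and the retry defaults are illustrative assumptions.

```python
import time
import urllib.error
import urllib.request


def http_probe(url):
    """Return True if a GET to url answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(url="http://localhost:8000/health", attempts=10, delay=1.0, probe=http_probe):
    """Call probe(url) up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

The `probe` parameter is injectable so the retry logic can be exercised without a running server.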

Configuration

Backend (.env)

# Gemini API
GOOGLE_API_KEY=your_key_here

# LangChain
EMBEDDING_MODEL=models/embedding-001
LLM_MODEL=gemini-1.5-pro
LLM_TEMPERATURE=0.1

# ChromaDB
CHROMA_PERSIST_DIRECTORY=./data/chroma
CHROMA_COLLECTION_NAME=financial_documents

# Storage
UPLOAD_DIRECTORY=./data/uploads

# Database
DATABASE_URL=sqlite:///./data/finsight.db

# Chunking
CHUNK_SIZE=1000
CHUNK_OVERLAP=200

# API
API_HOST=0.0.0.0
API_PORT=8000
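
For orientation, the following sketch shows one way such variables could be read and validated at startup. It is not the project's `config.py`, whose implementation is not shown in this README. The `Settings` class and `load_settings` function are hypothetical, only a subset of the variables is covered, and the defaults simply repeat the values listed above. It reads from the process environment and does not parse the `.env` file itself.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    google_api_key: str
    llm_model: str
    llm_temperature: float
    chunk_size: int
    chunk_overlap: int
    api_host: str
    api_port: int


def load_settings(env=None):
    """Build Settings from a mapping of environment variables (os.environ by default)."""
    env = os.environ if env is None else env

    api_key = env.get("GOOGLE_API_KEY", "")
    if not api_key:
        raise RuntimeError("GOOGLE_API_KEY is not set")

    settings = Settings(
        google_api_key=api_key,
        llm_model=env.get("LLM_MODEL", "gemini-1.5-pro"),
        llm_temperature=float(env.get("LLM_TEMPERATURE", "0.1")),
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
        api_host=env.get("API_HOST", "0.0.0.0"),
        api_port=int(env.get("API_PORT", "8000")),
    )
    # An overlap as large as the chunk would make chunking unable to advance
    if settings.chunk_overlap >= settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be smaller than CHUNK_SIZE")
    return settings
```

Failing fast on a missing key or an inconsistent chunk configuration surfaces mistakes at startup instead of during the first upload.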

Project Structure

finsight-ai/
├── backend/
│   ├── src/
│   │   ├── domain/
│   │   │   └── entities.py           # Core domain models
│   │   ├── application/
│   │   │   ├── document_service.py   # Document ingestion
│   │   │   ├── query_service.py      # RAG queries
│   │   │   ├── analytics_service.py  # KPI & sentiment
│   │   │   └── comparison_service.py # Multi-doc comparison
│   │   ├── infrastructure/
│   │   │   ├── config.py             # Settings
│   │   │   ├── rag/
│   │   │   │   ├── embedding.py      # Gemini embeddings
│   │   │   │   ├── vector_store.py   # ChromaDB
│   │   │   │   ├── qa_chain.py       # RAG QA chain
│   │   │   │   ├── kpi_chain.py      # KPI extraction
│   │   │   │   ├── sentiment_chain.py
│   │   │   │   └── comparison_chain.py
│   │   │   ├── pdf/
│   │   │   │   ├── processor.py      # PDF parsing
│   │   │   │   └── report_generator.py
│   │   │   └── database/
│   │   │       └── models.py         # SQLAlchemy
│   │   ├── interfaces/
│   │   │   ├── schemas.py            # Pydantic models
│   │   │   ├── document_routes.py    # Document API
│   │   │   ├── analysis_routes.py    # Analysis API
│   │   │   └── dependencies.py       # DI container
│   │   └── main.py                   # FastAPI app
│   ├── requirements.txt
│   └── .env.example
└── frontend/
    ├── src/
    │   ├── components/
    │   │   ├── Layout.jsx
    │   │   ├── DocumentUpload.jsx
    │   │   └── DocumentList.jsx
    │   ├── pages/
    │   │   ├── DocumentsPage.jsx
    │   │   ├── QueryPage.jsx
    │   │   ├── AnalyticsPage.jsx
    │   │   └── ComparePage.jsx
    │   ├── api/
    │   │   └── client.js             # Axios API client
    │   ├── App.jsx
    │   ├── main.jsx
    │   └── index.css
    ├── package.json
    ├── vite.config.js
    └── tailwind.config.js

Testing

# Backend tests
cd backend
pytest

# Frontend (if tests are added)
cd frontend
npm test

Deployment

Docker (Optional)

# Dockerfile example for backend
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY src/ ./src/
CMD ["python", "src/main.py"]

Production Considerations

  1. Security:

    • Use environment variables for secrets
    • Implement authentication/authorization
    • Add rate limiting
    • Enable HTTPS
  2. Performance:

    • Cache embeddings
    • Use connection pooling
    • Implement request queuing for heavy operations
  3. Monitoring:

    • Add logging (structlog, loguru)
    • Implement health checks
    • Track API metrics
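
To make the "cache embeddings" suggestion concrete, here is a minimal in-memory sketch that wraps any embedding function and skips repeated calls for identical text. `CachedEmbedder` and its `embed_fn` argument are hypothetical. In a real deployment `embed_fn` would call the embedding model, and the cache would more likely live in a persistent or shared store than in a Python dict.

```python
import hashlib


class CachedEmbedder:
    """Wrap an embedding function so identical text is embedded only once."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn  # callable: str -> list of floats
        self._cache = {}
        self.misses = 0  # number of times embed_fn was actually called

    def embed(self, text):
        # Hash the text so cache keys stay small even for long chunks
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._embed_fn(text)
        return self._cache[key]
```

This helps most when the same document is re-uploaded or re-processed, because unchanged chunks then cost no additional API calls.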

Troubleshooting

Backend Issues

ChromaDB errors:

  • Delete ./data/chroma and restart (this removes all stored embeddings, so existing documents will need to be re-ingested)
  • Check disk space

Gemini API errors:

  • Verify API key is correct
  • Check API quotas/limits
  • Ensure billing is enabled

PDF extraction failures:

  • Try different PDF files
  • Check file is not encrypted
  • Verify file is not corrupted
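
A quick pre-check can rule out the two most common causes before digging further. The sketch below inspects the raw bytes with the standard library only, and it is a heuristic rather than a full parser: it looks for the `%PDF-` header, an `%%EOF` marker near the end, and an `/Encrypt` entry, which encrypted PDFs carry in their trailer. The function name is hypothetical. When pypdf is installed, its `PdfReader.is_encrypted` property is the more reliable check.

```python
def precheck_pdf(data: bytes) -> list[str]:
    """Return a list of likely problems found in raw PDF bytes (empty if none)."""
    problems = []
    if not data.startswith(b"%PDF-"):
        problems.append("missing %PDF- header")
    # A truncated download usually lacks the end-of-file marker
    if b"%%EOF" not in data[-1024:]:
        problems.append("missing %%EOF marker near end of file")
    # Heuristic: this byte sequence could in principle appear in unrelated content
    if b"/Encrypt" in data:
        problems.append("appears to be encrypted (/Encrypt entry found)")
    return problems
```

For a file on disk, pass the contents read in binary mode, for example `precheck_pdf(open(path, "rb").read())`.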

Frontend Issues

API connection errors:

  • Ensure backend is running
  • Check proxy configuration in vite.config.js
  • Verify CORS settings

npm install errors:

  • Delete node_modules and package-lock.json
  • Run npm install again

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

License

MIT License

Support

For issues and questions:

  • Open a GitHub issue
  • Check existing documentation
  • Review the API docs at http://localhost:8000/docs while the backend is running

Acknowledgments

  • LangChain: RAG framework
  • Google Gemini: LLM and embeddings
  • ChromaDB: Vector database
  • FastAPI: Backend framework
  • React: Frontend framework
