A multi-agent system for analyzing financial documents (PDFs) using CrewAI and GPT-4. Upload a quarterly report or financial statement, and get back detailed analysis including verification, metrics extraction, investment recommendations, and risk assessment.
- Reads PDF financial documents (10-K, quarterly reports, etc.)
- Runs 4 AI agents that each specialize in different analysis
- Stores results in PostgreSQL
- Has a simple web UI + REST API
- Clone and add your API keys to `.env`:

```shell
cp .env.example .env
# edit .env with your OPENAI_API_KEY and SERPER_API_KEY
```

- Run with Docker:

```shell
docker-compose up --build
```

- Open http://localhost:8000 and upload a PDF
That's it. The database tables get created automatically on first run.
Go to http://localhost:8000, upload a PDF, optionally add a specific question, and click Analyze.
```shell
curl -X POST http://localhost:8000/analyze \
  -F "file=@quarterly_report.pdf" \
  -F "query=What are the key financial risks?"
```

The response includes:

- `verification` - document authenticity check
- `financial_analysis` - key metrics and trends
- `investment_recommendations` - buy/hold/sell thesis
- `risk_assessment` - identified risks and mitigations
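The four sections can be consumed programmatically once the JSON body is deserialized. A minimal sketch, assuming the response parses to a dict with the four top-level keys above (the `summarize` helper and its preview length are made up for illustration):

```python
# The four top-level sections the /analyze response is expected to carry.
SECTIONS = (
    "verification",
    "financial_analysis",
    "investment_recommendations",
    "risk_assessment",
)

def summarize(result: dict) -> list[str]:
    """Return one preview line per section, flagging any that are missing."""
    lines = []
    for key in SECTIONS:
        value = result.get(key)
        if value is None:
            lines.append(f"{key}: MISSING")
        else:
            # keep only the first 60 characters as a preview
            lines.append(f"{key}: {str(value)[:60]}")
    return lines
```

With a client like `requests`, this would be called as `summarize(response.json())` after posting the PDF.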
Health check:

```shell
curl http://localhost:8000/health
```

Four CrewAI agents work sequentially:
- Verifier - checks the document is legitimate and complete
- Financial Analyst - extracts metrics, analyzes trends
- Investment Advisor - gives investment recommendation
- Risk Assessor - identifies financial/operational risks
Results are saved to PostgreSQL (`analysis` and `analysis_results` tables).
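The two tables can be sketched as SQLAlchemy models. The `result_metadata` rename is real (see the fixes list below, where `metadata` turned out to be reserved); the specific columns shown here are assumptions, not the project's actual schema:

```python
from sqlalchemy import JSON, Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Analysis(Base):
    """One row per uploaded document (columns are illustrative)."""
    __tablename__ = "analysis"
    id = Column(Integer, primary_key=True)
    filename = Column(String, nullable=False)

class AnalysisResult(Base):
    """One row per agent output, linked back to the upload."""
    __tablename__ = "analysis_results"
    id = Column(Integer, primary_key=True)
    analysis_id = Column(Integer, ForeignKey("analysis.id"), nullable=False)
    section = Column(String)  # e.g. "risk_assessment"
    # "metadata" is reserved on declarative models, hence the rename
    result_metadata = Column(JSON)
```

The portable `JSON` type is used here so the sketch also runs on SQLite; the real models would use PostgreSQL's `JSONB`.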
Background jobs go through Celery + Redis.
- Python 3.11
- FastAPI
- CrewAI 0.130.0
- OpenAI GPT-4 Turbo
- PostgreSQL 15
- Redis (for Celery)
- Docker
- `main.py` - FastAPI app, API endpoints
- `agents.py` - CrewAI agent definitions
- `task.py` - Task definitions for each agent
- `tools.py` - PDF reader tool
- `db_models.py` - SQLAlchemy models
- `celery_app.py` - Celery configuration
- `worker.py` - Celery worker
- `index.html` - Web UI
The original code had several issues that I identified and fixed:
- Import errors - CrewAI 0.130+ changed import paths (`from crewai import Agent`, not `from crewai.agents`)
- Tool pattern wrong - Had to use `BaseTool` from `crewai.tools` with a `_run()` method
- Agents couldn't find files - Fixed by pre-reading PDF content server-side and injecting it into prompts
- SQLAlchemy issues - The JSONB import path changed, and `metadata` is a reserved keyword (renamed to `result_metadata`)
- Missing validation - Added PDF header checks, file size limits, content-type verification
- No database persistence - Added PostgreSQL storage for analysis results
- Generic error handling - Added proper HTTP status codes (400, 413, 415, 500)
- Circular LLM assignment - Original had `llm=llm`; fixed with proper `ChatOpenAI` initialization
- Wrong parameter name - `tool=` changed to `tools=` (a list of tools)
- `max_iter=1` - Agents gave up after one try; increased to reasonable values
- Missing agent goal parameter - Verifier and Risk Assessor had syntax errors
- `PDFReader` import missing - Added `from llama_index.readers.file import PDFReader`
- Unprofessional backstories - Rewrote agent backstories to be more professional
- Incomplete task descriptions - Rewrote task expected outputs with clear structure
- PDF content is truncated to ~15KB before sending to agents (keeps costs down)
- Each analysis takes 30-60 seconds depending on document size
- Flower dashboard at http://localhost:5555 for monitoring Celery tasks
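The ~15KB truncation noted above can be done with a plain slice; a slightly nicer sketch cuts at the last paragraph boundary before the limit so agents don't see a sentence chopped mid-word (the helper name and boundary heuristic are assumptions, not the project's actual code):

```python
MAX_CHARS = 15_000  # roughly the ~15KB cap noted above

def truncate_for_prompt(text: str, limit: int = MAX_CHARS) -> str:
    """Trim extracted PDF text before injecting it into agent prompts."""
    if len(text) <= limit:
        return text
    # prefer cutting at the last paragraph break before the limit
    cut = text.rfind("\n\n", 0, limit)
    if cut <= 0:
        cut = limit
    return text[:cut] + "\n\n[document truncated]"
```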
See `.env.example` for all options. Main ones:

- `OPENAI_API_KEY` - your OpenAI key
- `SERPER_API_KEY` - for web search (optional)
- `DATABASE_URL` - PostgreSQL connection string
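A sample `.env` for reference. All values are placeholders; the `DATABASE_URL` host and database names in particular depend on your docker-compose service names:

```shell
# .env: placeholder values, adjust to your setup
OPENAI_API_KEY=sk-your-key-here
SERPER_API_KEY=your-serper-key   # optional, enables web search
DATABASE_URL=postgresql://postgres:postgres@db:5432/postgres
```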


