A production-ready financial document processing and analysis platform built with Google's Agent Development Kit (ADK) and Google Cloud services.
FinFlow is an intelligent system that analyzes, processes, and extracts insights from financial documents using multiple specialized agents. The system leverages Google Document AI for document processing, BigQuery for data storage, and Vertex AI Gemini models for intelligent processing.
This project provides a complete end-to-end integration of all agent components into a production-level implementation with multiple deployment options, comprehensive testing, and robust configuration management.
- Multi-agent architecture for specialized document processing tasks
- Advanced agent communication framework with state management
- Intelligent task delegation across specialized agents
- Document parsing and entity extraction using Document AI
- Validation against business rules and compliance requirements
- Structured data storage in BigQuery
- Financial analytics and insights generation
- Comprehensive agent testing framework
The system consists of several specialized agents:
- Master Orchestrator - Coordinates the overall workflow
- Document Processor - Extracts information from documents
- Validation - Validates documents against rules
- Storage - Manages data persistence in BigQuery and other storage systems
- Analytics - Generates financial insights
For more details, see Agent Architecture.
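As a rough illustration of how a master orchestrator might chain the specialized agents listed above, here is a minimal sketch in plain Python. It does not use the ADK API. The function names (process_document, validate, store, analyze, orchestrate), the dictionary fields, and the single validation rule are all hypothetical stand-ins, not FinFlow's actual implementation.

```python
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
Step = Callable[[Context], Context]


def process_document(ctx: Context) -> Context:
    # Hypothetical stand-in for the Document Processor: pretend fields were extracted.
    ctx["entities"] = {"invoice_number": "INV-001", "total": 120.0}
    return ctx


def validate(ctx: Context) -> Context:
    # Hypothetical stand-in for the Validation agent: one illustrative rule.
    ctx["valid"] = ctx["entities"].get("total", 0) > 0
    return ctx


def store(ctx: Context) -> Context:
    # Hypothetical stand-in for the Storage agent: sets a flag instead of writing to BigQuery.
    ctx["stored"] = ctx["valid"]
    return ctx


def analyze(ctx: Context) -> Context:
    # Hypothetical stand-in for the Analytics agent.
    ctx["insights"] = {"total_amount": ctx["entities"]["total"]} if ctx["stored"] else {}
    return ctx


def orchestrate(document_path: str, steps: List[Step]) -> Context:
    """Run each step in order and record which steps ran (assumed workflow shape)."""
    ctx: Context = {"document": document_path, "history": []}
    for step in steps:
        ctx = step(ctx)
        ctx["history"].append(step.__name__)
    return ctx


result = orchestrate("invoice.pdf", [process_document, validate, store, analyze])
print(result["history"])
```

Passing the steps as a list keeps the coordination logic separate from the individual agents, so a step can be swapped or reordered without touching the others.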
The FinFlow system features a robust Agent Communication Framework that provides:
- Full Communication Protocol - Standardized message passing with delivery guarantees
- Task Execution Framework - Task creation, tracking, and hierarchical execution
- State-based Communication - Workflow state management with tracking and history
- LLM-driven Delegation - Intelligent task delegation based on agent capabilities
- Advanced Delegation Strategies - Multiple strategies for optimal agent selection
For more details, see Agent Communication Framework.
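To make the ideas of standardized messages, capability-based delegation, and state tracking more concrete, the following sketch shows one possible shape for them. The Message fields, the CAPABILITIES registry, the delegate function, and the WorkflowState class are assumptions made for illustration. The delegation shown is a plain dictionary lookup, not the LLM-driven strategy the framework describes.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Message:
    # Hypothetical message envelope; the field names are assumptions.
    sender: str
    recipient: str
    task: str
    payload: dict = field(default_factory=dict)
    message_id: str = field(default_factory=lambda: uuid.uuid4().hex)


# Hypothetical registry of which agent advertises which task types.
CAPABILITIES: Dict[str, Set[str]] = {
    "DocumentProcessor": {"parse", "extract"},
    "Validation": {"validate"},
    "Storage": {"store", "query"},
    "Analytics": {"analyze"},
}


def delegate(task: str, sender: str = "MasterOrchestrator") -> Optional[Message]:
    """Return a message for the first agent advertising the task, or None."""
    for agent, skills in CAPABILITIES.items():
        if task in skills:
            return Message(sender=sender, recipient=agent, task=task)
    return None


class WorkflowState:
    """Tracks the current workflow state plus a history of transitions."""

    def __init__(self, initial: str = "received") -> None:
        self.current = initial
        self.history: List[str] = [initial]

    def transition(self, new_state: str) -> None:
        self.current = new_state
        self.history.append(new_state)


msg = delegate("validate")
state = WorkflowState()
state.transition("validating")
print(msg.recipient, state.history)
```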
For comprehensive setup instructions and development guidelines, see:
The StorageAgent is responsible for all database operations in the FinFlow system. It provides:
- Comprehensive BigQuery dataset and schema management
- Document and entity storage with optimized data access patterns
- Financial data analysis and reporting capabilities
- Relationship tracking between financial documents
- Caching layer for performance optimization
For more details, see Storage Agent Documentation.
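The caching layer mentioned above could, for example, take the form of a read-through cache with a time-to-live. The sketch below is only an assumption about how such a layer might look. ReadThroughCache and fake_bigquery_lookup are hypothetical names, and the lookup function stands in for a real BigQuery query.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


class ReadThroughCache:
    """Small TTL cache placed in front of a slower lookup function."""

    def __init__(self, fetch: Callable[[str], Any], ttl_seconds: float = 60.0) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = time.monotonic()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # Fresh entry: skip the backing store.
        value = self._fetch(key)
        self._entries[key] = (now, value)
        return value


calls: List[str] = []


def fake_bigquery_lookup(document_id: str) -> dict:
    # Hypothetical stand-in for a BigQuery query; records each call it receives.
    calls.append(document_id)
    return {"document_id": document_id, "type": "invoice"}


cache = ReadThroughCache(fake_bigquery_lookup, ttl_seconds=60.0)
first = cache.get("doc-1")   # Miss: calls the lookup function.
second = cache.get("doc-1")  # Hit: served from the cache.
```

Using time.monotonic rather than wall-clock time keeps expiry checks unaffected by system clock adjustments.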
FinFlow can be run in several modes to accommodate different use cases:
Run as a web API server:
./start.sh production server 8000
Or using Python directly:
python main.py --env production --mode server --port 8000
Run in interactive command-line mode:
./start.sh production cli
Or using Python directly:
python main.py --env production --mode cli
Process a directory of documents:
python main.py --env production --mode batch --batch-dir /path/to/documents
Run a specific workflow against a document:
python main.py --env production --mode workflow --workflow invoice --document /path/to/invoice.pdf
The system can be deployed using Docker:
# Build the Docker image
docker build -t finflow .
# Run with Docker Compose for full environment
docker-compose up -d
- Python 3.10+
- Google Cloud project with the following APIs enabled:
  - Vertex AI API
  - Document AI API
  - BigQuery API
  - Cloud Storage API
- Clone this repository
git clone https://github.com/Thin-Equation/finflow.git
cd finflow
- Create and activate a virtual environment
python -m venv finflow-env
source finflow-env/bin/activate # On Windows: finflow-env\Scripts\activate
- Install dependencies
pip install -r requirements.txt
- Set up your configuration
export FINFLOW_ENV=development
# Optionally create a .env file with your configuration settings
The project includes several tools for dependency management:
For new developers, the easiest way to set up the environment is with our setup script:
# Setup and validate all dependencies at once
./setup_dependencies.sh
This script will:
- Check your Python version
- Create or activate a virtual environment
- Install all dependencies
- Verify dependencies match project requirements
- Create a frozen requirements file for reproducible builds
For more granular dependency management, use the Makefile targets:
# Install dependencies
make install
# Check if all dependencies are properly listed in requirements.txt
make check-deps
# Update dependencies to latest compatible versions
make update-deps
When adding new dependencies to the project:
- Add them to requirements.txt with appropriate version constraints
- Run make check-deps to verify all imports are properly documented
- Consider updating the frozen requirements for reproducibility
To check for outdated dependencies and get recommended updates:
# Show outdated dependencies and safety analysis
make outdated-deps
# or directly
./scripts/check_outdated_deps.py
This tool analyzes your dependencies against semantic versioning constraints and suggests safe updates.
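As a hedged illustration of what a semantic-versioning analysis can look like, the sketch below classifies an available update as major, minor, or patch. It is not the code in check_outdated_deps.py. The functions parse_version, classify_update, and is_safe_update are hypothetical, and the rule that minor and patch bumps count as safe is an assumed policy.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into integers; pre-release tags are not handled."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def classify_update(current: str, latest: str) -> str:
    """Describe the kind of version bump between current and latest."""
    cur, new = parse_version(current), parse_version(latest)
    if new <= cur:
        return "up-to-date"
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    return "patch"


def is_safe_update(current: str, latest: str) -> bool:
    # Assumed policy: only minor and patch bumps are treated as safe.
    return classify_update(current, latest) in {"minor", "patch"}


print(classify_update("1.4.2", "1.5.0"), is_safe_update("1.4.2", "2.0.0"))
```

Under semantic versioning, releases in the 0.x range may introduce breaking changes in minor versions, so a stricter policy would treat those bumps as unsafe as well.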
Use the provided script to run the ADK CLI:
./run_adk.sh
Or to run a specific agent:
./run_adk.sh --agent FinFlow_DocumentProcessor
Run the unit tests:
python -m unittest discover tests
For more comprehensive testing options:
# Run tests with pytest
pytest
# Generate test coverage report
pytest --cov=agents --cov-report=html
For interactive testing with ADK CLI:
# Run the Hello World agent
./run_hello_world.sh
# Run with debug mode
./run_hello_world.sh --debug
For detailed testing information, see Testing Guide and Initial Agent Testing.
The system uses environment-specific YAML configuration files:
- config/development.yaml - Development settings
- config/staging.yaml - Staging settings
- config/production.yaml - Production settings
Configuration can be overridden using local files (e.g., config/development.local.yaml).
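One common way to apply a local override file is a recursive merge in which local values win over the base values. The sketch below assumes both YAML files have already been parsed into dictionaries. The deep_merge function and the example keys (server, bigquery, port, debug, dataset) are hypothetical and may not match FinFlow's real configuration loader or schema.

```python
from copy import deepcopy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which values from override replace or extend base."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # Merge nested sections.
        else:
            merged[key] = deepcopy(value)  # Scalars and lists are replaced outright.
    return merged


# Hypothetical parsed contents of config/development.yaml
base_config = {
    "server": {"port": 8000, "debug": False},
    "bigquery": {"dataset": "finflow_dev"},
}
# Hypothetical parsed contents of config/development.local.yaml
local_config = {"server": {"debug": True}}

config = deep_merge(base_config, local_config)
print(config)
```

Merging into a copy leaves the base configuration untouched, which makes it easier to report which values a local file changed.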
- Configure authentication
- Use pip install -r requirements.txt to install dependencies
- Configure environment variables
- agents/: Agent definitions
- tools/: Custom tools
- models/: Data models
- config/: Configuration files
- tests/: Test cases
- utils/: Utility functions
- app.py: Main application entry point