The FinDoc Analyzer with the dark sidebar UI is the MAIN and DEFAULT application for this project.
This README serves as the definitive guide for the FinDoc Analyzer application, clarifying its status as the main application and documenting the extensive development work completed over the past 30 weeks.
To start the application, run:

    powershell -ExecutionPolicy Bypass -File .\run-findoc-fixed.ps1

This script:
- Stops any existing Node.js and Python processes
- Starts the backend server (Flask) from the `DevDocs/backend` directory
- Starts the frontend server (Next.js) from the `DevDocs/frontend` directory
- Opens the application in the browser at http://localhost:3002
Over the past 30 weeks, we have developed a comprehensive financial document analysis platform with the following components:
- FinDoc Analyzer UI: Dark sidebar interface with comprehensive navigation
- Backend API: Flask-based API for document processing and financial analysis
- Frontend: Next.js application with React components
- Database: Supabase PostgreSQL database for document and financial data storage
- DocumentPreprocessorAgent: Prepares documents for analysis
- HebrewOCRAgent: Optimized OCR for Hebrew financial documents
- FinancialTableDetectorAgent: Identifies and extracts tables from financial documents
- FinancialDataAnalyzerAgent: Analyzes financial data from documents
- DocumentIntegrationAgent: Integrates data from multiple documents
- QueryEngineAgent: Answers questions about financial documents
- ISINExtractorAgent: Extracts ISIN codes from financial documents
- DocumentMergeAgent: Merges data from multiple documents
- DataExportAgent: Exports financial data in various formats
- DocumentComparisonAgent: Compares multiple financial documents
- FinancialReportGeneratorAgent: Generates financial reports
- PortfolioAnalysisAgent: Analyzes investment portfolios
- Document Upload and Storage: Secure document management
- OCR Processing: Advanced OCR with Hebrew optimization
- RAG Multimodal Processing: AI-powered document understanding
- Financial Data Analysis: Securities identification and portfolio metrics
- Query Engine: Natural language questions about financial documents
- Document Comparison: Identify changes and trends across documents
- Data Export: Multiple export formats (Excel, CSV, PDF, JSON)
- Financial Advisor: AI-powered financial recommendations
- Portfolio Analysis: Comprehensive investment portfolio analysis
- ISIN Processing: Extraction and validation of ISIN codes
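
The ISIN extraction and validation step can be illustrated with a short sketch. This is not the project's `ISINExtractorAgent`; it is a minimal standalone example that assumes ISINs appear as uppercase 12-character tokens in already-extracted text. The function names `is_valid_isin` and `extract_isins` are hypothetical. Validation uses the standard ISIN check digit (letters expanded to numbers, then a Luhn checksum).

```python
import re

# Two-letter country prefix, nine alphanumeric characters, one check digit.
ISIN_PATTERN = re.compile(r"\b[A-Z]{2}[A-Z0-9]{9}[0-9]\b")


def is_valid_isin(code: str) -> bool:
    """Return True if `code` has the ISIN shape and a correct check digit."""
    if not ISIN_PATTERN.fullmatch(code):
        return False
    # Expand letters to numbers (A=10 ... Z=35); digits stay as they are.
    digits = "".join(str(int(ch, 36)) for ch in code)
    # Luhn checksum: double every second digit, counting from the right.
    total = 0
    for position, ch in enumerate(reversed(digits)):
        value = int(ch)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def extract_isins(text: str) -> list[str]:
    """Find ISIN-shaped tokens and keep the unique ones that pass the checksum."""
    found = []
    for candidate in ISIN_PATTERN.findall(text):
        if is_valid_isin(candidate) and candidate not in found:
            found.append(candidate)
    return found
```

Checking the checksum in addition to the pattern filters out OCR misreads that still look like ISINs, which matters when the input comes from scanned documents.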
- Frontend: Next.js, React
- Backend: Flask, Python
- Database: Supabase (PostgreSQL)
- AI: OpenRouter API (Claude, GPT-4), RAG (Retrieval-Augmented Generation)
- OCR: Tesseract, Camelot, PDFPlumber, Unstructured
- Deployment: Google Cloud Run
- CI/CD: GitHub Actions
- ALWAYS refer to the FinDoc Analyzer with dark sidebar UI as the main application
- ALL development work must be focused on this application
- NEVER create alternative UIs or applications without explicit approval
- Backend code goes in `DevDocs/backend/`
- Frontend code goes in `DevDocs/frontend/`
- Agent implementations go in `DevDocs/backend/agents/`
- Tests go in `DevDocs/` with appropriate naming (e.g., `test_agent_name.py`)
- The main UI is defined in `DevDocs/frontend/components/FinDocLayout.js`
- The sidebar navigation is defined in this component
- NEVER modify the core UI structure without approval
- New features should be added as pages or components within the existing structure
- All agents must follow the established pattern in `DevDocs/backend/agents/`
- Each agent must have a corresponding test file
- Agents must be integrated into the main application
- New agents must be added to the sidebar navigation
- All code must have corresponding tests
- Tests must verify 100% accuracy in financial document processing
- Tests must check all ISINs, holdings names, values, and quantities
- Run tests before pushing to GitHub
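
One way to check ISINs, holding names, values, and quantities in a test is to compare the extracted holdings against a hand-verified reference list. The sketch below is an illustration only: the helper `diff_holdings` and the dictionary keys (`isin`, `name`, `quantity`, `value`) are assumptions, not the project's actual data model.

```python
from decimal import Decimal


def _normalize(value):
    """Make numbers comparable regardless of int/float form; trim strings."""
    if isinstance(value, (int, float, Decimal)):
        return Decimal(str(value))
    if isinstance(value, str):
        return value.strip()
    return value


def diff_holdings(expected: list[dict], extracted: list[dict]) -> list[str]:
    """Return a list of mismatches between two holdings lists, keyed by ISIN."""
    fields = ("name", "quantity", "value")  # hypothetical field names
    extracted_by_isin = {h["isin"]: h for h in extracted}
    problems = []
    for want in expected:
        got = extracted_by_isin.pop(want["isin"], None)
        if got is None:
            problems.append(f"missing ISIN {want['isin']}")
            continue
        for field in fields:
            if _normalize(want[field]) != _normalize(got.get(field)):
                problems.append(
                    f"{want['isin']}: {field} expected {want[field]!r}, got {got.get(field)!r}"
                )
    # Anything left over was extracted but is not in the reference list.
    for isin in extracted_by_isin:
        problems.append(f"unexpected ISIN {isin}")
    return problems
```

A test can then assert `diff_holdings(expected, extracted) == []`, so that a failure message lists every mismatching field rather than stopping at the first one.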
- NEVER commit API keys or secrets to GitHub
- Use GitHub secrets for CI/CD
- The OpenRouter API key should be stored in GitHub secrets
- Local development should use `.env.local` files (not committed to GitHub)
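
As an illustration of these rules, backend code can read a secret from the process environment (as set from GitHub secrets in CI) and fall back to `.env.local` during local development. The sketch below uses only the standard library; the helper names and the variable name `OPENROUTER_API_KEY` are assumptions, and the project may load its configuration differently.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env.local") -> dict[str, str]:
    """Parse KEY=VALUE lines from a dotenv-style file; a missing file yields {}."""
    values = {}
    file = Path(path)
    if not file.is_file():
        return values
    for raw in file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def get_secret(name: str, env_file: str = ".env.local") -> str:
    """Prefer the process environment, then fall back to the local env file."""
    value = os.environ.get(name) or load_env_file(env_file).get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to {env_file} or the environment")
    return value
```

For example, `get_secret("OPENROUTER_API_KEY")` returns the key without it ever appearing in source code, and raises a clear error if it is missing.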
- The application will be deployed to Google Cloud Run
- Deployment is configured in `cloudbuild.yaml`
- The main branch is the source for deployment
- The service account is `github@github-456508.iam.gserviceaccount.com`
1. Complete OCR Implementation (Week 4)
   - Ensure HebrewOCRAgent is fully functional
   - Verify Tesseract OCR installation and configuration
   - Test OCR with Hebrew financial documents
2. Enhance RAG Processor
   - Add support for more document types
   - Improve multilingual support (especially Hebrew)
   - Enhance document understanding capabilities
3. Improve UI with Visualizations
   - Add interactive charts for portfolio analysis
   - Implement better feedback mechanisms
   - Enhance user experience with visual cues
4. Expand Testing
   - Implement CI/CD integration
   - Create comprehensive test suite for all agents
   - Verify 100% accuracy in financial document processing
5. GitHub Integration
   - Ensure all code is pushed to the main branch
   - Verify GitHub Actions workflows
   - Document all completed work
- Complete HebrewOCRAgent implementation
- Test OCR with various document types
- Integrate OCR results with document processing pipeline
- Implement FinancialTableDetectorAgent
- Enhance table extraction capabilities
- Test with complex financial tables
- Implement PortfolioAnalysisAgent
- Create financial report generation capabilities
- Test with real portfolio data
- Implement remaining agents
- Enhance UI with visualizations
- Improve RAG processor
- Expand testing
- Prepare for deployment
This project is configured for deployment on Google Cloud Run in the me-west1 (Tel Aviv) region:
- Project: github (ID: github-456508, Number: 683496987674)
- Service Account: github@github-456508.iam.gserviceaccount.com (ID: 104645681997583496565)
- Region: me-west1 (Tel Aviv)
- Allows unauthenticated invocations

See `DevDocs/DEPLOYMENT.md` for detailed deployment instructions.
This project uses several API keys and secrets that should never be committed to the repository:
- OpenRouter API Key: `sk-or-v1-[REDACTED]`
- Supabase Database: `db.dnjnsotemnfrjlotgved.supabase.co:5432/postgres`
- Google Cloud Service Account: `github@github-456508.iam.gserviceaccount.com`
- Environment Variables: Store all secrets as environment variables in `.env.local`
- GitHub Secrets: Use GitHub secrets for CI/CD pipelines
- Access Control: Limit access to production environments
- Git Hygiene: Never commit secrets to the repository
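
The Git hygiene rule can be backed by a simple automated check that scans files for secret-like strings before they are committed. The following is a rough sketch, not a replacement for a dedicated secret scanner. The `sk-or-v1-` prefix followed by hexadecimal characters is an assumption about the OpenRouter key format, and the function names are hypothetical.

```python
import re
from pathlib import Path

# Assumed patterns; extend this mapping for other credentials the project uses.
SECRET_PATTERNS = {
    "OpenRouter API key": re.compile(r"sk-or-v1-[0-9a-f]{32,}"),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, label) for every line matching a secret pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits


def scan_files(paths: list[str]) -> int:
    """Print findings for the given files; return 1 if any were found, else 0."""
    found = False
    for path in paths:
        text = Path(path).read_text(encoding="utf-8", errors="ignore")
        for number, label in find_secrets(text):
            print(f"{path}:{number}: possible {label}")
            found = True
    return 1 if found else 0
```

The return value of `scan_files` is meant to be used as an exit code, so the function could be called from a pre-commit hook or a CI step that fails when it returns 1.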
For questions or support, please contact aviadkim@gmail.com.
