A Python tool for consolidating and analyzing financial transactions from multiple file formats.
Financial Consolidator processes bank statements and transaction exports from multiple sources, providing:
- Multi-format parsing - CSV, OFX/QFX, Excel, and PDF files
- Automatic categorization - Rule-based system with manual override support
- Duplicate detection - Fuzzy matching across files to flag duplicates
- Anomaly detection - Flags large transactions, fees, cash advances, and date gaps
- Financial reporting - Excel workbooks with P&L summaries and detailed analysis

Key features:
- Multi-format CSV parsing - Recognizes Chase, Bank of America, Wells Fargo, Capital One, and other common CSV formats
- Flexible categorization - Priority-ordered rules with regex pattern matching
- AI-powered categorization - Optional Claude AI integration for uncategorized transactions
- Fuzzy duplicate detection - Configurable similarity threshold and date tolerance
- Running balance calculation - Per-account balance tracking
- Excel output - Styled workbooks with multiple analysis sheets
- CSV export - Google Sheets compatible output files
- Interactive mode - Prompts to create accounts and map files on first run
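
As a rough illustration of the fuzzy duplicate detection feature, the sketch below compares two transactions using a similarity threshold and a date tolerance. The `Txn` record, the `is_probable_duplicate` function, and the default values are hypothetical, and `difflib.SequenceMatcher` is a stand-in for whatever matcher the tool actually uses.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Txn:
    """Hypothetical minimal transaction record (not the tool's own model)."""
    day: date
    amount: Decimal
    description: str


def is_probable_duplicate(a, b, similarity_threshold=0.85, date_tolerance_days=1):
    """Return True when two transactions look like the same event.

    Assumed criteria: equal amounts, dates within the tolerance, and
    descriptions whose similarity ratio meets the threshold.
    """
    if a.amount != b.amount:
        return False
    if abs((a.day - b.day).days) > date_tolerance_days:
        return False
    ratio = SequenceMatcher(None, a.description.upper(), b.description.upper()).ratio()
    return ratio >= similarity_threshold
```

Requiring an exact amount match before the string comparison keeps the check cheap; a real implementation might relax that, so treat it as one possible design rather than the tool's behavior.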
# Clone repository
git clone <repo-url>
cd financial-disclosure
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Or install as editable package
pip install -e .

Dependencies:
- pandas >= 2.0
- openpyxl >= 3.1
- pdfplumber >= 0.10
- ofxparse >= 0.21
- pyyaml >= 6.0
- rich >= 13.0
- anthropic >= 0.39.0 (optional, for AI categorization)
- python-dotenv >= 1.0.0
# Basic usage - outputs to analysis/analysis_YYYYMMDD_HHMMSS.csv
financial-consolidator -i ./bank_statements
# With date range filter
financial-consolidator -i ./statements \
--start-date 2024-01-01 --end-date 2024-12-31
# Also export Excel workbook alongside CSV
financial-consolidator -i ./statements --xlsx
# Custom output path
financial-consolidator -i ./statements -o my_report.csv
# Legacy Excel output with CSV export
financial-consolidator -i ./statements -o report.xlsx --csv
# Non-interactive mode (skip unmapped files)
financial-consolidator -i ./statements --no-interactive
# Verbose output for debugging
financial-consolidator -i ./statements -vv

Configuration files are stored in the config/ directory:
Global application settings (settings.yaml):
output:
  format: "xlsx"
  date_format: "%Y-%m-%d"
  currency_symbol: "$"
  decimal_places: 2
anomaly_detection:
  large_transaction_threshold: 5000.00
  date_gap_warning_days: 7
  date_gap_alert_days: 30
  fee_keywords:
    - FEE
    - CHARGE
    - PENALTY
    - OVERDRAFT
  cash_advance_keywords:
    - CASH ADVANCE
    - ATM WITHDRAWAL
logging:
  level: "INFO"
  file: "financial_consolidator.log"

Account definitions and file mappings (accounts.yaml):
accounts:
  checking_main:
    name: "Primary Checking"
    type: checking
    institution: "Chase"
    source_file_patterns:
      - "*chase*checking*.csv"
    opening_balance: 1000.00
    opening_balance_date: "2024-01-01"
  credit_card:
    name: "Rewards Card"
    type: credit_card
    institution: "Bank of America"
file_mappings:
  "Chase1234_Activity.csv": "checking_main"

Category hierarchy and categorization rules (categories.yaml):
categories:
  - id: income_salary
    name: "Salary"
    type: income
  - id: expense_groceries
    name: "Groceries"
    type: expense
  - id: expense_utilities
    name: "Utilities"
    type: expense
    parent_id: expense_housing
rules:
  - id: salary_direct_deposit
    category_id: income_salary
    priority: 100
    keywords:
      - DIRECT DEPOSIT
      - PAYROLL
  - id: grocery_stores
    category_id: expense_groceries
    priority: 50
    keywords:
      - KROGER
      - WHOLE FOODS
      - TRADER JOE

Manual overrides for specific transactions (manual_categories.yaml):
overrides:
  - category_id: expense_travel
    priority: 1000
    date: "2024-03-15"
    amount: -500.00
    description_pattern: "HOTEL"

Store sensitive credentials in a .env file (gitignored):
# Copy the template
cp .env.example .env
# Edit with your API key
ANTHROPIC_API_KEY=sk-ant-api03-your-key-here

The .env file is automatically loaded at startup.
The tool includes optional AI-powered categorization using Claude to:
- Categorize uncategorized transactions - AI analyzes transaction descriptions and suggests categories
- Validate low-confidence categorizations - AI reviews rule-based assignments with confidence below threshold

To set it up:
- Get an API key from the Anthropic Console
- Create a .env file with your key:
  cp .env.example .env
  # Edit .env and add your ANTHROPIC_API_KEY

# Enable all AI features
financial-consolidator -i ./statements --ai
# AI dry-run (show costs without making API calls)
financial-consolidator -i ./statements --ai --ai-dry-run
# Only categorize uncategorized transactions
financial-consolidator -i ./statements --ai-categorize
# Only validate low-confidence categorizations
financial-consolidator -i ./statements --ai-validate
# Set budget limit (default: $5.00)
financial-consolidator -i ./statements --ai --ai-budget 2.00
# Skip confirmation prompts
financial-consolidator -i ./statements --ai --skip-ai-confirm

- Estimated costs are shown before API calls
- Default budget limit: $5.00 per run
- Confirmation required before proceeding (unless --skip-ai-confirm)
- Rate limiting: 20 requests/minute
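
The cost controls above can be pictured with a small budget guard. This is only a sketch: `BudgetTracker`, `BudgetExceeded`, and `MIN_INTERVAL_SECONDS` are hypothetical names, and how the tool estimates the cost of a request is not described here, so the estimate is simply passed in.

```python
from decimal import Decimal


class BudgetExceeded(Exception):
    """Raised when a planned request would push spend past the limit."""


class BudgetTracker:
    """Hypothetical per-run spend tracker (default limit mirrors --ai-budget)."""

    def __init__(self, limit=Decimal("5.00")):
        self.limit = limit
        self.spent = Decimal("0")

    def reserve(self, estimated_cost):
        """Record an estimated cost, refusing it if the budget would be exceeded."""
        if self.spent + estimated_cost > self.limit:
            raise BudgetExceeded(
                f"estimated total {self.spent + estimated_cost} exceeds budget {self.limit}"
            )
        self.spent += estimated_cost


# 20 requests/minute corresponds to at least 3 seconds between calls.
MIN_INTERVAL_SECONDS = 60 / 20
```

Checking the estimate before the call, rather than the actual cost afterwards, is what allows a dry run to report costs without contacting the API.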
| Option | Description |
|---|---|
| -i, --input-dir PATH | Directory containing transaction files (required) |
| -o, --output PATH | Output file path (default: analysis/analysis_YYYYMMDD_HHMMSS.csv) |
| --config-dir PATH | Configuration directory (default: ./config) |
| --config PATH | Path to settings.yaml |
| --accounts PATH | Path to accounts.yaml |
| --categories PATH | Path to categories.yaml |
| --start-date DATE | Filter transactions from this date (YYYY-MM-DD) |
| --end-date DATE | Filter transactions until this date (YYYY-MM-DD) |
| --xlsx | Also export Excel workbook (when using CSV output) |
| --csv | Also export CSV files (when using .xlsx output) |
| --no-interactive | Skip prompts for unmapped files |
| --strict | Abort on first parse error |
| --dry-run | Parse files without generating output |
| --validate-only | Validate configuration files only |
| --large-transaction-threshold AMOUNT | Override large transaction threshold |
| --ai | Enable all AI categorization features |
| --ai-validate | Only validate low-confidence categorizations |
| --ai-categorize | Only categorize uncategorized transactions |
| --ai-budget AMOUNT | Maximum USD to spend on AI (default: $5.00) |
| --ai-dry-run | Show AI costs without making API calls |
| --ai-confidence FLOAT | Validation threshold (default: 0.7) |
| --skip-ai-confirm | Skip AI confirmation prompts |
| --export-uncategorized PATH | Export uncategorized transactions for review |
| --export-summary PATH | Export categorization summary |
| -v, --verbose | Increase verbosity (-v, -vv, -vvv) |
| --version | Show version |
| -h, --help | Show help |

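
For readers curious how a handful of these options could be declared, here is a minimal `argparse` sketch. It is not the tool's actual `cli.py`; `build_parser` is a hypothetical helper covering only a subset of the options, with ISO date parsing assumed for the date filters.

```python
import argparse
from datetime import date


def build_parser():
    """Declare a subset of the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="financial-consolidator")
    parser.add_argument("-i", "--input-dir", required=True,
                        help="Directory containing transaction files")
    parser.add_argument("-o", "--output", help="Output file path")
    # date.fromisoformat accepts YYYY-MM-DD and rejects malformed input.
    parser.add_argument("--start-date", type=date.fromisoformat)
    parser.add_argument("--end-date", type=date.fromisoformat)
    parser.add_argument("--xlsx", action="store_true")
    parser.add_argument("--no-interactive", action="store_true")
    # action="count" turns -v, -vv, -vvv into 1, 2, 3.
    parser.add_argument("-v", "--verbose", action="count", default=0)
    return parser


args = build_parser().parse_args(
    ["-i", "./statements", "--start-date", "2024-01-01", "-vv"]
)
```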
| Format | Extensions | Notes |
|---|---|---|
| CSV | .csv | Auto-detects Chase, Bank of America, Wells Fargo, and generic formats |
| OFX/QFX | .ofx, .qfx | Open Financial Exchange standard |
| Excel | .xlsx | Multi-sheet workbook support |
| PDF | .pdf | Bank statement extraction |
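
A first pass at format detection can be as simple as an extension lookup. The sketch below assumes that approach; `PARSER_BY_EXTENSION` and `detect_format` are hypothetical names, `.pdf` is assumed as the PDF extension, and the real `detector.py` may also inspect file contents (for example, to tell the bank-specific CSV layouts apart).

```python
from pathlib import Path

# Hypothetical mapping from file extension to a parser key.
PARSER_BY_EXTENSION = {
    ".csv": "csv",
    ".ofx": "ofx",
    ".qfx": "ofx",   # QFX files are handled like OFX in this sketch
    ".xlsx": "excel",
    ".pdf": "pdf",
}


def detect_format(path):
    """Return a parser key for the file, or None if the extension is unsupported."""
    return PARSER_BY_EXTENSION.get(Path(path).suffix.lower())
```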
By default, output files are written to the analysis/ directory with timestamped filenames (e.g., analysis/analysis_20240115_143022_*.csv). Use -o to specify a custom path.
CSV files are generated by default for easy import into Google Sheets:
- {base}_pl_summary.csv - P&L summary
- {base}_all_transactions.csv - All transactions
- {base}_account_{name}.csv - Per-account sheets
- {base}_category_analysis.csv - Category breakdown
- {base}_anomalies.csv - Flagged items
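
To make the naming scheme concrete, the following sketch builds the default timestamped base and the CSV file names derived from it. `default_base` and `csv_paths` are hypothetical helpers, and using the account ID for `{name}` is an assumption; the tool may use a different account label.

```python
from datetime import datetime
from pathlib import PurePosixPath

# Suffixes taken from the list of generated CSV files.
SHEET_SUFFIXES = ["pl_summary", "all_transactions", "category_analysis", "anomalies"]


def default_base(now):
    """Build the default base path, analysis/analysis_YYYYMMDD_HHMMSS."""
    return PurePosixPath("analysis") / f"analysis_{now:%Y%m%d_%H%M%S}"


def csv_paths(base, account_names):
    """List the CSV file names for a base path and a set of accounts."""
    suffixes = SHEET_SUFFIXES + [f"account_{name}" for name in account_names]
    return [f"{base}_{suffix}.csv" for suffix in suffixes]


base = default_base(datetime(2024, 1, 15, 14, 30, 22))
paths = csv_paths(base, ["checking_main"])
```

Here `paths[0]` is `analysis/analysis_20240115_143022_pl_summary.csv`, which matches the timestamped pattern described for the default output.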
Use --xlsx to also generate an Excel workbook.
To write an Excel workbook, pass -o report.xlsx or add the --xlsx flag. The workbook contains multiple sheets:
- P&L Summary - Income vs. expenses breakdown by category
- All Transactions - Master list with all columns (date, account, description, category, amount, balance, flags)
- [Account Name] - Per-account transaction history (one sheet per account)
- Category Analysis - Monthly spending by category
- Anomalies - Flagged transactions and date gaps
financial-disclosure/
├── config/ # Configuration files
│ ├── settings.yaml
│ ├── accounts.yaml
│ ├── categories.yaml
│ └── manual_categories.yaml
├── src/financial_consolidator/
│ ├── __init__.py
│ ├── cli.py # Command-line interface
│ ├── config.py # Configuration loading
│ ├── models/ # Data models
│ │ ├── transaction.py
│ │ ├── account.py
│ │ └── category.py
│ ├── parsers/ # File format parsers
│ │ ├── detector.py # Format detection
│ │ ├── csv_parser.py
│ │ ├── ofx_parser.py
│ │ ├── excel_parser.py
│ │ └── pdf_parser.py
│ ├── processing/ # Transaction processing
│ │ ├── normalizer.py
│ │ ├── categorizer.py
│ │ ├── deduplicator.py
│ │ ├── anomaly_detector.py
│ │ └── balance_calculator.py
│ ├── output/ # Report generation
│ │ ├── excel_writer.py
│ │ └── csv_exporter.py
│ └── utils/ # Utilities
│ ├── date_utils.py
│ ├── decimal_utils.py
│ └── logging_config.py
├── requirements.txt
├── pyproject.toml
└── README.md
- File Discovery - Scan input directory for supported file types
- Format Detection - Identify appropriate parser for each file
- Account Mapping - Associate files with configured accounts
- Parsing - Extract raw transactions from each file
- Normalization - Standardize dates, amounts, and formats
- Categorization - Apply rules and manual overrides
- Deduplication - Flag duplicate transactions
- Balance Calculation - Compute running balances per account
- Anomaly Detection - Flag suspicious transactions
- Output Generation - Create Excel workbook and CSV files
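
The categorization step can be sketched as a priority-ordered scan over rules shaped like the categories.yaml example. This is a simplified illustration: the `Rule` class, the optional `pattern` field, and the assumption that a higher `priority` value wins are all hypothetical, and manual overrides are left out.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Hypothetical in-memory form of one entry under `rules:`."""
    id: str
    category_id: str
    priority: int
    keywords: list = field(default_factory=list)
    pattern: str = ""  # assumed optional regex; not shown in the YAML example


def categorize(description, rules):
    """Return the category of the first matching rule, highest priority first."""
    text = description.upper()
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if any(keyword in text for keyword in rule.keywords):
            return rule.category_id
        if rule.pattern and re.search(rule.pattern, description, re.IGNORECASE):
            return rule.category_id
    return None  # left uncategorized


RULES = [
    Rule("grocery_stores", "expense_groceries", 50,
         ["KROGER", "WHOLE FOODS", "TRADER JOE"]),
    Rule("salary_direct_deposit", "income_salary", 100,
         ["DIRECT DEPOSIT", "PAYROLL"]),
]
```

Returning `None` for unmatched descriptions leaves room for a later stage, such as the optional AI categorization, to handle whatever the rules did not cover.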
MIT License - see LICENSE file for details.