# 🤖 Agentic Financial AI Assistant - RAG System

This Financial AI Assistant uses a **Retrieval Augmented Generation (RAG)** approach: it answers questions using only **your own uploaded financial data**.

## 📊 How It Works:
1. **Upload your financial data** (CSV or PDF files)
2. **Ask questions** about your finances
3. **Get personalized insights** based solely on your uploaded data

## 📁 Supported File Types:
- **CSV files**: Transaction data with columns like date, amount, category/description
- **PDF files**: Bank statements and financial reports (automatically parsed)
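
For illustration, a CSV matching the description above might look like the sample in this sketch. The column names and rows are hypothetical, not taken from the project, and the snippet uses only Python's standard `csv` module to show the expected shape rather than the project's own parser.

```python
import csv
import io

# Hypothetical sample; real files may use different column names.
SAMPLE_CSV = """date,amount,category,description
2025-01-03,-42.50,Groceries,Supermarket
2025-01-05,-15.00,Transport,Bus pass
2025-01-31,2500.00,Income,Salary
"""

def read_transactions(text):
    """Parse CSV text into a list of dicts with a float 'amount'."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["amount"] = float(row["amount"])
        rows.append(row)
    return rows

transactions = read_transactions(SAMPLE_CSV)
print(len(transactions), "rows; net amount:", sum(t["amount"] for t in transactions))
```

Negative amounts stand for spending and positive amounts for income in this sample; that sign convention is an assumption of the sketch.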

## ⚠️ Important:
- This system provides **NO analysis without your data**
- All insights are generated from **your uploaded files only**
- No sample or demo data is used - everything is based on your actual financial information

# Agentic Financial AI Chatbot

This notebook provides a comprehensive financial analysis and budgeting assistant powered by AI.

## Workflow Overview:
1. **Initialize Libraries & Models** - Load required dependencies and AI models
2. **Data Input** - Upload or select CSV/PDF financial data files
3. **Data Processing** - Parse and analyze financial data
4. **Visualization** - Generate interactive charts and graphs
5. **AI Chat Interface** - Interactive assistant for financial advice and insights

> **Models Used**: GPT-2 and DistilBERT (no authentication required)

In [2]:
# Change to the agentic-ai directory
import os
import sys

# Try to find the agentic-ai directory
possible_paths = [
    './agentic-ai',
    '../agentic-ai', 
    './financial-agentic-ai',
    '../financial-agentic-ai',
    '.'  # Current directory if already in project
]

project_found = False
for path in possible_paths:
    if os.path.exists(path) and os.path.exists(os.path.join(path, 'agents')):
        os.chdir(path)
        project_found = True
        print(f"‚úÖ Found project directory: {os.path.abspath(path)}")
        break

if not project_found:
    print("‚ùå Project directory not found. Please ensure you have cloned the repository.")
    print("üí° Available directories:")
    for item in os.listdir('.'):
        if os.path.isdir(item):
            print(f"   üìÅ {item}")
else:
    print(f"üìÅ Current working directory: {os.getcwd()}")
    
    # List project structure
    print(f"\nüìÇ Project structure:")
    for item in os.listdir('.'):
        if os.path.isdir(item):
            print(f"   üìÅ {item}/")
        elif item.endswith(('.py', '.ipynb', '.txt', '.md')):
            print(f"   üìÑ {item}")
            
    print(f"\n‚úÖ Ready to proceed with setup!")

✅ Found project directory: f:\agentic-ai
📁 Current working directory: f:\agentic-ai

📂 Project structure:
   📁 .git/
   📁 agents/
   📁 analysis/
   📄 cleanup.py
   📁 data/
   📁 llms/
   📄 main.ipynb
   📁 models/
   📁 parsers/
   📄 README.md
   📄 requirements.txt
   📄 test_setup.py
   📁 utils/
   📁 venv/
   📁 visualizations/

✅ Ready to proceed with setup!


In [3]:
!pip install -r requirements.txt




[notice] A new release of pip is available: 24.0 -> 25.2
[notice] To update, run: python.exe -m pip install --upgrade pip


In [1]:
!python test_setup.py

🔄 Importing project modules...
✅ Imported 10 modules successfully
2025-08-09 01:16:02,397 - __main__ - INFO - Starting Financial Bot Setup Tests...
🚀 STARTING COMPREHENSIVE AGENTIC FINANCIAL AI TESTING

📋 PHASE 1: SYSTEM COMPONENT TESTING

🔧 Running Advanced System Validation Tests...
✅ Test CSV created at: f:\agentic-ai\data\test_budget.csv
2025-08-09 01:16:02,400 - root - INFO - Testing system requirements...
2025-08-09 01:16:02,404 - root - INFO - Downloading and setting up models...
2025-08-09 01:16:02,404 - root - INFO - Downloading gpt2 model...
2025-08-09 01:16:02,955 - root - INFO - Downloading distilbert model...
2025-08-09 01:16:03,285 - root - INFO - Testing model initialization...
2025-08-09 01:16:03,285 - llms.gpt2_wrapper - INFO - Initializing GPT-2 model from gpt2
2025-08-09 01:16:05,664 - llms.gpt2_wrapper - INFO - Successfully loaded GPT-2 model
2025-08-09 01:16:05,666 - root - INFO - ✅ GPT2Wrapper initialized successfully
2025-08-09 01:16:05,666 - ll

Device set to use cpu

'M' is deprecated and will be removed in a future version, please use 'ME' instead.


'M' is deprecated and will be removed in a future version, please use 'ME' instead.





In [4]:
# ========================================
# 1. INITIALIZE LIBRARIES AND DEPENDENCIES
# ========================================

import os
import sys
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

# Get the current working directory and add it to Python path
current_dir = os.getcwd()
if current_dir not in sys.path:
    sys.path.insert(0, current_dir)

# Also add parent directory if needed
parent_dir = str(Path(current_dir).parent)
if parent_dir not in sys.path:
    sys.path.insert(0, parent_dir)

print(f"üìÅ Current working directory: {current_dir}")
print(f"üêç Python path includes: {current_dir}")

# Core libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
from IPython.display import display, HTML

print("‚úì Core libraries imported successfully!")

# Check if project modules are available before importing
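def check_and_import_modules():
    """Check if project modules exist and import them safely"""
    # Imports inside a function bind local names by default; declare them
    # global so later cells can use the imported classes.
    global FinancialAgent, GPT2Wrapper, DistilBertWrapper
    global BudgetCalculator, TrendAnalyzer, ChartGenerator
    global CSVParser, PDFParser, FileLoader, Helpers
    modules_status = {}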
    
    try:
        # Check if agents directory exists
        if os.path.exists(os.path.join(current_dir, 'agents')):
            from agents.financial_agent import FinancialAgent
            modules_status['FinancialAgent'] = '‚úì'
        else:
            modules_status['FinancialAgent'] = '‚ùå agents/ directory not found'
    except ImportError as e:
        modules_status['FinancialAgent'] = f'‚ùå {str(e)}'
    
    try:
        # Check if llms directory exists
        if os.path.exists(os.path.join(current_dir, 'llms')):
            from llms.gpt2_wrapper import GPT2Wrapper
            from llms.distilbert_wrapper import DistilBertWrapper
            modules_status['LLM Wrappers'] = '‚úì'
        else:
            modules_status['LLM Wrappers'] = '‚ùå llms/ directory not found'
    except ImportError as e:
        modules_status['LLM Wrappers'] = f'‚ùå {str(e)}'
    
    try:
        # Analysis modules
        if os.path.exists(os.path.join(current_dir, 'analysis')):
            from analysis.budget_calculator import BudgetCalculator
            from analysis.trend_analyzer import TrendAnalyzer
            modules_status['Analysis Modules'] = '‚úì'
        else:
            modules_status['Analysis Modules'] = '‚ùå analysis/ directory not found'
    except ImportError as e:
        modules_status['Analysis Modules'] = f'‚ùå {str(e)}'
    
    try:
        # Visualization modules
        if os.path.exists(os.path.join(current_dir, 'visualizations')):
            from visualizations.chart_generator import ChartGenerator
            modules_status['Visualization Modules'] = '‚úì'
        else:
            modules_status['Visualization Modules'] = '‚ùå visualizations/ directory not found'
    except ImportError as e:
        modules_status['Visualization Modules'] = f'‚ùå {str(e)}'
    
    try:
        # File processing modules
        if os.path.exists(os.path.join(current_dir, 'parsers')) and os.path.exists(os.path.join(current_dir, 'utils')):
            from parsers.csv_parser import CSVParser
            from parsers.pdf_parser import PDFParser
            from utils.file_loader import FileLoader
            from utils.helpers import Helpers
            modules_status['File Processing'] = '‚úì'
        else:
            missing = []
            if not os.path.exists(os.path.join(current_dir, 'parsers')):
                missing.append('parsers/')
            if not os.path.exists(os.path.join(current_dir, 'utils')):
                missing.append('utils/')
            modules_status['File Processing'] = f'‚ùå Missing directories: {", ".join(missing)}'
    except ImportError as e:
        modules_status['File Processing'] = f'‚ùå {str(e)}'
    
    return modules_status

# Check module imports
print("\nüîç Checking project modules...")
modules_status = check_and_import_modules()

for module_group, status in modules_status.items():
    print(f"   {module_group}: {status}")

# Count successful imports
successful_imports = sum(1 for status in modules_status.values() if status == '‚úì')
total_modules = len(modules_status)

print(f"\nüìä Import Summary: {successful_imports}/{total_modules} module groups imported successfully")

if successful_imports == total_modules:
    print("üéâ All modules imported successfully! You can proceed with the financial analysis.")
else:
    print("‚ö†Ô∏è  Some modules failed to import. You may need to:")
    print("   1. Make sure you're in the correct project directory")
    print("   2. Check that all project files are present")
    print("   3. Run the test_setup.py script first to validate the setup")
    print(f"\nüí° Current directory structure:")
    for item in os.listdir(current_dir):
        if os.path.isdir(os.path.join(current_dir, item)):
            print(f"   üìÅ {item}/")
        else:
            print(f"   üìÑ {item}")

📁 Current working directory: f:\agentic-ai
🐍 Python path includes: f:\agentic-ai
✓ Core libraries imported successfully!

🔍 Checking project modules...
   FinancialAgent: ✓
   LLM Wrappers: ✓
   Analysis Modules: ✓
   Visualization Modules: ✓
   File Processing: ✓

📊 Import Summary: 5/5 module groups imported successfully
🎉 All modules imported successfully! You can proceed with the financial analysis.


In [5]:
# ========================================
# 2. DEPENDENCY VALIDATION
# ========================================

def check_dependencies():
    """Check if all required packages are installed"""
    
    print("üì¶ Checking dependencies...")
    print("-" * 50)
    
    required_packages = {
        'pandas': 'Data manipulation and analysis',
        'numpy': 'Numerical computations',
        'matplotlib': 'Basic plotting',
        'plotly': 'Interactive visualizations',
        'transformers': 'AI model support',
        'torch': 'PyTorch for AI models',
        'PyPDF2': 'PDF file processing',
        'openpyxl': 'Excel file support',
        'scipy': 'Scientific computing',
        'scikit-learn': 'Machine learning utilities'
    }
    
    optional_packages = {
        'pdfplumber': 'Advanced PDF parsing',
        'tabula': 'PDF table extraction',
        'seaborn': 'Statistical visualization'
    }
    
    missing_packages = []
    installed_packages = []
    
    print("üì¶ Package Status:")
    print("-" * 50)
    
    # Check required packages
    for package, description in required_packages.items():
        try:
            if package == 'scikit-learn':
                import sklearn
            elif package == 'PyPDF2':
                import PyPDF2
            else:
                __import__(package)
            print(f"‚úÖ {package} - {description}")
            installed_packages.append(package)
        except ImportError:
            print(f"‚ùå {package} - {description}")
            missing_packages.append(package)
    
    # Check optional packages
    if optional_packages:
        print(f"\nüì¶ Optional Packages:")
        for package, description in optional_packages.items():
            try:
                __import__(package)
                print(f"‚úÖ {package} - {description}")
            except ImportError:
                print(f"‚ö†Ô∏è  {package} - {description} (optional)")
    
    # Summary
    if missing_packages:
        print(f"\n‚ö†Ô∏è  Missing Critical Packages:")
        for package in missing_packages:
            print(f"‚ùå {package} - {required_packages[package]}")
        
        print(f"\nüí° To install missing packages, run in terminal:")
        print(f"   pip install {' '.join(missing_packages)}")
        print(f"\n   Or install everything from requirements.txt:")
        print(f"   pip install -r requirements.txt")
        
        return False
    else:
        print(f"\nüéâ All required dependencies are installed!")
        return True

# Run dependency check
dependencies_ok = check_dependencies()

if not dependencies_ok:
    print(f"\nüõë Please install missing packages and restart the notebook.")
    print(f"üí° You can continue with limited functionality, but some features may not work.")
else:
    print(f"\n‚úÖ Dependencies verified - Ready to proceed!")

📦 Checking dependencies...
--------------------------------------------------
📦 Package Status:
--------------------------------------------------
✅ pandas - Data manipulation and analysis
✅ numpy - Numerical computations
✅ matplotlib - Basic plotting
✅ plotly - Interactive visualizations
✅ transformers - AI model support
✅ torch - PyTorch for AI models
✅ PyPDF2 - PDF file processing
✅ openpyxl - Excel file support
✅ scipy - Scientific computing
✅ scikit-learn - Machine learning utilities

📦 Optional Packages:
✅ pdfplumber - Advanced PDF parsing
✅ tabula - PDF table extraction
✅ seaborn - Statistical visualization

🎉 All required dependencies are installed!

✅ Dependencies verified - Ready to proceed!

# 2. Initialize AI Models

Loading the language models that will power the financial assistant:

- **GPT-2**: Primary model for text generation and conversational AI
- **DistilBERT**: Lightweight model for specific analysis tasks

After the initial download of the model weights, these models run locally and require no authentication.

In [None]:
# ========================================
# 2. INITIALIZE AI MODELS
# ========================================

print("ü§ñ Initializing AI Models...")
print("-" * 50)

# Initialize GPT-2 model
print("Loading GPT-2 model...")
try:
    gpt2_model = GPT2Wrapper("gpt2")
    print("‚úì GPT-2 model loaded successfully!")
except Exception as e:
    print(f"‚ùå Error loading GPT-2: {e}")
    gpt2_model = None

# Initialize DistilBERT model  
print("Loading DistilBERT model...")
try:
    distilbert_model = DistilBertWrapper("distilbert-base-uncased")
    print("‚úì DistilBERT model loaded successfully!")
except Exception as e:
    print(f"‚ùå Error loading DistilBERT: {e}")
    distilbert_model = None

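# Create financial agents (left as None when a model failed to load,
# so later cells can test for availability instead of hitting a NameError)
print("Creating financial agents...")
gpt2_agent = FinancialAgent(gpt2_model) if gpt2_model else None
if gpt2_agent is not None:
    print("✓ GPT-2 Financial Agent ready!")

distilbert_agent = FinancialAgent(distilbert_model) if distilbert_model else None
if distilbert_agent is not None:
    print("✓ DistilBERT Financial Agent ready!")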

# Initialize analysis components
budget_calc = BudgetCalculator()
trend_analyzer = TrendAnalyzer()
chart_gen = ChartGenerator()
file_loader = FileLoader()

print("‚úì All components initialized successfully!")
print("=" * 50)

# 3. RAG-Based Data Input

This system works as a **RAG (Retrieval Augmented Generation) model** that analyzes only your uploaded financial data.

## Supported Formats:
- **CSV files**: Must contain columns for date, amount, and category/description
- **PDF files**: Bank statements and financial reports (with intelligent parsing)

## RAG Model Features:
- 🔍 **Intelligent Data Discovery**: Automatically detects and maps your data columns
- 📊 **Quality Assessment**: Validates data completeness and suggests improvements
- 🤖 **Context-Aware Analysis**: LLM analyzes patterns found in YOUR specific data
- 💬 **Conversational Interface**: Ask questions about your actual financial data

## Required CSV Structure:
Your CSV should contain financial transaction data with columns like:
- **Date column**: Transaction dates (auto-detected format)
- **Amount column**: Transaction amounts (positive/negative values)
- **Category/Description**: Spending categories or transaction descriptions

> **Note**: The system intelligently detects column names and data patterns. Column names can be flexible (e.g., 'date', 'transaction_date', 'amount', 'value', 'category', 'description', etc.)
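
One way flexible column detection like this could work is an alias lookup on normalized header names. The sketch below is only an illustration under that assumption: `COLUMN_ALIASES` and `map_columns` are hypothetical names, and the project's `CSVParser` may use different rules.

```python
# Hypothetical alias table; the project's parser may use different rules.
COLUMN_ALIASES = {
    "date": {"date", "transaction_date", "posted", "posting_date"},
    "amount": {"amount", "value", "debit", "total"},
    "category": {"category", "description", "memo", "details"},
}

def map_columns(headers):
    """Map each canonical field to the first header whose normalized name is a known alias."""
    mapping = {}
    for header in headers:
        normalized = header.strip().lower().replace(" ", "_")
        for field, aliases in COLUMN_ALIASES.items():
            if field not in mapping and normalized in aliases:
                mapping[field] = header
                break
    return mapping

print(map_columns(["Transaction Date", "Value", "Description", "Balance"]))
```

Headers that match no alias (such as `Balance` here) are simply left unmapped, so a caller can report which required fields are still missing.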

## How it works:
1. Upload your actual financial data (CSV/PDF)
2. System intelligently parses and validates your data
3. LLM learns from your specific spending patterns
4. Ask questions and get insights based on YOUR data only
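
The retrieval step in the flow above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes simple keyword overlap for retrieval, and `tokenize`, `retrieve`, and `build_prompt` are hypothetical helper names. The resulting prompt would then be passed to whichever language model is in use.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, transactions, top_k=3):
    """Return up to top_k transactions sharing the most tokens with the question."""
    q_tokens = tokenize(question)
    scored = []
    for tx in transactions:
        overlap = len(q_tokens & tokenize(f"{tx['category']} {tx['description']}"))
        if overlap:
            scored.append((overlap, tx))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tx for _, tx in scored[:top_k]]

def build_prompt(question, context_rows):
    """Assemble a prompt that restricts the model to the retrieved rows."""
    lines = [
        f"- {tx['date']}: {tx['amount']:.2f} ({tx['category']}, {tx['description']})"
        for tx in context_rows
    ]
    context = "\n".join(lines) if lines else "(no matching transactions)"
    return f"Answer using only these transactions:\n{context}\n\nQuestion: {question}"

# Hypothetical data for illustration
transactions = [
    {"date": "2025-01-03", "amount": -42.50, "category": "Groceries", "description": "Supermarket"},
    {"date": "2025-01-05", "amount": -15.00, "category": "Transport", "description": "Bus pass"},
    {"date": "2025-01-12", "amount": -61.20, "category": "Groceries", "description": "Farmers market"},
]
question = "How much did I spend on groceries?"
print(build_prompt(question, retrieve(question, transactions)))
```

Keyword overlap is the simplest possible retriever; an embedding-based similarity search would be a common alternative, at the cost of an extra model dependency.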

# 4. Data Processing & Analysis

Once your data is loaded, we'll process it through multiple analysis stages:

1. **Data Validation**: Ensure proper format and clean any inconsistencies
2. **Budget Calculation**: Analyze spending patterns by category
3. **Trend Analysis**: Identify spending trends over time
4. **Financial Health Check**: Calculate savings rate and financial ratios
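
As a worked example of the savings rate mentioned in the last stage, the sketch below uses the common definition (income minus expenses, divided by income). It assumes positive amounts are income and negative amounts are expenses; `savings_rate` is a hypothetical helper, not necessarily how `BudgetCalculator` computes it.

```python
def savings_rate(amounts):
    """Return (income - expenses) / income, or None when there is no income.

    Assumes positive amounts are income and negative amounts are expenses.
    """
    income = sum(a for a in amounts if a > 0)
    expenses = -sum(a for a in amounts if a < 0)
    if income == 0:
        return None
    return (income - expenses) / income

# Hypothetical month: 2500 income, 1750 spent
rate = savings_rate([2500.00, -1200.00, -400.00, -150.00])
print(f"Savings rate: {rate:.1%}")
```

Returning `None` when there is no income avoids a division by zero and lets the caller decide how to report that case.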

In [None]:
# ========================================
# 3. DATA INPUT AND LOADING
# ========================================

print("üìÅ Setting up data input...")

# Get current working directory and ensure we're in the right place
current_dir = os.getcwd()
print(f"üìç Current directory: {current_dir}")

# Smart directory detection for different environments
def setup_data_directories():
    """Setup data directories with smart path detection"""
    
    # Check if we're already in the project directory
    if os.path.exists('agents') and os.path.exists('parsers'):
        base_dir = current_dir
        print("‚úÖ Found project directory structure")
    else:
        # Look for project in common locations
        possible_locations = [
            'agentic-ai',
            '../agentic-ai', 
            './agentic-ai',
            'financial-agentic-ai',
            '../financial-agentic-ai'
        ]
        
        base_dir = None
        for location in possible_locations:
            if os.path.exists(location) and os.path.exists(os.path.join(location, 'agents')):
                base_dir = os.path.abspath(location)
                print(f"‚úÖ Found project at: {base_dir}")
                break
        
        if not base_dir:
            print("‚ùå Project directory not found!")
            print("üìÇ Available directories:")
            for item in os.listdir('.'):
                if os.path.isdir(item):
                    print(f"   üìÅ {item}")
            raise Exception("Please navigate to the correct project directory")
    
    # Create data directories
    input_dir = os.path.join(base_dir, 'data', 'input')
    output_dir = os.path.join(base_dir, 'data', 'output')
    
    os.makedirs(input_dir, exist_ok=True)
    os.makedirs(output_dir, exist_ok=True)
    
    print(f"üìÅ Input directory: {input_dir}")
    print(f"üìÅ Output directory: {output_dir}")
    
    # Verify directories were created
    if not os.path.exists(input_dir):
        raise Exception(f"Failed to create input directory: {input_dir}")
    
    return input_dir, output_dir

# Setup directories
try:
    input_dir, output_dir = setup_data_directories()
except Exception as e:
    print(f"‚ùå Directory setup failed: {e}")
    # Fallback to simple relative paths
    input_dir = os.path.join('data', 'input')
    output_dir = os.path.join('data', 'output')
    os.makedirs(input_dir, exist_ok=True)
    os.makedirs(output_dir, exist_ok=True)
    print(f"üîÑ Using fallback directories:")
    print(f"   üìÅ Input: {os.path.abspath(input_dir)}")
    print(f"   üìÅ Output: {os.path.abspath(output_dir)}")

# Initialize parsers
try:
    csv_parser = CSVParser()
    pdf_parser = PDFParser()
    print("‚úÖ File parsers initialized successfully")
except Exception as e:
    print(f"‚ùå Parser initialization failed: {e}")
    print("   Make sure all project modules are properly imported")
    raise

# File upload widget (for Jupyter environments)
try:
    # FileUpload is provided by ipywidgets, not IPython.display
    from ipywidgets import FileUpload
    
    print("\nüì§ File Upload Widget Available")
    print("You can drag and drop your CSV or PDF files below:")
    
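    def handle_upload(change):
        """Handle file upload with enhanced validation"""
        new_value = change['new']
        # ipywidgets 7 gives {filename: {'content': ...}}; ipywidgets 8 gives
        # a tuple of dicts that each carry 'name' and 'content'.
        if isinstance(new_value, dict):
            uploads = [(name, info['content']) for name, info in new_value.items()]
        else:
            uploads = [(info['name'], info['content']) for info in new_value]

        for filename, content in uploads:
            # Keep only the base name so an upload cannot escape input_dir
            file_path = os.path.join(input_dir, os.path.basename(filename))
            
            # Save uploaded file
            with open(file_path, 'wb') as f:
                f.write(bytes(content))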
            
            print(f"‚úì File '{filename}' uploaded successfully!")
            print(f"   üìç Saved to: {file_path}")
            
            # Process and validate the file immediately
            process_and_validate_file(file_path)
    
    # Create upload widget - restrict to CSV and PDF only
    uploader = FileUpload(accept='.csv,.pdf', multiple=True)
    uploader.observe(handle_upload, names='value')
    display(uploader)
    
except ImportError:
    print("üí° File upload widget not available. Please use the directory method below.")

print(f"\nüìÇ Alternative: Place files in: {os.path.abspath(input_dir)}")
print("   Then run the next cell to process them.")

def process_and_validate_file(file_path):
    """Process and validate a single uploaded file with enhanced checking"""
    filename = os.path.basename(file_path)
    print(f"\nüîÑ Processing: {filename}")
    print(f"   üìç Full path: {file_path}")
    
    # Check if file exists
    if not os.path.exists(file_path):
        print(f"   ‚ùå File not found at path: {file_path}")
        return
    
    # Check file size
    file_size = os.path.getsize(file_path)
    if file_size == 0:
        print(f"   ‚ùå File is empty (0 bytes)")
        return
    
    print(f"   üìä File size: {file_size:,} bytes")
    
    try:
        if filename.lower().endswith('.csv'):
            # Process CSV with enhanced parser
            print("   üìä Parsing CSV file...")
            result = csv_parser.parse(file_path)
            
            df = result['data']
            summary = result['summary']
            
            print(f"   ‚úÖ CSV processed successfully!")
            print(f"      ‚Ä¢ Rows: {summary['total_rows']}")
            print(f"      ‚Ä¢ Date range: {summary['date_range'][0]} to {summary['date_range'][1]}")
            print(f"      ‚Ä¢ Total amount: ${summary['total_amount']:,.2f}")
            print(f"      ‚Ä¢ Categories: {len(summary['categories'])}")
            print(f"      ‚Ä¢ Column mapping: {summary['column_mapping']}")
            print(f"      ‚Ä¢ Data quality: {summary['data_quality']}/100")
            
            # Display data preview
            print("\n   üìã Data Preview:")
            display(df.head())
            
        elif filename.lower().endswith('.pdf'):
            # Process PDF with enhanced parser
            print("   üìÑ Parsing PDF file...")
            result = pdf_parser.parse(file_path)
            
            df = result['data']
            summary = result['summary']
            
            if summary['total_rows'] == 0:
                print("   ‚ùå No financial data could be extracted from this PDF.")
                print("      üí° Tips for better PDF processing:")
                print("         - Use PDFs with clear, readable text")
                print("         - Avoid scanned images without OCR")
                print("         - Bank statements work better than receipts")
                print("         - Try exporting bank data as CSV instead")
                return
            
            print(f"   ‚úÖ PDF processed successfully!")
            print(f"      ‚Ä¢ Rows extracted: {summary['total_rows']}")
            if summary['date_range']:
                print(f"      ‚Ä¢ Date range: {summary['date_range'][0]} to {summary['date_range'][1]}")
            print(f"      ‚Ä¢ Total amount: ${summary['total_amount']:,.2f}")
            print(f"      ‚Ä¢ Categories: {len(summary['categories'])}")
            print(f"      ‚Ä¢ Parsing method: {summary['parsing_method']}")
            print(f"      ‚Ä¢ Data quality: {summary['data_quality']}/100")
            
            # Display data preview
            print("\n   üìã Data Preview:")
            display(df.head())
            
            # PDF-specific feedback
            if summary['data_quality'] < 50:
                print("   ‚ö†Ô∏è  PDF extraction quality is low. Consider:")
                print("      - Using a CSV export from your bank instead")
                print("      - Ensuring the PDF has selectable text (not scanned)")
                print("      - Using a different PDF if available")
        else:
            print("   ‚ùå Unsupported file format. Please use CSV or PDF files only.")
            return
            
        # Validate if data is suitable for financial analysis
        if 'data_quality' in summary and summary['data_quality'] < 30:
            print("\n   üö´ FILE REJECTED: Data quality too poor for analysis")
            print("      Please upload a proper financial data file with:")
            print("      - Clear date information")
            print("      - Numeric amount values") 
            print("      - At least basic transaction descriptions")
            print("      - Minimum 5 transactions")
        else:
            print(f"\n   ‚úÖ File '{filename}' validated and ready for analysis!")
            
    except ValueError as e:
        print(f"   ‚ùå Validation Error: {e}")
        print("      Please check your file format and data structure.")
    except Exception as e:
        print(f"   ‚ùå Processing Error: {e}")
        print("      Please try a different file or check file integrity.")

# Debug: Check current directory contents
print(f"\nüîç Debugging - Directory Contents:")
print(f"   üìÇ Current directory ({current_dir}):")
for item in os.listdir(current_dir):
    if os.path.isdir(os.path.join(current_dir, item)):
        print(f"      üìÅ {item}/")
    else:
        print(f"      üìÑ {item}")

print(f"\n   üìÇ Input directory ({input_dir}):")
if os.path.exists(input_dir):
    input_contents = os.listdir(input_dir)
    if input_contents:
        for item in input_contents:
            print(f"      üìÑ {item}")
    else:
        print("      (empty)")
else:
    print("      ‚ùå Directory does not exist")

# Check for existing files in input directory
try:
    existing_files = [f for f in os.listdir(input_dir) 
                     if f.lower().endswith(('.csv', '.pdf'))]

    if existing_files:
        print(f"\nüìã Found {len(existing_files)} existing file(s):")
        for file in existing_files:
            print(f"   ‚Ä¢ {file}")
            file_path = os.path.join(input_dir, file)
            process_and_validate_file(file_path)
    else:
        print("\nüí° No files found in input directory.")
        print("   üìù For proper analysis, your files should contain:")
        print("      ‚Ä¢ CSV: columns for date, amount, category/description")
        print("      ‚Ä¢ PDF: bank statements or transaction reports")
        print("   üìÅ Upload files above or add them to the directory and rerun this cell.")
        
except Exception as e:
    print(f"\n‚ùå Error checking input directory: {e}")
    print(f"   Directory path: {input_dir}")
    print("   Please verify the directory exists and has proper permissions")

In [None]:
# ========================================
# 4. DATA PROCESSING AND ANALYSIS
# ========================================

# Process all files in the input directory
# Match extensions case-insensitively, consistent with the previous cell
input_files = [f for f in os.listdir(input_dir) if f.lower().endswith(('.csv', '.pdf'))]
processed_data = {}
financial_summaries = {}

if not input_files:
    print("‚ùå No data files found. Please upload files in the previous cell.")
    processed_data = None
else:
    print(f"üîÑ Processing {len(input_files)} file(s)...")
    print("=" * 50)
    
    for filename in input_files:
        file_path = os.path.join(input_dir, filename)
        print(f"\nüìÑ Processing: {filename}")
        
        try:
            # Load file using the file_loader utility
            file_data = file_loader.load_file(file_path)
            processed_data[filename] = file_data
            
            # Extract the DataFrame
            df = file_data['data']
            
            # Basic statistics
            print(f"   ‚úì Loaded {len(df)} transactions")
            print(f"   ‚úì Date range: {df['date'].min()} to {df['date'].max()}")
            print(f"   ‚úì Total amount: ${df['amount'].sum():.2f}")
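            # Drop missing values and cast to str so join() cannot fail on NaN or numeric categories
            print(f"   ✓ Categories: {', '.join(map(str, df['category'].dropna().unique()))}")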
            
            # Store summary for later use
            financial_summaries[filename] = {
                'total_amount': df['amount'].sum(),
                'transaction_count': len(df),
                'categories': df['category'].unique().tolist(),
                'date_range': (df['date'].min(), df['date'].max()),
                'avg_transaction': df['amount'].mean()
            }
            
        except Exception as e:
            print(f"   ‚ùå Error processing {filename}: {e}")
    
    print("\n" + "=" * 50)
    print("‚úì Data processing complete!")
    
    # Show overall summary
    if processed_data:
        total_transactions = sum(s['transaction_count'] for s in financial_summaries.values())
        total_amount = sum(s['total_amount'] for s in financial_summaries.values())
        all_categories = set()
        for s in financial_summaries.values():
            all_categories.update(s['categories'])
        
        print(f"\nüìä OVERALL SUMMARY:")
        print(f"   ‚Ä¢ Total files processed: {len(processed_data)}")
        print(f"   ‚Ä¢ Total transactions: {total_transactions}")
        print(f"   ‚Ä¢ Total amount: ${total_amount:.2f}")
        print(f"   ‚Ä¢ Unique categories: {len(all_categories)}")
        print(f"   ‚Ä¢ Categories: {', '.join(sorted(all_categories))}")

# 5. Data Visualization

Generate comprehensive visualizations of your financial data:

- **Monthly Spending Trends**: Track how your spending changes over time
- **Category Breakdown**: See where your money goes (pie charts and bar charts)
- **Daily Spending Patterns**: Identify spending habits by day of week/month
- **Budget Analysis**: Compare actual spending vs recommended percentages
- **Savings Analysis**: Track your saving patterns and goals
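
The aggregations behind the first two chart types can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's `ChartGenerator` logic: dates are ISO `YYYY-MM-DD` strings, negative amounts are spending, and `monthly_totals` and `category_shares` are hypothetical helper names.

```python
from collections import defaultdict

def monthly_totals(transactions):
    """Sum spending (negative amounts) per 'YYYY-MM' month, as positive numbers."""
    totals = defaultdict(float)
    for tx in transactions:
        if tx["amount"] < 0:
            totals[tx["date"][:7]] += -tx["amount"]
    return dict(sorted(totals.items()))

def category_shares(transactions):
    """Return each category's fraction of total spending."""
    by_category = defaultdict(float)
    for tx in transactions:
        if tx["amount"] < 0:
            by_category[tx["category"]] += -tx["amount"]
    total = sum(by_category.values())
    return {cat: amt / total for cat, amt in by_category.items()} if total else {}

# Hypothetical data for illustration
transactions = [
    {"date": "2025-01-03", "amount": -40.0, "category": "Groceries"},
    {"date": "2025-01-20", "amount": -60.0, "category": "Transport"},
    {"date": "2025-02-02", "amount": -100.0, "category": "Groceries"},
]
print(monthly_totals(transactions))
print(category_shares(transactions))
```

The monthly totals would feed a trend line and the category shares a pie or bar chart; the plotting itself is left to the charting library.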

In [None]:
# ========================================
# 5. COMPREHENSIVE DATA VISUALIZATION  
# ========================================

if not processed_data:
    print("‚ùå No data to visualize. Please process data files first.")
else:
    print("üìä Generating comprehensive visualizations...")
    print("=" * 50)
    
    # Combine all dataframes for comprehensive analysis
    all_dataframes = []
    for filename, file_data in processed_data.items():
        df = file_data['data'].copy()
        df['source_file'] = filename  # Track which file data came from
        all_dataframes.append(df)
    
    # Create master dataframe
    master_df = pd.concat(all_dataframes, ignore_index=True)
    master_df['date'] = pd.to_datetime(master_df['date'])
    
    print(f"üìà Creating AI-powered visualizations for {len(master_df)} total transactions...")
    
    # Initialize enhanced chart generator
    enhanced_chart_gen = ChartGenerator()
    
    # Analyze data characteristics to determine best visualizations
    date_span_days = (master_df['date'].max() - master_df['date'].min()).days
    data_characteristics = {
        'num_transactions': len(master_df),
        'date_span_days': date_span_days,
        'num_categories': master_df['category'].nunique(),
        'amount_range': master_df['amount'].max() - master_df['amount'].min(),
        'has_monthly_data': date_span_days >= 30
    }
    
    print(f"   📊 Data analysis complete:")
    print(f"      • Time span: {data_characteristics['date_span_days']} days")
    print(f"      • Categories: {data_characteristics['num_categories']}")
    print(f"      • Amount range: ${data_characteristics['amount_range']:,.2f}")
    
    # Create comprehensive dashboard
    try:
        all_charts = enhanced_chart_gen.create_comprehensive_dashboard(master_df)
        
        # Display charts based on data characteristics
        charts_to_show = []
        
        # Always show overview charts
        essential_charts = ['monthly_trend', 'category_pie', 'top_categories']
        charts_to_show.extend(essential_charts)
        
        # Add time-based charts if we have sufficient temporal data
        if data_characteristics['has_monthly_data']:
            charts_to_show.extend(['monthly_heatmap', 'cumulative_spending'])
        
        # Add pattern analysis if we have enough categories
        if data_characteristics['num_categories'] >= 3:
            charts_to_show.extend(['weekday_analysis', 'category_timeline'])
        
        # Add distribution analysis if we have enough transactions
        if data_characteristics['num_transactions'] >= 20:
            charts_to_show.extend(['amount_histogram', 'daily_distribution'])
        
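        # Display selected charts (keep only those the generator actually produced,
        # so the reported count matches what is shown)
        charts_to_show = [name for name in charts_to_show if name in all_charts]
        print(f"\n📈 Displaying {len(charts_to_show)} optimized visualizations:")
        
        for chart_name in charts_to_show:
            print(f"\n📊 {chart_name.replace('_', ' ').title()}")
            display(all_charts[chart_name])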
        
        # Save all charts
        try:
            saved_files = enhanced_chart_gen.save_charts(
                all_charts, 
                output_dir, 
                formats=['html']
            )
            print(f"\n💾 Charts saved to: {output_dir}")
            print(f"   📁 {len(saved_files)} files created")
            
        except Exception as e:
            print(f"⚠️  Could not save charts: {e}")
        
    except Exception as e:
        print(f"❌ Error with enhanced charts, falling back to basic visualizations: {e}")
        
        # Fallback to basic visualizations
        print("\n🔄 Creating basic visualizations...")
        
        # 1. MONTHLY SPENDING TREND
        print("\n1️⃣ Monthly Spending Trend")
        monthly_spending = master_df.groupby(master_df['date'].dt.to_period('M'))['amount'].sum()
        
        fig_monthly = go.Figure()
        fig_monthly.add_trace(go.Scatter(
            x=[str(period) for period in monthly_spending.index],
            y=monthly_spending.values,
            mode='lines+markers',
            name='Monthly Spending',
            line=dict(width=3),
            marker=dict(size=8)
        ))
        
        fig_monthly.update_layout(
            title='📈 Monthly Spending Trend',
            xaxis_title='Month',
            yaxis_title='Amount ($)',
            height=400,
            template='plotly_white'
        )
        display(fig_monthly)
        
        # 2. CATEGORY BREAKDOWN (PIE CHART)
        print("\n2️⃣ Spending by Category")
        category_totals = master_df.groupby('category')['amount'].sum().sort_values(ascending=False)
        
        fig_pie = go.Figure(data=[go.Pie(
            labels=category_totals.index,
            values=category_totals.values,
            hole=0.3,
            textinfo='label+percent',
            textposition='outside'
        )])
        
        fig_pie.update_layout(
            title='🥧 Spending Breakdown by Category',
            height=500,
            template='plotly_white'
        )
        display(fig_pie)
        
        # 3. TOP SPENDING CATEGORIES (BAR CHART)
        print("\n3️⃣ Top Spending Categories")
        top_categories = category_totals.head(10)
        
        fig_bar = go.Figure(data=[go.Bar(
            x=top_categories.values,
            y=top_categories.index,
            orientation='h',
            marker_color='lightblue'
        )])
        
        fig_bar.update_layout(
            title='📊 Top 10 Spending Categories',
            xaxis_title='Amount ($)',
            yaxis_title='Category',
            height=400,
            template='plotly_white'
        )
        display(fig_bar)
    
    # AI-powered insights based on visualizations
    print("\n" + "=" * 50)
    print("🤖 AI INSIGHTS FROM VISUALIZATIONS:")
    print("=" * 50)
    
    # Generate insights based on the data (aggregate once and reuse)
    spending_by_category = master_df.groupby('category')['amount'].sum()
    top_category = spending_by_category.idxmax()
    top_amount = spending_by_category.max()
    total_spending = master_df['amount'].sum()
    avg_transaction = master_df['amount'].mean()
    
    insights = []
    insights.append(f"💰 Your highest spending category is '{top_category}' (${top_amount:,.2f})")
    if total_spending:  # avoid dividing by zero when the amounts net out to 0
        insights.append(f"This represents {(top_amount/total_spending)*100:.1f}% of your total spending")
    insights.append(f"💳 Your average transaction amount is ${avg_transaction:.2f}")
    
    if data_characteristics['has_monthly_data']:
        monthly_avg = master_df.groupby(master_df['date'].dt.to_period('M'))['amount'].sum().mean()
        insights.append(f"📅 Your average monthly spending is ${monthly_avg:,.2f}")
    
    # Category diversity insight
    if data_characteristics['num_categories'] <= 3:
        insights.append("⚠️  Consider tracking more specific expense categories for better budgeting")
    elif data_characteristics['num_categories'] >= 10:
        insights.append("✅ Good category diversity - this helps with detailed budget analysis")
    
    for insight in insights:
        print(f"   {insight}")
    
    print("\n✅ Visualization step complete.")
    print("💡 Use the AI chat interface below to get personalized insights about your spending patterns.")
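
The chart-selection rules in the cell above can also be written as a small pure function, which makes the thresholds easy to check in isolation. This is only a sketch: `select_charts` is a hypothetical helper, and the chart names and thresholds are copied from the cell rather than from `ChartGenerator`, whose actual output keys are not shown here.

```python
def select_charts(num_transactions, date_span_days, num_categories):
    """Return chart names to display, using the thresholds from the cell above."""
    charts = ["monthly_trend", "category_pie", "top_categories"]  # always shown
    if date_span_days >= 30:       # enough history for time-based charts
        charts += ["monthly_heatmap", "cumulative_spending"]
    if num_categories >= 3:        # enough categories for pattern analysis
        charts += ["weekday_analysis", "category_timeline"]
    if num_transactions >= 20:     # enough rows for distribution charts
        charts += ["amount_histogram", "daily_distribution"]
    return charts

# 45 days of data, but few transactions and only two categories
print(select_charts(num_transactions=12, date_span_days=45, num_categories=2))
```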

# 6. Multi-Agent Financial Analysis System

Your intelligent financial advisor team is ready! This system uses multiple AI agents working collaboratively:

## **Multi-Agent Workflow**
- **Planner Agent (GPT-2)**: Analyzes data and creates analysis plans
- **Executor Agent (DistilBERT)**: Executes the planned analysis tasks
- **Reviewer Agent (GPT-2)**: Verifies results and suggests optimizations

## 💡 **Collaborative Analysis Features**
- **Task Planning**: AI creates structured analysis plans
- **Cross-Validation**: Multiple agents verify each other's work
- **Optimization Suggestions**: Continuous improvement of analysis
- **Role Switching**: Agents can switch roles for different perspectives
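
Role switching amounts to swapping entries in a role-to-agent mapping. The sketch below uses the default assignment from the code cell that follows. The standalone `switch_roles` function and its boolean return value are assumptions for illustration, since the class method returns a status message instead. Because the planner and reviewer both default to the GPT-2 agent, swapping those two roles changes nothing.

```python
roles = {
    "planner": "gpt2_agent",
    "executor": "distilbert_agent",
    "reviewer": "gpt2_agent",
}

def switch_roles(roles, role1, role2):
    """Swap the agents behind two roles; return True if the assignment changed."""
    if role1 not in roles or role2 not in roles:
        raise KeyError(f"Unknown role; choose from {sorted(roles)}")
    before = dict(roles)
    roles[role1], roles[role2] = roles[role2], roles[role1]
    return roles != before

print(switch_roles(roles, "planner", "executor"))  # the two roles use different agents
print(roles)
```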

## 📊 **Analysis Workflow**
1. **Plan Phase**: The planner agent reads a summary of your data and drafts an analysis strategy
2. **Execute Phase**: The executor agent carries out the planned analysis
3. **Review Phase**: The reviewer agent validates the results and suggests optimizations
4. **Interactive Chat**: Discuss the findings with the whole team
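
The plan, execute, and review phases form a simple chain in which each phase consumes the previous phase's output. A minimal sketch, assuming only that every agent exposes a `run(prompt)` method as in the code cell below: `EchoAgent` and `run_cycle` are hypothetical stand-ins that echo their prompt instead of calling a model.

```python
class EchoAgent:
    """Hypothetical stand-in for a model wrapper that exposes run(prompt)."""

    def __init__(self, name):
        self.name = name

    def run(self, prompt):
        # A real agent would call a language model here
        return f"[{self.name}] {prompt}"


def run_cycle(agents, roles, request):
    """Run one plan -> execute -> review pass and return all three outputs."""
    plan = agents[roles["planner"]].run(f"Plan: {request}")
    results = agents[roles["executor"]].run(f"Execute: {plan}")
    review = agents[roles["reviewer"]].run(f"Review: {results}")
    return {"plan": plan, "results": results, "review": review}


agents = {"gpt2_agent": EchoAgent("gpt2"), "distilbert_agent": EchoAgent("distilbert")}
roles = {"planner": "gpt2_agent", "executor": "distilbert_agent", "reviewer": "gpt2_agent"}
cycle = run_cycle(agents, roles, "spending optimization")
print(cycle["review"])
```

The review output nests the executor's and planner's text, which shows how each phase sees everything produced before it.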

## 🎯 **Collaborative Commands**
- `plan [description]` - Create a structured analysis plan
- `execute` - Run the current plan
- `review` - Validate and optimize the findings
- `switch <role1> <role2>` - Swap the agents assigned to two roles
- `team [topic]` - Multi-agent discussion on a topic
- `status` - Show current roles and recent activity

**Type 'help' for commands | 'quit' to exit | 'team' for collaborative mode**
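
One way to dispatch these commands is to split the input into a command word and an argument, and to treat anything that is not a known command as a free-form question. The `parse_command` helper below is a hypothetical sketch of that idea, not the notebook's own parser. The command names are the ones handled by the chat loop in the next cell.

```python
KNOWN_COMMANDS = {"plan", "execute", "review", "switch", "team", "status", "help", "quit"}

def parse_command(user_input):
    """Return (command, argument); unknown input becomes a ('chat', text) pair."""
    text = user_input.strip()
    if not text:
        return ("empty", "")
    parts = text.split(maxsplit=1)
    command = parts[0].lower()
    argument = parts[1] if len(parts) > 1 else ""
    if command in KNOWN_COMMANDS:
        return (command, argument)
    return ("chat", text)

print(parse_command("plan spending optimization"))
print(parse_command("planning my budget for next year"))
```

Matching on the whole first word means a question that merely starts with "plan" or "team" is not mistaken for a command.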

In [None]:
# ========================================
# 6. MULTI-AGENT COLLABORATIVE ANALYSIS SYSTEM
# ========================================

class MultiAgentAnalysisSystem:
    """Multi-agent system for collaborative financial analysis"""
    
    def __init__(self, agents_dict, financial_data):
        self.agents = agents_dict
        self.financial_data = financial_data
        self.analysis_history = []
        self.current_plan = None
        self.execution_results = None
        
        # Define agent roles
        self.roles = {
            'planner': 'gpt2_agent',      # Plans analysis strategy
            'executor': 'distilbert_agent', # Executes planned tasks
            'reviewer': 'gpt2_agent'       # Reviews and optimizes results
        }
        
        self.collaboration_log = []
    
    def create_analysis_plan(self, user_request):
        """Planner agent creates structured analysis plan"""
        planner = self.agents[self.roles['planner']]
        
        planning_prompt = f"""
        As a Financial Analysis Planner, create a detailed analysis plan for this request: "{user_request}"
        
        Available data summary:
        - Total transactions: {len(self.financial_data) if self.financial_data is not None else 0}
        - Categories: {list(self.financial_data['category'].unique()) if self.financial_data is not None else 'None'}
        
        Create a step-by-step analysis plan with:
        1. Data requirements
        2. Analysis steps
        3. Expected outputs
        4. Success criteria
        
        Format as a clear, actionable plan.
        """
        
        plan = planner.run(planning_prompt)
        self.current_plan = {
            'request': user_request,
            'plan': plan,
            'created_by': 'planner',
            'timestamp': pd.Timestamp.now()
        }
        
        # A new plan invalidates any results produced for the previous plan
        self.execution_results = None
        self.collaboration_log.append(f"📋 Planner created analysis plan for: {user_request}")
        return plan
    
    def execute_analysis_plan(self):
        """Executor agent performs the planned analysis"""
        if not self.current_plan:
            return "No analysis plan available. Please create a plan first."
        
        executor = self.agents[self.roles['executor']]
        
        execution_prompt = f"""
        As a Financial Analysis Executor, perform this analysis plan:
        
        Plan: {self.current_plan['plan']}
        
        Execute each step and provide detailed results with specific numbers and insights.
        Focus on actionable findings and concrete data points.
        """
        
        results = executor.run(execution_prompt)
        self.execution_results = {
            'results': results,
            'executed_by': 'executor',
            'timestamp': pd.Timestamp.now(),
            'original_plan': self.current_plan
        }
        
        self.collaboration_log.append("⚡ Executor completed analysis execution")
        return results
    
    def review_and_optimize(self):
        """Reviewer agent validates and optimizes the results"""
        if not self.execution_results:
            return "No execution results available. Please execute an analysis first."
        
        reviewer = self.agents[self.roles['reviewer']]
        
        review_prompt = f"""
        As a Financial Analysis Reviewer, evaluate these analysis results:
        
        Original Plan: {self.execution_results['original_plan']['plan']}
        Execution Results: {self.execution_results['results']}
        
        Provide:
        1. Quality assessment of the analysis
        2. Accuracy verification
        3. Missing elements identification
        4. Optimization suggestions
        5. Final recommendations
        
        Be thorough and constructive in your review.
        """
        
        review = reviewer.run(review_prompt)
        
        optimized_result = {
            'review': review,
            'reviewed_by': 'reviewer',
            'timestamp': pd.Timestamp.now(),
            'original_results': self.execution_results
        }
        
        # Store complete analysis cycle
        self.analysis_history.append({
            'plan': self.current_plan,
            'execution': self.execution_results,
            'review': optimized_result
        })
        
        self.collaboration_log.append("🔍 Reviewer completed analysis review and optimization")
        return review
    
    def switch_roles(self, role1, role2):
        """Switch agent roles for different perspectives"""
        if role1 in self.roles and role2 in self.roles:
            # Swap the agents
            agent1, agent2 = self.roles[role1], self.roles[role2]
            self.roles[role1], self.roles[role2] = agent2, agent1
            
            self.collaboration_log.append(f"🔄 Switched roles: {role1} ↔ {role2}")
            return f"Roles switched: {role1} ↔ {role2}"
        else:
            return "Invalid roles. Available roles: planner, executor, reviewer"
    
    def team_discussion(self, topic):
        """Multi-agent discussion on a topic"""
        discussions = {}
        
        # Each agent provides their perspective
        for role, agent_name in self.roles.items():
            agent = self.agents[agent_name]
            
            discussion_prompt = f"""
            As a {role} in our financial analysis team, provide your perspective on: "{topic}"
            
            Consider your role's expertise and provide insights that complement the other team members.
            Keep your response focused and valuable to the collaborative analysis.
            """
            
            response = agent.run(discussion_prompt)
            discussions[role] = response
        
        # Synthesize the discussion
        synthesis_prompt = f"""
        Synthesize these team perspectives on "{topic}":
        
        Planner's view: {discussions.get('planner', 'Not available')}
        Executor's view: {discussions.get('executor', 'Not available')} 
        Reviewer's view: {discussions.get('reviewer', 'Not available')}
        
        Provide a balanced synthesis highlighting key insights and consensus points.
        """
        
        synthesis = self.agents[self.roles['planner']].run(synthesis_prompt)
        
        self.collaboration_log.append(f"💬 Team discussion completed on: {topic}")
        
        return {
            'individual_perspectives': discussions,
            'synthesis': synthesis
        }
    
    def get_collaboration_status(self):
        """Get current status of collaborative analysis"""
        status = {
            'current_roles': self.roles,
            'has_plan': self.current_plan is not None,
            'has_execution': self.execution_results is not None,
            'completed_analyses': len(self.analysis_history),
            'recent_activity': self.collaboration_log[-5:] if self.collaboration_log else []
        }
        return status

def collaborative_chat_interface():
    """Enhanced chat interface with multi-agent collaboration"""
    
    if not processed_data:
        print("❌ No financial data loaded. Please process data files first.")
        return
    
    # Collect the agents whose models loaded successfully
    agents_dict = {}
    if gpt2_model and gpt2_agent:
        agents_dict['gpt2_agent'] = gpt2_agent
    if distilbert_model and distilbert_agent:
        agents_dict['distilbert_agent'] = distilbert_agent
    
    if len(agents_dict) < 2:
        print("❌ Need at least 2 AI models for collaborative analysis.")
        return
    
    # Combine data for analysis
    combined_df = pd.concat([file_data['data'] for file_data in processed_data.values()], 
                           ignore_index=True)
    
    # Initialize multi-agent system
    multi_agent_system = MultiAgentAnalysisSystem(agents_dict, combined_df)
    
    print("🤖 Multi-Agent Financial Analysis Team Ready!")
    print("=" * 60)
    print("👥 Team Members:")
    for role, agent in multi_agent_system.roles.items():
        model_name = "GPT-2" if "gpt2" in agent else "DistilBERT"
        print(f"   {role.title()}: {model_name}")
    
    print("\n💡 The team can work together on complex financial analysis!")
    print("=" * 60)
    
    # Prepare financial context
    total_amount = sum(s['total_amount'] for s in financial_summaries.values())
    total_transactions = sum(s['transaction_count'] for s in financial_summaries.values())
    
    print("\n" + "-" * 60)
    print("Commands: 'plan' | 'execute' | 'review' | 'switch' | 'team' | 'status' | 'quit'")
    print("-" * 60)
    
    while True:
        user_input = input("\n💬 You: ").strip()
        
        # Split off the first word so that commands match whole words only
        # (e.g. "planning a trip" is treated as a question, not as 'plan')
        parts = user_input.split(maxsplit=1)
        command = parts[0].lower() if parts else ''
        argument = parts[1] if len(parts) > 1 else ''
        
        if user_input.lower() == 'quit':
            print("\n👋 Thank you for using the Multi-Agent Financial Analysis System!")
            break
            
        elif command == 'plan':
            request = argument or "comprehensive financial analysis"
            print(f"\n📋 Creating analysis plan for: {request}")
            print("-" * 40)
            plan = multi_agent_system.create_analysis_plan(request)
            print(f"🤖 Planner's Analysis Plan:\n{plan}")
            
        elif user_input.lower() == 'execute':
            print("\n⚡ Executing analysis plan...")
            print("-" * 30)
            results = multi_agent_system.execute_analysis_plan()
            print(f"🤖 Executor's Results:\n{results}")
            
        elif user_input.lower() == 'review':
            print("\n🔍 Reviewing and optimizing results...")
            print("-" * 35)
            review = multi_agent_system.review_and_optimize()
            print(f"🤖 Reviewer's Assessment:\n{review}")
            
        elif command == 'switch':
            # Parse switch command: "switch planner executor" (role names are lowercase)
            requested_roles = argument.lower().split()
            if len(requested_roles) >= 2:
                result = multi_agent_system.switch_roles(requested_roles[0], requested_roles[1])
                print(f"\n🔄 {result}")
                print("Updated roles:")
                for role, agent in multi_agent_system.roles.items():
                    model_name = "GPT-2" if "gpt2" in agent else "DistilBERT"
                    print(f"   {role.title()}: {model_name}")
            else:
                print("\nUsage: switch <role1> <role2>")
                print("Available roles: planner, executor, reviewer")
                
        elif command == 'team':
            topic = argument or "overall financial health"
            print(f"\nTeam discussion on: {topic}")
            print("-" * 40)
            discussion = multi_agent_system.team_discussion(topic)
            
            print("Individual Perspectives:")
            for role, perspective in discussion['individual_perspectives'].items():
                print(f"\n🤖 {role.title()}: {perspective}")
            
            print(f"\nTeam Synthesis:\n{discussion['synthesis']}")
            
        elif user_input.lower() == 'status':
            status = multi_agent_system.get_collaboration_status()
            print("\n📊 Collaboration Status:")
            print(f"   Current Roles: {status['current_roles']}")
            print(f"   Has Plan: {'✅' if status['has_plan'] else '❌'}")
            print(f"   Has Execution: {'✅' if status['has_execution'] else '❌'}")
            print(f"   Completed Analyses: {status['completed_analyses']}")
            print("\nRecent Activity:")
            for activity in status['recent_activity']:
                print(f"   • {activity}")
                
        elif user_input.lower() == 'help':
            print("""
MULTI-AGENT COLLABORATION COMMANDS:

📋 Analysis Workflow:
  • 'plan [description]' - Create analysis plan
  • 'execute' - Execute the current plan
  • 'review' - Review and optimize results

🔄 Team Management:
  • 'switch <role1> <role2>' - Switch agent roles
  • 'team [topic]' - Multi-agent discussion
  • 'status' - View collaboration status

💬 Regular Chat:
  • Ask any financial question for collaborative response
  • 'quit' - Exit the system
  • 'help' - Show this help

🎯 Example Workflow:
  1. plan spending optimization
  2. execute
  3. review
  4. team budget recommendations
            """)
            
        elif user_input.lower() == '':
            continue
        
        else:
            # Regular collaborative analysis
            print("\n🤖 Collaborative Analysis:")
            print("-" * 30)
            
            # Get perspective from current planner
            planner = multi_agent_system.agents[multi_agent_system.roles['planner']]
            planner_response = planner.run(f"Financial question: {user_input}")
            
            # Get perspective from current executor
            executor = multi_agent_system.agents[multi_agent_system.roles['executor']]
            executor_response = executor.run(f"Analyze and provide data: {user_input}")
            
            print(f"🧠 Planner's Insight: {planner_response}")
            print(f"\n⚡ Executor's Analysis: {executor_response}")
            
            multi_agent_system.collaboration_log.append(f"💬 Collaborative response to: {user_input}")

# Start the collaborative system
print("🚀 Starting Multi-Agent Collaborative Analysis System...")
collaborative_chat_interface()

In [None]:
# ========================================
# ALTERNATIVE: SINGLE-AGENT INTERFACE
# ========================================
# Uncomment this section if you prefer a simpler single-agent interface

"""
def simple_chat_interface():
    '''Simple single-agent chat interface'''
    
    if not processed_data:
        print("❌ No financial data loaded. Please process data files first.")
        return
    
    # Choose primary agent
    current_agent = gpt2_agent if gpt2_agent else distilbert_agent
    current_model = "GPT-2" if gpt2_agent else "DistilBERT"
    
    if not current_agent:
        print("❌ No AI models available.")
        return
    
    print(f"🤖 Simple Financial Assistant ({current_model}) Ready!")
    print("=" * 50)
    print("Type 'quit' to exit, 'help' for commands")
    
    while True:
        user_input = input("\n💬 You: ").strip()
        
        if user_input.lower() == 'quit':
            print("\n👋 Goodbye!")
            break
        elif user_input.lower() == 'help':
            print("Ask me anything about your financial data!")
        elif not user_input:
            continue  # ignore empty input instead of sending it to the model
        else:
            response = current_agent.run(user_input)
            print(f"\n🤖 {current_model}: {response}")

# Uncomment the line below to use simple interface instead:
# simple_chat_interface()
"""

In [None]:
# ========================================
# RAG-BASED BUDGET ANALYSIS & RECOMMENDATIONS
# ========================================

if processed_data:
    print("💡 Generating Budget Analysis & Recommendations from Your Data...")
    print("=" * 60)
    
    # Ask for monthly income (no default); accept input such as "4,500"
    # and require a positive, finite amount
    while True:
        income_input = input("💰 Enter your monthly income: $").strip().replace(',', '')
        try:
            monthly_income = float(income_input)
        except ValueError:
            print("⚠️ Please enter a valid number.")
            continue
        if 0 < monthly_income < float('inf'):
            break
        print("⚠️ Please enter a valid monthly income amount.")
    
    # Combine all uploaded data for analysis
    all_dataframes = []
    for filename, file_data in processed_data.items():
        print(f"📄 Including data from: {filename}")
        all_dataframes.append(file_data['data'])
    
    master_df = pd.concat(all_dataframes, ignore_index=True)
    print(f"📊 Analyzing {len(master_df)} transactions from uploaded files...")
    
    # Calculate budget analysis from user's actual data
    budget_analysis = budget_calc.calculate_budget(monthly_income, master_df)
    
    print(f"\n📊 BUDGET ANALYSIS for ${monthly_income:,.2f} monthly income:")
    print("-" * 40)
    
    # Summary
    summary = budget_analysis['summary']
    print(f"💰 Total Income:     ${summary['total_income']:,.2f}")
    print(f"💸 Total Expenses:   ${summary['total_expenses']:,.2f}")
    print(f"💵 Remaining Budget: ${summary['remaining_budget']:,.2f}")
    print(f"💾 Savings Rate:     {summary['savings_rate']:.1f}%")
    
    # Category breakdown from uploaded data
    print(f"\n🏷️  SPENDING BY CATEGORY (from uploaded data):")
    print("-" * 30)
    for category, amount in budget_analysis['category_breakdown'].items():
        percentage = budget_analysis['category_percentages'][category]
        print(f"{category:15} ${amount:8,.2f} ({percentage:5.1f}%)")
    
    # Data-driven recommendations
    print(f"\n💡 RECOMMENDATIONS (based on your data patterns):")
    print("-" * 20)
    for i, rec in enumerate(budget_analysis['recommendations'], 1):
        print(f"{i}. {rec}")
    
    # Trend analysis from user data
    print(f"\n📈 TREND ANALYSIS (from uploaded data):")
    print("-" * 18)
    trend_analysis = trend_analyzer.analyze_trends(master_df)
    print(trend_analysis['summary'])
    
    # Save analysis to file (create the output directory if needed; write UTF-8 explicitly)
    os.makedirs(output_dir, exist_ok=True)
    analysis_file = os.path.join(output_dir, 'budget_analysis.txt')
    with open(analysis_file, 'w', encoding='utf-8') as f:
        f.write("Budget Analysis Report (RAG-based)\n")
        f.write(f"Generated: {pd.Timestamp.now()}\n")
        f.write(f"Data Sources: {', '.join(processed_data.keys())}\n")
        f.write(f"Total Transactions Analyzed: {len(master_df)}\n")
        f.write(f"Monthly Income: ${monthly_income:,.2f}\n\n")
        f.write("Summary:\n")
        for key, value in summary.items():
            f.write(f"  {key}: {value}\n")
        f.write("\nRecommendations:\n")
        for rec in budget_analysis['recommendations']:
            f.write(f"  - {rec}\n")
    
    print(f"\n💾 Analysis saved to: {analysis_file}")
    
else:
    print("⚠️  No data available for budget analysis.")
    print("📤 Please upload and process your CSV/PDF financial data first.")
    print("💡 The system works as a RAG model - it analyzes only your uploaded data.")
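
The internals of `budget_calc.calculate_budget` are not shown in this notebook, so the following is only a sketch of how the four summary fields printed above could be derived. It assumes that the savings rate is the share of income left after expenses. The `budget_summary` function and the sample figures are hypothetical.

```python
def budget_summary(monthly_income, expenses_by_category):
    """Build a summary dict with the same keys the cell above prints."""
    if monthly_income <= 0:
        raise ValueError("monthly_income must be positive")
    total_expenses = sum(expenses_by_category.values())
    remaining = monthly_income - total_expenses
    return {
        "total_income": monthly_income,
        "total_expenses": total_expenses,
        "remaining_budget": remaining,
        # Assumed definition: percentage of income not spent
        "savings_rate": 100 * remaining / monthly_income,
    }

summary = budget_summary(4000.0, {"rent": 1500.0, "groceries": 500.0, "transport": 200.0})
print(f"Remaining: ${summary['remaining_budget']:,.2f}")
print(f"Savings rate: {summary['savings_rate']:.1f}%")
```

Under this definition a negative savings rate simply means that expenses exceeded income for the period.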