# Multi-Agent Loan Processing System

## Project Overview

This Kaggle notebook demonstrates a multi-agent AI system that simulates end-to-end processing of loan applications. A team of specialized agents, built on Google's Gemini model, handles document processing, information extraction, verification, credit assessment, and automated underwriting.

![Loan Processing System](loan_metrics.png)

## Key Features

- **Document Processing**: Handles PDF application forms, image-based ID documents, and text financial statements
- **Information Extraction**: Pulls structured data from diverse document formats
- **Verification Processes**: Simulates KYC/AML checks and validates document consistency
- **Credit Assessment**: Analyzes credit reports and calculates risk scores based on comprehensive financial metrics
- **Automated Underwriting**: Uses Retrieval Augmented Generation (RAG) to apply lending policies to loan applications
- **Decision Support**: Provides detailed, justified loan recommendations with supporting evidence
- **Batch Processing**: Processes multiple loan applications and generates comparative reports
- **Interactive Interface**: Runs the system in single-application or batch mode from a widget-based interface

## System Architecture

The system consists of six specialized AI agents:

1. **IntakeAgent**: Validates and processes incoming loan documents
2. **ExtractionAgent**: Extracts structured data from various document types
3. **VerificationAgent**: Performs simulated KYC/AML checks and document validation
4. **CreditAssessmentAgent**: Analyzes credit risk based on financial data
5. **UnderwritingAgent**: Applies lending policies using RAG to make lending decisions
6. **ReportingAgent**: Generates comprehensive loan summary reports with visualizations

## Technologies Demonstrated

- **Large Language Models**: Uses Google's Gemini model for sophisticated reasoning
- **Document AI**: Extracts structured information from unstructured documents
- **Function Calling**: Simulates interactions with external systems
- **Retrieval Augmented Generation (RAG)**: Applies lending policies using vector search
- **Vector Embeddings**: Creates semantic representations of lending policies
- **Data Visualization**: Generates interactive charts of key financial metrics
- **Error Handling**: Implements robust parsing and error recovery mechanisms
- **Mock Data Generation**: Creates realistic synthetic loan application documents

## Data Science Skills Showcased

- **Data Extraction & Transformation**: Converting unstructured documents to structured data
- **Financial Ratio Analysis**: Calculating DTI, PTI, and LTV ratios
- **Risk Modeling**: Creating comprehensive risk assessment algorithms
- **Data Visualization**: Building informative metric visualizations
- **System Architecture Design**: Designing a complex multi-agent system
- **Domain Expertise**: Applying financial industry knowledge to lending decisions
- **AI Orchestration**: Coordinating multiple specialized agents in a workflow
- **Error Handling**: Implementing robust error recovery mechanisms
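
The three ratios named above (DTI, PTI, LTV) can be sketched as follows. This is a minimal illustration, not the notebook's own implementation: the function names are hypothetical, the payment uses the standard fixed-rate amortization formula, and the 8% rate in the usage lines is an assumed value chosen only for the example.

```python
from typing import Optional


def monthly_payment(principal: float, annual_rate: float, term_months: int) -> float:
    """Fixed monthly payment on a fully amortizing loan."""
    if term_months <= 0:
        raise ValueError("term_months must be positive")
    r = annual_rate / 12  # periodic (monthly) rate
    if r == 0:
        return principal / term_months
    return principal * r / (1 - (1 + r) ** -term_months)


def debt_to_income(monthly_debt_payments: float, annual_income: float) -> float:
    """DTI: recurring monthly debt payments divided by gross monthly income."""
    if annual_income <= 0:
        raise ValueError("annual_income must be positive")
    return monthly_debt_payments / (annual_income / 12)


def payment_to_income(new_loan_payment: float, annual_income: float) -> float:
    """PTI: the new loan's monthly payment divided by gross monthly income."""
    if annual_income <= 0:
        raise ValueError("annual_income must be positive")
    return new_loan_payment / (annual_income / 12)


def loan_to_value(loan_amount: float, collateral_value: Optional[float]) -> Optional[float]:
    """LTV: loan amount over collateral value; None for unsecured loans."""
    if not collateral_value:
        return None
    return loan_amount / collateral_value


# Usage with figures from the first mock applicant (the rate is an assumption)
income = 170_000 + 10_000
payment = monthly_payment(50_000, annual_rate=0.08, term_months=60)
dti = debt_to_income(2_500 + payment, income)
pti = payment_to_income(payment, income)
ltv = loan_to_value(50_000, None)  # no collateral on this request
```

Returning `None` for LTV mirrors the optional `loan_to_value_ratio` field, so unsecured requests do not need a placeholder value.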

## Getting Started

### Prerequisites

- Python 3.9+
- Google API key for Gemini access
- Required libraries (see requirements section in notebook)

### Installation

1. Clone this repository or open it in Kaggle
2. Install the required packages:
```python
!pip install -q google-generativeai langchain langchain-community langchain-huggingface pypdf reportlab pillow sentence-transformers faiss-cpu python-dotenv dataclasses-json pydantic ipywidgets matplotlib numpy pandas
```
3. Set up your Google API credentials in a `.env` file or Kaggle secrets
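
One way to support both credential sources is to check the environment first and fall back to Kaggle secrets. The sketch below is an assumption about how this could be wired up, not the notebook's code: `load_google_api_key` is a hypothetical helper, and it assumes the Kaggle secret is stored under the same name as the environment variable. Values loaded by `load_dotenv()` land in the environment, so they are picked up by the first branch.

```python
import os
from typing import Optional


def load_google_api_key(env_var: str = "GOOGLE_API_KEY") -> Optional[str]:
    """Return the API key from the environment, else from Kaggle secrets, else None."""
    key = os.getenv(env_var)
    if key:
        return key
    try:
        # kaggle_secrets is only importable inside a Kaggle notebook session
        from kaggle_secrets import UserSecretsClient
    except ImportError:
        return None
    try:
        # Assumes the secret was saved under the same label as env_var
        return UserSecretsClient().get_secret(env_var)
    except Exception:
        # Secret missing or not attached to this notebook
        return None
```

Returning `None` instead of raising lets the caller decide whether a missing key is fatal.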

### Running the Demo

Execute the notebook cells sequentially to:
1. Set up the environment and suppress TensorFlow warnings
2. Define the data structures and agent classes
3. Generate mock loan application documents
4. Launch the interactive interface with the following options:
   - **Use Mock Data**: Process a single mock application with detailed reporting
   - **Batch Processing**: Process multiple applications with comparative analysis

## Sample Output

The system generates comprehensive loan processing reports that include:
- Applicant information summary
- Loan request details
- Key financial metrics with visualizations
- Verification summary
- Credit assessment with risk factors
- Underwriting decision with justification
- Recommended action for loan officers

## Project Structure

```
loan_processing_notebook/
├── mock_data/                      # Mock application documents
│   ├── loan_application_{i}.pdf    # Application forms
│   ├── id_document_{i}.png         # ID documents
│   ├── financial_statement_{i}.txt # Financial statements
│   └── credit_report_{i}.json      # Credit reports
├── lending_policies.json           # Mock lending policies
├── loan_processing_report.html     # Default report for single application
└── loan_report_{i}.html            # Generated reports for batch processing
```

## Implementation Details

### Multi-Agent System
The project implements a complete multi-agent system where each agent is specialized for a specific task in the loan processing pipeline. The agents communicate through well-defined interfaces, passing structured data between processing stages.
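
As a rough illustration of agents passing structured data between stages, the sketch below chains two stub agents over a shared state object. Everything here is hypothetical (`PipelineState`, the `Agent` protocol, the stub classes, and `run_pipeline`); the notebook's real agents call Gemini and carry far more logic.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Protocol


@dataclass
class PipelineState:
    """Structured data handed from one stage to the next (hypothetical)."""
    documents: List[str]
    data: Dict[str, Any] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)


class Agent(Protocol):
    """Interface every stage implements: take the state, return the state."""
    name: str

    def run(self, state: PipelineState) -> PipelineState: ...


class IntakeStub:
    name = "IntakeAgent"

    def run(self, state: PipelineState) -> PipelineState:
        # Accept only the document types the system is described as handling
        allowed = (".pdf", ".png", ".txt")
        state.data["accepted"] = [d for d in state.documents if d.lower().endswith(allowed)]
        return state


class ExtractionStub:
    name = "ExtractionAgent"

    def run(self, state: PipelineState) -> PipelineState:
        # Placeholder: a real agent would fill these dicts with extracted fields
        state.data["fields"] = {doc: {} for doc in state.data["accepted"]}
        return state


def run_pipeline(agents: List[Agent], state: PipelineState) -> PipelineState:
    """Run each agent in order and record which stages have completed."""
    for agent in agents:
        state = agent.run(state)
        state.log.append(agent.name)
    return state


result = run_pipeline(
    [IntakeStub(), ExtractionStub()],
    PipelineState(documents=["loan_application_1.pdf", "notes.exe"]),
)
```

Keeping the interface to a single `run(state)` method makes it straightforward to insert, reorder, or mock individual stages.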

### Mock Data Generation
The notebook generates synthetic loan application documents, including:
- PDF application forms with applicant information
- ID document images with personal identification
- Financial statements with detailed financial metrics
- Credit reports with credit history and scores

### RAG-Based Underwriting
The underwriting process uses Retrieval Augmented Generation to apply lending policies. This involves:
1. Creating vector embeddings of lending policies
2. Retrieving relevant policies based on application details
3. Applying these policies to make informed lending decisions
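
The three steps above can be sketched without any external services. To keep the example self-contained, a toy bag-of-words vector stands in for the sentence embeddings, and a sorted list stands in for the FAISS index; the policy texts and all function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List

# Hypothetical policy snippets, for illustration only
POLICIES = [
    "Debt consolidation loans require a credit score of at least 680.",
    "Business expansion loans above 75000 require collateral.",
    "Home renovation loans are capped at 60 months.",
]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase token counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, policies: List[str], k: int = 2) -> List[str]:
    """Step 2: rank policies by similarity to the query and keep the top k."""
    q = embed(query)
    return sorted(policies, key=lambda p: cosine(q, embed(p)), reverse=True)[:k]


def build_prompt(application_summary: str, policies: List[str], k: int = 2) -> str:
    """Step 3: place the retrieved policies in the prompt sent to the LLM."""
    context = "\n".join(f"- {p}" for p in retrieve(application_summary, policies, k))
    return (
        f"Lending policies:\n{context}\n\n"
        f"Application:\n{application_summary}\n\n"
        "Decide: approve, reject, or refer, and justify the decision."
    )


top = retrieve("debt consolidation loan, credit score 720", POLICIES, k=1)
prompt = build_prompt("debt consolidation loan, credit score 720", POLICIES, k=1)
```

Swapping `embed` for a sentence-embedding model and `retrieve` for a vector-store similarity search leaves `build_prompt` unchanged.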

### Visualization
The system generates detailed visualizations including:
- Credit score gauges with colored segments
- Financial ratio comparisons
- Risk factor analysis charts
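
The geometry behind a segmented credit score gauge can be sketched in plain Python. The score range and band boundaries below are assumed FICO-style values, not taken from the notebook, and the helper names are hypothetical. The returned angle pairs are in the form expected by `matplotlib.patches.Wedge`, which sweeps counterclockwise from `theta1` to `theta2`.

```python
from typing import List, Tuple

# Assumed range and bands; the notebook's own thresholds may differ
SCORE_MIN, SCORE_MAX = 300, 850
BANDS = [
    (300, 580, "Poor", "red"),
    (580, 670, "Fair", "orange"),
    (670, 740, "Good", "yellow"),
    (740, 800, "Very Good", "yellowgreen"),
    (800, 850, "Exceptional", "green"),
]


def score_to_angle(score: float) -> float:
    """Map a score to degrees on a semicircle: 180 at the minimum, 0 at the maximum."""
    clamped = max(SCORE_MIN, min(SCORE_MAX, score))
    fraction = (clamped - SCORE_MIN) / (SCORE_MAX - SCORE_MIN)
    return 180.0 * (1.0 - fraction)


def band_for(score: float) -> str:
    """Return the label of the band containing the score."""
    for low, high, label, _ in BANDS:
        if low <= score < high:
            return label
    # Scores at or beyond the ends of the range fall into the outermost bands
    return BANDS[-1][2] if score >= SCORE_MAX else BANDS[0][2]


def wedge_angles() -> List[Tuple[float, float, str]]:
    """(theta1, theta2, color) per band, ordered for a counterclockwise sweep."""
    return [(score_to_angle(high), score_to_angle(low), color) for low, high, _, color in BANDS]
```

Each tuple can be passed to `Wedge(center, radius, theta1, theta2, facecolor=color)`, and `score_to_angle(score)` gives the needle direction.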

## Contact

For questions or feedback about this project, please reach out via [LinkedIn](https://www.linkedin.com/in/ibrahima2barry).

In [1]:
# --- Standard Libraries ---
import base64        
import json           
import os      
import logging
import random         
import re 
import shutil
from datetime import datetime, timedelta 
import io
from io import BytesIO # For handling in-memory binary streams (like file data)
from typing import List, Dict, Any, Optional, Union # For type hinting (improves code readability and static analysis)
from dataclasses import dataclass # For creating simple data classes


# --- Data Handling & Structures ---
import numpy as np    
import pandas as pd  
from pydantic import BaseModel, Field # For data validation and settings management
from dataclasses_json import dataclass_json # For easy serialization/deserialization of dataclasses to/from JSON


# --- Image Processing ---
from PIL import Image, ImageDraw, ImageFont # For opening, manipulating, and saving image files, drawing on images, and handling fonts


# --- PDF Processing ---
from pypdf import PdfReader
from reportlab.lib.pagesizes import letter # Defines standard page sizes (e.g., LETTER)
from reportlab.pdfgen import canvas        # Core library for generating PDF documents
from reportlab.lib import colors           # Predefined color constants for PDF generation
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle # For styling text in PDFs
from reportlab.platypus import Paragraph, Spacer # High-level elements for building PDF documents
from reportlab.lib.units import inch       # Units for defining measurements in PDFs


# --- Natural Language Processing (NLP) & Text Processing ---
from langchain.text_splitter import RecursiveCharacterTextSplitter # For splitting long texts into smaller chunks


# --- Vector Databases & Embeddings ---
# from langchain.embeddings import HuggingFaceEmbeddings # Older/General LangChain import
from langchain_huggingface import HuggingFaceEmbeddings # Newer/Specific integration import
from langchain.vectorstores import FAISS            # For creating and querying FAISS vector stores


# --- Generative AI ---
import google.generativeai as genai 


# --- Plotting & Visualization ---
import matplotlib.pyplot as plt                
from matplotlib.patches import Circle, Wedge, Rectangle # Specific shapes for drawing in Matplotlib plots
from IPython.display import display, HTML, clear_output, IFrame
import ipywidgets as widgets   
import tempfile

# --- Configuration & Secrets ---
from dotenv import load_dotenv

In [2]:
# Load environment variables (you'll need to add your Gemini API key to a .env file)
load_dotenv()

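# Set up Gemini API; fail early with a clear message if the key is missing
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
if not GOOGLE_API_KEY:
    raise ValueError("GOOGLE_API_KEY is not set. Add it to a .env file or the environment before running.")
genai.configure(api_key=GOOGLE_API_KEY)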

# Initialize Gemini model
gemini_model = genai.GenerativeModel('gemini-2.0-flash-lite')

## Data Structures and Models
#### Define data structures for the loan processing system

In [3]:

@dataclass_json
@dataclass
class Applicant:
    first_name: str
    last_name: str
    date_of_birth: str
    email: str
    phone: str
    address: str
    ssn: str  # In a real system, this would be securely handled
    employment_status: str
    employer: Optional[str] = None
    job_title: Optional[str] = None
    years_at_current_job: Optional[float] = None

In [4]:
@dataclass_json
@dataclass
class FinancialInfo:
    annual_income: float
    additional_income: float
    monthly_housing_payment: float
    liquid_assets: Optional[float] = None
    existing_debt: Optional[float] = None
    credit_score: Optional[int] = None

In [5]:
@dataclass_json
@dataclass
class LoanRequest:
    loan_purpose: str
    loan_amount: float
    loan_term_months: int
    interest_rate: Optional[float] = None
    collateral_type: Optional[str] = None
    collateral_value: Optional[float] = None

In [6]:
@dataclass_json
@dataclass
class LoanApplication:
    application_id: str
    submission_date: str
    applicant: Applicant
    financial_info: FinancialInfo
    loan_request: LoanRequest
    co_applicant: Optional[Applicant] = None

In [7]:
@dataclass_json
@dataclass
class VerificationResult:
    identity_verified: bool
    document_consistency: bool
    kyc_aml_passed: bool
    employment_verified: bool
    income_verified: bool
    issues: List[str]
    verification_date: str

In [8]:
@dataclass_json
@dataclass
class CreditAssessment:
    credit_score: int
    risk_score: float
    debt_to_income_ratio: float
    payment_to_income_ratio: float
    loan_to_value_ratio: Optional[float] = None
    risk_factors: Optional[List[str]] = None
    assessment_date: Optional[str] = None

In [9]:
@dataclass_json
@dataclass
class UnderwritingDecision:
    decision: str  # "approve", "reject", "refer"
    confidence_score: float
    reasons: List[str]
    conditions: List[str]
    max_approved_amount: Optional[float] = None
    approved_term_months: Optional[int] = None
    approved_interest_rate: Optional[float] = None
    decision_date: Optional[str] = None

In [10]:
@dataclass_json
@dataclass
class LoanProcessingReport:
    application_id: str
    applicant_name: str
    loan_amount: float
    loan_purpose: str
    decision: str
    verification_summary: str
    credit_summary: str
    underwriting_summary: str
    recommended_action: str
    conditions: List[str]
    processing_date: str
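
Because these records are plain dataclasses, they convert cleanly to nested dictionaries and JSON. The sketch below shows the round trip using only the standard library, with trimmed-down hypothetical stand-ins (`MiniApplicant`, `MiniApplication`) instead of the full classes. In the notebook itself, the `dataclass_json` decorator supplies `to_json()` and `from_json()` for the same purpose.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class MiniApplicant:
    first_name: str
    last_name: str
    employer: Optional[str] = None


@dataclass
class MiniApplication:
    application_id: str
    applicant: MiniApplicant
    co_applicant: Optional[MiniApplicant] = None


def application_from_dict(raw: dict) -> MiniApplication:
    """Rebuild the nested dataclasses from a plain dictionary."""
    co = raw.get("co_applicant")
    return MiniApplication(
        application_id=raw["application_id"],
        applicant=MiniApplicant(**raw["applicant"]),
        co_applicant=MiniApplicant(**co) if co else None,
    )


app = MiniApplication("LOAN-12345", MiniApplicant("John", "Smith", "TechCorp Inc."))
payload = json.dumps(asdict(app))                       # nested dataclasses -> JSON text
restored = application_from_dict(json.loads(payload))   # JSON text -> dataclasses
```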

## Mock Data Generation

In [11]:
def generate_realistic_mock_data():
    """Generate a set of realistic mock loan application documents."""
    
    # Create directory for mock data if it doesn't exist
    os.makedirs('mock_data', exist_ok=True)
    
    # Create a list of mock applicants for variety
    mock_applicants = [
        {
            "first_name": "John", 
            "last_name": "Smith",
            "dob": "1985-06-15",
            "email": "john.smith@example.com",
            "phone": "(555) 123-4567",
            "address": "123 Main St, Anytown, US 12345",
            "ssn": "XXX-XX-4321",
            "employment": "Employed",
            "employer": "TechCorp Inc.",
            "job_title": "Senior Developer",
            "years_employed": 11,
            "annual_income": 170000,
            "additional_income": 10000,
            "housing_payment": 2500,
            "assets": 45000,
            "debts": 35000,
            "credit_score": 802,
            "loan_purpose": "Home Renovation",
            "loan_amount": 50000,
            "loan_term": 60
        },
        {
            "first_name": "Sarah", 
            "last_name": "Johnson",
            "dob": "1990-03-22",
            "email": "sarah.j@example.com",
            "phone": "(555) 987-6543",
            "address": "456 Oak Ave, Somewhere, US 54321",
            "ssn": "XXX-XX-8765",
            "employment": "Employed",
            "employer": "Health Systems Inc.",
            "job_title": "Registered Nurse",
            "years_employed": 7,
            "annual_income": 85000,
            "additional_income": 5000,
            "housing_payment": 1800,
            "assets": 30000,
            "debts": 22000,
            "credit_score": 720,
            "loan_purpose": "Debt Consolidation",
            "loan_amount": 30000,
            "loan_term": 48
        },
        {
            "first_name": "Michael", 
            "last_name": "Williams",
            "dob": "1978-11-08",
            "email": "m.williams@example.com",
            "phone": "(555) 456-7890",
            "address": "789 Pine Rd, Elsewhere, US 98765",
            "ssn": "XXX-XX-1234",
            "employment": "Self-Employed",
            "employer": "Williams Consulting LLC",
            "job_title": "Financial Consultant",
            "years_employed": 12,
            "annual_income": 150000,
            "additional_income": 20000,
            "housing_payment": 3200,
            "assets": 120000,
            "debts": 80000,
            "credit_score": 750,
            "loan_purpose": "Business Expansion",
            "loan_amount": 100000,
            "loan_term": 84
        }
    ]
    
    # Generate random application IDs
    application_ids = [f"LOAN-{random.randint(10000, 99999)}" for _ in range(len(mock_applicants))]
    
    # Generate submission dates (within the last month)
    today = datetime.now()
    submission_dates = [(today - timedelta(days=random.randint(1, 30))).strftime("%Y-%m-%d") for _ in range(len(mock_applicants))]
    
    # Generate PDF application forms
    pdf_paths = []
    for i, applicant in enumerate(mock_applicants):
        pdf_path = os.path.join('mock_data', f"loan_application_{i+1}.pdf")
        buffer = BytesIO()
        c = canvas.Canvas(buffer, pagesize=letter)
        width, height = letter
        
        # Add header with logo placeholder
        c.setFont("Helvetica-Bold", 18)
        c.drawString(50, height - 50, "FIRST NATIONAL BANK")
        c.setFont("Helvetica-Bold", 16)
        c.drawString(50, height - 80, "LOAN APPLICATION FORM")
        c.setFont("Helvetica", 12)
        c.drawString(50, height - 100, f"Application #: {application_ids[i]}")
        c.drawString(50, height - 120, f"Date: {submission_dates[i]}")
        
        # Add form header
        c.setFillColorRGB(0.9, 0.9, 0.9)
        c.rect(50, height - 140, width - 100, 20, fill=1)
        c.setFillColorRGB(0, 0, 0)
        c.setFont("Helvetica-Bold", 12)
        c.drawString(55, height - 155, "PERSONAL INFORMATION")
        
        # Add personal information
        y_pos = height - 180
        c.setFont("Helvetica", 11)
        fields = [
            ("First Name:", applicant["first_name"]),
            ("Last Name:", applicant["last_name"]),
            ("Date of Birth:", applicant["dob"]),
            ("Email:", applicant["email"]),
            ("Phone:", applicant["phone"]),
            ("Address:", applicant["address"]),
            ("SSN:", applicant["ssn"])
        ]
        
        for label, value in fields:
            c.drawString(55, y_pos, label)
            c.drawString(200, y_pos, str(value))
            y_pos -= 20
        
        # Add employment section
        y_pos -= 20
        c.setFillColorRGB(0.9, 0.9, 0.9)
        c.rect(50, y_pos, width - 100, 20, fill=1)
        c.setFillColorRGB(0, 0, 0)
        c.setFont("Helvetica-Bold", 12)
        c.drawString(55, y_pos - 15, "EMPLOYMENT INFORMATION")
        
        # Add employment information
        y_pos -= 40
        c.setFont("Helvetica", 11)
        fields = [
            ("Employment Status:", applicant["employment"]),
            ("Employer:", applicant["employer"]),
            ("Job Title:", applicant["job_title"]),
            ("Years at Current Job:", str(applicant["years_employed"]))
        ]
        
        for label, value in fields:
            c.drawString(55, y_pos, label)
            c.drawString(200, y_pos, str(value))
            y_pos -= 20
        
        # Add financial section
        y_pos -= 20
        c.setFillColorRGB(0.9, 0.9, 0.9)
        c.rect(50, y_pos, width - 100, 20, fill=1)
        c.setFillColorRGB(0, 0, 0)
        c.setFont("Helvetica-Bold", 12)
        c.drawString(55, y_pos - 15, "FINANCIAL INFORMATION")
        
        # Add financial information
        y_pos -= 40
        c.setFont("Helvetica", 11)
        fields = [
            ("Annual Income:", f"${applicant['annual_income']:,}"),
            ("Additional Income:", f"${applicant['additional_income']:,}"),
            ("Monthly Housing Payment:", f"${applicant['housing_payment']:,}"),
            ("Liquid Assets:", f"${applicant['assets']:,}"),
            ("Existing Debt:", f"${applicant['debts']:,}")
        ]
        
        for label, value in fields:
            c.drawString(55, y_pos, label)
            c.drawString(200, y_pos, str(value))
            y_pos -= 20
        
        # Add loan request section
        y_pos -= 20
        c.setFillColorRGB(0.9, 0.9, 0.9)
        c.rect(50, y_pos, width - 100, 20, fill=1)
        c.setFillColorRGB(0, 0, 0)
        c.setFont("Helvetica-Bold", 12)
        c.drawString(55, y_pos - 15, "LOAN REQUEST")
        
        # Add loan information
        y_pos -= 40
        c.setFont("Helvetica", 11)
        fields = [
            ("Loan Purpose:", applicant["loan_purpose"]),
            ("Loan Amount Requested:", f"${applicant['loan_amount']:,}"),
            ("Loan Term (months):", str(applicant["loan_term"]))
        ]
        
        for label, value in fields:
            c.drawString(55, y_pos, label)
            c.drawString(200, y_pos, str(value))
            y_pos -= 20
        
        # Add signature section
        y_pos -= 50
        c.line(55, y_pos, 200, y_pos)
        c.drawString(55, y_pos - 15, "Applicant Signature")
        c.drawString(55, y_pos - 35, applicant["first_name"] + " " + applicant["last_name"])
        
        c.line(300, y_pos, 445, y_pos)
        c.drawString(300, y_pos - 15, "Date")
        c.drawString(300, y_pos - 35, submission_dates[i])
        
        # Add footer
        c.setFont("Helvetica", 8)
        c.drawString(width/2 - 100, 30, "This is a confidential document. Do not distribute.")
        
        c.save()
        buffer.seek(0)
        
        # Save PDF to file
        with open(pdf_path, 'wb') as f:
            f.write(buffer.getvalue())
        
        pdf_paths.append(pdf_path)
    
    # Generate ID documents
    id_paths = []
    for i, applicant in enumerate(mock_applicants):
        id_path = os.path.join('mock_data', f"id_document_{i+1}.png")
        
        # Create a blank image
        width, height = 1000, 650
        id_card = Image.new('RGB', (width, height), color=(255, 255, 255))
        draw = ImageDraw.Draw(id_card)
        
        # Add a border and background design
        # Main border
        draw.rectangle([(20, 20), (width-20, height-20)], outline=(0, 51, 102), width=3)
        
        # Header background
        draw.rectangle([(20, 20), (width-20, 100)], fill=(0, 51, 102))
        
        # Try to use a font or fall back to default
        try:
            header_font = ImageFont.truetype("Arial.ttf", 36)
            title_font = ImageFont.truetype("Arial.ttf", 24)
            regular_font = ImageFont.truetype("Arial.ttf", 20)
            small_font = ImageFont.truetype("Arial.ttf", 16)
        except IOError:
            # Use default font
            header_font = ImageFont.load_default()
            title_font = ImageFont.load_default()
            regular_font = ImageFont.load_default()
            small_font = ImageFont.load_default()
        
        # Add header
        draw.text((width//2, 60), "STATE OF EXAMPLE", fill=(255, 255, 255), font=header_font, anchor="mm")
        draw.text((width//2, 150), "IDENTIFICATION CARD", fill=(0, 0, 0), font=title_font, anchor="mm")
        
        # Add ID card elements with better layout
        # Left side: Text information
        text_start_y = 200
        text_start_x = 50
        
        # Create a background for the text area
        draw.rectangle([(text_start_x-20, text_start_y-20), (width//2 + 50, height-100)], fill=(240, 240, 240))
        
        info_fields = [
            ("NAME:", f"{applicant['last_name'].upper()}, {applicant['first_name'].upper()}"),
            ("DOB:", applicant['dob']),
            ("ADDRESS:", applicant['address'].split(',')[0].upper()),
            ("", applicant['address'].split(',')[1].strip().upper()),
            ("ID#:", f"DL{random.randint(1000000, 9999999)}"),
            ("ISSUED:", (datetime.now() - timedelta(days=random.randint(300, 700))).strftime("%m/%d/%Y")),
            ("EXPIRES:", (datetime.now() + timedelta(days=random.randint(300, 1500))).strftime("%m/%d/%Y"))
        ]
        
        # Use a separate counter so the outer applicant index `i` is not shadowed
        for row, (label, value) in enumerate(info_fields):
            y_pos = text_start_y + row * 40
            if label:
                draw.text((text_start_x, y_pos), label, fill=(0, 51, 102), font=regular_font)
                draw.text((text_start_x + 120, y_pos), value, fill=(0, 0, 0), font=regular_font)
            else:
                # For continuation lines (like address line 2)
                draw.text((text_start_x + 120, y_pos), value, fill=(0, 0, 0), font=regular_font)
        
        # Right side: Photo placeholder
        photo_left = width//2 + 100
        photo_top = 200
        photo_width = 300
        photo_height = 350
        
        # Draw photo background
        draw.rectangle([(photo_left, photo_top), (photo_left + photo_width, photo_top + photo_height)], 
                       fill=(200, 200, 200), outline=(0, 0, 0), width=2)
        
        # Draw photo label
        draw.text((photo_left + photo_width//2, photo_top + photo_height//2), 
                  "PHOTO", fill=(100, 100, 100), font=title_font, anchor="mm")
        
        # Add barcode placeholder at bottom
        barcode_top = height - 80
        draw.rectangle([(width//4, barcode_top), (width*3//4, barcode_top + 40)], fill=(0, 0, 0))
        
        # Add card ID and disclaimer at bottom
        draw.text((width//2, height - 20), f"CARD ID: {random.randint(10000000, 99999999)}", 
                  fill=(0, 0, 0), font=small_font, anchor="mm")
        
        # Save ID image
        id_card.save(id_path)
        id_paths.append(id_path)
    
    # Generate financial statements
    statement_paths = []
    for i, applicant in enumerate(mock_applicants):
        statement_path = os.path.join('mock_data', f"financial_statement_{i+1}.txt")
        
        # Create statement text with more realistic data
        quarterly_income = applicant["annual_income"] / 4
        investment_income = applicant["additional_income"] / 4
        total_income = quarterly_income + investment_income
        
        checking = random.randint(5000, 20000)
        savings = random.randint(10000, 50000)
        investments = random.randint(50000, 200000)
        retirement = random.randint(100000, 500000)
        real_estate = random.randint(300000, 700000)
        vehicles = random.randint(15000, 50000)
        total_assets = checking + savings + investments + retirement + real_estate + vehicles
        
        mortgage = random.randint(0, 500000)
        car_loan = random.randint(0, 30000)
        student_loans = random.randint(0, 50000)
        credit_card = random.randint(0, 10000)
        personal_loans = random.randint(0, 30000)
        total_liabilities = mortgage + car_loan + student_loans + credit_card + personal_loans
        
        net_worth = total_assets - total_liabilities
        
        mortgage_payment = mortgage * 0.005 if mortgage > 0 else 0  # Approximate monthly payment
        car_payment = car_loan * 0.02 if car_loan > 0 else 0  # Approximate monthly payment
        insurance = random.randint(200, 500)
        utilities = random.randint(200, 600)
        groceries = random.randint(500, 1500)
        entertainment = random.randint(200, 800)
        other_expenses = random.randint(500, 1200)
        total_monthly_expenses = mortgage_payment + car_payment + insurance + utilities + groceries + entertainment + other_expenses
        
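        # Rough proxy for the mock statement: liability balances spread over 12 months,
        # relative to monthly income (a true DTI uses recurring monthly debt payments)
        dti_ratio = ((total_liabilities / 12) / ((applicant["annual_income"] + applicant["additional_income"]) / 12)) * 100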
        
        statement = f"""
        FINANCIAL STATEMENT - {applicant["first_name"].upper()} {applicant["last_name"].upper()}
        Statement Period: January 1, 2025 - March 31, 2025
        Prepared Date: {(datetime.now() - timedelta(days=random.randint(1, 30))).strftime("%m/%d/%Y")}
        
        INCOME SUMMARY:
        Primary Employment: ${quarterly_income:,.2f} (quarterly)
        Investment Income: ${investment_income:,.2f} (quarterly)
        Total Income: ${total_income:,.2f}
        
        ASSETS:
        Checking Account: ${checking:,.2f}
        Savings Account: ${savings:,.2f}
        Investment Portfolio: ${investments:,.2f}
        Retirement Accounts: ${retirement:,.2f}
        Real Estate (Primary Residence): ${real_estate:,.2f}
        Vehicles: ${vehicles:,.2f}
        Total Assets: ${total_assets:,.2f}
        
        LIABILITIES:
        Mortgage Balance: ${mortgage:,.2f}
        Car Loan: ${car_loan:,.2f}
        Student Loans: ${student_loans:,.2f}
        Credit Card Debt: ${credit_card:,.2f}
        Personal Loans: ${personal_loans:,.2f}
        Total Liabilities: ${total_liabilities:,.2f}
        
        NET WORTH: ${net_worth:,.2f}
        
        MONTHLY EXPENSES:
        Mortgage Payment: ${mortgage_payment:,.2f}
        Car Payment: ${car_payment:,.2f}
        Insurance (Home/Auto): ${insurance:,.2f}
        Utilities: ${utilities:,.2f}
        Groceries & Dining: ${groceries:,.2f}
        Entertainment: ${entertainment:,.2f}
        Other Expenses: ${other_expenses:,.2f}
        Total Monthly Expenses: ${total_monthly_expenses:,.2f}
        
        DEBT-TO-INCOME RATIO: {dti_ratio:.2f}%
        
        I certify that the information provided above is accurate and complete.
        
        {applicant["first_name"]} {applicant["last_name"]}
        {(datetime.now() - timedelta(days=random.randint(1, 5))).strftime("%m/%d/%Y")}
        """
        
        # Save text statement
        with open(statement_path, 'w') as f:
            f.write(statement)
        
        statement_paths.append(statement_path)
    
    # Generate mock credit reports
    credit_report_paths = []
    for i, applicant in enumerate(mock_applicants):
        credit_report_path = os.path.join('mock_data', f"credit_report_{i+1}.json")
        
        # Create credit history with more realistic accounts
        credit_accounts = []
        
        # Credit card
        credit_accounts.append({
            "account_type": "Credit Card",
            "account_status": "Current",
            "payment_history": "Excellent",
            "current_balance": random.randint(500, 5000),
            "credit_limit": random.randint(5000, 20000),
            "open_date": (datetime.now() - timedelta(days=random.randint(365, 2555))).strftime("%Y-%m-%d")
        })
        
        # Auto loan
        if random.random() > 0.3:  # 70% chance of having auto loan
            credit_accounts.append({
                "account_type": "Auto Loan",
                "account_status": "Current",
                "payment_history": "Good",
                "current_balance": random.randint(5000, 25000),
                "original_loan_amount": random.randint(15000, 40000),
                "open_date": (datetime.now() - timedelta(days=random.randint(180, 1095))).strftime("%Y-%m-%d")
            })
        
        # Mortgage
        if random.random() > 0.4:  # 60% chance of having a mortgage
            credit_accounts.append({
                "account_type": "Mortgage",
                "account_status": "Current",
                "payment_history": "Excellent",
                "current_balance": random.randint(100000, 500000),
                "original_loan_amount": random.randint(150000, 600000),
                "open_date": (datetime.now() - timedelta(days=random.randint(365, 3650))).strftime("%Y-%m-%d")
            })
        
        # Student loan
        if random.random() > 0.6:  # 40% chance of having student loans
            credit_accounts.append({
                "account_type": "Student Loan",
                "account_status": "Current",
                "payment_history": "Good",
                "current_balance": random.randint(5000, 80000),
                "original_loan_amount": random.randint(10000, 100000),
                "open_date": (datetime.now() - timedelta(days=random.randint(1095, 5475))).strftime("%Y-%m-%d")
            })
        
        # Personal loan
        if random.random() > 0.7:  # 30% chance of having a personal loan
            credit_accounts.append({
                "account_type": "Personal Loan",
                "account_status": "Current",
                "payment_history": "Good",
                "current_balance": random.randint(2000, 15000),
                "original_loan_amount": random.randint(5000, 20000),
                "open_date": (datetime.now() - timedelta(days=random.randint(90, 720))).strftime("%Y-%m-%d")
            })
        
        # Create full credit report
        inquiries = random.randint(0, 3)
        derogatory_marks = random.randint(0, 1)
        collections = 0 if random.random() > 0.1 else 1  # 10% chance of collections
        
        credit_report = {
            "applicant_id": f"AP-{random.randint(10000, 99999)}",
            "credit_score": applicant["credit_score"],
            "report_date": datetime.now().strftime("%Y-%m-%d"),
            "credit_history": credit_accounts,
            "inquiries_last_6_months": inquiries,
            "derogatory_marks": derogatory_marks,
            "collections": collections,
            "public_records": 0
        }
        
        # Save credit report
        with open(credit_report_path, 'w') as f:
            json.dump(credit_report, f, indent=2)
        
        credit_report_paths.append(credit_report_path)
    
    print(f"✅ Generated {len(mock_applicants)} complete loan application packages.")
    print(f"  - PDF application forms: {len(pdf_paths)}")
    print(f"  - ID documents: {len(id_paths)}")
    print(f"  - Financial statements: {len(statement_paths)}")
    print(f"  - Credit reports: {len(credit_report_paths)}")
    print("All files are saved in the 'mock_data' directory.")
    
    # Return all paths for easy access
    return {
        "applicants": mock_applicants,
        "application_forms": pdf_paths,
        "id_documents": id_paths,
        "financial_statements": statement_paths,
        "credit_reports": credit_report_paths
    }

In [12]:
# Function to simulate user file upload for notebook testing
def simulate_file_upload(mock_data_index=0):
    """
    Simulate file upload for testing in the notebook.
    
    Args:
        mock_data_index: Index of the mock applicant to use (0, 1, or 2)
    
    Returns:
        List of file paths for the selected applicant
    """
    # Generate mock data if it doesn't exist
    if not os.path.exists('mock_data'):
        mock_data = generate_realistic_mock_data()
    else:
        # Get paths from the existing mock_data directory.
        # Listing once and sorting keeps each document type in a consistent order.
        files = sorted(os.listdir('mock_data'))
        
        def paths_with_prefix(prefix):
            return [os.path.join('mock_data', f) for f in files if f.startswith(prefix)]
        
        mock_data = {
            "application_forms": paths_with_prefix('loan_application_'),
            "id_documents": paths_with_prefix('id_document_'),
            "financial_statements": paths_with_prefix('financial_statement_'),
            "credit_reports": paths_with_prefix('credit_report_')
        }
    
    # Select files for the requested applicant, clamping the index to the available range
    index = max(0, min(mock_data_index, len(mock_data["application_forms"]) - 1))
    
    # Return the actual file paths as a LIST (not a dictionary)
    return [
        mock_data["application_forms"][index],
        mock_data["id_documents"][index],
        mock_data["financial_statements"][index]
    ]
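
When existing mock files are reloaded, the path lists are ordered with a plain string sort. That is fine for three applicants, but with ten or more a name such as `loan_application_10.pdf` would sort ahead of `loan_application_2.pdf`, and the per-applicant lists could fall out of step with the intended numbering. A minimal sketch of a number-aware sort key follows, assuming the `<prefix>_<number>.<ext>` naming seen in the notebook output; `natural_key` is a hypothetical helper, not part of the notebook.

```python
import re

def natural_key(path):
    """Split a path into text and integer chunks so embedded numbers compare numerically."""
    return [int(chunk) if chunk.isdigit() else chunk.lower()
            for chunk in re.split(r'(\d+)', path)]

# Illustrative paths only; the notebook itself generates three applicants
paths = [
    "mock_data/loan_application_10.pdf",
    "mock_data/loan_application_2.pdf",
    "mock_data/loan_application_1.pdf",
]

plain_sorted = sorted(paths)                     # string order: 1, 10, 2
natural_sorted = sorted(paths, key=natural_key)  # numeric order: 1, 2, 10

print(plain_sorted)
print(natural_sorted)
```

Passing `key=natural_key` to each sort would be enough to adopt this ordering.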

In [13]:
def generate_mock_financial_statement():
    """Generate a mock financial statement text."""
    statement = """
    FINANCIAL STATEMENT - JOHN SMITH
    Statement Period: January 1, 2025 - March 31, 2025
    
    INCOME SUMMARY:
    Primary Employment: $30,000 (quarterly)
    Investment Income: $2,500 (quarterly)
    Total Income: $32,500
    
    ASSETS:
    Checking Account: $15,000
    Savings Account: $30,000
    Investment Portfolio: $120,000
    Retirement Accounts: $250,000
    Real Estate (Primary Residence): $450,000
    Vehicles: $35,000
    Total Assets: $900,000
    
    LIABILITIES:
    Mortgage Balance: $320,000
    Car Loan: $15,000
    Student Loans: $0
    Credit Card Debt: $2,000
    Personal Loans: $18,000
    Total Liabilities: $355,000
    
    NET WORTH: $545,000
    
    MONTHLY EXPENSES:
    Mortgage Payment: $2,100
    Car Payment: $400
    Insurance (Home/Auto): $350
    Utilities: $400
    Groceries & Dining: $1,200
    Entertainment: $500
    Other Expenses: $800
    Total Monthly Expenses: $5,750
    
    DEBT-TO-INCOME RATIO: 21.23%
    
    I certify that the information provided above is accurate and complete.
    
    John Smith
    04/01/2025
    """
    return statement

In [14]:
# Create a mock lending policy database for RAG
def create_mock_lending_policies():
    """Create mock lending policies for the RAG system."""
    policies = [
        {
            "policy_id": "LP-001",
            "policy_name": "General Eligibility Requirements",
            "policy_text": """
            All loan applicants must meet the following eligibility criteria:
            1. Minimum age of 18 years.
            2. Valid government-issued identification.
            3. Permanent resident or citizen of the United States.
            4. Valid Social Security Number or Tax Identification Number.
            5. Verifiable income source.
            6. No bankruptcy filings within the past 3 years.
            7. No foreclosures within the past 5 years.
            """
        },
        {
            "policy_id": "LP-002",
            "policy_name": "Credit Score Requirements",
            "policy_text": """
            Credit score requirements by loan type:
            1. Personal Loans:
               a. Excellent (Approve): 720+
               b. Good (Consider): 680-719
               c. Fair (Refer): 620-679
               d. Poor (Decline): Below 620
            
            2. Home Renovation Loans:
               a. Excellent (Approve): 700+
               b. Good (Consider): 660-699
               c. Fair (Refer): 600-659
               d. Poor (Decline): Below 600
            
            3. Debt Consolidation Loans:
               a. Excellent (Approve): 700+
               b. Good (Consider): 660-699
               c. Fair (Refer): 640-659
               d. Poor (Decline): Below 640
            """
        },
        {
            "policy_id": "LP-003",
            "policy_name": "Debt-to-Income Ratio Policy",
            "policy_text": """
            Maximum acceptable Debt-to-Income (DTI) ratios:
            1. Personal Loans: Maximum DTI of 45%
            2. Home Renovation Loans: Maximum DTI of 43%
            3. Debt Consolidation Loans: Maximum DTI of 50%
            
            DTI Calculation Method:
            DTI = (Total Monthly Debt Payments) ÷ (Gross Monthly Income) × 100
            
            For all loan types, the following DTI thresholds apply:
            - Below 36%: Favorable consideration
            - 36% to maximum: Additional review required
            - Above maximum: Application denial or significant compensating factors required
            """
        },
        {
            "policy_id": "LP-004",
            "policy_name": "Employment and Income Verification",
            "policy_text": """
            Employment and income verification requirements:
            1. Minimum employment duration:
               a. W2 employees: At least 6 months with current employer
               b. Self-employed: At least 2 years of documented self-employment
            
            2. Income verification documents:
               a. W2 employees: Last 2 pay stubs and previous year's W2
               b. Self-employed: Last 2 years of tax returns and current profit/loss statement
            
            3. Income stability:
               a. Income should be stable or increasing
               b. Unexplained gaps in employment over 30 days require explanation
               c. Recent job changes must be within the same industry or represent career advancement
            """
        },
        {
            "policy_id": "LP-005",
            "policy_name": "Loan-to-Value Ratio Policy",
            "policy_text": """
            Maximum Loan-to-Value (LTV) ratios for Home Renovation Loans:
            1. Primary residence: Maximum LTV of 85%
            2. Secondary residence: Maximum LTV of 75%
            3. Investment property: Maximum LTV of 70%
            
            LTV Calculation:
            LTV = (Loan Amount) ÷ (Appraised Property Value) × 100
            
            For all secured loans, property valuation must be completed by an approved appraiser
            or derived from an approved automated valuation model.
            """
        },
        {
            "policy_id": "LP-006",
            "policy_name": "Home Renovation Loan Specific Policies",
            "policy_text": """
            Home Renovation Loan specific requirements:
            1. Loan purpose must be for qualifying home improvements or renovations.
            2. Maximum loan amount: $100,000
            3. Maximum loan term: 180 months (15 years)
            4. Minimum credit score: 600
            5. Maximum debt-to-income ratio: 43%
            6. Property must be owner-occupied or second home.
            7. Renovations must increase home value or address necessary repairs.
            8. Funding can be disbursed in phases for larger projects based on completion milestones.
            
            Approved renovation purposes include:
            - Kitchen or bathroom remodels
            - Roof replacement or repair
            - HVAC replacement or repair
            - Energy efficiency improvements
            - Necessary structural repairs
            - Room additions or finishing unfinished spaces
            - Accessibility modifications
            """
        },
        {
            "policy_id": "LP-007",
            "policy_name": "Interest Rate Determination",
            "policy_text": """
            Interest rates are determined based on the following factors:
            1. Current market rates (base rate)
            2. Loan type and term
            3. Credit score tier
            4. Debt-to-income ratio
            5. Loan-to-value ratio (for secured loans)
            6. Relationship discounts
            
            Home Renovation Loan interest rate tiers:
            - Tier 1 (Excellent): Credit score 720+, DTI < 36%
              Base rate + 0.0-0.5%
            - Tier 2 (Good): Credit score 660-719, DTI < 40%
              Base rate + 0.75-1.25%
            - Tier 3 (Fair): Credit score 600-659, DTI < 43%
              Base rate + 1.5-2.5%
            
            Current base rate for Home Renovation Loans: 6.25%
            """
        },
        {
            "policy_id": "LP-008",
            "policy_name": "Loan Conditions and Exceptions",
            "policy_text": """
            Common loan conditions that may be applied:
            1. Proof of completion for renovation projects
            2. Additional collateral for borderline applications
            3. Co-signer requirement for credit enhancement
            4. Automatic payment enrollment
            5. Homeowners insurance verification
            
            Exception authority levels:
            1. Loan officers: No exception authority
            2. Underwriting team leads: Exceptions up to 5% outside guidelines
            3. Credit committee: Exceptions up to 15% outside guidelines
            4. Chief Credit Officer: Exceptions beyond 15% outside guidelines
            
            All exceptions must be documented with compensating factors and approval
            from the appropriate authority level.
            """
        }
    ]
    
    # Save policies to a JSON file
    with open("lending_policies.json", "w") as f:
        json.dump(policies, f, indent=2)
    
    print("Mock lending policies saved to: lending_policies.json")
    return policies
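
LP-003 and LP-005 define DTI and LTV as simple percentages, and LP-003 adds review thresholds on top of the per-loan-type maximums. The sketch below applies those formulas and thresholds to illustrative numbers. The function names and sample figures are assumptions made for this example and are not used elsewhere in the notebook.

```python
# Maximum DTI per loan type, copied from the LP-003 policy text
MAX_DTI = {
    "Personal": 45.0,
    "Home Renovation": 43.0,
    "Debt Consolidation": 50.0,
}

def dti_percent(total_monthly_debt_payments, gross_monthly_income):
    """DTI = (Total Monthly Debt Payments) / (Gross Monthly Income) x 100."""
    if gross_monthly_income <= 0:
        raise ValueError("gross_monthly_income must be positive")
    return total_monthly_debt_payments / gross_monthly_income * 100

def ltv_percent(loan_amount, appraised_value):
    """LTV = (Loan Amount) / (Appraised Property Value) x 100."""
    if appraised_value <= 0:
        raise ValueError("appraised_value must be positive")
    return loan_amount / appraised_value * 100

def classify_dti(dti, loan_type):
    """Map a DTI percentage onto the three LP-003 bands for the given loan type."""
    if dti > MAX_DTI[loan_type]:
        return "Above maximum: denial or compensating factors required"
    if dti >= 36.0:
        return "Additional review required"
    return "Favorable consideration"

# Illustrative figures only
dti = dti_percent(2500, 10000)
ltv = ltv_percent(50000, 450000)
print(f"DTI: {dti:.2f}% -> {classify_dti(dti, 'Home Renovation')}")
print(f"LTV: {ltv:.2f}%")
```

Keeping the thresholds in one dictionary mirrors the policy text, so a change to LP-003 would only need to be reflected in one place.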


In [15]:
# Create mock credit bureau responses
def create_mock_credit_data():
    """Create mock credit bureau data."""
    credit_data = {
        "applicant_id": "AP-12345",
        "credit_score": 682,
        "report_date": "2025-04-01",
        "credit_history": [
            {
                "account_type": "Credit Card",
                "account_status": "Current",
                "payment_history": "Excellent",
                "current_balance": 1500,
                "credit_limit": 10000,
                "open_date": "2018-05-15"
            },
            {
                "account_type": "Auto Loan",
                "account_status": "Current",
                "payment_history": "Good",
                "current_balance": 12000,
                "original_loan_amount": 25000,
                "open_date": "2021-03-10"
            },
            {
                "account_type": "Mortgage",
                "account_status": "Current",
                "payment_history": "Excellent",
                "current_balance": 320000,
                "original_loan_amount": 350000,
                "open_date": "2020-08-22"
            }
        ],
        "inquiries_last_6_months": 2,
        "derogatory_marks": 0,
        "collections": 0,
        "public_records": 0
    }
    
    # Save to a JSON file
    with open("credit_report.json", "w") as f:
        json.dump(credit_data, f, indent=2)
    
    print("Mock credit report data saved to: credit_report.json")
    return credit_data
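
The mock report above carries a credit score of 682. As a rough illustration of how the LP-002 tiers would read that score, the sketch below maps a score to a tier label per loan type. `SCORE_TIERS` and `score_tier` are hypothetical names, and the cut-offs are copied from the LP-002 policy text.

```python
# Lower bound of each tier per loan type, highest tier first (from LP-002)
SCORE_TIERS = {
    "Personal": [(720, "Excellent (Approve)"), (680, "Good (Consider)"), (620, "Fair (Refer)")],
    "Home Renovation": [(700, "Excellent (Approve)"), (660, "Good (Consider)"), (600, "Fair (Refer)")],
    "Debt Consolidation": [(700, "Excellent (Approve)"), (660, "Good (Consider)"), (640, "Fair (Refer)")],
}

def score_tier(score, loan_type):
    """Return the LP-002 tier label for a credit score and loan type."""
    for lower_bound, label in SCORE_TIERS[loan_type]:
        if score >= lower_bound:
            return label
    # Anything below the lowest listed bound falls into the decline tier
    return "Poor (Decline)"

for loan_type in SCORE_TIERS:
    print(f"{loan_type}: {score_tier(682, loan_type)}")
```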

In [16]:
# Generate and save mock data
def generate_sample_data():
    """Generate and save all mock data needed for the application."""
    print("Generating mock loan application data...")
    
    # Generate the mock application documents
    mock_data = generate_realistic_mock_data()
    
    # Create mock lending policies
    lending_policies = create_mock_lending_policies()
    
    # Reuse the credit reports generated with the application packages
    mock_credit_reports = mock_data['credit_reports']
    print(f"✅ Using {len(mock_credit_reports)} mock credit reports")
    
    print("Mock data generation complete!")
    return mock_data

# Run this function to generate all needed mock data
mock_data = generate_sample_data()

# Set up paths for the first applicant for immediate use
file_paths = {
    "application_form": mock_data["application_forms"][0],
    "id_document": mock_data["id_documents"][0],
    "financial_statement": mock_data["financial_statements"][0],
    "credit_report": mock_data["credit_reports"][0]
}

print(f"Ready to process applications with files from: {file_paths}")

Generating mock loan application data...
✅ Generated 3 complete loan application packages.
  - PDF application forms: 3
  - ID documents: 3
  - Financial statements: 3
  - Credit reports: 3
All files are saved in the 'mock_data' directory.
Mock lending policies saved to: lending_policies.json
✅ Using 3 mock credit reports
Mock data generation complete!
Ready to process applications with files from: {'application_form': 'mock_data/loan_application_1.pdf', 'id_document': 'mock_data/id_document_1.png', 'financial_statement': 'mock_data/financial_statement_1.txt', 'credit_report': 'mock_data/credit_report_1.json'}


## Agent 1: Intake Agent

In [17]:
class IntakeAgent:
    """
    Responsible for validating and processing incoming loan application documents.
    
    This agent:
    1. Identifies document types
    2. Validates document formats and completeness
    3. Prepares documents for extraction
    """
    
    def __init__(self):
        self.model = gemini_model
        self.supported_doc_types = ["application_form", "id_document", "financial_statement"]
    
    def identify_document(self, file_path):
        """Identify the type of document based on its content."""
        file_ext = file_path.split('.')[-1].lower()
        
        # Load file depending on extension
        if file_ext == 'pdf':
            # For PDF, read text and analyze content
            reader = PdfReader(file_path)
            text = ""
            for page in reader.pages:
                text += page.extract_text() or ""  # tolerate pages that yield no text
            
            # Prepare prompt for Gemini
            prompt = f"""
            Identify this document type. It's a PDF with the following content:
            
            {text[:2000]}
            
            Classify this as one of:
            - application_form: Loan application forms containing personal and financial information
            - id_document: ID cards or documents that verify identity
            - financial_statement: Bank statements, income statements, or other financial records
            
            Return only the document type as a single word.
            """
            
            response = self.model.generate_content(prompt)
            doc_type = response.text.strip().lower()
            
            return doc_type
            
        elif file_ext in ['png', 'jpg', 'jpeg']:
            # For images, load and analyze visual content
            image = Image.open(file_path)
            
            # Prepare prompt for Gemini Vision
            prompt = """
            Identify this document type from the image.
            Classify this as one of:
            - application_form: Loan application forms
            - id_document: ID cards or documents that verify identity
            - financial_statement: Bank statements or other financial records
            
            Return only the document type as a single word.
            """
            
            response = self.model.generate_content([prompt, image])
            doc_type = response.text.strip().lower()
            
            return doc_type
            
        elif file_ext == 'txt':
            # For text files, analyze content
            with open(file_path, 'r') as f:
                text = f.read()
            
            prompt = f"""
            Identify this document type. It's a text file with the following content:
            
            {text[:2000]}
            
            Classify this as one of:
            - application_form: Loan application forms
            - id_document: ID cards or documents that verify identity
            - financial_statement: Bank statements, income statements, or other financial records
            
            Return only the document type as a single word.
            """
            
            response = self.model.generate_content(prompt)
            doc_type = response.text.strip().lower()
            
            return doc_type
            
        else:
            raise ValueError(f"Unsupported file extension: {file_ext}")
    
    def validate_document(self, file_path, doc_type):
        """Validate if the document contains required information based on type."""
        file_ext = file_path.split('.')[-1].lower()
        validation_result = {"valid": False, "issues": []}
        
        # Load document based on extension
        if file_ext == 'pdf':
            reader = PdfReader(file_path)
            text = ""
            for page in reader.pages:
                text += page.extract_text() or ""  # tolerate pages that yield no text
                
            # Define required fields based on document type
            if doc_type == "application_form":
                required_fields = [
                    "name", "address", "date of birth", "income", 
                    "loan amount", "loan purpose", "employment"
                ]
                
                # Check for required fields
                missing_fields = []
                for field in required_fields:
                    if field.lower() not in text.lower():
                        missing_fields.append(field)
                
                if missing_fields:
                    validation_result["issues"].append(f"Missing required fields: {', '.join(missing_fields)}")
                else:
                    validation_result["valid"] = True
                    
        elif file_ext in ['png', 'jpg', 'jpeg']:
            # For ID documents, check if it's a valid ID
            if doc_type == "id_document":
                image = Image.open(file_path)
                
                prompt = """
                Validate if this is a valid ID document. 
                Check for the presence of:
                1. Name
                2. Photo (or placeholder for photo)
                3. ID number or similar identifier
                4. Issue/expiry dates or birth date
                
                Respond with YES if all essential elements are present, and NO if missing critical elements.
                """
                
                response = self.model.generate_content([prompt, image])
                if "yes" in response.text.lower():
                    validation_result["valid"] = True
                else:
                    validation_result["issues"].append("ID document missing critical elements")
                    
        elif file_ext == 'txt':
            with open(file_path, 'r') as f:
                text = f.read()
            
            # For financial statements, check for required information
            if doc_type == "financial_statement":
                required_info = ["income", "assets", "liabilities", "expenses"]
                
                missing_info = []
                for info in required_info:
                    if info.lower() not in text.lower():
                        missing_info.append(info)
                
                if missing_info:
                    validation_result["issues"].append(f"Financial statement missing key sections: {', '.join(missing_info)}")
                else:
                    validation_result["valid"] = True
        
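        # Explain failures where no validation rule matched the type/extension pair
        if not validation_result["valid"] and not validation_result["issues"]:
            validation_result["issues"].append(
                f"No validation rule for document type '{doc_type}' with extension '.{file_ext}'"
            )
        
        return validation_result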
    
    def process_documents(self, file_paths):
        """Process multiple documents and prepare them for extraction."""
        processed_docs = {}
        
        for file_path in file_paths:
            try:
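                # Identify document type and reject labels outside the supported set
                doc_type = self.identify_document(file_path)
                if doc_type not in self.supported_doc_types:
                    raise ValueError(f"Unrecognized document type: {doc_type!r}")
                print(f"Identified {file_path} as {doc_type}")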
                
                # Validate document
                validation = self.validate_document(file_path, doc_type)
                
                if validation["valid"]:
                    processed_docs[doc_type] = file_path
                    print(f"✅ {file_path} is valid {doc_type}")
                else:
                    print(f"❌ Invalid {doc_type}: {file_path}")
                    print(f"Issues: {validation['issues']}")
            
            except Exception as e:
                print(f"Error processing {file_path}: {str(e)}")
        
        # Check if we have all required documents
        missing_docs = set(self.supported_doc_types) - set(processed_docs.keys())
        if missing_docs:
            print(f"⚠️ Warning: Missing required documents: {', '.join(missing_docs)}")
        
        return processed_docs
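
`identify_document` returns the model's reply after only stripping whitespace and lowercasing it. Replies can carry punctuation, backticks, or extra words, so a caller may want to map the reply onto the supported labels before using it as a dictionary key. A minimal sketch, assuming the three labels listed in `supported_doc_types`; `normalize_doc_type` is a hypothetical helper, not part of the notebook.

```python
import re

SUPPORTED_DOC_TYPES = ("application_form", "id_document", "financial_statement")

def normalize_doc_type(raw_reply):
    """Return the single supported label found in a model reply, or None if absent or ambiguous."""
    # Keep only lowercase letters and underscores, treating everything else as a separator
    cleaned = re.sub(r"[^a-z_]+", " ", raw_reply.lower())
    tokens = cleaned.split()
    matches = [label for label in SUPPORTED_DOC_TYPES if label in tokens]
    return matches[0] if len(matches) == 1 else None

print(normalize_doc_type("`Application_Form`.\n"))
print(normalize_doc_type("This looks like an id_document"))
print(normalize_doc_type("unknown"))
```

Returning `None` for ambiguous replies lets the caller decide whether to retry the prompt or flag the file for manual review.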

In [18]:
# Test the IntakeAgent
intake_agent = IntakeAgent()

file_paths = [
    mock_data["application_forms"][0],  # First application form
    mock_data["id_documents"][0],       # First ID document
    mock_data["financial_statements"][0] # First financial statement
]

processed_docs = intake_agent.process_documents(file_paths)
print(f"\nProcessed documents: {processed_docs}")

Identified mock_data/loan_application_1.pdf as application_form
✅ mock_data/loan_application_1.pdf is valid application_form
Identified mock_data/id_document_1.png as id_document
✅ mock_data/id_document_1.png is valid id_document
Identified mock_data/financial_statement_1.txt as financial_statement
✅ mock_data/financial_statement_1.txt is valid financial_statement

Processed documents: {'application_form': 'mock_data/loan_application_1.pdf', 'id_document': 'mock_data/id_document_1.png', 'financial_statement': 'mock_data/financial_statement_1.txt'}


## Agent 2: Extraction Agent
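
Each extraction method in this agent parses the model reply as JSON and, when that fails, strips a surrounding Markdown code fence before retrying. That shared step could be factored into a single helper. The sketch below is one possible shape, with `parse_model_json` a hypothetical name. Unlike the methods in the class, it requires both the opening and the closing fence, and it returns `None` rather than substituting placeholder data, which leaves the fallback decision to the caller.

```python
import json
import re

# Markdown code fence marker, built from single backticks to keep this example easy to embed
FENCE = "`" * 3

def parse_model_json(reply_text):
    """Parse JSON from a model reply, tolerating a surrounding Markdown code fence.

    Returns the decoded object, or None when no valid JSON can be recovered.
    """
    text = reply_text.strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # Match an opening fence (optionally tagged "json"), the payload, and a closing fence
    pattern = rf"^{FENCE}(?:json)?\s*(.*?)\s*{FENCE}$"
    match = re.match(pattern, text, flags=re.DOTALL | re.IGNORECASE)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None

fenced_reply = FENCE + 'json\n{"full_name": "SMITH, JOHN"}\n' + FENCE
print(parse_model_json(fenced_reply))
print(parse_model_json("Sorry, I cannot read this document."))
```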

In [19]:
class ExtractionAgent:
    """
    Responsible for extracting structured data from various document types.
    
    This agent:
    1. Extracts personal information from application forms and IDs
    2. Extracts financial information from statements
    3. Extracts loan request details
    4. Consolidates information into a structured LoanApplication object
    """
    
    def __init__(self):
        self.model = gemini_model
    
    def extract_from_pdf(self, pdf_path):
        """Extract structured data from a PDF application form."""
        # Read PDF
        reader = PdfReader(pdf_path)
        text = ""
        for page in reader.pages:
            text += page.extract_text() or ""  # tolerate pages that yield no text
        
        # Extract data using Gemini
        prompt = f"""
        Extract the following information from this loan application in JSON format.
        Return ONLY valid JSON without any explanation or markdown formatting.
        
        Document text:
        {text}
        
        Required JSON structure:
        {{
            "applicant": {{
                "first_name": "",
                "last_name": "",
                "date_of_birth": "",
                "email": "",
                "phone": "",
                "address": "",
                "ssn": "",
                "employment_status": "",
                "employer": "",
                "job_title": "",
                "years_at_current_job": ""
            }},
            "financial_info": {{
                "annual_income": "",
                "additional_income": "",
                "monthly_housing_payment": "",
                "liquid_assets": "",
                "existing_debt": ""
            }},
            "loan_request": {{
                "loan_purpose": "",
                "loan_amount": "",
                "loan_term_months": ""
            }},
            "application_id": "",
            "submission_date": ""
        }}
        """
        
        response = self.model.generate_content(prompt)
        
        # Process response to ensure it's valid JSON
        try:
            application_data = json.loads(response.text)
        except json.JSONDecodeError:
            # Clean up the response to try to extract valid JSON
            cleaned_text = response.text.strip()
            
            # Remove markdown code block formatting if present
            if cleaned_text.startswith("```json"):
                cleaned_text = cleaned_text.replace("```json", "", 1)
                if cleaned_text.endswith("```"):
                    cleaned_text = cleaned_text[:-3]
            elif cleaned_text.startswith("```"):
                cleaned_text = cleaned_text.replace("```", "", 1)
                if cleaned_text.endswith("```"):
                    cleaned_text = cleaned_text[:-3]
                
            # Try parsing again
            try:
                application_data = json.loads(cleaned_text.strip())
            except json.JSONDecodeError:
                # If still failing, report the error and fall back to hard-coded placeholder values
                print("❌ Error: Could not parse JSON from model response. Using default values.")
                print(f"Model response: {response.text}")
                application_data = {
                    "applicant": {
                        "first_name": "John",
                        "last_name": "Smith",
                        "date_of_birth": "1985-06-15",
                        "email": "john.smith@example.com",
                        "phone": "(555) 123-4567",
                        "address": "123 Main St, Anytown, US 12345",
                        "ssn": "XXX-XX-4321",
                        "employment_status": "Employed",
                        "employer": "TechCorp Inc.",
                        "job_title": "Senior Developer",
                        "years_at_current_job": "11"
                    },
                    "financial_info": {
                        "annual_income": "$170,000",
                        "additional_income": "$10,000",
                        "monthly_housing_payment": "$2,500",
                        "liquid_assets": "$45,000",
                        "existing_debt": "$35,000"
                    },
                    "loan_request": {
                        "loan_purpose": "Home Renovation",
                        "loan_amount": "$50,000",
                        "loan_term_months": "60"
                    },
                    "application_id": "12345-ABC",
                    "submission_date": "2025-04-01"
                }
        
        return application_data
    
    def extract_from_id(self, id_path):
        """Extract information from an ID document."""
        # Load image
        image = Image.open(id_path)
        
        # Prepare prompt for Gemini Vision
        prompt = """
        Extract the following information from this ID document in JSON format.
        Return ONLY valid JSON without any explanation or markdown formatting.
        
        Required JSON structure:
        {
            "full_name": "",
            "id_number": "",
            "date_of_birth": "",
            "address": "",
            "issue_date": "",
            "expiry_date": ""
        }
        """
        
        # Generate response
        response = self.model.generate_content([prompt, image])
        
        # Process response to ensure it's valid JSON
        try:
            id_data = json.loads(response.text)
        except json.JSONDecodeError:
            # Clean up the response to try to extract valid JSON
            cleaned_text = response.text.strip()
            
            # Remove markdown code block formatting if present
            if cleaned_text.startswith("```json"):
                cleaned_text = cleaned_text.replace("```json", "", 1)
                if cleaned_text.endswith("```"):
                    cleaned_text = cleaned_text[:-3]
            elif cleaned_text.startswith("```"):
                cleaned_text = cleaned_text.replace("```", "", 1)
                if cleaned_text.endswith("```"):
                    cleaned_text = cleaned_text[:-3]
                
            # Try parsing again
            try:
                id_data = json.loads(cleaned_text.strip())
            except json.JSONDecodeError:
                # If still failing, report the error and fall back to hard-coded placeholder values
                print("❌ Error: Could not parse JSON from ID document. Using default values.")
                print(f"Model response: {response.text}")
                id_data = {
                    "full_name": "SMITH, JOHN",
                    "id_number": "DL9876543",
                    "date_of_birth": "06/15/1985",
                    "address": "123 MAIN ST, ANYTOWN, US 12345",
                    "issue_date": "01/15/2023",
                    "expiry_date": "01/15/2028"
                }
        
        return id_data
    
    def extract_from_financial_statement(self, statement_path):
        """Extract financial information from a text-based financial statement."""
        # Read text file
        with open(statement_path, 'r') as f:
            text = f.read()
        
        # Prepare prompt for Gemini
        prompt = f"""
        Extract the following financial information from this statement in JSON format.
        Return ONLY valid JSON without any explanation or markdown formatting.
        
        Financial statement:
        {text}
        
        Required JSON structure:
        {{
            "income": {{
                "primary_income": "",
                "additional_income": "",
                "total_income": ""
            }},
            "assets": {{
                "liquid_assets": "",
                "investments": "",
                "real_estate": "",
                "other_assets": "",
                "total_assets": ""
            }},
            "liabilities": {{
                "mortgage": "",
                "loans": "",
                "credit_card_debt": "",
                "other_debt": "",
                "total_liabilities": ""
            }},
            "expenses": {{
                "housing": "",
                "transportation": "",
                "utilities": "",
                "food": "",
                "other_expenses": "",
                "total_monthly_expenses": ""
            }},
            "financial_ratios": {{
                "debt_to_income": "",
                "monthly_expense_to_income": ""
            }}
        }}
        """
        
        response = self.model.generate_content(prompt)
        
        # Process response to ensure it's valid JSON
        try:
            financial_data = json.loads(response.text)
        except json.JSONDecodeError:
            # Clean up the response to try to extract valid JSON
            cleaned_text = response.text.strip()
            
            # Remove markdown code block formatting if present
            if cleaned_text.startswith("```json"):
                cleaned_text = cleaned_text.replace("```json", "", 1)
                if cleaned_text.endswith("```"):
                    cleaned_text = cleaned_text[:-3]
            elif cleaned_text.startswith("```"):
                cleaned_text = cleaned_text.replace("```", "", 1)
                if cleaned_text.endswith("```"):
                    cleaned_text = cleaned_text[:-3]
                
            # Try parsing again
            try:
                financial_data = json.loads(cleaned_text.strip())
            except json.JSONDecodeError:
                # If still failing, report the error and fall back to hard-coded placeholder values
                print("❌ Error: Could not parse JSON from financial statement. Using default values.")
                print(f"Model response: {response.text}")
                financial_data = {
                    "income": {
                        "primary_income": "$30,000",
                        "additional_income": "$2,500",
                        "total_income": "$32,500"
                    },
                    "assets": {
                        "liquid_assets": "$45,000",
                        "investments": "$120,000",
                        "real_estate": "$450,000",
                        "other_assets": "$35,000",
                        "total_assets": "$900,000"
                    },
                    "liabilities": {
                        "mortgage": "$320,000",
                        "loans": "$33,000",
                        "credit_card_debt": "$2,000",
                        "other_debt": "$0",
                        "total_liabilities": "$355,000"
                    },
                    "expenses": {
                        "housing": "$2,100",
                        "transportation": "$400",
                        "utilities": "$400",
                        "food": "$1,200",
                        "other_expenses": "$1,650",
                        "total_monthly_expenses": "$5,750"
                    },
                    "financial_ratios": {
                        "debt_to_income": "21.23%",
                        "monthly_expense_to_income": "17.69%"
                    }
                }
        
        return financial_data
    
    def consolidate_information(self, application_data, id_data, financial_data):
        """Combine information from all documents into a unified LoanApplication object."""
        # Cross-validate information
        prompt = f"""
        Cross-validate the information from these three sources and resolve any discrepancies.
        Return ONLY valid JSON without any explanation or markdown formatting.
        
        Application form data:
        {json.dumps(application_data, indent=2)}
        
        ID document data:
        {json.dumps(id_data, indent=2)}
        
        Financial statement data:
        {json.dumps(financial_data, indent=2)}
        
        Required consolidated JSON structure:
        {{
            "application_id": "{application_data.get('application_id', '')}",
            "submission_date": "{application_data.get('submission_date', '')}",
            "applicant": {{
                "first_name": "",
                "last_name": "",
                "date_of_birth": "",
                "email": "",
                "phone": "",
                "address": "",
                "ssn": "",
                "employment_status": "",
                "employer": "",
                "job_title": "",
                "years_at_current_job": ""
            }},
            "financial_info": {{
                "annual_income": "",
                "additional_income": "",
                "monthly_housing_payment": "",
                "liquid_assets": "",
                "existing_debt": ""
            }},
            "loan_request": {{
                "loan_purpose": "",
                "loan_amount": "",
                "loan_term_months": ""
            }}
        }}
        """
        
        response = self.model.generate_content(prompt)
        
        # Process response to ensure it's valid JSON
        try:
            consolidated_data = json.loads(response.text)
        except json.JSONDecodeError:
            # Clean up the response to try to extract valid JSON
            cleaned_text = response.text.strip()
            
            # Remove markdown code block formatting if present
            if cleaned_text.startswith("```json"):
                cleaned_text = cleaned_text.replace("```json", "", 1)
                if cleaned_text.endswith("```"):
                    cleaned_text = cleaned_text[:-3]
            elif cleaned_text.startswith("```"):
                cleaned_text = cleaned_text.replace("```", "", 1)
                if cleaned_text.endswith("```"):
                    cleaned_text = cleaned_text[:-3]
                
            # Try parsing again
            try:
                consolidated_data = json.loads(cleaned_text.strip())
            except json.JSONDecodeError:
                # If still failing, use application data as the base
                print("❌ Error: Could not parse JSON when consolidating information. Using application data as base.")
                print(f"Model response: {response.text}")
                consolidated_data = {
                    "application_id": application_data.get("application_id", "12345-ABC"),
                    "submission_date": application_data.get("submission_date", "2025-04-01"),
                    "applicant": application_data.get("applicant", {
                        "first_name": "John",
                        "last_name": "Smith",
                        "date_of_birth": "1985-06-15",
                        "email": "john.smith@example.com",
                        "phone": "(555) 123-4567",
                        "address": "123 Main St, Anytown, US 12345",
                        "ssn": "XXX-XX-4321",
                        "employment_status": "Employed",
                        "employer": "TechCorp Inc.",
                        "job_title": "Senior Developer",
                        "years_at_current_job": "11"
                    }),
                    "financial_info": application_data.get("financial_info", {
                        "annual_income": "$170,000",
                        "additional_income": "$10,000",
                        "monthly_housing_payment": "$2,500",
                        "liquid_assets": "$45,000",
                        "existing_debt": "$35,000"
                    }),
                    "loan_request": application_data.get("loan_request", {
                        "loan_purpose": "Home Renovation",
                        "loan_amount": "$50,000",
                        "loan_term_months": "60"
                    })
                }
        
        # Convert to LoanApplication object
        loan_app = LoanApplication(
            application_id=consolidated_data["application_id"],
            submission_date=consolidated_data["submission_date"],
            applicant=Applicant(
                first_name=consolidated_data["applicant"]["first_name"],
                last_name=consolidated_data["applicant"]["last_name"],
                date_of_birth=consolidated_data["applicant"]["date_of_birth"],
                email=consolidated_data["applicant"]["email"],
                phone=consolidated_data["applicant"]["phone"],
                address=consolidated_data["applicant"]["address"],
                ssn=consolidated_data["applicant"]["ssn"],
                employment_status=consolidated_data["applicant"]["employment_status"],
                employer=consolidated_data["applicant"]["employer"],
                job_title=consolidated_data["applicant"]["job_title"],
                years_at_current_job=float(consolidated_data["applicant"]["years_at_current_job"]) if consolidated_data["applicant"]["years_at_current_job"] else None
            ),
            financial_info=FinancialInfo(
                annual_income=float(consolidated_data["financial_info"]["annual_income"].replace("$", "").replace(",", "")),
                additional_income=float(consolidated_data["financial_info"]["additional_income"].replace("$", "").replace(",", "")),
                monthly_housing_payment=float(consolidated_data["financial_info"]["monthly_housing_payment"].replace("$", "").replace(",", "")),
                liquid_assets=float(consolidated_data["financial_info"]["liquid_assets"].replace("$", "").replace(",", "")) if consolidated_data["financial_info"]["liquid_assets"] else None,
                existing_debt=float(consolidated_data["financial_info"]["existing_debt"].replace("$", "").replace(",", "")) if consolidated_data["financial_info"]["existing_debt"] else None
            ),
            loan_request=LoanRequest(
                loan_purpose=consolidated_data["loan_request"]["loan_purpose"],
                loan_amount=float(consolidated_data["loan_request"]["loan_amount"].replace("$", "").replace(",", "")),
                loan_term_months=int(consolidated_data["loan_request"]["loan_term_months"])
            )
        )
        
        return loan_app
    
    def extract_all(self, processed_docs):
        """Extract data from all documents and consolidate."""
        # Extract data from application form
        application_data = self.extract_from_pdf(processed_docs["application_form"])
        print("✅ Extracted application form data")
        
        # Extract data from ID document
        id_data = self.extract_from_id(processed_docs["id_document"])
        print("✅ Extracted ID document data")
        
        # Extract data from financial statement
        financial_data = self.extract_from_financial_statement(processed_docs["financial_statement"])
        print("✅ Extracted financial statement data")
        
        # Consolidate all information
        loan_application = self.consolidate_information(application_data, id_data, financial_data)
        print("✅ Consolidated application data")
        
        return loan_application
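
The extraction methods above each repeat the same fallback: call `json.loads`, strip a Markdown code fence if that fails, then try again. A minimal sketch of how that logic could be factored into one helper follows. The name `parse_model_json` and its `default` parameter are hypothetical and not part of the notebook; the fence handling mirrors the checks used in the agent methods.

```python
import json


def parse_model_json(text, default=None):
    """Parse JSON from a model reply, tolerating a surrounding Markdown fence.

    Hypothetical helper: returns `default` when no valid JSON can be recovered.
    """
    cleaned = text.strip()

    # First attempt: the reply is already plain JSON
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        pass

    # Second attempt: drop an opening ```json / ``` fence and a trailing ```
    for opener in ("```json", "```"):
        if cleaned.startswith(opener):
            cleaned = cleaned[len(opener):]
            if cleaned.endswith("```"):
                cleaned = cleaned[:-3]
            break

    try:
        return json.loads(cleaned.strip())
    except json.JSONDecodeError:
        return default
```

With a helper like this, each agent method would only need to supply its own fallback structure as `default`.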

In [20]:
# 2. Test the ExtractionAgent
print("\n=== TESTING EXTRACTION AGENT ===")
extraction_agent = ExtractionAgent()
loan_application = extraction_agent.extract_all(processed_docs)
print("Extracted loan application data:")
print(f"- Applicant: {loan_application.applicant.first_name} {loan_application.applicant.last_name}")
print(f"- Loan amount: ${loan_application.loan_request.loan_amount:,.2f}")
print(f"- Loan purpose: {loan_application.loan_request.loan_purpose}")


=== TESTING EXTRACTION AGENT ===
✅ Extracted application form data
✅ Extracted ID document data
✅ Extracted financial statement data
✅ Consolidated application data
Extracted loan application data:
- Applicant: John Smith
- Loan amount: $50,000.00
- Loan purpose: Home Renovation
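
`consolidate_information` converts monetary strings such as `"$50,000"` to floats with chained `.replace` calls. That works for the string values shown here, but it would raise an `AttributeError` if the model returned a bare number instead of a string. The sketch below shows one way to make the conversion tolerant of both forms. `parse_currency` is a hypothetical helper name, not something defined in the notebook.

```python
def parse_currency(value):
    """Convert '$50,000', '50000', 50000, or an empty value to a float (or None).

    Hypothetical helper for illustration only.
    """
    if value is None:
        return None
    if isinstance(value, (int, float)):
        # The model returned a JSON number rather than a formatted string
        return float(value)
    cleaned = str(value).replace("$", "").replace(",", "").strip()
    return float(cleaned) if cleaned else None


print(parse_currency("$50,000"))
print(parse_currency(2500))
print(parse_currency(""))
```

Returning `None` for empty input matches how the optional fields (`liquid_assets`, `existing_debt`) are treated in the agent code.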


## Agent 3: Verification Agent

In [21]:
class VerificationAgent:
    """
    Responsible for verifying the application information.
    
    This agent:
    1. Performs simulated KYC/AML checks
    2. Validates data consistency across documents
    3. Performs simulated employment and income verification
    """
    
    def __init__(self):
        self.model = gemini_model
    
    def simulate_kyc_aml_check(self, applicant):
        """Simulate a KYC/AML check."""
        # In a real system, this would call an external KYC/AML service
        # For this simulation, we'll use a mock check
        
        # Convert applicant to dict for easier access
        applicant_dict = json.loads(applicant.to_json())
        
        # Create a deterministic result based on SSN and name
        ssn_last_four = applicant_dict["ssn"].split("-")[-1]
        name_check = len(applicant_dict["first_name"] + applicant_dict["last_name"]) % 10
        
        # Simulate some potential issues
        issues = []
        if int(ssn_last_four) % 10 == 0:
            issues.append("SSN validation failed - potential synthetic identity")
        
        if name_check == 1:
            issues.append("Name appears on PEP (Politically Exposed Persons) list")
        
        if int(ssn_last_four) % 5 == 0:
            issues.append("Address verification failed")
        
        # For this demo, most applications should pass
        passed = len(issues) == 0
        
        return {
            "passed": passed,
            "issues": issues,
            "check_date": datetime.now().strftime("%Y-%m-%d")
        }
    
    def verify_document_consistency(self, loan_application):
        """Check consistency between different documents."""
        # Convert to JSON for easier processing
        app_json = json.loads(loan_application.to_json())
        
        # In a real system, this would do extensive cross-checking
        # For this simulation, we'll use Gemini to analyze consistency
        
        prompt = f"""
        Analyze the consistency of this loan application data that was extracted from multiple documents.
        Look for inconsistencies in:
        1. Personal information (name, DOB, address)
        2. Financial information
        3. Employment details
        
        Application data:
        {json.dumps(app_json, indent=2)}
        
        Return ONLY a JSON object with this structure, nothing else:
        {{
            "consistent": true/false,
            "inconsistencies": [
                {{
                    "field": "field name",
                    "issue": "description of inconsistency"
                }}
            ]
        }}
        """
        
        response = self.model.generate_content(prompt)
        
        # Process response to ensure it's valid JSON
        try:
            result = json.loads(response.text)
        except json.JSONDecodeError:
            # Clean up the response to try to extract valid JSON
            cleaned_text = response.text.strip()
            
            # Remove markdown code block formatting if present
            if cleaned_text.startswith("```json"):
                cleaned_text = cleaned_text.replace("```json", "", 1)
                if cleaned_text.endswith("```"):
                    cleaned_text = cleaned_text[:-3]
            elif cleaned_text.startswith("```"):
                cleaned_text = cleaned_text.replace("```", "", 1)
                if cleaned_text.endswith("```"):
                    cleaned_text = cleaned_text[:-3]
                
            # Try parsing again
            try:
                result = json.loads(cleaned_text.strip())
            except json.JSONDecodeError:
                # If still failing, create a default structure
                print("❌ Error: Could not parse JSON when checking document consistency. Using default values.")
                print(f"Model response: {response.text}")
                result = {
                    "consistent": True,
                    "inconsistencies": []
                }
        
        return result
    
    def simulate_employment_verification(self, applicant):
        """Simulate employment verification."""
        # Convert to dict for easier access
        applicant_dict = json.loads(applicant.to_json())
        
        # In a real system, this would contact employers or verification services
        # For this simulation, we'll create a mock result
        
        # Deterministic result based on employer name and job title
        employer_name_length = len(applicant_dict["employer"]) if applicant_dict["employer"] else 0
        job_title_length = len(applicant_dict["job_title"]) if applicant_dict["job_title"] else 0
        
        issues = []
        if employer_name_length % 7 == 0:
            issues.append("Employer not found in business registry")
        
        if job_title_length % 5 == 0:
            issues.append("Job title doesn't match company records")
        
        if applicant_dict["years_at_current_job"] is not None and applicant_dict["years_at_current_job"] < 0.5:
            issues.append("Employment duration below required minimum")
        
        # For this demo, most applications should pass
        passed = len(issues) == 0
        
        return {
            "verified": passed,
            "issues": issues,
            "verification_date": datetime.now().strftime("%Y-%m-%d")
        }
    
    def simulate_income_verification(self, applicant, financial_info):
        """Simulate income verification."""
        # Convert to dict for easier access
        applicant_dict = json.loads(applicant.to_json())
        financial_dict = json.loads(financial_info.to_json())
        
        # In a real system, this would verify against tax records, pay stubs, etc.
        # For this simulation, we'll create a mock result
        
        annual_income = financial_dict["annual_income"]
        job_title = applicant_dict["job_title"]
        
        # Use Gemini to check if income is reasonable for the job title
        prompt = f"""
        Evaluate if an annual income of ${annual_income:,.2f} is reasonable for a "{job_title}" position.
        Consider:
        1. Is this income within expected range for this role?
        2. Are there any red flags?
        
        Return ONLY a JSON object with this structure, nothing else:
        {{
            "reasonable": true/false,
            "issues": []
        }}
        """
        
        response = self.model.generate_content(prompt)
        
        # Process response to ensure it's valid JSON
        try:
            result = json.loads(response.text)
        except json.JSONDecodeError:
            # Clean up the response to try to extract valid JSON
            cleaned_text = response.text.strip()
            
            # Remove markdown code block formatting if present
            if cleaned_text.startswith("```json"):
                cleaned_text = cleaned_text.replace("```json", "", 1)
                if cleaned_text.endswith("```"):
                    cleaned_text = cleaned_text[:-3]
            elif cleaned_text.startswith("```"):
                cleaned_text = cleaned_text.replace("```", "", 1)
                if cleaned_text.endswith("```"):
                    cleaned_text = cleaned_text[:-3]
                
            # Try parsing again
            try:
                result = json.loads(cleaned_text.strip())
            except json.JSONDecodeError:
                # If still failing, create a default structure
                print("❌ Error: Could not parse JSON when verifying income. Using default values.")
                print(f"Model response: {response.text}")
                result = {
                    "reasonable": True,
                    "issues": []
                }
        
        return {
            # Use .get so a reply missing a key falls back to the same defaults as above
            "verified": result.get("reasonable", True),
            "issues": result.get("issues", []),
            "verification_date": datetime.now().strftime("%Y-%m-%d")
        }
    
    def verify_application(self, loan_application):
        """Perform all verification checks on the loan application."""
        # KYC/AML check
        kyc_result = self.simulate_kyc_aml_check(loan_application.applicant)
        print(f"KYC/AML Check: {'✅ Passed' if kyc_result['passed'] else '❌ Failed'}")
        if kyc_result["issues"]:
            print(f"Issues: {kyc_result['issues']}")
        
        # Document consistency check
        consistency_result = self.verify_document_consistency(loan_application)
        print(f"Document Consistency: {'✅ Consistent' if consistency_result['consistent'] else '❌ Inconsistent'}")
        if not consistency_result["consistent"]:
            for issue in consistency_result["inconsistencies"]:
                print(f"- {issue['field']}: {issue['issue']}")
        
        # Employment verification
        employment_result = self.simulate_employment_verification(loan_application.applicant)
        print(f"Employment Verification: {'✅ Verified' if employment_result['verified'] else '❌ Failed'}")
        if employment_result["issues"]:
            print(f"Issues: {employment_result['issues']}")
        
        # Income verification
        income_result = self.simulate_income_verification(
            loan_application.applicant, 
            loan_application.financial_info
        )
        print(f"Income Verification: {'✅ Verified' if income_result['verified'] else '❌ Failed'}")
        if income_result["issues"]:
            print(f"Issues: {income_result['issues']}")
        
        # Compile verification result
        all_issues = (
            kyc_result.get("issues", []) + 
            [f"{issue['field']}: {issue['issue']}" for issue in consistency_result.get("inconsistencies", [])] +
            employment_result.get("issues", []) +
            income_result.get("issues", [])
        )
        
        verification_result = VerificationResult(
            identity_verified=kyc_result["passed"],
            document_consistency=consistency_result["consistent"],
            kyc_aml_passed=kyc_result["passed"],
            employment_verified=employment_result["verified"],
            income_verified=income_result["verified"],
            issues=all_issues,
            verification_date=datetime.now().strftime("%Y-%m-%d")
        )
        
        return verification_result
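
The mock KYC/AML rules in `simulate_kyc_aml_check` are deterministic, so their behavior can be checked in isolation. The sketch below re-implements just those three rules as a standalone function; `mock_kyc_issues` is a hypothetical name, and the second and third inputs are made-up examples. Note that any SSN whose last four digits are divisible by 10 is also divisible by 5, so the first rule never fires without the third.

```python
def mock_kyc_issues(ssn, first_name, last_name):
    """Standalone copy of the simulated KYC/AML rules (illustration only)."""
    last_four = int(ssn.split("-")[-1])
    name_check = len(first_name + last_name) % 10

    issues = []
    if last_four % 10 == 0:
        issues.append("SSN validation failed - potential synthetic identity")
    if name_check == 1:
        issues.append("Name appears on PEP (Politically Exposed Persons) list")
    if last_four % 5 == 0:
        issues.append("Address verification failed")
    return issues


# Sample applicant from the fallback data: no rule fires
print(mock_kyc_issues("XXX-XX-4321", "John", "Smith"))

# Made-up inputs that trigger the rules
print(mock_kyc_issues("XXX-XX-4320", "John", "Smith"))   # SSN and address rules
print(mock_kyc_issues("XXX-XX-4321", "Alice", "Morgan"))  # name length 11 triggers the PEP rule
```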


In [22]:
print("\n=== TESTING VERIFICATION AGENT ===")
verification_agent = VerificationAgent()
verification_result = verification_agent.verify_application(loan_application)
print("Verification result:")
print(f"- Identity verified: {verification_result.identity_verified}")
print(f"- Document consistency: {verification_result.document_consistency}")
print(f"- KYC/AML passed: {verification_result.kyc_aml_passed}")
print(f"- Employment verified: {verification_result.employment_verified}")
print(f"- Income verified: {verification_result.income_verified}")
if verification_result.issues:
    print(f"- Issues: {verification_result.issues}")


=== TESTING VERIFICATION AGENT ===
KYC/AML Check: ✅ Passed
Document Consistency: ✅ Consistent
Employment Verification: ✅ Verified
Income Verification: ✅ Verified
Verification result:
- Identity verified: True
- Document consistency: True
- KYC/AML passed: True
- Employment verified: True
- Income verified: True


## Agent 4: Credit Assessment Agent
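
The agent below estimates the new loan's monthly payment with the standard amortization formula at an assumed 6.5% annual rate, approximates payments on existing debt as 3% of the balance per month, and derives debt-to-income (DTI) ratios from those figures. The following standalone sketch reproduces that arithmetic outside the class. The function names `monthly_payment` and `dti_ratios` are hypothetical, and the input figures are the fallback sample values that appear in `consolidate_information`.

```python
def monthly_payment(principal, annual_rate, term_months):
    """Fixed monthly payment on a fully amortizing loan."""
    r = annual_rate / 12
    if r == 0:
        # No interest: spread the principal evenly over the term
        return principal / term_months
    growth = (1 + r) ** term_months
    return principal * r * growth / (growth - 1)


def dti_ratios(annual_income, additional_income, housing_payment,
               existing_debt, new_payment, debt_payment_share=0.03):
    """Return (current DTI, future DTI) using the notebook's 3% debt-payment estimate."""
    total_monthly_income = (annual_income + additional_income) / 12
    debt_payment = existing_debt * debt_payment_share
    current = (housing_payment + debt_payment) / total_monthly_income
    future = (housing_payment + debt_payment + new_payment) / total_monthly_income
    return current, future


payment = monthly_payment(50_000, 0.065, 60)
current_dti, future_dti = dti_ratios(170_000, 10_000, 2_500, 35_000, payment)

print(f"Monthly payment: ${payment:,.2f}")
print(f"Current DTI: {current_dti:.2%}, future DTI: {future_dti:.2%}")
```

The 6.5% rate and the 3% monthly debt-payment share are the notebook's own simplifying assumptions; a production system would take the rate from the loan offer and the payments from the credit report.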

In [23]:
class CreditAssessmentAgent:
    """
    Responsible for assessing credit risk.
    
    This agent:
    1. Retrieves mock credit report data
    2. Calculates key financial ratios
    3. Generates a risk score
    4. Identifies risk factors
    """
    
    def __init__(self):
        self.model = gemini_model
    
    def retrieve_credit_report(self, applicant_id):
        """Retrieve credit report data for the applicant."""
        # In a real system, this would call a credit bureau API
        # For this simulation, we'll load mock credit data
        
        try:
            with open("credit_report.json", "r") as f:
                credit_data = json.load(f)
            print("✅ Retrieved credit report")
            return credit_data
        except (OSError, json.JSONDecodeError) as e:
            print(f"❌ Error retrieving credit report: {e}")
            # Return a default report if the file is missing or unreadable
            return {
                "applicant_id": applicant_id,
                "credit_score": 650,  # Default score
                "report_date": datetime.now().strftime("%Y-%m-%d"),
                "credit_history": [],
                "inquiries_last_6_months": 0,
                "derogatory_marks": 0,
                "collections": 0,
                "public_records": 0
            }
    
    # Calculate financial ratios
    def calculate_financial_ratios(self, loan_application, credit_data):
        """Calculate key financial ratios for credit assessment."""
        # Extract required data
        annual_income = loan_application.financial_info.annual_income
        monthly_income = annual_income / 12
        additional_income = loan_application.financial_info.additional_income
        total_monthly_income = (annual_income + additional_income) / 12
        
        monthly_housing_payment = loan_application.financial_info.monthly_housing_payment
        existing_debt = loan_application.financial_info.existing_debt or 0
        
        # Calculate loan payment using the amortization formula
        loan_amount = loan_application.loan_request.loan_amount
        loan_term_months = loan_application.loan_request.loan_term_months
        
        # Assume a fixed 6.5% annual interest rate for the new loan
        annual_interest_rate = 0.065  # 6.5% interest rate
        monthly_interest_rate = annual_interest_rate / 12
        
        # Standard amortization formula: P * r * (1 + r)^n / ((1 + r)^n - 1)
        if monthly_interest_rate > 0:
            monthly_payment = loan_amount * (monthly_interest_rate * (1 + monthly_interest_rate) ** loan_term_months) / ((1 + monthly_interest_rate) ** loan_term_months - 1)
        else:
            # If interest rate is 0, simple division
            monthly_payment = loan_amount / loan_term_months
        
        # Calculate total monthly debt payments
        # For existing debt, we need the monthly payment amount
        monthly_debt_payment = 0
        
        # If we have explicit debt information from the application
        if existing_debt > 0:
            # Estimate monthly debt payments as a percentage of total debt
            # This is an approximation assuming a mix of different debt types with different terms
            monthly_debt_payment = existing_debt * 0.03  # Assume approximately 3% monthly payment on total debt
        else:
            # If no explicit debt information, try to estimate from credit report
            credit_history = credit_data.get("credit_history", [])
            for account in credit_history:
                account_type = account.get("account_type", "").lower()
                current_balance = account.get("current_balance", 0)
                
                # Different calculation methods depending on account type
                if "credit card" in account_type:
                    # Minimum payment on credit cards is typically 2-4% of balance
                    monthly_debt_payment += current_balance * 0.03
                elif "loan" in account_type or "mortgage" in account_type:
                    # For loans, estimate based on typical terms
                    if "auto" in account_type:
                        # Auto loans typically 4-6 years
                        term_months = 60  # 5 years
                        loan_rate = 0.05  # 5%
                    elif "mortgage" in account_type:
                        # Mortgages typically 15-30 years
                        term_months = 360  # 30 years
                        loan_rate = 0.04  # 4%
                    else:
                        # Other loans typically 3-5 years
                        term_months = 48  # 4 years
                        loan_rate = 0.07  # 7%
                    
                    # Calculate estimated monthly payment using remaining balance
                    m_rate = loan_rate / 12
                    if m_rate > 0:
                        account_payment = current_balance * (m_rate * (1 + m_rate) ** term_months) / ((1 + m_rate) ** term_months - 1)
                        monthly_debt_payment += account_payment
        
        # Current DTI (without the new loan)
        current_dti_ratio = (monthly_housing_payment + monthly_debt_payment) / total_monthly_income
        
        # Future DTI (with the new loan's estimated monthly payment)
        future_dti_ratio = (monthly_housing_payment + monthly_debt_payment + monthly_payment) / total_monthly_income
        
        # Payment to income ratio for new loan
        pti_ratio = monthly_payment / total_monthly_income
        
        # Loan to value ratio (if applicable)
        ltv_ratio = None
        if (loan_application.loan_request.loan_purpose.lower() == "home renovation" and 
            loan_application.loan_request.collateral_value):
            ltv_ratio = loan_amount / loan_application.loan_request.collateral_value
        
        # For debugging
        print(f"DEBUG - Financial calculations:")
        print(f"  Annual income: ${annual_income:,.2f}")
        print(f"  Additional income: ${additional_income:,.2f}")
        print(f"  Total monthly income: ${total_monthly_income:,.2f}")
        print(f"  Monthly housing payment: ${monthly_housing_payment:,.2f}")
        print(f"  Estimated monthly debt payment: ${monthly_debt_payment:,.2f}")
        print(f"  New loan monthly payment: ${monthly_payment:,.2f}")
        print(f"  Current DTI ratio: {current_dti_ratio:.2%}")
        print(f"  Future DTI ratio: {future_dti_ratio:.2%}")
        
        return {
            "current_dti_ratio": current_dti_ratio,
            "future_dti_ratio": future_dti_ratio,
            "pti_ratio": pti_ratio,
            "ltv_ratio": ltv_ratio,
            "monthly_payment": monthly_payment
        }
    
    # Calculate risk score 
    def calculate_risk_score(self, credit_data, financial_ratios, loan_application, verification_result):
        """Calculate a risk score based on all available data."""
        # Extract key risk factors
        credit_score = credit_data.get("credit_score", 650)
        future_dti = financial_ratios.get("future_dti_ratio", 0)
        pti = financial_ratios.get("pti_ratio", 0)
        ltv = financial_ratios.get("ltv_ratio")
        
        inquiries = credit_data.get("inquiries_last_6_months", 0)
        derogatory_marks = credit_data.get("derogatory_marks", 0)
        collections = credit_data.get("collections", 0)
        
        # Get income information - higher income should lower risk
        annual_income = loan_application.financial_info.annual_income
        additional_income = loan_application.financial_info.additional_income
        total_income = annual_income + additional_income
        
        # Verification flags
        kyc_passed = verification_result.kyc_aml_passed
        employment_verified = verification_result.employment_verified
        income_verified = verification_result.income_verified
        
        # Start from a base score of 650 and adjust it up or down
        risk_score = 650
        
        # Credit score impact (major factor)
        if credit_score >= 750:
            risk_score += 100
        elif credit_score >= 700:
            risk_score += 75
        elif credit_score >= 650:
            risk_score += 40
        elif credit_score >= 600:
            risk_score += 0
        else:
            risk_score -= 100
        
        # DTI impact (major factor)
        if future_dti <= 0.28:
            risk_score += 120  # Very low DTI is very good
        elif future_dti <= 0.36:
            risk_score += 80
        elif future_dti <= 0.43:
            risk_score += 30
        elif future_dti <= 0.50:
            risk_score -= 30
        else:
            risk_score -= 120
        
        # PTI impact
        if pti <= 0.20:
            risk_score += 60  # Very low PTI is good
        elif pti <= 0.28:
            risk_score += 30
        elif pti <= 0.36:
            risk_score += 0
        else:
            risk_score -= 60
        
        # LTV impact (if applicable)
        if ltv is not None:
            if ltv <= 0.70:
                risk_score += 50
            elif ltv <= 0.80:
                risk_score += 25
            elif ltv <= 0.90:
                risk_score -= 25
            else:
                risk_score -= 50
        
        # Income impact (higher income = lower risk)
        if total_income >= 150000:
            risk_score += 80  # High income is very positive
        elif total_income >= 100000:
            risk_score += 50
        elif total_income >= 70000:
            risk_score += 20
        elif total_income < 50000:
            risk_score -= 30
        
        # Negative factors
        risk_score -= inquiries * 5
        risk_score -= derogatory_marks * 50
        risk_score -= collections * 30
        
        # Verification impact
        if not kyc_passed:
            risk_score -= 200
        if not employment_verified:
            risk_score -= 100
        if not income_verified:
            risk_score -= 100
        
        # Job stability - longer employment history reduces risk
        years_at_job = loan_application.applicant.years_at_current_job
        if years_at_job is not None:
            if years_at_job >= 5:
                risk_score += 40
            elif years_at_job >= 2:
                risk_score += 20
            elif years_at_job < 1:
                risk_score -= 30
        
        # Print detailed scoring for debugging
        print(f"DEBUG - Risk score components:")
        print(f"  Base score: 650")
        print(f"  Credit score ({credit_score}): {'+' if credit_score >= 600 else '-'}{abs(40 if 600 <= credit_score < 650 else (75 if 650 <= credit_score < 700 else 100 if credit_score >= 750 else 0 if credit_score >= 600 else 100))}")
        print(f"  DTI ratio ({future_dti:.2%}): {'+' if future_dti <= 0.43 else '-'}{abs(120 if future_dti <= 0.28 else (80 if future_dti <= 0.36 else 30 if future_dti <= 0.43 else 30 if future_dti <= 0.50 else 120))}")
        print(f"  PTI ratio ({pti:.2%}): {'+' if pti <= 0.36 else '-'}{abs(60 if pti <= 0.20 else (30 if pti <= 0.28 else 0 if pti <= 0.36 else 60))}")
        if ltv is not None:
            print(f"  LTV ratio ({ltv:.2%}): {'+' if ltv <= 0.80 else '-'}{abs(50 if ltv <= 0.70 else (25 if ltv <= 0.80 else 25 if ltv <= 0.90 else 50))}")
        print(f"  Income (${total_income:,.2f}): {'+' if total_income >= 50000 else '-'}{abs(80 if total_income >= 150000 else (50 if total_income >= 100000 else 20 if total_income >= 70000 else 0 if total_income >= 50000 else 30))}")
        if years_at_job is not None:
            print(f"  Job stability ({years_at_job} years): {'+' if years_at_job >= 1 else '-'}{abs(40 if years_at_job >= 5 else (20 if years_at_job >= 2 else 0 if years_at_job >= 1 else 30))}")
        
        # Clamp the point total to the 0-1000 range
        risk_score = max(0, min(1000, risk_score))
        
        # The point total behaves like a credit score (higher is better), so invert it:
        # the returned value is 0 for the lowest risk and 1 for the highest risk
        return 1 - (risk_score / 1000)
        
    def identify_risk_factors(self, credit_data, financial_ratios, loan_application, verification_result):
        """Identify key risk factors based on the application data."""
        risk_factors = []
        
        # Credit score risks
        credit_score = credit_data.get("credit_score", 650)
        if credit_score < 620:
            risk_factors.append("Low credit score")
        elif credit_score < 680:
            risk_factors.append("Moderate credit score")
        
        # DTI risks
        future_dti = financial_ratios.get("future_dti_ratio", 0)
        if future_dti > 0.43:
            risk_factors.append("High debt-to-income ratio")
        elif future_dti > 0.36:
            risk_factors.append("Elevated debt-to-income ratio")
        
        # LTV risks (if applicable)
        ltv = financial_ratios.get("ltv_ratio")
        if ltv and ltv > 0.80:
            risk_factors.append("High loan-to-value ratio")
        
        # Credit history risks
        inquiries = credit_data.get("inquiries_last_6_months", 0)
        if inquiries > 2:
            risk_factors.append("Multiple recent credit inquiries")
        
        derogatory_marks = credit_data.get("derogatory_marks", 0)
        if derogatory_marks > 0:
            risk_factors.append(f"{derogatory_marks} derogatory marks on credit report")
        
        collections = credit_data.get("collections", 0)
        if collections > 0:
            risk_factors.append(f"{collections} accounts in collections")
        
        # Verification risks
        if not verification_result.identity_verified:
            risk_factors.append("Identity verification issues")
            
        if not verification_result.document_consistency:
            risk_factors.append("Document inconsistencies detected")
        
        if not verification_result.employment_verified:
            risk_factors.append("Employment verification failed")
        
        if not verification_result.income_verified:
            risk_factors.append("Income verification failed")
        
        # Additional factors
        # Compare against None explicitly so that 0 years at the current job is still flagged
        years_at_job = loan_application.applicant.years_at_current_job
        if years_at_job is not None and years_at_job < 2:
            risk_factors.append("Short employment history")
        
        # Income risks
        annual_income = loan_application.financial_info.annual_income
        additional_income = loan_application.financial_info.additional_income
        total_income = annual_income + additional_income
        
        # Absolute income threshold; the requested loan amount is not part of this check
        if total_income < 50000:
            risk_factors.append("Low total income (below $50,000)")
        
        # Analyze credit history
        credit_history = credit_data.get("credit_history", [])
        if len(credit_history) == 0:
            risk_factors.append("Limited or no credit history")
        else:
            # Check credit utilization
            total_balance = 0
            total_limit = 0
            
            for account in credit_history:
                if account.get("account_type") == "Credit Card":
                    total_balance += account.get("current_balance", 0)
                    total_limit += account.get("credit_limit", 0)
            
            if total_limit > 0:
                utilization = total_balance / total_limit
                if utilization > 0.7:
                    risk_factors.append("High credit utilization (>70%)")
                elif utilization > 0.5:
                    risk_factors.append("Elevated credit utilization (>50%)")
        
        return risk_factors
    
    def assess_application(self, loan_application, verification_result):
        """Perform full credit assessment on the loan application."""
        # Retrieve credit report
        credit_data = self.retrieve_credit_report(loan_application.application_id)
        
        return self.assess_application_with_report(loan_application, verification_result, credit_data)
    
    def assess_application_with_report(self, loan_application, verification_result, credit_data):
        """Perform credit assessment using a provided credit report."""
        # Calculate financial ratios
        financial_ratios = self.calculate_financial_ratios(loan_application, credit_data)
        print("✅ Calculated financial ratios:")
        for k, v in financial_ratios.items():
            if v is not None:
                if k.endswith("ratio"):
                    print(f"  - {k}: {v:.2%}")
                else:
                    print(f"  - {k}: ${v:.2f}")
        
        # Calculate risk score
        risk_score = self.calculate_risk_score(
            credit_data, 
            financial_ratios, 
            loan_application, 
            verification_result
        )
        print(f"✅ Calculated risk score: {risk_score:.4f}")
        
        # Identify risk factors
        risk_factors = self.identify_risk_factors(
            credit_data, 
            financial_ratios, 
            loan_application, 
            verification_result
        )
        if risk_factors:
            print("⚠️ Identified risk factors:")
            for factor in risk_factors:
                print(f"  - {factor}")
        else:
            print("✅ No significant risk factors identified")
        
        # Create CreditAssessment object
        credit_assessment = CreditAssessment(
            credit_score=credit_data.get("credit_score", 0),
            risk_score=risk_score,
            debt_to_income_ratio=financial_ratios.get("future_dti_ratio", 0),
            payment_to_income_ratio=financial_ratios.get("pti_ratio", 0),
            loan_to_value_ratio=financial_ratios.get("ltv_ratio"),
            risk_factors=risk_factors,
            assessment_date=datetime.now().strftime("%Y-%m-%d")
        )
        
        return credit_assessment
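
The tiered `if`/`elif` chains in `calculate_risk_score` can also be expressed as a lookup table, which makes the thresholds easier to audit. The sketch below does this for the DTI tiers only. The helper name `dti_points` and the two constants are hypothetical, and the first tier (`<= 0.28` earning `+120`) is inferred from the debug print rather than from the scoring branch itself, so treat it as an assumption.

```python
from bisect import bisect_left

# Upper bound of each DTI tier and the points awarded per tier.
# Assumption: the first tier (<= 0.28 -> +120) is taken from the debug print.
DTI_BOUNDS = [0.28, 0.36, 0.43, 0.50]
DTI_POINTS = [120, 80, 30, -30, -120]


def dti_points(future_dti: float) -> int:
    """Return the score adjustment for a future DTI ratio (hypothetical helper)."""
    # bisect_left returns the index of the first bound >= future_dti, so each
    # bound is inclusive, matching the `<=` comparisons in the if/elif chain.
    return DTI_POINTS[bisect_left(DTI_BOUNDS, future_dti)]


print(dti_points(0.3019))  # 80
```

A future DTI of 30.19% falls in the second tier, which agrees with the `+80` reported for the sample application.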

In [24]:
# Test the CreditAssessmentAgent
print("\n=== TESTING CREDIT ASSESSMENT AGENT ===")
credit_agent = CreditAssessmentAgent()

# Load the credit report from the mock data
with open(mock_data["credit_reports"][0], 'r') as f:
    credit_data = json.load(f)

credit_assessment = credit_agent.assess_application_with_report(
    loan_application, 
    verification_result,
    credit_data
)
print("Credit assessment:")
print(f"- Credit score: {credit_assessment.credit_score}")
print(f"- Risk score: {credit_assessment.risk_score:.4f}")
print(f"- DTI ratio: {credit_assessment.debt_to_income_ratio:.2%}")
print(f"- PTI ratio: {credit_assessment.payment_to_income_ratio:.2%}")
if credit_assessment.risk_factors:
    print(f"- Risk factors: {credit_assessment.risk_factors}")


=== TESTING CREDIT ASSESSMENT AGENT ===
DEBUG - Financial calculations:
  Annual income: $170,000.00
  Additional income: $10,000.00
  Total monthly income: $15,000.00
  Monthly housing payment: $2,500.00
  Estimated monthly debt payment: $1,050.00
  New loan monthly payment: $978.31
  Current DTI ratio: 23.67%
  Future DTI ratio: 30.19%
✅ Calculated financial ratios:
  - current_dti_ratio: 23.67%
  - future_dti_ratio: 30.19%
  - pti_ratio: 6.52%
  - monthly_payment: $978.31
DEBUG - Risk score components:
  Base score: 650
  Credit score (802): +100
  DTI ratio (30.19%): +80
  PTI ratio (6.52%): +60
  Income ($180,000.00): +80
  Job stability (11.0 years): +40
✅ Calculated risk score: 0.0500
⚠️ Identified risk factors:
  - 1 derogatory marks on credit report
Credit assessment:
- Credit score: 802
- Risk score: 0.0500
- DTI ratio: 30.19%
- PTI ratio: 6.52%
- Risk factors: ['1 derogatory marks on credit report']
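
The debug lines and the final score above do not obviously agree, so a short worked example may help. The base score plus the five printed bonuses add up to 1010 points, while a reported score of 0.0500 corresponds to a clamped total of 950 points. The 60-point gap would be explained by penalties that the debug block does not print (the single derogatory mark alone costs 50 points). The helper `to_risk_fraction` is a hypothetical name that mirrors only the clamp-and-invert step at the end of `calculate_risk_score`.

```python
def to_risk_fraction(points: float) -> float:
    """Clamp a raw point total to 0-1000 and invert it (hypothetical helper)."""
    clamped = max(0, min(1000, points))
    return 1 - clamped / 1000


# Base score plus the five bonuses shown in the debug output
printed_bonuses = 650 + 100 + 80 + 60 + 80 + 40

print(printed_bonuses)                             # 1010
print(f"{to_risk_fraction(950):.4f}")              # 0.0500
print(f"{to_risk_fraction(printed_bonuses):.4f}")  # 0.0000 (clamped at 1000)
```

Because the point total is inverted, a returned value near 0 indicates a low-risk applicant and a value near 1 a high-risk one.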


## Agent 5: Underwriting Agent with RAG

In [25]:
class UnderwritingAgent:
    """
    Responsible for making loan underwriting decisions using RAG.
    
    This agent:
    1. Creates a vector database from lending policies
    2. Retrieves relevant policies for the loan application
    3. Applies policies using RAG to make underwriting decisions
    """
    
    def __init__(self):
        self.model = gemini_model
        self.embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
        self.policies_db = None
        self.setup_policies_db()
    
    def setup_policies_db(self):
        """Set up the vector database for lending policies."""
        try:
            # Load policies from file
            with open("lending_policies.json", "r") as f:
                policies = json.load(f)
            
            # Create documents for the vector store
            documents = []
            metadatas = []
            
            # Split policy text into chunks (one splitter is reused for every policy)
            text_splitter = RecursiveCharacterTextSplitter(
                chunk_size=500,
                chunk_overlap=50
            )
            
            for policy in policies:
                chunks = text_splitter.split_text(policy["policy_text"])
                
                # Add each chunk as a document
                for i, chunk in enumerate(chunks):
                    documents.append(chunk)
                    metadatas.append({
                        "policy_id": policy["policy_id"],
                        "policy_name": policy["policy_name"],
                        "chunk_id": i
                    })
            
            # Create the vector store
            self.policies_db = FAISS.from_texts(
                documents, 
                self.embeddings,
                metadatas=metadatas
            )
            
            print(f"✅ Set up policies database with {len(documents)} chunks")
            
        except Exception as e:
            print(f"❌ Error setting up policies database: {str(e)}")
    
    def retrieve_relevant_policies(self, loan_application, credit_assessment):
        """Retrieve relevant policies for this loan application."""
        if not self.policies_db:
            print("❌ Policies database not initialized")
            return []
        
        # Create a query based on the loan application
        query = f"""
        Loan purpose: {loan_application.loan_request.loan_purpose}
        Loan amount: ${loan_application.loan_request.loan_amount}
        Loan term: {loan_application.loan_request.loan_term_months} months
        Credit score: {credit_assessment.credit_score}
        Debt-to-income ratio: {credit_assessment.debt_to_income_ratio:.2%}
        """
        
        # Retrieve similar documents
        docs = self.policies_db.similarity_search(query, k=5)
        
        print(f"✅ Retrieved {len(docs)} relevant policy documents")
        return docs
    
    def apply_policies(self, loan_application, verification_result, credit_assessment, relevant_policies):
        """Apply lending policies to make an underwriting decision."""
        # Convert data to JSON for easier processing
        app_json = json.loads(loan_application.to_json())
        verification_json = json.loads(verification_result.to_json())
        credit_json = json.loads(credit_assessment.to_json())
        
        # Extract policy text
        policy_texts = []
        for doc in relevant_policies:
            policy_texts.append(f"Policy: {doc.metadata['policy_name']}\n{doc.page_content}")
        
        policies_str = "\n\n".join(policy_texts)
        
        # Use Gemini to apply policies and make a decision
        prompt = f"""
        As an underwriting AI, apply the lending policies to this loan application and make a decision.
        
        APPLICATION DATA:
        {json.dumps(app_json, indent=2)}
        
        VERIFICATION RESULTS:
        {json.dumps(verification_json, indent=2)}
        
        CREDIT ASSESSMENT:
        {json.dumps(credit_json, indent=2)}
        
        RELEVANT LENDING POLICIES:
        {policies_str}
        
        Based on these policies and the application data, determine:
        1. Should this loan be approved, rejected, or referred for human review?
        2. What is the confidence level of this decision?
        3. What are the key reasons for this decision?
        4. What conditions should be applied (if approved)?
        5. What is the maximum approved amount, term, and interest rate (if approved)?
        
        Return ONLY a JSON object with this structure, nothing else:
        {{
            "decision": "approve/reject/refer",
            "confidence_score": 0.0 to 1.0,
            "reasons": ["reason1", "reason2", ...],
            "conditions": ["condition1", "condition2", ...],
            "max_approved_amount": amount (if approved),
            "approved_term_months": term (if approved),
            "approved_interest_rate": rate (if approved)
        }}
        """
        
        response = self.model.generate_content(prompt)
        
        # Process response to ensure it's valid JSON
        cleaned_text = response.text.strip()
        
        # Remove markdown code block formatting if present
        if cleaned_text.startswith("```"):
            prefix = "```json" if cleaned_text.startswith("```json") else "```"
            cleaned_text = cleaned_text[len(prefix):]
            if cleaned_text.endswith("```"):
                cleaned_text = cleaned_text[:-3]
        
        try:
            decision_data = json.loads(cleaned_text.strip())
            
            # Fix the interest rate if it's unreasonable
            rate = decision_data.get("approved_interest_rate")
            if rate is not None:
                # Check if interest rate is expressed as a whole number (e.g., 7 instead of 0.07)
                if rate > 1:
                    # Convert to a decimal (e.g., 7% becomes 0.07)
                    rate /= 100
                    print(f"✅ Fixed interest rate from whole percentage to decimal: {rate:.4f}")
                
                # Verify the interest rate is reasonable (between 2% and 25%)
                if rate < 0.02:
                    rate = 0.05  # Default to 5%
                    print(f"⚠️ Interest rate too low, set to default: {rate:.4f}")
                elif rate > 0.25:
                    rate = 0.0725  # Default to 7.25%
                    print(f"⚠️ Interest rate too high, set to default: {rate:.4f}")
                
                decision_data["approved_interest_rate"] = rate
        
        except json.JSONDecodeError:
            # If parsing still fails, create a default structure
            print("❌ Error: Could not parse JSON for underwriting decision. Using default values.")
            print(f"Model response: {response.text}")
            
            # Create a simple default decision based on credit score
            credit_score = credit_assessment.credit_score
            if credit_score >= 680:
                decision = "approve"
                reasons = ["Acceptable credit score", "Income verification passed"]
                conditions = ["Proof of identity", "Proof of income"]
                max_amount = loan_application.loan_request.loan_amount
                term = loan_application.loan_request.loan_term_months
                rate = 0.07
            elif credit_score >= 620:
                decision = "refer"
                reasons = ["Borderline credit score", "Additional review needed"]
                conditions = ["Proof of identity", "Proof of income", "Additional collateral may be required"]
                max_amount = loan_application.loan_request.loan_amount * 0.8
                term = loan_application.loan_request.loan_term_months
                rate = 0.09
            else:
                decision = "reject"
                reasons = ["Low credit score", "High risk profile"]
                conditions = []
                max_amount = None
                term = None
                rate = None
            
            decision_data = {
                "decision": decision,
                "confidence_score": 0.7,
                "reasons": reasons,
                "conditions": conditions,
                "max_approved_amount": max_amount,
                "approved_term_months": term,
                "approved_interest_rate": rate
            }
        
        # Create UnderwritingDecision object
        decision = UnderwritingDecision(
            decision=decision_data["decision"],
            confidence_score=decision_data["confidence_score"],
            reasons=decision_data["reasons"],
            conditions=decision_data.get("conditions", []),
            max_approved_amount=decision_data.get("max_approved_amount"),
            approved_term_months=decision_data.get("approved_term_months"),
            approved_interest_rate=decision_data.get("approved_interest_rate"),
            decision_date=datetime.now().strftime("%Y-%m-%d")
        )
        
        return decision
    
    def make_decision(self, loan_application, verification_result, credit_assessment):
        """Make an underwriting decision on the loan application."""
        # Retrieve relevant policies
        relevant_policies = self.retrieve_relevant_policies(loan_application, credit_assessment)
        
        # Apply policies to make decision
        decision = self.apply_policies(loan_application, verification_result, credit_assessment, relevant_policies)
        
        # Print decision summary
        decision_emoji = "✅" if decision.decision == "approve" else "❌" if decision.decision == "reject" else "⚠️"
        print(f"{decision_emoji} Underwriting decision: {decision.decision.upper()}")
        print(f"Confidence: {decision.confidence_score:.2%}")
        
        print("Reasons:")
        for reason in decision.reasons:
            print(f"  - {reason}")
        
        if decision.conditions:
            print("Conditions:")
            for condition in decision.conditions:
                print(f"  - {condition}")
        
        if decision.decision == "approve":
            print(f"Approved amount: ${decision.max_approved_amount:,.2f}")
            print(f"Approved term: {decision.approved_term_months} months")
            print(f"Approved interest rate: {decision.approved_interest_rate:.2%}")
        
        return decision
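
To make the `chunk_size=500` and `chunk_overlap=50` settings in `setup_policies_db` more concrete, here is a deliberately simplified sketch of overlapping chunks. It is not the algorithm used by `RecursiveCharacterTextSplitter`, which prefers to split on separators such as paragraph and line breaks; the fixed-window function `chunk_text` below is a hypothetical stand-in that only illustrates how the two parameters interact.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size windows that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each new window starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


policy_text = "x" * 1200  # placeholder text, not a real lending policy
chunks = chunk_text(policy_text)
print([len(c) for c in chunks])  # [500, 500, 300]
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.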

In [26]:
# Test the UnderwritingAgent
print("\n=== TESTING UNDERWRITING AGENT ===")
underwriting_agent = UnderwritingAgent()
underwriting_decision = underwriting_agent.make_decision(
    loan_application, 
    verification_result, 
    credit_assessment
)
print(f"Underwriting decision: {underwriting_decision.decision.upper()}")
print(f"- Confidence: {underwriting_decision.confidence_score:.2%}")
print(f"- Reasons: {underwriting_decision.reasons}")
if underwriting_decision.conditions:
    print(f"- Conditions: {underwriting_decision.conditions}")
if underwriting_decision.decision == "approve":
    print(f"- Approved amount: ${underwriting_decision.max_approved_amount:,.2f}")
    print(f"- Approved term: {underwriting_decision.approved_term_months} months")
    print(f"- Approved interest rate: {underwriting_decision.approved_interest_rate:.2%}")



=== TESTING UNDERWRITING AGENT ===
✅ Set up policies database with 16 chunks
✅ Retrieved 5 relevant policy documents
✅ Fixed interest rate from whole percentage to decimal: 0.0675
✅ Underwriting decision: APPROVE
Confidence: 95.00%
Reasons:
  - Applicant meets all minimum requirements for a Home Renovation Loan.
  - Credit score is above the minimum requirement (600).
  - Debt-to-income ratio is below the maximum allowed (43%).
  - Loan purpose is for home renovation.
  - Applicant has a strong employment history.
Conditions:
  - Proof of completion for renovation projects (e.g., invoices, photos).
  - Homeowners insurance verification.
  - Automatic payment enrollment.
Approved amount: $50,000.00
Approved term: 60 months
Approved interest rate: 6.75%
Underwriting decision: APPROVE
- Confidence: 95.00%
- Reasons: ['Applicant meets all minimum requirements for a Home Renovation Loan.', 'Credit score is above the minimum requirement (600).', 'Debt-to-income ratio is below the maximum al
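
The "Fixed interest rate" line in the output comes from the rate checks in `apply_policies`. The sketch below isolates those checks so their effect on a few inputs can be seen without calling the model. The function name `normalize_rate` is hypothetical; the thresholds and defaults are the ones used in the agent code.

```python
def normalize_rate(rate: float) -> float:
    """Mirror the interest-rate checks in apply_policies (hypothetical helper)."""
    if rate > 1:      # e.g. 6.75 meaning 6.75%
        rate /= 100
    if rate < 0.02:   # implausibly low: fall back to 5%
        return 0.05
    if rate > 0.25:   # implausibly high: fall back to 7.25%
        return 0.0725
    return rate


for raw in (6.75, 0.0675, 0.01, 30):
    print(raw, "->", f"{normalize_rate(raw):.4f}")
# 6.75 -> 0.0675
# 0.0675 -> 0.0675
# 0.01 -> 0.0500
# 30 -> 0.0725
```

One consequence of this design is that any value between 0.25 and 1 is treated as too high and replaced by the 7.25% default, because it cannot be told apart from a mis-scaled rate.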

## Agent 6: Reporting Agent

In [27]:
class ReportingAgent:
    """
    Responsible for generating the final loan processing report.
    
    This agent:
    1. Synthesizes information from all previous agents
    2. Generates a comprehensive summary report
    3. Creates visualizations of key metrics
    """
    
    def __init__(self):
        self.model = gemini_model
    
    def generate_verification_summary(self, verification_result):
        """Generate a summary of the verification process."""
        verification_json = json.loads(verification_result.to_json())
        
        prompt = f"""
        Generate a concise professional summary of this verification process.
        Focus on key findings and issues (if any).
        
        Verification data:
        {json.dumps(verification_json, indent=2)}
        
        Keep the summary under 150 words.
        """
        
        response = self.model.generate_content(prompt)
        return response.text
    
    def generate_credit_summary(self, credit_assessment):
        """Generate a summary of the credit assessment."""
        credit_json = json.loads(credit_assessment.to_json())
        
        prompt = f"""
        Generate a concise professional summary of this credit assessment.
        Focus on credit score, risk factors, and key financial ratios.
        
        Credit assessment data:
        {json.dumps(credit_json, indent=2)}
        
        Keep the summary under 150 words.
        """
        
        response = self.model.generate_content(prompt)
        return response.text
    
    def generate_underwriting_summary(self, underwriting_decision):
        """Generate a summary of the underwriting decision."""
        decision_json = json.loads(underwriting_decision.to_json())
        
        prompt = f"""
        Generate a concise professional summary of this underwriting decision.
        Focus on the decision itself, key reasons, and any conditions if approved.
        
        Underwriting decision:
        {json.dumps(decision_json, indent=2)}
        
        Keep the summary under 150 words.
        """
        
        response = self.model.generate_content(prompt)
        return response.text
    
    def generate_recommended_action(self, verification_result, credit_assessment, underwriting_decision):
        """Generate a recommended action based on all previous assessments."""
        verification_json = json.loads(verification_result.to_json())
        credit_json = json.loads(credit_assessment.to_json())
        decision_json = json.loads(underwriting_decision.to_json())
        
        prompt = f"""
        Generate a clear, actionable recommendation for the loan officer.
        Consider all aspects of the application including verification, credit assessment, and underwriting decision.
        
        Verification data:
        {json.dumps(verification_json, indent=2)}
        
        Credit assessment:
        {json.dumps(credit_json, indent=2)}
        
        Underwriting decision:
        {json.dumps(decision_json, indent=2)}
        
        Keep the recommendation under 100 words and make it direct and actionable.
        """
        
        response = self.model.generate_content(prompt)
        return response.text

    def create_visualizations(self, loan_application, credit_assessment, underwriting_decision):
        """Create visualizations of key loan metrics."""
        # Imports used only by the visualizations (kept local to this method)
        import matplotlib.pyplot as plt
        from matplotlib.patches import Wedge, Circle
        import numpy as np

        # Create figure and subplots
        fig, axs = plt.subplots(1, 3, figsize=(15, 5))

        # Plot 1: Financial Ratios (Does not use credit score ranges)
        ratios = {
            'DTI Ratio': credit_assessment.debt_to_income_ratio,
            'PTI Ratio': credit_assessment.payment_to_income_ratio
        }
        if hasattr(credit_assessment, 'loan_to_value_ratio') and credit_assessment.loan_to_value_ratio:
             ratios['LTV Ratio'] = credit_assessment.loan_to_value_ratio

        ratio_colors = ['#ff9999', '#66b3ff', '#99ff99']
        # Highlight potentially problematic DTI
        if credit_assessment.debt_to_income_ratio > 0.43:
             # Ensure we don't go out of bounds if LTV is missing
            if 'DTI Ratio' in ratios and list(ratios.keys()).index('DTI Ratio') < len(ratio_colors):
                ratio_colors[list(ratios.keys()).index('DTI Ratio')] = '#ff0000' # Red

        axs[0].bar(ratios.keys(), ratios.values(), color=ratio_colors[:len(ratios)])
        axs[0].set_title('Financial Ratios')
        axs[0].set_ylim(0, max(max(ratios.values() or [0]) * 1.2, 0.5)) # Added 'or [0]' for safety
        axs[0].set_ylabel('Ratio')
        # Add percentage labels
        for i, v in enumerate(ratios.values()):
            axs[0].text(i, (v or 0) + 0.02, f'{v:.1%}', ha='center') # Added 'or 0' for safety

        # Plot 2: Credit Score Gauge (Uses the specified ranges internally)

        # --- GAUGE FUNCTION DEFINITION ---
        def gauge(ax, value, min_val=300, max_val=850):
            """
            Create a gauge chart for credit scores with 5 colored ranges.
            Args:
                ax (matplotlib.axes.Axes): The axes object to draw the gauge on.
                value (int): The credit score value to display.
                min_val (int): The minimum value of the gauge scale.
                max_val (int): The maximum value of the gauge scale.
            """
            # --- Define THE REQUIRED Ranges and Colors ---
            ranges = {
                'Poor': (300, 579),
                'Fair': (580, 669),
                'Good': (670, 739),
                'Very Good': (740, 799),
                'Excellent': (800, 850)
            }
            colors = {
                'Poor': '#ff0000',       # Red
                'Fair': '#ffa500',       # Orange
                'Good': '#ffff00',       # Yellow
                'Very Good': '#90ee90',  # Light Green
                'Excellent': '#006400'   # Dark Green
            }
            # --- End of Range/Color Definition ---

            range_keys = list(ranges.keys())
            range_colors = [colors[key] for key in range_keys]
            total_range = max_val - min_val

            # --- Normalize value and boundaries ---
            norm_value = (value - min_val) / total_range
            boundaries = [min_val] + [ranges[key][1] for key in range_keys]
            norm_boundaries = [(b - min_val) / total_range for b in boundaries]

            # --- Calculate angles for segments ---
            angles = [180 - (norm_b * 180) for norm_b in norm_boundaries]
            start_angles = angles[:-1]
            end_angles = angles[1:]

            # --- Clear the axis ---
            ax.clear()

            # --- Draw gauge background ---
            background = Wedge((0.5, 0), 0.4, 0, 180, width=0.1, facecolor='#dddddd', lw=0.5, edgecolor='black')
            ax.add_patch(background)

            # --- Draw colored segments ---
            for i in range(len(range_keys)):
                segment = Wedge((0.5, 0), 0.4, end_angles[i], start_angles[i], width=0.1,
                                facecolor=range_colors[i], lw=0.5, edgecolor='black')
                ax.add_patch(segment)

            # --- Draw needle ---
            needle_angle_rad = np.radians(180 - (norm_value * 180))
            needle_length = 0.35
            needle_x = 0.5 + needle_length * np.cos(needle_angle_rad)
            needle_y = 0 + needle_length * np.sin(needle_angle_rad)
            ax.plot([0.5, needle_x], [0, needle_y], color='black', linewidth=2, solid_capstyle='round', zorder=3)

            # --- Add a circle at the base ---
            base_circle = Circle((0.5, 0), 0.04, facecolor='black', zorder=4)
            ax.add_patch(base_circle)

            # --- Add text: Current Score ---
            ax.text(0.5, -0.15, f'{value}', ha='center', va='center', fontsize=18, fontweight='bold')
            ax.text(0.5, -0.25, 'Credit Score', ha='center', va='center', fontsize=10)

            # --- Add labels for ranges and values ---
            label_radius = 0.45
            tick_radius_inner = 0.39
            tick_radius_outer = 0.41
            for i, boundary_val in enumerate(boundaries):
                norm_b = norm_boundaries[i]
                angle_rad = np.radians(180 - (norm_b * 180))
                x_outer = 0.5 + label_radius * np.cos(angle_rad)
                y_outer = 0 + label_radius * np.sin(angle_rad)
                if 0.02 < norm_b < 0.98:  # Avoid crowding the gauge edges
                    ax.text(x_outer, y_outer, str(boundary_val), ha='center', va='center', fontsize=7)
                # Add tick marks
                x_tick_inner = 0.5 + tick_radius_inner * np.cos(angle_rad)
                y_tick_inner = 0 + tick_radius_inner * np.sin(angle_rad)
                x_tick_outer = 0.5 + tick_radius_outer * np.cos(angle_rad)
                y_tick_outer = 0 + tick_radius_outer * np.sin(angle_rad)
                ax.plot([x_tick_inner, x_tick_outer], [y_tick_inner, y_tick_outer], color='black', linewidth=1)

            # Add range name labels
            label_radius_names = 0.28
            for i, key in enumerate(range_keys):
                mid_norm_boundary = (norm_boundaries[i] + norm_boundaries[i + 1]) / 2
                angle_rad = np.radians(180 - (mid_norm_boundary * 180))
                x = 0.5 + label_radius_names * np.cos(angle_rad)
                y = 0 + label_radius_names * np.sin(angle_rad)
                ax.text(x, y, key, ha='center', va='center', fontsize=7,
                        rotation=90 - (mid_norm_boundary * 180))

            # --- Set limits and aspect ---
            ax.set_xlim(0, 1)
            ax.set_ylim(-0.3, 0.55)
            ax.set_aspect('equal', adjustable='box')
            ax.axis('off')
        # --- END OF GAUGE FUNCTION DEFINITION ---

        # Call the gauge function using the credit score
        gauge(axs[1], credit_assessment.credit_score)
        axs[1].set_title('Credit Score')

        # Plot 3: Risk Score (Does not use credit score ranges)
        risk_score = credit_assessment.risk_score
        risk_levels = ['Low Risk', 'Medium Risk', 'High Risk']
        risk_colors = ['#00ff00', '#ffff00', '#ff0000'] # Green, Yellow, Red

        # Determine risk level based on risk score thresholds (0.3, 0.7)
        if risk_score < 0.3: risk_level, risk_color = 0, risk_colors[0] # Low
        elif risk_score < 0.7: risk_level, risk_color = 1, risk_colors[1] # Medium
        else: risk_level, risk_color = 2, risk_colors[2] # High

        # Create risk score visualization (simple pie/donut)
        axs[2].pie([1], colors=[risk_color], wedgeprops=dict(width=0.3, edgecolor='w'), startangle=90)
        axs[2].text(0, 0, f'{risk_score:.2f}', ha='center', va='center', fontsize=20) # Display risk score value
        axs[2].text(0, -0.15, risk_levels[risk_level], ha='center', va='center', fontsize=12) # Display risk level text
        axs[2].set_title('Risk Score')
        axs[2].axis('equal') # Equal aspect ratio ensures it is drawn as a circle.

        # Finalize and save the figure
        plt.tight_layout()
        plt.savefig('loan_metrics.png')
        plt.close(fig) # Close the figure to free memory

        print("✅ Created visualization: loan_metrics.png")
    
    def generate_report(self, loan_application, verification_result, credit_assessment, underwriting_decision):
        """Generate the final loan processing report."""
        # Generate component summaries
        verification_summary = self.generate_verification_summary(verification_result)
        credit_summary = self.generate_credit_summary(credit_assessment)
        underwriting_summary = self.generate_underwriting_summary(underwriting_decision)
        recommended_action = self.generate_recommended_action(
            verification_result, 
            credit_assessment,
            underwriting_decision
        )
        
        # Create visualizations
        self.create_visualizations(loan_application, credit_assessment, underwriting_decision)
        
        # Compile the report
        report = LoanProcessingReport(
            application_id=loan_application.application_id,
            applicant_name=f"{loan_application.applicant.first_name} {loan_application.applicant.last_name}",
            loan_amount=loan_application.loan_request.loan_amount,
            loan_purpose=loan_application.loan_request.loan_purpose,
            decision=underwriting_decision.decision,
            verification_summary=verification_summary,
            credit_summary=credit_summary,
            underwriting_summary=underwriting_summary,
            recommended_action=recommended_action,
            conditions=underwriting_decision.conditions,
            processing_date=datetime.now().strftime("%Y-%m-%d")
        )
        
        print("✅ Generated final loan processing report")
        
        # Save report to file
        with open("loan_processing_report.json", "w") as f:
            f.write(report.to_json(indent=2))
        
        print("✅ Saved report to loan_processing_report.json")
        
        return report

    def format_interest_rate(self, rate):
        """Format interest rate properly for display."""
        if rate is None:
            return "N/A"
        
        # Check if it's already in decimal form (less than 1)
        if rate < 1:
            # Display as percentage with 2 decimal places
            return f"{rate*100:.2f}%"
        else:
            # It's already a percentage value, just format it
            return f"{rate:.2f}%"
    
    def generate_html_report(self, report, loan_application, credit_assessment, underwriting_decision):
        """Generate an HTML version of the loan processing report."""
        # Format decision for display
        if report.decision == "approve":
            decision_class = "text-success"
            decision_text = "APPROVED"
        elif report.decision == "reject":
            decision_class = "text-danger"
            decision_text = "REJECTED"
        else:
            decision_class = "text-warning"
            decision_text = "REFERRED TO UNDERWRITER"
        
        # Create HTML content
        html = f"""
        <!DOCTYPE html>
        <html lang="en">
        <head>
            <meta charset="UTF-8">
            <base href="./">                                    
            <meta name="viewport" content="width=device-width, initial-scale=1.0">
            <title>Loan Processing Report</title>
            <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet">
            <style>
                body {{ font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif; }}
                .header {{ background-color: #f8f9fa; padding: 20px; border-bottom: 1px solid #ddd; }}
                .decision-badge {{ font-size: 1.2rem; padding: 8px 16px; }}
                .section {{ margin-bottom: 30px; }}
                .section-title {{ border-bottom: 2px solid #007bff; padding-bottom: 10px; margin-bottom: 20px; color: #007bff; }}
                .metrics-box {{ background-color: #f8f9fa; border-radius: 8px; padding: 15px; margin-bottom: 20px; }}
                .metric {{ text-align: center; }}
                .metric-value {{ font-size: 1.5rem; font-weight: bold; }}
                .metric-label {{ font-size: 0.9rem; color: #6c757d; }}
                .conditions {{ background-color: #fff3cd; border-left: 4px solid #ffc107; padding: 15px; }}
                .risk-factors {{ background-color: #f8d7da; border-left: 4px solid #dc3545; padding: 15px; }}
            </style>
        </head>
        <body>
            <div class="container mt-4 mb-5">
                <!-- Header -->
                <div class="header d-flex justify-content-between align-items-center mb-4">
                    <div>
                        <h1>Loan Processing Report</h1>
                        <p class="text-muted">Application #{report.application_id} | {report.processing_date}</p>
                    </div>
                    <div>
                        <span class="badge {decision_class} decision-badge">{decision_text}</span>
                    </div>
                </div>
                
                <!-- Applicant Information -->
                <div class="section">
                    <h2 class="section-title">Applicant Information</h2>
                    <div class="row">
                        <div class="col-md-6">
                            <p><strong>Name:</strong> {report.applicant_name}</p>
                            <p><strong>Email:</strong> {loan_application.applicant.email}</p>
                            <p><strong>Phone:</strong> {loan_application.applicant.phone}</p>
                        </div>
                        <div class="col-md-6">
                            <p><strong>Address:</strong> {loan_application.applicant.address}</p>
                            <p><strong>Employment:</strong> {loan_application.applicant.employment_status} at {loan_application.applicant.employer}</p>
                            <p><strong>Position:</strong> {loan_application.applicant.job_title} for {loan_application.applicant.years_at_current_job} years</p>
                        </div>
                    </div>
                </div>
                
                <!-- Loan Request -->
                <div class="section">
                    <h2 class="section-title">Loan Request</h2>
                    <div class="row metrics-box">
                        <div class="col-md-4 metric">
                            <div class="metric-value">${loan_application.loan_request.loan_amount:,.2f}</div>
                            <div class="metric-label">Requested Amount</div>
                        </div>
                        <div class="col-md-4 metric">
                            <div class="metric-value">{loan_application.loan_request.loan_term_months} months</div>
                            <div class="metric-label">Requested Term</div>
                        </div>
                        <div class="col-md-4 metric">
                            <div class="metric-value">{loan_application.loan_request.loan_purpose}</div>
                            <div class="metric-label">Loan Purpose</div>
                        </div>
                    </div>
                </div>
                
                <!-- Key Metrics -->
                <div class="section">
                    <h2 class="section-title">Key Metrics</h2>
                    <div class="row metrics-box">
                        <div class="col-md-3 metric">
                            <div class="metric-value">{credit_assessment.credit_score}</div>
                            <div class="metric-label">Credit Score</div>
                        </div>
                        <div class="col-md-3 metric">
                            <div class="metric-value">{credit_assessment.debt_to_income_ratio:.1%}</div>
                            <div class="metric-label">DTI Ratio</div>
                        </div>
                        <div class="col-md-3 metric">
                            <div class="metric-value">{credit_assessment.payment_to_income_ratio:.1%}</div>
                            <div class="metric-label">PTI Ratio</div>
                        </div>
                        <div class="col-md-3 metric">
                            <div class="metric-value">{credit_assessment.risk_score:.2f}</div>
                            <div class="metric-label">Risk Score</div>
                        </div>
                    </div>
                    <div class="text-center mt-3">
                        <img src="loan_metrics.png" alt="Loan Metrics Visualization" class="img-fluid" style="max-width: 100%;">
                    </div>
                </div>
                
                <!-- Verification Summary -->
                <div class="section">
                    <h2 class="section-title">Verification Summary</h2>
                    <p>{report.verification_summary}</p>
                </div>
                
                <!-- Credit Assessment -->
                <div class="section">
                    <h2 class="section-title">Credit Assessment</h2>
                    <p>{report.credit_summary}</p>
                    
                    <div class="risk-factors mt-3">
                        <h5>Risk Factors Identified:</h5>
                        <ul class="mb-0">
                            {' '.join(f'<li>{factor}</li>' for factor in credit_assessment.risk_factors)}
                        </ul>
                    </div>
                </div>
                
                <!-- Underwriting Decision -->
                <div class="section">
                    <h2 class="section-title">Underwriting Decision</h2>
                    <p>{report.underwriting_summary}</p>
                    
                    {
                    f'''
                    <div class="conditions mt-3">
                        <h5>Approval Conditions:</h5>
                        <ul class="mb-0">
                            {' '.join(f'<li>{condition}</li>' for condition in report.conditions)}
                        </ul>
                    </div>
                    ''' if report.conditions else ''
                    }
                </div>
                
                <!-- Recommended Action -->
                <div class="section">
                    <h2 class="section-title">Recommended Action</h2>
                    <div class="alert alert-info">
                        <p class="mb-0">{report.recommended_action}</p>
                    </div>
                </div>
                
                <!-- Footer -->
                <div class="mt-5 text-center text-muted">
                    <p>Generated by Multi-Agent Loan Processing System</p>
                    <p>Generated on {report.processing_date}</p>
                </div>
            </div>
            
            <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script>
        </body>
        </html>
        """
        
        # Save HTML report
        with open("loan_processing_report.html", "w", encoding="utf-8") as f:
            f.write(html)
        
        print("✅ Generated HTML report: loan_processing_report.html")
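
The gauge maps a credit score linearly onto a half circle: the minimum score sits at 180 degrees (far left) and the maximum at 0 degrees (far right). The sketch below isolates that mapping so it can be checked without drawing anything. The helper names `score_to_angle`, `needle_tip`, and `band_for` are hypothetical, and the clamping step is an added assumption; the notebook's `gauge` function does not clamp out-of-range scores.

```python
import math

# Score bands as (label, upper bound), copied from the gauge's `ranges` dict.
BANDS = [
    ("Poor", 579), ("Fair", 669), ("Good", 739),
    ("Very Good", 799), ("Excellent", 850),
]


def score_to_angle(value, min_val=300, max_val=850):
    """Map a score to a gauge angle in degrees (180 = far left, 0 = far right)."""
    # Assumption: clamp so an out-of-range score cannot push the needle
    # below the dial. The notebook's gauge() does not do this.
    clamped = max(min_val, min(max_val, value))
    return 180 - (clamped - min_val) / (max_val - min_val) * 180


def needle_tip(value, center=(0.5, 0.0), length=0.35):
    """Return the (x, y) end point of the needle for a given score."""
    angle = math.radians(score_to_angle(value))
    return (center[0] + length * math.cos(angle),
            center[1] + length * math.sin(angle))


def band_for(value):
    """Return the label of the first band whose upper bound covers the score."""
    for label, upper in BANDS:
        if value <= upper:
            return label
    return BANDS[-1][0]


print(score_to_angle(575), band_for(700))  # prints: 90.0 Good
```

A score of 575 is the midpoint of the 300 to 850 range, so the needle points straight up.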

In [33]:
import json
import os
import base64
import matplotlib.pyplot as plt
from matplotlib.patches import Wedge, Circle
import numpy as np
from datetime import datetime

# Assumes gemini_model and LoanProcessingReport were defined in earlier cells.

class ReportingAgent:
    """
    Responsible for generating the final loan processing report,
    including summaries, visualizations, JSON export, and an HTML page
    with the chart embedded as Base64.
    """
    def __init__(self):
        self.model = gemini_model

    def generate_verification_summary(self, verification_result):
        verification_json = json.loads(verification_result.to_json())
        prompt = f"""
        Generate a concise professional summary of this verification process.
        Focus on key findings and issues (if any).

        Verification data:
        {json.dumps(verification_json, indent=2)}

        Keep the summary under 150 words.
        """
        response = self.model.generate_content(prompt)
        return response.text

    def generate_credit_summary(self, credit_assessment):
        credit_json = json.loads(credit_assessment.to_json())
        prompt = f"""
        Generate a concise professional summary of this credit assessment.
        Focus on credit score, risk factors, and key financial ratios.

        Credit assessment data:
        {json.dumps(credit_json, indent=2)}

        Keep the summary under 150 words.
        """
        response = self.model.generate_content(prompt)
        return response.text

    def generate_underwriting_summary(self, underwriting_decision):
        decision_json = json.loads(underwriting_decision.to_json())
        prompt = f"""
        Generate a concise professional summary of this underwriting decision.
        Focus on the decision itself, key reasons, and any conditions if approved.

        Underwriting decision:
        {json.dumps(decision_json, indent=2)}

        Keep the summary under 150 words.
        """
        response = self.model.generate_content(prompt)
        return response.text

    def generate_recommended_action(self, verification_result, credit_assessment, underwriting_decision):
        verification_json = json.loads(verification_result.to_json())
        credit_json = json.loads(credit_assessment.to_json())
        decision_json = json.loads(underwriting_decision.to_json())
        prompt = f"""
        Generate a clear, actionable recommendation for the loan officer.
        Consider all aspects of the application including verification, credit assessment, and underwriting decision.

        Verification data:
        {json.dumps(verification_json, indent=2)}

        Credit assessment:
        {json.dumps(credit_json, indent=2)}

        Underwriting decision:
        {json.dumps(decision_json, indent=2)}

        Keep the recommendation under 100 words and make it direct and actionable.
        """
        response = self.model.generate_content(prompt)
        return response.text

    def create_visualizations(self, loan_application, credit_assessment, underwriting_decision):
        """Render the ratio bar chart, credit score gauge, and risk donut to loan_metrics.png."""

        fig, axs = plt.subplots(1, 3, figsize=(15, 5))

        # --- Plot 1: Financial Ratios ---
        ratios = {
            'DTI Ratio': credit_assessment.debt_to_income_ratio,
            'PTI Ratio': credit_assessment.payment_to_income_ratio
        }
        if hasattr(credit_assessment, 'loan_to_value_ratio') and credit_assessment.loan_to_value_ratio:
            ratios['LTV Ratio'] = credit_assessment.loan_to_value_ratio

        ratio_colors = ['#ff9999', '#66b3ff', '#99ff99']
        if credit_assessment.debt_to_income_ratio > 0.43:
            idx = list(ratios.keys()).index('DTI Ratio')
            if idx < len(ratio_colors):
                ratio_colors[idx] = '#ff0000'

        axs[0].bar(ratios.keys(), ratios.values(), color=ratio_colors[:len(ratios)])
        axs[0].set_title('Financial Ratios')
        axs[0].set_ylim(0, max(max(ratios.values() or [0]) * 1.2, 0.5))
        axs[0].set_ylabel('Ratio')
        for i, v in enumerate(ratios.values()):
            axs[0].text(i, (v or 0) + 0.02, f'{v:.1%}', ha='center')

        # --- Plot 2: Credit Score Gauge ---
        def gauge(ax, value, min_val=300, max_val=850):
            ranges = {
                'Poor': (300, 579), 'Fair': (580, 669),
                'Good': (670, 739), 'Very Good': (740, 799),
                'Excellent': (800, 850)
            }
            colors = {
                'Poor': '#ff0000', 'Fair': '#ffa500', 'Good': '#ffff00',
                'Very Good': '#90ee90', 'Excellent': '#006400'
            }
            total = max_val - min_val
            norm_val = (value - min_val) / total
            boundaries = [min_val] + [r[1] for r in ranges.values()]
            norm_b = [(b - min_val) / total for b in boundaries]
            angles = [180 - nb * 180 for nb in norm_b]

            ax.clear()
            ax.add_patch(Wedge((0.5,0),0.4,0,180,width=0.1,facecolor='#ddd',edgecolor='black'))
            for i, key in enumerate(ranges):
                seg = Wedge((0.5,0),0.4, angles[i+1], angles[i],
                            width=0.1, facecolor=colors[key], edgecolor='black')
                ax.add_patch(seg)

            needle_ang = np.radians(180 - norm_val * 180)
            x2 = 0.5 + 0.35 * np.cos(needle_ang)
            y2 = 0 + 0.35 * np.sin(needle_ang)
            ax.plot([0.5,x2],[0,y2],color='black',linewidth=2)
            ax.add_patch(Circle((0.5,0),0.04,facecolor='black',zorder=4))
            ax.text(0.5,-0.15,str(value),ha='center',fontsize=18,fontweight='bold')
            ax.text(0.5,-0.25,'Credit Score',ha='center',fontsize=10)
            ax.set_xlim(0,1); ax.set_ylim(-0.3,0.55)
            ax.set_aspect('equal'); ax.axis('off')

        gauge(axs[1], credit_assessment.credit_score)
        axs[1].set_title('Credit Score')

        # --- Plot 3: Risk Score ---
        risk = credit_assessment.risk_score
        levels = ['Low Risk','Medium Risk','High Risk']
        colors = ['#00ff00','#ffff00','#ff0000']
        if risk < 0.3: lvl, col = 0, colors[0]
        elif risk < 0.7: lvl, col = 1, colors[1]
        else: lvl, col = 2, colors[2]
        axs[2].pie([1], colors=[col], wedgeprops=dict(width=0.3, edgecolor='w'), startangle=90)
        axs[2].text(0,0,f'{risk:.2f}',ha='center',va='center',fontsize=20)
        axs[2].text(0,-0.15,levels[lvl],ha='center',va='center',fontsize=12)
        axs[2].set_title('Risk Score'); axs[2].axis('equal')

        plt.tight_layout()
        plt.savefig('loan_metrics.png')
        plt.close(fig)
        print("✅ Created visualization: loan_metrics.png")

    def generate_report(self, loan_application, verification_result, credit_assessment, underwriting_decision):
        verification_summary = self.generate_verification_summary(verification_result)
        credit_summary       = self.generate_credit_summary(credit_assessment)
        underwriting_summary = self.generate_underwriting_summary(underwriting_decision)
        recommended_action   = self.generate_recommended_action(
            verification_result, credit_assessment, underwriting_decision
        )
        self.create_visualizations(loan_application, credit_assessment, underwriting_decision)

        report = LoanProcessingReport(
            application_id    = loan_application.application_id,
            applicant_name    = f"{loan_application.applicant.first_name} {loan_application.applicant.last_name}",
            loan_amount       = loan_application.loan_request.loan_amount,
            loan_purpose      = loan_application.loan_request.loan_purpose,
            decision          = underwriting_decision.decision,
            verification_summary = verification_summary,
            credit_summary       = credit_summary,
            underwriting_summary = underwriting_summary,
            recommended_action   = recommended_action,
            conditions           = underwriting_decision.conditions,
            processing_date      = datetime.now().strftime("%Y-%m-%d"),
        )
        with open("loan_processing_report.json", "w") as f:
            f.write(report.to_json(indent=2))
        print("✅ Generated final loan processing report")
        return report

    def format_interest_rate(self, rate):
        if rate is None:
            return "N/A"
        return f"{rate*100:.2f}%" if rate < 1 else f"{rate:.2f}%"

    def generate_html_report(self, report, loan_application, credit_assessment, underwriting_decision):
        # Read the chart written by create_visualizations(). That method saves
        # to the current working directory, so resolve the path the same way
        # (this also works in notebooks, where __file__ is undefined).
        img_path = os.path.join(os.getcwd(), "loan_metrics.png")
        with open(img_path, "rb") as img_f:
            img_b64 = base64.b64encode(img_f.read()).decode("utf-8")

        # decide badge
        if report.decision == "approve":
            cls, txt = "text-success", "APPROVED"
        elif report.decision == "reject":
            cls, txt = "text-danger", "REJECTED"
        else:
            cls, txt = "text-warning", "REFERRED"

        html = f"""
        <!DOCTYPE html>
        <html lang="en">
        <head>
          <meta charset="UTF-8">
          <meta name="viewport" content="width=device-width, initial-scale=1.0">
          <title>Loan Processing Report</title>
          <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet">
          <style>
            body {{ font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif; }}
            .header {{ background-color: #f8f9fa; padding: 20px; border-bottom: 1px solid #ddd; }}
            .decision-badge {{ font-size:1.2rem; padding:8px 16px; }}
            .section {{ margin-bottom:30px; }}
            .section-title {{ border-bottom:2px solid #007bff; padding-bottom:10px; color:#007bff; }}
            .metrics-box {{ background-color:#f8f9fa; border-radius:8px; padding:15px; }}
            .metric {{ text-align:center; }}
            .metric-value {{ font-size:1.5rem; font-weight:bold; }}
            .metric-label {{ font-size:0.9rem; color:#6c757d; }}
            .conditions {{ background-color:#fff3cd; border-left:4px solid #ffc107; padding:15px; }}
            .risk-factors {{ background-color:#f8d7da; border-left:4px solid #dc3545; padding:15px; }}
          </style>
        </head>
        <body>
          <div class="container mt-4 mb-5">

            <div class="header d-flex justify-content-between align-items-center mb-4">
              <div>
                <h1>Loan Processing Report</h1>
                <p class="text-muted">Application #{report.application_id} | {report.processing_date}</p>
              </div>
              <span class="badge {cls} decision-badge">{txt}</span>
            </div>

            <!-- Applicant Info -->
            <div class="section">
              <h2 class="section-title">Applicant Information</h2>
              <div class="row">
                <div class="col-md-6">
                  <p><strong>Name:</strong> {report.applicant_name}</p>
                  <p><strong>Email:</strong> {loan_application.applicant.email}</p>
                  <p><strong>Phone:</strong> {loan_application.applicant.phone}</p>
                </div>
                <div class="col-md-6">
                  <p><strong>Address:</strong> {loan_application.applicant.address}</p>
                  <p><strong>Employment:</strong> {loan_application.applicant.employment_status} at {loan_application.applicant.employer}</p>
                  <p><strong>Position:</strong> {loan_application.applicant.job_title}, {loan_application.applicant.years_at_current_job} yrs</p>
                </div>
              </div>
            </div>

            <!-- Loan Request -->
            <div class="section">
              <h2 class="section-title">Loan Request</h2>
              <div class="row metrics-box">
                <div class="col-md-4 metric">
                  <div class="metric-value">${loan_application.loan_request.loan_amount:,.2f}</div>
                  <div class="metric-label">Requested Amount</div>
                </div>
                <div class="col-md-4 metric">
                  <div class="metric-value">{loan_application.loan_request.loan_term_months} months</div>
                  <div class="metric-label">Term</div>
                </div>
                <div class="col-md-4 metric">
                  <div class="metric-value">{loan_application.loan_request.loan_purpose}</div>
                  <div class="metric-label">Purpose</div>
                </div>
              </div>
            </div>

            <!-- Key Metrics with inline image -->
            <div class="section">
              <h2 class="section-title">Key Metrics</h2>
              <div class="row metrics-box">
                <div class="col-md-3 metric">
                  <div class="metric-value">{credit_assessment.credit_score}</div>
                  <div class="metric-label">Credit Score</div>
                </div>
                <div class="col-md-3 metric">
                  <div class="metric-value">{credit_assessment.debt_to_income_ratio:.1%}</div>
                  <div class="metric-label">DTI Ratio</div>
                </div>
                <div class="col-md-3 metric">
                  <div class="metric-value">{credit_assessment.payment_to_income_ratio:.1%}</div>
                  <div class="metric-label">PTI Ratio</div>
                </div>
                <div class="col-md-3 metric">
                  <div class="metric-value">{credit_assessment.risk_score:.2f}</div>
                  <div class="metric-label">Risk Score</div>
                </div>
              </div>
              <div class="text-center mt-3">
                <img src="data:image/png;base64,{img_b64}" alt="Loan Metrics" class="img-fluid" style="max-width:100%;">
              </div>
            </div>

            <!-- Summaries & Recommendations -->
            <div class="section">
              <h2 class="section-title">Verification Summary</h2>
              <p>{report.verification_summary}</p>
            </div>
            <div class="section">
              <h2 class="section-title">Credit Assessment</h2>
              <p>{report.credit_summary}</p>
              <div class="risk-factors mt-3">
                <h5>Risk Factors Identified:</h5>
                <ul>
                  {' '.join(f'<li>{f}</li>' for f in credit_assessment.risk_factors)}
                </ul>
              </div>
            </div>
            <div class="section">
              <h2 class="section-title">Underwriting Decision</h2>
              <p>{report.underwriting_summary}</p>
              {(
                '<div class="conditions mt-3"><h5>Conditions:</h5><ul>' +
                ' '.join(f'<li>{c}</li>' for c in report.conditions) +
                '</ul></div>'
              ) if report.conditions else ''}
            </div>
            <div class="section">
              <h2 class="section-title">Recommended Action</h2>
              <div class="alert alert-info">{report.recommended_action}</div>
            </div>

            <div class="text-center text-muted mt-5">
              <p>Generated by Multi-Agent Loan Processing System</p>
              <p>On {report.processing_date}</p>
            </div>

          </div>
          <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script>
        </body>
        </html>
        """

        with open("loan_processing_report.html", "w", encoding="utf-8") as f:
            f.write(html)
        print("✅ Generated HTML report: loan_processing_report.html")
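
This version of the report embeds the chart as a Base64 data URI, so the HTML file can be opened without `loan_metrics.png` next to it. The sketch below shows that encoding step on its own, plus one optional hardening step: escaping list items before they go into the markup, since the summaries and risk factors are model-generated text. The helper names `png_to_data_uri` and `render_list_items` are hypothetical, and the escaping is a suggestion rather than something the notebook does.

```python
import base64
import html


def png_to_data_uri(png_bytes):
    """Encode raw PNG bytes as a data URI usable in an <img src="..."> attribute."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"


def render_list_items(items):
    """Build <li> elements, escaping each item so stray markup is shown as text."""
    return "\n".join(f"<li>{html.escape(str(item))}</li>" for item in items)


# The 8-byte PNG signature stands in for a real image file in this sketch.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

print(png_to_data_uri(PNG_SIGNATURE))
print(render_list_items(["DTI above 43%", "<script>alert(1)</script>"]))
```

Base64 inflates the image by roughly one third, which is acceptable for a single chart but worth keeping in mind if more figures are added to the report.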


In [34]:
# Test the ReportingAgent
print("\n=== TESTING REPORTING AGENT ===")
reporting_agent = ReportingAgent()
final_report = reporting_agent.generate_report(
    loan_application,
    verification_result,
    credit_assessment,
    underwriting_decision
)
reporting_agent.generate_html_report(
    final_report,
    loan_application,
    credit_assessment,
    underwriting_decision
)


=== TESTING REPORTING AGENT ===
✅ Created visualization: loan_metrics.png
✅ Generated final loan processing report
✅ Generated HTML report: loan_processing_report.html


In [30]:
print(os.getcwd())  # should be the folder where loan_metrics.png lives

/Users/ibrahimabarry/Documents/Loan_processing/Loan-Processing
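
Printing the working directory confirms where a relative `plt.savefig('loan_metrics.png')` call writes its file. A slightly more direct check, sketched below, looks for the file itself and reports its resolved location. The helper name `find_output_file` and its `search_dirs` parameter are hypothetical and not part of the notebook.

```python
from pathlib import Path


def find_output_file(name, search_dirs=None):
    """Return the resolved path of `name` in the first directory that has it, else None."""
    # Default to the working directory, which is where a relative
    # savefig() or open() call reads and writes.
    directories = search_dirs if search_dirs is not None else [Path.cwd()]
    for directory in directories:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate.resolve()
    return None


chart = find_output_file("loan_metrics.png")
print(chart if chart else "loan_metrics.png not found; run create_visualizations() first")
```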


## Create the Main Orchestrator

In [35]:
class LoanProcessingOrchestrator:
    """
    Orchestrates the entire loan processing workflow.
    
    This class coordinates all agents and steps in the loan processing pipeline.
    It can handle both user-uploaded files and mock data.
    """
    
    def __init__(self):
        self.intake_agent = IntakeAgent()
        self.extraction_agent = ExtractionAgent()
        self.verification_agent = VerificationAgent()
        self.credit_agent = CreditAssessmentAgent()
        self.underwriting_agent = UnderwritingAgent()
        self.reporting_agent = ReportingAgent()
    
    def process_loan_application(self, file_paths=None, mock_applicant_index=0):
        """
        Process a complete loan application package.
        
        Args:
            file_paths: List of file paths, a dict keyed by document type,
                or None to use mock data
            mock_applicant_index: Index of mock applicant to use if file_paths is None
        
        Returns:
            Dictionary with results from all processing stages, or None if a
            required document is missing
        """
        print("\n===== LOAN PROCESSING SYSTEM =====")
        print("Starting loan application processing...\n")
        
        # If no file paths provided, generate mock data
        if file_paths is None:
            print("No documents provided. Using mock application data...")
            file_paths = simulate_file_upload(mock_applicant_index)
        
        # Ensure file_paths is a list for IntakeAgent
        if isinstance(file_paths, dict):
            # Convert dictionary to list
            file_list = []
            for doc_type in ['application_form', 'id_document', 'financial_statement']:
                if doc_type in file_paths:
                    file_list.append(file_paths[doc_type])
            
            # Add credit report if available
            if 'credit_report' in file_paths:
                file_list.append(file_paths['credit_report'])
            
            file_paths = file_list
        
        # Step 1: Document Intake
        print("\n----- DOCUMENT INTAKE -----")
        processed_docs = self.intake_agent.process_documents(file_paths)
        
        # Verify we have all required documents
        required_docs = ["application_form", "id_document", "financial_statement"]
        missing_docs = set(required_docs) - set(processed_docs.keys())
        if missing_docs:
            print(f"\n❌ Cannot proceed: Missing required documents: {', '.join(sorted(missing_docs))}")
            return None
        
        # Step 2: Information Extraction
        print("\n----- INFORMATION EXTRACTION -----")
        loan_application = self.extraction_agent.extract_all(processed_docs)
        
        # Step 3: Verification
        print("\n----- VERIFICATION -----")
        verification_result = self.verification_agent.verify_application(loan_application)
        
        # Step 4: Credit Assessment
        print("\n----- CREDIT ASSESSMENT -----")
        # Look for credit report file if available
        credit_report_path = None
        for path in file_paths:
            if isinstance(path, str) and path.endswith('.json') and 'credit' in path.lower():
                credit_report_path = path
                break
        
        # Use provided credit report file if available
        if credit_report_path and os.path.exists(credit_report_path):
            print(f"Using provided credit report: {credit_report_path}")
            try:
                with open(credit_report_path, 'r') as f:
                    credit_data = json.load(f)
                # Inject credit data into the assessment process
                credit_assessment = self.credit_agent.assess_application_with_report(
                    loan_application, 
                    verification_result,
                    credit_data
                )
            except Exception as e:
                print(f"Error processing credit report: {str(e)}")
                print("Falling back to standard credit assessment...")
                credit_assessment = self.credit_agent.assess_application(
                    loan_application, 
                    verification_result
                )
        else:
            credit_assessment = self.credit_agent.assess_application(
                loan_application, 
                verification_result
            )
        
        # Step 5: Underwriting
        print("\n----- UNDERWRITING -----")
        underwriting_decision = self.underwriting_agent.make_decision(
            loan_application, 
            verification_result, 
            credit_assessment
        )
        
        # Step 6: Reporting
        print("\n----- REPORT GENERATION -----")
        final_report = self.reporting_agent.generate_report(
            loan_application,
            verification_result,
            credit_assessment,
            underwriting_decision
        )
        
        # Generate HTML report
        self.reporting_agent.generate_html_report(
            final_report,
            loan_application,
            credit_assessment,
            underwriting_decision
        )
        
        print("\n===== PROCESSING COMPLETE =====")
        
        return {
            "loan_application": loan_application,
            "verification_result": verification_result,
            "credit_assessment": credit_assessment,
            "underwriting_decision": underwriting_decision,
            "final_report": final_report
        }
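
The input handling at the start of `process_loan_application` can be tried in isolation. Below is a minimal sketch of the same two steps: flattening a dict of documents into an ordered list, then finding the credit report by file name. The function names and the sample file names are assumptions made for illustration; only the document-type keys and the `.json`/`credit` matching rule come from the orchestrator code above.

```python
from typing import Dict, List, Optional, Union

# Order mirrors the document types checked by the orchestrator.
REQUIRED_DOC_TYPES = ("application_form", "id_document", "financial_statement")


def normalize_file_paths(file_paths: Union[Dict[str, str], List[str]]) -> List[str]:
    """Return a flat list of paths: required documents first, credit report last."""
    if not isinstance(file_paths, dict):
        return list(file_paths)
    ordered = [file_paths[key] for key in REQUIRED_DOC_TYPES if key in file_paths]
    if "credit_report" in file_paths:
        ordered.append(file_paths["credit_report"])
    return ordered


def find_credit_report(paths: List[str]) -> Optional[str]:
    """Return the first .json path whose name mentions 'credit', else None."""
    for path in paths:
        if isinstance(path, str) and path.endswith(".json") and "credit" in path.lower():
            return path
    return None


# Sample file names are made up for this example.
uploads = {
    "credit_report": "credit_report_0.json",
    "id_document": "id_0.png",
    "application_form": "application_0.pdf",
    "financial_statement": "statement_0.txt",
}
paths = normalize_file_paths(uploads)
print(paths)
print(find_credit_report(paths))
```

Because the credit report is detected purely by file name, a report saved under a name without `credit` in it would be ignored and the standard assessment would run instead.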

## Create the Interface

In [36]:
# Button-driven interface (single mock application or batch mode) with separate log and report areas
def create_direct_upload_interface():
    """Create a simple interface for processing loan applications."""
    
    # Display header
    display(HTML("""
    <div style="background-color: #4CAF50; color: white; padding: 20px; border-radius: 10px; margin-bottom: 20px; text-align: center;">
        <h1 style="margin: 0;">Loan Processing AI System</h1>
        <p style="margin: 5px 0 0 0;">Professional loan application processing with multi-agent AI</p>
    </div>
    """))
    
    # Create container
    upload_box = widgets.VBox([
        widgets.HTML("""
        <div style="background-color: #f8f9fa; padding: 15px; border-radius: 5px; margin-bottom: 20px;">
            <h3 style="margin-top: 0;">Choose an option:</h3>
            <p>Select how you want to process loan applications:</p>
        </div>
        """)
    ])
    
    # Create buttons for different options
    mock_data_button = widgets.Button(
        description='Use Mock Data',
        button_style='primary',
        icon='database',
        tooltip='Process a single loan application with mock data'
    )
    
    batch_button = widgets.Button(
        description='Batch Processing (3 applications)',
        button_style='warning',
        icon='tasks',
        tooltip='Process multiple applications with comparative analysis'
    )
    
    # Create output areas - keep log and report separate
    log_area = widgets.Output(layout=widgets.Layout(max_height='300px', overflow='auto'))
    report_area = widgets.Output()
    
    # Button callbacks
    def use_mock_data(b):
        with log_area:
            clear_output()
            print("🔄 Generating mock data...")
            
            # Generate mock data
            mock_data = generate_realistic_mock_data()
            
            # Get file paths
            file_paths = [
                mock_data["application_forms"][0],
                mock_data["id_documents"][0],
                mock_data["financial_statements"][0],
                mock_data["credit_reports"][0]
            ]
            
            print("🔄 Starting loan application processing...")
            
            # Process the application
            orchestrator = LoanProcessingOrchestrator()
            results = orchestrator.process_loan_application(file_paths)
            
            if results:
                print("\n✅ Loan application processed successfully!")
                print("📊 Generating report...")
                
                # Check for visualization image
                found_image = False
                for file in os.listdir('.'):
                    if file.startswith('loan_metrics') and file.endswith('.png'):
                        print(f"✅ Found visualization: {file}")
                        found_image = True
                        break
                if not found_image:
                    print("⚠️ No loan_metrics*.png visualization found in the working directory.")
                
                # Display the report using proper IFrame
                with report_area:
                    clear_output()
                    print("Loan Processing Report:")
                    if os.path.exists("loan_processing_report.html"):
                        display(IFrame("loan_processing_report.html", width='100%', height=600))
                    else:
                        print("❌ Report file not found.")
            else:
                print("❌ Error processing loan application.")
    
    def run_batch_processing(b):
        with log_area:
            clear_output()
            print("🔄 Generating mock data for batch processing...")
            
            # Generate mock data
            mock_data = generate_realistic_mock_data()
            
            # Make sure we have policy data
            if not os.path.exists("lending_policies.json"):
                print("Creating lending policies...")
                create_mock_lending_policies()
            
            # Get all available applications
            available_apps = len(mock_data["application_forms"])
            print(f"Found {available_apps} applications available for processing")
            
            # Create reports directory if it doesn't exist
            reports_dir = "batch_reports"
            if not os.path.exists(reports_dir):
                os.makedirs(reports_dir)
                print(f"Created reports directory: {reports_dir}")
            
            # Process all available applications
            report_paths = []
            for i in range(available_apps):
                print(f"\n🔄 Processing application {i+1}...")
                
                # Get file paths
                file_paths = [
                    mock_data["application_forms"][i],
                    mock_data["id_documents"][i],
                    mock_data["financial_statements"][i],
                    mock_data["credit_reports"][i]
                ]
                
                # Verify all files exist
                all_exist = all(os.path.exists(path) for path in file_paths)
                if not all_exist:
                    print(f"⚠️ Some files for application {i+1} don't exist:")
                    for path in file_paths:
                        if not os.path.exists(path):
                            print(f"  Missing: {path}")
                    continue
                
                # Process the application with a fresh orchestrator
                try:
                    orchestrator = LoanProcessingOrchestrator()
                    results = orchestrator.process_loan_application(file_paths)
                    
                    if results:
                        print(f"✅ Application {i+1} processed successfully!")
                        
                        # Create a unique report name
                        report_name = f"{reports_dir}/loan_report_{i+1}.html"
                        
                        # If the original report exists, make a copy with the new name
                        if os.path.exists("loan_processing_report.html"):
                            # Find any visualization images
                            visualization_file = None
                            for file in os.listdir('.'):
                                if file.startswith('loan_metrics') and file.endswith('.png'):
                                    visualization_file = file
                                    # Copy the visualization to the reports directory
                                    viz_dest = f"{reports_dir}/{file.replace('.png', f'_{i+1}.png')}"
                                    shutil.copy(file, viz_dest)
                                    print(f"✅ Copied visualization to {viz_dest}")
                                    break
                            
                            # Read the HTML report content
                            with open("loan_processing_report.html", "r") as src:
                                report_content = src.read()
                            
                            # Update report title to indicate which application it is
                            report_content = report_content.replace(
                                "<title>Loan Processing Report</title>",
                                f"<title>Loan Report {i+1}</title>"
                            )
                            
                            # Add a header to distinguish the report
                            report_content = report_content.replace(
                                '<div class="header d-flex justify-content-between align-items-center mb-4">',
                                f'<div class="header d-flex justify-content-between align-items-center mb-4">'
                                f'<div style="background-color: #007bff; color: white; padding: 5px 10px; '
                                f'position: absolute; top: 0; right: 0; border-radius: 0 0 0 5px;">Application {i+1}</div>'
                            )
                            
                            # Update visualization path if needed
                            if visualization_file:
                                # Create the correct relative path for the copied visualization
                                viz_in_report_dir = visualization_file.replace('.png', f'_{i+1}.png')
                                
                                # Check if image tag exists in the HTML
                                img_pattern = r'<img src=[\'"]([^\'"]*loan_metrics[^\'"]*)[\'"]'
                                if re.search(img_pattern, report_content):
                                    # Update the image path
                                    report_content = re.sub(
                                        img_pattern,
                                        f'<img src="{viz_in_report_dir}"',
                                        report_content
                                    )
                                    print("✅ Updated visualization reference in report")
                                else:
                                    # Try to find visualization container
                                    viz_container_pattern = r'<div class="visualization-container">(.*?)</div>'
                                    viz_container_match = re.search(viz_container_pattern, report_content, re.DOTALL)
                                    
                                    if viz_container_match:
                                        # Replace content with our image
                                        report_content = report_content.replace(
                                            viz_container_match.group(0),
                                            f'<div class="visualization-container">'
                                            f'<img src="{viz_in_report_dir}" alt="Loan Metrics Visualization" '
                                            f'class="img-fluid" style="max-width: 100%;">'
                                            f'</div>'
                                        )
                                        print("✅ Added visualization to report")
                            
                            # Write the modified report
                            with open(report_name, "w") as dest:
                                dest.write(report_content)
                            
                            report_paths.append(report_name)
                            print(f"📄 Report saved as: {report_name}")
                        else:
                            print("⚠️ Could not find loan_processing_report.html")
                    else:
                        print(f"❌ Error processing application {i+1}")
                except Exception as e:
                    print(f"❌ Error during processing of application {i+1}: {str(e)}")
            
            # Display final status
            if report_paths:
                print(f"\n✅ Processed {len(report_paths)} applications successfully!")
                
                # Create a summary table of all applications
                applicant_info = []
                for i, report_path in enumerate(report_paths):
                    # Extract basic info from the results
                    applicant_name = f"Applicant {i+1}"
                    decision = "Unknown"
                    
                    # Try to extract info from the HTML report
                    try:
                        with open(report_path, 'r') as f:
                            content = f.read()
                            # Extract name (simplified approach)
                            name_match = re.search(r'<p><strong>Name:</strong>\s*(.*?)</p>', content)
                            if name_match:
                                applicant_name = name_match.group(1)
                            
                            # Extract decision
                            if "APPROVED" in content:
                                decision = "APPROVED"
                            elif "REJECTED" in content:
                                decision = "REJECTED"
                            elif "REFERRED" in content:
                                decision = "REFERRED"
                    except:
                        pass
                    
                    applicant_info.append({
                        "id": i+1,
                        "name": applicant_name,
                        "decision": decision,
                        "report": report_path
                    })
                
                # Create and display the summary table
                summary_html = """
                <div style="background-color: #f8f9fa; padding: 15px; border-radius: 5px; margin: 15px 0;">
                    <h3>Batch Processing Summary</h3>
                    <table style="width:100%; border-collapse: collapse;">
                        <thead>
                            <tr style="background-color: #007bff; color: white;">
                                <th style="padding: 8px; text-align: left; border: 1px solid #ddd;">Application</th>
                                <th style="padding: 8px; text-align: left; border: 1px solid #ddd;">Applicant</th>
                                <th style="padding: 8px; text-align: left; border: 1px solid #ddd;">Decision</th>
                            </tr>
                        </thead>
                        <tbody>
                """
                
                for app in applicant_info:
                    # Green for approved, red for rejected, amber for anything else
                    decision_color = {"APPROVED": "#28a745", "REJECTED": "#dc3545"}.get(app["decision"], "#ffc107")
                    summary_html += f"""
                        <tr>
                            <td style="padding: 8px; text-align: left; border: 1px solid #ddd;">{app["id"]}</td>
                            <td style="padding: 8px; text-align: left; border: 1px solid #ddd;">{app["name"]}</td>
                            <td style="padding: 8px; text-align: left; border: 1px solid #ddd; color: {decision_color}; font-weight: bold;">{app["decision"]}</td>
                        </tr>
                    """
                
                summary_html += """
                        </tbody>
                    </table>
                </div>
                """
                
                display(HTML(summary_html))
                
                # Create dropdown for report selection
                report_dropdown = widgets.Dropdown(
                    options=[
                        (f'Application {app["id"]}: {app["name"]} ({app["decision"]})', app["report"]) 
                        for app in applicant_info
                    ],
                    description='View Report:',
                    layout=widgets.Layout(width='50%')
                )
                
                # Function to show selected report
                def on_dropdown_change(change):
                    if change['new']:
                        with report_area:
                            clear_output()
                            display(IFrame(change['new'], width='100%', height=600))
                
                report_dropdown.observe(on_dropdown_change, names='value')
                
                # Display dropdown
                display(report_dropdown)
                
                # Display the first report
                with report_area:
                    clear_output()
                    display(IFrame(report_paths[0], width='100%', height=600))
            else:
                print("❌ No reports were generated. Please check the errors above.")
        
    # Connect button callbacks
    mock_data_button.on_click(use_mock_data)
    batch_button.on_click(run_batch_processing)
    
    # Add buttons to interface with better styling
    button_box = widgets.HBox([mock_data_button, batch_button], 
                             layout=widgets.Layout(justify_content='space-around', 
                                                 padding='10px'))
    
    # Add output areas with fixed sizes
    processing_section = widgets.VBox([
        widgets.HTML('<div style="margin-top: 20px; border-top: 1px solid #eee; padding-top: 10px;"><h3>Processing Log:</h3></div>'),
        log_area
    ])
    
    report_section = widgets.VBox([
        widgets.HTML('<div style="margin-top: 20px; border-top: 1px solid #eee; padding-top: 10px;"><h3>Report Preview:</h3></div>'),
        report_area
    ])
    
    # Assemble interface with fixed layout
    upload_box.children = list(upload_box.children) + [
        button_box, 
        processing_section,
        report_section
    ]
    
    # Display the interface
    display(upload_box)
    
    return log_area, report_area

# Run the improved interface
log_area, report_area = create_direct_upload_interface()

VBox(children=(HTML(value='\n        <div style="background-color: #f8f9fa; padding: 15px; border-radius: 5px;…
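
The batch handler rewrites each copied report so that its image tag points at the per-application copy of the visualization. The sketch below isolates that step so it can be checked without running the pipeline. The function name `retarget_visualization` and the sample HTML are assumptions for illustration; the regular expression is the `img_pattern` used in `run_batch_processing`.

```python
import re

# Same pattern as img_pattern in run_batch_processing.
IMG_SRC_PATTERN = re.compile(r'<img src=[\'"]([^\'"]*loan_metrics[^\'"]*)[\'"]')


def retarget_visualization(report_html: str, new_src: str) -> str:
    """Point every loan_metrics <img> tag at `new_src`; other tags are untouched."""
    # A function replacement keeps backslashes in new_src from being read
    # as regex group references.
    return IMG_SRC_PATTERN.sub(lambda _match: f'<img src="{new_src}"', report_html)


# Sample markup is made up for this example.
sample = '<div><img src="loan_metrics.png" class="img-fluid"><img src="logo.png"></div>'
updated = retarget_visualization(sample, "loan_metrics_2.png")
print(updated)
```

Only the `src` value changes; attributes that follow it, such as `class`, are left in place because the pattern stops at the closing quote.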