# Lesson 4: Multi-Step Financial Compliance Workflow - SOLUTION

## Chaining Prompts for Agentic Reasoning with Validation Gates

This is the complete solution for a three-stage prompt chain (data collection, risk analysis, compliance report) for financial compliance analysis. Each stage's output must pass a Pydantic-based gate check before the next stage runs.

In [1]:
# Import necessary libraries
import os
from dotenv import load_dotenv
from openai import OpenAI
from IPython.display import Markdown, display
import json
from typing import List, Dict, Optional, Union
from pydantic import BaseModel, Field, ValidationError, validator
from enum import Enum
import re

# Load environment variables from the root .env file
load_dotenv('../../.env')

True

In [2]:
# Setup OpenAI client
client = OpenAI(
    api_key=os.getenv("OPENAI_API_KEY")
)

def get_completion(system_prompt, user_prompt, model="gpt-4o-mini", temperature=0.3):
    """Function to get a completion from the OpenAI API."""
    try:
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt},
            ],
            temperature=temperature,
        )
        return response.choices[0].message.content
    except Exception as e:
        return f"An error occurred: {e}"

## Pydantic Models for Gate Checks - SOLUTION

In [3]:
# SOLUTION: Complete Pydantic models with validation logic
# Uses Pydantic V2 `@field_validator` (the V1 `@validator` decorator is deprecated)
from pydantic import field_validator

class RiskLevel(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

class CustomerData(BaseModel):
    """Stage 1 Output: Customer information collection with validation"""
    customer_name: str = Field(..., description="Customer name")
    business_type: str = Field(..., description="Type of business")
    account_age_months: int = Field(..., ge=0, description="Account age in months")
    monthly_revenue: float = Field(..., ge=0, description="Monthly revenue in USD")
    geographic_locations: List[str] = Field(..., description="Operating locations")

    @field_validator('customer_name')
    @classmethod
    def name_must_not_be_empty(cls, v):
        # Reject empty or whitespace-only names, then normalize whitespace
        if not v.strip():
            raise ValueError('Customer name cannot be empty')
        return v.strip()

class RiskAssessment(BaseModel):
    """Stage 2 Output: Risk analysis with validation gates"""
    customer_data: CustomerData
    risk_level: RiskLevel = Field(..., description="Overall risk assessment")
    risk_factors: List[str] = Field(..., description="Identified risk factors")
    risk_score: float = Field(..., ge=0, le=100, description="Risk score 0-100")
    mitigation_factors: List[str] = Field(..., description="Risk mitigation factors")

    @field_validator('risk_factors')
    @classmethod
    def must_have_risk_factors(cls, v):
        # An empty list means the model skipped the analysis
        if not v:
            raise ValueError('At least one risk factor must be identified')
        return v

class ComplianceReport(BaseModel):
    """Stage 3 Output: Final compliance report with script generation"""
    risk_assessment: RiskAssessment
    compliance_status: str = Field(..., description="Compliance determination")
    regulatory_requirements: List[str] = Field(..., description="Applicable regulations")
    recommendations: List[str] = Field(..., description="Action recommendations")
    monitoring_script: Optional[str] = Field(None, description="Generated monitoring script")

    @field_validator('compliance_status')
    @classmethod
    def valid_compliance_status(cls, v):
        # Normalize case and restrict to the statuses named in the Stage 3 prompt
        valid_statuses = ["compliant", "non-compliant", "requires_review", "pending_documentation"]
        if v.lower() not in valid_statuses:
            raise ValueError(f'Compliance status must be one of: {valid_statuses}')
        return v.lower()

print("✅ Pydantic models defined for gate checks...")

✅ Pydantic models defined for gate checks...
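
To see what a gate check rejects, the following minimal sketch runs a good and a bad payload through a trimmed-down model. `DemoCustomer` and `gate_check` are hypothetical names used only for illustration; they mirror the field constraints of `CustomerData` but are not part of the solution.

```python
from typing import List, Optional, Tuple
from pydantic import BaseModel, Field, ValidationError

class DemoCustomer(BaseModel):
    """Hypothetical, trimmed-down stand-in for CustomerData."""
    customer_name: str
    account_age_months: int = Field(..., ge=0)
    monthly_revenue: float = Field(..., ge=0)
    geographic_locations: List[str]

def gate_check(payload: dict) -> Tuple[Optional[DemoCustomer], List[str]]:
    """Return (model, []) if the payload passes, else (None, error messages)."""
    try:
        return DemoCustomer(**payload), []
    except ValidationError as exc:
        messages = []
        # exc.errors() yields one dict per failing field
        for err in exc.errors():
            field = ".".join(str(part) for part in err["loc"])
            messages.append(f"{field}: {err['msg']}")
        return None, messages

good_payload = {
    "customer_name": "Global Import Solutions LLC",
    "account_age_months": 18,
    "monthly_revenue": 850000,
    "geographic_locations": ["United States", "Singapore"],
}
# Negative account age and a non-numeric revenue should both be rejected
bad_payload = {**good_payload, "account_age_months": -3, "monthly_revenue": "lots"}

for name, payload in [("good", good_payload), ("bad", bad_payload)]:
    model, errors = gate_check(payload)
    print(name, "PASSED" if model else f"FAILED: {errors}")
```

Returning the error messages instead of only printing them makes it possible to feed them back into a retry prompt, should you want to extend the workflow that way.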



In [4]:
# Customer scenario for testing
customer_scenario = """
Customer Information Request:

Global Import Solutions LLC is a customer requesting enhanced service limits. 
They operate an international trade business importing electronics from Asia.
The company has been banking with us for 18 months and reports monthly revenue 
of approximately $850,000. They have operations in the United States, Singapore, 
and have suppliers in China, Taiwan, and South Korea.

Recent activity shows:
- Large wire transfers ($100K-300K) multiple times per month
- Transactions with various Asian suppliers
- Some documentation gaps in supplier verification
- Clean banking history with no previous compliance issues
- Rapid business growth (300% revenue increase in past year)

Please conduct a comprehensive compliance analysis.
"""

## Stage 1: Data Collection - SOLUTION

In [5]:
# SOLUTION: Complete Stage 1 implementation
stage1_prompt = """You are a financial data analyst. Extract customer information from the provided text and return it in a structured JSON format.

Focus on extracting:
- Customer name and business type
- Account age (convert to months if given in years)
- Monthly revenue (extract numeric value)
- Geographic locations and operations

Be precise with numbers and ensure all required fields are captured.

Return ONLY a JSON object with this exact structure:
{
  "customer_name": "string",
  "business_type": "string",
  "account_age_months": number,
  "monthly_revenue": number,
  "geographic_locations": ["string"]
}"""

def execute_stage1(scenario_text):
    """Execute Stage 1: Data Collection with Gate Check"""
    print("=== STAGE 1: DATA COLLECTION ===")
    
    response = get_completion(stage1_prompt, scenario_text)
    print(f"Raw AI Response:\n{response}\n")
    
    try:
        # Extract JSON from response
        json_match = re.search(r'\{.*\}', response, re.DOTALL)
        if json_match:
            json_data = json.loads(json_match.group())
            
            # Gate Check: Validate with Pydantic
            customer_data = CustomerData(**json_data)
            print("✅ Stage 1 Gate Check: PASSED")
            return customer_data
        else:
            raise ValueError("No valid JSON found in response")
            
    except (json.JSONDecodeError, ValidationError, ValueError) as e:
        # ValueError covers the "No valid JSON found" case raised in the try block
        print(f"❌ Stage 1 Gate Check: FAILED - {e}")
        return None

# Test Stage 1
stage1_result = execute_stage1(customer_scenario)
if stage1_result:
    print(f"\n📊 Stage 1 Output: {stage1_result}")

=== STAGE 1: DATA COLLECTION ===
Raw AI Response:
{
  "customer_name": "Global Import Solutions LLC",
  "business_type": "international trade",
  "account_age_months": 18,
  "monthly_revenue": 850000,
  "geographic_locations": ["United States", "Singapore", "China", "Taiwan", "South Korea"]
}

✅ Stage 1 Gate Check: PASSED

📊 Stage 1 Output: customer_name='Global Import Solutions LLC' business_type='international trade' account_age_months=18 monthly_revenue=850000.0 geographic_locations=['United States', 'Singapore', 'China', 'Taiwan', 'South Korea']
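
The stages locate JSON with the greedy pattern `\{.*\}`, which spans from the first `{` to the last `}` in the response. That works for the clean responses shown here but can capture too much when the model adds commentary containing braces. As a hedged alternative, the sketch below uses the standard library's `json.JSONDecoder.raw_decode` to parse the first decodable object; `extract_first_json_object` is a hypothetical helper, not part of the solution.

```python
import json

def extract_first_json_object(text: str) -> dict:
    """Return the first JSON object that can be decoded from `text`.

    Raises ValueError if no opening brace starts a valid JSON object.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever text follows it
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # This brace did not start valid JSON; try the next one
            start = text.find("{", start + 1)
    raise ValueError("No valid JSON object found in response")

# A response with trailing commentary that contains braces
noisy_response = (
    'Sure! Here is the result: {"customer_name": "Acme LLC", '
    '"account_age_months": 18} (values in {braces} are placeholders)'
)
print(extract_first_json_object(noisy_response))
```

Because the helper raises `ValueError`, a stage using it would need to catch that exception alongside `ValidationError` to keep returning `None` on failure.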


## Stage 2: Risk Analysis - SOLUTION

In [6]:
# SOLUTION: Complete Stage 2 implementation
stage2_prompt = """You are a risk assessment specialist. Analyze the customer data and provide a comprehensive risk evaluation.

Evaluate these risk factors:
- Account age and relationship history (newer accounts = higher risk)
- Transaction volume relative to business size (unusually high volumes = risk)
- Geographic risk (operations in high-risk jurisdictions)
- Business model and industry risk (cash-intensive businesses = higher risk)
- Documentation and compliance gaps
- Rapid growth patterns (potential money laundering indicator)

Risk Level Guidelines:
- LOW (0-30): Established business, clear documentation, low-risk jurisdictions
- MEDIUM (31-70): Some risk factors but manageable with proper controls
- HIGH (71-100): Multiple risk factors requiring enhanced due diligence

Return ONLY a JSON object with this exact structure:
{
  "risk_level": "low|medium|high",
  "risk_factors": ["list of specific risk factors identified"],
  "risk_score": number_between_0_and_100,
  "mitigation_factors": ["list of factors that reduce risk"]
}"""

def execute_stage2(customer_data: CustomerData):
    """Execute Stage 2: Risk Analysis with Gate Check"""
    print("\n=== STAGE 2: RISK ANALYSIS ===")
    
    stage2_input = f"""Customer Data for Risk Analysis:
Customer: {customer_data.customer_name}
Business: {customer_data.business_type}
Account Age: {customer_data.account_age_months} months
Monthly Revenue: ${customer_data.monthly_revenue:,.2f}
Locations: {', '.join(customer_data.geographic_locations)}

Please conduct a comprehensive risk assessment considering all these factors."""
    
    response = get_completion(stage2_prompt, stage2_input)
    print(f"Raw AI Response:\n{response}\n")
    
    try:
        json_match = re.search(r'\{.*\}', response, re.DOTALL)
        if json_match:
            json_data = json.loads(json_match.group())
            json_data['customer_data'] = customer_data.model_dump()
            
            risk_assessment = RiskAssessment(**json_data)
            print("✅ Stage 2 Gate Check: PASSED")
            return risk_assessment
        else:
            raise ValueError("No valid JSON found in response")
            
    except (json.JSONDecodeError, ValidationError, ValueError) as e:
        # ValueError covers the "No valid JSON found" case raised in the try block
        print(f"❌ Stage 2 Gate Check: FAILED - {e}")
        return None

# Test Stage 2
if stage1_result:
    stage2_result = execute_stage2(stage1_result)
    if stage2_result:
        print(f"\n📊 Stage 2 Output: Risk Level = {stage2_result.risk_level.value}, Score = {stage2_result.risk_score}")


=== STAGE 2: RISK ANALYSIS ===
Raw AI Response:
{
  "risk_level": "medium",
  "risk_factors": [
    "Account age of 18 months indicates a newer account",
    "Monthly revenue of $850,000.00 may be considered unusually high relative to business size",
    "Operations in high-risk jurisdictions (China, Singapore)",
    "Industry risk associated with international trade and potential cash-intensive transactions"
  ],
  "risk_score": 55,
  "mitigation_factors": [
    "Established business model in international trade",
    "Presence in the United States, which is a low-risk jurisdiction",
    "Potential for strong compliance and documentation practices"
  ]
}

✅ Stage 2 Gate Check: PASSED

📊 Stage 2 Output: Risk Level = medium, Score = 55.0
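
The Stage 2 prompt ties each risk level to a score band, but `RiskAssessment` validates `risk_level` and `risk_score` separately, so a response such as `"low"` with a score of 62 would still pass the gate. The sketch below shows one way to check the two fields against each other with a Pydantic V2 `model_validator`. `ScoredRisk` and `expected_level` are hypothetical names, and treating fractional scores such as 30.5 as medium is an assumption, since the prompt only lists whole-number bands.

```python
from enum import Enum
from pydantic import BaseModel, Field, ValidationError, model_validator

class RiskLevel(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

def expected_level(score: float) -> RiskLevel:
    """Map a 0-100 score to the band named in the Stage 2 prompt."""
    if score <= 30:
        return RiskLevel.LOW
    if score <= 70:  # assumption: anything above 30 and up to 70 is medium
        return RiskLevel.MEDIUM
    return RiskLevel.HIGH

class ScoredRisk(BaseModel):
    """Hypothetical subset of RiskAssessment with a cross-field check."""
    risk_level: RiskLevel
    risk_score: float = Field(..., ge=0, le=100)

    @model_validator(mode="after")
    def level_matches_score(self):
        # Runs after both fields have passed their individual checks
        expected = expected_level(self.risk_score)
        if self.risk_level != expected:
            raise ValueError(
                f"risk_level '{self.risk_level.value}' does not match "
                f"risk_score {self.risk_score} (expected '{expected.value}')"
            )
        return self

for payload in [{"risk_level": "medium", "risk_score": 55},
                {"risk_level": "low", "risk_score": 62}]:
    try:
        ScoredRisk(**payload)
        print(payload, "-> PASSED")
    except ValidationError as exc:
        print(payload, "-> FAILED:", exc.errors()[0]["msg"])
```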




## Stage 3: Compliance Report - SOLUTION

In [7]:
# SOLUTION: Complete Stage 3 implementation
stage3_prompt = """You are a compliance officer generating final reports and monitoring recommendations.

Based on the risk assessment, provide:
- Compliance status determination (be specific about current status)
- Applicable regulatory requirements (name specific regulations)
- Specific action recommendations (concrete next steps)
- Python monitoring script for ongoing oversight (if risk level medium/high)

Compliance Status Guidelines:
- "compliant": Low risk, all documentation complete
- "requires_review": Medium risk, additional due diligence needed
- "pending_documentation": Missing required documentation
- "non-compliant": High risk or regulatory violations

For monitoring scripts, focus on:
- Transaction pattern monitoring
- Volume threshold alerts
- Geographic risk tracking

Return ONLY a JSON object with this exact structure:
{
  "compliance_status": "compliant|non-compliant|requires_review|pending_documentation",
  "regulatory_requirements": ["list of applicable regulations"],
  "recommendations": ["list of specific actions to take"],
  "monitoring_script": "Python script code for monitoring (optional)"
}"""

def execute_stage3(risk_assessment: RiskAssessment):
    """Execute Stage 3: Compliance Report with Gate Check"""
    print("\n=== STAGE 3: COMPLIANCE REPORT ===")
    
    stage3_input = f"""Risk Assessment Summary:
Customer: {risk_assessment.customer_data.customer_name}
Risk Level: {risk_assessment.risk_level.value}
Risk Score: {risk_assessment.risk_score}/100
Risk Factors: {', '.join(risk_assessment.risk_factors)}
Mitigation Factors: {', '.join(risk_assessment.mitigation_factors)}

Generate compliance report and monitoring recommendations based on this assessment."""
    
    response = get_completion(stage3_prompt, stage3_input)
    print(f"Raw AI Response:\n{response}\n")
    
    try:
        json_match = re.search(r'\{.*\}', response, re.DOTALL)
        if json_match:
            json_data = json.loads(json_match.group())
            json_data['risk_assessment'] = risk_assessment.model_dump()
            
            compliance_report = ComplianceReport(**json_data)
            print("✅ Stage 3 Gate Check: PASSED")
            return compliance_report
        else:
            raise ValueError("No valid JSON found in response")
            
    except (json.JSONDecodeError, ValidationError, ValueError) as e:
        # ValueError covers the "No valid JSON found" case raised in the try block
        print(f"❌ Stage 3 Gate Check: FAILED - {e}")
        return None

# Test Stage 3
if 'stage2_result' in locals() and stage2_result:
    stage3_result = execute_stage3(stage2_result)
    if stage3_result:
        print(f"\n📊 Stage 3 Output: Status = {stage3_result.compliance_status}")


=== STAGE 3: COMPLIANCE REPORT ===
Raw AI Response:
{
  "compliance_status": "requires_review",
  "regulatory_requirements": ["Bank Secrecy Act (BSA)", "USA PATRIOT Act", "FinCEN regulations"],
  "recommendations": [
    "Conduct enhanced due diligence on the customer due to high monthly revenue and operations in high-risk jurisdictions.",
    "Review transaction history for unusual patterns or anomalies.",
    "Ensure all documentation related to the business model and compliance practices is up to date and complete.",
    "Implement ongoing training for staff on compliance related to international trade and cash-intensive transactions."
  ],
  "monitoring_script": "import pandas as pd\nimport numpy as np\n\n# Load transaction data\ntransactions = pd.read_csv('transactions.csv')\n\n# Function to monitor transaction patterns\ndef monitor_transactions(transactions):\n    # Identify high volume transactions\n    high_volume_threshold = 1000000  # Set threshold for alerts\n    high_volum
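
The gate accepts `monitoring_script` as any optional string, so a truncated or malformed script would pass validation. A lightweight extra check, sketched below under that assumption, is to parse the script with the standard library's `ast.parse`. This verifies syntax only; it does not run the script or judge whether its logic is sound. `script_syntax_error` is a hypothetical helper name.

```python
import ast
from typing import Optional

def script_syntax_error(script: Optional[str]) -> Optional[str]:
    """Return None if `script` is absent or parses as Python, else an error message."""
    if script is None or not script.strip():
        return None  # the field is optional, so a missing script is acceptable
    try:
        ast.parse(script)  # parses only; nothing is executed
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return None

valid_script = (
    "def monitor(transactions, threshold=1000000):\n"
    "    return [t for t in transactions if t['amount'] > threshold]\n"
)
# A script cut off mid-statement, similar to a truncated model response
truncated_script = "def monitor(transactions):\n    if not h"

print("valid:", script_syntax_error(valid_script))
print("truncated:", script_syntax_error(truncated_script))
```

A check like this could be called from a validator on `monitoring_script`, or run after Stage 3 before the script is handed to a reviewer.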



## Complete Workflow - SOLUTION

In [8]:
# SOLUTION: Complete workflow orchestration
def execute_complete_workflow(scenario_text):
    """Execute the complete three-stage prompt chain workflow"""
    print("🚀 STARTING COMPLETE WORKFLOW\n")
    
    # Stage 1: Data Collection
    stage1_result = execute_stage1(scenario_text)
    if not stage1_result:
        print("❌ Workflow failed at Stage 1")
        return None
    
    # Stage 2: Risk Analysis
    stage2_result = execute_stage2(stage1_result)
    if not stage2_result:
        print("❌ Workflow failed at Stage 2")
        return None
    
    # Stage 3: Compliance Report
    stage3_result = execute_stage3(stage2_result)
    if not stage3_result:
        print("❌ Workflow failed at Stage 3")
        return None
    
    print("\n🎉 WORKFLOW COMPLETED SUCCESSFULLY!")
    return stage3_result

# Test complete workflow
final_result = execute_complete_workflow(customer_scenario)
if final_result:
    print("\n=== FINAL COMPLIANCE REPORT ===")
    print(f"Status: {final_result.compliance_status}")
    print(f"Risk Level: {final_result.risk_assessment.risk_level.value}")
    print(f"Risk Score: {final_result.risk_assessment.risk_score}/100")
    print(f"Recommendations: {final_result.recommendations}")
    if final_result.monitoring_script:
        print(f"\nGenerated Monitoring Script:\n{final_result.monitoring_script}")

🚀 STARTING COMPLETE WORKFLOW

=== STAGE 1: DATA COLLECTION ===
Raw AI Response:
{
  "customer_name": "Global Import Solutions LLC",
  "business_type": "international trade",
  "account_age_months": 18,
  "monthly_revenue": 850000,
  "geographic_locations": ["United States", "Singapore", "China", "Taiwan", "South Korea"]
}

✅ Stage 1 Gate Check: PASSED

=== STAGE 2: RISK ANALYSIS ===
Raw AI Response:
{
  "risk_level": "medium",
  "risk_factors": [
    "Account age of 18 months indicates a newer account",
    "Monthly revenue of $850,000 is unusually high relative to typical business size",
    "Operations in high-risk jurisdictions (China, Singapore)",
    "International trade business model can involve cash-intensive transactions"
  ],
  "risk_score": 62,
  "mitigation_factors": [
    "Established business model with a focus on international trade",
    "Presence in the United States, which has strong regulatory frameworks",
    "Potential for established relationships with suppliers a



Raw AI Response:
{
  "compliance_status": "requires_review",
  "regulatory_requirements": ["Bank Secrecy Act (BSA)", "USA PATRIOT Act", "Office of Foreign Assets Control (OFAC) regulations"],
  "recommendations": [
    "Conduct enhanced due diligence on the customer's business model and transaction patterns.",
    "Review the source of funds for the unusually high monthly revenue.",
    "Monitor transactions related to high-risk jurisdictions more closely.",
    "Establish a regular review schedule for compliance documentation and risk assessment updates."
  ],
  "monitoring_script": "import pandas as pd\nimport numpy as np\n\n# Sample transaction data\ntransactions = pd.read_csv('transactions.csv')\n\n# Function to monitor transaction patterns\n\ndef monitor_transactions(transactions):\n    # Check for unusual transaction volumes\n    volume_threshold = 1000000  # Set volume threshold\n    high_volume_transactions = transactions[transactions['amount'] > volume_threshold]\n    if not h
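
The workflow above stops at the first failed gate. Because model output varies between calls, one possible extension is to retry a stage a few times before giving up. The sketch below relies only on the convention that each stage function returns `None` on failure; `run_with_retries` and the simulated `flaky_stage` are hypothetical and not part of the solution.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

def run_with_retries(stage: Callable[[], Optional[T]], max_attempts: int = 3) -> Optional[T]:
    """Call `stage` until it returns a non-None result or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        result = stage()
        if result is not None:
            return result
        print(f"Gate check failed (attempt {attempt}/{max_attempts})")
    return None

# Simulated stage that fails its gate once, then succeeds
calls = {"count": 0}

def flaky_stage() -> Optional[dict]:
    calls["count"] += 1
    return None if calls["count"] < 2 else {"status": "ok"}

print(run_with_retries(flaky_stage))
```

In the notebook, a stage could be wrapped as `run_with_retries(lambda: execute_stage1(customer_scenario))`. Each retry is a new API call, so `max_attempts` should stay small.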



## Summary

This solution demonstrates a complete three-stage prompt chain with Pydantic validation gates:

### Key Implementation Features:
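- **Structured Data Flow**: Each stage returns a validated Pydantic model that becomes the next stage's input
- **Fail-Fast Error Handling**: A stage that fails its gate returns `None` and the workflow stops, so malformed output never reaches a later stage
- **Quality Gates**: Pydantic models enforce field types and ranges (for example, `risk_score` between 0 and 100) between stages
- **Custom Validators**: Field validators enforce business rules: a non-empty customer name, at least one risk factor, and an allowed compliance status
- **Script Generation**: Stage 3 can return a Python monitoring script as a string; the gate does not execute or inspect it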

### Production Benefits:
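- **Reliability**: Every stage output is schema-checked before use, so downstream code can rely on its field types and ranges
- **Maintainability**: Each stage has its own prompt, model, and function, so stages can be changed independently
- **Scalability**: Further stages can be added by defining another model and stage function
- **Auditability**: Each stage prints its raw response and gate result, leaving a clear trail of checkpoints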

This pattern gives you a solid foundation for building multi-step AI reasoning workflows for financial compliance! 🎉