# Lesson 4: Multi-Step Financial Compliance Workflow

## Chaining Prompts for Agentic Reasoning with Validation Gates

In this hands-on exercise, you will implement a three-stage prompt chain with Pydantic-based gate checks for financial compliance analysis. You'll learn to design multi-step AI workflows in which each prompt's output becomes the next prompt's input, with programmatic validation between steps.

## Learning Objectives

By the end of this exercise, you will be able to:
- Design multi-step AI workflows with prompt chaining
- Implement Pydantic-based validation gates between workflow stages
- Build robust error handling for complex AI systems
- Create automated script generation capabilities
- Apply agentic reasoning techniques to financial compliance scenarios

## Setup Instructions

Before starting this exercise:

1. **Install Required Packages**: Run `pip install -r ../../requirements.txt` in your terminal
2. **Configure API Key**: 
   - Open the `.env` file in the root directory
   - Replace `your_openai_api_key_here` with your actual OpenAI API key
   - Save the file
3. **Verify Setup**: Run the import and setup cells below to ensure everything works

**Note**: This exercise uses Pydantic for data validation between prompt chain stages.

In [None]:
# Import necessary libraries
import os
from dotenv import load_dotenv
from openai import OpenAI
from IPython.display import Markdown, display
import json
from typing import List, Dict, Optional, Union
from pydantic import BaseModel, Field, ValidationError, validator
from enum import Enum
import re

# Load environment variables from the root .env file
load_dotenv('../../.env')

In [None]:
# Setup OpenAI client for Vocareum environment
client = OpenAI(
    base_url="https://openai.vocareum.com/v1",
    api_key=os.getenv("OPENAI_API_KEY")
)

def get_completion(system_prompt, user_prompt, model="gpt-4o-mini"):
    """
    Function to get a completion from the OpenAI API.
    Args:
        system_prompt: The system prompt
        user_prompt: The user prompt  
        model: The model to use (default is gpt-4o-mini)
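    Returns:
        The completion text, or a string starting with "An error occurred:"
        if the API call fails (this function does not raise)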
    """
    try:
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt},
            ],
            temperature=0.7,
        )
        return response.choices[0].message.content
    except Exception as e:
        return f"An error occurred: {e}"

## Pydantic Models for Gate Checks

First, let's define the data models that will validate outputs between each stage of our prompt chain.

**Your Task**: Complete the Pydantic models with proper validation logic.
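
As a reference for the validator pattern, here is a minimal sketch on a hypothetical `TransferRecord` model that is not part of the exercise. The model name, its fields, and the `ALLOWED_CHANNELS` values are assumptions made for illustration. It uses the same `@validator` decorator this notebook imports; Pydantic 2.x still accepts it but prefers `field_validator`.

```python
from pydantic import BaseModel, Field, ValidationError, validator

# Hypothetical allowed values, chosen only for this illustration
ALLOWED_CHANNELS = {"wire", "ach", "card"}

class TransferRecord(BaseModel):
    """Hypothetical model used to illustrate validators (not an exercise model)."""
    sender: str = Field(..., description="Sender name")
    amount_usd: float = Field(..., ge=0, description="Amount in USD")
    channel: str = Field(..., description="Payment channel")

    @validator("sender")
    def sender_must_not_be_blank(cls, v):
        if not v.strip():
            raise ValueError("sender must not be blank")
        # A validator must return the (possibly cleaned) value
        return v.strip()

    @validator("channel")
    def channel_must_be_known(cls, v):
        if v not in ALLOWED_CHANNELS:
            raise ValueError(f"channel must be one of {sorted(ALLOWED_CHANNELS)}")
        return v

def gate_check(payload: dict):
    """Return a validated TransferRecord, or None if the payload is rejected."""
    try:
        return TransferRecord(**payload)
    except ValidationError as exc:
        print(f"Gate check failed: {exc}")
        return None

record = gate_check({"sender": "  Acme Ltd ", "amount_usd": 250.0, "channel": "wire"})
print(record)
```

The key detail is the `return` at the end of each validator: a validator that falls through without returning sets the field to `None`.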

In [None]:
# TODO: Complete the Pydantic models with validation logic

class RiskLevel(str, Enum):
    # TODO: Define the risk level enum values
    # Hint: Use "low", "medium", "high" as string values
    pass

class CustomerData(BaseModel):
    """Stage 1 Output: Customer information collection with validation"""
    customer_name: str = Field(..., description="Customer name")
    business_type: str = Field(..., description="Type of business")
    account_age_months: int = Field(..., ge=0, description="Account age in months")
    monthly_revenue: float = Field(..., ge=0, description="Monthly revenue in USD")
    geographic_locations: List[str] = Field(..., description="Operating locations")
    
    # TODO: Add validation for customer_name (should not be empty)
    @validator('customer_name')
    def name_must_not_be_empty(cls, v):
        # TODO: Check if name is empty and raise ValueError if so; otherwise return v
        pass

class RiskAssessment(BaseModel):
    """Stage 2 Output: Risk analysis with validation gates"""
    customer_data: CustomerData
    risk_level: RiskLevel = Field(..., description="Overall risk assessment")
    risk_factors: List[str] = Field(..., description="Identified risk factors")
    risk_score: float = Field(..., ge=0, le=100, description="Risk score 0-100")
    mitigation_factors: List[str] = Field(..., description="Risk mitigation factors")
    
    # TODO: Add validation for risk_factors (should have at least 1 factor)
    @validator('risk_factors')
    def must_have_risk_factors(cls, v):
        # TODO: Ensure at least one risk factor is provided, then return v
        pass

class ComplianceReport(BaseModel):
    """Stage 3 Output: Final compliance report with script generation"""
    risk_assessment: RiskAssessment
    compliance_status: str = Field(..., description="Compliance determination")
    regulatory_requirements: List[str] = Field(..., description="Applicable regulations")
    recommendations: List[str] = Field(..., description="Action recommendations")
    monitoring_script: Optional[str] = Field(None, description="Generated monitoring script")
    
    # TODO: Add validation for compliance_status
    @validator('compliance_status')
    def valid_compliance_status(cls, v):
        # TODO: Ensure status is one of: "compliant", "non-compliant", "requires_review", "pending_documentation"
        # Return v when it is valid (a validator that returns None sets the field to None)
        pass

print("Pydantic models defined for gate checks...")

In [None]:
# Customer scenario for testing
customer_scenario = """
Customer Information Request:

Global Import Solutions LLC is a customer requesting enhanced service limits. 
They operate an international trade business importing electronics from Asia.
The company has been banking with us for 18 months and reports monthly revenue 
of approximately $850,000. They have operations in the United States, Singapore, 
and have suppliers in China, Taiwan, and South Korea.

Recent activity shows:
- Large wire transfers ($100K-300K) multiple times per month
- Transactions with various Asian suppliers
- Some documentation gaps in supplier verification
- Clean banking history with no previous compliance issues
- Rapid business growth (300% revenue increase in past year)

Please conduct a comprehensive compliance analysis.
"""

## Stage 1: Data Collection

First stage of our prompt chain: Extract and validate customer information.

**Your Task**: Create a comprehensive prompt for extracting structured customer data.
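
The stage functions in this notebook pull JSON out of the model's reply with a greedy regex. The sketch below isolates that step in a hypothetical helper, `extract_json` (the name and the sample reply are assumptions), to show how it behaves when the model wraps the JSON in extra prose or when the reply is not text at all.

```python
import json
import re

def extract_json(response):
    """Parse the span from the first '{' to the last '}' in a model reply."""
    if not isinstance(response, str):
        raise ValueError("response is not text; the API call may have failed")
    match = re.search(r"\{.*\}", response, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in response")
    # json.JSONDecodeError is a subclass of ValueError, so callers
    # can catch ValueError to handle every failure mode here
    return json.loads(match.group())

# Made-up reply with prose around the JSON payload
sample = (
    "Sure! Here is the extracted data:\n"
    '{"customer_name": "Acme", "account_age_months": 18}\n'
    "Let me know if you need anything else."
)
print(extract_json(sample))
```

Because the pattern is greedy, it spans from the first opening brace to the last closing brace, so a reply containing two separate JSON objects would fail to parse. Instructing the model to return only JSON reduces that risk.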

In [None]:
stage1_prompt = """
# TODO: Write a prompt that instructs the AI to:
# 1. Act as a financial data analyst
# 2. Extract customer information from text
# 3. Return data in specific JSON format
# 4. Be precise with numbers (convert years to months, extract revenue amounts)
# 
# Required JSON structure:
# {
#   "customer_name": "string",
#   "business_type": "string",
#   "account_age_months": number,
#   "monthly_revenue": number,
#   "geographic_locations": ["string"]
# }
"""

def execute_stage1(scenario_text):
    """Execute Stage 1: Data Collection with Gate Check"""
    print("=== STAGE 1: DATA COLLECTION ===")
    
    # TODO: Call get_completion with your prompt and scenario_text
    response = None  # Replace with get_completion call
    print(f"Raw AI Response:\n{response}\n")
    
    try:
        # Extract JSON from response using regex
        json_match = re.search(r'\{.*\}', response, re.DOTALL)
        
        if json_match:
            # Parse the JSON string into a Python dictionary
            json_data = json.loads(json_match.group())
            
            # TODO: Create CustomerData object from the parsed JSON
            # This validates the data structure and triggers Pydantic validation
            customer_data = None  # Replace with CustomerData instantiation
            
            print("✅ Stage 1 Gate Check: PASSED")
            return customer_data
        else:
            raise ValueError("No valid JSON found in response")
            
    # ValueError is included so the "No valid JSON found" error raised above is caught too
    except (json.JSONDecodeError, ValidationError, ValueError) as e:
        print(f"❌ Stage 1 Gate Check: FAILED - {e}")
        return None

# TODO: Test Stage 1 when your implementation is ready
# stage1_result = execute_stage1(customer_scenario)
# if stage1_result:
#     print(f"\n📊 Stage 1 Output: {stage1_result}")

## Stage 2: Risk Analysis

Second stage: Analyze risk factors using the validated data from Stage 1.

**Your Task**: Create a risk assessment prompt that evaluates multiple risk factors.
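
The Stage 2 prompt template asks the model for both a `risk_score` and a `risk_level`, using the bands 0-30 (low), 31-70 (medium), and 71-100 (high). Nothing forces the two values to agree, so one optional extra gate is a consistency check. The sketch below is one way to write it; the function names are hypothetical, and treating fractional scores such as 30.5 as belonging to the next band up is an assumption, since the guidelines only list whole-number ranges.

```python
def level_for_score(score: float) -> str:
    """Map a 0-100 risk score to the band named in the Stage 2 guidelines."""
    if not 0 <= score <= 100:
        raise ValueError(f"risk score out of range: {score}")
    if score <= 30:
        return "low"
    if score <= 70:
        return "medium"
    return "high"

def level_matches_score(risk_level: str, risk_score: float) -> bool:
    """True when the reported level agrees with the band implied by the score."""
    return level_for_score(risk_score) == risk_level.lower()

print(level_for_score(65))
print(level_matches_score("high", 45))
```

A check like this could live in a validator on `RiskAssessment` or run as a separate step after the model is constructed.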

In [None]:
stage2_prompt = """
# TODO: Write a prompt that instructs the AI to:
# 1. Act as a risk assessment specialist
# 2. Analyze customer data for risk factors
# 3. Consider: account age, transaction volume, geography, business model
# 4. Assign risk scores and levels based on guidelines
# 5. Return data in specific JSON format
#
# Risk Guidelines:
# - LOW (0-30): Established, low-risk
# - MEDIUM (31-70): Some risk factors
# - HIGH (71-100): Multiple risk factors
#
# Required JSON structure:
# {
#   "risk_level": "low|medium|high",
#   "risk_factors": ["list of risks"],
#   "risk_score": number_0_to_100,
#   "mitigation_factors": ["list of mitigations"]
# }
"""

def execute_stage2(customer_data: CustomerData):
    """Execute Stage 2: Risk Analysis with Gate Check"""
    print("\n=== STAGE 2: RISK ANALYSIS ===")
    
    # TODO: Format customer data for the prompt
    # Hint: Create a readable summary of customer_data fields
    stage2_input = f"""Customer Data for Risk Analysis:
Customer: {customer_data.customer_name}
Business: xxxx
Account Age: xxxx months
Monthly Revenue: xxxx
Locations: {', '.join(customer_data.geographic_locations)}

Please conduct a comprehensive risk assessment considering all these factors."""
    
    # TODO: Call get_completion with stage2_prompt and stage2_input
    response = None
    print(f"Raw AI Response:\n{response}\n")
    
    try:
        # Extract JSON from response using regex
        json_match = re.search(r'\{.*\}', response, re.DOTALL)
        
        if json_match:
            # Parse the JSON string into a Python dictionary
            json_data = json.loads(json_match.group())
            
            # Add customer_data to json_data before creating RiskAssessment
            # The RiskAssessment model expects customer_data as a nested object
            # (.dict() is the Pydantic 1.x spelling; Pydantic 2.x deprecates it in favor of .model_dump())
            json_data['customer_data'] = customer_data.dict()
            
            # TODO: Create RiskAssessment object from the enhanced JSON data
            # This validates risk scores, levels, and ensures proper data structure
            risk_assessment = None  # Replace with RiskAssessment instantiation
            print("✅ Stage 2 Gate Check: PASSED")
            return risk_assessment
        else:
            raise ValueError("No valid JSON found in response")
            
    except (json.JSONDecodeError, ValidationError) as e:
        print(f"❌ Stage 2 Gate Check: FAILED - {e}")
        return None

# TODO: Test Stage 2 after Stage 1 is working
# if 'stage1_result' in locals() and stage1_result:
#     stage2_result = execute_stage2(stage1_result)
#     if stage2_result:
#         print(f"\n📊 Stage 2 Output: Risk Level = {stage2_result.risk_level.value}, Score = {stage2_result.risk_score}")

## Stage 3: Compliance Report

Final stage: Generate compliance report and monitoring script based on risk assessment.

**Your Task**: Create a compliance reporting prompt that includes script generation.
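
Since Stage 3 may return generated Python in `monitoring_script`, it can be worth checking that the text at least parses before anyone relies on it. The sketch below uses the standard library's `ast.parse` for that; the helper name `script_is_valid_python` is hypothetical. It checks syntax only: it does not execute the script and says nothing about whether the script is safe or correct.

```python
import ast
from typing import Optional

def script_is_valid_python(script: Optional[str]) -> bool:
    """Return True if the text parses as Python source (syntax check only)."""
    if not script or not script.strip():
        return False
    try:
        ast.parse(script)
    except (SyntaxError, ValueError):
        # ValueError covers source that cannot be parsed at all,
        # such as text containing null bytes on some Python versions
        return False
    return True

good = "def over_threshold(amount):\n    return amount > 100000\n"
bad = "def over_threshold(:\n    return amount\n"
print(script_is_valid_python(good), script_is_valid_python(bad))
```

Generated scripts should still be reviewed by a person before they run against real transaction data.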

In [None]:
stage3_prompt = """You are a compliance officer generating final reports and monitoring recommendations.

Based on the risk assessment, provide:
- Compliance status determination (be specific about current status)
- Applicable regulatory requirements (name specific regulations)
- Specific action recommendations (concrete next steps)
- Python monitoring script for ongoing oversight (if risk level medium/high)

Compliance Status Guidelines:
- "compliant": Low risk, all documentation complete
- "requires_review": Medium risk, additional due diligence needed
- "pending_documentation": Missing required documentation
- "non-compliant": High risk or regulatory violations

For monitoring scripts, focus on:
- Transaction pattern monitoring
- Volume threshold alerts
- Geographic risk tracking

Return ONLY a JSON object with this exact structure:
{
  "compliance_status": "compliant|non-compliant|requires_review|pending_documentation",
  "regulatory_requirements": ["list of applicable regulations"],
  "recommendations": ["list of specific actions to take"],
  "monitoring_script": "Python script code for monitoring (optional)"
}"""

def execute_stage3(risk_assessment: RiskAssessment):
    """Execute Stage 3: Compliance Report with Gate Check"""
    print("\n=== STAGE 3: COMPLIANCE REPORT ===")
    
    stage3_input = f"""Risk Assessment Summary:
Customer: {risk_assessment.customer_data.customer_name}
Risk Level: {risk_assessment.risk_level.value}
Risk Score: {risk_assessment.risk_score}/100
Risk Factors: {', '.join(risk_assessment.risk_factors)}
Mitigation Factors: {', '.join(risk_assessment.mitigation_factors)}

Generate compliance report and monitoring recommendations based on this assessment."""
    
    # TODO: Call get_completion with stage3_prompt and stage3_input
    response = None
    print(f"Raw AI Response:\n{response}\n")
    
    try:
        # Extract JSON from response using regex
        json_match = re.search(r'\{.*\}', response, re.DOTALL)
        
        if json_match:
            # Parse the JSON string into a Python dictionary
            json_data = json.loads(json_match.group())
            
            # TODO: Add risk_assessment to json_data before creating ComplianceReport
            # The ComplianceReport model expects risk_assessment as a nested object
            json_data['risk_assessment'] = None  # Convert risk_assessment to dictionary format
            
            # TODO: Create ComplianceReport object from the enhanced JSON data
            # This validates compliance status and ensures all required fields are present
            compliance_report = None  # Replace with ComplianceReport instantiation
            print("✅ Stage 3 Gate Check: PASSED")
            return compliance_report
        else:
            raise ValueError("No valid JSON found in response")
            
    # ValueError is included so the "No valid JSON found" error raised above is caught too
    except (json.JSONDecodeError, ValidationError, ValueError) as e:
        print(f"❌ Stage 3 Gate Check: FAILED - {e}")
        return None

# TODO: Test Stage 3 after Stages 1 and 2 are working
# if 'stage2_result' in locals() and stage2_result:
#     stage3_result = execute_stage3(stage2_result)
#     if stage3_result:
#         print(f"\n📊 Stage 3 Output: Status = {stage3_result.compliance_status}")

## Complete Workflow

Now let's execute the complete prompt chain with all validation gates.

**Your Task**: Implement the complete three-stage workflow orchestration.
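
The orchestration pattern here is "run each stage in order and stop at the first one that returns `None`". The sketch below shows that control flow in a generic form with toy stages, so it does not give away the exercise. `run_chain` and the stage names are hypothetical.

```python
from typing import Any, Callable, List, Optional, Tuple

Stage = Tuple[str, Callable[[Any], Optional[Any]]]

def run_chain(stages: List[Stage], initial: Any) -> Optional[Any]:
    """Feed each stage's output into the next; stop when a stage returns None."""
    value = initial
    for name, stage in stages:
        value = stage(value)
        if value is None:
            print(f"Workflow failed at {name}")
            return None
    return value

# Toy stages standing in for the three LLM-backed stages
toy_stages = [
    ("parse", lambda text: int(text) if text.isdigit() else None),
    ("double", lambda number: number * 2),
]
print(run_chain(toy_stages, "21"))
print(run_chain(toy_stages, "abc"))
```

The explicit version in the cell below does the same thing with three named calls, which is easier to read when each stage has a different input type.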

In [None]:
def execute_complete_workflow(scenario_text):
    """Execute the complete three-stage prompt chain workflow"""
    print("🚀 STARTING COMPLETE WORKFLOW\n")
    
    # TODO: Execute Stage 1 and handle failures
    # Call execute_stage1() with the scenario text
    stage1_result = None  # Replace with function call
    if not stage1_result:
        print("❌ Workflow failed at Stage 1")
        return None
    
    # TODO: Execute Stage 2 and handle failures
    # Call execute_stage2() with the validated customer data from Stage 1
    stage2_result = None  # Replace with function call
    if not stage2_result:
        print("❌ Workflow failed at Stage 2")
        return None
    
    # TODO: Execute Stage 3 and handle failures
    # Call execute_stage3() with the validated risk assessment from Stage 2
    stage3_result = None  # Replace with function call
    if not stage3_result:
        print("❌ Workflow failed at Stage 3")
        return None
    
    print("\n🎉 WORKFLOW COMPLETED SUCCESSFULLY!")
    return stage3_result

# TODO: Test the complete workflow when all stages are implemented
# final_result = execute_complete_workflow(customer_scenario)
# if final_result:
#     print("\n=== FINAL COMPLIANCE REPORT ===")
#     print(f"Status: {final_result.compliance_status}")
#     print(f"Risk Level: {final_result.risk_assessment.risk_level.value}")
#     print(f"Risk Score: {final_result.risk_assessment.risk_score}/100")
#     print(f"Recommendations: {final_result.recommendations}")
#     if final_result.monitoring_script:
#         print(f"\nGenerated Monitoring Script:\n{final_result.monitoring_script}")

## Reflection Questions

After completing the exercise, consider these questions:

1. **Chain Design**: How does breaking the analysis into three stages improve the quality of the final output compared to a single prompt?

2. **Validation Gates**: What benefits do Pydantic validation gates provide for production AI systems?

3. **Error Handling**: How do validation failures help you improve your prompts and make the system more robust?

4. **Agentic Reasoning**: How does this three-stage approach enable more sophisticated AI reasoning than single-step prompts?

5. **Real-world Application**: How could this pattern be adapted for other financial services use cases?

**Your Task**: Add your reflections in the cell below.

## Your Reflections

**Chain Design Effectiveness:**
- [Add your thoughts on the three-stage design]

**Pydantic Gate Checks:**
- [Add your observations about validation benefits]

**Error Handling Impact:**
- [Add your insights on robustness]

**Agentic Reasoning:**
- [Add your analysis of reasoning capabilities]

**Production Considerations:**
- [Add your thoughts on real-world deployment]

## Summary

In this exercise, you implemented a three-stage prompt chain with Pydantic-based validation gates:

1. **Stage 1 - Data Collection**: Structured customer information extraction with validation
2. **Stage 2 - Risk Analysis**: Multi-factor risk assessment with quality gates
3. **Stage 3 - Compliance Report**: Final analysis with script generation capabilities

### Key Concepts Learned:
- **Prompt Chaining**: Linking outputs from one stage to inputs of the next
- **Validation Gates**: Using Pydantic to ensure data quality between stages
- **Error Recovery**: Graceful handling of validation failures
- **Agentic Reasoning**: Building complex AI workflows through structured steps
- **Production Readiness**: Validation frameworks for reliable financial applications
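
One way to extend the error recovery idea is to retry a stage when its gate check fails, passing the validation error back so the next attempt can correct it. The sketch below is an illustration under assumptions, not part of the exercise: `run_with_retries`, the `generate` and `validate` callables, and the canned replies are all hypothetical, and a real `generate` would call the model with the feedback appended to the prompt.

```python
import json

def run_with_retries(generate, validate, max_attempts=3):
    """Call generate(feedback) until validate() accepts the output or attempts run out."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        raw = generate(feedback)
        try:
            return validate(raw)
        except ValueError as exc:
            # json.JSONDecodeError and Pydantic's ValidationError both derive from ValueError
            feedback = f"Previous output was rejected: {exc}"
            print(f"Attempt {attempt} failed: {exc}")
    return None

# Canned replies standing in for model output: the first is invalid, the second is valid
replies = iter(["not json", '{"risk_score": 42}'])
result = run_with_retries(lambda feedback: next(replies), json.loads)
print(result)
```

Capping the number of attempts keeps a persistently failing stage from looping forever or running up API costs.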

### Next Steps:
- Experiment with different risk scenarios
- Enhance validation rules for your specific use case
- Consider adding more stages for complex workflows
- Explore integration with existing financial systems

These techniques provide a robust foundation for building sophisticated multi-step AI reasoning systems for financial services! 🎉