# Criteria Validation Testing Notebook

This notebook demonstrates how to use the criteria validation module to validate healthcare and insurance prior authorization requests.

The criteria validation module:
1. Processes user history documents from S3
2. Validates them against configurable criteria questions
3. Generates recommendations (Pass/Fail/Information Not Found)
4. Supports async processing with rate limiting
5. Tracks costs and metering data
6. **Now uses centralized config.yaml for all configuration**

> **Note**: This notebook uses AWS services including S3 and Bedrock. You need valid AWS credentials with appropriate permissions.

## 1. Install Dependencies

In [None]:
# Auto-reload modules
%load_ext autoreload
%autoreload 2

ROOTDIR="../"

# First uninstall existing package
%pip uninstall -y idp_common

# Install the IDP common package including criteria validation
%pip install -q -e "{ROOTDIR}/lib/idp_common_pkg[all]"

# Install additional dependencies including nest_asyncio for Jupyter async support
%pip install -q pydantic nest_asyncio pyyaml

# Check installed version
%pip show idp_common | grep -E "Version|Location"

## 2. Load Configuration and Set Up Environment
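
The cells in this notebook index a fixed set of keys in `config.yaml`, so a missing key only surfaces later as a `KeyError`. As a minimal sketch, the loaded dictionary can be checked up front against the keys this notebook actually reads. The helper name `find_missing_keys` and the sample dictionary are illustrative assumptions and are not part of `idp_common`.

```python
# Keys this notebook reads from config.yaml (collected from the cells below).
TOP_LEVEL_KEYS = [
    "request_bucket", "criteria_bucket", "output_bucket", "request_history_prefix",
]
CRITERIA_VALIDATION_KEYS = [
    "model", "temperature", "top_k", "top_p", "max_tokens",
    "system_prompt", "task_prompt", "criteria",
    "semaphore", "max_chunk_size", "token_size", "overlap_percentage",
]


def find_missing_keys(config):
    """Return dotted paths of required keys that are absent from config."""
    missing = [key for key in TOP_LEVEL_KEYS if key not in config]
    section = config.get("criteria_validation")
    if not isinstance(section, dict):
        missing.append("criteria_validation")
        return missing
    missing += [
        f"criteria_validation.{key}"
        for key in CRITERIA_VALIDATION_KEYS
        if key not in section
    ]
    criteria = section.get("criteria")
    if isinstance(criteria, dict) and "recommendation_options" not in criteria:
        missing.append("criteria_validation.criteria.recommendation_options")
    return missing


# Hypothetical, deliberately incomplete config used only to exercise the helper.
sample_config = {
    "request_bucket": "example-requests",
    "output_bucket": "example-output",
    "criteria_validation": {"model": "example-model-id", "criteria": {}},
}
missing_keys = find_missing_keys(sample_config)
print(missing_keys)
```

With the real file, `find_missing_keys(config_data)` returning an empty list would mean every key the later cells index directly is present.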

In [None]:
import os
import json
import time
import boto3
import logging
import datetime
import asyncio
import yaml
from typing import Dict, Any

# Fix for Jupyter async event loop conflicts
import nest_asyncio
nest_asyncio.apply()

# Import criteria validation module
from idp_common.criteria_validation import CriteriaValidationService, CriteriaValidationResult

# Load configuration from YAML file
config_path = "../../config_library/pattern-2/criteria-validation/config.yaml"
with open(config_path, 'r') as f:
    config_data = yaml.safe_load(f)

print(f"✅ Loaded configuration from: {config_path}")
print(f"Available criteria types: {config_data.get('criteria_types', [])}")
print(f"Model: {config_data['criteria_validation']['model']}")

# Configure logging
logging.basicConfig(level=logging.INFO)
logging.getLogger('idp_common.criteria_validation').setLevel(logging.DEBUG)
logging.getLogger('idp_common.bedrock').setLevel(logging.INFO)

# Set environment variables
os.environ['METRIC_NAMESPACE'] = 'IDP-CriteriaValidation-Test'
os.environ['AWS_REGION'] = boto3.session.Session().region_name or 'us-east-1'

# Get AWS account ID for unique bucket names
sts_client = boto3.client('sts')
account_id = sts_client.get_caller_identity()["Account"]
region = os.environ['AWS_REGION']

# Use bucket names from config, but make them unique per account/region
user_history_bucket = f"{config_data['request_bucket']}-{account_id}-{region}"
criteria_bucket = f"{config_data['criteria_bucket']}-{account_id}-{region}"
output_bucket = f"{config_data['output_bucket']}-{account_id}-{region}"

print("\nEnvironment setup:")
print(f"AWS_REGION: {os.environ.get('AWS_REGION')}")
print(f"User History bucket: {user_history_bucket}")
print(f"Criteria bucket: {criteria_bucket}")
print(f"Output bucket: {output_bucket}")
print("\n✅ Async event loop patched for Jupyter compatibility")

## 3. Set Up S3 Buckets and Sample Data

In [None]:
# Create S3 client
s3_client = boto3.client('s3')

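# Function to create bucket if it doesn't exist
from botocore.exceptions import ClientError

def ensure_bucket_exists(bucket_name):
    """Create the bucket unless a HEAD request shows it already exists."""
    try:
        s3_client.head_bucket(Bucket=bucket_name)
        print(f"Bucket {bucket_name} already exists")
        return
    except ClientError as e:
        # Only a 404 means the bucket is missing; anything else
        # (for example a 403 on a bucket owned by another account) is re-raised
        if e.response['Error']['Code'] not in ('404', 'NoSuchBucket'):
            raise

    try:
        # us-east-1 rejects an explicit LocationConstraint
        if region == 'us-east-1':
            s3_client.create_bucket(Bucket=bucket_name)
        else:
            s3_client.create_bucket(
                Bucket=bucket_name,
                CreateBucketConfiguration={'LocationConstraint': region}
            )

        # Wait for bucket to be accessible
        waiter = s3_client.get_waiter('bucket_exists')
        waiter.wait(Bucket=bucket_name)
        print(f"Created bucket: {bucket_name}")
    except Exception as e:
        print(f"Error creating bucket {bucket_name}: {str(e)}")
        raise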

# Ensure the user history and output buckets exist (the criteria bucket is created in step 6)
ensure_bucket_exists(user_history_bucket)
ensure_bucket_exists(output_bucket)
print("\n✅ S3 buckets configured (criteria bucket is created in step 6)")

## 4. Create Sample User History Data

In [None]:
# Create sample user history
sample_user_history = """Patient: John Doe
Date: 2024-01-15

Medical History:
The patient has been diagnosed with rheumatoid arthritis (RA) and has failed treatment with methotrexate and two TNF inhibitors. 
The treating physician, Dr. Sarah Johnson, has recommended starting immunotherapy with infliximab.

Treatment Plan:
- Infliximab will be administered at the infusion center under direct supervision of trained medical staff
- The facility is equipped with emergency response equipment including epinephrine for anaphylaxis treatment
- Initial dose: 3 mg/kg at 0, 2, and 6 weeks, then every 8 weeks
- Pre-medication with antihistamines and corticosteroids as per protocol

Facility Information:
The treatment will be provided at Memorial Hospital Infusion Center, which has 24/7 emergency support and trained nursing staff.

Additional Clinical Information:
- Patient has been screened for contraindications including tuberculosis and hepatitis B
- Baseline laboratory values are within normal limits
- Patient education has been completed regarding potential side effects and monitoring requirements
- Emergency response protocols are in place with trained nursing staff available 24/7
"""

# Upload sample data
request_id = "TEST-" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
request_prefix = config_data['request_history_prefix']

# Upload user history
user_history_key = f"{request_prefix}-{request_id}/extracted_text/patient_history.txt"
s3_client.put_object(
    Bucket=user_history_bucket,
    Key=user_history_key,
    Body=sample_user_history.encode('utf-8')
)
print(f"✅ Uploaded user history to: s3://{user_history_bucket}/{user_history_key}")
print(f"Request ID: {request_id}")

## 5. Configure Criteria Validation Service (Using Config.yaml)
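
The configuration below passes `max_chunk_size`, `token_size`, and `overlap_percentage` to the service, which uses them when splitting long documents. The service's actual chunking logic is not shown in this notebook, so the following is only a rough sketch of overlap-based chunking. It assumes sizes are measured in characters and that `overlap_percentage` is a whole-number percentage; the function name `chunk_text` is hypothetical.

```python
def chunk_text(text, max_chunk_size, overlap_percentage):
    """Split text into chunks of at most max_chunk_size characters.

    Consecutive chunks share about overlap_percentage percent of a chunk.
    """
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    if not 0 <= overlap_percentage < 100:
        raise ValueError("overlap_percentage must be in [0, 100)")

    overlap = max_chunk_size * overlap_percentage // 100
    step = max_chunk_size - overlap  # at least 1 because overlap < max_chunk_size

    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chunk_size])
        if start + max_chunk_size >= len(text):
            break  # the chunk just added already reaches the end of the text
        start += step
    return chunks


chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", max_chunk_size=10, overlap_percentage=20)
print(chunks)
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence that straddles a chunk boundary is still seen whole in at least one chunk.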

In [None]:
# Extract configuration from loaded YAML and map to service expectations
criteria_config = config_data['criteria_validation']

validation_config = {
    # Model configuration from config.yaml
    "model_id": criteria_config['model'],
    "temperature": criteria_config['temperature'],
    "top_k": criteria_config['top_k'],
    "top_p": criteria_config['top_p'],
    "max_tokens": criteria_config['max_tokens'],
    
    # Bucket configuration
    "request_bucket": user_history_bucket,
    "request_history_prefix": request_prefix,
    "criteria_bucket": criteria_bucket,  # Populated from config.yaml in step 6; the service reads criteria from S3
    "output_bucket": output_bucket,
    
    # Criteria types from config.yaml
    "criteria_types": config_data.get('criteria_types', ['administration_requirements']),
    
    # Prompts and options from config.yaml
    "system_prompt": criteria_config['system_prompt'],
    "task_prompt": criteria_config['task_prompt'],
    "recommendation_options": criteria_config['criteria']['recommendation_options'],
    
    # Async processing configuration from config.yaml
    "criteria_validation": {
        "semaphore": criteria_config['semaphore'],
        "max_chunk_size": criteria_config['max_chunk_size'],
        "token_size": criteria_config['token_size'],
        "overlap_percentage": criteria_config['overlap_percentage'],
    }
}

print("✅ Configuration extracted from config.yaml:")
print(f"   Model: {validation_config['model_id']}")
print(f"   Criteria types: {validation_config['criteria_types']}")
print(f"   Semaphore: {validation_config['criteria_validation']['semaphore']}")

# Display criteria from config for verification
print("\n📋 Available criteria from config.yaml:")
for criteria_type in validation_config['criteria_types']:
    if criteria_type in criteria_config['criteria']:
        questions = criteria_config['criteria'][criteria_type]
        print(f"\n{criteria_type.replace('_', ' ').title()}:")
        for i, question in enumerate(questions, 1):
            print(f"  {i}. {question}")

## 6. Upload Criteria from Config to S3 (For Service Compatibility)

In [None]:
# Since the current service expects criteria from S3, we'll upload them from config
# This is a temporary step until the service is updated to read from config directly

# Ensure criteria bucket exists
ensure_bucket_exists(criteria_bucket)

# Upload each criteria type from config.yaml to S3
criteria_from_config = criteria_config['criteria']

for criteria_type in validation_config['criteria_types']:
    if criteria_type in criteria_from_config:
        # Create criteria data in format expected by service
        criteria_data = {
            "criteria": criteria_from_config[criteria_type]
        }
        
        # Upload to S3
        criteria_key = f"{criteria_type}.json"
        s3_client.put_object(
            Bucket=criteria_bucket,
            Key=criteria_key,
            Body=json.dumps(criteria_data, indent=2).encode('utf-8')
        )
        print(f"✅ Uploaded {criteria_type} criteria to: s3://{criteria_bucket}/{criteria_key}")

print("\n✅ All criteria from config.yaml uploaded to S3 for service compatibility")

## 7. Run Criteria Validation

In [None]:
# Initialize the service
criteria_service = CriteriaValidationService(
    region=region,
    config=validation_config
)

print(f"🚀 Starting validation for request: {request_id}")
print(f"Processing {len(validation_config['criteria_types'])} criteria types...")
print("This may take a few moments...")

# Run validation
start_time = time.time()
result = criteria_service.validate_request(
    request_id=request_id,
    config=validation_config
)
validation_time = time.time() - start_time

print(f"\n✅ Validation completed in {validation_time:.2f} seconds")
print(f"Request ID: {result.request_id}")
print(f"Criteria Type: {result.criteria_type}")

## 8. Display Results
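
Each result file read in the cell below is a JSON list whose items carry a `Recommendation` value (Pass, Fail, or Information Not Found, as listed in the overview). As a small offline sketch, the outcomes can be tallied before printing the details. The `sample_responses` list is made-up data that only mimics the keys the cell below reads, and `summarize_recommendations` is a hypothetical helper name.

```python
from collections import Counter


def summarize_recommendations(responses):
    """Count how many responses fall under each recommendation value."""
    return Counter(item.get("Recommendation", "N/A") for item in responses)


# Made-up responses using the same keys the display cell reads.
sample_responses = [
    {"question": "Example question 1", "Recommendation": "Pass",
     "Reasoning": "Example reasoning", "source_file": []},
    {"question": "Example question 2", "Recommendation": "Pass",
     "Reasoning": "Example reasoning", "source_file": []},
    {"question": "Example question 3", "Recommendation": "Information Not Found",
     "Reasoning": "Example reasoning", "source_file": []},
    {"question": "Example question 4"},  # missing key falls back to "N/A"
]

summary = summarize_recommendations(sample_responses)
for recommendation, count in summary.most_common():
    print(f"{recommendation}: {count}")
```

With real output, the same helper could be called on the `responses` list parsed from each output URI.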

In [None]:
# Display validation responses
print("\n=== 📊 Validation Results ===")

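# Helper function to split an S3 URI into bucket and key
def parse_s3_uri(uri):
    prefix = "s3://"
    # Strip only a leading scheme so keys containing "s3://" are left intact
    path = uri[len(prefix):] if uri.startswith(prefix) else uri
    bucket, _, key = path.partition("/")
    return bucket, key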

# Read results from S3
if result.metadata and 'output_uris' in result.metadata:
    for output_uri in result.metadata['output_uris']:
        print(f"\n📄 Reading results from: {output_uri}")
        
        # Parse S3 URI and read content
        bucket, key = parse_s3_uri(output_uri)
        response = s3_client.get_object(Bucket=bucket, Key=key)
        content = response['Body'].read().decode('utf-8')
        responses = json.loads(content)
        
        # Display each validation response
        for idx, response_item in enumerate(responses):
            print(f"\n--- 📋 Criteria {idx + 1} ---")
            print(f"❓ Question: {response_item.get('question', 'N/A')}")
            
            recommendation = response_item.get('Recommendation', 'N/A')
            if recommendation == 'Pass':
                print(f"✅ Recommendation: {recommendation}")
            elif recommendation == 'Fail':
                print(f"❌ Recommendation: {recommendation}")
            else:
                print(f"❓ Recommendation: {recommendation}")
            
            print(f"📝 Reasoning: {response_item.get('Reasoning', 'N/A')}")
            print(f"📁 Source Files: {response_item.get('source_file', [])}")
else:
    print("❌ No validation results found in metadata")

## 9. Display Metering and Cost Information

In [None]:
# Display metering information with support for nested model-specific structure
print("\n=== 💰 Token Usage ===")
if result.metering:
    # First try the old flat structure for backward compatibility
    if 'total_input_tokens' in result.metering:
        print(f"📥 Total Input Tokens: {result.metering.get('total_input_tokens', 0):,}")
        print(f"📤 Total Output Tokens: {result.metering.get('total_output_tokens', 0):,}")
    else:
        # Handle new nested structure: {model_key: {inputTokens: X, outputTokens: Y, totalTokens: Z}}
        total_input_tokens = 0
        total_output_tokens = 0
        total_tokens = 0
        
        print("\n📊 Per-Model Token Usage:")
        for model_key, usage in result.metering.items():
            if isinstance(usage, dict) and ('inputTokens' in usage or 'outputTokens' in usage):
                input_tokens = usage.get('inputTokens', 0)
                output_tokens = usage.get('outputTokens', 0)
                model_total = usage.get('totalTokens', input_tokens + output_tokens)
                
                # Extract model name from the key for cleaner display
                model_name = model_key.split('/')[-1] if '/' in model_key else model_key
                print(f"  🤖 {model_name}:")
                print(f"    📥 Input Tokens: {input_tokens:,}")
                print(f"    📤 Output Tokens: {output_tokens:,}")
                print(f"    📊 Total Tokens: {model_total:,}")
                
                # Add to totals
                total_input_tokens += input_tokens
                total_output_tokens += output_tokens
                total_tokens += model_total
        
        print("\n📈 Total Across All Models:")
        print(f"  📥 Total Input Tokens: {total_input_tokens:,}")
        print(f"  📤 Total Output Tokens: {total_output_tokens:,}")
        print(f"  📊 Grand Total Tokens: {total_tokens:,}")
    
    # Display per-criteria usage if available (legacy structure)
    criteria_usage = result.metering.get('criteria_usage', {})
    if criteria_usage:
        print("\n📋 Per-Criteria Usage:")
        for criteria_type, usage in criteria_usage.items():
            print(f"  📄 {criteria_type}:")
            print(f"    📥 Input Tokens: {usage.get('input_tokens', 0):,}")
            print(f"    📤 Output Tokens: {usage.get('output_tokens', 0):,}")
else:
    print("❌ No metering data available")

# Display timing information
print("\n=== ⏱️ Timing Information ===")
if result.metadata and 'timing' in result.metadata:
    timing = result.metadata['timing']
    print(f"🕐 Total Duration: {timing.get('total_duration', 0):.2f} seconds")
    
    # Display per-criteria timing
    criteria_timing = timing.get('criteria_processing_time', [])
    if criteria_timing:
        print("\n📊 Per-Criteria Processing Time:")
        for item in criteria_timing:
            print(f"  📋 {item['criteria_type']}: {item['duration']:.2f} seconds")
else:
    print("❌ No timing data available")

## 10. Clean Up (Optional)

In [None]:
# Function to delete all objects in a bucket, then the bucket itself
def delete_bucket_objects(bucket_name):
    try:
        # list_objects_v2 returns at most 1,000 keys per call, so paginate
        paginator = s3_client.get_paginator('list_objects_v2')
        deleted_count = 0
        for page in paginator.paginate(Bucket=bucket_name):
            contents = page.get('Contents', [])
            if contents:
                delete_keys = {'Objects': [{'Key': obj['Key']} for obj in contents]}
                s3_client.delete_objects(Bucket=bucket_name, Delete=delete_keys)
                deleted_count += len(contents)

        if deleted_count:
            print(f"🗑️ Deleted {deleted_count} objects in bucket {bucket_name}")
        else:
            print(f"📭 Bucket {bucket_name} is already empty")

        # Delete bucket
        s3_client.delete_bucket(Bucket=bucket_name)
        print(f"🗑️ Deleted bucket {bucket_name}")
    except Exception as e:
        print(f"❌ Error cleaning up bucket {bucket_name}: {str(e)}")

# Uncomment the following lines to delete the buckets
# print("🧹 Cleaning up resources...")
# delete_bucket_objects(user_history_bucket)
# delete_bucket_objects(criteria_bucket)
# delete_bucket_objects(output_bucket)
# print("✅ Cleanup complete")

print("\n✅ Notebook completed successfully!")
print("💡 Uncomment the cleanup section above to delete the test S3 buckets.")
print("🎯 Configuration successfully migrated from pattern-4 to pattern-2")
print(f"📋 All criteria now sourced from: {config_path}")

## Conclusion

This notebook now demonstrates the **updated** criteria validation module with **centralized configuration**:

### ✅ New Features:
1. **Centralized Config** - All configuration now loaded from `config.yaml`
2. **Config-Driven Criteria** - Criteria questions defined in config, not hardcoded
3. **Pattern-2 Integration** - Config moved from pattern-4 to pattern-2
4. **YAML Support** - Added pyyaml dependency for config loading

### 🔧 Technical Capabilities:
1. **Async Processing** - Concurrent evaluation of multiple criteria questions
2. **Rate Limiting** - Built-in semaphore control for API rate limits
3. **Chunking** - Automatic text chunking for large documents
4. **Cost Tracking** - Comprehensive token usage and metering
5. **Pydantic Validation** - Strong data validation for inputs/outputs
6. **S3 Integration** - Seamless reading/writing of validation data
7. **Jupyter Compatibility** - Fixed async event loop conflicts with nest_asyncio
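
The async processing and rate limiting items above can be illustrated with a semaphore-bounded `asyncio.gather`. This is a generic sketch of the pattern, not the module's actual implementation: `evaluate_question` is a stand-in that sleeps instead of calling Bedrock, and the limit of 2 is an arbitrary value playing the role of the `semaphore` setting in `config.yaml`.

```python
import asyncio


async def evaluate_question(question, semaphore, tracker):
    """Stand-in for one LLM call; the semaphore caps concurrent calls."""
    async with semaphore:
        tracker["active"] += 1
        tracker["peak"] = max(tracker["peak"], tracker["active"])
        await asyncio.sleep(0.01)  # placeholder for the real model request
        tracker["active"] -= 1
        return {"question": question, "Recommendation": "Information Not Found"}


async def evaluate_all(questions, max_concurrency):
    """Evaluate all questions concurrently, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)
    tracker = {"active": 0, "peak": 0}
    # gather returns results in the same order as the input questions
    results = await asyncio.gather(
        *(evaluate_question(q, semaphore, tracker) for q in questions)
    )
    return results, tracker["peak"]


questions = [f"Example question {i}" for i in range(5)]
results, peak = asyncio.run(evaluate_all(questions, max_concurrency=2))
print(f"{len(results)} results, peak concurrency {peak}")
```

Inside a notebook cell, `await evaluate_all(questions, max_concurrency=2)` can be used in place of `asyncio.run(...)`; with `nest_asyncio.apply()` in effect, as in this notebook, either form should work.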

### 📈 Key Benefits:
- **Maintainability** - Single source of truth for all configuration
- **Scalability** - Process multiple criteria types concurrently
- **Reliability** - Built-in error handling and retry logic
- **Consistency** - Uses common bedrock client for standardized LLM interactions
- **Flexibility** - Easy to modify criteria without code changes
- **Traceability** - Complete audit trail with source file citations

### 🎯 Migration Summary:
- ✅ Moved config from `pattern-4/criteria-validation/` to `pattern-2/criteria-validation/`
- ✅ Added criteria definitions directly in config.yaml under `criteria_validation.criteria`
- ✅ Updated notebook to load config from YAML instead of inline definitions
- ✅ Eliminated hardcoded criteria and recommendation options
- ✅ Maintained backward compatibility with existing service architecture

The module is designed for healthcare and insurance prior authorization validation, but it can be adapted to other business rule validation use cases by updating `config.yaml`.
