# 🎯 ChatGPT-4 Level Evaluation for Singapore Financial Regulations

## 🏆 **COMPREHENSIVE EVALUATION SYSTEM**
This notebook evaluates our fine-tuned GPT-2 model on Singapore financial regulation questions that demand ChatGPT-4 level expertise, using ChatGPT-4 level responses as the benchmark.

### **📊 Evaluation Criteria:**
- **Accuracy**: Factual correctness of regulatory information
- **Completeness**: Comprehensive coverage of the topic
- **Clarity**: Clear and professional communication
- **Regulatory Compliance**: Alignment with MAS guidelines
- **Practical Applicability**: Actionable insights for practitioners
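
Of these criteria, Completeness is the easiest to approximate automatically: each test question defined later in this notebook carries an `expected_topics` list, so one rough proxy is the share of those topics a response mentions. The sketch below illustrates that idea under stated assumptions and is not the notebook's actual scoring method. `topic_coverage` is a hypothetical helper, the sample response is made up, and plain case-insensitive substring matching will miss paraphrased topics.

```python
def topic_coverage(response: str, expected_topics: list[str]) -> float:
    """Return the fraction of expected topics mentioned in the response.

    Hypothetical helper: uses case-insensitive substring matching, so
    paraphrased topics are not detected.
    """
    if not expected_topics:
        return 0.0
    text = response.lower()
    hits = sum(1 for topic in expected_topics if topic.lower() in text)
    return hits / len(expected_topics)


# Illustrative response text, not real model output.
sample_response = (
    "The regulatory sandbox lets firms test fintech innovation "
    "under proportionate regulation."
)
sample_topics = ["regulatory sandbox", "fintech innovation",
                 "proportionate regulation", "technology risk"]
coverage = topic_coverage(sample_response, sample_topics)
print(f"Topic coverage: {coverage:.2f}")
```

A score like this only says whether a topic was named, not whether it was explained correctly, so it would complement rather than replace a judgment of Accuracy.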

### **🎯 Target Performance:**
- **90%+ accuracy** on Singapore financial regulation questions
- **Comparable depth** to ChatGPT-4 responses
- **Professional quality** suitable for financial industry use
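
One way to make the 90% target checkable is to mark each answer as correct or incorrect and compare the pass rate with a threshold. The notebook does not specify how accuracy is judged, so this is only a sketch: it assumes a hypothetical list of per-question boolean judgments (for example, from a human reviewer), and `accuracy`, `meets_target`, and the sample data are illustrative names and values.

```python
TARGET_ACCURACY = 0.90  # the 90% target stated above


def accuracy(judgments: list[bool]) -> float:
    """Share of answers judged correct; 0.0 for an empty list."""
    return sum(judgments) / len(judgments) if judgments else 0.0


def meets_target(judgments: list[bool], target: float = TARGET_ACCURACY) -> bool:
    """True when the pass rate reaches the target accuracy."""
    return accuracy(judgments) >= target


# Made-up judgments for ten questions: nine correct, one incorrect.
sample_judgments = [True] * 9 + [False]
print(f"Accuracy: {accuracy(sample_judgments):.0%}, "
      f"target met: {meets_target(sample_judgments)}")
```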


In [None]:
# 🚀 Setup and Model Loading
!pip install torch transformers datasets peft accelerate rouge-score nltk sentence-transformers -q

import torch
import json
import time
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import nltk
from rouge_score import rouge_scorer
from sentence_transformers import SentenceTransformer
import numpy as np

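# Download required NLTK tokenizer data; warn instead of silently ignoring failures
try:
    nltk.download('punkt', quiet=True)
except Exception as exc:
    print(f"⚠️ Could not download NLTK 'punkt' data: {exc}")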

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"✅ Setup complete! Using device: {device}")

# Load evaluation tools: ROUGE for n-gram overlap, a sentence-embedding model for semantic similarity
rouge_scorer_obj = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)
semantic_model = SentenceTransformer('all-MiniLM-L6-v2')

print("🎯 Ready for ChatGPT-4 level evaluation!")
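
To make the ROUGE scores from the scorer loaded above easier to interpret, here is a simplified pure-Python sketch of the unigram-overlap idea behind `rouge1`. It is not the `rouge_score` implementation: it skips stemming and uses naive whitespace tokenization. `unigram_f1` is a hypothetical name, and the two sentences are made-up examples.

```python
from collections import Counter


def unigram_f1(reference: str, candidate: str) -> float:
    """F1 of unigram overlap between a reference and a candidate text."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Made-up example sentences for illustration only.
reference = "banks must report suspicious transactions"
candidate = "banks must report all suspicious activity"
score = unigram_f1(reference, candidate)
print(f"Unigram F1: {score:.2f}")
```

Overlap metrics of this kind reward shared wording rather than shared meaning, which is presumably why the setup also loads a sentence-embedding model.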


In [None]:
# 🎯 ChatGPT-4 Level Test Questions
print("📋 Loading ChatGPT-4 level Singapore financial regulation questions...")

# Comprehensive test questions that require ChatGPT-4 level expertise
chatgpt4_level_questions = [
    {
        "question": "Explain MAS's approach to regulating artificial intelligence in financial services, including key principles and implementation requirements.",
        "expected_topics": ["AI governance", "explainability", "fairness", "human oversight", "risk management"],
        "difficulty": "advanced",
        "category": "AI_regulation"
    },
    {
        "question": "What are the specific capital adequacy requirements for Singapore banks under Basel III, and how do they compare to international standards?",
        "expected_topics": ["CET1 ratio", "Tier 1 capital", "total capital ratio", "Basel III", "buffers"],
        "difficulty": "advanced",
        "category": "capital_adequacy"
    },
    {
        "question": "Describe the comprehensive AML/CFT framework in Singapore, including reporting obligations and penalties for non-compliance.",
        "expected_topics": ["MAS Notice 626", "STRO", "suspicious transactions", "customer due diligence", "penalties"],
        "difficulty": "advanced",
        "category": "AML_CFT"
    },
    {
        "question": "How does MAS regulate digital banking in Singapore, and what are the key differences between digital bank licenses and traditional banking licenses?",
        "expected_topics": ["digital bank license", "capital requirements", "operational requirements", "technology risk"],
        "difficulty": "advanced",
        "category": "digital_banking"
    },
    {
        "question": "What are MAS's cybersecurity requirements for financial institutions, including incident reporting and resilience testing?",
        "expected_topics": ["cybersecurity framework", "incident reporting", "penetration testing", "business continuity"],
        "difficulty": "advanced",
        "category": "cybersecurity"
    },
    {
        "question": "Explain the Payment Services Act framework and how it regulates different types of payment service providers in Singapore.",
        "expected_topics": ["PSA", "payment service types", "licensing", "capital requirements", "e-money"],
        "difficulty": "advanced",
        "category": "payment_services"
    },
    {
        "question": "What is MAS's approach to fintech regulation and innovation, including the regulatory sandbox program?",
        "expected_topics": ["regulatory sandbox", "fintech innovation", "proportionate regulation", "technology risk"],
        "difficulty": "intermediate",
        "category": "fintech_innovation"
    },
    {
        "question": "Describe the risk management requirements for Singapore financial institutions under MAS guidelines.",
        "expected_topics": ["operational risk", "market risk", "credit risk", "liquidity risk", "stress testing"],
        "difficulty": "advanced",
        "category": "risk_management"
    },
    {
        "question": "How does MAS ensure financial stability in Singapore through macroprudential policies and supervision?",
        "expected_topics": ["macroprudential policy", "systemic risk", "financial stability", "supervision"],
        "difficulty": "expert",
        "category": "financial_stability"
    },
    {
        "question": "What are the corporate governance requirements for Singapore financial institutions under MAS regulations?",
        "expected_topics": ["board composition", "risk governance", "internal controls", "audit requirements"],
        "difficulty": "intermediate",
        "category": "corporate_governance"
    }
]

print(f"✅ Loaded {len(chatgpt4_level_questions)} ChatGPT-4 level questions")
# Count questions per difficulty level in a fixed display order
difficulty_counts = {
    level: sum(q["difficulty"] == level for q in chatgpt4_level_questions)
    for level in ("expert", "advanced", "intermediate")
}
print("📊 Difficulty levels: " + ", ".join(f"{count} {level}" for level, count in difficulty_counts.items()))
print("🎯 These questions require deep Singapore financial regulation expertise!")
