# Manual Testing Notebook (Enterprise Edition)

This notebook verifies the new features: Real Data Integration (CSV), RAG (Regulatory Chat), and the Validation Pipeline.

In [None]:
import os
import sys
import yaml
import pandas as pd

# Run from the project root so relative paths (config.yaml, documents/, ...) resolve
if os.path.basename(os.getcwd()) == "notebooks":
    os.chdir("..")  # Move to 'openai_basics/' root

# Ensure we can import from src: only the project root needs to be on sys.path.
# Data folders (templates, documents, chroma_db, ...) are not Python packages.
project_root = os.getcwd()
if project_root not in sys.path:
    sys.path.append(project_root)

print(f"✅ Current Working Directory: {os.getcwd()}")

from src.data_loader_module import get_data_loader, CSVDataLoader, SyntheticDataLoader
from src.calculations import CreditRiskFormulas
from src.pipeline import ValidationPipeline
from src.rag_engine import RagEngine

✅ Current Working Directory: c:\Users\semeier\Desktop\gemini_chat_private_GH\openai_basics


## 1. Test Data Loading (CSV vs Synthetic)

In [12]:
# Test Synthetic First
print("Testing Synthetic Loader:")
synth_loader = SyntheticDataLoader()
print("AUC Data Head:")
display(synth_loader.load_auc_data().head())

Testing Synthetic Loader:
AUC Data Head:


|   | Date | AUC_Initial | AUC_Current |
|---|------------|------|----------|
| 0 | 2025-01-19 | 0.75 | 0.759934 |
| 1 | 2025-02-18 | 0.75 | 0.747235 |
| 2 | 2025-03-20 | 0.75 | 0.762954 |
| 3 | 2025-04-19 | 0.75 | 0.780461 |
| 4 | 2025-05-19 | 0.75 | 0.745317 |


In [13]:
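# Test CSV Loader (an error is expected if the CSV files are missing)
print("Testing CSV Loader:")
try:
    # NOTE: the path is resolved relative to the current working directory (the
    # project root after the setup cell), so "../metrics_from_client" points one
    # level above it.
    csv_loader = CSVDataLoader("../metrics_from_client")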
    display(csv_loader.load_auc_data().head())
except Exception as e:
    print(f"CSV Load Failed as expected (if no files): {e}")

# Note: To pass this test, put valid CSVs in metrics_from_client

Testing CSV Loader:
CSV Load Failed as expected (if no files): Missing required CSV files in ../metrics_from_client: auc_data.csv, calibration_data.csv, score_data.csv. Please ensure all required files are present.
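
Before rerunning the CSV test, it can help to check which of the required files are actually present. The sketch below is a standalone helper, not part of `CSVDataLoader`: the function name `find_missing_csvs` is hypothetical, the file names are taken from the error message above, and the folder name `metrics_from_client` is assumed to sit inside the current working directory.

```python
from pathlib import Path

# File names copied from the error message above.
REQUIRED_CSV_FILES = ("auc_data.csv", "calibration_data.csv", "score_data.csv")


def find_missing_csvs(data_dir, required=REQUIRED_CSV_FILES):
    """Return the required file names that are not present in data_dir."""
    folder = Path(data_dir)
    return [name for name in required if not (folder / name).is_file()]


# Assumption: the client folder is located relative to the working directory.
data_dir = Path("metrics_from_client")
missing = find_missing_csvs(data_dir)
if missing:
    print(f"Missing in {data_dir.resolve()}: {', '.join(missing)}")
else:
    print("All required CSV files are present.")
```

Printing the resolved absolute path makes it easy to spot a relative path that points at the wrong folder.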


## 2. Test RAG Engine

In [15]:
rag = RagEngine("config.yaml")

print("Ingesting Documents...")
status = rag.ingest_documents()
print(status)

Ingesting Documents...
Ingested 88 documents (304 chunks).


In [19]:
print("Testing Query...")
response = rag.query_knowledge_base(
    question="What are the management body and senior management of a credit institution responsible for?",
    stats_context="Current AUC dropped by 0.05."
)
print("RAG Response:")
print(response)

Testing Query...
RAG Response:
The management body of a credit institution is responsible for approving and regularly reviewing the institution's credit risk management strategy, as well as the main policies and processes for identifying, measuring, evaluating, monitoring, reporting, and mitigating credit risk. This should be consistent with the approved risk appetite set by the management body. Additionally, to limit the risk that lending exposures pose to depositors and, more generally, financial stability, the management body should require that senior management adopt and adhere to sound underwriting practices (Source: documents\Final Guidelines on Accounting for Expected Credit Losses (EBA-GL-2017-06).pdf | Page: 18).

To fulfill these responsibilities, the management body should instruct senior management to develop and maintain appropriate processes, which should be systematic and consistently applied (Source: documents\Final Guidelines on Accounting for Expected Credit Losses (
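
When checking a RAG answer by hand, it is useful to pull out the cited sources and page numbers so they can be compared against the PDFs. The sketch below assumes citations always follow the `(Source: <path> | Page: <n>)` shape seen in the response above; `extract_citations` is a hypothetical helper, not a method of `RagEngine`. A citation that is cut off before its page number, like the last one in the output above, is not matched.

```python
import re

# Assumed citation shape, based on the response above: "(Source: <path> | Page: <n>)".
# The non-greedy match lets the path itself contain parentheses.
CITATION_PATTERN = re.compile(r"\(Source: (?P<source>.+?) \| Page: (?P<page>\d+)\)")


def extract_citations(response_text):
    """Return (source_path, page_number) tuples for every complete citation."""
    return [
        (match.group("source"), int(match.group("page")))
        for match in CITATION_PATTERN.finditer(response_text)
    ]


# Excerpt of the response above (raw strings keep the backslash in the path intact).
sample = (
    r"... adopt and adhere to sound underwriting practices "
    r"(Source: documents\Final Guidelines on Accounting for Expected Credit Losses "
    r"(EBA-GL-2017-06).pdf | Page: 18)."
)

for source, page in extract_citations(sample):
    print(f"{source} -> page {page}")
```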

## 3. Test Full Pipeline with Real Data

In [21]:
# Load Config
with open("./config.yaml", "r") as f:
    config = yaml.safe_load(f)

# Initialize Pipeline
pipeline = ValidationPipeline("./config.yaml")
calculator = CreditRiskFormulas(config)

# Load Data. Synthetic data is used here so the run succeeds even when the client
# CSVs are missing; swap in CSVDataLoader(...) to exercise the pipeline with real data.
loader = SyntheticDataLoader()
df_auc = loader.load_auc_data()
df_calib = loader.load_calibration_data()
df_scores = loader.load_score_data()

# Calculate Stats
auc_stats = calculator.evaluate_auc(df_auc)
chi_stats = calculator.calculate_chi_square(df_calib)
binomial_results = calculator.calculate_binomial_test(df_calib)
jeffrey_results = calculator.calculate_jeffrey_test(df_calib)
ttest_stats = calculator.calculate_ttest(df_scores)

# Run Pipeline
print("Running Pipeline...")
result = pipeline.run(
    auc_stats, chi_stats, binomial_results, jeffrey_results, ttest_stats
)

print(f"Status: {result['status']}")
print(f"Score: {result['score']}")

Running Pipeline...
--- Pipeline Attempt 1/3 ---
Status: Pass
Score: 9
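
For readers who want to sanity-check one of the calibration results by hand, the sketch below shows the textbook one-sided binomial test for a single rating grade. It is an illustration only: how `calculate_binomial_test` is implemented, and which columns `df_calib` contains, is not shown in this notebook, so the function name and parameters here are hypothetical.

```python
from math import comb


def binomial_calibration_p_value(n_obligors, n_defaults, pd_estimate):
    """One-sided binomial test for a single rating grade.

    Returns P(X >= n_defaults) for X ~ Binomial(n_obligors, pd_estimate).
    A small p-value suggests the estimated PD understates the observed defaults.
    """
    # Probability of observing strictly fewer defaults than n_defaults.
    p_fewer = sum(
        comb(n_obligors, k) * pd_estimate**k * (1 - pd_estimate) ** (n_obligors - k)
        for k in range(n_defaults)
    )
    return 1.0 - p_fewer


# Illustrative numbers only (not taken from the notebook's data).
p_value = binomial_calibration_p_value(n_obligors=200, n_defaults=7, pd_estimate=0.02)
print(f"p-value: {p_value:.4f}")
```

Comparing such a hand-computed p-value with the corresponding pipeline result is a quick way to confirm that the grade-level inputs were read as intended.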


In [22]:
result

{'report': "# Model Validation Report\n\n## Executive Summary\nThe credit risk model under review demonstrates satisfactory performance across various statistical tests. The model's predictive power, as measured by the Area Under the Curve (AUC), remains within acceptable limits. Calibration tests, including the Pearson Chi-Square, Binomial, and Jeffrey's tests, indicate that the model's probability of default (PD) estimates align well with observed default rates across all rating grades. Population stability analysis does not reveal significant shifts. Based on these findings, the recommendation is to **Accept** the model.\n\n## AUC Analysis\nThe comparison between the initial and current AUC shows a deviation of -0.0093, which is well within the acceptable tolerance of 0.02. This indicates that the model's discriminatory power has not significantly deteriorated over time. The status of this analysis is **Accept**.\n\n- **Latest AUC Difference**: -0.0093\n- **Tolerance**: 0.02\n- **St
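
The AUC section of the report above compares the latest AUC difference (-0.0093) with a tolerance of 0.02. A minimal sketch of such a check follows. It assumes the difference is current minus initial and that its absolute value is compared with the tolerance; `evaluate_auc` in `CreditRiskFormulas` may do this differently. The function name, the "Reject" label, and the input values are hypothetical, with the inputs chosen only to reproduce the reported difference.

```python
def evaluate_auc_drift(auc_initial, auc_current, tolerance=0.02):
    """Compare the AUC difference (current minus initial) with a tolerance."""
    difference = auc_current - auc_initial
    # Assumption: the check is symmetric, so the absolute difference is compared.
    status = "Accept" if abs(difference) <= tolerance else "Reject"
    return {
        "difference": round(difference, 4),
        "tolerance": tolerance,
        "status": status,
    }


# Illustrative values that reproduce the difference quoted in the report.
print(evaluate_auc_drift(auc_initial=0.75, auc_current=0.7407))
```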