# Validate Analysis Results

This notebook compares analysis outputs between **production runs** (baseline) and **test runs** to check that each test run reproduces its production baseline within the configured tolerances.

**Modes:**
- **Single validation**: Compare one production/test pair for one perspective (Section 3A)
- **Lookup test IDs**: Populate test analysis IDs by name + EDM lookup (Section 3B)
- **Batch validation**: Compare multiple pairs from CSV/XLSX, all perspectives (Section 3C)

**Perspectives validated (batch mode):** GR (Gross), GU (Ground-Up), RL (Reinsurance Layer)

**Endpoints compared:**
- Settings (analysis configuration)
- Statistics (`/stats`)
- EP Metrics (`/ep`)
- Event Loss Table (`/elt`)
- Period Loss Table (`/plt`) - HD analyses only
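
As a rough illustration of the HD-only rule for the PLT endpoint, the sketch below picks the endpoints to compare from an analysis's `engineType`. The helper name `endpoints_for`, the endpoint labels, and the assumption that HD analyses report `engineType == "HD"` are hypothetical; the validator may decide this differently.

```python
# Hypothetical sketch: choose which endpoints to compare for one analysis.
# Assumes HD analyses report engineType == "HD" (not confirmed by the validator).
BASE_ENDPOINTS = ["settings", "stats", "ep", "elt"]


def endpoints_for(engine_type: str) -> list[str]:
    """Return the endpoints to compare; PLT is added for HD analyses only."""
    endpoints = list(BASE_ENDPOINTS)
    if engine_type.upper() == "HD":
        endpoints.append("plt")
    return endpoints


print(endpoints_for("HD"))
print(endpoints_for("DLM"))  # "DLM" stands in for any non-HD engine type
```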

**Fields compared per endpoint:**

| Endpoint | Fields |
|----------|--------|
| Settings | engineType, engineVersion, analysisType, analysisMode, analysisFramework, modelProfile, outputProfile, eventRateSchemeNames, peril, perilCode, subPeril, region, regionCode, lossAmplification, insuranceType, vulnerabilityCurve, engineSubType, isMultiEvent |
| Statistics | epType, purePremium, totalStdDev, cv, netPurePremium, activation, exhaustion, totalLossRatio, limit, premium, netStdDev, exhaustAllReinstatements |
| EP Metrics | epType, value (contains returnPeriods and positionValues arrays) |
| ELT | eventId, sourceId, positionValue, stdDevI, stdDevC, expValue, rate, peril, region, oepWUC |
| PLT | periodId, eventId, weight, eventDate, lossDate, positionValue, peril, region |
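
To make the field-by-field comparison concrete, here is a minimal sketch that diffs a fixed list of fields between two payloads. The function `diff_fields`, the shortened field list, and the sample payload values are all made up for illustration and are not taken from the validator.

```python
# Hypothetical sketch: compare a fixed list of fields between two payloads.
# The field list is a subset of the Settings row above; the payloads are invented.
SETTINGS_FIELDS = ["engineType", "engineVersion", "peril", "region"]


def diff_fields(production: dict, test: dict, fields: list[str]) -> dict:
    """Return {field: (production_value, test_value)} for fields that differ."""
    return {
        name: (production.get(name), test.get(name))
        for name in fields
        if production.get(name) != test.get(name)
    }


production = {"engineType": "HD", "engineVersion": "1.0", "peril": "WS", "region": "NA"}
test = {"engineType": "HD", "engineVersion": "1.1", "peril": "WS", "region": "NA"}
print(diff_fields(production, test, SETTINGS_FIELDS))
```

Restricting the comparison to a named field list keeps fields that legitimately differ between runs (IDs, timestamps) from causing false failures.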

## 1. Setup & Imports

In [17]:
%load_ext autoreload
%autoreload 2

from helpers.analysis_results_validator import (
    AnalysisResultsValidator,
    # Single validation output
    print_validation_summary,
    print_validation_details,
    # Batch validation output
    batch_results_to_dataframe,
    print_batch_summary,
    export_batch_failures_to_json,
    export_batch_summary_to_csv,
)

validator = AnalysisResultsValidator()
print("Setup complete.")

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
Setup complete.


## 2. Configuration

**Common settings** apply to both single and batch validation.

In [18]:
# === Common Settings ===

# Comparison settings
RELATIVE_TOLERANCE = 1e-4     # For floating-point comparison (0.01% relative difference)
DECIMAL_PLACES = 0            # Values must match when rounded to this many decimal places
MAX_DIFF = 1                  # Maximum allowed difference between rounded values
MAX_DIFFERENCES_TO_SHOW = 10  # Limit output for large datasets

# Note: PLT comparison is automatically included for HD analyses only (based on engineType)
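
The sketch below shows one way the three comparison settings could interact. It assumes a value passes if it is within the relative tolerance, or if the two values rounded to `DECIMAL_PLACES` differ by at most `MAX_DIFF`. That combination rule and the function name `values_match` are assumptions for illustration; the validator's actual logic may differ.

```python
import math


def values_match(prod: float, test: float,
                 relative_tolerance: float = 1e-4,
                 decimal_places: int = 0,
                 max_diff: float = 1) -> bool:
    """Assumed rule: pass on relative tolerance OR on the rounded difference."""
    # Relative check: |prod - test| <= relative_tolerance * max(|prod|, |test|)
    if math.isclose(prod, test, rel_tol=relative_tolerance):
        return True
    # Fallback check on rounded values, useful for small numbers near zero
    rounded_gap = abs(round(prod, decimal_places) - round(test, decimal_places))
    return rounded_gap <= max_diff


print(values_match(1_000_000.0, 1_000_050.0))  # within 0.01% relative difference
print(values_match(10.4, 11.4))                # rounded values differ by 1
print(values_match(100.0, 105.0))              # fails both checks
```

Under this assumed rule, the relative check handles large loss values while the rounded check avoids failing on sub-unit noise in small values.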

## 3A. Single Validation (One Pair, One Perspective)

Use this section to validate a single production/test pair for a specific perspective.

Enter the **appAnalysisId** values. These are the IDs shown in the Moody's RiskModeler UI (e.g., 35810).

In [26]:
# Single validation configuration
PRODUCTION_APP_ANALYSIS_ID = 1705  # Replace with your production analysis ID
TEST_APP_ANALYSIS_ID = 5296        # Replace with your test analysis ID
PERSPECTIVE_CODE = 'RL'            # 'GR' (Gross), 'GU' (Ground-Up), 'RL' (Reinsurance Layer)

# Run single validation
result = validator.validate(
    production_app_analysis_id=PRODUCTION_APP_ANALYSIS_ID,
    test_app_analysis_id=TEST_APP_ANALYSIS_ID,
    perspective_code=PERSPECTIVE_CODE,
    relative_tolerance=RELATIVE_TOLERANCE,
    decimal_places=DECIMAL_PLACES,
    max_diff=MAX_DIFF,
)

print_validation_summary(result)

ANALYSIS VALIDATION RESULTS

Production Analysis ID: 1705
Test Analysis ID:       5296
Perspective:            RL

------------------------------------------------------------
Endpoint Results:
------------------------------------------------------------
  [OK] Settings: PASS
  [OK] Statistics: PASS
  [OK] EP Metrics: PASS
  [OK] ELT: PASS

OVERALL: PASS


### Single Validation Details (if failed)

In [27]:
print_validation_details(result, max_differences=MAX_DIFFERENCES_TO_SHOW)


No differences found - all endpoints match!


## 3B. Lookup Test Analysis IDs (Optional)

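Populate `test_app_analysis_id` by looking up each analysis from the `test_analysis_name` and `test_edm_name` columns. Rows that are missing either value, or that already have a `test_app_analysis_id`, are skipped.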

In [None]:
from helpers.irp_integration import IRPClient
import pandas as pd

# === Configuration ===
LOOKUP_FILE_PATH = 'analysis_results/big_config_ALL_Inc_USIF_results.xlsx'
SAVE_OUTPUT = True  # Overwrites LOOKUP_FILE_PATH; set to False to leave the file unchanged

# Load and process
df = pd.read_excel(LOOKUP_FILE_PATH)
if 'test_app_analysis_id' not in df.columns:
    df['test_app_analysis_id'] = pd.NA

client = IRPClient()
found, errors = 0, 0

for idx, row in df.iterrows():
    analysis_name, edm_name = row.get('test_analysis_name'), row.get('test_edm_name')
    # Skip rows that lack a name or EDM, or that already have an ID
    if pd.isna(analysis_name) or pd.isna(edm_name) or pd.notna(row.get('test_app_analysis_id')):
        continue
    try:
        analysis = client.analysis.get_analysis_by_name(str(analysis_name).strip(), str(edm_name).strip())
        if app_id := analysis.get('appAnalysisId'):
            df.at[idx, 'test_app_analysis_id'] = int(app_id)
            found += 1
            print(f"[OK] {analysis_name} -> {app_id}")
        else:
            # Count a response without an ID as an error instead of dropping it silently
            errors += 1
            print(f"[NOT FOUND] {analysis_name}: response has no appAnalysisId")
    except Exception as e:
        errors += 1
        print(f"[ERROR] {analysis_name}: {e}")

print(f"\nFound: {found}, Errors: {errors}")

if SAVE_OUTPUT:
    df.to_excel(LOOKUP_FILE_PATH, index=False)
    print(f"Saved to: {LOOKUP_FILE_PATH}")

## 3B-2. Lookup Group IDs by Name (Optional)

Populate `production_app_analysis_id` and `test_app_analysis_id` by looking up **analysis groups** using name columns.

Groups are globally unique by name, so no EDM is required.

In [19]:
from helpers.irp_integration import IRPClient
import pandas as pd

# === Configuration ===
GROUP_LOOKUP_FILE_PATH = 'analysis_results/big_config_ALL_Inc_USIF_GROUPING_results.xlsx'
SAVE_OUTPUT = True  # Overwrites GROUP_LOOKUP_FILE_PATH; set to False to leave the file unchanged

# Load and process
df = pd.read_excel(GROUP_LOOKUP_FILE_PATH)
if 'production_app_analysis_id' not in df.columns:
    df['production_app_analysis_id'] = pd.NA
if 'test_app_analysis_id' not in df.columns:
    df['test_app_analysis_id'] = pd.NA

client = IRPClient()
found, errors = 0, 0

def lookup_group_id(name: str) -> int | None:
    """Look up a group by name and return its appAnalysisId."""
    results = client.analysis.search_analyses(filter=f'analysisName = "{name.strip()}"')
    if len(results) == 0:
        print(f"[NOT FOUND] {name}")
        return None
    elif len(results) > 1:
        print(f"[DUPLICATE] {name} - found {len(results)} matches")
        return None
    return results[0].get('appAnalysisId')

for idx, row in df.iterrows():
    # Lookup production group
    prod_name = row.get('production_analysis_name')
    if pd.notna(prod_name) and pd.isna(row.get('production_app_analysis_id')):
        try:
            if app_id := lookup_group_id(str(prod_name)):
                df.at[idx, 'production_app_analysis_id'] = int(app_id)
                found += 1
                print(f"[OK] prod: {prod_name} -> {app_id}")
            else:
                errors += 1
        except Exception as e:
            errors += 1
            print(f"[ERROR] prod: {prod_name}: {e}")

    # Lookup test group
    test_name = row.get('test_analysis_name')
    if pd.notna(test_name) and pd.isna(row.get('test_app_analysis_id')):
        try:
            if app_id := lookup_group_id(str(test_name)):
                df.at[idx, 'test_app_analysis_id'] = int(app_id)
                found += 1
                print(f"[OK] test: {test_name} -> {app_id}")
            else:
                errors += 1
        except Exception as e:
            errors += 1
            print(f"[ERROR] test: {test_name}: {e}")

print(f"\nFound: {found}, Errors: {errors}")

if SAVE_OUTPUT:
    df.to_excel(GROUP_LOOKUP_FILE_PATH, index=False)
    print(f"Saved to: {GROUP_LOOKUP_FILE_PATH}")

[NOT FOUND] 202503_USOW_Geico_HIP2_Group
[OK] prod: 202503_USHU_LT_Manufactured_Housing_Group -> 1705
[OK] prod: 202503_USHU_NT_Manufactured_Housing_Group -> 1701
[OK] prod: 202503_USHU_LT_Renters_Group -> 1706
[OK] prod: 202503_USHU_NT_Renters_Group -> 1702
[OK] prod: 202503_USHU_LT_Other_Group -> 1707
[OK] prod: 202503_USHU_NT_Other_Group -> 1703
[OK] prod: 202503_USHU_LT_LendarPlaced_Group -> 1708
[OK] prod: 202503_USHU_NT_LendarPlaced_Group -> 1704
[OK] prod: 202503_NT_USHU_Group -> 1710
[OK] prod: 202503_LT_USHU_Group -> 1709
[OK] prod: 202503_USEQ_Group -> 1711
[OK] prod: 202503_USFF_Group -> 1712
[OK] prod: 202503_USWF_Group -> 1713
[NOT FOUND] 202503_USOW_Group
[OK] prod: 202503_LenderPlaced_LT_All_PerilxWFIF_Group -> 1715
[OK] prod: 202503_LenderPlaced_LT_All_Peril_Group -> 1716
[OK] prod: 202503_LenderPlacedxCB_LT_All_PerilxWFIF_Group -> 1719
[OK] prod: 202503_LenderPlacedxCB_LT_All_Peril_Group -> 1720
[OK] prod: 202503_Geico_HIP2_LT_All_Peril_Group -> 1723
[OK] prod: 202503_

## 3C. Batch Validation (Multiple Pairs, All Perspectives)

Use this section to validate multiple production/test pairs from a CSV or XLSX file.
All three perspectives (GR, GU, RL) are validated automatically.
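
The loop below is a minimal sketch of how a batch run might visit every pair for every perspective and report progress in the `[current/total] name - perspective` shape used by the callback in this section. `run_batch` and `validate_one` are hypothetical stand-ins, not part of the validator, and the stub always returns `"PASS"`.

```python
PERSPECTIVES = ("GR", "GU", "RL")


def run_batch(pairs, validate_one, progress_callback=None):
    """Validate every pair for every perspective and collect the results."""
    results = []
    total = len(pairs)
    for current, pair in enumerate(pairs, start=1):
        for perspective in PERSPECTIVES:
            # The counter tracks pairs, so it repeats across the three perspectives
            if progress_callback is not None:
                progress_callback(current, total, pair["name"], perspective)
            outcome = validate_one(pair, perspective)
            results.append((pair["name"], perspective, outcome))
    return results


def show_progress(current, total, name, perspective):
    print(f"[{current}/{total}] {name} - {perspective}")


# Stub validator for illustration only
pairs = [{"name": "USFL_Other"}, {"name": "USEQ_Primary"}]
results = run_batch(pairs, lambda pair, perspective: "PASS", show_progress)
```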

**File format (CSV or XLSX):**
```
production_app_analysis_id,test_app_analysis_id,name
1575,4342,USFL_Other
1576,4343,USEQ_Primary
1577,4344,USHU_Full
```

Columns:
- `production_app_analysis_id` (required): App analysis ID for the production (baseline) run
- `test_app_analysis_id` (required): App analysis ID for the test run
- `name` (optional): Label for this analysis pair
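
As an illustration of how rows without both IDs could be filtered out before validation, here is a small standard-library sketch. The function `load_pairs`, the row-numbering convention, and the sample with a blank test ID are assumptions made for this example; the validator's own file loader may behave differently.

```python
import csv
import io

REQUIRED = ("production_app_analysis_id", "test_app_analysis_id")


def load_pairs(csv_text: str):
    """Parse CSV text into (pairs, skipped_rows); rows missing an ID are skipped."""
    pairs, skipped = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Data rows are numbered from 2 because row 1 is the header (an assumption)
    for row_number, row in enumerate(reader, start=2):
        if any(not (row.get(col) or "").strip() for col in REQUIRED):
            skipped.append(row_number)
            continue
        pairs.append({
            "production_app_analysis_id": int(row["production_app_analysis_id"]),
            "test_app_analysis_id": int(row["test_app_analysis_id"]),
            "name": (row.get("name") or "").strip(),
        })
    return pairs, skipped


# Sample based on the format above, with one test ID blanked for illustration
sample = """production_app_analysis_id,test_app_analysis_id,name
1575,4342,USFL_Other
1576,,USEQ_Primary
1577,4344,USHU_Full
"""
pairs, skipped = load_pairs(sample)
print(f"Loaded {len(pairs)} pair(s), skipped rows: {skipped}")
```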

In [28]:
# Path to CSV or XLSX file with analysis pairs
FILE_PATH = 'analysis_results/big_config_ALL_Inc_USIF_GROUPING_results.xlsx'  # Replace with your file path (.csv or .xlsx)

# Progress callback (optional)
def show_progress(current, total, name, perspective):
    print(f"[{current}/{total}] {name} - {perspective}")

# Run batch validation (validates all perspectives: GR, GU, RL)
# PLT is automatically included for HD analyses only
batch_result = validator.validate_batch_from_file(
    file_path=FILE_PATH,
    relative_tolerance=RELATIVE_TOLERANCE,
    decimal_places=DECIMAL_PLACES,
    max_diff=MAX_DIFF,
    progress_callback=show_progress,
)

print()
print_batch_summary(batch_result)

Skipped 34 incomplete row(s): [2, 16, 72, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 110, 112, 114]
[1/80] Testing_All_202503_USHU_LT_Manufactured_Housing_Group - GR
[1/80] Testing_All_202503_USHU_LT_Manufactured_Housing_Group - GU
[1/80] Testing_All_202503_USHU_LT_Manufactured_Housing_Group - RL
[2/80] Testing_All_202503_USHU_NT_Manufactured_Housing_Group - GR
[2/80] Testing_All_202503_USHU_NT_Manufactured_Housing_Group - GU
[2/80] Testing_All_202503_USHU_NT_Manufactured_Housing_Group - RL
[3/80] Testing_All_202503_USHU_LT_Renters_Group - GR
[3/80] Testing_All_202503_USHU_LT_Renters_Group - GU
[3/80] Testing_All_202503_USHU_LT_Renters_Group - RL
[4/80] Testing_All_202503_USHU_NT_Renters_Group - GR
[4/80] Testing_All_202503_USHU_NT_Renters_Group - GU
[4/80] Testing_All_202503_USHU_NT_Renters_Group - RL
[5/80] Testing_All_202503_USHU_LT_Other_Group - GR
[5/80] Testing_All_202503_USHU_LT_Other_Group - GU
[5/80]

### Batch Results Summary Table

In [29]:
# Display results as a DataFrame
df = batch_results_to_dataframe(batch_result)
df

Unnamed: 0,test_analysis_name,prod_id,test_id,status,settings,GR_stats,GR_ep,GR_elt,GR_plt,GU_stats,GU_ep,GU_elt,GU_plt,RL_stats,RL_ep,RL_elt,RL_plt,error
0,Testing_All_202503_USHU_LT_Manufactured_Housin...,1705,5296,PASS,PASS,PASS,PASS,PASS,,PASS,PASS,PASS,,PASS,PASS,PASS,,
1,Testing_All_202503_USHU_NT_Manufactured_Housin...,1701,5297,PASS,PASS,PASS,PASS,PASS,,PASS,PASS,PASS,,PASS,PASS,PASS,,
2,Testing_All_202503_USHU_LT_Renters_Group,1706,5298,PASS,PASS,PASS,PASS,PASS,,PASS,PASS,PASS,,PASS,PASS,PASS,,
3,Testing_All_202503_USHU_NT_Renters_Group,1702,5299,PASS,PASS,PASS,PASS,PASS,,PASS,PASS,PASS,,PASS,PASS,PASS,,
4,Testing_All_202503_USHU_LT_Other_Group,1707,5300,PASS,PASS,PASS,PASS,PASS,,PASS,PASS,PASS,,PASS,PASS,PASS,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
75,Testing_All_202503_USFL_NT_TotalFlood_Group,1831,5427,FAIL,PASS,FAIL (1 diff)\nLargest: 1.4M\nPct Diff: -4.8%,FAIL (4 diff)\nLargest: 3.3M\nPct Diff: -4.3%,FAIL (500 diff)\nLargest: 46.4M\nPct Diff: -14.9%,,FAIL (1 diff)\nLargest: 2.5M\nPct Diff: -5.6%,FAIL (4 diff)\nLargest: 5.8M\nPct Diff: -4.9%,FAIL (500 diff)\nLargest: 269.7M\nPct Diff: -1...,,FAIL (1 diff)\nLargest: 1.3M\nPct Diff: -4.8%,FAIL (4 diff)\nLargest: 3.2M\nPct Diff: -4.3%,FAIL (500 diff)\nLargest: 45.9M\nPct Diff: -15.6%,,
76,Testing_All_202503_LT_USAPxWFIF_Group,1847,5456,FAIL,PASS,FAIL (1 diff)\nLargest: 26.1K\nPct Diff: -0.0%,FAIL (4 diff)\nLargest: 57.3K\nPct Diff: -0.0%,FAIL (470 diff)\nLargest: 3.5M\nPct Diff: -0.1%,,FAIL (1 diff)\nLargest: 34.9K\nPct Diff: -0.0%,FAIL (4 diff)\nLargest: 82.6K\nPct Diff: -0.0%,FAIL (468 diff)\nLargest: 4.7M\nPct Diff: -0.1%,,FAIL (1 diff)\nLargest: 28.0K\nPct Diff: -0.0%,FAIL (4 diff)\nLargest: 64.1K\nPct Diff: -0.0%,FAIL (469 diff)\nLargest: 3.5M\nPct Diff: -0.1%,,
77,Testing_All_202503_NT_USAPxWFIF_Group,1855,5458,FAIL,PASS,FAIL (1 diff)\nLargest: 497.4K\nPct Diff: -0.2%,FAIL (4 diff)\nLargest: 1.5M\nPct Diff: -0.5%,FAIL (500 diff)\nLargest: 14.7M\nPct Diff: +0.6%,,FAIL (1 diff)\nLargest: 1.1M\nPct Diff: -0.3%,FAIL (4 diff)\nLargest: 2.6M\nPct Diff: -0.3%,FAIL (500 diff)\nLargest: 169.7M\nPct Diff: -0.1%,,FAIL (1 diff)\nLargest: 472.4K\nPct Diff: -0.2%,FAIL (4 diff)\nLargest: 1.5M\nPct Diff: -0.3%,FAIL (500 diff)\nLargest: 14.7M\nPct Diff: +0.6%,,
78,Testing_All_202503_LT_USAPxWFIFxCB_Group,1851,5460,FAIL,PASS,FAIL (1 diff)\nLargest: 26.1K\nPct Diff: -0.0%,FAIL (4 diff)\nLargest: 1.6M\nPct Diff: +0.3%,FAIL (470 diff)\nLargest: 3.5M\nPct Diff: -0.1%,,FAIL (1 diff)\nLargest: 34.9K\nPct Diff: -0.0%,FAIL (4 diff)\nLargest: 82.8K\nPct Diff: -0.0%,FAIL (468 diff)\nLargest: 4.7M\nPct Diff: -0.1%,,FAIL (1 diff)\nLargest: 28.0K\nPct Diff: -0.0%,FAIL (4 diff)\nLargest: 63.5K\nPct Diff: -0.0%,FAIL (469 diff)\nLargest: 3.5M\nPct Diff: -0.1%,,
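
Failure cells in the table pack three values into one string (`FAIL (N diff)`, `Largest`, `Pct Diff`). The sketch below pulls them apart with a regular expression so failures can be sorted or filtered. The helper `parse_cell` is hypothetical, and the pattern is inferred only from the cell text displayed above.

```python
import re

# Pattern inferred from the displayed cells; lines are separated by newlines
CELL_PATTERN = re.compile(
    r"FAIL \((?P<count>\d+) diff\)\s*"
    r"Largest: (?P<largest>\S+)\s*"
    r"Pct Diff: (?P<pct>[+-]?\d+(?:\.\d+)?)%"
)


def parse_cell(cell):
    """Return a dict for a FAIL cell, or None for PASS, empty, or truncated cells."""
    match = CELL_PATTERN.search(cell or "")
    if match is None:
        return None
    return {
        "diff_count": int(match["count"]),
        "largest": match["largest"],   # kept as text, e.g. "46.4M"
        "pct_diff": float(match["pct"]),
    }


cell = "FAIL (500 diff)\nLargest: 46.4M\nPct Diff: -14.9%"
print(parse_cell(cell))
```

Cells that the notebook display has truncated with `...` do not match the pattern and come back as `None`, so parse the full DataFrame values rather than the rendered text.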


### Export Results (Optional)

Export the summary to CSV and detailed failures to JSON.

In [30]:
# Export summary to CSV (saved to validation_outputs folder)
summary_path = export_batch_summary_to_csv(batch_result)
print(f"Summary exported to: {summary_path}")

# Export detailed failures to JSON (only if there are failures)
if batch_result.failed_count > 0:
    failures_path = export_batch_failures_to_json(
        batch_result,
        max_differences=MAX_DIFFERENCES_TO_SHOW
    )
    print(f"Failure details exported to: {failures_path}")
else:
    print("No failures to export.")

Summary exported to: validation_outputs/validation_summary.csv
Failure details exported to: validation_outputs/validation_failures.json
