


# Early Testing Layer — Preventing Silent Risk Failure

## What This Test Module Does (Plain English)

This test file answers one crucial question *before* the agent ever reasons about risk:

> **“Can we trust the data entering the system?”**

Instead of discovering problems halfway through scoring or escalation, you:

* test each data source in isolation
* test lookup logic
* test filtering behavior
* test the orchestration node itself

Validating inputs at the boundary, before any scoring logic runs, is a common pattern in production risk platforms.

---

## 1. Why Testing Data *Before* Nodes Is the Right Move

Most agent systems:

* wire everything together
* run a full workflow
* hope nothing breaks

You’ve inverted that.

You validate:

1. raw data
2. structural assumptions
3. invariants (like weights summing to 1.0)
4. deterministic lookup behavior
5. failure conditions

This dramatically reduces:

* debugging time
* false confidence
* cascading failures
* post-hoc rationalization

---

## 2. Each Test Encodes a **Risk Assumption**

These tests aren’t arbitrary — each one encodes a **non-negotiable system truth**.

### Example: Risk Domain Weights

```python
total_weight = sum(d.get("weight", 0) for d in risk_domains)
assert abs(total_weight - 1.0) < 0.01
```

This enforces:

* risk scoring is normalized
* no domain's weight can grow without another's shrinking
* future data edits that break the scoring math fail immediately

That’s not a technical test — that’s **risk governance**, expressed as code.
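
To see the invariant catch a bad edit, here is a minimal sketch. The `check_weights` helper is hypothetical and not part of the project's utilities; the sample weights are the ones printed in the test results, and the tolerance mirrors the 0.01 used above.

```python
import math


def check_weights(risk_domains, tolerance=0.01):
    """Return the total weight; raise AssertionError if it is not close to 1.0."""
    # math.fsum reduces floating-point rounding error when adding many weights.
    total = math.fsum(d.get("weight", 0) for d in risk_domains)
    if abs(total - 1.0) >= tolerance:
        raise AssertionError(f"Weights should sum to 1.0, got {total}")
    return total


valid = [
    {"risk_domain": "Information Security", "weight": 0.35},
    {"risk_domain": "Regulatory Compliance", "weight": 0.25},
    {"risk_domain": "Operational Resilience", "weight": 0.2},
    {"risk_domain": "Reputational Risk", "weight": 0.2},
]
check_weights(valid)  # passes silently

# Simulate a data edit: one weight is raised without rebalancing the others.
edited = [dict(d) for d in valid]
edited[0]["weight"] = 0.45

try:
    check_weights(edited)
    caught = False
except AssertionError as exc:
    caught = True
    print(f"Blocked bad edit: {exc}")
```

Raising explicitly instead of using a bare `assert` keeps the check active even when Python runs with `-O`, which strips `assert` statements.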

---

## 3. Structural Assertions Prevent Semantic Drift

Assertions like these check that required fields are present:

```python
assert "vendor_id" in third_parties[0]
assert "risk_domain" in risk_domains[0]
assert "severity" in external_signals[0]
```

They inspect the first record of each source and protect you from:

* schema drift
* inconsistent data entry
* silent format changes
* “almost compatible” data

This matters because **AI systems degrade silently when schemas drift**.
You’ve explicitly blocked that failure mode.
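
Since the assertions above look only at index `[0]`, a drifted record further down the file would slip through. One possible extension, sketched below, checks every record. The `missing_keys` helper and the sample signal records (including the `SIG_...` IDs) are made up for illustration and are not taken from the project.

```python
def missing_keys(records, required):
    """Return (index, [missing keys]) for every record lacking a required key."""
    problems = []
    for index, record in enumerate(records):
        missing = [key for key in required if key not in record]
        if missing:
            problems.append((index, missing))
    return problems


# Hypothetical sample data: the second record has lost its "severity" field.
signals = [
    {"signal_id": "SIG_001", "vendor_id": "VEND_001",
     "signal_type": "security_incident", "severity": "high"},
    {"signal_id": "SIG_002", "vendor_id": "VEND_002",
     "signal_type": "security_incident"},
]

required = ["signal_id", "vendor_id", "signal_type", "severity"]
problems = missing_keys(signals, required)
print(problems)  # → [(1, ['severity'])]
```

Returning the full list of problems, rather than stopping at the first, makes it easier to fix a drifted file in one pass.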

---

## 4. Lookup Tests Guarantee Determinism

Your lookup tests do something subtle but powerful:

```python
assert vendor_lookup[first_vendor_id]["vendor_name"] == third_parties[0]["vendor_name"]
```

Combined with the length check (`len(vendor_lookup) == len(third_parties)`), this guards against:

* no transformation errors
* no data loss
* no ID collisions
* no ordering artifacts

Determinism is a prerequisite for accountability.
You can’t explain a decision if lookups are unreliable.
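
The length check catches ID collisions only after the lookup is built. As a sketch of an alternative, the builder itself can refuse duplicates. `build_lookup_strict` is a hypothetical helper, not the project's `build_vendor_lookup` (whose implementation is not shown here), and the second vendor record is invented for the example.

```python
def build_lookup_strict(records, key):
    """Index records by `key`; raise ValueError on a missing or duplicate key."""
    lookup = {}
    for record in records:
        if key not in record:
            raise ValueError(f"Record missing {key!r}: {record}")
        value = record[key]
        if value in lookup:
            # A plain dict comprehension would silently keep the last record.
            raise ValueError(f"Duplicate {key}: {value}")
        lookup[value] = record
    return lookup


vendors = [
    {"vendor_id": "VEND_001", "vendor_name": "CloudOps Solutions"},
    {"vendor_id": "VEND_002", "vendor_name": "Example Vendor B"},  # invented
]
lookup = build_lookup_strict(vendors, "vendor_id")

# Every record should round-trip through the lookup, not only the first one.
for vendor in vendors:
    assert lookup[vendor["vendor_id"]] is vendor
```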

---

## 5. Filter Tests Prevent Partial or Misleading Runs

This test is critical:

```python
filter_vendors_by_id(third_parties, "NON_EXISTENT")
```

The test asserts that this call **must fail** with a `ValueError`; quietly returning an empty list would count as a test failure.

Why this matters:

* prevents accidental partial assessments
* prevents misleading KPI results
* prevents “we thought we assessed vendor X”

This is how you avoid false assurance — one of the biggest operational risks in governance systems.
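
The behaviour the test expects can be sketched as follows. This is an assumption inferred from the test itself (`None` returns everything, an unknown ID raises `ValueError`); the real `filter_vendors_by_id` is not shown in this notebook, so `filter_by_id_sketch` and `raises_value_error` are hypothetical stand-ins.

```python
def filter_by_id_sketch(vendors, vendor_id=None):
    """Return all vendors when vendor_id is None, else the matching vendor(s)."""
    if vendor_id is None:
        return list(vendors)
    matches = [v for v in vendors if v.get("vendor_id") == vendor_id]
    if not matches:
        # Failing loudly avoids an "assessment" that covered zero vendors.
        raise ValueError(f"Vendor not found: {vendor_id}")
    return matches


def raises_value_error(func, *args):
    """Return True if calling func(*args) raises ValueError, else False."""
    try:
        func(*args)
    except ValueError:
        return True
    return False


vendors = [{"vendor_id": "VEND_001"}, {"vendor_id": "VEND_002"}]
print(len(filter_by_id_sketch(vendors, None)))        # → 2
print(filter_by_id_sketch(vendors, "VEND_001"))       # → [{'vendor_id': 'VEND_001'}]
print(raises_value_error(filter_by_id_sketch, vendors, "NON_EXISTENT"))  # → True
```

The key design choice is that "no match" is an error rather than an empty list, so a typo in a vendor ID cannot produce a clean-looking run.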

---

## 6. Node-Level Test: Contract Validation

### `test_data_loading_node()`

This test verifies the **node contract**, not just functionality.

You assert that:

* all required outputs exist
* errors are empty
* lookup tables are built
* no partial state is returned

This ensures that downstream nodes can rely on:

* stable state shape
* complete data
* predictable behavior

This is what allows your orchestration to scale safely.
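
One way to make the contract reusable across tests is to collect the required keys in one place. The sketch below is an illustration under that assumption: `REQUIRED_OUTPUTS` and `contract_violations` are hypothetical names, while the key names themselves are the ones asserted in `test_data_loading_node()`.

```python
# Keys that test_data_loading_node() expects in the node's result.
REQUIRED_OUTPUTS = (
    "third_parties",
    "risk_domains",
    "vendor_controls",
    "external_signals",
    "vendor_performance",
    "assessment_history",
    "vendor_lookup",
    "risk_domain_lookup",
)


def contract_violations(result):
    """Return a list of human-readable contract violations (empty if none)."""
    violations = []
    if "errors" not in result:
        violations.append("missing field: errors")
    elif result["errors"]:
        violations.append(f"errors reported: {result['errors']}")
    for key in REQUIRED_OUTPUTS:
        if key not in result:
            violations.append(f"missing field: {key}")
    return violations


# A partial state, as a broken node might return it.
partial = {"errors": [], "third_parties": [], "risk_domains": []}
for violation in contract_violations(partial):
    print(violation)
```

A downstream node's test could call the same helper on its input state, so both sides of the hand-off are checked against one definition.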

---

## 7. Why This Test Design Is MVP-Perfect

You deliberately:

* avoided pytest complexity
* used direct assertions
* printed human-readable output
* failed fast and loudly

This makes the tests:

* easy to run
* easy to understand
* easy to debug
* easy to demonstrate in interviews or demos

It also means **AI coding assistants such as Cursor and ChatGPT can diagnose failures quickly**, because every assertion carries a plain-language message. That is exactly what you want during rapid iteration.

---

## 8. Why Executives Would Trust a System Built This Way

Executives don’t read code — but they understand *process*.

This test suite proves that:

* data integrity is enforced
* assumptions are explicit
* failure modes are controlled
* quality gates exist before automation

You can confidently say:

> “This system will not produce results if the data fails validation.”

That statement alone separates serious platforms from toy agents.

---

## 9. Hidden Bonus: This Enables Safe LLM Integration Later

Because your foundations are:

* validated
* deterministic
* tested

You can later add:

* LLM summaries
* LLM rationales
* natural language explanations

…without risking:

* hallucinated inputs
* schema mismatches
* hidden data issues

The LLM becomes a **presentation layer**, not a decision engine.

---


## Bottom Line

This test suite is quiet, strict, and unglamorous — and that’s exactly why it’s excellent.

It transforms your agent from:

> “An AI that seems to work”

into:

> **“A system that refuses to lie.”**




In [None]:
"""Test data loading utilities for Third-Party Risk Orchestrator

Run this file to test the data loading utilities independently.
Following MVP-first approach: Test utilities before nodes.
"""

import sys
from pathlib import Path

# Add project root to path
project_root = Path(__file__).parent
sys.path.insert(0, str(project_root))

from agents.third_party_risk_orchestrator.utilities.data_loading import (
    load_third_parties,
    load_risk_domains,
    load_vendor_controls,
    load_external_signals,
    load_vendor_performance,
    load_assessment_history,
    build_vendor_lookup,
    build_risk_domain_lookup,
    filter_vendors_by_id
)
from config import ThirdPartyRiskOrchestratorConfig


def test_load_third_parties():
    """Test loading third parties"""
    print("Testing load_third_parties...")
    config = ThirdPartyRiskOrchestratorConfig()
    third_parties = load_third_parties(config.data_dir, config.third_parties_file)

    assert len(third_parties) > 0, "Should load at least one vendor"
    assert "vendor_id" in third_parties[0], "Vendor should have vendor_id"
    assert "vendor_name" in third_parties[0], "Vendor should have vendor_name"

    print(f"✅ Loaded {len(third_parties)} vendors")
    print(f"   Example: {third_parties[0]['vendor_name']} ({third_parties[0]['vendor_id']})")
    return third_parties


def test_load_risk_domains():
    """Test loading risk domains"""
    print("\nTesting load_risk_domains...")
    config = ThirdPartyRiskOrchestratorConfig()
    risk_domains = load_risk_domains(config.data_dir, config.risk_domains_file)

    assert len(risk_domains) > 0, "Should load at least one risk domain"
    assert "risk_domain" in risk_domains[0], "Domain should have risk_domain"
    assert "weight" in risk_domains[0], "Domain should have weight"

    # Verify weights sum to approximately 1.0
    total_weight = sum(d.get("weight", 0) for d in risk_domains)
    assert abs(total_weight - 1.0) < 0.01, f"Weights should sum to 1.0, got {total_weight}"

    print(f"✅ Loaded {len(risk_domains)} risk domains")
    print(f"   Total weight: {total_weight:.2f}")
    for domain in risk_domains:
        print(f"   - {domain['risk_domain']}: {domain['weight']}")
    return risk_domains


def test_load_vendor_controls():
    """Test loading vendor controls"""
    print("\nTesting load_vendor_controls...")
    config = ThirdPartyRiskOrchestratorConfig()
    vendor_controls = load_vendor_controls(config.data_dir, config.vendor_controls_file)

    assert len(vendor_controls) > 0, "Should load at least one control"
    assert "vendor_id" in vendor_controls[0], "Control should have vendor_id"
    assert "risk_domain" in vendor_controls[0], "Control should have risk_domain"
    assert "control" in vendor_controls[0], "Control should have control name"
    assert "status" in vendor_controls[0], "Control should have status"

    print(f"✅ Loaded {len(vendor_controls)} vendor controls")
    print(f"   Example: {vendor_controls[0]['vendor_id']} - {vendor_controls[0]['control']} ({vendor_controls[0]['status']})")
    return vendor_controls


def test_load_external_signals():
    """Test loading external signals"""
    print("\nTesting load_external_signals...")
    config = ThirdPartyRiskOrchestratorConfig()
    external_signals = load_external_signals(config.data_dir, config.external_signals_file)

    assert len(external_signals) > 0, "Should load at least one signal"
    assert "signal_id" in external_signals[0], "Signal should have signal_id"
    assert "vendor_id" in external_signals[0], "Signal should have vendor_id"
    assert "signal_type" in external_signals[0], "Signal should have signal_type"
    assert "severity" in external_signals[0], "Signal should have severity"

    print(f"✅ Loaded {len(external_signals)} external signals")
    print(f"   Example: {external_signals[0]['signal_type']} ({external_signals[0]['severity']}) for {external_signals[0]['vendor_id']}")
    return external_signals


def test_load_vendor_performance():
    """Test loading vendor performance"""
    print("\nTesting load_vendor_performance...")
    config = ThirdPartyRiskOrchestratorConfig()
    vendor_performance = load_vendor_performance(config.data_dir, config.vendor_performance_file)

    assert len(vendor_performance) > 0, "Should load at least one performance record"
    assert "vendor_id" in vendor_performance[0], "Performance should have vendor_id"

    print(f"✅ Loaded {len(vendor_performance)} vendor performance records")
    return vendor_performance


def test_load_assessment_history():
    """Test loading assessment history"""
    print("\nTesting load_assessment_history...")
    config = ThirdPartyRiskOrchestratorConfig()
    assessment_history = load_assessment_history(config.data_dir, config.assessment_history_file)

    assert len(assessment_history) > 0, "Should load at least one historical assessment"
    assert "vendor_id" in assessment_history[0], "Assessment should have vendor_id"
    assert "assessment_date" in assessment_history[0], "Assessment should have assessment_date"
    assert "risk_score" in assessment_history[0], "Assessment should have risk_score"

    print(f"✅ Loaded {len(assessment_history)} historical assessments")
    return assessment_history


def test_build_vendor_lookup(third_parties):
    """Test building vendor lookup"""
    print("\nTesting build_vendor_lookup...")
    vendor_lookup = build_vendor_lookup(third_parties)

    assert len(vendor_lookup) == len(third_parties), "Lookup should contain all vendors"

    # Test lookup
    first_vendor_id = third_parties[0]["vendor_id"]
    assert first_vendor_id in vendor_lookup, "Lookup should contain vendor_id"
    assert vendor_lookup[first_vendor_id]["vendor_name"] == third_parties[0]["vendor_name"], "Lookup should match original data"

    print(f"✅ Built vendor lookup with {len(vendor_lookup)} vendors")
    print(f"   Test lookup: {vendor_lookup[first_vendor_id]['vendor_name']}")
    return vendor_lookup


def test_build_risk_domain_lookup(risk_domains):
    """Test building risk domain lookup"""
    print("\nTesting build_risk_domain_lookup...")
    risk_domain_lookup = build_risk_domain_lookup(risk_domains)

    assert len(risk_domain_lookup) == len(risk_domains), "Lookup should contain all domains"

    # Test lookup
    first_domain_name = risk_domains[0]["risk_domain"]
    assert first_domain_name in risk_domain_lookup, "Lookup should contain domain name"
    assert risk_domain_lookup[first_domain_name]["weight"] == risk_domains[0]["weight"], "Lookup should match original data"

    print(f"✅ Built risk domain lookup with {len(risk_domain_lookup)} domains")
    print(f"   Test lookup: {risk_domain_lookup[first_domain_name]['risk_domain']} (weight: {risk_domain_lookup[first_domain_name]['weight']})")
    return risk_domain_lookup


def test_filter_vendors_by_id(third_parties):
    """Test filtering vendors by ID"""
    print("\nTesting filter_vendors_by_id...")

    # Test filtering to specific vendor
    if len(third_parties) > 0:
        test_vendor_id = third_parties[0]["vendor_id"]
        filtered = filter_vendors_by_id(third_parties, test_vendor_id)
        assert len(filtered) == 1, "Should return exactly one vendor"
        assert filtered[0]["vendor_id"] == test_vendor_id, "Should return correct vendor"
        print(f"✅ Filtered to vendor: {test_vendor_id}")

    # Test returning all vendors
    all_vendors = filter_vendors_by_id(third_parties, None)
    assert len(all_vendors) == len(third_parties), "Should return all vendors when vendor_id is None"
    print(f"✅ Returned all {len(all_vendors)} vendors when vendor_id is None")

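    # Test non-existent vendor: the call must raise, never return a partial result
    try:
        filter_vendors_by_id(third_parties, "NON_EXISTENT")
    except ValueError:
        print("✅ Correctly raised ValueError for non-existent vendor")
    else:
        # Raise explicitly: a bare `assert False` is stripped under `python -O`
        raise AssertionError("Should raise ValueError for non-existent vendor")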


def test_data_loading_node():
    """Test the data loading node"""
    print("\n" + "="*60)
    print("Testing data_loading_node...")
    print("="*60)

    from agents.third_party_risk_orchestrator.nodes import data_loading_node

    # Test with all vendors
    state = {
        "vendor_id": None,
        "errors": []
    }

    result = data_loading_node(state)

    assert "errors" in result, "Result should have errors field"
    assert len(result.get("errors", [])) == 0, f"Should have no errors, got: {result.get('errors', [])}"
    assert "third_parties" in result, "Result should have third_parties"
    assert "risk_domains" in result, "Result should have risk_domains"
    assert "vendor_controls" in result, "Result should have vendor_controls"
    assert "external_signals" in result, "Result should have external_signals"
    assert "vendor_performance" in result, "Result should have vendor_performance"
    assert "assessment_history" in result, "Result should have assessment_history"
    assert "vendor_lookup" in result, "Result should have vendor_lookup"
    assert "risk_domain_lookup" in result, "Result should have risk_domain_lookup"

    print(f"✅ Node loaded {len(result['third_parties'])} vendors")
    print(f"✅ Node loaded {len(result['risk_domains'])} risk domains")
    print(f"✅ Node loaded {len(result['vendor_controls'])} vendor controls")
    print(f"✅ Node loaded {len(result['external_signals'])} external signals")
    print(f"✅ Node loaded {len(result['vendor_performance'])} performance records")
    print(f"✅ Node loaded {len(result['assessment_history'])} historical assessments")
    print("✅ Node built lookup dictionaries")

    return result


def main():
    """Run all tests"""
    print("="*60)
    print("Testing Data Loading Utilities")
    print("="*60)

    try:
        # Test individual utilities
        third_parties = test_load_third_parties()
        risk_domains = test_load_risk_domains()
        test_load_vendor_controls()
        test_load_external_signals()
        test_load_vendor_performance()
        test_load_assessment_history()

        # Test lookup builders (return values are not needed here)
        test_build_vendor_lookup(third_parties)
        test_build_risk_domain_lookup(risk_domains)

        # Test filtering
        test_filter_vendors_by_id(third_parties)

        # Test node
        test_data_loading_node()

        print("\n" + "="*60)
        print("✅ ALL TESTS PASSED!")
        print("="*60)

    except AssertionError as e:
        print(f"\n❌ TEST FAILED: {e}")
        raise
    except Exception as e:
        print(f"\n❌ UNEXPECTED ERROR: {e}")
        raise


if __name__ == "__main__":
    main()


# Test Results

In [None]:
(.venv) micahshull@Micahs-iMac AI_AGENTS_015_Third-Party_Risk_Orchestrator % python test_data_loading.py
============================================================
Testing Data Loading Utilities
============================================================
Testing load_third_parties...
✅ Loaded 10 vendors
   Example: CloudOps Solutions (VEND_001)

Testing load_risk_domains...
✅ Loaded 4 risk domains
   Total weight: 1.00
   - Information Security: 0.35
   - Regulatory Compliance: 0.25
   - Operational Resilience: 0.2
   - Reputational Risk: 0.2

Testing load_vendor_controls...
✅ Loaded 15 vendor controls
   Example: VEND_001 - SOC2 (expired)

Testing load_external_signals...
✅ Loaded 5 external signals
   Example: security_incident (high) for VEND_001

Testing load_vendor_performance...
✅ Loaded 10 vendor performance records

Testing load_assessment_history...
✅ Loaded 13 historical assessments

Testing build_vendor_lookup...
✅ Built vendor lookup with 10 vendors
   Test lookup: CloudOps Solutions

Testing build_risk_domain_lookup...
✅ Built risk domain lookup with 4 domains
   Test lookup: Information Security (weight: 0.35)

Testing filter_vendors_by_id...
✅ Filtered to vendor: VEND_001
✅ Returned all 10 vendors when vendor_id is None
✅ Correctly raised ValueError for non-existent vendor

============================================================
Testing data_loading_node...
============================================================
✅ Node loaded 10 vendors
✅ Node loaded 4 risk domains
✅ Node loaded 15 vendor controls
✅ Node loaded 5 external signals
✅ Node loaded 10 performance records
✅ Node loaded 13 historical assessments
✅ Node built lookup dictionaries

============================================================
✅ ALL TESTS PASSED!
============================================================
