<a href="https://colab.research.google.com/github/micah-shull/AI_Agents/blob/main/377_GCO_AgentState.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Governance & Compliance Orchestrator

**Status:** MVP Complete ✅

The Governance & Compliance Orchestrator monitors all AI agent activity, enforces policy constraints, detects bias and risk, provides full auditability, and ensures compliance with industry regulations.

---

## 🎯 What This Agent Does

This orchestrator acts as the **air traffic controller** and **audit engine** for all AI activity in the enterprise. It:

1. **Monitors Agent Actions** - Tracks all agent decisions and recommendations
2. **Enforces Policies** - Evaluates actions against machine-readable policy rules
3. **Detects Violations** - Identifies policy violations, bias signals, and drift
4. **Assesses Risk** - Calculates risk scores for agents and overall system
5. **Generates Audit Reports** - Creates comprehensive compliance reports
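
To make "machine-readable policy rules" concrete, here is a minimal sketch of evaluating one event against one policy. The keys inside `conditions` (`region`, `max_confidence`) and the `evaluate_event` helper are assumptions for illustration only; the actual `conditions` schema used in `policy_rules.json` is not shown in this document.

```python
from typing import Any, Dict

# Hypothetical policy: the "conditions" keys below are illustrative assumptions.
policy = {
    "policy_id": "EU_HIGH_RISK_REQUIRES_APPROVAL",
    "conditions": {"region": "EU", "max_confidence": 0.6},
    "required_action": "human_approval",
    "severity": "high",
}

event = {
    "event_id": "evt_0002",
    "input_data": {"region": "EU"},
    "confidence_score": 0.55,
    "human_in_the_loop": False,
}


def evaluate_event(event: Dict[str, Any], policy: Dict[str, Any]) -> Dict[str, Any]:
    """Return an evaluation record with the same keys as a policy evaluation entry."""
    cond = policy["conditions"]
    # The policy applies when the event is in the region and below the confidence cap.
    matched = (
        event["input_data"].get("region") == cond["region"]
        and event["confidence_score"] < cond["max_confidence"]
    )
    # A matched policy is violated when the required human approval is missing.
    violation = matched and not event["human_in_the_loop"]
    return {
        "event_id": event["event_id"],
        "policy_id": policy["policy_id"],
        "matched": matched,
        "violation": violation,
        "severity": policy["severity"],
        "required_action": policy["required_action"],
    }


evaluation = evaluate_event(event, policy)
print(evaluation["matched"], evaluation["violation"])  # True True
```

Keeping the rule as data rather than as an `if` statement buried in agent code is what lets a policy be reviewed and changed without touching the agents themselves.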

---

## 📁 Structure

```
governance_compliance_orchestrator/
├── __init__.py
├── nodes.py                    # Orchestration nodes
├── orchestrator.py             # LangGraph workflow
└── utilities/
    ├── __init__.py
    ├── data_loading.py         # Load agent logs, policies, signals
    ├── policy_evaluation.py    # Evaluate events against policies
    ├── violation_detection.py  # Detect violations and generate events
    ├── risk_scoring.py         # Calculate risk scores
    ├── prioritization.py       # Prioritize compliance issues
    └── report_generation.py    # Generate audit reports
```

---

## 🚀 Quick Start

### 1. Run the Test

```bash
python test_governance_compliance_orchestrator.py
```

This will:
- Load all agent action logs from `agents/data/`
- Load policy rules, bias signals, and drift signals
- Evaluate all events against policies
- Detect violations
- Calculate risk scores
- Generate an audit report in `output/governance_compliance_reports/`

### 2. Use in Code

```python
from config import GovernanceComplianceOrchestratorConfig
from agents.governance_compliance_orchestrator.orchestrator import create_orchestrator

# Create config
config = GovernanceComplianceOrchestratorConfig()

# Create orchestrator
orchestrator = create_orchestrator(config)

# Run analysis
initial_state = {
    "agent_name": None,  # Analyze all agents (or specify one)
    "time_window_days": None,  # Use default (30 days)
    "errors": []
}

result = orchestrator.invoke(initial_state)

# Access results
summary = result.get("summary")
risk_scores = result.get("risk_scores")
prioritized_issues = result.get("prioritized_issues")
audit_report = result.get("audit_report")
report_path = result.get("report_file_path")
```

---

## 📊 Data Requirements

The orchestrator expects data files in `agents/data/`:

1. **Agent Action Logs** (`agent_action_logs_batch_*.json`)
   - Events with: event_id, timestamp, agent_name, action_type, input_data, output, confidence_score, human_in_the_loop

2. **Policy Rules** (`policy_rules.json`)
   - Policies with: policy_id, description, conditions, required_action, severity

3. **Bias Signals** (`bias_signals.json`)
   - Bias detection signals with group comparisons and deltas

4. **Drift Signals** (`drift_and_degradation_signals.json`)
   - Model drift signals with metric comparisons
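
Before running the orchestrator on your own logs, it can help to check that each event carries the fields listed above. The sketch below is one possible check; `REQUIRED_EVENT_FIELDS` and `missing_fields` are hypothetical names, not part of the project's `data_loading.py`.

```python
from typing import Any, Dict, List

# Field names taken from the "Agent Action Logs" requirement above.
REQUIRED_EVENT_FIELDS = (
    "event_id", "timestamp", "agent_name", "action_type",
    "input_data", "output", "confidence_score", "human_in_the_loop",
)


def missing_fields(event: Dict[str, Any]) -> List[str]:
    """Return the required fields that are absent from one event record."""
    return [name for name in REQUIRED_EVENT_FIELDS if name not in event]


complete_event = {
    "event_id": "evt_0001",
    "timestamp": "2026-01-02T09:05:12Z",
    "agent_name": "SalesEnablementAgent",
    "action_type": "pricing_recommendation",
    "input_data": {},
    "output": {},
    "confidence_score": 0.78,
    "human_in_the_loop": False,
}
partial_event = {"event_id": "evt_0002", "agent_name": "HRDecisionAgent"}

print(missing_fields(complete_event))  # []
print(missing_fields(partial_event))
```

An event that fails this check could be skipped and noted in the state's `errors` list rather than aborting the whole run, though how the real loader handles malformed records is not specified here.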

---

## 🔧 Configuration

Configuration is in `config.py` as `GovernanceComplianceOrchestratorConfig`:

- **Data paths** - Where to find data files
- **Severity weights** - How to weight different severity levels
- **Priority scoring weights** - How to prioritize issues (CEO-friendly transparency)
- **Bias/drift thresholds** - Thresholds for flagging issues

---

## 📋 Workflow

The orchestrator follows a linear workflow:

1. **Goal** - Define governance objective
2. **Planning** - Create execution plan
3. **Data Loading** - Load all data files
4. **Policy Evaluation** - Evaluate events against policies
5. **Violation Detection** - Detect violations and generate compliance events
6. **Risk Scoring** - Calculate risk scores for agents and system
7. **Prioritization** - Prioritize issues by severity and urgency
8. **Report Generation** - Generate comprehensive audit report
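
The linear shape of this workflow can be sketched in plain Python. This is not the LangGraph graph built in `orchestrator.py`; it only illustrates the pattern of nodes that each take the state and return a partial update. The node bodies are placeholders, and `run_linear` is a hypothetical helper.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Node = Callable[[State], State]


def goal_node(state: State) -> State:
    # Placeholder goal; the MVP uses a fixed goal definition.
    return {"goal": {"objective": "audit agent activity"}}


def planning_node(state: State) -> State:
    # Placeholder plan; the MVP uses a template-based plan.
    return {"plan": [{"step": 1, "name": "data_loading"}]}


def data_loading_node(state: State) -> State:
    # Placeholder: a real node would read the JSON files under agents/data/.
    return {"agent_action_logs": []}


def run_linear(nodes: List[Node], initial_state: State) -> State:
    """Run nodes in order, merging each node's partial update into the state."""
    state = dict(initial_state)
    for node in nodes:
        state.update(node(state))
    return state


final_state = run_linear(
    [goal_node, planning_node, data_loading_node],
    {"agent_name": None, "time_window_days": None, "errors": []},
)
print(sorted(final_state))
```

Because every step only adds keys to the shared state, the remaining steps (policy evaluation through report generation) would slot into the same list in order.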

---

## 🎓 MVP Features

**What's Included:**
- ✅ Rule-based policy evaluation
- ✅ Violation detection
- ✅ Risk scoring (agent-level and system-level)
- ✅ Bias signal detection
- ✅ Drift signal detection
- ✅ Issue prioritization
- ✅ Comprehensive audit reports

**What's Not Included (Future Enhancements):**
- ⏳ LLM-powered violation explanations
- ⏳ Advanced bias detection algorithms
- ⏳ ML-based drift detection
- ⏳ Real-time monitoring
- ⏳ Automated remediation

---

## 📝 Example Output

The orchestrator generates:
- **Summary statistics** - Total events, violations, risk scores
- **Prioritized issues** - Top compliance issues ranked by priority
- **Agent risk scores** - Risk assessment per agent
- **Bias signals** - Detected bias patterns
- **Drift signals** - Model degradation signals
- **Audit report** - Comprehensive markdown report

---

## 🧪 Testing

Run the test script to verify everything works:

```bash
python test_governance_compliance_orchestrator.py
```

Expected output:
- ✅ Orchestrator completes successfully
- ✅ Summary statistics displayed
- ✅ Risk scores calculated
- ✅ Top priority issues listed
- ✅ Audit report saved to `output/governance_compliance_reports/`

---

## 🔄 Next Steps

To enhance the MVP:

1. **Add LLM Explanations** - Use LLM to generate detailed violation explanations
2. **Enhance Bias Detection** - Add more sophisticated bias detection algorithms
3. **Real-time Monitoring** - Add streaming event processing
4. **Automated Remediation** - Add automated response to violations
5. **Dashboard Integration** - Create visual dashboard for executives

---

## 📚 Related Documentation

- **Data Review** - `agents/data/data_review.md`
- **Data Proposal Review** - `agents/data_proposal_review.md`
- **Orchestrator Guide** - `docs/guides/agent_patterns/ORCHESTRATOR_AGENTS_GUIDE_3.md`
- **Toolshed Guide** - `docs/guides/TOOLSHED_GUIDE.md`

---

**This MVP provides a solid foundation for learning the orchestrator architecture. You can add complexity and enhancements as needed!**



# Governance & Compliance Orchestrator Agent

In [None]:
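# ============================================================================
# Governance & Compliance Orchestrator Agent
# ============================================================================

# Imports needed by the state and config definitions below
import os
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, TypedDict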

class GovernanceComplianceOrchestratorState(TypedDict, total=False):
    """State for Governance & Compliance Orchestrator Agent"""

    # Input fields
    agent_name: Optional[str]              # Specific agent to analyze (None = analyze all)
    time_window_days: Optional[int]        # Time window for analysis (None = use default)

    # Goal & Planning fields (MVP: Fixed goal, template-based plan)
    goal: Dict[str, Any]                   # Goal definition (from goal_node)
    plan: List[Dict[str, Any]]            # Execution plan (from planning_node)

    # Data Ingestion
    agent_action_logs: List[Dict[str, Any]]  # Loaded agent action log events
    # Structure per event:
    # {
    #   "event_id": "evt_0001",
    #   "timestamp": "2026-01-02T09:05:12Z",
    #   "agent_name": "SalesEnablementAgent",
    #   "action_type": "pricing_recommendation",
    #   "input_data": {...},
    #   "output": {...},
    #   "model": "gpt-4.1",
    #   "confidence_score": 0.78,
    #   "human_in_the_loop": false,
    #   "data_sources": ["CRM", "SalesForecast_v1"]
    # }

    policy_rules: List[Dict[str, Any]]     # Loaded policy rules
    # Structure per policy:
    # {
    #   "policy_id": "EU_HIGH_RISK_REQUIRES_APPROVAL",
    #   "description": "...",
    #   "conditions": {...},
    #   "required_action": "human_approval",
    #   "severity": "high"
    # }

    bias_signals: List[Dict[str, Any]]    # Loaded bias detection signals
    # Structure per signal:
    # {
    #   "signal_id": "bias_001",
    #   "agent_name": "HRDecisionAgent",
    #   "decision_type": "hiring_decision",
    #   "protected_attribute": "gender",
    #   "groups": [...],
    #   "delta": 0.31,
    #   "threshold": 0.20,
    #   "risk_level": "high",
    #   "recommended_action": "..."
    # }

    drift_signals: List[Dict[str, Any]]   # Loaded drift and degradation signals
    # Structure per signal:
    # {
    #   "signal_id": "drift_001",
    #   "agent_name": "CustomerSupportAgent",
    #   "model": "gpt-4.1",
    #   "metric": "hallucination_rate",
    #   "previous_average": 0.03,
    #   "current_average": 0.11,
    #   "threshold": 0.08,
    #   "delta": 0.08,
    #   "risk_level": "high",
    #   "detected_at": "2026-01-02T12:55:00Z",
    #   "recommended_action": "..."
    # }

    # Data Lookups (for fast access)
    policy_lookup: Dict[str, Dict[str, Any]]  # policy_id -> policy dict
    events_lookup: Dict[str, Dict[str, Any]]  # event_id -> event dict

    # Policy Evaluation
    policy_evaluations: List[Dict[str, Any]]  # Policy matches per event
    # Structure per evaluation:
    # {
    #   "event_id": "evt_0002",
    #   "policy_id": "EU_HIGH_RISK_REQUIRES_APPROVAL",
    #   "matched": true,
    #   "violation": true,
    #   "severity": "high",
    #   "required_action": "human_approval",
    #   "reason": "EU region + confidence < 0.6 + no human approval"
    # }

    # Risk Assessment
    compliance_events: List[Dict[str, Any]]  # Generated compliance events (violations)
    # Structure per event:
    # {
    #   "compliance_event_id": "cmp_0001",
    #   "event_id": "evt_0002",
    #   "risk_type": "policy_violation",
    #   "policy_id": "EU_HIGH_RISK_REQUIRES_APPROVAL",
    #   "severity": "high",
    #   "status": "open",
    #   "recommended_action": "Escalate to compliance officer",
    #   "timestamp": "2026-01-02T14:32:12Z"
    # }

    risk_scores: Dict[str, Any]           # Risk scores per agent/event
    # Structure:
    # {
    #   "agent_scores": {
    #     "SalesEnablementAgent": {
    #       "total_violations": 3,
    #       "high_severity_count": 2,
    #       "risk_score": 0.75
    #     }
    #   },
    #   "overall_risk_score": 0.68
    # }

    # Prioritization
    prioritized_issues: List[Dict[str, Any]]  # Prioritized compliance issues
    # Structure per issue:
    # {
    #   "compliance_event_id": "cmp_0001",
    #   "priority_score": 85.5,
    #   "severity": "high",
    #   "urgency": "high",
    #   "agent_name": "SalesEnablementAgent"
    # }

    # Summary
    summary: Dict[str, Any]               # Overall summary statistics
    # Structure:
    # {
    #   "total_events_analyzed": 36,
    #   "total_violations": 8,
    #   "high_severity_count": 4,
    #   "bias_signals_count": 4,
    #   "drift_signals_count": 5,
    #   "agents_affected": ["SalesEnablementAgent", "HRDecisionAgent"]
    # }

    # Output
    audit_report: str                     # Generated audit report (markdown)
    report_file_path: Optional[str]       # Path to saved report file

    # Metadata
    errors: List[str]                     # Any errors encountered
    processing_time: Optional[float]      # Time taken to process


@dataclass
class GovernanceComplianceOrchestratorConfig:
    """Configuration for Governance & Compliance Orchestrator Agent"""

    # LLM Settings
    llm_model: str = os.getenv("LLM_MODEL", "gpt-4o-mini")
    temperature: float = 0.3

    # Data file paths
    data_dir: str = "agents/data"
    agent_logs_files: List[str] = field(default_factory=lambda: [
        "agent_action_logs_batch_1.json",
        "agent_action_logs_batch_2.json",
        "agent_action_logs_batch_3.json"
    ])
    policy_rules_file: str = "policy_rules.json"
    bias_signals_file: str = "bias_signals.json"
    drift_signals_file: str = "drift_and_degradation_signals.json"

    # Report settings
    reports_dir: str = "output/governance_compliance_reports"

    # Policy Evaluation Settings
    default_time_window_days: int = 30    # Default time window for analysis

    # Risk Scoring Settings
    severity_weights: Dict[str, float] = field(default_factory=lambda: {
        "critical": 1.0,
        "high": 0.75,
        "medium": 0.50,
        "low": 0.25
    })

    # Bias Detection Settings
    bias_delta_threshold: float = 0.20    # Minimum delta to flag as bias
    bias_risk_levels: Dict[str, float] = field(default_factory=lambda: {
        "critical": 0.50,
        "high": 0.30,
        "medium": 0.20,
        "low": 0.10
    })

    # Drift Detection Settings
    drift_threshold_multiplier: float = 1.2  # Threshold multiplier for drift detection

    # Priority Scoring Weights (CEO-friendly transparency)
    priority_scoring_weights: Dict[str, float] = field(default_factory=lambda: {
        "severity": 0.40,
        "urgency": 0.30,
        "impact": 0.20,
        "frequency": 0.10
    })

    # Toolshed Integration
    enable_prioritization: bool = True     # Use toolshed.prioritization
    enable_reporting: bool = True          # Use toolshed.reporting

    # LLM Enhancement (Optional - Phase 8)
    enable_llm_explanations: bool = False  # Enable LLM-generated violation explanations
    llm_explanation_max_events: int = 5    # Max events to generate LLM explanations for (cost control)




# Governance & Compliance Orchestrator: Core State & Configuration

## Purpose of This Code

This code defines the **operating foundation** of the Governance & Compliance Orchestrator.

Rather than performing governance actions directly, this layer establishes:

* **What the system can observe**
* **What it can reason about**
* **What it can produce as accountable outputs**

In other words, this is the **contract** that makes the agent auditable, controllable, and enterprise-ready.

---

## 1. Why a Structured Orchestrator State Matters

At the heart of this design is the `GovernanceComplianceOrchestratorState`.

This state object acts as a **single source of truth** for the entire governance workflow. Every node in the agent (ingestion, policy evaluation, risk scoring, prioritization, and reporting) reads from and writes back to this shared structure.

### What This Enables in Practice

* Clear traceability from **raw agent actions → policy violations → executive reports**
* Deterministic behavior: with rule-based evaluation and LLM explanations disabled (the default), the same inputs produce the same governance outcomes
* Full auditability for compliance, legal, and leadership review
* Safe scaling as new agents, policies, or signals are added
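
The traceability point can be illustrated with the lookup fields in the state. The sketch below hand-builds a tiny state using field names from `GovernanceComplianceOrchestratorState` and walks from a compliance event back to the raw event and policy behind it. The `trace` helper is hypothetical, and the records are trimmed to the keys the example needs.

```python
from typing import Any, Dict

# A tiny hand-built state using field names from the state definition.
state: Dict[str, Any] = {
    "events_lookup": {
        "evt_0002": {"event_id": "evt_0002", "agent_name": "SalesEnablementAgent"},
    },
    "policy_lookup": {
        "EU_HIGH_RISK_REQUIRES_APPROVAL": {
            "policy_id": "EU_HIGH_RISK_REQUIRES_APPROVAL",
            "severity": "high",
        },
    },
    "compliance_events": [
        {
            "compliance_event_id": "cmp_0001",
            "event_id": "evt_0002",
            "policy_id": "EU_HIGH_RISK_REQUIRES_APPROVAL",
        },
    ],
}


def trace(state: Dict[str, Any], compliance_event_id: str) -> Dict[str, Any]:
    """Resolve a compliance event to the raw event and policy behind it."""
    record = next(
        c for c in state["compliance_events"]
        if c["compliance_event_id"] == compliance_event_id
    )
    return {
        "compliance_event": record,
        "event": state["events_lookup"][record["event_id"]],
        "policy": state["policy_lookup"][record["policy_id"]],
    }


chain = trace(state, "cmp_0001")
print(chain["event"]["agent_name"], chain["policy"]["severity"])
# SalesEnablementAgent high
```

Since every record carries the IDs of the records it was derived from, an auditor can follow a line in the final report back to the original agent action without re-running the workflow.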

This is a deliberate move away from opaque, conversational agents toward **systems that can be inspected and trusted**.

---

## 2. What the State Captures (Conceptually)

Rather than focusing on implementation details, it's more useful to understand the *categories* of information this state tracks.

### 🔹 A. Inputs & Scope

The orchestrator can analyze:

* A **specific agent** or all agents
* A **configurable time window**

This supports both targeted investigations and broad system audits.

---

### 🔹 B. Goal & Plan (Governed Execution)

Even in an MVP, the agent explicitly records:

* Its **goal**
* Its **execution plan**

This ensures the system always knows *why* it is acting, a critical requirement for executive trust and regulatory scrutiny.

---

### 🔹 C. Ingested Evidence

The state holds structured evidence from three independent governance lenses:

1. **Agent Action Logs**
   What AI systems actually did

2. **Policy Rules**
   What they were allowed or required to do

3. **Bias & Drift Signals**
   Whether behavior is becoming unfair, unsafe, or degraded over time

This separation mirrors real enterprise governance:

* Operations
* Policy
* Oversight

---

### 🔹 D. Evaluation & Risk Outputs

Once evidence is loaded, the state captures the results of governance reasoning:

* **Policy evaluations**
  Which rules applied, and why

* **Compliance events**
  Formal records of violations or risks

* **Risk scores**
  Quantified assessments by agent and system-wide

This is where governance becomes measurable instead of subjective.
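
As a rough illustration of how violations might become numbers, the sketch below scores agents from their violations using the default severity weights. The formula (mean severity weight) is an assumption chosen for simplicity; the actual calculation in `risk_scoring.py` is not shown in this document. For brevity the violations carry `agent_name` directly, whereas real compliance events reference an `event_id`.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Default severity weights from GovernanceComplianceOrchestratorConfig.
SEVERITY_WEIGHTS = {"critical": 1.0, "high": 0.75, "medium": 0.50, "low": 0.25}

# Illustrative violations; "agent_name" is attached directly for brevity.
violations = [
    {"agent_name": "SalesEnablementAgent", "severity": "high"},
    {"agent_name": "SalesEnablementAgent", "severity": "low"},
    {"agent_name": "HRDecisionAgent", "severity": "critical"},
]


def score_agents(violations: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Assumed formula: a risk score is the mean severity weight of the violations."""
    by_agent: Dict[str, List[float]] = defaultdict(list)
    for v in violations:
        by_agent[v["agent_name"]].append(SEVERITY_WEIGHTS[v["severity"]])
    agent_scores = {
        agent: {
            "total_violations": len(weights),
            "risk_score": sum(weights) / len(weights),
        }
        for agent, weights in by_agent.items()
    }
    all_weights = [w for weights in by_agent.values() for w in weights]
    overall = sum(all_weights) / len(all_weights) if all_weights else 0.0
    return {"agent_scores": agent_scores, "overall_risk_score": overall}


risk_scores = score_agents(violations)
print(risk_scores["agent_scores"]["SalesEnablementAgent"]["risk_score"])  # 0.5
```

The result uses the same `agent_scores` / `overall_risk_score` layout as the state's `risk_scores` field, so a different formula could be swapped in without changing downstream consumers.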

---

### 🔹 E. Prioritization & Summary

Not all issues are equal. The state explicitly supports:

* **Prioritized issues** (what leadership should act on first)
* **Executive summary metrics** (what is happening at a glance)

This bridges the gap between technical findings and business decisions.
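
One way to read the priority weights in the config is as a weighted sum. The sketch below assumes each factor has already been normalized to the 0-1 range and scales the result to 0-100; both the formula and the factor values are assumptions, since the real logic lives in `prioritization.py` and is not shown here.

```python
from typing import Any, Dict, List

# Default priority weights from GovernanceComplianceOrchestratorConfig.
PRIORITY_WEIGHTS = {"severity": 0.40, "urgency": 0.30, "impact": 0.20, "frequency": 0.10}


def priority_score(factors: Dict[str, float], weights: Dict[str, float]) -> float:
    """Assumed formula: weighted sum of 0-1 factors, scaled to 0-100."""
    return round(100 * sum(weights[name] * factors[name] for name in weights), 1)


# Hypothetical issues with factor values already normalized to the 0-1 range.
issues: List[Dict[str, Any]] = [
    {"compliance_event_id": "cmp_0002",
     "factors": {"severity": 0.25, "urgency": 0.5, "impact": 0.5, "frequency": 1.0}},
    {"compliance_event_id": "cmp_0001",
     "factors": {"severity": 0.75, "urgency": 1.0, "impact": 0.5, "frequency": 0.2}},
]

for issue in issues:
    issue["priority_score"] = priority_score(issue["factors"], PRIORITY_WEIGHTS)

# Highest score first: the order in which leadership would review issues.
prioritized = sorted(issues, key=lambda i: i["priority_score"], reverse=True)
print([(i["compliance_event_id"], i["priority_score"]) for i in prioritized])
```

Because the weights are plain numbers that sum to 1.0, anyone reading the report can reproduce a score by hand, which is the transparency the config comment refers to.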

---

### 🔹 F. Final Outputs & Metadata

The system ultimately produces:

* A **human-readable audit report**
* Optional saved artifacts
* Processing metadata and error tracking

This makes the agent suitable for:

* Board reviews
* Compliance audits
* Post-incident analysis

---

## 3. Why This Configuration Design Is Important

The `GovernanceComplianceOrchestratorConfig` defines how governance decisions are **controlled without code changes**.

This is one of the most important architectural choices in the system.

### What the Config Controls

* **Which data sources are used**
* **Which policies apply**
* **How severity and risk are weighted**
* **When bias or drift is considered material**
* **How issues are prioritized for leadership**
* **Whether LLM enhancements are allowed**

Every threshold, weight, and toggle is explicit and adjustable.
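
A short sketch of what "adjustable" can look like in practice: deriving a stricter profile from the defaults with `dataclasses.replace`, leaving node logic untouched. `ConfigSketch` is a trimmed, hypothetical stand-in that mirrors a few fields of `GovernanceComplianceOrchestratorConfig` so the example runs on its own.

```python
from dataclasses import dataclass, field, replace
from typing import Dict


@dataclass
class ConfigSketch:
    """Trimmed stand-in mirroring a few fields of the real config class."""
    bias_delta_threshold: float = 0.20
    enable_llm_explanations: bool = False
    severity_weights: Dict[str, float] = field(default_factory=lambda: {
        "critical": 1.0, "high": 0.75, "medium": 0.50, "low": 0.25,
    })


default_config = ConfigSketch()

# A stricter profile: flag bias earlier and weight "medium" issues more heavily.
strict_config = replace(
    default_config,
    bias_delta_threshold=0.10,
    severity_weights={**default_config.severity_weights, "medium": 0.60},
)

print(default_config.bias_delta_threshold, strict_config.bias_delta_threshold)  # 0.2 0.1
print(strict_config.severity_weights["medium"])  # 0.6
```

`replace` returns a new object, so the default profile stays intact and both profiles can be compared side by side in an audit.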

---

## 4. Executive-Friendly Governance by Design

Several design choices stand out as particularly strong from a business perspective:

### 🔹 Severity Weights Are Explicit

Leadership can see, and change, how "high" vs "medium" risk affects outcomes.

### 🔹 Bias & Drift Thresholds Are Transparent

No black-box fairness claims. Clear deltas trigger clear actions.
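
To show how a delta could map to an action, the sketch below classifies a bias delta against the default `bias_risk_levels` and checks a drift metric against its threshold. Both rules are assumptions about how the thresholds are applied (each level treated as a minimum delta, and drift flagged once the current average exceeds the signal's threshold). The function names are hypothetical.

```python
from typing import Dict, Optional

# Default levels from GovernanceComplianceOrchestratorConfig.bias_risk_levels.
BIAS_RISK_LEVELS = {"critical": 0.50, "high": 0.30, "medium": 0.20, "low": 0.10}


def bias_risk_level(delta: float, levels: Dict[str, float]) -> Optional[str]:
    """Assumed rule: pick the highest level whose minimum delta is met."""
    for level, minimum in sorted(levels.items(), key=lambda kv: kv[1], reverse=True):
        if delta >= minimum:
            return level
    return None  # Below every level: nothing to flag.


def drift_exceeded(current_average: float, threshold: float) -> bool:
    """Assumed rule: a metric has drifted once it rises above its threshold."""
    return current_average > threshold


print(bias_risk_level(0.31, BIAS_RISK_LEVELS))  # high
print(bias_risk_level(0.05, BIAS_RISK_LEVELS))  # None
print(drift_exceeded(0.11, 0.08))               # True
```

The inputs mirror the sample signals in the state definition: a hiring-decision delta of 0.31 lands in the "high" band, and a hallucination rate of 0.11 exceeds its 0.08 threshold.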

### 🔹 LLM Usage Is Optional and Bounded

LLMs are disabled by default and tightly scoped when enabled, reinforcing:

> **The system decides. The LLM explains.**

### 🔹 Prioritization Is Configurable

Executives can express what matters most (severity, urgency, impact, or frequency) by adjusting the priority weights, without retraining models or rewriting logic.

---

## 5. Why This Foundation Matters

This code does something many AI systems skip:

It **separates governance from intelligence**.

By establishing a strong state contract and a transparent configuration layer, the orchestrator becomes:

* Auditable instead of anecdotal
* Configurable instead of fragile
* Explainable instead of opaque
* Trusted instead of tolerated

This is exactly the kind of infrastructure organizations need before they can responsibly scale AI.

---

## Bottom Line

This first code batch doesn't "do" governance yet, and that's its strength.

It defines the **rules of engagement** for every decision the system will make, ensuring that when automation scales, **accountability scales with it**.


