# Human-in-the-Loop Collaboration Orchestrator Agent

**Status:** MVP - Ready for Testing

## Overview

The Human-in-the-Loop (HITL) Collaboration Orchestrator routes each AI agent output to either autonomous execution or human review, based on:

- **Confidence Scores** - Agent confidence in its output
- **Risk Levels** - Business risk associated with the task
- **Routing Policy** - Configurable rules for decision-making

## Features

✅ **Confidence-Based Routing** - Automatically routes tasks based on agent confidence scores  
✅ **Risk Assessment** - Considers task risk levels (low, medium, high)  
✅ **Human Review Workflow** - Manages human review process for tasks requiring oversight  
✅ **Audit Trail** - Complete logging of all routing decisions and outcomes  
✅ **Summary Metrics** - Tracks auto-approvals, human reviews, escalations, and overrides  

## Data Requirements

The orchestrator expects the following JSON files in the `data/` directory:

1. **`tasks.json`** - List of tasks with risk levels
2. **`agent_outputs.json`** - Agent outputs with confidence scores
3. **`routing_policy.json`** - Routing policy rules
4. **`human_reviews.json`** (optional) - Existing human reviews
5. **`audit_logs.json`** (optional) - Existing audit logs
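
A minimal sketch of how these files might be loaded, treating the first three as required and the last two as optional. The helper name `load_hitl_data` and the choice to default missing optional files to an empty list are assumptions for illustration, not the project's actual loader in `utilities/`.

```python
import json
from pathlib import Path

REQUIRED_FILES = ("tasks.json", "agent_outputs.json", "routing_policy.json")
OPTIONAL_FILES = ("human_reviews.json", "audit_logs.json")


def load_hitl_data(data_dir="data"):
    """Load the orchestrator's JSON inputs (hypothetical helper).

    Returns a dict keyed by file stem, e.g. "tasks" or "routing_policy".
    """
    base = Path(data_dir)
    data = {}
    for name in REQUIRED_FILES:
        path = base / name
        if not path.exists():
            raise FileNotFoundError(f"Required data file missing: {path}")
        data[path.stem] = json.loads(path.read_text(encoding="utf-8"))
    for name in OPTIONAL_FILES:
        path = base / name
        # Assumption: an absent optional file is treated as an empty list.
        if path.exists():
            data[path.stem] = json.loads(path.read_text(encoding="utf-8"))
        else:
            data[path.stem] = []
    return data
```

Calling `load_hitl_data("data")` would then give access to `data["tasks"]`, `data["agent_outputs"]`, and `data["routing_policy"]`, and fail early if any required file is missing.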


## Workflow

The orchestrator follows this linear workflow:

1. **Goal** - Define orchestration objective
2. **Planning** - Create execution plan
3. **Data Loading** - Load tasks, outputs, and routing policy
4. **Routing Decision** - Apply routing policy to make decisions
5. **Human Review Processing** - Process human reviews, or auto-approve pending reviews when `auto_approve_for_testing` is enabled
6. **Audit Logging** - Create audit logs and calculate metrics
7. **Report Generation** - Generate final report
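
The seven steps above can be sketched as a chain of functions that each take a state dict and return an updated one. The Files section says `orchestrator.py` defines the workflow with LangGraph; this sketch deliberately uses plain Python instead, and the step names, `make_stub_node`, and `run_linear_workflow` are hypothetical stand-ins rather than the project's real nodes.

```python
from typing import Any, Callable, Dict, List, Optional

State = Dict[str, Any]
Node = Callable[[State], State]

# Step names mirror the workflow list above; the identifiers are illustrative.
STEPS = [
    "goal",
    "planning",
    "data_loading",
    "routing_decision",
    "human_review_processing",
    "audit_logging",
    "report_generation",
]


def make_stub_node(name: str) -> Node:
    """Build a placeholder node that only records that its step ran."""
    def node(state: State) -> State:
        completed = state.get("completed_steps", []) + [name]
        return {**state, "completed_steps": completed}
    return node


def run_linear_workflow(nodes: List[Node], initial_state: Optional[State] = None) -> State:
    """Run each node once, in order, passing the state along."""
    state: State = dict(initial_state or {})
    for node in nodes:
        state = node(state)
    return state


final_state = run_linear_workflow([make_stub_node(step) for step in STEPS])
print(final_state["completed_steps"])
```

Because the workflow is linear, each node only needs to read what earlier nodes wrote into the state, which keeps the steps easy to test in isolation.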

## Routing Policy

The routing policy defines rules for routing decisions. Rules are evaluated in priority order (lower number = higher priority).

Example rule:
```json
{
  "rule_id": "rule_001",
  "priority": 1,
  "conditions": {
    "risk_level": "high"
  },
  "action": "escalate",
  "assigned_human_role": "senior_manager"
}
```
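
A minimal sketch of priority-ordered evaluation, assuming the first matching rule decides the route. Only `rule_001` comes from the example above; `rule_002`, `rule_003`, the `min_confidence` condition key, and the fallback to `human_review` when nothing matches are assumptions made for illustration.

```python
from typing import Any, Dict, List

# rule_001 is the README's example; the other rules are hypothetical.
RULES: List[Dict[str, Any]] = [
    {"rule_id": "rule_003", "priority": 3,
     "conditions": {"risk_level": "medium"},
     "action": "human_review", "assigned_human_role": "domain_reviewer"},
    {"rule_id": "rule_001", "priority": 1,
     "conditions": {"risk_level": "high"},
     "action": "escalate", "assigned_human_role": "senior_manager"},
    {"rule_id": "rule_002", "priority": 2,
     "conditions": {"risk_level": "low", "min_confidence": 0.9},
     "action": "auto_approve"},
]


def rule_matches(conditions: Dict[str, Any], risk_level: str, confidence: float) -> bool:
    """Return True when every condition present in the rule is satisfied."""
    if "risk_level" in conditions and conditions["risk_level"] != risk_level:
        return False
    # `min_confidence` is an assumed condition key, not taken from the README.
    if "min_confidence" in conditions and confidence < conditions["min_confidence"]:
        return False
    return True


def route_task(rules: List[Dict[str, Any]], risk_level: str, confidence: float) -> Dict[str, Any]:
    """Evaluate rules in priority order (lower number first); first match wins."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if rule_matches(rule.get("conditions", {}), risk_level, confidence):
            return {
                "rule_id": rule["rule_id"],
                "action": rule["action"],
                "assigned_human_role": rule.get("assigned_human_role"),
            }
    # Assumed conservative default: send unmatched tasks to a human.
    return {"rule_id": None, "action": "human_review", "assigned_human_role": None}


for risk, conf in [("low", 0.91), ("medium", 0.68), ("high", 0.83)]:
    print(risk, conf, route_task(RULES, risk, conf)["action"])
```

Sorting inside `route_task` means the rules can be stored in any order in `routing_policy.json`; only the `priority` field determines which rule is tried first.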

## Configuration

Configuration is defined in `config.py` as `HITLOrchestratorConfig`:

- `data_dir`: Directory containing data files (default: "data")
- `reports_dir`: Directory for reports (default: "output/hitl_orchestrator_reports")
- `auto_approve_for_testing`: Auto-approve pending reviews (default: True)
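
As a rough illustration, the three settings could be expressed as a dataclass like the one below. The field names and defaults come from the list above; the use of `@dataclass` is an assumption, and the real `config.py` may be structured differently.

```python
from dataclasses import dataclass


@dataclass
class HITLOrchestratorConfig:
    """Sketch of the configuration object; the actual definition may differ."""
    data_dir: str = "data"
    reports_dir: str = "output/hitl_orchestrator_reports"
    auto_approve_for_testing: bool = True


# Override a default, e.g. to require real human reviews outside of testing.
config = HITLOrchestratorConfig(auto_approve_for_testing=False)
print(config)
```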

## Output

The orchestrator generates:

1. **Routing Decisions** - Decision for each task (`auto_approve`, `human_review`, or `escalate`)
2. **Final Decisions** - Final outcome for each task (for example `approved`, `override_approved`, or `modified_and_approved`)
3. **Audit Logs** - Complete audit trail
4. **Summary Metrics** - Statistics about routing and decisions
5. **Report** - Markdown report saved to `output/hitl_orchestrator_reports/`
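
To make the summary metrics concrete, here is a small sketch that recomputes the counts and averages from the five tasks in the sample report later in this document. The record field names (`confidence`, `latency_seconds`) and the `summarize` helper are assumptions; the values are copied from the report's audit trail and final decisions.

```python
from collections import Counter
from statistics import mean

# Values taken from the sample report; field names are hypothetical.
records = [
    {"task_id": "task_001", "routing_decision": "auto_approve", "confidence": 0.91, "latency_seconds": 12.0},
    {"task_id": "task_002", "routing_decision": "human_review", "confidence": 0.68, "latency_seconds": 720.0},
    {"task_id": "task_003", "routing_decision": "escalate", "confidence": 0.83, "latency_seconds": 900.0},
    {"task_id": "task_004", "routing_decision": "auto_approve", "confidence": 0.95, "latency_seconds": 8.0},
    {"task_id": "task_005", "routing_decision": "human_review", "confidence": 0.74, "latency_seconds": 1200.0},
]


def summarize(records):
    """Count routing decisions and average confidence and latency."""
    counts = Counter(r["routing_decision"] for r in records)
    return {
        "total_tasks": len(records),
        "auto_approved_count": counts["auto_approve"],
        "human_reviewed_count": counts["human_review"],
        "escalated_count": counts["escalate"],
        "average_confidence_score": round(mean(r["confidence"] for r in records), 2),
        "average_latency_seconds": round(mean(r["latency_seconds"] for r in records), 2),
    }


metrics = summarize(records)
print(metrics)
```

The averages work out to 4.11 / 5 = 0.822 (reported as 0.82) for confidence and 2840 / 5 = 568 seconds for latency, which matches the sample report.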

## Next Steps

To enhance the MVP:

1. **Add LLM Integration** - Use LLMs to generate routing explanations
2. **Real-Time Human Review** - Integrate with human review API/UI
3. **Dynamic Policy Updates** - Learn from human feedback to update routing policy
4. **Confidence Score Calibration** - Improve confidence score accuracy
5. **Multi-Domain Support** - Domain-specific routing policies

## Files

- `config.py` - Configuration (`HITLOrchestratorConfig`)
- `nodes.py` - Orchestrator nodes (goal, planning, data loading, routing, etc.)
- `orchestrator.py` - LangGraph workflow definition
- `utilities/` - Reusable utilities (data loading, routing, audit, reporting)
- `README.md` - This file



# Human-in-the-Loop Orchestrator Report

**Generated:** 2025-12-18 15:16:17

---

## Executive Summary

**Objective:** Route AI agent outputs between autonomous execution and human review based on confidence scores and risk levels

**Total Tasks Processed:** 5  
**Auto-Approved:** 2  
**Human Reviewed:** 2  
**Escalated:** 1  
**Human Overrides:** 2  

**Average Confidence Score:** 0.82  
**Average Latency:** 568.00 seconds

---

## Routing Decisions


### Auto-Approved Tasks (2)

- **task_001**: Risk=low, Confidence=0.91
- **task_004**: Risk=low, Confidence=0.95

### Human Review Tasks (2)

- **task_002**: Risk=medium, Confidence=0.68, Role=domain_reviewer
- **task_005**: Risk=medium, Confidence=0.74, Role=domain_reviewer

### Escalated Tasks (1)

- **task_003**: Risk=high, Confidence=0.83, Role=senior_manager

---

## Final Decisions


### task_001

- **Final Decision:** approved
- **Decision Source:** agent
- **Human Involved:** False
- **Confidence Score:** 0.91
- **Risk Level:** low

### task_002

- **Final Decision:** approved
- **Decision Source:** human
- **Human Involved:** True
- **Confidence Score:** 0.68
- **Risk Level:** medium

### task_003

- **Final Decision:** override_approved
- **Decision Source:** human
- **Human Involved:** True
- **Confidence Score:** 0.83
- **Risk Level:** high

### task_004

- **Final Decision:** approved
- **Decision Source:** agent
- **Human Involved:** False
- **Confidence Score:** 0.95
- **Risk Level:** low

### task_005

- **Final Decision:** modified_and_approved
- **Decision Source:** human
- **Human Involved:** True
- **Confidence Score:** 0.74
- **Risk Level:** medium

---

## Human Reviews


### task_002

- **Reviewer Role:** domain_reviewer
- **Decision:** approve
- **Feedback:** Correctly identified as a refund request
- **Confidence Assessment:** medium
- **Timestamp:** 2025-01-10T09:42:00Z

### task_003

- **Reviewer Role:** senior_manager
- **Decision:** override
- **Feedback:** Customer qualifies due to recent policy exception
- **Confidence Assessment:** high
- **Timestamp:** 2025-01-10T10:15:00Z

### task_005

- **Reviewer Role:** domain_reviewer
- **Decision:** modify
- **Feedback:** Content is borderline but does not violate policy; mark as allowed
- **Confidence Assessment:** medium
- **Timestamp:** 2025-01-10T11:05:00Z

---

## Audit Trail


### task_001

- **Routing Decision:** auto_approve
- **Final Decision:** approved
- **Decision Source:** agent
- **Human Involved:** False
- **Latency:** 12.00 seconds
- **Timestamp:** 2025-01-10T09:15:12Z

### task_002

- **Routing Decision:** human_review
- **Final Decision:** approved
- **Decision Source:** human
- **Human Involved:** True
- **Latency:** 720.00 seconds
- **Timestamp:** 2025-01-10T09:42:00Z

### task_003

- **Routing Decision:** escalate
- **Final Decision:** override_approved
- **Decision Source:** human
- **Human Involved:** True
- **Latency:** 900.00 seconds
- **Timestamp:** 2025-01-10T10:15:00Z

### task_004

- **Routing Decision:** auto_approve
- **Final Decision:** approved
- **Decision Source:** agent
- **Human Involved:** False
- **Latency:** 8.00 seconds
- **Timestamp:** 2025-01-10T10:20:08Z

### task_005

- **Routing Decision:** human_review
- **Final Decision:** modified_and_approved
- **Decision Source:** human
- **Human Involved:** True
- **Latency:** 1200.00 seconds
- **Timestamp:** 2025-01-10T11:05:00Z

---

## Summary Metrics

- **Total Tasks:** 5
- **Auto Approved Count:** 2
- **Human Reviewed Count:** 2
- **Escalated Count:** 1
- **Average Confidence Score:** 0.82
- **Average Latency Seconds:** 568.00
- **Human Override Count:** 2

---

*Report generated by HITL Collaboration Orchestrator*
