


# Proposal & Document Orchestrator — Architecture Review

## 1. What This Code Is Doing (In Real Terms)

This code defines **the entire operating contract** for your Proposal & Document Orchestrator.

Before any logic runs, before any analysis happens, before any report is written, this file answers:

* What information the agent is allowed to see
* What decisions the agent is responsible for
* What metrics it must produce
* How success, risk, and failure are defined
* What leaders can expect to trust in the output

This is more than a data structure: it is **a control framework**.

---

## 2. Why the State Definition Is the Most Important Part of the Agent

### `ProposalDocumentOrchestratorState`

This `TypedDict` is doing something subtle but powerful:

> It turns an AI agent into a **bounded, auditable system**.

Every major concern a CEO, legal team, or auditor would ask about is explicitly represented:

* Inputs (what documents are analyzed)
* Process (plans, stages, reviews, checks)
* Costs (AI, human, infrastructure)
* Outcomes (time saved, risk avoided)
* KPIs (operational, effectiveness, business)
* ROI (hard numbers, not vibes)
* Errors and processing metadata

Nothing is implicit. Nothing is “magic.”

That alone sets this agent apart from LLM-based systems that leave these concerns implicit.

---

## 3. Strong Architectural Decisions Worth Calling Out

### A. Goal & Plan as First-Class Citizens

```python
goal: Dict[str, Any]
plan: List[Dict[str, Any]]
```

This is a **Mission Orchestrator pattern** applied correctly.

Instead of the agent “just running,” it:

* Declares *why* it’s running
* Declares *how* it plans to run
* Produces outputs that can be checked against that plan

This is exactly how you build **explainable automation**.
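
To make "checked against that plan" concrete, here is a minimal sketch of one possible check. It assumes each plan step lists its expected state keys under `outputs`, as in the state comments. The helper `missing_plan_outputs`, the second step `kpi_calculation`, and the use of step names inside `dependencies` are hypothetical and not part of the original code.

```python
from typing import Any, Dict, List


def missing_plan_outputs(plan: List[Dict[str, Any]], state: Dict[str, Any]) -> Dict[str, List[str]]:
    """Return, per plan step name, the declared outputs that are absent from state."""
    missing: Dict[str, List[str]] = {}
    for step in plan:
        absent = [key for key in step.get("outputs", []) if key not in state]
        if absent:
            missing[step["name"]] = absent
    return missing


goal = {
    "objective": "Analyze document workflow performance and calculate KPIs",
    "analysis_mode": "portfolio",
    "focus_areas": ["operational_kpis", "effectiveness_kpis", "business_kpis", "roi"],
}
plan = [
    {"step": 1, "name": "data_loading", "description": "Load all document data files",
     "dependencies": [], "outputs": ["documents", "outcomes"]},
    # Hypothetical second step, added only for illustration.
    {"step": 2, "name": "kpi_calculation", "description": "Compute KPIs",
     "dependencies": ["data_loading"], "outputs": ["operational_kpis"]},
]

# State after data loading but before KPI calculation has run.
state = {"goal": goal, "plan": plan, "documents": [], "outcomes": []}
print(missing_plan_outputs(plan, state))  # {'kpi_calculation': ['operational_kpis']}
```

An empty result means every step delivered what the plan promised, which is the kind of evidence an auditor can verify without reading the agent's logic.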

---

### B. Explicit Separation of Data vs Analysis vs Outcomes

You very clearly separate:

* Raw data (documents, versions, stages, reviews)
* Derived analysis (per-document metrics)
* Aggregated KPIs
* Executive interpretation (status, ROI, trends)

This mirrors how **real organizations operate**:

* Operations generate data
* Analysts interpret it
* Leaders make decisions

Your agent respects that boundary instead of collapsing everything into a single “answer.”
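
The layering can be sketched as a chain of small functions, where each layer reads only from the layer below it. This is an illustration under assumptions, not the agent's actual nodes: `analyze_document` and `aggregate_kpis` are hypothetical names, the records are trimmed versions of the shapes in the state comments, and the definition of override frequency used here (share of documents with at least one override) is assumed.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def analyze_document(document_id: str, stages: List[Record], reviews: List[Record]) -> Record:
    """Derived analysis: per-document metrics computed from raw records only."""
    doc_stages = [s for s in stages if s["document_id"] == document_id]
    doc_reviews = [r for r in reviews if r["document_id"] == document_id]
    return {
        "document_id": document_id,
        "total_stages": len(doc_stages),
        "failed_stages": sum(1 for s in doc_stages if s["status"] == "failed"),
        "human_overrides": sum(1 for r in doc_reviews if r.get("human_override")),
    }


def aggregate_kpis(analyses: List[Record]) -> Record:
    """Aggregated KPIs: computed from per-document analysis, not from raw data."""
    total = len(analyses)
    overridden = sum(1 for a in analyses if a["human_overrides"] > 0)
    # Assumption: override frequency = share of documents with at least one override.
    return {"human_override_frequency": overridden / total if total else 0.0}


# Raw data layer (trimmed records).
workflow_stages = [
    {"document_id": "DOC_001", "stage_name": "structure_planning", "status": "completed"},
    {"document_id": "DOC_001", "stage_name": "compliance_check", "status": "failed"},
    {"document_id": "DOC_002", "stage_name": "structure_planning", "status": "completed"},
]
review_events = [{"document_id": "DOC_001", "decision": "reject", "human_override": True}]

document_analysis = [
    analyze_document(doc_id, workflow_stages, review_events) for doc_id in ("DOC_001", "DOC_002")
]
operational_kpis = aggregate_kpis(document_analysis)
print(operational_kpis)  # {'human_override_frequency': 0.5}
```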

---

### C. Lookup Tables = Performance *and* Traceability

```python
documents_lookup
document_versions_lookup
workflow_stages_lookup
...
```

This is not just a performance optimization.

It enables:

* Fast, repeatable analysis
* Deterministic results
* Easy debugging (“show me exactly which record caused this KPI”)

This is how you answer “LLM hallucination” concerns: every reported number traces back to specific records.
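
A minimal sketch of how such lookups might be built and then used for a traceability query. The helper names `index_one` and `index_many` are hypothetical. The second compliance check (`COMP_002`, `pricing_terms_present`, status `passed`) is invented for illustration, since the state comments show only a failed check.

```python
from collections import defaultdict
from typing import Any, Dict, List

Record = Dict[str, Any]


def index_one(records: List[Record], key: str = "document_id") -> Dict[str, Record]:
    """One record per ID (documents, cost_tracking, outcomes)."""
    return {record[key]: record for record in records}


def index_many(records: List[Record], key: str = "document_id") -> Dict[str, List[Record]]:
    """Many records per ID (versions, stages, reviews, compliance checks)."""
    grouped: Dict[str, List[Record]] = defaultdict(list)
    for record in records:
        grouped[record[key]].append(record)
    return dict(grouped)


documents = [{"document_id": "DOC_001", "document_type": "proposal"}]
compliance_checks = [
    {"check_id": "COMP_001", "document_id": "DOC_001",
     "rule_name": "liability_clause_required", "status": "failed"},
    # Hypothetical record, added only for illustration.
    {"check_id": "COMP_002", "document_id": "DOC_001",
     "rule_name": "pricing_terms_present", "status": "passed"},
]

documents_lookup = index_one(documents)
compliance_checks_lookup = index_many(compliance_checks)

# "Show me exactly which record caused this KPI": list the failing checks for one document.
failed = [c["check_id"] for c in compliance_checks_lookup.get("DOC_001", []) if c["status"] == "failed"]
print(failed)  # ['COMP_001']
```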

---

## 4. KPI Design: This Is Executive-Grade

### Operational KPIs (Agent Health)

These answer:

> “Is the system functioning reliably?”

Success rate, latency, override frequency, compliance failures — exactly right.

---

### Effectiveness KPIs (Workflow Quality)

These answer:

> “Is the process actually improving?”

Cycle time, rework loops, reviewer time saved — these are **process improvement metrics**, not AI vanity metrics.

---

### Business KPIs (ROI & Value)

These answer:

> “Is this worth the investment?”

Cost per document, hours saved, ROI %, revenue impact — this is the language leadership speaks.

Crucially, you define **baselines** explicitly. That is what makes ROI credible: every savings figure is measured against a stated starting point.
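
As a worked illustration, the ROI fields can be derived from the baseline figures in the state comments. The definitions of net ROI (revenue minus cost) and ROI ratio (revenue divided by cost) follow those comments. Two things are assumptions: ROI percent is taken as net ROI over cost, and revenue impact is taken as hours saved times `revenue_per_hour_saved`. The function names are hypothetical.

```python
from typing import Dict, Union


def cost_reduction_percent(baseline_cost_usd: float, actual_cost_usd: float) -> float:
    """Percent reduction relative to the explicit baseline."""
    if baseline_cost_usd <= 0:
        raise ValueError("baseline cost must be positive")
    return (baseline_cost_usd - actual_cost_usd) / baseline_cost_usd * 100.0


def compute_roi(total_revenue_impact_usd: float, total_cost_usd: float) -> Dict[str, Union[float, str]]:
    """Derive the ROI fields of the state from total revenue impact and total cost."""
    if total_cost_usd <= 0:
        raise ValueError("total cost must be positive")
    net = total_revenue_impact_usd - total_cost_usd
    return {
        "net_roi_usd": net,                                   # revenue - cost
        "roi_ratio": total_revenue_impact_usd / total_cost_usd,  # revenue / cost
        "roi_percent": net / total_cost_usd * 100.0,          # assumption: net / cost
        "roi_status": "positive" if net > 0 else "negative" if net < 0 else "neutral",
    }


# Example figures from the state comments: $120.00 baseline vs $35.50 actual per document.
print(round(cost_reduction_percent(120.00, 35.50), 1))  # 70.4

# Assumption: 45 hours saved at $50 per hour, across 10 documents at $35.50 each.
roi = compute_roi(total_revenue_impact_usd=45.0 * 50.0, total_cost_usd=10 * 35.50)
print(roi["net_roi_usd"], roi["roi_status"])  # 1895.0 positive
```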

---

## 5. KPI Status & Thresholds: Why This Builds Trust

```python
kpi_warning_threshold
kpi_critical_threshold
```

This is a **huge trust signal**.

Instead of the agent declaring “success,” leadership can see:

* What “on track” means
* When things are degraded
* When intervention is required

This mirrors real-world management dashboards and avoids subjective interpretation.
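
One way the two thresholds could map a KPI value to a status is sketched below. The mapping is an assumption: the original code defines the thresholds but not the classification rule. The `"critical"` label is also hypothetical, because the `kpi_status` comment lists only `on_track`, `at_risk`, and `exceeded`.

```python
def classify_kpi(
    value: float,
    target: float,
    higher_is_better: bool = True,
    warning_threshold: float = 0.8,
    critical_threshold: float = 0.5,
) -> str:
    """Classify a KPI by how much of its target it attains (assumed rule)."""
    if target <= 0 or value < 0:
        raise ValueError("target must be positive and value non-negative")
    if higher_is_better:
        attainment = value / target
    else:
        # For "lower is better" KPIs such as latency, invert the ratio.
        attainment = target / value if value > 0 else float("inf")
    if attainment > 1.0:
        return "exceeded"
    if attainment >= warning_threshold:
        return "on_track"
    if attainment >= critical_threshold:
        return "at_risk"
    return "critical"  # hypothetical fourth label, not listed in the state comments


print(classify_kpi(0.95, 0.90))                          # exceeded
print(classify_kpi(0.60, 0.90))                          # at_risk
print(classify_kpi(12.5, 15.0, higher_is_better=False))  # exceeded
```

Because the thresholds are parameters rather than constants buried in logic, the same rule can be reviewed and tuned without touching the code that computes the KPIs.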

---

## 6. Workflow & Statistical Analysis: Mature by Design

### Workflow Analysis

You’re not just reporting averages — you’re identifying:

* Bottleneck stages
* Failure-prone steps
* Overall workflow health

This turns the agent into a **continuous improvement system**, not a reporting tool.

---

### Statistical Assessments

Including statistical tests at the state level is a strong signal that:

* Improvements must be *proven*
* Not all changes are meaningful
* Leadership can separate statistically significant results from noise

Few agent designs carry this into the state itself, which makes it a meaningful differentiator.
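
To show what "proven" could mean in practice, here is a standard-library sketch of a significance check on paired cycle times. It deliberately swaps in an exact sign-flip permutation test instead of the `t_test` named in the state comments, so that no external statistics library is needed. The function name and the sample cycle times are hypothetical. Only the result keys `p_value` and `is_significant` and the 0.95 confidence level come from the original code.

```python
from itertools import product
from typing import Dict, List, Union


def paired_sign_flip_test(
    baseline: List[float], actual: List[float], confidence_level: float = 0.95
) -> Dict[str, Union[str, float, bool]]:
    """One-sided exact sign-flip permutation test for 'baseline > actual'.

    Under the null hypothesis of no improvement, each paired difference is
    equally likely to be positive or negative, so all 2^n sign assignments
    are enumerated. This is only practical for small n.
    """
    if len(baseline) != len(actual) or not baseline:
        raise ValueError("need two non-empty samples of equal length")
    diffs = [b - a for b, a in zip(baseline, actual)]
    observed = sum(diffs)
    at_least_as_large = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        # Small tolerance guards against floating-point ties.
        if sum(s * d for s, d in zip(signs, diffs)) >= observed - 1e-9:
            at_least_as_large += 1
    p_value = at_least_as_large / total
    return {
        "test_type": "sign_flip_permutation",
        "mean_improvement_hours": observed / len(diffs),
        "p_value": p_value,
        "is_significant": p_value < 1.0 - confidence_level,
    }


# Hypothetical cycle times in hours for eight documents.
baseline_hours = [72, 48, 60, 80, 55, 66, 90, 40]
actual_hours = [36, 30, 42, 50, 41, 40, 60, 35]
result = paired_sign_flip_test(baseline_hours, actual_hours)
print(result["p_value"], result["is_significant"])  # 0.00390625 True
```

With every document improving, only one of the 256 sign assignments matches the observed total, which is why the p-value is 1/256.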

---

## 7. Configuration Class: Why This Is Excellent

### `ProposalDocumentOrchestratorConfig`

This config class is doing exactly what it should:

* Separating **policy from execution**
* Making assumptions explicit
* Allowing leadership to tune risk tolerance and targets

Key strengths:

* KPI targets are transparent
* Cost assumptions are configurable
* LLM usage is optional and gated
* Toolshed integrations are explicitly controlled

This design makes the agent:

* Safer
* Easier to govern
* Easier to adapt across organizations
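
The separation of policy from execution can be illustrated with a cut-down stand-in for the config class. `OrchestratorPolicy` and `override_rate_within_policy` are hypothetical names. The field names and defaults mirror `ProposalDocumentOrchestratorConfig`, and `frozen=True` is a choice made for this sketch, not something the original class does.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class OrchestratorPolicy:
    """Hypothetical trimmed-down stand-in for ProposalDocumentOrchestratorConfig."""
    target_document_success_rate: float = 0.90
    max_human_override_rate: float = 0.25
    min_roi_percent: float = 100.0
    enable_llm_summary: bool = False


def override_rate_within_policy(observed_rate: float, policy: OrchestratorPolicy) -> bool:
    """Execution code reads the policy; it never hard-codes the limit."""
    return observed_rate <= policy.max_human_override_rate


default_policy = OrchestratorPolicy()
# A stricter organization tunes risk tolerance without touching execution code.
strict_policy = replace(default_policy, max_human_override_rate=0.10)

print(override_rate_within_policy(0.20, default_policy))  # True
print(override_rate_within_policy(0.20, strict_policy))   # False
```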

---

## 8. LLM Usage: Correctly De-Prioritized

```python
enable_llm_summary: bool = False
```

This is an important design signal.

You are clearly saying:

> “The system works without the LLM.
> The LLM enhances communication, not decision-making.”

That single choice aligns perfectly with your guiding principle:

> *The LLM explains what the system has already proven.*
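
A small sketch of what this gating might look like. The function names and the `llm_rewrite` callable are hypothetical, and no real LLM client is used or implied. The point is only that the rule-based summary is computed first and the LLM, when enabled, receives those established facts and nothing else.

```python
from typing import Callable, Dict, Optional


def rule_based_summary(kpi_status: Dict[str, str]) -> str:
    """Deterministic summary built only from already-computed KPI statuses."""
    at_risk = sorted(name for name, status in kpi_status.items() if status == "at_risk")
    if not at_risk:
        return "All tracked KPIs are on track or exceeded."
    return "KPIs at risk: " + ", ".join(at_risk) + "."


def executive_summary(
    kpi_status: Dict[str, str],
    enable_llm_summary: bool = False,
    llm_rewrite: Optional[Callable[[str], str]] = None,  # hypothetical LLM wrapper
) -> str:
    """Return the rule-based summary, optionally rephrased by an LLM."""
    facts = rule_based_summary(kpi_status)
    if enable_llm_summary and llm_rewrite is not None:
        # The LLM only rewords the facts; it does not compute or decide anything.
        return llm_rewrite(facts)
    return facts


status = {"cost_reduction": "exceeded", "document_generation_success_rate": "at_risk"}
print(executive_summary(status))  # KPIs at risk: document_generation_success_rate.
```

With the flag left at its default of `False`, the report is identical whether or not an LLM is available.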

---

## 9. Overall Assessment

This is **strong, disciplined architecture**.

What stands out most:

* You designed for **trust first**
* You separated reasoning from reporting
* You built explicit accountability into the state
* You made ROI unavoidable, not optional

This code sets you up to build an agent that:

* Can survive executive scrutiny
* Can be audited
* Can evolve without losing control
* Can be defended in front of legal, finance, and operations



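# ============================================================================
# Proposal & Document Orchestrator Agent
# ============================================================================

# Imports required by the definitions below
import os
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, TypedDict
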
class ProposalDocumentOrchestratorState(TypedDict, total=False):
    """State for Proposal & Document Orchestrator Agent

    This orchestrator manages the end-to-end document lifecycle, from creation
    through review, validation, and continuous improvement. It functions as a
    document production control system with explicit quality controls, evaluation
    criteria, human review points, and measurable outcomes.
    """

    # Input fields
    document_id: Optional[str]              # Single document to analyze (if provided)
    analysis_mode: str                      # "single" | "portfolio" (analyze all documents)
    filter_criteria: Optional[Dict[str, Any]]  # Optional filters (document_type, status, priority, date_range)

    # Goal & Planning fields (Universal patterns - always include)
    goal: Dict[str, Any]                   # Goal definition (from goal_node)
    # Structure:
    # {
    #   "objective": "Analyze document workflow performance and calculate KPIs",
    #   "analysis_mode": "portfolio",
    #   "focus_areas": ["operational_kpis", "effectiveness_kpis", "business_kpis", "roi"]
    # }

    plan: List[Dict[str, Any]]             # Execution plan (from planning_node)
    # Structure per step:
    # {
    #   "step": 1,
    #   "name": "data_loading",
    #   "description": "Load all document data files",
    #   "dependencies": [],
    #   "outputs": ["documents", "document_versions", "workflow_stages", ...]
    # }

    # Data Loading - All 7 data files
    documents: List[Dict[str, Any]]        # All documents from documents.json
    # Structure per document:
    # {
    #   "document_id": "DOC_001",
    #   "document_type": "proposal",
    #   "client_name": "Acme Manufacturing",
    #   "industry": "Manufacturing",
    #   "status": "submitted",
    #   "target_outcome": "win_contract",
    #   "priority": "high",
    #   "owner_role": "sales",
    #   "created_at": "2026-01-08T14:12:00Z",
    #   "updated_at": "2026-01-10T09:45:00Z"
    # }

    document_versions: List[Dict[str, Any]] # All versions from document_versions.json
    # Structure per version:
    # {
    #   "version_id": "V_DOC_001_1",
    #   "document_id": "DOC_001",
    #   "version_number": 1,
    #   "created_by": "agent",
    #   "change_summary": "Initial draft created",
    #   "word_count": 1850,
    #   "content_reference": "documents/DOC_001/v1.md",
    #   "created_at": "2026-01-08T14:25:00Z"
    # }

    workflow_stages: List[Dict[str, Any]]  # All stages from workflow_stages.json
    # Structure per stage:
    # {
    #   "stage_id": "STG_001",
    #   "document_id": "DOC_001",
    #   "version_id": "V_DOC_001_1",
    #   "stage_name": "structure_planning",
    #   "stage_order": 1,
    #   "status": "completed",
    #   "started_at": "2026-01-08T14:25:00Z",
    #   "completed_at": "2026-01-08T14:40:00Z",
    #   "failure_reason": null
    # }

    review_events: List[Dict[str, Any]]    # All reviews from review_events.json
    # Structure per review:
    # {
    #   "review_id": "REV_001",
    #   "document_id": "DOC_001",
    #   "version_id": "V_DOC_001_1",
    #   "reviewer_id": "REVIEWER_LEGAL_001",
    #   "reviewer_role": "legal",
    #   "decision": "reject",
    #   "reason": "Missing liability clause",
    #   "time_spent_minutes": 18,
    #   "human_override": true,
    #   "reviewed_at": "2026-01-08T15:30:00Z"
    # }

    compliance_checks: List[Dict[str, Any]] # All checks from compliance_checks.json
    # Structure per check:
    # {
    #   "check_id": "COMP_001",
    #   "document_id": "DOC_001",
    #   "version_id": "V_DOC_001_1",
    #   "rule_name": "liability_clause_required",
    #   "rule_category": "legal",
    #   "status": "failed",
    #   "severity": "high",
    #   "details": "Missing required liability clause",
    #   "checked_at": "2026-01-08T15:20:00Z"
    # }

    cost_tracking: List[Dict[str, Any]]    # All costs from cost_tracking.json
    # Structure per cost entry:
    # {
    #   "document_id": "DOC_001",
    #   "llm_cost_usd": 2.85,
    #   "tooling_cost_usd": 0.65,
    #   "human_review_cost_usd": 54.00,
    #   "total_cost_usd": 57.50,
    #   "stage_breakdown": {
    #     "structure_planning": 0.30,
    #     "content_generation": 1.95,
    #     "content_revision": 0.60
    #   },
    #   "tracked_at": "2026-01-10T09:50:00Z"
    # }

    outcomes: List[Dict[str, Any]]         # All outcomes from outcomes.json
    # Structure per outcome:
    # {
    #   "document_id": "DOC_001",
    #   "final_status": "submitted",
    #   "baseline_cycle_time_hours": 72,
    #   "actual_cycle_time_hours": 36,
    #   "estimated_hours_saved": 6,
    #   "outcome_proxy": "proposal_submitted_faster",
    #   "completed_at": "2026-01-10T09:45:00Z"
    # }

    # Lookup dictionaries (for performance - access entities by ID multiple times)
    documents_lookup: Dict[str, Dict[str, Any]]      # document_id → document
    document_versions_lookup: Dict[str, List[Dict[str, Any]]]  # document_id → [versions]
    workflow_stages_lookup: Dict[str, List[Dict[str, Any]]]    # document_id → [stages]
    review_events_lookup: Dict[str, List[Dict[str, Any]]]      # document_id → [reviews]
    compliance_checks_lookup: Dict[str, List[Dict[str, Any]]]  # document_id → [checks]
    cost_tracking_lookup: Dict[str, Dict[str, Any]]             # document_id → cost_entry
    outcomes_lookup: Dict[str, Dict[str, Any]]                  # document_id → outcome

    # Analysis Results
    document_analysis: List[Dict[str, Any]]  # Per-document analysis results
    # Structure per document analysis:
    # {
    #   "document_id": "DOC_001",
    #   "revision_count": 2,
    #   "total_stages": 5,
    #   "failed_stages": 1,
    #   "compliance_failures": 1,
    #   "human_overrides": 1,
    #   "total_cost_usd": 57.50,
    #   "cycle_time_hours": 36,
    #   "baseline_cycle_time_hours": 72,
    #   "hours_saved": 6,
    #   "avg_stage_duration_minutes": 15.2
    # }

    # KPI Metrics (from orchestrator spec)
    # 1. Operational KPIs (Agent Health)
    operational_kpis: Dict[str, Any]
    # Structure:
    # {
    #   "document_generation_success_rate": 0.90,  # 90% success
    #   "avg_stage_latency_minutes": 12.5,
    #   "avg_revision_count": 1.8,
    #   "compliance_failure_rate": 0.20,  # 20% failure rate
    #   "human_override_frequency": 0.30,  # 30% override rate
    #   "source_validation_pass_rate": 0.95
    # }

    # 2. Effectiveness KPIs (Workflow Quality)
    effectiveness_kpis: Dict[str, Any]
    # Structure:
    # {
    #   "avg_time_to_first_draft_hours": 4.5,
    #   "avg_cycle_time_hours": 32.0,
    #   "avg_cycle_time_reduction_percent": 50.0,  # vs baseline
    #   "avg_rework_loops": 1.2,
    #   "reviewer_time_saved_hours": 2.5,
    #   "consistency_score": 0.85  # Similarity across similar documents
    # }

    # 3. Business KPIs (ROI & Value)
    business_kpis: Dict[str, Any]
    # Structure:
    # {
    #   "avg_cost_per_document_usd": 35.50,
    #   "baseline_cost_per_document_usd": 120.00,  # Before automation
    #   "cost_reduction_percent": 70.4,
    #   "avg_hours_saved_per_document": 4.5,
    #   "total_hours_saved": 45.0,  # Across all documents
    #   "estimated_revenue_impact_usd": 5000.0,  # Faster cycles = revenue timing
    #   "compliance_risk_reduction_percent": 40.0
    # }

    # KPI Status Assessment
    kpi_status: Dict[str, str]              # KPI achievement status
    # Structure:
    # {
    #   "document_generation_success_rate": "on_track" | "at_risk" | "exceeded",
    #   "avg_cycle_time_reduction": "on_track" | "at_risk" | "exceeded",
    #   "cost_reduction": "on_track" | "at_risk" | "exceeded",
    #   ...
    # }

    # ROI & Cost Analysis (CEO Trust Requirements)
    total_cost_usd: float                   # Total cost across all documents
    total_revenue_impact_usd: float         # Total revenue impact
    net_roi_usd: float                      # Net ROI (revenue - cost)
    roi_percent: float                      # ROI percentage
    roi_ratio: float                        # ROI ratio (revenue / cost)
    roi_status: str                         # "positive" | "negative" | "neutral"
    cost_efficiency: Dict[str, Any]         # Cost efficiency analysis

    # Workflow Analysis
    workflow_analysis: Dict[str, Any]       # Workflow health and bottleneck analysis
    # Structure:
    # {
    #   "bottleneck_stages": [
    #     {"stage_name": "compliance_check", "avg_duration_minutes": 25.0, "failure_rate": 0.30}
    #   ],
    #   "stage_performance": {
    #     "structure_planning": {"avg_duration": 12.0, "success_rate": 0.95},
    #     "content_generation": {"avg_duration": 28.0, "success_rate": 0.90},
    #     ...
    #   },
    #   "workflow_health": "healthy" | "degraded" | "critical"
    # }

    # Portfolio Summary
    portfolio_summary: Dict[str, Any]       # High-level portfolio metrics
    # Structure:
    # {
    #   "total_documents": 10,
    #   "documents_by_type": {"proposal": 5, "policy_update": 2, "client_report": 2, "internal_memo": 1},
    #   "documents_by_status": {"submitted": 2, "in_review": 2, "approved": 2, "delivered": 2, "rejected": 1, "in_progress": 1},
    #   "documents_by_priority": {"high": 5, "medium": 4, "low": 1},
    #   "total_versions": 15,
    #   "total_reviews": 8,
    #   "total_compliance_checks": 9
    # }

    # Statistical Assessment (CEO Trust Requirements)
    statistical_assessments: Dict[str, Any]  # Statistical significance tests
    # Structure:
    # {
    #   "cycle_time_improvement": {
    #     "test_type": "t_test",
    #     "p_value": 0.001,
    #     "is_significant": True,
    #     "confidence_interval": {"lower": 30.0, "upper": 42.0}
    #   },
    #   "cost_reduction": {...},
    #   ...
    # }

    # Trends & Historical Comparison (if historical data available)
    trends: Dict[str, Dict[str, Any]]       # Trend analysis
    # Structure:
    # {
    #   "cycle_time": {
    #     "direction": "improving",
    #     "indicator": "↓",
    #     "percent_change": -50.0,
    #     "is_significant": True
    #   },
    #   "cost": {...},
    #   "roi": {...}
    # }

    historical_comparison: Optional[Dict[str, Any]]  # Comparison to previous analysis

    # Output
    executive_report: str                   # Final markdown report
    report_file_path: Optional[str]         # Path to saved report file

    # Metadata (Universal patterns - always include)
    errors: List[str]                       # Any errors encountered
    processing_time: Optional[float]        # Time taken to process (seconds)


@dataclass
class ProposalDocumentOrchestratorConfig:
    """Configuration for Proposal & Document Orchestrator Agent"""

    # LLM Settings
    llm_model: str = os.getenv("LLM_MODEL", "gpt-4o-mini")
    temperature: float = 0.3

    # Data file paths
    data_dir: str = "agents/data"
    documents_file: str = "documents.json"
    document_versions_file: str = "document_versions.json"
    workflow_stages_file: str = "workflow_stages.json"
    review_events_file: str = "review_events.json"
    compliance_checks_file: str = "compliance_checks.json"
    cost_tracking_file: str = "cost_tracking.json"
    outcomes_file: str = "outcomes.json"

    # Output settings
    reports_dir: str = "output/proposal_document_orchestrator"

    # KPI Thresholds (CEO-friendly transparency)
    kpi_warning_threshold: float = 0.8      # Warn if KPI is 80% of target
    kpi_critical_threshold: float = 0.5     # Critical if KPI is 50% of target

    # Operational KPI Targets
    target_document_success_rate: float = 0.90      # 90% success rate target
    target_avg_stage_latency_minutes: float = 15.0  # 15 min average stage latency
    max_avg_revision_count: float = 2.0             # Max 2 revisions on average
    target_compliance_pass_rate: float = 0.85       # 85% compliance pass rate
    max_human_override_rate: float = 0.25           # Max 25% override rate

    # Effectiveness KPI Targets
    target_time_to_first_draft_hours: float = 6.0   # 6 hours to first draft
    target_cycle_time_reduction_percent: float = 40.0  # 40% cycle time reduction
    max_avg_rework_loops: float = 1.5               # Max 1.5 rework loops

    # Business KPI Targets
    target_cost_reduction_percent: float = 50.0     # 50% cost reduction target
    target_hours_saved_per_document: float = 3.0    # 3 hours saved per document
    min_roi_percent: float = 100.0                  # Minimum 100% ROI

    # ROI Calculation Settings
    cost_per_human_review_hour: float = 60.0       # Cost per hour of human review
    cost_per_llm_call: float = 0.01                 # Estimated cost per LLM call
    cost_per_api_call: float = 0.001                # Estimated cost per API call
    infrastructure_cost_per_month: float = 500.0     # Infrastructure cost per month
    revenue_per_hour_saved: float = 50.0            # Revenue impact per hour saved (timing)

    # Statistical Testing
    confidence_level: float = 0.95                  # 95% confidence level for statistical tests

    # Toolshed Integration Flags
    enable_progress_tracking: bool = True           # Use toolshed.progress
    enable_kpi_tracking: bool = True               # Use toolshed.kpi
    enable_statistical_testing: bool = True         # Use toolshed.statistics
    enable_reporting: bool = True                   # Use toolshed.reporting
    enable_validation: bool = True                   # Use toolshed.validation
    enable_workflow_analysis: bool = True           # Use toolshed.workflows

    # LLM Enhancement (Optional - Phase 8)
    enable_llm_summary: bool = False               # Enable LLM-generated executive summary (MVP: rule-based)
    llm_summary_max_tokens: int = 500              # Max tokens for executive summary
