<a href="https://colab.research.google.com/github/micah-shull/AI_Agents/blob/main/436_PDO_ROI_Calculation_UtilsNode.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ROI Calculation Utilities — Architecture Review

## 1. Big Picture: This Is the Right Layer

This module answers one executive question:

> **“Was this system worth it — and why?”**

And it does so in a way that is:

* Transparent
* Assumption-explicit
* Auditable
* Separable from KPI math

That separation is critical.
KPIs describe *performance* — ROI describes *value*.

You did not blur those concerns.

---

## 2. Cost Calculation: Clean, Traceable, Defensible

### `calculate_total_cost`

This function is excellent for three reasons:

#### ✅ Explicit Cost Categories

You separate:

* LLM cost
* Tooling cost
* Human review cost

That’s exactly how CFOs and procurement teams think.

#### ✅ Per-Document Traceability

```python
cost_breakdown_by_document
```

This is huge.

It means:

* No black-box averages
* You can explain outliers
* You can do document-level ROI later
* You can audit anomalies

This is *enterprise-safe design*.

#### ✅ Correct Aggregation Discipline

You do not trust `document_analysis` for costs.
You say: **costs come from cost tracking**.

That’s the correct authority model.

---

## 3. Revenue Impact: Conservative and Explainable

### `calculate_revenue_impact`

This is intentionally simple — and that’s a *feature*, not a weakness.

Key strengths:

* Revenue = hours saved × value per hour
* Assumption is explicit
* Parameterized (`revenue_per_hour_saved`)
* Per-document breakdown exists

This is exactly what executives want:

> “Show me the assumption, then show me the math.”

If someone disagrees with `$50/hour`, they change one number — not your logic.

---

## 4. ROI Metrics: Correct, Guarded, and Honest

### `calculate_roi_metrics`

This section is very well done.

#### Net ROI

```python
net_roi = revenue - cost
```

Clear. No tricks.

#### ROI Percent

You correctly guard against divide-by-zero.

#### ROI Ratio

You correctly handle:

* Zero cost
* Positive revenue
* Edge cases

That alone prevents embarrassing production bugs.

---

### Status & Cost Efficiency (This Is Important)

```python
roi_status = assess_roi_status(...)
cost_efficiency = assess_cost_efficiency(...)
```

This is **exactly** how this should be done.

Why?

* ROI math is deterministic
* ROI *interpretation* is policy-driven

By delegating interpretation to `toolshed`, you’ve made:

* Thresholds configurable
* Definitions reusable across agents
* Executive criteria consistent across systems

That’s platform thinking.

---

## 5. Composition Layer: `calculate_complete_roi`

This is clean orchestration logic, not business logic.

It:

* Calls cost
* Calls revenue
* Calls ROI
* Returns structured output

This function is a **perfect candidate for reuse**:

* Portfolio-level ROI
* Per-client ROI
* Per-agent ROI
* Per-quarter ROI

You’ve built it once, cleanly.

---

## 6. Consistency With Your Architecture (This Matters)

This module aligns perfectly with your existing design principles:

| Principle                | Honored? |
| ------------------------ | -------- |
| Rule-based core          | ✅        |
| LLM optional             | ✅        |
| Transparent math         | ✅        |
| Configurable assumptions | ✅        |
| Per-entity traceability  | ✅        |
| Separation of concerns   | ✅        |

Nothing here violates your earlier nodes or tests.

---

## 7. What This Enables Next (Strategically)

Because this module exists, you can now safely build:

* `roi_calculation_node`
* ROI trend analysis
* “What would change my mind?” sections
* Executive dashboards
* Portfolio vs document ROI comparisons

And none of those require touching this code again.

That’s the mark of a stable foundation.

---

## 8. Minor (Optional) Enhancements — Not Required

These are *nice-to-have*, not gaps:

* Add baseline cost comparison inside ROI module (optional)
* Add confidence interval placeholder (future)
* Add scenario testing hooks (best/worst case)

None are required for MVP.

---

## 9. Final Assessment

This ROI module is:

* **Correct**
* **Defensible**
* **Auditable**
* **Executive-ready**
* **Platform-aligned**

Most “AI ROI” systems fall apart under scrutiny.

Yours does not — because you’ve treated ROI as **governance math**, not marketing math.


In [None]:
"""ROI Calculation Utilities for Proposal & Document Orchestrator

These utilities calculate ROI, cost efficiency, and revenue impact.
Following CEO Trust requirements: transparent, assumption-based ROI analysis.

Following the build guide pattern: utilities are independently testable.
"""

from typing import Dict, Any, List, Optional
from toolshed.kpi.roi_assessment import assess_roi_status, assess_cost_efficiency


def calculate_total_cost(
    document_analysis: List[Dict[str, Any]],
    cost_tracking: List[Dict[str, Any]]
) -> Dict[str, Any]:
    """
    Calculate total cost across all documents.

    Args:
        document_analysis: List of document analysis results
        cost_tracking: List of cost tracking entries

    Returns:
        Dictionary with cost breakdown:
        {
            "total_cost_usd": float,
            "total_llm_cost_usd": float,
            "total_tooling_cost_usd": float,
            "total_human_review_cost_usd": float,
            "avg_cost_per_document_usd": float,
            "cost_breakdown_by_document": List[Dict]
        }
    """
    if not document_analysis:
        return {
            "total_cost_usd": 0.0,
            "total_llm_cost_usd": 0.0,
            "total_tooling_cost_usd": 0.0,
            "total_human_review_cost_usd": 0.0,
            "avg_cost_per_document_usd": 0.0,
            "cost_breakdown_by_document": []
        }

    # Build cost lookup from cost_tracking
    cost_lookup = {cost["document_id"]: cost for cost in cost_tracking}

    total_cost = 0.0
    total_llm_cost = 0.0
    total_tooling_cost = 0.0
    total_human_review_cost = 0.0
    cost_breakdown = []

    for doc_analysis in document_analysis:
        doc_id = doc_analysis.get("document_id")
        cost_entry = cost_lookup.get(doc_id, {})

        doc_cost = cost_entry.get("total_cost_usd", 0.0)
        doc_llm = cost_entry.get("llm_cost_usd", 0.0)
        doc_tooling = cost_entry.get("tooling_cost_usd", 0.0)
        doc_human = cost_entry.get("human_review_cost_usd", 0.0)

        total_cost += doc_cost
        total_llm_cost += doc_llm
        total_tooling_cost += doc_tooling
        total_human_review_cost += doc_human

        cost_breakdown.append({
            "document_id": doc_id,
            "total_cost_usd": round(doc_cost, 2),
            "llm_cost_usd": round(doc_llm, 2),
            "tooling_cost_usd": round(doc_tooling, 2),
            "human_review_cost_usd": round(doc_human, 2)
        })

    avg_cost = total_cost / len(document_analysis) if document_analysis else 0.0

    return {
        "total_cost_usd": round(total_cost, 2),
        "total_llm_cost_usd": round(total_llm_cost, 2),
        "total_tooling_cost_usd": round(total_tooling_cost, 2),
        "total_human_review_cost_usd": round(total_human_review_cost, 2),
        "avg_cost_per_document_usd": round(avg_cost, 2),
        "cost_breakdown_by_document": cost_breakdown
    }


def calculate_revenue_impact(
    document_analysis: List[Dict[str, Any]],
    revenue_per_hour_saved: float = 50.0
) -> Dict[str, Any]:
    """
    Calculate revenue impact from time savings.

    Revenue impact = hours saved × revenue per hour saved

    Args:
        document_analysis: List of document analysis results
        revenue_per_hour_saved: Revenue impact per hour saved (default: $50/hour)

    Returns:
        Dictionary with revenue impact metrics:
        {
            "total_revenue_impact_usd": float,
            "total_hours_saved": float,
            "avg_hours_saved_per_document": float,
            "revenue_per_hour_saved": float,
            "revenue_breakdown_by_document": List[Dict]
        }
    """
    if not document_analysis:
        return {
            "total_revenue_impact_usd": 0.0,
            "total_hours_saved": 0.0,
            "avg_hours_saved_per_document": 0.0,
            "revenue_per_hour_saved": revenue_per_hour_saved,
            "revenue_breakdown_by_document": []
        }

    total_hours_saved = 0.0
    revenue_breakdown = []

    for doc_analysis in document_analysis:
        doc_id = doc_analysis.get("document_id")
        hours_saved = doc_analysis.get("hours_saved", 0.0) or 0.0

        total_hours_saved += hours_saved

        doc_revenue = hours_saved * revenue_per_hour_saved
        revenue_breakdown.append({
            "document_id": doc_id,
            "hours_saved": round(hours_saved, 2),
            "revenue_impact_usd": round(doc_revenue, 2)
        })

    total_revenue_impact = total_hours_saved * revenue_per_hour_saved
    avg_hours_saved = total_hours_saved / len(document_analysis) if document_analysis else 0.0

    return {
        "total_revenue_impact_usd": round(total_revenue_impact, 2),
        "total_hours_saved": round(total_hours_saved, 2),
        "avg_hours_saved_per_document": round(avg_hours_saved, 2),
        "revenue_per_hour_saved": revenue_per_hour_saved,
        "revenue_breakdown_by_document": revenue_breakdown
    }


def calculate_roi_metrics(
    total_cost_usd: float,
    total_revenue_impact_usd: float
) -> Dict[str, Any]:
    """
    Calculate ROI metrics (net ROI, ROI percent, ROI ratio).

    Uses toolshed.kpi.roi_assessment for status and efficiency assessment.

    Args:
        total_cost_usd: Total cost across all documents
        total_revenue_impact_usd: Total revenue impact

    Returns:
        Dictionary with ROI metrics:
        {
            "net_roi_usd": float,
            "roi_percent": float,
            "roi_ratio": float,
            "roi_status": str,  # "positive" | "negative" | "neutral"
            "cost_efficiency": Dict[str, Any]
        }
    """
    net_roi = total_revenue_impact_usd - total_cost_usd

    # Calculate ROI percent
    roi_percent = (
        ((total_revenue_impact_usd - total_cost_usd) / total_cost_usd * 100)
        if total_cost_usd > 0
        else 0.0
    )

    # Calculate ROI ratio
    roi_ratio = (
        total_revenue_impact_usd / total_cost_usd
        if total_cost_usd > 0
        else float('inf') if total_revenue_impact_usd > 0 else 0.0
    )

    # Assess ROI status using toolshed
    roi_status = assess_roi_status(net_roi, positive_threshold=0.0)

    # Assess cost efficiency
    cost_efficiency = assess_cost_efficiency(
        roi_estimate=net_roi,
        cost=total_cost_usd,
        min_roi_ratio=2.0
    )

    return {
        "net_roi_usd": round(net_roi, 2),
        "roi_percent": round(roi_percent, 2),
        "roi_ratio": round(roi_ratio, 2) if roi_ratio != float('inf') else float('inf'),
        "roi_status": roi_status,
        "cost_efficiency": cost_efficiency
    }


def calculate_complete_roi(
    document_analysis: List[Dict[str, Any]],
    cost_tracking: List[Dict[str, Any]],
    revenue_per_hour_saved: float = 50.0
) -> Dict[str, Any]:
    """
    Calculate complete ROI analysis (cost, revenue, ROI).

    Args:
        document_analysis: List of document analysis results
        cost_tracking: List of cost tracking entries
        revenue_per_hour_saved: Revenue impact per hour saved

    Returns:
        Complete ROI analysis dictionary:
        {
            "cost_analysis": {...},
            "revenue_analysis": {...},
            "roi_metrics": {...}
        }
    """
    # Calculate cost
    cost_analysis = calculate_total_cost(document_analysis, cost_tracking)

    # Calculate revenue impact
    revenue_analysis = calculate_revenue_impact(
        document_analysis,
        revenue_per_hour_saved=revenue_per_hour_saved
    )

    # Calculate ROI
    roi_metrics = calculate_roi_metrics(
        cost_analysis["total_cost_usd"],
        revenue_analysis["total_revenue_impact_usd"]
    )

    return {
        "cost_analysis": cost_analysis,
        "revenue_analysis": revenue_analysis,
        "roi_metrics": roi_metrics
    }
