
# Report generation utilities for Product-Customer Fit Discovery Orchestrator

This final batch of code, the **Report Generation Utilities**, represents the **Presentation Layer (Step 7)** of your entire orchestrator.

Its purpose is not analysis, but **communication**. It takes the structured, validated results from all previous agents and organizes them into a single, cohesive, and actionable narrative that a business executive can read and act upon immediately.

***

## 🧠 Core Agent Architecture: The Presentation Layer

The `generate_discovery_report` function acts as the final strategic writer, translating complex numerical results into plain-language business insights using a structured report format.
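
The function reads all of its inputs from one plain dictionary. As a rough guide, the keys it looks up can be sketched as a `TypedDict`. The class names `Opportunity` and `DiscoveryState` are hypothetical labels introduced here for illustration; the key names are taken from the `.get(...)` calls in the code cell, and every key is treated as optional because the function falls back to defaults.

```python
from typing import Any, Dict, List, TypedDict


class Opportunity(TypedDict, total=False):
    """One entry of state["top_opportunities"] (hypothetical class name)."""
    title: str
    insight_type: str                 # e.g. "product_gap"
    confidence: float                 # fraction, rendered as a percentage
    business_value: float             # rendered as whole dollars
    validated: bool
    description: str
    recommended_actions: List[str]
    evidence: Dict[str, List[Any]]    # e.g. {"from_patterns": [...]}


class DiscoveryState(TypedDict, total=False):
    """Top-level keys read by the report generator (hypothetical class name)."""
    synthesis_summary: Dict[str, Any]
    top_opportunities: List[Opportunity]
    customer_clusters: List[Dict[str, Any]]
    product_clusters: List[Dict[str, Any]]
    association_rules: List[Dict[str, Any]]
    graph_analysis_summary: Dict[str, Any]
    centrality_metrics: Dict[str, Any]


# Every key is optional, so an empty state is still acceptable input.
empty_state: DiscoveryState = {}
example_state: DiscoveryState = {
    "top_opportunities": [{"title": "Test Opportunity", "confidence": 0.8}],
}
```

At runtime these are ordinary dictionaries; the annotations only help a type checker and the reader.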

### 1. The Strategic Narrative

The report is architected to tell a clear business story, moving from high-level conclusions to supporting evidence:

* **Executive Summary:** Starts with the most critical metrics: the total number of opportunities, the number of **high-confidence** opportunities, the estimated **potential value**, and the count of **cross-validated insights**. This ensures the executive immediately grasps the scale and reliability of the findings.
* **Top Business Opportunities:** This is the heart of the report. It lists up to ten opportunities in the order supplied by the Synthesis Agent and, for each one, spells out the **Recommended Actions** and the **Supporting Evidence** (e.g., "Patterns: Association rule..."). This transparency lets the reader see which analyses back each insight.
* **Detailed Analysis Sections:** The rest of the report provides the necessary context and justification by detailing the raw results from each analytical step:
    * Summaries of the most important **Customer Segments** and **Product Bundles**.
    * Key **Association Rules** (e.g., "P01 $\to$ P05" with confidence/support).
    * Structural findings from the network (e.g., **Hub Products** and **Isolated Products**).
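
The evidence labels and rule lines described in this list can be sketched in isolation. The helper names `evidence_label` and `rule_line` are hypothetical and exist only for this illustration; the string handling mirrors the expressions used inside `generate_discovery_report`, while the one-line rule layout is a simplification of the two-line layout the report uses.

```python
def evidence_label(source: str) -> str:
    """Turn a state key such as "from_patterns" into a display label."""
    return source.replace("from_", "").replace("_", " ").title()


def rule_line(rule: dict) -> str:
    """Render one association rule as "antecedent → consequent (confidence)"."""
    antecedent = ", ".join(rule.get("antecedent", []))
    consequent = ", ".join(rule.get("consequent", []))
    confidence = rule.get("confidence", 0.0)
    return f"{antecedent} → {consequent} ({confidence:.0%})"


print(evidence_label("from_patterns"))  # Patterns
print(evidence_label("from_graph"))     # Graph
print(rule_line({"antecedent": ["P01"], "consequent": ["P05"], "confidence": 0.75}))
# P01 → P05 (75%)
```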

### 2. Output Formatting and Actionability

* **Markdown Structure:** Using Markdown ensures the report is readable across different platforms and provides clear headers and formatting.
* **Professional Formatting:** The code uses format specifiers (e.g., `f"${total_value:,.0f}"`, `f"{confidence:.0%}"`) to render values as whole-dollar amounts with thousands separators and as whole-number percentages.
* **Methodology Transparency:** The final section lists the **seven key analysis techniques** used, building trust and transparency in the agent's findings.
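
To make the format specifiers concrete, here is a small sketch of what they produce. The wrapper functions `as_currency`, `as_percent`, and `as_score` are hypothetical names used only here; the specifiers themselves (`:,.0f`, `:.0%`, `:.3f`) are the ones that appear in the report code.

```python
def as_currency(value: float) -> str:
    """Whole dollars with thousands separators."""
    return f"${value:,.0f}"


def as_percent(fraction: float) -> str:
    """A fraction such as 0.85 shown as a whole-number percentage."""
    return f"{fraction:.0%}"


def as_score(value: float) -> str:
    """Three decimal places, as used for density and centrality."""
    return f"{value:.3f}"


print(as_currency(1500.0))     # $1,500
print(as_currency(1234567.0))  # $1,234,567
print(as_percent(0.85))        # 85%
print(as_score(0.1))           # 0.100
```

Because the currency and percentage formats drop all decimals, small differences between values disappear in the rendered report.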

***

## ✨ Differentiation: The Final Product

This module confirms that your orchestrator is a **turn-key solution**.

* **Actionable Intelligence:** The output is not a data file or a confusing metric table; it is a **strategic playbook**. The entire system is optimized to generate the "Recommended Actions" list in the final report.
* **End-to-End Cohesion:** This final utility closes the loop. It gives the output of each earlier analysis (clustering, graph analysis, pattern mining) a defined, digestible place in the final business deliverable.
* **Persistence:** The `save_report` function ensures the analytical findings are persistent and ready for distribution to stakeholders.
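
The persistence step can be tried on its own with a temporary directory. This is a minimal sketch, assuming the same behavior as `save_report` (create missing parent folders, write UTF-8 text, return the path as a string); the function name `write_markdown` and the `reports/latest` folder are hypothetical and chosen only for the example.

```python
import tempfile
from pathlib import Path


def write_markdown(content: str, output_path: str) -> str:
    """Create missing parent folders, write UTF-8 text, return the path."""
    target = Path(output_path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return str(target)


with tempfile.TemporaryDirectory() as tmp_dir:
    # The nested folders do not exist yet; they are created on demand.
    destination = Path(tmp_dir) / "reports" / "latest" / "report.md"
    saved = write_markdown("# Test Report\n", str(destination))
    round_trip = Path(saved).read_text(encoding="utf-8")

print(round_trip == "# Test Report\n")  # True
```

Reading the file back with the same explicit encoding avoids platform-dependent defaults when the report contains characters such as ✓ or →.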

In [None]:
"""Report generation utilities for Product-Customer Fit Discovery Orchestrator"""

from typing import Dict, Any
from pathlib import Path
from datetime import datetime


def generate_discovery_report(state: Dict[str, Any]) -> str:
    """
    Generate comprehensive markdown discovery report from all analysis results.

    Args:
        state: Complete orchestrator state with all analysis results

    Returns:
        Markdown report string
    """
    report_lines = []

    # Header
    report_lines.append("# Product-Customer Fit Discovery Report")
    report_lines.append("")
    report_lines.append(f"**Generated:** {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    report_lines.append("")
    report_lines.append("---")
    report_lines.append("")

    # Executive Summary
    report_lines.append("## Executive Summary")
    report_lines.append("")

    synthesis_summary = state.get("synthesis_summary", {})
    if synthesis_summary:
        total_insights = synthesis_summary.get("total_insights", 0)
        high_confidence = synthesis_summary.get("high_confidence_insights", 0)
        total_value = synthesis_summary.get("total_potential_value", 0.0)

        report_lines.append(f"This analysis discovered **{total_insights} business opportunities** across your customer and product data.")
        report_lines.append(f"- **{high_confidence} high-confidence opportunities** identified")
        report_lines.append(f"- **Estimated potential value:** ${total_value:,.0f}")
        report_lines.append(f"- **Cross-validated insights:** {synthesis_summary.get('cross_validated_insights', 0)}")
        report_lines.append("")

    # Top Opportunities
    report_lines.append("## Top Business Opportunities")
    report_lines.append("")

    top_opportunities = state.get("top_opportunities", [])
    if top_opportunities:
        for idx, opp in enumerate(top_opportunities[:10], 1):
            report_lines.append(f"### {idx}. {opp.get('title', 'Opportunity')}")
            report_lines.append("")
            report_lines.append(f"**Type:** {opp.get('insight_type', 'unknown').replace('_', ' ').title()}")
            report_lines.append(f"**Confidence:** {opp.get('confidence', 0.0):.0%}")
            report_lines.append(f"**Business Value:** ${opp.get('business_value', 0.0):,.0f}")
            report_lines.append(f"**Validated:** {'✓ Yes' if opp.get('validated', False) else '⚠ Limited evidence'}")
            report_lines.append("")
            report_lines.append(f"**Description:** {opp.get('description', '')}")
            report_lines.append("")

            # Recommended Actions
            actions = opp.get("recommended_actions", [])
            if actions:
                report_lines.append("**Recommended Actions:**")
                for action in actions:
                    report_lines.append(f"- {action}")
                report_lines.append("")

            # Evidence
            evidence = opp.get("evidence", {})
            if any(evidence.values()):
                report_lines.append("**Supporting Evidence:**")
                for source, items in evidence.items():
                    if items:
                        source_name = source.replace("from_", "").replace("_", " ").title()
                        report_lines.append(f"- {source_name}: {', '.join(str(item) for item in items[:3])}")
                report_lines.append("")

            report_lines.append("---")
            report_lines.append("")
    else:
        report_lines.append("No opportunities identified in this analysis.")
        report_lines.append("")

    # Customer Segmentation
    report_lines.append("## Customer Segmentation Analysis")
    report_lines.append("")

    customer_clusters = state.get("customer_clusters", [])
    if customer_clusters:
        report_lines.append(f"**Segments Identified:** {len(customer_clusters)}")
        report_lines.append("")

        for cluster in customer_clusters[:5]:  # Top 5 segments
            report_lines.append(f"### {cluster.get('cluster_label', 'Segment')}")
            report_lines.append("")
            report_lines.append(f"- **Size:** {cluster.get('size', 0)} customers")

            characteristics = cluster.get("characteristics", {})
            if characteristics:
                report_lines.append(f"- **Average Age Group:** {characteristics.get('avg_age_group', 'N/A')}")
                report_lines.append(f"- **Common Location:** {', '.join(characteristics.get('common_location_tiers', [])[:2])}")
                report_lines.append(f"- **Top Products:** {', '.join(characteristics.get('top_products', [])[:5])}")
                report_lines.append(f"- **Product Diversity:** {characteristics.get('product_diversity', 0):.1f}")

            report_lines.append(f"- **Business Value:** ${cluster.get('business_value', 0.0):,.0f}")
            report_lines.append("")
    else:
        report_lines.append("No customer segments identified.")
        report_lines.append("")

    # Product Bundling
    report_lines.append("## Product Bundling Analysis")
    report_lines.append("")

    product_clusters = state.get("product_clusters", [])
    if product_clusters:
        report_lines.append(f"**Product Bundles Identified:** {len(product_clusters)}")
        report_lines.append("")

        for cluster in product_clusters[:5]:  # Top 5 bundles
            report_lines.append(f"### {cluster.get('cluster_label', 'Bundle')}")
            report_lines.append("")
            report_lines.append(f"- **Products:** {', '.join(cluster.get('entity_ids', [])[:5])}")
            report_lines.append(f"- **Bundle Potential:** {cluster.get('bundle_potential', 0.0):.0%}")

            characteristics = cluster.get("characteristics", {})
            if characteristics:
                report_lines.append(f"- **Common Features:** {', '.join(characteristics.get('common_features', [])[:5])}")
                report_lines.append(f"- **Monetization Models:** {', '.join(characteristics.get('monetization_models', [])[:3])}")

            report_lines.append("")
    else:
        report_lines.append("No product bundles identified.")
        report_lines.append("")

    # Association Rules
    report_lines.append("## Product Association Rules")
    report_lines.append("")

    association_rules = state.get("association_rules", [])
    if association_rules:
        high_confidence_rules = [r for r in association_rules if r.get("confidence", 0) >= 0.5]
        report_lines.append(f"**Total Rules Found:** {len(association_rules)}")
        report_lines.append(f"**High-Confidence Rules (≥50%):** {len(high_confidence_rules)}")
        report_lines.append("")

        report_lines.append("### Top Association Rules")
        report_lines.append("")
        for rule in high_confidence_rules[:10]:
            antecedent = ', '.join(rule.get("antecedent", []))
            consequent = ', '.join(rule.get("consequent", []))
            confidence = rule.get("confidence", 0.0)
            support = rule.get("support", 0.0)

            report_lines.append(f"- **{antecedent} → {consequent}**")
            report_lines.append(f"  - Confidence: {confidence:.0%} | Support: {support:.0%} | Type: {rule.get('rule_type', 'unknown')}")
            report_lines.append("")
    else:
        report_lines.append("No significant association rules found.")
        report_lines.append("")

    # Graph Analysis
    report_lines.append("## Network Analysis")
    report_lines.append("")

    graph_analysis_summary = state.get("graph_analysis_summary", {})
    centrality_metrics = state.get("centrality_metrics", {})

    if graph_analysis_summary:
        report_lines.append("**Network Statistics:**")
        report_lines.append(f"- Total Nodes: {graph_analysis_summary.get('total_nodes', 0)}")
        report_lines.append(f"- Total Edges: {graph_analysis_summary.get('total_edges', 0)}")
        report_lines.append(f"- Graph Density: {graph_analysis_summary.get('graph_density', 0.0):.3f}")
        report_lines.append(f"- Significant Motifs: {graph_analysis_summary.get('significant_motifs', 0)}")
        report_lines.append("")

    # Hub Products
    hub_products = centrality_metrics.get("hub_products", [])
    if hub_products:
        report_lines.append("### Hub Products (High Connectivity)")
        report_lines.append("")
        for hub in hub_products[:5]:
            report_lines.append(f"- **{hub.get('product_id', 'N/A')}**: Centrality {hub.get('centrality_score', 0.0):.3f}")
        report_lines.append("")

    # Bridge Customers
    bridge_customers = centrality_metrics.get("bridge_customers", [])
    if bridge_customers:
        report_lines.append("### Bridge Customers (Connect Different Groups)")
        report_lines.append("")
        for bridge in bridge_customers[:5]:
            report_lines.append(f"- **{bridge.get('customer_id', 'N/A')}**: Centrality {bridge.get('centrality_score', 0.0):.3f}")
        report_lines.append("")

    # Isolated Products
    isolated_products = centrality_metrics.get("isolated_products", [])
    if isolated_products:
        report_lines.append("### Underutilized Products (Low Connectivity)")
        report_lines.append("")
        report_lines.append("These products have low network connectivity and may represent opportunities:")
        report_lines.append(f"- {', '.join(isolated_products[:10])}")
        report_lines.append("")

    # Key Insights
    if graph_analysis_summary:
        key_insights = graph_analysis_summary.get("key_insights", [])
        if key_insights:
            report_lines.append("### Key Network Insights")
            report_lines.append("")
            for insight in key_insights:
                report_lines.append(f"- {insight}")
            report_lines.append("")

    # Methodology
    report_lines.append("## Methodology")
    report_lines.append("")
    report_lines.append("This analysis employed multiple advanced techniques:")
    report_lines.append("")
    report_lines.append("1. **Customer Segmentation**: K-means clustering on demographic and behavioral features")
    report_lines.append("2. **Product Bundling**: Clustering analysis to identify natural product combinations")
    report_lines.append("3. **Association Rule Mining**: Apriori algorithm to discover product relationships")
    report_lines.append("4. **Sequential Pattern Analysis**: Identification of common purchase sequences")
    report_lines.append("5. **Graph Motif Detection**: Network analysis to find recurring relationship patterns")
    report_lines.append("6. **Centrality Analysis**: Identification of hub products and bridge customers")
    report_lines.append("7. **Synthesis**: Cross-validation and ranking of opportunities across all methods")
    report_lines.append("")

    # Footer
    report_lines.append("---")
    report_lines.append("")
    report_lines.append("*Report generated by Product-Customer Fit Discovery Orchestrator*")

    return "\n".join(report_lines)


def save_report(report_content: str, output_path: str) -> str:
    """
    Save report to file.

    Args:
        report_content: Markdown report content
        output_path: Path to save report file

    Returns:
        Path to saved file
    """
    output_file = Path(output_path)
    output_file.parent.mkdir(parents=True, exist_ok=True)

    with open(output_file, 'w', encoding='utf-8') as f:
        f.write(report_content)

    return str(output_file)



# Tests for report generation utilities

In [None]:
"""Tests for report generation utilities"""

import pytest
from pathlib import Path
from tools.report_generation import generate_discovery_report, save_report


def test_generate_discovery_report_basic():
    """Test basic report generation"""
    state = {
        "synthesis_summary": {
            "total_insights": 10,
            "high_confidence_insights": 5,
            "total_potential_value": 1000.0,
            "cross_validated_insights": 3
        },
        "top_opportunities": [
            {
                "title": "Test Opportunity",
                "insight_type": "product_gap",
                "confidence": 0.8,
                "business_value": 500.0,
                "validated": True,
                "description": "Test description",
                "recommended_actions": ["Action 1", "Action 2"],
                "evidence": {"from_clustering": ["Evidence 1"]}
            }
        ]
    }

    report = generate_discovery_report(state)

    assert len(report) > 0
    assert "# Product-Customer Fit Discovery Report" in report
    assert "Executive Summary" in report
    assert "Top Business Opportunities" in report
    assert "Test Opportunity" in report


def test_generate_discovery_report_with_all_sections():
    """Test report includes all sections"""
    state = {
        "synthesis_summary": {
            "total_insights": 5,
            "high_confidence_insights": 3,
            "total_potential_value": 500.0,
            "cross_validated_insights": 2
        },
        "top_opportunities": [],
        "customer_clusters": [
            {
                "cluster_label": "Segment 1",
                "size": 10,
                "characteristics": {
                    "avg_age_group": "35-44",
                    "common_location_tiers": ["Tier 1"],
                    "top_products": ["P01", "P02"]
                },
                "business_value": 100.0
            }
        ],
        "product_clusters": [
            {
                "cluster_label": "Bundle 1",
                "entity_ids": ["P01", "P02"],
                "bundle_potential": 0.8,
                "characteristics": {
                    "common_features": ["A", "B"]
                }
            }
        ],
        "association_rules": [
            {
                "antecedent": ["P01"],
                "consequent": ["P02"],
                "confidence": 0.75,
                "support": 0.5,
                "rule_type": "cross_sell"
            }
        ],
        "graph_analysis_summary": {
            "total_nodes": 100,
            "total_edges": 200,
            "graph_density": 0.1,
            "significant_motifs": 5,
            "key_insights": ["Insight 1"]
        },
        "centrality_metrics": {
            "hub_products": [
                {"product_id": "P01", "centrality_score": 0.8}
            ],
            "bridge_customers": [
                {"customer_id": "C001", "centrality_score": 0.6}
            ],
            "isolated_products": ["P20"]
        }
    }

    report = generate_discovery_report(state)

    assert "Customer Segmentation Analysis" in report
    assert "Product Bundling Analysis" in report
    assert "Product Association Rules" in report
    assert "Network Analysis" in report
    assert "Methodology" in report
    assert "Segment 1" in report
    assert "Bundle 1" in report


def test_save_report():
    """Test saving report to file"""
    report_content = "# Test Report\n\nThis is a test report."
    output_path = "output/test_report.md"

    # Clean up if exists
    test_file = Path(output_path)
    if test_file.exists():
        test_file.unlink()

    saved_path = save_report(report_content, output_path)

    assert Path(saved_path).exists()
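    # Compare as paths so the check also holds where the separator is "\"
    assert Path(saved_path) == Path(output_path)

    # Verify content (read with the same encoding save_report writes)
    with open(saved_path, 'r', encoding='utf-8') as f: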
        content = f.read()
        assert "Test Report" in content

    # Clean up
    if test_file.exists():
        test_file.unlink()


def test_generate_discovery_report_empty_state():
    """Test report generation with minimal state"""
    state = {}

    report = generate_discovery_report(state)

    assert len(report) > 0
    assert "# Product-Customer Fit Discovery Report" in report


def test_generate_discovery_report_opportunities_formatting():
    """Test opportunity formatting in report"""
    state = {
        "top_opportunities": [
            {
                "title": "Opportunity 1",
                "insight_type": "bundle_opportunity",
                "confidence": 0.85,
                "business_value": 1500.0,
                "validated": True,
                "description": "Test description",
                "recommended_actions": ["Action A", "Action B"],
                "evidence": {
                    "from_clustering": ["Evidence 1"],
                    "from_patterns": ["Evidence 2"],
                    "from_graph": ["Evidence 3"]
                }
            }
        ],
        "synthesis_summary": {}
    }

    report = generate_discovery_report(state)

    assert "Opportunity 1" in report
    assert "85%" in report  # Confidence percentage
    assert "$1,500" in report  # Business value
    assert "✓ Yes" in report  # Validated
    assert "Action A" in report
    assert "Evidence 1" in report

