# 11: Autonomous Integration Demo üé¨

This notebook executes the **Full Autonomous Pipeline** of the SalesOps Agent Suite.
It serves as the "Cinematic Demo" for the project, visualizing every step from raw data to enterprise action.

### üîÑ The Pipeline Flow
1.  **Ingest:** Load & clean `superstore.csv`.
2.  **Detect:** Identify statistical outliers (Z-Score/IQR).
3.  **Explain:** Use **Gemini 2.0** (via RAG) to analyze root causes.
4.  **Act:** Trigger downstream workflows (Mock Jira/Email).
5.  **Learn:** Update Long-Term Memory with resolution context.

## 1: Setup & Imports

In [1]:
import sys
import os
import json
import time
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from pathlib import Path
from IPython.display import display, Markdown, HTML
from dotenv import load_dotenv

# Move working directory from notebooks ‚Üí project root
os.chdir("..")

# Add project root
project_root = os.path.abspath(os.path.join(os.path.dirname("__file__"), ".."))
if project_root not in sys.path:
    sys.path.append(project_root)

os.environ["OBSERVABILITY_DIR"] = os.path.join(project_root, "outputs", "observability")

load_dotenv()

# Import Core Systems
from agents.a2a_coordinator import A2ACoordinator
from agents.memory_agent import MemoryAgent
from observability.collector import LogCollector

# Define Demo Output Directory
DEMO_DIR = os.path.join(project_root, "outputs", "demo_run")
print(f"‚úÖ Environment Ready. Output target: {DEMO_DIR}")

  from google.cloud.aiplatform.utils import gcs_utils
2025-11-29 19:27:13,570 - google_adk.google.adk.models.registry - INFO - Updating LLM class for gemini-.* from <class 'google.adk.models.google_llm.Gemini'> to <class 'google.adk.models.google_llm.Gemini'>
2025-11-29 19:27:13,570 - google_adk.google.adk.models.registry - INFO - Updating LLM class for projects\/.+\/locations\/.+\/endpoints\/.+ from <class 'google.adk.models.google_llm.Gemini'> to <class 'google.adk.models.google_llm.Gemini'>
2025-11-29 19:27:13,570 - google_adk.google.adk.models.registry - INFO - Updating LLM class for projects\/.+\/locations\/.+\/publishers\/google\/models\/gemini.+ from <class 'google.adk.models.google_llm.Gemini'> to <class 'google.adk.models.google_llm.Gemini'>
2025-11-29 19:27:13,570 - google_adk.google.adk.models.registry - INFO - Updating LLM class for gemini-.* from <class 'google.adk.models.google_llm.Gemini'> to <class 'google.adk.models.google_llm.Gemini'>
2025-11-29 19:27:13,570 - google_

‚úÖ Environment Ready. Output target: d:\01. Github\outputs\demo_run


## 2: Configuration

In [2]:
# 1. Pipeline Configuration
# We enable actions because we are using the Safe Mock Server
flow_config = {
    "id": "demo_integration_run",
    "parallelism": 3,
    "confirm_actions": True,
    "max_anomalies": 5,
}

inputs = {"csv_path": "data/raw/superstore.csv"}

session_id = "session:video-demo-001"

print("‚öôÔ∏è Configuration Loaded:")
print(json.dumps(flow_config, indent=2))

‚öôÔ∏è Configuration Loaded:
{
  "id": "demo_integration_run",
  "parallelism": 3,
  "confirm_actions": true,
  "max_anomalies": 5
}


## 3: Run Infrastructure

In [3]:
# 2. Robust Mock Server Startup (Port 7777)
import requests
import time
from subprocess import Popen

SERVER_URL = "http://localhost:7777"
LOG_FILE_PATH = "outputs/mock_server_demo.log"

print("üîç Checking Enterprise Mock Server...")


def is_server_healthy():
    try:
        resp = requests.get(f"{SERVER_URL}/health", timeout=1)
        return resp.status_code == 200
    except:
        return False


if is_server_healthy():
    try:
        requests.post(f"{SERVER_URL}/admin/chaos", json={"enabled": False})
    except:
        pass
    print("‚úÖ Server is ALREADY RUNNING.")

else:
    print("üöÄ Starting Mock Server...")

    # Find the directory that actually contains 'tools'
    current_dir = os.getcwd()
    if os.path.exists(os.path.join(current_dir, "tools")):
        project_root = current_dir
    elif os.path.exists(os.path.join(current_dir, "..", "tools")):
        project_root = os.path.abspath(os.path.join(current_dir, ".."))
    else:
        # Fallback: try to find it relative to this notebook file if accessible
        project_root = os.path.abspath(os.path.join(os.getcwd(), ".."))

    print(f"   (Context: Running from {project_root})")

    log_file = open(LOG_FILE_PATH, "w")

    process = Popen(
        [sys.executable, "-m", "uvicorn", "tools.mock_server:app", "--port", "7777"],
        stdout=log_file,
        stderr=log_file,
        cwd=project_root,  # <--- Explicitly set CWD to project root
    )

    # Poll for readiness
    for i in range(10):
        time.sleep(1)
        if is_server_healthy():
            print(f"‚úÖ Server Started Successfully (Attempt {i+1})")
            break
    else:
        print("‚ùå Server FAILED to start.")
        print("--- Tail of Server Log ---")
        log_file.flush()
        with open(LOG_FILE_PATH, "r") as f:
            print(f.read()[-500:])
        raise RuntimeError("Could not start Mock Server")

üîç Checking Enterprise Mock Server...
üöÄ Starting Mock Server...
   (Context: Running from d:\01. Github\salesops-suite)
‚úÖ Server Started Successfully (Attempt 1)


## 4: Execute Coordinator

In [4]:
# 3. Trigger the Autonomous Agent
print("‚ñ∂Ô∏è STARTING PIPELINE EXECUTION...")
start_time = time.time()

# Initialize with specific output dir for clean demo artifacts
coordinator = A2ACoordinator(output_dir="outputs", dry_run=False)

# RUN!
manifest = coordinator.run(flow_config, inputs, session_id)

duration = time.time() - start_time
print(f"\n‚ú® Run Completed in {duration:.2f}s")
print(f"üÜî Run ID: {manifest['run_id']}")
print(f"üìÇ Artifacts: {list(manifest['artifacts'].keys())}")

2025-11-29 19:27:19,141 - A2ACoordinator - INFO - Starting Run run_20251129T135719Z_cfca10
2025-11-29 19:27:19,141 - A2ACoordinator - INFO - Starting Run run_20251129T135719Z_cfca10
2025-11-29 19:27:19,152 - agents.data_ingestor - INFO - Attempting read with encoding='utf-8'...
2025-11-29 19:27:19,188 - agents.data_ingestor - INFO - Attempting read with encoding='latin1'...
2025-11-29 19:27:19,221 - agents.data_ingestor - INFO - Success! Read 9994 rows.
2025-11-29 19:27:19,247 - agents.data_ingestor - INFO - Schema validation passed.


‚ñ∂Ô∏è STARTING PIPELINE EXECUTION...


2025-11-29 19:27:19,351 - agents.data_ingestor - INFO - Snapshot saved successfully to D:\01. Github\salesops-suite\outputs\runs\run_20251129T135719Z_cfca10\snapshot.parquet
2025-11-29 19:27:19,530 - agents.anomaly_stats_agent - INFO - Running Global Z-Score Detector on Sales (w=30, t=3.0)
2025-11-29 19:27:19,543 - agents.anomaly_stats_agent - INFO - Running Grouped IQR Detector on Region (w=14, k=1.5)
2025-11-29 19:27:19,592 - agents.anomaly_stats_agent - INFO - Saved 501 anomalies to D:\01. Github\salesops-suite\outputs\runs\run_20251129T135719Z_cfca10\anomalies.json
2025-11-29 19:27:20,748 - google_genai.models - INFO - AFC is enabled with max remote calls: 10.
2025-11-29 19:27:20,748 - google_genai.models - INFO - AFC is enabled with max remote calls: 10.
2025-11-29 19:27:20,750 - google_genai.models - INFO - AFC is enabled with max remote calls: 10.
2025-11-29 19:27:22,818 - google_genai.models - INFO - AFC is enabled with max remote calls: 10.
2025-11-29 19:27:22,875 - httpx - IN


‚ú® Run Completed in 29.61s
üÜî Run ID: run_20251129T135719Z_cfca10
üìÇ Artifacts: ['snapshot', 'anomalies', 'explanations', 'actions_log']


## 5: Visualizing Ingestion

In [5]:
# 4. Stage 1: Ingestion Preview
print("\n--- üì• Stage 1: Data Ingestion ---")
snap_path = manifest["artifacts"].get("snapshot")

if snap_path and os.path.exists(snap_path):
    df_clean = pd.read_parquet(snap_path)

    # KPIs
    kpis = {
        "Total Rows": len(df_clean),
        "Date Range": f"{df_clean['Order Date'].min().date()} to {df_clean['Order Date'].max().date()}",
        "Total Sales": f"${df_clean['Sales'].sum():,.0f}",
    }

    # Display KPIs
    cols = (
        st.columns(3) if "st" in locals() else [None] * 3
    )  # Placeholder if we port to streamlit later
    print(json.dumps(kpis, indent=2))

    # Plot Sales Trend
    daily = df_clean.groupby("Order Date")["Sales"].sum().reset_index()
    fig = px.line(
        daily, x="Order Date", y="Sales", title="Ingested Sales Trend (Daily)"
    )
    fig.update_layout(height=300)
    fig.show()
else:
    print("‚ùå Snapshot artifact missing.")


--- üì• Stage 1: Data Ingestion ---
{
  "Total Rows": 9994,
  "Date Range": "2014-01-03 to 2017-12-30",
  "Total Sales": "$2,297,201"
}


## 6: Visualizing Detection

In [6]:
# 5. Stage 2: Anomaly Detection
print("\n--- üö® Stage 2: Statistical Detection ---")
anom_path = manifest["artifacts"].get("anomalies")

if anom_path and os.path.exists(anom_path):
    with open(anom_path, "r") as f:
        anoms = json.load(f)

    all_anoms = anoms.get("all_anomalies", [])
    df_anoms = pd.DataFrame(all_anoms)

    if not df_anoms.empty:
        # Score Distribution
        fig = px.scatter(
            df_anoms,
            x="period_start",
            y="score",
            color="detector",
            size="score",
            hover_data=["entity_id", "value", "expected"],
            title="Detected Anomalies by Severity",
        )
        fig.show()

        print("Top 3 Anomalies:")
        display(df_anoms[["entity_id", "metric", "score", "reason"]].head(3))
    else:
        print("No anomalies found.")


--- üö® Stage 2: Statistical Detection ---


Top 3 Anomalies:


Unnamed: 0,entity_id,metric,score,reason
0,South,Sales,24.21,Outside Tukey Fence (Score=24.21)
1,South,Sales,22.22,Outside Tukey Fence (Score=22.22)
2,Central,Sales,21.06,Outside Tukey Fence (Score=21.06)


## 7: Visualizing Explanation

In [7]:
# 6. Stage 3: AI Explanation (Gemini)
print("\n--- üß† Stage 3: AI Analysis (RAG + Gemini) ---")
enriched_path = manifest["artifacts"].get("explanations")

if enriched_path and os.path.exists(enriched_path):
    with open(enriched_path, "r") as f:
        enriched = json.load(f)

    for i, rec in enumerate(enriched[:2]):  # Show top 2
        md = f"""
#### ü§ñ Analysis #{i+1}: {rec.get('entity_id')}
**Confidence:** {rec.get('confidence')} | **RAG Used:** {'Yes' if 'History' in str(rec) else 'Implicit'}

> **{rec.get('explanation_short')}**

*"{rec.get('explanation_full')}"*

**Suggested Actions:** {rec.get('suggested_actions')}
"""
        display(Markdown(md))


--- üß† Stage 3: AI Analysis (RAG + Gemini) ---



#### ü§ñ Analysis #1: Central
**Confidence:** High | **RAG Used:** Implicit

> **Central region sales are significantly higher than expected, driven by a substantial positive deviation from the expected value.**

*"The Central region experienced sales of 18,336.74, which is 21.06 times higher than the expected 914.88. This value falls far above the third quartile (Q3) of 914.88, indicating a significant outlier."*

**Suggested Actions:** ['Investigate source of high sales volume in Central region', 'Review recent large orders in Central region']



#### ü§ñ Analysis #2: South
**Confidence:** High | **RAG Used:** Implicit

> **South region sales significantly exceeded expectations, driven by a large, atypical order.**

*"The sales value of 23,661.23 is drastically higher than the expected value of 1,159.38. This is supported by the deviation score of 24.21. While historical context shows anomalies can occur due to bulk orders (e.g., Acme Corp. in Technology), there is no specific historical event provided for the South region sales that directly explains this spike."*

**Suggested Actions:** ['Investigate the source of the large sales value in the South region.', 'Verify if this represents a genuine large order or a data error.']


## 8: Visualizing Action

In [8]:
# 7. Stage 4: Enterprise Action
print("\n--- ‚ö° Stage 4: Execution Audit ---")
action_log = manifest["artifacts"].get("actions_log")

if action_log and os.path.exists(action_log):
    actions = []
    with open(action_log, "r") as f:
        for line in f:
            actions.append(json.loads(line))

    # Filter for actions from THIS run
    # (In a real app we'd filter by run_id, here we take tail)
    recent_actions = actions[-len(enriched) :] if enriched else []

    if recent_actions:
        df_actions = pd.DataFrame(recent_actions)

        # Safe extraction of nested result
        df_actions["status"] = df_actions["result"].apply(lambda x: x.get("status"))
        df_actions["http_code"] = df_actions["result"].apply(
            lambda x: x.get("http_code")
        )

        display(df_actions[["action_id", "type", "status", "http_code"]])
    else:
        print("No actions executed in this batch.")


--- ‚ö° Stage 4: Execution Audit ---


Unnamed: 0,action_id,type,status,http_code
0,1379193a-eed9-4b16-98b5-d61305bdde40,create_ticket,success,201
1,4adda8f1-7af7-4bd9-a15b-a88dafbaae07,create_ticket,success,201
2,4fdc6a5a-3f0e-41bc-8ecb-e9893238ada6,create_ticket,success,201
3,2935518e-b0f5-4269-a592-c6af6f85c3d0,create_ticket,success,201
4,146d6822-525f-49e0-acf2-26fbf7f97e1f,create_ticket,success,201


## 9: Memory Demo

In [9]:
# 8. Memory Learning (Simulated)
print("\n--- üíæ Memory Bank Update ---")
# Manually trigger a retrieval to show what the agent "knows" now
mem_agent = MemoryAgent()
query = "Technology sales drop"
results = mem_agent.bank.query(query, top_k=2)

print(f"Querying Memory for: '{query}'")
for res in results:
    print(f"- [Score: {res['_score']:.2f}] {res['text'][:100]}...")


--- üíæ Memory Bank Update ---
Querying Memory for: 'Technology sales drop'
- [Score: 0.82] Historical: Technology sales dipped in 2014 due to supply chain....
- [Score: 0.52] Anomaly in Technology (Sales). Severity: 5.5. Explanation: Spike caused by bulk laptop order from Ac...


## 10: Export Assets

In [10]:
# 9. Export Data for Streamlit Dashboard
# We copy the run artifacts to a known 'latest' directory for the App to pick up easily
DASHBOARD_DATA = Path("dashboard_data")
DASHBOARD_DATA.mkdir(exist_ok=True)

import shutil

# Copy files
shutil.copy(snap_path, DASHBOARD_DATA / "snapshot.parquet")
shutil.copy(anom_path, DASHBOARD_DATA / "anomalies.json")
shutil.copy(enriched_path, DASHBOARD_DATA / "enriched.json")
shutil.copy(action_log, DASHBOARD_DATA / "actions.jsonl")
shutil.copy(coordinator.master_manifest_path, DASHBOARD_DATA / "manifest.jsonl")

print(f"‚úÖ Demo Assets Exported to {DASHBOARD_DATA}")

‚úÖ Demo Assets Exported to dashboard_data


## üé¨ Demo Complete

We have successfully executed the autonomous pipeline and exported the artifacts.

**Next Step:**
Open your terminal and run the Streamlit App to see the interactive dashboard:
```bash
streamlit run app.py