# DevFlow Analyzer Demo

This notebook demonstrates the core capabilities of DevFlow Analyzer:
1. Process mining analysis of CI/CD data
2. DFG visualization
3. Agent-based investigation
4. Report generation
5. Experiment tracking with MLflow

## Cell 1: Setup and Imports

In [None]:
# Cell 1: Setup
import sys
sys.path.insert(0, '..')  # Add parent directory to path

from pathlib import Path
from src.process_analyzer import ProcessAnalyzer
from src.agent import DevFlowAgent
from src.llm_reporter import LLMReporter
from src.llm_provider import get_available_models
from src.evaluation import ExperimentTracker, Timer, compute_cost

print("Available models:", get_available_models())

## Cell 2: Load and Analyze CI/CD Data

In [None]:
# Cell 2: Process Analysis
analyzer = ProcessAnalyzer()
analyzer.load_data(Path('../data/sample/travistorrent_10k.csv'))

result = analyzer.analyze()

print(f"Builds analyzed: {result.n_builds:,}")
print(f"Projects: {result.n_projects}")
print(f"Date range: {result.date_range_start} to {result.date_range_end}")
print(f"\nSuccess rate: {result.overall_success_rate:.1%}")
print(f"Failure rate: {result.overall_failure_rate:.1%}")
print(f"Error rate: {result.overall_error_rate:.1%}")
print(f"\nMedian duration: {result.median_duration_seconds:.0f}s")
print(f"P90 duration: {result.p90_duration_seconds:.0f}s")
print(f"\nBottlenecks detected: {len(result.bottlenecks) if result.bottlenecks else 0}")
print(f"Projects at risk: {len(result.projects_at_risk) if result.projects_at_risk else 0}")

## Cell 3: Generate DFG Visualization
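
A Directly-Follows Graph (DFG) counts how often one activity is immediately followed by another within the same case. The sketch below computes those counts in plain Python for a few made-up build traces. The activity names and the `directly_follows` helper are illustrative assumptions, not part of `ProcessAnalyzer`, which may build its graph differently.

```python
from collections import Counter


def directly_follows(traces):
    """Count (a, b) pairs where activity b immediately follows a in a trace."""
    edges = Counter()
    for trace in traces:
        # zip pairs each activity with its successor in the same trace
        for a, b in zip(trace, trace[1:]):
            edges[(a, b)] += 1
    return edges


# Hypothetical traces: one ordered list of activities per build
traces = [
    ["queued", "build", "test", "passed"],
    ["queued", "build", "test", "failed"],
    ["queued", "build", "errored"],
]

dfg = directly_follows(traces)
for (source, target), count in sorted(dfg.items()):
    print(f"{source} -> {target}: {count}")
```

Each key is an edge of the graph and each count is the edge weight that a DFG rendering would typically display.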

In [None]:
# Cell 3: DFG Visualization
from IPython.display import Image, display

dfg_path = Path('../outputs/figures/dfg_demo.png')
dfg_path.parent.mkdir(parents=True, exist_ok=True)

analyzer.generate_dfg(dfg_path)
print(f"DFG saved to: {dfg_path}")

# Display the image
display(Image(filename=str(dfg_path)))

## Cell 4: Agent Investigation

The DevFlow Agent uses the ReAct pattern with tools to autonomously investigate CI/CD issues.
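
In a ReAct loop, the model alternates between choosing an action (a tool call) and reading the resulting observation until it can answer. The sketch below shows that control flow with a scripted stand-in for the LLM. `fake_llm`, `TOOLS`, and the failure-rate figures are hypothetical and do not reflect how `DevFlowAgent` is implemented.

```python
# Hypothetical tool registry: tool name -> callable returning an observation string
TOOLS = {
    "failure_rates": lambda _arg: "proj-a: 42%, proj-b: 7%",
}


def fake_llm(history):
    """Scripted stand-in for an LLM: call a tool first, then answer."""
    if not any(step.startswith("Observation:") for step in history):
        return "Action: failure_rates[]"
    return "Final Answer: proj-a has the highest failure rate."


def react(question, llm, tools, max_steps=5):
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        step = llm(history)
        history.append(step)
        # Everything after the first colon is the payload of the step
        payload = step.split(":", 1)[1].strip()
        if step.startswith("Final Answer:"):
            return payload, history
        # Parse "tool_name[argument]" and run the named tool
        name, _, arg = payload.partition("[")
        observation = tools[name](arg.rstrip("]"))
        history.append(f"Observation: {observation}")
    return None, history  # step budget exhausted without an answer


answer, trace = react("Which project fails most?", fake_llm, TOOLS)
print(answer)
```

The `max_steps` cap is a design choice in this sketch: it bounds cost and latency if the model never produces a final answer.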

In [None]:
# Cell 4: Agent Investigation
agent = DevFlowAgent(model_key='gpt-4o-mini', temperature=0.3)

question = "Which projects have the highest failure rates and what might be causing the issues?"

print(f"Question: {question}\n")
print("Agent investigating...\n")

with Timer() as timer:
    response = agent.investigate(result, question)

print(response)
print(f"\n---\nLatency: {timer.elapsed_ms:.0f}ms")

## Cell 5: Full Report Generation

Generate a structured report with all sections.

In [None]:
# Cell 5: Full Report
from IPython.display import Markdown

reporter = LLMReporter(model_key='gpt-4o-mini', temperature=0.7)

print("Generating full report...\n")

with Timer() as timer:
    report = reporter.generate_report(result)

print(f"Report generated in {timer.elapsed_ms:.0f}ms\n")

# Display as formatted markdown
display(Markdown(report.to_markdown()))

## Cell 6: Experiment Tracking with MLflow

Track experiments for reproducibility and comparison.
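
The cell below approximates token counts as character count divided by four, a common rule of thumb for English text, and passes them to `compute_cost`. A minimal sketch of that kind of estimate follows. The model name, the per-million-token prices, and the `estimate_cost` helper are assumptions for illustration, and the project's `compute_cost` may use different rates and logic.

```python
def estimate_tokens(text: str) -> int:
    """Rough token count: about four characters per token for English text."""
    return len(text) // 4


# Hypothetical USD prices per one million tokens; check your provider's price list
PRICE_PER_MILLION = {"example-model": {"input": 0.15, "output": 0.60}}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Price input and output tokens separately, then sum."""
    price = PRICE_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


prompt = "x" * 4000      # stands in for the analysis context sent to the model
completion = "y" * 800   # stands in for the agent's response
tokens_in = estimate_tokens(prompt)
tokens_out = estimate_tokens(completion)
example_cost = estimate_cost("example-model", tokens_in, tokens_out)
print(f"{tokens_in} input + {tokens_out} output tokens, about ${example_cost:.6f}")
```

Because the character heuristic ignores the model's actual tokenizer, logged token and cost metrics are best read as approximations when comparing runs.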

In [None]:
# Cell 6: Experiment Tracking
tracker = ExperimentTracker('demo-notebook')

with tracker.start_run('demo-analysis', tags={'notebook': 'demo'}):
    # Log parameters
    tracker.log_params({
        'model_key': 'gpt-4o-mini',
        'n_builds': result.n_builds,
        'n_projects': result.n_projects,
    })
    
    # Run agent analysis
    agent = DevFlowAgent(model_key='gpt-4o-mini')
    
    with Timer() as timer:
        response = agent.investigate(result, 'Summarize the CI/CD health in 3 bullet points.')
    
    # Rough estimate: ~4 characters per token (not an exact tokenizer count)
    input_tokens = len(result.to_llm_context()) // 4
    output_tokens = len(response) // 4
    cost = compute_cost('gpt-4o-mini', input_tokens, output_tokens)
    
    # Log metrics
    tracker.log_metrics({
        'latency_ms': timer.elapsed_ms,
        'input_tokens': input_tokens,
        'output_tokens': output_tokens,
        'cost_usd': cost,
    })
    
    # Log artifact
    tracker.log_artifact(response, 'agent_response.md')

print("Experiment logged to MLflow!")
print(f"\nAgent response:\n{response}")
print(f"\nMetrics: {timer.elapsed_ms:.0f}ms, ${cost:.6f}")

## Summary

This demo showed:
1. **Process Mining**: Loading and analyzing CI/CD build data with PM4Py
2. **Visualization**: Generating Directly-Follows Graphs
3. **Agent Analysis**: ReAct-style agent investigating issues autonomously
4. **Report Generation**: Structured LLM-powered reports
5. **Experiment Tracking**: MLflow integration for reproducibility

To run the full Streamlit application:
```bash
streamlit run app.py
```

To view MLflow experiments:
```bash
mlflow ui --port 5000
```