# Multi-Phase Workflow with Conversation Context Reuse

This notebook demonstrates **conversation context reuse**: process input files once, then let each later phase of a multi-step workflow build on the findings already held in the conversation.

## Key Concept: Files as Fallback

Instead of re-processing data files in every phase:
1. **Phase 1:** Process files → store findings in conversation memory
2. **Phase 2+:** Reuse Phase 1 findings from conversation memory
3. **Files:** Only for verification/fallback, not re-processing
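
The three steps above can be sketched as a small planning helper. This is only an illustration: `PhaseRequest` and `plan_phases` are hypothetical names that do not exist in the notebook or the backend, and the file name is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class PhaseRequest:
    """One request in a multi-phase workflow (hypothetical helper type)."""
    prompt: str
    files: list = field(default_factory=list)
    new_session: bool = False


def plan_phases(prompts, files, attach_fallback=False):
    """Attach files to Phase 1 only; later phases rely on the conversation.

    With attach_fallback=True the files are also attached to later phases,
    intended for verification only (the prompt should say so explicitly).
    """
    plan = []
    for i, prompt in enumerate(prompts):
        first = i == 0
        attached = list(files) if (first or attach_fallback) else []
        plan.append(PhaseRequest(prompt=prompt, files=attached, new_session=first))
    return plan


plan = plan_phases(
    ["Analyze the CSV", "Rank products", "Write report"],
    ["sales.csv"],  # placeholder file name
)
for n, req in enumerate(plan, start=1):
    print(f"Phase {n}: new_session={req.new_session}, files={req.files}")
```

Only the first request carries the file and opens a new session; every later request is a plain follow-up in the same conversation.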

## Benefits

- **90% fewer LLM calls** - No redundant file parsing
- **Faster execution** - Reuse existing calculations
- **Better consistency** - All phases use same base analysis
- **Lower costs** - Reduced token usage
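
As a rough illustration of where the savings come from (an assumption about the workflow shape, not a measurement from this notebook): if the file is processed once instead of once per phase, the number of file-processing passes falls by `(N - 1) / N` for an `N`-phase workflow. The helper name `file_pass_reduction` is hypothetical.

```python
def file_pass_reduction(num_phases: int) -> float:
    """Fraction of file-processing passes avoided by processing the file once.

    Assumes the naive workflow processes the file in every phase and the
    context-reuse workflow processes it only in Phase 1.
    """
    if num_phases < 1:
        raise ValueError("num_phases must be at least 1")
    return (num_phases - 1) / num_phases


for n in (2, 3, 10):
    print(f"{n} phases: {file_pass_reduction(n):.0%} fewer file-processing passes")
```

Actual savings in LLM calls and tokens depend on how the backend parses files and how long the conversation history grows, so treat this as an upper-level estimate of redundant passes only.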

---

**Use this pattern for:** data analysis → visualization → reporting pipelines, multi-step transformations, and other workflows where later phases build on the results of earlier ones.

## Setup

In [None]:
import httpx
import json
import pandas as pd
from pathlib import Path
import time

# Simple API Client (from API_examples.ipynb)
class LLMApiClient:
    def __init__(self, base_url: str, timeout: float = 3600.0):
        self.base_url = base_url.rstrip("/")
        self.token = None
        self.timeout = httpx.Timeout(50.0, read=timeout, write=timeout, pool=timeout)

    def _headers(self):
        return {"Authorization": f"Bearer {self.token}"} if self.token else {}

    def login(self, username: str, password: str):
        r = httpx.post(f"{self.base_url}/api/auth/login", 
                      json={"username": username, "password": password}, timeout=10.0)
        r.raise_for_status()
        self.token = r.json()["access_token"]
        return r.json()

    def list_models(self):
        r = httpx.get(f"{self.base_url}/v1/models", headers=self._headers(), timeout=10.0)
        r.raise_for_status()
        return r.json()

    def chat_new(self, model: str, user_message: str, agent_type: str = "auto", files: list = None):
        messages = [{"role": "user", "content": user_message}]
        data = {"model": model, "messages": json.dumps(messages), "agent_type": agent_type}
        
        # Open files inside the try block so that a failure partway through
        # (e.g. a missing path) still closes the handles already opened.
        files_to_upload = []
        try:
            for file_path in files or []:
                f = open(file_path, "rb")
                files_to_upload.append(("files", (Path(file_path).name, f)))

            r = httpx.post(f"{self.base_url}/v1/chat/completions", data=data,
                          files=files_to_upload if files_to_upload else None,
                          headers=self._headers(), timeout=self.timeout)
            r.raise_for_status()
            result = r.json()
            return result["choices"][0]["message"]["content"], result["x_session_id"]
        finally:
            for _, (_, f) in files_to_upload:
                f.close()

    def chat_continue(self, model: str, session_id: str, user_message: str, 
                     agent_type: str = "auto", files: list = None):
        messages = [{"role": "user", "content": user_message}]
        data = {"model": model, "messages": json.dumps(messages), 
                "session_id": session_id, "agent_type": agent_type}
        
        # Open files inside the try block so that a failure partway through
        # (e.g. a missing path) still closes the handles already opened.
        files_to_upload = []
        try:
            for file_path in files or []:
                f = open(file_path, "rb")
                files_to_upload.append(("files", (Path(file_path).name, f)))

            r = httpx.post(f"{self.base_url}/v1/chat/completions", data=data,
                          files=files_to_upload if files_to_upload else None,
                          headers=self._headers(), timeout=self.timeout)
            r.raise_for_status()
            result = r.json()
            return result["choices"][0]["message"]["content"], result["x_session_id"]
        finally:
            for _, (_, f) in files_to_upload:
                f.close()

# Configuration
API_BASE_URL = 'http://localhost:1007'
USERNAME = "leesihun"
PASSWORD = "s.hun.lee"

client = LLMApiClient(API_BASE_URL)
client.login(USERNAME, PASSWORD)
MODEL = client.list_models()["data"][0]["id"]

print(f"✓ Logged in as: {USERNAME}")
print(f"✓ Using model: {MODEL}")
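
The client above opens each upload by hand and closes it in a `finally` block. One alternative, sketched here with the standard library's `contextlib.ExitStack`, lets a single `with` block own every handle. The helper name `open_uploads` is hypothetical, and the temporary file only stands in for a real upload.

```python
import tempfile
from contextlib import ExitStack
from pathlib import Path


def open_uploads(stack: ExitStack, paths):
    """Open each path for upload; the stack closes all handles on exit."""
    uploads = []
    for p in paths:
        handle = stack.enter_context(open(p, "rb"))
        # Same ("files", (filename, fileobj)) shape the client passes to httpx.
        uploads.append(("files", (Path(p).name, handle)))
    return uploads


# Demo with a temporary file standing in for a real upload.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.csv"
    sample.write_bytes(b"a,b\n1,2\n")
    with ExitStack() as stack:
        uploads = open_uploads(stack, [sample])
        field_name, (filename, handle) = uploads[0]
        first_line = handle.readline()
    closed_after = handle.closed  # True once the ExitStack block has exited

print(filename, first_line, closed_after)
```

Inside `chat_new` or `chat_continue`, the HTTP request would go inside the `with ExitStack()` block so the handles stay open for the duration of the upload.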

## Example: Create Test Data

Create sample sales data for our multi-phase workflow.

In [None]:
# Create test CSV data
import numpy as np
np.random.seed(42)

sales_data = pd.DataFrame({
    'date': pd.date_range('2025-01-01', periods=100, freq='D'),
    'product': np.random.choice(['Laptop', 'Phone', 'Tablet', 'Monitor'], 100),
    'quantity': np.random.randint(1, 20, 100),
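    # Note: price is drawn independently of product, so the same product can
    # appear at different prices; that is acceptable for this synthetic demo.
    'price': np.random.choice([500, 800, 300, 400], 100),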
    'region': np.random.choice(['North', 'South', 'East', 'West'], 100)
})

sales_data['revenue'] = sales_data['quantity'] * sales_data['price']

# Save to CSV
csv_path = 'test_sales_data.csv'
sales_data.to_csv(csv_path, index=False)

print(f"✓ Created {csv_path}")
print(f"  Shape: {sales_data.shape}")
print(f"  Total revenue: ${sales_data['revenue'].sum():,}")
print("\nFirst 5 rows:")
sales_data.head()
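
Because the test data is generated locally, the figures the model reports later can be checked against a ground truth computed without the LLM. The sketch below uses only the standard library and a tiny inline sample with the same columns as the generated CSV; `revenue_summary` is a hypothetical helper, and the four sample rows are made up for the demo.

```python
import csv
import io
from collections import defaultdict


def revenue_summary(csv_file):
    """Compute total revenue, per-product and per-region revenue, and rankings."""
    by_product = defaultdict(int)
    by_region = defaultdict(int)
    for row in csv.DictReader(csv_file):
        revenue = int(row["revenue"])
        by_product[row["product"]] += revenue
        by_region[row["region"]] += revenue
    return {
        "total": sum(by_product.values()),
        "by_product": dict(by_product),
        "by_region": dict(by_region),
        "top_products": sorted(by_product, key=by_product.get, reverse=True)[:3],
        "lowest_region": min(by_region, key=by_region.get),
    }


# Made-up sample rows using the notebook's column layout.
sample_csv = io.StringIO(
    "date,product,quantity,price,region,revenue\n"
    "2025-01-01,Laptop,2,500,North,1000\n"
    "2025-01-02,Phone,3,800,South,2400\n"
    "2025-01-03,Tablet,4,300,East,1200\n"
    "2025-01-04,Laptop,1,500,West,500\n"
)
summary = revenue_summary(sample_csv)
print(summary["total"])          # 5100
print(summary["top_products"])   # ['Phone', 'Laptop', 'Tablet']
print(summary["lowest_region"])  # West
```

To check the real data, the same function can be called on the saved file, for example `with open(csv_path, newline="") as f: revenue_summary(f)`.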

## ❌ Anti-Pattern: Re-processing Files in Every Phase

**Don't do this** - it wastes LLM calls and time.

In [None]:
# BAD EXAMPLE - Re-processes file in every phase
print("=" * 80)
print("❌ ANTI-PATTERN: Re-processing files")
print("=" * 80)

# Phase 1: Analysis (processes file)
phase1_bad = "Analyze the attached sales CSV and calculate total revenue."
start = time.time()
result1, sid = client.chat_new(MODEL, phase1_bad, files=[csv_path])
time1 = time.time() - start
print(f"\nPhase 1: {time1:.1f}s")
print(result1[:200] + "...")

# Phase 2: Still asking to analyze file again
phase2_bad = "Analyze the attached sales CSV and find the top 3 products by revenue."
start = time.time()
result2, _ = client.chat_continue(MODEL, sid, phase2_bad, files=[csv_path])
time2 = time.time() - start
print(f"\nPhase 2: {time2:.1f}s (re-processed file!)")
print(result2[:200] + "...")

print(f"\n⚠ Total time: {time1 + time2:.1f}s")
print("⚠ File processed 2 times (wasteful!)")

## ✅ Best Practice: Conversation Context Reuse

**Do this** - process once, reuse findings.

In [None]:
# GOOD EXAMPLE - Process once, reuse context
print("=" * 80)
print("✅ BEST PRACTICE: Conversation context reuse")
print("=" * 80)

# Phase 1: Analysis (processes file ONCE)
phase1_good = """
Analyze the attached sales CSV file.

Calculate and store in memory:
1. Total revenue
2. Revenue by product
3. Revenue by region
4. Top 3 products

I'll ask follow-up questions in subsequent messages.
"""

start = time.time()
result1, sid = client.chat_new(MODEL, phase1_good, files=[csv_path])
time1 = time.time() - start
print(f"\nPhase 1 (file processing): {time1:.1f}s")
print(result1[:300] + "...")

# Phase 2: Reuse Phase 1 findings (NO file re-processing)
phase2_good = """
**PRIORITY: Use your Phase 1 analysis from conversation memory.**

You already analyzed the sales CSV in Phase 1 and calculated:
- Total revenue
- Revenue by product
- Revenue by region
- Top 3 products

**DO NOT re-analyze the file.** Use your Phase 1 findings.

Based on Phase 1 results:
1. What percentage of total revenue does the top product represent?
2. Which region has the lowest revenue?
"""

start = time.time()
result2, _ = client.chat_continue(MODEL, sid, phase2_good)
time2 = time.time() - start
print(f"\nPhase 2 (context reuse): {time2:.1f}s (no file re-processing!)")
print(result2[:300] + "...")

# Phase 3: More analysis using conversation context
phase3_good = """
**PRIORITY: Use Phase 1 & 2 findings from conversation memory.**

Based on your previous analysis:
- Calculate the average revenue per transaction
- Identify any regional disparities worth noting
"""

start = time.time()
result3, _ = client.chat_continue(MODEL, sid, phase3_good)
time3 = time.time() - start
print(f"\nPhase 3 (context reuse): {time3:.1f}s (no file re-processing!)")
print(result3[:300] + "...")

print(f"\n✅ Total time: {time1 + time2 + time3:.1f}s")
print("✅ File processed 1 time (efficient!)")
print("✅ Phases 2-3 used conversation memory")
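
Each phase above repeats the same start/stop timing code. A small wrapper can remove that repetition; this is a sketch, `timed` is a hypothetical helper, and `fake_phase` only stands in for a client call so the example runs offline.

```python
import time


def timed(label, fn, *args, **kwargs):
    """Call fn(*args, **kwargs), print the elapsed time, return (result, seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.1f}s")
    return result, elapsed


def fake_phase(prompt):
    """Stand-in for client.chat_new / client.chat_continue (no network call)."""
    return f"reply to {prompt}", "session-1"


(reply, session_id), elapsed = timed("Phase 1", fake_phase, "Analyze the CSV")
print(reply, session_id)
```

With the real client, a call would look like `(result1, sid), time1 = timed("Phase 1", client.chat_new, MODEL, phase1_good, files=[csv_path])`.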

## Pattern Template: Phase Handoff Prompts

Use these two templates for multi-phase workflows: one for the initial file-processing phase, and one for every follow-up phase.

In [None]:
# Template for phase handoff prompts

PHASE_1_TEMPLATE = """
Analyze the attached {file_description} file.

Calculate and store in memory:
{list_of_calculations}

I'll ask follow-up questions in subsequent messages.
"""

PHASE_N_TEMPLATE = """
**PRIORITY: Use your Phase {previous_phase} findings from conversation memory.**

In Phase {previous_phase}, you already:
{summary_of_previous_findings}

**DO NOT re-analyze the raw files.** Use your Phase {previous_phase} findings.

The attached files are ONLY for verification if needed.

Current Task:
{current_task_description}
"""

print("Phase 1 Template:")
print(PHASE_1_TEMPLATE)
print("\n" + "="*80 + "\n")
print("Phase N Template:")
print(PHASE_N_TEMPLATE)
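
The placeholders in the templates can be filled with `str.format`. The sketch below repeats `PHASE_N_TEMPLATE` so that it runs on its own; `bullets` and `build_phase_prompt` are hypothetical helper names, and the findings and task text are examples only.

```python
PHASE_N_TEMPLATE = """
**PRIORITY: Use your Phase {previous_phase} findings from conversation memory.**

In Phase {previous_phase}, you already:
{summary_of_previous_findings}

**DO NOT re-analyze the raw files.** Use your Phase {previous_phase} findings.

The attached files are ONLY for verification if needed.

Current Task:
{current_task_description}
"""


def bullets(items):
    """Render a list of strings as markdown bullet lines."""
    return "\n".join(f"- {item}" for item in items)


def build_phase_prompt(previous_phase, findings, task):
    """Fill the follow-up template for the phase after `previous_phase`."""
    return PHASE_N_TEMPLATE.format(
        previous_phase=previous_phase,
        summary_of_previous_findings=bullets(findings),
        current_task_description=task,
    )


prompt = build_phase_prompt(
    previous_phase=1,
    findings=["Calculated total revenue", "Calculated revenue by product"],
    task="Find the share of total revenue held by the top product.",
)
print(prompt)
```

Listing the earlier findings explicitly, rather than writing "see above", gives the model a concrete checklist of what it should already have in the conversation.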

## Real-World Example: Data Analysis → Visualization → Report

Complete 3-phase workflow with context reuse.

In [None]:
print("=" * 80)
print("REAL-WORLD WORKFLOW: Analysis → Visualization → Report")
print("=" * 80)

# Phase 1: Deep Analysis
analysis_prompt = """
Analyze the attached sales CSV file comprehensively.

Calculate and save to numpy arrays:
1. Total revenue by product
2. Total revenue by region
3. Daily revenue trend
4. Top 5 days by revenue
5. Average transaction size

Store all results locally for next phases.
"""

print("\n[Phase 1: Analysis]")
start = time.time()
analysis_result, wf_session = client.chat_new(MODEL, analysis_prompt, files=[csv_path])
print(f"Time: {time.time() - start:.1f}s")
print(analysis_result[:400] + "...\n")

# Phase 2: Visualization (reuses Phase 1)
viz_prompt = """
**PRIORITY: Use Phase 1 analysis from conversation memory.**

You already calculated:
- Revenue by product/region
- Daily trends
- Top 5 days
- Average transaction size

**DO NOT re-analyze the CSV.** Use Phase 1 results.

Task: Create visualizations:
1. Bar chart: Revenue by product (save as 'revenue_by_product.png')
2. Line chart: Daily revenue trend (save as 'daily_trend.png')
3. Pie chart: Revenue by region (save as 'revenue_by_region.png')

Use matplotlib, 300 DPI, professional style.
"""

print("[Phase 2: Visualization]")
start = time.time()
viz_result, _ = client.chat_continue(MODEL, wf_session, viz_prompt)
print(f"Time: {time.time() - start:.1f}s")
print(viz_result[:400] + "...\n")

# Phase 3: Report Generation (reuses Phase 1 & 2)
report_prompt = """
**PRIORITY: Use Phase 1 & 2 findings from conversation memory.**

You have:
- Phase 1: All revenue calculations and trends
- Phase 2: Three visualization charts created

**DO NOT re-analyze data.** Use conversation context.

Task: Generate executive summary report:
1. Key findings (3-5 bullet points)
2. Top product recommendation
3. Regional strategy suggestion
4. Note which charts support each finding

Format as markdown.
"""

print("[Phase 3: Report Generation]")
start = time.time()
report_result, _ = client.chat_continue(MODEL, wf_session, report_prompt)
print(f"Time: {time.time() - start:.1f}s")
print("\nExecutive Summary:")
print(report_result)

print("\n" + "=" * 80)
print("✅ 3-phase workflow complete!")
print("✅ File processed once in Phase 1")
print("✅ Phases 2-3 reused conversation memory")
print("=" * 80)
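
The workflow above follows a fixed shape: one `chat_new` call with the file, then `chat_continue` calls without it. That shape can be captured in a loop, sketched here with a stub client so it runs without a server. `run_workflow` and `EchoClient` are hypothetical names; the stub only mirrors the method signatures of `LLMApiClient`.

```python
class EchoClient:
    """Offline stand-in for LLMApiClient; records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def chat_new(self, model, user_message, agent_type="auto", files=None):
        self.calls.append(("new", files))
        return f"[reply to: {user_message}]", "session-1"

    def chat_continue(self, model, session_id, user_message,
                      agent_type="auto", files=None):
        self.calls.append(("continue", files))
        return f"[reply to: {user_message}]", session_id


def run_workflow(client, model, prompts, files):
    """Send the first prompt with files, the rest as follow-ups without them."""
    results = []
    session_id = None
    for prompt in prompts:
        if session_id is None:
            # Phase 1: the only request that uploads the files.
            reply, session_id = client.chat_new(model, prompt, files=files)
        else:
            # Phase 2+: same session, no files attached.
            reply, session_id = client.chat_continue(model, session_id, prompt)
        results.append(reply)
    return results, session_id


stub = EchoClient()
results, session = run_workflow(
    stub, "demo-model", ["Analyze", "Visualize", "Report"], ["sales.csv"]
)
print(stub.calls)
```

Passing the notebook's `client`, `MODEL`, the three prompts, and `[csv_path]` would run the same three phases against the real API.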

## Cleanup

In [None]:
# Clean up test files
Path(csv_path).unlink(missing_ok=True)
print(f"✓ Cleaned up {csv_path}")

---

## Summary: Best Practices

### ✅ DO:
- Process files ONCE in Phase 1
- Store findings in conversation memory
- Use explicit "PRIORITY: Use Phase X findings" instructions
- Attach files as fallback reference only
- Summarize what previous phases calculated

### ❌ DON'T:
- Re-upload and re-process files in every phase
- Assume the AI will automatically reuse context without instruction
- Mix file processing with follow-up questions in the same prompt

### Pattern Structure:
```
Phase 1: "Analyze file and calculate X, Y, Z"
         ↓ (file processed, results in conversation)
         
Phase 2: "PRIORITY: Use Phase 1 findings (X, Y, Z from memory)"
         "DO NOT re-analyze file"
         "Task: Generate visualizations"
         ↓ (uses conversation memory, no file re-processing)
         
Phase 3: "PRIORITY: Use Phase 1 & 2 findings"
         "Task: Create report"
         ↓ (uses conversation memory)
```
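
Since the pattern depends on explicit instructions in every follow-up prompt, a quick check before sending can catch a prompt that omits them. This is a sketch under the assumption that the two marker phrases used in this notebook are the ones you want to enforce; `REQUIRED_MARKERS` and `missing_markers` are hypothetical names.

```python
REQUIRED_MARKERS = ("PRIORITY:", "DO NOT re-analyze")


def missing_markers(prompt, markers=REQUIRED_MARKERS):
    """Return the marker phrases that do not appear in the prompt (case-insensitive)."""
    lowered = prompt.lower()
    return [m for m in markers if m.lower() not in lowered]


good = (
    "**PRIORITY: Use Phase 1 findings.**\n"
    "**DO NOT re-analyze the file.**\n"
    "Task: chart revenue by product."
)
bad = "Analyze the attached sales CSV and find the top 3 products."

print(missing_markers(good))  # []
print(missing_markers(bad))   # ['PRIORITY:', 'DO NOT re-analyze']
```

A substring check like this only confirms that the wording is present; it says nothing about whether the model actually follows it.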

---

**See also:**
- [PPTX_Report_Generator_Agent_v2.ipynb](PPTX_Report_Generator_Agent_v2.ipynb) - Full PowerPoint generation example
- [CLAUDE.md](CLAUDE.md) - Architecture documentation
- Backend code: `backend/utils/phase_manager.py`, `backend/tasks/react/context_manager.py`