<a href="https://colab.research.google.com/github/micah-shull/AI_Agents/blob/main/186_Analysis_Orchestrator.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
#!/usr/bin/env python3
"""
Tiny runner to smoke-test the 4-node MVP flow.
Executes nodes manually in sequence (no LangGraph yet) to verify contracts.
"""

from mvp_nodes import (
    set_goal,
    build_plan,
    collect,
    analyze,
    report,
    BusinessAnalysisState,
)


def run_smoke_test():
    print("🧪 Running 4-node MVP smoke test...\n")

    # Initial state
    state: BusinessAnalysisState = {
        "company_name": "Tesla",
        "framework": "swot",
    }
    print(f"✅ Initial state: {state}\n")

    # Execute nodes in sequence
    state = set_goal(state)
    print(f"✅ After set_goal: framework={state.get('framework')}, goal={state.get('goal', {}).get('objective')}\n")

    state = build_plan(state)
    print(f"✅ After build_plan: plan steps={[s.get('step') for s in state.get('plan', [])]}\n")

    state = collect(state)
    print(f"✅ After collect: raw_sources={len(state.get('raw_sources', []))} items\n")

    state = analyze(state)
    print(f"✅ After analyze: insights={len(state.get('insights', []))} items\n")

    state = report(state)
    print(f"✅ After report: report_md length={len(state.get('report_md', ''))} chars\n")

    print("=" * 60)
    print("📄 Final Report:")
    print("=" * 60)
    print(state.get("report_md", ""))
    print("=" * 60)
    print("\n✅ Smoke test passed! Nodes execute in sequence.\n")


if __name__ == "__main__":
    run_smoke_test()



## `.get()` Method on Dictionaries

`.get()` safely reads dictionary values, returning a default if the key is missing.

### Basic usage

```python
# Instead of: state['framework']  (crashes if key missing)
# We use:     state.get('framework')  (returns None if missing)
# Or:         state.get('framework', 'default')  (returns 'default' if missing)
```

### Examples from the runner

**Line 29:** `state.get('goal', {}).get('objective')`
- `state.get('goal', {})` → returns the `goal` dict if present, otherwise `{}`
- `.get('objective')` → reads `objective` from that dict (or `None` if missing)
- This avoids a crash if `goal` doesn't exist yet

**Line 32:** `[s.get('step') for s in state.get('plan', [])]`
- `state.get('plan', [])` → returns the `plan` list, or `[]` if missing
- `s.get('step')` → extracts `'step'` from each item in the list

### Why use `.get()` instead of `[]`?

```python
# ❌ CRASHES if key doesn't exist:
framework = state['framework']  # KeyError if missing

# ✅ SAFE - returns None or default:
framework = state.get('framework')  # Returns None if missing
framework = state.get('framework', 'swot')  # Returns 'swot' if missing
```

In short: `.get()` prevents crashes when keys may be missing, which is common with state dictionaries that grow over the pipeline.
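
One gotcha is worth a quick sketch: the default passed to `.get()` only applies when the key is absent, not when the key is present with a value of `None`. That is presumably why `mvp_nodes.py` writes `(state.get("goal") or {})` rather than `state.get("goal", {})`. The two state dictionaries below are made up for illustration.

```python
# Illustrative state dicts (not taken from a real run).
missing = {"company_name": "Tesla"}                      # no "goal" key at all
explicit_none = {"company_name": "Tesla", "goal": None}  # key present, value None

# The default is used only when the key is absent.
objective_when_missing = missing.get("goal", {}).get("objective")

# With an explicit None, the default is ignored and the chained call fails.
try:
    explicit_none.get("goal", {}).get("objective")
    chained_ok = True
except AttributeError:
    chained_ok = False

# `or {}` covers both cases, because None is falsy.
objective_safe = (explicit_none.get("goal") or {}).get("objective")

print(objective_when_missing, chained_ok, objective_safe)
```

The trade-off is that `or {}` also replaces other falsy values (such as an empty list), which is fine here because the fallback is itself an empty container.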


Running the nodes through a tiny manual runner before wiring them into LangGraph is a practice that follows directly from your lessons learned.

**Why this approach works:**

1. Validate contracts early
   - Test state I/O before wiring into LangGraph
   - Catch mismatches before adding graph complexity

2. Faster iteration
   - Quick smoke test: run `python test_mvp_runner.py`
   - No LangGraph setup needed

3. Less debugging
   - If a node fails, you know it’s the node logic, not graph wiring
   - Clearer error messages

4. Clear separation
   - Node logic vs. graph wiring are separate concerns
   - Easier to test in isolation

5. Incremental development
   - Start with minimal nodes → test → expand one at a time
   - Avoid building everything and debugging everything at once

**From your SWOT analysis:**
- You noted the need for better testing
- Manual testing was slow
- Hard to test nodes in isolation

This tiny runner addresses those points.

**Is it industry standard?**
- It is in the spirit of test-first development: verify each node's behavior in isolation, then wire it into a framework
- Some skip it and wire directly into LangGraph
- We’re documenting it as our standard because it fits your workflow
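
To make "validate contracts early" concrete, here is a minimal sketch of a contract check that could sit next to the runner. The `check_contract` helper and the two stub nodes are hypothetical and only illustrate the idea: each node should return a dict that keeps the incoming keys and adds the ones it promises.

```python
from typing import Any, Callable, Dict, Iterable

State = Dict[str, Any]


def check_contract(node: Callable[[State], State], state: State, adds: Iterable[str]) -> State:
    """Run one node and verify it keeps existing keys and adds the promised ones."""
    result = node(state)
    if not isinstance(result, dict):
        raise TypeError(f"{node.__name__} returned {type(result).__name__}, expected dict")
    dropped = set(state) - set(result)
    missing = set(adds) - set(result)
    if dropped or missing:
        raise AssertionError(
            f"{node.__name__}: dropped={sorted(dropped)}, missing={sorted(missing)}"
        )
    return result


# Hypothetical stub nodes standing in for the real ones.
def stub_set_goal(state: State) -> State:
    return {**state, "goal": {"objective": "SWOT assessment"}}


def stub_broken(state: State) -> State:
    return {"plan": []}  # Bug: forgets to carry the incoming state forward


state = check_contract(stub_set_goal, {"company_name": "Tesla"}, adds=["goal"])

try:
    check_contract(stub_broken, state, adds=["plan"])
    broken_detected = False
except AssertionError as exc:
    broken_detected = True
    print(f"Contract violation caught: {exc}")
```

A check like this fails at the node that broke the contract, which is the point of testing before graph wiring is involved.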


In [None]:
🧪 Running 4-node MVP smoke test...

✅ Initial state: {'company_name': 'Tesla', 'framework': 'swot'}

✅ After set_goal: framework=swot, goal=SWOT assessment

✅ After build_plan: plan steps=['collect_data', 'analyze', 'report']

✅ After collect: raw_sources=2 items

✅ After analyze: insights=1 items

✅ After report: report_md length=95 chars

============================================================
📄 Final Report:
============================================================
# swot Report
- Objective: SWOT assessment
- Steps: collect_data, analyze, report
- Insights: 1
============================================================

✅ Smoke test passed! Nodes execute in sequence.

Smoke test passed.

- **Nodes execute in sequence**: `set_goal` → `build_plan` → `collect` → `analyze` → `report`
- **State propagates correctly**: each step adds/updates state as expected
- **Contracts verified**: nodes accept and return state dictionaries properly
- **Final output generated**: report created with objective, steps, and insight count

**Next steps** (when ready):
1. Wire these nodes into a LangGraph `StateGraph` with edges connecting them
2. Add real data collection (Tavily + Wikipedia)
3. Add LLM analysis with Pydantic validation
4. Implement Jinja2 report templates

For now, the foundation is solid. The tiny runner validated the contracts, so wiring into LangGraph should be straightforward.

---

**What we verified:**
- Nodes can be called in sequence
- State flows correctly through each step
- Each node accepts state and returns updated state
- The contracts (input/output shape) work

**What we haven't tested yet:**
- Real data collection (currently placeholder strings)
- LLM calls (currently placeholder insights)
- Pydantic validation (not implemented yet)
- Report templating (currently simple string formatting)

**Why this approach helps:**
1. Separates concerns: test structure first, then functionality
2. Easier debugging: if structure is broken, you know immediately
3. Incremental: add real data collection next, test again; then add LLM, test again
4. Faster iteration: can run this in seconds without API calls

**Next steps (incrementally):**
- Step 1: Add real data collection → test with actual Tavily/Wikipedia
- Step 2: Add LLM analysis → test with real prompts
- Step 3: Add validation → test Pydantic schemas
- Step 4: Add templates → test Jinja2 rendering

This confirms the framework works. We can now incrementally add functionality and test each layer.


# mvp_nodes.py

In [None]:
from typing import TypedDict, List, Dict, Any
from config import TAVILY_API_KEY, OPENAI_API_KEY
from tavily import TavilyClient


class BusinessAnalysisState(TypedDict, total=False):
    company_name: str
    framework: str
    goal: Dict[str, Any]
    plan: List[Dict[str, Any]]
    raw_sources: List[Dict[str, Any]]  # collect() stores dicts (title, content, url, source)
    insights: List[Dict[str, Any]]
    report_md: str


ALLOWED_FRAMEWORKS = {"swot", "pestel", "porter_five_forces"}


def set_goal(state: BusinessAnalysisState) -> BusinessAnalysisState:
    objective = (state.get("goal") or {}).get("objective") or "SWOT assessment"
    framework = (state.get("goal") or {}).get("framework") or state.get("framework", "swot")
    framework = framework.lower().replace("porters", "porter_five_forces")
    framework = framework if framework in ALLOWED_FRAMEWORKS else "swot"
    goal = {
        "objective": objective,
        "success_criteria": (state.get("goal") or {}).get("success_criteria", "Valid schema and clear exec summary"),
        "framework": framework,
        "scope": {
            "company_name": state.get("company_name", "Unknown"),
            "region": (state.get("goal") or {}).get("region") or state.get("region") or "global",
        },
        "constraints": {"time_minutes": 60, "cost_usd_max": 2.0},
        "acceptance_thresholds": {"min_confidence": 0.5, "min_coverage": 0.8},
        "priority": (state.get("goal") or {}).get("priority", "normal"),
    }
    return {**state, "goal": goal, "framework": framework}


def build_plan(state: BusinessAnalysisState) -> BusinessAnalysisState:
    g = state.get("goal", {})
    thr = g.get("acceptance_thresholds", {"min_confidence": 0.5, "min_coverage": 0.8})
    framework = g.get("framework", "swot")
    plan = [
        {"step": "collect_data", "inputs": ["company_name", "framework"], "outputs": ["raw_sources"], "accept_if": {"coverage": f">={thr['min_coverage']}"}},
        {"step": "analyze", "inputs": ["raw_sources", "framework"], "outputs": ["insights"], "accept_if": {"schema_valid": True, "min_confidence": thr["min_confidence"], "required_categories": framework}},
        {"step": "report", "inputs": ["insights"], "outputs": ["report_md"], "accept_if": {"render_ok": True}},
    ]
    return {**state, "plan": plan}


def collect(state: BusinessAnalysisState) -> BusinessAnalysisState:
    """Collect data from Tavily search API."""
    company = state.get("company_name", "Unknown")
    framework = state.get("framework", "swot")

    raw_sources = []

    if TAVILY_API_KEY:
        try:
            client = TavilyClient(api_key=TAVILY_API_KEY)
            # Simple search query based on company and framework
            query = f"{company} business analysis {framework}"
            response = client.search(query=query, max_results=5)

            for result in response.get("results", []):
                raw_sources.append({
                    "title": result.get("title", ""),
                    "content": result.get("content", ""),
                    "url": result.get("url", ""),
                    "source": "tavily"
                })
        except Exception as e:
            print(f"⚠️ Tavily error: {e}")
            raw_sources = [{"error": str(e), "source": "tavily"}]
    else:
        raw_sources = [{"error": "TAVILY_API_KEY not found", "source": "tavily"}]

    return {**state, "raw_sources": raw_sources}


def analyze(state: BusinessAnalysisState) -> BusinessAnalysisState:
    insights = [{"category": "Strength", "insight": "Placeholder", "confidence": 0.6, "impact": "Moderate"}]
    return {**state, "insights": insights}


def report(state: BusinessAnalysisState) -> BusinessAnalysisState:
    n = len(state.get("insights", []))
    steps = ", ".join([s.get("step") for s in state.get("plan", [])])
    report_md = f"# {state.get('framework', 'SWOT')} Report\n- Objective: {state.get('goal', {}).get('objective')}\n- Steps: {steps}\n- Insights: {n}"
    return {**state, "report_md": report_md}


# config

In [None]:
"""Configuration loader for agent settings and API keys."""
import os
from pathlib import Path
from dotenv import load_dotenv

# Try to load from env/API_KEYS.env first, fallback to root API_KEYS.env
project_root = Path(__file__).parent.parent
env_path = project_root / "env" / "API_KEYS.env"
if not env_path.exists():
    env_path = project_root / "API_KEYS.env"

load_dotenv(env_path)

# API Keys
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")

# Agent Settings
LLM_MODEL = os.getenv("AGENT_LLM_MODEL", "gpt-4o-mini")
LLM_TEMPERATURE = float(os.getenv("AGENT_LLM_TEMPERATURE", "0.1"))
MIN_CONFIDENCE = float(os.getenv("AGENT_MIN_CONFIDENCE", "0.5"))

# Feature Flags
ENABLE_DATA_CACHING = os.getenv("FEATURE_ENABLE_DATA_CACHING", "true").lower() == "true"
ENABLE_COST_TRACKING = os.getenv("FEATURE_ENABLE_COST_TRACKING", "true").lower() == "true"


In [None]:
(.venv) micahshull@Micahs-iMac LG_Cursor_009_BizStrategyAnalysis % python agents/test_mvp_runner.py
🧪 Running 4-node MVP smoke test...

✅ Initial state: {'company_name': 'Tesla', 'framework': 'swot'}

✅ After set_goal: framework=swot, goal=SWOT assessment

✅ After build_plan: plan steps=['collect_data', 'analyze', 'report']

✅ After collect: raw_sources=5 items

✅ After analyze: insights=1 items

✅ After report: report_md length=95 chars

============================================================
📄 Final Report:
============================================================
# swot Report
- Objective: SWOT assessment
- Steps: collect_data, analyze, report
- Insights: 1
============================================================

✅ Smoke test passed! Nodes execute in sequence.


Success. The collect node is now fetching real Tavily data — 5 sources collected.

**What we've accomplished:**
- Framework structure verified (nodes execute in sequence)
- Real data collection working (Tavily API returning actual company data)
- State propagates correctly through the pipeline

**Progress:**
1. Framework structure (done)
2. Real data collection (done)
3. LLM analysis (next: add OpenAI to `analyze` node)
4. Pydantic validation (add schema validation)
5. Report templating (Jinja2 templates)

**Next steps:**
- Add real OpenAI LLM analysis to the `analyze` node
- Load persona from `prompts/Business_Analyst_Persona.md`
- Generate real SWOT insights using the collected Tavily data


In [None]:
from typing import TypedDict, List, Dict, Any
from pathlib import Path
import json
from config import TAVILY_API_KEY, OPENAI_API_KEY, LLM_MODEL, LLM_TEMPERATURE
from tavily import TavilyClient
from openai import OpenAI


class BusinessAnalysisState(TypedDict, total=False):
    company_name: str
    framework: str
    goal: Dict[str, Any]
    plan: List[Dict[str, Any]]
    raw_sources: List[Dict[str, Any]]  # collect() stores dicts (title, content, url, source)
    insights: List[Dict[str, Any]]
    report_md: str


ALLOWED_FRAMEWORKS = {"swot", "pestel", "porter_five_forces"}


def set_goal(state: BusinessAnalysisState) -> BusinessAnalysisState:
    objective = (state.get("goal") or {}).get("objective") or "SWOT assessment"
    framework = (state.get("goal") or {}).get("framework") or state.get("framework", "swot")
    framework = framework.lower().replace("porters", "porter_five_forces")
    framework = framework if framework in ALLOWED_FRAMEWORKS else "swot"
    goal = {
        "objective": objective,
        "success_criteria": (state.get("goal") or {}).get("success_criteria", "Valid schema and clear exec summary"),
        "framework": framework,
        "scope": {
            "company_name": state.get("company_name", "Unknown"),
            "region": (state.get("goal") or {}).get("region") or state.get("region") or "global",
        },
        "constraints": {"time_minutes": 60, "cost_usd_max": 2.0},
        "acceptance_thresholds": {"min_confidence": 0.5, "min_coverage": 0.8},
        "priority": (state.get("goal") or {}).get("priority", "normal"),
    }
    return {**state, "goal": goal, "framework": framework}


def build_plan(state: BusinessAnalysisState) -> BusinessAnalysisState:
    g = state.get("goal", {})
    thr = g.get("acceptance_thresholds", {"min_confidence": 0.5, "min_coverage": 0.8})
    framework = g.get("framework", "swot")
    plan = [
        {"step": "collect_data", "inputs": ["company_name", "framework"], "outputs": ["raw_sources"], "accept_if": {"coverage": f">={thr['min_coverage']}"}},
        {"step": "analyze", "inputs": ["raw_sources", "framework"], "outputs": ["insights"], "accept_if": {"schema_valid": True, "min_confidence": thr["min_confidence"], "required_categories": framework}},
        {"step": "report", "inputs": ["insights"], "outputs": ["report_md"], "accept_if": {"render_ok": True}},
    ]
    return {**state, "plan": plan}


def collect(state: BusinessAnalysisState) -> BusinessAnalysisState:
    """Collect data from Tavily search API."""
    company = state.get("company_name", "Unknown")
    framework = state.get("framework", "swot")

    raw_sources = []

    if TAVILY_API_KEY:
        try:
            client = TavilyClient(api_key=TAVILY_API_KEY)
            # Simple search query based on company and framework
            query = f"{company} business analysis {framework}"
            response = client.search(query=query, max_results=5)

            for result in response.get("results", []):
                raw_sources.append({
                    "title": result.get("title", ""),
                    "content": result.get("content", ""),
                    "url": result.get("url", ""),
                    "source": "tavily"
                })
        except Exception as e:
            print(f"⚠️ Tavily error: {e}")
            raw_sources = [{"error": str(e), "source": "tavily"}]
    else:
        raw_sources = [{"error": "TAVILY_API_KEY not found", "source": "tavily"}]

    return {**state, "raw_sources": raw_sources}


def analyze(state: BusinessAnalysisState) -> BusinessAnalysisState:
    """Analyze collected data using OpenAI LLM to generate SWOT insights."""
    company = state.get("company_name", "Unknown")
    framework = state.get("framework", "swot")
    raw_sources = state.get("raw_sources", [])
    goal = state.get("goal", {})

    # Load persona
    persona_path = Path(__file__).parent.parent / "prompts" / "Business_Analyst_Persona.md"
    persona = ""
    if persona_path.exists():
        persona = persona_path.read_text()

    # Build context from raw sources
    context = "\n\n".join([
        f"Source: {s.get('title', 'Unknown')}\nURL: {s.get('url', '')}\n{s.get('content', '')[:500]}"
        for s in raw_sources if isinstance(s, dict) and not s.get("error")
    ])

    # Build SWOT prompt
    if framework == "swot":
        system_prompt = f"""{persona}

You are conducting a SWOT analysis for {company}.

CRITICAL: Return ONLY valid JSON. No preamble, no explanations, just JSON.

Required JSON schema:
{{
    "strengths": [
        {{"category": "Strength", "insight": "...", "evidence": "...", "confidence": 0.0-1.0, "impact": "Very Low|Low|Moderate|High|Very High"}}
    ],
    "weaknesses": [
        {{"category": "Weakness", "insight": "...", "evidence": "...", "confidence": 0.0-1.0, "impact": "Very Low|Low|Moderate|High|Very High"}}
    ],
    "opportunities": [
        {{"category": "Opportunity", "insight": "...", "evidence": "...", "confidence": 0.0-1.0, "impact": "Very Low|Low|Moderate|High|Very High"}}
    ],
    "threats": [
        {{"category": "Threat", "insight": "...", "evidence": "...", "confidence": 0.0-1.0, "impact": "Very Low|Low|Moderate|High|Very High"}}
    ]
}}

Provide at least 2 insights per category (8 total minimum)."""

        user_prompt = f"""Analyze the following data about {company} and generate a SWOT analysis:

{context}

Objective: {goal.get('objective', 'SWOT assessment')}

Return ONLY the JSON object matching the schema above."""

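    else:
        # Only SWOT prompts exist so far. Return a clear error instead of hitting
        # undefined prompt variables inside the OpenAI try block below.
        message = f"Framework '{framework}' is not supported yet"
        return {**state, "insights": [{"category": "Error", "insight": message, "confidence": 0.0, "impact": "Very Low"}]}

    insights = []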

    if OPENAI_API_KEY:
        try:
            client = OpenAI(api_key=OPENAI_API_KEY)
            response = client.chat.completions.create(
                model=LLM_MODEL,
                temperature=LLM_TEMPERATURE,
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": user_prompt}
                ],
                response_format={"type": "json_object"}
            )

            result_json = json.loads(response.choices[0].message.content)

            # Flatten to list of insights
            for category_key in ["strengths", "weaknesses", "opportunities", "threats"]:
                insights.extend(result_json.get(category_key, []))

        except Exception as e:
            print(f"⚠️ OpenAI error: {e}")
            insights = [{"category": "Error", "insight": f"Analysis failed: {str(e)}", "confidence": 0.0, "impact": "Very Low"}]
    else:
        insights = [{"category": "Error", "insight": "OPENAI_API_KEY not found", "confidence": 0.0, "impact": "Very Low"}]

    return {**state, "insights": insights}


def report(state: BusinessAnalysisState) -> BusinessAnalysisState:
    n = len(state.get("insights", []))
    steps = ", ".join([s.get("step") for s in state.get("plan", [])])
    report_md = f"# {state.get('framework', 'SWOT')} Report\n- Objective: {state.get('goal', {}).get('objective')}\n- Steps: {steps}\n- Insights: {n}"
    return {**state, "report_md": report_md}


In [None]:
(.venv) micahshull@Micahs-iMac LG_Cursor_009_BizStrategyAnalysis % python agents/test_mvp_runner.py
🧪 Running 4-node MVP smoke test...

✅ Initial state: {'company_name': 'Tesla', 'framework': 'swot'}

✅ After set_goal: framework=swot, goal=SWOT assessment

✅ After build_plan: plan steps=['collect_data', 'analyze', 'report']

✅ After collect: raw_sources=5 items

✅ After analyze: insights=8 items

✅ After report: report_md length=95 chars

============================================================
📄 Final Report:
============================================================
# swot Report
- Objective: SWOT assessment
- Steps: collect_data, analyze, report
- Insights: 8
============================================================

✅ Smoke test passed! Nodes execute in sequence.



Test passed. End-to-end MVP is working:

**What's working:**
- Real data collection: 5 sources from Tavily API
- Real LLM analysis: 8 SWOT insights generated, consistent with the prompt's minimum of 2 per category
- State flows through all nodes

**What we've built:**
1. Framework structure verified
2. Real data collection (Tavily)
3. Real LLM analysis (OpenAI gpt-4o-mini)

**Next steps (your choice):**
1. Generalize `analyze` node — add PESTEL and Porter's prompts (dynamic framework routing)
2. Improve report formatting — format the 8 insights into a readable SWOT report
3. Test edge cases — try different companies, check error handling
4. Add validation — Pydantic schemas to ensure insight structure
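
For option 1, dynamic framework routing could be as simple as a lookup table that maps each framework to its JSON keys and category labels, shared by the prompt builder and the flattening loop. The sketch below is an assumption about how that might look, not the project's implementation: `FRAMEWORK_SPECS`, `build_schema_hint`, and `flatten` are hypothetical names, and the lowercase PESTEL keys are a guess at what the prompt would request.

```python
from typing import Any, Dict, List

# Hypothetical routing table: JSON key -> category label, per framework.
FRAMEWORK_SPECS: Dict[str, Dict[str, str]] = {
    "swot": {
        "strengths": "Strength",
        "weaknesses": "Weakness",
        "opportunities": "Opportunity",
        "threats": "Threat",
    },
    "pestel": {
        "political": "Political",
        "economic": "Economic",
        "social": "Social",
        "technological": "Technological",
        "environmental": "Environmental",
        "legal": "Legal",
    },
}


def build_schema_hint(framework: str) -> str:
    """Describe the expected top-level JSON keys for the prompt."""
    spec = FRAMEWORK_SPECS[framework]  # KeyError signals an unsupported framework
    return ", ".join(f'"{key}": [{label} insights]' for key, label in spec.items())


def flatten(framework: str, result_json: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten the per-category lists into one list, in the framework's key order."""
    insights: List[Dict[str, Any]] = []
    for key in FRAMEWORK_SPECS[framework]:
        insights.extend(result_json.get(key, []))
    return insights


sample = {"political": [{"category": "Political", "insight": "placeholder"}], "legal": []}
flat = flatten("pestel", sample)
print(build_schema_hint("swot"))
print(len(flat))
```

Keeping the keys in one table means adding a framework touches one place instead of the prompt, the flattening loop, and the report separately.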



Recommendation: improve report formatting first.

Why:
- You can inspect the actual insights the LLM produced
- Validates prompt quality and whether adjustments are needed
- Small, incremental step that follows our pattern
- Makes the output immediately useful

**Suggested order:**
1. Improve report formatting (see the insights)
2. Add Pydantic validation (ensure structure is always correct)
3. Generalize analyze node (add PESTEL, Porter's, etc.)
4. Test edge cases and error handling

This lets us check output quality, catch format issues early, then expand.

Proposed plan:
- Update the `report` node to format the 8 insights into a clean SWOT table
- Include confidence scores and impact levels
- Make it readable for quick review


# report node update

In [None]:

def report(state: BusinessAnalysisState) -> BusinessAnalysisState:
    """Generate formatted markdown report from insights."""
    framework = state.get("framework", "swot").upper()
    company = state.get("company_name", "Unknown")
    goal = state.get("goal", {})
    insights = state.get("insights", [])

    # Organize insights by category
    categories = {
        "Strength": [],
        "Weakness": [],
        "Opportunity": [],
        "Threat": []
    }

    for insight in insights:
        category = insight.get("category", "")
        if category in categories:
            categories[category].append(insight)

    # Build report
    report_md = f"# {framework} Analysis Report\n\n"
    report_md += f"**Company:** {company}\n"
    report_md += f"**Objective:** {goal.get('objective', 'Strategic assessment')}\n"
    report_md += f"**Date:** Generated via Business Analysis Agent\n\n"
    report_md += "---\n\n"

    # Summary table
    report_md += "## Summary\n\n"
    report_md += "| Category | Count |\n"
    report_md += "|----------|-------|\n"
    for cat_name, cat_insights in categories.items():
        report_md += f"| {cat_name}s | {len(cat_insights)} |\n"
    report_md += "\n---\n\n"

    # Detailed insights by category
    for cat_name, cat_insights in categories.items():
        if cat_insights:
            report_md += f"## {cat_name}s\n\n"
            report_md += "| Insight | Evidence | Confidence | Impact |\n"
            report_md += "|---------|----------|------------|--------|\n"

            for insight in cat_insights:
                insight_text = insight.get("insight", "")[:100] + ("..." if len(insight.get("insight", "")) > 100 else "")
                evidence = insight.get("evidence", "")[:80] + ("..." if len(insight.get("evidence", "")) > 80 else "")
                confidence = insight.get("confidence", 0.0)
                impact = insight.get("impact", "Unknown")
                report_md += f"| {insight_text} | {evidence} | {confidence:.2f} | {impact} |\n"

            report_md += "\n"

    report_md += "---\n\n"
    report_md += f"*Report generated with {len(insights)} total insights*\n"

    return {**state, "report_md": report_md}


In [None]:
(.venv) micahshull@Micahs-iMac LG_Cursor_009_BizStrategyAnalysis % python agents/test_mvp_runner.py
🧪 Running 4-node MVP smoke test...

✅ Initial state: {'company_name': 'Tesla', 'framework': 'swot'}

✅ After set_goal: framework=swot, goal=SWOT assessment

✅ After build_plan: plan steps=['collect_data', 'analyze', 'report']

✅ After collect: raw_sources=5 items

✅ After analyze: insights=8 items

✅ After report: report_md length=1975 chars


# SWOT Analysis Report

**Company:** Tesla
**Objective:** SWOT assessment
**Date:** Generated via Business Analysis Agent

---

## Summary

| Category | Count |
|----------|-------|
| Strengths | 2 |
| Weaknesss | 2 |
| Opportunitys | 2 |
| Threats | 2 |

---

## Strengths

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Strong brand recognition as a leader in the electric vehicle (EV) market. | Tesla is the most recognizable EV brand globally, contributing to its competitiv... | 0.90 | Very High |
| Innovative technology and product offerings. | Tesla's energy-saving techniques and luxury-focused designs have set it apart in... | 0.85 | High |

## Weaknesss

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Limited global operations compared to competitors. | Tesla's international presence is not as extensive as some traditional automaker... | 0.75 | High |
| Production and delivery challenges. | Tesla has faced issues with scaling production to meet demand, impacting custome... | 0.80 | Moderate |

## Opportunitys

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Expansion into emerging markets, particularly India. | Tesla's strategy to enter the Indian market could significantly increase its cus... | 0.70 | High |
| Growth in renewable energy solutions. | Increasing global focus on sustainability presents opportunities for Tesla's sol... | 0.80 | Very High |

## Threats

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Intensifying competition in the EV market. | Traditional automakers are rapidly entering the EV space, increasing competition... | 0.85 | Very High |
| Regulatory challenges and changing government policies. | Potential changes in EV incentives and regulations could impact Tesla's sales an... | 0.75 | High |

---

*Report generated with 8 total insights*

============================================================

✅ Smoke test passed! Nodes execute in sequence.



Results look solid for an MVP.

**What's working well:**
- All 4 SWOT categories populated (2 per category = 8 insights)
- Insights are relevant and coherent (brand recognition, production challenges, India expansion, competition)
- Confidence scores are reasonable (0.70–0.90)
- Impact levels are appropriate (High, Very High, Moderate)
- Evidence is sourced from real Tavily data

**Minor issues to fix:**
- Pluralization bug: "Weaknesss" and "Opportunitys" → "Weaknesses" and "Opportunities"

**What this means:**
- End-to-end flow works (Tavily → LLM → Report)
- Prompt produces structured SWOT insights
- Quality is usable for strategic analysis

**Next steps:**
1. Fix the pluralization bug
2. Add Pydantic validation to ensure structure
3. Test with other companies to check consistency

The MVP is functional.
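
The pluralization bug comes from appending `s` to each category name in the `report` node. One possible fix, sketched here with a hypothetical `PLURALS` lookup (the actual change is not shown in this notebook), is to map each category to its heading explicitly:

```python
# Hypothetical lookup; irregular plurals are spelled out instead of appending "s".
PLURALS = {
    "Strength": "Strengths",
    "Weakness": "Weaknesses",
    "Opportunity": "Opportunities",
    "Threat": "Threats",
}


def heading_for(category: str) -> str:
    """Return the plural heading, falling back to naive pluralization for unknown names."""
    return PLURALS.get(category, f"{category}s")


headings = [heading_for(c) for c in ("Strength", "Weakness", "Opportunity", "Threat")]
print(headings)
```

In `report`, `f"{cat_name}s"` would then become `heading_for(cat_name)` in both the summary table and the section headings.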

## Explaining the Pydantic validation changes:

Why it matters:
- Ensures structure: required fields, correct types, valid enums
- Fail fast: catch malformed output before it propagates
- Type safety: clearer errors instead of silent failures

How it works:

Step 1: Define the schema (lines 24-36)

```python
from pydantic import BaseModel, Field

class SWOTInsight(BaseModel):
    # Pydantic v2 uses `pattern=`; the older `regex=` keyword was removed
    category: str = Field(pattern="^(Strength|Weakness|Opportunity|Threat)$")
    insight: str = Field(min_length=10)
    evidence: str = Field(min_length=5)
    confidence: float = Field(ge=0.0, le=1.0)  # Between 0.0 and 1.0
    impact: str = Field(pattern="^(Very Low|Low|Moderate|High|Very High)$")
```

This enforces:
- `category` must be exactly "Strength", "Weakness", "Opportunity", or "Threat"
- `insight` must be at least 10 characters
- `evidence` must be at least 5 characters
- `confidence` must be 0.0–1.0
- `impact` must be one of the 5 allowed values

`SWOTResult` ensures each category has at least 2 insights.

Step 2: Use it during validation (lines 171-181)


**Validation flow:**

```python
try:
    validated = SWOTResult(**result_json)  # Pydantic validates everything
    # If it passes, we know the structure is correct
except ValidationError as e:
    # If it fails, log a warning but still use the raw data (fallback).
    # This prevents crashes while alerting us to quality issues.
    print(f"⚠️ Validation warning: {e}")
```

**Example scenarios:**

✅ Valid response:
```json
{
  "strengths": [
    {"category": "Strength", "insight": "Strong brand", "evidence": "Market data", "confidence": 0.9, "impact": "High"}
  ],
  ...
}
```
→ Validation passes → use the validated data

❌ Invalid response (LLM made a mistake):
```json
{
  "strengths": [
    {"category": "Strengths", "insight": "x", "confidence": 1.5, "impact": "Super High"}
  ]
}
```
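→ Validation catches:
- `category: "Strengths"` should be `"Strength"`
- `insight` too short (< 10 chars)
- `evidence` is missing
- `confidence` 1.5 is > 1.0
- `impact: "Super High"` not in allowed values
- only one strength, and the other three categories are absent (`SWOTResult` expects at least 2 per category)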

→ Prints warning but uses raw data (graceful fallback)

**Why both validation and fallback:**
- Validation ensures quality when the LLM returns correct structure
- Fallback prevents crashes and allows inspection of problematic data
- Warning alerts us to prompt or model issues

**Benefits:**
- Catches issues early (before report generation)
- Clear error messages (Pydantic shows what's wrong)
- Graceful degradation (keeps working even if validation fails)

This is a safety net to catch LLM output issues early while keeping the agent running.
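
To see the validate-then-fall-back flow run end to end without installing anything, here is a plain-Python stand-in for the Pydantic check. It deliberately does not use Pydantic: `check_insight` and `validate_with_fallback` are hypothetical helpers that mirror the rules listed above, applied to the invalid response from the example.

```python
from typing import Any, Dict, List, Tuple

CATEGORIES = {"Strength", "Weakness", "Opportunity", "Threat"}
IMPACTS = {"Very Low", "Low", "Moderate", "High", "Very High"}


def check_insight(item: Dict[str, Any]) -> List[str]:
    """Return a list of rule violations for one insight (empty list means valid)."""
    errors = []
    if item.get("category") not in CATEGORIES:
        errors.append("category not allowed")
    if len(str(item.get("insight", ""))) < 10:
        errors.append("insight shorter than 10 characters")
    if len(str(item.get("evidence", ""))) < 5:
        errors.append("evidence missing or shorter than 5 characters")
    confidence = item.get("confidence")
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        errors.append("confidence outside 0.0-1.0")
    if item.get("impact") not in IMPACTS:
        errors.append("impact not allowed")
    return errors


def validate_with_fallback(result_json: Dict[str, Any]) -> Tuple[List[Dict[str, Any]], List[str]]:
    """Flatten all insights; collect warnings instead of raising, so the run continues."""
    insights, warnings = [], []
    for key in ("strengths", "weaknesses", "opportunities", "threats"):
        items = result_json.get(key, [])
        if len(items) < 2:
            warnings.append(f"{key}: fewer than 2 insights")
        for item in items:
            warnings.extend(f"{key}: {err}" for err in check_insight(item))
        insights.extend(items)  # Fallback: keep the raw data even when rules fail
    return insights, warnings


bad = {"strengths": [{"category": "Strengths", "insight": "x", "confidence": 1.5, "impact": "Super High"}]}
insights, warnings = validate_with_fallback(bad)
for w in warnings:
    print(f"WARNING: {w}")
```

The real code gets the same behavior from `SWOTResult(**result_json)` inside a `try`/`except ValidationError`; the stand-in just makes the "warn but keep going" decision visible.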

In [None]:
(.venv) micahshull@Micahs-iMac LG_Cursor_009_BizStrategyAnalysis % python agents/test_mvp_runner.py
🧪 Running 4-node MVP smoke test...

✅ Initial state: {'company_name': 'Tesla', 'framework': 'swot'}

✅ After set_goal: framework=swot, goal=SWOT assessment

✅ After build_plan: plan steps=['collect_data', 'analyze', 'report']

✅ After collect: raw_sources=5 items

✅ After analyze: insights=8 items

✅ After report: report_md length=1974 chars

============================================================
📄 Final Report:
============================================================


# SWOT Analysis Report

**Company:** Tesla
**Objective:** SWOT assessment
**Date:** Generated via Business Analysis Agent

---

## Summary

| Category | Count |
|----------|-------|
| Strengths | 2 |
| Weaknesses | 2 |
| Opportunities | 2 |
| Threats | 2 |

---

## Strengths

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Strong brand recognition as a leader in the electric vehicle (EV) market. | Tesla is the most recognizable EV brand globally, contributing to its competitiv... | 0.90 | Very High |
| Innovative technology and product offerings. | Tesla's energy-saving techniques and luxury-focused designs have set it apart fr... | 0.85 | High |

## Weaknesses

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Limited global operations compared to traditional automakers. | Tesla's international presence is not as extensive as that of established compet... | 0.75 | High |
| Production and delivery challenges. | Tesla has faced issues with scaling production to meet demand, impacting custome... | 0.80 | Moderate |

## Opportunities

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Expansion into emerging markets. | Tesla's strategy to enter markets like India could significantly increase its cu... | 0.70 | High |
| Growth in renewable energy solutions. | Increasing global focus on sustainability presents opportunities for Tesla's sol... | 0.80 | Very High |

## Threats

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Intensifying competition in the EV market. | Traditional automakers are rapidly entering the EV space, increasing competition... | 0.85 | Very High |
| Regulatory challenges and changing government policies. | Potential changes in EV incentives and regulations could impact Tesla's sales an... | 0.70 | Moderate |

---

*Report generated with 8 total insights*






All checks passed:

- Pluralization fixed: "Weaknesses" and "Opportunities" are correct
- Pydantic validation passed: no warnings, so the LLM output matched the schema
- Report formatted correctly: all 8 insights appear in the category tables
- Insight quality looks good: relevant to Tesla (brand recognition, production challenges, competition, etc.)

What’s working:
1. Framework structure
2. Real data collection (Tavily)
3. Real LLM analysis (OpenAI)
4. Pydantic validation
5. Formatted reporting

Next steps (when ready):
1. Wire into LangGraph — convert the tiny runner into a StateGraph workflow
2. Add PESTEL and Porter’s — extend the generalized analyze node
3. Enhance data collection — add Wikipedia and improve query strategies
4. Add caching — cache Tavily results for repeated company analyses
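
For step 4, caching could start as simply as the sketch below. It is an in-memory illustration only: `make_cached_search` and `fake_search` are hypothetical names, and the real Tavily call is replaced by a stand-in so the example runs on its own.

```python
from typing import Callable, Dict, List

SearchFn = Callable[[str], List[dict]]


def make_cached_search(search: SearchFn) -> SearchFn:
    """Wrap a search function so repeated queries reuse earlier results."""
    cache: Dict[str, List[dict]] = {}

    def cached(query: str) -> List[dict]:
        key = query.strip().lower()  # "Tesla" and " tesla " share one entry
        if key not in cache:
            cache[key] = search(query)
        return cache[key]

    return cached


# Stand-in for the real search call; it records each invocation.
calls: List[str] = []


def fake_search(query: str) -> List[dict]:
    calls.append(query)
    return [{"title": f"Result for {query}"}]


search = make_cached_search(fake_search)
first = search("Tesla")
second = search(" tesla ")  # served from the cache, no second call
```

A persistent cache (for example, a JSON file keyed by company name) would survive between runs, at the cost of deciding when entries go stale.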



In [None]:
from typing import List

from pydantic import BaseModel, Field


class PESTELInsight(BaseModel):
    category: str = Field(pattern="^(Political|Economic|Social|Technological|Environmental|Legal)$")
    insight: str = Field(min_length=10)
    evidence: str = Field(min_length=5)
    confidence: float = Field(ge=0.0, le=1.0)
    impact: str = Field(pattern="^(Low|Moderate|High)$")


class PESTELResult(BaseModel):
    # min_length is the Pydantic v2 name for the list-size constraint (v1: min_items)
    political: List[PESTELInsight] = Field(min_length=2)
    economic: List[PESTELInsight] = Field(min_length=2)
    social: List[PESTELInsight] = Field(min_length=2)
    technological: List[PESTELInsight] = Field(min_length=2)
    environmental: List[PESTELInsight] = Field(min_length=2)
    legal: List[PESTELInsight] = Field(min_length=2)


class PorterForce(BaseModel):
    force: str = Field(pattern="^(Competitive Rivalry|Threat of New Entrants|Threat of Substitutes|Supplier Power|Buyer Power)$")
    rating: str = Field(pattern="^(Low|Moderate|High)$")
    insight: str = Field(min_length=10)
    evidence: str = Field(min_length=5)
    confidence: float = Field(ge=0.0, le=1.0)


class PorterResult(BaseModel):
    competitive_rivalry: PorterForce
    threat_of_new_entrants: PorterForce
    threat_of_substitutes: PorterForce
    supplier_power: PorterForce
    buyer_power: PorterForce
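

The `report` node groups insights by a `category` key and, for Porter's, reads `rating`, while `PorterForce` stores the force name under `force`. One way to bridge the two shapes is sketched below. This is an assumption about how the analyze step could flatten its result, not the project's code; `flatten_porter` is a hypothetical helper and works on plain dicts shaped like `PorterResult`.

```python
from typing import Any, Dict, List


def flatten_porter(result: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Turn a PorterResult-shaped dict into report-ready insight dicts."""
    insights = []
    for force in result.values():
        insights.append({
            "category": force["force"],  # report() groups on "category"
            "insight": force["insight"],
            "evidence": force["evidence"],
            "confidence": force["confidence"],
            "rating": force["rating"],  # Porter's uses "rating", not "impact"
        })
    return insights


# Illustrative input with a single force filled in.
sample = {
    "buyer_power": {
        "force": "Buyer Power",
        "rating": "Moderate",
        "insight": "Consumers have moderate bargaining power.",
        "evidence": "More manufacturers are entering the EV space.",
        "confidence": 0.7,
    }
}
rows = flatten_porter(sample)
```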


In [None]:

def report(state: BusinessAnalysisState) -> BusinessAnalysisState:
    """Generate formatted markdown report from insights (framework-agnostic)."""
    framework = state.get("framework", "swot").upper()
    company = state.get("company_name", "Unknown")
    goal = state.get("goal", {})
    insights = state.get("insights", [])

    # Framework-specific category definitions
    framework_categories = {
        "swot": {
            "Strength": [], "Weakness": [], "Opportunity": [], "Threat": []
        },
        "pestel": {
            "Political": [], "Economic": [], "Social": [],
            "Technological": [], "Environmental": [], "Legal": []
        },
        "porter_five_forces": {
            "Competitive Rivalry": [], "Threat of New Entrants": [],
            "Threat of Substitutes": [], "Supplier Power": [], "Buyer Power": []
        }
    }

    # Organize insights by category based on framework
    categories = framework_categories.get(state.get("framework", "swot"), {})
    for insight in insights:
        category = insight.get("category", "")
        if category in categories:
            categories[category].append(insight)

    # Build report
    report_md = f"# {framework} Analysis Report\n\n"
    report_md += f"**Company:** {company}\n"
    report_md += f"**Objective:** {goal.get('objective', 'Strategic assessment')}\n"
    report_md += "**Date:** Generated via Business Analysis Agent\n\n"
    report_md += "---\n\n"

    # Summary table
    report_md += "## Summary\n\n"
    report_md += "| Category | Count |\n"
    report_md += "|----------|-------|\n"
    for cat_name, cat_insights in categories.items():
        report_md += f"| {cat_name} | {len(cat_insights)} |\n"
    report_md += "\n---\n\n"

    # Detailed insights by category
    is_porter = state.get("framework", "swot") == "porter_five_forces"
    for cat_name, cat_insights in categories.items():
        if cat_insights:
            report_md += f"## {cat_name}\n\n"
            # Porter's uses "rating" instead of "impact"
            metric_header = "Rating" if is_porter else "Impact"
            report_md += f"| Insight | Evidence | Confidence | {metric_header} |\n"
            report_md += "|---------|----------|------------|--------|\n"

            for insight in cat_insights:
                insight_text = insight.get("insight", "")[:100] + ("..." if len(insight.get("insight", "")) > 100 else "")
                evidence = insight.get("evidence", "")[:80] + ("..." if len(insight.get("evidence", "")) > 80 else "")
                confidence = insight.get("confidence", 0.0)
                metric = insight.get("rating" if is_porter else "impact", "Unknown")
                report_md += f"| {insight_text} | {evidence} | {confidence:.2f} | {metric} |\n"

            report_md += "\n"

    report_md += "---\n\n"
    report_md += f"*Report generated with {len(insights)} total insights*\n"

    return {**state, "report_md": report_md}
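

The truncation used for table cells in `report` (100 characters for insights, 80 for evidence) can be pulled into a small helper. The sketch below mirrors that slicing. `truncate` and `table_cell` are hypothetical names, and the pipe escaping in `table_cell` is an extra that the original function does not do.

```python
def truncate(text: str, limit: int) -> str:
    """Cut text to `limit` characters, adding "..." only if something was removed."""
    return text[:limit] + ("..." if len(text) > limit else "")


def table_cell(text: str, limit: int) -> str:
    """Truncate, then escape pipes so the text cannot split a Markdown table row."""
    return truncate(text, limit).replace("|", "\\|")


short_cell = table_cell("Tesla is the most recognizable EV brand.", 80)
long_cell = table_cell("x" * 120, 100)
```

Without the escaping step, an insight containing `|` would be rendered as an extra column.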


In [None]:

(.venv) micahshull@Micahs-iMac LG_Cursor_009_BizStrategyAnalysis % python agents/test_mvp_runner.py
🧪 Running 4-node MVP smoke test...

✅ Initial state: {'company_name': 'Tesla', 'framework': 'swot'}

✅ After set_goal: framework=swot, goal=SWOT assessment

✅ After build_plan: plan steps=['collect_data', 'analyze', 'report']

✅ After collect: raw_sources=5 items

✅ After analyze: insights=8 items

✅ After report: report_md length=1975 chars

============================================================
📄 Final Report:
============================================================

✅ Smoke test passed! Nodes execute in sequence.




# SWOT Analysis Report

**Company:** Tesla
**Objective:** SWOT assessment
**Date:** Generated via Business Analysis Agent

---

## Summary

| Category | Count |
|----------|-------|
| Strength | 2 |
| Weakness | 2 |
| Opportunity | 2 |
| Threat | 2 |

---

## Strength

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Strong brand recognition as a leader in the electric vehicle (EV) market. | Tesla is the most recognizable EV brand globally, contributing to its competitiv... | 0.90 | Very High |
| Innovative technology and product offerings. | Tesla's energy-saving techniques and luxury-focused designs have set it apart fr... | 0.85 | High |

## Weakness

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Limited global operations compared to competitors. | Tesla's international presence is not as extensive as some traditional automaker... | 0.75 | High |
| Production and supply chain challenges. | Tesla has faced issues with scaling production to meet demand, impacting deliver... | 0.80 | Moderate |

## Opportunity

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Expansion into emerging markets, particularly India. | Tesla's strategy to enter the Indian market could significantly increase its cus... | 0.70 | High |
| Growth in renewable energy solutions. | Tesla's integrated renewable energy products align with global trends towards su... | 0.80 | Very High |

## Threat

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Intensifying competition in the EV market. | Traditional automakers are rapidly entering the EV space, increasing market comp... | 0.85 | Very High |
| Regulatory challenges and changing government policies. | Potential changes in EV incentives and regulations could impact Tesla's sales an... | 0.75 | Moderate |

---

*Report generated with 8 total insights*



In [None]:
"""Business Analysis Agent - LangGraph workflow."""
from langgraph.graph import StateGraph, END
from mvp_nodes import (
    BusinessAnalysisState,
    set_goal,
    build_plan,
    collect,
    analyze,
    report
)


def create_business_analysis_agent():
    """Create the business analysis agent workflow."""
    workflow = StateGraph(BusinessAnalysisState)

    # Add nodes
    workflow.add_node("set_goal", set_goal)
    workflow.add_node("build_plan", build_plan)
    workflow.add_node("collect", collect)
    workflow.add_node("analyze", analyze)
    workflow.add_node("report", report)

    # Linear flow
    workflow.set_entry_point("set_goal")
    workflow.add_edge("set_goal", "build_plan")
    workflow.add_edge("build_plan", "collect")
    workflow.add_edge("collect", "analyze")
    workflow.add_edge("analyze", "report")
    workflow.add_edge("report", END)

    return workflow.compile()


if __name__ == "__main__":
    # Example usage
    agent = create_business_analysis_agent()

    initial_state: BusinessAnalysisState = {
        "company_name": "Tesla",
        "framework": "swot"
    }

    print("🚀 Running Business Analysis Agent...\n")
    result = agent.invoke(initial_state)

    print("\n" + "=" * 60)
    print("📄 Final Report:")
    print("=" * 60)
    print(result.get("report_md", ""))
    print("=" * 60)




🚀 Running Business Analysis Agent...


============================================================
📄 Final Report:
============================================================
# PESTEL Analysis Report

**Company:** Tesla
**Objective:** SWOT assessment
**Date:** Generated via Business Analysis Agent

---

## Summary

| Category | Count |
|----------|-------|
| Political Factors | 2 |
| Economic Factors | 2 |
| Social Factors | 2 |
| Technological Factors | 2 |
| Environmental Factors | 2 |
| Legal Factors | 2 |

---

## Political

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Government incentives for electric vehicles (EVs) support Tesla's growth. | Political support for green technologies enhances Tesla's market opportunities. | 0.80 | High |
| Political stability in key markets is crucial for Tesla's operations. | Tesla's global operations are influenced by trade policies and regulations. | 0.70 | Moderate |

## Economic

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Rising fuel prices increase demand for electric vehicles. | Economic trends show consumers shifting towards EVs as fuel costs rise. | 0.75 | High |
| Global supply chain disruptions affect production costs. | Recent economic conditions have led to increased material costs for manufacturer... | 0.60 | Moderate |

## Social

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Growing environmental awareness drives consumer preference for EVs. | Social trends indicate a shift towards sustainable transportation options. | 0.85 | High |
| Changing demographics favor younger consumers who prioritize sustainability. | Younger generations are more inclined to adopt electric vehicles. | 0.70 | Moderate |

## Technological

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Advancements in battery technology enhance Tesla's competitive edge. | Tesla's battery efficiency is among the best in the industry. | 0.90 | High |
| Continuous innovation in autonomous driving technology is critical. | Technological advancements are essential for maintaining market leadership. | 0.80 | High |

## Environmental

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Regulatory pressures for sustainability impact production processes. | Ecological trends dictate material availability and production methods. | 0.75 | Moderate |
| Climate change initiatives create opportunities for EV market expansion. | Government policies increasingly favor clean energy solutions. | 0.80 | High |

## Legal

| Insight | Evidence | Confidence | Impact |
|---------|----------|------------|--------|
| Compliance with varying regulations across jurisdictions is essential. | Tesla operates in multiple regions with different legal frameworks. | 0.70 | Moderate |
| Intellectual property laws protect Tesla's innovations. | Legal protections are crucial for maintaining competitive advantages. | 0.65 | Moderate |

---

*Report generated with 12 total insights*



🚀 Running Business Analysis Agent...


============================================================
📄 Final Report:
============================================================
# PORTER_FIVE_FORCES Analysis Report

**Company:** Tesla
**Objective:** SWOT assessment
**Date:** Generated via Business Analysis Agent

---

## Summary

| Category | Count |
|----------|-------|
| Competitive Rivalry | 1 |
| Threat of New Entrants | 1 |
| Threat of Substitutes | 1 |
| Supplier Power | 1 |
| Buyer Power | 1 |

---

## Competitive Rivalry

| Insight | Evidence | Confidence | Rating |
|---------|----------|------------|--------|
| Tesla faces intense competition from established automotive manufacturers and new entrants in the EV... | The automotive industry is increasingly shifting towards electric vehicles, with... | 0.80 | High |

## Threat of New Entrants

| Insight | Evidence | Confidence | Rating |
|---------|----------|------------|--------|
| While the capital requirements and technology barriers are high, the growing demand for EVs attracts... | The entry of new companies into the EV market, despite high initial costs, indic... | 0.70 | Moderate |

## Threat of Substitutes

| Insight | Evidence | Confidence | Rating |
|---------|----------|------------|--------|
| Alternative fuel vehicles and public transportation options pose a moderate threat to Tesla's market... | The rise of hybrid vehicles and improvements in public transport infrastructure ... | 0.60 | Moderate |

## Supplier Power

| Insight | Evidence | Confidence | Rating |
|---------|----------|------------|--------|
| Tesla relies on a limited number of suppliers for critical components, giving suppliers moderate bar... | The specialized nature of EV components, such as batteries, increases supplier p... | 0.70 | Moderate |

## Buyer Power

| Insight | Evidence | Confidence | Rating |
|---------|----------|------------|--------|
| Consumers have moderate bargaining power due to increasing options in the EV market. | As more manufacturers enter the EV space, consumers can choose from a wider rang... | 0.70 | Moderate |

---

*Report generated with 5 total insights*




## Why this approach worked well

1. **Documentation-first**
   - Reviewed the SWOT analysis and lessons learned first
   - Defined specs before coding
   - Result: clear direction and fewer course corrections

2. **Progressive complexity**
   - Started with minimal nodes (5–10 lines)
   - Verified with the tiny runner before LangGraph
   - Generalized only after MVP was working
   - Result: caught contract issues early

3. **Environment organization early**
   - Set up folder structure and requirements upfront
   - Centralized templates to avoid redundancy
   - Result: less refactoring later

4. **Incremental testing**
   - Tested each node manually in sequence
   - Fixed issues immediately (Pydantic, pluralization)
   - Result: smooth LangGraph wiring

5. **Right-time generalization**
   - Got SWOT working first, then added frameworks
   - Result: patterns emerged naturally

**Bottom line:** You followed a methodical approach: plan → build → test → generalize. The "tiny runner" pattern was valuable because it let you verify contracts before LangGraph complexity.

This approach is repeatable and should scale to more complex agents. The guide now documents these patterns for future builds.
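
The contract checking that the tiny runner did by eye (printing each key after each node) can also be made explicit. The following is a minimal sketch under assumptions: `run_with_contracts` is a hypothetical helper, and the two nodes are stubs standing in for the real ones in `mvp_nodes`.

```python
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Node = Callable[[State], State]


def run_with_contracts(state: State, steps: List[Tuple[Node, str]]) -> State:
    """Run nodes in order, checking that each one sets the key it promises."""
    for node, promised_key in steps:
        state = node(state)
        if promised_key not in state:
            raise KeyError(f"{node.__name__} did not set '{promised_key}'")
    return state


# Stub nodes; the real ones call Tavily and OpenAI.
def set_goal(state: State) -> State:
    return {**state, "goal": {"objective": "SWOT assessment"}}


def build_plan(state: State) -> State:
    return {**state, "plan": [{"step": "collect_data"}]}


final = run_with_contracts(
    {"company_name": "Tesla", "framework": "swot"},
    [(set_goal, "goal"), (build_plan, "plan")],
)
```

A node that forgets its key fails at the step that caused the problem, rather than surfacing later as an empty report.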

