# ChatRoutes AutoBranch - Getting Started Demo

**Intelligent branch exploration for LLM-powered applications**

[![PyPI version](https://badge.fury.io/py/chatroutes-autobranch.svg)](https://badge.fury.io/py/chatroutes-autobranch)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)

---

## What is ChatRoutes AutoBranch?

ChatRoutes AutoBranch helps you:
- 🎯 **Select the best responses** from multiple LLM outputs (beam search)
- 🌈 **Ensure diversity** in responses (avoid repetition)
- 🛑 **Know when to stop** exploring (entropy-based convergence)
- 💰 **Control costs** with budget management (tokens, time, nodes)

Perfect for:
- Tree-of-thought reasoning
- Multi-agent systems
- Creative writing
- Question answering
- Any LLM application that explores multiple paths

---

## This Demo

In this notebook, we'll cover:
1. Installation
2. Basic beam search
3. Scoring strategies
4. Novelty filtering
5. Complete pipeline example

**Time**: ~5 minutes  
**Level**: Beginner-friendly

## Step 1: Installation

Install ChatRoutes AutoBranch from PyPI:

In [None]:
!pip install -q chatroutes-autobranch
print("✅ Installation complete!")
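
To confirm that the package landed in the active kernel, a quick check along these lines can help. It relies only on the standard library's `importlib.metadata`; the helper name `installed_version` is made up for this sketch and is not part of the library.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("chatroutes-autobranch")
print(f"chatroutes-autobranch: {found or 'not installed in this environment'}")
```

If the result is "not installed", restarting the runtime after the `pip install` cell is usually the first thing to try.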

## Step 2: Import the Library

Let's import the key components:

In [None]:
from chatroutes_autobranch import (
    # Core components
    BranchSelector,
    BeamSelector,
    Candidate,
    ScoredCandidate,
    
    # Scoring
    CompositeScorer,
    
    # Novelty filtering
    CosineNoveltyFilter,
    MMRNoveltyFilter,
    
    # Budget management
    Budget,
    BudgetManager,
    
    # Utilities
    DummyEmbeddingProvider,
)

print("✅ Imports successful!")
print("\nKey components:")
print("  • BranchSelector - Main orchestrator")
print("  • BeamSelector - Keeps top-K candidates")
print("  • Candidate - Represents a branch/response")
print("  • CompositeScorer - Combines multiple scoring strategies")
print("  • NoveltyFilters - Remove similar/redundant responses")

## Example 1: Basic Beam Search

Let's start with a simple example: selecting the top 3 responses from 5 candidates.

**Scenario**: You asked an LLM to generate story openings, and got 5 responses. You want to keep only the best 3.

In [None]:
# Step 1: Create a simple beam selector (keep top 3)
beam = BeamSelector(k=3)

# Step 2: Create the main selector
selector = BranchSelector(beam_selector=beam)

# Step 3: Define your candidates
# In real use, these would come from your LLM with logprobs/scores
parent = Candidate(
    id="prompt",
    text="Write a story opening about a detective",
    meta={"logprobs": -0.1}
)

candidates = [
    Candidate(id="c1", text="Detective Sarah Chen walked into the dimly lit office...", meta={"logprobs": -0.5}),
    Candidate(id="c2", text="The rain hammered against the window as Detective Miller reviewed the case...", meta={"logprobs": -0.3}),
    Candidate(id="c3", text="It was another cold morning when Detective Rodriguez found the first clue...", meta={"logprobs": -0.8}),
    Candidate(id="c4", text="Detective Park had seen many cases, but this one was different...", meta={"logprobs": -0.4}),
    Candidate(id="c5", text="The phone rang at 3 AM. Detective Thompson knew it was trouble...", meta={"logprobs": -0.2}),
]

# Step 4: Select the best 3
result = selector.step(parent, candidates)

# Display results
print("="*60)
print("BEAM SEARCH RESULTS")
print("="*60)
print(f"\nInput: {len(candidates)} candidates")
print(f"Output: {len(result.kept)} selected (top {beam.k})\n")

print("✅ SELECTED (Top 3 by confidence):")
for i, candidate in enumerate(result.kept, 1):
    score = candidate.meta.get('logprobs', 'N/A')
    print(f"\n{i}. [{candidate.id}] Score: {score}")
    print(f"   {candidate.text}")

print(f"\n❌ FILTERED OUT ({len(result.pruned)}):")
for candidate in result.pruned:
    score = candidate.meta.get('logprobs', 'N/A')
    print(f"   [{candidate.id}] Score: {score}")
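
To make the selection step concrete, here is a library-free sketch of top-K selection. It assumes that a selector without a scorer simply ranks candidates by `meta["logprobs"]`, which the "Top 3 by confidence" label above suggests but this notebook does not confirm. `Item` and `top_k_by_logprob` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical stand-in for a candidate; not the library's Candidate class."""
    id: str
    meta: dict = field(default_factory=dict)


def top_k_by_logprob(items, k):
    """Split items into (kept, pruned), ranked by descending logprob."""
    ranked = sorted(items, key=lambda it: it.meta["logprobs"], reverse=True)
    return ranked[:k], ranked[k:]


# Same ids and logprobs as the five story openings above
items = [
    Item("c1", {"logprobs": -0.5}),
    Item("c2", {"logprobs": -0.3}),
    Item("c3", {"logprobs": -0.8}),
    Item("c4", {"logprobs": -0.4}),
    Item("c5", {"logprobs": -0.2}),
]

kept, pruned = top_k_by_logprob(items, k=3)
print([it.id for it in kept])    # ['c5', 'c2', 'c4']
print([it.id for it in pruned])  # ['c1', 'c3']
```

Logprobs closer to zero mean higher model confidence, so sorting in descending order puts the most confident responses first.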

## Example 2: Multi-Strategy Scoring

Instead of just using raw scores, let's combine multiple factors:
- **Confidence** (from logprobs)
- **Relevance** (semantic similarity to prompt)
- **Novelty** (how different from other candidates)

Weighting these signals lets the selector trade raw model confidence against relevance and diversity.

In [None]:
# Set up the embedding provider (used for semantic similarity)
embedding_provider = DummyEmbeddingProvider(dimension=64, seed=42)

# Create a composite scorer with weighted strategies
scorer = CompositeScorer(
    weights={
        "confidence": 0.4,  # 40% weight on LLM confidence
        "relevance": 0.4,   # 40% weight on relevance to prompt
        "novelty": 0.2,     # 20% weight on being different
    },
    embedding_provider=embedding_provider
)

# Create beam with scoring
beam = BeamSelector(scorer=scorer, k=3)
selector = BranchSelector(beam_selector=beam)

# Same candidates as before
result = selector.step(parent, candidates)

print("="*60)
print("MULTI-STRATEGY SCORING RESULTS")
print("="*60)
print("\nScoring Strategy:")
print("  • 40% Confidence (logprobs from LLM)")
print("  • 40% Relevance (similarity to prompt)")
print("  • 20% Novelty (uniqueness)\n")

print("✅ SELECTED (Top 3 by composite score):")
for i, candidate in enumerate(result.kept, 1):
    print(f"\n{i}. [{candidate.id}]")
    print(f"   {candidate.text[:60]}...")
    if hasattr(candidate, 'score'):
        print(f"   Final Score: {candidate.score:.3f}")

print("\n💡 Notice: Results may differ from Example 1!")
print("   We're now considering relevance and novelty, not just confidence.")
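
How a weighted combination can reorder candidates is easiest to see on toy data. The sketch below is not `CompositeScorer`'s actual implementation. It assumes confidence is `exp(logprob)`, relevance is cosine similarity to the prompt vector, and novelty is one minus the highest similarity to any other candidate. The 2-D vectors and the function names are invented for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def composite_scores(prompt_vec, items, weights):
    """items maps name -> (vector, logprob); returns name -> weighted score."""
    scores = {}
    for name, (vec, logprob) in items.items():
        confidence = math.exp(logprob)  # assumption: maps logprob <= 0 into (0, 1]
        relevance = cosine(vec, prompt_vec)
        others = [cosine(vec, v) for n, (v, _) in items.items() if n != name]
        novelty = 1.0 - max(others, default=0.0)
        scores[name] = (
            weights["confidence"] * confidence
            + weights["relevance"] * relevance
            + weights["novelty"] * novelty
        )
    return scores


prompt_vec = (1.0, 0.0)
items = {
    "a": ((1.0, 0.0), -0.5),  # on-topic, least confident
    "b": ((0.0, 1.0), -0.2),  # off-topic, most confident
    "c": ((1.0, 1.0), -0.3),  # in between on both
}
weights = {"confidence": 0.4, "relevance": 0.4, "novelty": 0.2}

scores = composite_scores(prompt_vec, items, weights)
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Ranked by confidence alone, the order would be `b`, `c`, `a`. With the weights above it becomes `a`, `c`, `b`, because `b` has no relevance to the prompt.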

## Example 3: Novelty Filtering (Remove Duplicates)

Sometimes LLMs generate similar responses. Let's filter out near-duplicates using:
- **Cosine similarity** - Remove responses that are too similar
- **MMR (Maximal Marginal Relevance)** - Balance relevance and diversity
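
The first approach, threshold-based filtering, can be sketched as a greedy pass that drops any candidate too close to one already kept. This is a simplified illustration on hand-picked 2-D vectors, not the internals of `CosineNoveltyFilter`. The strict `>` comparison and the name `filter_near_duplicates` are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_near_duplicates(vectors, threshold):
    """vectors: list of (name, vec) in priority order. Returns (kept, dropped) names."""
    kept, dropped = [], []
    for name, vec in vectors:
        # Drop the candidate if it is too close to anything already kept
        if any(cosine(vec, kept_vec) > threshold for _, kept_vec in kept):
            dropped.append(name)
        else:
            kept.append((name, vec))
    return [name for name, _ in kept], dropped


vectors = [
    ("c1", (1.0, 0.0)),
    ("c2", (0.98, 0.2)),  # nearly parallel to c1
    ("c3", (0.0, 1.0)),   # orthogonal to c1
    ("c4", (0.9, 0.3)),   # also close to c1
]

kept, dropped = filter_near_duplicates(vectors, threshold=0.80)
print(kept)     # ['c1', 'c3']
print(dropped)  # ['c2', 'c4']
```

Because the pass is greedy, input order matters: whichever of two near-duplicates comes first is the one that survives.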

In [None]:
# Create candidates with some duplicates
candidates_with_dupes = [
    Candidate(id="c1", text="The sky is blue and beautiful today", meta={"logprobs": -0.2}),
    Candidate(id="c2", text="The sky is blue and lovely today", meta={"logprobs": -0.3}),  # Very similar to c1
    Candidate(id="c3", text="Quantum computers use superposition", meta={"logprobs": -0.4}),
    Candidate(id="c4", text="The weather is nice with blue skies", meta={"logprobs": -0.25}),  # Similar to c1
    Candidate(id="c5", text="Machine learning enables pattern recognition", meta={"logprobs": -0.35}),
]

parent = Candidate(id="prompt", text="Tell me something interesting")

print("="*60)
print("NOVELTY FILTERING DEMO")
print("="*60)
print(f"\nInput: {len(candidates_with_dupes)} candidates (some are similar)\n")

# Method 1: Cosine Similarity Filter
print("\n📊 Method 1: COSINE SIMILARITY FILTER")
print("   Remove candidates with similarity > 80%\n")

cosine_filter = CosineNoveltyFilter(
    threshold=0.80,  # Drop a candidate once its similarity to a kept one exceeds 0.80
    embedding_provider=embedding_provider
)

beam_cosine = BeamSelector(scorer=scorer, k=5, novelty_filter=cosine_filter)
selector_cosine = BranchSelector(beam_selector=beam_cosine)
result_cosine = selector_cosine.step(parent, candidates_with_dupes)

print(f"✅ Kept: {len(result_cosine.kept)} unique responses")
for c in result_cosine.kept:
    print(f"   • [{c.id}] {c.text}")

print(f"\n❌ Filtered: {len(result_cosine.pruned)} duplicates")
for c in result_cosine.pruned:
    print(f"   • [{c.id}] {c.text[:50]}... (too similar)")

# Method 2: MMR (Maximal Marginal Relevance)
print("\n\n📊 Method 2: MMR (BALANCED APPROACH)")
print("   Balance between relevance (70%) and diversity (30%)\n")

mmr_filter = MMRNoveltyFilter(
    lambda_param=0.7,  # 70% relevance, 30% diversity
    embedding_provider=embedding_provider
)

beam_mmr = BeamSelector(scorer=scorer, k=5, novelty_filter=mmr_filter)
selector_mmr = BranchSelector(beam_selector=beam_mmr)
result_mmr = selector_mmr.step(parent, candidates_with_dupes)

print(f"✅ Kept: {len(result_mmr.kept)} diverse responses")
for c in result_mmr.kept:
    print(f"   • [{c.id}] {c.text}")

print("\n💡 MMR ensures both quality AND diversity!")
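
For intuition about what `lambda_param` controls, here is a minimal sketch of the textbook MMR selection rule: at each step, pick the candidate that maximizes `lambda * relevance - (1 - lambda) * max_similarity_to_selected`. It is not `MMRNoveltyFilter`'s implementation, which may differ in detail. The vectors and the function name `mmr_order` are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_order(query_vec, items, lambda_param, k):
    """items maps name -> vector. Returns up to k names in MMR selection order."""
    remaining = dict(items)
    selected = []
    while remaining and len(selected) < k:
        def mmr_score(name):
            relevance = cosine(remaining[name], query_vec)
            # Similarity to the closest already-selected item (0 if none yet)
            redundancy = max(
                (cosine(remaining[name], items[s]) for s in selected),
                default=0.0,
            )
            return lambda_param * relevance - (1 - lambda_param) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        del remaining[best]
    return selected


query_vec = (1.0, 1.0)
items = {
    "c1": (1.0, 0.25),  # most relevant
    "c2": (1.0, 0.2),   # near-duplicate of c1
    "c3": (0.1, 1.0),   # less relevant, but different from c1
}

print(mmr_order(query_vec, items, lambda_param=0.7, k=2))  # ['c1', 'c3']
print(mmr_order(query_vec, items, lambda_param=1.0, k=2))  # ['c1', 'c2']
```

With `lambda_param=1.0` the rule reduces to plain relevance ranking, so the near-duplicate `c2` is picked second. At `0.7` the redundancy penalty pushes `c3` ahead of it.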

## Example 4: Complete Pipeline with Budget Control

Let's put it all together with:
- ✅ Beam search (top K selection)
- ✅ Multi-strategy scoring
- ✅ Novelty filtering
- ✅ Budget management (prevent runaway costs)

In [None]:
print("="*60)
print("COMPLETE PIPELINE EXAMPLE")
print("="*60)

# Set up components
embedding_provider = DummyEmbeddingProvider(dimension=64, seed=42)

# 1. Scorer: Weighted strategies
scorer = CompositeScorer(
    weights={
        "confidence": 0.5,
        "relevance": 0.3,
        "novelty": 0.2,
    },
    embedding_provider=embedding_provider
)

# 2. Novelty filter: Remove near-duplicates
novelty_filter = MMRNoveltyFilter(
    lambda_param=0.6,  # 60% relevance, 40% diversity
    embedding_provider=embedding_provider
)

# 3. Budget: Control costs
budget = Budget(
    max_tokens=10000,    # Stop at 10K tokens
    max_nodes=20,        # Max 20 branches explored
    max_time_seconds=60  # Max 60 seconds
)
budget_manager = BudgetManager(budget=budget)

# 4. Beam: Top-3 selection
beam = BeamSelector(
    scorer=scorer,
    k=3,
    novelty_filter=novelty_filter
)

# 5. Main selector
selector = BranchSelector(
    beam_selector=beam,
    budget_manager=budget_manager
)

# Test scenario: Question answering
parent = Candidate(
    id="question",
    text="What are the benefits of exercise?",
    meta={"logprobs": -0.1}
)

candidates = [
    Candidate(id="c1", text="Exercise improves cardiovascular health and reduces disease risk", meta={"logprobs": -0.2}),
    Candidate(id="c2", text="Regular physical activity boosts heart health and lowers illness", meta={"logprobs": -0.3}),  # Similar to c1
    Candidate(id="c3", text="Working out enhances mental well-being and reduces stress", meta={"logprobs": -0.25}),
    Candidate(id="c4", text="Physical fitness increases energy levels and improves sleep quality", meta={"logprobs": -0.35}),
    Candidate(id="c5", text="Exercise helps with weight management and muscle building", meta={"logprobs": -0.4}),
    Candidate(id="c6", text="Regular workouts strengthen bones and improve flexibility", meta={"logprobs": -0.45}),
]

# Run the pipeline
result = selector.step(parent, candidates)

# Display results
print(f"\n📝 Question: {parent.text}")
print("\n📊 Pipeline Configuration:")
print("   • Beam Size: Top-3 selection")
print("   • Scoring: 50% confidence, 30% relevance, 20% novelty")
print("   • Novelty: MMR with lambda=0.6 (60% relevance, 40% diversity)")
print("   • Budget: 10K tokens, 20 nodes, 60 seconds")

print(f"\n✅ SELECTED RESPONSES ({len(result.kept)}/{len(candidates)}):")
for i, candidate in enumerate(result.kept, 1):
    print(f"\n{i}. [{candidate.id}]")
    print(f"   {candidate.text}")
    if hasattr(candidate, 'score'):
        print(f"   Score: {candidate.score:.3f}")

print(f"\n❌ FILTERED OUT ({len(result.pruned)}):")
for candidate in result.pruned:
    print(f"   • [{candidate.id}] {candidate.text[:50]}...")

# Budget status
print("\n💰 Budget Status:")
print(f"   • Tokens used: {budget_manager.tokens_used} / {budget.max_tokens}")
print(f"   • Nodes explored: {budget_manager.nodes_explored} / {budget.max_nodes}")
print(f"   • Within budget: {'✅ Yes' if not budget_manager.is_exhausted() else '❌ No'}")

print("\n" + "="*60)
print("✅ Pipeline complete! High-quality, diverse results.")
print("="*60)
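
The budget idea generalizes to multi-step exploration: check the limits before each expansion and stop once any of them is reached. The sketch below uses a hypothetical `SimpleBudget` class written for this example. It mirrors the three limits configured above but is not the library's `Budget` or `BudgetManager`, and the per-step token and node counts are invented.

```python
import time
from dataclasses import dataclass, field


@dataclass
class SimpleBudget:
    """Hypothetical tracker for illustration; not the library's BudgetManager."""
    max_tokens: int
    max_nodes: int
    max_time_seconds: float
    tokens_used: int = 0
    nodes_explored: int = 0
    started_at: float = field(default_factory=time.monotonic)

    def record(self, tokens, nodes=1):
        """Add the cost of one expansion step to the running totals."""
        self.tokens_used += tokens
        self.nodes_explored += nodes

    def is_exhausted(self):
        """True once any single limit has been reached."""
        elapsed = time.monotonic() - self.started_at
        return (
            self.tokens_used >= self.max_tokens
            or self.nodes_explored >= self.max_nodes
            or elapsed >= self.max_time_seconds
        )


demo_budget = SimpleBudget(max_tokens=10_000, max_nodes=20, max_time_seconds=60)
steps = 0
while not demo_budget.is_exhausted():
    demo_budget.record(tokens=1_500, nodes=3)  # pretend each step expands 3 branches
    steps += 1

print(steps, demo_budget.tokens_used, demo_budget.nodes_explored)  # 7 10500 21
```

Because the check runs before each step, the final step can overshoot the limits slightly. A stricter design would estimate a step's cost up front and refuse to start it.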

## Summary

In this demo, you learned:

### ✅ What We Covered

1. **Basic Beam Search** - Select top-K responses
2. **Multi-Strategy Scoring** - Combine confidence, relevance, novelty
3. **Novelty Filtering** - Remove duplicates and ensure diversity
4. **Complete Pipeline** - Scoring, novelty filtering, and budget control in one setup

### 🎯 Key Components

| Component | Purpose |
|-----------|----------|
| `BranchSelector` | Main orchestrator |
| `BeamSelector` | Top-K selection |
| `CompositeScorer` | Multi-strategy scoring |
| `CosineNoveltyFilter`, `MMRNoveltyFilter` | Diversity control |
| `BudgetManager` | Cost control |

### 📚 Next Steps

Ready to dive deeper?

1. **Advanced Demo**: Check out the [Creative Writing Demo](https://colab.research.google.com/github/chatroutes/chatroutes-autobranch/blob/master/notebooks/creative_writing_colab.ipynb)
   - Full end-to-end example with Ollama
   - Real LLM integration
   - Multi-turn branching

2. **Documentation**:
   - [GitHub Repository](https://github.com/chatroutes/chatroutes-autobranch)
   - [Quick Start Guide](https://github.com/chatroutes/chatroutes-autobranch/blob/master/QUICKSTART.md)
   - [Examples](https://github.com/chatroutes/chatroutes-autobranch/tree/master/examples)

3. **Use Cases**:
   - Tree-of-thought reasoning
   - Multi-agent systems
   - Creative writing assistants
   - Question answering systems
   - Code generation with multiple solutions

### 💡 Pro Tips

- Start with simple beam search, add complexity as needed
- Use MMR when you want to balance quality against diversity
- Always set budgets to prevent runaway costs
- Experiment with different scoring weights for your use case

---

### 🚀 Install in Your Project

```bash
pip install chatroutes-autobranch
```

### 📧 Get Help

- **Issues**: https://github.com/chatroutes/chatroutes-autobranch/issues
- **Discussions**: https://github.com/chatroutes/chatroutes-autobranch/discussions
- **Email**: hello@chatroutes.com

---

**Happy branching!** 🌳