# Deep Dive: Advanced Provider Integrations

This notebook covers advanced features beyond the basics in `quick_start.ipynb`.

**What you will learn:**

| Section | Feature |
|---|---|
| 1 | LiteLLM integration for multi-provider tracing |
| 2 | LangChain integration with a RAG chain, plus manually added tool and few-shot components |
| 3 | Chroma integration with `ContextResource.from_chroma()` |
| 4 | Gesture-based interactions (hover, click, edit, drag) |
| 5 | Context budget analysis |

**Prerequisites:** Complete `quick_start.ipynb` first for basics on ContextTrace, ContextBuilder, and ContextResource.

In [1]:
# Setup: works with pip install OR local development
import sys
from pathlib import Path

# Try importing the package - if it fails, add parent directory to path
try:
    import context_engineering_dashboard
    print(f"Using installed package: {context_engineering_dashboard.__file__}")
except ImportError:
    # Running from local clone - add parent directory to path
    repo_root = Path().resolve().parent
    if str(repo_root) not in sys.path:  # sys.path holds strings, so compare as str
        sys.path.insert(0, str(repo_root))
    print(f"Using local development: {repo_root}")

# Installation options (run one if package not found):
# !pip install 'context-engineering-dashboard[all]'     # All providers
# !pip install 'context-engineering-dashboard[litellm]' # LiteLLM only
# !pip install 'context-engineering-dashboard[langchain]' # LangChain only
# !pip install 'context-engineering-dashboard[chroma]'  # Chroma only

Using local development: /workspace


In [2]:
import os
from dotenv import load_dotenv

load_dotenv()  # loads variables from .env (including OPENAI_API_KEY) into os.environ

True
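
The tracing examples below call OpenAI, so a missing key would otherwise only surface at the first API call. A minimal sketch of an early check, assuming `OPENAI_API_KEY` is the only variable needed; the helper name `missing_env_vars` is hypothetical and not part of the dashboard package:

```python
import os

def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    # Hypothetical helper for illustration; defaults to the process environment.
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Passing `env` explicitly keeps the check easy to test without touching the real environment.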

---
## 1 | LiteLLM Integration

LiteLLM provides a unified API for 100+ LLM providers. The `trace_litellm()` 
tracer captures calls regardless of which backend you use.

Supported providers include:
- OpenAI (`gpt-4o`, `gpt-4o-mini`)
- Anthropic (`anthropic/claude-3-sonnet`)
- Google (`gemini/gemini-pro`)
- Azure OpenAI (`azure/gpt-4`)
- And many more...

In [3]:
import litellm
from context_engineering_dashboard import trace_litellm, ContextBuilder

# Using OpenAI via LiteLLM (bare model name defaults to OpenAI)
with trace_litellm() as tracer:
    response = litellm.completion(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are an expert on context engineering."},
            {"role": "user", "content": "What are the key components of an LLM context window? Answer in 2-3 sentences."},
        ],
        temperature=0.7,
    )

print("Response:", response.choices[0].message.content)

Response: The key components of an LLM context window include the input tokens, which represent the text or data processed by the model, and the maximum token limit, which defines the size of the context window and determines how much information the model can consider at once. Additionally, the context window includes any relevant metadata or prompt instructions that guide the generation process, such as task-specific instructions or formatting guidelines.


In [4]:
# Inspect the captured trace
litellm_trace = tracer.result

print(f"Provider:    {litellm_trace.trace.provider}")
print(f"Model:       {litellm_trace.trace.model}")
print(f"Prompt:      {litellm_trace.trace.usage.get('prompt_tokens', '?')} tokens")
print(f"Completion:  {litellm_trace.trace.usage.get('completion_tokens', '?')} tokens")
print(f"Latency:     {litellm_trace.trace.latency_ms:.0f} ms")

Provider:    openai
Model:       gpt-4o
Prompt:      39 tokens
Completion:  78 tokens
Latency:     2209 ms
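
The usage and latency fields can be combined into derived metrics. A rough sketch using the numbers printed above; `tokens_per_second` is a hypothetical helper, and because the latency covers the whole request (network and prompt processing included), the result only approximates generation speed end to end:

```python
def tokens_per_second(completion_tokens: int, latency_ms: float) -> float:
    """End-to-end completion throughput; returns 0.0 when latency is not positive."""
    if latency_ms <= 0:
        return 0.0
    return completion_tokens / (latency_ms / 1000.0)

# Values copied from the trace output above.
prompt_tokens, completion_tokens, latency_ms = 39, 78, 2209
print(f"Total tokens: {prompt_tokens + completion_tokens}")
print(f"Throughput:   {tokens_per_second(completion_tokens, latency_ms):.1f} tokens/s")
```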


In [5]:
# Visualize the context window
ContextBuilder(trace=litellm_trace)

### Provider Prefixes

LiteLLM uses a `provider/model-name` format to route requests to different backends:

| Model String | Provider |
|---|---|
| `gpt-4o` | OpenAI (default) |
| `anthropic/claude-3-opus` | Anthropic |
| `azure/gpt-4` | Azure OpenAI |
| `bedrock/anthropic.claude-v2` | AWS Bedrock |
| `gemini/gemini-pro` | Google |

The tracer automatically extracts the provider name for visualization.
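
To make the prefix convention concrete, here is a minimal sketch of splitting a model string into provider and model name. It is not the tracer's or LiteLLM's actual implementation (LiteLLM's real routing handles more cases); `split_model_string` and its `default_provider` parameter are hypothetical names used only for illustration:

```python
def split_model_string(model: str, default_provider: str = "openai"):
    """Split 'provider/model-name' into (provider, model_name)."""
    provider, sep, name = model.partition("/")
    if not sep:
        # No prefix present: treat the whole string as the model name.
        return default_provider, model
    # Only the first slash separates the provider; the rest stays in the name.
    return provider, name

for model in ["gpt-4o", "anthropic/claude-3-opus", "bedrock/anthropic.claude-v2"]:
    print(model, "->", split_model_string(model))
```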

---
## 2 | LangChain Integration with a RAG Chain

The `trace_langchain()` tracer integrates with LangChain's callback system to 
capture chain executions. This section demonstrates a RAG pipeline with:

- **RAG documents** from a retriever (automatically captured)
- **Tool definitions** (manually added to demonstrate the pattern)
- **Few-shot examples** (manually added to demonstrate the pattern)

> **Note:** The tracer automatically captures retriever results as RAG components. 
> Tool and Example components are shown here to demonstrate how to extend traces 
> with custom components for complete context visibility.

In [8]:
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import Chroma as LangChainChroma
from langchain_core.documents import Document

# Sample documents about context engineering

docs = [
    Document(
        page_content="Context engineering optimizes what information goes into an LLM's context window.",
        metadata={"source": "intro.md", "topic": "definition"}
    ),
    Document(
        page_content="RAG (Retrieval-Augmented Generation) retrieves relevant documents before LLM calls.",
        metadata={"source": "rag_guide.md", "topic": "rag"}
    ),
    Document(
        page_content="Tool outputs consume tokens and should be concise to preserve context budget.",
        metadata={"source": "tools_guide.md", "topic": "tools"}
    ),
    Document(
        page_content="Few-shot examples help the model understand the expected output format.",
        metadata={"source": "examples_guide.md", "topic": "examples"}
    ),
]

# Create in-memory vector store
vectorstore = LangChainChroma.from_documents(docs, OpenAIEmbeddings())
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

print(f"Created vectorstore with {len(docs)} documents")

Created vectorstore with 4 documents


In [9]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from context_engineering_dashboard import trace_langchain

# Simple RAG chain
prompt = ChatPromptTemplate.from_template("""
Answer the question based on the context below.

Context: {context}

Question: {question}
""")

llm = ChatOpenAI(model="gpt-4o", temperature=0)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Trace the chain execution
with trace_langchain() as tracer:
    result = chain.invoke(
        "What is context engineering?",
        config={"callbacks": [tracer.handler]}
    )

print("Answer:", result)

Answer: Context engineering is the process of optimizing the information that is included in a language model's context window. This involves carefully selecting and structuring the input data to ensure that the most relevant and useful information is provided to the model, thereby maximizing its performance and efficiency. The goal is to make the best use of the available context budget, which is limited by the number of tokens the model can process at once.


In [10]:
# Inspect the captured trace
langchain_trace = tracer.result

print(f"Captured {len(langchain_trace.components)} components:")
for c in langchain_trace.components:
    print(f"  {c.id}: {c.type.value}, {c.token_count} tokens")

Captured 3 components:
  rag_0: rag, 15 tokens
  rag_1: rag, 13 tokens
  prompt_0: user_message, 49 tokens


In [11]:
# Visualize the RAG components captured from the retriever
ContextBuilder(trace=langchain_trace)

### Extending Traces with Tool and Example Components

The LangChain tracer captures RAG documents automatically via retriever callbacks.
To visualize the complete context including tool definitions and few-shot examples,
you can add them manually:

In [12]:
from context_engineering_dashboard import (
    ContextComponent,
    ComponentType,
    ContextTrace,
)

# Tool definitions that would be sent to the LLM
tool_components = [
    ContextComponent(
        id="tool_search",
        type=ComponentType.TOOL,
        content="search_knowledge_base(query: str) -> str: Search the knowledge base for relevant information.",
        token_count=18,
        metadata={"tool_name": "search_knowledge_base"}
    ),
    ContextComponent(
        id="tool_calculate",
        type=ComponentType.TOOL,
        content="calculate_tokens(text: str) -> int: Count the number of tokens in a text string.",
        token_count=16,
        metadata={"tool_name": "calculate_tokens"}
    ),
]

# Few-shot examples for the agent
example_components = [
    ContextComponent(
        id="example_1",
        type=ComponentType.EXAMPLE,
        content="Q: How do I optimize my context window?\nA: Start by measuring token usage per component, then prioritize high-value content.",
        token_count=30,
        metadata={"category": "optimization"}
    ),
    ContextComponent(
        id="example_2",
        type=ComponentType.EXAMPLE,
        content="Q: When should I use summarization?\nA: When chat history exceeds 20% of your context budget.",
        token_count=25,
        metadata={"category": "history_management"}
    ),
]

print(f"Created {len(tool_components)} tool components")
print(f"Created {len(example_components)} example components")

Created 2 tool components
Created 2 example components
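
The `token_count` values above are hard-coded. When exact counts are not at hand, a rough estimate can stand in. The sketch below assumes roughly four characters per token, a common rule of thumb for English text rather than a real tokenizer; `estimate_tokens` is a hypothetical helper, and the model's own tokenizer should be preferred when accuracy matters:

```python
import math

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count; not a real tokenizer."""
    if not text:
        return 0
    # Round up so short, non-empty strings never estimate to zero tokens.
    return math.ceil(len(text) / chars_per_token)

description = "calculate_tokens(text: str) -> int: Count the number of tokens in a text string."
print(estimate_tokens(description))
```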


In [13]:
# Build a complete trace with all component types
complete_components = (
    [ContextComponent("sys", ComponentType.SYSTEM_PROMPT, 
                     "You are a context engineering expert. Use the tools and examples provided.", 
                     token_count=15)]
    + example_components
    + tool_components
    + langchain_trace.components  # RAG docs and prompt captured by the tracer
    + [ContextComponent("user", ComponentType.USER_MESSAGE,
                       "What is context engineering?", token_count=8)]
)

complete_trace = ContextTrace(
    context_limit=128_000,
    components=complete_components,
    total_tokens=sum(c.token_count for c in complete_components),
)

print(f"Complete trace: {len(complete_trace.components)} components")
print(f"Component types: {sorted(set(c.type.value for c in complete_trace.components))}")

Complete trace: 9 components
Component types: ['example', 'rag', 'system_prompt', 'tool', 'user_message']


In [14]:
# Visualize the complete context window with all component types
ContextBuilder(trace=complete_trace)

### What the LangChain Tracer Captures

| Source | ComponentType | Status |
|--------|---------------|--------|
| Retriever documents | RAG | Automatic |
| LLM prompts | USER_MESSAGE | Automatic |
| Tool definitions | TOOL | Add manually |
| Few-shot examples | EXAMPLE | Add manually |

This pattern of combining automatic captures with manual additions gives you 
complete visibility into your context window.

---
## 3 | Chroma Integration

`ContextResource.from_chroma()` wraps a Chroma collection so you can query it
and manage document selection for your context window.

In [15]:
import chromadb
from context_engineering_dashboard import ContextResource, ResourceType

client = chromadb.Client()
collection = client.get_or_create_collection(
    name="context_eng_docs",
    metadata={"description": "Context engineering reference docs"},
)

# Populate with realistic documentation chunks
doc_data = [
    {
        "id": "ce_overview",
        "text": (
            "Context engineering is the discipline of designing and optimizing the information "
            "provided to a large language model within its context window. Unlike prompt engineering, "
            "which focuses on instruction phrasing, context engineering considers the entire input."
        ),
        "meta": {"section": "overview", "page": 1},
    },
    {
        "id": "ce_rag_best",
        "text": (
            "RAG best practices: (1) Retrieve more than you need, then re-rank and prune. "
            "(2) Prefer smaller, focused chunks (200-400 tokens) over large passages. "
            "(3) Include metadata (source, date, score) so the model can weigh relevance."
        ),
        "meta": {"section": "rag", "page": 7},
    },
    {
        "id": "ce_tools",
        "text": (
            "Tool integration patterns: Function calling lets the model invoke external APIs. "
            "Each tool definition consumes tokens from the context window. Best practices: "
            "Only include tools relevant to the current task. Keep descriptions concise."
        ),
        "meta": {"section": "tools", "page": 18},
    },
]

collection.add(
    ids=[d["id"] for d in doc_data],
    documents=[d["text"] for d in doc_data],
    metadatas=[d["meta"] for d in doc_data],
)

print(f"Collection '{collection.name}' has {collection.count()} documents")



Collection 'context_eng_docs' has 3 documents


In [16]:
# Create a ContextResource from a Chroma collection
rag_resource = ContextResource.from_chroma(
    collection=collection,
    resource_type=ResourceType.RAG,
    name="Documentation",
)

# Query the resource (queries the underlying Chroma collection)
user_question = "What are the best practices for RAG?"

rag_resource.query(
    query_texts=[user_question],
    n_results=3,
)

print(f"Query: '{user_question}'\n")
print(f"Retrieved {len(rag_resource.items)} documents:")
for item in rag_resource.items:
    print(f"  [{item.id}] score={item.score:.3f}, {item.token_count} tokens")

Query: 'What are the best practices for RAG?'

Retrieved 3 documents:
  [ce_rag_best] score=0.413, 58 tokens
  [ce_tools] score=0.362, 39 tokens
  [ce_overview] score=0.326, 40 tokens
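
Instead of a fixed top-k, documents can also be chosen against a token budget. A minimal greedy sketch, assuming a higher score means a more relevant document (as the ordering above suggests); `select_within_budget` is a hypothetical helper that works on plain tuples, not a method of `ContextResource`:

```python
def select_within_budget(items, token_budget):
    """Greedily pick the highest-scoring items whose tokens fit in the budget.

    items: iterable of (id, score, token_count) tuples.
    Returns the selected ids in descending score order.
    """
    selected, used = [], 0
    for item_id, _score, tokens in sorted(items, key=lambda it: it[1], reverse=True):
        # Skip items that would overflow the budget, but keep scanning for smaller ones.
        if used + tokens <= token_budget:
            selected.append(item_id)
            used += tokens
    return selected

# (id, score, token_count) copied from the query output above.
results = [
    ("ce_rag_best", 0.413, 58),
    ("ce_tools", 0.362, 39),
    ("ce_overview", 0.326, 40),
]
print(select_within_budget(results, token_budget=100))
```

With a budget of 100 tokens this keeps `ce_rag_best` and `ce_tools` (97 tokens) and drops `ce_overview`. The resulting ids could then be passed to `rag_resource.select(...)`.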


In [17]:
# Select top 2 documents and visualize with resource pool
top_ids = [item.id for item in rag_resource.items[:2]]
rag_resource.select(top_ids)

# Build a trace with the selected documents
chroma_trace = ContextTrace(
    context_limit=128_000,
    components=[
        ContextComponent("sys", ComponentType.SYSTEM_PROMPT, "You are helpful.", token_count=5),
        ContextComponent("user", ComponentType.USER_MESSAGE, user_question, token_count=10),
    ] + rag_resource.to_components(),
    total_tokens=15 + rag_resource.total_selected_tokens,
)

# Show available pool (left) vs context (right)
ContextBuilder(trace=chroma_trace, resources=[rag_resource])

---
## 4 | Gesture-based Interactions

The dashboard responds to four mouse gestures:

| Gesture | Action |
|---------|--------|
| **Hover** | Tooltip showing component type and token count |
| **Click** | Opens a modal with full content and metadata |
| **Click text in modal** | Switch to edit mode (Save button appears) |
| **Drag** | Move items between Available and Context panels |

In [18]:
# Try the gestures on this visualization:
# - Hover over any block to see the tooltip
# - Click a component to see its full content and metadata
# - Click on the text content to edit (Save button appears)
# - Drag items between panels to change selection
ContextBuilder(trace=chroma_trace, resources=[rag_resource])

---
## 5 | Context Budget Analysis

Programmatically inspect where your tokens go.

In [19]:
# Analyze the complete trace from Section 2
analysis_trace = complete_trace

print("CONTEXT BUDGET ANALYSIS")
print("=" * 50)
print(f"Context limit:  {analysis_trace.context_limit:>10,} tokens")
print(f"Tokens used:    {analysis_trace.total_tokens:>10,} tokens")
print(f"Tokens free:    {analysis_trace.unused_tokens:>10,} tokens")
print(f"Utilization:    {analysis_trace.utilization:>9.1f}%")
print()
print(f"{'Component Type':<22} {'Tokens':>8} {'% of Used':>10} {'Count':>6}")
print("-" * 50)

for comp_type in ComponentType:
    comps = analysis_trace.get_components_by_type(comp_type)
    if comps:
        total = sum(c.token_count for c in comps)
        pct = (total / analysis_trace.total_tokens) * 100 if analysis_trace.total_tokens else 0
        print(f"  {comp_type.value:<20} {total:>8,} {pct:>9.1f}% {len(comps):>5}")

CONTEXT BUDGET ANALYSIS
==================================================
Context limit:     128,000 tokens
Tokens used:           189 tokens
Tokens free:       127,811 tokens
Utilization:          0.1%

Component Type           Tokens  % of Used  Count
--------------------------------------------------
  system_prompt              15       7.9%     1
  user_message               57      30.2%     2
  rag                        28      14.8%     2
  tool                       34      18.0%     2
  example                    55      29.1%     2
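
The same arithmetic can be reproduced without the trace object, which is handy for quick checks or alerts. A minimal sketch using the per-type totals from the table above; `budget_shares` and the 25% threshold are hypothetical choices for illustration, not part of the dashboard API:

```python
def budget_shares(tokens_by_type, context_limit):
    """Return (utilization_pct, {type: pct_of_used}) for a token breakdown."""
    used = sum(tokens_by_type.values())
    utilization = used / context_limit * 100 if context_limit else 0.0
    shares = {
        name: (count / used * 100 if used else 0.0)
        for name, count in tokens_by_type.items()
    }
    return utilization, shares

# Token totals per component type, copied from the table above.
breakdown = {"system_prompt": 15, "user_message": 57, "rag": 28, "tool": 34, "example": 55}
utilization, shares = budget_shares(breakdown, context_limit=128_000)

# Flag component types above a hypothetical 25% share of used tokens.
heavy = [name for name, pct in shares.items() if pct > 25.0]
print(f"Utilization: {utilization:.2f}%")
print(f"Over 25% of used tokens: {heavy}")
```

Here `user_message` (30.2%) and `example` (29.1%) exceed the threshold, which matches the table.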


In [20]:
# Individual component breakdown
print(f"\n{'ID':<20} {'Type':<18} {'Tokens':>7}")
print("-" * 50)
for c in analysis_trace.components:
    print(f"{c.id:<20} {c.type.value:<18} {c.token_count:>7,}")


ID                   Type                Tokens
--------------------------------------------------
sys                  system_prompt           15
example_1            example                 30
example_2            example                 25
tool_search          tool                    18
tool_calculate       tool                    16
rag_0                rag                     15
rag_1                rag                     13
prompt_0             user_message            49
user                 user_message             8


---

## Summary

| Feature | How to use it |
|---|---|
| LiteLLM tracing | `with trace_litellm() as t:` wraps `litellm.completion()` |
| LangChain tracing | `with trace_langchain() as t:` with `config={"callbacks": [t.handler]}` |
| Chroma integration | `ContextResource.from_chroma(collection, resource_type, name)` then `.query()` |
| Custom components | `ContextComponent(id, type, content, token_count)` for Tool/Example |
| Visualization | `ContextBuilder(trace=trace)` |
| Available pool | `ContextBuilder(trace=trace, resources=[...])` |
| Interactions | Hover (tooltip), Click (modal), Click text (edit), Drag (move items) |
| Budget analysis | `trace.utilization`, `trace.unused_tokens`, `trace.get_components_by_type()` |

**See also:** `quick_start.ipynb` for basics on ContextTrace, ContextDiff, and serialization.