# 🌐 LangGraph Agents with Llama Stack: Bring Your Own Agentic Framework

This notebook demonstrates **agentic framework integration**: how to use an agentic framework of your choice (such as LangGraph, AutoGen, or CrewAI) with Llama Stack's OpenAI-compatible APIs.

**What is LangGraph?**
LangGraph is a state-based agent framework for building multi-step reasoning applications. Its main features are:
- **State Management**: Track conversation history and intermediate reasoning steps
- **Graph-Based Flows**: Define complex agent workflows with conditional logic and loops
- **Tool Integration**: Seamlessly bind external tools for enhanced capabilities
- **Flexible Architecture**: Build everything from simple chatbots to complex multi-agent systems

**Why LangGraph + Llama Stack?**
Instead of being locked into a single provider's ecosystem, this combination gives you:
- **Framework Freedom**: Use your preferred agentic framework without vendor lock-in
- **OpenAI Compatibility**: Leverage existing LangChain/LangGraph code with minimal changes
- **Tool Ecosystem**: Access MCP (Model Context Protocol) tools for weather, web search, and more
- **Production Ready**: Deploy on your infrastructure with full observability and control

## 🏗️ LangGraph + Llama Stack Architecture

The integration creates **three key layers** that work together to enable sophisticated agentic capabilities:

### 1. 🧠 LangGraph Layer (The Agent Brain)
This is where intelligent agent behavior is defined:
- **StateGraph**: Manages conversation state and agent memory across interactions
- **Nodes & Edges**: Define agent reasoning steps and decision flow
- **Message Handling**: Tracks conversation history and context
- **Conditional Logic**: Enables complex multi-step reasoning workflows
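
To make the node-and-edge model concrete, here is a minimal plain-Python sketch of how a state graph with a conditional edge might execute. It is not LangGraph's implementation: the names `run_graph`, `draft`, `review`, and `route_after_review`, as well as the sentinel strings, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Union

State = Dict[str, object]
START, END = "__start__", "__end__"  # sentinel names assumed for this sketch


def draft(state: State) -> State:
    """Node: produce a new draft and count the attempt."""
    attempts = int(state.get("attempts", 0)) + 1
    return {"attempts": attempts, "draft": f"draft v{attempts}"}


def review(state: State) -> State:
    """Node: approve the draft once it has been written at least twice."""
    return {"approved": int(state["attempts"]) >= 2}


def route_after_review(state: State) -> str:
    """Conditional edge: loop back to `draft` until the review passes."""
    return END if state["approved"] else "draft"


def run_graph(
    nodes: Dict[str, Callable[[State], State]],
    edges: Dict[str, Union[str, Callable[[State], str]]],
    state: State,
    max_steps: int = 10,
) -> State:
    """Walk the graph from START to END, merging each node's update into the state."""
    current = edges[START]
    steps = 0
    while current != END:
        if steps >= max_steps:
            raise RuntimeError("graph did not reach END within max_steps")
        state = {**state, **nodes[current](state)}
        target = edges[current]
        # A callable edge decides the next node from the current state.
        current = target(state) if callable(target) else target
        steps += 1
    return state


nodes = {"draft": draft, "review": review}
edges = {START: "draft", "draft": "review", "review": route_after_review}
final_state = run_graph(nodes, edges, {})
print(final_state)
```

The run visits `draft`, `review`, `draft`, `review` and then stops, which is the loop-until-condition pattern that conditional edges make possible.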

### 2. 🔗 OpenAI Compatibility Layer (The Translation)
This bridges LangGraph to Llama Stack seamlessly:
- **ChatOpenAI Client**: Standard LangChain interface pointing to Llama Stack
- **OpenAI-Compatible Endpoint**: Llama Stack's `/v1/openai/v1` endpoint
- **Tool Binding**: Attach MCP tools to the LLM for enhanced capabilities
- **Response Handling**: Process streaming and non-streaming responses
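
For orientation, the sketch below builds (but does not send) the kind of HTTP request an OpenAI-compatible client would issue when the Responses API is enabled. The `/responses` path suffix and the payload shape follow the OpenAI Responses API; the helper name `build_responses_request` and the host `llama-stack.example` are illustrative assumptions, and the exact fields a given Llama Stack version accepts may differ.

```python
import json
from typing import List, Optional
from urllib.request import Request


def build_responses_request(
    base_url: str,
    model: str,
    user_text: str,
    api_key: str = "fake",
    tools: Optional[List[dict]] = None,
) -> Request:
    """Assemble a POST request for an OpenAI-style `/responses` endpoint."""
    payload = {
        "model": model,
        "input": [{"role": "user", "content": user_text}],
    }
    if tools:
        payload["tools"] = tools
    return Request(
        url=base_url.rstrip("/") + "/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Hypothetical host; substitute the endpoint of your own deployment.
req = build_responses_request(
    "http://llama-stack.example:8321/v1/openai/v1",
    "llama3-2-3b",
    "Hello",
)
print(req.get_method(), req.full_url)
print(json.loads(req.data))
```

In the notebook itself `ChatOpenAI` takes care of this translation, so the helper is only a way to see what crosses the wire.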

### 3. 🦙 Llama Stack Layer (The AI API to run them all)
This provides the AI model and tool runtime:
- **Model Inference**: vLLM-powered Llama 3.2 3B for fast, local inference
- **MCP Tools**: Weather, web search, and custom tool integrations
- **Observability**: Comprehensive telemetry and monitoring
- **Production Features**: Safety filters, rate limiting, and error handling

**The Power**: LangGraph provides sophisticated agent orchestration while Llama Stack handles the heavy lifting of model inference and tool execution.

## 📦 Install Required Packages

Install the LangGraph and integration dependencies:

In [6]:
!pip install -q langgraph==0.6.7 langchain-openai==0.3.32 langchain-core==0.3.75
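
Because the versions are pinned, it can help to confirm what is actually installed in the running kernel. The following is a minimal sketch using only the standard library; `installed_version` and `PINNED` are hypothetical names introduced for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Versions pinned by the install command in this notebook.
PINNED = {
    "langgraph": "0.6.7",
    "langchain-openai": "0.3.32",
    "langchain-core": "0.3.75",
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for name, wanted in PINNED.items():
    found = installed_version(name)
    status = "ok" if found == wanted else "mismatch"
    print(f"{name}: wanted {wanted}, found {found} ({status})")
```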


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.2[0m[39;49m -> [0m[32;49m25.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In [7]:
# Core imports for LangGraph integration
import os
import sys
import json
from typing import Annotated
from typing_extensions import TypedDict

# LangGraph imports for agent creation
from langgraph.graph import StateGraph, END, START
from langgraph.graph.message import add_messages

# LangChain imports for OpenAI compatibility
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage
from langchain_core.tools import tool

# Additional utilities for development
from pprint import pprint

## 🔗 Connect LangGraph to Llama Stack

Point LangGraph at Llama Stack's OpenAI-compatible endpoint. LangGraph then uses Llama Stack as its inference backend, while existing LangChain code continues to work with minimal changes.

**Key Integration Points:**
- **OpenAI-Compatible Endpoint**: Use Llama Stack's `/v1/openai/v1` endpoint for seamless integration
- **Model Configuration**: Point to the deployed Llama 3.2 3B model for fast inference
- **Tool Binding**: Prepare for MCP weather tool integration
- **State Management**: Set up LangGraph's conversation state handling
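
The endpoint and model name used in this notebook are specific to one cluster. A minimal sketch of reading them from the environment, with the notebook's values as fallbacks, might look like this. The environment variable names and the `LlamaStackConfig` and `load_config` names are assumptions made for illustration, not part of Llama Stack or LangChain.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Fallback values taken from this notebook's configuration cell.
DEFAULT_ENDPOINT = (
    "http://llama-stack-instance-service.llama-serve.svc.cluster.local:8321/v1/openai/v1"
)
DEFAULT_MODEL = "llama3-2-3b"


@dataclass(frozen=True)
class LlamaStackConfig:
    endpoint: str
    model: str
    api_key: str


def load_config(env: Optional[Mapping[str, str]] = None) -> LlamaStackConfig:
    """Build the configuration from environment variables (names are hypothetical)."""
    env = os.environ if env is None else env
    return LlamaStackConfig(
        # Strip a trailing slash so path joining stays predictable.
        endpoint=env.get("LLAMA_STACK_OPENAI_ENDPOINT", DEFAULT_ENDPOINT).rstrip("/"),
        model=env.get("INFERENCE_MODEL", DEFAULT_MODEL),
        api_key=env.get("LLAMA_STACK_API_KEY", "fake"),
    )


config = load_config()
print(config.endpoint)
print(config.model)
```

Passing a plain dictionary to `load_config` instead of relying on `os.environ` keeps the function easy to exercise in isolation.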

In [8]:
# === LangGraph + Llama Stack Configuration ===
print("🌐 Configuring LangGraph with Llama Stack Integration")

# === Endpoint Configuration ===
# Use Llama Stack's OpenAI-compatible endpoint for seamless LangChain integration
LLAMA_STACK_OPENAI_ENDPOINT = "http://llama-stack-instance-service.llama-serve.svc.cluster.local:8321/v1/openai/v1"
INFERENCE_MODEL = "llama3-2-3b"  # Model deployed in the cluster
API_KEY = "fake"  # Placeholder value; no real key is needed for this local deployment

print(f"📍 Llama Stack OpenAI Endpoint: {LLAMA_STACK_OPENAI_ENDPOINT}")
print(f"🤖 Inference Model: {INFERENCE_MODEL}")

# === Create ChatOpenAI Client ===
# This creates a standard LangChain client that talks to Llama Stack
llm = ChatOpenAI(
    model=INFERENCE_MODEL,
    openai_api_key=API_KEY,
    openai_api_base=LLAMA_STACK_OPENAI_ENDPOINT,
    use_responses_api=True,
)

print("✅ ChatOpenAI client configured for Llama Stack")

# === Test Basic Connectivity ===
print("\n🧪 Testing basic LangGraph-Llama Stack connectivity...")

try:
    # Simple connectivity test
    response = llm.invoke("Hello! Please respond with 'Connection successful' if you can hear me.")
    print("📤 Test Query: Hello connectivity test")
    print(f"📥 Response: {response.content}")
    print("✅ Connection successful!")
except Exception as e:
    print(f"❌ Connection failed: {e}")
    print("💡 Make sure Llama Stack service is running and accessible")
    sys.exit(1)

print("\n🎯 LangGraph is now ready to use Llama Stack for inference!")

🌐 Configuring LangGraph with Llama Stack Integration
📍 Llama Stack OpenAI Endpoint: http://llama-stack-instance-service.llama-serve.svc.cluster.local:8321/v1/openai/v1
🤖 Inference Model: llama3-2-3b
✅ ChatOpenAI client configured for Llama Stack

🧪 Testing basic LangGraph-Llama Stack connectivity...
📤 Test Query: Hello connectivity test
📥 Response: [{'type': 'text', 'text': 'Connection successful', 'annotations': []}]
✅ Connection successful!

🎯 LangGraph is now ready to use Llama Stack for inference!
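
Note that the response content in the output above is a list of typed content blocks rather than a plain string. A small helper along the following lines can normalize both shapes to text. `extract_text` is a hypothetical name, and the block layout it expects is simply the one visible in that output.

```python
from typing import Any


def extract_text(content: Any) -> str:
    """Return the text of a message's content, whether it is a string or a list of blocks."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = []
        for block in content:
            if isinstance(block, str):
                parts.append(block)
            elif isinstance(block, dict) and block.get("type") == "text":
                # Keep text blocks only; other block types are skipped.
                parts.append(block.get("text", ""))
        return "".join(parts)
    return ""


sample = [{"type": "text", "text": "Connection successful", "annotations": []}]
print(extract_text(sample))  # prints: Connection successful
```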


## 🤖 Step 1: Create a Basic LangGraph Agent

Now let's build our first LangGraph agent that uses Llama Stack for inference. This demonstrates the core pattern for creating stateful, conversational agents.

**LangGraph Fundamentals:**
1. **State Definition**: Define what information the agent tracks across interactions
2. **Node Creation**: Create functions that process messages and update state
3. **Graph Building**: Connect nodes with edges to define conversation flow
4. **Compilation**: Compile the graph into an executable agent

**Key Benefits of LangGraph:**
- **Memory**: Accumulates conversation history in the graph state; keeping it across separate `invoke` calls additionally requires a checkpointer
- **State Persistence**: Maintains context and intermediate results
- **Flexible Flow**: Define complex multi-step reasoning patterns
- **Tool Integration**: Easily bind external tools for enhanced capabilities
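
The history tracking comes from the reducer attached to a state key. The sketch below imitates, in plain Python, how an append-style reducer merges a node's return value into the existing state instead of overwriting it. It is a simplification of what LangGraph's `add_messages` does (the real reducer also works with message objects and message IDs), and `append_messages`, `REDUCERS`, and `apply_update` are hypothetical names.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def append_messages(existing: List[Message], new: List[Message]) -> List[Message]:
    """Append-style reducer: keep the old messages and add the new ones."""
    return existing + new


# Keys listed here are merged with a reducer; all other keys are overwritten.
REDUCERS: Dict[str, Callable[[list, list], list]] = {"messages": append_messages}


def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update into the state without mutating the input."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        merged[key] = reducer(merged.get(key, []), value) if reducer else value
    return merged


state = {"messages": [{"role": "user", "content": "Hi"}]}
state = apply_update(state, {"messages": [{"role": "assistant", "content": "Hello!"}]})
print(len(state["messages"]))  # prints: 2
```

This is why a node can return only its new message, as in `return {"messages": [message]}`, and still leave the earlier conversation intact.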

In [9]:
print("📋 LangGraph + Llama Stack Integration")
print(f"   LLAMA_STACK_URL: {LLAMA_STACK_OPENAI_ENDPOINT}")
print(f"   INFERENCE_MODEL: {INFERENCE_MODEL}")

llm = ChatOpenAI(
    model=INFERENCE_MODEL,
    openai_api_key=API_KEY,
    openai_api_base=LLAMA_STACK_OPENAI_ENDPOINT,
    use_responses_api=True,
)

# Test connectivity
print("\n🧪 Testing basic connectivity:")
response = llm.invoke("Hello")
print("✅ Connection successful")

# Bind MCP weather tools
print("\n🛠️ Setting up MCP weather tools...")
llm_with_tools = llm.bind_tools([
    {
        "type": "mcp",
        "server_label": "weather",
        "server_url": "http://mcp-weather.llama-serve.svc.cluster.local:80/sse",
        "require_approval": "never",
    },
])
print("✅ MCP tools configured")

# Define LangGraph State and Agent
class State(TypedDict):
    messages: Annotated[list, add_messages]

def chatbot(state: State):
    message = llm_with_tools.invoke(state["messages"])
    return {"messages": [message]}

# Build LangGraph StateGraph
print("\n🏗️ Building LangGraph agent...")
graph_builder = StateGraph(State)
graph_builder.add_node("chatbot", chatbot)
graph_builder.add_edge(START, "chatbot")
graph_builder.add_edge("chatbot", END)

graph = graph_builder.compile()
print("✅ LangGraph agent ready")

# Test the integration
print("\n" + "="*50)
print("🚀 Testing LangGraph Agent with MCP Tools")
print("="*50)

response = graph.invoke({
    "messages": [{"role": "user", "content": "What is the weather in Boston?"}]
})

print("Weather Response:")
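for message in response['messages']:
    if hasattr(message, 'content'):
        # With the Responses API, content may be a list of typed blocks
        if isinstance(message.content, list):
            for content_block in message.content:
                # Print text blocks only; skip any other block type
                if isinstance(content_block, dict) and content_block.get('type') == 'text':
                    print(content_block.get('text', ''))
        # Plain-string content, such as the user's question
        elif isinstance(message.content, str):
            print(message.content)
    else:
        # Fall back to LangChain's formatted output
        message.pretty_print()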

📋 LangGraph + Llama Stack Integration
   LLAMA_STACK_URL: http://llama-stack-instance-service.llama-serve.svc.cluster.local:8321/v1/openai/v1
   INFERENCE_MODEL: llama3-2-3b

🧪 Testing basic connectivity:
✅ Connection successful

🛠️ Setting up MCP weather tools...
✅ MCP tools configured

🏗️ Building LangGraph agent...
✅ LangGraph agent ready

🚀 Testing LangGraph Agent with MCP Tools
Weather Response:
What is the weather in Boston?
Unfortunately, I don't have direct access to real-time weather information. However, I can suggest some alternative ways for you to find the current weather in Boston. You can check online weather websites such as weather.com or accuweather.com, or download a weather app on your smartphone to get the latest forecast.
