# LlamaStack Workshop - Client Demo

This notebook demonstrates how to interact with your deployed LlamaStack distribution programmatically.

## Prerequisites
- You have deployed a model (Llama 3.2 3B) in your project
- You have enabled the GenAI Playground (which creates a LlamaStack Distribution)
- MCP servers have been configured in your LlamaStack

## What You'll Learn
1. How to list available models
2. How to list available tools (MCP servers)
3. How to make chat completions
4. **How to invoke MCP tools directly**
5. **How to use agent-based tool calling**

In [None]:
# Install required packages
%pip install -q requests

In [None]:
import requests
import json
import os

# Configuration - UPDATE THIS with your project name!
PROJECT_NAME = "user-XX"  # <-- Change XX to your user number (e.g., user-05)

# LlamaStack endpoint (internal OpenShift service)
LLAMASTACK_URL = f"http://lsd-genai-playground-service.{PROJECT_NAME}.svc.cluster.local:8321"

print(f"Project: {PROJECT_NAME}")
print(f"LlamaStack URL: {LLAMASTACK_URL}")

if PROJECT_NAME == "user-XX":
    print("\n⚠️ WARNING: Please update PROJECT_NAME above with your actual project name!")
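
The endpoint above follows the Kubernetes in-cluster DNS pattern `<service>.<namespace>.svc.cluster.local`. As a minimal sketch, the placeholder check could be turned into a helper that refuses to build a URL from `user-XX` or from a name that is not a valid namespace label. `build_llamastack_url` is a hypothetical helper introduced here for illustration; it is not part of LlamaStack or `requests`.

```python
import re

SERVICE_NAME = "lsd-genai-playground-service"  # service name taken from the cell above
PORT = 8321

# Kubernetes namespace names are DNS labels: lowercase alphanumerics and "-",
# starting and ending with an alphanumeric, at most 63 characters.
_DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]{0,61}[a-z0-9])?")


def build_llamastack_url(project_name: str) -> str:
    """Hypothetical helper: build the in-cluster LlamaStack URL for a project."""
    if project_name == "user-XX" or not _DNS_LABEL.fullmatch(project_name):
        raise ValueError(f"Invalid project name: {project_name!r}")
    return f"http://{SERVICE_NAME}.{project_name}.svc.cluster.local:{PORT}"


print(build_llamastack_url("user-05"))
```

Raising an error early keeps later cells from timing out against a hostname that cannot resolve.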

## 1. List Available Models

Let's see what models are available in your LlamaStack distribution.

In [None]:
response = requests.get(f"{LLAMASTACK_URL}/v1/models", timeout=10)
response.raise_for_status()  # Stop here with a clear HTTP error if the request failed
models = response.json().get("data", [])

# Filter to LLM models only
llm_models = [m for m in models if m.get("model_type") == "llm"]

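print(f"🤖 LLM Models Available: {len(llm_models)}")
print("=" * 50)
for m in llm_models:
    print(f"  • {m.get('identifier')} ({m.get('provider_id')})")

# Store the first model ID for later use (None if no LLM is registered,
# so later cells fail with a clear message instead of a NameError)
# Note: identifier already includes the provider prefix (e.g., "vllm-inference-1/llama-32-3b-instruct")
MODEL_ID = llm_models[0].get("identifier") if llm_models else None
if MODEL_ID:
    print(f"\n📌 Using model: {MODEL_ID}")
else:
    print("\n⚠️ No LLM models found. Check that your model is deployed and registered.")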
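
Taking the first LLM works when only one model is deployed. If several are registered, a small selection helper may be more predictable. The sketch below is an illustration only: `pick_model` is a hypothetical helper, and `sample_models` is made-up data shaped like the entries under `data` in the `/v1/models` response (the embedding identifier is invented).

```python
from typing import Optional


def pick_model(models: list, preferred: Optional[str] = None) -> str:
    """Hypothetical helper: return a matching LLM identifier, else the first LLM."""
    llm_ids = [m.get("identifier") or "" for m in models if m.get("model_type") == "llm"]
    if not llm_ids:
        raise LookupError("No LLM models are registered")
    if preferred:
        # Case-insensitive substring match against the full identifier
        for identifier in llm_ids:
            if preferred.lower() in identifier.lower():
                return identifier
    return llm_ids[0]


# Illustrative sample data, not real output from a cluster
sample_models = [
    {"identifier": "example-embedding-model", "model_type": "embedding"},
    {"identifier": "vllm-inference-1/llama-32-3b-instruct", "model_type": "llm"},
]

print(pick_model(sample_models, preferred="llama"))
```

In the notebook, the same helper could be called with the `models` list fetched above, for example `MODEL_ID = pick_model(models, "llama")`.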

## 2. List Available Tools (MCP Servers)

Tools are provided by MCP servers. Let's see what's available.

In [None]:
response = requests.get(f"{LLAMASTACK_URL}/v1/tools", timeout=10)
data = response.json()
tools = data if isinstance(data, list) else data.get("data", [])

# Group by toolgroup (MCP server)
toolgroups = {}
for t in tools:
    tg = t.get("toolgroup_id", "unknown")
    if tg not in toolgroups:
        toolgroups[tg] = []
    toolgroups[tg].append(t.get("name", "unknown"))

# Count MCP servers (exclude builtin)
mcp_servers = [tg for tg in toolgroups.keys() if tg.startswith("mcp::")]

# Store toolgroups for agent creation later
TOOLGROUPS = mcp_servers

print(f"🛠️ MCP Servers: {len(mcp_servers)}")
print(f"📊 Total Tools: {len(tools)}")
print("=" * 50)
for tg, tool_list in sorted(toolgroups.items()):
    icon = "🌤️" if "weather" in tg else "👥" if "hr" in tg else "🔧"
    print(f"\n{icon} {tg} ({len(tool_list)} tools)")
    for tool in tool_list:
        print(f"   • {tool}")
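
The grouping step can also be written with `collections.defaultdict`, which removes the explicit "create the list if missing" branch. This is a sketch on made-up data: `group_tools` and `mcp_toolgroups` are hypothetical helpers, and the toolgroup IDs in `sample_tools` are assumptions chosen to match the `mcp::` prefix check in the cell above.

```python
from collections import defaultdict


def group_tools(tools: list) -> dict:
    """Hypothetical helper: map each toolgroup ID to the names of its tools."""
    grouped = defaultdict(list)
    for tool in tools:
        grouped[tool.get("toolgroup_id", "unknown")].append(tool.get("name", "unknown"))
    return dict(grouped)


def mcp_toolgroups(grouped: dict) -> list:
    """Keep only toolgroups served by MCP servers (IDs starting with 'mcp::')."""
    return sorted(tg for tg in grouped if tg.startswith("mcp::"))


# Illustrative sample data, not real output from a cluster
sample_tools = [
    {"toolgroup_id": "mcp::weather", "name": "get_weather_statistics"},
    {"toolgroup_id": "mcp::weather", "name": "list_weather_stations"},
    {"toolgroup_id": "builtin::example", "name": "example_tool"},
]

grouped = group_tools(sample_tools)
print(grouped)
print(mcp_toolgroups(grouped))
```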

## 3. Simple Chat Completion

Let's test a basic chat completion.

In [None]:
payload = {
    "model": MODEL_ID,
    "messages": [
        {"role": "user", "content": "What is the capital of France? Answer in one sentence."}
    ],
    "temperature": 0.7,
    "max_tokens": 256
}

print(f"🤖 Using model: {MODEL_ID}")
print("=" * 50)

response = requests.post(
    f"{LLAMASTACK_URL}/v1/openai/v1/chat/completions",
    json=payload,
    timeout=60
)

if response.status_code == 200:
    result = response.json()
    choices = result.get("choices") or [{}]  # Guard against a missing or empty "choices" list
    content = choices[0].get("message", {}).get("content", "")
    print("\n📝 Response:")
    print(content)
else:
    print(f"❌ Error: {response.status_code} - {response.text}")
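
The request body and the response parsing can be factored into two small functions, which makes it easier to add a system prompt or to reuse the parsing elsewhere. This is a sketch under assumptions: `build_chat_payload` and `extract_reply` are hypothetical helpers, and `sample_result` is a made-up response in the OpenAI chat completions shape that the cell above reads.

```python
from typing import Optional


def build_chat_payload(model_id: str, user_text: str, system_text: Optional[str] = None,
                       temperature: float = 0.7, max_tokens: int = 256) -> dict:
    """Hypothetical helper: build a chat completions request body."""
    messages = []
    if system_text:
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})
    return {
        "model": model_id,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def extract_reply(result: dict) -> str:
    """Return the first choice's text, or an empty string if nothing usable is present."""
    choices = result.get("choices") or []
    if not choices:
        return ""
    return (choices[0].get("message") or {}).get("content") or ""


# Illustrative sample data, not real model output
sample_result = {"choices": [{"message": {"role": "assistant", "content": "Paris."}}]}

payload = build_chat_payload("vllm-inference-1/llama-32-3b-instruct",
                             "What is the capital of France?",
                             system_text="Answer in one sentence.")
print(payload["messages"])
print(extract_reply(sample_result))
```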

---

# 🛠️ Part 2: Using MCP Tools

There are **two ways** to use MCP tools in LlamaStack:

1. **Direct Tool Invocation** - Call tools directly via the `/v1/tool-runtime/invoke` API
2. **Agent-Based Tool Calling** - Create an agent that automatically decides when to use tools

Let's try both!

In [None]:
def invoke_tool(tool_name: str, kwargs: dict = None) -> str:
    """Invoke a tool directly via LlamaStack."""
    if kwargs is None:
        kwargs = {}
    
    response = requests.post(
        f"{LLAMASTACK_URL}/v1/tool-runtime/invoke",
        json={"tool_name": tool_name, "kwargs": kwargs},
        timeout=30
    )
    
    if response.status_code == 200:
        result = response.json()
        content = result.get("content", [])
        if isinstance(content, list) and content:
            return content[0].get("text", str(content))
        return str(result)
    else:
        return f"Error: {response.status_code} - {response.text}"

print("✅ invoke_tool function ready!")

## 4. Direct Tool Invocation

You can call MCP tools directly without involving the LLM. This is useful for testing tools or when you know exactly which tool you need.

In [None]:
# Test Weather MCP - Get statistics
print("🌤️ Weather MCP - Get Statistics")
print("=" * 50)
result = invoke_tool("get_weather_statistics")
print(result[:1000])  # Slicing already returns the whole string when it is shorter

In [None]:
# Test Weather MCP - List stations
print("🌤️ Weather MCP - List Stations")
print("=" * 50)
result = invoke_tool("list_weather_stations")
print(result[:1500])  # Slicing already returns the whole string when it is shorter
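
Both calls above use tools that take no arguments. For tools that do, the arguments travel in the `kwargs` field of the request body. The sketch below shows the payload shape and a variant of the response handling that joins every text item instead of returning only the first. The tool name `get_current_weather` and its `station` parameter are assumptions made for illustration; the real names and parameters come from the `/v1/tools` listing. Both functions are hypothetical helpers.

```python
def build_invoke_payload(tool_name: str, **kwargs) -> dict:
    """Hypothetical helper: request body for the tool invocation endpoint."""
    return {"tool_name": tool_name, "kwargs": kwargs}


def join_text_items(result: dict) -> str:
    """Concatenate all text items from a tool result, one per line."""
    content = result.get("content", [])
    if isinstance(content, str):
        return content
    return "\n".join(
        item["text"] for item in content if isinstance(item, dict) and "text" in item
    )


# Hypothetical tool name and argument, for illustration only
print(build_invoke_payload("get_current_weather", station="VIDP"))

# Illustrative sample data, not real tool output
sample_tool_result = {"content": [{"type": "text", "text": "line one"},
                                  {"type": "text", "text": "line two"}]}
print(join_text_items(sample_tool_result))
```

With the notebook's own helper, the equivalent call would pass the arguments as a dictionary, for example `invoke_tool(tool_name, {"station": "VIDP"})`, once the actual tool name and parameter are known.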

## 5. Agent-Based Tool Calling

The more powerful approach is to create an **Agent** that can automatically decide when to use tools based on the user's question.

This uses the LlamaStack Agents API.

In [None]:
# Step 1: Create an agent with MCP tools enabled
print("🤖 Creating Agent with MCP Tools...")
print(f"   Model: {MODEL_ID}")
print(f"   Toolgroups: {TOOLGROUPS}")

agent_config = {
    "agent_config": {
        "model": MODEL_ID,
        "instructions": "You are a helpful assistant. Use the available tools to answer questions about weather and other data.",
        "toolgroups": TOOLGROUPS,
        "enable_session_persistence": False,
        "sampling_params": {
            "max_tokens": 1024,
            "temperature": 0.7
        }
    }
}

response = requests.post(f"{LLAMASTACK_URL}/v1/agents", json=agent_config, timeout=30)

if response.status_code == 200:
    agent_id = response.json().get("agent_id")
    print(f"\n✅ Agent created: {agent_id}")
else:
    print(f"❌ Error creating agent: {response.status_code} - {response.text}")
    agent_id = None

In [None]:
# Step 2: Create a session for the agent
if agent_id:
    session_response = requests.post(
        f"{LLAMASTACK_URL}/v1/agents/{agent_id}/session",
        json={"session_name": "workshop-session"},
        timeout=30
    )
    
    if session_response.status_code == 200:
        session_id = session_response.json().get("session_id")
        print(f"✅ Session created: {session_id}")
    else:
        # Include the response body so the cause of the failure is visible
        print(f"❌ Error creating session: {session_response.status_code} - {session_response.text}")
        session_id = None
else:
    session_id = None

In [None]:
def ask_agent(question: str):
    """Ask the agent a question - it will use tools automatically!"""
    if not agent_id or not session_id:
        print("❌ Agent or session not created. Run the cells above first.")
        return

    print(f"❓ Question: {question}")
    print("=" * 50)
    
    # Note: LlamaStack Agents API requires stream=True
    turn_request = {
        "messages": [{"role": "user", "content": question}],
        "stream": True
    }
    
    response = requests.post(
        f"{LLAMASTACK_URL}/v1/agents/{agent_id}/session/{session_id}/turn",
        json=turn_request,
        timeout=120,
        stream=True
    )
    
    if response.status_code == 200:
        tool_calls = []
        final_response = ""
        
        # Parse streaming response (Server-Sent Events format)
        for line in response.iter_lines():
            if line:
                line_str = line.decode('utf-8')
                if line_str.startswith('data: '):
                    try:
                        data = json.loads(line_str[6:])
                        event = data.get("event", {}).get("payload", {})
                        event_type = event.get("event_type", "")
                        
                        # Capture tool calls
                        if event_type == "step_complete" and event.get("step_type") == "tool_execution":
                            step_details = event.get("step_details", {})
                            for tc in step_details.get("tool_calls", []):
                                tool_calls.append(tc.get("tool_name", "unknown"))
                        
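                        # Capture final response; handle content delivered either
                        # as a plain string or as a list of typed items
                        if event_type == "turn_complete":
                            turn = event.get("turn", {})
                            content = turn.get("output_message", {}).get("content", "")
                            if isinstance(content, str):
                                final_response = content
                            else:
                                for msg in content:
                                    if isinstance(msg, dict) and msg.get("type") == "text":
                                        final_response = msg.get("text", "")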
                    except json.JSONDecodeError:
                        pass
        
        # Show results
        if tool_calls:
            print("\n🔧 Tools Used:")
            for tc in tool_calls:
                print(f"   • {tc}")

        if final_response:
            print("\n📝 Response:")
            print(final_response)
        elif not tool_calls:
            print("⚠️ No response received")
    else:
        print(f"❌ Error: {response.status_code} - {response.text}")

print("✅ ask_agent function ready!")

In [None]:
# Test: Ask about weather stations
ask_agent("List all available weather stations")

In [None]:
# Test: Get weather statistics
ask_agent("Get weather statistics")

## 6. Explore on Your Own!

Try different questions. The agent will automatically use the appropriate MCP tools.

**Weather questions:**
- "List all available weather stations"
- "Get weather statistics"
- "Search for weather observations in New Delhi"
- "Get current weather for station VIDP"

**Station codes:** VIDP = New Delhi, RJTT = Tokyo, KJFK = New York, EGLL = London, YSSY = Sydney

**HR questions (if HR MCP is configured):**
- "List all employees"
- "Get vacation balance for employee EMP001"
- "List all job openings"

In [None]:
# Your turn! Try your own questions
my_question = "Get weather statistics"  # <-- Change this!

ask_agent(my_question)
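
The station codes listed in this section can be kept in a small lookup so that questions are built from a city name instead of a memorized code. This is a minimal sketch; `weather_question` is a hypothetical helper, and the question wording copies the example prompt from the list above.

```python
# Station codes as listed in this section
STATION_CODES = {
    "VIDP": "New Delhi",
    "RJTT": "Tokyo",
    "KJFK": "New York",
    "EGLL": "London",
    "YSSY": "Sydney",
}


def weather_question(city: str) -> str:
    """Hypothetical helper: build an agent question for a known city."""
    wanted = city.strip().lower()
    for code, name in STATION_CODES.items():
        if name.lower() == wanted:
            return f"Get current weather for station {code}"
    raise KeyError(f"No station code known for {city!r}")


print(weather_question("Tokyo"))
```

The returned string can be passed straight to `ask_agent`.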

---

# 🎓 Workshop Complete!

## What You Learned

1. ✅ **List Models** - Query available LLM models
2. ✅ **List Tools** - Discover MCP servers and their tools
3. ✅ **Chat Completion** - Basic conversation with the model
4. ✅ **Direct Tool Invocation** - Call MCP tools directly via API
5. ✅ **Agent-Based Tool Calling** - Let the AI decide when to use tools

## Key Takeaways

- **MCP Tools are Unified**: All tools are accessible through the same API
- **Two Ways to Use Tools**: Direct invocation for specific needs, Agent API for automatic tool selection
- **Easy to Extend**: Just add more MCP servers to your LlamaStack config to give your AI new capabilities!