<a href="https://colab.research.google.com/github/meta-llama/llama-stack/blob/main/docs/zero_to_hero_guide/Tool_Calling101_Using_Together's_Llama_Stack_Server.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

If you'd prefer not to set up a local server, this guide covers tool calling with the Together API. It shows how to use Together.ai's Llama Stack Server API so that you can get started with Llama Stack without building and running a server locally.

## Tool Calling with the Together API


In this section, we'll explore how to enhance your applications with tool calling capabilities. We'll cover:
1. Setting up and using the Brave Search API
2. Creating custom tools
3. Configuring tool prompts and safety settings

In [None]:
!pip install llama-stack-client==0.0.50
!pip install -U httpx==0.27.2 # https://github.com/meta-llama/llama-stack-apps/issues/131

Collecting llama-stack-client
  Downloading llama_stack_client-0.0.50-py3-none-any.whl.metadata (13 kB)
Downloading llama_stack_client-0.0.50-py3-none-any.whl (282 kB)
Installing collected packages: llama-stack-client
Successfully installed llama-stack-client-0.0.50


In [None]:
LLAMA_STACK_API_TOGETHER_URL = "https://llama-stack.together.ai"
LLAMA31_8B_INSTRUCT = "Llama3.1-8B-Instruct"


In [None]:
import asyncio
import os
from typing import Dict, List, Optional

from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent
from llama_stack_client.lib.agents.event_logger import EventLogger
from llama_stack_client.types.agent_create_params import (
    AgentConfigToolSearchToolDefinition,
)


# Helper function to create an agent with tools
async def create_tool_agent(
    client: LlamaStackClient,
    tools: List[Dict],
    instructions: str = "You are a helpful assistant",
    model: str = LLAMA31_8B_INSTRUCT,
) -> Agent:
    """Create an agent with specified tools."""
    print("Using the following model: ", model)
    return Agent(
        client, 
        model=model,
        instructions=instructions,
        sampling_params={
            "strategy": {
                "type": "greedy",
            },
        },
        tools=tools,
    )


In [None]:
# Comment this out if you don't have a BRAVE_SEARCH_API_KEY
os.environ["BRAVE_SEARCH_API_KEY"] = "YOUR_BRAVE_SEARCH_API_KEY"


async def create_search_agent(client: LlamaStackClient) -> Agent:
    """Create an agent with Brave Search capability."""

    # Comment this out if you don't have a BRAVE_SEARCH_API_KEY
    search_tool = AgentConfigToolSearchToolDefinition(
        type="brave_search",
        engine="brave",
        api_key=os.getenv("BRAVE_SEARCH_API_KEY"),
    )

    return await create_tool_agent(
        client=client,
        tools=[search_tool],  # set this to [] if you don't have a BRAVE_SEARCH_API_KEY
        model=LLAMA31_8B_INSTRUCT,
        instructions="""
        You are a research assistant that can search the web.
        Always cite your sources with URLs when providing information.
        Format your responses as:

        FINDINGS:
        [Your summary here]

        SOURCES:
        - [Source title](URL)
        """,
    )


# Example usage
async def search_example():
    client = LlamaStackClient(base_url=LLAMA_STACK_API_TOGETHER_URL)
    agent = await create_search_agent(client)

    # Create a session
    session_id = agent.create_session("search-session")

    # Example queries
    queries = [
        "What are the latest developments in quantum computing?",
        # "Who won the most recent Super Bowl?",
    ]

    for query in queries:
        print(f"\nQuery: {query}")
        print("-" * 50)

        response = agent.create_turn(
            messages=[{"role": "user", "content": query}],
            session_id=session_id,
        )

        async for log in EventLogger().log(response):
            log.print()


# Run the example (Jupyter supports top-level await; in a plain script, use asyncio.run(search_example()))
await search_example()


Using the following model:  Llama3.1-8B-Instruct

Query: What are the latest developments in quantum computing?
--------------------------------------------------
inference> FINDINGS:
The latest developments in quantum computing involve significant advancements in the field of quantum processors, error correction, and the development of practical applications. Some of the recent breakthroughs include:

* Google's 53-qubit Sycamore processor, which achieved quantum supremacy in 2019 (Source: Google AI Blog, https://ai.googleblog.com/2019/10/experiment-advances-quantum-computing.html)
* The development of a 100-qubit quantum processor by the Chinese company, Origin Quantum (Source: Physics World, https://physicsworld.com/a/origin-quantum-scales-up-to-100-qubits/)
* IBM's 127-qubit Eagle processor, which has the potential to perform complex calculations that are currently unsolvable by classical computers (Source: IBM Research Blog, https://www.ibm.com/blogs/research/2020/11/ibm-advances-

## 3. Custom Tool Creation

Let's create a custom weather tool:

#### Key Highlights:
- **`WeatherTool` Class**: A custom tool that processes weather information requests, supporting location and optional date parameters.
- **Agent Creation**: The `create_weather_agent` function sets up an agent equipped with the `WeatherTool`, allowing for weather queries in natural language.
- **Simulation of API Call**: The `run_impl` method simulates fetching weather data. This method can be replaced with an actual API integration for real-world usage.
- **Interactive Example**: The `weather_example` function shows how to use the agent to handle user queries regarding the weather, providing step-by-step responses.
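
The dispatch step described in these highlights can be sketched in isolation, without any Llama Stack dependency: the arguments the model emits arrive as a plain dict, are unpacked into `run_impl` as keyword arguments, and the result is serialized to a JSON string (or to an error string if the call fails). The names `fake_run_impl` and `dispatch` below are hypothetical and exist only for this illustration; they are not part of `llama_stack_client`.

```python
import asyncio
import json
from typing import Any, Dict, Optional


async def fake_run_impl(location: str, date: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical stand-in for WeatherTool.run_impl; returns mock data."""
    if date:
        return {"temperature": 90.1, "conditions": "sunny", "humidity": 40.0}
    return {"temperature": 72.5, "conditions": "partly cloudy", "humidity": 65.0}


async def dispatch(arguments: Dict[str, Any]) -> str:
    """Unpack model-supplied arguments and return a string for the tool response."""
    try:
        result = await fake_run_impl(**arguments)
        return json.dumps(result, ensure_ascii=False)
    except Exception as e:
        # A wrong or missing argument name surfaces here as a TypeError.
        return f"Error when running tool: {e}"


# In a notebook cell, use `await dispatch(...)` instead of asyncio.run(...).
ok = asyncio.run(dispatch({"location": "Tokyo"}))
bad = asyncio.run(dispatch({"city": "Tokyo"}))
print(ok)
print(bad)
```

Returning the error as a string, rather than raising, means the model sees the failure in the tool response and can react to it, which is the same choice the tool's `run` method makes.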

In [None]:
import json
from datetime import datetime
from typing import Any, Dict, List, Optional

from llama_stack_client.lib.agents.custom_tool import CustomTool
from llama_stack_client.types import CompletionMessage, ToolResponseMessage
from llama_stack_client.types.tool_param_definition_param import (
    ToolParamDefinitionParam,
)


class WeatherTool(CustomTool):
    """Example custom tool for weather information."""

    def get_name(self) -> str:
        return "get_weather"

    def get_description(self) -> str:
        return "Get weather information for a location"

    def get_params_definition(self) -> Dict[str, ToolParamDefinitionParam]:
        return {
            "location": ToolParamDefinitionParam(
                param_type="str", description="City or location name", required=True
            ),
            "date": ToolParamDefinitionParam(
                param_type="str",
                description="Optional date (YYYY-MM-DD)",
                required=False,
            ),
        }

    async def run(self, messages: List[CompletionMessage]) -> List[ToolResponseMessage]:
        assert len(messages) == 1, "Expected single message"

        message = messages[0]

        tool_call = message.tool_calls[0]
        # location = tool_call.arguments.get("location", None)
        # date = tool_call.arguments.get("date", None)
        try:
            response = await self.run_impl(**tool_call.arguments)
            response_str = json.dumps(response, ensure_ascii=False)
        except Exception as e:
            response_str = f"Error when running tool: {e}"

        message = ToolResponseMessage(
            call_id=tool_call.call_id,
            tool_name=tool_call.tool_name,
            content=response_str,
            role="ipython",
        )
        return [message]

    async def run_impl(
        self, location: str, date: Optional[str] = None
    ) -> Dict[str, Any]:
        """Simulate getting weather data (replace with actual API call)."""
        # Mock implementation
        if date:
            return {"temperature": 90.1, "conditions": "sunny", "humidity": 40.0}
        return {"temperature": 72.5, "conditions": "partly cloudy", "humidity": 65.0}


async def create_weather_agent(client: LlamaStackClient) -> Agent:
    """Create an agent with weather tool capability."""

    # Create the agent with the tool
    weather_tool = WeatherTool()

    agent = Agent(
        client=client, 
        model=LLAMA31_8B_INSTRUCT,
        instructions="""
        You are a weather assistant that can provide weather information.
        Always specify the location clearly in your responses.
        Include both temperature and conditions in your summaries.
        """,
        sampling_params={
            "strategy": {
                "type": "greedy",
            },
        },
        tools=[weather_tool],
    )

    return agent


# Example usage
async def weather_example():
    client = LlamaStackClient(base_url=LLAMA_STACK_API_TOGETHER_URL)
    agent = await create_weather_agent(client)
    session_id = agent.create_session("weather-session")

    queries = [
        "What's the weather like in San Francisco?",
        "Tell me the weather in Tokyo tomorrow",
    ]

    for query in queries:
        print(f"\nQuery: {query}")
        print("-" * 50)

        response = agent.create_turn(
            messages=[{"role": "user", "content": query}],
            session_id=session_id,
        )

        async for log in EventLogger().log(response):
            log.print()


# For Jupyter notebooks
import nest_asyncio

nest_asyncio.apply()

# Run the example
await weather_example()



Query: What's the weather like in San Francisco?
--------------------------------------------------
inference> {
    "function": "get_weather",
    "parameters": {
        "location": "San Francisco"
    }
}

Query: Tell me the weather in Tokyo tomorrow
--------------------------------------------------
inference> {
    "function": "get_weather",
    "parameters": {
        "location": "Tokyo",
        "date": "tomorrow"
    }
}
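
In the second query the model passed `"date": "tomorrow"` even though the parameter description asks for `YYYY-MM-DD`. The mock `run_impl` only checks whether `date` is set, so this goes unnoticed, but a real integration would likely need to normalize such values first. A minimal sketch, assuming a hypothetical `normalize_date` helper that is not part of this tutorial or of the client library:

```python
from datetime import date, datetime, timedelta
from typing import Optional


def normalize_date(value: Optional[str], today: Optional[date] = None) -> Optional[str]:
    """Return an ISO date string (YYYY-MM-DD), or None if the value is unusable."""
    if not value:
        return None
    today = today or date.today()
    # Relative words the model might emit instead of a formatted date.
    relative = {"today": 0, "tomorrow": 1, "yesterday": -1}
    key = value.strip().lower()
    if key in relative:
        return (today + timedelta(days=relative[key])).isoformat()
    try:
        return datetime.strptime(key, "%Y-%m-%d").date().isoformat()
    except ValueError:
        return None


print(normalize_date("tomorrow", today=date(2024, 11, 1)))  # 2024-11-02
print(normalize_date("2024-12-25"))  # 2024-12-25
print(normalize_date("next week"))  # None
```

The `today` parameter is there only to make the helper deterministic in tests; callers would normally omit it.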


Thanks for checking out this tutorial! Hopefully you can now automate everything with Llama. :D

Next up, we cover another hot topic in LLMs: memory and RAG. Continue learning [here](./04_Memory101.ipynb)!