# SGLang – Tool Calling Feature Documentation

This guide demonstrates how to use SGLang’s tool calling (function calling) support, using a `get_current_weather` function as the running example. You can replace it with, or add, any tool functions your use case requires.

## Launching the Server
First, you need to launch the SGLang server so it can handle incoming requests. The server is started with the `sglang.launch_server` command.

**Example command:**

```bash
python -m sglang.launch_server \
  --model-path meta-llama/Meta-Llama-3.1-8B-Instruct \
  --tool-call-parser llama3 \
  --port 30000 \
  --host 0.0.0.0
```

-   **`--model-path`**: Specifies the path of the model to be used, e.g., `meta-llama/Meta-Llama-3.1-8B-Instruct`.
-   **`--tool-call-parser`**: Selects the parser that turns the model's raw output into structured tool calls. It should match the model family. Currently supported parsers include:
    -   llama3: Llama 3.1 / 3.2 (e.g. `meta-llama/Llama-3.1-8B-Instruct`, `meta-llama/Llama-3.2-1B-Instruct`)
    -   mistral: Mistral (e.g. `mistralai/Mistral-7B-Instruct-v0.3`, `mistralai/Mistral-Nemo-Instruct-2407`)
    -   qwen25: Qwen 2.5 (e.g. `Qwen/Qwen2.5-1.5B-Instruct`, `Qwen/Qwen2.5-7B-Instruct`)
-   **`--port`**: Sets the port number for the server (e.g., 30000).
-   **`--host`**: Sets the hostname or IP address the server binds to (`0.0.0.0` listens on all network interfaces, so the server is reachable from other machines).
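
The flags above can also be assembled programmatically, which helps when switching between model families. The following is a minimal sketch, not part of SGLang: `build_launch_command` and `SUPPORTED_PARSERS` are hypothetical names introduced here, and the parser names are simply the three listed above.

```python
import shlex

# Parser names copied from the list above; this set is an illustration,
# not something exported by SGLang.
SUPPORTED_PARSERS = {"llama3", "mistral", "qwen25"}


def build_launch_command(model_path, parser, port=30000, host="0.0.0.0"):
    """Return the server launch command as an argument list (hypothetical helper)."""
    if parser not in SUPPORTED_PARSERS:
        raise ValueError(f"Unsupported tool-call parser: {parser!r}")
    return [
        "python", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--tool-call-parser", parser,
        "--port", str(port),
        "--host", host,
    ]


command = build_launch_command("Qwen/Qwen2.5-7B-Instruct", "qwen25")
# shlex.join quotes arguments where needed, giving a string that is safe to paste into a shell.
print(shlex.join(command))
```

Rejecting an unknown parser name before launch gives a clearer error than waiting for the server process to fail.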

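The following cell launches the server from Python and waits until it is ready. Note that it uses port 30222 rather than the 30000 shown in the example command, so the client in the later steps connects to port 30222.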

In [None]:
from openai import OpenAI
import json
from sglang.utils import execute_shell_command, wait_for_server, terminate_process, print_highlight

server_process = execute_shell_command(
    "python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --tool-call-parser llama3 --port 30222 --host 0.0.0.0"  # llama3
)
wait_for_server("http://localhost:30222")

Once the server is running, you’ll be ready to interact with it using a client, as described in the subsequent steps.

## Define Tools for Function Call
Below is a Python snippet that shows how to define a tool as a dictionary. The dictionary includes the tool's name, a description, and a `parameters` object (a JSON Schema) that describes each argument and marks which ones are required.

In [None]:
# Define tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]
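
Because the `parameters` entry is a JSON Schema, it can also be used to sanity-check the arguments a model produces before calling the real function. The sketch below is a deliberately small check covering only `required`, string types, and `enum`; `check_arguments` is a hypothetical helper, not an SGLang or OpenAI API, and a full validator library would be a better fit for complex schemas.

```python
def check_arguments(parameters, arguments):
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    properties = parameters.get("properties", {})
    # Every required argument must be present.
    for name in parameters.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    # Every supplied argument must be declared and match its basic constraints.
    for name, value in arguments.items():
        spec = properties.get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
            continue
        if spec.get("type") == "string" and not isinstance(value, str):
            problems.append(f"argument {name} should be a string")
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"argument {name} must be one of {spec['enum']}")
    return problems


# Same schema as the "parameters" entry of get_current_weather.
weather_parameters = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location"],
}

print(check_arguments(weather_parameters, {"location": "Boston, MA"}))
print(check_arguments(weather_parameters, {"unit": "kelvin"}))
```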

## Define Messages

In [None]:
messages = [
    {
        "role": "user",
        "content": "What's the weather like in Boston today? Please respond with the format: Today's weather is :{function call result}",
    }
]

## Initialize the Client

Use the OpenAI Python client to communicate with the SGLang server through its OpenAI-compatible endpoint. The example passes a placeholder API key (`"None"`), which is sufficient for local testing; replace it with a real key if your deployment requires one.

In [None]:
# Initialize OpenAI-like client
client = OpenAI(api_key="None", base_url="http://localhost:30222/v1")
model_name = client.models.list().data[0].id

## Non-Streaming Request

In [None]:
# Non-streaming mode test
response_non_stream = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    stream=False,  # Non-streaming
    tools=tools,
)
print_highlight("Non-stream response:")
print(response_non_stream)
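
In non-streaming mode the complete tool call arrives in a single response, so it can be read directly from the first choice's message. The sketch below assumes the response follows the OpenAI chat-completion shape (`choices[0].message.tool_calls`, each with `function.name` and `function.arguments`). `extract_tool_calls` is a hypothetical helper, and `fake_response` is a stand-in object used only so the example runs without a server.

```python
import json
from types import SimpleNamespace


def extract_tool_calls(response):
    """Return (name, arguments_dict) pairs from a non-streaming chat completion."""
    message = response.choices[0].message
    calls = []
    # tool_calls is None when the model answered in plain text instead.
    for tool_call in message.tool_calls or []:
        arguments = json.loads(tool_call.function.arguments)
        calls.append((tool_call.function.name, arguments))
    return calls


# Stand-in that mimics only the attributes read above; not a real server response.
fake_call = SimpleNamespace(
    function=SimpleNamespace(
        name="get_current_weather",
        arguments='{"location": "Boston, MA", "unit": "fahrenheit"}',
    )
)
fake_message = SimpleNamespace(content=None, tool_calls=[fake_call])
fake_response = SimpleNamespace(choices=[SimpleNamespace(message=fake_message)])

print(extract_tool_calls(fake_response))
```

With a live server, passing `response_non_stream` in place of `fake_response` should work the same way, provided the model chose to call a tool.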

## Streaming Request

In [None]:
# Streaming mode test
print_highlight("Streaming response:")
response_stream = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    stream=True,  # Enable streaming
    tools=tools,
)

chunks = []
for chunk in response_stream:
    chunks.append(chunk)
    if chunk.choices[0].delta.tool_calls:
        print(chunk.choices[0].delta.tool_calls[0])


## Handle Tool Calls

When the model decides to call a tool, the server returns the tool name and its arguments in the response. In streaming mode the arguments arrive as partial JSON fragments spread across several chunks, so you need to concatenate them before parsing. Non-streaming mode also supports function calling; there the arguments arrive complete in a single response.

In [None]:
# Parse and combine function call arguments
arguments = []
for chunk in chunks:
    choice = chunk.choices[0]
    delta = choice.delta
    if delta.tool_calls:
        tool_call = delta.tool_calls[0]
        if tool_call.function.name:
            print(f"Streamed function call name: {tool_call.function.name}")

        if tool_call.function.arguments:
            arguments.append(tool_call.function.arguments)
            print(f"Streamed function call arguments: {tool_call.function.arguments}")

# Combine all fragments into a single JSON string
full_arguments = "".join(arguments)
print_highlight(f"Final streamed function call arguments: {full_arguments}")
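
The loop above reads only the first tool call in each chunk. If a model emits several tool calls in one turn, the fragments can be grouped by the `index` field that the OpenAI streaming format attaches to each tool-call delta. The following is a sketch under that assumption (verify that your parser populates `index`); `accumulate_tool_calls` and `fake_chunk` are hypothetical names, and the fake chunks exist only so the example runs without a server.

```python
from types import SimpleNamespace


def accumulate_tool_calls(chunks):
    """Merge streamed tool-call deltas into {index: {"name": str, "arguments": str}}."""
    calls = {}
    for chunk in chunks:
        if not chunk.choices:
            continue
        for delta_call in chunk.choices[0].delta.tool_calls or []:
            function = delta_call.function
            if function is None:
                continue
            entry = calls.setdefault(delta_call.index, {"name": "", "arguments": ""})
            if function.name:
                entry["name"] = function.name
            if function.arguments:
                # Argument fragments are pieces of one JSON string; append in arrival order.
                entry["arguments"] += function.arguments
    return calls


def fake_chunk(index, name=None, arguments=None):
    """Build a stand-in chunk exposing only the attributes read above."""
    call = SimpleNamespace(index=index, function=SimpleNamespace(name=name, arguments=arguments))
    delta = SimpleNamespace(tool_calls=[call])
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


stream = [
    fake_chunk(0, name="get_current_weather"),
    fake_chunk(0, arguments='{"location": '),
    fake_chunk(0, arguments='"Boston, MA"}'),
]
merged = accumulate_tool_calls(stream)
print(merged[0]["name"], merged[0]["arguments"])
```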

## Define a Tool Function

Next, define the actual function that implements the tool’s logic. In this example, `get_current_weather` simply returns a hard-coded weather description for whatever location it is given.

In [None]:
# This is a demonstration; define the real function according to your use case.
# `unit` has a default because the tool schema only requires `location`.
def get_current_weather(location: str, unit: str = "fahrenheit"):
    # Here you can integrate an actual weather API
    return f"The weather in {location} is 85 degrees {unit}. It is partly cloudy, with highs in the 90's."


# Registry mapping tool names to the functions that implement them
available_tools = {"get_current_weather": get_current_weather}


## Execute the Tool

In [None]:
# Parse JSON arguments
call_data = json.loads(full_arguments)

# Record the model's tool call in the message list.
# The tool call was produced by the model, so it belongs to the assistant role.
messages.append(
    {
        "role": "assistant",
        "content": "",
        "tool_calls": {"name": "get_current_weather", "arguments": full_arguments},
    }
)

# Call the corresponding tool function
tool_name = messages[-1]["tool_calls"]["name"]
tool_to_call = available_tools[tool_name]
result = tool_to_call(**call_data)
print_highlight(f"Function call result: {result}")
messages.append({"role": "tool", "content": result, "name": tool_name})

print_highlight(f"Updated message history: {messages}")
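
The cell above assumes the model named a known tool and produced valid JSON. A more defensive dispatcher can turn those failure cases into error strings that are sent back as the tool result instead of raising. The sketch below is one possible approach: `dispatch_tool_call` and `demo_weather` are hypothetical names, and the error message wording is arbitrary.

```python
import json


def dispatch_tool_call(name, arguments_json, registry):
    """Run a registered tool and always return a string usable as tool message content."""
    tool = registry.get(name)
    if tool is None:
        return f"Error: unknown tool {name!r}"
    try:
        arguments = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return f"Error: arguments are not valid JSON ({exc.msg})"
    if not isinstance(arguments, dict):
        return "Error: arguments must be a JSON object"
    try:
        return str(tool(**arguments))
    except TypeError as exc:
        # Raised for missing or unexpected keyword arguments
        # (and also for a TypeError raised inside the tool itself).
        return f"Error: bad arguments for {name}: {exc}"


def demo_weather(location, unit="fahrenheit"):
    """Stand-in tool with the same signature shape as get_current_weather."""
    return f"The weather in {location} is 85 degrees {unit}."


registry = {"get_current_weather": demo_weather}
print(dispatch_tool_call("get_current_weather", '{"location": "Boston, MA"}', registry))
print(dispatch_tool_call("get_stock_price", "{}", registry))
print(dispatch_tool_call("get_current_weather", '{"location": ', registry))
```

Returning the error as text lets the model see what went wrong and, if the conversation continues, retry with corrected arguments.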

In [None]:
terminate_process(server_process)