# MLX Server Function Calling Example

This is a detailed text version of the function calling example for MLX Server, which exposes an OpenAI-compatible API.

## Setup

In [50]:
from openai import OpenAI

## Initialize the client

Connect to your local MLX server:

In [51]:
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="mlx-server-api-key",
)

## Function calling example

This example demonstrates how to use function calling with the MLX server:

In [57]:
# Define the user message
messages = [
    {
        "role": "user",
        "content": "What is the weather in Tokyo?"
    }
]

# Define the available tools/functions
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather in a given city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "The city to get the weather for"}
                },
                "required": ["city"]
            }
        }
    }
]

# Make the API call
completion = client.chat.completions.create(
    model="mlx-server-model",
    messages=messages,
    tools=tools,
    tool_choice="auto",
    max_tokens=512,
    # enable_thinking is not a standard OpenAI parameter, so it is sent via extra_body
    extra_body={
        "enable_thinking": True
    }
)

# Get the result
print(completion.choices[0].message.tool_calls)

[ChatCompletionMessageToolCall(id='call-1746182514150232', function=Function(arguments='{"city": "Tokyo"}', name='get_weather'), type='function')]
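
The server only returns the name and arguments of the function to call; running it is left to the client. The following is a minimal sketch of that step, not part of the original notebook. It assumes a hypothetical local `get_weather` implementation that returns canned data, and a hypothetical helper `run_tool_call` that builds the `tool` message in the shape used by the OpenAI chat format. The call ID and arguments are copied from the output above so the snippet runs without a server.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical stand-in: returns canned data, not a real forecast."""
    return {"city": city, "forecast": "sunny", "temperature_c": 22}


# Map tool names (as declared in `tools`) to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_call(call_id: str, name: str, arguments: str) -> dict:
    """Execute one tool call and build the `tool` message to send back."""
    func = TOOL_REGISTRY[name]
    kwargs = json.loads(arguments)  # arguments arrive as a JSON string
    result = func(**kwargs)
    return {
        "role": "tool",
        "tool_call_id": call_id,
        "content": json.dumps(result),
    }


# Values copied from the tool call printed above.
tool_message = run_tool_call(
    call_id="call-1746182514150232",
    name="get_weather",
    arguments='{"city": "Tokyo"}',
)
print(tool_message)
```

Against a live server, the same helper could be fed `call.id`, `call.function.name`, and `call.function.arguments` for each entry in `completion.choices[0].message.tool_calls`. Appending the assistant message and the resulting `tool` message to `messages` and calling `client.chat.completions.create` again is the usual OpenAI pattern for obtaining the final answer; whether the MLX server handles that follow-up request identically is an assumption here.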


In [58]:
print(completion.choices[0].message.content)

<think>
Okay, the user is asking for the weather in Tokyo. Let me check the tools provided. There's a function called get_weather that takes a city parameter. So I need to call that function with the city set to Tokyo. I'll make sure the JSON is correctly formatted with the city name as a string. Let me double-check the parameters. The function requires "city" as a string, so the arguments should be {"city": "Tokyo"}. Alright, that's all I need for the tool call.
</think>
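
With `enable_thinking` on, the message content above holds the model's reasoning wrapped in `<think>` tags, and nothing else, because the actual answer is the tool call. If the reasoning needs to be separated from any visible answer text, a sketch like the following may help. The helper name `split_thinking` is hypothetical, and the code assumes at most one `<think>...</think>` block in the format shown above.

```python
import re
from typing import Optional, Tuple

# Non-greedy match so only the first <think>...</think> block is captured.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(content: Optional[str]) -> Tuple[str, str]:
    """Return (thinking, answer) extracted from a message content string."""
    if not content:
        return "", ""
    match = THINK_PATTERN.search(content)
    if match is None:
        return "", content.strip()
    thinking = match.group(1).strip()
    # Everything outside the think block is treated as the visible answer.
    answer = (content[:match.start()] + content[match.end():]).strip()
    return thinking, answer


# Shortened sample in the same shape as the output above.
sample = "<think>\nThe user wants the weather in Tokyo.\n</think>\n\n"
thinking, answer = split_thinking(sample)
print(repr(thinking))
print(repr(answer))
```

For a tool-call response like the one above, `answer` comes back empty, which is a simple signal that the client should look at `message.tool_calls` instead.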


## Streaming version

In [59]:
# Set stream=True in the API call
completion = client.chat.completions.create(
    model="mlx-server-model",
    messages=messages,
    tools=tools,
    tool_choice="auto",
    stream=True,
    extra_body={
        "enable_thinking": True
    }
)

# Process the streaming response
for chunk in completion:
    delta = chunk.choices[0].delta
    print(delta)

ChoiceDelta(content='<think>', function_call=None, refusal=None, role='assistant', tool_calls=None)
ChoiceDelta(content='\n', function_call=None, refusal=None, role='assistant', tool_calls=None)
ChoiceDelta(content='Okay', function_call=None, refusal=None, role='assistant', tool_calls=None)
ChoiceDelta(content=',', function_call=None, refusal=None, role='assistant', tool_calls=None)
ChoiceDelta(content=' the', function_call=None, refusal=None, role='assistant', tool_calls=None)
ChoiceDelta(content=' user', function_call=None, refusal=None, role='assistant', tool_calls=None)
ChoiceDelta(content=' is', function_call=None, refusal=None, role='assistant', tool_calls=None)
ChoiceDelta(content=' asking', function_call=None, refusal=None, role='assistant', tool_calls=None)
ChoiceDelta(content=' for', function_call=None, refusal=None, role='assistant', tool_calls=None)
ChoiceDelta(content=' the', function_call=None, refusal=None, role='assistant', tool_calls=None)
ChoiceDelta(content=' weather