# Simulated Function Calling with Open Models and OpenAI Python SDK

This notebook demonstrates how to simulate function (tool) calling with Llama models using the OpenAI Python SDK.  
This is an alternative to OpenAI's built-in function calling, and works with any model that can follow instructions and output JSON.

We pass the `response_format={"type": "json_object"}` parameter to request valid JSON output, and the system prompt instructs the model to emit only the required JSON structure.

> **Note:** This approach is especially useful for open models that do not natively support OpenAI's function calling API.

---

## Prerequisites

1. Sign up for an account at [Featherless](https://featherless.ai/register)
2. Subscribe to a plan and get your API key from [API Keys](https://featherless.ai/account/api-keys)

## Setup
First, let's import the required libraries and set up our API key.

In [1]:
# Install the OpenAI Python SDK if you haven't already
# !pip install openai

In [2]:
from openai import OpenAI

# Initialize the OpenAI client with your endpoint and API key
client = OpenAI(
    base_url="https://api.featherless.ai/v1",
    api_key="YOUR FEATHERLESS API KEY",
)
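
Hardcoding the key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of reading it from the environment instead; the variable name `FEATHERLESS_API_KEY` and the helper `load_api_key` are assumptions made for illustration, not something the Featherless documentation prescribes.

```python
import os


def load_api_key(env=os.environ, name="FEATHERLESS_API_KEY"):
    """Return the API key stored under `name`, or raise if it is missing or blank.

    `name` is an assumed variable name; pick whatever fits your setup.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before running this notebook.")
    return key


# Usage with the client from the cell above (uncomment once the variable is set):
# client = OpenAI(base_url="https://api.featherless.ai/v1", api_key=load_api_key())
```

Passing `env` as a parameter keeps the helper easy to test with a plain dictionary.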

## Example 1: Simulate a Single Function Call (`get_weather`)

We instruct the model to always respond with a JSON object calling the `get_weather` function, with a `location` argument.

In [3]:
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a function calling assistant. "
                "When asked a question, respond ONLY in the following JSON format:\n"
                '{\n  "function": "get_weather",\n  "arguments": {"location": "<city, country>"}\n}\n'
                "Do not answer the question directly. Only output the JSON."
            ),
        },
        {
            "role": "user",
            "content": "What is the weather like in Paris today?"
        }
    ],
    # Request JSON output, as in the later examples
    response_format={"type": "json_object"},
)
print(response)
print("Simulated Tool Calling Output:", response.choices[0].message.content)

ChatCompletion(id='4XjrRd', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='{\n  "function": "get_weather",\n  "arguments": {"location": "Paris, France"}\n}', refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=None))], created=1748615619430, model='meta-llama/Meta-Llama-3.1-8B-Instruct', object='chat.completion', service_tier=None, system_fingerprint='', usage=CompletionUsage(completion_tokens=22, prompt_tokens=71, total_tokens=93, completion_tokens_details=None, prompt_tokens_details=None))
Simulated Tool Calling Output: {
  "function": "get_weather",
  "arguments": {"location": "Paris, France"}
}
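
The output above is only a description of a call; nothing has been executed yet. One way to act on it is to parse the JSON and look the function name up in a registry of local callables. The sketch below assumes a stub `get_weather` and a hypothetical `dispatch` helper, neither of which is part of the notebook or the SDK.

```python
import json


def get_weather(location: str) -> str:
    # Hypothetical stub; a real version would query a weather service.
    return f"(stub) weather report for {location}"


# Maps the names the model may emit to local Python callables.
FUNCTIONS = {"get_weather": get_weather}


def dispatch(content: str):
    """Parse the model's JSON reply and call the function it names."""
    call = json.loads(content)
    func = FUNCTIONS[call["function"]]  # raises KeyError if the name is unknown
    return func(**call["arguments"])


# In the notebook this string would be response.choices[0].message.content.
content = '{\n  "function": "get_weather",\n  "arguments": {"location": "Paris, France"}\n}'
result = dispatch(content)
print(result)
```

Keeping the registry explicit means the model can only trigger functions you have deliberately exposed.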


## Example 2: Simulate Multiple Functions (`get_weather` or `get_time`)

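Here, we tell the model it can call either `get_weather` or `get_time` and must respond with the JSON for the function it chooses. The format holds a single call, so when a question asks for both, as the one below does, the model returns only one of them.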

In [4]:
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a function calling assistant. "
                "You have access to two functions:\n"
                '1. get_weather(location: str)\n'
                '2. get_time(location: str)\n'
                "When asked a question, respond ONLY in the following JSON format:\n"
                '{\n  "function": "<function_name>",\n  "arguments": {"location": "<city, country>"}\n}\n'
                "Do not answer the question directly. Only output the JSON."
            ),
        },
        {
            "role": "user",
            "content": "What's the weather and time in Tokyo?"
        }
    ],
    response_format={"type": "json_object"},
)
print("Simulated Tool Calling (Multiple Functions) Output:", response.choices[0].message.content)

Simulated Tool Calling (Multiple Functions) Output: {
  "function": "get_weather",
  "arguments": {"location": "Tokyo, Japan"}
}
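
Once the model can choose between functions, it helps to check its choice before acting on it. The following is a sketch of such a check, assuming the two function names and the single `location` argument from the system prompt above; `ALLOWED` and `validate_call` are illustrative names, not part of any library.

```python
import json

# Allowed function names and the argument names each one requires
# (taken from the system prompt in this example).
ALLOWED = {"get_weather": {"location"}, "get_time": {"location"}}


def validate_call(content: str) -> dict:
    """Parse a reply; raise ValueError unless it is a well-formed call to an allowed function."""
    try:
        call = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(call, dict):
        raise ValueError("reply must be a JSON object")
    name = call.get("function")
    if not isinstance(name, str) or name not in ALLOWED:
        raise ValueError(f"unknown function: {name!r}")
    args = call.get("arguments")
    if not isinstance(args, dict) or set(args) != ALLOWED[name]:
        raise ValueError(f"arguments for {name} must be exactly {sorted(ALLOWED[name])}")
    return call


call = validate_call('{"function": "get_weather", "arguments": {"location": "Tokyo, Japan"}}')
print(call["function"], call["arguments"])
```

Raising a single exception type makes it straightforward to catch a bad reply and retry the request.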


## Example 3: Force a Specific Function (`get_time`)

You can instruct the model to always call a specific function, whatever the user asks. This is a prompt-level instruction rather than a hard constraint, so the reply is still worth checking.

In [6]:
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a function calling assistant. "
                "You must always call the function get_time. "
                "Respond ONLY in the following JSON format:\n"
                '{\n  "function": "get_time",\n  "arguments": {"location": "<city, country>"}\n}\n'
                "Do not answer the question directly. Only output the JSON."
            ),
        },
        {
            "role": "user",
            "content": "Tell me the time in New York."
        }
    ],
    response_format={"type": "json_object"},
)
print("Simulated Tool Calling (Forced Function) Output:", response.choices[0].message.content)

Simulated Tool Calling (Forced Function) Output: {
  "function": "get_time",
  "arguments": {"location": "New York, USA"}
}
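
The same prompt can be generated for any function name, and the reply can be checked against the name that was requested. A rough sketch under those assumptions follows; `build_forced_prompt` and `called_forced_function` are hypothetical helpers, and the generated JSON template is indented slightly differently from the hand-written one above.

```python
import json


def build_forced_prompt(function_name: str) -> str:
    """Recreate a system prompt like the one above for an arbitrary function name."""
    example = json.dumps(
        {"function": function_name, "arguments": {"location": "<city, country>"}},
        indent=2,
    )
    return (
        "You are a function calling assistant. "
        f"You must always call the function {function_name}. "
        "Respond ONLY in the following JSON format:\n"
        f"{example}\n"
        "Do not answer the question directly. Only output the JSON."
    )


def called_forced_function(content: str, function_name: str) -> bool:
    """Return True only if the reply is a JSON object naming the forced function."""
    try:
        call = json.loads(content)
    except json.JSONDecodeError:
        return False
    return isinstance(call, dict) and call.get("function") == function_name


reply = '{\n  "function": "get_time",\n  "arguments": {"location": "New York, USA"}\n}'
print(called_forced_function(reply, "get_time"))
```

If the check fails, a caller could resend the request or fall back to a default, depending on the application.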


## Summary

- This approach works with any model that can follow instructions and output JSON, including open models.
- You can simulate function calling by carefully crafting your system prompt and using `response_format={"type": "json_object"}`.
- This is a flexible alternative to OpenAI's built-in function calling, especially for open models or custom endpoints.
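
Some models or endpoints may wrap the JSON in extra prose or a code fence, particularly when JSON mode is unavailable. A defensive sketch that pulls the first JSON object out of such a reply is shown here; `extract_json_object` is an illustrative helper built on the standard library's `json.JSONDecoder.raw_decode`, not something the SDK provides.

```python
import json


def extract_json_object(text: str) -> dict:
    """Return the first JSON object found in `text`, ignoring any surrounding text."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # tolerates trailing characters after it.
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pass
        else:
            if isinstance(obj, dict):
                return obj
        start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


noisy = (
    "Sure! Here is the call:\n"
    '{"function": "get_time", "arguments": {"location": "New York, USA"}}\n'
    "Let me know if you need anything else."
)
print(extract_json_object(noisy))
```

The helper returns only the first object it can parse, which matches the single-call format used throughout this notebook.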

---