# Streaming with tool calling
Using tool calling with streaming responses requires a different approach than when using non-streaming chat completions. Let's walk through an example of how to do this.

To get started, first install dependencies.

In [None]:
%pip install writer-sdk python-dotenv -q

### Initialization

The cell below performs the initialization required for this notebook including the creation of an instance of the `Writer` object to interact with the LLM.

To create a Writer client object, you need an API key. [You can sign up for one for free](https://app.writer.com/register). 

Once you have an API key, we recommend that you store it as an environment variable in a `.env` file like so:

```
WRITER_API_KEY="{Your Writer API key goes here}"
```

When you instantiate the client with `client = Writer()`, the newly created object automatically looks for an environment variable named `WRITER_API_KEY` and completes the instantiation if and only if `WRITER_API_KEY` is defined. This notebook uses the `python-dotenv` library to define environment variables from the contents of a `.env` file in the same directory.

The `Writer()` initializer method also has an `api_key` parameter that you can use like this...

```
client = Writer(api_key="{Your Writer API key goes here}")
```

...but we strongly encourage you not to leave API keys in your source code.
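
If you'd like the notebook to fail early with a clear message when the key is missing, a check along these lines can run before the client is created. This is a minimal sketch that uses only the standard library; `require_api_key` is a hypothetical helper name chosen for illustration and is not part of the Writer SDK.

```python
import os


def require_api_key(name: str = "WRITER_API_KEY") -> str:
    """Return the named environment variable or raise a descriptive error.

    Hypothetical helper for illustration; not part of the Writer SDK.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it "
            "in your shell before creating the client."
        )
    return value
```

Calling `require_api_key()` after the `.env` file has been loaded gives a readable error at the top of the notebook rather than a failure further down.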

In [None]:
import json
from writerai import Writer

# Load environment variables from .env file
%reload_ext dotenv
%dotenv

client = Writer()

### Defining a tool

As with non-streaming tool calling, we need to define the tools that our model can use. In this example, we'll define a tool that calculates the mean of a list of numbers, along with the JSON schema that describes it to the model. Feel free to define any tools you'd like here, such as tools that perform database queries or interact with other external APIs.

In [None]:
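def calculate_mean(numbers: list) -> float:
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not numbers:
        raise ValueError("numbers must contain at least one value")
    return sum(numbers) / len(numbers)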

tools = [
    { 
        "type": "function",
        "function": {
            "name": "calculate_mean", 
            "description": "Calculate the mean (average) of a list of numbers.", 
            "parameters": { 
                "type": "object", 
                "properties": { 
                    "numbers": { 
                        "type": "array", 
                        "items": {"type": "number"}, 
                        "description": "List of numbers"
                    } 
                }, 
                "required": ["numbers"] 
            } 
        }
    }
]
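
Because the schema and the Python function are written separately, they can drift apart. The sketch below shows one way to check that they agree, using `inspect.signature` from the standard library. `schema_matches_function` is a hypothetical helper, and the function and schema are repeated here only so the snippet runs on its own.

```python
import inspect


def calculate_mean(numbers: list) -> float:
    return sum(numbers) / len(numbers)


tools = [
    {
        "type": "function",
        "function": {
            "name": "calculate_mean",
            "description": "Calculate the mean (average) of a list of numbers.",
            "parameters": {
                "type": "object",
                "properties": {
                    "numbers": {"type": "array", "items": {"type": "number"}}
                },
                "required": ["numbers"],
            },
        },
    }
]


def schema_matches_function(tool: dict, func) -> bool:
    """Check that the schema's name and property names match the function.

    Hypothetical helper for illustration; it compares names only, not types.
    """
    spec = tool["function"]
    declared = set(spec["parameters"]["properties"])
    actual = set(inspect.signature(func).parameters)
    return spec["name"] == func.__name__ and declared == actual


print(schema_matches_function(tools[0], calculate_mean))  # True
```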

### Calling the model

Now that we've defined our tool and schema, we can call the model. Set `stream=True` to get a streaming response and pass the `tools` and `tool_choice` parameters for tool calling. Use `tool_choice="auto"` to let the model decide when to use the tool.

In [None]:
messages = [{"role": "user", "content": "what is the mean of [1,3,5,7,9]?"}]

response = client.chat.chat(
    model="palmyra-x-004", 
    messages=messages, 
    tools=tools, 
    tool_choice="auto", 
    stream=True
)

### Processing the response

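Because a streaming response arrives in chunks, tool calls and content have to be assembled incrementally instead of being read from a single response object, as they would be with non-streaming chat completions.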

Here's a high-level overview of what the code below is doing:

**1. Initialization and iterating over the response**
- `streaming_content` is initialized as an empty string to accumulate content.
- `function_calls` is initialized as an empty list to store tool call information.
- The code iterates over `response`, which is the streaming response returned by the chat request.
- For each chunk, the first choice is accessed via `chunk.choices[0]`.

**2. Handling tool call and non-tool-call content**
- If `choice.delta.tool_calls` is not `None`, the model is requesting tool calls. Each chunk carries only a fragment of a tool call, so we need to collect the fragments across the stream and store them in `function_calls`.
  - For each `tool_call`, if an ID is present, a dictionary with the tool call ID is appended to `function_calls`.
  - If a function is specified, its name and arguments are appended to the last dictionary in `function_calls`.
- If `choice.delta.tool_calls` is `None` or empty but `choice.delta.content` is present, the chunk carries ordinary text, and the content is appended to `streaming_content`.
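
To make the fragment handling concrete, here is a minimal sketch that runs the same accumulation logic over simulated tool-call deltas. The `SimpleNamespace` objects stand in for the SDK's objects and expose only the attributes the notebook code reads (`id`, `function.name`, `function.arguments`). The call ID and the points where the arguments are split are made up for illustration; a real stream may split them differently.

```python
from types import SimpleNamespace as NS

# Simulated tool-call deltas (assumed shapes, for illustration only).
deltas = [
    NS(id="call_1", function=NS(name="calculate_mean", arguments="")),
    NS(id=None, function=NS(name=None, arguments='{"numbers": [1, 3,')),
    NS(id=None, function=NS(name=None, arguments=" 5, 7, 9]}")),
]

function_calls = []
for tool_call in deltas:
    if tool_call.id:
        # A new ID marks the start of a new tool call
        function_calls.append(
            {"name": "", "arguments": "", "call_id": tool_call.id}
        )
    if tool_call.function:
        # Later fragments extend the most recent tool call
        function_calls[-1]["name"] += tool_call.function.name or ""
        function_calls[-1]["arguments"] += tool_call.function.arguments or ""

print(function_calls[0]["arguments"])  # {"numbers": [1, 3, 5, 7, 9]}
```

Only once all fragments have been joined is the `arguments` string valid JSON, which is why parsing waits until the stream reports that the tool calls are complete.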

**3. Finish reasons**
- If `choice.finish_reason` is `stop`, it means the model has finished generating the response (and thus has made no further tool calls), and the accumulated `streaming_content` is appended to `messages` with the role `assistant`.
- If `choice.finish_reason` is `tool_calls`, it indicates the model has finished deciding which tools to call and we can now execute the tool calls.

**4. Executing tool calls**
- For each `function_call` in `function_calls`, if the function name is `calculate_mean`, the accumulated arguments string is parsed from JSON.
- The `calculate_mean` function is called with the parsed arguments, and the result is appended to `messages` with the role `tool`.
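
If you define more than one tool, checking each name with its own `if` gets unwieldy. One possible alternative, sketched below, is to look the function up in a dictionary. `TOOL_REGISTRY` and `run_tool_calls` are hypothetical names introduced for this example; the message fields mirror the ones used in this notebook.

```python
import json


def calculate_mean(numbers: list) -> float:
    return sum(numbers) / len(numbers)


# Hypothetical registry mapping tool names to Python callables
TOOL_REGISTRY = {"calculate_mean": calculate_mean}


def run_tool_calls(function_calls: list) -> list:
    """Execute accumulated tool calls and return `tool` role messages."""
    tool_messages = []
    for call in function_calls:
        func = TOOL_REGISTRY.get(call["name"])
        if func is None:
            # Unknown tool: skipped here; a real app might report an error
            continue
        arguments = json.loads(call["arguments"])
        result = func(**arguments)
        tool_messages.append(
            {
                "role": "tool",
                "content": str(result),
                "tool_call_id": call["call_id"],
                "name": call["name"],
            }
        )
    return tool_messages


calls = [
    {
        "name": "calculate_mean",
        "arguments": '{"numbers": [1, 3, 5, 7, 9]}',
        "call_id": "call_1",
    }
]
print(run_tool_calls(calls)[0]["content"])  # 5.0
```

The returned messages can then be added to `messages` with `messages.extend(...)` before the follow-up request is made.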

**5. Final response**
- A new chat request is made with the updated messages to get the final response.
- To demonstrate the streaming response, `print(choice.delta.content, end="", flush=True)` is used to print the content as it comes in.
- The streaming content is also accumulated in `final_streaming_content` and then printed when complete.
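
The print-and-accumulate pattern from this last step can also be wrapped in a small function. The sketch below uses simulated chunks so that it runs without an API call. `collect_stream_text` and `fake_chunk` are hypothetical names, and the fake chunks expose only the attributes the notebook code reads.

```python
from types import SimpleNamespace as NS


def collect_stream_text(stream) -> str:
    """Print content deltas as they arrive and return the full text."""
    parts = []
    for chunk in stream:
        choice = chunk.choices[0]
        if choice.delta and choice.delta.content:
            parts.append(choice.delta.content)
            print(choice.delta.content, end="", flush=True)
    print()
    return "".join(parts)


def fake_chunk(text):
    # Stand-in for a streamed chunk (assumed shape, for illustration only)
    return NS(choices=[NS(delta=NS(content=text), finish_reason=None)])


fake_stream = [fake_chunk("The mean "), fake_chunk("is 5."), fake_chunk(None)]
full_text = collect_stream_text(fake_stream)
```

Chunks whose `content` is `None` are skipped, so the helper returns only the text that was actually streamed.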

In [None]:
streaming_content = ""
function_calls = []

for chunk in response:
    choice = chunk.choices[0]

    if choice.delta:
        # Check for tool calls
        if choice.delta.tool_calls:
            for tool_call in choice.delta.tool_calls:
                if tool_call.id:
                    # Append an empty dictionary to the function_calls list with the tool call ID
                    function_calls.append(
                        {"name": "", "arguments": "", "call_id": tool_call.id}
                    )
                if tool_call.function:
                    # Append function name and arguments to the last dictionary in the function_calls list
                    function_calls[-1]["name"] += (
                        tool_call.function.name
                        if tool_call.function.name
                        else ""
                    )
                    function_calls[-1]["arguments"] += (
                        tool_call.function.arguments
                        if tool_call.function.arguments
                        else ""
                    )
        # Handle non-tool-call content
        elif choice.delta.content:
            streaming_content += choice.delta.content
            print(choice.delta.content, end="", flush=True)

        # A finish reason of stop means the model has finished generating the response
        if choice.finish_reason == "stop":
            messages.append({"role": "assistant", "content": streaming_content})

        # A finish reason of tool_calls means the model has finished deciding which tools to call
        elif choice.finish_reason == "tool_calls":
            for function_call in function_calls:
                if function_call["name"] == "calculate_mean":
                    arguments_dict = json.loads(function_call["arguments"])
                    function_response = calculate_mean(arguments_dict["numbers"])

                    messages.append(
                        {
                            "role": "tool",
                            "content": str(function_response),
                            "tool_call_id": function_call["call_id"],
                            "name": function_call["name"],
                        }
                    )
               
            # Request the final answer once, after every tool result has been added
            final_response = client.chat.chat(
                model="palmyra-x-004", messages=messages, stream=True
            )

            final_streaming_content = ""
            for chunk in final_response:
                choice = chunk.choices[0]
                if choice.delta and choice.delta.content:
                    final_streaming_content += choice.delta.content
                    print(choice.delta.content, end="", flush=True)
            print(f"\n\nCompleted response: {final_streaming_content}")

With that, we've seen how to use tool calling with streaming responses. To learn more about tool calling, check out the [tool calling guide](https://dev.writer.com/api-guides/tool-calling) in the Writer docs.