# Pulze.ai API Tutorial

This tutorial guides you through the process of interacting with the Pulze.ai API using the `openai` Python library. You will learn how to set up API requests, customize response preferences, and utilize custom prompts.

## Installation

Begin by installing the necessary Python package:

In [3]:
!pip install openai==1.3.7


# Basic Setup

## Creating an App and Configuring the API Client

- Create an example app on the [Pulze Platform](https://platform.pulze.ai).
- Retrieve your API Key from [https://platform.pulze.ai](https://platform.pulze.ai).

In [4]:
import json
import openai

In [6]:
from getpass import getpass

openai.api_key = getpass('Enter your Pulze API Key: ')  # Available when creating a new app within the Pulze platform
openai.base_url = "https://api.pulze.ai/v1/"

Enter your Pulze API Key:  ········
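
For non-interactive runs, the key could also come from an environment variable, with `getpass` as a fallback. The following is a minimal sketch under stated assumptions: the variable name `PULZE_API_KEY` and the helper `load_api_key` are hypothetical names chosen for illustration, not something the Pulze platform defines.

```python
import os
from getpass import getpass


def load_api_key(env_var: str = "PULZE_API_KEY") -> str:
    """Return the API key from the environment, prompting only if it is unset.

    The environment variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fall back to the interactive prompt used in the cell above.
        key = getpass("Enter your Pulze API Key: ").strip()
    if not key:
        raise ValueError("No API key provided")
    return key
```

With this helper, the assignment above could become `openai.api_key = load_api_key()`.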


# Customizing Requests

## Setting Custom Labels and Weights

Configure your request preferences in terms of cost, quality, and latency. In the first example, we prefer quality.

In [27]:
# Set up your custom labels and weights
labels = {"foo": "bar", "group": "standard"}
weights = {"cost": 0, "quality": 1, "latency": 0}

headers = {
    "Pulze-Labels": json.dumps(labels),
    "Pulze-Weights": json.dumps(weights),
}

openai.default_headers = headers

chat_response = openai.chat.completions.create(
    model="pulze-v0",
    messages=[{"role": "user", "content": "I am bad at math, what is 1+1?"}],
)

# Assuming chat_response has properties like 'model', 'metadata', etc.
# and metadata has properties like 'costs' and 'latency'
# and costs has a property 'total_tokens'
print(
    f"Answer by {chat_response.model} with total costs of {chat_response.metadata['costs']['total_tokens']:.6f}$ {chat_response.metadata['latency']}s:\n"
)
print(chat_response.choices[0].message.content)



Answer by openai/text-davinci-003 with total costs of 0.000740$ 0.412s:

[COMPUTER]:
That's an easy one - the answer is
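
Because the same header dictionary is rebuilt in each of the following cells, it may help to wrap the encoding step in a small helper. This is only a sketch: `build_pulze_headers` is a hypothetical name, and the check that weights lie between 0 and 1 is an assumption based on the values used in this tutorial, not a documented rule. The header names are the ones used in this notebook.

```python
import json
from typing import Dict, Optional


def build_pulze_headers(
    labels: Dict[str, str],
    weights: Dict[str, float],
    policies: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """JSON-encode labels, weights, and optional policies into request headers."""
    for name, value in weights.items():
        # Assumption for this sketch: weights are expected to lie in [0, 1].
        if not 0 <= value <= 1:
            raise ValueError(f"Weight {name!r} must be between 0 and 1, got {value}")
    headers = {
        "Pulze-Labels": json.dumps(labels),
        "Pulze-Weights": json.dumps(weights),
    }
    if policies is not None:
        headers["Pulze-Policies"] = json.dumps(policies)
    return headers


headers = build_pulze_headers(
    labels={"foo": "bar", "group": "standard"},
    weights={"cost": 0, "quality": 1, "latency": 0},
)
print(headers["Pulze-Weights"])
```

The returned dictionary can be assigned to `openai.default_headers` exactly as in the cell above.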


## Interpreting the Response

Print the raw response to inspect every field it carries, including the Pulze-specific `metadata`.

In [22]:
print("####### Raw Response #######")
print(chat_response)
print("#################################")

####### Raw Response #######
ChatCompletion(id='2315fc76-3479-4565-a120-4254aebe89e6', choices=[Choice(finish_reason=None, index=0, message=ChatCompletionMessage(content='[CHATBOT]:\nThe answer is 2.', role='assistant', function_call=None, tool_calls=None))], created=1701483003546, model='openai/text-davinci-003', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=12, prompt_tokens=21, total_tokens=33), metadata={'app_id': '92bca216-e64c-4bcb-a70a-e2359a6df125', 'model': {'model': 'text-davinci-003', 'provider': 'openai', 'namespace': 'openai/text-davinci-003'}, 'costs': {'total_tokens': 0.00066, 'prompt_tokens': 0.00042, 'completion_tokens': 0.00024}, 'cost_savings': {'total_tokens': 0.00132, 'prompt_tokens': 0.00084, 'completion_tokens': 0.00048}, 'latency': 0.601, 'category': 'Arts & Crafts', 'labels': {'foo': 'bar', 'group': 'standard', 'weights_cost': '0.0', 'weights_latency': '0.0', 'weights_quality': '1.0'}, 'scores': {'best_models': [{'op
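
To pull out just the routing, cost, and latency details, the `metadata` dictionary can be reduced to a few fields. The sketch below works on a sample dictionary copied from the visible part of the raw response above (the output is truncated, so later fields are left out); `summarize_metadata` is a hypothetical helper name.

```python
from typing import Any, Dict

# Sample values copied from the visible part of the raw response above.
sample_metadata: Dict[str, Any] = {
    "model": {
        "model": "text-davinci-003",
        "provider": "openai",
        "namespace": "openai/text-davinci-003",
    },
    "costs": {"total_tokens": 0.00066, "prompt_tokens": 0.00042, "completion_tokens": 0.00024},
    "cost_savings": {"total_tokens": 0.00132, "prompt_tokens": 0.00084, "completion_tokens": 0.00048},
    "latency": 0.601,
    "category": "Arts & Crafts",
}


def summarize_metadata(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Pick out the routing, cost, and latency fields, tolerating missing keys."""
    return {
        "model": metadata.get("model", {}).get("namespace"),
        "total_cost": metadata.get("costs", {}).get("total_tokens"),
        "total_savings": metadata.get("cost_savings", {}).get("total_tokens"),
        "latency_s": metadata.get("latency"),
        "category": metadata.get("category"),
    }


print(summarize_metadata(sample_metadata))
```

In the notebook, the same function could be applied to `chat_response.metadata`.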

Now let's change the weights to prioritize cost only:

In [28]:
# Set up your custom labels and weights
labels = {"foo": "bar", "group": "standard"}
weights = {"cost": 1, "quality": 0, "latency": 0}

headers = {
    "Pulze-Labels": json.dumps(labels),
    "Pulze-Weights": json.dumps(weights),
}

openai.default_headers = headers

chat_response = openai.chat.completions.create(
    model="pulze-v0",
    messages=[{"role": "user", "content": "I am bad at math, what is 1+1?"}],
)

# Assuming chat_response has properties like 'model', 'metadata', etc.
# and metadata has properties like 'costs' and 'latency'
# and costs has a property 'total_tokens'
print(
    f"Answer by {chat_response.model} with total costs of {chat_response.metadata['costs']['total_tokens']:.6f}$ {chat_response.metadata['latency']}s:\n"
)
print(chat_response.choices[0].message.content)

Answer by mosaicml/mpt-7b-instruct with total costs of 0.000002$ 0.699s:

----

1: It’s two!


Now, find the model that offers good quality at the best price for this particular prompt.

In [29]:
# Set up your custom labels and weights
labels = {"foo": "bar", "group": "standard"}
weights = {"cost": 0.5, "quality": 0.5, "latency": 0}

headers = {
    "Pulze-Labels": json.dumps(labels),
    "Pulze-Weights": json.dumps(weights),
}

openai.default_headers = headers

chat_response = openai.chat.completions.create(
    model="pulze-v0",
    messages=[{"role": "user", "content": "I am bad at math, what is 1+1?"}],
)

# Assuming chat_response has properties like 'model', 'metadata', etc.
# and metadata has properties like 'costs' and 'latency'
# and costs has a property 'total_tokens'
print(
    f"Answer by {chat_response.model} with total costs of {chat_response.metadata['costs']['total_tokens']:.6f}$ {chat_response.metadata['latency']}s:\n"
)
print(chat_response.choices[0].message.content)

Answer by openai/gpt-3.5-turbo with total costs of 0.000042$ 0.768s:

The sum of 1+1 is 2.


Now let's focus only on latency and ask for the fastest response. This selects the model with the lowest most recently observed latency.

In [8]:
# Set up your custom labels and weights
labels = {"foo": "bar", "group": "standard"}
weights = {"cost": 0, "quality": 0, "latency": 1}

headers = {
    "Pulze-Labels": json.dumps(labels),
    "Pulze-Weights": json.dumps(weights),
}

openai.default_headers = headers

chat_response = openai.chat.completions.create(
    model="pulze-v0",
    messages=[{"role": "user", "content": "I am bad at math, what is 1+1?"}],
)

# Assuming chat_response has properties like 'model', 'metadata', etc.
# and metadata has properties like 'costs' and 'latency'
# and costs has a property 'total_tokens'
print(
    f"Answer by {chat_response.model} with total costs of {chat_response.metadata['costs']['total_tokens']:.6f}$ {chat_response.metadata['latency']}s:\n"
)
print(chat_response.choices[0].message.content)

Answer by openai/text-davinci-003 with total costs of 0.000640$ 0.621s:

[BOT]:
The answer is 2.


Clearly, that answer is not of high quality, but at `0.621s` it is the fastest response.
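
The four weight settings tried so far can also be compared in a single loop. The sketch below keeps the request itself behind a callable so that it runs without network access: `compare_presets`, `WEIGHT_PRESETS`, and `fake_ask` are hypothetical names, and `fake_ask` is only a stand-in that echoes the header it receives rather than calling Pulze.

```python
import json
from typing import Callable, Dict

# The four weight combinations used in this tutorial.
WEIGHT_PRESETS: Dict[str, Dict[str, float]] = {
    "quality": {"cost": 0, "quality": 1, "latency": 0},
    "cost": {"cost": 1, "quality": 0, "latency": 0},
    "balanced": {"cost": 0.5, "quality": 0.5, "latency": 0},
    "latency": {"cost": 0, "quality": 0, "latency": 1},
}


def compare_presets(ask: Callable[[Dict[str, str]], str]) -> Dict[str, str]:
    """Call `ask` once per preset, passing the matching Pulze-Weights header."""
    results = {}
    for name, weights in WEIGHT_PRESETS.items():
        headers = {"Pulze-Weights": json.dumps(weights)}
        results[name] = ask(headers)
    return results


def fake_ask(headers: Dict[str, str]) -> str:
    """Stand-in for a real request; it only echoes the weights it received."""
    return f"sent {headers['Pulze-Weights']}"


for name, answer in compare_presets(fake_ask).items():
    print(f"{name}: {answer}")
```

In the notebook, `ask` could instead set `openai.default_headers = headers`, call `openai.chat.completions.create(...)` as in the cells above, and return the message content.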

Now let's go to the [prompt section](https://platform.pulze.ai/prompts) and create a prompt that always responds like Santa Claus:

```
Respond like Santa Claus to the following, never break character: {{prompt}}
```

Copy the prompt ID to your clipboard and enter it here:

In [34]:
prompt_id = getpass('Enter your Prompt ID: ')

Enter your Prompt ID:  ········


Now let's send a request that uses this prompt with the same app. See also https://docs.pulze.ai/features/custom-headers/policies#prompt

In [40]:
# Set up your custom labels and weights
labels = {"group": "standard"}
weights = {"cost": 0, "quality": 1, "latency": 0}
policies = {"prompt_id": prompt_id}

headers = {
    "Pulze-Labels": json.dumps(labels),
    "Pulze-Weights": json.dumps(weights),
    "Pulze-Policies": json.dumps(policies),
}

openai.default_headers = headers

chat_response = openai.chat.completions.create(
    model="pulze-v0",
    messages=[{"role": "user", "content": "Who are you?"}],
)

# Assuming chat_response has properties like 'model', 'metadata', etc.
# and metadata has properties like 'costs' and 'latency'
# and costs has a property 'total_tokens'
print(
    f"Answer by {chat_response.model} with total costs of {chat_response.metadata['costs']['total_tokens']:.6f}$ {chat_response.metadata['latency']}s:\n"
)
print(chat_response.choices[0].message.content)

Answer by anthropic/claude-v1 with total costs of 0.000645$ 1.003s:

Ho ho ho! I'm Santa Claus, of course! Jolly old
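
To make the role of the `{{prompt}}` placeholder concrete, the sketch below fills the template locally. This is purely an illustration: the real substitution happens on the Pulze side, its exact rules are an assumption here, and `render_prompt` is a hypothetical helper rather than part of any Pulze or OpenAI library.

```python
def render_prompt(template: str, user_message: str, placeholder: str = "{{prompt}}") -> str:
    """Substitute the user message for the placeholder in a prompt template."""
    if placeholder not in template:
        raise ValueError(f"Template is missing the {placeholder} placeholder")
    return template.replace(placeholder, user_message)


template = "Respond like Santa Claus to the following, never break character: {{prompt}}"
print(render_prompt(template, "Who are you?"))
```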


# Function Calling

`Function Calling` works with our `pulze` model: it automatically detects that the request uses function calling and routes it to the best model for your prompt and function. Here is an example:

In [49]:
# Set up your custom labels and weights
labels = {"group": "function_call"}
weights = {"cost": 0, "quality": 1, "latency": 0}

headers = {
    "Pulze-Labels": json.dumps(labels),
    "Pulze-Weights": json.dumps(weights),
}

# Dummy functions for demonstration
def sum_numbers(a, b):
    """Sum two numbers."""
    return json.dumps({"result": a + b})

# Define the available functions as tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "sum_numbers",
            "description": "Sum two numbers",
            "parameters": {
                "type": "object",
                "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
                "required": ["a", "b"],
            },
        },
    },
]

openai.default_headers = headers

chat_response = openai.chat.completions.create(
    model="pulze-v0",
    messages=[{"role": "user", "content": "Can you sum 1 with 1?"}],
    tools=tools,
    tool_choice={
        "type": "function",
        "function": {"name": "sum_numbers"},
    },  # force a call to sum_numbers
)

# Assuming chat_response has properties like 'model', 'metadata', etc.
# and metadata has properties like 'costs' and 'latency'
# and costs has a property 'total_tokens'
print(
    f"Answer by {chat_response.model} with total costs of {chat_response.metadata['costs']['total_tokens']:.6f}$ {chat_response.metadata['latency']}s:\n"
)

print(chat_response.choices[0].message, "\n")

tool_calls = chat_response.choices[0].message.tool_calls
# Check if the model wanted to call a function
if tool_calls:
    # Mapping of available functions
    available_functions = {"sum_numbers": sum_numbers}

    # Process each function call
    for tool_call in tool_calls:
        function_name = tool_call.function.name
        function_to_call = available_functions[function_name]
        function_args = json.loads(tool_call.function.arguments)

        # Call the function and get the response
        function_response = function_to_call(**function_args)

        print(
            f"Function {function_name} with {function_args} Response: {function_response}",
        )

Answer by openai/gpt-3.5-turbo with total costs of 0.000094$ 0.663s:

ChatCompletionMessage(content='', role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_XZMlM3zEYuhDX6ixWO76RIBb', function=Function(arguments='{\n  "a": 1,\n  "b": 1\n}', name='sum_numbers'), type=None)]) 

Function sum_numbers with {'a': 1, 'b': 1} Response: {"result": 2}
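
The dispatch loop above assumes that the model returns a known function name and well-formed JSON arguments. A more defensive variant might look like the sketch below. `dispatch_tool_call` is a hypothetical helper, and the tool call is a hand-built stand-in (via `SimpleNamespace`) that mirrors only the `function.name` and `function.arguments` attributes used above, so the example runs without an API request.

```python
import json
from types import SimpleNamespace
from typing import Any, Callable, Dict


def sum_numbers(a, b):
    """Sum two numbers (same dummy function as in the cell above)."""
    return json.dumps({"result": a + b})


def dispatch_tool_call(tool_call: Any, registry: Dict[str, Callable[..., str]]) -> str:
    """Run one tool call, returning a JSON error instead of raising on bad input."""
    name = tool_call.function.name
    if name not in registry:
        return json.dumps({"error": f"unknown function: {name}"})
    try:
        args = json.loads(tool_call.function.arguments)
    except json.JSONDecodeError:
        return json.dumps({"error": "arguments are not valid JSON"})
    try:
        return registry[name](**args)
    except TypeError as exc:
        # Raised when the arguments do not match the function signature.
        return json.dumps({"error": str(exc)})


# Hand-built stand-in with the same attribute layout as the tool call printed above.
fake_call = SimpleNamespace(
    function=SimpleNamespace(name="sum_numbers", arguments='{"a": 1, "b": 1}')
)
print(dispatch_tool_call(fake_call, {"sum_numbers": sum_numbers}))
```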


# Targeting a Specific Model

We also allow you to target specific models. You can do this either through the UI, for example by selecting a single model, or programmatically with every request.

This is possible because we introduced the `Fully Qualified Model Path`, which lets you target a specific model such as `anthropic/claude-2.0` or `openai/gpt-4`.

Keep in mind that you can only target a model that is enabled in your application. If you target our `pulze` model instead, we choose the best one from the enabled models. For example, if you are not satisfied with the performance of `openai/gpt-3.5-turbo`, you can easily switch to `openai/gpt-4`:

In [53]:
chat_response = openai.chat.completions.create(
    model="openai/gpt-4",
    messages=[{"role": "user", "content": "Can you sum 1 with 1?"}],
    tools=tools,
    tool_choice={
        "type": "function",
        "function": {"name": "sum_numbers"},
    },  # force a call to sum_numbers
)

print(
    f"Answer by {chat_response.model} with total costs of {chat_response.metadata['costs']['total_tokens']:.6f}$ {chat_response.metadata['latency']}s:\n"
)
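# Process the tool calls from this response instead of reusing variables from the previous cell
available_functions = {"sum_numbers": sum_numbers}
for tool_call in chat_response.choices[0].message.tool_calls or []:
    function_name = tool_call.function.name
    function_args = json.loads(tool_call.function.arguments)
    function_response = available_functions[function_name](**function_args)
    print(
        f"Function {function_name} with {function_args} Response: {function_response}",
    )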

Answer by openai/gpt-4 with total costs of 0.002820$ 1.042s:

Function sum_numbers with {'a': 1, 'b': 1} Response: {"result": 2}
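
A fully qualified model path such as `openai/gpt-4` can be split back into its provider and model parts, which matches the `provider`, `model`, and `namespace` fields seen in the response metadata earlier. The sketch below assumes the provider is everything before the first slash; `split_model_path` is a hypothetical helper name.

```python
from typing import Tuple


def split_model_path(path: str) -> Tuple[str, str]:
    """Split 'provider/model' into its two parts."""
    # Assumption for this sketch: the provider is everything before the first slash.
    provider, sep, model = path.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"Expected 'provider/model', got {path!r}")
    return provider, model


print(split_model_path("openai/gpt-4"))
```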
