# Examples

[**(quick-start)**](#quick-start) Adding a `@costly()` decorator to your function automatically adds the arguments `simulate: bool` and `cost_log: costly.Costlog` to it (and also `description: list[str]` which is useful for tracing the sources of costs in the breakdown).

By default, this uses [`LLM_Simulator_Faker.simulate_llm_call()`](costly/simulators/llm_simulator_faker.py) to simulate your function and [`LLM_API_Estimation.get_cost_real()`](costly/estimators/llm_api_estimation.py) as the estimator -- see [**(assumptions)**](#assumptions) for some brief comments on how these work by default. 

You can also use your own simulator and estimator, as shown in Example [**(customators)**](#customators). In particular you can subclass the existing simulator and estimator objects.

The default simulator and estimator assume that your function takes arguments `input_string: str, model: str, response_model: str | BaseModel`, and these arguments are supplied to the simulator and estimator. Regardless of which simulator and estimator you use, if your function does not take the same arguments as them (e.g. if your function adds a system prompt to `input_string`, or you use a different naming system for `model`s from that used in `LLM_API_Estimation.PRICES`), you can pass parameter mappings as shown in Example [**(param-mappings)**](#param-mappings).

In particular see the example parameter mappings for [**(messages-and-instructor)**](#messages-and-instructor).

For more accurate cost tracking, you might want to calculate costs within the function, e.g. using info returned by the API call itself. For this, your function must return a `CostlyResponse` object as shown in Example [**(costly-response)**](#costly-response). In usage, your function will still return the normal output contained in `CostlyResponse.output`, and everything else will be passed to the simulator and estimator.

Finally, see [**(costlog-notes)**](#costlog-notes) for some notes on the `Costlog` class itself.

## quick-start

In [11]:
from openai import OpenAI
from costly import Costlog, costly


@costly()
def chatgpt(input_string: str, model: str) -> str:
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": input_string}]
    )
    output_string = response.choices[0].message.content
    return output_string


cl = Costlog()
x = chatgpt(
    input_string="Write the Lorem ipsum text",
    model="gpt-4o-mini",
    cost_log=cl,
    simulate=False,
    description=["chatgpt call"],
)
y = chatgpt(
    input_string="Write the Lorem ipsum text",
    model="gpt-4o-mini",
    cost_log=cl,
    simulate=True,
    description=["chatgpt call"],
)
print(
    x,
    "\n---\n",
    y,
    "\n---\n",
    cl.totals,
    "\n---\n",
    cl.items[0],
    "\n---\n",
    cl.items[1],
)

Certainly! Here is the classic "Lorem Ipsum" placeholder text:

```
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
```

Feel free to ask if you need a different variation or more text! 
---
 Official trial contain them test first. Sea management run realize education now detail. Billion mission space nothing go lose. Voice together stand other make person fast.
Particular list family machine rate contain. Under drug prepare inside. Force human everybody bag but discuss age.
Perform rate study top card. Population yes consumer. Section her line our claim.
Former account available its rich look w

Here's basically what's happening under the hood, courtesy of `costly.decorator.costly`:


In [2]:
from costly import Costlog
from costly.simulators.llm_simulator_faker import LLM_Simulator_Faker
from costly.estimators.llm_api_estimation import LLM_API_Estimation

def _chatgpt(input_string: str, model: str) -> str:
    from openai import OpenAI
    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": input_string}]
    )
    output_string = response.choices[0].message.content
    return output_string

def chatgpt(input_string: str, model: str, cost_log: Costlog=None, simulate: bool=False, description: list[str]=None) -> str:
    if simulate:
        return LLM_Simulator_Faker.simulate_llm_call(
            input_string=input_string,
            model=model,
            response_model=str,
            cost_log=cost_log,
            description=description,
        )
    if cost_log is not None:
        with cost_log.new_item() as (item, timer):
            output_string = _chatgpt(input_string, model)
            cost_item = LLM_API_Estimation.get_cost_real(
                model=model,
                input_string=input_string,
                output_string=output_string,
                description=description,
                timer=timer(),
            )
            item.update(cost_item)
    else:
        output_string = _chatgpt(input_string, model)
    return output_string


## assumptions

`LLM_Simulator_Faker`, when producing text, produces text of about `600 * 4.5` characters.

We generally assume that 1 token is about 4.5 characters, although actual token estimation uses `tiktoken` (unless you subclass `LLM_API_Estimation` and set `tokenize=_tokenize_rough`).
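
To illustrate the 4.5-characters-per-token rule of thumb, the sketch below estimates a token count from string length alone. The helper name `rough_token_count` is hypothetical and is not the package's `_tokenize_rough`, whose implementation is not shown here.

```python
import math

CHARS_PER_TOKEN = 4.5  # rule of thumb stated above


def rough_token_count(text: str) -> int:
    """Estimate tokens from character count (hypothetical helper, not part of costly)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


# Faker output of about 600 * 4.5 characters corresponds to roughly 600 tokens.
faker_length = int(600 * CHARS_PER_TOKEN)
print(faker_length)                           # 2700
print(rough_token_count("x" * faker_length))  # 600
```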

For cost estimation we generally assume that the number of output tokens lies in the range `[0, 2048]`, and the min and max are computed accordingly. As a rule of thumb, for complex projects the true value tends to be about one third of the way through this range, and for projects that receive quite short responses it is much lower.

`LLM_API_Estimation.get_cost_real()` uses the `PRICES` dict to get the cost of a (real or simulated) API call, and to estimate time for a simulated API call. By default the standard OpenAI and Anthropic models are included; you can add more via subclassing.

All of this can be overridden by subclassing.
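
To make the simulated ranges concrete, here is a sketch of the arithmetic that reproduces the simulated `gpt-4o-mini` item logged in the messages-and-instructor example. The function `simulated_range` is hypothetical, and the formulas are inferred from the logged numbers in this document rather than taken from the library source, so treat them as an assumption.

```python
# Per-token prices (USD) and per-output-token time (seconds), copied from the
# `PRICES` excerpt shown in the customators section of this document.
GPT_4O_MINI = {"input_tokens": 0.15e-6, "output_tokens": 0.6e-6, "time": 9e-3}

MAX_OUTPUT_TOKENS = 2048  # upper end of the assumed output-token range


def simulated_range(input_tokens: int, prices: dict[str, float]) -> dict[str, float]:
    """Hypothetical helper: cost and time bounds when output length is unknown."""
    input_cost = input_tokens * prices["input_tokens"]
    return {
        "cost_min": input_cost,  # zero output tokens
        "cost_max": input_cost + MAX_OUTPUT_TOKENS * prices["output_tokens"],
        "time_min": 0.0,
        "time_max": MAX_OUTPUT_TOKENS * prices["time"],
    }


estimate = simulated_range(input_tokens=1, prices=GPT_4O_MINI)
print(estimate)
```

With one input token this gives a cost between about `1.5e-07` and `0.00122895` and a maximum time of about `18.432` seconds, matching the logged simulated item.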

## customators

The default decorator behaviour is

```python
@costly(
    simulator=LLM_Simulator_Faker.simulate_llm_call,
    estimator=LLM_API_Estimation.get_cost_real,
)
```

These functions can be replaced by your own custom functions -- the best way to do this is probably to subclass the respective class.

[`costly.simulators.llm_simulator_faker`](costly/simulators/llm_simulator_faker.py) has some examples of how to subclass it. One obvious reason to subclass it is to have custom simulating functions for the types you are interested in. Although the default class "works" for any Pydantic `BaseModel` etc., you might want to have a custom function -- e.g. if a value needs to be within a certain range, or if its distribution is not uniform, or if you want to use examples from your data.

Again [`costly.estimators.llm_api_estimation`](costly/estimators/llm_api_estimation.py) has some examples of how to subclass it. The most obvious reason would be to add prices for other models we don't have listed (right now it's just OpenAI and Anthropic). The `PRICES` dict is like this:

```python
class LLM_API_Estimation:

    PRICES = {
        "gpt-4o": {
            "input_tokens": 5.0e-6,
            "output_tokens": 15.0e-6,
            "time": 18e-3,
        },
        "gpt-4o-mini": {
            "input_tokens": 0.15e-6,
            "output_tokens": 0.6e-6,
            "time": 9e-3,
        },
        ...
    }
```

Something like this would be quite natural:

```python
class My_Estimation(LLM_API_Estimation):
    PRICES = LLM_API_Estimation.PRICES | {"my_model": LLM_API_Estimation.PRICES["gpt-4o"]}
```

Note that `LLM_API_Estimation` _can_ handle things like `gpt-4o-2024-05-13` etc. because `LLM_API_Estimation.get_model()` gets the longest prefix matching model name in `PRICES`.
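
A minimal sketch of longest-prefix matching, to show why a dated model name resolves to the right entry. The function `longest_prefix_match` and the list of known names are illustrative assumptions, not the actual `get_model()` implementation.

```python
def longest_prefix_match(model: str, known_models: list[str]) -> str:
    """Return the longest known model name that `model` starts with."""
    candidates = [name for name in known_models if model.startswith(name)]
    if not candidates:
        raise ValueError(f"No known model is a prefix of {model!r}")
    return max(candidates, key=len)


# Illustrative subset of model names; the real keys live in `PRICES`.
known = ["gpt-4", "gpt-4o", "gpt-4o-mini"]
print(longest_prefix_match("gpt-4o-2024-05-13", known))       # gpt-4o
print(longest_prefix_match("gpt-4o-mini-2024-07-18", known))  # gpt-4o-mini
```

Picking the longest match matters because shorter names such as `gpt-4` are also prefixes of `gpt-4o-mini-2024-07-18`.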


The default simulator and estimator have type signatures:

```python
class LLM_Simulator_Faker:

    @staticmethod
    def simulate_llm_call(
        input_string: str,
        model: str = None,
        response_model: type = str,
        cost_log: Costlog = None,
        description: list[str] = None,
    ) -> str | Any:
        ...

class LLM_API_Estimation:

    @staticmethod
    def get_cost_real(
        model: str,
        input_tokens: int = None,
        output_tokens_min: int = None,
        output_tokens_max: int = None,
        input_string: str = None,
        output_string: str = None,
        timer: float = None,
        **kwargs,
    ) -> dict[str, float]:
        ...
```

## param-mappings

In [16]:
from costly import Costlog, costly
from costly.simulators.llm_simulator_faker import LLM_Simulator_Faker
from costly.estimators.llm_api_estimation import LLM_API_Estimation


@costly(
    input_string=(lambda kwargs: kwargs["prompt"]),
    model=(lambda kwargs: kwargs["model_name"]),
)
def chatgpt2(prompt: str, model_name: str) -> str:
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model=model_name, messages=[{"role": "user", "content": prompt}]
    )
    output_string = response.choices[0].message.content
    return output_string


cl = Costlog()
chatgpt2(prompt="Hello", model_name="gpt-3.5-turbo", simulate=True, cost_log=cl)
chatgpt2(prompt="Hello", model_name="gpt-3.5-turbo", simulate=False, cost_log=cl)
print(cl.items[0],'\n---\n',cl.items[1])


{'cost_min': 5e-07, 'cost_max': 0.0030725, 'time_min': 0.0, 'time_max': 73.728, 'input_tokens': 1, 'output_tokens_min': 0, 'output_tokens_max': 2048, 'calls': 1, 'model': 'gpt-3.5-turbo', 'simulated': True, 'input_string': 'Hello', 'output_string': None, 'description': None} 
---
 {'cost_min': 1.4e-05, 'cost_max': 1.4e-05, 'time_min': 0.9808458000188693, 'time_max': 0.9808458000188693, 'input_tokens': 1, 'output_tokens': 9, 'output_tokens_min': 9, 'output_tokens_max': 9, 'calls': 1, 'model': 'gpt-3.5-turbo', 'simulated': False, 'input_string': 'Hello', 'output_string': 'Hello! How can I help you today?', 'description': None}


For the specific case of simply renaming arguments, you can pass the argument names as strings instead of lambdas:

In [17]:
from costly import Costlog, costly
from costly.simulators.llm_simulator_faker import LLM_Simulator_Faker
from costly.estimators.llm_api_estimation import LLM_API_Estimation


@costly(input_string="prompt", model="model_name")
def chatgpt2(prompt: str, model_name: str) -> str:
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model=model_name, messages=[{"role": "user", "content": prompt}]
    )
    output_string = response.choices[0].message.content
    return output_string


cl = Costlog(
    totals_keys={
        "cost_min",
        "cost_max",
        "time_min",
        "time_max",
        "calls",
        "input_tokens",
        "output_tokens_min",
        "output_tokens_max",
    }
)
chatgpt2(prompt="Hello", model_name="gpt-3.5-turbo", simulate=True, cost_log=cl)
chatgpt2(prompt="Hello", model_name="gpt-3.5-turbo", simulate=False, cost_log=cl)
print(cl.items[0],'\n---\n',cl.items[1])


{'cost_min': 5e-07, 'cost_max': 0.0030725, 'time_min': 0.0, 'time_max': 73.728, 'input_tokens': 1, 'output_tokens_min': 0, 'output_tokens_max': 2048, 'calls': 1, 'model': 'gpt-3.5-turbo', 'simulated': True, 'input_string': 'Hello', 'output_string': None, 'description': None} 
---
 {'cost_min': 1.4e-05, 'cost_max': 1.4e-05, 'time_min': 0.6241316000232473, 'time_max': 0.6241316000232473, 'input_tokens': 1, 'output_tokens': 9, 'output_tokens_min': 9, 'output_tokens_max': 9, 'calls': 1, 'model': 'gpt-3.5-turbo', 'simulated': False, 'input_string': 'Hello', 'output_string': 'Hello! How can I assist you today?', 'description': None}
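
Both mapping styles above, a callable taking the keyword arguments or a string naming an argument, can be thought of as resolving to a value in the same way. The sketch below is a hypothetical illustration of that idea; `resolve_mapping` is not a function from the package.

```python
from typing import Any, Callable, Union

ParamMapping = Union[str, Callable[[dict[str, Any]], Any]]


def resolve_mapping(mapping: ParamMapping, kwargs: dict[str, Any]) -> Any:
    """Resolve one parameter mapping against a function's keyword arguments."""
    if callable(mapping):
        return mapping(kwargs)  # e.g. lambda kwargs: kwargs["prompt"]
    return kwargs[mapping]      # e.g. "prompt" names the argument to read


call_kwargs = {"prompt": "Hello", "model_name": "gpt-3.5-turbo"}
mappings = {"input_string": "prompt", "model": lambda kw: kw["model_name"]}
resolved = {name: resolve_mapping(m, call_kwargs) for name, m in mappings.items()}
print(resolved)  # {'input_string': 'Hello', 'model': 'gpt-3.5-turbo'}
```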


## messages-and-instructor

More examples of parameter mappings.

In [19]:
from costly import Costlog, costly
from costly.simulators.llm_simulator_faker import LLM_Simulator_Faker
from costly.estimators.llm_api_estimation import LLM_API_Estimation


@costly(
    input_string=lambda kwargs: LLM_API_Estimation.messages_to_input_string(
        kwargs["messages"]
    ),
)
def chatgpt_messages(messages: list[dict[str, str]], model: str) -> str:
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(model=model, messages=messages)
    output_string = response.choices[0].message.content
    return output_string


cl = Costlog()
chatgpt_messages(
    messages=LLM_API_Estimation._input_string_to_messages("Hey"),
    model="gpt-4o-mini",
    simulate=True,
    cost_log=cl,
)
chatgpt_messages(
    messages=LLM_API_Estimation._input_string_to_messages("Hey"),
    model="gpt-4o-mini",
    simulate=False,
    cost_log=cl,
)
print(cl.items[0], '\n---\n', cl.items[1])


{'cost_min': 1.5e-07, 'cost_max': 0.00122895, 'time_min': 0.0, 'time_max': 18.432, 'input_tokens': 1, 'output_tokens_min': 0, 'output_tokens_max': 2048, 'calls': 1, 'model': 'gpt-4o-mini', 'simulated': True, 'input_string': 'Hey', 'output_string': None, 'description': None} 
---
 {'cost_min': 5.55e-06, 'cost_max': 5.55e-06, 'time_min': 0.7071941000176594, 'time_max': 0.7071941000176594, 'input_tokens': 1, 'output_tokens': 9, 'output_tokens_min': 9, 'output_tokens_max': 9, 'calls': 1, 'model': 'gpt-4o-mini', 'simulated': False, 'input_string': 'Hey', 'output_string': 'Hello! How can I assist you today?', 'description': None}


In [20]:
import instructor
from pydantic import BaseModel
from openai import OpenAI
from instructor import Instructor
from costly import costly, Costlog
from costly.estimators.llm_api_estimation import LLM_API_Estimation


@costly(
    input_string=lambda kwargs: LLM_API_Estimation.get_raw_prompt_instructor(**kwargs),
)
def chatgpt_instructor(
    messages: str | list[dict[str, str]],
    model: str,
    client: Instructor,
    response_model: type[BaseModel],
) -> BaseModel:
    if isinstance(messages, str):
        messages = [{"role": "user", "content": messages}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        response_model=response_model,
    )
    return response


class PersonInfo(BaseModel):
    name: str
    age: int


cl = Costlog()
chatgpt_instructor(
    messages="Hey",
    model="gpt-3.5-turbo",
    response_model=PersonInfo,
    client=instructor.from_openai(OpenAI()),
    simulate=True,
    cost_log=cl,
)
chatgpt_instructor(
    messages="Hey",
    model="gpt-3.5-turbo",
    response_model=PersonInfo,
    client=instructor.from_openai(OpenAI()),
    simulate=False,
    cost_log=cl,
)
print(cl.items[0], '\n---\n', cl.items[1])

2024-08-31 21:23:19,515 DEBUG instructor: Patching `client.chat.completions.create` with mode=<Mode.TOOLS: 'tool_call'>
2024-08-31 21:23:19,583 DEBUG instructor: Instructor Request: mode.value='tool_call', response_model=<class '__main__.PersonInfo'>, new_kwargs={'messages': [{'content': 'Hey', 'role': 'user'}], 'model': 'gpt-3.5-turbo', 'tools': [{'type': 'function', 'function': {'name': 'PersonInfo', 'description': 'Correctly extracted `PersonInfo` with all the required parameters with correct types', 'parameters': {'properties': {'name': {'title': 'Name', 'type': 'string'}, 'age': {'title': 'Age', 'type': 'integer'}}, 'required': ['age', 'name'], 'type': 'object'}}}], 'tool_choice': {'type': 'function', 'function': {'name': 'PersonInfo'}}}
2024-08-31 21:23:19,590 DEBUG instructor: max_retries: 3
2024-08-31 21:23:19,667 DEBUG instructor: Patching `client.chat.completions.create` with mode=<Mode.TOOLS: 'tool_call'>
2024-08-31 21:23:19,683 DEBUG instructor: Instructor Request: mode.val

{'cost_min': 1.7e-05, 'cost_max': 0.003089, 'time_min': 0.0, 'time_max': 73.728, 'input_tokens': 34, 'output_tokens_min': 0, 'output_tokens_max': 2048, 'calls': 1, 'model': 'gpt-3.5-turbo', 'simulated': True, 'input_string': "HeyPersonInfoCorrectly extracted `PersonInfo` with all the required parameters with correct typesdict_values(['Name', 'string'])dict_values(['Age', 'integer'])", 'output_string': None, 'description': None} 
---
 {'cost_min': 1.7e-05, 'cost_max': 1.7e-05, 'time_min': 0.6167464000172913, 'time_max': 0.6167464000172913, 'input_tokens': 34, 'output_tokens': 0, 'output_tokens_min': 0, 'output_tokens_max': 0, 'calls': 1, 'model': 'gpt-3.5-turbo', 'simulated': False, 'input_string': "HeyPersonInfoCorrectly extracted `PersonInfo` with all the required parameters with correct typesdict_values(['Name', 'string'])dict_values(['Age', 'integer'])", 'output_string': PersonInfo(name='Alice', age=30), 'description': None}


## costly-response

For more accurate estimation of _real_ (not simulated) costs, we might want to calculate costs within the function.

To do this, your function must return a `CostlyResponse` object.

```python
@dataclass
class CostlyResponse:
    output: Any
    cost_info: dict[str, Any]
```

In actual usage, your function will only return the `output` field -- the `@costly` decorator will take care of this.
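
The unwrapping behaviour can be pictured with a toy decorator. This is a simplified stand-in under stated assumptions, not the package's decorator: `unwrap_costly_response` is hypothetical, the `CostlyResponse` class below only mirrors the dataclass shown above, and a plain list stands in for a `Costlog`.

```python
from dataclasses import dataclass
from functools import wraps
from typing import Any, Callable, Optional


@dataclass
class CostlyResponse:  # stand-in mirroring the dataclass shown above
    output: Any
    cost_info: dict[str, Any]


def unwrap_costly_response(func: Callable[..., Any]) -> Callable[..., Any]:
    """Toy decorator (hypothetical): record cost_info, return only the output."""

    @wraps(func)
    def wrapper(*args: Any, cost_log: Optional[list] = None, **kwargs: Any) -> Any:
        result = func(*args, **kwargs)
        if isinstance(result, CostlyResponse):
            if cost_log is not None:
                cost_log.append(result.cost_info)
            return result.output
        return result

    return wrapper


@unwrap_costly_response
def fake_llm(input_string: str) -> CostlyResponse:
    # No API call here; the token counts are made-up illustrative values.
    return CostlyResponse(
        output=input_string.upper(),
        cost_info={"input_tokens": 2, "output_tokens": 2},
    )


log: list[dict[str, Any]] = []
answer = fake_llm("hello world", cost_log=log)
print(answer)  # HELLO WORLD
print(log)     # [{'input_tokens': 2, 'output_tokens': 2}]
```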


In [21]:
from costly import Costlog, costly, CostlyResponse


@costly()
def chatgpt(input_string: str, model: str) -> str:
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "user", "content": input_string},
        ],
    )

    return CostlyResponse(
        output=response.choices[0].message.content,
        cost_info={
            "input_tokens": response.usage.prompt_tokens,
            "output_tokens": response.usage.completion_tokens,
        },
    )


cl = Costlog()
x = chatgpt(
    input_string="Write the Lorem ipsum text",
    model="gpt-4",
    cost_log=cl,
    simulate=True,
)
y = chatgpt(
    input_string="Write the Lorem ipsum text",
    model="gpt-4",
    cost_log=cl,
    simulate=False,
)
print(x, "\n---\n", y, "\n---\n", cl.items[0], "\n---\n", cl.items[1])

Little break party executive probably. Article act address director author. Professor into moment couple recognize.
Level same international describe finally avoid. Receive sense various learn lot second animal. Ready check local body blood country.
Success toward only.
Population top career professor next avoid between. Pull I he big into. Today by direction.
Own morning its writer big.
Really her item year table yourself must. Although argue coach design bad. Still section bed finally as minute often.
Concern manager either throughout public sister including. Guy cup save trial film yeah natural focus.
Yourself act foreign outside dinner.
Member decade nation if behind. Quickly give TV be give. Measure against begin understand case.
Different us positive why look. Us start culture cover especially. Key yard south pretty marriage tend. Others certainly however control.
International always produce yet face dream pay.
Media bring book theory pick color.
Yourself the fact continue feel 

In [22]:
import instructor
from pydantic import BaseModel
from instructor import Instructor
from openai import OpenAI
from costly import Costlog, costly, CostlyResponse
from costly.estimators.llm_api_estimation import LLM_API_Estimation


@costly(
    input_string=lambda kwargs: LLM_API_Estimation.get_raw_prompt_instructor(**kwargs),
)
def chatgpt_instructor(
    messages: str | list[dict[str, str]],
    model: str,
    client: Instructor,
    response_model: type[BaseModel],
) -> BaseModel:
    if isinstance(messages, str):
        messages = [{"role": "user", "content": messages}]
    response = client.chat.completions.create_with_completion(
        model=model,
        messages=messages,
        response_model=response_model,
    )
    output_string, cost_info = response
    return CostlyResponse(
        output=output_string,
        cost_info={
            "input_tokens": cost_info.usage.prompt_tokens,
            "output_tokens": cost_info.usage.completion_tokens
        }
    )
    
class PersonInfo(BaseModel):
    name: str
    age: int


cl = Costlog()
chatgpt_instructor(
    messages="Hey",
    model="gpt-4",
    response_model=PersonInfo,
    client=instructor.from_openai(OpenAI()),
    simulate=True,
    cost_log=cl,
)
chatgpt_instructor(
    messages="Hey",
    model="gpt-4",
    response_model=PersonInfo,
    client=instructor.from_openai(OpenAI()),
    simulate=False,
    cost_log=cl,
)
print(cl.items[0], '\n---\n', cl.items[1])


2024-08-31 21:31:28,447 DEBUG instructor: Patching `client.chat.completions.create` with mode=<Mode.TOOLS: 'tool_call'>
2024-08-31 21:31:28,463 DEBUG instructor: Instructor Request: mode.value='tool_call', response_model=<class '__main__.PersonInfo'>, new_kwargs={'messages': [{'content': 'Hey', 'role': 'user'}], 'model': 'gpt-4', 'tools': [{'type': 'function', 'function': {'name': 'PersonInfo', 'description': 'Correctly extracted `PersonInfo` with all the required parameters with correct types', 'parameters': {'properties': {'name': {'title': 'Name', 'type': 'string'}, 'age': {'title': 'Age', 'type': 'integer'}}, 'required': ['age', 'name'], 'type': 'object'}}}], 'tool_choice': {'type': 'function', 'function': {'name': 'PersonInfo'}}}
2024-08-31 21:31:28,463 DEBUG instructor: max_retries: 3
2024-08-31 21:31:28,497 DEBUG instructor: Patching `client.chat.completions.create` with mode=<Mode.TOOLS: 'tool_call'>
2024-08-31 21:31:28,516 DEBUG instructor: Instructor Request: mode.value='tool

## costlog-notes

The default [`costly.Costlog`](costly/costlog.py) class has two modes: `memory` and `jsonl`. The default is `memory`, but for large projects you may want to use `jsonl`: this dumps the cost log into a `.costly` folder in your working directory.

The other thing that can be customized is the `totals_keys` parameter, which is a set of keys to aggregate costs by. By default it is `{"cost_min", "cost_max", "time_min", "time_max", "calls"}`, i.e. it tracks the range of possible costs and running times (`max` and `min` are usually only different when simulating, because then you have to estimate). Out of the box you can customize it to also track `input_tokens`, `output_tokens_min`, `output_tokens_max`; any other customizations will only make sense if you are using your own estimator.

You will want to change `totals_keys` if you want to use this package for things other than LLM costs, or if you want to track something other than `min` and `max`, e.g. some estimate of the average, or percentiles.
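
As a rough picture of what aggregating by `totals_keys` means, the sketch below sums the chosen keys across logged items. `aggregate_totals` is a hypothetical helper and not how `Costlog` is implemented; the item values are taken from the param-mappings output earlier in this document.

```python
def aggregate_totals(items: list[dict], totals_keys: set[str]) -> dict[str, float]:
    """Sum each key in `totals_keys` across items, treating missing keys as 0."""
    return {key: sum(item.get(key, 0) for item in items) for key in sorted(totals_keys)}


# One simulated and one real item (only the fields needed for this example).
items = [
    {"cost_min": 5e-07, "cost_max": 0.0030725, "calls": 1, "simulated": True},
    {"cost_min": 1.4e-05, "cost_max": 1.4e-05, "calls": 1, "simulated": False},
]
totals = aggregate_totals(items, {"cost_min", "cost_max", "calls"})
print(totals)
```

Keys that are not in `totals_keys`, such as `simulated` here, are simply left out of the totals.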