# Cerebras and Weights & Biases Weave   

This notebook demonstrates how to use [Weights & Biases Weave](https://weave-docs.wandb.ai/) with the Cerebras Cloud SDK for automatic tracking and logging of LLM calls.

When building applications that use LLMs, you need good observability into the model calls being made. It lets you understand how the model performs, debug issues, and experiment with different configurations.

Weave helps with this. It can automatically trace and log LLM calls, and it lets you define custom ops for more reproducible experiments.

## Setup

First, let's install and import the necessary libraries and set up the Cerebras client:

In [1]:
# !pip install weave cerebras-cloud-sdk

In [2]:
import os
import weave
from cerebras.cloud.sdk import Cerebras

# Set up Cerebras client
model = "llama3.1-8b"  # Cerebras model
api_key = os.environ["CEREBRAS_API_KEY"]
client = Cerebras(api_key=api_key)
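Reading the key with `os.environ["CEREBRAS_API_KEY"]` raises a bare `KeyError` when the variable is missing. The sketch below shows one way to fail with a clearer message instead; `require_env` is a hypothetical helper written for this example and is not part of either SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a descriptive error.

    Hypothetical helper for illustration only; not part of weave or the Cerebras SDK.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; export it before running this notebook."
        )
    return value


# Usage in the setup cell (assumes the key has been exported in your shell):
# api_key = require_env("CEREBRAS_API_KEY")
```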

**Initialize the Weave project**

Weave needs a Weights & Biases API key. Get yours from https://wandb.ai/authorize and paste it when prompted.

In [None]:
# this is where your traces will be saved
weave.init("cerebras_speedster")

Logged in as Weights & Biases user: capecape.
View Weave data at https://wandb.ai/capecape/cerebras_speedster/weave

## Basic Tracing

Let's start with a simple example of tracing a Cerebras SDK call:

In [3]:
response = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "What's the fastest land animal?"}],
)

print(response.choices[0].message.content)

🍩 https://wandb.ai/capecape/cerebras_speedster/r/call/0191bc34-91ad-7792-a739-306f9ade02c8
The fastest land animal is the cheetah (Acinonyx jubatus). On average, a cheetah can reach speeds of up to 70 kilometers per hour (43 miles per hour). However, the cheetah can achieve burst speeds of up to 120 kilometers per hour (75 miles per hour) over short distances, typically while chasing prey.

Cheetahs are designed for speed, with a slender body, long legs, and a flexible spine that allows them to stretch their stride. They also have non-retractable claws that provide traction and help them accelerate quickly.

It's worth noting that cheetahs are only fast for short bursts, and they can only sustain high speeds for about 20-30 seconds. After that, they tire quickly and need to rest and recover.

Other contenders for fastest land animal include:

* Pronghorn (Antilocapra americana): up to 96 km/h (60 mph)
* Thomson's gazelle (Eudorcas thomsonii): up to 90 km/h (56 mph)
* Springbok (Antilop

You can click on the generated links to view the traces in the Weave web interface. Every call to the Cerebras SDK is automatically traced and logged.

[![](imgs/cerebras_call.png)](https://wandb.ai/capecape/cerebras_speedster/weave/calls/0191bbfc-d2c4-7be2-9dd5-e67789051e3b)

## Custom Weave Ops

Now, let's create a custom Weave op for more reproducible experiments:

> Most of the time this is what you want to do, as it allows you to specify the exact parameters you want to use and track.

In [4]:
@weave.op
def animal_speedster(animal: str, model: str) -> str:
    """Find out how fast an animal can run."""

    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": f"How fast can a {animal} run?"}],
    )
    return response.choices[0].message.content

# Test the custom op
print(animal_speedster("cheetah", "llama3.1-8b"))
print(animal_speedster("ostrich", "llama3.1-8b"))
print(animal_speedster("human", "llama3.1-8b"))

🍩 https://wandb.ai/capecape/cerebras_speedster/r/call/0191bc34-9d2c-77d0-bf9e-2edb75a44c96
The cheetah is widely considered to be the fastest land animal on Earth, and for good reason. They can reach incredible speeds when chasing prey. According to various sources, including the Cheetah Conservation Fund and the San Diego Zoo, a cheetah can run as fast as:

* Up to 70 mph (113 km/h) in short bursts, which is their top speed
* Sustained speeds of around 40-50 mph (64-80 km/h) over longer distances
* Accelerate from 0 to 60 mph (0-97 km/h) in about 3 seconds

To put that in perspective, when a cheetah is chasing its prey, it can go from being stationary to running at top speed in just a few seconds. This incredible speed is thanks to their slender body, powerful legs, and specialized paw structure, which allows them to generate a lot of force and traction.

Of course, it's worth noting that cheetahs can only maintain top speed for short periods of time, usually around 20-30 seconds. Aft

As you can see, we got 3 separate traces, one for each call to the `animal_speedster` op.

[![](imgs/cerebras_custom_op.png)](https://wandb.ai/capecape/cerebras_speedster/weave/calls?filter=%7B%22opVersionRefs%22%3A%5B%22weave%3A%2F%2F%2Fcapecape%2Fcerebras_speedster%2Fop%2Fanimal_speedster%3A*%22%5D%7D&peekPath=%2Fcapecape%2Fcerebras_speedster%2Fcalls%2F0191bbff-87fb-7352-8a34-de17012c5f8b)

## Using the Model Class

Next, let's create a Model for easier experimentation. A `Model` is a Weave class that bundles your code with its configuration (here, the model name and temperature), so you can iterate on it and compare different configurations.

You can also run evaluations on your model to assess its performance.

In [5]:
import json

class AnimalSpeedModel(weave.Model):
    model: str
    temperature: float

    @weave.op
    def predict(self, animal: str) -> dict:
        """Predict the top speed of an animal and return the parsed JSON reply."""

        response = client.chat.completions.create(
            model=self.model,
            messages=[
                {
                    "role": "user",
                    "content": f"What's the top speed of a {animal} in km/h? Please respond in JSON format with `animal` and `speed` keys.",
                }
            ],
            response_format={"type": "json_object"},
            temperature=self.temperature,
        )
        # The reply is a JSON string, so parse it into a dict before returning
        return json.loads(response.choices[0].message.content)

# this is versioned, so you can iterate on it
speed_model = AnimalSpeedModel(
    model="llama3.1-8b",
    temperature=0.7
)

result = speed_model.predict(animal="peregrine falcon")
print(result)

🍩 https://wandb.ai/capecape/cerebras_speedster/r/call/0191bc34-b91b-7ff1-ba8a-6af4d786ca9f
{'animal': 'Peregrine Falcon', 'speed': 322}


[![](imgs/cerebras_model_call.png)](https://wandb.ai/capecape/cerebras_speedster/weave/calls/0191bc11-224b-7e03-a34e-467b7d1016dc)
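The `predict` method above trusts the model to return well-formed JSON with both keys and a numeric `speed`. As a minimal sketch of a stricter parser, assuming you would rather fail early on a malformed reply, something like the following could replace the bare `json.loads` call. `parse_speed_response` is a hypothetical helper, not part of Weave or the Cerebras SDK.

```python
import json


def parse_speed_response(raw: str) -> dict:
    """Parse a JSON reply and normalise it to {'animal': str, 'speed': float}.

    Hypothetical helper for illustration. Raises ValueError if the reply is not
    a JSON object, lacks a required key, or has a non-numeric speed.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = {"animal", "speed"} - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    # float() accepts numbers and numeric strings such as "322"
    return {"animal": str(data["animal"]), "speed": float(data["speed"])}


print(parse_speed_response('{"animal": "Peregrine Falcon", "speed": "322"}'))
```

Normalising `speed` to a number at this point also keeps downstream scoring code from having to handle string values.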

With this setup, you can easily experiment with different models and parameters while keeping track of your Cerebras-powered inferences!

## Running an Evaluation on your Model

You can automatically run [evaluations](https://weave-docs.wandb.ai/tutorial-eval) as you iterate on your LLM-powered pipelines using `weave.Evaluation`.

Let's create a test dataset:

In [10]:
dataset = [
    {"animal": "cheetah", "speed": 120},
    {"animal": "ostrich", "speed": 70},
    {"animal": "human", "speed": 45},
    {"animal": "shark", "speed": 80},
    {"animal": "dog", "speed": 64},
]

Let's define a lenient metric: a prediction counts as accurate if its speed is within 10% of the reference speed in the dataset:


In [11]:
@weave.op
def speed_accuracy(model_output, speed) -> dict:
    "Check if the speed is within 10% of the actual speed"
    return {
        "is_accurate": abs(int(model_output["speed"]) - speed) / speed < 0.1
    }
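To make the 10% rule concrete, here is the same relative-error check as a plain function, without Weave, applied to a few made-up predictions. `within_tolerance` and the predicted values are illustrative assumptions, not output from the model.

```python
def within_tolerance(predicted: float, expected: float, tolerance: float = 0.1) -> bool:
    """Return True if the relative error |predicted - expected| / expected is below tolerance."""
    return abs(predicted - expected) / expected < tolerance


# Illustrative predictions (not real model output) against the dataset's reference speeds
print(within_tolerance(113, 120))  # relative error 7/120, about 0.058
print(within_tolerance(80, 70))    # relative error 10/70, about 0.143
```

The first prediction passes and the second fails. Because the comparison is strict, a prediction that is off by exactly 10% does not count as accurate.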

In [12]:
evaluation = weave.Evaluation(dataset=dataset, scorers=[speed_accuracy])

In [13]:
await evaluation.evaluate(speed_model)

{'model_output': {'speed': {'mean': 85.67999999999999}},
 'speed_accuracy': {'is_accurate': {'true_count': 4, 'true_fraction': 0.8}},
 'model_latency': {'mean': 2.987691783905029}}
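The bare `await` above works because notebooks allow top-level `await`. In a plain Python script there is no running event loop, so the coroutine has to be driven explicitly. A minimal sketch using `asyncio.run` follows; `run_evaluation` is a hypothetical wrapper, and `evaluation` and `speed_model` are assumed to be the objects defined earlier in this notebook.

```python
import asyncio


async def run_evaluation(evaluation, model):
    """Await the evaluation coroutine and return its summary."""
    return await evaluation.evaluate(model)


# In a script, start an event loop and run the coroutine to completion:
# summary = asyncio.run(run_evaluation(evaluation, speed_model))
# print(summary)
```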

We can inspect the evaluation results in the Weave web interface:

[![](imgs/cerebras_eval.png)](https://wandb.ai/capecape/cerebras_speedster/weave/calls/0191bc35-cd85-7a90-8a89-8ce56d66083d)

To learn more about Weave Evaluations, check out the [Weave Evaluation Documentation](https://weave-docs.wandb.ai/tutorial-eval).