<a href="https://colab.research.google.com/github/staerkjoe/NLP_colab/blob/main/AML4NLP2025_E04_Intro_W%26B_Weave.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Introduction: Eploring Weights & Biases Weave

ITU KSAMLDS1KU - Advanced Machine Learning for Data Science 2025

by Jonathan Tiedchen, Eisuke Okuda, Stefan Heinrich,
& material by Bertram Højer, Kevin Murphy, and Chris Bishop.

All info and static material: https://learnit.itu.dk/course/view.php?id=3024752

-------------------------------------------------------------------------------

### 1. Install dependencies

In [1]:
!pip install wandb weave transformers datasets torch

Collecting weave
  Downloading weave-0.52.8-py3-none-any.whl.metadata (27 kB)
Collecting diskcache==5.6.3 (from weave)
  Downloading diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
Collecting eval-type-backport (from weave)
  Downloading eval_type_backport-0.2.2-py3-none-any.whl.metadata (2.2 kB)
Collecting gql[aiohttp,requests] (from weave)
  Downloading gql-4.0.0-py3-none-any.whl.metadata (10 kB)
Collecting polyfile-weave (from weave)
  Downloading polyfile_weave-0.5.6-py3-none-any.whl.metadata (7.6 kB)
Collecting graphql-core<3.3,>=3.2 (from gql[aiohttp,requests]->weave)
  Downloading graphql_core-3.2.6-py3-none-any.whl.metadata (11 kB)
Collecting backoff<3.0,>=1.11.1 (from gql[aiohttp,requests]->weave)
  Downloading backoff-2.2.1-py3-none-any.whl.metadata (14 kB)
Collecting abnf~=2.2.0 (from polyfile-weave->weave)
  Downloading abnf-2.2.0-py3-none-any.whl.metadata (1.1 kB)
Collecting cint>=1.0.0 (from polyfile-weave->weave)
  Downloading cint-1.0.0-py3-none-any.whl.metadata (511 

### 2. Imports

In [1]:
import wandb
import weave
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

### 3. Initialize W&B and Weave

In [13]:
# Login to W&B
wandb.login()

# Start a new run
wandb_run = wandb.init(
    project="llm_prompt_logging",
    config={
        "model": "gpt2",
        "max_new_tokens": 20,
        "temperature": 0.2
    }
)

# Initialize Weave client
client = weave.init("jojs-it-universitetet-i-k-benhavn/llm_prompt_logging")

[34m[1mwandb[0m: Initializing weave.


Output()

[36m[1mweave[0m: wandb version 0.22.0 is available!  To upgrade, please run:
[36m[1mweave[0m:  $ pip install wandb --upgrade
[36m[1mweave[0m: Logged in as Weights & Biases user: jojs.
[36m[1mweave[0m: View Weave data at https://wandb.ai/jojs-it-universitetet-i-k-benhavn/llm_prompt_logging/weave


Output()

[36m[1mweave[0m: wandb version 0.22.0 is available!  To upgrade, please run:
[36m[1mweave[0m:  $ pip install wandb --upgrade
[36m[1mweave[0m: Logged in as Weights & Biases user: jojs.
[36m[1mweave[0m: View Weave data at https://wandb.ai/jojs-it-universitetet-i-k-benhavn/llm_prompt_logging/weave


### 4. Load a Small Model (tiny GPT-2)

In [14]:
model_name = wandb.config["model"]

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token


tokenizer_config.json:   0%|          | 0.00/26.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/665 [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/1.04M [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.36M [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/548M [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/124 [00:00<?, ?B/s]

### 5. Define a Prompt Function with Logging

In [15]:
def prompt_model(question, temperature=0.7, max_new_tokens=50):
    # Create Weave call
    call = client.create_call(
        op="prompt_model",
        inputs={
            "question": question,
            "temperature": temperature,
            "max_new_tokens": max_new_tokens
        }
    )

    # Encode input
    input_text = f"Q: {question}\nA:"
    inputs = tokenizer(input_text, return_tensors="pt")

    # Generate output
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=temperature,
            pad_token_id=tokenizer.eos_token_id
        )
    output_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    answer = output_text.split("A:")[-1].strip()

    # Log to W&B
    wandb.log({"question": question, "answer": answer})

    # Finish Weave call
    client.finish_call(call, output={"answer": answer})

    return answer


### 6. Run a few examples

In [16]:
questions = [
    "What is 2+2?",
    "Explain the capital of France.",
    "Who wrote the book 1984?"
]

for q in questions:
    ans = prompt_model(q, temperature=wandb.config["temperature"])
    print(f"Q: {q}\nA: {ans}\n")


[36m[1mweave[0m: 🍩 https://wandb.ai/jojs-it-universitetet-i-k-benhavn/llm_prompt_logging/r/call/01996c18-7260-76d1-b638-46c75eb325b1
[36m[1mweave[0m: 🍩 https://wandb.ai/jojs-it-universitetet-i-k-benhavn/llm_prompt_logging/r/call/01996c18-7e3c-7892-94f3-9261141a5850


Q: What is 2+2?
A: 2+2 is a number that is used to represent the number of times a given number is equal to



[36m[1mweave[0m: 🍩 https://wandb.ai/jojs-it-universitetet-i-k-benhavn/llm_prompt_logging/r/call/01996c18-89c4-7914-9ce6-ca115c081c51


Q: Explain the capital of France.
A: The capital of France is Paris. It is the capital of France. It is the capital of France. It is the capital of France. It is the capital of France. It is the capital of France. It is the capital of France. It

Q: Who wrote the book 1984?
A: I think it was a very good book. I think it was written by a very good writer. I think it was written by a very good writer. I think it was written by a very good writer. I think it was written by a very



### 7. Finish Run

In [17]:
wandb.finish()


0,1
answer,I think it was a ver...
question,Who wrote the book 1...
