# LangWatch DSPy Visualizer

This notebook walks through a simple DSPy optimization run, integrated with LangWatch to visualize and debug the training process.

[<img align="center" src="https://colab.research.google.com/assets/colab-badge.svg" />](https://colab.research.google.com/github/langwatch/langwatch/blob/main/python-sdk/examples/dspy_visualization.ipynb)

In [None]:
# Install langwatch along with dspy for the visualization
!pip install dspy-ai langwatch

## Preparing the LLM

In [1]:
import os
from getpass import getpass

import dspy

# Prompt for the key without echoing it into the notebook output.
os.environ["OPENAI_API_KEY"] = getpass("Enter your OPENAI_API_KEY: ")

llm = dspy.OpenAI(
    model="gpt-3.5-turbo",
    max_tokens=2048,
    temperature=0,
    api_key=os.environ["OPENAI_API_KEY"],
)

# Quick check that the key and the model work.
print("LLM test response:", llm("hello there"))

# ColBERTv2 retrieval server over Wikipedia 2017 abstracts, used as the retrieval model.
colbertv2_wiki17_abstracts = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
dspy.settings.configure(lm=llm, rm=colbertv2_wiki17_abstracts)

Enter your OPENAI_API_KEY: ··········
LLM test response: ['Hello! How can I assist you today?']
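
When the notebook is re-run in the same session, or the key is already exported in the environment, prompting again is unnecessary. Below is a minimal sketch of a helper that prompts only when the variable is missing. `ensure_env` is a hypothetical name introduced here for illustration; it is not part of DSPy or LangWatch.

```python
import os
from getpass import getpass


def ensure_env(name, prompt=getpass):
    """Return os.environ[name], asking for a value only if it is not set yet.

    Hypothetical helper for this notebook. `prompt` is injectable so the
    function can be exercised without an interactive terminal.
    """
    value = os.environ.get(name)
    if not value:
        # Ask once, then store the value so later cells can reuse it.
        value = prompt(f"Enter your {name}: ")
        os.environ[name] = value
    return value
```

With this in place, `ensure_env("OPENAI_API_KEY")` could stand in for the unconditional `getpass` call in the cell above.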


## Preparing the Dataset

In [2]:
from dspy.datasets import HotPotQA

# Load the dataset.
dataset = HotPotQA(train_seed=1, train_size=32, eval_seed=2023, dev_size=50, test_size=0)

# Tell DSPy that the 'question' field is the input. Any other fields are labels and/or metadata.
trainset = [x.with_inputs('question') for x in dataset.train]
devset = [x.with_inputs('question') for x in dataset.dev]

len(trainset), len(devset)


(32, 50)

## Defining the model

In [4]:
class GenerateAnswer(dspy.Signature):
    """Answer questions with short factoid answers."""

    context = dspy.InputField(desc="may contain relevant facts")
    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")


class RAG(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()

        self.retrieve = dspy.Retrieve(k=num_passages)
        self.generate_answer = dspy.ChainOfThought(GenerateAnswer)

    def forward(self, question):
        context = self.retrieve(question).passages
        prediction = self.generate_answer(context=context, question=question)
        return dspy.Prediction(context=context, answer=prediction.answer)


dev_example = devset[18]
print(f"[Devset] Question: {dev_example.question}")
print(f"[Devset] Answer: {dev_example.answer}")
print(f"[Devset] Relevant Wikipedia Titles: {dev_example.gold_titles}")

# Run the uncompiled RAG program on the example as a baseline.
rag = RAG()

pred = rag(question=dev_example.question)

# Print the input and the prediction.
print(f"[Prediction] Question: {dev_example.question}")
print(f"[Prediction] Predicted Answer: {pred.answer}")

[Devset] Question: What is the nationality of the chef and restaurateur featured in Restaurant: Impossible?
[Devset] Answer: English
[Devset] Relevant Wikipedia Titles: {'Restaurant: Impossible', 'Robert Irvine'}
[Prediction] Question: What is the nationality of the chef and restaurateur featured in Restaurant: Impossible?
[Prediction] Predicted Answer: American


## Log in to LangWatch

In [5]:
import langwatch

langwatch.login()

Please go to https://app.langwatch.ai/authorize to get your API key
Paste your API key here: ··········
LangWatch API key set


## Start Training Session!

In [8]:
from dspy.teleprompt import BootstrapFewShotWithRandomSearch
import dspy.evaluate

# Define the validation metric: the answer must match exactly and also appear in the retrieved context
def validate_context_and_answer(example, pred, trace=None):
    answer_EM = dspy.evaluate.answer_exact_match(example, pred)
    answer_PM = dspy.evaluate.answer_passage_match(example, pred)
    return answer_EM and answer_PM

# Set up a basic optimizer, which will compile our RAG program.
optimizer = BootstrapFewShotWithRandomSearch(
    metric=validate_context_and_answer,
    max_rounds=1,
    max_bootstrapped_demos=4,
    max_labeled_demos=4,
)

# Initialize LangWatch for this run so it can track the optimizer's compilation
langwatch.dspy.init(experiment="my-awesome-experiment", optimizer=optimizer)

# Compile
compiled_rag = optimizer.compile(RAG(), trainset=trainset)


[LangWatch] Experiment initialized, run_id: yellow-mamba-of-proficiency
[LangWatch] Open https://app.langwatch.ai/inbox-narrator/experiments/my-awesome-experiment?runIds=yellow-mamba-of-proficiency to track your DSPy training session live



Average Metric: 10 / 32  (31.2): 100%|██████████| 32/32 [00:00<00:00, 190.05it/s]
Average Metric: 11 / 32  (34.4): 100%|██████████| 32/32 [00:00<00:00, 272.69it/s]
 34%|███▍      | 11/32 [00:00<00:00, 466.05it/s]
Average Metric: 12 / 32  (37.5): 100%|██████████| 32/32 [00:00<00:00, 234.53it/s]
 41%|████      | 13/32 [00:00<00:00, 690.03it/s]
Average Metric: 12 / 32  (37.5): 100%|██████████| 32/32 [00:00<00:00, 338.70it/s]
  9%|▉         | 3/32 [00:00<00:00, 501.19it/s]
Average Metric: 10 / 32  (31.2): 100%|██████████| 32/32 [00:00<00:00, 389.88it/s]
 12%|█▎        | 4/32 [00:00<00:00, 533.12it/s]
Average Metric: 11 / 32  (34.4): 100%|██████████| 32/32 [00:00<00:00, 351.03it/s]
 12%|█▎        | 4/32 [00:00<00:00, 478.24it/s]
Average Metric: 11 / 32  (34.4): 100%|██████████| 32/32 [00:00<00:00, 616.36it/s]
 25%|██▌       | 8/32 [00:00<00:00, 584.59it/s]
Average Metric: 10 / 32  (31.2): 100%|██████████| 32/32 [00:00<00:00, 372.51it/s]
 19%|█▉        | 6/32 [00:00<00:00, 432.64it/s]
Averag
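
The metric passed to the optimizer counts an example as correct only when two checks both pass: the predicted answer matches the gold answer, and the gold answer appears in one of the retrieved passages. The sketch below is a simplified plain-Python stand-in for that logic, not DSPy's implementation; the function names and the normalization rules (lowercasing, dropping punctuation and English articles) are assumptions made for illustration, and DSPy's own helpers may differ in detail.

```python
import re
import string
from typing import Sequence


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(predicted: str, gold: str) -> bool:
    """True when both answers are identical after normalization."""
    return normalize(predicted) == normalize(gold)


def passage_match(passages: Sequence[str], gold: str) -> bool:
    """True when the normalized gold answer occurs in any retrieved passage."""
    target = normalize(gold)
    return any(target in normalize(passage) for passage in passages)


def validate(gold_answer: str, predicted_answer: str, passages: Sequence[str]) -> bool:
    """Both checks must pass, mirroring the `and` in the notebook's metric."""
    return exact_match(predicted_answer, gold_answer) and passage_match(passages, gold_answer)


# Illustrative passage written for this sketch, not actual retriever output.
passages = ["Robert Irvine is an English celebrity chef."]

print(validate("English", "American", passages))  # the dev example above: wrong answer
print(validate("English", "english", passages))   # right answer, supported by the passage
```

Requiring both checks means a lucky guess without supporting context does not count as a success, which keeps unsupported answers out of the bootstrapped demonstrations.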

In [9]:
compiled_rag.save("optimized_model.json")
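
To check what was written to disk, the saved file can be opened with the standard `json` module. This is a minimal sketch that assumes the file holds a single JSON object; the layout of that object depends on the DSPy version, so the helper only lists the top-level keys. `top_level_keys` is a hypothetical name introduced here.

```python
import json
from pathlib import Path


def top_level_keys(path):
    """Return the sorted top-level keys of the JSON object stored at `path`."""
    with Path(path).open(encoding="utf-8") as fh:
        state = json.load(fh)
    if not isinstance(state, dict):
        # Assumption of this sketch: the saved state is a JSON object.
        raise ValueError(f"expected a JSON object in {path}")
    return sorted(state)
```

For example, `top_level_keys("optimized_model.json")` would list the entries stored for the compiled program. To reuse the optimized program in a later session, a fresh `RAG()` instance can restore the state through the module's `load` method.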