<a href="https://colab.research.google.com/github/hanhanwu/Hanhan_COLAB_Experiemnts/blob/master/LLM_Practice/langwatch_dspy_optimization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LangWatch DSPy Visualizer

This notebook walks through a simple DSPy optimization run, integrated with LangWatch to visualize and debug the training process.

[<img align="center" src="https://colab.research.google.com/assets/colab-badge.svg" />](https://colab.research.google.com/github/langwatch/langwatch/blob/main/python-sdk/examples/dspy_visualization.ipynb)

In [2]:
%%capture --no-stderr
%pip install -U --quiet dspy langwatch

## Preparing the LLM

In [3]:
import dspy
from google.colab import userdata


# Read the OpenAI key from Colab's secret store instead of hard-coding it.
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')
llm = dspy.LM("openai/gpt-4.1-nano", api_key=OPENAI_API_KEY)
print("LLM test response:", llm("How much do you know about Robert Nishihara?"))

# Retrieval model: a ColBERTv2 server that indexes Wikipedia 2017 abstracts.
colbertv2_wiki17_abstracts = dspy.ColBERTv2(
    url="http://20.102.90.50:2017/wiki17_abstracts"
)
dspy.settings.configure(lm=llm, rm=colbertv2_wiki17_abstracts)

LLM test response: ['As of my knowledge cutoff in October 2023, Robert Nishihara is a researcher known for his work in machine learning and artificial intelligence. He has contributed to areas such as reinforcement learning, distributed computing, and scalable algorithms. Nishihara has been involved with prominent research institutions and has published papers on topics like large-scale optimization and parallel computing frameworks for machine learning. If you have specific questions about his work or background, please let me know!']


## Preparing the Dataset

In [8]:
from dspy.datasets import HotPotQA


dataset = HotPotQA(train_seed=1, train_size=32, eval_seed=2025, dev_size=50, test_size=0)
trainset = [x.with_inputs('question') for x in dataset.train]
devset = [x.with_inputs('question') for x in dataset.dev]

print(len(trainset), len(devset))
print(trainset[0])
print(devset[0])

32 50
Example({'question': 'At My Window was released by which American singer-songwriter?', 'answer': 'John Townes Van Zandt'}) (input_keys={'question'})
Example({'question': 'Pehchaan: The Face of Truth stars Vinod Khanna, Rati Agnihotri and which Indian actress, producer, and former model who also produced the film?', 'answer': 'Raveena Tandon', 'gold_titles': {'Pehchaan: The Face of Truth', 'Raveena Tandon'}}) (input_keys={'question'})


## Defining the model

In [9]:
class GenerateAnswer(dspy.Signature):
    """Answer questions with short factoid answers."""
    context = dspy.InputField(desc="may contain relevant facts")
    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")


class RAG(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()

        self.retrieve = dspy.Retrieve(k=num_passages)
        self.generate_answer = dspy.ChainOfThought(GenerateAnswer)

    def forward(self, question):
        context = self.retrieve(question).passages
        print('HANHAN TEST', context)  # debug: passages come from the ColBERTv2 Wikipedia-abstracts index configured via dspy.settings, not a live web search
        prediction = self.generate_answer(context=context, question=question)
        return dspy.Prediction(context=context, answer=prediction.answer)


dev_example = devset[18]
print(f"[Devset] Question: {dev_example.question}")
print(f"[Devset] Answer: {dev_example.answer}")
print(f"[Devset] Relevant Wikipedia Titles: {dev_example.gold_titles}")

generate_answer = RAG()
pred = generate_answer(question=dev_example.question)

# Print the input and the prediction.
print(f"[Prediction] Question: {dev_example.question}")
print(f"[Prediction] Predicted Answer: {pred.answer}")

[Devset] Question: Which magazine was released first, Fortune or Motor Trend?
[Devset] Answer: Motor Trend
[Devset] Relevant Wikipedia Titles: {'Fortune (magazine)', 'Motor Trend'}
HANHAN TEST ['Motor Trend | Motor Trend is an American automobile magazine. It first appeared in September 1949, issued by Petersen Publishing Company in Los Angeles, and bearing the tagline "The Magazine for a Motoring World". Petersen Publishing was sold to British publisher EMAP in 1998, who sold the former Petersen magazines to Primedia in 2001. As of 2017, it is published by (formerly Source Interlink Media). It has a monthly circulation of over one million readers.', 'Motor (Hearst magazine) | MOTOR is a monthly American automobile magazine first published by Hearst Magazines in 1903 as "THE MOTOR". The founder was William Randolph Hearst. The magazine is based in Troy, Michigan.', 'The Motor | The Motor (later, just Motor) was a British weekly car magazine founded on 28 January 1903 and published by T

## Log in to LangWatch

In [13]:
import langwatch


langwatch.login()

Please go to https://app.langwatch.ai/authorize to get your API key
Paste your API key here: ··········
LangWatch API key set


## Start Training Session!
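
The metric in the cell below accepts a prediction only when two checks pass: the predicted answer exactly matches the gold answer, and the gold answer appears in the retrieved context. The following is a minimal pure-Python sketch of that idea, not DSPy's implementation. The helper names (`normalize_text`, `exact_match`, `passage_match`, `validate`) are hypothetical, and the normalization rules (lowercasing, dropping punctuation and English articles) are an assumption about typical QA scoring.

```python
import re
import string


def normalize_text(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(gold, predicted):
    """True when the answers are identical after normalization (assumed rule)."""
    return normalize_text(gold) == normalize_text(predicted)


def passage_match(gold, passages):
    """True when the normalized gold answer occurs in any retrieved passage."""
    needle = normalize_text(gold)
    return any(needle in normalize_text(passage) for passage in passages)


def validate(gold, predicted, passages):
    """Hypothetical stand-in for the notebook's combined metric."""
    return exact_match(gold, predicted) and passage_match(gold, passages)


# Shortened passage taken from the retrieval output shown earlier in the notebook.
passages = ["Motor Trend | Motor Trend is an American automobile magazine."]

print(validate("Motor Trend", "motor trend.", passages))
print(validate("Motor Trend", "Fortune", passages))
```

Requiring both checks means a correct answer is rejected if the retriever did not surface a supporting passage, which keeps the optimizer from rewarding answers the model produced without grounding.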

In [17]:
from dspy.teleprompt import MIPROv2
import dspy.evaluate

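# Validation metric: the predicted answer must exactly match the gold answer,
# and the gold answer must appear in the retrieved context.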
def validate_context_and_answer(example, pred, trace=None):
    answer_EM = dspy.evaluate.answer_exact_match(example, pred)
    answer_PM = dspy.evaluate.answer_passage_match(example, pred)
    return answer_EM and answer_PM

# Set up a MIPROv2 optimizer, which will compile our RAG program.
optimizer = MIPROv2(metric=validate_context_and_answer, prompt_model=llm,
                    task_model=llm, num_candidates=2, init_temperature=0.7,
                    auto=None)

# Initialize langwatch for this run, to track the optimizer compilation
langwatch.dspy.init(experiment="hanhan_exp1", optimizer=optimizer)

# Compile
compiled_rag = optimizer.compile(
    RAG(),
    trainset=trainset,
    num_trials=10,
    max_bootstrapped_demos=3,
    max_labeled_demos=5,
    minibatch_size=10,
    requires_permission_to_run=False
)

2025/05/09 22:05:27 INFO dspy.teleprompt.mipro_optimizer_v2: 
==> STEP 1: BOOTSTRAP FEWSHOT EXAMPLES <==
2025/05/09 22:05:27 INFO dspy.teleprompt.mipro_optimizer_v2: These will be used as few-shot example candidates for our program and for creating instructions.

2025/05/09 22:05:27 INFO dspy.teleprompt.mipro_optimizer_v2: Bootstrapping N=2 sets of demonstrations...
2025/05/09 22:05:27 INFO dspy.teleprompt.mipro_optimizer_v2: 
==> STEP 2: PROPOSE INSTRUCTION CANDIDATES <==
2025/05/09 22:05:27 INFO dspy.teleprompt.mipro_optimizer_v2: We will use the few-shot examples from the previous step, a generated dataset summary, a summary of the program code, and a randomly selected prompting tip to propose instructions.



[LangWatch] Experiment initialized, run_id: feathered-splendid-guan
[LangWatch] Open https://app.langwatch.ai/my-garden-vZCaox/experiments/hanhan-exp1?runIds=feathered-splendid-guan to track your DSPy training session live

Bootstrapping set 1/2
Bootstrapping set 2/2


2025/05/09 22:05:29 INFO dspy.teleprompt.mipro_optimizer_v2: 
Proposing N=2 instructions...

2025/05/09 22:05:35 INFO dspy.teleprompt.mipro_optimizer_v2: Proposed Instructions for Predictor 0:

2025/05/09 22:05:35 INFO dspy.teleprompt.mipro_optimizer_v2: 0: Answer questions with short factoid answers.

2025/05/09 22:05:35 INFO dspy.teleprompt.mipro_optimizer_v2: 1: Given a context containing relevant facts and a natural language question, generate a step-by-step reasoning process leading to a concise, fact-based answer, typically between one and five words. Ensure the reasoning clearly explains how the answer is derived from the context, and provide the final answer directly after the reasoning. Maintain clarity and brevity throughout the response.

2025/05/09 22:05:35 INFO dspy.teleprompt.mipro_optimizer_v2: 

2025/05/09 22:05:35 INFO dspy.teleprompt.mipro_optimizer_v2: ==> STEP 3: FINDING OPTIMAL PROMPT PARAMETERS <==
2025/05/09 22:05:35 INFO dspy.teleprompt.mipro_optimizer_v2: We wi

  0%|          | 0/25 [00:00<?, ?it/s]HANHAN TEST ['Battle of Kursk | The Battle of Kursk was a Second World War engagement between German and Soviet forces on the Eastern Front near Kursk (450 km south-west of Moscow) in the Soviet Union during July and August 1943. The battle began with the launch of the German offensive, Operation Citadel (German: "Unternehmen Zitadelle" ), on 5 July, which had the objective of pinching off the Kursk salient with attacks on the base of the salient from north and south simultaneously. After the German offensive stalled on the northern side of the salient, on 12 July the Soviets commenced their Kursk Strategic Offensive Operation with the launch of Operation Kutuzov (Russian: Кутузов ) against the rear of the German forces in the northern side. On the southern side, the Soviets also launched powerful counterattacks the same day, one of which led to a large armoured clash, the Battle of Prokhorovka. On 3 August, the Soviets began the second phase of t

2025/05/09 22:05:42 INFO dspy.evaluate.evaluate: Average Metric: 6 / 25 (24.0%)





2025/05/09 22:05:43 INFO dspy.teleprompt.mipro_optimizer_v2: Default program score: 24.0

2025/05/09 22:05:43 INFO dspy.teleprompt.mipro_optimizer_v2: == Trial 2 / 13 - Minibatch ==


HANHAN TEST ['Remember Me Ballin\' | Remember Me Ballin\' is the CD single by Indo G featuring Gangsta Boo. The song is sampled from Curtis Mayfield\'s "Give Me Your Love" featured on the hit soundtrack, Super Fly. Like Indo\'s last single, the song had a video released with it .', 'Ballin (Juicy J song) | "Ballin" is a song by American rapper Juicy J. It was released on September 30, 2016 by Kemosabe Records and Columbia Records, as a standalone single. The song feature vocals from Kanye West, and was produced by TM88, Juicy J, Crazy Mike and Cubeatz. The song samples "I\'m Not Gonna Give Up" by R&B singer Eddie Holman.', 'Indo G | Indo G (born c. 1973 Birth Name Tobian Tools ) is an American rapper from Memphis, Tennessee. First hitting the Memphis rap scene with fellow Memphian, Lil\' Blunt, in the mid-1990s, they released two albums on Luke Records, "Up In Smoke" (1995) and "The Antidote" (1995). Soon after, Indo G became affiliated with Three 6 Mafia and released "Angel Dust" in 1

2025/05/09 22:05:46 INFO dspy.evaluate.evaluate: Average Metric: 2 / 10 (20.0%)





2025/05/09 22:05:47 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 20.0 on minibatch of size 10 with parameters ['Predictor 0: Instruction 1', 'Predictor 0: Few-Shot Set 0'].
2025/05/09 22:05:47 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [20.0]
2025/05/09 22:05:47 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [24.0]
2025/05/09 22:05:47 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 24.0


2025/05/09 22:05:47 INFO dspy.teleprompt.mipro_optimizer_v2: == Trial 3 / 13 - Minibatch ==


HANHAN TESTHANHAN TEST ['Sunflower Slow Drag | "Sunflower Slow Drag" is a ragtime composition by Scott Joplin and Scott Hayden. It is about four minutes long and has been described as "full of gaiety and sunshine".', 'Scott Joplin | Scott Joplin ( ; 1867/68 or November 24, 1868– April 1, 1917) was an African-American composer and pianist. Joplin achieved fame for his ragtime compositions and was dubbed the "King of Ragtime". During his brief career, he wrote 44 original ragtime pieces, one ragtime ballet, and two operas. One of his first, and most popular pieces, the "Maple Leaf Rag", became ragtime\'s first and most influential hit, and has been recognized as the rag.', 'Maple Leaf Rag | The "Maple Leaf Rag" (copyright registered on September 18, 1899) is an early ragtime musical composition for piano composed by Scott Joplin. It was one of Joplin\'s early works, and became the model for ragtime compositions by subsequent composers. It is one of the most famous of all ragtime pieces. 

2025/05/09 22:05:49 INFO dspy.evaluate.evaluate: Average Metric: 4 / 10 (40.0%)





2025/05/09 22:05:50 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 40.0 on minibatch of size 10 with parameters ['Predictor 0: Instruction 1', 'Predictor 0: Few-Shot Set 0'].
2025/05/09 22:05:50 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [20.0, 40.0]
2025/05/09 22:05:50 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [24.0]
2025/05/09 22:05:50 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 24.0


2025/05/09 22:05:50 INFO dspy.teleprompt.mipro_optimizer_v2: == Trial 4 / 13 - Minibatch ==


HANHAN TEST ['Harold E. Wilson | Chief Warrant Officer Harold Edward Wilson (December 5, 1921 – March 29, 1998) was a United States Marine who earned the United States’ military highest award, the Medal of Honor, for heroism as a platoon sergeant of a rifle platoon in Korea on the night of 23-April 24, 1951. He received the award from President Harry S. Truman during ceremonies at the White House on April 11, 1952.', 'Arthur H. Wilson | Arthur Harrison Wilson (1881–1953) was an officer in the United States Army and a Medal of Honor recipient for his actions in the Philippine Insurrection. Originally a member of the West Point class of 1903, he was held back a year and graduated in 1904. He was the captain of the world champion West Point Polo Team, and served a long career in the Cavalry. He was a full colonel and commander of Fort Brown, Texas, when he retired in 1942 after almost 40 years service. He lived the remaining 11 years of his life in retirement at Brownsville, Texas. He die

2025/05/09 22:05:52 INFO dspy.evaluate.evaluate: Average Metric: 5 / 10 (50.0%)





2025/05/09 22:05:53 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 50.0 on minibatch of size 10 with parameters ['Predictor 0: Instruction 1', 'Predictor 0: Few-Shot Set 1'].
2025/05/09 22:05:53 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [20.0, 40.0, 50.0]
2025/05/09 22:05:53 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [24.0]
2025/05/09 22:05:53 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 24.0


2025/05/09 22:05:53 INFO dspy.teleprompt.mipro_optimizer_v2: == Trial 5 / 13 - Minibatch ==


HANHAN TEST ['Davy Crockett, King of the Wild Frontier | Davy Crockett, King of the Wild Frontier is a 1955 live-action Walt Disney adventure film starring Fess Parker as Davy Crockett. This film is an edited compilation of the first three stories from the Disney television miniseries "Davy Crockett" :', 'Davy Crockett and the River Pirates | Davy Crockett and the River Pirates is a 1956 live-action Walt Disney adventure film starring Fess Parker as Davy Crockett. It was shot in Cave-In-Rock, Illinois. This film acts as a prequel to 1955\'s "Davy Crockett, King of the Wild Frontier" and is an edited compilation of the fourth and fifth stories featuring the Disney television series "Davy Crockett":', 'Daniel Boone, Trail Blazer | Daniel Boone, Trail Blazer is a 1956 American western adventure film co-produced and directed by Albert C. Gannaway and Ismael Rodríguez and starring Bruce Bennett, Lon Chaney Jr. and Faron Young. The film was shot in Trucolor in Mexico. It was released by Repu

2025/05/09 22:05:53 INFO dspy.evaluate.evaluate: Average Metric: 3 / 10 (30.0%)





2025/05/09 22:05:54 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 30.0 on minibatch of size 10 with parameters ['Predictor 0: Instruction 0', 'Predictor 0: Few-Shot Set 0'].
2025/05/09 22:05:54 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [20.0, 40.0, 50.0, 30.0]
2025/05/09 22:05:54 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [24.0]
2025/05/09 22:05:54 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 24.0


2025/05/09 22:05:54 INFO dspy.teleprompt.mipro_optimizer_v2: == Trial 6 / 13 - Minibatch ==


HANHAN TEST ['Michael Biehn | Michael Connell Biehn (born July 31, 1956) is an American actor, primarily known for his military roles in science fiction films directed by James Cameron; as Sgt. Kyle Reese in "The Terminator" (1984), Cpl. Dwayne Hicks in "Aliens" (1986) and Lt. Coffey in "The Abyss" (1989). He was nominated for the Saturn Award for Best Actor for "Aliens." His other films include "The Fan" (1981), "K2" (1991), "Tombstone" (1993), "The Rock" (1996), "" (2001) and "Planet Terror" (2007). On television, he has appeared in "Hill Street Blues" (1984) and "Adventure Inc." (2002-03).', 'Quintin Sondergaard | Quentin Charles Sondergaard, known primarily as Quintin Sondergaard (January 11, 1925 – February 15, 1984), was an American actor principally active on television westerns from 1957-70. He had a supporting role with eleven appearances as "Deputy Quint" on the series "Tombstone Territory", with co-stars Pat Conway, Richard Eastham, and Gilman Rankin. "Tombstone Territory" b

2025/05/09 22:05:56 INFO dspy.evaluate.evaluate: Average Metric: 4 / 10 (40.0%)





2025/05/09 22:05:57 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 40.0 on minibatch of size 10 with parameters ['Predictor 0: Instruction 0', 'Predictor 0: Few-Shot Set 1'].
2025/05/09 22:05:57 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [20.0, 40.0, 50.0, 30.0, 40.0]
2025/05/09 22:05:57 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [24.0]
2025/05/09 22:05:57 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 24.0


2025/05/09 22:05:57 INFO dspy.teleprompt.mipro_optimizer_v2: ===== Trial 7 / 13 - Full Evaluation =====
2025/05/09 22:05:57 INFO dspy.teleprompt.mipro_optimizer_v2: Doing full eval on next top averaging program (Avg Score: 50.0) from minibatch trials...


HANHAN TEST ['Rosario Dawson | Rosario Isabel Dawson (born May 9, 1979) is an American actress, producer, singer, comic book writer, and political activist. She made her film debut in the 1995 teen drama "Kids". Her subsequent film roles include "He Got Game", "Men in Black II", "25th Hour", "Rent", "Sin City", "Death Proof", "Seven Pounds", "", and "Top Five". Dawson has also provided voice-over work for Disney and DC.', 'Sarai Gonzalez | Sarai Isaura Gonzalez (born 2005) is an American Latina child actress who made her professional debut at the age of 11 on the Spanish-language ""Soy Yo"" ("That\'s Me") music video by Bomba Estéreo. Cast as a "nerdy" tween with a "sassy" and "confident" attitude, her performance turned her into a "Latina icon" for "female empowerment, identity and self-worth". She subsequently appeared in two get out the vote videos for Latinos in advance of the 2016 United States elections.', 'Gabriela (2001 film) | Gabriela is a 2001 American romance film, starring

2025/05/09 22:05:59 INFO dspy.evaluate.evaluate: Average Metric: 10 / 25 (40.0%)





2025/05/09 22:06:00 INFO dspy.teleprompt.mipro_optimizer_v2: New best full eval score! Score: 40.0
2025/05/09 22:06:00 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [24.0, 40.0]
2025/05/09 22:06:00 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 40.0
2025/05/09 22:06:00 INFO dspy.teleprompt.mipro_optimizer_v2: 

2025/05/09 22:06:00 INFO dspy.teleprompt.mipro_optimizer_v2: == Trial 8 / 13 - Minibatch ==


HANHAN TEST ['Davy Crockett, King of the Wild Frontier | Davy Crockett, King of the Wild Frontier is a 1955 live-action Walt Disney adventure film starring Fess Parker as Davy Crockett. This film is an edited compilation of the first three stories from the Disney television miniseries "Davy Crockett" :', 'Davy Crockett and the River Pirates | Davy Crockett and the River Pirates is a 1956 live-action Walt Disney adventure film starring Fess Parker as Davy Crockett. It was shot in Cave-In-Rock, Illinois. This film acts as a prequel to 1955\'s "Davy Crockett, King of the Wild Frontier" and is an edited compilation of the fourth and fifth stories featuring the Disney television series "Davy Crockett":', 'Daniel Boone, Trail Blazer | Daniel Boone, Trail Blazer is a 1956 American western adventure film co-produced and directed by Albert C. Gannaway and Ismael Rodríguez and starring Bruce Bennett, Lon Chaney Jr. and Faron Young. The film was shot in Trucolor in Mexico. It was released by Repu

2025/05/09 22:06:02 INFO dspy.evaluate.evaluate: Average Metric: 5 / 10 (50.0%)





2025/05/09 22:06:03 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 50.0 on minibatch of size 10 with parameters ['Predictor 0: Instruction 1', 'Predictor 0: Few-Shot Set 0'].
2025/05/09 22:06:03 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [20.0, 40.0, 50.0, 30.0, 40.0, 50.0]
2025/05/09 22:06:03 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [24.0, 40.0]
2025/05/09 22:06:03 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 40.0


2025/05/09 22:06:03 INFO dspy.teleprompt.mipro_optimizer_v2: == Trial 9 / 13 - Minibatch ==


HANHAN TEST ['Ashley Holliday | Ashley Holliday Tavares is an American actress best known for playing Chloe Delgado on the 2010 ABC Family series "Huge" and Melissa Sanders on the 2012 Nick at Nite serial drama "Hollywood Heights".', "Mary-Kate and Ashley in Action! | Mary-Kate and Ashley in Action! is an animated television series featuring the voices and likeness of Mary-Kate and Ashley Olsen. It is also a series of books that spun off, from the show. The show premiered on October 20, 2001 on the ABC block Disney's One Saturday Morning, and was cancelled after one season due to negative ratings. Reruns were later shown on Toon Disney and will be seen on Nickelodeon in 2016, as part of the network's purchasing of the television rights to Mary-Kate and Ashley's TV series and movies.", 'Sprout Sharing Show | The Sprout Sharing Show was a programming block on the Sprout cable channel. The show premiered on May 5, 2008, airing on daily afternoons (4PM-6PM EST) in the time slot formerly oc

2025/05/09 22:06:03 INFO dspy.evaluate.evaluate: Average Metric: 1 / 10 (10.0%)





2025/05/09 22:06:04 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 10.0 on minibatch of size 10 with parameters ['Predictor 0: Instruction 0', 'Predictor 0: Few-Shot Set 0'].
2025/05/09 22:06:04 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [20.0, 40.0, 50.0, 30.0, 40.0, 50.0, 10.0]
2025/05/09 22:06:04 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [24.0, 40.0]
2025/05/09 22:06:04 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 40.0


2025/05/09 22:06:04 INFO dspy.teleprompt.mipro_optimizer_v2: == Trial 10 / 13 - Minibatch ==


HANHAN TEST ['Samantha Cristoforetti | Samantha Cristoforetti (] ; born 26 April 1977 in Milan) is an Italian European Space Agency astronaut, Italian Air Force pilot and engineer. She holds the record for the longest uninterrupted spaceflight of a European astronaut (199 days, 16 hours), and until June 2017 held the record for the longest single space flight by a woman until this was broken by Peggy Whitson. She is also the first Italian woman in space. Samantha Cristoforetti is also known as the first person who brewed an espresso coffee in space.', 'ISSpresso | ISSpresso is the first espresso coffee machine designed for use in space, produced for the International Space Station by Argotec and Lavazza in a public-private partnership with the Italian Space Agency (ASI). The first espresso coffee was drunk in space by astronaut Samantha Cristoforetti on 3 May 2015. ISSpresso is one of nine experiments selected by the Italian Space Agency for the Futura mission.', 'Mark Shuttleworth | M

2025/05/09 22:06:04 INFO dspy.evaluate.evaluate: Average Metric: 4 / 10 (40.0%)





2025/05/09 22:06:05 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 40.0 on minibatch of size 10 with parameters ['Predictor 0: Instruction 1', 'Predictor 0: Few-Shot Set 1'].
2025/05/09 22:06:05 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [20.0, 40.0, 50.0, 30.0, 40.0, 50.0, 10.0, 40.0]
2025/05/09 22:06:05 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [24.0, 40.0]
2025/05/09 22:06:05 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 40.0


2025/05/09 22:06:05 INFO dspy.teleprompt.mipro_optimizer_v2: == Trial 11 / 13 - Minibatch ==


HANHAN TEST ['Sunflower Slow Drag | "Sunflower Slow Drag" is a ragtime composition by Scott Joplin and Scott Hayden. It is about four minutes long and has been described as "full of gaiety and sunshine".', 'Scott Joplin | Scott Joplin ( ; 1867/68 or November 24, 1868– April 1, 1917) was an African-American composer and pianist. Joplin achieved fame for his ragtime compositions and was dubbed the "King of Ragtime". During his brief career, he wrote 44 original ragtime pieces, one ragtime ballet, and two operas. One of his first, and most popular pieces, the "Maple Leaf Rag", became ragtime\'s first and most influential hit, and has been recognized as the rag.', 'Maple Leaf Rag | The "Maple Leaf Rag" (copyright registered on September 18, 1899) is an early ragtime musical composition for piano composed by Scott Joplin. It was one of Joplin\'s early works, and became the model for ragtime compositions by subsequent composers. It is one of the most famous of all ragtime pieces. As a result

2025/05/09 22:06:05 INFO dspy.evaluate.evaluate: Average Metric: 2 / 10 (20.0%)





2025/05/09 22:06:06 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 20.0 on minibatch of size 10 with parameters ['Predictor 0: Instruction 1', 'Predictor 0: Few-Shot Set 1'].
2025/05/09 22:06:06 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [20.0, 40.0, 50.0, 30.0, 40.0, 50.0, 10.0, 40.0, 20.0]
2025/05/09 22:06:06 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [24.0, 40.0]
2025/05/09 22:06:06 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 40.0


2025/05/09 22:06:06 INFO dspy.teleprompt.mipro_optimizer_v2: == Trial 12 / 13 - Minibatch ==


HANHAN TEST ['Diogal | Diogal Sakho (born 1970, Ngor, Dakar) is a Senegalese singer and musician.', 'Sukho (island) | Sukho island is small artificial island located in the south-east of Lake Ladoga 20 km near its southern shore. The island has the shape of an irregular horseshoe. Dimensions of this island is approximately 90 to 60 m.', 'Souleymane Cissokho | Souleymane Diop Cissokho (born 4 July 1991) is a French amateur welterweight boxer who won a bronze medal at the 2016 Summer Olympics.']
HANHAN TEST ['Kerry Condon | Kerry Condon (born 4 January 1983) is an Irish television and film actress, best known for her role as Octavia of the Julii in the HBO/BBC series "Rome," as Stacey Ehrmantraut in AMC\'s "Better Call Saul" and as the voice of F.R.I.D.A.Y. in various films in the Marvel Cinematic Universe. She is also the youngest actress ever to play Ophelia in a Royal Shakespeare Company production of "Hamlet."', 'Corona Riccardo | Corona Riccardo (c. 1878October 15, 1917) was an Ital

2025/05/09 22:06:06 INFO dspy.evaluate.evaluate: Average Metric: 4 / 10 (40.0%)





2025/05/09 22:06:07 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 40.0 on minibatch of size 10 with parameters ['Predictor 0: Instruction 1', 'Predictor 0: Few-Shot Set 0'].
2025/05/09 22:06:07 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [20.0, 40.0, 50.0, 30.0, 40.0, 50.0, 10.0, 40.0, 20.0, 40.0]
2025/05/09 22:06:07 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [24.0, 40.0]
2025/05/09 22:06:07 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 40.0


2025/05/09 22:06:07 INFO dspy.teleprompt.mipro_optimizer_v2: ===== Trial 13 / 13 - Full Evaluation =====
2025/05/09 22:06:07 INFO dspy.teleprompt.mipro_optimizer_v2: Doing full eval on next top averaging program (Avg Score: 40.0) from minibatch trials...


HANHAN TEST ['Rosario Dawson | Rosario Isabel Dawson (born May 9, 1979) is an American actress, producer, singer, comic book writer, and political activist. She made her film debut in the 1995 teen drama "Kids". Her subsequent film roles include "He Got Game", "Men in Black II", "25th Hour", "Rent", "Sin City", "Death Proof", "Seven Pounds", "", and "Top Five". Dawson has also provided voice-over work for Disney and DC.', 'Sarai Gonzalez | Sarai Isaura Gonzalez (born 2005) is an American Latina child actress who made her professional debut at the age of 11 on the Spanish-language ""Soy Yo"" ("That\'s Me") music video by Bomba Estéreo. Cast as a "nerdy" tween with a "sassy" and "confident" attitude, her performance turned her into a "Latina icon" for "female empowerment, identity and self-worth". She subsequently appeared in two get out the vote videos for Latinos in advance of the 2016 United States elections.', 'Gabriela (2001 film) | Gabriela is a 2001 American romance film, starring

2025/05/09 22:06:10 INFO dspy.evaluate.evaluate: Average Metric: 10 / 25 (40.0%)





2025/05/09 22:06:11 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [24.0, 40.0, 40.0]
2025/05/09 22:06:11 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 40.0
2025/05/09 22:06:11 INFO dspy.teleprompt.mipro_optimizer_v2: 

2025/05/09 22:06:11 INFO dspy.teleprompt.mipro_optimizer_v2: Returning best identified program with score 40.0!
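
The log above shows a full evaluation after every five minibatch trials (trials 7 and 13), each time on the "next top averaging program". The sketch below recomputes those averages from the minibatch scores reported in the log. It is an illustration of how the logged numbers fit together, not the optimizer's actual code: the function name `top_averaging` is hypothetical, and skipping configurations that were already fully evaluated is an assumption inferred from the word "next" in the log message.

```python
from collections import defaultdict

# (trial, instruction index, few-shot set index, minibatch score), copied from the log.
minibatch_trials = [
    (2, 1, 0, 20.0),
    (3, 1, 0, 40.0),
    (4, 1, 1, 50.0),
    (5, 0, 0, 30.0),
    (6, 0, 1, 40.0),
    (8, 1, 0, 50.0),
    (9, 0, 0, 10.0),
    (10, 1, 1, 40.0),
    (11, 1, 1, 20.0),
    (12, 1, 0, 40.0),
]


def top_averaging(trials, before_trial, already_evaluated):
    """Return (config, mean score) for the best mean minibatch score.

    Only trials numbered below `before_trial` count, and configurations in
    `already_evaluated` are skipped (an assumption, see the lead-in).
    """
    scores = defaultdict(list)
    for trial, instruction, demo_set, score in trials:
        if trial < before_trial:
            scores[(instruction, demo_set)].append(score)
    means = {
        config: sum(values) / len(values)
        for config, values in scores.items()
        if config not in already_evaluated
    }
    best = max(means, key=means.get)
    return best, means[best]


first = top_averaging(minibatch_trials, before_trial=7, already_evaluated=set())
second = top_averaging(minibatch_trials, before_trial=13, already_evaluated={first[0]})

print(first)   # ((1, 1), 50.0)
print(second)  # ((0, 1), 40.0)
```

Both results agree with the averages the log reports before each full evaluation (50.0 and 40.0). Note that individual minibatch scores are noisy: the configuration that scored 50.0 on one 10-example minibatch reached 40.0 on the full 25-example evaluation.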


In [18]:
compiled_rag

generate_answer.predict = Predict(StringSignature(context, question -> reasoning, answer
    instructions='Given a context containing relevant facts and a natural language question, generate a step-by-step reasoning process leading to a concise, fact-based answer, typically between one and five words. Ensure the reasoning clearly explains how the answer is derived from the context, and provide the final answer directly after the reasoning. Maintain clarity and brevity throughout the response.'
    context = Field(annotation=str required=True json_schema_extra={'desc': 'may contain relevant facts', '__dspy_field_type': 'input', 'prefix': 'Context:'})
    question = Field(annotation=str required=True json_schema_extra={'__dspy_field_type': 'input', 'prefix': 'Question:', 'desc': '${question}'})
    reasoning = Field(annotation=str required=True json_schema_extra={'prefix': "Reasoning: Let's think step by step in order to", 'desc': '${reasoning}', '__dspy_field_type': 'output'})
    answe

In [19]:
compiled_rag.save("optimized_model.json")