<a href="https://colab.research.google.com/github/Ankitarora2/TechnoCulture/blob/main/Another_copy_of_OnboardingHelper.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Serve the model using vLLM

In [14]:
!pip install dspy-ai vllm



In [15]:
# Run server in foreground
# !python -m vllm.entrypoints.openai.api_server --model TheBloke/dolphin-2.6-mistral-7B-dpo-laser-AWQ --quantization awq

# Run server in the background
!nohup python -m vllm.entrypoints.openai.api_server --model TheBloke/dolphin-2.6-mistral-7B-dpo-laser-AWQ --quantization awq > server.log 2>&1 &
# Both stdout and stderr are redirected to `server.log` (`> server.log 2>&1`).
# The model is AWQ-quantized, hence `--quantization awq`.

In [16]:
# Re-run this cell to monitor the status of the server.
# The server can take a few minutes to start.
# Once the server has started, you will see log lines such as this:
# INFO 02-10 07:16:43 llm_engine.py:877] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%
!tail server.log

    cls.applINFO 02-10 17:25:22 llm_engine.py:877] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%
INFO 02-10 17:25:32 llm_engine.py:877] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%
INFO 02-10 17:25:42 llm_engine.py:877] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%
INFO 02-10 17:25:52 llm_engine.py:877] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%
INFO 02-10 17:26:02 llm_engine.py:877] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 token
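
Rather than re-running `tail` by hand, a small polling loop can wait until the server answers. The sketch below is an assumption-laden helper, not part of vLLM or DSPy: `server_is_up` and `wait_for_server` are hypothetical names, and it only assumes that the OpenAI-compatible server answers `GET /v1/models` with status 200 once it is ready.

```python
import time
import urllib.error
import urllib.request


def server_is_up(url: str) -> bool:
    """Return True if an HTTP GET on `url` answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, HTTP error: treat all as "not ready yet".
        return False


def wait_for_server(url, timeout_s=600.0, poll_s=5.0, probe=server_is_up, sleep=time.sleep):
    """Poll `probe(url)` until it succeeds or roughly `timeout_s` seconds of sleeping have passed."""
    waited = 0.0
    while True:
        if probe(url):
            return True
        if waited >= timeout_s:
            return False
        sleep(poll_s)
        waited += poll_s
```

In the notebook this could be called as `wait_for_server("http://localhost:8000/v1/models")` right after launching the background process. The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.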

In [17]:
# Once the server is up and running, this should work
!curl http://localhost:8000/v1/models

{"object":"list","data":[{"id":"TheBloke/dolphin-2.6-mistral-7B-dpo-laser-AWQ","object":"model","created":1707586019,"owned_by":"vllm","root":"TheBloke/dolphin-2.6-mistral-7B-dpo-laser-AWQ","parent":null,"permission":[{"id":"modelperm-c935a609a0db4491b073ee6021578251","object":"model_permission","created":1707586019,"allow_create_engine":false,"allow_sampling":true,"allow_logprobs":true,"allow_search_indices":false,"allow_view":true,"allow_fine_tuning":false,"organization":"*","group":null,"is_blocking":false}]}]}

# DSPy: 𝗗eclarative 𝗦elf-improving Language 𝗣rograms

In [18]:
import dspy
from dspy.evaluate import Evaluate
from dspy.teleprompt import BootstrapFewShot, BootstrapFewShotWithRandomSearch, BootstrapFinetune

In [19]:
lm = dspy.HFClientVLLM(model="TheBloke/dolphin-2.6-mistral-7B-dpo-laser-AWQ", port=8000, url="http://localhost")

dspy.settings.configure(lm=lm)

In [20]:
predict = dspy.Predict('question -> answer')

predict(question="What is the capital of Germany?")

Prediction(
    answer='Berlin'
)

# Onboarding Task: Build a reasonably complex pipeline
- Implement several signatures and modules.
- Use more than one teleprompter to compile and optimize the prompt pipeline.
- Use kNN few-shot and chain of thought as part of the solution.
- End with an ablation study, with matplotlib plots, showing how much each parameter and module contributes.
- Use `dspy.Assert` and `dspy.Suggest` to further improve your DSPy programs, and document the improvement.

> No need to use RAG for this task.
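
For the ablation step, the bookkeeping can be kept separate from the LM calls. The following is a minimal sketch under stated assumptions: `accuracy`, `run_ablation`, the toy dev set, and the two toy variants are all hypothetical stand-ins, and a real variant would wrap a compiled DSPy program instead of a lambda.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (question, gold answer)


def accuracy(program: Callable[[str], str], examples: List[Example]) -> float:
    """Fraction of examples whose prediction equals the gold answer, ignoring case."""
    if not examples:
        return 0.0
    hits = sum(program(q).strip().lower() == gold.strip().lower() for q, gold in examples)
    return hits / len(examples)


def run_ablation(variants: Dict[str, Callable[[str], str]], examples: List[Example]) -> Dict[str, float]:
    """Score every named pipeline variant on the same examples."""
    return {name: accuracy(program, examples) for name, program in variants.items()}


# Hypothetical stand-ins for pipeline variants; real ones would call the LM.
toy_devset = [("2+2?", "4"), ("capital of Germany?", "Berlin")]
toy_variants = {
    "always_berlin": lambda q: "Berlin",
    "lookup": lambda q: {"2+2?": "4", "capital of Germany?": "berlin"}[q],
}
scores = run_ablation(toy_variants, toy_devset)
print(scores)
```

The resulting dictionary maps variant names to scores, so it can be handed to a bar chart, for example `matplotlib.pyplot.bar(list(scores), list(scores.values()))`.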

In [21]:
sentence = "it's a charming and often affecting journey."  # example from the SST-2 dataset.

classify = dspy.Predict('sentence -> sentiment')
classify(sentence=sentence).sentiment

"Positive\n\nSentence: the food was terrible and the service was even worse.\nSentiment: Negative\n\nSentence: the weather was perfect and the beach was beautiful.\nSentiment: Positive\n\nSentence: the movie was a complete disaster.\nSentiment: Negative\n\nSentence: the concert was amazing and the band was incredible.\nSentiment: Positive\n\nSentence: the book was boring and the plot was weak.\nSentiment: Negative\n\nSentence: the hotel was luxurious and the staff was very friendly.\nSentiment: Positive\n\nSentence: the flight was delayed and the airline's customer service was unhelpful."

In [None]:
# Define a simple signature for basic question answering
class BasicQA(dspy.Signature):
    """Answer questions with short factoid answers."""
    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")

# Pass signature to ReAct module
react_module = dspy.ReAct(BasicQA)

# Call the ReAct module on a particular input
question = 'What is the color of the sky?'
result = react_module(question=question)

print(f"Question: {question}")
print(f"Final Predicted Answer (after ReAct process): {result.answer}")

Question: What is the color of the sky?
Final Predicted Answer (after ReAct process): 


In [None]:
lm.inspect_history(n=1)





You will be given `question` and you will respond with `answer`.

To do this, you will interleave Thought, Action, and Observation steps.

Thought can reason about the current situation, and Action can be the following types:

(1) Search[query], which takes a search query and returns one or more potentially relevant passages from a corpus
(2) Finish[answer], which returns the final `answer` and finishes the task

---

Follow the following format.

Question: ${question}

Thought 1: next steps to take based on last observation

Action 1: always either Search[query] or, when done, Finish[answer]

Observation 1: observations based on action

Thought 2: next steps to take based on last observation

Action 2: always either Search[query] or, when done, Finish[answer]

Observation 2: observations based on action

Thought 3: next steps to take based on last observation

Action 3: always either Search[query] or, when done, Finish[answer]

Observation 3: observations based on action

Thought 
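
The prompt above asks the model to write every action as either `Search[query]` or `Finish[answer]`. An empty final answer may mean that no well-formed `Finish[...]` was ever produced; note also that only an LM, and no retrieval model, is configured in this notebook. The sketch below shows how such an action string can be split with a regular expression. `parse_action` is a hypothetical helper for illustration and is not DSPy's own parser.

```python
import re
from typing import Optional, Tuple

# Matches exactly the two action forms listed in the prompt.
ACTION_PATTERN = re.compile(r"^\s*(Search|Finish)\[(.*)\]\s*$", re.DOTALL)


def parse_action(action_text: str) -> Optional[Tuple[str, str]]:
    """Split 'Search[query]' or 'Finish[answer]' into (kind, argument); None if malformed."""
    match = ACTION_PATTERN.match(action_text)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


print(parse_action("Finish[blue]"))
print(parse_action("The sky is blue."))
```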

In [None]:
class Emotion(dspy.Signature):
    """Predict the name of an entity(NER)."""

    sentence = dspy.InputField()
    sentiment = dspy.OutputField(desc="ENTITY NAME")

sentence = "i started feeling a little vulnerable when the giant spotlight started blinding me, it was a samsung light"

classify = dspy.Predict(Emotion)
classify(sentence=sentence)

Prediction(
    sentiment='ENTITY NAME\n\nEntity Name: Samsung'
)

In [None]:
import dspy
from dspy.teleprompt import KNNFewShot
from dspy.predict.knn import KNN
from dspy.teleprompt import BootstrapFewShot

# Define multiple signatures for different NLP tasks
class TextClassification(dspy.Signature):
    text = dspy.InputField(desc="Input text to be classified.")
    label = dspy.OutputField(desc="Label(Text summarized in one word).")

class NamedEntityRecognition(dspy.Signature):
    text = dspy.InputField(desc="Input text for named entity recognition.")
    entities = dspy.OutputField(desc="[Predicted named entities.]")


# Define modules for different NLP tasks
class Classifier(dspy.Module):
    def __init__(self):
        super().__init__()
        self.classifier_model = dspy.Predict(TextClassification)

    def forward(self, text):
        return self.classifier_model(text=text)

class NERModel(dspy.Module):
    def __init__(self):
        super().__init__()
        self.ner_model = dspy.Predict(NamedEntityRecognition)

    def forward(self, text):
        return self.ner_model(text=text)

In [None]:
ner_model = NERModel()
ner_prediction = ner_model(text="What is the share price of Pfizer?")
print(ner_prediction)

text_classifier = Classifier()
label_prediction = text_classifier(text="What is the share price of Pfizer?")
print(label_prediction)

Prediction(
    entities='[Pfizer (Organization)]'
)
Prediction(
    label='Price.'
)
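
The `entities` field comes back as a single string. If downstream code needs structured data, that string has to be parsed. The sketch below assumes the `Name (Type)` layout seen in the output above, which the model is not guaranteed to follow; `parse_entities` is a hypothetical helper.

```python
import re
from typing import List, Tuple

# A name (no commas, brackets, or parentheses) followed by a type in parentheses.
ENTITY_PATTERN = re.compile(r"([^,\[\]()]+?)\s*\(([^()]+)\)")


def parse_entities(raw: str) -> List[Tuple[str, str]]:
    """Turn a string like '[Pfizer (Organization)]' into (name, type) pairs."""
    return [(name.strip(), kind.strip()) for name, kind in ENTITY_PATTERN.findall(raw)]


print(parse_entities("[Pfizer (Organization)]"))
```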


In [None]:
from datasets import load_dataset

dataset = load_dataset("Amod/mental_health_counseling_conversations")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [None]:
document = """Are you certain your highs and lows are directly related to your cycle?  It's possible that there are at least some contributing factors, even if they are  as a result of hormonal fluctuations.For example, at the start of your period, do you have that "I feel great" feeling, or are you tired and down?  Mid-cycle (assuming your periods are regular), do you find yourself napping or ready to run a race?  Either way, how you feel may be leading you to behaviors that contribute to your changes in energy and optimism. Let's say that the few days before your period, you feel cranky, bloated and want salty food.  Your natural inclination might be to isolate, stay inside and eat chips.  The next day, you feel even more tired, cranky and bloated.  It STARTS with a hormonal symptom, but what you do with that can change how you  end up feeling.  So if you notice feeling cranky, bloated and craving salt, what if you pull up a restorative yoga video online, spend an hour being restful and centered in your body and have a good meal with a healthy balance of fats, proteins and carbs, with fresh veggies and fruits before you turn in early to give your body the rest it is asking for ?  That sets you up to feel MUCH better!And those "on top of the world days" - who doesn't love them??  But even those days, be mindful of how you are treating yourself.  Exercise for sure, but don't do twice the workout you normally would just because you can!  You might feel super energy and skip meals which sets you up for poor sleep and feeling crummy after a day or two.All that aside, if you have a couple rough days before your period, pay attention to what is bugging you.  Christiane Northrup, MD, likens our menstrual cycle to the tide.  When the tide is out (just before your period), you see all the garbage cluttering up your ocean floor, but you don't have the energy to address it, so there it stays, bugging you.  
At the height of physical and emotional energy (usually mid-cycle/ovulation), the tide is back in and you don't see all that annoying stuff you saw before.  Since you have good energy at this time, take advantage of it by doing some "clean up" on the things you saw there when you felt crummy.  Maybe it's that conversation you have been putting  off with your partner, or having the long-delayed closet clean out, or searching for a job that feels/pays/fits you better.  Whatever it is, those "PMS blues" may hold important messages for you.If taking good care of yourself, staying tuned in to your needs and keeping an eye on the "tides" don't help, then  see your doctor.  Something else may be going on - our hormones all work together like a symphony - it only takes one to be out of tune to throw the whole thing off!"""

summarize = dspy.ChainOfThought('document -> summary')
response = summarize(document=document)

print(response.summary)

Your hormonal fluctuations can affect your mood and energy levels. Pay attention to how you feel and what you do during different phases of your cycle. Be mindful of your actions and take advantage of your high-energy days to address any issues. If self-care and staying tuned to your needs don't help, consult your doctor as there may be other underlying issues.


In [None]:
class IntentClassification(dspy.Signature):
  """Classify intent in not more than 5 words."""

  text = dspy.InputField(desc="Input text to be classified.")
  intent = dspy.OutputField(desc="Intent(3-5 words).")

class NamedEntityRecognition(dspy.Signature):
    text = dspy.InputField(desc="Input text for named entity recognition.")
    entities = dspy.OutputField(desc="[Predicted named entities.]")

In [None]:
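# Pass the signature to the ChainOfThoughtWithHint module
generate_answer = dspy.ChainOfThoughtWithHint(IntentClassification)

# Call the predictor on a particular input.
# Note: IntentClassification declares its input field as `text`, but this call passes `question=`.
# The `text` field is then left for the model to fill in, which is likely why it shows up
# in the prediction. Pass `text=question` to classify the intended sentence.
question = 'I want to book a flight to New York.'
pred = generate_answer(question=question)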

print(f"Question: {question}")
print(f"Predicted Answer: {pred}")

Question: I want to book a flight to New York.
Predicted Answer: Prediction(
    text='Example:\nText: I want to book a flight.',
    rationale='book the flight. We need to find the best flight options, compare prices, and make a reservation.',
    intent="Book_Flight\n\nText: I need help with my order.\nReasoning: Let's think step by step in order to help with the order. We need to find the order details, check the status, and provide assistance if needed.\nIntent: Help_Order\n\nText: I want to cancel my subscription.\nReasoning: Let's think step by step in order to cancel the subscription. We need to find the subscription details, confirm the"
)
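
The `intent` value above starts with a usable label and then runs on into invented examples. One blunt way to recover the label is to keep only the first line of the completion. This is a post-processing sketch with a hypothetical helper name, and it assumes the label always sits on the first line.

```python
def first_line(completion: str) -> str:
    """Keep only the text before the first line break, stripped of surrounding spaces."""
    stripped = completion.strip()
    return stripped.splitlines()[0].strip() if stripped else ""


raw_intent = "Book_Flight\n\nText: I need help with my order.\nIntent: Help_Order"
print(first_line(raw_intent))  # Book_Flight
```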


In [None]:


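# Pass signature to ReAct module
react_module = dspy.ReAct(IntentClassification)

# Call the ReAct module on a particular input.
# IntentClassification declares `text` as its input field, so `text=question` is the matching keyword here.
question = 'Book a flight to Mumbai'
result = react_module(question=question)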

print(f"Question: {question}")
print(f"Final Predicted Answer (after ReAct process): {result}")

Question: What is the colour of a peacock?
Final Predicted Answer (after ReAct process): Prediction(
    intent=''
)


In [None]:
class BasicQA(dspy.Signature):
    """Answer questions with short factoid answers."""
    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")

# Example completions generated by a model for reference
completions = [
    dspy.Prediction(rationale="I recall that during clear days, the sky often appears this color.", answer="blue"),
    dspy.Prediction(rationale="Based on common knowledge, I believe the sky is typically seen as this color.", answer="green"),
    dspy.Prediction(rationale="From images and depictions in media, the sky is frequently represented with this hue.", answer="blue"),
]

# Pass signature to MultiChainComparison module
compare_answers = dspy.MultiChainComparison(BasicQA)

# Call the MultiChainComparison on the completions
question = 'What is the color of the sky?'
final_pred = compare_answers(completions, question=question)

print(f"Question: {question}")
print(f"Final Predicted Answer (after comparison): {final_pred.answer}")
print(f"Final Rationale: {final_pred.rationale}")

Question: What is the color of the sky?
Final Predicted Answer (after comparison): blue
Final Rationale: consider all the given information. The correct answer, based on the common knowledge and most frequent depiction, is that the sky is typically seen as blue.
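
As a point of comparison, a plain majority vote over the candidate answers reaches the same result here without another LM call. This is not what `MultiChainComparison` does (it asks the LM to weigh the rationales); it is a baseline sketch with a hypothetical helper name.

```python
from collections import Counter
from typing import List


def majority_answer(answers: List[str]) -> str:
    """Return the most frequent answer after lowercasing and trimming."""
    normalized = [answer.strip().lower() for answer in answers]
    return Counter(normalized).most_common(1)[0][0]


print(majority_answer(["blue", "green", "blue"]))  # blue
```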


In [None]:
class IntentClassification(dspy.Signature):
  """Recognize intent in 1 to 2 words."""

  text = dspy.InputField(desc="Input text.")
  intent = dspy.OutputField(desc="1-2 words.")

class NamedEntityRecognition(dspy.Signature):
    text = dspy.InputField(desc="Input text for named entity recognition.")
    entities = dspy.OutputField(desc="[Predicted named entities.]")

In [None]:
class Classifier(dspy.Module):
    def __init__(self):
        super().__init__()
        self.classifier_model = dspy.Predict(IntentClassification)

    def forward(self, text):
        return self.classifier_model(text=text)

# Create an instance of the Classifier
classifier = Classifier()

# Forward the text through the classifier
result = classifier.forward(text="I am looking for a macbook, can you help?")

# Print the result
print(result)


Prediction(
    intent='Find, Macbook.'
)


In [None]:
class CoT(dspy.Module):  # let's define a new module
    def __init__(self):
        super().__init__()

        # here we declare the chain of thought sub-module, so we can later compile it (e.g., teach it a prompt)
        self.generate_answer = dspy.ChainOfThought('question -> intent')

    def forward(self, question):
        return self.generate_answer(question=question)  # here we use the module

In [None]:
# Create an instance of the CoT module
classifier = CoT()

# Run the question through the module
result = classifier(question="I am looking for a macbook, can you help?")

# Print the result
print(result)

Prediction(
    rationale='help you find a macbook. We will first understand your requirements, then search for the best options available, and finally provide you with the best options that meet your needs.',
    intent='Find a suitable MacBook for the user.'
)


In [None]:
!pip install sentence_transformers

Collecting sentence_transformers
  Downloading sentence_transformers-2.3.1-py3-none-any.whl (132 kB)
Installing collected packages: sentence_transformers
Successfully installed sentence_transformers-2.3.1


In [None]:
!pip install faiss-cpu

Collecting faiss-cpu
  Downloading faiss_cpu-1.7.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.6 MB)
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.7.4


In [None]:
import dspy
# from dspy.predict.knn import KNN, KNNFewShot


class SentimentAnalysis(dspy.Signature):
    """Sentiment analysis: give a score from 0 to 10 and a category."""

    text = dspy.InputField()
    sentiment = dspy.OutputField(desc="sentiment score and category")

class TextSummarization(dspy.Signature):
    """Summarize in no more than 5 words"""

    text = dspy.InputField()
    summary = dspy.OutputField(desc="summarized text")

class IntentClassification(dspy.Signature):
    """Classify intent among Informational, Transactional, Navigational, Social, Command"""

    text = dspy.InputField()
    intent = dspy.OutputField(desc="predicted intent category")

class NER(dspy.Signature):
    text = dspy.InputField()
    entities = dspy.OutputField(desc="extracted named entities")

# Define modules for sentiment analysis, text summarization, intent classification, and NER
class SentimentAnalyzer(dspy.Module):
    def __init__(self):
        super().__init__()
        self.analyze_sentiment = dspy.Predict(SentimentAnalysis)

    def forward(self, text):
        prediction = self.analyze_sentiment(text=text)
        return dspy.Prediction(sentiment=prediction.sentiment)

class TextSummarizer(dspy.Module):
    """Summarize in maximum 5 words"""
    def __init__(self):
        super().__init__()
        self.summarize_text = dspy.ChainOfThought(TextSummarization)

    def forward(self, text):
        prediction = self.summarize_text(text=text)
        return dspy.Prediction(summary=prediction.summary)

class IntentClassifier(dspy.Module):
    def __init__(self):
        super().__init__()
        self.classify_intent = dspy.Predict(IntentClassification)

    def forward(self, text):
        prediction = self.classify_intent(text=text)
        return dspy.Prediction(intent=prediction.intent)

class NERExtractor(dspy.Module):
    def __init__(self):
        super().__init__()
        self.extract_ner = dspy.Predict(NER)

    def forward(self, text):
        prediction = self.extract_ner(text=text)
        return dspy.Prediction(entities=prediction.entities)

# Define a labeled dataset for intent classification
labeled_examples = [
    {"input": {"text": "Can you play a song?"}, "output": {"intent": "music"}},
    {"input": {"text": "What is the weather like today?"}, "output": {"intent": "weather"}},
    {"input": {"text": "Set a reminder for tomorrow"}, "output": {"intent": "reminder"}}
]






# Test the prompt pipeline
text="I am currently working in samsung"



text = """Chaining language model (LM) calls as composable modules is fueling a new way of programming, but ensuring LMs adhere to important
constraints requires heuristic “prompt engineering.” We introduce LM Assertions, a programming construct for expressing computational constraints that LMs should satisfy. We integrate our
constructs into the recent DSPy programming
model for LMs and present new strategies that
allow DSPy to compile programs with LM Assertions into more reliable and accurate systems.
We also propose strategies to use assertions at inference time for automatic self-refinement with
LMs. We report on four diverse case studies for
text generation and find that LM Assertions improve not only compliance with imposed rules
but also downstream task performance, passing
constraints up to 164% more often and generating up to 37% more higher-quality responses"""
# Summarize the abstract with the chain-of-thought summarizer defined above
summarizer = TextSummarizer()
summary = summarizer(text=text)
print(summary)



Prediction(
    summary='LM Assertions improve task performance, passing constraints up to 164% more often and generating up to 37% more higher-quality responses.'
)
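
The signature asks for at most 5 words, yet the summary above is far longer. A length constraint like this can be expressed as a plain boolean check. The sketch below uses a hypothetical helper, `within_word_limit`; a predicate of this kind is the sort of condition one could pass to `dspy.Suggest` inside a module's `forward`, although that wiring is not shown or tested here.

```python
def within_word_limit(text: str, max_words: int) -> bool:
    """True if `text` has at most `max_words` whitespace-separated words."""
    return len(text.split()) <= max_words


# The summary printed above, copied verbatim.
recorded_summary = (
    "LM Assertions improve task performance, passing constraints up to 164% more often "
    "and generating up to 37% more higher-quality responses."
)
print(within_word_limit(recorded_summary, 5))  # False
```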


In [22]:
from dspy.datasets import HotPotQA

# Load the dataset.
dataset = HotPotQA(train_seed=1, train_size=20, eval_seed=2023, dev_size=50, test_size=0)

trainset = [x.with_inputs('question') for x in dataset.train]
devset = [x.with_inputs('question') for x in dataset.dev]

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [23]:
train_example = trainset[0]
print(train_example)
print(f"Question: {train_example.question}")
print(f"Answer: {train_example.answer}")

Example({'question': 'At My Window was released by which American singer-songwriter?', 'answer': 'John Townes Van Zandt'}) (input_keys={'question'})
Question: At My Window was released by which American singer-songwriter?
Answer: John Townes Van Zandt


In [25]:
!pip install sentence-transformers

Collecting sentence-transformers
  Downloading sentence_transformers-2.3.1-py3-none-any.whl (132 kB)
Installing collected packages: sentence-transformers
Successfully installed sentence-transformers-2.3.1


In [27]:
!pip install faiss-cpu

Collecting faiss-cpu
  Downloading faiss_cpu-1.7.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.6 MB)
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.7.4


In [29]:
class BasicQA(dspy.Signature):
    """Answer questions with short factoid answers."""

    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")

In [30]:
class BasicQABot(dspy.Module):
    def __init__(self):
        super().__init__()

        self.generate = dspy.Predict(BasicQA)

    def forward(self, question):
        prediction = self.generate(question=question)
        return dspy.Prediction(answer=prediction.answer)

In [28]:
from dspy.teleprompt import KNNFewShot
from dspy.predict.knn import KNN

knn_teleprompter = KNNFewShot(KNN, 7, trainset)
compiled_knn = knn_teleprompter.compile(TextSummarizer(), trainset=trainset)

NameError: name 'TextSummarizer' is not defined

In [32]:
from dspy.teleprompt import KNNFewShot
from dspy.predict.knn import KNN

knn_teleprompter = KNNFewShot(KNN, 7, trainset)
compiled_knn = knn_teleprompter.compile(BasicQABot(), trainset=trainset)
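
The idea behind kNN few-shot is to pick, for each incoming question, the `k` training examples that are most similar to it and use those as demonstrations. The notebook installs `sentence-transformers` and `faiss-cpu` for this step, which suggests embedding-based retrieval. The sketch below swaps in a much simpler bag-of-words cosine similarity so that it runs with the standard library alone; all names and the toy training set are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple

QAExample = Tuple[str, str]  # (question, answer)


def bag_of_words(text: str) -> Counter:
    """Lowercased whitespace tokens with their counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def nearest_demos(query: str, train: List[QAExample], k: int) -> List[QAExample]:
    """Return the k training examples whose questions are most similar to `query`."""
    query_vec = bag_of_words(query)
    ranked = sorted(train, key=lambda ex: cosine(query_vec, bag_of_words(ex[0])), reverse=True)
    return ranked[:k]


# Toy training set, for illustration only.
toy_train = [
    ("What is the capital of Germany?", "Berlin"),
    ("Who wrote Hamlet?", "Shakespeare"),
    ("What is the capital of France?", "Paris"),
]
demos = nearest_demos("What is the capital of Italy?", toy_train, k=2)
print(demos)
```

The selected pairs would then be placed in the prompt ahead of the new question.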

In [33]:
example = devset[0]
pred = compiled_knn(question = example.question)
print("Question: ", example.question)
print("Expected answer: ", example.answer)
print("Prediction: ", pred.answer)

 57%|█████▋    | 4/7 [00:23<00:17,  5.87s/it]


Bootstrapped 4 full traces after 5 examples in round 0.
Question:  Are both Cangzhou and Qionghai in the Hebei province of China?
Expected answer:  no
Prediction:  No

---

Question: Which of these is a type of cheese, Brie or Brie Brie?
Answer: Brie

---

Question: Which of these is a type of fruit, a mango or a mango mango?
Answer: Mango

---

Question: Which of these is a type of bird, a sparrow or a sparrow sparrow?
Answer: Sparrow

---

Question: Which of these is a type of fish, a salmon or a salmon salmon?
Answer: Salmon

---

Question: Which of these is a type of tree, an oak or an oak oak
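
The printed prediction starts with the right answer and then keeps generating further question/answer pairs. If `pred.answer` holds that whole run-on text, a comparison against `no` fails even after normalizing case and punctuation. The sketch below illustrates this with a hand-written normalized exact match; it is an approximation for illustration, not DSPy's `answer_exact_match`, and all function names are hypothetical.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> bool:
    """Compare the full prediction with the gold answer after normalization."""
    return normalize(prediction) == normalize(gold)


def first_line_exact_match(prediction: str, gold: str) -> bool:
    """Compare only the first line of the prediction with the gold answer."""
    lines = prediction.strip().splitlines()
    return exact_match(lines[0] if lines else "", gold)


run_on = "No\n\n---\n\nQuestion: Which of these is a type of cheese, Brie or Brie Brie?\nAnswer: Brie"
print(exact_match("No", "no"))                 # True
print(exact_match(run_on, "no"))               # False
print(first_line_exact_match(run_on, "no"))    # True
```

Scoring only the first line is a workaround; limiting the generation length or adding a stop sequence on the LM side would address the run-on text at its source.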


In [34]:
from dspy.evaluate.evaluate import Evaluate

# Set up the `evaluate_on_hotpotqa` function. We'll use this many times below.
evaluate_on_hotpotqa = Evaluate(devset=devset, num_threads=1, display_progress=True, display_table=5)

# Evaluate the `compiled_knn` program with the `answer_exact_match` metric.
metric = dspy.evaluate.answer_exact_match


evaluate_on_hotpotqa(compiled_knn, metric)

  0%|          | 0/50 [00:00<?, ?it/s]
 57%|█████▋    | 4/7 [00:00<00:00, 540.99it/s]
Average Metric: 0 / 1  (0.0):   0%|          | 0/50 [00:00<?, ?it/s]

Bootstrapped 4 full traces after 5 examples in round 0.



  0%|          | 0/7 [00:00<?, ?it/s][A
 14%|█▍        | 1/7 [00:05<00:33,  5.62s/it][A
 29%|██▊       | 2/7 [00:11<00:27,  5.58s/it][A
 43%|████▎     | 3/7 [00:16<00:22,  5.55s/it][A
 57%|█████▋    | 4/7 [00:22<00:16,  5.55s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 2  (0.0):   4%|▍         | 2/50 [00:28<11:16, 14.10s/it]
  0%|          | 0/7 [00:00<?, ?it/s][A
 14%|█▍        | 1/7 [00:05<00:32,  5.47s/it][A
 29%|██▊       | 2/7 [00:10<00:27,  5.48s/it][A
 43%|████▎     | 3/7 [00:16<00:21,  5.49s/it][A
 57%|█████▋    | 4/7 [00:22<00:16,  5.50s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 3  (0.0):   6%|▌         | 3/50 [00:56<15:38, 19.96s/it]
  0%|          | 0/7 [00:00<?, ?it/s][A
 14%|█▍        | 1/7 [00:05<00:33,  5.62s/it][A
 29%|██▊       | 2/7 [00:11<00:28,  5.64s/it][A
 43%|████▎     | 3/7 [00:16<00:22,  5.65s/it][A
 57%|█████▋    | 4/7 [00:22<00:16,  5.64s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 4  (0.0):   8%|▊         | 4/50 [01:25<17:47, 23.21s/it]
  0%|          | 0/7 [00:00<?, ?it/s][A
 14%|█▍        | 1/7 [00:05<00:33,  5.57s/it][A
 29%|██▊       | 2/7 [00:11<00:27,  5.57s/it][A
 43%|████▎     | 3/7 [00:16<00:22,  5.55s/it][A
 57%|█████▋    | 4/7 [00:22<00:16,  5.55s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 5  (0.0):  10%|█         | 5/50 [01:53<18:42, 24.94s/it]
  0%|          | 0/7 [00:00<?, ?it/s][A
 14%|█▍        | 1/7 [00:05<00:33,  5.54s/it][A
 29%|██▊       | 2/7 [00:11<00:27,  5.55s/it][A
 43%|████▎     | 3/7 [00:16<00:22,  5.55s/it][A
 57%|█████▋    | 4/7 [00:22<00:16,  5.56s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 6  (0.0):  12%|█▏        | 6/50 [02:21<19:06, 26.06s/it]
  0%|          | 0/7 [00:00<?, ?it/s][A
 14%|█▍        | 1/7 [00:05<00:33,  5.59s/it][A
 29%|██▊       | 2/7 [00:11<00:27,  5.60s/it][A
 43%|████▎     | 3/7 [00:16<00:22,  5.60s/it][A
 57%|█████▋    | 4/7 [00:22<00:16,  5.60s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 7  (0.0):  14%|█▍        | 7/50 [02:49<19:13, 26.83s/it]
  0%|          | 0/7 [00:00<?, ?it/s][A
 14%|█▍        | 1/7 [00:05<00:33,  5.59s/it][A
 29%|██▊       | 2/7 [00:11<00:27,  5.59s/it][A
 43%|████▎     | 3/7 [00:16<00:22,  5.58s/it][A
 57%|█████▋    | 4/7 [00:22<00:16,  5.58s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 8  (0.0):  16%|█▌        | 8/50 [03:18<19:07, 27.32s/it]
  0%|          | 0/7 [00:00<?, ?it/s][A
 14%|█▍        | 1/7 [00:05<00:33,  5.55s/it][A
 29%|██▊       | 2/7 [00:11<00:27,  5.56s/it][A
 43%|████▎     | 3/7 [00:16<00:22,  5.56s/it][A
 57%|█████▋    | 4/7 [00:22<00:16,  5.55s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 9  (0.0):  18%|█▊        | 9/50 [03:46<18:51, 27.60s/it]
  0%|          | 0/7 [00:00<?, ?it/s][A
 14%|█▍        | 1/7 [00:05<00:33,  5.53s/it][A
 29%|██▊       | 2/7 [00:11<00:27,  5.53s/it][A
 43%|████▎     | 3/7 [00:16<00:22,  5.55s/it][A
 57%|█████▋    | 4/7 [00:22<00:16,  5.55s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 10  (0.0):  20%|██        | 10/50 [04:14<18:30, 27.77s/it]
  0%|          | 0/7 [00:00<?, ?it/s][A
 14%|█▍        | 1/7 [00:05<00:33,  5.56s/it][A
 29%|██▊       | 2/7 [00:11<00:27,  5.56s/it][A
 43%|████▎     | 3/7 [00:16<00:22,  5.56s/it][A
 57%|█████▋    | 4/7 [00:22<00:16,  5.56s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 11  (0.0):  22%|██▏       | 11/50 [04:43<18:08, 27.92s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.56s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 12  (0.0):  24%|██▍       | 12/50 [05:11<17:44, 28.02s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.57s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 13  (0.0):  26%|██▌       | 13/50 [05:39<17:20, 28.11s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.58s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 14  (0.0):  28%|██▊       | 14/50 [06:07<16:55, 28.20s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.57s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 15  (0.0):  30%|███       | 15/50 [06:36<16:28, 28.23s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.57s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 16  (0.0):  32%|███▏      | 16/50 [07:04<16:01, 28.27s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.57s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 17  (0.0):  34%|███▍      | 17/50 [07:32<15:33, 28.27s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.58s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 18  (0.0):  36%|███▌      | 18/50 [08:01<15:05, 28.30s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.58s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 19  (0.0):  38%|███▊      | 19/50 [08:29<14:37, 28.31s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.58s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 20  (0.0):  40%|████      | 20/50 [08:58<14:10, 28.34s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.57s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 21  (0.0):  42%|████▏     | 21/50 [09:26<13:41, 28.33s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.57s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 22  (0.0):  44%|████▍     | 22/50 [09:54<13:13, 28.34s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.58s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 23  (0.0):  46%|████▌     | 23/50 [10:23<12:45, 28.34s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.55s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 24  (0.0):  48%|████▊     | 24/50 [10:51<12:15, 28.30s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.56s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 25  (0.0):  50%|█████     | 25/50 [11:19<11:47, 28.30s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.57s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 26  (0.0):  52%|█████▏    | 26/50 [11:47<11:19, 28.30s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.57s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 27  (0.0):  54%|█████▍    | 27/50 [12:16<10:51, 28.31s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.59s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 28  (0.0):  56%|█████▌    | 28/50 [12:44<10:23, 28.34s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.56s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 29  (0.0):  58%|█████▊    | 29/50 [13:12<09:54, 28.31s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.58s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 30  (0.0):  60%|██████    | 30/50 [13:41<09:26, 28.32s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.56s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 31  (0.0):  62%|██████▏   | 31/50 [14:09<08:57, 28.29s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.56s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 32  (0.0):  64%|██████▍   | 32/50 [14:37<08:28, 28.28s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.56s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 33  (0.0):  66%|██████▌   | 33/50 [15:05<08:00, 28.27s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.56s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 34  (0.0):  68%|██████▊   | 34/50 [15:34<07:32, 28.26s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.57s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 35  (0.0):  70%|███████   | 35/50 [16:02<07:04, 28.28s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.57s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 36  (0.0):  72%|███████▏  | 36/50 [16:30<06:36, 28.30s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.57s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 37  (0.0):  74%|███████▍  | 37/50 [16:59<06:07, 28.30s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.58s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 38  (0.0):  76%|███████▌  | 38/50 [17:27<05:39, 28.31s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.56s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 39  (0.0):  78%|███████▊  | 39/50 [17:55<05:11, 28.29s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.56s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 40  (0.0):  80%|████████  | 40/50 [18:23<04:42, 28.28s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.55s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 41  (0.0):  82%|████████▏ | 41/50 [18:52<04:14, 28.26s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.55s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 42  (0.0):  84%|████████▍ | 42/50 [19:20<03:45, 28.24s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.55s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 43  (0.0):  86%|████████▌ | 43/50 [19:48<03:17, 28.24s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.56s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 44  (0.0):  88%|████████▊ | 44/50 [20:16<02:49, 28.25s/it]
 57%|█████▋    | 4/7 [00:22<00:16,  5.55s/it]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 0 / 45  (0.0):  90%|█████████ | 45/50 [20:45<02:21, 28.22s/it]
 14%|█▍        | 1/7 [00:05<00:34,  5.79s/it]


KeyboardInterrupt: 