## Welcome to the Notebook! 🥳

In this notebook, you will learn how to optimize your DSPy program using COPRO. We will compile `Command-R` to improve the performance of `Command-R`!

A few requirements:
1. You'll need a running Weaviate instance
    1. You can create a 14-day free cluster on [WCS](https://console.weaviate.cloud/)
    1. Or run Weaviate locally (use the `yaml` file in this folder)
1. Generate a Coehre API key
1. Installations
    1. weaviate-client
    1. dspy-ai
1. Load your Weaviate cluster with data
    1. If you want to use the Weaviate blogs as the dataset, refer to the `Weaviate-Import.ipynb` file in this folder.

In [11]:
import logging
# Disable logs with severity levelINFO and below
logging.getLogger().setLevel(logging.WARNING)

In [9]:
import dspy
from dspy.retrieve.weaviate_rm import WeaviateRM
import weaviate

command_r = dspy.Cohere(model="command-r", max_tokens=2000, api_key="ai-key")

weaviate_client = weaviate.connect_to_wcs(cluster_url ="wcs-url", 
                                  auth_credentials=weaviate.auth.AuthApiKey("wcs-auth-key"))
retriever_model = WeaviateRM("WeaviateBlogChunk", weaviate_client=weaviate_client)
dspy.settings.configure(lm=command_r, rm=retriever_model)

HTTP Request: GET https://hkwrfqgurmse7pygkia1gw.c0.us-east1.gcp.weaviate.cloud/v1/meta "HTTP/1.1 200 OK"
HTTP Request: GET https://pypi.org/pypi/weaviate-client/json "HTTP/1.1 200 OK"


In [3]:
# Phoenix Setup
import phoenix as px
phoenix_session = px.launch_app()

Dataset: phoenix_dataset_493991eb-71fa-43af-8077-16875e97c18a initialized


  warn(


🌍 To view the Phoenix app in your browser, visit http://localhost:6006/
📺 To view the Phoenix app in a notebook, run `px.active_session().view()`
📖 For more information on how to use Phoenix, check out https://docs.arize.com/phoenix


In [4]:
from openinference.instrumentation.dspy import DSPyInstrumentor
from opentelemetry import trace as trace_api
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
endpoint = "http://127.0.0.1:6006/v1/traces"
resource = Resource(attributes={})
tracer_provider = trace_sdk.TracerProvider(resource=resource)
span_otlp_exporter = OTLPSpanExporter(endpoint=endpoint)
tracer_provider.add_span_processor(SimpleSpanProcessor(span_exporter=span_otlp_exporter))
trace_api.set_tracer_provider(tracer_provider=tracer_provider)
DSPyInstrumentor().instrument()

  from pkg_resources import (


In [10]:
command_r("say hello")

HTTP Request: POST https://api.cohere.ai/v1/chat "HTTP/1.1 200 OK"


["Hello! How's it going? I hope you're having a fantastic day! 😊"]

In [12]:
import json

# Assuming 'dataset.json' is in the same directory as this script
file_path = './WeaviateBlogRAG-0-0-0.json'

# Read the dataset from 'dataset.json'
with open(file_path, 'r') as file:
    dataset = json.load(file)

# Initialize empty lists for gold_answers and queries
gold_answers = []
queries = []

# Parse the gold_answers and queries
for row in dataset:
    gold_answers.append(row["gold_answer"])
    queries.append(row["query"])
    

data = []

for i in range(len(gold_answers)):
    data.append(dspy.Example(gold_answer=gold_answers[i], question=queries[i]).with_inputs("question"))

trainset = data[:30]
devset = data[30:35] # Small Devset
testset = data[35:]

In [13]:
class Evaluator(dspy.Signature):
    """Evaluate the quality of a system's answer to a question according to a given criterion."""
    
    context = dspy.InputField(desc="The context for answering the question.")
    criterion = dspy.InputField(desc="The evaluation criterion.")
    question = dspy.InputField(desc="The question asked to the system.")
    ground_truth_answer = dspy.InputField(desc="An expert written Ground Truth Answer to the question.")
    predicted_answer = dspy.InputField(desc="The system's answer to the question.")
    rating = dspy.OutputField(desc="A rating between 1 and 5. IMPORTANT!! Only output the rating as an `int` and nothing else.")

class RatingParser(dspy.Signature):
    """Extract the FLOAT valued rating from a string."""
    
    raw_rating_response = dspy.InputField(desc="The string that contains the rating in it.")
    rating = dspy.OutputField(desc="A FLOAT valued rating.")
    
class Summarizer(dspy.Signature):
    """Summarize the information provided in the search results in 5 sentences."""
    
    question = dspy.InputField(desc="a question to a search engine")
    context = dspy.InputField(desc="context filtered as relevant to the query by a search engine")
    summary = dspy.OutputField(desc="a 5 sentence summary of information in the context that would help answer the question.")

class RAGMetricProgram(dspy.Module):
    def __init__(self):
        self.evaluator = dspy.Predict(Evaluator)
        self.rating_parser = dspy.Predict(RatingParser)
        self.summarizer = dspy.Predict(Summarizer)
    
    def forward(self, gold, pred, trace=None):
        # Todo add trace to interface with teleprompters
        predicted_answer = pred.answer
        question = gold.question
        ground_truth_answer = gold.gold_answer
        
        detail = "Is the assessed answer detailed?"
        faithful = "Is the assessed answer factually supported by the context?"
        ground_truth = f"The Ground Answer Truth to the Question: {question} is given as: \n \n {ground_truth_answer} \n \n How aligned is this Predicted Answer? {predicted_answer}"
        
        # Judgement
        with dspy.context(lm=command_r):
            context = dspy.Retrieve(k=10)(question).passages
            # Context Summary
            context = self.summarizer(question=question, context=context).summary
            raw_detail_response = self.evaluator(context=context, 
                                 criterion=detail,
                                 question=question,
                                 ground_truth_answer=ground_truth_answer,
                                 predicted_answer=predicted_answer).rating
            raw_faithful_response = self.evaluator(context=context, 
                                 criterion=faithful,
                                 question=question,
                                 ground_truth_answer=ground_truth_answer,
                                 predicted_answer=predicted_answer).rating
            raw_ground_truth_response = self.evaluator(context=context, 
                                 criterion=ground_truth,
                                 question=question,
                                 ground_truth_answer=ground_truth_answer,
                                 predicted_answer=predicted_answer).rating
        
        # Structured Output Parsing
        with dspy.context(lm=command_r):
            detail_rating = self.rating_parser(raw_rating_response=raw_detail_response).rating
            faithful_rating = self.rating_parser(raw_rating_response=raw_faithful_response).rating
            ground_truth_rating = self.rating_parser(raw_rating_response=raw_ground_truth_response).rating
        
        total = float(detail_rating) + float(faithful_rating)*2 + float(ground_truth_rating)
    
        return total / 5.0

toy_ground_truth_answer = """
Cross encoders score the relevance of a document to a query. They are commonly used to rerank documents.
"""

lgtm_query = "What do cross encoders do?"
lgtm_example = dspy.Example(question=lgtm_query, gold_answer=toy_ground_truth_answer)


# If this is your first time exploring LLM metrics,
# I recommend trying the exercise of improving this answer to achieve a higher LLM rating.

lgtm_pred = dspy.Example(answer="They re-rank documents.")

llm_metric = RAGMetricProgram()
llm_metric_rating = llm_metric(lgtm_example, lgtm_pred)
print(llm_metric_rating)

def MetricWrapper(gold, pred, trace=None):
    return llm_metric(gold, pred)

3.8


# RAG

In [14]:
class GenerateAnswer(dspy.Signature):
    """Assess the the context and answer the question."""

    context = dspy.InputField(desc="Helpful information for answering the question.")
    question = dspy.InputField()
    answer = dspy.OutputField(desc="A detailed answer that is supported by the context.")
    
class RAG(dspy.Module):
    def __init__(self, passages_per_hop=3, max_hops=2):
        super().__init__()
        
        self.retrieve = dspy.Retrieve(k=passages_per_hop)
        self.generate_answer = dspy.Predict(GenerateAnswer)
    
    def forward(self, question):
        context = self.retrieve(question).passages
        with dspy.context(lm=command_r):
            pred = self.generate_answer(context=context, question=question).answer
        return dspy.Prediction(context=context, answer=pred, question=question)

In [15]:
uncompiled_Prediction = RAG()(lgtm_query)
print(f"LGTM test query: {lgtm_query} \n \n ")
print(f"Uncompiled Answer: {uncompiled_Prediction.answer} \n \n")
test_example = dspy.Example(question=lgtm_query, gold_answer=toy_ground_truth_answer)
test_pred = uncompiled_Prediction
llm_metric_rating = llm_metric(test_example, test_pred)
print(f"LLM Metric Rating: {llm_metric_rating}")

LGTM test query: What do cross encoders do? 
 
 
Uncompiled Answer: Cross-encoders are ranking models used for content-based re-ranking. They output a value indicating the similarity between a pair of data items, such as two sentences. They're called cross-encoders because the input consists of a pair of data items, and the model encodes them crossly. You need to use a cross-encoder with each data item and search query to calculate their similarity. They're more accurate but slower than bi-encoders. 
 

LLM Metric Rating: 4.0


In [11]:
from dspy.evaluate.evaluate import Evaluate

evaluate = Evaluate(devset=devset, num_threads=4, display_progress=False)

uncompiled_score = evaluate(RAG(), metric=MetricWrapper)

Average Metric: 19.6 / 5  (392.0%)


In [12]:
from dspy.teleprompt import COPRO

COPRO_teleprompter = COPRO(prompt_model=command_r,
                          metric=MetricWrapper,
                          breadth=8,
                          depth=3,
                          init_temperature=0,
                          verbose=True,
                          track_stats=True)
kwargs = dict(num_threads=1, display_progress=True, display_table=5)

COPRO_compiled_RAG = COPRO_teleprompter.compile(RAG(), trainset=trainset[:3], eval_kwargs=kwargs)
eval_score = evaluate(COPRO_compiled_RAG, devset=devset, **kwargs)
print(eval_score)




You are an instruction optimizer for large language models. I will give you a ``signature`` of fields (inputs and outputs) in English. Your task is to propose an instruction that will lead a good language model to perform the task well. Don't be afraid to be creative.

---

Follow the following format.

Basic Instruction: The initial instructions before optimization
Proposed Instruction: The improved instructions for the language model
Proposed Prefix For Output Field: The string at the end of the prompt, which will help the model start solving the task

---

Basic Instruction: Assess the the context and answer the question.
Proposed Instruction:[32mBasic Instruction: Read the question carefully, understand the context, and provide a thoughtful answer.

Proposed Instruction: Answer the question while demonstrating a thorough understanding of the given context. 

Proposed Prefix For Output Field: "Context understood. Here's the answer:"[0m[31m 	 (and 290 other completions)[0m





Average Metric: 12.0 / 3  (400.0): 100%|██████████████████████████████████████████████████████████████████████████| 3/3 [00:12<00:00,  4.25s/it]

Average Metric: 12.0 / 3  (400.0%)



  df.loc[:, metric_name] = df[metric_name].apply(


Unnamed: 0,gold_answer,example_question,context,answer,pred_question,MetricWrapper
0,The Binary Independence Model in the BM25 algorithm used by Weaviate's hybrid search plays a crucial role in the calculation of the Inverse Document Frequency...,What is the role of the Binary Independence Model in the BM25 algorithm used by Weaviate's hybrid search?,"['Note, the current implementation of hybrid search in Weaviate uses BM25/BM25F and vector search. If you’re interested to learn about how dense vector indexes are...",The Binary Independence Model is a key component of the BM25 algorithm as it forms the basis for the normalization penalty. This penalty serves to...,What is the role of the Binary Independence Model in the BM25 algorithm used by Weaviate's hybrid search?,4.0
1,"Vector libraries might not be suitable for applications that require real-time updates and scalable semantic search because they have immutable index data, preventing real-time updates....",Why might vector libraries not be suitable for applications that require real-time updates and scalable semantic search?,"['Updatability: The index data is immutable, and thus no real-time updates are possible. 2. Scalability: Most vector libraries cannot be queried while importing your data,...","Vector libraries are not suitable for applications requiring real-time updates and scalable semantic search because the index data is immutable, which means no real-time updates...",Why might vector libraries not be suitable for applications that require real-time updates and scalable semantic search?,4.0
2,"The document recommends the ""LangChain Guide"" by Paul from CommandBar for learning about LangChain projects.",What guide does the document recommend for learning about LangChain projects?,"[""I recommend checking out the GitHub repository to test this out yourself!\n\n## Additional Resources\n• [LangChain Guide](https://www.commandbar.com/blog/langchain-projects) by Paul from CommandBar. import StayConnected from '/_includes/stay-connected.mdx'\n\n"",...",The document recommends the LangChain Guide available at https://www.commandbar.com/blog/langchain-projects for learning about LangChain projects. The guide provides an in-depth overview of LangChain and its capabilities.,What guide does the document recommend for learning about LangChain projects?,4.0





Extract the FLOAT valued rating from a string.

---

Follow the following format.

Raw Rating Response: The string that contains the rating in it.
Rating: A FLOAT valued rating.

---

Raw Rating Response: 5
Rating:[32m5.0[0m[31m 	 (and 2 other completions)[0m



prompt_model.inspect_history(n=1) 


Extract the FLOAT valued rating from a string.

---

Follow the following format.

Raw Rating Response: The string that contains the rating in it.
Rating: A FLOAT valued rating.

---

Raw Rating Response: 5
Rating:[32m5.0[0m[31m 	 (and 2 other completions)[0m



----------------
(instruction, prefix) ('Basic Instruction: Read the question carefully, understand the context, and provide a thoughtful answer.\n\nProposed Instruction: Answer the question while demonstrating a thorough understanding of the given context.', "Context understood. Here's the answer:")
----------------
Predictor 1
i: Assess the the context and answer the question.
p: Answer:

At Depth 1/3, Evaluating Prompt 

Average Metric: 12.0 / 3  (400.0): 100%|██████████████████████████████████████████████████████████████████████████| 3/3 [00:11<00:00,  3.82s/it]

Average Metric: 12.0 / 3  (400.0%)



  df.loc[:, metric_name] = df[metric_name].apply(


Unnamed: 0,gold_answer,example_question,context,answer,pred_question,MetricWrapper
0,The Binary Independence Model in the BM25 algorithm used by Weaviate's hybrid search plays a crucial role in the calculation of the Inverse Document Frequency...,What is the role of the Binary Independence Model in the BM25 algorithm used by Weaviate's hybrid search?,"['Note, the current implementation of hybrid search in Weaviate uses BM25/BM25F and vector search. If you’re interested to learn about how dense vector indexes are...",The Binary Independence Model is a key component of the BM25 algorithm because it provides the basis for the normalization penalty. This penalty evaluates a...,What is the role of the Binary Independence Model in the BM25 algorithm used by Weaviate's hybrid search?,4.0
1,"Vector libraries might not be suitable for applications that require real-time updates and scalable semantic search because they have immutable index data, preventing real-time updates....",Why might vector libraries not be suitable for applications that require real-time updates and scalable semantic search?,"['Updatability: The index data is immutable, and thus no real-time updates are possible. 2. Scalability: Most vector libraries cannot be queried while importing your data,...","Vector libraries are not suitable for applications requiring real-time updates and scalable semantic search because the index data is immutable, which means no real-time updates...",Why might vector libraries not be suitable for applications that require real-time updates and scalable semantic search?,4.0
2,"The document recommends the ""LangChain Guide"" by Paul from CommandBar for learning about LangChain projects.",What guide does the document recommend for learning about LangChain projects?,"[""I recommend checking out the GitHub repository to test this out yourself!\n\n## Additional Resources\n• [LangChain Guide](https://www.commandbar.com/blog/langchain-projects) by Paul from CommandBar. import StayConnected from '/_includes/stay-connected.mdx'\n\n"",...",The document recommends the LangChain Guide by Paul from CommandBar import StayConnected for learning about LangChain projects. The guide's link is provided as: https://www.commandbar.com/blog/langchain-projects.,What guide does the document recommend for learning about LangChain projects?,4.0





Extract the FLOAT valued rating from a string.

---

Follow the following format.

Raw Rating Response: The string that contains the rating in it.
Rating: A FLOAT valued rating.

---

Raw Rating Response: 5
Rating:[32m5.0[0m[31m 	 (and 2 other completions)[0m



prompt_model.inspect_history(n=1) 


Extract the FLOAT valued rating from a string.

---

Follow the following format.

Raw Rating Response: The string that contains the rating in it.
Rating: A FLOAT valued rating.

---

Raw Rating Response: 5
Rating:[32m5.0[0m[31m 	 (and 2 other completions)[0m



----------------
(instruction, prefix) ('Assess the the context and answer the question.', 'Answer:')
Updating Predictor 6074760912 to:
i: Basic Instruction: Read the question carefully, understand the context, and provide a thoughtful answer.

Proposed Instruction: Answer the question while demonstrating a thorough understanding of the given context.
p: Context understood. Here's the answer:
Full predictor with update: 
P

Average Metric: 12.0 / 3  (400.0): 100%|██████████████████████████████████████████████████████████████████████████| 3/3 [00:16<00:00,  5.60s/it]

Average Metric: 12.0 / 3  (400.0%)



  df.loc[:, metric_name] = df[metric_name].apply(


Unnamed: 0,gold_answer,example_question,context,answer,pred_question,MetricWrapper
0,The Binary Independence Model in the BM25 algorithm used by Weaviate's hybrid search plays a crucial role in the calculation of the Inverse Document Frequency...,What is the role of the Binary Independence Model in the BM25 algorithm used by Weaviate's hybrid search?,"['Note, the current implementation of hybrid search in Weaviate uses BM25/BM25F and vector search. If you’re interested to learn about how dense vector indexes are...",Answer: The Binary Independence Model is a key component of the BM25 algorithm as it forms the basis for calculating the length normalization penalty. Explanation:...,What is the role of the Binary Independence Model in the BM25 algorithm used by Weaviate's hybrid search?,4.0
1,"Vector libraries might not be suitable for applications that require real-time updates and scalable semantic search because they have immutable index data, preventing real-time updates....",Why might vector libraries not be suitable for applications that require real-time updates and scalable semantic search?,"['Updatability: The index data is immutable, and thus no real-time updates are possible. 2. Scalability: Most vector libraries cannot be queried while importing your data,...","Vector libraries are not suitable for real-time updates or scalable semantic search because the index data is immutable. This means that, although they offer efficient...",Why might vector libraries not be suitable for applications that require real-time updates and scalable semantic search?,4.0
2,"The document recommends the ""LangChain Guide"" by Paul from CommandBar for learning about LangChain projects.",What guide does the document recommend for learning about LangChain projects?,"[""I recommend checking out the GitHub repository to test this out yourself!\n\n## Additional Resources\n• [LangChain Guide](https://www.commandbar.com/blog/langchain-projects) by Paul from CommandBar. import StayConnected from '/_includes/stay-connected.mdx'\n\n"",...","Answer: LangChain Guide. Explanation: The provided context contains references to various resources and guides related to Weaviate and open-source contributions. However, the specific guide that...",What guide does the document recommend for learning about LangChain projects?,4.0





Extract the FLOAT valued rating from a string.

---

Follow the following format.

Raw Rating Response: The string that contains the rating in it.
Rating: A FLOAT valued rating.

---

Raw Rating Response: 5
Rating:[32m5.0[0m[31m 	 (and 2 other completions)[0m



prompt_model.inspect_history(n=1) 


Extract the FLOAT valued rating from a string.

---

Follow the following format.

Raw Rating Response: The string that contains the rating in it.
Rating: A FLOAT valued rating.

---

Raw Rating Response: 5
Rating:[32m5.0[0m[31m 	 (and 2 other completions)[0m



----------------
(instruction, prefix) ('Attempted Instructions: \n- Instruction #1: Answer the question based on the provided context, offering a concise and accurate response.\n- Prefix #1: Answer: \n- Resulting Score #1: 420.0\n\n- Instruction #2: Read and understand the context carefully, then provide a detailed explanation along with the answer. Ensure the language model focuses on clarity and comprehensiveness.\n- Pr

Average Metric: 12.0 / 3  (400.0): 100%|██████████████████████████████████████████████████████████████████████████| 3/3 [00:21<00:00,  7.05s/it]

Average Metric: 12.0 / 3  (400.0%)



  df.loc[:, metric_name] = df[metric_name].apply(


Unnamed: 0,gold_answer,example_question,context,answer,pred_question,MetricWrapper
0,The Binary Independence Model in the BM25 algorithm used by Weaviate's hybrid search plays a crucial role in the calculation of the Inverse Document Frequency...,What is the role of the Binary Independence Model in the BM25 algorithm used by Weaviate's hybrid search?,"['Note, the current implementation of hybrid search in Weaviate uses BM25/BM25F and vector search. If you’re interested to learn about how dense vector indexes are...","Context: The text describes the BM25 algorithm used in Weaviate's hybrid search and its importance, along with the recent addition of a new fusion algorithm....",What is the role of the Binary Independence Model in the BM25 algorithm used by Weaviate's hybrid search?,4.0
1,"Vector libraries might not be suitable for applications that require real-time updates and scalable semantic search because they have immutable index data, preventing real-time updates....",Why might vector libraries not be suitable for applications that require real-time updates and scalable semantic search?,"['Updatability: The index data is immutable, and thus no real-time updates are possible. 2. Scalability: Most vector libraries cannot be queried while importing your data,...","Summary: Vector libraries are efficient for in-memory vector searches but have limitations in real-time updating and scalability, making them unsuitable for dynamic applications. Vector databases...",Why might vector libraries not be suitable for applications that require real-time updates and scalable semantic search?,4.0
2,"The document recommends the ""LangChain Guide"" by Paul from CommandBar for learning about LangChain projects.",What guide does the document recommend for learning about LangChain projects?,"[""I recommend checking out the GitHub repository to test this out yourself!\n\n## Additional Resources\n• [LangChain Guide](https://www.commandbar.com/blog/langchain-projects) by Paul from CommandBar. import StayConnected from '/_includes/stay-connected.mdx'\n\n"",...","Context: The text provides valuable resources for making an open-source contribution to Weaviate, including guides, workshops, and a GitHub repository. Question: Which guide in the...",What guide does the document recommend for learning about LangChain projects?,4.0





Extract the FLOAT valued rating from a string.

---

Follow the following format.

Raw Rating Response: The string that contains the rating in it.
Rating: A FLOAT valued rating.

---

Raw Rating Response: 5
Rating:[32m5.0[0m[31m 	 (and 2 other completions)[0m



prompt_model.inspect_history(n=1) 


Extract the FLOAT valued rating from a string.

---

Follow the following format.

Raw Rating Response: The string that contains the rating in it.
Rating: A FLOAT valued rating.

---

Raw Rating Response: 5
Rating:[32m5.0[0m[31m 	 (and 2 other completions)[0m



----------------
(instruction, prefix) ('Attempted Instructions: [1]\nProposed Instruction: Provide a brief overview/summary followed by a more detailed explanation, ensuring clarity and conciseness in the summary while providing a comprehensive understanding in the explanation. This approach strikes a balance between accuracy and understanding, enhancing the chance of success.', "Summary: \n\n---\n\nHere's an example that

Average Metric: 0.0 / 1  (0.0):  20%|███████████████▍                                                             | 1/5 [00:01<00:07,  1.93s/it]

Error for example in dev set: 		 'NoneType' object is not callable


Average Metric: 0.0 / 2  (0.0):  40%|██████████████████████████████▊                                              | 2/5 [00:04<00:07,  2.45s/it]

Error for example in dev set: 		 'NoneType' object is not callable


Average Metric: 0.0 / 3  (0.0):  60%|██████████████████████████████████████████████▏                              | 3/5 [00:05<00:03,  1.75s/it]

Error for example in dev set: 		 'NoneType' object is not callable


Average Metric: 0.0 / 4  (0.0):  80%|█████████████████████████████████████████████████████████████▌               | 4/5 [00:06<00:01,  1.34s/it]

Error for example in dev set: 		 'NoneType' object is not callable


TypeError: 'NoneType' object is not callable

In [14]:
print(COPRO_compiled_RAG(question="What is ref2vec?").answer)

Ref2Vec, short for reference-to-vector, is a Weaviate 1.16 module that enables vectorization of a data object with cross-references to other objects. Essentially, it finds the average vector of cross-referenced vectors to represent the referencing object. It's a lightweight method to determine real-time preferences and actions, which is useful for recommendations and relevant results in apps.


In [15]:
print(command_r.inspect_history(n=1))




Basic Instruction: Read the question carefully, understand the context, and provide a thoughtful answer.

Proposed Instruction: Answer the question while demonstrating a thorough understanding of the given context.

---

Follow the following format.

Context: Helpful information for answering the question.
Question: ${question}
Context understood. Here's the answer: A detailed answer that is supported by the context.

---

Context:
[1] «---
title: What is Ref2Vec and why you need it for your recommendation system
slug: ref2vec-centroid
authors: [connor]
date: 2022-11-23
tags: ['integrations', 'concepts']
image: ./img/hero.png
description: "Weaviate introduces Ref2Vec, a new module that utilises Cross-References for Recommendation!"
---
![Ref2vec-centroid](./img/hero.png)

<!-- truncate -->

Weaviate 1.16 introduced the [Ref2Vec](/developers/weaviate/modules/retriever-vectorizer-modules/ref2vec-centroid) module. In this article, we give you an overview of what Ref2Vec is and some exa

# MIPRO

In [16]:
class ObservationSummarizer(dspy.Signature):
    """Given a series of observations I have made about my dataset, please summarize them into a brief 2-3 sentence summary which highlights only the most important details."""

    observations = dspy.InputField(desc="Observations I have made about my dataset")
    summary = dspy.OutputField(
        desc="Two to Three sentence summary of only the most significant highlights of my observations",
    )


class DatasetDescriptor(dspy.Signature):
    (
        """Given several examples from a dataset please write observations about trends that hold for most or all of the samples. """
        """Some areas you may consider in your observations: topics, content, syntax, conciceness, etc. """
        """It will be useful to make an educated guess as to the nature of the task this dataset will enable. Don't be afraid to be creative"""
    )

    examples = dspy.InputField(desc="Sample data points from the dataset")
    observations = dspy.OutputField(desc="Somethings that holds true for most or all of the data you observed")


class DatasetDescriptorWithPriorObservations(dspy.Signature):
    (
        """Given several examples from a dataset please write observations about trends that hold for most or all of the samples. """
        """I will also provide you with a few observations I have already made.  Please add your own observations or if you feel the observations are comprehensive say 'COMPLETE' """
        """Some areas you may consider in your observations: topics, content, syntax, conciceness, etc. """
        """It will be useful to make an educated guess as to the nature of the task this dataset will enable. Don't be afraid to be creative"""
    )

    examples = dspy.InputField(desc="Sample data points from the dataset")
    prior_observations = dspy.InputField(desc="Some prior observations I made about the data")
    observations = dspy.OutputField(
        desc="Somethings that holds true for most or all of the data you observed or COMPLETE if you have nothing to add",
    )

In [24]:
dataset_descriptor = dspy.Predict(DatasetDescriptor)
dataset_descriptor_with_prior = dspy.Predict(DatasetDescriptorWithPriorObservations)
observation_summarizer = dspy.Predict(ObservationSummarizer)

def examples_to_strings(trainset):
    example_strings = []
    for example in trainset:
        question = example.inputs["question"]
        gold_answer = example.gold_answer
        example_string = f"Question: {question}\nAnswer: {gold_answer}"
        example_strings.append(example_string)
    return example_strings

print("HELLO")
batch_size=5
for example in range(0, len(trainset), batch_size):
    examples = examples_to_strings(trainset[i:i+batch_size])
    examples = "".join(examples)
    print(examples)

HELLO








In [16]:
from dspy.teleprompt import MIPRO

teleprompter = MIPRO(prompt_model=command_r, task_model=command_r, metric=MetricWrapper, num_candidates=10, init_temperature=0)
kwargs = dict(num_threads=1, display_progress=True, display_table=0)
MIPRO_compiled_RAG = teleprompter.compile(RAG(), trainset=trainset[:3], num_trials=5, max_bootstrapped_demos=1, max_labeled_demos=0, eval_kwargs=kwargs)
eval_score = evaluate(MIPRO_compiled_RAG, devset=devset, **kwargs)
print(eval_score)


Please be advised that based on the parameters you have set, the maximum number of LM calls is projected as follows:

[93m- Task Model: [94m[1m3[0m[93m examples in dev set * [94m[1m5[0m[93m trials * [94m[1m# of LM calls in your program[0m[93m = ([94m[1m15 * # of LM calls in your program[0m[93m) task model calls[0m
[93m- Prompt Model: # data summarizer calls (max [94m[1m10[0m[93m) + [94m[1m10[0m[93m * [94m[1m1[0m[93m lm calls in program = [94m[1m20[0m[93m prompt model calls[0m

[93m[1mEstimated Cost Calculation:[0m

[93mTotal Cost = (Number of calls to task model * (Avg Input Token Length per Call * Task Model Price per Input Token + Avg Output Token Length per Call * Task Model Price per Output Token) 
            + (Number of calls to prompt model * (Avg Input Token Length per Call * Task Prompt Price per Input Token + Avg Output Token Length per Call * Prompt Model Price per Output Token).[0m

For a preliminary estimate of potential costs, we


  0%|                                                                                                                     | 0/3 [00:00<?, ?it/s][A
 33%|████████████████████████████████████▎                                                                        | 1/3 [00:04<00:08,  4.09s/it][A


Bootstrapped 1 full traces after 2 examples in round 0.



  0%|                                                                                                                     | 0/3 [00:00<?, ?it/s][A
 33%|████████████████████████████████████▎                                                                        | 1/3 [00:04<00:08,  4.18s/it][A


Bootstrapped 1 full traces after 2 examples in round 0.



  0%|                                                                                                                     | 0/3 [00:00<?, ?it/s][A
 33%|████████████████████████████████████▎                                                                        | 1/3 [00:04<00:08,  4.13s/it][A


Bootstrapped 1 full traces after 2 examples in round 0.



  0%|                                                                                                                     | 0/3 [00:00<?, ?it/s][A
 33%|████████████████████████████████████▎                                                                        | 1/3 [00:03<00:07,  3.51s/it][A


Bootstrapped 1 full traces after 2 examples in round 0.



  0%|                                                                                                                     | 0/3 [00:00<?, ?it/s][A
 33%|████████████████████████████████████▎                                                                        | 1/3 [00:03<00:07,  3.91s/it][A


Bootstrapped 1 full traces after 2 examples in round 0.



  0%|                                                                                                                     | 0/3 [00:00<?, ?it/s][A
 33%|████████████████████████████████████▎                                                                        | 1/3 [00:04<00:08,  4.01s/it][A


Bootstrapped 1 full traces after 2 examples in round 0.



  0%|                                                                                                                     | 0/3 [00:00<?, ?it/s][A
 33%|████████████████████████████████████▎                                                                        | 1/3 [00:03<00:06,  3.44s/it][A


Bootstrapped 1 full traces after 2 examples in round 0.



  0%|                                                                                                                     | 0/3 [00:00<?, ?it/s][A
 33%|████████████████████████████████████▎                                                                        | 1/3 [00:03<00:07,  3.82s/it][A


Bootstrapped 1 full traces after 2 examples in round 0.



  0%|                                                                                                                     | 0/3 [00:00<?, ?it/s][A
 33%|████████████████████████████████████▎                                                                        | 1/3 [00:03<00:07,  3.94s/it][A


Bootstrapped 1 full traces after 2 examples in round 0.


[I 2024-03-28 21:38:15,053] A new study created in memory with name: no-name-01e73fb9-ee9b-46bc-9ff0-ac53a63d18be


Starting trial #0



  0%|                                                                                                                     | 0/3 [00:00<?, ?it/s][A
Average Metric: 3.6 / 1  (360.0):   0%|                                                                                   | 0/3 [00:05<?, ?it/s][A
Average Metric: 3.6 / 1  (360.0):  33%|█████████████████████████                                                  | 1/3 [00:05<00:11,  5.99s/it][A
Average Metric: 7.4 / 2  (370.0):  33%|█████████████████████████                                                  | 1/3 [00:10<00:11,  5.99s/it][A
Average Metric: 7.4 / 2  (370.0):  67%|██████████████████████████████████████████████████                         | 2/3 [00:10<00:05,  5.37s/it][A
Average Metric: 11.4 / 3  (380.0):  67%|█████████████████████████████████████████████████▎                        | 2/3 [00:14<00:05,  5.37s/it][A
Average Metric: 11.4 / 3  (380.0): 100%|███████████████████████████████████████████████████████████████████████

Average Metric: 11.4 / 3  (380.0%)
Starting trial #1



  0%|                                                                                                                     | 0/3 [00:00<?, ?it/s][A
Average Metric: 4.0 / 1  (400.0):   0%|                                                                                   | 0/3 [00:04<?, ?it/s][A
Average Metric: 4.0 / 1  (400.0):  33%|█████████████████████████                                                  | 1/3 [00:04<00:09,  4.66s/it][A
Average Metric: 8.0 / 2  (400.0):  33%|█████████████████████████                                                  | 1/3 [00:08<00:09,  4.66s/it][A
Average Metric: 8.0 / 2  (400.0):  67%|██████████████████████████████████████████████████                         | 2/3 [00:08<00:04,  4.42s/it][A
Average Metric: 12.0 / 3  (400.0):  67%|█████████████████████████████████████████████████▎                        | 2/3 [00:12<00:04,  4.42s/it][A
Average Metric: 12.0 / 3  (400.0): 100%|███████████████████████████████████████████████████████████████████████

Average Metric: 12.0 / 3  (400.0%)
Starting trial #2



  0%|                                                                                                                     | 0/3 [00:00<?, ?it/s][A
Average Metric: 3.0 / 1  (300.0):   0%|                                                                                   | 0/3 [00:06<?, ?it/s][A
Average Metric: 3.0 / 1  (300.0):  33%|█████████████████████████                                                  | 1/3 [00:06<00:12,  6.26s/it][A
Average Metric: 7.0 / 2  (350.0):  33%|█████████████████████████                                                  | 1/3 [00:12<00:12,  6.26s/it][A
Average Metric: 7.0 / 2  (350.0):  67%|██████████████████████████████████████████████████                         | 2/3 [00:12<00:06,  6.30s/it][A
Average Metric: 10.2 / 3  (340.0):  67%|█████████████████████████████████████████████████▎                        | 2/3 [00:17<00:06,  6.30s/it][A
Average Metric: 10.2 / 3  (340.0): 100%|███████████████████████████████████████████████████████████████████████

Average Metric: 10.2 / 3  (340.0%)
Starting trial #3



  0%|                                                                                                                     | 0/3 [00:00<?, ?it/s][A
Average Metric: 3.8 / 1  (380.0):   0%|                                                                                   | 0/3 [00:05<?, ?it/s][A
Average Metric: 3.8 / 1  (380.0):  33%|█████████████████████████                                                  | 1/3 [00:05<00:11,  6.00s/it][A
Average Metric: 7.8 / 2  (390.0):  33%|█████████████████████████                                                  | 1/3 [00:11<00:11,  6.00s/it][A
Average Metric: 7.8 / 2  (390.0):  67%|██████████████████████████████████████████████████                         | 2/3 [00:11<00:05,  5.60s/it][A
Average Metric: 11.8 / 3  (393.3):  67%|█████████████████████████████████████████████████▎                        | 2/3 [00:16<00:05,  5.60s/it][A
Average Metric: 11.8 / 3  (393.3): 100%|███████████████████████████████████████████████████████████████████████

Average Metric: 11.8 / 3  (393.3%)
Starting trial #4



  0%|                                                                                                                     | 0/3 [00:00<?, ?it/s][A
Average Metric: 3.2 / 1  (320.0):   0%|                                                                                   | 0/3 [00:05<?, ?it/s][A
Average Metric: 3.2 / 1  (320.0):  33%|█████████████████████████                                                  | 1/3 [00:05<00:10,  5.18s/it][A
Average Metric: 6.800000000000001 / 2  (340.0):  33%|████████████████████▎                                        | 1/3 [00:11<00:10,  5.18s/it][A
Average Metric: 6.800000000000001 / 2  (340.0):  67%|████████████████████████████████████████▋                    | 2/3 [00:11<00:06,  6.14s/it][A
Average Metric: 10.8 / 3  (360.0):  67%|█████████████████████████████████████████████████▎                        | 2/3 [00:15<00:06,  6.14s/it][A
Average Metric: 10.8 / 3  (360.0): 100%|███████████████████████████████████████████████████████████████████████

Average Metric: 10.8 / 3  (360.0%)
Returning generate_answer = Predict(StringSignature(context, question -> answer
    instructions="Observations: The provided observations indicate a need for the model to understand and interpret technical details, focusing on specific aspects to craft a clear, detailed response. The task requires an understanding of the role of various algorithms and their interactions. \n\nExamples: \n\n- Context: [Insert technical details and background information on algorithms]\nQuestion: What is the function of X in Y algorithm? \nAnswer: X is responsible for Z, which contributes to the overall goal of Q. \n\nBasic Instruction: Assess the context and answer the question. \n\nProposed Instruction: Analyze the technical intricacies and focus on the interplay of algorithms. Explain the role of the queried element, X, within the Y algorithm's mechanism, providing a concise yet detailed response."
    context = Field(annotation=str required=True json_schema_extra={'d


  0%|                                                                                                                     | 0/5 [00:00<?, ?it/s][A

TypeError: 'NoneType' object is not callable

In [None]:
print(MIPRO_compiled_RAG(question="What is ref2vec?").answer)

In [None]:
print(command_r.inspect_history(n=1))