[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/weaviate/recipes/blob/main/integrations/llm-agent-frameworks/dspy/4.Structured-Outputs-with-DSPy.ipynb)

## Load Data into Weaviate
**You need a running Weaviate cluster with data**:
1. Pick an installation option from the [Weaviate installation guide](https://weaviate.io/developers/weaviate/installation).
2. Import your data, using either of the following:
    1. The `Weaviate-Import.ipynb` notebook, which loads the Weaviate blogs (recipes/integrations/dspy/Weaviate-Import.ipynb).
    2. The [Quickstart Guide](https://weaviate.io/developers/weaviate/quickstart).

## Setup

In [31]:
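import os

import dspy
from dspy.retrieve.weaviate_rm import WeaviateRM
from dspy.retrieve.you_rm import YouRM
import weaviate

# API keys used below. Reading them from these environment variables is an assumption,
# since the notebook does not show where they are defined.
cohere_api_key = os.environ["COHERE_API_KEY"]
you_api_key = os.environ["YDC_API_KEY"]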

gpt4 = dspy.OpenAI(model="gpt-4-1106-preview", max_tokens=4000, model_type="chat")
gpt_turbo = dspy.OpenAI(model="gpt-3.5-turbo", max_tokens=4000, model_type="chat")
command_r = dspy.Cohere(model="command-r", max_tokens=4000, api_key=cohere_api_key)
mistral_ollama = dspy.OllamaLocal(model="mistral", max_tokens=4000, timeout_s=480)

lms = [{"name": "GPT-4", "lm": gpt4},
       {"name": "GPT-3.5-Turbo", "lm": gpt_turbo},
       {"name": "Command-R", "lm": command_r},
       {"name": "Mistral-7B", "lm": mistral_ollama}]

weaviate_client = weaviate.connect_to_local()
weaviate_rm = WeaviateRM("WeaviateBlogChunk", weaviate_client=weaviate_client)
you_rm = YouRM(ydc_api_key=you_api_key)
dspy.settings.configure(lm=gpt_turbo, rm=weaviate_rm)

In [32]:
command_r("say hello")

["Hello! How's it going? I hope you're having a fantastic day! 😊"]

In [33]:
# Phoenix Setup
import phoenix as px
phoenix_session = px.launch_app()

Existing running Phoenix instance detected! Shutting it down and starting a new instance...
🌍 To view the Phoenix app in your browser, visit http://localhost:6006/
📺 To view the Phoenix app in a notebook, run `px.active_session().view()`
📖 For more information on how to use Phoenix, check out https://docs.arize.com/phoenix




In [34]:
from openinference.instrumentation.dspy import DSPyInstrumentor
from opentelemetry import trace as trace_api
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

endpoint = "http://127.0.0.1:6006/v1/traces"
tracer_provider = trace_sdk.TracerProvider()
span_otlp_exporter = OTLPSpanExporter(endpoint=endpoint)
tracer_provider.add_span_processor(SimpleSpanProcessor(span_exporter=span_otlp_exporter))

trace_api.set_tracer_provider(tracer_provider=tracer_provider)
DSPyInstrumentor().instrument()

Overriding of current TracerProvider is not allowed
Attempting to instrument while already instrumented


## Hurricane's `Question2BlogOutline`

In [39]:
class Question2BlogOutline(dspy.Signature):
    """Your task is to write a Weaviate blog post that will help answer the given question.\nPlease use the contexts from a web search and published Weaviate blog posts to evaluate the structure of the blog post."""
    
    question = dspy.InputField()
    blog_context = dspy.InputField()
    web_context = dspy.InputField()
    blog_outline = dspy.OutputField(desc="A list of topics the blog will cover. IMPORTANT!! This must follow a comma separated list of values!")

In [40]:
# Utils
def format_weaviate_and_you_contexts(weaviateRM_output, youRM_output):
    weaviateRM_output = [d['long_text'] for d in weaviateRM_output]
    weaviateRM_output = "".join(weaviateRM_output)
    youRM_output = [d['long_text'] for d in youRM_output]
    youRM_output = "".join(youRM_output)
    return weaviateRM_output, youRM_output
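
To make the helper's behaviour concrete, here is a small self-contained run. The sample passages are made up for illustration and are shaped the way the helper indexes them (dicts with a `long_text` key). Note that joining with an empty string runs passages together; whether a separator would serve the LM better is a design choice this notebook leaves open.

```python
def format_weaviate_and_you_contexts(weaviateRM_output, youRM_output):
    # Same logic as the helper above: collect each passage's text and concatenate.
    weaviate_text = "".join(d["long_text"] for d in weaviateRM_output)
    you_text = "".join(d["long_text"] for d in youRM_output)
    return weaviate_text, you_text

# Hypothetical passages, shaped like the retriever results the helper expects.
blog_passages = [{"long_text": "Cross encoders score a query-document pair jointly."},
                 {"long_text": "Bi-encoders embed each text separately."}]
web_passages = [{"long_text": "Rerankers are often cross encoders."}]

blog_context, web_context = format_weaviate_and_you_contexts(blog_passages, web_passages)
print(blog_context)
print(web_context)
```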

## Blog2Outline

In [43]:
class Blog2Outline(dspy.Module):
    def __init__(self, weaviate_rm, you_rm):
        super().__init__()  # initialize dspy.Module state before adding sub-modules
        self.question_to_blog_outline = dspy.Predict(Question2BlogOutline)
        self.weaviate_rm = weaviate_rm
        self.you_rm = you_rm

    def forward(self, question):
        blog_contexts = self.weaviate_rm(question)
        web_contexts = self.you_rm(question)
        blog_contexts, web_contexts = format_weaviate_and_you_contexts(blog_contexts, web_contexts)
        question_to_blog_outline_outputs = self.question_to_blog_outline(question=question, blog_context=blog_contexts, web_context=web_contexts)
        return question_to_blog_outline_outputs.blog_outline
                                                                         
toy_question = "What are cross encoders?"
blog2outline = Blog2Outline(weaviate_rm, you_rm)

for lm_dict in lms:
    lm, name = lm_dict["lm"], lm_dict["name"]
    with dspy.context(lm=lm):
        print(f"\033[91mResult for {name}\n")
        print(f"\033[0m{blog2outline(question=toy_question)} \n")

Result for GPT-4

Introduction to Cross Encoders, Definition and Working Mechanism of Cross Encoders, Advantages of Cross Encoders, Challenges of Cross Encoders, The Math Behind Cross Encoders, Practical Applications of Cross Encoders, Understanding Bi-Encoders, Advantages of Bi-Encoders, Challenges of Bi-Encoders, The Math Behind Bi-Encoders, Comparative Analysis: Cross-Encoders vs. Bi-Encoders, When to Use Cross-Encoders and Bi-Encoders, Combining Bi- and Cross-Encoders, Training Cross-Encoders, Conclusion 

Result for GPT-3.5-Turbo

1. Introduction to Cross Encoders
2. Comparison between Cross Encoders and Bi Encoders
3. Advantages and Challenges of Cross Encoders
4. Use Cases of Cross Encoders
5. Training and Implementation of Cross Encoders
6. Combining Bi and Cross Encoders
7. Conclusion and Future of Cross Encoders 

Result for Command-R

What are cross-encoders and how do they work? 
- A brief introduction to the two types of encoders and their trade-
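
The different formats above matter as soon as downstream code tries to consume the outline. As a rough illustration (the `split_topics` helper and the shortened sample strings are hypothetical, not part of the notebook's pipeline), a naive comma split recovers separate topics only from the GPT-4 style output:

```python
def split_topics(outline: str) -> list[str]:
    # Naive consumer: assumes the outline is a comma-separated list.
    return [part.strip() for part in outline.split(",") if part.strip()]

# Shortened versions of the outputs shown above.
gpt4_style = "Introduction to Cross Encoders, Advantages of Cross Encoders, Conclusion"
gpt35_style = ("1. Introduction to Cross Encoders\n"
               "2. Use Cases of Cross Encoders\n"
               "3. Conclusion and Future of Cross Encoders")

print(len(split_topics(gpt4_style)))   # 3 separate topics
print(len(split_topics(gpt35_style)))  # 1 item: the whole numbered list
```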

## DSPy TypedPredictors

In [56]:
from dspy.functional import TypedPredictor
import pydantic
from typing import List

class Topic(pydantic.BaseModel):
    topic: str
    topic_description: str

class Topics(pydantic.BaseModel):
    topics: List[Topic]

class TypedQuestion2BlogOutline(dspy.Signature):
    """Your task is to write a Weaviate blog post that will help answer the given question.\nPlease use the contexts from a web search and published Weaviate blog posts to evaluate the structure of the blog post."""
    
    question: str = dspy.InputField()
    blog_context: str = dspy.InputField()
    web_context: str = dspy.InputField()
    blog_outline: Topics = dspy.OutputField(desc="A list of topics the blog will cover. IMPORTANT!! This must follow a comma separated list of values!")

In [63]:
import functools

class TypedBlog2Outline(dspy.Module):
    def __init__(self, weaviate_rm, you_rm):
        super().__init__()  # initialize dspy.Module state before adding sub-modules
        self.question_to_blog_outline = dspy.functional.TypedPredictor(TypedQuestion2BlogOutline)
        self.weaviate_rm = weaviate_rm
        self.you_rm = you_rm

    def forward(self, question):
        blog_contexts = self.weaviate_rm(question)
        web_contexts = self.you_rm(question)
        blog_contexts, web_contexts = format_weaviate_and_you_contexts(blog_contexts, web_contexts)
        question_to_blog_outline_outputs = self.question_to_blog_outline(question=question, blog_context=blog_contexts, web_context=web_contexts)
        return question_to_blog_outline_outputs.blog_outline

                                                                         
blog2outline = TypedBlog2Outline(weaviate_rm, you_rm)
        
toy_question = "What are cross encoders?"

for lm_dict in lms:
    lm, name = lm_dict["lm"], lm_dict["name"]
    with dspy.context(lm=lm):
        print(f"\033[91mResult for {name}\n")
        print(f"\033[0m{blog2outline(question=toy_question)} \n")

Result for GPT-4

topics=[Topic(topic='Introduction to Cross Encoders', topic_description='An overview of what Cross Encoders are and their significance in the field of NLP.'), Topic(topic='Understanding the Mechanism of Cross Encoders', topic_description='A detailed explanation of how Cross Encoders work, including their process of encoding sentence pairs.'), Topic(topic='Advantages of Cross Encoders', topic_description='Discussion of the benefits of using Cross Encoders, such as their high accuracy and detailed textual analysis capabilities.'), Topic(topic='Challenges and Limitations of Cross Encoders', topic_description='Exploration of the computational intensity and potential drawbacks of using Cross Encoders in certain applications.'), Topic(topic='Comparative Analysis: Cross Encoders vs. Bi-Encoders', topic_description='A comparison between Cross Encoders and Bi-Encoders, highlighting the scenarios where each is most effective.'), Topic(topic='Practical Applications of C

ValueError: ('Too many retries trying to get the correct output format. Try simplifying the requirements.', {'blog_outline': "ValueError('json output should start and end with { and }')"})
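
The error above reports that a reply did not start and end with `{` and `}`, meaning it was not a bare JSON object. The sketch below is a stdlib-only illustration of that kind of check against the `Topics` shape. It is not DSPy's implementation, and `parse_topics` and the sample replies are hypothetical.

```python
import json

def parse_topics(raw: str) -> list[dict]:
    """Parse a reply shaped like {"topics": [{"topic": ..., "topic_description": ...}]}."""
    text = raw.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("json output should start and end with { and }")
    topics = json.loads(text)["topics"]
    for item in topics:
        # Every entry must carry both string fields of the Topic model.
        if not isinstance(item.get("topic"), str) or not isinstance(item.get("topic_description"), str):
            raise ValueError(f"malformed topic entry: {item!r}")
    return topics

good_reply = '{"topics": [{"topic": "Introduction to Cross Encoders", "topic_description": "What they are."}]}'
chatty_reply = "Sure! Here is the outline:\n" + good_reply  # extra text before the JSON

print(parse_topics(good_reply)[0]["topic"])
try:
    parse_topics(chatty_reply)
except ValueError as err:
    print(f"rejected: {err}")
```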

## Assertions

- Assertion-driven backtracking allows pipelines to self-correct at inference time by retrying failing modules with refined prompts.
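
As a toy illustration of that retry loop, the sketch below feeds a failed check's message back into the next attempt. It is a hand-rolled simulation, not DSPy's `backtrack_handler`; `fake_lm`, `run_with_backtracking`, and the canned replies are hypothetical stand-ins.

```python
def is_comma_separated(text: str) -> bool:
    # Passes only if there is at least one comma and no empty item.
    parts = text.strip().split(",")
    return len(parts) > 1 and all(p.strip() for p in parts)

def fake_lm(prompt: str) -> str:
    # Stand-in for a model: it only complies once the prompt carries feedback.
    if "Feedback:" in prompt:
        return "Introduction, Use Cases, Conclusion"
    return "1. Introduction\n2. Use Cases\n3. Conclusion"

def run_with_backtracking(prompt: str, message: str, max_backtracks: int = 1):
    output = fake_lm(prompt)
    attempts = 1
    while not is_comma_separated(output) and attempts <= max_backtracks:
        # Retry with the previous output and the failure message appended to the prompt.
        retry_prompt = f"{prompt}\nPrevious output: {output}\nFeedback: {message}"
        output = fake_lm(retry_prompt)
        attempts += 1
    return output, attempts

result, attempts = run_with_backtracking(
    "Write a blog outline.", "Output must be a comma-separated list of topics!")
print(attempts, result)
```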

In [51]:
import functools

def is_comma_separated_list(string):
    string = string.strip()
    if "," in string:
        values = string.split(",")
        for value in values:
            if not value.strip():
                return False
        return True
    else:
        return False
    
failed_assertion_message = """
Output must be a comma-separated list of topics!
Please remove any numerical listing, such as (1., 2., ...) or alphabetical listing, such as (A., B., ...) or other symbols to denote lists such as '*' or '-'
"""
    
class Blog2OutlineWithAssertions(dspy.Module):
    def __init__(self, weaviate_rm, you_rm):
        super().__init__()  # initialize dspy.Module state before adding sub-modules
        self.question_to_blog_outline = dspy.Predict(Question2BlogOutline)
        self.weaviate_rm = weaviate_rm
        self.you_rm = you_rm

    def forward(self, question):
        blog_contexts = self.weaviate_rm(question)
        web_contexts = self.you_rm(question)
        blog_contexts, web_contexts = format_weaviate_and_you_contexts(blog_contexts, web_contexts)
        question_to_blog_outline_outputs = self.question_to_blog_outline(question=question, blog_context=blog_contexts, web_context=web_contexts)
        dspy.Suggest(is_comma_separated_list(question_to_blog_outline_outputs.blog_outline),
                    failed_assertion_message)
        return question_to_blog_outline_outputs.blog_outline


                                                                         
toy_question = "What are cross encoders?"

from dspy.primitives.assertions import assert_transform_module, backtrack_handler

blog2outline_with_assertions = assert_transform_module(Blog2OutlineWithAssertions(weaviate_rm, you_rm),
                                                      functools.partial(backtrack_handler, max_backtracks=1))

for lm_dict in lms:
    lm, name = lm_dict["lm"], lm_dict["name"]
    with dspy.context(lm=lm):
        print(f"\033[91mResult for {name}\n")
        print(f"\033[0m{blog2outline_with_assertions(question=toy_question)} \n")

Result for GPT-4

Introduction to Cross Encoders, Definition and Working Mechanism of Cross Encoders, Advantages of Cross Encoders, Challenges of Cross Encoders, The Math Behind Cross Encoders, Practical Applications of Cross Encoders, Understanding Bi-Encoders, Advantages of Bi-Encoders, Challenges of Bi-Encoders, The Math Behind Bi-Encoders, Comparative Analysis: Cross-Encoders vs. Bi-Encoders, When to Use Cross-Encoders and Bi-Encoders, Combining Bi- and Cross-Encoders, Training Cross-Encoders, Conclusion 

Result for GPT-3.5-Turbo

SuggestionFailed: 
Output must be a comma-separated list of topics!
Please remove any numerical listing, such as (1., 2., ...) or alphabetical listing, such as (A., B., ...) or other symbols to denote lists such as '*' or '-'

Introduction to Cross Encoders, Comparison between Cross Encoders and Bi Encoders, Advantages and Challenges of Cross Encoders, Use Cases of Cross Encoders, Training and Implementation of Cross Encoders, Combining

## Custom Guardrails with the DSPy Programming Model

Notes on Compiling Assertions

- Assertion-driven example bootstrapping generates more robust few-shot examples that adhere to constraints during DSPy's prompt optimization phase.

- Counterexample bootstrapping creates demonstrations with failed examples and fixes to further improve the LM's ability to comply with constraints.

Notes on Metrics

- Intrinsic Metrics = Assertions
- Extrinsic Metrics = Answer Quality in QA
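
One way to read this split in code, as a sketch only: an intrinsic metric checks the constraint itself, while an extrinsic metric scores the end result. The `(example, pred, trace=None)` signature follows the convention DSPy uses for metrics. The toy keyword-overlap score and the `SimpleNamespace` stand-ins are assumptions for illustration, not this notebook's metric.

```python
from types import SimpleNamespace

def intrinsic_format_metric(example, pred, trace=None) -> bool:
    # Intrinsic: does the output itself satisfy the comma-separated constraint?
    parts = pred.blog_outline.strip().split(",")
    return len(parts) > 1 and all(p.strip() for p in parts)

def extrinsic_coverage_metric(example, pred, trace=None) -> float:
    # Extrinsic (toy proxy for quality): fraction of expected keywords mentioned.
    outline = pred.blog_outline.lower()
    hits = sum(1 for kw in example.expected_keywords if kw.lower() in outline)
    return hits / len(example.expected_keywords)

# Hypothetical stand-ins for a dataset example and a program prediction.
example = SimpleNamespace(question="What are cross encoders?",
                          expected_keywords=["cross encoders", "bi-encoders"])
pred = SimpleNamespace(blog_outline="Introduction to Cross Encoders, Use Cases, Conclusion")

print(intrinsic_format_metric(example, pred))    # True
print(extrinsic_coverage_metric(example, pred))  # 0.5
```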

In [107]:
class TopicGuardrails(dspy.Signature):
    """Please assess whether this generated topic list is properly formatted as a comma-separated list."""
    
    topic_str = dspy.InputField()
    properly_formatted = dspy.OutputField(desc = "only output True or False")
    reason_for_properly_formatted_decision = dspy.OutputField()

class RetryTopic(dspy.Signature):
    """Given the original output and the reason for its failure, please correct it. 
    Please also remove any extra text before the topics begin, such as something like `A list of topics the blog will cover:`"""
    
    original_output = dspy.InputField()
    reason_for_failure = dspy.InputField()
    corrected_output = dspy.OutputField()

class Blog2OutlineWithCustomGuardrails(dspy.Module):
    def __init__(self, weaviate_rm, you_rm):
        super().__init__()  # initialize dspy.Module state before adding sub-modules
        self.question_to_blog_outline = dspy.Predict(Question2BlogOutline)
        self.topic_guardrails = dspy.Predict(TopicGuardrails)
        self.retry_topic = dspy.Predict(RetryTopic)
        self.weaviate_rm = weaviate_rm
        self.you_rm = you_rm

    def forward(self, question):
        blog_contexts = self.weaviate_rm(question)
        web_contexts = self.you_rm(question)
        blog_contexts, web_contexts = format_weaviate_and_you_contexts(blog_contexts, web_contexts)
        question_to_blog_outline_outputs = self.question_to_blog_outline(question=question, blog_context=blog_contexts, web_context=web_contexts)
        blog_outline = question_to_blog_outline_outputs.blog_outline
        counter = 0
        while True:
            with dspy.context(lm=gpt4):
                guardrails_outputs = self.topic_guardrails(topic_str=blog_outline)
            print(f"\n Guardrails Outputs {guardrails_outputs}")
            if guardrails_outputs.properly_formatted == 'True':
                break
            reason_for_failure = guardrails_outputs.reason_for_properly_formatted_decision
            blog_outline = self.retry_topic(original_output=blog_outline, reason_for_failure=reason_for_failure).corrected_output
            print(f"\n Retried Blog Outline: {blog_outline}\n")
            counter += 1
            if counter >= 3:
                print("Exceeded Retry Limit, exiting.")
                break
        return blog_outline

In [108]:
toy_question = "How does HNSW work?"

blog2outline_with_custom_guardrails = Blog2OutlineWithCustomGuardrails(weaviate_rm, you_rm)

for lm_dict in lms:
    lm, name = lm_dict["lm"], lm_dict["name"]
    dspy.settings.configure(lm=lm)
    print(f"\033[91mResult for {name}\n")
    print(f"\033[0m{blog2outline_with_custom_guardrails(question=toy_question)} \n")

Result for GPT-4


 Guardrails Outputs Prediction(
    properly_formatted='True',
    reason_for_properly_formatted_decision='The topic list is properly formatted as a comma-separated list, with each topic separated by a comma and a space, following standard English punctuation rules for lists.'
)
Introduction to HNSW and its relevance in vector search, Overview of ANN algorithms and their importance, Understanding the concept of "length scale" in HNSW, The hierarchical structure of HNSW and its comparison to Skip Lists, The search process in HNSW explained, Insertion and deletion mechanisms in HNSW, The role of bi-directional links in HNSW, Random layer assignment and its significance, Performance complexities of HNSW, Practical implementation of HNSW using HNSWlib, Addressing common questions and misconceptions about HNSW, Conclusion and future outlook on HNSW in vector databases 

Result for GPT-3.5-Turbo


 Guardrails Outputs Prediction(
    properly_formatted='False'

## Connect Custom Guardrails into Assertions


```
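guardrail = dspy.Predict(MyGuardrail)  # MyGuardrail: your own guardrail Signature

def guardrail_wrapper(topic_str):
    # The field names `topic_str` and `judgement` are assumed to match MyGuardrail.
    # LM outputs are strings, so compare against "True": bool("False") would be True.
    return guardrail(topic_str=topic_str).judgement.strip() == "True"

# ... program code

# dspy.Suggest expects the boolean result of the check, not the function itself.
dspy.Suggest(guardrail_wrapper(blog_outline), "Failed Guardrail Message for Retry")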
```
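
A detail worth checking when wiring an LM judgement into `dspy.Suggest`: the judgement arrives as a string, and Python treats any non-empty string as truthy. The sketch below shows the pitfall and one possible normalisation. `parse_judgement` is a hypothetical helper, and how lenient to be with replies such as `True.` is a design choice.

```python
def parse_judgement(raw: str) -> bool:
    # Accept only a bare "true", ignoring case, surrounding whitespace,
    # and wrapping punctuation or quotes.
    cleaned = raw.strip().strip(".!\"'").lower()
    return cleaned == "true"

print(bool("False"))               # True: any non-empty string is truthy
print(parse_judgement("False"))    # False
print(parse_judgement(" True. "))  # True
```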

## Compiling DSPy Guardrails

In [None]:
# - What to compile?
# -- initial instructions to pass guardrails as the metric
# -- guardrails to accurately detect failed topics
# -- retry to quickly retry failed topics

# dataset of questions

# metric = ?

# compilers