# DSPy Component Overview

In this notebook, we will explore the high-level components of DSPy and build an intuition for how they work together.

## Notebook Setup

In [1]:
# Importing the necessary Python libraries
import os
import json
import yaml

import dspy

In [2]:
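# Loading my personal API keys from file (not pushed to GitHub due to .gitignore file)
# Defaulting to an empty dict so `api_keys` is always defined, even without the file
api_keys = {}
if os.path.exists('../keys/api_keys.yaml'):
    with open('../keys/api_keys.yaml') as f:
        api_keys = yaml.safe_load(f) or {}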

## Language Model (LM)

In [3]:
# Setting the two different language models (LMs) we will work with
try:
    # Preferring the API key from the environment when it is set
    lm_4o_mini = dspy.LM('openai/gpt-4o-mini', api_key = os.environ['OPENAI_API_KEY'])
    lm_4o = dspy.LM('openai/gpt-4o', api_key = os.environ['OPENAI_API_KEY'])
except KeyError:
    # Falling back to the key loaded from the YAML file
    lm_4o_mini = dspy.LM('openai/gpt-4o-mini', api_key = api_keys['OPENAI_API_KEY'])
    lm_4o = dspy.LM('openai/gpt-4o', api_key = api_keys['OPENAI_API_KEY'])

# Setting DSPy to use GPT-4o-mini as the default LM
dspy.configure(lm = lm_4o_mini)
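
The cell above prefers an environment variable and falls back to the keys loaded from file. As a minimal sketch, that lookup could be factored into a small function. Note that `resolve_api_key` and its `file_keys` parameter are hypothetical names introduced here for illustration; they are not part of DSPy.

```python
import os


def resolve_api_key(name, file_keys=None):
    """Return the key from the environment, else from a dict loaded from file."""
    # Hypothetical helper (not part of DSPy): the environment takes priority
    value = os.environ.get(name)
    if value:
        return value
    # Falling back to the keys loaded from file, if any were loaded
    if file_keys and name in file_keys:
        return file_keys[name]
    raise KeyError(f'{name} not found in the environment or the key file')


# Example with a placeholder value standing in for a real key
os.environ.pop('DEMO_API_KEY', None)
print(resolve_api_key('DEMO_API_KEY', {'DEMO_API_KEY': 'placeholder'}))
```

Raising a `KeyError` with a descriptive message makes a missing key easier to diagnose than a failure deep inside the first LM call.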

## Signatures
In DSPy, a **signature** is a blueprint that declares what you expect to feed in as **input** and what you expect to get back as **output**. Because our goal is often to let DSPy optimize the prompt for us, the signature does NOT need to contain the kind of robust prompt template you may be used to refining through trial-and-error experiments. We might add detail where things are ambiguous, but by and large, we do not need a fully fledged prompt template to get started.

While we don't provide a full prompt template, we may need to give the signature a few details so the model understands exactly what we expect. These may include:

- Setting the **data type** of the inputs / outputs
- Providing a **brief description** of what the input / output is
- Describing the expected **structure** of the inputs / outputs

DSPy offers two ways to create signatures: a **simpler string-based inline signature** and a **more robust class-based signature**. The inline version is meant to get you going quickly; however, I am personally concerned by its lack of "rigidity." Specifically, because the signature is just a string, an IDE like VS Code cannot catch mistakes inside it. Additionally, a complex set of inputs and outputs quickly gets messy in the inline form.

That said, I will demonstrate both approaches in this notebook; in the other tutorials, I use the class-based approach.

(Note: To make this sample code work, we need a simple **DSPy module** called `dspy.Predict()`. We have not yet covered modules at this point in the tutorial, so skip down to that section if you want a better intuition for the code here.)

### Inline Signature
First, let's demonstrate how the inline signature works. We're going to keep it simple by creating a DSPy signature that takes any sentence and returns a Boolean value representing the sentiment of the sentence. If the sentiment is positive, we will return True; otherwise, we will return False.

In [4]:
# Creating sample sentences representing positive and negative sentiment
positive_sentence = "I am very happy with the results of this project."
negative_sentence = "I am disappointed with the outcome of this task."

# Instantiating a simple DSPy module for sentiment classification
dspy_sentiment_classification = dspy.Predict('sentence -> sentiment: bool')

# Invoking the DSPy module with each respective sentence
print(f'Positive sentence: {dspy_sentiment_classification(sentence = positive_sentence)}')
print(f'Negative sentence: {dspy_sentiment_classification(sentence = negative_sentence)}')

Positive sentence: Prediction(
    sentiment=True
)
Negative sentence: Prediction(
    sentiment=False
)
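
To build intuition for what the string `'sentence -> sentiment: bool'` encodes, here is a toy parser that splits an inline signature into input and output fields. This is only an illustration and not DSPy's actual parser: `parse_inline_signature` is a hypothetical name, it does not handle types that contain commas, and it assumes that a field without a type annotation is treated as a string.

```python
def parse_inline_signature(signature):
    """Split 'a, b: int -> c: bool' into input and output field dicts."""
    # Toy illustration only; this is not DSPy's actual parser
    inputs_part, outputs_part = signature.split('->')

    def parse_fields(part):
        fields = {}
        for item in part.split(','):
            name, _, type_name = item.partition(':')
            # Assumption: fields without an annotation are treated as strings
            fields[name.strip()] = type_name.strip() or 'str'
        return fields

    return parse_fields(inputs_part), parse_fields(outputs_part)


inputs, outputs = parse_inline_signature('sentence -> sentiment: bool')
print(inputs)   # {'sentence': 'str'}
print(outputs)  # {'sentiment': 'bool'}
```

Everything to the left of the arrow is an input, everything to the right is an output, and the optional `: type` suffix is what turns the model's answer into `True` / `False` in the output above.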


### Class-based Signature
Now that we've demonstrated the simpler inline signature, let's move on to the class-based signature. As I stated above, this is my personal preference because IDEs can more easily detect errors, which is especially important when you start to build a more complex set of inputs and outputs.

Notice something additional here that the inline string by itself does not give us: we can provide a docstring at the top of the class-based signature. DSPy uses this docstring as the task instructions, which gives our later DSPy module a goal to work toward when performing prompt template optimization. You shouldn't look at this docstring as something you need to get perfect, as you might in trial-and-error prompt engineering. Instead, think of it as a high-level guide to help the model understand what you're trying to achieve.

In the example below, we're going to do something similar to the idea above but expand on it to intentionally add complexity.

In [5]:
# Creating a class-based DSPy signature for text analysis
class TextAnalysisSignature(dspy.Signature):
    """Analyze text for sentiment, main topic, and formality level."""
    
    # Setting the input fields (the type annotation sets each field's data type)
    text: str = dspy.InputField(desc = 'The text to be analyzed')
    language: str = dspy.InputField(desc = 'The language of the text', default = 'English')
    
    # Setting the output fields
    sentiment: str = dspy.OutputField(desc = 'The sentiment of the text (positive, negative, or neutral)')
    topic: str = dspy.OutputField(desc = 'The main topic of the text')
    formality: str = dspy.OutputField(desc = 'The formality level (formal, informal, or neutral)')
    word_count: int = dspy.OutputField(desc = 'The number of words in the text')

# Creating a module using our custom signature
dspy_text_analyzer = dspy.Predict(TextAnalysisSignature)

# Creating sample texts for analysis
business_email = "Dear Mr. Johnson, I am writing to follow up on our meeting last week regarding the quarterly financial report. The results exceeded our expectations."
casual_message = "Hey! Can't wait to see you this weekend. The party's gonna be awesome!"

# Analyzing the texts
business_analysis = dspy_text_analyzer(text = business_email, language = "English")
casual_analysis = dspy_text_analyzer(text = casual_message, language = "English")

# Displaying analysis results in a more creative way
def display_analysis(title, analysis):
    width = 50
    print("=" * width)
    print(f" {title} ".center(width, "*"))
    print("=" * width)
    print(f"📊 SENTIMENT: {analysis.sentiment}")
    print(f"📝 TOPIC: {analysis.topic}")
    print(f"🎩 FORMALITY: {analysis.formality}")
    print(f"🔢 WORD COUNT: {analysis.word_count}")
    print("-" * width)
    
display_analysis("BUSINESS EMAIL ANALYSIS", business_analysis)
print("\n")
display_analysis("CASUAL MESSAGE ANALYSIS", casual_analysis)

************ BUSINESS EMAIL ANALYSIS *************
📊 SENTIMENT: positive
📝 TOPIC: quarterly financial report
🎩 FORMALITY: formal
🔢 WORD COUNT: 27
--------------------------------------------------


************ CASUAL MESSAGE ANALYSIS *************
📊 SENTIMENT: positive
📝 TOPIC: social gathering
🎩 FORMALITY: informal
🔢 WORD COUNT: 15
--------------------------------------------------
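
The `word_count` values above come from the language model, so they are estimates rather than exact counts. A quick way to check them is to count whitespace-separated tokens in plain Python. The `count_words` helper is a hypothetical name added for illustration, and "word" here simply means a whitespace-delimited token.

```python
def count_words(text):
    """Count whitespace-separated tokens in the text."""
    return len(text.split())


# The same sample texts used in the cell above
business_email = "Dear Mr. Johnson, I am writing to follow up on our meeting last week regarding the quarterly financial report. The results exceeded our expectations."
casual_message = "Hey! Can't wait to see you this weekend. The party's gonna be awesome!"

print(count_words(business_email))  # 24
print(count_words(casual_message))  # 13
```

Both counts differ from the model's answers (27 and 15), a reminder that exact counting is better done in code than delegated to an LM.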


## Modules
