# Getting Started with DSPy

This notebook introduces the fundamental concepts of DSPy:
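- Setting up language models
- Creating signatures
- Using basic modules
- Making predictions
- Adding reasoning with `ChainOfThought`
- Building custom modules
- Working with examples and inspecting LM calls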

DSPy is a framework for algorithmically optimizing LM prompts and weights, especially when LMs are used one or more times within a pipeline.

## Setup

First, let's import the necessary libraries and set up our environment.

In [None]:
import os
import sys
sys.path.append('../../')

import dspy
from utils import setup_default_lm, print_step, print_result, print_error
from dotenv import load_dotenv

# Load environment variables
load_dotenv('../../.env')

## Language Model Configuration

DSPy supports various language models. Let's configure one for our examples.

In [None]:
print_step("Setting up Language Model", "Configuring DSPy with OpenAI gpt-4o")

try:
    # Set up the language model
    lm = setup_default_lm(provider="openai", model="gpt-4o", max_tokens=500)
    
    # Configure DSPy to use this model
    dspy.configure(lm=lm)
    
    print_result("Language model configured successfully!", "Status")
    
except Exception as e:
    print_error(f"Failed to configure language model: {e}")
    print("Make sure you have set your OPENAI_API_KEY in the .env file")

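A missing key may not surface until the first model call, so a preflight check before configuration can make the failure easier to read. The sketch below uses only the standard library; `missing_env_vars` is a hypothetical helper, not part of DSPy or of this repository's `utils` module.

```python
import os

def missing_env_vars(names, environ=None):
    """Return the names that are unset or blank in the environment mapping."""
    # Default to the real process environment when no mapping is supplied
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name, "").strip()]

# An explicit mapping keeps this example independent of the real environment
print(missing_env_vars(["OPENAI_API_KEY"], environ={"OPENAI_API_KEY": ""}))
# ['OPENAI_API_KEY']
```

In the notebook, the call would be `missing_env_vars(["OPENAI_API_KEY"])`, placed after `load_dotenv` so that values from the `.env` file are visible.
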
## DSPy Signatures

Signatures define the input/output behavior of your language model calls. They're like type hints for LM operations.

In [None]:
print_step("Creating DSPy Signatures", "Defining input/output specifications")

# Simple question answering signature
class QuestionAnswering(dspy.Signature):
    """Answer the given question with a concise and accurate response."""
    question = dspy.InputField(desc="The question to be answered")
    answer = dspy.OutputField(desc="A concise answer to the question")

# Text classification signature
class SentimentClassification(dspy.Signature):
    """Classify the sentiment of the given text as positive, negative, or neutral."""
    text = dspy.InputField(desc="The text to classify")
    sentiment = dspy.OutputField(desc="The sentiment: positive, negative, or neutral")

print_result("Signatures created successfully!")

## Basic Prediction Module

The `Predict` module is the simplest way to use a signature with a language model.

In [None]:
print_step("Using Predict Module", "Making basic predictions with our signatures")

# Create prediction modules
qa_predictor = dspy.Predict(QuestionAnswering)
sentiment_predictor = dspy.Predict(SentimentClassification)

# Test question answering
question = "What is the capital of France?"
qa_result = qa_predictor(question=question)

print_result(f"Question: {question}\nAnswer: {qa_result.answer}", "Question Answering")

# Test sentiment classification
text = "I absolutely love this new product! It's fantastic!"
sentiment_result = sentiment_predictor(text=text)

print_result(f"Text: {text}\nSentiment: {sentiment_result.sentiment}", "Sentiment Classification")

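The `sentiment` field comes back as free text, so the model may answer `Positive.` or a full sentence rather than one of the three bare labels. One possible way to map the output onto the label set is sketched below; `normalize_sentiment` and the `unknown` fallback are illustrative assumptions, not DSPy features.

```python
import re

ALLOWED_SENTIMENTS = ("positive", "negative", "neutral")

def normalize_sentiment(raw, allowed=ALLOWED_SENTIMENTS, default="unknown"):
    """Map free-text model output onto one of the allowed labels."""
    # Tokenize into lowercase words so case and punctuation are ignored
    words = re.findall(r"[a-z]+", raw.lower())
    # Return the first word, in text order, that is an allowed label
    for word in words:
        if word in allowed:
            return word
    return default

print(normalize_sentiment(" Positive. "))                # positive
print(normalize_sentiment("The sentiment is negative"))  # negative
print(normalize_sentiment("mixed feelings"))             # unknown
```

This is deliberately simple: it ignores negation, so a phrase such as "not positive" would still map to `positive`. It fits short label-like outputs, which is what the signature's description asks the model to produce.
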
## Chain of Thought Reasoning

The `ChainOfThought` module adds reasoning steps before providing the final answer.

In [None]:
print_step("Using ChainOfThought Module", "Adding reasoning steps to predictions")

# Create a math reasoning signature
class MathReasoning(dspy.Signature):
    """Solve the mathematical problem step by step."""
    problem = dspy.InputField(desc="The mathematical problem to solve")
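    # Declared explicitly here; recent DSPy versions of ChainOfThought also add a `reasoning` field on their own
    reasoning = dspy.OutputField(desc="Step-by-step reasoning")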
    answer = dspy.OutputField(desc="The final numerical answer")

# Use ChainOfThought for better reasoning
math_cot = dspy.ChainOfThought(MathReasoning)

# Test with a math problem
problem = "If a rectangle has a length of 8 meters and a width of 5 meters, what is its area?"
math_result = math_cot(problem=problem)

print_result(f"Problem: {problem}\nReasoning: {math_result.reasoning}\nAnswer: {math_result.answer}", "Math Reasoning")

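Although the signature describes `answer` as numerical, the value is returned as text and may include units, for example `40 square meters`. If a number is needed downstream, it can be extracted with a small helper. This is a sketch using the standard library only; `extract_number` is a hypothetical name, not a DSPy function.

```python
import re

def extract_number(text):
    """Return the first number found in the text as a float, or None."""
    # Drop thousands separators so "1,250.5" is read as a single number
    cleaned = text.replace(",", "")
    match = re.search(r"-?\d+(?:\.\d+)?", cleaned)
    return float(match.group()) if match else None

print(extract_number("40 square meters"))         # 40.0
print(extract_number("The area is 1,250.5 m^2"))  # 1250.5
print(extract_number("no digits here"))           # None
```

Because the helper returns the first number it sees, it suits short outputs such as the `answer` field rather than the full reasoning text, which usually mentions several numbers.
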
## Custom DSPy Module

You can create custom modules by subclassing `dspy.Module`.

In [None]:
print_step("Creating Custom Module", "Building a comprehensive question answering system")

class SmartQA(dspy.Module):
    def __init__(self):
        super().__init__()
        
        # Define signature for classification
        class QuestionType(dspy.Signature):
            """Classify the type of question being asked."""
            question = dspy.InputField(desc="The question to classify")
            question_type = dspy.OutputField(desc="Type: factual, mathematical, creative, or analytical")
        
        # Define signature for answering
        class AnswerQuestion(dspy.Signature):
            """Answer the question based on its type."""
            question = dspy.InputField(desc="The question to answer")
            question_type = dspy.InputField(desc="The type of question")
            answer = dspy.OutputField(desc="A comprehensive answer")
        
        # Initialize modules
        self.classify_question = dspy.Predict(QuestionType)
        self.answer_question = dspy.ChainOfThought(AnswerQuestion)
    
    def forward(self, question):
        # First, classify the question type
        classification = self.classify_question(question=question)
        
        # Then answer based on the type
        answer = self.answer_question(
            question=question,
            question_type=classification.question_type
        )
        
        return dspy.Prediction(
            question_type=classification.question_type,
            reasoning=answer.reasoning,
            answer=answer.answer
        )

# Create and test the custom module
smart_qa = SmartQA()

test_questions = [
    "What is the speed of light?",
    "If I have 10 apples and eat 3, how many do I have left?",
    "Write a creative story about a robot learning to paint.",
]

for question in test_questions:
    result = smart_qa(question=question)
    print_result(
        f"Question: {question}\n"
        f"Type: {result.question_type}\n"
        f"Reasoning: {result.reasoning}\n"
        f"Answer: {result.answer}",
        "Smart QA Result"
    )
    print("-" * 80)

## Working with Examples

DSPy uses `Example` objects to represent training and evaluation data.

In [None]:
print_step("Working with Examples", "Creating and using DSPy Example objects")

# Create examples
examples = [
    dspy.Example(question="What is 2+2?", answer="4"),
    dspy.Example(question="Who wrote Romeo and Juliet?", answer="William Shakespeare"),
    dspy.Example(question="What is the largest planet?", answer="Jupiter"),
]

print_result(f"Created {len(examples)} examples")

# Test our QA predictor on these examples
print("Testing predictor on examples:")
for i, example in enumerate(examples, 1):
    prediction = qa_predictor(question=example.question)
    print(f"\nExample {i}:")
    print(f"Question: {example.question}")
    print(f"Expected: {example.answer}")
    print(f"Predicted: {prediction.answer}")
    print(f"Match: {prediction.answer.lower().strip() == example.answer.lower().strip()}")

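The per-example `Match` check above can be wrapped in a reusable metric and aggregated into an accuracy score. The sketch below assumes the `(example, prediction, trace=None)` argument order commonly used for DSPy metric functions; verify that convention against your installed version. It uses `SimpleNamespace` stand-ins so that it runs without a language model, and the `accuracy` helper is hypothetical.

```python
from types import SimpleNamespace

def exact_match(example, prediction, trace=None):
    """Case- and whitespace-insensitive comparison of the two answers."""
    return example.answer.strip().lower() == prediction.answer.strip().lower()

def accuracy(pairs, metric=exact_match):
    """Fraction of (example, prediction) pairs for which the metric is true."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(bool(metric(ex, pred)) for ex, pred in pairs) / len(pairs)

# Stand-in objects with an `answer` attribute; real runs would use
# dspy.Example objects and the predictor's outputs instead.
pairs = [
    (SimpleNamespace(answer="4"), SimpleNamespace(answer="4")),
    (SimpleNamespace(answer="Jupiter"), SimpleNamespace(answer=" jupiter ")),
    (SimpleNamespace(answer="William Shakespeare"), SimpleNamespace(answer="Shakespeare")),
]
print(f"{accuracy(pairs):.2f}")  # 0.67
```

Exact match is strict: a correct but differently worded answer, such as "Shakespeare" for "William Shakespeare", counts as a miss. That strictness is worth keeping in mind when reading the `Match` values printed above.
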
## Inspecting LM Calls

DSPy lets you inspect the actual prompts sent to the language model and the responses it returns.

In [None]:
print_step("Inspecting LM History", "Looking at prompts and responses")

# Make a prediction to generate history
result = qa_predictor(question="What is machine learning?")

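# Inspect the history
# Entry keys vary across DSPy versions: chat-style calls may leave 'prompt' empty
# and store the conversation under 'messages' (assumed key names; check latest_call.keys())
if hasattr(lm, 'history') and lm.history:
    latest_call = lm.history[-1]
    prompt = latest_call.get('prompt') or latest_call.get('messages', 'N/A')
    response = latest_call.get('outputs') or latest_call.get('response', 'N/A')
    print_result(
        f"Prompt: {prompt}\n\n"
        f"Response: {response}",
        "Latest LM Call"
    )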
else:
    print_result("History not available for this LM configuration", "Note")

print_result(f"Answer: {result.answer}", "Final Result")

## Summary

In this notebook, we covered:

1. **Language Model Setup**: How to configure DSPy with a language model (OpenAI `gpt-4o` in this notebook)
2. **Signatures**: Defining input/output specifications for LM operations
3. **Basic Modules**: Using `Predict` for simple predictions
4. **Chain of Thought**: Adding reasoning steps with `ChainOfThought`
5. **Custom Modules**: Creating complex workflows by subclassing `dspy.Module`
6. **Examples**: Working with training/evaluation data
7. **Inspection**: Viewing the prompts and responses behind each LM call

These are the fundamental building blocks for creating more sophisticated DSPy applications. In the next notebooks, we'll explore optimization, retrieval-augmented generation, and advanced techniques.