# DSPy Getting Started Notebook

This notebook demonstrates the basics of using DSPy for language model programming.

In [25]:
import dspy
import requests

print("DSPy version:", dspy.__version__)

DSPy version: 2.6.27


In [37]:
def check_ollama():
    try:
        # Check if Ollama server is running
        response = requests.get("http://localhost:11434/api/tags", timeout=5)
        if response.status_code == 200:
            models = response.json().get('models', [])
            print("Ollama server is running!")
            print(f"Available models: {len(models)}")
            for model in models[:6]:  # Show first 6 models
                print(f"   - {model['name']}")
            if len(models) > 6:
                print(f"   ... and {len(models) - 6} more")
            return True
        else:
            print(f"Ollama server returned HTTP {response.status_code}")
            return False
    except requests.exceptions.RequestException:
        print("Ollama server is not running")
        print("Start it with: ollama serve")
        print("Install models with: ollama pull gemma3:1b")
        return False

ollama_available = check_ollama()

Ollama server is running!
Available models: 7
   - gemma3:1b
   - mistral-small3.2:24b
   - llava:7b
   - deepseek-r1:70b
   - deepseek-r1:32b
   - gemma3:27b
   ... and 1 more
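
The same `/api/tags` payload can be used to choose which model to configure. The following is a minimal sketch, assuming the payload has the `models` / `name` shape that `check_ollama` reads above. `pick_model` and the preference list are hypothetical helpers for illustration, not part of DSPy or Ollama.

```python
def pick_model(payload, preferred):
    """Return the first preferred model name present in an /api/tags payload, or None."""
    installed = {m.get("name") for m in payload.get("models", [])}
    for name in preferred:
        if name in installed:
            return name
    return None


# Sample payload shaped like the response parsed in check_ollama (values are illustrative)
sample_payload = {"models": [{"name": "gemma3:1b"}, {"name": "llava:7b"}]}

# Prefer the larger model, fall back to the small one
choice = pick_model(sample_payload, ["gemma3:27b", "gemma3:1b"])
model_id = f"ollama_chat/{choice}" if choice else None
print(model_id)
```

In the notebook, the real `response.json()` would take the place of `sample_payload`, and `model_id` could be passed to `dspy.LM` as the model string.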


## 1. Setting up a Language Model

First, configure DSPy to use a language model. This notebook uses **Ollama** as the default backend, which lets you run models locally.

### Prerequisites for Ollama:
1. Install Ollama: `curl -fsSL https://ollama.ai/install.sh | sh`
2. Start Ollama server: `ollama serve`
3. Install a model: `ollama pull gemma3:1b`

In [None]:
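# Configure the language model (Ollama is the default backend)
# Note: as far as we can tell, creating dspy.LM only stores the settings;
# connection problems typically surface on the first prediction call.
try:
    print("Attempting to connect to Ollama...")
    lm = dspy.LM(
        "ollama_chat/gemma3:1b",            # provider prefix / model name
        api_base="http://localhost:11434",  # default local Ollama address
        api_key="",                         # Ollama does not require a key
        model_type="chat"
    )
    # Make this LM the default for all DSPy modules
    dspy.configure(lm=lm)
except Exception as e:
    print(f"Failed to configure Ollama: {e}")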

Attempting to connect to Ollama...


## 2. Creating a Basic Signature

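A DSPy signature declares the input and output fields of a language model module. The class docstring serves as the task instruction, and each field's `desc` gives the model a hint about that field.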

In [38]:
class BasicQA(dspy.Signature):
    """Answer questions with short factual answers."""
    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")

# Create a predictor
qa_predictor = dspy.Predict(BasicQA)

print("QA predictor created!")

QA predictor created!


## 3. Testing the Basic QA System

In [39]:
# Test with some questions
questions = [
    "What is the capital of France?",
    "Who wrote Romeo and Juliet?",
    "What is 2 + 2?",
    "What is the largest planet in our solar system?"
]

for question in questions:
    try:
        result = qa_predictor(question=question)
        print(f"Q: {question}")
        print(f"A: {result.answer}")
        print("-" * 50)
    except Exception as e:
        print(f"Error with question '{question}': {e}")

Q: What is the capital of France?
A: Paris
--------------------------------------------------
Q: Who wrote Romeo and Juliet?
A: William Shakespeare
--------------------------------------------------
Q: What is 2 + 2?
A: 4
--------------------------------------------------
Q: What is the largest planet in our solar system?
A: Jupiter
--------------------------------------------------


## 4. Creating a More Complex Signature

Next, create a signature for text classification. Here the output field's description lists the allowed labels.

In [40]:
class SentimentClassification(dspy.Signature):
    """Classify the sentiment of a given text as positive, negative, or neutral."""
    text = dspy.InputField(desc="text to classify")
    sentiment = dspy.OutputField(desc="positive, negative, or neutral")

# Create sentiment classifier
sentiment_classifier = dspy.Predict(SentimentClassification)

print("Sentiment classifier created!")

Sentiment classifier created!


In [41]:
# Test sentiment classification
texts = [
    "I love this movie! It's absolutely fantastic.",
    "This is terrible. I hate it.",
    "The weather is okay today.",
    "This product exceeded my expectations!"
]

for text in texts:
    try:
        result = sentiment_classifier(text=text)
        print(f"Text: {text}")
        print(f"Sentiment: {result.sentiment}")
        print("-" * 50)
    except Exception as e:
        print(f"Error with text '{text}': {e}")



Text: I love this movie! It's absolutely fantastic.
Sentiment: positive
--------------------------------------------------
Text: This is terrible. I hate it.
Sentiment: negative
--------------------------------------------------
Text: The weather is okay today.
Sentiment: neutral
--------------------------------------------------
Text: This product exceeded my expectations!
Sentiment: positive
--------------------------------------------------
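
Small local models do not always reply with a bare label; they may add punctuation, capitalization, or a prefix. The following sketch shows one way to map a free-form reply onto the three labels before using it downstream. `normalize_sentiment` and `VALID_LABELS` are hypothetical names introduced here, not DSPy features.

```python
import re

VALID_LABELS = ("positive", "negative", "neutral")


def normalize_sentiment(raw):
    """Map a free-form model reply to one of VALID_LABELS, or None if ambiguous."""
    words = re.findall(r"[a-z]+", raw.lower())
    found = [label for label in VALID_LABELS if label in words]
    # Exactly one label must appear; anything else is treated as unparseable
    return found[0] if len(found) == 1 else None


print(normalize_sentiment("Positive."))
print(normalize_sentiment("Sentiment: NEGATIVE"))
print(normalize_sentiment("positive or negative"))
```

In the loop above, this would be applied as `normalize_sentiment(result.sentiment)`. Returning `None` for ambiguous replies keeps parsing failures visible instead of silently guessing a label.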


## 5. Chain of Thought Reasoning

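`dspy.ChainOfThought` wraps a signature so that the model produces an intermediate `reasoning` field before the declared outputs. The result exposes both the reasoning and the final output.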

In [32]:
class MathProblem(dspy.Signature):
    """Solve a math word problem step by step."""
    problem = dspy.InputField(desc="math word problem")
    solution = dspy.OutputField(desc="step-by-step solution with final answer")

# Create a chain of thought predictor
math_solver = dspy.ChainOfThought(MathProblem)

print("Math solver with chain of thought created!")

Math solver with chain of thought created!


In [33]:
# Test math problem solving
problem = "Sarah has 15 apples. She gives 3 apples to her friend and buys 8 more apples. How many apples does Sarah have now?"

try:
    result = math_solver(problem=problem)
    print(f"Problem: {problem}")
    print(f"\nReasoning: {result.reasoning}")
    print(f"\nSolution: {result.solution}")
except Exception as e:
    print(f"Error solving math problem: {e}")

Problem: Sarah has 15 apples. She gives 3 apples to her friend and buys 8 more apples. How many apples does Sarah have now?

Reasoning: Sarah starts with 15 apples. She gives away 3, so she has 15 - 3 = 12 apples. Then she buys 8 more, so she has 12 + 8 = 20 apples.

Solution: Sarah now has 20 apples.
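
Because the solution comes back as free text, checking it programmatically requires extracting the final number. The following is a rough sketch that assumes the answer is the last non-negative integer in the solution string. `last_integer` is a hypothetical helper and would need adjusting for decimals, negative numbers, or units.

```python
import re


def last_integer(text):
    """Return the last non-negative integer that appears in text, or None."""
    matches = re.findall(r"\d+", text)
    return int(matches[-1]) if matches else None


solution_text = "Sarah now has 20 apples."  # solution string from the output above
expected = 15 - 3 + 8  # starting apples, minus those given away, plus those bought

print(last_integer(solution_text) == expected)
```

In the notebook, `result.solution` would be passed in place of `solution_text`.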


## 6. Optimization Example

DSPy optimizers tune a program against a metric and a small training set. `BootstrapFewShot`, used below, runs the program on training examples, keeps the traces that pass the metric, and adds them to the prompt as few-shot demonstrations.

In [34]:
# First, let's create some training data for optimization
training_data = [
    {"question": "What is the tallest mountain in the world?", "answer": "Mount Everest"},
    {"question": "Who developed the theory of relativity?", "answer": "Albert Einstein"},
    {"question": "What is the main language spoken in Brazil?", "answer": "Portuguese"},
    {"question": "Which planet is known as the Red Planet?", "answer": "Mars"},
    {"question": "What is the hardest natural substance?", "answer": "Diamond"},
    {"question": "Who is the author of '1984'?", "answer": "George Orwell"},
    {"question": "What is the boiling point of water in Celsius?", "answer": "100"},
    {"question": "What is the currency of Japan?", "answer": "Yen"}
]

# Convert to DSPy format
train_examples = [
    dspy.Example(question=item["question"], answer=item["answer"]).with_inputs("question")
    for item in training_data
]

print(f"Created {len(train_examples)} training examples for optimization")
print("\nSample training example:")
print(f"Question: {train_examples[0].question}")
print(f"Expected Answer: {train_examples[0].answer}")

Created 8 training examples for optimization

Sample training example:
Question: What is the tallest mountain in the world?
Expected Answer: Mount Everest


In [35]:
# Set up optimization using BootstrapFewShot
from dspy.teleprompt import BootstrapFewShot

# Define a simple metric to evaluate answers
def answer_correctness_metric(example, pred, trace=None):
    """Return True if the expected and predicted answers overlap as substrings (case-insensitive)."""
    predicted = pred.answer.lower().strip()
    expected = example.answer.lower().strip()

    # An empty prediction is a substring of every string, so reject it explicitly
    if not predicted:
        return False

    # Accept if the expected answer is contained in the prediction or vice versa
    return expected in predicted or predicted in expected

# Create the optimizer
optimizer = BootstrapFewShot(
    metric=answer_correctness_metric,
    max_bootstrapped_demos=4,  # Max demos generated by running the program and keeping traces that pass the metric
    max_labeled_demos=4        # Max demos taken directly from the labeled training set
)

print("Optimizer configured!")
print("Metric: Answer correctness based on keyword matching")
print("Strategy: BootstrapFewShot with up to 4 examples")

Optimizer configured!
Metric: Answer correctness based on keyword matching
Strategy: BootstrapFewShot with up to 4 examples
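
Substring matching is lenient: a partial answer such as "whale" is a substring of "blue whale" and would count as correct. The following is a sketch of a stricter alternative that compares normalized strings for equality. `normalize_answer` and `exact_match_metric` are hypothetical names, and `SimpleNamespace` objects stand in for DSPy examples and predictions so the block runs without a model.

```python
import re
from types import SimpleNamespace


def normalize_answer(text):
    """Lowercase, drop punctuation and a leading article, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    text = re.sub(r"^(the|a|an)\s+", "", text.strip())
    return " ".join(text.split())


def exact_match_metric(example, pred, trace=None):
    """Stricter metric: normalized answers must be equal and non-empty."""
    predicted = normalize_answer(pred.answer)
    return bool(predicted) and predicted == normalize_answer(example.answer)


# SimpleNamespace objects stand in for dspy.Example / prediction objects
gold = SimpleNamespace(answer="Blue whale")
print(exact_match_metric(gold, SimpleNamespace(answer="The blue whale.")))
print(exact_match_metric(gold, SimpleNamespace(answer="")))
print(exact_match_metric(gold, SimpleNamespace(answer="whale")))
```

The function keeps the `(example, pred, trace=None)` signature used by the notebook's metric, so it could be passed as `metric=` in the same way. Whether the stricter check is preferable depends on how verbose the model's answers tend to be.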


In [36]:
# Compile/optimize the QA predictor
print("Optimizing the QA predictor...")
try:
    optimized_qa = optimizer.compile(qa_predictor, trainset=train_examples[:6])  # Use 6 examples for training
    print("Optimization completed!")
    
    # Test both original and optimized versions
    test_questions = [
        "What is the capital of Italy?",
        "Who discovered penicillin?",
        "What is the largest mammal?"
    ]
    
    print("\n" + "="*60)
    print("COMPARISON: Original vs Optimized QA System")
    print("="*60)
    
    for question in test_questions:
        print(f"\nQuestion: {question}")
        
        # Original predictor
        try:
            original_result = qa_predictor(question=question)
            print(f"Original: {original_result.answer}")
        except Exception as e:
            print(f"Original: Error - {e}")
        
        # Optimized predictor
        try:
            optimized_result = optimized_qa(question=question)
            print(f"Optimized: {optimized_result.answer}")
        except Exception as e:
            print(f"Optimized: Error - {e}")
        
        print("-" * 40)

except Exception as e:
    print(f"Optimization failed: {e}")
    print("Check that the Ollama server is reachable and that the configured model is installed")

Optimizing the QA predictor...


 67%|██████▋   | 4/6 [00:00<00:00, 1978.45it/s]

Bootstrapped 4 full traces after 4 examples for up to 1 rounds, amounting to 4 attempts.
Optimization completed!

COMPARISON: Original vs Optimized QA System

Question: What is the capital of Italy?
Original: Rome
Optimized: Rome
----------------------------------------

Question: Who discovered penicillin?
Original: Alexander Fleming discovered penicillin.
Optimized: Alexander Fleming
----------------------------------------

Question: What is the largest mammal?
Original: The blue whale
Optimized: Blue Whale
----------------------------------------
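
Comparing three answers by eye gives only a rough impression. Since `compile` was given `train_examples[:6]`, the remaining two examples can serve as a small held-out set. The following sketch computes accuracy with a plain loop; `accuracy`, `substring_metric`, and the canned `fake_program` are hypothetical stand-ins so the block runs without a model. DSPy also provides its own evaluation utility, which this sketch does not use.

```python
from types import SimpleNamespace


def accuracy(program, examples, metric):
    """Fraction of examples for which metric(example, prediction) is truthy."""
    if not examples:
        return 0.0
    hits = sum(bool(metric(ex, program(question=ex.question))) for ex in examples)
    return hits / len(examples)


def substring_metric(example, pred, trace=None):
    """Case-insensitive containment check, similar in spirit to the notebook's metric."""
    return example.answer.lower() in pred.answer.lower()


# Canned answers so the sketch runs offline; the second one is deliberately wrong
canned = {
    "What is the currency of Japan?": "Yen",
    "What is the boiling point of water in Celsius?": "212",
}


def fake_program(question):
    """Stand-in for qa_predictor / optimized_qa."""
    return SimpleNamespace(answer=canned[question])


held_out = [
    SimpleNamespace(question="What is the boiling point of water in Celsius?", answer="100"),
    SimpleNamespace(question="What is the currency of Japan?", answer="Yen"),
]

print(accuracy(fake_program, held_out, substring_metric))
```

In the notebook, the call would look like `accuracy(optimized_qa, train_examples[6:], answer_correctness_metric)`, repeated with `qa_predictor` for the baseline. With only two held-out examples the number is indicative at best.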





### Other DSPy Optimizers

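DSPy offers several other optimization strategies:
- **`LabeledFewShot`**: Builds few-shot prompts directly from the provided labeled examples, without bootstrapping
- **`COPRO`**: Generates and refines instructions for each step, using coordinate ascent
- **`MIPRO`**: Optimizes instructions and few-shot demonstrations together, using Bayesian optimization (available as `MIPROv2` in recent releases)
- **`BayesianSignatureOptimizer`**: The name under which MIPRO appeared in older DSPy releases; it may not be importable in 2.6.x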