# A Gentle Introduction to [DSPy](https://dspy-docs.vercel.app/) Part 2
For grug brained developers.

If you would rather *read* this, you can find it on [LearnByBuilding.AI](https://learnbybuilding.ai/tutorials/). This notebook only contains code, to get some prose along with it, check out the tutorial posted there.

If you like this content, [follow me on twitter](https://twitter.com/bllchmbrs) for more! I'm posting all week about DSPy and providing a lot of "hard earned" lessons that I've gotten from learning the material.

In [40]:
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [41]:
import lab # ignore this import, it's just for some settings / ENV vars

## Using a Dataset

To really leverage DSPy, you're going to need datasets and examples. If you don't have that, do not pass go, do not collect $200. That is the first thing that you need, now you need to think about how to get that dataset.

Be creative!

In [157]:
import dspy
from pprint import pprint

In [158]:
turbo = dspy.OpenAI(model='gpt-3.5-turbo', max_tokens=1000)
dspy.settings.configure(lm=turbo)

In [159]:
import json
with open("dspy-examples-raw.json") as f:
    dataset= json.load(f)

Our dataset is just Grug translation.

In [362]:
dataset[3]

{'grug_text': 'big brained developers are many, and some not expected to like this, make sour face',
 'plain_english': 'Many developers have high intelligence but not all of them appreciate this, leading to some displaying displeasure.'}

In [161]:
examples = []
for row in dataset:
    examples.append(dspy.Example(grug_text=row["grug_text"], plain_english=row["plain_english"]).with_inputs("plain_english"))

In [162]:
from random import shuffle

def split_for_train_test(values, test_size = 1/3.0):
    shuffle(values)
    train = int(len(values)-test_size*len(values))
    return values[:train], values[train:]

In [205]:
train, test = split_for_train_test(examples[:30])
# limit for faster iteration for pipeline
print(len(train))

20


In [206]:
pprint(train[0].toDict())

{'grug_text': 'sometimes compromise necessary or no shiney rock, mean no '
              'dinosaur meat, not good, wife firmly remind grug\n'
              'about young grugs at home need roof, food, and so forth, no '
              'interest in complexity demon spirit rant by grug for\n'
              'fiftieth time',
 'plain_english': 'Sometimes, I have to make compromises even if it means not '
                  'getting something shiny like a rock, which might result in '
                  "not having dinosaur meat. That wouldn't be good. My wife "
                  'always reminds me that we have young children at home who '
                  'need a roof over their heads, food to eat, and other '
                  'necessities. I am not interested in getting caught up in '
                  'the complex rants about demons and spirits that I tend to '
                  'go on about repeatedly.'}


In [207]:
turbo = dspy.OpenAI(model='gpt-3.5-turbo', max_tokens=1000)
dspy.settings.configure(lm=turbo)

## Building our signature

Note, not really optimized for chat models.

In [251]:
class GrugTranslation(dspy.Signature):
    "Translate plain english to Grug text."
    plain_english = dspy.InputField()
    grug_text = dspy.OutputField()

In [252]:
# https://github.com/stanfordnlp/dspy/blob/1c10a9d476737533a53d6bee62c234e375eb8fcb/dsp/templates/template_v3.py#L22
from dspy.signatures.signature import signature_to_template
grug_translation_as_template = signature_to_template(GrugTranslation)
print(str(grug_translation_as_template))

Template(Translate plain english to Grug text., ['Plain English:', 'Grug Text:'])


In [253]:
print(grug_translation_as_template.guidelines())

Follow the following format.

Plain English: ${plain_english}
Grug Text: ${grug_text}


In [254]:
print(grug_translation_as_template.query(train[0]))

Plain English: Sometimes, I have to make compromises even if it means not getting something shiny like a rock, which might result in not having dinosaur meat. That wouldn't be good. My wife always reminds me that we have young children at home who need a roof over their heads, food to eat, and other necessities. I am not interested in getting caught up in the complex rants about demons and spirits that I tend to go on about repeatedly.
Grug Text: sometimes compromise necessary or no shiney rock, mean no dinosaur meat, not good, wife firmly remind grug about young grugs at home need roof, food, and so forth, no interest in complexity demon spirit rant by grug for fiftieth time


## Running an Optimization

We are skipping Zero shot prompting

### Defining our metric

In [255]:
# https://apps.dtic.mil/sti/tr/pdf/AD0667273.pdf
def automated_readability_index(text):
    import re
    characters = len(re.sub(r'\s+', '', text)) # Count characters (ignoring whitespace)
    words = len(text.split()) # Count words by splitting the text
    sentences = len(re.findall(r'[.!?]', text)) # Count sentences by finding period, exclamation, or question mark
    # Calculate the Automated Readability Index (ARI)
    if words == 0 or sentences == 0:  # Prevent division by zero
        return 0
    
    ari = (4.71 * (characters / words)) + (0.5 * (words / sentences)) - 21.43
    
    return round(ari, 2)

In [308]:
sources = []
grugs = []
for ex in examples:
    source_ari = automated_readability_index(ex.plain_english)
    grug_ari = automated_readability_index(ex.grug_text)
    grugs.append(grug_ari)
    sources.append(source_ari)

np.median(sources), np.mean(sources), np.median(grugs), np.mean(grugs)

(21.33, 22.24211009174312, 10.8, 11.75389908256881)

## First Metric: Readability

In [329]:
def ari_metric(truth, pred, trace=None):
    truth_grug_text = truth.grug_text
    proposed_grug_text = pred.grug_text
    
    gold_ari = automated_readability_index(truth_grug_text)
    pred_ari = automated_readability_index(proposed_grug_text)

    ari_result = pred_ari <= 4.5
    return ari_result

## Second Metric: Use a better Model to tune

In [330]:
gpt4T = dspy.OpenAI(model='gpt-4o', max_tokens=100, model_type='chat')

# https://dspy-docs.vercel.app/docs/building-blocks/metrics#intermediate-using-ai-feedback-for-your-metric
class AssessBasedOnQuestion(dspy.Signature):
    """Given the assessed text, provide a yes or no to the assessment question. Do not provide any other text."""

    assessed_text = dspy.InputField(format=str)
    assessment_question = dspy.InputField(format=str)
    assessment_answer = dspy.OutputField(desc="yes or no")

Again, this is just a prompt...

In [331]:
print(signature_to_template(AssessBasedOnQuestion).guidelines())

Follow the following format.

Assessed Text: ${assessed_text}
Assessment Question: ${assessment_question}
Assessment Answer: yes or no


In [332]:
example_question_assessment = dspy.Example(assessed_text="This is a test.", assessment_question="Is this a test?", assessment_answer="yes").with_inputs("assessed_text", "assessment_question")
print(signature_to_template(AssessBasedOnQuestion).query(example_question_assessment))

Assessed Text: This is a test.
Assessment Question: Is this a test?
Assessment Answer: yes


> one note, it's technically, I believe, a `Prediction` object. But Predictions [code reference](https://dspy-docs.vercel.app/docs/deep-dive/signature/executing-signatures#how-predict-works) mirror example functionality.

In [333]:
def similarity_metric(truth, pred, trace=None):
    truth_grug_text = truth.grug_text
    proposed_grug_text = pred.grug_text
    similarity_question = f"""Does the assessed text have the same meaning as the gold standard text?

Gold Standard: "{truth_grug_text}"

Provide only a yes or no answer."""

    with dspy.context(lm=gpt4T):
        assessor = dspy.Predict(AssessBasedOnQuestion)
        raw_similarity_result = assessor(assessed_text=proposed_grug_text, assessment_question=similarity_question)
    raw_similarity = raw_similarity_result.assessment_answer.lower().strip()
    if len(raw_similarity) > 3:
        print(raw_similarity)
    same_meaning = raw_similarity == 'yes'
    return same_meaning

In [334]:
def overall_metric(provided_example, predicted, trace=None):
    similarity = similarity_metric(provided_example, predicted, trace)
    ari = ari_metric(provided_example, predicted, trace)

    if similarity and ari:
        return True
    return False

In [335]:
from dspy.evaluate import Evaluate
individual_metrics = [similarity_metric, ari_metric]

def run_evals(optimized_model):
    for metric in individual_metrics:
        evaluate = Evaluate(metric=metric, devset=test[:5], num_threads=1, display_table=5)
        evaluate(optimized_model)

def run_overall_eval(optimized_model):
    evaluate = Evaluate(metric=overall_metric, devset=test[:5], num_threads=1, display_table=5)
    return evaluate(optimized_model)

In [336]:
manual_test_case = "I don't like to construct complex systems. Building these systems are difficult and error prone."

In [337]:
class CoT(dspy.Module):
    def __init__(self):
        super().__init__()
        self.prog = dspy.ChainOfThought(GrugTranslation)
    
    def forward(self, plain_english):
        return self.prog(plain_english=plain_english)

In [338]:
zeroshot = CoT()
zeroshot.forward(plain_english=manual_test_case)

Prediction(
    rationale='produce the grug_text. We will simplify the language and focus on basic concepts.',
    grug_text='Grug no like make big things. Big things hard and easy to mess up.'
)

In [339]:
run_evals(zeroshot)

Unnamed: 0,example_grug_text,plain_english,rationale,pred_grug_text,similarity_metric
0,"this make it ok for junior grugs to admit too complex and not understand as well, often such case! FOLD major source of complexity demon...","This emphasizes that it's okay for novice developers like me to acknowledge when something is too complex to understand fully. In such situations, the complexity...",produce the grug_text. We will simplify the language and focus on basic concepts that Grug can understand.,"Me not smart like big developers. Me say okay to not understand hard things. Hard things make head hurt. Me young developer, so hard things...",False
1,"""no, grug not build that abstraction""","I, Grug, did not create that abstract concept or idea.",produce the grug_text. We need to simplify the language and make it sound like Grug is speaking.,Grug not make that thing with big words or thoughts.,False
2,"in-between tests, grug hear shaman call ""integration tests"" sometime often with sour look on face. but grug say integration test sweet spot according to grug:...","During my work, I often hear experienced colleagues mention ""integration tests"" with a hint of disapproval on their faces. However, I believe that integration tests...",produce the grug_text. We will simplify the language and focus on the basic idea of testing systems.,"Me work, me hear smart friends talk about ""integration tests"" with funny face. But me think integration tests important. They check if system work right...",False
3,"many developers Fear Of Looking Dumb (FOLD), grug also at one time FOLD, but grug learn get over: very important senior grug say ""this too...","As a software developer, I understand the fear of looking ignorant when faced with complex problems. I used to experience this fear myself. However, over...",produce the grug_text. We will simplify the language and focus on basic concepts that Grug can understand.,Grug not smart like software maker. Grug scared of hard problems. Grug ask for help and learn. It okay to say when something too hard.,False
4,temptation generics very large is trick! spirit demon complex love this one trick! beware!,"The urge to use generics extensively can be tempting, but it can be deceptive and lead us into a complex and challenging situation. I believe...",produce the grug_text. We will simplify the idea of using generics and warn others about the dangers of using them too much.,"Grug think using fancy things like generics can be tricky. Grug say be careful, too much can make big problem.",False


Unnamed: 0,example_grug_text,plain_english,rationale,pred_grug_text,ari_metric
0,"this make it ok for junior grugs to admit too complex and not understand as well, often such case! FOLD major source of complexity demon...","This emphasizes that it's okay for novice developers like me to acknowledge when something is too complex to understand fully. In such situations, the complexity...",produce the grug_text. We will simplify the language and focus on basic concepts that Grug can understand.,"Me not smart like big developers. Me say okay to not understand hard things. Hard things make head hurt. Me young developer, so hard things...",False
1,"""no, grug not build that abstraction""","I, Grug, did not create that abstract concept or idea.",produce the grug_text. We need to simplify the language and make it sound like Grug is speaking.,Grug not make that thing with big words or thoughts.,✔️ [True]
2,"in-between tests, grug hear shaman call ""integration tests"" sometime often with sour look on face. but grug say integration test sweet spot according to grug:...","During my work, I often hear experienced colleagues mention ""integration tests"" with a hint of disapproval on their faces. However, I believe that integration tests...",produce the grug_text. We will simplify the language and focus on the basic idea of testing systems.,"Me work, me hear smart friends talk about ""integration tests"" with funny face. But me think integration tests important. They check if system work right...",False
3,"many developers Fear Of Looking Dumb (FOLD), grug also at one time FOLD, but grug learn get over: very important senior grug say ""this too...","As a software developer, I understand the fear of looking ignorant when faced with complex problems. I used to experience this fear myself. However, over...",produce the grug_text. We will simplify the language and focus on basic concepts that Grug can understand.,Grug not smart like software maker. Grug scared of hard problems. Grug ask for help and learn. It okay to say when something too hard.,False
4,temptation generics very large is trick! spirit demon complex love this one trick! beware!,"The urge to use generics extensively can be tempting, but it can be deceptive and lead us into a complex and challenging situation. I believe...",produce the grug_text. We will simplify the idea of using generics and warn others about the dangers of using them too much.,"Grug think using fancy things like generics can be tricky. Grug say be careful, too much can make big problem.",False


## The cheatsheet is now our best friend :)
https://dspy-docs.vercel.app/docs/cheatsheet#dspy-optimizers

In [340]:
from dspy.teleprompt import LabeledFewShot
config = dict(k=8)
optimizer = LabeledFewShot(**config)
optimizer.max_errors=1
fewshot = optimizer.compile(CoT(), trainset=train)

In [341]:
fewshot.forward("I don't like to construct complex systems. Building these systems are difficult and error prone.")

Prediction(
    rationale="produce the grug_text. We want to emphasize Grug's preference for simplicity and his dislike for complicated things.",
    grug_text='grug no like make big hard system. make system hard and many mistake happen.'
)

In [342]:
run_evals(fewshot)

Unnamed: 0,example_grug_text,plain_english,rationale,pred_grug_text,similarity_metric
0,"this make it ok for junior grugs to admit too complex and not understand as well, often such case! FOLD major source of complexity demon...","This emphasizes that it's okay for novice developers like me to acknowledge when something is too complex to understand fully. In such situations, the complexity...",produce the grug_text. We need to simplify the idea of acknowledging complexity and feeling overwhelmed.,"grug say, it okay for new grugs like me to admit when thing too hard to understand. when task too much, feel like no control....",False
1,"""no, grug not build that abstraction""","I, Grug, did not create that abstract concept or idea.",produce the grug_text. We need to simplify the language and make it sound like Grug is talking about himself in the third person.,Grug not make that fancy idea or thing.,False
2,"in-between tests, grug hear shaman call ""integration tests"" sometime often with sour look on face. but grug say integration test sweet spot according to grug:...","During my work, I often hear experienced colleagues mention ""integration tests"" with a hint of disapproval on their faces. However, I believe that integration tests...",produce the grug_text. We will simplify the concept of integration tests and emphasize their importance in a straightforward manner.,"grug hear wise grug talk about ""integration tests"" with frown, but grug think they very important! they check system right and help find bugs with...",False
3,"many developers Fear Of Looking Dumb (FOLD), grug also at one time FOLD, but grug learn get over: very important senior grug say ""this too...","As a software developer, I understand the fear of looking ignorant when faced with complex problems. I used to experience this fear myself. However, over...",produce the grug_text. We will simplify the idea of fear of looking ignorant and the importance of seeking help for growth.,grug know fear of looking dumb when see hard problem. grug feel same way before. but grug learn ask for help important for grow. okay...,✔️ [True]
4,temptation generics very large is trick! spirit demon complex love this one trick! beware!,"The urge to use generics extensively can be tempting, but it can be deceptive and lead us into a complex and challenging situation. I believe...",produce the grug_text. We will simplify the idea of using generics extensively and warn others about the dangers of doing so.,"grug see generics, grug want use everywhere, but that tricky path lead to many rocks and hard place. grug think hidden dangers in using generics...",False


Unnamed: 0,example_grug_text,plain_english,rationale,pred_grug_text,ari_metric
0,"this make it ok for junior grugs to admit too complex and not understand as well, often such case! FOLD major source of complexity demon...","This emphasizes that it's okay for novice developers like me to acknowledge when something is too complex to understand fully. In such situations, the complexity...",produce the grug_text. We need to simplify the idea of acknowledging complexity and feeling overwhelmed.,"grug say, it okay for new grugs like me to admit when thing too hard to understand. when task too much, feel like no control....",False
1,"""no, grug not build that abstraction""","I, Grug, did not create that abstract concept or idea.",produce the grug_text. We need to simplify the language and make it sound like Grug is talking about himself in the third person.,Grug not make that fancy idea or thing.,✔️ [True]
2,"in-between tests, grug hear shaman call ""integration tests"" sometime often with sour look on face. but grug say integration test sweet spot according to grug:...","During my work, I often hear experienced colleagues mention ""integration tests"" with a hint of disapproval on their faces. However, I believe that integration tests...",produce the grug_text. We will simplify the concept of integration tests and emphasize their importance in a straightforward manner.,"grug hear wise grug talk about ""integration tests"" with frown, but grug think they very important! they check system right and help find bugs with...",False
3,"many developers Fear Of Looking Dumb (FOLD), grug also at one time FOLD, but grug learn get over: very important senior grug say ""this too...","As a software developer, I understand the fear of looking ignorant when faced with complex problems. I used to experience this fear myself. However, over...",produce the grug_text. We will simplify the idea of fear of looking ignorant and the importance of seeking help for growth.,grug know fear of looking dumb when see hard problem. grug feel same way before. but grug learn ask for help important for grow. okay...,False
4,temptation generics very large is trick! spirit demon complex love this one trick! beware!,"The urge to use generics extensively can be tempting, but it can be deceptive and lead us into a complex and challenging situation. I believe...",produce the grug_text. We will simplify the idea of using generics extensively and warn others about the dangers of doing so.,"grug see generics, grug want use everywhere, but that tricky path lead to many rocks and hard place. grug think hidden dangers in using generics...",False


In [343]:
from dspy.teleprompt import BootstrapFewShot

config = dict(max_bootstrapped_demos=5, max_labeled_demos=5)
teleprompter = BootstrapFewShot(metric=overall_metric, **config)
teleprompter.max_errors = 1
bfewshot = teleprompter.compile(CoT(), trainset=train, valset=test)

100%|█████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 609.77it/s]

assessment answer: yes
assessment answer: yes
assessment answer: no





In [344]:
run_evals(optimized_cot)

Unnamed: 0,example_grug_text,plain_english,rationale,pred_grug_text,similarity_metric
0,"this make it ok for junior grugs to admit too complex and not understand as well, often such case! FOLD major source of complexity demon...","This emphasizes that it's okay for novice developers like me to acknowledge when something is too complex to understand fully. In such situations, the complexity...",produce the grug_text. We will simplify the language and focus on the struggles of novice developers in understanding complex tasks.,"this make it ok for junior grugs to admit too complex and not understand as well, often such case! FOLD major source of complexity demon...",✔️ [True]
1,"""no, grug not build that abstraction""","I, Grug, did not create that abstract concept or idea.",produce the grug_text. We will simplify the language and emphasize Grug's lack of understanding complex concepts.,Grug not make that big idea or thing.,False
2,"in-between tests, grug hear shaman call ""integration tests"" sometime often with sour look on face. but grug say integration test sweet spot according to grug:...","During my work, I often hear experienced colleagues mention ""integration tests"" with a hint of disapproval on their faces. However, I believe that integration tests...",produce the grug_text. We will simplify the language and focus on the importance of integration tests in software development.,"in-between tests, grug hear shaman call ""integration tests"" sometime often with sour look on face. but grug say integration test sweet spot according to grug:...",✔️ [True]
3,"many developers Fear Of Looking Dumb (FOLD), grug also at one time FOLD, but grug learn get over: very important senior grug say ""this too...","As a software developer, I understand the fear of looking ignorant when faced with complex problems. I used to experience this fear myself. However, over...",produce the grug_text. We will simplify the language and focus on the importance of seeking help in software development.,"grug as software maker, grug know fear of not knowing big problem. grug used to feel same. but grug learn ask for help important for...",False
4,temptation generics very large is trick! spirit demon complex love this one trick! beware!,"The urge to use generics extensively can be tempting, but it can be deceptive and lead us into a complex and challenging situation. I believe...",produce the grug_text. We will simplify the language and focus on the caution against using generics too much.,"grug feel strong urge to use generics, but can be tricky and lead to big problem. grug think hidden risks in using too much generics,...",False


Unnamed: 0,example_grug_text,plain_english,rationale,pred_grug_text,ari_metric
0,"this make it ok for junior grugs to admit too complex and not understand as well, often such case! FOLD major source of complexity demon...","This emphasizes that it's okay for novice developers like me to acknowledge when something is too complex to understand fully. In such situations, the complexity...",produce the grug_text. We will simplify the language and focus on the struggles of novice developers in understanding complex tasks.,"this make it ok for junior grugs to admit too complex and not understand as well, often such case! FOLD major source of complexity demon...",False
1,"""no, grug not build that abstraction""","I, Grug, did not create that abstract concept or idea.",produce the grug_text. We will simplify the language and emphasize Grug's lack of understanding complex concepts.,Grug not make that big idea or thing.,✔️ [True]
2,"in-between tests, grug hear shaman call ""integration tests"" sometime often with sour look on face. but grug say integration test sweet spot according to grug:...","During my work, I often hear experienced colleagues mention ""integration tests"" with a hint of disapproval on their faces. However, I believe that integration tests...",produce the grug_text. We will simplify the language and focus on the importance of integration tests in software development.,"in-between tests, grug hear shaman call ""integration tests"" sometime often with sour look on face. but grug say integration test sweet spot according to grug:...",False
3,"many developers Fear Of Looking Dumb (FOLD), grug also at one time FOLD, but grug learn get over: very important senior grug say ""this too...","As a software developer, I understand the fear of looking ignorant when faced with complex problems. I used to experience this fear myself. However, over...",produce the grug_text. We will simplify the language and focus on the importance of seeking help in software development.,"grug as software maker, grug know fear of not knowing big problem. grug used to feel same. but grug learn ask for help important for...",False
4,temptation generics very large is trick! spirit demon complex love this one trick! beware!,"The urge to use generics extensively can be tempting, but it can be deceptive and lead us into a complex and challenging situation. I believe...",produce the grug_text. We will simplify the language and focus on the caution against using generics too much.,"grug feel strong urge to use generics, but can be tricky and lead to big problem. grug think hidden risks in using too much generics,...",False


In [345]:
run_overall_eval(zeroshot), run_overall_eval(fewshot), run_overall_eval(bfewshot)

Unnamed: 0,example_grug_text,plain_english,rationale,pred_grug_text,overall_metric
0,"this make it ok for junior grugs to admit too complex and not understand as well, often such case! FOLD major source of complexity demon...","This emphasizes that it's okay for novice developers like me to acknowledge when something is too complex to understand fully. In such situations, the complexity...",produce the grug_text. We will simplify the language and focus on basic concepts that Grug can understand.,"Me not smart like big developers. Me say okay to not understand hard things. Hard things make head hurt. Me young developer, so hard things...",False
1,"""no, grug not build that abstraction""","I, Grug, did not create that abstract concept or idea.",produce the grug_text. We need to simplify the language and make it sound like Grug is speaking.,Grug not make that thing with big words or thoughts.,False
2,"in-between tests, grug hear shaman call ""integration tests"" sometime often with sour look on face. but grug say integration test sweet spot according to grug:...","During my work, I often hear experienced colleagues mention ""integration tests"" with a hint of disapproval on their faces. However, I believe that integration tests...",produce the grug_text. We will simplify the language and focus on the basic idea of testing systems.,"Me work, me hear smart friends talk about ""integration tests"" with funny face. But me think integration tests important. They check if system work right...",False
3,"many developers Fear Of Looking Dumb (FOLD), grug also at one time FOLD, but grug learn get over: very important senior grug say ""this too...","As a software developer, I understand the fear of looking ignorant when faced with complex problems. I used to experience this fear myself. However, over...",produce the grug_text. We will simplify the language and focus on basic concepts that Grug can understand.,Grug not smart like software maker. Grug scared of hard problems. Grug ask for help and learn. It okay to say when something too hard.,False
4,temptation generics very large is trick! spirit demon complex love this one trick! beware!,"The urge to use generics extensively can be tempting, but it can be deceptive and lead us into a complex and challenging situation. I believe...",produce the grug_text. We will simplify the idea of using generics and warn others about the dangers of using them too much.,"Grug think using fancy things like generics can be tricky. Grug say be careful, too much can make big problem.",False


Unnamed: 0,example_grug_text,plain_english,rationale,pred_grug_text,overall_metric
0,"this make it ok for junior grugs to admit too complex and not understand as well, often such case! FOLD major source of complexity demon...","This emphasizes that it's okay for novice developers like me to acknowledge when something is too complex to understand fully. In such situations, the complexity...",produce the grug_text. We need to simplify the idea of acknowledging complexity and feeling overwhelmed.,"grug say, it okay for new grugs like me to admit when thing too hard to understand. when task too much, feel like no control....",False
1,"""no, grug not build that abstraction""","I, Grug, did not create that abstract concept or idea.",produce the grug_text. We need to simplify the language and make it sound like Grug is talking about himself in the third person.,Grug not make that fancy idea or thing.,False
2,"in-between tests, grug hear shaman call ""integration tests"" sometime often with sour look on face. but grug say integration test sweet spot according to grug:...","During my work, I often hear experienced colleagues mention ""integration tests"" with a hint of disapproval on their faces. However, I believe that integration tests...",produce the grug_text. We will simplify the concept of integration tests and emphasize their importance in a straightforward manner.,"grug hear wise grug talk about ""integration tests"" with frown, but grug think they very important! they check system right and help find bugs with...",False
3,"many developers Fear Of Looking Dumb (FOLD), grug also at one time FOLD, but grug learn get over: very important senior grug say ""this too...","As a software developer, I understand the fear of looking ignorant when faced with complex problems. I used to experience this fear myself. However, over...",produce the grug_text. We will simplify the idea of fear of looking ignorant and the importance of seeking help for growth.,grug know fear of looking dumb when see hard problem. grug feel same way before. but grug learn ask for help important for grow. okay...,False
4,temptation generics very large is trick! spirit demon complex love this one trick! beware!,"The urge to use generics extensively can be tempting, but it can be deceptive and lead us into a complex and challenging situation. I believe...",produce the grug_text. We will simplify the idea of using generics extensively and warn others about the dangers of doing so.,"grug see generics, grug want use everywhere, but that tricky path lead to many rocks and hard place. grug think hidden dangers in using generics...",False


Unnamed: 0,example_grug_text,plain_english,rationale,pred_grug_text,overall_metric
0,"this make it ok for junior grugs to admit too complex and not understand as well, often such case! FOLD major source of complexity demon...","This emphasizes that it's okay for novice developers like me to acknowledge when something is too complex to understand fully. In such situations, the complexity...",produce the grug_text. We will simplify the language and focus on basic concepts.,"this make it ok for junior grugs to admit too complex and not understand as well, often such case! FOLD major source of complexity demon...",False
1,"""no, grug not build that abstraction""","I, Grug, did not create that abstract concept or idea.",produce the grug_text. We need to simplify the language and make it sound like Grug is speaking.,Grug not make that big idea or thought.,False
2,"in-between tests, grug hear shaman call ""integration tests"" sometime often with sour look on face. but grug say integration test sweet spot according to grug:...","During my work, I often hear experienced colleagues mention ""integration tests"" with a hint of disapproval on their faces. However, I believe that integration tests...",produce the grug_text. We will simplify the language and focus on the basic idea of integration tests being important.,"in-between tests, grug hear shaman call ""integration tests"" sometime often with sour look on face. but grug say integration test sweet spot according to grug:...",False
3,"many developers Fear Of Looking Dumb (FOLD), grug also at one time FOLD, but grug learn get over: very important senior grug say ""this too...","As a software developer, I understand the fear of looking ignorant when faced with complex problems. I used to experience this fear myself. However, over...",produce the grug_text. We will simplify the language and focus on basic concepts.,"as software developer, grug know fear of look dumb when face big problem. grug feel this fear before. but grug learn ask for help important...",False
4,temptation generics very large is trick! spirit demon complex love this one trick! beware!,"The urge to use generics extensively can be tempting, but it can be deceptive and lead us into a complex and challenging situation. I believe...",produce the grug_text. We need to simplify the language and focus on the basic idea of cautioning against using generics too freely.,temptation generics very large is trick! spirit demon complex love this one trick! beware!,False


(0.0, 0.0, 0.0)

# Let's analyze our optimizations so far

- V1 (LabeledFewShot) - kind of shitty. Didn't really work all that well.
- V2 (BootstrapFewShot) - OK, but not doing perfectly.

Our metric results are not great - so, what's going on here?

I believe that we're failing on the readability score because it's looking for "sentences". That's a bit problem because it's make it so that when we train against that metric, we're getting worse results.

## Building a specific optimization

DSPy gives us this information, and makes it easy to debug...

In [346]:
def speak_third_person(gold, pred, trace=None):
    if "grug" not in pred.grug_text.lower():
        return False
    return True

In [347]:
from dspy.teleprompt import BootstrapFewShot

config = dict(max_bootstrapped_demos=5, max_labeled_demos=5)
teleprompter = BootstrapFewShot(metric=speak_third_person, **config)
teleprompter.max_errors = 1
bfewshot = teleprompter.compile(CoT(), trainset=train, valset=test)

 25%|███████████████▎                                             | 5/20 [00:00<00:00, 1647.93it/s]


In [348]:
def run_detailed_eval(optimized_model, metric):
    evaluate = Evaluate(metric=metric, devset=test[:5], num_threads=1, display_table=5)
    return evaluate(optimized_model)

In [349]:
run_detailed_eval(bfewshot, speak_third_person)

Unnamed: 0,example_grug_text,plain_english,rationale,pred_grug_text,speak_third_person
0,"this make it ok for junior grugs to admit too complex and not understand as well, often such case! FOLD major source of complexity demon...","This emphasizes that it's okay for novice developers like me to acknowledge when something is too complex to understand fully. In such situations, the complexity...",produce the grug_text. We will simplify the language and focus on the idea that it's okay to admit when something is too hard to understand.,"it okay for new developers like grug to say when something too hard. task too much, feel like no control. hard for young developers like...",✔️ [True]
1,"""no, grug not build that abstraction""","I, Grug, did not create that abstract concept or idea.",produce the grug_text. We will simplify the language and focus on Grug denying making something complicated.,"me, Grug, not make that fancy idea.",✔️ [True]
2,"in-between tests, grug hear shaman call ""integration tests"" sometime often with sour look on face. but grug say integration test sweet spot according to grug:...","During my work, I often hear experienced colleagues mention ""integration tests"" with a hint of disapproval on their faces. However, I believe that integration tests...",produce the grug_text. We will simplify the language and focus on the importance of testing and finding issues.,"grug hear smart friends talk about ""integration tests"" with frown. grug think tests important to check system and find problems with debugger.",✔️ [True]
3,"many developers Fear Of Looking Dumb (FOLD), grug also at one time FOLD, but grug learn get over: very important senior grug say ""this too...","As a software developer, I understand the fear of looking ignorant when faced with complex problems. I used to experience this fear myself. However, over...",produce the grug_text. We will simplify the language and focus on the basic emotions and lessons learned.,"grug know feeling dumb with hard problems. grug used to feel this, but now grug ask for help to grow. it okay to say something...",✔️ [True]
4,temptation generics very large is trick! spirit demon complex love this one trick! beware!,"The urge to use generics extensively can be tempting, but it can be deceptive and lead us into a complex and challenging situation. I believe...",produce the grug_text. We will simplify the language and focus on the cautionary message about using generics.,"grug want warn about using too much generics, can be tricky and hard. hidden risks, be careful!",✔️ [True]


100.0

This is the power of metrics! We can steer our output and the model picks it up without us ever having to say it!

Of course, that might not be the best metric to use but now we can iterate on it.

In [351]:
import pandas as pd

analysis = dataset[:10].copy()
df = pd.DataFrame.from_dict(analysis)
df['predicted_grug_text'] = df.plain_english.map(lambda x: bfewshot.forward(x).grug_text)

In [352]:
print("\n\n".join(df.plain_english.tolist()))

I, Grug, have compiled a collection of thoughts on software development using my own experience and knowledge. These ideas have been carefully gathered and organized to share insights and information with fellow developers.

I, Grug, may not consider myself the most intelligent developer, but over the years of programming, I have gained some knowledge and experience. However, there are still many aspects that leave me puzzled and uncertain.

As a developer, I try to gather the knowledge I acquire into small, easy-to-understand, and humorous pages. This is not only for you, the young me, but also for myself. As I age, I tend to forget important things like what I had for breakfast or if I remembered to put on pants.

Many developers have high intelligence but not all of them appreciate this, leading to some displaying displeasure.

I believe there are numerous self-proclaimed intelligent developers out there, possibly more than I can count. However, many of them may not actually embody 

In [353]:
print("\n\n".join(df.grug_text.tolist()))

this collection of thoughts on software development gathered by grug brain developer

grug brain developer not so smart, but grug brain developer program many long year and learn some things
although mostly still confused

grug brain developer try collect learns into small, easily digestible and funny page, not only for you, the young grug, but also for him
because as grug brain developer get older he forget important things, like what had for breakfast or if put pants on

big brained developers are many, and some not expected to like this, make sour face

THINK they are big brained developers many, many more, and more even definitely probably maybe not like this, many
sour face (such is internet)

(note: grug once think big brained but learn hard way)

is fine!

is free country sort of and end of day not really matter too much, but grug hope you fun reading and maybe learn from
many, many mistake grug make over long program life

apex predator of grug is complexity

complexity bad


In [354]:
print("\n\n".join(df.predicted_grug_text.tolist()))

me, Grug, put together thoughts on making software from what me know. me share ideas with other developers.

me, Grug, not smart developer, but learn some things over time. still confused about many things.

grug developer, grug try make small, funny pages for young me and me. grug forget things as get old, like breakfast or pants.

many smart people not happy, even though they smart.

grug think many say they smart, but not really. grug feel sad when find out online.

grug think big ideas good, but learn from hard times that big not always mean win.

grug good!

in work, not always free, but details not important. grug hope you like stories, learn from grug mistakes in long programming career.

me, Grug, think hard things biggest challenge, like big predator in software world. complexity dangerous, need much skill to beat.

Grug think complex bad. Grug like simple and clear software.


## Let's test some generalization

In [369]:
generalize_set = """DSPy is a framework for using large language models as programs.
It gives us the ability to steer LLM output without having to directly put things in prompts (if we don't want to).
This power means that we can build all kinds of crazy things without explicit instructions.
Think of the possibilities! But it is a little complex, so we have to be careful.
You can write programs that can work across various language models!"""

In [370]:
print("\n".join([bfewshot.forward(line).grug_text for line in generalize_set.split("\n")]))

DSPy good for big talk models as programs.
it help us control LLM without needing to do extra work with prompts.
this power let us make crazy things without clear rules.
think of things we can do! but it hard, so grug be careful.
you make things work in many languages with programs!


It's not quite working as expected. We aren't generalizing well, and we definitely aren't doing well on semantics of what is said.

That make grug sad.

In next post, grug try improve.

# Conclusion

We can now start to understand the power of metrics. We steered the model to perform the way taht we wanted without explicitly telling it to - just based on a metric and a few examples.

In the next part we'll look at dspy.Suggestions, Assertions, and better metrics to further steer the output.

Follow along for subsequent tutorials on:

1. Automatically optimizing prompts
2. Customizing input to DSPy
3. Saving prompts to use in LangChain or LlamaIndex
4. Tuning and using open source models

Cheers,
[Bill](https://twitter.com/bllchmbrs)

[Learn By Building AI](https://learnbybuilding.ai/?ref=dspy-tutorial)