##### Introduction to DSPy
DSPy gives us a structured way to program foundation models instead of hand-writing prompts. We declare the behaviour we want in code, and DSPy optimizes the resulting instructions for a particular model against a metric we define.

In [19]:
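import dspy
from dotenv import load_dotenv
from dspy.datasets.gsm8k import GSM8K, gsm8k_metric

# Load environment variables (e.g. OPENAI_API_KEY) from a local .env file, if present
load_dotenv()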


In [21]:
# Use an OpenAI completion model as the default LM for all DSPy modules
turbo = dspy.OpenAI(model="gpt-3.5-turbo-instruct", max_tokens=250)
dspy.settings.configure(lm=turbo)

In [22]:
# load dataset
gsm8k = GSM8K()
gsm8k_trainset, gsm8k_devset = gsm8k.train[:10], gsm8k.test[:10]

100%|██████████| 7473/7473 [00:00<00:00, 65117.33it/s]
100%|██████████| 1319/1319 [00:00<00:00, 71934.77it/s]


##### Define the module
With our environment set up, we will now define a custom program that uses the `ChainOfThought` module to reason step by step before producing an answer.


In [23]:
class CoT(dspy.Module):
    def __init__(self):
        super().__init__()
        self.prog = dspy.ChainOfThought("question -> answer")
    
    def forward(self, question):
        return self.prog(question=question)
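
The string `"question -> answer"` is a signature: the names to the left of the arrow are the inputs and the names to the right are the outputs. As a rough illustration of how such a string can be read, here is a minimal sketch. `parse_signature` is a hypothetical helper written for this note, not DSPy's own parser.

```python
def parse_signature(signature: str) -> tuple[list[str], list[str]]:
    """Hypothetical helper: split 'a, b -> c' into input and output field names."""
    inputs, outputs = signature.split("->")
    return (
        [name.strip() for name in inputs.split(",")],
        [name.strip() for name in outputs.split(",")],
    )


inputs, outputs = parse_signature("question -> answer")
print(inputs, outputs)
```

With this signature, the `CoT` module takes a `question` and returns a prediction with an `answer` field, which is why `forward` passes `question=question`.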

##### Compile the model
Now that we have defined our module and loaded our dataset, we can compile the model. Compilation is the step where the optimization takes place: the teleprompter runs the program on the training examples and keeps the traces that pass our metric as few-shot demonstrations. We use `gsm8k_metric` as the metric and the `BootstrapFewShot` teleprompter as the optimizer.
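
A metric here is a function that receives a gold example and a prediction and says whether the prediction is acceptable. The sketch below is a simplified stand-in for that idea, not the library's `gsm8k_metric`: the names `last_integer` and `toy_metric` are hypothetical, and the rule (compare the last integer found in each answer) is an assumption chosen for illustration.

```python
import re
from types import SimpleNamespace


def last_integer(text) -> int:
    """Return the last integer that appears in text, or 0 if there is none."""
    numbers = re.findall(r"-?\d[\d,]*", str(text))
    if not numbers:
        return 0
    # Drop thousands separators such as the comma in "1,200"
    return int(numbers[-1].replace(",", ""))


def toy_metric(gold, pred, trace=None) -> bool:
    """Hypothetical metric: the answers match if their last integers are equal."""
    return last_integer(gold.answer) == last_integer(pred.answer)


gold = SimpleNamespace(answer="72")
pred = SimpleNamespace(answer="She sold 72 clips in total.")
print(toy_metric(gold, pred))
```

Any function with this shape could be passed as `metric=` in place of `gsm8k_metric`, which makes it easy to adapt the same program to a different task.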

In [26]:
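from dspy.teleprompt import BootstrapFewShot

# Use up to 4 bootstrapped (model-generated) demos and up to 4 labeled demos in the prompt
config = dict(max_bootstrapped_demos=4, max_labeled_demos=4)

# Compile the CoT program: successful traces on the trainset become few-shot demos
teleprompter = BootstrapFewShot(metric=gsm8k_metric, **config)
optimized_cot = teleprompter.compile(CoT(), trainset=gsm8k_trainset, valset=gsm8k_devset)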


 50%|█████     | 5/10 [00:00<00:00, 4740.40it/s]

Bootstrapped 4 full traces after 6 examples in round 0.
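
The log line above reflects the general idea of bootstrapping: run the program on training examples, keep the runs that pass the metric, and stop once enough demonstrations have been collected. The following is a minimal sketch of that loop under toy assumptions, not DSPy's implementation. `bootstrap_demos`, `toy_teacher`, and the addition dataset are all hypothetical.

```python
def bootstrap_demos(trainset, teacher, metric, max_bootstrapped_demos=4):
    """Collect passing predictions as demos; return them with the number of examples seen."""
    demos, seen = [], 0
    for example in trainset:
        seen += 1
        prediction = teacher(example["question"])
        if metric(example, prediction):
            demos.append({"question": example["question"], "answer": prediction})
            if len(demos) >= max_bootstrapped_demos:
                break  # enough demos collected, skip the remaining examples
    return demos, seen


def toy_teacher(question):
    """Hypothetical 'model' that adds two numbers but gets two inputs wrong."""
    left, right = (int(part) for part in question.split("+"))
    total = left + right
    return total + 1 if left in (2, 5) else total


toy_trainset = [{"question": f"{n} + {n}", "answer": 2 * n} for n in range(1, 11)]
demos, seen = bootstrap_demos(
    toy_trainset, toy_teacher, lambda example, pred: example["answer"] == pred
)
print(f"Bootstrapped {len(demos)} demos after {seen} examples")
```

In this toy run two of the first six examples fail the metric, so four demos are collected after six examples and the rest of the training set is never visited.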





##### Evaluate
Now that we have a compiled (optimized) DSPy program, let's evaluate its performance on the dev set.


In [27]:
from dspy.evaluate import Evaluate

evaluate = Evaluate(devset=gsm8k_devset, metric=gsm8k_metric, num_threads=4, 
                    display_progress=True, display_table=0)

evaluate(optimized_cot)

Average Metric: 8 / 10  (80.0): 100%|██████████| 10/10 [00:00<00:00, 3867.14it/s]

Average Metric: 8 / 10  (80.0%)





80.0
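
The reported score is the share of dev examples for which the metric accepts the program's prediction, expressed as a percentage (8 of 10 gives 80.0). A minimal sketch of that arithmetic follows. `average_metric`, `toy_program`, and the toy dev set are hypothetical and stand in for `Evaluate`, the compiled program, and the GSM8K examples.

```python
def average_metric(devset, program, metric) -> float:
    """Percentage of examples whose prediction passes the metric."""
    scores = [bool(metric(example, program(example))) for example in devset]
    return round(100 * sum(scores) / len(scores), 1)


# Toy dev set: the expected answer is the square of the input
toy_devset = [{"x": n, "answer": n * n} for n in range(10)]


def toy_program(example):
    """Hypothetical program that is wrong for two of the ten inputs."""
    return -1 if example["x"] in (3, 7) else example["x"] ** 2


score = average_metric(
    toy_devset, toy_program, lambda example, pred: example["answer"] == pred
)
print(score)
```

With only 10 dev examples each prediction moves the score by 10 points, so a larger dev set would give a more stable estimate.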