# DSPy

A worked example of using DSPy on the MATH benchmark.

Please set your OpenAI API key in a local .env file for load_dotenv():
```
OPENAI_API_KEY=sk...
```
or set it explicitly in the environment using
```
import os
os.environ["OPENAI_API_KEY"] = "sk..."
```
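Either way, it can help to fail fast when the key is missing, before any LM call is made. A minimal sketch using only the standard library; `require_api_key` is a hypothetical helper, not part of DSPy or python-dotenv:

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from `env`, or raise a clear error if it is missing."""
    # `env` defaults to the real process environment; a plain dict can be passed in for testing.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to .env or export it first.")
    return key


# Example with an explicit mapping, so nothing here depends on your real environment.
print(require_api_key({"OPENAI_API_KEY": "sk-placeholder"}))
```

Calling `require_api_key()` with no arguments checks the real environment instead.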

Install the latest DSPy via 
```
pip install -U dspy
```

In [1]:
import dspy

gpt4o_mini = dspy.LM('openai/gpt-4o-mini', max_tokens=2000)
gpt4o = dspy.LM('openai/gpt-4o', max_tokens=2000)
dspy.configure(lm=gpt4o_mini)  # we'll use gpt-4o-mini as the default LM, unless otherwise specified

In [2]:
# MATH dataset: https://arxiv.org/abs/2103.03874
# https://github.com/hendrycks/math

!pip install git+https://github.com/hendrycks/math.git

Collecting git+https://github.com/hendrycks/math.git
  Cloning https://github.com/hendrycks/math.git to /private/var/folders/kx/jf5lcf41451__wc2mylscgx80000gn/T/pip-req-build-x0oh5187
  Running command git clone --filter=blob:none --quiet https://github.com/hendrycks/math.git /private/var/folders/kx/jf5lcf41451__wc2mylscgx80000gn/T/pip-req-build-x0oh5187
  Resolved https://github.com/hendrycks/math.git to commit 357963a7f5501a6c1708cf3f3fb0cdf525642761
  Preparing metadata (setup.py) ... done
Installing collected packages: math-equivalence
  DEPRECATION: math-equivalence is being installed using the legacy 'setup.py install' method, because it does not have a 'pyproject.toml' and the 'wheel' package is not installed. pip 23.1 will enforce this behaviour change. A possible replacement is to enable the '--use-pep517' option. Discussion can be found at https://github.com/pypa/pip/issues/8559
  Running setup.py install for math-equivalence ... done

In [3]:
# Next, let's load some data examples from the MATH benchmark.
# We'll use a training split for optimization and evaluate it on a held-out dev set.

from dspy.datasets import MATH

dataset = MATH(subset='algebra')
print(len(dataset.train), len(dataset.dev))

MATH.py:   0%|          | 0.00/4.10k [00:00<?, ?B/s]

0000.parquet:   0%|          | 0.00/505k [00:00<?, ?B/s]

0000.parquet:   0%|          | 0.00/353k [00:00<?, ?B/s]

Generating train split:   0%|          | 0/1744 [00:00<?, ? examples/s]

Generating test split:   0%|          | 0/1187 [00:00<?, ? examples/s]

350 350


In [4]:
# Let's inspect one example from the training set.

example = dataset.train[0]
print("Question:", example.question)
print("Answer:", example.answer)

Question: The doctor has told Cal O'Ree that during his ten weeks of working out at the gym, he can expect each week's weight loss to be $1\%$ of his weight at the end of the previous week. His weight at the beginning of the workouts is $244$ pounds. How many pounds does he expect to weigh at the end of the ten weeks? Express your answer to the nearest whole number.
Answer: 221


In [5]:
# Extremely simple CoT module

module = dspy.ChainOfThought("question -> answer")
module(question=example.question)

Prediction(
    reasoning="Cal O'Ree's weight loss each week is $1\\%$ of his weight at the end of the previous week. This means that at the end of each week, he will weigh $99\\%$ of his weight from the previous week. \n\nWe can express this mathematically. If \\( W_0 \\) is his initial weight, then his weight at the end of week \\( n \\) can be calculated using the formula:\n\n\\[\nW_n = W_0 \\times (0.99)^n\n\\]\n\nwhere \\( n \\) is the number of weeks. \n\nGiven that \\( W_0 = 244 \\) pounds and \\( n = 10 \\):\n\n\\[\nW_{10} = 244 \\times (0.99)^{10}\n\\]\n\nNow we calculate \\( (0.99)^{10} \\):\n\n\\[\n(0.99)^{10} \\approx 0.904382\n\\]\n\nNow we can calculate \\( W_{10} \\):\n\n\\[\nW_{10} \\approx 244 \\times 0.904382 \\approx 220.5\n\\]\n\nRounding to the nearest whole number, Cal O'Ree can expect to weigh approximately \\( 221 \\) pounds at the end of the ten weeks.",
    answer='221'
)
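The reference answer can be checked directly: losing 1% per week means the weight is multiplied by 0.99 each week. A quick sketch of that arithmetic (the variable names are illustrative):

```python
# Weight after n weeks of losing 1% of the previous week's weight.
initial_weight = 244  # pounds, from the question
weeks = 10

final_weight = initial_weight * 0.99 ** weeks
print(round(final_weight, 2))  # unrounded result, to two decimals
print(round(final_weight))     # nearest whole number, as the question asks
```

The rounded value is 221, which agrees with the dataset answer.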

In [7]:
# Inspect the last N prompts
dspy.inspect_history(n=1)

[2024-12-01T15:16:53.346675]

System message:

Your input fields are:
1. `question` (str)

Your output fields are:
1. `reasoning` (str)
2. `answer` (str)

All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## question ## ]]
{question}

[[ ## reasoning ## ]]
{reasoning}

[[ ## answer ## ]]
{answer}

[[ ## completed ## ]]

In adhering to this structure, your objective is: 
        Given the fields `question`, produce the fields `answer`.


User message:

[[ ## question ## ]]
At 50 miles per hour, how far would a car travel in $2\frac{3}{4}$ hours? Express your answer as a mixed number.

Respond with the corresponding output fields, starting with the field `[[ ## reasoning ## ]]`, then `[[ ## answer ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.


Response:

[[ ## reasoning ## ]]
To find the distance traveled by the car, we can use the formula: 

Distance = Speed × Time.

H

In [6]:
# Set up an evaluator for the zero-shot module above, before prompt optimization.

THREADS = 24
kwargs = dict(num_threads=THREADS, display_progress=True, display_table=5)
evaluate = dspy.Evaluate(devset=dataset.dev, metric=dataset.metric, **kwargs)

evaluate(module)

Average Metric: 255.00 / 350 (72.9%): 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 350/350 [01:46<00:00,  3.30it/s]

2024/12/01 15:16:53 INFO dspy.evaluate.evaluate: Average Metric: 255 / 350 (72.9%)





Unnamed: 0,question,example_reasoning,example_answer,pred_reasoning,pred_answer,method
0,What is the smallest integer value of $c$ such that the function $...,The given function has a domain of all real numbers if and only if...,1,To determine the smallest integer value of \( c \) such that the f...,1,✔️ [True]
1,What is the least value of $x$ that is a solution of $|{-x+3}|=7$?,"In order to have $|{-x+3}| = 7$, we must have $-x + 3 = 7$ or $-x ...",-4,"To solve the equation \( |{-x+3}|=7 \), we need to consider the tw...",-4,✔️ [True]
2,Evaluate $\left\lceil -\frac{7}{4}\right\rceil$.,"$-\frac{7}{4}$ is between $-1$ and $-2$, so $\left\lceil -\frac{7}...",-1,"To evaluate \(\left\lceil -\frac{7}{4}\right\rceil\), we first nee...",-1,✔️ [True]
3,"A triangle has vertices at coordinates $(11,1)$, $(2,3)$ and $(3,7...",We must find the distance between each pair of points by using the...,10,To find the length of the longest side of the triangle with vertic...,10,✔️ [True]
4,Let $f(x) = x + 2$ and $g(x) = 1/f(x)$. What is $g(f(-3))$?,"First, we find that $f(-3) = (-3) + 2 = -1$. Then, $$g(f(-3)) = g(...",1,"To find \( g(f(-3)) \), we first need to evaluate \( f(-3) \). The...",1,✔️ [True]


72.86
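The final number is the mean metric over the dev set, expressed as a percentage. Assuming the metric returns 1 for a correct answer and 0 otherwise (consistent with the 255 / 350 tally above), the score can be reproduced as:

```python
# Counts taken from the evaluation log above.
correct = 255
total = 350

score = round(100 * correct / total, 2)
print(score)
```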

In [11]:
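# Optimize the module with MIPROv2: gpt-4o acts as the teacher LM for bootstrapping,
# and gpt-4o-mini proposes the candidate instructions.
THREADS = 2
kwargs = dict(num_threads=THREADS, teacher_settings=dict(lm=gpt4o), prompt_model=gpt4o_mini)
optimizer = dspy.MIPROv2(metric=dataset.metric, auto="medium", **kwargs)

# Cap the bootstrapped and labeled demos per few-shot set, and skip the confirmation prompt.
kwargs = dict(requires_permission_to_run=False, max_bootstrapped_demos=4, max_labeled_demos=4)
optimized_module = optimizer.compile(module, trainset=dataset.train, **kwargs)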

2024/12/01 16:36:07 INFO dspy.teleprompt.mipro_optimizer_v2: 
RUNNING WITH THE FOLLOWING MEDIUM AUTO RUN SETTINGS:
num_trials: 25
minibatch: True
num_candidates: 19
valset size: 280
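The valset size reported here is consistent with an 80/20 split of the 350 training examples: 280 for validation, leaving 70 for bootstrapping (the bootstrapping progress bars in this log count up to 70). A sketch of that arithmetic; the 0.8 fraction is inferred from these logs, not taken from DSPy's documentation:

```python
train_size = 350      # len(dataset.train), printed earlier
val_fraction = 0.8    # assumption inferred from the log, not a documented constant

valset_size = round(train_size * val_fraction)
bootstrap_size = train_size - valset_size
print(valset_size, bootstrap_size)
```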

2024/12/01 16:36:07 INFO dspy.teleprompt.mipro_optimizer_v2: 
==> STEP 1: BOOTSTRAP FEWSHOT EXAMPLES <==
2024/12/01 16:36:07 INFO dspy.teleprompt.mipro_optimizer_v2: These will be used as few-shot example candidates for our program and for creating instructions.

2024/12/01 16:36:07 INFO dspy.teleprompt.mipro_optimizer_v2: Bootstrapping N=19 sets of demonstrations...


Bootstrapping set 1/19
Bootstrapping set 2/19
Bootstrapping set 3/19


  7%|█████████▌                                                                                                                            | 5/70 [00:00<00:00, 578.99it/s]


Bootstrapped 4 full traces after 5 examples for up to 1 rounds, amounting to 5 attempts.
Bootstrapping set 4/19


  6%|███████▋                                                                                                                              | 4/70 [00:00<00:00, 798.38it/s]


Bootstrapped 4 full traces after 4 examples for up to 1 rounds, amounting to 4 attempts.
Bootstrapping set 5/19


  6%|███████▋                                                                                                                              | 4/70 [00:00<00:00, 890.32it/s]


Bootstrapped 3 full traces after 4 examples for up to 1 rounds, amounting to 4 attempts.
Bootstrapping set 6/19


  6%|███████▋                                                                                                                              | 4/70 [00:00<00:00, 715.57it/s]


Bootstrapped 4 full traces after 4 examples for up to 1 rounds, amounting to 4 attempts.
Bootstrapping set 7/19


  7%|█████████▌                                                                                                                            | 5/70 [00:00<00:00, 855.70it/s]


Bootstrapped 4 full traces after 5 examples for up to 1 rounds, amounting to 5 attempts.
Bootstrapping set 8/19


  1%|█▉                                                                                                                                    | 1/70 [00:00<00:00, 710.30it/s]


Bootstrapped 1 full traces after 1 examples for up to 1 rounds, amounting to 1 attempts.
Bootstrapping set 9/19


  3%|███▊                                                                                                                                  | 2/70 [00:00<00:00, 831.13it/s]


Bootstrapped 2 full traces after 2 examples for up to 1 rounds, amounting to 2 attempts.
Bootstrapping set 10/19


  6%|███████▋                                                                                                                              | 4/70 [00:00<00:00, 861.87it/s]


Bootstrapped 3 full traces after 4 examples for up to 1 rounds, amounting to 4 attempts.
Bootstrapping set 11/19


  1%|█▉                                                                                                                                    | 1/70 [00:00<00:00, 680.67it/s]


Bootstrapped 1 full traces after 1 examples for up to 1 rounds, amounting to 1 attempts.
Bootstrapping set 12/19


  4%|█████▋                                                                                                                                | 3/70 [00:00<00:00, 927.67it/s]


Bootstrapped 2 full traces after 3 examples for up to 1 rounds, amounting to 3 attempts.
Bootstrapping set 13/19


  1%|█▉                                                                                                                                    | 1/70 [00:00<00:00, 812.53it/s]


Bootstrapped 1 full traces after 1 examples for up to 1 rounds, amounting to 1 attempts.
Bootstrapping set 14/19


  7%|█████████▌                                                                                                                            | 5/70 [00:00<00:00, 977.74it/s]


Bootstrapped 4 full traces after 5 examples for up to 1 rounds, amounting to 5 attempts.
Bootstrapping set 15/19


  1%|█▉                                                                                                                                    | 1/70 [00:00<00:00, 956.95it/s]


Bootstrapped 1 full traces after 1 examples for up to 1 rounds, amounting to 1 attempts.
Bootstrapping set 16/19


  3%|███▊                                                                                                                                  | 2/70 [00:00<00:00, 850.43it/s]


Bootstrapped 2 full traces after 2 examples for up to 1 rounds, amounting to 2 attempts.
Bootstrapping set 17/19


  1%|█▉                                                                                                                                    | 1/70 [00:00<00:00, 880.97it/s]


Bootstrapped 1 full traces after 1 examples for up to 1 rounds, amounting to 1 attempts.
Bootstrapping set 18/19


  1%|█▉                                                                                                                                     | 1/70 [00:00<00:01, 67.19it/s]


Bootstrapped 1 full traces after 1 examples for up to 1 rounds, amounting to 1 attempts.
Bootstrapping set 19/19


  3%|███▊                                                                                                                                  | 2/70 [00:00<00:00, 720.73it/s]
2024/12/01 16:36:07 INFO dspy.teleprompt.mipro_optimizer_v2: 
==> STEP 2: PROPOSE INSTRUCTION CANDIDATES <==
2024/12/01 16:36:07 INFO dspy.teleprompt.mipro_optimizer_v2: We will use the few-shot examples from the previous step, a generated dataset summary, a summary of the program code, and a randomly selected prompting tip to propose instructions.
2024/12/01 16:36:07 INFO dspy.teleprompt.mipro_optimizer_v2: 
Proposing instructions...

2024/12/01 16:36:07 INFO dspy.teleprompt.mipro_optimizer_v2: Proposed Instructions for Predictor 0:

2024/12/01 16:36:07 INFO dspy.teleprompt.mipro_optimizer_v2: 0: Given the fields `question`, produce the fields `answer`.

2024/12/01 16:36:07 INFO dspy.teleprompt.mipro_optimizer_v2: 1: You are a mathematics tutor. Given the field `question`, provide a detailed step-by-step reasoning p

Bootstrapped 1 full traces after 2 examples for up to 1 rounds, amounting to 2 attempts.
Average Metric: 99.00 / 109 (90.8%):  42%|████████████████████████████████████████                                                       | 118/280 [23:30<32:16, 11.95s/it]
Average Metric: 55.00 / 66 (83.3%):  27%|█████████████████████████▉                                                                       | 75/280 [21:26<58:35, 17.15s/it]
Average Metric: 189.00 / 280 (67.5%): 100%|████████████████████████████████████████████████████████████████████████████████████████████| 280/280 [00:00<00:00, 2490.93it/s]

2024/12/01 16:36:08 INFO dspy.evaluate.evaluate: Average Metric: 189 / 280 (67.5%)
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Default program score: 67.5

2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: ==> STEP 3: FINDING OPTIMAL PROMPT PARAMETERS <==
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: We will evaluate the program over a series of trials with different combinations of instructions and few-shot examples to find the optimal combination using Bayesian Optimization.
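To make the shape of this step concrete, here is a toy sketch of the search loop: each trial picks one (instruction, few-shot set) pair, scores it on a minibatch, and records the result. It uses random sampling in place of Bayesian optimization and a made-up scoring function in place of LM calls, so `toy_score`, `run_trials`, and the scoring rule are illustrative assumptions rather than MIPROv2 internals. Only the sizes (25 trials, 19 candidates, 280 validation examples, minibatches of 25) come from the log.

```python
import random


def toy_score(instruction_id, demo_set_id, minibatch):
    """Stand-in for running the program on a minibatch; returns a percentage."""
    # Hypothetical scoring rule: purely deterministic, no LM calls involved.
    hits = sum((instruction_id + demo_set_id + x) % 3 != 0 for x in minibatch)
    return 100.0 * hits / len(minibatch)


def run_trials(num_trials, num_instructions, num_demo_sets, valset, minibatch_size, seed=0):
    """Sample parameter combinations at random and score each on a minibatch."""
    rng = random.Random(seed)
    history = []
    for _ in range(num_trials):
        combo = (rng.randrange(num_instructions), rng.randrange(num_demo_sets))
        minibatch = rng.sample(valset, minibatch_size)
        history.append((combo, toy_score(*combo, minibatch)))
    return history


history = run_trials(num_trials=25, num_instructions=19, num_demo_sets=19,
                     valset=list(range(280)), minibatch_size=25)
best_combo, best_score = max(history, key=lambda item: item[1])
print(len(history), best_combo, best_score)
```

A real optimizer would use the scores seen so far to decide which combination to try next, rather than sampling uniformly.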

2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 1 / 25 ==



Average Metric: 18.00 / 25 (72.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 2768.59it/s]

2024/12/01 16:36:08 INFO dspy.evaluate.evaluate: Average Metric: 18 / 25 (72.0%)
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 72.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 12', 'Predictor 0: Few-Shot Set 7'].
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 67.5


2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 2 / 25 ==



Average Metric: 23.00 / 25 (92.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 2984.00it/s]

2024/12/01 16:36:08 INFO dspy.evaluate.evaluate: Average Metric: 23 / 25 (92.0%)
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 92.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 10', 'Predictor 0: Few-Shot Set 7'].
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 67.5


2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 3 / 25 ==



Average Metric: 21.00 / 25 (84.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 3124.30it/s]


2024/12/01 16:36:08 INFO dspy.evaluate.evaluate: Average Metric: 21 / 25 (84.0%)
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 84.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 7', 'Predictor 0: Few-Shot Set 18'].
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 67.5


2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 4 / 25 ==


Average Metric: 21.00 / 25 (84.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 2453.61it/s]


2024/12/01 16:36:08 INFO dspy.evaluate.evaluate: Average Metric: 21 / 25 (84.0%)
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 84.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 15', 'Predictor 0: Few-Shot Set 2'].
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 67.5


2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 5 / 25 ==


Average Metric: 22.00 / 25 (88.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 1617.92it/s]


2024/12/01 16:36:08 INFO dspy.evaluate.evaluate: Average Metric: 22 / 25 (88.0%)
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 88.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 8', 'Predictor 0: Few-Shot Set 18'].
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 67.5


2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 6 / 25 ==


Average Metric: 24.00 / 25 (96.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 1818.17it/s]


2024/12/01 16:36:08 INFO dspy.evaluate.evaluate: Average Metric: 24 / 25 (96.0%)
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 96.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 7', 'Predictor 0: Few-Shot Set 1'].
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 67.5


2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 7 / 25 ==


Average Metric: 19.00 / 25 (76.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 1826.28it/s]

2024/12/01 16:36:08 INFO dspy.evaluate.evaluate: Average Metric: 19 / 25 (76.0%)
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 76.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 7', 'Predictor 0: Few-Shot Set 12'].
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 67.5


2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 8 / 25 ==



Average Metric: 20.00 / 25 (80.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 1624.52it/s]

2024/12/01 16:36:08 INFO dspy.evaluate.evaluate: Average Metric: 20 / 25 (80.0%)
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 80.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 11', 'Predictor 0: Few-Shot Set 13'].
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 67.5


2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 9 / 25 ==



Average Metric: 23.00 / 25 (92.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 1917.31it/s]

2024/12/01 16:36:08 INFO dspy.evaluate.evaluate: Average Metric: 23 / 25 (92.0%)
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 92.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 5', 'Predictor 0: Few-Shot Set 4'].
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 67.5


2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 10 / 25 ==



Average Metric: 22.00 / 25 (88.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 3816.75it/s]

2024/12/01 16:36:08 INFO dspy.evaluate.evaluate: Average Metric: 22 / 25 (88.0%)
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 88.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 14', 'Predictor 0: Few-Shot Set 1'].
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5]
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 67.5


2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: ===== Full Eval 1 =====
2024/12/01 16:36:08 INFO dspy.teleprompt.mipro_optimizer_v2: Doing full eval on next top averaging program (Avg Score: 96.0) from minibatch trials...
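The "top averaging program" appears to be the parameter combination with the highest mean minibatch score so far. A minimal sketch of that selection, using the ten trial scores logged above; the grouping logic is an assumption about how the averaging works, not DSPy's actual code:

```python
from collections import defaultdict

# (instruction index, few-shot set index) -> minibatch score, copied from trials 1-10.
trials = [
    ((12, 7), 72.0), ((10, 7), 92.0), ((7, 18), 84.0), ((15, 2), 84.0),
    ((8, 18), 88.0), ((7, 1), 96.0), ((7, 12), 76.0), ((11, 13), 80.0),
    ((5, 4), 92.0), ((14, 1), 88.0),
]

scores_by_combo = defaultdict(list)
for combo, score in trials:
    scores_by_combo[combo].append(score)

# Average each combination's minibatch scores and pick the highest.
averages = {combo: sum(s) / len(s) for combo, s in scores_by_combo.items()}
top_combo = max(averages, key=averages.get)
print(top_combo, averages[top_combo])
```

With these inputs the top combination is Instruction 7 with Few-Shot Set 1 at 96.0, matching the average score reported in the log line.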



Average Metric: 242.00 / 280 (86.4%): 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 280/280 [04:55<00:00,  1.06s/it]

2024/12/01 16:41:04 INFO dspy.evaluate.evaluate: Average Metric: 242 / 280 (86.4%)
2024/12/01 16:41:04 INFO dspy.teleprompt.mipro_optimizer_v2: New best full eval score! Score: 86.43
2024/12/01 16:41:04 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43]
2024/12/01 16:41:04 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 86.43
2024/12/01 16:41:04 INFO dspy.teleprompt.mipro_optimizer_v2: 

2024/12/01 16:41:04 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 11 / 25 ==



Average Metric: 21.00 / 25 (84.0%): 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:37<00:00,  1.50s/it]

2024/12/01 16:41:41 INFO dspy.evaluate.evaluate: Average Metric: 21 / 25 (84.0%)
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 84.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 3', 'Predictor 0: Few-Shot Set 1'].
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0, 84.0]
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43]
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 86.43


2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 12 / 25 ==



Average Metric: 23.00 / 25 (92.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 2691.42it/s]

2024/12/01 16:41:41 INFO dspy.evaluate.evaluate: Average Metric: 23 / 25 (92.0%)
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 92.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 10', 'Predictor 0: Few-Shot Set 2'].
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0, 84.0, 92.0]
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43]
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 86.43


2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 13 / 25 ==



Average Metric: 22.00 / 25 (88.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 3257.76it/s]

2024/12/01 16:41:41 INFO dspy.evaluate.evaluate: Average Metric: 22 / 25 (88.0%)
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 88.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 7', 'Predictor 0: Few-Shot Set 1'].
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0, 84.0, 92.0, 88.0]
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43]
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 86.43


2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 14 / 25 ==



Average Metric: 21.00 / 25 (84.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 3606.95it/s]

2024/12/01 16:41:41 INFO dspy.evaluate.evaluate: Average Metric: 21 / 25 (84.0%)
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 84.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 10', 'Predictor 0: Few-Shot Set 15'].
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0, 84.0, 92.0, 88.0, 84.0]
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43]
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 86.43


2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 15 / 25 ==



Average Metric: 22.00 / 25 (88.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 1231.91it/s]

2024/12/01 16:41:41 INFO dspy.evaluate.evaluate: Average Metric: 22 / 25 (88.0%)
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 88.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 0', 'Predictor 0: Few-Shot Set 7'].





2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0, 84.0, 92.0, 88.0, 84.0, 88.0]
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43]
2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 86.43


2024/12/01 16:41:41 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 16 / 25 ==


Average Metric: 22.00 / 25 (88.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 3617.40it/s]

2024/12/01 16:41:42 INFO dspy.evaluate.evaluate: Average Metric: 22 / 25 (88.0%)
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 88.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 10', 'Predictor 0: Few-Shot Set 7'].
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0, 84.0, 92.0, 88.0, 84.0, 88.0, 88.0]
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43]
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 86.43


2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 17 / 25 ==



Average Metric: 24.00 / 25 (96.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 3815.64it/s]

2024/12/01 16:41:42 INFO dspy.evaluate.evaluate: Average Metric: 24 / 25 (96.0%)
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 96.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 13', 'Predictor 0: Few-Shot Set 10'].
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0, 84.0, 92.0, 88.0, 84.0, 88.0, 88.0, 96.0]
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43]
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 86.43


2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 18 / 25 ==



Average Metric: 24.00 / 25 (96.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 3942.02it/s]

2024/12/01 16:41:42 INFO dspy.evaluate.evaluate: Average Metric: 24 / 25 (96.0%)
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 96.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 17', 'Predictor 0: Few-Shot Set 10'].
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0, 84.0, 92.0, 88.0, 84.0, 88.0, 88.0, 96.0, 96.0]
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43]
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 86.43


2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 19 / 25 ==



Average Metric: 23.00 / 25 (92.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 3432.21it/s]

2024/12/01 16:41:42 INFO dspy.evaluate.evaluate: Average Metric: 23 / 25 (92.0%)
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 92.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 13', 'Predictor 0: Few-Shot Set 10'].
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0, 84.0, 92.0, 88.0, 84.0, 88.0, 88.0, 96.0, 96.0, 92.0]
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43]
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 86.43


2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 20 / 25 ==



Average Metric: 21.00 / 25 (84.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 3607.57it/s]

2024/12/01 16:41:42 INFO dspy.evaluate.evaluate: Average Metric: 21 / 25 (84.0%)
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 84.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 13', 'Predictor 0: Few-Shot Set 5'].
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0, 84.0, 92.0, 88.0, 84.0, 88.0, 88.0, 96.0, 96.0, 92.0, 84.0]
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43]
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 86.43


2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: ===== Full Eval 2 =====
2024/12/01 16:41:42 INFO dspy.teleprompt.mipro_optimizer_v2: Doing full eval on next top averaging program (Avg Score: 96.0) from minibatch trials...


Average Metric: 248.00 / 280 (88.6%): 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 280/280 [07:38<00:00,  1.64s/it]

2024/12/01 16:49:20 INFO dspy.evaluate.evaluate: Average Metric: 248 / 280 (88.6%)
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: [92mNew best full eval score![0m Score: 88.57
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43, 88.57]
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 88.57
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: 

2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 21 / 25 ==



Average Metric: 21.00 / 25 (84.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 3145.57it/s]

2024/12/01 16:49:20 INFO dspy.evaluate.evaluate: Average Metric: 21 / 25 (84.0%)
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 84.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 1', 'Predictor 0: Few-Shot Set 16'].
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0, 84.0, 92.0, 88.0, 84.0, 88.0, 88.0, 96.0, 96.0, 92.0, 84.0, 84.0]
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43, 88.57]
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 88.57


2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 22 / 25 ==



Average Metric: 24.00 / 25 (96.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 3741.44it/s]

2024/12/01 16:49:20 INFO dspy.evaluate.evaluate: Average Metric: 24 / 25 (96.0%)
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 96.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 17', 'Predictor 0: Few-Shot Set 3'].
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0, 84.0, 92.0, 88.0, 84.0, 88.0, 88.0, 96.0, 96.0, 92.0, 84.0, 84.0, 96.0]
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43, 88.57]
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 88.57


2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 23 / 25 ==



Average Metric: 22.00 / 25 (88.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 3928.13it/s]

2024/12/01 16:49:20 INFO dspy.evaluate.evaluate: Average Metric: 22 / 25 (88.0%)
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 88.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 9', 'Predictor 0: Few-Shot Set 10'].
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0, 84.0, 92.0, 88.0, 84.0, 88.0, 88.0, 96.0, 96.0, 92.0, 84.0, 84.0, 96.0, 88.0]
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43, 88.57]
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 88.57


2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 24 / 25 ==



Average Metric: 23.00 / 25 (92.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 3883.90it/s]

2024/12/01 16:49:20 INFO dspy.evaluate.evaluate: Average Metric: 23 / 25 (92.0%)
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 92.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 18', 'Predictor 0: Few-Shot Set 10'].
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0, 84.0, 92.0, 88.0, 84.0, 88.0, 88.0, 96.0, 96.0, 92.0, 84.0, 84.0, 96.0, 88.0, 92.0]
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43, 88.57]
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 88.57


2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 25 / 25 ==



Average Metric: 20.00 / 25 (80.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 3625.15it/s]

2024/12/01 16:49:20 INFO dspy.evaluate.evaluate: Average Metric: 20 / 25 (80.0%)
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 80.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 17', 'Predictor 0: Few-Shot Set 17'].
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [72.0, 92.0, 84.0, 84.0, 88.0, 96.0, 76.0, 80.0, 92.0, 88.0, 84.0, 92.0, 88.0, 84.0, 88.0, 88.0, 96.0, 96.0, 92.0, 84.0, 84.0, 96.0, 88.0, 92.0, 80.0]
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43, 88.57]
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 88.57


2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: ===== Full Eval 3 =====
2024/12/01 16:49:20 INFO dspy.teleprompt.mipro_optimizer_v2: Doing full eval on next top averaging program (Avg Score: 96.0) from minibatch trials...



Average Metric: 221.00 / 280 (78.9%): 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 280/280 [09:16<00:00,  1.99s/it]

2024/12/01 16:58:37 INFO dspy.evaluate.evaluate: Average Metric: 221 / 280 (78.9%)
2024/12/01 16:58:37 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [67.5, 86.43, 88.57, 78.93]
2024/12/01 16:58:37 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 88.57
2024/12/01 16:58:37 INFO dspy.teleprompt.mipro_optimizer_v2: 

2024/12/01 16:58:37 INFO dspy.teleprompt.mipro_optimizer_v2: Returning best identified program with score 88.57!
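
The log lines "Doing full eval on next top averaging program" suggest that each full evaluation goes to the parameter combination with the best average minibatch score that has not yet been fully evaluated. The sketch below illustrates that idea using only the trials visible in this part of the log (18 to 25). It is an assumption about the selection rule, not the optimizer's actual code: `next_full_eval_candidate` is a hypothetical helper, and the log does not state which combination each full eval really used.

```python
from collections import defaultdict

# (instruction index, few-shot set index) -> minibatch score, copied from
# trials 18-25 in the log above. Earlier trials are not included here.
visible_trials = [
    ((17, 10), 96.0),
    ((13, 10), 92.0),
    ((13, 5), 84.0),
    ((1, 16), 84.0),
    ((17, 3), 96.0),
    ((9, 10), 88.0),
    ((18, 10), 92.0),
    ((17, 17), 80.0),
]


def next_full_eval_candidate(trials, already_evaluated):
    """Hypothetical helper: pick the combination with the highest mean
    minibatch score among those that have not had a full evaluation."""
    scores = defaultdict(list)
    for config, score in trials:
        scores[config].append(score)
    averages = {
        config: sum(values) / len(values)
        for config, values in scores.items()
        if config not in already_evaluated
    }
    # Ties are broken by first appearance in the trial list.
    return max(averages, key=averages.get)


first = next_full_eval_candidate(visible_trials[:3], set())
second = next_full_eval_candidate(visible_trials, {first})
print(first)   # (17, 10)
print(second)  # (17, 3)
```

Under this assumed rule, a combination that scored 96.0 on a single minibatch of 25 can still do much worse on the full set, which is consistent with the third full eval above landing at 78.93.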





In [14]:
THREADS = 2
kwargs = dict(num_threads=THREADS, display_progress=True, display_table=5)
evaluate = dspy.Evaluate(devset=dataset.dev, metric=dataset.metric, **kwargs)
evaluate(optimized_module)

Average Metric: 313.00 / 350 (89.4%): 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 350/350 [03:42<00:00,  1.57it/s]

2024/12/01 17:11:25 INFO dspy.evaluate.evaluate: Average Metric: 313 / 350 (89.4%)





Unnamed: 0,question,example_reasoning,example_answer,pred_reasoning,pred_answer,method
0,What is the smallest integer value of $c$ such that the function $...,The given function has a domain of all real numbers if and only if...,1,To determine the smallest integer value of \( c \) such that the f...,1,✔️ [True]
1,What is the least value of $x$ that is a solution of $|{-x+3}|=7$?,"In order to have $|{-x+3}| = 7$, we must have $-x + 3 = 7$ or $-x ...",-4,The equation \( |{-x+3}|=7 \) implies two possible cases: 1. \(-x ...,-4,✔️ [True]
2,Evaluate $\left\lceil -\frac{7}{4}\right\rceil$.,"$-\frac{7}{4}$ is between $-1$ and $-2$, so $\left\lceil -\frac{7}...",-1,"To evaluate \(\left\lceil -\frac{7}{4}\right\rceil\), we first nee...",-1,✔️ [True]
3,"A triangle has vertices at coordinates $(11,1)$, $(2,3)$ and $(3,7...",We must find the distance between each pair of points by using the...,10,To find the length of the sides of the triangle formed by the vert...,10,✔️ [True]
4,Let $f(x) = x + 2$ and $g(x) = 1/f(x)$. What is $g(f(-3))$?,"First, we find that $f(-3) = (-3) + 2 = -1$. Then, $$g(f(-3)) = g(...",1,"To find \( g(f(-3)) \), we first need to evaluate \( f(-3) \). The...",1,✔️ [True]


89.43
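
The scores reported above are plain accuracy percentages, so they can be recomputed from the correct/total counts in the log. The sketch below does that and also estimates how noisy a score is at each sample size. The helper names `percent` and `standard_error` are made up for this illustration, and the standard error uses a simple binomial approximation that assumes examples are independent.

```python
import math


def percent(correct, total):
    """Accuracy as a percentage, rounded the way the log reports it."""
    return round(100 * correct / total, 2)


def standard_error(score_percent, n):
    """Binomial standard error of an accuracy score, in percentage points."""
    p = score_percent / 100
    return 100 * math.sqrt(p * (1 - p) / n)


print(percent(248, 280))  # 88.57, full eval 2 on the validation set
print(percent(221, 280))  # 78.93, full eval 3 on the validation set
print(percent(313, 350))  # 89.43, final evaluation on the dev set

# A minibatch of 25 is a much noisier estimate than a full eval of 280.
print(standard_error(96.0, 25))
print(standard_error(88.57, 280))
```

With 25 examples, each question is worth 4 points and the standard error is roughly 4 points, so minibatch scores between 84.0 and 96.0 are hard to tell apart. The full evaluations on 280 examples cut that uncertainty to about 2 points.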

In [15]:
dspy.inspect_history()





[2024-12-01T17:11:25.594109]

System message:

Your input fields are:
1. `question` (str)

Your output fields are:
1. `reasoning` (str)
2. `answer` (str)

All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## question ## ]]
{question}

[[ ## reasoning ## ]]
{reasoning}

[[ ## answer ## ]]
{answer}

[[ ## completed ## ]]

In adhering to this structure, your objective is: 
        Given a mathematical problem stated in the `question`, provide a detailed step-by-step explanation in the `reasoning` field that leads to the final result, which should be presented in the `answer` field. Make sure to include any relevant formulas and apply Vieta's formulas appropriately to calculate the sum of the reciprocals of the roots of the quadratic equation. Your response should clearly format the question, reasoning, and answer for clarity and understanding.


User message:

[[ ## question ## ]]
If $|4x+2|=10$ and $x<0$, w