* Add teacher module to MIPROv2
* Update MIPROv2 Docs
* Add teacher program to the compile method
---------
Co-authored-by: Michael Ryan <michael_ryan_2000@yahoo.com>
|`student`|`dspy.Module`|**Required**| The base program to optimize. |
|`trainset`|`List[dspy.Example]`|**Required**| Training dataset, used to bootstrap few-shot examples and instructions. If a separate `valset` is not specified, 80% of this training set will also be used as a validation set for evaluating new candidate prompts. |
|`valset`|`List[dspy.Example]`| Defaults to 80% of trainset | Dataset used to evaluate candidate prompts. We recommend using somewhere between 50 and 500 examples for optimization. |
|`teacher`|`dspy.Module`| Defaults to `student` | The program to run in order to bootstrap the few-shot examples. |
|`num_trials`|`int`|`30`| Number of optimization trials to run. When `minibatch` is set to `True`, this is the number of minibatch trials run on batches of size `minibatch_size`; when `minibatch` is set to `False`, each trial uses a full evaluation on the validation set. In both cases, we recommend setting `num_trials` to a *minimum* of 0.75 x # modules in program x # variables per module (2 if few-shot examples and instructions will both be optimized, 1 in the 0-shot case). |
|`minibatch`|`bool`|`True`| Flag to enable evaluating each trial over minibatches of data instead of the full validation set. |
|`minibatch_size`|`int`|`25`| Size of minibatches for evaluations. |
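As a rough illustration of the `num_trials` guideline in the table above, here is a small sketch of the recommended minimum as a function of program size. The helper `recommended_min_trials` is hypothetical, not part of the DSPy API:

```python
import math

def recommended_min_trials(num_modules: int, optimize_fewshot: bool = True,
                           optimize_instructions: bool = True) -> int:
    """Heuristic floor for num_trials: 0.75 x # modules x # variables per module.

    Each module contributes one optimized variable for its instruction and one
    for its few-shot demos (2 when both are tuned, 1 in the 0-shot case).
    """
    vars_per_module = int(optimize_fewshot) + int(optimize_instructions)
    return math.ceil(0.75 * num_modules * vars_per_module)

# A 3-module program optimizing both instructions and few-shot demos:
print(recommended_min_trials(3))                          # 5 (ceil of 4.5)
# The same program in the 0-shot setting (instructions only):
print(recommended_min_trials(3, optimize_fewshot=False))  # 3 (ceil of 2.25)
```

Note that the default of `30` trials comfortably clears this floor for small programs; larger multi-module programs may warrant raising it.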
These steps are broken down in more detail below:
3. **Find an Optimized Combination of Few-Shot Examples & Instructions**. Finally, now that we've created these few-shot examples and instructions, we use Bayesian Optimization to choose which set of these would work best for each predictor in our program. This works by running a series of `num_trials` trials, where a new set of prompts is evaluated over our validation set at each trial. This helps the Bayesian Optimizer learn which combination of prompts works best over time. If `minibatch` is set to `True` (which it is by default), then each new set of prompts is only evaluated on a minibatch of size `minibatch_size` at each trial, which generally allows for more efficient exploration / exploitation. The set of prompts with the best average minibatch score is then evaluated on the full validation set every `minibatch_full_eval_steps` trials to get a less noisy performance benchmark. At the end of the optimization process, the LM program with the set of prompts that performed best on the full validation set is returned.
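The trial loop described above can be sketched in plain Python. This is a toy simulation, not MIPROv2's implementation: candidate "prompts" are stand-in quality scores, candidate selection is random rather than Bayesian, and all names here are illustrative:

```python
import random

random.seed(0)

# Hypothetical stand-ins: each prompt candidate has a hidden quality, and
# scoring an example returns that quality plus noise.
candidates = [0.4, 0.55, 0.7, 0.62]           # hidden true score per candidate
valset_size, minibatch_size = 200, 25
num_trials, minibatch_full_eval_steps = 12, 4

def evaluate(quality: float, n_examples: int) -> float:
    """Noisy mean score over n_examples (noise shrinks as n grows)."""
    return sum(quality + random.gauss(0, 0.1) for _ in range(n_examples)) / n_examples

minibatch_scores = {i: [] for i in range(len(candidates))}
best_on_full, best_idx = -1.0, None

for trial in range(1, num_trials + 1):
    # A real optimizer proposes the next candidate via Bayesian optimization;
    # here we simply sample one at random and score it on a minibatch.
    i = random.randrange(len(candidates))
    minibatch_scores[i].append(evaluate(candidates[i], minibatch_size))

    # Every minibatch_full_eval_steps trials, fully evaluate the candidate with
    # the best minibatch average to get a less noisy benchmark.
    if trial % minibatch_full_eval_steps == 0:
        leader = max((c for c in minibatch_scores if minibatch_scores[c]),
                     key=lambda c: sum(minibatch_scores[c]) / len(minibatch_scores[c]))
        full_score = evaluate(candidates[leader], valset_size)
        if full_score > best_on_full:
            best_on_full, best_idx = full_score, leader

print(best_idx, round(best_on_full, 2))
```

Minibatching trades per-trial evaluation cost for noisier scores, which is why the periodic full evaluations are needed to pick the final winner reliably.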
For those interested in more details, further information on `MIPROv2`, along with a study comparing it with other DSPy optimizers, can be found in [this paper](https://arxiv.org/abs/2406.11695).