# dspy methodology 101

1. programming
   1. LMs (tasks)
   2. signatures (typed i/o spec, e.g. `"context: list[str], question: str -> answer: str"`); compiling can yield better prompts than hand-written ones
      1. declare the task: tell the model what it needs to do
      2. the underlying DSPy compiler handles the optimization, replacing brittle hand-tuned prompts
   3. modules (e.g. `dspy.Predict`, `dspy.ChainOfThought`)
      1. each module applies a prompting technique to a signature
2. evaluation
3. optimization

## TOC:
* [intro](#dspy-methodology-101)
* [LMs](#set-a-generator-lm)
* [evaluations 101](#dspy-evaluations)
* [optimizations](#dspy-optimizers)

## set a generator LM

<a class="anchor" id="LM"></a>

In [3]:
import dspy
import os

ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY")
TOGETHER_API_KEY = os.getenv("TOGETHER_API_KEY")

lm = dspy.LM(
    "together_ai/deepseek-ai/DeepSeek-R1",
    temperature=0.1,
    max_tokens=2500,
    stop=None,
    cache=False,
    api_key=TOGETHER_API_KEY,
)
dspy.configure(lm=lm)

## DSPy optimizers

**pre-requisite**: metrics, metrics, metrics<br>
DSPy optimizers can be used to tune the prompts or weights in your program.<br>
Some optimizers need only a `trainset`; others need a `trainset` **and** a `valset`.<br>
For prompt optimizers, the recommended split is 20% for training and 80% for validation (the reverse of the usual deep-learning split).
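
A minimal sketch of that 20/80 split in plain Python. The helper name `split_for_prompt_optimizer`, the fixed seed, and the toy examples are assumptions for illustration, not part of DSPy.

```python
import random

def split_for_prompt_optimizer(examples, train_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and split into (trainset, valset)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one training example even for tiny datasets.
    cut = max(1, int(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical toy dataset of 50 question/answer pairs.
examples = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(50)]
trainset, valset = split_for_prompt_optimizer(examples)
print(len(trainset), len(valset))  # 10 40
```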

<p style="text-align:center">A DSPy optimizer is an algorithm that can tune the parameters of a DSPy program (i.e., the prompts and/or the LM weights) to maximize the metrics you specify, like accuracy.</p>
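
To make "the metrics you specify" concrete, here is a hedged sketch of a metric written in the `(example, pred, trace=None)` shape that DSPy metrics conventionally use. The exact-match rule, the `average_metric` helper, and the `SimpleNamespace` stand-ins for examples and predictions are assumptions so that the snippet runs without an LM.

```python
from types import SimpleNamespace

def exact_match(example, pred, trace=None):
    """Return True when the predicted answer matches the gold answer, ignoring case and padding."""
    return example.answer.strip().lower() == pred.answer.strip().lower()

def average_metric(metric, examples, predictions):
    """Average a per-example metric over paired examples and predictions."""
    scores = [float(metric(ex, pr)) for ex, pr in zip(examples, predictions)]
    return sum(scores) / len(scores)

# Stand-ins for gold examples and program outputs (hypothetical data).
gold = [SimpleNamespace(answer="Paris"), SimpleNamespace(answer="4")]
preds = [SimpleNamespace(answer=" paris "), SimpleNamespace(answer="5")]

print(average_metric(exact_match, gold, preds))  # 0.5
```

An optimizer would try to raise this average by changing the prompts or weights of the program that produces `preds`.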

In [4]:
from IPython.display import display, HTML

mermaid_code = """
<div class="mermaid">
flowchart TD
    A[Training Inputs] --> B["DSPy Program (Model Prediction)"]
    B --> C["Metric Function (Evaluate & Score)"]
    C --> D["Final Output (Model Evaluation Score)"]
</div>
"""

display(HTML(mermaid_code))

In [None]:
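# Example DSPy program function (model)
# `ToyModel` is a hypothetical stand-in for a DSPy program so that this sketch runs.
class ToyModel:
    def process(self, inputs):
        # Predict 1 when feature1 exceeds 0.3, else 0
        return [int(row["feature1"] > 0.3) for row in inputs]

model = ToyModel()

def dspy_predict(inputs):
    # Process inputs through the model (this could involve transformation, prediction, etc.)
    predictions = model.process(inputs)
    return predictions

# Hypothetical helper: fraction of predictions that match the labels
def calculate_accuracy(predictions, actual_labels):
    matches = sum(p == y for p, y in zip(predictions, actual_labels))
    return matches / len(actual_labels)

# Example evaluation metric function
def evaluation_metric(predictions, actual_labels):
    # Evaluate the predictions using some metric (e.g., accuracy, F1 score, etc.)
    score = calculate_accuracy(predictions, actual_labels)
    return score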

# Example training input (could be partial or incomplete)
training_inputs = [
    # Sample input data (without labels for unsupervised or semi-supervised learning)
    {"feature1": 0.1, "feature2": 0.2},
    {"feature1": 0.4, "feature2": 0.5},
    # Additional incomplete data samples...
]

# Optional: Actual labels for supervised learning (if available)
actual_labels = [0, 1]  # Example labels for training purposes (supervised case)

# Main pseudo function:
def train_and_evaluate(training_inputs, actual_labels):
    # Step 1: Predict output using DSPy program (model)
    predictions = dspy_predict(training_inputs)

    # Step 2: Evaluate the model's predictions using a metric
    score = evaluation_metric(predictions, actual_labels)

    # Step 3: Return or log the score
    print(f"Model evaluation score: {score}")
    return score

# Example execution:
train_and_evaluate(training_inputs, actual_labels)

### available DSPy optimizers

The full list is documented [here](https://dspy.ai/learn/optimization/optimizers/).

In [7]:
import dspy.teleprompt as teleprompt

# Names exported by dspy.teleprompt (its optimizers)
teleprompt_classes = teleprompt.__all__

print("DSPy Optimizers/Teleprompt Classes:")
print("-" * 40)
for class_name in teleprompt_classes:
    print(f"- {class_name}")

DSPy Optimizers/Teleprompt Classes:
----------------------------------------
- AvatarOptimizer
- BetterTogether
- BootstrapFewShot
- BootstrapFinetune
- COPRO
- Ensemble
- KNNFewShot
- MIPROv2
- BootstrapFewShotWithRandomSearch
- BootstrapFewShotWithOptuna
- LabeledFewShot
- InferRules
