# Building a Classification Feature with DSPy: A Worked Example

## 1. Specify the Model for DSPy

In [2]:
import dspy

lm = dspy.LM('ollama_chat/qwen2.5:0.5b', api_base='http://localhost:11434', api_key='', cache=False)
dspy.configure(lm=lm)

## 2. Classification Test

In [3]:
from typing import Literal

class Category(dspy.Signature):
    """Classify food"""

    food: str = dspy.InputField()
    category: Literal["未知","蔬菜","水果","肉类"] = dspy.OutputField()
    confidence: float = dspy.OutputField()

classify = dspy.Predict(Category)

result = classify(food="猪肉")
result

Prediction(
    category='肉类',
    confidence=0.85
)

In [4]:
# Print the conversation history of the most recent LM call
dspy.inspect_history()





[2025-02-05T16:18:59.474342]

System message:

Your input fields are:
1. `food` (str)

Your output fields are:
1. `category` (Literal[未知, 蔬菜, 水果, 肉类])
2. `confidence` (float)

All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## food ## ]]
{food}

[[ ## category ## ]]
{category}        # note: the value you produce must be one of: 未知; 蔬菜; 水果; 肉类

[[ ## confidence ## ]]
{confidence}        # note: the value you produce must be a single float value

[[ ## completed ## ]]

In adhering to this structure, your objective is: 
        Classify food


User message:

[[ ## food ## ]]
猪肉

Respond with the corresponding output fields, starting with the field `[[ ## category ## ]]` (must be formatted as a valid Python Literal[未知, 蔬菜, 水果, 肉类]), then `[[ ## confidence ## ]]` (must be formatted as a valid Python float), and then ending with the marker for `[[ ## completed ## ]]`.


Response:

[[ ## categor

## 3. Optimize Prompts

In [5]:
import csv
from dspy.evaluate import Evaluate
from dspy.teleprompt import MIPROv2  # explicit import instead of a wildcard

In [6]:
# Load the training set
trainset = []
with open('food_category.csv', 'r', encoding='utf-8') as file:
    reader = csv.DictReader(file)
    for row in reader:
        example = dspy.Example(food=row['food'], category=row['category']).with_inputs("food")
        trainset.append(example)
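
Because the metric used later is an exact string match, a label in the CSV that is not one of the four `Literal` options can never be scored as correct. The sketch below checks for such rows before training. The helper name `find_bad_rows` and the in-memory sample data are hypothetical; only the column names `food` and `category` and the four labels come from the notebook.

```python
import csv
import io

# Allowed labels, copied from the Literal in the Category signature.
ALLOWED = {"未知", "蔬菜", "水果", "肉类"}

def find_bad_rows(csv_file):
    """Return (line_number, row) pairs whose category is not an allowed label."""
    reader = csv.DictReader(csv_file)
    bad = []
    # start=2 because line 1 of the file is the header row
    for line_no, row in enumerate(reader, start=2):
        if row["category"].strip() not in ALLOWED:
            bad.append((line_no, row))
    return bad

# Hypothetical in-memory sample standing in for food_category.csv
sample = io.StringIO("food,category\n苹果,水果\n猪肉,肉类\n牛奶,乳制品\n")
bad_rows = find_bad_rows(sample)
print(bad_rows)
```

To run the same check on the real file, pass the handle returned by `open('food_category.csv', 'r', encoding='utf-8')` instead of the sample.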

In [7]:
# Define the evaluation metric: exact match on the predicted category
def validate_category(example, prediction, trace=None):
    return prediction.category == example.category
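
The metric only reads a `.category` attribute from each argument, so it can be sanity-checked without calling the model. This is a minimal sketch that uses `types.SimpleNamespace` objects as stand-ins for `dspy.Example` and the prediction; the stand-in objects are an assumption made for illustration, not part of the notebook.

```python
from types import SimpleNamespace

def validate_category(example, prediction, trace=None):
    # Same exact-match metric as in the cell above
    return prediction.category == example.category

# Stand-ins: any object with a .category attribute works for this check
gold = SimpleNamespace(category="水果")
hit = SimpleNamespace(category="水果")
miss = SimpleNamespace(category="蔬菜")

hit_score = validate_category(gold, hit)
miss_score = validate_category(gold, miss)
print(hit_score, miss_score)
```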

In [8]:
# Evaluate our existing function
evaluator = Evaluate(devset=trainset, num_threads=1, display_progress=True, display_table=5)
evaluator(classify, metric=validate_category)

Average Metric: 31.00 / 87 (35.6%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 87/87 [04:39<00:00,  3.22s/it]

2025/02/05 16:25:41 INFO dspy.evaluate.evaluate: Average Metric: 31 / 87 (35.6%)





| | food | example_category | pred_category | confidence | validate_category |
|---|---|---|---|---|---|
| 0 | 苹果 | 水果 | 蔬菜 | 0.85 | |
| 1 | 香蕉 | 水果 | 蔬菜 | 0.85 | |
| 2 | 橙子 | 水果 | 蔬菜 | 0.85 | |
| 3 | 草莓 | 水果 | 蔬菜 | 0.85 | |
| 4 | 葡萄 | 水果 | 蔬菜 | 0.85 | |


35.63
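
The overall score is 31 correct out of 87, but the five rows displayed are all fruit predicted as vegetables, so a per-class breakdown may be more informative than the single number. The sketch below uses only the five displayed rows as data; the remaining 82 predictions are not shown in the output, so extending the `pairs` list to the full set is left as an assumption.

```python
from collections import Counter

# (gold, predicted) pairs copied from the five rows displayed above
pairs = [("水果", "蔬菜")] * 5

# Count each (gold, predicted) combination, plus totals and hits per gold label
confusion = Counter(pairs)
per_class_total = Counter(gold for gold, _ in pairs)
per_class_correct = Counter(gold for gold, pred in pairs if gold == pred)

for label, total in per_class_total.items():
    print(f"{label}: {per_class_correct[label]}/{total} correct")
print(confusion.most_common())

# Overall accuracy reported by the evaluator: 31 correct out of 87
overall = round(100 * 31 / 87, 2)
print(overall)
```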

In [59]:
# Optimize
tp = dspy.MIPROv2(metric=validate_category, auto="light")
optimized_classify = tp.compile(classify, trainset=trainset, max_labeled_demos=0, max_bootstrapped_demos=0)
optimized_classify.save("optimized_event_classifier.json")
dspy.inspect_history()

2025/01/23 17:54:15 INFO dspy.teleprompt.mipro_optimizer_v2: 
RUNNING WITH THE FOLLOWING LIGHT AUTO RUN SETTINGS:
num_trials: 7
minibatch: True
num_candidates: 7
valset size: 69



[93m[1mProjected Language Model (LM) Calls[0m

Based on the parameters you have set, the maximum number of LM calls is projected as follows:

[93m- Prompt Generation: [94m[1m10[0m[93m data summarizer calls + [94m[1m7[0m[93m * [94m[1m1[0m[93m lm calls in program + ([94m[1m2[0m[93m) lm calls in program-aware proposer = [94m[1m19[0m[93m prompt model calls[0m
[93m- Program Evaluation: [94m[1m25[0m[93m examples in minibatch * [94m[1m7[0m[93m batches + [94m[1m69[0m[93m examples in val set * [94m[1m1[0m[93m full evals = [94m[1m244[0m[93m LM Program calls[0m

[93m[1mEstimated Cost Calculation:[0m

[93mTotal Cost = (Number of calls to task model * (Avg Input Token Length per Call * Task Model Price per Input Token + Avg Output Token Length per Call * Task Model Price per Output Token) 
            + (Number of program calls * (Avg Input Token Length per Call * Task Prompt Price per Input Token + Avg Output Token Length per Call * Prompt Model P
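
The two projected call counts in the log above follow from simple arithmetic on the run settings. The snippet reproduces them; every number is copied from the log, and the variable names are chosen here for readability.

```python
# Numbers copied from the "Projected Language Model (LM) Calls" log lines
summarizer_calls = 10
num_candidates = 7
lm_calls_in_program = 1
proposer_calls = 2
prompt_model_calls = (
    summarizer_calls + num_candidates * lm_calls_in_program + proposer_calls
)

minibatch_size = 25
num_trials = 7
valset_size = 69
full_evals = 1
program_calls = minibatch_size * num_trials + valset_size * full_evals

print(prompt_model_calls, program_calls)
```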

2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: 
==> STEP 1: BOOTSTRAP FEWSHOT EXAMPLES <==
2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: These will be used for informing instruction proposal.

2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: Bootstrapping N=7 sets of demonstrations...


Bootstrapping set 1/7
Bootstrapping set 2/7


100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 346.34it/s]


Bootstrapped 0 full traces after 17 examples for up to 1 rounds, amounting to 18 attempts.
Bootstrapping set 3/7


100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 439.27it/s]


Bootstrapped 0 full traces after 17 examples for up to 1 rounds, amounting to 18 attempts.
Bootstrapping set 4/7


100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 545.76it/s]


Bootstrapped 0 full traces after 17 examples for up to 1 rounds, amounting to 18 attempts.
Bootstrapping set 5/7


100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 545.77it/s]


Bootstrapped 0 full traces after 17 examples for up to 1 rounds, amounting to 18 attempts.
Bootstrapping set 6/7


100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 473.96it/s]


Bootstrapped 0 full traces after 17 examples for up to 1 rounds, amounting to 18 attempts.
Bootstrapping set 7/7


100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 529.74it/s]
2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: 
==> STEP 2: PROPOSE INSTRUCTION CANDIDATES <==
2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: We will use the few-shot examples from the previous step, a generated dataset summary, a summary of the program code, and a randomly selected prompting tip to propose instructions.
2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: 
Proposing instructions...



Bootstrapped 0 full traces after 17 examples for up to 1 rounds, amounting to 18 attempts.


2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: Proposed Instructions for Predictor 0:

2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: 0: Classify food

2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: 1: The pipeline is designed to classify food based on its category and confidence.

2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: 2: The pipeline is designed to classify food based on its category and confidence.

2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: 3: The pipeline is designed to classify food based on its category and confidence.

2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: 4: Predict the category of the input based on its content.

2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: 5: Predict food, return category and confidence.

2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: 6: Predict the category of the input based on its content.

2025/01/23 17:54:18 INFO dspy.te

Average Metric: 31.00 / 69 (44.9%): 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 69/69 [00:00<00:00, 595.16it/s]

2025/01/23 17:54:18 INFO dspy.evaluate.evaluate: Average Metric: 31 / 69 (44.9%)
2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: Default program score: 44.93

2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: ==> STEP 3: FINDING OPTIMAL PROMPT PARAMETERS <==
2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: We will evaluate the program over a series of trials with different combinations of instructions and few-shot examples to find the optimal combination using Bayesian Optimization.

2025/01/23 17:54:18 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 1 / 7 ==



Average Metric: 13.00 / 25 (52.0%): 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 490.48it/s]

2025/01/23 17:54:19 INFO dspy.evaluate.evaluate: Average Metric: 13 / 25 (52.0%)
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 52.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 1'].
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [52.0]
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [44.93]
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 44.93


2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 2 / 7 ==



Average Metric: 0.00 / 25 (0.0%): 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 410.06it/s]

2025/01/23 17:54:19 INFO dspy.evaluate.evaluate: Average Metric: 0 / 25 (0.0%)
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 0.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 5'].
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [52.0, 0.0]
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [44.93]
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 44.93







2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 3 / 7 ==


Average Metric: 15.00 / 25 (60.0%): 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 431.28it/s]

2025/01/23 17:54:19 INFO dspy.evaluate.evaluate: Average Metric: 15 / 25 (60.0%)
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 60.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 2'].
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [52.0, 0.0, 60.0]
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [44.93]
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 44.93







2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 4 / 7 ==


Average Metric: 2.00 / 25 (8.0%): 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 431.27it/s]


2025/01/23 17:54:19 INFO dspy.evaluate.evaluate: Average Metric: 2 / 25 (8.0%)
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 8.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 5'].
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [52.0, 0.0, 60.0, 8.0]
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [44.93]
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 44.93


2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 5 / 7 ==


Average Metric: 10.00 / 25 (40.0%): 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 521.00it/s]

2025/01/23 17:54:19 INFO dspy.evaluate.evaluate: Average Metric: 10 / 25 (40.0%)
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 40.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 4'].
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [52.0, 0.0, 60.0, 8.0, 40.0]
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [44.93]





2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 44.93


2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 6 / 7 ==


Average Metric: 15.00 / 25 (60.0%): 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 532.21it/s]

2025/01/23 17:54:19 INFO dspy.evaluate.evaluate: Average Metric: 15 / 25 (60.0%)





2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 60.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 1'].
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [52.0, 0.0, 60.0, 8.0, 40.0, 60.0]
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [44.93]
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 44.93


2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 7 / 7 ==


Average Metric: 9.00 / 25 (36.0%): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:00<00:00, 367.86it/s]

2025/01/23 17:54:19 INFO dspy.evaluate.evaluate: Average Metric: 9 / 25 (36.0%)
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 36.0 on minibatch of size 25 with parameters ['Predictor 0: Instruction 6'].
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [52.0, 0.0, 60.0, 8.0, 40.0, 60.0, 36.0]
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [44.93]
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 44.93


2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: ===== Full Eval 1 =====
2025/01/23 17:54:19 INFO dspy.teleprompt.mipro_optimizer_v2: Doing full eval on next top averaging program (Avg Score: 60.0) from minibatch trials...



Average Metric: 31.00 / 69 (44.9%): 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 69/69 [00:00<00:00, 385.70it/s]

2025/01/23 17:54:20 INFO dspy.evaluate.evaluate: Average Metric: 31 / 69 (44.9%)
2025/01/23 17:54:20 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [44.93, 44.93]
2025/01/23 17:54:20 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 44.93
2025/01/23 17:54:20 INFO dspy.teleprompt.mipro_optimizer_v2: 

2025/01/23 17:54:20 INFO dspy.teleprompt.mipro_optimizer_v2: Returning best identified program with score 44.93!
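
The run picks its full-evaluation candidate by averaging minibatch scores per instruction. The sketch below redoes that averaging from the seven trial lines in the log; it is a reading aid for the log, not the optimizer's own code. It yields the 60.0 average the log reports, yet the full evaluation of that candidate scored 44.93, the same as the default program, so the run ended with no measured improvement.

```python
from collections import defaultdict

# (instruction, minibatch score) pairs copied from trials 1 to 7 in the log
trials = [
    ("Instruction 1", 52.0),
    ("Instruction 5", 0.0),
    ("Instruction 2", 60.0),
    ("Instruction 5", 8.0),
    ("Instruction 4", 40.0),
    ("Instruction 1", 60.0),
    ("Instruction 6", 36.0),
]

# Group the scores by instruction, then average each group
scores = defaultdict(list)
for name, score in trials:
    scores[name].append(score)
averages = {name: sum(vals) / len(vals) for name, vals in scores.items()}

best = max(averages, key=averages.get)
print(best, averages[best])
```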







[2025-01-23T17:54:20.150874]

System message:

Your input fields are:
1. `food` (str)

Your output fields are:
1. `category` (Literal[未知, 蔬菜, 水果, 肉类])
2. `confidence` (float)

All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## food ## ]]
{food}

[[ ## category ## ]]
{category}        # note: the value you produce must be one of: 未知; 蔬菜; 水果; 肉类

[[ ## confidence ## ]]
{confidence}        # note: the value you produce must be a single float value

[[ ## completed ## ]]

In adhering to this structure, your objective is: 
        The pipeline is designed to classify food based on its category and confidence.


User message:

[[ ## food ## ]]
竹笋

Respond with the corresponding output fields, starting with the field `[[ ## category ## ]]` (must be formatted as a valid Python Literal[未知, 蔬菜, 水果, 肉类]), then `[[ ## confidence ## ]]` (must be formatted as a valid Python float), and then ending with the marker fo