
<img src="../../docs/images/DSPy8.png" alt="DSPy7 Image" height="150"/>

## DSPy: Compiling chains from `LangChain`


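One of the most powerful features in **DSPy** is optimizers. **DSPy optimizers** can take any LM system and tune the prompts (or the LM weights) to maximize any objective.

Optimizers can improve the quality of your LM systems and make your code adaptive to new LMs or new data. They are meant to replace hacky approaches such as (i) manual prompt engineering, (ii) designing complex pipelines for generating synthetic data, or (iii) designing complex pipelines for finetuning, and to bring structure and modularity in their place.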

In [1]:
# Install the dependencies if needed.
# %pip install -U dspy-ai
# %pip install -U openai jinja2
# %pip install -U langchain langchain-community langchain-openai langchain-core

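Typically, we use DSPy optimizers with DSPy modules. But here, we've worked with [Harrison Chase](https://twitter.com/hwchase17) to make sure DSPy can also optimize chains built with the `LangChain` library.

This short tutorial demonstrates how this proof-of-concept feature works. _This will **not** give you the full power of DSPy or LangChain yet, but we will expand it if there's high demand._

If we turn this into a fuller integration, all users stand to benefit. LangChain users will gain the ability to optimize any chain with any DSPy optimizer. DSPy users will gain the ability to export any DSPy program into an LCEL that supports streaming and tracing, along with other rich production-oriented features in LangChain.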

### 1) 设置

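First, let's import `dspy` and configure its default retrieval model. (The language model is set up through LangChain in the next cell.)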

In [2]:
# Import DSPy.
import dspy

# Import the evaluation helper and the optimizer used later in this notebook.
from dspy.evaluate.evaluate import Evaluate
from dspy.teleprompt import BootstrapFewShotWithRandomSearch

# Create a ColBERTv2 retriever client for the 'wiki17_abstracts' index at this URL.
colbertv2 = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')

# Configure DSPy to use ColBERTv2 as the default retrieval model (rm).
dspy.configure(rm=colbertv2)


Next, let's import `langchain` and the DSPy modules for interacting with LangChain runnables, namely `LangChainPredict` and `LangChainModule`.

In [3]:
from langchain_openai import OpenAI  # LangChain wrapper for OpenAI completion models
from langchain.globals import set_llm_cache  # sets the global LLM cache
from langchain.cache import SQLiteCache  # SQLite-backed LLM cache

set_llm_cache(SQLiteCache(database_path="cache.db"))  # cache LLM calls in the local file "cache.db"

llm = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0)  # gpt-3.5-turbo-instruct at temperature 0
retrieve = lambda x: dspy.Retrieve(k=5)(x["question"]).passages  # retrieve the top-5 passages for the input question

If it's useful, we can set up some caches so that you can run this whole notebook in Google Colab without any API keys. Let us know.

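### 2) Defining a chain as a `LangChain` expression

For illustration, let's tackle the following task.

**Task:** Build a RAG system for generating informative tweets.
- **Input:** A factual **question**, which may be fairly complex.
- **Output:** An engaging **tweet** that correctly answers the question from the retrieved information.

Let's use LangChain's expression language (LCEL) to illustrate this. Any prompt will do here, since we will optimize the final prompt with DSPy.

With that in mind, let's keep just the bare essentials: **Given {context}, answer the question {question} as a tweet.**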

In [4]:
# From LangChain, import standard modules for prompting.
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# Just a simple prompt for this task. It's fine if it's complex too.
prompt = PromptTemplate.from_template("Given {context}, answer the question `{question}` as a tweet.")

# This is how you'd normally build a chain with LCEL. This chain does retrieval then generation (RAG).
vanilla_chain = RunnablePassthrough.assign(context=retrieve) | prompt | llm | StrOutputParser()
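
To make the data flow of this chain concrete, here is a minimal plain-Python sketch of what the four piped stages do to a `{"question": ...}` input. It does not use LangChain at all: `fake_retrieve`, `fake_llm`, and `run_chain` are hypothetical stand-ins invented for illustration, whereas the real `retrieve` and `llm` would query ColBERTv2 and OpenAI.

```python
# A plain-Python sketch of the retrieve -> prompt -> llm -> parse flow.
# All names below are hypothetical stand-ins, not LangChain or DSPy APIs.

TEMPLATE = "Given {context}, answer the question `{question}` as a tweet."


def fake_retrieve(inputs: dict) -> list:
    """Stand-in retriever: returns a canned passage instead of querying ColBERTv2."""
    return [f"passage about: {inputs['question']}"]


def fake_llm(prompt_text: str) -> str:
    """Stand-in LM: reports the prompt length instead of calling a model."""
    return f"(a tweet written from a {len(prompt_text)}-character prompt)"


def run_chain(inputs: dict) -> str:
    # Stage 1: keep the original keys and add a 'context' key from retrieval.
    state = {**inputs, "context": fake_retrieve(inputs)}
    # Stage 2: fill the prompt template with the accumulated keys.
    prompt_text = TEMPLATE.format(**state)
    # Stage 3: call the (fake) language model on the formatted prompt.
    raw_output = fake_llm(prompt_text)
    # Stage 4: reduce the model output to a plain string.
    return str(raw_output)


result = run_chain({"question": "In what region was Eddy Mazzoleni born?"})
print(result)
```

The point of the sketch is only that each stage receives the previous stage's output, which is why the prompt template can refer to both `question` and `context`.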

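### 3) Converting the chain into a **DSPy module**

Our goal is to optimize this prompt so that we have a better tweet generator. DSPy optimizers can help, but they only work with DSPy modules!

For this reason, we created two new modules in DSPy: `LangChainPredict` and `LangChainModule`.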

In [5]:
# From DSPy, import the modules that know how to interact with LangChain LCEL.
from dspy.predict.langchain import LangChainPredict, LangChainModule

# This is how to wrap the chain so it behaves like a DSPy program.
# Just replace every pattern like `prompt | llm` with `LangChainPredict(prompt, llm)`.
zeroshot_chain = RunnablePassthrough.assign(context=retrieve) | LangChainPredict(prompt, llm) | StrOutputParser()
zeroshot_chain = LangChainModule(zeroshot_chain)  # then wrap the chain in a DSPy module.

### 4) Trying the module

How good is our `LangChainModule` at this task? Well, we can ask it to generate a tweet for the following question.

In [6]:
# Define the question.
question = "In what region was Eddy Mazzoleni born?"

# Invoke the wrapped chain built in the previous cell.
zeroshot_chain.invoke({"question": question})

' Eddy Mazzoleni, Italian professional cyclist, was born in Bergamo, Italy on July 29, 1973. #cyclist #Italy #Bergamo'

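Ah, that sounds about right! (It's technically not perfect: we asked for the _region_, not the city. We can do better below.)

Inspecting questions and answers by hand is very important for getting a sense of your system. However, a good system designer always looks to iteratively **benchmark** their work to quantify progress!

To do this, we need two things: the **metric** we want to maximize and a (small) **dataset** of examples for our system.

Are there pre-defined metrics for good tweets? Should I label 100,000 tweets by hand? Probably not. We can easily do something reasonable, though, until you start getting data in production!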

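### 5) Evaluating the module

To get started, we'll define our own simple metric and borrow a bunch of questions from a QA dataset to tune on.

**What makes a good tweet?** I don't know, but in the spirit of iterative development, let's start simple!

Define a good tweet to have three properties: it should be (1) factually correct, (2) based on real sources, and (3) engaging for people.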
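
As a rough illustration of how three such properties could be folded into a single score, the sketch below averages three yes/no judgments. The function name `toy_tweet_score` and its boolean arguments are assumptions made for this example only; the actual metric lives in `tweet_metric.py` and may combine or gate the checks differently.

```python
def toy_tweet_score(correct: bool, faithful: bool, engaging: bool) -> float:
    """Hypothetical scoring rule: the fraction of the three checks that pass."""
    checks = [correct, faithful, engaging]
    return sum(checks) / len(checks)


print(toy_tweet_score(True, True, True))    # 1.0
print(toy_tweet_score(True, False, False))  # 0.3333333333333333
```

Under a rule like this, per-example scores fall on multiples of one third, and the dataset-level score is simply their average.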

In [7]:
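# We took the liberty of defining this metric and loading a few examples from a standard QA dataset.
# Let's import them from `tweet_metric.py` in the same directory as this notebook.
from tweet_metric import metric, trainset, valset, devset

# We loaded 200, 50, and 150 examples for training, validation (tuning), and development (evaluation), respectively.
# You could load fewer (or more); chances are the right DSPy optimizers will work well for many problems.
len(trainset), len(valset), len(devset)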


(200, 50, 150)

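Is this the right metric or the most representative set of questions? Not necessarily. But they get us started in a way that lets us iterate systematically!

**Note:** Notice that our dataset doesn't actually include any tweets! It only has questions and answers. That's OK; our metric will take care of evaluating outputs in tweet form.

Okay, let's evaluate the unoptimized "zero-shot" version of the chain converted from our `LangChain` LCEL object.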

In [8]:
# Build an evaluator over the devset: 8 threads, with a progress bar and a table of the first 5 rows.
evaluate = Evaluate(metric=metric, devset=devset, num_threads=8, display_progress=True, display_table=5)
# Evaluate the unoptimized zero-shot chain.
evaluate(zeroshot_chain)

Average Metric: 63.999999999999986 / 150  (42.7): 100%|██████████| 150/150 [00:02<00:00, 66.08it/s]


Average Metric: 63.999999999999986 / 150  (42.7%)


| | question | answer | gold_titles | output | tweet_response | metric |
|---|---|---|---|---|---|---|
| 0 | Who was a producer who produced albums for both rock bands Juke Karten and Thirty Seconds to Mars? | Brian Virtue | {'Thirty Seconds to Mars', 'Levolution (album)'} | Brian Virtue, who has worked with bands like Jane's Addiction and Velvet Revolver, produced albums for both Juke Kartel and Thirty Seconds to Mars, showcasing... | Brian Virtue, who has worked with bands like Jane's Addiction and Velvet Revolver, produced albums for both Juke Kartel and Thirty Seconds to Mars, showcasing... | 1.0 |
| 1 | Are both the University of Chicago and Syracuse University public universities? | no | {'Syracuse University', 'University of Chicago'} | No, only Syracuse University is a public university. The University of Chicago is a private research university. #Syracuse #University #Chicago #Public #Private | No, only Syracuse University is a public university. The University of Chicago is a private research university. #Syracuse #University #Chicago #Public #Private | 0.3333333333333333 |
| 2 | In what region was Eddy Mazzoleni born? | Lombardy, northern Italy | {'Eddy Mazzoleni', 'Bergamo'} | Eddy Mazzoleni, Italian professional cyclist, was born in Bergamo, Italy on July 29, 1973. #cyclist #Italy #Bergamo | Eddy Mazzoleni, Italian professional cyclist, was born in Bergamo, Italy on July 29, 1973. #cyclist #Italy #Bergamo | 0.0 |
| 3 | Who edited the 1990 American romantic comedy film directed by Garry Marshall? | Raja Raymond Gosnell | {'Raja Gosnell', 'Pretty Woman'} | J. F. Lawton edited the 1990 American romantic comedy film directed by Garry Marshall. #PrettyWoman #GarryMarshall #JFLawton | J. F. Lawton edited the 1990 American romantic comedy film directed by Garry Marshall. #PrettyWoman #GarryMarshall #JFLawton | 0.0 |
| 4 | Burrs Country Park railway station is what stop on the railway line that runs between Heywood and Rawtenstall | seventh | {'East Lancashire Railway', 'Burrs Country Park railway station'} | Burrs Country Park railway station is the seventh stop on the East Lancashire Railway line that runs between Heywood and Rawtenstall. | Burrs Country Park railway station is the seventh stop on the East Lancashire Railway line that runs between Heywood and Rawtenstall. | 1.0 |


42.67

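Okay, cool. Our `zeroshot_chain` scores about **43%** on the 150 questions from the devset.

The table above shows some examples. For instance:

- **Question**: Who was a producer who produced albums for both rock bands Juke Karten and Thirty Seconds to Mars?
- **Tweet**: Brian Virtue, who has worked with bands like Jane's Addiction and Velvet Revolver, produced albums for both Juke Kartel and Thirty Seconds to Mars, showcasing...
- **Metric**: 1.0 (A tweet that is correct, faithful, and engaging!*)

Footnote: * At least according to our metric, which is itself just a DSPy program, so it too can be optimized if you'd like! That's a topic for another notebook, though.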

### 6) 优化模块

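DSPy has many optimizers, but the de facto default one currently is `BootstrapFewShotWithRandomSearch`.

**If you're curious how it works:** This optimizer works by running your program (in this case, `zeroshot_chain`) on the questions from `trainset`. On each run, DSPy remembers the inputs and outputs of every LM call. These are called traces, and this particular optimizer keeps track of the "good" traces (i.e., the ones that the metric likes). The optimizer then tries to find good ways to use these traces as automatic few-shot examples. It tries them out, seeking to maximize the average metric on `valset`. There are many ways to self-generate (bootstrap) examples, and many ways to optimize their selection (here, with random search). This is why there are several other optimizers in DSPy.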
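
The loop described above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not DSPy's implementation: `bootstrap_random_search`, `toy_program`, and `toy_metric` are hypothetical names, the "model" is a deterministic arithmetic function, and a trace counts as "good" only when the metric returns 1.0.

```python
import random


def bootstrap_random_search(program, metric, trainset, valset,
                            max_demos=3, num_candidates=3, seed=0):
    """Toy bootstrap-then-random-search loop (illustrative, not DSPy's code)."""
    rng = random.Random(seed)

    # Step 1: run the program on training questions and keep the "good" traces.
    good_traces = []
    for example in trainset:
        answer = program(example["question"], demos=[])
        if metric(example, answer) >= 1.0:
            good_traces.append({"question": example["question"], "answer": answer})

    def average_score(demos):
        scores = [metric(ex, program(ex["question"], demos=demos)) for ex in valset]
        return sum(scores) / len(scores)

    # Step 2: the zero-shot program (no demos) is the baseline candidate.
    best_demos, best_score = [], average_score([])

    # Step 3: random search over small subsets of the good traces.
    for _ in range(num_candidates):
        k = rng.randint(1, max_demos)
        demos = rng.sample(good_traces, min(k, len(good_traces)))
        score = average_score(demos)
        if score > best_score:
            best_demos, best_score = demos, score

    return best_demos, best_score


def toy_program(question, demos):
    """Hypothetical 'LM': adds two numbers, but formats the answer unreliably."""
    a, b = (int(part) for part in question.split("+"))
    total = a + b
    # With demos the toy model imitates their format; without, only sometimes.
    if demos or total % 2 == 0:
        return f"answer: {total}"
    return str(total)


def toy_metric(example, answer):
    """Hypothetical metric: 1.0 only for the right value in the right format."""
    return 1.0 if answer == f"answer: {example['gold']}" else 0.0


trainset = [{"question": f"{i}+1", "gold": i + 1} for i in range(6)]
valset = [{"question": f"{i}+2", "gold": i + 2} for i in range(4)]

demos, score = bootstrap_random_search(toy_program, toy_metric, trainset, valset)
print(len(demos), score)
```

In this toy setup the zero-shot program scores 0.5 on the validation set, and any non-empty set of bootstrapped demos lifts it to 1.0, which mirrors the idea that examples harvested from the program's own successful runs can steer later calls.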

In [9]:
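# Set up the optimizer. We'll use very minimal hyperparameters for this example.
# Just do random search with ~3 attempts, and in each attempt, bootstrap <= 3 traces.
optimizer = BootstrapFewShotWithRandomSearch(metric=metric, max_bootstrapped_demos=3, num_candidate_programs=3)

# Now use the optimizer to *compile* the chain. This could take 5-10 minutes, unless it's cached.
optimized_chain = optimizer.compile(zeroshot_chain, trainset=trainset, valset=valset)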

Going to sample between 1 and 3 traces per predictor.
Will attempt to train 3 candidate sets.


Average Metric: 22.333333333333336 / 50  (44.7): 100%|██████████| 50/50 [00:00<00:00, 55.47it/s]


Average Metric: 22.333333333333336 / 50  (44.7%)
Score: 44.67 for set: [0]
New best score: 44.67 for seed -3
Scores so far: [44.67]
Best score: 44.67


Average Metric: 22.333333333333336 / 50  (44.7): 100%|██████████| 50/50 [00:00<00:00, 166.70it/s]


Average Metric: 22.333333333333336 / 50  (44.7%)
Score: 44.67 for set: [16]
Scores so far: [44.67, 44.67]
Best score: 44.67


  2%|▎         | 5/200 [00:00<00:07, 26.88it/s]


Bootstrapped 3 full traces after 6 examples in round 0.


Average Metric: 27.000000000000004 / 50  (54.0): 100%|██████████| 50/50 [00:00<00:00, 72.21it/s]


Average Metric: 27.000000000000004 / 50  (54.0%)
Score: 54.0 for set: [16]
New best score: 54.0 for seed -1
Scores so far: [44.67, 44.67, 54.0]
Best score: 54.0
Average of max per entry across top 1 scores: 0.54
Average of max per entry across top 2 scores: 0.5933333333333334
Average of max per entry across top 3 scores: 0.5933333333333334
Average of max per entry across top 5 scores: 0.5933333333333334
Average of max per entry across top 8 scores: 0.5933333333333334
Average of max per entry across top 9999 scores: 0.5933333333333334


  4%|▍         | 9/200 [00:00<00:06, 28.04it/s]


Bootstrapped 2 full traces after 10 examples in round 0.


Average Metric: 25.000000000000007 / 50  (50.0): 100%|██████████| 50/50 [00:00<00:00, 70.71it/s]


Average Metric: 25.000000000000007 / 50  (50.0%)
Score: 50.0 for set: [16]
Scores so far: [44.67, 44.67, 54.0, 50.0]
Best score: 54.0
Average of max per entry across top 1 scores: 0.54
Average of max per entry across top 2 scores: 0.5933333333333334
Average of max per entry across top 3 scores: 0.6066666666666667
Average of max per entry across top 5 scores: 0.6066666666666667
Average of max per entry across top 8 scores: 0.6066666666666667
Average of max per entry across top 9999 scores: 0.6066666666666667


  0%|          | 1/200 [00:00<00:07, 28.24it/s]


Bootstrapped 1 full traces after 2 examples in round 0.


Average Metric: 25.666666666666664 / 50  (51.3): 100%|██████████| 50/50 [00:00<00:00, 75.37it/s]


Average Metric: 25.666666666666664 / 50  (51.3%)
Score: 51.33 for set: [16]
Scores so far: [44.67, 44.67, 54.0, 50.0, 51.33]
Best score: 54.0
Average of max per entry across top 1 scores: 0.54
Average of max per entry across top 2 scores: 0.5800000000000001
Average of max per entry across top 3 scores: 0.6133333333333334
Average of max per entry across top 5 scores: 0.6266666666666667
Average of max per entry across top 8 scores: 0.6266666666666667
Average of max per entry across top 9999 scores: 0.6266666666666667


  1%|          | 2/200 [00:00<00:07, 27.81it/s]


Bootstrapped 1 full traces after 3 examples in round 0.


Average Metric: 26.0 / 50  (52.0): 100%|██████████| 50/50 [00:00<00:00, 73.67it/s]              


Average Metric: 26.0 / 50  (52.0%)
Score: 52.0 for set: [16]
Scores so far: [44.67, 44.67, 54.0, 50.0, 51.33, 52.0]
Best score: 54.0
Average of max per entry across top 1 scores: 0.54
Average of max per entry across top 2 scores: 0.5733333333333335
Average of max per entry across top 3 scores: 0.6133333333333334
Average of max per entry across top 5 scores: 0.64
Average of max per entry across top 8 scores: 0.64
Average of max per entry across top 9999 scores: 0.64
6 candidate programs found.


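### 7) Evaluating the optimized chain

Well, how good is this? _Not every optimization run will magically produce an improvement on unseen examples!_ So let's check!

Let's start with that one question from above.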

In [10]:
# Reuse the same question as before.
question = "In what region was Eddy Mazzoleni born?"

# Invoke the optimized chain on the question.
optimized_chain.invoke({"question": question})

' Eddy Mazzoleni was born in Bergamo, a city in the Lombardy region of Italy. #EddyMazzoleni #Italy #Lombardy'

Nice. Anecdotally, it seems more precise than the answer we got from `zeroshot_chain`. But now let's do some proper evaluation!

In [11]:
# Evaluate the optimized chain on the devset.
evaluate(optimized_chain)

Average Metric: 78.66666666666667 / 150  (52.4): 100%|██████████| 150/150 [00:02<00:00, 72.64it/s] 

Average Metric: 78.66666666666667 / 150  (52.4%)





| | question | answer | gold_titles | output | tweet_response | metric |
|---|---|---|---|---|---|---|
| 0 | Who was a producer who produced albums for both rock bands Juke Karten and Thirty Seconds to Mars? | Brian Virtue | {'Thirty Seconds to Mars', 'Levolution (album)'} | Brian Virtue is a producer who has worked with both Juke Kartel and Thirty Seconds to Mars, helping to create their unique sounds. #BrianVirtue #producer... | Brian Virtue is a producer who has worked with both Juke Kartel and Thirty Seconds to Mars, helping to create their unique sounds. #BrianVirtue #producer... | 1.0 |
| 1 | Are both the University of Chicago and Syracuse University public universities? | no | {'Syracuse University', 'University of Chicago'} | Yes, both Northeastern Illinois University and Syracuse University are public universities. #publicuniversity #Chicago #Syracuse | Yes, both Northeastern Illinois University and Syracuse University are public universities. #publicuniversity #Chicago #Syracuse | 0.0 |
| 2 | In what region was Eddy Mazzoleni born? | Lombardy, northern Italy | {'Eddy Mazzoleni', 'Bergamo'} | Eddy Mazzoleni was born in Bergamo, a city in the Lombardy region of Italy. #EddyMazzoleni #Italy #Lombardy | Eddy Mazzoleni was born in Bergamo, a city in the Lombardy region of Italy. #EddyMazzoleni #Italy #Lombardy | 1.0 |
| 3 | Who edited the 1990 American romantic comedy film directed by Garry Marshall? | Raja Raymond Gosnell | {'Raja Gosnell', 'Pretty Woman'} | Garry Marshall directed and edited the 1990 American romantic comedy film "Pretty Woman", starring Richard Gere and Julia Roberts. #PrettyWoman #GarryMarshall #RomanticComedy | Garry Marshall directed and edited the 1990 American romantic comedy film "Pretty Woman", starring Richard Gere and Julia Roberts. #PrettyWoman #GarryMarshall #RomanticComedy | 0.0 |
| 4 | Burrs Country Park railway station is what stop on the railway line that runs between Heywood and Rawtenstall | seventh | {'East Lancashire Railway', 'Burrs Country Park railway station'} | Burrs Country Park railway station is the seventh stop on the East Lancashire Railway line, which runs between Heywood and Rawtenstall. #EastLancashireRailway #BurrsCountryPark #railwaystation | Burrs Country Park railway station is the seventh stop on the East Lancashire Railway line, which runs between Heywood and Rawtenstall. #EastLancashireRailway #BurrsCountryPark #railwaystation | 1.0 |


52.44

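We started at **43%** with `zeroshot_chain` and are now at **52%**. That's a nice **21%** relative improvement on the rounded scores (about 23% on the unrounded 42.67 and 52.44). Not bad!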
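
The relative improvement can be checked directly from the two devset scores reported by the evaluations above. The variable names are illustrative; note that the unrounded scores give a slightly higher figure than one computed from the rounded 43% and 52%.

```python
zeroshot_score = 42.67   # devset score of zeroshot_chain, as reported above
optimized_score = 52.44  # devset score of optimized_chain, as reported above

absolute_gain = optimized_score - zeroshot_score
relative_gain = absolute_gain / zeroshot_score

print(f"absolute gain: {absolute_gain:.2f} points")  # absolute gain: 9.77 points
print(f"relative gain: {relative_gain:.1%}")         # relative gain: 22.9%
```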

### 8) Inspecting the optimized chain in action

In [12]:
# Take the fourth-from-last (prompt, output) pair from DSPy's LangChain call history.
prompt, output = dspy.settings.langchain_history[-4]

# Print the prompt.
print('PROMPT:\n\n', prompt)

# Print the output.
print('\n\nOUTPUT:\n\n', output)

PROMPT:

 Essential Instructions: Respond to the provided question based on the given context in the style of a tweet, which typically requires a concise and engaging answer within the character limit of a tweet (280 characters).

---

Follow the following format.

Context: ${context}
Question: ${question}
Tweet Response: ${tweet_response}

---

Context:
[1] «Candace Kita | Kita's first role was as a news anchor in the 1991 movie "Stealth Hunters". Kita's first recurring television role was in Fox's "Masked Rider", from 1995 to 1996. She appeared as a series regular lead in all 40 episodes. Kita also portrayed a frantic stewardess in a music video directed by Mark Pellington for the British group, Catherine Wheel, titled, "Waydown" in 1995. In 1996, Kita also appeared in the film "Barb Wire" (1996) and guest starred on "The Wayans Bros.". She also guest starred in "Miriam Teitelbaum: Homicide" with "Saturday Night Live" alumni Nora Dunn, "Wall To Wall Records" with Jordan Bridges, "Eve

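#### Acknowledgements:

Thanks to [Harrison Chase](https://twitter.com/hwchase17) for co-leading this new integration. Thanks to our own [Arnav Singhvi](https://arnavsinghvi11.github.io/) for helping with this tweet generation task and for insights on obtaining the data.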