# DSPy-Clarifai LM and Retriever Example Notebook

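This notebook walks you through integrating Clarifai with DSPy, so that DSPy users can call LLMs hosted on the Clarifai platform and use a Clarifai application as the retriever for vector search use cases.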

### Setup

In [None]:
# Install the clarifai library
!pip install clarifai

In [None]:
# Install the dspy-ai package
!pip install dspy-ai

Import the necessary packages.

In [4]:
# Import the dspy module
import dspy
# Import the ClarifaiRM class from the dspy.retrieve.clarifai_rm module
from dspy.retrieve.clarifai_rm import ClarifaiRM

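#### Initialize the Clarifai app ID, user ID, and PAT
Follow the [getting started guide](https://docs.clarifai.com/clarifai-basics/quick-start/your-first-predictions) to create an AI application in the Clarifai portal; it takes less than a minute.

You can browse the portal to find the [model URL](https://clarifai.com/explore/models) of different models in the Clarifai community.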

In [6]:
# For this demo we use the llama2-70b-chat model
MODEL_URL = "https://clarifai.com/meta/Llama-2/models/llama2-70b-chat"
PAT = "CLARIFAI_PAT"  # Personal access token used for authentication
USER_ID = "YOUR_ID"  # User ID
APP_ID = "YOUR_APP"  # Application ID
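
Hard-coding a PAT in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of reading it from the environment instead, assuming you have exported the token in a variable named `CLARIFAI_PAT`; the `load_pat` helper and the `pat_from_env` name are illustrative and not part of DSPy or Clarifai:

```python
import os


def load_pat(env_var: str = "CLARIFAI_PAT", default: str = "") -> str:
    """Return the token stored in `env_var`, or `default` when it is unset or blank."""
    value = os.environ.get(env_var, "").strip()
    return value or default


# Hypothetical usage: falls back to the placeholder when nothing is exported.
pat_from_env = load_pat(default="CLARIFAI_PAT")
```

The returned value can then be passed wherever `PAT` is used in the cells that follow.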

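### Ingest data into the Clarifai vector database

To use Clarifai as a retriever, you only need to ingest your documents into a Clarifai application, which acts as your vector database and returns similar documents.
To simplify ingestion, we use the Clarifai vector store integration, imported from LangChain in the cell below.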

In [None]:
# Run this cell to ingest the document into the Clarifai app as chunks.
# If you run into any issues, make sure you have run `pip install langchain`.

from langchain.text_splitter import CharacterTextSplitter
from langchain.document_loaders import TextLoader
from langchain.vectorstores import Clarifai as clarifaivectorstore

loader = TextLoader("YOUR_TEXT_FILE")  # Replace with your file path
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1024, chunk_overlap=200)
docs = text_splitter.split_documents(documents)

clarifai_vector_db = clarifaivectorstore.from_documents(
    user_id=USER_ID,
    app_id=APP_ID,
    documents=docs,
    pat=PAT
)
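
To build intuition for what `chunk_size=1024` and `chunk_overlap=200` mean, here is a simplified fixed-window sketch in plain Python. It is not LangChain's algorithm (`CharacterTextSplitter` splits on a separator and then merges the pieces); the `chunk_text` helper is hypothetical and only illustrates how consecutive chunks share their boundary text:

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1024, chunk_overlap: int = 200) -> List[str]:
    """Split `text` into windows of `chunk_size` characters that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample)
print([len(c) for c in chunks])  # [1024, 1024, 852]
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.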

#### Initialize the LLM class

Make sure to pass all model parameters through the `inference_params` field of the Clarifai LLM class.

In [55]:
# Create a Clarifai LM object from the model URL, the PAT, the value of n, and the inference parameters
llm = dspy.Clarifai(model=MODEL_URL, api_key=PAT, n=2, inference_params={"max_tokens":100,'temperature':0.6})

Initialize the Clarifai retriever model class.

In [56]:
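# Create a ClarifaiRM object from the user ID, app ID, PAT, and k (number of passages to return)
# Note: the keyword `clarfiai_app_id` is spelled as in the original notebook; check it against
# your installed DSPy version before changing it.
retriever_model = ClarifaiRM(clarifai_user_id=USER_ID, clarfiai_app_id=APP_ID, clarifai_pat=PAT, k=2)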

Configure DSPy with the LM and RM models.

In [57]:
# Configure DSPy: `lm` is the language model (llm) and `rm` is the retrieval model (retriever_model)
dspy.settings.configure(lm=llm, rm=retriever_model)

### Example: dspy.Signature and dspy.Module with the Clarifai LLM

In [11]:
sentence = "disney again ransacks its archives for a quick-buck sequel ."  # Example from the SST-2 dataset.

classify = dspy.Predict('sentence -> sentiment')  # Build a predictor for sentiment classification
print(classify(sentence=sentence).sentiment)  # Print the predicted sentiment of the sentence

NEGATIVE




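### Example: a quick look at how our retriever behaves when a query is passed to the dspy.Retrieve class

Here we use the rulebook of the Formula Student Germany competition.

Link: https://www.formulastudent.de/fileadmin/user_upload/all/2024/rules/FS-Rules_2024_v1.0.pdf

The demo uses a .txt version of that file.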

In [51]:
# Create a Retrieve object
retrieve = dspy.Retrieve()

# Retrieve the passages most relevant to the question "can I test my vehicle engine in pit?"
topK_passages = retrieve("can I test my vehicle engine in pit?").passages

In [52]:
# Print the retrieved passages
print(topK_passages)

['A 6.8.3\n\nCranking engines in the pits is allowed, when the following conditions are met:\n• The vehicle has passed mechanical inspection.\n• The driven axles are securely jacked up.\n• Gearbox is in neutral.\n• All driven wheels are removed.\n• Connectors to all injectors and ignition coils are detached.\n• A fire extinguisher must be placed next to the engine.\n\nA 6.9\n\nFueling and Oil\n\nA 6.9.1\n\nFueling may only take place at the fuel station and must be conducted by officials only.\n\nA 6.9.2\n\nOpen fuel containers are not permitted at the competition.\n\nA 6.9.3\n\nWaste oil must be taken to the fuel station for disposal.\n\nA 6.10\n\n[EV ONLY ] Working on the Vehicle\n\nA 6.10.1\n\nAll activities require the TSAL to be green.\n\nA 6.10.2\n\nA prominent manual sign indicating the “TSAL green” state must be present whenever the\nLVS is switched off and the requirements for an only green TSAL according to EV 4.10 are\nmet.\n\nA 6.10.3', 'A 6.8.3\n\nCranking engines in the p

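## RAG DSPy module using Clarifai as the retriever

Building a module in DSPy generally requires you to define two things.

Signature:
Describes the input and output fields intuitively, in just a few words.
("question" -> "answer")

Module:
The module is where you put the signature into practice: it defines the steps that compile a prompt and generate a response for a given query.

Build a signature class that defines the required input and output fields.
Also provide a detailed docstring and field descriptions, so that DSPy can understand the context and compile the best prompt for the use case.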
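
The short signature form used earlier, `'sentence -> sentiment'`, lists input fields to the left of the arrow and output fields to the right. The toy parser below only illustrates that reading; it is a hypothetical helper written for this note, not how DSPy itself parses signatures:

```python
from typing import List, Tuple


def parse_signature(spec: str) -> Tuple[List[str], List[str]]:
    """Split 'a, b -> c' into (['a', 'b'], ['c'])."""
    if spec.count("->") != 1:
        raise ValueError("expected exactly one '->' in the signature string")
    left, right = spec.split("->")
    inputs = [name.strip() for name in left.split(",") if name.strip()]
    outputs = [name.strip() for name in right.split(",") if name.strip()]
    return inputs, outputs


print(parse_signature("context, question -> answer"))
# (['context', 'question'], ['answer'])
```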

In [53]:
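class GenerateAnswer(dspy.Signature):
    """Think and answer questions based on the context provided."""

    context = dspy.InputField(desc="may contain facts relevant to the user query")
    question = dspy.InputField(desc="user query")
    answer = dspy.OutputField(desc="answer in one or two sentences")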

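Define the module with the operations to perform. Here we show a small RAG use case: the retriever class fetches similar contexts, and the DSPy module `ChainOfThought` generates a response grounded in those contexts.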

In [54]:
# Define a RAG class that inherits from dspy.Module
class RAG(dspy.Module):
    def __init__(self):
        super().__init__()

        # Retriever that uses the `rm` configured in dspy.settings
        self.retrieve = dspy.Retrieve()
        # Chain-of-thought generator driven by the GenerateAnswer signature
        self.generate_answer = dspy.ChainOfThought(GenerateAnswer)

    # forward() takes the question as input
    def forward(self, question):
        # Retrieve the passages relevant to the question
        context = self.retrieve(question).passages
        # Generate a prediction from the retrieved context and the question
        prediction = self.generate_answer(context=context, question=question)
        # Return a Prediction holding the context and the generated answer
        return dspy.Prediction(context=context, answer=prediction.answer)
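
The forward pass above boils down to two steps: retrieve, then generate from what was retrieved. The sketch below mirrors that flow in plain Python so it can be run without any credentials. Everything in it is a stand-in: `keyword_retrieve` ranks passages by shared words instead of doing a vector search, and `generate` is any callable you supply in place of the LLM:

```python
from typing import Callable, Dict, List


def keyword_retrieve(question: str, corpus: List[str], k: int = 2) -> List[str]:
    """Rank passages by how many question words they share (toy stand-in for vector search)."""
    q_words = set(question.lower().split())
    ranked = sorted(
        corpus,
        key=lambda passage: len(q_words & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def rag_answer(
    question: str,
    corpus: List[str],
    generate: Callable[[List[str], str], str],
    k: int = 2,
) -> Dict[str, object]:
    """Retrieve context first, then hand context and question to the generator."""
    context = keyword_retrieve(question, corpus, k)
    return {"context": context, "answer": generate(context, question)}


toy_corpus = ["engines in the pits", "fuel station rules", "waste oil disposal"]
result = rag_answer(
    "can I crank engines in the pits",
    toy_corpus,
    generate=lambda ctx, q: f"answered from {len(ctx)} passages",
)
print(result["answer"])  # answered from 2 passages
```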

Now we pass in our query. The Clarifai retriever fetches the relevant chunks, and the model generates a response based on that factual evidence.

In [59]:
# Ask this RAG program any question.
my_question = "can I test my vehicle engine in pit before inspection?"

# Get the prediction. It contains `pred.context` and `pred.answer`.
Rag_obj = RAG()
predict_response_llama70b = Rag_obj(my_question)

# Print the question, the answer, and the retrieved contexts.
print(f"Question: {my_question}")
print(f"Predicted Answer: {predict_response_llama70b.answer}")
print(f"Retrieved Contexts (truncated): {[c[:200] + '...' for c in predict_response_llama70b.context]}")

Question: can I test my vehicle engine in pit before inspection?
Predicted Answer: No, you cannot test your vehicle engine in the pit before inspection.
Retrieved Contexts (truncated): ['A 6.8.3\n\nCranking engines in the pits is allowed, when the following conditions are met:\n• The vehicle has passed mechanical inspection.\n• The driven axles are securely jacked up.\n• Gearbox is in neut...', 'A 6.8.3\n\nCranking engines in the pits is allowed, when the following conditions are met:\n• The vehicle has passed mechanical inspection.\n• The driven axles are securely jacked up.\n• Gearbox is in neut...']
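
The two truncated contexts printed above look identical, which suggests the same chunk was ingested or returned twice (the notebook does not say which). If that is the case, sending both to the LLM only spends tokens. A minimal sketch of an order-preserving filter; `dedupe_passages` is a hypothetical helper, not a DSPy function:

```python
from typing import Iterable, List


def dedupe_passages(passages: Iterable[str]) -> List[str]:
    """Drop repeated passages, keeping the first occurrence and the original order."""
    seen = set()
    unique = []
    for passage in passages:
        key = " ".join(passage.split())  # normalise whitespace before comparing
        if key not in seen:
            seen.add(key)
            unique.append(passage)
    return unique
```

Inside `RAG.forward` this could wrap the retriever call, for example `context = dedupe_passages(self.retrieve(question).passages)`.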


### Next, we run the same RAG DSPy module with other community models from Clarifai and compare the responses.

### Mistral-7B Instruct

In [64]:
# Create a Clarifai LM object, mistral_lm, from the model URL and the PAT
mistral_lm = dspy.Clarifai(model="https://clarifai.com/mistralai/completion/models/mistral-7B-Instruct", api_key=PAT, n=2, inference_params={'temperature':0.6})

# Reconfigure DSPy: mistral_lm as the language model, retriever_model as the retrieval model
dspy.settings.configure(lm=mistral_lm, rm=retriever_model)

In [70]:
# Define the question
my_question = "can I test my vehicle engine in pit before inspection?"

# Create the RAG object
Rag_obj = RAG()

# Use the RAG module to answer the question
predict_response_mistral = Rag_obj(my_question)

# Print the question
print(f"Question: {my_question}")

# Print the predicted answer
print(f"Predicted Answer: {predict_response_mistral.answer}")

# Print the retrieved contexts (truncated to the first 200 characters)
print(f"Retrieved Contexts (truncated): {[c[:200] + '...' for c in predict_response_mistral.context]}")

Question: can I test my vehicle engine in pit before inspection?
Predicted Answer: Reasoning: According to the context, cranking engines in the pits is allowed only when the vehicle has passed mechanical inspection.

Answer: No, you cannot test your vehicle engine in pit before inspection.
Retrieved Contexts (truncated): ['A 6.8.3\n\nCranking engines in the pits is allowed, when the following conditions are met:\n• The vehicle has passed mechanical inspection.\n• The driven axles are securely jacked up.\n• Gearbox is in neut...', 'A 6.8.3\n\nCranking engines in the pits is allowed, when the following conditions are met:\n• The vehicle has passed mechanical inspection.\n• The driven axles are securely jacked up.\n• Gearbox is in neut...']
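
In the Mistral output above, the `answer` field also carries the model's `Reasoning:` text. When comparing models it helps to keep only the final answer. A small post-processing sketch, assuming the answer is introduced by a line starting with `Answer:` as in that output; `extract_answer` is a hypothetical helper:

```python
import re


def extract_answer(raw: str) -> str:
    """Return the text after the last line-initial 'Answer:' marker, or the stripped input if none."""
    parts = re.split(r"^\s*answer\s*:\s*", raw, flags=re.IGNORECASE | re.MULTILINE)
    return parts[-1].strip()


raw = (
    "Reasoning: According to the context, cranking engines in the pits is allowed "
    "only when the vehicle has passed mechanical inspection.\n\n"
    "Answer: No, you cannot test your vehicle engine in pit before inspection."
)
print(extract_answer(raw))
# No, you cannot test your vehicle engine in pit before inspection.
```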


### Gemini Pro

In [66]:
# Create a Clarifai LM object, gemini_lm, from the model URL and the PAT
gemini_lm = dspy.Clarifai(model="https://clarifai.com/gcp/generate/models/gemini-pro", api_key=PAT, n=2)

# Reconfigure DSPy: gemini_lm as the language model, retriever_model as the retrieval model
dspy.settings.configure(lm=gemini_lm, rm=retriever_model)

In [67]:
# Define the question
my_question = "can I test my vehicle engine in pit before inspection?"

# Instantiate the RAG module
Rag_obj = RAG()

# Use the RAG module to answer the question
predict_response_gemini = Rag_obj(my_question)

# Print the question
print(f"Question: {my_question}")

# Print the predicted answer
print(f"Predicted Answer: {predict_response_gemini.answer}")

# Print the retrieved contexts (truncated to the first 200 characters)
print(f"Retrieved Contexts (truncated): {[c[:200] + '...' for c in predict_response_gemini.context]}")

Question: can I test my vehicle engine in pit before inspection?
Predicted Answer: No, you can't test your vehicle engine in the pits before inspection.
Retrieved Contexts (truncated): ['A 6.8.3\n\nCranking engines in the pits is allowed, when the following conditions are met:\n• The vehicle has passed mechanical inspection.\n• The driven axles are securely jacked up.\n• Gearbox is in neut...', 'A 6.8.3\n\nCranking engines in the pits is allowed, when the following conditions are met:\n• The vehicle has passed mechanical inspection.\n• The driven axles are securely jacked up.\n• Gearbox is in neut...']


Clarifai lets you test your DSPy module against different LLMs and compare their responses, which is a key part of finding the right combination of model and prompt.
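
For a side-by-side view, the answers can be collected into a dictionary and printed as aligned rows. The sketch below uses the three answers shown in the outputs above (the Mistral entry keeps only its final `Answer:` line); `format_comparison` is a hypothetical helper:

```python
from typing import Dict

answers = {
    "llama2-70b-chat": "No, you cannot test your vehicle engine in the pit before inspection.",
    "mistral-7B-Instruct": "No, you cannot test your vehicle engine in pit before inspection.",
    "gemini-pro": "No, you can't test your vehicle engine in the pits before inspection.",
}


def format_comparison(results: Dict[str, str]) -> str:
    """Render one 'model | answer' row per entry, with model names padded to equal width."""
    width = max((len(name) for name in results), default=0)
    return "\n".join(f"{name.ljust(width)} | {answer}" for name, answer in results.items())


print(format_comparison(answers))
```

In a live session the dictionary values would come from `predict_response_llama70b.answer`, `predict_response_mistral.answer`, and `predict_response_gemini.answer`.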