# Argilla

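>[Argilla](https://argilla.io/) is an open-source data curation platform for LLMs.
> Using Argilla, everyone can build robust language models through faster data curation
> using both human and machine feedback. We provide support for each step in the MLOps cycle,
> from data labeling to model monitoring.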

<a target="_blank" href="https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs/integrations/callbacks/argilla.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

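In this guide, we demonstrate how to track the inputs and responses of your LLM with the `ArgillaCallbackHandler` to generate a dataset in Argilla.

Tracking LLM inputs and outputs is useful for building datasets for future fine-tuning, especially when you use an LLM to generate data for a specific task such as question answering, summarization, or translation.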

## Installation and Setup

In [None]:
%pip install --upgrade --quiet  langchain langchain-openai argilla

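### Getting API Credentials

To get the Argilla API credentials, follow these steps:

1. Go to your Argilla UI.
2. Click on your profile picture and go to "My settings".
3. Then copy the API Key.

In Argilla, the API URL is the same as the URL of your Argilla UI.

To get the OpenAI API credentials, visit https://platform.openai.com/account/api-keys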

In [11]:
import os

os.environ["ARGILLA_API_URL"] = "..."
os.environ["ARGILLA_API_KEY"] = "..."

os.environ["OPENAI_API_KEY"] = "..."
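Because the `"..."` placeholders above are easy to leave unfilled, a quick sanity check before continuing can help. The following is a minimal sketch and not part of the Argilla or LangChain APIs: the helper name `missing_credentials` is hypothetical, and it assumes that an unset variable, an empty string, and the literal placeholder `"..."` all count as missing.

```python
import os

# The three variables set in the cell above.
REQUIRED_VARS = ["ARGILLA_API_URL", "ARGILLA_API_KEY", "OPENAI_API_KEY"]


def missing_credentials(env, names=REQUIRED_VARS):
    """Return the names that are unset, empty, or still the "..." placeholder."""
    return [name for name in names if env.get(name, "").strip() in ("", "...")]


missing = missing_credentials(os.environ)
if missing:
    print("Set these variables before continuing:", ", ".join(missing))
else:
    print("All credentials are set.")
```

Passing the environment in as an argument, rather than reading `os.environ` inside the helper, keeps the check easy to exercise with a plain dictionary.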

### Setup Argilla

To use the `ArgillaCallbackHandler`, we need to create a new `FeedbackDataset` in Argilla to keep track of our LLM experiments. To do so, use the following code:

In [3]:
import argilla as rg

In [None]:
from packaging.version import parse as parse_version

if parse_version(rg.__version__) < parse_version("1.8.0"):
    raise RuntimeError(
        "`FeedbackDataset` is only available in Argilla v1.8.0 or higher, please "
        "upgrade `argilla` as `pip install argilla --upgrade`."
    )

In [None]:
dataset = rg.FeedbackDataset(
    fields=[
        rg.TextField(name="prompt"),
        rg.TextField(name="response"),
    ],
    questions=[
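        rg.RatingQuestion(
            name="response-rating",
            description="How would you rate the quality of the response?",
            values=[1, 2, 3, 4, 5],
            required=True,
        ),
        rg.TextQuestion(
            name="response-feedback",
            description="What feedback do you have for the response?",
            required=False,
        ),
    ],
    guidelines="You're asked to rate the quality of the response and provide feedback.",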
)

rg.init(
    api_url=os.environ["ARGILLA_API_URL"],
    api_key=os.environ["ARGILLA_API_KEY"],
)

dataset.push_to_argilla("langchain-dataset")

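> 📌 NOTE: at the moment, only prompt-response pairs are supported as `FeedbackDataset.fields`, so the `ArgillaCallbackHandler` tracks only the prompt (the LLM input) and the response (the LLM output).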
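To make the shape of the tracked data concrete, here is a plain-Python sketch that pairs prompts with responses under the two field names defined in the dataset above. It is illustrative only: the handler builds its records internally, the helper `to_records` is hypothetical, and the `{"fields": {...}}` layout is an assumption about how a record is organized rather than a statement of the library's exact format.

```python
def to_records(prompts, responses):
    """Pair each prompt with its response using the dataset's two field names."""
    if len(prompts) != len(responses):
        raise ValueError("prompts and responses must have the same length")
    return [
        # Assumed record layout: one dict per pair, keyed by the field names.
        {"fields": {"prompt": prompt, "response": response.strip()}}
        for prompt, response in zip(prompts, responses)
    ]


records = to_records(
    ["Tell me a joke"],
    ["\n\nQ: What did the fish say when he hit the wall? \nA: Dam."],
)
print(records)
```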

## Tracking

To use the `ArgillaCallbackHandler`, you can either use the following code or reproduce one of the examples presented in the following sections.

In [None]:
from langchain_community.callbacks.argilla_callback import ArgillaCallbackHandler

argilla_callback = ArgillaCallbackHandler(
    dataset_name="langchain-dataset",
    api_url=os.environ["ARGILLA_API_URL"],
    api_key=os.environ["ARGILLA_API_KEY"],
)

### Scenario 1: Tracking an LLM

First, let's run a single LLM a few times and capture the resulting prompt-response pairs in Argilla.

In [None]:
from langchain_core.callbacks.stdout import StdOutCallbackHandler
from langchain_openai import OpenAI

argilla_callback = ArgillaCallbackHandler(
    dataset_name="langchain-dataset",
    api_url=os.environ["ARGILLA_API_URL"],
    api_key=os.environ["ARGILLA_API_KEY"],
)
callbacks = [StdOutCallbackHandler(), argilla_callback]

llm = OpenAI(temperature=0.9, callbacks=callbacks)
llm.generate(["Tell me a joke", "Write me a poem"] * 3)

LLMResult(generations=[[Generation(text='\n\nQ: What did the fish say when he hit the wall? \nA: Dam.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\n\nThe Moon \n\nThe moon is high in the midnight sky,\nSparkling like a star above.\nThe night so peaceful, so serene,\nFilling up the air with love.\n\nEver changing and renewing,\nA never-ending light of grace.\nThe moon remains a constant view,\nA reminder of life’s gentle pace.\n\nThrough time and space it guides us on,\nA never-fading beacon of hope.\nThe moon shines down on us all,\nAs it continues to rise and elope.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\n\nQ. What did one magnet say to the other magnet?\nA. "I find you very attractive!"', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text="\n\nThe world is charged with the grandeur of God.\nIt will flame out, like shining from shook foil;\nIt gathers to a greatness, like t

![Argilla UI with LangChain LLM input-response](https://docs.argilla.io/en/latest/_images/llm.png)
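The `* 3` in the `generate` call above repeats the two-prompt list, so six prompts are sent. Assuming one generation per prompt, that should yield six prompt-response pairs in the dataset. A small sketch of the repetition itself (the English prompt strings are illustrative):

```python
prompts = ["Tell me a joke", "Write me a poem"] * 3

# List repetition preserves order, so the two prompts alternate.
for index, prompt in enumerate(prompts, start=1):
    print(index, prompt)

print(len(prompts))  # 6
```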

### Scenario 2: Tracking an LLM in a chain

Next, we create a chain from a prompt template and track the initial prompt and the final response in Argilla.

In [None]:
from langchain.chains import LLMChain
from langchain_core.callbacks.stdout import StdOutCallbackHandler
from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI

argilla_callback = ArgillaCallbackHandler(
    dataset_name="langchain-dataset",
    api_url=os.environ["ARGILLA_API_URL"],
    api_key=os.environ["ARGILLA_API_KEY"],
)
callbacks = [StdOutCallbackHandler(), argilla_callback]
llm = OpenAI(temperature=0.9, callbacks=callbacks)

template = """You are a playwright. Given the title of play, it is your job to write a synopsis for that title.
Title: {title}
Playwright: This is a synopsis for the above play:"""
prompt_template = PromptTemplate(input_variables=["title"], template=template)
synopsis_chain = LLMChain(llm=llm, prompt=prompt_template, callbacks=callbacks)

test_prompts = [{"title": "Documentary about Bigfoot in Paris"}]
synopsis_chain.apply(test_prompts)

> Entering new LLMChain chain...
Prompt after formatting:
You are a playwright. Given the title of play, it is your job to write a synopsis for that title.
Title: Documentary about Bigfoot in Paris
Playwright: This is a synopsis for the above play:

> Finished chain.


[{'text': "\n\nDocumentary about Bigfoot in Paris focuses on the story of a documentary filmmaker and their search for evidence of the legendary Bigfoot creature in the city of Paris. The play follows the filmmaker as they explore the city, meeting people from all walks of life who have had encounters with the mysterious creature. Through their conversations, the filmmaker unravels the story of Bigfoot and finds out the truth about the creature's presence in Paris. As the story progresses, the filmmaker learns more and more about the mysterious creature, as well as the different perspectives of the people living in the city, and what they think of the creature. In the end, the filmmaker's findings lead them to some surprising and heartwarming conclusions about the creature's existence and the importance it holds in the lives of the people in Paris."}]

![Argilla UI with LangChain Chain input-response](https://docs.argilla.io/en/latest/_images/chain.png)
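The prompt tracked for a chain is the template after its variables are filled in, as the "Prompt after formatting" output above shows. The sketch below uses plain `str.format` as a stand-in for `PromptTemplate`, assuming simple `{title}`-style placeholders; it calls neither LangChain nor the model.

```python
# Same template text as in the chain above, written as a plain string.
template = (
    "You are a playwright. Given the title of play, it is your job to write "
    "a synopsis for that title.\n"
    "Title: {title}\n"
    "Playwright: This is a synopsis for the above play:"
)

# Fill in the single input variable to get the text sent to the LLM.
formatted = template.format(title="Documentary about Bigfoot in Paris")
print(formatted)
```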

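### Scenario 3: Using an Agent with Tools

Finally, as a more advanced workflow, you can create an agent that uses some tools. The `ArgillaCallbackHandler` tracks the input and the output, but not the intermediate steps or thoughts, so for a given prompt we log the original prompt and the final response to it.

> Note that for this scenario we'll be using the Google Search API (Serp API), so you will need to install `google-search-results` with `pip install google-search-results` and set the Serp API key as `os.environ["SERPAPI_API_KEY"] = "..."` (you can find it at https://serpapi.com/dashboard); otherwise the example below won't work.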

In [None]:
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain_core.callbacks.stdout import StdOutCallbackHandler
from langchain_openai import OpenAI

argilla_callback = ArgillaCallbackHandler(
    dataset_name="langchain-dataset",
    api_url=os.environ["ARGILLA_API_URL"],
    api_key=os.environ["ARGILLA_API_KEY"],
)
callbacks = [StdOutCallbackHandler(), argilla_callback]
llm = OpenAI(temperature=0.9, callbacks=callbacks)

tools = load_tools(["serpapi"], llm=llm, callbacks=callbacks)
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    callbacks=callbacks,
)
agent.run("Who was the first president of the United States of America?")

> Entering new AgentExecutor chain...
 I need to answer a historical question
Action: Search
Action Input: "who was the first president of the United States of America"
Observation: George Washington
Thought: George Washington was the first president
Final Answer: George Washington was the first president of the United States of America.

> Finished chain.


'George Washington was the first president of the United States of America.'

![Argilla UI with LangChain Agent input-response](https://docs.argilla.io/en/latest/_images/agent.png)
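To illustrate the "input and final output only" behavior described for this scenario, the sketch below reduces a made-up agent run to the pair that would be kept. The `run` dictionary and the helper `logged_pair` are hypothetical and exist only for illustration; they do not mirror LangChain's internal data structures.

```python
def logged_pair(run):
    """Keep only the original prompt and the final answer of an agent run."""
    return {"prompt": run["input"], "response": run["output"]}


# Hypothetical trace modeled on the printed agent output above.
run = {
    "input": "Who was the first president of the United States of America?",
    "intermediate_steps": [
        ("Search", "who was the first president of the United States of America"),
        ("Observation", "George Washington"),
    ],
    "output": "George Washington was the first president of the United States of America.",
}

pair = logged_pair(run)
print(pair)
```

The search action and observation are dropped, which is why only one record per agent prompt is expected in the dataset.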