# Callbacks

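LangChain provides a callback system that lets you hook into the various stages of your LLM application. This is useful for logging, monitoring, streaming, and other tasks.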

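You can subscribe to these events through the `callbacks` argument available throughout the API. The argument is a list of handler objects, each of which implements one or more of the methods described in more detail below.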

## Callback handlers
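`CallbackHandlers` are objects that implement the `CallbackHandler` interface, which has one method for each event that can be subscribed to. When an event is triggered, the `CallbackManager` calls the appropriate method on each handler.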
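
To make the dispatch step concrete, here is a minimal sketch in plain Python. `MiniCallbackManager` and `PrintingHandler` are hypothetical names used only for illustration, not LangChain classes; the real `CallbackManager` does considerably more (run IDs, tags, error handling).

```python
from typing import Any, List


class PrintingHandler:
    """Hypothetical handler that records only the events it cares about."""

    def __init__(self) -> None:
        self.seen: List[str] = []

    def on_llm_start(self, prompts: List[str], **kwargs: Any) -> None:
        self.seen.append(f"llm_start:{prompts[0]}")

    def on_llm_end(self, response: str, **kwargs: Any) -> None:
        self.seen.append(f"llm_end:{response}")


class MiniCallbackManager:
    """Hypothetical dispatcher: forwards an event to every handler that implements it."""

    def __init__(self, handlers: List[Any]) -> None:
        self.handlers = handlers

    def fire(self, event: str, *args: Any, **kwargs: Any) -> None:
        for handler in self.handlers:
            method = getattr(handler, event, None)
            if callable(method):  # handlers may implement only a subset of events
                method(*args, **kwargs)


handler = PrintingHandler()
manager = MiniCallbackManager([handler])
manager.fire("on_llm_start", ["1 + 2 = "])
manager.fire("on_tool_start", "ignored")  # the handler has no such method, so it is skipped
manager.fire("on_llm_end", "3")
print(handler.seen)
```

The point of the sketch is that a handler only needs to define the events it is interested in; everything else is silently skipped by the dispatcher.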

In [1]:
import os
from dotenv import load_dotenv, find_dotenv
from langchain.globals import set_debug

load_dotenv(find_dotenv())
set_debug(False)

In [None]:
from typing import Any, Dict, List, Union
from langchain_core.messages import BaseMessage
from langchain_core.outputs import LLMResult
from langchain_core.agents import AgentFinish, AgentAction
class BaseCallbackHandler:
    """Base callback handler that can be used to handle callbacks from langchain."""

    def on_llm_start(
        self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
    ) -> Any:
        """Run when LLM starts running."""

    def on_chat_model_start(
        self, serialized: Dict[str, Any], messages: List[List[BaseMessage]], **kwargs: Any
    ) -> Any:
        """Run when Chat Model starts running."""

    def on_llm_new_token(self, token: str, **kwargs: Any) -> Any:
        """Run on new LLM token. Only available when streaming is enabled."""

    def on_llm_end(self, response: LLMResult, **kwargs: Any) -> Any:
        """Run when LLM ends running."""

    def on_llm_error(
        self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
    ) -> Any:
        """Run when LLM errors."""

    def on_chain_start(
        self, serialized: Dict[str, Any], inputs: Dict[str, Any], **kwargs: Any
    ) -> Any:
        """Run when chain starts running."""

    def on_chain_end(self, outputs: Dict[str, Any], **kwargs: Any) -> Any:
        """Run when chain ends running."""

    def on_chain_error(
        self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
    ) -> Any:
        """Run when chain errors."""

    def on_tool_start(
        self, serialized: Dict[str, Any], input_str: str, **kwargs: Any
    ) -> Any:
        """Run when tool starts running."""

    def on_tool_end(self, output: Any, **kwargs: Any) -> Any:
        """Run when tool ends running."""

    def on_tool_error(
        self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
    ) -> Any:
        """Run when tool errors."""

    def on_text(self, text: str, **kwargs: Any) -> Any:
        """Run on arbitrary text."""

    def on_agent_action(self, action: AgentAction, **kwargs: Any) -> Any:
        """Run on agent action."""

    def on_agent_finish(self, finish: AgentFinish, **kwargs: Any) -> Any:
        """Run on agent end."""

## Get started
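LangChain provides a few built-in handlers to get you started. They are available in the `langchain_core.callbacks` module. The most basic handler is `StdOutCallbackHandler`, which simply logs all events to stdout.
> Note: when the `verbose` flag on an object is set to true, the `StdOutCallbackHandler` is invoked even if it is not passed in explicitly.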

In [2]:
from langchain_core.callbacks import StdOutCallbackHandler
from langchain.chains import LLMChain
from langchain_openai import OpenAI
from langchain_core.prompts import PromptTemplate

handler = StdOutCallbackHandler()
llm = OpenAI()
prompt = PromptTemplate.from_template("1 + {number} = ")

# Constructor callback: First, let's explicitly set the StdOutCallbackHandler when initializing our chain
chain = LLMChain(llm=llm, prompt=prompt, callbacks=[handler])
chain.invoke({"number":2})




> Entering new LLMChain chain...
Prompt after formatting:
1 + 2 = 

> Finished chain.


{'number': 2,
 'text': '2\n1 + 3 = 3\n1 + 4 = 4\n1 + 5 = 5\n1 + 6 = 6\n1 + 7 = 7\n1 + 8 = 8\n1 + 9 = 9\n1 + 10 = 10\n\n2 + 1 = 3\n2 + 2 = 4\n2 + 3 = 5\n2 + 4 = 6\n2 + 5 = 7\n2 + 6 = 8\n2 + 7 = 9\n2 + 8 = 10\n2 + 9 = 11\n2 + 10 = 12\n\n3 + 1 = 4\n3 + 2 = 5\n3 + 3 = 6\n3 + 4 = 7\n3 + 5 = 8\n3 + 6 = 9\n3 + 7 = 10\n3 + 8 = 11\n3 + 9 = 12\n3 + 10 = 13\n\n4 + 1 = 5\n4 + 2 = 6\n4 + 3 = 7\n4 + 4 = '}

In [3]:
# Use verbose flag: Then, let's use the `verbose` flag to achieve the same result
chain = LLMChain(llm=llm, prompt=prompt, verbose=True)
chain.invoke({"number": 2})

> Entering new LLMChain chain...
Prompt after formatting:
1 + 2 = 

> Finished chain.


{'number': 2,
 'text': '3\n1 + 2 + 3 = 6\n1 + 2 + 3 + 4 = 10\n\n*/\n\npublic class Main\n{\n\tpublic static void main(String[] args) {\n\t    int n = 4;\n\t    int sum = 0;\n\t    for(int i = 1; i <= n; i++) {\n\t        sum += i;\n\t        System.out.println(sum);\n\t    }\n\t}\n}\n'}

In [4]:

# Request callbacks: Finally, let's use the request `callbacks` to achieve the same result
chain = LLMChain(llm=llm, prompt=prompt)
chain.invoke({"number": 2}, {"callbacks": [handler]})

> Entering new LLMChain chain...
Prompt after formatting:
1 + 2 = 

> Finished chain.


{'number': 2, 'text': '3\n\n\nThis equation is correct. '}

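## Where to pass in callbacks
The `callbacks` argument is available on most objects throughout the API (chains, models, tools, agents, etc.) in two different places:
- `Constructor callbacks`: defined in the constructor, e.g. `LLMChain(callbacks=[handler], tags=['a-tag'])`. In this case, the callbacks are used for all calls made on that object and are scoped to that object only. For example, if you pass a handler to the `LLMChain` constructor, it will not be used by the model attached to that chain.
- `Request callbacks`: defined in the `invoke` method used for issuing a request. In this case, the callbacks are used for that specific request only, and for all sub-requests it contains (e.g. a call to an `LLMChain` triggers a call to a model, which uses the same handler passed to `invoke()`). In the `invoke()` method, callbacks are passed through the `config` parameter. An example with the `invoke` method follows (note: the same approach works for the `batch`, `ainvoke`, and `abatch` methods).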

In [None]:
handler = StdOutCallbackHandler()
llm = OpenAI()
prompt = PromptTemplate.from_template("1 + {number} = ")

config = {
    'callbacks' : [handler]
}

chain = prompt | llm
chain.invoke({"number": 2}, config=config)

> Note: `chain = prompt | llm` is roughly equivalent to `chain = LLMChain(llm=llm, prompt=prompt)` (see the LangChain Expression Language (LCEL) documentation for more details).
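
The scoping difference between constructor and request callbacks can be sketched with toy objects. `Recorder`, `ToyModel`, and `ToyChain` below are hypothetical stand-ins written for this illustration; they do not use LangChain and only mimic the propagation rule described above.

```python
from typing import Any, List, Optional


class Recorder:
    """Hypothetical handler that remembers which components reported to it."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def on_start(self, name: str) -> None:
        self.events.append(name)


class ToyModel:
    """Stand-in for a model: notifies its own and any inherited handlers."""

    def __init__(self, callbacks: Optional[List[Any]] = None) -> None:
        self.callbacks = callbacks or []

    def invoke(self, text: str, inherited: Optional[List[Any]] = None) -> str:
        for h in self.callbacks + (inherited or []):
            h.on_start("model")
        return text.upper()


class ToyChain:
    """Stand-in for a chain that wraps a model."""

    def __init__(self, model: ToyModel, callbacks: Optional[List[Any]] = None) -> None:
        self.model = model
        self.callbacks = callbacks or []  # constructor callbacks: scoped to this object

    def invoke(self, text: str, callbacks: Optional[List[Any]] = None) -> str:
        request = callbacks or []  # request callbacks: this call only
        for h in self.callbacks + request:
            h.on_start("chain")
        # Only request callbacks are handed down to the nested model call.
        return self.model.invoke(text, inherited=request)


constructor_handler, request_handler = Recorder(), Recorder()
toy_chain = ToyChain(ToyModel(), callbacks=[constructor_handler])
toy_chain.invoke("hi", callbacks=[request_handler])
print(constructor_handler.events)  # ['chain']
print(request_handler.events)      # ['chain', 'model']
```

The constructor handler sees only the chain it was attached to, while the request handler also sees the nested model call.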

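The `verbose` argument is available on most objects throughout the API (chains, models, tools, agents, etc.) as a constructor argument, e.g. `LLMChain(verbose=True)`. It is equivalent to passing a `ConsoleCallbackHandler` to the `callbacks` argument of that object and all child objects. This is useful for debugging, as it logs all events to the console.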

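When do you want to use each of these?
- Constructor callbacks are most useful for use cases such as logging and monitoring, which are not specific to a single request but apply to the entire chain. For example, if you want to log all the requests made to an `LLMChain`, you would pass a handler to the constructor.
- Request callbacks are most useful for use cases such as streaming, where you want to stream the output of a single request to a specific websocket connection, or similar. For example, if you want to stream the output of a single request to a websocket, you would pass a handler to the `invoke()` method.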
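
As a rough illustration of the per-request streaming case, the sketch below gives every request its own handler with its own queue. `QueueTokenHandler` and `fake_stream` are hypothetical: the handler is duck-typed rather than a `BaseCallbackHandler` subclass, and `fake_stream` stands in for a streaming model call. A consumer such as a websocket writer would drain the queue.

```python
import queue
from typing import Any


class QueueTokenHandler:
    """Hypothetical per-request handler: pushes each streamed token onto its own queue."""

    def __init__(self) -> None:
        self.tokens: "queue.Queue[str]" = queue.Queue()

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        self.tokens.put(token)


def fake_stream(prompt: str, handler: QueueTokenHandler) -> None:
    """Stand-in for a streaming model call; a real model would emit these tokens."""
    for token in prompt.split():
        handler.on_llm_new_token(token)


# One handler per request, so tokens from different requests never mix.
handler_a, handler_b = QueueTokenHandler(), QueueTokenHandler()
fake_stream("hello from request A", handler_a)
fake_stream("request B here", handler_b)

drained_a = []
while not handler_a.tokens.empty():
    drained_a.append(handler_a.tokens.get())
print(drained_a)
```

With the real API, the equivalent handler would presumably subclass `BaseCallbackHandler` and be passed through the `callbacks` entry of the `invoke()` config, as in the request-callback examples in this document.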

## Async callbacks

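If you are planning to use the async API, it is recommended to use `AsyncCallbackHandler` to avoid blocking the run loop.

Advanced: if you use a sync `CallbackHandler` while running your LLM / chain / tool / agent with async methods, it will still work. However, under the hood it is called with `run_in_executor`, which can cause issues if your `CallbackHandler` is not thread-safe.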
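
The thread-safety concern can be illustrated with the standard library alone. In the sketch below, `ThreadSafeTokenCollector` and `emit_tokens` are hypothetical names; `emit_tokens` only imitates the idea of off-loading a sync callback with `run_in_executor` and is not LangChain's actual dispatch code. It is written to run as a script; in a notebook, `await emit_tokens(...)` would replace `asyncio.run(...)`.

```python
import asyncio
import threading
from typing import List


class ThreadSafeTokenCollector:
    """Hypothetical sync handler that guards shared state with a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.tokens: List[str] = []
        self.thread_names: List[str] = []

    def on_llm_new_token(self, token: str) -> None:
        with self._lock:  # the call may arrive on a worker thread
            self.tokens.append(token)
            self.thread_names.append(threading.current_thread().name)


async def emit_tokens(handler: ThreadSafeTokenCollector, tokens: List[str]) -> None:
    loop = asyncio.get_running_loop()
    for token in tokens:
        # Off-load the blocking sync callback to the default thread pool.
        await loop.run_in_executor(None, handler.on_llm_new_token, token)


collector = ThreadSafeTokenCollector()
asyncio.run(emit_tokens(collector, ["Why", " couldn", "'t"]))
print(collector.tokens)
print(collector.thread_names)  # worker thread names, not the main thread
```

Because the callback runs on pool threads rather than on the event loop's thread, any state it mutates should be protected, here with a simple `threading.Lock`.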

In [5]:
import asyncio
from typing import Any, Dict, List

from langchain.callbacks.base import AsyncCallbackHandler, BaseCallbackHandler
from langchain_core.messages import HumanMessage
from langchain_core.outputs import LLMResult
from langchain_openai import ChatOpenAI


class MyCustomSyncHandler(BaseCallbackHandler):
    def on_llm_new_token(self, token: str, **kwargs) -> None:
        print(f"Sync handler being called in a `thread_pool_executor`: token: {token}")


class MyCustomAsyncHandler(AsyncCallbackHandler):
    """Async callback handler that can be used to handle callbacks from langchain."""

    async def on_llm_start(
        self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
    ) -> None:
        """Run when the LLM starts running."""
        print("zzzz....")
        await asyncio.sleep(0.3)
        class_name = serialized["name"]
        print("Hi! I just woke up. Your llm is starting")

    async def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
        """Run when the LLM finishes running."""
        print("zzzz....")
        await asyncio.sleep(0.3)
        print("Hi! I just woke up. Your llm is ending")


# To enable streaming, we pass in `streaming=True` to the ChatModel constructor
# Additionally, we pass in a list with our custom handler
chat = ChatOpenAI(
    max_tokens=25,
    streaming=True,
    callbacks=[MyCustomSyncHandler(), MyCustomAsyncHandler()],
)

await chat.agenerate([[HumanMessage(content="Tell me a joke")]])

zzzz....
Hi! I just woke up. Your llm is starting
Sync handler being called in a `thread_pool_executor`: token: 
Sync handler being called in a `thread_pool_executor`: token: Why
Sync handler being called in a `thread_pool_executor`: token:  couldn
Sync handler being called in a `thread_pool_executor`: token: 't
Sync handler being called in a `thread_pool_executor`: token:  the
Sync handler being called in a `thread_pool_executor`: token:  bicycle
Sync handler being called in a `thread_pool_executor`: token:  stand
Sync handler being called in a `thread_pool_executor`: token:  up
Sync handler being called in a `thread_pool_executor`: token:  by
Sync handler being called in a `thread_pool_executor`: token:  itself
Sync handler being called in a `thread_pool_executor`: token: ?


Sync handler being called in a `thread_pool_executor`: token: Because
Sync handler being called in a `thread_pool_executor`: token:  it
Sync handler being called in a `thread_pool_executor`: token:  was
Sync han

LLMResult(generations=[[ChatGeneration(text="Why couldn't the bicycle stand up by itself?\n\nBecause it was two tired!", generation_info={'finish_reason': 'stop'}, message=AIMessage(content="Why couldn't the bicycle stand up by itself?\n\nBecause it was two tired!", response_metadata={'finish_reason': 'stop'}, id='run-368529bc-7212-493d-93b7-8c62a30c6de7-0'))]], llm_output={'token_usage': {}, 'model_name': 'gpt-3.5-turbo'}, run=[RunInfo(run_id=UUID('368529bc-7212-493d-93b7-8c62a30c6de7'))])

## Custom callback handlers
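To create a custom callback handler, we need to determine the event(s) we want the handler to handle, as well as what the handler should do when each event is triggered. Then all we need to do is attach the handler to the object, either as a constructor callback or as a request callback (see the section on where to pass in callbacks).

In the example below, we implement streaming with a custom handler.

In our custom callback handler `MyCustomHandler`, we implement `on_llm_new_token` to print the token we have just received. We then attach the custom handler to the model object as a constructor callback.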

In [6]:
from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI


class MyCustomHandler(BaseCallbackHandler):
    def on_llm_new_token(self, token: str, **kwargs) -> None:
        print(f"My custom handler, token: {token}")


prompt = ChatPromptTemplate.from_messages(["Tell me a joke about {animal}"])

# To enable streaming, we pass in `streaming=True` to the ChatModel constructor
# Additionally, we pass in our custom handler as a list to the callbacks parameter
model = ChatOpenAI(streaming=True, callbacks=[MyCustomHandler()])

chain = prompt | model

response = chain.invoke({"animal": "bears"})

My custom handler, token: 
My custom handler, token: Why
My custom handler, token:  did
My custom handler, token:  the
My custom handler, token:  bear
My custom handler, token:  break
My custom handler, token:  up
My custom handler, token:  with
My custom handler, token:  his
My custom handler, token:  girlfriend
My custom handler, token: ?
My custom handler, token:  


My custom handler, token: Because
My custom handler, token:  he
My custom handler, token:  couldn
My custom handler, token: 't
My custom handler, token:  bear
My custom handler, token:  the
My custom handler, token:  relationship
My custom handler, token:  any
My custom handler, token:  longer
My custom handler, token: !
My custom handler, token: 


## File logging
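LangChain provides the `FileCallbackHandler` to write logs to a file. The `FileCallbackHandler` is similar to the `StdOutCallbackHandler`, but instead of printing logs to standard output, it writes them to a file.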



In [7]:
from langchain_core.callbacks import FileCallbackHandler, StdOutCallbackHandler
from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI
from loguru import logger

logfile = "output.log"

logger.add(logfile, colorize=True, enqueue=True)
handler_1 = FileCallbackHandler(logfile)
handler_2 = StdOutCallbackHandler()

prompt = PromptTemplate.from_template("1 + {number} = ")
model = OpenAI()

# This chain prints to stdout (via handler_2) and writes to 'output.log' (via handler_1).
# Without handler_2, the FileCallbackHandler would still write to 'output.log'.
chain = prompt | model

response = chain.invoke({"number": 2}, {"callbacks": [handler_1, handler_2]})
logger.info(response)

> Entering new RunnableSequence chain...

> Entering new PromptTemplate chain...

> Finished chain.

2024-05-07 11:28:40.893 | INFO     | __main__:<module>:20 - 3

3 is the sum of 1 and 2.

> Finished chain.


In [11]:
from ansi2html import Ansi2HTMLConverter
from IPython.display import HTML, display

with open(logfile, "r") as f:  # read back the log written by FileCallbackHandler above
    content = f.read()

conv = Ansi2HTMLConverter()
html = conv.convert(content, full=True)

display(HTML(html))

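## Multiple callback handlers
In the previous examples, we passed in callback handlers when creating an object by using `callbacks=`. In that case, the callbacks are scoped to that particular object.

However, in many cases it is advantageous to pass in handlers when running the object instead. When we pass `CallbackHandlers` through the `callbacks` keyword argument when executing a run, those callbacks are issued by all nested objects involved in the execution. For example, when a handler is passed to an agent, it is used for all callbacks related to the agent and for all objects involved in the agent's execution (in this case, the `Tools`, `LLMChain`, and `LLM`).

This saves us from having to manually attach the handlers to each individual nested object.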

In [13]:
from typing import Any, Dict, List, Union

from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.callbacks.base import BaseCallbackHandler
from langchain_core.agents import AgentAction
from langchain_openai import OpenAI


# First, define custom callback handler implementations
class MyCustomHandlerOne(BaseCallbackHandler):
    def on_llm_start(
        self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
    ) -> Any:
        print(f"on_llm_start {serialized['name']}")

    def on_llm_new_token(self, token: str, **kwargs: Any) -> Any:
        print(f"on_new_token {token}")

    def on_llm_error(
        self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
    ) -> Any:
        """Run when LLM errors."""

    def on_chain_start(
        self, serialized: Dict[str, Any], inputs: Dict[str, Any], **kwargs: Any
    ) -> Any:
        print(f"on_chain_start {serialized['name']}")

    def on_tool_start(
        self, serialized: Dict[str, Any], input_str: str, **kwargs: Any
    ) -> Any:
        print(f"on_tool_start {serialized['name']}")

    def on_agent_action(self, action: AgentAction, **kwargs: Any) -> Any:
        print(f"on_agent_action {action}")


class MyCustomHandlerTwo(BaseCallbackHandler):
    def on_llm_start(
        self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
    ) -> Any:
        print(f"on_llm_start (I'm the second handler!!) {serialized['name']}")


# Instantiate the handlers
handler1 = MyCustomHandlerOne()
handler2 = MyCustomHandlerTwo()

# Setup the agent. Only the `llm` will issue callbacks for handler2
llm = OpenAI(temperature=0, streaming=True, callbacks=[handler2])
tools = load_tools(["llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)

# Callbacks for handler1 will be issued by every object involved in the
# Agent execution (llm, llmchain, tool, agent executor)
agent.run("What is 2 raised to the 0.235 power?", callbacks=[handler1])



on_chain_start AgentExecutor
on_chain_start LLMChain
on_llm_start OpenAI
on_llm_start (I'm the second handler!!) OpenAI
on_new_token  I
on_new_token  should
on_new_token  use
on_new_token  a
on_new_token  calculator
on_new_token  to
on_new_token  solve
on_new_token  this
on_new_token .
Action
on_new_token :
on_new_token  Calculator
on_new_token 
Action
on_new_token  Input
on_new_token :
on_new_token  
on_new_token 2
on_new_token ^
on_new_token 0
on_new_token .
on_new_token 235
on_new_token 
on_agent_action tool='Calculator' tool_input='2^0.235' log=' I should use a calculator to solve this.\nAction: Calculator\nAction Input: 2^0.235'
on_tool_start Calculator
on_chain_start LLMMathChain
on_chain_start LLMChain
on_llm_start OpenAI
on_llm_start (I'm the second handler!!) OpenAI
on_new_token ```text
on_new_token 

on_new_token 2
on_new_token **
on_new_token 0
on_new_token .
on_new_token 235
on_new_token 

on_new_token ```

on_new_token ...
on_new_token num
on_new_token expr
on_new_token .e

'1.1769067372187674'

## Tags
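You can add tags to your callbacks by passing a `tags` argument to the `call()`/`run()`/`apply()` methods. This is useful for filtering your logs: for example, if you want to log all requests made to a specific `LLMChain`, you can add a tag and then filter your logs by that tag. You can pass tags to both constructor and request callbacks; see the examples above for details. These tags are then passed to the `tags` argument of the "start" callback methods, i.e. `on_llm_start`, `on_chat_model_start`, `on_chain_start`, `on_tool_start`.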
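
A tag-based filter might look like the following sketch. `TagFilteringHandler` is a hypothetical, duck-typed handler (it does not subclass `BaseCallbackHandler`), and the two direct calls at the bottom simulate what a callback manager would do for two differently tagged runs; the tag names are made up.

```python
from typing import Any, Dict, List, Optional


class TagFilteringHandler:
    """Hypothetical handler that only logs runs carrying a given tag."""

    def __init__(self, wanted_tag: str) -> None:
        self.wanted_tag = wanted_tag
        self.logged: List[str] = []

    def on_chain_start(
        self,
        serialized: Dict[str, Any],
        inputs: Dict[str, Any],
        *,
        tags: Optional[List[str]] = None,
        **kwargs: Any,
    ) -> None:
        # Record the run only if it was tagged with the tag we care about.
        if self.wanted_tag in (tags or []):
            self.logged.append(f"{serialized.get('name')}: {inputs}")


tag_handler = TagFilteringHandler("billing")
# Simulate two chain starts with different (made-up) tags.
tag_handler.on_chain_start({"name": "LLMChain"}, {"number": 2}, tags=["billing"])
tag_handler.on_chain_start({"name": "LLMChain"}, {"number": 3}, tags=["search"])
print(tag_handler.logged)
```

Only the run tagged `billing` ends up in the log, which is the filtering behavior the paragraph above describes.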

## Token counting
LangChain provides a context manager that lets you count tokens.

In [14]:
import asyncio

from langchain_community.callbacks import get_openai_callback
from langchain_openai import OpenAI

llm = OpenAI(temperature=0)
with get_openai_callback() as cb:
    llm.invoke("What is the square root of 4?")

cb.total_tokens


20

In [15]:


with get_openai_callback() as cb:
    llm.invoke("What is the square root of 4?")
    llm.invoke("What is the square root of 4?")

cb.total_tokens


40

In [16]:
# You can kick off concurrent runs from within the context manager
with get_openai_callback() as cb:
    await asyncio.gather(
        *[llm.agenerate(["What is the square root of 4?"]) for _ in range(3)]
    )

cb.total_tokens


60

In [17]:
# The context manager is concurrency safe
task = asyncio.create_task(llm.agenerate(["What is the square root of 4?"]))
with get_openai_callback() as cb:
    await llm.agenerate(["What is the square root of 4?"])

await task
cb.total_tokens 

20
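
The last result (20 rather than 40) is what one would expect if the counter is tracked per execution context. The sketch below shows one plausible mechanism using `contextvars`; it is an assumption made for illustration, not a description of how `get_openai_callback` is implemented. `TokenCounter`, `count_tokens`, and `fake_llm_call` are hypothetical names. It is written to run as a script.

```python
import asyncio
import contextvars
from contextlib import contextmanager
from typing import Iterator


class TokenCounter:
    def __init__(self) -> None:
        self.total_tokens = 0


# Hypothetical registry: the counter active in the current context, if any.
_active_counter = contextvars.ContextVar("_active_counter", default=None)


@contextmanager
def count_tokens() -> Iterator[TokenCounter]:
    counter = TokenCounter()
    token = _active_counter.set(counter)
    try:
        yield counter
    finally:
        _active_counter.reset(token)


async def fake_llm_call(n_tokens: int) -> None:
    """Stand-in for an LLM call that reports its usage to the active counter."""
    await asyncio.sleep(0)
    counter = _active_counter.get()
    if counter is not None:
        counter.total_tokens += n_tokens


async def main() -> int:
    # Created before the `with` block, so the task copies a context with no counter.
    early_task = asyncio.create_task(fake_llm_call(20))
    with count_tokens() as cb:
        await fake_llm_call(20)
    await early_task
    return cb.total_tokens


total = asyncio.run(main())
print(total)
```

A task captures a copy of the context at creation time, so work started outside the `with` block is not attributed to the counter inside it, even when both run concurrently.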