<a href="https://colab.research.google.com/github/jerryjliu/llama_index/blob/main/docs/docs/examples/callbacks/LlamaDebugHandler.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Llama Debug Handler

This notebook shows how the LlamaDebugHandler logs events while queries run in LlamaIndex.

**NOTE**: This is a beta feature. How it is used within different classes, and the API of the CallbackManager and LlamaDebugHandler, may change!


If you are opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.


In [None]:
%pip install llama-index-agent-openai
%pip install llama-index-llms-openai

In [None]:
!pip install llama-index

In [None]:
from llama_index.core.callbacks import (
    CallbackManager,
    LlamaDebugHandler,
    CBEventType,
)

## Download Data


In [None]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

In [None]:
from llama_index.core import SimpleDirectoryReader

docs = SimpleDirectoryReader("./data/paul_graham/").load_data()

## Callback Manager Setup


In [None]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo", temperature=0)
llama_debug = LlamaDebugHandler(print_trace_on_end=True)
callback_manager = CallbackManager([llama_debug])

## Trigger the Callback with a Query


In [None]:
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(
    docs, callback_manager=callback_manager
)
query_engine = index.as_query_engine()

**********
Trace: index_construction
    |_node_parsing ->  0.134458 seconds
      |_chunking ->  0.132142 seconds
    |_embedding ->  0.329045 seconds
    |_embedding ->  0.357797 seconds
**********


In [None]:
response = query_engine.query("What did the author do growing up?")

**********
Trace: query
    |_query ->  2.198197 seconds
      |_retrieve ->  0.122185 seconds
        |_embedding ->  0.117082 seconds
      |_synthesize ->  2.075836 seconds
        |_llm ->  2.069724 seconds
**********
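
The indentation in the printed trace encodes nesting, so the text can be post-processed to find where the time goes. The following is a minimal sketch that parses the query trace shown above with a regular expression and picks the slowest leaf step. The names `TRACE`, `LINE_RE`, and `parse_trace` are hypothetical helpers written for this illustration, not part of LlamaIndex, and the parser assumes the exact text layout printed above.

```python
import re

# Trace text copied from the query output above.
TRACE = """\
Trace: query
    |_query ->  2.198197 seconds
      |_retrieve ->  0.122185 seconds
        |_embedding ->  0.117082 seconds
      |_synthesize ->  2.075836 seconds
        |_llm ->  2.069724 seconds
"""

# Matches lines such as "      |_retrieve ->  0.122185 seconds".
LINE_RE = re.compile(r"^(\s*)\|_(\w+) ->\s+([\d.]+) seconds$")


def parse_trace(text):
    """Return (indent, event_name, seconds) for each timed line."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            indent, name, secs = match.groups()
            rows.append((len(indent), name, float(secs)))
    return rows


rows = parse_trace(TRACE)

# A leaf is a line that is not followed by a more deeply indented line.
leaves = [
    (name, secs)
    for i, (indent, name, secs) in enumerate(rows)
    if i + 1 == len(rows) or rows[i + 1][0] <= indent
]
slowest = max(leaves, key=lambda item: item[1])
print(slowest)
```

For this trace the slowest leaf is the `llm` step, which accounts for nearly all of the `synthesize` time.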


## Explore the Debug Information

The callback manager logs start and end events for the following event types:
- CBEventType.LLM
- CBEventType.EMBEDDING
- CBEventType.CHUNKING
- CBEventType.NODE_PARSING
- CBEventType.RETRIEVE
- CBEventType.SYNTHESIZE 
- CBEventType.TREE
- CBEventType.QUERY

The LlamaDebugHandler provides a few basic methods for exploring information about these events.


In [None]:
# Print timing info for the LLM calls made during the query
print(llama_debug.get_event_time_info(CBEventType.LLM))

EventStats(total_secs=2.069724, average_secs=2.069724, total_count=1)
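
To make the three fields of this output concrete, here is a rough sketch of how a total, an average, and a count could be derived from start/end event pairs. It is not the library's implementation: `FakeEvent`, `Stats`, `time_info`, the timestamp format, and the sample timestamps are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumed timestamp layout for this sketch; not taken from the notebook.
TIMESTAMP_FORMAT = "%m/%d/%Y, %H:%M:%S.%f"


@dataclass
class FakeEvent:
    """Hypothetical stand-in for a logged event: only a timestamp string."""

    time: str


@dataclass
class Stats:
    """Hypothetical container mirroring the three fields printed above."""

    total_secs: float
    average_secs: float
    total_count: int


def time_info(event_pairs):
    """Sum and average the durations of (start, end) event pairs."""
    durations = [
        (
            datetime.strptime(end.time, TIMESTAMP_FORMAT)
            - datetime.strptime(start.time, TIMESTAMP_FORMAT)
        ).total_seconds()
        for start, end in event_pairs
    ]
    total = sum(durations)
    count = len(durations)
    return Stats(total, total / count if count else 0.0, count)


# Illustrative timestamps: one 2-second call and one 1-second call.
pairs = [
    (FakeEvent("01/01/2024, 12:00:00.000000"), FakeEvent("01/01/2024, 12:00:02.000000")),
    (FakeEvent("01/01/2024, 12:00:05.000000"), FakeEvent("01/01/2024, 12:00:06.000000")),
]
stats = time_info(pairs)
print(stats)
```

With a single LLM call, as in the output above, the total and the average are the same number.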


In [None]:
# Print info on LLM inputs/outputs - returns the start/end events for each LLM call
event_pairs = llama_debug.get_llm_inputs_outputs()
print(event_pairs[0][0])
print(event_pairs[0][1].payload.keys())
print(event_pairs[0][1].payload["response"])

CBEvent(event_type=<CBEventType.LLM: 'llm'>, payload={<EventPayload.MESSAGES: 'messages'>: [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content="You are an expert Q&A system that is trusted around the world.\nAlways answer the query using the provided context information, and not prior knowledge.\nSome rules to follow:\n1. Never directly reference the given context in your answer.\n2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines.", additional_kwargs={}), ChatMessage(role=<MessageRole.USER: 'user'>, content='Context information is below.\n---------------------\nWhat I Worked On\n\nFebruary 2021\n\nBefore college the two main things I worked on, outside of school, were writing and programming.I didn\'t write essays.I wrote what beginning writers were supposed to write then, and probably still are: short stories.My stories were awful.They had hardly any plot, just characters with strong feelings, which I imagined mad

In [None]:
# Get info on any event type
event_pairs = llama_debug.get_event_pairs(CBEventType.CHUNKING)
print(event_pairs[0][0].payload.keys())  # payload keys of the first chunking start event
print(event_pairs[0][1].payload.keys())  # payload keys of the first chunking end event

dict_keys([<EventPayload.CHUNKS: 'chunks'>])
dict_keys([<EventPayload.CHUNKS: 'chunks'>])
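
Each entry in `event_pairs` holds a start event at index 0 and an end event at index 1. One way such pairs could be built from a flat event log is to group events by a shared identifier, as sketched below. This is only an illustration under stated assumptions: `FakeEvent`, its `id_` and `phase` fields, and `pair_events` are hypothetical and are not claimed to match how LlamaDebugHandler works internally.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FakeEvent:
    """Hypothetical stand-in for a logged event."""

    event_type: str
    id_: str  # assumed: start and end of one operation share this id
    phase: str  # "start" or "end"; illustrative only


def pair_events(events):
    """Group events that share an id, keeping first-seen order."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event.id_].append(event)
    # Keep only complete (start, end) groups.
    return [group for group in grouped.values() if len(group) == 2]


log = [
    FakeEvent("chunking", "a", "start"),
    FakeEvent("chunking", "a", "end"),
    FakeEvent("chunking", "b", "start"),
    FakeEvent("chunking", "b", "end"),
    FakeEvent("chunking", "c", "start"),  # unfinished: no end event logged
]
pairs_by_id = pair_events(log)
print(len(pairs_by_id), pairs_by_id[0][0].phase, pairs_by_id[0][1].phase)
```

Dropping incomplete groups means an operation that never finished (id `"c"` here) does not show up as a pair.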


In [None]:
# Clear the currently cached events
llama_debug.flush_event_logs()

## View Traces and Events for Agents

This example shows how to view the traces and events produced by an agent.


In [None]:
# First, create a tool for the agent
from llama_index.core.tools import QueryEngineTool

tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="PaulGrahamQuestionAnswer",
    description="Given a question about Paul Graham, will return an answer.",
)

In [None]:
# Now construct the agent
from llama_index.agent.openai import OpenAIAgent

agent = OpenAIAgent.from_tools(
    tools=[tool], llm=llm, callback_manager=callback_manager
)

In [None]:
response = agent.chat("What did Paul do growing up?")

**********
Trace: chat
    |_llm ->  1.169013 seconds
    |_query ->  2.357469 seconds
      |_retrieve ->  0.107983 seconds
        |_embedding ->  0.099368 seconds
      |_synthesize ->  2.24932 seconds
        |_llm ->  2.239481 seconds
    |_llm ->  2.153333 seconds
**********


In [None]:
# Works the same for async
response = await agent.achat("What did Paul do growing up?")

**********
Trace: chat
    |_llm ->  1.318663 seconds
    |_query ->  2.803533 seconds
      |_retrieve ->  0.121228 seconds
        |_embedding ->  0.116355 seconds
      |_synthesize ->  2.68217 seconds
        |_llm ->  2.676306 seconds
    |_llm ->  2.716374 seconds
**********


In [None]:
# Clear the currently cached events
llama_debug.flush_event_logs()