# API Call Observability

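With the new `instrumentation` package, we can directly observe API calls made with LLMs and embedding models.

In this notebook, we explore how to add observability to LLM and embedding calls.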


In [None]:
import os

os.environ["OPENAI_API_KEY"] = "sk-..."

## Define an Event Handler


In [None]:
from llama_index.core.instrumentation.event_handlers import BaseEventHandler
from llama_index.core.instrumentation.events.llm import (
    LLMCompletionEndEvent,
    LLMChatEndEvent,
)
from llama_index.core.instrumentation.events.embedding import EmbeddingEndEvent


class ModelEventHandler(BaseEventHandler):
    @classmethod
    def class_name(cls) -> str:
        """Class name."""
        return "ModelEventHandler"

    def handle(self, event) -> None:
        """Logic for handling an event."""
        if isinstance(event, LLMCompletionEndEvent):
            # Completion-style LLM call finished
            print(f"LLM Prompt length: {len(event.prompt)}")
            print(f"LLM Completion: {str(event.response.text)}")
        elif isinstance(event, LLMChatEndEvent):
            # Chat-style LLM call finished
            messages_str = "\n".join([str(x) for x in event.messages])
            print(f"LLM Input Messages length: {len(messages_str)}")
            print(f"LLM Response: {str(event.response.message)}")
        elif isinstance(event, EmbeddingEndEvent):
            # Embedding call finished
            print(f"Embedding {len(event.chunks)} text chunks")

## Attach the Event Handler


In [None]:
from llama_index.core.instrumentation import get_dispatcher

# root dispatcher
root_dispatcher = get_dispatcher()

# register event handler
root_dispatcher.add_event_handler(ModelEventHandler())
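
To make the registration step concrete, here is a minimal plain-Python sketch of the dispatcher/handler pattern. It is an illustration only, not LlamaIndex's implementation: `ToyEvent` and `ToyDispatcher` are hypothetical names invented for this example, and the real dispatcher does considerably more.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToyEvent:
    """Hypothetical stand-in for an instrumentation event."""

    name: str
    payload: str


class ToyDispatcher:
    """Hypothetical dispatcher that forwards each event to every registered handler."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[ToyEvent], None]] = []

    def add_event_handler(self, handler: Callable[[ToyEvent], None]) -> None:
        # Handlers are called in registration order.
        self._handlers.append(handler)

    def event(self, event: ToyEvent) -> None:
        for handler in self._handlers:
            handler(event)


# Record a short summary of each event instead of printing it.
seen: List[str] = []
dispatcher = ToyDispatcher()
dispatcher.add_event_handler(lambda e: seen.append(f"{e.name}: {len(e.payload)} chars"))
dispatcher.event(ToyEvent(name="chat_end", payload="Hello world!"))
print(seen)
```

The point of the pattern is that the code emitting events does not need to know which handlers exist; handlers only need to be registered once.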

## Invoke the Handler!


In [None]:
from llama_index.core import Document, VectorStoreIndex

index = VectorStoreIndex.from_documents([Document.example()])

Embedding 1 text chunks


In [None]:
query_engine = index.as_query_engine()
response = query_engine.query("Tell me about LLMs?")

Embedding 1 text chunks
LLM Input Messages length: 1879
LLM Response: assistant: LlamaIndex is a "data framework" designed to assist in building LLM apps. It offers tools such as data connectors for various data sources, ways to structure data for easy use with LLMs, an advanced retrieval/query interface, and integrations with different application frameworks. It caters to both beginner and advanced users, providing a high-level API for simple data ingestion and querying, as well as lower-level APIs for customization and extension of modules to suit specific requirements.


In [None]:
query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("Repeat only these two words: Hello world!")
for r in response.response_gen:
    ...

Embedding 1 text chunks
LLM Input Messages length: 1890
LLM Response: assistant: 
LLM Input Messages length: 1890
LLM Response: assistant: Hello
LLM Input Messages length: 1890
LLM Response: assistant: Hello world
LLM Input Messages length: 1890
LLM Response: assistant: Hello world!
LLM Input Messages length: 1890
LLM Response: assistant: Hello world!
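
In the streaming run above, the handler prints one `LLM Response` line per event, and each line appears to carry the text accumulated so far rather than only the newest piece. If the newly added text per event is what matters, one option is to diff consecutive snapshots. The sketch below works on plain strings copied from that output; `snapshot_deltas` is a hypothetical helper, not part of LlamaIndex.

```python
from typing import Iterable, List


def snapshot_deltas(snapshots: Iterable[str]) -> List[str]:
    """Return the text added by each snapshot relative to the previous one."""
    previous = ""
    deltas: List[str] = []
    for text in snapshots:
        if text.startswith(previous):
            # Cumulative snapshot: keep only the new suffix.
            deltas.append(text[len(previous):])
        else:
            # Not an extension of the previous snapshot: keep it whole.
            deltas.append(text)
        previous = text
    return deltas


# Response texts copied from the streaming output above.
streamed = ["", "Hello", "Hello world", "Hello world!", "Hello world!"]
deltas = snapshot_deltas(streamed)
print(deltas)
```

Joining the deltas reproduces the final response text, so nothing is lost by storing only the increments.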
