# Model Clients

AutoGen provides the {py:mod}`autogen_core.components.models` module, which contains a suite of built-in model clients for working with the ChatCompletion API. All model clients implement the {py:class}`~autogen_core.components.models.ChatCompletionClient` protocol class.
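Because every client implements a shared protocol, calling code can be written against the protocol type rather than a concrete client. The sketch below illustrates that idea with Python's `typing.Protocol`. `SupportsCreate`, `EchoClient`, and `ask` are hypothetical, heavily simplified stand-ins and do not reproduce the real `ChatCompletionClient` signature.

```python
import asyncio
from typing import List, Protocol


class SupportsCreate(Protocol):
    """Hypothetical, simplified protocol: anything with an async `create` method."""

    async def create(self, messages: List[str]) -> str: ...


class EchoClient:
    """Toy client; it satisfies SupportsCreate without inheriting from it."""

    async def create(self, messages: List[str]) -> str:
        # Echo the most recent message instead of calling a model.
        return f"echo: {messages[-1]}"


async def ask(client: SupportsCreate, question: str) -> str:
    # The caller depends only on the protocol, not on a concrete client class.
    return await client.create([question])


# In a script; in a notebook cell, use `await ask(...)` instead.
answer = asyncio.run(ask(EchoClient(), "hello"))
print(answer)
```

Swapping `EchoClient` for another class with the same `create` method would not require any change to `ask`.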

## Built-in Model Clients

There are currently two built-in model clients: {py:class}`~autogen_ext.models.OpenAIChatCompletionClient` and {py:class}`~autogen_ext.models.AzureOpenAIChatCompletionClient`. Both clients are asynchronous.

To use {py:class}`~autogen_ext.models.OpenAIChatCompletionClient`, you need to provide an API key, either through the `OPENAI_API_KEY` environment variable or through the `api_key` argument.

In [2]:
from autogen_core.components.models import UserMessage
from autogen_ext.models import OpenAIChatCompletionClient

# Create an OpenAI model client.
model_client = OpenAIChatCompletionClient(
    model="gpt-4o",
    api_key="sk-", # Optional if you have an API key set in the environment.
)

You can call the {py:meth}`~autogen_ext.models.OpenAIChatCompletionClient.create` method to create a chat completion request and await a {py:class}`~autogen_core.components.models.CreateResult` object in return.

In [3]:
# Send a message list to the model and await the response.
messages = [
    UserMessage(content="What is the capital of France?", source="user"),
]
response = await model_client.create(messages=messages)

# Print the response
print(response.content)

The capital of France is Paris.


In [4]:
# Print the response token usage
print(response.usage)

RequestUsage(prompt_tokens=15, completion_tokens=7)


### Streaming Response

You can use the {py:meth}`~autogen_ext.models.OpenAIChatCompletionClient.create_stream` method to create a chat completion request with a streaming response.

In [5]:
messages = [
    UserMessage(content="Write a very short story about a dragon.", source="user"),
]

# Create a stream.
stream = model_client.create_stream(messages=messages)

# Iterate over the stream and print the responses.
print("Streamed responses:")
async for response in stream:  # type: ignore
    if isinstance(response, str):
        # A partial response is a string.
        print(response, flush=True, end="")
    else:
        # The last response is a CreateResult object with the complete message.
        print("\n\n------------\n")
        print("The complete response:", flush=True)
        print(response.content, flush=True)
        print("\n\n------------\n")
        print("The token usage was:", flush=True)
        print(response.usage, flush=True)

Streamed responses:
In a hidden valley surrounded by misty mountains, there lived a wise and gentle dragon named Zephyr. Unlike other dragons, Zephyr had shimmering emerald scales and eyes that sparkled like starlit skies. He spent his days guarding a secret garden brimming with vibrant flowers and ancient trees. 

Villagers from nearby towns whispered tales of Zephyr's garden, believing it to be enchanted. But only a pure heart could find the path through the dense woods. One day, a lost child named Amara stumbled upon the narrow trail leading to the garden. 

Zephyr, sensing her innocence, emerged gracefully, his wings casting a protective shadow. Instead of fear, Amara felt a warmth that echoed the kindness she saw in the dragon's eyes. Together, they played among the wildflowers, with Zephyr teaching her the language of the wind and secrets of the stars.

As dusk fell, Zephyr led Amara back to her village, invisible to all eyes but hers. Grateful and eager to share her tale, Amara 

```{note}
The last response in the streaming response is always the final response, of type {py:class}`~autogen_core.components.models.CreateResult`.
```
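The "string chunks first, one final result last" contract described in the note can be wrapped in a small helper. The following is a minimal sketch under stated assumptions: `FakeResult` and `fake_stream` are hypothetical stand-ins for `CreateResult` and `create_stream` so that the example runs without an API key, and `drain` is an illustrative helper, not part of AutoGen.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, Tuple, Union


@dataclass
class FakeResult:
    """Hypothetical stand-in for CreateResult; only `content` is modeled."""

    content: str


async def fake_stream() -> AsyncIterator[Union[str, FakeResult]]:
    """Hypothetical stand-in for `create_stream`: string chunks, then one final result."""
    chunks = ["Once ", "upon ", "a time."]
    for chunk in chunks:
        yield chunk
    yield FakeResult(content="".join(chunks))


async def drain(
    stream: AsyncIterator[Union[str, FakeResult]],
) -> Tuple[List[str], FakeResult]:
    """Collect the partial string chunks and return them with the final result."""
    chunks: List[str] = []
    final = None
    async for item in stream:
        if isinstance(item, str):
            chunks.append(item)  # a partial response
        else:
            final = item  # the final, complete response
    if final is None:
        raise RuntimeError("stream ended without a final result")
    return chunks, final


# In a script; in a notebook cell, use `await drain(...)` instead.
chunks, final = asyncio.run(drain(fake_stream()))
print(chunks)
print(final.content)
```

With a real client, the same helper shape would be applied to the stream returned by `model_client.create_stream(messages=messages)`, assuming the final item exposes `content` as shown in the example above.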

**Note: by default, the usage returned for a streaming response contains zero values.**

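### A Note on Token Usage Counts in the Streaming Example
Comparing the usage returned by the non-streaming `model_client.create(messages=messages)` call above with the streaming `model_client.create_stream(messages=messages)` call shows a difference.
The non-streaming response returns valid prompt and completion token usage counts by default.
The streaming response returns zero values by default.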

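As documented in the OpenAI API reference, the additional parameter `stream_options` can be specified to return valid usage counts. See [stream_options](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options).

Set this parameter only when streaming, that is, when using `create_stream`.

To enable it in `create_stream`, set `extra_create_args={"stream_options": {"include_usage": True}}`.

- **Note: although other APIs, such as LiteLLM, also support this option, full or correct support is not always guaranteed.**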

#### Streaming Example with Token Usage

In [7]:
messages = [
    UserMessage(content="Write a very short story about a dragon.", source="user"),
]

# Create a stream.
stream = model_client.create_stream(messages=messages, extra_create_args={"stream_options": {"include_usage": True}})

# Iterate over the stream and print the responses.
print("Streamed responses:")
async for response in stream:  # type: ignore
    if isinstance(response, str):
        # A partial response is a string.
        print(response, flush=True, end="")
    else:
        # The last response is a CreateResult object with the complete message.
        print("\n\n------------\n")
        print("The complete response:", flush=True)
        print(response.content, flush=True)
        print("\n\n------------\n")
        print("The token usage was:", flush=True)
        print(response.usage, flush=True)

Streamed responses:
In a lush, emerald valley hidden by towering peaks, there lived a dragon named Ember. Unlike others of her kind, Ember cherished solitude over treasure, and the songs of the stream over the roar of flames. One misty dawn, a young shepherd stumbled into her sanctuary, lost and frightened. 

Instead of fury, he was met with kindness as Ember extended a wing, guiding him back to safety. In gratitude, the shepherd visited yearly, bringing tales of his world beyond the mountains. Over time, a friendship blossomed, binding man and dragon in shared stories and laughter.

As the years passed, the legend of Ember the gentle-hearted spread far and wide, forever changing the way dragons were seen in the hearts of many.

------------

The complete response:
In a lush, emerald valley hidden by towering peaks, there lived a dragon named Ember. Unlike others of her kind, Ember cherished solitude over treasure, and the songs of the stream over the roar of flames. One misty dawn, a 

### Azure OpenAI

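To use {py:class}`~autogen_ext.models.AzureOpenAIChatCompletionClient`, you need to provide the deployment ID, the Azure Cognitive Services endpoint, the API version, and the model capabilities. For authentication, you can provide either an API key or an Azure Active Directory (AAD) token credential. To use AAD authentication, you first need to install the `azure-identity` package.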

In [1]:
# pip install azure-identity

The following code snippet shows how to use AAD authentication. The identity used must be assigned the [**Cognitive Services OpenAI User**](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-user) role.

In [15]:
from autogen_ext.models import AzureOpenAIChatCompletionClient
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

# Create the token provider
token_provider = get_bearer_token_provider(DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default")

az_model_client = AzureOpenAIChatCompletionClient(
    model="{your-azure-deployment}",
    api_version="2024-06-01",
    azure_endpoint="https://{your-custom-endpoint}.openai.azure.com/",
    azure_ad_token_provider=token_provider,  # Optional if you choose key-based authentication.
    # api_key="sk-...", # For key-based authentication.
    model_capabilities={
        "vision": True,
        "function_calling": True,
        "json_output": True,
    },
)

```{note}
See [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/managed-identity#chat-completions) for how to use the Azure client directly, or for more information.
```
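Since the client accepts either key-based or AAD authentication, it can help to choose the method in one place. The helper below is a hypothetical sketch, not part of AutoGen: `azure_auth_kwargs` only builds a dictionary using the two keyword names shown in the example above (`api_key` and `azure_ad_token_provider`). Requiring exactly one method is a design choice of this sketch, not a documented requirement of the library.

```python
from typing import Any, Callable, Dict, Optional


def azure_auth_kwargs(
    api_key: Optional[str] = None,
    token_provider: Optional[Callable[[], str]] = None,
) -> Dict[str, Any]:
    """Return exactly one of the two authentication keyword arguments."""
    # Reject "both" and "neither" so the chosen method is never ambiguous.
    if (api_key is None) == (token_provider is None):
        raise ValueError("provide either api_key or token_provider, not both or neither")
    if api_key is not None:
        return {"api_key": api_key}
    return {"azure_ad_token_provider": token_provider}


# Key-based authentication (placeholder value, not a real key).
print(azure_auth_kwargs(api_key="placeholder-key"))
```

The returned dictionary could then be unpacked into the constructor call, for example `AzureOpenAIChatCompletionClient(..., **azure_auth_kwargs(token_provider=token_provider))`.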

## Build an Agent Using a Model Client

Let's create a simple AI agent that can respond to messages using the ChatCompletion API.

In [4]:
from dataclasses import dataclass

from autogen_core.application import SingleThreadedAgentRuntime
from autogen_core.base import MessageContext
from autogen_core.components import RoutedAgent, message_handler
from autogen_core.components.models import ChatCompletionClient, SystemMessage, UserMessage
from autogen_ext.models import OpenAIChatCompletionClient


@dataclass
class Message:
    content: str


class SimpleAgent(RoutedAgent):
    def __init__(self, model_client: ChatCompletionClient) -> None:
        super().__init__("A simple agent")
        self._system_messages = [SystemMessage("You are a helpful AI assistant.")]
        self._model_client = model_client

    @message_handler
    async def handle_user_message(self, message: Message, ctx: MessageContext) -> Message:
        # Prepare input to the chat completion model.
        user_message = UserMessage(content=message.content, source="user")
        response = await self._model_client.create(
            self._system_messages + [user_message], cancellation_token=ctx.cancellation_token
        )
        # Return with the model's response.
        assert isinstance(response.content, str)
        return Message(content=response.content)

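The `SimpleAgent` class is a subclass of {py:class}`autogen_core.components.RoutedAgent`, which conveniently routes messages to the appropriate handlers automatically. It has a single handler, `handle_user_message`, which handles messages from the user. The handler uses the `ChatCompletionClient` to generate a response to the message, then returns that response to the user, following the direct communication model.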

```{note}
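The `cancellation_token`, of type {py:class}`autogen_core.base.CancellationToken`, is used to cancel asynchronous operations. It is linked to the async calls inside the message handler, and the caller can use it to cancel the handler.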
```
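The effect of cancelling an in-flight asynchronous call can be illustrated with plain `asyncio`. This is an analogy only: it uses `asyncio.Task.cancel` rather than AutoGen's `CancellationToken`, and `slow_model_call` is a hypothetical stand-in for an awaited model request.

```python
import asyncio


async def slow_model_call() -> str:
    """Hypothetical stand-in for an awaited model request."""
    await asyncio.sleep(10)
    return "done"


async def main() -> str:
    task = asyncio.create_task(slow_model_call())
    await asyncio.sleep(0.01)  # give the call a moment to start
    task.cancel()  # the caller decides it no longer needs the result
    try:
        return await task
    except asyncio.CancelledError:
        # The pending call was interrupted instead of running to completion.
        return "cancelled"


# In a script; in a notebook cell, use `await main()` instead.
outcome = asyncio.run(main())
print(outcome)  # prints "cancelled"
```

Passing `ctx.cancellation_token` into `create`, as the handler above does, serves the same purpose: the pending request is tied to something the caller can cancel.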

In [6]:
# Create the runtime and register the agent.
from autogen_core.base import AgentId

runtime = SingleThreadedAgentRuntime()
await SimpleAgent.register(
    runtime,
    "simple_agent",
    lambda: SimpleAgent(
        OpenAIChatCompletionClient(
            model="gpt-4o-mini",
            # api_key="sk-...", # Optional if you have an OPENAI_API_KEY set in the environment.
        )
    ),
)
# Start the runtime processing messages.
runtime.start()
# Send a message to the agent and get the response.
message = Message("Hello, what are some fun things to do in Seattle?")
response = await runtime.send_message(message, AgentId("simple_agent", "default"))
print(response.content)
# Stop the runtime processing messages.
await runtime.stop()

Seattle is a vibrant city with a wide range of activities and attractions. Here are some fun things to do in Seattle:

1. **Space Needle**: Visit this iconic observation tower for stunning views of the city and surrounding mountains.

2. **Pike Place Market**: Explore this historic market where you can see the famous fish toss, buy local produce, and find unique crafts and eateries.

3. **Museum of Pop Culture (MoPOP)**: Dive into the world of contemporary culture, music, and science fiction at this interactive museum.

4. **Chihuly Garden and Glass**: Marvel at the beautiful glass art installations by artist Dale Chihuly, located right next to the Space Needle.

5. **Seattle Aquarium**: Discover the diverse marine life of the Pacific Northwest at this engaging aquarium.

6. **Seattle Art Museum**: Explore a vast collection of art from around the world, including contemporary and indigenous art.

7. **Kerry Park**: For one of the best views of the Seattle skyline, head to this small pa

## Manage Model Context

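The `SimpleAgent` above always responds with a fresh context that contains only the system message and the latest user message. We can use model context classes from {py:mod}`autogen_core.components.model_context` to make the agent "remember" previous conversations. A model context supports storing and retrieving chat completion messages. It is always used together with a model client to generate LLM-based responses.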

For example, {py:class}`~autogen_core.components.model_context.BufferedChatCompletionContext` is a most-recently-used (MRU) context that stores the most recent `buffer_size` messages. This is useful for avoiding context overflow in many LLMs.
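The "keep only the most recent `buffer_size` messages" behavior can be sketched with a bounded `deque`. `LastNBuffer` is a hypothetical illustration of the behavior described above, not the library's implementation. It uses plain strings and synchronous methods for brevity, whereas the real context stores message objects and exposes async methods.

```python
from collections import deque
from typing import Deque, List


class LastNBuffer:
    """Hypothetical stand-in: keeps only the most recent `buffer_size` messages."""

    def __init__(self, buffer_size: int) -> None:
        # A deque with maxlen silently drops the oldest item when it is full.
        self._messages: Deque[str] = deque(maxlen=buffer_size)

    def add_message(self, message: str) -> None:
        self._messages.append(message)

    def get_messages(self) -> List[str]:
        # Oldest retained message first, newest last.
        return list(self._messages)


buffer = LastNBuffer(buffer_size=3)
for text in ["m1", "m2", "m3", "m4", "m5"]:
    buffer.add_message(text)
print(buffer.get_messages())  # prints ['m3', 'm4', 'm5']
```

With a buffer of this kind, older turns eventually fall out of the context, so the agent can only recall what is still inside the window.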

Let's update the previous example to use {py:class}`~autogen_core.components.model_context.BufferedChatCompletionContext`.

In [9]:
from autogen_core.components.model_context import BufferedChatCompletionContext
from autogen_core.components.models import AssistantMessage


class SimpleAgentWithContext(RoutedAgent):
    def __init__(self, model_client: ChatCompletionClient) -> None:
        super().__init__("A simple agent")
        self._system_messages = [SystemMessage("You are a helpful AI assistant.")]
        self._model_client = model_client
        self._model_context = BufferedChatCompletionContext(buffer_size=5)

    @message_handler
    async def handle_user_message(self, message: Message, ctx: MessageContext) -> Message:
        # Prepare input to the chat completion model.
        user_message = UserMessage(content=message.content, source="user")
        # Add message to model context.
        await self._model_context.add_message(user_message)
        # Generate a response.
        response = await self._model_client.create(
            self._system_messages + (await self._model_context.get_messages()),
            cancellation_token=ctx.cancellation_token,
        )
        # Return with the model's response.
        assert isinstance(response.content, str)
        # Add message to model context.
        await self._model_context.add_message(AssistantMessage(content=response.content, source=self.metadata["type"]))
        return Message(content=response.content)

Now let's try asking a follow-up question after the first one.

In [10]:
runtime = SingleThreadedAgentRuntime()
await SimpleAgentWithContext.register(
    runtime,
    "simple_agent_context",
    lambda: SimpleAgentWithContext(
        OpenAIChatCompletionClient(
            model="gpt-4o-mini",
            # api_key="sk-...", # Optional if you have an OPENAI_API_KEY set in the environment.
        )
    ),
)
# Start the runtime processing messages.
runtime.start()
agent_id = AgentId("simple_agent_context", "default")

# First question.
message = Message("Hello, what are some fun things to do in Seattle?")
print(f"Question: {message.content}")
response = await runtime.send_message(message, agent_id)
print(f"Response: {response.content}")
print("-----")

# Second question.
message = Message("What was the first thing you mentioned?")
print(f"Question: {message.content}")
response = await runtime.send_message(message, agent_id)
print(f"Response: {response.content}")

# Stop the runtime processing messages.
await runtime.stop()

Question: Hello, what are some fun things to do in Seattle?
Response: Seattle offers a wide variety of fun activities and attractions for visitors. Here are some highlights:

1. **Pike Place Market**: Explore this iconic market, where you can find fresh produce, unique crafts, and the famous fish-throwing vendors. Don’t forget to visit the original Starbucks!

2. **Space Needle**: Enjoy breathtaking views of the city and Mount Rainier from the observation deck of this iconic structure. You can also dine at the SkyCity restaurant.

3. **Chihuly Garden and Glass**: Admire the stunning glass art installations created by artist Dale Chihuly. The garden and exhibit are particularly beautiful, especially in good weather.

4. **Museum of Pop Culture (MoPOP)**: Dive into the world of music, science fiction, and pop culture through interactive exhibits and memorabilia.

5. **Seattle Aquarium**: Located on the waterfront, the aquarium features a variety of marine life native to the Pacific North

From the second response, you can see that the agent can now recall its own previous responses.