# Middleware

## 1. Budget Control

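Using the runtime and the context, we can change the model configuration dynamically. This flexibility gives us finer control over the model.

A practical use case is budget control. As a conversation goes on, the accumulated history grows, and so does the cost of each request. To stay within budget, we can switch to a lower-rate model once the conversation passes a certain number of turns. This can be implemented with middleware.

In fact, middleware can do much more than this; think of it as a general-purpose control interface for the agent. By introducing middleware, `LangChain v1.0` greatly increases the control we have over an agent.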
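To make the idea of a control interface concrete, here is a minimal, framework-free sketch of the wrap-style pattern: a middleware function receives the request and a handler, may adjust the request, and then delegates to the handler. `Request`, `fake_model_call`, and `use_cheap_model` are hypothetical names for illustration only; they are not LangChain's API.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Request:
    """Hypothetical stand-in for a model request (not a LangChain type)."""
    model: str
    messages: tuple[str, ...]


def fake_model_call(request: Request) -> str:
    """Stand-in for the real model call; reports which model was used."""
    return f"{request.model} answered {len(request.messages)} message(s)"


def use_cheap_model(request: Request, handler: Callable[[Request], str]) -> str:
    """Middleware: rewrite the model on the request, then delegate to the handler."""
    return handler(replace(request, model="cheap-model"))


result = use_cheap_model(
    Request(model="default-model", messages=("hi",)),
    fake_model_call,
)
print(result)
```

The handler never needs to know that the request was changed, which is what lets middleware be added or removed without touching the agent itself.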

In [1]:
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain.agents import create_agent
from langchain.agents.middleware import wrap_model_call, ModelRequest, ModelResponse
from langchain_core.messages import HumanMessage
from langgraph.graph import MessagesState

# Load model configuration from .env
_ = load_dotenv()

# Lower-rate model
basic_model = ChatOpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url=os.getenv("DASHSCOPE_BASE_URL"),
    model="qwen3-coder-plus",
)

# Higher-rate model
advanced_model = ChatOpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url=os.getenv("DASHSCOPE_BASE_URL"),
    model="qwen3-max",
)

Specifically, middleware can be created with the [@wrap_model_call](https://reference.langchain.com/python/langchain/middleware/#langchain.agents.middleware.wrap_model_call) decorator.

In [2]:
@wrap_model_call
def dynamic_model_selection(request: ModelRequest, handler) -> ModelResponse:
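    """Choose the model based on conversation length."""
    message_count = len(request.state["messages"])

    # NOTE: as written, this demo switches to the higher-rate model once the
    # history exceeds 5 messages. For budget control as described above
    # (lower-rate model for long conversations), swap the two branches.
    if message_count > 5:
        model = advanced_model
    else:
        model = basic_model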

    request.model = model
    print(f"message_count: {message_count}")
    print(f"model_name: {model.model_name}")

    return handler(request)

agent = create_agent(
    model=basic_model,  # Default model
    middleware=[dynamic_model_selection]
)

In [3]:
state: MessagesState = {"messages": []}
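items = ['汽车', '飞机', '摩托车', '自行车']  # car, airplane, motorcycle, bicycle
for idx, i in enumerate(items):
    print(f"\n=== Round {idx+1} ===")
    # Prompt: "How many wheels does a {i} have? Please answer briefly."
    state["messages"] += [HumanMessage(content=f"{i}有几个轮子，请简单回答")]
    result = agent.invoke(state)
    state["messages"] = result["messages"]
    # Single quotes inside the f-string keep this valid on Python < 3.12
    print(f"content: {result['messages'][-1].content}")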


=== Round 1 ===
message_count: 1
model_name: qwen3-coder-plus
content: 汽车有4个轮子。

=== Round 2 ===
message_count: 3
model_name: qwen3-coder-plus
content: 飞机有3个轮子（起落架）。

=== Round 3 ===
message_count: 5
model_name: qwen3-coder-plus
content: 摩托车有2个轮子。

=== Round 4 ===
message_count: 7
model_name: qwen3-max
content: 自行车有2个轮子。
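
The prose at the start of this section describes switching to the lower-rate model once the conversation gets long. Below is a minimal sketch of that selection rule, kept free of any LangChain dependency so it can be run on its own. `pick_model_name` and its `threshold` parameter are hypothetical, and the model names are simply the two used above.

```python
def pick_model_name(message_count: int, threshold: int = 5) -> str:
    """Return the lower-rate model once the history exceeds `threshold` messages."""
    if message_count > threshold:
        return "qwen3-coder-plus"  # lower-rate model for long conversations
    return "qwen3-max"  # higher-rate model while the history is short


# Message counts seen by the middleware in the four rounds above.
for count in (1, 3, 5, 7):
    print(count, pick_model_name(count))
```

Inside a middleware, the same comparison would decide which `ChatOpenAI` instance is placed on the request before calling the handler.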


## 2. Trimming Messages

In [4]:
from langchain.messages import RemoveMessage
from langgraph.graph.message import REMOVE_ALL_MESSAGES
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent, AgentState
from langchain.agents.middleware import before_model
from langgraph.runtime import Runtime
from langchain_core.runnables import RunnableConfig
from typing import Any

In the example below, the first message is always kept, so the agent still remembers that my name is Bob.

In [5]:
@before_model
def trim_messages(state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
    """Keep only the last few messages to fit context window."""
    messages = state["messages"]

    if len(messages) <= 3:
        return None  # No changes needed

    first_msg = messages[0]
    recent_messages = messages[-3:] if len(messages) % 2 == 0 else messages[-4:]
    new_messages = [first_msg] + recent_messages

    return {
        "messages": [
            RemoveMessage(id=REMOVE_ALL_MESSAGES),
            *new_messages
        ]
    }

agent = create_agent(
    basic_model,
    middleware=[trim_messages],
    checkpointer=InMemorySaver(),
)

config: RunnableConfig = {"configurable": {"thread_id": "1"}}

def agent_invoke(agent):
    agent.invoke({"messages": "hi, my name is bob"}, config)
    agent.invoke({"messages": "write a short poem about cats"}, config)
    agent.invoke({"messages": "now do the same but for dogs"}, config)
    final_response = agent.invoke({"messages": "what's my name?"}, config)
    
    final_response["messages"][-1].pretty_print()

agent_invoke(agent)


Your name is Bob! You introduced yourself to me earlier.
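
To see which messages survive the trim, the slicing rule can be replayed on a plain list of strings. This is a sketch only: `keep_first_and_recent` is a hypothetical helper that mirrors the slicing in `trim_messages`, and the history assumes each turn adds exactly one human and one AI message (no tool calls).

```python
def keep_first_and_recent(messages: list[str]) -> list[str]:
    """Mirror the slicing in trim_messages on a plain list."""
    if len(messages) <= 3:
        return messages  # short history: nothing to trim
    # Same parity rule as above: last 3 for an even count, last 4 for an odd count.
    recent = messages[-3:] if len(messages) % 2 == 0 else messages[-4:]
    return [messages[0]] + recent


# History before the model call on the fourth turn: 4 human + 3 AI messages.
history = [
    "H1: my name is bob", "A1",
    "H2: cats", "A2",
    "H3: dogs", "A3",
    "H4: my name?",
]
kept = keep_first_and_recent(history)
print(kept)
```

The first message, which carries the name, is still in the trimmed list, while the oldest middle messages are the ones dropped.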


Next we modify the middleware to keep only the last two messages. Now the agent no longer remembers that my name is Bob.

In [6]:
@before_model
def trim_without_first_message(state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
    """Keep only the last two messages, dropping the first one."""
    messages = state["messages"]

    return {
        "messages": [
            RemoveMessage(id=REMOVE_ALL_MESSAGES),
            *messages[-2:]
        ]
    }

agent = create_agent(
    basic_model,
    middleware=[trim_without_first_message],
    checkpointer=InMemorySaver(),
)

agent_invoke(agent)


I don't have access to your name or personal information. I don't know who you are - I can only see the text of our conversation. If you'd like to share your name, I'd be happy to use it, but I can't identify you from just our chat history.
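
The loss of the name can be traced with a small turn-by-turn simulation on plain strings. This is a sketch under stated assumptions: the trimmed list replaces the stored history on every turn (as `RemoveMessage(id=REMOVE_ALL_MESSAGES)` does above), each turn produces exactly one AI reply, and `keep_last_two` is a hypothetical helper.

```python
def keep_last_two(messages: list[str]) -> list[str]:
    """Mirror the slicing in trim_without_first_message."""
    return messages[-2:]


state: list[str] = []
seen: list[str] = []
turns = ["H1: my name is bob", "H2: cats", "H3: dogs", "H4: my name?"]

for n, human in enumerate(turns, start=1):
    # The trim runs before the model call and replaces the stored history.
    state = keep_last_two(state + [human])
    seen = list(state)  # what the model sees on this turn
    state = state + [f"A{n}"]  # placeholder AI reply appended after the call

print(seen)
print(any("bob" in message for message in seen))
```

By the fourth turn the model sees only the previous reply and the new question, so the introduction from the first turn is no longer available to it.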
