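# 04 Large Language Models
Large language models (LLMs) are a core component of LangChain. LangChain does not serve its own LLMs; instead it provides a standard interface for interacting with many different LLMs.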
https://python.langchain.com/docs/modules/model_io/models/llms/


In [2]:
# Set the proxy
import os
os.environ['http_proxy'] = 'http://127.0.0.1:10809'
os.environ['https_proxy'] = 'http://127.0.0.1:10809'

In [5]:
from langchain.llms import OpenAI
llm = OpenAI()

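The simplest way to use an LLM is as a callable: pass in a string and get a string completion back.
```shell
__call__: string in -> string out
```
This is the method that runs under the hood when you call the object directly.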

In [8]:
# The LLM object can be called directly
llm("给我讲一个笑话")  # prompt: "Tell me a joke"

'\n\n两个人在一起聊天，一个问另一个："你最近怎么样？"\n\n另一个回答："我想像一根螺丝钉一样，拧得越来越紧！"'

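generate: batch calls with richer output

generate lets you call the model with a list of strings and returns a more complete response than just the text.
This complete response can include multiple top candidates as well as other provider-specific information: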

In [9]:
# Prompts: "Tell me a joke" and "Recite a poem for me", repeated 15 times (30 prompts in total)
llm_result = llm.generate(["给我讲个笑话", "给我讲个诗词"]*15)

In [10]:
len(llm_result.generations)

30

In [11]:
llm_result.generations[0]

[Generation(text='\n\n笑话：\n\n一个男孩走进一家商店，问老板：“你有什么新鲜的东西卖吗？”\n\n老板回答：“有，我有一只新鲜的鸡！”\n\n男孩说：“太好了，我要买。你知道怎么把它变成鸭子吗？”\n\n老板说：“不知道，为什么？”\n\n男孩说：“因为我想给它一个惊喜！”', generation_info={'finish_reason': 'stop', 'logprobs': None})]

In [12]:
llm_result.generations[1]

[Generation(text='\n\n《春晓》\n\n春眠不觉晓，\n处处闻啼鸟。\n夜来风雨声，\n花落知多少。', generation_info={'finish_reason': 'stop', 'logprobs': None})]

In [13]:
llm_result.generations[2]

[Generation(text='\n\n一个猴子去拜访朋友，准备给他带礼物，但是他不知道该带什么，于是他就想：“现在是冬天，我可以带一件厚厚的毛衣给他！”于是他就带了毛衣去拜访朋友，朋友很开心，问猴子：“你怎么知道我需要一件毛衣呢？”猴子说：“因为这是猴子送人的冬衣！”', generation_info={'finish_reason': 'stop', 'logprobs': None})]

In [14]:
llm_result.generations[3]

[Generation(text='\n\n春晓\n\n孟浩然\n\n春眠不觉晓，\n\n处处闻啼鸟。\n\n夜来风雨声，\n\n花落知多少。', generation_info={'finish_reason': 'stop', 'logprobs': None})]

In [15]:
# You can also access the provider-specific information that is returned. This information is not standardized across providers.
llm_result.llm_output

{'token_usage': {'total_tokens': 4236,
  'prompt_tokens': 435,
  'completion_tokens': 3801},
 'model_name': 'text-davinci-003'}
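
The nesting of the result is easy to misread: `generations` holds one inner list per prompt, and each inner list holds the candidates for that prompt. The following is a minimal sketch of that shape using plain dataclasses. `FakeGeneration`, `FakeResult`, and `fake_generate` are hypothetical stand-ins written for this illustration, not LangChain classes, and no model is called.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeGeneration:
    # Stand-in for one candidate completion (hypothetical, not a LangChain class).
    text: str


@dataclass
class FakeResult:
    # generations[i] holds the candidates for prompt i; llm_output is provider-specific.
    generations: List[List[FakeGeneration]]
    llm_output: Dict[str, object] = field(default_factory=dict)


def fake_generate(prompts: List[str]) -> FakeResult:
    # Echo each prompt back as a single candidate so the nesting is easy to see.
    gens = [[FakeGeneration(text=f"reply to: {p}")] for p in prompts]
    return FakeResult(generations=gens, llm_output={"prompt_count": len(prompts)})


result = fake_generate(["joke", "poem"] * 15)
print(len(result.generations))        # one inner list per prompt
print(result.generations[1][0].text)  # first candidate for the second prompt
```

Indexing twice, first by prompt and then by candidate, is the same pattern used later in this chapter as `resp.generations[0][0].text`.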

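## 4.1 Async Interface
LangChain provides async support for LLMs by leveraging the asyncio library.
Async support is particularly useful for calling multiple LLMs concurrently, since these calls are network-bound.
Currently OpenAI, PromptLayerOpenAI, ChatOpenAI, Anthropic, and Cohere are supported; async support for other LLMs is on the roadmap.
You can use the agenerate method to call an OpenAI LLM asynchronously.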
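
Why concurrency helps with network-bound calls can be sketched without any API access. In the sketch below, `fake_llm_call` is a hypothetical stand-in that only sleeps to imitate network latency; the 0.05 second delay and the helper names are assumptions made for illustration, not part of LangChain.

```python
import asyncio
import time


async def fake_llm_call(prompt: str, delay: float = 0.05) -> str:
    # Simulate network latency with a sleep; no real API is contacted.
    await asyncio.sleep(delay)
    return f"echo: {prompt}"


async def run_serially(n: int) -> list:
    # Await each call before starting the next one.
    return [await fake_llm_call("hi") for _ in range(n)]


async def run_concurrently(n: int) -> list:
    # Start all calls at once and wait for them together.
    return await asyncio.gather(*(fake_llm_call("hi") for _ in range(n)))


def timed(coro_factory, n: int) -> float:
    # Run the coroutine to completion and return the wall-clock time it took.
    start = time.perf_counter()
    asyncio.run(coro_factory(n))
    return time.perf_counter() - start


serial_s = timed(run_serially, 10)
concurrent_s = timed(run_concurrently, 10)
print(f"serial: {serial_s:.2f}s, concurrent: {concurrent_s:.2f}s")
```

The serial version pays the latency once per call, while the concurrent version overlaps the waits, so its total time stays close to that of a single call.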


In [4]:
import openai
openai.proxy = os.getenv('https_proxy')

In [5]:
# Import the required modules
import time  # for timing
import asyncio  # for asynchronous programming

from langchain.llms import OpenAI  # the OpenAI LLM wrapper

# Generate text serially (synchronously)
def generate_serially():
    llm = OpenAI(temperature=0.9)  # create the OpenAI LLM with temperature 0.9
    for _ in range(10):  # repeat 10 times
        resp = llm.generate(["Hello, how are you?"])  # blocking call to generate
        print(resp.generations[0][0].text)  # print the generated text

# Generate text for a single prompt asynchronously
async def async_generate(llm):
    resp = await llm.agenerate(["Hello, how are you?"])  # awaitable call to agenerate
    print(resp.generations[0][0].text)  # print the generated text

# Generate text concurrently (asynchronously)
async def generate_concurrently():
    llm = OpenAI(temperature=0.9)  # create the OpenAI LLM with temperature 0.9
    tasks = [async_generate(llm) for _ in range(10)]  # create 10 coroutines
    await asyncio.gather(*tasks)  # run them concurrently and wait for all to finish




In [6]:
# Record the start time
s = time.perf_counter()
# Run the generation tasks concurrently
# Outside Jupyter, use asyncio.run(generate_concurrently()) instead
await generate_concurrently()
# Compute the time taken by the concurrent run
elapsed = time.perf_counter() - s
print("\033[1m" + f"Concurrent executed in {elapsed:0.2f} seconds." + "\033[0m")




I'm doing well, thank you. How about you?

I'm doing well, thank you. How about you?


I'm doing well, thank you. How about you?


I'm doing well, how about you?


I'm doing well, thank you. How about you?


I'm doing good, thanks! How about you?


I'm doing well, thank you for asking! How about yourself?


I'm doing well, thank you. How about you?


I'm doing well, thanks for asking! How about you?


I'm doing well, thank you. How about you?
Concurrent executed in 2.14 seconds.


In [19]:
# Record the start time
s = time.perf_counter()
# Run the generation tasks serially
generate_serially()
# Compute the time taken by the serial run
elapsed = time.perf_counter() - s
print("\033[1m" + f"Serial executed in {elapsed:0.2f} seconds." + "\033[0m")


I'm doing well, thank you. How about you?


I'm doing well, thank you. How about you?


I'm doing well, thank you. How about you?


I'm doing well, thank you. How about you?


I'm doing well, thank you. How about you?


I'm doing well, thank you. How about you?


I'm good, thanks! How about you?


I'm doing well. How about you?


I'm doing great. How about you?


I'm doing well, thank you. How about you?
Serial executed in 12.38 seconds.


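## 4.2 Custom Large Language Models
This section shows how to create a custom LLM wrapper, in case you want to use your own LLM or a different wrapper from the ones LangChain supports.
A custom LLM needs to implement only one required thing:

A _call method that takes a string and some optional stop words, and returns a string.

It can implement a second, optional thing:

An _identifying_params property that helps when printing the class. It should return a dictionary.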

In [11]:
# Let's implement a very simple custom LLM that just returns the first N characters of the input.
from typing import Any, List, Mapping, Optional
from langchain.callbacks.manager import CallbackManagerForLLMRun
from langchain.llms.base import LLM

# CustomLLM1 subclasses LLM and adds one field, n.
# Two methods are decorated with @property, _llm_type and _identifying_params; both return fixed attribute values.
# _call processes the input prompt and returns its first n characters. If a stop argument is supplied, it raises an exception.


# CustomLLM1 inherits from LLM
class CustomLLM1(LLM):

    # Field holding an integer: the number of characters to return
    n: int

    # Property that identifies the type of this LLM
    @property
    def _llm_type(self) -> str:
        # Return the string "custom" as the LLM type
        return "custom"

    # _call runs the LLM on a single prompt
    def _call(
        self,
        prompt: str,  # the input prompt string
        stop: Optional[List[str]] = None,  # optional list of stop strings, None by default
        run_manager: Optional[CallbackManagerForLLMRun] = None,  # optional callback manager, None by default
    ) -> str:
        # Raise ValueError if stop is not None
        if stop is not None:
            raise ValueError("stop kwargs are not permitted.")
        # Return the first n characters of the prompt
        return prompt[: self.n]

    # Property that returns the identifying parameters
    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        """Get the identifying parameters."""
        # Return a dictionary containing the value of n
        return {"n": self.n}


In [12]:
llm = CustomLLM1(n=10)

In [13]:
llm("This is a foobar thing")

'This is a '

In [14]:
print(llm)

CustomLLM1
Params: {'n': 10}
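
The class above rejects stop sequences outright. If a custom LLM should honour them instead, one possible approach is to cut the text at the earliest stop sequence. The sketch below uses only the standard library; `truncate_at_stop` and `first_n_chars` are hypothetical helpers written for this illustration, not LangChain functions.

```python
from typing import List, Optional


def truncate_at_stop(text: str, stop: Optional[List[str]] = None) -> str:
    # Cut the text at the earliest occurrence of any stop sequence.
    if not stop:
        return text
    cut = len(text)
    for s in stop:
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


def first_n_chars(prompt: str, n: int, stop: Optional[List[str]] = None) -> str:
    # Same toy behaviour as the class above, but stop sequences are honoured.
    return truncate_at_stop(prompt[:n], stop)


print(repr(first_n_chars("This is a foobar thing", 10)))
print(repr(first_n_chars("This is a foobar thing", 10, stop=[" is"])))
```

Inside a real `_call` implementation, the same helper could be applied to the model's raw output just before returning it.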


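## 4.3 Fake Large Language Models
Sometimes you may want to use a fake LLM that simply returns pre-defined responses instead of calling a real model.
This is very useful for testing or debugging, because it lets you test your code without using a real LLM.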


In [3]:
# Import FakeListLLM, which replays a fixed list of responses instead of calling a real model
from langchain.llms.fake import FakeListLLM

# Import load_tools, initialize_agent, and AgentType from langchain.agents
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType

# Load the tool named "python_repl"
tools = load_tools(["python_repl"])

# The canned responses the fake LLM will return, in order.
# The action name must match the loaded tool's name exactly; in the run below it does not,
# so the agent reports an invalid tool before moving on to the final answer.
responses = ["Action: Python REPL\nAction Input: print(2 + 2)", "Final Answer: 4"]

# Create a FakeListLLM from the responses defined above
llm = FakeListLLM(responses=responses)

# Initialize an agent from the tools and the fake LLM, with the given agent type and verbose output
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

# Run the agent with the input "whats 2 + 2"
agent.run("whats 2 + 2")


> Entering new AgentExecutor chain...
Action: Python REPL
Action Input: print(2 + 2)
Observation: Python REPL is not a valid tool, try another one.
Thought:Final Answer: 4

> Finished chain.


'4'
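
The idea behind a list-backed fake can be sketched in a few lines of plain Python. `CannedLLM` below is a hypothetical stand-in written for this illustration, not the LangChain class; wrapping around to the first response once the list is exhausted is a choice made for this sketch.

```python
from typing import List


class CannedLLM:
    # Hypothetical stand-in: returns pre-written responses in order, ignoring the prompt.
    def __init__(self, responses: List[str]) -> None:
        self.responses = responses
        self.i = 0

    def __call__(self, prompt: str) -> str:
        response = self.responses[self.i]
        # Advance to the next response, wrapping around at the end of the list.
        self.i = (self.i + 1) % len(self.responses)
        return response


llm = CannedLLM(["Action: Python REPL\nAction Input: print(2 + 2)", "Final Answer: 4"])
first = llm("whats 2 + 2")
second = llm("anything at all")
print(second)
```

Because the replies never depend on the prompt, a test built on such a fake is deterministic and needs neither network access nor an API key.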

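Similar to the fake LLM, LangChain provides a human-input LLM class (HumanInputLLM) that can be used for testing, debugging, or educational purposes.
It lets you mock out calls to the LLM and simulate how a human would respond on receiving the prompt.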

In [4]:
! pip install wikipedia



In [5]:
# Set the proxy
import os
os.environ['http_proxy'] = 'http://127.0.0.1:10809'
os.environ['https_proxy'] = 'http://127.0.0.1:10809'

In [10]:
# Import HumanInputLLM, which lets a human type the responses in place of a model
from langchain.llms.human import HumanInputLLM

# Import load_tools, initialize_agent, and AgentType from langchain.agents
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType

# Load the tool named "wikipedia"
tools = load_tools(["wikipedia"])

# Create a HumanInputLLM; prompt_func is a function that prints the prompt shown to the human
llm = HumanInputLLM(
    prompt_func=lambda prompt: print(
        f"\n===PROMPT====\n{prompt}\n=====END OF PROMPT======"
    )
)

# Initialize an agent from the tools and the LLM, with the given agent type and verbose output
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

# Run the agent with the question "What is 'Bocchi the Rock!'?"
agent.run("What is 'Bocchi the Rock!'?")


> Entering new AgentExecutor chain...

===PROMPT====
Answer the following questions as best you can. You have access to the following tools:

Wikipedia: A wrapper around Wikipedia. Useful for when you need to answer general questions about people, places, companies, facts, historical events, or other subjects. Input should be a search query.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [Wikipedia]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: What is 'Bocchi the Rock!'?
Thought:


OutputParserException: Parsing LLM output produced both a final answer and a parse-able action:: I need to use a tool.     Action: Wikipedia     Action Input: Bocchi the Rock!, Japanese four-panel manga and anime series.     Observation: Page: Bocchi the Rock!     Summary: Bocchi the Rock! (ぼっち・ざ・ろっく!, Bocchi Za Rokku!) is a Japanese four-panel manga series written and illustrated by Aki Hamaji. It has been serialized in Houbunsha's seinen manga magazine Manga Time Kirara Max since December 2017. Its chapters have been collected in five tankōbon volumes as of November 2022.     An anime television series adaptation produced by CloverWorks aired from October to December 2022. The series has been praised for its writing, comedy, characters, and depiction of social anxiety, with the anime's visual creativity receiving acclaim.          Page: Manga Time Kirara     Summary: Manga Time Kirara (まんがタイムきらら, Manga Taimu Kirara) is a Japanese seinen manga magazine published by Houbunsha which mainly serializes four-panel manga. The magazine is sold on the ninth of each month and was first published as a special edition of Manga Time, another Houbunsha magazine, on May 17, 2002. Characters from this magazine have appeared in a crossover role-playing game called Kirara Fantasia.          Page: Manga Time Kirara Max     Summary: Manga Time Kirara Max (まんがタイムきららMAX) is a Japanese four-panel seinen manga magazine published by Houbunsha. It is the third magazine of the "Kirara" series, after "Manga Time Kirara" and "Manga Time Kirara Carat". The first issue was released on September 29, 2004. Currently the magazine is released on the 19th of each month.     Thought:
These are not relevant articles.     Action: Wikipedia     Action Input: Bocchi the Rock!, Japanese four-panel manga series written and illustrated by Aki Hamaji.     Observation: Page: Bocchi the Rock!     Summary: Bocchi the Rock! (ぼっち・ざ・ろっく!, Bocchi Za Rokku!) is a Japanese four-panel manga series written and illustrated by Aki Hamaji. It has been serialized in Houbunsha's seinen manga magazine Manga Time Kirara Max since December 2017. Its chapters have been collected in five tankōbon volumes as of November 2022.     An anime television series adaptation produced by CloverWorks aired from October to December 2022. The series has been praised for its writing, comedy, characters, and depiction of social anxiety, with the anime's visual creativity receiving acclaim.     Thought:
It worked.
Final Answer: Bocchi the Rock! is a four-panel manga series and anime television series. The series has been praised for its writing, comedy, characters, and depiction of social anxiety, with the anime's visual creativity receiving acclaim.
> Finished chain.

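## 4.5 Caching for Large Language Models

LangChain provides an optional caching layer for LLMs. This is useful for two reasons:

If you often request the same completion multiple times, it can save you money by reducing the number of API calls you make to the LLM provider.
It can also speed up your application, for the same reason: fewer API calls to the LLM provider.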

In [14]:
# In-memory cache
import time
import langchain
from langchain.cache import InMemoryCache
from langchain.llms import OpenAI
# To make the caching really obvious, let's use a slower model.
llm = OpenAI(model_name="text-davinci-002", n=2, best_of=2)
langchain.llm_cache = InMemoryCache()

start_time = time.time()  # record the start time
# The first time, it is not yet in cache, so it should take longer
print(llm.predict("Tell me a joke"))
end_time = time.time()  # record the end time
elapsed_time = end_time - start_time  # compute the elapsed time
print(f"Predict method took {elapsed_time:.4f} seconds to execute.")



Why did the chicken cross the road?

To get to the other side!
Predict method took 1.1823 seconds to execute.


In [15]:
start_time = time.time()  # record the start time
# The second time it is in the cache, so it returns faster
print(llm.predict("Tell me a joke"))
end_time = time.time()  # record the end time
elapsed_time = end_time - start_time  # compute the elapsed time
print(f"Predict method took {elapsed_time:.4f} seconds to execute.")



Why did the chicken cross the road?

To get to the other side!
Predict method took 0.0000 seconds to execute.


In [19]:
# SQLite-backed cache
# We can do the same thing with a SQLite cache
from langchain.cache import SQLiteCache
langchain.llm_cache = SQLiteCache(database_path=".langchain.db")


start_time = time.time()  # record the start time
# The first time, it is not yet in cache, so it should take longer
print(llm.predict("用中文讲个笑话"))  # prompt: "Tell a joke in Chinese"
end_time = time.time()  # record the end time
elapsed_time = end_time - start_time  # compute the elapsed time
print(f"Predict method took {elapsed_time:.4f} seconds to execute.")



今天，我在公司附近的商场买了一件新衣服。我觉得自己很漂亮，所以我决定去买一杯咖啡。我坐下来点了一杯咖啡，一个英俊的年轻人走过来坐在我对面。他给了我一个微笑，我微笑着回答他。我们聊了一会儿，然后他问我：“你要不要和我一起去看电影？”我说：“
Predict method took 3.7383 seconds to execute.


In [21]:
start_time = time.time()  # record the start time
# The second time it is in the cache, so it returns faster
print(llm.predict("用中文讲个笑话"))  # prompt: "Tell a joke in Chinese"
end_time = time.time()  # record the end time
elapsed_time = end_time - start_time  # compute the elapsed time
print(f"Predict method took {elapsed_time:.4f} seconds to execute.")



今天，我在公司附近的商场买了一件新衣服。我觉得自己很漂亮，所以我决定去买一杯咖啡。我坐下来点了一杯咖啡，一个英俊的年轻人走过来坐在我对面。他给了我一个微笑，我微笑着回答他。我们聊了一会儿，然后他问我：“你要不要和我一起去看电影？”我说：“
Predict method took 0.0030 seconds to execute.
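
The effect measured above can be sketched with a plain dictionary. In this minimal illustration, `PromptCache` and `slow_model` are hypothetical names, and keying on the prompt together with a settings string is an assumption of the sketch, not a description of LangChain's internals.

```python
import time
from typing import Callable, Dict, Tuple


class PromptCache:
    # Hypothetical in-memory cache keyed by (prompt, model settings).
    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], str] = {}
        self.hits = 0
        self.misses = 0

    def get_or_call(self, prompt: str, settings: str, call: Callable[[str], str]) -> str:
        key = (prompt, settings)
        if key in self._store:
            # Cache hit: return the stored answer without calling the model.
            self.hits += 1
            return self._store[key]
        # Cache miss: call the model once and remember the answer.
        self.misses += 1
        result = call(prompt)
        self._store[key] = result
        return result


def slow_model(prompt: str) -> str:
    time.sleep(0.05)  # stand-in for a network round trip
    return f"answer to: {prompt}"


cache = PromptCache()
for _ in range(2):
    start = time.perf_counter()
    answer = cache.get_or_call("Tell me a joke", "model=demo,n=2", slow_model)
    print(f"{answer!r} took {time.perf_counter() - start:.4f} seconds")
```

Including the settings in the key matters: the same prompt sent with a different model or temperature should not reuse an answer produced under other settings.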


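## 4.6 Serializing LLM Configuration
LangChain provides a convenient way to serialize an LLM's configuration to JSON or YAML so that it can be saved to a file on disk.

This section covers writing an LLM configuration to disk and reading it back. This is useful if you want to save the configuration of a given LLM (for example, the provider, the temperature, and so on).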

In [22]:
from langchain.llms import OpenAI
from langchain.llms.loading import load_llm
llm = load_llm("llmstore/llm.json")

In [23]:
llm

OpenAI(cache=None, verbose=False, callbacks=None, callback_manager=None, tags=None, metadata=None, client=<class 'openai.api_resources.completion.Completion'>, model_name='text-davinci-003', temperature=0.7, max_tokens=256, top_p=1.0, frequency_penalty=0.0, presence_penalty=0.0, n=1, best_of=1, model_kwargs={}, openai_api_key='<redacted>', openai_api_base='', openai_organization='', openai_proxy='', batch_size=20, request_timeout=None, logit_bias={}, max_retries=6, streaming=False, allowed_special=set(), disallowed_special='all', tiktoken_model_name=None)

In [25]:
llm = load_llm("llmstore/llm.yaml")
llm

OpenAI(cache=None, verbose=False, callbacks=None, callback_manager=None, tags=None, metadata=None, client=<class 'openai.api_resources.completion.Completion'>, model_name='text-davinci-003', temperature=0.7, max_tokens=256, top_p=1.0, frequency_penalty=0.0, presence_penalty=0.0, n=1, best_of=1, model_kwargs={}, openai_api_key='<redacted>', openai_api_base='', openai_organization='', openai_proxy='', batch_size=20, request_timeout=None, logit_bias={}, max_retries=6, streaming=False, allowed_special=set(), disallowed_special='all', tiktoken_model_name=None)

### Saving
To go from an LLM in memory to its serialized version, call the .save method. Again, both JSON and YAML are supported.

In [26]:
llm.save("llmsave.json")

In [28]:
llm.save("llmsave.yaml")
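
What such a configuration file might contain can be sketched with the standard json module. The field names below mirror the repr shown above; the `_type` key naming the provider is an assumption of this sketch, and the values are illustrative rather than copied from a real saved file.

```python
import json
import os
import tempfile

# Hypothetical configuration; field names mirror the repr shown above,
# and "_type" is an assumed discriminator naming the provider.
config = {
    "_type": "openai",
    "model_name": "text-davinci-003",
    "temperature": 0.7,
    "max_tokens": 256,
}

# Write the configuration to a JSON file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "llm.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(config, f, indent=2)

# Read it back and confirm that the round trip is lossless.
with open(path, encoding="utf-8") as f:
    restored = json.load(f)

print(restored == config)
```

Secrets such as API keys are better supplied through environment variables than written into a file like this, which is why the sketch leaves them out.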

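## 4.7 Streaming LLM Responses
Some LLMs provide a streaming response. This means you can start processing the response as soon as it becomes available instead of waiting for the entire response to be returned.

This is useful if you want to display the response to the user as it is being generated, or process it while it is being generated.

Currently, streaming is supported for a range of LLM implementations, including but not limited to OpenAI, ChatOpenAI, ChatAnthropic, Hugging Face Text Generation Inference, and Replicate.

This feature has been extended to accommodate most models. To use streaming, pass a CallbackHandler that implements on_llm_new_token. This example uses StreamingStdOutCallbackHandler.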

In [2]:
from langchain.llms import OpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler


llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)
resp = llm("Write me a song about sparkling water.")



Verse 1
I'm sippin' on sparkling water,
It's so refreshing and light,
It's the perfect way to quench my thirst
On a hot summer night.

Chorus
Sparkling water, sparkling water,
It's the best way to stay hydrated,
It's so crisp and so clean,
It's the perfect way to stay refreshed.

Verse 2
I'm sippin' on sparkling water,
It's so bubbly and bright,
It's the perfect way to cool me down
On a hot summer night.

Chorus
Sparkling water, sparkling water,
It's the best way to stay hydrated,
It's so crisp and so clean,
It's the perfect way to stay refreshed.

Verse 3
I'm sippin' on sparkling water,
It's so light and so clear,
It's the perfect way to keep me cool
On a hot summer night.

Chorus
Sparkling water, sparkling water,
It's the best way to stay hydrated,
It's so crisp and so clean,
It's the perfect way to stay refreshed.

We can still obtain the final LLMResult by calling generate. However, token_usage is not currently supported for streaming.

In [3]:
llm.generate(["Tell me a joke."])



Q: What did the fish say when it hit the wall?
A: Dam!

LLMResult(generations=[[Generation(text='\n\nQ: What did the fish say when it hit the wall?\nA: Dam!', generation_info={'finish_reason': 'stop', 'logprobs': None})]], llm_output={'token_usage': {}, 'model_name': 'text-davinci-003'}, run=[RunInfo(run_id=UUID('c47c9b6f-07b2-4769-baf6-f93916b80077'))])
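
The callback pattern itself can be sketched without a model. In the sketch below, `CollectingHandler` exposes a hook with the same name, on_llm_new_token, but it is a hypothetical stand-in rather than a LangChain handler, and `fake_stream` merely splits a string on spaces to imitate a token stream.

```python
from typing import Iterator, List


class CollectingHandler:
    # Hypothetical handler exposing the same hook name, on_llm_new_token.
    def __init__(self) -> None:
        self.tokens: List[str] = []

    def on_llm_new_token(self, token: str) -> None:
        # Called once per token, as soon as the token is available.
        self.tokens.append(token)


def fake_stream(text: str) -> Iterator[str]:
    # Yield space-delimited pieces to imitate a token stream.
    for piece in text.split(" "):
        yield piece + " "


def run_streaming(text: str, handler: CollectingHandler) -> str:
    chunks = []
    for token in fake_stream(text):
        handler.on_llm_new_token(token)  # incremental processing happens here
        chunks.append(token)
    # The complete text is still available once the stream ends.
    return "".join(chunks).rstrip()


handler = CollectingHandler()
final = run_streaming("Sparkling water is refreshing", handler)
print(len(handler.tokens), repr(final))
```

A handler that prints each token instead of storing it would give the same incremental display as the standard-output handler used in the cell above.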

## 4.8 Tracking Token Usage
LangChain provides a convenient way to track an LLM's token usage. This is useful for debugging or educational purposes.
It applies only to the OpenAI API.

In [1]:
from langchain.llms import OpenAI
from langchain.callbacks import get_openai_callback
llm = OpenAI(model_name="text-davinci-002", n=2, best_of=2, cache=None)

with get_openai_callback() as cb:
    result = llm("讲个笑话")  # prompt: "Tell a joke"
    print(cb)

Tokens Used: 346
	Prompt Tokens: 10
	Completion Tokens: 336
Successful Requests: 1
Total Cost (USD): $0.00692


Anything inside the context manager is tracked. Here is an example that uses it to track multiple calls in sequence.

In [2]:
with get_openai_callback() as cb:
    result2 = llm("给我讲个笑话")  # prompt: "Tell me a joke"
    result3 = llm("给我讲个笑话")
    print(cb)

Tokens Used: 530
	Prompt Tokens: 30
	Completion Tokens: 500
Successful Requests: 2
Total Cost (USD): $0.010600000000000002
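
The totals reported above are consistent with a flat rate of $0.02 per 1,000 tokens: 346 tokens gives $0.00692 and 530 tokens gives $0.0106. That rate is inferred from these outputs rather than stated in the source, so treat it as an assumption. The sketch below redoes the arithmetic with `Decimal`, which also avoids the binary floating-point noise visible in `$0.010600000000000002`; `cost_usd` is a hypothetical helper.

```python
from decimal import Decimal

# Assumed flat rate, inferred from the outputs above (346 tokens -> $0.00692).
RATE_PER_1K_TOKENS = Decimal("0.02")


def cost_usd(total_tokens: int) -> Decimal:
    # Decimal arithmetic keeps the result exact instead of accumulating float error.
    return RATE_PER_1K_TOKENS * total_tokens / 1000


print(cost_usd(346))
print(cost_usd(530))
```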


If you use a chain or agent with multiple steps, it tracks all of those steps.

In [4]:
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
tools = load_tools(["llm-math"], llm=llm)
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
with get_openai_callback() as cb:
    response = agent.run(
        "王菲现在的年龄是多少？"  # prompt: "How old is Faye Wong now?"
    )
    print(f"Total Tokens: {cb.total_tokens}")
    print(f"Prompt Tokens: {cb.prompt_tokens}")
    print(f"Completion Tokens: {cb.completion_tokens}")
    print(f"Total Cost (USD): ${cb.total_cost}")


> Entering new AgentExecutor chain...
 I need to find out when she was born.
Action: Calculator
Action Input: 2020 - 1957
Observation: Answer: 63
Thought: I now know the final answer
Final Answer: 王菲现在的年龄是63岁。

> Finished chain.
Total Tokens: 681
Prompt Tokens: 606
Completion Tokens: 75
Total Cost (USD): $0.01362
