# LangChain core modules: Model I/O

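Model I/O is the standardized, LLM-facing interface that LangChain provides to developers. It covers model input (Prompts), the model itself (Models), and model output (Output Parsers):
- Prompts: templated, dynamic model input
- Models: the language model being called
- Output Parsers: extract information from the model output and normalize it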
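
The three parts chain together as prompt, model call, and parsing. The sketch below is plain Python rather than LangChain code: `build_prompt`, `fake_model`, and `parse_number` are hypothetical stand-ins, and the "model" returns a canned reply so the block runs without an API key.

```python
# Plain-Python sketch of the Model I/O flow: prompt -> model -> output parser.
# All three functions are illustrative stand-ins, not LangChain APIs.


def build_prompt(a: int, b: int) -> str:
    """Prompts: fill a template with run-time values."""
    template = "What is {a} + {b}? Reply with the number only."
    return template.format(a=a, b=b)


def fake_model(prompt: str) -> str:
    """Models: stand-in for a real LLM call; returns a canned, untidy reply."""
    return " 5\n"


def parse_number(text: str) -> int:
    """Output parsers: normalize the raw text into a typed value."""
    return int(text.strip())


prompt = build_prompt(2, 3)
raw_output = fake_model(prompt)
answer = parse_number(raw_output)
print(prompt)
print(answer)
```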

![img](./picture/IO_model.png)

## Installing LangChain

In [3]:
# Install
!pip install langchain
# Upgrade to the latest version
# !pip install -U langchain


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.1.2[0m[39;49m -> [0m[32;49m23.2.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


## Model input: Prompts

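A prompt is a set of instructions or a text description supplied by the user to guide the model's response. It helps the model understand the context and generate relevant, coherent language output, such as answering a question or continuing a conversation.
- Prompt Templates: parameterized model input
- Example Selectors: dynamically choose which examples to include in the prompt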

### Prompt Templates

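Prompt Templates offer a predefined, parameterized, model-agnostic way to generate prompts: values are injected at run time, and the same template can be reused across different language models.
A prompt template is usually a single string (for LLMs) or a list of chat messages (for Chat Models).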
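
At its simplest, a string template is text with named `{...}` placeholders plus the list of variable names it expects. A standard-library sketch of that idea follows; the helper `infer_input_variables` is hypothetical and is not LangChain's implementation.

```python
from string import Formatter
from typing import List


def infer_input_variables(template: str) -> List[str]:
    """Return the sorted, de-duplicated placeholder names in a format-style template."""
    names = {
        field_name
        for _, field_name, _, _ in Formatter().parse(template)
        if field_name  # literal-only segments have no field name
    }
    return sorted(names)


template = "Tell me a {adjective} joke about {content}."
variables = infer_input_variables(template)
prompt = template.format(adjective="funny", content="chickens")
print(variables)
print(prompt)
```

Knowing the variable names up front lets a template check that every placeholder has a value before the prompt is sent to a model.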

The inheritance hierarchy of the commonly used template classes in LangChain is shown below.

```
BasePromptTemplate --> PipelinePromptTemplate
                       StringPromptTemplate --> PromptTemplate
                                                FewShotPromptTemplate
                                                FewShotPromptWithTemplates
                       BaseChatPromptTemplate --> AutoGPTPrompt
                                                  ChatPromptTemplate --> AgentScratchPadChatPromptTemplate



BaseMessagePromptTemplate --> MessagesPlaceholder
                              BaseStringMessagePromptTemplate --> ChatMessagePromptTemplate
                                                                  HumanMessagePromptTemplate
                                                                  AIMessagePromptTemplate
                                                                  SystemMessagePromptTemplate

PromptValue --> StringPromptValue
                ChatPromptValue
```

**Source code: https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/prompts**

#### Generating prompts with PromptTemplate

In [13]:
# Option 1: instantiate the template with the from_template method
from langchain import PromptTemplate
prompt_template = PromptTemplate.from_template("Tell me a {adjective} joke about {content}.")
# Generate the prompt with format
prompt = prompt_template.format(adjective="funny",content="chickens")

print(prompt, type(prompt), sep="\n")

Tell me a funny joke about chickens.
<class 'str'>


In [19]:
# Option 2: instantiate the template with the constructor
# Pass input_variables and template when instantiating PromptTemplate
from langchain import PromptTemplate
prompt_template = PromptTemplate(
    input_variables=["adjective","content"],
    template="Tell me a {adjective} joke about {content}."
)
prompt = prompt_template.format(adjective="funny",content="dog")
print(prompt, type(prompt), sep="\n")


Tell me a funny joke about dog.
<class 'str'>


#### Sending a PromptTemplate prompt to the model

In [27]:
from langchain.llms import OpenAI
from langchain import PromptTemplate
# Define the model. The API key is read from the OPENAI_API_KEY environment
# variable; never hard-code a real key in a notebook.
llm = OpenAI(openai_api_base="https://api.chatanywhere.com.cn/v1",
             model_name="gpt-3.5-turbo")

# Define the template
prompt_template = PromptTemplate(
    input_variables=["num"],
    template="讲{num}个关于程序员的笑话"  # "Tell {num} jokes about programmers"
)

# Send the request and get the response
result = llm(prompt_template.format(num=3))
print(type(result))
print(f"result:\n{result}")

<class 'str'>
result:
1. 为什么程序员总是喜欢在夜晚工作？因为他们喜欢黑客入侵的感觉！

2. 为什么程序员总是喜欢用黑色主题的编辑器？因为他们觉得黑色背景更酷，而且可以节省电量，让电脑更快！

3. 为什么程序员总是喜欢用鼠标右键点击？因为他们觉得左键是用来修复问题的，而右键是用来制造问题的！


#### Generating prompts for chat models with ChatPromptTemplate

In [4]:
# Option 1: build the prompt from a list of messages
from langchain.prompts import ChatPromptTemplate
prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI bot. Your name is {name}."),
    ("human", "Hello, how are you doing?"),
    ("ai", "I'm doing well, thanks!"),
    ("human", "{user_input}"),
])
# Generate the prompt
prompt = prompt_template.format_messages(name="victor", user_input="what is your name?")
print(prompt, type(prompt), sep="\n")

[SystemMessage(content='You are a helpful AI bot. Your name is victor.', additional_kwargs={}), HumanMessage(content='Hello, how are you doing?', additional_kwargs={}, example=False), AIMessage(content="I'm doing well, thanks!", additional_kwargs={}, example=False), HumanMessage(content='what is your name?', additional_kwargs={}, example=False)]
<class 'list'>


In [5]:
print(prompt[0].content)

You are a helpful AI bot. Your name is victor.
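
Formatting a chat template amounts to filling the placeholders of each role-tagged message while keeping the roles intact. A plain-Python sketch of that idea, with a hypothetical `format_messages` helper and `(role, content)` tuples standing in for LangChain's message classes:

```python
from typing import List, Tuple

# (role, template) pairs; the layout mirrors the tuples passed to from_messages above.
message_templates: List[Tuple[str, str]] = [
    ("system", "You are a helpful AI bot. Your name is {name}."),
    ("human", "Hello, how are you doing?"),
    ("ai", "I'm doing well, thanks!"),
    ("human", "{user_input}"),
]


def format_messages(templates: List[Tuple[str, str]], **values: str) -> List[Tuple[str, str]]:
    """Fill the placeholders of every message template, keeping each role."""
    return [(role, text.format(**values)) for role, text in templates]


messages = format_messages(message_templates, name="victor", user_input="what is your name?")
for role, content in messages:
    print(f"{role}: {content}")
```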


In [15]:
# Option 2: build the prompt from role-specific prompt templates
from langchain.prompts import HumanMessagePromptTemplate, AIMessagePromptTemplate, SystemMessagePromptTemplate, ChatPromptTemplate

sysMessage = SystemMessagePromptTemplate.from_template("You are a helpful AI bot. Your name is {name}.")
humanMessage_1 = HumanMessagePromptTemplate.from_template("Hello, how are you doing?")
ai_message = AIMessagePromptTemplate.from_template("I'm doing well, thanks!")
humanMessage_2 = HumanMessagePromptTemplate.from_template("{user_input}")

prompt_template = ChatPromptTemplate.from_messages([sysMessage, humanMessage_1, ai_message, humanMessage_2])
# Generate the prompt
prompt = prompt_template.format_messages(name="kunmzhao", user_input="what is your name?")
print(prompt, type(prompt), sep="\n")

[SystemMessage(content='You are a helpful AI bot. Your name is kunmzhao.', additional_kwargs={}), HumanMessage(content='Hello, how are you doing?', additional_kwargs={}, example=False), AIMessage(content="I'm doing well, thanks!", additional_kwargs={}, example=False), HumanMessage(content='what is your name?', additional_kwargs={}, example=False)]
<class 'list'>


#### Sending a ChatPromptTemplate prompt to the model

In [20]:
from langchain.chat_models import ChatOpenAI
# Define the chat model. The API key is read from the OPENAI_API_KEY environment
# variable; never hard-code a real key in a notebook.
chat_model = ChatOpenAI(
    openai_api_base="https://api.chatanywhere.com.cn/v1",
    model_name="gpt-3.5-turbo"
)

# Define the chat prompt
summary_template = ChatPromptTemplate.from_messages([
    ("system", "你将获得关于同一主题的{num}篇文章（用-----------标签分隔）。首先总结每篇文章的论点。然后指出哪篇文章提出了更好的论点，并解释原因。"),
    ("human", "{user_input}"),
])

messages = summary_template.format_messages(
    num=3,
    user_input='''1. [PHP是世界上最好的语言]
PHP是世界上最好的情感派编程语言，无需逻辑和算法，只要情绪。它能被蛰伏在冰箱里的PHP大神轻易驾驭，会话结束后的感叹号也能传达对代码的热情。写PHP就像是在做披萨，不需要想那么多，只需把配料全部扔进一个碗，然后放到服务器上，热乎乎出炉的网页就好了。
-----------
2. [Python是世界上最好的语言]
Python是世界上最好的拜金主义者语言。它坚信：美丽就是力量，简洁就是灵魂。Python就像是那个永远在你皱眉的那一刻扔给你言情小说的好友。只有Python，你才能够在两行代码之间感受到飘逸的花香和清新的微风。记住，这世上只有一种语言可以使用空格来领导全世界的进步，那就是Python。
-----------
3. [Java是世界上最好的语言]
Java是世界上最好的德育课编程语言，它始终坚守了严谨、安全的编程信条。Java就像一个严格的老师，他不会对你怀柔，不会让你偷懒，也不会让你走捷径，但他教会你规范和自律。Java就像是那个喝咖啡也算加班费的上司，拥有对邪恶的深度厌恶和对善良的深度拥护。
'''
)

result = chat_model(messages)
print(result.content)

1. 第一篇文章的论点是PHP是世界上最好的语言，因为它无需逻辑和算法，只需要情绪。
2. 第二篇文章的论点是Python是世界上最好的语言，因为它拥有美丽和简洁的特性。
3. 第三篇文章的论点是Java是世界上最好的语言，因为它具有严谨和安全的编程信条。

根据这些论点，我认为第三篇文章提出了更好的论点。原因如下：
第三篇文章的论点强调了Java对规范和自律的重视，这是一种重要的编程价值观。它强调了在编程过程中的严谨性和安全性，这对于确保代码质量和安全性非常重要。与第一篇和第二篇文章相比，第三篇文章更加关注编程的核心原则和价值观，而不是只强调情感或美学。因此，第三篇文章提出了更好的论点。


#### Generating prompts with FewShotPromptTemplate

In [26]:
from langchain.prompts.few_shot import FewShotPromptTemplate
from langchain import PromptTemplate

# Define the examples
examples = [
  {
    "question": "谁活得更久，穆罕默德·阿里还是艾伦·图灵？",
    "answer": 
"""
这里需要进一步的问题吗：是的。
追问：穆罕默德·阿里去世时多大了？
中间答案：穆罕默德·阿里去世时74岁。
追问：艾伦·图灵去世时多大了？
中间答案：艾伦·图灵去世时41岁。
所以最终答案是：穆罕默德·阿里
"""
  },
  {
    "question": "craigslist的创始人是什么时候出生的？",
    "answer": 
"""
这里需要进一步的问题吗：是的。
追问：谁是craigslist的创始人？
中间答案：Craigslist是由Craig Newmark创办的。
追问：Craig Newmark是什么时候出生的？
中间答案：Craig Newmark出生于1952年12月6日。
所以最终答案是：1952年12月6日
"""
  },
  {
    "question": "乔治·华盛顿的外祖父是谁？",
    "answer":
"""
这里需要进一步的问题吗：是的。
追问：谁是乔治·华盛顿的母亲？
中间答案：乔治·华盛顿的母亲是Mary Ball Washington。
追问：Mary Ball Washington的父亲是谁？
中间答案：Mary Ball Washington的父亲是Joseph Ball。
所以最终答案是：Joseph Ball
"""
  },
  {
    "question": "《大白鲨》和《皇家赌场》的导演是同一个国家的吗？",
    "answer":
"""
这里需要进一步的问题吗：是的。
追问：谁是《大白鲨》的导演？
中间答案：《大白鲨》的导演是Steven Spielberg。
追问：Steven Spielberg来自哪里？
中间答案：美国。
追问：谁是《皇家赌场》的导演？
中间答案：《皇家赌场》的导演是Martin Campbell。
追问：Martin Campbell来自哪里？
中间答案：新西兰。
所以最终答案是：不是
"""
  }
]
# Define the example template
example_prompt = PromptTemplate(
    input_variables=["question", "answer"],
    template="Question: {question}\n{answer}"
)

# Create a FewShotPromptTemplate object
few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    suffix="Question:{input}",
    input_variables=["input"],
)
prompt = few_shot_prompt.format(input="习近平和袁隆平谁的年纪大?")
print(prompt)

Question: 谁活得更久，穆罕默德·阿里还是艾伦·图灵？

这里需要进一步的问题吗：是的。
追问：穆罕默德·阿里去世时多大了？
中间答案：穆罕默德·阿里去世时74岁。
追问：艾伦·图灵去世时多大了？
中间答案：艾伦·图灵去世时41岁。
所以最终答案是：穆罕默德·阿里


Question: craigslist的创始人是什么时候出生的？

这里需要进一步的问题吗：是的。
追问：谁是craigslist的创始人？
中间答案：Craigslist是由Craig Newmark创办的。
追问：Craig Newmark是什么时候出生的？
中间答案：Craig Newmark出生于1952年12月6日。
所以最终答案是：1952年12月6日


Question: 乔治·华盛顿的外祖父是谁？

这里需要进一步的问题吗：是的。
追问：谁是乔治·华盛顿的母亲？
中间答案：乔治·华盛顿的母亲是Mary Ball Washington。
追问：Mary Ball Washington的父亲是谁？
中间答案：Mary Ball Washington的父亲是Joseph Ball。
所以最终答案是：Joseph Ball


Question: 《大白鲨》和《皇家赌场》的导演是同一个国家的吗？

这里需要进一步的问题吗：是的。
追问：谁是《大白鲨》的导演？
中间答案：《大白鲨》的导演是Steven Spielberg。
追问：Steven Spielberg来自哪里？
中间答案：美国。
追问：谁是《皇家赌场》的导演？
中间答案：《皇家赌场》的导演是Martin Campbell。
追问：Martin Campbell来自哪里？
中间答案：新西兰。
所以最终答案是：不是


Question:习近平和袁隆平谁的年纪大?
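
As the printed result shows, the few-shot prompt is the formatted examples followed by the formatted suffix, separated by blank lines. A plain-Python sketch of that assembly is below; the `build_few_shot_prompt` helper and its `separator` argument are illustrative assumptions, not the library's code, and the arithmetic examples are made up for brevity.

```python
from typing import Dict, List


def build_few_shot_prompt(
    examples: List[Dict[str, str]],
    example_template: str,
    suffix: str,
    prefix: str = "",
    separator: str = "\n\n",
    **inputs: str,
) -> str:
    """Join an optional prefix, the formatted examples, and the formatted suffix."""
    pieces = [prefix] if prefix else []
    pieces += [example_template.format(**example) for example in examples]
    pieces.append(suffix.format(**inputs))
    return separator.join(pieces)


examples = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "What is 3 + 5?", "answer": "8"},
]
prompt = build_few_shot_prompt(
    examples,
    example_template="Question: {question}\n{answer}",
    suffix="Question: {input}",
    input="What is 7 + 6?",
)
print(prompt)
```

Ending the prompt with an unanswered question in the same layout is what nudges the model to continue in the format of the examples.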


#### Sending a FewShotPromptTemplate prompt to the model

In [27]:
from langchain.llms import OpenAI
from langchain import PromptTemplate
# Define the model. The API key is read from the OPENAI_API_KEY environment
# variable; never hard-code a real key in a notebook.
llm = OpenAI(openai_api_base="https://api.chatanywhere.com.cn/v1",
             model_name="gpt-3.5-turbo")

result = llm(prompt)
print(result)

这里需要进一步的问题吗：是的。
追问：习近平的出生日期是什么？
中间答案：习近平出生于1953年6月15日。
追问：袁隆平的出生日期是什么？
中间答案：袁隆平出生于1930年9月7日。
所以最终答案是：袁隆平


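### Example Selectors
#### Select by similarity

When a large pool of reference examples is available, we need to choose which ones to include in the prompt. The choice is usually made automatically according to some condition or rule; this selector picks the examples most semantically similar to the input.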
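
The selection step itself is a ranking by similarity. The sketch below shows only that step and makes one loud assumption: it scores similarity with character-bigram overlap, a crude stand-in for the embedding model and vector store that a real semantic selector relies on. The helper names are hypothetical.

```python
from typing import Dict, List, Set


def char_bigrams(text: str) -> Set[str]:
    """Split a string into its set of adjacent character pairs."""
    text = text.lower()
    return {text[i : i + 2] for i in range(len(text) - 1)}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of character bigrams (stand-in for embedding similarity)."""
    set_a, set_b = char_bigrams(a), char_bigrams(b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def select_examples(examples: List[Dict[str, str]], query: str, k: int = 1) -> List[Dict[str, str]]:
    """Return the k examples whose input is most similar to the query."""
    ranked = sorted(examples, key=lambda ex: similarity(ex["input"], query), reverse=True)
    return ranked[:k]


examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "sunny", "output": "gloomy"},
]
selected = select_examples(examples, "unhappy", k=1)
print(selected)
```

Character overlap only captures spelling, so it cannot relate words such as "worried" and "happy"; that gap is why embeddings are used in practice.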

In [6]:
!pip install chromadb
!pip install tiktoken


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip available: [0m[31;49m22.3.1[0m[39;49m -> [0m[32;49m23.2.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip available: [0m[31;49m22.3.1[0m[39;49m -> [0m[32;49m23.2.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In [9]:
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import FewShotPromptTemplate, PromptTemplate

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)

# These are a lot of examples of a pretend task of creating antonyms.
examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]

example_selector = SemanticSimilarityExampleSelector.from_examples(
    # This is the list of examples available to select from.
    examples, 
    # This is the embedding class used to produce embeddings which are used to measure semantic similarity.
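    # The API key is read from the OPENAI_API_KEY environment variable.
    # No chat model name is passed here: gpt-3.5-turbo is not an embedding
    # model, so the class's default embedding model is used instead.
    OpenAIEmbeddings(
        openai_api_base="https://api.chatanywhere.com.cn/v1",
    ), 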
    # This is the VectorStore class that is used to store the embeddings and do a similarity search over.
    Chroma, 
    # This is the number of examples to produce.
    k=1
)
similar_prompt = FewShotPromptTemplate(
    # We provide an ExampleSelector instead of examples.
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Input: {adjective}\nOutput:", 
    input_variables=["adjective"],
)

print(similar_prompt.format(adjective="worried"))

Give the antonym of every input

Input: happy
Output: sad

Input: worried
Output:


#### Sending the similarity-selected prompt to the model

In [13]:
from langchain.llms import OpenAI
from langchain import PromptTemplate
# Define the model. The API key is read from the OPENAI_API_KEY environment
# variable; never hard-code a real key in a notebook.
llm = OpenAI(openai_api_base="https://api.chatanywhere.com.cn/v1",
             model_name="gpt-3.5-turbo")

result = llm(similar_prompt.format(adjective="worried"))
print(result)

calm


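#### Select by length
This example selector chooses examples based on length, which is useful when the assembled prompt might exceed the context window. It selects fewer examples for longer inputs and more examples for shorter inputs.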
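
The behaviour described above can be sketched as a word budget: subtract the length of the input, then add examples in order until the budget runs out. This is a plain-Python illustration; measuring length by splitting on spaces and newlines is an assumption, and `select_by_length` is a hypothetical helper.

```python
import re
from typing import Dict, List


def text_length(text: str) -> int:
    """Count words by splitting on spaces and newlines."""
    return len(re.split(r"\n| ", text))


def select_by_length(
    examples: List[Dict[str, str]],
    example_template: str,
    query: str,
    max_length: int,
) -> List[Dict[str, str]]:
    """Add examples in order until the word budget left after the query is used up."""
    remaining = max_length - text_length(query)
    selected = []
    for example in examples:
        remaining -= text_length(example_template.format(**example))
        if remaining < 0:
            break
        selected.append(example)
    return selected


examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]
template = "Input: {input}\nOutput: {output}"

short_query = select_by_length(examples, template, "big", max_length=25)
long_query = select_by_length(
    examples, template, "big and huge and massive and large and gigantic", max_length=25
)
print(len(short_query), len(long_query))
```

Each formatted example counts as four words here, so a one-word input leaves room for all five examples while the nine-word input leaves room for only four.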

In [23]:
from langchain.prompts import PromptTemplate
from langchain.prompts import FewShotPromptTemplate
from langchain.prompts.example_selector import LengthBasedExampleSelector

examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)

example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt = example_prompt,
    max_length=25,#Length is measured by the get_text_length function
)
length_prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Input:{adjective}\nOutput:",
    input_variables=["adjective"],
)
print(length_prompt.format(adjective="big"))

Give the antonym of every input

Input: happy
Output: sad

Input: tall
Output: short

Input: energetic
Output: lethargic

Input: sunny
Output: gloomy

Input: windy
Output: calm

Input:big
Output:


#### Sending the length-selected prompt to the model

In [26]:
from langchain.llms import OpenAI
from langchain import PromptTemplate
# Define the model. The API key is read from the OPENAI_API_KEY environment
# variable; never hard-code a real key in a notebook.
llm = OpenAI(openai_api_base="https://api.chatanywhere.com.cn/v1",
             model_name="gpt-3.5-turbo")

result = llm(length_prompt.format(adjective="big"))
print(result)

small


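#### Custom example selector
Sometimes a custom selector is needed to meet specific business requirements.
A custom selector must implement two methods:
- add_example: accepts an example and stores it
- select_examples: picks the examples to use

The following custom selector returns examples chosen at random.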

In [43]:
from langchain.prompts.example_selector.base import BaseExampleSelector
from typing import Dict, List
import numpy as np


class CustomExampleSelector(BaseExampleSelector):
    
    def __init__(self, examples: List[Dict[str, str]]):
        self.examples = examples
    
    def add_example(self, example: Dict[str, str]) -> None:
        """Add new example to store for a key."""
        self.examples.append(example)

    def select_examples(self, input_variables: Dict[str, str]) -> List[dict]:
        """Select which examples to use based on the inputs."""
        return np.random.choice(self.examples, size=2, replace=False)


In [59]:
examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]
example_selector = CustomExampleSelector(examples=examples)
example_selector.examples

[{'input': 'happy', 'output': 'sad'},
 {'input': 'tall', 'output': 'short'},
 {'input': 'energetic', 'output': 'lethargic'},
 {'input': 'sunny', 'output': 'gloomy'},
 {'input': 'windy', 'output': 'calm'}]

In [61]:
print(example_selector.select_examples({"input":"big","output":"small"}))

[{'input': 'energetic', 'output': 'lethargic'}
 {'input': 'windy', 'output': 'calm'}]


In [72]:
example_prompt = PromptTemplate(
    input_variables=["input","output",],
    template="input:{input}\noutput:{output}",
)

custom_prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="input:{input}\noutput:",
    input_variables=["input"],
)
prompt = custom_prompt.format(input="small")
print(prompt)

Give the antonym of every input

input:energetic
output:lethargic

input:windy
output:calm

input:small
output:


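## Models

- Language models (LLMs): a core LangChain component. LangChain does not provide its own LLMs; instead it offers a standard interface for interacting with many different LLMs (OpenAI, Hugging Face, and so on).
- Chat Models: a variant of language models. Chat models use a language model internally, but LangChain exposes a different interface for them, one that takes chat messages as input and returns chat messages as output.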
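
The difference between the two interfaces comes down to their input and output types. The sketch below uses hypothetical stand-ins (`fake_llm`, `fake_chat_model`, and a `Message` tuple) rather than LangChain classes, and returns canned text instead of calling a model.

```python
from typing import List, NamedTuple


class Message(NamedTuple):
    """A chat message: who said it and what was said."""
    role: str
    content: str


def fake_llm(prompt: str) -> str:
    """LLM-style interface: text in, text out."""
    return f"(completion for: {prompt})"


def fake_chat_model(messages: List[Message]) -> Message:
    """Chat-style interface: a list of messages in, one message out."""
    last_turn = messages[-1].content
    return Message(role="ai", content=f"(reply to: {last_turn})")


text_reply = fake_llm("Tell me a joke")
chat_reply = fake_chat_model(
    [
        Message(role="system", content="You are a helpful assistant."),
        Message(role="human", content="Tell me a joke"),
    ]
)
print(type(text_reply).__name__, text_reply)
print(type(chat_reply).__name__, chat_reply.content)
```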

### Language models (LLMs)


Class hierarchy:

```
BaseLanguageModel --> BaseLLM --> LLM --> <name>  # Examples: AI21, HuggingFaceHub, OpenAI
```

**API reference: https://api.python.langchain.com/en/latest/api_reference.html#module-langchain.llms**

List of supported LLMs:

**Developer documentation: https://python.langchain.com/docs/integrations/llms/**

The rest of this section focuses on OpenAI.

#### Calling the OpenAI GPT Completion API with LangChain

**Source code: https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/openai.py**
```python

class BaseOpenAI(BaseLLM):
    """Base OpenAI large language model class."""

    client: Any  #: :meta private:
    model_name: str = Field("text-davinci-003", alias="model")
    """Model name to use."""
    temperature: float = 0.7
    """What sampling temperature to use."""
    max_tokens: int = 256
    """The maximum number of tokens to generate in the completion.
    -1 returns as many tokens as possible given the prompt and
    the models maximal context size."""
    top_p: float = 1
    """Total probability mass of tokens to consider at each step."""
    frequency_penalty: float = 0
    """Penalizes repeated tokens according to frequency."""
    presence_penalty: float = 0
    """Penalizes repeated tokens."""
    n: int = 1
    """How many completions to generate for each prompt."""
    best_of: int = 1
    """Generates best_of completions server-side and returns the "best"."""
    model_kwargs: Dict[str, Any] = Field(default_factory=dict)
    """Holds any model parameters valid for `create` call not explicitly specified."""
    openai_api_key: Optional[str] = None
    openai_api_base: Optional[str] = None
    openai_organization: Optional[str] = None
    # to support explicit proxy for OpenAI
    openai_proxy: Optional[str] = None
    batch_size: int = 20
    """Batch size to use when passing multiple documents to generate."""
    request_timeout: Optional[Union[float, Tuple[float, float]]] = None
    """Timeout for requests to OpenAI completion API. Default is 600 seconds."""
    logit_bias: Optional[Dict[str, float]] = Field(default_factory=dict)
    """Adjust the probability of specific tokens being generated."""
    max_retries: int = 6
    """Maximum number of retries to make when generating."""
    streaming: bool = False
    """Whether to stream the results or not."""
    allowed_special: Union[Literal["all"], AbstractSet[str]] = set()
    """Set of special tokens that are allowed."""
    disallowed_special: Union[Literal["all"], Collection[str]] = "all"
    """Set of special tokens that are not allowed."""
    tiktoken_model_name: Optional[str] = None
    """The model name to pass to tiktoken when using this class. 
    Tiktoken is used to count the number of tokens in documents to constrain 
    them to be under a certain limit. By default, when set to None, this will 
    be the same as the embedding model name. However, there are some cases 
    where you may want to use this Embedding class with a model name not 
    supported by tiktoken. This can include when using Azure embeddings or 
    when using one of the many model providers that expose an OpenAI-like 
    API but with different models. In those cases, in order to avoid erroring 
    when tiktoken is called, you can specify a model name to use here."""

    ...


class OpenAI(BaseOpenAI):
    """OpenAI large language models.

    To use, you should have the ``openai`` python package installed, and the
    environment variable ``OPENAI_API_KEY`` set with your API key.

    Any parameters that are valid to be passed to the openai.create call can be passed
    in, even if not explicitly saved on this class.

    Example:
        .. code-block:: python

            from langchain.llms import OpenAI
            openai = OpenAI(model_name="text-davinci-003")
    """

    @property
    def _invocation_params(self) -> Dict[str, Any]:
        return {**{"model": self.model_name}, **super()._invocation_params}
```

In [20]:
from langchain.llms import OpenAI
# The API key is read from the OPENAI_API_KEY environment variable.
llm = OpenAI(openai_api_base="https://api.chatanywhere.com.cn/v1",
             model_name="gpt-3.5-turbo")

result = llm("给我讲1个笑话")
print(result)

好的，给你讲一个笑话：

有一天，小明去参加一个面试。面试官问：“小明，你有什么特长吗？”小明回答：“我可以模仿动物的叫声。”面试官觉得很好奇，就说：“那好，你模仿一下猫叫吧。”小明立刻弯下腰，用力地喵喵喵地叫了几声。面试官惊讶地说：“太棒了！你还可以模仿其他动物吗？”小明点了点头，然后闭上眼睛，用力地喘了几口气，然后说：“我是一只鱼，我是一只鱼...”


In [19]:
print(type(llm))
llm.__dict__
llm.temperature

<class 'langchain.llms.openai.OpenAIChat'>


AttributeError: 'OpenAIChat' object has no attribute 'temperature'

Note that `llm` is an `OpenAIChat` instance rather than an `OpenAI` one: in this LangChain version, constructing `OpenAI` with a chat model name such as `gpt-3.5-turbo` returns the `OpenAIChat` wrapper, which does not expose a `temperature` attribute. For chat models, `ChatOpenAI` (covered in the Chat Models section) is the intended class.

In [14]:
# Track token usage for specific calls. This is currently implemented only for the OpenAI API.
from langchain.callbacks import get_openai_callback
with get_openai_callback() as cb:
    result = llm("Tell me a joke")
    result1 = llm("Tell me another joke")
    print(cb)
print("===========")

print("===========")
print(result, result1, sep='\n')

Tokens Used: 68
	Prompt Tokens: 23
	Completion Tokens: 45
Successful Requests: 2
Total Cost (USD): $0.0001245
Sure, here's a classic one for you:

Why don't scientists trust atoms?

Because they make up everything!
Sure, here's another joke for you:

Why don't scientists trust atoms?

Because they make up everything!
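
The reported cost can be reproduced by hand from the token counts. The per-token rates below are an assumption (USD 0.0015 per 1,000 prompt tokens and USD 0.002 per 1,000 completion tokens); they are chosen because they reproduce the total printed above and are not stated anywhere in the output.

```python
# Assumed rates in USD per 1,000 tokens; not stated in the callback output.
PROMPT_RATE_PER_1K = 0.0015
COMPLETION_RATE_PER_1K = 0.002


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Price prompt and completion tokens separately and add them up."""
    prompt_cost = prompt_tokens / 1000 * PROMPT_RATE_PER_1K
    completion_cost = completion_tokens / 1000 * COMPLETION_RATE_PER_1K
    return prompt_cost + completion_cost


total = estimate_cost(prompt_tokens=23, completion_tokens=45)
print(f"Total Cost (USD): ${total:.7f}")
```

Prompt and completion tokens are priced separately, which is why the callback reports the two counts as well as their sum.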


### Chat Models

Class hierarchy:

```
BaseLanguageModel --> BaseChatModel --> <name>  # Examples: ChatOpenAI, ChatGooglePalm
```
**API reference: https://api.python.langchain.com/en/latest/api_reference.html#module-langchain.chat_models**
```python

class BaseChatModel(BaseLanguageModel[BaseMessageChunk], ABC):
    """Base class for chat models."""

    cache: Optional[bool] = None
    """Whether to cache the response."""
    verbose: bool = Field(default_factory=_get_verbosity)
    """Whether to print out response text."""
    callbacks: Callbacks = Field(default=None, exclude=True)
    """Callbacks to add to the run trace."""
    callback_manager: Optional[BaseCallbackManager] = Field(default=None, exclude=True)
    """Callback manager to add to the run trace."""
    tags: Optional[List[str]] = Field(default=None, exclude=True)
    """Tags to add to the run trace."""
    metadata: Optional[Dict[str, Any]] = Field(default=None, exclude=True)
    """Metadata to add to the run trace."""
    ...


class ChatOpenAI(BaseChatModel):
    """Wrapper around OpenAI Chat large language models.

    To use, you should have the ``openai`` python package installed, and the
    environment variable ``OPENAI_API_KEY`` set with your API key.

    Any parameters that are valid to be passed to the openai.create call can be passed
    in, even if not explicitly saved on this class.

    Example:
        .. code-block:: python

            from langchain.chat_models import ChatOpenAI
            openai = ChatOpenAI(model_name="gpt-3.5-turbo")
    """

    @property
    def lc_secrets(self) -> Dict[str, str]:
        return {"openai_api_key": "OPENAI_API_KEY"}

    @property
    def lc_serializable(self) -> bool:
        return True

    client: Any = None  #: :meta private:
    model_name: str = Field(default="gpt-3.5-turbo", alias="model")
    """Model name to use."""
    temperature: float = 0.7
    """What sampling temperature to use."""
    model_kwargs: Dict[str, Any] = Field(default_factory=dict)
    """Holds any model parameters valid for `create` call not explicitly specified."""
    openai_api_key: Optional[str] = None
    """Base URL path for API requests, 
    leave blank if not using a proxy or service emulator."""
    openai_api_base: Optional[str] = None
    openai_organization: Optional[str] = None
    # to support explicit proxy for OpenAI
    openai_proxy: Optional[str] = None
    request_timeout: Optional[Union[float, Tuple[float, float]]] = None
    """Timeout for requests to OpenAI completion API. Default is 600 seconds."""
    max_retries: int = 6
    """Maximum number of retries to make when generating."""
    streaming: bool = False
    """Whether to stream the results or not."""
    n: int = 1
    """Number of chat completions to generate for each prompt."""
    max_tokens: Optional[int] = None
    """Maximum number of tokens to generate."""
    tiktoken_model_name: Optional[str] = None
```

In [12]:
from langchain.chat_models import ChatOpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

# The API key is read from the OPENAI_API_KEY environment variable.
chat_model = ChatOpenAI(
    openai_api_base="https://api.chatanywhere.com.cn/v1",
    model_name="gpt-3.5-turbo",
    streaming=True, 
    callbacks=[StreamingStdOutCallbackHandler()],
)

In [13]:
from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage
)

messages = [SystemMessage(content="You are a helpful assistant."),
 HumanMessage(content="Who won the world series in 2020?"),
 AIMessage(content="The Los Angeles Dodgers won the World Series in 2020."), 
 HumanMessage(content="Where was it played?")]

chat_result = chat_model(messages)
print(chat_result.content)

The 2020 World Series was played at Globe Life Field in Arlington, Texas.The 2020 World Series was played at Globe Life Field in Arlington, Texas.

The answer appears twice because `StreamingStdOutCallbackHandler` writes the tokens to stdout as they arrive, and `print(chat_result.content)` then prints the complete message again.


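## Model output: Output Parsers
Language models output text.

Often, however, you want something more structured than plain text. This is where output parsers come in.

Output parsers are classes that help structure language model responses. An output parser must implement two main methods:

- `get_format_instructions`: returns a string with instructions for how the language model's output should be formatted.
- `parse`: takes a string (assumed to be the response from a language model) and parses it into some structure.

It can optionally implement a third:

- `parse_with_prompt`: takes a string (assumed to be the response from a language model) and a prompt (assumed to be the prompt that generated that response) and parses it into some structure. The prompt is provided mainly so that the parser can use information from it when it needs to retry or fix the output.

#### List parsing

Use this output parser when you want the model to return a list of comma-separated items.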

In [1]:
from langchain.output_parsers import CommaSeparatedListOutputParser
from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.llms import OpenAI

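# Create an output parser that handles comma-separated list output
output_parser = CommaSeparatedListOutputParser()

# Get the format instructions, which tell the model how to format its output
format_instructions = output_parser.get_format_instructions()

# Create a prompt template
prompt = PromptTemplate(
    template="List five {subject}.\n{format_instructions}",  # template text
    input_variables=["subject"],  # variables supplied at format time
    partial_variables={"format_instructions": format_instructions}  # pre-filled variable: the format instructions
)
# Format the input using the prompt template and the given subject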
_input = prompt.format(subject="ice cream flavors")
print(_input)

List five ice cream flavors.
Your response should be a list of comma separated values, eg: `foo, bar, baz`


In [2]:
from langchain.llms import OpenAI
# Define the model. The API key is read from the OPENAI_API_KEY environment variable.
llm = OpenAI(openai_api_base="https://api.chatanywhere.com.cn/v1",
             model_name="gpt-3.5-turbo")
result = llm(_input)
print(result)



Vanilla, Chocolate, Strawberry, Mint Chocolate Chip, Cookies and Cream


In [3]:
# Parse the model output with the output parser created earlier
output_parser.parse(result)

['Vanilla',
 'Chocolate',
 'Strawberry',
 'Mint Chocolate Chip',
 'Cookies and Cream']
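
A parser of this kind needs little more than the two methods described at the start of this section. The class below is a hypothetical plain-Python sketch, not LangChain's implementation, and the reply is canned so the block runs without a model.

```python
from typing import List


class CommaListParser:
    """Hypothetical parser exposing the two core output-parser methods."""

    def get_format_instructions(self) -> str:
        """Text appended to the prompt so the model knows the expected format."""
        return "Your response should be a list of comma separated values, eg: `foo, bar, baz`"

    def parse(self, text: str) -> List[str]:
        """Turn the model's raw reply into a list of stripped strings."""
        return [item.strip() for item in text.strip().split(",")]


parser = CommaListParser()
prompt = f"List three primary colors.\n{parser.get_format_instructions()}"
reply = "red, yellow, blue"  # canned reply standing in for a model call
parsed = parser.parse(reply)
print(prompt)
print(parsed)
```

The format instructions and the parsing logic live in the same class so that what the model is asked to produce and what the code expects stay in sync.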

#### Datetime parsing

In [7]:
from langchain.output_parsers import DatetimeOutputParser
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI

output_parser = DatetimeOutputParser()
format_instructions = output_parser.get_format_instructions()
# print(format_instructions)
template = """Answer the users question:

{question}

your anwser should be like above
{format_instructions}"""

prompt_template = PromptTemplate.from_template(
    template,
    partial_variables={"format_instructions": format_instructions},
)

# The API key is read from the OPENAI_API_KEY environment variable.
chain = LLMChain(prompt=prompt_template, llm=OpenAI(
    openai_api_base="https://api.chatanywhere.com.cn/v1",
    model_name="gpt-3.5-turbo"
))

output = chain.run("When did Hong Kong return to China?")
print(output)

1997-07-01T00:00:00.000000Z


In [8]:
output_parser.parse(output)

datetime.datetime(1997, 7, 1, 0, 0)

In [9]:
print(output_parser.parse(output))

1997-07-01 00:00:00
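
The parsing step can be reproduced with the standard library alone. The format string below is an assumption chosen to match the model output shown above, and `parse_datetime` is a hypothetical helper.

```python
from datetime import datetime

# Assumed format string; it matches the timestamp the model returned above.
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_datetime(text: str) -> datetime:
    """Parse a timestamp such as 1997-07-01T00:00:00.000000Z."""
    return datetime.strptime(text.strip(), DATETIME_FORMAT)


parsed = parse_datetime("1997-07-01T00:00:00.000000Z")
print(repr(parsed))
print(parsed)
```

If the model replies in any other layout, `strptime` raises `ValueError`, which is why the format instructions in the prompt matter.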


#### Structured output parsing
Use this output parser when you want the model to return multiple fields.

In [10]:
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI

# Define the response schema to parse
response_schemas = [
    ResponseSchema(name="answer", description="answer to the user's question"),
    ResponseSchema(name="source", description="source used to answer the user's question, should be a website.")
]

output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
format_instructions = output_parser.get_format_instructions()
prompt = PromptTemplate(
    template="answer the users question as best as possible.\n{format_instructions}\n{question}",
    input_variables=["question"],
    partial_variables={"format_instructions": format_instructions}
)

# The API key is read from the OPENAI_API_KEY environment variable.
llm=OpenAI(
    openai_api_base="https://api.chatanywhere.com.cn/v1",
    model_name="gpt-3.5-turbo"
)

_input = prompt.format_prompt(question="what's the capital of france?")
print(type(_input),_input.to_string(), sep="\n")


<class 'langchain.prompts.base.StringPromptValue'>
answer the users question as best as possible.
The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"answer": string  // answer to the user's question
	"source": string  // source used to answer the user's question, should be a website.
}
```
what's the capital of france?


In [11]:
output = llm(_input.to_string())
print(output)

```json
{
	"answer": "The capital of France is Paris.",
	"source": "https://en.wikipedia.org/wiki/Paris"
}
```
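
The raw output still has to be turned into a dictionary, which is the job of the parser's `parse` method. A standard-library sketch of what such a step involves is below; the helper `parse_json_block` is hypothetical and is not the library's implementation.

```python
import json
import re
from typing import Any, Dict, List

FENCE = "`" * 3  # three backticks, built here to avoid nesting code fences


def parse_json_block(text: str, expected_keys: List[str]) -> Dict[str, Any]:
    """Extract the first fenced json block, decode it, and check the required keys."""
    pattern = re.escape(FENCE) + r"json\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL)
    if match is None:
        raise ValueError("no fenced json block found")
    data = json.loads(match.group(1))
    missing = [key for key in expected_keys if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


# Same content as the model output shown above.
output = (
    FENCE + "json\n"
    '{\n\t"answer": "The capital of France is Paris.",\n'
    '\t"source": "https://en.wikipedia.org/wiki/Paris"\n}\n'
    + FENCE
)
result = parse_json_block(output, ["answer", "source"])
print(result["answer"])
print(result["source"])
```

Checking for the expected keys catches replies that are valid JSON but omit one of the requested fields.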
