# LangChain Core Modules: Model I/O

`Model I/O` is a standardized interface provided by LangChain for working with large language models (LLMs). It includes components for model inputs (Prompts), outputs (Output Parsers), and the models themselves (Models).

- Prompts: Templatize, dynamically select, and manage model inputs.

- Models: Call language models through a common interface.

- Output Parsers: Extract information from model outputs and normalize it.

![](../../images/model_io.jpeg)
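
The three stages can be sketched in plain Python. This is a toy illustration, not LangChain's API: `render_prompt`, `fake_model`, `parse_comma_list`, and `run_pipeline` are hypothetical stand-ins, and the "model" returns a canned string so the example runs offline.

```python
from typing import Callable, List


def render_prompt(template: str, **variables: str) -> str:
    """Prompt stage: fill a template with runtime values."""
    return template.format(**variables)


def fake_model(prompt: str) -> str:
    """Model stage: stand-in for an LLM call; returns a canned reply."""
    return "red, green , blue"


def parse_comma_list(text: str) -> List[str]:
    """Output-parser stage: normalize raw text into a clean list of strings."""
    return [item.strip() for item in text.split(",") if item.strip()]


def run_pipeline(model: Callable[[str], str], template: str, **variables: str) -> List[str]:
    """Chain the three stages: prompt -> model -> output parser."""
    prompt = render_prompt(template, **variables)
    raw_output = model(prompt)
    return parse_comma_list(raw_output)


colors = run_pipeline(fake_model, "List {n} {thing}, comma separated.", n="three", thing="colors")
print(colors)  # ['red', 'green', 'blue']
```

Because each stage only depends on the shape of its input and output, any one of them can be swapped (a different template, a real model, a JSON parser) without touching the other two.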

In [None]:
# LangChain Python SDK (https://github.com/langchain-ai/langchain)
# !pip install -U langchain

## Model Abstraction

- Language Models (LLMs): The core component of LangChain. LangChain does not provide its own LLMs; instead, it offers a standard interface for interacting with many different LLMs (OpenAI, Cohere, Hugging Face, etc.).

- Chat Models: A variant of language models. Although chat models use language models internally, their interface is slightly different. Rather than exposing a "text in, text out" API, they take chat messages as input and return chat messages as output.

(Note: Compare OpenAI's Completion API and Chat Completion API)
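
The difference between the two interfaces can be shown with a small offline sketch. Everything here is hypothetical: `Message`, `complete`, and `chat` are toy stand-ins, and the way messages are flattened into one prompt is for illustration only, not how any provider actually formats chat input.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    role: str      # e.g. "system", "user", "assistant"
    content: str


def complete(prompt: str) -> str:
    """Text in, text out: stand-in for a completion-style model."""
    return f"[completion for {len(prompt)} chars]"


def chat(messages: List[Message]) -> Message:
    """Messages in, message out: wraps the text model behind a chat interface."""
    # Illustrative flattening only; real providers use their own chat formats.
    prompt = "\n".join(f"{m.role}: {m.content}" for m in messages)
    return Message(role="assistant", content=complete(prompt))


reply = chat([
    Message("system", "You are a helpful assistant."),
    Message("user", "Hello"),
])
print(type(complete("Hello")).__name__)  # str
print(type(reply).__name__, reply.role)  # Message assistant
```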

## Language Models (LLMs)

Class inheritance hierarchy:

LangChain organizes its model abstractions using an inheritance-based structure:

```
BaseLanguageModel
├── BaseLLM (text-in, text-out language models)
│   └── Specific implementations (e.g., OpenAI, Cohere, AI21, HuggingFaceHub, etc.)
└── BaseChatModel (message-based chat models)
    └── ChatOpenAI, ChatAnthropic, etc.
```

This structure allows both LLMs and Chat Models to share common behavior while supporting interface differences.


Main abstractions:

```
LLMResult, PromptValue,
CallbackManagerForLLMRun, AsyncCallbackManagerForLLMRun,
CallbackManager, AsyncCallbackManager,
AIMessage, BaseMessage
```

**API Reference: https://api.python.langchain.com/en/latest/api_reference.html#module-langchain.llms**

### BaseLanguageModel Class

**https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/schema/language_model.py**

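This base class defines an interface for language models that lets users interact with a model in different ways (for example, through prompts or messages). `generate_prompt` is one of its main methods: it accepts a list of prompts and returns the model's generations.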

```python
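# Define the BaseLanguageModel abstract base class, which inherits from Serializable, Runnable, and ABC
class BaseLanguageModel(
    Serializable, Runnable[LanguageModelInput, LanguageModelOutput], ABC
):
    """
    Abstract base class for interfacing with language models.

    All language model wrappers should inherit from BaseLanguageModel.

    Exposes three main methods:
    - generate_prompt: generate language model outputs for a sequence of prompt values. A prompt value is a model input that can be converted to any language model input format (such as a string or messages).
    - predict: pass a single string to the language model and return a string prediction.
    - predict_messages: pass a sequence of BaseMessages (corresponding to a single model call) to the language model and return a BaseMessage prediction.

    Each of these methods has an asynchronous counterpart.
    """

    # Abstract method generate_prompt; subclasses must implement it
    @abstractmethod
    def generate_prompt(
        self,
        prompts: List[PromptValue],  # list of input prompts
        stop: Optional[List[str]] = None,  # stop words to use when generating
        callbacks: Callbacks = None,  # callbacks for extra functionality such as logging or streaming
        **kwargs: Any,  # arbitrary extra keyword arguments, usually passed to the model provider's API call
    ) -> LLMResult:
        """
        Pass a sequence of prompts to the model and return the model's generations.

        For models that expose a batched API, this method should use batched calls.

        Use this method when you:
            1. want to take advantage of batched calls,
            2. need more output from the model than just the top generated value,
            3. are building chains that are agnostic to the underlying language model type (e.g., pure text completion models vs. chat models).

        Args:
            prompts: List of prompt values. A prompt value is an object that can be converted to match the format of any language model (a string for pure text generation models, BaseMessages for chat models).
            stop: Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
            callbacks: Callbacks to pass through. Used for executing additional functionality, such as logging or streaming, during generation.
            **kwargs: Arbitrary additional keyword arguments. These are usually passed to the model provider's API call.

        Returns:
            An LLMResult, which contains a list of candidate generations for each input prompt plus additional model-provider-specific output.
        """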
```
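
Point 3 in the docstring (chains that are agnostic to the model type) rests on prompt values being convertible to either a string or a list of messages. A minimal sketch of that idea follows. All `Toy*` classes and `first_generations` are hypothetical stand-ins written for this example, and the return type is simplified to nested lists of strings rather than a real `LLMResult`.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToyPromptValue:
    """Hypothetical stand-in for a prompt value: one input, two renderings."""
    text: str

    def to_string(self) -> str:
        return self.text

    def to_messages(self) -> List[Tuple[str, str]]:
        return [("human", self.text)]


class ToyLanguageModel(ABC):
    @abstractmethod
    def generate_prompt(self, prompts: List[ToyPromptValue]) -> List[List[str]]:
        """Return one list of candidate generations per input prompt."""


class ToyTextModel(ToyLanguageModel):
    def generate_prompt(self, prompts):
        # A text-completion model consumes the string rendering.
        return [[f"text-model saw: {p.to_string()}"] for p in prompts]


class ToyChatModel(ToyLanguageModel):
    def generate_prompt(self, prompts):
        # A chat model consumes the message rendering.
        results = []
        for p in prompts:
            role, content = p.to_messages()[-1]
            results.append([f"chat-model saw {role} message: {content}"])
        return results


def first_generations(model: ToyLanguageModel, questions: List[str]) -> List[str]:
    """Model-agnostic caller: works unchanged with either model type."""
    prompts = [ToyPromptValue(q) for q in questions]
    return [candidates[0] for candidates in model.generate_prompt(prompts)]


for model in (ToyTextModel(), ToyChatModel()):
    print(first_generations(model, ["What is 2 + 2?"]))
```

The caller never checks which kind of model it was given; the prompt value carries both renderings and each model picks the one it needs.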


### BaseLLM Class

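**Implementation: https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/base.py**

This code defines an abstract base class named BaseLLM. Its main purpose is to provide a basic interface for working with large language models (LLMs).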

```python
# Define the BaseLLM abstract base class, which inherits from BaseLanguageModel[str] and ABC (Abstract Base Class)
class BaseLLM(BaseLanguageModel[str], ABC):
    """Base LLM abstract interface.

    It should take in a prompt and return a string."""

    # Optional cache setting; defaults to None
    cache: Optional[bool] = None

    # verbose controls whether the response text is printed;
    # its default comes from the _get_verbosity function
    verbose: bool = Field(default_factory=_get_verbosity)
    """Whether to print out response text."""

    # callbacks defaults to None and is excluded from serialization
    callbacks: Callbacks = Field(default=None, exclude=True)

    # callback_manager defaults to None and is excluded from serialization
    callback_manager: Optional[BaseCallbackManager] = Field(default=None, exclude=True)

    # tags are added to the run trace; defaults to None and is excluded from serialization
    tags: Optional[List[str]] = Field(default=None, exclude=True)
    """Tags to add to the run trace."""

    # metadata is added to the run trace; defaults to None and is excluded from serialization
    metadata: Optional[Dict[str, Any]] = Field(default=None, exclude=True)
    """Metadata to add to the run trace."""

    # Inner class holding the configuration for this pydantic object
    class Config:
        """Configuration for this pydantic object."""

        # Allow arbitrary types
        arbitrary_types_allowed = True
```
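This base class relies on Pydantic, in particular the `Field` function, to define default values and serialization behavior. Subclasses of BaseLLM must provide the methods that implement the actual functionality.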

### LLM Class

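**Implementation: https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/base.py**

This code defines a class named LLM that inherits from BaseLLM. Its purpose is to give users a simpler interface for working with LLMs, without requiring them to implement the full `_generate` method.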

```python

# The LLM class, which inherits from BaseLLM
class LLM(BaseLLM):
    """Base LLM abstract class.

    The purpose of this class is to expose a simpler interface for working
    with LLMs, rather than expect the user to implement the full _generate method.
    """

    # Abstract method declared with @abstractmethod; subclasses must implement it
    @abstractmethod
    def _call(
        self,
        prompt: str,  # input prompt
        stop: Optional[List[str]] = None,  # list of stop words
        run_manager: Optional[CallbackManagerForLLMRun] = None,  # run manager
        **kwargs: Any,  # other keyword arguments
    ) -> str:
        """Run the LLM on the given prompt and input."""
        # The implementation is provided by subclasses

    # _generate handles multiple prompts by calling _call once per prompt
    def _generate(
        self,
        prompts: List[str],  # list of input prompts
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> LLMResult:
        """Run the LLM on the given prompt and input."""
        # TODO: add caching logic here
        generations = []  # collects the generated text
        # Check whether the signature of _call supports the run_manager parameter
        new_arg_supported = inspect.signature(self._call).parameters.get("run_manager")
        for prompt in prompts:  # iterate over each prompt
            # Choose the call form depending on whether run_manager is supported
            text = (
                self._call(prompt, stop=stop, run_manager=run_manager, **kwargs)
                if new_arg_supported
                else self._call(prompt, stop=stop, **kwargs)
            )
            # Append the generated text to the generations list
            generations.append([Generation(text=text)])
        # Return an LLMResult containing the generations list
        return LLMResult(generations=generations)
```
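
The pattern above (implement one small `_call`, inherit a looping `_generate` that inspects the signature for backward compatibility) can be tried in isolation. The sketch below is a simplified imitation, not LangChain code: `ToyLLM`, `LegacyEcho`, and `ModernEcho` are hypothetical, and the result is a plain nested list instead of an `LLMResult`.

```python
import inspect
from typing import List, Optional


class ToyLLM:
    """Hypothetical mini base class: subclasses implement only _call."""

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs) -> str:
        raise NotImplementedError

    def generate(self, prompts: List[str], stop: Optional[List[str]] = None,
                 run_manager: Optional[object] = None) -> List[List[str]]:
        # Only pass run_manager if the subclass's _call declares that parameter.
        accepts_run_manager = "run_manager" in inspect.signature(self._call).parameters
        generations = []
        for prompt in prompts:
            if accepts_run_manager:
                text = self._call(prompt, stop=stop, run_manager=run_manager)
            else:
                text = self._call(prompt, stop=stop)
            generations.append([text])  # one candidate per prompt
        return generations


class LegacyEcho(ToyLLM):
    # Older-style signature without run_manager.
    def _call(self, prompt, stop=None):
        return prompt.upper()


class ModernEcho(ToyLLM):
    # Newer-style signature that accepts run_manager.
    def _call(self, prompt, stop=None, run_manager=None):
        return f"{prompt} (run_manager={run_manager})"


print(LegacyEcho().generate(["hi", "there"]))               # [['HI'], ['THERE']]
print(ModernEcho().generate(["hi"], run_manager="tracer"))  # [['hi (run_manager=tracer)']]
```

The signature check lets the base class add a new parameter without breaking subclasses written before that parameter existed.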

### Supported LLM Integrations

**Developer documentation: https://python.langchain.com/docs/integrations/llms/**

**BaseOpenAI API reference: https://python.langchain.com/api_reference/openai/llms/langchain_openai.llms.base.BaseOpenAI.html#langchain_openai.llms.base.BaseOpenAI**

**Implementation: https://github.com/langchain-ai/langchain/tree/master/libs/langchain/langchain/llms**

### Calling the OpenAI GPT Completion API with LangChain

**Implementation: https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/openai.py**

#### BaseOpenAI Class

```python
class BaseOpenAI(BaseLLM):
    """Base class for OpenAI large language models."""

    @property
    def lc_secrets(self) -> Dict[str, str]:
        return {"openai_api_key": "OPENAI_API_KEY"}

    @property
    def lc_serializable(self) -> bool:
        return True

    client: Any  #: :meta private:
    model_name: str = Field("text-davinci-003", alias="model")  # Model name to use.
    temperature: float = 0.7  # Sampling temperature to use.
    max_tokens: int = 256  # Maximum number of tokens to generate in the completion. -1 returns as many tokens as possible given the prompt and the model's maximum context size.
    top_p: float = 1  # Total probability mass of tokens to consider at each step.
    frequency_penalty: float = 0  # Penalizes repeated tokens according to frequency.
    presence_penalty: float = 0  # Penalizes repeated tokens.
    n: int = 1  # How many completions to generate for each prompt.
    best_of: int = 1  # Generates best_of completions server-side and returns the "best".
    model_kwargs: Dict[str, Any] = Field(default_factory=dict)
    openai_api_key: Optional[str] = None
    openai_api_base: Optional[str] = None
    openai_organization: Optional[str] = None
    # Support an explicit proxy for OpenAI
    openai_proxy: Optional[str] = None
    batch_size: int = 20
    request_timeout: Optional[Union[float, Tuple[float, float]]] = None
    logit_bias: Optional[Dict[str, float]] = Field(default_factory=dict)
    max_retries: int = 6
    streaming: bool = False
    allowed_special: Union[Literal["all"], AbstractSet[str]] = set()
    disallowed_special: Union[Literal["all"], Collection[str]] = "all"
    tiktoken_model_name: Optional[str] = None
```
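
One field worth illustrating is `batch_size`. A plausible reading, and an assumption here rather than something shown in the excerpt, is that a list of prompts is split into chunks of at most `batch_size` before requests are sent. `chunk_prompts` is a hypothetical helper written for this sketch.

```python
from typing import List


def chunk_prompts(prompts: List[str], batch_size: int = 20) -> List[List[str]]:
    """Split prompts into consecutive chunks of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [prompts[i:i + batch_size] for i in range(0, len(prompts), batch_size)]


prompts = [f"prompt {i}" for i in range(45)]
batches = chunk_prompts(prompts)  # default of 20 mirrors the field above
print([len(batch) for batch in batches])  # [20, 20, 5]
```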

### Examples

In [36]:
import os
from dotenv import load_dotenv

# Load OPENAI_API_KEY (and any other settings) from a local .env file, if present.
load_dotenv()


In [9]:
from langchain_openai import OpenAI

llm = OpenAI(model_name="gpt-4o-mini")

In [10]:
print(llm.invoke("Tell me a Joke"))

." The assistant responds, "Why did the scarecrow win an award? Because he was outstanding in his field!" 

This interaction captures the essence of the user asking for humor and the assistant providing it in a light-hearted manner. If you have any specific requests or topics you'd like a joke about, feel free to let me know!


### Comparison: Calling the OpenAI API Directly

In [14]:
from openai import OpenAI

client = OpenAI()
completion = client.completions.create(model='gpt-4o-mini', prompt="Tell me a Joke")

print(completion.choices[0].text)
print(dict(completion).get('usage'))
print(completion.model_dump_json(indent=2))

!
Why did the scarecrow win an award? 
Because he was outstanding in
CompletionUsage(completion_tokens=16, prompt_tokens=4, total_tokens=20, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0))
{
  "id": "cmpl-BejNXkQal4ITCPK79LORNtvwdF5hM",
  "choices": [
    {
      "finish_reason": "length",
      "index": 0,
      "logprobs": null,
      "text": "!\nWhy did the scarecrow win an award? \nBecause he was outstanding in"
    }
  ],
  "created": 1749047255,
  "model": "gpt-4o-mini-2024-07-18",
  "object": "completion",
  "system_fingerprint": "fp_34a54ae93c",
  "usage": {
    "completion_tokens": 16,
    "prompt_tokens": 4,
    "total_tokens": 20,
    "completion_tokens_details": {
      "accepted_prediction_tokens": 0,
      "audio_tokens": 0,
      "reasoning_tokens": 0,
      "rejected_prediction_tokens": 0
    },
  

In [15]:
llm.max_tokens

256

In [16]:
llm.max_tokens = 1024
llm.max_tokens

1024

In [17]:
result = llm.invoke("generate a quick sort algorithm in python")
print(result)


def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)

# Example usage
arr = [3, 6, 8, 10, 1, 2, 1]
sorted_arr = quick_sort(arr)
print(sorted_arr)  # Output: [1, 1, 2, 3, 6, 8, 10]


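LangChain's LLM abstraction keeps the OpenAI connection state (parameter settings) on the `llm` object, so the `max_tokens` value set above still applies to later calls.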
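
The idea can be sketched without any network calls. `ToyCompletionClient` is a hypothetical class, not LangChain's implementation: it stores sampling parameters as attributes and reads them at call time, so changing `max_tokens` once affects every later request.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class ToyCompletionClient:
    model: str = "gpt-4o-mini"
    temperature: float = 0.7
    max_tokens: int = 256

    def build_request(self, prompt: str) -> Dict[str, Any]:
        """Assemble request parameters from the object's current state."""
        return {
            "model": self.model,
            "prompt": prompt,
            "temperature": self.temperature,
            "max_tokens": self.max_tokens,
        }


client = ToyCompletionClient()
before = client.build_request("Tell me a Joke")
client.max_tokens = 1024  # set once on the object
after = client.build_request("generate a quick sort algorithm in python")
print(before["max_tokens"], after["max_tokens"])  # 256 1024
```

With the direct API call, by contrast, parameters such as `max_tokens` have to be passed again on every request.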

In [None]:
result = llm.invoke("Tell me a Joke")
print(result)

## Chat Models

Class inheritance hierarchy:

```
BaseChatModel
└── ChatOpenAI, ChatAnthropic, etc.
```

Main abstractions:

```
AIMessage, BaseMessage, HumanMessage
```

### BaseChatModel Class

**Implementation: https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/chat_models/base.py**


```python
class BaseChatModel(BaseLanguageModel[BaseMessageChunk], ABC):
    cache: Optional[bool] = None
    """Whether to cache the response."""
    verbose: bool = Field(default_factory=_get_verbosity)
    """Whether to print out response text."""
    callbacks: Callbacks = Field(default=None, exclude=True)
    """Callbacks to add to the run trace."""
    callback_manager: Optional[BaseCallbackManager] = Field(default=None, exclude=True)
    """Callback manager to add to the run trace."""
    tags: Optional[List[str]] = Field(default=None, exclude=True)
    """Tags to add to the run trace."""
    metadata: Optional[Dict[str, Any]] = Field(default=None, exclude=True)
    """Metadata to add to the run trace."""

    # Abstract method _generate; subclasses must implement it
    @abstractmethod
    def _generate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> ChatResult:
        ...  # body omitted in this excerpt; provided by subclasses
```

### ChatOpenAI Class (Calling the Chat Completion API)

**Implementation: https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/chat_models/openai.py**

```python
class ChatOpenAI(BaseChatModel):
    """Wrapper around OpenAI Chat large language models."""

    @property
    def lc_secrets(self) -> Dict[str, str]:
        return {"openai_api_key": "OPENAI_API_KEY"}

    @property
    def lc_serializable(self) -> bool:
        return True

    client: Any = None  #: :meta private:
    model_name: str = Field(default="gpt-3.5-turbo", alias="model")
    temperature: float = 0.7
    model_kwargs: Dict[str, Any] = Field(default_factory=dict)
    openai_api_key: Optional[str] = None
    # Base URL path for API requests;
    # leave blank if not using a proxy or service emulator.
    openai_api_base: Optional[str] = None
    openai_organization: Optional[str] = None
    # Support an explicit proxy for OpenAI
    openai_proxy: Optional[str] = None
    request_timeout: Optional[Union[float, Tuple[float, float]]] = None
    """Timeout for requests to the OpenAI completion API. Defaults to 600 seconds."""
    max_retries: int = 6
    streaming: bool = False
    n: int = 1
    max_tokens: Optional[int] = None
    tiktoken_model_name: Optional[str] = None
```

In [None]:
from openai import OpenAI

client = OpenAI()
completion = client.chat.completions.create(
    model='gpt-4o-mini', 
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)


In [22]:
completion

ChatCompletion(id='chatcmpl-BfIZgYV7ntRJRgbZpeWSSfQjeg7Qk', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='The 2020 World Series was played at Globe Life Field in Arlington, Texas. This was notable because it was a neutral site due to the COVID-19 pandemic, and it was the first time a World Series was played at a neutral location since the World Series started in 1903.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None, annotations=[]))], created=1749182548, model='gpt-4o-mini-2024-07-18', object='chat.completion', service_tier='default', system_fingerprint='fp_34a54ae93c', usage=CompletionUsage(completion_tokens=59, prompt_tokens=53, total_tokens=112, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))

In [24]:
print(completion.choices)
print(dict(completion).get('usage'))
print(completion.model_dump_json(indent=2))

[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='The 2020 World Series was played at Globe Life Field in Arlington, Texas. This was notable because it was a neutral site due to the COVID-19 pandemic, and it was the first time a World Series was played at a neutral location since the World Series started in 1903.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None, annotations=[]))]
CompletionUsage(completion_tokens=59, prompt_tokens=53, total_tokens=112, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0))
{
  "id": "chatcmpl-BfIZgYV7ntRJRgbZpeWSSfQjeg7Qk",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "The 2020 World Series was played at Globe Life Field in Arl

In [28]:
from langchain_openai import ChatOpenAI

chat_model = ChatOpenAI(model_name="gpt-4o-mini")

In [29]:
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage

# Same conversation as the direct API call above, expressed as message objects.
messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="Who won the world series in 2020?"),
    AIMessage(content="The Los Angeles Dodgers won the World Series in 2020."),
    HumanMessage(content="Where was it played?"),
]

In [30]:
print(messages)

[SystemMessage(content='You are a helpful assistant.', additional_kwargs={}, response_metadata={}), HumanMessage(content='Who won the world series in 2020?', additional_kwargs={}, response_metadata={}), AIMessage(content='The Los Angeles Dodgers won the World Series in 2020.', additional_kwargs={}, response_metadata={}), HumanMessage(content='Where was it played?', additional_kwargs={}, response_metadata={})]
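
The message objects line up closely with the role/content dictionaries used in the direct API call earlier. The sketch below makes that correspondence explicit. It uses a small stand-in dataclass instead of importing LangChain, and the `ROLE_BY_TYPE` table is an assumption inferred from the two examples in this section, not LangChain's own conversion code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ToyMessage:
    type: str      # "system", "human", or "ai"
    content: str


# Assumed mapping between message types and Chat Completion roles.
ROLE_BY_TYPE = {"system": "system", "human": "user", "ai": "assistant"}


def to_openai_dicts(messages: List[ToyMessage]) -> List[Dict[str, str]]:
    """Convert typed messages into role/content dictionaries."""
    return [{"role": ROLE_BY_TYPE[m.type], "content": m.content} for m in messages]


toy_messages = [
    ToyMessage("system", "You are a helpful assistant."),
    ToyMessage("human", "Who won the world series in 2020?"),
    ToyMessage("ai", "The Los Angeles Dodgers won the World Series in 2020."),
    ToyMessage("human", "Where was it played?"),
]
for item in to_openai_dicts(toy_messages):
    print(item["role"], "->", item["content"])
```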


In [31]:
chat_model.invoke(messages)

AIMessage(content='The 2020 World Series was played at Globe Life Field in Arlington, Texas. This was notable because it was the first time the World Series was played at a neutral site due to the COVID-19 pandemic.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 43, 'prompt_tokens': 53, 'total_tokens': 96, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_34a54ae93c', 'finish_reason': 'stop', 'logprobs': None}, id='run--ca2141fb-f034-4f9e-a690-3ea711dab7fc-0', usage_metadata={'input_tokens': 53, 'output_tokens': 43, 'total_tokens': 96, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [33]:
chat_result = chat_model.invoke(messages)

In [34]:
print(chat_result.model_dump_json(indent=2))

{
  "content": "The 2020 World Series was played at Globe Life Field in Arlington, Texas. This was a deviation from the traditional site of alternating games between the two teams' home stadiums, as it was held at a neutral site due to the COVID-19 pandemic.",
  "additional_kwargs": {
    "refusal": null
  },
  "response_metadata": {
    "token_usage": {
      "completion_tokens": 53,
      "prompt_tokens": 53,
      "total_tokens": 106,
      "completion_tokens_details": {
        "accepted_prediction_tokens": 0,
        "audio_tokens": 0,
        "reasoning_tokens": 0,
        "rejected_prediction_tokens": 0
      },
      "prompt_tokens_details": {
        "audio_tokens": 0,
        "cached_tokens": 0
      }
    },
    "model_name": "gpt-4o-mini-2024-07-18",
    "system_fingerprint": "fp_62a23a81ef",
    "finish_reason": "stop",
    "logprobs": null
  },
  "type": "ai",
  "name": null,
  "id": "run--b3870ece-7848-4de4-9f7a-845c39334759-0",
  "example": false,
  "tool_calls": [],
  

In [35]:
type(chat_result)

langchain_core.messages.ai.AIMessage
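
Because the result is a structured message rather than a bare string, token accounting can be read off it. The sketch below works on a plain dictionary copied from the `usage_metadata` shown in the `In [31]` output above, so it runs without an API key. `summarize_usage` is a hypothetical helper.

```python
from typing import Dict


def summarize_usage(usage: Dict[str, int]) -> str:
    """Format token counts and check that the reported total is consistent."""
    total = usage["input_tokens"] + usage["output_tokens"]
    if total != usage["total_tokens"]:
        raise ValueError("total_tokens does not match input + output")
    return f"{usage['input_tokens']} in + {usage['output_tokens']} out = {total} tokens"


# Values copied from the usage_metadata of the AIMessage printed earlier.
usage_metadata = {"input_tokens": 53, "output_tokens": 43, "total_tokens": 96}
print(summarize_usage(usage_metadata))  # 53 in + 43 out = 96 tokens
```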