## Output parsers
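Language models output text. Often, though, we want more structured information than plain text.
Output parsers help structure the output of a language model.

An output parser has two main methods:
- "Get format instructions": returns a string that tells the language model how its output should be formatted.
- "Parse": takes a string (assumed to be the language model's response) and parses it into some structure.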
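The two methods can be illustrated without any library. The following is a minimal sketch, not LangChain's implementation: `CommaSeparatedParser` is a hypothetical class invented for this example, and only its two method names mirror the interface described above.

```python
from typing import List


class CommaSeparatedParser:
    """Hypothetical toy parser; only the two method names mirror the text above."""

    def get_format_instructions(self) -> str:
        # This string is meant to be embedded in the prompt sent to the model.
        return "Your response should be a list of comma separated values, e.g. `foo, bar, baz`"

    def parse(self, text: str) -> List[str]:
        # Turn the raw model response into a Python list, dropping empty items.
        return [item.strip() for item in text.split(",") if item.strip()]


parser_sketch = CommaSeparatedParser()
print(parser_sketch.get_format_instructions())
print(parser_sketch.parse("red, green , blue"))
```

The instructions go into the prompt and `parse` is applied to the response, so the two methods are designed as a pair: the parser can only rely on a format that the instructions actually asked for.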

In [1]:
from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI

from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field, validator
from typing import List
model = OpenAI()

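# Define the data structure we want the output to have.
class Joke(BaseModel):
    # description: "the question that sets up the joke"
    setup: str = Field(description="设定笑话的问题")
    # description: "the answer that resolves the joke"
    punchline: str = Field(description="解决笑话的答案")

    # Pydantic makes it easy to add custom validation logic.
    @validator('setup')
    def question_ends_with_question_mark(cls, field):
        # endswith() also handles an empty string, where field[-1] would raise IndexError.
        if not field.endswith('?'):
            raise ValueError("问题格式错误！")  # "Badly formed question!"
        return field

# Define our output parser.
parser = PydanticOutputParser(pydantic_object=Joke)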

In [2]:
print(parser.get_format_instructions())

The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.

Here is the output schema:
```
{"properties": {"setup": {"title": "Setup", "description": "\u8bbe\u5b9a\u7b11\u8bdd\u7684\u95ee\u9898", "type": "string"}, "punchline": {"title": "Punchline", "description": "\u89e3\u51b3\u7b11\u8bdd\u7684\u7b54\u6848", "type": "string"}}, "required": ["setup", "punchline"]}
```
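The `\uXXXX` sequences in the schema are the Chinese `description` strings in escaped form. Python's `json.dumps` produces such escapes by default (`ensure_ascii=True`), and it seems likely the schema was serialized that way, though that is an assumption about the library internals. The escaped and unescaped forms decode to the same text, as this standard-library sketch shows:

```python
import json

description = {"description": "笑话"}  # "joke"

escaped = json.dumps(description)                       # default: ensure_ascii=True
readable = json.dumps(description, ensure_ascii=False)  # keep non-ASCII characters

print(escaped)
print(readable)

# Both strings decode back to the same Python object.
assert json.loads(escaped) == json.loads(readable) == description
```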


In [4]:
prompt = PromptTemplate(
    # Template text: "Answer the user's input:" (stray "." after the colon removed)
    template="回答用户的输入：\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)
joke_query = "给我讲一个笑话？"  # "Tell me a joke?"
_input = prompt.format_prompt(query=joke_query)
output = model(_input.to_string())
print(output)

{"setup": "Why did the chicken cross the road?", "punchline": "To get to the other side!"}


In [6]:
parser.parse(output)

Joke(setup='Why did the chicken cross the road?', punchline='To get to the other side!')
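
Conceptually, the parse step decodes the JSON text and then validates the fields. The sketch below imitates this with the standard library only; it is not how `PydanticOutputParser` is implemented, and `JokeSketch` and `parse_joke` are hypothetical names introduced for illustration. It also shows the `setup` check rejecting a response that does not end with a question mark.

```python
import json
from dataclasses import dataclass


@dataclass
class JokeSketch:
    setup: str
    punchline: str

    def __post_init__(self):
        # Same rule as the validator on Joke: the setup must end with '?'.
        if not self.setup.endswith("?"):
            raise ValueError("Badly formed question!")


def parse_joke(text: str) -> JokeSketch:
    """Decode a JSON string and build a validated JokeSketch (hypothetical helper)."""
    data = json.loads(text)
    return JokeSketch(setup=data["setup"], punchline=data["punchline"])


good = '{"setup": "Why did the chicken cross the road?", "punchline": "To get to the other side!"}'
print(parse_joke(good))

bad = '{"setup": "A chicken crossed the road.", "punchline": "To get to the other side!"}'
try:
    parse_joke(bad)
except ValueError as err:
    print("rejected:", err)
```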

In [2]:
class CrowdInfoRequest(BaseModel):
    # description: "comma-separated list of crowd IDs, e.g. 1,2,3"
    crowd_ids: str = Field(description="使用逗号分隔的人群Id列表；比如：1,2,3")

parser = PydanticOutputParser(pydantic_object=CrowdInfoRequest)
print(parser.get_format_instructions())

The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.

Here is the output schema:
```
{"properties": {"crowd_ids": {"title": "Crowd Ids", "description": "\u4f7f\u7528\u9017\u53f7\u5206\u9694\u7684\u4eba\u7fa4Id\u5217\u8868\uff1b\u6bd4\u5982\uff1a1,2,3", "type": "string"}}, "required": ["crowd_ids"]}
```
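
Because `crowd_ids` is declared as a single string, the caller still has to split it after parsing. A minimal sketch, assuming the IDs are integers as in the `1,2,3` example; `split_crowd_ids` is a hypothetical helper, not part of LangChain:

```python
from typing import List


def split_crowd_ids(crowd_ids: str) -> List[int]:
    """Convert a comma-separated ID string such as "1,2,3" into a list of ints."""
    # Skip empty pieces so a trailing comma does not raise.
    return [int(part) for part in crowd_ids.split(",") if part.strip()]


print(split_crowd_ids("1,2,3"))    # [1, 2, 3]
print(split_crowd_ids("10, 20,"))  # [10, 20]
```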
