# How to parse JSON output

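:::info Prerequisites

This guide assumes familiarity with the following concepts:
- [Chat models](/docs/concepts/chat_models)
- [Output parsers](/docs/concepts/output_parsers)
- [Prompt templates](/docs/concepts/prompt_templates)
- [Structured output](/docs/how_to/structured_output)
- [Chaining runnables together](/docs/how_to/sequence/)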

:::

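While some model providers support [built-in ways to return structured output](/docs/how_to/structured_output), not all do. We can use an output parser to help users specify an arbitrary JSON schema via the prompt, query a model for output that conforms to that schema, and finally parse that output as JSON.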

:::note
Keep in mind that large language models are leaky abstractions! You'll have to use an LLM with sufficient capacity to generate well-formed JSON.
:::

The [`JsonOutputParser`](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html) is one built-in option for prompting for and then parsing JSON output. While it is similar in functionality to the [`PydanticOutputParser`](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.pydantic.PydanticOutputParser.html), it also supports streaming back partial JSON objects.

Here's an example of how it can be used alongside [Pydantic](https://docs.pydantic.dev/) to conveniently declare the expected schema:

In [None]:
%pip install -qU langchain langchain-openai

import os
from getpass import getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass()

In [None]:
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field

model = ChatOpenAI(temperature=0)


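# Define your desired data structure.
class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")


# And a query intended to prompt a language model to populate the data structure.
joke_query = "Tell me a joke."

# Set up a parser + inject instructions into the prompt template.
parser = JsonOutputParser(pydantic_object=Joke)

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",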
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

chain = prompt | model | parser

chain.invoke({"query": joke_query})

{'setup': "Why couldn't the bicycle stand up by itself?",
 'punchline': 'Because it was two tired!'}

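Note that we are passing `format_instructions` from the parser directly into the prompt. You can and should experiment with adding your own formatting hints in the other parts of your prompt to either augment or replace the default instructions: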

In [3]:
parser.get_format_instructions()

'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"properties": {"setup": {"title": "Setup", "description": "question to set up a joke", "type": "string"}, "punchline": {"title": "Punchline", "description": "answer to resolve the joke", "type": "string"}}, "required": ["setup", "punchline"]}\n```'
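
As a rough illustration of adding your own formatting hint, the sketch below assembles a prompt string by hand from the schema shown in the format instructions plus one extra line. It uses only the standard library; `build_prompt`, `JOKE_SCHEMA`, and `EXTRA_HINT` are hypothetical names chosen for this example and are not part of LangChain.

```python
import json

# Schema copied from the format instructions shown above.
JOKE_SCHEMA = {
    "properties": {
        "setup": {
            "title": "Setup",
            "description": "question to set up a joke",
            "type": "string",
        },
        "punchline": {
            "title": "Punchline",
            "description": "answer to resolve the joke",
            "type": "string",
        },
    },
    "required": ["setup", "punchline"],
}

# Hypothetical extra hint (an assumption); tune the wording for your model.
EXTRA_HINT = "Return only the JSON object, with no surrounding commentary."


def build_prompt(query: str, schema: dict, extra_hint: str = "") -> str:
    """Assemble a prompt from a query, a JSON schema, and an optional hint."""
    parts = [
        "Answer the user query.",
        "The output should be formatted as a JSON instance that conforms "
        "to the JSON schema below.",
        json.dumps(schema),
    ]
    if extra_hint:
        parts.append(extra_hint)
    parts.append(query)
    return "\n".join(parts)


prompt_text = build_prompt("Tell me a joke.", JOKE_SCHEMA, EXTRA_HINT)
print(prompt_text)
```

In a real chain you would more likely place the extra hint directly in the `PromptTemplate` string next to `{format_instructions}`; the hand-built version here only makes the moving parts visible.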

## Streaming

As mentioned above, a key difference between `JsonOutputParser` and `PydanticOutputParser` is that `JsonOutputParser` supports streaming partial chunks. Here's what that looks like:

In [4]:
for s in chain.stream({"query": joke_query}):
    print(s)

{}
{'setup': ''}
{'setup': 'Why'}
{'setup': 'Why couldn'}
{'setup': "Why couldn't"}
{'setup': "Why couldn't the"}
{'setup': "Why couldn't the bicycle"}
{'setup': "Why couldn't the bicycle stand"}
{'setup': "Why couldn't the bicycle stand up"}
{'setup': "Why couldn't the bicycle stand up by"}
{'setup': "Why couldn't the bicycle stand up by itself"}
{'setup': "Why couldn't the bicycle stand up by itself?"}
{'setup': "Why couldn't the bicycle stand up by itself?", 'punchline': ''}
{'setup': "Why couldn't the bicycle stand up by itself?", 'punchline': 'Because'}
{'setup': "Why couldn't the bicycle stand up by itself?", 'punchline': 'Because it'}
{'setup': "Why couldn't the bicycle stand up by itself?", 'punchline': 'Because it was'}
{'setup': "Why couldn't the bicycle stand up by itself?", 'punchline': 'Because it was two'}
{'setup': "Why couldn't the bicycle stand up by itself?", 'punchline': 'Because it was two tired'}
{'setup': "Why couldn't the bicycle stand up by itself?", 'punchline': 'Because it was two tired!'}
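
To build intuition for how partial objects like these can be produced, here is a minimal standard-library sketch of best-effort partial JSON parsing: it closes any open string, object, or array in the text received so far and then tries `json.loads`. This is an illustrative assumption about the general technique, not LangChain's actual implementation; `parse_partial_json` and the sample `chunks` are hypothetical.

```python
import json


def parse_partial_json(text: str):
    """Best-effort parse of a possibly truncated JSON string.

    Returns the parsed value, or None if the repaired text is still invalid.
    """
    closers = []       # closing characters still owed, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]" and closers:
            closers.pop()

    candidate = text
    if in_string:
        if escaped:
            candidate = candidate[:-1]  # drop a dangling backslash
        candidate += '"'                # close the open string
    candidate += "".join(reversed(closers))
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None


# Simulated token stream (made-up chunks, for illustration only).
chunks = ['{"setup": "', "Why", " couldn't", '", "punchline": "', "Because", '"}']

buffer = ""
snapshots = []
for chunk in chunks:
    buffer += chunk
    parsed = parse_partial_json(buffer)
    # Emit only when the text parses and the result has changed.
    if parsed is not None and (not snapshots or parsed != snapshots[-1]):
        snapshots.append(parsed)
        print(parsed)
```

A key appears in the emitted object only once its value has started arriving, because text that ends in the middle of a key cannot be repaired into valid JSON by closing delimiters alone.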

## Without Pydantic

You can also use the `JsonOutputParser` without Pydantic. This will prompt the model to return JSON, but doesn't provide specifics about what the schema should be.

In [None]:
joke_query = "Tell me a joke."

parser = JsonOutputParser()

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

chain = prompt | model | parser

chain.invoke({"query": joke_query})

{'response': "Sure! Here's a joke for you: Why couldn't the bicycle stand up by itself? Because it was two tired!"}
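
Because no schema is supplied here, the model is free to choose its own keys, as the `response` key above shows. If downstream code expects particular fields, one option is a small check on the parsed dictionary. The sketch below is plain Python; `missing_keys` is a hypothetical helper, not a LangChain API.

```python
def missing_keys(result: dict, required: list) -> list:
    """Return the required keys that are absent from the parsed result."""
    return [key for key in required if key not in result]


# The schema-less output shown above.
result = {
    "response": "Sure! Here's a joke for you: Why couldn't the bicycle "
    "stand up by itself? Because it was two tired!"
}

missing = missing_keys(result, ["setup", "punchline"])
if missing:
    print(f"Parsed JSON is missing expected keys: {missing}")
```

When a fixed shape matters, passing a Pydantic model to the parser as in the first example is usually the simpler route.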

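## Next steps

You've now learned one way to prompt a model to return structured JSON. Next, check out the [broader guide on obtaining structured output](/docs/how_to/structured_output) for other related techniques.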