<a href="https://colab.research.google.com/github/sugarforever/LangChain-Tutorials/blob/main/LangChain_Output_Parsing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this notebook, we will learn how to use LangChain's output parsers to turn raw LLM text output into native Python data structures such as lists, enum members, and dictionaries.

In [1]:
# !pip install langchain openai -qU

# CommaSeparatedListOutputParser

In [2]:
import os
import sys
import openai
from dotenv import load_dotenv, find_dotenv
# sys.path.append("../..")

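# Load environment variables from a local/project .env file.

# find_dotenv() searches for the .env file and returns its path.
# load_dotenv() reads that .env file and loads its variables into the current environment.
# If you have set the environment variables globally, this line has no effect.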
print(find_dotenv())
_ = load_dotenv(find_dotenv())
# Check that the key is present without printing the secret itself.
assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"

from langchain import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.schema import (
    SystemMessage,
    HumanMessage,
    AIMessage
)

from langchain.output_parsers import CommaSeparatedListOutputParser
from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate

c:\Users\lenovo\Desktop\LangChainPlayGround\DeeperTutorials\.env


In [19]:
output_parser = CommaSeparatedListOutputParser()
format_instructions = output_parser.get_format_instructions()
prompt = PromptTemplate(
    template="List 3 main-stream {subject}.\n{format_instructions}",
    input_variables=["subject"],
    partial_variables={"format_instructions": format_instructions}
)
print(format_instructions)

llm = OpenAI(
    temperature = 0
)

_input = prompt.format(subject="programming language")
# _input = prompt.format(subject="music styles")
output = llm(_input)
print(output)

# Parse the raw completion string into a Python list.
parsed = output_parser.parse(output)
print(parsed)

# llm = ChatOpenAI()
# prompt = ChatPromptTemplate.from_template("List 3 main-stream {subject}.\n")

# chain = prompt | llm | output_parser
# print(chain.invoke("music styles"))

Your response should be a list of comma separated values, eg: `foo, bar, baz`


Java, C++, Python
['Java', 'C++', 'Python']
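
Conceptually, the parsing step above amounts to splitting the reply on commas and trimming whitespace. The sketch below reproduces that behaviour in plain Python so it can be run without an API key. `parse_comma_separated` is a hypothetical helper written for illustration, not LangChain's implementation, which may treat edge cases differently.

```python
from typing import List


def parse_comma_separated(text: str) -> List[str]:
    """Split a comma-separated reply into a list of trimmed, non-empty items."""
    return [item.strip() for item in text.strip().split(",") if item.strip()]


# Completion models often prepend blank lines to the reply, so strip them first.
raw_reply = "\n\nJava, C++, Python"
languages = parse_comma_separated(raw_reply)
print(languages)  # ['Java', 'C++', 'Python']
```
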


# EnumOutputParser

In [20]:
from langchain.output_parsers.enum import EnumOutputParser
from enum import Enum

class Genders(Enum):
    MALE = "male"
    FEMALE = "female"

In [22]:
output_parser = EnumOutputParser(enum=Genders)
format_instructions = output_parser.get_format_instructions()
prompt = PromptTemplate(
    template="Tell me the gender of the celebrity {name}.\n{format_instructions}",
    input_variables=["name"],
    partial_variables={"format_instructions": format_instructions}
)
print(format_instructions)

_input = prompt.format(name="Michael Jordan")
output = llm(_input)
print(output)

print(output_parser.parse(output))

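# Uncommenting the next line raises OutputParserException, because 'xyz' is not a value of Genders.
# output_parser.parse('xyz')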
print(output_parser)
print(output_parser.parse('male'))

Select one of the following options: male, female

male
Genders.MALE


OutputParserException: Response 'xyz' is not one of the expected values: ['male', 'female']
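
Because a reply outside the enum raises an exception, calling code usually needs a fallback. The sketch below shows the same idea with only the standard `enum` module: look the reply up by value and return `None` when it does not match. `parse_gender` is a hypothetical helper for illustration. With the real `EnumOutputParser`, you would catch the `OutputParserException` shown above instead of `ValueError`.

```python
from enum import Enum
from typing import Optional


class Genders(Enum):
    MALE = "male"
    FEMALE = "female"


def parse_gender(reply: str) -> Optional[Genders]:
    """Map a model reply to a Genders member, or None if it matches no value."""
    try:
        # Enum lookup by value raises ValueError for unknown values.
        return Genders(reply.strip())
    except ValueError:
        return None


print(parse_gender("\n\nmale"))  # Genders.MALE
print(parse_gender("xyz"))       # None
```
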

# StructuredOutputParser

In [26]:
from langchain.output_parsers import StructuredOutputParser, ResponseSchema

response_schemas = [
    ResponseSchema(name="answer", description="answer to the human's question"),
    ResponseSchema(name="source", description="source used to answer the human's question, should be a website.")
]
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
format_instructions = output_parser.get_format_instructions()
print(format_instructions)


prompt = PromptTemplate(
    template="answer the users question as best as possible.\n{format_instructions}\n{question}",
    input_variables=["question"],
    partial_variables={"format_instructions": format_instructions}
)


_input = prompt.format_prompt(question="what are the ingredients of milk?")
output = llm(_input.to_string())

print(output)


The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"answer": string  // answer to the human's question
	"source": string  // source used to answer the human's question, should be a website.
}
```


```json
{
	"answer": "Milk is typically made up of water, fat, proteins, lactose (sugar) and minerals.",
	"source": "https://www.dairynz.co.nz/nutrition/what-is-in-milk/"
}
```


In [27]:
json_data = output_parser.parse(output)

In [28]:
print(json_data['answer'])

Milk is typically made up of water, fat, proteins, lactose (sugar) and minerals.
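
The structured parser above turns a reply wrapped in a json-fenced markdown block into a dictionary. A minimal sketch of that extraction step, using only `re` and `json`, might look like the following. `extract_json_block` is a hypothetical helper, not LangChain's implementation, and the sample reply is the one printed earlier in this section.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to avoid nesting fences here


def extract_json_block(reply: str) -> dict:
    """Return the JSON object found inside the first json-fenced block of a reply."""
    pattern = re.escape(FENCE) + r"json\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, reply, re.DOTALL)
    if match is None:
        raise ValueError("no json-fenced block found in reply")
    return json.loads(match.group(1))


reply = (
    FENCE + "json\n"
    '{\n\t"answer": "Milk is typically made up of water, fat, proteins, lactose (sugar) and minerals.",\n'
    '\t"source": "https://www.dairynz.co.nz/nutrition/what-is-in-milk/"\n}\n'
    + FENCE
)
data = extract_json_block(reply)
print(data["source"])
```

Raising `ValueError` when no fenced block is found keeps a malformed reply from silently producing an empty result.
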
