<a href="https://colab.research.google.com/github/ckjen168/LLMColab/blob/main/3_structured_output.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Structured Output

This example is based on the [LangChain quickstart](https://python.langchain.com/docs/introduction/) and the book [LangChain開發手冊 (旗標)](https://www.tenlong.com.tw/products/9789863127918).

In [None]:
!pip install langchain --quiet
!pip install langchain_openai --quiet
!pip install rich --quiet

In [None]:
import os
from google.colab import userdata

if not os.environ.get("OPENAI_API_KEY"):
  os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')

In [None]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini")

## Structured Output & Output Parsers

In [None]:
from langchain_core.output_parsers import StrOutputParser, JsonOutputParser
str_parser = StrOutputParser()

In [None]:
# Prompt: "Please give the name of a country and its capital, in Traditional Chinese"
message = model.invoke("請提供一個國家的名稱和首都, 使用繁體中文")
print(message.content)

國家名稱：日本  
首都：東京


In [None]:
str_parser.invoke(message)

'國家名稱：日本  \n首都：東京'

### JSON Parser

In [None]:
json_parser = JsonOutputParser()
format_instructions = json_parser.get_format_instructions()
print(format_instructions)

Return a JSON object.


In [None]:
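# Prompt: "Please give the name of a country and its capital, <format instructions>, in the language used in Taiwan"
message = model.invoke("請提供一個國家的名稱和首都,"
                    f"{format_instructions}, 使用台灣語言")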
print(message.content)

```json
{
  "國家": "日本",
  "首都": "東京"
}
```


In [None]:
json_output = json_parser.invoke(message)
print(json_output)

{'國家': '日本', '首都': '東京'}
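
The model wrapped its JSON in a Markdown code fence, yet `json_parser.invoke` still returned a plain dict. As a rough, standard-library-only sketch of that kind of cleanup (the helper name `parse_fenced_json` is hypothetical, and this is not LangChain's actual implementation), one could strip an optional fence before calling `json.loads`:

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Hypothetical helper, not part of LangChain: matches a reply that is wrapped
# in an optional Markdown code fence and captures the text inside it.
FENCE_RE = re.compile(r"^`{3}(?:json)?\s*\n(.*?)\n?`{3}\s*$", re.DOTALL)

def parse_fenced_json(text: str):
    stripped = text.strip()
    match = FENCE_RE.match(stripped)
    # Fall back to the raw text when there is no fence around it.
    payload = match.group(1) if match else stripped
    return json.loads(payload)

reply = FENCE + 'json\n{\n  "國家": "日本",\n  "首都": "東京"\n}\n' + FENCE
print(parse_fenced_json(reply))
# → {'國家': '日本', '首都': '東京'}
```

The sketch only handles a single fence around the whole reply; text before or after the fence would need extra handling.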


### CSV Parser

In [None]:
from langchain_core.output_parsers import (
    CommaSeparatedListOutputParser)

In [None]:
list_parser = CommaSeparatedListOutputParser()
print(list_parser.get_format_instructions())

Your response should be a list of comma separated values, eg: `foo, bar, baz` or `foo,bar,baz`


In [34]:
from langchain_core.prompts import PromptTemplate

# Build the prompt template ("Name well-known attractions in the country {country}")
prompt = PromptTemplate.from_template(
    "請說出國家{country}的知名景點\n{instructions}"
).partial(instructions=list_parser.get_format_instructions())
response = model.invoke(prompt.format(country='台灣'))  # 台灣 = Taiwan
print(response.content)

臺北101, 故宮博物院, 九份老街, 阿里山, 日月潭, 士林夜市, 高雄哈瑪星, 垦丁国家公园, 太魯閣國家公園, 鹿港小镇


In [35]:
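from rich import print as pprint  # needed here: pprint is otherwise not imported until a later cell
pprint(list_parser.invoke(response))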
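
For intuition, the parsing step amounts to splitting the reply on commas and trimming whitespace. Below is a minimal standard-library sketch, applied to the first few items of the reply above. The function name `split_comma_list` is hypothetical, and LangChain's own parser may treat quoting and edge cases differently.

```python
import csv

def split_comma_list(text: str) -> list[str]:
    # csv.reader copes with quoted items that themselves contain commas;
    # skipinitialspace drops the space that follows each comma.
    rows = csv.reader([text.strip()], skipinitialspace=True)
    return [item.strip() for row in rows for item in row if item.strip()]

print(split_comma_list("臺北101, 故宮博物院, 九份老街"))
# → ['臺北101', '故宮博物院', '九份老街']
```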

### Using Pydantic to Define a Custom Class

#### Travel Plan

In [36]:
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import PromptTemplate, ChatPromptTemplate
from pydantic import BaseModel, Field, model_validator
from typing import List
from rich import print as pprint

In [37]:
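class TravelPlan(BaseModel):
    # The Field descriptions are included in the parser's format instructions.
    destination: str = Field(description="旅遊目的地, 如日本北海道")  # travel destination, e.g. Hokkaido, Japan
    activities: List[str] = Field(description="推薦的活動")  # recommended activities
    budget: float = Field(description="預算範圍,單位新台幣")  # budget range, in New Taiwan dollars
    accommodation: List[str] = Field(description="住宿選項")  # accommodation options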

In [38]:
parser = PydanticOutputParser(pydantic_object=TravelPlan)
format_instructions = parser.get_format_instructions()
pprint(format_instructions)

In [39]:
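# System prompt: "Use Traditional Chinese and recommend a suitable travel plan based on the user's request"
prompt = ChatPromptTemplate.from_messages(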
    [("system","使用繁體中文並根據使用者要求推薦出適合的旅遊計劃,\n"
               "{format_instructions}"),
     ("human","{query}")
    ]
)
new_prompt = prompt.partial(format_instructions=format_instructions)

In [40]:
# Query: "I enjoy diving and walking at sunset, so I would like to plan a seaside holiday"
user_query = "我喜歡潛水以及在日落時散步, 所以想要安排一個海邊假期"
user_prompt = new_prompt.invoke({"query": user_query})
response = model.invoke(user_prompt)
pprint(response.content)

In [41]:
parser_output = parser.invoke(response)
pprint(parser_output)
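
Conceptually, the parser turns the model's JSON reply into a typed object and rejects replies that do not fit the schema. The sketch below imitates that idea with the standard library only. `TravelPlanSketch` and `parse_travel_plan` are hypothetical stand-ins, not Pydantic or LangChain code, and the sample reply is made up for illustration.

```python
import json
from dataclasses import dataclass, fields
from typing import List

@dataclass
class TravelPlanSketch:
    # Standard-library stand-in for the TravelPlan model above (not Pydantic).
    destination: str
    activities: List[str]
    budget: float
    accommodation: List[str]

def parse_travel_plan(raw_json: str) -> TravelPlanSketch:
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    # Reject replies that leave out any field of the schema.
    missing = sorted({f.name for f in fields(TravelPlanSketch)} - set(data))
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return TravelPlanSketch(
        destination=str(data["destination"]),
        activities=[str(item) for item in data["activities"]],
        budget=float(data["budget"]),  # coerces an int such as 50000 to 50000.0
        accommodation=[str(item) for item in data["accommodation"]],
    )

# Made-up reply for illustration only; this is not actual model output.
sample = ('{"destination": "Kenting", "activities": ["diving"], '
          '"budget": 50000, "accommodation": ["beach hotel"]}')
plan = parse_travel_plan(sample)
print(plan.budget)
# → 50000.0
```

Pydantic performs far richer validation and coercion than this; the sketch only shows why a schema-backed parser is more useful than a bare `json.loads`.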

#### Use @model_validator to check the output format

In [42]:
# Define your desired data structure.
class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

    # You can add custom validation logic easily with Pydantic.
    @model_validator(mode="before")
    @classmethod
    def question_ends_with_question_mark(cls, values: dict) -> dict:
        setup = values.get("setup")
        if setup and setup[-1] != "?":
            raise ValueError("Badly formed question!")
        return values


# Set up a parser + inject instructions into the prompt template.
parser = PydanticOutputParser(pydantic_object=Joke)

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

In [43]:
# And a query intended to prompt a language model to populate the data structure.
prompt_and_model = prompt | model
output = prompt_and_model.invoke({"query": "Tell me a joke about cats"})
parser.invoke(output)

Joke(setup='Why was the cat sitting on the computer?', punchline='Because it wanted to keep an eye on the mouse!')

In [44]:
# Query: "Write a business plan for a beverage shop"
output = prompt_and_model.invoke({"query": "寫一份飲料店創業計畫書"})

In [45]:
parser.invoke(output)

OutputParserException: Failed to parse Joke from completion {"setup": "\u5982\u4f55\u5728\u7af6\u722d\u6fc0\u70c8\u7684\u5e02\u5834\u4e2d\u958b\u8a2d\u4e00\u5bb6\u6210\u529f\u7684\u98f2\u6599\u5e97\uff1f", "punchline": "\u901a\u904e\u5275\u65b0\u98f2\u54c1\u3001\u512a\u8cea\u670d\u52d9\u548c\u6709\u6548\u7684\u5e02\u5834\u884c\u92b7\u7b56\u7565\u4f86\u5438\u5f15\u9867\u5ba2\u3002"}. Got: 1 validation error for Joke
  Value error, Badly formed question! [type=value_error, input_value={'setup': '如何在競...略來吸引顧客。'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.10/v/value_error
For troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/OUTPUT_PARSING_FAILURE 
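
Note that the `setup` string in this error ends with `\uff1f`, the full-width question mark (？), so the validator's comparison against the ASCII `?` fails even though the text reads as a question. If rejecting such text is not the intent, one option is to accept both characters. A minimal sketch of that check follows. The helper name `ends_with_question_mark` is hypothetical, and it is a standalone function rather than part of the `Joke` validator.

```python
# ASCII "?" plus the full-width "？" (U+FF1F) used in Chinese and Japanese text.
QUESTION_MARKS = ("?", "\uff1f")

def ends_with_question_mark(text: str) -> bool:
    # str.endswith accepts a tuple and returns True if any suffix matches.
    return text.rstrip().endswith(QUESTION_MARKS)

print(ends_with_question_mark("Why was the cat sitting on the computer?"))  # → True
print(ends_with_question_mark("如何在競爭激烈的市場中開設一家成功的飲料店？"))  # → True
print(ends_with_question_mark("寫一份飲料店創業計畫書"))  # → False
```

Inside the validator, the condition `setup[-1] != "?"` could then be replaced by `not ends_with_question_mark(setup)`.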