# CommaSeparatedListOutputParser

## Overview

The ```CommaSeparatedListOutputParser``` is a specialized output parser in LangChain designed to produce structured output in the form of a comma-separated list.

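It simplifies extracting and presenting data, which makes it well suited to organizing information such as data points, names, items, or other structured values. Using this parser gives the output a consistent format and improves clarity and workflow efficiency, particularly in applications where structured output is essential.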

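This tutorial demonstrates how to use the ```CommaSeparatedListOutputParser``` to:

  1. Set up and initialize the parser that generates comma-separated lists
  2. Integrate it with a prompt template and a language model
  3. Process the structured output iteratively using the streaming mechanism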


### Table of Contents

- [Overview](#overview)
- [Environment Setup](#environment-setup)
- [Implementing the CommaSeparatedListOutputParser](#implementing-the-commaseparatedlistoutputparser)
- [Using Streamed Outputs](#using-streamed-outputs)

### References

- [LangChain ChatOpenAI API reference](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html)
- [LangChain Core Output Parsers](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.list.CommaSeparatedListOutputParser.html#)
- [Python List Tutorial](https://docs.python.org/3.13/tutorial/datastructures.html)
---

## Environment Setup

Set up the environment. You may refer to [Environment Setup](https://wikidocs.net/257836) for more details.

**[Note]**
- ```langchain-opentutorial``` is a package that provides easy-to-use environment setup, helper functions, and utilities for the tutorials.
- See the [```langchain-opentutorial```](https://github.com/LangChain-OpenTutorial/langchain-opentutorial-pypi) repository for more details.

In [1]:
%%capture --no-stderr
%pip install langchain-opentutorial

In [2]:
# Install required packages
from langchain_opentutorial import package

package.install(
    [
        "langsmith",
        "langchain",
        "langchain_openai",
        "langchain_community",
    ],
    verbose=False,
    upgrade=False,
)

In [3]:
# Set environment variables
from langchain_opentutorial import set_env

set_env(
    {
        "OPENAI_API_KEY": "",
        "LANGCHAIN_API_KEY": "",
        "LANGCHAIN_TRACING_V2": "true",
        "LANGCHAIN_ENDPOINT": "https://api.smith.langchain.com",
        "LANGCHAIN_PROJECT": "02-CommaSeparatedListOutputParser",
    }
)

Environment variables have been set successfully.


Alternatively, you can set ```OPENAI_API_KEY``` in a ```.env``` file and load it.

[Note] This is not necessary if you've already set ```OPENAI_API_KEY``` in previous steps.

In [3]:
from dotenv import load_dotenv

load_dotenv()

True

## Implementing the ```CommaSeparatedListOutputParser```
If you need output in the form of a comma-separated list, LangChain's ```CommaSeparatedListOutputParser``` simplifies the process. A step-by-step implementation follows:

### 1. Importing Required Modules
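Start by importing the necessary module and initializing the ```CommaSeparatedListOutputParser```. Then retrieve the format instructions from the parser to guide the structure of the output.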


In [None]:
from langchain_core.output_parsers import CommaSeparatedListOutputParser

# Initialize the output parser
output_parser = CommaSeparatedListOutputParser()

# Retrieve format instructions for the output parser
format_instructions = output_parser.get_format_instructions()
print(format_instructions)

Your response should be a list of comma separated values, eg: `foo, bar, baz` or `foo,bar,baz`
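
To illustrate what parsing text in this format amounts to, here is a rough standard-library sketch. The helper ```parse_comma_separated``` is hypothetical and only approximates the idea; it is not LangChain's implementation, and its handling of edge cases may differ from the real parser.

```python
import csv
from io import StringIO


def parse_comma_separated(text: str) -> list[str]:
    """Hypothetical helper: split comma-separated text into a list of strings."""
    # skipinitialspace=True drops the space that follows each comma,
    # so "foo, bar" and "foo,bar" produce the same items.
    reader = csv.reader(StringIO(text), skipinitialspace=True)
    return [item.strip() for row in reader for item in row]


print(parse_comma_separated("foo, bar, baz"))
print(parse_comma_separated("foo,bar,baz"))
```

Both calls print the same three-item list, which is why the format instructions accept either spacing.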


### 2. Creating the Prompt Template
Define a ```PromptTemplate``` that dynamically generates a list of items. The ```subject``` placeholder is replaced with the desired topic at run time.

In [5]:
from langchain_core.prompts import PromptTemplate

# Define the prompt template
prompt = PromptTemplate(
    template="List five {subject}.\n{format_instructions}",
    input_variables=["subject"],  # 'subject' will be dynamically replaced
    partial_variables={
        "format_instructions": format_instructions
    },  # Use parser's format instructions
)
print(prompt)

input_variables=['subject'] input_types={} partial_variables={'format_instructions': 'Your response should be a list of comma separated values, eg: `foo, bar, baz` or `foo,bar,baz`'} template='List five {subject}.\n{format_instructions}'
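
To preview roughly what text the model receives once both placeholders are filled, the sketch below uses plain ```str.format``` as a stand-in for ```PromptTemplate```. This is an approximation for illustration only; the template and format instructions are copied from the cells above, and the subject value is the one used later in this tutorial.

```python
# Values copied from the tutorial cells above.
template = "List five {subject}.\n{format_instructions}"
format_instructions = (
    "Your response should be a list of comma separated values, "
    "eg: `foo, bar, baz` or `foo,bar,baz`"
)

# Plain str.format stands in for PromptTemplate here (an approximation).
rendered = template.format(
    subject="famous landmarks in South Korea",
    format_instructions=format_instructions,
)
print(rendered)
```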


### 3. Integrating with ```ChatOpenAI``` and Running the Chain
Combine the ```PromptTemplate```, the ```ChatOpenAI``` model, and the ```CommaSeparatedListOutputParser``` into a chain. Then run the chain with a specific ```subject``` to produce the result.

In [6]:
from langchain_openai import ChatOpenAI

# Initialize the ChatOpenAI model
model = ChatOpenAI(temperature=0)

# Combine the prompt, model, and output parser into a chain
chain = prompt | model | output_parser

# Run the chain with a specific subject
result = chain.invoke({"subject": "famous landmarks in South Korea"})
print(result)

['Gyeongbokgung Palace', 'N Seoul Tower', 'Bukchon Hanok Village', 'Seongsan Ilchulbong Peak', 'Haeundae Beach']
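
The data flow through the chain can be pictured as three functions applied left to right. The sketch below is a plain-Python analogy and does not use LangChain: ```render_prompt```, ```fake_model```, ```parse_list```, and ```run_chain``` are all hypothetical names, and ```fake_model``` returns canned text instead of calling an API.

```python
def render_prompt(inputs: dict) -> str:
    # Stand-in for the prompt template step.
    return f"List five {inputs['subject']}."


def fake_model(prompt_text: str) -> str:
    # Hypothetical stand-in for ChatOpenAI: returns canned text, no API call.
    return "Gyeongbokgung Palace, N Seoul Tower, Bukchon Hanok Village"


def parse_list(text: str) -> list[str]:
    # Stand-in for the output parser step.
    return [item.strip() for item in text.split(",")]


def run_chain(inputs: dict) -> list[str]:
    # Same left-to-right order as `prompt | model | output_parser`.
    return parse_list(fake_model(render_prompt(inputs)))


print(run_chain({"subject": "famous landmarks in South Korea"}))
```

Each step consumes the previous step's output: a dict becomes prompt text, the prompt text becomes model text, and the model text becomes a list.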


### 4. Accessing Data with Python Indexing
Because the ```CommaSeparatedListOutputParser``` parses the model output into a Python list, you can easily access individual elements by index.

In [7]:
# Accessing specific elements using Python indexing
print("First Landmark:", result[0])
print("Second Landmark:", result[1])
print("Last Landmark:", result[-1])

First Landmark: Gyeongbokgung Palace
Second Landmark: N Seoul Tower
Last Landmark: Haeundae Beach
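
Since the result is an ordinary Python list, the usual list operations apply as well. The sketch below hardcodes the sample list from the output shown earlier so that it runs without an API call; a real run may return different items, or a different number of them.

```python
# Sample list copied from the output shown earlier in this tutorial.
result = [
    "Gyeongbokgung Palace",
    "N Seoul Tower",
    "Bukchon Hanok Village",
    "Seongsan Ilchulbong Peak",
    "Haeundae Beach",
]

# Slicing returns a new list with the selected range.
first_three = result[:3]
print(first_three)

# enumerate pairs each item with a position, starting at 1 here.
for position, landmark in enumerate(result, start=1):
    print(f"{position}. {landmark}")

# The prompt asks for five items, but the model is not guaranteed to comply,
# so checking the length before indexing can avoid an IndexError.
if len(result) >= 5:
    print("Fifth Landmark:", result[4])
```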


## Using Streamed Outputs
For larger outputs or real-time feedback, you can use the ```stream``` method. It lets you process the generated data incrementally, one piece at a time.

In [8]:
# Iterate through the streamed output for a subject
for output in chain.stream({"subject": "famous landmarks in South Korea"}):
    print(output)

['Gyeongbokgung Palace']
['N Seoul Tower']
['Bukchon Hanok Village']
['Seongsan Ilchulbong Peak']
['Haeundae Beach']
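
If the complete list is needed once streaming ends, the chunks can be accumulated as they arrive. The sketch below assumes each chunk is a one-item list, as in the output above, and uses a hypothetical generator ```fake_stream``` in place of ```chain.stream(...)``` so that it runs without an API call.

```python
from typing import Iterator


def fake_stream() -> Iterator[list[str]]:
    # Hypothetical stand-in for chain.stream(...): yields one-item lists,
    # mirroring the shape of the chunks printed above.
    yield ["Gyeongbokgung Palace"]
    yield ["N Seoul Tower"]
    yield ["Bukchon Hanok Village"]


landmarks: list[str] = []
for chunk in fake_stream():
    # Each chunk is itself a list, so extend (not append) keeps the result flat.
    landmarks.extend(chunk)
    print(f"Received {len(landmarks)} item(s) so far")

print(landmarks)
```

With the real chain, the loop would iterate over ```chain.stream({"subject": ...})``` instead of ```fake_stream()```.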
