# Other Machine-Readable Formats with Output Parsers

You can also use an LLM or chat model to produce output in other formats, such as CSV or XML. This is where output parsers come in handy. <em>Output parsers</em> are classes that help you structure large language model responses. They serve two functions:


<em>Providing format instructions</em>

- An output parser can inject additional instructions into the prompt that guide the LLM to produce text in a format the parser knows how to parse.

<em>Validating and parsing output</em>
	
- The main function is to take the textual output of the LLM or chat model and render it to a more structured format, such as a list, XML, or other format. This can include removing extraneous information, correcting incomplete output, and validating the parsed values.
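
To make the two roles concrete, here is a minimal sketch in plain Python. `TinyListParser` is a hypothetical class written for illustration; it is not LangChain's implementation, and real parsers cover more edge cases than this one does.

```python
class TinyListParser:
    """Hypothetical stand-in for an output parser (not a LangChain class)."""

    def get_format_instructions(self) -> str:
        # Role 1: text appended to the prompt so the model emits parseable output.
        return "Respond with a comma-separated list only, e.g. `foo, bar, baz`."

    def parse(self, text: str) -> list[str]:
        # Role 2: turn raw model text into a Python structure.
        # Drop surrounding whitespace and a trailing period, then split on commas.
        cleaned = text.strip().rstrip(".")
        items = [item.strip() for item in cleaned.split(",")]
        # Validate: discard empty entries and reject output with no usable items.
        items = [item for item in items if item]
        if not items:
            raise ValueError(f"No list items found in model output: {text!r}")
        return items


parser = TinyListParser()
prompt = "List three fruits.\n" + parser.get_format_instructions()
print(prompt)
print(parser.parse(" apple, banana , cherry.\n"))
```

The same object supplies the prompt text and parses the reply, which keeps the requested format and the parsing logic in one place.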
	


Here’s an example of how an output parser works:

```python
from langchain_core.output_parsers import CommaSeparatedListOutputParser

parser = CommaSeparatedListOutputParser()

# Instructions the parser can add to a prompt
print(parser.get_format_instructions())

# Parse raw model text into a Python list
response = parser.invoke("apple, banana, cherry")
print(response)
```

```
Your response should be a list of comma separated values, eg: `foo, bar, baz` or `foo,bar,baz`
['apple', 'banana', 'cherry']
```
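
The same idea carries over to XML, which this section mentions alongside CSV. The sketch below uses only Python's standard `xml.etree.ElementTree` rather than a LangChain parser; the `extract_xml` helper, the sample response string, and the tag names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def extract_xml(text: str) -> dict[str, str]:
    """Hypothetical helper: parse a flat XML element out of a model response."""
    # A response may include text around the payload, so keep only the
    # span from the first '<' to the last '>'.
    start, end = text.find("<"), text.rfind(">")
    if start == -1 or end == -1:
        raise ValueError("No XML found in model output")
    root = ET.fromstring(text[start : end + 1])
    # Map each child tag to its stripped text content.
    return {child.tag: (child.text or "").strip() for child in root}


raw = "Sure! Here it is:\n<fruit><name> apple </name><color>red</color></fruit>"
print(extract_xml(raw))
```

This handles one level of nesting only; malformed XML surfaces as an `xml.etree.ElementTree.ParseError`, which the caller can catch and handle.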


# Use Cases
1. Data Extraction from Unstructured Text

An LLM with a <b>CSV output parser</b> could extract structured information from unstructured sources like research papers, financial reports, or clinical notes. 

For example, a medical researcher could process hundreds of patient case reports, having the LLM automatically extract key data points (patient demographics, symptoms, treatments, outcomes) into a <b>comma-separated format</b> that can be directly imported into statistical analysis software or spreadsheets for further analysis.

```python
import pandas as pd
from langchain_core.output_parsers import CommaSeparatedListOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

# Initialize the parser and LLM
csv_parser = CommaSeparatedListOutputParser()
llm = ChatOpenAI(model="gpt-4o-mini-2024-07-18", temperature=0)

# Create prompt template with format instructions
prompt = PromptTemplate(
    template="Extract the following from this patient note:\n"
             "Age, Gender, Main Symptom, Diagnosis, Treatment\n\n"
             "Patient Note:\n{note}\n\n"
             "{format_instructions}",
    input_variables=["note"],
    partial_variables={"format_instructions": csv_parser.get_format_instructions()}
)

# Create chain: prompt -> LLM -> parser
extraction_chain = prompt | llm | csv_parser

# Simple patient note
patient_note = """
Jane Doe, a 42-year-old female, came in with headaches
lasting for 3 days. Patient reports sensitivity to light and sound.
Prescribed sumatriptan 50mg as needed and recommended stress reduction techniques.
"""

# Extract data (the parser returns a list of strings)
extracted_data = extraction_chain.invoke({"note": patient_note})
print(extracted_data)

# Convert to a one-row DataFrame; this raises an error unless the model
# returned exactly one value per header
headers = ["Age", "Gender", "Main Symptom", "Diagnosis", "Treatment"]
df = pd.DataFrame([extracted_data], columns=headers)
df  # displays the table when run in a notebook
```

```
['42', 'female', 'headaches', 'migraine', 'sumatriptan 50mg']
```

|   | Age | Gender | Main Symptom | Diagnosis | Treatment |
|---|-----|--------|--------------|-----------|-----------|
| 0 | 42 | female | headaches | migraine | sumatriptan 50mg |
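
Because the parser returns a plain list, nothing guarantees that the model produced exactly five fields in the expected order. A small check before building the table can catch that, and the standard `csv` module can then write the rows for a spreadsheet or statistics package. This is a sketch under assumptions: `validate_row` and `rows_to_csv` are hypothetical helpers, and the numeric-age rule is only one example of a validation you might want.

```python
import csv
import io

HEADERS = ["Age", "Gender", "Main Symptom", "Diagnosis", "Treatment"]


def validate_row(row: list[str]) -> list[str]:
    """Hypothetical check: right number of fields and a numeric age."""
    if len(row) != len(HEADERS):
        raise ValueError(f"Expected {len(HEADERS)} fields, got {len(row)}: {row}")
    cleaned = [field.strip() for field in row]
    if not cleaned[0].isdigit():
        raise ValueError(f"Age is not a whole number: {cleaned[0]!r}")
    return cleaned


def rows_to_csv(rows: list[list[str]]) -> str:
    """Write validated rows, preceded by a header line, to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADERS)
    for row in rows:
        writer.writerow(validate_row(row))
    return buffer.getvalue()


# The row printed by the extraction chain above
extracted = ["42", "female", "headaches", "migraine", "sumatriptan 50mg"]
print(rows_to_csv([extracted]))
```

In a batch run, one option is to call the chain once per note, collect the rows, and set aside any note whose output fails validation for manual review.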
