# Output Parsers

Output Parsers in LangChain help convert raw LLM responses into structured formats like JSON, comma-separated lists, Pydantic models, and more. They help enforce consistency and validation, and make model output easier to use in applications.

### What are Output Parsers in LangChain?

**Output Parsers** in LangChain are utilities that help you transform the raw text output from a language model (LLM) into structured, machine-readable data (such as Python dictionaries, lists, or custom objects). They are essential for building reliable, automated workflows with LLMs.

### Why Use Output Parsers?

- **Reliability:** Ensures LLM responses follow a predictable structure.
- **Automation:** Enables downstream code to easily process and use LLM outputs.
- **Validation:** Catches formatting errors or deviations from the expected schema.

### Common Types of Output Parsers

1. **StructuredOutputParser**  
   - Parses model output into a structured dictionary based on a schema.
   - Used with response schemas to extract specific fields.

2. **PydanticOutputParser**  
   - Uses [Pydantic](https://docs.pydantic.dev/) models for robust schema validation.
   - Returns instances of your Pydantic models.

3. **CommaSeparatedListOutputParser**  
   - Parses a comma-separated string into a Python list.

4. **RegexParser**  
   - Uses regular expressions to extract fields from model outputs.
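
To give a feel for what the last two parsers do, here is a rough standard-library sketch of the idea behind each. These are not LangChain's implementations: `parse_comma_separated` and `parse_with_regex` are hypothetical helper names, and the pattern and key names are assumptions chosen for illustration.

```python
import re


def parse_comma_separated(text: str) -> list[str]:
    """Split a comma-separated reply into a list of trimmed, non-empty items."""
    return [item.strip() for item in text.split(",") if item.strip()]


def parse_with_regex(text: str, pattern: str, keys: list[str]) -> dict[str, str]:
    """Map the capture groups of `pattern` onto `keys`, in order."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"Could not parse output: {text!r}")
    return dict(zip(keys, match.groups()))


items = parse_comma_separated("red, green , blue")
fields = parse_with_regex(
    "Answer: Paris\nConfidence: high",
    r"Answer: (.*)\nConfidence: (.*)",
    ["answer", "confidence"],
)
print(items)   # ['red', 'green', 'blue']
print(fields)  # {'answer': 'Paris', 'confidence': 'high'}
```

Both helpers fail or return an empty result when the model ignores the requested format, which is why the prompt should state that format explicitly.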

### Example: Structured Output Parsing

```python
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain.prompts import PromptTemplate

# Define the expected output schema
response_schemas = [
    ResponseSchema(name="summary", description="A short summary"),
    ResponseSchema(name="keywords", description="List of main keywords")
]

output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

prompt = PromptTemplate(
    template="Summarize and extract keywords. {format_instructions}\nText: {text}",
    input_variables=["text"],
    partial_variables={"format_instructions": output_parser.get_format_instructions()}
)

# After getting LLM result, parse it:
# parsed = output_parser.parse(llm_output)
```

### Example: Pydantic Output Parsing

```python
from pydantic import BaseModel
from langchain.output_parsers import PydanticOutputParser

class User(BaseModel):
    name: str
    age: int

parser = PydanticOutputParser(pydantic_object=User)

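# Fill the parser's format instructions into the prompt
prompt_template = "Extract the user's name and age.\n{format_instructions}\nText: {text}"
prompt = prompt_template.format(
    format_instructions=parser.get_format_instructions(),
    text="Alice is 27 years old.",
)

# After getting the LLM result, parse it into a User instance:
# parsed = parser.parse(llm_output)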
```

**Summary:**  
LangChain’s output parsers are vital for turning raw LLM text into reliable, structured data—enabling robust automation and integration in your applications.

# StrOutputParser


The StrOutputParser is the simplest output parser in LangChain. It is used to parse the output of a Language Model (LLM) and return it as a plain string.

### StrOutputParser in LangChain

**StrOutputParser** takes the raw output from a language model (LLM) and returns it as a plain Python string, without any additional formatting or parsing. Inside a chain it is typically placed after a chat model to turn the returned message object into its text content. Use it when you just want the LLM's response as a string and don't require structured or validated output.

### When to Use StrOutputParser

- When you want the raw text from the model.
- For simple use cases where you do not need to extract fields, lists, or JSON from the response.
- As a default or fallback parser when no special output handling is required.


### Example Usage

```python
# StrOutputParser lives in langchain_core, not in langchain.output_parsers
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

# Suppose llm_output = "Hello, world!"
parsed = parser.parse("Hello, world!")
print(parsed)  # Output: Hello, world!
```

### How It Works

- The parser does not validate, split, or reformat the output.
- It is often used in basic chains or for debugging, where more complex parsing is unnecessary.
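
As a rough mental model (not LangChain's actual code), the behaviour described above amounts to a pass-through that extracts text. The sketch below uses only the standard library; `FakeMessage` and `to_text` are hypothetical names standing in for a chat model's message object and the parser's logic.

```python
from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a chat model's message object."""
    content: str


def to_text(output) -> str:
    """Return the text unchanged: no validation, splitting, or reformatting."""
    if isinstance(output, str):
        return output
    return output.content


plain = to_text("Hello, world!")
from_message = to_text(FakeMessage(content="  raw, unstripped text  "))
print(repr(plain))
print(repr(from_message))
```

Note that the whitespace in the second example survives untouched; any trimming or cleanup is left to the caller.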

**Summary:**  
`StrOutputParser` is the most basic output parser in LangChain, returning the LLM response as a plain string—ideal for straightforward use cases.

# JsonOutputParser in LangChain

**JsonOutputParser** (note the capitalization; the class is not named `JSONOutputParser`) is an output parser, importable from `langchain_core.output_parsers`, that converts the raw text output from a language model (LLM) into a Python data structure (usually a dictionary or list) by parsing it as JSON. It is especially useful when you prompt your LLM to return responses in JSON format for reliable, machine-readable outputs.

### Key Features

- **Parses LLM output as JSON:** Converts the string output to Python objects (dict, list, etc.).
- **Error handling:** Raises an `OutputParserException` if the output cannot be parsed as JSON.
- **Automation:** Enables chaining and automation by making model outputs easy to consume in code.
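
To make the error-handling behaviour concrete, here is a minimal standard-library sketch of JSON parsing that tolerates a Markdown code fence around the payload. It is an illustration under stated assumptions, not the library's implementation: `parse_json_output` is a hypothetical helper, and raising `ValueError` is a choice made for this sketch.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code-fence marker, built here to keep the source readable
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def parse_json_output(text: str):
    """Parse model output as JSON, unwrapping a Markdown fence if one is present."""
    match = _FENCED.search(text)
    payload = match.group(1) if match else text.strip()
    try:
        return json.loads(payload)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Invalid JSON output: {text!r}") from exc


fenced_reply = FENCE + 'json\n{"summary": "ok", "keywords": ["a", "b"]}\n' + FENCE
good = parse_json_output(fenced_reply)
print(good["keywords"])

try:
    parse_json_output("Sure! Here is the summary you asked for.")
except ValueError as err:
    print("Parse failed:", err)
```

Failing loudly on the second input is the point: a chatty, non-JSON reply is caught at the parsing step instead of surfacing later as a confusing `KeyError` or `TypeError`.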

### Example Usage

```python
# The class is spelled JsonOutputParser and is exported by langchain_core
from langchain_core.output_parsers import JsonOutputParser

parser = JsonOutputParser()

llm_output = '{"summary": "LangChain is a framework.", "keywords": ["LangChain", "framework"]}'
parsed = parser.parse(llm_output)
print(parsed)
# Output: {'summary': 'LangChain is a framework.', 'keywords': ['LangChain', 'framework']}
```

### When to Use JsonOutputParser

- When you instruct the LLM to return its response in JSON format.
- For structured outputs needed in data pipelines, APIs, or further programmatic use.
- When you want malformed JSON to fail at the parsing step instead of propagating into downstream code.


### Best Practices

- Always include clear instructions in your prompt to have the LLM return well-formatted JSON.
- Consider including the parser's format instructions (from `get_format_instructions()`) in your prompt:
  ```python
  question = "What is LangChain?"  # example input
  prompt = f"Return your answer in the following format:\n{parser.get_format_instructions()}\nQuestion: {question}"
  ```

**Summary:**  
`JsonOutputParser` is the go-to tool in LangChain for turning JSON-formatted LLM output into Python dictionaries and lists.

# StructuredOutputParser

StructuredOutputParser is an output parser in LangChain that extracts structured data from JSON-formatted LLM responses based on predefined field schemas. You define a list of fields (`ResponseSchema`) that the model should return, and the parser checks that the parsed output contains them.

## StructuredOutputParser in LangChain

**StructuredOutputParser** parses the text output of a language model into a Python dict whose keys are the names given in your response schemas. It suits cases where you need several named fields but not full type validation.

### Key Features

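- **Schema-driven:** Define the fields you expect by name and description; any type you declare is a hint passed to the model through the format instructions, not something the parser enforces.
- **Reliable extraction:** Parses the LLM output into a Python dictionary and raises an error if an expected field is missing.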
- **Format instructions:** Automatically generates instructions for the prompt, helping the model return the correct structure.
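
The two halves of this workflow, generating format instructions and checking the parsed keys, can be sketched with the standard library alone. This is a simplified illustration rather than LangChain's implementation: `SCHEMA`, `build_format_instructions`, and `parse_structured` are hypothetical names, and the wording of the generated instructions is an assumption.

```python
import json

# Hypothetical schema: field name -> description (mirrors the ResponseSchema idea).
SCHEMA = {
    "summary": "A short summary of the text",
    "keywords": "A list of keywords from the text",
}


def build_format_instructions(schema: dict[str, str]) -> str:
    """Describe the expected JSON keys so the text can be pasted into a prompt."""
    lines = [f'  "{name}": ...  // {desc}' for name, desc in schema.items()]
    return "Return a JSON object with these keys:\n{\n" + "\n".join(lines) + "\n}"


def parse_structured(text: str, schema: dict[str, str]) -> dict:
    """Parse JSON and fail if any expected key is missing."""
    data = json.loads(text)
    missing = [name for name in schema if name not in data]
    if missing:
        raise ValueError(f"Missing expected keys: {missing}")
    return data


instructions = build_format_instructions(SCHEMA)
print(instructions)

parsed = parse_structured(
    '{"summary": "LangChain is a framework.", "keywords": ["LangChain"]}', SCHEMA
)
print(parsed["summary"])
```

The check covers only the presence of keys. Nothing here verifies that `keywords` is actually a list, which is the gap a Pydantic-based parser closes.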

### Example Usage

```python
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain.prompts import PromptTemplate

# 1. Define the schema for the output
response_schemas = [
    ResponseSchema(name="summary", description="A short summary of the text"),
    ResponseSchema(name="keywords", description="A list of keywords from the text"),
]

# 2. Create the parser
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

# 3. Insert format instructions into your prompt
prompt = PromptTemplate(
    template="Summarize the following text and extract keywords.\n{format_instructions}\nText: {text}",
    input_variables=["text"],
    partial_variables={"format_instructions": output_parser.get_format_instructions()},
)

# 4. After you get LLM output, parse it
llm_output = '''
{
    "summary": "LangChain is a framework for building LLM-powered applications.",
    "keywords": ["LangChain", "framework", "LLM", "applications"]
}
'''
parsed = output_parser.parse(llm_output)
print(parsed)
# Output: {'summary': 'LangChain is a framework for building LLM-powered applications.', 'keywords': ['LangChain', 'framework', 'LLM', 'applications']}
```

### When to Use StructuredOutputParser

- When you need specific fields/values from LLM output.
- For structured data extraction (e.g., summaries, entities, custom schemas).
- To enable robust automation and chaining in data pipelines.


**Summary:**  
`StructuredOutputParser` lets you reliably extract multiple, well-defined fields from LLM output—ideal for robust, production-ready workflows.

# PydanticOutputParser

- **What is** `PydanticOutputParser` in LangChain?

  PydanticOutputParser is a structured output parser in LangChain that uses Pydantic models to enforce schema validation when processing LLM responses.

- **Why Use** `PydanticOutputParser`?

  - ✅ **Strict Schema Enforcement** — Ensures that LLM responses follow a well-defined structure.
  - ✅ **Type Safety** — Automatically converts LLM outputs into Python objects.
  - ✅ **Easy Validation** — Uses Pydantic's built-in validation to catch incorrect or missing data.
  - ✅ **Seamless Integration** — Works well with other LangChain components.

## PydanticOutputParser in LangChain

**PydanticOutputParser** uses [Pydantic](https://docs.pydantic.dev/) models to define and validate the structure of the data you expect from a language model (LLM). It parses the LLM's JSON output into an instance of your model and raises an error when the output does not match the schema, so downstream code receives typed, validated objects.


### Key Features

- **Schema validation:** Enforces that the LLM output matches a strict Pydantic model.
- **Type safety:** Returns a validated Pydantic object, making downstream code more reliable.
- **Automatic parsing:** Converts JSON-formatted LLM output to Python objects.
- **Error handling:** Raises descriptive errors if the output doesn’t fit the schema.
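
The validation step can be pictured without Pydantic at all. The sketch below is a deliberately simplified standard-library stand-in: it parses JSON, then checks each field against the type annotations of a dataclass. `parse_into` and the `TypeError` it raises are assumptions of this sketch; Pydantic itself does considerably more (type coercion, nested models, custom validators).

```python
import json
from dataclasses import dataclass, fields


@dataclass
class User:
    name: str
    age: int


def parse_into(model, text: str):
    """Parse JSON and check every field against the dataclass annotations."""
    data = json.loads(text)
    for field in fields(model):
        if field.name not in data:
            raise TypeError(f"Missing field: {field.name}")
        if not isinstance(data[field.name], field.type):
            raise TypeError(
                f"Field {field.name!r} should be {field.type.__name__}, "
                f"got {type(data[field.name]).__name__}"
            )
    return model(**data)


user = parse_into(User, '{"name": "Alice", "age": 27}')
print(user)

try:
    parse_into(User, '{"name": "Alice", "age": "twenty-seven"}')
except TypeError as err:
    print("Validation failed:", err)
```

The second call is rejected because `age` arrives as a string, the kind of mismatch a schema-validating parser is meant to surface.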

### Example Usage

```python
from pydantic import BaseModel
from langchain.output_parsers import PydanticOutputParser

# 1. Define your Pydantic model
class User(BaseModel):
    name: str
    age: int

# 2. Create the parser
parser = PydanticOutputParser(pydantic_object=User)

# 3. Add parser’s format instructions to your prompt
format_instructions = parser.get_format_instructions()
prompt = f"Extract the user's name and age and return in this format:\n{format_instructions}\nText: Alice, 27"

# 4. After generating output with the LLM (should be valid JSON), parse it:
llm_output = '{"name": "Alice", "age": 27}'
parsed = parser.parse(llm_output)
print(parsed)        # Output: name='Alice' age=27
print(type(parsed))  # Output: <class '__main__.User'>
```

### When to Use PydanticOutputParser

- When you need strongly-typed, validated outputs from LLMs.
- For complex, nested, or strictly formatted data.
- In production systems where reliability and type safety are critical.


### Best Practices

- Always include the parser’s `format_instructions` in your prompt to guide the LLM.
- Wrap `parser.parse(...)` in error handling so that invalid output is caught before it reaches downstream code.
- Use Pydantic’s features (e.g., optional fields, validators) for robust schemas.
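
One way to keep parse failures away from downstream code is to funnel every parse call through a small wrapper. The sketch below is an illustration using only the standard library: `safe_parse` is a hypothetical helper, and `json.loads` stands in for a parser's `parse` method (with a real parser you would pass `parser.parse` and the exception type it raises).

```python
import json
from typing import Any, Callable, Optional, Tuple


def safe_parse(
    parse: Callable[[str], Any],
    text: str,
    errors: Tuple[type, ...] = (ValueError,),
) -> Tuple[Optional[Any], Optional[str]]:
    """Return (value, None) on success or (None, error message) on failure."""
    try:
        return parse(text), None
    except errors as exc:
        return None, str(exc)


value, error = safe_parse(json.loads, '{"name": "Alice", "age": 27}')
print(value, error)

bad_value, bad_error = safe_parse(json.loads, "Alice is 27 years old.")
print(bad_value, bad_error is not None)
```

Returning the error message instead of re-raising lets the caller decide what to do next, for example logging the failure or asking the model to try again.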

**Summary:**  
`PydanticOutputParser` is ideal for extracting structured, validated, and type-safe outputs from language models using the power of Pydantic models.