This is a notebook on output parsers in LangChain.


Output Parsers in LangChain are classes that structure and validate the raw text output from LLMs into a more usable format. They ensure the model's response conforms to a specific schema or structure that your application expects.

Why Use Output Parsers?
Consistency: Get structured data instead of free-form text

Validation: Catch malformed responses early

Type safety: Convert strings to proper Python objects

Reliability: Ensure downstream code receives expected formats
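To make these points concrete, here is a minimal sketch that uses only the standard library, not LangChain. The helper `parse_facts` and the sample reply are made up for illustration; the idea is simply that raw text goes in and either a validated Python object comes out or an error is raised early.

```python
import json


def parse_facts(raw: str) -> list[str]:
    """Hypothetical helper: turn a raw model reply into a validated list of strings."""
    # json.loads raises json.JSONDecodeError (a subclass of ValueError) on malformed JSON
    data = json.loads(raw)
    # Validation: reject anything that is not a list of strings
    if not isinstance(data, list) or not all(isinstance(item, str) for item in data):
        raise ValueError("expected a JSON array of strings")
    return data


# Made-up reply standing in for real model output
raw_reply = '["Black holes bend light.", "They can merge."]'
facts = parse_facts(raw_reply)
print(type(facts).__name__, len(facts))
```

The LangChain parsers below follow the same pattern, with the added benefit that they plug directly into a chain.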

1. StrOutputParser

StrOutputParser is the simplest output parser in LangChain. It takes the model's response (a chat message such as AIMessage, or a plain string) and returns its text content as a Python str. It does not clean or restructure the text; its main use is to hand plain text to the next step of a chain.

In [35]:
from dotenv import load_dotenv
from langchain_core.output_parsers import JsonOutputParser, StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI

load_dotenv()


True

In [25]:
parser = StrOutputParser()
model = ChatGoogleGenerativeAI(model="gemini-2.5-pro")

In [22]:
# model.invoke returns an AIMessage; the parser extracts its text as a str
response = model.invoke("Hello, how are you?")
parsed_response = parser.invoke(response)

In [23]:
type(parsed_response)

str

Second StrOutputParser example: a detailed report followed by a summary

In [26]:
# 1st prompt -> detailed report
template1 = PromptTemplate(
    template='Write a detailed report on {topic}',
    input_variables=['topic']
)

# 2nd prompt -> summary
template2 = PromptTemplate(
    template='Write a 5 line summary on the following text. \n {text}',
    input_variables=['text']
)

prompt1 = template1.invoke({'topic':'black hole'})

result = model.invoke(prompt1)

prompt2 = template2.invoke({'text':result.content})

result1 = model.invoke(prompt2)

print(result1.content)

A black hole is a region of spacetime with gravity so intense that nothing, not even light, can escape its event horizon. They typically form from the collapsed remnants of massive stars, with supermassive versions residing at the centers of most galaxies. Although invisible, they are detected by their gravitational effects on nearby stars, the bright accretion disks of matter swirling into them, and gravitational waves from their mergers. These objects are crucial to galaxy evolution and challenge our fundamental understanding of physics.


In [27]:
type(result1)

langchain_core.messages.ai.AIMessage

In [28]:
type(parser.invoke(result1.content))

str

2. JsonOutputParser

In [48]:
parser = JsonOutputParser()

In [57]:
template = PromptTemplate(
    template="""Give me 5 facts about {topic} as a JSON array of strings.
    
    {format_instructions}
    
    
    Topic: {topic}""",
    input_variables=['topic'],
    partial_variables={'format_instructions': parser.get_format_instructions()}
)

chain = template | model | parser
result = chain.invoke({'topic': 'black hole'})

print("Result:", result)
print("Type:", type(result))

Result: {'topic': 'black hole', 'facts': ["A black hole's gravitational pull is so strong that nothing, not even light, can escape once it crosses the event horizon.", 'The center of a black hole is a point of infinite density called a singularity, where the laws of physics as we know them break down.', 'Supermassive black holes, millions or even billions of times the mass of our Sun, are thought to reside at the center of most large galaxies, including our own Milky Way.', 'Black holes are formed from the remnants of massive stars that collapse under their own gravity in a supernova explosion.', 'The first-ever direct image of a black hole was captured by the Event Horizon Telescope collaboration and released in 2019.']}
Type: <class 'dict'>


3. StructuredOutputParser

In [None]:
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain.output_parsers import StructuredOutputParser, ResponseSchema

load_dotenv()

# Define the model
llm = HuggingFaceEndpoint(
    repo_id="google/gemma-2-2b-it",
    task="text-generation"
)

model = ChatHuggingFace(llm=llm)

schema = [
    ResponseSchema(name='fact_1', description='Fact 1 about the topic'),
    ResponseSchema(name='fact_2', description='Fact 2 about the topic'),
    ResponseSchema(name='fact_3', description='Fact 3 about the topic'),
]

parser = StructuredOutputParser.from_response_schemas(schema)

template = PromptTemplate(
    template='Give 3 facts about {topic} \n {format_instruction}',
    input_variables=['topic'],
    partial_variables={'format_instruction':parser.get_format_instructions()}
)

chain = template | model | parser

result = chain.invoke({'topic':'black hole'})

print(result)
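As far as I understand it, StructuredOutputParser checks that the keys named in the schema are present in the reply but does not validate the types of their values. The sketch below illustrates that idea with the standard library only; it is not the library's implementation, and `check_keys` is a hypothetical helper.

```python
import json

EXPECTED_KEYS = ["fact_1", "fact_2", "fact_3"]


def check_keys(raw: str, expected_keys: list[str]) -> dict:
    """Hypothetical helper: parse JSON and require every expected key to be present."""
    data = json.loads(raw)
    missing = [key for key in expected_keys if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


# Made-up reply: fact_3 is a number, yet the key check still passes,
# because only the presence of keys is verified, not value types.
good = check_keys('{"fact_1": "a", "fact_2": "b", "fact_3": 3}', EXPECTED_KEYS)
print(sorted(good))
```

When value types and constraints matter, the Pydantic-based parser in the next section is the stricter option.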

4. PydanticOutputParser

In [None]:
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

load_dotenv()

# Define the model
llm = HuggingFaceEndpoint(
    repo_id="google/gemma-2-2b-it",
    task="text-generation"
)

model = ChatHuggingFace(llm=llm)

class Person(BaseModel):

    name: str = Field(description='Name of the person')
    age: int = Field(gt=18, description='Age of the person')
    city: str = Field(description='Name of the city the person belongs to')

parser = PydanticOutputParser(pydantic_object=Person)

template = PromptTemplate(
    template='Generate the name, age and city of a fictional {place} person \n {format_instruction}',
    input_variables=['place'],
    partial_variables={'format_instruction':parser.get_format_instructions()}
)

chain = template | model | parser

final_result = chain.invoke({'place': 'Sri Lankan'})

print(final_result)