##  Benefits of Using Output Parsers in LangChain

1. **Output Format Control**  
   Instead of getting free-form text from the LLM, you can enforce a specific structure such as:
   - A list  
   - JSON  
   - Comma-separated values  
   This makes the output predictable and easier to work with programmatically.

2. **Easier Post-processing**  
   Without a parser, you’d need to manually clean the text (remove numbers, bullets, or formatting).  
   The parser automatically converts the model’s response into structured data (like a Python list or dictionary).

3. **Consistency**  
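   LLMs can return results in slightly different formats each time.  
   An output parser, together with the format instructions it adds to the prompt, makes the format far more consistent across runs, although a weak model can still ignore those instructions.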

4. **Reduced Parsing Errors**  
   Instead of relying on ad hoc string operations or regex,  
   a parser applies the same parsing routine to every response, which reduces parsing mistakes.

5. **Seamless Integration**  
   The parsed output can be used directly in your code or pipeline  
   (for example, as Python lists, JSON objects, or parameters in API calls).

6. **Efficiency**  
   Saves developer time by removing the need for additional text-cleaning or transformation steps after every LLM call.

7. **Scalability & Customization**  
   You can build custom parsers to fit your project’s needs (e.g., return data in a specific JSON schema).  
   This makes LLM-based applications more flexible and scalable.
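
To make points 1 and 2 concrete, here is a minimal sketch of what a comma-separated list parser does conceptually. The function `parse_comma_separated` is a hypothetical stand-in written with the standard library only; it is not LangChain's implementation, and the sample response is illustrative.

```python
def parse_comma_separated(text: str) -> list[str]:
    """Split a comma-separated string into a list of clean items."""
    # Strip surrounding whitespace from each piece and drop empty pieces.
    return [item.strip() for item in text.split(",") if item.strip()]


# Illustrative response in the format the prompt asks for.
raw_response = "Mezze platter, Mixed grill platter, Shawarma platter"
plates = parse_comma_separated(raw_response)
print(plates)
```

The result is an ordinary Python list, so it can be passed straight to the next step of a pipeline without further text cleaning.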


In [None]:
!pip install -U langchain-community


In [None]:
! pip install langchain

In [None]:
!pip install -q --upgrade transformers accelerate
! pip install sentencepiece

In [None]:
!pip uninstall bitsandbytes -y



In [None]:
! pip install bitsandbytes


In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

model_id = "NousResearch/Hermes-2-Pro-Mistral-7B"

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    use_fast=False,
    trust_remote_code=True,
    revision="main"
)

base_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
    revision="main"
)



`torch_dtype` is deprecated! Use `dtype` instead!



In [None]:
from transformers import pipeline
from langchain_community.llms import HuggingFacePipeline

# Wrap the model and tokenizer loaded above in a text-generation pipeline.
pipe = pipeline(
    "text-generation",
    model=base_model,
    tokenizer=tokenizer,
    temperature=0.5,
)

# Expose the pipeline to LangChain as an LLM.
llms = HuggingFacePipeline(pipeline=pipe)


Device set to use cuda:0


In [None]:
from langchain.prompts import PromptTemplate
from langchain.output_parsers import CommaSeparatedListOutputParser

output_parser = CommaSeparatedListOutputParser()

# The bound method is passed without being called; PromptTemplate calls
# callable partial variables when the prompt is formatted.
format_instructions = output_parser.get_format_instructions

# Prompt that appends the parser's format instructions.
prompt1 = PromptTemplate(
    template="List three popular {plate_type} plates.\n{format_instructions}",
    input_variables=["plate_type"],
    partial_variables={"format_instructions": format_instructions},
)

# Same question without any format instructions, for comparison.
prompt2 = PromptTemplate(
    template="List three popular {plate_type} plates",
    input_variables=["plate_type"],
)

In [None]:
print(f"with output parser:     {prompt1}")
print()
print("============================================")

print(f"without output parser:  {prompt2}")


with output parser:     input_variables=['plate_type'] input_types={} partial_variables={'format_instructions': <bound method CommaSeparatedListOutputParser.get_format_instructions of CommaSeparatedListOutputParser()>} template='List three popular {plate_type} plates.\n{format_instructions}'

without output parser:  input_variables=['plate_type'] input_types={} partial_variables={} template='List three popular {plate_type} plates'


In [None]:
demo_message = prompt1.format(plate_type='arabian')

response = llms.invoke(demo_message)

output = output_parser.parse(response)

print(output)

# The parser does return a list, but the answer is poor: the response echoes
# the prompt, and the model is too weak to follow the format instructions.



['List three popular arabian plates.', 'Your response should be a list of comma separated values', 'eg: `foo', 'bar', 'baz` or `foo', 'bar', 'baz`', '1. Mezze platter', '2. Mixed grill platter', '3. Shawarma platter', 'The three popular arabian plates are the Mezze platter', 'which typically includes a variety of dips', 'spreads', 'and salads; the Mixed grill platter', 'which features an assortment of grilled meats and vegetables; and the Shawarma platter', 'a stack of thinly sliced marinated meats served with pita bread and various toppings.']
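
The list above starts with the prompt itself, which suggests the pipeline returned the prompt followed by the completion. One way to cope is to remove the prompt prefix before parsing. The sketch below assumes a hypothetical helper, `strip_echoed_prompt`, which is not part of LangChain or transformers, and the response string is a shortened, made-up stand-in for the real model output.

```python
def strip_echoed_prompt(response: str, prompt: str) -> str:
    """Return the response without a leading copy of the prompt, if present."""
    if response.startswith(prompt):
        return response[len(prompt):].lstrip()
    return response.strip()


prompt = (
    "List three popular arabian plates.\n"
    "Your response should be a list of comma separated values"
)
# Illustrative response: the prompt echoed back, followed by the completion.
response = prompt + "\nMezze platter, Mixed grill platter, Shawarma platter"

completion = strip_echoed_prompt(response, prompt)
items = [item.strip() for item in completion.split(",")]
print(items)
```

The transformers text-generation pipeline also accepts `return_full_text=False`, which should avoid the echo at the source; that option is not used in this notebook, so treat it as something to try.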


# Datetime

In [None]:
from langchain.prompts.prompt import PromptTemplate

# "Replay" is a typo for "Reply"; it is kept so the prompt matches the recorded output below.
format_instructions = "Replay with a datetime format DAY/MONTH/YEAR like `23/05/1988`."

prompt = PromptTemplate(
    template="when was {person} born?\n{format_instructions}",
    input_variables=['person'],
    partial_variables={"format_instructions": format_instructions}
)

message = prompt.format(person='thomas edison')

print(llms.invoke(message))

when was thomas edison born?
Replay with a datetime format DAY/MONTH/YEAR like `23/05/1988`.
Thomas Edison was born on February 11, 1847.
Replay with a datetime format: 11/02/1847.
Thomas Edison was born on February 11, 1847.
Replay with the exact datetime format: 1847-02-11.
Thomas Edison was born on February 11, 1847.
Replay with a different format: Thomas Edison was born on 11/02/1847.
Thomas Edison was born on February 11, 1847.
Replay with the datetime in words: Thomas Edison was born on the 11th of February, 1847.
Thomas Edison was born on February 11, 1847.
Replay with a datetime in a different format: Thomas Edison was born on 1847-02-11.
Thomas Edison was born on February 11, 1847.
Replay with a datetime format that
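
Because the model keeps generating after the answer, a small post-processing step can still recover the date. The sketch below uses only the standard library. `extract_first_date` is a hypothetical helper, and the sample text is copied from the output above with the echoed prompt left out, since the prompt contains the example date `23/05/1988` and would otherwise match first.

```python
import re
from datetime import datetime
from typing import Optional

# Matches dates written as DAY/MONTH/YEAR, e.g. 11/02/1847.
DATE_PATTERN = re.compile(r"\b(\d{2}/\d{2}/\d{4})\b")


def extract_first_date(text: str) -> Optional[datetime]:
    """Return the first valid DAY/MONTH/YEAR date found in text, or None."""
    for match in DATE_PATTERN.finditer(text):
        try:
            return datetime.strptime(match.group(1), "%d/%m/%Y")
        except ValueError:
            # Looks like a date but is not a real one (e.g. 31/02/2000); keep looking.
            continue
    return None


# Two lines taken from the model output above, after the echoed prompt.
completion = (
    "Thomas Edison was born on February 11, 1847.\n"
    "Replay with a datetime format: 11/02/1847."
)
born = extract_first_date(completion)
print(born)
```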


# Pydantic





##  Key Points

### 1. Purpose of Pydantic
- Helps define structured data models with fixed field types.  
- Makes output **consistent**, **validated**, and **easier to use** in code.  
- Replaces raw string output with **well-defined objects**.
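
The idea of a typed object replacing a raw string can be sketched without any third-party package. The `PlateRecord` dataclass below is a hypothetical, hand-written stand-in for a Pydantic model; Pydantic performs this kind of checking automatically, and its behaviour and error types differ from this sketch.

```python
import json
from dataclasses import dataclass


@dataclass
class PlateRecord:
    """Hypothetical stand-in for a Pydantic model with two typed fields."""

    plate_name: str
    ingredients: list[str]

    @classmethod
    def from_json(cls, raw: str) -> "PlateRecord":
        """Parse a JSON string and check each field against its declared type."""
        data = json.loads(raw)
        name = data["plate_name"]
        ingredients = data["ingredients"]
        if not isinstance(name, str):
            raise TypeError("plate_name must be a string")
        if not isinstance(ingredients, list) or not all(
            isinstance(item, str) for item in ingredients
        ):
            raise TypeError("ingredients must be a list of strings")
        return cls(plate_name=name, ingredients=ingredients)


raw = '{"plate_name": "Shawerma", "ingredients": ["ingredient 1", "ingredient 2"]}'
plate = PlateRecord.from_json(raw)
print(plate.plate_name, plate.ingredients)
```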

## ⚠️ Tips & Warnings

### ✅ Advantages
- Structured and clean data.  
- Easy to integrate into real applications or APIs.  

### ⚠️ Warnings
- Pydantic parsers increase **token usage** because they include examples and schema definitions in the prompt.  
- Use them **only when necessary** to avoid high costs.  



In [None]:
from typing import List

from pydantic import BaseModel, Field
from langchain.output_parsers import PydanticOutputParser

# Target structure:
# {
#     "plate_name": "Shawerma",
#     "ingredients": [
#         "ingredient 1", "ingredient 2"
#     ]
# }

class Plate(BaseModel):
    plate_name: str = Field(description="name of the plate")
    ingredients: List[str] = Field(description="list of names of the ingredients")

parser = PydanticOutputParser(pydantic_object=Plate)

In [None]:
prompt = PromptTemplate(
    template=(
        "You are a helpful chef assistant.\n"
        "Given the plate name, list its ingredients.\n\n"
        "Plate name: {plate_name}\n\n"
        "{format_instructions}"
    ),
    input_variables=["plate_name"],
    # Use the Pydantic parser's own instructions, not the datetime string defined earlier.
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

In [None]:
plate_name = 'Shawerma'

message = prompt.format(plate_name=plate_name)

response = llms.invoke(message)

# parse() raises OutputParserException if the response contains no JSON that matches Plate.
output = parser.parse(response)
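
If the response echoes the prompt, as it did in the earlier cells, it may contain example JSON from the format instructions before the model's actual answer. A minimal sketch of one workaround is to pull out the last JSON object in the text and validate that. `find_last_json_object` is a hypothetical helper built on the standard library, the sample response is made up, and taking the last object is an assumption that the answer comes after any examples.

```python
import json
from typing import Optional


def find_last_json_object(text: str) -> Optional[dict]:
    """Return the last top-level JSON object embedded in text, or None."""
    decoder = json.JSONDecoder()
    found = None
    index = text.find("{")
    while index != -1:
        try:
            value, end = decoder.raw_decode(text, index)
        except json.JSONDecodeError:
            # No valid JSON starts at this brace; try the next one.
            index = text.find("{", index + 1)
            continue
        if isinstance(value, dict):
            found = value
        # Continue scanning after the object that was just decoded.
        index = text.find("{", end)
    return found


# Made-up response: an echoed example followed by the answer.
sample_response = (
    'Example: {"plate_name": "name", "ingredients": []}\n'
    'Answer: {"plate_name": "Shawerma", "ingredients": ["ingredient 1", "ingredient 2"]}'
)
result = find_last_json_object(sample_response)
print(result)
```

The resulting dictionary could then be re-serialized with `json.dumps` and handed to `parser.parse`, or validated directly against the `Plate` model.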