#### Using a Pydantic Class to Structure the LLM Output

- Pydantic automatically validates data against the types and constraints declared on the class.

- The key advantage of using Pydantic is that the model-generated output is validated. Pydantic raises a `ValidationError` if a required field is missing or if a value cannot be converted to the declared type.

**Summary:** By using a Pydantic class, you're essentially telling the LLM:
- **"Hey, whatever output you generate, make sure it conforms to the structure and format I’ve defined in this Pydantic class."**

#### Key Takeaway
- The LLM generates data, and Pydantic ensures it is **validated** and, where a value can be safely converted, **coerced** to match your requirements.
- This process turns raw, unstructured model output into a typed object that the rest of your program can rely on.

#### Why is this Useful?

1. **Validation**:  
   Pydantic automatically validates the data. If the LLM or any other system gives invalid data, Pydantic will raise clear errors.

2. **Formatting**:  
   Pydantic ensures the output is in a specific format. This is especially useful when working with APIs or structured data from LLMs.

3. **Error Prevention**:  
   By catching issues early, you avoid bugs in later parts of your program.
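
As a minimal sketch of these three points, the snippet below validates a raw JSON string of the kind an LLM might return. It assumes Pydantic v2 (for `model_validate_json`); the `Ticket` model, the `parse_ticket` helper, and the sample payloads are hypothetical and only serve as an illustration.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


class Ticket(BaseModel):
    """Hypothetical schema, used only for this illustration."""

    title: str
    priority: int
    tags: List[str] = Field(default_factory=list)


def parse_ticket(raw_json: str) -> Optional[Ticket]:
    """Return a validated Ticket, or None if the payload is invalid."""
    try:
        # Parses the JSON and validates it against the model in one step (Pydantic v2)
        return Ticket.model_validate_json(raw_json)
    except ValidationError as exc:
        # Each entry names the offending field and explains what went wrong
        for err in exc.errors():
            print(f"{err['loc']}: {err['msg']}")
        return None


# Well-formed payload: every field has the expected type
good = parse_ticket('{"title": "Login fails", "priority": 2, "tags": ["auth"]}')

# Invalid payload: "urgent" cannot be converted to an integer
bad = parse_ticket('{"title": "Login fails", "priority": "urgent"}')

print(good)
print(bad)
```

Returning `None` on failure is only one possible design; re-raising the error or retrying the LLM call are equally reasonable choices, depending on the application.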


##### Ex-1:

In [1]:
from pydantic import BaseModel, ValidationError
from typing import List, Optional

class User(BaseModel):
    id: int  # User ID must be an integer
    name: str  # Name must be a string
    age: Optional[int] = None  # Age is optional but must be an integer if provided
    skills: List[str]  # Skills must be a list of strings

# Correctly formatted data
user = User(id=1, name="Praveen", age=25, skills=["Python", "Django"])
print(user)

# Incorrectly formatted data (raises a validation error)
try:
    invalid_user = User(id="abc", name="John", age="twenty", skills=[123])
except ValidationError as e:
    # The error message lists every field that failed validation
    print(e)

##### Ex-2:
- In the example below, `Optional` (imported with **from typing import Optional**) is shorthand: **Optional[X]** is equivalent to **Union[X, None]**, i.e. the value can be either of type X (e.g., str, int) or None.
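
The equivalence can be checked directly with the standard library. This is a small sketch; the `describe_age` helper is a hypothetical name introduced only for the illustration.

```python
from typing import Optional, Union

# Optional[X] is the same type as Union[X, None]
same_type = Optional[int] == Union[int, None]
print(same_type)


def describe_age(age: Optional[int] = None) -> str:
    """Accept either an int or None (hypothetical helper for illustration)."""
    if age is None:
        return "age unknown"
    return f"{age} years old"


print(describe_age())
print(describe_age(25))
```

Note that `Optional` only describes the allowed type. What lets a caller leave the argument out is the `= None` default, not the annotation itself.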

#### Step 1: LLM Generates the Joke
- The LLM first creates a joke based on your prompt (e.g., "Tell me a joke").
- At this stage, the joke is just raw output—no rules or structure are applied yet.

#### Step 2: Pydantic Class Kicks In
- After the joke is generated, the Pydantic class checks if the output matches the structure you defined (like `setup`, `punchline`, and optional `rating` with their respective data types).
- If something is missing or doesn’t fit the structure (e.g., no `punchline`, or a non-numeric `rating`), Pydantic raises a validation error. Values that can be safely converted (e.g., the string `"8"` for `rating`) are coerced to the declared type.

#### Final Result: A Structured Joke
- Once the LLM’s output is validated and reshaped by the Pydantic class, you get a clean, reliable joke that fits the predefined structure.
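
Step 2 can be tried on its own, without calling a model. The sketch below feeds hand-written dictionaries to a `Joke` class with the fields described above; the dictionaries stand in for raw LLM output and the `validate_joke` helper is a hypothetical name, so treat this as an illustration rather than what LangChain does internally.

```python
from typing import Any, Dict, Optional

from pydantic import BaseModel, Field, ValidationError


class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(
        default=None, description="How funny the joke is, from 1 to 10"
    )


def validate_joke(raw: Dict[str, Any]) -> Optional[Joke]:
    """Validate a raw dict against Joke; return None if it does not fit."""
    try:
        return Joke(**raw)
    except ValidationError as exc:
        print(f"Rejected: {exc.errors()[0]['loc']} {exc.errors()[0]['msg']}")
        return None


# Stand-in for raw model output: the rating arrives as a string and is coerced to int
coerced = validate_joke({
    "setup": "Why did the chicken cross the road?",
    "punchline": "To get to the other side.",
    "rating": "8",
})

# Stand-in for incomplete model output: the required punchline is missing
rejected = validate_joke({"setup": "Why did the chicken cross the road?"})

print(coerced)
print(rejected)
```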


In [None]:
from typing import Optional

from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field

llm = ChatOpenAI(model="gpt-4o-mini")

class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(
        default=None, description="How funny the joke is, from 1 to 10"
    )

structured_llm = llm.with_structured_output(Joke)

structured_llm.invoke("Tell me a Joke")

In [None]:
#### Custom object type, based on user requirements
import os

from dotenv import load_dotenv
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI
from pydantic import BaseModel, Field

# Load GOOGLE_API_KEY from a local .env file
load_dotenv()
api_key = os.getenv("GOOGLE_API_KEY")
model = ChatGoogleGenerativeAI(model="gemini-1.5-pro", api_key=api_key)

def call_json_output_parser():
    prompt = ChatPromptTemplate.from_messages([
        ("system", "Extract information from the following phrase.\nFormatting Instructions: {format_instructions}"),
        ("human", "{phrase}")
    ])

    # Schema describing the fields to extract from the phrase
    class Recipe(BaseModel):
        recipe: str = Field(description="the name of the recipe")
        ingredients: list[str] = Field(description="ingredients")

    # JsonOutputParser uses the model to build the format instructions
    # and returns the parsed result as a plain dict
    parser = JsonOutputParser(pydantic_object=Recipe)
    chain = prompt | model | parser
    return chain.invoke({
        "phrase": "The ingredients for a Margherita pizza are tomatoes, onions, cheese, basil",
        "format_instructions": parser.get_format_instructions()
    })
print(call_json_output_parser())
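
`JsonOutputParser` returns a plain dictionary rather than a model instance, so if you want the validation guarantees discussed earlier, one option is to pass that dictionary to a Pydantic class yourself. The sketch below assumes a dictionary shaped like the chain's result; the `parsed` value is a hand-written stand-in (not real model output) and the `Recipe` model is defined here only to mirror the two fields used above.

```python
from typing import Any, Dict, List

from pydantic import BaseModel, Field, ValidationError


class Recipe(BaseModel):
    """Mirrors the two fields requested from the model (illustrative)."""

    recipe: str = Field(description="the name of the recipe")
    ingredients: List[str] = Field(description="ingredients")


# Hand-written stand-in for the dict a JSON parser might return
parsed: Dict[str, Any] = {
    "recipe": "Margherita pizza",
    "ingredients": ["tomatoes", "onions", "cheese", "basil"],
}

# Validation step: raises ValidationError if the dict does not match the schema
validated = Recipe(**parsed)
print(validated.recipe, validated.ingredients)

# A dict with a missing field is rejected instead of silently passing through
try:
    Recipe(**{"recipe": "Margherita pizza"})
    missing_field_rejected = False
except ValidationError:
    missing_field_rejected = True

print(missing_field_rejected)
```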