# Structured Output

## Structured Output (`with_structured_output`)
Use `with_structured_output` with LLMs that can return their output in a structured format.

### TypedDict

**TypedDict** lets you define a dictionary type in Python by specifying which keys it should contain and the type of each value.  
It helps ensure that your dictionary follows a specific structure.

#### Why use TypedDict?

- It tells type checkers and editors which keys are required and what types their values should have.
- It does not validate data at runtime; it only provides type hints that static tools can check.

In [1]:
# Example
from typing_extensions import TypedDict

class Person(TypedDict):
    name: str
    age: int

p1: Person = {
    'name': "Vivek Kumare",
    'age':24
}

p2: Person = {
    'name':"Bytes Code",
    'age': '21'  # wrong type (str, not int): no runtime error, only a type checker would flag it
}

print(p1)
print(p2)

{'name': 'Vivek Kumare', 'age': 24}
{'name': 'Bytes Code', 'age': '21'}
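
The second dictionary above stores `age` as a string, yet Python prints it without complaint. If a runtime check is wanted, it has to be written by hand. The following is a minimal sketch of such a check; `find_type_mismatches` is a hypothetical helper, and it only handles plain types such as `str` and `int`, not nested or generic annotations.

```python
from typing import TypedDict, get_type_hints


class Person(TypedDict):
    name: str
    age: int


def find_type_mismatches(data: dict, schema: type) -> list[str]:
    """Hypothetical helper: list keys that are missing or hold the wrong type."""
    problems = []
    for key, expected in get_type_hints(schema).items():
        if key not in data:
            problems.append(f"{key}: missing")
        elif not isinstance(data[key], expected):
            actual = type(data[key]).__name__
            problems.append(f"{key}: expected {expected.__name__}, got {actual}")
    return problems


# Creating the dictionary raises nothing, even though 'age' is a string.
p2: Person = {"name": "Bytes Code", "age": "21"}

mismatches = find_type_mismatches(p2, Person)
print(mismatches)  # ['age: expected int, got str']
```

A static type checker would report the same mismatch before the code runs, which is the main way TypedDict is meant to be used.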


In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI
from typing_extensions import TypedDict
from dotenv import load_dotenv

load_dotenv()

Gemini = ChatGoogleGenerativeAI(model='gemini-2.5-flash-lite')

# Structured output example
class Review(TypedDict):
    summary: str
    sentiment: str

StructuredGemini = Gemini.with_structured_output(Review)

review = """I absolutely love the Nova X12. The display is super smooth, the battery easily 
            lasts a full day, and the camera quality is amazing even in low light. Performance 
            is fast with no lag at all. Totally worth the price!"""

response = StructuredGemini.invoke(review)

response

{'summary': 'The Nova X12 has a smooth display, long-lasting battery, amazing camera quality even in low light, and fast performance with no lag, making it worth the price.',
 'sentiment': 'positive'}

In [14]:
print(response['summary'])
print(response['sentiment'])

The Nova X12 has a smooth display, long-lasting battery, amazing camera quality even in low light, and fast performance with no lag, making it worth the price.
positive


In [17]:
from langchain_google_genai import ChatGoogleGenerativeAI
from typing_extensions import TypedDict
from dotenv import load_dotenv

load_dotenv()

Gemini = ChatGoogleGenerativeAI(model='gemini-2.5-flash-lite')

class Review(TypedDict):
    summary: str
    sentiment: str

StructuredGemini = Gemini.with_structured_output(Review)

with open('reviews.txt', 'r') as file:
    reviews = [review.strip() for review in file.readlines()]

response = [StructuredGemini.invoke(review) for review in reviews]

response


[{'summary': 'The Nova X12 has a smooth display, long-lasting battery, amazing camera quality even in low light, and fast performance with no lag, making it worth the price.',
  'sentiment': 'positive'},
 {'summary': 'The phone was a disappointment due to rapid battery drain, overheating during normal use, poor camera quality, and unhelpful customer support.',
  'sentiment': 'negative'},
 {'summary': "The phone is adequate for daily tasks like calls, messages, and basic apps, though it struggles with intensive use. It's neither outstanding nor poor.",
  'sentiment': 'neutral'},
 {'summary': 'Good phone with premium design and excellent performance, but average battery life and high price.',
  'sentiment': 'neutral'},
 {'summary': "This phone is the best I've used due to its incredible fast charging, top-notch camera, and outstanding gaming performance, with no issues encountered since purchase.",
  'sentiment': 'positive'},
 {'summary': 'The phone has a good appearance but poor perform

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI
from typing import Optional
from typing_extensions import TypedDict, Annotated
from dotenv import load_dotenv

load_dotenv()

Gemini = ChatGoogleGenerativeAI(model='gemini-2.5-flash')

#Schema with Annotations
class Review(TypedDict):
    key_themes: Annotated[list[str], "List the key themes discussed in the review, separated by commas."]
    summary: Annotated[str, "Provide a short summary about 7-10 words of the review."]
    sentiment: Annotated[str, "Is the review Positive, Negative or Neutral?"]
    pros: Annotated[Optional[list[str]], "List the pros mentioned in the review, separated by commas."]
    cons: Annotated[Optional[list[str]], "List the cons mentioned in the review, separated by commas."]

StructuredGemini = Gemini.with_structured_output(Review)

with open('reviews.txt', 'r') as file:
    reviews = [review.strip() for review in file.readlines()][:2]

response = [StructuredGemini.invoke(review) for review in reviews]

response

[{'key_themes': ['display',
   'battery life',
   'camera quality',
   'performance',
   'value for money'],
  'summary': 'The Nova X12 receives high praise for its exceptionally smooth display, impressive all-day battery life, and excellent camera performance, even in low light. Users highlight its fast and lag-free operation, deeming it a worthwhile purchase.',
  'sentiment': 'Positive',
  'pros': ['Display is super smooth',
   'Battery easily lasts a full day',
   'Camera quality is amazing even in low light',
   'Performance is fast with no lag at all',
   'Totally worth the price'],
  'cons': []},
 {'key_themes': ['battery life',
   'overheating',
   'camera quality',
   'customer support'],
  'summary': 'The user experienced significant disappointment with the phone due to its fast battery drain, overheating during regular use, and substandard camera performance compared to advertisements. Additionally, customer support was unhelpful.',
  'sentiment': 'negative',
  'pros': [],
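
The strings inside `Annotated[...]` are plain metadata attached to each field; Python does not act on them by itself. The sketch below shows one way to read them back with the standard library. The two-field `Review` here is a shortened, illustrative version of the schema above, and its description strings are assumptions made for this example.

```python
from typing import Annotated, Optional, TypedDict, get_type_hints


class Review(TypedDict):
    summary: Annotated[str, "Provide a short summary of the review."]
    pros: Annotated[Optional[list[str]], "List the pros mentioned in the review."]


# include_extras=True keeps the Annotated wrapper so the metadata stays visible.
hints = get_type_hints(Review, include_extras=True)

# Each Annotated type stores its extra arguments in the __metadata__ tuple.
descriptions = {name: hint.__metadata__[0] for name, hint in hints.items()}

for name, text in descriptions.items():
    print(f"{name}: {text}")
```

Because the descriptions are ordinary metadata, they guide the model only when a library reads them and forwards them as part of the schema.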
  

### Pydantic

**Pydantic** is a data validation and data parsing library for Python.  
It ensures that the data you work with is correct, structured, and type-safe.

#### Why use Pydantic?

- It automatically validates data types at runtime.
- It converts input data into the correct Python types when possible.
- It helps catch bugs early by enforcing strict data schemas.
- It provides clear and readable error messages.
- It is widely used in APIs, especially with frameworks like FastAPI.
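
A minimal sketch of runtime validation and type conversion, assuming Pydantic is installed. The `Product` model and its values are hypothetical and exist only to illustrate the points above.

```python
from pydantic import BaseModel, ValidationError


class Product(BaseModel):
    name: str
    price: float
    stock: int


# Numeric strings are converted to the declared types where possible.
item = Product(name="Nova X12", price="499.99", stock="100")
print(item.price, item.stock)  # 499.99 100

# A value that cannot be converted raises ValidationError at construction time.
try:
    Product(name="Nova X12", price="cheap", stock=3)
    failed_fields = []
except ValidationError as exc:
    # Each error entry records the location of the offending field.
    failed_fields = [err["loc"][0] for err in exc.errors()]

print(failed_fields)  # ['price']
```

Printing the exception itself shows the full, human-readable message for each failed field.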



In [28]:
from langchain_google_genai import ChatGoogleGenerativeAI
from pydantic import BaseModel, Field
from dotenv import load_dotenv

load_dotenv()

Gemini = ChatGoogleGenerativeAI(model='gemini-2.5-flash')

#Schema with Pydantic
class Review(BaseModel):
    key_themes: list[str] = Field(description="List the key themes discussed in the review, separated by commas.")
    summary: str = Field(description="Provide a short summary about 7-10 words of the review.")
    sentiment: str = Field(description="Is the review Positive, Negative or Neutral?")
    pros: list[str] = Field(description="List the pros mentioned in the review, separated by commas.")
    cons: list[str] = Field(description="List the cons mentioned in the review, separated by commas.")

StructuredGemini = Gemini.with_structured_output(Review)

with open('reviews.txt', 'r') as file:
    reviews = [review.strip() for review in file.readlines()][2:4]

response = [StructuredGemini.invoke(review) for review in reviews]

response

[Review(key_themes=['everyday use', 'basic functionality', 'performance'], summary='Phone is okay for basic use, but performance drops with heavy usage.', sentiment='Neutral', pros=['handles calls', 'handles messages', 'handles basic apps'], cons=['performance drops with heavy usage', 'nothing impressive']),
 Review(key_themes=['Design', 'Performance', 'Battery life', 'Price'], summary='Premium design, excellent performance, but average battery and expensive.', sentiment='Neutral', pros=['Premium design', 'Excellent performance'], cons=['Average battery life', 'Expensive'])]

### JSON Schema

**JSON Schema** is a vocabulary that allows you to define the structure, required fields, and data types of JSON data.  
It ensures that JSON data is valid, well-structured, and follows a predefined format.

#### Why use JSON Schema?

- It validates JSON data against a defined structure.
- It enforces required fields and correct data types.
- It helps ensure consistency when exchanging data between systems.
- It is language-agnostic and works across different platforms.
- It is widely used in APIs, configuration files, and data contracts.
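
To make the first two points concrete, here is a hand-rolled sketch that checks a dictionary against a small schema using only the standard library. `check_against_schema` is a hypothetical helper that covers just `required`, three `type` names, and `enum`; it is not a complete JSON Schema validator, and a dedicated validator such as the third-party `jsonschema` package would be the usual choice in practice.

```python
# JSON Schema type names handled by this sketch, mapped to Python types.
JSON_TYPES = {"string": str, "array": list, "object": dict}


def check_against_schema(data: dict, schema: dict) -> list[str]:
    """Hypothetical helper: check required fields, basic types, and enums."""
    errors = []
    for key in schema.get("required", []):
        if key not in data:
            errors.append(f"{key}: missing required field")
    for key, rules in schema.get("properties", {}).items():
        if key not in data:
            continue
        value = data[key]
        expected = JSON_TYPES.get(rules.get("type"))
        if expected is not None and not isinstance(value, expected):
            errors.append(f"{key}: expected {rules['type']}")
        elif "enum" in rules and value not in rules["enum"]:
            errors.append(f"{key}: value not in enum")
    return errors


# A small illustrative schema, not the full review schema.
schema = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "sentiment": {"type": "string", "enum": ["Positive", "Negative", "Neutral"]},
        "pros": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["summary", "sentiment", "pros"],
}

bad = {"summary": "Looks good.", "sentiment": "Mixed"}
errors = check_against_schema(bad, schema)
print(errors)  # ['pros: missing required field', 'sentiment: value not in enum']
```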


In [31]:
# JSON Schema Example

json_schema = {
  "title": "Review",
  "type": "object",
  "properties": {
    "key_themes": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "List the key themes discussed in the review, separated by commas."
    },
    "summary": {
      "type": "string",
      "description": "Provide a short summary about 7-10 words of the review."
    },
    "sentiment": {
      "type": "string",
      "description": "Is the review Positive, Negative or Neutral?",
      "enum": ["Positive", "Negative", "Neutral"]
    },
    "pros": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "List the pros mentioned in the review, separated by commas."
    },
    "cons": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "List the cons mentioned in the review, separated by commas."
    }
  },
  "required": ["key_themes", "summary", "sentiment", "pros", "cons"],
}

In [33]:
from langchain_google_genai import ChatGoogleGenerativeAI
from dotenv import load_dotenv

load_dotenv()

Gemini = ChatGoogleGenerativeAI(model='gemini-2.5-flash')

StructuredGemini = Gemini.with_structured_output(json_schema)

with open('reviews.txt', 'r') as file:
    reviews = [review.strip() for review in file.readlines()][4:6]

response = [StructuredGemini.invoke(review) for review in reviews]

response

[{'key_themes': ['fast charging', 'camera quality', 'gaming performance'],
  'summary': 'Excellent phone with incredible fast charging, top camera, and outstanding gaming.',
  'sentiment': 'Positive',
  'pros': ['fast charging',
   'top-notch camera',
   'outstanding gaming performance',
   'no issues'],
  'cons': []},
 {'key_themes': ['Performance',
   'Camera quality',
   'Battery life',
   'App stability'],
  'summary': 'Looks good, but performance, camera, and battery are disappointing.',
  'sentiment': 'Negative',
  'pros': ['Looks good'],
  'cons': ['Performance is slow',
   'Apps crash frequently',
   'Camera quality is below average',
   'Battery life is disappointing']}]

# When to Use What?

## ✅ Use TypedDict if:

- You only need type hints (basic structure enforcement).
- You don’t need validation (e.g., checking numbers are positive).
- You trust the LLM to return correct data.

---

## ✅ Use Pydantic if:

- You need data validation (e.g., sentiment must be `"positive"`, `"neutral"`, or `"negative"`).
- You need default values if the LLM misses fields.
- You want automatic type conversion (e.g., `"100"` → `100`).
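
A minimal sketch of the first two points, assuming Pydantic is installed. The `Review` model below is a trimmed, illustrative variant of the schema used earlier; the lowercase sentiment labels and the `None` default for `cons` are assumptions made for this example.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ValidationError


class Review(BaseModel):
    summary: str
    # Only these three labels are accepted.
    sentiment: Literal["positive", "neutral", "negative"]
    # Default applied when the field is absent from the input.
    cons: Optional[list[str]] = None


review = Review(summary="Solid phone.", sentiment="positive")
print(review.cons)  # None

# A label outside the allowed set is rejected at construction time.
try:
    Review(summary="Solid phone.", sentiment="great")
    rejected = False
except ValidationError:
    rejected = True

print(rejected)  # True
```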

---

## ✅ Use JSON Schema if:

- You don’t want to import extra Python libraries (like Pydantic).
- You need validation but don’t need Python objects.
- You want to define structure in a standard JSON format.
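
Because a JSON Schema is plain data, it can be written to a file or sent to another system without any Python-specific types. The sketch below round-trips a small, illustrative schema through the standard `json` module; the schema contents are assumptions made for this example.

```python
import json

# A small illustrative schema built from ordinary dicts, lists, and strings.
schema = {
    "title": "Review",
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "sentiment": {"type": "string", "enum": ["Positive", "Negative", "Neutral"]},
    },
    "required": ["summary", "sentiment"],
}

# Serialize to JSON text, as it would be stored or shared with another service.
as_text = json.dumps(schema, indent=2)

# Loading the text back yields an equal structure.
restored = json.loads(as_text)
print(restored == schema)  # True
```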

---

## 🚀 When to Use What? (Comparison Table)

| Feature                      | TypedDict | Pydantic | JSON Schema |
|------------------------------|-----------|----------|-------------|
| Basic structure              | ✅        | ✅       | ✅          |
| Type enforcement             | ✅        | ✅       | ✅          |
| Data validation              | ❌        | ✅       | ✅          |
| Default values               | ❌        | ✅       | ❌          |
| Automatic conversion         | ❌        | ✅       | ❌          |
| Cross-language compatibility | ❌        | ❌       | ✅          |
