# What is Pydantic?

Pydantic is a Python library for data validation, parsing, and settings management, built on top of type hints.
It checks that inputs and outputs follow a declared schema (which can also be exported as JSON Schema), converts compatible values to the right types where possible, and provides detailed error messages when validation fails.

🔑 Key Features of Pydantic:

✅ Validation → Automatically checks that data matches declared types.

✅ Type coercion → Converts compatible types (e.g., "123" → 123).

✅ Error handling → Raises ValidationError with useful messages.

✅ Serialization → Easy conversion to dict or JSON.

✅ Integration → Widely used in FastAPI for request/response models.

In [1]:
pip install pydantic


Successfully installed annotated-types-0.7.0 pydantic-2.11.9 pydantic-core-2.33.2 typing-inspection-0.4.1
Note: you may need to restart the kernel to use updated packages.


In [2]:
from pydantic import BaseModel


# BaseModel
In Pydantic, BaseModel is the foundational class used to define data models. It provides automatic data validation, type conversion, error handling, and serialization/deserialization by enforcing Python type hints on fields.
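
As a rough illustration of those four behaviours, here is a minimal sketch assuming Pydantic v2. The `Point` model and its fields are hypothetical and exist only for this example.

```python
from pydantic import BaseModel, ValidationError


class Point(BaseModel):  # hypothetical model, for illustration only
    x: int
    y: int


# Type conversion: the string "123" is coerced to the int 123.
point = Point(x="123", y=4)

# Serialization to a dict and to a JSON string.
as_dict = point.model_dump()
as_json = point.model_dump_json()

# Deserialization: rebuild a model from a dict or from JSON text.
from_dict = Point.model_validate(as_dict)
from_json = Point.model_validate_json(as_json)

# Error handling: input that cannot be converted raises ValidationError.
try:
    Point(x="abc", y=4)
except ValidationError as exc:
    error_count = exc.error_count()

print(point, as_dict, error_count)
```

Both round trips should produce a model equal to the original `point`.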

# Dataclass
A dataclass is a class decorated with @dataclass from the standard-library dataclasses module. The decorator automatically generates special methods (such as __init__, __repr__, and __eq__) from the class's type-annotated fields, which makes it easier to create classes for storing data without writing boilerplate code.

# 🔹 1. Dataclasses vs Pydantic Models

In [14]:
from dataclasses import dataclass

@dataclass
class User():
    name: str
    age: int
    city: str

person = User(name="Nidhi", age=25, city="Bhopal")
print(person)


User(name='Nidhi', age=25, city='Bhopal')


In [15]:
# Use a new variable name so the User class itself is not overwritten
user = User(name="nidhi", age=25, city=25)
print(user)


User(name='nidhi', age=25, city=25)


⚠️ No error → dataclasses do NOT validate types. They only store data.
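
To get type checks from a plain dataclass you would have to write them yourself. The sketch below shows one possible way, using `__post_init__`; the `CheckedUser` name and the error wording are made up for this example, and this is not how Pydantic works internally.

```python
from dataclasses import dataclass


@dataclass
class CheckedUser:  # hypothetical name, for illustration only
    name: str
    age: int
    city: str

    def __post_init__(self):
        # Hand-written checks: one entry per field, maintained manually.
        expected_types = {"name": str, "age": int, "city": str}
        for field_name, expected in expected_types.items():
            value = getattr(self, field_name)
            if not isinstance(value, expected):
                raise TypeError(
                    f"{field_name} must be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )


try:
    CheckedUser(name="nidhi", age=25, city=25)
except TypeError as exc:
    message = str(exc)
    print(message)
```

This is the kind of boilerplate that a Pydantic model replaces with the type hints alone.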

# Now compare with Pydantic:

In [16]:
from pydantic import BaseModel

class Person1(BaseModel):
    name: str
    age: int
    city: str

person = Person1(name="Nidhi", age=25, city="bhopal")
print(person)


name='Nidhi' age=25 city='bhopal'


In [17]:
person1 = Person1(name="nidhi", age=25, city=12)
print(person1)


ValidationError: 1 validation error for Person1
city
  Input should be a valid string [type=string_type, input_value=12, input_type=int]
    For further information visit https://errors.pydantic.dev/2.11/v/string_type

⚠️ Error → Pydantic enforces type validation (city must be str, not int).
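
In a script you would normally catch the exception rather than let it stop the program. A minimal sketch, assuming Pydantic v2, that reads the structured error details through `ValidationError.errors()`:

```python
from pydantic import BaseModel, ValidationError


class Person1(BaseModel):
    name: str
    age: int
    city: str


try:
    Person1(name="nidhi", age=25, city=12)
except ValidationError as exc:
    # errors() returns one dict per failing field.
    errors = exc.errors()
    for err in errors:
        print(err["loc"], err["type"], err["msg"])
```

Each entry carries the field location (`loc`), a machine-readable `type`, and a human-readable `msg`, which is handy for logging or for building a retry prompt.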

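👉 This is why Pydantic is useful for AI pipelines: data with the wrong shape or type is rejected with a clear error as soon as the model is built, instead of propagating further down the pipeline.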

# 🔹 2. Optional Fields

Optional[T] means the field accepts either a value of type T or None.

A default value (= None, = True) is what makes a field not required.

In [18]:
from typing import Optional

class Employee(BaseModel):
    id: int
    name: str
    department: str
    salary: Optional[float] = None
    is_active: Optional[bool] = True


# Examples:

In [21]:
emp1 = Employee(id=1, name="sunny", department="IT")
print(emp1)  
# salary=None, is_active=True

emp2 = Employee(id=2, name="suraj", department="HR", salary=70000, is_active=False)
print(emp2)


id=1 name='sunny' department='IT' salary=None is_active=True
id=2 name='suraj' department='HR' salary=70000.0 is_active=False


✅ Both valid.
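
One detail that is easy to miss: in Pydantic v2, `Optional[...]` on its own only allows `None` as a value. Without a default the field is still required. The sketch below uses a hypothetical `Profile` model to show the difference.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Profile(BaseModel):  # hypothetical model, for illustration only
    nickname: Optional[str]      # may be None, but must still be passed
    bio: Optional[str] = None    # may be None and may be omitted


# Passing None explicitly is accepted.
profile = Profile(nickname=None)

# Omitting the field entirely is not.
try:
    Profile()
except ValidationError as exc:
    missing_fields = [
        err["loc"] for err in exc.errors() if err["type"] == "missing"
    ]
    print(missing_fields)
```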

# 🔹 3. Lists & Validation

In [23]:
from typing import List

class Classroom(BaseModel):
    room_number: str
    students: List[str]
    capacity: int

classroom = Classroom(
    room_number="B101",
    students=("Sunita", "Neeta", "reeta"),  # tuple auto-converted to list
    capacity=30
)
print(classroom)


room_number='B101' students=['Sunita', 'Neeta', 'reeta'] capacity=30


By default, Pydantic automatically converts compatible types (here, tuple → list).

In [25]:
invalid_val = Classroom(room_number="A1", students=["Siya", 123], capacity=30)


ValidationError: 1 validation error for Classroom
students.1
  Input should be a valid string [type=string_type, input_value=123, input_type=int]
    For further information visit https://errors.pydantic.dev/2.11/v/string_type

⚠️ Error: 123 is not a str.
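
The `students.1` line in the error is the location of the bad element: field `students`, index 1. A small sketch, assuming Pydantic v2, that collects those locations when several list items are invalid:

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Classroom(BaseModel):
    room_number: str
    students: List[str]
    capacity: int


try:
    Classroom(
        room_number="A1",
        students=["Siya", 123, "Riya", None],  # items 1 and 3 are not strings
        capacity=30,
    )
except ValidationError as exc:
    # Each loc is a tuple of (field name, list index).
    bad_positions = [err["loc"] for err in exc.errors()]
    print(bad_positions)
```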

# 🔹 4. Nested Models

In [26]:
class Address(BaseModel):
    street: str
    city: str
    zip_code: int

class Customer(BaseModel):
    customer_id: int
    name: str
    address: Address


Input can use dicts for nested models:

In [28]:
customer = Customer(
    customer_id=1,
    name="Emma",
    address={"street": "123 Main St", "city": "Boston", "zip_code": "04107"}
)
print(customer)


customer_id=1 name='Emma' address=Address(street='123 Main St', city='Boston', zip_code=4107)


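Pydantic coerces "04107" → 4107 (int). Note that the leading zero is lost in the conversion, so identifiers such as ZIP codes are usually safer declared as str.

⚠️ This is powerful: LLMs often return every value as a string, and Pydantic converts those strings to the declared types.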
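
As a sketch of that LLM scenario, the example below parses a raw JSON string with `model_validate_json`. It is a variation on the models above, not a replacement: `zip_code` is declared as `str` here (an assumption made for this example) so the value keeps its leading zero, while `customer_id` is still coerced from a string to an int. The JSON text is invented sample data.

```python
from pydantic import BaseModel


class Address(BaseModel):
    street: str
    city: str
    zip_code: str  # str instead of int, so "04107" stays intact


class Customer(BaseModel):
    customer_id: int
    name: str
    address: Address


# Sample text in the shape an LLM might return: every value is a string.
raw_json = (
    '{"customer_id": "1", "name": "Emma", '
    '"address": {"street": "123 Main St", "city": "Boston", "zip_code": "04107"}}'
)

customer = Customer.model_validate_json(raw_json)
print(customer)
```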

# 🔹 5. Field Customization & Constraints

In [29]:
from pydantic import Field

class Item(BaseModel):
    name: str = Field(min_length=2, max_length=50)
    price: float = Field(gt=0, le=1000)  # >0 and ≤1000
    quantity: int = Field(ge=0)


Validation rules:

min_length, max_length → for strings

gt, ge, lt, le → numeric bounds

In [30]:
item = Item(name="Book", price=10, quantity=10)
print(item)


name='Book' price=10.0 quantity=10


✅ Works.

❌ If price is -5 → validation error.
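
A minimal sketch of the failing case, assuming Pydantic v2. It breaks all three constraints at once and maps each field to the error type Pydantic reports.

```python
from pydantic import BaseModel, Field, ValidationError


class Item(BaseModel):
    name: str = Field(min_length=2, max_length=50)
    price: float = Field(gt=0, le=1000)  # >0 and ≤1000
    quantity: int = Field(ge=0)


try:
    # name is too short, price is not > 0, quantity is not >= 0
    Item(name="B", price=-5, quantity=-1)
except ValidationError as exc:
    failures = {err["loc"][0]: err["type"] for err in exc.errors()}
    print(failures)
```

All violations are reported together in a single ValidationError, rather than stopping at the first one.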

# 🔹 6. Metadata with Field

In [31]:
class User(BaseModel):
    username: str = Field(..., description="Unique username for the user")
    age: int = Field(default=18, description="User age, defaults to 18")
    email: str = Field(default_factory=lambda: "user@example.com", description="Default email address")

print(User.model_json_schema())


{'properties': {'username': {'description': 'Unique username for the user', 'title': 'Username', 'type': 'string'}, 'age': {'default': 18, 'description': 'User age, defaults to 18', 'title': 'Age', 'type': 'integer'}, 'email': {'description': 'Default email address', 'title': 'Email', 'type': 'string'}}, 'required': ['username'], 'title': 'User', 'type': 'object'}


✅ This generates a JSON Schema, useful when integrating with APIs, FastAPI, or LLMs (for structured responses).
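
Constraints declared with Field also end up in the generated schema. The sketch below, assuming Pydantic v2, reuses the Item model from section 5 and pretty-prints its schema with the standard-library `json` module.

```python
import json

from pydantic import BaseModel, Field


class Item(BaseModel):
    name: str = Field(min_length=2, max_length=50)
    price: float = Field(gt=0, le=1000)
    quantity: int = Field(ge=0)


schema = Item.model_json_schema()
print(json.dumps(schema, indent=2))

# The constraints appear as standard JSON Schema keywords.
name_schema = schema["properties"]["name"]
price_schema = schema["properties"]["price"]
quantity_schema = schema["properties"]["quantity"]
```

Here `gt` maps to `exclusiveMinimum`, `le` to `maximum`, `ge` to `minimum`, and the string lengths to `minLength` / `maxLength`.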

# 🚀 Why This Matters in Gen AI

Strict validation → AI outputs can be noisy; Pydantic enforces a clean structure.

Nested models → a good fit for RAG pipelines (Answer + Sources).

Constraints → help reject invalid tool-call arguments in agents.

Schemas → Gen AI frameworks such as LangChain and LlamaIndex use Pydantic models to define structured outputs, and the JSON Schema generated from a model can describe the parameters for OpenAI function calling.
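
To tie these points together, here is a rough sketch of validating a RAG-style response. Everything in it is hypothetical: the `Source` and `Answer` models, the `parse_llm_output` helper, and the two JSON strings standing in for model output. No real LLM or framework is called.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


class Source(BaseModel):  # hypothetical model
    title: str
    url: str


class Answer(BaseModel):  # hypothetical model: Answer + Sources
    answer: str = Field(min_length=1)
    confidence: float = Field(ge=0, le=1)
    sources: List[Source]


def parse_llm_output(raw: str) -> Optional[Answer]:
    """Return an Answer, or None if the raw text fails validation."""
    try:
        return Answer.model_validate_json(raw)
    except ValidationError:
        return None


# Invented sample outputs; the URL is only a placeholder value.
good_output = (
    '{"answer": "Pydantic validates data.", "confidence": "0.9", '
    '"sources": [{"title": "Docs", "url": "https://docs.pydantic.dev"}]}'
)
bad_output = '{"answer": "", "confidence": 1.7, "sources": "none"}'

parsed = parse_llm_output(good_output)
rejected = parse_llm_output(bad_output)
print(parsed)
print(rejected)
```

Returning None is just one possible design; a pipeline could equally re-prompt the model with the error details or raise.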