# Why Pydantic


# Theory: The Real Problem

When building applications, we collect data from users, APIs, or files. Python accepts any value of any type, which creates serious problems:

* Wrong data types go into databases
* Data analysis breaks with unexpected types
* Bugs appear later, not when data enters
* No way to ensure data quality

This is why data validation matters in Python applications. Here is what happens when we don't validate our data:



```python
class Patient:
    def __init__(self, name, age, disease):
        self.name = name
        self.age = age
        self.disease = disease


# user input
p1 = Patient("Ali", "thirty", "Flu")   # age as str (mistake)
p2 = Patient("Sara", 25, "Diabetes")
print(f"{p1.name} {p1.age} {p1.disease}")
print(f"{p2.name} {p2.age} {p2.disease}")
```

Output:

```
Ali thirty Flu
Sara 25 Diabetes
```


**👉 Why this is bad: the database stores `"thirty"` as an age. Later, when you calculate the average age, the arithmetic fails with a `TypeError`.**
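To make the delayed failure concrete, here is a minimal sketch that reuses the unvalidated `Patient` class. The `average_age` helper is hypothetical and only added for illustration; it shows the error surfacing at calculation time rather than when the bad value was entered.

```python
class Patient:
    def __init__(self, name, age, disease):
        self.name = name
        self.age = age
        self.disease = disease


def average_age(patients):
    # Hypothetical helper: assumes every age is a number
    return sum(p.age for p in patients) / len(patients)


patients = [Patient("Ali", "thirty", "Flu"), Patient("Sara", 25, "Diabetes")]

error = None
try:
    average_age(patients)
except TypeError as exc:
    # The bug appears here, far from where "thirty" was accepted
    error = exc
    print(f"Average age failed: {exc}")
```

The constructor accepted the bad value silently; the failure only shows up in unrelated code later on.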

# Chapter 2: Type Hints - First Solution Attempt

## Theory: What Are Type Hints?

Python added type hints to suggest what type each variable should be. They help:

* Developers understand code better
* IDEs provide better suggestions
* Code documentation becomes clearer

> **Important:** Type hints are just suggestions - Python doesn't enforce them!

```python
class Patient:
    def __init__(self, name: str, age: int, disease: str):
        self.name = name
        self.age = age
        self.disease = disease

# user input
p1 = Patient("Ali", "thirty", "Flu")   # age as str (mistake)
p2 = Patient("Sara", 25, "Diabetes")
print(f"{p1.name} {p1.age} {p1.disease}")
print(f"{p2.name} {p2.age} {p2.disease}")
```

Output:

```
Ali thirty Flu
Sara 25 Diabetes
```


**👉 Type hints don't enforce rules at runtime.**

They only provide warnings in editors (PyCharm, VSCode) or with tools like `mypy`.

> **Warning:** Invalid data can still slip in — for example, the database may still store `"thirty"` as a string for age.
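To enforce a hint without any library, you would have to write the check by hand in every class. The sketch below shows that manual approach; the `isinstance` check and the error message are illustrative assumptions, not part of the original example.

```python
class Patient:
    def __init__(self, name: str, age: int, disease: str):
        # Manual runtime check: the type hint alone does nothing here
        if not isinstance(age, int):
            raise TypeError(f"age must be an int, got {type(age).__name__}: {age!r}")
        self.name = name
        self.age = age
        self.disease = disease


try:
    Patient("Ali", "thirty", "Flu")
    rejected = False
except TypeError as exc:
    rejected = True
    print(exc)  # age must be an int, got str: 'thirty'
```

This works, but repeating such checks for every field of every class quickly becomes tedious and easy to get wrong.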

![Pydantic-feature](https://github.com/Khalil-Haider/Backend/raw/main/pydantic/pydantic_feature.svg)

# Chapter 3: **Pydantic – The Real Solution**

👉 **Solution:** Use **Pydantic**.

It **takes type hints seriously** and turns them into **real rules**.

---

## What is Pydantic?

Pydantic is a **data validation and settings management library** that:

* **Enforces type hints** (not just suggestions).
* **Validates automatically** → rejects wrong input.
* **Performs smart conversions** when possible.
* **Raises errors early** (fail fast principle).
* **Supports serialization** (convert objects into dict/JSON).
* **Is extremely fast** because validation runs in **Rust** (`pydantic-core`).

> **Key Concept:**
> A **Pydantic Model** is a Python class that inherits from `BaseModel`.
> It describes your data schema (shape + rules) in one place.

---



## Key Terms to Remember

* **Schema** → the structure & rules of your data (from type hints).
* **Validation** → checking that input matches the schema.
* **Serialization** → turning models into dicts/JSON.
* **Fail Fast** → catch problems immediately at input.
* **Strict Mode** → disallow auto-conversion, enforce exact types.

```python
from pydantic import BaseModel

class Patient(BaseModel):  # Must inherit from BaseModel
    name: str
    age: int
    email: str

# This works - correct types
patient1 = Patient(name="John", age=30, email="john@email.com")
print(patient1)  # name='John' age=30 email='john@email.com'
```

Output:

```
name='John' age=30 email='john@email.com'
```
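Validation and serialization, two of the key terms above, can be seen together in a short sketch. It assumes Pydantic v2, where `model_dump()` and `model_dump_json()` are the serialization methods; the variable names are illustrative.

```python
from pydantic import BaseModel


class Patient(BaseModel):
    name: str
    age: int


# Validation: the input is checked against the schema ("30" becomes 30)
patient = Patient(name="John", age="30")

# Serialization: the model is turned back into plain data
as_dict = patient.model_dump()
as_json = patient.model_dump_json()

print(as_dict)  # {'name': 'John', 'age': 30}
print(as_json)  # {"name":"John","age":30}
```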


## Smart Type Conversion

Pydantic doesn't just reject wrong types - it intelligently converts data when possible:

* ✅ String `"30"` → Integer `30` (successful conversion)
* ❌ String `"thirty"` → `ValidationError` (cannot convert)

> **Key Feature:** This is especially useful when handling real-world data, like form submissions or API responses, where numbers often come as strings.

```python
from pydantic import BaseModel

class Patient(BaseModel):  # Must inherit from BaseModel
    name: str
    age: int
    email: str

# Pydantic converts "30" to 30 automatically!
patient2 = Patient(name="Alice", age="30", email="alice@email.com")
print(patient2)  # age=30 (integer, not string!)
print(type(patient2.age))  # <class 'int'>
```

Output:

```
name='Alice' age=30 email='alice@email.com'
<class 'int'>
```


**This fails - can't convert "thirty" to int**

```python
from pydantic import BaseModel

class Patient(BaseModel):  # Must inherit from BaseModel
    name: str
    age: int
    email: str

# Pydantic raises ValidationError: "Thirty" cannot be parsed as an integer
patient2 = Patient(name="Alice", age="Thirty", email="alice@email.com")
print(patient2)  # never reached
```
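In practice you would usually catch the error instead of letting the program stop. A minimal sketch, assuming Pydantic v2, where `ValidationError.errors()` returns a list of dictionaries describing each problem:

```python
from pydantic import BaseModel, ValidationError


class Patient(BaseModel):
    name: str
    age: int
    email: str


errors = []
try:
    Patient(name="Alice", age="Thirty", email="alice@email.com")
except ValidationError as exc:
    errors = exc.errors()
    for err in errors:
        # loc = which field failed, type = machine-readable error code
        print(err["loc"], err["type"], err["msg"])
```

Each entry names the failing field (`('age',)`) and an error type (`int_parsing`), which is handy for building user-facing messages.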

Pydantic also validates container types such as lists and dictionaries, including the types of their items:

```python
from pydantic import BaseModel
from typing import List, Dict

class Patient(BaseModel):  # Must inherit from BaseModel
    name: str
    age: int
    allergies: List[str] = []  # Example list
    medical_records: Dict[str, str] = {}  # Example dictionary


# Example usage
patient1 = Patient(
    name="John",
    age=30,
    allergies=["Penicillin", "Peanuts"],
    medical_records={"2023-01-01": "Flu", "2024-05-10": "Checkup"}
)

print(patient1)
```

Output:

```
name='John' age=30 allergies=['Penicillin', 'Peanuts'] medical_records={'2023-01-01': 'Flu', '2024-05-10': 'Checkup'}
```


# Chapter 4: Optional Fields and Defaults

This chapter answers the following questions:

* What happens if a field is missing?
* What if we don’t mark it `Optional`?
* What if we give a default value vs. no default?
* Does Pydantic raise an error or silently fill?

---

## 1. Theory: Handling Missing Data

In real-world applications, **not all fields are always provided**.
For example, in a hospital system:

* Some fields are **mandatory** (name, age).
* Some fields are **optional** (phone, secondary email).
* Some fields may have a **default value** (country="Pakistan" if not given).

👉 **Pydantic gives us full control** using:

* `Optional[...]` for optional fields.
* `=` (assignment) for default values.
* Combination of both.

## 2. Case 1 – Required Field (No Default)

## Required Fields Error Example

> ✔ **Important:** If a field has **no default**, Pydantic **forces you to provide it**
> (in Pydantic v2 this holds even when the type is `Optional[...]`).

```python
from pydantic import BaseModel

class Patient(BaseModel):
    name: str
    age: int

# ❌ Missing "age" -> raises ValidationError
patient = Patient(name="Ali")
```
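The cell above stops with an exception. A small sketch of what that error contains, assuming Pydantic v2 error codes (the `missing` type and `Field required` message):

```python
from pydantic import BaseModel, ValidationError


class Patient(BaseModel):
    name: str
    age: int


errors = []
try:
    Patient(name="Ali")  # "age" is required but not provided
except ValidationError as exc:
    errors = exc.errors()
    print(errors[0]["loc"], errors[0]["type"], errors[0]["msg"])
    # ('age',) missing Field required
```

So Pydantic does not silently fill a required field; it fails fast and names the missing field.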

## 3. Case 2 – Field With Default

```python
from pydantic import BaseModel

class Patient(BaseModel):
    name: str
    age: int = 30     # Field with default

# ✅ "age" omitted, so the default (30) is used
patient = Patient(name="Ali")
print(patient)
```

Output:

```
name='Ali' age=30
```


## 4. Case 3 – Optional Fields with Default Values

### Using `None` as Default

> **Key Points:**
> When you make a field `Optional` with a default value of `None`:
> * Can be omitted when creating the object
> * Will automatically be set to `None` if not provided
> * Can explicitly receive `None` as a value
> * If user provides a value, it will use the user's value instead of the default
>
> ```python
> # Example:
> patient1 = Patient(name="Ali", age=30)  # phone will be None (default)
> patient2 = Patient(name="Sara", age=25, phone="123-456-7890")  # phone will be "123-456-7890" (user value)
> ```

```python
from typing import Optional
from pydantic import BaseModel

class Patient(BaseModel):
    name: str
    age: int
    phone: Optional[str] = None

# ✅ "phone" omitted, so it defaults to None
patient = Patient(name="Ali", age=30)
print(patient)
```

Output:

```
name='Ali' age=30 phone=None
```
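One combination is easy to get wrong: `Optional[...]` *without* a default. The sketch below assumes Pydantic v2 behavior, where such a field may hold `None` but must still be passed explicitly. The class and variable names are illustrative.

```python
from typing import Optional
from pydantic import BaseModel, ValidationError


class Patient(BaseModel):
    name: str
    phone: Optional[str]  # may be None, but has no default -> still required


# Passing None explicitly is allowed
explicit_none = Patient(name="Ali", phone=None)

missing = []
try:
    Patient(name="Ali")  # omitting "phone" is an error
except ValidationError as exc:
    missing = exc.errors()
    print(missing[0]["loc"], missing[0]["type"])  # ('phone',) missing
```

In other words, `Optional` describes which values are allowed, while the default (`= None`) is what makes the field omittable.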


# Chapter 5: Advanced Data Validation

## Theory:  Basic Types Aren't Enough

Just checking `str` or `int` isn't enough for real-world data validation:

* ✉️ Email needs proper format validation (`user@domain.com`)
* 🔢 Age must be a positive number (can't be -30)
* 📱 Phone numbers need pattern matching (`xxx-xxx-xxxx`)
* 📝 Names should be proper case

> **Key Point:** Pydantic provides specialized validators and tools for real-world data requirements.

```python
# EmailStr needs the optional email-validator package: pip install "pydantic[email]"
from pydantic import BaseModel, EmailStr, ValidationError

class Patient(BaseModel):
    name: str
    age: int
    email: EmailStr  # Validates email format automatically

# This works
patient1 = Patient(name="John", age=30, email="john@gmail.com")

# This fails - invalid email
try:
    patient2 = Patient(name="Bob", age=25, email="not-an-email")
except ValidationError:
    print("Invalid email format!")
```

Output:

```
Invalid email format!
```


## `Field()`: Custom Validation (Constraints), Metadata, and Default Values

## 1. Why Do We Need `Field()`?

Using **type hints** with `BaseModel` gives us basic validation,  
but real-world applications usually require **stricter rules**:

- ✅ Numbers should be **positive** or within a certain **range**  
- ✅ Strings may need a **regex pattern** or specific **length limits**  
- ✅ Default values should still go through validation  
- ✅ Fields might need **aliases** (different names pointing to the same value)

👉 That’s where `Field()` helps.  
It lets us attach **extra metadata**, a **description**, and **validation rules** directly to model fields.

---



## 2. Default Values with Field()

### Two ways to provide defaults:
**Default values for fields can be provided using the normal assignment syntax or by providing a value to the default argument:**

```python
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str = "John"                # normal default
    age: int = Field(default=20)      # Field default

print(User())
# name='John' age=20
```

Output:

```
name='John' age=20
```


🔑 Difference:

* `= "John"` is shorthand, good for simple cases.
* `Field(default=20)` is more powerful → because later we can add constraints (e.g., `ge=0`) alongside.

## 3. Validating Default Values

By default, Pydantic does **not validate default values**.
To enforce validation, use `validate_default=True`.

👉 Without `validate_default=True`, the model below would silently keep `"twelve"`.

👉 With it, Pydantic raises an error as soon as the model is created.

---

```python
from pydantic import BaseModel, Field, ValidationError

class User(BaseModel):
    age: int = Field(default="twelve", validate_default=True)

try:
    User()
except ValidationError as e:
    print(e)
```

Output:

```
1 validation error for User
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='twelve', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/int_parsing
```


## 🌐 4. Aliases (Renaming Fields) 

In real-world projects, **data often comes from different sources** — API, database, frontend, etc.
But each source might use **different field names** for the same concept.

* API sends: `username`
* Database column: `user_name`
* Internal Python code: `name`

Without aliases, you’d have to **rename fields manually everywhere**, which is messy and error-prone.

---

## ✅ The Solution: `alias`

Pydantic’s `Field(alias=...)` tells the model:
*"If incoming data uses this alias, map it to my internal field name."*

```python
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str = Field(alias="username")

# Incoming API request
user = User(username="khalil")

print(user.name)   # internal clean name -> 'khalil'
print(user.model_dump(by_alias=True))  # {'username': 'khalil'}
```

Output:

```
khalil
{'username': 'khalil'}
```


* **Input:** Accepts external data with `username`.
* **Internal:** You work with `name` in Python code (clean, consistent).
* **Output:** You can return `username` back to API if needed.

---

## 🚀 Where We Need It

1. **FastAPI Requests & Responses**

   * API clients might send `user_id`, but your internal model uses `id`.
   * Response JSON might need `fullName`, but you keep it as `full_name`.

2. **Database Mapping**

   * DB column is `user_name`, but you want your Python field `name`.

3. **Third-Party Integrations**

   * External service sends fields in camelCase (`userName`) but you want snake\_case (`user_name`).

---

## 🔑 Variants

* `alias` → Used for both **input** (validation) and **output** (when dumping with `by_alias=True`).
* `validation_alias` → Only for **input** (validation).
* `serialization_alias` → Only for **output** (when dumping to dict/JSON).

---

👉 So, **aliases solve the “different names in different places” problem**, keeping your Python code clean while still working with APIs/databases that use their own naming.
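One consequence worth seeing in code: once a field has an alias, the model by default expects the alias on input, not the Python field name. The sketch below assumes Pydantic v2 and uses the `populate_by_name` config option to accept both; `FlexibleUser` is a hypothetical name used only for this illustration.

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class User(BaseModel):
    name: str = Field(alias="username")


class FlexibleUser(BaseModel):
    # Accept either the alias or the field name on input
    model_config = ConfigDict(populate_by_name=True)

    name: str = Field(alias="username")


try:
    User(name="khalil")  # field name instead of alias
    accepted_by_default = True
except ValidationError:
    accepted_by_default = False

by_name = FlexibleUser(name="khalil")
by_alias = FlexibleUser(username="khalil")

print(accepted_by_default)          # False
print(by_name.name, by_alias.name)  # khalil khalil
```

This is useful when the same model is built both from external payloads (alias) and from internal code (field name).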


## 5. Numeric Constraints

We can enforce conditions on numbers directly in the field, using these keyword arguments:

* `gt` → greater than
* `lt` → less than
* `ge` → greater than or equal to
* `le` → less than or equal to
* `multiple_of` → divisibility check.
* `allow_inf_nan` → allow the float values `inf`, `-inf`, and `nan`.

---

```python
from pydantic import BaseModel, Field

class Order(BaseModel):
    quantity: int = Field(gt=0)          # must be > 0
    discount: float = Field(ge=0, le=1)  # 0 <= discount <= 1
    even_number: int = Field(multiple_of=2)
```
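The `Order` model above is only defined, not exercised. Here is a short usage sketch, assuming Pydantic v2 error codes; the input values are made up for illustration, and all three constraints are violated at once to show that Pydantic reports every failure together.

```python
from pydantic import BaseModel, Field, ValidationError


class Order(BaseModel):
    quantity: int = Field(gt=0)          # must be > 0
    discount: float = Field(ge=0, le=1)  # 0 <= discount <= 1
    even_number: int = Field(multiple_of=2)


# Valid input passes
ok = Order(quantity=3, discount=0.25, even_number=10)

failures = {}
try:
    Order(quantity=0, discount=1.5, even_number=7)
except ValidationError as exc:
    # Map each failing field to its error code
    failures = {err["loc"][0]: err["type"] for err in exc.errors()}
    print(failures)
    # {'quantity': 'greater_than', 'discount': 'less_than_equal', 'even_number': 'multiple_of'}
```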

## 6. String Constraints

For text validation, these `Field()` arguments constrain strings:

* `min_length`: Minimum length of the string.
* `max_length`: Maximum length of the string.
* `pattern`: A regular expression that the string must match.

✔ Regex patterns are great for validating codes, phone numbers, and emails.

```python
from pydantic import BaseModel, Field

class Message(BaseModel):
    short: str = Field(min_length=3)
    password: str = Field(min_length=8, max_length=20)

    # Code must be exactly 6 characters, only uppercase letters and numbers
    # Example: "ABC123", "12WXYZ"
    code: str = Field(pattern=r"^[A-Z0-9]{6}$")
```
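A quick usage sketch for the `Message` model, assuming Pydantic v2 error codes. The sample values are invented so that each field breaks a different constraint.

```python
from pydantic import BaseModel, Field, ValidationError


class Message(BaseModel):
    short: str = Field(min_length=3)
    password: str = Field(min_length=8, max_length=20)
    code: str = Field(pattern=r"^[A-Z0-9]{6}$")


# Valid input passes
ok = Message(short="hey", password="secure123", code="ABC123")

failures = {}
try:
    # "hi" is too short, the password is 25 chars, the code has lowercase letters
    Message(short="hi", password="x" * 25, code="abc123")
except ValidationError as exc:
    failures = {err["loc"][0]: err["type"] for err in exc.errors()}
    print(failures)
    # {'short': 'string_too_short', 'password': 'string_too_long', 'code': 'string_pattern_mismatch'}
```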

## Decimal Constraints

These `Field()` arguments constrain `Decimal` values:

* `max_digits`: Maximum number of digits within the Decimal
  * Does not include a zero before the decimal point
  * Does not include trailing decimal zeroes
* `decimal_places`: Maximum number of decimal places allowed
  * Does not include trailing decimal zeroes




```python
from decimal import Decimal
from pydantic import BaseModel, Field

class Foo(BaseModel):
    # Maximum 5 digits total, with 2 decimal places
    # Valid: "123.45", "99.99", "1.23"
    # Invalid: "1234.56" (too many digits), "123.456" (too many decimal places)
    precise: Decimal = Field(max_digits=5, decimal_places=2)

# This works
foo = Foo(precise=Decimal('123.45'))
print(foo)
#> precise=Decimal('123.45')
```

Output:

```
precise=Decimal('123.45')
```


```python
from decimal import Decimal
from pydantic import BaseModel, Field

class Price(BaseModel):
    value: Decimal = Field(max_digits=5, decimal_places=2)

p = Price(value="123.45")  # the string is converted to Decimal
print(p.value)  # 123.45
```

Output:

```
123.45
```


> **Note:** This is particularly useful for financial calculations, where you need to control precision.
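The examples above only show accepted values. A small sketch of the rejected side, using the same `Price` model; the sample inputs are illustrative, and the exact error codes are deliberately not shown here.

```python
from decimal import Decimal
from pydantic import BaseModel, Field, ValidationError


class Price(BaseModel):
    value: Decimal = Field(max_digits=5, decimal_places=2)


rejected = []
for raw in ["1234.56", "1.234"]:  # 6 digits in total; 3 decimal places
    try:
        Price(value=raw)
    except ValidationError:
        rejected.append(raw)

print(rejected)  # ['1234.56', '1.234']
```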

## 1️⃣ What is `Annotated`?

`Annotated` (from `typing`) lets us **attach extra metadata** to a type hint.
In Pydantic, we usually attach a `Field(...)` to a type using `Annotated`.

Example (simple field with constraints):

```python
from typing import Annotated
from pydantic import BaseModel, Field

class User(BaseModel):
    age: Annotated[int, Field(ge=0, le=120)]
```

**✔ Here, `age` must be between 0 and 120.**

**Without `Annotated`, the same constraint is written as the field's default value:**
```python
age: int = Field(ge=0, le=120)
```

> **Note:** While this works, `Annotated` is the more modern and flexible approach, and it is more powerful: it also works inside nested/compound types.

## 2️⃣ `Annotated` with Metadata

**You can also attach extra metadata (like description, title, JSON schema).**


```python
from typing import Annotated
from pydantic import BaseModel, Field

class Product(BaseModel):
    name: Annotated[str, Field(description="The product name", min_length=3)]
    price: Annotated[float, Field(gt=0, description="Must be positive")]
```

✔ This metadata is useful in **FastAPI docs**, OpenAPI schema, and auto-generated forms.

## 3️⃣ `Annotated` inside Compound Types (Lists, Sets, etc.)

This is the **big use case**: applying constraints to items inside a list/set.

### Example: Marks in a List

```python
from typing import Annotated
from pydantic import BaseModel, Field

class Scores(BaseModel):
    # Each mark must be between 0 and 100
    marks: list[Annotated[int, Field(ge=0, le=100)]]
```

Test it:

```python
Scores(marks=[90, 50, 101])
# ❌ ValidationError: marks.2: Input should be less than or equal to 100
```


✔ Every element in the list gets validated individually.
Without `Annotated`, `Field()` constraints apply to the **list as a whole** (for example its length), not to its items.

---

## ✅ Why This is Useful?

* **Without Annotated**: you can constrain `list` length, but not each item.
* **With Annotated**: you can say “this list must contain valid items, and each item has its own constraints.”
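Both levels can be combined. The sketch below assumes Pydantic v2 and nests two `Annotated` layers: the inner one constrains each item, the outer one constrains the list itself. The `min_length=1` rule is an illustrative addition, not part of the original example.

```python
from typing import Annotated
from pydantic import BaseModel, Field, ValidationError


class Scores(BaseModel):
    # Inner Field: each mark is 0..100. Outer Field: the list needs at least one mark.
    marks: Annotated[
        list[Annotated[int, Field(ge=0, le=100)]],
        Field(min_length=1),
    ]


item_errors = []
try:
    Scores(marks=[90, 50, 101])
except ValidationError as exc:
    item_errors = [(err["loc"], err["type"]) for err in exc.errors()]
    print(item_errors)  # [(('marks', 2), 'less_than_equal')]

list_errors = []
try:
    Scores(marks=[])
except ValidationError as exc:
    list_errors = [(err["loc"], err["type"]) for err in exc.errors()]
    print(list_errors)  # [(('marks',), 'too_short')]
```

Note how the error location includes the index (`2`) when an item fails, but only the field name when the list as a whole fails.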

---

🔑 **Summary:**

* `Annotated` = attach metadata/constraints to a type.
* Works with single fields (`int`, `str`, `float`) or compound (`list`, `dict`).
* Helps generate rich API docs (description, schema).
* Solves the problem of **validating items inside lists/sets**.

---

# Chapter 5: Field Constraints & Strict Mode

## 1. Why Do We Need Field()?

Type hints + BaseModel already give us basic validation. But in real applications, we often need **extra rules**:

- **Age must be positive** (not just an integer)
- **Email must follow a specific domain** (like only @company.com)
- **Password must have minimum length**
- **Price must be greater than 0**

**Problem:** Basic type hints can't handle these business rules.

**Solution:** Use `Field()` function to add constraints!

---

## 2. Default Values with Field()

### Basic Default Values
```python
from pydantic import BaseModel, Field

class User(BaseModel):
    # Two ways to set defaults:
    name: str = 'John Doe'           # Normal assignment
    age: int = Field(default=20)     # Using Field()
    
user = User()  # No input needed
print(user)
# Output: name='John Doe' age=20
```

### Validate Default Values
By default, Pydantic **doesn't validate** default values. Enable it with `validate_default=True`:

```python
from pydantic import BaseModel, Field, ValidationError

class User(BaseModel):
    # This default is invalid (string instead of int)
    age: int = Field(default='twelve', validate_default=True)

try:
    user = User()  # This will fail!
except ValidationError as e:
    print("Error: Default value 'twelve' is not a valid integer")
```

---

## 3. Numeric Constraints

Control numeric values with these parameters:

- `gt` - greater than
- `lt` - less than  
- `ge` - greater than or equal to
- `le` - less than or equal to
- `multiple_of` - must be a multiple of given number
- `allow_inf_nan` - allow infinity/NaN values

```python
from pydantic import BaseModel, Field, ValidationError

class Product(BaseModel):
    price: float = Field(gt=0)                    # Must be > 0
    discount: int = Field(ge=0, le=100)           # 0-100%
    quantity: int = Field(multiple_of=5)          # Must be 5, 10, 15...
    rating: float = Field(ge=1.0, le=5.0)        # 1-5 stars
    
# This works
product = Product(price=99.99, discount=20, quantity=10, rating=4.5)

# This fails - price can't be 0
try:
    bad_product = Product(price=0, discount=20, quantity=10, rating=4.5)
except ValidationError:
    print("Price must be greater than 0!")
```

---

## 4. String Constraints

Control string values with:

- `min_length` - minimum characters
- `max_length` - maximum characters  
- `pattern` - regex pattern to match

```python
from pydantic import BaseModel, Field, ValidationError

class UserAccount(BaseModel):
    username: str = Field(min_length=3, max_length=20)
    password: str = Field(min_length=8)
    phone: str = Field(pattern=r'^\d{10}$')  # Exactly 10 digits
    
# This works
user = UserAccount(
    username="john_doe", 
    password="secure123", 
    phone="1234567890"
)

# This fails - username too short
try:
    bad_user = UserAccount(username="jo", password="secure123", phone="1234567890")
except ValidationError:
    print("Username must be at least 3 characters!")
```

---

## 5. Field Aliases

Sometimes the field name in your code differs from the input/output name:

### Basic Alias (for both input and output)
```python
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str = Field(alias='username')

# Input uses alias
user = User(username='johndoe')  
print(user.name)  # Access using field name
# Output: johndoe

# Output can use alias
print(user.model_dump(by_alias=True))
# Output: {'username': 'johndoe'}
```

### Separate Aliases for Input/Output
```python
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str = Field(
        validation_alias='input_name',      # For input
        serialization_alias='output_name'   # For output
    )

# Input
user = User(input_name='johndoe')

# Output  
print(user.model_dump(by_alias=True))
# Output: {'output_name': 'johndoe'}
```

---

## 6. Real-World Example: Patient Registration

```python
from pydantic import BaseModel, Field, EmailStr
from typing import Optional

class Patient(BaseModel):
    # Name constraints
    full_name: str = Field(
        min_length=2, 
        max_length=100,
        alias='patient_name'
    )
    
    # Age constraints  
    age: int = Field(ge=0, le=150)
    
    # Email with domain restriction
    email: EmailStr = Field(pattern=r'.*@hospital\.com$')
    
    # Phone number format
    phone: str = Field(pattern=r'^\+?1?\d{9,15}$')
    
    # Emergency contact (optional)
    emergency_contact: Optional[str] = Field(
        default=None,
        min_length=10
    )
    
    # Insurance ID
    insurance_id: str = Field(
        min_length=8,
        max_length=20,
        pattern=r'^INS\d+$'  # Must start with 'INS' followed by numbers
    )

# Valid patient
patient = Patient(
    patient_name="John Smith",
    age=35,
    email="john.smith@hospital.com",
    phone="+1234567890",
    insurance_id="INS12345678"
)

print("Patient registered successfully!")
```

---

## 7. Strict Mode

By default, Pydantic tries to **convert** values to the correct type:

```python
from pydantic import BaseModel

# Normal (Lax) Mode - Auto conversion
class User(BaseModel):
    age: int
    score: float

# These work - strings get converted
user1 = User(age="30", score="95.5")  # ✅ Converts to int/float
print(user1)  # age=30 score=95.5
```

But sometimes you want **exact types only**:

### Enable Strict Mode

**Option 1: Per Model**
```python
from pydantic import BaseModel, ValidationError

class StrictUser(BaseModel):
    model_config = {"strict": True}
    
    age: int
    score: float

# This fails - no auto conversion
try:
    user = StrictUser(age="30", score="95.5")
except ValidationError:
    print("Strict mode: strings not allowed for numbers!")
```

**Option 2: Per Field**
```python
from pydantic import BaseModel, Field, ValidationError

class MixedUser(BaseModel):
    age: int = Field(strict=True)      # Must be exact int
    name: str                          # Default (lax) mode applies here
    
# This fails
try:
    user = MixedUser(age="30", name="John")
except ValidationError:
    print("Age must be exact integer, not string!")
```

**Option 3: Per Validation Call**
```python
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    age: int
    
# Normal validation (with conversion)
user1 = User(age="30")  # ✅ Works

# Strict validation (no conversion)  
try:
    user2 = User.model_validate({"age": "30"}, strict=True)
except ValidationError:
    print("Strict validation: no conversion allowed!")
```

---

## 8. When to Use Strict Mode

**Use Strict Mode When:**
- Working with APIs where exact types are critical
- Parsing configuration files 
- Processing financial data (no accidental conversions)
- You want to catch type errors early

**Use Lax Mode When:**
- Processing user input (forms, URLs)
- Reading from databases or files
- Working with external APIs that might return inconsistent types

---

## 9. Key Takeaways

1. **Field() adds business rules** beyond basic type checking
2. **Numeric constraints** control ranges and multiples  
3. **String constraints** control length and format
4. **Aliases** separate internal names from external APIs
5. **Strict mode** disables auto-conversion for exact type matching
6. **Choose the right mode** based on your data source and requirements

**Next:** We'll explore Field Validators for custom validation logic and data transformation!