# Pydantic: A Comprehensive Guide

## What is Pydantic?

- **Definition:**
  - Pydantic models resemble Python's dataclasses and add features on top of them:
    - **Validation:** Ensures data meets specified criteria.
    - **Serialization:** Converts data into formats suitable for other applications.
    - **Data Transformation:** Alters data shapes as needed.
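
The three features above can be seen in a few lines. This is a minimal sketch, assuming Pydantic v2; the `Reading` model and its fields are hypothetical and exist only for illustration.

```python
from pydantic import BaseModel, ValidationError


class Reading(BaseModel):  # hypothetical model, for illustration only
    sensor: str
    value: float


# Transformation: in the default (lax) mode the numeric string is coerced to float.
reading = Reading(sensor="a1", value="21.5")

# Serialization: to a plain dict and to a JSON string.
as_dict = reading.model_dump()
as_json = reading.model_dump_json()

# Validation: a value that cannot be read as a float is rejected.
try:
    Reading(sensor="a1", value="warm")
    error_count = 0
except ValidationError as exc:
    error_count = exc.error_count()

print(as_dict)
print(as_json)
print(error_count)
```
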

## A Really Basic Example

- **Objective:**
  - Ensure a function receives both `first_name` and `last_name` as strings.


In [None]:
from pydantic import BaseModel

class MyFirstModel(BaseModel):
    first_name: str
    last_name: str

validating = MyFirstModel(first_name="marc", last_name="nealer")
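
The model above succeeds silently when both names are strings. The sketch below, assuming Pydantic v2, shows what happens when a field is missing or has the wrong type; the helper `describe_errors` is a hypothetical name introduced here, not part of Pydantic.

```python
from pydantic import BaseModel, ValidationError


class MyFirstModel(BaseModel):
    first_name: str
    last_name: str


def describe_errors(**data):
    """Hypothetical helper: return (location, error type) pairs, or [] if valid."""
    try:
        MyFirstModel(**data)
    except ValidationError as exc:
        return [(err["loc"], err["type"]) for err in exc.errors()]
    return []


# last_name is left out entirely.
missing = describe_errors(first_name="marc")

# last_name is an int; Pydantic v2 does not coerce int to str.
wrong_type = describe_errors(first_name="marc", last_name=42)

print(missing)
print(wrong_type)
```

Each entry in `exc.errors()` is a dictionary, so the failures can be logged or returned to a caller without parsing the exception message.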

In [3]:
import nbformat as nbf

# Create a new notebook
nb = nbf.v4.new_notebook()

# List of cells to add to the notebook
cells = [
    # Title and Introduction
    {
        "cell_type": "markdown",
        "source": "# A Practical Guide to using Pydantic\n### Based on the Medium Article by Marc Nealer\n\n**Source:** [Medium Article](https://medium.com/@marcnealer/a-practical-guide-to-using-pydantic-8aafa7feebf6)\n\n**Published:** June 22, 2024\n\n---"
    },
    # Introduction Section
    {
        "cell_type": "markdown",
        "source": "## Introduction\n\n- **Author's Journey:**\n  - Started experimenting with FastAPI.\n  - Discovered Pydantic as an essential tool within FastAPI.\n  - Initial challenges included a steep learning curve and multiple approaches for similar tasks.\n\n- **Author's Opinion:**\n  - Despite initial hurdles, Pydantic is highly powerful and ranks in the top 10 Python libraries.\n  - Emphasizes the importance of understanding Pydantic to leverage its full potential.\n\n- **Version Note:**\n  - Focuses on **Pydantic v2.\\***.\n  - Significant differences exist between versions 1 and 2.\n  - Cautions against using AI tools like ChatGPT or Gemini for Pydantic coding due to potential version mix-ups."
    },
    # What is Pydantic Section
    {
        "cell_type": "markdown",
        "source": "---\n## What is Pydantic?\n\n- **Definition:**\n  - Pydantic models resemble Python's dataclasses and add features on top of them:\n    - **Validation:** Ensures data meets specified criteria.\n    - **Serialization:** Converts data into formats suitable for other applications.\n    - **Data Transformation:** Alters data shapes as needed.\n\n- **Use Cases:**\n  - Validating incoming data.\n  - Transforming data structures.\n  - Preparing data for serialization and transmission."
    },
    # A Really Basic Example Section
    {
        "cell_type": "markdown",
        "source": "---\n## A Really Basic Example\n\n- **Objective:**\n  - Ensure a function receives both `first_name` and `last_name` as strings.\n\n- **Code Example:**"
    },
    {
        "cell_type": "code",
        "source": "from pydantic import BaseModel\n\nclass MyFirstModel(BaseModel):\n    first_name: str\n    last_name: str\n\nvalidating = MyFirstModel(first_name=\"marc\", last_name=\"nealer\")\nprint(validating)"
    },
    {
        "cell_type": "markdown",
        "source": "- **Key Points:**\n  - Pydantic models resemble Python dataclasses.\n  - Unlike dataclasses, Pydantic enforces type validation and raises errors for mismatches.\n  - **Default Validation:** Type checking is performed automatically."
    },
    # Handling Optional Parameters Section
    {
        "cell_type": "markdown",
        "source": "---\n## Handling Optional Parameters\n\n- **Challenge:**\n  - Managing optional fields with expected typing behaviors.\n\n- **Code Example:**"
    },
    {
        "cell_type": "code",
        "source": "from pydantic import BaseModel\nfrom typing import Union, Optional\n\nclass MySecondModel(BaseModel):\n    first_name: str\n    middle_name: Union[str, None] = None  # Optional: has a default, so it may be omitted\n    title: Optional[str]                  # Required: must be sent, but can be None\n    last_name: str"
    },
    {
        "cell_type": "markdown",
        "source": "- **Explanation:**\n  - **`Union[str, None]` and `Optional[str]`:** Equivalent annotations; both only allow `None` as a value.\n  - **Required vs. optional:** In Pydantic v2 a field may be omitted only if it has a default, so `middle_name` (default `None`) is optional while `title` must be sent but can be `None`.\n  - Pydantic leverages Python's `typing` library for comprehensive type validations.\n\n- **Advanced Types Example:**"
    },
    {
        "cell_type": "code",
        "source": "from pydantic import BaseModel\nfrom typing import Union, List, Dict\nfrom datetime import datetime\n\nclass MyThirdModel(BaseModel):\n    name: Dict[str, str]\n    skills: List[str]\n    holidays: List[Union[str, datetime]]"
    },
    # Applying Default Values Section
    {
        "cell_type": "markdown",
        "source": "---\n## Applying Default Values\n\n- **Scenario:**\n  - Assign default values to fields when data is missing.\n\n- **Initial Approach:**"
    },
    {
        "cell_type": "code",
        "source": "from pydantic import BaseModel\n\nclass DefaultsModel(BaseModel):\n    first_name: str = \"jane\"\n    middle_names: list = []\n    last_name: str = \"doe\""
    },
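    {
        "cell_type": "markdown",
        "source": "- **Issue:**\n  - In plain Python, a mutable default such as a list is shared between uses, leading to unintended side effects. Pydantic copies such defaults for each instance, so `[]` is safe here, but the intent stays implicit and the habit does not carry over to other code.\n\n- **Solution:** Use `Field` with `default_factory`:"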
    },
    {
        "cell_type": "code",
        "source": "from pydantic import BaseModel, Field\n\nclass DefaultsModel(BaseModel):\n    first_name: str = \"jane\"\n    middle_names: list = Field(default_factory=list)\n    last_name: str = \"doe\""
    },
    {
        "cell_type": "markdown",
        "source": "- **Benefits:**\n  - Ensures each instance gets its own separate list.\n  - Prevents shared mutable defaults.\n\n- **Note on `Field`:**\n  - Versatile but can complicate models if overused.\n  - Recommended primarily for defaults and default factories."
    },
    # Nesting Models Section
    {
        "cell_type": "markdown",
        "source": "---\n## Nesting Models\n\n- **Purpose:**\n  - Organize complex data structures by embedding models within models.\n\n- **Code Example:**"
    },
    {
        "cell_type": "code",
        "source": "from pydantic import BaseModel\n\nclass NameModel(BaseModel):\n    first_name: str\n    last_name: str\n\nclass UserModel(BaseModel):\n    username: str\n    name: NameModel"
    },
    {
        "cell_type": "markdown",
        "source": "- **Advantages:**\n  - Promotes modularity and reusability.\n  - Simplifies validation of nested data structures."
    },
    # Custom Validation Section
    {
        "cell_type": "markdown",
        "source": "---\n## Custom Validation\n\n- **Need for Custom Validation:**\n  - Beyond basic type checks, additional constraints are often required.\n\n- **Types of Custom Validators:**\n  - **Before Validators:** Modify or validate data before default validation.\n  - **After Validators:** Perform checks after default validation.\n  - **Wrap Validators:** Act like middleware, handling actions before and after.\n\n- **Default Validation Context:**\n  - **Field-Level:** Validates individual fields.\n  - **Model-Level:** Validates the entire model, allowing inter-field dependencies.\n\n### Field Validation\n\n- **Approach:**\n  - Utilize decorators to define validation functions for specific fields.\n  - Prefer `Annotated` validators for clarity and maintainability.\n\n- **Code Example:**"
    },
    {
        "cell_type": "code",
        "source": "from pydantic import BaseModel, BeforeValidator\nfrom typing import Annotated\nimport datetime\n\ndef stamp2date(value):\n    # Validators signal failure with ValueError; Pydantic wraps it in a ValidationError\n    if not isinstance(value, float):\n        raise ValueError(\"Incoming date must be a timestamp\")\n    try:\n        res = datetime.datetime.fromtimestamp(value)\n    except (ValueError, OverflowError, OSError):\n        raise ValueError(\"Timestamp appears to be invalid\")\n    return res\n\nclass DateModel(BaseModel):\n    dob: Annotated[datetime.datetime, BeforeValidator(stamp2date)]"
    },
    {
        "cell_type": "markdown",
        "source": "- **Explanation:**\n  - **`BeforeValidator`:** Transforms a timestamp (float) into a `datetime` object before default validation.\n  - Ensures that the `dob` field receives a valid `datetime` object.\n\n- **Multiple Validators Example:**"
    },
    {
        "cell_type": "code",
        "source": "from pydantic import BaseModel, BeforeValidator, AfterValidator\nfrom typing import Annotated\nimport datetime\n\ndef one_year(value):\n    # Runs after default validation, so value is already a datetime\n    if value < datetime.datetime.today() - datetime.timedelta(days=365):\n        raise ValueError(\"The date must be less than a year old\")\n    return value\n\ndef stamp2date(value):\n    # Validators signal failure with ValueError; Pydantic wraps it in a ValidationError\n    if not isinstance(value, float):\n        raise ValueError(\"Incoming date must be a timestamp\")\n    try:\n        res = datetime.datetime.fromtimestamp(value)\n    except (ValueError, OverflowError, OSError):\n        raise ValueError(\"Timestamp appears to be invalid\")\n    return res\n\nclass DateModel(BaseModel):\n    dob: Annotated[datetime.datetime, BeforeValidator(stamp2date), AfterValidator(one_year)]"
    },
    {
        "cell_type": "markdown",
        "source": "- **Usage Insights:**\n  - **`BeforeValidator`:** Ideal for data transformation and preliminary checks.\n  - **`AfterValidator`:** Suitable for additional constraints post-transformation.\n  - **`WrapValidator`:** Not extensively covered; author seeks community input on use cases.\n\n- **Handling Optional Fields with Validators:**"
    },
    {
        "cell_type": "code",
        "source": "from pydantic import BaseModel, BeforeValidator, Field\nfrom typing import Annotated\nimport datetime\n\ndef stamp2date(value):\n    # Validators signal failure with ValueError; Pydantic wraps it in a ValidationError\n    if not isinstance(value, float):\n        raise ValueError(\"Incoming date must be a timestamp\")\n    try:\n        res = datetime.datetime.fromtimestamp(value)\n    except (ValueError, OverflowError, OSError):\n        raise ValueError(\"Timestamp appears to be invalid\")\n    return res\n\nclass DateModel(BaseModel):\n    dob: Annotated[Annotated[datetime.datetime, BeforeValidator(stamp2date)] | None, Field(default=None)]"
    },
    {
        "cell_type": "markdown",
        "source": "- **Key Points:**\n  - Combines `BeforeValidator` with `Optional` typing.\n  - Allows `dob` to be omitted or set to `None`, while still enabling transformation when provided."
    },
    # Model Validation Section
    {
        "cell_type": "markdown",
        "source": "### Model Validation\n\n- **Scenario:**\n  - Ensuring at least one out of multiple optional fields is provided.\n\n- **Code Example:**"
    },
    {
        "cell_type": "code",
        "source": "from pydantic import BaseModel, model_validator\nfrom typing import Union, Any\n\nclass AllOptionalAfterModel(BaseModel):\n    param1: Union[str, None] = None\n    param2: Union[str, None] = None\n    param3: Union[str, None] = None\n\n    @model_validator(mode=\"after\")\n    def there_must_be_one(self):\n        if not (self.param1 or self.param2 or self.param3):\n            raise ValueError(\"One parameter must be specified\")\n        return self\n\nclass AllOptionalBeforeModel(BaseModel):\n    param1: Union[str, None] = None\n    param2: Union[str, None] = None\n    param3: Union[str, None] = None\n\n    @model_validator(mode=\"before\")\n    @classmethod\n    def there_must_be_one(cls, data: Any):\n        # Raw input is usually a dict, but check before calling .get()\n        if isinstance(data, dict) and not (data.get(\"param1\") or data.get(\"param2\") or data.get(\"param3\")):\n            raise ValueError(\"One parameter must be specified\")\n        return data"
    },
    {
        "cell_type": "markdown",
        "source": "- **Explanation:**\n  - **After Validation (`AllOptionalAfterModel`):**\n    - Decorated method runs after the fields have been validated.\n    - Accesses fields via `self` and must return `self`.\n  - **Before Validation (`AllOptionalBeforeModel`):**\n    - Decorated method runs before field validation.\n    - Accesses raw input data (typically a dictionary).\n    - Must return the (possibly modified) data.\n\n- **Important Notes:**\n  - **Decorator Order:** `@model_validator(mode=\"before\")` must precede `@classmethod`.\n  - **Error Handling:** Raise `ValueError` or `AssertionError` inside a validator; Pydantic collects these into a `ValidationError`.\n  - **Data Modification:** Before validators can alter input data before validation."
    },
    # Aliases in Pydantic Section
    {
        "cell_type": "markdown",
        "source": "---\n## Aliases in Pydantic\n\n- **Purpose:**\n  - Handle discrepancies between incoming data field names and model field names.\n  - Facilitate data transformation during validation and serialization.\n\n- **Types of Aliases:**\n  - **Validation Aliases:** Incoming data has different field names than the model.\n  - **Serialization Aliases:** Change field names when outputting serialized data.\n\n- **Common Issue:**\n  - Combining defaults with field-level aliases using `Field()` can be problematic.\n\n- **Solution: Define Aliases at the Model Level:**"
    },
    {
        "cell_type": "code",
        "source": "from pydantic import AliasGenerator, BaseModel, ConfigDict\n\nclass Tree(BaseModel):\n    model_config = ConfigDict(\n        alias_generator=AliasGenerator(\n            validation_alias=lambda field_name: field_name.upper(),\n            serialization_alias=lambda field_name: field_name.title(),\n        )\n    )\n    age: int\n    height: float\n    kind: str\n\n# Usage\n\nt = Tree.model_validate({'AGE': 12, 'HEIGHT': 1.2, 'KIND': 'oak'})\nprint(t.model_dump(by_alias=True))  # Output: {'Age': 12, 'Height': 1.2, 'Kind': 'oak'}"
    },
    {
        "cell_type": "markdown",
        "source": "- **Key Points:**\n  - **Alias Generation:** Uses lambdas to transform field names during validation and serialization.\n  - **Serialization Control:** `model_dump(by_alias=True)` ensures output uses serialization aliases."
    },
    # AliasChoices Section
    {
        "cell_type": "markdown",
        "source": "### AliasChoices\n\n- **Use Case:**\n  - Handle multiple possible incoming field names for the same model field.\n\n- **Code Example:**"
    },
    {
        "cell_type": "code",
        "source": "from pydantic import BaseModel, ConfigDict, AliasGenerator, AliasChoices\n\n# \"surname\" is a family name, so it belongs with last_name\naliases = {\n    \"first_name\": AliasChoices(\"fname\", \"forename\", \"first_name\"),\n    \"last_name\": AliasChoices(\"lname\", \"surname\", \"family_name\", \"last_name\")\n}\n\nclass FirstNameChoices(BaseModel):\n    model_config = ConfigDict(\n        alias_generator=AliasGenerator(\n            validation_alias=lambda field_name: aliases.get(field_name, None)\n        )\n    )\n    title: str\n    first_name: str\n    last_name: str"
    },
    {
        "cell_type": "markdown",
        "source": "- **Explanation:**\n  - **`AliasChoices`:** Specifies multiple possible names for a single field during validation.\n  - **Inclusion of Actual Field Name:** Ensures compatibility when serializing and deserializing.\n\n- **Practical Implications:**\n  - Facilitates integration with diverse data sources where field naming conventions vary.\n  - Simplifies data ingestion by accommodating various naming schemas."
    },
    # AliasPath Section
    {
        "cell_type": "markdown",
        "source": "### AliasPath\n\n- **Use Case:**\n  - Extract field values from nested dictionaries or lists within incoming data.\n\n- **Code Example:**"
    },
    {
        "cell_type": "code",
        "source": "from pydantic import BaseModel, ConfigDict, AliasGenerator, AliasPath\n\naliases = {\n    \"first_name\": AliasPath(\"name\", \"first_name\"),\n    \"last_name\": AliasPath(\"name\", \"last_name\")\n}\n\nclass FirstNameChoices(BaseModel):\n    model_config = ConfigDict(\n        alias_generator=AliasGenerator(\n            validation_alias=lambda field_name: aliases.get(field_name, None)\n        )\n    )\n    title: str\n    first_name: str\n    last_name: str\n\n# Usage\n\nobj = FirstNameChoices(**{\n    \"name\": {\"first_name\": \"marc\", \"last_name\": \"Nealer\"},\n    \"title\": \"Master Of All\"\n})\nprint(obj)"
    },
    {
        "cell_type": "markdown",
        "source": "- **Explanation:**\n  - **`AliasPath`:** Specifies the path to nested fields within the incoming data.\n  - **Flattening Data:** Extracts `first_name` and `last_name` from the nested `name` dictionary, presenting them at the model's top level.\n\n- **Benefits:**\n  - Streamlines data models by abstracting nested structures.\n  - Enhances readability and usability of models by flattening complex data."
    },
    # Using AliasPath and AliasChoices Section
    {
        "cell_type": "markdown",
        "source": "### Combining AliasChoices and AliasPath\n\n- **Objective:**\n  - Leverage both `AliasChoices` and `AliasPath` for flexible and robust data handling.\n\n- **Code Example:**"
    },
    {
        "cell_type": "code",
        "source": "from pydantic import BaseModel, ConfigDict, AliasGenerator, AliasPath, AliasChoices\n\naliases = {\n    \"first_name\": AliasChoices(\"first_name\", AliasPath(\"name\", \"first_name\")),\n    \"last_name\": AliasChoices(\"last_name\", AliasPath(\"name\", \"last_name\"))\n}\n\nclass FirstNameChoices(BaseModel):\n    model_config = ConfigDict(\n        alias_generator=AliasGenerator(\n            validation_alias=lambda field_name: aliases.get(field_name, None)\n        )\n    )\n    title: str\n    first_name: str\n    last_name: str\n\n# Usage\n\nobj = FirstNameChoices(**{\n    \"name\": {\"first_name\": \"marc\", \"last_name\": \"Nealer\"},\n    \"title\": \"Master Of All\"\n})\nprint(obj)"
    },
    {
        "cell_type": "markdown",
        "source": "- **Explanation:**\n  - **Combined Aliases:** Allows `first_name` and `last_name` to be sourced either directly or from a nested `name` dictionary.\n  - **Enhanced Flexibility:** Supports multiple data formats and sources seamlessly."
    },
    # Final Thoughts Section
    {
        "cell_type": "markdown",
        "source": "---\n## Final Thoughts\n\n- **Overall Impression:**\n  - **Pydantic:** Highly powerful and versatile library.\n  - **Complexity:** Offers multiple methods to achieve similar outcomes, which can be overwhelming initially.\n\n- **Learning Curve:**\n  - Understanding and effectively utilizing Pydantic requires substantial effort.\n  - The guide aims to accelerate the learning process by providing clear examples and explanations.\n\n- **Caution with AI Tools:**\n  - AI services like ChatGPT and Gemini may provide inconsistent or incorrect information regarding Pydantic versions.\n  - **Recommendations:**\n    - Avoid relying on AI tools for Pydantic-specific coding.\n    - Refer to official documentation and trusted resources instead."
    },
    # Summary Section
    {
        "cell_type": "markdown",
        "source": "---\n## Summary\n\nMarc Nealer's guide serves as a comprehensive introduction to Pydantic, particularly focusing on version 2.*. It covers foundational concepts like data validation and serialization, delves into advanced topics such as custom validators and alias handling, and provides practical code examples to illustrate each concept. While acknowledging the library's complexity, the guide emphasizes Pydantic's strengths and offers strategies to navigate its multifaceted features effectively. Additionally, it highlights potential pitfalls when using AI tools for Pydantic development, advocating for reliance on authoritative resources.\n\n---\n# References\n\n- **Pydantic Documentation:** [https://pydantic-docs.helpmanual.io/](https://pydantic-docs.helpmanual.io/)\n- **FastAPI Documentation:** [https://fastapi.tiangolo.com/](https://fastapi.tiangolo.com/)\n- **Medium Article by Marc Nealer:** [A Practical Guide to using Pydantic](https://medium.com/@marcnealer/a-practical-guide-to-using-pydantic-8aafa7feebf6)"
    }
]

# Add cells to the notebook
for cell in cells:
    if cell["cell_type"] == "markdown":
        nb.cells.append(nbf.v4.new_markdown_cell(cell["source"]))
    elif cell["cell_type"] == "code":
        nb.cells.append(nbf.v4.new_code_cell(cell["source"]))

# Write the notebook to a file
with open("Pydantic_Guide_Summary.ipynb", "w", encoding='utf-8') as f:
    nbf.write(nb, f)

print("Jupyter Notebook 'Pydantic_Guide_Summary.ipynb' has been created successfully.")


Jupyter Notebook 'Pydantic_Guide_Summary.ipynb' has been created successfully.
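
The custom validators in the generated notebook signal failure by raising an exception. The sketch below assumes Pydantic v2, where a validator raises `ValueError` (or `AssertionError`) and Pydantic itself wraps it in a `ValidationError`. It reuses the `stamp2date` and `DateModel` names from the notebook in a shortened form.

```python
import datetime
from typing import Annotated

from pydantic import BaseModel, BeforeValidator, ValidationError


def stamp2date(value):
    # Raise ValueError inside the validator; Pydantic converts it
    # into a ValidationError for the caller.
    if not isinstance(value, float):
        raise ValueError("Incoming date must be a timestamp")
    return datetime.datetime.fromtimestamp(value)


class DateModel(BaseModel):
    dob: Annotated[datetime.datetime, BeforeValidator(stamp2date)]


# A float timestamp passes the before-validator and becomes a datetime.
ok = DateModel(dob=1_700_000_000.0)

# A string is rejected by the before-validator.
try:
    DateModel(dob="2024-01-01")
    first_error = None
except ValidationError as exc:
    first_error = exc.errors()[0]

print(type(ok.dob).__name__)
print(first_error["type"], first_error["loc"])
```

The caller only ever catches `ValidationError`; the `ValueError` raised inside the validator is reported as an error of type `value_error` at the field's location.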

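
For the optional-parameter section of the notebook, it helps to see which fields Pydantic v2 actually treats as required. In v2, `Optional[str]` and `Union[str, None]` only widen the accepted type; a field can be omitted only when it has a default. This is a minimal sketch; the `NameParts` model and the `missing_fields` helper are hypothetical names used for illustration.

```python
from typing import Optional, Union

from pydantic import BaseModel, ValidationError


class NameParts(BaseModel):  # hypothetical model, for illustration only
    first_name: str
    middle_name: Union[str, None]   # required, but None is accepted
    title: Optional[str]            # same meaning as Union[str, None]
    suffix: Optional[str] = None    # has a default, so it may be omitted


def missing_fields(**data):
    """Hypothetical helper: names of fields reported as missing."""
    try:
        NameParts(**data)
    except ValidationError as exc:
        return sorted(
            err["loc"][0] for err in exc.errors() if err["type"] == "missing"
        )
    return []


# Only first_name is supplied: both nullable fields without defaults are missing.
absent = missing_fields(first_name="marc")

# Passing None explicitly satisfies the required nullable fields.
explicit_none = NameParts(first_name="marc", middle_name=None, title=None)

print(absent)
print(explicit_none.suffix)
```
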
