# Run-Time Data Validation Frameworks in Python

## Background

Static type checkers (like mypy) only analyze code before it runs; they can’t stop bad data at runtime. Python’s dynamic nature means that incorrect types or malformed objects can easily slip through unless you enforce rules during execution. Run-time validation libraries fill that gap by adding guardrails that catch incorrect values while the program is running. This notebook walks through the spectrum—from lightweight decorators to full-featured validation frameworks—so you understand how they differ and when each is appropriate.


## Composable Libraries: Breaking Down the Systems Used in Applications

Instead of relying on one monolithic tool, Python gives you building blocks: type hinting, decorators, structured data classes, and validation utilities. Together, they let you enforce contracts in plain Python code without committing to a heavyweight framework. This section explains those parts and how they fit together in real systems.
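
To make the "building blocks" idea concrete, here is a minimal sketch of a hand-rolled decorator that enforces type hints at call time using only the standard library. The name `enforce_types` is hypothetical, and the sketch only handles hints that are plain classes such as `int` or `str`; it is not how any particular library implements its checks.

```python
import functools
import inspect
from typing import get_type_hints


def enforce_types(func):
    """Hypothetical helper: check arguments and return value against plain-class hints."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Map the supplied arguments onto parameter names.
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes are checked; generics such as list[int] are skipped.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(f"{name}={value!r} is not an instance of {expected.__name__}")
        result = func(*args, **kwargs)
        expected = hints.get("return")
        if isinstance(expected, type) and not isinstance(result, expected):
            raise TypeError(f"return value {result!r} is not an instance of {expected.__name__}")
        return result

    return wrapper


@enforce_types
def add(x: int, y: int) -> int:
    return x + y
```

Calling `add(3, 5)` passes the checks, while `add('3', '5')` raises a `TypeError` before the function body runs. Dedicated libraries cover far more of the typing system than this sketch attempts.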

---

### 1. Type Checking with `beartype`

`beartype` adds runtime type enforcement to ordinary Python functions and classes. It inspects type hints and inserts fast checks that run whenever a function is called or a dataclass is instantiated. The key insight: type hints are normally ignored by Python, but `beartype` makes them *real*. This section highlights the strengths and limitations of this approach—especially how easy it is to bypass if you’re not careful with object mutation.

**Exercise**: Find two ways to run this function and violate the type hints, without raising an error.

In [6]:
def add(x: int, y: int) -> float:
    return x + y

In [7]:
add('3', '5')

'35'

**Exercise**: Below is the same `add` function, this time with the `@beartype` decorator. Run your solutions from the previous exercise again and see whether they can get past `beartype` without raising an error. What changed?

In [47]:
from beartype import beartype

@beartype
def add(x: int, y: int) -> float:
    return x + y

add('3', '5')

BeartypeCallHintParamViolation: Function __main__.add() parameter x='3' violates type hint <class 'int'>, as str '3' not instance of int.

**Exercise**: Find two different ways to instantiate the `Person` object below that violate its type annotations but don't raise an error.

In [10]:
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int


Person(3, 'nick')

Person(name=3, age='nick')

**Exercise**: Add `@beartype` above the `@dataclass` decorator on the `Person` dataclass, and re-run the previous exercise. Does `beartype` catch the violations?

In [11]:
@beartype
@dataclass
class Person:
    name: str
    age: int


Person(3, 'nick')

BeartypeCallHintParamViolation: Method __main__.Person.__init__() parameter name=3 violates type hint <class 'str'>, as int 3 not instance of str.

**Exercise**: Where is `beartype` **not** working? Bypass `beartype` by assigning a value of the wrong type to a field of a `Person` instance *after* the object has been created. This should succeed: the checks ran when `__init__` was called, and the later assignment is not checked. Print the resulting object. (Note: Pylance or mypy should flag the assignment with a red squiggly, even though `beartype` won't.)

In [14]:
a = Person('nick', 3)
a.name = 3.3
a


Person(name=3.3, age=3)
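
One way to close this gap without leaving the standard library is to re-check each field whenever it is assigned. The sketch below is an illustration only: `CheckedPerson` is a hypothetical stand-in for `Person`, and the check only handles annotations that are plain classes.

```python
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class CheckedPerson:
    """Hypothetical stand-in for Person that re-checks fields on every assignment."""
    name: str
    age: int

    def __setattr__(self, key, value):
        # Look up the annotation of the field being assigned.
        expected = get_type_hints(type(self)).get(key)
        # Only plain classes (str, int, ...) are handled in this sketch.
        if isinstance(expected, type) and not isinstance(value, expected):
            raise TypeError(f"{key}={value!r} is not an instance of {expected.__name__}")
        super().__setattr__(key, value)


person = CheckedPerson('nick', 3)
try:
    person.name = 3.3  # the same mutation as above, now rejected
except TypeError as exc:
    message = str(exc)
```

Because the generated `__init__` also assigns through `__setattr__`, the same check runs at creation time. The `attrs` and `pydantic` sections below show library-supported versions of this idea.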

---

### 2. Validated Domain Objects with `attrs`

`attrs` focuses on building clean, explicit data models. It generates boilerplate for you (like `__init__` and `__repr__`), but it also supports field validators that run whenever you create an object. This turns plain Python classes into reliable domain objects that enforce invariants such as “age must be positive.” The section shows how to use types, converters, and custom validators to encode your rules directly in the data structure.

#### **Type** Validation with `attrs`

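`attrs` doesn’t validate types automatically. Annotations alone are not enforced, so you must opt in with a converter or a validator such as `validators.instance_of()`. Once you do, classes built with `@define` run those converters and validators both when an object is created and whenever a field is assigned, which makes domain models safer and easier to reason about. This part demonstrates how that works and what kinds of structural guarantees you can enforce.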
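
As a small illustration of opting in, the sketch below attaches `validators.instance_of()` to each field. The `Point` class is hypothetical and not part of the notebook. Unlike a converter, this validator rejects values of the wrong type rather than coercing them.

```python
from attr import define, field, validators


@define
class Point:  # hypothetical class used only for this illustration
    x: float = field(validator=validators.instance_of(float))
    y: float = field(validator=validators.instance_of(float))


point = Point(1.0, 2.0)

try:
    Point("1.0", 2.0)  # wrong type at creation
except TypeError:
    rejected_on_creation = True
else:
    rejected_on_creation = False

try:
    point.x = "oops"  # wrong type on assignment
except TypeError:
    rejected_on_assignment = True
else:
    rejected_on_assignment = False
```

Note that `instance_of(float)` is strict: `Point(1, 2.0)` is rejected as well, because `1` is an `int`. If coercion is what you want, a converter is the better fit.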

**Example**:   Make an `attrs`-based object with this interface: `Rectangle(length=4.3, width=1.2)`:

In [15]:
from attr import define, field

@define
class Rectangle:
    length: float = field(converter=float)
    width: float = field(converter=float)


Rectangle(3, 4)

Rectangle(length=3.0, width=4.0)

**Exercise**: Make an `attrs`-based object with this interface: `Person(name='Emma', age=3)`:

In [27]:
def string(el):
    return str(el)

@define
class Person:
    name: str = field(converter=string)
    age: int = field(converter=int)


a = Person('nick', '55')
a.name = 55
a

Person(name='55', age=55)

**Exercise**: Ensure the `Person` produces valid types on instance creation: Does it raise an error if the wrong type is supplied to a field?

**Exercise**: Ensure the `Person` produces valid types on instance **modification**: Does it raise an error if the wrong type is supplied to a field?

#### **Value** Validation with `attrs`

Built-in validators are useful, but real-world data almost always needs custom rules. This section shows how to write small validator functions that enforce domain-specific logic (e.g., non-empty strings, positive numbers). These validators scale nicely as your application grows.

Built-In Validators Reference:  https://www.attrs.org/en/stable/api.html#module-attrs.validators

| Validator | Description |
| :-- | :-- | 
| **`validators.lt()`** | Check a value is less than some threshold |
| **`validators.le()`** | Check a value is less-than or equal-to some threshold |
| **`validators.gt()`** | Check a value is greater than some threshold |
| **`validators.ge()`** | Check a value is greater-than or equal-to some threshold |
| **`validators.in_()`** | Check a value is one of a given collection of options |


You can also write custom validators.

**Example**: Make sure that a string is never empty:

In [23]:
from attr import define, field, validators

def non_blank_string(instance, attribute, value: str):
    if len(value) == 0:
        raise ValueError(f"{attribute.name} must not be blank")

@define
class Contact:
    email: str = field(converter=str, validator=[non_blank_string])

Contact(email='')

ValueError: email must not be blank

**Exercise**: Either using custom validator functions or supplied validator functions from `attrs.validators`, make a `Person(name: str, age: int)` data structure that requires: 
  1. The first letter of `Person.name` is upper-cased (hint: `str.isupper()`)
  2. `Person.age` is positive.

In [33]:
def starts_with_upper(instance, attribute, value: str):
    # A validator should reject bad values, not silently rewrite them.
    if value and not value[0].isupper():
        raise ValueError(f"{attribute.name} must start with an upper-case letter")


@define
class Person:
    name: str = field(converter=str, validator=[starts_with_upper])
    # validators.gt(0) enforces the "age is positive" requirement.
    age: int = field(converter=int, validator=[validators.gt(0)])


a = Person('Hiya', '55')
a

Person(name='Hiya', age=55)

## Larger Frameworks for Data Validation: Exploring `Pydantic`

`Pydantic` goes beyond simple containers and acts as a full validation engine. It parses input, coerces types, runs validators, and produces clean, stable objects that you can safely depend on. It’s especially useful when reading untrusted data (e.g. JSON, API requests, configuration files) because it refuses malformed inputs by default. This section illustrates why many production systems adopt Pydantic for reliability.
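
As a minimal sketch of the untrusted-input case, the example below parses JSON strings with Pydantic v2's `model_validate_json()`. The `Signup` model and the helper `invalid_fields` are hypothetical names chosen for this illustration.

```python
from pydantic import BaseModel, Field, ValidationError


class Signup(BaseModel):  # hypothetical model used only for this illustration
    email: str = Field(min_length=3)
    age: int = Field(ge=0)


# Well-formed input is parsed into a typed object.
good = Signup.model_validate_json('{"email": "a@example.org", "age": 42}')


def invalid_fields(raw: str):
    """Return the locations of the fields that failed validation."""
    try:
        Signup.model_validate_json(raw)
    except ValidationError as exc:
        return [error["loc"] for error in exc.errors()]
    return []


bad = invalid_fields('{"email": "a@example.org", "age": -1}')
```

Malformed input never produces a `Signup` instance. The caller gets a `ValidationError` whose `errors()` list says which fields were rejected.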

### Validated Objects

`Pydantic` models automatically enforce type rules and transform incoming values. You write your domain schema once, and Pydantic handles the conversions and checks. Here, you see how to enforce positivity, string constraints, and other common rules without extra boilerplate.

**Example**: Create a `Rectangle(length: float, width: float)` object with `pydantic`, requiring:
  1. both `length` and `width` are always positive

In [49]:
from pydantic import BaseModel, Field, ConfigDict

class Rectangle(BaseModel):
    model_config = ConfigDict(validate_assignment=True)
    length: float = Field(gt=0)  # gt=0 rejects zero as well as negative values
    width: float = Field(gt=0)


Rectangle(length=3, width=4)

Rectangle(length=3.0, width=4.0)

**Exercise**: Create a `Person(name: str, age: int)` object with `pydantic`, requiring:
  1. `Person.name` is never empty (hint: `min_length=1`)
  2. `Person.age` is never negative (hint: `ge=0`).

In [None]:
from pydantic import BaseModel, constr, conint

class Person(BaseModel):
    name: constr(min_length=1)
    age: conint(ge=0)


Person(name='nick', age=33)

Person(name='nick', age=33)

#### Custom Validators in Pydantic

When built-in constraints aren’t enough, you can add custom logic using decorator-based validators. These run after parsing and give you full control over domain rules. This section demonstrates how custom validation behaves differently from field constraints and how to combine them effectively.
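
The sketch below combines a field constraint with a custom validator in Pydantic v2. The `Account` model and the helper `error_types` are hypothetical. The `min_length` constraint is checked first, and the custom validator only runs on values that pass it.

```python
from pydantic import BaseModel, Field, ValidationError, field_validator


class Account(BaseModel):  # hypothetical model used only for this illustration
    username: str = Field(min_length=3)  # built-in constraint

    @field_validator("username")
    @classmethod
    def no_spaces(cls, value: str) -> str:
        # Custom rule, applied after the built-in constraint has passed.
        if " " in value:
            raise ValueError("must not contain spaces")
        return value


def error_types(**data):
    """Return the error type of each validation failure, or [] on success."""
    try:
        Account(**data)
    except ValidationError as exc:
        return [error["type"] for error in exc.errors()]
    return []
```

With this model, `error_types(username="ab")` reports only the length constraint, `error_types(username="a b c")` reports only the custom rule, and `error_types(username="alice")` returns an empty list.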

**Example**: Using custom validators, create a `Rectangle(length: float, width: float)` object with `pydantic`, requiring:
  1. both `length` and `width` are always positive

In [51]:
from pydantic import BaseModel, ConfigDict, field_validator

class Rectangle(BaseModel):
    model_config = ConfigDict(validate_assignment=True)
    length: float
    width: float

    @field_validator('length', 'width')
    @classmethod
    def _validate_is_positive(cls, value: float) -> float:
        # Zero is not positive, so reject it along with negative values.
        if value <= 0:
            raise ValueError('must be positive.')
        return value

Rectangle(length=3, width=4)

Rectangle(length=3.0, width=4.0)

**Exercise**: Use `pydantic.field_validator` to make a custom validator: Make a `Person` object that validates that `name` is always title-cased.


## **Conclusion**

Runtime data validation is essential when working in Python’s dynamic environment, where type hints alone don’t guarantee correctness. The tools explored in this notebook—`beartype`, `attrs`, and `pydantic`—sit at different points on the spectrum of complexity and strictness. `beartype` adds lightweight runtime checks to existing functions, `attrs` gives you explicit and maintainable domain objects with customizable validation, and `pydantic` provides a full validation and parsing engine suitable for production systems handling untrusted data. Understanding the trade-offs between these approaches helps you pick the right level of enforcement for each layer of your application.

## **References**

* **attrs documentation**: [https://www.attrs.org](https://www.attrs.org)
* **beartype documentation**: [https://beartype.readthedocs.io](https://beartype.readthedocs.io)
* **Pydantic documentation (v2)**: [https://docs.pydantic.dev](https://docs.pydantic.dev)
* **PEP 484 – Type Hints**: [https://peps.python.org/pep-0484/](https://peps.python.org/pep-0484/)
* **PEP 681 – Data Class Transform (used by attrs and pydantic)**: [https://peps.python.org/pep-0681/](https://peps.python.org/pep-0681/)
* **Pydantic vs. attrs discussion** (useful for design choices): [https://pydantic-docs.helpmanual.io/usage/pydantic_vs_attrs](https://pydantic-docs.helpmanual.io/usage/pydantic_vs_attrs)

