Skip to content

caniko/pydantic-numpy

Repository files navigation

pydantic-numpy

Python 3.9-3.12 Packaged with Poetry Code style: black Imports: isort Ruff

Usage

Package that integrates NumPy Arrays into Pydantic!

  • pydantic_numpy.typing provides many typings such as NpNDArrayFp64, Np3DArrayFp64 (float64 that must be 3D)! Works with both pydantic.BaseModel and pydantic.dataclass
  • NumpyModel (derived from pydantic.BaseModel) make it possible to dump and load np.ndarray within model fields alongside other fields that are not instances of np.ndarray!

See the test.helper.testing_groups to see types that are defined explicitly.

Examples

For more examples see test_ndarray.py

import numpy as np
from pydantic import BaseModel

import pydantic_numpy.typing as pnd
from pydantic_numpy import np_array_pydantic_annotated_typing
from pydantic_numpy.model import NumpyModel, MultiArrayNumpyFile


class MyBaseModelDerivedModel(BaseModel):
    any_array_dtype_and_dimension: pnd.NpNDArray

    # Must be numpy float32 as dtype
    k: np_array_pydantic_annotated_typing(data_type=np.float32)
    shorthand_for_k: pnd.NpNDArrayFp32

    must_be_1d_np_array: np_array_pydantic_annotated_typing(dimensions=1)


class MyDemoNumpyModel(NumpyModel):
    k: np_array_pydantic_annotated_typing(data_type=np.float32)


# Instantiate from array
cfg = MyDemoModel(k=[1, 2])
# Instantiate from numpy file
cfg = MyDemoModel(k="path_to/array.npy")
# Instantiate from npz file with key
cfg = MyDemoModel(k=MultiArrayNumpyFile(path="path_to/array.npz", key="k"))

cfg.k   # np.ndarray[np.float32]

cfg.dump("path_to_dump_dir", "object_id")
cfg.load("path_to_dump_dir", "object_id")

NumpyModel.load requires the original model:

MyNumpyModel.load(<path>)

Use model_agnostic_load when you have several models that may be the correct model:

from pydantic_numpy.model import model_agnostic_load

cfg.dump("path_to_dump_dir", "object_id")
equals_cfg = model_agnostic_load("path_to_dump_dir", "object_id", models=[MyNumpyModel, MyDemoModel])

Custom type

There are two ways to define. Function derived types with pydantic_numpy.helper.annotation.np_array_pydantic_annotated_typing.

Function derived types don't work with static type checkers like Pyright and MyPy. In case you need the support, just create the types yourself:

NpStrict1DArrayInt64 = Annotated[
    np.ndarray[tuple[int], np.dtype[np.int64]],
    NpArrayPydanticAnnotation.factory(data_type=np.int64, dimensions=1, strict_data_typing=True),
]

Custom serialization

If the default serialization of NumpyDataDict, as outlined in typing.py, doesn't meet your requirements, you have the option to define a custom type with its own serializer. This can be achieved using the NpArrayPydanticAnnotation.factory method, which accepts a custom serialization function through its serialize_numpy_array_to_json parameter. This parameter expects a function of the form Callable[[npt.ArrayLike], Iterable], allowing you to tailor the serialization process to your specific needs.

Example below illustrates definition of 1d-array of float32 type that serializes to flat Python list (without nested dict as in default NumpyDataDict case):

def _serialize_numpy_array_to_float_list(array_like: npt.ArrayLike) -> Iterable:
    return np.array(array_like).astype(float).tolist()


Np1DArrayFp32 = Annotated[
    np.ndarray[tuple[int], np.dtype[np.float32]],
    NpArrayPydanticAnnotation.factory(
        data_type=np.float32,
        dimensions=1,
        strict_data_typing=False,
        serialize_numpy_array_to_json=_serialize_numpy_array_to_float_list,
    ),
]

Install

pip install pydantic-numpy

History

The original idea originates from this discussion, and forked from cheind's repository.

About

Package that integrates NumPy Arrays into Pydantic

Resources

License

Stars

Watchers

Forks