Better annotations support #1166
and also these examples would be:

```pycon
>>> @ureg.awrap
... def mypp(length: Quantity['meter']) -> Quantity['second']:
...     return pendulum_period(length)
```

and

```pycon
>>> @ureg.acheck
... def pendulum_period(length: Quantity['[length]']):
...     return 2 * math.pi * math.sqrt(length / G)
```

where
It would also be fantastic to have Mypy support for checking these annotations statically! Happy to contribute where I can.
I haven't started using annotations in MetPy (yet), so I don't have any practical experience to rely on to see any obvious gotchas. In general, though, those look reasonable.
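The decorator idea sketched above can be tried without pint at all. The following is a minimal, hypothetical runtime version: `Q`, `UNIT_DIMS`, and `acheck` are illustrative stand-ins for pint's `Quantity`, its unit registry, and the proposed decorator, not pint's actual API.

```python
import functools

# Toy registry: unit name -> dimensionality string.
UNIT_DIMS = {"meter": "[length]", "second": "[time]"}

class Q:
    """Minimal quantity: a magnitude tagged with a unit string."""
    def __init__(self, magnitude, unit):
        self.magnitude, self.unit = magnitude, unit

    def check(self, dimensionality):
        return UNIT_DIMS[self.unit] == dimensionality

def acheck(func):
    """Validate keyword arguments against the function's raw annotations."""
    hints = dict(func.__annotations__)  # e.g. {'length': '[length]'}

    @functools.wraps(func)
    def wrapper(**kwargs):
        for name, value in kwargs.items():
            expected = hints.get(name)
            if expected is not None and not value.check(expected):
                raise TypeError(f"{name}: expected {expected}")
        return func(**kwargs)
    return wrapper

@acheck
def pendulum_period(length: "[length]"):
    # T = 2*pi*sqrt(L/g), with g = 9.81 m/s^2
    return 2 * 3.141592653589793 * (length.magnitude / 9.81) ** 0.5

period = pendulum_period(length=Q(1.0, "meter"))  # accepted: '[length]' matches
```

Passing `Q(1.0, "second")` instead would raise `TypeError`, which is the behavior the proposed `ureg.acheck` would provide with real dimensionality checking.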
I was playing with this concept. Some things to discuss:

It would be nice to be able to set the expected type, the way collection types do.
Hello, my objective is to write code like the following:

In my case annotations should be done using dimensions: in the example above it is important to check that a

I also agree with @jules-ch's comment above about the expected type of magnitude.
@hgrecco NumPy has also been adding annotation support for |
How about something like
But I would be more worried about how to handle this.
Type annotation of the magnitude should be the first thing we target, since the Quantity type is a container just like List or Tuple. Second should be unit or dimension: `Quantity[type, unit]`. The type annotation is for mypy usage, and the unit or dimension tells the user which unit or dimension to expect.
For some internal projects, I have tried three different approaches to annotations for the output of
I would discourage (3) in pint but not so sure about the other two. Option 2 would allow for things like the following:

```python
ScalarVelocityQ = Quantity[float, '[speed]']
q1 = ScalarVelocityQ(3, 'm/s')
q2 = ScalarVelocityQ(3, 's')  # Exception is raised
```

In any case, I think we need to add good annotation introspection capability because we want to be able to evolve this without breaking everything. We need to avoid having to provide something like this: https://stackoverflow.com/a/52664522/482819
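Option 2 can be prototyped with `__class_getitem__`: subscripting builds a constructor that checks dimensionality at instantiation time. A toy sketch, where `Qty` and `UNIT_DIMS` are illustrative stand-ins rather than pint's implementation:

```python
# Toy registry: unit name -> dimensionality string.
UNIT_DIMS = {"m/s": "[speed]", "s": "[time]"}

class Qty:
    def __init__(self, magnitude, unit):
        self.magnitude, self.unit = magnitude, unit

    def __class_getitem__(cls, params):
        mag_type, dimensionality = params  # e.g. (float, "[speed]")

        def make(magnitude, unit):
            # Reject units whose dimensionality doesn't match the subscript.
            if UNIT_DIMS.get(unit) != dimensionality:
                raise TypeError(f"expected {dimensionality}, got unit {unit!r}")
            return cls(mag_type(magnitude), unit)
        return make

ScalarVelocityQ = Qty[float, "[speed]"]
q1 = ScalarVelocityQ(3, "m/s")   # fine: m/s has dimensionality [speed]
# ScalarVelocityQ(3, "s")        # would raise TypeError: [time] != [speed]
```

Note the magnitude is also coerced to the subscripted type (`float` here), which is one possible reading of `Quantity[type, unit]`; a real implementation might only check rather than coerce.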
We could take a look at https://docs.python.org/3/library/typing.html#typing.Annotated, which describes what we want to achieve, I think.
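`typing.Annotated` already lets metadata such as a dimensionality string ride along with a type and be read back via introspection. A small sketch, where `Length` and `Time` are hypothetical aliases, not part of pint:

```python
from typing import Annotated, get_type_hints

# Carry a dimensionality string as Annotated metadata.
Length = Annotated[float, "[length]"]
Time = Annotated[float, "[time]"]

def pendulum_period(length: Length) -> Time:
    return 2 * 3.141592653589793 * (length / 9.81) ** 0.5

# include_extras=True keeps the Annotated wrapper so the metadata is visible.
hints = get_type_hints(pendulum_period, include_extras=True)
meta = hints["length"].__metadata__  # ('[length]',)
```

A checker (or pint itself) could walk these hints and validate dimensionality without any custom alias machinery, which is the introspection capability mentioned above.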
Referring to @jules-ch: in my opinion Quantity is not just a container. My perception: if I read somebody's code, I would first like to see whether the return value of a function represents, say, a length, or energy, or pressure, or whatever. Only after that would I be interested in whether that energy is, say, an integer, a float, or some numpy type. Or at least, this is how I see it when you first glance at somebody's code. In short, I think
Just as a note, not sure how relevant it is to this issue: I tried to add type annotations to the python-measurement library a while ago, hoping that I could write something like
There are multiple use cases that we should address:
IMO the best option is: make Quantity generic & use a utilities class to return

```python
T = TypeVar("T")

class Quantity(Generic[T], QuantityGeneric, PrettyIPython, SharedRegistryObject):
    ...

    @property
    def magnitude(self) -> T:
        """Quantity's magnitude. Long form for `m`"""
        return self._magnitude

    ...

    def __iter__(self) -> Iterator[T]:
        ...

    def to(self, other=None, *contexts, **ctx_kwargs) -> "Quantity[T]":
        ...
```

I tried something like this:

```python
from typing import _tp_cache, _type_check
from typing import _AnnotatedAlias

class QuantityAlias(_AnnotatedAlias, _root=True):
    def __call__(self, *args, **kwargs):
        quantity = super().__call__(*args, **kwargs)
        if self.__metadata__:
            dim = quantity._REGISTRY.get_dimensionality(self.__metadata__[0])
            if not quantity.check(dim):
                raise TypeError("Dimensionality not matched")
        return quantity

class TypedQuantity:
    @_tp_cache
    def __class_getitem__(cls, params):
        from pint.quantity import Quantity
        msg = "TypedQuantity[t, ...]: t must be a type."
        origin = _type_check(Quantity[params[0]], msg)
        metadata = tuple(params[1:])
        return QuantityAlias(origin, metadata)
```

Here we make a simple check at runtime for the dimension, just like @hgrecco's example. We could go further, like it is done in https://docs.python.org/3/library/typing.html#typing.Annotated. We could translate to something like

Those metadata can be added to the instance if needed. I'll try to draft a PR.
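The generic-container half of this proposal can be demonstrated stand-alone: the magnitude type simply flows through as `T`. `Qty` below is a toy stand-in for pint's `Quantity`, stripped of registry machinery:

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")

class Qty(Generic[T]):
    """Minimal generic quantity: the magnitude is typed by T."""
    def __init__(self, magnitude: T, units: str) -> None:
        self._magnitude = magnitude
        self.units = units

    @property
    def magnitude(self) -> T:
        return self._magnitude

q_scalar: Qty[float] = Qty(3.0, "m/s")
q_array: Qty[List[float]] = Qty([1.0, 2.0], "m")

# mypy infers q_scalar.magnitude as float and q_array.magnitude as List[float];
# at runtime the subscript is erased and both are plain Qty instances.
```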
@jules-ch I really like your proposal. I am eager to see the draft PR. Great discussion everybody!
I would like to make a plug within my company's software team to use pint for units. Having typing is a huge plus. I see #1259 was merged; is that the only PR needed for typing, or is there more work to be done? When do you think a release will be cut that incorporates that PR?
We'll make the 0.18 release soon, probably end of the month. Pint typing support will be experimental at first; I still need to document it.
Hi. I'm currently experimenting with the new typing features in v0.18 (#1259). How would I annotate functions or classes that handle

```python
from typing import TypeVar
import numpy as np
from pint import Quantity

A = TypeVar('A', np.ndarray, Quantity[np.ndarray])

def get_index(array: A, i: int) -> ???:
    return array[i]
```

I am aware that the same is relatively straightforward, for example, for lists:

```python
from typing import TypeVar, List

T = TypeVar('T')

def get_index(l: List[T], i: int) -> T:
    return l[i]
```

but I'm having a hard time translating it to the
I think you'd need to use
Ok, you are right. My example function is not ideal. What I was really trying to find is an annotation that says: "if you use numpy arrays here, expect scalars there" and equivalently "if you use array quantities here, expect scalar quantities there", or vice versa. Another example:

```python
from typing import TypeVar, Generic
import numpy as np
from pint import Quantity

A = TypeVar('A', np.ndarray, Quantity[np.ndarray])

class Converter(Generic[A]):
    def __init__(self, scale: "float in case A is np.ndarray / Quantity[float] in case A is Quantity[np.ndarray]"):
        self.scale = scale

    def convert(self, array: A) -> A:
        # note: divide the argument, not the TypeVar A
        return array / self.scale
```
I see. I think in that case you are looking for

For the function you are implementing, I think you will need a type annotation like

```python
def get_index(array: Union[np.ndarray, Quantity[np.ndarray]], i: int) -> Union[float, Quantity[float]]:
    return array[i]
```

but as you write, that's not specific enough; a mypy run on

```python
data = np.asarray([3., 4.])
data_q = Q_(data, 'meter')
reveal_type(get_index(data, 0))
reveal_type(get_index(data_q, 0))
```

prints

If you add

```python
@overload
def get_index(array: np.ndarray, i: int) -> float: ...
@overload
def get_index(array: Quantity[np.ndarray], i: int) -> Quantity[float]: ...
```

then mypy prints
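The overload pattern above can be exercised end to end without numpy or pint installed. In this dependency-free sketch a plain list stands in for `np.ndarray` and a `(magnitudes, unit)` tuple stands in for `Quantity[np.ndarray]`:

```python
from typing import List, Tuple, Union, overload

# Stand-in for Quantity[np.ndarray]: magnitudes plus a unit string.
ArrayQ = Tuple[List[float], str]

@overload
def get_index(array: List[float], i: int) -> float: ...
@overload
def get_index(array: ArrayQ, i: int) -> Tuple[float, str]: ...
def get_index(array, i):
    # Single runtime implementation shared by both overloads.
    if isinstance(array, tuple):
        magnitudes, unit = array
        return (magnitudes[i], unit)  # "scalar quantity": keep the unit
    return array[i]                   # plain array: bare scalar

x = get_index([3.0, 4.0], 0)            # mypy infers float
y = get_index(([3.0, 4.0], "m"), 1)     # mypy infers Tuple[float, str]
```

The `@overload` stubs exist only for the type checker; at runtime the last, unannotated definition handles both cases, which is exactly how the `np.ndarray` / `Quantity[np.ndarray]` pair would work.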
I'm now suddenly interested in this. We have data providers handing us a mish-mash of TWh and PJ energy generation data and we'd like to keep our units straight. We are also using Pydantic. My first attempt to add a Quantity field resulted in this error message (using Pint 0.18):

Worked around by adding

to the models I'm enhancing with Quantity.
Super interested in the use of Pint type hinting with Pydantic types. Wondering if you were able to add something like PositiveFloat or other Pydantic types to your example @MichaelTiemannOSC

```python
from pydantic import BaseModel, PositiveFloat
from pint import Quantity

class PowerPlant(BaseModel):
    power_generation: Quantity['watt']

    class Config:
        arbitrary_types_allowed = True

noor_solar = PowerPlant(power_generation=Quantity(160, 'megawatt'))
noor_solar.power_generation
```
Should be able to share some findings soon. I have an issue filed with pandas to sort out an ExtensionArray problem (pandas-dev/pandas#45240) and am working with some smart people (copied) on how to make this play well with both database connectors and REST APIs.
@hgrecco astropy introduced something similar that we can implement, using
Really curious on any progress on this, as I'm getting into this very topic and have some ugly workarounds like:

```python
import pint
from pydantic import BaseModel, validator

ureg = pint.UnitRegistry()

class MyModel(BaseModel):
    distance: str

    @validator("distance")
    def is_length(cls, v):
        q = ureg.Quantity(v)
        assert q.check("[length]"), "dimensionality must be [length]"
        return q
```

```pycon
>>> MyModel(distance="2 ly").distance
2 light_year
```
I made a quick, slightly nicer, workaround based off your workaround:

```python
import pint
from pydantic import BaseModel

class PintType:
    Q = pint.Quantity

    def __init__(self, q_check: str):
        self.q_check = q_check

    def __get_validators__(self):
        yield self.validate

    def validate(self, v):
        q = self.Q(v)
        assert q.check(self.q_check), f"Dimensionality must be {self.q_check}"
        return q

Length = PintType("[length]")

class MyModel(BaseModel):
    distance: Length

    class Config:
        json_encoders = {
            pint.Quantity: str
        }
```
Indeed, thanks!
Thank you all for posting this, it has been incredibly helpful. One thing I had to mention is that I was having issues with the example above because the fields were objects and not classes, so I tweaked things a bit to support

Here is a public gist with a more complete example. Open to any suggestions on how to improve this:

```python
from pint import Quantity, Unit, UnitRegistry
from pydantic import BaseModel

registry = UnitRegistry()

schema_extra = dict(definitions=[
    dict(
        Quantity=dict(type="string"),
    )
])

def quantity(dimensionality: str) -> type:
    """A method for making a pydantic compliant Pint quantity field type."""
    try:
        registry.get_dimensionality(dimensionality)
    except KeyError:
        raise ValueError(f"{dimensionality} is not a valid dimensionality in pint!")

    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, value):
        quantity = Quantity(value)
        assert quantity.check(cls.dimensionality), f"Dimensionality must be {cls.dimensionality}"
        return quantity

    @classmethod
    def __modify_schema__(cls, field_schema):
        field_schema.update(
            {"$ref": "#/definitions/Quantity"}
        )

    return type(
        "Quantity",
        (Quantity,),
        dict(
            __get_validators__=__get_validators__,
            __modify_schema__=__modify_schema__,
            dimensionality=dimensionality,
            validate=validate,
        ),
    )

class MyModel(BaseModel):
    distance: quantity("[length]")
    speed: quantity("[length]/[time]")

    class Config:
        validate_assignment = True
        schema_extra = schema_extra
        json_encoders = {
            Quantity: str,
        }
```

```pycon
>>> model = MyModel(distance="1.5 ly", speed="15 km/hr")
>>> model
MyModel(distance=<Quantity(1.5, 'light_year')>, speed=<Quantity(15.0, 'kilometer / hour')>)

>>> # check the jsonschema, could make the definition for Quantity better...
>>> print(MyModel.schema_json(indent=2))
{
  "title": "MyModel",
  "type": "object",
  "properties": {
    "distance": {
      "$ref": "#/definitions/Quantity"
    },
    "speed": {
      "$ref": "#/definitions/Quantity"
    }
  },
  "required": [
    "distance",
    "speed"
  ],
  "definitions": [
    {
      "Quantity": {
        "type": "string"
      }
    }
  ]
}

>>> # convert to a python dictionary
>>> model.dict()
{'distance': 1.5 <Unit('light_year')>, 'speed': 15.0 <Unit('kilometer / hour')>}

>>> # serialize to json
>>> print(model.json(indent=2))
{
  "distance": "1.5 light_year",
  "speed": "15.0 kilometer / hour"
}

>>> import json
>>> # load from json
>>> MyModel.parse_obj(json.loads(model.json()))
MyModel(distance=<Quantity(1.5, 'light_year')>, speed=<Quantity(15.0, 'kilometer / hour')>)

>>> # test that it raises error when assigning wrong quantity kind
>>> model.distance = "2 m/s"
---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
Cell In [14], line 1
----> 1 model.distance = "2 m/s"

File C:\mf\envs\jafte\lib\site-packages\pydantic\main.py:385, in pydantic.main.BaseModel.__setattr__()

ValidationError: 1 validation error for MyModel
distance
  Dimensionality must be [length] (type=assertion_error)
```
@sanbales that was incredibly helpful code! I'm now trying to build a

But pydantic gives me this error, which I haven't been able to fully grok:
I got past that error by changing the

I'm still working out some other bits, so please don't take the above as correct reference code. It's more a reference to my current state of problems than a solution.
Thanks @MichaelTiemannOSC, I was not familiar with the
Cool. Here's a link to the code repository where I'm bringing together Pint, pydantic, uncertainties, and pandas: https://github.com/MichaelTiemannOSC/ITR/tree/template-v2 |
hgrecco/pint#1166 Signed-off-by: Cristian Le <cristian.le@mpsd.mpg.de>
Sorry to necro this thread, but what's the status on type hints? The previously linked repo appears to be gone.
It has since been merged into the main repository: https://github.com/os-climate/ITR. Note that this repository doesn't itself contain pandas, pint, or pint-pandas. I have created some local versions of those, but things have drifted as uncertainties proved more challenging to bring into pint-pandas than expected.
Thank you for the examples! Here's an example on how to make it work with annotations in Pydantic 2: https://github.com/LiberTEM/LiberTEM-schema/blob/c096d5337f21c78232134ad9d9af19b8405b1992/src/libertem_schema/__init__.py#L1 (edit: inline code here)

```python
from typing import Any, Sequence
from typing_extensions import Annotated

from pydantic_core import core_schema
from pydantic import (
    BaseModel,
    GetCoreSchemaHandler,
    WrapValidator,
    ValidationInfo,
    ValidatorFunctionWrapHandler,
)
import pint

__version__ = '0.1.0.dev0'

ureg = pint.UnitRegistry()

class DimensionError(ValueError):
    pass

_pint_base_repr = core_schema.tuple_positional_schema(items_schema=[
    core_schema.float_schema(),
    core_schema.str_schema()
])

def to_tuple(q: pint.Quantity):
    base = q.to_base_units()
    return (float(base.magnitude), str(base.units))

class PintAnnotation:
    @classmethod
    def __get_pydantic_core_schema__(
        cls,
        _source_type: Any,
        _handler: GetCoreSchemaHandler,
    ) -> core_schema.CoreSchema:
        return core_schema.json_or_python_schema(
            json_schema=_pint_base_repr,
            python_schema=core_schema.is_instance_schema(pint.Quantity),
            serialization=core_schema.plain_serializer_function_ser_schema(
                to_tuple
            ),
        )

_length_dim = ureg.meter.dimensionality
_angle_dim = ureg.radian.dimensionality
_pixel_dim = ureg.pixel.dimensionality

def _make_handler(dimensionality: str):
    def is_matching(
        q: Any, handler: ValidatorFunctionWrapHandler, info: ValidationInfo
    ) -> pint.Quantity:
        # Ensure target type
        if isinstance(q, pint.Quantity):
            pass
        elif isinstance(q, Sequence):
            magnitude, unit = q
            # Turn into Quantity: measure * unit
            q = magnitude * ureg(unit)
        else:
            raise ValueError(f"Don't know how to interpret type {type(q)}.")
        # Check dimension
        if not q.check(dimensionality):
            raise DimensionError(f"Expected dimensionality {dimensionality}, got quantity {q}.")
        # Return target type
        return q
    return is_matching

Length = Annotated[
    pint.Quantity, PintAnnotation, WrapValidator(_make_handler(_length_dim))
]
Angle = Annotated[
    pint.Quantity, PintAnnotation, WrapValidator(_make_handler(_angle_dim))
]
Pixel = Annotated[
    pint.Quantity, PintAnnotation, WrapValidator(_make_handler(_pixel_dim))
]

class Simple4DSTEMParams(BaseModel):
    '''
    Basic calibration parameters of a strongly simplified model
    of a 4D STEM experiment.

    See https://github.com/LiberTEM/Microscope-Calibration
    and https://arxiv.org/abs/2403.08538
    for the technical details.
    '''
    overfocus: Length
    scan_pixel_pitch: Length
    camera_length: Length
    detector_pixel_pitch: Length
    semiconv: Angle
    cy: Pixel
    cx: Pixel
    scan_rotation: Angle
    flip_y: bool
```

```python
# Imports needed by this test (not shown in the original snippet above)
import pprint
from pint import Quantity
from pydantic_core import from_json

def test_smoke():
    params = Simple4DSTEMParams(
        overfocus=0.0015 * ureg.meter,
        scan_pixel_pitch=0.000001 * ureg.meter,
        camera_length=0.15 * ureg.meter,
        detector_pixel_pitch=0.000050 * ureg.meter,
        semiconv=0.020 * ureg.radian,
        scan_rotation=330. * ureg.degree,
        flip_y=False,
        cy=(32 - 2) * ureg.pixel,
        cx=(32 - 2) * ureg.pixel,
    )
    as_json = params.model_dump_json()
    pprint.pprint(("as json", as_json))
    from_j = from_json(as_json)
    pprint.pprint(("from json", from_j))
    res = Simple4DSTEMParams.model_validate(from_j)
    pprint.pprint(("validated", res))
    assert isinstance(res.overfocus, Quantity)
    assert isinstance(res.flip_y, bool)
    assert res == params
```

To be figured out:

Is this useful? If yes, what would be a good way to make it easily available to others? CC @sk1p
Yes!
I'm reminded of the project organization and code layout of https://github.com/p2p-ld/numpydantic, which exposes NumPy array shape validation as a Pydantic annotation similarly to how your code customizes Pint types. Could be useful to adopt some of its structure if you were to wrap up the Pint validator as its own pip installable thing!
Edit
Numpydantic is stable at 1.0, but the "coming soon" goals of general metadata and extensibility may make it easier to implement some of these Pint needs. I would say a standalone package that exposes the simple validator is the closer reach, then using the metadata/extensible bits of Numpydantic in the future for robustness without having to re-implement a bunch of machinery.
Good to hear that you like it! :-)
Oh, that one looks interesting! Indeed, this one could be a good template and address the magnitude portion.
Hm ok, probably good to experiment a bit and explore options before releasing 0.1. It feels like
Here's a version that integrates with
Would the machinery make sense as part of pint, by the way? One could put it into
With PEP560 we could now try to have a better annotations experience for Pint. Briefly, my proposal would be to do something like this
or
and then provide a nice API to check for this.
What do you think?