Support for value-based polymorphism #503
I'm currently unavailable. |
Sorry, that was a bad joke about the issue id. This is possible, but without knowing everything you're doing it's hard to give a full solution; still, here's a broad outline:

```python
from typing import List, Union

from pydantic import BaseModel, validator


class ConcreteItemA(BaseModel):
    type: str
    a: str

    @validator('type')
    def check_type(cls, v):
        if v != 'item-a':
            raise ValueError('not item-a')
        return v


class ConcreteItemB(BaseModel):
    type: str
    b: int

    @validator('type')
    def check_type(cls, v):
        if v != 'item-b':
            raise ValueError('not item-b')
        return v


class BaseItem(BaseModel):
    root: List[Union[ConcreteItemA, ConcreteItemB]]


m = BaseItem(root=[
    {
        'type': 'item-a',
        'a': 'some-string'
    },
    {
        'type': 'item-b',
        'b': 10,
    }
])
print(m.root)
print(m.dict())
```

Gives:

```
[<ConcreteItemA type='item-a' a='some-string'>, <ConcreteItemB type='item-b' b=10>]
{'root': [{'type': 'item-a', 'a': 'some-string'}, {'type': 'item-b', 'b': 10}]}
```

There are a couple of warts on this approach I'm afraid:
```python
error = None
for model_cls in [ConcreteItemA, ConcreteItemB]:
    try:
        return model_cls(**data)
    except ValidationError as e:
        error = e
raise error
```

Which is effectively what pydantic is doing when it sees a `Union`. Hope that helps, let me know if you need more info. |
@samuelcolvin

```python
error = None
for model_cls in [ConcreteItemA, ConcreteItemB]:
    try:
        return model_cls(**data)
    except ValidationError as e:
        error = e
raise error
```

But, as my validations are done in a backend service and I have to check quite large and nested JSON payloads, this seems quite inefficient to me. What about introducing a new class, say a resolver?
And later you can call it.

So, for example:

```python
from typing import Optional

from pydantic import BaseModel


class ModelA(BaseModel):
    type: str
    message: str


class ModelB(BaseModel):
    type: Optional[str]
    message: str = None  # required, default None
```

If the JSON doesn't contain the `type` field |
You can do the deserializing once, regardless of how you later try to validate the data. If you really care about performance, you could do something like:

```python
model_lookup = {'item-a': ConcreteItemA, 'item-b': ConcreteItemB, ...}

data = ujson.loads(raw_data)
if not isinstance(data, dict):
    raise ValueError('not a dict')
try:
    model_cls = model_lookup[data['type']]
except KeyError:
    raise ...
m = model_cls(**data)
```

The point is that by this stage you're into the specifics of your application, which don't belong in pydantic.
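Wrapped up as a helper, the lookup-dispatch idea might look like this. This is only a sketch: the model definitions and the `parse_item` helper are illustrative, and stdlib `json` stands in for `ujson`.

```python
import json
from typing import Dict, Type

from pydantic import BaseModel


class ConcreteItemA(BaseModel):
    a: str


class ConcreteItemB(BaseModel):
    b: int


# Map each 'type' tag to the model that should validate the payload.
MODEL_LOOKUP: Dict[str, Type[BaseModel]] = {
    'item-a': ConcreteItemA,
    'item-b': ConcreteItemB,
}


def parse_item(raw_data: str) -> BaseModel:
    """Deserialize once, then dispatch to the right model by tag."""
    data = json.loads(raw_data)
    if not isinstance(data, dict):
        raise ValueError('payload is not an object')
    try:
        model_cls = MODEL_LOOKUP[data.pop('type')]
    except KeyError:
        raise ValueError('unknown or missing "type" tag')
    return model_cls(**data)


item = parse_item('{"type": "item-a", "a": "some-string"}')
print(type(item).__name__)  # prints ConcreteItemA
```

The one `json.loads` call happens up front, so validation cost is a single model construction rather than trying every union member in turn.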
I don't see how a resolver can be significantly more efficient than the loop approach above without significantly rewriting pydantic. The loop approach is what we currently do for `Union`.
I think this sounds a bit magic: either data is valid for a model or it's not - some floating measure of "specificity" sounds like unnecessary complexity. If it is needed, again it's probably application specific. I don't personally think |
I think it's already working that way. If we have two models that both successfully validate the JSON, the model that gets returned depends on the order in the for loop (whichever model happens first). What to do if data is valid for two models?

```python
class IssueAction(str, Enum):
    opened = 'opened'


class IssueEvent(BaseModel):
    action: IssueAction
    issue: IssuePayload


class IssueOpened(IssueEvent):
    action: Final[IssueAction] = IssueAction.opened


@webhook.handler(IssueEvent)
async def issue_event(issue: IssueEvent):
    print("[EVENT] Some general event")


@webhook.handler(IssueOpened)
async def issue_opened(event: IssueOpened):
    print("[EVENT] Issue was opened")
```

An incoming JSON payload will be valid in both cases, and which handler gets called depends on the order the handlers were registered in. That's not what I would want.
That's the point of my idea - don't tie it to application specifics and don't hand-write dispatch code like this (these checks should already be done in pydantic validation sooner or later):

```python
if data['type'] == "issue":
    if data["status"] == "opened":
        process_opened_issue(data)
    elif data["status"] == "closed":
        process_closed_issue(data)
elif data['type'] == "pull request":
    ...
```

Because:

I'm thinking about a general solution, where you describe a protocol (models) and subscribe to interesting events (a model applicable to the JSON), and this works without any other partial data parsing and custom routing logic.

If implemented for the general case it could be used for any webhook implementation (and I think this is a common case: any service that wants to integrate with a third-party service, or that wants real-time notifications instead of constant polling). Almost always you will get all events (different JSON types) in the webhook call. That's why I propose |
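The register-a-handler-per-model idea can be sketched in plain Python on top of pydantic. The `handler` decorator and `dispatch` function here are my own illustrative names, not pydantic or any real webhook library, and the sketch deliberately inherits the first-match-wins ambiguity discussed above.

```python
from typing import Callable, List, Tuple, Type

from pydantic import BaseModel, ValidationError


class IssueOpened(BaseModel):
    action: str  # a real version would pin this down (Literal/const)
    number: int


# Registered (model, handler) pairs, tried in registration order.
_handlers: List[Tuple[Type[BaseModel], Callable]] = []


def handler(model_cls: Type[BaseModel]):
    """Register a handler for payloads that validate against model_cls."""
    def decorator(fn):
        _handlers.append((model_cls, fn))
        return fn
    return decorator


@handler(IssueOpened)
def on_opened(event: IssueOpened) -> str:
    return f"opened #{event.number}"


def dispatch(payload: dict):
    """Try each registered model in order; first successful validation wins."""
    for model_cls, fn in _handlers:
        try:
            event = model_cls(**payload)
        except ValidationError:
            continue
        return fn(event)
    raise ValueError("no handler matched payload")


print(dispatch({"action": "opened", "number": 7}))  # prints opened #7
```

Because dispatch order decides ties, two models that both accept a payload will silently route to whichever was registered first - exactly the problem raised above.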
How about using a custom data type that grabs the type name from `values`?

```python
from enum import Enum

from pydantic import BaseModel


class ItemA(BaseModel):
    x: int
    y: int


class ItemB(BaseModel):
    i: str
    j: str


class ItemType(Enum):
    A = 'item-a'
    B = 'item-b'


class PolyItem:
    type_map = {
        ItemType.A: ItemA,
        ItemType.B: ItemB,
    }

    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, v, values):
        item_type = values['type']
        ItemModel = cls.type_map[item_type]
        return ItemModel(**v)


class Record(BaseModel):
    type: ItemType
    item: PolyItem
```

The |
I was also running into this until I finally realized I could do this:

```python
class ConcreteItemA(BaseModel):
    a: str
    type: str = Field("item-a", const=True)


class ConcreteItemB(BaseModel):
    b: int
    type: str = Field("item-b", const=True)


class BaseItem(BaseModel):
    __root__: Union[ConcreteItemA, ConcreteItemB]
```

Hope this helps anyone who also comes across the same issue and ends up here. |
This makes the following modifications:

* The type designators are now `const=True` in pydantic, so that pydantic can use them to determine which class needs to be instantiated
* We specify a union of all possible types (using class_descendants) for the range of a slot that refers to a class having a type designator

References: pydantic/pydantic#503, linkml#1099
I have also ended up here.
This is useful to know about. It also looks like you'd want to combine it with Discriminated Unions. You can also use `parse_obj_as` to parse directly. So, combining it all together:

```python
from enum import StrEnum, auto  # StrEnum requires Python 3.11+
from typing import Literal, Union

from pydantic import BaseModel, Field
from pydantic.tools import parse_obj_as


class ItemType(StrEnum):
    A = auto()
    B = auto()


class ConcreteItemA(BaseModel):
    a: str
    type: Literal[ItemType.A] = ItemType.A


class ConcreteItemB(BaseModel):
    b: int
    type: Literal[ItemType.B] = ItemType.B


class BaseItem(BaseModel):
    __root__: Union[ConcreteItemA, ConcreteItemB] = Field(..., discriminator="type")


# Test
# ------------------------------------
test_a = {"__root__": {"type": ItemType.A, "a": "foo"}}
test_b = {"__root__": {"type": ItemType.B, "b": 9000}}

a = parse_obj_as(BaseItem, test_a)
repr(a)

b = parse_obj_as(BaseItem, test_b)
repr(b)
```
|
Another option:

```python
class BaseItem(BaseModel):
    type: Literal['item-a', 'item-b']


class ConcreteItemA(BaseItem):
    type: Literal['item-a']
    a: str


class ConcreteItemB(BaseItem):
    type: Literal['item-b']
    b: int
```

or

Depending on what you are trying to do this may require some tweaks, but hopefully one of the above serves as a good starting point. I agree with @nhairs that Discriminated Union is the right concept here. |
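To make the `Literal`-tag approach concrete, one way to exercise it is to put the union behind a discriminated field on a wrapper model. `Envelope` is an illustrative name of mine, not part of the thread; this shape avoids `__root__`, so it works on both pydantic v1.9+ and v2.

```python
from typing import Literal, Union

from pydantic import BaseModel, Field


class ConcreteItemA(BaseModel):
    type: Literal['item-a']
    a: str


class ConcreteItemB(BaseModel):
    type: Literal['item-b']
    b: int


class Envelope(BaseModel):
    # Discriminated union: pydantic selects the member by the 'type' tag,
    # so only one model is ever tried and errors name the right model.
    item: Union[ConcreteItemA, ConcreteItemB] = Field(discriminator='type')


e = Envelope(item={'type': 'item-b', 'b': 10})
print(type(e.item).__name__)  # prints ConcreteItemB
```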
Could be also:

```python
import enum
from typing import Literal

from pydantic import BaseModel


class ItemType(enum.Enum):
    A = 'item-a'
    B = 'item-b'


class BaseItem(BaseModel):
    type: ItemType


class ConcreteItemA(BaseItem):
    type: Literal[ItemType.A]
    a: str


class ConcreteItemB(BaseItem):
    type: Literal[ItemType.B]
    b: int
```
|
For anyone running into this thread now that we've got pydantic v2:

If you are doing:

you should now be doing:

If you used to have a "catch-all" class, such as:

then as far as I can tell this is no longer possible to handle via RootModels. If anyone has a way to solve that, I'd be grateful to hear about it! |
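For reference, a sketch of the v2 shape of the earlier `__root__` examples, assuming pydantic ≥ 2 (the model names are carried over from the thread):

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, RootModel


class ConcreteItemA(BaseModel):
    type: Literal['item-a']
    a: str


class ConcreteItemB(BaseModel):
    type: Literal['item-b']
    b: int


# v2 replaces `__root__` with RootModel; the discriminator is attached
# to the union via Annotated instead of a field default.
BaseItem = RootModel[
    Annotated[Union[ConcreteItemA, ConcreteItemB], Field(discriminator='type')]
]

item = BaseItem.model_validate({'type': 'item-b', 'b': 10})
print(type(item.root).__name__)  # prints ConcreteItemB
```

The validated member lives on `.root`; there is no slot for a non-discriminated catch-all member here, which matches the limitation described above.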
@TheKevJames is there no way to do this that doesn't require listing every possible class up front? It would be nice if classes could be registered at run/import time. In my case I want to save configuration objects, where each configurable class has its own configuration class, of which there is an ever-growing number. |
AFAICT there is no way at the moment, or at least not through the APIs I explored. In my (the?) ideal case, I'd expect to be able to automatically support all subclasses of a class like so:

```python
import enum
from typing import Annotated, Literal

from pydantic import BaseModel, Field


class ItemType(enum.IntEnum):
    FOO = 1
    BAR = 2


class BaseItem(BaseModel):
    item_type: ItemType


class FooItem(BaseItem):
    item_type: Literal[ItemType.FOO] = ItemType.FOO
    foo: int = 9000


class BarItem(BaseItem):
    item_type: Literal[ItemType.BAR] = ItemType.BAR
    bar: str = "baz"


class Sale(BaseModel):
    price: int = 100
    item: Annotated[BaseItem, Field(discriminator="item_type")]


test_foo = {"price": 1, "item": {"item_type": 1, "foo": 42}}
foo_sale = Sale.model_validate(test_foo)
print(foo_sale)

test_bar = {"price": 2, "item": {"item_type": 2, "bar": "rab"}}
bar_sale = Sale.model_validate(test_bar)
print(bar_sale)
```

However, as you will see if you run this code, pydantic requires that the discriminated field be a `Literal`.

However, it feels like it should be possible to use |
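One way to approximate run-time registration in pydantic v2 is to collect subclasses yourself and rebuild a `TypeAdapter` over the union on demand. This is a sketch of my own, not a pydantic feature: the `register` decorator and `adapter` helper are illustrative names, and rebuilding the adapter after every registration has a cost.

```python
from typing import Annotated, List, Literal, Type, Union

from pydantic import BaseModel, Field, TypeAdapter

# Concrete models registered so far; grows at import time.
_registry: List[Type[BaseModel]] = []


def register(model_cls: Type[BaseModel]) -> Type[BaseModel]:
    """Collect a concrete model; the union is rebuilt on demand."""
    _registry.append(model_cls)
    return model_cls


@register
class FooItem(BaseModel):
    item_type: Literal['foo']
    foo: int = 9000


@register
class BarItem(BaseModel):
    item_type: Literal['bar']
    bar: str = "baz"


def adapter() -> TypeAdapter:
    """Build a discriminated-union adapter over everything registered so far."""
    union = Union[tuple(_registry)]
    return TypeAdapter(Annotated[union, Field(discriminator='item_type')])


item = adapter().validate_python({'item_type': 'bar', 'bar': 'rab'})
print(type(item).__name__)  # prints BarItem
```

Caching the adapter and invalidating it when a new class registers would avoid paying the schema-build cost on every call.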
Hi guys! I'd love to use pydantic but I'm finding it hard to understand how I could use polymorphic types. Say that I have these classes:

and their corresponding JSON representation, where the type becomes a JSON field:

I'd like to have a model (possibly `BaseItem`) that is capable of doing this kind of multiplexing, both in serialization and deserialization (i.e. I want to load a `ConcreteItem`, but I don't know which item until I read the JSON). Just to add more complexity, the hierarchy could be deeper and some items might need self-referencing (i.e. an item that has a `List[BaseItem]`).

Is there anything built-in in pydantic? Any hint on how this could be achieved?

Thanks!