-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pydantic classes as default values not serializable #64
Comments
Found a solution without importing pydantic at runtime, as long as you are fine with |
An alternative would be to allow setting |
Personally I'd prefer to handle this within Downside: This can not handle a dataclass that has an attribute that is a pydantic class with a default. Not sure how common this usecase is or I should add handling this to |
ack |
@faph any news on this? |
I am just trying to get my head around this and what the expected behavior should be. Reading the spec here: https://avro.apache.org/docs/1.11.1/specification/#schema-record makes me think that we should be following the Avro spec exactly on how field defaults should be serialized. There is only 1 correct serialization for a given type, so that would suggest we should NOT make the serialization configurable. Instead, we might need to consider using the I have to say, I have not thought about using an entire Pydantic object as a default. I guess the Avro spec supports it, so py-avro-schema should support it too. |
So the problem here is that |
Thanks a lot for your feedback. This makes sense yes. I am not sure if I understand how you want this to be implemented though. |
Let me push a concept PR that would demonstrate this. |
I think I have some idea. i'll close the merge request regarding orjson default, and update the other one. Maybe you can tell me if there are other more optimal fields to use 👍🏻 |
It's not unusual for complex record fields like in your case to be "nullable" instead of defaulting to a complex/record object. It's up to you whether such a default should be present in the Avro schema itself. For the Python class, it's not unusual for the class/function signature to default to |
You want to tell me we should rethink our defaults? 🤔 |
No, no, if that's the right design for you that's fine. Avro and Pydantic support it. Just saying that plain Python classes don't. |
So, this PR: #67 actually passes your test. But it would not be the correct solution. Although arguably better than the current implementation. So we can think about whether we should merge this in the meantime like this... |
Updated the merge request. Feel free to have a look. |
That looks good, seems simpler than I expected, which is good! |
Released as 3.5.0. Many thanks! |
For anyone interested, I managed to backport this to v2 (which I'm stuck on) with this diff on @@ -522,6 +522,12 @@
"items": self.items_schema.data(names=names),
}
+ def make_default(self, py_default: collections.abc.Sequence) -> JSONArray:
+ """Return an Avro schema compliant default value for a given Python Sequence
+ :param py_default: The Python sequence to generate a default value for.
+ """
+ return [self.items_schema.make_default(item) for item in py_default]
+
class DictSchema(Schema):
"""An Avro map schema for a given Python mapping"""
@@ -841,6 +847,22 @@
)
return field_obj
+ def make_default(self, py_default: pydantic.BaseModel) -> JSONObj:
+ """Return an Avro schema compliant default value for a given Python value"""
+ return {key: _schema_obj(self._annotation(key)).make_default(value) for key, value in py_default}
+
+ def _annotation(self, field_name: str) -> Type:
+ """
+ Fetch the raw annotation for a given field name
+
+ Pydantic "unpacks" annotated and forward ref types in their FieldInfo API. We need to access to full, raw
+ annotated type hints instead.
+ """
+ for class_ in self.py_type.mro():
+ if class_.__annotations__.get(field_name):
+ return class_.__annotations__[field_name]
+ raise ValueError(f"{field_name} is not a field of {self.py_type}") # Should never happen
+
@staticmethod
def _py_type(py_field: pydantic.fields.ModelField) -> Any:
"""Return a Python type annotation for a given Pydantic field""" |
Hi everyone,
I have a pydantic model that has an attribute of another pydantic model type that should be generated by default. This fails as the schema dict produced by py_avro_schema sets as default the pydantic class object, which is per default not json serializable via orjson.
example.py
This raises the following error:
Suggestion
On a PydanticSchema for each recordfield check if the
py_type
is a pydantic basemodel subclass and if so calldefault.model_dump(mode="json")
on itThis would then produce this schema:
which looks alright.
The downside is that pydantic would be needed at runtime so it must be either catched as an error or it would be imported on class / method level.
Let me know what you think about this please.
Thanks a lot.
The text was updated successfully, but these errors were encountered: