-
Notifications
You must be signed in to change notification settings - Fork 405
JSON to expression #2783
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JSON to expression #2783
Conversation
Related to: #2518, #2775 # Rationale for this change This work was done by @Aniketsy, I just opened this to get the tests passing, and we can merge for scan planning. But, this PR allows `And` expressions to be deserialized from JSON through Pydantic. This PR aligns the `And` expression with the `Or`/`Not` pattern by adding `IcebergBaseModel` as an inherited class. This gets teh And expression into a proven serializable state, preparing it for the full expression tree [de]serializability work in #2783. ## Are these changes tested? Yes added a test and ensure that they align with EpressionParser in Iceberg Java ## Are there any user-facing changes? No this is just serialization cc: @kevinjqliu @Fokko --------- Co-authored-by: Aniket Singh Yadav <singhyadavaniket43@gmail.com>
|
PTAL @kevinjqliu & @geruh |
|
|
||
| @model_validator(mode="wrap") | ||
| @classmethod | ||
| def handle_primitive_type(cls, v: Any, handler: ValidatorFunctionWrapHandler) -> BooleanExpression: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the pydantic discriminator not work here because of the type field missing from boolean expressions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly, Pydantic has difficulties with the boolean values (true/false), instead of a dict {"type": "in", ...}
|
|
||
|
|
||
| class BoundPredicate(Bound, BooleanExpression, ABC): | ||
| model_config = ConfigDict(arbitrary_types_allowed=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the bound expressions are also being included in the serialization and they're serialized identically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Bound expressions don't carry a type attribute. In this case, it will simply fall back to the default deserialized value.
|
Looks good! I ran some additional JSON roundtrip tests against the existing ones in Iceberg's TestExpressionParser. It looks like all core expression types were serialized and deserialized correctly. Picking out the test cases that align with the REST spec, which are the boolean expressions, all the predicates, the and, or, and not operators, and nested expressions. |
kevinjqliu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! thanks for fixing the serialization, exciting!
| else: | ||
| return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: for this and below, should we throw value error here? just in case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we just want to return a None indicating that there is no strict-projection available.
|
Thanks @geruh and @kevinjqliu for the review 🙌 |
Rationale for this change
Are these changes tested?
Are there any user-facing changes?