@Fokko Fokko commented Nov 25, 2025

Rationale for this change

Are these changes tested?

Are there any user-facing changes?

@Fokko Fokko changed the title Deserialize expression JSON to expression Nov 25, 2025
Fokko pushed a commit that referenced this pull request Nov 26, 2025
Related to: #2518, #2775 

# Rationale for this change

This work was done by @Aniketsy, I just opened this to get the tests
passing, and we can merge for scan planning.

But, this PR allows `And` expressions to be deserialized from JSON
through Pydantic.

This PR aligns the `And` expression with the `Or`/`Not` pattern by
adding `IcebergBaseModel` as an inherited class. This gets the `And`
expression into a proven serializable state, preparing it for the full
expression tree [de]serializability work in #2783.

## Are these changes tested?

Yes, added a test and ensured that it aligns with `ExpressionParser` in
Iceberg Java.

## Are there any user-facing changes?

No, this is just serialization.

cc: @kevinjqliu @Fokko

---------

Co-authored-by: Aniket Singh Yadav <singhyadavaniket43@gmail.com>
@Fokko Fokko requested a review from kevinjqliu November 28, 2025 14:46

Fokko commented Nov 28, 2025

PTAL @kevinjqliu & @geruh


@model_validator(mode="wrap")
@classmethod
def handle_primitive_type(cls, v: Any, handler: ValidatorFunctionWrapHandler) -> BooleanExpression:
Contributor

Does the pydantic discriminator not work here because of the type field missing from boolean expressions?

Contributor Author

Exactly. Pydantic has difficulty with the bare boolean values (true/false), which arrive as plain literals instead of a dict like {"type": "in", ...}, so there is no type field to discriminate on.
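To illustrate the point in this thread, here is a minimal plain-Python sketch (deliberately not Pydantic, and not the code from this PR) of why a tag-based lookup alone is not enough: the bare `true`/`false` literals have to be handled before anything reads a `type` key. The class names, the `parse_expression` helper, and the `term`/`values` field names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Tuple, Union


@dataclass(frozen=True)
class AlwaysTrue:
    """Stand-in for the expression that JSON encodes as a bare `true`."""


@dataclass(frozen=True)
class AlwaysFalse:
    """Stand-in for the expression that JSON encodes as a bare `false`."""


@dataclass(frozen=True)
class In:
    """Stand-in for a set predicate encoded as {"type": "in", ...}."""

    term: str
    values: Tuple[Any, ...]


Expression = Union[AlwaysTrue, AlwaysFalse, In]


def parse_expression(value: Any) -> Expression:
    # Bare booleans carry no "type" key, so they are handled first.
    if isinstance(value, bool):
        return AlwaysTrue() if value else AlwaysFalse()
    # Only dict payloads can be dispatched on their "type" tag.
    if isinstance(value, dict) and value.get("type") == "in":
        return In(term=value["term"], values=tuple(value["values"]))
    raise ValueError(f"Cannot parse expression: {value!r}")
```

A wrap-mode validator plays the same role as the first branch here: it gets to inspect the raw input before the regular, tag-driven validation runs.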



class BoundPredicate(Bound, BooleanExpression, ABC):
    model_config = ConfigDict(arbitrary_types_allowed=True)
Contributor

Looks like the bound expressions are also being included in the serialization and they're serialized identically?

Contributor Author

The Bound expressions don't carry a type attribute. In this case, it will simply fall back to the default deserialized value.

geruh commented Nov 28, 2025

Looks good! I ran some additional JSON roundtrip tests against the existing ones in Iceberg's TestExpressionParser. It looks like all core expression types were serialized and deserialized correctly. I picked out the test cases that align with the REST spec: the boolean expressions, all the predicates, the and, or, and not operators, and nested expressions.
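For readers unfamiliar with the term, a JSON roundtrip check of the kind described above can be sketched as follows. This is a self-contained toy, not the PyIceberg or Iceberg Java test code: the `Eq`, `Not`, and `And` classes, the `to_json`/`from_json`/`roundtrips` helpers, and the JSON key names are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Eq:
    """Hypothetical leaf predicate: term == value."""

    term: str
    value: Any


@dataclass(frozen=True)
class Not:
    child: Any


@dataclass(frozen=True)
class And:
    left: Any
    right: Any


def to_json(expr: Any) -> Any:
    # Boolean constants serialize as bare true/false; everything else is tagged.
    if isinstance(expr, bool):
        return expr
    if isinstance(expr, Eq):
        return {"type": "eq", "term": expr.term, "value": expr.value}
    if isinstance(expr, Not):
        return {"type": "not", "child": to_json(expr.child)}
    if isinstance(expr, And):
        return {"type": "and", "left": to_json(expr.left), "right": to_json(expr.right)}
    raise TypeError(f"Unsupported expression: {expr!r}")


def from_json(data: Any) -> Any:
    if isinstance(data, bool):
        return data
    kind = data["type"]
    if kind == "eq":
        return Eq(data["term"], data["value"])
    if kind == "not":
        return Not(from_json(data["child"]))
    if kind == "and":
        return And(from_json(data["left"]), from_json(data["right"]))
    raise ValueError(f"Unknown expression type: {kind!r}")


def roundtrips(expr: Any) -> bool:
    # Serialize to a JSON string, parse it back, and compare with the input.
    return from_json(json.loads(json.dumps(to_json(expr)))) == expr
```

Calling `roundtrips(And(Not(Eq("a", 1)), And(Eq("b", "x"), True)))` exercises a nested tree with a boolean constant, a predicate, and two operators in one check.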

@kevinjqliu kevinjqliu left a comment

LGTM! thanks for fixing the serialization, exciting!

Comment on lines -434 to -435
else:
    return None
Contributor

nit: for this and below, should we raise a ValueError here, just in case?

Contributor Author

No, we just want to return None, indicating that no strict projection is available.
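The design choice in this reply, returning `None` as a normal "nothing to project" result instead of raising, can be sketched like this. The `project_strict` and `collect_projections` names, the string-based predicates, and the set of supported operators are hypothetical and only illustrate the pattern; they are not the PyIceberg API.

```python
from typing import List, Optional, Tuple


def project_strict(op: str, column: str, value: int) -> Optional[str]:
    """Hypothetical projection: a predicate string, or None if unsupported."""
    if op in {"==", "<", ">"}:  # operations this toy transform can project
        return f"{column}_part {op} {value}"
    # Not an error: there is simply no strict projection for this operation.
    return None


def collect_projections(predicates: List[Tuple[str, str, int]]) -> List[str]:
    projected = []
    for op, column, value in predicates:
        result = project_strict(op, column, value)
        # The caller treats None as "skip", so no exception handling is needed.
        if result is not None:
            projected.append(result)
    return projected
```

With an exception instead, every caller would need a `try`/`except` around what is an expected, non-exceptional outcome.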

@Fokko Fokko merged commit e3070d4 into apache:main Dec 3, 2025
8 checks passed
@Fokko Fokko deleted the fd-deserialize-expr branch December 3, 2025 16:34

Fokko commented Dec 3, 2025

Thanks @geruh and @kevinjqliu for the review 🙌

@geruh geruh mentioned this pull request Dec 3, 2025
10 tasks