Make SetPredicate JSON serializable #2524

@Fokko

Description

Feature Request / Improvement

Make sure that the SetPredicate can be serialized to JSON:

class SetPredicate(UnboundPredicate[L], ABC):

This predicate has two implementations, In and NotIn, and translates to:

{
    "type": "in", // Or "not-in"
    "term": str, // The column name
    "value": List[Any] // Should be List[Literal], but we can do that later since that's not JSON serializable yet
}
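
To make the target shape concrete, here is a minimal sketch that builds the payload with only the standard library. The helper name `set_predicate_to_json` is hypothetical and not part of PyIceberg; it assumes the values are already plain JSON-compatible Python values rather than Literal instances.

```python
import json
from typing import Any, Dict, List


def set_predicate_to_json(op: str, term: str, values: List[Any]) -> str:
    """Hypothetical helper: encode an in / not-in predicate in the proposed shape."""
    if op not in ("in", "not-in"):
        raise ValueError(f"Unsupported set predicate type: {op}")
    # Assumption: values are plain JSON-compatible values, not Literal objects.
    payload: Dict[str, Any] = {"type": op, "term": term, "value": list(values)}
    # Compact separators match the style of the serialized examples in the tests.
    return json.dumps(payload, separators=(",", ":"))


encoded = set_predicate_to_json("not-in", "city", ["Amsterdam", "Paris"])
decoded = json.loads(encoded)
print(encoded)  # {"type":"not-in","term":"city","value":["Amsterdam","Paris"]}
```

The real implementation would live on the predicate classes themselves; this only illustrates what a round trip through the proposed JSON looks like.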

We use Pydantic for JSON serialization, which can be enabled by deriving from the IcebergBaseModel:

class PartitionSpec(IcebergBaseModel):
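
As a rough idea of what the Pydantic side could look like, the sketch below uses a plain Pydantic v2 `BaseModel` as a stand-in for IcebergBaseModel so that it runs without PyIceberg installed. The class name `SetPredicateModel` is hypothetical, and typing `value` as `List[Any]` follows the note above about Literal not being serializable yet.

```python
from typing import Any, List, Literal

from pydantic import BaseModel


class SetPredicateModel(BaseModel):
    """Hypothetical stand-in for a JSON-serializable set predicate."""

    # Restrict the discriminator to the two operators named in this issue.
    type: Literal["in", "not-in"]
    # The column name.
    term: str
    # Assumption: plain values for now; List[Literal] could follow later.
    value: List[Any]


pred = SetPredicateModel(type="in", term="id", value=[1, 2, 3])
encoded = pred.model_dump_json()
restored = SetPredicateModel.model_validate_json(encoded)

print(encoded)  # {"type":"in","term":"id","value":[1,2,3]}
```

In the actual change, In and NotIn would presumably each fix their own `type` value instead of accepting either one; that detail is left open here.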

Example tests can be found here:

from pyiceberg.partitioning import PartitionField, PartitionSpec
from pyiceberg.transforms import BucketTransform, TruncateTransform


def test_serialize_partition_spec() -> None:
    partitioned = PartitionSpec(
        PartitionField(source_id=1, field_id=1000, transform=TruncateTransform(width=19), name="str_truncate"),
        PartitionField(source_id=2, field_id=1001, transform=BucketTransform(num_buckets=25), name="int_bucket"),
        spec_id=3,
    )
    # The serialized form uses hyphenated keys and string-encoded transforms.
    assert (
        partitioned.model_dump_json()
        == """{"spec-id":3,"fields":[{"source-id":1,"field-id":1000,"transform":"truncate[19]","name":"str_truncate"},{"source-id":2,"field-id":1001,"transform":"bucket[25]","name":"int_bucket"}]}"""
    )


def test_deserialize_unpartition_spec() -> None:
    # An empty field list round-trips to an unpartitioned spec.
    json_partition_spec = """{"spec-id":0,"fields":[]}"""
    spec = PartitionSpec.model_validate_json(json_partition_spec)
    assert spec == PartitionSpec(spec_id=0)
