Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support UnboundPartitionSpec #106

Merged
merged 5 commits into from
Dec 7, 2023

Conversation

my-vegetable-has-exploded
Copy link
Contributor

close #98

Copy link
Collaborator

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pr, it looks great. For now, the most important use case of UnboundPartitionSpec is in the catalog api. We should replace the PartitionSpec in TableCreation with UnboundPartitionSpec since the spec id and field id should not be specified by user. Same should be applied to AddSpec table update.

crates/iceberg/src/error.rs Outdated Show resolved Hide resolved
pub transform: Transform,
}

#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone, Default, Builder)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use TypedBuilder?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't find a neat way to add #[builder(setter(each(name = "with_partition_field")))] in TypedBuilder.
Accoding to this, it may need to use mutators?
Maybe we can try later?

}

/// Bind unbound partition spec to a schema
pub fn bind(&self, schema: SchemaRef) -> Result<PartitionSpec> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I'm hesitating to implement bind method for now since I don't know if it's really useful, maybe we should implement it later when necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also a little confused about the process. If there's a need, I can try it then. Thanks @liurenjie1024.

Copy link
Collaborator

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. A minor suggestion is to add an unit test for it.

Copy link
Collaborator

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@liurenjie1024
Copy link
Collaborator

liurenjie1024 commented Dec 5, 2023

cc @Fokko PTAL, I think this is ready to go

Copy link
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM, only some comments for code reading.

crates/iceberg/src/spec/partition.rs Outdated Show resolved Hide resolved
crates/iceberg/src/spec/partition.rs Outdated Show resolved Hide resolved
Copy link
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

"fields": [
{
"source-id": 4,
"field-id": 1000,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this field_id in request is optional?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it is for historical reasons. For old specs they can be missing, and they'll just auto-increment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, cleaned up the spec a bit: apache/iceberg#9240

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reference, I saw the comments now. But I think in the rest api request, it's reasonable to make it optional, and ask the server to assign a meaningful field id to it?

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@Fokko Fokko merged commit d206c1d into apache:main Dec 7, 2023
6 checks passed
@Fokko
Copy link
Contributor

Fokko commented Dec 7, 2023

Thanks @my-vegetable-has-exploded for working on this, and @liurenjie1024 and @Xuanwo for reviewing this 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: Add support for UnboundPartitionSpec.
4 participants