-
Notifications
You must be signed in to change notification settings - Fork 63
feat(parquet): add HasFieldIds check #158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| auto group_node = internal::checked_pointer_cast<::parquet::schema::GroupNode>(node); | ||
| for (int i = 0; i < group_node->field_count(); i++) { | ||
| if (HasFieldIds(group_node->field(i))) { | ||
| return true; | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this means? Any leaf has field-id?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just simply check if any field (whether leaf or non-leaf) has field-id. This is in sync with the Java impl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we cannot blindly check if all fields have ids because it is valid to have some fields that do not have ids, especially when variant type is supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I knew that for 3-levels complex type, fieldId might not exists, but here I don't fully understand the purpose, so any leaf has field-id means writer is field-id aware?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Iceberg only supports three-level list encoding per the spec: https://iceberg.apache.org/spec/#parquet
Lists must use the 3-level representation.
|
@Fokko @zeroshade Could you help review this? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
No description provided.