[Kernel] Update max id to use SchemaIterable #4742
.stream()
.mapToInt(
    e -> {
      int columnId = hasColumnId(e.getField()) ? getColumnId(e.getField()) : 0;
btw, what do you think of (later) being able to do `e.getField().hasColumnId` and `e.getField().getColumnId`? Today it seems like we have this strange C-struct way of poking at our data.
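To make the suggestion concrete, here is a minimal sketch contrasting the two call styles. Everything in it is an assumption for illustration: `Field`, its metadata map, and the `COLUMN_ID_KEY` name are hypothetical stand-ins, not the Kernel's actual classes or metadata keys.

```java
import java.util.Map;

final class ColumnIdStyles {
    // Hypothetical metadata key; the real key used by the Kernel is not shown in this thread.
    public static final String COLUMN_ID_KEY = "columnId";

    /** Hypothetical stand-in for a schema field that carries free-form metadata. */
    public static final class Field {
        private final String name;
        private final Map<String, Object> metadata;

        public Field(String name, Map<String, Object> metadata) {
            this.name = name;
            this.metadata = metadata;
        }

        // Instance-method style, as floated in the comment above.
        public boolean hasColumnId() {
            return metadata.containsKey(COLUMN_ID_KEY);
        }

        public int getColumnId() {
            return ((Number) metadata.get(COLUMN_ID_KEY)).intValue();
        }
    }

    // Static-helper style: the caller passes the field in and the helper reads its metadata.
    public static boolean hasColumnId(Field field) {
        return field.metadata.containsKey(COLUMN_ID_KEY);
    }

    public static int getColumnId(Field field) {
        return ((Number) field.metadata.get(COLUMN_ID_KEY)).intValue();
    }

    public static void main(String[] args) {
        Field withId = new Field("a", Map.of(COLUMN_ID_KEY, 7));
        // Both styles read the same underlying metadata entry.
        System.out.println(hasColumnId(withId) ? getColumnId(withId) : 0);
        System.out.println(withId.hasColumnId() ? withId.getColumnId() : 0);
    }
}
```

Both styles read the same metadata entry; the difference is only where the accessor lives, which is why the choice depends on how much of the metadata layout should be exposed on the field type.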
I think this might be part of a larger refactor, because getColumnId might be a little weird when we start talking about nested types. If we push all IDs down into the schema itself, rather than keeping them in metadata, I think it makes sense; otherwise, keeping it more private (as it is today, as a non-OO method) is probably the right direction.
TL;DR: This is probably blocked on deciding how the things we stuff in FieldMetadata are abstracted for end users.
08842ac to b60855c
b60855c to 4631f20
## 🥞 Stacked PR
Use this [link](https://github.com/delta-io/delta/pull/4752/files/4631f204cf59016d876a0c3bd408e7c67112b22b..19c2fe51f33bba7e4e584874238884d04bef514e) to review incremental changes.
- [stack/schemaIdToSchemaIterable](#4742) [[Files changed](https://github.com/delta-io/delta/pull/4742/files)]
- [**stack/update_table_features**](#4752) [[Files changed](https://github.com/delta-io/delta/pull/4752/files/4631f204cf59016d876a0c3bd408e7c67112b22b..19c2fe51f33bba7e4e584874238884d04bef514e)]
- [stack/update_validators](#4753) [[Files changed](https://github.com/delta-io/delta/pull/4753/files/19c2fe51f33bba7e4e584874238884d04bef514e..378a8eda53800bcbfffc648459f5b0af69fdfffc)]
- [stack/remove_recurse_from_schema_utils](#4754) [[Files changed](https://github.com/delta-io/delta/pull/4754/files/378a8eda53800bcbfffc648459f5b0af69fdfffc..c4c744aa0f7e6601da7b2d12254eeb0360d9ca13)]

---------

#### Which Delta project/connector is this regarding?
- [ ] Spark
- [ ] Standalone
- [ ] Flink
- [x] Kernel
- [ ] Other (fill in here)

## Description
Migrate to using SchemaIterable for checking for table features.

Relates to #4744

## How was this patch tested?
Existing tests

## Does this PR introduce _any_ user-facing changes?
No
## 🥞 Stacked PR
Use this [link](https://github.com/delta-io/delta/pull/4753/files/19c2fe51f33bba7e4e584874238884d04bef514e..378a8eda53800bcbfffc648459f5b0af69fdfffc) to review incremental changes.
- [stack/schemaIdToSchemaIterable](#4742) [[Files changed](https://github.com/delta-io/delta/pull/4742/files)]
- [stack/update_table_features](#4752) [[Files changed](https://github.com/delta-io/delta/pull/4752/files/4631f204cf59016d876a0c3bd408e7c67112b22b..19c2fe51f33bba7e4e584874238884d04bef514e)]
- [**stack/update_validators**](#4753) [[Files changed](https://github.com/delta-io/delta/pull/4753/files/19c2fe51f33bba7e4e584874238884d04bef514e..378a8eda53800bcbfffc648459f5b0af69fdfffc)]
- [stack/remove_recurse_from_schema_utils](#4754) [[Files changed](https://github.com/delta-io/delta/pull/4754/files/378a8eda53800bcbfffc648459f5b0af69fdfffc..c4c744aa0f7e6601da7b2d12254eeb0360d9ca13)]

---------

#### Which Delta project/connector is this regarding?
- [ ] Spark
- [ ] Standalone
- [ ] Flink
- [x] Kernel
- [ ] Other (fill in here)

## Description
Updates validators to use streams instead of filterRecursively. This makes the code more idiomatic, with fewer hops.

Relates to #4744

## How was this patch tested?
Existing tests.

## Does this PR introduce _any_ user-facing changes?
No
## 🥞 Stacked PR
Use this [link](https://github.com/delta-io/delta/pull/4754/files/378a8eda53800bcbfffc648459f5b0af69fdfffc..c4c744aa0f7e6601da7b2d12254eeb0360d9ca13) to review incremental changes.
- [stack/schemaIdToSchemaIterable](#4742) [[Files changed](https://github.com/delta-io/delta/pull/4742/files)]
- [stack/update_table_features](#4752) [[Files changed](https://github.com/delta-io/delta/pull/4752/files/4631f204cf59016d876a0c3bd408e7c67112b22b..19c2fe51f33bba7e4e584874238884d04bef514e)]
- [stack/update_validators](#4753) [[Files changed](https://github.com/delta-io/delta/pull/4753/files/19c2fe51f33bba7e4e584874238884d04bef514e..378a8eda53800bcbfffc648459f5b0af69fdfffc)]
- [**stack/remove_recurse_from_schema_utils**](#4754) [[Files changed](https://github.com/delta-io/delta/pull/4754/files/378a8eda53800bcbfffc648459f5b0af69fdfffc..c4c744aa0f7e6601da7b2d12254eeb0360d9ca13)]

---------

#### Which Delta project/connector is this regarding?
- [ ] Spark
- [ ] Standalone
- [ ] Flink
- [x] Kernel
- [ ] Other (fill in here)

## Description
Remove last usages of filterRecursively in favor of SchemaIterable.

Relates to #4744

## How was this patch tested?
Existing tests.

## Does this PR introduce _any_ user-facing changes?
No.
## 🥞 Stacked PR
Use this link to review incremental changes.

#### Which Delta project/connector is this regarding?

## Description
This PR refactors findMaxId to use SchemaIterable.

Relates to #4744

## How was this patch tested?
Existing unit tests.

## Does this PR introduce any user-facing changes?
No.
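
As a rough illustration of the kind of refactor described here, the sketch below computes a maximum column id by flattening a nested schema into a single stream instead of recursing by hand. It does not use the real SchemaIterable API: `Node`, `flatten`, and this `findMaxId` signature are hypothetical names introduced only for the example. The fallback of 0 for fields without an id mirrors the snippet quoted in the review thread.

```java
import java.util.List;
import java.util.stream.Stream;

final class MaxIdSketch {
    /** Hypothetical schema node: an optional column id plus nested children. */
    public static final class Node {
        final String name;
        final Integer columnId; // null when no id has been assigned
        final List<Node> children;

        public Node(String name, Integer columnId, List<Node> children) {
            this.name = name;
            this.columnId = columnId;
            this.children = children;
        }
    }

    /** Flattens a node and all of its descendants into one pre-order stream. */
    public static Stream<Node> flatten(Node node) {
        return Stream.concat(
            Stream.of(node),
            node.children.stream().flatMap(MaxIdSketch::flatten));
    }

    /** Returns the largest assigned column id, or 0 when no field has one. */
    public static int findMaxId(List<Node> topLevelFields) {
        return topLevelFields.stream()
            .flatMap(MaxIdSketch::flatten)
            .mapToInt(n -> n.columnId != null ? n.columnId : 0)
            .max()
            .orElse(0);
    }

    public static void main(String[] args) {
        Node address = new Node(
            "address",
            2,
            List.of(new Node("zip", 5, List.of()), new Node("street", null, List.of())));
        Node id = new Node("id", 1, List.of());
        System.out.println(findMaxId(List.of(id, address))); // prints 5
    }
}
```

Once the traversal is expressed as a flat stream, the id lookup becomes an ordinary `mapToInt(...).max()` pipeline, and nested fields need no special handling at the call site.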