[Kernel] Update max id to use SchemaIterable #4742

Merged

Conversation

emkornfield
Collaborator

@emkornfield emkornfield commented Jun 12, 2025

🥞 Stacked PR

Use this link to review incremental changes.


Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

This PR refactors findMaxId to use SchemaIterable.
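
As a rough illustration of the shape of this refactor, the sketch below computes a maximum column ID by streaming over a flattened view of a nested schema instead of recursing by hand. `Field`, `flatten`, and the `columnId` field are hypothetical stand-ins invented for this example; they are not Delta Kernel's actual `StructField`, `SchemaIterable`, or column-mapping metadata APIs, and the real `findMaxId` may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a schema field; NOT Delta Kernel's StructField.
final class Field {
  final String name;
  final Integer columnId; // null when no ID has been assigned (assumption)
  final List<Field> children;

  Field(String name, Integer columnId, Field... children) {
    this.name = name;
    this.columnId = columnId;
    this.children = Arrays.asList(children);
  }
}

final class MaxIdSketch {
  // Hypothetical stand-in for SchemaIterable: a pre-order list of every field.
  static List<Field> flatten(List<Field> roots) {
    List<Field> out = new ArrayList<>();
    for (Field f : roots) {
      out.add(f);
      out.addAll(flatten(f.children));
    }
    return out;
  }

  // Fields without an ID contribute 0, similar to the "? ... : 0" fallback
  // visible in this PR's diff excerpt.
  static int findMaxId(List<Field> roots) {
    return flatten(roots).stream()
        .mapToInt(f -> f.columnId != null ? f.columnId : 0)
        .max()
        .orElse(0);
  }

  public static void main(String[] args) {
    List<Field> schema = Arrays.asList(
        new Field("a", 1),
        new Field("b", 2, new Field("c", 7), new Field("d", null)));
    System.out.println(findMaxId(schema)); // prints 7
  }
}
```

Once the traversal is behind an iterable, the max-ID logic becomes a single stream pipeline with no recursion of its own.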

Relates to #4744

How was this patch tested?

Existing unit tests.

Does this PR introduce any user-facing changes?

No.

.stream()
.mapToInt(
    e -> {
      int columnId = hasColumnId(e.getField()) ? getColumnId(e.getField()) : 0;
Collaborator

btw -- what do you think of (later) being able to do e.getField().hasColumnId() and e.getField().getColumnId()?

Today it seems like we have this strange C-struct way of poking at our data?

Collaborator Author

I think this might be part of a larger refactor, because getColumnId might be a little weird when we start talking about nested types. If we push all IDs down into the schema itself, without going through metadata, I think it makes sense; otherwise keeping it more private (as it is today, as a non-OO method) is probably the right direction.

TL;DR: This is probably blocked on deciding how the things we stuff into FieldMetadata are abstracted for end users.
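
To make the trade-off in this thread a bit more concrete, here is a small sketch contrasting the two call styles: static helpers that read an ID out of a field's metadata bag, and an instance method that hides the lookup. Every name here (`SketchField`, `ColumnIdHelpers`, `COLUMN_ID_KEY`, and the key string) is invented for illustration and does not reflect Delta Kernel's real classes or metadata keys.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical field type; metadata is an opaque string-keyed bag.
final class SketchField {
  final String name;
  final Map<String, Object> metadata;

  SketchField(String name, Map<String, Object> metadata) {
    this.name = name;
    this.metadata = Collections.unmodifiableMap(new HashMap<>(metadata));
  }

  // Instance-method style: callers never see the metadata key.
  OptionalInt columnId() {
    return ColumnIdHelpers.hasColumnId(this)
        ? OptionalInt.of(ColumnIdHelpers.getColumnId(this))
        : OptionalInt.empty();
  }
}

// Static-helper style: free functions poking at the field's metadata.
final class ColumnIdHelpers {
  static final String COLUMN_ID_KEY = "example.columnId"; // made-up key

  static boolean hasColumnId(SketchField f) {
    return f.metadata.get(COLUMN_ID_KEY) instanceof Number;
  }

  static int getColumnId(SketchField f) {
    if (!hasColumnId(f)) {
      throw new IllegalArgumentException("no column ID on " + f.name);
    }
    return ((Number) f.metadata.get(COLUMN_ID_KEY)).intValue();
  }

  public static void main(String[] args) {
    Map<String, Object> meta = new HashMap<>();
    meta.put(COLUMN_ID_KEY, 5);
    SketchField f = new SketchField("x", meta);
    System.out.println(getColumnId(f) == f.columnId().getAsInt());
  }
}
```

An instance method like this reads cleanly only when each field object carries exactly one ID. If IDs for nested positions end up living somewhere other than on the field itself, as the concern above suggests they might, a single `columnId()` accessor could be misleading.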

@emkornfield emkornfield force-pushed the stack/schemaIdToSchemaIterable branch from b60855c to 4631f20 Compare June 16, 2025 20:38
@vkorukanti vkorukanti merged commit 848344b into delta-io:master Jun 16, 2025
21 checks passed
vkorukanti pushed a commit that referenced this pull request Jun 17, 2025
## 🥞 Stacked PR
Use this [link](https://github.com/delta-io/delta/pull/4752/files/4631f204cf59016d876a0c3bd408e7c67112b22b..19c2fe51f33bba7e4e584874238884d04bef514e) to review incremental changes.
- [stack/schemaIdToSchemaIterable](#4742) [[Files changed](https://github.com/delta-io/delta/pull/4742/files)]
- [**stack/update_table_features**](#4752) [[Files changed](https://github.com/delta-io/delta/pull/4752/files/4631f204cf59016d876a0c3bd408e7c67112b22b..19c2fe51f33bba7e4e584874238884d04bef514e)]
- [stack/update_validators](#4753) [[Files changed](https://github.com/delta-io/delta/pull/4753/files/19c2fe51f33bba7e4e584874238884d04bef514e..378a8eda53800bcbfffc648459f5b0af69fdfffc)]
- [stack/remove_recurse_from_schema_utils](#4754) [[Files changed](https://github.com/delta-io/delta/pull/4754/files/378a8eda53800bcbfffc648459f5b0af69fdfffc..c4c744aa0f7e6601da7b2d12254eeb0360d9ca13)]

---------
#### Which Delta project/connector is this regarding?

- [ ] Spark
- [ ] Standalone
- [ ] Flink
- [x] Kernel
- [ ] Other (fill in here)

## Description

Migrate to using SchemaIterable for checking for table features.
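
A minimal sketch of what a schema-driven feature check can look like once the schema is exposed as a flat stream: ask whether any field, at any nesting depth, uses a given type. `TypedField`, `walk`, `usesType`, and the string type names are assumptions made for this example, not Delta Kernel's actual types or its table-feature logic.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical schema node; type names are plain strings for simplicity.
final class TypedField {
  final String name;
  final String typeName;
  final List<TypedField> children;

  TypedField(String name, String typeName, TypedField... children) {
    this.name = name;
    this.typeName = typeName;
    this.children = Arrays.asList(children);
  }

  // Stand-in for iterating a schema: this field followed by all descendants.
  Stream<TypedField> walk() {
    return Stream.concat(Stream.of(this), children.stream().flatMap(TypedField::walk));
  }
}

final class FeatureCheckSketch {
  // True if any field anywhere in the schema has the given type name.
  static boolean usesType(List<TypedField> schema, String typeName) {
    return schema.stream()
        .flatMap(TypedField::walk)
        .anyMatch(f -> f.typeName.equals(typeName));
  }

  public static void main(String[] args) {
    List<TypedField> schema = Arrays.asList(
        new TypedField("id", "long"),
        new TypedField("payload", "struct", new TypedField("raw", "variant")));
    System.out.println(usesType(schema, "variant")); // prints true
  }
}
```

Because `anyMatch` short-circuits, the walk stops at the first matching field instead of visiting the whole schema.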

Relates to https://github.com/delta-io/delta/issues/4744

## How was this patch tested?

Existing tests

## Does this PR introduce _any_ user-facing changes?

No
vkorukanti pushed a commit that referenced this pull request Jun 17, 2025
## 🥞 Stacked PR
Use this [link](https://github.com/delta-io/delta/pull/4753/files/19c2fe51f33bba7e4e584874238884d04bef514e..378a8eda53800bcbfffc648459f5b0af69fdfffc) to review incremental changes.
- [stack/schemaIdToSchemaIterable](#4742) [[Files changed](https://github.com/delta-io/delta/pull/4742/files)]
- [stack/update_table_features](#4752) [[Files changed](https://github.com/delta-io/delta/pull/4752/files/4631f204cf59016d876a0c3bd408e7c67112b22b..19c2fe51f33bba7e4e584874238884d04bef514e)]
- [**stack/update_validators**](#4753) [[Files changed](https://github.com/delta-io/delta/pull/4753/files/19c2fe51f33bba7e4e584874238884d04bef514e..378a8eda53800bcbfffc648459f5b0af69fdfffc)]
- [stack/remove_recurse_from_schema_utils](#4754) [[Files changed](https://github.com/delta-io/delta/pull/4754/files/378a8eda53800bcbfffc648459f5b0af69fdfffc..c4c744aa0f7e6601da7b2d12254eeb0360d9ca13)]

---------
#### Which Delta project/connector is this regarding?

- [ ] Spark
- [ ] Standalone
- [ ] Flink
- [x] Kernel
- [ ] Other (fill in here)

## Description

Updates validators to use streams instead of filterRecursively. This
makes the code more idiomatic, with fewer hops.
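
The contrast described here can be sketched as a before/after pair. The `filterRecursively` below is only loosely modeled on the idea of a recursive filter helper and is not the real method's signature; `NamedNode`, `walk`, and the "names must not contain a space" rule are likewise made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical schema node with just a name and children.
final class NamedNode {
  final String name;
  final List<NamedNode> children;

  NamedNode(String name, NamedNode... children) {
    this.name = name;
    this.children = Arrays.asList(children);
  }
}

final class ValidatorSketch {
  // "Before": hand-rolled recursion threading a predicate and an accumulator.
  static void filterRecursively(
      List<NamedNode> nodes, Predicate<NamedNode> p, List<String> out) {
    for (NamedNode n : nodes) {
      if (p.test(n)) out.add(n.name);
      filterRecursively(n.children, p, out);
    }
  }

  // "After": flatten once (pre-order), then use ordinary stream operators.
  static Stream<NamedNode> walk(NamedNode n) {
    return Stream.concat(Stream.of(n), n.children.stream().flatMap(ValidatorSketch::walk));
  }

  static List<String> invalidNames(List<NamedNode> schema) {
    return schema.stream()
        .flatMap(ValidatorSketch::walk)
        .filter(n -> n.name.contains(" ")) // made-up validation rule
        .map(n -> n.name)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<NamedNode> schema = Arrays.asList(
        new NamedNode("ok", new NamedNode("bad name")), new NamedNode("also bad"));
    List<String> before = new ArrayList<>();
    filterRecursively(schema, n -> n.name.contains(" "), before);
    System.out.println(before.equals(invalidNames(schema))); // prints true
  }
}
```

Both versions visit nodes in the same pre-order, so they report the same names in the same order; the stream version simply keeps the traversal and the validation rule in separate places.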

Relates to #4744

## How was this patch tested?

Existing tests.

## Does this PR introduce _any_ user-facing changes?

No
vkorukanti pushed a commit that referenced this pull request Jun 17, 2025
## 🥞 Stacked PR
Use this [link](https://github.com/delta-io/delta/pull/4754/files/378a8eda53800bcbfffc648459f5b0af69fdfffc..c4c744aa0f7e6601da7b2d12254eeb0360d9ca13) to review incremental changes.
- [stack/schemaIdToSchemaIterable](#4742) [[Files changed](https://github.com/delta-io/delta/pull/4742/files)]
- [stack/update_table_features](#4752) [[Files changed](https://github.com/delta-io/delta/pull/4752/files/4631f204cf59016d876a0c3bd408e7c67112b22b..19c2fe51f33bba7e4e584874238884d04bef514e)]
- [stack/update_validators](#4753) [[Files changed](https://github.com/delta-io/delta/pull/4753/files/19c2fe51f33bba7e4e584874238884d04bef514e..378a8eda53800bcbfffc648459f5b0af69fdfffc)]
- [**stack/remove_recurse_from_schema_utils**](#4754) [[Files changed](https://github.com/delta-io/delta/pull/4754/files/378a8eda53800bcbfffc648459f5b0af69fdfffc..c4c744aa0f7e6601da7b2d12254eeb0360d9ca13)]

---------
#### Which Delta project/connector is this regarding?

- [ ] Spark
- [ ] Standalone
- [ ] Flink
- [x] Kernel
- [ ] Other (fill in here)

## Description

Remove last usages of filterRecursively in favor of SchemaIterable.

Relates to #4744

## How was this patch tested?

Existing tests.

## Does this PR introduce _any_ user-facing changes?

No.