
Fix reading Iceberg stats when table has nested fields #8647

Merged (3 commits into trinodb:master on Jul 27, 2021)

Conversation

findepi (Member) commented Jul 23, 2021

Fixes #7999

findepi (Member Author) commented Jul 23, 2021

cc @hashhar @losipiuk @electrum

.projected(0, 2, 3, 4, 5, 6) // ignore data size which is available for Parquet, but not for ORC
.skippingTypesCheck()
.matches("VALUES " +
// TODO (https://github.com/trinodb/trino/issues/8648) the NDV numbers are wrong, should be 1, not 0

phd3 (Member):

doesn't projected index 3 map to nulls-fraction which is correctly 0?

spec mentions that distinct values are deprecated anyway, so keeping them null (index 2) seems fine to me.

findepi (Member Author):

@phd3 good catch. I fell into my own trap here.

electrum (Member):

I was surprised to see that these were deprecated, but it looks like they were just added back: apache/iceberg#2805

findepi (Member Author):

@electrum I don't think per-file NDV is that useful unless the query is very selective.

Do we return them today, if they are present?

Since things like apache/iceberg#2805 depend on the writer, I guess we need SHOW STATS coverage with both Trino and Spark writers.
cc @joshthoward

The method is a one-liner.

This also makes `PartitionTable#getPartitions` even more similar to
analogous code inside `TableStatisticsMaker#makeTableStatistics` method.
@findepi findepi force-pushed the findepi/iceberg-nested branch 2 times, most recently from be06564 to e2e1d03 Compare July 23, 2021 14:28
alexjo2144 (Member) commented:

Looks good to me, just checking that the other changes in the existing PR to fix this issue wound up not being necessary? https://github.com/trinodb/trino/pull/8337/files#diff-b610df3211e6549a57e131f03074eb1b133c6ab227cf4a0ae3868758d4c40491R294

findepi (Member Author) commented Jul 23, 2021

I think PartitionTable.convert changes are not needed (and I don't think they are sound).

- this.idToTypeMapping = icebergTable.schema().columns().stream()
-         .filter(column -> column.type().isPrimitiveType())
-         .collect(Collectors.toMap(Types.NestedField::fieldId, (column) -> column.type().asPrimitiveType()));
+ this.idToTypeMapping = primitiveFieldTypes(icebergTable.schema());
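The helper that replaces the inline stream is what makes nested fields work: the old code filtered to top-level primitive columns only, silently dropping any primitive field inside a struct, which is what broke stats reading (and caused the NPE in #7999). A minimal, self-contained sketch of the recursive flattening, using hypothetical stand-in types rather than Iceberg's real org.apache.iceberg.types classes:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for Iceberg's Type/NestedField hierarchy,
// for illustration only.
interface Type {}
record Primitive(String name) implements Type {}
record Struct(List<Field> fields) implements Type {}
record Field(int id, Type type) {}

public class PrimitiveFieldTypes
{
    // Recursively flatten a schema into fieldId -> primitive type,
    // descending into struct fields instead of skipping them.
    public static Map<Integer, Primitive> primitiveFieldTypes(List<Field> fields)
    {
        Map<Integer, Primitive> result = new LinkedHashMap<>();
        collect(fields, result);
        return result;
    }

    private static void collect(List<Field> fields, Map<Integer, Primitive> result)
    {
        for (Field field : fields) {
            if (field.type() instanceof Primitive primitive) {
                result.put(field.id(), primitive);
            }
            else if (field.type() instanceof Struct struct) {
                collect(struct.fields(), result); // the step the old code was missing
            }
        }
    }

    public static void main(String[] args)
    {
        List<Field> schema = List.of(
                new Field(1, new Primitive("long")),
                new Field(2, new Struct(List.of(
                        new Field(3, new Primitive("string")),
                        new Field(4, new Primitive("int"))))));
        // With the recursion, the nested ids 3 and 4 are included:
        System.out.println(primitiveFieldTypes(schema).keySet()); // prints [1, 3, 4]
    }
}
```

A top-level-only filter over the same schema would produce only {1}, so any later lookup of a nested field's type would return null.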
hashhar (Member) commented Jul 26, 2021:

We should somehow indicate/note that the idToTypeMapping contains the source fields instead of transformed ones. Not sure how though.

Only PartitionField refers to the transformed field, and to get to the source field you need to use PartitionField#sourceId. The inconsistency is a bit confusing to me, but that all happens within the Iceberg API.

findepi (Member Author):

> We should somehow indicate/note that the idToTypeMapping contains the source fields instead of transformed ones.

this didn't change, right?

> Only PartitionField refers to transformed field and to get to source field you need to use PartitionField#sourceId.

We could narrow the map to only partition fields (and rename it appropriately). Is that what you mean here?

hashhar (Member):

Nothing changed - I'm just expressing frustration at something which I had to spend a bit of time fighting when debugging a past issue. 🙂

I don't think we can do anything other than people being aware of the difference between PartitionField and NestedField even though they sound the same.

hashhar (Member) left a comment:

Looks good to me.


@@ -227,7 +227,7 @@ public void updateNullCount(Map<Integer, Long> nullCounts)
this.nullCounts.merge(key, counts, Long::sum));
}

- public static Map<Integer, Object> toMap(Map<Integer, Type.PrimitiveType> idToTypeMapping, Map<Integer, ByteBuffer> idToMetricMap)
+ public static Map<Integer, Object> convertBounds(Map<Integer, Type.PrimitiveType> idToTypeMapping, Map<Integer, ByteBuffer> idToMetricMap)
Member:

optional: the method doesn't have anything specific to "bounds". Maybe convertToValues, getValuesFromByteBuffers, or convertMetrics, but I don't have a suggestion that sounds obviously better.

findepi (Member Author):

> the method doesn't have anything specific to "bounds".

I think it actually does, because bounds come as byte buffers, while ordinary partition values do not.

Member:

Keeping current iceberg data structures aside, is there a reason why logically byte buffers would be more suitable for representing bounds, and hence this naming is more appropriate?

(either way convertBounds reads more easily than toMap, so feel free to resolve this.)

findepi (Member Author):

> is there a reason why logically byte buffers would be more suitable for representing bounds

I don't see why we would use a different representation in different code paths. Some performance reason? Or some unintentional API difference? I need to find out.

(Also seen here: // Partition values are passed as String, but min/max values are passed as a CharBuffer)

> (either way convertBounds reads more easily than toMap, so feel free to resolve this.)

👍
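For context on why the bounds path deals in ByteBuffers at all: Iceberg's manifest metrics store per-column lower/upper bounds in the spec's single-value binary serialization (an int, for example, is 4 little-endian bytes), so reading a bound requires a decode step that typed partition values do not. A minimal JDK-only sketch of decoding an int bound; in the real code this is what Iceberg's Conversions.fromByteBuffer handles for all primitive types:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BoundDecode
{
    // Decode an int bound from Iceberg's single-value binary representation
    // (4 bytes, little-endian, per the Iceberg spec). duplicate() keeps the
    // caller's buffer position untouched.
    static int decodeIntBound(ByteBuffer buffer)
    {
        return buffer.duplicate().order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args)
    {
        // Simulate a serialized lower bound of 42 as it would appear in a manifest.
        ByteBuffer bound = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(42);
        bound.flip();
        System.out.println(decodeIntBound(bound)); // prints 42
    }
}
```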

@findepi findepi merged commit 669a41e into trinodb:master Jul 27, 2021
@findepi findepi deleted the findepi/iceberg-nested branch July 27, 2021 07:22
@findepi findepi mentioned this pull request Jul 27, 2021
11 tasks
@findepi findepi added this to the 360 milestone Jul 27, 2021
Development

Successfully merging this pull request may close these issues.

NullPointerException In Iceberg when joining a partitioned table with nested fields
5 participants