Expected Behavior or Use Case
When using Hive+Iceberg, the Iceberg connector currently does not return table-level distinct-value statistics for partitioned tables. Statistics are critical for generating good query plans, and the connector can do a better job of providing them.
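When the table is partitioned, the current implementation returns only the statistics stored directly on the Iceberg table. See presto/presto-iceberg/src/main/java/com/facebook/presto/iceberg/util/StatisticsUtil.java, lines 39 to 41 in 88cda16:

    if (mergeStrategy.equals(NONE) || spec.isPartitioned()) {
        return icebergStatistics;
    }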
I think we can improve this by merging the statistics even when the table is partitioned. If we verify that table-level statistics exist in the metastore and that no constraint is provided in the call to getTableStatistics, we should be able to merge them safely.
Presto Component, Service, or Connector
Iceberg Connector
Possible Implementation
If a constraint is passed, we shouldn't return the metastore's table-level statistics, because they describe the whole table rather than the constrained subset. Without any constraint, we should still be able to return them.
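The decision described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the connector's code: `MergeStrategy`, `Stats`, `chooseStatistics`, and the `constraintProvided` flag are all hypothetical stand-ins, and the merge itself is reduced to preferring the metastore's distinct-value count over the Iceberg one.

```java
import java.util.Optional;

// Hypothetical sketch; none of these types are the connector's real classes.
final class PartitionedStatsMergeSketch {
    // Stand-in for the configured merge strategy.
    enum MergeStrategy { NONE, USE_NDV }

    // Minimal statistics holder: a row count and an optional distinct-value count.
    static final class Stats {
        final double rowCount;
        final Optional<Double> distinctValues;

        Stats(double rowCount, Optional<Double> distinctValues) {
            this.rowCount = rowCount;
            this.distinctValues = distinctValues;
        }
    }

    static Stats chooseStatistics(
            MergeStrategy mergeStrategy,
            boolean partitioned,
            boolean constraintProvided,
            Stats icebergStatistics,
            Optional<Stats> metastoreStatistics) {
        // Nothing to merge: merging is disabled or the metastore has no table-level statistics.
        if (mergeStrategy == MergeStrategy.NONE || !metastoreStatistics.isPresent()) {
            return icebergStatistics;
        }
        // Table-level statistics cover the whole table, so skip them when a
        // constraint narrows a partitioned table to a subset of its data.
        if (partitioned && constraintProvided) {
            return icebergStatistics;
        }
        // Keep the Iceberg row count; prefer the metastore's distinct-value count when present.
        Stats hive = metastoreStatistics.get();
        Optional<Double> ndv = hive.distinctValues.isPresent()
                ? hive.distinctValues
                : icebergStatistics.distinctValues;
        return new Stats(icebergStatistics.rowCount, ndv);
    }
}
```

With this shape, a partitioned table queried without a constraint picks up the metastore's distinct-value count, while a constrained call falls back to the Iceberg statistics exactly as it does today.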
Context
Better query plans for partitioned Iceberg datasets.