-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add TableStatistics in QueryInputMetadata for Join Query #12808
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % nits
@@ -35,26 +36,23 @@ | |||
private final String table; | |||
private final List<Column> columns; | |||
private final Optional<Object> connectorInfo; | |||
private final Optional<TableStatistics> tableStats; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: simply statistics
@@ -58,15 +63,15 @@ private static Column createColumn(ColumnMetadata columnMetadata) | |||
return new Column(columnMetadata.getName(), columnMetadata.getType().toString()); | |||
} | |||
|
|||
private Input createInput(TableMetadata table, TableHandle tableHandle, Set<Column> columns) | |||
private Input createInput(TableMetadata table, TableHandle tableHandle, Set<Column> columns, Optional<TableStatistics> tableStats) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: statistics
@@ -85,13 +114,20 @@ public Void visitTableScan(TableScanNode node, Void context) | |||
columns.add(createColumn(metadata.getColumnMetadata(session, tableHandle, columnHandle))); | |||
} | |||
|
|||
inputs.add(createInput(metadata.getTableMetadata(session, tableHandle), tableHandle, columns)); | |||
Optional<TableStatistics> tableStats = Optional.empty(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: statistics
Optional<TableStatistics> tableStats = Optional.empty(); | ||
|
||
if (context.extractStats()) { | ||
Constraint<ColumnHandle> constraint = new Constraint<>(node.getCurrentConstraint()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: inline this
|
||
private static class Context | ||
{ | ||
private boolean extractStats; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: extractStatistics
@@ -78,6 +80,7 @@ public int hashCode() | |||
return Objects.hash(min, max); | |||
} | |||
|
|||
@JsonValue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's annotate getMin
and getMax
with the @JsonProperty
to keep this class Json parsable
|
||
if (context.extractStats()) { | ||
Constraint<ColumnHandle> constraint = new Constraint<>(node.getCurrentConstraint()); | ||
tableStats = Optional.of(metadata.getTableStatistics(session, tableHandle, constraint)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's log the stats only for the columns references in the table scan
d41d3a1
to
ae9ede6
Compare
@@ -589,6 +589,23 @@ public TableStatistics getTableStatistics(ConnectorSession session, ConnectorTab | |||
return hiveStatisticsProvider.getTableStatistics(session, ((HiveTableHandle) tableHandle).getSchemaTableName(), columns, columnTypes, partitions); | |||
} | |||
|
|||
@Override | |||
public TableStatistics getTableStatistics(ConnectorSession session, ConnectorTableHandle tableHandle, Constraint<ColumnHandle> constraint, Set<ColumnHandle> desiredColumns) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's don't change the SPI. You can simply request the stats using the old SPI, but before storing them in the Input
, filter out the stats that are not referenced in the TableScanNode
recreating the TableStatistics
object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requesting stats for only the columns needed requires more reworking in hive connector. By asking of filtering the stats for not used columns i just wanted to optimize the storage for the Input objects.
Rebase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % comment
presto-main/src/main/java/com/facebook/presto/sql/planner/InputExtractor.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/sql/planner/InputExtractor.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/sql/planner/InputExtractor.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/sql/planner/InputExtractor.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/sql/planner/InputExtractor.java
Outdated
Show resolved
Hide resolved
Record table statistics in QueryCompletedEvent for join query
Table stats availablity is important for cost-based optimization
of join query. Add it to QueryInputMetadata class.
Test is added to show example of TableStatisitcs in json format.