Hide table layouts from engine #363
In preparation for https://github.com/prestosql/presto/wiki/Pushdown-of-complex-operations, this change removes support for multiple layouts. It's a feature that's not used by any known connector, complicates the work of the optimizer and just gets in the way. In the future, we may add this functionality again but in a different form.
referenced this pull request
Mar 2, 2019
It's a feature we added a few years ago for a very specific use case at FB to allow connectors to provide multiple physical organizations of a table for the engine to use during optimization. No one else is using it. Since it makes it hard to introduce new features like complex operation pushdown, we're removing it for now. We may reintroduce it later in some other shape (materialized query tables/materialized views, etc).
Before, we had two constructors:
@JsonCreator
public TableScanNode(
        @JsonProperty("id") PlanNodeId id,
        @JsonProperty("table") TableHandle table,
        @JsonProperty("outputSymbols") List<Symbol> outputs,
        @JsonProperty("assignments") Map<Symbol, ColumnHandle> assignments,
        @JsonProperty("layout") Optional<TableLayoutHandle> tableLayout)

public TableScanNode(
        PlanNodeId id,
        TableHandle table,
        List<Symbol> outputs,
        Map<Symbol, ColumnHandle> assignments)
After removing table layouts, both constructors have the same signature but behave differently.
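To make the concern concrete: Java does not allow two constructors with the same parameter list, so if both behaviors are still needed, one option is a single private constructor behind two named static factories. The sketch below is only an illustration under assumptions. `ScanNodeSketch`, `fromSerialized`, `create`, and the `validated` flag are hypothetical, `String` stands in for the real Presto types, and the "validate the assignments" difference is invented for the example, since the thread does not say what the two constructors actually do differently.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical stand-in for TableScanNode (String replaces the Presto types).
final class ScanNodeSketch {
    final String id;
    final String table;
    final List<String> outputs;
    final Map<String, String> assignments;
    final boolean validated;

    // Single private constructor: a second one with the same parameter list would not compile.
    private ScanNodeSketch(String id, String table, List<String> outputs,
            Map<String, String> assignments, boolean validated) {
        this.id = id;
        this.table = table;
        this.outputs = outputs;
        this.assignments = assignments;
        this.validated = validated;
    }

    // Hypothetical deserialization path: accepts the plan as it was serialized.
    static ScanNodeSketch fromSerialized(String id, String table, List<String> outputs,
            Map<String, String> assignments) {
        return new ScanNodeSketch(id, table, outputs, assignments, false);
    }

    // Hypothetical planner path: checks that every output has a column assigned.
    static ScanNodeSketch create(String id, String table, List<String> outputs,
            Map<String, String> assignments) {
        for (String output : outputs) {
            if (!assignments.containsKey(output)) {
                throw new IllegalArgumentException("No column assigned to output: " + output);
            }
        }
        return new ScanNodeSketch(id, table, outputs, assignments, true);
    }
}

final class ScanNodeSketchDemo {
    public static void main(String[] args) {
        ScanNodeSketch node = ScanNodeSketch.create(
                "1", "orders",
                Collections.singletonList("x"),
                Collections.singletonMap("x", "col_x"));
        System.out.println(node.table + " validated=" + node.validated);
    }
}
```

With named factories the call site states which behavior it wants. In the real class, Jackson's `@JsonCreator` can be placed on a static factory method instead of a constructor, so the JSON path would not need a constructor of its own.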
It will still be a simple wrapper around
No, there won't. It will be just TableHandles wrapping ConnectorTableHandles. It's up to connectors to decide what they need to track in a ConnectorTableHandle based on what can be pushed down into the connector.
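A rough sketch of that shape, under assumptions: the engine-side handle is an opaque wrapper, and the connector's own handle carries whatever was pushed down. Everything here is a simplified stand-in rather than the real SPI. `ExampleTableHandle`, its `limit` field, and `withLimit` are hypothetical, and the `String` catalog name is a placeholder for however the engine identifies the connector.

```java
import java.util.OptionalLong;

// Stand-in for the SPI marker interface implemented by each connector.
interface ConnectorTableHandle {}

// Hypothetical connector handle: the connector decides what state to track.
final class ExampleTableHandle implements ConnectorTableHandle {
    final String schemaName;
    final String tableName;
    final OptionalLong limit; // set only after a LIMIT has been pushed down

    ExampleTableHandle(String schemaName, String tableName, OptionalLong limit) {
        this.schemaName = schemaName;
        this.tableName = tableName;
        this.limit = limit;
    }

    // Returns a new handle carrying the tighter of the existing and the new limit.
    ExampleTableHandle withLimit(long newLimit) {
        long effective = limit.isPresent() ? Math.min(limit.getAsLong(), newLimit) : newLimit;
        return new ExampleTableHandle(schemaName, tableName, OptionalLong.of(effective));
    }
}

// Engine-side wrapper: it never looks inside the connector handle.
final class TableHandle {
    final String catalogName; // assumption: placeholder for the engine's connector identifier
    final ConnectorTableHandle connectorHandle;

    TableHandle(String catalogName, ConnectorTableHandle connectorHandle) {
        this.catalogName = catalogName;
        this.connectorHandle = connectorHandle;
    }
}

final class TableHandleSketch {
    public static void main(String[] args) {
        ExampleTableHandle base = new ExampleTableHandle("sales", "orders", OptionalLong.empty());
        TableHandle before = new TableHandle("example", base);
        // After pushdown, the engine simply swaps in the handle the connector returned.
        TableHandle after = new TableHandle(before.catalogName, base.withLimit(100));
        System.out.println(((ExampleTableHandle) after.connectorHandle).limit);
    }
}
```

The point of the design is that the engine only passes handles back and forth, so no layout-specific type is needed on the engine side.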
That's really hard to do in general. It forces entire parallel hierarchies and APIs to represent similar objects with different constraints (e.g., we'd need separate APIs to resolve a table during analysis for each possible use so we can get a table handle for insert, a table handle for delete, etc.). Also, where do we stop? Would we have one of each expression kind for each SQL type to prevent misuse? It's a fine balance, and I think the current concepts provide good separation between action (expression/plan node type), object (columns, tables, arguments) and attributes (physical or logical properties) without requiring an N×M×R explosion of valid combinations.