Support syntax and AST building for Materialized View Commands ... #3283
@anjalinorwood How does this relate to Hive 3's materialized views?
This is not related to Hive 3's materialized views. Details are here: https://docs.google.com/document/d/1MOtYt7BFNFoDBc7SwCStHzXy6cyzOvAKURH3hbnk8VU/edit?usp=sharing I have been discussing this feature with @martint. Would you like to be added to that email conversation?
@anjalinorwood Yes, thanks.
@anjalinorwood How is this going to be modeled in connectors, and how is refresh going to be implemented in the execution engine? The doc you mentioned covers only the syntax.
There is an open question around how to implement refresh in the engine. A proposal here: As for the connector-side implementation, here at Netflix we will start with Iceberg. Some details here: The PR for the API is here: #3061 The idea is to nail down the syntax for creating, refreshing and dropping a materialized view, along with the connector API, so that the community can start on materialized view implementations for their favorite connectors. In this first version, we are not proposing automatic rewrite / query routing to materialized views. The user query will be written against the materialized view. Refresh provides a convenient way to keep materialized views fresh. (Incremental refresh can be implemented by the connector.)
@anjalinorwood Can you also add me to the email conversation that you mentioned?
Given the community interest in this feature, it is a good idea to keep the discussion on GitHub/gdrive. It turns out there are no additional details in the email. Relevant links are in my comment here: #3283 (comment) I looked at the Verada syntax; it looks similar to the proposal above. :-)
The final proposal for the create/refresh/drop materialized view syntax is here: https://docs.google.com/document/d/10jPGw3t-Tu8OgWo5oC9d-O8d1PVdnAbEhyfvnJA0T8U/edit
This commit adds support for materialized views in the Presto engine.
Much like a logical view, a materialized view has a SQL query associated with it.
Unlike a logical view, it stores the data corresponding to the SQL query.
The commit adds support for the CREATE MATERIALIZED VIEW, REFRESH MATERIALIZED VIEW,
SHOW CREATE MATERIALIZED VIEW and DROP MATERIALIZED VIEW commands.

The commit adds support for reading data from a materialized view when it is fresh
with respect to its underlying base tables. When a materialized view is stale with
respect to its base tables, it is resolved to the base tables using the associated
definition. Querying the materialized view therefore always returns current data,
irrespective of the state of the materialized view.

A materialized view is modeled as a combination of a SQL definition and a storage
table that holds the data.

REFRESH MATERIALIZED VIEW implementation:
+ The refresh materialized view operation is implemented as a table writer that drops
  partitions from, deletes data from and inserts data into the storage table as needed.
  The source of the data is the query associated with the materialized view.
+ A new type of TableWriterOperator, ‘RefreshMaterializedViewTarget’, is implemented.
  This translates into two connector API calls, ‘beginRefreshMaterializedView’ and
  ‘finishRefreshMaterializedView’.
+ StatementAnalyzer determines whether the materialized view is fresh and sets a flag
  in Analysis. If the materialized view is fresh, the logical planner plans the refresh
  operation as a no-op.
+ The ‘beginRefreshMaterializedView’ implementation for a connector is expected to do
  the following:
  + Start a transaction.
  + Drop the specified partitions of the storage table based on input parameters
    (applicable only to incremental refresh of the materialized view).
  + Delete data from the specified partitions of the storage table, or all of the data
    in the storage table, based on input parameters (applicable to incremental refresh
    and full refresh respectively).
  + Return a ConnectorInsertTableHandle.
+ The ‘finishRefreshMaterializedView’ implementation for a connector is expected to do
  the following:
  + Insert data into the storage table based on parameters.
  + Store the table tokens for the base tables in the storage table.
  + Commit the transaction.
+ Note that the refresh materialized view operation is performed in the scope of a
  single transaction in the connector.

Access control:
Given that materialized views can be seen as a combination of a view and a table,
access control for a CREATE MATERIALIZED VIEW command is a combination of the access
checks for the CREATE TABLE and CREATE VIEW commands.
Similarly, a REFRESH MATERIALIZED VIEW command is a combination of DELETE and INSERT
operations, and the access checks for this command are a combination of the access
checks for DELETE and INSERT.
Lastly, a DROP MATERIALIZED VIEW access check is a combination of the DROP TABLE and
DROP VIEW access checks.
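The refresh flow described in the commit message above can be illustrated with a small, self-contained sketch. It is not the Presto SPI: `RefreshSketch`, `StorageTable`, `InsertHandle`, `isFresh`, `beginRefresh` and `finishRefresh` are hypothetical names chosen for illustration, the "table tokens" are modeled as plain `Long` versions per base table, and only a full refresh is shown. It assumes the freshness check is a comparison of stored tokens against current ones, which the commit message implies but does not spell out, and it does not model transaction rollback.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a connector; none of these types are the Presto SPI.
public class RefreshSketch {
    // Storage table rows plus the base-table tokens recorded at the last refresh.
    static final class StorageTable {
        final List<String> rows = new ArrayList<>();
        final Map<String, Long> baseTableTokens = new HashMap<>();
    }

    // Stand-in for the insert handle returned by the "begin" call; rows are staged here.
    static final class InsertHandle {
        final List<String> staged = new ArrayList<>();
    }

    final StorageTable storage = new StorageTable();

    // Assumption: the view is fresh when the stored tokens equal the current ones.
    boolean isFresh(Map<String, Long> currentTokens) {
        return storage.baseTableTokens.equals(currentTokens);
    }

    // Full refresh: delete all existing data, then return a handle for the inserts.
    InsertHandle beginRefresh() {
        storage.rows.clear();
        return new InsertHandle();
    }

    // Insert the staged rows and record the base-table tokens (the "commit" step).
    void finishRefresh(InsertHandle handle, Map<String, Long> currentTokens) {
        storage.rows.addAll(handle.staged);
        storage.baseTableTokens.clear();
        storage.baseTableTokens.putAll(currentTokens);
    }

    // Returns true if a refresh ran, false if it was skipped as a no-op.
    boolean refresh(List<String> queryResult, Map<String, Long> currentTokens) {
        if (isFresh(currentTokens)) {
            return false;
        }
        InsertHandle handle = beginRefresh();
        handle.staged.addAll(queryResult);
        finishRefresh(handle, currentTokens);
        return true;
    }

    public static void main(String[] args) {
        RefreshSketch mv = new RefreshSketch();
        Map<String, Long> tokens = Map.of("orders", 1L);
        System.out.println(mv.refresh(List.of("a", "b"), tokens)); // prints "true"
        System.out.println(mv.refresh(List.of("a", "b"), tokens)); // prints "false"
    }
}
```

The second call is skipped because the stored tokens already match, mirroring the planner treating a refresh of a fresh materialized view as a no-op.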
@anjalinorwood @martint We need access control checks for this. See #5041
... like CREATE MATERIALIZED VIEW, REFRESH MATERIALIZED VIEW
and DROP MATERIALIZED VIEW.
Much like a logical view, a materialized view has a SQL query associated with it.
Unlike a logical view, it stores the data corresponding to the SQL query.
This commit adds support to parse the materialized view related commands and
build an AST for those commands. This commit does not include connector-side
implementation of materialized views.
Materialized views are modeled as an extension of logical views with additional
properties such as partitioning.
Given that materialized views can be seen as a combination of a view and a table,
access control for a CREATE MATERIALIZED VIEW command is a combination of access
checks for CREATE TABLE and CREATE VIEW commands.
Similarly, a REFRESH MATERIALIZED VIEW command is a combination of DELETE and INSERT
operations, and the access checks for this command are a combination of the access
checks for DELETE and INSERT.
Lastly, a DROP MATERIALIZED VIEW access check is a combination of the DROP TABLE and
DROP VIEW access checks.
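
The access-control rule in this commit message amounts to "each materialized view command requires the union of two existing checks". A minimal sketch of that mapping follows. The enum and method names (`Privilege`, `Command`, `required`, `isAllowed`) are hypothetical and do not correspond to Presto's access control interfaces; real checks are made per object and per identity, which this sketch ignores.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical illustration of the combined access checks; not Presto's AccessControl API.
public class MaterializedViewAccessSketch {
    enum Privilege { CREATE_TABLE, CREATE_VIEW, DELETE, INSERT, DROP_TABLE, DROP_VIEW }

    enum Command { CREATE_MATERIALIZED_VIEW, REFRESH_MATERIALIZED_VIEW, DROP_MATERIALIZED_VIEW }

    // Each materialized view command needs both of the underlying privileges.
    static Set<Privilege> required(Command command) {
        switch (command) {
            case CREATE_MATERIALIZED_VIEW:
                return EnumSet.of(Privilege.CREATE_TABLE, Privilege.CREATE_VIEW);
            case REFRESH_MATERIALIZED_VIEW:
                return EnumSet.of(Privilege.DELETE, Privilege.INSERT);
            case DROP_MATERIALIZED_VIEW:
                return EnumSet.of(Privilege.DROP_TABLE, Privilege.DROP_VIEW);
            default:
                throw new IllegalArgumentException("Unknown command: " + command);
        }
    }

    // The command is allowed only if every required privilege has been granted.
    static boolean isAllowed(Command command, Set<Privilege> granted) {
        return granted.containsAll(required(command));
    }

    public static void main(String[] args) {
        Set<Privilege> granted = EnumSet.of(Privilege.DELETE, Privilege.INSERT, Privilege.CREATE_VIEW);
        System.out.println(isAllowed(Command.REFRESH_MATERIALIZED_VIEW, granted)); // prints "true"
        System.out.println(isAllowed(Command.CREATE_MATERIALIZED_VIEW, granted));  // prints "false"
    }
}
```

In the example, CREATE MATERIALIZED VIEW is denied because CREATE_VIEW alone is not enough; the CREATE_TABLE check must pass as well.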