-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add SHOW CREATE TABLE with initial support for views #2830
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2830 +/- ##
==========================================
- Coverage 85.26% 85.21% -0.05%
==========================================
Files 275 275
Lines 48830 48907 +77
==========================================
+ Hits 41633 41676 +43
- Misses 7197 7231 +34
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @mrob95 -- this looks like a great start.
I would be happy with merging this PR in and doing any refactors as follow on PRs (like adding support for information_schema.views
and removing create_statement
column from information_schema.tables
).
Let me know if you want to make any changes to this PR otherwise I will merge this PR in and file a follow on ticket.
👍 again 🙏
Comment 1
I don't think information_schema.tables.create_statement
is the standard location for the view information. I think it is actually in a new information schema table information_table.views
. For example, in mysql:
mysql> select table_catalog, table_schema, table_name, view_definition from information_schema.views where table_name = 'ff';
+---------------+--------------+------------+------------------------------------------------+
| TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME | VIEW_DEFINITION |
+---------------+--------------+------------+------------------------------------------------+
| def | foo | ff | select count(0) AS `count(*)` from `foo`.`foo` |
+---------------+--------------+------------+------------------------------------------------+
1 row in set (0.00 sec)
Comment 2
Passing the SQL string down through SqlToRel::statement_to_plan alongside the parsed Statement seems a little awkward. An alternative would be to use format!("{}", statement) which would produce valid SQL, but not the exact SQL that was used to create the table/view.
I think using format
is a good idea (or perhaps sql.to_string()
) -- in general sqlparser-rs
has fairly good coverage for round trip parsing (many of the test cases are implemented like that)
I agree, have made this change now.
This should be fairly simple to do in a separate PR. I also think it would be useful to implement "SHOW CREATE TABLE" for external tables so you can get a summary of which tables point at which files using what options. |
Do you think "definition" would be a better name than "create_statement"? |
datafusion/proto/src/logical_plan.rs
Outdated
@@ -515,11 +515,16 @@ impl AsLogicalPlan for LogicalPlanNode { | |||
"Protobuf deserialization error, CreateViewNode has invalid LogicalPlan input.", | |||
)))? | |||
.try_into_logical_plan(ctx, extension_codec)?; | |||
let create_statement = match create_view.create_statement.is_empty() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: matching on a bool
could be simplified to if
/else
I do find |
That makes sense for the |
Yes |
Pushed an update |
I think this looks good -- if @andygrove agrees I will merge this and file a follow on ticket to add |
Thanks again @mrob95 |
My pleasure, thanks for the prompt and thorough reviews 👍 |
LGTM. Thanks @mrob95 |
Filed #2857 for |
Which issue does this PR close?
Closes #2529 .
Rationale for this change
What changes are included in this PR?
sqlparser-rs
. "SHOW CREATE TABLE" is what spark uses for both tables and views and using the same command for both seems simpler than having separate ones.SqlToRel::statement_to_plan
andLogicalPlan::CreateView
to be stored onViewTable
, which is registered in the catalog.create_statement
method to theTableProvider
trait that allows table providers to optionally expose the SQL query used to create them.AddLogicalPlan::ShowCreateTable
andShowCreateTableExec
to look the table up, get theTableProvider
, get thecreate_statement
and output it.create_statement
toinformation_schema.tables
information_schema.tables
This PR only supports views at the moment, but is designed to hopefully make it fairly easy to extend to tables and potentially other data sources. They would just need to implement
create_statement
.Passing the SQL string down through
SqlToRel::statement_to_plan
alongside the parsedStatement
seems a little awkward. An alternative would be to useformat!("{}", statement)
which would produce valid SQL, but not the exact SQL that was used to create the table/view. It doesn't seem like either MySQL or spark try to guarantee that the SQL return fromSHOW..
is literally the same string as was used inCREATE
. Would be good to get some opinions on this.Would be happy to take feedback on the structure of this if there is a better way of doing it - I spent a while reading through
CreateView
,Explain
,Analyze
etc to figure out where exactly the execution of this should be handled but may have missed something :). I think the main question is whether to put the creation statement ininformation_schema
or to have "SHOW CREATE TABLE" get it from the table provider directly. I like having it ininformation_schema
because it makes the implementation of "SHOW CREATE TABLE" simple and because exposing info in tables allows the user to slice and dice them as they wish. The original commit of this PR did it the other way though (with new logical and physical plan nodes forShowCreateTable
), so could easily revert if that is better.Are there any user-facing changes?