Support create table with explicit column definitions
#4194
alamb
left a comment
Thank you @doki23 -- this looks like a nice improvement! I have a few suggestions -- the only thing I think is needed prior to merge is a test for mismatched column count.
We can file a follow-up issue to support specifying column types for CREATE TABLE AS SELECT in a later PR.
datafusion/sql/src/planner.rs
Outdated
SetExpr::Values(_) => {
    let schema = self.build_schema(columns)?.to_dfschema_ref()?;
    if schema.fields().len() != input_schema.fields().len() {
        return Err(DataFusionError::Plan("Mismatch between schema and batches".to_string()))
I suggest adding a test for this situation (mismatched column count).
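For illustration, the kind of check and test being asked for can be sketched outside the planner. This is a minimal standalone sketch, not the PR's code: `check_column_count` and its error text are hypothetical, and plain string slices stand in for the declared schema and the input schema.

```rust
/// Hypothetical stand-in for the planner check: compares the number of
/// declared columns with the number of columns the input produces.
fn check_column_count(declared: &[&str], input: &[&str]) -> Result<(), String> {
    if declared.len() != input.len() {
        // Report both counts so the mismatch is easy to diagnose.
        return Err(format!(
            "Mismatch between schema and input: {} columns declared, {} provided",
            declared.len(),
            input.len()
        ));
    }
    Ok(())
}

/// Sketch of the requested test: two declared columns against three input
/// columns must fail, while matching counts must pass.
fn mismatched_count_is_rejected() -> bool {
    let mismatched = check_column_count(&["c1", "c2"], &["column1", "column2", "column3"]);
    let matched = check_column_count(&["c1", "c2"], &["column1", "column2"]);
    mismatched.is_err() && matched.is_ok()
}
```

In the real planner the equivalent test would presumably run a `CREATE TABLE` statement whose column list is shorter or longer than its `VALUES` rows and assert on the returned plan error.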
datafusion/sql/src/planner.rs
Outdated
_ => return Err(DataFusionError::Plan(
    "You can only specify schema when create table with a `values` statement"
        .to_string()
))
I wonder why not support queries as well? Why only VALUES -- the code you have written should work for a CREATE TABLE AS SELECT as well
It's because SELECT has its own schema. But it does make sense if someone wants to cast the output columns to the declared types. I'll consider your suggestions and add some tests for this case.
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
I'm going to do it in this PR, so there's no need to file a follow-up issue, @alamb.
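The casting idea discussed in this thread (declared column types applied on top of a query that already has its own schema) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not how the planner implements it: `Column`, `col`, and `cast_projection` are hypothetical names, and types are compared as plain SQL type names rather than Arrow data types.

```rust
/// Hypothetical column description: a name plus a SQL type name.
#[derive(Debug, Clone)]
struct Column {
    name: String,
    sql_type: String,
}

/// Convenience constructor for the sketch.
fn col(name: &str, sql_type: &str) -> Column {
    Column {
        name: name.to_string(),
        sql_type: sql_type.to_string(),
    }
}

/// Builds one projection expression per declared column. An input column is
/// wrapped in a CAST only when its type differs from the declared type;
/// otherwise it is simply renamed.
fn cast_projection(declared: &[Column], input: &[Column]) -> Result<Vec<String>, String> {
    if declared.len() != input.len() {
        return Err(format!(
            "Mismatch: {} columns declared, but the query produces {}",
            declared.len(),
            input.len()
        ));
    }
    Ok(declared
        .iter()
        .zip(input.iter())
        .map(|(want, have)| {
            if want.sql_type.eq_ignore_ascii_case(&have.sql_type) {
                format!("{} AS {}", have.name, want.name)
            } else {
                format!("CAST({} AS {}) AS {}", have.name, want.sql_type, want.name)
            }
        })
        .collect())
}
```

Pairing columns by position rather than by name mirrors how the declared column list lines up with the query's output columns; the count check comes first so that the zip never silently drops a column.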
alamb
left a comment
Thank you @doki23 -- this looks great!
let ctx = SessionContext::new();
register_aggregate_simple_csv(&ctx).await?;

let sql = "CREATE TABLE my_table(c1 float, c2 double, c3 boolean, c4 varchar) \
👍
let ctx = SessionContext::new();
register_aggregate_simple_csv(&ctx).await?;

let sql = "CREATE TABLE my_table(c1 float, c2 double, c3 boolean, c4 varchar) \
create table with explicit column definitions
I took the liberty of merging this PR up to master and fixing the clippy issue so we can get it merged. Thanks again @doki23
BTW I also made a small change to one of the tests to satisfy clippy (a2172db)
Thanks again @doki23
Benchmark runs are scheduled for baseline = 88c1201 and contender = 74199d6. 74199d6 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.

Which issue does this PR close?
Closes #4183
What changes are included in this PR?
Most changes are in the planner code that creates memory tables.
Are these changes tested?
Yes.
Are there any user-facing changes?
Users can create a table with an explicitly specified schema from a VALUES query.