[SPARK-20845][SQL] Support specification of column names in INSERT INTO command. #22532
@janewangfb, @gatorsmile could you please review this change?
ok to test
Test build #97006 has finished for PR 22532 at commit
Thanks for submitting the PR! I quickly scanned the code changes. The general direction looks right, but the quality is not there yet. I would suggest writing the test plan before we do the code review. Could you try your best to write down what we should test to support this feature, covering both negative and positive cases?
Many thanks for the feedback. I will list the test scenarios that I had in mind and collected while implementing this item. Sorry about the failure; it seems I did not rerun all the tests in my last step. For example, when the same field is queried multiple times, it is not handled properly. I will fix these cases as well.
Is there any progress? @misutoth
override def visitNamedExpressionSeq(ctx: NamedExpressionSeqContext): Seq[Expression] = {
  ctx.namedExpression.asScala.map(visitNamedExpression)
}
Why does it need to be overridden?
Can one of the admins verify this patch?
Closing this due to author's inactivity.
@misutoth Can we push this forward? Hive supports this syntax, so isn't the Thrift server incompatible without it?
Our project will also not be able to use Apache Spark unless the standard INSERT syntax with column names is supported, so I would also be interested in this change being applied.
What changes were proposed in this pull request?
One can specify a list of columns for an INSERT INTO command. The columns are listed in parentheses immediately after the table name. Query columns are then matched to the listed columns in that same order.
In the above example the second insertion uses the new functionality. The number and its associated string are given in reverse order, (2, 'second'), according to the column list (i, s) specified for the table. The result can be seen at the end of the command list. Intermediate output of the commands is omitted for the sake of brevity.
How was this patch tested?
InsertSuite (in both the source and hive sub-packages) was extended with tests exercising the specification of column name lists in INSERT INTO commands.
Also ran the above sample, and ran the tests in sql.
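The matching rule described in this PR, together with the kind of positive and negative cases requested in review, can be illustrated with a small standalone sketch. It is not the code in this patch and does not use Spark: the object `InsertColumnMatching` and the function `reorder` are hypothetical names, column names are compared case-sensitively, and rejecting partial column lists is an assumption of the sketch rather than a statement about what the patch does.

```scala
// Hypothetical, Spark-free model of matching query values to an INSERT column list.
// All names here are illustrative assumptions, not part of the patch.
object InsertColumnMatching {

  /** Reorders `values`, given in `columnList` order, into `tableColumns` order.
    * Returns Left with a message for the negative cases this sketch covers.
    */
  def reorder(
      tableColumns: Seq[String],
      columnList: Seq[String],
      values: Seq[Any]): Either[String, Seq[Any]] = {
    val unknown = columnList.filterNot(c => tableColumns.contains(c))
    if (unknown.nonEmpty) {
      Left(s"unknown column(s): ${unknown.mkString(", ")}")
    } else if (columnList.distinct.size != columnList.size) {
      Left("a column is listed more than once")
    } else if (columnList.size != values.size) {
      Left(s"expected ${columnList.size} values but got ${values.size}")
    } else if (columnList.size != tableColumns.size) {
      // Assumption of this sketch: every table column must be listed.
      Left("partial column lists are not handled in this sketch")
    } else {
      // Look each table column up by name to restore the table's own order.
      val byName: Map[String, Any] = columnList.zip(values).toMap
      Right(tableColumns.map(c => byName(c)))
    }
  }

  def main(args: Array[String]): Unit = {
    val table = Seq("s", "i") // assumed table column order for illustration

    // Positive case: values arrive in (i, s) order and are put back into (s, i) order.
    assert(reorder(table, Seq("i", "s"), Seq(2, "second")) == Right(Seq("second", 2)))

    // Negative cases: duplicate column, unknown column, value count mismatch.
    assert(reorder(table, Seq("i", "i"), Seq(2, 3)).isLeft)
    assert(reorder(table, Seq("i", "x"), Seq(2, "second")).isLeft)
    assert(reorder(table, Seq("i", "s"), Seq(2)).isLeft)
  }
}
```

Each `Left` branch corresponds to one negative scenario a test plan for this feature might list; the real suites would exercise the same scenarios through SQL statements instead.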