[WIP][SPARK-17046][SQL] prevent user using dataframe.select with empty param list #14629
|
Test build #63723 has finished for PR 14629 at commit
|
Force-pushed from 1179249 to 8060aa4
|
Test build #63724 has finished for PR 14629 at commit
|
|
@srowen |
|
Why do we want to enforce this? It is valid to have a DataFrame without any columns. |
|
Yes that's a good question. A 0-column DataFrame is valid, though that's a little different from being able to select 0 columns from a DataFrame. I don't have a database handy, but can you select no columns in any SQL syntax? Maybe best to emulate that? |
|
MySQL does not allow a select with 0 columns, and I think select() with no arguments is useless; no one would perform such an operation. So, is it better to generate a compile error when detecting code that uses |
|
A |
|
Interesting point, yeah, because normally in an RDBMS you have to |
|
@hvanhovell @rxin unless you've changed your stance a little bit on this, I think the conclusion is that it isn't worth changing this behavior and we can close this @WeichenXu123 |
|
I haven't changed my mind on this. Let's close this one. |
What changes were proposed in this pull request?
The DataFrame API defines:
    def select(col: String, cols: String*)
This definition prevents the user from calling select with an empty parameter list:
    df.select( )
However, df.select( ) currently still compiles, because it matches this overload:
    def select(cols: Column*)
This change adds an overload:
    def select(col: Column, cols: Column*)
and changes
    def select(cols: Column*)
into
    private[spark] def select(cols: Column*)
so that the public select API can only be called with a non-empty parameter list.
How was this patch tested?
Existing tests.
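The idea behind the proposed signature can be sketched outside Spark. The snippet below is a minimal illustration, not the PR's actual diff: `Col` and `Frame` are hypothetical stand-ins for `Column` and `DataFrame`, and the varargs-only method is given a different name (`selectAll`) and left public so the contrast can be shown from one file. In the PR it would keep the name `select` and become `private[spark]`.

```scala
// Toy stand-ins for illustration only; these are NOT Spark's classes.
final case class Col(name: String)

final class Frame(val columns: Seq[String]) {
  // Varargs-only shape: an empty argument list type-checks,
  // so a call with no columns yields a zero-column result.
  def selectAll(cols: Col*): Frame = new Frame(cols.map(_.name))

  // Proposed public shape: the first column is a required parameter,
  // so a call with no arguments is rejected by the compiler.
  def select(col: Col, cols: Col*): Frame = selectAll((col +: cols): _*)
}

object SelectDemo {
  def main(args: Array[String]): Unit = {
    val df = new Frame(Seq("a", "b"))

    // Compiles and produces a frame with no columns.
    println(df.selectAll().columns.length) // prints 0

    // Compiles: at least one column is supplied.
    println(df.select(Col("a"), Col("b")).columns.mkString(",")) // prints a,b

    // df.select() // would not compile: the required first argument is missing
  }
}
```

As the reviewers point out, the zero-column result is itself a valid value, so this pattern only makes sense if the empty call is considered a mistake rather than a legitimate operation.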