-
Notifications
You must be signed in to change notification settings - Fork 29.1k
[SPARK-44475][SQL][CONNECT] Relocate DataType and Parser to sql/api #41928
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't you move the entire utils class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could. I happed to move the least part that needed in this refactoring PR and it seems to be enough for moving a small portion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AttributeNameParser?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
cd6a9f5 to
668a6c8
Compare
| this._precision = decimal.scale | ||
| this._scale = decimal.scale | ||
| } else if (decimal.scale < 0 && !SQLConf.get.allowNegativeScaleOfDecimalEnabled) { | ||
| } else if (decimal.scale < 0 && !SqlApiConf.get.allowNegativeScaleOfDecimalEnabled) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do we bind SqlApiConf.get to the active SQLConf at the server side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As soon as SQLConf is loaded we connect them: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L185
|
The k8s failure is unrelated. Thanks, merging to master/3.5 (for scala client)! |
### What changes were proposed in this pull request? This PR relocates Parser and DataType family to sql/api. ### Why are the changes needed? To extract DataType and Parser as an API into a shared module. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? UT Closes #41928 from amaliujia/datatype_1. Authored-by: Rui Wang <rui.wang@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit 8ff6b7a) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request? #41928 moved antlr4 related files and directories from the `sql/catalyst` module to the `sql/api` module. This pr fix the corresponding git ignore rules to avoid unexpected diffs when executing `git status`. ### Why are the changes needed? Avoid unexpected diffs when executing `git status`. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - Manual check `git status` **Before** ``` sql/api/gen/ sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/gen/ ``` **After** no diff. Closes #42342 from LuciferYang/minor-gitignore. Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request? #41928 moved antlr4 related files and directories from the `sql/catalyst` module to the `sql/api` module. This pr fix the corresponding git ignore rules to avoid unexpected diffs when executing `git status`. ### Why are the changes needed? Avoid unexpected diffs when executing `git status`. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - Manual check `git status` **Before** ``` sql/api/gen/ sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/gen/ ``` **After** no diff. Closes #42342 from LuciferYang/minor-gitignore. Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 2d3bb4d) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
JackBuggins
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amaliujia @dongjoon-hyun, Apologies if I am missing something or have this wrong, but I think there can be breaking user facing changes if an API has some function expecting a StructType param, then they rely on being able to call toAttributes based on what it is passed based on that param.
I appreciate the objective of the change, and perhaps there is a nice way to handle this now when we are provided with a StructType obj and we want to use this from it, however StructType is annotated as Stable so I didn't expect changes to come here.
Could we either provide an example of what best to do in that situation either here or in the docs when users hit this scenario to help mitigate any challenges that will pose to users, or otherwise consider keeping an implementation, even marked as deprecated in StructType that makes use of the implementation from DataTypeUtils?
|
Is it as simple as |
|
Yes it is as simple as |
|
Thank you @amaliujia. As for StructType's interface stability and the removal, do you have an opinion on StructType maintaining an implementation for the above type of case? |
|
@JackBuggins while There is no real way to bring back this particular method. We can however provide an implicit conversion that will make life a bit easier, e.g.: implicit class RichStructType(val schema: StructType) extends AnyVal {
def toAttributes: Seq[Attribute] = DataTypeUtils.toAttributes(schema)
}
|
…`sql/catalyst/pom.xml` ### What changes were proposed in this pull request? SPARK-44475(#41928) has already moved the `antlr4-maven-plugin` relevant configuration to `sql/api/pom.xml`, so the configuration in the `catalyst` module is unused now, so this pr remove it from `sql/catalyst/pom.xml` ### Why are the changes needed? Clean up unused Maven `antlr4-maven-plugin` from catalyst module. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - Pass GitHub Actions - Manual verification ``` build/mvn clean install -DskipTests -pl sql/catalyst -am ``` Before We can see ``` [INFO] [INFO] --- antlr4-maven-plugin:4.9.3:antlr4 (default) spark-catalyst_2.12 --- [INFO] No ANTLR 4 grammars to compile in /Users/yangjie01/SourceCode/git/spark-mine-12/sql/catalyst/src/main/antlr4 [INFO] ``` After no relevant messages. ### Was this patch authored or co-authored using generative AI tooling? No Closes #42739 from LuciferYang/remove-antlr4-catalyst. Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
…`sql/catalyst/pom.xml` ### What changes were proposed in this pull request? SPARK-44475(#41928) has already moved the `antlr4-maven-plugin` relevant configuration to `sql/api/pom.xml`, so the configuration in the `catalyst` module is unused now, so this pr remove it from `sql/catalyst/pom.xml` ### Why are the changes needed? Clean up unused Maven `antlr4-maven-plugin` from catalyst module. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - Pass GitHub Actions - Manual verification ``` build/mvn clean install -DskipTests -pl sql/catalyst -am ``` Before We can see ``` [INFO] [INFO] --- antlr4-maven-plugin:4.9.3:antlr4 (default) spark-catalyst_2.12 --- [INFO] No ANTLR 4 grammars to compile in /Users/yangjie01/SourceCode/git/spark-mine-12/sql/catalyst/src/main/antlr4 [INFO] ``` After no relevant messages. ### Was this patch authored or co-authored using generative AI tooling? No Closes #42739 from LuciferYang/remove-antlr4-catalyst. Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com> (cherry picked from commit 5df4e16) Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
What changes were proposed in this pull request?
This PR relocates Parser and DataType family to sql/api.
Why are the changes needed?
To extract DataType and Parser as an API into a shared module.
Does this PR introduce any user-facing change?
No
How was this patch tested?
UT