Skip to content

Conversation

@amaliujia
Copy link
Contributor

@amaliujia amaliujia commented Jul 10, 2023

What changes were proposed in this pull request?

This PR relocates Parser and DataType family to sql/api.

Why are the changes needed?

To extract DataType and Parser as an API into a shared module.

Does this PR introduce any user-facing change?

No

How was this patch tested?

UT

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you move the entire utils class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could. I happed to move the least part that needed in this refactoring PR and it seems to be enough for moving a small portion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AttributeNameParser?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@amaliujia amaliujia force-pushed the datatype_1 branch 2 times, most recently from cd6a9f5 to 668a6c8 Compare July 14, 2023 23:08
@amaliujia amaliujia changed the title [WIP] Move parser and data type to sql/api [SPARK-44475][SQL][CONNECT] Relocate parser and data type to sql/api Jul 18, 2023
@amaliujia amaliujia changed the title [SPARK-44475][SQL][CONNECT] Relocate parser and data type to sql/api [SPARK-44475][SQL][CONNECT] Relocate DataType and Parser to sql/api Jul 18, 2023
this._precision = decimal.scale
this._scale = decimal.scale
} else if (decimal.scale < 0 && !SQLConf.get.allowNegativeScaleOfDecimalEnabled) {
} else if (decimal.scale < 0 && !SqlApiConf.get.allowNegativeScaleOfDecimalEnabled) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do we bind SqlApiConf.get to the active SQLConf at the server side?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan
Copy link
Contributor

The k8s failure is unrelated. Thanks, merging to master/3.5 (for scala client)!

@cloud-fan cloud-fan closed this in 8ff6b7a Jul 20, 2023
cloud-fan pushed a commit that referenced this pull request Jul 20, 2023
### What changes were proposed in this pull request?

This PR relocates Parser and DataType family to sql/api.

### Why are the changes needed?

To extract DataType and Parser as an API into a shared module.

### Does this PR introduce _any_ user-facing change?

No
### How was this patch tested?

UT

Closes #41928 from amaliujia/datatype_1.

Authored-by: Rui Wang <rui.wang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 8ff6b7a)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
dongjoon-hyun pushed a commit that referenced this pull request Aug 4, 2023
### What changes were proposed in this pull request?
#41928 moved antlr4 related files and directories from the `sql/catalyst` module to the `sql/api` module. This pr fix the corresponding git ignore rules to avoid unexpected diffs when executing `git status`.

### Why are the changes needed?
Avoid unexpected diffs when executing `git status`.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Manual check `git status`

**Before**

```
sql/api/gen/
sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/gen/
```

**After**
no diff.

Closes #42342 from LuciferYang/minor-gitignore.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun pushed a commit that referenced this pull request Aug 4, 2023
### What changes were proposed in this pull request?
#41928 moved antlr4 related files and directories from the `sql/catalyst` module to the `sql/api` module. This pr fix the corresponding git ignore rules to avoid unexpected diffs when executing `git status`.

### Why are the changes needed?
Avoid unexpected diffs when executing `git status`.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Manual check `git status`

**Before**

```
sql/api/gen/
sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/gen/
```

**After**
no diff.

Closes #42342 from LuciferYang/minor-gitignore.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 2d3bb4d)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Copy link

@JackBuggins JackBuggins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amaliujia @dongjoon-hyun, Apologies if I am missing something or have this wrong, but I think there can be breaking user facing changes if an API has some function expecting a StructType param, then they rely on being able to call toAttributes based on what it is passed based on that param.

I appreciate the objective of the change, and perhaps there is a nice way to handle this now when we are provided with a StructType obj and we want to use this from it, however StructType is annotated as Stable so I didn't expect changes to come here.

Could we either provide an example of what best to do in that situation either here or in the docs when users hit this scenario to help mitigate any challenges that will pose to users, or otherwise consider keeping an implementation, even marked as deprecated in StructType that makes use of the implementation from DataTypeUtils?

@JackBuggins
Copy link

Is it as simple as DataTypeUtils.toAttributes(schema), where schema is some StructType?

@amaliujia
Copy link
Contributor Author

Yes it is as simple as DataTypeUtils.toAttributes(schema) and same for fromAttributes.

@JackBuggins
Copy link

Thank you @amaliujia. As for StructType's interface stability and the removal, do you have an opinion on StructType maintaining an implementation for the above type of case?

@hvanhovell
Copy link
Contributor

@JackBuggins while StructType is definitely stable & public API, the toAttributes methods was not. It was clearly marked as protected[sql] and it returned a sequence of an internal catalyst type (Attribute). We can always change those methods.

There is no real way to bring back this particular method. We can however provide an implicit conversion that will make life a bit easier, e.g.:

implicit class RichStructType(val schema: StructType) extends AnyVal {
  def toAttributes: Seq[Attribute] = DataTypeUtils.toAttributes(schema)
}

dongjoon-hyun pushed a commit that referenced this pull request Aug 30, 2023
…`sql/catalyst/pom.xml`

### What changes were proposed in this pull request?
SPARK-44475(#41928) has already moved the `antlr4-maven-plugin` relevant configuration to `sql/api/pom.xml`, so the configuration in the `catalyst` module is unused now, so this pr remove it from `sql/catalyst/pom.xml`

### Why are the changes needed?
Clean up unused Maven `antlr4-maven-plugin` from catalyst module.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- Manual verification

```
build/mvn clean install -DskipTests -pl sql/catalyst -am
```

Before

We can see

```
[INFO]
[INFO] --- antlr4-maven-plugin:4.9.3:antlr4 (default)  spark-catalyst_2.12 ---
[INFO] No ANTLR 4 grammars to compile in /Users/yangjie01/SourceCode/git/spark-mine-12/sql/catalyst/src/main/antlr4
[INFO]
```

After

no relevant messages.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #42739 from LuciferYang/remove-antlr4-catalyst.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
dongjoon-hyun pushed a commit that referenced this pull request Aug 30, 2023
…`sql/catalyst/pom.xml`

### What changes were proposed in this pull request?
SPARK-44475(#41928) has already moved the `antlr4-maven-plugin` relevant configuration to `sql/api/pom.xml`, so the configuration in the `catalyst` module is unused now, so this pr remove it from `sql/catalyst/pom.xml`

### Why are the changes needed?
Clean up unused Maven `antlr4-maven-plugin` from catalyst module.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- Manual verification

```
build/mvn clean install -DskipTests -pl sql/catalyst -am
```

Before

We can see

```
[INFO]
[INFO] --- antlr4-maven-plugin:4.9.3:antlr4 (default)  spark-catalyst_2.12 ---
[INFO] No ANTLR 4 grammars to compile in /Users/yangjie01/SourceCode/git/spark-mine-12/sql/catalyst/src/main/antlr4
[INFO]
```

After

no relevant messages.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #42739 from LuciferYang/remove-antlr4-catalyst.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 5df4e16)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants