Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-37643][SQL] when charVarcharAsString is true, for char datatype predicate query should skip rpadding rule #36187

Closed
wants to merge 2 commits into from

Conversation

fhygh
Copy link
Contributor

@fhygh fhygh commented Apr 14, 2022

What changes were proposed in this pull request?

after add ApplyCharTypePadding rule, when predicate query column data type is char, if column value length is less then defined, will be right-padding, then query will get incorrect result

Why are the changes needed?

fix query incorrect issue when predicate column data type is char, so in this case when charVarcharAsString is true, we should skip the rpadding rule.

Does this PR introduce any user-facing change?

before this fix, if we query with char data type for predicate, then we should be careful to set charVarcharAsString to true.

How was this patch tested?

add new UT.

@github-actions github-actions bot added the SQL label Apr 14, 2022
@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@fhygh
Copy link
Contributor Author

fhygh commented Apr 14, 2022

cc @wzhfy @cloud-fan

@cloud-fan
Copy link
Contributor

What happens to #34900 ?

@fhygh
Copy link
Contributor Author

fhygh commented Apr 14, 2022

@cloud-fan Sorry, the #34900 check flow fails due to a bug in the GitHub action. I mistakenly thought that the problem was caused by my pr. Therefore, I closed the 34900 and raised the pr again. Later, I learned that the problem was caused by the bug in the GitHub action

@cloud-fan
Copy link
Contributor

It's OK, can you post the testing job link this time? We can merge this PR if tests passed.

@fhygh
Copy link
Contributor Author

fhygh commented Apr 15, 2022

ok, once testing passed, i'll post here, thank you.

@fhygh
Copy link
Contributor Author

fhygh commented Apr 16, 2022

@HyukjinKwon hello, can you help post the build link here? thank you.

@fhygh
Copy link
Contributor Author

fhygh commented Apr 18, 2022

@cloud-fan cloud-fan closed this in c1ea8b4 Apr 18, 2022
cloud-fan pushed a commit that referenced this pull request Apr 18, 2022
…e predicate query should skip rpadding rule

### What changes were proposed in this pull request?
after add ApplyCharTypePadding rule, when predicate query column data type is char, if column value length is less then defined,  will be right-padding, then query will get incorrect result

### Why are the changes needed?
fix query incorrect issue when predicate column data type is char, so in this case when charVarcharAsString is true, we should skip the rpadding rule.

### Does this PR introduce _any_ user-facing change?
before this fix, if we query with char data type for predicate, then we should be careful to set charVarcharAsString to true.

### How was this patch tested?
add new UT.

Closes #36187 from fhygh/charpredicatequery.

Authored-by: fhygh <283452027@qq.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit c1ea8b4)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cloud-fan pushed a commit that referenced this pull request Apr 18, 2022
…e predicate query should skip rpadding rule

### What changes were proposed in this pull request?
after add ApplyCharTypePadding rule, when predicate query column data type is char, if column value length is less then defined,  will be right-padding, then query will get incorrect result

### Why are the changes needed?
fix query incorrect issue when predicate column data type is char, so in this case when charVarcharAsString is true, we should skip the rpadding rule.

### Does this PR introduce _any_ user-facing change?
before this fix, if we query with char data type for predicate, then we should be careful to set charVarcharAsString to true.

### How was this patch tested?
add new UT.

Closes #36187 from fhygh/charpredicatequery.

Authored-by: fhygh <283452027@qq.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit c1ea8b4)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cloud-fan pushed a commit that referenced this pull request Apr 18, 2022
…e predicate query should skip rpadding rule

after add ApplyCharTypePadding rule, when predicate query column data type is char, if column value length is less then defined,  will be right-padding, then query will get incorrect result

fix query incorrect issue when predicate column data type is char, so in this case when charVarcharAsString is true, we should skip the rpadding rule.

before this fix, if we query with char data type for predicate, then we should be careful to set charVarcharAsString to true.

add new UT.

Closes #36187 from fhygh/charpredicatequery.

Authored-by: fhygh <283452027@qq.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit c1ea8b4)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@cloud-fan
Copy link
Contributor

thanks, merging to master/3.3/3.2/3.1!

kazuyukitanimura pushed a commit to kazuyukitanimura/spark that referenced this pull request Aug 10, 2022
…e predicate query should skip rpadding rule

### What changes were proposed in this pull request?
after add ApplyCharTypePadding rule, when predicate query column data type is char, if column value length is less then defined,  will be right-padding, then query will get incorrect result

### Why are the changes needed?
fix query incorrect issue when predicate column data type is char, so in this case when charVarcharAsString is true, we should skip the rpadding rule.

### Does this PR introduce _any_ user-facing change?
before this fix, if we query with char data type for predicate, then we should be careful to set charVarcharAsString to true.

### How was this patch tested?
add new UT.

Closes apache#36187 from fhygh/charpredicatequery.

Authored-by: fhygh <283452027@qq.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit c1ea8b4)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants