[SPARK-30759][SQL][3.0] Fix cache initialization in StringRegexExpression #27713

Closed

Conversation

MaxGekk
Member

@MaxGekk MaxGekk commented Feb 26, 2020

What changes were proposed in this pull request?

In this PR, I propose to fix `cache` initialization in `StringRegexExpression` by changing the expected value type in `case Literal(value: String, StringType)` from `String` to `UTF8String`.

This is a backport of #27502 and #27547

Why are the changes needed?

The case never matches, because the value of a `Literal` has type `UTF8String`, not `String`, so `cache` is never initialized through this branch; see the screenshot:
[Screenshot: Screen Shot 2020-02-08 at 22 45 50]
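
To illustrate the mismatch, here is a minimal, self-contained sketch. It does not use the real Spark classes: `Utf8Str`, `Literal`, `StringType`, `compileBefore`, and `compileAfter` are hypothetical stand-ins invented for this example, and only the shape of the pattern match mirrors the one described above.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.util.regex.Pattern

object LiteralMatchSketch {
  // Hypothetical stand-in for a UTF-8 string wrapper (NOT Spark's UTF8String).
  final class Utf8Str(val bytes: Array[Byte]) {
    override def toString: String = new String(bytes, UTF_8)
  }
  object Utf8Str {
    def fromString(s: String): Utf8Str = new Utf8Str(s.getBytes(UTF_8))
  }

  // Hypothetical stand-ins for a data type and a literal expression.
  sealed trait DataType
  case object StringType extends DataType
  final case class Literal(value: Any, dataType: DataType)

  // Before the fix: the type test expects a java.lang.String, but the
  // literal carries a Utf8Str, so the first case never matches.
  def compileBefore(e: Literal): Pattern = e match {
    case Literal(value: String, StringType) => Pattern.compile(value)
    case _ => null
  }

  // After the fix: the type test matches the type the literal really holds.
  def compileAfter(e: Literal): Pattern = e match {
    case Literal(value: Utf8Str, StringType) => Pattern.compile(value.toString)
    case _ => null
  }
}
```

With a literal built as `Literal(Utf8Str.fromString("a.*"), StringType)`, `compileBefore` falls through to the default case and returns `null`, while `compileAfter` returns a compiled pattern.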

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added a new test to `RegexpExpressionsSuite`.
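
As a rough idea of what such a check guards against, the sketch below models an expression that compiles its pattern once when the pattern is constant and recompiles it per call otherwise. `RegexMatcher`, `constantPattern`, and `compilations` are hypothetical names for this illustration only; this is not the test added to `RegexpExpressionsSuite`, nor Spark's implementation.

```scala
import java.util.regex.Pattern

object FoldableCacheSketch {
  // Hypothetical stand-in for a regex expression (NOT StringRegexExpression).
  // constantPattern is Some(p) when the pattern is known up front.
  final class RegexMatcher(constantPattern: Option[String]) {
    // Counts how many times a pattern was compiled, for the check below.
    var compilations = 0

    private def compile(p: String): Pattern = {
      compilations += 1
      Pattern.compile(p)
    }

    // Compiled at most once, and only for a constant pattern; null otherwise.
    lazy val cache: Pattern = constantPattern.map(compile).orNull

    // Uses the cached pattern when available, else compiles per call.
    def matches(input: String, pattern: String): Boolean = {
      val p = if (cache != null) cache else compile(pattern)
      p.matcher(input).matches()
    }
  }
}
```

If the cache were never initialized, as in the bug described above, every call would take the per-call compilation path even for a constant pattern; a test can detect that by inspecting the cache after evaluation.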

…ingRegexExpression

### What changes were proposed in this pull request?
Added a new test to `RegexpExpressionsSuite` that checks that the `cache` of the compiled pattern is set when the `right` expression (the pattern in `LIKE`) is a foldable expression.

### Why are the changes needed?
To make sure that `cache` in `StringRegexExpression` is initialized for foldable patterns.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
By running the added test in `RegexpExpressionsSuite`.

Closes apache#27547 from MaxGekk/regexp-cache-test.

Authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@MaxGekk
Member Author

MaxGekk commented Feb 26, 2020

@cloud-fan @dongjoon-hyun Please take a look at this PR.

@cloud-fan
Contributor

I'm fine with having it in 3.0. It's definitely a bug.

@rednaxelafx
Contributor

LGTM +1 as well

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Thank you, @MaxGekk , @cloud-fan , @rednaxelafx .

@SparkQA

SparkQA commented Feb 26, 2020

Test build #118985 has finished for PR 27713 at commit 72660cf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

dongjoon-hyun pushed a commit that referenced this pull request Feb 26, 2020
…sion

### What changes were proposed in this pull request?
In this PR, I propose to fix `cache` initialization in `StringRegexExpression` by changing the expected value type in `case Literal(value: String, StringType)` from `String` to `UTF8String`.

This is a backport of #27502 and #27547

### Why are the changes needed?
The case never matches, because the value of a `Literal` has type `UTF8String`, not `String`; see
<img width="649" alt="Screen Shot 2020-02-08 at 22 45 50" src="https://user-images.githubusercontent.com/1580697/74091681-0d4a2180-4acb-11ea-8a0d-7e8c65f4214e.png">

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Added a new test to `RegexpExpressionsSuite`.

Closes #27713 from MaxGekk/str-regexp-foldable-pattern-backport.

Authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
dongjoon-hyun pushed a commit that referenced this pull request Feb 26, 2020
…sion
Authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit cfc48a8)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
@dongjoon-hyun
Member

Merged to branch-3.0/branch-2.4 as a bug fix.

@MaxGekk MaxGekk deleted the str-regexp-foldable-pattern-backport branch June 5, 2020 19:46