[SPARK-38776][MLLIB][TESTS] Disable ANSI_ENABLED explicitly in ALSSuite #36051

Closed
wants to merge 3 commits into from
@dongjoon-hyun (Member) commented Apr 3, 2022

What changes were proposed in this pull request?

This PR aims to disable ANSI_ENABLED explicitly in the following tests of ALSSuite.

test("ALS validate input dataset") {
test("input type validation") {
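
For readers unfamiliar with the pattern: Spark's SQL test helpers offer `withSQLConf(SQLConf.ANSI_ENABLED.key -> "false") { ... }` to override a setting only for the duration of a test body. Whether this PR's diff uses exactly that helper is an assumption here, since the diff is not shown in this conversation. The sketch below mimics the idea without Spark; `ScopedConfSketch`, `settings`, and `withConf` are hypothetical names for illustration, not Spark APIs.

```scala
import scala.collection.mutable

object ScopedConfSketch {
  // Hypothetical stand-in for a session's SQL configuration (not a Spark class).
  val settings: mutable.Map[String, String] =
    mutable.Map("spark.sql.ansi.enabled" -> "true")

  // Apply the overrides, run the body, then restore the previous values,
  // even if the body throws.
  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    val previous = pairs.map { case (k, _) => k -> settings.get(k) }
    pairs.foreach { case (k, v) => settings(k) = v }
    try body
    finally previous.foreach {
      case (k, Some(old)) => settings(k) = old
      case (k, None)      => settings.remove(k)
    }
  }

  def main(args: Array[String]): Unit = {
    // Inside the block the override is visible; afterwards the old value returns.
    val inside = withConf("spark.sql.ansi.enabled" -> "false") {
      settings("spark.sql.ansi.enabled")
    }
    println(s"inside block: $inside, after block: ${settings("spark.sql.ansi.enabled")}")
  }
}
```

Scoping the override to the test body keeps the rest of the suite running under whatever mode the CI job selected.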

Why are the changes needed?

After SPARK-38490, this test became flaky in the ANSI mode GitHub Actions job.

[info] ALSSuite:
...
[info] - ALS validate input dataset *** FAILED *** (2 seconds, 449 milliseconds)
[info]   Invalid Long: out of range "Job aborted due to stage failure: Task 0 in stage 100.0 failed 1 times, most recent failure: Lost task 0.0 in stage 100.0 (TID 348) (localhost executor driver): 
org.apache.spark.SparkArithmeticException: 
Casting 1231000000000 to int causes overflow. 
To return NULL instead, use 'try_cast'. 
If necessary set spark.sql.ansi.enabled to false to bypass this error.
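
The log above reports that casting 1231000000000 to int overflows when ANSI mode is on. As a rough illustration outside Spark, plain JVM code shows the same two behaviors: a narrowing conversion that silently keeps the low 32 bits, and a checked conversion that throws. This is only an analogy for the non-ANSI and ANSI cast semantics, not Spark's implementation; `IntCastSketch`, `lenientCast`, and `strictCast` are hypothetical names.

```scala
import scala.util.Try

object IntCastSketch {
  // The value from the failure log; it does not fit in a 32-bit Int.
  val value: Long = 1231000000000L

  // Lenient conversion: the JVM narrowing cast silently wraps around.
  def lenientCast(v: Long): Int = v.toInt

  // Strict conversion: throws ArithmeticException if the value does not fit.
  def strictCast(v: Long): Int = java.lang.Math.toIntExact(v)

  def main(args: Array[String]): Unit = {
    println(s"lenient: ${lenientCast(value)}")
    println(s"strict:  ${Try(strictCast(value))}")
  }
}
```

A test that feeds such out-of-range values on purpose passes under the lenient behavior but aborts under the strict one, which matches the failure shown in the log.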

Does this PR introduce any user-facing change?

No. This is a test-only bug and fix.

How was this patch tested?

Pass the CIs.

github-actions bot added the ML label Apr 3, 2022

@dongjoon-hyun (Member, Author) commented:

cc @gengliangwang

@dongjoon-hyun changed the title from "[SPARK-38776][MLLIB][TESTS] Disable ANSI_ENABLED explicitly in 'ALS validate input dataset' test case" to "[SPARK-38776][MLLIB][TESTS] Disable ANSI_ENABLED explicitly in ALSSuite" Apr 3, 2022

@gengliangwang (Member) commented:

@dongjoon-hyun thanks for fixing it!

@srowen (Member) left a comment:

Seems fine

@dongjoon-hyun (Member, Author) commented:

Thank you, @gengliangwang, @srowen, @yaooqinn. Merged to master/3.3.

dongjoon-hyun added a commit that referenced this pull request Apr 3, 2022
…ite`

This PR aims to disable `ANSI_ENABLED` explicitly in the following tests of `ALSSuite`.
```
test("ALS validate input dataset") {
test("input type validation") {
```

After SPARK-38490, this test became flaky in ANSI mode GitHub Action.

![Screen Shot 2022-04-03 at 12 07 29 AM](https://user-images.githubusercontent.com/9700541/161416006-7b76596f-c19a-4212-91d2-8602df569608.png)

- https://github.com/apache/spark/runs/5800714463?check_suite_focus=true
- https://github.com/apache/spark/runs/5803714260?check_suite_focus=true
- https://github.com/apache/spark/runs/5803745768?check_suite_focus=true

```
[info] ALSSuite:
...
[info] - ALS validate input dataset *** FAILED *** (2 seconds, 449 milliseconds)
[info]   Invalid Long: out of range "Job aborted due to stage failure: Task 0 in stage 100.0 failed 1 times, most recent failure: Lost task 0.0 in stage 100.0 (TID 348) (localhost executor driver):
org.apache.spark.SparkArithmeticException:
Casting 1231000000000 to int causes overflow.
To return NULL instead, use 'try_cast'.
If necessary set spark.sql.ansi.enabled to false to bypass this error.
```

No. This is a test-only bug and fix.

Pass the CIs.

Closes #36051 from dongjoon-hyun/SPARK-38776.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit d18fd7b)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

@dongjoon-hyun (Member, Author) commented:

Oops. I realized that more OutOfRange failures were hidden in the same test case behind the previous Overflow failure. I'll make a follow-up soon.

dongjoon-hyun added a commit that referenced this pull request Apr 3, 2022
…Out of Range` failures

### What changes were proposed in this pull request?

This is a follow-up of #36051.
After fixing `Overflow` errors, `Out Of Range` failures are observed in the rest of test code in the same test case.

### Why are the changes needed?

To make GitHub Action ANSI test CI pass.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

At this time, I used the following to simulate GitHub Action ANSI job.
```
$ SPARK_ANSI_SQL_MODE=true build/sbt "mllib/testOnly *.ALSSuite"
...
[info] All tests passed.
[success] Total time: 80 s (01:20), completed Apr 3, 2022 1:05:50 PM
```

Closes #36054 from dongjoon-hyun/SPARK-38776-2.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun added a commit that referenced this pull request Apr 3, 2022
…Out of Range` failures

This is a follow-up of #36051.
After fixing `Overflow` errors, `Out Of Range` failures are observed in the rest of test code in the same test case.

To make GitHub Action ANSI test CI pass.

No.

At this time, I used the following to simulate GitHub Action ANSI job.
```
$ SPARK_ANSI_SQL_MODE=true build/sbt "mllib/testOnly *.ALSSuite"
...
[info] All tests passed.
[success] Total time: 80 s (01:20), completed Apr 3, 2022 1:05:50 PM
```

Closes #36054 from dongjoon-hyun/SPARK-38776-2.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit fbcab01)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>