[SPARK-35678][ML][FOLLOWUP] softmax support offset and step #32991

zhengruifeng · 2021-06-21T05:56:11Z

What changes were proposed in this pull request?

Make softmax support an offset and a step, so that it can be reused in ANN and NB.
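As a rough illustration of what "offset and step" means here, the sketch below applies a numerically stable softmax to a strided slice of a flat array. It is written in Python for brevity and is not the Scala code from this PR; the name `softmax_strided` and its parameters (`count`, `offset`, `step`, `output`) are hypothetical.

```python
import math


def softmax_strided(values, count, offset=0, step=1, output=None):
    """Softmax over values[offset], values[offset + step], ... (count items).

    Hypothetical helper for illustration only; it does not mirror the
    signature of the Scala method changed in this PR.
    """
    if output is None:
        output = values  # update in place when no output buffer is given
    indices = range(offset, offset + count * step, step)

    # Subtract the maximum before exponentiating to avoid overflow.
    max_value = max(values[i] for i in indices)

    total = 0.0
    for i in indices:
        output[i] = math.exp(values[i] - max_value)
        total += output[i]

    # Normalize only the selected positions; other slots are left untouched.
    for i in indices:
        output[i] /= total
    return output


# Two interleaved rows of three scores each, stored in one flat array.
flat = [1.0, 10.0, 2.0, 20.0, 3.0, 30.0]
softmax_strided(flat, count=3, offset=0, step=2)
print(flat)
```

With an offset and a step, a caller that keeps several score vectors in one flat buffer can normalize one of them without first copying it out, which is the kind of reuse the description has in mind for ANN and NB.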

Why are the changes needed?

To simplify the implementation.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing test suites.

SparkQA · 2021-06-21T06:59:14Z

Test build #140058 has finished for PR 32991 at commit fb44753.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2021-06-21T07:09:02Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44586/

SparkQA · 2021-06-21T07:42:59Z

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44586/

SparkQA · 2021-06-21T08:23:10Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44593/

SparkQA · 2021-06-21T08:32:43Z

Test build #140064 has finished for PR 32991 at commit 73d11ff.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

zhengruifeng · 2021-06-21T08:42:19Z

mllib-local/src/main/scala/org/apache/spark/ml/impl/Utils.scala

@srowen @huaxingao This change makes MultilayerPerceptronClassifierTest in test_algorithms.py fail.
To pass the test, I need to use a while loop instead of dscal here.

srowen · Jun 21, 2021

Oh OK, is it just that the answer is slightly different with dscal? Would it be reasonable to loosen the tolerance? But this is fine.

zhengruifeng · Jun 21, 2021

The existing result on master is:

Row(features=DenseVector([0.1, 0.1, 0.25, 0.25]), rawPrediction=DenseVector([-11.6082, -8.1583, 22.1776]), probability=DenseVector([0.0, 0.0, 1.0]), prediction=2.0)

If we use dscal here, the result becomes:

Row(features=DenseVector([0.1, 0.1, 0.25, 0.25]), rawPrediction=DenseVector([-11.824, -8.298, 22.5299]), probability=DenseVector([0.0, 0.0, 1.0]), prediction=2.0)

Maybe it is OK to change the pytest.
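To make the tolerance question concrete, here is a small sketch that compares the two rawPrediction vectors quoted above using the element-wise rule documented for `numpy.allclose`, `|a - b| <= atol + rtol * |b|`. The helper is a pure-Python stand-in written for this note, and treating the master values as the reference is an assumption; the thread does not say which values the pytest expects.

```python
def allclose(actual, expected, rtol=1e-05, atol=1e-08):
    """Pure-Python version of the element-wise check used by numpy.allclose."""
    return all(
        abs(a - e) <= atol + rtol * abs(e)
        for a, e in zip(actual, expected)
    )


# rawPrediction values copied from the two rows quoted above.
master = [-11.6082, -8.1583, 22.1776]
with_dscal = [-11.824, -8.298, 22.5299]

# Assumption: the master values play the role of the expected result.
for rtol in (0.01, 0.1):
    print(rtol, allclose(with_dscal, master, rtol=rtol))
```

The two vectors differ by a little under 2 percent per element, so they fail a 1 percent relative tolerance and pass a 10 percent one.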

SparkQA · 2021-06-21T08:59:29Z

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44593/

zhengruifeng · 2021-06-21T09:05:13Z

link #32927

SparkQA · 2021-06-22T03:28:47Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44638/

SparkQA · 2021-06-22T03:32:24Z

Test build #140110 has finished for PR 32991 at commit a254a83.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2021-06-22T03:41:04Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44638/

huaxingao · 2021-06-24T02:05:57Z

Thanks, merged to master!

zhengruifeng · 2021-06-24T02:07:14Z

Thanks all!

HyukjinKwon · 2021-06-24T03:28:39Z

Quick question:

======================================================================
ERROR [7.752s]: test_raw_and_probability_prediction (pyspark.ml.tests.test_algorithms.MultilayerPerceptronClassifierTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/ml/tests/test_algorithms.py", line 89, in test_raw_and_probability_prediction
    self.assertTrue(np.allclose(result.rawPrediction, expected_rawPrediction, rtol=0.1))
AssertionError: False is not true

----------------------------------------------------------------------

Can this be related to this flaky test? I have seen it only once, at https://github.com/apache/spark/runs/2900969845, but it might be worth noting here.

zhengruifeng · 2021-06-24T03:34:03Z

@HyukjinKwon It seems related to this PR.

@srowen @huaxingao Should we remove this pytest or revert the usage of softmax in ANN?

srowen · 2021-06-24T03:38:01Z

I would revert the ANN change; something is too dependent on the math and the ordering of operations. If we can just revert that and see it work in a follow-up PR, that's a way forward. This may not even fail every time in master, as I think it's simply flaky at the moment.
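As a general illustration of how the ordering of floating-point operations can change a result (not a diagnosis of this particular failure), the sketch below compares dividing by a sum with multiplying by its reciprocal. It assumes, without confirmation from this thread, that the loop variant divides each element while a dscal-style call scales by a precomputed `1.0 / sum`.

```python
x, s = 49.0, 49.0

divided = x / s          # a single rounding step
scaled = x * (1.0 / s)   # the reciprocal is rounded, then the product is rounded

print(divided, scaled, divided == scaled)
```

The two results differ only in the last bit, but an iterative optimizer can amplify such differences over many steps, which is one plausible way for small arithmetic changes to move a trained model's raw predictions.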

zhengruifeng · 2021-06-24T03:45:19Z

OK, I will send a PR to revert the change in ANN.

srowen · 2021-06-28T19:33:32Z

Pardon if I missed it, but was this partly reverted?

HyukjinKwon · 2021-06-28T23:59:51Z

Oh, yeah, it was already partially reverted in #33049.

github-actions bot added ML MLLIB labels Jun 21, 2021

zhengruifeng force-pushed the softmax_support_offset_step branch from fb44753 to 73d11ff Compare June 21, 2021 07:24

zhengruifeng added 3 commits June 22, 2021 09:42

27577a7 init
0484369 support offset & step & output
a254a83 update mlp pytest

zhengruifeng force-pushed the softmax_support_offset_step branch from 73d11ff to a254a83 Compare June 22, 2021 01:42

github-actions bot added CORE PYTHON labels Jun 22, 2021

srowen approved these changes Jun 23, 2021

huaxingao approved these changes Jun 24, 2021

huaxingao closed this in a667388 Jun 24, 2021

zhengruifeng deleted the softmax_support_offset_step branch June 24, 2021 02:07
