@zhengruifeng
Contributor

What changes were proposed in this pull request?

Make softmax support an offset and a step, so that it can be used in ANN and NB.
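As a rough illustration of what an offset and a step could mean here, the sketch below applies a numerically stable softmax in place to every `step`-th element of a flat array, starting at `offset`. The function name `softmax_strided` and its signature are hypothetical and are not taken from the Spark code; the PR itself changes Scala code that is not shown in this thread.

```python
import math


def softmax_strided(values, n, offset=0, step=1):
    """In-place softmax over values[offset], values[offset + step], ... (n elements).

    Hypothetical helper for illustration only; assumes n > 0 and step > 0.
    """
    indices = range(offset, offset + n * step, step)
    # Subtract the maximum before exponentiating to avoid overflow.
    max_value = max(values[i] for i in indices)
    total = 0.0
    for i in indices:
        values[i] = math.exp(values[i] - max_value)
        total += values[i]
    # Normalize so the selected elements sum to 1.
    for i in indices:
        values[i] /= total
    return values


# A 2 x 3 matrix stored column by column: row 0 lives at indices 0, 2, 4.
flat = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0]
softmax_strided(flat, n=3, offset=0, step=2)
print(flat)  # row 0 is now a probability vector; row 1 is untouched
```

With a signature like this, one routine can serve both a plain vector (`offset=0`, `step=1`) and one row or column of a matrix stored in a flat array, which is presumably the kind of reuse the description has in mind.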

Why are the changes needed?

To simplify the implementation.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing test suites.

@SparkQA

SparkQA commented Jun 21, 2021

Test build #140058 has finished for PR 32991 at commit fb44753.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 21, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44586/

@zhengruifeng zhengruifeng force-pushed the softmax_support_offset_step branch from fb44753 to 73d11ff Compare June 21, 2021 07:24
@SparkQA

SparkQA commented Jun 21, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44586/

@SparkQA

SparkQA commented Jun 21, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44593/

@SparkQA

SparkQA commented Jun 21, 2021

Test build #140064 has finished for PR 32991 at commit 73d11ff.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

@srowen @huaxingao This change makes MultilayerPerceptronClassifierTest in test_algorithms.py fail.
To pass the test, I need to use a while loop instead of dscal here.

Member

Oh OK, is it just that the answer is slightly different with dscal? Would it be reasonable to loosen the tolerance? But this is fine.

Contributor Author

The existing result on master is:

Row(features=DenseVector([0.1, 0.1, 0.25, 0.25]), rawPrediction=DenseVector([-11.6082, -8.1583, 22.1776]), probability=DenseVector([0.0, 0.0, 1.0]), prediction=2.0)

If we use dscal here, the result becomes:

Row(features=DenseVector([0.1, 0.1, 0.25, 0.25]), rawPrediction=DenseVector([-11.824, -8.298, 22.5299]), probability=DenseVector([0.0, 0.0, 1.0]), prediction=2.0)

It may be OK to change the pytest.
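To put a number on the gap between the two rows quoted above, the sketch below computes the element-wise relative difference of the two rawPrediction vectors and checks them against each other with the same formula that `np.allclose` documents, `|a - b| <= atol + rtol * |b|`. The helper `allclose` is a pure-Python stand-in written for this illustration, not NumPy itself. The comparison only covers the two quoted rows; it says nothing about the expected values hard-coded in the test, which are not shown in this thread.

```python
def allclose(actual, expected, rtol=1e-5, atol=1e-8):
    """Pure-Python stand-in for np.allclose on equal-length lists of finite floats."""
    return all(abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected))


# rawPrediction values copied from the two rows quoted in the comment above.
master_raw = [-11.6082, -8.1583, 22.1776]
dscal_raw = [-11.824, -8.298, 22.5299]

# Element-wise relative difference, measured against the master values.
relative_diffs = [abs(d - m) / abs(m) for d, m in zip(dscal_raw, master_raw)]
print(relative_diffs)  # each entry is a little under 0.02

print(allclose(dscal_raw, master_raw, rtol=0.1))   # True
print(allclose(dscal_raw, master_raw, rtol=0.01))  # False
```

So the two quoted rows differ by a bit less than 2 percent per element: well inside a relative tolerance of 0.1, but outside a relative tolerance of 0.01.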

@SparkQA

SparkQA commented Jun 21, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44593/

@zhengruifeng
Contributor Author

link #32927

@SparkQA

SparkQA commented Jun 22, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44638/

@SparkQA

SparkQA commented Jun 22, 2021

Test build #140110 has finished for PR 32991 at commit a254a83.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 22, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44638/

@huaxingao huaxingao closed this in a667388 Jun 24, 2021
@huaxingao
Contributor

Thanks, merged to master!

@zhengruifeng
Contributor Author

Thanks all!

@zhengruifeng zhengruifeng deleted the softmax_support_offset_step branch June 24, 2021 02:07
@HyukjinKwon
Member

Quick question:

======================================================================
ERROR [7.752s]: test_raw_and_probability_prediction (pyspark.ml.tests.test_algorithms.MultilayerPerceptronClassifierTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/ml/tests/test_algorithms.py", line 89, in test_raw_and_probability_prediction
    self.assertTrue(np.allclose(result.rawPrediction, expected_rawPrediction, rtol=0.1))
AssertionError: False is not true

----------------------------------------------------------------------

Could this PR be related to this flaky test? I have only seen the failure once, at https://github.com/apache/spark/runs/2900969845, but it might be worth noting here.

@zhengruifeng
Contributor Author

@HyukjinKwon It seems to be related to this PR.

@srowen @huaxingao Should we remove this pytest or revert the usage of softmax in ANN?

@srowen
Member

srowen commented Jun 24, 2021

I would revert the ANN change; something there is too dependent on the exact math and the ordering of operations. If we can revert just that and confirm it works in a follow-up PR, that's a way forward. This may not even fail every time on master, as I think the test is simply flaky at the moment.
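One way such a dependence on the ordering of operations can arise is sketched below. BLAS `dscal` multiplies every element of a vector by one scalar, so normalizing with it means multiplying by a precomputed reciprocal, whereas a hand-written loop may divide each element directly. That the loop in question divides is an assumption made for this illustration; the actual Scala code is not shown in this thread. In IEEE 754 double precision the two forms are not always bit-identical:

```python
total = 49.0
value = 49.0

# Loop style (assumed): divide the element by the sum directly.
divided = value / total

# dscal style: multiply the element by a precomputed reciprocal of the sum.
scaled = value * (1.0 / total)

print(divided)            # 1.0
print(scaled)             # 0.9999999999999999
print(divided == scaled)  # False
```

A single difference like this is only one unit in the last place. An iterative optimizer may amplify it over many steps, which would be consistent with a test that only holds up for one particular sequence of operations.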

@zhengruifeng
Contributor Author

OK, I will send a PR to revert the change in ANN.

@srowen
Member

srowen commented Jun 28, 2021

Pardon if I missed it, but was this partly reverted?

@HyukjinKwon
Member

HyukjinKwon commented Jun 28, 2021

Oh, yeah, it was already partially reverted in #33049.
