This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[scala] Make accuracy independent of output size (fix #8226) #8297

Merged
merged 1 commit into apache:master on Nov 22, 2017

Conversation

@benqua (Contributor) commented Oct 16, 2017

Description

This PR changes EvalMetric.sumMetric from Float to Double.

Fix #8226
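For context on the bug: a Float accumulator stops registering increments once its magnitude is large enough, because Float carries only about 7 significant decimal digits. A minimal, self-contained Scala sketch (illustrative only, not code from this PR) showing the saturation:

object FloatSaturation {
  def main(args: Array[String]): Unit = {
    // Float has ~7 significant decimal digits; at 1e8 the spacing between
    // adjacent representable Floats is 8, so adding 1.0f is lost to rounding.
    var sumFloat: Float = 1e8f
    sumFloat += 1.0f
    println(sumFloat == 1e8f) // true: the update vanished

    // Double has ~15-16 significant digits, so the same update survives.
    var sumDouble: Double = 1e8
    sumDouble += 1.0
    println(sumDouble == 1e8) // false: the update was recorded
  }
}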

@yzhliu (Member) commented Oct 18, 2017

Thanks @benqua, I think a better way is to change sumMetric to Double. The fix here changes the definition of Acc.

@benqua (Contributor, Author) commented Oct 18, 2017

@Javelinjs, you're right, it changes the definition of accuracy for output.size > 1.
What is the exact definition of Accuracy? I couldn't find a clear definition.

This change provides a definition of accuracy that matches the one from Wikipedia for binary classification, which says:
the accuracy is the proportion of true results (both true positives and true negatives) among the total number of cases examined
(https://en.wikipedia.org/wiki/Accuracy_and_precision#In_binary_classification).

It seems weird (at least to me :) ) that the accuracy depends on the output dimension and can grow to very large numbers. By dividing by the label dimension, we keep the accuracy between 0 and 1, which is the expected range of a "proportion".
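To make the "proportion" reading concrete, here is a minimal Scala sketch of that definition (a hypothetical helper over plain arrays of class indices, not the actual EvalMetric code):

object AccuracyAsProportion {
  // labels and preds are assumed to be flat arrays of class indices,
  // one entry per example, regardless of the network's output size.
  def accuracy(labels: Array[Int], preds: Array[Int]): Double = {
    require(labels.length == preds.length, "labels and predictions must align")
    val correct = labels.zip(preds).count { case (l, p) => l == p }
    correct.toDouble / labels.length // always in [0, 1], as a proportion should be
  }

  def main(args: Array[String]): Unit =
    println(accuracy(Array(0, 1, 2, 2), Array(0, 1, 1, 2))) // 0.75
}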

If we change sumMetric to Double, should we do it only for the value stored internally and keep Float in the EvalMetric API?

@yzhliu (Member) commented Oct 19, 2017

We should keep it the same as the other language bindings, especially Python. What if we make it Double in the EvalMetric API?

@benqua (Contributor, Author) commented Oct 19, 2017

It can be done, but it would break the API for anyone calling EvalMetric.get (https://github.com/apache/incubator-mxnet/blob/master/scala-package/core/src/main/scala/ml/dmlc/mxnet/EvalMetric.scala#L52).

@yzhliu (Member) commented Oct 19, 2017

We can convert it back to Float when calling EvalMetric.get
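A minimal sketch of that compromise (a simplified stand-in for EvalMetric with a hypothetical update signature, not the real class): accumulate in Double internally and convert to Float only at the API boundary, so existing callers of get keep compiling:

// Simplified stand-in: the accumulator is a Double, but get still
// exposes Float so the public API is unchanged.
class SimpleAccuracy(val name: String = "accuracy") {
  private var sumMetric: Double = 0.0
  private var numInst: Long = 0L

  // Hypothetical update over plain arrays of class indices.
  def update(labels: Array[Int], preds: Array[Int]): Unit = {
    sumMetric += labels.zip(preds).count { case (l, p) => l == p }
    numInst += labels.length
  }

  // The Double-to-Float conversion happens here, once, at the boundary.
  def get: (Array[String], Array[Float]) =
    (Array(name), Array((sumMetric / numInst).toFloat))
}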

@benqua (Contributor, Author) commented Oct 19, 2017

@Javelinjs, OK, I updated the PR as you suggested.
(Maybe I should have opened a new one, since the comment thread is now difficult to follow.)
What do you think?

@benqua (Contributor, Author) commented Oct 20, 2017

Oh, it seems there is a build issue. I am not sure it is related to my code (something about the Windows GPU build), but I will check this weekend.

@yzhliu (Member) commented Oct 20, 2017

LGTM.
The CI seems to have some problems these days. Let's try again this weekend.

@yzhliu (Member) commented Oct 27, 2017

@benqua could you rebase the PR and re-trigger the CI build?

@benqua (Contributor, Author) commented Oct 27, 2017

Rebased. Waiting for the CI to complete.

@yzhliu (Member) commented Oct 29, 2017

@piiswrong Could you help to do a force merge?

@benqua (Contributor, Author) commented Nov 2, 2017

I tried again, still no luck.
The CI output doesn't make it clear whether this is still a CI issue or a problem with the PR.

@yzhliu (Member) commented Nov 3, 2017

Sorry to see that. Please rebase again, I think the CI is OK now.

@benqua (Contributor, Author) commented Nov 3, 2017

Done. Let's see...

@benqua (Contributor, Author) commented Nov 4, 2017

It fails with:

FAIL: test_operator_gpu.test_svmoutput_with_type
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Anaconda3\envs\py2\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "c:\jenkins_slave\workspace\ut-python-gpu\tests\python\gpu\test_operator_gpu.py", line 999, in test_svmoutput_with_type
    check_consistency(sym, ctx_list)
  File "c:\jenkins_slave\workspace\ut-python-gpu\pkg_vc14_gpu\python\mxnet\test_utils.py", line 1338, in check_consistency
    raise e
AssertionError:
Items are not equal:
Error 5.000000 exceeds tolerance rtol=0.100000, atol=0.100000.  Location of maximum error:(18, 3), a=0.000000, b=1.000000
 a: array([[-1.,  1.,  1., ...,  1.,  1.,  1.],
       [-1.,  1.,  0., ...,  0.,  1.,  1.],
       [-1.,  1.,  1., ...,  1.,  1.,  1.],...
 b: array([[-1.,  1.,  1., ...,  1.,  1.,  1.],
       [-1.,  1.,  0., ...,  0.,  1.,  1.],
       [-1.,  1.,  1., ...,  1.,  1.,  1.],...
-------------------- >> begin captured stdout << ---------------------
Train Err: ctx 2 vs ctx 0 at svmoutput_data
--------------------- >> end captured stdout << ----------------------
----------------------------------------------------------------------
Ran 243 tests in 1392.034s

FAILED (SKIP=5, failures=1)

(py2) c:\jenkins_slave\workspace\ut-python-gpu>IF 1 NEQ 0 exit /b 1
script returned exit code 1

I will investigate when I find time. Ideas and help are welcome :)

@benqua force-pushed the issue-8226 branch 4 times, most recently from 98c2b13 to 28dc2ee on November 16, 2017 at 10:55
@benqua force-pushed the issue-8226 branch 2 times, most recently from b19ad48 to 0873ecb on November 20, 2017 at 20:15
@marcoabreu (Contributor) commented Nov 21, 2017

I've encountered the same error with test_operator_gpu.test_svmoutput_with_type on my setup, based on the release branch v0.12.0. Please check our internal wiki, @KellenSunderland @mbaijal @larroy

@marcoabreu (Contributor) commented Nov 21, 2017

https://builds.apache.org/blue/organizations/jenkins/incubator-mxnet/detail/PR-8297/13/pipeline

line 452

[ERROR] /workspace/scala-package/core/src/main/scala/ml/dmlc/mxnet/EvalMetric.scala:115: error: not found: value la
[INFO] this.sumMetric += la.zip(predLabel.toArray)

That’s clearly not a CI issue
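The failing line reads like a leftover from a rename: presumably la was meant to be the label array that gets zipped against the predictions. A hedged guess at the intended logic, simplified to plain arrays (the names and signature here are assumptions, not the actual EvalMetric code):

// Presumed intent of the broken line: count the positions where label and
// prediction agree and add that count to the running sumMetric.
def updatedSum(sumMetric: Double, label: Array[Float], predLabel: Array[Float]): Double =
  sumMetric + label.zip(predLabel).count { case (l, p) => l == p }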

When the difference in magnitude between the total accuracy and 1 becomes too big, accuracy is not updated anymore due to the low precision of float numbers.
@benqua (Contributor, Author) commented Nov 21, 2017

@Javelinjs, the PR finally passes all tests.
Do you think it can be merged?

@yzhliu merged commit 8df20a2 into apache:master Nov 22, 2017
@yzhliu (Member) commented Nov 22, 2017

Thanks.

eric-haibin-lin pushed a commit to eric-haibin-lin/mxnet that referenced this pull request Dec 3, 2017
KellenSunderland pushed a commit to KellenSunderland/incubator-mxnet that referenced this pull request Dec 3, 2017
zhreshold pushed a commit to zhreshold/mxnet that referenced this pull request Dec 14, 2017
rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018