
Conversation

KickItLikeShika
Contributor

Fixes #1673

Added Cohen's Kappa to ignite.contrib.metrics: implemented CohenKappa in cohen_kappa.py and added the tests in test_cohen_kappa.py.

Check list:

  • New tests are added (if a new feature is added)
  • New doc strings: description and/or example code are in RST format
  • Documentation is updated (if required)
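For context, unweighted Cohen's kappa measures agreement between two raters corrected for chance: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. Below is a minimal pure-Python sketch of that formula for illustration only; it is not the PR's implementation (which, judging from similar contrib metrics, presumably delegates to scikit-learn's cohen_kappa_score):

```python
from collections import Counter

def cohen_kappa(y1, y2):
    """Unweighted Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(y1)
    # observed agreement: fraction of samples where both raters agree
    p_o = sum(a == b for a, b in zip(y1, y2)) / n
    # expected agreement: dot product of the raters' label frequencies
    c1, c2 = Counter(y1), Counter(y2)
    p_e = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (p_o - p_e) / (1 - p_e)

print(cohen_kappa([0, 0, 1, 1], [0, 0, 1, 0]))  # -> 0.5
```

Perfect agreement yields kappa = 1.0, and agreement no better than chance yields kappa = 0.0.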

@KickItLikeShika
Contributor Author

@vfdev-5 Putting try/except in the __init__ didn't work, so I have simplified the implementation considerably.

@vfdev-5
Collaborator

vfdev-5 commented Feb 23, 2021

Hey @KickItLikeShika , what's the issue with try/except in the __init__ ?

As for other updates, please take a look at a similar PR and the comments about the documentation here: #1682

As discussed in your previous closed PR, please use a single PR to bring all the updates according to the review. Otherwise, we won't be able to consider them.
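For reference, the try/except-in-__init__ pattern under discussion usually looks something like the sketch below. This is a hypothetical stand-in, not the PR's actual code; the dependency name defaults to a stdlib module so the snippet runs without scikit-learn installed:

```python
import importlib

class KappaSketch:
    """Hypothetical metric skeleton that fails fast in __init__
    when an optional dependency is missing."""

    def __init__(self, dep="statistics"):
        try:
            # import the optional backend at construction time
            self._mod = importlib.import_module(dep)
        except ImportError:
            raise ModuleNotFoundError(
                f"This metric requires '{dep}' to be installed."
            ) from None
```

The point of the pattern is that constructing the metric without the dependency raises immediately, instead of failing later inside compute().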

@KickItLikeShika
Contributor Author

@vfdev-5. I have fixed it. Is there anything else to edit?

Collaborator

@vfdev-5 left a comment


@KickItLikeShika few more comments to address, please.

@KickItLikeShika
Contributor Author

@vfdev-5 I have updated everything as you suggested and added the test. But please look at these CI errors: the tests pass locally, and the values are so close! That's weird!

@vfdev-5
Collaborator

vfdev-5 commented Feb 23, 2021

@KickItLikeShika the update sounds great! Yes, it can happen that values differ by a tiny eps. When you say that it works locally, how do you run the tests?

@KickItLikeShika
Contributor Author

I run them with pytest test_cohen_kappa.py.

@vfdev-5
Collaborator

vfdev-5 commented Feb 23, 2021

If you'd like to replicate distributed tests on CPU, you have to replicate the command from tests/run_cpu_tests.sh:

export WORLD_SIZE=2
CUDA_VISIBLE_DEVICES="" pytest --dist=each --tx $WORLD_SIZE*popen//python=python tests/ignite/contrib/metrics/test_cohen_kappa.py -m distributed -vvv

and make sure pytest-xdist is installed. This works on Linux only, I think.

@KickItLikeShika
Contributor Author

Okay thank you! Is there anything else to work on in this pull request?

@vfdev-5
Collaborator

vfdev-5 commented Feb 23, 2021

And to fix precision issue, you can use pytest.approx, please search for its usage on the repo.
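As an illustration (assuming pytest is available), pytest.approx compares floating-point values with a tolerance (relative 1e-6 by default) rather than exact equality:

```python
import pytest

computed = 0.1 + 0.2               # 0.30000000000000004 due to float rounding
assert computed != 0.3             # exact comparison fails
assert computed == pytest.approx(0.3)            # tolerant comparison passes
assert 0.3333 == pytest.approx(1 / 3, rel=1e-3)  # custom relative tolerance
```

This is exactly the kind of tiny-eps difference that can show up between local and CI runs.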

Collaborator

@vfdev-5 left a comment


@KickItLikeShika thanks for the thorough testing!
I have a remark about duplicated test functions and the usage of parametrize.

ck._check_shape((torch.randint(0, 2, size=(10, 1)).long(), torch.randint(0, 2, size=(10, 5, 2)).long()))


def test_cohen_kappa_non_weighted():
Collaborator


Let's try to use pytest.mark.parametrize here. For example, take a look here: https://github.com/pytorch/ignite/pull/1683/files#diff-be88f17b02401b784cf4081d7e84bc7470d9beac1ac42294b8a0b1fc731900b3R20

Such that we can refactor all those 3 almost identical functions. A similar thing can probably be done for test_integration_cohen_kappa_non_weighted_with_output_transform and its two counterparts...
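The suggested refactoring could look roughly like this hypothetical sketch, with the actual metric comparison elided; the weights values match scikit-learn's cohen_kappa_score options:

```python
import pytest

# One parametrized test replacing three near-identical functions,
# one case per weighting scheme.
@pytest.mark.parametrize("weights", [None, "linear", "quadratic"])
def test_cohen_kappa(weights):
    # in the real test, a CohenKappa(weights=weights) metric would be
    # updated here and its result compared against sklearn's score
    assert weights in (None, "linear", "quadratic")
```

pytest collects this as three separate test cases, one per weights value.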

Contributor Author


I understand how it works. I can use pytest.mark.parametrize to replace all of these functions with just 2 or 3. Should I do that?

Collaborator


I understand how it works. I can use pytest.mark.parametrize to replace all of these functions with just 2 or 3. Should I do that?

The idea is to make test code better with minimal code duplication:
3 test functions -> 1 test function with parametrize

  • test_cohen_kappa_* -> test_cohen_kappa
  • test_integration_cohen_kappa_*_with_output_transform -> test_integration_cohen_kappa_with_output_transform

Sorry, I do not quite get your question...

Contributor Author


@vfdev-5 I have updated all the tests and added new ones.

Contributor Author


@vfdev-5 It breaks the CI. The issue is the code formatting: black and pycodestyle don't agree on whitespace before the colons in slices. Any idea how to fix this?

Contributor Author

KickItLikeShika commented Feb 24, 2021


After running black test_cohen_kappa.py, the slices in lines 98, 107, and 108 get reformatted with whitespace added before the colons, e.g. np_y[size // 2 :] = 1. But then pycodestyle test_cohen_kappa.py reports that I should remove the whitespace before the colons. And that's how the style checks in the CI fail.
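This is a known conflict: black deliberately puts a space before the colon when a slice bound is a complex expression (treating the colon like a binary operator), while pycodestyle flags that space as E203. The usual resolution is to tell the style checker to ignore E203, for example via a config fragment like the hypothetical one below (the project's actual setup may differ):

```ini
# setup.cfg (hypothetical fragment)
[flake8]
# E203 (whitespace before ':') conflicts with black's slice formatting
extend-ignore = E203
```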

Contributor Author


@vfdev-5, should I change the logic of update_fn in test_cohen_kappa_all_weights_with_output_transform to avoid these CI errors?

Collaborator

vfdev-5 commented Feb 24, 2021


@KickItLikeShika I do not quite understand your problem with the formatting here. I updated your branch from master to restart the CI and see where it is failing.

Collaborator


Looks like it is OK.

Contributor Author


Well! I will commit the latest updates now and delete the tests you told me about.

@vfdev-5
Collaborator

vfdev-5 commented Feb 24, 2021

@KickItLikeShika as a suggestion, try to enable GitHub Actions on your fork so that our "Format python code / code-style" workflow can commit fixed files to your fork and avoid manual fixing.

Collaborator

@vfdev-5 left a comment


Thanks for the PR @KickItLikeShika ! Looks good now 👍

@KickItLikeShika
Contributor Author

@vfdev-5, thank you! I will work now on #1695

@vfdev-5
Collaborator

vfdev-5 commented Feb 24, 2021

Actually, I was thinking that the XLA test failure was unrelated, but it is related:

tests/ignite/contrib/metrics/test_cohen_kappa.py::test_distrib_single_device_xla PASSED [  3%]
tests/ignite/contrib/metrics/test_cohen_kappa.py::test_distrib_xla_nprocs SKIPPED [  5%]
tests/ignite/contrib/metrics/regression/test_canberra_metric.py::test_distrib_single_device_xla FAILED [  6%]

as it does not fail on other PRs, nor on master.

Let me look at what happens in detail later...

@KickItLikeShika
Contributor Author

Do you have any idea what might be the problem here?

@KickItLikeShika
Contributor Author

@vfdev-5, do you think I need to edit anything before I port the rest of the metrics to this same implementation?

@vfdev-5
Collaborator

vfdev-5 commented Feb 24, 2021

@vfdev-5, do you think I need to edit anything before I port the rest of the metrics to this same implementation?

Well, the test with XLA should be fixed. As far as I can see from 61e5041#diff-54ddf6f7687e40789e7b5180b452037c40cb021140ae45ed5297d5698a6a8dbeR164, we disabled accumulation on XLA devices (unfortunately without any explanation). We have to investigate this in more detail.
Temporarily, we can comment out the XLA tests and I'll merge the PR.

@KickItLikeShika
Contributor Author

@vfdev-5 then I will start improving the other metrics the same way and add the tests.

@vfdev-5 vfdev-5 merged commit 1ea59e6 into pytorch:master Feb 24, 2021
@vfdev-5
Collaborator

vfdev-5 commented Feb 24, 2021

@KickItLikeShika we finally got it done 🎉 Congrats :)

@KickItLikeShika
Contributor Author

@vfdev-5, Thank you!!

@KickItLikeShika KickItLikeShika deleted the add-cohen-kappa branch February 24, 2021 23:37
Successfully merging this pull request may close these issues.

add cohen kappa in contrib.metrics module