Invariance tests for clustering metrics #8102

Closed
jnothman opened this Issue Dec 22, 2016 · 8 comments

4 participants
@jnothman
Member

jnothman commented Dec 22, 2016

We should have common tests for clustering metrics, including a check that permuting the labels (e.g. swapping 0 and 1) yields the same score. General properties, such as scores decreasing when the clustering is not perfect, can also be tested.
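A minimal sketch of the label-permutation check described above. To stay dependency-free it uses a hand-written pair-counting Rand index as a stand-in metric; `rand_index` and `permute_labels` are hypothetical helpers for illustration, not scikit-learn functions. In the real test suite the same assertion would presumably be applied to each clustering metric in turn.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Stand-in metric: fraction of sample pairs on which both labelings agree."""
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def permute_labels(labels, mapping):
    """Rename cluster ids according to mapping (hypothetical helper)."""
    return [mapping[label] for label in labels]


y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 2, 2]

base_score = rand_index(y_true, y_pred)

# Swap cluster ids 0 and 1 in the prediction; the grouping itself is unchanged.
swapped_pred = permute_labels(y_pred, {0: 1, 1: 0, 2: 2})
swapped_score = rand_index(y_true, swapped_pred)

# A clustering metric should not care what the clusters are called.
assert base_score == swapped_score
```

The check compares scores before and after renaming rather than against a hard-coded value, so it can be reused unchanged for any metric that takes two label sequences.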

@jnothman

Member

jnothman commented Dec 22, 2016

See for instance #8101 (comment)

@anki08

anki08 commented Dec 26, 2016

This looks like a feasible first contribution, and I'd like to look into it. @jnothman, could you please help me with how to start and which files are involved? Thanks.

@jnothman

Member

jnothman commented Dec 26, 2016

Thanks. Perhaps take a look at the common tests for classification and regression metrics at sklearn/metrics/tests/test_common.py.

@anki08

anki08 commented Dec 26, 2016

@jnothman Okay, thanks.

@gan3sh500

gan3sh500 commented Dec 29, 2016

I think I understand what needs to be done. Are there any checks needed other than testing label-permutation invariance of the metrics in cluster and checking that scores decrease with less perfect clustering?
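A rough sketch of the second property, that scores drop as a clustering gets worse. It again uses a hand-written Rand index as a stand-in metric, and `corrupt` is a hypothetical helper that moves samples into the wrong cluster. The strict decrease is only asserted for this hand-picked example; it is not claimed for every metric or every kind of corruption.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Stand-in metric: fraction of sample pairs on which both labelings agree."""
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def corrupt(labels, n_moved):
    """Reassign the first n_moved samples to cluster 1 (hypothetical helper)."""
    return [1] * n_moved + list(labels[n_moved:])


y_true = [0, 0, 0, 0, 1, 1, 1, 1]

# Score the perfect clustering, then versions with 1 and 2 misplaced samples.
scores = [rand_index(y_true, corrupt(y_true, k)) for k in range(3)]

assert scores[0] == 1.0                   # perfect clustering gets the top score
assert scores[0] > scores[1] > scores[2]  # each extra error lowers the score
```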

@anki08

anki08 commented Dec 29, 2016

@jnothman I have written the code. Could you please check whether I am going in the right direction, and whether any more tests need to be added? How do I run the test file on my laptop? When I run it in Spyder it gives no output.

https://github.com/anki08/scikit-learn/blob/d4875f4a862a2fabf07d1e71b6f37f1bc6a88779/test_file.py

@jnothman

Member

jnothman commented Dec 29, 2016

Run nosetests sklearn/test_file.py. However, I'd like it moved to sklearn/metrics/cluster/tests/test_common.py as soon as possible, pushed to a branch, and a pull request opened. That way our continuous integration framework can test it and it can be reviewed.
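One possible reason a test file prints nothing when executed directly from an IDE is that it only defines test functions and never calls them; a runner such as nosetests collects functions whose names start with test_ and calls them itself. The sketch below illustrates that layout under this assumption. The linked file was not inspected here, and the function names and the stand-in `rand_index` metric are hypothetical.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Stand-in metric: fraction of sample pairs on which both labelings agree."""
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def test_perfect_clustering_scores_one():
    # The "test_" prefix is what lets a test runner discover this function.
    labels = [0, 0, 1, 1]
    assert rand_index(labels, labels) == 1.0


if __name__ == "__main__":
    # Only relevant when the file is executed directly, e.g. from an IDE.
    # Without an explicit call like this, running the file just defines the
    # function and produces no output.
    test_perfect_clustering_scores_one()
    print("all checks passed")
```

With this layout the test runner reports the result when the file is collected, and the final block gives visible feedback when the file is run on its own.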

@jnothman

Member

jnothman commented Dec 29, 2016

And thank you, @anki08. At a quick glance, this looks like it's heading in the right direction.
