Odd/inconsistent behavior in adjusted_rand_score #12940

Engineero · 2019-01-08T15:08:30Z

Description

metrics.adjusted_rand_score seems to give inconsistent results. The example given below is extreme, wherein two almost identical inputs return an ARI of 0.0.

Steps/Code to Reproduce

Example:

from sklearn import metrics as m

labels_true = [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
labels_pred = [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1]  # one digit off from labels_true

m.adjusted_rand_score(labels_true,labels_pred)
# 0.0

If you change the single 0 in labels_pred to a 1, the result is, as expected, 1.0.

Expected Results

One would expect the ARI in the case shown above to be very close to 1.0 for two almost-identical inputs.

Actual Results

The actual result for the example given is 0.0, which seems to indicate unexpected behavior in the algorithm.

Versions

Linux-2.6.32-696.23.1.el6.x86_64-x86_64-with-redhat-6.9-Santiago
Python 3.6.2 (default, Nov 4 2017, 17:40:18)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-17)]
NumPy 1.14.3
SciPy 1.1.0
Scikit-Learn 0.19.1


Engineero · 2019-01-08T15:10:43Z

Note that this came to my attention through this SO question.

vivekk0903 · 2019-01-08T16:40:29Z

@Engineero Multiple R packages also give the same result.

x <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
y <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1)

library(mclust)
adjustedRandIndex(x, y)
Output: 0

library(clues)
adjustedRand(x, y)
Output:
     Rand        HA        MA        FM   Jaccard 
0.9090909 0.0000000 0.0000000 0.9534626 0.9090909 

library(CrossClustering)
adjustedRandIndex(x, y)
Output: 0

You can test the above code here: https://rdrr.io/snippets/

Scikit-learn's implementation is based on this (denoted by HA in the middle output):

L. Hubert and P. Arabie, Comparing Partitions, Journal of Classification 1985 
http://link.springer.com/article/10.1007%2FBF01908075

So I think scikit-learn is consistent with the other implementations. Maybe someone can explain the result.

Engineero · 2019-01-08T16:58:27Z

@vivekk-ezdi maybe I need to look into the math a bit more. I will close for now and reopen if I think I've found something that shows inconsistency with what one would expect from the math.

jnothman · 2019-01-08T22:30:37Z

Adjustment needs a probability distribution for agreement by chance given the true distribution. Since the true distribution is extremely peaked, I suspect the adjustment is not very suitable. But I have not yet looked at the specific maths.
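For anyone who wants to check the arithmetic, here is a minimal sketch of the pair-counting form of the Hubert and Arabie index, applied to the example from the report. It is not scikit-learn's implementation; the helper name `pair_count_scores` is hypothetical, and the formula is the standard one: (index - expected index) / (max index - expected index). For these inputs, the number of pairs that are co-clustered in both labelings equals the number expected by chance, so the numerator is exactly zero even though the unadjusted Rand index is about 0.909.

```python
from collections import Counter


def pairs(count):
    """Number of unordered pairs that can be drawn from `count` items."""
    return count * (count - 1) // 2


def pair_count_scores(labels_true, labels_pred):
    """Return (rand_index, adjusted_rand_index) computed from pair counts.

    Hypothetical helper for illustration only, not the scikit-learn code.
    """
    total = pairs(len(labels_true))

    # Pairs co-clustered in both labelings (cells of the contingency table).
    sum_cells = sum(pairs(c) for c in Counter(zip(labels_true, labels_pred)).values())
    # Pairs co-clustered within each labeling separately (row/column sums).
    sum_true = sum(pairs(c) for c in Counter(labels_true).values())
    sum_pred = sum(pairs(c) for c in Counter(labels_pred).values())

    # Unadjusted Rand index: pairs on which the two labelings agree.
    separated_in_both = total - sum_true - sum_pred + sum_cells
    rand = (sum_cells + separated_in_both) / total

    # Chance-adjusted index.
    expected = sum_true * sum_pred / total
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        # Degenerate case: both labelings put everything in one cluster.
        return rand, 1.0
    return rand, (sum_cells - expected) / (max_index - expected)


labels_true = [1] * 22
labels_pred = [1] * 20 + [0, 1]  # one label differs from labels_true

rand, ari = pair_count_scores(labels_true, labels_pred)
print(rand, ari)
```

Here `sum_cells` is 210, `sum_true` is 231 (all 22 points in one cluster), and `sum_pred` is 210, so the expected index is 231 * 210 / 231 = 210 and the adjusted score is (210 - 210) / (220.5 - 210) = 0. The unadjusted value 210 / 231 matches the `Rand` column in the clues output above.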

Engineero closed this as completed Jan 8, 2019

smsaladi mentioned this issue May 9, 2019

Odd (incorrect) behavior with normalized_mutual_info_score #13836

Closed

scouvreur mentioned this issue May 24, 2019

[MRG] Bug fix for sklearn.metrics.cluster.normalized_mutual_info_score with sparse input #13939

Closed
