Why am I getting 32 bit response for confusion_matrix when all the inputs are 64 bit? #7929

Closed
@simonm3

I have a sample of 1 million rows, and cohen_kappa_score returned -11.3 when it should lie between -1 and +1. Further investigation showed that confusion_matrix returns a 32-bit result, and the outer product in the cohen_kappa_score calculation then overflows.
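
To illustrate the overflow, here is a minimal NumPy-only sketch. The marginal counts are made up for a two-class problem with 1 million rows (they are not my real data), and the variable names are only for illustration. The product of two such counts exceeds the int32 range, so the int32 outer product wraps around silently, while the int64 version holds the true value.

```python
import numpy as np

# Hypothetical row and column totals of a 2x2 confusion matrix
# built from 1,000,000 samples (illustrative numbers only).
row_totals = np.array([600000, 400000], dtype=np.int32)
col_totals = np.array([550000, 450000], dtype=np.int32)

# int32 inputs give an int32 result; 600000 * 550000 = 3.3e11 does not
# fit in 32 bits, so the stored value wraps around without any error.
expected_32 = np.outer(row_totals, col_totals)

# Casting to int64 first keeps the products exact.
expected_64 = np.outer(row_totals.astype(np.int64),
                       col_totals.astype(np.int64))

print(expected_32.dtype, expected_32[0, 0])
print(expected_64.dtype, expected_64[0, 0])
```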

Here is an example with a small amount of data. If I wrap the confusion matrix in np.int64, the kappa calculation works.

import sys
import numpy as np
from sklearn.metrics import confusion_matrix
y1 = np.int64([1,0,0,1])
y2 = np.int64([0,0,1,1])
confusion = confusion_matrix(y1, y2)
sys.version, type(y1[0]), type(y2[0]), type(confusion[0,0])

('3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]',
numpy.int64,
numpy.int64,
numpy.int32)
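
As a sketch of the workaround, the function below casts the confusion matrix to int64 before forming the outer product and then computes the unweighted Cohen's kappa from it. The name kappa_from_confusion is hypothetical (it is not part of scikit-learn), and the counts are invented to be large enough to overflow int32. This is not how cohen_kappa_score is implemented internally, only the textbook formula under that assumption.

```python
import numpy as np

def kappa_from_confusion(confusion):
    """Unweighted Cohen's kappa from a square confusion matrix.

    Hypothetical helper: the matrix is cast to int64 first so the
    products of the marginal totals cannot overflow 32 bits.
    """
    confusion = np.asarray(confusion, dtype=np.int64)
    n = confusion.sum()
    # Observed agreement: fraction of samples on the diagonal.
    observed = np.trace(confusion) / n
    # Chance agreement: diagonal of the outer product of the marginals.
    marginals = np.outer(confusion.sum(axis=1), confusion.sum(axis=0))
    expected = np.trace(marginals) / n ** 2
    return (observed - expected) / (1 - expected)

# Made-up int32 confusion matrix with 1,000,000 samples in total.
confusion_32 = np.array([[400000, 200000],
                         [150000, 250000]], dtype=np.int32)
kappa = kappa_from_confusion(confusion_32)
print(kappa)
```

With the cast in place the result stays inside the -1 to +1 range.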
