I have a sample of 1M rows, and cohen_kappa_score returned -11.3 even though Cohen's kappa is bounded between -1 and +1. Further investigation showed that confusion_matrix returns a 32-bit result (on Windows), and the outer product in the cohen_kappa_score calculation overflows.
Here is an example with a small amount of data that shows the dtype problem. If I wrap the confusion matrix in np.int64, the calculation works.
import sys
import numpy as np
from sklearn.metrics import confusion_matrix

y1 = np.int64([1, 0, 0, 1])
y2 = np.int64([0, 0, 1, 1])
confusion = confusion_matrix(y1, y2)

# The inputs are int64, but the confusion matrix entries come back as int32.
sys.version, type(y1[0]), type(y2[0]), type(confusion[0, 0])
('3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]',
numpy.int64,
numpy.int64,
numpy.int32)
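To illustrate the scale of the overflow, here is a minimal sketch (the 2x2 counts below are made up for the demonstration, not taken from my data). With int32 marginal sums around 750,000, the outer product used for the expected-agreement term exceeds the int32 maximum of roughly 2.1e9 and silently wraps to negative values; promoting the confusion matrix to int64 first gives the correct result.

import numpy as np

# Illustrative 2x2 confusion matrix with counts on the order of a 1M-row sample.
confusion = np.array([[500000, 250000],
                      [250000, 500000]], dtype=np.int32)

# Force the marginal sums to stay int32, mimicking the Windows behaviour
# where the platform default integer is 32-bit.
sum0 = confusion.sum(axis=0, dtype=np.int32)  # column marginals: [750000, 750000]
sum1 = confusion.sum(axis=1, dtype=np.int32)  # row marginals:    [750000, 750000]

# 750000 * 750000 = 5.625e11 >> 2**31 - 1, so int32 wraps silently.
print(np.outer(sum0, sum1))   # negative garbage values

# Promoting to int64 before the outer product avoids the overflow.
confusion64 = np.int64(confusion)
print(np.outer(confusion64.sum(axis=0), confusion64.sum(axis=1)))  # correct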
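And here is a sketch of the workaround I described: compute kappa from an int64-promoted confusion matrix. The helper name kappa_from_confusion is my own, and it uses the standard unweighted kappa formula rather than scikit-learn's exact code path.

import numpy as np
from sklearn.metrics import confusion_matrix

def kappa_from_confusion(confusion):
    # Promote to int64 first so the outer product of the marginal
    # sums cannot overflow, then apply the standard unweighted
    # formula kappa = (p_o - p_e) / (1 - p_e).
    confusion = np.int64(confusion)
    n = confusion.sum()
    p_o = np.trace(confusion) / n                             # observed agreement
    marginals = np.outer(confusion.sum(axis=1), confusion.sum(axis=0))
    p_e = np.trace(marginals) / (n * n)                       # expected agreement
    return (p_o - p_e) / (1 - p_e)

y1 = np.int64([1, 0, 0, 1])
y2 = np.int64([0, 0, 1, 1])
print(kappa_from_confusion(confusion_matrix(y1, y2)))  # 0.0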