
Classification: add labels in confusion_matrix. #10

Closed
kashevg opened this issue May 28, 2019 · 2 comments

kashevg commented May 28, 2019

In the classification chapter, there is a part with a confusion matrix. It is really confusing, because there TP stands for the False class and TN stands for the True class.
Also, in the first book (I don't have the second), the confusion matrix is shown transposed. TP should always be in the upper-left corner.


ageron commented May 28, 2019

Hi @kashevg,

Thanks for your feedback. I suppose you are referring to figure 3-2?

[Figure 3-2 (mlst_0302): the confusion matrix illustration from the book]

I'm not sure what you mean by "TP stays for False and TN stays for True class"? In the example, we are building a "5-detector", so "positive" means it's a 5, and "negative" means it's not a 5. So a TP is an image that is correctly labeled as positive. The figure does show 5s correctly labeled as 5s in the lower right cell, and that's the cell labeled as TP, so frankly I don't see the problem.
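To make the four cells concrete, here is a minimal sketch of the "5-detector" counts, using hypothetical toy labels (not the actual MNIST data):

```python
# Hypothetical "5-detector" labels: positive means "the digit is a 5".
y_true = [True, True, False, False, False]   # is the digit really a 5?
y_pred = [True, False, False, False, True]   # the classifier's guess

# TP: actual 5s predicted as 5s; TN: non-5s predicted as non-5s, etc.
tp = sum(t and p for t, p in zip(y_true, y_pred))
tn = sum(not t and not p for t, p in zip(y_true, y_pred))
fp = sum(not t and p for t, p in zip(y_true, y_pred))
fn = sum(t and not p for t, p in zip(y_true, y_pred))
# (tp, tn, fp, fn) == (1, 2, 1, 1)
```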

Regarding the confusion matrix's orientation, I'm not sure there is any official standard for the order of the columns and rows. I just did a quick informal search on Google Image for "confusion matrix" and among the first 11 confusion matrices for binary classifiers, 6 have the TP in the top left, and 5 in the lower right. That's almost 50%.

Moreover, Scikit-Learn itself puts the TP in the lower right, as the following code demonstrates:

>>> from sklearn.metrics import confusion_matrix
>>> #               1 TN  2 FP   3 FN      4 TP
>>> confusion_matrix([0,  0, 0,  1, 1, 1,  1, 1, 1, 1], # y_true
...                  [0,  1, 1,  0, 0, 0,  1, 1, 1, 1]) # y_pred
...
array([[1, 2],
       [3, 4]])

I find this order quite natural, as it just follows the order of the classes (0=negative, 1=positive). This allows it to be consistent even when there are more classes.
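For instance, with three classes (a hypothetical toy example, not from the book), the rows and columns simply follow the sorted class order 0, 1, 2:

```python
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

# Row i counts the true instances of class i;
# column j counts how many of them were predicted as class j.
cm = confusion_matrix(y_true, y_pred)
# array([[1, 1, 0],
#        [0, 2, 0],
#        [1, 0, 1]])
```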

I hope this helps!


kashevg commented May 28, 2019

Hi @ageron,
Ok, I get it.
To me it still feels unnatural to have TP in the lower-right corner. I know it's a common approach for confusion matrices to sort the classes in ascending order, but for binary classification, IMHO, it's better to put "positive" first, in the same order as on Wikipedia.
Thank you.
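For readers who prefer the Wikipedia orientation: Scikit-Learn's `confusion_matrix` accepts a `labels` argument that sets the row/column order, so listing the positive class first puts TP in the upper-left corner. A minimal sketch, reusing the toy labels from the earlier example:

```python
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
y_pred = [0, 1, 1, 0, 0, 0, 1, 1, 1, 1]

# labels=[1, 0] orders rows/columns as positive first, negative second,
# yielding the [[TP, FN], [FP, TN]] layout used on Wikipedia.
cm = confusion_matrix(y_true, y_pred, labels=[1, 0])
# array([[4, 3],
#        [2, 1]])
```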

@kashevg kashevg closed this as completed May 28, 2019