
[BUG]: NaN outputs for multiclass classification using MCC loss function #306

Closed
kevingreenman opened this issue Jul 1, 2022 · 4 comments · Fixed by #309
Labels
bug Something isn't working

Comments

@kevingreenman (Member) commented Jul 1, 2022

Describe the bug
For a dataset on which multiclass classification trains normally with the default `cross_entropy` loss function, training fails with the `mcc` loss function.

Example(s)
The script

python train.py --data_path debug.csv --dataset_type multiclass --save_dir debug-results --multiclass_num_classes 3

runs without error while the script

python train.py --data_path debug.csv --dataset_type multiclass --save_dir debug-results --multiclass_num_classes 3 --loss_function mcc

encounters `ValueError: Input contains NaN, infinity or a value too large for dtype('float64').` here. The NaN values first appear during the encoding step here. The weights of `self.W_i` start out normal but turn into NaNs, and I haven't yet been able to trace why that's happening.
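As a generic PyTorch debugging sketch (my own, not chemprop code; `check_weights` is a hypothetical helper), anomaly detection plus a per-step parameter scan can pinpoint where the NaNs enter:

```python
import torch

# Make backward() raise at the exact op that first produces a NaN gradient.
# This is slow, so enable it only while debugging.
torch.autograd.set_detect_anomaly(True)

def check_weights(model: torch.nn.Module) -> None:
    """Report any parameter containing NaN or Inf; call after each optimizer step."""
    for name, param in model.named_parameters():
        if not torch.isfinite(param).all():
            print(f"non-finite values in {name}")
```

Calling `check_weights` after each `optimizer.step()` should narrow down the first update in which `self.W_i` goes non-finite.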

The contents of debug.csv are as follows:

isomeric_smiles,PUBCHEM_ACTIVITY_OUTCOME_INT
C1CCCN(CC1)C(=O)C2=CSC3=CC=CC=C32,1
C1=CC(=C(C=C1[N+](=O)[O-])[Hg])O.C1=NC2=C(N1[C@H]3[C@H]([C@@H]([C@H](O3)CO)O)O)N=C(NC2=[SH+])N,1
C1CCN(CC1)C(=O)C2=CC=C(C=C2)COC3=CC=CC=C3Br,1
C1CCN(CC1)C(=O)CCC2=CC=C(C=C2)OC3=CC(=CC(=C3)[N+](=O)[O-])[N+](=O)[O-],1
C1CN(CC(=O)O[Hg]OC(=O)CN1CC(=O)O)CC(=O)O,1
C1=CC(=CC=C1N)[Hg]S[C-]2C3=C(NC(=N2)N)N(C=N3)[C@H]4[C@H]([C@@H]([C@H](O4)CO)O)O,1
CC1=CC(=CC=C1)N2C=NC3=C2C=CC(=C3)C(=O)N4CCCCC4,1
C1CCC(C1)NP2(=O)COC3=CC=CC=C3OC2,1
CC1CCCN(C1)C(=O)C2=CC=C(C=C2)COC3=CC=CC=C3Br,1
C1CCN(CC1)C(=O)C2=CC3=C(C=C2)OCO3,1
@kevingreenman added the `bug` label on Jul 1, 2022
@kevingreenman (Member Author) commented:

I just realized that I only have one class represented in my debugging file. If I change `debug.csv` to

isomeric_smiles,PUBCHEM_ACTIVITY_OUTCOME_INT
C1CCCN(CC1)C(=O)C2=CSC3=CC=CC=C32,2
C1=CC(=C(C=C1[N+](=O)[O-])[Hg])O.C1=NC2=C(N1[C@H]3[C@H]([C@@H]([C@H](O3)CO)O)O)N=C(NC2=[SH+])N,0
C1CCN(CC1)C(=O)C2=CC=C(C=C2)COC3=CC=CC=C3Br,1
C1CCN(CC1)C(=O)CCC2=CC=C(C=C2)OC3=CC(=CC(=C3)[N+](=O)[O-])[N+](=O)[O-],1
C1CN(CC(=O)O[Hg]OC(=O)CN1CC(=O)O)CC(=O)O,1
C1=CC(=CC=C1N)[Hg]S[C-]2C3=C(NC(=N2)N)N(C=N3)[C@H]4[C@H]([C@@H]([C@H](O4)CO)O)O,1
CC1=CC(=CC=C1)N2C=NC3=C2C=CC(=C3)C(=O)N4CCCCC4,1
C1CCC(C1)NP2(=O)COC3=CC=CC=C3OC2,1
CC1CCCN(C1)C(=O)C2=CC=C(C=C2)COC3=CC=CC=C3Br,2
C1CCN(CC1)C(=O)C2=CC3=C(C=C2)OCO3,0

the error no longer occurs, so the failure is related to class representation. But this issue also happened on my full set of 130K molecules, which has all 3 classes represented, so the cause must be more nuanced than that.

@kevingreenman (Member Author) commented:

I confirmed that in my full dataset, the train, val, and test splits are identical for the model trained with cross_entropy loss and the model trained with mcc loss, and each split has all 3 classes represented.

@kevingreenman (Member Author) commented:

The `mcc_multiclass_loss` function is returning Inf; I'm investigating why.

@kevingreenman (Member Author) commented:

Based on the multiclass definition of MCC from sklearn, if all of the true values OR all of the predicted values belong to the same class (i.e., one class's count equals the total number of samples), the denominator of the MCC is 0 and the MCC becomes Inf. So if any batch has all of its predicted values or all of its true values in a single class, the MCC loss will be Inf, backprop will turn the weights into NaNs, and the predictions will then be NaN as well. `mcc_multiclass_loss` should raise an informative error message in this case. This could be a common problem for people training on imbalanced datasets. Is there a way to modify our data loader to optionally do stratified sampling?
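To make the failure mode concrete, here is a minimal sketch (my own illustration of sklearn's confusion-matrix definition of multiclass MCC; `mcc_denominator` is a hypothetical helper, not chemprop code) showing the denominator vanishing when every true label in a batch belongs to one class:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def mcc_denominator(y_true, y_pred, n_classes=3):
    # With confusion matrix C, t_k = true count of class k (row sums),
    # p_k = predicted count of class k (column sums), s = total samples:
    # MCC = (c*s - sum_k t_k*p_k) / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2))
    C = confusion_matrix(y_true, y_pred, labels=range(n_classes))
    t = C.sum(axis=1)
    p = C.sum(axis=0)
    s = C.sum()
    return np.sqrt(float((s**2 - (p**2).sum()) * (s**2 - (t**2).sum())))

# All true labels in class 1: t = [0, 4, 0], so s^2 - sum(t^2) = 0.
print(mcc_denominator([1, 1, 1, 1], [0, 1, 2, 1]))  # 0.0 -> MCC divides by zero
# All three classes represented: denominator is nonzero.
print(mcc_denominator([0, 1, 2, 1], [0, 1, 2, 2]))  # 10.0
```

For the stratified-sampling question, PyTorch's `torch.utils.data.WeightedRandomSampler` is one standard way to keep minority classes represented in every batch, though wiring it into our data loader would be its own change.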
