
Validation loss vs Training loss in AudioSet training #31

Open
Tomlevron opened this issue Oct 7, 2021 · 7 comments
Labels
bug Something isn't working

Comments

@Tomlevron

Hi!

First of all, I would like to thank you for sharing your amazing work with everyone! It is truly inspiring and fascinating.

I have a question regarding the difference between the training loss and the validation loss. The validation loss is much higher than the training loss. Does that make sense? Isn't it overfitting?

I also tried to fine-tune the AudioSet-trained model on my data, and it showed the same difference (with and without augmentations).

Here is an example from the logs: test-full-f10-t10-pTrue-b12-lr1e-5/log_2090852.txt:

train_loss: 0.011128
valid_loss: 0.693989

I'm still new to deep learning so maybe I'm missing something.

Thank you!

@YuanGongND
Owner

Thanks for your interest.

I don't think it is an overfitting issue, as you would also see a performance drop in mAP or accuracy on the validation set if the model were overfitted. I think the reason is that, in the inference stage (but not in the training stage), we add a Sigmoid function on top of the model output before the loss computation, to make sure mAP/accuracy are calculated correctly. This changes the validation loss. See here.
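
For illustration, here is a minimal sketch (hypothetical numbers, not the repo's code) of why feeding already-Sigmoid-ed outputs into BCEWithLogitsLoss inflates the reported loss. A confident correct prediction pushed through Sigmoid twice lands near 0.5, so the per-class loss sits around ln 2 ≈ 0.693, which is roughly the valid_loss in the log above:

```python
# Hypothetical illustration: BCEWithLogitsLoss applies its own Sigmoid, so adding
# another one before it squashes confident predictions toward 0.5.
import torch

criterion = torch.nn.BCEWithLogitsLoss()
logits = torch.tensor([[8.0, -8.0]])   # confident, correct raw outputs
target = torch.tensor([[1.0, 0.0]])

train_loss = criterion(logits, target)                 # ~0.0003
valid_loss = criterion(torch.sigmoid(logits), target)  # ~0.50 (double Sigmoid)
print(train_loss.item(), valid_loss.item())
```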

-Yuan

YuanGongND added the bug label on Oct 8, 2021
@hbellafkir

Wouldn't it be wrong to train with Softmax and use Sigmoid for mAP? Using Softmax instead of Sigmoid gives a higher mAP value.

@YuanGongND
Owner

Could you elaborate on this point?

I don't think we used Softmax during training. The reason we add an extra Sigmoid in inference but not in training is that BCEWithLogitsLoss already includes a Sigmoid.
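
As a quick sanity check (a sketch, not the repo's code), BCEWithLogitsLoss(x, y) is numerically the same as BCELoss applied to sigmoid(x), which is why no explicit Sigmoid layer is needed during training:

```python
# Sketch: BCEWithLogitsLoss == Sigmoid + BCELoss (computed in a numerically stable way).
import torch

x = torch.randn(4, 527)                      # raw logits, e.g. 527 AudioSet classes
y = torch.randint(0, 2, (4, 527)).float()    # multi-hot labels

a = torch.nn.BCEWithLogitsLoss()(x, y)
b = torch.nn.BCELoss()(torch.sigmoid(x), y)
print(torch.allclose(a, b, atol=1e-6))       # True
```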

@hbellafkir

In the case of CrossEntropyLoss, the loss is calculated with Softmax (here), as it is included in the CrossEntropyLoss operation. To my understanding, it is not correct to use Sigmoid for inference when CrossEntropyLoss is used in training. On a custom dataset that I use, switching from Sigmoid to Softmax results in a higher mAP value during inference.
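
For illustration (a toy example, not from the repo): Sigmoid is applied per class, while Softmax normalizes across the classes of each sample, so the two can rank samples differently within a class, which is exactly what per-class AP measures:

```python
# Toy example: Sigmoid preserves the per-class ranking of samples, Softmax can flip it.
import torch

logits = torch.tensor([[3.0, 2.9],    # sample A: high class-0 logit, but class 1 is close
                       [2.0, -5.0]])  # sample B: lower class-0 logit, class 1 very low

print(torch.sigmoid(logits)[:, 0])          # class 0 scores: A > B
print(torch.softmax(logits, dim=1)[:, 0])   # class 0 scores: B > A after normalization
```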

@hbellafkir

@YuanGongND any thoughts on this?

@YuanGongND
Owner

Yes - I think you can skip the Sigmoid in inference. That was just used to make training/inference consistent for the multi-label classification (i.e., one audio has more than one label) tasks.

When you use CrossEntropyLoss, I assume you have a single-label dataset. Using Softmax here might improve mAP, but it won't improve accuracy, and mAP is less important for single-label classification; that's why we use accuracy in the ESC-50 and SpeechCommands recipes.

For multi-label classification, adding Sigmoid won't change mAP either, since Sigmoid is monotonic, so I think you can also remove it; but that could impact the ensemble performance.
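
To illustrate the monotonicity argument (a sketch with random data, not from the repo): per-class AP only depends on how samples are ranked within each class, and an elementwise Sigmoid does not change that ranking:

```python
# Sketch: applying Sigmoid to logits leaves per-class AP (and hence mAP) unchanged.
import numpy as np
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)
logits = rng.normal(size=(100, 5))               # fake multi-label scores
labels = rng.integers(0, 2, size=(100, 5))       # fake multi-hot labels

scores_sigmoid = 1.0 / (1.0 + np.exp(-logits))
map_logits = average_precision_score(labels, logits, average="macro")
map_sigmoid = average_precision_score(labels, scores_sigmoid, average="macro")
print(np.isclose(map_logits, map_sigmoid))       # True
```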

@YuanGongND
Owner

After some investigation, it seems to be a logging bug: the difference between the training and eval loss is overestimated in the code.

In traintest.py, loss_meter is only reset at the start of each epoch, but its average is printed every 1000 iterations, so the large loss values from early iterations keep accumulating into the printed average.

Changing loss_meter.avg to loss_meter.val here can alleviate the problem. But I would suggest doing an offline loss evaluation (i.e., checking the training loss with the best checkpoint model after training finishes); that would be the most accurate solution.
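
For reference, a minimal sketch of the usual AverageMeter pattern (assuming the repo's meter follows the common PyTorch-example version): .avg averages everything since the last reset, so large early losses keep inflating it, while .val is just the most recent batch:

```python
# Sketch of the standard AverageMeter pattern (assumed, not copied from the repo).
class AverageMeter:
    def __init__(self):
        self.reset()

    def reset(self):
        self.val, self.sum, self.count, self.avg = 0.0, 0.0, 0, 0.0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count

loss_meter = AverageMeter()
for loss in [2.0, 1.0, 0.5, 0.05, 0.01]:   # loss decreasing over the epoch
    loss_meter.update(loss)

print(loss_meter.avg)  # ~0.71, still dominated by the early, large losses
print(loss_meter.val)  # 0.01, the current training loss
```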
