Training from scratch doesn't reach the same loss #22

nicolov · 2019-04-29T17:01:00Z

Hey, thanks a lot for the release. I've tried training the model from scratch using the datasets, but I can't reach the same validation loss. I noticed that the pre-trained network in the repo has two more convolutional layers compared to the code in the notebook, but adding them back doesn't help either.

Did you use any additional tricks for training?

For reference, above is what I see, below is what you have in the dataset:

gianlucahmd · 2019-04-30T13:24:53Z

Same here. I noticed that the network in the notebook is different from the model you've saved. Even with the same network, performance is far worse than what you achieved with the same code. Maybe you had some clever training trick?

MiteshPuthran · 2019-05-02T01:25:00Z

Hello @nicolov and @gianlucahmd. I wish I had some magic tricks to increase the accuracy. Only one thing comes to mind right now: try augmenting the data you have available and retraining the model.
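For readers who want to try the augmentation suggestion, here is a minimal sketch and not the code used for the pretrained model. It assumes each clip is already loaded as a 1-D NumPy array; the helper names (`add_noise`, `time_shift`, `augment_dataset`) and the default parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def add_noise(signal, noise_level=0.005, rng=None):
    """Return a copy of the signal with Gaussian noise added (noise_level is an assumed default)."""
    rng = np.random.default_rng() if rng is None else rng
    return signal + noise_level * rng.standard_normal(signal.shape)


def time_shift(signal, max_shift, rng=None):
    """Shift the signal by a random number of samples; np.roll wraps samples around the ends."""
    rng = np.random.default_rng() if rng is None else rng
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(signal, shift)


def augment_dataset(signals, labels, rng=None):
    """Keep each original clip and add one noisy and one shifted variant with the same label."""
    rng = np.random.default_rng(0) if rng is None else rng
    out_signals, out_labels = [], []
    for signal, label in zip(signals, labels):
        variants = (
            signal,
            add_noise(signal, rng=rng),
            time_shift(signal, max_shift=len(signal) // 10, rng=rng),
        )
        for variant in variants:
            out_signals.append(variant)
            out_labels.append(label)
    return out_signals, out_labels
```

Apply this only to the training split; augmenting the validation set would make the validation loss incomparable with the one reported in the notebook.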

gianlucahmd · 2019-05-02T08:10:11Z

Did you perform data augmentation yourself? Maybe that's the problem: I just downloaded the data from the links but ended up with ~900 samples in the training set, whereas you had ~1300.

Thanks for getting back to me!

nicolov · 2019-05-02T17:18:02Z

> I wish I had some magic tricks to increase the accuracy.

Sure, I was just wondering how you got to the accuracy of the pretrained model that's in the repo, as I can't seem to reproduce it using your training code.

srhoit59 · 2019-05-02T18:43:20Z

Can someone please tell me how you organized the data in the folder? Did you create any subfolders?

MiteshPuthran · 2019-05-02T18:57:24Z

@nicolov I used the same code as what you see in the notebook. Try using different sampling rates while extracting the features. Maybe more features would help to increase the accuracy.

@srhoit59 I put all the audio files in one folder.

gianlucahmd · 2019-05-02T21:09:23Z

Hey @MITESHPUTHRANNEU, I have the same problem as nicolov but it can't be a different number of features, otherwise your model wouldn't work at inference time. It must be something different during training and/or different data.
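To illustrate why a different sampling rate changes the input size, here is a rough sketch of the arithmetic. It assumes frame-based features (such as MFCCs) computed with centered framing, where the frame count is `1 + n_samples // hop_length`. The clip duration, hop length, and sampling rates below are assumptions, not values taken from the notebook.

```python
def frame_count(duration_s, sample_rate, hop_length=512):
    """Number of feature frames for a clip, assuming centered framing (1 + n // hop)."""
    n_samples = int(round(duration_s * sample_rate))
    return 1 + n_samples // hop_length


# Same clip length, two assumed sampling rates: the frame counts differ,
# so a model trained on one input length cannot consume the other.
frames_low = frame_count(2.5, 22050)
frames_high = frame_count(2.5, 44100)
print(frames_low, frames_high, frames_low == frames_high)
```

A saved model expects a fixed input shape, so features extracted at another sampling rate would fail at inference time rather than silently lower the accuracy.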

My first bet is different data: downloading from the links you provided gives me ~900 samples in the training set, whereas your notebook shows ~1300. Can you double-check that all the data you used is available from these links?
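A quick way to compare dataset sizes before training is to count the downloaded files. This is a small sketch assuming all clips sit in one flat folder and use a `.wav` extension; the function name and the folder path in the usage note are hypothetical.

```python
from pathlib import Path


def count_audio_files(folder, extension=".wav"):
    """Count files with the given extension directly inside folder (no recursion)."""
    return sum(
        1
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() == extension
    )
```

Calling `count_audio_files("path/to/audio")` on the downloaded data and comparing the result with the sample count shown in the notebook would confirm whether the two datasets actually differ in size.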

MiteshPuthran · 2019-05-02T21:32:43Z

Hi @gianlucahmd, yes, if you change the sampling rate then you can't use my model. I used the data from the described sources. They may have changed the audio files since I did this project in 2017.
Unfortunately, I no longer have the data I used during training.

MiteshPuthran closed this as completed May 4, 2019

dehdari mentioned this issue Mar 18, 2020

Recommendations for Replicability #44

Open
