
Different prediction results depending on inference batch size #16

Closed

techscientist opened this issue Aug 10, 2017 · 5 comments

@techscientist

techscientist commented Aug 10, 2017

Hi @keunwoochoi !

I'm enjoying your Kapre library, as I'm currently using it in my music-based deep learning project...

In my deep network, I'm using a similar architecture as the ones you have in your music auto tagging repo, but I have replaced manual input audio preprocessing and the batch normalization layers in the network with their Kapre equivalent layers (like Melspectrogram and Normalization2D).

However, I am getting different prediction results when I change the batch size (the number of audio samples I predict at once).

I believe this is because your Normalization2D layer (which I think is meant to mimic Keras's BatchNormalization) recalculates the mean/std even in testing mode (i.e., when K.learning_phase() == 0). That would explain the differences I see when changing the batch size during batch prediction...

Is it possible to fix this?

Update: You can see Keras's implementation of BatchNormalization here: https://github.com/fchollet/keras/blob/master/keras/layers/normalization.py

From what I observe, Keras handles this by checking whether the model is in testing mode, and if so, it performs the normalization using the moving mean and variance accumulated during training (they are stored as weights and updated on every training batch).

How can we add this functionality to Normalization2D, so that it also produces correct, consistent prediction results when predicting single samples and/or batches?

Or, could we rewrite Normalization2D so that it simply extends Keras's BatchNormalization and only passes the axis to it (determined by str_axis)? I think this would resolve the issue and also reduce the amount of code needed for this Kapre layer... A rough sketch of what I mean is below.
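
Something like this is what I have in mind (the str_axis names and the data-format handling here are my assumptions, not Kapre's exact API):

```python
# Rough sketch only: let BatchNormalization do all the work and just translate
# the string axis into the integer axis it expects.
from keras import backend as K
from keras.layers import BatchNormalization


class Normalization2DBN(BatchNormalization):
    """Hypothetical Normalization2D replacement that wraps BatchNormalization."""

    def __init__(self, str_axis='freq', **kwargs):
        if K.image_data_format() == 'channels_last':
            # assumed input shape: (batch, freq, time, channel)
            axis_map = {'batch': 0, 'freq': 1, 'time': 2, 'channel': 3}
        else:
            # assumed input shape: (batch, channel, freq, time)
            axis_map = {'batch': 0, 'channel': 1, 'freq': 2, 'time': 3}
        super(Normalization2DBN, self).__init__(axis=axis_map[str_axis], **kwargs)
```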

Thanks!

@keunwoochoi
Owner

You're correct, and perhaps I should have made it clearer. For axes other than the batch axis it is fine -- no differences across batch sizes. It seems possible to fix Normalization2D with the batch axis to work like the BN layer; all we have to do is mimic the already existing code in Keras. A PR would be appreciated -- personally, I'm not sure I'll have spare time in the near future.
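
A minimal sketch of what that could look like (attribute names such as moving_mean, moving_std, and eps are placeholders, not the actual Kapre or Keras code):

```python
from keras import backend as K

# Inside a Normalization2D-like layer: use batch statistics during training and
# the stored moving averages at inference, so predictions don't depend on batch size.
def call(self, x, training=None):
    def normed_training():
        # statistics of the current batch (axis handling simplified here)
        mean = K.mean(x, axis=self.axis, keepdims=True)
        std = K.std(x, axis=self.axis, keepdims=True)
        # persist running statistics, like keras.layers.BatchNormalization does
        self.add_update([K.moving_average_update(self.moving_mean, mean, self.momentum),
                         K.moving_average_update(self.moving_std, std, self.momentum)])
        return (x - mean) / (std + self.eps)

    def normed_inference():
        # reuse the statistics accumulated during training
        return (x - self.moving_mean) / (self.moving_std + self.eps)

    return K.in_train_phase(normed_training, normed_inference, training=training)
```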

@techscientist
Author

Ok @keunwoochoi, sounds great. Once I get around to finishing my current experiments, I'll clone your repo and fix the layer. Once it's working, I'll send you a PR for review.

@techscientist
Author

techscientist commented Aug 12, 2017

Also @keunwoochoi, just to make sure before I send you a PR: the Normalization2D layer is simply meant to provide the same functionality as Keras's BatchNormalization, right (except that it adds str_axis to specify the normalization axis)?

I am asking because, if that is the case, I can simply rewrite the Normalization2D layer as a subclass of Keras's BatchNormalization, adding a bit of code to connect Normalization2D's str_axis to BatchNormalization's axis argument...

@keunwoochoi
Owner

keunwoochoi commented Aug 13, 2017 via email

@techscientist
Author

techscientist commented Aug 17, 2017

Oh ok, I get it.

Right now, I've worked around this problem by using Keras's BatchNormalization layer instead of Kapre's Normalization2D (passing the axis that corresponds to the one Normalization2D would use, per its code)...

I have included this workaround here in case someone else happens to stumble upon the same problem.
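
For reference, it looks roughly like this (the Melspectrogram arguments and the axis choice are just an example, not my exact model):

```python
from keras import backend as K
from keras.models import Sequential
from keras.layers import BatchNormalization
from kapre.time_frequency import Melspectrogram

model = Sequential()
# 1-channel audio, 1 second at 16 kHz -> a 2D mel-spectrogram "image"
model.add(Melspectrogram(sr=16000, n_mels=96, input_shape=(1, 16000)))

# Instead of kapre's Normalization2D, use BatchNormalization on the matching axis;
# at inference it uses the moving statistics learned during training, so the
# predictions no longer change with the prediction batch size.
freq_axis = 2 if K.image_data_format() == 'channels_first' else 1
model.add(BatchNormalization(axis=freq_axis))
```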
