BN inference #1

Open
ChenglongChen opened this issue Mar 18, 2015 · 1 comment

Comments

@ChenglongChen

Nice work!
I am also interested in the BN inference part.
As I understand the paper, Algorithm 2 should work as follows (BVLC/caffe#1965 (comment)):

  1. Before the TEST phase, we forward a few mini-batches to compute the mean & var for the 1st BN layer, then save this mean & var for use in later inference (& forward) passes.
  2. We then forward those mini-batches again to compute the mean & var for the 2nd BN layer. Note that the normalization in the 1st BN layer is carried out using the mean & var computed in step 1, not the mini-batch statistics.
  3. Similarly, we repeat the above for the remaining BN layers.
  4. After computing all the means & vars, we have the inference BN network.

So I think that in the following lines you should switch the for-loop order (for k; then for iter;) and use the (k-1)-th BN's population mean & var when computing the k-th BN's mean & var.
https://github.com/lim0606/ndsb/blob/master/codes_for_caffe/predict_bn.cpp#L295-L302
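
The steps above can be sketched roughly as follows. This is only a toy illustration in plain Python, not the Caffe code: the "network" is a chain of scalar layers, and `weights`, `bn_relu`, `batch_stats` and `estimate_layerwise` are hypothetical names introduced here. It assumes gamma = 1 and beta = 0 in every BN layer, a ReLU after each BN, and a constant mini-batch size. The variance uses the unbiased m/(m-1) correction from Algorithm 2 of the paper.

```python
import math

def batch_stats(xs):
    """Return the mean and biased variance of one mini-batch of scalars."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return mean, var

def bn_relu(xs, mean, var, eps=1e-5):
    """Normalize with the given statistics (gamma=1, beta=0), then apply ReLU."""
    return [max(0.0, (x - mean) / math.sqrt(var + eps)) for x in xs]

def estimate_layerwise(batches, weights):
    """Estimate population (mean, var) for each BN layer, bottom layer first."""
    m = len(batches[0])                      # mini-batch size, assumed constant
    pop_stats = []                           # pop_stats[j] = (mean, var) of BN layer j
    for k in range(len(weights)):            # outer loop: BN layers
        means, variances = [], []
        for batch in batches:                # inner loop: mini-batches
            h = batch
            for j in range(k):               # layers below k use saved population stats
                h = bn_relu([weights[j] * x for x in h], *pop_stats[j])
            mean, var = batch_stats([weights[k] * x for x in h])
            means.append(mean)
            variances.append(var)
        pop_mean = sum(means) / len(means)
        # Unbiased variance estimate: m/(m-1) times the average mini-batch variance
        pop_var = m / (m - 1) * sum(variances) / len(variances)
        pop_stats.append((pop_mean, pop_var))
    return pop_stats

# Toy data: two mini-batches of size 2, two scalar layers with weight 1.0
batches = [[0.0, 2.0], [4.0, 6.0]]
weights = [1.0, 1.0]
pop_stats = estimate_layerwise(batches, weights)
print(pop_stats)
```

The point of the loop order is that the statistics of layer k are only collected once layers 0..k-1 already normalize with their fixed population values, so the k-th layer sees the same input distribution it will see at inference time.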

What's your opinion?

@lim0606
Owner

lim0606 commented Mar 18, 2015

Thank you for sharing your bn_layer codes and comments :)

I also agree with your opinion that estimating the means and variances layer by layer, starting from the bottom layer, is the more natural way (similar to greedy layer-wise pre-training of a DBN).

When I was implementing predict_bn.cpp, however, I thought that the means and variances would converge to similar values if I estimated them with a large enough number of training examples. That seems to be the reason the original paper only briefly describes the estimation of means and variances for inference.
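
For reference, a single-pass estimate (for iter; then for k) might look like the toy sketch below. It is not the code in predict_bn.cpp. It assumes that, while the statistics are being accumulated, every BN layer normalizes with the current mini-batch's own mean & var, and that gamma = 1, beta = 0, with a ReLU after each BN. The scalar-layer "network" and the names `weights`, `batch_stats` and `estimate_one_pass` are hypothetical.

```python
import math

def batch_stats(xs):
    """Return the mean and biased variance of one mini-batch of scalars."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return mean, var

def estimate_one_pass(batches, weights, eps=1e-5):
    """Accumulate mini-batch statistics for all BN layers in one forward pass per batch."""
    m = len(batches[0])                      # mini-batch size, assumed constant
    n_layers = len(weights)
    mean_sum = [0.0] * n_layers
    var_sum = [0.0] * n_layers
    for batch in batches:                    # outer loop: mini-batches
        h = batch
        for k in range(n_layers):            # inner loop: BN layers
            z = [weights[k] * x for x in h]
            mean, var = batch_stats(z)
            mean_sum[k] += mean
            var_sum[k] += var
            # Normalize with this mini-batch's own statistics, then ReLU
            h = [max(0.0, (x - mean) / math.sqrt(var + eps)) for x in z]
    n = len(batches)
    # Average the means; apply the m/(m-1) correction to the averaged variances
    return [(mean_sum[k] / n, m / (m - 1) * var_sum[k] / n) for k in range(n_layers)]

# Toy data: two mini-batches of size 2, two scalar layers with weight 1.0
batches = [[0.0, 2.0], [4.0, 6.0]]
weights = [1.0, 1.0]
one_pass_stats = estimate_one_pass(batches, weights)
print(one_pass_stats)
```

The first BN layer sees raw inputs in both schemes, so its estimate cannot differ. Any difference would show up from the second BN layer onward, where the input depends on how the layers below were normalized.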

I will compare the results of the two inference methods! :)
