Nice work!
I am also interested in the BN inference part.
To my understanding of the paper, Algorithm 2 should work roughly as follows (BVLC/caffe#1965 (comment)):

1. Before the TEST phase, we forward a few mini-batches to compute the mean & var for the 1st BN layer, then save this mean & var for all later forward passes (inference).
2. We then forward those mini-batches again to compute the mean & var for the 2nd BN layer; note that the normalization in the 1st BN layer is now carried out with the mean & var saved in step 1, not with the mini-batch statistics.
3. Similarly, we repeat this for the remaining BN layers.
4. After computing all the means & vars, we have the inference BN network (a toy sketch of this procedure is below).
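Here is a minimal, self-contained C++ sketch of that layer-by-layer procedure. It is only an illustration: the `BNLayer` struct and every name in it are made up for this example and are not from Caffe or from this repo's bn_layer / predict_bn.cpp.

```cpp
// Toy sketch of layer-by-layer estimation of BN population statistics.
// All names here (BNLayer, pre_bn, ...) are illustrative, not Caffe's API.
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct BNLayer {
    double scale = 2.0, shift = 0.5;       // pre-BN affine part (stands in for conv/fc)
    double pop_mean = 0.0, pop_var = 1.0;  // frozen population statistics

    // Pre-BN activations, used while *estimating* this layer's statistics.
    std::vector<double> pre_bn(const std::vector<double>& x) const {
        std::vector<double> a(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) a[i] = scale * x[i] + shift;
        return a;
    }
    // Full forward pass using the frozen population statistics (inference-style).
    std::vector<double> forward(const std::vector<double>& x) const {
        std::vector<double> a = pre_bn(x);
        for (double& v : a) v = (v - pop_mean) / std::sqrt(pop_var + 1e-5);
        return a;
    }
};

int main() {
    std::vector<BNLayer> net(3);  // a 3-layer toy network
    std::mt19937 rng(0);
    std::normal_distribution<double> data(0.0, 1.0);
    const int num_batches = 10, batch_size = 256;

    // Outer loop over layers, inner loop over mini-batches: layer k's statistics
    // are estimated while layers 0..k-1 already normalize with their frozen
    // population statistics, exactly as in the step-by-step description above.
    for (std::size_t k = 0; k < net.size(); ++k) {
        double sum = 0.0, sum_sq = 0.0;
        std::size_t n = 0;
        for (int iter = 0; iter < num_batches; ++iter) {
            std::vector<double> x(batch_size);
            for (double& v : x) v = data(rng);
            for (std::size_t j = 0; j < k; ++j) x = net[j].forward(x);  // frozen stats
            std::vector<double> a = net[k].pre_bn(x);                   // layer k pre-BN
            for (double v : a) { sum += v; sum_sq += v * v; ++n; }
        }
        net[k].pop_mean = sum / n;
        net[k].pop_var  = sum_sq / n - net[k].pop_mean * net[k].pop_mean;
    }
    return 0;
}
```

The point is the loop nesting: the outer loop freezes the statistics of one BN layer at a time, and the inner loop runs the mini-batches through the layers whose statistics are already frozen.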
Thank you for sharing your bn_layer code and comments :)
I also agree that estimating the means and variances layer by layer from the bottom up is the more natural way (similar to greedy layer-wise pre-training of a DBN).
When I implemented predict_bn.cpp, however, I assumed that the means and variances would converge to similar values as long as they are estimated from a sufficient number of training examples. That also seems to be why the original paper only briefly describes the estimation of the means and variances for inference.
So I think in the following lines you should switch the for-loop order (for k, then for iter) and use the (k-1)-th BN layer's population mean & var when computing the k-th BN layer's mean & var; a rough skeleton of what I mean is below the link.
https://github.com/lim0606/ndsb/blob/master/codes_for_caffe/predict_bn.cpp#L295-L302
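For concreteness, this is only a hypothetical skeleton of the loop restructuring I have in mind; the constants and function names are placeholders, not functions from predict_bn.cpp or from Caffe.

```cpp
// Skeleton of the suggested loop order; all names are placeholders.
const int num_bn_layers = 4;
const int num_iters = 10;

void forward_through_frozen_layers(int /*k*/, int /*iter*/) { /* layers 0..k-1 use saved stats */ }
void accumulate_stats_for(int /*k*/, int /*iter*/)          { /* mean/var of layer k's input   */ }
void freeze_stats_for(int /*k*/)                            { /* save population mean & var    */ }

int main() {
    // Suggested order: fix one BN layer at a time (for k), then loop over
    // mini-batches (for iter), so layer k always sees inputs normalized with
    // the population statistics already frozen for layers 0..k-1.
    for (int k = 0; k < num_bn_layers; ++k) {
        for (int iter = 0; iter < num_iters; ++iter) {
            forward_through_frozen_layers(k, iter);
            accumulate_stats_for(k, iter);
        }
        freeze_stats_for(k);
    }
    return 0;
}
```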
What's your opinion?