
About a tensorflow implementation #15

Closed · jh-jeong opened this issue May 2, 2017 · 3 comments


jh-jeong commented May 2, 2017

I've followed one of the TensorFlow implementations of DenseNet (https://github.com/ikhlestov/vision_networks) to reproduce DenseNet-BC-100-12.
The TensorFlow implementation looks nearly equivalent to the one in this repo,
but I couldn't reach the reported ~4.5% error (my best run was about 4.8%).
Do you have any idea why that is? I compared the two codebases very carefully but couldn't find a difference.

@Tongcheng

@jh-jeong In my "Much more efficient caffe implementation", I also get about 4.8% for DenseNet-BC-100-12. I'm curious about the cause, which seems to be common to Caffe and TensorFlow.


jh-jeong commented May 16, 2017

@Tongcheng I finally got 4.5% in TensorFlow. What I changed is as follows:

  1. Change the momentum in each BN layer. TensorFlow's batch normalization uses 0.999 as the default momentum for the moving statistics, but Torch uses 0.9.
  2. Apply weight decay to *all* trainable variables, as fb.resnet.torch does, including the beta/gamma variables in BN and all biases.
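Neither change is spelled out in code in the thread, so here is a minimal sketch (plain NumPy, not the poster's actual TensorFlow code; the parameter names and shapes are made up for illustration) of what the two fixes amount to:

```python
import numpy as np

# --- Fix 1: BN momentum -------------------------------------------
# Batch norm tracks running statistics with an exponential moving
# average: moving = momentum * moving + (1 - momentum) * batch_stat.
# TF's default momentum of 0.999 adapts far more slowly than 0.9,
# which hurts eval-time accuracy when training is relatively short.
def ema(momentum, batch_stat=1.0, steps=100, init=0.0):
    moving = init
    for _ in range(steps):
        moving = momentum * moving + (1 - momentum) * batch_stat
    return moving

print(round(ema(0.9), 4))    # close to the true statistic 1.0
print(round(ema(0.999), 4))  # still near the initial value 0.0

# --- Fix 2: weight decay on ALL trainable variables ---------------
# Hypothetical parameter set: conv kernel, bias, and BN gamma/beta.
rng = np.random.default_rng(0)
params = {
    "conv/kernel": rng.standard_normal((3, 3, 12, 12)),
    "conv/bias":   np.zeros(12),
    "bn/gamma":    np.ones(12),   # nonzero, so decaying it matters
    "bn/beta":     np.zeros(12),
}
wd = 1e-4
# Decaying only kernels (a common TF habit) vs. decaying everything,
# which is what fb.resnet.torch effectively does.
l2_kernels_only = wd * sum(0.5 * np.sum(v**2)
                           for k, v in params.items() if "kernel" in k)
l2_all = wd * sum(0.5 * np.sum(v**2) for v in params.values())
print(l2_all > l2_kernels_only)  # True: gamma is now penalized too
```

The first half shows why the momentum default matters: after 100 updates, momentum 0.9 has essentially converged to the batch statistic, while 0.999 has barely moved from its initial value, so the eval-time moving mean/variance lag far behind.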

@anthony123

@jh-jeong Can you share your TF version of the code?
