
About a tensorflow implementation #15

Closed · jh-jeong opened this issue May 2, 2017 · 3 comments


jh-jeong commented May 2, 2017

I've followed one of the TensorFlow implementations of DenseNet (https://github.com/ikhlestov/vision_networks) to reproduce DenseNet-BC-100-12.
The TensorFlow implementation looks nearly equivalent to the one in this repo,
but I couldn't reach the reported ~4.5% error (my best run was about 4.8%).
Do you have any idea why that is? I compared the two codebases very carefully but couldn't find a difference.

@Tongcheng

@jh-jeong In my "Much more efficient caffe implementation", I also get about 4.8% for DenseNet-BC-100-12. I'm curious about the cause, which seems to be common to Caffe and TensorFlow.


jh-jeong commented May 16, 2017

@Tongcheng I finally got 4.5% in TensorFlow. What I changed is as follows:

  1. Change the momentum in each BN layer. TensorFlow's batch normalization uses 0.999 as the default momentum for the moving statistics, but Torch uses 0.9.
  2. Apply weight decay to *all* trainable variables, as fb.resnet.torch does, including the beta/gamma variables in BN and all biases.
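Neither change is spelled out in code in the thread, so here is a minimal sketch (plain NumPy, not the poster's actual TensorFlow code; the parameter names and shapes are made up for illustration) of what the two fixes amount to:

```python
import numpy as np

# --- Fix 1: BN momentum -------------------------------------------
# Batch norm tracks running statistics with an exponential moving
# average: moving = momentum * moving + (1 - momentum) * batch_stat.
# TF's default momentum of 0.999 adapts far more slowly than 0.9,
# which hurts eval-time accuracy when training is relatively short.
def ema(momentum, batch_stat=1.0, steps=100, init=0.0):
    moving = init
    for _ in range(steps):
        moving = momentum * moving + (1 - momentum) * batch_stat
    return moving

print(round(ema(0.9), 4))    # close to the true statistic 1.0
print(round(ema(0.999), 4))  # still near the initial value 0.0

# --- Fix 2: weight decay on ALL trainable variables ---------------
# Hypothetical parameter set: conv kernel, bias, and BN gamma/beta.
rng = np.random.default_rng(0)
params = {
    "conv/kernel": rng.standard_normal((3, 3, 12, 12)),
    "conv/bias":   np.zeros(12),
    "bn/gamma":    np.ones(12),   # nonzero, so decaying it matters
    "bn/beta":     np.zeros(12),
}
wd = 1e-4
# Decaying only kernels (a common TF habit) vs. decaying everything,
# which is what fb.resnet.torch effectively does.
l2_kernels_only = wd * sum(0.5 * np.sum(v**2)
                           for k, v in params.items() if "kernel" in k)
l2_all = wd * sum(0.5 * np.sum(v**2) for v in params.values())
print(l2_all > l2_kernels_only)  # True: gamma is now penalized too
```

The first half shows why the momentum default matters: after 100 updates, momentum 0.9 has essentially converged to the batch statistic, while 0.999 has barely moved from its initial value, so the eval-time moving mean/variance lag far behind.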

@anthony123

@jh-jeong Can you share your TF version of the code?
