
ImageNet test #5

Closed
dccho opened this issue Sep 5, 2016 · 17 comments

@dccho commented Sep 5, 2016

I'm trying to train DenseNet on the ImageNet dataset, but it doesn't converge well.
Have you ever tried DenseNet on ImageNet?
Please share any successful DenseNet configuration for ImageNet if you have one.

@liuzhuang13 (Owner)

We are experimenting with ImageNet. So far we have successfully trained a model with only 10M parameters; its top-1 error is 28.7%, which is better than ResNet-18 (30.4% error, 11M parameters). If you want the model, I can share it with you later.

Thanks

@dccho (Author) commented Sep 6, 2016

@liuzhuang13 Thanks! Hope to see your great densenet model soon.

@liuzhuang13 (Owner)

Thanks. The ImageNet results will probably take a while. Do you want the model definition file for ImageNet, or an actual pretrained model? I can share it with you through email; leave your email address!

@dccho (Author) commented Sep 6, 2016

Thanks! My email is dccho.cvpr.phd@gmail.com. If the pretrained model is too big, you can send me the definition only; I'll train from scratch.

@wlw208dzy commented Sep 20, 2016

@liuzhuang13 I would appreciate it if you could send me the model definition file for the ImageNet dataset. My email is dzy_wlw@163.com. Thanks!

@liuzhuang13 (Owner)

@wlw208dzy I'll share the links with you here.

DenseNet (10M parameters, 28.7% val error)
definition: https://1drv.ms/u/s!AjwB4qLCejx-be9Qh7ZT-RtvV38
pretrained model: https://1drv.ms/u/s!AjwB4qLCejx-a17znBzqnquzaJY

DenseNet (40M parameters, 24.0% val error)
definition: https://1drv.ms/u/s!AjwB4qLCejx-bJQcJQi9ptGgbT0
pretrained model: https://1drv.ms/u/s!AjwB4qLCejx-bp0a4WlshgcWrNs

Due to limited resources, these are only preliminary models; we are still investigating different architectural designs (e.g., bottleneck structures) for DenseNets.
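For readers following along: the defining idea of the DenseNet architectures shared above is that each layer inside a dense block receives the channel-wise concatenation of the block input and all preceding layers' outputs. Here is a minimal NumPy sketch of that connectivity pattern; the `conv_layer` stand-in (a random linear map plus ReLU) and all shapes are illustrative assumptions, not the actual model definition:

```python
import numpy as np

def conv_layer(x, growth_rate, rng):
    # Stand-in for a BN-ReLU-Conv composite: maps the concatenated
    # features to `growth_rate` new feature channels (illustrative only).
    w = rng.standard_normal((x.shape[1], growth_rate))
    return np.maximum(x @ w, 0.0)  # ReLU

def dense_block(x, num_layers, growth_rate, rng):
    # Each layer sees the concatenation of the block input and
    # every preceding layer's output ("dense" connectivity).
    features = [x]
    for _ in range(num_layers):
        concat = np.concatenate(features, axis=1)
        features.append(conv_layer(concat, growth_rate, rng))
    return np.concatenate(features, axis=1)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16))          # batch of 4 feature vectors
out = dense_block(x, num_layers=6, growth_rate=12, rng=rng)
print(out.shape)                          # channels grow by growth_rate per layer
```

Note how the output width is the input width plus `num_layers * growth_rate`; this is why a small growth rate keeps the parameter count low (10M above) despite the dense connections.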

@argman commented Sep 21, 2016

@liuzhuang13, how does DenseNet (40M parameters) compare to ResNet-152?
From slim, the val error of ResNet-152 is about 24.0%.
Also, how long does it take to train on ImageNet, and why did you choose Nesterov as the optimizer? Thanks!

@liuzhuang13 (Owner) commented Sep 21, 2016

@argman

> does densenet (40M parameters) compare to resnet-152?

From this page https://github.com/facebook/fb.resnet.torch/tree/master/pretrained (Facebook's original implementation), ResNet-152 has a val error of 22.16%, which is better than the DenseNet with 40M parameters; it has 60M parameters, though. Note that data augmentation, optimization, etc. are kept the same. The TensorFlow implementation may have some differences.

> And how long does it take to train on imagenet?

It took us 10 days to train the 40M DenseNet for 120 epochs on 4 TITAN X GPUs, with a batch size of 128.

> Why do you choose Nesterov as optimizer?

We followed fb.resnet.torch's implementation for every setting and hyperparameter, except for a smaller batch size (due to memory constraints) and slightly more training epochs.
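For readers unfamiliar with the optimizer mentioned above: Nesterov momentum evaluates the gradient at a "look-ahead" point before applying the velocity update. A minimal NumPy sketch of one formulation (the learning rate 0.1 and momentum 0.9 shown are the common fb.resnet.torch-style defaults, used here only for illustration; the repo's exact update rule may be expressed differently):

```python
def nesterov_step(theta, v, grad_fn, lr=0.1, mu=0.9):
    # Evaluate the gradient at the look-ahead point theta + mu * v,
    # then update the velocity and the parameters.
    g = grad_fn(theta + mu * v)
    v_new = mu * v - lr * g
    return theta + v_new, v_new

# Toy usage: minimize f(x) = 0.5 * x^2, whose gradient is x.
theta, v = 5.0, 0.0
for _ in range(200):
    theta, v = nesterov_step(theta, v, lambda t: t)
print(theta)  # close to the minimum at 0
```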

@yefanhust commented Dec 10, 2017

Hi @liuzhuang13, I followed your paper's configuration (https://arxiv.org/abs/1608.06993) and trained DenseNet-BC-121 (theta = 0.5) on ImageNet without data augmentation or dropout. I can only reach a val error of 28.15% after 62 epochs (i.e., 2 epochs after the second learning-rate decrease), and since then the val error has been increasing slightly every epoch. I can't replicate the top-1 val error of 25.02% reported in your paper. Could you please give me any suggestions?

@liuzhuang13 (Owner)

Hi @yefanhust, we trained DenseNet with the data augmentation implemented in the fb.resnet.torch repo: https://github.com/facebook/fb.resnet.torch#notes

If you don't use data augmentation, it's unlikely that you will get the same performance.

@yefanhust
Thanks so much for your prompt reply, @liuzhuang13! I'll try the data augmentation then.

@yefanhust commented Dec 14, 2017

Hi @liuzhuang13, I've turned on data augmentation for training DenseNet-121. I used scale and aspect-ratio augmentation (Inception-style scale jittering), color jittering (brightness 0.4, contrast 0.4, saturation 0.4), AlexNet-style color lighting (std = 0.1, with PCA eigenvalues and eigenvectors), color normalization (means [123.675, 116.28, 103.53], stds [58.395, 57.12, 57.375]), and random mirroring. However, the best top-1 error I achieved was 27.59% after 82 epochs, only 0.56% better than without data augmentation and still far from your paper's 25.02%. Am I missing anything here? Or should I train the network longer, say 120 epochs?
[attached image: densenet121 training curves]
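To make the augmentation recipe in the comment above concrete, here is a minimal NumPy sketch of two of its pieces: the Inception-style scale/aspect-ratio crop and the per-channel normalization with the stated means and stds. This is a simplified illustration, not the actual Caffe2 or fb.resnet.torch code; the color/lighting jitter and the final resize of the crop to 224x224 are omitted for brevity:

```python
import numpy as np

# Per-channel RGB statistics quoted in the comment above.
MEAN = np.array([123.675, 116.28, 103.53])
STD = np.array([58.395, 57.12, 57.375])

def random_resized_crop_params(h, w, rng, scale=(0.08, 1.0), ratio=(3/4, 4/3)):
    # Inception-style jitter: sample a target crop area and aspect ratio,
    # derive the crop box; fall back to a center square crop if sampling fails.
    area = h * w
    for _ in range(10):
        target = rng.uniform(*scale) * area
        aspect = np.exp(rng.uniform(np.log(ratio[0]), np.log(ratio[1])))
        cw = int(round(np.sqrt(target * aspect)))
        ch = int(round(np.sqrt(target / aspect)))
        if cw <= w and ch <= h:
            y = rng.integers(0, h - ch + 1)
            x = rng.integers(0, w - cw + 1)
            return y, x, ch, cw
    s = min(h, w)
    return (h - s) // 2, (w - s) // 2, s, s

def augment(img, rng):
    # img: HxWx3 uint8 array. Random crop, random mirror, then normalize.
    y, x, ch, cw = random_resized_crop_params(img.shape[0], img.shape[1], rng)
    crop = img[y:y + ch, x:x + cw].astype(np.float64)
    if rng.random() < 0.5:
        crop = crop[:, ::-1]  # random horizontal flip
    return (crop - MEAN) / STD
```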

@liuzhuang13 (Owner)

@yefanhust What library are you using?

@yefanhust
@liuzhuang13 I'm using Caffe2.

@liuzhuang13 (Owner)

Maybe the implementation details are different, e.g., batch normalization. By the way, why are you training it on Caffe2?

@yefanhust
@liuzhuang13 Are you training from the raw ImageNet12 data or from a resized version? To answer your question: I work for NVIDIA, and this work is part of our NGC product, building a base of trained models.

@liuzhuang13 (Owner)

Our setting follows https://github.com/facebook/fb.resnet.torch exactly. I think the image is first resized and then cropped to 224x224.
