Support for Deep Residual Net (ResNet) reference models for ILSVRC #60
Conversation
Added a typical training call chain for using the templates. Tested on a K40.
ResNet-18, 50, 101 and 152 are now ready for training. Training tips (to be updated).
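The "typical training call chain" mentioned above can be sketched against the public DeepDetect HTTP API: PUT /services to create a Caffe service from a template, then POST /train to launch training. This is an illustrative sketch, not the exact calls from this PR: the service name, repository paths, and solver values below are placeholders.

```python
import json

def service_payload(template="resnet_50", nclasses=1000, repo="/path/to/model"):
    """Body for PUT /services/<name>: create a Caffe service from a net template."""
    return {
        "mllib": "caffe",
        "description": "image classifier",
        "type": "supervised",
        "parameters": {
            "input": {"connector": "image", "width": 224, "height": 224},
            "mllib": {"template": template, "nclasses": nclasses},
        },
        "model": {"templates": "../templates/caffe/", "repository": repo},
    }

def train_payload(service="imageserv", data="/path/to/train"):
    """Body for POST /train: launch an asynchronous training job on the service."""
    return {
        "service": service,
        "async": True,
        "parameters": {
            "input": {"test_split": 0.1, "shuffle": True},
            "mllib": {"gpu": True, "solver": {"iterations": 100000, "base_lr": 0.01}},
            "output": {"measure": ["acc", "mcll"]},
        },
        "data": [data],
    }

# The two payloads, in call order: create the service, then train it.
print(json.dumps(service_payload(), indent=2))
print(json.dumps(train_payload(), indent=2))
```

These bodies would then be sent to a running dede server, e.g. with curl against http://localhost:8080.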
Hi @beniz,
Hi @GiuliaP, I guess you are referring to this file: https://github.com/beniz/deepdetect/blob/master/templates/caffe/resnet_18/resnet_18.prototxt So, two things:
Hi @beniz, thank you for the detailed answer, it's as I was expecting. In case I manage to come up with working models, I'll open a PR! However, at present I am stuck by the fact that I cannot find the pretrained weights anywhere on the internet.
Weights for the smaller nets are not available as far as I know. Now, a hint that derives from the above note is that you may be able to take the appropriate weights from one of the deeper pretrained nets and finetune.
Thanks for the feedback. In this case, I'll try to benchmark that approach.
@GiuliaP FB has trained ResNets of various sizes in Torch and is making their weights available (incl. 18, 34, also 200): https://github.com/facebook/fb.resnet.torch/tree/master/pretrained There is also a torch2caffe converter available here: https://github.com/facebook/fb-caffe-exts#torch2caffe I haven't tried this path myself, but if you decide to go down this road I would be interested to hear how it goes :-)
Ah great, thanks @revilokeb. @GiuliaP make sure the architectures (e.g. the 18) are exactly the same when using converted weights. You can modify the prototxt template if needed.
Correct me if I'm wrong, but doesn't this implementation of resnet-18 create a huge FC layer? In resnet-50 (which I believe this implementation is just a cut-out of), the layer before the last 7x7 pooling is 7x7x2048, which is then followed by a 2048x1000 FC layer. However, here the layer before the last 7x7 pooling is something along the lines of 28x28x512, which even after 7x7 stride-1 pooling will create a giant FC layer (roughly 225792x1000). I believe it would be better to gradually scale down the activations to 7x7x512 by the end of the 18 layers (through occasional stride-2 convolutions), or at least do a 7x7 stride-4 pooling, etc.
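To make the size argument concrete, here is a small arithmetic sketch. It assumes no padding and floor rounding for the pooling output size; the exact figure depends on the padding and rounding convention, which is why the ~225792 estimate above (a 21x21 output) differs slightly from the 22x22 computed here.

```python
def out_size(inp, kernel, stride):
    """Spatial output size of a conv/pool layer, no padding, floor rounding."""
    return (inp - kernel) // stride + 1

# ResNet-50-style top: 7x7x2048 before a global 7x7 pooling -> 1x1x2048,
# so the FC layer has only 2048 * 1000 weights.
assert out_size(7, 7, 1) == 1
fc_resnet50 = 1 * 1 * 2048 * 1000          # 2,048,000 weights

# A cut-down 18-layer variant that stops at 28x28x512: a 7x7 stride-1
# pooling leaves a 22x22x512 volume, so the FC layer balloons.
side = out_size(28, 7, 1)                  # 22
fc_big = side * side * 512 * 1000          # hundreds of millions of weights

# The suggested 7x7 stride-4 pooling shrinks the volume considerably.
side4 = out_size(28, 7, 4)                 # 6
fc_small = side4 * side4 * 512 * 1000

print(fc_resnet50, fc_big, fc_small)
```

Either way, the comparison supports the point: without scaling the activations down to 7x7 before the classifier, the FC layer dominates the parameter count.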
Yes, that's correct. To scale down the top of the network, you can look at the resnet with 18 layers from the Torch implementation, https://github.com/facebook/fb.resnet.torch/blob/master/models/resnet.lua, or take a look at a generator for Caffe (untested), https://github.com/soeaver/caffe-model/blob/master/resnet.py There's a resnet (and more) generator coming up for DD, but it is in no state to be shared at the moment. If you ever scale down the net with good results, please PR the changes so that others can benefit.
First off, thank you for making the ResNet models available in your repository. However, I think something is a bit odd with your training models, because when you run them in Caffe it basically says that none of the nodes need backward computation. That surely can't be right; usually you would only see this on the data layer. That smells to me like they are for inference, not for training.
We've trained from scratch and finetuned many of these nets.
Thanks, I managed to fix it by adding two accuracy layers to the TEST phase. It might all be because I changed the TEST input layer as well.
@beniz Thank you for your patient reply. Could you give me some tips about combining Faster R-CNN with resnet if you have any ideas? I am trying to finetune a 50-layer Faster R-CNN with resnet, but I have some trouble with the layers: do I need to modify the name of the fully connected layer, and will it converge without the layer parameters (such as batch_norm_param {})?
@OranjeeGeneral Would you mind sharing your fix? I'm not able to train a single resnet that converges. What made the difference? Thanks!
I came here from this thread KaimingHe/deep-residual-networks#6 I have tried your ResNet-18 with batch size 16 and learning rates from 0.1 to 0.00001, but with no success. My task has 20k images and 2 classes.
Have you tried a simpler network, e.g. GoogLeNet, and had it converge? If you are using dd, post your exact API calls.
Yes, I have successfully trained AlexNet, GoogLeNet, VGG, etc.; see my repo https://github.com/mrgloom/kaggle-dogs-vs-cats-solution
OK, FYI our resnet-18 should be redone; it's on my to-do list, I just got caught up elsewhere. However, we've used it with good success before. Two suggestions:
Let me know how this goes!
Also note that these are for dd and its custom Caffe version. We don't support NVIDIA's version...
Also no success with ResNet-50 from here https://github.com/jay-mahadeokar/pynetbuilder/tree/master/models/imagenet/resnet_50
This means the problem is elsewhere. We can't help you outside dd, good luck :)
I also want to report that I tried multiple nets from https://github.com/jay-mahadeokar/pynetbuilder/tree/master/models/imagenet on NVIDIA with no success. I know you don't support NVIDIA, but I just figured I'd let you know.
What about the original resnet-50?
I haven't found an original resnet-50 yet, but I did find a resnet-18 that works perfectly. I would love to try out all the variations from https://github.com/jay-mahadeokar/pynetbuilder/tree/master/models/imagenet but don't know how to make them work in NVIDIA.
ResNet-50 and above are available pretrained from Caffe, TF and dd. They can easily be finetuned.
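As a reference for the finetuning path through dd, a service-creation body could look like the sketch below. It follows the DeepDetect API's finetuning parameters as documented ("finetuning" and "weights" under the mllib parameters); the class count, repository path and weights filename are placeholders for your own setup.

```python
import json

# Hedged sketch: create a DeepDetect service that finetunes a pretrained
# ResNet-50 instead of training it from scratch.
finetune_service = {
    "mllib": "caffe",
    "description": "finetuned classifier",
    "type": "supervised",
    "parameters": {
        "input": {"connector": "image", "width": 224, "height": 224},
        "mllib": {
            "template": "resnet_50",
            "nclasses": 20,                            # your own number of classes
            "finetuning": True,                        # reuse pretrained weights
            "weights": "ResNet-50-model.caffemodel",   # file in the model repository
        },
    },
    "model": {"templates": "../templates/caffe/", "repository": "/path/to/model"},
}

print(json.dumps(finetune_service, indent=2))
```

With "finetuning" set, the template's classifier layer is rebuilt for the new number of classes while the remaining layers start from the pretrained weights.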
@beniz Do you know where I can find a vanilla resnet-50 network for Caffe? I haven't seen any available publicly. I see pretrained models listed, but without the network definition I can't finetune them.
Have you looked at https://github.com/beniz/deepdetect/blob/master/README.md ? There are links to several pretrained resnet models.
Pretrained, yes... but I can't find the prototxt for any of them.
Read the documentation: they are in the repository as templates. Resnets are used all over the place.
Thanks, I did find the templates; however, they don't work with the pretrained models in NVIDIA, due to the different BN layer implementations. They also don't converge above 60% on the 17Flowers dataset, and I'm not sure why.
Can't help you with that; they work fine with dd, and that's all we guarantee.
ResNet-18 works on a fresh caffe-master. Is there any model pretrained on ImageNet available?
I am also looking for pretrained weights for resnet-18/34 for Caffe. The conversions from Facebook's Torch implementation do not work properly.
@xiaojimi we finetune resnet_50 very often on a variety of tasks with excellent convergence. You'll need to share your API calls and server logs, and list your model directory.
Could you please provide the pretrained ResNet-18/34 models?
@miquelmarti I am thinking of trying to convert resnet 18/34 from Torch; weren't you able to do it? Did you find any other pretrained weights somewhere else?
@olaff09 I did not try too hard; you can give it another try as it has to be possible, I just lack sufficient knowledge of Torch.
What would be so useful about ResNet-18/34 is the lower memory per image at prediction time, along with speed.
Hi there,
This is support for the state-of-the-art nets just released by https://github.com/KaimingHe/deep-residual-networks. They are implemented as DeepDetect neural net templates: resnet_50, resnet_101 and resnet_152 are now available from the API.
Note: training successfully tested on resnet_18 and resnet_50.
For using the nets in predict mode:
mkdir path/to/model
cp ResNet-50-model.caffemodel path/to/model/
cp ResNet_mean.binaryproto path/to/model/mean.binaryproto
Note that template is set to resnet_50.
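A predict call against such a service (POST /predict in the DeepDetect API) could then be sketched as the payload below; the service name and image path are placeholders.

```python
import json

# Hedged sketch of a DeepDetect predict request body for a resnet_50 service.
predict_payload = {
    "service": "imageserv",
    "parameters": {
        "input": {"width": 224, "height": 224},
        "output": {"best": 5},    # return the top-5 classes
        "mllib": {"gpu": True},
    },
    "data": ["/path/to/image.jpg"],
}

print(json.dumps(predict_payload, indent=2))
```

The mean.binaryproto dropped into the model repository above is picked up automatically for input mean subtraction, so it does not appear in the call itself.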