Update or add documentation on fine-tuning with different data / labels #967
A fine-tuning tutorial has been on our list -- @sergeyk is contributing an example of fine-tuning the CaffeNet model for a style-recognition task that will walk through each of the steps -- thanks Sergey! For now, #970 can be used as a work-in-progress reference. In particular, look at the `flickr_finetune_*.prototxt` files for the changes that need to be made relative to the ImageNet model definition. In general, you change the data layer to load your data, change the "output" layer (such as an inner product classifier) to fit the dimensionality of your task, and switch the type of loss if desired or if the task requires it. You don't need to change the names of the loss, accuracy, or data / label tops. It is the names of layers *with parameters* that are changed, to trigger re-initialization instead of transferring weights from the model the fine-tuning starts from: in Caffe, parameters are indexed by layer name.
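The name-matching rule can be illustrated with a small, self-contained Python sketch. Plain dicts stand in for real Caffe nets and blobs, and the layer names (`conv1`, `fc8`, `fc8_flickr`) and shapes are illustrative only:

```python
import numpy as np

def transfer_weights(pretrained, new_net_layers, rng=None):
    """Mimic Caffe's fine-tuning initialization: a layer in the new net
    receives the pretrained weights if a layer with the same name (and
    shape) exists; otherwise it is freshly initialized."""
    rng = rng or np.random.default_rng(0)
    new_params = {}
    for name, shape in new_net_layers.items():
        if name in pretrained and pretrained[name].shape == shape:
            new_params[name] = pretrained[name]            # transferred
        else:
            new_params[name] = rng.normal(0, 0.01, shape)  # re-initialized
    return new_params

# Pretrained "CaffeNet" with a 1000-way classifier (toy shapes).
pretrained = {"conv1": np.ones((96, 363)), "fc8": np.ones((1000, 4096))}

# New net: the classifier is renamed to fc8_flickr and resized to 20
# classes, so conv1 is copied over while fc8_flickr starts from scratch.
new_net = {"conv1": (96, 363), "fc8_flickr": (20, 4096)}

params = transfer_weights(pretrained, new_net)
```

Renaming the classifier layer is what prevents Caffe from trying (and failing) to load the old 1000-way weights into the new 20-way blob.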
In #970, in `flickr_finetune_train.prototxt`, the dataset is changed but the mean file remains the same. Why is that? Won't the mean of the new dataset affect the training? Moreover, after training, when loading the trained model, do I still use the ImageNet mean file, even though I have fine-tuned on my own data with a different mean? Thanks
Once you've processed enough images, the means all start to look the same.
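That convergence can be checked numerically. A quick sketch, with synthetic pixels standing in for two different natural-image datasets (the distribution and sizes are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)

def channel_mean(n_images, h=8, w=8):
    """Per-channel mean over a batch of random CHW 'images'."""
    imgs = rng.uniform(0, 255, size=(n_images, 3, h, w))
    return imgs.mean(axis=(0, 2, 3))  # one mean per channel

# Two disjoint "datasets" drawn from the same kind of imagery: with
# enough samples, their per-channel means nearly coincide, which is why
# reusing the ImageNet mean file is usually harmless for fine-tuning.
mean_a = channel_mean(5000)
mean_b = channel_mean(5000)
gap = np.abs(mean_a - mean_b).max()
```

If your fine-tuning data is very unlike natural photographs, though, recomputing the mean for your own dataset is the safer choice.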
The fine-tuning tutorial was merged in #970. See https://github.com/BVLC/caffe/blob/dev/examples/finetune_flickr_style/readme.md. Hope that helps clear up any confusion!
I've been trying to fine-tune the Caffe reference ImageNet model using our own data and class labels. Essentially, I want to learn a new classifier by re-training just the softmax (and accuracy?) layer and leaving all weights in previous layers as-is.
There are several issues on this topic, such as #31 and #140, which seem to be a bit outdated. Similarly the intro slides, which refer to the `finetune_net` binary, now deprecated. Instead, I tried the command line. Here, `my_dataset_solver.prototxt` is a copy of the original ImageNet solver file, where I changed the "net" and "snapshot_prefix" values to the name of my dataset. In the `my_dataset_train_val.prototxt` file, which I also copied from the ImageNet example, I renamed:

- the `loss` layer --> `loss_new`, including places where it is referenced (in "top")
- the `accuracy` layer --> `accuracy_new`, including where it is referenced
- the `data` layers --> `data_new`
- `label` --> `label_new`
Is this correct so far? If I now replace the reference to "data" in each layer with "data_new", and likewise for "label", it appears that the entire net is being retrained, though I'm not quite sure how to tell from the logging output which layers are actually being retrained. (I had it running for about 3 days in CPU mode, whereas I'd expect retraining just the last layer to take only a couple of hours.)
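Short of parsing the logs, one way to see which layers actually moved is to diff the snapshot's parameters against the starting model. A minimal sketch, with plain numpy arrays standing in for the per-layer weight blobs you would pull out of the two models (the layer names and values are hypothetical):

```python
import numpy as np

def changed_layers(before, after, tol=1e-8):
    """Return the names of layers whose weights changed during training."""
    return sorted(
        name for name in before
        if name in after and not np.allclose(before[name], after[name], atol=tol)
    )

# Toy "before fine-tuning" / "after fine-tuning" parameter dumps.
before = {"conv1": np.zeros((4, 4)), "fc8_new": np.zeros((20, 8))}
after = {
    "conv1": np.zeros((4, 4)),        # unchanged: effectively frozen
    "fc8_new": np.full((20, 8), 0.5), # updated by the solver
}
```

A layer that should be frozen can also be pinned explicitly by setting its learning-rate multiplier to zero in the train prototxt, so only the new classifier receives updates.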
If I instead leave the old references to "data" unchanged and don't delete the original "data" layers (since the existing layers were trained on the original ImageNet dataset), then I keep getting asked for the original ImageNet database, which I think shouldn't be required since I just want to retrain the last layer on my own data.
I'd be happy if somebody could shed some light on how to do this correctly. Thanks!