Load weights from multiple caffemodels. #1456
Conversation
jyegerlehner added some commits on Nov 19, 2014
This is a helpful generalization for certain uses. Note that this can … The level and stage rules for layer inclusion / exclusion are helpful for …
With the advances in pycaffe one can copy weights from several models by `Net.copy_from()`. I like preparing the nets through Python for its generality, but copying weights from multiple nets could be a useful special case. However, I'm inclined to keep the …
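For illustration, the merge semantics behind both approaches (each weights file is applied in turn, so later files win for layers they share, and layers matched by name are the only ones copied) can be sketched in pure Python, with plain dicts standing in for caffemodels. All names below are hypothetical, not from the PR:

```python
def apply_weight_files(net_params, weight_files):
    """Toy model of initializing a net from several weight files.

    net_params: dict mapping layer name -> current parameters.
    weight_files: list of dicts, each mapping layer name -> parameters,
    standing in for caffemodels. Files are applied in order, so later
    files override earlier ones for layers both contain; layers absent
    from every file keep their initial parameters.
    """
    for weights in weight_files:
        for layer, params in weights.items():
            if layer in net_params:  # copy only layers the net defines
                net_params[layer] = params
    return net_params

# Hypothetical stacked-autoencoder scenario from the discussion below:
net = {"conv1": "random", "encode1": "random", "decode1": "random"}
features = {"conv1": "pretrained_conv1"}
autoencoder = {"encode1": "trained_enc1", "decode1": "trained_dec1",
               "fc_extra": "ignored"}  # not in the net, so skipped
print(apply_weight_files(net, [features, autoencoder]))
```

In pycaffe the analogous sequence is one `net.copy_from(path)` call per caffemodel; the comma-separated `--weights` flag added by this PR applies the listed files in the same order-dependent way.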
I think it's useful and non-intrusive. I'm not a huge fan of the interface (commas in a flag argument) but I can't think of anything better (gflags doesn't let you specify the same flag multiple times and give you a …)
shelhamer added a commit that referenced this pull request on Mar 8, 2015: a9bf7b9
shelhamer: @jyegerlehner thanks for the convenient multi-model fine-tuning initialization. I merged this to master in a9bf7b9 (and collapsed this to a single commit).
jyegerlehner commented Nov 20, 2014
At least one use case requiring this is layerwise or "stacked" autoencoder training: first I train the newly added encoder and decoder layers by themselves (using features extracted from the net containing only the previously trained layers). Then, when I begin to train the combined network, it needs to pull weights from two different caffemodel files. So this change allows the `--weights` parameter to be a comma-separated list of caffemodels instead of just a single caffemodel.

The other code change is that the test nets are also initialized from the provided caffemodels, not just the train net. Previously, if the trained net was a subset of the test net, some of the test net's layers' weights would be left uninitialized; with this change they are initialized from the specified models.
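With this change, the combined-training step described above might be invoked as in the following sketch (solver and caffemodel filenames are hypothetical, not from the PR):

```shell
# Train the combined net, initializing weights from two caffemodels.
# The files listed in --weights are applied in order, so later files
# override earlier ones for any layers they share.
caffe train \
    --solver=combined_solver.prototxt \
    --weights=pretrained_features.caffemodel,stacked_autoencoder.caffemodel
```

Layers present in the net but absent from both caffemodels keep their random initialization, as with a single `--weights` file.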