Snapshot model weights/solver state to HDF5 files #2836
Conversation
This satisfies part of #1211.
Yeongtae commented Aug 1, 2015
Can I access the network's blobs using HDF5? If so, please show an example.
I've skimmed through this and it mostly looks good, thanks @erictzeng. My one piece of feedback right now is that…
shelhamer commented on an outdated diff Aug 6, 2015
+ string source_layer_name = hdf5_get_name_by_idx(data_hid, i);
+ if (!layer_names_index_.count(source_layer_name)) {
+   DLOG(INFO) << "Ignoring source layer " << source_layer_name;
+   continue;
+ }
+ int target_layer_id = layer_names_index_[source_layer_name];
+ DLOG(INFO) << "Copying source layer " << source_layer_name;
+ vector<shared_ptr<Blob<Dtype> > >& target_blobs =
+     layers_[target_layer_id]->blobs();
+ hid_t layer_hid = H5Gopen2(data_hid, source_layer_name.c_str(),
+     H5P_DEFAULT);
+ CHECK_GE(layer_hid, 0)
+     << "Error reading weights from " << trained_filename;
+ // Check that source layer doesn't have more params than target layer
+ int num_source_params = hdf5_get_num_links(layer_hid);
+ CHECK_LE(num_source_params, target_blobs.size())
shelhamer (Owner)
shelhamer added enhancement, focus labels Aug 6, 2015
This will be a good switch, and the backward compatibility saves a lot of heartache, but we might consider bringing the documentation and examples along with us, as there are references to the current extensions here and there. This looks good to me code-wise (once Jeff's comment is addressed), but you could squash related changes and fixes when you're done. Since the weight sharing tests don't cover save and restore (…) Thanks @erictzeng!
The tests I added in #2866 do cover this (though they're less unit tests and more integration tests than what you propose, as they also rely on the solver snapshot/restore correctness). |
@jeffdonahue oh sweet, |
@bhack this lets us keep the same dependencies and interface for defining models. Migrating away from protobuf to a new format needs a good argument and its own issue, since model definitions would change.
@shelhamer FlatBuffers supports .proto parsing for easier migration from Protocol Buffers.
jeffdonahue added some commits Jul 30, 2015
That should be all comments addressed! The constant has been lowered to 32 as requested, and history has been squashed. Let me know if anything else seems off. @Yeongtae I'm not sure I fully understand what you're asking, but this PR allows you to access network parameters via HDF5, if that's what you want. The parameters are stored in a fairly simple structure. Here's how you'd peek at the conv1 parameters in lenet:
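The original example did not survive extraction, but the layout can be sketched with h5py. This is a hedged stand-in: it builds a tiny file in place of a real snapshot, so the file name, shapes, and values below are illustrative only (the shapes match LeNet's conv1: 20 filters, 1 input channel, 5x5 kernels).

```python
import h5py
import numpy as np

# Build a tiny stand-in snapshot just to illustrate the layout; a real
# .caffemodel.h5 is produced by Caffe's snapshotting.
with h5py.File("demo.caffemodel.h5", "w") as f:
    grp = f.create_group("data/conv1")
    grp.create_dataset("0", data=np.zeros((20, 1, 5, 5), dtype=np.float32))  # weights
    grp.create_dataset("1", data=np.zeros((20,), dtype=np.float32))          # biases

# Peek at the conv1 parameters: /data/<layer_name>/<param_index>.
with h5py.File("demo.caffemodel.h5", "r") as f:
    weights = f["data/conv1/0"][...]
    biases = f["data/conv1/1"][...]
print(weights.shape, biases.shape)  # (20, 1, 5, 5) (20,)
```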
The datasets 0 and 1 correspond to the weights and biases of the layer, respectively.
jeffdonahue commented on an outdated diff Aug 7, 2015
+ CHECK_GE(file_hid, 0)
+     << "Couldn't open " << filename << " to save weights.";
+ hid_t data_hid = H5Gcreate2(file_hid, "data", H5P_DEFAULT, H5P_DEFAULT,
+     H5P_DEFAULT);
+ CHECK_GE(data_hid, 0) << "Error saving weights to " << filename << ".";
+ hid_t diff_hid = H5Gcreate2(file_hid, "diff", H5P_DEFAULT, H5P_DEFAULT,
+     H5P_DEFAULT);
+ CHECK_GE(diff_hid, 0) << "Error saving weights to " << filename << ".";
+ for (int layer_id = 0; layer_id < layers_.size(); ++layer_id) {
+   const LayerParameter& layer_param = layers_[layer_id]->layer_param();
+   string layer_name = layer_param.name();
+   hid_t layer_data_hid = H5Gcreate2(data_hid, layer_name.c_str(),
+       H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
+   CHECK_GE(layer_data_hid, 0)
+       << "Error saving weights to " << filename << ".";
+   hid_t layer_diff_hid = H5Gcreate2(diff_hid, layer_name.c_str(),
jeffdonahue (Contributor)
Everything looks good, thanks Eric!
jeffdonahue added a commit that referenced this pull request Aug 7, 2015: fc77ef3
jeffdonahue merged commit fc77ef3 into BVLC:master Aug 7, 2015 (1 check passed)
My vote still goes to FlatBuffers as a natural Google successor to protobuf. But with this merge, HDF5 is now the de facto standard for Caffe models, and nobody replied about the evaluation process for a protobuf substitute.
This was referenced Aug 8, 2015
ronghanghu added a commit to ronghanghu/caffe that referenced this pull request Aug 9, 2015: b43e93b
ronghanghu referenced this pull request Aug 9, 2015: [Don't Merge] Rebase and Clean up Hdf5DataLayer Prefetch #2892 (Open)
ronghanghu added commits to ronghanghu/caffe that referenced this pull request:
Aug 9, 2015: 3d0a331, 1c53821, b00e872, 2b7c2e4, 70168ba
Aug 10, 2015: 11d0d74
This was referenced Aug 20, 2015
What about a Python interface for saving a net to HDF5? This can be useful for "net surgery".
To… However, I got this error:
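Setting the missing error message aside: until such an interface exists, one workaround for net surgery is to edit the saved snapshot directly with h5py. This is a hedged sketch, not a pycaffe API — it builds a stand-in file (the name, layer, and shapes are illustrative) and assumes the /data/&lt;layer&gt;/&lt;param_index&gt; layout introduced in this PR.

```python
import h5py
import numpy as np

# Stand-in snapshot; a real .caffemodel.h5 comes from Caffe's snapshotting.
with h5py.File("surgery.caffemodel.h5", "w") as f:
    grp = f.create_group("data/conv1")
    grp.create_dataset("0", data=np.ones((20, 1, 5, 5), dtype=np.float32))  # weights
    grp.create_dataset("1", data=np.ones((20,), dtype=np.float32))          # biases

# Hypothetical surgery: halve conv1's weights and zero its biases, in place.
with h5py.File("surgery.caffemodel.h5", "r+") as f:
    f["data/conv1/0"][...] *= 0.5  # read-modify-write of the whole dataset
    f["data/conv1/1"][...] = 0.0   # broadcast a scalar over the dataset

# Verify the edits persisted to the file.
with h5py.File("surgery.caffemodel.h5", "r") as f:
    new_w = float(f["data/conv1/0"][0, 0, 0, 0])
    new_b = float(f["data/conv1/1"][0])
print(new_w, new_b)  # 0.5 0.0
```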
This was referenced Feb 8, 2016
@shelhamer There seem to be some "hiccups" with snapshotting to the HDF5 format.
@shaibagon I'm not aware of any issue, so could you post an issue with details to reproduce the problem with Caffe master? I don't know anything about the OpenCV DNN package mentioned at that SO link. Please mention @erictzeng in the issue as the author of this PR.
bhack referenced this pull request in tiny-dnn/tiny-dnn Jul 22, 2016: quantization, bug fix in deconv and graph enet #206 (Closed)
myfavouritekk added a commit to myfavouritekk/caffe that referenced this pull request Sep 12, 2016: 7fe8f36
erictzeng commented Jul 30, 2015
This pull request enables Caffe to snapshot model weights and solver states to HDF5 files and makes this format the default. This format provides a number of advantages:
To avoid confusion with the old snapshotting methods, snapshotting to HDF5 files adopts new file extensions, namely .caffemodel.h5 and .solverstate.h5. When restoring either weights or solver history from a file, the extension of the file is checked. If the extension is .h5, it is loaded as an HDF5 file. All other extensions are treated as a binary protobuf file and loaded as before.

The default snapshot format is switched to HDF5 in this PR. If you prefer the old method, you can add snapshot_format: BINARYPROTO to your solver prototxt to restore binary protobuf snapshotting.

A few miscellaneous details:
- Snapshotting and restoring are covered by a TestSnapshot test for gradient-based solvers.
- The HDF5 helper functions in util/io.cpp have been moved out to their own file, util/hdf5.cpp, and additional helper functions have been added.
- The snapshotting interfaces of the Net and the Solver change, since we now have methods for both BinaryProto and HDF5. Everything in Caffe checks out, but downstream users who have implemented their own non-SGD solvers/solvers with nonstandard snapshotting may have a bad time.

Potential caveats:
- This changes hdf5_save_nd_dataset. Previously, said function always saved 4-D blobs. It has since been changed to save N-D blobs instead. This could potentially break people's workflows if they were relying on HDF5OutputLayers to output 4-D blobs.
- There aren't any tests that compare the loaded solver history.

Possible extensions
These extensions won't end up in this PR, but possible things to do after this wraps up: