Saving all trained params in a single file #7722

Closed
sidgoyal78 opened this issue Jan 22, 2018 · 6 comments

@sidgoyal78
Contributor

Merging all params into a single file

For inference, we will need to have 2 files: one for the programDesc and one that has all the params together. We look at one approach to do this.

Understanding save/load ops (C++ side)

  • The model_format design doc gives some details in a table, but it is not entirely clear, so we will look at the implementation.

To understand the current serialization, we look at save_op:

  • In save_op, the main work is performed by SerializeToStream(<ofstream>, <framework::LoDTensor>, ...) (code). This function saves a version number, the size of the LoD, and the actual LoD data.

  • Then it calls SerializeToStream(<ofstream>, <Tensor>, ...) (code). This function saves a version number, the tensor description as a serialized protobuf, and the actual data.

The corresponding load_op performs the deserialization accordingly (respecting the ordering in save_op). A sketch of the resulting stream layout is below.
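To make the byte ordering concrete, here is a minimal Python sketch of the stream layout described above. The exact field widths and version values are assumptions for illustration; the authoritative layout is the linked SerializeToStream code.

```python
import struct

# Illustrative sketch of the save_op stream layout; the field widths here
# (uint32 versions, uint64 lengths) are assumptions, not the exact format.
def serialize_lod_tensor(f, lod, desc_bytes, data_bytes):
    f.write(struct.pack('<I', 0))                    # LoDTensor version number
    f.write(struct.pack('<Q', len(lod)))             # size of the LoD (number of levels)
    for level in lod:                                # actual LoD data, level by level
        f.write(struct.pack('<Q', 8 * len(level)))   # byte size of this level
        f.write(struct.pack('<%dQ' % len(level), *level))
    # The inner SerializeToStream(<ofstream>, <Tensor>, ...) part:
    f.write(struct.pack('<I', 0))                    # Tensor version number
    f.write(struct.pack('<i', len(desc_bytes)))      # size of serialized TensorDesc
    f.write(desc_bytes)                              # TensorDesc protobuf bytes
    f.write(data_bytes)                              # actual tensor data
```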

Understanding how a model is saved (python api)

Now we look at how save/load works for saving actual model params, via the implementation of save_vars in fluid (code). We see that a new program is created, a save op is appended for each variable that is persistable, and then the executor runs this program. A rough sketch of this pattern follows.
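A paraphrased sketch of that pattern, assuming the usual fluid imports (this is not the exact fluid source; names such as save_persistable_vars are made up here):

```python
import os
from paddle.fluid import framework

def save_persistable_vars(executor, dirname, main_program):
    # Build a fresh program with one save op per persistable variable,
    # then run it once on the executor (paraphrase of fluid's save_vars).
    save_program = framework.Program()
    save_block = save_program.global_block()
    for var in main_program.list_vars():
        if var.persistable:
            # clone the var into the new program's block, then save it
            new_var = save_block.create_var(
                name=var.name, shape=var.shape, dtype=var.dtype,
                persistable=True)
            save_block.append_op(
                type='save',
                inputs={'X': [new_var]},
                outputs={},
                attrs={'file_path': os.path.join(dirname, new_var.name)})
    executor.run(save_program)
```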

Approach

We basically make two assumptions:

  • For both load/save, the order of iterating over the variables is the same. (This should hopefully be true)
  • We don't worry about the overwrite option which is in save_op.

While saving:

  • In addition to the serialized bytes produced by the original save, we store a uint64_t number that tells us the size of the serialized LoDTensor in bytes.

  • When save is called for the first time, we create a file and build a string holding the serialized LoDTensor data. We first store the size of this string as a fixed-width (uint64_t) number, and then store the string itself.

  • When save is called later, we go to the end of the file and append the same 2 things: the size of the string and the string itself (see the sketch below).
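A minimal sketch of the saving side, using a hypothetical new_save helper written in Python (the real thing would be a C++ op kernel):

```python
import struct

def new_save(filepath, counter, serialized):
    # Hypothetical helper mirroring the proposed op: counter == 0 starts a
    # fresh file, later calls append size-prefixed chunks at the end.
    mode = 'wb' if counter == 0 else 'ab'
    with open(filepath, mode) as f:
        f.write(struct.pack('<Q', len(serialized)))  # uint64_t: chunk size in bytes
        f.write(serialized)                          # serialized LoDTensor bytes
```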

While loading:

  • We pass an additional attribute in order to load the correct chunk for each parameter: a counter value (counting from 0) that gives the relative order of the different params.

  • With this counter and the extra size information that we stored, we can hop to the appropriate part of the file, read the chunk, and deserialize it (see the sketch below).
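A matching sketch of the loading side (new_load is hypothetical, mirroring new_save above):

```python
import os
import struct

def new_load(filepath, counter):
    # Hypothetical counterpart of new_save: skip `counter` size-prefixed
    # chunks, then return the bytes of the chunk we want.
    with open(filepath, 'rb') as f:
        for _ in range(counter):                 # hop over earlier params
            size, = struct.unpack('<Q', f.read(8))
            f.seek(size, os.SEEK_CUR)
        size, = struct.unpack('<Q', f.read(8))   # size of the target chunk
        return f.read(size)                      # bytes to deserialize into a LoDTensor
```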

For implementation, I think it will be better to have another op for this (rather than replacing the original save_op/load_op), so that it is easier to debug; also, I don't know the details of how load_op and save_op are used in the distributed version as of now.

@sidgoyal78 sidgoyal78 self-assigned this Jan 22, 2018
@sidgoyal78 sidgoyal78 added the 预测 (Inference; originally named Inference, covers C-API inference issues, etc.) label Jan 22, 2018
@sidgoyal78 sidgoyal78 moved this from TODO to DOING in Inference Framework Jan 22, 2018
@sidgoyal78
Contributor Author

I think we can even handle the "overwrite" file case using the counter for the save op: if the counter is 0, the file exists, and overwrite = false, we abort; but if the counter is 0, the file exists, and overwrite = true, we create a new file (open the file in write mode). A sketch:
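Extending the hypothetical new_save sketch from above with that check:

```python
import os
import struct

def new_save(filepath, counter, serialized, overwrite=False):
    # Honor `overwrite` only on the first call; later calls always append.
    if counter == 0 and os.path.exists(filepath) and not overwrite:
        raise RuntimeError('%s exists and overwrite is false' % filepath)
    mode = 'wb' if counter == 0 else 'ab'
    with open(filepath, mode) as f:
        f.write(struct.pack('<Q', len(serialized)))
        f.write(serialized)
```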

@Xreki
Contributor

Xreki commented Jan 22, 2018

Let me take an example to explain the saving format as I understand it.

If there are two parameters, fc0.w0 and fc0.b0, they will be saved in a file as:

uint64_t: size of fc0.w0
string: serialized LoDTensor data of fc0.w0
uint64_t: size of fc0.b0
string: serialized LoDTensor data of fc0.b0

Am I right? I think we may need another string to record the parameter's name.

string: fc0.w0
uint64_t: size of fc0.w0
string: serialized LoDTensor data of fc0.w0
string: fc0.b0
uint64_t: size of fc0.b0
string: serialized LoDTensor data of fc0.b0

For implementation, I think it will be better to have another op for this (rather than replacing the original save_op/load_op), so that it is easier to debug; also, I don't know the details of how load_op and save_op are used in the distributed version as of now.

In the first implementation, we may fill the parameter's Tensor in the Load() function directly.

@sidgoyal78
Contributor Author

sidgoyal78 commented Jan 22, 2018

Yes, you are right. I think the 'name' of the param is not currently stored. From the implementation (ref), we see only the TensorDesc is stored. So it should be merged with the serialized LoDTensor data, and then the size should be generated (see comment below).

So I am thinking of it more like this:

uint64_t: size of fc0.w0 + size of string fc0.w0 + 1
string: lengthof(fc0.w0) + fc0.w0 + serialized LoDTensor data of fc0.w0
uint64_t: size of fc0.b0 + size of string fc0.b0 + 1
string: lengthof(fc0.b0) + fc0.b0 + serialized LoDTensor data of fc0.b0

(Otherwise, we won't know the size of the string fc0.w0/fc0.b0 beforehand, so we need to merge it with the serialization of the LoDTensor and then generate the size.) A rough sketch of this packing is below.
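A sketch of that packing in Python; the layout details (e.g. the width of the name-length field) are illustrative, not final:

```python
import struct

def pack_param(name, tensor_bytes):
    # Merge the name into the chunk, then length-prefix the whole blob,
    # as proposed above.
    name_bytes = name.encode('utf-8')
    blob = struct.pack('<I', len(name_bytes)) + name_bytes + tensor_bytes
    return struct.pack('<Q', len(blob)) + blob   # uint64_t total size, then blob
```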

@sidgoyal78
Contributor Author

sidgoyal78 commented Jan 22, 2018

@Xreki: I think the name isn't required; it is obtained from the programDesc and passed accordingly (code). And since we are storing the programDesc as a protobuf, I don't see the need to store the name again (provided the ordering when iterating through load_vars and save_vars remains the same, i.e., assumption 1 holds true).

For a concrete example, suppose we have fc1, b1, fc2, b2. The order will be the same whether we call load_vars or save_vars.

We iterate over this list, and pass an additional counter for save and load:

While saving:

  1. we call new_save with counter = 0, and a filepath
uint64_t: 340 (for example)
340 bytes of fc1 data and its desc
  2. we call new_save with counter = 1, and the same filepath
    We go to the end of the file and append the next chunk, giving:
uint64_t: 340 (for example)
340 bytes of fc1 data and its desc
uint64_t: 70 (for example)
70 bytes of b1 data and its desc

.
.
.
we finally get:

uint64_t: 340 (for example)
340 bytes of fc1 data and its desc
uint64_t: 70 (for example)
70 bytes of b1 data and its desc
uint64_t: 640 (for example)
640 bytes of fc2 data and its desc
uint64_t: 80 (for example)
80 bytes of b2 data and its desc

Now while loading we proceed as:

  1. we call new_load with counter = 0, with op: fc1, and the above filepath
    Since counter = 0, we don't hop at all; we read the first uint64_t, then read the string and convert it into fc1.
uint64_t: 340 (for example)                  <-- we proceed till here
340 bytes of fc1 data and its desc
uint64_t: 70 (for example)
70 bytes of b1 data and its desc
uint64_t: 640 (for example)
640 bytes of fc2 data and its desc
uint64_t: 80 (for example)
80 bytes of b2 data and its desc

Now we read the uint64_t to get the size of fc1's chunk, read that many bytes, and deserialize to obtain fc1.

  2. we call new_load with counter = 1, with op: b1, and the above filepath
    Now, since counter = 1, we hop once, going from the first uint64_t to the next one:
uint64_t: 340 (for example)
340 bytes of fc1 data and its desc
uint64_t: 70 (for example)                   <-- we proceed till here
70 bytes of b1 data and its desc
uint64_t: 640 (for example)
640 bytes of fc2 data and its desc
uint64_t: 80 (for example)
80 bytes of b2 data and its desc

Now we read the uint64_t to get the size of b1's chunk, read that many bytes, and deserialize to obtain b1.

.
.
.

  4. we call new_load with counter = 3, with op: b2, and the above filepath
    Now, since counter = 3, we hop three times, going from the first uint64_t to the fourth one:
uint64_t: 340 (for example)
340 bytes of fc1 data and its desc
uint64_t: 70 (for example)
70 bytes of b1 data and its desc
uint64_t: 640 (for example)
640 bytes of fc2 data and its desc
uint64_t: 80 (for example)                         <-- we proceed till here
80 bytes of b2 data and its desc

Now we read the uint64_t to get the size of b2's chunk, read that many bytes, and deserialize to obtain b2. A round-trip sketch using the helpers from above follows.
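Putting the hypothetical new_save/new_load sketches from the earlier comments together (assuming both are in scope and the file does not yet exist), a round trip over these four example params would look like:

```python
# The byte strings stand in for "serialized LoDTensor data and its desc".
params = [('fc1', b'\x00' * 340), ('b1', b'\x01' * 70),
          ('fc2', b'\x02' * 640), ('b2', b'\x03' * 80)]
for counter, (name, blob) in enumerate(params):
    new_save('model.all_params', counter, blob)   # counter 0 creates the file
for counter, (name, blob) in enumerate(params):
    assert new_load('model.all_params', counter) == blob
```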

@Xreki
Contributor

Xreki commented Jan 23, 2018

OK. The design is based on ordering: we need to make sure the loading order is exactly the same as the saving order.

@Xreki
Contributor

Xreki commented Feb 9, 2018

Fixed by #7995 and #7909

@Xreki Xreki closed this as completed Feb 9, 2018
@Xreki Xreki moved this from Basic Usage (DOING) to Basic Usage (DONE) in Inference Framework Feb 9, 2018