Symbolic .json file not compatible with .params file generated since MXNet 1.2 #11091

ThomasDelteil · 2018-05-29T22:31:18Z

Since MXNet 1.2.0, one way of serializing Gluon models no longer works.

Reproducible here:

import mxnet as mx
from mxnet import gluon
ctx = mx.cpu()

# Create network
net = gluon.nn.HybridSequential(prefix="test_")
with net.name_scope():
    net.add(gluon.nn.Conv2D(10, (3, 3)))
    net.add(gluon.nn.Dense(50))
net.initialize()
net(mx.nd.ones((1,1,50,50)))

# Save network
a = net(mx.sym.var('data'))
a.save('test.json')
net.save_params('test.params')

# Load network
net2 = gluon.nn.HybridSequential(prefix="test_")
with net2.name_scope():
    sym = mx.sym.load_json(open('test.json', 'r').read())
    net2.add(gluon.nn.SymbolBlock(outputs=sym, inputs=mx.sym.var('data')))
net2.load_params('test.params', ctx=ctx)

Gives the following error:

AssertionError: Parameter 'conv0_weight' is missing in file 'test.params', which contains parameters: '0.weight', '0.bias', '1.weight', '1.bias'. Set allow_missing=True to ignore missing parameters.

The same code worked in 1.1.0.
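To make the mismatch concrete, here is a minimal pure-Python sketch (no MXNet required) of the kind of name check that fails. `find_missing` is a hypothetical helper, not an MXNet function, and every expected name other than 'conv0_weight' is an assumption for illustration; the stored keys are copied from the error message.

```python
# Hypothetical helper for illustration; not part of MXNet.
def find_missing(expected, stored):
    """Return the expected parameter names that are absent from the stored keys."""
    stored = set(stored)
    return [name for name in expected if name not in stored]

# Names a symbol-based block would look up. Only 'conv0_weight' appears in
# the error message; the other three are assumed for illustration.
expected = ["conv0_weight", "conv0_bias", "dense0_weight", "dense0_bias"]

# Keys present in the 1.2.0 file, as listed in the error message.
stored = ["0.weight", "0.bias", "1.weight", "1.bias"]

missing = find_missing(expected, stored)
print(missing)
```

None of the name-based keys exist in the file, so the lookup fails on the first one it checks.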

This way of exporting the symbol is recommended in a tutorial in the Straight Dope book.

The currently recommended way, as described in an upcoming tutorial, is to hybridize the network and use the .export() function.

It looks like this and works in both 1.1.0 and 1.2.0:

import mxnet as mx
from mxnet import gluon
ctx = mx.cpu()

# Create network
net = gluon.nn.HybridSequential(prefix="test_")
with net.name_scope():
    net.add(gluon.nn.Conv2D(10, (3, 3)))
    net.add(gluon.nn.Dense(50))
net.initialize()

# Save network    
net.hybridize()
net(mx.nd.ones((1,1,50,50)))
net.export('test', epoch=0)

# Load network
sym = mx.sym.load_json(open('test-symbol.json', 'r').read())
net2 = gluon.nn.SymbolBlock(outputs=sym, inputs=mx.sym.var('data'))
net2.load_params('test-0000.params', ctx=ctx)

This affects people who have been using the first method until now.

@ifeherva reported this issue is affecting his team.

@piiswrong @marcoabreu @szha


chinakook · 2018-05-30T14:50:17Z

Using Module's save_checkpoint or Gluon's export would be OK. Mixing Gluon's save with Symbol's save is a bad idea.

anirudhacharya · 2018-06-04T21:20:19Z

@nswamy please label: "Breaking", "Bug", "Gluon"

ThomasDelteil · 2018-06-04T21:57:38Z

One way I think we could fix this issue would be to make save_params(filename, format='named_params') the default,
with an option such as save_params(filename, format='numbered_params') that uses the new behaviour, and then switch the default in 2.0.0.
What do you think, @piiswrong?
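For illustration, a sketch of what such a `format` switch could look like. This is plain Python, not MXNet code: `build_save_dict` and the example triples are hypothetical, the parameter names are assumed, and the argument is called `format` only to mirror the proposal (it shadows the Python builtin inside the function).

```python
# Hypothetical sketch of the proposed option; not an MXNet API.
def build_save_dict(params, format="named_params"):
    """Build the key -> value mapping that would be written to disk.

    params is a list of (structural_key, full_name, value) triples.
    """
    if format == "named_params":
        # Pre-1.2 style: keys are full parameter names.
        return {full_name: value for _, full_name, value in params}
    if format == "numbered_params":
        # 1.2.0 style: keys follow the position of each child block.
        return {structural: value for structural, _, value in params}
    raise ValueError("unknown format: %r" % (format,))

# Illustrative parameters; the names and values are assumptions.
params = [
    ("0.weight", "conv0_weight", 1.0),
    ("0.bias", "conv0_bias", 0.0),
]

named = build_save_dict(params)
numbered = build_save_dict(params, format="numbered_params")
```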

wikier · 2018-06-06T02:11:35Z

SGTM

ThomasDelteil · 2018-06-13T18:20:31Z

A more in-depth analysis written by @piiswrong about the cause of the issue and possible solutions:

Background
Gluon provides the save_params API, which saves the model parameters (but not the model definition) as a binary file ‘xxx.params’. It can be loaded back with the load_params API. The saved file is intended to be opaque, but you can load it with mx.nd.load and inspect its contents (we don’t advertise this).
Gluon also provides the export API for saving a Gluon model definition (.json) and parameters (.params), which can be loaded with MXNet Module or other language bindings.
Gluon provides the SymbolBlock API, which can load a .json model definition and a .params parameter file, but an import helper for this functionality is missing.

The change
We changed the internal structure of the .params file saved by save_params to resolve a bug. Parameters saved by previous versions can still be loaded in new version.

The complaint from user
A user saved the model definition and model parameters with mx.sym.save_json and save_params, following the Straight Dope book. Because the book doesn’t show how to load them back, the user invented a hacky way to load them into SymbolBlock with load_params. The user's code broke after upgrading from 1.1 to 1.2.

The cause
User's hack depended on internal similarities between .params files saved by save_params and export. After the change of save_params format, this hack stopped working.

Faults on our part

The straight dope book should have recommended saving model definition with export instead of mx.sym.save_json and save_params.
We should have provided an import utility so that user doesn’t need to invent their own hacks to load model definition into SymbolBlock.
(?) We changed the file format saved by save_params. Although it is intended as an opaque binary whose format is not defined in documentation, some customers could be depending on undefined behavior.

Solutions

1. Revert save_params to previous format. Add new API save_parameters for new format.

Pros

User relying on internal format of save_params won’t see breakage.

Cons

All users need to manually migrate to new API save_parameters
Having both save_params and save_parameters could be confusing.

2. Issue warnings and error messages to instruct users to move to `export` and `import`, and stop depending on undefined behavior

Pros

Most users won’t see breakage and won’t need to do anything.

Cons

Users depending on save_params’s internal format will see breakage.

For both solutions we can add more documentation and helper API to minimize impact.
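As a rough illustration of solution 1 combined with a warning, the sketch below keeps the old key layout under `save_params` while pointing users at `save_parameters`. These are plain Python stand-ins, not the MXNet implementation: the functions return dictionaries instead of writing files, and the triples and the warning text are assumptions.

```python
import warnings

# Each entry: (structural key, full parameter name, value). Illustrative only.
PARAMS = [("0.weight", "conv0_weight", 1.0), ("0.bias", "conv0_bias", 0.0)]

def save_parameters(params):
    """Stand-in for the new API: keys follow the block structure."""
    return {structural: value for structural, _, value in params}

def save_params(params):
    """Stand-in for the old API: keeps name-based keys and warns the caller."""
    warnings.warn(
        "save_params is deprecated; use save_parameters, or export "
        "if the model will be loaded through a symbol file.",
        DeprecationWarning,
        stacklevel=2,
    )
    return {name: value for _, name, value in params}

# Record warnings so the deprecation notice can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = save_params(PARAMS)

new_style = save_parameters(PARAMS)
```

Code that depends on the name-based keys keeps working, while every call to the old function tells the user where to migrate.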

Current open PRs related to this issue: #11236 #11127 #11210

reminisce added Gluon Breaking labels Jun 8, 2018

ThomasDelteil mentioned this issue Jun 8, 2018

[MXNET-532] Clarify documentation of save_parameters(), load_parameters() #11210

Merged

ThomasDelteil closed this as completed Jun 26, 2018

srochel mentioned this issue Jul 5, 2018

1.2.1 release notes #11478

Merged

7 tasks

piyushghai mentioned this issue Aug 6, 2018

Backwards compatibility checker: Parameter 'model.1._unfused.0.l_cell.i2h_bias' is missing in file 'lstm_gluon_save_parameters_api-params' #12046

Closed
