This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-433] Tutorial on saving and loading gluon models #11002

Merged
merged 11 commits into from
Jun 7, 2018

Conversation

indhub
Contributor

@indhub indhub commented May 20, 2018

Description

Tutorial on saving and loading gluon models

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

@indhub indhub requested a review from szha as a code owner May 20, 2018 00:31
Contributor

@thomelane thomelane left a comment


Great reference! Just a few suggestions.

batch_size = 64

# Helper to preprocess data for training
def transform(data, label):
Contributor

Would be better to use the new transforms. Something like train_data = gluon.data.vision.MNIST(train=True).transform_first(transforms.ToTensor())

batch_size, shuffle=True)

# Build a simple convolutional network
def build_lenet(net):
Contributor

Why does this need to take an argument? Can't we just create net inside the function here?

Contributor Author

So that I can use this function to build either a Block or a HybridBlock. I'm building the network as a Block to demonstrate saving and loading parameters. I then build the network as a HybridBlock to demonstrate saving and loading both parameters and model architecture.

import numpy as np
```

## Build and train a simple model
Contributor

Setup: build and train a simple model

new_net.load_params(file_name, ctx=ctx)
```

Note that to do this, we need the definition of the network as Python code. If our network is [Hybrid](https://mxnet.incubator.apache.org/tutorials/gluon/hybrid.html), we can even save the network architecture into files and we won't need the network definition in a Python file to load the network. We'll see how to do it in the next section.
Contributor

Would make it a little more explicit that, if on a different machine, you'd need to import the same function and run it to create the same object before loading the params.


Model predictions: [1. 1. 4. 5. 0. 5. 7. 0. 3. 6.] <!--notebook-skip-line-->

## Saving model architecture and weights to file
Contributor

Saving model parameters AND architecture to file

So that it matches the title format above.


That's it! `export` in this case creates `lenet-symbol.json` and `lenet-0001.params` in the current directory.

## Loading saved model architecture and weights from a different frontend
Contributor

Loading model parameters AND architecture from file

So that it matches the title format above, as a major heading. Then create subheadings for 'from a different frontend' and 'from Python'.

# Saving and Loading Gluon Models

Training large models takes a lot of time, and it is a good idea to save the trained models to files to avoid training them again and again. There are a number of reasons to do this. For example, you might want to do inference on a machine that is different from the one where the model was trained. Sometimes a model's performance on the validation set decreases towards the end of training because of overfitting. If you saved your model parameters after every epoch, at the end you can decide to use the model that performs best on the validation set.

Contributor

Would introduce that reader will be looking at two methods: params only and params and architecture. And would try to mention somewhere which method is recommended for certain situations. Currently not really discussed.

Contributor Author

Done.

# Load the network architecture and parameters
sym, arg_params, aux_params = mx.model.load_checkpoint('lenet', 1)
# Create a Gluon Block using the loaded network architecture
deserialized_net = gluon.nn.SymbolBlock(outputs=sym, inputs=mx.sym.var('data'))
Contributor

Would be good to explain data here. Where does the name come from?

@@ -38,7 +38,7 @@ Select API:&nbsp;
* [Visual Question Answering](http://gluon.mxnet.io/chapter08_computer-vision/visual-question-answer.html) <img src="https://upload.wikimedia.org/wikipedia/commons/6/6a/External_link_font_awesome.svg" alt="External link" height="15px" style="margin: 0px 0px 3px 3px;"/>
* Practitioner Guides
* [Multi-GPU training](http://gluon.mxnet.io/chapter07_distributed-learning/multiple-gpus-gluon.html) <img src="https://upload.wikimedia.org/wikipedia/commons/6/6a/External_link_font_awesome.svg" alt="External link" height="15px" style="margin: 0px 0px 3px 3px;"/>
* [Checkpointing and Model Serialization (a.k.a. saving and loading)](http://gluon.mxnet.io/chapter03_deep-neural-networks/serialization.html) <img src="https://upload.wikimedia.org/wikipedia/commons/6/6a/External_link_font_awesome.svg" alt="External link" height="15px" style="margin: 0px 0px 3px 3px;"/>
* [Saving and Loading Models](/tutorials/gluon/save_load_params.html)
Contributor

Could we leave the Straight Dope article and add this as an alternative link? Unless it's misleading, in which case we should submit a PR for the Straight Dope.

# Create a Gluon Block using the loaded network architecture
deserialized_net = gluon.nn.SymbolBlock(outputs=sym, inputs=mx.sym.var('data'))
# Set the parameters
net_params = deserialized_net.collect_params()
Contributor

I recently learned you can load the parameters like this:

deserialized_net.collect_params().load('lenet-0001.params')

rather than:

net_params = deserialized_net.collect_params()
for param in arg_params:
    if param in net_params:
        net_params[param]._load_init(arg_params[param], ctx=ctx)
for param in aux_params:
    if param in net_params:
        net_params[param]._load_init(aux_params[param], ctx=ctx)
```


```python
# Load the network architecture and parameters
sym, arg_params, aux_params = mx.model.load_checkpoint('lenet', 1)
Contributor

Since we just need sym, I would suggest using:
sym = mx.sym.load_json(open('lenet-symbol.json', 'r').read())

@thomelane
Contributor

Checked your changes, @indhub. Good to be merged, @szha.

Contributor

@sandeep-krishnamurthy sandeep-krishnamurthy left a comment


LGTM.
A few nit comments. Will merge after the changes.
Thanks for your contribution.

@@ -0,0 +1,269 @@
# Saving and Loading Gluon Models

Training large models take a lot of time and it is a good idea to save the trained models to files to avoid training them again and again. There is a number of reasons to do this. For example, you might want to do inference on a machine that is different from the one where the model was trained. Sometimes model's performance on validation set decreases towards the end of the training because of overfitting. If you saved your model parameters after every epoch, at the end you can decide to use the model that performs best on the validation set.
Contributor

nit: There -are- a number of reasons..

Contributor

nit: another motivation would be to separate research from production, e.g. using the more research-friendly Python for training and Scala/C++ for production inference.

# Train a given model using MNIST data
def train_model(model):
# Initialize the parameters with Xavier initializer
net.collect_params().initialize(mx.init.Xavier(), ctx=ctx)
Contributor

model.collect_params?

net still works only because it is a global variable in your script.

Contributor Author

Good catch. thanks!

# Use cross entropy loss
softmax_cross_entropy = gluon.loss.SoftmaxCrossEntropyLoss()
# Use Adam optimizer
trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': .001})
Contributor

same as above. net->model

@sandeep-krishnamurthy sandeep-krishnamurthy merged commit 6ef7a0f into apache:master Jun 7, 2018
zheng-da pushed a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018
* Add tutorial to save and load parameters

* Add outputs in markdown

* Add image. Fix some formatting.

* Add tutorial to index. Add to tests.

* Minor language changes

* Add download notebook button

* Absorb suggestions for review

* Add as alternate link

* Use Symbol.load instead of model.load_checkpoint

* Add a note discouraging the use of Block.collect_params().save() if parameters need to be loaded with Block.load_params()

* Fix a bug. Also some language corrections.
XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this pull request Aug 29, 2018
* Add tutorial to save and load parameters

* Add outputs in markdown

* Add image. Fix some formatting.

* Add tutorial to index. Add to tests.

* Minor language changes

* Add download notebook button

* Absorb suggestions for review

* Add as alternate link

* Use Symbol.load instead of model.load_checkpoint

* Add a note discouraging the use of Block.collect_params().save() if parameters need to be loaded with Block.load_params()

* Fix a bug. Also some language corrections.