Added a tutorial for serving ONNX model with MXNet Model Server #36

Merged
merged 1 commit into from May 8, 2018
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,4 +1,6 @@
.ipynb_checkpoints
/tutorials/output/*
/tutorials/output/*/
.DS_Store
.idea/
!/tutorials/output/README.md
1 change: 1 addition & 0 deletions README.md
@@ -19,6 +19,7 @@
* [Converting SuperResolution model from PyTorch to Caffe2 and deploying on mobile device](tutorials/PytorchCaffe2SuperResolution.ipynb)
* [Transferring SqueezeNet from PyTorch to Caffe2 and to Android app](tutorials/PytorchCaffe2MobileSqueezeNet.ipynb)
* [Serving PyTorch Models on AWS Lambda with Caffe2 & ONNX](https://machinelearnings.co/serving-pytorch-models-on-aws-lambda-with-caffe2-onnx-7b096806cfac)
* [Serving ONNX models with MXNet Model Server](tutorials/ONNXMXNetServer.ipynb)

## ONNX tools

222 changes: 222 additions & 0 deletions tutorials/ONNXMXNetServer.ipynb
@@ -0,0 +1,222 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Serving ONNX model with MXNet Model Server\n",
"\n",
"This tutorial dmeonstrates how to serve an ONNX model with MXNet Model Server. We'll use a pre-trained SqueezeNet model from ONNX model zoo. The same example can be easily applied to other ONNX models.\n",
"\n",
"Frameworks used in this tutorial:\n",
"* [MXNet Model Server](https://github.com/awslabs/mxnet-model-server) that uses [MXNet](http://mxnet.io)\n",
"* [ONNX](https://onnx.ai)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installing pre-requisites:\n",
"Next we'll install the pre-requisites:\n",
"* [ONNX](https://github.com/onnx/onnx)\n",
"* [MXNetModelServer](https://github.com/awslabs/mxnet-model-server)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!conda install -y -c conda-forge onnx\n",
"!pip install mxnet-model-server"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Downloading a pre-trained ONNX model\n",
"\n",
"Let's go ahead and download a aqueezenet onnx model into a new directory."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"!mkdir squeezenet\n",
"%cd squeezenet\n",
"!curl -O https://s3.amazonaws.com/model-server/models/onnx-squeezenet/squeezenet.onnx"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Inspecting the ONNX model\n",
"\n",
"Let's make sure the exported ONNX model is well formed"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import onnx\n",
"\n",
"# Load the ONNX model\n",
"model = onnx.load(\"squeezenet.onnx\")\n",
"\n",
"# Check that the IR is well formed, identified issues will error out\n",
"onnx.checker.check_model(model)"
]
},
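{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an optional extra check (not required for serving), we can also print a human-readable summary of the model's graph using ONNX's helper API:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Print a human-readable summary of the model's graph\n",
"print(onnx.helper.printable_graph(model.graph))"
]
},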
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"## Packaging the ONNX model for serving with MXNet Model Server (MMS)\n",
"\n",
"To serve the ONNX model with MMS, we will first need to prepare some files that will be bundled into a **Model Archive**. \n",
"The Model Archive containes everything MMS needs to setup endpoints and serve the model. \n",
"\n",
"The files needed are:\n",
"* squeezenet.onnx - the ONNX model file\n",
"* signature.json - defining the input and output of the model\n",
"* synset.txt - defining the set of classes the model was trained on, and returned by the model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Let's go ahead and download the files we need:\n",
"!curl -O https://s3.amazonaws.com/model-server/models/onnx-squeezenet/signature.json\n",
"!curl -O https://s3.amazonaws.com/model-server/models/onnx-squeezenet/synset.txt"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Let's take a peek into the **signature.json** file:\n",
"!cat signature.json"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Let's take a peek into the synset.txt file:\n",
"!head -n 10 synset.txt"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Let's package everything up into a Model Archive bundle\n",
"!mxnet-model-export --model-name squeezenet --model-path .\n",
"!ls -l squeezenet.model"
]
},
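{
"cell_type": "markdown",
"metadata": {},
"source": [
"Optionally, we can peek inside the bundle. This is a minimal sketch that assumes the `.model` archive is a standard zip file:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# List the files bundled into the Model Archive (assuming it is a zip under the hood)\n",
"import zipfile\n",
"\n",
"with zipfile.ZipFile(\"squeezenet.model\") as archive:\n",
"    print(\"\\n\".join(archive.namelist()))"
]
},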
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Serving the Model Archive with MXNet Model Server\n",
"Now that we have the **Model Archive**, we can start the server and ask it to setup HTTP endpoints to serve the model, emit real-time operational metrics and more.\n",
"\n",
"We'll also test the server's endpoints."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Spawning a new process to run the server\n",
"import subprocess as sp\n",
"server = sp.Popen(\"mxnet-model-server --models squeezenet=squeezenet.model\", shell=True)"
]
},
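{
"cell_type": "markdown",
"metadata": {},
"source": [
"The server takes a few seconds to start, so before hitting its endpoints it helps to wait until it is responsive. A small sketch that polls the health endpoint, assuming the default port 8080:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Poll the /ping endpoint until the server responds (or give up after ~30s)\n",
"import time\n",
"import urllib.request\n",
"\n",
"for _ in range(30):\n",
"    try:\n",
"        urllib.request.urlopen(\"http://127.0.0.1:8080/ping\", timeout=1)\n",
"        print(\"Server is up\")\n",
"        break\n",
"    except OSError:\n",
"        time.sleep(1)"
]
},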
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Check out the health endpoint\n",
"!curl http://127.0.0.1:8080/ping"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Download an image Trying out image classification\n",
"!curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg\n",
"!curl -X POST http://127.0.0.1:8080/squeezenet/predict -F \"input_0=@kitten.jpg\""
]
},
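{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same prediction can also be made from Python. A minimal sketch using the `requests` library (assumed to be installed in the environment):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# POST the image to the prediction endpoint from Python instead of curl\n",
"import requests\n",
"\n",
"with open(\"kitten.jpg\", \"rb\") as f:\n",
"    response = requests.post(\n",
"        \"http://127.0.0.1:8080/squeezenet/predict\",\n",
"        files={\"input_0\": f},\n",
"    )\n",
"print(response.json())"
]
},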
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Lastly, we'll terminate the server\n",
"server.terminate()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}