# Amazon's Deep Learning Library (MXNet)


![alt text](https://image.slidesharecdn.com/mxnet-170323180455/95/a-deeper-dive-into-apache-mxnet-march-2017-aws-online-tech-talks-21-638.jpg?cb=1490292333 "Logo Title Text 1")


## Features


![alt text](https://pat-7ekbwyq9jmy.netdna-ssl.com/wp-content/uploads/2017/06/Mxnet-1000x380.jpg "Logo Title Text 1")

## TensorFlow vs MXNet?

![alt text](http://www.svds.com/wp-content/uploads/2017/02/Deep_learning_ratings_final-1024x563.png "Logo Title Text 1")

![alt text](https://media.amazonwebservices.com/blog/2017/AI%20MxNet%20Blog-01-Header%20Pic.png "Logo Title Text 1")

![alt text](https://2.bp.blogspot.com/-Lzy5Vi7Hj0U/VstoM62XkFI/AAAAAAABTLg/AWgqwtm4BbI/s1600/Screen%2BShot%2B2016-02-22%2Bat%2B9.56.26%2BPM.png "Logo Title Text 1")

## Imperative API

![alt text](https://image.slidesharecdn.com/mxnet-170323180455/95/a-deeper-dive-into-apache-mxnet-march-2017-aws-online-tech-talks-17-638.jpg?cb=1490292333 "Logo Title Text 1")

![alt text](https://image.slidesharecdn.com/mxnet-170323180455/95/a-deeper-dive-into-apache-mxnet-march-2017-aws-online-tech-talks-18-638.jpg?cb=1490292333 "Logo Title Text 1")

![alt text](https://image.slidesharecdn.com/mxnet-170323180455/95/a-deeper-dive-into-apache-mxnet-march-2017-aws-online-tech-talks-19-638.jpg?cb=1490292333 "Logo Title Text 1")

## Installation

sudo -H pip install mxnet  --upgrade


## The NDArray API

- The NDArray is the core data structure for all mathematical computations.
- An NDArray represents a multidimensional, fixed-size, homogeneous array, just like numpy.ndarray.
- It enables imperative computation.
- It executes code lazily, allowing it to automatically parallelize multiple operations across the available hardware.
- Training and running neural networks involves a lot of math operations. Multidimensional arrays are how we’ll store our data.

![alt text](https://cdn-images-1.medium.com/max/1600/1*D6pp5hTfUl8FmzYwJ3N8LQ.png "Logo Title Text 1")

is

![alt text](https://cdn-images-1.medium.com/max/1600/1*Ct7GN4a5gqONNUFV_qMu_A.png "Logo Title Text 1")

In [1]:
import mxnet as mx
a = mx.nd.array([[1,2,3], [4,5,6]])

In [40]:
a.size

6

In [41]:
a.shape

(2, 3)

By default, an NDArray holds 32-bit floats, but we can customize that.

In [42]:
import numpy as np
b = mx.nd.array([[1,2,3], [4,5,6]], dtype=np.int32)
b.dtype

numpy.int32

Printing an NDArray is as easy as this.

In [43]:
b.asnumpy()

array([[1, 2, 3],
       [4, 5, 6]], dtype=int32)

All the math operators you’d expect are available. Let’s try an element-wise matrix multiplication.

In [44]:
a = mx.nd.array([[1,2,3], [4,5,6]])
b = a*a
b.asnumpy()

array([[  1.,   4.,   9.],
       [ 16.,  25.,  36.]], dtype=float32)

How about a proper matrix multiplication (aka ‘dot product’)?

In [46]:
a = mx.nd.array([[1,2,3], [4,5,6]])
b = a.T

c = mx.nd.dot(a,b)
c.asnumpy()

array([[ 14.,  32.],
       [ 32.,  77.]], dtype=float32)
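
To make the difference between the two products concrete, here is the arithmetic behind the result above, written out by hand. With A the 2 x 3 matrix from the example, each entry of the matrix product is the dot product of two rows of A:

```latex
A A^{\mathsf{T}}
= \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}
  \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}
= \begin{pmatrix}
    1\cdot 1 + 2\cdot 2 + 3\cdot 3 & 1\cdot 4 + 2\cdot 5 + 3\cdot 6 \\
    4\cdot 1 + 5\cdot 2 + 6\cdot 3 & 4\cdot 4 + 5\cdot 5 + 6\cdot 6
  \end{pmatrix}
= \begin{pmatrix} 14 & 32 \\ 32 & 77 \end{pmatrix}
```

The element-wise product, by contrast, keeps the 2 x 3 shape and simply squares each entry.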

Let’s try something a little more complicated:

- initialize a 1000 x 1000 matrix with a uniform distribution, stored on CPU#0 
- initialize another 1000 x 1000 matrix with a normal distribution (mean of 1 and standard deviation of 2), also on CPU#0.

In [47]:
c = mx.nd.uniform(low=0, high=1, shape=(1000,1000), ctx="cpu(0)")
d = mx.nd.normal(loc=1, scale=2, shape=(1000,1000), ctx="cpu(0)")
e = mx.nd.dot(c,d)

## The Symbol API

![alt text](https://cdn-images-1.medium.com/max/1600/1*h0M4n_9FPyriCwT-LjE0HQ.png "Logo Title Text 1")

- What A,B,C and D are is irrelevant at this point. They are symbols.
- No matter what the inputs are (integers, vectors, matrices, etc.), this graph tells us how to compute the output value — provided that operations “+” and “*” are defined.
- This graph also tells us that (A*B) and (C*D) can be computed in parallel.

In [52]:
a = mx.symbol.Variable('A')
b = mx.symbol.Variable('B')
c = mx.symbol.Variable('C')
d = mx.symbol.Variable('D')
e = (a*b)+(c*d)

We can assign a result to e without knowing what a, b, c and d are. Let’s keep going.

In [53]:
(a,b,c,d)

(<Symbol A>, <Symbol B>, <Symbol C>, <Symbol D>)

In [54]:
e

<Symbol _plus2>

In [55]:
type(e)

mxnet.symbol.symbol.Symbol

a, b, c and d are symbols which we explicitly declared. e is different: it is a symbol as well, but one that is the result of a ‘+’ operation. Let’s try to learn more about e.

In [56]:
e.list_arguments()

['A', 'B', 'C', 'D']

In [57]:
e.list_outputs()

['_plus2_output']

What this tells us is that:

- e depends on variables a, b, c and d,
- the operation that computes e is a sum,
- e is indeed (a*b)+(c*d).

Of course, we can do much more with symbols than ‘+’ and ‘*’. Just like for NDArrays, a lot of operations are defined (math, formatting, etc.).

- Applying computing steps defined with Symbols to data stored in NDArrays requires an operation called ‘binding’.
- Let’s set ‘A’ to 1, ‘B’ to 2, ‘C’ to 3 and ‘D’ to 4, so I’m creating 4 NDArrays, each containing a single integer.

In [20]:
import numpy as np
a_data = mx.nd.array([1], dtype=np.int32)
b_data = mx.nd.array([2], dtype=np.int32)
c_data = mx.nd.array([3], dtype=np.int32)
d_data = mx.nd.array([4], dtype=np.int32)

- Next, I’m binding each NDArray to its corresponding Symbol. 

In [59]:
executor=e.bind(mx.cpu(), {'A':a_data, 'B':b_data, 'C':c_data, 'D':d_data})

- It’s time to let our input data flow through the graph to get a result.
- The forward() function gets things going.
- It returns a list of NDArrays, because a graph can have multiple outputs.
- Here, we have a single output holding the value ‘14’, which is reassuringly equal to (1*2)+(3*4).

In [61]:
executor
e_data = executor.forward()
e_data

[
 [14]
 <NDArray 1 @cpu(0)>]

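This clean separation between data and computation aims to give us the best of both worlds: NDArrays keep data handling imperative and easy to inspect, while Symbols describe the computation as a graph that can be optimized before it runs.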
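
The declare, bind, forward flow above can be mimicked in a few lines of pure Python. This is only a toy sketch to show the idea: the `Sym` class and its methods are hypothetical names made up for illustration, not MXNet's implementation, and it handles nothing beyond ‘+’ and ‘*’.

```python
# Toy stand-in for a symbolic graph; class and method names are hypothetical,
# not MXNet's implementation.
import operator


class Sym:
    """A node in an expression graph: either a named variable or an operation."""

    def __init__(self, name=None, op=None, inputs=()):
        self.name = name      # set for variables only
        self.op = op          # binary function for operation nodes
        self.inputs = inputs  # child nodes

    def __add__(self, other):
        return Sym(op=operator.add, inputs=(self, other))

    def __mul__(self, other):
        return Sym(op=operator.mul, inputs=(self, other))

    def list_arguments(self):
        """Return variable names in depth-first order, without duplicates."""
        if self.op is None:
            return [self.name]
        names = []
        for child in self.inputs:
            for n in child.list_arguments():
                if n not in names:
                    names.append(n)
        return names

    def forward(self, bindings):
        """Evaluate the graph once every variable has a value."""
        if self.op is None:
            return bindings[self.name]
        left, right = (child.forward(bindings) for child in self.inputs)
        return self.op(left, right)


a, b, c, d = (Sym(name=n) for n in "ABCD")
e = (a * b) + (c * d)           # nothing is computed here, only the graph is built

print(e.list_arguments())       # ['A', 'B', 'C', 'D']
result = e.forward({"A": 1, "B": 2, "C": 3, "D": 4})
print(result)                   # 14
```

Because the graph exists before any data does, a real engine is free to inspect it first, for example to notice that the two multiplications do not depend on each other.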

## The Module API - Building and training neural networks


Our (imaginary) data set is composed of 1000 data samples:

- Each sample has 100 features.
- A feature is represented by a float value between 0 and 1.
- Samples are split in 10 categories. The purpose of the network will be to predict the correct category for a given sample.
- We’ll use 800 samples for training and 200 samples for validation.
- We’ll use a batch size of 10 for training and validation.

In [None]:
import mxnet as mx
import numpy as np
import logging


logging.basicConfig(level=logging.INFO)
sample_count = 1000
train_count = 800
valid_count = sample_count - train_count
feature_count = 100
category_count = 10
batch=10

In [67]:
#Let’s use a uniform distribution to generate the 1000 samples. 
#They are stored in an NDArray named ‘X’: 1000 lines, 100 columns.
X = mx.nd.uniform(low=0, high=1, shape=(sample_count, feature_count))
X.shape

(1000, 100)

In [68]:
#The categories for these 1000 samples are represented as 
#integers in the 0–9 range. They are randomly generated and stored in an NDArray named ‘Y’.
Y = mx.nd.empty((sample_count,))
# range(sample_count) covers every index; mx.nd.empty leaves unset entries uninitialized
for i in range(sample_count):
  Y[i] = np.random.randint(0,category_count)


In [69]:
#split the data (the end index of crop is exclusive, so feature_count keeps all 100 columns)
X_train = mx.nd.crop(X, begin=(0,0), end=(train_count,feature_count))
X_valid = mx.nd.crop(X, begin=(train_count,0), end=(sample_count,feature_count))
Y_train = Y[0:train_count]
Y_valid = Y[train_count:sample_count]
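
As a sanity check on the shapes we expect from an 800/200 split, here is the same data set sketched in plain NumPy rather than MXNet. The seeded generator `rng` is an assumption added for repeatability. Slice bounds are half-open, so the upper bound is never included.

```python
import numpy as np

sample_count, train_count, feature_count, batch = 1000, 800, 100, 10

rng = np.random.default_rng(0)                      # fixed seed, for repeatability only
X = rng.uniform(0.0, 1.0, size=(sample_count, feature_count))
Y = rng.integers(0, 10, size=sample_count)          # upper bound is exclusive: labels 0..9

# Half-open slices: rows [0, 800) for training, rows [800, 1000) for validation.
X_train, X_valid = X[:train_count], X[train_count:]
Y_train, Y_valid = Y[:train_count], Y[train_count:]

print(X_train.shape, X_valid.shape)                 # (800, 100) (200, 100)
print(len(X_train) // batch, len(X_valid) // batch) # 80 20
```

With a batch size of 10, one pass over the training set is 80 batches and one pass over the validation set is 20.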

In [70]:
#build the network
data = mx.sym.Variable('data')
fc1 = mx.sym.FullyConnected(data, name='fc1', num_hidden=64)
relu1 = mx.sym.Activation(fc1, name='relu1', act_type="relu")
fc2 = mx.sym.FullyConnected(relu1, name='fc2', num_hidden=category_count)
out = mx.sym.SoftmaxOutput(fc2, name='softmax')
mod = mx.mod.Module(out)
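
To see what this stack of layers computes, here is a rough NumPy sketch of a single forward pass through the same shapes: 100 features, 64 hidden units, 10 categories. The weights are random placeholders and the variable names are mine. Note that I store the weights as (inputs, outputs) for readability, which is the transpose of how MXNet's FullyConnected keeps them.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, feature_count, hidden, category_count = 10, 100, 64, 10

x = rng.uniform(0.0, 1.0, size=(batch, feature_count))   # one batch of samples

# Random placeholder weights; a real network learns these during training.
w1 = rng.normal(0.0, 0.1, size=(feature_count, hidden))
b1 = np.zeros(hidden)
w2 = rng.normal(0.0, 0.1, size=(hidden, category_count))
b2 = np.zeros(category_count)

h = np.maximum(x @ w1 + b1, 0.0)          # fc1 followed by relu1
logits = h @ w2 + b2                      # fc2

# Softmax, shifted by the row maximum for numerical stability.
shifted = logits - logits.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

print(h.shape, probs.shape)               # (10, 64) (10, 10)
predicted = probs.argmax(axis=1)          # most likely category per sample
```

Each row of `probs` sums to 1, so it can be read as a probability for each of the 10 categories.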

In [73]:
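train_iter = mx.io.NDArrayIter(data=X_train,label=Y_train,batch_size=batch)
# A module must be bound to the data and label shapes before its parameters can be initialized
mod.bind(data_shapes=train_iter.provide_data, label_shapes=train_iter.provide_label)
# The default mod.init_params() works, but Xavier initialization is much better here.
# Call init_params only once: a second call is ignored unless force_init=True.
mod.init_params(initializer=mx.init.Xavier(magnitude=2.))
mod.init_optimizer(optimizer='sgd', optimizer_params=(('learning_rate', 0.1), ))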
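
For intuition about what these two calls set up, here is a small NumPy sketch of Xavier-style uniform initialization and one plain SGD update. The scale formula follows my reading of the MXNet initializer documentation for its default ‘avg’ factor type, so treat it as an assumption. The helper names `xavier_uniform` and `sgd_step` are hypothetical, and the gradient is made up.

```python
import numpy as np

def xavier_uniform(fan_out, fan_in, magnitude=2.0, rng=None):
    """Sample a (fan_out, fan_in) weight matrix from U(-scale, scale)."""
    rng = rng or np.random.default_rng(0)
    # Assumed formula: scale = sqrt(magnitude / average of fan_in and fan_out)
    scale = np.sqrt(magnitude / ((fan_in + fan_out) / 2.0))
    return rng.uniform(-scale, scale, size=(fan_out, fan_in)), scale

def sgd_step(weight, grad, learning_rate=0.1):
    """Plain SGD: move the weights against the gradient."""
    return weight - learning_rate * grad

w, scale = xavier_uniform(fan_out=64, fan_in=100)   # same shape as the fc1 weights
print(w.shape)                                      # (64, 100)

grad = np.ones_like(w)              # made-up gradient, for illustration only
w_next = sgd_step(w, grad)
```

The real optimizer does more than this sketch, for example gradient rescaling, momentum, and weight decay, which are left out here.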


  
