# Dive into Deep Learning - Getting Started

## Installation

In [1]:
!pip install mxnet

Collecting mxnet
  Downloading https://files.pythonhosted.org/packages/35/1d/b27b1f37ba21dde4bb4c84a1b57f4a4e29c576f2a0e6982dd091718f89c0/mxnet-1.3.1-py2.py3-none-win_amd64.whl (21.5MB)
Collecting numpy<1.15.0,>=1.8.2 (from mxnet)
  Downloading https://files.pythonhosted.org/packages/dc/99/f824a73251589d9fcef2384f9dd21bd1601597fda92ced5882940586ec37/numpy-1.14.6-cp36-none-win_amd64.whl (13.4MB)
Installing collected packages: numpy, mxnet
  Found existing installation: numpy 1.15.4
    Uninstalling numpy-1.15.4:
      Successfully uninstalled numpy-1.15.4
Successfully installed mxnet-1.3.1 numpy-1.14.6




## Manipulating Data with ndarray

NDArrays are MXNet’s primary tool for storing and transforming data. 

NDArrays are similar to NumPy’s multi-dimensional arrays, with two main advantages:
1. NDArrays support asynchronous computation on CPU, GPU, and distributed cloud architectures.
2. NDArrays support automatic differentiation.

### Getting Started with ndarrays

In [2]:
import mxnet as mx
from mxnet import nd



In [14]:
# dir(nd)

In [15]:
# Vector operation: create a row vector of 12 values, 0 through 11
x = nd.arange(12)
x


[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11.]
<NDArray 12 @cpu(0)>

In [16]:
# dir(x)

In [17]:
x.shape

(12,)

In [18]:
x.shape_array()


[12]
<NDArray 1 @cpu(0)>

In [19]:
x.size

12

In [20]:
y = x.reshape(3,4)

In [21]:
y


[[ 0.  1.  2.  3.]
 [ 4.  5.  6.  7.]
 [ 8.  9. 10. 11.]]
<NDArray 3x4 @cpu(0)>

In [22]:
x.reshape((3,4))


[[ 0.  1.  2.  3.]
 [ 4.  5.  6.  7.]
 [ 8.  9. 10. 11.]]
<NDArray 3x4 @cpu(0)>

In [24]:
# Pass -1 for one dimension to let MXNet infer it from the array size and the other dimensions
x.reshape(3,4), x.reshape(3,-1), x.reshape(-1, 4)

(
 [[ 0.  1.  2.  3.]
  [ 4.  5.  6.  7.]
  [ 8.  9. 10. 11.]]
 <NDArray 3x4 @cpu(0)>, 
 [[ 0.  1.  2.  3.]
  [ 4.  5.  6.  7.]
  [ 8.  9. 10. 11.]]
 <NDArray 3x4 @cpu(0)>, 
 [[ 0.  1.  2.  3.]
  [ 4.  5.  6.  7.]
  [ 8.  9. 10. 11.]]
 <NDArray 3x4 @cpu(0)>)

In [26]:
# Working with tensors, i.e. multi-dimensional arrays
nd.zeros((2,3,4))


[[[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]

 [[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]]
<NDArray 2x3x4 @cpu(0)>

In [28]:
nd.ones((2,3,4))


[[[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]

 [[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]]
<NDArray 2x3x4 @cpu(0)>

In [29]:
arr = [[1,2,3,4], [2,3,4,1], [3,4,1,2], [4,1,2,3]]
nd.array(arr)


[[1. 2. 3. 4.]
 [2. 3. 4. 1.]
 [3. 4. 1. 2.]
 [4. 1. 2. 3.]]
<NDArray 4x4 @cpu(0)>

In [39]:
# Sample a 3x4 array from a normal distribution with mean 0 and standard deviation 1
nd.random.normal(0, 1, shape=(3,4))


[[ 0.4938394  -0.90434265 -1.2140794   2.1564064 ]
 [ 1.0938222   1.8271433  -1.04467     1.006219  ]
 [ 0.5174201  -0.80693173  1.3769008   0.20588511]]
<NDArray 3x4 @cpu(0)>

### Operations

In [44]:
# Element-wise operations
x = nd.array([1, 2, 4, 8])
y = nd.array([2, 4, 6, 8])
print('x: ', x)
print('y: ', y)
print('x + y', x + y)
print('x - y', x - y)
print('x * y', x * y)
print('x / y', x / y)

x:  
[1. 2. 4. 8.]
<NDArray 4 @cpu(0)>
y:  
[2. 4. 6. 8.]
<NDArray 4 @cpu(0)>
x + y 
[ 3.  6. 10. 16.]
<NDArray 4 @cpu(0)>
x - y 
[-1. -2. -2.  0.]
<NDArray 4 @cpu(0)>
x * y 
[ 2.  8. 24. 64.]
<NDArray 4 @cpu(0)>
x / y 
[0.5       0.5       0.6666667 1.       ]
<NDArray 4 @cpu(0)>


In [50]:
# Matrix multiplication: a 3x4 matrix times its 4x3 transpose gives a 3x3 result
x = nd.arange(1,13).reshape((3,4))
y = x.T
print("x: ", x)
print("y: ", y)
print("Dot Product: ", nd.dot(x, y))

x:  
[[ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 9. 10. 11. 12.]]
<NDArray 3x4 @cpu(0)>
y:  
[[ 1.  5.  9.]
 [ 2.  6. 10.]
 [ 3.  7. 11.]
 [ 4.  8. 12.]]
<NDArray 4x3 @cpu(0)>
Dot Product:  
[[ 30.  70. 110.]
 [ 70. 174. 278.]
 [110. 278. 446.]]
<NDArray 3x3 @cpu(0)>


In [52]:
# Element-wise comparison: 1 where the entries are equal, 0 otherwise
x = nd.arange(1,10).reshape((3,3))
y = x.T
x == y


[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
<NDArray 3x3 @cpu(0)>

In [53]:
x, x.sum()

(
 [[1. 2. 3.]
  [4. 5. 6.]
  [7. 8. 9.]]
 <NDArray 3x3 @cpu(0)>, 
 [45.]
 <NDArray 1 @cpu(0)>)

In [54]:
# L2 norm: square root of the sum of squared elements
x.norm()


[16.881943]
<NDArray 1 @cpu(0)>

### Broadcasting

When the shapes of two NDArrays differ, MXNet applies broadcasting: axes of size 1 are stretched, conceptually by copying elements, until both operands have the same shape, and the operation is then carried out element-wise.
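
A minimal sketch of the broadcasting rule. It uses NumPy rather than MXNet, an assumption made so that the snippet runs without MXNet installed; NDArray arithmetic follows the same shape rule, so the same expressions written with nd instead of np should give an array of the same shape. The names a, b, and c are illustrative only.

```python
import numpy as np

# NumPy stands in for mxnet.nd here (assumption: the shape rule is the same).
a = np.arange(3).reshape((3, 1))  # column: shape (3, 1)
b = np.arange(2).reshape((1, 2))  # row:    shape (1, 2)

# a is stretched along its second axis and b along its first,
# so both behave like (3, 2) arrays before the element-wise add.
c = a + b

print(c.shape)
print(c)
```

Each entry of c is the sum of one element of a and one element of b, which is why a 3x1 array and a 1x2 array combine into a 3x2 result.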

### Indexing and Slicing

Indexing and slicing work much as they do for Python sequences.

In [57]:
x = nd.arange(1,13).reshape(3,4)
x


[[ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 9. 10. 11. 12.]]
<NDArray 3x4 @cpu(0)>

In [58]:
x[1:2]


[[5. 6. 7. 8.]]
<NDArray 1x4 @cpu(0)>

In [59]:
x[1:3]


[[ 5.  6.  7.  8.]
 [ 9. 10. 11. 12.]]
<NDArray 2x4 @cpu(0)>

In [60]:
x[:, 1:3]


[[ 2.  3.]
 [ 6.  7.]
 [10. 11.]]
<NDArray 3x2 @cpu(0)>

In [64]:
x[1,3] = 100
x


[[  1.   2.   3.   4.]
 [  5.   6.   7. 100.]
 [  9.  10.  11.  12.]]
<NDArray 3x4 @cpu(0)>

In [69]:
x[:, :2] = -10
x


[[-10. -10.   3.   4.]
 [-10. -10.   7. 100.]
 [-10. -10.  11.  12.]]
<NDArray 3x4 @cpu(0)>

### Saving Memory

In [72]:
x = nd.array([1,1,1])
y = nd.array([1,2,3])
before = id(y)
y = y + x
id(y) == before, id(y), before

(False, 1950407163464, 1950407163240)

Here, y = y + x allocates new memory for the result and rebinds y to it, which is why id(y) changes. Repeating this for every operation becomes costly as the data grows. A better solution is to update the array in place.

In [73]:
x = nd.array([1,1,1])
y = nd.array([1,2,3])
before = id(y)
y[:] = y + x
id(y) == before, id(y), before

(True, 1950407209200, 1950407209200)

Although this is comparatively more efficient, the result of y + x is still written to a temporary buffer before being copied into y. Passing out=y to the operator avoids the temporary buffer.

In [74]:
x = nd.array([1,1,1])
y = nd.array([1,2,3])
before = id(y)
nd.elemwise_add(x,y, out=y)
id(y) == before, id(y), before

(True, 1950407209032, 1950407209032)