NDArray is the primary tool for storing and transforming data in MXNet. It is designed to be similar to NumPy's multi-dimensional array, with two additional key features:
1. Supports asynchronous computation on CPU, GPU and distributed cloud architectures.
2. Supports automatic differentiation.

# Getting Started 

In [1]:
import mxnet as mx
from mxnet import nd

Create a vector with 12 consecutive integers.

In [3]:
x = nd.arange(12)
x


[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11.]
<NDArray 12 @cpu(0)>

The line ``<NDArray 12 @cpu(0)>`` in the output shows that ``x`` is a one-dimensional array of length 12 and resides in CPU main memory.

We can get the shape of an NDArray instance through its ``shape`` property:

In [4]:
x.shape

(12,)

We can also get the total number of elements of an NDArray instance through its ``size`` property:

In [6]:
x.size

12

We can change the shape of an NDArray by calling its ``reshape`` method with a tuple that represents the target shape:

In [7]:
x = x.reshape((3, 4))
x


[[ 0.  1.  2.  3.]
 [ 4.  5.  6.  7.]
 [ 8.  9. 10. 11.]]
<NDArray 3x4 @cpu(0)>
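
The output above suggests that ``reshape`` keeps the elements in row-major order and only changes how they are grouped. A minimal plain-Python sketch of that grouping follows; it does not use MXNet, and `reshape_2d` is a hypothetical helper written only for illustration:

```python
def reshape_2d(flat, rows, cols):
    """Group a flat list into `rows` lists of `cols` items, row by row."""
    if rows * cols != len(flat):
        raise ValueError("target shape must hold exactly len(flat) elements")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


# Same data as nd.arange(12), regrouped into 3 rows of 4.
matrix = reshape_2d(list(range(12)), 3, 4)
for row in matrix:
    print(row)
```

The length check mirrors the requirement that the target shape must account for every element of the original array.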

The ``empty`` function gives us an uninitialized matrix. Its entries are whatever happened to be in memory (zeros in this run), so they should not be relied on:

In [8]:
nd.empty((3, 4))


[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
<NDArray 3x4 @cpu(0)>

Create arrays filled with zeros or with ones:

In [9]:
nd.zeros((3, 4))


[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
<NDArray 3x4 @cpu(0)>

In [10]:
nd.ones((3, 4))


[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]
<NDArray 3x4 @cpu(0)>

Specify the value of each element in an NDArray through a Python list:

In [11]:
nd.array([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])


[[2. 1. 4. 3.]
 [1. 2. 3. 4.]
 [4. 3. 2. 1.]]
<NDArray 3x4 @cpu(0)>

Create a random matrix with elements sampled from a normal distribution with zero mean and unit variance:

In [12]:
nd.random.normal(0, 1, shape=(3, 4))


[[ 1.1630787   0.4838046   0.29956347  0.15302546]
 [-1.1688148   1.5580711  -0.5459446  -2.3556297 ]
 [ 0.5414402   2.6785066   1.2546344  -0.54877394]]
<NDArray 3x4 @cpu(0)>

# Operations 

An element-wise function can be created from any function that maps scalars to scalars. In MXNet, the standard arithmetic operators (+, -, \*, /, \*\*) have all been lifted to element-wise operators.

In [13]:
x = nd.array([1, 2, 4, 8])
y = nd.ones_like(x) * 2
print('x = ', x)
print('x + y = ', x + y)
print('x - y = ', x - y)
print('x * y = ', x * y)
print('x / y = ', x / y)
print('x ** y = ', x**y)

x =  
[1. 2. 4. 8.]
<NDArray 4 @cpu(0)>
x + y =  
[ 3.  4.  6. 10.]
<NDArray 4 @cpu(0)>
x - y =  
[-1.  0.  2.  6.]
<NDArray 4 @cpu(0)>
x * y =  
[ 2.  4.  8. 16.]
<NDArray 4 @cpu(0)>
x / y =  
[0.5 1.  2.  4. ]
<NDArray 4 @cpu(0)>
x ** y =  
[ 1.  4. 16. 64.]
<NDArray 4 @cpu(0)>
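
The idea of lifting a scalar function to an element-wise one can be sketched in plain Python. This is only an illustration on lists, not how MXNet implements its operators, and `lift` is a hypothetical helper:

```python
import operator


def lift(scalar_fn):
    """Turn a two-argument scalar function into an element-wise one on lists."""
    def elementwise(xs, ys):
        if len(xs) != len(ys):
            raise ValueError("inputs must have the same length")
        return [scalar_fn(a, b) for a, b in zip(xs, ys)]
    return elementwise


x_vals = [1.0, 2.0, 4.0, 8.0]
y_vals = [2.0, 2.0, 2.0, 2.0]

add = lift(operator.add)    # plays the role of x + y
power = lift(operator.pow)  # plays the role of x ** y

print(add(x_vals, y_vals))
print(power(x_vals, y_vals))
```

The two printed lists hold the same values as the `x + y` and `x ** y` outputs above.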


Many more operations can be applied element-wise, such as exponentiation:

In [14]:
x.exp()


[2.7182817e+00 7.3890562e+00 5.4598148e+01 2.9809580e+03]
<NDArray 4 @cpu(0)>

Matrix multiplication uses the ``dot`` function (``T`` takes the transpose):

In [16]:
x = nd.arange(12).reshape((3, 4))
y = nd.array([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
nd.dot(x, y.T)


[[ 18.  20.  10.]
 [ 58.  60.  50.]
 [ 98. 100.  90.]]
<NDArray 3x3 @cpu(0)>
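
To see where these numbers come from, note that entry (i, j) of the product is row i of ``x`` dotted with row j of ``y``, because ``y`` is transposed. A plain-Python check on nested lists, with `dot_with_transpose` as a hypothetical helper that does not use MXNet:

```python
def dot_with_transpose(a, b):
    """Compute a times the transpose of b for list-of-lists matrices.

    Entry (i, j) is the dot product of row i of `a` with row j of `b`.
    """
    return [[sum(p * q for p, q in zip(row_a, row_b)) for row_b in b]
            for row_a in a]


x_rows = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
y_rows = [[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]]

product = dot_with_transpose(x_rows, y_rows)
for row in product:
    print(row)
```

For example, the top-left entry is 0\*2 + 1\*1 + 2\*4 + 3\*3 = 18, matching the output above.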

Merge multiple NDArrays along a dimension with the ``concat`` function:

In [17]:
nd.concat(x, y, dim=0)


[[ 0.  1.  2.  3.]
 [ 4.  5.  6.  7.]
 [ 8.  9. 10. 11.]
 [ 2.  1.  4.  3.]
 [ 1.  2.  3.  4.]
 [ 4.  3.  2.  1.]]
<NDArray 6x4 @cpu(0)>

In [18]:
nd.concat(x, y, dim=1)


[[ 0.  1.  2.  3.  2.  1.  4.  3.]
 [ 4.  5.  6.  7.  1.  2.  3.  4.]
 [ 8.  9. 10. 11.  4.  3.  2.  1.]]
<NDArray 3x8 @cpu(0)>
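
The role of ``dim`` can be pictured with nested lists: along dimension 0 the rows of the second matrix are appended below the first, while along dimension 1 each row is extended sideways. The following is a plain-Python analogy only, not MXNet code:

```python
x_rows = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
y_rows = [[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]]

# Analogue of dim=0: stack the rows of y under the rows of x (6 rows of 4).
concat_dim0 = x_rows + y_rows

# Analogue of dim=1: extend each row of x with the matching row of y (3 rows of 8).
concat_dim1 = [row_x + row_y for row_x, row_y in zip(x_rows, y_rows)]

print(len(concat_dim0), len(concat_dim0[0]))
print(len(concat_dim1), len(concat_dim1[0]))
```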

We can create a binary NDArray (entries 0 or 1) from a logical statement:

In [19]:
x == y


[[0. 1. 0. 1.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
<NDArray 3x4 @cpu(0)>

Summing all the elements in an NDArray yields an NDArray with one element:

In [20]:
x.sum()


[66.]
<NDArray 1 @cpu(0)>

We can convert a one-element NDArray into a Python scalar using the ``asscalar`` method, shown here on the result of ``norm``:

In [21]:
x.norm().asscalar()

22.494442
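
The value above is consistent with ``norm`` computing the L2 norm, i.e. the square root of the sum of squared elements. A quick check in plain Python, assuming ``x`` still holds the integers 0 through 11 at this point:

```python
import math

values = list(range(12))  # the entries of x: 0, 1, ..., 11

sum_of_squares = sum(v * v for v in values)
l2_norm = math.sqrt(sum_of_squares)

print(sum_of_squares)
print(l2_norm)
```

The result agrees with the printed value up to single-precision rounding.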

# Broadcast Mechanism

The broadcast mechanism copies elements as needed so that the two NDArrays have the same shape, and then carries out the operation element by element:

In [22]:
a = nd.arange(3).reshape((3, 1))
b = nd.arange(2).reshape((1, 2))
a, b

(
 [[0.]
  [1.]
  [2.]]
 <NDArray 3x1 @cpu(0)>, 
 [[0. 1.]]
 <NDArray 1x2 @cpu(0)>)

In [23]:
a + b


[[0. 1.]
 [1. 2.]
 [2. 3.]]
<NDArray 3x2 @cpu(0)>
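
The shape rule behind this result can be sketched in plain Python. The sketch assumes both shapes have the same number of dimensions, as in the example above, and that an axis of length 1 is stretched to match the other array; `broadcast_shape` and `broadcast_add_2d` are hypothetical helpers, and NDArray's actual rules may cover more cases:

```python
def broadcast_shape(shape_a, shape_b):
    """Return the result shape for two shapes of equal rank."""
    if len(shape_a) != len(shape_b):
        raise ValueError("this sketch only handles shapes of equal rank")
    result = []
    for dim_a, dim_b in zip(shape_a, shape_b):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"incompatible axis lengths {dim_a} and {dim_b}")
    return tuple(result)


def broadcast_add_2d(a, b):
    """Add two list-of-lists matrices, repeating any axis of length 1."""
    rows, cols = broadcast_shape((len(a), len(a[0])), (len(b), len(b[0])))
    # The modulo maps every index to 0 along an axis of length 1.
    return [[a[i % len(a)][j % len(a[0])] + b[i % len(b)][j % len(b[0])]
             for j in range(cols)]
            for i in range(rows)]


a_vals = [[0.0], [1.0], [2.0]]  # shape (3, 1)
b_vals = [[0.0, 1.0]]           # shape (1, 2)

print(broadcast_shape((3, 1), (1, 2)))
print(broadcast_add_2d(a_vals, b_vals))
```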

# Indexing and Slicing

In [24]:
x[1: 3]


[[ 4.  5.  6.  7.]
 [ 8.  9. 10. 11.]]
<NDArray 2x4 @cpu(0)>

In [25]:
x[1, 2] = 9
x


[[ 0.  1.  2.  3.]
 [ 4.  5.  9.  7.]
 [ 8.  9. 10. 11.]]
<NDArray 3x4 @cpu(0)>

In [26]:
x[0:2, :] = 12
x


[[12. 12. 12. 12.]
 [12. 12. 12. 12.]
 [ 8.  9. 10. 11.]]
<NDArray 3x4 @cpu(0)>

# Saving Memory 

Every time we run an operation, new memory is allocated to hold its result:

In [27]:
print(id(y))
y = x + y
print(id(y))

4414608384
4578824824


We can perform an in-place operation by using slice notation:

In [30]:
z = nd.zeros_like(x)
print(id(z))
z[:] = x + y
print(id(z))

4578826952
4578826952


While ``z`` reuses the same memory, the `x + y` operation still allocates a temporary buffer to store the result. To make even better use of memory, we can directly invoke the underlying `ndarray` operation and pass `z` as its `out` argument:

In [31]:
print(id(z))
nd.elemwise_add(x, y, out=z)
print(id(z))

4578826952
4578826952
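
Since ``id`` reports the identity of a Python object, the difference between the `y = x + y` and `z[:] = x + y` cells earlier in this section comes down to rebinding a name versus writing into an existing object. The same distinction can be seen with plain lists; this is an analogy only and says nothing about how MXNet manages its buffers:

```python
x_vals = [1.0, 2.0, 3.0]
y_vals = [10.0, 20.0, 30.0]

# Rebinding: the name y_vals now points at a newly created list.
old_y = y_vals
y_vals = [a + b for a, b in zip(x_vals, y_vals)]
rebinding_made_new_object = y_vals is not old_y

# Slice assignment: the existing list object z_vals is overwritten in place.
z_vals = [0.0, 0.0, 0.0]
id_before = id(z_vals)
z_vals[:] = [a + b for a, b in zip(x_vals, y_vals)]
same_object = id(z_vals) == id_before

print(rebinding_made_new_object, same_object)
```

As in the NDArray case, the right-hand side of the slice assignment still builds a temporary result before it is copied into the existing object.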


# Mutual Transformation of NDArray and NumPy 

The converted arrays **do not** share memory:

In [32]:
import numpy as np

a = x.asnumpy()
print(type(a), id(a))
b = nd.array(a)
print(type(b), id(b))

<class 'numpy.ndarray'> 4578919328
<class 'mxnet.ndarray.ndarray.NDArray'> 4578941304


# Problems 

1. Run the code in this section. Change the conditional statement `x == y` in this section to `x < y` or `x > y`, and then see what kind of NDArray you can get.

In [35]:
x, y

(
 [[12. 12. 12. 12.]
  [12. 12. 12. 12.]
  [ 8.  9. 10. 11.]]
 <NDArray 3x4 @cpu(0)>, 
 [[14. 13. 16. 15.]
  [13. 14. 15. 16.]
  [12. 12. 12. 12.]]
 <NDArray 3x4 @cpu(0)>)

In [33]:
x < y


[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]
<NDArray 3x4 @cpu(0)>

In [34]:
x > y


[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
<NDArray 3x4 @cpu(0)>

2. Replace the two NDArrays used in the broadcast example with arrays of other shapes, e.g. three-dimensional tensors. Is the result what you expected?

In [37]:
a = nd.arange(3).reshape((3, 1, 1))
b = nd.arange(2).reshape((1, 1, 2))
a, b

(
 [[[0.]]
 
  [[1.]]
 
  [[2.]]]
 <NDArray 3x1x1 @cpu(0)>, 
 [[[0. 1.]]]
 <NDArray 1x1x2 @cpu(0)>)

In [38]:
a + b


[[[0. 1.]]

 [[1. 2.]]

 [[2. 3.]]]
<NDArray 3x1x2 @cpu(0)>

3. Assume that we have three matrices `a`, `b` and `c`. Rewrite `c = nd.dot(a, b.T) + c` in the most memory-efficient manner.

In [42]:
a = nd.ones((3, 4))
b = nd.random.normal(0, 1, shape=(3, 4))
c = nd.arange(9).reshape((3, 3))

c += nd.dot(a, b.T)
c


[[-2.827944   -1.6937523  -1.4320533 ]
 [ 0.17205596  1.3062477   1.5679467 ]
 [ 3.1720562   4.3062477   4.567947  ]]
<NDArray 3x3 @cpu(0)>