# Manipulate data the MXNet way with NDArray

MXNet's NDArray provides a data structure similar to NumPy's multi-dimensional array, adding some key capabilities. First, NDArrays support asynchronous computation on CPU, GPU, and distributed cloud architectures. Second, they provide support for automatic differentiation. Together, these properties make NDArray well suited to machine learning, both for research projects and for production systems.


## Getting started

First, let's import ``mxnet`` and (for convenience) ``mxnet.ndarray``, the only dependencies we'll need in this tutorial.

In [None]:
import mxnet as mx
import mxnet.ndarray as nd

Next, let's see how to create an NDArray without initializing its values. Specifically, we'll create a 2D array (also called a *matrix*) with 6 rows and 4 columns. Because the memory is not initialized, the printed values are arbitrary.

In [24]:
x = nd.empty(shape=(6,4))
print(x)

[[  0.00000000e+00  -0.00000000e+00  -1.12447217e-19  -2.85864887e-42]
 [  8.40779079e-45   0.00000000e+00   0.00000000e+00   0.00000000e+00]
 [  0.00000000e+00   4.57523949e-41   1.87544948e+28   1.03964458e-05]
 [  3.25370380e+21   1.04132675e-11   3.36446871e+21   4.22486810e-05]
 [  6.70030531e-10   7.98834428e+20   1.03939814e+21   3.41820942e-06]
 [  4.16550197e-11   7.14495034e+31   4.14181787e-41   5.51012977e-40]]
<NDArray 6x4 @cpu(0)>


Often we'll want to create arrays whose values are sampled randomly. This is especially common when we intend to use the array as a parameter in a neural network. In this snippet, we initialize with values drawn from a standard normal distribution.

In [25]:
x = nd.random_normal(shape=(6,4))
print(x)

[[-1.57344317  1.26625562 -0.14007865  0.89506418]
 [ 0.29670078 -0.60159451  1.31119514  1.20405591]
 [ 0.5035904  -0.9712193  -1.18944502 -0.58256227]
 [-0.55021369  0.37170771 -1.59187555  0.93000722]
 [-1.10819459 -1.42257547  0.07872018 -0.51761991]
 [-0.91856349  2.00883245 -0.74571455  0.2863085 ]]
<NDArray 6x4 @cpu(0)>


As in NumPy, the dimensions of each NDArray are accessible via the ``.shape`` attribute.

In [26]:
print(x.shape)

(6, 4)


We can also query its size, which is equal to the product of the components of the shape. Together with the precision of the stored values, this tells us how much memory the array occupies.

In [23]:
print(x.size)

24
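
To make the memory claim concrete, here is a small back-of-the-envelope sketch in plain Python. It assumes the values are stored as 32-bit floats (4 bytes each), which is MXNet's default precision; the variable names are only illustrative, and any bookkeeping overhead of the array object itself is ignored.

```python
import math

shape = (6, 4)         # shape of the array x created above
bytes_per_value = 4    # assumption: 32-bit floats (float32)

size = math.prod(shape)            # number of elements, same as x.size
n_bytes = size * bytes_per_value   # raw storage needed for the values

print(size)     # 24
print(n_bytes)  # 96
```

With 64-bit floats the same shape would need twice as many bytes.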


## Operations

NDArray supports a large number of standard mathematical operations. For example, ``+`` adds two arrays of the same shape element-wise.

In [27]:
y = nd.random_normal(shape=(6,4))
c = x + y
print(c)

[[-2.78152943  1.82671511  1.67394257  1.86482394]
 [-1.22604227 -1.13013196 -1.20404983 -0.68503404]
 [-0.85134214 -0.31642807 -2.14692903 -1.03737545]
 [-1.27506936  0.69681579 -0.47991192 -0.37022686]
 [-1.58646703 -1.05464101 -1.09522903  0.93580633]
 [-1.7107482   2.25037408  0.18638974  0.76528859]]
<NDArray 6x4 @cpu(0)>


## In-place operations

In the previous example, we allocated new memory for the sum ``x+y`` and assigned a reference to the variable ``c``. To make better use of memory, we often prefer to perform operations in place, reusing already allocated memory. 

In MXNet, we can specify where to write the results of operations by assigning them with slice notation, e.g., ``result[:] = ...``.

In [28]:
result = nd.zeros(shape=(6,4))
result[:] = x+y
print(result)

[[-2.78152943  1.82671511  1.67394257  1.86482394]
 [-1.22604227 -1.13013196 -1.20404983 -0.68503404]
 [-0.85134214 -0.31642807 -2.14692903 -1.03737545]
 [-1.27506936  0.69681579 -0.47991192 -0.37022686]
 [-1.58646703 -1.05464101 -1.09522903  0.93580633]
 [-1.7107482   2.25037408  0.18638974  0.76528859]]
<NDArray 6x4 @cpu(0)>


If we're not planning to re-use ``x``, then we can assign the result to ``x`` itself.

In [None]:
x[:] = x + y

But be careful! This is **NOT** the same as ``x = x + y``. If we don't use slice notation then we allocate new memory and assign a reference to the new data to the variable ``x``.
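
The difference between the two forms can be checked by keeping a second name for the same array. The sketch below uses NumPy as a stand-in, on the assumption that its slice-assignment semantics mirror the NDArray behavior described above; ``alias`` is a hypothetical helper variable, not part of the tutorial.

```python
import numpy as np

x = np.ones((6, 4))
y = np.ones((6, 4))
alias = x                  # a second name for the same array

x[:] = x + y               # the result is written into x's existing memory
same_after_slice = alias is x                   # still the same object
alias_sees_update = bool((alias == 2).all())    # alias observes the new values

x = x + y                  # x is rebound to a freshly allocated array
same_after_rebind = alias is x                  # the two names now differ

print(same_after_slice, alias_sees_update, same_after_rebind)  # True True False
```

After the rebinding step, ``alias`` still refers to the old array (all 2s), while ``x`` refers to the new one (all 3s).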

## Slicing

MXNet NDArrays support slicing in all the ridiculous ways you might imagine accessing your data. Here's an example of reading the third and fourth rows (indices 2 and 3) from ``x``. As in NumPy, indices start at 0 and the end of the range is excluded.

In [30]:
x[2:4]

[[-0.85134214 -0.31642807 -2.14692903 -1.03737545]
 [-1.27506936  0.69681579 -0.47991192 -0.37022686]]
<NDArray 2x4 @cpu(0)>

Now let's try writing to a specific element.

In [43]:
x[3,2] = 9.0
print(x[3])

[ -0.55021369   5.           9.          12.34566975]
<NDArray 4 @cpu(0)>


## Weird multi-dimensional slicing

We can even write to arbitrary ranges along each of the axes.

In [44]:
x[2:4,1:3] = 5.0
print(x)

[[ -1.57344317   1.26625562  -0.14007865   0.89506418]
 [  0.29670078  -0.60159451   1.31119514   1.20405591]
 [  0.5035904    5.           5.          -0.58256227]
 [ -0.55021369   5.           5.          12.34566975]
 [ -1.10819459  -1.42257547   0.07872018  -0.51761991]
 [ -0.91856349   2.00883245  -0.74571455   0.2863085 ]]
<NDArray 6x4 @cpu(0)>
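
The ranges in these slices are half-open: ``2:4`` covers indices 2 and 3, and ``1:3`` covers indices 1 and 2. The following sketch illustrates this with NumPy as a stand-in (an assumption made so the values are easy to predict; the array is filled with 0 through 23 rather than random numbers).

```python
import numpy as np

# Row i holds the values 4*i, 4*i + 1, 4*i + 2, 4*i + 3.
x = np.arange(24, dtype=np.float32).reshape(6, 4)

rows = x[2:4]              # rows at indices 2 and 3; index 4 is excluded
print(rows.shape)          # (2, 4)

x[2:4, 1:3] = 5.0          # rows 2-3, columns 1-2: a 2x2 block
print(x[2].tolist())       # [8.0, 5.0, 5.0, 11.0]
print(x[3].tolist())       # [12.0, 5.0, 5.0, 15.0]
print(x[4].tolist())       # [16.0, 17.0, 18.0, 19.0]  (untouched)
```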


## Converting from MXNet NDArray to NumPy 

Converting MXNet NDArrays to and from NumPy is easy. Note that, unlike in PyTorch, the converted arrays do not share memory.

In [14]:
a = nd.ones(shape=(5,))  # trailing comma makes the shape a 1-element tuple
print(a)

[ 1.  1.  1.  1.  1.]
<NDArray 5 @cpu(0)>


In [15]:
b = a.asnumpy()
print(b)

[ 1.  1.  1.  1.  1.]


In [16]:
b[0] = 2
print(b)
print(a)

[ 2.  1.  1.  1.  1.]
[ 1.  1.  1.  1.  1.]
<NDArray 5 @cpu(0)>


## Converting from NumPy Array to MXNet NDArray

Constructing an MXNet NDArray from a NumPy array is straightforward.

In [17]:
c = nd.array(b)
print(c)

[ 2.  1.  1.  1.  1.]
<NDArray 5 @cpu(0)>


## Managing context

In MXNet, every array has a context. One context could be the CPU. Other contexts might be various GPUs. Things can get even hairier when we deploy jobs across multiple servers. By assigning arrays to contexts intelligently, we can minimize the time spent transferring data between devices. For example, when training neural networks on a server with a GPU, we typically prefer for the model's parameters to live on the GPU. 


In [19]:
d = nd.array(b, mx.cpu())

Given an NDArray on a given context, we can copy it to another context by using the ``copyto()`` method.

In [45]:
e = d.copyto(mx.cpu(1))
print(e)

[ 2.  1.  1.  1.  1.]
<NDArray 5 @cpu(1)>


## Watch out!

Imagine that your variable ``d`` already lives on your second GPU (``mx.gpu(1)``). What happens if we call ``d.copyto(mx.gpu(1))``? It will make a copy and allocate new memory, even though that variable already lives on the desired device! 

Often, we only want to make a copy if the variable *currently* lives in the wrong context. In these cases, we can call ``as_in_context()``. If the variable is already on ``mx.gpu(1)`` then this is a no-op.

In [22]:
f = d.as_in_context(mx.cpu(0))
print(f)

[ 2.  1.  1.  1.  1.]
<NDArray 5 @cpu(0)>


For whinges or inquiries, [open an issue on GitHub](https://github.com/zackchase/mxnet-the-straight-dope).