# Getting started

To start, we import the np (numpy) and npx (numpy_extension) modules from MXNet. Here, the np
module includes functions supported by NumPy, while the npx module contains a set of extensions
developed to empower deep learning within a NumPy-like environment. When using ndarray, we
almost always invoke the set_np function: this is for compatibility of ndarray processing by other
components of MXNet.

In [1]:
from mxnet import np, npx

In [2]:
npx.set_np() # Activate numpy-like environment

Create a row vector `x` containing the first 12 integers starting with 0 (stored as floats by default, as the output shows):

In [3]:
x = np.arange(12)
x

array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.])

We can access an ndarray's shape (the length along each axis) by inspecting its `shape` property:

In [4]:
x.shape

(12,)

If we want to know the total number of elements in an ndarray, we can inspect its `size` property:

In [5]:
x.size

12

## Reshaping

We can reshape `x` from `(12,)` to `(3, 4)`. It is not necessary to specify both dimensions: if you know one of them, set the other to `-1` and it will be inferred from the total number of elements.

In [6]:
# x = x.reshape(3, -1)
x = x.reshape(3,4)
x

array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.]])

In [7]:
# x = x.reshape(-1, 4)
x = x.reshape(3, 4)
x

array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.]])
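
To make the `-1` inference concrete, here is a minimal sketch. It uses standard NumPy (imported as `onp`, a name chosen here for illustration) as a stand-in so that it runs without MXNet; `mxnet.np` is designed to mirror this interface, although NumPy's `arange` produces integers rather than floats.

```python
import numpy as onp  # assumption: plain NumPy as a stand-in for mxnet.np

v = onp.arange(12)

# Either dimension may be given as -1; it is inferred from the total size.
rows_inferred = v.reshape(-1, 4)
cols_inferred = v.reshape(3, -1)
print(rows_inferred.shape, cols_inferred.shape)

# Reshaping never changes the number of elements.
print(v.size == rows_inferred.size == cols_inferred.size)

# A shape that does not divide the size evenly cannot be inferred.
try:
    v.reshape(5, -1)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("cannot reshape:", err)
```

Both calls produce the same `(3, 4)` layout, which is why the commented-out lines in the cells above are interchangeable with the explicit `reshape(3, 4)`.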

## Methods for creating `ndarray`s

The `empty` method grabs a chunk of memory and hands us back a matrix without bothering to change the value of any of its entries, so its contents are arbitrary:

In [8]:
np.empty(shape=(3,4))

array([[2.343108e-29, 3.086640e-41, 0.000000e+00, 0.000000e+00],
       [0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00],
       [0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00]])

Typically, we will want our matrices initialized either with zeros, ones, some other constants, or
numbers randomly sampled from a specific distribution. We can create an ndarray representing
a tensor with all elements set to 0 and a shape of `(2, 3, 4)` as follows:

In [9]:
np.zeros((2,3,4))

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]])

Similarly, we can create tensors with each element set to $1$ as follows:

In [10]:
np.ones((2,3,4))

array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])

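Often we want to randomly sample the value of each element in an `ndarray` from some probability distribution. For example, when we construct arrays to serve as parameters in a neural network, we typically initialize their values randomly. The following snippet creates an `ndarray` of shape `(3, 4)` whose elements are sampled from a normal distribution with mean 0 and standard deviation 1: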

In [11]:
np.random.normal(0, 1, size=(3,4))

array([[ 2.2122064 ,  1.1630787 ,  0.7740038 ,  0.4838046 ],
       [ 1.0434403 ,  0.29956347,  1.1839255 ,  0.15302546],
       [ 1.8917114 , -1.1688148 , -1.2347414 ,  1.5580711 ]])
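
As a rough check on what "sampled from a normal distribution" means, the sketch below draws a large sample and inspects its statistics. It is written against standard NumPy's `default_rng` generator (an assumption made so the snippet runs without MXNet; this generator API is NumPy's, not necessarily MXNet's), and the seed value is arbitrary.

```python
import numpy as onp  # assumption: plain NumPy as a stand-in for mxnet.np

rng = onp.random.default_rng(0)  # seeded so repeated runs give the same draws

# Same call shape as in the cell above: mean 0, standard deviation 1.
w = rng.normal(loc=0.0, scale=1.0, size=(3, 4))
print(w.shape)

# With many samples, the empirical mean and standard deviation
# should land close to the requested 0 and 1.
big = rng.normal(loc=0.0, scale=1.0, size=100_000)
print(float(big.mean()), float(big.std()))
```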

We can also create an `ndarray` by supplying the exact values as a (nested) Python list:

In [12]:
np.array([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

array([[2., 1., 4., 3.],
       [1., 2., 3., 4.],
       [4., 3., 2., 1.]])

# Operations

## Elementwise operations

In [13]:
x = np.array([1, 2, 4, 8])
y = np.array([2, 2, 2, 2])
x + y, x - y, x * y, x / y, x ** y

(array([ 3.,  4.,  6., 10.]),
 array([-1.,  0.,  2.,  6.]),
 array([ 2.,  4.,  8., 16.]),
 array([0.5, 1. , 2. , 4. ]),
 array([ 1.,  4., 16., 64.]))

Many more operations can be applied elementwise, including unary operators like exponentiation:

In [14]:
np.exp(x)

array([2.7182817e+00, 7.3890562e+00, 5.4598148e+01, 2.9809580e+03])

## Concatenating `ndarray`s

In [15]:
x = np.arange(12).reshape(3, -1)
y = np.array([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
x.shape, y.shape

((3, 4), (3, 4))

In [16]:
np.concatenate([x,y], axis=0), np.concatenate([x,y], axis=1) 

(array([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [ 2.,  1.,  4.,  3.],
        [ 1.,  2.,  3.,  4.],
        [ 4.,  3.,  2.,  1.]]),
 array([[ 0.,  1.,  2.,  3.,  2.,  1.,  4.,  3.],
        [ 4.,  5.,  6.,  7.,  1.,  2.,  3.,  4.],
        [ 8.,  9., 10., 11.,  4.,  3.,  2.,  1.]]))
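
The effect of `axis` is easiest to see on the shapes. The following sketch uses standard NumPy as a stand-in for `mxnet.np` (an assumption, so that it runs without MXNet); the arrays `p` and `q` are illustrative and not part of the notebook.

```python
import numpy as onp  # assumption: plain NumPy as a stand-in for mxnet.np

p = onp.arange(12).reshape(3, 4)
q = onp.ones((3, 4))

# axis=0 stacks rows: the first dimension grows, the second must match.
rows = onp.concatenate([p, q], axis=0)
# axis=1 stacks columns: the second dimension grows, the first must match.
cols = onp.concatenate([p, q], axis=1)

print(rows.shape)  # (6, 4)
print(cols.shape)  # (3, 8)
```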

## Logical operations

In [17]:
x == y

array([[False,  True, False,  True],
       [False, False, False, False],
       [False, False, False, False]])

In [18]:
x < y

array([[ True, False,  True, False],
       [False, False, False, False],
       [False, False, False, False]])

In [19]:
x[x < y], y[x < y]

(array([0., 2.]), array([2., 4.]))

In [20]:
x > y

array([[False, False, False, False],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]])

In [21]:
x[x > y], y[x > y]

(array([ 4.,  5.,  6.,  7.,  8.,  9., 10., 11.]),
 array([1., 2., 3., 4., 4., 3., 2., 1.]))
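
The comparison results above are boolean masks, and a mask can be counted, used as an index, or used to choose between two arrays. A minimal sketch follows, using standard NumPy as a stand-in for `mxnet.np` (an assumption) and illustrative names `x_demo` and `y_demo` holding the same values as `x` and `y`.

```python
import numpy as onp  # assumption: plain NumPy as a stand-in for mxnet.np

x_demo = onp.arange(12).reshape(3, 4)
y_demo = onp.array([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

mask = x_demo < y_demo

# True counts as 1, so summing a mask counts the matching positions.
count = int(mask.sum())
print(count)  # 2

# Indexing with a mask returns the selected elements as a flat array.
selected = x_demo[mask]
print(selected)  # [0 2]

# where(mask, a, b) takes a where the mask is True and b elsewhere;
# with mask = a < b this is the elementwise minimum.
smaller = onp.where(mask, x_demo, y_demo)
print(smaller)
```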

## Operations along an axis

In [22]:
x.sum(), x.sum(axis=0), x.sum(axis=1)

(array(66.), array([12., 15., 18., 21.]), array([ 6., 22., 38.]))

In [23]:
x.mean(), x.mean(axis=0), x.mean(axis=1)

(array(5.5), array([4., 5., 6., 7.]), array([1.5, 5.5, 9.5]))
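
The `axis` argument names the dimension that is collapsed by the reduction. The sketch below illustrates this on the shapes, again with standard NumPy as a stand-in for `mxnet.np` (an assumption); `keepdims=True` is NumPy's option for retaining the collapsed axis with length 1, which is convenient for dividing each row by its own sum.

```python
import numpy as onp  # assumption: plain NumPy as a stand-in for mxnet.np

m = onp.arange(12).reshape(3, 4)

col_sums = m.sum(axis=0)  # collapses the 3 rows    -> shape (4,)
row_sums = m.sum(axis=1)  # collapses the 4 columns -> shape (3,)
print(col_sums.shape, row_sums.shape)

# keepdims=True keeps the reduced axis as length 1 -> shape (3, 1).
kept = m.sum(axis=1, keepdims=True)
print(kept.shape)

# Each row divided by its own sum; every row now sums to 1.
normalized = m / kept
print(normalized.sum(axis=1))
```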

# Broadcasting mechanism

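Under certain conditions, even when shapes differ, we can still perform elementwise operations by invoking the broadcasting mechanism. Arrays are first expanded by copying elements along axes of length 1 so that both have the same shape, and the operation is then applied elementwise: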

In [24]:
a = np.arange(3).reshape(3,1)
b = np.arange(2).reshape(1,2)
a,b

(array([[0.],
        [1.],
        [2.]]),
 array([[0., 1.]]))

In [25]:
a + b

array([[0., 1.],
       [1., 2.],
       [2., 3.]])

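For broadcasting to work, the shapes are compared axis by axis, starting from the last one: two dimensions are compatible when they are equal or when one of them is 1. Here `a` has shape `(3, 1)` and `b` has shape `(1, 2)`, so both are expanded to `(3, 2)`: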

In [26]:
a * b

array([[0., 0.],
       [0., 1.],
       [0., 2.]])

In [27]:
a = np.arange(6).reshape(3,2,1)
b = np.arange(2).reshape(1,1,2)
a, b

(array([[[0.],
         [1.]],
 
        [[2.],
         [3.]],
 
        [[4.],
         [5.]]]),
 array([[[0., 1.]]]))

In [28]:
a + b

array([[[0., 1.],
        [1., 2.]],

       [[2., 3.],
        [3., 4.]],

       [[4., 5.],
        [5., 6.]]])

In [29]:
a = np.arange(6).reshape(3,2)
b = np.arange(2).reshape(1,2)
a, b

(array([[0., 1.],
        [2., 3.],
        [4., 5.]]),
 array([[0., 1.]]))

In [30]:
a + b

array([[0., 2.],
       [2., 4.],
       [4., 6.]])

In [31]:
a = np.arange(6).reshape(3,2)
b = np.arange(3).reshape(3,1)
a, b

(array([[0., 1.],
        [2., 3.],
        [4., 5.]]),
 array([[0.],
        [1.],
        [2.]]))

In [32]:
a + b

array([[0., 1.],
       [3., 4.],
       [6., 7.]])
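
The shape of each result above can be predicted without running the operation. The following is a minimal pure-Python sketch of the usual NumPy-style rule (compare dimensions from the last axis backwards; two dimensions are compatible if they are equal or one of them is 1). The helper `broadcast_shape` is hypothetical, written only for illustration, and is not an MXNet function.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the shape produced by broadcasting, or raise ValueError."""
    result = []
    # Walk both shapes from the trailing axis; a missing axis counts as 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"incompatible dimensions {dim_a} and {dim_b}")
    return tuple(reversed(result))


# The four shape pairs used in the cells above.
print(broadcast_shape((3, 1), (1, 2)))        # (3, 2)
print(broadcast_shape((3, 2, 1), (1, 1, 2)))  # (3, 2, 2)
print(broadcast_shape((3, 2), (1, 2)))        # (3, 2)
print(broadcast_shape((3, 2), (3, 1)))        # (3, 2)
```

A pair such as `(3, 2)` and `(3,)` fails under this rule, because the trailing dimensions 2 and 3 differ and neither is 1.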

# Indexing and slicing

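Indexing and slicing work as they do for a Python list: `[-1]` selects the last element and `[1:3]` selects the second and third elements. On a two-dimensional `ndarray`, these select rows: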

In [33]:
x[-1], x[1:3]

(array([ 8.,  9., 10., 11.]),
 array([[ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]]))

In [34]:
x[1,2]=9
x

array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  9.,  7.],
       [ 8.,  9., 10., 11.]])

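To assign the same value to multiple elements, index all of them and assign once. For example, `[0:2, :]` selects the first and second rows, where `:` takes all elements along axis 1 (the columns):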

In [35]:
x[0:2, :] = 12
x

array([[12., 12., 12., 12.],
       [12., 12., 12., 12.],
       [ 8.,  9., 10., 11.]])

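# Saving memory

Running operations can cause new memory to be allocated to host results. For example, after `y = y + x`, the name `y` points to a newly allocated array rather than the original one: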

In [36]:
before = id(y)
y = y + x
id(y) == before

False

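Where possible, we should perform operations and updates in place to save memory and time. Slice notation (`z[:] = ...`) writes the result into the previously allocated array, so `id(z)` stays the same: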

In [37]:
z = np.zeros_like(y)
print('id(z):', id(z))
z[:] = x + y
print('id(z):', id(z))

id(z): 140241950092176
id(z): 140241950092176


If the value of `x` is not reused in subsequent computations, we can also use `x[:] = x + y` or `x += y` to reduce the memory overhead of the operation:

In [38]:
before = id(x)
x += y
id(x) == before

True
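
The three update styles can be compared side by side. This sketch uses standard NumPy as a stand-in for `mxnet.np` (an assumption, so that it runs without MXNet), with illustrative names `x_demo`, `y_demo`, and `z_demo`. The `out=` argument shown at the end is NumPy's way of writing a result directly into an existing array; whether a given MXNet version accepts it in the same form is not confirmed here.

```python
import numpy as onp  # assumption: plain NumPy as a stand-in for mxnet.np

x_demo = onp.ones((3, 4))
y_demo = onp.ones((3, 4))

# 1. Rebinding: the sum lands in a new array and the name moves to it.
before = id(y_demo)
y_demo = y_demo + x_demo
rebinding_allocates = id(y_demo) != before
print(rebinding_allocates)  # True

# 2. Slice assignment: the existing array is overwritten.
#    (x_demo + y_demo is still computed into a temporary first.)
z_demo = onp.zeros_like(y_demo)
before = id(z_demo)
z_demo[:] = x_demo + y_demo
slice_keeps_object = id(z_demo) == before
print(slice_keeps_object)  # True

# 3. out= argument: the result is written straight into z_demo.
onp.add(x_demo, y_demo, out=z_demo)
print(z_demo[0, 0])  # 3.0
```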

# Conversion to other Python objects

In [39]:
a = x.asnumpy()
b = np.array(a)
type(a), type(b)

(numpy.ndarray, mxnet.numpy.ndarray)

To convert a size-one ndarray to a Python scalar, we can use the `item` function or Python's built-in `float` and `int` functions:

In [40]:
a = np.array([3.5])
a, a.item(), float(a), int(a)

(array([3.5]), 3.5, 3.5, 3)
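
The "size-one" condition matters: an array with more than one element has no single scalar value. The sketch below shows this with standard NumPy as a stand-in for `mxnet.np` (an assumption, so that it runs without MXNet); the error behavior shown is NumPy's.

```python
import numpy as onp  # assumption: plain NumPy as a stand-in for mxnet.np

one = onp.array([3.5])
value = one.item()
print(type(value).__name__, value)  # float 3.5

# An array holding more than one element cannot be turned into a scalar.
try:
    onp.array([1.0, 2.0]).item()
    item_failed = False
except ValueError as err:
    item_failed = True
    print("not a scalar:", err)
```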