# NDArray Tutorial

This part of the tutorial is also available as a step-by-step notebook on [github](https://github.com/dmlc/minpy/blob/master/examples/tutorials/ndarray_tutorial.ipynb). Please try it out!

## Basic NDArray Operation

MinPy has the same syntax as NumPy, which is flexible and familiar. You only need to replace `import numpy as np` with `import minpy.numpy as np` at the top of your NumPy program. The brief introduction below illustrates this point. If you are not familiar with NumPy, you may want to look up the [NumPy Quickstart Tutorial](https://docs.scipy.org/doc/numpy-dev/user/quickstart.html) for more details.

### Array Creation 
An array can be created in multiple ways. For example, we can create an array from a regular Python list or tuple by using the `array` function:

In [1]:
import minpy.numpy as np
# import numpy as np

a = np.array([1,2,3])  # create a 1-dimensional array with a python list
b = np.array([[1,2,3], [2,3,4]])  # create a 2-dimensional array with a nested python list 

If we only know the size but not the element values, there are several functions to create arrays with initial placeholder content. 

In [2]:
a = np.zeros((2,3))    # create a 2-dimensional array full of zeros with shape (2,3)
b = np.ones((2,3))     # create an array of the same shape full of ones
c = np.full((2,3), 7)  # create an array of the same shape with all elements set to 7
d = np.empty((2,3))    # create an array of the same shape whose initial content is uninitialized and depends on the state of the memory
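
As a quick sanity check, the shapes and fill values of these arrays can be inspected. The sketch below uses plain NumPy so that it runs without MinPy installed. Per the text above, swapping the import for `import minpy.numpy as np` is expected to behave the same way, but that is an assumption that was not verified here.

```python
import numpy as np  # assumption: plain NumPy stands in for minpy.numpy

zeros = np.zeros((2, 3))
ones = np.ones((2, 3))
sevens = np.full((2, 3), 7)
uninit = np.empty((2, 3))  # contents are arbitrary; only the shape is defined

# Print the shape of each array; all four share the shape (2, 3).
for name, arr in [("zeros", zeros), ("ones", ones),
                  ("sevens", sevens), ("uninit", uninit)]:
    print(name, arr.shape)
```

Note that the values in `uninit` should never be read before they are assigned.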

### Basic Operations
Arithmetic operators on arrays apply *elementwise*. A new array is created and filled with the result.

In [3]:
a = np.ones((2,3))
b = np.ones((2,3))
c = a + b  # elementwise addition
d = - c    # elementwise negation
print(d)
e = np.sin(c**2).T  # elementwise power and sine, then transpose
print(e)
f = np.maximum(a, c)  # elementwise max
print(f)

[[-2. -2. -2.]
 [-2. -2. -2.]]
[[-0.7568025 -0.7568025]
 [-0.7568025 -0.7568025]
 [-0.7568025 -0.7568025]]
[[ 2.  2.  2.]
 [ 2.  2.  2.]]
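
The printed values can be checked by hand: every element of `c` is 2, so `c**2` is 4 and each entry of `e` is sin(4). A minimal sketch of that check, written against plain NumPy as a stand-in for `minpy.numpy` (an assumption, so that it runs without MinPy):

```python
import numpy as np  # assumption: plain NumPy stands in for minpy.numpy

a = np.ones((2, 3))
c = a + a                 # every element is 2.0
e = np.sin(c ** 2).T      # sin(4.0) everywhere; the transpose gives shape (3, 2)

print(e.shape)                       # (3, 2)
print(np.allclose(e, np.sin(4.0)))   # True
```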


### Indexing and Slicing
The slice operator `[]` applies to axis 0.

In [4]:
a = np.arange(6)
a = np.reshape(a, (3,2))
print(a[:])
# assign -1 to the 2nd row
a[1:2] = -1 
print(a)

[[ 0.  1.]
 [ 2.  3.]
 [ 4.  5.]]
[[ 0.  1.]
 [-1. -1.]
 [ 4.  5.]]


We can also slice along a particular axis with the `slice_axis` function:

In [5]:
# slice out the 2nd column
d = np.slice_axis(a, axis=1, begin=1, end=2)
print(d)

[[ 1.]
 [-1.]
 [ 5.]]
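
For comparison, plain NumPy has no `slice_axis` function; there the same column is usually taken with multi-axis slicing. The sketch below shows that NumPy form only. Whether `minpy.numpy` accepts the same indexing expression is not stated in this tutorial, so treat that as an open assumption.

```python
import numpy as np  # plain NumPy, used here only for comparison

a = np.arange(6).reshape(3, 2)
a[1:2] = -1          # same row assignment as in cell In [4]
col = a[:, 1:2]      # all rows, entries 1..2 along axis 1 (the 2nd column)

print(col)
```

Using the range `1:2` rather than the integer `1` keeps the result 2-dimensional, matching the `(3, 1)` shape printed above.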


## Lazy Evaluation

MinPy uses lazy evaluation for better performance. When we run `a=b+1` in Python, the Python thread just pushes the operation into the backend MXNet engine and then returns. This optimization has two benefits:
1. The main Python thread can continue with other computations as soon as the previous one is pushed. This is useful for frontend languages with heavy overheads.
2. It is easier for the backend engine to apply further optimizations, such as automatic parallelization.

The backend engine resolves the data dependencies and schedules the computations correctly. This is transparent to frontend users. We can explicitly call the `asnumpy` method, which copies the result data to a NumPy array and therefore waits until the computation has finished.


In [6]:
import time

def do(x, n):
    """push computation into the backend engine"""
    return [np.dot(x,x) for i in range(n)]
def wait(x):
    """wait until all results are available"""
    for y in x:
        y.asnumpy()
        
tic = time.time()
a = np.ones((1000,1000))
b = do(a, 50)
toc = time.time() - tic  # elapsed time once every operation has been pushed
print('time for all computations are pushed into the backend engine: %f sec' % toc)
wait(b)
print('time for all computations are finished: %f sec' % (time.time() - tic))

time for all computations are pushed into the backend engine: 0.082473 sec
time for all computations are finished: 2.105203 sec
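
The push-then-wait pattern can be imitated with the standard library, which may help if MinPy is not at hand. The sketch below is only an analogy: it uses a thread pool and futures, not the MXNet engine, and `slow_square` is a hypothetical stand-in for an expensive operation.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_square(x):
    """Hypothetical stand-in for an expensive backend operation."""
    time.sleep(0.05)
    return x * x

tic = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # submit() returns a Future right away, similar to pushing work to an engine.
    futures = [pool.submit(slow_square, i) for i in range(8)]
    pushed = time.perf_counter() - tic
    # result() blocks until the value is ready, similar to asnumpy() in the text.
    results = [f.result() for f in futures]
finished = time.perf_counter() - tic

print('pushed after %f sec' % pushed)
print('finished after %f sec' % finished)
```

As in the MinPy timing above, the time to push the work is expected to be much smaller than the time to finish it.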


Policy and Context are two fundamental concepts of MinPy that expose how MinPy works.

## Brief Introduction to Policy and Blacklist

In fact, MinPy integrates MXNet NDArray and NumPy into a seamless system. A single operation may have an MXNet implementation, a pure NumPy implementation, or both. MinPy uses a policy system to determine which implementation should be applied. It consists of built-in policies in `minpy.dispatch.policy` (also aliased in the `minpy` root):

- `PreferMXNetPolicy()` [**Default**]: Prefer MXNet. Use NumPy as a transparent fallback, which will be discussed below.
- `OnlyNumPyPolicy()`: Only use NumPy operations.
- `OnlyMXNetPolicy()`: Only use MXNet operations.

The procedure of the default policy, `PreferMXNetPolicy`, can be expressed naively as below:

<img src="images/PreferMXNetPolicy.png" width=50%>

However, the implementation of the same operator sometimes differs slightly between MXNet and NumPy. You will see an example in "Pitfalls when working together with NumPy". Because the number of combinations of NumPy functions and their arguments is huge, it is difficult to cover every way users may call a function.

So we designed a blacklist mechanism for you. An operator in the blacklist falls back to its NumPy implementation, and the content of the blacklist is prepared automatically when you install MinPy. The procedure of a function call under `PreferMXNetPolicy` then becomes:

<img src="images/PreferMXNetPolicyWithBlacklist.png" width=40%>
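
The decision described in the text can also be written as a few lines of Python. This is a toy sketch under stated assumptions, not MinPy's dispatch code: the function `choose_backend` and the sets `mxnet_ops`, `numpy_ops`, and `blacklist` are hypothetical names, and the operator sets are made up for illustration.

```python
def choose_backend(op_name, mxnet_ops, numpy_ops, blacklist=frozenset()):
    """Return 'mxnet' or 'numpy' for op_name under a prefer-MXNet rule.

    All names here are hypothetical; this is not MinPy's dispatch code.
    """
    # Prefer MXNet unless the operator is blacklisted.
    if op_name in mxnet_ops and op_name not in blacklist:
        return 'mxnet'
    # Otherwise fall back to NumPy when it has an implementation.
    if op_name in numpy_ops:
        return 'numpy'
    raise NotImplementedError('no implementation for %r' % op_name)

# Made-up operator sets for illustration only.
mxnet_ops = {'log', 'dot', 'normal'}
numpy_ops = {'log', 'dot', 'cosh', 'normal'}

print(choose_backend('log', mxnet_ops, numpy_ops))     # mxnet
print(choose_backend('cosh', mxnet_ops, numpy_ops))    # numpy
print(choose_backend('normal', mxnet_ops, numpy_ops, blacklist={'normal'}))  # numpy
```

In this sketch the blacklist is checked before the MXNet lookup succeeds, so a blacklisted operator always takes the NumPy path.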


Although we provide some functions for switching policy easily (see [Select Policy for Operations](https://minpy.readthedocs.io/en/latest/feature/policy.html)), you do not need to consider this unless you hit the rare situation where the current policy does not work properly, which is discussed in "Pitfalls when working together with NumPy" below.

## Brief Introduction to Context

Context represents the device information that determines where MXNet operations run. MinPy has two built-in contexts in `minpy.context`:

- `minpy.context.cpu()` [**Default**]: runs on CPU. No `device_id` is needed for the CPU context.
- `minpy.context.gpu(device_id)`: runs on the GPU specified by `device_id`. Usually `gpu(0)` is the first GPU in the system. Note that the GPU context is only available when MXNet is compiled with GPU support.

By using Context flexibly, you can achieve advanced features such as running your code on multiple devices. More details can be found in "[Select Context for MXNet Operations](https://minpy.readthedocs.io/en/latest/feature/context.html)".

## Concept of transparent fallback

Since MinPy fully integrates MXNet, it allows you to use the GPU to speed up your algorithm with only minor changes, while keeping the neat NumPy syntax you just went through. However, NumPy is a giant library with hundreds of operators. Our supported GPU operators are only a subset of them, so it is inevitable that you will want to try some functions that are currently missing on the GPU side. To solve this problem, MinPy gracefully adopts the NumPy implementation whenever an operator is missing on the GPU side, and handles the memory copies between GPU and CPU for you.



In [7]:
# First turn on the logging to know what happens under the hood.
import logging
logging.getLogger('minpy.array').setLevel(logging.DEBUG)

# x is created as an MXNet array
x = np.zeros((10, 20))


# `cosh` is currently missing in MXNet's GPU implementation,
# so `x` falls back to a NumPy array and you will see a log
# message like "Copy from MXNet array to NumPy array...". Then
# NumPy's implementation of `cosh` is called to get the
# result `y` as a NumPy array. You don't need to worry
# about the memory copy from GPU -> CPU.
y = np.cosh(x)


# `log` has an MXNet GPU implementation, so the array `y` is
# copied from a NumPy array to an MXNet array and you will see
# a log message like "Copy from NumPy array to MXNet array...".
# Once again, you don't need to worry about it. It is transparent.
z = np.log(y)


# Turn off the logging.
logging.getLogger('minpy.array').setLevel(logging.WARN)

I0928 13:57:24 11210 minpy.array:_synchronize_data:412] Copy from MXNet array to NumPy array for Array "4502572480".
I0928 13:57:24 11210 minpy.array:_synchronize_data:418] Copy from NumPy array to MXNet array for Array "4502572320".
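
To make the copy behaviour concrete without MinPy or a GPU, the toy sketch below tracks which backend "owns" an array and records a message whenever an operation runs on the other backend. It is an illustration under stated assumptions only: `ToyArray`, `apply_op`, and `copies` are hypothetical names, both backends are really plain NumPy, and no data is actually copied.

```python
import numpy as np

class ToyArray:
    """Hypothetical wrapper that remembers which backend holds its data."""
    def __init__(self, data, location):
        self.data = data
        self.location = location  # 'MXNet' or 'NumPy'

copies = []  # record of simulated cross-backend copies

def apply_op(func, x, backend):
    """Run func on x under backend, recording a copy when x lives elsewhere."""
    if x.location != backend:
        copies.append('Copy from %s array to %s array' % (x.location, backend))
    return ToyArray(func(x.data), backend)

x = ToyArray(np.zeros((10, 20)), 'MXNet')
y = apply_op(np.cosh, x, 'NumPy')   # pretend cosh only exists in NumPy
z = apply_op(np.log, y, 'MXNet')    # pretend log runs on MXNet

print('\n'.join(copies))
```

The two recorded messages mirror the order of the two log lines above: one copy toward NumPy for `cosh`, then one copy back for `log`.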


## Pitfalls when working together with NumPy

First, always import MinPy or NumPy as `np` instead of referring to MinPy or NumPy directly in your code. This gives your code maximum compatibility and lets you change the underlying framework flexibly.

Second, use immutable operations as much as possible, because mutable operations cause difficulties for our autograd system. For example:

In [8]:
a = np.zeros((2,3))
a = np.transpose(a)
# Use np.transpose(a) as above instead of a.transpose(), which works in NumPy.
# In MinPy, the method call below raises an error, since we can't calculate
# its gradient.
a.transpose()

AttributeError: 'Array' object has no attribute 'transpose'

Third, due to differences between the NumPy and MXNet interfaces, a few NumPy functions do not work properly under `PreferMXNetPolicy`. We suggest separating your code into two parts. One is non-critical code such as data preparation or visualization; just use NumPy there. The other is the performance-critical part such as the loss function or weight update; use MinPy there. Besides, you can always switch the policy at any point with `set_global_policy`, or for a single function as in the example below:

In [9]:
# Under PreferMXNetPolicy, np.random.normal is redirected to MXNet's implementation,
# which does not support arrays for mu and sigma (only scalars
# are supported right now).
def gaussian_cluster_generator(num_samples=10000, num_features=500, num_classes=5):
    mu = np.random.rand(num_classes, num_features)
    sigma = np.ones((num_classes, num_features)) * 0.1
    num_cls_samples = num_samples // num_classes  # integer division: used as a shape and slice index
    x = np.zeros((num_samples, num_features))
    y = np.zeros((num_samples, num_classes))
    for i in range(num_classes):
        # this line raises an error
        cls_samples = np.random.normal(mu[i,:], sigma[i,:], (num_cls_samples, num_features))
        x[i*num_cls_samples:(i+1)*num_cls_samples] = cls_samples
        y[i*num_cls_samples:(i+1)*num_cls_samples,i] = 1
    return x, y

gaussian_cluster_generator(10000, 500, 5)

MXNetError: Invalid Parameter format for loc expect float but value='<mxnet.ndarray.NDArray object at 0x124dbec50>'

In [10]:
import minpy

# This could be fixed once MXNet supports array arguments. For now, you
# can change the policy of this function by adding
# @minpy.wrap_policy(minpy.OnlyNumPyPolicy()) in front of the
# function.
@minpy.wrap_policy(minpy.OnlyNumPyPolicy())
def gaussian_cluster_generator_fixed(num_samples=10000, num_features=500, num_classes=5):
    mu = np.random.rand(num_classes, num_features)
    sigma = np.ones((num_classes, num_features)) * 0.1
    num_cls_samples = num_samples // num_classes  # integer division: used as a shape and slice index
    x = np.zeros((num_samples, num_features))
    y = np.zeros((num_samples, num_classes))
    for i in range(num_classes):
        cls_samples = np.random.normal(mu[i,:], sigma[i,:], (num_cls_samples, num_features))
        x[i*num_cls_samples:(i+1)*num_cls_samples] = cls_samples
        y[i*num_cls_samples:(i+1)*num_cls_samples,i] = 1
    return x, y

# outputs the right result now
gaussian_cluster_generator_fixed(10000, 500, 5)

([[ 0.3434306   0.91109807  0.81138594 ...,  0.2046174   0.65634542
    0.28674177]
  [ 0.42330647  1.12805209  0.75757002 ...,  0.39964763  0.67332741
    0.3329061 ]
  [ 0.17643755  0.98228017  0.7354436  ...,  0.21712442  0.6823966
    0.30745387]
  ..., 
  [ 0.95788697  0.22241664  0.40826581 ...,  0.74575702  0.86119581
    0.169201  ]
  [ 0.77587111  0.25159795  0.47107019 ...,  0.86087215  0.71302598
    0.46475238]
  [ 0.79927081  0.17099329  0.59225605 ...,  0.7175653   0.59175765
    0.43652729]], [[ 1.  0.  0.  0.  0.]
  [ 1.  0.  0.  0.  0.]
  [ 1.  0.  0.  0.  0.]
  ..., 
  [ 0.  0.  0.  0.  1.]
  [ 0.  0.  0.  0.  1.]
  [ 0.  0.  0.  0.  1.]])
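
The fixed version works because NumPy's normal sampler accepts arrays for the mean and standard deviation and broadcasts them against the requested output shape. The sketch below shows only that NumPy behaviour, using the `numpy.random.default_rng` generator rather than the `np.random.normal` call in the cells above. The sizes are chosen arbitrarily for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
num_features = 4
mu = rng.random(num_features)            # one mean per feature
sigma = np.full(num_features, 0.1)       # one standard deviation per feature

# loc and scale have shape (4,) and broadcast against the output shape (2000, 4).
samples = rng.normal(loc=mu, scale=sigma, size=(2000, num_features))

print(samples.shape)                                       # (2000, 4)
print(np.allclose(samples.mean(axis=0), mu, atol=0.05))    # True
```

Each column of `samples` is drawn around its own mean, which is what the per-class call in `gaussian_cluster_generator_fixed` relies on.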