# NDArray Tutorial

## Basic NDArray Operation

MinPy has the same syntax as NumPy, which is flexible and familiar. You only need to replace `import numpy as np` with `import minpy.numpy as np` at the top of your NumPy program. If you are not familiar with NumPy, you may want to read the [NumPy Quickstart Tutorial](https://docs.scipy.org/doc/numpy-dev/user/quickstart.html) first.
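
As a minimal sketch of the drop-in import, the snippet below uses only functions that appear in this tutorial (`ones`, `dot`, `asnumpy`). The `try`/`except` fallback to plain NumPy is an assumption added here so the snippet also runs where MinPy is not installed; it is not part of MinPy.

```python
# Prefer MinPy; fall back to plain NumPy if MinPy is not installed.
# The fallback is an illustrative assumption, not part of MinPy itself.
try:
    import minpy.numpy as np
except ImportError:
    import numpy as np

a = np.ones((2, 3))
b = np.ones((3, 2))
c = np.dot(a, b)  # the same call works under either import

# MinPy arrays expose `asnumpy()`; plain NumPy arrays do not, so guard the call.
result = c.asnumpy() if hasattr(c, "asnumpy") else c
print(result.shape)
```

Apart from the import line and the `asnumpy` guard, nothing in the body depends on which library is in use.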

## Lazy Evaluation

MinPy uses lazy evaluation for better performance. When we run `a = b + 1` in Python, the Python thread just pushes the operation into the backend MXNet engine and then returns. This optimization has two benefits:

1. The main Python thread can continue with other computations as soon as the previous one is pushed. This is useful for frontend languages with heavy overhead.
2. It is easier for the backend engine to explore further optimizations, such as automatic parallelization.

The backend engine resolves the data dependencies and schedules the computations correctly, and this is transparent to frontend users. To get a result, we can explicitly call the `asnumpy` method, which copies the data into a NumPy array and blocks until the computation has finished.


``` python
import minpy.numpy as np
import time

def do(x, n):
    """Push n matrix products into the backend engine."""
    return [np.dot(x, x) for _ in range(n)]

def wait(x):
    """Block until all results are available."""
    for y in x:
        y.asnumpy()

tic = time.time()
a = np.ones((1000, 1000))
b = do(a, 50)
print('time to push all computations into the backend engine: %f sec' % (time.time() - tic))
wait(b)
print('time for all computations to finish: %f sec' % (time.time() - tic))
```

```
time to push all computations into the backend engine: 0.008766 sec
time for all computations to finish: 2.235336 sec
```
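
The push-then-wait pattern can also be seen in isolation, without MinPy. The sketch below is only a loose analogy built on Python's standard thread pool, not a description of how the MXNet engine is implemented; `slow_square` and `engine` are hypothetical names chosen for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_square(x):
    """Stand-in for an expensive backend operation."""
    time.sleep(0.05)
    return x * x

with ThreadPoolExecutor(max_workers=1) as engine:
    tic = time.time()
    # Submitting work returns a handle immediately, much like `b = do(a, 50)`.
    futures = [engine.submit(slow_square, i) for i in range(5)]
    push_time = time.time() - tic

    # Asking for the values blocks until they are ready, much like `asnumpy()`.
    results = [f.result() for f in futures]
    total_time = time.time() - tic

print('time to push: %f sec' % push_time)
print('time to finish: %f sec' % total_time)
```

Submitting the tasks takes far less time than waiting for them, which is the same gap the two timings above show.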


## Concept of transparent fallback

Since MinPy fully integrates MXNet, it lets you use a GPU to speed up your algorithm with only minor changes, while keeping the neat NumPy syntax you just went through. However, NumPy is a giant library with hundreds of operators, and the GPU operators we support are only a subset of them, so sooner or later you will want a function that is currently missing on the GPU side. To solve this problem, MinPy gracefully falls back to the NumPy implementation whenever an operator is missing on the GPU side, and handles the memory copies between GPU and CPU for you.



``` python
# First turn on logging to see what happens under the hood.
import logging
logging.getLogger('minpy.array').setLevel(logging.DEBUG)

x = np.zeros((10, 20))

# `cosh` is currently missing in MXNet's GPU implementation,
# so it falls back to NumPy's CPU implementation.
# You don't need to worry about the memory copy from GPU -> CPU.
y = np.cosh(x)

# `log` has a GPU implementation, so the array is copied from CPU -> GPU.
# Once again, you don't need to worry about it. It is transparent.
z = np.log(y)

# Turn off logging.
logging.getLogger('minpy.array').setLevel(logging.WARN)
```

```
I0916 17:21:34 34295 minpy.array:_synchronize_data:412] Copy from MXNet array to NumPy array for Array "4513052704".
I0916 17:21:34 34295 minpy.array:_synchronize_data:418] Copy from NumPy array to MXNet array for Array "4513051984".
```
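
The fallback idea itself can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not MinPy's actual dispatch code: `ACCELERATED_OPS`, `dispatch`, and `trace` are hypothetical names, and the standard `math` module stands in for both the "accelerated" and the fallback implementations.

```python
import math

# Hypothetical registry of operators that have an accelerated implementation.
# Here `log` is "accelerated" and `cosh` is deliberately missing.
ACCELERATED_OPS = {"log": math.log}

def dispatch(name, value, trace):
    """Run `name` on the accelerated path if available, else fall back."""
    if name in ACCELERATED_OPS:
        trace.append((name, "accelerated"))
        return ACCELERATED_OPS[name](value)
    trace.append((name, "fallback"))
    return getattr(math, name)(value)

trace = []
y = dispatch("cosh", 0.0, trace)  # not in the registry, so it falls back
z = dispatch("log", y, trace)     # present in the registry
print(trace)
```

The caller never chooses a path explicitly; the `trace` list plays the role that the debug log plays above.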


## Pitfalls when working together with NumPy

Because the NumPy and MXNet interfaces differ, a few NumPy functions do not work properly under `PreferMXNetPolicy`. We suggest separating your code into two parts. The first is non-critical code such as data preparation or visualization; just use NumPy there. The second is performance-critical code such as the loss function or weight update; use MinPy there. In addition, you can switch policies at any point with `set_global_policy`, for example:


``` python
import minpy

def strange_func(...):
    # Switch to pure NumPy for code that can only run with NumPy.
    minpy.set_global_policy(minpy.OnlyNumpyPolicy())
    # ... pure NumPy code ...
    # Switch back so that later code prefers MXNet again.
    minpy.set_global_policy(minpy.PreferMXNetPolicy())
    return x, y
```
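
One possible refinement is to wrap the switch in a context manager, so the original policy is restored even if the NumPy-only code raises an exception. The sketch below is an assumption-laden illustration, not part of MinPy: `temporary_policy` is a hypothetical helper, the policy values are placeholder strings, and `history.append` stands in for the real setter so the code runs without MinPy.

```python
from contextlib import contextmanager

@contextmanager
def temporary_policy(set_policy, inner, outer):
    """Switch to `inner` for the duration of the block, then restore `outer`."""
    set_policy(inner)
    try:
        yield
    finally:
        # Runs even if the block raises, so the policy is always restored.
        set_policy(outer)

# Stand-in for the real policy setter; records each call instead.
history = []

with temporary_policy(history.append, "only_numpy", "prefer_mxnet"):
    history.append("pure NumPy code runs here")

print(history)
```

With MinPy, one would presumably pass `minpy.set_global_policy` and the two policy objects from the snippet above in place of the stand-ins; that usage is untested here.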
