# NumPy

In [1]:
import os
import numpy as np

## Loops vs. vectorization

Let's define two arrays

In [2]:
np.random.seed(123)
arr1 = np.random.randint(low=1, high=100, size=(1000,1000))
arr2 = np.random.randint(low=1, high=100, size=(1000,1000))

We want to compute, element by element, $$10 \times \frac{\text{arr1}}{\text{arr2}}$$

We can do it with explicit Python loops, which is inefficient

In [3]:
%%time
res = np.zeros((1000,1000))
for i in np.arange(1000):
    for j in np.arange(1000):
        res[i,j] = 10 * arr1[i,j] / arr2[i,j]

CPU times: user 894 ms, sys: 15.5 ms, total: 909 ms
Wall time: 885 ms


Or we can do it with vectorization. In NumPy, vectorization means performing an operation on entire arrays at once, without writing explicit Python loops. It makes the code both faster and simpler.

Try replacing the previous cell with a vectorized version

In [12]:
%%time
res = 10 * arr1 / arr2

CPU times: user 11.4 ms, sys: 2.98 ms, total: 14.3 ms
Wall time: 12.3 ms


It's much more efficient: about 12 ms of wall time instead of 885 ms, roughly 70 times faster on this run
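As a sanity check, it can help to confirm that the loop and the vectorized expression produce the same values. The sketch below does this on two tiny hand-written arrays; `a`, `b`, and their values are illustrative assumptions, not the notebook's `arr1` and `arr2`.

```python
import numpy as np

# Two small illustrative arrays (hypothetical values, chosen for readability)
a = np.array([[10, 20], [30, 40]])
b = np.array([[2, 4], [5, 8]])

# Vectorized: the expression is applied to every element at once
vectorized = 10 * a / b

# Equivalent explicit loop, for comparison
looped = np.zeros(a.shape)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        looped[i, j] = 10 * a[i, j] / b[i, j]

print(vectorized)
print(np.array_equal(vectorized, looped))
```

Note that `/` on integer arrays performs true division, so the result is a float array even though both inputs hold integers.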

Now let's take a look at the case with if-else statements

In [13]:
%%time
# Iterating with loops
res = np.zeros((1000,1000))
for i in np.arange(1000):
    for j in np.arange(1000):
        if arr1[i,j] < 10 and arr2[i,j] < 10:
            res[i,j] = 200 - arr1[i,j] * arr2[i,j]
        elif arr1[i,j] > 50:
            res[i,j] = 200 - arr1[i,j] 
        else:
            res[i,j] = arr2[i,j] * arr2[i,j]

CPU times: user 1.47 s, sys: 22.1 ms, total: 1.49 s
Wall time: 1.46 s


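Try doing the same with vectorization

In [14]:
%%time
# Start from the `else` branch, then overwrite in order of increasing priority,
# so the first `if` condition is applied last and takes precedence.
res = arr2 * arr2
res = np.where(arr1>50, 200 - arr1, res)
res = np.where((arr1<10) & (arr2<10), 200 - arr1 * arr2, res)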

CPU times: user 23.6 ms, sys: 11.1 ms, total: 34.7 ms
Wall time: 32.2 ms
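For longer if/elif/else chains, `np.select` is one possible alternative to chaining `np.where` calls: it takes a list of conditions and a list of choices, and the first matching condition wins, which mirrors the order of the branches. The sketch below is a minimal illustration on small random arrays (`a`, `b`, and the seed are illustrative assumptions, not the notebook's `arr1` and `arr2`) and checks that it matches the chained `np.where` version.

```python
import numpy as np

# Small illustrative inputs (the seed and shape are arbitrary assumptions)
rng = np.random.default_rng(0)
a = rng.integers(1, 100, size=(50, 50))
b = rng.integers(1, 100, size=(50, 50))

# Conditions are listed in the same order as the if / elif / else branches.
# The last condition is always True and plays the role of the `else` branch.
conditions = [
    (a < 10) & (b < 10),
    a > 50,
    np.ones(a.shape, dtype=bool),
]
choices = [
    200 - a * b,
    200 - a,
    b * b,
]
res_select = np.select(conditions, choices)

# Chained np.where, lowest priority first
res_where = b * b
res_where = np.where(a > 50, 200 - a, res_where)
res_where = np.where((a < 10) & (b < 10), 200 - a * b, res_where)

print(np.array_equal(res_select, res_where))
```

Both versions evaluate every branch for the whole array before choosing, so they trade some extra arithmetic for the absence of a Python-level loop.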


## Use `np.vectorize`

`np.vectorize()` wraps a custom Python function so that it can be applied element-wise to arrays

Let's define a function that works on two scalars

In [15]:
def myfunc(a, b):
    if a < 10 and b < 10:
        return 200 - a * b
    elif a > 50:
        return 200 - a 
    else:
        return b * b

In [16]:
vfunc = np.vectorize(myfunc)

Apply the function

In [17]:
%%time
res = vfunc(arr1,arr2)

CPU times: user 234 ms, sys: 37 ms, total: 271 ms
Wall time: 269 ms
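In the timings above, `np.vectorize` (269 ms) is faster than the explicit loops (1.46 s) but clearly slower than the `np.where` version (32.2 ms). This is expected: the NumPy documentation describes `np.vectorize` as a convenience whose implementation is essentially a for loop, not a performance tool. The sketch below checks, on two tiny hand-written arrays (`a` and `b` are illustrative values, not the notebook's data), that it gives the same result as the `np.where` approach.

```python
import numpy as np

def myfunc(a, b):
    # Same branching logic as the function defined in the notebook
    if a < 10 and b < 10:
        return 200 - a * b
    elif a > 50:
        return 200 - a
    else:
        return b * b

vfunc = np.vectorize(myfunc)

# Small illustrative inputs covering each branch
a = np.array([[5, 60], [20, 3]])
b = np.array([[4, 7], [6, 50]])

# Element-wise application of the Python function
res_vectorize = vfunc(a, b)

# The same logic with np.where, lowest priority first
res_where = b * b
res_where = np.where(a > 50, 200 - a, res_where)
res_where = np.where((a < 10) & (b < 10), 200 - a * b, res_where)

print(res_vectorize)
print(np.array_equal(res_vectorize, res_where))
```

When the logic can be written with array operations such as `np.where`, that is usually the faster choice; `np.vectorize` is mainly useful when the function is hard to express that way.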
