<div class="licence">
<span>Licence CC BY-NC-ND</span>
<span>Valérie Roy</span>
<span><img src="media/ensmp-25-alpha.png" /></span>
</div>

In [None]:
import numpy as np

# vectorized operations and UFunc
(Universal Function)

## never use a python loop on a *numpy.ndarray*

when you apply a **function** to **each** element of a *numpy.ndarray*:
   - **never** use a *python* **loop** !
   - **always** use the **vectorized** version of the **function**

**why ?**
   - for the sake of **computation time**
   - iterative versions are almost always **slower**
   
   - *numpy* provides **optimized functions for its numeric types**

**there is no magic !**

   - the **loop** is simply done in the **underlying library** (in C)
   - but it is **much faster**

*numpy* vectorized functions are called **UFuncs** (universal functions)
   - see https://docs.scipy.org/doc/numpy/reference/ufuncs.html
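
a minimal sketch of the idea above, comparing a python loop with a UFunc on a tiny array; the names *looped*, *vectorized* and *same* are chosen here for illustration only

```python
import math
import numpy as np

a = np.arange(1, 6)  # the integers 1 to 5

# python loop: one call to math.sqrt per element
looped = [math.sqrt(e) for e in a]

# UFunc: a single call, the loop over the elements runs in compiled code
vectorized = np.sqrt(a)

# both approaches compute the same values
same = np.allclose(looped, vectorized)
print(same)
```

only the place where the loop runs changes, not the result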

## magic function

   - using the **magic function** *%timeit*
   - we will compute the **execution time** of functions
   - to **get an idea** of what's going on
   
   
   - but **never** deduce **hard-and-fast rules** from **execution times**
   - (too many parameters are **at play**)

the **magic** function *%timeit*

   - it measures the **execution time** of **small code snippets**
   - note that the measurement is not **relevant** on a **small** number of elements
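
*%timeit* only exists inside IPython or a notebook; outside of them, a comparable measurement can be sketched with the standard *timeit* module (the variable names and the choice of 3 repetitions are assumptions made for this example)

```python
import timeit
import numpy as np

a = np.arange(1, 10000)

# each measurement runs the statement 5 times (like -n 5),
# and the measurement itself is repeated 3 times
times = timeit.repeat(lambda: a**2, number=5, repeat=3)

# keep the best measurement and divide by the number of runs
best_per_run = min(times) / 5
print(f"{best_per_run * 1e6:.1f} microseconds per run")
```

the figures depend on the machine and on its load, so they only give an order of magnitude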

## computing execution time

we **raise** each element of an array *a* to the **power** of *2*

the array must be **"big enough"** for the **computation time** to be **relevant**

In [None]:
a = np.arange(1, 10000) 

   1. with a **python loop** by creating a **python-list**
   1. with a **python loop** by creating a **numpy.ndarray**
   1. with a **python comprehension**
   1. with a **vectorized numpy operation**
   1. with a **vectorized numpy function**

1) with a **python loop** by creating a **python-list**

In [None]:
l = []

In [None]:
# l is shared by every run, so it keeps growing during the measurement
%timeit -n 5 for e in a: l.append(e**2)

2) with a python **loop**, by creating a **numpy.ndarray**

In [None]:
# preallocate the result with the same integer type as a
l = np.empty(a.shape, dtype=a.dtype)
%timeit -n 5 for i in range(a.shape[0]): l[i] = a[i]**2

3) with a python **comprehension**

In [None]:
%timeit -n 5 [e**2 for e in a]

4) with a **vectorized** *numpy* operation

In [None]:
%timeit -n 5 a**2

5) with a **vectorized** *numpy* function

In [None]:
%timeit -n 5 np.power(a, 2)

**conclusion**
   - vectorized operations and functions are **much** faster !
   - never use a **python loop** on a *numpy.ndarray*
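
the timings only compare speed; as a sanity check, here is a sketch verifying that the five approaches compute the same values (the *squares_...* names are introduced for this example)

```python
import numpy as np

a = np.arange(1, 10000)

# 1) python loop filling a python list
squares_list = []
for e in a:
    squares_list.append(e**2)

# 2) python loop filling a preallocated numpy.ndarray
squares_array = np.empty(a.shape, dtype=a.dtype)
for i in range(a.shape[0]):
    squares_array[i] = a[i]**2

# 3) python comprehension
squares_comp = [e**2 for e in a]

# 4) vectorized operation and 5) vectorized function
squares_op = a**2
squares_func = np.power(a, 2)

# compare every result with the vectorized operation
others = (squares_list, squares_array, squares_comp, squares_func)
all_equal = all(np.array_equal(squares_op, other) for other in others)
print(all_equal)
```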

## classical operators **are** **UFuncs**
   - when applied to *numpy.ndarray*
   - classical **operators** are **mapped** to their *numpy* **UFunc** counterparts


| operator | numpy function    |
|----------|-------------------|
|   $+$    | *numpy.add* |
|   $-$    | *numpy.subtract* |
|   $-$ (unary) | *numpy.negative* |
|   $*$    | *numpy.multiply* |
|   $/$    | *numpy.divide* |
|   $//$   | *numpy.floor_divide* |
|   $\%$   | *numpy.mod* |
|   $**$   | *numpy.power* |
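
a short sketch checking the table on two small arrays: each operator is paired with its UFunc and the two results are compared (the dictionary *pairs* is only a convenience for this example)

```python
import numpy as np

a = np.array([7, 8, 9])
b = np.array([2, 3, 4])

# (result with the operator, result with the UFunc)
pairs = {
    "+": (a + b, np.add(a, b)),
    "-": (a - b, np.subtract(a, b)),
    "unary -": (-a, np.negative(a)),
    "*": (a * b, np.multiply(a, b)),
    "/": (a / b, np.divide(a, b)),
    "//": (a // b, np.floor_divide(a, b)),
    "%": (a % b, np.mod(a, b)),
    "**": (a ** b, np.power(a, b)),
}

for name, (with_operator, with_ufunc) in pairs.items():
    print(name, np.array_equal(with_operator, with_ufunc))
```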

## adding two *numpy.ndarray* element by element

In [None]:
a = np.array([1, 2, 3, 4, 5])
b = np.array([10, 20, 30, 40, 50])

In [None]:
a + b

In [None]:
a = np.arange(1, 100000)
b = np.arange(1, 100000)
%timeit -n 5 a + b

In [None]:
a = np.arange(1, 100000)
b = np.arange(1, 100000)
%timeit -n 5 np.add(a, b)

## adding *python lists* element by element

on python **lists**, *+* is **concatenation**  
to add two lists **element by element**, you can use *numpy.add*

In [None]:
c = [1, 2, 3, 4, 5]
d = [10, 20, 30, 40, 50]

In [None]:
c + d # concatenation of the two lists !

In [None]:
np.add(c, d)

In [None]:
a = list(range(1, 100000))
b = list(range(1, 100000))
%timeit -n 5 np.add(a, b)

**an *add* function for both python lists and numpy ndarrays**

In [None]:
def add (x, y):
    return np.add(x, y)
add(c, d)

In [None]:
a = list(range(1, 100000))
b = list(range(1, 100000))
%timeit -n 5 add(a, b)

In [None]:
a = np.arange(1, 100000)
b = np.arange(1, 100000)
%timeit -n 5 add(a, b)

## there are many other **UFuncs**

| category         | numpy function    |
|------------------|-------------------|
| comparison       | *numpy.greater*, *numpy.less*, *numpy.equal*, ... |
| absolute value   | *numpy.absolute* or *numpy.abs* |
| trigonometry     | *numpy.sin*, *numpy.cos*, ... |
| exponentiation   | *numpy.exp*, *numpy.exp2*, ... |
| logarithm        | *numpy.log*, *numpy.log2*, *numpy.log10* |
| floating point   | *numpy.isinf*, ... |
| not a number     | *numpy.isnan* |
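
a few of the UFuncs from the table above applied to small arrays, as an illustration (the arrays *values* and *special* are made up for this example)

```python
import numpy as np

values = np.array([-2.0, 0.0, 3.0])

# comparison, element by element (same as values > 0)
positive = np.greater(values, 0)

# absolute value of each element
magnitudes = np.absolute(values)

special = np.array([1.0, np.nan, np.inf])

# boolean masks locating the nan and the infinite elements
nan_mask = np.isnan(special)
inf_mask = np.isinf(special)

print(positive, magnitudes, nan_mask, inf_mask)
```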



## checking if a function is a **UFunc**

a **UFunc** is a *numpy.ufunc* object, so its **type** is *numpy.ufunc*  
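
the same check can be written with *isinstance*; the helper *is_ufunc* below is a hypothetical name introduced for this sketch, it is not part of *numpy*

```python
import numpy as np

def is_ufunc(func):
    """Return True when func is a numpy universal function."""
    return isinstance(func, np.ufunc)

print(is_ufunc(np.add))  # numpy.add is a UFunc
print(is_ufunc(np.sum))  # numpy.sum is a regular function
```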


ask for  **help**
   - *help(np.sum)*
   - *numpy.sum?*
   - *numpy.info(numpy.sum)*

In [None]:
type(np.sum) # numpy.sum is not a UFunc

In [None]:
type(np.add) # numpy.add is a UFunc

In [None]:
# help(np.add)

In [None]:
# np.add?

In [None]:
# np.info(np.add)

## conclusion
   - **writing** code using **vectorization** can be **harder** than **loop-based** python **code**
   - but for the sake of **time performance** you **cannot** avoid it
   
   
   - it is just **another way** to **think** about your problem
   - you might need to **use** **different** algorithms or **invent** **new** ones
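
as an illustration of this change of mindset, here is a sketch where a loop with a test on each element is replaced by a boolean mask and *numpy.where*; the task (replacing negative values by 0) is an example chosen here, not taken from the course

```python
import numpy as np

a = np.array([-3, 1, -2, 5, 0])

# loop-based version: decide element by element
clipped_loop = []
for e in a:
    if e < 0:
        clipped_loop.append(0)
    else:
        clipped_loop.append(e)

# vectorized version: a < 0 builds a boolean mask,
# numpy.where picks 0 where the mask is True and a elsewhere
clipped_vec = np.where(a < 0, 0, a)

print(clipped_vec)
```

the condition is no longer tested inside a loop, it is computed once for the whole array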