<div class="licence">
<span>Licence CC BY-NC-ND</span>
<span>Valérie Roy</span>
<span><img src="../media/ensmp-25-alpha.png" /></span>
</div>

In [1]:
import numpy as np

# vectorized operations and UFunc
(Universal Function)

## never use a python-loop on *numpy.ndarray* !

when you apply a **function** to **each** element of a *numpy.ndarray*:
   - **never** use a *python* **loop** !
   - **always** use the **vectorized** version of the **function**

**why ?**
   - for the sake of **computation time**
   - iterative (python-loop) versions are **slower**, often by orders of magnitude
   - *numpy* provides **optimized functions for its numeric types**

**there is no magic !**

   - the **loop** is simply done in the **underlying library** (in C)
   - but it is **much faster**

*numpy* vectorized functions are called **UFuncs** (universal functions)
   - see https://docs.scipy.org/doc/numpy/reference/ufuncs.html
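
As a minimal sketch of what this means in practice, the two computations below produce the same values; only the place where the loop runs differs. The names `squares_loop` and `squares_vec` are illustrative and are not part of the original notebook.

```python
import numpy as np

a = np.arange(1, 6)

# python loop: the interpreter handles each element, one at a time
squares_loop = np.empty(a.shape, dtype=a.dtype)
for i in range(a.shape[0]):
    squares_loop[i] = a[i] ** 2

# vectorized: the loop over the elements runs inside numpy's compiled code
squares_vec = np.square(a)

print(squares_loop)
print(squares_vec)
print(np.array_equal(squares_loop, squares_vec))
```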

## magic function

   - using the **magic function** *%timeit*
   - we will compute the **execution time** of functions
   - to **get an idea** of what's going on
   
   
   - but **never** deduce **hard-and-fast rules** from **execution times**
   - (too many parameters are **at play**: machine, library versions, array size, data type...)

the **magic function** *%timeit*

   - it measures **execution time** of **small code snippets**
   - note that it won't be **relevant** on a **small** number of elements
   - the array must be **"big enough"** for the **computation time** to be **relevant**
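
*%timeit* is only available inside IPython or Jupyter. Outside a notebook, a comparable measurement can be sketched with the standard library module *timeit*; the repetition counts and the name `best_per_call` below are arbitrary choices for illustration, not values taken from this notebook.

```python
import timeit
import numpy as np

a = np.arange(1, 10000)

# run the statement 1000 times per measurement, and take 5 measurements
durations = timeit.repeat(lambda: a**2, number=1000, repeat=5)

# keep the best measurement and convert it to seconds per call
best_per_call = min(durations) / 1000
print(f"{best_per_call * 1e6:.2f} µs per call")
```

The printed value depends on the machine, so it should be read as an order of magnitude only.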

## computing execution time

we **raise** the elements of *a* to the **power** of *2* in 5 **different** ways

In [5]:
a = np.arange(1, 10000) 

   1. with a **python loop** by creating a **python-list**
   1. with a **python loop** by creating a **numpy.ndarray**
   1. with a **python comprehension**
   1. with a **vectorized numpy operation**
   1. with a **vectorized numpy function**

1) with a **python loop** by creating a **python-list**

In [9]:
l = []

In [12]:
# note: l keeps growing across the repeated %timeit runs, only the timing matters here
%timeit for e in a: l.append(e**2)

3.65 ms ± 77.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


2) with a python **loop**, by creating a **numpy.ndarray**

In [13]:
l = np.empty(a.shape)

In [14]:
%timeit for i in np.arange(0, a.shape[0]): l[i] = a[i]**2

6.3 ms ± 97.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


3) with a python **comprehension**

In [15]:
%timeit [e**2 for e in a]

3.21 ms ± 53.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


4) with a **vectorized** *numpy* operation

In [16]:
%timeit a**2

5.36 µs ± 97 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


5) with a **vectorized** *numpy* function

In [17]:
%timeit np.power(a, 2)

23.9 µs ± 877 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


**conclusion**
   - vectorized operations and functions are **much** faster (here, by a factor of several hundred) !
   - never use a **python loop** on a *numpy.ndarray*

## classical operators **are** **UFuncs**
**operators**, applied to *numpy.ndarray*, are **mapped** to their *numpy* **UFunc** counterparts


| operator | numpy function    |
|----------|-------------------|
|   $+$    | *numpy.add* |
|   $-$    | *numpy.subtract* |
|   $*$    | *numpy.multiply* |
|   $/$    | *numpy.divide* |
|   $//$   | *numpy.floor_divide* |
|   $\%$   | *numpy.mod* |
|   $**$   | *numpy.power* |

## adding two *numpy.ndarray* element by element

In [22]:
a = np.array([1, 2, 3, 4, 5])
b = np.array([10, 20, 30, 40, 50])

In [23]:
a + b

array([11, 22, 33, 44, 55])

In [26]:
a = np.arange(1, 100000)
b = np.arange(1, 100000)
%timeit a + b

62 µs ± 4.46 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [27]:
a = np.arange(1, 100000)
b = np.arange(1, 100000)
%timeit np.add(a, b)

57.3 µs ± 1.99 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


## adding *python lists* element by element

in python, *+* on **lists** is **concatenation**; to add them **element by element**, use *numpy.add*

In [39]:
c = [1, 2, 3, 4, 5]
d = [10, 20, 30, 40, 50]

In [40]:
c + d # concatenation of the two lists !

[1, 2, 3, 4, 5, 10, 20, 30, 40, 50]

In [41]:
np.add(c, d)

array([11, 22, 33, 44, 55])

In [42]:
a = list(range(1, 100000))
b = list(range(1, 100000))
%timeit np.add(a, b)

12.1 ms ± 319 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


**an *add* function that works for both python lists and numpy ndarrays**

In [43]:
def add(x, y):
    # np.add accepts any array-like input (lists are converted to arrays)
    return np.add(x, y)

add(c, d)

array([11, 22, 33, 44, 55])

In [44]:
a = list(range(1, 100000))
b = list(range(1, 100000))
%timeit add(a, b)

12.7 ms ± 149 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [46]:
# faster for numpy.ndarrays !
a = np.arange(1, 100000)
b = np.arange(1, 100000)
%timeit add(a, b)

87 µs ± 33.7 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
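
A plausible reading of the timings above is that, when given python lists, *numpy.add* first has to convert them into arrays, and that this conversion dominates the cost. The sketch below only shows that the result is an array in both cases and that the conversion can be done once up front with *numpy.asarray*; the names `c_arr` and `d_arr` are illustrative.

```python
import numpy as np

c = [1, 2, 3, 4, 5]
d = [10, 20, 30, 40, 50]

# the lists are converted internally, the result is a numpy.ndarray
result = np.add(c, d)
print(type(result))

# converting once up front avoids paying for the conversion on every call
c_arr = np.asarray(c)
d_arr = np.asarray(d)
print(np.add(c_arr, d_arr))
```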


## there are many other **UFuncs**

| category         | numpy function    |
|------------------|-------------------|
| comparison       | *numpy.greater*, *numpy.less*, *numpy.equal*, ... |
| absolute value   | *numpy.absolute* or *numpy.abs* |
| trigonometry     | *numpy.sin*, *numpy.cos*, ... |
| exponentiation   | *numpy.exp*, *numpy.exp2*, ... |
| logarithm        | *numpy.log*, *numpy.log2*, *numpy.log10* |
| floating point   | *numpy.isinf*, ... |
| not a number     | *numpy.isnan* |
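
As a short illustration, a few of these functions are applied below to a small array; the array `x` is made up for the example and contains a *nan* and an *inf* on purpose.

```python
import numpy as np

x = np.array([-1.5, 0.0, 2.0, np.nan, np.inf])

print(np.absolute(x))     # absolute value of each element
print(np.greater(x, 0))   # element-wise comparison, same as x > 0
print(np.isnan(x))        # True only where the element is nan
print(np.isinf(x))        # True only where the element is infinite
print(np.exp2(np.array([0, 1, 2, 3])))   # 2 raised to each element
```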



## checking if a function is a **UFunc**

a **UFunc** is a *numpy.ufunc* object, so its **type** is *numpy.ufunc*

In [57]:
type(np.sum) # numpy.sum is not a UFunc

function

In [58]:
type(np.add) # numpy.add is a UFunc

numpy.ufunc
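
The same check can be written with *isinstance*, which is convenient when testing several functions at once. The helper `is_ufunc` below is a hypothetical name introduced for this sketch.

```python
import numpy as np

def is_ufunc(f):
    """Return True if f is a numpy universal function."""
    return isinstance(f, np.ufunc)

print(is_ufunc(np.add))   # True
print(is_ufunc(np.sum))   # False
print(is_ufunc(np.sin))   # True

# a UFunc also knows how many inputs and outputs it takes
print(np.add.nin, np.add.nout)
```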

ask for help !

In [59]:
# help(np.add)

In [60]:
# np.add?

## conclusion
   - **writing** code using **vectorization** can be **harder** than **loop-based** python **code**
   - but for the sake of **time performance** you **cannot** avoid it
   
   
   - it is just **another way** to **think about** your problem
   - you might need to **use** **different** algorithms or **invent** **new** ones
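
As one small example of this change of mindset, a loop containing an *if* can often be rewritten as a single expression on the whole array. The sketch below replaces negative values by 0, first with a loop, then with *numpy.where*; the task and the names `clipped_loop` and `clipped_vec` are chosen for illustration only.

```python
import numpy as np

x = np.array([-2, -1, 0, 1, 2])

# loop-based thinking: decide element by element
clipped_loop = []
for e in x:
    if e < 0:
        clipped_loop.append(0)
    else:
        clipped_loop.append(e)

# vectorized thinking: build a boolean mask, then select on the whole array
clipped_vec = np.where(x < 0, 0, x)

print(clipped_loop)
print(clipped_vec)
```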