<div class="licence">
<span>Licence CC BY-NC-ND</span>
<span>Valérie Roy</span>
<span><img src="media/ensmp-25-alpha.png" /></span>
</div>

## 5) functions that aggregate array values in *numpy*

   - i.e. **combine** the values of the array

#### classic functions:
   - *numpy.sum*, *numpy.prod*
   - *numpy.mean*, *numpy.std*, *numpy.var*
   - *numpy.min*, *numpy.max*
   - *numpy.median*, *numpy.percentile*
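
As a quick illustration of the functions listed above, here is a minimal sketch on a small hand-made array (the array `a` and the variable names are chosen for this example only), so that each result can be checked by hand.

```python
import numpy as np

# a small fixed array, easy to verify by hand
a = np.array([[1, 2, 3],
              [4, 5, 6]])

total = np.sum(a)            # 1 + 2 + ... + 6 = 21
product = np.prod(a)         # 1 * 2 * ... * 6 = 720
mean = np.mean(a)            # 21 / 6 = 3.5
std = np.std(a)              # standard deviation of the 6 values
var = np.var(a)              # variance, i.e. std squared
smallest = np.min(a)         # 1
largest = np.max(a)          # 6
median = np.median(a)        # 3.5
p50 = np.percentile(a, 50)   # the 50th percentile is the median

print(total, product, mean, smallest, largest, median, p50)
```

Without an `axis` argument, each of these functions combines all the values of the array into a single number.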
   
example of one operation (here the sum)   

### 1) summing an **array** along **axis**

In [None]:
import numpy as np

In [None]:
a = np.random.randint(0, 10, size=(3, 4))
a

   - we have an array of $3$ rows and $4$ columns

**axis = 0** 
   - we sum along axis $0$, the **rows** axis
   - i.e. the rows are added together and we obtain one sum per **column** (a result of shape $(4,)$)

In [None]:
np.sum(a, axis = 0)

**axis = 1**
   - we sum along axis $1$, the **columns** axis
   - i.e. the columns are added together and we obtain one sum per **row** (a result of shape $(3,)$)

In [None]:
np.sum(a, axis = 1)
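
Because the array above is random, the sums change at every run. The sketch below uses a fixed array instead (`np.arange(12).reshape(3, 4)`, chosen for this example only) so that the effect of each axis can be checked by hand.

```python
import numpy as np

# fixed 3 x 4 array:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
a = np.arange(12).reshape(3, 4)

# axis=0: the row index disappears, one sum per column
col_sums = np.sum(a, axis=0)
print(col_sums, col_sums.shape)   # [12 15 18 21] (4,)

# axis=1: the column index disappears, one sum per row
row_sums = np.sum(a, axis=1)
print(row_sums, row_sums.shape)   # [ 6 22 38] (3,)
```

A useful rule of thumb: the axis given to `np.sum` is the one that is removed from the shape of the result.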

### 2) summing a group of arrays along **axis**

   - we have $2$ arrays of $3$ rows and $4$ columns 

In [None]:
a = np.random.randint(0, 50, size=(2, 3, 4))
a

  - summing on **axis** $0$ is **summing** the $2$ arrays (of the **axis** $0$) together

In [None]:
np.sum(a, axis=0) # we sum the arrays

   - summing along **axis** $1$ is summing along the **rows** of each array 
   - i.e. we obtain one **row** per array
   - they form a new array

In [None]:
np.sum(a, axis=1)

   - summing along **axis** $2$ is summing along the **columns** of each array 
   - i.e. we obtain one **column** per array
   - they form a new array

In [None]:
np.sum(a, axis=2)
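
To make the three cases above concrete, here is a minimal sketch on a fixed array of shape $(2, 3, 4)$ (built with `np.arange`, an assumption of this example) that checks the shape of each result and compares it with an explicit sum of slices.

```python
import numpy as np

# 2 arrays of 3 rows and 4 columns, with values 0..23
a = np.arange(24).reshape(2, 3, 4)

# axis=0: the 2 arrays are added together -> shape (3, 4)
s0 = np.sum(a, axis=0)
same_as_slices_0 = np.array_equal(s0, a[0] + a[1])

# axis=1: in each array the 3 rows are added together -> shape (2, 4)
s1 = np.sum(a, axis=1)
same_as_slices_1 = np.array_equal(s1, a[:, 0, :] + a[:, 1, :] + a[:, 2, :])

# axis=2: in each array the 4 columns are added together -> shape (2, 3)
s2 = np.sum(a, axis=2)

print(s0.shape, s1.shape, s2.shape)   # (3, 4) (2, 4) (2, 3)
```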

### 3) summing over all the elements

In [None]:
np.sum(a) # sum of all the elements
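
As a sanity check, the global sum should equal what we get by summing one axis after the other. A minimal sketch, using a fixed array chosen for this example:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)    # values 0..23

total = np.sum(a)                     # 0 + 1 + ... + 23 = 276

# summing axis 0 three times in a row removes the three axes one by one
step_by_step = np.sum(np.sum(np.sum(a, axis=0), axis=0), axis=0)

print(total, step_by_step, a.sum())   # 276 276 276
```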

### 4) summing in presence of *numpy.nan* values

#### classic functions with their **NaN-safe** counterpart:
   - *numpy.nansum*, *numpy.nanprod*
   - *numpy.nanmean*, *numpy.nanstd*, *numpy.nanvar*
   - *numpy.nanmin*, *numpy.nanmax*
   - *numpy.nanmedian*, *numpy.nanpercentile*

   - **NaN** is a **float** value, so the array must be of a float type

In [None]:
a = np.random.randint(0, 50, size=(2, 3, 4)).astype(float)
a

we insert some **NaN** values

In [None]:
a[0, 1, 0] = np.nan
a[0, 2, 2] = np.nan
a[0, 0, 3] = np.nan

a[1, 0, 0] = np.nan
a[1, 1, 3] = np.nan
a[1, 2, 2] = np.nan
a

with **normal** operations **NaN** is **dominant**

In [None]:
np.sum(a) # np.nan is dominant - contagious

we can **treat** **NaNs** as **zero**

In [None]:
np.nansum(a) # np.nan values are treated as 0

   - on **axis=0** 

In [None]:
np.nansum(a, axis=0) # summing the two arrays together

In [None]:
np.nansum(a, axis=1) # summing the rows together in each array

In [None]:
np.nansum(a, axis=2) # summing the columns together in each array
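
The sketch below repeats these steps on a tiny fixed array (chosen for this example) so that the results can be verified by hand. It also shows a point worth keeping in mind: *numpy.nansum* behaves as if **NaN** were zero, whereas *numpy.nanmean* ignores the **NaN** entries, so they do not count in the number of values.

```python
import numpy as np

a = np.array([[1.0, np.nan],
              [3.0, 4.0]])

plain = np.sum(a)                # nan: a single NaN contaminates the sum
safe = np.nansum(a)              # 1 + 3 + 4 = 8.0
per_column = np.nansum(a, axis=0)   # [4. 4.]
per_row = np.nansum(a, axis=1)      # [1. 7.]

# nanmean divides by the number of non-NaN values (3), not by a.size (4)
safe_mean = np.nanmean(a)        # 8 / 3

print(plain, safe, per_column, per_row, safe_mean)
```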

### 5) index of min and max values
   - *numpy.argmax*, *numpy.argmin*

the index is given in the **flattened** array

In [None]:
a = np.random.randint(0, 100, 30).reshape(5, 6)
a

In [None]:
np.min(a) # or a.min()

In [None]:
np.argmin(a) # or a.argmin()

In [None]:
a.flatten().argmin()
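
Since the index refers to the flattened array, it often has to be converted back to a (row, column) position. A minimal sketch, assuming a small fixed array chosen for this example, using *numpy.unravel_index* for the conversion:

```python
import numpy as np

a = np.array([[7, 3, 9],
              [4, 1, 8]])

flat_index = np.argmin(a)            # 4: position of the value 1 in [7 3 9 4 1 8]

# convert the flat index into one index per axis
row, col = np.unravel_index(flat_index, a.shape)   # row 1, column 1
value = a[row, col]                  # 1, the minimum

# with an axis argument, argmin returns one index per column (or per row)
per_column = np.argmin(a, axis=0)    # [1 1 1]: each column's minimum is in row 1

print(flat_index, (row, col), value, per_column)
```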

## xxx) tests on all values
   - *numpy.all* returns *True* if **all** values are *True*
   - *numpy.any* returns *True* if **any** value is *True*
   - *np.where(cond, x, y)* returns *x* or *y* depending on the condition
   - (https://docs.scipy.org/doc/numpy/reference/generated/numpy.where.html)
   
   - they have no NaN-safe counterpart

In [None]:
a = np.random.randint(0, 100, 30).reshape(5, 6)
a

In [None]:
a <= 50

In [None]:
np.any(a <= 50)

In [None]:
np.all(a <= 100)

In [None]:
#np.where?

In [None]:
np.where(a<50, 2*a, 3*a) 
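
Here is a minimal sketch of the same tests on a fixed array (chosen for this example), so that the outputs can be checked by hand. The last lines illustrate why the missing NaN-safe variants matter: **NaN** is non-zero, so *numpy.all* and *numpy.any* consider it *True*.

```python
import numpy as np

a = np.array([[10, 60],
              [50, 90]])

mask = a <= 50                       # [[ True False] [ True False]]
some_small = np.any(mask)            # True: at least one value is <= 50
all_small = np.all(mask)             # False: 60 and 90 are > 50
all_bounded = np.all(a <= 100)       # True

# element-wise choice: 2*a where a < 50, 3*a elsewhere
picked = np.where(a < 50, 2 * a, 3 * a)   # [[ 20 180] [150 270]]

# with the condition alone, np.where returns the indices of the True entries
rows, cols = np.where(mask)          # rows [0 1], cols [0 0]

# NaN is non-zero, hence evaluated as True
nan_is_true = np.all(np.array([np.nan, 1.0]))   # True

print(some_small, all_small, all_bounded, nan_is_true)
print(picked)
```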