<div class="licence">
<span>Licence CC BY-NC-ND</span>
<span>Valérie Roy</span>
<span><img src="media/ensmp-25-alpha.png" /></span>
</div>

In [None]:
import numpy as np

# aggregating array values

## *numpy* functions can **combine** the elements of an array along a specified axis
  
examples of some *classic functions*:


| function | behavior|
|------|-----|
| *numpy.sum* | sums elements over an axis|
| *numpy.prod*| multiplies elements along an axis|
| *numpy.min* | returns the smallest element|
| *numpy.max* | returns the greatest element |
| *numpy.argmin* | returns the index of the smallest element|
| *numpy.argmax* | returns the index of the greatest element |
| *numpy.mean*| computes the mean of the elements|
| *numpy.std*  | computes the standard deviation|
| *numpy.var* | .../... |
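
as a minimal sketch of the functions in the table, here they are applied without an axis to a small fixed matrix (the array `m` and the variable names are illustrative, not part of the original notebook); *numpy.var* is assumed here to be used with its default settings, which give the population variance

```python
import numpy as np

# small fixed matrix so that the results are reproducible
m = np.array([[1, 2, 3],
              [4, 5, 6]])

total = np.sum(m)             # 21
product = np.prod(m)          # 720
smallest = np.min(m)          # 1
greatest = np.max(m)          # 6
idx_smallest = np.argmin(m)   # 0, index in the flattened array
idx_greatest = np.argmax(m)   # 5, index in the flattened array
mean = np.mean(m)             # 3.5
variance = np.var(m)          # mean of the squared deviations from the mean
std = np.std(m)               # square root of the variance
```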

## summing all the elements

we create a **matrix** with $3$ rows and $4$ columns

In [None]:
a = np.random.randint(0, 10, size=(3, 4))
a

summing without an axis returns the global sum

In [None]:
a.sum()

## summing elements along the rows axis
   - it will be the same for the other operations (product, min, max, ...)

we create a **matrix** with *3* rows and *4* columns

In [None]:
a = np.random.randint(0, 10, size=(3, 4))
a

summing along the **rows** axis (axis 0 in our example) adds the **rows** together: we obtain one value per column

In [None]:
a.sum(axis=0)
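
because the matrix above is random, a fixed example makes the result easier to check; this sketch (the array `m` is illustrative) shows that summing on axis 0 gives the same result as adding the rows one by one

```python
import numpy as np

m = np.array([[1, 2, 3, 4],
              [10, 20, 30, 40],
              [100, 200, 300, 400]])

by_axis0 = m.sum(axis=0)          # array([111, 222, 333, 444])
row_by_row = m[0] + m[1] + m[2]   # the same values, computed by hand

# the rows axis disappears: the shape goes from (3, 4) to (4,)
result_shape = by_axis0.shape
```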

## summing elements along the columns axis

we create a **matrix** with *3* rows and *4* columns

In [None]:
a = np.random.randint(0, 10, size=(3, 4))
a

summing along the **columns** axis (axis 1 in our example) adds the **columns** together: we obtain one value per row

In [None]:
np.sum(a, axis=1)
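
the same check can be sketched on a fixed matrix (the array `m` is illustrative); the `keepdims=True` parameter of *numpy.sum* is shown as an option that keeps the summed axis with length 1

```python
import numpy as np

m = np.array([[1, 2, 3, 4],
              [10, 20, 30, 40],
              [100, 200, 300, 400]])

by_axis1 = np.sum(m, axis=1)                   # array([10, 100, 1000])
col_by_col = m[:, 0] + m[:, 1] + m[:, 2] + m[:, 3]

# with keepdims=True the result stays 2-dimensional, shape (3, 1)
kept = np.sum(m, axis=1, keepdims=True)
```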

## summing along groups of array (axis 0)

we create two **groups**, each one a **matrix** with *3* rows and *4* columns

In [None]:
a = np.random.randint(0, 50, size=(2, 3, 4))
a

summing along **axis** $0$ is **summing** the two arrays together 

In [None]:
np.sum(a, axis=0) # we sum the arrays

In [None]:
a[0] + a[1]

## summing along axis 1 when we have several matrices

we create two matrices of size *(3 x 4)*

In [None]:
a = np.random.randint(0, 50, size=(2, 3, 4))
a

summing along **axis** $1$ adds the **rows** of each matrix together
   - i.e. we obtain one **row** (of 4 values) per matrix
   - they form a new array of shape *(2, 4)*

In [None]:
b = np.sum(a, axis=1)
b, b.shape
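
a sketch with a deterministic array (built with *numpy.arange*, an illustrative choice) makes the values easy to verify by hand: on axis 1, the three rows of each matrix are added together

```python
import numpy as np

# two (3 x 4) matrices holding the integers 0..23
t = np.arange(24).reshape(2, 3, 4)

by_axis1 = np.sum(t, axis=1)
# array([[12, 15, 18, 21],
#        [48, 51, 54, 57]])

# same result, adding row 0, row 1 and row 2 of every matrix
by_hand = t[:, 0] + t[:, 1] + t[:, 2]
```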

## summing along axis 2 when we have several matrices

we create two matrices of size *(3 x 4)*

In [None]:
a = np.random.randint(0, 50, size=(2, 3, 4))
a

summing along **axis** $2$ adds the **columns** of each matrix together
   - i.e. we obtain one **column** (of 3 values) per matrix
   - they form a new array of shape *(2, 3)*

In [None]:
b = np.sum(a, axis=2)
b, b.shape
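
the three cases can be summarized by one rule: the summed axis is removed from the shape; the sketch below (array `t` and variable names are illustrative) checks it on a *(2, 3, 4)* array, and also shows that *numpy.sum* accepts a tuple of axes

```python
import numpy as np

t = np.arange(24).reshape(2, 3, 4)

# shape of the result for each summed axis
shapes = {axis: np.sum(t, axis=axis).shape for axis in range(t.ndim)}
# {0: (3, 4), 1: (2, 4), 2: (2, 3)}

by_axis2 = np.sum(t, axis=2)
# array([[ 6, 22, 38],
#        [54, 70, 86]])

# summing on axes 1 and 2 at once gives one total per matrix
per_matrix = np.sum(t, axis=(1, 2))   # array([ 66, 210])
```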

## summing without axis when we have several matrices

we create two matrices of size *(3 x 4)*

In [None]:
a = np.random.randint(0, 50, size=(2, 3, 4))
a

without an axis, it sums all the elements

In [None]:
np.sum(a)

## summing in the presence of *numpy.nan* values
   - it will be the same for the other functions

**classic functions** have their **NaN-safe** counterpart:

   - *numpy.nansum*, *numpy.nanprod*, ...
   - *numpy.nanmean*, *numpy.nanstd*, *numpy.nanvar*, ...
   - *numpy.nanmin*, *numpy.nanmax*, ...
   - *numpy.nanmedian*, *numpy.nanpercentile*, ...
   
they ignore the *numpy.nan* values  
(for example *numpy.nansum* treats them as zero and *numpy.nanprod* as one)  
to avoid NaN contagion
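
a minimal sketch on a small vector (the array `v` is illustrative) compares the classic functions with their NaN-safe counterparts; note that *numpy.nanmean* leaves the NaN values out of the count, which is not the same as replacing them by zero

```python
import numpy as np

v = np.array([1.0, np.nan, 3.0])

classic_sum = np.sum(v)        # nan
safe_sum = np.nansum(v)        # 4.0, NaN counts as 0
safe_prod = np.nanprod(v)      # 3.0, NaN counts as 1
safe_min = np.nanmin(v)        # 1.0
safe_mean = np.nanmean(v)      # 2.0, mean of the 2 valid values

# replacing NaN by zero first would give a different mean (4/3)
zero_filled_mean = np.mean(np.where(np.isnan(v), 0.0, v))
```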

### NaN is dominant in classic operations

   - remember that **only** **float** values can be **NaN**  
   - we create two matrices *(3 x 4)* of type **float**  
   - **where** we insert some **NaN** values

In [None]:
a = np.random.randint(0, 50, size=(2, 3, 4))
a = a.astype(float)
a[0]

In [None]:
a[0, 1, 0] = np.nan
a[0, 2, 2] = np.nan
a[1, 0, 0] = np.nan
a[1, 1, 3] = np.nan
a[1]

you can see that *numpy.nan* is **dominant** (**contagious**)

In [None]:
np.sum(a) # the result is NaN
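
when a sum comes back as NaN, a quick way to find out why is to locate the NaN values with *numpy.isnan*; this is a sketch on an illustrative array `v`, not a step of the original notebook

```python
import numpy as np

v = np.array([[1.0, np.nan, 3.0],
              [np.nan, 5.0, 6.0]])

nan_mask = np.isnan(v)        # boolean array, True where the value is NaN
has_nan = nan_mask.any()      # True
nan_count = nan_mask.sum()    # 2, True counts as 1
total = np.sum(v)             # nan, a single NaN is enough
```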

## NaN-safe function (summing several matrices without axis and on axis 0)

**NaN** is treated as **zero**

In [None]:
a

In [None]:
np.nansum(a) # np.nan values are 0

on **axis=0** the two **arrays** are summed together

In [None]:
np.nansum(a, axis=0)

## NaN-safe function (summing several matrices on axis 1)

**NaN** is treated as **zero**

on **axis=1** the **rows** of each array are **added** together

In [None]:
np.nansum(a, axis=1)

## NaN-safe function (summing several matrices on axis 2)

**NaN** is treated as **zero**

on **axis=2** the **columns** of each array are **added** together


In [None]:
np.nansum(a, axis=2)
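
to check that **NaN** really behaves as **zero** on every axis, *numpy.nansum* can be compared with a plain sum on a copy where the NaN values are replaced by 0; a sketch, with an illustrative array `v` of shape *(2, 2, 2)*

```python
import numpy as np

v = np.array([[[1.0, np.nan], [3.0, 4.0]],
              [[np.nan, 6.0], [7.0, np.nan]]])

# copy of v where every NaN is replaced by 0
zero_filled = np.where(np.isnan(v), 0.0, v)

# nansum and the plain sum of the zero-filled copy agree on each axis
matches = [np.array_equal(np.nansum(v, axis=axis), zero_filled.sum(axis=axis))
           for axis in range(v.ndim)]

by_axis2 = np.nansum(v, axis=2)
# array([[1., 7.],
#        [6., 7.]])
```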

## index of min and max values
   - *numpy.argmax*, *numpy.argmin*

In [None]:
a = np.random.randint(0, 100, 15).reshape(3, 5)
a

the index is given in the **flattened** array

In [None]:
np.min(a) # or a.min()

In [None]:
np.argmin(a) # or a.argmin()

In [None]:
a.flatten().argmin()
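
the flat index can be converted back to a *(row, column)* position with *numpy.unravel_index*, and *numpy.argmin* also accepts an axis; a sketch on an illustrative fixed matrix `m`

```python
import numpy as np

m = np.array([[7, 3, 9],
              [4, 1, 8]])

flat_index = np.argmin(m)                       # 4, position in the flattened array
position = np.unravel_index(flat_index, m.shape)
# position is (1, 1), and m[position] is the minimum

per_column = np.argmin(m, axis=0)   # array([1, 1, 1]), row index of the min of each column
per_row = np.argmin(m, axis=1)      # array([1, 1]), column index of the min of each row
```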

## tests on all values
   - *numpy.all* returns *True* if **all** values are *True*
   - *numpy.any* returns *True* if **any** value is *True*
   - *np.where(cond, x, y)* returns *x* or *y* depending on the condition
   - (https://docs.scipy.org/doc/numpy/reference/generated/numpy.where.html)
   
   - they have no NaN-safe counterpart
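
like the aggregation functions, *numpy.all* and *numpy.any* accept an axis; and since there is no NaN-safe version, one possible workaround (an assumption of this sketch, not something prescribed by the notebook) is to filter the NaN values out with *numpy.isnan* before testing

```python
import numpy as np

m = np.array([[1, 60],
              [70, 80]])
mask = m <= 50

any_per_row = np.any(mask, axis=1)      # array([ True, False])
all_per_column = np.all(mask, axis=0)   # array([False, False])

v = np.array([1.0, np.nan, 3.0])
naive = np.all(v <= 50)                 # False, because NaN <= 50 is False
valid = ~np.isnan(v)                    # True where the value is not NaN
ignoring_nan = np.all(v[valid] <= 50)   # True, NaN values are left out
```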

### all or any values

In [None]:
a = np.random.randint(0, 100, 15).reshape(3, 5)
a

In [None]:
# you create a mask
a <= 50

In [None]:
np.any(a <= 50)

In [None]:
np.all(a <= 100)

In [None]:
# doubles the values below 50, triples the others
np.where(a < 50, 2*a, 3*a)
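
on a fixed vector (the array `w` is illustrative) the effect of *numpy.where* is easy to follow: the condition is evaluated element by element, and each result is taken from the second or the third argument

```python
import numpy as np

w = np.array([10, 50, 70])

# 10 < 50 is True   -> 2 * 10
# 50 < 50 is False  -> 3 * 50
# 70 < 50 is False  -> 3 * 70
result = np.where(w < 50, 2 * w, 3 * w)   # array([ 20, 150, 210])
```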