<div class="licence">
<span>Licence CC BY-NC-ND</span>
<span>Valérie Roy</span>
<span><img src="../media/ensmp-25-alpha.png" /></span>
</div>

In [2]:
import numpy as np

# Boolean masks and array indexing
   - conditions are applied to all elements of the array

### Boolean comparison operators

   - are **ufuncs** (universal functions)
   - when applied to arrays, they return a Boolean array holding the **element-by-element comparisons**
   
   
   - you obtain a **mask** that you can use to **filter** your array
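
As a minimal sketch of these points, on a small hand-written vector (the name `v` and its values are chosen only for illustration): the operator `<` and the ufunc `np.less` produce the same Boolean array, and indexing with that array filters the values.

```python
import numpy as np

# a small hand-written vector, chosen only for illustration
v = np.array([3, -1, 4, -5])

# the comparison operator is applied element by element
mask = v < 0

# np.less is the ufunc behind the < operator on arrays
same_mask = np.less(v, 0)

# indexing with the mask keeps only the elements where it is True
negatives = v[mask]

print(mask)       # [False  True False  True]
print(negatives)  # [-1 -5]
```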

## example of Boolean masks

In [8]:
# we create a matrix of shape (3 x 4)
a = np.random.randint(-10, 10, 12).reshape(3, 4)
a

array([[ -2,   4,   7,   3],
       [ -3, -10,   3,  -2],
       [ -2,  -9,   6,   4]])

we compute the mask of **even values**

In [11]:
a%2 == 0


array([[ True,  True, False, False],
       [False,  True, False,  True],
       [ True, False,  True,  True]])

## computing with masks of Boolean values
   - since *False == 0* and *True == 1*, we can apply **numeric functions** (sum, count, ...) to masks

In [52]:
# we create a matrix of shape (3 x 4)
a = np.random.randint(-10, 10, 12).reshape(3, 4)
a

array([[  7,  -2,  -5,  -8],
       [  6,  -7,  -8, -10],
       [  2,  -9, -10,   0]])

to **count** the number of **even** values:  
we **compute** the **mask**  
then we **sum** (or **count**) the *True* values, which count as *1*

In [53]:
np.sum( a%2==0 )

8

In [54]:
np.count_nonzero(a%2==0)

8
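
Other reductions work on a mask in the same way. The sketch below reuses the values printed above as a fixed array (the names `b` and `even` are only illustrative) and uses `np.mean` to get the proportion of even values.

```python
import numpy as np

# the values printed above, written out so the example is reproducible
b = np.array([[7, -2, -5, -8],
              [6, -7, -8, -10],
              [2, -9, -10, 0]])

even = b % 2 == 0          # Boolean mask, True where the value is even

n_even = np.sum(even)      # each True counts as 1
ratio = np.mean(even)      # proportion of True values in the mask

print(n_even)              # 8
print(ratio == 8 / 12)     # True
```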

## computing with masks along the rows axis

In [64]:
# we create a matrix of shape (3 x 4)
a = np.random.randint(-10, 10, 12).reshape(3, 4)
a

array([[ 4,  5,  9, -7],
       [ 5, -2, -5,  3],
       [-2,  2, -7,  6]])

to count the number of **even** values in each **column**:  
we compute the **mask** of **even** values  
and we sum along **axis 0**, i.e. the **rows** axis

In [65]:
np.sum(a%2==0, axis=0)

array([2, 2, 0, 1])

## computing with masks along the columns axis

In [80]:
# we create a matrix of shape (3 x 4)
a = np.random.randint(-10, 10, 12).reshape(3, 4)
a

array([[ 8,  7,  9,  5],
       [ 9,  7, -4,  4],
       [-9,  1, -2,  9]])


to count the number of **even** values in each **row**:  
we compute the **mask** of even values  
and we sum along **axis 1**, i.e. the **columns** axis

In [81]:
np.sum(a%2==0, axis=1)

array([1, 2, 1])
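
One way to remember which axis is which: the axis given to `np.sum` is the one that disappears from the shape. The sketch below checks this on the values printed above, written out as a fixed array (the names `c`, `per_column` and `per_row` are only illustrative).

```python
import numpy as np

# the values printed above, written out so the example is reproducible
c = np.array([[8, 7, 9, 5],
              [9, 7, -4, 4],
              [-9, 1, -2, 9]])

even = c % 2 == 0                   # mask of shape (3, 4)

per_column = np.sum(even, axis=0)   # the 3 rows are collapsed, shape (4,)
per_row = np.sum(even, axis=1)      # the 4 columns are collapsed, shape (3,)

print(per_column)   # [1 0 2 1]
print(per_row)      # [1 2 1]
```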

## exercise: computing with Boolean masks and axes

   1. **test** if all elements in a matrix are **less than N** (without using *numpy.all*)
   1. **test** if there exists at least one element **less than N** in a matrix (without using *numpy.any*)

## combining conditions with Boolean masks

In [86]:
# we create a matrix of shape (3 x 3)
a = np.random.randint(0, 10, 9).reshape((3, 3))
a

array([[5, 9, 0],
       [5, 4, 5],
       [9, 3, 6]])

we count the number of elements in the array *a* that are: **less than 6** and **even** 

In [87]:
np.count_nonzero((a < 6) & (a % 2 == 0))
# the parentheses are required: & binds more tightly than < and ==
# and do not use the Python keyword 'and'!

2

## logical operations

in *numpy*

   - you **must** use the **bitwise** operators *&*, *|*, *~* 
   - or their respective *numpy* counterparts **np.logical_and**, **np.logical_or**, **np.logical_not**
   
   
   - because they are applied **element-by-element** 
   
   
   - do not use the **python logical** operators *and*, *or*, *not*: they expect a single truth value and raise a *ValueError* on arrays with more than one element
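
The sketch below illustrates these rules on the 3 x 3 values shown earlier, written out as a fixed array. The variable names (`both`, `same`, `raised`) are only illustrative.

```python
import numpy as np

a = np.array([[5, 9, 0],
              [5, 4, 5],
              [9, 3, 6]])

# element-by-element "and" with the bitwise operator (note the parentheses)
both = (a < 6) & (a % 2 == 0)

# the numpy function gives the same mask
same = np.logical_and(a < 6, a % 2 == 0)

# the Python keyword needs one truth value per operand,
# which an array with several elements cannot provide
try:
    (a < 6) and (a % 2 == 0)
    raised = False
except ValueError:
    raised = True

print(np.count_nonzero(both))  # 2
print(raised)                  # True
```

Without the parentheses, *&* would be evaluated before the comparisons, so the expression would not be parsed as intended.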

## Indexing arrays with **masks**
   - you can compute the array of the **elements** for which the **mask is True**
   - it creates a **new array**
   - it is **not** a **view** on the existing one

In [88]:
# we create a (3 x 4) matrix
a = np.random.randint(0, 11, 12).reshape(3, 4)
a

array([[ 1,  6, 10,  6],
       [ 2, 10,  8,  1],
       [ 3,  6,  6,  5]])

values less than *5* in *a*

In [90]:
a[a < 5]

array([1, 2, 1, 3])
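
A short sketch of the "new array, not a view" point, using the values printed above as a fixed array (the names `original` and `small` are only illustrative). Modifying the extracted array leaves *a* untouched, whereas assigning **through** the mask does modify *a* in place.

```python
import numpy as np

a = np.array([[1, 6, 10, 6],
              [2, 10, 8, 1],
              [3, 6, 6, 5]])
original = a.copy()

# reading with a mask builds a new 1-D array
small = a[a < 5]
small[:] = -1                              # we modify the extracted array only
print(np.array_equal(a, original))         # True: a has not changed

# assigning through a mask writes into a itself
a[a < 5] = 0
print(a[:, 0])                             # [0 0 0]
```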

## computing the index of elements from a mask
   - you can compute the **indices** of the elements for which the **mask is True**
   - with the function *numpy.argwhere*

In [95]:
# we create a (2 x 4) matrix
a = np.random.randint(0, 11, 8).reshape(2, 4)
a

array([[10,  8, 10,  7],
       [ 5,  5,  2,  8]])

you obtain an array of **pairs** $[i, j]$  
where **i** is the index along the **rows**  
and **j** the index along the **columns**

In [97]:
np.argwhere(a <= 5)

array([[1, 0],
       [1, 1],
       [1, 2]])
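
As a complement, here is a sketch using *numpy.nonzero*, which returns the same positions as one array per axis. This form can be used directly to index the array. The fixed array reuses the values printed above; the names `pairs`, `rows`, `cols` and `values` are only illustrative.

```python
import numpy as np

a = np.array([[10, 8, 10, 7],
              [5, 5, 2, 8]])

# one [i, j] pair per element that satisfies the condition
pairs = np.argwhere(a <= 5)

# the same positions, as one array of row indices and one of column indices
rows, cols = np.nonzero(a <= 5)

# these two arrays can index a directly
values = a[rows, cols]

print(pairs.shape)   # (3, 2)
print(values)        # [5 5 2]
```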

## index of elements in higher dimension

In [104]:
a = np.random.randint(0, 11, 8).reshape(2, 2, 2) # two matrices of shape (2 x 2)
a

array([[[10,  1],
        [ 2,  7]],

       [[ 7,  8],
        [ 0,  3]]])

you obtain an array of **triplets** $[k, i, j]$  
where **k** is the index of the **matrix**  
**i** the index along the **rows**  
and **j** the index along the **columns**

In [106]:
np.argwhere(a < 5)

array([[0, 0, 1],
       [0, 1, 0],
       [1, 1, 0],
       [1, 1, 1]])
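
Each triplet can be turned into a tuple to read the corresponding element back. The sketch below does this on the values printed above, written out as a fixed array (the names `triplets` and `values` are only illustrative).

```python
import numpy as np

a = np.array([[[10, 1],
               [2, 7]],
              [[7, 8],
               [0, 3]]])

triplets = np.argwhere(a < 5)

# a[k, i, j] is the same as a[(k, i, j)], so each row becomes a tuple index
values = [int(a[tuple(t)]) for t in triplets]

print(values)   # [1, 2, 0, 3]
```

The result matches `a[a < 5]`, which visits the elements in the same order.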

## advanced indexing

   - you can create **sub-arrays** using **lists** or **ndarrays** of indices
   - you obtain a **new array** holding a **copy** of the selected elements, not a view on the **original** array

In [125]:
a = np.random.randint(0, 10, 5) # a vector
a

array([1, 8, 5, 3, 9])

In [127]:
# we create a python list of indices
l = [0, 3, 0, 4]

# we index 'a' by the list of indices
a[l] # you get a new array

array([1, 3, 1, 9])

In [128]:
# we create an ndarray of indices
n = np.array([0, 3, 0, 4])

# we index 'a' by the ndarray of indices
a[n]  # you get a new array

array([1, 3, 1, 9])
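
To see the difference between a copy and a view, the sketch below compares advanced indexing with a plain slice using *numpy.shares_memory*. The vector reuses the values printed above; the names `picked` and `sliced` are only illustrative.

```python
import numpy as np

a = np.array([1, 8, 5, 3, 9])

picked = a[[0, 3, 0, 4]]   # advanced indexing: a new array
sliced = a[1:4]            # basic slicing: a view on a

print(np.shares_memory(a, picked))   # False
print(np.shares_memory(a, sliced))   # True

picked[0] = -1     # does not touch a
sliced[0] = 100    # writes into a[1]

print(a)           # [  1 100   5   3   9]
```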

## modifying elements of an array
   - based on conditions, with the function *numpy.putmask*

In [146]:
a = np.random.randn(10).reshape(2, 5)
a

array([[-0.3749339 , -0.83654882,  0.39657539, -0.66945727,  1.3166697 ],
       [-0.51339342,  0.64529605,  2.40656497, -0.38562   ,  1.01599337]])

**replace** values between *-0.5* and *0.5* with some value ($0$)

In [153]:
np.putmask(a,
           (-0.5 < a) & (a < 0.5),
           0)
# putmask modifies a in place and returns None
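
For comparison, here is a sketch of two other ways to get the same result: assignment through the mask (in place, like *putmask*) and *numpy.where* (which builds a new array and leaves the input unchanged). The array holds hand-picked values, not the random ones above, and the names `with_putmask`, `with_assignment` and `with_where` are only illustrative.

```python
import numpy as np

# hand-picked values, chosen only for illustration
a = np.array([[-0.4, -0.8, 0.4, -0.7, 1.3],
              [-0.5, 0.6, 2.4, -0.4, 1.0]])
mask = (-0.5 < a) & (a < 0.5)

# in place, with putmask
with_putmask = a.copy()
np.putmask(with_putmask, mask, 0)

# in place, with assignment through the mask
with_assignment = a.copy()
with_assignment[mask] = 0

# not in place: where builds a new array, a is left unchanged
with_where = np.where(mask, 0, a)

print(np.count_nonzero(mask))   # 3
print(with_where[0])            # [ 0.  -0.8  0.  -0.7  1.3]
```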

## sorting arrays using advanced indexing

we want to **sort** the rows of the array *a* by the values of a **given column**, keeping each **row** intact

In [180]:
a = np.array([[5, 2, 0], [9, 3, 8], [7, 0, 6]])
a

array([[5, 2, 0],
       [9, 3, 8],
       [7, 0, 6]])

we sort the array **by** its **second column**: *a[:, 1]* is $[2_0, 3_1, 0_2]$ (each value with its row index as subscript)  
we get the **indices** that sort this column: $[2, 0, 1]$  
we **index** the array with this **array of indices**: this is called **advanced** indexing  
it returns a **copy** of the array, **not** a **view**

In [182]:
# the indices of the sorted second column
np.argsort(a[:, 1])

array([2, 0, 1])

In [183]:
a[ np.argsort(a[:, 1]) ]

array([[7, 0, 6],
       [5, 2, 0],
       [9, 3, 8]])

In [184]:
a

array([[5, 2, 0],
       [9, 3, 8],
       [7, 0, 6]])
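
The same idea gives a descending sort by reversing the array of indices. This is a sketch on the same array; the names `order`, `ascending` and `descending` are only illustrative.

```python
import numpy as np

a = np.array([[5, 2, 0],
              [9, 3, 8],
              [7, 0, 6]])

# row indices that sort the second column in increasing order
order = np.argsort(a[:, 1])

ascending = a[order]          # rows reordered, each row kept intact
descending = a[order[::-1]]   # same indices, read backwards

print(order)        # [2 0 1]
print(descending)
# [[9 3 8]
#  [5 2 0]
#  [7 0 6]]
```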