# Masking and indexing of `numpy`-arrays

----
Besides the classical indexing and slicing of container types, `numpy`-arrays provide additional features called fancy indexing and masking.

----

## 1. Repetition of `numpy`-arrays

Just as a brief repetition of the indexing and slicing of `numpy`-arrays:

In [None]:
import numpy as np

a = np.arange(0,10,1)
print(a[1])        # second element
print(a[-1])       # the last element
print(a[2:4])      # slicing from index 2 to 4 (4 not included!)
print(a[::2])      # every second element
print(a[:2:-1])    # all elements from the last down to the 4th element (index 2 not included)

**Note:** The return values of slicing are not copies of the array but so-called views, which point to the same data. Changing a view therefore also changes the original array.
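
The following cell is a minimal sketch of what this means in practice. The names `view` and `independent` are chosen only for this illustration; `np.shares_memory` is used to check whether two arrays point to the same data.

```python
import numpy as np

a = np.arange(0, 10, 1)

view = a[2:5]                  # slicing returns a view on the data of a
view[0] = -1                   # writing through the view also changes a
print(a)
print(np.shares_memory(a, view))

independent = a[2:5].copy()    # an explicit copy owns its own data
independent[0] = 99            # a is not affected by this assignment
print(a[2])
print(np.shares_memory(a, independent))
```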

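Elements can be modified. The selection is made with indexing or slicing on the left-hand side; the new values on the right-hand side must be either a single value or a sequence of the same length as the selection, and they are converted to the data type of the array: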

In [None]:
import numpy as np

a = np.arange(0,10,1)
a[1] = 100                # change the second element
a[2:4] = -1               # change a sliced part with one value
a[5:9] = np.arange(0,4,1) # change a sliced part with an array of the same length
print(a)

----

## 2. `numpy`-arrays with bools

Similar to the classical mathematical operations, you can also do comparisons:

In [None]:
import numpy as np

a = np.arange(0,10,1)
print(a>5)         # all elements larger than 5
print(a==2)        # all elements equal to 2

b = a < 1          # create a new array with the results a < 1
print(b)       
print(b.dtype)     # type is bool

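These operations are also performed **elementwise**!

An array comparison cannot be used directly in an `if`-clause, because an array with more than one element has no single truth value. The following cell therefore raises a `ValueError`: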

In [None]:
import numpy as np

students_heights = np.array([1.65, 1.71, 1.69, 2.01, 1.83, 1.75])
                            
# raises ValueError: the truth value of an array with more than one element is ambiguous
if students_heights > 1.8:
    print('Students may be tall enough for playing basketball!')


In this case you need a single truth value that `if` can evaluate. `numpy`-arrays have two methods, `.any` and `.all`, which reduce a bool-array to one value:
 * `.any` returns `True` if at least one element is `True` (think of `or`)
 * `.all` returns `True` if all elements are `True` (think of `and`)


In [None]:
b = students_heights > 1.8
print(b.any())
print(b.all())
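
As a small sketch of how these two methods fit into an `if`-clause, the cell below repeats the heights from above. The variable `tall` and the printed messages are only illustrative.

```python
import numpy as np

students_heights = np.array([1.65, 1.71, 1.69, 2.01, 1.83, 1.75])
tall = students_heights > 1.8      # bool-array, one entry per student

if tall.any():                     # True if at least one entry is True
    print('At least one student is taller than 1.8 m.')

if not tall.all():                 # all() is True only if every entry is True
    print('Not every student is taller than 1.8 m.')

print(tall.sum())                  # True counts as 1, so this is the number of tall students
```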

Additionally, you can combine several comparisons:

In [None]:
import numpy as np

students_heights = np.array([1.65, 1.71, 1.69, 2.01, 1.83, 1.75])

cheerleader = (students_heights > 1.5 ) & (students_heights < 1.75)
non_cheerleader = (students_heights < 1.5) | (students_heights > 1.75)
print(cheerleader)
print(non_cheerleader)


`&` replaces `and`, and `|` replaces `or`. In this case the brackets are absolutely necessary: `&` and `|` bind more tightly than the comparison operators, so without brackets the expression is not evaluated in the intended order!
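
The effect of missing brackets can be tried out with the sketch below. The names `raised`, `right` and `same` are only used for this illustration; `np.logical_and` is shown as an equivalent spelling that does not depend on operator precedence.

```python
import numpy as np

a = np.arange(0, 10, 1)

try:
    wrong = a > 4 & a < 7          # parsed as a > (4 & a) < 7, a chained comparison
    raised = False
except ValueError as err:
    raised = True
    print('ValueError:', err)

right = (a > 4) & (a < 7)          # brackets force the comparisons to run first
same = np.logical_and(a > 4, a < 7)
print(right)
print(np.array_equal(right, same))
```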

---

## 3. Fancy indexing

So far we have seen the normal, regular indexing, which follows simple rules.

Assume that you want to access several individual elements, e.g. the 2nd, 5th, and 6th elements of an array with 10 elements:

In [None]:
import numpy as np

a = np.arange(0,10,1)*100       # create an array
print(a)
ind = np.array([1,4,5])     # create an index array 
print(a[ind])               # select all elements addressed by the indices

In this case we put all indices which we want to access into a new array and use this array as an index. This works both on the right-hand side __and__ on the left-hand side:

In [None]:
import numpy as np

a = np.arange(0,10,1)*100       # create an array

ind = np.array([1,4,5])     # create an index array
a[ind] = -100
print(a)

*Fancy* indexing also allows you to address some elements multiple times!

In [None]:
import numpy as np

a = np.arange(0,10,1)*100

ind = np.array([1,4,5,1])     # address the second element twice
print(a[ind])
a[ind] = np.array([2,4,8,10]) # overwrite the second element two times
print(a)

**Note**: In this case the assignment is carried out sequentially, so the second element is overwritten twice and ends up with the last value assigned to it from the new array!
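
A related case is an in-place operation such as `+=` with repeated indices. The sketch below uses small example arrays chosen only for illustration: with `+=` a repeated index is incremented only once, while `np.add.at` accumulates every occurrence.

```python
import numpy as np

ind = np.array([1, 1, 3])          # index 1 appears twice

a = np.zeros(5, dtype=int)
a[ind] += 1                        # index 1 is incremented only once
print(a)                           # [0 1 0 1 0]

b = np.zeros(5, dtype=int)
np.add.at(b, ind, 1)               # every occurrence of an index is accumulated
print(b)                           # [0 2 0 1 0]
```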

---

## 4. Masking

With `numpy`-arrays it is also possible to use a bool-array as an index. In this case each `True` in the bool-array indicates that the element at the same position is addressed. The bool-array must have the same size as the `numpy`-array!

In [None]:
import numpy as np

a = np.arange(0,10,1)
ind = np.array([True,True,False,False,False,True,False,False,False,False])   # same length as a!
print(a[ind])

`ind` is called a **mask**! Elements with `False` are ignored. As with all indexing/slicing operations, this also works on the left-hand side:

In [None]:
import numpy as np

a = np.arange(0,10,1)
ind = np.array([True,True,False,False,False,True,False,False,False,False])

a[ind] = -100                    # change the values of the addressed elements to a new value
print(a)
a[ind] = np.array([100,200,300]) # assign one new value per addressed element (3 x True, 3 values)
print(a)
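
The size rules can be checked with the following sketch, which assumes a recent NumPy version. A mask of the wrong length raises an `IndexError`, and an assignment with the wrong number of values raises a `ValueError`. The names `short_mask`, `mask_error` and `value_error` are only used for this illustration.

```python
import numpy as np

a = np.arange(0, 10, 1)

short_mask = np.array([True, False, True])   # only 3 entries for 10 elements
try:
    a[short_mask]
    mask_error = None
except IndexError as err:
    mask_error = err
    print('IndexError:', err)

mask = a > 6                                 # 3 x True (for 7, 8 and 9)
try:
    a[mask] = np.array([1, 2])               # 2 values for 3 addressed elements
    value_error = None
except ValueError as err:
    value_error = err
    print('ValueError:', err)
```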

### Creating masks

Masks can be created manually, but a better approach is to use bool-arrays from comparisons:

In [None]:
import numpy as np

a = np.arange(0,10,1)

print(a)
ind = a > 4      # create a mask for all elements with values > 4
print(ind)       # show mask
print(a[ind])    # show masked elements

You can combine masks logically with `&` and `|`:

In [None]:
import numpy as np

a = np.arange(0,10,1)

ind = (a > 4) & (a < 7)     # (a > 4) and (a < 7)
print(ind)
print(a[ind])

or

In [None]:
import numpy as np

a = np.arange(0,10,1)

ind = (a == 2) | (a > 7)     # (a ==  2) or ( a > 7)
print(ind)
print(a[ind])
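
Masks and fancy indexing are closely related. The sketch below converts a mask into an index array with `np.flatnonzero`, inverts a mask with `~`, and shows that a masked selection is a copy and not a view. The names `mask`, `idx` and `selected` are chosen only for this illustration.

```python
import numpy as np

a = np.arange(0, 10, 1)
mask = (a == 2) | (a > 7)

idx = np.flatnonzero(mask)         # integer positions of the True entries
print(idx)                         # [2 8 9]
print(np.array_equal(a[mask], a[idx]))

print(a[~mask])                    # ~ inverts the mask: all other elements

selected = a[mask]                 # masking returns a copy, not a view
selected[0] = -1
print(a[2])                        # a is unchanged
```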

---

## 5. Array sorting

Sorting in general has nothing to do with indexing, but one can use indexing to sort an array. As shown in the section on fancy indexing, an index array can create a new array that contains all elements in a new order. This new order can be the sorted order!

In [None]:
import numpy as np

a = np.array([6,3,1,5,6])  # unsorted

ind = np.argsort(a)        # create an index array with the indices that would sort a
print(ind)  
print(a[ind])              # sort the array a
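
As a short check, the sketch below compares the result with `np.sort` and reverses the index array to get a descending order. The name `descending` is only used for this illustration.

```python
import numpy as np

a = np.array([6, 3, 1, 5, 6])
ind = np.argsort(a)                          # indices that sort a in ascending order

print(np.array_equal(a[ind], np.sort(a)))    # same result as sorting directly

descending = a[ind[::-1]]                    # reversed index array: descending order
print(descending)                            # [6 6 5 3 1]
```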

---

## 6. Use indexing on different arrays

Assume that the data in different arrays, e.g. $x$ and $y$, depend on each other. In this case a mask created from one array is useful when it is applied to both arrays:

In [None]:
%matplotlib inline

import matplotlib.pyplot as plt
import numpy as np

# create two arrays which depend on each other (have the same length)
x = np.arange(-100,100,1)
y = x**2

# create a mask from one array
ind = x >= 50

# plot the masked values only

fig, ax = plt.subplots()

#ax.plot(x,y)
ax.plot(x[ind], y[ind])

**Note:** Sorting is another example: the index array from `np.argsort` of one array can be applied to several arrays at once, which corresponds to swapping whole rows in a table!
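
A minimal sketch of this idea is shown below. The arrays `names` and `heights` contain made-up example data and only illustrate two columns of a table that belong together.

```python
import numpy as np

# hypothetical example data: two columns of the same table
names = np.array(['Ann', 'Ben', 'Cem', 'Dia'])
heights = np.array([1.71, 1.65, 2.01, 1.83])

order = np.argsort(heights)        # index array that sorts the heights
print(order)                       # [1 0 3 2]

# applying the same index array to both arrays keeps the rows together
print(names[order])
print(heights[order])
```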

----