# Question about 2-dimensional boolean masks

Charlie asked a good question about 2D masks in the lecture: how do we create a smaller array from a larger 2D array using a boolean mask? It is interesting because indexing a 2D array takes both a row index and a column index.

Here we will start with a 3x3 array containing the numbers 0 to 8. We want only the odd numbers in this array.

Here are a few ways we can do this (I am sure there are more).

In [33]:
import numpy as np

In [34]:
my_array= np.array([[0,1,2], [3,4,5], [6,7,8]])
my_array

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

## Simple case: If we just want the elements of the matrix and do not care where they are.

Here we can simply flatten the array to 1 dimension and apply the boolean mask.

##### Long-form (meaning one step per line) approach to simple case

Note: Here we need a 1D array. In the numpy tutorial we used reshape to make an array a single row or a single column. The problem is that a single row or a single column is still 2D. So if we reshape the array [1,2,3,4] that way, every element still has a row index and a column index. Flatten takes the second dimension out of the array. We can see this below, where the shape is (9,) rather than (9,1).

In [40]:
# reshape the array using flatten which changes it to one dimension
array_reshaped = my_array.flatten()
print(array_reshaped.shape)
array_reshaped


(9,)


array([0, 1, 2, 3, 4, 5, 6, 7, 8])
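To make the shape difference concrete, here is a small sketch comparing flatten with two reshape calls. The variable names (`as_column`, `as_flat`, `as_reshaped_1d`) are made up for this illustration and are not used elsewhere in the notebook.

```python
import numpy as np

my_array = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])

# reshape to a single column: still 2D, so it keeps a row and a column index
as_column = my_array.reshape(-1, 1)
print(as_column.shape)       # (9, 1)

# flatten: a 1D copy with a single index
as_flat = my_array.flatten()
print(as_flat.shape)         # (9,)

# reshape with a single target dimension also gives a 1D array
as_reshaped_1d = my_array.reshape(-1)
print(as_reshaped_1d.shape)  # (9,)
```

So it is the single-row or single-column target shape that keeps the second dimension, not reshape itself.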

In [44]:
# create the mask
mask_v1 = np.mod(array_reshaped,2)==1
mask_v1

array([False,  True, False,  True, False,  True, False,  True, False])

In [47]:
# apply the mask on the reshaped array
new_array_v1 = array_reshaped[mask_v1]
new_array_v1

array([1, 3, 5, 7])

#### Shortcut mask
Combining the last 2 steps in one line of code. 

NOTE: This is not necessary. The long form works great and is never wrong. This is just more condensed.

Note: you could also fold the flatten step in and do everything in a single line, but that is harder to read and is generally discouraged. Remember the third line of the 'Zen of Python':

    Simple is better than complex.

In [48]:
array_reshaped = my_array.flatten()
new_array_v2 = array_reshaped[np.mod(array_reshaped,2)==1]
new_array_v2

array([1, 3, 5, 7])
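As a side note, NumPy also accepts a 2D boolean mask directly on a 2D array, and the result comes back as a 1D array, so the flatten step can be skipped if you prefer. A minimal sketch (the names `mask_2d` and `new_array_v3` are invented here for illustration):

```python
import numpy as np

my_array = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])

# the mask has the same 3x3 shape as the array
mask_2d = np.mod(my_array, 2) == 1

# indexing with a 2D boolean mask returns the selected elements as a 1D array
new_array_v3 = my_array[mask_2d]
print(new_array_v3)  # [1 3 5 7]
```

The positions of the elements are still lost in this version; only the values survive.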

## Complex Case: You need to keep the locations (row, column indices) of the odd elements.

This gets a little more complicated because we are dealing with two dimensions. We will use a great function, <b>np.where</b>, which we will also use frequently in pandas.

In [49]:
# just restating the array here
my_array= np.array([[0,1,2], [3,4,5], [6,7,8]])
my_array

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [57]:
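# build the mask (assumed definition, it is not shown above): the remainder is 1 for odd elements
my_mask = np.mod(my_array, 2)
# np.where returns the indices of the elements where the condition is True
mask_np_where = np.where(my_mask==1)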
print('mask_np_where is of type: ', type(mask_np_where))
mask_np_where

mask_np_where is of type:  <class 'tuple'>


(array([0, 1, 1, 2]), array([1, 0, 2, 1]))

So our variable mask_np_where is a tuple containing two arrays. The first array holds the row index of each element where the mask is true.
The second array holds the matching column index of each of those elements.

To test this, let's pair up the items of the two arrays.

In [51]:
my_array[0,1]

1

In [52]:
my_array[1,0]

3

In [53]:
my_array[1,2]

5

In [54]:
my_array[2,1]

7
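The four lookups above can also be done in a single step, because NumPy accepts the tuple returned by np.where as an index. Here is a short sketch; the mask is rebuilt inside the block so it runs on its own, and `odd_values` is just a name chosen for this example.

```python
import numpy as np

my_array = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
mask_np_where = np.where(np.mod(my_array, 2) == 1)

# indexing with the (rows, columns) tuple pairs the two arrays element by element
odd_values = my_array[mask_np_where]
print(odd_values)  # [1 3 5 7]
```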

If we need a list of the row,column pairs we can do that by looping through these arrays.

First let's define the row_index and column_index variables from the tuple.


In [58]:
row_index = mask_np_where[0]
row_index

array([0, 1, 1, 2])

In [59]:
col_index = mask_np_where[1]
col_index

array([1, 0, 2, 1])

Here we need to decide how we want to store the row and column indices. Should it be a list where each entry contains the row_index and col_index? Should it be a list of tuples? Should it be a dictionary?
It all depends on what you want to get out of this. Since it is the simplest, let's store the coordinates in a list.

In [64]:
list_mask_location = []

#NOTE: row_index and col_index have the same length (one entry per odd element), so we only need one loop.
for i, row in enumerate(row_index):
    #since row is known, we only need to look up the matching column
    this_col = col_index[i]
    list_mask_location.append([row, this_col])
    
list_mask_location
    

[[0, 1], [1, 0], [1, 2], [2, 1]]
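For comparison, here is a sketch of the other containers mentioned above: a list of tuples, a dictionary keyed by (row, column), and the pair array that np.argwhere returns. The names `tuple_locations`, `value_by_location` and `pair_array` are hypothetical, and the mask is rebuilt so the block runs on its own.

```python
import numpy as np

my_array = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
row_index, col_index = np.where(np.mod(my_array, 2) == 1)

# list of (row, column) tuples; int() turns NumPy integers into plain Python ints
tuple_locations = [(int(r), int(c)) for r, c in zip(row_index, col_index)]
print(tuple_locations)  # [(0, 1), (1, 0), (1, 2), (2, 1)]

# dictionary mapping each (row, column) location to the value stored there
value_by_location = {loc: int(my_array[loc]) for loc in tuple_locations}
print(value_by_location)  # {(0, 1): 1, (1, 0): 3, (1, 2): 5, (2, 1): 7}

# np.argwhere returns the same pairs as a 2D array, one row per matching element
pair_array = np.argwhere(np.mod(my_array, 2) == 1)
print(pair_array.tolist())  # [[0, 1], [1, 0], [1, 2], [2, 1]]
```

Which one to use depends on what happens next: tuples can be used directly as an index or a dictionary key, while the np.argwhere result stays a NumPy array.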