# D. Numpy Array Indexing


In this section, we will be covering Numpy array indexing.



### _Objective_
1. **Numpy array indexing by positions**: Understanding how to access elements at certain positions.

2. **Numpy array indexing by conditions**: Understanding how to access elements under certain conditions. 


In [5]:
import numpy as np

# \[1. Array Indexing\]
Because a Numpy array is an ordered collection of elements, like a list or a tuple, you can access its elements by their position within the array. Let's look at this in more detail with the practice data below.

#### Practice data) Here are the report cards for 6 students on 5 exams.

| StudentID | History | English | Math | Social Studies | Science |
|  ----   | --- |---| --- | --- | --- |
|0 |80 |92 |70 | 65 | 92 |
|1 |91 |75 |90 | 68 | 85 | 
|2 |86 |76 |42 | 72 | 88 |
|3 |77 |92 |52 | 60 | 80 |
|4 |75 |85 |85 | 92 | 95 |
|5 |96 |90 |95 | 81 | 72 |




In [6]:
scores = np.array([
    [80,92,70,65,92], # report card of student 0
    [91,75,90,68,85], # report card of student 1
    [86,76,42,72,88], # report card of student 2
    [77,92,52,60,80], # report card of student 3
    [75,85,85,92,95], # report card of student 4
    [96,90,95,81,72]  # report card of student 5
])
scores

array([[80, 92, 70, 65, 92],
       [91, 75, 90, 68, 85],
       [86, 76, 42, 72, 88],
       [77, 92, 52, 60, 80],
       [75, 85, 85, 92, 95],
       [96, 90, 95, 81, 72]])

## 1. Selecting Single Elements

Before discussing Numpy array indexing, note that an n-dimensional Numpy array has n axes, each of which is indexed with integers starting from 0, as illustrated below. ![image.png](attachment:image.png)




+ Because each axis is indexed, you can specify which elements to access on each axis inside square brackets, as **`array[index values]`**, where `array` is the name of your array.

+ The comma-separated indices in square brackets give the position of the element on each axis. For example, a 2-dimensional Numpy array has two axes, namely **axis 0** and **axis 1**. Here, `axis = 0` refers to **rows** or **row-wise operations** while `axis = 1` refers to **columns** or **column-wise operations**. You can therefore access an element by putting two index values into square brackets as `[a, b]`, where `a` selects the row and `b` selects the column. This directly returns the element in `column b` of `row a`.
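As a minimal sketch of the `[a, b]` notation described above, the example below uses a small array named `grid`, which is an illustrative name and not part of the practice data.

```python
import numpy as np

# Small 2-D array used only for illustration (not part of the practice data).
grid = np.array([
    [10, 11, 12],
    [20, 21, 22],
])

print(grid.ndim)   # number of axes -> 2
print(grid.shape)  # (length of axis 0, length of axis 1) -> (2, 3)

# [a, b]: a is the position on axis 0 (row), b the position on axis 1 (column).
element = grid[1, 2]
print(element)     # -> 22
```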


### (1) Single element indexing

You can access an element by specifying its **positional value on each axis**.

#### 1) Select the score of student 2 in History

In [7]:
scores[2, 0] # In the report cards, the scores achieved by student 2 are in row 2, and the history grades are in column 0.

86

#### 2) Select the Math score of student 3

In [8]:
scores[3, 2] # In the report cards, the scores achieved by student 3 are in row 3, and the math grades are in column 2.

52

## 2. Range Indexing

+ When indexing an n-dimensional array, indices must be given in order, from axis 0 to axis n-1. For example, if you want to access all columns of a specific row of a 2-dimensional array, write `[row_index, :]`. Here, the colon (`:`) means **"all"**, and either marking the column_index with `:` or leaving it blank accesses all existing columns. The same applies when you access all values of a specific column. For instance, selecting **all values of column 1** is written as `[:,1]`.

+ If indices are omitted for the trailing axes, all values on those axes are selected. For instance, indexing a 3-dimensional array with `[3,]` selects **every column of every row at depth 3**, that is, at index 3 on axis 0.

+ Aside from accessing a single element by its position, you can also take a **subset of an array** by giving a range of indices, which is called **Numpy Array Slicing**. The syntax of Numpy Array Slicing follows that of Python lists: `[start:stop:step]`.
+ **start** (inclusive) is where the slice begins, while **stop** (exclusive) is where it ends. If they are not explicitly specified and the step is positive, **start** defaults to 0, **stop** defaults to the length of the corresponding dimension, and **step** defaults to 1.

+ As with single-element indexing, slices are given in order from axis 0 to axis n-1. If you want to access everything on a specific axis, use a colon (`:`) for that axis.
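The defaults and the trailing-axis rule described above can be checked with a short sketch. The arrays `values` and `cube` are illustrative names introduced here, not part of the practice data.

```python
import numpy as np

values = np.arange(10)        # array([0, 1, ..., 9])

first_three = values[:3]      # start defaults to 0             -> [0, 1, 2]
from_seven = values[7:]       # stop defaults to the axis length -> [7, 8, 9]
every_third = values[::3]     # start and stop both default      -> [0, 3, 6, 9]

# A 3-dimensional array with shape (2, 3, 4).
cube = np.arange(24).reshape(2, 3, 4)

# Omitted trailing axes are taken in full, so both forms select the same block.
same = np.array_equal(cube[1], cube[1, :, :])
print(same)            # -> True
print(cube[1].shape)   # -> (3, 4)
```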
 




### (1) Selecting elements by specific row positions (axis=0)
If you want a single specific row, insert the positional value of that row as `[row_index, ]`. If you want a range of rows, use the <code>[start:stop:step, &nbsp;]</code> notation.


#### Scores of student 1 

In [9]:
# row=1, col=all
scores[1,:]

array([91, 75, 90, 68, 85])

#### Scores of student 3

In [10]:
# row=3, col=all
scores[3,:]

array([77, 92, 52, 60, 80])

In [11]:
# row=3, col=all
scores[3]

array([77, 92, 52, 60, 80])

#### Scores for student 1, 2, 3 and 4 in all subjects

In [12]:
# row=1-4 / col=all
scores[1:5] # scores in all subjects for student 1, 2, 3 and 4. Row 5 is excluded from the result

array([[91, 75, 90, 68, 85],
       [86, 76, 42, 72, 88],
       [77, 92, 52, 60, 80],
       [75, 85, 85, 92, 95]])

#### Scores for student 2, 3, 4 and 5 in all subjects 

In [13]:
# row=2~5 / col=all
scores[2:6] # scores in all subjects for student 2, 3, 4 and 5. The stop index 6 is exclusive, so the slice ends at row 5

array([[86, 76, 42, 72, 88],
       [77, 92, 52, 60, 80],
       [75, 85, 85, 92, 95],
       [96, 90, 95, 81, 72]])

####  Scores of all even-numbered students

In [14]:
# row=0,2,4 / col=all
scores[0:6:2] # The entire report cards of all even-numbered students

array([[80, 92, 70, 65, 92],
       [86, 76, 42, 72, 88],
       [75, 85, 85, 92, 95]])

####  Scores of all odd-numbered students

In [15]:
# row=1,3,5 / col=all
scores[1:6:2] # The entire report cards of all odd-numbered students

array([[91, 75, 90, 68, 85],
       [77, 92, 52, 60, 80],
       [96, 90, 95, 81, 72]])

### (2) Selecting elements of specific columns (axis=1)

If you want to specify only the columns while selecting all rows, use <code>[:, column_index]</code> for a single-column selection or <code>[:, start:stop:step]</code> for a range of columns.

#### Scores in History for all students

In [16]:
# row=all / col=0
scores[:,0]

array([80, 91, 86, 77, 75, 96])

#### Scores in Math for all students

In [17]:
# row=all / col=2
scores[:,2]

array([70, 90, 42, 52, 85, 95])

#### Scores in History, English and Math for all students

In [18]:
# row=all / col=0,1,2
scores[:, 0:3] # selecting all rows for all students' records, and columns 0 to 2 for History, English and Math scores. The stop index 3 is excluded.

array([[80, 92, 70],
       [91, 75, 90],
       [86, 76, 42],
       [77, 92, 52],
       [75, 85, 85],
       [96, 90, 95]])

#### Scores in English, Math and Social Studies for all students

In [19]:
# row=all / col=1,2,3
scores[:, 1:4] # selecting all rows for all students' records and columns 1 to 3 for English, Math and Social Studies scores. The stop index 4 is excluded.

array([[92, 70, 65],
       [75, 90, 68],
       [76, 42, 72],
       [92, 52, 60],
       [85, 85, 92],
       [90, 95, 81]])
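Row and column slices can also be combined in one pair of brackets. The sketch below re-creates the `scores` array so that it runs on its own; `subset`, `history_1d` and `history_2d` are illustrative names. It also shows that a single integer index drops that axis, whereas a slice of length one keeps it.

```python
import numpy as np

scores = np.array([
    [80, 92, 70, 65, 92],
    [91, 75, 90, 68, 85],
    [86, 76, 42, 72, 88],
    [77, 92, 52, 60, 80],
    [75, 85, 85, 92, 95],
    [96, 90, 95, 81, 72],
])

# Students 1 to 3 (rows 1:4) in History, English and Math (columns 0:3).
subset = scores[1:4, 0:3]
print(subset)
# [[91 75 90]
#  [86 76 42]
#  [77 92 52]]

# An integer index removes the axis; a one-element slice keeps it.
history_1d = scores[:, 0]     # shape (6,)
history_2d = scores[:, 0:1]   # shape (6, 1)
print(history_1d.shape, history_2d.shape)
```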

### (3) Integer array indexing

Integer array indexing selects arbitrary elements of an array by passing a list or array of integer indices for an axis. Each integer array lists the positions to take along that dimension. It is especially useful when you want to pick specific rows or columns, including non-consecutive ones, in one go.

#### Scores of student 1, 3 and 4.

In [20]:
# row=1,3,4 / col = all
scores[[1,3,4]] # indexing by specifying the rows for student 1, 3, and 4 in a list.

array([[91, 75, 90, 68, 85],
       [77, 92, 52, 60, 80],
       [75, 85, 85, 92, 95]])

#### Scores in Math, Science for all students.

In [21]:
# row= all / col = 2,4
scores[:, [2,4]] # indexing by specifying the columns for Math and Science in a list.

array([[70, 92],
       [90, 85],
       [42, 88],
       [52, 80],
       [85, 95],
       [95, 72]])
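When integer lists are given for both axes at once, Numpy pairs them element by element instead of taking every row-column combination. The sketch below illustrates that behaviour on a re-created `scores` array; `pairs` and `block` are illustrative names.

```python
import numpy as np

scores = np.array([
    [80, 92, 70, 65, 92],
    [91, 75, 90, 68, 85],
    [86, 76, 42, 72, 88],
    [77, 92, 52, 60, 80],
    [75, 85, 85, 92, 95],
    [96, 90, 95, 81, 72],
])

# Paired lookup: (row 0, column 1) and (row 2, column 3).
pairs = scores[[0, 2], [1, 3]]
print(pairs)   # -> [92 72]

# To get rows 1 and 3 crossed with columns 2 and 4, index in two steps.
block = scores[[1, 3]][:, [2, 4]]
print(block)
# [[90 85]
#  [52 80]]
```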

## 3. Boolean Array Indexing
Besides what we've seen so far, you can also index an array using a Boolean mask. This method is called **Boolean array indexing**.
+ A **Boolean mask** is an array of **True/False** values, typically obtained by applying a comparison or logical operator to an array. Boolean indexing keeps only the elements whose position holds `True` in the Boolean mask.
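As a minimal sketch of how a mask comes out of a condition, the example below compares the History scores from the practice data against a threshold of 85, which is an arbitrary value chosen for illustration.

```python
import numpy as np

# History scores of the six students, taken from the practice data.
history = np.array([80, 91, 86, 77, 75, 96])

# Applying a comparison to the array yields a Boolean mask of the same shape.
mask = history >= 85
print(mask)            # -> [False  True  True False False  True]

# Indexing with the mask keeps only the positions marked True.
high_scores = history[mask]
print(high_scores)     # -> [91 86 96]
```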

### (1) Boolean array indexing on one axis.
When indexing along a single axis, the length of the Boolean mask must equal the number of elements on that axis.  
Let's practice the Boolean array indexing syntax with an array of shape (2, 3).

In [22]:
# (2,3) Array
array_1= np.arange(0,6).reshape(2,3)
array_1

array([[0, 1, 2],
       [3, 4, 5]])

####  Boolean array indexing on axis=0

Since the mask must have as many elements as the axis it indexes, create a Boolean mask of shape `(2,)`, matching the 2 elements on axis 0. Its first element is `True` and its second is `False`, which means the first element on axis 0 of `array_1` (row 0) is selected while the second element (row 1) is dropped.


In [23]:
# (2,) Boolean Mask
mask = np.array([True,False])
mask.shape
array_1[mask,:] 

array([[0, 1, 2]])

By passing the Boolean mask into the indexing brackets, only the first row in `array_1` that corresponds to `True` in the Boolean mask will be returned.

#### Boolean array indexing  on axis=1

Likewise, for Boolean array indexing on axis 1, the mask must have as many elements as that axis. So, let's create a Boolean mask of shape `(3,)` with 3 elements.

In [24]:
# (3,) Boolean Mask
mask = np.array([False,False,True]) # False for column 0 and column 1, True for column 2
mask.shape

(3,)

In [25]:
array_1[:,mask]

array([[2],
       [5]])

You can see that only the elements of column 2 are taken.
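In practice, a single-axis mask is usually computed from a condition rather than typed by hand. The sketch below, which re-creates the `scores` array and uses an arbitrary threshold of 85, builds a mask of shape `(6,)` from the Math column and uses it to filter rows.

```python
import numpy as np

scores = np.array([
    [80, 92, 70, 65, 92],
    [91, 75, 90, 68, 85],
    [86, 76, 42, 72, 88],
    [77, 92, 52, 60, 80],
    [75, 85, 85, 92, 95],
    [96, 90, 95, 81, 72],
])

# One True/False value per student, based on the Math column (column 2).
math_mask = scores[:, 2] >= 85
print(math_mask)   # -> [False  True False False  True  True]

# The mask has 6 elements, matching axis 0, so it selects whole rows.
good_at_math = scores[math_mask, :]
print(good_at_math)
# [[91 75 90 68 85]
#  [75 85 85 92 95]
#  [96 90 95 81 72]]
```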

#### Boolean array indexing over two axes.

What if you want to apply Boolean indexing over multiple axes at once rather than on a single axis? In that case, the Boolean mask must itself be multidimensional, with the same shape as the array to be indexed. In this part, we're going to learn **Boolean array indexing on two axes**.

In [26]:
# (3,3)Array
array_2 = np.arange(0,9).reshape(3,3)
array_2

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [27]:
# (3,3)Boolean Mask
mask = np.array([
    [True,False, True],
    [False,False, False],
    [True,False, True]])
mask.shape

(3, 3)

In [28]:
array_2[mask]

array([0, 2, 6, 8])

You can see that only the elements of `array_2` corresponding to `True` in `mask` are returned, as a 1-dimensional array.
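A full-shape mask like this can also be produced by a condition on the array itself. The sketch below uses "is even" as an arbitrary illustrative condition; `even_mask` and `evens` are names introduced here.

```python
import numpy as np

array_2 = np.arange(0, 9).reshape(3, 3)

# The comparison is applied element-wise, so the mask has shape (3, 3).
even_mask = array_2 % 2 == 0
print(even_mask.shape)   # -> (3, 3)

# The selected elements are returned as a flat, 1-dimensional array.
evens = array_2[even_mask]
print(evens)             # -> [0 2 4 6 8]
```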
