# 2. Indexing NumPy Arrays

As with Python lists, we can access specific values and parts of NumPy arrays using indexing. NumPy indexing follows the same basic rules as list indexing but is more flexible.

In [1]:
import numpy as np

Simple 1-dimensional arrays can be indexed in the same way we index Python lists.

In [2]:
arr = np.array([1, 2, 3, 4])

print(arr[0])  # First value
print(arr[2])  # Third value
print(arr[-1]) # Last value

1
3
4


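We can also use slicing to select a subset of an array with `arr[start : stop]`. The value at index `start` is included and the value at index `stop` is excluded, so to include the value at index `end` we write `arr[start : end+1]`.

It is also possible to specify the step size by using `arr[start : stop : step]`.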

In [6]:
arr = np.array([1, 2, 3, 4, 5, 6])

print(arr)       # Entire array (no indexing)
print(arr[0:2])  # First two values (index 0 and 1)
print(arr[:2])   # Does the same as above
print(arr[2:-1]) # From index 2 to the second last value
print(arr[2:])   # Everything starting from index 2
print(arr[::-1]) # Reverse array

[1 2 3 4 5 6]
[1 2]
[1 2]
[3 4 5]
[3 4 5 6]
[6 5 4 3 2 1]
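
The cell above does not combine a start, a stop, and a step in one slice, so here is a minimal sketch of a few more cases. The array `demo` and the variable names are hypothetical and chosen only for illustration. Recall that the stop index is always excluded.

```python
import numpy as np

# Hypothetical example array (not used elsewhere in this section)
demo = np.array([10, 20, 30, 40, 50, 60])

every_second = demo[::2]      # Indices 0, 2, 4
middle_stepped = demo[1:5:2]  # Indices 1 and 3 (index 5 is excluded)
last_two = demo[-2:]          # The final two values
reversed_tail = demo[:2:-1]   # From the end down to, but not including, index 2

print(every_second)    # [10 30 50]
print(middle_stepped)  # [20 40]
print(last_two)        # [50 60]
print(reversed_tail)   # [60 50 40]
```

With a negative step the slice runs backwards, so the stop index is still excluded but now lies to the left of the selected values.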


Of course, we also need to index multidimensional NumPy arrays.

The syntax is `arr[i_0, i_1, ..., i_n]` where `i_j` is the index for dimension `j`. We can also do slicing in the different dimensions.

Here are some examples of indexing a 2-dimensional array.

In [7]:
arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

print(arr)           # Entire array
print(arr[1, 1])     # 5 - The middle value at index (1, 1)
print(arr[0])        # [1, 2, 3] - First row
print(arr[:, 0])     # [1, 4, 7] - First column
print(arr[1:, 1:])   # [[5, 6], [8, 9]] - Bottom right 2x2 square submatrix
print(arr[::2, ::2]) # [[1, 3], [7, 9]] - Only the corner values: step size 2 skips index 1 in both dimensions, which removes the middle "cross"

[[1 2 3]
 [4 5 6]
 [7 8 9]]
5
[1 2 3]
[1 4 7]
[[5 6]
 [8 9]]
[[1 3]
 [7 9]]


Here is an example showing how we can modify an array using slicing and indexing.

In [10]:
arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

col = np.array([-2, -3, -1])
row = np.array([0, 1, 0])

print("Original array:")
print(arr)       # Print original array arr

print("Modified last column:")
arr[:, -1] = col # Replace the last column with the array col
print(arr)       # Print the modified array

print("Modified second row:")
arr[1] = row     # Replace the middle row with the array row
print(arr)       # Print the modified array

print("Modified center value:")
arr[1, 1] = 42   # Replace the middle value at index (1, 1) by 42
print(arr)       # Print the modified array

Original array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Modified last column:
[[ 1  2 -2]
 [ 4  5 -3]
 [ 7  8 -1]]
Modified second row:
[[ 1  2 -2]
 [ 0  1  0]
 [ 7  8 -1]]
Modified center value:
[[ 1  2 -2]
 [ 0 42  0]
 [ 7  8 -1]]


### Conditional Indexing

We can also index NumPy arrays using boolean arrays. This is useful when we want to do conditional indexing (for example, if we want all values greater than a certain value).

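The code cell below builds a boolean array marking which entries of `arr` are greater than or equal to `16`, and then uses this boolean array to extract those entries. Note that the result is a 1-dimensional array, even though `arr` is 2-dimensional.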

In [8]:
arr = np.array([[1, 2, 4], 
                [8, 16, 32], 
                [64, 128, 256]])

idxs = (arr >= 16) # Create boolean array based on the condition x >= 16

print(arr)         # Print original array
print(idxs)        # Print boolean array
print(arr[idxs])   # Get only those values x in arr satisfying the condition x >= 16

[[  1   2   4]
 [  8  16  32]
 [ 64 128 256]]
[[False False False]
 [False  True  True]
 [ True  True  True]]
[ 16  32  64 128 256]


Here is an example showing how we can use conditional indexing to clip the values of an array.

By clipping, we mean for each `x` in the array:

1. If `x > max_val`, set `x = max_val` and
2. if `x < min_val`, set `x = min_val`.

In [9]:
arr = np.array([-3, 4, -1, 5, 7, 12, 0, -8, 4, -3, 1])

min_val = -2
max_val = 5

print(arr) # Print original array

# Use boolean indexing to modify values in the array
arr[arr < min_val] = min_val
arr[arr > max_val] = max_val

print(arr) # Print clipped array

# NumPy also has a built-in function for this called np.clip()

[-3  4 -1  5  7 12  0 -8  4 -3  1]
[-2  4 -1  5  5  5  0 -2  4 -2  1]
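
For comparison, here is a short sketch of the built-in `np.clip()` mentioned in the comment above, run on the same input. The names `manual` and `clipped` are introduced here only for illustration.

```python
import numpy as np

arr = np.array([-3, 4, -1, 5, 7, 12, 0, -8, 4, -3, 1])
min_val = -2
max_val = 5

# Manual clipping with boolean indexing, done on a copy so arr stays unchanged
manual = arr.copy()
manual[manual < min_val] = min_val
manual[manual > max_val] = max_val

# np.clip returns a new clipped array and leaves arr unchanged
clipped = np.clip(arr, min_val, max_val)

print(clipped)                    # [-2  4 -1  5  5  5  0 -2  4 -2  1]
print((manual == clipped).all())  # True
```

Unlike the in-place boolean assignments, `np.clip()` as called here does not modify its input, which is often convenient when the original values are still needed.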


We will look at reduction operations such as `np.sum()` in more detail later. But here is a quick way to count the number of values in an array satisfying some condition.

In [11]:
arr = np.array([6, 4, 5, 1, 7, 3, 5, 8, 9, 0, 2, 3, 1, 1, 2, 4, 7, 8, 9, 2, 1, 5])

count_over_3 = np.sum(arr > 3) # NumPy sum function treats True as 1 and False as 0
count_1s = np.sum(arr == 1)

print(f"There are {count_over_3} values greater than 3 in the array.")
print(f"There are {count_1s} ones in the array.")

There are 12 values greater than 3 in the array.
There are 4 ones in the array.
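
As a side note, the same count can be obtained in other ways. The sketch below assumes the same array and compares `np.sum()` with `np.count_nonzero()`; it also uses `np.mean()` to get the fraction of values satisfying the condition. The variable names are hypothetical.

```python
import numpy as np

arr = np.array([6, 4, 5, 1, 7, 3, 5, 8, 9, 0, 2, 3, 1, 1, 2, 4, 7, 8, 9, 2, 1, 5])

mask = arr > 3                               # Boolean array for the condition
count_with_sum = np.sum(mask)                # True counts as 1, False as 0
count_with_nonzero = np.count_nonzero(mask)  # Counts the True entries directly
fraction_over_3 = np.mean(mask)              # Share of entries that are True

print(count_with_sum)      # 12
print(count_with_nonzero)  # 12
print(fraction_over_3)     # 12 / 22, roughly 0.545
```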


### Using `.all()` and `.any()`

When you have a boolean NumPy array `arr`, you can use the methods `.all()` and `.any()` to reduce the array to a single boolean value.

The method `.all()` returns the logical AND of all the values in the array, so it is `True` only if every value is `True`.

The method `.any()` returns the logical OR, so it is `True` if at least one value is `True`.

We can also specify the axis along which to take the AND (or OR) (see the example below).

In [None]:
# Over all dimensions
arr = np.array([[True, False], [True, False]])
print(arr)
print(f"any() : {arr.any()}")
print(f"all() : {arr.all()}")

arr = np.array([[True, True], [True, True]])
print(arr)
print(f"any() : {arr.any()}")
print(f"all() : {arr.all()}")

# Along a given dimension
arr = np.array([[True, False], [True, False]])
print(arr)
print(f"any(axis=0) : {arr.any(axis=0)}") # axis=0 reduces over the rows, giving one result per column: True if that column contains any True value
print(f"any(axis=1) : {arr.any(axis=1)}") # axis=1 reduces over the columns, giving one result per row: True if that row contains any True value
print(f"all(axis=0) : {arr.all(axis=0)}")
print(f"all(axis=1) : {arr.all(axis=1)}")

# Combination of all() and any()
# Check if there exists at least one column with all values True
arr = np.array([[True, True], [False, False]])
print(arr)
print(f"all(axis=0).any() : {arr.all(axis=0).any()}")

arr = np.array([[True, False], [True, False]])
print(arr)
print(f"all(axis=0).any() : {arr.all(axis=0).any()}")

[[ True False]
 [ True False]]
any() : True
all() : False
[[ True  True]
 [ True  True]]
any() : True
all() : True
[[ True False]
 [ True False]]
any(axis=0) : [ True False]
any(axis=1) : [ True  True]
all(axis=0) : [ True False]
all(axis=1) : [False False]
[[ True  True]
 [False False]]
all(axis=0).any() : False
[[ True False]
 [ True False]]
all(axis=0).any() : True


## Exercises

### 1. Indexing and Slicing a Matrix (2D-array)

First, create a NumPy array named `arr` storing the matrix $A=\begin{pmatrix}1&2&3\\4&5&6\\7&8&9\end{pmatrix}$.

Then use indexing and slicing to perform the following tasks:

1. Store the centre value of $A$, i.e., the value 5 at index $(1,1)$, in a variable named `centre_value`.
2. Store the second row of $A$, i.e., the array `[4, 5, 6]`, in a variable named `second_row`.
3. Store the last column of $A$, i.e., the array `[3, 6, 9]`, in a variable named `last_column`.
4. Store the bottom left $2\times2$ sub-matrix of $A$, i.e., the array `[[4, 5], [7, 8]]` in a variable named `bottom_left_submatrix`.


In [17]:
# Your code here
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
centre_value = arr[1, 1]
print(centre_value)
second_row = arr[1]
print(second_row)
last_column = arr[:, -1]
print(last_column)
bottom_left_submatrix = arr[1:, :2]
print(bottom_left_submatrix)


# Automatic tests:
assert (arr == np.arange(1, 10).reshape(3, 3)).all()
assert centre_value == 5
assert (second_row == np.array([4, 5, 6])).all()
assert (last_column == np.array([3, 6, 9])).all()
assert (bottom_left_submatrix == np.array([[4, 5], [7, 8]])).all()
print("All tests passed!")

5
[4 5 6]
[3 6 9]
[[4 5]
 [7 8]]
All tests passed!


### 2. Change Array Content Using Indexing and Slicing

You are given a 2-dimensional array `arr` and two 1-dimensional arrays `new_1` and `new_2`.

Your task is to:

1. Replace the last (third) row of `arr` with `new_1` and print the modified array, and then
2. Replace the first two elements of the second column, `[2, 5]`, with `new_2` and print the modified array.

The final modified array should be
```
[[1 7 3]
 [4 7 6]
 [3 2 1]]
```

In [None]:
arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

new_1 = np.array([3, 2, 1])
new_2 = np.array([7, 7])

# Your code here...


### 3. Find All Values Over a Given Threshold

Write a function `return_values_over_threshold(arr, threshold)` that takes in a NumPy array `arr` and a float `threshold`, and returns a NumPy array containing those values in `arr` *greater than or equal to* `threshold`. 

For example, if `arr = np.array([1, 2, 3, 4, 2, 1, 3, 6, 2, 1])`, then `return_values_over_threshold(arr, 3)` should return the NumPy array `[3, 4, 3, 6]`.

**NB! Do not use any loops for this task.** The function should only be 1-3 lines of code if you use conditional indexing.

In [None]:
def return_values_over_threshold(arr: np.ndarray, threshold: float):
    # Your code here...
    ...


# Automatic tests
arr = np.array([-3, 4, -1, 5, 7, 12, 0, -8, 4, -3, 1])
threshold = -0.5
result = return_values_over_threshold(arr, threshold)
assert (result == np.array([4, 5, 7, 12, 0, 4, 1])).all()

arr = np.array([[17, 16, 38], [14, 1, 20], [43, 11, 23], [31, 15, 18]])
threshold = 16
result = return_values_over_threshold(arr, threshold)
assert (result == np.array([17, 16, 38, 20, 43, 23, 31, 18])).all()

print("All tests passed!")