# 2. Indexing NumPy Arrays

We start by importing NumPy under the alias `np`.

In [None]:
import numpy as np

NumPy arrays can be indexed in the same way we index Python lists.

In [18]:
arr = np.array([1, 2, 3, 4])

print(arr[0])  # First value
print(arr[1])  # Second value
print(arr[-1]) # Last value

1
2
4


We can also use slicing to extract a subarray with `arr[start:stop]`, which selects the elements from index `start` up to, but not including, index `stop`. You can choose the step size with `arr[start:stop:step]` as well.

In [19]:
arr = np.array([1, 2, 3, 4, 5, 6])

print(arr[0:2])  # First two values
print(arr[:2])   # Does the same as above
print(arr[2:-1]) # From index 2 to the second last value
print(arr[2:])   # Everything starting from index 2
print(arr[::-1]) # Reverse array

[1 2]
[1 2]
[3 4 5]
[3 4 5 6]
[6 5 4 3 2 1]
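To make the exclusive stop index and the step size more concrete, here is a small sketch. The array values and the variable names (`middle`, `every_second`, `odd_positions`) are chosen only for illustration.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60])

# The stop index is exclusive: indices 1, 2 and 3 are selected, index 4 is not.
middle = arr[1:4]
print(middle)         # [20 30 40]

# A step of 2 picks every second element, starting at index 0.
every_second = arr[::2]
print(every_second)   # [10 30 50]

# Start, stop and step combined: indices 1, 3 and 5.
odd_positions = arr[1:6:2]
print(odd_positions)  # [20 40 60]
```

Note that `arr[1:4]` contains `4 - 1 = 3` elements, which is a handy way to check a slice when the step is 1.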


We can index multidimensional NumPy arrays as well. The syntax is `arr[i_0, i_1, ..., i_n]`, where `i_j` is the index along dimension `j`. Slicing works here too, separately for each dimension.

In [20]:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(arr)           # Original array
print(arr[1, 1])     # The middle value at index (1, 1)
print(arr[0])        # First row
print(arr[:, 0])     # First column
print(arr[1:, 1:])   # Bottom right 2x2 square submatrix
print(arr[::2, ::2]) # The four corners (every second row and column)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
5
[1 2 3]
[1 4 7]
[[5 6]
 [8 9]]
[[1 3]
 [7 9]]
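The same syntax extends beyond two dimensions. The sketch below uses a hypothetical 3D array named `cube`, built with `np.arange` and `reshape` purely for illustration, to show one index or slice per dimension.

```python
import numpy as np

# A 3D array with shape (2, 3, 4): 2 blocks, 3 rows, 4 columns.
cube = np.arange(24).reshape(2, 3, 4)

last_value = cube[1, 2, 3]    # One integer index per dimension gives a single value
first_column = cube[0, :, 0]  # Block 0, all rows, column 0
middle_rows = cube[:, 1, :]   # Row 1 of every block

print(last_value)         # 23
print(first_column)       # [0 4 8]
print(middle_rows.shape)  # (2, 4)
```

Each integer index removes one dimension from the result, while each slice keeps its dimension.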


### Conditional Indexing

We can also index using boolean arrays, which is useful for conditional indexing. For example, the code cell below builds a boolean array (a mask) that marks every entry in `arr` greater than or equal to `16`, and then uses this mask to extract those entries. The result is a one-dimensional array, even though `arr` is two-dimensional.

In [21]:
arr = np.array([[1, 2, 4], [8, 16, 32], [64, 128, 256]])
idxs = (arr >= 16)
print(arr)
print(idxs)
print(arr[idxs])

[[  1   2   4]
 [  8  16  32]
 [ 64 128 256]]
[[False False False]
 [False  True  True]
 [ True  True  True]]
[ 16  32  64 128 256]
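Boolean masks can also be combined, and they can be used on the left-hand side of an assignment. The following is a minimal sketch that reuses the array from the cell above. The names `mask` and `clipped` and the thresholds are assumptions made for the example.

```python
import numpy as np

arr = np.array([[1, 2, 4], [8, 16, 32], [64, 128, 256]])

# Combine conditions element-wise with & (and) and | (or).
# The parentheses are required because & binds tighter than the comparisons.
mask = (arr >= 4) & (arr < 64)
print(arr[mask])  # [ 4  8 16 32]

# A boolean mask can also select which entries to overwrite.
clipped = arr.copy()          # Work on a copy so that arr is left unchanged
clipped[clipped > 100] = 100  # Replace every entry above 100 with 100
print(clipped)
```

Python's `and` and `or` keywords do not work element-wise on arrays, which is why the operators `&` and `|` are used here.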


### Bonus: Useful Methods of Boolean Arrays

**Using `arr.all()` and `arr.any()` on boolean arrays**: When you have a boolean NumPy array `arr`, you can use the methods `.all()` and `.any()` to reduce the array to a single boolean value. The method `.all()` returns the logical AND of all the values in the array, whereas the method `.any()` returns the logical OR. We can also specify the axis along which to take the AND/OR (see the example below).

In [22]:
# Over all dimensions
arr = np.array([[True, False], [True, False]])
print(arr)
print(f"any() : {arr.any()}")
print(f"all() : {arr.all()}")

arr = np.array([[True, True], [True, True]])
print(arr)
print(f"any() : {arr.any()}")
print(f"all() : {arr.all()}")

# Along a given dimension
arr = np.array([[True, False], [True, False]])
print(arr)
print(f"any(axis=0) : {arr.any(axis=0)}")
print(f"any(axis=1) : {arr.any(axis=1)}")
print(f"all(axis=0) : {arr.all(axis=0)}")
print(f"all(axis=1) : {arr.all(axis=1)}")

# Combination of all() and any()
# Check if there exists at least one column with all values True
arr = np.array([[True, True], [False, False]])
print(arr)
print(f"all(axis=0).any() : {arr.all(axis=0).any()}")

arr = np.array([[True, False], [True, False]])
print(arr)
print(f"all(axis=0).any() : {arr.all(axis=0).any()}")

[[ True False]
 [ True False]]
any() : True
all() : False
[[ True  True]
 [ True  True]]
any() : True
all() : True
[[ True False]
 [ True False]]
any(axis=0) : [ True False]
any(axis=1) : [ True  True]
all(axis=0) : [ True False]
all(axis=1) : [False False]
[[ True  True]
 [False False]]
all(axis=0).any() : False
[[ True False]
 [ True False]]
all(axis=0).any() : True
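In practice, the boolean array often comes from a comparison, so `.all()` and `.any()` can answer questions about the values of a numeric array. Below is a small sketch with made-up data. The array `temps` and the variable names are hypothetical.

```python
import numpy as np

# Made-up measurements: 2 rows with 3 values each.
temps = np.array([[21.5, 19.0, 23.1],
                  [18.2, -1.4, 20.0]])

# Is every value strictly positive?
all_positive = (temps > 0).all()
print(all_positive)        # False

# Which rows contain at least one negative value?
rows_with_negative = (temps < 0).any(axis=1)
print(rows_with_negative)  # [False  True]
```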


## Exercises

### 1. Indexing and Slicing a Matrix (2D-array)

First, create a NumPy array named `arr` storing the matrix $A=\begin{pmatrix}1&2&3\\4&5&6\\7&8&9\end{pmatrix}$.

Then use indexing and slicing to perform the following tasks:

1. Store the centre value of $A$, i.e., the value at index $(1,1)$ to a variable named `centre_value`.
2. Store the second row of $A$, i.e., the array `[4, 5, 6]`, in a variable named `second_row`.
3. Store the last column of $A$, i.e., the array `[3, 6, 9]`, in a variable named `last_column`.
4. Store the bottom left $2\times2$ sub-matrix of $A$, i.e., the array `[[4, 5], [7, 8]]` in a variable named `bottom_left_submatrix`.

In [23]:
# Your code here
arr = ...
centre_value = ...
second_row = ...
last_column = ...
bottom_left_submatrix = ...

# Solution:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
centre_value = arr[1,1]
second_row = arr[1]
last_column = arr[:, -1]
bottom_left_submatrix = arr[1:,:2]

# Automatic tests:
assert (arr == np.arange(1, 10).reshape(3, 3)).all()
assert centre_value == 5
assert (second_row == np.array([4, 5, 6])).all()
assert (last_column == np.array([3, 6, 9])).all()
assert (bottom_left_submatrix == np.array([[4, 5], [7, 8]])).all()
print("All tests passed!")

All tests passed!


### 3. Find All Values Over a Given Threshold

Write a function `return_values_over_threshold(arr, threshold)` that takes a NumPy array `arr` and a float `threshold`, and returns a NumPy array containing the values in `arr` that are *greater than or equal to* `threshold`.

**NB! Do not use any loops for this task.** Vectorizing our code speeds things up and is one of the main reasons we use NumPy.

In [24]:
def return_values_over_threshold(arr: np.ndarray, threshold: float):
    ...

# Solution:
def return_values_over_threshold(arr: np.ndarray, threshold: float):
    idxs = (arr >= threshold)
    return arr[idxs]

# Automatic tests
arr = np.array([-3, 4, -1, 5, 7, 12, 0, -8, 4, -3, 1])
threshold = -0.5
result = return_values_over_threshold(arr, threshold)
assert (result == np.array([4, 5, 7, 12, 0, 4, 1])).all()

arr = np.array([[17, 16, 38], [14, 1, 20], [43, 11, 23], [31, 15, 18]])
threshold = 16
result = return_values_over_threshold(arr, threshold)
assert (result == np.array([17, 16, 38, 20, 43, 23, 31, 18])).all()

print("All tests passed!")

All tests passed!
