# NumPy Indexing and Selection 

In this lecture, we will discuss how to select individual elements or groups of elements from an array.

In [1]:
import numpy as np

In [4]:
# Creating a sample array
arr = np.arange(start=0, stop=11)

# Print the sample array
print(arr)

[ 0  1  2  3  4  5  6  7  8  9 10]


## Bracket Indexing and Selection 

The simplest way to pick one or more elements from a NumPy array looks very similar to indexing a Python list.

In [5]:
# Get the value at an index
arr[8]

8

In [6]:
# Get all the values in the range
arr[1:5]

array([1, 2, 3, 4])

In [7]:
# Another example of getting all the values in the range
arr[0:5]

array([0, 1, 2, 3, 4])

In [8]:
# Get all elements from the start of the array up to (but not including) index 6
arr[:6]

array([0, 1, 2, 3, 4, 5])

In [9]:
# Get all elements from index 1 to the end of the array
arr[1:]

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
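The same bracket syntax also accepts negative indices and a step, just as Python lists do. A minimal sketch, using a fresh array named `demo` (a name introduced here purely for illustration) so that `arr` is left untouched:

```python
import numpy as np

# `demo` is a hypothetical name used only in this sketch
demo = np.arange(0, 11)

last = demo[-1]             # negative indices count from the end
every_other = demo[::2]     # start:stop:step, here every second element
reversed_demo = demo[::-1]  # a negative step walks the array backwards

print(last)
print(every_other)
print(reversed_demo)
```

In every form the stop index is exclusive, which is why `arr[:6]` above ends at the value stored at index 5.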

## Broadcasting

NumPy arrays differ from Python lists in their ability to broadcast: a single value assigned to a slice is applied to every element of that slice.

In [10]:
# Let us print out the array before broadcasting a value to its elements
print(arr)

[ 0  1  2  3  4  5  6  7  8  9 10]


In [12]:
# Setting a value to a range of indices (Broadcasting)
arr[0:5] = 100
# Broadcasting the value 100 to indices 0 through 4 (index 5 is excluded)

# print the updated array
print(arr)

[100 100 100 100 100   5   6   7   8   9  10]
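For contrast, the same scalar assignment does not work on a plain Python list. The sketch below uses the hypothetical names `py_list` and `np_arr` to show both behaviours side by side:

```python
import numpy as np

# Hypothetical names used only in this sketch
py_list = list(range(11))
np_arr = np.arange(0, 11)

# NumPy stretches the scalar across every position in the slice
np_arr[0:5] = 100

# A list slice can only be assigned an iterable, so a scalar is rejected
try:
    py_list[0:5] = 100
    list_error = None
except TypeError as exc:
    list_error = exc

print(np_arr)
print(type(list_error).__name__)
```

The list is left unchanged after the failed assignment, while the array has its first five elements overwritten.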


In [14]:
# To see the change once again, we can reset the array
arr = np.arange(0, 11)

# print the reset array
print(arr)

[ 0  1  2  3  4  5  6  7  8  9 10]


In [16]:
# Important notes on slicing an array
slice_of_arr = arr[0:6]

# Show the slice of the array
print(slice_of_arr)

[0 1 2 3 4 5]


In [19]:
# Now let us broadcast a value to the slice of the array
slice_of_arr[:] = 99

# print the updated slice of the array
print(slice_of_arr)

# print the original array to which the slice belongs
print(arr)

[99 99 99 99 99 99]
[99 99 99 99 99 99  6  7  8  9 10]


As we can see, the original array reflects the change made through the slice. The important conclusion is that a slice of an array is not a copy of the data at the selected indices; it is a view onto the original array's data.

NumPy implements slices as views to save memory, which matters for large arrays. If you want a copy rather than a view of the original array, you need to request it explicitly.

In [20]:
# To get a copy of the array, request it explicitly
arr_copy = arr.copy()

In [21]:
# print the original array
print(arr)

# print the copy of the array
print(arr_copy)

[99 99 99 99 99 99  6  7  8  9 10]
[99 99 99 99 99 99  6  7  8  9 10]


In [23]:
# Now let us make a change to the array copy by broadcasting a value to it
arr_copy[:] = 100

# print the arr_copy (after broadcasting)
print(arr_copy)

# print the original array
print(arr)

[100 100 100 100 100 100 100 100 100 100 100]
[99 99 99 99 99 99  6  7  8  9 10]


As can be observed, the original array is unaffected.
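Rather than modifying data to find out whether two arrays are linked, one option is to ask NumPy directly. The sketch below uses `np.shares_memory` and the `.base` attribute; the array names are hypothetical and chosen only for this example:

```python
import numpy as np

# Hypothetical names used only in this sketch
base_arr = np.arange(0, 11)
view_slice = base_arr[0:6]      # basic slicing returns a view
copied = base_arr[0:6].copy()   # .copy() allocates new memory

# A view shares its buffer with the original; a copy does not
print(np.shares_memory(base_arr, view_slice))
print(np.shares_memory(base_arr, copied))

# A view also records the array it was derived from in .base
print(view_slice.base is base_arr)
print(copied.base is None)
```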

## Indexing a 2D array (matrices) 

The general format is **arr_2d[row][col]** or **arr_2d[row, col]**. I recommend the comma notation for clarity.

In [26]:
# Create a 2D NumPy array by passing in a list of lists
arr_2d = np.array([
    [5, 10, 15],
    [20, 25, 30],
    [35, 40, 45]
])

# print the 2D NumPy array
print(arr_2d)

[[ 5 10 15]
 [20 25 30]
 [35 40 45]]


In [30]:
# Index the entire row
print(arr_2d[0])

[ 5 10 15]


In [29]:
# The general format is arr_2d[row][col] or arr_2d[row, col].

# print using the double bracket notation
print(arr_2d[0][0])

# print using the comma notation 
print(arr_2d[0, 0])

5
5


### 2D array slicing

Now suppose you want not single elements but chunks of this 2D array, i.e. submatrices of this matrix. You can use `:` slice notation to grab sections of the 2D array.

In [31]:
# print the 2D array
print(arr_2d)

[[ 5 10 15]
 [20 25 30]
 [35 40 45]]


In [34]:
# Shape (2, 2) from the top right corner
print(arr_2d[:2, 1:])

[[10 15]
 [25 30]]


In [35]:
# Bottom row
print(arr_2d[2])

[35 40 45]
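The two notations agree for single elements but not for slices, which is one reason to prefer the comma form. A short sketch on a matrix with the same values (named `grid` here, a hypothetical name) illustrates the difference:

```python
import numpy as np

# `grid` is a hypothetical name; the values match arr_2d above
grid = np.array([
    [5, 10, 15],
    [20, 25, 30],
    [35, 40, 45]
])

# Comma notation: first slice picks rows, second slice picks columns
comma = grid[:2, 1:]

# Chained brackets: the second slice is applied to the rows of the
# first result, so it selects rows again rather than columns
chained = grid[:2][1:]

print(comma)
print(chained)
```

Here `comma` is the 2x2 top right block, while `chained` is just the second row as a 1x3 array.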


## Fancy Indexing 

Fancy indexing allows us to select entire rows or columns in arbitrary order. To show this, let's quickly build a NumPy array.

In [45]:
# Set up a 2D matrix filled with zeros
arr2d = np.zeros((10, 10))

# print the 2D matrix
print(arr2d)

[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]


In [46]:
# dimensions of the array
arr2d.shape

(10, 10)

In [47]:
# number of rows (the array is square, so this equals the number of columns)
arr_length = arr2d.shape[0]

# print the number of rows
print(arr_length)

10


In [48]:
# Set up the array: fill row i with the value i

for i in range(arr_length):
    arr2d[i] = i

# print the array
print(arr2d)

[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
 [3. 3. 3. 3. 3. 3. 3. 3. 3. 3.]
 [4. 4. 4. 4. 4. 4. 4. 4. 4. 4.]
 [5. 5. 5. 5. 5. 5. 5. 5. 5. 5.]
 [6. 6. 6. 6. 6. 6. 6. 6. 6. 6.]
 [7. 7. 7. 7. 7. 7. 7. 7. 7. 7.]
 [8. 8. 8. 8. 8. 8. 8. 8. 8. 8.]
 [9. 9. 9. 9. 9. 9. 9. 9. 9. 9.]]
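The row-by-row loop can also be written as a single broadcast assignment. This is a sketch of one alternative, not the only way to do it; `loop_filled` and `broadcast_filled` are hypothetical names:

```python
import numpy as np

n_rows = 10

# Loop version, as in the cell above
loop_filled = np.zeros((n_rows, n_rows))
for i in range(n_rows):
    loop_filled[i] = i

# Broadcast version: a column of shape (10, 1) is stretched
# across all 10 columns of the target array
broadcast_filled = np.zeros((n_rows, n_rows))
broadcast_filled[:] = np.arange(n_rows)[:, np.newaxis]

print(np.array_equal(loop_filled, broadcast_filled))
```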


Fancy indexing allows the following:

In [49]:
arr2d[[2, 4, 6, 8]]

array([[2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.]])

It also allows rows to be selected in any order:

In [50]:
arr2d[[1, 4, 8, 3]]

array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.]])
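The examples above pick rows. Columns can be picked the same way by putting the index list in the second position. Unlike basic slicing, fancy indexing returns a copy rather than a view. A small sketch, using a hypothetical 3x4 array named `table` so that the columns are distinguishable:

```python
import numpy as np

# `table` is a hypothetical array used only in this sketch
table = np.arange(12).reshape(3, 4)

# Select columns 3 and 0, in that order, for every row
picked_cols = table[:, [3, 0]]

# Select rows 2 and 0; the result is a copy of the data
picked_rows = table[[2, 0]]
picked_rows[:] = -1   # does not touch `table`

print(picked_cols)
print(table)
print(np.shares_memory(table, picked_rows))
```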

## Selection

Let us briefly go over how to use square brackets for boolean indexing so that we can select data based on some condition.

In [37]:
arr = np.arange(1, 11)

# print the array
print(arr)

[ 1  2  3  4  5  6  7  8  9 10]


In [40]:
# Let us compare all the entries of the array to a single number, say 5
print(arr > 5)

bool_arr = arr > 5

[False False False False False  True  True  True  True  True]


In [41]:
# Performing conditional selection based on the boolean array
arr[bool_arr]

array([ 6,  7,  8,  9, 10])

In [42]:
# Condensing it into a single step
arr[arr>5]

array([ 6,  7,  8,  9, 10])

In [43]:
x = 4
arr[arr>x]

array([ 5,  6,  7,  8,  9, 10])
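Conditions can be combined, and a boolean mask can also appear on the left-hand side of an assignment. The sketch below uses the element-wise operators `&` and `|` (each condition needs its own parentheses); the names `values`, `in_range`, `extremes` and `capped` are hypothetical:

```python
import numpy as np

# Hypothetical names used only in this sketch
values = np.arange(1, 11)

# Element-wise AND: keep entries strictly between 3 and 8
in_range = values[(values > 3) & (values < 8)]

# Element-wise OR: keep entries below 3 or above 8
extremes = values[(values < 3) | (values > 8)]

# Assigning through a mask changes only the selected entries
capped = values.copy()
capped[capped > 5] = 5

print(in_range)
print(extremes)
print(capped)
```

Python's `and` and `or` keywords do not work element-wise on arrays, which is why the bitwise operators are used here.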