___

<a href='http://www.pieriandata.com'> <img src='../Pierian_Data_Logo.png' /></a>
___

# NumPy Indexing and Selection

In this lecture we will discuss how to select elements or groups of elements from an array.

In [1]:
import numpy as np

In [2]:
#Creating sample array
arr = np.arange(0,11)

In [4]:
#Show
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

## Bracket Indexing and Selection
The simplest way to pick one or more elements of an array looks very similar to indexing Python lists:

In [3]:
#Get a value at an index
arr[8]

8

In [4]:
#Get values in a range
arr[1:5]

array([1, 2, 3, 4])

In [5]:
#Get values in a range
arr[0:5]

array([0, 1, 2, 3, 4])

In [7]:
arr[0:5:2]

array([0, 2, 4])

In [8]:
arr[::-1]

array([10,  9,  8,  7,  6,  5,  4,  3,  2,  1,  0])
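
As with Python lists, negative indices and omitted slice bounds also work. The short sketch below uses a hypothetical array named `demo` (not part of the lecture) holding the same values as `arr`:

```python
import numpy as np

demo = np.arange(0, 11)  # hypothetical sample array, same values as arr

last = demo[-1]          # negative indices count from the end
last_three = demo[-3:]   # the final three elements
every_third = demo[::3]  # omitted start/stop cover the whole array

print(last)         # 10
print(last_three)   # [ 8  9 10]
print(every_third)  # [0 3 6 9]
```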

## Broadcasting

NumPy arrays differ from normal Python lists in their ability to broadcast, for example assigning a single value to every element of a slice:

In [9]:
#Setting a value with index range (Broadcasting)
arr[1:5]=100

#Show
arr

array([  0, 100, 100, 100, 100,   5,   6,   7,   8,   9,  10])
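
To see the contrast with a plain list, here is a minimal sketch. The names `values`, `nums`, and `list_error` are made up for this illustration:

```python
import numpy as np

values = list(range(11))      # plain Python list
list_error = None
try:
    values[1:5] = 100         # a list slice needs an iterable on the right
except TypeError as err:
    list_error = str(err)
    print("list failed:", list_error)

nums = np.arange(0, 11)
nums[1:5] = 100               # NumPy broadcasts the scalar across the slice
print("array:", nums)
```

The list is left untouched because the assignment raises before anything is written, while the array slice is filled with the scalar.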

In [10]:
# Reset array, we'll see why I had to reset in a moment
arr = np.arange(0,11)

#Show
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [11]:
#Important notes on Slices
slice_of_arr = arr[:6]

#Show slice
slice_of_arr

array([0, 1, 2, 3, 4, 5])

In [12]:
#Change Slice
slice_of_arr[:] = 99

#Show Slice again
slice_of_arr

array([99, 99, 99, 99, 99, 99])

Now note the changes also occur in our original array!

In [13]:
arr

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])

The data is not copied: the slice is a view of the original array. NumPy hands out views instead of copies to avoid unnecessary memory usage.
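
If you are unsure whether two arrays share data, `np.shares_memory` can tell you. The sketch below uses its own hypothetical names (`base`, `view`, `copied`) so it does not disturb `arr`:

```python
import numpy as np

base = np.arange(0, 11)
view = base[:6]            # basic slicing returns a view
copied = base[:6].copy()   # .copy() allocates new memory

shares_view = np.shares_memory(base, view)
shares_copy = np.shares_memory(base, copied)

print(shares_view)         # True
print(shares_copy)         # False
print(view.base is base)   # True: the view records which array owns the data
```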

In [26]:
#To get a copy, need to be explicit
arr_copy = arr.copy()

arr_copy

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])

In [27]:
arr_copy[:] = 100
print(arr_copy)
print(arr)

[100 100 100 100 100 100 100 100 100 100 100]
[99 99 99 99 99 99  6  7  8  9 10]


This only happens when you work with an array slice. Indexing a single element returns its value rather than a view, so reassigning the variable that holds it does not change the array.

In [17]:
arr = np.arange(11)
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [19]:
first = arr[0]
first

0

In [21]:
first = 1
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [23]:
first = arr[:1]
first

array([0])

In [24]:
first[0] = 3
arr

array([ 3,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
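
The difference comes down to what each expression returns. A small sketch, using hypothetical names `data`, `item`, and `piece`:

```python
import numpy as np

data = np.arange(11)

item = data[0]      # integer index: returns a NumPy scalar, not a view
piece = data[:1]    # slice: returns a one-element view

print(isinstance(item, np.integer))    # True
print(isinstance(piece, np.ndarray))   # True

piece[0] = 3        # writes through the view into data
data[1] = 50        # indexing on the left-hand side also writes in place
print(data[:3])     # [ 3 50  2]
```

Note that `item` still holds the old value `0` afterwards, since it was never connected to the array.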

## Indexing a 2D array (matrices)

The general format is **arr_2d[row][col]** or **arr_2d[row,col]**. I usually recommend the comma notation for clarity.

In [46]:
arr_2d = np.array(([5,10,15],[20,25,30],[35,40,45]))

#Show
arr_2d

array([[ 5, 10, 15],
       [20, 25, 30],
       [35, 40, 45]])

In [31]:
#Indexing row
arr_2d[1]


array([20, 25, 30])

In [32]:
# Format is arr_2d[row][col] or arr_2d[row,col]

# Getting individual element value
arr_2d[1][0]

20

In [33]:
# Getting individual element value
arr_2d[1,0]

20

In [35]:
# This does not work for Python lists
[[0, 1], [2, 3]][1, 0]

TypeError: list indices must be integers or slices, not tuple

In [37]:
arr_2d[0][0]

5

In [38]:
arr_2d[0,0]

5

In [39]:
arr_2d[2,1]

40

In [40]:
arr_2d[(2,1)]  # It's really just a tuple

40

In [48]:
arr_2d[[2,1]]  # But when you pass in a list, it's different

array([[35, 40, 45],
       [20, 25, 30]])

In [49]:
# 2D array slicing

#Shape (2,2) from top right corner
arr_2d[:2,1:]

array([[10, 15],
       [25, 30]])
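
For single elements the bracket and comma notations agree, but once slices are involved they generally do not, which is another reason to prefer the comma form. A minimal sketch, with `grid` as a stand-in for `arr_2d`:

```python
import numpy as np

grid = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

comma = grid[:2, 1:]     # rows 0 and 1, columns 1 and 2
chained = grid[:2][1:]   # rows 0 and 1 first, then row 1 onward of that result

print(comma)
print(chained)
```

The chained version applies the second slice to the rows of the intermediate result, so it returns the single row `[20, 25, 30]` instead of the 2x2 corner.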

In [50]:
# Get second and third rows, all columns but backwards
arr_2d[1:,::-1]

array([[30, 25, 20],
       [45, 40, 35]])

In [51]:
#Shape bottom row
arr_2d[2]

array([35, 40, 45])

In [52]:
#Shape bottom row
arr_2d[2,:]

array([35, 40, 45])

In [54]:
arr_2d[2,::-1]

array([45, 40, 35])

In [64]:
arr_2d[2,1:3]

array([40, 45])

### Fancy Indexing

Fancy indexing allows you to select entire rows or columns in any order. To show this, let's quickly build a NumPy array:

In [65]:
#Set up matrix
arr2d = np.zeros((10,10))  # 10x10 matrix of zeros

In [66]:
#Number of rows (the matrix is square, so this equals the number of columns)
arr_length = arr2d.shape[0]

In [67]:
arr2d.shape  # rows by columns

(10, 10)

In [68]:
arr2d.shape[1]  # number of columns

10

In [73]:
#Set up array

for i in range(arr_length):
    arr2d[i] = i  # set the content of the entire row to the value i
    
arr2d

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [5., 5., 5., 5., 5., 5., 5., 5., 5., 5.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.],
       [9., 9., 9., 9., 9., 9., 9., 9., 9., 9.]])

In [74]:
arr2d[0]

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [75]:
arr2d[0] = 23
arr2d

array([[23., 23., 23., 23., 23., 23., 23., 23., 23., 23.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.],
       [ 3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.],
       [ 4.,  4.,  4.,  4.,  4.,  4.,  4.,  4.,  4.,  4.],
       [ 5.,  5.,  5.,  5.,  5.,  5.,  5.,  5.,  5.,  5.],
       [ 6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.],
       [ 7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.],
       [ 8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.],
       [ 9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.]])

In [76]:
# Reset
arr2d[0] = [0]

Fancy indexing allows the following:

In [77]:
arr2d[[2,4,6,8]]

array([[2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.]])

In [78]:
#Allows in any order
arr2d[[6,4,2,7]]

array([[6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.]])

In [79]:
arr2d[[0,1]]  # allows you to grab rows

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])

In [82]:
arr2d[[0,1],[0]]  # pairs each row index with column 0: elements (0,0) and (1,0)

array([0., 1.])

In [89]:
arr2d = np.arange(100).reshape(10, 10)
arr2d

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

In [94]:
arr2d[[2, 3, 4]]

array([[20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49]])

In [96]:
arr2d[[2, 3, 4], [0]]

array([20, 30, 40])

In [99]:
arr2d[[2, 3, 4], [0, 1, 2]]  # grab arr2d[2][0], arr2d[3][1], arr2d[4][2]

array([20, 31, 42])
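
Two index lists are paired element by element, so the result above is one value per pair rather than a 3x3 block. If you want every combination of the listed rows and columns, `np.ix_` is one way to get it. A sketch with a hypothetical array `table` that has the same contents as `arr2d`:

```python
import numpy as np

table = np.arange(100).reshape(10, 10)

paired = table[[2, 3, 4], [0, 1, 2]]         # elements (2,0), (3,1), (4,2)
block = table[np.ix_([2, 3, 4], [0, 1, 2])]  # every row/column combination

print(paired)   # [20 31 42]
print(block)
```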

In [103]:
arr2d[[2, 3, 4], 2:5]

array([[22, 23, 24],
       [32, 33, 34],
       [42, 43, 44]])

In [105]:
arr2d[range(10), range(10)]  # grab the diagonal

array([ 0, 11, 22, 33, 44, 55, 66, 77, 88, 99])
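
One thing worth knowing: unlike basic slices, fancy indexing returns a copy, so writing to the result does not touch the original. The sketch below illustrates this with hypothetical names (`table`, `rows_view`, `rows_fancy`):

```python
import numpy as np

table = np.arange(100).reshape(10, 10)

rows_view = table[2:5]          # basic slice: a view
rows_fancy = table[[2, 3, 4]]   # fancy indexing: a copy

rows_fancy[:] = -1
unchanged = int(table[2, 0])    # still 20, the copy is independent

rows_view[:] = -1
changed = int(table[2, 0])      # now -1, the view wrote through

print(unchanged, changed)       # 20 -1
```

Assigning directly through a fancy index, as in `table[[2, 3, 4]] = -1`, does modify the array in place; only the array returned by a fancy read is a copy.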

## More Indexing Help
Indexing a 2D matrix can be a bit confusing at first, especially when you start to add in step sizes. Try a Google image search for "NumPy indexing" to find useful images, like this one:

<img src= 'http://memory.osu.edu/classes/python/_images/numpy_indexing.png' width=500/>

## Selection

Let's briefly go over how to use brackets for selection based on comparison operators.

In [106]:
arr = np.arange(1,11)
arr

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [107]:
arr > 4

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [108]:
# Equivalent to:
list(map(lambda val: val > 4, range(1, 11)))

[False, False, False, False, True, True, True, True, True, True]

In [109]:
bool_arr = arr>4

In [110]:
bool_arr

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [111]:
# Index or conditionally select elements where the boolean array is true
arr[bool_arr]

array([ 5,  6,  7,  8,  9, 10])

In [116]:
arr[arr>2]  # This is so cool

array([ 3,  4,  5,  6,  7,  8,  9, 10])

In [117]:
x = 2
arr[arr>x]

array([ 3,  4,  5,  6,  7,  8,  9, 10])

In [119]:
arr[list(map(lambda val: val > 4, range(1, 11)))]

array([ 5,  6,  7,  8,  9, 10])
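
Conditions can also be combined, and a boolean mask works on the left-hand side of an assignment. A short sketch using a hypothetical array `vals`; note that NumPy needs `&` and `|` with parentheses around each comparison, not Python's `and` / `or`:

```python
import numpy as np

vals = np.arange(1, 11)

middle = vals[(vals > 2) & (vals < 8)]    # both conditions must hold
edges = vals[(vals <= 2) | (vals >= 9)]   # either condition may hold

capped = vals.copy()                      # copy so vals stays intact
capped[capped > 4] = 4                    # assign through a boolean mask

print(middle)   # [3 4 5 6 7]
print(edges)    # [ 1  2  9 10]
print(capped)   # [1 2 3 4 4 4 4 4 4 4]
```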

## Practice

In [121]:
arr_2d = np.arange(50).reshape(5, 10)

In [122]:
arr_2d

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49]])

In [133]:
# Grab 21 and 45
arr_2d[[2, 4], [1, 5]]

array([21, 45])

# Great Job!
