___

<a href='http://www.pieriandata.com'> <img src='../Pierian_Data_Logo.png' /></a>
___

# NumPy Indexing and Selection

In this lecture we will discuss how to select elements or groups of elements from an array.

In [7]:
import numpy as np
%autosave 600

# Show the Python and NumPy versions
from platform import python_version
print("python version: ", python_version())
print("numpy version: ", np.version.version)

Autosaving every 600 seconds
python version:  3.7.0
numpy version:  1.15.1


In [8]:
#Creating sample array
arr = np.arange(0,11)

In [9]:
#Show
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

## Bracket Indexing and Selection
The simplest way to pick one or more elements of an array looks very similar to indexing Python lists:

In [10]:
#Get a value at an index 8
arr[8]

8

In [11]:
#Get values in a range
arr[1:5]

array([1, 2, 3, 4])

In [12]:
#Get values in a range
arr[0:5]

array([0, 1, 2, 3, 4])

In [13]:
# arr[0:5] same as arr[:5]
arr[:5]

array([0, 1, 2, 3, 4])

In [14]:
# start from particular index to the end
arr[5:]

array([ 5,  6,  7,  8,  9, 10])
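
Slices also accept negative indices and an optional step, following the same `start:stop:step` rules as Python lists. Here is a minimal sketch; the name `demo` and the other variable names are just illustrative choices, not part of the lecture:

```python
import numpy as np

demo = np.arange(0, 11)

last = demo[-1]              # last element
tail = demo[-3:]             # last three elements
evens = demo[::2]            # every second element, starting at index 0
reversed_demo = demo[::-1]   # all elements in reverse order

print(last)
print(tail)
print(evens)
print(reversed_demo)
```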

## Broadcasting

NumPy arrays differ from normal Python lists in their ability to broadcast: a single value assigned to a slice is applied to every element of that slice.

In [15]:
#Setting a value with index range (Broadcasting)
arr[0:5]=100

#Show
arr

array([100, 100, 100, 100, 100,   5,   6,   7,   8,   9,  10])
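
To make the contrast with lists concrete, here is a small sketch that tries the same scalar assignment on a plain Python list and on an array. The names `py_list`, `np_arr`, and `list_error` are made up for this illustration:

```python
import numpy as np

py_list = list(range(11))
try:
    py_list[0:5] = 100      # a list slice needs an iterable on the right-hand side
    list_error = None
except TypeError as exc:
    list_error = exc        # the list rejects a bare scalar

np_arr = np.arange(0, 11)
np_arr[0:5] = 100           # the scalar is broadcast to all five positions

print(type(list_error).__name__)
print(np_arr)
```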

In [16]:
# Reset the array; we'll see why in a moment
arr = np.arange(0,11)

#Show
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [17]:
#Important notes on Slices
slice_of_arr = arr[0:6]

#Show slice
slice_of_arr

array([0, 1, 2, 3, 4, 5])

In [18]:
#Change Slice, [:] everything in the array
slice_of_arr[:]=99

#Show Slice again
slice_of_arr

array([99, 99, 99, 99, 99, 99])

Now note the changes also occur in our original array!

In [19]:
arr

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])

The data is not copied; the slice is a view of the original array. This avoids the memory cost of duplicating large arrays.

In [20]:
# NumPy does not copy the data automatically, to avoid memory issues with large arrays.
# To get a copy instead of a reference to the original array, you must explicitly call copy().
arr_copy = arr.copy()
arr_copy

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])

In [21]:
# If I broadcast 100 across this copy, the original array is not changed
arr_copy[:]=100
arr_copy

array([100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100])

In [24]:
arr

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])

In [25]:
# also read this great post:
# https://stackoverflow.com/questions/48871320/why-does-using-on-numpy-arrays-modify-the-original-array?newreg=f88fa149506d497494c99d97ab204ca2
# a = [1]
# b = a
# b += [2]
# print(a, b) #prints [1, 2] [1, 2]
# print(id(a), id(b)) #The same id

# a = [1]
# b = a
# b = b + [2]
# print(a, b) #prints [1], [1, 2]
# print(id(a), id(b)) #Not the same id
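
The same distinction applies to NumPy arrays. The sketch below is an assumed adaptation of the list example above (the arrays `a`, `b`, `c`, `d` are illustrative): `+=` changes the existing array in place, while `d = d + 10` builds a new array and rebinds the name.

```python
import numpy as np

a = np.array([1, 2, 3])
b = a                          # b is another name for the same array object
b += 10                        # in-place: modifies the shared data
same_after_inplace = b is a

c = np.array([1, 2, 3])
d = c
d = d + 10                     # creates a new array and rebinds d; c is untouched
same_after_rebind = d is c

print(a, b, same_after_inplace)
print(c, d, same_after_rebind)
```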

In [29]:
# arr owns the data
arr.flags

  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

In [31]:
# slice_of_arr doesn't own the data
slice_of_arr.flags

  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

In [33]:
# arr_copy owns the data
arr_copy.flags

  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False
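
Besides reading the `OWNDATA` flag, you can ask NumPy directly whether two arrays share memory. A minimal sketch, using fresh arrays named `original`, `view`, and `copied` (names chosen only for this example):

```python
import numpy as np

original = np.arange(0, 11)
view = original[0:6]           # basic slicing returns a view
copied = original.copy()       # explicit, independent copy

view_is_backed_by_original = view.base is original
view_shares = np.shares_memory(original, view)
copy_shares = np.shares_memory(original, copied)

print(view_is_backed_by_original, view_shares, copy_shares)
```

Here `.base` points at the array that actually owns the data, and it is `None` for an array that owns its own.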

## Indexing a 2D array (matrices)

The general format is **arr_2d[row][col]** or **arr_2d[row,col]**. I usually recommend the comma notation for clarity.

In [35]:
arr_2d = np.array(([5,10,15],[20,25,30],[35,40,45]))

#Show
arr_2d

array([[ 5, 10, 15],
       [20, 25, 30],
       [35, 40, 45]])

In [39]:
#Indexing row
arr_2d[1]


array([20, 25, 30])

In [41]:
# Format is arr_2d[row][col] (double bracket) or arr_2d[row,col] (single bracket, recommended)
# Getting individual element value
arr_2d[1][0]

20

In [42]:
# Getting individual element value
arr_2d[1,0]

20

In [45]:
# 2D array slicing

# Block of shape (2,2) from the top right corner
arr_2d[:2,1:]

array([[10, 15],
       [25, 30]])

In [46]:
# Bottom row
arr_2d[2]

array([35, 40, 45])

In [47]:
# Bottom row, with an explicit slice over all columns
arr_2d[2,:]

array([35, 40, 45])
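
One practical reason to prefer the comma notation: the two forms stop being equivalent as soon as a slice is involved. The sketch below uses an illustrative array named `grid` with the same values as `arr_2d`:

```python
import numpy as np

grid = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

column_comma = grid[:, 1]      # all rows, column 1: the second COLUMN
chained = grid[:][1]           # grid[:] is the whole array, so [1] is the second ROW

print(column_comma)
print(chained)
```

With the comma, both axes are indexed in one step. With chained brackets, the second bracket is applied to whatever the first one returned.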

### Fancy Indexing

Fancy indexing allows you to select entire rows or columns in any order. To show this, let's quickly build a NumPy array:

In [37]:
#Set up matrix
arr2d = np.zeros((10,10))
arr2d

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

In [38]:
# Number of rows (the loop below fills the array row by row)
arr_length = arr2d.shape[0]
arr_length

10

In [39]:
#Set up array

for i in range(arr_length):
    arr2d[i] = i
    
arr2d

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [5., 5., 5., 5., 5., 5., 5., 5., 5., 5.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.],
       [9., 9., 9., 9., 9., 9., 9., 9., 9., 9.]])

Fancy indexing allows the following:

In [40]:
arr2d[[2,4,6,8]]

array([[2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.]])

In [42]:
# This will not work: without the inner brackets, NumPy reads 2,4,6,8 as four separate indices into a 2D array
arr2d[2,4,6,8]

IndexError: too many indices for array

In [68]:
# Rows can be requested in any order
arr2d[[6,4,2,7]]

array([[6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.]])
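
The examples above pick rows. Columns work the same way once you add a slice for the row axis. Also note that, unlike basic slicing, fancy indexing returns a copy rather than a view. A minimal sketch with a small illustrative array named `table`:

```python
import numpy as np

table = np.arange(12).reshape(3, 4)
# table is:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

cols = table[:, [3, 0]]        # columns 3 and 0, in that order
picked = table[[2, 0]]         # rows 2 and 0, in that order

picked[:] = -1                 # overwrite the fancy-indexed result...
still_original = table[2, 0]   # ...the original array is not affected

print(cols)
print(still_original)
```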

## More Indexing Help
Indexing a 2D matrix can be a bit confusing at first, especially when you start to add in step size. Try a Google image search for "NumPy indexing" to find useful images, like this one:

<img src= 'http://memory.osu.edu/classes/python/_images/numpy_indexing.png' width=500/>
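
As a text-only illustration of step sizes in two dimensions, here is a short sketch on a 5x5 array. The array `m` and the variable names are assumptions made for this example:

```python
import numpy as np

m = np.arange(25).reshape(5, 5)   # values 0..24, five per row

every_other_row = m[::2]          # rows 0, 2, 4 (all columns)
every_other_col = m[:, ::2]       # columns 0, 2, 4 (all rows)
corners = m[::4, ::4]             # rows 0 and 4, columns 0 and 4

print(every_other_row)
print(every_other_col)
print(corners)
```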

## Selection

Let's briefly go over how to use brackets for selection based on comparison operators.

In [72]:
arr = np.arange(1,11)
arr

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [78]:
# combine with comparison operator to give a boolean array
arr > 4

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [79]:
bool_arr = arr>4

In [81]:
bool_arr

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [88]:
# use it to do conditional selection
arr[bool_arr]

array([ 5,  6,  7,  8,  9, 10])

In [89]:
arr[arr>2]

array([ 3,  4,  5,  6,  7,  8,  9, 10])

In [90]:
x = 2
arr[arr>x]

array([ 3,  4,  5,  6,  7,  8,  9, 10])
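
Conditions can also be combined. The sketch below assumes an array named `values` (an illustrative name) and uses the element-wise operators `&`, `|`, and `~`. Each comparison needs its own parentheses, and the plain Python keywords `and` / `or` raise a `ValueError` on arrays with more than one element.

```python
import numpy as np

values = np.arange(1, 11)

between = values[(values > 2) & (values < 8)]   # both conditions must hold
outside = values[(values < 3) | (values > 8)]   # either condition may hold
odd = values[~(values % 2 == 0)]                # negation of "is even"

print(between)
print(outside)
print(odd)
```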

# Great Job!
