# NumPy Indexing, Slicing, Selection

In this lecture we will discuss how to select elements or groups of elements from an array.

In [0]:
import numpy as np

In [0]:
#Creating sample array
arr = np.arange(0,11)

In [3]:
#Show
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

## Indexing and Slicing

### Indexing

The indexes in NumPy arrays start with 0, meaning that the first element has index 0, and the second has index 1 etc.

In a one-dimensional array, you can access the i-th value (counting from
zero) by specifying the desired index in square brackets, just as with Python lists.

In [4]:
#Get the value at index 8
arr[8]

8

In [6]:
#To index from the end of the array, you can use negative indices:
arr[-2]

9

## Indexing a 2D array (matrices)

In a multidimensional array, you access items using a comma-separated tuple of
indices.

The general format is **arr_2d[row][col]** or **arr_2d[row,col]**.

In [7]:
arr_2d = np.array( ([5,10,15],[20,25,30],[35,40,45]) )

#Show
arr_2d

array([[ 5, 10, 15],
       [20, 25, 30],
       [35, 40, 45]])

In [8]:
#Indexing row
arr_2d[0]

array([ 5, 10, 15])

In [9]:
#Indexing row
arr_2d[1]

array([20, 25, 30])

In [10]:
# Format is arr_2d[row][col] or arr_2d[row,col]

#Method:1
# Getting individual element value
arr_2d[1][0]

20

In [11]:
#Method:2
# Getting individual element value
arr_2d[1,0]

20
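
For a single element the two forms return the same value, but they are not the same operation: `arr_2d[1][0]` first builds the row `arr_2d[1]` and then indexes into it, while `arr_2d[1,0]` indexes both axes in one step. The difference becomes visible once slices are involved, as this small sketch suggests (the names `one_step` and `two_steps` are only for illustration):

```python
import numpy as np

arr_2d = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

# One indexing step: rows 0-1, columns 1-2
one_step = arr_2d[:2, 1:]

# Two indexing steps: take rows 0-1, then slice the *rows* again from index 1
two_steps = arr_2d[:2][1:]

print(one_step)   # [[10 15]
                  #  [25 30]]
print(two_steps)  # [[20 25 30]]
```

For that reason the comma form is usually the safer habit when working with more than one axis.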

## Slicing

Slicing in Python means taking the elements from one given index up to another given index.

We pass a slice instead of a single index, like this: [start:end].

We can also define the step, like this: [start:end:step].

If we don't pass start, it defaults to 0.

If we don't pass end, it defaults to the length of the array in that dimension.

If we don't pass step, it defaults to 1.

Note: The result includes the start index but excludes the end index.
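
A short sketch of these defaults, using the same sample array as above (it is rebuilt here so the snippet stands alone, and the variable names are only for illustration):

```python
import numpy as np

arr = np.arange(0, 11)     # same sample array as above

every_other = arr[1:8:2]   # start=1, end=8 (excluded), step=2
first_three = arr[:3]      # start omitted
from_eight = arr[8:]       # end omitted
evens = arr[::2]           # only the step is given
reversed_arr = arr[::-1]   # a negative step walks the array backwards

print(every_other)   # [1 3 5 7]
print(first_three)   # [0 1 2]
print(from_eight)    # [ 8  9 10]
print(evens)         # [ 0  2  4  6  8 10]
print(reversed_arr)  # [10  9  8  7  6  5  4  3  2  1  0]
```

Note that the defaults listed above apply to a positive step. With a negative step, an omitted start and end cover the array from its last element back to its first.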


## Slicing of 1-d array

In [12]:
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [13]:
#Get values in a range
arr[0:5]

array([0, 1, 2, 3, 4])

In [14]:
#Get values in a range
arr[1:5]

array([1, 2, 3, 4])

## Array Slicing: Accessing Subarrays

In [15]:
arr_2d

array([[ 5, 10, 15],
       [20, 25, 30],
       [35, 40, 45]])

In [16]:
#Bottom row, shape (3,)
arr_2d[2,:]

array([35, 40, 45])

In [18]:
#Shape (2,2) from top right corner
arr_2d[:2,1:]

array([[10, 15],
       [25, 30]])

### Combining indexing and slicing

Accessing array rows and columns. One commonly needed routine is accessing single
rows or columns of an array. You can do this by combining indexing and slicing,
using an empty slice marked by a single colon ( : ):

In [22]:
#Accessing row at 0 index (0th row, all columns)
arr_2d[0,:]
# arr_2d[0, :] equivalent to arr_2d[0]

array([ 5, 10, 15])

In [24]:
#Accessing column at 0 index (all rows, 0th column)
arr_2d[:, 0]

array([ 5, 20, 35])

## Broadcasting

NumPy arrays differ from normal Python lists in their ability to broadcast: a single value assigned to a slice is spread across every element of that slice.

You can modify values using any of the index notations above:

In [25]:
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [26]:
#Setting a value with index range (Broadcasting)
arr[0:5]=100

#Show
arr

array([100, 100, 100, 100, 100,   5,   6,   7,   8,   9,  10])
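
The same assignment syntax also works with the 2D index notation shown earlier. A minimal sketch, using a fresh array with the same values as `arr_2d` (the name `grid` is made up for this example):

```python
import numpy as np

grid = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

grid[0, 0] = -1          # single element
grid[:, 2] = 0           # one scalar broadcast down a whole column
grid[1, :] = [7, 8, 9]   # a sequence of matching length fills a row

print(grid)
# [[-1 10  0]
#  [ 7  8  9]
#  [35 40  0]]
```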

Keep in mind that, unlike Python lists, NumPy arrays have a fixed type. This means,
for example, that if you attempt to insert a floating-point value into an integer
array, the value will be silently truncated:

In [27]:
arr[0:5]=3.14

#Show
arr

array([ 3,  3,  3,  3,  3,  5,  6,  7,  8,  9, 10])
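
If the fractional part matters, one option is to convert to a floating-point array before assigning. A minimal sketch using `astype`, which returns a new array of the requested type (the name `arr_float` is only for illustration):

```python
import numpy as np

arr = np.arange(0, 11)
print(arr.dtype.kind)          # i  (a signed integer type; the exact width depends on the platform)

arr[0:5] = 3.14                # truncated to 3 because arr holds integers
print(arr[:5])                 # [3 3 3 3 3]

arr_float = arr.astype(float)  # new float64 array; arr itself is unchanged
arr_float[0:5] = 3.14
print(arr_float[:5])           # [3.14 3.14 3.14 3.14 3.14]
```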

## Subarrays as no-copy views / Views and Copies


One important—and extremely useful—thing to know about array slices is that they
return views rather than copies of the array data. This is one area in which NumPy
array slicing differs from Python list slicing: in lists, slices will be copies. 
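
A small side-by-side sketch of that difference (the names `py_list` and `np_arr` are made up for this example):

```python
import numpy as np

py_list = list(range(11))
np_arr = np.arange(0, 11)

list_slice = py_list[0:3]   # a new, independent list
arr_slice = np_arr[0:3]     # a view onto np_arr's data

list_slice[0] = 99
arr_slice[0] = 99

print(py_list[0])   # 0  : the original list is untouched
print(np_arr[0])    # 99 : the original array changed
print(np.shares_memory(np_arr, arr_slice))  # True
```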

In [28]:
# Reset array, we'll see why I had to reset in a moment
arr = np.arange(0,11)

#Show
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [29]:
#Important notes on Slices
slice_of_arr = arr[0:6]

#Show slice
slice_of_arr

array([0, 1, 2, 3, 4, 5])

In [30]:
#Change Slice
slice_of_arr[:]=99

#Show Slice again
slice_of_arr

array([99, 99, 99, 99, 99, 99])

Now note the changes also occur in our original array!

In [31]:
arr

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])

Data is not copied; the slice is a view of the original array! This default behavior is actually quite useful: it means that when we work with large
datasets, we can access and process pieces of these datasets without the need to copy the underlying data buffer. This avoids memory problems!


In [32]:
#To get a copy, need to be explicit
arr_copy = arr.copy()

arr_copy

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])

### The Difference Between Copy and View

The main difference between a copy and a view of an array is that the copy is a new array, and the view is just a view of the original array.

The copy owns the data and any changes made to the copy will not affect original array, and any changes made to the original array will not affect the copy.

The view does not own the data and any changes made to the view will affect the original array, and any changes made to the original array will affect the view.

In [33]:
# Reset array
arr = np.arange(0,11)

#To get a copy, need to be explicit
arr_copy = arr.copy()

print(arr)
print(arr_copy)

[ 0  1  2  3  4  5  6  7  8  9 10]
[ 0  1  2  3  4  5  6  7  8  9 10]


In [34]:
arr_copy[0] = 123

print(arr)
print(arr_copy)

[ 0  1  2  3  4  5  6  7  8  9 10]
[123   1   2   3   4   5   6   7   8   9  10]


## View

Make a view, change the original array, and display both arrays:

In [35]:
arr_view = arr.view()
arr[0] = 42

print(arr)
print(arr_view) 

[42  1  2  3  4  5  6  7  8  9 10]
[42  1  2  3  4  5  6  7  8  9 10]


## Check if Array Owns Its Data

As mentioned above, copies own their data and views do not, but how can we check this?

Every NumPy array has the attribute `base` that returns `None` if the array owns the data.

Otherwise, the `base`  attribute refers to the original object. 

In [36]:
arr_copy = arr.copy()
arr_view = arr.view()

print(arr_copy.base)
print(arr_view.base) 

None
[42  1  2  3  4  5  6  7  8  9 10]


The copy returns None.

The view returns the original array.
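
The same check works for slices, which are views as well. A brief sketch (the name `arr_slice` is only for illustration):

```python
import numpy as np

arr = np.arange(0, 11)
arr_slice = arr[2:5]

print(arr.base is None)               # True : arr owns its data
print(arr_slice.base is arr)          # True : the slice refers back to arr
print(arr_slice.copy().base is None)  # True : a copy owns its own data
```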

### Fancy Indexing

Fancy indexing allows you to select entire rows or columns out of order. To show this, let's quickly build a NumPy array:

In [37]:
#Set up matrix
arr2d = np.zeros((10,10))
arr2d

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

In [38]:
#Shape of the array (rows, columns)
print(arr2d.shape)


(10, 10)


In [0]:
# Access the values of the tuple
print(f'no. of rows = {arr2d.shape[0]}')
print(f'no. of columns = {arr2d.shape[1]}')

In [39]:
arr_total_rows = arr2d.shape[0]
arr_total_rows

10

In [40]:
#Set up array

for i in range(arr_total_rows):
    arr2d[i] = i #Assigning index value to all the elements of the row

arr2d

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [5., 5., 5., 5., 5., 5., 5., 5., 5., 5.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.],
       [9., 9., 9., 9., 9., 9., 9., 9., 9., 9.]])

Fancy indexing allows the following:

In [41]:
arr2d[[2,4,6,8]] #select rows 2, 4, 6 and 8 (row indices start at 0)

array([[2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.]])

In [42]:
#Allows in any order
arr2d[[6,4,2,7]]

array([[6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.]])
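
Columns can be picked the same way by placing the index list in the column position. The sketch below uses a small hypothetical matrix `m` (not part of the lecture data) whose entries are all different, so the reordering is easy to see. It also suggests one point worth keeping in mind: unlike a slice, a fancy-indexed result is a copy.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)   # hypothetical 3x4 matrix with distinct values
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

cols = m[:, [3, 0]]   # all rows, columns 3 and 0 in that order
print(cols)
# [[ 3  0]
#  [ 7  4]
#  [11  8]]

picked = m[[2, 0]]    # rows 2 and 0 in that order
picked[:] = -1        # modify the result of fancy indexing

print(m[2, 0])                       # 8 : the original is unchanged
print(np.shares_memory(m, picked))   # False
```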

## More Indexing Help
Indexing a 2D matrix can be a bit confusing at first, especially when you start to add a step size.
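
A few step-size examples on the 3x3 matrix from earlier may help; each axis gets its own start:end:step slice, separated by a comma (the result names are only for illustration):

```python
import numpy as np

arr_2d = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

corners = arr_2d[::2, ::2]     # every second row and every second column
print(corners)
# [[ 5 15]
#  [35 45]]

flipped_rows = arr_2d[::-1]    # rows in reverse order, columns untouched
print(flipped_rows)
# [[35 40 45]
#  [20 25 30]
#  [ 5 10 15]]

lower_outer = arr_2d[1:, ::2]  # rows 1-2, columns 0 and 2
print(lower_outer)
# [[20 30]
#  [35 45]]
```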

## Selection

Let's briefly go over how to use brackets for selection based on comparison operators.

In [43]:
arr = np.arange(1,11)
arr

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [44]:
arr > 4

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [0]:
bool_arr = arr>4

In [46]:
bool_arr

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [47]:
#Boolean Mask
arr[bool_arr]

array([ 5,  6,  7,  8,  9, 10])

In [49]:
arr[arr>2]

array([ 3,  4,  5,  6,  7,  8,  9, 10])

In [50]:
x = 2
arr[arr>x]

array([ 3,  4,  5,  6,  7,  8,  9, 10])

In [51]:
# how many values less than 6?
np.count_nonzero(arr < 6)

5
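
Conditions can also be combined. A short sketch, assuming the same `arr` as above: NumPy uses the element-wise operators `&` (and), `|` (or) and `~` (not), and each comparison needs its own parentheses because these operators bind more tightly than `>` or `<`. A boolean mask can be used for assignment too.

```python
import numpy as np

arr = np.arange(1, 11)

between = arr[(arr > 2) & (arr < 8)]   # both conditions must hold
outside = arr[(arr < 3) | (arr > 8)]   # either condition may hold
odd = arr[~(arr % 2 == 0)]             # negate a mask

print(between)   # [3 4 5 6 7]
print(outside)   # [ 1  2  9 10]
print(odd)       # [1 3 5 7 9]

# Assign through a mask (done on a copy to keep arr intact)
capped = arr.copy()
capped[capped > 5] = 0
print(capped)    # [1 2 3 4 5 0 0 0 0 0]
```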