---   
 <img align="left" width="75" height="75"  src="https://upload.wikimedia.org/wikipedia/en/c/c8/University_of_the_Punjab_logo.png"> 

<h1 align="center">Department of Data Science</h1>
<h1 align="center">Course: Tools and Techniques for Data Science</h1>

---
<h3><div align="right">Instructor: Muhammad Arif Butt, Ph.D.</div></h3>    

<h1 align="center">Lecture 3.4 (NumPy-04)</h1>

<a href="https://colab.research.google.com/github/arifpucit/data-science/blob/master/Section-3-Python-for-Data-Scientists/Lec-3.04(NumPy-04-Array-Indexing-and-Slicing).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# _Array Indexing, Subsetting and Slicing.ipynb_

<img align="center" width="500" height="500"  src="images/indexingandslicing.png" > 

# Learning agenda of this notebook
Indexing and slicing are two of the most common operations you need when working with NumPy arrays. You use them whenever you want to work with a subset of an array.
1. Indexing NumPy Arrays
    - Indexing 1-D NumPy Arrays
    - Indexing 2-D NumPy Arrays
    - Indexing 3-D NumPy Arrays
2. Slicing NumPy Arrays
    - Slicing 1-D NumPy Arrays
    - Slicing 2-D NumPy Arrays
3. Boolean Array Indexing
    - Boolean Indexing on 1-D NumPy Arrays
    - Boolean Indexing on 2-D NumPy Arrays

In [None]:
# To install this library in Jupyter notebook
#import sys
#!{sys.executable} -m pip install numpy

In [None]:
import numpy as np
np.__version__ , np.__path__

## 1. Indexing Numpy Arrays
- You can access an entire dimension or an individual element of a NumPy array using indexing.
- Indexes in NumPy arrays start at 0: the first element has index 0, the second has index 1, and so on.
- You can use negative indexes as well, which count backwards from the last element.

### a. Indexing 1-D NumPy Arrays
- Along a single axis, integers are used to select single elements, and so-called slices are used to select ranges and sequences of elements. 
- Positive integers index elements from the beginning of the array (starting at 0), and negative integers index elements from the end of the array: the last element is at index `-1`, the second to last at `-2`, and so on.

In [1]:
import numpy as np
mylist = [4, 5, 6, 7, 0, 2, 3]
arr = np.array(mylist, dtype=np.uint8)


print("Original Array \n", arr, "\nArray Shape: ",arr.shape)

# You can access specific elements using positive as well as negative index
print("\narr[0] = ", arr[0])
print("arr[6] = ", arr[6])
print("arr[-1] = ", arr[-1])       
print("arr[-7] = ", arr[-7])  


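print("arr[7] = ", arr[7])       # IndexError: valid indexes for this array are -7 to 6
print("arr[-8] = ", arr[-8])      # would also raise IndexError, but is never reached because the line above raises first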

Original Array 
 [4 5 6 7 0 2 3] 
Array Shape:  (7,)

arr[0] =  4
arr[6] =  3
arr[-1] =  3
arr[-7] =  4


IndexError: index 7 is out of bounds for axis 0 with size 7
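
Only one `IndexError` appears above because execution stops at the first failing line. The following is a minimal sketch, assuming you want to see both out-of-range accesses in a single cell; the helper name `safe_get` is hypothetical and not part of NumPy.

```python
import numpy as np

arr = np.array([4, 5, 6, 7, 0, 2, 3], dtype=np.uint8)

def safe_get(a, i):
    """Return a[i], or None if i is out of range (hypothetical helper)."""
    try:
        return a[i]
    except IndexError as err:
        print(f"arr[{i}] -> IndexError: {err}")
        return None

# Valid indexes for a size-7 array run from -7 to 6 inclusive
results = {i: safe_get(arr, i) for i in (6, 7, -7, -8)}
print(results)
```

Indexes 6 and -7 return the last and first elements, while 7 and -8 fall just outside the valid range on either side.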

### b. Indexing 2-D NumPy Arrays

In [2]:
import numpy as np
mylist = [
          [1, 2, 3, 4], 
          [5, 6, 7, 8], 
          [3, 2, 4, 1], 
          [7, 3, 4, 9], 
          [4, 0, 3, 1]
          ]
arr = np.array(mylist)
print("Original Array \n", arr, "\nArray Shape: ",arr.shape)
print("Strides: ", arr.strides)

# You can access specific elements
print("arr[3][2] = ", arr[3][2])
print("arr[3,2] = ", arr[3,2])

# You can access entire rows
print("arr[3] = ", arr[3])

# Negative indexing
print("arr[-1][-2] = ", arr[-1][-2])    
print("arr[-1, -2] = ", arr[-1, -2]) 
print("arr[-3][-3] = ", arr[-3][-3])    
print("arr[-3, -3] = ", arr[-3, -3])    
print("arr[-3] = ", arr[-3])       
print("arr[-1] = ", arr[-1])       


Original Array 
 [[1 2 3 4]
 [5 6 7 8]
 [3 2 4 1]
 [7 3 4 9]
 [4 0 3 1]] 
Array Shape:  (5, 4)
Strides:  (32, 8)
arr[3][2] =  4
arr[3,2] =  4
arr[3] =  [7 3 4 9]
arr[-1][-2] =  3
arr[-1, -2] =  3
arr[-3][-3] =  2
arr[-3, -3] =  2
arr[-3] =  [3 2 4 1]
arr[-1] =  [4 0 3 1]


### c. Indexing 3-D NumPy Arrays

In [None]:
import numpy as np
mylist = [
          [[1, 2, 3], [5, 6, 7], [9, 0, 5]],
          [[8, 1, 6], [1, 9, 4], [5, 8, 2]],
         ]
arr = np.array(mylist)
print("Original Array:\n", arr)
print("Array Shape = ",arr.shape)
print("Strides:", arr.strides)


In [None]:
# You can access the first 2-D matrix (index 0 along the first axis)
print("\narr[0]: \n", arr[0])
# You can access the second 2-D matrix (index 1 along the first axis)
print("arr[1]: \n", arr[1])

In [None]:
print("Original Array:\n", arr)
print("Array Shape = ",arr.shape)

# You can access a specific row at a specific level
print("\narr[0][1]: ", arr[0][1])
print("\narr[0, 1]: ", arr[0, 1])
print("arr[1][2]: ", arr[1][2])
print("arr[1, 2]: ", arr[1, 2])

In [None]:
print("Original Array:\n", arr)
print("Array Shape = ",arr.shape)

# You can access a specific element
print("\narr[0][1][2]: ", arr[0][1][2])
print("arr[1][2][1]: ", arr[1][2][1])


In [None]:
print("Original Array:\n", arr)
print("Array Shape = ",arr.shape)

# Negative indexing
print("\narr[-1][-2][-1]: ", arr[-1][-2][-1])    
print("arr[-2][2]: ", arr[-2][2])    

## 2. Slicing NumPy Arrays
- You can slice a NumPy array in much the same way as a Python list, with two differences:
    - NumPy arrays can be sliced in more than one dimension.
    - Slicing a Python list gives you a completely new list, while slicing a NumPy array gives you a **view** of the original array, which is just another way of accessing the same data. The original array is not copied in memory.
- An array is sliced using the `:` symbol, which returns the range of elements specified by the index numbers.
- Slicing takes three arguments, all of them optional:
```
array[[start]:[stop][:step]]
```

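    - start: index where the slice starts, inclusive (default is 0, or the last element when step is negative)
    - stop: index where the slice stops, exclusive (default is the end of the array, or just before the first element when step is negative)
    - step: spacing between the selected indexes (default is 1)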
    
Note: Subarrays that are extracted from arrays using slice operations are alternative views of the same underlying array data. This means that they are arrays that refer to the same data in memory as the original array, but with a different strides configuration.
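
The note above can be checked directly. This is a small sketch, assuming a 1-D integer array; `np.shares_memory`, the `.base` attribute, and `.copy()` are standard NumPy, while the variable names are only illustrative.

```python
import numpy as np

arr1 = np.arange(1, 10)              # [1 2 3 4 5 6 7 8 9]
view = arr1[2:8:2]                   # slicing returns a view on arr1's buffer
independent = arr1[2:8:2].copy()     # .copy() gives a separate buffer

print(view.base is arr1)                     # the view records its parent array
print(np.shares_memory(arr1, view))          # True
print(np.shares_memory(arr1, independent))   # False

# Same data, different strides: a step of 2 doubles the byte stride
print(arr1.strides, view.strides)

# Writing through the view changes the original, but not the copy
view[0] = 99
print(arr1)
print(independent)
```

Use `.copy()` whenever you need a slice that can be modified without touching the original array.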

### a. Slicing 1-D Arrays

In [None]:
import numpy as np
mylist = [4, 5, 6, 7, 0, 2, 3]
arr = np.array(mylist, dtype=np.uint8)
print("Original Array \n", arr, "\nArray Shape = ",arr.shape)

In [None]:
print("\narr[:] = ", arr[:])
print("arr[3:] = ", arr[3:])
print("arr[:4] = ", arr[:4])
print("arr[2:5] = ", arr[2:5])

print("arr[:-2] = ", arr[:-2])
print("arr[-1:] = ", arr[-1:])
print("arr[-1:-4] = ", arr[-1:-4])        # empty array: with the default step of +1, start is already past stop
print("arr[-1:-4:-1] = ", arr[-1:-4:-1])

# reverse the array using step value as -1
print("arr[::-1] = ", arr[::-1])
print("\nAfter all this arr is same: ", arr)

**Proof of Concept:** Slice of a Python List Returns a New List

In [None]:
list1 = [1, 2, 3, 4, 5, 6, 7,8, 9]
list1

In [None]:
# A new list object is created after slicing
list2 = list1[2:5]
list2

In [None]:
# If we change this new list, it does not affect the original list
list2[0] = 99

In [None]:
list1

In [None]:
list2

**Proof of Concept:** Slice of a NumPy array returns a **view** of the original NumPy array

In [None]:
list1 = [1, 2, 3, 4, 5, 6, 7,8, 9]
arr1 = np.array(list1)
arr1

In [None]:
# A view of the original numPy array is created after slicing
arr2 = arr1[2:5]
arr2

In [None]:
# If we change this view, the change also shows up in the original array
arr2[0] = 99

In [None]:
arr1

In [None]:
arr2, arr2.strides

### b. Slicing 2-D Arrays
- Slicing a two-dimensional array is very similar to slicing a one-dimensional array. You just use a comma to separate the row slice and the column slice.
- NumPy extends Python's list indexing notation `[]` to multiple dimensions in an intuitive fashion. You can provide a comma-separated list of indices or ranges to select a specific element or a subarray (also called a slice) from a NumPy array.

In [None]:
import numpy as np
mylist = [
          [1, 2, 3, 4], 
          [5, 6, 7, 8], 
          [3, 2, 4, 1], 
          [7, 3, 4, 9], 
          [4, 0, 3, 1]
          ]
arr = np.array(mylist)
print("Original Array \n", arr, "\nArray Size = ",arr.shape)

# Note we have two slice objects (row slice and column slice) in case of 2-D slicing separated by a comma
print("\narr[:,:] = \n", arr[:,:])

In [None]:
print("Original Array \n", arr, "\nArray Size = ",arr.shape)

# Get the row at index 2
print("arr[2] = ", arr[2])      # you can ignore the : symbol for the column part
print("arr[2,:]= ", arr[2,:])   # for better readability give comma separated two values

In [None]:
print("Original Array \n", arr, "\nArray Size = ",arr.shape)

# Get the column at index 1 (Get all the row values of column at index 1)
print("arr[:, 1] = ", arr[:, 1])

In [None]:
print("Original Array \n", arr, "\nArray Size = ",arr.shape)

In [None]:
# From the row at index 1, slice elements from index 1 to index 2
print("\narr[1, 1:3] = ",arr[1, 1:3])

In [None]:
print("Original Array \n", arr, "\nArray Size = ",arr.shape)

# Reversing the elements of every row is tricky.
# Read it as "select all the rows, and in each row reverse the order of the column values"
print("\narr[:, ::-1] =\n", arr[:, ::-1])

In [None]:
print("Original Array \n", arr, "\nArray Size = ",arr.shape)

# Reversing the elements of every column is also tricky. In effect, you reverse the order of the rows.
# Read it as "select all the columns, and in each column reverse the order of the row values"
print("\narr[::-1, :] =\n", arr[::-1, :])

>- **Slicing 3-D arrays can be performed in the same fashion. The only difference is that we have three comma separated slice objects instead of two and it is a bit tricky to visualize :)**
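
To make the 3-D case concrete, here is a short sketch that reuses the 2x3x3 array from the 3-D indexing section. The variable names are illustrative only; each index position (block, row, column) takes its own integer or slice.

```python
import numpy as np

# Same 2x3x3 array used in the 3-D indexing section
arr = np.array([
    [[1, 2, 3], [5, 6, 7], [9, 0, 5]],
    [[8, 1, 6], [1, 9, 4], [5, 8, 2]],
])

first_rows = arr[:, 0, :]     # row 0 of every 2-D block      -> shape (2, 3)
last_cols = arr[:, :, -1]     # last column of every block    -> shape (2, 3)
sub_block = arr[1, 1:, :2]    # block 1, rows 1-2, columns 0-1 -> shape (2, 2)

print("arr[:, 0, :] =\n", first_rows)
print("arr[:, :, -1] =\n", last_cols)
print("arr[1, 1:, :2] =\n", sub_block)
```

An integer in any position drops that axis from the result, while a slice keeps it.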

In [17]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [3, 2, 4, 1], [7, 3, 4, 9], [4, 0, 3, 1] ])
print(arr[3,2])
print(arr[3])
print(arr[-1, -2]) 
print(arr[-3][-3])    
print(arr[-3])       
print(arr[2, 1:3])
print(arr[::-1, :])
print(arr[2::2, 0::2])

4
[7 3 4 9]
3
2
[3 2 4 1]
[2 4]
[[4 0 3 1]
 [7 3 4 9]
 [3 2 4 1]
 [5 6 7 8]
 [1 2 3 4]]
[[3 4]
 [4 3]]


## 3. Boolean/Fancy Array Indexing

- NumPy provides another convenient method to index arrays, called fancy indexing. With fancy indexing, an array can be indexed with another NumPy array, a Python list, or a sequence of integers, whose values select elements in the indexed array. Fancy indexing requires that the elements in the array or list used for indexing are integers.

- Boolean indexing is an intuitive and elegant way of selecting contents from a NumPy array based on logical conditions.
- In simple words, we can filter NumPy arrays by either:
    - Providing a condition inside the `[]` operator
    - Providing a Boolean mask corresponding to the indexes in the array.
- If the value at an index of the mask is True, that element is included in the filtered array; if the value at that index is False, that element is excluded.

>- **Note:** Unlike normal slicing, which creates a view, Boolean/fancy indexing returns a copy of the selected data.

>- **Note:** The Python keywords `and` and `or` do not work with Boolean arrays. Use `&` and `|` instead, and wrap each condition in parentheses.
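
Both notes can be verified with a small fixed array. This is a sketch under the assumption of a 1-D integer array; the values and variable names are chosen only for illustration.

```python
import numpy as np

arr = np.array([10, 55, 30, 80, 45, 70])

# Boolean and fancy indexing both return copies, not views
picked = arr[arr > 50]        # boolean indexing
fancy = arr[[0, 2]]           # fancy (integer-list) indexing
picked[0] = -1
fancy[0] = -1
print(arr)                                # the original array is unchanged
print(np.shares_memory(arr, picked))      # False

# Combine element-wise conditions with & and |, each wrapped in parentheses
in_range = arr[(arr > 30) & (arr < 75)]
outside = arr[(arr <= 30) | (arr >= 75)]
print(in_range, outside)

# The keyword `and` asks for a single truth value of a whole array,
# which is ambiguous, so NumPy raises ValueError
raised = False
try:
    arr[(arr > 30) and (arr < 75)]
except ValueError as err:
    raised = True
    print("ValueError:", err)
```

The parentheses matter because `&` and `|` bind more tightly than comparison operators such as `>` and `<`.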

### a. Boolean/Fancy Indexing on 1-D Arrays

In [None]:
# Fancy Indexing
import numpy as np
# Create a 1-D array of 10 random integers from 1 to 100 (the upper bound 101 is exclusive)
arr = np.random.randint(1, 101, 10)
print("Original Array: ", arr)
print(arr[np.array([0, 2, 4, 9])])
print(arr[[0, 2, 4, 9]])

In [None]:
# Boolean Indexing
import numpy as np
# Create a 1-D array of 10 random integers from 1 to 100 (the upper bound 101 is exclusive)
arr = np.random.randint(1, 101, 10)
print("Original Array: ", arr)
print(arr > 50)
print(arr[arr>50])

In [None]:
# Boolean Indexing
import numpy as np
# Create a 1-D array of 10 random integers from 1 to 100 (the upper bound 101 is exclusive)
arr = np.random.randint(1, 101, 10)
print("Original Array: ", arr)

# Getting even values from array
print("\narr[arr%2 == 0] = ",arr[arr%2==0])
# Getting odd values from array
print("arr[arr%2 == 1] = ",arr[arr%2==1])

# Getting values greater than 50
print("\narr[arr > 50] = ",arr[arr > 50])
# Getting values between 30 and 60, both exclusive
print("arr[(arr>30) & (arr<60)] = ",arr[(arr>30) & (arr<60)])

In [None]:
# We can use a mask instead of mentioning a condition as done above
import numpy as np
arr1 = np.array([1, 2, 3, 4, 5])
print("Original Array: ", arr1)

mask=[False, False, False, True, False]
arr2 = arr1[mask]

print("Filtered Array: ", arr2)  

### b. Boolean/Fancy Indexing on 2-D Arrays

**Example 1:**

In [None]:
# Passing a tuple as size sets the number of rows and columns
matrix1 = np.random.randint(low = 1, high = 10, size = (5,5))
print("Original matrix \n", matrix1, "\nMatrix Shape = ",matrix1.shape)

>Suppose we want a new matrix, built from the matrix above, that contains only the rows at index 0, 2, and 3

In [None]:
# Create a corresponding boolean mask
rows_wanted = np.array( [True, False, True, True, False] )

matrix2 = matrix1[rows_wanted, :]
print("\nFiltered matrix \n", matrix2, "\nMatrix Shape = ",matrix2.shape)

**Example 2:**

In [None]:
# Passing a tuple as size sets the number of rows and columns
matrix1 = np.random.randint(low = 1, high = 10, size = (5,5))
print("Original matrix \n", matrix1, "\nMatrix Shape = ",matrix1.shape)

>Suppose we want a new matrix, built from the matrix above, that contains only the columns at index 1 and 3

In [None]:
# Create a corresponding boolean mask
cols_wanted = np.array( [False, True, False, True, False] )

matrix2 = matrix1[ : , cols_wanted]
print("\nFiltered matrix \n", matrix2, "\nMatrix Shape = ",matrix2.shape)

**Example 3:**

In [1]:
import numpy as np
# Passing a tuple as size sets the number of rows and columns
matrix1 = np.random.randint(low = 1, high = 10, size = (5,5))
print("Original matrix \n", matrix1, "\nMatrix Shape = ",matrix1.shape)

Original matrix 
 [[8 5 1 9 1]
 [7 2 4 4 2]
 [3 5 5 6 1]
 [9 4 2 6 9]
 [5 5 5 8 2]] 
Matrix Shape =  (5, 5)


>Suppose we want a new matrix from above 5x5 matrix, that contains inner 3x3 matrix

In [2]:
rows_wanted = np.array( [False, True, True, True, False] )

matrix2 = matrix1[ rows_wanted , :]
print("\nFiltered matrix \n", matrix2, "\nMatrix Shape = ",matrix2.shape)


Filtered matrix 
 [[7 2 4 4 2]
 [3 5 5 6 1]
 [9 4 2 6 9]] 
Matrix Shape =  (3, 5)


In [3]:
cols_wanted = np.array( [False, True, True, True, False] )

matrix3 = matrix2[ :, cols_wanted]
print("\nFiltered matrix \n", matrix3, "\nMatrix Shape = ",matrix3.shape)


Filtered matrix 
 [[2 4 4]
 [5 5 6]
 [4 2 6]] 
Matrix Shape =  (3, 3)


**Can you think of doing this in one step or in a more elegant fashion?**

In [4]:
matrix3 = matrix1[1:4, 1:4]
print("\nFiltered matrix \n", matrix3, "\nMatrix Shape = ",matrix3.shape)


Filtered matrix 
 [[2 4 4]
 [5 5 6]
 [4 2 6]] 
Matrix Shape =  (3, 3)


In [6]:
matrix2 = matrix1[ 1 : -1,  1 : -1 ]
matrix2

array([[2, 4, 4],
       [5, 5, 6],
       [4, 2, 6]])
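
If you want to keep the Boolean masks but still select in one step, one option is `np.ix_`, which applies a row mask and a column mask independently. This is a sketch that hard-codes the 5x5 matrix printed in Example 3 so the result is reproducible; the names `inner` and `paired` are illustrative.

```python
import numpy as np

# Fixed copy of the 5x5 matrix printed in Example 3
matrix1 = np.array([
    [8, 5, 1, 9, 1],
    [7, 2, 4, 4, 2],
    [3, 5, 5, 6, 1],
    [9, 4, 2, 6, 9],
    [5, 5, 5, 8, 2],
])
rows_wanted = np.array([False, True, True, True, False])
cols_wanted = np.array([False, True, True, True, False])

# np.ix_ builds an open mesh, so every wanted row is combined with every wanted column
inner = matrix1[np.ix_(rows_wanted, cols_wanted)]
print(inner)

# Passing both masks directly pairs them element by element instead,
# which selects (1,1), (2,2), (3,3) and returns a 1-D array
paired = matrix1[rows_wanted, cols_wanted]
print(paired)
```

Note that the slice `matrix1[1:4, 1:4]` returns a view, whereas the mask-based selection returns a copy.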

# Four Slicing Problems with Solutions

<img align="center" width="500" height="500"  src="images/slicingQ.jpeg" > 

In [None]:
m1 = np.array([
            [0,1,2,3,4,5],
            [6,7,8,9,10,11],
            [12,13,14,15,16,17],
            [18,19,20,21,22,23],
            [24,25,26,27,28,29],
            [30,31,32,33,34,35]
])

In [None]:
m1

In [None]:
m1[0,3:5]

In [None]:
m1[4:, 4:]

In [None]:
m2 = m1[:,2]

In [None]:
m2.strides

In [None]:
m1[2::2, 0::2]