___

<a href='http://www.pieriandata.com'> <img src='./Pierian_Data_Logo.png' /></a>
___

# NumPy Indexing and Selection

In this lecture we will discuss how to select elements or groups of elements from a NumPy array.

In [1]:
import numpy as np

In [3]:
#Creating sample array
arr = np.arange(0,11) #a 1D array of 11 elements, shape (11,)

In [4]:
#Show
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [6]:
arr.shape

(11,)

## Bracket Indexing and Selection
The simplest way to pick one or more elements of an array looks very similar to indexing a Python list:

In [12]:
#Get a value at an index
arr[8]

8

In [13]:
#Get values in a range (using slice notation)
arr[1:5]

array([1, 2, 3, 4])

In [14]:
arr[1:5:2]

array([1, 3])

In [15]:
#Get values in a range
arr[0:5]

array([0, 1, 2, 3, 4])

In [16]:
arr[:5]

array([0, 1, 2, 3, 4])
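
The slice forms above follow one pattern, `arr[start:stop:step]`, where any part can be left out. Here is a small sketch of the defaults, rebuilding the same sample array; the variable names `first_five`, `from_five`, and `every_other` are only illustrative and do not appear in the lecture.

```python
import numpy as np

arr = np.arange(0, 11)  # same sample array as in the cells above

# Omitting the start means "begin at index 0"
first_five = arr[:5]

# Omitting the stop means "continue through the last element"
from_five = arr[5:]

# A third number is the step size; here every second element
every_other = arr[::2]

print(first_five.tolist())   # [0, 1, 2, 3, 4]
print(from_five.tolist())    # [5, 6, 7, 8, 9, 10]
print(every_other.tolist())  # [0, 2, 4, 6, 8, 10]
```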

## Broadcasting

NumPy arrays differ from normal Python lists in their ability to broadcast: a single value assigned to a slice is applied to every element of that slice.

In [59]:
#Setting a value with index range (Broadcasting)
arr[0:5]=100

#Show
arr

array([100, 100, 100, 100, 100,   5,   6,   7,   8,   9,  10])
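
To make the contrast with a plain list concrete, here is a minimal sketch that tries the same assignment on both. The names `values` and `list_error` are illustrative only.

```python
import numpy as np

values = list(range(11))  # plain Python list with the same contents
try:
    values[0:5] = 100     # a list slice expects an iterable on the right-hand side
    list_error = None
except TypeError as exc:
    list_error = exc      # the list rejects the scalar assignment

print(type(list_error).__name__)  # TypeError

arr = np.arange(0, 11)
arr[0:5] = 100            # NumPy broadcasts the scalar across the whole slice
print(arr.tolist())       # [100, 100, 100, 100, 100, 5, 6, 7, 8, 9, 10]
```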

In [60]:
# Reset the array; we'll see why the reset is needed in a moment
arr = np.arange(0,11)

#Show
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [61]:
#Important notes on Slices
slice_of_arr = arr[0:6]

#Show slice
slice_of_arr

array([0, 1, 2, 3, 4, 5])

In [62]:
#Change Slice
slice_of_arr[:]=99 #[:] means I'm grabbing everything in 'slice_of_arr' array.

#Show Slice again
slice_of_arr

array([99, 99, 99, 99, 99, 99])

Now note the changes also occur in our original array!

In [63]:
arr
# the 99s were written not just to the slice ('slice_of_arr') but also to the original array ('arr')!

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])

Data is not copied; the slice is a view of the original array. This avoids unnecessary memory use, which is why NumPy does not copy arrays automatically when you slice them. If you want an independent copy rather than a view of the original array, you have to ask for one explicitly with `.copy()`.

In [64]:
#To get a copy, need to be explicit
arr_copy = arr.copy()
arr_copy[:] = 10
arr_copy

array([10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10])

In [65]:
arr #it is unaffected by the above broadcasting to 10

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])
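
If you are unsure whether a variable is a view or a copy, you can check instead of guessing. The sketch below uses `np.shares_memory` and the `.base` attribute on a freshly built array; it is one way to inspect this, not the only one.

```python
import numpy as np

arr = np.arange(0, 11)
slice_of_arr = arr[0:6]  # basic slice: a view onto arr's data
arr_copy = arr.copy()    # explicit copy: its own, independent data

# A view shares memory with the array it was sliced from; a copy does not
print(np.shares_memory(arr, slice_of_arr))  # True
print(np.shares_memory(arr, arr_copy))      # False

# .base points at the array that owns the data (None if the array owns its own)
print(slice_of_arr.base is arr)  # True
print(arr_copy.base is None)     # True
```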

> ⚠️ **Caution:** If you take a slice of an array (`slice_of_arr = arr[0:6]`) and then
> assign to it (`slice_of_arr[:] = 99`) without first making an explicit copy (`.copy()`),
> you are working with a view of the original array `arr`, so every change you make also changes `arr`.


### Performance

An array's `copy()` method is faster than `copy.deepcopy()` from Python's standard library because it does not perform recursive deep copying. For NumPy arrays, `copy()` is typically sufficient for creating independent copies.

Here, "recursive" means that the copy operation descends into nested structures. `deepcopy()` does not stop at the top-level object; it also copies every nested object inside it, so each element of a structure such as a list of lists or an array of objects is fully duplicated rather than referenced. By contrast, `copy()` creates a new top-level array but does not duplicate any Python objects stored inside it. That difference only matters for arrays that hold objects, and skipping the extra work is what makes `copy()` faster.

### Use Case

The `deepcopy()` function is generally used for more complex structures, such as lists of lists or arrays containing objects. It is rarely necessary for NumPy arrays that store primitive data types. For simple NumPy arrays (with primitive data types like int or float), the `copy()` method behaves like a deep copy.

### In Summary

For most NumPy arrays, using `copy()` is sufficient, while `deepcopy()` is typically reserved for more complex or nested scenarios.
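
The difference only shows up when an array holds Python objects. The sketch below assumes a small array with `dtype=object` whose elements are lists; the names `obj_arr`, `shallow`, and `deep` are hypothetical and chosen for illustration.

```python
import copy
import numpy as np

# An object array whose two elements are ordinary Python lists
obj_arr = np.empty(2, dtype=object)
obj_arr[0] = [1, 2]
obj_arr[1] = [3, 4]

shallow = obj_arr.copy()       # new array, but it still references the same lists
deep = copy.deepcopy(obj_arr)  # new array holding newly created lists

obj_arr[0].append(99)          # mutate a list stored inside the original array

print(shallow[0])  # [1, 2, 99] because the list object is shared
print(deep[0])     # [1, 2] because the list was duplicated
```

For arrays of plain numbers there are no nested objects to share, so `copy()` alone already gives fully independent data.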

## Indexing a 2D array (matrices)

The general format is **arr_2d[row][col]** or **arr_2d[row,col]**. I recommend usually using the comma notation for clarity.

In [40]:
arr_2d = np.array([[5,10,15],[20,25,30],[35,40,45]]) #a 2D array (matrix)
#or
#arr_2d = np.array(([5,10,15],[20,25,30],[35,40,45]))
#Show
arr_2d

array([[ 5, 10, 15],
       [20, 25, 30],
       [35, 40, 45]])

In [38]:
#Indexing row
arr_2d[1]


array([20, 25, 30])

In [41]:
# Format is arr_2d[row][col] or arr_2d[row,col]

# Getting individual element value
arr_2d[1][0]

20

In [42]:
# Getting individual element value
arr_2d[1,0]

20
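
One reason to prefer the comma notation: the two forms agree for single elements, but they stop agreeing once a slice is involved. A short sketch, with the illustrative names `col_comma` and `row_chained`:

```python
import numpy as np

arr_2d = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

# Comma notation: all rows, column 1
col_comma = arr_2d[:, 1]

# Chained brackets: arr_2d[:] is the whole array again,
# so the following [1] selects row 1, not column 1
row_chained = arr_2d[:][1]

print(col_comma.tolist())    # [10, 25, 40]
print(row_chained.tolist())  # [20, 25, 30]
```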

In [43]:
# 2D array slicing

#Shape (2,2) from top right corner
arr_2d[:2,1:] #grab rows 0 and 1 (up to, but not including, row 2), then columns 1 onwards.

array([[10, 15],
       [25, 30]])

In [47]:
arr_2d[:2]

array([[ 5, 10, 15],
       [20, 25, 30]])

In [49]:
arr_2d[:,1:]

array([[10, 15],
       [25, 30],
       [40, 45]])

In [51]:
arr_2d[:2,1:].shape #(2, 2)

(2, 2)

In [44]:
#Bottom row
arr_2d[2]

array([35, 40, 45])

In [46]:
#Bottom row (explicit column slice)
arr_2d[2,:]

array([35, 40, 45])

### Fancy Indexing

Fancy indexing allows you to select entire rows or columns out of order. To show this, let's quickly build out a NumPy array:

In [58]:
#Set up matrix
arr2d = np.zeros((10,10))
arr2d

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

In [56]:
arr2d.shape

(10, 10)

In [54]:
#Number of rows (the loop below fills one row per iteration)
arr_length = arr2d.shape[0]
arr_length

10

In [73]:
#Set up array

for i in range(arr_length):
    arr2d[i] = i
    
arr2d

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [5., 5., 5., 5., 5., 5., 5., 5., 5., 5.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.],
       [9., 9., 9., 9., 9., 9., 9., 9., 9., 9.]])

Fancy indexing allows the following:

In [74]:
arr2d[[2,4,6,8]]

array([[2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.]])

**The line `arr2d[[2,4,6,8]]` is an example of fancy indexing.**  
Fancy indexing allows you to **select multiple rows (or columns) using an array of indices** instead of just a single index or a slice.

#### **Fancy Indexing Characteristics in the Example Above:**

1. Instead of using a single index like `arr2d[2]`, the example uses a **list of indices** `[2,4,6,8]`, which is what makes it fancy indexing.
2. It **returns a new array** (a copy of the selected rows) rather than a view of `arr2d`, which is what a slice would return.
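
The second point can be checked directly. The sketch below rebuilds the same matrix and compares fancy indexing with an equivalent slice; the names `picked` and `sliced` are illustrative.

```python
import numpy as np

arr2d = np.zeros((10, 10))
for i in range(arr2d.shape[0]):
    arr2d[i] = i               # row i is filled with the value i

picked = arr2d[[2, 4, 6, 8]]   # fancy indexing: a new array
picked[:] = -1                 # writing to it leaves arr2d untouched
print(arr2d[2, 0])             # 2.0

sliced = arr2d[2:9:2]          # the same rows through a slice: a view
sliced[:] = -1                 # writing to it changes arr2d
print(arr2d[2, 0])             # -1.0
```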

### **Alternative Fancy Indexing Example:**

```python
arr2d[:, [1, 3, 5, 7]] #This would select columns 1, 3, 5, and 7 from arr2d, which is also fancy indexing.
```


In [83]:
arr2d[:, [1, 3, 5, 7]]

array([[0., 0., 0., 0.],
       [1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [4., 4., 4., 4.],
       [5., 5., 5., 5.],
       [6., 6., 6., 6.],
       [7., 7., 7., 7.],
       [8., 8., 8., 8.],
       [9., 9., 9., 9.]])

In [None]:
arr2d[2,4,6,8]
# IndexError                                Traceback (most recent call last)
# Cell In[66], line 1
# ----> 1 arr2d[2,4,6,8]

# IndexError: too many indices for array: array is 2-dimensional, but 4 were indexed

In [80]:
arr2d[[2]]

array([[2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]])

In [77]:
arr2d[2]

array([2., 2., 2., 2., 2., 2., 2., 2., 2., 2.])

In [25]:
#Rows can be selected in any order
arr2d[[6,4,2,7]]

array([[ 6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.],
       [ 4.,  4.,  4.,  4.,  4.,  4.,  4.,  4.,  4.,  4.],
       [ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.],
       [ 7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.]])

## More Indexing Help
Indexing a 2D matrix can be a bit confusing at first, especially when you start to add in step size. Try a Google image search for "NumPy indexing" to find useful images, like this one:

<img src= 'http://memory.osu.edu/classes/python/_images/numpy_indexing.png' width=500/>
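
As a small worked example of step size in two dimensions, here is a sketch on a hypothetical 5x10 array called `grid` (not one of the arrays defined above). Each axis takes its own `start:stop:step`.

```python
import numpy as np

grid = np.arange(50).reshape(5, 10)  # values 0..49 laid out in 5 rows of 10

# Every second row (rows 0, 2, 4), all columns
every_other_row = grid[::2]

# Rows 0, 2, 4 combined with columns 1, 4, 7 (start at 1, step 3)
sub = grid[::2, 1::3]

print(every_other_row.shape)  # (3, 10)
print(sub.tolist())           # [[1, 4, 7], [21, 24, 27], [41, 44, 47]]
```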

## Selection

Let's briefly go over how to use brackets for selection based on comparison operators.

In [87]:
arr = np.arange(1,11) #(10,) 1D array
arr

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [93]:
arr > 4 #comparing the array to a single value returns a boolean array of the same shape

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [99]:
bool_arr = arr>4

In [95]:
bool_arr #a boolean array (every element is True or False)

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [33]:
arr[bool_arr] #now we can use that (bool_arr) to actually do conditional selection.

array([ 5,  6,  7,  8,  9, 10])

In [96]:
arr[arr>2] #passing in a conditional statement

array([ 3,  4,  5,  6,  7,  8,  9, 10])

In [97]:
x = 2
arr[arr>x]

array([ 3,  4,  5,  6,  7,  8,  9, 10])
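
A boolean array can also be used on the left-hand side of an assignment, which combines selection with the broadcasting shown earlier. A minimal sketch, using the illustrative names `mask`, `count`, and `capped`:

```python
import numpy as np

arr = np.arange(1, 11)
mask = arr > 4           # boolean array, True where the condition holds

# True counts as 1, so summing the mask counts the matching elements
count = int(mask.sum())
print(count)             # 6

# Work on a copy so the original array stays intact
capped = arr.copy()
capped[mask] = 0         # the scalar is broadcast to every selected position
print(capped.tolist())   # [1, 2, 3, 4, 0, 0, 0, 0, 0, 0]
print(arr.tolist())      # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```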

# Great Job!


In [101]:
arr_2d = np.arange(50).reshape(5, 10)
arr_2d

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49]])

In [104]:
arr_2d[arr_2d>3]
#arr_2d[arr_2d>3].shape output:(46,)

array([ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
       21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
       38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49])

In [105]:
arr_2d[2,1:5]

array([21, 22, 23, 24])

In [107]:
arr_2d[2,[1,2,3,4]]

array([21, 22, 23, 24])

In [116]:
arr_2d[[1,3],[3,4]] #this will return the element at (1, 3) and the element at (3, 4). 

array([13, 34])

In [113]:
arr_2d[1:3,3:5]
#In summary, this is selecting:
#   Rows 1 and 2
#   Columns 3 and 4

array([[13, 14],
       [23, 24]])
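
The last few cells show that two index lists are matched up pairwise, while two slices select a rectangular block. If you want a block from non-adjacent rows and columns, one option is `np.ix_`, sketched below; the names `pairs` and `block` are illustrative.

```python
import numpy as np

arr_2d = np.arange(50).reshape(5, 10)

# Two index lists are paired element by element: (1, 3) and (3, 4)
pairs = arr_2d[[1, 3], [3, 4]]

# np.ix_ builds an open mesh, so every listed row meets every listed column
block = arr_2d[np.ix_([1, 3], [3, 4])]

print(pairs.tolist())  # [13, 34]
print(block.tolist())  # [[13, 14], [33, 34]]
```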