# Indexing and Slicing

## Setting up the data

In [1]:
import numpy as np

In [6]:
# a vector: 10 random values drawn uniformly from [0, 1)
v = np.random.rand(10)
v

array([0.31337741, 0.83071271, 0.64899299, 0.95780203, 0.60968591,
       0.95656145, 0.24021716, 0.58431052, 0.26104131, 0.04564956])

In [7]:
# a matrix: a 10x2 array of random values drawn uniformly from [0, 1)
M = np.random.rand(10, 2)
M

array([[0.62405598, 0.92767433],
       [0.45364931, 0.16889246],
       [0.08483721, 0.02312899],
       [0.4115794 , 0.24591693],
       [0.85912486, 0.5214605 ],
       [0.34309773, 0.99709368],
       [0.86843474, 0.38883784],
       [0.41100728, 0.56703832],
       [0.56023987, 0.58194042],
       [0.08175319, 0.47086043]])

## Indexing

We can index elements in an array using square brackets and indices:

In [8]:
# v is a vector, and has only one dimension, taking one index
v[0]

0.31337741017550913

In [9]:
# M is a matrix (a 2-dimensional array), so it takes two indices
M[1,1]

0.16889246185220408

If we omit an index of a multidimensional array, NumPy returns the whole row (or, in general, an (N-1)-dimensional array):

In [10]:
M

array([[0.62405598, 0.92767433],
       [0.45364931, 0.16889246],
       [0.08483721, 0.02312899],
       [0.4115794 , 0.24591693],
       [0.85912486, 0.5214605 ],
       [0.34309773, 0.99709368],
       [0.86843474, 0.38883784],
       [0.41100728, 0.56703832],
       [0.56023987, 0.58194042],
       [0.08175319, 0.47086043]])

In [11]:
M[1]

array([0.45364931, 0.16889246])

The same thing can be achieved by using `:` instead of an index:

In [12]:
M[1,:] # row 1

array([0.45364931, 0.16889246])

In [13]:
M[:,1] # column 1

array([0.92767433, 0.16889246, 0.02312899, 0.24591693, 0.5214605 ,
       0.99709368, 0.38883784, 0.56703832, 0.58194042, 0.47086043])
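A quick way to check which dimension an index keeps is to inspect the `shape` of the result. The sketch below uses a small hypothetical array `M_demo` (not the random `M` above) so that the values are reproducible:

```python
import numpy as np

# hypothetical 3x2 array used only for this illustration
M_demo = np.arange(6).reshape(3, 2)   # [[0, 1], [2, 3], [4, 5]]

row = M_demo[1]          # omitting the second index keeps the whole row
same_row = M_demo[1, :]  # explicit form of the same selection
col = M_demo[:, 1]       # ':' in the first position keeps every row

print(row.shape)                      # (2,)
print(col.shape)                      # (3,)
print(np.array_equal(row, same_row))  # True
```

Both results are one-dimensional: a selected column comes back as a flat array, not as a 3x1 matrix.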

We can assign new values to elements in an array using indexing:

In [14]:
M[0,0] = 1

In [15]:
M

array([[1.        , 0.92767433],
       [0.45364931, 0.16889246],
       [0.08483721, 0.02312899],
       [0.4115794 , 0.24591693],
       [0.85912486, 0.5214605 ],
       [0.34309773, 0.99709368],
       [0.86843474, 0.38883784],
       [0.41100728, 0.56703832],
       [0.56023987, 0.58194042],
       [0.08175319, 0.47086043]])

In [16]:
M[0:3,1]

array([0.92767433, 0.16889246, 0.02312899])

In [17]:
M

array([[1.        , 0.92767433],
       [0.45364931, 0.16889246],
       [0.08483721, 0.02312899],
       [0.4115794 , 0.24591693],
       [0.85912486, 0.5214605 ],
       [0.34309773, 0.99709368],
       [0.86843474, 0.38883784],
       [0.41100728, 0.56703832],
       [0.56023987, 0.58194042],
       [0.08175319, 0.47086043]])

In [18]:
# also works for rows and columns
M[1,:] = 0
M[:,1] = -1

In [19]:
M

array([[ 1.        , -1.        ],
       [ 0.        , -1.        ],
       [ 0.08483721, -1.        ],
       [ 0.4115794 , -1.        ],
       [ 0.85912486, -1.        ],
       [ 0.34309773, -1.        ],
       [ 0.86843474, -1.        ],
       [ 0.41100728, -1.        ],
       [ 0.56023987, -1.        ],
       [ 0.08175319, -1.        ]])

## Index slicing


Index slicing is the technical name for the syntax `M[lower:upper:step]` to extract part of an array:

In [20]:
a = np.array([1,2,3,4,5])
a

array([1, 2, 3, 4, 5])

In [21]:
a[1:3]

array([2, 3])

Array slices are **mutable**: a slice is a view of the original data, so if it is assigned a new value, the original array from which the slice was extracted is modified:

In [22]:
a[1:3] = [-2,-3]

a

array([ 1, -2, -3,  4,  5])
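The same write-through behaviour applies when a slice is first stored in a variable, since basic slicing returns a view rather than a copy. A minimal sketch, using a hypothetical array `a_demo` so that `a` above is left untouched:

```python
import numpy as np

a_demo = np.array([1, 2, 3, 4, 5])

view = a_demo[1:3]           # basic slicing returns a view
view[0] = -2                 # writes through to a_demo

copied = a_demo[1:3].copy()  # an explicit copy owns its own data
copied[1] = 99               # a_demo is not affected

print(a_demo)                            # [ 1 -2  3  4  5]
print(np.shares_memory(a_demo, view))    # True
print(np.shares_memory(a_demo, copied))  # False
```

When an independent array is needed, calling `.copy()` on the slice is the usual way to get one.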

* We can omit any of the three parameters in `M[lower:upper:step]`:

In [23]:
a[::] # lower, upper, step all take the default values

array([ 1, -2, -3,  4,  5])

In [24]:
a[::2] # step is 2, lower and upper default to the beginning and end of the array

array([ 1, -3,  5])

In [25]:
a[:3] # first three elements

array([ 1, -2, -3])

In [26]:
a[3:] # elements from index 3

array([4, 5])

* Negative indices count from the end of the array (positive indices from the beginning):

In [27]:
a = np.array([1,2,3,4,5])

In [28]:
a[-1] # the last element in the array

5

In [29]:
a[-3:] # the last three elements

array([3, 4, 5])
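One way to read a negative index `k` is as shorthand for position `len(a) + k`. The sketch below checks that equivalence on a hypothetical array `a_demo` and also shows that `step` may be negative, in which case the slice walks backwards:

```python
import numpy as np

a_demo = np.array([1, 2, 3, 4, 5])
n = len(a_demo)

# a negative index k refers to position n + k
assert a_demo[-1] == a_demo[n - 1]
assert np.array_equal(a_demo[-3:], a_demo[n - 3:])

# a negative step walks the array backwards
reversed_a = a_demo[::-1]
print(reversed_a)      # [5 4 3 2 1]
print(a_demo[-2::-1])  # [4 3 2 1]
```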

* Index slicing works exactly the same way for multidimensional arrays:

In [30]:
A = np.array([[n+m*10 for n in range(5)] for m in range(5)])
A

array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])

In [26]:
# a block from the original array
A[1:4, 1:4]

array([[11, 12, 13],
       [21, 22, 23],
       [31, 32, 33]])

In [27]:
# strides
A[::2, ::2]

array([[ 0,  2,  4],
       [20, 22, 24],
       [40, 42, 44]])
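Slices of a multidimensional array can also appear on the left-hand side of an assignment. As a small sketch (with `A_demo` as a hypothetical copy of the array above), a scalar is broadcast over the whole selected block, while a sequence must match the shape of the selection:

```python
import numpy as np

A_demo = np.array([[n + m * 10 for n in range(5)] for m in range(5)])

A_demo[1:4, 1:4] = 0           # the scalar fills the whole 3x3 block
A_demo[0, ::2] = [-1, -2, -3]  # every other entry of row 0

print(A_demo)
```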

## Fancy indexing

Fancy indexing is the name for when an array or list is used in place of an index:

In [28]:
row_indices = [1, 2, 3]
A[row_indices]

array([[10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34]])

In [29]:
col_indices = [1, 2, -1] # remember, index -1 means the last element
A[row_indices, col_indices]

array([11, 22, 34])
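Note that passing two index lists selects elements pairwise, here `(1, 1)`, `(2, 2)` and `(3, -1)`, rather than the full block of those rows and columns. If the block is what we want, one option is `np.ix_`, sketched below on a hypothetical copy `A_demo` of the array above:

```python
import numpy as np

A_demo = np.array([[n + m * 10 for n in range(5)] for m in range(5)])
rows = [1, 2, 3]
cols = [1, 2, -1]

pairs = A_demo[rows, cols]          # one element per (row, col) pair
block = A_demo[np.ix_(rows, cols)]  # every listed row with every listed column

print(pairs)        # [11 22 34]
print(block.shape)  # (3, 3)
print(block)
```

Unlike basic slicing, fancy indexing returns a copy, so modifying `pairs` or `block` does not change `A_demo`.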

* We can also use index **masks**:

    - If the index mask is a NumPy array with data type `bool`, then an element is selected (`True`) or not (`False`) depending on the value of the index mask at the position of each element:

In [30]:
b = np.array([n for n in range(5)])
b

array([0, 1, 2, 3, 4])

In [31]:
row_mask = np.array([True, False, True, False, False])
b[row_mask]

array([0, 2])

* Alternatively:

In [32]:
# same thing
row_mask = np.array([1,0,1,0,0], dtype=bool)
b[row_mask]

array([0, 2])

This feature is very useful for conditionally selecting elements from an array, for example with comparison operators:

In [33]:
x = np.arange(0, 10, 0.5)
x

array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. ,
       6.5, 7. , 7.5, 8. , 8.5, 9. , 9.5])

In [34]:
mask = (5 < x) & (x < 7.5)

mask

array([False, False, False, False, False, False, False, False, False,
       False, False,  True,  True,  True,  True, False, False, False,
       False, False])

In [35]:
x[mask]

array([5.5, 6. , 6.5, 7. ])
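Boolean masks can be combined with the elementwise operators `&` (and), `|` (or) and `~` (not). The parentheses around each comparison are required because these operators bind more tightly than `<` and `>`. A short sketch, with `x_demo` as a hypothetical stand-in for `x`:

```python
import numpy as np

x_demo = np.arange(0, 10, 0.5)

inside = (x_demo > 5) & (x_demo < 7.5)  # elementwise AND
outside = ~inside                       # elementwise NOT
edges = (x_demo < 1) | (x_demo > 9)     # elementwise OR

print(x_demo[inside])       # [5.5 6.  6.5 7. ]
print(x_demo[edges])        # [0.  0.5 9.5]
print(np.where(inside)[0])  # [11 12 13 14]
```

`np.where` with a single argument returns the positions at which the mask is `True`, which is handy when the indices themselves are needed.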

# Iterating over array elements

Generally, we want to avoid iterating over the elements of arrays whenever we can. The reason is that in an interpreted language like Python (or MATLAB), iterations are really slow compared to vectorized operations.

However, sometimes iterations are unavoidable. For such cases, the Python `for` loop is the most convenient way to iterate over an array:

In [36]:
v = np.array([1,2,3,4])

for element in v:
    print(element)

1
2
3
4


In [37]:
M = np.array([[1,2], [3,4]])
print(M)
for row in M:
    print("row", row)
    
    for element in row:
        print(element)

[[1 2]
 [3 4]]
row [1 2]
1
2
row [3 4]
3
4


* When we need to iterate over each element of an array and modify its elements, it is convenient to use the `enumerate` function to obtain both the element and its index in the `for` loop: 

In [38]:
for row_idx, row in enumerate(M):
    print("row_idx", row_idx, "row", row)
    
    for col_idx, element in enumerate(row):
        print("col_idx", col_idx, "element", element)
       
        # update the matrix M: square each element
        M[row_idx, col_idx] = element ** 2

row_idx 0 row [1 2]
col_idx 0 element 1
col_idx 1 element 2
row_idx 1 row [3 4]
col_idx 0 element 3
col_idx 1 element 4


In [39]:
# each element in M is now squared
M

array([[ 1,  4],
       [ 9, 16]])
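For this particular update the nested loop is not needed: squaring is an elementwise operation, so the same result can be obtained in a single vectorized expression. A minimal sketch on a hypothetical array `M_demo`:

```python
import numpy as np

M_demo = np.array([[1, 2], [3, 4]])

squared = M_demo ** 2  # elementwise, returns a new array
M_demo **= 2           # in-place variant, modifies M_demo itself

print(squared)
```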

# Using arrays in conditions

When using arrays in conditions, for example in `if` statements and other boolean expressions, one needs to use `any` or `all`, which require that any or all elements in the array evaluate to `True`:

In [40]:
M

array([[ 1,  4],
       [ 9, 16]])

In [41]:
if (M > 5).any():
    print("at least one element in M is larger than 5")
else:
    print("no element in M is larger than 5")

at least one element in M is larger than 5


In [42]:
if (M > 5).all():
    print("all elements in M are larger than 5")
else:
    print("not all elements in M are larger than 5")

not all elements in M are not larger than 5
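The reason `any` or `all` is required is that an array with more than one element has no single truth value, so using it directly in an `if` statement raises a `ValueError`. The sketch below demonstrates this on a hypothetical array `M_demo` and also counts the matching elements with `np.count_nonzero`:

```python
import numpy as np

M_demo = np.array([[1, 4], [9, 16]])

try:
    if M_demo > 5:  # ambiguous: four truth values, not one
        pass
except ValueError as err:
    caught = True
    print("ValueError:", err)
else:
    caught = False

print((M_demo > 5).any())            # True
print((M_demo > 5).all())            # False
print(np.count_nonzero(M_demo > 5))  # 2
```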
