# 3. Indexing, slicing

Each element of an array can be located by its position in each dimension. Numpy offers multiple efficient ways to access single elements or groups of elements. We will illustrate these concepts both with small simple matrices and with a regular image.

In [1]:
import numpy as np

## 3.1 Accessing single values

We create a small 2D array to use as an example:




In [None]:
normal_array = np.random.normal(10, 2, (3,4))
normal_array

array([[12.99205086,  7.7157832 , 14.66021898,  8.21412356],
       [ 9.19391119,  7.92142871, 13.31222213,  8.19957688],
       [11.08009573,  8.54243953, 12.71096417, 10.09637761]])

It is very easy to access an array's values. One just passes an *index* for each dimension. For example, to recover the value in the last row and second column of the ```normal_array``` array we write (remember that counting starts at 0):

In [None]:
single_value = normal_array[2,1]
single_value

8.542439525354693

What is returned in that case is a single number that we can re-use:

In [None]:
single_value += 10
single_value

18.542439525354695

And that change doesn't affect the original value in the array:

In [None]:
normal_array

array([[12.99205086,  7.7157832 , 14.66021898,  8.21412356],
       [ 9.19391119,  7.92142871, 13.31222213,  8.19957688],
       [11.08009573,  8.54243953, 12.71096417, 10.09637761]])

However we can also directly change the value in an array:

In [None]:
normal_array[2,1] = 23

In [None]:
normal_array

array([[12.99205086,  7.7157832 , 14.66021898,  8.21412356],
       [ 9.19391119,  7.92142871, 13.31222213,  8.19957688],
       [11.08009573, 23.        , 12.71096417, 10.09637761]])

## 3.2 Accessing part of an array with indices: slicing

### 3.2.1 Selecting a range of elements

One can also select multiple elements in each dimension (e.g. multiple rows and columns in 2D) by using the ```start:end:step``` syntax. If omitted, the defaults are ```start=0```, ```end=length of the dimension``` (so that the last element is included) and ```step=1```. For example, to select the first **and** second rows of the first column, we can write:

In [None]:
normal_array[0:2,0]

array([12.99205086,  9.19391119])

Note that the ```end``` element is **not** included. One can use the same notation for all dimensions:

In [None]:
normal_array[0:2,2:4]

array([[14.66021898,  8.21412356],
       [13.31222213,  8.19957688]])

In [None]:
normal_array[1:,2:4]

array([[13.31222213,  8.19957688],
       [12.71096417, 10.09637761]])
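
The cells above all rely on the default ```step=1```. As a minimal sketch of the ```step``` part of the syntax, the following uses a small deterministic array (the name ```demo``` is introduced here for illustration only and is not part of the notebook) so that the result can be checked by eye:

```python
import numpy as np

# Deterministic 3x4 array:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
demo = np.arange(12).reshape(3, 4)

# Every second row (rows 0 and 2) and every second column (columns 0 and 2)
every_second = demo[::2, ::2]
print(every_second)
# [[ 0  2]
#  [ 8 10]]
```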

### 3.2.2 Selecting all elements
If we only specify ```:```, it means we want to recover all elements in that dimension:

In [None]:
normal_array[:,2:4]

array([[14.66021898,  8.21412356],
       [13.31222213,  8.19957688],
       [12.71096417, 10.09637761]])

Also, in general, if you only specify an index for a single axis, it is applied to the first dimension and all elements of the remaining dimensions are returned. Here the index ```1``` selects the whole second row:

In [None]:
normal_array

array([[12.99205086,  7.7157832 , 14.66021898,  8.21412356],
       [ 9.19391119,  7.92142871, 13.31222213,  8.19957688],
       [11.08009573, 23.        , 12.71096417, 10.09637761]])

In [None]:
normal_array[1]

array([ 9.19391119,  7.92142871, 13.31222213,  8.19957688])
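
A short sketch of this shorthand, again on a hypothetical deterministic array ```demo``` rather than on ```normal_array```: a single index gives the same result as writing out ```:``` for the remaining axis.

```python
import numpy as np

demo = np.arange(12).reshape(3, 4)

# A single index applies to the first axis (rows) ...
row_short = demo[1]
# ... and is equivalent to spelling out the remaining axis with ':'
row_explicit = demo[1, :]

print(row_short)                                # [4 5 6 7]
print(np.array_equal(row_short, row_explicit))  # True
```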

Finally, note that if you want to recover only one element along a dimension (a single row, column etc.), you can do that in two ways. The first is to pass a plain index for that dimension:

In [None]:
normal_array[0,:]

array([12.99205086,  7.7157832 , 14.66021898,  8.21412356])

This returns a one-dimensional array containing a single row from the original array:

In [None]:
normal_array[0,:].shape

(4,)

If instead you specify a range whose boundaries select only a single row, the result keeps both dimensions:

In [None]:
normal_array[0:1,:]

array([[12.99205086,  7.7157832 , 14.66021898,  8.21412356]])

In [None]:
normal_array[0:1,:].shape

(1, 4)
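
The same distinction holds for columns. The sketch below (using an illustrative array ```demo``` that is not part of the notebook) shows that an integer index drops the axis, while a slice of length one keeps it:

```python
import numpy as np

demo = np.arange(12).reshape(3, 4)

col_1d = demo[:, 0]     # integer index: the column axis is dropped
col_2d = demo[:, 0:1]   # slice of length one: the column axis is kept

print(col_1d.shape)  # (3,)
print(col_2d.shape)  # (3, 1)
```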

## 3.3 Sub-arrays are not copies!

As is often the case in Python, when you create a new variable from a sub-array, that variable **is not independent** from the original variable:

In [None]:
sub_array = normal_array[:,2:4]

In [None]:
sub_array

array([[14.66021898,  8.21412356],
       [13.31222213,  8.19957688],
       [12.71096417, 10.09637761]])

In [None]:
normal_array

array([[12.99205086,  7.7157832 , 14.66021898,  8.21412356],
       [ 9.19391119,  7.92142871, 13.31222213,  8.19957688],
       [11.08009573, 23.        , 12.71096417, 10.09637761]])

If for example we modify ```normal_array```, this is going to be reflected in ```sub_array``` too:

In [None]:
normal_array[0,2] = 100

In [None]:
normal_array

array([[ 12.99205086,   7.7157832 , 100.        ,   8.21412356],
       [  9.19391119,   7.92142871,  13.31222213,   8.19957688],
       [ 11.08009573,  23.        ,  12.71096417,  10.09637761]])

In [None]:
sub_array

array([[100.        ,   8.21412356],
       [ 13.31222213,   8.19957688],
       [ 12.71096417,  10.09637761]])

The converse is also true:

In [None]:
sub_array[0,1] = 50

In [None]:
sub_array

array([[100.        ,  50.        ],
       [ 13.31222213,   8.19957688],
       [ 12.71096417,  10.09637761]])

In [None]:
normal_array

array([[ 12.99205086,   7.7157832 , 100.        ,  50.        ],
       [  9.19391119,   7.92142871,  13.31222213,   8.19957688],
       [ 11.08009573,  23.        ,  12.71096417,  10.09637761]])

If you want your sub-array to be an *independent* copy of the original, you have to use the ```.copy()``` method:

In [None]:
sub_array_copy = normal_array[1:3,:].copy()

In [None]:
sub_array_copy

array([[ 9.19391119,  7.92142871, 13.31222213,  8.19957688],
       [11.08009573, 23.        , 12.71096417, 10.09637761]])

In [None]:
sub_array_copy[0,0] = 500

In [None]:
sub_array_copy

array([[500.        ,   7.92142871,  13.31222213,   8.19957688],
       [ 11.08009573,  23.        ,  12.71096417,  10.09637761]])

In [None]:
normal_array

array([[ 12.99205086,   7.7157832 , 100.        ,  50.        ],
       [  9.19391119,   7.92142871,  13.31222213,   8.19957688],
       [ 11.08009573,  23.        ,  12.71096417,  10.09637761]])
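
Rather than modifying values to find out whether two arrays are linked, one can also ask Numpy directly. The following sketch uses ```np.shares_memory``` on a hypothetical array ```demo``` (the variable names are chosen for illustration):

```python
import numpy as np

demo = np.arange(12).reshape(3, 4)

view = demo[:, 2:4]          # slicing returns a view on the same data
copy = demo[:, 2:4].copy()   # .copy() allocates independent data

print(np.shares_memory(demo, view))  # True
print(np.shares_memory(demo, copy))  # False
```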

## 3.4 Accessing parts of an array with coordinates

With slicing, we are limited to selecting rectangular sub-regions of the array. But sometimes we want to recover a series of specific elements, for example the elements (row=0, column=3) and (row=2, column=2). To achieve that, we can simply index the array with one list containing the row indices and another containing the column indices:

In [None]:
row_indices = [0,2]
col_indices = [3,2]

normal_array[row_indices, col_indices]

array([50.        , 12.71096417])

In [None]:
normal_array

array([[ 12.99205086,   7.7157832 , 100.        ,  50.        ],
       [  9.19391119,   7.92142871,  13.31222213,   8.19957688],
       [ 11.08009573,  23.        ,  12.71096417,  10.09637761]])

In [None]:
selected_elements = normal_array[row_indices, col_indices]

In [None]:
selected_elements

array([50.        , 12.71096417])
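
Two points are worth making explicit here: the two lists are paired element by element, and, unlike slicing, this kind of indexing returns a copy rather than a view. A minimal sketch on an illustrative array ```demo``` (not part of the notebook):

```python
import numpy as np

demo = np.arange(12).reshape(3, 4)

rows = [0, 2]
cols = [3, 2]

# Indices are paired element-wise: (0, 3) and (2, 2)
picked = demo[rows, cols]
print(picked)  # [ 3 10]

# Modifying the result leaves the original array untouched
picked[0] = -1
print(demo[0, 3])  # 3
```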

## 3.5 Logical indexing

The last way of extracting elements from an array is to use a boolean array of the same shape. For example, let's create a boolean array by comparing our original matrix to a threshold:

In [None]:
bool_array = normal_array > 40
bool_array

array([[False, False,  True,  True],
       [False, False, False, False],
       [False, False, False, False]])

We see that only two elements are above the threshold. Now we can use this logical array to *index* the original array. Imagine that the logical array is a mask with holes only at the ```True``` positions and that we lay it over the original array. Then we just take all the values visible through the holes:

In [None]:
normal_array[bool_array]

array([100.,  50.])
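
A boolean mask can also appear on the left-hand side of an assignment, which modifies only the selected elements. The sketch below assumes a small illustrative array ```demo``` and an arbitrary threshold of 8:

```python
import numpy as np

demo = np.arange(12).reshape(3, 4)

mask = demo > 8          # True only for the values 9, 10 and 11
print(demo[mask])        # [ 9 10 11]

# Work on a copy so that demo itself stays unchanged
clipped = demo.copy()
clipped[mask] = 8        # overwrite only the masked elements
print(clipped.max())     # 8
```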

## 3.6 Reshaping arrays

Often it is necessary to reshape arrays, i.e. to keep the elements unchanged but arrange them along different dimensions. There are multiple functions that allow one to do this. The main one is of course ```reshape```.

### 3.6.1 ```reshape```

Given an array of $M \times N$ elements, one can reshape it to a shape $O \times P$ as long as $M \cdot N = O \cdot P$.

In [None]:
reshaped = np.reshape(normal_array,(2,6))
reshaped

array([[ 12.99205086,   7.7157832 , 100.        ,  50.        ,
          9.19391119,   7.92142871],
       [ 13.31222213,   8.19957688,  11.08009573,  23.        ,
         12.71096417,  10.09637761]])

In [None]:
reshaped.shape

(2, 6)
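
To illustrate the constraint on the number of elements, the following sketch (on a hypothetical 12-element array ```demo```) shows one shape that works and one that raises a ```ValueError``` because the element counts differ:

```python
import numpy as np

demo = np.arange(12).reshape(3, 4)   # 3 * 4 = 12 elements

ok = np.reshape(demo, (2, 6))        # 2 * 6 = 12: allowed
print(ok.shape)                      # (2, 6)

try:
    np.reshape(demo, (5, 2))         # 5 * 2 = 10: element count differs
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)                # True
```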

### 3.6.2 Flattening

It's also possible to simply flatten an array, i.e. collapse all its dimensions into a single one to create a 1D array. This can be useful for example to create a histogram of a high-dimensional array.

In [None]:
flattened = np.ravel(normal_array)
flattened

array([ 12.99205086,   7.7157832 , 100.        ,  50.        ,
         9.19391119,   7.92142871,  13.31222213,   8.19957688,
        11.08009573,  23.        ,  12.71096417,  10.09637761])

In [None]:
flattened.shape

(12,)
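
With its default settings, ```np.ravel``` reads the array row by row. A small sketch of where a given element ends up, using an illustrative array ```demo``` and helper names (```i```, ```j```, ```n_cols```) that are not part of the notebook:

```python
import numpy as np

# [[0 1 2]
#  [3 4 5]]
demo = np.arange(6).reshape(2, 3)

flat = np.ravel(demo)
print(flat)  # [0 1 2 3 4 5]

# Element (i, j) of the 2D array ends up at position i * n_cols + j
i, j = 1, 2
n_cols = demo.shape[1]
print(flat[i * n_cols + j] == demo[i, j])  # True
```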

### 3.6.3 Dimension collapse

Another common operation that changes the shape of an array is projection. Let's consider again our ```normal_array```:

In [None]:
normal_array

array([[ 12.99205086,   7.7157832 , 100.        ,  50.        ],
       [  9.19391119,   7.92142871,  13.31222213,   8.19957688],
       [ 11.08009573,  23.        ,  12.71096417,  10.09637761]])

We can project all values along the first or second axis to recover the largest value of each column or row. Projecting along the first axis (```axis=0```) collapses the rows and gives one value per column:

In [None]:
proj0 = np.max(normal_array, axis = 0)
proj0

array([ 12.99205086,  23.        , 100.        ,  50.        ])

In [None]:
proj0.shape

(4,)
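
The projection along the second axis works the same way. A minimal sketch comparing both directions on an illustrative array ```demo```:

```python
import numpy as np

demo = np.arange(12).reshape(3, 4)

max_per_column = np.max(demo, axis=0)  # collapse rows: one value per column
max_per_row = np.max(demo, axis=1)     # collapse columns: one value per row

print(max_per_column)  # [ 8  9 10 11]
print(max_per_row)     # [ 3  7 11]
```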

### 3.6.4 Swapping dimensions

We can also simply exchange the position of dimensions. This can be achieved in different ways. For example we can roll an axis with ```np.rollaxis```, i.e. move it to a new position while shifting the others. This preserves the relative order of the other axes:

In [None]:
array3D = np.ones((4, 10, 20))
array3D.shape

(4, 10, 20)

In [None]:
array_rolled = np.rollaxis(array3D, axis=1, start=0)
array_rolled.shape

(10, 4, 20)

Alternatively, you can swap two axes. This doesn't preserve their relative positions:

In [None]:
array_swapped = np.swapaxes(array3D, 0,2)
array_swapped.shape

(20, 10, 4)
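
Two other Numpy functions can be used for the same purpose. The sketch below is an alternative to the cells above, not the approach the notebook itself uses: ```np.moveaxis``` moves one axis to a new position, and ```np.transpose``` takes the complete new order of the axes.

```python
import numpy as np

array3D = np.ones((4, 10, 20))

# Move axis 1 to position 0; the other axes keep their relative order
moved = np.moveaxis(array3D, 1, 0)
print(moved.shape)       # (10, 4, 20)

# np.transpose takes the full new order of the axes
transposed = np.transpose(array3D, (2, 0, 1))
print(transposed.shape)  # (20, 4, 10)
```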

### 3.6.5 Change positions

Finally, we can also change the position of elements without changing the shape of the array. For example if we have an array with two columns, we can swap them:

In [None]:
array2D = np.random.normal(0,1,(4,2))
array2D

array([[ 1.69380702,  0.45317243],
       [ 0.97985485, -1.10186616],
       [ 2.16001609,  0.29160533],
       [-0.29204481, -0.80523649]])

In [None]:
np.fliplr(array2D)

array([[ 0.45317243,  1.69380702],
       [-1.10186616,  0.97985485],
       [ 0.29160533,  2.16001609],
       [-0.80523649, -0.29204481]])

Similarly, if we have two rows:

In [None]:
array2D = np.random.normal(0,1,(2,4))
array2D

array([[-0.00285876,  0.76241924,  1.18546015, -0.13881594],
       [-1.42554951,  0.36561497,  0.73252833, -1.43307846]])

In [None]:
np.flipud(array2D)

array([[-1.42554951,  0.36561497,  0.73252833, -1.43307846],
       [-0.00285876,  0.76241924,  1.18546015, -0.13881594]])
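
Both flips can also be written with the slicing syntax from section 3.2 by using a negative ```step```. A minimal sketch on an illustrative array ```demo```, checking that the two notations give the same result:

```python
import numpy as np

demo = np.arange(8).reshape(2, 4)

# Reversing the column axis gives the same result as np.fliplr
same_lr = np.array_equal(demo[:, ::-1], np.fliplr(demo))
# Reversing the row axis gives the same result as np.flipud
same_ud = np.array_equal(demo[::-1, :], np.flipud(demo))

print(same_lr)  # True
print(same_ud)  # True
```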