# Array slicing

## One-dimensional array slicing

To access a slice of an array x, use x[start:stop:step]. The slice begins at index start and ends just before index stop, so the element at stop is not included. By default, step is 1.

In [22]:
import numpy as np
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [23]:
# select the first five elements; indexing starts at 0
b = a[0:5] 
b

array([0, 1, 2, 3, 4])

In [24]:
# the leading 0 can be omitted
b = a[:5]
b

array([0, 1, 2, 3, 4])

In [25]:
# select all elements from index 5 onward
c = a[5:]
c

array([5, 6, 7, 8, 9])

In [26]:
# select the middle elements: start at index 3 and stop before index 6 (the stop index is excluded)
d = a[3:6]
d

array([3, 4, 5])

In [27]:
# select every other element using step = 2
e = a[::2]
e

array([0, 2, 4, 6, 8])

In [28]:
# reverse the array
f = a[::-1]
f

array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

In [31]:
# walk backwards from index 8, taking every other element
g = a[8::-2]
g

array([8, 6, 4, 2, 0])
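
As a quick check of the x[start:stop:step] rule, the sketch below collects a few slices whose results follow directly from it. The variable names (`middle`, `whole`, `tail`, `backwards`) are chosen for illustration only, and the negative start index is an extra case not covered in the cells above.

```python
import numpy as np

a = np.arange(10)

# stop is exclusive, so a[3:6] holds 6 - 3 = 3 elements
middle = a[3:6]

# an omitted start or stop defaults to the corresponding end of the array
whole = a[:]

# a negative index counts from the end: this keeps the last three elements
tail = a[-3:]

# with a negative step, the slice walks backwards from start
backwards = a[8::-2]

print(middle, tail, backwards)
```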

## Two-dimensional array slicing

Let's first use a function called 'reshape' to convert a one-dimensional array to a two-dimensional one. This is very useful when you are dealing with two-dimensional data, e.g., a velocity field in a two-dimensional space.

In [38]:
# Let's convert the one-dimensional array, a, into a 2*5 array, b.
b = np.reshape(a,(2,5))
b

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [39]:
# Try a 5*2 array
c = np.reshape(a,(5,2))
c

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])
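
Both reshapes above work because 2*5 and 5*2 equal the number of elements in a. The following is a minimal sketch of what happens otherwise, and of using -1 to let NumPy infer one dimension. The flag `reshape_failed` is a hypothetical name used only for this illustration.

```python
import numpy as np

a = np.arange(10)

# -1 asks NumPy to infer that dimension from the total size (10 / 2 = 5)
b = np.reshape(a, (2, -1))

# a shape whose product differs from a.size is rejected with a ValueError
try:
    np.reshape(a, (3, 4))
    reshape_failed = False
except ValueError:
    reshape_failed = True

print(b.shape, reshape_failed)
```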

In [100]:
# Let's work on a 4 by 4 array
a = np.reshape(np.arange(16),(4,4))
a

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [43]:
# Slice first two rows and first three columns
b = a[:2,:3]
b

array([[0, 1, 2],
       [4, 5, 6]])

In [45]:
# Down-sampling: slice every other element in both row and column
c = a[::2,::2]
c

array([[ 0,  2],
       [ 8, 10]])

In [46]:
# Down-sampling on rows only, keep the same columns
d = a[::2,]
d

array([[ 0,  1,  2,  3],
       [ 8,  9, 10, 11]])

In [47]:
# reverse rows and columns
e = a[::-1, ::-1]
e

array([[15, 14, 13, 12],
       [11, 10,  9,  8],
       [ 7,  6,  5,  4],
       [ 3,  2,  1,  0]])

In [49]:
# reverse columns only, i.e., flip left and right
f = a[:,::-1]
f

array([[ 3,  2,  1,  0],
       [ 7,  6,  5,  4],
       [11, 10,  9,  8],
       [15, 14, 13, 12]])

In [52]:
# flip upside down
g = a[::-1,]
g

array([[12, 13, 14, 15],
       [ 8,  9, 10, 11],
       [ 4,  5,  6,  7],
       [ 0,  1,  2,  3]])
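
NumPy also provides `np.flipud` and `np.fliplr` for these two flips. The sketch below, using the same 4 by 4 array, suggests that they match the slicing forms shown above and that the trailing comma in `a[::-1,]` simply selects every column. The `same_*` names are illustrative.

```python
import numpy as np

a = np.reshape(np.arange(16), (4, 4))

# np.flipud reverses the row order, like a[::-1, :]
same_ud = np.array_equal(np.flipud(a), a[::-1, :])

# np.fliplr reverses the column order, like a[:, ::-1]
same_lr = np.array_equal(np.fliplr(a), a[:, ::-1])

# a trailing comma with nothing after it keeps all columns
same_short = np.array_equal(a[::-1,], a[::-1, :])

print(same_ud, same_lr, same_short)
```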

In [53]:
# You can also reshape a 2-D array to a different 2-D array
g2 = np.reshape(g,(2,8))
g2

array([[12, 13, 14, 15,  8,  9, 10, 11],
       [ 4,  5,  6,  7,  0,  1,  2,  3]])

## Dealing with common data issues

Let's generate a 2D array and throw some zeros in there. Imagine these zeros are invalid data and represent missing values. We want to find them and replace them with NaN (not-a-number).

In [108]:
# Let's manually change some elements in a
a[1:3,1:3] = 0
a

array([[ 0,  1,  2,  3],
       [ 4,  0,  0,  7],
       [ 8,  0,  0, 11],
       [12, 13, 14, 15]])

In the example above, there are five zeros: a[0,0], a[1,1], a[1,2], a[2,1], and a[2,2].

In [105]:
# To change them to NaN, you can simply do this:
a = a.astype('float') # NaN is a floating-point value, so first convert the integer array to a float type
a[a==0] = np.nan
a

array([[nan,  1.,  2.,  3.],
       [ 4., nan, nan,  7.],
       [ 8., nan, nan, 11.],
       [12., 13., 14., 15.]])
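
The conversion to float matters because an integer array cannot store NaN. A minimal sketch of what happens without it, using a small hypothetical array `ints` rather than the array above:

```python
import numpy as np

ints = np.arange(4)           # integer dtype

try:
    ints[0] = np.nan          # NaN has no integer representation
    assignment_failed = False
except ValueError:
    assignment_failed = True

floats = ints.astype(float)   # astype returns a converted copy
floats[floats == 0] = np.nan  # now the assignment is allowed

print(assignment_failed, floats)
```

Since `astype` returns a new array, `ints` itself keeps its original values.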

In [106]:
# Now, let's change nan to the mean value of all elements that have values
a[np.isnan(a)] = a[~np.isnan(a)].mean()
a

array([[ 8.18181818,  1.        ,  2.        ,  3.        ],
       [ 4.        ,  8.18181818,  8.18181818,  7.        ],
       [ 8.        ,  8.18181818,  8.18181818, 11.        ],
       [12.        , 13.        , 14.        , 15.        ]])
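
The cell above uses `np.isnan` rather than a comparison with `np.nan` because NaN never compares equal to anything, including itself. The sketch below illustrates this on a small made-up array `x`.

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])

# comparing against np.nan is False everywhere, even at the NaN entry
eq_mask = (x == np.nan)

# np.isnan is the reliable test
nan_mask = np.isnan(x)

# fill the gap with the mean of the valid entries: (1 + 3) / 2 = 2
x[nan_mask] = x[~nan_mask].mean()

print(eq_mask, nan_mask, x)
```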

### Use 'np.where'

We can also use 'np.where' to find out which elements are zero. Called with only a condition, it returns two arrays: the row indices and the column indices of the matching elements.

In [114]:
a = np.random.randn(4,4)
a

array([[-0.42814115,  0.83712754, -0.14238948, -2.63254217],
       [ 0.93312466,  2.28220068,  2.05696813,  0.63603726],
       [ 1.20064713,  1.39666133,  0.4557124 ,  0.90969731],
       [ 1.48170748, -0.0399394 ,  0.70740673,  0.80910848]])

In [115]:
a[a<0] = 0
a

array([[0.        , 0.83712754, 0.        , 0.        ],
       [0.93312466, 2.28220068, 2.05696813, 0.63603726],
       [1.20064713, 1.39666133, 0.4557124 , 0.90969731],
       [1.48170748, 0.        , 0.70740673, 0.80910848]])

In [116]:
np.where(a==0)

(array([0, 0, 0, 3]), array([0, 2, 3, 1]))
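
The two index arrays are read in pairs: the first zero is at row 0, column 0, the second at row 0, column 2, and so on. One possible way to turn them into explicit coordinates is sketched below on a small fixed array (not the random one above, so the result is reproducible). `np.argwhere` is shown as an alternative that returns the pairs directly.

```python
import numpy as np

# small fixed array used only for this illustration
a = np.array([[0.0, 0.8, 0.0],
              [0.9, 2.3, 0.0]])

rows, cols = np.where(a == 0)

# pair the two index arrays into (row, column) coordinates
coords = [(int(r), int(c)) for r, c in zip(rows, cols)]

# np.argwhere returns the same coordinates as one 2-D array
pairs = np.argwhere(a == 0)

print(coords)
```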

In [119]:
b = np.where(a==0, np.nan, a) # where a==0, use np.nan; elsewhere, keep the value from a. This returns a new array and leaves a unchanged
b

array([[       nan, 0.83712754,        nan,        nan],
       [0.93312466, 2.28220068, 2.05696813, 0.63603726],
       [1.20064713, 1.39666133, 0.4557124 , 0.90969731],
       [1.48170748,        nan, 0.70740673, 0.80910848]])

In [120]:
a

array([[0.        , 0.83712754, 0.        , 0.        ],
       [0.93312466, 2.28220068, 2.05696813, 0.63603726],
       [1.20064713, 1.39666133, 0.4557124 , 0.90969731],
       [1.48170748, 0.        , 0.70740673, 0.80910848]])

## Working on non-zero elements

In [121]:
a[np.nonzero(a)].mean()

1.1421999269746124

In [122]:
a[a!=0].mean()

1.1421999269746124

## Working on non-NaN elements

In [137]:
np.isnan(b)

array([[ True, False,  True,  True],
       [False, False, False, False],
       [False, False, False, False],
       [False,  True, False, False]])

In [123]:
np.nanmean(b)

1.1421999269746124

In [128]:
np.nanmedian(b)

0.9214109824071337

In [136]:
# Take the mean of each column (axis=0 averages down the rows)
np.nanmean(b,axis=0)

array([1.20515976, 1.50532985, 1.07336242, 0.78494768])

In [135]:
# Take the mean of each row (axis=1 averages across the columns)
np.nanmean(b,axis=1)

array([0.83712754, 1.47708268, 0.99067954, 0.99940756])
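
The axis argument is easier to follow on a tiny array whose means can be computed by hand. The sketch below uses a made-up 2 by 2 array and also contrasts `np.nanmean` with the plain `np.mean`, which does not skip NaN.

```python
import numpy as np

b = np.array([[1.0, np.nan],
              [3.0, 4.0]])

# axis=0 collapses the rows: one mean per column
col_means = np.nanmean(b, axis=0)   # (1 + 3) / 2 and 4 / 1

# axis=1 collapses the columns: one mean per row
row_means = np.nanmean(b, axis=1)   # 1 / 1 and (3 + 4) / 2

# the plain mean propagates NaN instead of ignoring it
plain = np.mean(b)

print(col_means, row_means, plain)
```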