# NumPy Reading


### Slicing ndarrays

In addition to being able to access individual elements one at a time, NumPy provides a way to access subsets of ndarrays. This is known as **slicing**. Slicing is performed by combining indices with the colon symbol inside the square brackets `[ : ]`. In general you will come across three types of slicing:

1. `ndarray[start:end]`
2. `ndarray[start:]`
3. `ndarray[:end]`

The first method is used to select elements between the start and end indices. The second method is used to select all elements from the start index till the last index. The third method is used to select all elements from the first index till the end index. We should note that in methods one and three, the end index is excluded. We should also note that since ndarrays can be multidimensional, when doing slicing you usually have to specify a slice for each dimension of the array.

Let's now see some examples of how to use the above methods to select different subsets of a rank 2 ndarray.

In [88]:
import numpy as np

# create a 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)

# print X
print()
print('X = \n', X)

# select all the elements that are in the 2nd through 4th rows and in the 3rd to 5th columns
Z = X[1:4,2:5]

# print Z
print('\nZ = \n', Z)

# can select the same elements as above using method 2
W = X[1:,2:5]

# print W
print('\nW = \n', W)

# select all the elements that are in the 1st through 3rd rows and in the 3rd to 5th columns
Y = X[:3,2:5]

# print Y
print('\nY = \n', Y)

# select all the elements in the 3rd row
v = X[2,:]

# print v
print('\nv = ', v)

# select all the elements in the 3rd column
q = X[:,2]

# print q
print('\nq = ', q)

# select all the elements in the 3rd column but return a rank 2 ndarray
R = X[:,2:3]

# print R
print('\nR = \n', R)


X = 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]

Z = 
 [[ 7  8  9]
 [12 13 14]
 [17 18 19]]

W = 
 [[ 7  8  9]
 [12 13 14]
 [17 18 19]]

Y = 
 [[ 2  3  4]
 [ 7  8  9]
 [12 13 14]]

v =  [10 11 12 13 14]

q =  [ 2  7 12 17]

R = 
 [[ 2]
 [ 7]
 [12]
 [17]]


Notice that when we selected all the elements in the 3rd column, variable q above, the slice returned a rank 1 ndarray instead of a rank 2 ndarray. However, slicing X in a slightly different way, variable R above, we can actually get a rank 2 ndarray instead.
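
The difference between the two selections can be checked directly from the `shape` and `ndim` attributes. The snippet below is a minimal sketch that rebuilds the same `X` as above; indexing with a single integer drops that dimension, while a one-element slice keeps it.

```python
import numpy as np

X = np.arange(20).reshape(4, 5)

q = X[:, 2]     # integer index on the column axis drops that axis
R = X[:, 2:3]   # one-element slice keeps the column axis

print(q.shape, q.ndim)   # (4,) 1
print(R.shape, R.ndim)   # (4, 1) 2
```

Both variables hold the same four values; only the rank differs.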

It is important to note that when we perform slices on ndarrays and save them into new variables, as we did above, the data is not copied into the new variable. This is one feature that often causes confusion for beginners. Therefore, we will look at this in a bit more detail.

In the above examples, when we make assignments, such as:

`Z = X[1:4,2:5]`

the slice of the original array X is not copied into the variable Z. Rather, Z is a new ndarray object that refers to the same underlying data as the selected region of X. We say that slicing only creates a view of the original array. This means that if you make changes in Z you will in effect be changing the corresponding elements in X as well. Let's see this with an example:

In [90]:
# create a 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)

# print X
print('X = \n', X)

# select all the elements that are in the 2nd through 4th rows and in the 3rd to 5th columns
Z = X[1:4,2:5]

# print Z
print('\nZ = \n', Z)

# change the last element in Z to 555
Z[2,2] = 555

# We print X
print('\nX = \n', X)

X = 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]

Z = 
 [[ 7  8  9]
 [12 13 14]
 [17 18 19]]

X = 
 [[  0   1   2   3   4]
 [  5   6   7   8   9]
 [ 10  11  12  13  14]
 [ 15  16  17  18 555]]


In the above example, it is evident that when we make changes to Z, X changes as well.

However, if we want to create a new ndarray that contains a copy of the values in the slice we need to use the `np.copy()` function. The `np.copy(ndarray)` function creates a copy of the given ndarray. This function can also be used as a method, in the same way as we did before with the reshape function. Let's do the same example we did before but now with copies of the arrays. We'll use copy both as a function and as a method.

In [93]:
# create a 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)

# print X
print('X = \n', X)

# create a copy of the slice using the np.copy() function
Z = np.copy(X[1:4,2:5])

#  create a copy of the slice using the copy as a method
W = X[1:4,2:5].copy()

# change the last element in Z to 555
Z[2,2] = 555

# change the last element in W to 888
W[2,2] = 888

# print X
print('\nX = \n', X)

# print Z
print('\nZ = \n', Z)

# print W
print('\nW = \n', W)

X = 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]

X = 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]

Z = 
 [[  7   8   9]
 [ 12  13  14]
 [ 17  18 555]]

W = 
 [[  7   8   9]
 [ 12  13  14]
 [ 17  18 888]]


We can clearly see that by using the `copy()` function, we are creating new ndarrays that are completely independent of each other.
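
One way to confirm whether two ndarrays refer to the same data is `np.shares_memory()`. The following is a small sketch, with the variable names `view` and `copied` chosen only for illustration, that contrasts a plain slice with an explicit copy.

```python
import numpy as np

X = np.arange(20).reshape(4, 5)

view = X[1:4, 2:5]            # basic slice: a view onto X's data
copied = X[1:4, 2:5].copy()   # explicit copy: its own data

print(np.shares_memory(X, view))    # True
print(np.shares_memory(X, copied))  # False

view[0, 0] = -1      # writes through to X[1, 2]
copied[0, 1] = -2    # leaves X untouched
print(X[1, 2], X[1, 3])  # -1 8
```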

It is often useful to use one ndarray to make slices, select, or change elements in another ndarray. Let's see some examples:

In [95]:
# create a 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)

# create a rank 1 ndarray that will serve as indices to select elements from X
indices = np.array([1,3])

# print X
print('X = \n', X)

# print indices
print('\nindices = ', indices)

# use the indices ndarray to select the 2nd and 4th row of X
Y = X[indices,:]

# use the indices ndarray to select the 2nd and 4th column of X
Z = X[:, indices]

# print Y
print('\nY = \n', Y)

# print Z
print('\nZ = \n', Z)

X = 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]

indices =  [1 3]

Y = 
 [[ 5  6  7  8  9]
 [15 16 17 18 19]]

Z = 
 [[ 1  3]
 [ 6  8]
 [11 13]
 [16 18]]
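
Unlike a basic slice, selecting with an ndarray of indices returns a copy rather than a view, while assigning through the same indices does change the original. The sketch below illustrates both behaviours using the same `X` and `indices` as above.

```python
import numpy as np

X = np.arange(20).reshape(4, 5)
indices = np.array([1, 3])

# Selecting with an index array produces a copy, unlike a basic slice
Y = X[indices, :]
Y[0, 0] = -1
print(X[1, 0])   # 5, X is unchanged

# Assigning through the index array does modify X in place
X[indices, :] = 0
print(X)         # the 2nd and 4th rows are now all zeros
```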


NumPy also offers built-in functions to select specific elements within ndarrays. For example, the `np.diag(ndarray, k=N)` function extracts the elements along the diagonal defined by N. By default, k=0, which refers to the main diagonal. Values of k > 0 are used to select elements in diagonals above the main diagonal, and values of k < 0 are used to select elements in diagonals below the main diagonal. Let's see an example:

In [99]:
# create a 5 x 5 ndarray that contains integers from 0 to 24
X = np.arange(25).reshape(5, 5)

# print X
print()
print('X = \n', X)
print()

# print the elements in the main diagonal of X
print('z =', np.diag(X))

# print the elements in the first diagonal above the main diagonal of X
print('y =', np.diag(X, k=1))

# print the elements in the first diagonal below the main diagonal of X
print('w = ', np.diag(X, k=-1))


X = 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]

z = [ 0  6 12 18 24]
y = [ 1  7 13 19]
w =  [ 5 11 17 23]


It is often useful to extract only the unique elements in an ndarray. We can find the unique elements in an ndarray by using the `np.unique()` function. The `np.unique(ndarray)` function returns the unique elements in the given ndarray, as in the example below:

In [98]:
# Create 3 x 3 ndarray with repeated values
X = np.array([[1,2,3],[5,2,8],[1,2,3]])

# print X
print('X = \n', X)

# print the unique elements of X 
print('\nThe unique elements in X are:',np.unique(X))

X = 
 [[1 2 3]
 [5 2 8]
 [1 2 3]]

The unique elements in X are: [1 2 3 5 8]
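
`np.unique()` returns its result in sorted order, and it also accepts a `return_counts` keyword that reports how many times each unique value occurs. The snippet below is a brief sketch on the same `X`; the names `values` and `counts` are illustrative.

```python
import numpy as np

X = np.array([[1, 2, 3], [5, 2, 8], [1, 2, 3]])

# values: sorted unique elements; counts: occurrences of each one
values, counts = np.unique(X, return_counts=True)
print(values)  # [1 2 3 5 8]
print(counts)  # [2 3 2 1 1]
```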


### Boolean Indexing

Up to now we have seen how to make slices and select elements of an ndarray using indices. This is useful when we know the exact indices of the elements we want to select. However, there are many situations in which we don't know the indices of the elements we want to select. For example, suppose we have a 10,000 x 10,000 ndarray of random integers ranging from 1 to 15,000 and we only want to select those integers that are less than 20. Boolean indexing can help us in these cases, by allowing us to select elements using logical arguments instead of explicit indices. Let's see some examples:

In [102]:
# create a 5 x 5 ndarray that contains integers from 0 to 24
X = np.arange(25).reshape(5, 5)

# print X
print('Original X = \n', X)

# use Boolean indexing to select elements in X:
print('\nThe elements in X that are greater than 10:', X[X > 10])
print('The elements in X that are less than or equal to 7:', X[X <= 7])
print('The elements in X that are between 10 and 17:', X[(X > 10) & (X < 17)])

# use Boolean indexing to assign the elements that are between 10 and 17 the value of -1
X[(X > 10) & (X < 17)] = -1

# print X
print('\nX = \n', X)

Original X = 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]

The elements in X that are greater than 10: [11 12 13 14 15 16 17 18 19 20 21 22 23 24]
The elements in X that are less than or equal to 7: [0 1 2 3 4 5 6 7]
The elements in X that are between 10 and 17: [11 12 13 14 15 16]

X = 
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 -1 -1 -1 -1]
 [-1 -1 17 18 19]
 [20 21 22 23 24]]
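
The large-array scenario mentioned at the start of this section can be sketched on a smaller scale. The example below assumes a 100 x 100 ndarray instead of 10,000 x 10,000, and an arbitrarily chosen random seed; it also shows that the condition itself evaluates to a Boolean ndarray, which is what actually performs the selection.

```python
import numpy as np

# Scaled-down version of the scenario: a 100 x 100 array (an assumption,
# chosen to keep the example small) of random integers from 1 to 15,000
rng = np.random.default_rng(0)   # seed chosen arbitrarily
A = rng.integers(1, 15001, size=(100, 100))

mask = A < 20          # Boolean ndarray with the same shape as A
small = A[mask]        # rank 1 ndarray holding only the selected values

print(mask.shape, mask.dtype)
print(small)
print(np.count_nonzero(mask))   # how many elements satisfied the condition
```

No indices are ever written out by hand; the mask locates the matching elements wherever they happen to be.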


In addition to Boolean indexing, NumPy also allows for set operations. This is useful when comparing ndarrays, for example, to find common elements between two ndarrays. Let's see some examples:

In [103]:
# create a rank 1 ndarray
x = np.array([1,2,3,4,5])

# create a rank 1 ndarray
y = np.array([6,7,2,8,4])

# print x
print('x = ', x)

# print y
print('y = ', y)

# use set operations to compare x and y:
print('\nThe elements that are both in x and y:', np.intersect1d(x,y))
print('The elements that are in x that are not in y:', np.setdiff1d(x,y))
print('All the elements of x and y:',np.union1d(x,y))

x =  [1 2 3 4 5]
y =  [6 7 2 8 4]

The elements that are both in x and y: [2 4]
The elements that are in x that are not in y: [1 3 5]
All the elements of x and y: [1 2 3 4 5 6 7 8]
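
Set membership can also be combined with Boolean indexing. As a hedged sketch, `np.isin()` returns a Boolean mask marking which elements of one ndarray occur in another; unlike `np.intersect1d()`, indexing with this mask keeps the original order of `x` and any repeated values it contains. The name `in_y` is illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([6, 7, 2, 8, 4])

in_y = np.isin(x, y)   # Boolean mask: which elements of x also occur in y
print(in_y)            # [False  True False  True False]
print(x[in_y])         # [2 4]
print(x[~in_y])        # [1 3 5]
```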


### Sorting ndarrays

We can also sort ndarrays in NumPy. We will learn how to use the `np.sort()` function to sort rank 1 and rank 2 ndarrays in different ways. As with other functions we saw before, sort can also be used as a method. However, there is a big difference in how the data is stored in memory in this case. When `np.sort()` is used as a function, it sorts the ndarray out of place, meaning that it doesn't change the original ndarray being sorted. However, when you use sort as a method, `ndarray.sort()` sorts the ndarray in place, meaning that the original array will be changed to the sorted one. Let's see some examples:

In [104]:
# create an unsorted rank 1 ndarray
x = np.random.randint(1,11,size=(10,))

# print x
print('Original x = ', x)

# sort x and print the sorted array using sort as a function.
print('Sorted x (out of place):', np.sort(x))

# When we sort out of place the original array remains intact. To see this we print x again
print('x after sorting:', x)

Original x =  [ 2  7 10  5  7  1  7  6  9  2]
Sorted x (out of place): [ 1  2  2  5  6  7  7  7  9 10]
x after sorting: [ 2  7 10  5  7  1  7  6  9  2]


Notice that `np.sort()` keeps repeated values in the sorted array. If we want only the unique elements in sorted order, we can combine the sort function with the unique function. (Note that `np.unique()` already returns its result sorted, so the outer `np.sort()` is redundant here, although harmless.) Let's see how we can sort the unique elements of x above:

In [105]:
print(np.sort(np.unique(x)))

[ 1  2  5  6  7  9 10]


Finally, let's see how we can sort ndarrays in place, by using sort as a method:

In [106]:
x = np.random.randint(1,11,size=(10,))
print('Original x = ', x)

# sort x and print the sorted array using sort as a method.
x.sort()

# When we sort in place the original array is changed to the sorted array. To see this we print x again
print('x after sorting:', x)

Original x =  [10  2  9  8  7  3  9  1  8  3]
x after sorting: [ 1  2  3  3  7  8  8  9  9 10]


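When sorting rank 2 ndarrays, we need to specify to the `np.sort()` function whether we are sorting by rows or columns. This is done with the `axis` keyword: `axis = 0` sorts the values within each column, and `axis = 1` sorts the values within each row.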

In [108]:
# create an unsorted rank 2 ndarray
X = np.random.randint(1,11,size=(5,5))

# print X
print('Original X = \n', X)

# sort the columns of X and print the sorted array
print('\nX with sorted columns :\n', np.sort(X, axis = 0))

# sort the rows of X and print the sorted array
print('\nX with sorted rows :\n', np.sort(X, axis = 1))

Original X = 
 [[ 5  7  8 10  5]
 [ 1  2  3 10  3]
 [ 9  3  6 10  8]
 [ 7  6  9  7  9]
 [ 6  7  2  5  4]]

X with sorted columns :
 [[ 1  2  2  5  3]
 [ 5  3  3  7  4]
 [ 6  6  6 10  5]
 [ 7  7  8 10  8]
 [ 9  7  9 10  9]]

X with sorted rows :
 [[ 5  5  7  8 10]
 [ 1  2  3  3 10]
 [ 3  6  8  9 10]
 [ 6  7  7  9  9]
 [ 2  4  5  6  7]]
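
Because the example above uses random values, the effect of `axis` can be easier to follow on a small fixed array. The sketch below uses a hand-picked 3 x 3 ndarray (an illustrative choice, not part of the original example) and also shows the in-place method form on a rank 2 ndarray.

```python
import numpy as np

X = np.array([[3, 1, 2],
              [9, 7, 8],
              [6, 4, 5]])

by_columns = np.sort(X, axis=0)  # each column sorted top to bottom
by_rows = np.sort(X, axis=1)     # each row sorted left to right

print(by_columns)
# [[3 1 2]
#  [6 4 5]
#  [9 7 8]]
print(by_rows)
# [[1 2 3]
#  [7 8 9]
#  [4 5 6]]

X.sort(axis=1)   # the method form sorts X itself, in place
print(X)
```

Each column (or row) is sorted independently of the others, so values that started in the same row do not necessarily stay together.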


### Arithmetic Operations and Broadcasting

NumPy allows element-wise operations on ndarrays as well as matrix operations. In order to do element-wise operations, NumPy sometimes uses something called **Broadcasting**. Broadcasting is the term used to describe how NumPy handles element-wise arithmetic operations with ndarrays of different shapes. For example, broadcasting is used implicitly when doing arithmetic operations between scalars and ndarrays.

Let's start by doing element-wise addition, subtraction, multiplication, and division between ndarrays. NumPy offers two ways to do this: a functional approach, where we use functions such as `np.add()`, and arithmetic symbols, such as `+`, which more closely resemble how we write mathematical equations. Both forms perform the same operation; the only difference is that the functions usually have options that you can tweak using keywords. It is important to note that when performing element-wise operations, the ndarrays being operated on must have the same shape or be broadcastable. Consider the following example:

In [109]:
# create two rank 1 ndarrays
x = np.array([1,2,3,4])
y = np.array([5.5,6.5,7.5,8.5])

print('x = ', x)
print()
print('y = ', y)

# perform basic element-wise operations using arithmetic symbols and functions
print('x + y = ', x + y)
print('add(x,y) = ', np.add(x,y))
print()
print('x - y = ', x - y)
print('subtract(x,y) = ', np.subtract(x,y))
print()
print('x * y = ', x * y)
print('multiply(x,y) = ', np.multiply(x,y))
print()
print('x / y = ', x / y)
print('divide(x,y) = ', np.divide(x,y))

x =  [1 2 3 4]

y =  [ 5.5  6.5  7.5  8.5]
x + y =  [  6.5   8.5  10.5  12.5]
add(x,y) =  [  6.5   8.5  10.5  12.5]

x - y =  [-4.5 -4.5 -4.5 -4.5]
subtract(x,y) =  [-4.5 -4.5 -4.5 -4.5]

x * y =  [  5.5  13.   22.5  34. ]
multiply(x,y) =  [  5.5  13.   22.5  34. ]

x / y =  [ 0.18181818  0.30769231  0.4         0.47058824]
divide(x,y) =  [ 0.18181818  0.30769231  0.4         0.47058824]
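
As an illustration of the keyword options available in the functional form, the sketch below uses the `out` and `where` keywords of `np.add()`. The buffer names `result` and `masked` are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([5.5, 6.5, 7.5, 8.5])

# `out` writes the result into an existing array instead of allocating a new one
result = np.empty(4)          # float64 buffer, matching the type of x + y
np.add(x, y, out=result)
print(result)                 # values: 6.5, 8.5, 10.5, 12.5

# `where` applies the operation only where the mask is True;
# the other positions keep whatever `out` already held (zeros here)
masked = np.zeros(4)
np.add(x, y, out=masked, where=x > 2)
print(masked)                 # values: 0, 0, 10.5, 12.5
```

Neither option is available with the `+` symbol, which is the practical reason to reach for the function form.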


We can also perform the same element-wise arithmetic operations on rank 2 ndarrays. Again, remember that in order to do these operations the ndarrays being operated on must have the same shape or be broadcastable.

In [110]:
# We create two rank 2 ndarrays
X = np.array([1,2,3,4]).reshape(2,2)
Y = np.array([5.5,6.5,7.5,8.5]).reshape(2,2)

# We print X
print()
print('X = \n', X)

# We print Y
print()
print('Y = \n', Y)
print()

# We perform basic element-wise operations using arithmetic symbols and functions
print('X + Y = \n', X + Y)
print()
print('add(X,Y) = \n', np.add(X,Y))
print()
print('X - Y = \n', X - Y)
print()
print('subtract(X,Y) = \n', np.subtract(X,Y))
print()
print('X * Y = \n', X * Y)
print()
print('multiply(X,Y) = \n', np.multiply(X,Y))
print()
print('X / Y = \n', X / Y)
print()
print('divide(X,Y) = \n', np.divide(X,Y))


X = 
 [[1 2]
 [3 4]]

Y = 
 [[ 5.5  6.5]
 [ 7.5  8.5]]

X + Y = 
 [[  6.5   8.5]
 [ 10.5  12.5]]

add(X,Y) = 
 [[  6.5   8.5]
 [ 10.5  12.5]]

X - Y = 
 [[-4.5 -4.5]
 [-4.5 -4.5]]

subtract(X,Y) = 
 [[-4.5 -4.5]
 [-4.5 -4.5]]

X * Y = 
 [[  5.5  13. ]
 [ 22.5  34. ]]

multiply(X,Y) = 
 [[  5.5  13. ]
 [ 22.5  34. ]]

X / Y = 
 [[ 0.18181818  0.30769231]
 [ 0.4         0.47058824]]

divide(X,Y) = 
 [[ 0.18181818  0.30769231]
 [ 0.4         0.47058824]]


We can also apply mathematical functions, such as `sqrt(x)`, `exp(x)`, and `power(x,n)`, to all elements of an ndarray at once.

In [111]:
# create a rank 1 ndarray
x = np.array([1,2,3,4])

print('x = ', x)

# apply different mathematical functions to all elements of x
print()
print('EXP(x) =', np.exp(x))
print('SQRT(x) =',np.sqrt(x))
print('POW(x,2) =',np.power(x,2)) 

x =  [1 2 3 4]

EXP(x) = [  2.71828183   7.3890561   20.08553692  54.59815003]
SQRT(x) = [ 1.          1.41421356  1.73205081  2.        ]
POW(x,2) = [ 1  4  9 16]


NumPy also offers a variety of statistical functions that may be applied to ndarrays. In the example below, we will apply `mean()`, `sum()`, `std()`, `median()`, `max()`, and `min()` functions. 

In [114]:
# create a 2 x 2 ndarray
X = np.array([[1,2], [3,4]])

print('X = \n', X)

print('Average of all elements in X:', X.mean())
print('Average of all elements in the columns of X:', X.mean(axis=0))
print('Average of all elements in the rows of X:', X.mean(axis=1))
print()
print('Sum of all elements in X:', X.sum())
print('Sum of all elements in the columns of X:', X.sum(axis=0))
print('Sum of all elements in the rows of X:', X.sum(axis=1))
print()
print('Standard Deviation of all elements in X:', X.std())
print('Standard Deviation of all elements in the columns of X:', X.std(axis=0))
print('Standard Deviation of all elements in the rows of X:', X.std(axis=1))
print()
print('Median of all elements in X:', np.median(X))
print('Median of all elements in the columns of X:', np.median(X,axis=0))
print('Median of all elements in the rows of X:', np.median(X,axis=1))
print()
print('Maximum value of all elements in X:', X.max())
print('Maximum value of all elements in the columns of X:', X.max(axis=0))
print('Maximum value of all elements in the rows of X:', X.max(axis=1))
print()
print('Minimum value of all elements in X:', X.min())
print('Minimum value of all elements in the columns of X:', X.min(axis=0))
print('Minimum value of all elements in the rows of X:', X.min(axis=1))

X = 
 [[1 2]
 [3 4]]
Average of all elements in X: 2.5
Average of all elements in the columns of X: [ 2.  3.]
Average of all elements in the rows of X: [ 1.5  3.5]

Sum of all elements in X: 10
Sum of all elements in the columns of X: [4 6]
Sum of all elements in the rows of X: [3 7]

Standard Deviation of all elements in X: 1.11803398875
Standard Deviation of all elements in the columns of X: [ 1.  1.]
Standard Deviation of all elements in the rows of X: [ 0.5  0.5]

Median of all elements in X: 2.5
Median of all elements in the columns of X: [ 2.  3.]
Median of all elements in the rows of X: [ 1.5  3.5]

Maximum value of all elements in X: 4
Maximum value of all elements in the columns of X: [3 4]
Maximum value of all elements in the rows of X: [2 4]

Minimum value of all elements in X: 1
Minimum value of all elements in the columns of X: [1 2]
Minimum value of all elements in the rows of X: [1 3]


Now, let's see how NumPy can combine a single number with all the elements of an ndarray, by addition, subtraction, multiplication, or division, without the use of complicated loops.

In [119]:
X = np.array([[1,2], [3,4]])
print('X = \n', X, '\n')

print('3 * X = \n', 3 * X, '\n')
print()
print('3 + X = \n', 3 + X, '\n')
print()
print('X - 3 = \n', X - 3, '\n')
print()
print('X / 3 = \n', X / 3, '\n')

X = 
 [[1 2]
 [3 4]] 

3 * X = 
 [[ 3  6]
 [ 9 12]] 


3 + X = 
 [[4 5]
 [6 7]] 


X - 3 = 
 [[-2 -1]
 [ 0  1]] 


X / 3 = 
 [[ 0.33333333  0.66666667]
 [ 1.          1.33333333]] 



In the example above, NumPy is working behind the scenes to broadcast 3 along the ndarray so that they have the same shape. This allows us to add 3 to each element of X with just one line of code.

Subject to certain constraints, NumPy can do the same for two ndarrays of different shapes. Let's consider an example:

In [121]:
# create a rank 1 ndarray
x = np.array([1,2,3])

# create a 3 x 3 ndarray
Y = np.array([[1,2,3],[4,5,6],[7,8,9]])

# create a 3 x 1 ndarray
Z = np.array([1,2,3]).reshape(3,1)

# print x
print('x = ', x)
print()

# print Y
print('Y = \n', Y)
print()

# print Z
print('Z = \n', Z)
print()

print('x + Y = \n', x + Y)
print()
print('Z + Y = \n',Z + Y)

x =  [1 2 3]

Y = 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

Z = 
 [[1]
 [2]
 [3]]

x + Y = 
 [[ 2  4  6]
 [ 5  7  9]
 [ 8 10 12]]

Z + Y = 
 [[ 2  3  4]
 [ 6  7  8]
 [10 11 12]]


As you can see, NumPy is able to add the rank 1 ndarray x (whose 3 elements are treated as a 1 x 3 ndarray) and the 3 x 1 ndarray Z to the 3 x 3 ndarray Y by broadcasting the smaller ndarrays along the larger one so that they have compatible shapes. In general, NumPy compares the shapes of the two ndarrays dimension by dimension, starting from the last one; two dimensions are compatible when they are equal or when one of them is 1. If any pair of dimensions fails this test, NumPy raises an error instead of broadcasting.
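
The following sketch makes the stretching visible with `np.broadcast_to()` and shows what happens when shapes are not compatible. NumPy compares shapes starting from the last dimension, and two sizes are compatible when they are equal or one of them is 1. The array `bad` is a hypothetical example added only to trigger the error.

```python
import numpy as np

x = np.array([1, 2, 3])                  # shape (3,)
Y = np.arange(1, 10).reshape(3, 3)       # shape (3, 3)
Z = x.reshape(3, 1)                      # shape (3, 1)

# np.broadcast_to shows what each smaller array is stretched to
print(np.broadcast_to(x, Y.shape))   # x repeated as every row
print(np.broadcast_to(Z, Y.shape))   # Z repeated as every column

# Shapes (2,) and (3, 3) are not compatible: the trailing sizes 2 and 3
# differ and neither is 1, so NumPy raises a ValueError
bad = np.array([1, 2])
try:
    bad + Y
except ValueError as err:
    print('cannot broadcast:', err)
```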