# 1 Install and import Numpy

Numpy is a powerful third-party library for manipulating numerical data. Many of its operations run as vectorized, compiled code instead of Python loops, so it can handle large arrays of data very efficiently. Moreover, it contains convenient built-in functions for all kinds of scientific, numerical and statistical calculations.

First, we need to install it with the following command:

- pip install numpy

The pip package manager comes installed with Python. 

Instead of opening a command window, we can also run commands directly from this notebook:

In [None]:
! pip install numpy

In Colaboratory, the output will say "Requirement already satisfied", because Numpy is already installed on the hosted Colaboratory environment. When using Jupyter notebooks on your own machine, this command will download and install the Python library if it is not already present.

In most cases, but not always, the library is imported in the code under the same name as the package in the command above. An exception appears later in this notebook: the package opencv-python is imported as cv2.

In [None]:
import numpy as np # by convention, we import numpy as np.

# 2 Numpy Arrays

## 2.1 Basics

The basic object class in Numpy is an array. An array can have one or more dimensions of various sizes. 

In [None]:
#one-dimensional array

array1 = np.array([1,4,5,8])
array1 

The input to the constructor is an "array-like" object (list, tuple, ...). Conversely, an array can be turned into a list as follows:

In [None]:
list1 = list(array1)
list1
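
As a side note, `list()` keeps every element as a Numpy scalar. If plain Python numbers are needed, the array method `tolist()` converts them as well. A minimal sketch (the names `as_list` and `as_python_list` are only illustrative):

```python
import numpy as np

array1 = np.array([1, 4, 5, 8])

as_list = list(array1)            # elements stay Numpy scalars
as_python_list = array1.tolist()  # elements become plain Python ints

print(type(as_list[0]))
print(type(as_python_list[0]))
```

The exact Numpy integer type printed for the first list depends on the platform.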

In [None]:
print(type(array1))

In [None]:
#two-dimensional array

array2 = np.array([[4,5,6],[4,8,7]])
array2


In [None]:
#four-dimensional array

array4 = np.array([[[[1,2],[3,4]],[[5,6],[7,8]]],[[[9,10],[11,12]],[[13,14],[15,16]]]])
array4

The **dimensions** of the array can be found with the **shape attribute**.

In [None]:
array2

In [None]:
array2.shape

In [None]:
array4.shape
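
Besides `shape`, an array also exposes the number of dimensions and the total number of elements. A small sketch using the same two-dimensional array as above:

```python
import numpy as np

array2 = np.array([[4, 5, 6], [4, 8, 7]])

print(array2.shape)  # (2, 3): 2 rows, 3 columns
print(array2.ndim)   # 2: number of dimensions
print(array2.size)   # 6: total number of elements
```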

Creating an ordered array is done with np.arange() 

In [None]:
ordered_array = np.arange(10)
ordered_array
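
Like Python's `range()`, `np.arange()` also accepts a start, a stop and a step; the stop value itself is not included. The variable names below are illustrative only:

```python
import numpy as np

from_two = np.arange(2, 10)    # start at 2, stop before 10
stepped = np.arange(2, 10, 3)  # start at 2, stop before 10, step 3
halves = np.arange(0, 2, 0.5)  # a non-integer step is allowed too

print(from_two)  # [2 3 4 5 6 7 8 9]
print(stepped)   # [2 5 8]
print(halves)    # [0.  0.5 1.  1.5]
```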

## 2.2 Numpy datatypes

Unlike Python lists, a numpy array has a **fixed datatype**. Numpy defines a number of **built-in data types**.

In [None]:
array_int = np.array([[1,2],[3,4]])
array_int.dtype

In [None]:
array_int

In [None]:
array_int[1,1]

In [None]:
array_float = np.array([[1.0,2.0],[3.0,4.0]])
array_float.dtype

In [None]:
print(array_float[0,0])
print(array_float[1,1])
print(type(array_float[0,0]))
print(type(array_float[1,1]))

In [None]:
array_string = np.array(['hello','world'])
array_string.dtype

In [None]:
print(array_string[0])
print(array_string[1])
print(type(array_string[0]))
print(type(array_string[1]))

A numpy array has **only one type**. If the inputs have mixed types, Numpy converts them all to a common type:

In [None]:
array_mixed = np.array([5.0,"hello"]) # no error is raised: both elements are converted to strings
array_mixed.dtype

In [None]:
print(array_mixed[0])
print(array_mixed[1])
print(type(array_mixed[0]))
print(type(array_mixed[1]))
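
The sketch below illustrates how Numpy picks a common type for mixed input, and how the type can be set or changed explicitly with the `dtype` argument and the `astype()` method. The variable names are illustrative only:

```python
import numpy as np

mixed_numbers = np.array([1, 2.5])     # int and float -> float
print(mixed_numbers.dtype)             # float64

mixed_text = np.array([5.0, "hello"])  # float and str -> unicode string
print(mixed_text.dtype.kind)           # U

as_float32 = np.array([1, 2, 3], dtype=np.float32)  # request a type explicitly
print(as_float32.dtype)                # float32

as_int = mixed_numbers.astype(int)     # convert; 2.5 is truncated to 2
print(as_int)                          # [1 2]
```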

## 2.3 Slicing and indexing arrays

Select a specific element from the array by specifying the coordinates in square brackets. Remember that indices start at 0. 

In [None]:
array2

In [None]:
array2[0,1] #first row, second column

Select an entire row or column with a colon.

In [None]:
array2[:,0]

In [None]:
array2[1,:]

In [None]:
array4[0,0,0,:]

In [None]:
ordered_array

Make a stepwise selection with [start:stop:step]. The index 'start' will be included, index 'stop' is not included.

In [None]:
ordered_array[0:9:1]

In [None]:
ordered_array[0:9:2]
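
Each of the three parts of `[start:stop:step]` can be left out, and negative values count from the end. A short sketch with the same ordered array:

```python
import numpy as np

ordered_array = np.arange(10)

print(ordered_array[::2])   # [0 2 4 6 8]: every second element of the whole array
print(ordered_array[-3:])   # [7 8 9]: the last three elements
print(ordered_array[::-1])  # [9 8 7 6 5 4 3 2 1 0]: reversed order
```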

## 2.4 Exercise

First, we create a random 4x12 2D array.

In [None]:
np.random.seed(0)
my_array = np.random.randint(0,100,(4,12))
my_array

Remove every second column from the array

In [None]:
#CODE HERE

#SOLUTION
my_array = my_array[:,0:12:2]
my_array

# 3 Calculations with arrays

## 3.1 Element-wise operations

In [None]:
array1 = np.array([[1,2],[3,4]])
array2 = np.array([[5,6],[7,8]])
print(array1)
print("\n")
print(array2)

In [None]:
array1*array2

In [None]:
array1>array2

In [None]:
array1 - array2

In [None]:
array1 ** 2

## 3.2 Matrix operations

**matrix multiplication**

In [None]:
np.matmul(array1,array2)
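
The `@` operator is a shorthand for matrix multiplication on arrays. The sketch below checks that it gives the same result as `np.matmul()` for the two arrays used here:

```python
import numpy as np

array1 = np.array([[1, 2], [3, 4]])
array2 = np.array([[5, 6], [7, 8]])

product_matmul = np.matmul(array1, array2)
product_operator = array1 @ array2  # same operation, written as an operator

print(product_operator)
# [[19 22]
#  [43 50]]
print(np.array_equal(product_matmul, product_operator))  # True
```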

**transpose**

Transposing a 2D array can be done with '.T'

In [None]:
print(array1)
print("\n")
print(array1.T)

For arrays with more than two dimensions, use **np.transpose()**. By default, it reverses the order of the axes. You can also specify the exact permutation of the axes. See the [documentation for np.transpose()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.transpose.html).

In [None]:
array3 = np.array([[[1,2,0],[3,4,0]],[[5,6,0],[7,8,0]],[[11,22,0],[33,44,0]],[[55,66,0],[77,88,0]]])
array3.shape

In [None]:
array3_transposed = np.transpose(array3)
array3_transposed.shape

In [None]:
array3_transposed = np.transpose(array3,axes=(1,2,0))
array3_transposed.shape
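
To see what the `axes` argument does: axis `i` of the result is axis `axes[i]` of the input. The sketch below uses a different array of the same shape (4,2,3), built with `reshape()`, which is not covered in this notebook and is used here only to get distinct values:

```python
import numpy as np

# 24 distinct values arranged in shape (4, 2, 3)
array3 = np.arange(24).reshape(4, 2, 3)

transposed = np.transpose(array3, axes=(1, 2, 0))
print(transposed.shape)  # (2, 3, 4): input axes 1, 2 and 0, in that order

# the element at [i, j, k] in the input is found at [j, k, i] in the result
print(array3[3, 1, 2] == transposed[1, 2, 3])  # True
```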

## 3.3 Broadcasting

Broadcasting describes how operations are performed on arrays of different shapes. Remember, when using the **asterisk (\*) operator**, the arrays are multiplied element-wise.

In [None]:
array1 = np.array([[1,2],[3,4]])
array2 = np.array([[5,6],[7,8]])
print(array1)
print("\n")
print(array2)

In [None]:
array1*array2

Thanks to broadcasting, we can do more. For example, we can multiply a multidimensional array with a single scalar.

In [None]:
array1 * 5

The value "5" above is **"broadcast" across the larger array so that the shapes become compatible**. The statement above is thus equivalent to the one below.

In [None]:
array1 * np.array([[5,5],[5,5]])

We can also do the following.

In [None]:
array3 = np.array([5,6])

print(array1)
print("\n")
print(array3)

print("\nshape of array1: ",array1.shape)
print("shape of array3: ",array3.shape)


In [None]:
array3 * array1

In [None]:
array1 * array3

Again, array3 was broadcast over the larger array1. The above is equivalent to the following:

In [None]:
array1 * np.array([[5,6],[5,6]])

Of course, broadcasting only works if the shapes allow it. The **dimensions must be compatible**. Numpy compares the shapes dimension by dimension, starting from the last one. Two dimensions are compatible when

1. they are equal, or
2. one of them is 1

In [None]:
array4 = np.array([[1,2],[3,4],[5,6],[7,8]])

print(array1)
print("\n")
print(array4)

print("\nshape of array1: ",array1.shape)
print("shape of array4: ",array4.shape)

In [None]:
array1 * array4
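
The multiplication above fails because the first dimensions (2 and 4) differ and neither of them is 1. The sketch below catches the resulting `ValueError` and then shows a shape that does broadcast against array4. The names `broadcast_ok`, `row` and `result` are illustrative only:

```python
import numpy as np

array1 = np.array([[1, 2], [3, 4]])                  # shape (2, 2)
array4 = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])  # shape (4, 2)

try:
    array1 * array4
    broadcast_ok = True
except ValueError as error:
    # Numpy refuses to combine shapes (2, 2) and (4, 2)
    broadcast_ok = False
    print("Broadcasting failed:", error)

# a (1, 2) array is compatible: its first dimension is 1
row = np.array([[10, 100]])
result = row * array4
print(result.shape)  # (4, 2)
```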

## 3.4 Vectorization (advanced)

With vectorization, you can create vectorized functions to perform **custom operations** on arrays. Below is a short example, read more in the [documentation!](https://docs.scipy.org/doc/numpy/reference/generated/numpy.vectorize.html)

In [None]:
def myfunc(a, b):
    "Return a-b if a>b, otherwise return a+b"
    if a > b:
        return a - b
    else:
        return a + b

In [None]:
myfunc(4,5)

In [None]:
vfunc = np.vectorize(myfunc)

In [None]:
array1 = np.array([[1,2],[7,8]])
array2 = np.array([[7,8],[6,5]])

print(array1)
print("\n")
print(array2)

In [None]:
vfunc(array1,array2)

The **broadcasting rule applies** here too.

In [None]:
array1

In [None]:
array3 = np.array([5,6])
array3

In [None]:
vfunc(array1,array3)

Does this result make sense?
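
One way to check the result is to write the same rule with `np.where()`, which picks elements from two arrays based on a condition. This is a sketch of an alternative, not a replacement for the example above; note that the Numpy documentation describes `np.vectorize` as a convenience that is not meant to improve performance.

```python
import numpy as np

def myfunc(a, b):
    """Return a-b if a>b, otherwise return a+b."""
    if a > b:
        return a - b
    return a + b

vfunc = np.vectorize(myfunc)

array1 = np.array([[1, 2], [7, 8]])
array3 = np.array([5, 6])

result_vectorize = vfunc(array1, array3)

# same rule: where array1 > array3 take the difference, otherwise the sum
result_where = np.where(array1 > array3, array1 - array3, array1 + array3)

print(result_where)
# [[6 8]
#  [2 2]]
print(np.array_equal(result_vectorize, result_where))  # True
```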

## 3.5 Exercise

**hints:** With the *np.ndarray.mean()* function, we can find the mean of an array along a given axis. For example, the code below finds a vector *mean_vector* that represents the mean of every row in *my_array*.

In [None]:
my_array = np.random.randint(0,100,(4,6))
print(my_array.shape)
print("")
print(my_array)

In [None]:
mean_vector = my_array.mean(axis=1) #axis indexing starts at 0, so axis=1 means the second axis.
print(mean_vector.shape)
print("the means of the rows are...")
print(mean_vector)

The resulting array is 1-dimensional. To support broadcasting with the (4,6) array, it must have shape (4,1). Use np.expand_dims():

In [None]:
mean_vector = np.expand_dims(mean_vector,1)
print(mean_vector.shape)
print("")
print(mean_vector)
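
There are other ways to get the same column shape. The sketch below, on a small hand-made array, compares `np.expand_dims()` with the `keepdims=True` argument of `mean()` and with indexing by `np.newaxis`:

```python
import numpy as np

my_array = np.array([[1, 2, 3], [4, 5, 6]])

mean_expand = np.expand_dims(my_array.mean(axis=1), 1)  # add the axis afterwards
mean_keepdims = my_array.mean(axis=1, keepdims=True)    # keep the reduced axis with size 1
mean_newaxis = my_array.mean(axis=1)[:, np.newaxis]     # add the axis by indexing

print(mean_keepdims.shape)  # (2, 1)
print(mean_keepdims)
# [[2.]
#  [5.]]
```

All three variants give an array of shape (2,1) that broadcasts against the (2,3) array.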


**Exercise:** From my_array, create an array where some elements are replaced by zero:
1. If the value is higher than the mean value of its row, keep the original value
2. If the value is lower than or equal to the mean value of its row, set the value to zero.


In [None]:
my_array = np.random.randint(0,100,(4,6))
print(my_array.shape)
print("")
print(my_array)

In [None]:
#CODE HERE

#SOLUTION
#also emphasize importance of documentation. Explain how array1 is of type numpy.ndarray, and the linked page refers to a function
#that can be used on this object. This is different from functions like np.expand_dims(arr,...) where the array is an argument. However, 
#np.mean(arr, axis..) also exists!!

#step 1: find the mean of every row
mean_vector = my_array.mean(axis = 1)

#step 2: the resulting array is 1-dimensional, with shape (4,). Transform it into a (4,1) array. Why not a (1,4) array?
mean_vector = np.expand_dims(mean_vector,1)

#step 3: create an array of boolean values by comparing each element with the mean of its row. Take advantage of broadcasting.
boolean_mask = (my_array > mean_vector)

#step 4: multiply the original array with the boolean mask to set all values corresponding to "False" to zero.
my_array = my_array * boolean_mask

print(my_array)
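
As a possible alternative to multiplying with the boolean mask, `np.where()` can pick between the original value and zero directly. A sketch on a small hand-made array (the names `row_means` and `masked` are illustrative only):

```python
import numpy as np

my_array = np.array([[1, 5, 9], [4, 4, 10]])

# keepdims=True keeps the reduced axis, giving shape (2, 1) for broadcasting
row_means = my_array.mean(axis=1, keepdims=True)  # [[5.], [6.]]

# keep the value where it exceeds the row mean, otherwise use 0
masked = np.where(my_array > row_means, my_array, 0)

print(masked.tolist())  # [[0, 0, 9], [0, 0, 10]]
```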



# 4 Numpy and images

## 4.1 OpenCV

Images can be represented as arrays of numbers. A **coloured RGB image is a three-dimensional array**, where the first two dimensions represent the vertical and horizontal axes (the rows and columns of pixels). The third dimension contains the pixel values for the three base colours Red, Green and Blue.

Images can be read and manipulated with the third-party open-source library **OpenCV**. We also use matplotlib here to visualize an image.

In [None]:
! pip install opencv-python
! pip install matplotlib

In [None]:
import cv2
import matplotlib.pyplot as plt # matplotlib is an often-used library. matplotlib.pyplot is typically imported as plt.

In [None]:
image = cv2.imread("images/bird.jpg")

In [None]:
print(type(image))
print(image.shape)

OpenCV works perfectly with Numpy. The returned image is a numpy array of 600x697 pixels and 3 colour channels.

Let's **visualize** the image with matplotlib.

In [None]:
plt.imshow(image)

Looks great? Not really... In the original image, the bird was blue. Watch out for the following caveat.

**An image read by OpenCV is in BGR (blue-green-red) mode, while matplotlib displays in RGB (red-green-blue)**. We will fix this in the exercises.

## 4.2 Exercise 2: fix the bird (advanced)

In the previous section, matplotlib displayed the image in RGB, while it was read in BGR mode. Use a workaround to display the image correctly.

hint: The image is read in BGR mode, so the channels must be swapped to RGB. Using indexing, grab the slices of the BGR array corresponding to each colour, and then use np.concatenate() to create an RGB image.

You will use [np.concatenate()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.concatenate.html) and [np.expand_dims()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.expand_dims.html).

In [None]:
image_BGR = cv2.imread("images/bird.jpg")
image_BGR.shape

In [None]:
#CODE HERE

#SOLUTION

#slice the image, separating the three colours
image_B = image_BGR[:,:,0]
image_G = image_BGR[:,:,1]
image_R = image_BGR[:,:,2]

#the resulting slices are only two-dimensional. Use np.expand_dims() to transform these slices back to three-dimensional arrays,
#where the third dimension has size 1. Pick the right axis!
image_B = np.expand_dims(image_B,axis=2)
image_G = np.expand_dims(image_G,axis=2)
image_R = np.expand_dims(image_R,axis=2)

#concatenate the slices with np.concatenate(). Pick the right axis!
image_RGB = np.concatenate((image_R,image_G,image_B),axis = 2) #concatenate along third axis (axis = 2)



If everything is correct, the image below should display the bird in its original blue colour.

In [None]:
plt.imshow(image_RGB)
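
A shorter route to the same result is to reverse the third axis with a slice of step -1. The sketch below uses a tiny synthetic array instead of the bird image, so that it runs without the image file; OpenCV also offers `cv2.cvtColor(image_BGR, cv2.COLOR_BGR2RGB)` for this conversion.

```python
import numpy as np

# synthetic 1x2 "image" in BGR order: one pure-blue and one pure-red pixel
image_BGR = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# reverse the order of the colour channels: BGR -> RGB
image_RGB = image_BGR[:, :, ::-1]

print(image_RGB.shape)            # (1, 2, 3)
print(image_RGB[0, 0].tolist())   # [0, 0, 255]: blue pixel, now in RGB order
print(image_RGB[0, 1].tolist())   # [255, 0, 0]: red pixel, now in RGB order
```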