# Exercise 0: Python tutorial
This tutorial will guide you through the basics of Python.

It consists of 3 parts
 1. Python basics: Introduces variables, functions, lists and other data structures
 2. NumPy arrays: Introduces the important NumPy package
 3. Images: Gives an example of a very simple image processing task

Even if you are familiar with Python, it may be worthwhile to read through the third part in preparation for the upcoming projects.

**Note:** This tutorial is completely optional, and will not be graded. The solutions to the 3 problems are in fact provided at the end of this handout for your reference.




## Initialize the environment
First, we will mount our Google Drive folder so that we can easily access the files on Google Drive from Colab.

In [None]:
# Mount your google drive folder at '/content/drive'. Note that when you
# execute this, Colab will ask you to provide permission to access your drive.
from google.colab import drive
drive.mount('/content/drive')


Next, set the path in Google Drive where you uploaded this handout, e.g. `MyDrive/iacv/ex0`.

In [None]:
import sys
from pathlib import Path

# TODO set this
iacv_path = 'MyDrive/...'

env_path = Path("/content/drive") / iacv_path


# Add the handout folder to the Python path.
# sys.path holds strings, so we compare against str(env_path), not the Path object.
if str(env_path) not in sys.path:
    sys.path.append(str(env_path))

## Part 1. Python basics
Python is a very simple language.
To define a variable, simply use the equals sign.
Then, you can print the value of the variable with the `print()` function.

In [None]:
x = 3
print(x)

Standard mathematical operations are done with the `+`, `-`, `/` and `*` operators. The `**` operator raises a number to a power.

In [None]:
x = 5
y = 6
a = 2*x + 5*y
a_square = a**2
print(a_square)

**Task 1.1.** In the cell below, compute how many foreign residents Switzerland had in 2015 if the total number of inhabitants was 8327126 and the fraction of foreigners was 24.6%. Hint: use round() function to remove decimals from your result.

In [None]:
# TODO

Next, we look at functions. To define a function in Python, use the syntax shown in the following example:

In [None]:
def compute_number_of_days(age):
    """
    This is called docstring. Here you can describe, what this function does.
    You can also document input and output. Docstrings are useful, so that
    in the future you (and others) can still understand this function.

    Roughly compute the number of days a person has lived.

    Parameters
    ----------
      age: int
        Age of person in years

    Returns
    -------
      days: int
        Number of days a person has lived (approximately).
    """

    # In addition to docstrings, we can comment our code using the hash sign
    days = age * 365
    return days

The function returns the value written after the ```return``` keyword and stops executing at that point.
To call a function after it has been defined, type its name and pass the arguments in parentheses, as in the example below.

In [None]:
days = compute_number_of_days(22)
print(days)

Functions can also be called from inside another function:

In [None]:
def print_congratulations(age, name):
    days = compute_number_of_days(age)
    print(f"Hello {name}! You have already lived on this planet for {days} days!")

In this example, we use f-strings to convert variables to strings.
For example, `days`, which is initially of a numeric data type, is automatically converted to a string. f-strings have convenient formatting options, and you will likely come across them while using Python.

In [None]:
print_congratulations(22, 'Nikolay')
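The formatting options of f-strings mentioned above can be sketched as follows. This is only a small illustration, reusing the numbers from Task 1.1; the variable names are made up for this example.

```python
# Values taken from Task 1.1; the variable names are illustrative only
fraction = 0.246
population = 8327126

# ".1%" formats a fraction as a percentage with one decimal
percent_text = f"{fraction:.1%}"

# "," inserts a thousands separator
population_text = f"{population:,}"

# ".2f" fixes the number of decimals of a floating-point number
pi_text = f"{3.14159:.2f}"

print(percent_text, population_text, pi_text)
```

The part after the colon inside the curly braces is the format specification; everything before it is an ordinary Python expression.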

Another important concept of Python is lists. Lists are similar to arrays in other languages, and are used to aggregate multiple (dis)similar objects in a sequence. It is very easy to define them as in the following example:


In [None]:
ages_parents = [51, 52]
ages_children = [2, 4, 10]

This code creates 2 lists. The first list stores the ages of the parents, and the second stores the ages of their kids. You can easily concatenate lists with the plus sign.

In [None]:
ages_family = ages_parents + ages_children
print(ages_family)

Another important structure in Python is the ```for``` loop, which is used to repeatedly perform the same operation. You can call a ```for``` loop for each element of a list in the way shown below:

In [None]:
total_age = 0
for age in ages_family:
    total_age = total_age + age
print(total_age)

This code will print 119, which is the total age of the people in the family. Note that you must *indent* the commands inside the ```for``` loop. The end of these commands is implicitly defined by switching to the previous indentation level. This is also the case for other Python structures, such as function definitions, ```while``` loops etc.

You can also iterate over indices instead of over the list elements themselves. This is useful in many cases, for example, when you have a second list whose entries correspond to those of the first one. Consider the following example:

In [None]:
ages_family = [51, 52, 2, 4, 10]
names = ['Patrick', 'Maria', 'Emma', 'Jordi', 'Vasiliy']

We have two lists, the first containing the ages of family members and the second containing their names. To find out the length of a list, you can use the ```len()``` function. For example,

In [None]:
print(len(names))

Knowing the length of the list, you can iterate over it by using indexing. Consider the following example:

In [None]:
for i in range(0, len(names)):
    current_name = names[i]
    current_age = ages_family[i]
    print_congratulations(current_age, current_name)

Here, we use ```[i]``` to access the i-th element of the list.
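As an alternative to indexing, Python's built-in `zip` and `enumerate` functions are often used for the same purpose. The following is a small sketch using the same two lists; the `lines` list is introduced only for this illustration.

```python
ages_family = [51, 52, 2, 4, 10]
names = ['Patrick', 'Maria', 'Emma', 'Jordi', 'Vasiliy']

# zip pairs up the elements of both lists, so no index is needed
lines = []
for name, age in zip(names, ages_family):
    lines.append(f"{name} is {age} years old")
print(lines)

# enumerate yields the index together with the element
for i, name in enumerate(names):
    print(i, name)
```

Which style to use is mostly a matter of readability: `zip` avoids index bookkeeping, while explicit indices are handy when the position itself is needed.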

You can also nest loops inside each other, but be sure to use proper indentation and a different index variable for each loop. Have a look at the next example:

In [None]:
for i in range(3, 7):
    for j in range(100, 103):
        mul = i * j
        print(f"If you multiply {i} by {j}, you get {mul}")

It is possible to combine `for` loops and `list`s, with what Python calls `list comprehension`. This is a concise way of creating lists. For example, in the following, we create a `list` holding the square of the numbers 1 to 6.

In [None]:
square_list = [i**2 for i in range(1, 7)]
print(square_list)
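A list comprehension can also contain a condition. The sketch below keeps only the squares of even numbers and shows the equivalent explicit loop for comparison; the variable names are chosen just for this example.

```python
# Squares of the even numbers between 1 and 6
even_squares = [i**2 for i in range(1, 7) if i % 2 == 0]
print(even_squares)

# The same result written as an explicit loop
even_squares_loop = []
for i in range(1, 7):
    if i % 2 == 0:
        even_squares_loop.append(i**2)
print(even_squares_loop)
```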

Other useful data structures in Python, which we will only briefly mention here, are tuples, dictionaries and sets.
 - **Tuples** are very similar to lists, with the important distinction that they are immutable, i.e. one cannot re-assign elements of a tuple.
 - **Dictionaries** can be thought of as lookup tables with key-value pairs.
 - **Sets** are unordered collections with no duplicate elements. They are useful for membership testing and group operations (e.g. intersection).

In [None]:
# Creating a tuple
ages_family = (51, 52, 2, 4, 10)
names = ("Patrick", "Maria", "Emma", "Jordi", "Vasiliy")

# This is not allowed: tuples are immutable, so the next line raises a TypeError
names[1] = "Mariaaa"

In [None]:
# Creating a dictionary - names are keys, ages are values
ages_family_dd = {
    "Patrick": 51,
    "Maria": 52,
    "Emma": 2,
    "Jordi": 4,
    "Vasiliy": 10,
}

# Access elements of the dictionary using their keys
print(f"Emma is {ages_family_dd['Emma']} years old\n")

# One can also iterate over dictionaries
for name, age in ages_family_dd.items():
    print(f"{name} is {age} years old")

In [None]:
# Create a set from a list
ll1 = [2, 4, 6, 6, 8, 10]
ll2 = [4, 8, 10]

ss1 = set(ll1)
ss2 = set(ll2)

# Note how duplicates are removed
print(ss1)

# We can find the intersection of two sets
print(ss1.intersection(ss2))


**Task 1.2.** For a triangle with base $b$ and height $h_b$, the area is computed as $A=\frac{h_b b}{2}$. You have 100 triangles with base sizes linearly growing from 1 to 100 and heights decreasing from 150 to 51. Write code to compute the total area of all the triangles. You need to define a function to compute an area of one triangle, and then use a for loop to compute the total sum.

A note: In Python 2, dividing two integers with `/` produced an integer, e.g. `5 / 2 = 2`. To prevent this, one would enforce floating-point division using e.g. `5 / 2.0 = 2.5`.
In Python 3, `/` always performs floating-point division. To force the result of the division to be an integer, one may use so-called floor division: `5 // 2 = 2`.
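The division behaviour described in the note can be checked directly. This is a minimal sketch for Python 3; the variable names are illustrative.

```python
true_div = 5 / 2       # floating-point division
floor_div = 5 // 2     # floor division
remainder = 5 % 2      # remainder of the division

# Floor division rounds towards negative infinity, not towards zero
neg_floor = -5 // 2

print(true_div, floor_div, remainder, neg_floor)
```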

In [None]:
# TODO

## Part 2. Working with NumPy arrays
The real power of Python is revealed when it comes to packages. Packages can be thought of as libraries and provide code and functions for specific applications. One of the most important and popular packages in Python is **numpy**. It is designed for scientific computing and has excellent support for matrix operations that are essential in Computer Vision. It is also well-documented online (https://numpy.org/doc/stable), with detailed explanations of functions, arguments, concepts etc.

To get started with numpy, you first have to import this package. This is done as follows:

In [None]:
import numpy as np

This will import the numpy package under the name `np`.
Now we can, for example, generate a random number or initialise an array from a list.

In [None]:
x = np.random.rand()
y = np.array([3.2, 5.7, 10.0])

It might be worthwhile to mention one fundamental difference between native Python lists and NumPy arrays right from the start:
 - NumPy arrays have a fixed size at the time of creation. This has impacts on operations like concatenation of multiple arrays. Concatenation will create a new array in memory, which is a relatively slow operation.
 - Lists can grow dynamically. To add an element to a list is typically fast and cheap.
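The difference between the two can be sketched as follows. The example uses `np.append`, which returns a new array rather than modifying the existing one; the variable names are made up for this illustration.

```python
import numpy as np

# A list grows in place
values_list = [1, 2, 3]
values_list.append(4)

# np.append does not change the original array; it returns a new, longer one
values_arr = np.array([1, 2, 3])
longer_arr = np.append(values_arr, 4)

print(values_list)
print(values_arr.shape, longer_arr.shape)
```

A common pattern that follows from this is to collect values in a list first and convert to a NumPy array once at the end with `np.array(values_list)`.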

To define a numpy matrix, the following syntax can be used:


In [None]:
A = np.matrix([[1, 2], [3, 4], [5, 6]])

This will create the following matrix:
$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}$.

If you are interested in the dimensions of the matrix, use the ```shape()``` function. For example,

In [None]:
# Two equivalent ways of finding the shape of matrix A
print(np.shape(A))
print(A.shape)

To multiply a matrix by a vector, use the ```dot()``` method.
You can also use the `@` operator (which is implemented by `np.matmul`).

In [None]:
b = np.array([1, 2])
r = A.dot(b)
print(r)

r = A @ b
print(r)

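Please note that, because `A` is an `np.matrix`, the result `r` is again a two-dimensional matrix rather than a one-dimensional array; print `r.shape` to check its dimensions. To transpose a matrix, use the ```.T``` attribute: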

In [None]:
print(r.T)
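For comparison, here is a sketch of the same product using a regular `np.array` instead of `np.matrix` (NumPy's documentation recommends regular arrays for new code). The names `A_arr`, `r_arr` and `r_col` are chosen for this example only.

```python
import numpy as np

A_arr = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([1, 2])

# With regular arrays, matrix @ vector gives a one-dimensional result
r_arr = A_arr @ b
print(r_arr, r_arr.shape)

# To obtain an explicit column vector, reshape it; .T then gives a row vector
r_col = r_arr.reshape(3, 1)
print(r_col.shape, r_col.T.shape)
```

Note that transposing the one-dimensional `r_arr` itself would have no effect, since it has only a single axis.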

To find the maximal value in an array, you can use the ```max()``` function from numpy:

In [None]:
print(np.max(r))

To find the index of the element that contains the maximal value, use the ```argmax()``` function:

In [None]:
print(np.argmax(r))

Note that the output of the `argmax` function is 2. The largest value is the last of the 3 elements, and because indices in Python are zero-based, the last element of a vector of length 3 has index 2.

**Task 2.1.** You have 50 warehouses that store 3 products with the prices CHF 3, 5, and 1 per product item. The quantity of each product in each warehouse is a random variable uniformly sampled between 1 and 10. Find the warehouse that has the highest value of goods in total. We suggest the following steps to solve the problem:
* Define a random matrix of product quantities. For that, use the numpy function ```np.random.randint(vmin, vmax, (rows, cols))```, where ```vmin``` is the smallest value that can be drawn, ```vmax``` is an exclusive upper bound (the largest value that can be drawn is ```vmax - 1```), and ```rows``` and ```cols``` define the dimensions of the matrix.
* Define the vector of prices.
* Compute the dot product between the matrix and the vector.
* Use ```argmax()``` to compute the index of the warehouse with the highest value of goods.
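One detail of `np.random.randint` is worth checking before solving the task: its upper bound is exclusive. The sketch below draws many samples with bounds 1 and 4 and inspects which values occur; the seed and the sample count are arbitrary choices for this illustration.

```python
import numpy as np

np.random.seed(0)  # fixed seed so that the cell is reproducible

# Draw 1000 integers with lower bound 1 (inclusive) and upper bound 4 (exclusive)
samples = np.random.randint(1, 4, size=1000)

# Only the values 1, 2 and 3 can occur; 4 is never drawn
print(np.unique(samples))
print(samples.min(), samples.max())
```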

In [None]:
# TODO

## Part 3. Working with images
In this part, we will guide you through steps for building a naive object detector. Your goal is to follow the code and understand what is happening in every line. Ask a TA if you do not understand something.

We start by loading the needed packages.

In [None]:
import numpy as np

# cv2 (OpenCV) is a popular computer vision library which provides functions for common
# image processing tasks. We will use it to read images.
import cv2

# matplotlib is a plotting tool for python. We will use it for image visualization
import matplotlib.pyplot as plt

# This is a special command for Jupyter notebook. It forces plots to be refreshed
# when you recompute a cell
%matplotlib inline

Now we are ready to load the picture and plot it! When an image is loaded, it is a NumPy array of shape (height, width, 3). 3 comes from the fact that there are 3 color channels (red, green and blue).

In [None]:
# First read the image. Recall that we have set env_path to point to the path
# of the handout in Google Drive.
clown_fname = env_path / "images/clown.jpg"
assert clown_fname.exists(), f"Image file {clown_fname} does not exist - you may have set env_path wrongly"

clown = cv2.imread(str(clown_fname))

# print shape of the image
print('Shape of image is:')
print(clown.shape)

By default, OpenCV stores the colors in the order (blue, green, red), i.e. BGR format. We will convert it to RGB (red, green, blue) format so that we can visualize it properly in matplotlib. Furthermore, the numbers in the image are integers in the range between 0 and 255. We will divide them by 255 to obtain an array with values in the range [0, 1].

In [None]:
# convert from BGR format to RGB format and map to the [0, 1] range
clown = cv2.cvtColor(clown, cv2.COLOR_BGR2RGB)
clown = clown / 255.0

# plot the image
plt.imshow(clown)
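To see what the conversion does to the underlying array, here is a sketch on a tiny made-up image. For a 3-channel image, swapping BGR to RGB amounts to reversing the last axis, which can also be written with plain NumPy slicing. The array `bgr` is a hypothetical 1x2 image and not part of the handout.

```python
import numpy as np

# A hypothetical 1x2 image in BGR order: one pure blue and one pure red pixel
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the channel axis turns (blue, green, red) into (red, green, blue)
rgb = bgr[:, :, ::-1]

# Dividing by 255.0 converts the integers to floats in the range [0, 1]
rgb_float = rgb / 255.0

print(rgb[0, 0], rgb[0, 1])
print(rgb_float.dtype, rgb_float.min(), rgb_float.max())
```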

You can access a pixel in the image using the indexing operation `[]`.

In [None]:
# Print RGB values at row 125, col 100. Observe that this pixel lies inside
# the red nose of the clown.
print(clown[125, 100, :])

# Print only the red value at the same location
print(clown[125, 100, 0])

A common operation when working with images is reshaping, e.g. to represent an image as a vector. This can be done using the `reshape` function.

In [None]:
# print original image shape
print(clown.shape)

# Reshape the image to a single dimension vector
clown_reshaped = np.reshape(clown, (-1))

print(clown_reshaped.shape)
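Another reshape that comes up often is turning an image into a list of pixels, i.e. an array of shape (H*W, 3). The sketch below uses a small made-up array in place of the clown image; `img`, `pixels` and `restored` are names chosen for this example.

```python
import numpy as np

# A hypothetical image with height 2, width 4 and 3 channels
img = np.arange(2 * 4 * 3).reshape(2, 4, 3)

# -1 lets NumPy infer the size of that axis: here 2 * 4 = 8 pixels
pixels = img.reshape(-1, 3)
print(pixels.shape)

# Pixel (row 1, col 1) ends up at flat position 1 * 4 + 1 = 5
print(pixels[5], img[1, 1])

# Reshaping back recovers the original image
restored = pixels.reshape(2, 4, 3)
```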

Another useful operation is to concatenate, or stack many NumPy arrays into a single array, as shown next. Do check the documentation for these operations.

In [None]:
# Concatenate 3 clown images along the last dimension
clowns_cat = np.concatenate((clown, clown, clown), -1)
print(clowns_cat.shape)

# Stack 3 clown images by creating a new leading dimension
clowns_stack = np.stack((clown, clown, clown), 0)
print(clowns_stack.shape)

We will implement a naive red nose detector. It just detects the red-most pixel in the image. During the course, you will learn more sophisticated ways of solving similar tasks. We present this instructional method to show you the pythonic way of solving image-related problems.

We will introduce another image that is just filled with red color. Then, we will compute the difference between the clown picture and the red one and will take the pixel with the smallest difference as the point where the nose is.

First, let us define an image that is just red. As images are encoded with red, green, and blue channels, we create an array of the same size as the clown picture. We set the red channel to 1, and green and blue to 0 (that happens by default as we initialize the whole array with zeros).

In [None]:
# Obtain numpy array of same shape as clown, filled with all zeros
red = np.zeros(np.shape(clown))

# Setting red channel (that has index 0) to 1
red[:,:,0] = 1
plt.imshow(red)

Now we want to compare pixels between the two images. We will use the standard Euclidean distance for that:

$d_{ij} = \sqrt{\sum_{k=1}^{3}(s_{ijk} - t_{ijk})^2}$

Every element $d_{ij}$ of the distance matrix $D$ is just a Euclidean distance between pixel values for all 3 color channels in two compared images $S$ and $T$.

To solve this equation in a pythonic way, we first simplify it by splitting it into two tasks:

$r = s - t$

$d_{ij} = \sqrt{\sum_{k=1}^{3}r_{ijk}^2}$

This is exactly the same math, but the beauty of this approach is that one can compute both values with very simple numpy operations. To compute $r$, one simply has to subtract the arrays from each other. The definition of $d_{ij}$ is just a definition of Euclidean norm, for which numpy has a function. So the whole computation can be done with 2 lines of code.

In [None]:
r = clown - red

# The np.linalg.norm function computes the norm of a vector. We are giving it a tensor of size (200, 185, 3).
# By default, it will give one number that will be the norm of all items in this tensor. However, if we provide
# the axis argument, the function will only compute norm in the given dimension. If we set axis to 2, we get a
# matrix of size (200, 185), every element of which is a norm of (r, g, b) values of a corresponding pixel.
d = np.linalg.norm(r, axis=2)
# When given a matrix of values, the imshow function color-maps them. With matplotlib's default colormap
# (viridis), the lowest value is drawn in dark purple and the largest in yellow.
plt.imshow(d)
plt.colorbar()
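The effect of the `axis` argument can be verified on an array small enough to check by hand. The following sketch uses a made-up 2x2 image and compares `np.linalg.norm` with the formula written out explicitly; all names are chosen for this example.

```python
import numpy as np

# A hypothetical 2x2 image with 3 channels
s_small = np.array([[[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
                    [[1.0, 1.0, 1.0], [0.0, 3.0, 4.0]]])

# Reference image: red channel 1, green and blue 0
t_small = np.zeros_like(s_small)
t_small[:, :, 0] = 1.0

diff = s_small - t_small

# Norm over the channel axis: one distance per pixel
dist = np.linalg.norm(diff, axis=2)

# The same formula written out: square, sum over channels, square root
dist_manual = np.sqrt((diff ** 2).sum(axis=2))

print(dist)
print(np.allclose(dist, dist_manual))
```

The pixel that is exactly red has distance 0, and the black pixel has distance 1, as expected from the formula.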

Before we continue to find the minimum of this distance map, we would like to make two side notes.

**Sidenote 1: Using NumPy functions instead of manual loops**<br>
Instead of using the NumPy function `np.linalg.norm`, we could also write this function ourselves, by iterating over every pixel and doing the calculation.

In [None]:
def img_distance_fn_without_numpy(img1, img2):
  """
  Given two images compute pixel-wise distance

    d_ij = sqrt( sum_k (img1_ijk - img2_ijk)^2 )

  by looping over pixels

  Parameters
  ----------
    img1: NumPy array of shape (H, W, 3)
      1st image
    img2: NumPy array of shape (H, W, 3)
      2nd image

  Returns
  -------
    dist: NumPy array of shape (H, W)
      A map holding distances d
  """
  H, W, C = img1.shape
  dist = np.zeros((H, W), dtype=float)
  for i in range(H):
    for j in range(W):
      r = img1[i, j] - img2[i, j]
      dist[i, j] = np.sqrt(r[0]**2 + r[1]**2 + r[2]**2)

  return dist

In [None]:
def img_distance_fn_with_numpy(img1, img2):
  """
  Given two images compute pixel-wise distance

    d_ij = sqrt( sum_k (img1_ijk - img2_ijk)^2 )

  using NumPy functions

  Parameters
  ----------
    img1: NumPy array of shape (H, W, 3)
      1st image
    img2: NumPy array of shape (H, W, 3)
      2nd image

  Returns
  -------
    dist: NumPy array of shape (H, W)
      A map holding distances d
  """
  r = img1 - img2
  return np.linalg.norm(r, axis=2)

In [None]:
# Time these two functions
print("Without using NumPy functions")
%timeit img_distance_fn_without_numpy(clown, red)

print("Using NumPy functions")
%timeit img_distance_fn_with_numpy(clown, red)

As you can see, iterating over the image ourselves is much slower, typically by around two orders of magnitude! This is because NumPy functions are implemented in C (a compiled language), while loops in Python (an interpreted language) are very slow.

**Takeaway:** Never ever in this course loop over an image with native Python loops!!!

**Sidenote 2: Broadcasting**<br>
Note how we created an entire image filled with red.
Does this seem like a wasteful use of memory to you?

NumPy provides a very elegant mechanism, called broadcasting, to avoid repeating the same data many times. Broadcasting works for many operations, but we will look at multiplication as a specific example.

Broadcasting allows us to element-wise multiply arrays of different size.
It is best understood with examples.

You can read more on this topic here: https://numpy.org/doc/stable/user/basics.broadcasting.html

In [None]:
# Scalar multiplication - the simplest broadcast
# Here a scalar is multiplied with an array of shape (3,); the scalar is stretched to match every element
arr = np.array([1., 2., 3.])
b = 2.
print(b * arr)

In [None]:
# In general, broadcasting compares shapes starting from the last (rightmost) dimension and works towards the first (leftmost)
# Two dimensions are compatible if:
#  - They are equal, or
#  - One of them is 1
#
# Here we multiply arrays of shape (4, 3, 5) and (1, 3, 1)
# Each axis where the second array has size 1 can be imagined as stretched to match the first array's size
arr1 = np.random.randn(4, 3, 5)
arr2 = np.random.randn(1, 3, 1)

arr3 = arr1 * arr2
print(f"Shape after broadcasting: {arr3.shape}")

In [None]:
# It is even possible to omit leading dimensions - they are then assumed to be of size 1
#
# Here we multiply the shapes
#   (4, 3, 5)
#   (   3, 1)
# This broadcasts in exactly the same way as the cell above
arr1 = np.random.randn(4, 3, 5)
arr2 = np.random.randn(3, 1)

arr3 = arr1 * arr2
print(f"Shape after broadcasting: {arr3.shape}")
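It also helps to see a case where broadcasting fails. In the sketch below, an array of shape (3,) is aligned with the last axis of a (4, 3, 5) array, so 3 is compared with 5 and the shapes are incompatible. Adding a trailing axis of size 1 resolves this. The names `weights`, `failed` and `scaled` are made up for this example.

```python
import numpy as np

arr1 = np.ones((4, 3, 5))
weights = np.array([1.0, 2.0, 3.0])  # shape (3,)

# Shapes are compared from the right: 5 vs 3 is neither equal nor 1
try:
    arr1 * weights
    failed = False
except ValueError as err:
    failed = True
    print("Broadcast failed:", err)

# np.newaxis adds an axis of size 1, giving shape (3, 1),
# which broadcasts against (4, 3, 5) as in the cells above
scaled = arr1 * weights[:, np.newaxis]
print(scaled.shape)
```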

In [None]:
# We can use this to avoid having to create an entire image filled with red
red_small = np.array([1.0, 0.0, 0.0])

# Broadcast
r = clown - red_small
d = np.linalg.norm(r, axis=2)

# Plot the result
plt.imshow(d)
plt.colorbar()

Back to the problem!

Now we can detect the pixel with the lowest value in the matrix (that will be the pixel that corresponds to the red-most pixel in the original image). The ```argmin``` function returns an index of the matrix element that has the lowest value. This index is a number from 0 to $N-1$, where $N$ is the number of elements in the matrix.
In other words, `argmin` corresponds to the index into the array after reshaping it to a one dimensional vector (aka the flattened array).
To convert this index into the flattened array to (row, column) coordinates, use the ```unravel_index``` function.

In [None]:
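# Flat index of the smallest distance, i.e. of the red-most pixel
minind = np.argmin(d)
# We obtain the height and width of the image
clown_hw = np.shape(clown)[0:2]
# unravel_index returns (row, column), i.e. (y, x)
(y, x) = np.unravel_index(minind, clown_hw)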
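The relation between the flat index and the (row, column) coordinates can be checked on a tiny example. The array `dist_small` below is made up for this illustration.

```python
import numpy as np

# A hypothetical 2x3 distance map; the smallest value sits at row 1, column 1
dist_small = np.array([[5.0, 3.0, 9.0],
                       [7.0, 0.5, 4.0]])

# argmin counts elements row by row: position 1 * 3 + 1 = 4
flat_index = np.argmin(dist_small)

# unravel_index converts the flat index back to (row, column)
row, col = np.unravel_index(flat_index, dist_small.shape)

print(flat_index, row, col)
```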

The ```scatter``` function can be used to draw a blue point on the picture, marking the nose pixel that our detector has found.

In [None]:
plt.scatter(x, y)
plt.imshow(clown)

As you can see, the detector we have built is far from perfect: it simply returns the pixel closest to pure red, which here happens to lie on the clown's nose. If there were other red objects in the picture, we could no longer guarantee that the point would be on the nose. During the course you will learn much more principled ways of image processing.

Good luck!

## Addendum: Working with external files (optional)

It is common practice to implement algorithms in external `*.py` files. This way we can simply import and reuse the functions in multiple projects. It also makes it easier to structure and organize our code.

Open the file `external_functions.py` and make some modifications. Observe the effect in the notebook.

**Note:**
See the slides info_colab.pdf on how to open the *.py file in Colab.

In [None]:
# The following are `magic` statements.
# They are specific to IPython (the interactive python shell powering this notebook).
# We don't need to know more about this.
# These two lines in particular make sure that changes made to an external file
# will have an effect in the notebook without reloading the function.
%load_ext autoreload
%autoreload 2

# If this cell creates an error - try to select a different runtime version:
#   Runtime -> Change runtime type -> Runtime version

In [None]:
# Before using an external function, we have to import it
from external_functions import external_function

In [None]:
result = external_function(3, 4)
print("Result:", result)

# Solutions

**Task 1.1**

In [None]:
foreigners = round(8327126 * 0.246)
print(foreigners)

**Task 1.2**

In [None]:
def triangle_area(h, b):
    return h * b / 2.0

total_area = 0
for i in range(0, 100):
    h = 150 - i
    b = i + 1
    area = triangle_area(h, b)
    total_area = total_area + area

print(total_area)

**Task 2.1**

In [None]:
import numpy as np

# The upper bound of randint is exclusive, so 11 is needed to include the value 10
A = np.random.randint(1, 11, (50, 3))
b = np.array([3, 5, 1])
r = A.dot(b)
print(np.max(r))
print(np.argmax(r))