<a href="https://colab.research.google.com/github/kocurvik/edu/blob/master/PNNPPV/notebooky/cv01-en.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **1. lab** - Google Colab, NumPy Basics, Basic Image Manipulation in OpenCV

In this notebook we will cover the basics of working in the Google Colab environment, NumPy and basic image manipulation in OpenCV.

## Google Colab

Google Colab (http://colab.research.google.com) is a free service that allows users to run Jupyter (IPython) notebooks in the cloud. The service also offers a free GPU. A Google account is necessary to use this service. The notebooks can also be downloaded and run locally by starting jupyter notebook in the directory where the notebooks are saved (provided you have Jupyter installed).

There are two types of cells in these notebooks. The first type is a text cell such as this one. The other type is a code cell. A code cell can be run with the play button to the left of the cell. All of the cells can also be run by choosing Runtime -> Run All in the menu at the top of this page. The cells share a scope, so it is necessary to be mindful of the order in which the cells are run. If you need to restart the notebook, look into the Runtime tab in the menu at the top of the page.

You can try running some simple Python code in the following cell.

In [None]:
a = [5,'Hello']
s = '{} World!'
print(s.format(a[1]))

def najlepsia_funkcia(arg):
  return arg + 5

print(najlepsia_funkcia(8))

Files can be stored permanently in different ways, but the easiest option is to connect Google Drive to our notebook instance. This is done in the following way:
(Note: this assumes that you have created a directory named Colab in your Google Drive.)

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')
root_path = 'gdrive/My Drive/Colab/'

You can then access the folder (if it exists).

In [None]:
import os
os.listdir(root_path)

You can also use the standard Unix commands by preceding them with an exclamation point.

In [None]:
!pwd
!ls
!mkdir random_dir
!ls
!wget https://www.google.com/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png
!ls

This is not always necessary as you can access the instance file structure in the panel on the left.

**<----**

## NumPy
A tensor is the basic data structure we will use when dealing with neural networks. A tensor is a generalization of vectors and matrices. Vectors from $\mathbb{R}^n$ are first-order tensors and matrices are second-order tensors. A third-order tensor is indexed with one more index, and so on. The order of a tensor is sometimes also called its rank. Strictly speaking, tensor dimension and order are two different concepts. For example, we can have a third-order tensor with dimensions $20 \times 24 \times 6$. However, many texts use tensor order (rank) and dimension interchangeably. This is sometimes confusing, but the meaning is usually evident from context.

Following the usual convention, we import numpy as np. A vector (i.e. a first-order tensor) can be created from a simple list using the np.array constructor. It can be important to initialize the array with a specific data type, which is done with the dtype keyword in the constructor. **Not specifying the dtype can cause various complications later in the code.**


In [None]:
import numpy as np

a = np.array([3, 2, 3, 4])
print(a)
b = np.array([1,5,7], dtype=np.float32)
print(b)
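As a small illustration of why the dtype matters, the sketch below (the names ints and floats are chosen for this example only) shows that an integer array silently truncates a fractional value assigned to it, while a float array keeps it:

```python
import numpy as np

# Integer dtype is inferred from the list of Python ints.
ints = np.array([3, 2, 3, 4])
# Same values, but stored explicitly as 32-bit floats.
floats = np.array([3, 2, 3, 4], dtype=np.float32)

ints[0] = 2.7    # the fractional part is dropped: stored as 2
floats[0] = 2.7  # stored as (approximately) 2.7

print(ints.dtype, ints)
print(floats.dtype, floats)
```

This kind of silent truncation is one of the complications that an explicit dtype helps to avoid.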

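A matrix can be created from a list of lists, but the inner lists must all have the same length. If they do not, NumPy cannot build a matrix of values: with dtype=object we obtain a vector of lists, and recent NumPy versions raise a ValueError if the dtype is not given.

In [None]:
A = np.array([[1,5,8],[50,60,84]])
print(A)
# Inner lists of different lengths: dtype=object is needed to get a vector of lists
# (recent NumPy versions raise a ValueError without it).
B = np.array([[7,8],[6,7],[0,9,4]], dtype=object)
print(B)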

The array object has a shape attribute, which tells us its shape (dimensions), and a dtype attribute, which holds the data type of the values in the tensor. It also has a method astype, which returns a copy of the array converted to the specified type.

In [None]:
print(a.shape)
print(A.shape)
print(A.dtype)
C = A.astype(np.float32)
print(C.dtype)
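The distinction between tensor order and dimensions maps directly onto array attributes. The following sketch (tensor3, vals and as_int are illustrative names) shows ndim for the order and shape for the sizes, and also that astype returns a new array and truncates when converting floats to integers:

```python
import numpy as np

tensor3 = np.zeros([20, 24, 6])
print(tensor3.ndim)   # order (number of indices): 3
print(tensor3.shape)  # size along each axis: (20, 24, 6)
print(tensor3.size)   # total number of elements: 2880

# astype returns a new array; float -> int conversion truncates toward zero.
vals = np.array([1.9, -1.9, 0.5])
as_int = vals.astype(np.int32)
print(as_int.tolist())  # [1, -1, 0]
print(vals.dtype)       # the original array is unchanged: float64
```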


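NumPy has a few functions to generate basic tensors. The most commonly used ones are np.zeros, np.ones and np.empty. Note that np.empty does not initialize the values, so the array contains arbitrary data until it is filled.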

In [None]:
z = np.zeros([5,10])
print(z)
o = np.ones([3,4,5])
print(o)
e = np.empty([6])
print(e)

It is also possible to create tensors with random numbers by using np.random.random.

In [None]:
r = np.random.random([6,10,3])
print(r)

We can now use these random arrays to work with indices. Indexing is similar to Matlab, but the Python conventions apply (indices start at 0 and negative indices count from the end). We can use multiple indices (based on tensor order). If we omit trailing indices, they are implicitly treated as :, which means that all of the elements along those dimensions are used.

In [None]:
print(r[3,4,1])
print(r[:,:,-1])
print(r[:,:,1].shape)
print(r[0:4,5:6,:])
print(r[0:4,5:6,:].shape)

We can also use steps when indexing; this is called slicing. A step is given after a second colon, so the format is [start:stop:step]. The stop index is exclusive. If a part is left empty, start defaults to 0, stop to the end of the axis (so the last element is included) and step to 1.

In [None]:
p = np.arange(25)
print(p)
print(p[4:16:2])
print(p[2:-4:6])
print(p[:10:])
print(p[::3])
print(p[1::6])
print(p[-6:])
print(r[1::2,0::3,:])
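The slicing defaults are easiest to see on a short array. A minimal sketch (seq is an illustrative name), in particular showing the difference between an empty stop and an explicit -1:

```python
import numpy as np

seq = np.arange(10)           # values 0 .. 9

print(seq[2:8:2].tolist())    # [2, 4, 6]   start=2, stop=8 (exclusive), step=2
print(seq[:3].tolist())       # [0, 1, 2]   start defaults to 0
print(seq[7:].tolist())       # [7, 8, 9]   empty stop: the last element is included
print(seq[7:-1].tolist())     # [7, 8]      explicit -1: the last element is excluded
print(seq[::-1].tolist())     # a negative step reverses the array
```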

We will often need to create a so-called singleton dimension. This can be done by adding a None index or, more explicitly, by using np.newaxis.

In [None]:
print(r[None,:,:,:].shape)
print(r[None].shape)
print(r[np.newaxis,:,:,:].shape)
print(r[:,:,:,np.newaxis].shape)
print(r[np.newaxis,:,0,:].shape)
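A typical use of a singleton dimension is turning a plain vector into a column or a row. The sketch below uses illustrative names (vec, column, row_vec) and also shows np.squeeze, which removes singleton dimensions again:

```python
import numpy as np

vec = np.arange(4)               # shape (4,)
column = vec[:, np.newaxis]      # shape (4, 1), a column
row_vec = vec[np.newaxis, :]     # shape (1, 4), a row
print(vec.shape, column.shape, row_vec.shape)

# np.newaxis is simply an alias for None.
print(np.newaxis is None)        # True

# np.squeeze removes singleton dimensions again.
print(np.squeeze(column).shape)  # (4,)
```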

As in Matlab, broadcasting is done implicitly (though the rules differ from those in Matlab).

In [None]:
r += 10
print(r)
r[0,:,:] = np.random.random([10,3])
print(r.shape)
r[0] = np.random.random([10,3])
print(r.shape)
r[0] = np.zeros([10,1])
print(r.shape)
r/=500
print(r)
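NumPy compares shapes from the last axis backwards; two axes are compatible if their sizes are equal or one of them is 1. A minimal sketch of this rule with illustrative names (tens, units, image, offset):

```python
import numpy as np

tens = np.arange(3).reshape(3, 1) * 10   # shape (3, 1)
units = np.arange(4).reshape(1, 4)       # shape (1, 4)
table = tens + units                     # both are stretched to shape (3, 4)
print(table)

# Subtract a per-channel value from an "image" of shape (height, width, 3).
image = np.ones([2, 5, 3])
offset = np.array([0.1, 0.2, 0.3])       # shape (3,) is matched against the last axis
print((image - offset).shape)            # (2, 5, 3)

# Shapes (2, 5, 3) and (5,) are incompatible: the last axes are 3 and 5.
try:
    image - np.ones(5)
except ValueError as err:
    print('broadcast failed:', err)
```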

An array can be reshaped using np.reshape.

In [None]:
q = np.reshape(r, [6,30])
print(q.shape)
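Two details of np.reshape are worth a short sketch (the names grid, flat and back are illustrative): one dimension may be given as -1 so that NumPy infers it, and the total number of elements must stay the same.

```python
import numpy as np

grid = np.arange(24).reshape(2, 3, 4)
flat = np.reshape(grid, [2, -1])       # -1 lets NumPy infer the size: 24 / 2 = 12
print(flat.shape)                      # (2, 12)

# Elements keep their row-major order, so reshaping back restores the original.
back = flat.reshape(2, 3, 4)
print(np.array_equal(back, grid))      # True

# The total number of elements must match: 24 values do not fit into 5 x 5.
try:
    np.reshape(grid, [5, 5])
except ValueError as err:
    print('reshape failed:', err)
```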

Arrays can be joined. One way is to use np.concatenate([arr1, arr2, ...], axis=i), which joins the arrays arr1, arr2, ... along the given axis i. Note that it is usually better (for code comprehension) to explicitly use the keyword axis even if it is redundant.

In [None]:
a = np.ones([3,4])
b = np.zeros([6,4])
c = np.concatenate([a,b], axis = 0)
print(c)                   

A different option is to use np.stack([arr1, arr2, ...], axis=i). The difference is that this increases the tensor order by one: the arrays are joined along a new axis.

In [None]:
a = np.ones([6,4,3])
b = np.zeros([6,4,3])
d = np.stack([a,b], axis = 0)
print(d.shape)
f = np.stack([a,b], axis = -1)
print(f.shape)
g = np.stack([a,b], axis = 1)
print(g.shape)
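The difference between the two functions can be seen side by side on the same inputs. In this sketch (illustrative names ones_m, zeros_m, joined, stacked, manual) stacking is also reproduced by adding a singleton axis and concatenating:

```python
import numpy as np

ones_m = np.ones([6, 4])
zeros_m = np.zeros([6, 4])

joined = np.concatenate([ones_m, zeros_m], axis=0)   # existing axis grows: (12, 4)
stacked = np.stack([ones_m, zeros_m], axis=0)        # new axis is added: (2, 6, 4)
print(joined.shape, stacked.shape)

# Stacking is equivalent to adding a singleton axis and then concatenating.
manual = np.concatenate([ones_m[np.newaxis], zeros_m[np.newaxis]], axis=0)
print(np.array_equal(stacked, manual))               # True
```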

We can also index using conditions.

In [None]:
r = np.random.random([5,5])
print(r)
r[r < 0.5] = -500
print(r)

Conditions can also be used in np.where(cond, a1, a2), which returns an array that contains elements from a1 where the condition is true and elements from a2 where it is false. We can also use np.arange(i), which is the NumPy equivalent of range(i) from Python and returns an array.

In [None]:
a = np.arange(10)
print(a)
b = np.where(a < 5, a, a**2)
print(b)
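A condition on an array is itself an array of booleans, which can be stored, counted and combined. A minimal sketch with illustrative names (scores, mask, binary, both):

```python
import numpy as np

scores = np.array([0.2, 0.7, 0.4, 0.9])

mask = scores < 0.5
print(mask.tolist())             # [True, False, True, False]
print(scores[mask].tolist())     # [0.2, 0.4]
print(int(mask.sum()))           # number of True values: 2

# np.where also accepts scalars in place of the two arrays.
binary = np.where(scores < 0.5, 0, 1)
print(binary.tolist())           # [0, 1, 0, 1]

# Conditions are combined with & and |; the parentheses are required.
both = (scores > 0.3) & (scores < 0.8)
print(scores[both].tolist())     # [0.7, 0.4]
```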

### Exercise 1
Create a function chessboard(rows, cols) which returns a NumPy array with dimensions rows x cols. The array will contain 1 in spots where a chessboard tile would be white and 0 where a chessboard tile would be black.

The simplest way is to use slicing. If you feel adventurous you can check the NumPy documentation and use conditions with np.mgrid or repetition and np.tile.

In [None]:
def chessboard(rows, cols):
  ...

In [None]:
print(chessboard(8,8))
print(chessboard(5,12))

### NumPy Documentation
We will go through some basic linear algebra and statistics operations with NumPy during the next lab. 

If you are curious you can check out the documentation of NumPy: https://docs.scipy.org/doc/numpy/

## OpenCV

OpenCV is a computer vision library. In other classes related to computer vision and image processing we use Matlab, but many of those methods are also implemented in OpenCV. If you want to use it locally you need to install it, for example via pip with the package name opencv-python.

In [None]:
import cv2

Loading images in OpenCV is simple. We will now read the image we downloaded at the beginning of this lab. We can see that the image is represented as a NumPy array with dimensions height x width x 3. The last dimension represents the three color channels, so we have three matrices, each holding the intensity of one color component. It is important to be mindful of the fact that **OpenCV assumes the images to be stored in the BGR format by default!**

In [None]:
img = cv2.imread('googlelogo_color_272x92dp.png')
print(type(img))
print(img.shape)
print(img.dtype)

Displaying the image is a bit more complicated. Since we are in a notebook, we need a different library; specifically, we will use matplotlib. matplotlib expects the channels in RGB order, so we need to convert the image from BGR to RGB to display it.

The commented part of the code below demonstrates how you can display the image if you are working locally on your machine. cv2.waitKey renders the image and then waits for a time given in milliseconds. If the time is given as 0, the program will halt until a key is pressed.

In [None]:
import matplotlib.pyplot as plt
plt.imshow(img[:,:,::-1])
plt.show()

# cv2.imshow('Image', img)
# cv2.waitKey(0)           # wait until a key is pressed
# cv2.destroyAllWindows()  # close the window afterwards
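The slice [:,:,::-1] used above reverses the last axis, which turns the channel order B, G, R into R, G, B. A minimal sketch on a synthetic two-pixel image (bgr, rgb and rgb_copy are illustrative names); cv2.cvtColor(img, cv2.COLOR_BGR2RGB) performs the same conversion:

```python
import numpy as np

# A 1 x 2 synthetic image in BGR order: one pure blue pixel, one pure red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

rgb = bgr[:, :, ::-1]          # reverse the last axis: B, G, R -> R, G, B
print(rgb[0, 0].tolist())      # blue pixel in RGB order: [0, 0, 255]
print(rgb[0, 1].tolist())      # red pixel in RGB order:  [255, 0, 0]

# The slice is a view on the same data; this makes an independent, contiguous copy.
rgb_copy = np.ascontiguousarray(rgb)
print(rgb_copy.shape)          # (1, 2, 3)
```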

We can also display the individual color channels.

In [None]:
plt.imshow(img[:,:,0],cmap='gray')
plt.show()
plt.imshow(img[:,:,1],cmap='gray')
plt.show()
plt.imshow(img[:,:,2],cmap='gray')
plt.show()

By default, images are loaded as uint8, i.e. as integers in the range from 0 to 255. Sometimes we want to convert them to floats. In that case the general assumption is that the intensity values are in the range \[0, 1\].

In [None]:
img_f = img / 255
print(img_f.dtype)
plt.imshow(img_f[:,:,::-1])
plt.show()
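Dividing a uint8 array by 255 produces float64 values. If float32 is preferred, the array can be cast first. The sketch below (pixels, as_float and restored are illustrative names) also shows one way to convert back to uint8:

```python
import numpy as np

pixels = np.array([[0, 128, 255]], dtype=np.uint8)

print((pixels / 255).dtype)                    # float64

as_float = pixels.astype(np.float32) / 255     # float32 values in [0, 1]
print(as_float.dtype, as_float.min(), as_float.max())

# Back to uint8: rescale, round, then cast.
restored = np.round(as_float * 255).astype(np.uint8)
print(np.array_equal(restored, pixels))        # True
```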

We can easily manipulate images. For instance, we can increase the intensity in one of the channels (this causes a uint8 overflow in the brighter parts of the image: values above 255 wrap around).

In [None]:
img[:,:,0] += 20
plt.imshow(img[:,:,::-1])
plt.show()
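The wrap-around can be reproduced on two values, together with one possible way to avoid it. This is only a sketch with illustrative names (channel, wrapped, saturated): the computation is done in a wider integer type and clipped back to the valid range.

```python
import numpy as np

channel = np.array([100, 250], dtype=np.uint8)

wrapped = channel.copy()
wrapped += 20                      # 250 + 20 = 270 wraps around to 270 - 256 = 14
print(wrapped.tolist())            # [120, 14]

# Saturating alternative: compute in a wider type, clip to [0, 255], cast back.
saturated = np.clip(channel.astype(np.int16) + 20, 0, 255).astype(np.uint8)
print(saturated.tolist())          # [120, 255]
```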

It is often useful to convert the image to grayscale. We can do this with cv2.cvtColor.

In [None]:
!wget 'https://upload.wikimedia.org/wikipedia/commons/0/0a/Malachite_kingfisher_(Corythornis_cristatus_stuartkeithi).jpg'
!mv 'Malachite_kingfisher_(Corythornis_cristatus_stuartkeithi).jpg' bird.jpg
!ls -l

In [None]:
img_b = cv2.imread('bird.jpg')
plt.imshow(img_b[:,:,::-1])
plt.show()
img_g = cv2.cvtColor(img_b, cv2.COLOR_BGR2GRAY)
plt.imshow(img_g, cmap='gray')
plt.show()
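The grayscale conversion is a weighted sum of the channels, Y = 0.299 R + 0.587 G + 0.114 B. The function below is a hypothetical NumPy sketch of that formula (bgr_to_gray is not an OpenCV function), and its rounding may differ from cv2.cvtColor by one intensity level:

```python
import numpy as np

def bgr_to_gray(bgr):
    """Weighted channel sum for an array of shape (height, width, 3) in BGR order."""
    weights = np.array([0.114, 0.587, 0.299])   # weights for B, G, R
    gray = bgr.astype(np.float64) @ weights      # contracts the channel axis
    return np.round(gray).astype(np.uint8)

# One white, one pure green and one black pixel.
sample = np.array([[[255, 255, 255], [0, 255, 0], [0, 0, 0]]], dtype=np.uint8)
print(bgr_to_gray(sample).tolist())   # [[255, 150, 0]]
```

Green contributes the most to the perceived brightness, which is why the pure green pixel ends up brighter than mid-gray.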

### Exercise 2
Try to crop the head from the image of the bird and display it.

In [None]:
# code for exercise 2

### Exercise 3

Download an image from the internet using wget or Google Drive. Convert it to grayscale and apply gamma correction to it in the form:

$$ O = I^{\frac{1}{G}}, $$ where $I$ is the original image intensity for a given pixel, $O$ is the new intensity for a pixel and $G$ is the gamma factor. In order for this to make sense you have to convert the image from uint8 to floats. Display the image after performing the gamma correction. Implement the gamma correction as a function of two parameters.


In [None]:
# code for exercise 3

It is often important to enlarge or shrink images. Image processing courses cover various interpolation methods, but the default setting is sufficient for us. An image is resized by calling cv2.resize(img, (width, height)). Note that the size is given as (width, height), whereas the shape of the resulting array is (height, width, 3).

In [None]:
img_r = cv2.resize(img_b, (400,200))
print(img_r.shape)
plt.imshow(img_r[:,:,::-1])
plt.show()

### Exercise 4
One of the most common preprocessing steps in deep learning is to prepare a batch of images to feed to the neural network. To do that, the images have to be resized to the same size and stacked so that they form a tensor with shape (n_img, height, width, 3), where n_img is the number of images. The following code downloads a zip archive with images; extract it (for example with !unzip) before loading the images. Write a function which creates the desired tensor from these images. The arguments of this function will be the name of the folder where the images are stored and the width and height of the resized images.

In [None]:
!wget https://github.com/kocurvik/edu/raw/master/PNNPPV/supplementary/cv01_images.zip

In [None]:
def load_img_tensor(path, width, height):
  ...

The following code should work and display the images in a sequence.

In [None]:
t = load_img_tensor('imgs', 227, 227)
for i in range(t.shape[0]):
  plt.imshow(t[i,:,:,::-1])
  plt.show()

### OpenCV documentation
When looking for OpenCV documentation you may run into versions older than the one you are actually using. The documentation for all versions is here: https://docs.opencv.org/

You can check the version of the library easily:


In [None]:
print(cv2.__version__)