In [1]:
# This cell changes parameters of the RISE slideshow,
# such as the window width/height and whether a scroll bar is enabled
from notebook.services.config import ConfigManager
cm = ConfigManager()
cm.update('livereveal', {
              'width': 1000,
              'height': 700,
              'scroll': True,
})

{'width': 1000, 'height': 700, 'scroll': True}

# Topic 3 Lecture - Images as Data

## Aims of the Session

* Justify the use of images as the main data input for (most of) the remainder of the course

* Learn the basics of image pre-processing techniques to better organise and classify data

## Resources for the Lecture

* Introduction to Computing and Programming in Python: A Multimedia Approach. Mark Guzdial, Barbara Ericson. Pearson, 2016.
* https://people.eecs.berkeley.edu/~fateman/kathey/skew.html
* https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_morphological_ops/py_morphological_ops.html

## Why images as data?

* Images (and multimedia in general) are easier to understand!

* They are a visual representation of the features of the data input

* They are (mostly) application-independent

* They will lead us to Convolutional Neural Networks!

* It is a novel educational approach.

![Fig. 1. A book about multimedia in Python](https://www.dropbox.com/s/wjbajs4a7yzd266/book.jpg?raw=1)

## How does an image look *digitally*?

* These are the main **compression algorithms** used to store images:

![Fig. 2. Image extensions](https://www.dropbox.com/s/4an7wf2na3fvgvf/comp.jpg?raw=1)

### Compression Algorithms

* The *art* of compression algorithms is in **quantisation**!

* The best algorithms are those that achieve the best visual quality at a reduced file size.

![Fig. 3. Quantisation](https://www.dropbox.com/s/714a6l2o7ln6ukw/quant.jpg?raw=1)
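As a rough illustration of quantisation, the sketch below maps the 256 possible gray values onto a smaller number of evenly spaced levels. The function name `quantise` and the choice of uniform bins are assumptions made for this example; real compression algorithms are considerably more elaborate.

```python
import numpy as np

def quantise(image, levels):
    """Map 8-bit gray values onto `levels` evenly spaced values.

    Hypothetical helper: assumes `levels` divides 256 exactly.
    """
    step = 256 // levels                      # width of each bin
    wide = image.astype(np.int32)             # avoid 8-bit overflow in the arithmetic
    midpoints = (wide // step) * step + step // 2
    return midpoints.astype(np.uint8)

# A gradient containing every gray value once, reduced to 4 levels
gradient = np.arange(256, dtype=np.uint8)
coarse = quantise(gradient, 4)
print(np.unique(coarse))
```

Fewer levels mean fewer distinct values to store, at the cost of visible banding in the image.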

The importance of compression

In [1]:
import warnings
warnings.filterwarnings('ignore')
from IPython.display import HTML
HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/NMkZpuiEqh8" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>')

* Fortunately, this is **NOT** our problem in this module!

* We are going to work with images in *simpler* ways

## Images as arrays/matrices

* Using the `numpy` module

* Complemented by the `OpenCV` module, which will let us import and manipulate images

* When we import an image, the first thing we will get is a **bitmap**

![Fig. 4. Bitmap](https://www.dropbox.com/s/zinj0mv5uzu9eb1/pix.jpg?raw=1)

* Each pixel will be represented as a value within an $n \times m$ matrix
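A minimal sketch of this idea, using a small hand-made bitmap rather than a real image file (the values are invented for illustration):

```python
import numpy as np

# A hypothetical 3 x 4 bitmap: n = 3 rows, m = 4 columns, one value per pixel
bitmap = np.array([
    [  0,  64, 128, 255],
    [ 32,  96, 160, 224],
    [255, 128,  64,   0],
], dtype=np.uint8)

rows, cols = bitmap.shape     # size of the matrix
pixel = bitmap[1, 2]          # pixel at row 1, column 2 (indices start at 0)
print(rows, cols, pixel)
```

Note that the row index comes first and the column index second, as in any `numpy` matrix.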

### Grayscale Images

* A 2D grid of pixels

* Two ways to represent them:

1. Standard: from 0 (black) to 255 (white) with 254 gray values in between.

![Fig. 5 a. Standard grayscale image](https://www.dropbox.com/s/mk70ili2yyb9con/graystan.jpg?raw=1)

2. Normalised: from 0 (black) to 1 (white) with "infinite" gray values in between.

![Fig. 5 b. Normalised grayscale image](https://www.dropbox.com/s/qrt5j974q2adu9m/graynorm.jpg?raw=1)

**IS IT POSSIBLE TO CONVERT BETWEEN STANDARD $\leftrightarrow$ NORMALISED?**
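One possible way to move between the two representations is to divide by 255 and to multiply by 255 again. The sketch below assumes 8-bit input and uses made-up pixel values:

```python
import numpy as np

# Hypothetical 2 x 2 grayscale image in the standard 0-255 representation
standard = np.array([[0, 51], [204, 255]], dtype=np.uint8)

# Standard -> normalised: convert to floating point, then scale into [0, 1]
normalised = standard.astype(np.float64) / 255.0

# Normalised -> standard: scale back, round to the nearest integer, store as 8-bit
restored = np.round(normalised * 255).astype(np.uint8)

print(normalised)
print(restored)
```

Going from normalised to standard rounds every value to one of 256 levels, so a normalised image that did not originate from 8-bit data may lose some detail on the way back.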

### Colour Images

![Fig. 6. "Pixelated" colour image](https://www.dropbox.com/s/4rbrymh3dobhyi9/smile.jpg?raw=1)

* Each pixel has three **channels**: $\color{red}{red}$, $\color{green}{green}$ and $\color{blue}{blue}$

* Images with colour are often called $\color{red}{R}$$\color{green}{G}$$\color{blue}{B}$ images

#### Option 1

* If a colour image is imported, a matrix will be produced, this time with three values per pixel instead of one

* The three values will be stored in a tuple

![Fig. 7. RGB image, option 1](https://www.dropbox.com/s/ohrcjtln8mvpa49/rgb.jpg?raw=1)

#### Option 2

* When importing a colour image in `OpenCV`, a 3D array will be produced, with the third dimension representing the three channels

![Fig. 8. RGB image, option 2](https://www.dropbox.com/s/5eytedi8kqb01er/rgb2.jpg?raw=1)

* Advantage of option 2: calculations and transformations are faster, since they can be applied to the whole array at once
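The two options can be compared with a tiny hand-made image. The pixel values below are invented, and the channels are written in R, G, B order for readability; be aware that OpenCV itself stores the channels in blue, green, red order when it loads an image.

```python
import numpy as np

# Option 1: a nested list with one (R, G, B) tuple per pixel
pixels = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

# Option 2: the same image as a 3D array of shape (rows, columns, channels)
image = np.array(pixels, dtype=np.uint8)

red_channel = image[:, :, 0]   # 2D matrix holding only the first channel
inverted = 255 - image         # one operation applied to every pixel and channel

print(image.shape)
print(inverted[0, 0])
```

With option 1, the same inversion would need explicit loops over rows, columns and tuple entries.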

**HOW MANY COLOURS CAN BE REPRESENTED USING THIS STANDARD?**

**CAN RGB BE NORMALISED?**

**ARE THERE ANY OTHER STANDARDS THAT CAN REPRESENT MORE COLOURS?**
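As a starting point for these questions, the number of representable colours follows from the number of bits per channel. The helper `colour_count` is a hypothetical name used only for this sketch:

```python
def colour_count(bits_per_channel, channels=3):
    """Number of distinct colours when each channel has 2**bits_per_channel levels."""
    levels = 2 ** bits_per_channel
    return levels ** channels

print(colour_count(8))    # 8 bits per channel, as in the 0-255 standard
print(colour_count(16))   # a deeper, 16-bit-per-channel representation
```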

## Image Pre-processing

* Before actually working with images as data, there are plenty of techniques that can be applied to improve quality

* Not all images are perfect, especially in document image analysis

![Fig. 9. An engineering drawing with poor resolution](https://www.dropbox.com/s/x7320umcr44sl1f/ed.jpg?raw=1)

* As this is not a computer vision course, we will learn and apply only **ONE** pre-processing technique:

### THRESHOLDING/BINARISATION

* Converting a grayscale image into a binary (black/white) image based on a threshold

* Useful to improve the quality of an image and to refine shapes

![Fig. 10. An example of binarisation](https://www.dropbox.com/s/nqkcytkcely4n3t/bin.jpg?raw=1)
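A minimal sketch of thresholding with `numpy` alone. The function name `binarise` and the default threshold of 127 are assumptions for this example; `OpenCV` offers a comparable operation through `cv2.threshold` with the `cv2.THRESH_BINARY` flag.

```python
import numpy as np

def binarise(gray, threshold=127):
    """Return a binary image: 255 (white) where gray > threshold, otherwise 0 (black)."""
    return np.where(gray > threshold, 255, 0).astype(np.uint8)

# Hypothetical 2 x 2 grayscale image
gray = np.array([[10, 127], [128, 250]], dtype=np.uint8)
binary = binarise(gray)
print(binary)
```

The choice of threshold matters: a value that is too low turns background noise white, while a value that is too high erases faint strokes.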

### Other pre-processing techniques

#### Skew Correction

![Fig. 11. Skew correction](https://www.dropbox.com/s/ckge1t0aqz1j3j0/skew.jpg?raw=1)

#### Erosion and Dilation

![Fig. 12. Erosion and dilation](https://www.dropbox.com/s/j6ifsppdurwimzh/erodil.jpg?raw=1)

#### Opening

![Fig. 13. Image opening](https://www.dropbox.com/s/rqafolr87eyeot3/open.jpg?raw=1)

#### Closing

![Fig. 14. Image closing](https://www.dropbox.com/s/1tx3qriiqsnr78j/close.jpg?raw=1)
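The four morphological operations above can be sketched on a binary image with `numpy` alone. This is an illustrative implementation, not the one used by `OpenCV`: it assumes a fixed 3 x 3 square neighbourhood, treats everything outside the image as background, and all function names are hypothetical.

```python
import numpy as np

def _neighbourhoods(binary):
    """Stack the nine shifted copies that make up each pixel's 3 x 3 neighbourhood."""
    padded = np.pad(binary, 1, constant_values=False)   # outside counts as background
    rows, cols = binary.shape
    return np.stack([padded[r:r + rows, c:c + cols]
                     for r in range(3) for c in range(3)])

def erode(binary):
    """Keep a pixel only if its whole 3 x 3 neighbourhood is foreground."""
    return _neighbourhoods(binary).all(axis=0)

def dilate(binary):
    """Set a pixel if any pixel in its 3 x 3 neighbourhood is foreground."""
    return _neighbourhoods(binary).any(axis=0)

def opening(binary):
    """Erosion followed by dilation: removes small specks."""
    return dilate(erode(binary))

def closing(binary):
    """Dilation followed by erosion: fills small holes."""
    return erode(dilate(binary))

# A 3 x 3 square, used as the clean reference shape
square = np.zeros((7, 7), dtype=bool)
square[2:5, 2:5] = True

noisy = square.copy()
noisy[0, 6] = True        # isolated speck, removed by opening

holed = square.copy()
holed[3, 3] = False       # one-pixel hole, filled by closing

print(np.array_equal(opening(noisy), square))
print(np.array_equal(closing(holed), square))
```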

# Recognition vs Classification: What's the difference?

In [2]:
import warnings
warnings.filterwarnings('ignore')
from IPython.display import HTML
HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/R9OHn5ZF4Uo" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>')

# LAB 3: IMPORTING AND MANIPULATING IMAGES IN PYTHON