TKO_3120 Machine Learning and Pattern Recognition

Image recognition exercise

Your name <br>
Your e-mail

February 2022

---


This is the template for the image recognition exercise. <br>
Some **general instructions**:
 - write a clear *report* that a non-specialist reader can follow: briefly define the concepts and explain each phase of your work
    - use the Markdown feature of the notebook for longer explanations
 - return your output as a working Jupyter notebook
 - name your file MLPR22_exercise_your_surname.ipynb
 - write easily readable code with comments
     - if you use code from the web, provide a reference
     - avoid redundant code! Take only the relevant parts and modify them for your purposes so that the code produces only what you need
 - it is OK to discuss the assignment with a friend, but it is not OK to copy someone else's work. Everyone must submit their own implementation

**Deadline 21st of March at 23:59**
- No extensions are granted unless you have a very well justified reason. In such a case, ask for an extension well in advance!
- Start now, do not leave it to the last minute. This exercise takes some work!
- If you encounter problems, Google first, and if you can’t find an answer, ask for help
    - pekavir@utu.fi

**Grading**

The exercise counts toward the course grade. The course exam has 5 questions worth 6 points each. The exercise gives up to 4 points, so the maximum total score is 34 points.

From the template below, you can see how many exercise points can be acquired from each task. Exam points are given according to the table below: <br>
<br>
7-8 exercise points: 1 point <br>
9-10 exercise points: 2 points <br>
11-12 exercise points: 3 points <br>
13-14 exercise points: 4 points <br>
<br>
To pass the exercise, you need at least 7 exercise points, distributed somewhat evenly across the tasks (you can't just implement Introduction, Data preparation and Feature extraction and leave the rest undone!) <br>

## Introduction

Write an introductory chapter for your report **(1 p)**
<br>E.g.
- What is the purpose of this task?
- What kind of data were used? Where did they originate?
- Which methods did you use?

Images: https://unsplash.com/

## Data preparation

In [15]:
# gather all packages needed here
import numpy as np
from itertools import groupby
import matplotlib.pyplot as plt
from skimage import io
from skimage.transform import resize
from skimage.color import rgb2gray

In [2]:
urls_trees = np.loadtxt('data/trees.txt', dtype='U150')
urls_pebbles = np.loadtxt('data/pebbles.txt', dtype='U150')
urls_sky = np.loadtxt('data/sky.txt', dtype='U150')

In [3]:
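def load_and_flip_images(data, id):
    """Load each image from its URL and add a horizontally flipped copy.

    The original and its flipped copy share the same image id.
    """
    img_data = []
    img_id = []
    for i, img in enumerate(data):
        temp_img = io.imread(img)
        img_data.append(temp_img)
        img_id.append(id + str(i))
        # np.flip does not modify its input, so keep the returned flipped array
        flipped_img = np.flip(temp_img, 1)
        img_data.append(flipped_img)
        img_id.append(id + str(i))
    print(id + ' done')
    return img_data, img_id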

In [4]:
trees_data, trees_id = load_and_flip_images(urls_trees, 'trees')
pebbles_data, pebbles_id = load_and_flip_images(urls_pebbles, 'pebbles')
sky_data, sky_id = load_and_flip_images(urls_sky, 'sky')

trees done
pebbles done
sky done


In [5]:
def extract_dimension(data):
    """Return the row counts (heights) and column counts (widths) of the images."""
    rows = []
    columns = []
    for img in data:
        rows.append(img.shape[0])
        columns.append(img.shape[1])
    return rows, columns

In [6]:
trees_x, trees_y = extract_dimension(trees_data)
pebbles_x, pebbles_y = extract_dimension(pebbles_data)
sky_x, sky_y = extract_dimension(sky_data)

x_dimensions = trees_x + pebbles_x + sky_x
y_dimensions = trees_y + pebbles_y + sky_y

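# round the mean dimensions to whole pixels so they can serve as the resize target
mean_x = int(round(np.mean(x_dimensions)))
mean_y = int(round(np.mean(y_dimensions)))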

In [7]:
def resize_images(data):
    resized_images = []
    for img in data:
        resized_images.append(resize(img, (mean_x, mean_y)))
    print('done')
    return resized_images

In [8]:
resized_trees = resize_images(trees_data)
resized_pebbles = resize_images(pebbles_data)
resized_sky = resize_images(sky_data)

done
done
done


In [16]:
def grayscale_images(data):
    grayscaled = []
    for img in data:
        grayscaled.append(rgb2gray(img))
    print('done')
    return grayscaled

In [17]:
grayscaled_resized_trees = grayscale_images(resized_trees)
grayscaled_resized_pebbles = grayscale_images(resized_pebbles)
grayscaled_resized_sky = grayscale_images(resized_sky)


done
done
done


Prepare the data **(3 p)**
- import all the packages needed for this notebook in one cell
- read the URLs from the text files and import the images
- crop and/or resize the images to the same size
- for GLCM and GLRLM, convert the images to grayscale and reduce the quantization to 8 levels
- augment the data: flip each image horizontally to increase the number of examples
## Feature extraction

### First order texture measures (6 features)

- Calculate the following color features for each image **(1 p)**
    - Mean for each RGB color channel
    - Variance for each RGB color channel
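One way to compute these six values is sketched below, assuming each image is an array of shape height x width x channels with the color channels last. The function name `color_features` is hypothetical.

```python
import numpy as np

def color_features(rgb_image):
    """Return [mean_R, mean_G, mean_B, var_R, var_G, var_B] for one image."""
    # Keep only the first three channels in case an alpha channel is present
    rgb = np.asarray(rgb_image, dtype=float)[..., :3]
    means = rgb.mean(axis=(0, 1))     # one mean per color channel
    variances = rgb.var(axis=(0, 1))  # population variance per color channel
    return np.concatenate([means, variances])

# Made-up 2 x 2 image: R alternates between 0 and 2, G is always 1, B is always 2
tiny = np.array([[[0, 1, 2], [2, 1, 2]],
                 [[0, 1, 2], [2, 1, 2]]])
print(color_features(tiny))
```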

### Second order texture measures (10 features)

- Calculate the following features for each image in the prepared data set:
- Gray-Level Co-occurrence Matrix (GLCM) features (4 features) **(2 p)**
    - calculate the "correlation" feature from the GLCM
        - in horizontal and vertical directions for two reference pixel distances (you can choose the distances)
    - explain your choice for the distances
- Gray-Level-Run-Length (GLRL) features (6 features) **(2 p)**
    - Calculate the following three features in horizontal and vertical direction
        - Use the given function for Gray-Level-Run-Length (GLRL) matrix
        - Implement the following run-length features using the GLRL matrix
            - Short-Run emphasis
            - Long-run emphasis
            - Run percentage
        - Test your implementation with the given toy image
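To make the GLCM "correlation" feature concrete, here is a minimal NumPy sketch that builds a symmetric, normalized co-occurrence matrix for one offset and computes the correlation from it. The function name `glcm_correlation` and its arguments are hypothetical, and returning 1.0 for a constant image is a convention chosen here. scikit-image also provides `graycomatrix` and `graycoprops` in `skimage.feature` (named `greycomatrix` and `greycoprops` in older releases).

```python
import numpy as np

def glcm_correlation(image, levels, distance, direction):
    """Correlation of the symmetric, normalized GLCM for one pixel offset.

    image: 2D integer array with values 0..levels-1
    direction: 'horizontal' (neighbor `distance` columns to the right)
               or 'vertical' (neighbor `distance` rows below)
    """
    image = np.asarray(image)
    if direction == 'horizontal':
        ref, nbr = image[:, :-distance], image[:, distance:]
    elif direction == 'vertical':
        ref, nbr = image[:-distance, :], image[distance:, :]
    else:
        raise ValueError("direction must be 'horizontal' or 'vertical'")

    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (ref.ravel(), nbr.ravel()), 1)  # count co-occurring pairs
    glcm = glcm + glcm.T                             # make the matrix symmetric
    p = glcm / glcm.sum()                            # normalize to probabilities

    i, j = np.indices((levels, levels))
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt(((i - mu_i) ** 2 * p).sum())
    sd_j = np.sqrt(((j - mu_j) ** 2 * p).sum())
    if sd_i == 0 or sd_j == 0:
        return 1.0  # constant image: convention assumed in this sketch
    return ((i - mu_i) * (j - mu_j) * p).sum() / (sd_i * sd_j)

# Made-up striped image: columns alternate between levels 0 and 1
stripes = np.array([[0, 1, 0, 1], [0, 1, 0, 1]])
print(glcm_correlation(stripes, 2, 1, 'horizontal'))
print(glcm_correlation(stripes, 2, 2, 'horizontal'))
```

In the striped example, neighbors one column apart always differ and neighbors two columns apart always match, which illustrates how the chosen distance interacts with the scale of the texture.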

Gather your features into an input array X, and the image classes into an output array y. Assign an image id for each image so that the original and flipped image have the same id. Standardize the feature values in X.
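The arrays described above could be laid out as in the following sketch. The toy values are made up purely for illustration, and `standardize` is a hypothetical helper that applies z-score scaling per feature column.

```python
import numpy as np

def standardize(X):
    """Scale each feature column to zero mean and unit variance (z-score)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # keep constant columns at zero instead of dividing by 0
    return (X - mean) / std

# Hypothetical toy data: 2 original images, each followed by its flipped copy
X = np.array([[1.0, 10.0], [1.0, 10.0], [3.0, 30.0], [3.0, 30.0]])
y = np.array(['trees', 'trees', 'sky', 'sky'])
groups = np.array(['trees0', 'trees0', 'sky0', 'sky0'])  # shared id per pair

X_std = standardize(X)
print(X_std)
```

Keeping `groups` aligned row by row with `X` and `y` is what later allows a group-aware cross-validator to hold out an image together with its flipped copy.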

In [9]:
# Grey-Level-Run-Length-Matrix

def glrlm(image, levels, angle):
    """Return the gray-level run-length matrix of an integer-valued image.

    Rows index gray levels; column j counts runs of length j + 1.
    """
    if angle == 0:  # horizontal runs: scan each row
        lines = image
    elif angle == 90:  # vertical runs: scan each column
        lines = image.T
    else:
        raise ValueError('angle must be 0 or 90')

    max_run = lines.shape[1]  # longest possible run along the scan direction
    glrl_matrix = np.zeros([levels, max_run])
    for line in lines:
        # groupby splits the line into runs of equal consecutive values
        for level, run in groupby(line):
            glrl_matrix[level, len(list(run)) - 1] += 1

    return glrl_matrix

In [10]:
# G_m = gray-level-run-length-matrix
# Np = the number of pixels in the image
def emphasis(G_m, Np):
    
    ...
    
    return(SRE, LRE, RP)
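One possible way to compute the three run-length features is sketched below, assuming the common definitions: short-run emphasis as the average of 1/j² over all runs, long-run emphasis as the average of j² over all runs, and run percentage as the number of runs divided by the number of pixels, where j is the run length. The name `run_length_features` is hypothetical and chosen so that it does not overwrite the `emphasis` stub.

```python
import numpy as np

def run_length_features(G_m, Np):
    """Return (SRE, LRE, RP) for a gray-level run-length matrix.

    G_m[i, j] counts runs of gray level i with length j + 1.
    Np is the number of pixels in the image.
    """
    G_m = np.asarray(G_m, dtype=float)
    run_lengths = np.arange(1, G_m.shape[1] + 1)  # column j <-> run length j + 1
    runs_per_length = G_m.sum(axis=0)             # collapse over gray levels
    Nr = runs_per_length.sum()                    # total number of runs

    SRE = (runs_per_length / run_lengths ** 2).sum() / Nr  # favors short runs
    LRE = (runs_per_length * run_lengths ** 2).sum() / Nr  # favors long runs
    RP = Nr / Np                                           # runs per pixel
    return SRE, LRE, RP

# 90-degree GLRL matrix of the 3 x 4 toy image used in this notebook
toy_matrix_90 = np.array([[1, 1, 0], [5, 0, 0], [4, 0, 0]])
SRE, LRE, RP = run_length_features(toy_matrix_90, 12)
print(np.round(SRE, 3), np.round(LRE, 3), np.round(RP, 3))  # 0.932 1.273 0.917
```

Under these definitions the sketch reproduces the 90-degree reference values printed for toy example 1 in this notebook.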

In [11]:
# test the glrlm function with a toy example 1
toy_image=np.array([[1,1,1,2],[2,0,0,1],[1,0,2,2]])

toy_GLRLM_0=glrlm(toy_image,3, 0)
toy_GLRLM_90=glrlm(toy_image,3, 90)

print('GLRL matrix for 0 degrees:')
print(toy_GLRLM_0)
print('GLRL matrix for 90 degrees:')
print(toy_GLRLM_90)

GLRL matrix for 0 degrees:
[[1. 1. 0. 0.]
 [2. 0. 1. 0.]
 [2. 1. 0. 0.]]
GLRL matrix for 90 degrees:
[[1. 1. 0.]
 [5. 0. 0.]
 [4. 0. 0.]]


In [12]:
# test your emphasis function in 0 direction with toy example 1
toy_SRE_0, toy_LRE_0, toy_RP_0=emphasis(toy_GLRLM_0, 12)
print('SRE:', np.round(toy_SRE_0, 3))
print('LRE:', np.round(toy_LRE_0, 3))
print('RP:', np.round(toy_RP_0, 3))

NameError: name 'SRE' is not defined

In [None]:
# test the emphasis function in 90 direction with toy example 1
toy_SRE_90, toy_LRE_90, toy_RP_90=emphasis(toy_GLRLM_90, 12)
print('SRE:', np.round(toy_SRE_90, 3))
print('LRE:', np.round(toy_LRE_90, 3))
print('RP:', np.round(toy_RP_90, 3))

SRE: 0.932
LRE: 1.273
RP: 0.917


In [None]:
# test the glrlm function with a toy example 2
toy_image=np.array([[1,1,1,2],[2,0,0,1],[1,0,2,2],[0,0,0,0]])

toy_GLRLM_0=glrlm(toy_image,3, 0)
toy_GLRLM_90=glrlm(toy_image,3, 90)

print('GLRL matrix for 0 degrees:')
print(toy_GLRLM_0)
print('GLRL matrix for 90 degrees:')
print(toy_GLRLM_90)

GLRL matrix for 0 degrees:
[[1. 1. 0. 1.]
 [2. 0. 1. 0.]
 [2. 1. 0. 0.]]
GLRL matrix for 90 degrees:
[[4. 0. 1. 0.]
 [5. 0. 0. 0.]
 [4. 0. 0. 0.]]


In [None]:
# test your emphasis function in 0 direction with toy example 2
toy_SRE_0, toy_LRE_0, toy_RP_0=emphasis(toy_GLRLM_0, 16)
print('SRE:', np.round(toy_SRE_0, 3))
print('LRE:', np.round(toy_LRE_0, 3))
print('RP:', np.round(toy_RP_0, 3))

SRE: 0.63
LRE: 4.222
RP: 0.562


In [None]:
# test the emphasis function in 90 direction with toy example 2
toy_SRE_90, toy_LRE_90, toy_RP_90=emphasis(toy_GLRLM_90, 16)
print('SRE:', np.round(toy_SRE_90, 3))
print('LRE:', np.round(toy_LRE_90, 3))
print('RP:', np.round(toy_RP_90, 3))

SRE: 0.937
LRE: 1.571
RP: 0.875


## Feature relationships

Illustrate the feature relationships and discuss the results **(1 p)**
- Pairplot 
    - Which feature pairs show a roughly linear dependence?
- PCA
    - Can you see any clusters in PCA?
    - Does this figure give you any clues about how well you will be able to classify the image types? Explain.
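For the PCA part, the projection itself can be sketched with plain NumPy as below, assuming `X` is the standardized feature array. The helper `pca_project` is hypothetical; `sklearn.decomposition.PCA` could be used instead.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project rows of X onto the first principal components.

    Returns the projected data and the explained variance ratio per component.
    """
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    scores = X_centered @ Vt[:n_components].T
    explained = S ** 2 / np.sum(S ** 2)
    return scores, explained[:n_components]

# Made-up data lying exactly on a line, so one component explains everything
line_data = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
scores, explained = pca_project(line_data)
print(scores)
print(explained)
```

Plotting the first column of `scores` against the second, colored by image class, gives the figure in which clusters can be looked for.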


## Build the classifiers and estimate their performance

Build the classifiers and estimate their performance. Use the LeaveOneGroupOut or GroupKFold cross-validator (image id as the group indicator).
- k Nearest Neighbors classifier **(1 p)** 
    - optimize the hyperparameter (k) and select the best model for the classifier
    - estimate the performance of the model with nested cross validation
    - calculate the accuracy and the confusion matrix
- Regularized linear model with Ridge regression **(1 p)**
    - optimize the hyperparameter (alpha) and select the best model for the classifier
    - estimate the performance of the model with nested cross validation
    - calculate the accuracy and the confusion matrix

- Multi-layer perceptron (MLP) **(1 p)**
    - build the classifier. Use:
        - 1 hidden layer
        - solver for weight optimization: stochastic gradient-based optimizer ('adam')
        - activation function for the hidden layer: rectified linear unit function ('relu')
        - early stopping
    - optimize the number of neurons in the hidden layer and select the best model for the classifier
    - use an early-stopping committee: after selecting the model, compute the prediction for the test data several times, each time with a different sampling of the training data, and let the committee members vote on the predicted class of each test sample. Use 50% of the training data for validation (the algorithm stops training when the validation score no longer improves)
    - estimate the performance of the classifier with nested cross validation
- Discuss your results **(1 p)**
<br>E.g.
    - Which model performs the best and why?
    - What are the limitations?
    - How could the results be improved?
    - How do you expect the models will perform with unseen data?
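The nested, group-aware cross validation described above can be sketched for the kNN classifier as follows. This is a minimal sketch on hypothetical synthetic data, not the reference solution: the function name `nested_cv_knn`, the candidate values of k, and the number of inner splits are all assumptions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GroupKFold, LeaveOneGroupOut
from sklearn.neighbors import KNeighborsClassifier

def nested_cv_knn(X, y, groups, k_values=(1, 3, 5), inner_splits=3):
    """Nested, group-aware cross validation for a kNN classifier."""
    y_true, y_pred = [], []
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
        X_tr, y_tr, g_tr = X[train_idx], y[train_idx], groups[train_idx]

        # Inner loop: choose k using only the outer training data
        best_k, best_score = None, -1.0
        for k in k_values:
            scores = []
            inner_cv = GroupKFold(n_splits=inner_splits)
            for in_tr, in_val in inner_cv.split(X_tr, y_tr, g_tr):
                model = KNeighborsClassifier(n_neighbors=k)
                model.fit(X_tr[in_tr], y_tr[in_tr])
                scores.append(model.score(X_tr[in_val], y_tr[in_val]))
            if np.mean(scores) > best_score:
                best_k, best_score = k, np.mean(scores)

        # Outer loop: refit with the selected k and predict the held-out group
        model = KNeighborsClassifier(n_neighbors=best_k).fit(X_tr, y_tr)
        y_true.extend(y[test_idx])
        y_pred.extend(model.predict(X[test_idx]))
    return accuracy_score(y_true, y_pred), confusion_matrix(y_true, y_pred)

# Hypothetical synthetic data: 3 well-separated classes, 12 samples per class
rng = np.random.default_rng(0)
centers = np.repeat([[0, 0], [10, 0], [0, 10]], 12, axis=0)
X = centers + rng.normal(scale=0.5, size=centers.shape)
y = np.repeat([0, 1, 2], 12)
groups = np.repeat(np.arange(18), 2)  # each "image" and its flip share an id

accuracy, cm = nested_cv_knn(X, y, groups)
print(accuracy)
print(cm)
```

Because the hyperparameter is chosen inside each outer fold, the held-out group never influences model selection. The same outer and inner loop structure could be reused for the Ridge and MLP classifiers by swapping the model and the hyperparameter grid.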