----
# Model Evaluation
-------

### Summary:
Evaluating the performance of base logistic regression, logistic regression with feature extraction, and a CNN on real-world images.

### Data Overview:
- **Dataset:** Real World images captured using Teachable Machine
- **Number of Samples:** Approximately 300 images per letter, of which only two randomly chosen images per letter are used to test each model.

### Notebook Overview:
- **Data Loading:**
    - Load the Real Dataset using keras
    - Randomly select 2 images per class

- **Model Testing**:
   - Pass images into model.predict 
   - Store results in a dataframe

- **Model Evaluation**:
   - Assess the accuracy of the models' predictions

## Set Up
-----

In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from skimage.feature import hog, local_binary_pattern
import numpy as np
import pandas as pd
import joblib
import re 
import random


## Utility Functions
-----


In [2]:
alphabet = ['a','b','c','d','e','f','g','h','i','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y']

In [3]:
# Randomly picks an image path for a given letter
def get_test_imgs(letter):
    '''
    Overview:
    Gets the path of a random image for a given letter.

    Arguments:
        - letter -> Letter (NOT J/Z as both of these letters do not exist in dataset)

    Output:
        - Path to a random image for the given letter

    '''
    if letter.lower() in ('j', 'z'):
        raise ValueError("The input letter cannot be 'j' or 'z'.")
    else:
        rand_idx = random.randint(1, 300)
        image_path = f'../../data/my_imgs/real_world_imgs/{letter.upper()}/{rand_idx}.jpg'
        return image_path
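
The overview says two images are selected per letter, but two separate calls to `get_test_imgs` draw independently, so the same file can come up twice. Below is a minimal sketch of drawing distinct images instead, assuming the files are named `1.jpg` to `300.jpg` as in the path above. The names `sample_test_imgs`, `n_images`, and `seed` are hypothetical and not part of the notebook.

```python
import random

def sample_test_imgs(letter, k=2, n_images=300, seed=None):
    """Return k distinct image paths for a letter (hypothetical helper)."""
    if letter.lower() in ('j', 'z'):
        raise ValueError("The input letter cannot be 'j' or 'z'.")
    # A local Random instance makes the draw reproducible without touching global state
    rng = random.Random(seed)
    # random.sample draws without replacement, so no index is repeated
    indices = rng.sample(range(1, n_images + 1), k)
    # Assumption: same directory layout as get_test_imgs
    return [f'../../data/my_imgs/real_world_imgs/{letter.upper()}/{idx}.jpg' for idx in indices]

paths = sample_test_imgs('a', k=2, seed=0)
```

Passing a fixed `seed` would also make a test run repeatable, which the current `random.randint` call is not.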

In [4]:
def extract_features(image):
  '''
    Overview:
    Extracts HOG, LBP, and color histogram features from a 28x28 image.

    Arguments:
        - single image

    Output:
        - feature vector for a given image
  '''
  # Reshape the image if it's flattened
  image = image.reshape(28, 28)

  # Histogram of Oriented Gradients (HOG) Features
  hog_features = hog(image, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

  # LBP Features (Uniform LBP with 8 neighbors)
  lbp_features = local_binary_pattern(image, P=8, R=1, method='uniform')
  lbp_hist, bin_edges = np.histogram(lbp_features, bins=np.arange(0, lbp_features.max() + 2), density=True)

  # Color Histogram Features (8 bins per channel) -> distribution of colour (greyscale in this case)
  color_hist, bin_edges = np.histogram(image, bins=8, range=(0, 255), density=True)

  # Combining features into a single vector
  feature_vectors = np.concatenate((hog_features, lbp_hist, color_hist))

  return feature_vectors
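
As a sanity check on `extract_features`, the expected length of the combined vector can be worked out by hand. The sketch below is plain arithmetic, assuming HOG blocks slide one cell at a time and that all 10 uniform LBP codes occur in the image. The helper `hog_length` is hypothetical and not part of the notebook.

```python
def hog_length(image_size, cell_size, block_size, orientations):
    """Length of a HOG descriptor for a square image (hypothetical helper)."""
    cells_per_side = image_size // cell_size           # partial cells at the border are dropped
    blocks_per_side = cells_per_side - block_size + 1  # blocks overlap, moving one cell at a time
    return blocks_per_side ** 2 * block_size ** 2 * orientations

hog_len = hog_length(image_size=28, cell_size=8, block_size=2, orientations=9)
lbp_len = 8 + 2   # uniform LBP with P=8 has P + 2 possible codes (0..9)
color_len = 8     # 8 intensity bins
total_len = hog_len + lbp_len + color_len
print(hog_len, lbp_len, color_len, total_len)  # 144 10 8 162
```

One caveat: the LBP histogram above builds its bins from `lbp_features.max()`, so an image in which the highest code never occurs would produce fewer than 10 bins and a shorter vector. Whether that can happen with these images is not something the notebook shows.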


In [5]:
def log_reg_process_image(letter, model):
    '''
        Overview:
        Processes the image and passes it to .predict for the given model

        Arguments:
            - letter -> Letter (NOT J/Z as both of these letters do not exist in dataset)
            - model -> to do prediction
        Output:
            - actual letter and predicted letter
    '''  
    if letter.lower() in ('j', 'z'):
        raise ValueError("The input letter cannot be 'j' or 'z'.")
    else:

        image_path = get_test_imgs(letter)

        input_image = load_img(image_path, target_size=(28, 28))
        input_image = input_image.convert('L')
        image_array = img_to_array(input_image).astype(float)
        input_img = image_array.reshape(1,28*28)
        prediction = alphabet[model.predict(input_img)[0]]

    return letter, prediction

In [6]:
def log_reg_fe_process_image(letter, model):
    '''
        Overview:
        Processes the image, extracts features and passes them to .predict for the given model

        Arguments:
            - letter -> Letter (NOT J/Z as both of these letters do not exist in dataset)
            - model -> to do prediction
        Output:
            - actual letter and predicted letter
    '''  
    if letter.lower() in ('j', 'z'):
        raise ValueError("The input letter cannot be 'j' or 'z'.")
    else:
        image_path = get_test_imgs(letter)
        input_image = load_img(image_path, target_size=(28, 28))
        input_image = input_image.convert('L')
        image_array = img_to_array(input_image).astype(float)
        input_img = image_array.reshape(1,28*28)

        feature_vec = extract_features(input_img)
        feature_vec = feature_vec.reshape(1, -1)

        prediction = alphabet[model.predict(feature_vec)[0]]

        return letter, prediction


In [7]:
def cnn_process_image(letter, model):
    '''
        Overview:
        Processes the image and passes it to .predict for the given model

        Arguments:
            - letter -> Letter (NOT J/Z as both of these letters do not exist in dataset)
            - model -> to do prediction
        Output:
            - actual letter and predicted letter
    '''  
    if letter.lower() in ('j', 'z'):
        raise ValueError("The input letter cannot be 'j' or 'z'.")
    else:
        image_path = get_test_imgs(letter)
        input_image = load_img(image_path, target_size=(28, 28))
        input_image = input_image.convert('L')
        image_array = img_to_array(input_image).astype(float)
        input_img = image_array.reshape(1, image_array.shape[0], image_array.shape[1], image_array.shape[2])
        soft_pred = model.predict(input_img)
        pred_y = np.argmax(soft_pred, axis=1)
        
        prediction = alphabet[pred_y[0]]

        return letter, prediction

In [8]:
def model_pred(model_name, model, df):
    '''
        Overview:
        To return a dataframe with all results for a given model

        Arguments:
            - model_name -> used to choose the processing function and stored in df
            - model -> to make prediction
            - df -> to store results (rows are appended in place)
        Output:
            - dataframe of results
    '''

    name = model_name.lower()
    # Choose the processor from the model name; the feature engineering check
    # comes first because those names also match the 'log.*reg' pattern
    if re.search('fea.*eng', name):
        process_image = log_reg_fe_process_image
    elif re.search('log.*reg', name):
        process_image = log_reg_process_image
    elif re.search('cnn', name):
        process_image = cnn_process_image
    else:
        # Fail early on an unknown name rather than continuing without a prediction
        raise ValueError(f'{model_name}: Model not expected')

    for letter in alphabet:  # module-level list of the 24 letters in the dataset
        for _ in range(2):  # Get two rand images per letter
            actual, prediction = process_image(letter, model)
            correct = 'Y' if actual == prediction else 'N'
            row = {'Model': model_name, 'Actual Letter': actual, 'Predicted Letter': prediction, 'Correct': correct}
            df.loc[len(df)] = row

    # Return the DataFrame to see the results
    return df
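
The order of the name checks in `model_pred` matters, because a name such as `'Feature Eng Log Reg'` matches both the feature engineering pattern and the logistic regression pattern. The sketch below illustrates this with the standard `re` module. `PATTERNS`, `pick_processor`, and `all_matches` are hypothetical names introduced only for this example.

```python
import re

# Patterns in the same order as the if/elif chain in model_pred
PATTERNS = [
    ('fe_log_reg', r'fea.*eng'),
    ('log_reg', r'log.*reg'),
    ('cnn', r'cnn'),
]

def pick_processor(model_name):
    """Return the label of the first pattern that matches the model name."""
    name = model_name.lower()
    for label, pattern in PATTERNS:
        if re.search(pattern, name):
            return label
    raise ValueError(f'{model_name}: Model not expected')

def all_matches(model_name):
    """Return every label whose pattern matches, to expose overlaps."""
    name = model_name.lower()
    return [label for label, pattern in PATTERNS if re.search(pattern, name)]

print(all_matches('Feature Eng Log Reg'))     # ['fe_log_reg', 'log_reg']
print(pick_processor('Feature Eng Log Reg'))  # fe_log_reg
```

Since `re.search` scans the whole string, the leading and trailing `.*` in the original patterns are not needed. An explicit argument naming the processor would remove the dependence on naming conventions altogether.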

## Initialise dataframes to store results for each model
---

In [9]:
# Base Logistic Regression
base_lr_df = pd.DataFrame(columns=['Model', 'Actual Letter',  'Predicted Letter', 'Correct'])
# Logistic Regression with Feature engineering
fe_lr_df = pd.DataFrame(columns=['Model', 'Actual Letter',  'Predicted Letter', 'Correct'])
# CNN
cnn_df = pd.DataFrame(columns=['Model', 'Actual Letter',  'Predicted Letter', 'Correct'])

## Logistic Regression Testing
----

In [10]:
log_reg_model = joblib.load('../../model/my_models/log_reg_basic_model.pkl')

In [11]:
model_pred('Base Log Reg', log_reg_model, base_lr_df)

Unnamed: 0,Model,Actual Letter,Predicted Letter,Correct
0,Base Log Reg,a,l,N
1,Base Log Reg,a,l,N
2,Base Log Reg,b,h,N
3,Base Log Reg,b,g,N
4,Base Log Reg,c,g,N
5,Base Log Reg,c,f,N
6,Base Log Reg,d,l,N
7,Base Log Reg,d,g,N
8,Base Log Reg,e,n,N
9,Base Log Reg,e,t,N


In [12]:
log_reg_aug_model = joblib.load('../../model/my_models/log_reg_augmented_model.pkl')

In [13]:
model_pred('Base Log Reg Aug', log_reg_aug_model, base_lr_df)

Unnamed: 0,Model,Actual Letter,Predicted Letter,Correct
0,Base Log Reg,a,l,N
1,Base Log Reg,a,l,N
2,Base Log Reg,b,h,N
3,Base Log Reg,b,g,N
4,Base Log Reg,c,g,N
...,...,...,...,...
91,Base Log Reg Aug,w,t,N
92,Base Log Reg Aug,x,g,N
93,Base Log Reg Aug,x,g,N
94,Base Log Reg Aug,y,q,N


## Logistic Regression with Feature Engineering Testing

---

In [14]:
log_reg_with_fe = joblib.load('../../model/my_models/log_reg_with_fe.pkl')

In [15]:
model_pred('Feature Eng Log Reg', log_reg_with_fe, fe_lr_df)

Unnamed: 0,Model,Actual Letter,Predicted Letter,Correct
0,Feature Eng Log Reg,a,h,N
1,Feature Eng Log Reg,a,g,N
2,Feature Eng Log Reg,b,y,N
3,Feature Eng Log Reg,b,q,N
4,Feature Eng Log Reg,c,h,N
5,Feature Eng Log Reg,c,g,N
6,Feature Eng Log Reg,d,i,N
7,Feature Eng Log Reg,d,d,Y
8,Feature Eng Log Reg,e,y,N
9,Feature Eng Log Reg,e,c,N


In [16]:
log_reg_aug_model = joblib.load('../../model/my_models/log_reg_fe_augmented_model.pkl')

In [17]:
model_pred('Feature Eng Log Reg Aug', log_reg_aug_model, fe_lr_df)

Unnamed: 0,Model,Actual Letter,Predicted Letter,Correct
0,Feature Eng Log Reg,a,h,N
1,Feature Eng Log Reg,a,g,N
2,Feature Eng Log Reg,b,y,N
3,Feature Eng Log Reg,b,q,N
4,Feature Eng Log Reg,c,h,N
...,...,...,...,...
91,Feature Eng Log Reg Aug,w,v,N
92,Feature Eng Log Reg Aug,x,c,N
93,Feature Eng Log Reg Aug,x,g,N
94,Feature Eng Log Reg Aug,y,a,N


## CNN Testing
----

In [18]:
cnn_model = load_model('../../model/my_models/CNN_model.h5')

In [19]:
model_pred('CNN', cnn_model,cnn_df)







Unnamed: 0,Model,Actual Letter,Predicted Letter,Correct
0,CNN,a,l,N
1,CNN,a,k,N
2,CNN,b,o,N
3,CNN,b,l,N
4,CNN,c,k,N
5,CNN,c,l,N
6,CNN,d,v,N
7,CNN,d,d,Y
8,CNN,e,k,N
9,CNN,e,o,N


In [20]:
cnn_aug_model = load_model('../../model/my_models/CNN_augmented_model.h5')

In [21]:
model_pred('CNN_aug', cnn_aug_model,cnn_df)



Unnamed: 0,Model,Actual Letter,Predicted Letter,Correct
0,CNN,a,l,N
1,CNN,a,k,N
2,CNN,b,o,N
3,CNN,b,l,N
4,CNN,c,k,N
...,...,...,...,...
91,CNN_aug,w,f,N
92,CNN_aug,x,x,Y
93,CNN_aug,x,x,Y
94,CNN_aug,y,l,N


## Result Evaluation 
---

### Logistic Regression

In [22]:
base_lr_df[base_lr_df['Correct'] == 'Y']

Unnamed: 0,Model,Actual Letter,Predicted Letter,Correct
30,Base Log Reg,q,q,Y
60,Base Log Reg Aug,g,g,Y
61,Base Log Reg Aug,g,g,Y
63,Base Log Reg Aug,h,h,Y


**Comment**

The base logistic regression model performs very poorly on the real-world dataset: the filtered output above lists only four correct predictions across the base and augmented models. This is likely because real-world images are harder to classify with a linear model. In the dataset I purposely performed augmentation to replicate scenarios that will happen in the real world, but the basic logistic regression model does not handle this well.
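
To put "very poorly" into context, it helps to compare against guessing. The sketch below assumes a classifier that picks one of the 24 letters uniformly at random, and takes the correct counts (1 for the base model, 3 for the augmented model) from the filtered output above. The variable names are illustrative only.

```python
n_classes = 24                                   # letters a-y without j
images_per_letter = 2
n_predictions = n_classes * images_per_letter    # predictions made per model

# Uniform random guessing is right once in n_classes attempts
chance_accuracy = 1 / n_classes
expected_correct_by_chance = n_predictions / n_classes

# Correct counts read from the filtered output above
observed_correct = {'Base Log Reg': 1, 'Base Log Reg Aug': 3}
observed_accuracy = {name: hits / n_predictions for name, hits in observed_correct.items()}

print(f'chance: {chance_accuracy:.3f}, expected hits: {expected_correct_by_chance:.0f}')
for name, acc in observed_accuracy.items():
    print(f'{name}: {acc:.3f}')
```

With only 48 predictions per model, a difference of one or two hits either way is small, so these counts suggest performance close to chance rather than a precise ranking of the two models.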

### Logistic Regression with Feature Engineering

In [23]:
fe_lr_df[fe_lr_df['Correct'] == 'Y']

Unnamed: 0,Model,Actual Letter,Predicted Letter,Correct
7,Feature Eng Log Reg,d,d,Y
11,Feature Eng Log Reg,f,f,Y
29,Feature Eng Log Reg,p,p,Y
46,Feature Eng Log Reg,y,y,Y
47,Feature Eng Log Reg,y,y,Y
52,Feature Eng Log Reg Aug,c,c,Y
53,Feature Eng Log Reg Aug,c,c,Y
59,Feature Eng Log Reg Aug,f,f,Y
60,Feature Eng Log Reg Aug,g,g,Y
78,Feature Eng Log Reg Aug,q,q,Y


Instead of working with raw pixel values, feature extraction lets us manually pick out features from the images, such as edges and texture patterns. This helps models generalise better, because features are more robust than raw pixel values, especially when noise is introduced into the dataset. The results here reflect this: feature extraction has improved the performance of the logistic regression model on the real-world dataset.


### CNN

In [24]:
cnn_df[cnn_df['Correct']=='Y']

Unnamed: 0,Model,Actual Letter,Predicted Letter,Correct
7,CNN,d,d,Y
10,CNN,f,f,Y
11,CNN,f,f,Y
20,CNN,l,l,Y
31,CNN,q,q,Y
43,CNN,w,w,Y
45,CNN,x,x,Y
52,CNN_aug,c,c,Y
54,CNN_aug,d,d,Y
58,CNN_aug,f,f,Y


As expected, the CNN performs best on real-world data. This is likely due to the network's ability to automatically learn more complex features than manual feature extraction methods provide. Such features can capture more abstract patterns, some of which are better suited to real-world data.
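
The filtered tables list individual hits, while a per-model accuracy would make the comparison between models direct. The sketch below assumes the results are available as a list of dicts with the same column names as the dataframes, which `df.to_dict('records')` would produce. The helper `accuracy_by_model` is hypothetical, and the sample rows are the first ten CNN rows shown in the output earlier, not the full result set.

```python
from collections import defaultdict

def accuracy_by_model(rows):
    """Fraction of rows per model where the predicted letter equals the actual letter."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        totals[row['Model']] += 1
        if row['Actual Letter'] == row['Predicted Letter']:
            hits[row['Model']] += 1
    return {model: hits[model] / totals[model] for model in totals}

# First ten CNN rows from the output above: (actual, predicted)
pairs = [('a', 'l'), ('a', 'k'), ('b', 'o'), ('b', 'l'), ('c', 'k'),
         ('c', 'l'), ('d', 'v'), ('d', 'd'), ('e', 'k'), ('e', 'o')]
rows = [{'Model': 'CNN', 'Actual Letter': a, 'Predicted Letter': p} for a, p in pairs]

print(accuracy_by_model(rows))  # {'CNN': 0.1}
```

Comparing the letters directly avoids relying on how the `Correct` column was encoded.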



NOTE: I will come back to this notebook, as the way I have performed the testing is rather messy and I believe there is a better way to do this.