# Environment set up

In this section we will set up a Colab environment for the MLEnd mini-project. Before starting, follow these simple instructions:

1.   Go to https://drive.google.com/
2.   Create a folder named 'Data' in 'MyDrive': On the left, click 'New' > 'Folder', enter the name **'Data'**, and click 'create'
3.   Open the 'Data' folder and create a folder named **'MLEnd'**.

Let's start by loading a few useful Python libraries and mounting our personal Google Drive storage system (i.e. making it available, so that Colab can access it).

In [None]:
!pip install mlend

In [None]:
from google.colab import drive

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import spkit as sp

from skimage import exposure
from skimage.color import rgb2hsv, rgb2gray
import skimage as ski

import mlend
from mlend import download_yummy_small, yummy_small_load

import os, sys, re, pickle, glob
import urllib.request
import zipfile

import IPython.display as ipd
from tqdm import tqdm
import librosa

drive.mount('/content/drive')

# Download data

In this section we will download a small subsample of the MLEnd Yummy Dataset, i.e. the MLEnd Small Yummy Dataset. This dataset consists of a total of 99 samples from the MLEnd Yummy Dataset corresponding to dishes that contain either rice or chips.

You should be able to download the entire training dataset using a similar approach to the one used here for the small sample. As you will see, you only need to provide a different link.

Run the next code cell to download the MLEnd Small Yummy Dataset:


In [None]:
baseDir = download_yummy_small(save_to = '/content/drive/MyDrive/Data/MLEnd')
baseDir

Now run the following cell to check the contents of the folder into which you have downloaded the data:


In [None]:
os.listdir(baseDir)

As you can see, there is a subfolder ('MLEndYD_images_small') together with a CSV file ('MLEndYD_image_attributes_small.csv'). The subfolder contains all the photos in the MLEnd Small Yummy Dataset. You can also check the contents of this folder via Google Drive.

# Understanding our dataset

Each sample in the MLEnd Small Yummy Dataset corresponds to one dish instance and is described by 9 attributes, namely:

- Photo of the dish.
- Dish name.
- Whether home or restaurant.
- Cuisine.
- Ingredients.
- Diet.
- Healthiness rating.
- Tastiness rating.
- Rice or chips?

As previously mentioned, only dishes containing either rice or chips have been included in the Small MLEnd Yummy Dataset. We captured this by adding the binary attribute *Rice or chips?*.

We can imagine the MLEnd Small Yummy Dataset as a table that has 99 rows and 9 columns. Tables are a useful abstraction, but at the end of the day, we need to store the values of the attributes of each sample somewhere. Most of the attributes are text-based and can therefore be stored in a text file, for instance a CSV file. However, the photo attribute is a complex one that is not suitable for storage in a text file. Consequently, we stored each photo as a JPEG file in the separate folder *MLEndYD_images_small*.

The CSV file *MLEndYD_image_attributes_small.csv* captures the values of all the attributes of each sample. However, instead of an actual photo, this CSV file stores the name of the photo, e.g. '00001.jpg', that is stored in the separate folder *MLEndYD_images_small*.

Let's look at the contents of this CSV file:





In [None]:
MLENDYD_df = pd.read_csv('/content/drive/MyDrive/Data/MLEnd/yummy/MLEndYD_image_attributes_small.csv').set_index('filename')
MLENDYD_df

Note that there are 99 rows and 10 columns. The first column is used both as a unique identifier (index) of the sample and as a link to the photo of the dish. Pandas does not include the index column in the column count, which is why it reports that the table has 9 columns.

The 10th column ('Benchmark_A') is one that we have added for benchmarking purposes. This column indicates whether a sample should be used for training or for testing. Note that no sample is included in both training and test.
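
The role of a split column like this can be illustrated on a toy table. The sketch below is not the real dataset: the filenames are made up, the label column name `Rice_Chips` is hypothetical, and the split values 'Train' and 'Test' are an assumption about how the column is encoded.

```python
import pandas as pd

# Toy stand-in for the attributes table; every value here is made up.
toy_df = pd.DataFrame(
    {
        "filename": ["a.jpg", "b.jpg", "c.jpg", "d.jpg"],
        "Rice_Chips": ["rice", "chips", "rice", "chips"],    # hypothetical column name
        "Benchmark_A": ["Train", "Train", "Test", "Train"],  # assumed split values
    }
).set_index("filename")

# The index is not counted as a column, so shape reports 2 columns, not 3.
n_rows, n_cols = toy_df.shape

# Count how many samples fall in each split.
split_counts = toy_df["Benchmark_A"].value_counts().to_dict()

# Each sample belongs to exactly one split, so the two sets of filenames are disjoint.
train_files = set(toy_df.index[toy_df["Benchmark_A"] == "Train"])
test_files = set(toy_df.index[toy_df["Benchmark_A"] == "Test"])

print(n_rows, n_cols, split_counts, train_files & test_files)
```

The same `value_counts` call on the real table should show how the 99 samples are divided between the two splits.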

If we now count the number of files in the *MLEndYD_images_small* folder, we obtain 99, as expected:

In [None]:
sample_path = '/content/drive/MyDrive/Data/MLEnd/yummy/MLEndYD_images_small/*.jpg'
files = glob.glob(sample_path)
len(files)
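
Beyond counting files, it can be useful to check that every filename listed in the table has a matching file on disk. The following is a minimal sketch of that idea using a temporary folder and made-up filenames, so it does not touch Google Drive; the names `listed`, `missing` and `extra` are illustrative only.

```python
import glob
import os
import tempfile

# Hypothetical filenames, standing in for the index of the attributes table.
listed = ["00001.jpg", "00002.jpg", "00003.jpg"]

with tempfile.TemporaryDirectory() as tmp:
    # Create empty placeholder files; the last listed file is deliberately left out.
    for name in listed[:2]:
        open(os.path.join(tmp, name), "w").close()

    # Same glob pattern idea as in the cell above, pointed at the temporary folder.
    on_disk = {os.path.basename(p) for p in glob.glob(os.path.join(tmp, "*.jpg"))}

missing = sorted(set(listed) - on_disk)  # listed in the table but not on disk
extra = sorted(on_disk - set(listed))    # on disk but not listed in the table

print("missing:", missing, "extra:", extra)
```

With the real data, `listed` would come from the index of `MLENDYD_df` and the glob pattern would be `sample_path`.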

Let's have a look at the first two photos:

In [None]:
I = plt.imread('/content/drive/MyDrive/Data/MLEnd/yummy/MLEndYD_images_small/00001.jpg')
plt.subplot(1,2,1)
plt.imshow(I)
plt.axis('off')

I = plt.imread('/content/drive/MyDrive/Data/MLEnd/yummy/MLEndYD_images_small/00002.jpg')
plt.subplot(1,2,2)
plt.imshow(I)
plt.axis('off')

Both photos correspond to a dish that has chips. Note that their sizes are different.

# Create train and test Datasets

In this Starter kit we will consider the problem of predicting whether a dish has rice or chips using a picture of the dish as the predictor.

To solve this problem, let us create two datasets, one for the training task and another one for the test task. We will use the `yummy_small_load` function included in our `mlend` library for this, and will specify which dataset each sample should belong to by using the column 'Benchmark_A' in the CSV file:

In [None]:
TrainSet, TestSet, Map = yummy_small_load(datadir_main=baseDir,train_test_split='Benchmark_A')

`TrainSet` and `TestSet` contain the two datasets, and `Map` describes how the values 'chips' and 'rice' are mapped to the values 0 and 1.

In [None]:
TrainSet.keys()

In [None]:
TestSet.keys()

In [None]:
Map
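
As a rough illustration of what such a label mapping does, the sketch below encodes and decodes a handful of made-up labels with a plain dictionary. It assumes chips is mapped to 0 and rice to 1, which matches how the encoded labels are used later in this notebook; the actual structure of `Map` returned by `yummy_small_load` may differ.

```python
import numpy as np

# Assumed mapping: chips -> 0, rice -> 1.
label_map = {"chips": 0, "rice": 1}

# Made-up string labels.
y = np.array(["rice", "chips", "chips", "rice"])

# Encode: look up each string label in the mapping.
y_encoded = np.array([label_map[label] for label in y])

# Decode: invert the mapping to recover the original strings.
inverse_map = {code: label for label, code in label_map.items()}
y_decoded = np.array([inverse_map[int(code)] for code in y_encoded])

print(y_encoded)
print(y_decoded)
```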

Let us display all the labels in the training dataset using the values 'chips' and 'rice':

In [None]:
TrainSet['Y']

And now, let's display the labels encoded using the values 0 and 1:

In [None]:
TrainSet['Y_encoded']

Finally, let's save the predictors and labels of the training and test dataset:

In [None]:
X_train_paths = TrainSet['X_paths']
X_test_paths  = TestSet['X_paths']

Y_train = TrainSet['Y_encoded']
Y_test  = TestSet['Y_encoded']

# Visualising dishes

In this section, we will visualise the images that we have extracted from the MLEnd Small Yummy Dataset. Specifically, we will select five dishes that contain rice and five dishes that contain chips.

In [None]:
Chips_Img = np.array(X_train_paths)[Y_train==0]
Rice_Img = np.array(X_train_paths)[Y_train==1]

print('Rice')
plt.figure(figsize=(15,5))
for k,file in enumerate(Rice_Img[:5]):
  I = plt.imread(file)
  plt.subplot(1,5,k+1)
  plt.imshow(I)
  plt.axis('off')

plt.tight_layout()
plt.show()

print('Chips')
plt.figure(figsize=(15,5))
for k,file in enumerate(Chips_Img[:5]):
  I = plt.imread(file)
  plt.subplot(1,5,k+1)
  plt.imshow(I)
  plt.axis('off')

plt.tight_layout()
plt.show()

# Resizing Images

As previously mentioned, the images are not all the same size. Our first step will be to resize all the images so that they have the same size.

To preserve the aspect ratio of each image, we will first pad it with black pixels so that it becomes square, and then resize it to 200x200 pixels.

You can choose any other size or approach to resize the images.

In [None]:
def make_it_square(I, pad=0):
  N,M,C = I.shape
  if N>M:
    Is = [np.pad(I[:,:,i], [(0,0),(0, N-M)], 'constant', constant_values=pad) for i in range(C)]
  else:
    Is = [np.pad(I[:,:,i], [(0, M-N),(0,0)], 'constant', constant_values=pad) for i in range(C)]

  return np.array(Is).transpose([1,2,0])

def resize_img(I,size=[100,100]):
  N,M,C = I.shape
  Ir = [sp.core.processing.resize(I[:,:,i],size) for i in range(C)]
  return np.array(Ir).transpose([1,2,0])
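
The padding step can be checked in isolation on a tiny array. The sketch below is an alternative formulation, not the function above: `pad_to_square` is a hypothetical helper that pads all channels in a single `np.pad` call, adding rows at the bottom or columns on the right as `make_it_square` does.

```python
import numpy as np

def pad_to_square(img, pad_value=0):
    """Pad an (N, M, C) image at the bottom or right so that it becomes square."""
    n, m = img.shape[:2]
    side = max(n, m)
    # Pad rows at the bottom and columns on the right; leave the channels alone.
    pad_width = [(0, side - n), (0, side - m), (0, 0)]
    return np.pad(img, pad_width, mode="constant", constant_values=pad_value)

# Made-up 2x4 RGB "image" filled with ones (wider than tall).
wide = np.ones((2, 4, 3), dtype=np.uint8)
square = pad_to_square(wide)

# Made-up 5x3 RGB "image" filled with ones (taller than wide).
tall = np.ones((5, 3, 3), dtype=np.uint8)
square_tall = pad_to_square(tall)

print(square.shape, square_tall.shape)
print(square[:, :, 0])  # original rows of ones, followed by rows of zero padding
```

Because the padding is added before resizing, the dish keeps its proportions and the extra area shows up as a black band.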

In [None]:
X_train = []
for k,file in enumerate(X_train_paths):
  sp.utils.ProgBar_JL(k,len(X_train_paths),L=50,color='blue')
  I = plt.imread(file)
  I = make_it_square(I, pad=0)
  I = resize_img(I,size=[200,200])
  X_train.append(I)


X_test = []
for k,file in enumerate(X_test_paths):
  sp.utils.ProgBar_JL(k,len(X_test_paths),L=50,color='blue')
  I = plt.imread(file)
  I = make_it_square(I, pad=0)
  I = resize_img(I,size=[200,200])
  X_test.append(I)

X_train = np.array(X_train)
X_test = np.array(X_test)
X_train.shape, X_test.shape

Let's now plot a few images after resizing:

In [None]:
plt.figure(figsize=(10,6))
# Show the first 15 resized training images in a 3x5 grid
for k,I in enumerate(X_train[:15]):
  plt.subplot(3,5,k+1)
  plt.imshow(I)
  plt.axis('off')
plt.tight_layout()
plt.show()

As you can see, all the images have a square shape and consist of 200x200 pixels.

# Feature Extraction

To solve the problem of predicting whether a dish has rice or chips using a 200 x 200 pixel photo as the predictor, we need to start by considering its dimensionality. Each photo is described by 3 x 200 x 200 = 120,000 values. Therefore, the predictor space has 120,000 dimensions. As a rule of thumb, training a model reliably on such a space would require more training samples than dimensions, i.e. more than 120,000 samples. Needless to say, our training dataset is much, much smaller.
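
The arithmetic above can be reproduced by flattening a placeholder array of the same shape as one resized photo. This is only an illustration: the array is filled with zeros rather than real pixel values, and the training-set size of 70 is the figure reported later in this notebook.

```python
import numpy as np

# Placeholder for one resized RGB photo: 200 rows, 200 columns, 3 channels.
photo = np.zeros((200, 200, 3), dtype=np.uint8)

# Flattening turns the photo into a single point in the predictor space.
x = photo.reshape(-1)
n_dims = x.size

# Size of the training set used in this notebook.
n_train = 70

print(n_dims)            # 120000
print(n_dims / n_train)  # dimensions per training sample
```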

Our only option is to reduce the dimensionality of the predictor space, in other words, we will move our samples from a 120,000D space to another space that has fewer dimensions. Feature extraction is a common approach that allows us to reduce the dimensionality of our prediction space. In the code cell below, we define two functions `get_yellow_component` and `GMLC_features` that extract three image features that will define a new predictor space:

In [None]:
from skimage.feature import ORB
from skimage.feature import graycomatrix, graycoprops


def get_yellow_component(I,t1=27, t2=33):
  Ihsv = (rgb2hsv(I)*255).astype('uint8')
  mask = (Ihsv[:,:,0]<t2)*(Ihsv[:,:,0]>t1)
  Ypx = mask.sum()
  return Ypx

def GMLC_features(I):
  Ig = (rgb2gray(I)*255).astype('uint8')
  glcm = graycomatrix(Ig, distances=[5], angles=[0], levels=256,
                        symmetric=True, normed=True)
  f1 = graycoprops(glcm, 'dissimilarity')[0, 0]
  f2 = graycoprops(glcm, 'correlation')[0, 0]
  return f1,f2


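def showConfMat(CM, labels = ['Chips','Rice']):
  # Rows of CM are true classes, columns are predicted classes
  plt.matshow(CM,cmap='Blues')
  for i in range(CM.shape[0]):
    for j in range(CM.shape[1]):
      # plt.text takes (x, y) = (column, row), so j comes before i
      plt.text(j,i,CM[i,j].round(2),ha='center')
  plt.xticks([0,1],labels)
  plt.yticks([0,1],labels)
  plt.show()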
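
To get a feel for what the thresholds `t1=27` and `t2=33` select, the sketch below works on single colours instead of whole images. It uses the standard-library `colorsys` module rather than `rgb2hsv`, and the helpers `hue_0_255` and `is_in_band` are hypothetical names introduced here; the scaling to 0..255 with truncation is meant to mirror `get_yellow_component`, under the assumption that the input channels are in the range 0..255.

```python
import colorsys

def hue_0_255(r, g, b):
    """Hue of an RGB colour (channels in 0..255), rescaled from [0, 1] to 0..255 and truncated."""
    h, _, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return int(h * 255)

def is_in_band(r, g, b, t1=27, t2=33):
    """True if the rescaled hue lies strictly between the thresholds t1 and t2."""
    return t1 < hue_0_255(r, g, b) < t2

# The default thresholds expressed in degrees on the colour wheel.
band_degrees = (27 / 255 * 360, 33 / 255 * 360)  # roughly 38 to 47 degrees

golden = (230, 170, 40)      # made-up "chip-like" colour
pure_yellow = (255, 255, 0)  # hue of 60 degrees

print(band_degrees)
print(hue_0_255(*golden), is_in_band(*golden))
print(hue_0_255(*pure_yellow), is_in_band(*pure_yellow))
```

Under these assumptions the band covers golden, orange-yellow hues, while pure yellow falls outside it. Widening or shifting `t1` and `t2` is one simple way to experiment with this feature.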

Let us now extract the three features from each image and create the transformed sets `X_train_f` and `X_test_f`:

In [None]:
X_train_f = []
for k, I in enumerate(X_train):
  f1 = get_yellow_component(I)
  f2,f3 = GMLC_features(I)
  X_train_f.append([f1,f2,f3])

X_test_f = []
for k, I in enumerate(X_test):
  f1 = get_yellow_component(I)
  f2,f3 = GMLC_features(I)
  X_test_f.append([f1,f2,f3])

After formatting both `X_train_f` and `X_test_f` as numpy arrays, we can check their respective shapes:

In [None]:
X_train_f = np.array(X_train_f)
X_test_f = np.array(X_test_f)
X_train_f.shape, X_test_f.shape

Note that `X_train_f` represents a collection of 70 samples described by 3 attributes and `X_test_f` represents a collection of 29 samples described by 3 attributes. This feature extraction stage has reduced the dimensionality of our problem from 120,000D to 3D.

# Normalisation

In addition to reducing the dimensionality of the prediction space, let's implement a normalisation stage to ensure that the 3 attributes in the new prediction space take on a similar range of values. We will implement standardisation.

In [None]:
MEAN = X_train_f.mean(0)
SD = X_train_f.std(0)

X_train_fn = (X_train_f - MEAN)/SD
X_test_fn = (X_test_f - MEAN)/SD
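
The key point in the cell above is that the mean and standard deviation are estimated on the training data only and then reused for the test data. The sketch below shows this on made-up numbers, with one addition that is not in the original code: a guard for constant features, whose standard deviation is zero.

```python
import numpy as np

# Made-up feature matrices: 4 training samples and 2 test samples, 2 features each.
train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0], [4.0, 400.0]])
test = np.array([[2.5, 250.0], [5.0, 500.0]])

# Statistics are estimated on the training data only.
mean = train.mean(axis=0)
sd = train.std(axis=0)

# Guard against constant features, whose standard deviation is zero.
sd_safe = np.where(sd == 0, 1.0, sd)

train_n = (train - mean) / sd_safe
test_n = (test - mean) / sd_safe

print(train_n.mean(axis=0), train_n.std(axis=0))  # approximately 0 and 1 per feature
print(test_n)  # test values are not guaranteed to have mean 0 and SD 1
```

After standardisation the two features have the same scale on the training data, even though their raw values differ by a factor of 100.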

# Linear Model

Finally, let's train and test a linear model that uses the 3 normalised attributes of an image to predict whether the image corresponds to a dish that has rice or chips.

The linear model that we will use is called a 'Support Vector Machine'. It produces a linear boundary in the prediction space, but is trained using a different strategy than other linear models that we have seen in class, such as logistic regression or discriminant analysis.

Let us use the class `LinearSVC` available in scikit-learn to train a linear support vector machine:

In [None]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import LinearSVC

model = LinearSVC(C=1)
model.fit(X_train_fn, Y_train)

Now, let's use this trained model to predict the labels in the training and test datasets and based on the predicted labels, let's calculate the training and test accuracy:

In [None]:
ytp = model.predict(X_train_fn)
ysp = model.predict(X_test_fn)

train_accuracy = np.mean(ytp==Y_train)
test_accuracy  = np.mean(ysp==Y_test)

print('Training Accuracy:\t',train_accuracy)
print('Test  Accuracy:\t',test_accuracy)

How well do you think the model is performing? Let's build a confusion matrix to look at the per-class accuracies:

In [None]:
Ac = np.mean(ysp[Y_test.astype(int)==0]==0)
Ar = np.mean(ysp[Y_test.astype(int)==1]==1)

Mc = np.mean(ysp[Y_test.astype(int)==0]==1)
Mr = np.mean(ysp[Y_test.astype(int)==1]==0)

CM = np.array([[Ac, Mc],[Mr, Ar]])

showConfMat(CM)
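
The same row-normalised matrix can be computed for any number of classes with a short loop. The sketch below uses made-up labels, and `row_normalised_confusion` is a hypothetical helper, not part of `mlend` or scikit-learn; it assumes that every class appears at least once among the true labels, since otherwise a row total would be zero.

```python
import numpy as np

def row_normalised_confusion(y_true, y_pred, n_classes=2):
    """Entry [i, j] is the fraction of samples of true class i predicted as class j."""
    counts = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        counts[int(t), int(p)] += 1
    # Divide each row by its total so that every row sums to 1.
    return counts / counts.sum(axis=1, keepdims=True)

# Made-up labels: 0 = chips, 1 = rice.
y_true = np.array([0, 0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 1, 1, 0])

cm = row_normalised_confusion(y_true, y_pred)
overall = np.mean(y_true == y_pred)

print(cm)       # diagonal entries are the per-class accuracies
print(overall)  # overall accuracy
```

In this toy example the overall accuracy sits between the two per-class accuracies, which is why looking at the diagonal of the matrix can reveal differences that a single accuracy figure hides.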

What would you conclude from the per-class accuracies?

One final question for you is, if we were to deploy this solution, what would be the *processing pipeline*? To answer this question, you need to identify all the steps that have taken us from a picture all the way to the prediction.

You can try other machine learning models that use the same normalised predictors, or extract a different set of features. For instance, to train a Random Forest model, simply replace the model definition above with:

`model = RandomForestClassifier(n_estimators=5,max_depth=3)`

It is surprisingly easy.


