MingSheng92/Image_Classification

Machine learning - Image classification

Today we want to look at different image classification methods on two different datasets, MNIST and Fashion-MNIST.

Datasets

As mentioned in the previous section, we will be performing our tests and evaluation on two MNIST datasets: the handwritten-digit MNIST and the Fashion-MNIST dataset. Both datasets have similar attributes: each contains sets of 28 x 28 greyscale images labelled under 10 classes.

MNIST sample images (by Josef Steppan, own work, CC BY-SA 4.0)

The Fashion-MNIST dataset was created because the handwritten-digit MNIST dataset is too simple: even a simple model setup can easily achieve 90% accuracy on it. The Fashion-MNIST image above is taken from the official Fashion-MNIST GitHub website.

For more information on the datasets used, you may refer to the links below:
MNIST handwritten digits dataset.
Fashion MNIST dataset.

Pre-processing

We rescale the data with min-max normalization, z = (x - min(x)) / (max(x) - min(x)), where x is each and every pixel value in our case, and z is our rescaled value, ranging between [0, 1].

Another way to perform normalization is to divide all pixel values by 255, since we already know a pixel has a minimum value of 0 and a maximum of 255.
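Both normalization approaches can be sketched in a few lines of NumPy; the toy pixel array below is illustrative, not from the actual dataset. When the data actually spans the full 0-255 range, the two give identical results:

```python
import numpy as np

# Hypothetical batch of greyscale pixel values in the range 0-255
images = np.array([[0, 128, 255], [64, 32, 16]], dtype=np.float64)

# Min-max normalization: rescale every pixel into [0, 1]
normalized = (images - images.min()) / (images.max() - images.min())

# Since pixels are already known to span 0-255, dividing by 255 is equivalent
normalized_simple = images / 255.0
```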

Methods

Bernoulli Naive Bayes

Naive Bayes is a probabilistic classifier that makes strong (naive) independence assumptions between the features. Out of all the Naive Bayes models, we chose to work with Bernoulli Naive Bayes, as we believe we can treat the image data as a binary representation when calculating the posterior; hence it should work relatively better here than the Gaussian or Multinomial Naive Bayes models.

The likelihood of a binary feature vector x under each class Ck is defined as p(x | Ck) = prod_i p_ki^xi * (1 - p_ki)^(1 - xi), where p_ki is the probability of class Ck generating a 1 for feature i.

Additional reading material:
Wikipedia
NLP Stanford
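The per-class likelihood above can be sketched directly in NumPy (in log space, to avoid underflow when multiplying many per-pixel probabilities). The feature vector and probabilities below are toy values, not estimates from MNIST:

```python
import numpy as np

def bernoulli_log_likelihood(x, p):
    """Log-likelihood of binary feature vector x under per-feature
    Bernoulli probabilities p: sum_i [x_i*log(p_i) + (1-x_i)*log(1-p_i)]."""
    return np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

# Toy example: 4 binarized "pixels" and hypothetical class probabilities
x = np.array([1, 0, 1, 1])
p = np.array([0.9, 0.2, 0.8, 0.7])
ll = bernoulli_log_likelihood(x, p)
```

At classification time one would compute this for each class, add the log prior, and pick the class with the largest score.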

Logistic regression

We will use multi-class logistic regression for this task with the softmax function, defined as softmax(z)_k = exp(z_k) / sum_j exp(z_j).

When there are K different classes C1, C2, ..., CK, each class Ck has a parameter vector wk, and the posterior probability of the model is p(Ck | x) = exp(wk^T x) / sum_{j=1}^{K} exp(wj^T x).

The multinomial logistic loss (cross-entropy) is then L(w) = - sum_n sum_k y_nk * log p(Ck | xn), where y_nk is 1 if sample n belongs to class Ck and 0 otherwise.

Additional reading material:
Wikipedia
Stack Overflow

Convolutional Neural Network

Anyone with some exposure to machine learning will have heard of convolutional neural networks: they are one of the most commonly used machine learning methods for analyzing visual imagery.

Additional reading material:
Wikipedia
Adit Deshpande
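To make the core operation concrete, here is a minimal NumPy sketch of a single "valid" 2-D convolution (strictly, cross-correlation, which is what most CNN libraries compute). The 4x4 image and edge-detecting kernel are toy values, not part of the actual model:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a single-channel image with a kernel:
    slide the kernel over the image and sum the elementwise products."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Toy 4x4 "image" and a 3x3 vertical-edge kernel
image = np.arange(16, dtype=np.float64).reshape(4, 4)
kernel = np.array([[1.0, 0.0, -1.0]] * 3)
feature_map = conv2d(image, kernel)
```

A real CNN stacks many such learned kernels with nonlinearities and pooling layers; libraries implement this far more efficiently than the explicit loops above.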

Results

Without any doubt we can assume that the neural network will achieve better results than the other machine learning methods, and the final results confirm this. However, the Bernoulli Naive Bayes classifier gets results comparable to logistic regression, which is a bit of a surprise, since Naive Bayes often does not predict well on real-life tasks even though it works well in theory. Our current implementation of logistic regression is also not optimal: if we implemented regularization, we could expect a clear improvement towards a more robust classifier. Although the CNN achieved the best prediction results, CNNs are extremely sensitive to the dataset, meaning that for a different task you will need to define a different architecture; there is no such thing as one model that fits all (the no-free-lunch theorem).

From the table below, you can find a summary of the accuracy/performance of all the different classifiers on the MNIST and Fashion-MNIST datasets.

Result

Future Work

  1. Add regularization to logistic regression to make the classifier more robust.
  2. Implement Multinomial Naive Bayes and Gaussian Naive Bayes and compare between the Naive Bayes models.
  3. Add further evaluation methods (ROC curves, confusion matrices, etc.) to analyze the listed models.