## Neural Networks - Classifying images in CIFAR-10

## 1) Main objective

The goal of this analysis is to train a Neural Network so that it will be able to recognize and differentiate between, types of clothing. There are ten categories of items, which can typically be found in someone's wardrobe including blouse, shirts, shoes, handbags. This will allow us to automatically send new pictures which need description, to the department of the clothing store which specializes in it. Putting up pictures of new items on a Website would benefit from this type of categorization.

## 2) The Data

The Neural Network will be trained using the MNIST-Fasion Dataset. This is popular  open access dataset that is is used as practice when learning about Neural Networks. The data comes from Zalando, an international clothier (Zalando.com). It contains 60000 24x24 black and white images in 10 classes. There are also similar 10,000 images in the test set. 

Each item has 1 of the following labels:

0	T-shirt/top
1	Trouser
2	Pullover
3	Dress
4	Coat
5	Sandal
6	Shirt
7	Sneaker
8	Bag
9	Ankle boot

Here is the website with more details on MNIST-Fashion:
https://github.com/zalandoresearch/fashion-mnist

In [73]:
import tensorflow

In [74]:
import keras 
import tensorflow
#from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
#from keras.layers import Conv2D, MaxPooling2D
import matplotlib.pyplot as plt

#### Load the Data

In [163]:
import pandas as pd
test_set =  pd.read_csv(r"C:\Users\David\OneDrive\Data Science Certification\IBM\Machine Learning\Neural Networks\Datasets\FashionMNIST\fashion-mnist_test.csv")
train_set = pd.read_csv(r"C:\Users\David\OneDrive\Data Science Certification\IBM\Machine Learning\Neural Networks\Datasets\FashionMNIST\fashion-mnist_train.csv")

In [164]:
#X_train, y_train = X_test, y_test = mnist_reader.load_mnist('data/fashion', kind='t10k')

In [165]:
# import tensorflow_datasets as tfds
# datasets = tfds.load('mnist')

# train_dataset = datasets['train']
# test_dataset = datasets['test']

# IMAGE_INPUT_NAME = 'image'
# LABEL_INPUT_NAME = 'label'

In [166]:
# train_dataset = datasets['train']
# test_dataset = datasets['test']

## 3) Data Exploration & Data Cleaning

There are typically several steps required in processing iamges for training a CNN. These include:

* Creating a convolutional layer which applies filters in order to extra the main features of the image. Lines, edges etc.
* Max Pooling to downsample the image and reduce the number of inputs to the CNN. Bear in mind that a simple 24 x 24 image has requires an astounding 784 inputs.
* Flattening to create a vector from the matrix which is ready to be fed to the CNN.

This dataset needs *NO pre-processing*, because it already has all of the above already done to it. And it is therefore ready to be fed to the CNN after the train/test split. 

In [167]:
#train_set

In [168]:
#test_set

In [169]:
#Splitting into X & y (pixel1 to pixel784 & labels') datasets

y_train = train_set.iloc[:,0]
X_train = train_set.iloc[:,1:]
y_test = test_set.iloc[:,0]
X_test = test_set.iloc[:,1:]
#y_train = train_set.loc[:'label']
#y_train = train_set[['label']]

In [170]:
# y_train

In [171]:
X_train

Unnamed: 0,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,pixel9,pixel10,...,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783,pixel784
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,5,0,0,...,0,0,0,30,43,0,0,0,0,0
3,0,0,0,1,2,0,0,0,0,0,...,3,0,0,0,0,1,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
59995,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
59996,0,0,0,0,0,0,0,0,0,0,...,73,0,0,0,0,0,0,0,0,0
59997,0,0,0,0,0,0,0,0,0,0,...,160,162,163,135,94,0,0,0,0,0
59998,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [172]:
X_test

Unnamed: 0,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,pixel9,pixel10,...,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783,pixel784
0,0,0,0,0,0,0,0,9,8,0,...,103,87,56,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,34,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,14,53,99,17,...,0,0,0,0,63,53,31,0,0,0
3,0,0,0,0,0,0,0,0,0,161,...,137,126,140,0,133,224,222,56,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9995,0,0,0,0,0,0,0,0,0,37,...,32,23,14,20,0,0,1,0,0,0
9996,0,0,0,0,0,0,0,0,0,0,...,0,0,0,2,52,23,28,0,0,0
9997,0,0,0,0,0,0,0,0,0,0,...,175,172,172,182,199,222,42,0,1,0
9998,0,1,3,0,0,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0


In [173]:
y_test

0       0
1       1
2       2
3       2
4       3
       ..
9995    0
9996    6
9997    8
9998    8
9999    1
Name: label, Length: 10000, dtype: int64

In [174]:
# X_test

#### Displaying the data shows that the X & y data are indeed split approriately, and ready to be fed.

## 4) Fitting the Neural Nets 

In [183]:
y_train[0]

2

In [184]:
from keras.utils import np_utils
from tensorflow.keras.utils import to_categorical

In [187]:
# num_classes = 10
y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)

In [195]:
# now instead of classes described by an integer between 0-9 we have a vector with a 1 in the (Pythonic) 9th position
y_train[0]

array([0., 0., 1., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)

In [196]:
y_test[0]

array([1., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)

In [197]:
# As before, let's make everything float and scale
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255

In [200]:
model_1 = Sequential()
model_1.add(Dense(1,input_shape = (784,),activation = 'relu'))
model_1.add(Dense(1,activation='relu'))

model_1.summary

<bound method Model.summary of <keras.engine.sequential.Sequential object at 0x0000016316DB4488>>

In [201]:
from tensorflow import optimizers 
from tensorflow.keras.optimizers import Adam, SGD, RMSprop

batch_size = 784

# initiate RMSprop optimizer
#opt = keras.optimizers.RMSprop(lr=0.0005, decay=1e-6)
opt = optimizers.RMSprop(learning_rate=0.0005, decay=1e-6)


# Let's train the model using RMSprop
model_1.compile(SGD(lr = .003), "categorical_crossentropy", metrics=["accuracy"])
run_hist_1 = model_1.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=28)


#model_1.compile(loss='sparse_categorical_crossentropy',
#              optimizer=opt,
#              metrics=['accuracy'])

model_1.fit(X_train, y_train,
              batch_size=batch_size,
              epochs=15,
              validation_data=(X_test, y_test),
              shuffle=True)



  super(SGD, self).__init__(name, **kwargs)


Epoch 1/28


ValueError: in user code:

    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py", line 878, in train_function  *
        return step_function(self, iterator)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py", line 867, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py", line 860, in run_step  **
        outputs = model.train_step(data)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py", line 810, in train_step
        y, y_pred, sample_weight, regularization_losses=self.losses)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\compile_utils.py", line 201, in __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\losses.py", line 141, in __call__
        losses = call_fn(y_true, y_pred)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\losses.py", line 245, in call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\losses.py", line 1665, in categorical_crossentropy
        y_true, y_pred, from_logits=from_logits, axis=axis)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\backend.py", line 4994, in categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)

    ValueError: Shapes (32, 10) and (32, 1) are incompatible


### Model 2
The previous model had the structure:

(with appropriate activation functions and dropouts)

1. Build a more complicated model with the following pattern:
- Conv -> Conv -> MaxPool -> Conv -> Conv -> MaxPool -> (Flatten) -> Dense -> Final Classification

5. Try different structures and run times, and see how accurate your model can be.


In [32]:
# initiate RMSprop optimizer
opt_2 = optimizers.RMSprop(lr=0.0005)

# Let's train the model using RMSprop
model_2.compile(loss='sparse_categorical_crossentropy',
              optimizer=opt_2,
              metrics=['accuracy'])

In [33]:
model_2.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=5,
              validation_data=(x_test, y_test),
              shuffle=True)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x2495fd25fd0>

## 5) Deep Learning model recommended 

Model-2 is recommended because even though it takes longer to run and requires more memory, it is more accurate and more robust.

## 6) Summary Key Findings and Insights

It is possible to achieve an accuracy of   %. This makes the neural network feasible for use in classying images which have been converted from color to black and white, and reduced to a size of just 28x28 pixels. Running on a laptop, traininig of the neural network was achieved in just minutes. This means that an ordinary user could run this for themself, making this neural network very useful for on-the-fly categorization of images. The up-side is that a (very) tech-savvy user can manage images and post updates to a web-site independently, at a moment's notice.   

## 7) Next steps for analyzing this data

Using the full size images and running them through an actual CNN would make for an interesting exercise. It would be great to know what type of classification accuray is achievable with larger images in color, using dense layers. 