# Neural Network models using the CIFAR-10 dataset
The CIFAR-10 dataset, named after the Canadian Institute for Advanced Research, is a collection of 60,000 color images of 32x32 pixels. There are ten classes: airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks. No data augmentation is used, in order to focus on the basic NN algorithms.

In [1]:
import numpy as np
import keras
from keras.datasets import mnist
from keras.layers import Dense, Activation
import tensorflow as tf
from keras.models import Sequential
import numpy.random as npr
from keras.datasets import cifar10
import matplotlib.pyplot as plt
from keras.initializers import RandomUniform
from keras.utils import to_categorical
from skimage.color import rgb2gray

Using TensorFlow backend.


In [2]:
(X_train, Y_train), (X_test, Y_test) = cifar10.load_data()

Below are the dimensions of the training and testing images.

In [3]:
print('Initial example dimensions')
print("X_train shape", X_train.shape)
print("Y_train shape", Y_train.shape)
print("X_test shape", X_test.shape)
print("Y_test shape", Y_test.shape)

Initial example dimensions
X_train shape (50000, 32, 32, 3)
Y_train shape (50000, 1)
X_test shape (10000, 32, 32, 3)
Y_test shape (10000, 1)


## Data Preparation
Image normalization and vectorization

In [4]:
new_X_train = X_train / 255
new_X_test = X_test / 255

new_X_train = new_X_train.reshape(50000,32*32*3)
new_X_test = new_X_test.reshape(10000,32*32*3)
new_Y_train = Y_train.reshape(50000)
new_Y_test = Y_test.reshape(10000)

In [5]:
print("New dimensions")
print("flat_X_train shape:", new_X_train.shape)
print("flat_X_test shape:", new_X_test.shape)
print("new_Y_train shape:", new_Y_train.shape)
print("new_Y_test shape:", new_Y_test.shape)

New dimensions
flat_X_train shape: (50000, 3072)
flat_X_test shape: (10000, 3072)
new_Y_train shape: (50000,)
new_Y_test shape: (10000,)
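
As a small, self-contained illustration of the preparation step above, the sketch below applies the same scaling and flattening to a synthetic batch of random images. The helper name `normalize_and_flatten` and the toy batch are assumptions made for illustration; they are not part of the notebook.

```python
import numpy as np

def normalize_and_flatten(images):
    """Scale uint8 pixel values to [0, 1] and flatten each image to one row."""
    scaled = images.astype(np.float64) / 255
    # -1 lets NumPy infer the flattened length (32*32*3 = 3072 for CIFAR-10)
    return scaled.reshape(images.shape[0], -1)

# Hypothetical stand-in for a CIFAR-10 batch: 4 random 32x32 RGB images
rng = np.random.default_rng(0)
toy_batch = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

flat = normalize_and_flatten(toy_batch)
print(flat.shape)
```

Inferring the row length with `-1` avoids hard-coding the number of examples, so the same helper would work for both the training and the test split.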


## Binary Classification using Linear Regression
In this model, I classify whether an image is a cat (class label 3) or not.

In [None]:
testing = new_X_test
training = new_X_train
labels = np.where(Y_train==3,1,-1)
testlabel = np.where(Y_test==3,1,-1)

In [2]:
opt = keras.optimizers.SGD(lr=.001)
model = Sequential()
model.add(Dense(1, input_shape=(3072,), activation='linear',kernel_initializer=RandomUniform(minval=-.2, maxval=.2, seed=2),bias_initializer=RandomUniform(minval=-.1, maxval=.1)))
model.compile(optimizer=opt, loss='mse', metrics=['accuracy'])

In [None]:
# Pass the test set as validation data so that history records 'val_loss'
history = model.fit(training, labels, epochs=100, batch_size=64, verbose=0,
                    validation_data=(testing, testlabel))
w = model.layers[0].get_weights()
loss_values = history.history['loss']
valloss_values = history.history['val_loss']
epochs = range(1, len(loss_values)+1)

model.evaluate(x=testing,y=testlabel)

The plot below shows the training and validation loss per epoch.

In [None]:
plt.plot(epochs, loss_values, label='Training Loss')
plt.plot(epochs, valloss_values, label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

Let's compare the results with an ordinary least-squares Linear Regression from scikit-learn.

In [18]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, accuracy_score

reg = LinearRegression(fit_intercept=True).fit(training, labels)
y_pred = reg.predict(testing)

# acc_LR = r2_score(testing,y_pred)
acc_LR = accuracy_score(testlabel,y_pred.round(), normalize=True)
print('Acc:',acc_LR)

Acc: 0.8429
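
One thing to keep in mind when scoring a regression model against labels in {-1, 1}: rounding maps any output between -0.5 and 0.5 to 0, which matches neither label. A possible alternative, sketched here on made-up scores, is to threshold at zero. The helpers `scores_to_labels` and `accuracy` are hypothetical, and the numbers below are not taken from the notebook's run.

```python
import numpy as np

def scores_to_labels(scores):
    """Map real-valued regression outputs to -1/+1 by thresholding at zero."""
    return np.where(scores >= 0, 1, -1)

def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    return float(np.mean(y_true == y_pred))

# Made-up regression outputs and their true labels
scores = np.array([-0.9, -0.4, 0.3, 1.2])
y_true = np.array([-1, -1, 1, 1])

acc_rounded = accuracy(y_true, scores.round())                 # -0.4 and 0.3 become 0
acc_thresholded = accuracy(y_true, scores_to_labels(scores))   # every score gets a label
print(acc_rounded, acc_thresholded)
```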


## Binary Classification using Support Vector Machine (SVM)
A linear SVM can be imitated with a single dense unit by combining a hinge loss function with an l2 regularizer on the weights (the same penalty used in Ridge Regression).
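
To make the objective concrete, here is a minimal NumPy sketch of a hinge loss plus an l2 weight penalty, evaluated on a tiny made-up dataset. It assumes a plain linear score `X @ w + b` (the Keras model below additionally applies a softsign activation), and the function name `svm_objective` is hypothetical. The default `l2=0.125` mirrors the `.5/4` used in the notebook.

```python
import numpy as np

def svm_objective(w, b, X, y, l2=0.125):
    """Mean hinge loss on labels in {-1, +1} plus an l2 penalty on the weights."""
    scores = X @ w + b
    hinge = np.maximum(0.0, 1.0 - y * scores).mean()
    penalty = l2 * np.sum(w ** 2)
    return hinge + penalty

# Made-up data: three 2-D points with labels in {-1, +1}
X = np.array([[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
y = np.array([1.0, -1.0, 1.0])
w = np.array([0.5, -0.5])
b = 0.0

objective = svm_objective(w, b, X, y)
print(objective)
```

The first two points sit exactly on the margin and contribute no loss; the third has a score of 0, so it contributes a hinge loss of 1 before averaging.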

In [None]:
testing = new_X_test
training = new_X_train
labels = np.where(Y_train==3,1,-1)
testlabel = np.where(Y_test==3,1,-1)

In [6]:
model_linsvm = Sequential()
opt_linsvm = keras.optimizers.SGD(lr=.0005)
model_linsvm.add(Dense(1, input_shape=(3072,), activation='softsign',kernel_initializer=RandomUniform(minval=-.2, maxval=.2, seed=2),bias_initializer=RandomUniform(minval=-.1, maxval=.1),kernel_regularizer=keras.regularizers.l2(.5/4)))
model_linsvm.compile(optimizer=opt_linsvm, loss='hinge', metrics=['accuracy'])
model_linsvm.fit(training, labels, epochs=10, batch_size=300)

model_linsvm.evaluate(x=testing,y=testlabel)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


[3.719522741699219, 0.8999000191688538]

Let's compare these results with the regular SVM algorithm from scikit-learn.

In [None]:
from sklearn import svm
from sklearn.metrics import accuracy_score
clf = svm.SVC(kernel='poly',verbose=0)
# SVC expects 1-D label arrays, so flatten the (n, 1) label columns
clf.fit(training, labels.ravel())
y_pred = clf.predict(testing)
print('Acc:', accuracy_score(testlabel.ravel(), y_pred))

We can modify this approach by adding a hidden layer with a ReLU activation, which introduces nonlinearity.

In [None]:
model_svm = Sequential()
model_svm.add(Dense(50, input_shape=(3072,), activation='relu'))
model_svm.add(Dense(1, activation='linear',kernel_regularizer=keras.regularizers.l2(.5)))

learning_rate = 0.04
batch_size = np.int32(20/(learning_rate**2))
print('Batch size:',batch_size)

model_svm.compile(optimizer=keras.optimizers.SGD(lr=learning_rate), loss='hinge', metrics=['accuracy'])
model_svm.fit(training, labels, epochs=50, batch_size=batch_size,verbose=0)

## Multi-class Classification problem using SVM
In the multi-class case, the SVM-NN algorithm can learn a nonlinear decision boundary. We can extend our current understanding of the perceptron model by adding additional layers. The first layer holds the input nodes. A hidden layer with a Rectified Linear Unit (ReLU) activation function then introduces nonlinearity, and an l2 regularizer (the penalty used in Ridge Regression) is applied to the weights of the output layer. The hidden-layer outputs are passed to the output layer, which has one node per class, in the manner of a Linear-SVM. We classify an example by selecting the class with the highest value in the output layer.
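
The prediction rule described above can be sketched in a few lines of NumPy: a ReLU hidden layer, a linear output layer with one score per class, and an argmax over those scores. The weights below are hand-picked toy values rather than trained parameters, and `predict_classes` is a hypothetical helper name.

```python
import numpy as np

def relu(z):
    """Rectified Linear Unit: element-wise max(z, 0)."""
    return np.maximum(z, 0.0)

def predict_classes(X, W1, b1, W2, b2):
    """Forward pass through one ReLU hidden layer, then pick the top-scoring class."""
    hidden = relu(X @ W1 + b1)
    scores = hidden @ W2 + b2      # one score per class
    return scores.argmax(axis=1)

# Toy setup: 2 inputs, 2 hidden units, 3 classes
X = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
W1 = np.eye(2)
b1 = np.zeros(2)
W2 = np.array([[1.0, 0.0, -1.0],
               [0.0, 1.0, -1.0]])
b2 = np.array([0.0, 0.0, 0.5])

predictions = predict_classes(X, W1, b1, W2, b2)
print(predictions)
```

The third example has only negative inputs, so ReLU zeroes both hidden units and the class is decided by the output bias alone.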

In [3]:
(X_train, Y_train), (X_test, Y_test) = cifar10.load_data()
new_X_train = rgb2gray(X_train) 
new_X_test = rgb2gray(X_test)

In [3]:
new_X_train = new_X_train.reshape(50000,32*32)
new_X_test = new_X_test.reshape(10000,32*32)
new_Y_train = Y_train.reshape(50000)
new_Y_test = Y_test.reshape(10000)

Multi-class classification requires an additional step: the integer class labels are converted to indicator vectors by a process called *One-Hot encoding*.

In [3]:
encoded_Y_train = to_categorical(Y_train,10)
encoded_Y_test = to_categorical(Y_test,10)
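
For readers unfamiliar with the encoding, the following NumPy sketch shows what it produces for a few labels and how `argmax` recovers the original class. The `one_hot` helper is a hypothetical stand-in written for illustration, not the Keras implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1 at the class index."""
    labels = np.asarray(labels).reshape(-1)   # accept (n,) or (n, 1) label arrays
    return np.eye(num_classes, dtype=np.float32)[labels]

# Labels shaped (n, 1), like Y_train in this notebook
y = np.array([[3], [0], [9]])
encoded = one_hot(y, 10)
decoded = encoded.argmax(axis=1)   # invert the encoding

print(encoded.shape)
print(decoded)
```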

In [3]:
epochs = 150
opt = keras.optimizers.SGD(lr=.1,momentum=0.01,nesterov=True,decay = .001/ epochs)

model = Sequential()
model.add(Dense(15, input_shape=(1024,), activation='relu',kernel_initializer=RandomUniform(minval=-.2, maxval=.2, seed=2),bias_initializer=RandomUniform(minval=-.1, maxval=.1)))
model.add(Dense(10, kernel_regularizer=keras.regularizers.l2(l=.125/8), activation='softmax'))

model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
training = new_X_train
testing = new_X_test
labels = encoded_Y_train
testlabel = encoded_Y_test

# Pass the test set as validation data so that history records 'val_loss'
history = model.fit(training, labels, epochs=100, batch_size=100,
                    validation_data=(testing, testlabel))
w = model.layers[0].get_weights()
loss_values = history.history['loss']
valloss_values = history.history['val_loss']
epochs = range(1, len(loss_values)+1)

model.evaluate(x=testing,y=testlabel)

Epoch 1/100


The plot below shows the training and validation loss per epoch.

In [None]:
plt.plot(epochs, loss_values, label='Training Loss')
plt.plot(epochs, valloss_values, label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

The cell below applies the regular SVM algorithm to the same data.

In [None]:
from sklearn import svm
from sklearn.metrics import accuracy_score
clf = svm.SVC(kernel='poly',verbose=0)
# SVC expects 1-D integer class labels, not one-hot vectors
clf.fit(training, new_Y_train)
y_pred = clf.predict(testing)
print('Acc:', accuracy_score(new_Y_test, y_pred))

Note: This is computationally heavy, so I did not run this cell.

The general SVM algorithm classifies examples very well. However, it is computationally heavy for large datasets. A Neural Network that imitates an SVM can reduce the computational cost while keeping the same underlying technique.