## Classification on 8 celebrities

In this notebook we try to classify images of eight celebrities: "Miranda_Cosgrove", "Chris_Martin", "Emma_Stone", "Jamie_Foxx", "Steve_Jobs", "Zac_Efron", "Sandra_Oh" and "Taryn_Manning". There are 350 images of each celebrity: we use 250 for training, 50 for validation and 50 for testing.

In [None]:
import gzip
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as imgplot
import time
%matplotlib inline
import h5py

We read in the data, which is already split into a training, a validation and a test set. The class labels run from 0 to 7. Every image is 48x48 pixels with 3 channels (RGB). All 3 sets are balanced.

In [None]:
h5f_X = h5py.File('C:/Users/Elvis/Dropbox/DAS_DL_shared/Excercises/8_faces_no_cut/Data_8_faces_no_cut.hdf5', 'r')
print(list(h5f_X.keys()))
X_train = h5f_X['X_train_8_faces']
print(X_train.shape)
Y_train = h5f_X['Y_train_8_faces']
print(Y_train.shape)
X_valid = h5f_X['X_valid_8_faces']
print(X_valid.shape)
Y_valid = h5f_X['Y_valid_8_faces']
print(Y_valid.shape)

In [None]:
plt.hist(Y_train,bins=8)

In [None]:
plt.hist(Y_valid,bins=8)
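
The histograms give a visual impression of the class balance. A numeric check is also possible. The sketch below does not touch the real data: `labels` is a synthetic stand-in for `Y_train`, built to match the 250 training images per class stated above, and `np.bincount` counts how often each class occurs.

```python
import numpy as np

# Hypothetical stand-in for Y_train: 8 classes with 250 labels each, shuffled
rng = np.random.default_rng(0)
labels = rng.permutation(np.repeat(np.arange(8), 250))

# Count how often each class label occurs
counts = np.bincount(labels, minlength=8)
print(counts)

# A balanced set has the same count for every class
is_balanced = np.all(counts == counts[0])
print(is_balanced)
```

On the real data, something like `np.bincount(np.asarray(Y_train).astype(int))` should behave the same way, assuming the labels are stored as a 1-D array of integers from 0 to 7.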

How hard is this task? Let's plot a random image from the training set to get an impression of the images and of the task.

In [None]:
rmd=np.random.randint(0,len(X_train))
plt.imshow(np.asarray(X_train[rmd],dtype="uint8"),interpolation="bicubic")

#### Normalization of the training and validation set

In [None]:
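# Per-pixel mean and standard deviation, computed on the training set only
X_mean = np.mean( X_train, axis = 0)
X_std = np.std( X_train, axis = 0)

# The small constant avoids a division by zero for pixels that never vary
X_train = (X_train - X_mean ) / (X_std + 0.0001)
X_valid = (X_valid - X_mean ) / (X_std + 0.0001)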
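
To see what this standardization does, here is a minimal sketch on synthetic data. The arrays `train` and `valid` are random stand-ins with the same layout as the images (N x 48 x 48 x 3), not the real faces. The point is that the statistics come from the training set only and are then reused for the validation set.

```python
import numpy as np

# Synthetic stand-ins for the image arrays (values 0..255, shape N x 48 x 48 x 3)
rng = np.random.default_rng(0)
train = rng.integers(0, 256, size=(100, 48, 48, 3)).astype("float64")
valid = rng.integers(0, 256, size=(20, 48, 48, 3)).astype("float64")

# Statistics come from the training set only, one value per pixel and channel
mean = train.mean(axis=0)
std = train.std(axis=0)
eps = 0.0001  # guards against division by zero for constant pixels

train_norm = (train - mean) / (std + eps)
valid_norm = (valid - mean) / (std + eps)  # reuse the training statistics

print(mean.shape)                              # one mean per pixel and channel
print(np.abs(train_norm.mean(axis=0)).max())   # close to 0
print(train_norm.std(axis=0).mean())           # close to 1
```

The validation set will generally not have exactly zero mean and unit variance after this step, which is expected: it is scaled with the training statistics so that both sets pass through the same transformation.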

#### Flatten the images into vectors, because we only use fully connected layers in this model

In [None]:
print(X_train.shape)
# Flatten each 48x48x3 image into a vector of length 6912
X_newtrain = np.reshape(X_train, (len(X_train), 48*48*3))
X_newvalid = np.reshape(X_valid, (len(X_valid), 48*48*3))

In [None]:
X_newtrain.shape

In [None]:
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from keras import backend as K
from keras.layers.normalization import BatchNormalization

Define the `convertToOneHot` function and convert the labels to one-hot encoding.

In [None]:
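def convertToOneHot(vector, num_classes=None):
    # Turn integer labels (0..num_classes-1) into one-hot rows
    vector = np.asarray(vector, dtype='int64')
    if num_classes is None:
        num_classes = vector.max() + 1
    result = np.zeros((len(vector), num_classes), dtype='int32')
    result[np.arange(len(vector)), vector] = 1
    return result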

In [None]:
Y_train=convertToOneHot(Y_train,num_classes=8)
Y_valid=convertToOneHot(Y_valid,num_classes=8)
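
One-hot encoding replaces each integer label by a vector that is 1 at the label's index and 0 elsewhere. As a small illustration, the sketch below builds the same encoding by indexing an identity matrix and then recovers the labels with `argmax`. The label values are made up for the example and are not taken from the data set.

```python
import numpy as np

labels = np.array([0, 3, 7, 3])   # example labels, not taken from the data set
num_classes = 8

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype="int32")[labels]
print(one_hot)

# argmax along each row recovers the original integer labels
recovered = np.argmax(one_hot, axis=1)
print(recovered)
```

The `argmax` step is the same operation used later to turn the network's predicted probabilities back into class labels.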

In [None]:
batch_size = 128
nb_classes = 8
nb_epoch = 50

In [None]:
print(Y_train[1])
print(Y_valid[1])

In [None]:
print(X_newtrain.shape)
print(X_newvalid.shape)

### Define the network

In [None]:
model = Sequential()
name = 'only_fc'

# Your code here: build the network



# End of your code

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

In [None]:
#for part b:
model = Sequential()
name = 'fc_with_hidden'

# Your code here: build the network



# End of your code

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

In [None]:
model.summary()
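
`model.summary()` lists the number of trainable parameters per layer. For fully connected layers this can be checked by hand: a layer with `n_in` inputs and `n_out` outputs has `n_in * n_out` weights plus `n_out` biases. The sketch below applies this to the flattened input and the 8 output classes. The helper `dense_params` and the hidden layer width of 500 are made up for illustration; they are not the intended solution of the exercise.

```python
def dense_params(n_in, n_out):
    # Weights (n_in * n_out) plus one bias per output unit
    return n_in * n_out + n_out

n_inputs = 48 * 48 * 3   # length of a flattened image
n_classes = 8

# Network without a hidden layer: input -> output
no_hidden = dense_params(n_inputs, n_classes)

# Network with one hidden layer; 500 units is a hypothetical choice
n_hidden = 500
with_hidden = dense_params(n_inputs, n_hidden) + dense_params(n_hidden, n_classes)

print(no_hidden)
print(with_hidden)
```

Almost all parameters sit in the first layer, because every one of the 6912 input values is connected to every unit.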

### Running a forward pass of the untrained network

In [None]:
model.evaluate(X_newtrain,Y_train)

In [None]:
# Loss of a uniform guess over 8 classes, a reference value for the untrained network
-np.log(1/8)
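
The value `-np.log(1/8)` is the categorical cross-entropy of a model that assigns probability 1/8 to each of the 8 classes. The sketch below works this out for a single example with a one-hot target. A freshly initialized network often scores in this vicinity, although random weights can also push the loss higher.

```python
import numpy as np

num_classes = 8

# One-hot target for an arbitrary example (class 2)
y_true = np.zeros(num_classes)
y_true[2] = 1.0

# An uninformed model assigns the same probability to every class
y_pred = np.full(num_classes, 1.0 / num_classes)

# Categorical cross-entropy: -sum(y_true * log(y_pred))
loss = -np.sum(y_true * np.log(y_pred))
print(loss)
```

A trained network should end up clearly below this value on the training set; a loss that stays near it suggests the model has not learned anything beyond guessing.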

In [None]:
model.predict(X_newtrain[0].reshape(1,48*48*3))

In [None]:
tensorboard = keras.callbacks.TensorBoard(
        log_dir='tensorboard/8_faces/' + name + '/', 
        write_graph=True,
        histogram_freq=1)

### Training the network

In [None]:
history=model.fit(X_newtrain, Y_train, 
                  batch_size=batch_size, 
                  nb_epoch=30,
                  verbose=1, 
                  validation_data=(X_newvalid, Y_valid),
                  callbacks=[tensorboard])

In [None]:
# summarize history for accuracy
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'valid'], loc='lower right')
plt.show()

In [None]:
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'valid'], loc='upper right')
plt.show()

### Evaluation of the trained network

In [None]:
X_test = h5f_X['X_test_8_faces']
print(X_test.shape)
Y_test = h5f_X['Y_test_8_faces']
print(Y_test.shape)

In [None]:
X_test = (X_test - X_mean ) / (X_std + 0.0001)

In [None]:
# Flatten each 48x48x3 test image into a vector of length 6912
X_newtest = np.reshape(X_test, (len(X_test), 48*48*3))

In [None]:
model.predict(X_newtest[0].reshape(1,48*48*3))

In [None]:
# Predict the class probabilities for the whole test set in one call
preds = model.predict(X_newtest)

In [None]:
# Predicted class = index of the highest probability in each row
pred = np.argmax(preds, axis=1)

In [None]:
# Test accuracy: fraction of test images whose predicted class matches the label
np.mean(pred == np.asarray(Y_test))

In [None]:
from sklearn.metrics import confusion_matrix
confusion_matrix(Y_test, pred)
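
In the matrix returned by `confusion_matrix`, rows correspond to the true class and columns to the predicted class, so correct predictions sit on the diagonal. As a minimal sketch of how to read it, the example below builds such a matrix with plain NumPy for made-up labels from 3 classes and derives the overall accuracy and the per-class recall from it.

```python
import numpy as np

# Made-up true and predicted labels for 3 classes, for illustration only
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2])
n_classes = 3

# cm[i, j] counts samples of true class i that were predicted as class j
cm = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(cm, (y_true, y_pred), 1)
print(cm)

# Overall accuracy: correct predictions (diagonal) over all samples
accuracy = np.trace(cm) / cm.sum()

# Per-class recall: diagonal divided by the row sums
recall = np.diag(cm) / cm.sum(axis=1)
print(accuracy, recall)
```

Applied to the 8x8 matrix of the celebrity data, the off-diagonal entries show which pairs of people the network confuses most often.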