## Street View House Number Recognition - MLP

Moving on from the MNIST data, we will look at recognizing house numbers. This is more complicated than recognizing single digits because house numbers can have a varying number of digits.

The dataset we will use is the [Street View House Numbers (SVHN) Dataset](http://ufldl.stanford.edu/housenumbers/), which contains house numbers obtained from Google Street View images.

Download the MNIST-like 32-by-32 images and place them in the `data` folder: [train_32x32.mat](http://ufldl.stanford.edu/housenumbers/train_32x32.mat), [test_32x32.mat](http://ufldl.stanford.edu/housenumbers/test_32x32.mat), [extra_32x32.mat](http://ufldl.stanford.edu/housenumbers/extra_32x32.mat). In this format each image is cropped around a single digit, although neighbouring digits often remain visible at the sides.

Let's visualise a randomly chosen training image together with its label.

In [None]:
%matplotlib inline
import random
import matplotlib.pyplot as plt
import scipy.io

# Each .mat file holds 'X' with shape (32, 32, 3, N) and 'y' with shape (N, 1)
training_data = scipy.io.loadmat('../data/train_32x32.mat')
test_data = scipy.io.loadmat('../data/test_32x32.mat')

X_train = training_data['X']
y_train = training_data['y']
X_test = test_data['X']
y_test = test_data['y']

# Show one randomly chosen training image with its label as the title
training_index = random.randrange(X_train.shape[3])
plt.figure()
plt.imshow(X_train[:, :, :, training_index])
plt.xticks([])
plt.yticks([])
plt.title(int(y_train[training_index, 0]));

We first attempt to classify the data using an MLP. As input we take only the first of the three colour channels and flatten each 32-by-32 image into a vector of 1024 values.

In [None]:
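# X has shape (32, 32, 3, N): keep channel 0 and flatten each image to 1024 values
X_train_mlp = X_train[:, :, 0].reshape(32 * 32, -1)
X_test_mlp = X_test[:, :, 0].reshape(32 * 32, -1)
X_train_mlp = X_train_mlp.astype('float32')
X_test_mlp = X_test_mlp.astype('float32')

# Scale pixel values from [0, 255] to [0, 1]
X_train_mlp /= 255
X_test_mlp /= 255

# Keras expects one sample per row, so transpose to shape (N, 1024)
X_train_mlp = X_train_mlp.T
X_test_mlp = X_test_mlp.T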
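
The reshaping above can be checked on a small stand-in array without loading the dataset. The following is a minimal sketch: `flatten_first_channel` and the random `dummy_images` array are hypothetical names introduced for illustration, and the array only mimics the `(32, 32, 3, N)` layout of `X_train`.

```python
import numpy as np

def flatten_first_channel(images):
    """Turn a (32, 32, 3, N) uint8 array into an (N, 1024) float32 array in [0, 1]."""
    single_channel = images[:, :, 0]             # index 0 on axis 2 -> (32, 32, N)
    flat = single_channel.reshape(32 * 32, -1)   # merge the two pixel axes -> (1024, N)
    return (flat.astype('float32') / 255).T      # scale, then one sample per row

# Hypothetical stand-in for the SVHN array: 5 random RGB images
rng = np.random.default_rng(0)
dummy_images = rng.integers(0, 256, size=(32, 32, 3, 5), dtype=np.uint8)
dummy_flat = flatten_first_channel(dummy_images)

print(dummy_flat.shape)   # (5, 1024)
# Row i should hold the first channel of image i, read row by row
print(np.allclose(dummy_flat[2], dummy_images[:, :, 0, 2].reshape(-1) / 255))
```

Note that `images[:, :, 0]` on a four-dimensional array selects along the channel axis, not the sample axis, which is why the result still contains all `N` images.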

In [None]:
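from keras.utils import to_categorical

# SVHN labels run from 1 to 10, with 10 standing for the digit 0.
# Map 10 -> 0 so that the one-hot vectors have exactly 10 columns.
Y_train = to_categorical(y_train.ravel() % 10, num_classes=10)
Y_test = to_categorical(y_test.ravel() % 10, num_classes=10)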
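
The SVHN labels run from 1 to 10, where 10 denotes the digit 0. Passing them unchanged to `to_categorical` would therefore produce 11 columns, with column 0 never used. To make the encoding explicit, here is a minimal NumPy sketch of a one-hot encoding that maps 10 back to 0. The helper `one_hot_svhn` is a hypothetical name for illustration and is not part of Keras.

```python
import numpy as np

def one_hot_svhn(labels, num_classes=10):
    """One-hot encode SVHN labels, mapping the label 10 to class 0."""
    digits = np.asarray(labels).ravel() % 10        # 10 -> 0, 1..9 unchanged
    encoded = np.zeros((digits.size, num_classes), dtype='float32')
    encoded[np.arange(digits.size), digits] = 1.0   # set one entry per row
    return encoded

example_labels = np.array([[1], [10], [9]])         # same (N, 1) layout as y_train
example_one_hot = one_hot_svhn(example_labels)
print(example_one_hot.shape)            # (3, 10)
print(example_one_hot.argmax(axis=1))   # [1 0 9]
```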



We now create the MLP using Keras: two hidden layers of 256 ReLU units, each followed by dropout, and a softmax output layer.

In [None]:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import RMSprop

model = Sequential()
model.add(Dense(256, input_shape=(X_train_mlp.shape[1],)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(Y_train.shape[1]))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])

    

In [None]:
model.summary()
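
The parameter counts reported by `model.summary()` can be reproduced by hand: a dense layer with `n_inputs` inputs and `n_units` units has `n_inputs * n_units` weights plus `n_units` biases, while the activation and dropout layers add none. A minimal sketch, assuming 1024 input features and 10 output classes (the actual output size follows `Y_train.shape[1]`); `dense_params` and `mlp_params` are hypothetical helper names:

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units

def mlp_params(n_features, hidden_units, n_classes):
    """Total trainable parameters of an MLP with the given layer sizes."""
    sizes = [n_features, *hidden_units, n_classes]
    return sum(dense_params(a, b) for a, b in zip(sizes[:-1], sizes[1:]))

print(dense_params(32 * 32, 256))            # 262400
print(dense_params(256, 256))                # 65792
print(mlp_params(32 * 32, [256, 256], 10))   # 330762
```

Almost 80% of the parameters sit in the first layer, because every one of the 1024 pixels connects to every hidden unit.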

In [None]:
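batch_size = 32
num_epoch = 10
# Note: the test set doubles as validation data here, so it should not
# also be used to tune the model.
history = model.fit(X_train_mlp,
                    Y_train,
                    batch_size=batch_size,
                    epochs=num_epoch,
                    verbose=1,
                    validation_data=(X_test_mlp, Y_test))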

In [None]:
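# Older Keras versions store the metric as 'acc', newer ones as 'accuracy'
acc_key = 'acc' if 'acc' in history.history else 'accuracy'
plt.plot(history.history[acc_key])
plt.plot(history.history['val_' + acc_key])
plt.title('Training Metrics')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Val'], loc='upper left');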
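
For reference, the accuracy plotted above is the fraction of images whose highest-probability class matches the label. A minimal NumPy sketch of that computation on made-up softmax outputs follows; `categorical_accuracy`, `targets`, and `probs` are illustrative names and values, not results from the model.

```python
import numpy as np

def categorical_accuracy(one_hot_targets, predicted_probs):
    """Fraction of rows whose highest-probability class matches the target."""
    hits = one_hot_targets.argmax(axis=1) == predicted_probs.argmax(axis=1)
    return float(np.mean(hits))

# Made-up example with 3 classes and 4 samples
targets = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.5, 0.3, 0.2],   # wrong: predicts class 0, target is class 2
                  [0.2, 0.6, 0.2]])
print(categorical_accuracy(targets, probs))   # 0.75
```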

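The MLP discards the two-dimensional layout of the pixels as soon as each image is flattened. Convolutional neural networks exploit that layout and can improve considerably on these results. We will consider them in the coming weeks.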