# MNIST subset with vanilla network

With a 10k subsample of the 60k training images, a vanilla network gets roughly what a random forest (RF) gets.

**Colab:** GitHub can't seem to display notebooks, so open this one in Colab instead:

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/parrt/playdl/blob/master/mnist/notebooks/mnist-vanilla.ipynb)


## Setup
Enable the Jupyter widget extensions to see progress bars:

```
$ jupyter nbextension enable --py widgetsnbextension
$ jupyter labextension install @jupyter-widgets/jupyterlab-manager
```

In [47]:
!pip install -q --no-deps tensorflow-addons~=0.7
!pip install -q "tqdm>=4.36.1"

## MNIST Images

In [48]:
import tensorflow as tf
from keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()

### Make sure to shuffle

Shuffle before taking the subsample, or else we get mostly 0s, 1s, 2s, etc.

In [49]:
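import numpy as np

# Shuffle each set with its own permutation: no image is repeated, and the two sets differ in size
train_idx = np.random.permutation(len(X_train))
X_train = X_train[train_idx,:,:]
y_train = y_train[train_idx]
test_idx = np.random.permutation(len(X_test))
X_test = X_test[test_idx,:,:]
y_test = y_test[test_idx]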

In [50]:
sub = 10_000
X_train = X_train[:sub,:,:]
X_test = X_test[:sub,:,:]
y_train = y_train[:sub]
y_test = y_test[:sub]

n, w, h = X_train.shape

print(f"Using {n} images")

Using 10000 images
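
As a rough sketch of the shuffle-then-slice idea, the helper here draws a permutation of the row indices, so every image appears at most once, and keeps the first `k`. The name `shuffled_subsample`, the `seed` parameter, and the toy label-sorted data are assumptions made for illustration; they are not part of the notebook.

```python
import numpy as np

def shuffled_subsample(X, y, k, seed=None):
    """Return k rows of X and y chosen uniformly at random, without repeats."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))[:k]   # a permutation never repeats an index
    return X[idx], y[idx]

# Toy stand-in for a label-sorted dataset: 100 "images" for each digit 0..9
y_sorted = np.repeat(np.arange(10), 100)
X_sorted = np.arange(1000).reshape(1000, 1, 1)

X_sub, y_sub = shuffled_subsample(X_sorted, y_sorted, k=200, seed=0)

head_counts = np.bincount(y_sorted[:200], minlength=10)  # naive slice: only 0s and 1s
sub_counts = np.bincount(y_sub, minlength=10)            # shuffled: a mix of digits
print("first 200 rows:", head_counts)
print("shuffled 200:  ", sub_counts)
```

Printing the per-digit counts is a cheap sanity check that a subsample covers all classes before training on it.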


## RandomForestClassifier

In [51]:
from sklearn.ensemble import RandomForestClassifier
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score, \
    accuracy_score, confusion_matrix
import matplotlib.pyplot as plt

In [52]:
rf = RandomForestClassifier(n_estimators=100,
                            min_samples_leaf=1,
                            oob_score=True, n_jobs=-1)
rf.fit(X_train.reshape(n,-1), y_train)
print(rf.oob_score_)

y_pred = rf.predict(X_test.reshape(n, -1))
conf = confusion_matrix(y_test, y_pred)
print(conf)
print("test accuracy", accuracy_score(y_test, y_pred))

0.9774
[[ 930    0    3    0    0    4    7    1    4    0]
 [   0 1142    8    2    0    2    0    0    3    0]
 [  17    0  994    6    7    1   13   13   13    1]
 [   3    0   24  921    2   18    1   19   19    6]
 [   0    0    3    0  943    2   14    0    8   40]
 [   9    4    5   23    2  815    7    3   16    4]
 [  14    5    0    1   13    6  931    0    6    0]
 [   2    8   30    4    3    0    0  919    5   21]
 [   6    2    8   15    3   11    8    4  899   15]
 [   3    4    1   18   24    5    1    0   10  913]]
test accuracy 0.9407
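
To make the matrix easier to read, here is a small sketch that recomputes the overall accuracy from it and pulls out per-digit recall and the most common confusion. The matrix is copied from the output above; the variable names (`recall`, `errors`, and so on) are only illustrative. It assumes scikit-learn's convention that rows are true labels and columns are predictions.

```python
import numpy as np

# Confusion matrix copied from the RF output (rows: true digit, columns: predicted digit)
conf = np.array([
    [930,    0,   3,   0,   0,   4,   7,   1,   4,   0],
    [  0, 1142,   8,   2,   0,   2,   0,   0,   3,   0],
    [ 17,    0, 994,   6,   7,   1,  13,  13,  13,   1],
    [  3,    0,  24, 921,   2,  18,   1,  19,  19,   6],
    [  0,    0,   3,   0, 943,   2,  14,   0,   8,  40],
    [  9,    4,   5,  23,   2, 815,   7,   3,  16,   4],
    [ 14,    5,   0,   1,  13,   6, 931,   0,   6,   0],
    [  2,    8,  30,   4,   3,   0,   0, 919,   5,  21],
    [  6,    2,   8,  15,   3,  11,   8,   4, 899,  15],
    [  3,    4,   1,  18,  24,   5,   1,   0,  10, 913],
])

# Correct predictions sit on the diagonal
accuracy = np.trace(conf) / conf.sum()

# Recall per digit: correct predictions divided by the number of true examples (row sum)
recall = np.diag(conf) / conf.sum(axis=1)

# Zero the diagonal to look only at mistakes, then find the largest cell
errors = conf.copy()
np.fill_diagonal(errors, 0)
true_digit, pred_digit = np.unravel_index(np.argmax(errors), errors.shape)

print(f"accuracy {accuracy:.4f}")
print("recall per digit", np.round(recall, 3))
print(f"most common mistake: true {true_digit} predicted as {pred_digit} "
      f"({errors[true_digit, pred_digit]} times)")
```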


Conclusion: an RF with only 100 trees does a quick, easy job on this 10k-image subsample. It lands in the same ballpark as deep learning with a vanilla net, but accuracies of .94 and .963 are very far apart if you're trying to win a competition.
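
One way to see why those two accuracies are far apart is to compare error rates instead. The sketch below does that arithmetic; it assumes 0.963 is the vanilla network's validation accuracy on the same 10,000 test images, which the text implies but does not state outright.

```python
rf_acc = 0.9407    # RF test accuracy from the output above
net_acc = 0.963    # assumed: vanilla network accuracy quoted in the text
n_test = 10_000

rf_err = 1 - rf_acc
net_err = 1 - net_acc

# Misclassified images out of the 10,000 test images
rf_wrong = round(rf_err * n_test)
net_wrong = round(net_err * n_test)

# Fraction of the RF's mistakes that the network avoids
relative_reduction = (rf_err - net_err) / rf_err

print(f"RF misclassifies {rf_wrong}, network misclassifies {net_wrong}")
print(f"relative error reduction: {relative_reduction:.1%}")
```

A gap of about 2.2 accuracy points corresponds to removing more than a third of the RF's errors.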

## Vanilla network: two layers of 512 neurons, softmax on the end

In [53]:
import tensorflow_addons as tfa
from keras.datasets import mnist
from tensorflow.keras import models, layers, callbacks, optimizers
import tqdm
from tqdm.keras import TqdmCallback

### Don't forget to normalize data for DL

In [54]:
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
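
The cast to `float32` has to come before the division. A minimal sketch on a toy array (the `pixels` array is a made-up stand-in for image data) shows the scaled range, and that NumPy refuses to divide a `uint8` array in place:

```python
import numpy as np

pixels = np.array([[0, 128, 255]], dtype=np.uint8)  # toy stand-in for image data

# Cast first, then scale pixel values from [0, 255] to [0, 1]
scaled = pixels.astype('float32') / 255
print(scaled.dtype, scaled.min(), scaled.max())

# In-place division on the raw uint8 array is rejected by NumPy's casting rules
try:
    pixels /= 255
    in_place_ok = True
except TypeError:
    in_place_ok = False
print("in-place divide on uint8 worked:", in_place_ok)
```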

In [59]:
num_classes = 10
layer1 = 512
layer2 = 512
batch_size = 2000
dropout = 0.2

model = models.Sequential()
model.add(layers.Dense(layer1, input_dim=w*h, activation='relu'))
model.add(layers.BatchNormalization())
model.add(layers.Dropout(dropout))

model.add(layers.Dense(layer2, activation='relu'))
model.add(layers.BatchNormalization())
model.add(layers.Dropout(dropout))

model.add(layers.Dense(num_classes, activation='softmax'))
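
For a sense of the model's size, the sketch below counts its parameters by hand. The helper names `dense_params` and `batchnorm_params` are made up for illustration, and the count assumes Keras defaults: each `Dense` layer has a bias, and each `BatchNormalization` layer keeps four values per feature (trainable gamma and beta, plus a non-trainable moving mean and variance). `Dropout` has no parameters.

```python
def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out

def batchnorm_params(n_features):
    """Return (trainable, non_trainable): gamma and beta vs. moving mean and variance."""
    return 2 * n_features, 2 * n_features

w = h = 28                     # MNIST image size
layer1 = layer2 = 512
num_classes = 10

bn1_train, bn1_fixed = batchnorm_params(layer1)
bn2_train, bn2_fixed = batchnorm_params(layer2)

trainable = (dense_params(w * h, layer1) + bn1_train
             + dense_params(layer1, layer2) + bn2_train
             + dense_params(layer2, num_classes))
non_trainable = bn1_fixed + bn2_fixed
total = trainable + non_trainable

print(f"total {total:,}, trainable {trainable:,}, non-trainable {non_trainable:,}")
```

Calling `model.summary()` on the notebook's model should report comparable figures.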

In [None]:
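learning_rate = 0.15 # only used by the commented-out Adam line and the plot title
# opt = optimizers.Adam(learning_rate=learning_rate)
opt = optimizers.RMSprop() # this one seems a bit better; it runs with its default learning rate, not learning_rate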

model.compile(loss=tf.keras.losses.sparse_categorical_crossentropy, optimizer=opt, metrics=['accuracy'])

# callback = callbacks.EarlyStopping(monitor='val_loss', patience=10)
history = model.fit(X_train.reshape(n,w*h), y_train,
                    shuffle=True,
                    epochs=125,
                    validation_data=(X_test.reshape(n,w*h), y_test),
                    batch_size=batch_size,
                    verbose=0,
                    callbacks=[tfa.callbacks.TQDMProgressBar(show_epoch_progress=False)]
                    )
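
With such a large batch size, each epoch performs only a handful of weight updates. A quick back-of-the-envelope sketch, using the values from this notebook and assuming Keras keeps a final partial batch (which does not arise here, since 10,000 divides evenly by 2,000):

```python
import math

n = 10_000          # training images in the subsample
batch_size = 2000
epochs = 125

steps_per_epoch = math.ceil(n / batch_size)   # gradient updates per epoch
total_updates = steps_per_epoch * epochs

print(f"{steps_per_epoch} updates per epoch, {total_updates} updates in total")
```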


In [None]:
y_pred = model.predict(X_test.reshape(n,w*h))
y_pred = np.argmax(y_pred, axis=1)
val_accur = accuracy_score(y_test, y_pred)
print("Keras validation accuracy", val_accur)

conf = confusion_matrix(y_test, y_pred)
print(conf)
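
The softmax layer returns one probability per class, so `np.argmax` along axis 1 picks the most likely class for each image. A tiny sketch with made-up probabilities for three samples and three classes:

```python
import numpy as np

# Hypothetical softmax outputs: one row per sample, one column per class
probs = np.array([[0.05, 0.90, 0.05],
                  [0.20, 0.30, 0.50],
                  [0.70, 0.20, 0.10]])

labels = np.argmax(probs, axis=1)      # index of the largest probability in each row
y_true = np.array([1, 2, 1])           # made-up true labels
accuracy = np.mean(labels == y_true)   # fraction of rows predicted correctly

print(labels, accuracy)
```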

In [None]:
plt.ylabel("Accuracy")
plt.xlabel("Epochs")

accur = history.history['accuracy']
plt.plot(accur, label='train_accuracy')
val_accur = history.history['val_accuracy']
plt.plot(val_accur, label='valid_accuracy')
plt.title(f"batch_size {batch_size}, Layers {layer1,layer2}\ntrain {accur[-1]:.3f}, valid {val_accur[-1]:.3f}, learning_rate={learning_rate:.2f}\ndropout {dropout:.2f}")
plt.xlim(0, 200)
plt.ylim(0.5, 1.02)
plt.legend(loc='lower right')
plt.show()