In the Deep Learning notes space, we've got [a notebook outlining a potential architecture for a network to predict on the MNIST dataset.](https://napsterinblue.github.io/notes/machine_learning/neural_nets/conv_project_structure/) Here, we'll walk through how to match that implementation in TensorFlow using Keras.

### Overview

Our implementation will look roughly like the following:

- Generate our data
- Build the model layer by layer
- Compile and fit the model
- Evaluate on a held-out test set

### Data

We'll use scikit-learn's digits dataset, which has a much smaller resolution (8x8 images) than MNIST, but the exercise is the same.

In [1]:
from sklearn.datasets import load_digits

data = load_digits()

X = data['images']
y = data['target']

print(X.shape, y.shape)

(1797, 8, 8) (1797,)


Because the network ends in a softmax layer, we want to one-hot encode `y` into 10 class columns rather than keep the raw digit labels.

In [2]:
from sklearn.preprocessing import OneHotEncoder

enc = OneHotEncoder()
sparse = enc.fit_transform(y.reshape(-1, 1))

# toarray() returns a plain ndarray; todense() would return an np.matrix
y = sparse.toarray()
print(y.shape)

(1797, 10)
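
For intuition, the same encoding can be sketched in plain NumPy: indexing an identity matrix by the labels gives one row per sample with a single 1 in its class column, and `argmax` recovers the original labels. The `labels` array below is a made-up example, not data from the notebook.

```python
import numpy as np

# Hypothetical labels standing in for a few entries of `y`
labels = np.array([0, 3, 9, 3])
n_classes = 10

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(n_classes)[labels]

# argmax along the class axis undoes the encoding
recovered = one_hot.argmax(axis=1)

print(one_hot.shape)
print(recovered)
```

This `argmax` round trip is the same trick the evaluation cells use later to turn one-hot rows back into digit labels.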


In [3]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=.7)

# Second split to get dev/test set
X_dev, X_test, y_dev, y_test = train_test_split(X_test, y_test, train_size=.66)



In [4]:
print(X_train.shape, X_dev.shape, X_test.shape)
print(y_train.shape, y_dev.shape, y_test.shape)

(1257, 8, 8) (356, 8, 8) (184, 8, 8)
(1257, 10) (356, 10) (184, 10)
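
The split sizes can be checked by hand. This is a rough sketch that assumes the fractional `train_size` is rounded down and the remainder goes to the other part, which reproduces the shapes printed above.

```python
import math

n_samples = 1797

# First split: 70% for training, the remainder held out
n_train = math.floor(0.7 * n_samples)
n_holdout = n_samples - n_train

# Second split: 66% of the held-out rows become dev, the rest test
n_dev = math.floor(0.66 * n_holdout)
n_test = n_holdout - n_dev

print(n_train, n_dev, n_test)
```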


### Build TensorFlow Graph

Note: Because the resolution is much smaller, we'll comment out the pooling steps. They would just take us from some data to very little data, lol

In [5]:
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.layers import Conv1D, MaxPooling1D

Determine the shape of our inputs

In [6]:
m, X_w, X_h = X.shape
n_y = y.shape[1]

Instantiate the model

In [7]:
model = Sequential()

Layer 1

In [8]:
# Conv1D treats each image row as a step and the 8 pixels in that row as channels
model.add(Conv1D(filters=6, kernel_size=3, activation='relu', input_shape=(X_w, X_h)))

In [9]:
# model.add(MaxPooling1D(pool_size=2))

Layer 2

In [10]:
model.add(Conv1D(filters=6, kernel_size=2, activation='relu'))

In [11]:
# model.add(MaxPooling1D(pool_size=2))

Graduating past Convolution

In [12]:
model.add(Flatten())
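
To see how little data is left after the two convolutions, the shapes can be traced by hand. This sketch assumes the Keras defaults for `Conv1D` (no padding, stride 1); `conv1d_out_len` is a hypothetical helper written for this illustration, not a Keras function.

```python
def conv1d_out_len(steps, kernel_size, stride=1):
    """Output length of a 1D convolution with no padding."""
    return (steps - kernel_size) // stride + 1

# Each 8x8 image enters as 8 steps with 8 channels
steps, channels = 8, 8

# Layer 1: 6 filters, kernel size 3
steps = conv1d_out_len(steps, 3)
params_1 = 3 * channels * 6 + 6  # kernel weights plus one bias per filter
channels = 6

# Layer 2: 6 filters, kernel size 2
steps = conv1d_out_len(steps, 2)
params_2 = 2 * channels * 6 + 6
channels = 6

# Flatten collapses (steps, channels) into a single vector
flat = steps * channels

print(steps, channels, flat)
print(params_1, params_2)
```

With only 30 values reaching the first dense layer, halving the step axis with pooling would leave very little to work with.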

Fully-Connected layers

In [13]:
model.add(Dense(128, activation='relu'))

In [14]:
model.add(Dense(n_y, activation='softmax'))
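
As a reminder of what the final activation does, here is a minimal NumPy sketch of softmax. It turns each row of raw scores into probabilities that sum to 1. The `logits` values are made up for illustration.

```python
import numpy as np

def softmax(logits):
    # Subtract the row max first so the exponentials cannot overflow
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical raw scores for two samples and three classes
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
probs = softmax(logits)

print(probs)
print(probs.sum(axis=1))
```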

Compile model with optimizer and loss function

In [15]:
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
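
For reference, the per-sample quantity behind categorical cross-entropy can be sketched in NumPy as below. This illustrates the formula rather than Keras's actual implementation; the `eps` clipping value and the example arrays are assumptions made for the sketch.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip predictions so log(0) never occurs
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # With one-hot targets, only the true class's log-probability survives
    return -(y_true * np.log(y_pred)).sum(axis=1)

# Hypothetical one-hot targets and predicted probabilities
y_true = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1],
                   [0.5, 0.25, 0.25]])

losses = categorical_crossentropy(y_true, y_pred)
print(losses, losses.mean())
```

The more confident the prediction for the correct class, the smaller the loss, which is why the first sample scores lower than the second.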

Fit it

In [16]:
model.fit(X_train, y_train, batch_size=64, epochs=10, verbose=1,
          validation_data=(X_dev, y_dev))

Train on 1257 samples, validate on 356 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x23584d340f0>
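
A quick bit of arithmetic shows how many weight updates this fit performs. This assumes the final, smaller batch of each epoch is kept rather than dropped.

```python
import math

n_train, batch_size, epochs = 1257, 64, 10

# Number of batches needed to cover the training set once
steps_per_epoch = math.ceil(n_train / batch_size)

# The last batch holds whatever is left over
last_batch = n_train - (steps_per_epoch - 1) * batch_size

total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch, total_updates)
```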

### Evaluating

In [17]:
from sklearn.metrics import confusion_matrix
from sklearn.metrics import precision_score, recall_score, accuracy_score

In [18]:
precision_score(y_test.argmax(axis=1), model.predict(X_test).argmax(axis=1), average='macro')

0.8832403870639165

In [19]:
recall_score(y_test.argmax(axis=1), model.predict(X_test).argmax(axis=1), average='macro')

0.88765537856498944

In [20]:
accuracy_score(y_test.argmax(axis=1), model.predict(X_test).argmax(axis=1))

0.88043478260869568

In [21]:
confusion_matrix(y_test.argmax(axis=1), model.predict(X_test).argmax(axis=1))

array([[17,  0,  0,  0,  1,  0,  1,  0,  0,  0],
       [ 0, 13,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0, 14,  0,  0,  0,  0,  0,  1,  1],
       [ 0,  0,  0, 18,  1,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0, 12,  0,  1,  1,  0,  0],
       [ 0,  0,  0,  0,  1, 20,  0,  0,  0,  2],
       [ 0,  0,  0,  0,  0,  0, 21,  0,  2,  0],
       [ 0,  0,  0,  1,  0,  0,  0, 14,  0,  0],
       [ 0,  1,  1,  1,  0,  4,  1,  0, 14,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  1,  0, 19]], dtype=int64)
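
The three scores above can all be read off the confusion matrix. In scikit-learn's layout, rows are true labels and columns are predictions, so a sketch of the calculation looks like this:

```python
import numpy as np

# Confusion matrix copied from the output above
cm = np.array([
    [17,  0,  0,  0,  1,  0,  1,  0,  0,  0],
    [ 0, 13,  0,  0,  0,  0,  0,  0,  0,  0],
    [ 0,  0, 14,  0,  0,  0,  0,  0,  1,  1],
    [ 0,  0,  0, 18,  1,  0,  0,  0,  0,  0],
    [ 0,  0,  0,  0, 12,  0,  1,  1,  0,  0],
    [ 0,  0,  0,  0,  1, 20,  0,  0,  0,  2],
    [ 0,  0,  0,  0,  0,  0, 21,  0,  2,  0],
    [ 0,  0,  0,  1,  0,  0,  0, 14,  0,  0],
    [ 0,  1,  1,  1,  0,  4,  1,  0, 14,  0],
    [ 0,  0,  0,  0,  0,  0,  0,  1,  0, 19],
])

true_pos = np.diag(cm)

# Accuracy: correct predictions over all predictions
accuracy = true_pos.sum() / cm.sum()

# Recall per class divides by row sums (how many truly belong to the class)
macro_recall = (true_pos / cm.sum(axis=1)).mean()

# Precision per class divides by column sums (how many were predicted as the class)
macro_precision = (true_pos / cm.sum(axis=0)).mean()

print(accuracy, macro_precision, macro_recall)
```

The weakest row is the digit 8, which is mistaken for a 5 four times and pulls the macro recall down.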