# Deep Learning Exercise with MNIST and Keras

## Introduction

This exercise complements the topics learned in the prerequisite tutorial [Deep Learning from Pre-Trained Models with Keras](keras-transfer-learning-tutorial.ipynb).

This exercise aims to prepare participants for the HPC Saudi 2020 Student AI Competition.  Participants will:

* practice what they've learned from the prerequisite *Deep Learning from Pre-Trained Models with Keras* tutorial,
* create their own CNN-based image classifier for the MNIST digits dataset,
* and finally, submit the classification results from their model to Kaggle for evaluation.

Participants are expected to bring their own laptops and sign up for free online cloud services (e.g., Google Colab, Kaggle).  They may also need to download free, open-source software prior to arriving for the workshop.

* Tutorial materials are derived from:
  * [PyTorch Tutorials](https://github.com/kaust-vislab/pytorch-tutorials) by David Pugh.

## Setup

### Create a Kaggle Account

#### 1. Register for an account

In order to download Kaggle competition data you will first need to create a [Kaggle](https://www.kaggle.com/) account.

#### 2. Create an API key

Once you have registered for a Kaggle account you will need to create some [API credentials](https://github.com/Kaggle/kaggle-api#api-credentials) in order to be able to use the `kaggle` CLI to download data.
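
As an alternative to exporting environment variables in every cell, the `kaggle` CLI can also read credentials from a `kaggle.json` file, normally stored at `~/.kaggle/kaggle.json`. The sketch below shows one way to write such a file. The helper name `write_kaggle_credentials` and its `config_dir` parameter are hypothetical, and the example writes to a temporary directory so that it does not touch a real configuration.

```python
import json
import pathlib
import stat
import tempfile

def write_kaggle_credentials(username, key, config_dir):
    """Write a kaggle.json credentials file into config_dir and return its path."""
    config_dir = pathlib.Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)
    path = config_dir / "kaggle.json"
    path.write_text(json.dumps({"username": username, "key": key}))
    # Restrict the file to the current user, since it holds a secret.
    path.chmod(stat.S_IRUSR | stat.S_IWUSR)
    return path

# Demonstration in a throwaway directory (placeholder credentials).
with tempfile.TemporaryDirectory() as tmp_dir:
    credentials_path = write_kaggle_credentials("YOUR_USERNAME", "YOUR_API_KEY", tmp_dir)
    stored = json.loads(credentials_path.read_text())
    print(sorted(stored.keys()))
```

To use it for real, pass `pathlib.Path.home() / ".kaggle"` as `config_dir` and replace the placeholders with your own username and API key.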

### Download MNIST Data

If you are using Binder to run this notebook, then the data is already downloaded and available.  Skip to the next step.

If you are using Google Colab to run this notebook, then you will need to download the data before proceeding.

#### Download MNIST from Kaggle

Provide your Kaggle username and API key in the cell below and execute the code to download the Kaggle [Digit Recognizer: Learn computer vision with the famous MNIST data](https://www.kaggle.com/c/digit-recognizer) competition data. 

**Note: Before attempting to download the competition data you will need to login to your Kaggle account and accept the rules for this competition.**

In [None]:
%%bash
# NOTE: Replace YOUR_USERNAME and YOUR_API_KEY with actual credentials 
export KAGGLE_USERNAME="YOUR_USERNAME"
export KAGGLE_KEY="YOUR_API_KEY"
kaggle competitions download -c digit-recognizer -p ../datasets/mnist/

#### (Alternative) Download MNIST from GitHub

If you are running this notebook using Google Colab, but did not create a Kaggle account and API key, then download the data from our GitHub repository by running the code in the following cells.

In [None]:
import os
import pathlib
import requests

#RAW_URL = "https://raw.githubusercontent.com/kaust-vislab/keras-tutorials/master/mnist/data/raw"
RAW_URL = "https://github.com/holstgr-kaust/keras-tutorials/raw/master/datasets/mnist"
DEST_DIR = pathlib.Path('../datasets/mnist')

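def fetch_mnist_data():
    for n in ["mnist.npz", "kaggle/train.csv", "kaggle/test.csv", "kaggle/sample_submission.csv"]:
        path = DEST_DIR / n
        # Create the parent directory too (e.g. `kaggle/`), not just DEST_DIR.
        path.parent.mkdir(parents=True, exist_ok=True)
        response = requests.get(RAW_URL + "/" + n)
        # Fail loudly instead of silently saving an error page to disk.
        response.raise_for_status()
        with path.open(mode='wb') as f:
            f.write(response.content)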

In [None]:
fetch_mnist_data()

#### (Alternative) Download MNIST with Keras

If you are running this notebook using Google Colab, but did not create a Kaggle account and API key, then download the data using the Keras `load_data()` API by running the code in the following cells.

In [None]:
from tensorflow.keras.datasets import mnist
mnist.load_data();

## Exercise

### Setup

Initialize the Python environment by importing and verifying the modules we will use.

In [None]:
import os
import sys
import pathlib
import random
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [None]:
import tensorflow as tf
import tensorflow.keras as keras

`%matplotlib inline` is a magic command that makes *matplotlib* charts and plots appear as outputs in the notebook.

`%matplotlib notebook` enables semi-interactive plots that can be enlarged, zoomed, and cropped while the plot is active.  One issue with this option is that new plots appear in the active plot widget, not in the cell where the data was produced.

In [None]:
%matplotlib inline

In [None]:
# Verify runtime environment

try:
    # %tensorflow_version only exists in Colab.
    %tensorflow_version 2.x
    IS_COLAB = True
except Exception:
    IS_COLAB = False

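# Compare numeric version components; a plain string comparison would misorder e.g. "10.0" and "2.0".
assert tuple(int(v) for v in tf.__version__.split(".")[:2]) >= (2, 0), "TensorFlow version >= 2.0 required."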

print("executing_eagerly:", tf.executing_eagerly())

if not tf.test.is_gpu_available():
    print("No GPUs available. Expect training to be very slow.")
    if IS_COLAB:
        print("Go to `Runtime` > `Change runtime type` and select a GPU hardware accelerator.")
else:
    print("is_built_with_cuda:", tf.test.is_built_with_cuda())
    print("is_gpu_available:", tf.test.is_gpu_available(), tf.test.gpu_device_name())

assert sys.version_info >= (3, 5), "Python >= 3.5 required."

### Dataset Pre-processing - MNIST

The previously acquired MNIST dataset is the essential input needed to train an image classification model. Before using the dataset, there are several preprocessing steps required to load the data, and create the correctly sized training, validation, and testing arrays used as input to the network.

The following data preparation steps are needed before the images and labels can become inputs to the network:

* Cache the downloaded dataset (to use Keras `load_data()` functionality).
* Load the dataset (MNIST is small, and fits in memory).
    * Convert from textual CSV files into binary tensor arrays (https://www.tensorflow.org/tutorials/load_data/csv).
    * Reshape from (784, 1) to (28, 28, 1) to (32, 32, 3).
* Verify the shape and type of the data, and understand it...
* Convert label indices into categorical vectors.
* Convert image data from integer to float values, and normalize.
  * Verify converted input data.
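
The reshape and normalization steps listed above can be sketched with NumPy alone. This is a minimal illustration on synthetic data, assuming each CSV row holds 784 flattened pixel values in the range 0 to 255. The names `flat_pixels` and `images` are hypothetical and are not used elsewhere in this notebook.

```python
import numpy as np

# Synthetic stand-in for rows read from train.csv: 5 images, 784 pixels each.
rng = np.random.RandomState(0)
flat_pixels = rng.randint(0, 256, size=(5, 784)).astype("uint8")

# Reshape each row into a 28x28 image with a single grayscale channel.
images = flat_pixels.reshape(-1, 28, 28, 1)

# Convert to float32 and scale pixel values from [0, 255] to [0, 1].
images = images.astype("float32") / 255.0

print(images.shape, images.dtype)
print("value range:", images.min(), images.max())
```

Checking the shape, dtype, and value range after each step, as done here, is the same kind of verification the list above asks for.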

#### Cache Data

Make downloaded data available to Keras.  Provide dataset utility functions.

In [None]:
# Cache MNIST Datasets

for n in ["mnist.npz", "kaggle/train.csv", "kaggle/test.csv"]:
    #DATA_URL = "https://github.com/holstgr-kaust/keras-tutorials/raw/master/datasets/mnist/%s" % n
    DATA_URL = "file:///" + str(pathlib.Path("../datasets/mnist/%s" % n).absolute())
    #data_file_path = tf.keras.utils.get_file(p + n, DATA_URL)
    data_file_path = tf.keras.utils.get_file(n.replace('/','-mnist-'), DATA_URL)
    print("cached file: %s" % n)

In [None]:
%%bash
find ~/.keras -name "*mnist*" -type f

In [None]:
def get_csv_dataset(file_path, **kwargs):
    dataset = tf.data.experimental.make_csv_dataset(
        file_path,
        batch_size=5, # Artificially small to make examples easier to show.
        label_name='label',
        na_value="?",
        num_epochs=1,
        ignore_errors=True, 
        **kwargs)
    return dataset

def pack(features, label):
    return tf.stack(list(features.values()), axis=-1), label

In [None]:
def show_batch(dataset):
    for batch, label in dataset.take(1):
        print("{:20s}: {} :: {}".format('label', label, type(label)))
        for key, value in batch.items():
            print("{:20s}: {} :: {}".format(key, value.numpy(), type(value)))

def show_packed(packed_dataset):
    for features, labels in packed_dataset.take(1):
        print(features.numpy())
        print()
        print(labels.numpy())

#### Load Data

In [None]:
train_file_path = "../datasets/mnist/kaggle/train.csv"
test_file_path = "../datasets/mnist/kaggle/test.csv"

raw_train_data = get_csv_dataset(train_file_path)
packed_train_data = raw_train_data.map(pack)
train_data = packed_train_data.shuffle(500)

# NOTE: unlabelled Kaggle test dataset?
#raw_test_data = get_dataset(test_file_path)

(Alternative) Load data via Keras API.  This loads data into a `numpy` array, and the test examples are labelled.

In [None]:
# TODO: complete example and convert numpy array into Dataset
from tensorflow.keras.datasets import mnist

# The data, split between train and test sets:
(x_train, y_train), (x_test, y_test) = mnist.load_data()

**TODO: Modify Explore Data examples to use Dataset, move into Explore Data section**

#### Explore Data

Explore data types, shape, and value ranges.  Ensure they make sense, and you understand the data well.

In [None]:
show_batch(raw_train_data)

In [None]:
show_packed(packed_train_data)

In [None]:
packed_train_data

In [None]:
print('x_train type:', type(x_train), ',', 'y_train type:', type(y_train))
print('x_train dtype:', x_train.dtype, ',', 'y_train dtype:', y_train.dtype)
print('x_train shape:', x_train.shape, ',', 'y_train shape:', y_train.shape)
print('x_test shape:', x_test.shape, ',', 'y_test shape:', y_test.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

In [None]:
print('x_train (min, max, mean): (%s, %s, %s)' % (x_train.min(), x_train.max(), x_train.mean()))
print('y_train (min, max): (%s, %s)' % (y_train.min(), y_train.max()))

In [None]:
def imageset_plot(img_data=None):
    (x_imgs, y_imgs) = img_data if img_data else (x_train, y_train)
    fig = plt.figure(figsize=(16,8))

    for i in range(40):
        plt.subplot(4, 10, i + 1)
        plt.xticks([])
        plt.yticks([])
        idx = random.randrange(x_imgs.shape[0])  # always a valid index
        plt.title("%s" % (y_imgs[idx]))
        plt.imshow(x_imgs[idx], cmap=plt.get_cmap('gray'))
    plt.show()

In [None]:
# Show array of random labelled images with matplotlib (re-run cell to see new examples)
imageset_plot((x_train, y_train))

In [None]:
hist, bins = np.histogram(y_train, bins = range(y_train.min(), y_train.max() + 2))

fig = plt.figure(figsize=(12,5))

plt.subplot(1,2,1)
plt.hist(y_train, bins = range(y_train.min(), y_train.max() + 2))
plt.xticks(range(y_train.min(), y_train.max() + 2))
plt.title("y_train histogram")
plt.subplot(1,2,2)
plt.hist(x_train.flat, bins = range(x_train.min(), x_train.max() + 2))
plt.title("x_train histogram")
plt.tight_layout()
plt.show()

print('y_train histogram counts:', hist)

The data looks reasonable: there are sufficient examples for each category (y_train), and the histogram showing mostly black (0) and near-white grayscale (>250) values agrees with the examples shown previously.

##### Visualizing training samples using PCA

[Principal Components Analysis (PCA)](https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html) can be used as a visualization tool to see if there are any obvious patterns in the training samples.

In [None]:
import sklearn
import sklearn.decomposition

_prng = np.random.RandomState(42)

pca = sklearn.decomposition.PCA(n_components=3, random_state=_prng)

x_train_flat = x_train.reshape(*x_train.shape[:1], -1)
y_train_flat = y_train.reshape(y_train.shape[0])
print("x_train:", x_train.shape, "y_train", y_train.shape)
print("x_train_flat:", x_train_flat.shape, "y_train_flat", y_train_flat.shape)
pca_train_features = pca.fit_transform(x_train_flat, y_train_flat)
print("pca_train_features:", pca_train_features.shape)

# Sample 10% of the PCA results
_idxs = _prng.randint(y_train_flat.shape[0], size=y_train_flat.shape[0] // 10)
pca_features = pca_train_features[_idxs]
pca_category = y_train_flat[_idxs]
print("pca_features:", pca_features.shape, 
      "pca_category", pca_category.shape, 
      "min,max category:", pca_category.min(), pca_category.max())

In [None]:
def category_scatter_plot(features, category):
    num_category = 1 + category.max() - category.min()

    fig, ax = plt.subplots(1, 1, figsize=(12, 10))
    cm = plt.get_cmap('tab10', num_category)
    sc = ax.scatter(features[:,0], features[:,1], c=category, alpha=0.4, cmap=cm)
    ax.set_xlabel("Component 1")
    ax.set_ylabel("Component 2")
    # This helper is reused for both the PCA and t-SNE results, so keep the title generic.
    ax.set_title("MNIST - 2D projection")
    plt.colorbar(sc)
    plt.show()

In [None]:
from mpl_toolkits.mplot3d import Axes3D

def category_scatter3d_plot(features, category):
    num_category = 1 + category.max() - category.min()
    mean_feat = np.mean(features, axis=0)
    std_feat = np.std(features, axis=0)
    min_range = mean_feat - std_feat
    max_range = mean_feat + std_feat
    
    fig = plt.figure(figsize=(12, 10))
    cm = plt.get_cmap('tab10', num_category)
    ax = fig.add_subplot(111, projection='3d')
    sc = ax.scatter(features[:,0], features[:,1], features[:,2],
                    c=category, alpha=0.85, cmap=cm)
    ax.set_xlabel("Component 1")
    ax.set_ylabel("Component 2")
    ax.set_zlabel("Component 3")
    # This helper is reused for both the PCA and t-SNE results, so keep the title generic.
    ax.set_title("MNIST - 3D projection")
    ax.set_xlim(2.0 * min_range[0], 2.0 * max_range[0])
    ax.set_ylim(2.0 * min_range[1], 2.0 * max_range[1])
    ax.set_zlim(2.0 * min_range[2], 2.0 * max_range[2])
    plt.colorbar(sc)
    plt.show()

In [None]:
category_scatter_plot(pca_features, pca_category)

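**Note:** The 3D PCA plot works best with an interactive backend, such as `%matplotlib widget` (which requires the `ipympl` package) or `%matplotlib notebook`, so that the plot can be rotated.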

In [None]:
%matplotlib widget

In [None]:
category_scatter3d_plot(pca_features, pca_category)

The data in its original image space does not appear to cluster into corresponding categories.

##### Visualizing training samples using t-SNE

[t-distributed Stochastic Neighbor Embedding (t-SNE)](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html#sklearn.manifold.TSNE) is a tool to visualize high-dimensional data. It converts similarities between data points to joint probabilities and tries to minimize the Kullback-Leibler divergence between the joint probabilities of the low-dimensional embedding and the high-dimensional data. For more details on t-SNE, including other use cases, see this excellent *Towards Data Science* [blog post](https://towardsdatascience.com/an-introduction-to-t-sne-with-python-example-5a3a293108d1).

It is highly recommended to use another dimensionality reduction method (e.g. PCA) to reduce the number of dimensions to a reasonable amount if the number of features is very high. This will suppress some noise and speed up the computation of pairwise distances between samples.
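
When `PCA` is given a fractional `n_components` such as `0.95`, it keeps just enough components to explain roughly that fraction of the variance. The NumPy sketch below illustrates the idea on synthetic data. The helper `components_for_variance` is hypothetical and is only an approximation of what scikit-learn does internally, not its actual implementation.

```python
import numpy as np

def components_for_variance(X, fraction):
    """Smallest number of principal components whose cumulative
    explained-variance ratio reaches `fraction`."""
    centered = X - X.mean(axis=0)
    # Squared singular values of the centered data are proportional
    # to the variance captured by each principal component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variance = singular_values ** 2
    ratio = np.cumsum(variance) / variance.sum()
    return int(np.searchsorted(ratio, fraction) + 1)

rng = np.random.RandomState(42)
# Synthetic data: 200 samples with 50 features, where only 3 latent
# directions carry almost all of the variance (plus a little noise).
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

k = components_for_variance(X, 0.95)
print("components kept for 95% variance:", k, "of", X.shape[1])
```

Because far fewer than 50 components are kept, the pairwise distance computations that t-SNE performs afterwards work on much smaller vectors.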

In [None]:
import sklearn
import sklearn.decomposition
import sklearn.pipeline
import sklearn.manifold

_prng = np.random.RandomState(42)

embedding2_pipeline = sklearn.pipeline.make_pipeline(
    sklearn.decomposition.PCA(n_components=0.95, random_state=_prng),
    sklearn.manifold.TSNE(n_components=2, random_state=_prng))

embedding3_pipeline = sklearn.pipeline.make_pipeline(
    sklearn.decomposition.PCA(n_components=0.95, random_state=_prng),
    sklearn.manifold.TSNE(n_components=3, random_state=_prng))

In [None]:
# Sample 10% of the data

_prng = np.random.RandomState(42)

_idxs = _prng.randint(y_train_flat.shape[0], size=y_train_flat.shape[0] // 10)
tsne_features = x_train_flat[_idxs]
tsne_category = y_train_flat[_idxs]
print("tsne_features:", tsne_features.shape, 
      "tsne_category", tsne_category.shape, 
      "min,max category:", tsne_category.min(), tsne_category.max())

In [None]:
# t-SNE is SLOW (but can be GPU accelerated!); 
#       lengthy operation, be prepared to wait...

transform2_tsne_features = embedding2_pipeline.fit_transform(tsne_features)

print("transform2_tsne_features:", transform2_tsne_features.shape)
for i in range(2):
    print("min,max features[%s]:" % i, 
          transform2_tsne_features[:,i].min(), 
          transform2_tsne_features[:,i].max())

In [None]:
category_scatter_plot(transform2_tsne_features, tsne_category)

In [None]:
# t-SNE is SLOW (but can be GPU accelerated!); 
#       lengthy operation, be prepared to wait...

transform3_tsne_features = embedding3_pipeline.fit_transform(tsne_features)

print("transform3_tsne_features:", transform3_tsne_features.shape)
for i in range(3):
    print("min,max features[%s]:" % i, 
          transform3_tsne_features[:,i].min(), 
          transform3_tsne_features[:,i].max())

In [None]:
category_scatter3d_plot(transform3_tsne_features, tsne_category)

t-SNE relates the data points (images) according to their closest neighbours.  Hints of the underlying categories appear, but the points are not cleanly separable into the original categories.

#### Data Conversion

The data type for the training data is `uint8`, while the input type for the network will be `float32`, so the data must be converted.  Also, the data should be normalized, and the labels need to be categorical.  That is, instead of each label being one of 10 different values in a 1-D space, each label becomes a vector of Boolean values in a 10-D space: one dimension for each category, with a 0 or 1 in each dimension to represent membership in that category.

* https://keras.io/examples/cifar10_cnn/
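
The categorical (one-hot) encoding described above can be sketched in plain NumPy. The `one_hot` helper below is a hypothetical illustration of the kind of array the conversion produces; it is not the Keras implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 array with a single 1 per row."""
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    # For row i, set the column given by labels[i] to 1.
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

labels = np.array([3, 0, 9])
encoded = one_hot(labels, 10)

print(encoded[0])                  # the row for label 3 has its 1 at index 3
print(np.argmax(encoded, axis=1))  # argmax recovers the original label indices
```

The `np.argmax` call is the inverse operation, which is why the plotting helpers later in this notebook use it to turn categorical vectors back into digit labels.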

In [None]:
num_classes = (y_train.max() - y_train.min()) + 1
print('num_classes =', num_classes)

y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

In [None]:
print("shape:", x_train.shape, x_test.shape)
x_train = x_train.reshape(x_train.shape + (1,))
x_test = x_test.reshape(x_test.shape + (1,))
print("reshape:", x_train.shape, x_test.shape)

In [None]:
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

train_data = (x_train, y_train)
test_data = (x_test, y_test)

In [None]:
print('x_train type:', type(x_train))
print('x_train dtype:', x_train.dtype)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)

print('y_train type:', type(y_train))
print('y_train dtype:', y_train.dtype)
print('y_train shape:', y_train.shape)
print('y_test shape:', y_test.shape)

### Evaluate Model

Visualize accuracy and loss for training and validation.

* https://keras.io/visualization/
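
To make the two plotted quantities concrete, the sketch below computes accuracy and categorical cross-entropy loss by hand for a tiny, made-up batch. The function names and the example arrays are hypothetical; Keras computes these metrics itself during training.

```python
import numpy as np

def categorical_accuracy(y_true, y_prob):
    """Fraction of rows where the most probable class matches the true class."""
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_prob, axis=1)))

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-probability assigned to the true class."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

# Four examples, three classes: one-hot labels and predicted probabilities.
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]], dtype="float32")
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.4, 0.3],   # misclassified: true class is 2
                   [0.2, 0.5, 0.3]])

accuracy = categorical_accuracy(y_true, y_prob)
loss = categorical_crossentropy(y_true, y_prob)
print("accuracy:", accuracy)
print("loss:", loss)
```

Accuracy only counts whether the top prediction is right, while the loss also reflects how confident the model was, so the two curves can move differently from epoch to epoch.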

In [None]:
def history_plot(history):
    fig = plt.figure(figsize=(12,5))

    # Plot training & validation loss values (left axis)
    ax1 = fig.add_subplot(1, 1, 1)
    ax1.set_title('Model accuracy & loss')
    #ax1.set_ylim(0, 1.1 * max(history.history['loss']+history.history['val_loss']))
    ax1.set_prop_cycle(color=['green', 'red'])
    p1 = ax1.plot(history.history['loss'], label='Train Loss')
    p2 = ax1.plot(history.history['val_loss'], label='Test Loss')

    # Plot training & validation accuracy values (right axis)
    ax2 = ax1.twinx()
    ax2.set_ylim(0, 1.1 * max(history.history['accuracy']+history.history['val_accuracy']))
    ax2.set_prop_cycle(color=['blue', 'orange'])
    p3 = ax2.plot(history.history['accuracy'], label='Train Acc')
    p4 = ax2.plot(history.history['val_accuracy'], label='Test Acc')

    ax1.set_ylabel('Loss')
    ax1.set_xlabel('Epoch')
    ax2.set_ylabel('Accuracy')

    pz = p3 + p4 + p1 + p2
    plt.legend(pz, [l.get_label() for l in pz], loc='center right')
    plt.show()

In [None]:
def prediction_plot(model, test_data):
    (x_test, y_test) = test_data
    fig = plt.figure(figsize=(16,8))
    correct = 0
    total = 0
    
    for i in range(40):
        plt.subplot(4, 10, i + 1)
        plt.xticks([])
        plt.yticks([])
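        idx = random.randrange(x_test.shape[0])
        # predict_classes() is not available in recent TensorFlow releases;
        # take the argmax of the predict() output instead.
        result = np.argmax(model.predict(x_test[idx:idx+1]), axis=-1)[0]
        rCorrect = np.argmax(y_test[idx]) == result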
        rSym = '✔' if rCorrect else '✘'
        correct += 1 if rCorrect else 0
        total += 1
        plt.title("%s %s" % (rSym, result))
        plt.imshow(x_test[idx][:,:,0], cmap=plt.get_cmap('gray'))
    plt.show()
    
    print("% 3.2f%% correct (%s/%s)" % (100.0 * float(correct) / float(total), correct, total))

In [None]:
def prediction_proba_plot(model, test_data):
    (x_test, y_test) = test_data
    fig = plt.figure(figsize=(15,15))
    
    for i in range(10):
        plt.subplot(10, 2, (2*i) + 1)
        plt.xticks([])
        plt.yticks([])
        idx = random.randrange(x_test.shape[0])
        # predict_proba() is not available in recent TensorFlow releases;
        # predict() returns the class probabilities (assuming a softmax output layer).
        result = model.predict(x_test[idx:idx+1])[0] * 100 # prob -> percent
        plt.title("%s" % np.argmax(y_test[idx]))
        plt.xlabel("#%s" % idx)
        plt.imshow(x_test[idx][:,:,0], cmap=plt.get_cmap('gray'))
        
        ax = plt.subplot(10, 2, (2*i) + 2)
        plt.bar(np.arange(len(result)), result, label='%')
        # One tick per class, so the tick count matches the number of labels.
        plt.xticks(range(len(result)))
        ax.set_xticklabels(range(10))
        plt.title("classifier probabilities")

        plt.tight_layout()
    plt.show()

### Create Your Own CNN Classifier Model

Create a basic CNN (Convolutional Neural Network) based classifier from scratch.

Try to create your own deep learning model to classify the MNIST data. Refer to the prerequisite tutorial and use the [CNN Classifier Model](keras-transfer-learning-tutorial.ipynb#CNN-Classifier-Model) code as a template.  Here are a few ideas to try.

1. Add more convolutional layers.
2. Add more neurons to each convolutional layer.
3. Try different activation layers.
4. Try using a different optimizer.
5. Try tuning the hyper-parameters of your chosen optimizer.
6. Train the model for more epochs (but don't overfit!).

* https://keras.io/examples/mnist_cnn/
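
When adding convolutional layers, it helps to track how quickly the 28x28 feature maps shrink. The sketch below applies the standard output-size formula to a hypothetical stack of 3x3 convolutions without padding, each followed by 2x2 max pooling. These layer choices are assumptions for illustration, not a recommended architecture.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28  # MNIST images are 28x28
sizes = [size]
for _ in range(2):
    size = conv_output_size(size, kernel=3)            # 3x3 convolution, no padding
    sizes.append(size)
    size = conv_output_size(size, kernel=2, stride=2)  # 2x2 max pooling
    sizes.append(size)

print(sizes)  # [28, 26, 13, 11, 5]
```

After two such blocks the feature maps are only 5x5, so a much deeper stack of this kind would need padding or fewer pooling layers.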

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense, Activation, Dropout, Conv2D, MaxPooling2D

def create_my_model():
    model = Sequential()

    # Add your code here...
    
    return model

In [None]:
model = create_my_model()
model.summary()

In [None]:
batch_size = 128 #32
epochs = 12 #25
learning_rate = 1e-3 #1e-4
decay = 1e-6

In [None]:
from tensorflow.keras.optimizers import RMSprop, Adam

model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(learning_rate=learning_rate, decay=decay),
              metrics=['accuracy'])

In [None]:
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_data=(x_test, y_test),
                    shuffle=True)

In [None]:
history_plot(history)

In [None]:
# Score trained model.
scores = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

In [None]:
prediction_plot(model, (x_test, y_test))

In [None]:
prediction_proba_plot(model, (x_test, y_test))

## Submitting to Kaggle

Submit your model's predictions to Kaggle, using your previously created Kaggle account, and then see how well your results compare to those of your peers.

### Re-train the model using the entire training set

In [None]:
# Reload the full Kaggle training set from the CSV file.
kaggle_train = pd.read_csv("../datasets/mnist/kaggle/train.csv")
training_target = kaggle_train["label"].to_numpy()
training_features = kaggle_train.drop(columns="label").to_numpy()

# Apply the same preprocessing used earlier: scale to [0, 1], reshape to
# (samples, 28, 28, 1), and convert the labels into categorical vectors.
# A PCA or other dimension reduction step could also be added here.
training_features = training_features.astype("float32") / 255
training_features = training_features.reshape(-1, 28, 28, 1)
training_target = keras.utils.to_categorical(training_target, 10)

In [None]:
# Continue training the compiled Keras model on the entire Kaggle training set.
model.fit(training_features, training_target,
          batch_size=128,
          epochs=5,
          shuffle=True)

### Use trained model to make predictions using the test data

In [None]:
# apply the same scaling and reshaping that was used for the training data
_testing_features = np.loadtxt("../datasets/mnist/kaggle/test.csv", 
                               delimiter=',', skiprows=1, dtype=np.int64)
scaled_testing_features = _testing_features.astype("float32") / 255
scaled_testing_features = scaled_testing_features.reshape(-1, 28, 28, 1)

In [None]:
# predict() returns one row of class probabilities per image; argmax picks the digit
output = model.predict(scaled_testing_features)
predictions = np.argmax(output, axis=1)

In [None]:
predictions

#### Visually check model predictions

In [None]:
_, ax = plt.subplots(1,1)
_ = ax.imshow(scaled_testing_features[0].reshape((28, 28)), cmap="gray")
_ = ax.set_title("predicted: %s" % predictions[0])

### Reformat predictions

In [None]:
# submission format for kaggle
!head ../datasets/mnist/kaggle/sample_submission.csv

In [None]:
import os
import time

import pandas as pd

os.makedirs("../results/kaggle-submissions", exist_ok=True)

timestamp = time.strftime("%Y%m%d-%H%M%S")
number_predictions, = predictions.shape
df = pd.DataFrame({"ImageId": range(1, number_predictions + 1), "Label": predictions})
df.to_csv(f"../results/kaggle-submissions/submission-{timestamp}.csv", index=False)

### Submit to Kaggle

Once you have successfully submitted your predictions, you can check the [Digit-Recognizer competition](https://www.kaggle.com/c/digit-recognizer) website and see how well your best model compares to those of your peers.

In [None]:
%%bash
export KAGGLE_USERNAME="YOUR_USERNAME"
export KAGGLE_KEY="YOUR_API_KEY"
kaggle competitions submit digit-recognizer \
  -f $(ls ../results/kaggle-submissions/submission-*.csv | tail -n 1) \
  -m "My first digit recognizer submission!"

## Mentor Bios

Glendon Holst is a Staff Scientist in the Visualization Core Lab at KAUST (King Abdullah University of Science and Technology) specializing in HPC workflow solutions for deep learning, image processing, and scientific visualization.

Mohsin Ahmed Shaikh is a Computational Scientist in the Supercomputing Core Lab at KAUST (King Abdullah University of Science and Technology) specializing in large scale HPC applications and GPGPU support for users on Ibex (cluster) and Shaheen (supercomputer).  Mohsin holds a PhD in Computational Bioengineering, and a Post Doc, from University of Canterbury, New Zealand.