## **Deep Neural Networks for MNIST Classification**

Pipeline ▶
  - Prepare/preprocess the data ▶ split it into training, validation, and test sets.
  - Outline the model and choose activation functions.
  - Set an appropriate advanced optimizer and loss function.
  - Train the model.
  - Test the accuracy of the model.

In [3]:
# import relevant packages
import numpy as np
import pandas as pd
import tensorflow as tf

import tensorflow_datasets as tfds

### Acquire Data

In [4]:
# load the dataset
# as_supervised=True loads each example as a 2-tuple (input, target)
# with_info=True also returns a DatasetInfo object with the version, features, and number of samples
mnist_dataset, mnist_info = tfds.load(name='mnist', with_info=True, as_supervised=True)

Downloading and preparing dataset mnist/3.0.1 (download: 11.06 MiB, generated: 21.00 MiB, total: 32.06 MiB) to /root/tensorflow_datasets/mnist/3.0.1...



Dataset mnist downloaded and prepared to /root/tensorflow_datasets/mnist/3.0.1. Subsequent calls will reuse this data.


In [5]:
# look at dataset
mnist_dataset

{'test': <PrefetchDataset shapes: ((28, 28, 1), ()), types: (tf.uint8, tf.int64)>,
 'train': <PrefetchDataset shapes: ((28, 28, 1), ()), types: (tf.uint8, tf.int64)>}

In [7]:
# get info
mnist_info

tfds.core.DatasetInfo(
    name='mnist',
    version=3.0.1,
    description='The MNIST database of handwritten digits.',
    homepage='http://yann.lecun.com/exdb/mnist/',
    features=FeaturesDict({
        'image': Image(shape=(28, 28, 1), dtype=tf.uint8),
        'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
    }),
    total_num_examples=70000,
    splits={
        'test': 10000,
        'train': 60000,
    },
    supervised_keys=('image', 'label'),
    citation="""@article{lecun2010mnist,
      title={MNIST handwritten digit database},
      author={LeCun, Yann and Cortes, Corinna and Burges, CJ},
      journal={ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist},
      volume={2},
      year={2010}
    }""",
    redistribution_info=,
)

### Prepare Data

In [8]:
mnist_train = mnist_dataset['train']
mnist_test = mnist_dataset['test']

In [9]:
num_validation_samples = 0.1 * mnist_info.splits['train'].num_examples
# convert num_validation_samples into an integer
num_validation_samples = tf.cast(num_validation_samples, tf.int64) 

num_test_samples = mnist_info.splits['test'].num_examples
num_test_samples = tf.cast(num_test_samples, tf.int64) 
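
As a rough illustration of what these counts are for, the sketch below computes the 10% validation size with plain integer arithmetic and carves a stand-in list of sample indices into validation and training parts. It assumes the validation samples are taken from the front of the training split, which this notebook has not done at this point. The names `indices`, `validation_indices`, and `train_indices` are hypothetical.

```python
# Size of the 'train' split, as reported by mnist_info above.
num_train_total = 60_000

# 10% of the training split, computed as an exact integer.
num_validation = num_train_total // 10

# Stand-in for the dataset: one index per training sample (hypothetical).
indices = list(range(num_train_total))

# Assumption: the first 10% become validation data, the rest stay for training.
validation_indices = indices[:num_validation]
train_indices = indices[num_validation:]

print(len(validation_indices), len(train_indices))
```

Integer division avoids the float-to-integer cast entirely, which is one possible alternative to `tf.cast` when the count is only needed as a Python number.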

In [10]:
# create a function to scale our data

# the function below is meant to be passed to dataset.map(*function*),
# which applies it to every (image, label) pair in the dataset

def scale(image, label):
  '''
  Takes an image and its label, casts the image to float32, and rescales its
  pixel values from the range 0-255 (256 shades of gray) to the range 0-1
  by dividing by 255. The label is returned unchanged.
  '''
  image = tf.cast(image, tf.float32)
  image /= 255. # the trailing dot makes 255 a float literal

  return image, label
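
The arithmetic behind the scaling can be checked on a few numbers without TensorFlow. This is only a minimal sketch in plain Python: the helper `scale_pixels` and the sample pixel values are hypothetical and are not taken from the dataset.

```python
def scale_pixels(pixels):
    """Rescale 8-bit pixel values (0-255) to floats in the range 0-1."""
    return [p / 255. for p in pixels]

# Made-up pixel values: black, a dark gray, and white.
pixels = [0, 51, 255]
scaled = scale_pixels(pixels)

print(scaled)  # [0.0, 0.2, 1.0]
```

The darkest value maps to 0.0 and the brightest to 1.0, so every scaled pixel falls inside the interval from 0 to 1.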


In [11]:
scaled_train_and_validation_data = mnist_train.map(scale)

scaled_test_data = mnist_test.map(scale)

**Shuffle Data**

In [None]:
# shuffle the data before training
# BUFFER_SIZE is the number of samples held in the shuffle buffer at once
BUFFER_SIZE = 10_000
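
To make the role of the buffer size concrete, here is a simplified pure-Python model of buffered shuffling. It is a sketch of the general idea (fill a fixed-size buffer, emit a random element, refill from the stream), not TensorFlow's actual implementation, and `buffered_shuffle` is a hypothetical helper.

```python
import random

def buffered_shuffle(items, buffer_size, seed=0):
    """Shuffle a stream using a fixed-size buffer (simplified model)."""
    rng = random.Random(seed)
    buffer, output = [], []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything.
            buffer.append(item)
            continue
        # Emit a random buffered element and put the new item in its place.
        idx = rng.randrange(buffer_size)
        output.append(buffer[idx])
        buffer[idx] = item
    # The stream is exhausted: emit what is left in random order.
    rng.shuffle(buffer)
    output.extend(buffer)
    return output

data = list(range(20))
print(buffered_shuffle(data, buffer_size=1))   # order is unchanged
print(buffered_shuffle(data, buffer_size=5))   # only locally shuffled
print(buffered_shuffle(data, buffer_size=20))  # fully shuffled
```

In this model a buffer of size 1 performs no shuffling at all, while a buffer at least as large as the dataset gives a uniform shuffle. Values in between trade memory for randomness, because an element can only be emitted once it has entered the buffer.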
