# Model 1

Build a simple model in TensorFlow. 

Build and train neural network models using TensorFlow 2.x

You need to understand the foundational principles of machine learning (ML) and deep learning (DL) using TensorFlow 2.x. You need to know how to:
* Use TensorFlow 2.x.
* Build, compile and train machine learning (ML) models using TensorFlow.
* Preprocess data to get it ready for use in a model.
* Use models to predict results.
* Build sequential models with multiple layers.
* Build and train models for binary classification.
* Build and train models for multi-class categorization.
* Plot loss and accuracy of a trained model.
* Identify strategies to prevent overfitting, including augmentation and dropout.
* Use pretrained models (transfer learning).
* Extract features from pre-trained models.
* Ensure that inputs to a model are in the correct shape.
* Ensure that you can match test data to the input shape of a neural network.
* Ensure that you can match the output data of a neural network to the specified input shape for test data.
* Understand batch loading of data.
* Use callbacks to trigger the end of training cycles.
* Use datasets from different sources.
* Use datasets in different formats, including json and csv.
* Use datasets from tf.data.datasets.


In [12]:
import tensorflow as tf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline 
import tensorflow_datasets as tfds
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization, Flatten
# tf.enable_eager_execution()
# tf.enable_v2_behavior()

In [13]:
(train, test), info = tfds.load('fashion_mnist', split=['train', 'test'], shuffle_files=True, as_supervised=True, with_info=True)

In [14]:
def normalize_img(image, label):
    """Normalizes images: uint8 in [0, 255] -> float32 in [0, 1]."""
    return tf.cast(image, tf.float32) / 255., label

## Training Pipeline

### Why do we need to set up a good training pipeline?

When working with GPUs and TPUs, we can radically reduce the time required to execute a single training step. Achieving peak performance requires an efficient input pipeline that delivers data for the next step before the current step has finished. The `tf.data` API helps to build flexible and efficient input pipelines. 

### Steps taken to set up training pipeline

* We apply the `normalize_img` function with `map`. Passing `tf.data.experimental.AUTOTUNE` as `num_parallel_calls` lets `tf.data` choose the number of parallel calls so that this step runs as fast as possible.
* Because the entire training dataset fits in memory, we cache it before shuffling to get better performance.
* For true randomness, we set the shuffle buffer to the full dataset size (which can be found in `info.splits['train'].num_examples`).
* By batching after shuffling, we get different batches in each epoch.
* Finally, we end the pipeline with a `prefetch()` call for performance.

**_NOTE:_** The prefetch transformation provides benefits any time there is an opportunity to overlap the work of a "producer" with the work of a "consumer".
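
Before applying these steps to Fashion-MNIST, the ordering can be tried on a toy dataset. The sketch below is illustrative only: the eight-element `Dataset.range` input and the batch size of 4 are assumptions chosen to keep the output readable. It also shows that `tf.data` transformations return a new dataset rather than modifying the existing one, so each result has to be assigned.

```python
import tensorflow as tf

# Toy stand-in for the real dataset (an assumption): the integers 0..7.
ds = tf.data.Dataset.range(8)

# Transformations return a new dataset; the original is left untouched.
ds.shuffle(8)  # result discarded, so `ds` is still in its original order
unchanged = [int(x.numpy()) for x in ds]

# Same order as described above: cache -> shuffle -> batch -> prefetch.
pipeline = (
    ds.cache()
    .shuffle(8)  # buffer covers the whole toy dataset
    .batch(4)
    .prefetch(tf.data.experimental.AUTOTUNE)
)

# Iterating twice mimics two epochs; by default shuffle reshuffles on each pass.
epoch_1 = [batch.numpy().tolist() for batch in pipeline]
epoch_2 = [batch.numpy().tolist() for batch in pipeline]

print(unchanged)
print(epoch_1)
print(epoch_2)
```

Every element appears exactly once per epoch, but the grouping into batches will usually differ between the two passes.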

In [15]:
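# Training pipeline: normalize -> cache -> shuffle -> batch -> prefetch.
# Each tf.data transformation returns a new dataset, so reassign the result.

train = train.map(normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
train = train.cache()
train = train.shuffle(info.splits['train'].num_examples)
train = train.batch(128)
train = train.prefetch(tf.data.experimental.AUTOTUNE)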

<DatasetV1Adapter shapes: ((None, 28, 28, 1), (None,)), types: (tf.float32, tf.int64)>

## Testing Pipeline

The testing pipeline differs slightly from the training pipeline.

* We still normalize the testing dataset just like the training dataset. This is important!
* For testing, we batch our data before caching. This works because the test set is not shuffled, so the batches are identical in every epoch.
* We still end with a `prefetch()` call to optimize our pipeline. 

In [16]:
test = test.map(normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
test = test.batch(128)
test = test.cache()
test = test.prefetch(tf.data.experimental.AUTOTUNE)

In [17]:
model = Sequential()
model.add(Flatten(input_shape=(28, 28, 1)))
model.add(Dense(128, activation='tanh'))
model.add(Dense(64, activation='tanh'))
model.add(Dense(10, activation='softmax'))
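
As a sanity check on the architecture, the number of trainable parameters can be worked out by hand. The sketch below assumes only the layer widths defined above. `Flatten` has no parameters, and each `Dense` layer has one weight per input-output pair plus one bias per unit. The helper name `dense_params` is made up for this illustration.

```python
# Layer widths of the model above: 28*28*1 flattened inputs, then 128, 64 and 10 units.
layer_sizes = [28 * 28 * 1, 128, 64, 10]


def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus biases (n_out) of one fully connected layer."""
    return n_in * n_out + n_out


per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [100480, 8256, 650]
print(total)      # 109386
```

Calling `model.summary()` should report the same total.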

In [18]:
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
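
The `sparse_` prefix means the labels are plain integer class indices (as delivered by the pipeline above) rather than one-hot vectors. The following NumPy sketch illustrates what the loss and the accuracy metric measure for each example. It is not the Keras implementation: it skips the numerical clipping Keras applies, and the probabilities and labels are made-up values.

```python
import numpy as np


def sparse_categorical_crossentropy(labels, probs):
    """Per-example loss: negative log of the probability assigned to the true class."""
    labels = np.asarray(labels)
    probs = np.asarray(probs, dtype=np.float64)
    picked = probs[np.arange(len(labels)), labels]
    return -np.log(picked)


# Made-up softmax outputs for two examples and three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
labels = np.array([0, 2])  # integer class indices, not one-hot vectors

losses = sparse_categorical_crossentropy(labels, probs)
accuracy = np.mean(np.argmax(probs, axis=1) == labels)

print(losses)
print(accuracy)
```

The more probability the model puts on the correct class, the closer that example's loss is to zero.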

In [19]:
model.fit(train, epochs=10, validation_data=test)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x23448504b48>