In [None]:
!pip install tensorflow

# TensorFlow

![Tensors](https://miro.medium.com/v2/resize:fit:1400/0*jGB1CGQ9HdeUwlgB)

TensorFlow is an `open-source` library for numerical computation on tensors that uses directed graphs to represent the computation that you want to do.


TensorFlow graphs are portable between different devices.
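As a minimal sketch of how Python code ends up as a graph, `tf.function` can trace a function into a TensorFlow graph. The function name `square_plus_one` is illustrative and not part of this notebook.

```python
import tensorflow as tf

@tf.function
def square_plus_one(x):
    # Traced into a graph the first time it is called with a given input signature.
    return x * x + 1.0

result = square_plus_one(tf.constant(3.0))

# Inspect the operations recorded in the traced graph.
concrete = square_plus_one.get_concrete_function(
    tf.TensorSpec(shape=[], dtype=tf.float32))
op_types = [op.type for op in concrete.graph.get_operations()]
print(result.numpy(), op_types)
```

The list of operation types is the graph representation of the Python function; the exact names may differ between TensorFlow versions.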

### TensorFlow API hierarchy

![TF API abstraction layers](./img/lab_9_tf_api.png)

* The lowest layer of abstraction is the layer that's implemented to target the different hardware platforms.
* The next level is the TensorFlow C++ API where you can write custom TensorFlow operations.
* The core Python API contains much of the numeric processing code for working with tensors.
* Above that sit sets of Python modules that provide high-level representations of useful neural network components. These modules are useful when building custom neural network models.
* Lastly, the high-level APIs let you easily handle distributed training, data preprocessing, model definition, compilation, and overall training.

## TF tensors and variables

In [None]:
import tensorflow as tf

In [None]:
# scalar
x = tf.constant(3)
print(x.shape)

In [None]:
# vector
x = tf.constant([3, 5, 7])
print(x.shape)

In [None]:
# Matrix
x = tf.constant([[3, 5, 7],
                 [4, 6, 8]])
print(x.shape)

In [None]:
# 3D Tensor
x3 = tf.stack([x, x])
print(x3.shape)

In [None]:
x3

Note: tensors can be reshaped.
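A small sketch of reshaping with `tf.reshape`; the variable names `m`, `m_col`, and `m_flat` are illustrative. Reshaping keeps the same values in row-major order and only changes how they are grouped.

```python
import tensorflow as tf

m = tf.constant([[3, 5, 7],
                 [4, 6, 8]])        # shape (2, 3)
m_col = tf.reshape(m, [3, 2])       # same six values, regrouped into 3 rows
m_flat = tf.reshape(m, [-1])        # -1 lets TensorFlow infer the size
print(m_col.shape, m_flat.shape)
```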

`tf.constant` produces constant tensors while `tf.Variable` can be modified.

In [None]:
x = tf.Variable(3.0, dtype=tf.float32, name='sample_variable')
x

In [None]:
x.assign(6)
x

In [None]:
x.assign_add(1)

Let's look at a simplified neural network architecture again:

![NN](./img/lab_9_nn_recap.png)

In [None]:
# input data
x = tf.constant([[3, 4]])
# weights that will change
w = tf.Variable([[1], [2]])

# compute the dot product of input features and weights: (1, 2) x (2, 1) -> (1, 1)
tf.matmul(x, w)

### TF Input Data Pipeline

`tf.data.Dataset` allows you to feed, preprocess, and configure data to TF models.

* create data pipelines from
    * in-memory dict or lists of tensors ```tf.data.Dataset.from_tensor_slices((X, Y))```
    * out-of-memory sharded data files ```tf.data.TFRecordDataset(files)```
* preprocess data in parallel and cache the result of costly operations ```dataset.map(expensive_function).cache()```
* configure data ```dataset.shuffle(1000).repeat(epochs).batch(batch_size)```
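The steps in the list above can be sketched end to end on toy data. The arrays `X` and `Y` and the function `scale` are hypothetical stand-ins, and the buffer and batch sizes are chosen only to keep the example small.

```python
import numpy as np
import tensorflow as tf

# Toy in-memory data: 10 samples with 2 features each (illustrative only).
X = np.arange(20, dtype=np.float32).reshape(10, 2)
Y = np.arange(10, dtype=np.float32)

def scale(features, label):
    # Stand-in for an expensive preprocessing step.
    return features / 10.0, label

dataset = (
    tf.data.Dataset.from_tensor_slices((X, Y))
    .map(scale, num_parallel_calls=tf.data.AUTOTUNE)  # preprocess in parallel
    .cache()                                          # reuse the mapped result
    .shuffle(buffer_size=10)
    .repeat(2)                                        # two passes over the data
    .batch(4)
)

batch_shapes = [tuple(features.shape) for features, _ in dataset]
print(batch_shapes)
```

With 10 samples repeated twice and a batch size of 4, the pipeline yields five batches of four samples each.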

# Regression with TF

## Load the data

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


# Make numpy values easier to read.
np.set_printoptions(precision=3, suppress=True)

import tensorflow as tf
from tensorflow.keras import layers
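# Older TF releases expose preprocessing layers under `experimental`;
# that module may be unavailable in newer releases, where the same
# layers live directly in `tf.keras.layers`.
try:
    from tensorflow.keras.layers.experimental import preprocessing
except ImportError:
    preprocessing = layers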

In [None]:
# Abalone dataset https://archive.ics.uci.edu/ml/datasets/abalone
cols = ["Length", "Diameter", "Height",
         "Whole weight", "Shucked weight",
         "Viscera weight", "Shell weight", "Age"]

df_train = pd.read_csv(
    "https://storage.googleapis.com/download.tensorflow.org/data/abalone_train.csv", header=None, 
    names=cols)


df_test = pd.read_csv(
    "https://storage.googleapis.com/download.tensorflow.org/data/abalone_test.csv", header=None, 
    names=cols)

df_train.head()

In [None]:
df_train.info()

The nominal task for this dataset is to predict the age from the other measurements, so separate the features and labels for training:

In [None]:
X_train = df_train.copy()
y_train = X_train.pop('Age')


X_test = df_test.copy()
y_test = X_test.pop('Age')

## Modeling

The basic building block of a neural network is the *layer*. Layers extract representations from the data fed into them. 

Most of deep learning consists of chaining together simple layers. Most layers, such as `tf.keras.layers.Dense`, have parameters that are learned during training.

In [None]:
# Basic with no preprocessing
abalone_model = tf.keras.Sequential([
  layers.Dense(64, activation='relu'),
  layers.Dense(1)  # linear output for regression
])

Before the model is ready for training, it needs a few more settings. These are added during the model's compile step:
* Loss function: measures how far the model's predictions are from the targets during training. You want to minimize this function to "steer" the model in the right direction.
* Optimizer: determines how the model is updated based on the data it sees and its loss function.
* Metrics: used to monitor the training and testing steps.

In [None]:
abalone_model.compile(loss=tf.losses.MeanSquaredError(),
                      optimizer=tf.optimizers.Adam()
                    )
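The compile call above sets a loss and an optimizer but no metrics. As a hedged sketch of all three settings together, the example below trains a small model on random toy data; `toy_model`, `X_toy`, and `y_toy` are illustrative names, and mean absolute error is an assumed choice of metric, not one prescribed by this notebook.

```python
import numpy as np
import tensorflow as tf

# Toy regression data (illustrative only, not the abalone dataset).
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(32, 7)).astype("float32")
y_toy = rng.normal(size=(32,)).astype("float32")

toy_model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])
toy_model.compile(
    loss=tf.keras.losses.MeanSquaredError(),         # what training minimizes
    optimizer=tf.keras.optimizers.Adam(),            # how the weights are updated
    metrics=[tf.keras.metrics.MeanAbsoluteError()],  # reported only, not optimized
)
toy_history = toy_model.fit(X_toy, y_toy, epochs=2, verbose=0)
print(sorted(toy_history.history.keys()))
```

Each metric shows up as an extra entry in `history.history`, next to the loss.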

In [None]:
# The first dimension is the batch size, which is left unspecified.
abalone_model.build(input_shape=(None, X_train.shape[1]))
abalone_model.summary()

In [None]:
history = abalone_model.fit(X_train, y_train,
                             epochs=10,
                             batch_size=64,
                             validation_data=(X_test, y_test)
                             )

In [None]:
df_history = pd.DataFrame(history.history)

In [None]:
#  visualize the training loss with each epoch.
df_history['loss'].plot() 
df_history['val_loss'].plot() 
plt.title('Loss') 
plt.legend() 
plt.show() 

## Preprocessing

It's good practice to normalize the inputs to your model. Keras preprocessing layers, such as `Normalization`, provide a convenient way to build this normalization into your model.

In [None]:
normalize = layers.Normalization()

**Note: Only use your training data to .adapt() preprocessing layers. Do not use your validation or test data.**

In [None]:
normalize.adapt(X_train)
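To see what `adapt` computes, here is a minimal sketch on a toy matrix. It assumes a TensorFlow release where `Normalization` is available directly under `tf.keras.layers`; `X_toy` and `toy_normalize` are illustrative names.

```python
import numpy as np
import tensorflow as tf

# Toy feature matrix with very different column scales (illustrative only).
X_toy = np.array([[1.0, 100.0],
                  [2.0, 200.0],
                  [3.0, 300.0]], dtype="float32")

toy_normalize = tf.keras.layers.Normalization()
toy_normalize.adapt(X_toy)   # learns the mean and variance of each column

X_scaled = toy_normalize(X_toy).numpy()
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))
```

After adapting, each column of the layer's output has a mean close to 0 and a standard deviation close to 1, regardless of its original scale.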

In [None]:
# use the normalization layer in the model
norm_abalone_model = tf.keras.Sequential([
  normalize,
  layers.Dense(64, activation='relu'),
  layers.Dense(1)  # linear output for regression
])

norm_abalone_model.compile(loss = tf.losses.MeanSquaredError(),
                           optimizer = tf.optimizers.Adam())

history_norm = norm_abalone_model.fit(X_train, y_train,
                      epochs=10,
                      batch_size=64,
                      validation_data=(X_test, y_test)
)

A loss function measures how far the predicted values deviate from the actual values in the training data. Training adjusts the model weights to minimize that loss.
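As a small worked example, the mean squared error used in this section can be computed by hand and compared with the Keras loss object. The target and prediction values below are made up for illustration.

```python
import numpy as np
import tensorflow as tf

y_true = np.array([10.0, 8.0, 12.0], dtype="float32")   # made-up ages
y_pred = np.array([9.0, 8.0, 15.0], dtype="float32")    # made-up predictions

# Squared errors are 1, 0, and 9, so the mean is 10 / 3.
mse_manual = np.mean((y_true - y_pred) ** 2)
mse_keras = tf.keras.losses.MeanSquaredError()(y_true, y_pred).numpy()
print(mse_manual, mse_keras)
```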

In [None]:
norm_abalone_model.evaluate(X_train, y_train)

In [None]:
df_history_norm = pd.DataFrame(history_norm.history)

df_history_norm['loss'].plot() 
df_history_norm['val_loss'].plot() 
plt.title('Loss') 
plt.legend() 
plt.show() 

## Classification Example

In [None]:
mnist = tf.keras.datasets.mnist

In [None]:
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

The first layer in this network, `tf.keras.layers.Flatten`, transforms the format of the images from a two-dimensional array (of 28 by 28 pixels) to a one-dimensional array (of 28 * 28 = 784 pixels). Think of this layer as unstacking rows of pixels in the image and lining them up. This layer has no parameters to learn; it only reformats the data.
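The row-unstacking behavior can be checked on a small batch of fake images. The array `images` is illustrative and is not the MNIST data.

```python
import numpy as np
import tensorflow as tf

# A batch of two fake 28x28 "images" (illustrative only).
images = np.arange(2 * 28 * 28, dtype="float32").reshape(2, 28, 28)

flat = tf.keras.layers.Flatten()(images)
print(flat.shape)  # (2, 784)

# Row 0 of the first image comes first, followed by row 1, and so on.
first_row_matches = np.array_equal(flat.numpy()[0, :28], images[0, 0])
second_row_matches = np.array_equal(flat.numpy()[0, 28:56], images[0, 1])
print(first_row_matches, second_row_matches)
```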

In [None]:
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

training_history = model.fit(x_train, 
                             y_train, 
                             epochs=10,
                             validation_split=0.2)

In [None]:
model.summary()

In [None]:
# Evaluate returns the loss value and metrics values for the model.
model.evaluate(x_test, y_test)

You can learn a lot about neural networks and deep learning models by observing their performance over time during training.

In [None]:
training_history.history.keys()

In [None]:
# summarize history for accuracy
plt.plot(training_history.history['accuracy'])
plt.plot(training_history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()
# summarize history for loss
plt.plot(training_history.history['loss'])
plt.plot(training_history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

In [None]:
# SavedModel is the universal serialization format for TF models
tf.saved_model.save(model, './exports/models/')