<a href="https://colab.research.google.com/github/hcheruiy/remote_materials/blob/master/TensorFlow_2_0_and_Keras.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

TensorFlow (TF) is a specialized numerical computation library for deep learning.

The TF API hierarchy comprises three levels: the high-level API, the mid-level API, which provides components for building neural network models, and the low-level API. (TODO: image)

# The Low-Level TensorFlow APIs
The low-level API provides the tools for building network graphs from the ground up using mathematical operations. It offers the greatest flexibility to tweak and tune a model as desired, and the higher-level APIs are themselves implemented with these low-level operations under the hood.
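
As a minimal sketch of what working at this level looks like, the snippet below combines a variable, a matrix product, and automatic differentiation with `tf.GradientTape`. The tensor values and the squared-output loss are made up purely for illustration.

```python
import tensorflow as tf

# Hypothetical values chosen so the arithmetic is easy to follow.
w = tf.Variable([[2.0], [3.0]])   # 2x1 weight matrix
x = tf.constant([[1.0, 4.0]])     # 1x2 input row

with tf.GradientTape() as tape:
    y = tf.matmul(x, w)            # 1*2 + 4*3 = 14
    loss = tf.reduce_sum(y ** 2)   # 14^2 = 196

# d(loss)/dw = 2 * y * x^T = [[28], [112]]
grad = tape.gradient(loss, w)
print(y.numpy(), loss.numpy(), grad.numpy())
```

Everything a higher-level layer or optimizer does can, in principle, be expressed with operations of this kind.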

# The Mid-Level TensorFlow APIs
TensorFlow provides a set of reusable packages that simplify the process of creating neural network models. Examples include:
- __Layers__: `tf.keras.layers`
- __Datasets__: `tf.data`
- __Metrics__: `tf.keras.metrics`
- __Losses__: `tf.keras.losses`
- __FeatureColumns__: `tf.feature_column`

## Layers
The layers (`tf.keras.layers`) package provides a set of ready-made layer classes that simplify the construction of a neural network architecture.
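
A small sketch of using one such layer: a `Dense` layer is created, applied to a dummy batch, and inspected. The layer size and the all-ones input are arbitrary choices for illustration.

```python
import tensorflow as tf

# A fully connected layer with 4 output units and ReLU activation.
dense = tf.keras.layers.Dense(units=4, activation='relu')

# Dummy batch: 2 examples with 3 features each.
inputs = tf.ones([2, 3])
outputs = dense(inputs)   # the layer creates its weights on first call

print(outputs.shape)                    # batch of 2, 4 units each
print(len(dense.trainable_variables))   # kernel and bias
```

The layer owns its kernel and bias, so no weight variables need to be declared by hand.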

## Datasets
The Dataset package (`tf.data`) provides a convenient set of high-level functions for creating complex dataset input pipelines. Its goal is a fast, flexible, and easy-to-use interface for fetching records efficiently from various data sources and transforming them before passing them as inputs to the learning model. The major classes of the Dataset API are:

- `TextLineDataset`: used for reading lines from text files.
- `TFRecordDataset`: responsible for reading records from `TFRecord` files. A `TFRecord` file is a TensorFlow binary storage format. It is faster and easier to work with data stored as `TFRecord` files than with raw data files. Working with `TFRecord` also makes the data input pipeline easier to align with vital transformations such as shuffling and returning data in batches.
- `FixedLengthRecordDataset`: responsible for reading records of fixed sizes from binary files.
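
The following sketch exercises two of these classes on throwaway files. The file names and contents are hypothetical and are written to a temporary directory only so that the example is self-contained.

```python
import os
import tempfile
import tensorflow as tf

tmp_dir = tempfile.mkdtemp()

# TextLineDataset: each line of the text file becomes one element.
text_path = os.path.join(tmp_dir, "lines.txt")
with open(text_path, "w") as f:
    f.write("alpha\nbeta\ngamma\n")

lines_ds = tf.data.TextLineDataset(text_path).batch(2)
batches = [batch.numpy().tolist() for batch in lines_ds]
print(batches)

# TFRecordDataset: write two raw byte records, then read them back.
record_path = os.path.join(tmp_dir, "data.tfrecord")
with tf.io.TFRecordWriter(record_path) as writer:
    for payload in [b"one", b"two"]:
        writer.write(payload)

records = [r.numpy() for r in tf.data.TFRecordDataset(record_path)]
print(records)
```

In practice the records would usually be serialized `tf.train.Example` messages rather than raw byte strings; raw bytes are used here only to keep the sketch short.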

## FeatureColumns
FeatureColumns (`tf.feature_column`) describe the features of a dataset that will be fed into a high-level Keras or Estimator model for training and validation. FeatureColumns make it easy to prepare data for modeling by carrying out tasks such as converting the categorical features of a dataset into one-hot encoded vectors. The `feature_column` API is broadly divided into two categories: __categorical columns__ and __dense columns__.
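
To make the one-hot conversion concrete, here is a sketch that performs it by hand with `tf.one_hot`. Note that it deliberately does not use the `tf.feature_column` API itself, and the vocabulary and color values are invented for the example.

```python
import tensorflow as tf

# Hypothetical categorical feature with a three-word vocabulary.
vocabulary = ["red", "green", "blue"]
colors = tf.constant(["green", "blue", "red", "green"])

# Compare every value against every vocabulary entry: shape (4, 3).
matches = tf.equal(tf.expand_dims(colors, 1), tf.constant(vocabulary))

# Index of the matching vocabulary entry for each value.
indices = tf.argmax(tf.cast(matches, tf.int32), axis=1)

# One row per value, with a single 1.0 in the column of its category.
one_hot = tf.one_hot(indices, depth=len(vocabulary))
print(one_hot.numpy())
```

A categorical feature column wraps this kind of lookup-and-encode step so that it does not have to be written out for each feature.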

In [1]:
# A simple TensorFlow program: find the roots
# of the quadratic equation x^2 + 3x - 4 = 0

%tensorflow_version 2.x
import tensorflow as tf

# Coefficients of the quadratic equation x**2 + 3x - 4 = 0
a = tf.constant(1.0)
b = tf.constant(3.0)
c = tf.constant(-4.0)

print(a, b, c)


TensorFlow 2.x selected.
tf.Tensor(1.0, shape=(), dtype=float32) tf.Tensor(3.0, shape=(), dtype=float32) tf.Tensor(-4.0, shape=(), dtype=float32)


In [2]:
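# quadratic formula: x = (-b ± sqrt(b^2 - 4ac)) / (2a)
# note the denominator is 2*a, not 2**a (the two only coincide when a = 1)
discriminant_root = tf.math.sqrt(b**2 - (4*a*c))
x1 = (-b + discriminant_root) / (2*a)
x2 = (-b - discriminant_root) / (2*a)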
roots = (x1, x2)
print(roots)

(<tf.Tensor: shape=(), dtype=float32, numpy=1.0>, <tf.Tensor: shape=(), dtype=float32, numpy=-4.0>)
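
The same computation can be wrapped in a `tf.function`, which traces the Python code into a TensorFlow graph. This is a sketch only: the function name `quadratic_roots` is made up here, and it assumes real roots (a non-negative discriminant).

```python
import tensorflow as tf

@tf.function
def quadratic_roots(a, b, c):
    """Return both real roots of a*x^2 + b*x + c = 0."""
    root = tf.math.sqrt(b ** 2 - 4.0 * a * c)
    return (-b + root) / (2.0 * a), (-b - root) / (2.0 * a)

# x^2 + 3x - 4 = 0 has roots 1 and -4
r1, r2 = quadratic_roots(tf.constant(1.0), tf.constant(3.0), tf.constant(-4.0))
print(r1.numpy(), r2.numpy())

# 2x^2 - 6x + 4 = 0 has roots 2 and 1 (a case where a != 1)
s1, s2 = quadratic_roots(tf.constant(2.0), tf.constant(-6.0), tf.constant(4.0))
print(s1.numpy(), s2.numpy())
```

Passing tensors rather than Python numbers lets the traced graph be reused across calls with different coefficients.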


## Building Efficient Input Pipelines with the Dataset API
The Dataset API (`tf.data`) offers an efficient mechanism for building robust input pipelines that pass data into a TensorFlow program. This section uses the Boston housing dataset to illustrate how the Dataset API methods are used to build data input pipelines in TensorFlow.

In [3]:
%tensorflow_version 2.x

# import packages
import tensorflow as tf
from tensorflow.keras.datasets import boston_housing

# load dataset and split into train and test sets
(X_train, y_train), (X_test, y_test) = boston_housing.load_data()

# construct data input pipeline: shuffle, then group into batches of 5
dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train))
dataset = dataset.shuffle(buffer_size=1000)
dataset = dataset.batch(5)

# retrieve first data batch from dataset
for features, labels in dataset.take(1):
  print('Features:', features)
  print('Shape of Features:', features.shape)
  print('Labels:', labels)
  print('Shape of Labels:', labels.shape)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/boston_housing.npz
Features: tf.Tensor(
[[2.37934e+00 0.00000e+00 1.95800e+01 0.00000e+00 8.71000e-01 6.13000e+00
  1.00000e+02 1.41910e+00 5.00000e+00 4.03000e+02 1.47000e+01 1.72910e+02
  2.78000e+01]
 [1.13081e+00 0.00000e+00 8.14000e+00 0.00000e+00 5.38000e-01 5.71300e+00
  9.41000e+01 4.23300e+00 4.00000e+00 3.07000e+02 2.10000e+01 3.60170e+02
  2.26000e+01]
 [1.36781e+01 0.00000e+00 1.81000e+01 0.00000e+00 7.40000e-01 5.93500e+00
  8.79000e+01 1.82060e+00 2.40000e+01 6.66000e+02 2.02000e+01 6.89500e+01
  3.40200e+01]
 [6.56650e-01 2.00000e+01 3.97000e+00 0.00000e+00 6.47000e-01 6.84200e+00
  1.00000e+02 2.01070e+00 5.00000e+00 2.64000e+02 1.30000e+01 3.91930e+02
  6.90000e+00]
 [4.54192e+00 0.00000e+00 1.81000e+01 0.00000e+00 7.70000e-01 6.39800e+00
  8.80000e+01 2.51820e+00 2.40000e+01 6.66000e+02 2.02000e+01 3.74560e+02
  7.79000e+00]], shape=(5, 13), dtype=float64)
Shape of Features: (5, 13)
Labe
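
The shuffle-and-batch pattern extends naturally with `map` (for per-element transformations) and `prefetch` (to overlap data preparation with consumption). The sketch below uses small synthetic arrays instead of the Boston housing data so that it runs without a download; the division by 10 is an arbitrary stand-in for a real preprocessing step.

```python
import numpy as np
import tensorflow as tf

# Synthetic data: 10 examples with 2 features each (illustrative only).
features = np.arange(20, dtype=np.float32).reshape(10, 2)
labels = np.arange(10, dtype=np.float32)

ds = (tf.data.Dataset.from_tensor_slices((features, labels))
      .shuffle(buffer_size=10, seed=0)      # buffer covers the whole dataset
      .map(lambda x, y: (x / 10.0, y))      # per-example transformation
      .batch(4)                             # last batch may be smaller
      .prefetch(1))                         # prepare one batch ahead

shapes = [(tuple(x.shape), tuple(y.shape)) for x, y in ds]
print(shapes)
```

With 10 examples and a batch size of 4, the final batch holds the 2 leftover examples; `batch(4, drop_remainder=True)` would discard it instead.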

## Linear Regression with TensorFlow
In this section, we use TensorFlow to implement a linear regression model. The following example uses the Boston house-prices dataset from the Keras datasets package to build a linear regression model with TensorFlow 2.0.
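
For reference, the model and training loss used in this section can be written as follows. The notation is an assumption introduced here: $X$ is the $n \times 13$ feature matrix, $w$ the $13 \times 1$ weight vector, $b$ the scalar bias, and $y$ the vector of house prices.

$$
\hat{y} = Xw + b, \qquad
\mathcal{L}(w, b) = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
$$

Gradient descent with learning rate $\eta$ then repeatedly updates $w \leftarrow w - \eta \, \partial\mathcal{L}/\partial w$ and $b \leftarrow b - \eta \, \partial\mathcal{L}/\partial b$.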

In [0]:
%tensorflow_version 2.x

# import packages
import numpy as np

import tensorflow as tf
from tensorflow.keras.datasets import boston_housing
from tensorflow.keras import Model

from sklearn.preprocessing import StandardScaler

# load dataset and split into train and test sets
(X_train, y_train), (X_test, y_test) = boston_housing.load_data()

# standardize the dataset: fit the scaler on the training set only,
# then apply the same transformation to the test set
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

# reshape y-data to become column vector
y_train = np.reshape(y_train, [-1, 1])
y_test = np.reshape(y_test, [-1, 1])

# build the linear model
class LinearRegressionModel(Model):
  def __init__(self):
    super(LinearRegressionModel, self).__init__()
    # initialize weight and bias variables (13 input features, 1 output)
    self.weight = tf.Variable(
        initial_value=tf.random.normal([13, 1], dtype=tf.float64),
        trainable=True
    )
    self.bias = tf.Variable(
        initial_value=tf.constant(1.0, shape=[], dtype=tf.float64),
        trainable=True
    )

  # call must be a method of the class, not nested inside __init__
  def call(self, inputs):
    # cast so the inputs match the float64 weights before the matrix product
    inputs = tf.cast(inputs, self.weight.dtype)
    return tf.add(tf.matmul(inputs, self.weight), self.bias)

In [0]:
model = LinearRegressionModel()

# parameters
batch_size = 32
learning_rate = 0.01

# use tf.data to batch and shuffle the dataset
train_ds = tf.data.Dataset.from_tensor_slices(
    (X_train, y_train)).shuffle(len(X_train)).batch(batch_size)
test_ds = tf.data.Dataset.from_tensor_slices((X_test, y_test)).batch(batch_size)

# loss function and optimizer
loss_object = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate)

# metrics that accumulate loss and RMSE over the batches they are fed
train_loss = tf.keras.metrics.Mean(name='train_loss')
train_rmse = tf.keras.metrics.RootMeanSquaredError(name='train_rmse')
test_loss = tf.keras.metrics.Mean(name='test_loss')
test_rmse = tf.keras.metrics.RootMeanSquaredError(name='test_rmse')

In [0]:
# use tf.GradientTape to train the model
@tf.function
def train_step(inputs, labels):
  # record the forward pass so that gradients can be computed
  with tf.GradientTape() as tape:
    predictions = model(inputs)
    loss = loss_object(labels, predictions)
  # compute and apply the gradients outside the tape context
  gradients = tape.gradient(loss, model.trainable_variables)
  optimizer.apply_gradients(zip(gradients, model.trainable_variables))
  train_loss(loss)
  train_rmse(labels, predictions)

@tf.function
def test_step(inputs, labels):
  predictions = model(inputs)
  t_loss = loss_object(labels, predictions)
  test_loss(t_loss)
  test_rmse(labels, predictions)

num_epochs = 1000
template = 'Epoch {}, Loss: {}, RMSE: {}, Test Loss: {}, Test RMSE: {}'

for epoch in range(num_epochs):
  # reset the metrics so that each report covers a single epoch
  # (reset_states() in TF 2.0; later releases rename it to reset_state())
  for metric in (train_loss, train_rmse, test_loss, test_rmse):
    metric.reset_states()

  for train_inputs, train_labels in train_ds:
    train_step(train_inputs, train_labels)
  for test_inputs, test_labels in test_ds:
    test_step(test_inputs, test_labels)

  # report progress every 100 epochs
  if (epoch + 1) % 100 == 0:
    print(template.format(epoch + 1,
                          train_loss.result(),
                          train_rmse.result(),
                          test_loss.result(),
                          test_rmse.result()))
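
As a sanity check on this style of training loop, the sketch below fits a linear model to synthetic, noise-free data whose true weights are known, so the result can be compared against them. It is an illustration rather than part of the Boston housing example: the data, the weights `true_w`, and the bias of 4.0 are invented, and plain `assign_sub` updates stand in for the Keras optimizer to keep the block free of optimizer-specific behavior.

```python
import numpy as np
import tensorflow as tf

# Synthetic regression problem with known parameters (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)).astype(np.float32)
true_w = np.array([[2.0], [-1.0], [0.5]], dtype=np.float32)
y = X @ true_w + 4.0

w = tf.Variable(tf.zeros([3, 1]))
b = tf.Variable(0.0)
learning_rate = 0.1

for step in range(200):
    with tf.GradientTape() as tape:
        predictions = tf.matmul(X, w) + b
        loss = tf.reduce_mean(tf.square(y - predictions))   # mean squared error
    grad_w, grad_b = tape.gradient(loss, [w, b])
    # Manual gradient-descent update: parameter -= learning_rate * gradient
    w.assign_sub(learning_rate * grad_w)
    b.assign_sub(learning_rate * grad_b)

print(w.numpy().ravel(), b.numpy(), loss.numpy())
```

Because the data contain no noise, the learned weights and bias should approach `true_w` and 4.0, which makes mistakes in the update step easy to spot.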