##### Copyright 2019 The TensorFlow Authors.

# TensorFlow 2 quickstart for experts

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://www.tensorflow.org/tutorials/quickstart/advanced"><img src="https://www.tensorflow.org/images/tf_logo_32px.png" />View on TensorFlow.org</a>
  </td>
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/quickstart/advanced.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/tensorflow/docs/blob/master/site/en/tutorials/quickstart/advanced.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
  <td>
    <a href="https://storage.googleapis.com/tensorflow_docs/docs/site/en/tutorials/quickstart/advanced.ipynb"><img src="https://www.tensorflow.org/images/download_logo_32px.png" />Download notebook</a>
  </td>
</table>

This is a [Google Colaboratory](https://colab.research.google.com/notebooks/welcome.ipynb) notebook file. Python programs are run directly in the browser—a great way to learn and use TensorFlow. To follow this tutorial, run the notebook in Google Colab by clicking the button at the top of this page.

1. In Colab, connect to a Python runtime: At the top-right of the menu bar, select *CONNECT*.
2. Run all the notebook code cells: Select *Runtime* > *Run all*.

Download and install TensorFlow 2.

Note: Upgrade `pip` to install the TensorFlow 2 package. See the [install guide](https://www.tensorflow.org/install) for details.

Import TensorFlow into your program:

In [None]:
import tensorflow as tf
print("TensorFlow version:", tf.__version__)

from tensorflow.keras.layers import Dense, Flatten, Conv2D
from tensorflow.keras import Model

Load and prepare the [MNIST dataset](http://yann.lecun.com/exdb/mnist/).

In [None]:
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Add a channels dimension
x_train = x_train[..., tf.newaxis].astype("float32")
x_test = x_test[..., tf.newaxis].astype("float32")

Use `tf.data` to batch and shuffle the dataset:

In [None]:
train_ds = tf.data.Dataset.from_tensor_slices(
    (x_train, y_train)).shuffle(10000).batch(32)

test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)
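
To make the effect of `.batch(32)` on the dataset sizes concrete, here is a small pure-Python sketch that groups a sequence into fixed-size batches and keeps the final partial batch, as `batch` does with its default `drop_remainder=False`. The helper `batched` is a hypothetical stand-in for illustration and is not part of `tf.data`.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items`, keeping the last partial batch."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# MNIST has 60,000 training and 10,000 test examples.
train_batches = list(batched(range(60000), 32))
test_batches = list(batched(range(10000), 32))

print(len(train_batches))     # 1875 full batches
print(len(test_batches))      # 313 batches
print(len(test_batches[-1]))  # the last test batch holds only 16 examples
```

The training set divides evenly into batches of 32, while the test set ends with a smaller batch.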

Build the `tf.keras` model using the Keras [model subclassing API](https://www.tensorflow.org/guide/keras#model_subclassing):

In [None]:
class MyModel(Model):
  def __init__(self):
    super(MyModel, self).__init__()
    self.conv1 = Conv2D(32, 3, activation='relu')
    self.flatten = Flatten()
    self.d1 = Dense(128, activation='relu')
    self.d2 = Dense(10)

  def call(self, x):
    x = self.conv1(x)
    x = self.flatten(x)
    x = self.d1(x)
    return self.d2(x)

# Create an instance of the model
model = MyModel()
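
The tensor shapes flowing through `MyModel` can be worked out by hand. The sketch below assumes the `Conv2D` defaults (no padding, stride 1) and a single-channel 28x28 input; the helper `conv2d_valid_output` is a hypothetical name used only for this arithmetic.

```python
def conv2d_valid_output(size, kernel, stride=1):
    """Spatial output size of a convolution without padding."""
    return (size - kernel) // stride + 1

side = conv2d_valid_output(28, 3)   # 28x28 input, 3x3 kernel
flat = side * side * 32             # features after Flatten (32 filters)

conv_params = 3 * 3 * 1 * 32 + 32   # kernel weights + one bias per filter
d1_params = flat * 128 + 128        # Dense(128)
d2_params = 128 * 10 + 10           # Dense(10)
total_params = conv_params + d1_params + d2_params

print(side, flat, total_params)     # 26 21632 2770634
```

Almost all of the parameters sit in the first dense layer, because `Flatten` produces a long feature vector.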

Choose an optimizer and loss function for training: 

In [None]:
loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

optimizer = tf.keras.optimizers.Adam()
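
Because the last layer is `Dense(10)` with no activation, the model outputs raw logits, which is why the loss is built with `from_logits=True`. The following is a minimal pure-Python sketch of the underlying math for a single example; `sparse_ce_from_logits` is a hypothetical helper and not the Keras implementation.

```python
import math

def sparse_ce_from_logits(logits, label):
    """Cross-entropy of one example given raw logits and an integer label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

uniform = sparse_ce_from_logits([0.0] * 10, label=3)
confident = sparse_ce_from_logits([0.0, 10.0, 0.0], label=1)

print(uniform)    # equals log(10): the model has no preference among 10 classes
print(confident)  # close to 0: the correct class has a much larger logit
```

An untrained 10-class model should therefore start with a loss near `log(10)`, roughly 2.3.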

Select metrics to measure the loss and the accuracy of the model. These metrics accumulate values over the batches of an epoch; the training loop prints the accumulated result at the end of each epoch and resets the metrics before the next one.

In [None]:
train_loss = tf.keras.metrics.Mean(name='train_loss')
train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')

test_loss = tf.keras.metrics.Mean(name='test_loss')
test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='test_accuracy')
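
The accumulate, read, reset life cycle of these metric objects can be sketched in plain Python. `RunningMean` below is a simplified, hypothetical stand-in (unweighted, scalar-only), not the actual `tf.keras.metrics.Mean` class.

```python
class RunningMean:
    """Simplified stand-in for a mean metric: accumulate, read, reset."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.total += value
        self.count += 1

    def result(self):
        return self.total / self.count if self.count else 0.0

    def reset(self):
        self.total = 0.0
        self.count = 0

metric = RunningMean()
for batch_loss in [0.9, 0.5, 0.4]:  # made-up per-batch losses
    metric.update(batch_loss)
epoch_mean = metric.result()        # mean over the batches seen so far
metric.reset()                      # start the next epoch from scratch
after_reset = metric.result()

print(epoch_mean, after_reset)
```

Without the reset, each epoch's printed value would also average in the batches from earlier epochs.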

Use `tf.GradientTape` to train the model:

In [None]:
@tf.function
def train_step(images, labels):
  with tf.GradientTape() as tape:
    # training=True is only needed if there are layers with different
    # behavior during training versus inference (e.g. Dropout).
    predictions = model(images, training=True)
    loss = loss_object(labels, predictions)
  gradients = tape.gradient(loss, model.trainable_variables)
  optimizer.apply_gradients(zip(gradients, model.trainable_variables))

  train_loss(loss)
  train_accuracy(labels, predictions)

Test the model:

In [None]:
@tf.function
def test_step(images, labels):
  # training=False is only needed if there are layers with different
  # behavior during training versus inference (e.g. Dropout).
  predictions = model(images, training=False)
  t_loss = loss_object(labels, predictions)

  test_loss(t_loss)
  test_accuracy(labels, predictions)

In [None]:
EPOCHS = 5

for epoch in range(EPOCHS):
  # Reset the metrics at the start of each epoch
  # (reset_state() replaces the deprecated reset_states())
  train_loss.reset_state()
  train_accuracy.reset_state()
  test_loss.reset_state()
  test_accuracy.reset_state()

  for images, labels in train_ds:
    train_step(images, labels)

  for test_images, test_labels in test_ds:
    test_step(test_images, test_labels)

  print(
    f'Epoch {epoch + 1}, '
    f'Loss: {train_loss.result()}, '
    f'Accuracy: {train_accuracy.result() * 100}, '
    f'Test Loss: {test_loss.result()}, '
    f'Test Accuracy: {test_accuracy.result() * 100}'
  )

The image classifier is now trained to ~98% accuracy on this dataset. To learn more, read the [TensorFlow tutorials](https://www.tensorflow.org/tutorials).

# Use TensorFlow 2 to build a neural network model
- Practice with tensors (the basic object in TensorFlow)
- Build a neural network

## Practice with tensors

In [None]:
import tensorflow as tf
import numpy as np

In [None]:
const = tf.Variable(2.0, name='const')
b = tf.Variable(2.0, name='b')
c = tf.Variable(1.0, name='c')

In [None]:
d = tf.add(b, c, name='d') # d = b + c
e = tf.add(c, const, name='e') # e = c + const
a = tf.multiply(d, e, name='a') # a = d * e
a

<tf.Tensor: shape=(), dtype=float32, numpy=9.0>

In [None]:
b = tf.Variable(np.arange(0, 10), name='b')
b

<tf.Variable 'b:0' shape=(10,) dtype=int64, numpy=array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])>

In [None]:
tf.cast(b, tf.float32) + c

<tf.Tensor: shape=(10,), dtype=float32, numpy=array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.], dtype=float32)>

In [None]:
b[:5]

<tf.Tensor: shape=(5,), dtype=int64, numpy=array([0, 1, 2, 3, 4])>

In [None]:
b[1].assign(10)
b

<tf.Variable 'b:0' shape=(10,) dtype=int64, numpy=array([ 0, 10,  2,  3,  4,  5,  6,  7,  8,  9])>

## Build a Neural Network

In [None]:
from tensorflow.keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

In [None]:
print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_test.shape)

(60000, 28, 28)
(10000, 28, 28)
(60000,)
(10000,)


In [None]:
# Randomly draw batch_size integers in [0, len(y_data)) (with replacement)
# and use them as indices into x_data and y_data
def get_batch(x_data, y_data, batch_size):
    idxs = np.random.randint(0, len(y_data), batch_size)
    return x_data[idxs,:,:], y_data[idxs]
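
Because `np.random.randint` draws indices with replacement, one "epoch" of `get_batch` calls does not visit every training example exactly once. The standard-library sketch below compares that scheme with a shuffle-then-slice alternative; the dataset size of 1000 and the seed are arbitrary choices for illustration.

```python
import random

random.seed(0)
n, batch_size = 1000, 100

# get_batch-style sampling: each batch is drawn independently, with replacement.
seen = set()
for _ in range(n // batch_size):
    seen.update(random.randrange(n) for _ in range(batch_size))
coverage_random = len(seen) / n

# Alternative: shuffle once, then walk through the permutation in batches.
order = list(range(n))
random.shuffle(order)
seen_perm = set()
for start in range(0, n, batch_size):
    seen_perm.update(order[start:start + batch_size])
coverage_perm = len(seen_perm) / n

print(coverage_random)  # below 1.0: some examples are never drawn
print(coverage_perm)    # 1.0: every example appears exactly once
```

Both approaches train the network; the difference is only in how evenly an epoch covers the data.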

In [None]:
epochs = 10
batch_size = 100
x_train = x_train / 255.0 # scale pixel values to [0, 1]
x_test = x_test / 255.0 # scale pixel values to [0, 1]
# convert x_test to tensor to pass through model (train data will be converted to
# tensors on the fly)
x_test = tf.Variable(x_test)

Set up the weight and bias variables for a three-layer neural network

In [None]:
# input layer weights and bias: 784 is the input dimension, 300 is the number of neurons in the hidden layer.
W1 = tf.Variable(tf.random.normal([784,300], stddev=0.03), name='W1')
b1 = tf.Variable(tf.random.normal([300]), name='b1')
# hidden layer weights and bias to the output layer
W2 = tf.Variable(tf.random.normal([300,10], stddev=0.03), name='W2')
b2 = tf.Variable(tf.random.normal([10]), name='b2')

Build model

In [None]:
# build model
def nn_model(x_input, W1, b1 ,W2, b2):
    x_input = tf.reshape(x_input, shape=(x_input.shape[0], -1))
    # hidden layer: h = relu(x @ W1 + b1); output logits: h @ W2 + b2
    x = tf.add(tf.matmul(tf.cast(x_input, tf.float32), W1), b1)
    x = tf.nn.relu(x)
    outputs = tf.add(tf.matmul(x, W2), b2)
    # outputs = tf.nn.softmax(outputs)
    return outputs

Define loss function

In [None]:
def loss_fn(outputs, labels):
    # tf.nn.softmax_cross_entropy_with_logits takes one-hot labels and raw logits.
    # The training loop one-hot encodes the y values before passing them in as labels.
    # tf.reduce_mean averages the per-example losses into a single scalar.
    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=outputs))
    return cross_entropy
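
What `loss_fn` computes can be spelled out in plain Python: apply softmax to each row of logits, take the cross-entropy against the one-hot target, then average over the batch. The helpers `one_hot`, `softmax`, and `mean_cross_entropy` are hypothetical names for this sketch, and the logits are made up.

```python
import math

def one_hot(label, depth):
    return [1.0 if i == label else 0.0 for i in range(depth)]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_cross_entropy(batch_logits, batch_targets):
    losses = []
    for logits, target in zip(batch_logits, batch_targets):
        probs = softmax(logits)
        # Only the entry where the one-hot target is 1 contributes.
        losses.append(-sum(t * math.log(p) for t, p in zip(target, probs) if t > 0))
    return sum(losses) / len(losses)

batch_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
batch_targets = [one_hot(0, 3), one_hot(2, 3)]
mean_loss = mean_cross_entropy(batch_logits, batch_targets)
print(mean_loss)
```

With one-hot targets, each example's loss reduces to the negative log-probability of its true class.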

Define optimizer

In [None]:
optimizer = tf.keras.optimizers.Adam()

In [None]:
# `loss` is assigned inside the training loop, so this cell only works after that loop has run
loss

<tf.Tensor: shape=(), dtype=float32, numpy=0.008301935>

Train the network

In [None]:
total_batch = int(len(y_train) / batch_size)
for epoch in range(epochs):
    avg_loss = 0
    # Each epoch iterates over total_batch = 600 batches
    for i in range(total_batch):
        # Each batch trains on 100 examples
        batch_x, batch_y = get_batch(x_train, y_train, batch_size=batch_size)
        batch_x = tf.Variable(batch_x)
        # batch_y.shape = (100,)
        batch_y = tf.Variable(batch_y)
        # after one-hot encoding, batch_y.shape = (100, 10)
        batch_y = tf.one_hot(batch_y, 10)
        with tf.GradientTape() as tape:
            # pred.shape = (100, 10)
            pred = nn_model(batch_x, W1, b1, W2, b2)
            # loss is a scalar; pred and batch_y must have the same shape.
            loss = loss_fn(outputs=pred, labels=batch_y)
        # compute the gradients dL/dW and dL/db of the loss with respect to each parameter
        gradients = tape.gradient(loss, [W1, b1, W2, b2])
        # update the weights and biases using the backpropagated gradients
        optimizer.apply_gradients(zip(gradients, [W1, b1, W2, b2]))
        avg_loss += loss / total_batch
    # evaluate on the training data
    train_pred = nn_model(x_train, W1, b1, W2, b2)
    train_max_idxs = tf.argmax(train_pred, axis=1)
    acc = np.sum(train_max_idxs.numpy()==y_train) / len(y_train)
    # evaluate on the test data
    test_pred = nn_model(x_test, W1, b1, W2, b2)
    y_test_one_hot = tf.one_hot(y_test, 10)
    test_loss = loss_fn(outputs=test_pred, labels=y_test_one_hot)
    max_idxs = tf.argmax(test_pred, axis=1)
    test_acc = np.sum(max_idxs.numpy()==y_test) / len(y_test)
    print(f"Epoch: {epoch+1}, loss={avg_loss:.3f}, , acc={acc:.3f}, val_loss={test_loss:.3f}, val_acc:{test_acc*100:.3f}")
print("Training complete!")

Epoch: 1, loss=0.362, , acc=0.939, val_loss=0.211, val_acc:93.680
Epoch: 2, loss=0.155, , acc=0.962, val_loss=0.133, val_acc:95.970
Epoch: 3, loss=0.108, , acc=0.975, val_loss=0.099, val_acc:96.920
Epoch: 4, loss=0.077, , acc=0.981, val_loss=0.090, val_acc:97.210
Epoch: 5, loss=0.063, , acc=0.985, val_loss=0.080, val_acc:97.510
Epoch: 6, loss=0.046, , acc=0.989, val_loss=0.073, val_acc:97.810
Epoch: 7, loss=0.040, , acc=0.991, val_loss=0.067, val_acc:97.990
Epoch: 8, loss=0.030, , acc=0.992, val_loss=0.068, val_acc:97.940
Epoch: 9, loss=0.025, , acc=0.993, val_loss=0.070, val_acc:97.970
Epoch: 10, loss=0.021, , acc=0.995, val_loss=0.067, val_acc:97.970
Training complete!
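
The accuracy figures above come from taking the argmax of each row of logits and comparing it with the integer label. A pure-Python sketch of that calculation, using made-up logits and labels and the hypothetical helpers `argmax` and `accuracy`:

```python
def argmax(values):
    """Index of the largest entry (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(batch_logits, labels):
    correct = sum(argmax(row) == y for row, y in zip(batch_logits, labels))
    return correct / len(labels)

batch_logits = [
    [0.1, 2.0, -1.0],  # predicts class 1
    [3.0, 0.2, 0.1],   # predicts class 0
    [0.0, 0.5, 0.4],   # predicts class 1
    [1.0, 1.5, 2.5],   # predicts class 2
]
labels = [1, 0, 2, 2]
acc = accuracy(batch_logits, labels)
print(acc)  # 0.75: three of the four predictions match
```

Softmax is not needed here, since it does not change which logit is largest.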


In [None]:
print(W1.shape)
print(b1.shape)
print(W2.shape)
print(b2.shape)

(784, 300)
(300,)
(300, 10)
(10,)
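
From these shapes, the number of trainable parameters follows by multiplying out each shape and summing. A short sketch of that arithmetic, with the shapes copied from the output above and `num_elements` as a hypothetical helper:

```python
shapes = [(784, 300), (300,), (300, 10), (10,)]  # W1, b1, W2, b2

def num_elements(shape):
    count = 1
    for dim in shape:
        count *= dim
    return count

total = sum(num_elements(s) for s in shapes)
print(total)  # 238510
```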
