<a href="https://colab.research.google.com/github/JHyunjun/SNU/blob/main/1_sequential.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Foundations: TensorFlow Tutorials

This material is based on [TensorFlow 2 quickstart for beginners](https://www.tensorflow.org/tutorials/quickstart/beginner) and may be copyrighted by the original writers. For educational uses only.

## Introduction

This tutorial demonstrates the basic workflow of using TensorFlow. The goal is to build a simple linear model for classification.
After loading the MNIST dataset of handwritten digit images, we define and optimize a simple mathematical model in TensorFlow. The results are then plotted and discussed.

You should be familiar with basic linear algebra, Python and the Jupyter Notebook editor. It also helps if you have a basic understanding of Machine Learning and classification.

This short introduction uses [Keras](https://www.tensorflow.org/guide/keras/overview) to:

1. Build a neural network that classifies images.
2. Train this neural network.
3. And, finally, evaluate the accuracy of the model.

This is a [Google Colaboratory](https://colab.research.google.com/notebooks/welcome.ipynb) notebook file. Python programs are run directly in the browser—a great way to learn and use TensorFlow. To follow this tutorial, run the notebook in Google Colab by clicking the button at the top of this page.

1. In Colab, connect to a Python runtime: At the top-right of the menu bar, select *CONNECT*.
2. Run all the notebook code cells: Select *Runtime* > *Run all*.

## Imports

Download and install TensorFlow 2. Import TensorFlow into your program:

Note: Upgrade `pip` to install the TensorFlow 2 package. See the [install guide](https://www.tensorflow.org/install) for details.

In [None]:
%matplotlib inline
from __future__ import absolute_import, division, print_function
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

In [None]:
# Load the TensorBoard notebook extension
%load_ext tensorboard
# Clear any logs from previous runs
!rm -rf ./logs/ 
%tensorboard --logdir logs/

## Load data

The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples, each a 28x28 grayscale image. It is a good database for people who want to try learning techniques and pattern recognition methods.

Load and prepare the [MNIST dataset](http://yann.lecun.com/exdb/mnist/). Convert the samples from integers to floating-point numbers:


In [None]:
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data() # Load MNIST dataset using Keras
x_train, x_test = x_train / 255.0, x_test / 255.0       # Convert the samples to floating-point numbers

print('Train dataset size:', x_train.shape[0])
print('Test dataset size:', x_test.shape[0])

print('Image shape:', x_train.shape[1:3])
print('Classes:', np.unique(y_train))  # The distinct labels, i.e. the digits 0-9

In [None]:
# Plot train examples
fig, axes = plt.subplots(1,10,figsize=(15,2), sharex=True, sharey=True)
plt.suptitle('Train examples')
for i in range(10):
  axes[i].imshow(x_train[i], cmap=plt.cm.gray_r)
  axes[i].set_title('label: {}'.format(y_train[i]))

# Plot test examples
fig, axes = plt.subplots(1,10,figsize=(15,2), sharex=True, sharey=True)
plt.suptitle('Test examples')
for i in range(10):
  axes[i].imshow(x_test[i], cmap=plt.cm.gray_r)
  axes[i].set_title('label: {}'.format(y_test[i]))

## Define model

### When to use a Sequential model

A `Sequential` model is appropriate for **a plain stack of layers**
where each layer has **exactly one input tensor and one output tensor**.

A Sequential model is **not appropriate** when:

- Your model has multiple inputs or multiple outputs
- Any of your layers has multiple inputs or multiple outputs
- You need to do layer sharing
- You want non-linear topology (e.g. a residual connection, a multi-branch
model)
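
As a rough analogy only, the difference between a plain stack and a non-linear topology can be sketched with ordinary Python functions standing in for layers. The names `sequential`, `double`, `add_one`, and `residual_block` are hypothetical and are not Keras APIs:

```python
from functools import reduce

def sequential(layers):
    """Chain single-input, single-output functions, like a plain stack of layers."""
    def model(x):
        return reduce(lambda value, layer: layer(value), layers, x)
    return model

def double(x):
    return 2 * x

def add_one(x):
    return x + 1

# A plain stack: the output of each "layer" feeds the next one.
stack = sequential([double, add_one])

def residual_block(x):
    # The input is used twice (once through the layer, once as a skip
    # connection), so this is not a simple chain of layers.
    return double(x) + x

print(stack(3))           # 7
print(residual_block(3))  # 9
```

In Keras, topologies like the second one are typically built with the functional API rather than `Sequential`.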

Build the `tf.keras.Sequential` model by stacking layers (the loss function and optimizer are chosen in later sections):

In [None]:
# Model definition
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28,28)),
  tf.keras.layers.Dense(10)
])
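
To make the two layers concrete, here is a minimal pure-Python sketch of what they compute, assuming a tiny 2x2 "image" and 3 classes instead of 28x28 and 10. The helpers `flatten` and `dense` and the hand-picked weights are illustrative only; Keras initializes the real weights randomly, and the weight layout here is chosen for readability rather than to match Keras internals.

```python
def flatten(image):
    """Turn a 2-D grid of pixels into one flat list, row by row."""
    return [pixel for row in image for pixel in row]

def dense(inputs, weights, biases):
    """Linear layer: one weighted sum of all inputs, plus a bias, per output unit."""
    return [
        sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        for unit_weights, bias in zip(weights, biases)
    ]

image = [[0.0, 1.0],
         [0.5, 0.0]]
weights = [[1.0, 0.0, 0.0, 0.0],   # one row of weights per output unit
           [0.0, 2.0, 0.0, 0.0],
           [0.0, 0.0, 4.0, 1.0]]
biases = [0.5, 0.0, -1.0]

logits = dense(flatten(image), weights, biases)
print(logits)  # [0.5, 2.0, 1.0]

# Parameter count of the real model: 28*28 inputs times 10 units, plus 10 biases.
num_params = 28 * 28 * 10 + 10
print(num_params)  # 7850
```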

You can also create a Sequential model incrementally via the `add()` method:

In [None]:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(28,28)))
model.add(tf.keras.layers.Dense(10))

For each example the model returns a vector of "[logits](https://developers.google.com/machine-learning/glossary#logits)" or "[log-odds](https://developers.google.com/machine-learning/glossary#log-odds)" scores, one for each class.

In [None]:
predictions = model(x_train[:1]).numpy()
predictions

The `tf.nn.softmax` function converts these logits to "probabilities" for each class: 

In [None]:
tf.nn.softmax(predictions).numpy()

Note: It is possible to bake this `tf.nn.softmax` in as the activation function for the last layer of the network. While this can make the model output more directly interpretable, this approach is discouraged as it's impossible to
provide an exact and numerically stable loss calculation for all models when using a softmax output. 
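
The kind of numerical problem this note refers to can be illustrated without TensorFlow. The sketch below is not TensorFlow's implementation; `naive_softmax` and `stable_softmax` are hypothetical helper names, and the logits are made-up values chosen to trigger an overflow.

```python
import math

def naive_softmax(logits):
    """Textbook softmax: exponentiate, then normalize."""
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def stable_softmax(logits):
    """Same result, but shift by the max logit so exp() never overflows."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

small = [2.0, 1.0, 0.1]
print(stable_softmax(small))

large = [1000.0, 1000.0]
print(stable_softmax(large))  # [0.5, 0.5]
try:
    naive_softmax(large)
except OverflowError as err:
    print('naive softmax failed:', err)
```

Working directly from logits lets a loss function apply this kind of shift internally, which is one reason to keep the softmax out of the model's last layer.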

## Define loss function

The `losses.SparseCategoricalCrossentropy` loss takes a vector of logits and the index of the true class, and returns a scalar loss for each example.

In [None]:
# Define loss function
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

This loss is equal to the negative log probability of the true class: it is zero if the model is sure of the correct class.

This untrained model gives probabilities close to random (1/10 for each class), so the initial loss should be close to `-tf.math.log(1/10) ~= 2.3`.

In [None]:
loss_fn(y_train[:1], predictions).numpy()
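
The expected value of about 2.3 can be checked by hand. The following is a minimal sketch of the per-example loss computed from logits, assuming perfectly uniform logits for the untrained case; `sparse_cross_entropy_from_logits` is a hypothetical helper, not the Keras implementation.

```python
import math

def sparse_cross_entropy_from_logits(logits, true_index):
    """Negative log-probability of the true class, computed via log-sum-exp."""
    shift = max(logits)
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[true_index]

# Uniform logits: every class gets probability 1/10.
uniform_logits = [0.0] * 10
uniform_loss = sparse_cross_entropy_from_logits(uniform_logits, 3)
print(uniform_loss)  # about 2.3026, i.e. log(10)

# A model that is very sure of the correct class.
confident_logits = [0.0] * 10
confident_logits[3] = 50.0
confident_loss = sparse_cross_entropy_from_logits(confident_logits, 3)
print(confident_loss)  # very close to 0
```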

## Optimize model

In [None]:
# Compile model graph
model.compile(optimizer='adam', 
              loss=loss_fn,
              metrics=['accuracy'])

The `Model.fit` method adjusts the model parameters to minimize the loss:

In [None]:
# Optimize model
model.fit(x_train, y_train, epochs=5,
          callbacks=[tf.keras.callbacks.TensorBoard("./logs/keras")])
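
To give a feel for what "adjusting the parameters to minimize the loss" means, here is a small pure-Python sketch. It uses plain stochastic gradient descent rather than Adam, a made-up four-point dataset with two classes, and hypothetical helper names (`predict_logits`, `mean_loss`, `sgd_epoch`); it is not how `Model.fit` is implemented.

```python
import math

def softmax(logits):
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_logits(W, b, x):
    # W holds one row of weights per class.
    return [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]

def mean_loss(W, b, xs, ys):
    """Average negative log-probability of the true class."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += -math.log(softmax(predict_logits(W, b, x))[y])
    return total / len(xs)

def sgd_epoch(W, b, xs, ys, lr=0.1):
    """One pass over the data, updating the parameters after every example."""
    for x, y in zip(xs, ys):
        probs = softmax(predict_logits(W, b, x))
        for k in range(len(W)):
            # Gradient of the loss w.r.t. logit k: probability minus one-hot label.
            grad_logit = probs[k] - (1.0 if k == y else 0.0)
            for j in range(len(x)):
                W[k][j] -= lr * grad_logit * x[j]
            b[k] -= lr * grad_logit

# Toy data (assumed for illustration): two features, two classes.
xs = [[0.0, 1.0], [1.0, 0.0], [0.1, 0.9], [0.9, 0.2]]
ys = [0, 1, 0, 1]
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]

loss_before = mean_loss(W, b, xs, ys)
for _ in range(20):
    sgd_epoch(W, b, xs, ys)
loss_after = mean_loss(W, b, xs, ys)
print(loss_before, loss_after)  # the loss goes down as the parameters are updated
```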

## Evaluate model
The `Model.evaluate` method checks the model's performance, usually on a "[Validation-set](https://developers.google.com/machine-learning/glossary#validation-set)" or "[Test-set](https://developers.google.com/machine-learning/glossary#test-set)".

In [None]:
# Evaluate model performance
model.evaluate(x_test, y_test, verbose=2)

The image classifier is now trained to ~92% accuracy on this dataset. To learn more, read the [TensorFlow tutorials](https://www.tensorflow.org/tutorials/).
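
The accuracy metric itself is simple: the fraction of examples whose highest-scoring class matches the label. A minimal sketch, using made-up logits for four examples and three classes (the helpers `argmax` and `accuracy` are illustrative, not the Keras metric implementation):

```python
def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(batch_logits, labels):
    """Fraction of examples whose highest-scoring class equals the label."""
    correct = sum(1 for logits, label in zip(batch_logits, labels)
                  if argmax(logits) == label)
    return correct / len(labels)

batch_logits = [[2.0, 0.1, -1.0],
                [0.3, 0.2, 1.5],
                [0.0, 3.0, 0.5],
                [1.0, 0.9, 0.8]]
labels = [0, 2, 1, 1]
print(accuracy(batch_logits, labels))  # 0.75
```

Note that the argmax is taken per example, across the class scores, not across the batch.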

If you want your model to return a probability, you can wrap the trained model, and attach the softmax to it:

In [None]:
probability_model = tf.keras.Sequential([
  model,
  tf.keras.layers.Softmax()
])

In [None]:
print('Prediction probability:\n',probability_model(x_test[:5]))
# Take the argmax over the class axis (axis=1) to get one prediction per image
y_pred = tf.math.argmax(model(x_test[:5]), axis=1).numpy()

fig, axes = plt.subplots(1,5,figsize=(10,3), sharex=True, sharey=True)
plt.suptitle('Test examples')
for i in range(5):
  axes[i].imshow(x_test[i], cmap=plt.cm.gray_r)
  axes[i].set_title('label: {}, pred:{}'.format(y_test[i], y_pred[i]))

## TensorBoard

In [None]:
%tensorboard --logdir logs/keras

## Exercise: Build an MLP model using Sequential

The previous sections implemented a linear model.
This section implements an MLP model. The code is basically the same, except the model is expanded to include some "hidden" non-linear layers. The name "hidden" here just means not directly connected to the inputs or outputs.

This model will contain a few more layers than the linear model:

* The flattening layer
* Two hidden, nonlinear, `Dense` layers using the `relu` nonlinearity.
* A linear `Dense` output layer with 10 units (one logit per class).
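
To see why the hidden layers need a nonlinearity, consider this scalar sketch with hand-picked weights (the functions `two_linear_layers` and `with_relu` are hypothetical, for illustration only). Two stacked linear layers collapse into a single linear function, while inserting ReLU between them does not:

```python
def relu(z):
    """Rectified linear unit: pass positives through, clamp negatives to zero."""
    return max(0.0, z)

def two_linear_layers(x):
    # 2 * (3x + 1) - 4 simplifies to 6x - 2: still a single linear function.
    hidden = 3.0 * x + 1.0
    return 2.0 * hidden - 4.0

def with_relu(x):
    # Same weights, but the hidden value passes through ReLU first.
    hidden = relu(3.0 * x + 1.0)
    return 2.0 * hidden - 4.0

for x in (-1.0, 0.0, 1.0):
    print(x, two_linear_layers(x), 6.0 * x - 2.0, with_relu(x))
```

For `x = -1.0` the two versions disagree, because ReLU clamps the negative hidden value to zero.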

Train the model by using the same training procedure above.

In [None]:
#TODO