<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://www.tensorflow.org/tutorials/quickstart/beginner"><img src="https://www.tensorflow.org/images/tf_logo_32px.png" />View on TensorFlow.org</a>
  </td>
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/quickstart/beginner.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/tensorflow/docs/blob/master/site/en/tutorials/quickstart/beginner.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
  <td>
    <a href="https://storage.googleapis.com/tensorflow_docs/docs/site/en/tutorials/quickstart/beginner.ipynb"><img src="https://www.tensorflow.org/images/download_logo_32px.png" />Download notebook</a>
  </td>
</table>


This short introduction uses [Keras](https://www.tensorflow.org/guide/keras/overview) to:

1. Build a neural network that classifies images.
2. Train this neural network.
3. Evaluate the accuracy of the model.


This is a [Google Colaboratory](https://colab.research.google.com/notebooks/welcome.ipynb) notebook file. Python programs run directly in the browser, which is a great way to learn and use TensorFlow. To follow this tutorial, run the notebook in Google Colab by clicking the button at the top of this page.

1. In Colab, connect to a Python runtime: At the top-right of the menu bar, select _CONNECT_.
2. Run all the notebook code cells: Select _Runtime_ > _Run all_.


Download and install TensorFlow 2. Import TensorFlow into your program:

Note: Upgrade `pip` to install the TensorFlow 2 package. See the [install guide](https://www.tensorflow.org/install) for details.


In [15]:
# Import the TensorFlow package
import tensorflow as tf
import numpy as np

Load and prepare the [MNIST dataset](http://yann.lecun.com/exdb/mnist/). Convert the samples from integers to floating-point numbers:


In [16]:
# Load the MNIST handwritten digit dataset
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()  # Load the training and test data
x_train, x_test = x_train / 255.0, x_test / 255.0  # Scale pixel intensities to the range [0, 1]

In [17]:
print("x_train:", x_train.shape)
print("y_train:", y_train.shape)
print("x_test:", x_test.shape)
print("y_test:", y_test.shape)

x_train: (60000, 28, 28)
y_train: (60000,)
x_test: (10000, 28, 28)
y_test: (10000,)
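
The scaling step can be checked without downloading anything. The sketch below uses a random `uint8` array as a hypothetical stand-in for the MNIST images (the name `fake_images` and the batch of 4 are illustrative assumptions, not part of the tutorial) and shows that dividing by `255.0` yields floating-point values in [0, 1].

```python
import numpy as np

# Hypothetical stand-in for MNIST: 4 random 28x28 grayscale images stored as uint8.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Dividing by a Python float promotes the array to float64
# and maps the integer range 0..255 onto 0.0..1.0.
scaled = fake_images / 255.0

print(fake_images.dtype, "->", scaled.dtype)
print("min:", scaled.min(), "max:", scaled.max())
```

Keeping inputs in a small, fixed range like this generally makes gradient-based training better behaved than feeding raw 0 to 255 intensities.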


Build the `tf.keras.Sequential` model by stacking layers. Choose an optimizer and loss function for training:


In [18]:
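# Build the model: an MLP (multilayer perceptron) neural network
model = tf.keras.models.Sequential([
    # Flatten each 28x28-pixel grayscale input image into a 784-element vector
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),  # Fully connected layer with ReLU activation
    tf.keras.layers.Dropout(0.2),  # 20% dropout (randomly zeroes 20% of the units during training)
    tf.keras.layers.Dense(10)  # Fully connected output layer (one score per class, 10 classes)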
])
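
To make the layer stack concrete, here is a rough NumPy sketch of the same forward pass. The weights are random and the initialisation scheme is an assumption (Keras uses its own defaults), so the numbers will not match the Keras model; only the shapes and the order of operations are meant to correspond. The function name `forward` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Randomly initialised parameters with the same shapes as the Keras layers.
# (The initialisation here is an assumption, not what Keras does by default.)
W1, b1 = rng.normal(0.0, 0.05, size=(784, 128)), np.zeros(128)
W2, b2 = rng.normal(0.0, 0.05, size=(128, 10)), np.zeros(10)

def forward(images, training=False, rate=0.2):
    """Flatten -> Dense(128, relu) -> Dropout(rate) -> Dense(10) logits."""
    x = images.reshape(len(images), -1)       # Flatten: (N, 28, 28) -> (N, 784)
    h = np.maximum(x @ W1 + b1, 0.0)          # Dense layer followed by ReLU
    if training:                              # Dropout is active only during training
        keep = rng.random(h.shape) >= rate    # keep each unit with probability 1 - rate
        h = h * keep / (1.0 - rate)           # rescale so the expected activation is unchanged
    return h @ W2 + b2                        # Dense(10): one logit per class

batch = rng.random((2, 28, 28))               # two fake images with values in [0, 1)
logits = forward(batch)
n_params = W1.size + b1.size + W2.size + b2.size

print(logits.shape)   # (2, 10)
print(n_params)       # 784*128 + 128 + 128*10 + 10 = 101770
```

The parameter count is dominated by the first dense layer, which connects all 784 pixels to 128 hidden units.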

For each example the model returns a vector of "[logits](https://developers.google.com/machine-learning/glossary#logits)" or "[log-odds](https://developers.google.com/machine-learning/glossary#log-odds)" scores, one for each class.


In [19]:
predictions = model(x_train[:2]).numpy()  # Inspect the 10-class scores for images 0 and 1
predictions

array([[ 0.7220865 ,  0.28849483,  0.24393797, -0.44996282,  0.27778473,
        -0.12496303,  0.00340241, -0.11346176,  0.27729824, -0.02791936],
       [ 0.6286163 ,  0.1393464 ,  0.20468664,  0.13206229,  0.16145432,
        -0.17609222,  0.1398169 , -0.1411345 ,  0.36098874,  0.09811371]],
      dtype=float32)

The `tf.nn.softmax` function converts these logits to "probabilities" for each class:


In [20]:
a = tf.nn.softmax(predictions).numpy()  # Convert the logits to probabilities
print(a)  # Show the probability of each of the 10 classes
print(np.argmax(a, axis=1))  # Show the class predicted by the model

[[0.17598997 0.11407264 0.10910149 0.05450965 0.11285742 0.07544301
  0.08577631 0.0763157  0.11280253 0.08313129]
 [0.15673594 0.09609071 0.10257896 0.09539331 0.09823874 0.07009517
  0.09613593 0.07258888 0.1199332  0.0922092 ]]
[0 0]
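
For reference, softmax itself is only a few lines. The following is a minimal NumPy sketch (not the TensorFlow implementation) applied to the first row of logits printed above; the helper name `softmax` is local to this example.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# First row of logits printed by the untrained model above.
row0 = np.array([0.7220865, 0.28849483, 0.24393797, -0.44996282, 0.27778473,
                 -0.12496303, 0.00340241, -0.11346176, 0.27729824, -0.02791936])

probs = softmax(row0)
print(probs.round(4))
print(probs.sum(), probs.argmax())
```

The largest logit (index 0) receives the largest probability, about 0.176, which agrees with the first row of the `tf.nn.softmax` output.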


Note: It is possible to bake this `tf.nn.softmax` in as the activation function for the last layer of the network. While this can make the model output more directly interpretable, this approach is discouraged as it's impossible to provide an exact and numerically stable loss calculation for all models when using a softmax output.
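
The stability concern can be illustrated with a deliberately extreme case. In this NumPy sketch the logits are made-up values chosen to force underflow; it compares computing the loss from softmax probabilities against computing it directly from the logits with the log-sum-exp identity.

```python
import numpy as np

logits = np.array([0.0, 800.0])   # extreme, made-up scores; the true class is index 0
true_class = 0

# Route 1: apply softmax first, then take -log of the resulting probability.
shifted = logits - logits.max()
probs = np.exp(shifted) / np.exp(shifted).sum()   # exp(-800) underflows to exactly 0.0
with np.errstate(divide="ignore"):
    loss_from_probs = -np.log(probs[true_class])  # -log(0) = inf

# Route 2: work on the logits directly via the log-sum-exp identity.
log_sum_exp = logits.max() + np.log(np.exp(shifted).sum())
loss_from_logits = log_sum_exp - logits[true_class]

print(loss_from_probs)    # inf
print(loss_from_logits)   # 800.0
```

Once the probability has been rounded to zero, the information needed for a finite loss is gone, which is one reason to keep the model output as logits and let the loss function handle the conversion.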


The `losses.SparseCategoricalCrossentropy` loss takes a vector of logits and a `True` index and returns a scalar loss for each example.


In [21]:
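loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True)  # Use sparse categorical cross-entropy as the loss function
# The loss is the negative log of the probability assigned to the true class.
# With 10 classes, each class has a probability near 1/10 before training, so the loss is about 2.3.
# If the classification is entirely correct (probability 1), the loss is 0.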

This loss is equal to the negative log probability of the true class. It is zero if the model is sure of the correct class.

This untrained model gives probabilities close to random (1/10 for each class), so the initial loss should be close to `-tf.math.log(1/10) ~= 2.3`.
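
Both limiting cases are easy to check numerically. The sketch below is a plain NumPy re-implementation for a single example, not the Keras loss itself; the function name `sparse_ce_from_logits` and the test logits are assumptions made for illustration.

```python
import numpy as np

def sparse_ce_from_logits(logits, label):
    """-log softmax(logits)[label], computed stably with log-sum-exp."""
    shifted = logits - logits.max()
    return np.log(np.exp(shifted).sum()) - shifted[label]

uniform = np.zeros(10)        # equal logits -> probability 1/10 for every class
confident = np.zeros(10)
confident[3] = 50.0           # almost all probability mass on class 3

print(sparse_ce_from_logits(uniform, 3))     # log(10), about 2.3026
print(sparse_ce_from_logits(confident, 3))   # very close to 0
```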


In [22]:
loss_fn(y_train[:2], predictions).numpy()  # Mean loss over examples 0 and 1

2.2187853
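
This value can be reproduced by hand from the softmax probabilities printed earlier. The sketch assumes the first two training labels are 5 and 0, which holds for the standard MNIST training set but is not printed anywhere in this notebook, so treat it as an assumption.

```python
import numpy as np

# Probabilities the untrained model assigned to the assumed true classes,
# copied from the softmax output above: row 0 / class 5 and row 1 / class 0.
p_true = np.array([0.07544301, 0.15673594])

per_example = -np.log(p_true)     # negative log probability of each true class
mean_loss = per_example.mean()    # the loss object averages over the batch by default

print(per_example)
print(mean_loss)                  # about 2.2188
```

The mean agrees with the 2.2187853 returned by `loss_fn` to within rounding.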

In [23]:
model.compile(optimizer='adam',  # Use the Adam optimizer
              loss=loss_fn,  # Set the loss function
              metrics=['accuracy'])  # Track accuracy (the fraction of correct predictions)

The `Model.fit` method adjusts the model parameters to minimize the loss:


In [24]:
model.fit(x_train, y_train, epochs=5)  # Train for 5 epochs (5 full passes over the training data)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
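
Each epoch is split into mini-batches. `Model.fit` and `Model.evaluate` use a batch size of 32 unless told otherwise, so the step counts shown in their progress logs follow from the dataset sizes. A small sketch of that arithmetic (the helper name `steps_per_epoch` is hypothetical):

```python
import math

def steps_per_epoch(num_examples, batch_size=32):
    """Number of mini-batches needed to see every example once."""
    return math.ceil(num_examples / batch_size)

print(steps_per_epoch(60000))   # 1875 training steps per epoch
print(steps_per_epoch(10000))   # 313 steps over the test set
```

The last test batch is only half full (10000 = 312 * 32 + 16), which is why the count rounds up.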

The `Model.evaluate` method checks the model's performance, usually on a "[Validation-set](https://developers.google.com/machine-learning/glossary#validation-set)" or "[Test-set](https://developers.google.com/machine-learning/glossary#test-set)".


In [None]:
model.evaluate(x_test, y_test, verbose=2)  # Evaluate the mean loss and accuracy on the test set
# verbose=0 prints nothing, verbose=1 shows progress, verbose=2 shows only the result

313/313 - 2s - loss: 0.0755 - accuracy: 0.9783 - 2s/epoch - 5ms/step


[0.07549842447042465, 0.9782999753952026]
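
The accuracy figure is simply the fraction of examples whose highest-scoring class matches the label. A minimal NumPy sketch with made-up logits and labels (4 examples, 3 classes) shows the computation:

```python
import numpy as np

# Made-up logits for 4 examples and 3 classes, plus their true labels.
logits = np.array([[2.0, 0.1, -1.0],
                   [0.3, 0.2, 1.5],
                   [0.0, 3.0, 0.5],
                   [1.2, 0.4, 0.9]])
labels = np.array([0, 2, 1, 2])

predicted = logits.argmax(axis=1)          # class with the highest score per example
accuracy = (predicted == labels).mean()    # fraction of correct predictions

print(predicted)   # [0 2 1 0]
print(accuracy)    # 0.75
```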

The image classifier is now trained to ~98% accuracy on this dataset. To learn more, read the [TensorFlow tutorials](https://www.tensorflow.org/tutorials/).


If you want your model to return a probability, you can wrap the trained model, and attach the softmax to it:


In [None]:
probability_model = tf.keras.Sequential([
    model,
    tf.keras.layers.Softmax()  # Convert the outputs to probabilities
])

In [None]:
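b = probability_model(x_test[:5])  # Probabilities over the 10 classes [0, ..., 9] for the first five test images
print(b)  # Show the probabilities for the 10 classes
print(np.argmax(b, axis=1))  # Show the predicted class for each of the first five images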

tf.Tensor(
[[1.06512354e-08 1.42511411e-07 8.83808389e-06 4.58419527e-05
  9.54334459e-11 1.08424402e-07 1.55195721e-13 9.99944448e-01
  1.89251935e-07 3.70345134e-07]
 [2.09968825e-08 3.55544726e-05 9.99923944e-01 1.68137899e-06
  1.48324617e-13 2.84486896e-06 1.80849838e-06 4.77528851e-14
  3.42051244e-05 3.29599889e-11]
 [6.26648387e-07 9.99031067e-01 8.03726725e-05 7.24194069e-06
  2.58711480e-05 2.70645114e-06 2.78120497e-05 7.28668121e-04
  9.48008310e-05 8.49367723e-07]
 [9.99954581e-01 4.14269764e-11 5.73308716e-06 6.29555572e-08
  4.62593192e-07 9.39399243e-08 1.09966993e-06 7.32857006e-06
  8.56520899e-09 3.06364600e-05]
 [9.88371539e-06 1.84061238e-10 2.03134459e-06 4.40126122e-08
  9.98960853e-01 1.88695068e-07 4.71576686e-06 8.39873828e-05
  3.08169206e-07 9.37980891e-04]], shape=(5, 10), dtype=float32)
[7 2 1 0 4]
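
Wrapping the model this way changes how the outputs read, not which class wins: softmax is monotonic, so the arg-max of the probabilities equals the arg-max of the logits. The NumPy sketch below illustrates this with a hypothetical stand-in for the trained model (`logits_model`, a fixed linear map invented for this example).

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def logits_model(x):
    """Hypothetical stand-in for the trained model: a fixed linear map to 3 scores."""
    weights = np.array([[1.0, -1.0, 0.5],
                        [0.0, 2.0, -0.5]])
    return x @ weights

def probability_model(x):
    """Compose the logits model with softmax, mirroring the wrapper above."""
    return softmax(logits_model(x))

x = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 1.5]])
logits, probs = logits_model(x), probability_model(x)

print(probs.sum(axis=1))                             # each row sums to 1
print(logits.argmax(axis=1), probs.argmax(axis=1))   # identical predictions
```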
