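3.10 Concise Implementation of Multilayer Perceptrons
Below we use TensorFlow to implement the multilayer perceptron from the previous section. First, we import the required packages and modules.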


In [None]:
import tensorflow as tf
from tensorflow import keras
import sys

sys.path.append("..")

fashion_mnist = keras.datasets.fashion_mnist

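3.10.1 Defining the Model
The only difference from softmax regression is that we add one more fully connected layer as a hidden layer. It has 256 hidden units and uses the ReLU function as its activation function.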


In [14]:
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(
            256,
            activation="relu",
        ),
        tf.keras.layers.Dense(10, activation="softmax"),
    ]
)
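
As a rough sanity check on the size of this model, the number of trainable parameters can be worked out by hand from the layer sizes above. The sketch below assumes each fully connected layer holds one weight per input-output pair plus one bias per output; `dense_params` is a hypothetical helper name used only for illustration and is not part of TensorFlow.

```python
def dense_params(n_in, n_out):
    """Trainable parameters of a fully connected layer: weights plus biases."""
    return n_in * n_out + n_out


# Flatten turns each 28x28 image into a vector of 784 values.
n_inputs = 28 * 28

hidden_params = dense_params(n_inputs, 256)  # hidden layer with 256 ReLU units
output_params = dense_params(256, 10)        # output layer with 10 classes
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)  # 200960 2570 203530
```

Calling `model.summary()` on the model defined above should report the same total, since the Flatten layer has no parameters of its own.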


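3.10.2 Reading the Data and Training the Model
We read the data and train the model using almost the same steps as for training softmax regression in Section 3.7.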
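
The loading functions in the next cell skip a fixed number of header bytes (16 for images, 8 for labels). The sketch below illustrates where those numbers come from, assuming the files follow the IDX layout used by MNIST-style datasets: two zero bytes, a data-type code, the number of dimensions, and then one big-endian 32-bit size per dimension. `parse_idx_header` is a hypothetical helper, and the headers are built in memory rather than read from the real files.

```python
import struct


def parse_idx_header(buf):
    """Parse an IDX header (hypothetical helper, for illustration only)."""
    # First 4 bytes: two zero bytes, a data-type code, and the dimension count.
    zero, dtype_code, ndim = struct.unpack(">HBB", buf[:4])
    if zero != 0:
        raise ValueError("not an IDX header")
    # Then one big-endian unsigned 32-bit integer per dimension.
    dims = struct.unpack(">" + "I" * ndim, buf[4 : 4 + 4 * ndim])
    header_size = 4 + 4 * ndim
    return dtype_code, dims, header_size


# Synthetic headers shaped like the image and label files (0x08 = unsigned byte).
image_header = struct.pack(">HBBIII", 0, 0x08, 3, 60000, 28, 28)
label_header = struct.pack(">HBBI", 0, 0x08, 1, 60000)

print(parse_idx_header(image_header))  # (8, (60000, 28, 28), 16)
print(parse_idx_header(label_header))  # (8, (60000,), 8)
```

Under this assumption, a three-dimensional image file has a 16-byte header and a one-dimensional label file has an 8-byte header, which matches the `offset` values passed to `np.frombuffer`.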


In [15]:
import os
import gzip
import numpy as np
import tensorflow as tf

# import keras

data_dir = "./Data"


train_images_path = os.path.join(data_dir, "train-images-idx3-ubyte.gz")
train_labels_path = os.path.join(data_dir, "train-labels-idx1-ubyte.gz")
test_images_path = os.path.join(data_dir, "t10k-images-idx3-ubyte.gz")
test_labels_path = os.path.join(data_dir, "t10k-labels-idx1-ubyte.gz")


def load_images(filename):
    with gzip.open(filename, "rb") as f:
        data = np.frombuffer(f.read(), np.uint8, offset=16)  # skip the 16-byte header
    return data.reshape(-1, 28, 28)  # Fashion-MNIST images are 28x28


def load_labels(filename):
    with gzip.open(filename, "rb") as f:
        data = np.frombuffer(f.read(), np.uint8, offset=8)  # skip the 8-byte header
    return data


# Load the data
x_train = load_images(train_images_path)
y_train = load_labels(train_labels_path)
x_test = load_images(test_images_path)
y_test = load_labels(test_labels_path)


# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train / 255.0
x_test = x_test / 255.0


model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.5),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)


model.fit(
    x_train,
    y_train,
    epochs=5,
    batch_size=256,
    validation_data=(x_test, y_test),
    validation_freq=1,
)

Epoch 1/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.6188 - loss: 1.1229 - val_accuracy: 0.7667 - val_loss: 0.6062
Epoch 2/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8216 - loss: 0.4828 - val_accuracy: 0.8294 - val_loss: 0.4536
Epoch 3/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8427 - loss: 0.4234 - val_accuracy: 0.7996 - val_loss: 0.5532
Epoch 4/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8534 - loss: 0.3987 - val_accuracy: 0.8503 - val_loss: 0.4063
Epoch 5/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8683 - loss: 0.3568 - val_accuracy: 0.8419 - val_loss: 0.4418


<keras.src.callbacks.history.History at 0x19ce9f528f0>
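
The `235/235` shown in each epoch of the log is the number of minibatches per epoch. A small sketch of that arithmetic follows, assuming the training set holds the standard 60,000 Fashion-MNIST images (a figure not stated in this section) and using the `batch_size=256` passed to `model.fit`.

```python
import math

num_train = 60000  # assumed size of the Fashion-MNIST training set
batch_size = 256   # same value as passed to model.fit

# The final minibatch is smaller when batch_size does not divide num_train.
steps_per_epoch = math.ceil(num_train / batch_size)
last_batch = num_train - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch, last_batch)  # 235 96
```

With these numbers, each epoch runs 234 full minibatches of 256 examples followed by one partial minibatch of 96.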

Summary
With TensorFlow 2.0, a multilayer perceptron can be implemented more concisely.
