# Stage 5: DeZero advanced challenge

DeZero now has the basic features needed to build neural networks. This stage adds some extra features to make DeZero more complete.

In [1]:
import time
import dezero
import dezero.functions as F
from dezero import optimizers, DataLoader
from dezero.models import MLP
import numpy as np
import os

## Step 52: Support GPU

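CuPy exposes an interface similar to NumPy's, so existing NumPy code is easy to port directly.

For DeZero to support GPUs, it must be able to determine which module the current array belongs to, and it must be able to switch between NumPy and CuPy.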

In [4]:
import cupy as cp
import numpy as np

# numpy -> cupy
n = np.array([1, 2, 3])
c = cp.array(n)
assert type(c) == cp.ndarray

# cupy -> numpy
c = cp.array([1, 2, 3])
n = cp.asnumpy(c)
assert type(n) == np.ndarray

# get the array module
x = np.array([1, 2, 3])
xp = cp.get_array_module(x)
assert xp == np

x = cp.array([1, 2, 3])
xp = cp.get_array_module(x)
assert xp == cp
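
The module lookup shown above can be wrapped so that the same code also runs on machines without CuPy. The following is a minimal sketch under that assumption, not DeZero's own implementation: the flag `gpu_enable` and the helpers `get_array_module`, `as_numpy`, and `softmax` are names chosen here for illustration, and the sketch handles plain arrays only.

```python
import numpy as np

# Assumption: CuPy is optional. If the import fails, everything stays on NumPy.
try:
    import cupy as cp
    gpu_enable = True
except ImportError:
    cp = None
    gpu_enable = False


def get_array_module(x):
    """Return cupy for CuPy arrays (when CuPy is available), otherwise numpy."""
    if not gpu_enable:
        return np
    return cp.get_array_module(x)


def as_numpy(x):
    """Convert a scalar, NumPy array, or CuPy array to a NumPy array."""
    if np.isscalar(x):
        return np.array(x)
    if isinstance(x, np.ndarray):
        return x
    if gpu_enable:
        return cp.asnumpy(x)
    return np.asarray(x)


def softmax(x):
    """Row-wise softmax written against `xp`, so it works for either module."""
    xp = get_array_module(x)
    shifted = x - xp.max(x, axis=-1, keepdims=True)  # subtract the max for stability
    e = xp.exp(shifted)
    return e / xp.sum(e, axis=-1, keepdims=True)


probs = softmax(np.array([[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]))
print(as_numpy(probs).sum(axis=-1))  # each row sums to 1
```

Writing numerical code against `xp` rather than `np` is what lets one function body serve both CPU and GPU arrays.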

In [2]:
# hyperparameters
max_epoch = 5
batch_size = 100

# dataset
train_set = dezero.datasets.MNIST(train=True)
train_loader = DataLoader(train_set, batch_size)

# model
model = MLP((1000, 10))
optimizer = optimizers.SGD().setup(model)

if dezero.cuda.gpu_enable:
    train_loader.to_gpu()
    model.to_gpu()

# training
for epoch in range(max_epoch):
    start = time.time()
    sum_loss = 0

    for x, t in train_loader:
        y = model(x)
        loss = F.softmax_cross_entropy(y, t)
        model.cleargrads()
        loss.backward()
        optimizer.update()

        sum_loss += float(loss.data) * len(t)

    elapsed_time = time.time() - start
    print('epoch: {}, loss: {:.4f}, time: {:.4f}[sec]'.format(
        epoch + 1, sum_loss / len(train_set), elapsed_time))

epoch: 1, loss: 1.9050, time: 14.1970[sec]
epoch: 2, loss: 1.2751, time: 6.1311[sec]
epoch: 3, loss: 0.9201, time: 6.5514[sec]
epoch: 4, loss: 0.7378, time: 5.6780[sec]
epoch: 5, loss: 0.6343, time: 5.6873[sec]


## Step 53: Model saving and loading

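DeZero's parameters are instances of the Parameter class, and each Parameter stores its data as an ndarray instance in its data attribute. NumPy's functions can therefore be used to save them.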

In [3]:
x1 = np.array([1, 2, 3])
x2 = np.array([4, 5, 6])
data = {'x1': x1, 'x2': x2}

np.savez('test.npz', **data)

arrays = np.load('test.npz')
x1 = arrays['x1']
x2 = arrays['x2']
print(x1)
print(x2)

[1 2 3]
[4 5 6]


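A Layer hierarchy is a nested structure. To extract the parameters it can be "flattened", which in practice means collecting the parameters recursively.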
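
The recursive flattening can be sketched as follows. This is an illustration under stated assumptions rather than DeZero's actual code: `TinyLayer` is a hypothetical stand-in for a layer, its `flatten_params` method and the `/`-separated key format are choices made for this example, and plain arrays stand in for Parameter instances.

```python
import numpy as np


class TinyLayer:
    """Hypothetical layer: attributes are either arrays or child TinyLayers."""

    def __init__(self, **members):
        self.__dict__.update(members)

    def flatten_params(self, params_dict, parent_key=""):
        """Recursively collect arrays into params_dict under path-like keys."""
        for name, obj in self.__dict__.items():
            key = parent_key + "/" + name if parent_key else name
            if isinstance(obj, TinyLayer):
                # Child layer: descend, extending the key prefix.
                obj.flatten_params(params_dict, key)
            else:
                # Leaf: store the array under its full path.
                params_dict[key] = obj


inner = TinyLayer(W=np.ones((2, 3)), b=np.zeros(3))
net = TinyLayer(l1=inner, W=np.ones((3, 1)))

flat = {}
net.flatten_params(flat)
print(sorted(flat))  # ['W', 'l1/W', 'l1/b']
```

A flat dictionary of this shape is exactly what `np.savez` accepts as keyword arguments, which is why flattening comes before saving.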

In [3]:
max_epoch = 3
batch_size = 100

train_set = dezero.datasets.MNIST(train=True)
train_loader = DataLoader(train_set, batch_size)
model = MLP((100, 10))
optimizer = optimizers.SGD().setup(model)

if os.path.exists('my_mlp.npz'):
    model.load_weights('my_mlp.npz')

for epoch in range(max_epoch):
    sum_loss = 0

    for x, t in train_loader:
        y = model(x)
        loss = F.softmax_cross_entropy(y, t)
        model.cleargrads()
        loss.backward()
        optimizer.update()
        sum_loss += float(loss.data) * len(t)
    
    print('epoch: {}, loss: {:.4f}'.format(
        epoch + 1, sum_loss / len(train_set)))
    
model.save_weights('my_mlp.npz')

epoch: 1, loss: 0.9778
epoch: 2, loss: 0.8133
epoch: 3, loss: 0.7077
