### CNN and MNIST

This template demonstrates the following steps:
1. Import a standard dataset from Flux
2. Split the dataset
3. Select the device (a specific GPU, or the CPU)
4. Define the model
5. Define the loss function
6. Define the evaluation method
7. Train the model and run inference

Flux is a deep learning library for Julia. It is a 100% pure-Julia stack that provides lightweight abstractions on top of Julia's native GPU and AD support, which makes it roughly Julia's counterpart to PyTorch. Flux makes the easy things easy while remaining fully hackable.

In [1]:
using Flux, Flux.Data.MNIST, Statistics
using Flux: onehotbatch, onecold, crossentropy, throttle, params
using Base.Iterators: repeated, partition
using CuArrays

Flux already provides a `gpu` function, but it offers limited control over which device is used. Because Flux's GPU support is built on CuArrays, we can use CUDAapi, CUDAdrv, and CUDAnative to choose which GPU Flux runs on, or to fall back to the CPU.

In [2]:
using CUDAapi, CUDAdrv, CUDAnative
gpu_id = 1  # set < 0 to disable CUDA; >= 0 selects a specific device (if available)

if has_cuda_gpu() && gpu_id >= 0
    device!(gpu_id)
    device = Flux.gpu
    @info "Training on GPU-$(gpu_id)"
else
    device = Flux.cpu
    @info "Training on CPU"
end

┌ Info: Training on GPU-1
└ @ Main In[2]:7


Load the dataset and its labels, then one-hot encode the labels.

In [3]:
imgs = MNIST.images()
labels = onehotbatch(MNIST.labels(), 0:9)

10×60000 Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}:
 0  1  0  0  0  0  0  0  0  0  0  0  0  …  0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  1  0  0  1  0  1  0  0  0  0     0  0  0  0  0  0  1  0  0  0  0  0
 0  0  0  0  0  1  0  0  0  0  0  0  0     0  0  0  1  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  1  0  0  1  0  1     0  0  0  0  0  0  0  0  1  0  0  0
 0  0  1  0  0  0  0  0  0  1  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 1  0  0  0  0  0  0  0  0  0  0  1  0  …  0  0  0  0  0  1  0  0  0  1  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  1  0
 0  0  0  0  0  0  0  0  0  0  0  0  0     1  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  0     0  1  0  0  0  0  0  1  0  0  0  1
 0  0  0  0  1  0  0  0  0  0  0  0  0     0  0  1  0  1  0  0  0  0  0  0  0
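To make the encoding concrete, here is a minimal sketch of one-hot encoding in plain Julia. `onehot_matrix` is a hypothetical helper written for illustration only; it is not Flux's `onehotbatch`, which returns a compact `OneHotMatrix` rather than a dense array. The sample labels 5, 0, 4, 1, 9 are read off the first five columns of the output above.

```julia
# Build a dense one-hot matrix: one row per class, one column per label.
# `onehot_matrix` is a hypothetical helper, not part of Flux.
function onehot_matrix(labels, classes)
    m = zeros(Int, length(classes), length(labels))
    for (col, label) in enumerate(labels)
        row = findfirst(isequal(label), classes)
        row === nothing && throw(ArgumentError("label $label is not in classes"))
        m[row, col] = 1
    end
    return m
end

# Labels taken from the first five columns of the output above.
demo = onehot_matrix([5, 0, 4, 1, 9], 0:9)
```

Each column contains exactly one `1`, placed in the row of its class. Since the classes start at 0, digit `d` lands in row `d + 1`.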

Prepare the training set: split the 60,000 images into batches of 1,000 and move every batch to the selected device.

In [4]:
train = [(cat(float.(imgs[i])..., dims = 4), labels[:,i])
         for i in partition(1:60_000, 1000)] |> device

60-element Array{Tuple{CuArray{Float32,4,Nothing},Flux.OneHotMatrix{CuArray{Flux.OneHotVector,1,Nothing}}},1}:
 ⋮
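The batching step can be sketched on small stand-in data using only Base Julia. The fake images below are an assumption made for illustration; the real cell operates on the MNIST images loaded earlier.

```julia
using Base.Iterators: partition

# partition splits an index range into consecutive chunks; the last one may be shorter.
chunks = collect(partition(1:10, 4))

# 60_000 images in batches of 1000 give 60 batches, matching the 60-element array above.
n_batches = length(collect(partition(1:60_000, 1000)))

# Six fake 28×28 "images" (illustrative stand-ins for MNIST digits).
fake_imgs = [fill(Float32(i), 28, 28) for i in 1:6]

# Concatenating along dimension 4 stacks the images into a width×height×channel×batch array.
batch = cat(fake_imgs[1:3]..., dims = 4)
batch_size = size(batch)
```

Concatenating 2-D arrays along dimension 4 inserts a singleton third dimension, which serves as the single grayscale channel that `Conv` expects.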


Select the first 1,000 images of the test split as the test set and move them to the selected device as well.

In [5]:
test_X = cat(float.(MNIST.images(:test)[1:1000])..., dims = 4) |> device
test_y = onehotbatch(MNIST.labels(:test)[1:1000], 0:9) |> device

10×1000 Flux.OneHotMatrix{CuArray{Flux.OneHotVector,1,Nothing}}:
 0  0  0  1  0  0  0  0  0  0  1  0  0  …  0  0  0  0  0  1  0  0  0  1  0  0
 0  0  1  0  0  1  0  0  0  0  0  0  0     1  0  0  0  0  0  1  0  0  0  0  0
 0  1  0  0  0  0  0  0  0  0  0  0  0     0  0  1  0  0  0  0  1  1  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  1  0  0  0  0  0  0  0  0
 0  0  0  0  1  0  1  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  1  0  0  0  0  …  0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  1  0     0  1  0  0  0  0  0  0  0  0  0  0
 1  0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  1  0
 0  0  0  0  0  0  0  1  0  1  0  0  1     0  0  0  0  1  0  0  0  0  0  0  1

Define the model, the loss function, and the evaluation method.

In [6]:
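model = Chain(
  Conv((2,2), 1=>16, relu),         # 28×28×1  -> 27×27×16
  MaxPool((2, 2)),                  # 27×27×16 -> 13×13×16
  Conv((2,2), 16=>8, relu),         # 13×13×16 -> 12×12×8
  MaxPool((2, 2)),                  # 12×12×8  -> 6×6×8
  x -> reshape(x, :, size(x, 4)),   # flatten each image to 6*6*8 = 288 features
  Dense(288, 10), softmax           # one probability per digit class
) |> device

# Cross-entropy between the predicted probabilities and the one-hot labels.
loss(x, y) = crossentropy(model(x), y)
# Fraction of samples whose predicted class matches the label.
accuracy(x, y) = mean(onecold(model(x)) .== onecold(y))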

accuracy (generic function with 1 method)
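The input size of the `Dense(288, 10)` layer follows from the spatial sizes of the feature maps. The sketch below reproduces that arithmetic, assuming the layer defaults used in the model: stride 1 and no padding for `Conv`, and a stride equal to the window size for `MaxPool`. `out_len` and `flattened_features` are hypothetical helper names.

```julia
# Output length along one spatial axis for a window of size `kernel`
# moved with step `stride`, assuming no padding and no dilation.
out_len(in_len, kernel, stride) = (in_len - kernel) ÷ stride + 1

function flattened_features(side)
    side = out_len(side, 2, 1)   # Conv((2,2), 1=>16): 28 -> 27
    side = out_len(side, 2, 2)   # MaxPool((2,2)):     27 -> 13
    side = out_len(side, 2, 1)   # Conv((2,2), 16=>8): 13 -> 12
    side = out_len(side, 2, 2)   # MaxPool((2,2)):     12 -> 6
    return side * side * 8       # 8 channels after the second Conv
end

n_features = flattened_features(28)
```

If the kernel sizes, channel counts, or input resolution change, the first argument of `Dense` has to be recomputed in the same way.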

Train the model and print the accuracy on the test set. The callback is wrapped in `throttle`, so the accuracy is evaluated at most once every 10 seconds.

In [7]:
opt = ADAM(0.01)
evalcb() = @show(accuracy(test_X, test_y))

epochs = 5

for i = 1:epochs
    Flux.train!(loss, Flux.params(model), train, opt, cb=throttle(evalcb, 10))
end

accuracy(test_X, test_y) = 0.192
accuracy(test_X, test_y) = 0.919
accuracy(test_X, test_y) = 0.951
accuracy(test_X, test_y) = 0.962
accuracy(test_X, test_y) = 0.967
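To show what the loss and the printed accuracy measure, here is a minimal plain-Julia sketch on a made-up batch of three samples and three classes. The helpers `col_argmax`, `xent`, and `acc` are hypothetical stand-ins for `onecold`, `crossentropy`, and `accuracy`; Flux's own implementations may differ in details such as weighting or numerical safeguards.

```julia
using Statistics: mean

# Index (1-based) of the largest entry in each column.
col_argmax(m) = [argmax(view(m, :, j)) for j in axes(m, 2)]

# Cross-entropy between predicted probabilities and one-hot targets,
# averaged over the columns (one column per sample).
xent(yhat, y) = -sum(y .* log.(yhat)) / size(y, 2)

# Fraction of columns where the predicted class equals the target class.
acc(yhat, y) = mean(col_argmax(yhat) .== col_argmax(y))

# Made-up predictions: each column sums to 1.
yhat = [0.7 0.2 0.1;
        0.2 0.5 0.3;
        0.1 0.3 0.6]
# One-hot targets: classes 1, 2, 2.
y = [1 0 0;
     0 1 1;
     0 0 0]

batch_loss = xent(yhat, y)
batch_acc = acc(yhat, y)
```

The third sample is misclassified (predicted class 3, target class 2), so the accuracy is 2/3, while the loss penalizes it through the low probability 0.3 assigned to the correct class.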


Run inference on a single image.

In [8]:
using Colors, FileIO, ImageShow

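# Range indices keep all four dimensions, so the model sees a batch of one image.
img = test_X[:, :, 1:1, 7:7]

# onecold returns a 1-based class index; subtract 1 to get the digit 0-9.
println("Predicted: ", Flux.onecold(model(img |> device)) .- 1)
# Copy the image back to host memory before saving it.
save("outputs.jpg", collect(test_X[:, :, 1, 7]))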

Predicted: [4]
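The `.- 1` in the cell above converts a 1-based class index into a digit. A minimal sketch with made-up scores (not actual model output) illustrates the offset:

```julia
classes = 0:9

# Made-up class probabilities for one image; the largest value sits at position 5.
probs = [0.01, 0.02, 0.02, 0.05, 0.80, 0.02, 0.02, 0.02, 0.02, 0.02]

idx = argmax(probs)         # position of the largest score (1-based)
digit = idx - 1             # shift to the 0-based digit
same_digit = classes[idx]   # equivalent: look the index up in the class list
```

Flux's `onecold` also accepts the label collection as an optional second argument, so `onecold(model(img), 0:9)` should give the digit directly without the manual offset.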
