# Deep Learning in Julia: An Introduction to Neural Network Models

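## Assignment 032: Training an MLP on the Street View House Numbers Dataset

Train an MLP model to classify the Street View House Numbers (SVHN) dataset.

Note: an MLP has limited capacity for image data, so training may give poor results.

Note: Flux is being updated frequently. Make sure your Julia is v1.3 or later and your Flux is v0.10.4 or later (or the latest release).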

In [4]:
using Flux
using Flux.Data: DataLoader
using Flux: @epochs, onecold, onehotbatch, throttle, logitcrossentropy
using MLDatasets
using Statistics



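## Loading the data

When the SVHN2 dataset is requested, MLDatasets first checks whether it has already been downloaded. On first use it asks whether to download the dataset; answer `y`. The whole dataset is about 1.6 GB, so the download may take a while.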

In [5]:
train_X, train_y = SVHN2.traindata(Float32;dir = "D:\\julia\\data\\SVHN2")
test_X,  test_y  = SVHN2.testdata(Float32;dir = "D:\\julia\\data\\SVHN2")

(Float32[0.14901961 0.15294118 … 0.19607843 0.1882353; 0.15294118 0.15294118 … 0.2 0.1882353; … ; 0.16470589 0.16862746 … 0.1764706 0.17254902; 0.15294118 0.15294118 … 0.16470589 0.16470589]

Float32[0.40392157 0.40784314 … 0.45882353 0.4509804; 0.40784314 0.40784314 … 0.4627451 0.4509804; … ; 0.40392157 0.39607844 … 0.45490196 0.4509804; 0.38039216 0.38039216 … 0.44313726 0.44313726]

Float32[0.23529412 0.23921569 … 0.29803923 0.2901961; 0.23921569 0.23921569 … 0.3019608 0.2901961; … ; 0.24313726 0.24705882 … 0.28235295 0.2784314; 0.22352941 0.22352941 … 0.27058825 0.2784314]

Float32[0.5058824 0.5254902 … 0.5411765 0.5137255; 0.49803922 0.52156866 … 0.50980395 0.47843137; … ; 0.48235294 0.49411765 … 0.39607844 0.43529412; 0.48235294 0.49019608 … 0.4392157 0.48235294]

Float32[0.5568628 0.5882353 … 0.59607846 0.5686275; 0.56078434 0.58431375 … 0.5647059 0.53333336; … ; 0.5254902 0.5372549 … 0.41960785 0.4627451; 0.5294118 0.5372549 … 0.4627451 0.50980395]

Float32[0.6 0.627451 … 0.647

In [6]:
typeof(train_X)

Array{Float32,4}

In [8]:
size(train_X)

(32, 32, 3, 73257)

In [7]:
typeof(train_y)

Array{Int64,1}

In [9]:
unique(train_y)

10-element Array{Int64,1}:
  1
  9
  2
  3
  5
  8
  7
  4
  6
 10
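
The labels run from 1 to 10 rather than 0 to 9. In the original SVHN release the digit 0 is stored as class 10, and the output above suggests the same convention applies here. A minimal sketch of mapping labels back to the digits they denote, where `label_to_digit` is a hypothetical helper and not part of MLDatasets:

```julia
# Hypothetical helper: map an SVHN class label (1..10) to the digit it denotes.
# Assumes label 10 stands for the digit 0, as in the original SVHN release.
label_to_digit(label::Integer) = mod(label, 10)

labels = [1, 9, 2, 10]
digits_shown = label_to_digit.(labels)
println(digits_shown)
```

Keeping the labels as 1..10 is convenient in Julia because they can be used directly as 1-based row indices of a one-hot matrix.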

In [11]:
# Transform (w, h, c, b)-shaped input into (w × h × c, b)-shaped output by
#   linearizing all values for each element in the batch.
train_X = Flux.flatten(train_X) # (32, 32, 3, 73257) -> (3072, 73257)
test_X = Flux.flatten(test_X)   # (32, 32, 3, 26032) -> (3072, 26032)

3072×26032 Array{Float32,2}:
 0.14902   0.505882  0.588235  0.509804  …  0.45098   0.376471  0.396078
 0.152941  0.498039  0.588235  0.505882     0.454902  0.380392  0.392157
 0.152941  0.490196  0.596078  0.498039     0.458824  0.380392  0.388235
 0.160784  0.490196  0.603922  0.486275     0.454902  0.376471  0.384314
 0.168627  0.498039  0.611765  0.501961     0.458824  0.372549  0.376471
 0.172549  0.498039  0.603922  0.513726  …  0.466667  0.364706  0.364706
 0.176471  0.494118  0.584314  0.521569     0.470588  0.360784  0.34902
 0.184314  0.482353  0.556863  0.521569     0.470588  0.352941  0.345098
 0.184314  0.482353  0.545098  0.52549      0.462745  0.352941  0.337255
 0.184314  0.486275  0.556863  0.529412     0.458824  0.356863  0.333333
 0.184314  0.494118  0.576471  0.529412  …  0.458824  0.360784  0.329412
 0.184314  0.490196  0.607843  0.517647     0.458824  0.360784  0.321569
 0.184314  0.486275  0.623529  0.501961     0.458824  0.364706  0.313726
 ⋮                     
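
Flattening can be reproduced with a plain `reshape` that keeps only the last (batch) dimension. The sketch below uses a small made-up array in place of the real images and assumes that `Flux.flatten` behaves like this reshape:

```julia
# Toy stand-in for an image batch: width 2, height 2, 3 channels, 5 images.
x = reshape(Float32.(1:60), 2, 2, 3, 5)

# Collapse every dimension except the last one (the batch dimension).
flat = reshape(x, :, size(x, 4))

println(size(flat))  # (12, 5)

# Julia arrays are column-major, so each column holds one image in memory order.
println(flat[:, 1] == vec(x[:, :, :, 1]))
```

Because the reshape only reinterprets the existing memory layout, the spatial arrangement of pixels is no longer visible to the model, which is one reason an MLP is a weak fit for image data.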

In [10]:
train_y = onehotbatch(train_y, 1:10) # e.g. convert [1 2 2 1] (id of category) into [1 0 0 1; 0 1 1 0]
test_y = onehotbatch(test_y, 1:10) # 1:10 because unique(train_y) is 1:10

10×26032 Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}:
 0  0  1  0  0  1  0  1  1  0  0  0  0  …  1  0  1  0  1  0  0  0  0  0  0  0
 0  1  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  1  1  0  0  0
 0  0  0  0  0  0  0  0  0  0  1  0  0     0  0  0  0  0  1  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 1  0  0  0  0  0  0  0  0  0  0  0  1     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  1  0  0  0  0  0  0  1  0  …  0  0  0  0  0  0  1  0  0  0  1  0
 0  0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  1  0  0  0  0  0  1  0  1
 0  0  0  0  0  0  0  0  0  1  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  1  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  1  0  0  0  0  0  0  0  0  0     0  1  0  0  0  0  0  0  0  0  0  0
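
To make the encoding concrete, here is a minimal one-hot sketch in plain Julia. `onehot_matrix` is a hypothetical helper written for illustration; it returns a dense `BitMatrix`, whereas Flux uses its own `OneHotMatrix` type.

```julia
# Hypothetical helper: build a Bool one-hot matrix with one column per sample.
function onehot_matrix(ys::AbstractVector, classes)
    out = falses(length(classes), length(ys))
    for (col, y) in enumerate(ys)
        row = findfirst(isequal(y), classes)
        row === nothing && error("label $y is not in the class list")
        out[row, col] = true
    end
    return out
end

m = onehot_matrix([1, 2, 2, 1], 1:2)
println(m)
```

Each column contains exactly one `true`, in the row that matches the sample's class.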

In [13]:
train = DataLoader(train_X,train_y, batchsize=1024, shuffle=true)
test = DataLoader(test_X,test_y)

DataLoader((Float32[0.14901961 0.5058824 … 0.3764706 0.39607844; 0.15294118 0.49803922 … 0.38039216 0.39215687; … ; 0.2784314 0.5647059 … 0.2901961 0.19607843; 0.2784314 0.6117647 … 0.26666668 0.19607843], Bool[0 0 … 0 0; 0 1 … 0 0; … ; 0 0 … 0 0; 0 0 … 0 0]), 1, 26032, true, 26032, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  26023, 26024, 26025, 26026, 26027, 26028, 26029, 26030, 26031, 26032], false)
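
The training loader splits the 73257 training columns into minibatches of 1024. The sketch below shows that partitioning on plain index ranges; it ignores shuffling and assumes the final, shorter batch is kept rather than dropped.

```julia
n_train   = 73257  # number of training images, from size(train_X) above
batchsize = 1024   # the value passed to the training DataLoader

# Consecutive index ranges, one per minibatch; the last one may be shorter.
batches = collect(Iterators.partition(1:n_train, batchsize))

println(length(batches))        # number of parameter updates per epoch
println(length(last(batches)))  # size of the final, partial batch
```

The test loader is created without a `batchsize` argument, and the printed object appears to show a batch size of 1, so iterating over it visits the 26032 test samples one at a time.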

In [14]:
layer1end = 256;
layer2end = 128;

model = Chain(
    Dense(size(train_X,1),layer1end,relu),
    Dense(layer1end,layer2end,relu),
    Dense(layer2end,size(train_y,1))
)

Chain(Dense(3072, 256, relu), Dense(256, 128, relu), Dense(128, 10))
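
As a rough size check, the number of trainable parameters can be computed from the layer widths printed above. This is plain arithmetic and does not query the Flux model; `dense_params` is a hypothetical helper.

```julia
# Layer widths taken from the printed Chain above.
layer_sizes = [3072, 256, 128, 10]

# A dense layer from n_in to n_out has an n_out × n_in weight matrix plus n_out biases.
dense_params(n_in, n_out) = n_in * n_out + n_out

per_layer = [dense_params(layer_sizes[i], layer_sizes[i + 1])
             for i in 1:length(layer_sizes) - 1]
total = sum(per_layer)

println(per_layer)
println(total)
```

Almost all of the parameters sit in the first layer, because every one of the 3072 input values is connected to each of the 256 hidden units.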

In [19]:
loss(x, y) = logitcrossentropy(model(x), y)

loss (generic function with 1 method)
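
The last `Dense` layer has no activation, so the model outputs raw scores (logits), and `logitcrossentropy` applies the softmax inside the loss. A minimal plain-Julia sketch of that computation follows; `logsoftmax_cols` and `logit_crossentropy` are hypothetical re-implementations for illustration, not Flux's code.

```julia
using Statistics: mean

# Numerically stable log-softmax over each column (one column per sample).
function logsoftmax_cols(logits::AbstractMatrix)
    shifted = logits .- maximum(logits, dims=1)
    return shifted .- log.(sum(exp.(shifted), dims=1))
end

# Mean over samples of -sum(y .* log_softmax(logits)).
function logit_crossentropy(logits::AbstractMatrix, y::AbstractMatrix)
    return mean(-sum(y .* logsoftmax_cols(logits), dims=1))
end

logits  = zeros(Float32, 10, 4)                  # a model with no preference
targets = Bool[i == j for i in 1:10, j in 1:4]   # samples 1..4 have labels 1..4
baseline = logit_crossentropy(logits, targets)
println(baseline)
```

With ten classes, a model that assigns equal probability to every class scores log(10) ≈ 2.30, which is a useful reference point when reading loss values.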

`test` is a `DataLoader`; iterating over it yields `(x, y)` batches.

In [35]:
function test_loss()
    L = 0f0 # Float32 accumulator, matching the model's Float32 outputs
    for (x, y) in test
        L += loss(x, y)
    end
    L/length(test)
end

test_loss (generic function with 1 method)

In [36]:
evalcb() = @show(test_loss())  # for displaying current progress only

evalcb (generic function with 1 method)
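
During training the callback is wrapped in `throttle`, so it runs at most once per time window, not once per batch. The sketch below illustrates the idea with a hypothetical `simple_throttle`; it is a simplified stand-in and not Flux's implementation.

```julia
# Hypothetical simplified throttle: run `f` at most once every `timeout` seconds.
# Calls that arrive too early are skipped, not queued.
function simple_throttle(f, timeout::Real)
    last_run = -Inf
    return function (args...)
        now = time()
        if now - last_run >= timeout
            last_run = now
            return f(args...)
        end
        return nothing
    end
end

calls = Ref(0)
report = simple_throttle(() -> calls[] += 1, 10)

report(); report(); report()  # three calls in quick succession
println(calls[])              # only the first call ran
```

Because the limit is based on wall-clock time and not on epochs, an epoch can print the test loss zero, one, or several times depending on how long it takes.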

In [37]:
epochs = 20
timeout_in_seconds = 10
ps = Flux.params(model)
@epochs epochs Flux.train!(loss, ps, train, ADAM(0.005), cb=throttle(evalcb, timeout_in_seconds))

┌ Info: Epoch 1
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 17.654922f0
┌ Info: Epoch 2
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 4.05785f0
┌ Info: Epoch 3
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 3.1313214f0
┌ Info: Epoch 4
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 3.2584875f0
┌ Info: Epoch 5
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 3.2889419f0
┌ Info: Epoch 6
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 2.2409694f0
test_loss() = 3.0780807f0
┌ Info: Epoch 7
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
┌ Info: Epoch 8
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 3.0603104f0
┌ Info: Epoch 9
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 3.5132546f0
┌ Info: Epoch 10
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 4.170439f0
┌ Info: Epoch 11
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 3.36852f0
┌ Info: Epoch 12
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 3.8231251f0
┌ Info: Epoch 13
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 3.2760942f0
┌ Info: Epoch 14
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 3.1102757f0
┌ Info: Epoch 15
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 4.1055994f0
┌ Info: Epoch 16
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 4.8159614f0
┌ Info: Epoch 17
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 3.7527559f0
test_loss() = 5.8747187f0
┌ Info: Epoch 18
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
┌ Info: Epoch 19
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 3.7824585f0
┌ Info: Epoch 20
└ @ Main C:\Users\HSI\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121
test_loss() = 5.0791707f0
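
The training call uses `ADAM(0.005)`. For reference, here is a sketch of one textbook Adam update in plain Julia. The decay rates and the small constant are the commonly quoted defaults and are assumed here, not read from Flux, and Flux's own implementation may differ in detail; `adam_step!` is a hypothetical name.

```julia
# One textbook Adam update for a parameter vector `x` with gradient `g`.
# beta1, beta2 and eps are assumed defaults, not values read from Flux.
function adam_step!(x, g, m, v, t; lr=0.005, beta1=0.9, beta2=0.999, eps=1e-8)
    @. m = beta1 * m + (1 - beta1) * g    # running mean of gradients
    @. v = beta2 * v + (1 - beta2) * g^2  # running mean of squared gradients
    m_hat = m ./ (1 - beta1^t)            # bias correction for the zero start
    v_hat = v ./ (1 - beta2^t)
    @. x -= lr * m_hat / (sqrt(v_hat) + eps)
    return x
end

x = [1.0, -2.0]
m = zero(x)
v = zero(x)
g = 2 .* x              # gradient of sum(x .^ 2)
adam_step!(x, g, m, v, 1)
println(x)
```

On the first step each coordinate moves by roughly the learning rate regardless of the gradient's magnitude, so the learning rate directly sets the step size early in training.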


In [38]:
accuracy(x, y) = mean(onecold(model(x)) .== onecold(y))
accuracy(test_X, test_y)

0.5751382913337431
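
The accuracy function compares the index of the largest model output with the index of the true class for every column. A minimal plain-Julia sketch of the same idea on made-up scores, where `predicted_class` is a hypothetical stand-in for `onecold`:

```julia
using Statistics: mean

# Hypothetical helper: index of the largest score in each column.
predicted_class(scores::AbstractMatrix) = [argmax(col) for col in eachcol(scores)]

scores = Float32[0.1 2.0 0.3;
                 1.5 0.2 0.1;
                 0.2 0.1 0.9]  # 3 classes × 3 samples
truth  = [2, 1, 1]             # true class index per sample

pred = predicted_class(scores)
acc  = mean(pred .== truth)
println(pred)
println(acc)
```

Accuracy depends only on which class scores highest, while cross-entropy also penalises confident wrong answers, so the two metrics do not have to move together.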