# Julia 深度學習：類神經網路模型簡介

## 作業 032：訓練 MLP 學習門牌號碼資料集

訓練一個 MLP 模型來學習門牌號碼資料集。

注意：MLP 模型的能力有限，可能會導致訓練起來效果不佳。

注意：近期 Flux 正在持續更新，請確保您的 Julia 在 v1.3 版以上，以及 Flux 在 v0.10.4 以上或是最新版。

In [None]:
using Pkg
Pkg.add("Flux")

In [17]:
using Flux
using Flux.Data: DataLoader
using Flux: @epochs, onecold, onehotbatch, throttle, logitcrossentropy
using MLDatasets
using Statistics

## 讀取資料

呼叫 SVHN2 資料集的過程中，會先去檢查以前是否有下載過，如果是第一次下載，過程中會詢問是否下載資料集，請回答 `y`。整個資料集的大小約為 1.6 GB，下載時間可能會稍久，請耐心等候。

In [18]:
origintrain_X, origintrain_y = SVHN2.traindata(Float32)
origintest_X,  origintest_y  = SVHN2.testdata(Float32)

(Float32[0.14901961 0.15294118 … 0.19607843 0.1882353; 0.15294118 0.15294118 … 0.2 0.1882353; … ; 0.16470589 0.16862746 … 0.1764706 0.17254902; 0.15294118 0.15294118 … 0.16470589 0.16470589]

Float32[0.40392157 0.40784314 … 0.45882353 0.4509804; 0.40784314 0.40784314 … 0.4627451 0.4509804; … ; 0.40392157 0.39607844 … 0.45490196 0.4509804; 0.38039216 0.38039216 … 0.44313726 0.44313726]

Float32[0.23529412 0.23921569 … 0.29803923 0.2901961; 0.23921569 0.23921569 … 0.3019608 0.2901961; … ; 0.24313726 0.24705882 … 0.28235295 0.2784314; 0.22352941 0.22352941 … 0.27058825 0.2784314]

Float32[0.5058824 0.5254902 … 0.5411765 0.5137255; 0.49803922 0.52156866 … 0.50980395 0.47843137; … ; 0.48235294 0.49411765 … 0.39607844 0.43529412; 0.48235294 0.49019608 … 0.4392157 0.48235294]

Float32[0.5568628 0.5882353 … 0.59607846 0.5686275; 0.56078434 0.58431375 … 0.5647059 0.53333336; … ; 0.5254902 0.5372549 … 0.41960785 0.4627451; 0.5294118 0.5372549 … 0.4627451 0.50980395]

Float32[0.6 0.627451 … 0.647

In [19]:
println(size(origintrain_X))
println(size(origintrain_y))
println(size(origintest_X))
println(size(origintest_y))

(32, 32, 3, 73257)
(73257,)
(32, 32, 3, 26032)
(26032,)


In [20]:
train_X = Flux.flatten(origintrain_X)
test_X = Flux.flatten(origintest_X)
train_y = onehotbatch(origintrain_y, 1:10)
test_y = onehotbatch(origintest_y, 1:10)

10×26032 Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}:
 0  0  1  0  0  1  0  1  1  0  0  0  0  …  1  0  1  0  1  0  0  0  0  0  0  0
 0  1  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  1  1  0  0  0
 0  0  0  0  0  0  0  0  0  0  1  0  0     0  0  0  0  0  1  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 1  0  0  0  0  0  0  0  0  0  0  0  1     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  1  0  0  0  0  0  0  1  0  …  0  0  0  0  0  0  1  0  0  0  1  0
 0  0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  1  0  0  0  0  0  1  0  1
 0  0  0  0  0  0  0  0  0  1  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  1  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  0  0
 0  0  0  1  0  0  0  0  0  0  0  0  0     0  1  0  0  0  0  0  0  0  0  0  0

In [21]:
println(size(train_X))
println(size(train_y))
println(size(test_X))
println(size(test_y))

(3072, 73257)
(10, 73257)
(3072, 26032)
(10, 26032)


In [22]:
batchsize = 1024
train = DataLoader(train_X, train_y, batchsize=batchsize, shuffle=true)
test = DataLoader(test_X, test_y, batchsize=batchsize)

DataLoader((Float32[0.14901961 0.5058824 … 0.3764706 0.39607844; 0.15294118 0.49803922 … 0.38039216 0.39215687; … ; 0.2784314 0.5647059 … 0.2901961 0.19607843; 0.2784314 0.6117647 … 0.26666668 0.19607843], Bool[0 0 … 0 0; 0 1 … 0 0; … ; 0 0 … 0 0; 0 0 … 0 0]), 1024, 26032, true, 26032, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  26023, 26024, 26025, 26026, 26027, 26028, 26029, 26030, 26031, 26032], false)

In [23]:
size(train_X)

(3072, 73257)

In [24]:
# model = Chain(
#   Dense(3072, 768, σ),
#   Dense(768, 256,σ),
#   Dense(256, 10),
#   softmax)

model = Chain(
  Dense(3072, 1536, σ),
  Dense(1536, 768, σ),
  Dense(768, 256, σ),
  Dense(256, 128, σ),
  Dense(128, 10),
  softmax)

# model = Chain(
#   Dense(784, 256, relu),
#   Dense(256, 128, relu),
#   Dense(128, 10),
#   softmax)

Chain(Dense(3072, 1536, σ), Dense(1536, 768, σ), Dense(768, 256, σ), Dense(256, 128, σ), Dense(128, 10), softmax)

In [25]:
loss(x, y) = logitcrossentropy(model(x), y)
function test_loss()
    l = 0f0
    for (x, y) in test
        l += loss(x, y)
    end
    l/length(test)
end
evalcb() = @show(test_loss())

evalcb (generic function with 1 method)

In [26]:
epochs = 20
@epochs epochs Flux.train!(loss, params(model), train, ADAM(0.005), cb=throttle(evalcb, 10))

┌ Info: Epoch 1
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.2959933f0
test_loss() = 2.2451975f0
test_loss() = 2.2533665f0


┌ Info: Epoch 2
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.2451954f0
test_loss() = 2.2458086f0


┌ Info: Epoch 3
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.247957f0


┌ Info: Epoch 4
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.2463098f0
test_loss() = 2.2449305f0
test_loss() = 2.2456987f0

┌ Info: Epoch 5
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121



test_loss() = 2.2452912f0
test_loss() = 2.245207f0


┌ Info: Epoch 6
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.2465518f0
test_loss() = 2.245768f0


┌ Info: Epoch 7
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.2456381f0


┌ Info: Epoch 8
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.2476206f0
test_loss() = 2.245648f0


┌ Info: Epoch 9
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.2454543f0
test_loss() = 2.2450178f0
test_loss() = 2.2456598f0


┌ Info: Epoch 10
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.2459617f0


┌ Info: Epoch 11
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.2454147f0
test_loss() = 2.2451837f0


┌ Info: Epoch 12
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.2452474f0
test_loss() = 2.245167f0


┌ Info: Epoch 13
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.247278f0
test_loss() = 2.2453902f0
test_loss() = 2.2468958f0


┌ Info: Epoch 14
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.24568f0
test_loss() = 2.246213f0


┌ Info: Epoch 15
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.2467601f0


┌ Info: Epoch 16
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.2456083f0
test_loss() = 2.2454007f0


┌ Info: Epoch 17
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.2478468f0
test_loss() = 2.2464736f0
test_loss() = 2.2456493f0


┌ Info: Epoch 18
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.2454305f0


┌ Info: Epoch 19
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.245773f0
test_loss() = 2.2467f0
test_loss() = 2.2454882f0


┌ Info: Epoch 20
└ @ Main C:\Users\User\.julia\packages\Flux\Fj3bt\src\optimise\train.jl:121


test_loss() = 2.2464564f0


In [27]:
accuracy(x, y) = mean(onecold(model(x)) .== onecold(y))
accuracy(test_X, test_y)

0.1958743085433313

### SVM Testing

In [10]:
using LIBSVM, MLDatasets, Statistics

In [11]:
train_y = string.(origintrain_y) # convert to string type
test_y =  string.(origintest_y) # convert to string type
println(size(train_X))
println(size(train_y))

(3072, 73257)
(73257,)


In [28]:
svmtrain_X = reshape(train_X,size(train_X)[2],size(train_X)[1])
println(size(svmtrain_X))
model = LIBSVM.fit!(SVC(), svmtrain_X[1:1000,:], train_y[1:1000])

(73257, 3072)


SVC(LIBSVM.Kernel.RadialBasis, 0.0003255208333333333, nothing, 1.0, 3, 0.0, 0.001, true, false, false, LIBSVM.SVM{Bool}(SVC, LIBSVM.Kernel.RadialBasis, nothing, 3072, 2, Bool[1, 0], Int32[1, 2], Float64[], Int32[], LIBSVM.SupportVectors{Bool,Float32}(339, Int32[100, 239], Bool[1, 1, 1, 1, 1, 1, 1, 1, 1, 1  …  0, 0, 0, 0, 0, 0, 0, 0, 0, 0], Float32[0.12941177 0.23529412 … 0.28235295 0.3529412; 0.23529412 0.49803922 … 0.34901962 0.37254903; … ; 0.25490198 0.5921569 … 0.16470589 0.31764707; 0.972549 0.92941177 … 0.078431375 0.12156863], Int32[1, 19, 22, 33, 42, 55, 69, 73, 83, 91  …  969, 972, 974, 976, 981, 983, 993, 996, 998, 1000], LIBSVM.SVMNode[LIBSVM.SVMNode(1, 0.12941177189350128), LIBSVM.SVMNode(1, 0.23529411852359772), LIBSVM.SVMNode(1, 0.2705882489681244), LIBSVM.SVMNode(1, 0.10980392247438431), LIBSVM.SVMNode(1, 0.3450980484485626), LIBSVM.SVMNode(1, 0.3176470696926117), LIBSVM.SVMNode(1, 0.250980406999588), LIBSVM.SVMNode(1, 0.3294117748737335), LIBSVM.SVMNode(1, 0.09019608050

In [29]:
svmtest_X = reshape(test_X,size(test_X)[2],size(test_X)[1])
predicted_labels, decision_values = LIBSVM.predict(model, svmtest_X)
accuracy() = mean((predicted_labels .== test_y))*100
accuracy()

90.0