### VGG and CIFAR-10

This template demonstrates how to:
1. Set the training hyperparameters
2. Read the CIFAR-10 dataset and create the training and test sets
3. Apply image augmentation
4. Use the pre-trained VGG19 model
5. Train, test, and save the model

In [1]:
using Flux, Metalhead, Statistics
using Flux: onehotbatch, onecold, logitcrossentropy, throttle, flatten
using Metalhead: trainimgs
using Parameters: @with_kw
using Images: channelview
using Statistics: mean
using Base.Iterators: partition
using CUDAapi

In [2]:
using CUDAapi, CUDAdrv, CUDAnative
gpu_id = 3  # set < 0 to disable CUDA, >= 0 to use a specific device (if available)

if has_cuda_gpu() && gpu_id >= 0
    device!(gpu_id)
    device = Flux.gpu
    @info "Training on GPU-$(gpu_id)"
else
    device = Flux.cpu
    @info "Training on CPU"
end

┌ Info: Training on GPU-3
└ @ Main In[2]:7


To make it easy to adjust hyperparameters and record experimental results, we use Parameters.jl (the @with_kw macro) to declare the parameters, together with their default values, in a single struct.

In [3]:
using Parameters: @with_kw
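@with_kw mutable struct Args
    batchsize::Int = 128      # number of images per mini-batch
    throttle::Int = 10        # minimum seconds between validation-loss callbacks
    lr::Float64 = 3e-4        # learning rate for ADAM
    epochs::Int = 10          # number of passes over the training set
    splitr_::Float64 = 0.1    # fraction of the data held out for validation
end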

Args
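
As a rough illustration of how keyword defaults and overrides behave, here is a dependency-free sketch. It assumes Base.@kwdef as a stand-in for Parameters.jl's @with_kw (the two are not identical, but both generate a keyword constructor with defaults), and DemoArgs is a hypothetical name that mirrors Args.

```julia
# Dependency-free sketch: Base.@kwdef stands in for Parameters.@with_kw here.
Base.@kwdef mutable struct DemoArgs   # hypothetical name, mirrors Args above
    batchsize::Int = 128
    lr::Float64 = 3e-4
    epochs::Int = 10
    splitr_::Float64 = 0.1
end

defaults = DemoArgs()                       # every field takes its default value
custom = DemoArgs(lr = 1e-3, epochs = 2)    # override selected fields by keyword
custom.batchsize = 64                       # mutable struct: fields can be reassigned

println(defaults.lr, " ", custom.lr, " ", custom.batchsize)
```

The train function in this notebook relies on the same mechanism: it forwards its keyword arguments through Args(; kws...), so any field not passed keeps its default.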

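The image preprocessing, including the optional augmentation, follows the PyTorch implementation. The active preprocess function below applies no augmentation; the augmented variant is kept commented out.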

In [4]:
# without augmentation
function preprocess(X)
    Float32.(permutedims(channelview(X), (2, 3, 1)))
end

# # with augmentation
# function resize_smallest_dimension(im, len)
#   reduction_factor = len/minimum(size(im)[1:2])
#   new_size = size(im)
#   new_size = (
#       round(Int, size(im,1)*reduction_factor),
#       round(Int, size(im,2)*reduction_factor),
#   )
#   if reduction_factor < 1.0
#     # Images.jl's imresize() needs to first lowpass the image, it won't do it for us
#     im = imfilter(im, KernelFactors.gaussian(0.75/reduction_factor), Inner())
#   end
#   return imresize(im, new_size)
# end

# # Take the len-by-len square of pixels at the center of image `im`
# function center_crop(im, len)
#   l2 = div(len,2)
#   adjust = len % 2 == 0 ? 1 : 0
#   return im[div(end,2)-l2:div(end,2)+l2-adjust,div(end,2)-l2:div(end,2)+l2-adjust]
# end

# function preprocess(im)
#   # Resize such that smallest edge is 256 pixels long
#   im = resize_smallest_dimension(im, 256)

#   # Center-crop to 224x224
#   im = center_crop(im, 224)

#   # Convert to channel view and normalize (these coefficients taken
#   # from PyTorch's ImageNet normalization code)
#   μ = [0.485, 0.456, 0.406]
#   # the sigma numbers are suspect: they cause the image to go outside of 0..1
#   # 1/0.225 = 4.4 effective scale
#   σ = [0.229, 0.224, 0.225]
#   #im = (channelview(im) .- μ)./σ
#   im = (channelview(im) .- μ)

#   # Convert from CHW (Image.jl's channel ordering) to WHCN (Flux.jl's ordering)
#   # and enforce Float32, as that seems important to Flux
#   # result is (224, 224, 3, 1)
#   #return Float32.(permutedims(im, (3, 2, 1))[:,:,:,:].*255)  # why
#   return Float32.(permutedims(im, (3, 2, 1))[:,:,:,:])
# end

preprocess (generic function with 1 method)
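
To make the array layout concrete, the sketch below reproduces the permutation used by the non-augmented preprocess on a plain random array. It assumes that channelview of a 32×32 RGB image yields a 3×32×32 array (channel, row, column); the variable names are illustrative only.

```julia
# Stand-in for channelview(img): a 3×32×32 array (channel, row, column).
chw = rand(Float64, 3, 32, 32)

# Same permutation as preprocess: move the channel axis last and convert to Float32.
hwc = Float32.(permutedims(chw, (2, 3, 1)))

# Stack two images along a new 4th (batch) axis, as cat(imgs..., dims = 4) does later.
batch = cat(hwc, hwc, dims = 4)

println(size(hwc), " ", size(batch))
```

Each image ends up as a 32×32×3 array, and a batch of N images as 32×32×3×N, which is the four-dimensional layout the convolution layers consume.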

Build the training, validation, and test sets.

In [5]:
using Metalhead: trainimgs
using Images, ImageMagick

function get_processed_data(args)
    # Fetch the training data and get it into the proper shape
    X = trainimgs(CIFAR10)
    imgs = [preprocess(X[i].img) for i in 1:40000]

    # One-hot encode the labels
    labels = onehotbatch([X[i].ground_truth.class for i in 1:40000], 1:10)

    # Number of training images; round avoids an InexactError when the product is not a whole number
    train_pop = round(Int, (1 - args.splitr_) * 40000)
    train = device.([(cat(imgs[i]..., dims = 4), labels[:,i]) for i in partition(1:train_pop, args.batchsize)])
    valset = collect(train_pop+1:40000)
    valX = cat(imgs[valset]..., dims = 4) |> device
    valY = labels[:, valset] |> device

    val = (valX,valY)
    return train, val
end

function get_test_data()
    # Fetch the test data from Metalhead and get it into the proper shape.
    # CIFAR-10 does not define a separate validation set, so valimgs
    # (rather than testimgs) is used to fetch the test data.
    test = valimgs(CIFAR10)

    testimgs = [preprocess(test[i].img) for i in 1:1000]
    testY = onehotbatch([test[i].ground_truth.class for i in 1:1000], 1:10) |> device
    testX = cat(testimgs..., dims = 4) |> device

    test = (testX,testY)
    return test
end

get_test_data (generic function with 1 method)
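
The split and batching arithmetic in get_processed_data can be checked on indices alone, without loading any images. The following is a small sketch under the notebook's default settings (40000 images, a 0.1 validation fraction, batch size 128); the variable names are illustrative.

```julia
using Base.Iterators: partition

n_total      = 40_000   # images taken from the training pool, as in get_processed_data
val_fraction = 0.1      # corresponds to Args.splitr_
batchsize    = 128

train_pop = round(Int, (1 - val_fraction) * n_total)    # number of training images
batches   = collect(partition(1:train_pop, batchsize))  # index ranges, one per mini-batch
val_idx   = train_pop+1:n_total                          # indices held out for validation

println(length(batches), " batches, last has ", length(last(batches)), " images")
println(length(val_idx), " validation images")
```

With these defaults there are 36000 training images in 282 mini-batches (the last one holds only 32 images) and 4000 validation images.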

Use the model architecture and pre-trained weights provided by Metalhead.

The download URLs for the pre-trained weights can be found in the Metalhead source code, for example on [github](https://github.com/FluxML/Metalhead.jl/blob/fd4687a0f91a188f099a43d6464000162b20aa60/src/utils.jl). Download the VGG19 weight file from the [vgg19](https://github.com/FluxML/Metalhead.jl/releases/download/Models/vgg19.bson) link given there and place it in the deps folder of the Metalhead package. On my machine that folder is "~/.juliapro/JuliaPro_v1.4.1-1/packages/Metalhead/RZn9O/deps".

In [6]:
using Metalhead

function VGG19()
    return Chain(
            Conv((3, 3), 3 => 64, relu, pad=(1, 1), stride=(1, 1)),
            BatchNorm(64),
            Conv((3, 3), 64 => 64, relu, pad=(1, 1), stride=(1, 1)),
            BatchNorm(64),
            MaxPool((2,2)),
            Conv((3, 3), 64 => 128, relu, pad=(1, 1), stride=(1, 1)),
            BatchNorm(128),
            Conv((3, 3), 128 => 128, relu, pad=(1, 1), stride=(1, 1)),
            BatchNorm(128),
            MaxPool((2,2)),
            Conv((3, 3), 128 => 256, relu, pad=(1, 1), stride=(1, 1)),
            BatchNorm(256),
            Conv((3, 3), 256 => 256, relu, pad=(1, 1), stride=(1, 1)),
            BatchNorm(256),
            Conv((3, 3), 256 => 256, relu, pad=(1, 1), stride=(1, 1)),
            BatchNorm(256),
            Conv((3, 3), 256 => 256, relu, pad=(1, 1), stride=(1, 1)),
            MaxPool((2,2)),
            Conv((3, 3), 256 => 512, relu, pad=(1, 1), stride=(1, 1)),
            BatchNorm(512),
            Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
            BatchNorm(512),
            Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
            BatchNorm(512),
            Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
            MaxPool((2,2)),
            Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
            BatchNorm(512),
            Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
            BatchNorm(512),
            Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
            BatchNorm(512),
            Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
            MaxPool((2,2)),
            flatten,
            Dense(512, 4096, relu),
            Dropout(0.5),
            Dense(4096, 4096, relu),
            Dropout(0.5),
            Dense(4096, 10))
end
model = VGG19() |> device

# # Finetune MetalHead VGG19 without augmentation
# vgg = VGG19()
# model = Chain(vgg.layers[1:end-6],
#               Dense(512, 4096, relu),
#               Dropout(0.5),
#               Dense(4096, 4096, relu),
#               Dropout(0.5),
#               Dense(4096, 10)) |> device

# # Finetune MetalHead VGG19 with augmentation, images are resized to 224*224
# vgg = VGG19()
# model = Chain(vgg.layers[1:end-2],
#               Dense(4096,10),
#               softmax) |> device

# # Finetune your trained models
# function vgg19()
#     ws = weights("vgg19.bson")
#     return Chain(
#         Conv(ws[:conv1_1_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv1_1_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         Conv(ws[:conv1_2_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv1_2_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         MaxPool((2,2)),
#         Conv(ws[:conv2_1_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv2_1_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         Conv(ws[:conv2_2_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv2_2_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         MaxPool((2,2)),
#         Conv(ws[:conv3_1_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv3_1_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         Conv(ws[:conv3_2_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv3_2_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         Conv(ws[:conv3_3_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv3_3_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         Conv(ws[:conv3_4_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv3_4_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         MaxPool((2,2)),
#         Conv(ws[:conv4_1_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv4_1_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         Conv(ws[:conv4_2_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv4_2_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         Conv(ws[:conv4_3_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv4_3_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         Conv(ws[:conv4_4_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv4_4_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         MaxPool((2,2)),
#         Conv(ws[:conv5_1_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv5_1_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         Conv(ws[:conv5_2_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv5_2_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         Conv(ws[:conv5_3_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv5_3_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         Conv(ws[:conv5_4_w_0][end:-1:1,:,:,:][:,end:-1:1,:,:], ws[:conv5_4_b_0], relu, pad = (1,1), stride = (1,1), dilation = (1,1)),
#         MaxPool((2,2)),
#         x -> reshape(x, :, size(x, 4)),
#         Dense(ws[:fc6_w_0]', ws[:fc6_b_0], relu),
#         Dropout(0.5f0),
#         Dense(ws[:fc7_w_0]', ws[:fc7_b_0], relu),
#         Dropout(0.5f0),
#         Dense(ws[:fc8_w_0]', ws[:fc8_b_0]),
#         softmax)
# end
# model = vgg19() |> device

Chain(Conv((3, 3), 3=>64, relu), BatchNorm(64), Conv((3, 3), 64=>64, relu), BatchNorm(64), MaxPool((2, 2), pad = (0, 0, 0, 0), stride = (2, 2)), Conv((3, 3), 64=>128, relu), BatchNorm(128), Conv((3, 3), 128=>128, relu), BatchNorm(128), MaxPool((2, 2), pad = (0, 0, 0, 0), stride = (2, 2)), Conv((3, 3), 128=>256, relu), BatchNorm(256), Conv((3, 3), 256=>256, relu), BatchNorm(256), Conv((3, 3), 256=>256, relu), BatchNorm(256), Conv((3, 3), 256=>256, relu), MaxPool((2, 2), pad = (0, 0, 0, 0), stride = (2, 2)), Conv((3, 3), 256=>512, relu), BatchNorm(512), Conv((3, 3), 512=>512, relu), BatchNorm(512), Conv((3, 3), 512=>512, relu), BatchNorm(512), Conv((3, 3), 512=>512, relu), MaxPool((2, 2), pad = (0, 0, 0, 0), stride = (2, 2)), Conv((3, 3), 512=>512, relu), BatchNorm(512), Conv((3, 3), 512=>512, relu), BatchNorm(512), Conv((3, 3), 512=>512, relu), BatchNorm(512), Conv((3, 3), 512=>512, relu), MaxPool((2, 2), pad = (0, 0, 0, 0), stride = (2, 2)), flatten, Dense(512, 4096, relu), Dropout(0.5
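
The input width of the first Dense layer (512) follows from the image size. The sketch below traces the spatial size through the five pooling stages, assuming that the padded 3×3 convolutions keep height and width unchanged and that each MaxPool((2,2)) halves them; spatial_sizes is a hypothetical helper, not part of Flux or Metalhead.

```julia
# Each 3×3 convolution with pad 1 and stride 1 keeps height and width unchanged;
# each MaxPool((2,2)) halves them (integer division).
function spatial_sizes(input_size::Int, n_pools::Int)
    sizes = [input_size]
    for _ in 1:n_pools
        push!(sizes, sizes[end] ÷ 2)
    end
    return sizes
end

sizes = spatial_sizes(32, 5)        # CIFAR-10 images are 32×32
features = sizes[end]^2 * 512       # 512 channels leave the last convolution block
println(sizes, " -> ", features, " features after flatten")
```

For 32×32 inputs the feature map shrinks to 1×1×512, hence Dense(512, 4096, relu). The same arithmetic for 224×224 inputs gives 7×7×512 = 25088 features, which is why the commented-out fine-tuning variant with resized images keeps the original dense layers.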

Train the model and fine-tune the parameters.

In [7]:
function train(model; kws...)
    # Initialize the hyperparameters
    args = Args(; kws...)
    
    # Load the train, validation data 
    train, val = get_processed_data(args)

    @info("Constructing Model")
    # Defining the loss and accuracy functions

    loss(x, y) = logitcrossentropy(model(x), y)

    ## Training
    # Defining the callback and the optimizer
    evalcb = throttle(() -> @show(loss(val...)), args.throttle)
    opt = ADAM(args.lr)
    @info("Training....")
    # Starting to train models
    Flux.@epochs args.epochs Flux.train!(loss, params(model), train, opt, cb=evalcb)
end

train (generic function with 1 method)
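
In train, the validation-loss callback is wrapped in throttle with a 10-second interval, so it does not run after every mini-batch. The following is a minimal sketch of that idea in plain Julia, not Flux's actual implementation; throttle_sketch is a hypothetical name.

```julia
# Sketch of a leading-edge throttle: run f at most once per `timeout` seconds.
function throttle_sketch(f, timeout::Real)
    last_run = -Inf                     # time of the last call that was let through
    return function (args...)
        t = time()
        if t - last_run >= timeout
            last_run = t
            return f(args...)
        end
        return nothing                  # call suppressed: still inside the timeout window
    end
end

calls = Ref(0)
report = throttle_sketch(() -> (calls[] += 1), 10)

for _ in 1:1000      # 1000 rapid calls, far less than 10 seconds apart
    report()
end
println(calls[])
```

Only the first call inside each window reaches the wrapped function. This is why the training log shows just four or five loss lines per epoch, even though the default settings produce 282 mini-batches per epoch.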

Please be patient: this step takes a few minutes while the dataset is downloaded and image augmentation runs.

In [8]:
train(model)

┌ Info: Constructing Model
└ @ Main In[7]:8
┌ Info: Training....
└ @ Main In[7]:17
┌ Info: Epoch 1
└ @ Main /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/optimise/train.jl:121


loss(val...) = 2.3033738f0
loss(val...) = 1.6381592f0
loss(val...) = 1.4765667f0
loss(val...) = 1.5272578f0
loss(val...) = 1.4042577f0


┌ Info: Epoch 2
└ @ Main /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/optimise/train.jl:121


loss(val...) = 1.2379736f0
loss(val...) = 1.1534756f0
loss(val...) = 1.1156354f0
loss(val...) = 1.0893316f0


┌ Info: Epoch 3
└ @ Main /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/optimise/train.jl:121


loss(val...) = 1.2024394f0
loss(val...) = 1.0005852f0
loss(val...) = 0.9206779f0
loss(val...) = 0.969336f0
loss(val...) = 0.984805f0


┌ Info: Epoch 4
└ @ Main /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/optimise/train.jl:121


loss(val...) = 0.96014816f0
loss(val...) = 0.9179143f0
loss(val...) = 0.9446917f0
loss(val...) = 1.1109371f0
loss(val...) = 0.909474f0


┌ Info: Epoch 5
└ @ Main /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/optimise/train.jl:121


loss(val...) = 0.8882381f0
loss(val...) = 0.8820702f0
loss(val...) = 0.8960527f0
loss(val...) = 0.94824517f0


┌ Info: Epoch 6
└ @ Main /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/optimise/train.jl:121


loss(val...) = 1.1659671f0
loss(val...) = 0.94695526f0
loss(val...) = 0.87784076f0
loss(val...) = 0.87166667f0
loss(val...) = 0.9482982f0


┌ Info: Epoch 7
└ @ Main /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/optimise/train.jl:121


loss(val...) = 0.8773484f0
loss(val...) = 1.0302075f0
loss(val...) = 0.838604f0
loss(val...) = 1.0184537f0
loss(val...) = 0.8309129f0


┌ Info: Epoch 8
└ @ Main /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/optimise/train.jl:121


loss(val...) = 0.8406193f0
loss(val...) = 0.91691136f0
loss(val...) = 1.0379539f0
loss(val...) = 0.9353888f0
loss(val...) = 0.9796433f0


┌ Info: Epoch 9
└ @ Main /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/optimise/train.jl:121


loss(val...) = 0.91448677f0
loss(val...) = 0.92203945f0
loss(val...) = 0.82181627f0
loss(val...) = 0.8147631f0


┌ Info: Epoch 10
└ @ Main /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/optimise/train.jl:121


loss(val...) = 0.9315647f0
loss(val...) = 0.88752526f0
loss(val...) = 0.8546379f0
loss(val...) = 0.90364754f0
loss(val...) = 0.8723239f0
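
The first validation loss in the log above (2.3033738) is close to log(10) ≈ 2.3026, which is what cross-entropy gives when all 10 classes are predicted as equally likely. The sketch below shows this with a plain-Julia version of the loss; it is an illustration of the formula under that assumption, not Flux's logitcrossentropy itself, and the function names are hypothetical.

```julia
# Numerically stable log-softmax over each column (one column per sample).
function logsoftmax_cols(logits::AbstractMatrix)
    shifted = logits .- maximum(logits, dims = 1)
    return shifted .- log.(sum(exp.(shifted), dims = 1))
end

# Mean over the batch of -sum(y .* log_softmax(logits)); y holds one-hot columns.
function logit_crossentropy_sketch(logits::AbstractMatrix, y::AbstractMatrix)
    return -sum(y .* logsoftmax_cols(logits)) / size(y, 2)
end

n_classes, n_samples = 10, 4
logits = zeros(Float32, n_classes, n_samples)   # all classes equally likely
y = zeros(Float32, n_classes, n_samples)
for (col, class) in enumerate([3, 1, 10, 7])     # arbitrary true classes
    y[class, col] = 1
end

println(logit_crossentropy_sketch(logits, y), " vs log(10) = ", log(10))
```

A loss near log(10) therefore indicates a model that has not yet learned anything; the later values below 1.0 show that the network assigns clearly more than uniform probability to the correct class.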


Evaluate the model's accuracy on the test set.

In [9]:
accuracy(x, y, m) = mean(onecold(cpu(m(x)), 1:10) .== onecold(cpu(y), 1:10))

accuracy (generic function with 1 method)
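
The accuracy function compares the class with the highest score in each column of the model output against the class encoded by the one-hot target. Here is a small plain-Julia sketch of the same computation on a toy example; predicted_classes and accuracy_sketch are hypothetical names standing in for onecold and accuracy.

```julia
# Index of the largest entry in each column, i.e. the predicted class per sample.
predicted_classes(scores::AbstractMatrix) = [argmax(col) for col in eachcol(scores)]

# Fraction of samples whose predicted class equals the one-hot target class.
function accuracy_sketch(scores::AbstractMatrix, y::AbstractMatrix)
    hits = predicted_classes(scores) .== predicted_classes(y)
    return sum(hits) / length(hits)
end

scores = [0.1 2.0 0.3;
          1.5 0.2 0.1;
          0.2 0.1 0.9]      # 3 classes × 3 samples
targets = [0 1 0;
           1 0 1;
           0 0 0]           # one-hot columns: true classes 2, 1, 2

println(predicted_classes(scores), " ", accuracy_sketch(scores, targets))
```

In this toy case the first two samples are classified correctly and the third is not, so the accuracy is 2/3.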

In [10]:
function test(model)
    test_data = get_test_data() |> device
    # Print the final accuracy
    @show(accuracy(test_data..., model))
end

test (generic function with 1 method)

In [11]:
test(model)

accuracy(test_data..., model) = 0.747


0.747

In [12]:
using Tracker
using BSON: @load, @save

pretrained = model |> cpu
weights = Tracker.data.(params(pretrained))
@save "weights.bson" weights

In [13]:
# # load weights (restores the variable saved by @save above)
# @load "weights.bson" weights
# Flux.loadparams!(model, weights)