# CIFAR-100 conv net with Caffe for DVIA

Based on guillaume-chevalier's implementation of NiN for the CIFAR-100 image set. This variation uses the same dataset scaled _up_ to 48x48, which matches the full resolution of the DVIA images.

It is based on the NIN (Network In Network) architecture detailed in this paper: http://arxiv.org/pdf/1312.4400v3.pdf. 

See https://github.com/guillaume-chevalier/python-caffe-custom-cifar-100-conv-net for the original implementation.

## Convert the cifar-100 dataset to Caffe's HDF5 format
This step converts the previously downloaded CIFAR-100 dataset to a 48x48 HDF5 database.

In [1]:
%%time

!ipython convert-cifar-100.ipy

Converting...
Conversion was already done. Did not convert twice.

CPU times: user 64 ms, sys: 16 ms, total: 80 ms
Wall time: 5.02 s
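
The conversion script itself is not reproduced in this notebook. As a rough sketch of what the upscaling step involves, the snippet below assumes the standard CIFAR-100 python layout (one 3072-value `uint8` row per image, stored as red, green, then blue planes) and nearest-neighbour interpolation, which may differ from what `convert-cifar-100.ipy` actually does. The function name `upscale_cifar_rows` and the random stand-in data are hypothetical.

```python
import numpy as np

def upscale_cifar_rows(rows, out_size=48):
    """Turn CIFAR rows (N x 3072, uint8) into N x 3 x out_size x out_size float32."""
    # CIFAR stores each image as 1024 red, 1024 green, then 1024 blue values.
    images = np.asarray(rows).reshape(-1, 3, 32, 32)
    # Nearest-neighbour lookup (an assumption): output pixel i reads
    # input pixel floor(i * 32 / out_size).
    idx = (np.arange(out_size) * 32) // out_size
    upscaled = images[:, :, idx[:, None], idx[None, :]]
    return upscaled.astype(np.float32)

# Synthetic stand-in for two CIFAR images (not real data).
fake_rows = np.random.randint(0, 256, size=(2, 3072), dtype=np.uint8)
batch = upscale_cifar_rows(fake_rows)
print(batch.shape)
```

Caffe's `HDF5Data` layer looks up datasets by top name, so an HDF5 file built from such arrays would need datasets named `data`, `label_coarse` and `label_fine` to match the net defined in this notebook.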


## Build the model with Caffe

In [1]:
import numpy as np
import os, sys

scriptpath = os.path.dirname(os.path.realpath( "__file__" ))
caffe_root  = os.path.sep.join(scriptpath.split(os.path.sep)[:-2])
#caffe_root = os.path.join(os.environ['HOME'], 'Projects', 'dvcaffe')

import caffe
from caffe import layers as L
from caffe import params as P


print("caffe_root = {}".format(caffe_root))


caffe_root = /home/maheriya/Projects/dvcaffe


In [4]:
def cnn(hdf5, batch_size):
    n = caffe.NetSpec()
    # Input HDF5 data layer (tops: image data, coarse label, fine label)
    n.data, n.label_coarse, n.label_fine = L.HDF5Data(batch_size=batch_size, source=hdf5, ntop=3)

    n.conv1 = L.Convolution(n.data, kernel_size=5, stride=2, num_output=64, weight_filler=dict(type='xavier'))
    n.cccp1a = L.Convolution(n.conv1, kernel_size=1, num_output=42, weight_filler=dict(type='xavier'))
    n.relu1a = L.ReLU(n.cccp1a, in_place=True)
    n.cccp1b = L.Convolution(n.relu1a, kernel_size=1, num_output=32, weight_filler=dict(type='xavier'))
    n.pool1 = L.Pooling(n.cccp1b, kernel_size=3, stride=2, pool=P.Pooling.MAX)
    n.drop1 = L.Dropout(n.pool1, in_place=True)
    n.relu1b = L.ReLU(n.drop1, in_place=True)
    
    n.conv2 = L.Convolution(n.relu1b, kernel_size=3, num_output=64, weight_filler=dict(type='xavier'))
    n.pool2 = L.Pooling(n.conv2, kernel_size=3, stride=2, pool=P.Pooling.MAX)
    n.drop2 = L.Dropout(n.pool2, in_place=True)
    n.relu2 = L.ReLU(n.drop2, in_place=True)
    
    n.conv3 = L.Convolution(n.relu2, kernel_size=3, num_output=96, weight_filler=dict(type='xavier'))
    n.pool3 = L.Pooling(n.conv3, kernel_size=2, stride=2, pool=P.Pooling.AVE)
    n.relu3 = L.ReLU(n.pool3, in_place=True)
    
    n.fc1 = L.InnerProduct(n.relu3, num_output=500, weight_filler=dict(type='xavier'))
    n.relu4 = L.ReLU(n.fc1, in_place=True)
    
    # Output: 20-class and 100-class classifiers
    n.fc_coarse       = L.InnerProduct(n.relu4, num_output=20, weight_filler=dict(type='xavier'))
    n.accuracy_coarse = L.Accuracy(n.fc_coarse, n.label_coarse)
    n.loss_coarse     = L.SoftmaxWithLoss(n.fc_coarse, n.label_coarse)
    
    n.fc_fine       = L.InnerProduct(n.relu4, num_output=100, weight_filler=dict(type='xavier'))
    n.accuracy_fine = L.Accuracy(n.fc_fine, n.label_fine)
    n.loss_f        = L.SoftmaxWithLoss(n.fc_fine, n.label_fine)
    
    return n.to_proto()
    
with open('dvia_pretrain.prototxt', 'w') as f:
    f.write(str(cnn('cifar_100_caffe_hdf5/train.txt', 100)))
    
with open('dvia_pretest.prototxt', 'w') as f:
    f.write(str(cnn('cifar_100_caffe_hdf5/test.txt', 120)))

!python /usr/local/caffe/python/draw_net.py dvia_pretrain.prototxt cifar_nin.png

Drawing net to cifar_nin.png
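
The net defined above ends in two classifiers that share `fc1`, each with its own `SoftmaxWithLoss`. Caffe gives every loss layer a default weight of 1, so the objective being minimised is the sum of the two losses. The sketch below is plain Python rather than Caffe code, and the all-zero scores are a hypothetical stand-in for an untrained net that predicts every class with equal probability.

```python
import math

def softmax_loss(scores, label):
    """Cross-entropy of a softmax over `scores` against the integer `label`."""
    peak = max(scores)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_sum - scores[label]

# Hypothetical scores for one image: 20 coarse and 100 fine logits, all equal.
coarse_scores = [0.0] * 20
fine_scores = [0.0] * 100

loss_coarse = softmax_loss(coarse_scores, label=3)
loss_fine = softmax_loss(fine_scores, label=42)
total_loss = loss_coarse + loss_fine  # both loss tops carry weight 1
print(loss_coarse, loss_fine, total_loss)
```

At chance level the two terms are ln 20 (about 3.00) and ln 100 (about 4.61), which gives a useful sanity check for the first few lines of a training log.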


## Load and visualise the untrained network's internal structure and shape
Caffe's network graph visualisation tool is broken in the current release, so we simply print the blob shapes here.

In [5]:
caffe.set_mode_gpu()
solver = caffe.get_solver('dvia_presolver.prototxt')

In [6]:
print("Layers' features:")
[(k, v.data.shape) for k, v in solver.net.blobs.items()]

Layers' features:


[('data', (100, 3, 48, 48)),
 ('label_coarse', (100,)),
 ('label_fine', (100,)),
 ('label_coarse_data_1_split_0', (100,)),
 ('label_coarse_data_1_split_1', (100,)),
 ('label_fine_data_2_split_0', (100,)),
 ('label_fine_data_2_split_1', (100,)),
 ('conv1', (100, 64, 22, 22)),
 ('cccp1a', (100, 42, 22, 22)),
 ('cccp1b', (100, 32, 22, 22)),
 ('pool1', (100, 32, 11, 11)),
 ('conv2', (100, 64, 9, 9)),
 ('pool2', (100, 64, 4, 4)),
 ('conv3', (100, 96, 2, 2)),
 ('pool3', (100, 96, 1, 1)),
 ('fc1', (100, 500)),
 ('fc1_relu4_0_split_0', (100, 500)),
 ('fc1_relu4_0_split_1', (100, 500)),
 ('fc_coarse', (100, 20)),
 ('fc_coarse_fc_coarse_0_split_0', (100, 20)),
 ('fc_coarse_fc_coarse_0_split_1', (100, 20)),
 ('accuracy_coarse', ()),
 ('loss_coarse', ()),
 ('fc_fine', (100, 100)),
 ('fc_fine_fc_fine_0_split_0', (100, 100)),
 ('fc_fine_fc_fine_0_split_1', (100, 100)),
 ('accuracy_fine', ()),
 ('loss_f', ())]
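
The spatial sizes in the listing above can be checked by hand. The sketch below assumes Caffe's usual rounding rules (convolution rounds down, pooling rounds up) and zero padding everywhere, which is what the net definition uses by default; the helper names are illustrative.

```python
import math

def conv_out(size, kernel, stride=1, pad=0):
    # Caffe convolution rounds the division down.
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride=1, pad=0):
    # Caffe pooling rounds the division up (pad=0 assumed, so no border clipping).
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1

size = 48  # input images are 48x48
trace = []
for name, fn, kernel, stride in [
    ("conv1", conv_out, 5, 2),
    ("cccp1a", conv_out, 1, 1),
    ("cccp1b", conv_out, 1, 1),
    ("pool1", pool_out, 3, 2),
    ("conv2", conv_out, 3, 1),
    ("pool2", pool_out, 3, 2),
    ("conv3", conv_out, 3, 1),
    ("pool3", pool_out, 2, 2),
]:
    size = fn(size, kernel, stride)
    trace.append((name, size))

print(trace)
```

The resulting sizes (22, 22, 22, 11, 9, 4, 2, 1) match the height and width of the corresponding blobs printed by the solver.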

In [7]:
print("Parameters and shape:")
[(k, v[0].data.shape) for k, v in solver.net.params.items()]

Parameters and shape:


[('conv1', (64, 3, 5, 5)),
 ('cccp1a', (42, 64, 1, 1)),
 ('cccp1b', (32, 42, 1, 1)),
 ('conv2', (64, 32, 3, 3)),
 ('conv3', (96, 64, 3, 3)),
 ('fc1', (500, 96)),
 ('fc_coarse', (20, 500)),
 ('fc_fine', (100, 500))]
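
For a sense of the model's size, the weight shapes listed above can be multiplied out. This sketch assumes every layer also carries one bias per output, which is Caffe's default for `Convolution` and `InnerProduct` layers.

```python
# Weight shapes copied from the parameter listing above.
weight_shapes = [
    ("conv1", (64, 3, 5, 5)),
    ("cccp1a", (42, 64, 1, 1)),
    ("cccp1b", (32, 42, 1, 1)),
    ("conv2", (64, 32, 3, 3)),
    ("conv3", (96, 64, 3, 3)),
    ("fc1", (500, 96)),
    ("fc_coarse", (20, 500)),
    ("fc_fine", (100, 500)),
]

def count(shape):
    """Number of elements in a blob of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total

total_weights = sum(count(shape) for _, shape in weight_shapes)
total_biases = sum(shape[0] for _, shape in weight_shapes)  # one bias per output
print(total_weights, total_biases, total_weights + total_biases)
```

Under that assumption the net has 190560 weights and 918 biases, so roughly 191k learnable parameters, with `conv3`, `fc1` and `fc_fine` accounting for most of them.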

## Solver's params

The solver's params for the created net are defined in a `.prototxt` file. 

Notice that because `max_iter: 150000` and we train on minibatches of 100 (as defined above when creating the net), the training will loop over the 50000 training images `150000*100/50000 = 300` times, i.e. 300 epochs of pre-shuffled 100-image minibatches.

We will test the net on `test_iter: 100` test minibatches (120 images each, as set when creating the test net) every `test_interval: 1000` training iterations.
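
The schedule arithmetic can be worked through with the values from `dvia_presolver.prototxt` and the batch sizes passed to `cnn()`. The 50000-image training set size is taken from the text above; the variable names are only illustrative.

```python
# Values from the solver file and the net definitions in this notebook.
max_iter = 150000
train_batch = 100
train_images = 50000  # training set size as stated in the text
test_iter = 100
test_batch = 120
test_interval = 1000

# Each iteration consumes one minibatch.
epochs = max_iter * train_batch / float(train_images)
# Each test pass runs test_iter forward passes over the test net.
images_per_test = test_iter * test_batch
# Scheduled test passes, not counting any extra pass at the start or end.
test_passes = max_iter // test_interval

print(epochs, images_per_test, test_passes)
```

With these values the run covers 300 epochs, and each of the 150 scheduled test passes scores 12000 images.
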
____

Here, **RMSProp** is used. It is SGD-based, and it typically converges faster than plain SGD while remaining robust.
____
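
To make the RMSProp idea concrete, here is a minimal single-parameter sketch of the textbook update: keep a running mean of squared gradients and divide each step by its square root. The learning rate and `rms_decay` are taken from the solver file in this notebook; the small `delta` and the toy objective `f(w) = w^2` are assumptions for illustration, and weight decay is left out.

```python
import math

def rmsprop_step(weight, grad, mean_square, lr=0.0006, rms_decay=0.98, delta=1e-8):
    """One RMSProp update for a single scalar parameter."""
    # Running average of squared gradients.
    mean_square = rms_decay * mean_square + (1.0 - rms_decay) * grad * grad
    # Scale the step by the root of that average (delta avoids division by zero).
    weight = weight - lr * grad / (math.sqrt(mean_square) + delta)
    return weight, mean_square

# A single step from w = 1 on the toy objective f(w) = w^2 (gradient 2w).
w1, ms1 = rmsprop_step(1.0, 2.0, 0.0)

# Many steps on the same objective drive w towards the minimum at 0.
w, ms = 1.0, 0.0
for _ in range(5000):
    w, ms = rmsprop_step(w, 2.0 * w, ms)
print(w1, ms1, w)
```

Because the step is normalised by the recent gradient magnitude, its size stays close to the learning rate regardless of how large or small the raw gradients are.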

In [8]:
!cat dvia_presolver.prototxt

train_net: "dvia_pretrain.prototxt"
test_net: "dvia_pretest.prototxt"

test_iter: 100
test_interval: 1000

base_lr: 0.0006
momentum: 0.0
weight_decay: 0.001

lr_policy: "inv"
gamma: 0.0001
power: 0.75

display: 100

max_iter: 150000

snapshot: 50000
snapshot_prefix: "cifar_pretrain"
solver_mode: GPU

type: "RMSProp"
rms_decay: 0.98
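
With `lr_policy: "inv"`, Caffe computes the learning rate at each iteration as `base_lr * (1 + gamma * iter) ^ (-power)`. The small sketch below evaluates that formula with the values from this solver file; the function name `inv_lr` is just for illustration.

```python
def inv_lr(iteration, base_lr=0.0006, gamma=0.0001, power=0.75):
    """Learning rate under Caffe's "inv" policy."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)

# Sample the schedule at a few points of the 150000-iteration run.
for it in (0, 10000, 50000, 150000):
    print(it, inv_lr(it))
```

By the last iteration the factor is `16 ^ (-0.75) = 1/8`, so the rate has decayed from 0.0006 to about 0.000075.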


## Alternative way to train directly in Python
Since a recent Caffe update, training from Python prints no output by default, which makes debugging harder.
Skip this cell and train with the second method shown below if needed. The call is commented out in case you run through the cells by chaining `shift+enter`.

In [9]:
# %%time
# solver.solve()
solver = None
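
One way to get some feedback while training from Python is to step the solver manually and read the loss blobs, instead of calling `solver.solve()`. The sketch below relies only on `solver.step()` and `solver.net.blobs`, which pycaffe provides; the helper name `train_with_logging` and its defaults are hypothetical, and the blob names are the ones defined in this notebook's net.

```python
def train_with_logging(solver, max_iter, display=100,
                       loss_blobs=("loss_coarse", "loss_f")):
    """Step the solver one iteration at a time, printing losses every `display` steps."""
    history = []
    for it in range(1, max_iter + 1):
        solver.step(1)  # one forward/backward pass plus a weight update
        if it % display == 0:
            # Scalar loss blobs expose their current value through .data
            losses = {name: float(solver.net.blobs[name].data)
                      for name in loss_blobs}
            history.append((it, losses))
            print("iter {}: {}".format(it, losses))
    return history

# Usage inside the notebook (assumes `caffe` is importable and the solver file exists):
#   solver = caffe.get_solver('dvia_presolver.prototxt')
#   history = train_with_logging(solver, max_iter=1000)
```

Stepping one iteration at a time is slower than a single `solve()` call, so this is better suited to short debugging runs than to the full training.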

## Train by calling caffe in command line
Just set the parameters correctly, and make sure the notebook sits at the root of the IPython notebook server.
You can also run this command in an external terminal opened in the notebook's directory.

It is also possible to fine-tune an existing net with a different solver or different data. I do that here because I feel the net could fit the data better.

In [10]:
%%time
!caffe train -solver dvia_presolver.prototxt

I0514 19:58:51.508553  4667 caffe.cpp:185] Using GPUs 0
I0514 19:58:51.530227  4667 caffe.cpp:190] GPU 0: GeForce GTX 470
I0514 19:58:51.689117  4667 solver.cpp:48] Initializing solver from parameters: 
train_net: "dvia_pretrain.prototxt"
test_net: "dvia_pretest.prototxt"
test_iter: 100
test_interval: 1000
base_lr: 0.0006
display: 100
max_iter: 150000
lr_policy: "inv"
gamma: 0.0001
power: 0.75
momentum: 0
weight_decay: 0.001
snapshot: 50000
snapshot_prefix: "cifar_pretrain"
solver_mode: GPU
device_id: 0
rms_decay: 0.98
type: "RMSProp"
I0514 19:58:51.689388  4667 solver.cpp:81] Creating training net from train_net file: dvia_pretrain.prototxt
I0514 19:58:51.690402  4667 net.cpp:49] Initializing net from parameters: 
state {
  phase: TRAIN
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label_coarse"
  top: "label_fine"
  hdf5_data_param {
    source: "cifar_100_caffe_hdf5/train.txt"
    batch_size: 100
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data

Caffe brewed.

## Test the model completely on test data
Let's test directly from the command line:

In [3]:
%%time
!caffe test -model dvia_pretest.prototxt -weights cifar_pretrain_iter_150000.caffemodel -iterations 100

I0512 13:13:28.350412 26823 caffe.cpp:246] Use CPU.
I0512 13:13:28.551002 26823 net.cpp:49] Initializing net from parameters: 
state {
  phase: TEST
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "/home/maheriya/Projects/dvcaffe/data/dvia_cifar/dvia_val_NP_lmdb"
    batch_size: 120
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 64
    kernel_size: 4
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "cccp1a"
  type: "Convolution"
  bottom: "conv1"
  top: "cccp1a"
  convolution_param {
    num_output: 42
    kernel_size: 1
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "relu1a"
  type: "ReLU"
  bottom: "cccp1a"
  top: "cccp1a"
}
layer {
  name: "cccp1b"
  type: "Convolution"
  bottom: "cccp1a"
  top: "cccp1b"
  convolution_param {
    num_output: 32
    kernel_si

## The model achieved about 87.91% accuracy
The result above comes purely from the test/validation database, which is not used for training.

In [13]:
!jupyter nbconvert --to markdown dvia-pretrain.ipynb 

[NbConvertApp] Converting notebook custom-cifar-100.ipynb to markdown
[NbConvertApp] Writing 731885 bytes to custom-cifar-100.md
