Deprecated

# Fine Tuning

We are going to fine-tune an FCN (fully convolutional network) classifier to segment cells in histopathology scans. Our implementation is based on the FCN implementation.  


The fine-tuning will be addressed in four steps: preparing the data in the form of blobs, so that it is Caffe-compatible, followed by three rounds of model training. We will first train with a stride of 32, then take the parameters and plug them into a finer stride-16 model, and finally use the same protocol to reach an even finer stride-8 model.

We will use the pre-trained models available in the Caffe Model Zoo. We will remove the final layer of each model and fine-tune its replacement via backpropagation. 
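
To make the hand-off between stages concrete, here is a minimal sketch of name-based parameter transfer, which is how Caffe matches layers when it loads pre-trained weights. Plain dictionaries stand in for networks; `transfer_by_name` and the layer name `score_cells` are hypothetical and chosen only for illustration. A real run would load the weights through Caffe itself.

```python
def transfer_by_name(source, target):
    """Copy parameters from `source` into `target` for layers sharing a name.

    Both arguments map a layer name to a (shape, values) tuple.
    Returns the sorted names of target layers that received no weights.
    """
    untouched = []
    for name in list(target):
        shape, _ = target[name]
        if name not in source:
            # No pre-trained counterpart: the layer keeps its initial values.
            untouched.append(name)
            continue
        src_shape, src_values = source[name]
        if src_shape != shape:
            # A same-named layer with a different shape cannot be filled.
            raise ValueError("shape mismatch for layer %r: %r vs %r"
                             % (name, src_shape, shape))
        target[name] = (shape, list(src_values))
    return sorted(untouched)


# Toy stand-ins: a pre-trained 21-class model and a 2-class cell model.
pretrained = {"conv1_1": ((2,), [0.1, 0.2]), "score_fr": ((21,), [0.0] * 21)}
cell_net = {"conv1_1": ((2,), [0.0, 0.0]), "score_cells": ((2,), [0.0, 0.0])}
fresh = transfer_by_name(pretrained, cell_net)
```

In this toy setup the new final layer has a name that is absent from the pre-trained model, so it is the only one left at its initial values and is learned from scratch, while every matching layer starts from the pre-trained weights.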




# Preparing the data

In Caffe, data is provided to a model in the form of blobs. A blob is a 4-dimensional array whose axes are (num, channel, height, width). We will be providing cell segmentation data from histopathology slides. 
Caffe accepts several input formats: HDF5, LMDB, etc.
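
As an illustration of the blob layout, the sketch below reorders one image stored as height x width x channel (the usual layout when an image is read from disk) into the channel x height x width order of a single blob item. It uses nested Python lists to stay dependency-free; `hwc_to_chw` and `blob_shape` are hypothetical helpers, not part of Caffe or of `DataToLMDB`.

```python
def hwc_to_chw(image):
    """Reorder a nested-list image from [y][x][c] to [c][y][x]."""
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])
    return [[[image[y][x][c] for x in range(width)]
             for y in range(height)]
            for c in range(channels)]


def blob_shape(batch):
    """Return (num, channel, height, width) for a list of CHW items."""
    first = batch[0]
    return (len(batch), len(first), len(first[0]), len(first[0][0]))


# A 2 x 3 image whose pixels are [10 * y + x, 100, 200].
image = [[[10 * y + x, 100, 200] for x in range(3)] for y in range(2)]
item = hwc_to_chw(image)
shape = blob_shape([item, item])  # a batch of two identical items
```

Here `shape` is `(2, 3, 2, 3)`: two items, three channels, two rows, three columns.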
Let's create an LMDB file to feed the neural network:

In [3]:
from DataToLMDB import MakeLMDB
import ImageTransf as Transf

path = '/home/naylor/Bureau/ToAnnotate'
#path = "/data/users/pnaylor/Bureau/ToAnnotate"
wd = '/home/naylor/Documents/Python/PhD/PhD_Fabien/AssociatedNotebooks'

output_dir = path + '/lmdb'

enlarge = False  # create symmetry where the image would otherwise become black?

transform_list = [Transf.Identity(),
                  Transf.Rotation(45, enlarge=enlarge), 
                  Transf.Rotation(90, enlarge=enlarge),
                  Transf.Rotation(135, enlarge=enlarge),
                  Transf.Flip(0),
                  Transf.Flip(1),
                  Transf.OutOfFocus(5),
                  Transf.OutOfFocus(10),
                  Transf.ElasticDeformation(0, 30, num_points = 4),
                  Transf.ElasticDeformation(0, 30, num_points = 4)]

mean_ = MakeLMDB(path, output_dir, transform_list, val_num = 2, get_mean=True, verbose = False)


In [4]:
print mean_

[ 243.14176483  195.61314077  172.87945147]
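
The values printed above are later passed to the data layer as `mean_value`, together with a `scale` factor. Here is a minimal sketch of the per-channel arithmetic this implies, assuming the mean is a plain average over all pixels (how `MakeLMDB` actually computes it is not shown here) and that each value is transformed as (value - mean) * scale. `channel_means` and `normalize_pixel` are hypothetical names.

```python
def channel_means(images):
    """Average each channel over every pixel of every image.

    `images` is a list of images, each a list of rows, each row a list
    of pixels, each pixel a list of channel values.
    """
    sums = None
    count = 0
    for image in images:
        for row in image:
            for pixel in row:
                if sums is None:
                    sums = [0.0] * len(pixel)
                for c, value in enumerate(pixel):
                    sums[c] += value
                count += 1
    return [total / count for total in sums]


def normalize_pixel(pixel, means, scale=1 / 256.):
    """Subtract the channel mean, then multiply by the scale factor."""
    return [(value - mean) * scale for value, mean in zip(pixel, means)]


# One tiny 1 x 2 image with three channels (made-up values).
images = [[[[250, 200, 170], [240, 190, 180]]]]
means = channel_means(images)
normalized = normalize_pixel([250, 200, 170], means)
```

With these made-up pixels, `means` is `[245.0, 195.0, 175.0]`, and the normalized values are small numbers centred on zero.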


# Preparing the 32 stride model: Pascal VOC example

The architecture and pre-trained weights can be found on the FCN GitHub page. The architecture is based on a well-known ImageNet classification network; the layer names used in the code below (conv1_1 through conv5_3) follow the VGG-16 layout rather than AlexNet. The authors performed "net surgery" to transform the fully connected layers (the final layers of the classifier) into convolutional layers, which have the property of accepting inputs of any size. Once these layers were "convolutionalized", they added an upsampling (deconvolution) layer to improve the accuracy of the model: the classification network provides the "what" but does not initially provide the "where". In the finer-stride variants, bridges (skip connections) were also added to combine low-level features with higher-level ones.
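
The large `pad=100` on the first convolution in the code below, and the final crop, make more sense once the spatial sizes are traced through the network. The sketch below does that arithmetic for one spatial dimension, assuming the usual Caffe output-size rules (floor for convolution, ceil for pooling) and the layer settings used in this notebook. `fcn32_upscore_size` and its helpers are hypothetical names.

```python
def conv_out(size, ks, stride=1, pad=0):
    """Output size of a convolution (floor rounding)."""
    return (size + 2 * pad - ks) // stride + 1


def pool_out(size, ks=2, stride=2):
    """Output size of a pooling layer (ceil rounding, no padding)."""
    return -(-(size - ks) // stride) + 1


def deconv_out(size, ks, stride):
    """Output size of a deconvolution without padding."""
    return stride * (size - 1) + ks


def fcn32_upscore_size(size):
    """Size of the upsampled score map, before cropping, for an input size."""
    convs_per_block = [2, 2, 3, 3, 3]
    for block, n_convs in enumerate(convs_per_block):
        for i in range(n_convs):
            # Only the very first convolution uses pad=100.
            pad = 100 if (block == 0 and i == 0) else 1
            size = conv_out(size, ks=3, pad=pad)
        size = pool_out(size)
    size = conv_out(size, ks=7)  # fc6; fc7 and score_fr are 1x1 and keep the size
    return deconv_out(size, ks=64, stride=32)


upscore_500 = fcn32_upscore_size(500)
upscore_64 = fcn32_upscore_size(64)
```

Under these assumptions a 500-pixel side gives a 544-pixel score map, which the crop then trims back to 500. The extra padding is what keeps small inputs usable: without it, a 64-pixel side would shrink to 2 after five poolings, below the 7x7 kernel of `fc6`.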

In [5]:
import sys
import caffe
import os

caffe.set_device(0)
caffe.set_mode_gpu()

weights =  '/home/naylor/Documents/FCN/fcn.berkeleyvision.org/voc-fcn32s/fcn32s-heavy-pascal.caffemodel'
assert os.path.exists(weights)
solver = None


## Getting the architecture right

In [6]:
import caffe
from caffe import layers as L, params as P
from caffe.coord_map import crop

def conv_relu(bottom, nout, ks=3, stride=1, pad=1):
    conv = L.Convolution(bottom, kernel_size=ks, stride=stride,
        num_output=nout, pad=pad,
        param=[dict(lr_mult=1, decay_mult=1), dict(lr_mult=2, decay_mult=0)])
    return conv, L.ReLU(conv, in_place=True)

def max_pool(bottom, ks=2, stride=2):
    return L.Pooling(bottom, pool=P.Pooling.MAX, kernel_size=ks, stride=stride)

def fcn32(batch_size, lmdb, mean_, layer_name = ""):
    n = caffe.NetSpec()

    n.data, n.label = L.Data(source=lmdb, backend=P.Data.LMDB, batch_size=batch_size, ntop=2,
        transform_param=dict(mean_value=list(mean_), scale = 1/256., mirror=True))
    # the base net

    n.conv1_1, n.relu1_1 = conv_relu(n.data, 64, pad=100)
    n.conv1_2, n.relu1_2 = conv_relu(n.relu1_1, 64)
    n.pool1 = max_pool(n.relu1_2)

    n.conv2_1, n.relu2_1 = conv_relu(n.pool1, 128)
    n.conv2_2, n.relu2_2 = conv_relu(n.relu2_1, 128)
    n.pool2 = max_pool(n.relu2_2)

    n.conv3_1, n.relu3_1 = conv_relu(n.pool2, 256)
    n.conv3_2, n.relu3_2 = conv_relu(n.relu3_1, 256)
    n.conv3_3, n.relu3_3 = conv_relu(n.relu3_2, 256)
    n.pool3 = max_pool(n.relu3_3)

    n.conv4_1, n.relu4_1 = conv_relu(n.pool3, 512)
    n.conv4_2, n.relu4_2 = conv_relu(n.relu4_1, 512)
    n.conv4_3, n.relu4_3 = conv_relu(n.relu4_2, 512)
    n.pool4 = max_pool(n.relu4_3)

    n.conv5_1, n.relu5_1 = conv_relu(n.pool4, 512)
    n.conv5_2, n.relu5_2 = conv_relu(n.relu5_1, 512)
    n.conv5_3, n.relu5_3 = conv_relu(n.relu5_2, 512)
    n.pool5 = max_pool(n.relu5_3)

    # fully conv
    n.fc6, n.relu6 = conv_relu(n.pool5, 4096, ks=7, pad=0)
    
    n.drop6 = L.Dropout(n.relu6, dropout_ratio=0.5, in_place=True)
    
    n.fc7, n.relu7 = conv_relu(n.drop6, 4096, ks=1, pad=0)
    
    n.drop7 = L.Dropout(n.relu7, dropout_ratio=0.5, in_place=True)
    
    n.score_fr = L.Convolution(n.drop7, num_output=2, kernel_size=1, pad=0,
        param=[dict(lr_mult=1, decay_mult=1), dict(lr_mult=2, decay_mult=0)])
    
    n.upscore = L.Deconvolution(n.score_fr,
        convolution_param=dict(num_output=2, kernel_size=64, stride=32,
            bias_term=False),
        param=[dict(lr_mult=0)])
    
    n.score = crop(n.upscore, n.data)
    
    n.loss = L.SoftmaxWithLoss(n.score, n.label,
            loss_param=dict(normalize=False, ignore_label=255))

    return n.to_proto()

def make_net(output_dir):
    if not os.path.isdir('./fcn32'):
        os.mkdir('./fcn32')
    with open('fcn32/train.prototxt', 'w') as f:
        f.write(str(fcn32(32, output_dir, mean_)))


if __name__ == '__main__':
    make_net(output_dir)
    # load the solver
    #solver = caffe.SGDSolver('fcn32/train.prototxt')

## Writing the solver
This function writes a file that holds all the parameters for training.

In [4]:

from caffe.proto import caffe_pb2

def solver(train_net_path, test_net_path=None, base_lr=0.001, out_snap="./temp_snapshot"):
    s = caffe_pb2.SolverParameter()
    s.train_net = train_net_path
    if test_net_path is not None:
        s.test_net.append(test_net_path)
        s.test_interval = 1000  # Test after every 1000 training iterations.
        s.test_iter.append(100) # Test on 100 batches each time we test.

    # The number of iterations over which to average the gradient.
    # Effectively boosts the training batch size by the given factor, without
    # affecting memory utilization.
    s.iter_size = 1
    
    s.max_iter = 100000     # # of times to update the net (training iterations)
    
    # Solve using the stochastic gradient descent (SGD) algorithm.
    # Other choices include 'Adam' and 'RMSProp'.
    s.type = 'SGD'

    # Set the initial learning rate for SGD.
    s.base_lr = base_lr

    # Set `lr_policy` to define how the learning rate changes during training.
    # Here, we 'step' the learning rate by multiplying it by a factor `gamma`
    # every `stepsize` iterations.
    s.lr_policy = 'step'
    s.gamma = 0.1
    s.stepsize = 20000

    # Set other SGD hyperparameters. Setting a non-zero `momentum` takes a
    # weighted average of the current gradient and previous gradients to make
    # learning more stable. L2 weight decay regularizes learning, to help prevent
    # the model from overfitting.
    s.momentum = 0.9
    s.weight_decay = 5e-4

    # Display the current training loss and accuracy every 1000 iterations.
    s.display = 1000

    # Snapshots are files used to store networks we've trained.  Here, we'll
    # snapshot every 10K iterations -- ten times during training.
    s.snapshot = 10000
    if not os.path.isdir(out_snap):
        os.mkdir(out_snap)
    s.snapshot_prefix = out_snap
    
    # Train on the GPU.  Using the CPU to train large networks is very slow.
    s.solver_mode = caffe_pb2.SolverParameter.GPU
    
    # Write the solver to its own file, separate from the network definition
    # in 'fcn32/train.prototxt', and return its filename.
    with open('fcn32/solver.prototxt', 'w') as f:
        f.write(str(s))
    return f.name
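
To see what the `step` policy configured above does to the learning rate, here is a small sketch assuming the rule base_lr * gamma^floor(iter / stepsize), evaluated with the values set in this solver (base_lr = 0.001, gamma = 0.1, stepsize = 20000). `step_learning_rate` is a hypothetical helper, not a Caffe function.

```python
def step_learning_rate(iteration, base_lr=0.001, gamma=0.1, stepsize=20000):
    """Learning rate at a given iteration under a 'step' policy."""
    return base_lr * gamma ** (iteration // stepsize)


# Learning rate at the start of each 20000-iteration stage.
schedule = [(it, step_learning_rate(it)) for it in range(0, 100000, 20000)]
```

Over the 100000 iterations configured here, the rate is divided by ten four times, so training ends with a learning rate of roughly 1e-7.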
        


