# NET TRAINING

A series of experiments, using the images and their corresponding labels and annotations, is performed to select the best parameters for training the final model. For reference, the supplementary material of the manuscript includes a table listing the parameters used in this work.

`region` is the name of the independent dataset on which to train the network, stored as a folder inside `basedir`. This example uses the `Central Pacific Ocean` region from the provided data; the code cell below sets `region` to the folder name `hawaii`.

In [1]:
##SETUP WORKSPACE 
import os
import reef_learning.experiments.catlin_caffe_experiments as cce
import reef_learning.deeplearning_wrappers.catlin_classify as cc
import os.path as osp
import warnings
warnings.filterwarnings('ignore')
import reef_learning.deeplearning_wrappers.catlin_caffe_tools as cct
import reef_learning.toolbox.plot_log as pl
import reef_learning.deeplearning_wrappers.catlin_tools as ct
import matplotlib.pyplot as plt
import glob

##SET PATHS 

region='hawaii' #Folder that contains the data for training the net.
basedir='/media/data_caffe' #directory where the data is stored

##Read the labelset and check that it matches your labelset
with open(osp.join(basedir,region,'label_structure.csv')) as f:
    lines = [line.rstrip() for line in f][1:] #skip the first line (assumed to be the header)
labelset = [line.split(',')[1] for line in lines] #labels are read from the second column
print('Number of labels = '+str(len(labelset)))

Number of labels = 40


## Model parameter optimisation (Experiment sweeper)

A series of experiments, using the images and their corresponding labels and annotations, is performed to select the best parameters for training the final model. These parameters include the learning rate (`lrate`) and the receptive field calibration (`method`).

**Learning rate** is a hyper-parameter that controls how much the weights of the network are adjusted with respect to the loss gradient. The lower the value, the slower we travel along the downward slope. A low learning rate helps ensure that we do not miss any local minima, but it can also mean that the network takes a long time to converge, especially if it gets stuck on a plateau region. Here we provide a function ('set_experiment') that evaluates a vector of learning rate values to find the best compromise between accuracy and processing time.
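The trade-off described above can be illustrated with a toy problem. The sketch below is not part of `reef_learning`: it minimises the hypothetical loss f(w) = w**2 with plain gradient descent and shows how far each learning rate gets in a fixed number of steps. The function name `gradient_descent` and the starting point are assumptions chosen for illustration only.

```python
def gradient_descent(lrate, steps=50, w0=5.0):
    """Minimise the toy loss f(w) = w**2 and return the final weight."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w        # derivative of w**2 with respect to w
        w = w - lrate * grad  # step against the gradient, scaled by lrate
    return w

# Smaller learning rates end further from the minimum (w = 0)
# after the same number of steps.
for lrate in (0.001, 0.01, 0.1):
    print('lrate=%g -> w after 50 steps = %.4f' % (lrate, gradient_descent(lrate)))
```

On this convex toy loss a larger rate simply converges faster; on a real network, a rate that is too large can also overshoot and destabilise training, which is why the sweep compares several values.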

**Receptive field** is defined as the region in the input space that a particular CNN feature looks at (i.e. is affected by). When dealing with high-dimensional inputs such as images, it is impractical to connect neurons to all neurons in the previous volume. Instead, we connect each neuron to only a local region of the input volume. The spatial extent of this connectivity is a hyper-parameter called the receptive field of the neuron (equivalently, the filter size). In the VGG-16 architecture, the input area is predefined as 224x224 pixels, so the receptive field cannot be altered directly. To work around this, we vary the size of each image before cropping the patches used for classification and evaluate the impact on the overall classification accuracy. Two methods are available to alter the size: "scale", a factor by which the image is increased or decreased without changing the pixel/cm ratio, and "ratio", which changes the pixel/cm ratio by interpolation. The latter can be changed given the ratio of the original image, in this case 10 pixels/cm.
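To make the effect of resizing on a fixed 224x224 patch concrete, here is a minimal sketch under one plausible reading of the two methods: "scale" multiplies the image size by the given factor, while "ratio" resizes the image so that it reaches a target pixel/cm value, starting from the 10 pixels/cm of the original. The helper names `resize_multiplier` and `patch_footprint_cm` are hypothetical and this is not the `reef_learning` implementation.

```python
PATCH = 224            # VGG-16 input size in pixels (from the text)
ORIGINAL_RATIO = 10.0  # pixels per cm of the original images (from the text)

def resize_multiplier(method, value, original_ratio=ORIGINAL_RATIO):
    """Return the factor by which the image side length is multiplied."""
    if method == 'scale':
        return float(value)                   # value is the factor itself
    if method == 'ratio':
        return float(value) / original_ratio  # value is the target pixels/cm
    raise ValueError('unknown method: %r' % method)

def patch_footprint_cm(method, value):
    """Side length, in cm of the original scene, covered by one patch
    cropped from the resized image."""
    multiplier = resize_multiplier(method, value)
    return PATCH / multiplier / ORIGINAL_RATIO

for method, value in (('scale', 1.0), ('scale', 2.0), ('ratio', 5.0)):
    print('%s=%s -> patch covers %.1f cm' % (method, value, patch_footprint_cm(method, value)))
```

Under these assumptions, enlarging the image makes each fixed-size patch cover a smaller part of the scene, and shrinking it makes the patch cover more.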

This experiment sweeper trains multiple models using combinations of the learning rates and methods provided, and produces one folder per trained net. Each folder name starts with a set prefix, defined by 'experiment_type'.
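As a rough sketch of what such a sweep covers, the snippet below enumerates every factor and learning rate pair and builds a folder name for each. The naming pattern (prefix, method, factor, learning rate joined by underscores) is an assumption inferred from the single log line in the output of the cell below (`ScaleLr_sweeper_scale_1.0_0.001`); the helper `sweep_names` is hypothetical, and the order in which the library actually runs the experiments is not known from this notebook.

```python
import itertools

def sweep_names(etype, method, factors, lrates):
    """List one folder name per (factor, learning rate) combination.
    The name pattern is an assumption inferred from the log output."""
    return ['_'.join([etype, method, factor, lrate])
            for factor, lrate in itertools.product(factors, lrates)]

names = sweep_names('ScaleLr_sweeper', 'scale', ['1.0', '2.0'], ['0.01', '0.001'])
for name in names:
    print(name)
```

With two factors and two learning rates the sweep covers four models, which matters for planning given the running time of each experiment.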

    Note: A table in the supplementary material of the manuscript contains the parameters used for the results in this publication. Each experiment can take a day or two, depending on the hardware, and uses all the resources of the instance. Therefore, only one experiment can be run at a time.

In [None]:
#Learning Rate
lrate=['0.01','0.001']
#Experiment name (this is the prefix of the folder that will be created, containing the trained net and its predictions on a small subset of the training images, here defined as the validation set)
experiment_type='ScaleLr_sweeper'
#Desired method to explore the importance of receptive field in the classification performance (i.e., scale or ratio)
method='scale'
# Multiplying factor used to modify the size of the image to evaluate the importance of the receptive field 
factor=['1.0','2.0']
#Number of cycles 
c=30
#Number of iterations per cycle
cs=1000
       
#Run experiment sweeper
cce.mix_experiment_wrapper(basedir,
                           region,
                           method = method, 
                           factors =  factor,
                           etype=experiment_type,
                           lrates=lrate, 
                           cycles=c, 
                           cyclesize=cs)

[ 126.07176518  127.70577545  140.36791122]
[ 113.68123413  123.70726926  144.58570591]
[ 126.07176518  127.70577545  140.36791122]
Fine tuning /media/data_caffe/hawaii/ScaleLr_sweeper_scale_1.0_0.001 from vgg_initial.caffemodel.
