# Modify the low-rank classification for extra unshared weights

This modifies the 12-target (i12), low-rank classifier to use additional convolutional layers after classification. 
Before using this you should:
- Have pycaffe set up and on the path
- Have a copy of the low-rank solver.prototxt
- Have a copy of the low-rank training classifier (e.g. `../scripts/anisotropic-training/training.prototxt`)
- Have a copy of the low-rank inference classifier (e.g. `../scripts/anisotropic-training/deploy/deploy.prototxt`)

Table of contents:
- [Utility Functions](#Utility-Functions)
- [Generate the Training Net](#Generate-the-net-to-use-for-training)
- [Generate the Inference Net](#Generate-the-net-to-use-for-inference-/-testing)
- [Test the Net](#Final-Test)
- [Modify the Solver Prototxt](#Set-the-solver-parameters-to-use-the-new-net)

# Utility Functions

In [1]:
%pylab notebook

Populating the interactive namespace from numpy and matplotlib


In [2]:
import os
import google.protobuf.text_format
import caffe
caffe.set_mode_cpu()

In [3]:
# For debugging (cell can be removed from final notebook)
from IPython.core.debugger import set_trace

In [4]:
def read_net_proto(path):
    """Read a net from a prototxt file
    
    :param path: The path to a caffe network (.prototxt file)
    
    :return: The prototxt object
    :rtype: caffe.proto.caffe_pb2.NetParameter
    """
    net = caffe.proto.caffe_pb2.NetParameter()
    with open(path) as f:
        proto = f.read()
    google.protobuf.text_format.Parse(proto, net)
    return net
    

In [5]:
def summarize_net(train_net):
    """Print a (not so short) summary of the net layers
    
    :param train_net: A net
    :type train_net: caffe.proto.caffe_pb2.NetParameter
    """
    layers = list(train_net.layer)
    for i, layer in enumerate(layers):
        print "{:04}".format(i),
        print "\t{:15}\t{:15}".format(layer.name, layer.type),
        if layer.type=="Convolution":
            # kernel_size is a repeated field; it is empty when kernel_w/kernel_h are set instead
            if len(layer.convolution_param.kernel_size) > 0:
                print "\t{0:>2}x{0:<2}".format(layer.convolution_param.kernel_size[0]),
            else:
                print "\t{:>2}x{:<2}".format(layer.convolution_param.kernel_w, layer.convolution_param.kernel_h),
        else:
            print "\t{:5}".format(''),

        if "_D" in layer.name:
            print "DECODE"
        else:
            print "      "

    print "Total", len(layers), "layers"   

# Generate the net to use for training

The first time I tried to programmatically modify a network, I wisely named it `modified-training-net.prototxt`. It is, in fact, a _low-rank-training-net.prototxt_ file and I wish I had named it that. I am too lazy to rename it and fix all the references to it that would break, so there we are.

This notebook will modify it again, but I will try to choose a more appropriate name this time. 

In [6]:
!ls modified-training-net.prototxt

modified-training-net.prototxt


In [7]:
train_net = read_net_proto('modified-training-net.prototxt')

Right now, the first layers with non-shared weights are the `conv-{LABEL}` layers.

The probability scores are not in the net.

In [8]:
LABELS = ['facade', 'window', 'door', 'cornice', 'sill', 'balcony', 'blind', 'deco', 'molding', 'pillar', 'shop']

In [9]:
num_layers = len(train_net.layer)  # `layer`, not the deprecated V1 field `layers`

indices = {}
for i,l in enumerate(train_net.layer):
    indices[l.name] = i
    
for label in LABELS:
    print label, indices['conv-'+label]

facade 116
window 117
door 118
cornice 119
sill 120
balcony 121
blind 122
deco 123
molding 124
pillar 125
shop 126


In [10]:
# Top off the shared weights with a (full rank) convolution that has the
# same number of outputs that our concatenation layer will have

def make_initial_layer():
    layer = caffe.proto.caffe_pb2.LayerParameter()
    layer.type ="Convolution"
    layer.name = "conv1_1_D"
    layer.bottom.append('conv1_2_D')
    layer.top.append('conv1_1_D')

    layer.convolution_param.num_output = 4*len(LABELS)
    layer.convolution_param.pad.append(1)
    layer.convolution_param.kernel_size.append(3)
    layer.convolution_param.weight_filler.type='msra'
    layer.convolution_param.bias_filler.type='constant'

    layer.param.add(lr_mult=1, decay_mult=1)
    layer.param.add(lr_mult=1, decay_mult=0)

    return layer

make_initial_layer()

<caffe_pb2.LayerParameter at 0x7ff07999b050>

In [11]:
# Concatenate
concat = caffe.proto.caffe_pb2.LayerParameter()
concat.type ="Concat"
concat.name = "concat"
concat.top.append('concat')
for label in LABELS:
    concat.bottom.append('conv-' + label)
concat.concat_param.axis = 1

concat

<caffe_pb2.LayerParameter at 0x7ff07999b5f0>

In [12]:
# Now create recurrent layers; 

# QUESTION: Do the _output_ blocks require different names?
# Or will the network be able to understand this structure?

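# Snapshot the per-label layers now: `indices` refers to the unmodified
# training net and goes stale once stitch_in_new_layers() has rewritten it.
TEMPLATES = {}
for label in LABELS:
    TEMPLATES[label] = caffe.proto.caffe_pb2.LayerParameter()
    TEMPLATES[label].CopyFrom(train_net.layer[indices['conv-'+label]])

def make_recurrant_feature_node(target_name, pass_index):
    rec_xxx = caffe.proto.caffe_pb2.LayerParameter()
    rec_xxx.CopyFrom(TEMPLATES[target_name])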
    rec_xxx.name = 'conv_{}_r{}'.format(target_name, pass_index)
    if pass_index == 1:
        rec_xxx.bottom[0] = 'conv1_1_D'
    else:
        rec_xxx.bottom[0] = 'concat_r{}'.format(pass_index-1)
    rec_xxx.top[0] = 'conv_{}_r{}'.format(target_name, pass_index)
    rec_xxx.param[0].name='conv_{}_W'.format(target_name)
    rec_xxx.param[1].name='conv_{}_b'.format(target_name)
    return rec_xxx
    
def make_reccurant_concat_node(pass_index):
    concat = caffe.proto.caffe_pb2.LayerParameter()
    concat.type ="Concat"
    concat.name = 'concat_r{}'.format(pass_index)
    concat.top.append('concat_r{}'.format(pass_index))
    for label in LABELS:
        concat.bottom.append('conv_{}_r{}'.format(label, pass_index))
    concat.concat_param.axis = 1
    return concat
 
def make_recurrant_block(pass_index):
    block = []
    for target_name in LABELS:
        block.append(make_recurrant_feature_node(target_name, pass_index))
    block.append(make_reccurant_concat_node(pass_index))
    return block
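
The naming scheme used by these helpers can be checked without Caffe. The sketch below is a plain-Python stand-in: the `plan_recurrent_block` helper, the tuple layout, and the shortened label list are illustrative and not part of the notebook. For each pass it lists which blob every per-label convolution reads, which blob it writes, and which parameter names it carries. Caffe shares weights between layers whose `param` entries have the same `name`, so each label keeps one set of weights across all passes while the tops stay distinct per pass.

```python
# Shortened label list, for illustration only
SKETCH_LABELS = ['facade', 'window', 'door']

def plan_recurrent_block(labels, pass_index):
    """Return (layer_name, bottom, top, weight_name, bias_name) tuples for one pass."""
    if pass_index == 1:
        bottom = 'conv1_1_D'                           # first pass reads the new initial layer
    else:
        bottom = 'concat_r{}'.format(pass_index - 1)   # later passes read the previous concat
    plan = []
    for label in labels:
        name = 'conv_{}_r{}'.format(label, pass_index)
        plan.append((name, bottom, name,
                     'conv_{}_W'.format(label), 'conv_{}_b'.format(label)))
    return plan

for pass_index in (1, 2):
    for row in plan_recurrent_block(SKETCH_LABELS, pass_index):
        print("{:18} bottom={:12} top={:18} params={}, {}".format(*row))
```

Comparing the rows for pass 1 and pass 2 shows the same `conv_{label}_W` / `conv_{label}_b` names on differently named layers, which is the structure the cell above builds with real `LayerParameter` objects.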

In [13]:
#TODO: Set bottoms for conv-xxx


In [14]:
# Insert the new layers just before the existing 'conv-{LABEL}' layers, which
# are renamed '*_final' and rewired to read from the last concat ('concat_r5')
def stitch_in_new_layers(net):
    indices = {l.name:i for i, l in enumerate(net.layer)}
    conv1_1_D_index = indices['conv-'+LABELS[0]]
    reccurant_start_index = indices['conv-'+LABELS[-1]]+1

    layers = list(net.layer)

    prequel = layers[:conv1_1_D_index]
    target_layers = layers[conv1_1_D_index:reccurant_start_index]
    sequel = layers[reccurant_start_index:]

    rec_blocks = []
    for pass_index in range(1, 5+1):
        rec_blocks += make_recurrant_block(pass_index)

    del net.layer[:] 
    net.layer.extend(prequel)
    net.layer.extend([make_initial_layer()])
    net.layer.extend(rec_blocks)
    
    for layer in target_layers:
        layer.bottom[0] = 'concat_r5'
        target_name = layer.name.replace('conv-','')
        layer.name = layer.name + '_final'
        layer.param[0].name='conv_{}_W'.format(target_name)
        layer.param[1].name='conv_{}_b'.format(target_name)
    
    net.layer.extend(target_layers)
    net.layer.extend(sequel)
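
The splice performed by `stitch_in_new_layers` can be pictured on a plain list of names. This is a toy sketch, not the notebook's function: `splice_layers` and the layer names in `toy` are made up for illustration, and only the ordering logic (prequel, new layers, renamed targets, sequel) mirrors the cell above.

```python
def splice_layers(names, first_target, last_target, new_names):
    """Insert new_names just before the run first_target..last_target."""
    start = names.index(first_target)
    stop = names.index(last_target) + 1
    prequel, targets, sequel = names[:start], names[start:stop], names[stop:]
    # The original per-label layers move behind the new ones and get a suffix
    renamed = [name + '_final' for name in targets]
    return prequel + new_names + renamed + sequel

toy = ['data', 'conv1_2_D', 'conv-facade', 'conv-shop', 'facade-loss', 'shop-loss']
new = ['conv1_1_D', 'conv_facade_r1', 'conv_shop_r1', 'concat_r1']
for name in splice_layers(toy, 'conv-facade', 'conv-shop', new):
    print(name)
```

The printed order makes it easy to see that everything before the first `conv-` layer and everything after the last one is left where it was.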

In [15]:
def save_net_proto(path, net):
    """Write a net (or any protobuf message) to a text-format prototxt file"""
    new_proto = google.protobuf.text_format.MessageToString(net)
    with open(path, 'w') as f:
        f.write(new_proto)

In [16]:
stitch_in_new_layers(train_net)
save_net_proto('compatability-training-net.prototxt', train_net)

# Generate the net to use for inference / testing

Before executing these cells, copy the net used for inference with the 'low-rank' classifier into this folder and name it `modified-inference-net.prototxt`.

In [17]:
infer_net_path = 'modified-inference-net.prototxt'
infer_net = read_net_proto(infer_net_path)

In [18]:
stitch_in_new_layers(infer_net)
save_net_proto('compatability-inference-net.prototxt', infer_net)

# Save a non-Bayesian version

In [19]:
print ','.join(set([layer.type for layer in infer_net.layer]))

Convolution,ArgMax,BN,Upsample,ReLU,Pooling,Softmax,Dropout,Concat


In [20]:
dropouts = [layer for layer in infer_net.layer if layer.type == 'Dropout']
print "Found", len(dropouts), "dropout layers"

Found 6 dropout layers


In [21]:
for dropout in dropouts:
    if dropout.dropout_param.HasField('sample_weights_test'):
        print "Dropout", dropout.name, "_was_ set to work during testing...",
        dropout.dropout_param.ClearField('sample_weights_test')
        print "not anymore though!"
    else:
        print "Dropout", dropout.name, "is NOT set to work during testing..."

Dropout encdrop3 _was_ set to work during testing... not anymore though!
Dropout encdrop4 _was_ set to work during testing... not anymore though!
Dropout encdrop5 _was_ set to work during testing... not anymore though!
Dropout decdrop5 _was_ set to work during testing... not anymore though!
Dropout decdrop4 _was_ set to work during testing... not anymore though!
Dropout decdrop3 _was_ set to work during testing... not anymore though!


In [22]:
save_net_proto('non-bayesian-compatability-inference-net.prototxt', infer_net)

# Final Test 
**Make Sure Things Load Right**
- The net should load without crashing everything
- The tops should ONLY be the 'label-' layers, or the 'loss' layers, if everything is connected properly
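
The second check can be phrased as "which tops are never used as a bottom?". Below is a minimal sketch on plain tuples, assuming each layer is described as `(name, bottoms, tops)`. The `dangling_tops` helper and the toy layers are hypothetical; for a real net the same triples could be collected from `net.layer`.

```python
def dangling_tops(layers):
    """Return tops that no layer consumes as a bottom, in order of appearance."""
    consumed = set()
    for _name, bottoms, _tops in layers:
        consumed.update(bottoms)
    result = []
    for _name, _bottoms, tops in layers:
        for top in tops:
            if top not in consumed and top not in result:
                result.append(top)
    return result

toy_layers = [
    ('data', [], ['data', 'label-facade']),
    ('conv1_1_D', ['data'], ['conv1_1_D']),
    ('conv_facade_r1', ['conv1_1_D'], ['conv_facade_r1']),
    ('orphan', ['conv1_1_D'], ['orphan']),  # deliberately left unconnected
    ('facade-loss', ['conv_facade_r1', 'label-facade'], ['facade-loss']),
]
print(dangling_tops(toy_layers))
```

Anything in the result other than a loss (or label) blob points at a layer whose output was left unconnected, like `orphan` in the toy list. In-place layers, whose top equals their bottom, are not reported because their blob name is consumed.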

In [24]:
%run /opt/caffe-segnet-cudnn5/python/draw_net.py 'compatability-training-net.prototxt'  'compatability-training-net.svg'

Drawing net to compatability-training-net.svg


[output figure](compatability-training-net.svg)

In [25]:
import caffe
caffe.set_device(0)
caffe.set_mode_gpu()

import facade_layers
reload(facade_layers)
net = caffe.Net('compatability-training-net.prototxt', 
                'test_weights_from_peihao.caffemodel', caffe.TRAIN)

results = net.forward()

Initialized 'TrainInputLayer
   For fold number 1
   Getting files from /root/low-rank/data/training/independant_12_layers/fold_01/train.txt
   Found 1173 files.
source: /root/low-rank/data/training/independant_12_layers/fold_01/train.txt
number of files: 1173
/root/low-rank/data/training/independant_12_layers/cmp/npy/stack_0651.npy


If I have connected things properly, these should be just the 'loss' layers.

In [26]:
results.keys()

['shop-loss',
 'blind-loss',
 'cornice-loss',
 'door-loss',
 'sill-loss',
 'deco-loss',
 'facade-loss',
 'window-loss',
 'molding-loss',
 'balcony-loss',
 'pillar-loss']

In [27]:
del net

# Set the solver parameters to use the new net

In [29]:
from google.protobuf.text_format import Parse

In [30]:
solver = caffe.proto.caffe_pb2.SolverParameter()
Parse(open('solver.prototxt').read(), solver)

<caffe_pb2.SolverParameter at 0x7ff05097ae60>

In [31]:
solver.net = os.path.abspath('compatability-training-net.prototxt')

In [35]:
ls /data/models/

In [36]:
solver.snapshot_prefix='/data/models/compatability_net'

In [37]:
print solver

test_iter: 1
test_interval: 10000000
base_lr: 1e-05
display: 10
max_iter: 250000
lr_policy: "step"
gamma: 1.0
momentum: 0.95
weight_decay: 0.0005
stepsize: 10000000
snapshot: 1000
snapshot_prefix: "/data/models/compatability_net"
solver_mode: GPU
net: "/root/low-rank/scripts/anisotropic-training/compatability-training-net.prototxt"
test_initialization: false



In [38]:
with open('compatability-solver.prototxt', 'w') as f:
    f.write(google.protobuf.text_format.MessageToString(solver))

At this point we can **almost** start training. Forgetting to create the folder that holds the snapshots once cost me 1000 iterations, because Caffe does not create it for us.

In [39]:
os.path.isdir(os.path.dirname(solver.snapshot_prefix))

True

In [40]:
try:
    os.makedirs(os.path.dirname(solver.snapshot_prefix))
    print "Since caffe won't do it for us (ugh!) I made the folder needed to hold our snapshots"
except OSError as e:
    print "Ok, it's alright, we must have already made that directory."

Ok, it's alright, we must have already made that directory.


In [47]:
%%file compatability-start-training.sh
#!/usr/bin/env bash

INITIAL_WEIGHTS=${PWD}/test_weights_from_peihao.caffemodel

caffe.bin train -solver=compatability-solver.prototxt -weights ${INITIAL_WEIGHTS} -gpu=3

Overwriting compatability-start-training.sh


In [46]:
%%file compatability-resume-training
#!/usr/bin/env bash
caffe.bin train -solver=compatability-solver.prototxt \
                -snapshot $(./get_iter.py -s -p compatability-solver.prototxt) \
                -gpu=3

Writing compatability-resume-training


Now I should be able to run `compatability-start-training.sh`.