## Network conversion and weight/activation quantization

This script loads an original network prototxt file together with pretrained weights and produces a network file that can be loaded by NullHop and run on an FPGA.

Example case: VGG16
The network prototxt file can be found either in caffe_lp/examples/low_precision/imagenet/models or here:

https://gist.github.com/ksimonyan/211839e770f7b538e2d8#file-readme-md

caffemodel (original pretrained weights):
http://www.robots.ox.ac.uk/%7Evgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel




In [1]:
import caffe
import numpy as np
import matplotlib.pyplot as plt
from caffe.quantization.net_descriptor import net_prototxt
from caffe.quantization.qmf_check import distribute_bits
from caffe.quantization.convert_weights import convert_weights
import os
import sys

%matplotlib inline

# initialize classes
d = distribute_bits()
n = net_prototxt()
c = convert_weights()

Done


In [2]:
'''
The information flow is as follows:
1) Create a folder for the network you want to quantize/fine-tune
   in caffe_lp/examples/low_precision/quantization/network
2) Download and copy the .caffemodel file into the weight directory (e.g. an HDD or wherever you want)
   The copied original file is used to convert blobs (weights W and biases b) into the low-precision blob structure
3) All models, e.g. prototxt files, are stored in caffe_lp/examples/low_precision/imagenet/models
   We start from the deploy files as they are normally released
4) We distribute the available bits (e.g. 16 bits including sign) for each layer, separately for weights and activations
5) We extract the network structure from a given net using the extract function within net_prototxt
6) We create a new prototxt file based on the extracted network layout and the layer-wise bit distribution
   for weights and activations
7) We fine-tune the model using the reduced bit precision for 1-5 epochs

Author: Moritz Milde
Date: 03.11.2016
email: mmilde@ini@uzh.ch
'''

net_name = 'VGG16'
n_bits = 16

caffe_root = '/home/moritz/Repositories/caffe_lp/'
weight_dir = '/media/moritz/Data/ILSVRC2015/pre_trained/' + net_name + '/'
model_dir = caffe_root + 'examples/low_precision/imagenet/models/'
script_dir = caffe_root + 'examples/low_precision/imagenet/'
layer_dir = caffe_root + 'examples/create_prototxt/layers/'
save_dir = caffe_root + 'examples/low_precision/imagenet/models/'

c.convert_weights(net_name, save_name=None, 
                  caffe_root=caffe_root, model_dir=model_dir, weight_dir=weight_dir, debug=False)


Downloading to /media/moritz/Data/ILSVRC2015/pre_trained/VGG16/
File already downloaded


In [7]:
# We first have to estimate the bit distribution for weights and activations
# since the network will be constructed based on this estimate
# Make sure you downloaded the pre_trained weights from the above link, if VGG16/GoogLeNet, or the respective source
# The naming convention is: NetworkName_deploy.prototxt (for test/one-time rounding)
#                           NetworkName_train.prototxt (for finetune/re-train)
bit_w, net = d.weights(net_name=net_name, n_bits=n_bits, load_mode='high_precision', threshold=0.1,
                       caffe_root=caffe_root, model_dir=model_dir, weight_dir=weight_dir, debug=False)
bit_a, net = d.activation(net_name=net_name, n_bits=n_bits, load_mode='high_precision', threshold=0.05,
                          caffe_root=caffe_root, model_dir=model_dir, weight_dir=weight_dir, debug=False)
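
The cell above splits a fixed bit budget (here 16 bits including sign) per layer. As a rough illustration of the idea only, the sketch below shows one common way to choose a signed fixed-point format from the largest absolute value in a tensor. It is not the implementation of `distribute_bits`: the helper names `split_bits` and `quantize` are hypothetical, and the `threshold` argument used above is not modeled.

```python
import numpy as np


def split_bits(values, n_bits=16):
    """Pick (integer_bits, fractional_bits) for a signed fixed-point format.

    One bit is reserved for the sign; the integer part gets just enough
    bits to cover the largest magnitude, the rest goes to the fraction.
    (Hypothetical helper, not part of caffe_lp.)
    """
    max_abs = float(np.max(np.abs(values)))
    if max_abs == 0:
        int_bits = 0
    else:
        # Smallest int_bits >= 0 such that max_abs < 2**int_bits
        int_bits = max(0, int(np.floor(np.log2(max_abs))) + 1)
    int_bits = min(int_bits, n_bits - 1)
    frac_bits = n_bits - 1 - int_bits
    return int_bits, frac_bits


def quantize(values, int_bits, frac_bits):
    """Round to the nearest multiple of 2**-frac_bits and clip to range."""
    step = 2.0 ** -frac_bits
    lo = -(2.0 ** int_bits)
    hi = 2.0 ** int_bits - step
    scaled = np.round(np.asarray(values, dtype=np.float64) / step) * step
    return np.clip(scaled, lo, hi)


weights = np.array([0.3, -1.7, 2.5])
m, f = split_bits(weights, n_bits=16)
quantized = quantize(weights, m, f)
```

With this scheme a layer whose values stay small receives more fractional bits, which is the general motivation for distributing bits per layer rather than using one global format.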



In [9]:
# By default, the functions look for models in the directory from step (3) above.
# Make sure the prototxt file to extract the net structure from is in that directory
print 'Extracting network structure'
net_layout = n.extract(net_name=net_name, mode='train', model=net, 
                       caffe_root=caffe_root, weight_dir=weight_dir, debug=False)
print net_layout
print 'Creating new network based on weight/activation distribution'
n.create(net_name=net_name, net_descriptor=net_layout, 
         bit_distribution_weights=bit_w, bit_distribution_act=bit_a, 
         init_method='xavier', lp=True, deploy=False, visualize=False, round_bias='false',
         caffe_root=caffe_root, model_dir=model_dir, layer_dir=layer_dir, save_dir=save_dir, debug=False)


Extracting network structure
['64C3S1p1', 'A', 'ReLU', '64C3S1p1', 'A', 'ReLU', '2P2', '128C3S1p1', 'A', 'ReLU', '128C3S1p1', 'A', 'ReLU', '2P2', '256C3S1p1', 'A', 'ReLU', '256C3S1p1', 'A', 'ReLU', '256C3S1p1', 'A', 'ReLU', '2P2', '512C3S1p1', 'A', 'ReLU', '512C3S1p1', 'A', 'ReLU', '512C3S1p1', 'A', 'ReLU', '2P2', '512C3S1p1', 'A', 'ReLU', '512C3S1p1', 'A', 'ReLU', '512C3S1p1', 'A', 'ReLU', '2P2', '4096F', 'A', 'ReLU', 'D5', '4096F', 'A', 'ReLU', 'D5', '1000F', 'Accuracy', 'loss']
Creating new network based on weight/activation distribution
Generating LP_VGG16_16_bit_train.prototxt
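
The layout printed above uses a compact token per layer. The sketch below decodes those tokens into dictionaries. The token meanings are inferred from the VGG16 output and are assumptions, not a specification of `net_prototxt`: `64C3S1p1` is read as a convolution with 64 outputs, kernel 3, stride 1, pad 1; `2P2` as pooling with kernel 2 and stride 2 (the order is a guess, both values are equal here); `4096F` as a fully connected layer; `D5` as dropout with ratio 0.5. The function name `parse_layer_token` is hypothetical, and tokens such as `A`, `ReLU`, `Accuracy` and `loss` are passed through uninterpreted.

```python
import re


def parse_layer_token(token):
    """Decode one layout token into a dict (hypothetical helper)."""
    m = re.match(r'^(\d+)C(\d+)S(\d+)p(\d+)$', token)
    if m:
        n_out, kernel, stride, pad = [int(g) for g in m.groups()]
        return {'type': 'Convolution', 'num_output': n_out,
                'kernel_size': kernel, 'stride': stride, 'pad': pad}
    m = re.match(r'^(\d+)P(\d+)$', token)
    if m:
        # Assumed order: kernel size first, then stride
        kernel, stride = [int(g) for g in m.groups()]
        return {'type': 'Pooling', 'kernel_size': kernel, 'stride': stride}
    m = re.match(r'^(\d+)F$', token)
    if m:
        return {'type': 'InnerProduct', 'num_output': int(m.group(1))}
    m = re.match(r'^D(\d+)$', token)
    if m:
        # Assumed encoding: dropout ratio multiplied by 10
        return {'type': 'Dropout', 'dropout_ratio': int(m.group(1)) / 10.0}
    # Anything else is kept as an opaque layer name
    return {'type': token}


layout = ['64C3S1p1', 'A', 'ReLU', '2P2', '4096F', 'D5', '1000F']
decoded = [parse_layer_token(t) for t in layout]
```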


# Finetune the network
To fine-tune, edit lp_solver_GPU0_finetune.prototxt,
located in $caffe_root/examples/low_precision/imagenet/solver,
so that it loads the network you just created, LP_{net_name}_{n_bits}_bit_train.prototxt,
and the converted weights, HP_{net_name}_v2.caffemodel.
Happy fine-tuning!
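
The solver edit described above can also be scripted. The following is a minimal sketch, assuming the solver file selects its network through the standard Caffe `net: "..."` field; the function name `point_solver_at_net` and the example solver contents are hypothetical, and the real lp_solver_GPU0_finetune.prototxt may be laid out differently.

```python
import re


def point_solver_at_net(solver_text, net_path):
    """Return solver text whose top-level `net:` field points at net_path.

    If no `net:` line exists, one is prepended. (Hypothetical helper.)
    """
    pattern = re.compile(r'^(\s*net\s*:\s*)"[^"]*"', flags=re.MULTILINE)
    # A function replacement avoids backslash escapes in net_path being
    # interpreted by the regex engine
    new_text, n_subs = pattern.subn(
        lambda m: m.group(1) + '"' + net_path + '"', solver_text)
    if n_subs == 0:
        new_text = 'net: "{}"\n'.format(net_path) + solver_text
    return new_text


# Example solver contents (made up for illustration)
solver = 'net: "old_train.prototxt"\nbase_lr: 0.0001\n'
updated = point_solver_at_net(solver, 'LP_VGG16_16_bit_train.prototxt')
```

The converted weights are not part of the solver file. With stock Caffe they would typically be passed on the command line, e.g. `caffe train -solver <solver file> -weights <caffemodel>`; whether caffe_lp keeps that interface unchanged is an assumption.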

In [None]:
from caffe.proto import caffe_pb2
from google.protobuf import text_format


def get_model():
    """Load VGG16 as a caffe.Net together with its parsed NetParameter."""
    prototxt = model_dir + 'VGG16_deploy.prototxt'
    # The weights are assumed to be stored in weight_dir (defined above)
    caffemodel = weight_dir + 'VGG16.caffemodel'
    model = caffe.Net(prototxt, caffemodel, caffe.TEST)
    # Parse the prototxt text into a NetParameter message so that
    # per-layer parameters can be inspected
    model_protobuf = caffe_pb2.NetParameter()
    with open(prototxt) as f:
        text_format.Merge(f.read(), model_protobuf)
    return {'model': (model, model_protobuf), 'val_fn': model.forward_all}

model = get_model()


In [None]:
print type(model['model'])

caffe_model = model['model'][0]
caffe_layers = model['model'][1].layer

for (layer_num, layer) in enumerate(caffe_layers):
    if layer.type == 'Convolution':
        p = layer.convolution_param
        print 'Type: {}'.format(layer.type)
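        # pad, stride and kernel_size are repeated fields; an empty field
        # means the Caffe default (pad 0, stride 1)
        print 'Pad: {}'.format(p.pad[0] if len(p.pad) else 0)
        print 'Stride: {}'.format(p.stride[0] if len(p.stride) else 1)
        print 'Kernel size: {}'.format(p.kernel_size[0])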
        print 'Output Channels {}'.format(p.num_output)
        print '----------------------'
    if layer.type == 'Pooling':
        p = layer.pooling_param
        print 'Type: {}'.format(layer.type)
        print 'Kernel size: {}'.format(p.kernel_size)
        print 'Stride: {}'.format(p.stride)
        print '----------------------'
    if layer.type == 'Dropout':
        p = layer.dropout_param
        print 'Type: {}'.format(layer.type)
        print 'Dropout ratio: {}'.format(int(p.dropout_ratio * 10))
        print '----------------------'
    if layer.type == 'Accuracy':
        p = layer.accuracy_param
        print 'Type: {}'.format(layer.type)
        print 'Top: {}'.format(p.top_k)

In [None]:
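select_key1 = 'conv'
select_key2 = 'fc'
# Count the blobs whose names contain 'conv' or 'fc'
n_conv = len([k for k in net.blobs.keys() if select_key1 in k])
n_fc = len([k for k in net.blobs.keys() if select_key2 in k])
# Two rows, one column per counted blob, less an offset of 3
bit_distribution = np.zeros((2, n_conv + n_fc - 3))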

In [None]:
np.shape(bit_distribution)