# CIFAR-10 Recipe
In this notebook, we show how to train a state-of-the-art CIFAR-10 network with MXNet and how to extract features from the trained network.
This example will cover:

- Network/Data definition 
- Model saving and loading
- Learning rate schedule
- Extracting features from the network


In [12]:
import mxnet as mx
import logging

# setup logging
logging.basicConfig(level=logging.DEBUG)
console = logging.StreamHandler()
console.setLevel(logging.DEBUG)
logging.getLogger('').addHandler(console)

First, let's write some helper functions that let us build a simplified Inception network. More details on how to compose symbols into components can be found in the [component demo](composite_symbol.ipynb).

In [3]:
# Basic Conv + BN + ReLU factory
def ConvFactory(data, num_filter, kernel, stride=(1,1), pad=(0, 0), act_type="relu"):
    conv = mx.symbol.Convolution(data=data, num_filter=num_filter, kernel=kernel, stride=stride, pad=pad)
    bn = mx.symbol.BatchNorm(data=conv)
    act = mx.symbol.Activation(data=bn, act_type=act_type)
    return act

In [4]:
# A Simple Downsampling Factory
def DownsampleFactory(data, ch_3x3):
    # conv 3x3
    conv = ConvFactory(data=data, kernel=(3, 3), stride=(2, 2), num_filter=ch_3x3, pad=(1, 1))
    # pool
    pool = mx.symbol.Pooling(data=data, kernel=(3, 3), stride=(2, 2), pool_type='max')
    # concat
    concat = mx.symbol.Concat(*[conv, pool])
    return concat

In [5]:
# A Simple module
def SimpleFactory(data, ch_1x1, ch_3x3):
    # 1x1
    conv1x1 = ConvFactory(data=data, kernel=(1, 1), pad=(0, 0), num_filter=ch_1x1)
    # 3x3
    conv3x3 = ConvFactory(data=data, kernel=(3, 3), pad=(1, 1), num_filter=ch_3x3)
    # concat
    concat = mx.symbol.Concat(*[conv1x1, conv3x3])
    return concat

Now we can build a network with these component factories:

In [9]:
data = mx.symbol.Variable(name="data")
conv1 = ConvFactory(data=data, kernel=(3,3), pad=(1,1), num_filter=96, act_type="relu")
in3a = SimpleFactory(conv1, 32, 32)
in3b = SimpleFactory(in3a, 32, 48)
in3c = DownsampleFactory(in3b, 80)
in4a = SimpleFactory(in3c, 112, 48)
in4b = SimpleFactory(in4a, 96, 64)
in4c = SimpleFactory(in4b, 80, 80)
in4d = SimpleFactory(in4c, 48, 96)
in4e = DownsampleFactory(in4d, 96)
in5a = SimpleFactory(in4e, 176, 160)
in5b = SimpleFactory(in5a, 176, 160)
pool = mx.symbol.Pooling(data=in5b, pool_type="avg", kernel=(7,7))
flatten = mx.symbol.Flatten(data=pool)
fc = mx.symbol.FullyConnected(data=flatten, num_hidden=10)
loss = mx.symbol.Softmax(data=fc)
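
As a sanity check on the architecture above, the following plain-Python sketch tracks the channel count and spatial size through each stage for a 3x28x28 input. It does not use MXNet. The helper names (`conv_out`, `simple`, `downsample`) are hypothetical, and the sketch assumes the standard convolution output-size formula and that the pooling branch of `DownsampleFactory` yields the same spatial size as its convolution branch (which `Concat` requires).

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution: floor((size + 2*pad - kernel) / stride) + 1."""
    return (size + 2 * pad - kernel) // stride + 1


def simple(shape, ch_1x1, ch_3x3):
    # SimpleFactory concatenates a 1x1 branch and a padded 3x3 branch.
    # Both keep the spatial size, so only the channel count changes.
    _, size = shape
    return (ch_1x1 + ch_3x3, size)


def downsample(shape, ch_3x3):
    # DownsampleFactory concatenates a stride-2 3x3 convolution with a
    # stride-2 max pooling of its input: channels = ch_3x3 + input channels.
    channels, size = shape
    return (ch_3x3 + channels, conv_out(size, kernel=3, stride=2, pad=1))


# (stage name, factory, factory arguments), in the order used by the network
stages = [
    ("in3a", simple, (32, 32)),
    ("in3b", simple, (32, 48)),
    ("in3c", downsample, (80,)),
    ("in4a", simple, (112, 48)),
    ("in4b", simple, (96, 64)),
    ("in4c", simple, (80, 80)),
    ("in4d", simple, (48, 96)),
    ("in4e", downsample, (96,)),
    ("in5a", simple, (176, 160)),
    ("in5b", simple, (176, 160)),
]

# conv1: 96 filters, 3x3 kernel with pad 1 on a 28x28 input
shape = (96, conv_out(28, kernel=3, pad=1))
trace = [("conv1", shape)]
for name, factory, args in stages:
    shape = factory(shape, *args)
    trace.append((name, shape))

for name, (channels, size) in trace:
    print("%-5s channels=%3d spatial=%dx%d" % (name, channels, size, size))
```

Under these assumptions the last stage has 336 channels at 7x7, which is consistent with the 7x7 kernel of the final average pooling and leaves 336 features for the fully connected layer.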

In [None]:
# If you'd like to see the network structure, run the plot_network function
# mx.viz.plot_network(loss)

In [13]:
# We will create a model from the current symbol
# For demo purposes, this model trains for only 1 round
model = mx.model.FeedForward(ctx=mx.gpu(), symbol=loss, num_round = 1,
                             learning_rate=0.05, momentum=0.9, wd=0.00001)
# To save the model automatically after each round, we can add a checkpoint callback
# model_prefix = "cifar"
# model = mx.model.FeedForward(ctx=mx.gpu(), symbol=loss, num_round = 1,
#                              learning_rate=0.05, momentum=0.9, wd=0.00001,
#                              iter_end_callback=mx.model.do_checkpoint(model_prefix))


ValueError: Find duplicated argument name "weight", please make the weight name non-duplicated(using name arguments), arguments are ['data', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'gamma', 'beta', 'weight', 'bias', 'label']
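
The error above reports that several layers share the same argument names (`weight`, `bias`, `gamma`, `beta`). As the message suggests, giving each layer a distinct `name` avoids the collision, because MXNet derives argument names from the layer name (for example `conv1_weight`). The sketch below illustrates the idea in plain Python; `layer_args` and `find_duplicates` are hypothetical helpers for illustration, not MXNet functions.

```python
from collections import Counter


def layer_args(name, params):
    """Argument names for one layer, prefixed with the layer name when one is given."""
    return ["%s_%s" % (name, p) if name else p for p in params]


def find_duplicates(arg_names):
    """Return the sorted argument names that occur more than once."""
    return sorted(n for n, count in Counter(arg_names).items() if count > 1)


conv_params = ["weight", "bias"]
bn_params = ["gamma", "beta"]

# Two unnamed Conv + BN pairs: every parameter name collides.
unnamed = []
for _ in range(2):
    unnamed += layer_args(None, conv_params) + layer_args(None, bn_params)

# The same two pairs with distinct layer names: no collisions.
named = []
for i in range(2):
    named += layer_args("conv%d" % i, conv_params) + layer_args("bn%d" % i, bn_params)

print(find_duplicates(unnamed))
print(find_duplicates(named))
```

In the factories above, this would correspond to passing a unique `name=` to each `mx.symbol.Convolution` and `mx.symbol.BatchNorm` call.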

The next step is to declare the data iterators. The original CIFAR-10 data consists of 3x32x32 images in a binary format; we provide it in RecordIO format, so we can use the Image RecordIO iterator. For more information about the Image RecordIO iterator, see the [documentation](https://mxnet.readthedocs.org/en/latest/python/io.html).

In [14]:
# Use the utility function from the tests to download the data
import sys
sys.path.append("../../tests/python/common")
import get_data
get_data.GetCifar10()
# After we get the data, we can declare our data iterators
# The iterator will automatically create the mean image file if it doesn't exist
batch_size = 128
# The train iterator makes batches of 128 images and randomly crops each image to 3x28x28 from the original 3x32x32
train_dataiter = mx.io.ImageRecordIter(
        shuffle=True,
        path_imgrec="data/cifar/train.rec",
        mean_img="data/cifar/cifar_mean.bin",
        rand_crop=True,
        rand_mirror=True,
        data_shape=(3,28,28),
        batch_size=batch_size,
        preprocess_threads=1)
# The test iterator makes batches of 128 images and center-crops each image to 3x28x28 from the original 3x32x32
test_dataiter = mx.io.ImageRecordIter(
        path_imgrec="data/cifar/test.rec",
        mean_img="data/cifar/cifar_mean.bin",
        rand_crop=False,
        rand_mirror=False,
        data_shape=(3,28,28),
        batch_size=batch_size,
        preprocess_threads=1)
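
The difference between the two iterators is how the 28x28 window is chosen inside each 32x32 image. The following plain-Python sketch illustrates the idea on a single channel; `crop_offset` and `crop` are hypothetical helpers, and the exact sampling that `ImageRecordIter` performs internally is an assumption here, not taken from its implementation.

```python
import random


def crop_offset(src, dst, rand_crop, rng=random):
    """Top-left offset (y, x) of a dst x dst window inside a src x src image."""
    margin = src - dst
    if rand_crop:
        # Training: pick any valid offset uniformly at random.
        return (rng.randint(0, margin), rng.randint(0, margin))
    # Testing: always take the centered window.
    return (margin // 2, margin // 2)


def crop(image, offset, dst):
    """Crop a 2-D list-of-rows image to dst x dst, starting at offset."""
    y, x = offset
    return [row[x:x + dst] for row in image[y:y + dst]]


# One 32x32 channel whose pixel value encodes its own position
image = [[r * 32 + c for c in range(32)] for r in range(32)]

center_patch = crop(image, crop_offset(32, 28, rand_crop=False), 28)

rng = random.Random(0)  # seeded so the random crop is reproducible
random_offset = crop_offset(32, 28, rand_crop=True, rng=rng)
random_patch = crop(image, random_offset, 28)

print("center offset:", crop_offset(32, 28, rand_crop=False))
print("random offset:", random_offset)
```

Random crops (together with random mirroring) give the network a slightly different view of each training image every round, while the fixed center crop keeps evaluation deterministic.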