# Training notebook for MobileNet v2 1.0 and 0.5 on ImageNet dataset

## Overview
Use this notebook to train a MobileNet model from scratch. **Make sure to have the ImageNet dataset prepared** according to the guidelines in the dataset section in [MobileNet readme](README.md) before proceeding.

## Prerequisites
The following dependencies need to be installed before proceeding.
* mxnet - `pip install mxnet-cu90mkl` (tested with this GPU build; other builds can also be used)
* gluoncv - `pip install gluoncv`
* numpy - `pip install numpy`
* matplotlib - `pip install matplotlib`

To train the model with a Python script:
* Generate the script: in the Jupyter Notebook browser, go to File -> Download as -> Python (.py)
* Run the script: `python train_mobilenet.py`

### Import dependencies
Verify that all dependencies are installed using the cell below. Continue if no errors are encountered; warnings can be ignored.

In [1]:
import matplotlib
matplotlib.use('Agg')

import argparse, time, logging
import mxnet as mx
import numpy as np
from mxnet import gluon, nd
from mxnet import autograd as ag
from mxnet.gluon import nn
from mxnet.gluon.data.vision import transforms

from gluoncv.data import imagenet
from gluoncv.utils import makedirs, TrainingHistory

import os
from mxnet.context import cpu
from mxnet.gluon.block import HybridBlock
from mxnet.gluon.contrib.nn import HybridConcurrent
import multiprocessing

### Specify model, hyperparameters and save locations

The training was done on a p3.16xlarge EC2 instance on AWS, which has 8 NVIDIA Tesla V100 GPUs (16 GB each) and an Intel Xeon E5-2686 v4 @ 2.70GHz with 64 threads.

The `batch_size` set below is per device. With multiple GPUs, each GPU processes its own batch of size `batch_size` simultaneously.

The rest of the parameters can be tuned to fit the needs of a user. The values shown below were used to train the model in the model zoo.
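As a minimal sketch of the per-device batch size described above, the helper below (a hypothetical name, not part of the notebook) computes the total number of samples processed per step. It assumes the same fallback to a single CPU context that the notebook applies later when no GPU is detected.

```python
def effective_batch_size(per_device_batch_size, num_gpus):
    """Total samples per training step when every device gets its own batch."""
    # With no GPU available, training runs on a single CPU context.
    num_devices = max(1, num_gpus)
    return per_device_batch_size * num_devices


print(effective_batch_size(20, 0))  # CPU only -> 20
print(effective_batch_size(20, 8))  # 8 GPUs -> 160
```

The learning rate and other hyperparameters may need retuning when the effective batch size changes.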

In [2]:
# specify model - choose from (mobilenetv2_1.0, mobilenetv2_0.5)
model_name = 'mobilenetv2_1.0' 

# path to training and validation images to use
data_dir = '/home/data/tmp/'

# training batch size per device (CPU/GPU)
batch_size = 20

# number of GPUs to use (automatically detect the number of GPUs)
num_gpus = len(mx.test_utils.list_gpus())
print(num_gpus)
# number of pre-processing workers (automatically detect the number of workers)
#num_workers = multiprocessing.cpu_count()
print(multiprocessing.cpu_count())
num_workers = 1
# number of training epochs
# 480 epochs were used for all of the models; a small value is used here for demonstration
num_epochs = 5

# learning rate
lr = 0.045

# momentum value for optimizer
momentum = 0.9

# weight decay rate
wd = 0.00004

# decay rate of learning rate
lr_decay = 0.98

# interval for periodic learning rate decays
lr_decay_period = 1

# epochs at which learning rate decays (used only when lr_decay_period is 0)
lr_decay_epoch = '30,60,90'

# mode in which to train the model. options are symbolic, imperative, hybrid
mode = 'hybrid'

# Number of batches to wait before logging
log_interval = 1

# frequency of model saving
save_frequency = 10

# directory of saved models
save_dir = 'params'

# directory of training logs
logging_dir = 'logs'

# the path to save the history plot
save_plot_dir = '.'


0
8


### Model definition in Gluon

The class `MobileNetV2` contains model definitions of the MobileNet models and the required model is retrieved using the relevant constructor function: `mobilenet_v2_1_0` or `mobilenet_v2_0_5`. 

`RELU6`, `_add_conv`, `_add_conv_dw` and `LinearBottleneck` are helper functions and classes used in the model.
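To illustrate two ideas used by these helpers, here is a small framework-free sketch. `relu6` mirrors the `clip(x, 0, 6)` activation and `scaled_channels` mirrors the `int(x * multiplier)` width scaling; both function names are hypothetical and exist only for this illustration.

```python
def relu6(x):
    """ReLU6 activation: clamp a value to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def scaled_channels(base_channels, multiplier):
    """Scale each channel count by the width multiplier, truncating to int."""
    return [int(c * multiplier) for c in base_channels]


print([relu6(v) for v in (-1.0, 3.0, 7.5)])  # [0.0, 3.0, 6.0]
print(scaled_channels([16, 24, 32, 64, 96, 160, 320], 0.5))
# [8, 12, 16, 32, 48, 80, 160]
```

With a multiplier of 0.5, every bottleneck stage has half as many channels as the 1.0 model.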

In [3]:
# This block contains the definition of MobileNet v2

# Helpers
class RELU6(nn.HybridBlock):
    """Relu6 used in MobileNetV2."""

    def __init__(self, **kwargs):
        super(RELU6, self).__init__(**kwargs)

    def hybrid_forward(self, F, x):
        return F.clip(x, 0, 6, name="relu6")


def _add_conv(out, channels=1, kernel=1, stride=1, pad=0,
              num_group=1, active=True, relu6=False):
    out.add(nn.Conv2D(channels, kernel, stride, pad, groups=num_group, use_bias=False))
    out.add(nn.BatchNorm(scale=True))
    if active:
        out.add(RELU6() if relu6 else nn.Activation('relu'))


def _add_conv_dw(out, dw_channels, channels, stride, relu6=False):
    _add_conv(out, channels=dw_channels, kernel=3, stride=stride,
              pad=1, num_group=dw_channels, relu6=relu6)
    _add_conv(out, channels=channels, relu6=relu6)


class LinearBottleneck(nn.HybridBlock):
    r"""LinearBottleneck used in MobileNetV2 model from the
    `"Inverted Residuals and Linear Bottlenecks:
      Mobile Networks for Classification, Detection and Segmentation"
    <https://arxiv.org/abs/1801.04381>`_ paper.
    Parameters
    ----------
    in_channels : int
        Number of input channels.
    channels : int
        Number of output channels.
    t : int
        Layer expansion ratio.
    stride : int
        stride
    """

    def __init__(self, in_channels, channels, t, stride, **kwargs):
        super(LinearBottleneck, self).__init__(**kwargs)
        self.use_shortcut = stride == 1 and in_channels == channels
        with self.name_scope():
            self.out = nn.HybridSequential()

            _add_conv(self.out, in_channels * t, relu6=True)
            _add_conv(self.out, in_channels * t, kernel=3, stride=stride,
                      pad=1, num_group=in_channels * t, relu6=True)
            _add_conv(self.out, channels, active=False, relu6=True)

    def hybrid_forward(self, F, x):
        out = self.out(x)
        if self.use_shortcut:
            out = F.elemwise_add(out, x)
        return out


# Net
class MobileNetV2(nn.HybridBlock):
    r"""MobileNetV2 model from the
    `"Inverted Residuals and Linear Bottlenecks:
      Mobile Networks for Classification, Detection and Segmentation"
    <https://arxiv.org/abs/1801.04381>`_ paper.
    Parameters
    ----------
    multiplier : float, default 1.0
        The width multiplier for controlling the model size. The actual number of channels
        is equal to the original channel size multiplied by this multiplier.
    classes : int, default 1000
        Number of classes for the output layer.
    """

    def __init__(self, multiplier=1.0, classes=1000, **kwargs):
        super(MobileNetV2, self).__init__(**kwargs)
        with self.name_scope():
            self.features = nn.HybridSequential(prefix='features_')
            with self.features.name_scope():
                _add_conv(self.features, int(32 * multiplier), kernel=3,
                          stride=2, pad=1, relu6=True)

                in_channels_group = [int(x * multiplier) for x in [32] + [16] + [24] * 2
                                     + [32] * 3 + [64] * 4 + [96] * 3 + [160] * 3]
                channels_group = [int(x * multiplier) for x in [16] + [24] * 2 + [32] * 3
                                  + [64] * 4 + [96] * 3 + [160] * 3 + [320]]
                ts = [1] + [6] * 16
                strides = [1, 2] * 2 + [1, 1, 2] + [1] * 6 + [2] + [1] * 3

                for in_c, c, t, s in zip(in_channels_group, channels_group, ts, strides):
                    self.features.add(LinearBottleneck(in_channels=in_c, channels=c,
                                                       t=t, stride=s))

                last_channels = int(1280 * multiplier) if multiplier > 1.0 else 1280
                _add_conv(self.features, last_channels, relu6=True)

                self.features.add(nn.GlobalAvgPool2D())

            self.output = nn.HybridSequential(prefix='output_')
            with self.output.name_scope():
                self.output.add(
                    nn.Conv2D(classes, 1, use_bias=False, prefix='pred_'),
                    nn.Flatten()
                )

    def hybrid_forward(self, F, x):
        x = self.features(x)
        x = self.output(x)
        return x


# Constructor
def get_mobilenet_v2(multiplier, **kwargs):
    r"""MobileNetV2 model from the
    `"Inverted Residuals and Linear Bottlenecks:
      Mobile Networks for Classification, Detection and Segmentation"
    <https://arxiv.org/abs/1801.04381>`_ paper.
    Parameters
    ----------
    multiplier : float
        The width multiplier for controlling the model size. Only multipliers that are no
        less than 0.25 are supported. The actual number of channels is equal to the original
        channel size multiplied by this multiplier.
    """
    net = MobileNetV2(multiplier, **kwargs)
    return net

def mobilenet_v2_1_0(**kwargs):
    r"""MobileNetV2 model from the
    `"Inverted Residuals and Linear Bottlenecks:
      Mobile Networks for Classification, Detection and Segmentation"
    <https://arxiv.org/abs/1801.04381>`_ paper.
    """
    return get_mobilenet_v2(1.0, **kwargs)

def mobilenet_v2_0_5(**kwargs):
    r"""MobileNetV2 model from the
    `"Inverted Residuals and Linear Bottlenecks:
      Mobile Networks for Classification, Detection and Segmentation"
    <https://arxiv.org/abs/1801.04381>`_ paper.
    """
    return get_mobilenet_v2(0.5, **kwargs)

# Map model names to their constructor functions
models = {
    'mobilenetv2_1.0': mobilenet_v2_1_0,
    'mobilenetv2_0.5': mobilenet_v2_0_5,
}



### Helper code
Define the context, optimizer, and accuracy metrics, and retrieve the Gluon model.

In [4]:
# Specify logging function
logging.basicConfig(level=logging.INFO)

# Specify classes (1000 for ImageNet)
classes = 1000
# Extrapolate batches to all devices
batch_size *= max(1, num_gpus)
# Define context
context = [mx.gpu(i) for i in range(num_gpus)] if num_gpus > 0 else [mx.cpu()]

lr_decay_epoch = [int(i) for i in lr_decay_epoch.split(',')] + [np.inf]

kwargs = {'classes': classes}

# Define optimizer (nag = Nesterov Accelerated Gradient)
optimizer = 'nag'
optimizer_params = {'learning_rate': lr, 'wd': wd, 'momentum': momentum}

# Retrieve gluon model
net = models[model_name](**kwargs)

# Define accuracy measures - top1 error and top5 error
acc_top1 = mx.metric.Accuracy()
acc_top5 = mx.metric.TopKAccuracy(5)
train_history = TrainingHistory(['training-top1-err', 'training-top5-err',
                                 'validation-top1-err', 'validation-top5-err'])
makedirs(save_dir)

### Define preprocessing functions
`preprocess_train_data(normalize, jitter_param, lighting_param)` : Pre-process and augment training images -> resize, take random crops of size 224x224, apply random left-right flips, jitter image color and lighting, normalize image

`preprocess_test_data(normalize)` : Pre-process validation images -> resize to size 256x256, take center crop of size 224x224, normalize image
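The normalization step can be sketched without any framework. The example below assumes pixel values have already been scaled to roughly the range 0 to 1 (as `ToTensor` does) and applies the per-channel mean and standard deviation used in this notebook. `normalize_pixel` is a hypothetical helper for illustration only.

```python
# Per-channel statistics (RGB order), the same values passed to transforms.Normalize
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Subtract the channel mean and divide by the channel std for one pixel."""
    return tuple((value - m) / s for value, m, s in zip(rgb, mean, std))


# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel(MEAN))  # (0.0, 0.0, 0.0)
# A pure white pixel maps to a positive value in every channel.
print(normalize_pixel((1.0, 1.0, 1.0)))
```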

In [5]:
normalize = transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
jitter_param = 0.0
lighting_param = 0.0

# Input pre-processing for train data
def preprocess_train_data(normalize, jitter_param, lighting_param):
    transform_train = transforms.Compose([
        transforms.Resize(480),
        transforms.RandomResizedCrop(224),
        transforms.RandomFlipLeftRight(),
        transforms.RandomColorJitter(brightness=jitter_param, contrast=jitter_param,
                                     saturation=jitter_param),
        transforms.RandomLighting(lighting_param),
        transforms.ToTensor(),
        normalize
    ])
    return transform_train

# Input pre-processing for validation data
def preprocess_test_data(normalize):
    transform_test = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        normalize
    ])
    return transform_test

### Define test function
`test(ctx, val_data)` : Computes and returns validation errors on `val_data` using `ctx` context
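For reference, the top-1 and top-5 errors reported by the test function are one minus the corresponding top-k accuracy. A minimal pure-Python sketch of a top-k error, using a hypothetical `topk_error` helper and made-up scores for three samples, could look like this:

```python
def topk_error(scores, labels, k):
    """Fraction of samples whose true label is not among the k highest scores."""
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample
        topk = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        if label not in topk:
            misses += 1
    return misses / len(labels)


# Illustrative scores for 3 samples over 3 classes (not real model output)
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
labels = [1, 1, 2]
print(topk_error(scores, labels, 1))  # one of three samples is wrong at top-1
print(topk_error(scores, labels, 2))  # 0.0, every label is within the top 2
```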

In [6]:
# Test function
def test(ctx, val_data):
    # Reset accuracy metrics
    acc_top1.reset()
    acc_top5.reset()
    for i, batch in enumerate(val_data):
        # Load validation batch
        data = gluon.utils.split_and_load(batch[0], ctx_list=ctx, batch_axis=0)
        label = gluon.utils.split_and_load(batch[1], ctx_list=ctx, batch_axis=0)
        # Perform forward pass
        outputs = [net(X) for X in data]
        # Update accuracy metrics
        acc_top1.update(label, outputs)
        acc_top5.update(label, outputs)
    # Retrieve and return top1 and top5 errors
    _, top1 = acc_top1.get()
    _, top5 = acc_top5.get()
    return (1-top1, 1-top5)

### Define train function
`train(epochs, ctx)` : Train model for `epochs` epochs using `ctx` context, log training progress, compute and display validation errors after each epoch, take periodic snapshots of the model, and generate a training plot
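The periodic learning rate decay used during training can be sketched as a closed-form function. This is an illustration only: `lr_at_epoch` is a hypothetical helper, and it covers just the periodic branch (a positive `lr_decay_period`), not the epoch-list branch driven by `lr_decay_epoch`.

```python
def lr_at_epoch(base_lr, decay, period, epoch):
    """Learning rate in effect during `epoch` under periodic multiplicative decay."""
    if period <= 0:
        # Periodic decay disabled; this sketch does not model the epoch-list schedule.
        return base_lr
    # One decay is applied at every positive epoch that is a multiple of `period`.
    num_decays = epoch // period
    return base_lr * decay ** num_decays


# Values from the hyperparameter cell: lr=0.045, lr_decay=0.98, lr_decay_period=1
for epoch in range(3):
    print(epoch, lr_at_epoch(0.045, 0.98, 1, epoch))
```

With these settings the learning rate shrinks by 2% at the start of every epoch after the first.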

In [7]:
# Train function
def train(epochs, ctx):
    if isinstance(ctx, mx.Context):
        ctx = [ctx]
    # Initialize network - Use method in MSRA paper <https://arxiv.org/abs/1502.01852>
    net.initialize(mx.init.MSRAPrelu(), ctx=ctx)
    # Prepare train and validation batches
    transform_train = preprocess_train_data(normalize, jitter_param, lighting_param)
    transform_test = preprocess_test_data(normalize)
    train_data = gluon.data.DataLoader(
        imagenet.classification.ImageNet(data_dir, train=True).transform_first(transform_train),
        batch_size=batch_size, shuffle=True, last_batch='discard', num_workers=num_workers)
    val_data = gluon.data.DataLoader(
        imagenet.classification.ImageNet(data_dir, train=False).transform_first(transform_test),
        batch_size=batch_size, shuffle=False, num_workers=num_workers)
    # Define trainer
    trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)
    # Define loss
    L = gluon.loss.SoftmaxCrossEntropyLoss()

    lr_decay_count = 0

    best_val_score = 1
    # Main training loop - loop over epochs
    for epoch in range(epochs):
        print("start epoch %d" % epoch)
        tic = time.time()

        # Reset accuracy metrics
        acc_top1.reset()
        acc_top5.reset()
        btic = time.time()
        train_loss = 0
        num_batch = len(train_data)
        print("will do %d batches" % num_batch)
        # Check and perform learning rate decay
        if lr_decay_period and epoch and epoch % lr_decay_period == 0:
            trainer.set_learning_rate(trainer.learning_rate*lr_decay)
        elif lr_decay_period == 0 and epoch == lr_decay_epoch[lr_decay_count]:
            trainer.set_learning_rate(trainer.learning_rate*lr_decay)
            lr_decay_count += 1
            
        print("set learning rate")
        # Loop over batches in an epoch
        for i, batch in enumerate(train_data):
            print("batch: %d" % i)
            # Load train batch
            data = gluon.utils.split_and_load(batch[0], ctx_list=ctx, batch_axis=0)
            label = gluon.utils.split_and_load(batch[1], ctx_list=ctx, batch_axis=0)
            label_smooth = label
            print("forward pass")
            # Perform forward pass
            with ag.record():
                outputs = [net(X) for X in data]
                loss = [L(yhat, y) for yhat, y in zip(outputs, label_smooth)]
            # Perform backward pass
            print("backward pass")
            ag.backward(loss)
            # Perform updates
            print("update")
            trainer.step(batch_size)
            # Update accuracy metrics
            acc_top1.update(label, outputs)
            acc_top5.update(label, outputs)
            # Update loss
            print("update loss")
            train_loss += sum([l.sum().asscalar() for l in loss])
            # Log training progress (after each `log_interval` batches)
            if log_interval and not (i+1)%log_interval:
                _, top1 = acc_top1.get()
                _, top5 = acc_top5.get()
                err_top1, err_top5 = (1-top1, 1-top5)
                logging.info('Epoch[%d] Batch [%d]\tSpeed: %f samples/sec\ttop1-err=%f\ttop5-err=%f'%(
                             epoch, i, batch_size*log_interval/(time.time()-btic), err_top1, err_top5))
                btic = time.time()

        # Retrieve training errors and loss
        _, top1 = acc_top1.get()
        _, top5 = acc_top5.get()
        err_top1, err_top5 = (1-top1, 1-top5)
        train_loss /= num_batch * batch_size

        # Compute validation errors
        err_top1_val, err_top5_val = test(ctx, val_data)
        # Update training history
        train_history.update([err_top1, err_top5, err_top1_val, err_top5_val])
        # Update plot
        train_history.plot(['training-top1-err', 'validation-top1-err','training-top5-err', 'validation-top5-err'],
                           save_path='%s/%s_top_error.png'%(save_plot_dir, model_name))

        # Log training progress (after each epoch)
        logging.info('[Epoch %d] training: err-top1=%f err-top5=%f loss=%f'%(epoch, err_top1, err_top5, train_loss))
        logging.info('[Epoch %d] time cost: %f'%(epoch, time.time()-tic))
        logging.info('[Epoch %d] validation: err-top1=%f err-top5=%f'%(epoch, err_top1_val, err_top5_val))

        # Save a snapshot of the best model - use net.export to get MXNet symbols and params
        if err_top1_val < best_val_score and epoch > 50:
            best_val_score = err_top1_val
            net.export('%s/%.4f-imagenet-%s-best'%(save_dir, best_val_score, model_name), epoch)
        # Save a snapshot of the model after each 'save_frequency' epochs
        if save_frequency and save_dir and (epoch + 1) % save_frequency == 0:
            net.export('%s/%.4f-imagenet-%s'%(save_dir, best_val_score, model_name), epoch)
    # Save a snapshot of the model at the end of training
    if save_frequency and save_dir:
        net.export('%s/%.4f-imagenet-%s'%(save_dir, best_val_score, model_name), epochs-1)

### Train model
* Run the cell below to start training
* Logs are displayed in the cell output
* Output from an example run is shown here
* Once training completes, the symbols and params files are saved in the directory specified by `save_dir`
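As a rough guide to what to look for after training, the sketch below builds the file names that a snapshot is expected to produce. It assumes the usual naming convention of MXNet's `HybridBlock.export(path, epoch)`, which writes `path-symbol.json` and `path-NNNN.params`. `exported_file_names` is a hypothetical helper, and the argument values are examples only.

```python
def exported_file_names(save_dir, best_val_score, model_name, epoch):
    """Return the (symbol file, params file) names expected from a snapshot."""
    # Same prefix format string that the train function passes to net.export
    prefix = '%s/%.4f-imagenet-%s' % (save_dir, best_val_score, model_name)
    # Assumed HybridBlock.export convention: <prefix>-symbol.json, <prefix>-<epoch:04d>.params
    return '%s-symbol.json' % prefix, '%s-%04d.params' % (prefix, epoch)


symbol_file, params_file = exported_file_names('params', 1, 'mobilenetv2_1.0', 4)
print(symbol_file)
print(params_file)
```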

In [8]:
def main():
    print(batch_size)
    net.hybridize()
    train(num_epochs, context)


if __name__ == '__main__':
    main()

20
start epoch 0
will do 75 batches
set learning rate
batch: 0
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [0]	Speed: 4.460128 samples/sec	top1-err=1.000000	top5-err=1.000000


batch: 1
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [1]	Speed: 4.928642 samples/sec	top1-err=0.875000	top5-err=0.750000


batch: 2
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [2]	Speed: 4.877480 samples/sec	top1-err=0.866667	top5-err=0.666667


batch: 3
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [3]	Speed: 4.552999 samples/sec	top1-err=0.850000	top5-err=0.600000


batch: 4
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [4]	Speed: 4.623847 samples/sec	top1-err=0.830000	top5-err=0.570000


batch: 5
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [5]	Speed: 4.699587 samples/sec	top1-err=0.833333	top5-err=0.541667


batch: 6
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [6]	Speed: 4.873991 samples/sec	top1-err=0.835714	top5-err=0.507143


batch: 7
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [7]	Speed: 4.888185 samples/sec	top1-err=0.812500	top5-err=0.462500


batch: 8
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [8]	Speed: 4.910532 samples/sec	top1-err=0.816667	top5-err=0.444444


batch: 9
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [9]	Speed: 4.752672 samples/sec	top1-err=0.820000	top5-err=0.445000


batch: 10
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [10]	Speed: 4.709552 samples/sec	top1-err=0.822727	top5-err=0.440909


batch: 11
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [11]	Speed: 5.070765 samples/sec	top1-err=0.837500	top5-err=0.437500


batch: 12
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [12]	Speed: 5.161344 samples/sec	top1-err=0.834615	top5-err=0.430769


batch: 13
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [13]	Speed: 5.196967 samples/sec	top1-err=0.839286	top5-err=0.439286


batch: 14
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [14]	Speed: 5.129726 samples/sec	top1-err=0.843333	top5-err=0.420000


batch: 15
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [15]	Speed: 5.297176 samples/sec	top1-err=0.837500	top5-err=0.403125


batch: 16
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [16]	Speed: 5.173068 samples/sec	top1-err=0.844118	top5-err=0.411765


batch: 17
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [17]	Speed: 5.097279 samples/sec	top1-err=0.847222	top5-err=0.405556


batch: 18
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [18]	Speed: 4.909553 samples/sec	top1-err=0.847368	top5-err=0.407895


batch: 19
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [19]	Speed: 4.954603 samples/sec	top1-err=0.852500	top5-err=0.410000


batch: 20
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [20]	Speed: 5.016738 samples/sec	top1-err=0.854762	top5-err=0.404762


batch: 21
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [21]	Speed: 5.460812 samples/sec	top1-err=0.856818	top5-err=0.404545


batch: 22
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [22]	Speed: 5.151016 samples/sec	top1-err=0.863043	top5-err=0.413043


batch: 23
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [23]	Speed: 4.821672 samples/sec	top1-err=0.864583	top5-err=0.412500


batch: 24
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [24]	Speed: 5.123209 samples/sec	top1-err=0.864000	top5-err=0.412000


batch: 25
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [25]	Speed: 4.928271 samples/sec	top1-err=0.867308	top5-err=0.411538


batch: 26
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [26]	Speed: 4.640025 samples/sec	top1-err=0.864815	top5-err=0.407407


batch: 27
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [27]	Speed: 4.302321 samples/sec	top1-err=0.866071	top5-err=0.403571


batch: 28
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [28]	Speed: 4.532249 samples/sec	top1-err=0.868966	top5-err=0.401724


batch: 29
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [29]	Speed: 3.887085 samples/sec	top1-err=0.868333	top5-err=0.400000


batch: 30
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [30]	Speed: 4.523198 samples/sec	top1-err=0.870968	top5-err=0.401613


batch: 31
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [31]	Speed: 4.759204 samples/sec	top1-err=0.870313	top5-err=0.404687


batch: 32
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [32]	Speed: 4.641793 samples/sec	top1-err=0.868182	top5-err=0.406061


batch: 33
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [33]	Speed: 4.498050 samples/sec	top1-err=0.861765	top5-err=0.402941


batch: 34
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [34]	Speed: 4.148216 samples/sec	top1-err=0.860000	top5-err=0.397143


batch: 35
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [35]	Speed: 4.383781 samples/sec	top1-err=0.861111	top5-err=0.394444


batch: 36
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [36]	Speed: 4.152014 samples/sec	top1-err=0.862162	top5-err=0.391892


batch: 37
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [37]	Speed: 4.679066 samples/sec	top1-err=0.864474	top5-err=0.394737


batch: 38
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [38]	Speed: 4.727843 samples/sec	top1-err=0.865385	top5-err=0.397436


batch: 39
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [39]	Speed: 4.664229 samples/sec	top1-err=0.863750	top5-err=0.393750


batch: 40
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [40]	Speed: 1.849712 samples/sec	top1-err=0.862195	top5-err=0.389024


batch: 41
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [41]	Speed: 4.854110 samples/sec	top1-err=0.860714	top5-err=0.388095


batch: 42
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [42]	Speed: 4.711705 samples/sec	top1-err=0.861628	top5-err=0.386047


batch: 43
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [43]	Speed: 4.827230 samples/sec	top1-err=0.861364	top5-err=0.379545


batch: 44
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [44]	Speed: 4.815445 samples/sec	top1-err=0.860000	top5-err=0.376667


batch: 45
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [45]	Speed: 4.754408 samples/sec	top1-err=0.861957	top5-err=0.376087


batch: 46
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [46]	Speed: 4.813057 samples/sec	top1-err=0.862766	top5-err=0.372340


batch: 47
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [47]	Speed: 4.748944 samples/sec	top1-err=0.861458	top5-err=0.371875


batch: 48
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [48]	Speed: 4.666805 samples/sec	top1-err=0.864286	top5-err=0.373469


batch: 49
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [49]	Speed: 4.731457 samples/sec	top1-err=0.864000	top5-err=0.374000


batch: 50
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [50]	Speed: 4.795073 samples/sec	top1-err=0.863725	top5-err=0.371569


batch: 51
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [51]	Speed: 4.728837 samples/sec	top1-err=0.864423	top5-err=0.371154


batch: 52
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [52]	Speed: 4.689326 samples/sec	top1-err=0.865094	top5-err=0.373585


batch: 53
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [53]	Speed: 4.715695 samples/sec	top1-err=0.866667	top5-err=0.375000


batch: 54
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [54]	Speed: 4.670074 samples/sec	top1-err=0.868182	top5-err=0.376364


batch: 55
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [55]	Speed: 4.637163 samples/sec	top1-err=0.868750	top5-err=0.376786


batch: 56
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [56]	Speed: 1.997443 samples/sec	top1-err=0.869298	top5-err=0.375439


batch: 57
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [57]	Speed: 4.901266 samples/sec	top1-err=0.869828	top5-err=0.375000


batch: 58
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [58]	Speed: 4.861813 samples/sec	top1-err=0.868644	top5-err=0.372881


batch: 59
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [59]	Speed: 4.773902 samples/sec	top1-err=0.868333	top5-err=0.369167


batch: 60
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [60]	Speed: 4.636560 samples/sec	top1-err=0.864754	top5-err=0.366393


batch: 61
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [61]	Speed: 4.701768 samples/sec	top1-err=0.863710	top5-err=0.363710


batch: 62
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [62]	Speed: 4.844086 samples/sec	top1-err=0.861905	top5-err=0.362698


batch: 63
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [63]	Speed: 4.706201 samples/sec	top1-err=0.858594	top5-err=0.360156


batch: 64
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [64]	Speed: 4.665012 samples/sec	top1-err=0.858462	top5-err=0.361538


batch: 65
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [65]	Speed: 4.684389 samples/sec	top1-err=0.859091	top5-err=0.362121


batch: 66
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [66]	Speed: 4.640381 samples/sec	top1-err=0.858955	top5-err=0.361194


batch: 67
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [67]	Speed: 1.639739 samples/sec	top1-err=0.860294	top5-err=0.363971


batch: 68
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [68]	Speed: 4.294376 samples/sec	top1-err=0.859420	top5-err=0.363768


batch: 69
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [69]	Speed: 4.250891 samples/sec	top1-err=0.858571	top5-err=0.363571


batch: 70
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [70]	Speed: 4.521578 samples/sec	top1-err=0.858451	top5-err=0.364085


batch: 71
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [71]	Speed: 4.703198 samples/sec	top1-err=0.859028	top5-err=0.361111


batch: 72
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [72]	Speed: 4.496396 samples/sec	top1-err=0.858904	top5-err=0.360274


batch: 73
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [73]	Speed: 4.698703 samples/sec	top1-err=0.858784	top5-err=0.360135


batch: 74
forward pass
backward pass
update
update loss


INFO:root:Epoch[0] Batch [74]	Speed: 4.656575 samples/sec	top1-err=0.857333	top5-err=0.358667
INFO:root:[Epoch 0] training: err-top1=0.857333 err-top5=0.358667 loss=3.261408
INFO:root:[Epoch 0] time cost: 367.147962
INFO:root:[Epoch 0] validation: err-top1=0.802500 err-top5=0.305000


start epoch 1
will do 75 batches
set learning rate
batch: 0
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [0]	Speed: 2.477629 samples/sec	top1-err=0.900000	top5-err=0.200000


batch: 1
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [1]	Speed: 2.201568 samples/sec	top1-err=0.900000	top5-err=0.250000


batch: 2
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [2]	Speed: 4.798941 samples/sec	top1-err=0.883333	top5-err=0.216667


batch: 3
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [3]	Speed: 4.806423 samples/sec	top1-err=0.887500	top5-err=0.212500


batch: 4
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [4]	Speed: 4.804354 samples/sec	top1-err=0.840000	top5-err=0.220000


batch: 5
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [5]	Speed: 4.747451 samples/sec	top1-err=0.808333	top5-err=0.208333


batch: 6
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [6]	Speed: 4.747682 samples/sec	top1-err=0.792857	top5-err=0.221429


batch: 7
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [7]	Speed: 4.696922 samples/sec	top1-err=0.775000	top5-err=0.218750


batch: 8
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [8]	Speed: 4.734673 samples/sec	top1-err=0.788889	top5-err=0.227778


batch: 9
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [9]	Speed: 4.696995 samples/sec	top1-err=0.775000	top5-err=0.225000


batch: 10
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [10]	Speed: 4.757820 samples/sec	top1-err=0.777273	top5-err=0.236364


batch: 11
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [11]	Speed: 4.757907 samples/sec	top1-err=0.791667	top5-err=0.241667


batch: 12
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [12]	Speed: 4.681787 samples/sec	top1-err=0.788462	top5-err=0.246154


batch: 13
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [13]	Speed: 4.725784 samples/sec	top1-err=0.800000	top5-err=0.260714


batch: 14
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [14]	Speed: 2.276019 samples/sec	top1-err=0.803333	top5-err=0.266667


batch: 15
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [15]	Speed: 3.202070 samples/sec	top1-err=0.803125	top5-err=0.265625


batch: 16
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [16]	Speed: 4.836866 samples/sec	top1-err=0.805882	top5-err=0.258824


batch: 17
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [17]	Speed: 4.745300 samples/sec	top1-err=0.808333	top5-err=0.275000


batch: 18
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [18]	Speed: 4.807394 samples/sec	top1-err=0.807895	top5-err=0.278947


batch: 19
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [19]	Speed: 4.748872 samples/sec	top1-err=0.807500	top5-err=0.277500


batch: 20
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [20]	Speed: 4.722247 samples/sec	top1-err=0.804762	top5-err=0.280952


batch: 21
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [21]	Speed: 4.599003 samples/sec	top1-err=0.802273	top5-err=0.286364


batch: 22
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [22]	Speed: 4.649598 samples/sec	top1-err=0.804348	top5-err=0.284783


batch: 23
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [23]	Speed: 4.740413 samples/sec	top1-err=0.800000	top5-err=0.279167


batch: 24
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [24]	Speed: 4.742021 samples/sec	top1-err=0.798000	top5-err=0.284000


batch: 25
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [25]	Speed: 4.570882 samples/sec	top1-err=0.800000	top5-err=0.290385


batch: 26
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [26]	Speed: 4.662321 samples/sec	top1-err=0.801852	top5-err=0.290741


batch: 27
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [27]	Speed: 4.687190 samples/sec	top1-err=0.798214	top5-err=0.287500


batch: 28
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [28]	Speed: 4.710533 samples/sec	top1-err=0.800000	top5-err=0.284483


batch: 29
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [29]	Speed: 4.531302 samples/sec	top1-err=0.800000	top5-err=0.288333


batch: 30
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [30]	Speed: 4.594930 samples/sec	top1-err=0.804839	top5-err=0.293548


batch: 31
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [31]	Speed: 4.742113 samples/sec	top1-err=0.806250	top5-err=0.300000


batch: 32
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [32]	Speed: 4.667934 samples/sec	top1-err=0.806061	top5-err=0.300000


batch: 33
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [33]	Speed: 4.645669 samples/sec	top1-err=0.802941	top5-err=0.295588


batch: 34
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [34]	Speed: 4.605988 samples/sec	top1-err=0.807143	top5-err=0.294286


batch: 35
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [35]	Speed: 4.672992 samples/sec	top1-err=0.801389	top5-err=0.293056


batch: 36
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [36]	Speed: 4.697120 samples/sec	top1-err=0.801351	top5-err=0.289189


batch: 37
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [37]	Speed: 4.614277 samples/sec	top1-err=0.803947	top5-err=0.294737


batch: 38
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [38]	Speed: 4.657094 samples/sec	top1-err=0.802564	top5-err=0.297436


batch: 39
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [39]	Speed: 4.554284 samples/sec	top1-err=0.807500	top5-err=0.298750


batch: 40
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [40]	Speed: 4.084380 samples/sec	top1-err=0.809756	top5-err=0.297561


batch: 41
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [41]	Speed: 4.341045 samples/sec	top1-err=0.809524	top5-err=0.292857


batch: 42
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [42]	Speed: 4.375081 samples/sec	top1-err=0.811628	top5-err=0.293023


batch: 43
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [43]	Speed: 4.529942 samples/sec	top1-err=0.811364	top5-err=0.290909


batch: 44
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [44]	Speed: 4.645588 samples/sec	top1-err=0.808889	top5-err=0.291111


batch: 45
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [45]	Speed: 4.611327 samples/sec	top1-err=0.809783	top5-err=0.289130


batch: 46
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [46]	Speed: 4.650873 samples/sec	top1-err=0.807447	top5-err=0.287234


batch: 47
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [47]	Speed: 4.614425 samples/sec	top1-err=0.807292	top5-err=0.287500


batch: 48
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [48]	Speed: 4.234605 samples/sec	top1-err=0.807143	top5-err=0.287755


batch: 49
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [49]	Speed: 4.382695 samples/sec	top1-err=0.807000	top5-err=0.286000


batch: 50
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [50]	Speed: 4.535144 samples/sec	top1-err=0.805882	top5-err=0.285294


batch: 51
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [51]	Speed: 4.584195 samples/sec	top1-err=0.802885	top5-err=0.282692


batch: 52
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [52]	Speed: 4.641571 samples/sec	top1-err=0.802830	top5-err=0.281132


batch: 53
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [53]	Speed: 4.678779 samples/sec	top1-err=0.801852	top5-err=0.284259


batch: 54
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [54]	Speed: 4.654207 samples/sec	top1-err=0.802727	top5-err=0.287273


batch: 55
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [55]	Speed: 4.601455 samples/sec	top1-err=0.803571	top5-err=0.287500


batch: 56
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [56]	Speed: 4.602001 samples/sec	top1-err=0.802632	top5-err=0.288596


batch: 57
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [57]	Speed: 4.512359 samples/sec	top1-err=0.800862	top5-err=0.287069


batch: 58
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [58]	Speed: 4.490206 samples/sec	top1-err=0.802542	top5-err=0.284746


batch: 59
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [59]	Speed: 4.622460 samples/sec	top1-err=0.800833	top5-err=0.284167


batch: 60
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [60]	Speed: 4.636431 samples/sec	top1-err=0.800820	top5-err=0.286066


batch: 61
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [61]	Speed: 4.582683 samples/sec	top1-err=0.798387	top5-err=0.284677


batch: 62
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [62]	Speed: 4.697484 samples/sec	top1-err=0.799206	top5-err=0.284127


batch: 63
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [63]	Speed: 4.635257 samples/sec	top1-err=0.800000	top5-err=0.283594


batch: 64
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [64]	Speed: 4.630833 samples/sec	top1-err=0.800000	top5-err=0.282308


batch: 65
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [65]	Speed: 4.677193 samples/sec	top1-err=0.800000	top5-err=0.279545


batch: 66
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [66]	Speed: 4.713476 samples/sec	top1-err=0.800000	top5-err=0.279104


batch: 67
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [67]	Speed: 4.604748 samples/sec	top1-err=0.800735	top5-err=0.277206


batch: 68
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [68]	Speed: 4.640973 samples/sec	top1-err=0.801449	top5-err=0.277536


batch: 69
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [69]	Speed: 4.639073 samples/sec	top1-err=0.799286	top5-err=0.277143


batch: 70
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [70]	Speed: 4.578915 samples/sec	top1-err=0.800704	top5-err=0.276056


batch: 71
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [71]	Speed: 4.563927 samples/sec	top1-err=0.800694	top5-err=0.275694


batch: 72
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [72]	Speed: 1.809957 samples/sec	top1-err=0.801370	top5-err=0.277397


batch: 73
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [73]	Speed: 4.861548 samples/sec	top1-err=0.802703	top5-err=0.275000


batch: 74
forward pass
backward pass
update
update loss


INFO:root:Epoch[1] Batch [74]	Speed: 4.795195 samples/sec	top1-err=0.800667	top5-err=0.274667
INFO:root:[Epoch 1] training: err-top1=0.800667 err-top5=0.274667 loss=2.156865
INFO:root:[Epoch 1] time cost: 380.509069
INFO:root:[Epoch 1] validation: err-top1=0.827500 err-top5=0.280000


start epoch 2
will do 75 batches
set learning rate
batch: 0
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [0]	Speed: 4.572048 samples/sec	top1-err=0.900000	top5-err=0.250000


batch: 1
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [1]	Speed: 4.819624 samples/sec	top1-err=0.725000	top5-err=0.200000


batch: 2
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [2]	Speed: 4.690619 samples/sec	top1-err=0.750000	top5-err=0.200000


batch: 3
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [3]	Speed: 4.591588 samples/sec	top1-err=0.737500	top5-err=0.187500


batch: 4
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [4]	Speed: 4.688338 samples/sec	top1-err=0.710000	top5-err=0.180000


batch: 5
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [5]	Speed: 4.594602 samples/sec	top1-err=0.741667	top5-err=0.175000


batch: 6
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [6]	Speed: 4.648421 samples/sec	top1-err=0.742857	top5-err=0.192857


batch: 7
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [7]	Speed: 4.680872 samples/sec	top1-err=0.762500	top5-err=0.175000


batch: 8
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [8]	Speed: 4.656199 samples/sec	top1-err=0.766667	top5-err=0.188889


batch: 9
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [9]	Speed: 4.659335 samples/sec	top1-err=0.755000	top5-err=0.185000


batch: 10
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [10]	Speed: 4.606632 samples/sec	top1-err=0.754545	top5-err=0.190909


batch: 11
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [11]	Speed: 2.139358 samples/sec	top1-err=0.745833	top5-err=0.191667


batch: 12
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [12]	Speed: 3.555811 samples/sec	top1-err=0.742308	top5-err=0.180769


batch: 13
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [13]	Speed: 4.719001 samples/sec	top1-err=0.750000	top5-err=0.185714


batch: 14
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [14]	Speed: 4.781931 samples/sec	top1-err=0.743333	top5-err=0.180000


batch: 15
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [15]	Speed: 4.751176 samples/sec	top1-err=0.743750	top5-err=0.190625


batch: 16
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [16]	Speed: 4.809093 samples/sec	top1-err=0.732353	top5-err=0.185294


batch: 17
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [17]	Speed: 4.649846 samples/sec	top1-err=0.727778	top5-err=0.183333


batch: 18
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [18]	Speed: 4.569500 samples/sec	top1-err=0.723684	top5-err=0.189474


batch: 19
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [19]	Speed: 4.686058 samples/sec	top1-err=0.725000	top5-err=0.187500


batch: 20
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [20]	Speed: 4.647348 samples/sec	top1-err=0.726190	top5-err=0.197619


batch: 21
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [21]	Speed: 4.506990 samples/sec	top1-err=0.718182	top5-err=0.190909


batch: 22
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [22]	Speed: 4.740102 samples/sec	top1-err=0.721739	top5-err=0.193478


batch: 23
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [23]	Speed: 4.622974 samples/sec	top1-err=0.720833	top5-err=0.191667


batch: 24
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [24]	Speed: 4.291412 samples/sec	top1-err=0.726000	top5-err=0.200000


batch: 25
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [25]	Speed: 1.666579 samples/sec	top1-err=0.726923	top5-err=0.209615


batch: 26
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [26]	Speed: 4.795757 samples/sec	top1-err=0.729630	top5-err=0.207407


batch: 27
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [27]	Speed: 4.747933 samples/sec	top1-err=0.725000	top5-err=0.205357


batch: 28
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [28]	Speed: 4.663081 samples/sec	top1-err=0.724138	top5-err=0.208621


batch: 29
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [29]	Speed: 4.727167 samples/sec	top1-err=0.725000	top5-err=0.206667


batch: 30
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [30]	Speed: 4.619705 samples/sec	top1-err=0.727419	top5-err=0.209677


batch: 31
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [31]	Speed: 4.704492 samples/sec	top1-err=0.728125	top5-err=0.214063


batch: 32
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [32]	Speed: 4.661497 samples/sec	top1-err=0.730303	top5-err=0.216667


batch: 33
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [33]	Speed: 4.661740 samples/sec	top1-err=0.732353	top5-err=0.217647


batch: 34
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [34]	Speed: 4.617976 samples/sec	top1-err=0.735714	top5-err=0.214286


batch: 35
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [35]	Speed: 4.683984 samples/sec	top1-err=0.733333	top5-err=0.213889


batch: 36
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [36]	Speed: 4.544389 samples/sec	top1-err=0.739189	top5-err=0.213514


batch: 37
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [37]	Speed: 4.626835 samples/sec	top1-err=0.739474	top5-err=0.217105


batch: 38
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [38]	Speed: 4.609221 samples/sec	top1-err=0.738462	top5-err=0.220513


batch: 39
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [39]	Speed: 4.646119 samples/sec	top1-err=0.736250	top5-err=0.216250


batch: 40
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [40]	Speed: 4.645163 samples/sec	top1-err=0.734146	top5-err=0.213415


batch: 41
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [41]	Speed: 4.559424 samples/sec	top1-err=0.733333	top5-err=0.213095


batch: 42
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [42]	Speed: 4.566058 samples/sec	top1-err=0.733721	top5-err=0.216279


batch: 43
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [43]	Speed: 4.535791 samples/sec	top1-err=0.732955	top5-err=0.217045


batch: 44
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [44]	Speed: 4.509216 samples/sec	top1-err=0.733333	top5-err=0.215556


batch: 45
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [45]	Speed: 4.570182 samples/sec	top1-err=0.735870	top5-err=0.218478


batch: 46
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [46]	Speed: 4.565422 samples/sec	top1-err=0.738298	top5-err=0.219149


batch: 47
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [47]	Speed: 1.829832 samples/sec	top1-err=0.735417	top5-err=0.215625


batch: 48
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [48]	Speed: 4.826516 samples/sec	top1-err=0.735714	top5-err=0.217347


batch: 49
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [49]	Speed: 4.813592 samples/sec	top1-err=0.736000	top5-err=0.219000


batch: 50
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [50]	Speed: 4.796746 samples/sec	top1-err=0.737255	top5-err=0.216667


batch: 51
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [51]	Speed: 4.711674 samples/sec	top1-err=0.737500	top5-err=0.220192


batch: 52
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [52]	Speed: 4.711705 samples/sec	top1-err=0.737736	top5-err=0.222642


batch: 53
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [53]	Speed: 4.769416 samples/sec	top1-err=0.739815	top5-err=0.222222


batch: 54
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [54]	Speed: 4.581916 samples/sec	top1-err=0.740909	top5-err=0.222727


batch: 55
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [55]	Speed: 4.691373 samples/sec	top1-err=0.737500	top5-err=0.223214


batch: 56
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [56]	Speed: 4.636816 samples/sec	top1-err=0.736842	top5-err=0.222807


batch: 57
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [57]	Speed: 4.609241 samples/sec	top1-err=0.740517	top5-err=0.223276


batch: 58
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [58]	Speed: 4.650823 samples/sec	top1-err=0.740678	top5-err=0.220339


batch: 59
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [59]	Speed: 4.642623 samples/sec	top1-err=0.740833	top5-err=0.218333


batch: 60
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [60]	Speed: 4.594605 samples/sec	top1-err=0.741803	top5-err=0.219672


batch: 61
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [61]	Speed: 1.834792 samples/sec	top1-err=0.742742	top5-err=0.220161


batch: 62
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [62]	Speed: 4.826704 samples/sec	top1-err=0.742857	top5-err=0.218254


batch: 63
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [63]	Speed: 4.782124 samples/sec	top1-err=0.742969	top5-err=0.221094


batch: 64
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [64]	Speed: 4.820625 samples/sec	top1-err=0.743077	top5-err=0.223846


batch: 65
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [65]	Speed: 4.736312 samples/sec	top1-err=0.743182	top5-err=0.225000


batch: 66
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [66]	Speed: 4.722507 samples/sec	top1-err=0.741791	top5-err=0.224627


batch: 67
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [67]	Speed: 4.688019 samples/sec	top1-err=0.740441	top5-err=0.223529


batch: 68
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [68]	Speed: 4.663141 samples/sec	top1-err=0.740580	top5-err=0.223188


batch: 69
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [69]	Speed: 4.672296 samples/sec	top1-err=0.740000	top5-err=0.224286


batch: 70
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [70]	Speed: 4.662065 samples/sec	top1-err=0.741549	top5-err=0.225352


batch: 71
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [71]	Speed: 4.641049 samples/sec	top1-err=0.740278	top5-err=0.225694


batch: 72
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [72]	Speed: 4.672648 samples/sec	top1-err=0.740411	top5-err=0.223288


batch: 73
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [73]	Speed: 4.742018 samples/sec	top1-err=0.739865	top5-err=0.222297


batch: 74
forward pass
backward pass
update
update loss


INFO:root:Epoch[2] Batch [74]	Speed: 4.643784 samples/sec	top1-err=0.736000	top5-err=0.220667
INFO:root:[Epoch 2] training: err-top1=0.736000 err-top5=0.220667 loss=1.980883
INFO:root:[Epoch 2] time cost: 381.845947
INFO:root:[Epoch 2] validation: err-top1=0.702500 err-top5=0.202500


start epoch 3
will do 75 batches
set learning rate
batch: 0
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [0]	Speed: 4.330092 samples/sec	top1-err=0.650000	top5-err=0.150000


batch: 1
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [1]	Speed: 4.589278 samples/sec	top1-err=0.725000	top5-err=0.175000


batch: 2
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [2]	Speed: 4.556436 samples/sec	top1-err=0.666667	top5-err=0.150000


batch: 3
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [3]	Speed: 4.528703 samples/sec	top1-err=0.637500	top5-err=0.137500


batch: 4
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [4]	Speed: 4.533025 samples/sec	top1-err=0.640000	top5-err=0.140000


batch: 5
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [5]	Speed: 4.612824 samples/sec	top1-err=0.650000	top5-err=0.158333


batch: 6
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [6]	Speed: 4.591946 samples/sec	top1-err=0.678571	top5-err=0.164286


batch: 7
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [7]	Speed: 4.584495 samples/sec	top1-err=0.687500	top5-err=0.156250


batch: 8
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [8]	Speed: 4.463505 samples/sec	top1-err=0.694444	top5-err=0.161111


batch: 9
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [9]	Speed: 4.490135 samples/sec	top1-err=0.680000	top5-err=0.155000


batch: 10
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [10]	Speed: 4.573803 samples/sec	top1-err=0.677273	top5-err=0.168182


batch: 11
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [11]	Speed: 4.618277 samples/sec	top1-err=0.670833	top5-err=0.179167


batch: 12
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [12]	Speed: 4.621008 samples/sec	top1-err=0.684615	top5-err=0.169231


batch: 13
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [13]	Speed: 4.593637 samples/sec	top1-err=0.685714	top5-err=0.171429


batch: 14
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [14]	Speed: 4.600984 samples/sec	top1-err=0.690000	top5-err=0.176667


batch: 15
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [15]	Speed: 4.569204 samples/sec	top1-err=0.700000	top5-err=0.175000


batch: 16
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [16]	Speed: 4.540368 samples/sec	top1-err=0.697059	top5-err=0.173529


batch: 17
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [17]	Speed: 4.666930 samples/sec	top1-err=0.691667	top5-err=0.172222


batch: 18
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [18]	Speed: 4.017484 samples/sec	top1-err=0.689474	top5-err=0.173684


batch: 19
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [19]	Speed: 4.516167 samples/sec	top1-err=0.685000	top5-err=0.170000


batch: 20
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [20]	Speed: 3.757896 samples/sec	top1-err=0.690476	top5-err=0.171429


batch: 21
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [21]	Speed: 4.458889 samples/sec	top1-err=0.688636	top5-err=0.175000


batch: 22
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [22]	Speed: 4.693435 samples/sec	top1-err=0.689130	top5-err=0.171739


batch: 23
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [23]	Speed: 4.448100 samples/sec	top1-err=0.683333	top5-err=0.175000


batch: 24
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [24]	Speed: 4.657722 samples/sec	top1-err=0.682000	top5-err=0.180000


batch: 25
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [25]	Speed: 4.623914 samples/sec	top1-err=0.680769	top5-err=0.175000


batch: 26
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [26]	Speed: 4.729628 samples/sec	top1-err=0.688889	top5-err=0.181481


batch: 27
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [27]	Speed: 4.735547 samples/sec	top1-err=0.694643	top5-err=0.189286


batch: 28
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [28]	Speed: 4.700725 samples/sec	top1-err=0.696552	top5-err=0.186207


batch: 29
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [29]	Speed: 4.702874 samples/sec	top1-err=0.698333	top5-err=0.186667


batch: 30
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [30]	Speed: 4.749077 samples/sec	top1-err=0.698387	top5-err=0.187097


batch: 31
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [31]	Speed: 3.990874 samples/sec	top1-err=0.700000	top5-err=0.185937


batch: 32
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [32]	Speed: 4.381624 samples/sec	top1-err=0.701515	top5-err=0.184848


batch: 33
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [33]	Speed: 4.492084 samples/sec	top1-err=0.701471	top5-err=0.186765


batch: 34
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [34]	Speed: 4.111032 samples/sec	top1-err=0.701429	top5-err=0.182857


batch: 35
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [35]	Speed: 1.521707 samples/sec	top1-err=0.704167	top5-err=0.181944


batch: 36
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [36]	Speed: 4.398685 samples/sec	top1-err=0.698649	top5-err=0.178378


batch: 37
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [37]	Speed: 4.284700 samples/sec	top1-err=0.700000	top5-err=0.178947


batch: 38
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [38]	Speed: 4.271244 samples/sec	top1-err=0.697436	top5-err=0.176923


batch: 39
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [39]	Speed: 4.281676 samples/sec	top1-err=0.701250	top5-err=0.180000


batch: 40
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [40]	Speed: 4.086583 samples/sec	top1-err=0.697561	top5-err=0.181707


batch: 41
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [41]	Speed: 4.100622 samples/sec	top1-err=0.696429	top5-err=0.180952


batch: 42
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [42]	Speed: 4.207013 samples/sec	top1-err=0.696512	top5-err=0.182558


batch: 43
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [43]	Speed: 4.175592 samples/sec	top1-err=0.696591	top5-err=0.184091


batch: 44
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [44]	Speed: 4.034992 samples/sec	top1-err=0.694444	top5-err=0.185556


batch: 45
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [45]	Speed: 1.604732 samples/sec	top1-err=0.694565	top5-err=0.186957


batch: 46
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [46]	Speed: 4.424114 samples/sec	top1-err=0.694681	top5-err=0.187234


batch: 47
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [47]	Speed: 4.334506 samples/sec	top1-err=0.695833	top5-err=0.185417


batch: 48
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [48]	Speed: 4.251991 samples/sec	top1-err=0.693878	top5-err=0.188776


batch: 49
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [49]	Speed: 4.209144 samples/sec	top1-err=0.694000	top5-err=0.191000


batch: 50
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [50]	Speed: 4.038691 samples/sec	top1-err=0.696078	top5-err=0.191176


batch: 51
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [51]	Speed: 4.020501 samples/sec	top1-err=0.700962	top5-err=0.191346


batch: 52
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [52]	Speed: 4.021861 samples/sec	top1-err=0.700943	top5-err=0.188679


batch: 53
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [53]	Speed: 1.827554 samples/sec	top1-err=0.700000	top5-err=0.187037


batch: 54
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [54]	Speed: 3.987447 samples/sec	top1-err=0.701818	top5-err=0.184545


batch: 55
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [55]	Speed: 4.345667 samples/sec	top1-err=0.700893	top5-err=0.182143


batch: 56
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [56]	Speed: 4.363183 samples/sec	top1-err=0.701754	top5-err=0.182456


batch: 57
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [57]	Speed: 4.245394 samples/sec	top1-err=0.701724	top5-err=0.181897


batch: 58
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [58]	Speed: 4.164419 samples/sec	top1-err=0.701695	top5-err=0.183051


batch: 59
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [59]	Speed: 4.208972 samples/sec	top1-err=0.702500	top5-err=0.183333


batch: 60
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [60]	Speed: 4.055039 samples/sec	top1-err=0.701639	top5-err=0.181148


batch: 61
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [61]	Speed: 1.620104 samples/sec	top1-err=0.704032	top5-err=0.183871


batch: 62
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [62]	Speed: 4.385361 samples/sec	top1-err=0.700794	top5-err=0.182540


batch: 63
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [63]	Speed: 4.270676 samples/sec	top1-err=0.699219	top5-err=0.182813


batch: 64
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [64]	Speed: 4.233292 samples/sec	top1-err=0.700000	top5-err=0.183077


batch: 65
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [65]	Speed: 1.762043 samples/sec	top1-err=0.700000	top5-err=0.181818


batch: 66
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [66]	Speed: 4.413373 samples/sec	top1-err=0.700746	top5-err=0.183582


batch: 67
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [67]	Speed: 4.332864 samples/sec	top1-err=0.700735	top5-err=0.184559


batch: 68
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [68]	Speed: 4.307561 samples/sec	top1-err=0.700000	top5-err=0.181884


batch: 69
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [69]	Speed: 4.263416 samples/sec	top1-err=0.699286	top5-err=0.182857


batch: 70
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [70]	Speed: 4.252523 samples/sec	top1-err=0.698592	top5-err=0.181690


batch: 71
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [71]	Speed: 4.263423 samples/sec	top1-err=0.697222	top5-err=0.181944


batch: 72
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [72]	Speed: 4.009962 samples/sec	top1-err=0.695890	top5-err=0.181507


batch: 73
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [73]	Speed: 3.546182 samples/sec	top1-err=0.695270	top5-err=0.181757


batch: 74
forward pass
backward pass
update
update loss


INFO:root:Epoch[3] Batch [74]	Speed: 3.971308 samples/sec	top1-err=0.696000	top5-err=0.182667
INFO:root:[Epoch 3] training: err-top1=0.696000 err-top5=0.182667 loss=1.887716
INFO:root:[Epoch 3] time cost: 422.775354
INFO:root:[Epoch 3] validation: err-top1=0.720000 err-top5=0.220000


start epoch 4
will do 75 batches
set learning rate
batch: 0
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [0]	Speed: 4.449197 samples/sec	top1-err=0.700000	top5-err=0.100000


batch: 1
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [1]	Speed: 4.729810 samples/sec	top1-err=0.725000	top5-err=0.250000


batch: 2
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [2]	Speed: 4.665674 samples/sec	top1-err=0.750000	top5-err=0.316667


batch: 3
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [3]	Speed: 3.678360 samples/sec	top1-err=0.675000	top5-err=0.250000


batch: 4
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [4]	Speed: 2.111868 samples/sec	top1-err=0.650000	top5-err=0.230000


batch: 5
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [5]	Speed: 4.835361 samples/sec	top1-err=0.650000	top5-err=0.216667


batch: 6
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [6]	Speed: 4.556224 samples/sec	top1-err=0.671429	top5-err=0.207143


batch: 7
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [7]	Speed: 4.316799 samples/sec	top1-err=0.681250	top5-err=0.200000


batch: 8
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [8]	Speed: 4.195578 samples/sec	top1-err=0.677778	top5-err=0.200000


batch: 9
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [9]	Speed: 4.266352 samples/sec	top1-err=0.695000	top5-err=0.200000


batch: 10
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [10]	Speed: 4.313153 samples/sec	top1-err=0.713636	top5-err=0.204545


batch: 11
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [11]	Speed: 4.105917 samples/sec	top1-err=0.716667	top5-err=0.204167


batch: 12
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [12]	Speed: 4.270042 samples/sec	top1-err=0.707692	top5-err=0.200000


batch: 13
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [13]	Speed: 4.314033 samples/sec	top1-err=0.703571	top5-err=0.192857


batch: 14
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [14]	Speed: 4.128200 samples/sec	top1-err=0.700000	top5-err=0.186667


batch: 15
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [15]	Speed: 1.747891 samples/sec	top1-err=0.690625	top5-err=0.184375


batch: 16
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [16]	Speed: 4.388671 samples/sec	top1-err=0.697059	top5-err=0.185294


batch: 17
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [17]	Speed: 4.368626 samples/sec	top1-err=0.697222	top5-err=0.186111


batch: 18
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [18]	Speed: 4.394510 samples/sec	top1-err=0.705263	top5-err=0.184211


batch: 19
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [19]	Speed: 4.355754 samples/sec	top1-err=0.707500	top5-err=0.182500


batch: 20
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [20]	Speed: 1.816000 samples/sec	top1-err=0.709524	top5-err=0.185714


batch: 21
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [21]	Speed: 4.445172 samples/sec	top1-err=0.709091	top5-err=0.184091


batch: 22
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [22]	Speed: 4.009437 samples/sec	top1-err=0.697826	top5-err=0.184783


batch: 23
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [23]	Speed: 4.252363 samples/sec	top1-err=0.693750	top5-err=0.179167


batch: 24
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [24]	Speed: 3.878247 samples/sec	top1-err=0.702000	top5-err=0.188000


batch: 25
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [25]	Speed: 4.117151 samples/sec	top1-err=0.700000	top5-err=0.186538


batch: 26
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [26]	Speed: 4.168530 samples/sec	top1-err=0.703704	top5-err=0.192593


batch: 27
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [27]	Speed: 4.097277 samples/sec	top1-err=0.703571	top5-err=0.191071


batch: 28
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [28]	Speed: 4.167514 samples/sec	top1-err=0.700000	top5-err=0.191379


batch: 29
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [29]	Speed: 3.939954 samples/sec	top1-err=0.696667	top5-err=0.188333


batch: 30
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [30]	Speed: 4.467258 samples/sec	top1-err=0.696774	top5-err=0.188710


batch: 31
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [31]	Speed: 4.436075 samples/sec	top1-err=0.700000	top5-err=0.189063


batch: 32
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [32]	Speed: 4.502013 samples/sec	top1-err=0.696970	top5-err=0.187879


batch: 33
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [33]	Speed: 4.583715 samples/sec	top1-err=0.698529	top5-err=0.183824


batch: 34
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [34]	Speed: 4.674363 samples/sec	top1-err=0.695714	top5-err=0.180000


batch: 35
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [35]	Speed: 1.883164 samples/sec	top1-err=0.701389	top5-err=0.184722


batch: 36
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [36]	Speed: 4.613296 samples/sec	top1-err=0.700000	top5-err=0.182432


batch: 37
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [37]	Speed: 1.848775 samples/sec	top1-err=0.700000	top5-err=0.184211


batch: 38
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [38]	Speed: 4.793783 samples/sec	top1-err=0.700000	top5-err=0.184615


batch: 39
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [39]	Speed: 4.918956 samples/sec	top1-err=0.697500	top5-err=0.182500


batch: 40
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [40]	Speed: 4.921541 samples/sec	top1-err=0.696341	top5-err=0.181707


batch: 41
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [41]	Speed: 4.886975 samples/sec	top1-err=0.694048	top5-err=0.180952


batch: 42
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [42]	Speed: 4.879881 samples/sec	top1-err=0.693023	top5-err=0.180233


batch: 43
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [43]	Speed: 4.785108 samples/sec	top1-err=0.694318	top5-err=0.179545


batch: 44
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [44]	Speed: 4.571385 samples/sec	top1-err=0.694444	top5-err=0.176667


batch: 45
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [45]	Speed: 4.716354 samples/sec	top1-err=0.694565	top5-err=0.175000


batch: 46
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [46]	Speed: 4.782369 samples/sec	top1-err=0.697872	top5-err=0.173404


batch: 47
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [47]	Speed: 4.805025 samples/sec	top1-err=0.700000	top5-err=0.177083


batch: 48
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [48]	Speed: 4.676816 samples/sec	top1-err=0.700000	top5-err=0.180612


batch: 49
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [49]	Speed: 4.794893 samples/sec	top1-err=0.701000	top5-err=0.179000


batch: 50
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [50]	Speed: 4.583794 samples/sec	top1-err=0.701961	top5-err=0.177451


batch: 51
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [51]	Speed: 4.761376 samples/sec	top1-err=0.703846	top5-err=0.178846


batch: 52
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [52]	Speed: 4.788480 samples/sec	top1-err=0.703774	top5-err=0.179245


batch: 53
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [53]	Speed: 4.806817 samples/sec	top1-err=0.703704	top5-err=0.180556


batch: 54
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [54]	Speed: 4.462073 samples/sec	top1-err=0.703636	top5-err=0.182727


batch: 55
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [55]	Speed: 4.671429 samples/sec	top1-err=0.702679	top5-err=0.183036


batch: 56
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [56]	Speed: 4.762420 samples/sec	top1-err=0.703509	top5-err=0.180702


batch: 57
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [57]	Speed: 4.682249 samples/sec	top1-err=0.706897	top5-err=0.181034


batch: 58
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [58]	Speed: 4.540141 samples/sec	top1-err=0.706780	top5-err=0.182203


batch: 59
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [59]	Speed: 4.733545 samples/sec	top1-err=0.704167	top5-err=0.181667


batch: 60
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [60]	Speed: 4.635494 samples/sec	top1-err=0.705738	top5-err=0.181148


batch: 61
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [61]	Speed: 4.502850 samples/sec	top1-err=0.703226	top5-err=0.179839


batch: 62
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [62]	Speed: 4.832964 samples/sec	top1-err=0.701587	top5-err=0.179365


batch: 63
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [63]	Speed: 1.872983 samples/sec	top1-err=0.700000	top5-err=0.180469


batch: 64
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [64]	Speed: 4.956477 samples/sec	top1-err=0.698462	top5-err=0.180000


batch: 65
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [65]	Speed: 4.881930 samples/sec	top1-err=0.699242	top5-err=0.180303


batch: 66
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [66]	Speed: 4.944047 samples/sec	top1-err=0.700000	top5-err=0.179104


batch: 67
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [67]	Speed: 4.811516 samples/sec	top1-err=0.699265	top5-err=0.178676


batch: 68
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [68]	Speed: 4.866938 samples/sec	top1-err=0.697826	top5-err=0.177536


batch: 69
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [69]	Speed: 4.781055 samples/sec	top1-err=0.700714	top5-err=0.177857


batch: 70
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [70]	Speed: 4.647936 samples/sec	top1-err=0.697887	top5-err=0.178873


batch: 71
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [71]	Speed: 1.832766 samples/sec	top1-err=0.697917	top5-err=0.179861


batch: 72
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [72]	Speed: 4.506088 samples/sec	top1-err=0.696575	top5-err=0.180822


batch: 73
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [73]	Speed: 4.364406 samples/sec	top1-err=0.696622	top5-err=0.179730


batch: 74
forward pass
backward pass
update
update loss


INFO:root:Epoch[4] Batch [74]	Speed: 4.461520 samples/sec	top1-err=0.695333	top5-err=0.179333
INFO:root:[Epoch 4] training: err-top1=0.695333 err-top5=0.179333 loss=1.873062
INFO:root:[Epoch 4] time cost: 408.431010
INFO:root:[Epoch 4] validation: err-top1=0.692500 err-top5=0.230000


### Export model to ONNX format
The model is converted to ONNX format with an internal converter that has not been released yet. This notebook will be updated with the export code once the converter is available.

In [117]:
transform_test = preprocess_test_data(normalize)
val_data = gluon.data.DataLoader(
        imagenet.classification.ImageNet(data_dir, train=False).transform_first(transform_test),
        batch_size=batch_size, shuffle=False, num_workers=num_workers)

expected = []   # ground-truth class indices, one list per batch
predicted = []  # predicted class indices, one list per batch
for i, batch in enumerate(val_data):
    # Only the slice assigned to the first device in `context` is evaluated.
    data = gluon.utils.split_and_load(batch[0], ctx_list=context, batch_axis=0)[0]
    label = gluon.utils.split_and_load(batch[1], ctx_list=context, batch_axis=0)[0]

    outputs = net(data)

    # Convert the label NDArray directly to Python ints instead of parsing its string form.
    expected.append(label.asnumpy().astype(int).tolist())
    # softmax is monotonic, so argmax over the raw outputs selects the same class.
    predicted.append(outputs.argmax(axis=1).asnumpy().astype(int).tolist())
    print(i)

print("\n\n\n")
print("predicted:\n", predicted)
print("expected:\n", expected)

0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19




predicted:
 [[2, 1, 2, 2, 1, 3, 2, 6, 2, 2, 6, 2, 2, 6, 2, 5, 2, 3, 3, 5], [2, 6, 3, 5, 1, 2, 6, 1, 6, 6, 2, 2, 5, 2, 2, 1, 2, 5, 1, 6], [2, 2, 3, 2, 1, 2, 5, 3, 3, 2, 1, 2, 2, 1, 2, 1, 1, 1, 2, 1], [2, 1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 2, 2, 1, 1, 1, 1, 1, 2, 1], [2, 1, 2, 5, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1], [2, 5, 2, 2, 2, 2, 2, 2, 3, 2, 2, 2, 5, 2, 2, 2, 2, 2, 2, 2], [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 2, 2, 2, 2], [2, 2, 2, 2, 1, 2, 2, 2, 2, 6, 2, 2, 1, 2, 5, 2, 1, 2, 6, 6], [5, 2, 1, 5, 6, 2, 2, 1, 5, 2, 2, 2, 2, 0, 3, 1, 2, 3, 2, 5], [1, 3, 2, 2, 3, 6, 4, 3, 2, 2, 1, 6, 3, 6, 6, 3, 2, 2, 2, 6], [6, 1, 2, 2, 6, 2, 1, 2, 5, 6, 2, 2, 1, 1, 4, 6, 3, 1, 2, 2], [2, 2, 4, 2, 2, 1, 5, 2, 2, 2, 5, 5, 2, 1, 4, 6, 1, 2, 2, 2], [1, 2, 4, 2, 5, 2, 1, 2, 4, 1, 5, 5, 5, 1, 6, 6, 5, 5, 5, 5], [5, 1, 5, 5, 1, 5, 6, 5, 5, 2, 1, 5, 2, 5, 1, 5, 5, 2, 5, 2], [2, 5, 5, 1, 3, 6, 5, 5, 6, 5, 5, 6, 5, 5, 5, 1, 6, 1, 5, 2], [1,
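
The nested `predicted` and `expected` lists can be reduced to a single top-1 accuracy and a per-class breakdown. The following is a minimal sketch in plain Python, not part of the original notebook: the helper names `top1_accuracy` and `per_class_accuracy` are hypothetical, and the two sample batches are copied from the printed output of this notebook (the batches whose expected labels are all 1 and all 2).

```python
from collections import defaultdict


def top1_accuracy(predicted, expected):
    """Fraction of samples whose predicted class equals the expected class."""
    correct = 0
    total = 0
    for pred_batch, exp_batch in zip(predicted, expected):
        if len(pred_batch) != len(exp_batch):
            raise ValueError("batch length mismatch")
        correct += sum(1 for p, e in zip(pred_batch, exp_batch) if p == e)
        total += len(exp_batch)
    return correct / total if total else 0.0


def per_class_accuracy(predicted, expected):
    """Map each expected class to the fraction of its samples predicted correctly."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for pred_batch, exp_batch in zip(predicted, expected):
        for p, e in zip(pred_batch, exp_batch):
            counts[e] += 1
            hits[e] += int(p == e)
    return {cls: hits[cls] / counts[cls] for cls in counts}


# Two batches copied from the notebook output (illustrative subset only).
sample_predicted = [
    [2, 1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 2, 2, 1, 1, 1, 1, 1, 2, 1],
    [2, 5, 2, 2, 2, 2, 2, 2, 3, 2, 2, 2, 5, 2, 2, 2, 2, 2, 2, 2],
]
sample_expected = [
    [1] * 20,
    [2] * 20,
]

print("top-1 accuracy:", top1_accuracy(sample_predicted, sample_expected))
print("per-class accuracy:", per_class_accuracy(sample_predicted, sample_expected))
```

In the notebook itself, the same helpers could be called on the full `predicted` and `expected` lists built in the cell above.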

In [118]:
# Iterate over however many batches were collected instead of hard-coding 20.
for pred, exp in zip(predicted, expected):
    print("predicted: ", pred)
    print("expected: ", exp)
    print("\n")

predicted:  [2, 1, 2, 2, 1, 3, 2, 6, 2, 2, 6, 2, 2, 6, 2, 5, 2, 3, 3, 5]
expected:  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]


predicted:  [2, 6, 3, 5, 1, 2, 6, 1, 6, 6, 2, 2, 5, 2, 2, 1, 2, 5, 1, 6]
expected:  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]


predicted:  [2, 2, 3, 2, 1, 2, 5, 3, 3, 2, 1, 2, 2, 1, 2, 1, 1, 1, 2, 1]
expected:  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]


predicted:  [2, 1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 2, 2, 1, 1, 1, 1, 1, 2, 1]
expected:  [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]


predicted:  [2, 1, 2, 5, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1]
expected:  [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]


predicted:  [2, 5, 2, 2, 2, 2, 2, 2, 3, 2, 2, 2, 5, 2, 2, 2, 2, 2, 2, 2]
expected:  [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]


predicted:  [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 2, 2, 2, 2]
expected:  [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 