<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training neural networks and running inference on GPUs is often faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure your machine has at least one Nvidia GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `device` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the device, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), device=gpu)
x

[11:26:40] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], device=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

If there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If your machine has only one GPU, the code falls back to `npx.cpu()` and copies `x` to main memory instead:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[11:26:40] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
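
The indexing rule above can be sketched in plain Python, without MXNet, so that it runs on any machine. The helper name `select_gpu_ids` is hypothetical and is not part of MXNet; in a real script, each returned index would be passed to `npx.gpu(i)`.

```python
def select_gpu_ids(num_available, num_requested):
    """Return the zero-based ids of the GPUs to use.

    Hypothetical helper for illustration only. It mirrors slicing a
    list of devices such as [npx.gpu(i) for i in range(npx.num_gpus())].
    """
    if num_requested < 0:
        raise ValueError("num_requested must be non-negative")
    # Never ask for more GPUs than the machine exposes.
    count = min(num_available, num_requested)
    return list(range(count))


# On a machine with eight GPUs, the ids run from 0 to 7.
all_ids = select_gpu_ids(num_available=8, num_requested=8)
print(all_ids[-1])

# Requesting two GPUs when only one exists yields a single id.
print(select_gpu_ids(num_available=1, num_requested=2))
```

Capping the request at the number of available GPUs keeps a script written for a multi-GPU machine usable on a single-GPU one.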

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the input of an operation is already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` module support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), device=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], device=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined in [Training Neural Networks](6-train-nn.md), as the following code example shows.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_device(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', device=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), device=gpu)
net(x)

[11:26:40] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 4.0686183, -1.6921507]], device=gpu(0))
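
To see how an input of shape `(1, 3, 128, 128)` shrinks as it passes through the three convolutional blocks, the spatial sizes can be traced by hand. The following is a minimal sketch in plain Python. It assumes that `Conv2D` uses stride 1 and no padding, and that `MaxPool2D` uses no padding and rounds down, which is how the blocks above are configured if the library defaults apply. The helper names `conv_out` and `conv_block_out` are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_block_out(size, kernel_size=2, pool_size=4, pool_stride=2):
    """Spatial size after one conv_block: Conv2D followed by MaxPool2D."""
    size = conv_out(size, kernel_size)             # Conv2D (assumed stride 1, no padding)
    return conv_out(size, pool_size, pool_stride)  # MaxPool2D (assumed no padding)


size = 128
sizes = []
for _ in range(3):  # conv1, conv2, conv3
    size = conv_block_out(size)
    sizes.append(size)

# conv3 produces 128 channels, so this is the length seen by dense1.
flattened = 128 * size * size
print(sizes, flattened)
```

Batch normalization and dropout do not change the shape, so they are left out of the trace. The final `nn.Dense(2)` layer is what produces the two values in the output above.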

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
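
The idea can be checked on a toy problem in plain Python, with no GPU required. The sketch below is an illustration under stated assumptions, not MXNet's implementation: it uses a one-parameter linear model with a squared-error loss, and the helpers `grad_sum` and `split` are hypothetical. It shows that adding the gradients computed on each part gives the same result as computing the gradient on the whole batch.

```python
def grad_sum(w, xs, ys):
    """Summed gradient of 0.5 * (w * x - y) ** 2 with respect to w."""
    return sum((w * x - y) * x for x, y in zip(xs, ys))


def split(items, n):
    """Split a list into n contiguous, nearly equal parts."""
    size, extra = divmod(len(items), n)
    parts, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        parts.append(items[start:end])
        start = end
    return parts


w = 0.5
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# One "device" sees the whole batch.
full = grad_sum(w, xs, ys)

# Two "devices" each see half of the batch; their gradients are added.
shard_grads = [grad_sum(w, x_part, y_part)
               for x_part, y_part in zip(split(xs, 2), split(ys, 2))]
combined = sum(shard_grads)

print(full, combined)
```

Because the two values agree, splitting a batch across devices changes where the work happens but not the parameter update that results.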

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
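
The helper passes one list of labels and one list of outputs per device to the metric. What that aggregation amounts to can be sketched in plain Python. This is an illustration, not the `gluon.metric.Accuracy` implementation: it assumes a prediction counts as correct when the index of the highest score equals the label, and the names `argmax` and `sharded_accuracy` are hypothetical.

```python
def argmax(row):
    """Index of the largest value in a list of scores."""
    return max(range(len(row)), key=row.__getitem__)


def sharded_accuracy(label_shards, output_shards):
    """Accuracy over all shards, counting every sample exactly once."""
    correct = total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            correct += int(argmax(scores) == label)
            total += 1
    return correct / total if total else 0.0


# Two shards of two samples each, with two class scores per sample.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],  # both predictions correct
    [[1.5, 0.2], [0.8, 0.1]],   # first wrong, second correct
]
print(sharded_accuracy(label_shards, output_shards))
```

Three of the four samples are classified correctly, so the result is 0.75 regardless of how the samples are divided between shards.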

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, device=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7944738630603627 samples/sec                   batch loss = 13.44683289527893 | accuracy = 0.7


Epoch[1] Batch[10] Speed: 1.278727925611721 samples/sec                   batch loss = 29.597498655319214 | accuracy = 0.525


Epoch[1] Batch[15] Speed: 1.294340225565795 samples/sec                   batch loss = 43.93658638000488 | accuracy = 0.5166666666666667


Epoch[1] Batch[20] Speed: 1.2936988646795455 samples/sec                   batch loss = 57.500115156173706 | accuracy = 0.525


Epoch[1] Batch[25] Speed: 1.2907898020840098 samples/sec                   batch loss = 72.97877883911133 | accuracy = 0.46


Epoch[1] Batch[30] Speed: 1.2892724491857446 samples/sec                   batch loss = 87.37566232681274 | accuracy = 0.4583333333333333


Epoch[1] Batch[35] Speed: 1.2811729488667603 samples/sec                   batch loss = 101.9422447681427 | accuracy = 0.45


Epoch[1] Batch[40] Speed: 1.2831954574379558 samples/sec                   batch loss = 115.81807732582092 | accuracy = 0.4625


Epoch[1] Batch[45] Speed: 1.280073637048518 samples/sec                   batch loss = 129.5005419254303 | accuracy = 0.4777777777777778


Epoch[1] Batch[50] Speed: 1.281678565540777 samples/sec                   batch loss = 142.8156599998474 | accuracy = 0.475


Epoch[1] Batch[55] Speed: 1.2852349949577948 samples/sec                   batch loss = 156.50858306884766 | accuracy = 0.4909090909090909


Epoch[1] Batch[60] Speed: 1.2848440439817352 samples/sec                   batch loss = 169.59574031829834 | accuracy = 0.49583333333333335


Epoch[1] Batch[65] Speed: 1.280944446344717 samples/sec                   batch loss = 183.89560532569885 | accuracy = 0.49615384615384617


Epoch[1] Batch[70] Speed: 1.2802870760892202 samples/sec                   batch loss = 197.91839814186096 | accuracy = 0.4928571428571429


Epoch[1] Batch[75] Speed: 1.2826794252520295 samples/sec                   batch loss = 211.87678003311157 | accuracy = 0.49


Epoch[1] Batch[80] Speed: 1.2896975272749411 samples/sec                   batch loss = 224.95153903961182 | accuracy = 0.5


Epoch[1] Batch[85] Speed: 1.2821113871988439 samples/sec                   batch loss = 238.84226846694946 | accuracy = 0.5


Epoch[1] Batch[90] Speed: 1.2788477179299353 samples/sec                   batch loss = 251.90880727767944 | accuracy = 0.5083333333333333


Epoch[1] Batch[95] Speed: 1.2762823136341437 samples/sec                   batch loss = 265.7874286174774 | accuracy = 0.5210526315789473


Epoch[1] Batch[100] Speed: 1.2847352263477216 samples/sec                   batch loss = 278.7659330368042 | accuracy = 0.53


Epoch[1] Batch[105] Speed: 1.2825429325661872 samples/sec                   batch loss = 293.3864483833313 | accuracy = 0.5285714285714286


Epoch[1] Batch[110] Speed: 1.2787613560351394 samples/sec                   batch loss = 307.59783482551575 | accuracy = 0.525


Epoch[1] Batch[115] Speed: 1.2834968316542337 samples/sec                   batch loss = 321.2305715084076 | accuracy = 0.5239130434782608


Epoch[1] Batch[120] Speed: 1.2855538767873351 samples/sec                   batch loss = 335.9695599079132 | accuracy = 0.5166666666666667


Epoch[1] Batch[125] Speed: 1.2806377199095038 samples/sec                   batch loss = 348.9039897918701 | accuracy = 0.524


Epoch[1] Batch[130] Speed: 1.285062718862658 samples/sec                   batch loss = 362.57892179489136 | accuracy = 0.525


Epoch[1] Batch[135] Speed: 1.287756484906801 samples/sec                   batch loss = 376.4326808452606 | accuracy = 0.5259259259259259


Epoch[1] Batch[140] Speed: 1.2839431646843875 samples/sec                   batch loss = 390.34453678131104 | accuracy = 0.5267857142857143


Epoch[1] Batch[145] Speed: 1.2827678864566625 samples/sec                   batch loss = 404.34625482559204 | accuracy = 0.5293103448275862


Epoch[1] Batch[150] Speed: 1.2857138696983281 samples/sec                   batch loss = 417.9155731201172 | accuracy = 0.53


Epoch[1] Batch[155] Speed: 1.28382094230341 samples/sec                   batch loss = 431.7805733680725 | accuracy = 0.532258064516129


Epoch[1] Batch[160] Speed: 1.285081420889424 samples/sec                   batch loss = 445.3002984523773 | accuracy = 0.534375


Epoch[1] Batch[165] Speed: 1.2842773317443825 samples/sec                   batch loss = 459.07298016548157 | accuracy = 0.5348484848484848


Epoch[1] Batch[170] Speed: 1.2841296870857268 samples/sec                   batch loss = 473.1986002922058 | accuracy = 0.5352941176470588


Epoch[1] Batch[175] Speed: 1.2791418842629758 samples/sec                   batch loss = 486.791081905365 | accuracy = 0.5385714285714286


Epoch[1] Batch[180] Speed: 1.2883039144829695 samples/sec                   batch loss = 501.20718002319336 | accuracy = 0.5361111111111111


Epoch[1] Batch[185] Speed: 1.283275744618488 samples/sec                   batch loss = 514.8472990989685 | accuracy = 0.5364864864864864


Epoch[1] Batch[190] Speed: 1.283411313380957 samples/sec                   batch loss = 528.8617005348206 | accuracy = 0.5394736842105263


Epoch[1] Batch[195] Speed: 1.285548754519515 samples/sec                   batch loss = 542.3148856163025 | accuracy = 0.5435897435897435


Epoch[1] Batch[200] Speed: 1.2870673224894877 samples/sec                   batch loss = 556.1213555335999 | accuracy = 0.54375


Epoch[1] Batch[205] Speed: 1.2823740238715549 samples/sec                   batch loss = 569.4267411231995 | accuracy = 0.5475609756097561


Epoch[1] Batch[210] Speed: 1.286157210743694 samples/sec                   batch loss = 582.9228708744049 | accuracy = 0.5488095238095239


Epoch[1] Batch[215] Speed: 1.289717157606681 samples/sec                   batch loss = 596.9501659870148 | accuracy = 0.5488372093023256


Epoch[1] Batch[220] Speed: 1.285760968793894 samples/sec                   batch loss = 610.7859964370728 | accuracy = 0.5477272727272727


Epoch[1] Batch[225] Speed: 1.2822008479826288 samples/sec                   batch loss = 624.7834663391113 | accuracy = 0.5466666666666666


Epoch[1] Batch[230] Speed: 1.283725067077731 samples/sec                   batch loss = 638.9834170341492 | accuracy = 0.5445652173913044


Epoch[1] Batch[235] Speed: 1.2889769725004372 samples/sec                   batch loss = 652.7687132358551 | accuracy = 0.5446808510638298


Epoch[1] Batch[240] Speed: 1.2835726392383342 samples/sec                   batch loss = 666.3341298103333 | accuracy = 0.546875


Epoch[1] Batch[245] Speed: 1.2852624649587203 samples/sec                   batch loss = 679.7205548286438 | accuracy = 0.55


Epoch[1] Batch[250] Speed: 1.283784790946331 samples/sec                   batch loss = 694.1968913078308 | accuracy = 0.544


Epoch[1] Batch[255] Speed: 1.2837173073119372 samples/sec                   batch loss = 708.6610479354858 | accuracy = 0.542156862745098


Epoch[1] Batch[260] Speed: 1.2815371954182713 samples/sec                   batch loss = 722.382838010788 | accuracy = 0.5442307692307692


Epoch[1] Batch[265] Speed: 1.2808311056941168 samples/sec                   batch loss = 736.2404096126556 | accuracy = 0.5433962264150943


Epoch[1] Batch[270] Speed: 1.2785846721642737 samples/sec                   batch loss = 750.1489715576172 | accuracy = 0.5444444444444444


Epoch[1] Batch[275] Speed: 1.282579896466854 samples/sec                   batch loss = 763.6146521568298 | accuracy = 0.5445454545454546


Epoch[1] Batch[280] Speed: 1.2898010393923551 samples/sec                   batch loss = 776.9616618156433 | accuracy = 0.5446428571428571


Epoch[1] Batch[285] Speed: 1.282993409840877 samples/sec                   batch loss = 791.0988755226135 | accuracy = 0.5456140350877193


Epoch[1] Batch[290] Speed: 1.2812394803322307 samples/sec                   batch loss = 804.6921744346619 | accuracy = 0.5474137931034483


Epoch[1] Batch[295] Speed: 1.283989544539713 samples/sec                   batch loss = 818.6150360107422 | accuracy = 0.5466101694915254


Epoch[1] Batch[300] Speed: 1.2820208612834914 samples/sec                   batch loss = 832.6716909408569 | accuracy = 0.5475


Epoch[1] Batch[305] Speed: 1.2885264407182937 samples/sec                   batch loss = 846.583642244339 | accuracy = 0.5459016393442623


Epoch[1] Batch[310] Speed: 1.2863908318939543 samples/sec                   batch loss = 860.564733505249 | accuracy = 0.5435483870967742


Epoch[1] Batch[315] Speed: 1.2798243405160075 samples/sec                   batch loss = 874.4326498508453 | accuracy = 0.542063492063492


Epoch[1] Batch[320] Speed: 1.283282222995277 samples/sec                   batch loss = 888.432520866394 | accuracy = 0.5421875


Epoch[1] Batch[325] Speed: 1.2822806187273996 samples/sec                   batch loss = 902.1431195735931 | accuracy = 0.5423076923076923


Epoch[1] Batch[330] Speed: 1.2770562971568298 samples/sec                   batch loss = 915.2436068058014 | accuracy = 0.543939393939394


Epoch[1] Batch[335] Speed: 1.2867922982748754 samples/sec                   batch loss = 928.4865849018097 | accuracy = 0.5447761194029851


Epoch[1] Batch[340] Speed: 1.2871753505066006 samples/sec                   batch loss = 941.946352481842 | accuracy = 0.5455882352941176


Epoch[1] Batch[345] Speed: 1.287215149675868 samples/sec                   batch loss = 954.7303113937378 | accuracy = 0.5485507246376812


Epoch[1] Batch[350] Speed: 1.2884768628568937 samples/sec                   batch loss = 967.9880621433258 | accuracy = 0.5492857142857143


Epoch[1] Batch[355] Speed: 1.2866907486998853 samples/sec                   batch loss = 981.6669232845306 | accuracy = 0.5492957746478874


Epoch[1] Batch[360] Speed: 1.290151160367485 samples/sec                   batch loss = 995.3877537250519 | accuracy = 0.5493055555555556


Epoch[1] Batch[365] Speed: 1.2927418990077755 samples/sec                   batch loss = 1008.5540883541107 | accuracy = 0.55


Epoch[1] Batch[370] Speed: 1.2831100774506077 samples/sec                   batch loss = 1022.3373420238495 | accuracy = 0.5493243243243243


Epoch[1] Batch[375] Speed: 1.2825166571366542 samples/sec                   batch loss = 1035.914927482605 | accuracy = 0.55


Epoch[1] Batch[380] Speed: 1.2881580130389672 samples/sec                   batch loss = 1049.155121088028 | accuracy = 0.5532894736842106


Epoch[1] Batch[385] Speed: 1.2863085765247144 samples/sec                   batch loss = 1063.2871832847595 | accuracy = 0.5512987012987013


Epoch[1] Batch[390] Speed: 1.2932312698169706 samples/sec                   batch loss = 1075.8903813362122 | accuracy = 0.5538461538461539


Epoch[1] Batch[395] Speed: 1.286246941206614 samples/sec                   batch loss = 1089.2501246929169 | accuracy = 0.5550632911392405


Epoch[1] Batch[400] Speed: 1.2869641499529811 samples/sec                   batch loss = 1103.0326023101807 | accuracy = 0.556875


Epoch[1] Batch[405] Speed: 1.2892160772711296 samples/sec                   batch loss = 1116.7491273880005 | accuracy = 0.5574074074074075


Epoch[1] Batch[410] Speed: 1.2845247273760185 samples/sec                   batch loss = 1130.4829938411713 | accuracy = 0.5554878048780488


Epoch[1] Batch[415] Speed: 1.2877022221768526 samples/sec                   batch loss = 1144.2546575069427 | accuracy = 0.555421686746988


Epoch[1] Batch[420] Speed: 1.2847603137693948 samples/sec                   batch loss = 1157.7978100776672 | accuracy = 0.5553571428571429


Epoch[1] Batch[425] Speed: 1.288088585449416 samples/sec                   batch loss = 1170.4230546951294 | accuracy = 0.5570588235294117


Epoch[1] Batch[430] Speed: 1.2871546124607423 samples/sec                   batch loss = 1184.028384923935 | accuracy = 0.5581395348837209


Epoch[1] Batch[435] Speed: 1.2816192332842398 samples/sec                   batch loss = 1198.161331653595 | accuracy = 0.5563218390804597


Epoch[1] Batch[440] Speed: 1.2884612283146635 samples/sec                   batch loss = 1211.0473494529724 | accuracy = 0.5573863636363636


Epoch[1] Batch[445] Speed: 1.2863510836814394 samples/sec                   batch loss = 1224.3503370285034 | accuracy = 0.5578651685393259


Epoch[1] Batch[450] Speed: 1.286000064693931 samples/sec                   batch loss = 1237.7769784927368 | accuracy = 0.5577777777777778


Epoch[1] Batch[455] Speed: 1.2899127995729511 samples/sec                   batch loss = 1251.4967367649078 | accuracy = 0.5576923076923077


Epoch[1] Batch[460] Speed: 1.2861469566307968 samples/sec                   batch loss = 1266.4514207839966 | accuracy = 0.5554347826086956


Epoch[1] Batch[465] Speed: 1.2862793853017658 samples/sec                   batch loss = 1280.3477356433868 | accuracy = 0.5543010752688172


Epoch[1] Batch[470] Speed: 1.2846666590605047 samples/sec                   batch loss = 1293.7585139274597 | accuracy = 0.5542553191489362


Epoch[1] Batch[475] Speed: 1.2852651234082246 samples/sec                   batch loss = 1306.9921734333038 | accuracy = 0.5542105263157895


Epoch[1] Batch[480] Speed: 1.2855978117609268 samples/sec                   batch loss = 1320.2901449203491 | accuracy = 0.5541666666666667


Epoch[1] Batch[485] Speed: 1.29018241269541 samples/sec                   batch loss = 1333.9794921875 | accuracy = 0.5561855670103093


Epoch[1] Batch[490] Speed: 1.2883332966018064 samples/sec                   batch loss = 1347.0752222537994 | accuracy = 0.5571428571428572


Epoch[1] Batch[495] Speed: 1.2858853348723598 samples/sec                   batch loss = 1360.4214599132538 | accuracy = 0.5580808080808081


Epoch[1] Batch[500] Speed: 1.2844601160631088 samples/sec                   batch loss = 1373.1332066059113 | accuracy = 0.5595


Epoch[1] Batch[505] Speed: 1.2811700138141904 samples/sec                   batch loss = 1386.048598766327 | accuracy = 0.5603960396039604


Epoch[1] Batch[510] Speed: 1.2858786330758032 samples/sec                   batch loss = 1399.4535541534424 | accuracy = 0.5612745098039216


Epoch[1] Batch[515] Speed: 1.2830216671633656 samples/sec                   batch loss = 1412.6103763580322 | accuracy = 0.5626213592233009


Epoch[1] Batch[520] Speed: 1.2819236875160027 samples/sec                   batch loss = 1425.6309621334076 | accuracy = 0.5625


Epoch[1] Batch[525] Speed: 1.2855482619959142 samples/sec                   batch loss = 1438.7241315841675 | accuracy = 0.5638095238095238


Epoch[1] Batch[530] Speed: 1.2801433754329785 samples/sec                   batch loss = 1452.5356359481812 | accuracy = 0.5627358490566038


Epoch[1] Batch[535] Speed: 1.2788043405766474 samples/sec                   batch loss = 1466.838773727417 | accuracy = 0.5616822429906542


Epoch[1] Batch[540] Speed: 1.2822066295580246 samples/sec                   batch loss = 1480.4110054969788 | accuracy = 0.562037037037037


Epoch[1] Batch[545] Speed: 1.2795874377050978 samples/sec                   batch loss = 1493.9217319488525 | accuracy = 0.5628440366972477


Epoch[1] Batch[550] Speed: 1.2852426746247472 samples/sec                   batch loss = 1506.8805553913116 | accuracy = 0.5640909090909091


Epoch[1] Batch[555] Speed: 1.2887458759817947 samples/sec                   batch loss = 1519.9281690120697 | accuracy = 0.5644144144144144


Epoch[1] Batch[560] Speed: 1.286285203716831 samples/sec                   batch loss = 1533.1510560512543 | accuracy = 0.565625


Epoch[1] Batch[565] Speed: 1.2849368388571172 samples/sec                   batch loss = 1547.1854510307312 | accuracy = 0.5650442477876106


Epoch[1] Batch[570] Speed: 1.2919787381094938 samples/sec                   batch loss = 1559.3150796890259 | accuracy = 0.5666666666666667


Epoch[1] Batch[575] Speed: 1.2877504554887498 samples/sec                   batch loss = 1571.8508520126343 | accuracy = 0.5682608695652174


Epoch[1] Batch[580] Speed: 1.2789859602280662 samples/sec                   batch loss = 1584.6780762672424 | accuracy = 0.569396551724138


Epoch[1] Batch[585] Speed: 1.2789090361074282 samples/sec                   batch loss = 1597.2551536560059 | accuracy = 0.5709401709401709


Epoch[1] Batch[590] Speed: 1.2845557077081922 samples/sec                   batch loss = 1611.3566069602966 | accuracy = 0.5703389830508474


Epoch[1] Batch[595] Speed: 1.280776251719945 samples/sec                   batch loss = 1624.2789177894592 | accuracy = 0.5710084033613445


Epoch[1] Batch[600] Speed: 1.2886016559358824 samples/sec                   batch loss = 1637.518205165863 | accuracy = 0.5708333333333333


Epoch[1] Batch[605] Speed: 1.2846218040782866 samples/sec                   batch loss = 1650.4563100337982 | accuracy = 0.571900826446281


Epoch[1] Batch[610] Speed: 1.2911561597753047 samples/sec                   batch loss = 1663.353182554245 | accuracy = 0.5721311475409836


Epoch[1] Batch[615] Speed: 1.2822657222081253 samples/sec                   batch loss = 1675.415225982666 | accuracy = 0.5735772357723578


Epoch[1] Batch[620] Speed: 1.2863997090015307 samples/sec                   batch loss = 1687.787345647812 | accuracy = 0.5754032258064516


Epoch[1] Batch[625] Speed: 1.2796548782352781 samples/sec                   batch loss = 1699.9136023521423 | accuracy = 0.5768


Epoch[1] Batch[630] Speed: 1.2815685212874217 samples/sec                   batch loss = 1712.9762933254242 | accuracy = 0.5777777777777777


Epoch[1] Batch[635] Speed: 1.2823553025477243 samples/sec                   batch loss = 1726.572253704071 | accuracy = 0.5779527559055118


Epoch[1] Batch[640] Speed: 1.2835529991084023 samples/sec                   batch loss = 1737.6682488918304 | accuracy = 0.580078125


Epoch[1] Batch[645] Speed: 1.2870263476838082 samples/sec                   batch loss = 1750.611244916916 | accuracy = 0.5806201550387597


Epoch[1] Batch[650] Speed: 1.2802378371531271 samples/sec                   batch loss = 1763.1864278316498 | accuracy = 0.5807692307692308


Epoch[1] Batch[655] Speed: 1.2877966165703831 samples/sec                   batch loss = 1776.3871822357178 | accuracy = 0.5805343511450382


Epoch[1] Batch[660] Speed: 1.2808530094848476 samples/sec                   batch loss = 1789.9503917694092 | accuracy = 0.5803030303030303


Epoch[1] Batch[665] Speed: 1.276603664673551 samples/sec                   batch loss = 1803.092717885971 | accuracy = 0.5796992481203007


Epoch[1] Batch[670] Speed: 1.2787706154994272 samples/sec                   batch loss = 1815.7161555290222 | accuracy = 0.5805970149253732


Epoch[1] Batch[675] Speed: 1.2824267601724668 samples/sec                   batch loss = 1828.9342892169952 | accuracy = 0.5803703703703704


Epoch[1] Batch[680] Speed: 1.2817131296018596 samples/sec                   batch loss = 1841.5879452228546 | accuracy = 0.5816176470588236


Epoch[1] Batch[685] Speed: 1.2826798175142524 samples/sec                   batch loss = 1854.7160663604736 | accuracy = 0.5824817518248175


Epoch[1] Batch[690] Speed: 1.2864534674370516 samples/sec                   batch loss = 1868.586131811142 | accuracy = 0.5822463768115942


Epoch[1] Batch[695] Speed: 1.286498943676579 samples/sec                   batch loss = 1882.113454580307 | accuracy = 0.5820143884892086


Epoch[1] Batch[700] Speed: 1.2856060868453776 samples/sec                   batch loss = 1895.0244116783142 | accuracy = 0.5828571428571429


Epoch[1] Batch[705] Speed: 1.2848664788418291 samples/sec                   batch loss = 1909.4765989780426 | accuracy = 0.5822695035460993


Epoch[1] Batch[710] Speed: 1.2850288597359694 samples/sec                   batch loss = 1923.5843892097473 | accuracy = 0.5816901408450704


Epoch[1] Batch[715] Speed: 1.2862636068139737 samples/sec                   batch loss = 1935.592008113861 | accuracy = 0.5821678321678322


Epoch[1] Batch[720] Speed: 1.2831197925314473 samples/sec                   batch loss = 1947.5519750118256 | accuracy = 0.5833333333333334


Epoch[1] Batch[725] Speed: 1.2874505385014658 samples/sec                   batch loss = 1959.6057333946228 | accuracy = 0.5841379310344827


Epoch[1] Batch[730] Speed: 1.2878673967328182 samples/sec                   batch loss = 1972.6714010238647 | accuracy = 0.584931506849315


Epoch[1] Batch[735] Speed: 1.2962665269536444 samples/sec                   batch loss = 1985.3466618061066 | accuracy = 0.5850340136054422


Epoch[1] Batch[740] Speed: 1.2866525607330996 samples/sec                   batch loss = 1998.0382797718048 | accuracy = 0.5858108108108108


Epoch[1] Batch[745] Speed: 1.2833866713094522 samples/sec                   batch loss = 2010.6471571922302 | accuracy = 0.5865771812080537


Epoch[1] Batch[750] Speed: 1.2851455040670727 samples/sec                   batch loss = 2022.5652372837067 | accuracy = 0.587


Epoch[1] Batch[755] Speed: 1.2839349109935463 samples/sec                   batch loss = 2034.9425686597824 | accuracy = 0.5870860927152318


Epoch[1] Batch[760] Speed: 1.28611313893653 samples/sec                   batch loss = 2049.7652863264084 | accuracy = 0.5865131578947368


Epoch[1] Batch[765] Speed: 1.2905245014195048 samples/sec                   batch loss = 2061.654783844948 | accuracy = 0.5875816993464053


Epoch[1] Batch[770] Speed: 1.2905581544079066 samples/sec                   batch loss = 2074.8243473768234 | accuracy = 0.5883116883116883


Epoch[1] Batch[775] Speed: 1.2897008980974909 samples/sec                   batch loss = 2086.86106300354 | accuracy = 0.5896774193548387


Epoch[1] Batch[780] Speed: 1.288471816222678 samples/sec                   batch loss = 2099.3566678762436 | accuracy = 0.5897435897435898


Epoch[1] Batch[785] Speed: 1.2927197859245994 samples/sec                   batch loss = 2112.4191397428513 | accuracy = 0.5901273885350319


[Epoch 1] training: accuracy=0.5897842639593909
[Epoch 1] time cost: 631.284083366394
[Epoch 1] validation: validation accuracy=0.6866666666666666


Epoch[2] Batch[5] Speed: 1.2832037016244475 samples/sec                   batch loss = 13.423954248428345 | accuracy = 0.6


Epoch[2] Batch[10] Speed: 1.27945491971804 samples/sec                   batch loss = 24.559162735939026 | accuracy = 0.7


Epoch[2] Batch[15] Speed: 1.2736221721730991 samples/sec                   batch loss = 38.19987189769745 | accuracy = 0.6666666666666666


Epoch[2] Batch[20] Speed: 1.26954738800628 samples/sec                   batch loss = 50.770437836647034 | accuracy = 0.6375


Epoch[2] Batch[25] Speed: 1.2710067005592374 samples/sec                   batch loss = 63.403475522994995 | accuracy = 0.61


Epoch[2] Batch[30] Speed: 1.2643211902688085 samples/sec                   batch loss = 77.06675362586975 | accuracy = 0.6083333333333333


Epoch[2] Batch[35] Speed: 1.2483082913478476 samples/sec                   batch loss = 88.7778228521347 | accuracy = 0.6214285714285714


Epoch[2] Batch[40] Speed: 1.2485045783741133 samples/sec                   batch loss = 100.91107547283173 | accuracy = 0.61875


Epoch[2] Batch[45] Speed: 1.2505934223076696 samples/sec                   batch loss = 112.93937885761261 | accuracy = 0.6333333333333333


Epoch[2] Batch[50] Speed: 1.2559738654196075 samples/sec                   batch loss = 125.33059203624725 | accuracy = 0.635


Epoch[2] Batch[55] Speed: 1.2553584942536493 samples/sec                   batch loss = 138.579447388649 | accuracy = 0.6363636363636364


Epoch[2] Batch[60] Speed: 1.2532189841445227 samples/sec                   batch loss = 149.77771878242493 | accuracy = 0.6416666666666667


Epoch[2] Batch[65] Speed: 1.2587338967114734 samples/sec                   batch loss = 163.15150773525238 | accuracy = 0.6384615384615384


Epoch[2] Batch[70] Speed: 1.2568436303487458 samples/sec                   batch loss = 175.57134974002838 | accuracy = 0.6357142857142857


Epoch[2] Batch[75] Speed: 1.2631449291260364 samples/sec                   batch loss = 187.85190343856812 | accuracy = 0.6366666666666667


Epoch[2] Batch[80] Speed: 1.2600109574555571 samples/sec                   batch loss = 200.62738013267517 | accuracy = 0.63125


Epoch[2] Batch[85] Speed: 1.2574961794779296 samples/sec                   batch loss = 214.36209440231323 | accuracy = 0.6235294117647059


Epoch[2] Batch[90] Speed: 1.2528528788660622 samples/sec                   batch loss = 227.0179922580719 | accuracy = 0.625


Epoch[2] Batch[95] Speed: 1.2465930153201092 samples/sec                   batch loss = 240.04993534088135 | accuracy = 0.6210526315789474


Epoch[2] Batch[100] Speed: 1.254329650664307 samples/sec                   batch loss = 251.7988704442978 | accuracy = 0.62


Epoch[2] Batch[105] Speed: 1.239565548101905 samples/sec                   batch loss = 266.3597048521042 | accuracy = 0.6190476190476191


Epoch[2] Batch[110] Speed: 1.2390083275471349 samples/sec                   batch loss = 278.7505098581314 | accuracy = 0.6227272727272727


Epoch[2] Batch[115] Speed: 1.2407612290514691 samples/sec                   batch loss = 290.88872826099396 | accuracy = 0.6260869565217392


Epoch[2] Batch[120] Speed: 1.2343014103608327 samples/sec                   batch loss = 304.04647064208984 | accuracy = 0.625


Epoch[2] Batch[125] Speed: 1.2465984802349404 samples/sec                   batch loss = 314.29658460617065 | accuracy = 0.628


Epoch[2] Batch[130] Speed: 1.2383886230036896 samples/sec                   batch loss = 329.64918994903564 | accuracy = 0.6173076923076923


Epoch[2] Batch[135] Speed: 1.247932146450617 samples/sec                   batch loss = 340.6588159799576 | accuracy = 0.6222222222222222


Epoch[2] Batch[140] Speed: 1.2412159779734164 samples/sec                   batch loss = 353.8565367460251 | accuracy = 0.6267857142857143


Epoch[2] Batch[145] Speed: 1.249318159715567 samples/sec                   batch loss = 365.8476313352585 | accuracy = 0.6293103448275862


Epoch[2] Batch[150] Speed: 1.2429878444066267 samples/sec                   batch loss = 379.32664597034454 | accuracy = 0.6216666666666667


Epoch[2] Batch[155] Speed: 1.239223851232416 samples/sec                   batch loss = 392.5654512643814 | accuracy = 0.6225806451612903


Epoch[2] Batch[160] Speed: 1.2454796359982854 samples/sec                   batch loss = 405.5920022726059 | accuracy = 0.6265625


Epoch[2] Batch[165] Speed: 1.2405337043425588 samples/sec                   batch loss = 419.148601770401 | accuracy = 0.6257575757575757


Epoch[2] Batch[170] Speed: 1.2428154752794984 samples/sec                   batch loss = 430.648832321167 | accuracy = 0.6264705882352941


Epoch[2] Batch[175] Speed: 1.2404862832908272 samples/sec                   batch loss = 445.3123428821564 | accuracy = 0.6242857142857143


Epoch[2] Batch[180] Speed: 1.2455187477593832 samples/sec                   batch loss = 455.321205496788 | accuracy = 0.6277777777777778


Epoch[2] Batch[185] Speed: 1.2465453151318415 samples/sec                   batch loss = 465.6844631433487 | accuracy = 0.6310810810810811


Epoch[2] Batch[190] Speed: 1.2437550614700807 samples/sec                   batch loss = 478.6493912935257 | accuracy = 0.6289473684210526


Epoch[2] Batch[195] Speed: 1.2446151816912987 samples/sec                   batch loss = 492.79385781288147 | accuracy = 0.6230769230769231


Epoch[2] Batch[200] Speed: 1.2398859913753764 samples/sec                   batch loss = 503.7177039384842 | accuracy = 0.62875


Epoch[2] Batch[205] Speed: 1.2398595105097379 samples/sec                   batch loss = 517.8380197286606 | accuracy = 0.6268292682926829


Epoch[2] Batch[210] Speed: 1.2473336235835433 samples/sec                   batch loss = 531.5099390745163 | accuracy = 0.6238095238095238


Epoch[2] Batch[215] Speed: 1.2417685689084605 samples/sec                   batch loss = 543.8206948041916 | accuracy = 0.6244186046511628


Epoch[2] Batch[220] Speed: 1.2444659917347218 samples/sec                   batch loss = 556.2953034639359 | accuracy = 0.6227272727272727


Epoch[2] Batch[225] Speed: 1.2455949441538605 samples/sec                   batch loss = 568.8238059282303 | accuracy = 0.6244444444444445


Epoch[2] Batch[230] Speed: 1.2485318014674693 samples/sec                   batch loss = 580.7236903905869 | accuracy = 0.6282608695652174


Epoch[2] Batch[235] Speed: 1.2425333608543117 samples/sec                   batch loss = 593.3940788507462 | accuracy = 0.6297872340425532


Epoch[2] Batch[240] Speed: 1.236749801151002 samples/sec                   batch loss = 606.4783174991608 | accuracy = 0.6291666666666667


Epoch[2] Batch[245] Speed: 1.2440609774083897 samples/sec                   batch loss = 617.7852841615677 | accuracy = 0.6306122448979592


Epoch[2] Batch[250] Speed: 1.236338131067251 samples/sec                   batch loss = 631.4856508970261 | accuracy = 0.63


Epoch[2] Batch[255] Speed: 1.2480835613807135 samples/sec                   batch loss = 642.255490899086 | accuracy = 0.6333333333333333


Epoch[2] Batch[260] Speed: 1.2387695544680928 samples/sec                   batch loss = 653.6934105157852 | accuracy = 0.635576923076923


Epoch[2] Batch[265] Speed: 1.2460267764936588 samples/sec                   batch loss = 663.9572688341141 | accuracy = 0.6386792452830189


Epoch[2] Batch[270] Speed: 1.250940486062822 samples/sec                   batch loss = 674.6712741851807 | accuracy = 0.6416666666666667


Epoch[2] Batch[275] Speed: 1.2462342882869673 samples/sec                   batch loss = 690.1825511455536 | accuracy = 0.64


Epoch[2] Batch[280] Speed: 1.244397132694328 samples/sec                   batch loss = 700.5152069330215 | accuracy = 0.6419642857142858


Epoch[2] Batch[285] Speed: 1.2408661203533669 samples/sec                   batch loss = 712.2982784509659 | accuracy = 0.6429824561403509


Epoch[2] Batch[290] Speed: 1.2503393542915444 samples/sec                   batch loss = 728.0935282707214 | accuracy = 0.6387931034482759


Epoch[2] Batch[295] Speed: 1.2415258827840605 samples/sec                   batch loss = 739.5922684669495 | accuracy = 0.6398305084745762


Epoch[2] Batch[300] Speed: 1.2442118234272792 samples/sec                   batch loss = 753.7233536243439 | accuracy = 0.6383333333333333


Epoch[2] Batch[305] Speed: 1.2523908734972362 samples/sec                   batch loss = 765.9936987161636 | accuracy = 0.6385245901639345


Epoch[2] Batch[310] Speed: 1.243406720090643 samples/sec                   batch loss = 778.3767665624619 | accuracy = 0.6395161290322581


Epoch[2] Batch[315] Speed: 1.2511796823624044 samples/sec                   batch loss = 789.4564168453217 | accuracy = 0.6396825396825396


Epoch[2] Batch[320] Speed: 1.2400419673817802 samples/sec                   batch loss = 800.0830088853836 | accuracy = 0.640625


Epoch[2] Batch[325] Speed: 1.243622854969068 samples/sec                   batch loss = 809.1755295991898 | accuracy = 0.6438461538461538


Epoch[2] Batch[330] Speed: 1.2533233708219886 samples/sec                   batch loss = 824.900190114975 | accuracy = 0.6424242424242425


Epoch[2] Batch[335] Speed: 1.2512025432711609 samples/sec                   batch loss = 835.859706401825 | accuracy = 0.6447761194029851


Epoch[2] Batch[340] Speed: 1.2572632301514475 samples/sec                   batch loss = 850.5169684886932 | accuracy = 0.6441176470588236


Epoch[2] Batch[345] Speed: 1.249040245870969 samples/sec                   batch loss = 861.723600268364 | accuracy = 0.644927536231884


Epoch[2] Batch[350] Speed: 1.251755378349141 samples/sec                   batch loss = 870.2819162607193 | accuracy = 0.6492857142857142


Epoch[2] Batch[355] Speed: 1.2514114093506685 samples/sec                   batch loss = 887.5733925104141 | accuracy = 0.6450704225352113


Epoch[2] Batch[360] Speed: 1.2422731744050741 samples/sec                   batch loss = 897.3757570981979 | accuracy = 0.6472222222222223


Epoch[2] Batch[365] Speed: 1.2493668166577616 samples/sec                   batch loss = 908.9662922620773 | accuracy = 0.6472602739726028


Epoch[2] Batch[370] Speed: 1.2520893510264957 samples/sec                   batch loss = 921.1965379714966 | accuracy = 0.647972972972973


Epoch[2] Batch[375] Speed: 1.2498918641781298 samples/sec                   batch loss = 932.7540471553802 | accuracy = 0.648


Epoch[2] Batch[380] Speed: 1.248026277469782 samples/sec                   batch loss = 943.2351046800613 | accuracy = 0.65


Epoch[2] Batch[385] Speed: 1.2517534170779436 samples/sec                   batch loss = 955.9374420642853 | accuracy = 0.65


Epoch[2] Batch[390] Speed: 1.2494374364379137 samples/sec                   batch loss = 966.6256192922592 | accuracy = 0.6512820512820513


Epoch[2] Batch[395] Speed: 1.2480201501666472 samples/sec                   batch loss = 979.0228371620178 | accuracy = 0.6518987341772152


Epoch[2] Batch[400] Speed: 1.2508493655135793 samples/sec                   batch loss = 988.4808753728867 | accuracy = 0.65375


Epoch[2] Batch[405] Speed: 1.2541968743038352 samples/sec                   batch loss = 1001.1030670404434 | accuracy = 0.6537037037037037


Epoch[2] Batch[410] Speed: 1.2518803520143433 samples/sec                   batch loss = 1012.2773600816727 | accuracy = 0.6554878048780488


Epoch[2] Batch[415] Speed: 1.2535601074594347 samples/sec                   batch loss = 1022.8146134614944 | accuracy = 0.6560240963855422


Epoch[2] Batch[420] Speed: 1.2581551619731117 samples/sec                   batch loss = 1034.9173357486725 | accuracy = 0.6577380952380952


Epoch[2] Batch[425] Speed: 1.2560029196807068 samples/sec                   batch loss = 1047.7825864553452 | accuracy = 0.658235294117647


Epoch[2] Batch[430] Speed: 1.2550093505483162 samples/sec                   batch loss = 1060.4901082515717 | accuracy = 0.6581395348837209


Epoch[2] Batch[435] Speed: 1.2495451958335424 samples/sec                   batch loss = 1072.3539996147156 | accuracy = 0.6597701149425287


Epoch[2] Batch[440] Speed: 1.2530549964336528 samples/sec                   batch loss = 1082.6148793697357 | accuracy = 0.6607954545454545


Epoch[2] Batch[445] Speed: 1.2528115276591716 samples/sec                   batch loss = 1094.5373685359955 | accuracy = 0.6601123595505618


Epoch[2] Batch[450] Speed: 1.2527542766480708 samples/sec                   batch loss = 1103.4106501340866 | accuracy = 0.6622222222222223


Epoch[2] Batch[455] Speed: 1.256223174005373 samples/sec                   batch loss = 1115.4678137302399 | accuracy = 0.6626373626373626


Epoch[2] Batch[460] Speed: 1.2523264629813047 samples/sec                   batch loss = 1126.3519048690796 | accuracy = 0.6635869565217392


Epoch[2] Batch[465] Speed: 1.253521706596613 samples/sec                   batch loss = 1141.8483127355576 | accuracy = 0.6639784946236559


Epoch[2] Batch[470] Speed: 1.2504775597062365 samples/sec                   batch loss = 1155.5181226730347 | accuracy = 0.6632978723404256


Epoch[2] Batch[475] Speed: 1.2507732709085453 samples/sec                   batch loss = 1166.660140991211 | accuracy = 0.6631578947368421


Epoch[2] Batch[480] Speed: 1.249785441486964 samples/sec                   batch loss = 1179.0512717962265 | accuracy = 0.6630208333333333


Epoch[2] Batch[485] Speed: 1.2482958455058655 samples/sec                   batch loss = 1191.3600090742111 | accuracy = 0.6623711340206185


Epoch[2] Batch[490] Speed: 1.2481990736201056 samples/sec                   batch loss = 1203.9565669298172 | accuracy = 0.6632653061224489


Epoch[2] Batch[495] Speed: 1.2557272874177328 samples/sec                   batch loss = 1213.703033566475 | accuracy = 0.6651515151515152


Epoch[2] Batch[500] Speed: 1.2515289388376722 samples/sec                   batch loss = 1227.8876769542694 | accuracy = 0.6635


Epoch[2] Batch[505] Speed: 1.2527826208573936 samples/sec                   batch loss = 1238.752112865448 | accuracy = 0.6623762376237624


Epoch[2] Batch[510] Speed: 1.2531608534530274 samples/sec                   batch loss = 1251.624670624733 | accuracy = 0.6617647058823529


Epoch[2] Batch[515] Speed: 1.24831971573469 samples/sec                   batch loss = 1261.7629252672195 | accuracy = 0.6616504854368932


Epoch[2] Batch[520] Speed: 1.2594745393034026 samples/sec                   batch loss = 1274.647752046585 | accuracy = 0.6610576923076923


Epoch[2] Batch[525] Speed: 1.25119349211759 samples/sec                   batch loss = 1285.9137423038483 | accuracy = 0.6619047619047619


Epoch[2] Batch[530] Speed: 1.258045252153204 samples/sec                   batch loss = 1296.1044375896454 | accuracy = 0.6632075471698113


Epoch[2] Batch[535] Speed: 1.2560731632047548 samples/sec                   batch loss = 1305.8512156009674 | accuracy = 0.6644859813084112


Epoch[2] Batch[540] Speed: 1.2528967590367106 samples/sec                   batch loss = 1314.3407267332077 | accuracy = 0.6666666666666666


Epoch[2] Batch[545] Speed: 1.2483101489574882 samples/sec                   batch loss = 1324.349189043045 | accuracy = 0.6678899082568808


Epoch[2] Batch[550] Speed: 1.252880759727194 samples/sec                   batch loss = 1332.9111914634705 | accuracy = 0.6690909090909091


Epoch[2] Batch[555] Speed: 1.2542276279083397 samples/sec                   batch loss = 1348.0534046888351 | accuracy = 0.668018018018018


Epoch[2] Batch[560] Speed: 1.2648386689909987 samples/sec                   batch loss = 1358.8483498096466 | accuracy = 0.6683035714285714


Epoch[2] Batch[565] Speed: 1.2835139170378866 samples/sec                   batch loss = 1370.3530910015106 | accuracy = 0.6685840707964602


Epoch[2] Batch[570] Speed: 1.2809624418737466 samples/sec                   batch loss = 1383.2483919858932 | accuracy = 0.6684210526315789


Epoch[2] Batch[575] Speed: 1.2812362514398163 samples/sec                   batch loss = 1397.6751329898834 | accuracy = 0.6673913043478261


Epoch[2] Batch[580] Speed: 1.2828165355117 samples/sec                   batch loss = 1414.5916438102722 | accuracy = 0.6655172413793103


Epoch[2] Batch[585] Speed: 1.2869080784217588 samples/sec                   batch loss = 1425.256780385971 | accuracy = 0.667094017094017


Epoch[2] Batch[590] Speed: 1.2819391638031559 samples/sec                   batch loss = 1440.1346763372421 | accuracy = 0.6665254237288135


Epoch[2] Batch[595] Speed: 1.2709932202699028 samples/sec                   batch loss = 1453.0888268947601 | accuracy = 0.6659663865546218


Epoch[2] Batch[600] Speed: 1.283463447734556 samples/sec                   batch loss = 1468.2879660129547 | accuracy = 0.665


Epoch[2] Batch[605] Speed: 1.2740042911757525 samples/sec                   batch loss = 1481.8103349208832 | accuracy = 0.6644628099173554


Epoch[2] Batch[610] Speed: 1.2696425022737032 samples/sec                   batch loss = 1495.0251395702362 | accuracy = 0.6639344262295082


Epoch[2] Batch[615] Speed: 1.2688997411181375 samples/sec                   batch loss = 1508.402509689331 | accuracy = 0.6634146341463415


Epoch[2] Batch[620] Speed: 1.271623344208856 samples/sec                   batch loss = 1519.8104281425476 | accuracy = 0.6641129032258064


Epoch[2] Batch[625] Speed: 1.2710672691020155 samples/sec                   batch loss = 1533.874329328537 | accuracy = 0.6636


Epoch[2] Batch[630] Speed: 1.2709257268356182 samples/sec                   batch loss = 1544.6367884874344 | accuracy = 0.6634920634920635


Epoch[2] Batch[635] Speed: 1.2667490618045834 samples/sec                   batch loss = 1556.81709420681 | accuracy = 0.6633858267716536


Epoch[2] Batch[640] Speed: 1.2636360329193397 samples/sec                   batch loss = 1566.4936374425888 | accuracy = 0.66484375


Epoch[2] Batch[645] Speed: 1.2621111687689546 samples/sec                   batch loss = 1576.5268193483353 | accuracy = 0.665891472868217


Epoch[2] Batch[650] Speed: 1.2681800878711933 samples/sec                   batch loss = 1586.9707092046738 | accuracy = 0.6665384615384615


Epoch[2] Batch[655] Speed: 1.272001757139726 samples/sec                   batch loss = 1599.4545747041702 | accuracy = 0.666793893129771


Epoch[2] Batch[660] Speed: 1.2689487835962288 samples/sec                   batch loss = 1612.7802780866623 | accuracy = 0.6670454545454545


Epoch[2] Batch[665] Speed: 1.266395179351465 samples/sec                   batch loss = 1624.7096905708313 | accuracy = 0.6676691729323309


Epoch[2] Batch[670] Speed: 1.2637301682712225 samples/sec                   batch loss = 1636.8860214948654 | accuracy = 0.6671641791044776


Epoch[2] Batch[675] Speed: 1.270841971771113 samples/sec                   batch loss = 1646.5900183916092 | accuracy = 0.6674074074074074


Epoch[2] Batch[680] Speed: 1.2790991695835552 samples/sec                   batch loss = 1658.9467412233353 | accuracy = 0.6676470588235294


Epoch[2] Batch[685] Speed: 1.2733976113684853 samples/sec                   batch loss = 1668.0196598768234 | accuracy = 0.668978102189781


Epoch[2] Batch[690] Speed: 1.2686704146798182 samples/sec                   batch loss = 1679.2730977535248 | accuracy = 0.6684782608695652


Epoch[2] Batch[695] Speed: 1.274788299118305 samples/sec                   batch loss = 1690.477182149887 | accuracy = 0.668705035971223


Epoch[2] Batch[700] Speed: 1.2729760608920415 samples/sec                   batch loss = 1701.1362788677216 | accuracy = 0.6696428571428571


Epoch[2] Batch[705] Speed: 1.269507040804096 samples/sec                   batch loss = 1712.5922498703003 | accuracy = 0.6698581560283688


Epoch[2] Batch[710] Speed: 1.276530620552373 samples/sec                   batch loss = 1724.9092011451721 | accuracy = 0.6697183098591549


Epoch[2] Batch[715] Speed: 1.277826451055429 samples/sec                   batch loss = 1736.4532833099365 | accuracy = 0.6695804195804196


Epoch[2] Batch[720] Speed: 1.2774352275506422 samples/sec                   batch loss = 1746.522976398468 | accuracy = 0.6697916666666667


Epoch[2] Batch[725] Speed: 1.2773305786734754 samples/sec                   batch loss = 1759.6294239759445 | accuracy = 0.6693103448275862


Epoch[2] Batch[730] Speed: 1.2727653420225489 samples/sec                   batch loss = 1768.9749666452408 | accuracy = 0.6702054794520548


Epoch[2] Batch[735] Speed: 1.2744130669221183 samples/sec                   batch loss = 1782.3862203359604 | accuracy = 0.669047619047619


Epoch[2] Batch[740] Speed: 1.2764287418501945 samples/sec                   batch loss = 1792.1281156539917 | accuracy = 0.6699324324324324


Epoch[2] Batch[745] Speed: 1.2759503528839133 samples/sec                   batch loss = 1806.6915117502213 | accuracy = 0.6691275167785234


Epoch[2] Batch[750] Speed: 1.269292379176489 samples/sec                   batch loss = 1814.7072929143906 | accuracy = 0.6703333333333333


Epoch[2] Batch[755] Speed: 1.2688650009911344 samples/sec                   batch loss = 1825.151826620102 | accuracy = 0.671523178807947


Epoch[2] Batch[760] Speed: 1.273658720339059 samples/sec                   batch loss = 1842.220097064972 | accuracy = 0.6700657894736842


Epoch[2] Batch[765] Speed: 1.2727880329574799 samples/sec                   batch loss = 1853.5953224897385 | accuracy = 0.6709150326797385


Epoch[2] Batch[770] Speed: 1.2742559707811254 samples/sec                   batch loss = 1866.2184590101242 | accuracy = 0.6704545454545454


Epoch[2] Batch[775] Speed: 1.267462303737383 samples/sec                   batch loss = 1877.5896359682083 | accuracy = 0.6706451612903226


Epoch[2] Batch[780] Speed: 1.2687467836507176 samples/sec                   batch loss = 1888.2038969993591 | accuracy = 0.6708333333333333


Epoch[2] Batch[785] Speed: 1.2682420171024886 samples/sec                   batch loss = 1901.9622988700867 | accuracy = 0.6713375796178344


[Epoch 2] training: accuracy=0.6716370558375635
[Epoch 2] time cost: 641.0706522464752
[Epoch 2] validation: validation accuracy=0.7177777777777777


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).