<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure that your machine has at least one NVIDIA GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet; installation instructions for your system are available [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[15:32:04] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet reports which GPU the ndarray is allocated on, as in the `ctx=gpu(0)` shown in the output above.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. Requesting a second GPU on a machine that has only one results in an error, so the example code here falls back to the CPU in that case. When a second GPU is available, the code copies `x` to it, `npx.gpu(1)`:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[15:32:04] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them by its zero-based index through `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has, so if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training, which is particularly useful when you want to use multiple GPUs to train a neural network.
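
The index-based selection described above can be sketched without MXNet at all. The helper below is hypothetical (`pick_gpu_ids` is not part of MXNet); it only assumes you already know how many GPUs are visible, for example from `npx.num_gpus()`, and it returns the zero-based IDs you would then pass to `npx.gpu(i)`.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return the zero-based IDs of the GPUs to use.

    Hypothetical helper for illustration only; it is not an MXNet API.
    An empty list means no GPU is available, so the caller should
    fall back to the CPU.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# A machine with eight GPUs: IDs run from 0 to 7.
print(pick_gpu_ids(8, 8))  # [0, 1, 2, 3, 4, 5, 6, 7]
# Asking for two GPUs when only one is present yields just GPU 0.
print(pick_gpu_ids(1, 2))  # [0]
```

Capping the request at the number of available GPUs avoids asking for a device ID that does not exist on the machine.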

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that all inputs of an operation must be on the same GPU; otherwise, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md), as the following code example shows.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters directly onto GPU 0 as shown below. Alternatively, you can use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[15:32:04] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 6.510857, -5.206547]], ctx=gpu(0))
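
To see how the `(1, 3, 128, 128)` input is reduced before it reaches the dense layers, the spatial size can be traced by hand. The sketch below is plain Python rather than MXNet, and it rests on assumptions not stated in this tutorial: that `nn.Conv2D` uses stride 1 and no padding unless told otherwise, and that `nn.MaxPool2D` rounds the output size down. The helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, pool, stride):
    """Output length of a pooling layer, rounding down (assumed)."""
    return (size - pool) // stride + 1


def trace_leaf_network(size=128):
    """Trace height/width through the three conv blocks of LeafNetwork."""
    sizes = [size]
    for _ in range(3):
        size = conv_out(size, kernel=2)          # Conv2D(kernel_size=2)
        size = pool_out(size, pool=4, stride=2)  # MaxPool2D(pool_size=4, strides=2)
        sizes.append(size)
    return sizes


sizes = trace_leaf_network(128)
print(sizes)  # [128, 62, 29, 13]

# conv3 produces 128 channels, so Flatten yields this many features.
flat_features = 128 * sizes[-1] ** 2
print(flat_features)  # 21632
```

Under these assumptions, each image is flattened to 21632 features, which the dense blocks then reduce to the two output scores shown above.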

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. Data parallelism works as follows: assuming there are *n* GPUs, you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
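
The idea can be illustrated in plain Python before involving any GPU. The sketch below is a toy model, not MXNet code: `split_batch` and `grad_sum` are hypothetical helpers, the "devices" are just list slices, and the model is a one-parameter linear fit with squared error. It shows that summing the gradients computed on each part and dividing by the full batch size gives the same mean gradient as processing the whole batch in one place.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous, nearly equal parts."""
    base, extra = divmod(len(batch), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        end = start + base + (1 if i < extra else 0)
        parts.append(batch[start:end])
        start = end
    return parts


def grad_sum(w, samples):
    """Sum of d/dw (w * x - y) ** 2 over the given (x, y) samples."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 9.0)]
w = 0.5

# Each "device" computes the gradient sum of its own part of the batch.
shards = split_batch(batch, num_parts=2)
per_device = [grad_sum(w, shard) for shard in shards]

# Adding the per-device sums and dividing by the full batch size
# matches the mean gradient of the whole batch computed at once.
parallel_grad = sum(per_device) / len(batch)
single_grad = grad_sum(w, batch) / len(batch)
print(parallel_grad, single_grad)  # -25.0 -25.0
```

In the training loop later in this step, `gluon.utils.split_and_load` plays the role of `split_batch`, and the batch size passed to `trainer.step` plays the role of the final division.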

First, use the following commands to copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md).

In [8]:
# Import transforms to compose a series of transformations for the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same `test` function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
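
The helper above passes lists of per-device labels and outputs to the metric. As a rough illustration of what accumulating accuracy over such lists amounts to, here is a plain-Python sketch. It is not MXNet's implementation: `argmax` and `sharded_accuracy` are hypothetical names, and the sketch assumes each output row holds one score per class, with the highest score taken as the prediction.

```python
def argmax(values):
    """Index of the largest value in a non-empty list."""
    return max(range(len(values)), key=lambda i: values[i])


def sharded_accuracy(label_shards, output_shards):
    """Accuracy over several per-device shards of labels and scores."""
    correct = total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            correct += int(argmax(scores) == label)
            total += 1
    return correct / total


# Two shards, as if a batch of four had been split across two devices.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],  # both predictions correct
    [[1.5, 0.2], [0.8, 0.1]],   # first wrong, second correct
]
print(sharded_accuracy(label_shards, output_shards))  # 0.75
```

Because the counts are accumulated across all shards before dividing, the result is the same as if the batch had never been split.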

The training loop is quite similar to the one shown earlier. The major differences are marked with `Diff` comments in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if the machine has fewer).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7821931000425941 samples/sec                   batch loss = 14.516242504119873 | accuracy = 0.55


Epoch[1] Batch[10] Speed: 1.254081186752014 samples/sec                   batch loss = 27.831450819969177 | accuracy = 0.575


Epoch[1] Batch[15] Speed: 1.25976383242601 samples/sec                   batch loss = 41.10348951816559 | accuracy = 0.55


Epoch[1] Batch[20] Speed: 1.2571163619039827 samples/sec                   batch loss = 56.15726935863495 | accuracy = 0.525


Epoch[1] Batch[25] Speed: 1.257975919785104 samples/sec                   batch loss = 71.25851094722748 | accuracy = 0.49


Epoch[1] Batch[30] Speed: 1.2639985643174265 samples/sec                   batch loss = 86.86704909801483 | accuracy = 0.43333333333333335


Epoch[1] Batch[35] Speed: 1.2561774615898194 samples/sec                   batch loss = 101.55967128276825 | accuracy = 0.42857142857142855


Epoch[1] Batch[40] Speed: 1.254569957321092 samples/sec                   batch loss = 114.5157378911972 | accuracy = 0.45625


Epoch[1] Batch[45] Speed: 1.249654463109114 samples/sec                   batch loss = 128.73744118213654 | accuracy = 0.45


Epoch[1] Batch[50] Speed: 1.2549913258079206 samples/sec                   batch loss = 142.7920104265213 | accuracy = 0.46


Epoch[1] Batch[55] Speed: 1.253881455655641 samples/sec                   batch loss = 156.23208010196686 | accuracy = 0.4681818181818182


Epoch[1] Batch[60] Speed: 1.2575627253522716 samples/sec                   batch loss = 169.5144044160843 | accuracy = 0.4791666666666667


Epoch[1] Batch[65] Speed: 1.2580846853988918 samples/sec                   batch loss = 183.2689563035965 | accuracy = 0.4846153846153846


Epoch[1] Batch[70] Speed: 1.25757573372569 samples/sec                   batch loss = 196.92394483089447 | accuracy = 0.4928571428571429


Epoch[1] Batch[75] Speed: 1.2492692274928918 samples/sec                   batch loss = 210.31680142879486 | accuracy = 0.5033333333333333


Epoch[1] Batch[80] Speed: 1.2468004378051005 samples/sec                   batch loss = 224.53740513324738 | accuracy = 0.50625


Epoch[1] Batch[85] Speed: 1.2493168572873619 samples/sec                   batch loss = 238.36866223812103 | accuracy = 0.5058823529411764


Epoch[1] Batch[90] Speed: 1.2474857281201688 samples/sec                   batch loss = 252.61541330814362 | accuracy = 0.5055555555555555


Epoch[1] Batch[95] Speed: 1.247217529718885 samples/sec                   batch loss = 265.7613798379898 | accuracy = 0.5131578947368421


Epoch[1] Batch[100] Speed: 1.254555040996285 samples/sec                   batch loss = 279.18243181705475 | accuracy = 0.5225


Epoch[1] Batch[105] Speed: 1.2497424306339309 samples/sec                   batch loss = 292.7440768480301 | accuracy = 0.5285714285714286


Epoch[1] Batch[110] Speed: 1.2504625541335912 samples/sec                   batch loss = 307.10738027095795 | accuracy = 0.5295454545454545


Epoch[1] Batch[115] Speed: 1.2517465993735006 samples/sec                   batch loss = 320.5391901731491 | accuracy = 0.5347826086956522


Epoch[1] Batch[120] Speed: 1.2509263088105624 samples/sec                   batch loss = 334.1359258890152 | accuracy = 0.5354166666666667


Epoch[1] Batch[125] Speed: 1.2510205189479782 samples/sec                   batch loss = 348.11549723148346 | accuracy = 0.538


Epoch[1] Batch[130] Speed: 1.250326961097722 samples/sec                   batch loss = 361.6729406118393 | accuracy = 0.5403846153846154


Epoch[1] Batch[135] Speed: 1.252563103670604 samples/sec                   batch loss = 375.32334196567535 | accuracy = 0.5425925925925926


Epoch[1] Batch[140] Speed: 1.257833505833065 samples/sec                   batch loss = 388.1255975961685 | accuracy = 0.5553571428571429


Epoch[1] Batch[145] Speed: 1.2534303969207414 samples/sec                   batch loss = 402.2347334623337 | accuracy = 0.553448275862069


Epoch[1] Batch[150] Speed: 1.2594962860464893 samples/sec                   batch loss = 415.39319717884064 | accuracy = 0.5566666666666666


Epoch[1] Batch[155] Speed: 1.2569614290801403 samples/sec                   batch loss = 428.9949268102646 | accuracy = 0.5548387096774193


Epoch[1] Batch[160] Speed: 1.2516899125700967 samples/sec                   batch loss = 442.58677184581757 | accuracy = 0.55625


Epoch[1] Batch[165] Speed: 1.25237413921381 samples/sec                   batch loss = 456.48522651195526 | accuracy = 0.5590909090909091


Epoch[1] Batch[170] Speed: 1.2505609823437405 samples/sec                   batch loss = 470.42313373088837 | accuracy = 0.5544117647058824


Epoch[1] Batch[175] Speed: 1.2550860553632674 samples/sec                   batch loss = 483.9681304693222 | accuracy = 0.5585714285714286


Epoch[1] Batch[180] Speed: 1.254005542009766 samples/sec                   batch loss = 497.7775877714157 | accuracy = 0.5555555555555556


Epoch[1] Batch[185] Speed: 1.256456961305039 samples/sec                   batch loss = 511.1538516283035 | accuracy = 0.5594594594594594


Epoch[1] Batch[190] Speed: 1.2513309531227192 samples/sec                   batch loss = 525.4990829229355 | accuracy = 0.5565789473684211


Epoch[1] Batch[195] Speed: 1.2474391653529702 samples/sec                   batch loss = 538.995713353157 | accuracy = 0.5564102564102564


Epoch[1] Batch[200] Speed: 1.245558879164378 samples/sec                   batch loss = 551.9553152322769 | accuracy = 0.56125


Epoch[1] Batch[205] Speed: 1.2419617012576099 samples/sec                   batch loss = 564.6403795480728 | accuracy = 0.5670731707317073


Epoch[1] Batch[210] Speed: 1.2427512174019968 samples/sec                   batch loss = 577.8939887285233 | accuracy = 0.5702380952380952


Epoch[1] Batch[215] Speed: 1.247528491014111 samples/sec                   batch loss = 591.9616321325302 | accuracy = 0.5686046511627907


Epoch[1] Batch[220] Speed: 1.2460326991548967 samples/sec                   batch loss = 605.903782248497 | accuracy = 0.5681818181818182


Epoch[1] Batch[225] Speed: 1.253496325854342 samples/sec                   batch loss = 620.0795966386795 | accuracy = 0.5666666666666667


Epoch[1] Batch[230] Speed: 1.246779497879157 samples/sec                   batch loss = 633.8267422914505 | accuracy = 0.5673913043478261


Epoch[1] Batch[235] Speed: 1.2515002779045274 samples/sec                   batch loss = 647.5812922716141 | accuracy = 0.5659574468085107


Epoch[1] Batch[240] Speed: 1.2479544246990781 samples/sec                   batch loss = 660.5151945352554 | accuracy = 0.5697916666666667


Epoch[1] Batch[245] Speed: 1.2525149455129045 samples/sec                   batch loss = 673.8246816396713 | accuracy = 0.5724489795918367


Epoch[1] Batch[250] Speed: 1.2478586338738271 samples/sec                   batch loss = 686.7398973703384 | accuracy = 0.576


Epoch[1] Batch[255] Speed: 1.2440137475207356 samples/sec                   batch loss = 699.8591622114182 | accuracy = 0.5784313725490197


Epoch[1] Batch[260] Speed: 1.2475126285140075 samples/sec                   batch loss = 712.9667745828629 | accuracy = 0.5788461538461539


Epoch[1] Batch[265] Speed: 1.2565907813546708 samples/sec                   batch loss = 726.7269707918167 | accuracy = 0.5783018867924529


Epoch[1] Batch[270] Speed: 1.2493696078018877 samples/sec                   batch loss = 740.7323158979416 | accuracy = 0.5768518518518518


Epoch[1] Batch[275] Speed: 1.2568715949552878 samples/sec                   batch loss = 754.3011280298233 | accuracy = 0.5781818181818181


Epoch[1] Batch[280] Speed: 1.2537727596956518 samples/sec                   batch loss = 768.1114603281021 | accuracy = 0.5767857142857142


Epoch[1] Batch[285] Speed: 1.2494508355580922 samples/sec                   batch loss = 782.5459758043289 | accuracy = 0.5745614035087719


Epoch[1] Batch[290] Speed: 1.245578575938259 samples/sec                   batch loss = 796.4217962026596 | accuracy = 0.5741379310344827


Epoch[1] Batch[295] Speed: 1.243257082646668 samples/sec                   batch loss = 809.3849905729294 | accuracy = 0.576271186440678


Epoch[1] Batch[300] Speed: 1.2408919099948166 samples/sec                   batch loss = 823.6781896352768 | accuracy = 0.5733333333333334


Epoch[1] Batch[305] Speed: 1.246807479708148 samples/sec                   batch loss = 837.5571573972702 | accuracy = 0.5713114754098361


Epoch[1] Batch[310] Speed: 1.2526760794737588 samples/sec                   batch loss = 850.5320850610733 | accuracy = 0.5725806451612904


Epoch[1] Batch[315] Speed: 1.2451655365297787 samples/sec                   batch loss = 863.9611157178879 | accuracy = 0.5722222222222222


Epoch[1] Batch[320] Speed: 1.2509779827130594 samples/sec                   batch loss = 878.2098523378372 | accuracy = 0.57109375


Epoch[1] Batch[325] Speed: 1.2508416250756185 samples/sec                   batch loss = 891.7103003263474 | accuracy = 0.5723076923076923


Epoch[1] Batch[330] Speed: 1.241153905393583 samples/sec                   batch loss = 905.0185638666153 | accuracy = 0.5727272727272728


Epoch[1] Batch[335] Speed: 1.246416311358022 samples/sec                   batch loss = 918.1378463506699 | accuracy = 0.5738805970149253


Epoch[1] Batch[340] Speed: 1.2491980686486446 samples/sec                   batch loss = 931.965851187706 | accuracy = 0.575


Epoch[1] Batch[345] Speed: 1.2472077943945283 samples/sec                   batch loss = 946.0042120218277 | accuracy = 0.5731884057971014


Epoch[1] Batch[350] Speed: 1.2484625846675113 samples/sec                   batch loss = 959.3052757978439 | accuracy = 0.5735714285714286


Epoch[1] Batch[355] Speed: 1.2435660719635757 samples/sec                   batch loss = 972.1465481519699 | accuracy = 0.5753521126760563


Epoch[1] Batch[360] Speed: 1.2393443209072266 samples/sec                   batch loss = 985.9589458703995 | accuracy = 0.5736111111111111


Epoch[1] Batch[365] Speed: 1.2492029983464301 samples/sec                   batch loss = 999.1738086938858 | accuracy = 0.5746575342465754


Epoch[1] Batch[370] Speed: 1.2492564834176871 samples/sec                   batch loss = 1012.2879213094711 | accuracy = 0.5763513513513514


Epoch[1] Batch[375] Speed: 1.2457133259761186 samples/sec                   batch loss = 1026.3442605733871 | accuracy = 0.5753333333333334


Epoch[1] Batch[380] Speed: 1.25111101135031 samples/sec                   batch loss = 1039.1045421361923 | accuracy = 0.5763157894736842


Epoch[1] Batch[385] Speed: 1.25313211772792 samples/sec                   batch loss = 1051.426918387413 | accuracy = 0.5772727272727273


Epoch[1] Batch[390] Speed: 1.2514033819306003 samples/sec                   batch loss = 1065.3301311731339 | accuracy = 0.5769230769230769


Epoch[1] Batch[395] Speed: 1.252921366959417 samples/sec                   batch loss = 1079.4754754304886 | accuracy = 0.5772151898734177


Epoch[1] Batch[400] Speed: 1.254526335105191 samples/sec                   batch loss = 1093.938966870308 | accuracy = 0.573125


Epoch[1] Batch[405] Speed: 1.2641630477322592 samples/sec                   batch loss = 1107.7359577417374 | accuracy = 0.5734567901234567


Epoch[1] Batch[410] Speed: 1.2608943794911784 samples/sec                   batch loss = 1121.1685053110123 | accuracy = 0.573170731707317


Epoch[1] Batch[415] Speed: 1.257953470941889 samples/sec                   batch loss = 1134.8156403303146 | accuracy = 0.5734939759036145


Epoch[1] Batch[420] Speed: 1.2681368560331343 samples/sec                   batch loss = 1148.5717910528183 | accuracy = 0.5738095238095238


Epoch[1] Batch[425] Speed: 1.262799615617688 samples/sec                   batch loss = 1161.6733845472336 | accuracy = 0.5741176470588235


Epoch[1] Batch[430] Speed: 1.2650004143224522 samples/sec                   batch loss = 1176.3126329183578 | accuracy = 0.5732558139534883


Epoch[1] Batch[435] Speed: 1.2564876376449245 samples/sec                   batch loss = 1188.8704098463058 | accuracy = 0.5758620689655173


Epoch[1] Batch[440] Speed: 1.2566541252546777 samples/sec                   batch loss = 1202.243309378624 | accuracy = 0.5772727272727273


Epoch[1] Batch[445] Speed: 1.258746268244627 samples/sec                   batch loss = 1215.6896055936813 | accuracy = 0.5786516853932584


Epoch[1] Batch[450] Speed: 1.2605158213473049 samples/sec                   batch loss = 1228.9653648138046 | accuracy = 0.5788888888888889


Epoch[1] Batch[455] Speed: 1.2627681550786078 samples/sec                   batch loss = 1242.5816704034805 | accuracy = 0.5796703296703297


Epoch[1] Batch[460] Speed: 1.2563642826905306 samples/sec                   batch loss = 1255.7736600637436 | accuracy = 0.5798913043478261


Epoch[1] Batch[465] Speed: 1.2551694367147108 samples/sec                   batch loss = 1268.2281543016434 | accuracy = 0.5806451612903226


Epoch[1] Batch[470] Speed: 1.2641651433385224 samples/sec                   batch loss = 1280.177279472351 | accuracy = 0.5824468085106383


Epoch[1] Batch[475] Speed: 1.2633369678495914 samples/sec                   batch loss = 1293.396642923355 | accuracy = 0.5831578947368421


Epoch[1] Batch[480] Speed: 1.2626257943520105 samples/sec                   batch loss = 1306.8238008022308 | accuracy = 0.5822916666666667


Epoch[1] Batch[485] Speed: 1.2564698527132532 samples/sec                   batch loss = 1320.9062983989716 | accuracy = 0.581958762886598


Epoch[1] Batch[490] Speed: 1.2543378094506188 samples/sec                   batch loss = 1333.1945207118988 | accuracy = 0.5836734693877551


Epoch[1] Batch[495] Speed: 1.2600469178741929 samples/sec                   batch loss = 1346.2748081684113 | accuracy = 0.5843434343434344


Epoch[1] Batch[500] Speed: 1.2549715179663874 samples/sec                   batch loss = 1359.6120266914368 | accuracy = 0.5855


Epoch[1] Batch[505] Speed: 1.2580685533177767 samples/sec                   batch loss = 1372.1091845035553 | accuracy = 0.5856435643564356


Epoch[1] Batch[510] Speed: 1.2626847114535025 samples/sec                   batch loss = 1386.0891771316528 | accuracy = 0.5852941176470589


Epoch[1] Batch[515] Speed: 1.2631680391656106 samples/sec                   batch loss = 1399.2671778202057 | accuracy = 0.5859223300970874


Epoch[1] Batch[520] Speed: 1.261510353658224 samples/sec                   batch loss = 1413.0265657901764 | accuracy = 0.5865384615384616


Epoch[1] Batch[525] Speed: 1.2659979307535139 samples/sec                   batch loss = 1425.4709553718567 | accuracy = 0.5876190476190476


Epoch[1] Batch[530] Speed: 1.2585618537266439 samples/sec                   batch loss = 1438.2145783901215 | accuracy = 0.5886792452830188


Epoch[1] Batch[535] Speed: 1.260651644138485 samples/sec                   batch loss = 1451.547555923462 | accuracy = 0.5887850467289719


Epoch[1] Batch[540] Speed: 1.2579480946571822 samples/sec                   batch loss = 1464.636157989502 | accuracy = 0.5893518518518519


Epoch[1] Batch[545] Speed: 1.2594282117896036 samples/sec                   batch loss = 1476.0602178573608 | accuracy = 0.591743119266055


Epoch[1] Batch[550] Speed: 1.2568216926564038 samples/sec                   batch loss = 1489.8508813381195 | accuracy = 0.5913636363636363


Epoch[1] Batch[555] Speed: 1.260101998073782 samples/sec                   batch loss = 1503.901183128357 | accuracy = 0.5905405405405405


Epoch[1] Batch[560] Speed: 1.2587560900929735 samples/sec                   batch loss = 1516.7089421749115 | accuracy = 0.5915178571428571


Epoch[1] Batch[565] Speed: 1.2587829120750293 samples/sec                   batch loss = 1530.1230444908142 | accuracy = 0.5907079646017699


Epoch[1] Batch[570] Speed: 1.2557601839552337 samples/sec                   batch loss = 1542.666579246521 | accuracy = 0.5916666666666667


Epoch[1] Batch[575] Speed: 1.262261675657322 samples/sec                   batch loss = 1555.8360981941223 | accuracy = 0.5926086956521739


Epoch[1] Batch[580] Speed: 1.2555005361831988 samples/sec                   batch loss = 1570.2665483951569 | accuracy = 0.5918103448275862


Epoch[1] Batch[585] Speed: 1.2590025364330848 samples/sec                   batch loss = 1583.2535321712494 | accuracy = 0.5914529914529915


Epoch[1] Batch[590] Speed: 1.2577296865814258 samples/sec                   batch loss = 1594.7672117948532 | accuracy = 0.5932203389830508


Epoch[1] Batch[595] Speed: 1.255547796675569 samples/sec                   batch loss = 1609.2666515111923 | accuracy = 0.5928571428571429


Epoch[1] Batch[600] Speed: 1.2608836713946627 samples/sec                   batch loss = 1623.3616403341293 | accuracy = 0.5916666666666667


Epoch[1] Batch[605] Speed: 1.2556019204168032 samples/sec                   batch loss = 1636.5980783700943 | accuracy = 0.5913223140495868


Epoch[1] Batch[610] Speed: 1.2608428306785857 samples/sec                   batch loss = 1650.0726639032364 | accuracy = 0.5913934426229508


Epoch[1] Batch[615] Speed: 1.2591995550790451 samples/sec                   batch loss = 1662.4275003671646 | accuracy = 0.591869918699187


Epoch[1] Batch[620] Speed: 1.2546801051892549 samples/sec                   batch loss = 1675.0814324617386 | accuracy = 0.5919354838709677


Epoch[1] Batch[625] Speed: 1.2589444347936036 samples/sec                   batch loss = 1687.1059876680374 | accuracy = 0.5924


Epoch[1] Batch[630] Speed: 1.2571495196637175 samples/sec                   batch loss = 1699.0818498134613 | accuracy = 0.5936507936507937


Epoch[1] Batch[635] Speed: 1.2581957343986525 samples/sec                   batch loss = 1714.6654570102692 | accuracy = 0.5929133858267717


Epoch[1] Batch[640] Speed: 1.2595438478452783 samples/sec                   batch loss = 1727.779345035553 | accuracy = 0.59296875


Epoch[1] Batch[645] Speed: 1.25592798311058 samples/sec                   batch loss = 1741.2715864181519 | accuracy = 0.5930232558139535


Epoch[1] Batch[650] Speed: 1.2611944696691784 samples/sec                   batch loss = 1755.095857143402 | accuracy = 0.5938461538461538


Epoch[1] Batch[655] Speed: 1.2617088222294937 samples/sec                   batch loss = 1769.5492994785309 | accuracy = 0.5931297709923664


Epoch[1] Batch[660] Speed: 1.2559970899153423 samples/sec                   batch loss = 1784.3418643474579 | accuracy = 0.5924242424242424


Epoch[1] Batch[665] Speed: 1.2574102269597363 samples/sec                   batch loss = 1796.1307768821716 | accuracy = 0.593609022556391


Epoch[1] Batch[670] Speed: 1.2546360061913624 samples/sec                   batch loss = 1808.7233581542969 | accuracy = 0.594776119402985


Epoch[1] Batch[675] Speed: 1.2635643700071322 samples/sec                   batch loss = 1821.2662510871887 | accuracy = 0.5955555555555555


Epoch[1] Batch[680] Speed: 1.2872994964964375 samples/sec                   batch loss = 1835.4892284870148 | accuracy = 0.5944852941176471


Epoch[1] Batch[685] Speed: 1.290593794736919 samples/sec                   batch loss = 1849.2596399784088 | accuracy = 0.5945255474452554


Epoch[1] Batch[690] Speed: 1.2845274811228113 samples/sec                   batch loss = 1862.3721606731415 | accuracy = 0.5952898550724638


Epoch[1] Batch[695] Speed: 1.2921546650018365 samples/sec                   batch loss = 1875.0445189476013 | accuracy = 0.5953237410071942


Epoch[1] Batch[700] Speed: 1.2878287434731444 samples/sec                   batch loss = 1888.3194286823273 | accuracy = 0.5957142857142858


Epoch[1] Batch[705] Speed: 1.2846495429740252 samples/sec                   batch loss = 1901.0232971906662 | accuracy = 0.5960992907801419


Epoch[1] Batch[710] Speed: 1.2863212987897987 samples/sec                   batch loss = 1914.6988657712936 | accuracy = 0.596830985915493


Epoch[1] Batch[715] Speed: 1.2736697432135085 samples/sec                   batch loss = 1927.1497770547867 | accuracy = 0.5972027972027972


Epoch[1] Batch[720] Speed: 1.2777371130671975 samples/sec                   batch loss = 1941.2576948404312 | accuracy = 0.5961805555555556


Epoch[1] Batch[725] Speed: 1.2808472401101498 samples/sec                   batch loss = 1953.4351054430008 | accuracy = 0.596551724137931


Epoch[1] Batch[730] Speed: 1.2925906088418624 samples/sec                   batch loss = 1966.1232182979584 | accuracy = 0.5976027397260274


Epoch[1] Batch[735] Speed: 1.2818128177033818 samples/sec                   batch loss = 1980.7437689304352 | accuracy = 0.5962585034013606


Epoch[1] Batch[740] Speed: 1.282323252126178 samples/sec                   batch loss = 1993.775804758072 | accuracy = 0.5952702702702702


Epoch[1] Batch[745] Speed: 1.2853320805777997 samples/sec                   batch loss = 2008.6678502559662 | accuracy = 0.5942953020134228


Epoch[1] Batch[750] Speed: 1.2875830381365105 samples/sec                   batch loss = 2020.7342314720154 | accuracy = 0.5953333333333334


Epoch[1] Batch[755] Speed: 1.286616743072532 samples/sec                   batch loss = 2031.8411263227463 | accuracy = 0.5960264900662252


Epoch[1] Batch[760] Speed: 1.2857781145096472 samples/sec                   batch loss = 2045.7170161008835 | accuracy = 0.5960526315789474


Epoch[1] Batch[765] Speed: 1.2850264975379004 samples/sec                   batch loss = 2059.553958773613 | accuracy = 0.5970588235294118


Epoch[1] Batch[770] Speed: 1.2813509360187412 samples/sec                   batch loss = 2070.5029608011246 | accuracy = 0.5983766233766233


Epoch[1] Batch[775] Speed: 1.278463954897939 samples/sec                   batch loss = 2082.703154206276 | accuracy = 0.5993548387096774


Epoch[1] Batch[780] Speed: 1.2799088932039369 samples/sec                   batch loss = 2095.6181248426437 | accuracy = 0.5996794871794872


Epoch[1] Batch[785] Speed: 1.2893607321587188 samples/sec                   batch loss = 2108.27465736866 | accuracy = 0.6


[Epoch 1] training: accuracy=0.5999365482233503
[Epoch 1] time cost: 641.4754412174225
[Epoch 1] validation: validation accuracy=0.7133333333333334


Epoch[2] Batch[5] Speed: 1.2789420860216791 samples/sec                   batch loss = 12.036967515945435 | accuracy = 0.75


Epoch[2] Batch[10] Speed: 1.2806168987349404 samples/sec                   batch loss = 24.68270206451416 | accuracy = 0.675


Epoch[2] Batch[15] Speed: 1.282942883356552 samples/sec                   batch loss = 37.206409215927124 | accuracy = 0.6666666666666666


Epoch[2] Batch[20] Speed: 1.2840482119952605 samples/sec                   batch loss = 49.2651309967041 | accuracy = 0.6875


Epoch[2] Batch[25] Speed: 1.2803739373346037 samples/sec                   batch loss = 62.23650348186493 | accuracy = 0.68


Epoch[2] Batch[30] Speed: 1.2844640495949706 samples/sec                   batch loss = 74.97488737106323 | accuracy = 0.675


Epoch[2] Batch[35] Speed: 1.2854095823648704 samples/sec                   batch loss = 88.01727199554443 | accuracy = 0.6857142857142857


Epoch[2] Batch[40] Speed: 1.2822374981733873 samples/sec                   batch loss = 101.23856925964355 | accuracy = 0.66875


Epoch[2] Batch[45] Speed: 1.2857987096807817 samples/sec                   batch loss = 114.54265022277832 | accuracy = 0.6666666666666666


Epoch[2] Batch[50] Speed: 1.2839171266156182 samples/sec                   batch loss = 126.25193059444427 | accuracy = 0.67


Epoch[2] Batch[55] Speed: 1.2879419417956306 samples/sec                   batch loss = 138.1210436820984 | accuracy = 0.6772727272727272


Epoch[2] Batch[60] Speed: 1.284417537122611 samples/sec                   batch loss = 150.37360167503357 | accuracy = 0.675


Epoch[2] Batch[65] Speed: 1.2881243862063056 samples/sec                   batch loss = 163.91044235229492 | accuracy = 0.6730769230769231


Epoch[2] Batch[70] Speed: 1.2804906176667727 samples/sec                   batch loss = 175.5074005126953 | accuracy = 0.6892857142857143


Epoch[2] Batch[75] Speed: 1.2833186405426937 samples/sec                   batch loss = 186.0700957775116 | accuracy = 0.7033333333333334


Epoch[2] Batch[80] Speed: 1.2868379958782137 samples/sec                   batch loss = 199.35654044151306 | accuracy = 0.7125


Epoch[2] Batch[85] Speed: 1.2850842754571825 samples/sec                   batch loss = 213.19908142089844 | accuracy = 0.7088235294117647


Epoch[2] Batch[90] Speed: 1.2852204235343145 samples/sec                   batch loss = 227.42381525039673 | accuracy = 0.7


Epoch[2] Batch[95] Speed: 1.2821953604338727 samples/sec                   batch loss = 240.49580097198486 | accuracy = 0.6973684210526315


Epoch[2] Batch[100] Speed: 1.2862612400741924 samples/sec                   batch loss = 253.7274134159088 | accuracy = 0.6925


Epoch[2] Batch[105] Speed: 1.2904530318679415 samples/sec                   batch loss = 267.5981879234314 | accuracy = 0.6857142857142857


Epoch[2] Batch[110] Speed: 1.2797931974798384 samples/sec                   batch loss = 279.1807607412338 | accuracy = 0.6863636363636364


Epoch[2] Batch[115] Speed: 1.2817934272508509 samples/sec                   batch loss = 291.9986823797226 | accuracy = 0.6847826086956522


Epoch[2] Batch[120] Speed: 1.280557860212846 samples/sec                   batch loss = 305.2679349184036 | accuracy = 0.68125


Epoch[2] Batch[125] Speed: 1.2817934272508509 samples/sec                   batch loss = 319.37909507751465 | accuracy = 0.678


Epoch[2] Batch[130] Speed: 1.2866906500200475 samples/sec                   batch loss = 330.29822909832 | accuracy = 0.6807692307692308


Epoch[2] Batch[135] Speed: 1.2837699576905415 samples/sec                   batch loss = 342.7529036998749 | accuracy = 0.6777777777777778


Epoch[2] Batch[140] Speed: 1.2927894148132326 samples/sec                   batch loss = 356.6579134464264 | accuracy = 0.6714285714285714


Epoch[2] Batch[145] Speed: 1.2871970767951975 samples/sec                   batch loss = 369.90887212753296 | accuracy = 0.6706896551724137


Epoch[2] Batch[150] Speed: 1.2829043289300799 samples/sec                   batch loss = 384.95347833633423 | accuracy = 0.6683333333333333


Epoch[2] Batch[155] Speed: 1.286203159065028 samples/sec                   batch loss = 396.1468942165375 | accuracy = 0.6741935483870968


Epoch[2] Batch[160] Speed: 1.284646591970729 samples/sec                   batch loss = 407.5582607984543 | accuracy = 0.6765625


Epoch[2] Batch[165] Speed: 1.2812632572229838 samples/sec                   batch loss = 420.3851988315582 | accuracy = 0.6757575757575758


Epoch[2] Batch[170] Speed: 1.2799764652764791 samples/sec                   batch loss = 432.48861706256866 | accuracy = 0.6764705882352942


Epoch[2] Batch[175] Speed: 1.2852080183936279 samples/sec                   batch loss = 445.52189886569977 | accuracy = 0.6728571428571428


Epoch[2] Batch[180] Speed: 1.28783032514668 samples/sec                   batch loss = 457.90264415740967 | accuracy = 0.6736111111111112


Epoch[2] Batch[185] Speed: 1.2839841399349436 samples/sec                   batch loss = 468.78099405765533 | accuracy = 0.672972972972973


Epoch[2] Batch[190] Speed: 1.2823628498219337 samples/sec                   batch loss = 482.54961705207825 | accuracy = 0.6697368421052632


Epoch[2] Batch[195] Speed: 1.28520280042981 samples/sec                   batch loss = 493.9205096960068 | accuracy = 0.6692307692307692


Epoch[2] Batch[200] Speed: 1.2860465932737033 samples/sec                   batch loss = 505.079093337059 | accuracy = 0.67125


Epoch[2] Batch[205] Speed: 1.2866137830245088 samples/sec                   batch loss = 519.463197350502 | accuracy = 0.6707317073170732


Epoch[2] Batch[210] Speed: 1.2863441797758046 samples/sec                   batch loss = 532.0289093255997 | accuracy = 0.6714285714285714


Epoch[2] Batch[215] Speed: 1.2859108614725387 samples/sec                   batch loss = 543.599213719368 | accuracy = 0.6732558139534883


Epoch[2] Batch[220] Speed: 1.2881698817520266 samples/sec                   batch loss = 557.3413888216019 | accuracy = 0.6693181818181818


Epoch[2] Batch[225] Speed: 1.2833101003932261 samples/sec                   batch loss = 570.0896487236023 | accuracy = 0.67


Epoch[2] Batch[230] Speed: 1.2786520070317715 samples/sec                   batch loss = 579.6756502389908 | accuracy = 0.6728260869565217


Epoch[2] Batch[235] Speed: 1.288353281165565 samples/sec                   batch loss = 591.0216135978699 | accuracy = 0.6734042553191489


Epoch[2] Batch[240] Speed: 1.2758873774863113 samples/sec                   batch loss = 604.0276226997375 | accuracy = 0.6739583333333333


Epoch[2] Batch[245] Speed: 1.2797291588831525 samples/sec                   batch loss = 613.9376889467239 | accuracy = 0.6775510204081633


Epoch[2] Batch[250] Speed: 1.2788254927610887 samples/sec                   batch loss = 625.2782679796219 | accuracy = 0.679


Epoch[2] Batch[255] Speed: 1.2860958856960105 samples/sec                   batch loss = 634.7839579582214 | accuracy = 0.6823529411764706


Epoch[2] Batch[260] Speed: 1.2812361535948456 samples/sec                   batch loss = 646.2900120019913 | accuracy = 0.6826923076923077


Epoch[2] Batch[265] Speed: 1.2773751204304753 samples/sec                   batch loss = 657.8595951795578 | accuracy = 0.6849056603773584


Epoch[2] Batch[270] Speed: 1.2811561214144966 samples/sec                   batch loss = 672.3138624429703 | accuracy = 0.6814814814814815


Epoch[2] Batch[275] Speed: 1.2829130598822185 samples/sec                   batch loss = 683.9624035358429 | accuracy = 0.6809090909090909


Epoch[2] Batch[280] Speed: 1.2789208324884551 samples/sec                   batch loss = 696.7897101640701 | accuracy = 0.6803571428571429


Epoch[2] Batch[285] Speed: 1.275779780599424 samples/sec                   batch loss = 710.8653688430786 | accuracy = 0.6771929824561403


Epoch[2] Batch[290] Speed: 1.2811814605952618 samples/sec                   batch loss = 720.7692623138428 | accuracy = 0.6801724137931034


Epoch[2] Batch[295] Speed: 1.280893103185273 samples/sec                   batch loss = 732.3634295463562 | accuracy = 0.6813559322033899


Epoch[2] Batch[300] Speed: 1.2810836314561427 samples/sec                   batch loss = 743.7039414644241 | accuracy = 0.68


Epoch[2] Batch[305] Speed: 1.2805729125640317 samples/sec                   batch loss = 757.9222456216812 | accuracy = 0.6778688524590164


Epoch[2] Batch[310] Speed: 1.2756866545747798 samples/sec                   batch loss = 769.2668554782867 | accuracy = 0.6782258064516129


Epoch[2] Batch[315] Speed: 1.2825183238296818 samples/sec                   batch loss = 781.1082628965378 | accuracy = 0.6777777777777778


Epoch[2] Batch[320] Speed: 1.2833404331633174 samples/sec                   batch loss = 796.7204560041428 | accuracy = 0.67578125


Epoch[2] Batch[325] Speed: 1.2924351724265968 samples/sec                   batch loss = 809.0454108715057 | accuracy = 0.6753846153846154


Epoch[2] Batch[330] Speed: 1.291783364166497 samples/sec                   batch loss = 823.1890050172806 | accuracy = 0.6727272727272727


Epoch[2] Batch[335] Speed: 1.2828495915112954 samples/sec                   batch loss = 833.4676995277405 | accuracy = 0.673134328358209


Epoch[2] Batch[340] Speed: 1.2840890956698734 samples/sec                   batch loss = 847.61585688591 | accuracy = 0.6713235294117647


Epoch[2] Batch[345] Speed: 1.2817665949659776 samples/sec                   batch loss = 862.9783842563629 | accuracy = 0.6695652173913044


Epoch[2] Batch[350] Speed: 1.2795848026912044 samples/sec                   batch loss = 873.375815153122 | accuracy = 0.6707142857142857


Epoch[2] Batch[355] Speed: 1.2742510349311331 samples/sec                   batch loss = 884.8887835741043 | accuracy = 0.6697183098591549


Epoch[2] Batch[360] Speed: 1.2717094193895404 samples/sec                   batch loss = 898.8236685991287 | accuracy = 0.66875


Epoch[2] Batch[365] Speed: 1.2770075008096795 samples/sec                   batch loss = 913.0483125448227 | accuracy = 0.6671232876712329


Epoch[2] Batch[370] Speed: 1.2729976968424266 samples/sec                   batch loss = 923.737745642662 | accuracy = 0.6668918918918919


Epoch[2] Batch[375] Speed: 1.2785457946238517 samples/sec                   batch loss = 935.6127842664719 | accuracy = 0.6673333333333333


Epoch[2] Batch[380] Speed: 1.2742514220552408 samples/sec                   batch loss = 946.9900958538055 | accuracy = 0.6677631578947368


Epoch[2] Batch[385] Speed: 1.2797192022380344 samples/sec                   batch loss = 959.8391044139862 | accuracy = 0.6681818181818182


Epoch[2] Batch[390] Speed: 1.2849013134666982 samples/sec                   batch loss = 970.2373225688934 | accuracy = 0.6711538461538461


Epoch[2] Batch[395] Speed: 1.283870850239545 samples/sec                   batch loss = 980.9907193183899 | accuracy = 0.6708860759493671


Epoch[2] Batch[400] Speed: 1.2811654155922017 samples/sec                   batch loss = 994.5755549669266 | accuracy = 0.67


Epoch[2] Batch[405] Speed: 1.281442933368957 samples/sec                   batch loss = 1009.4369133710861 | accuracy = 0.6679012345679012


Epoch[2] Batch[410] Speed: 1.2795606976974763 samples/sec                   batch loss = 1020.0470652580261 | accuracy = 0.6689024390243903


Epoch[2] Batch[415] Speed: 1.2793497445679998 samples/sec                   batch loss = 1032.169096827507 | accuracy = 0.6692771084337349


Epoch[2] Batch[420] Speed: 1.278427422688598 samples/sec                   batch loss = 1042.7379834651947 | accuracy = 0.6696428571428571


Epoch[2] Batch[425] Speed: 1.2795252737879614 samples/sec                   batch loss = 1054.4806512594223 | accuracy = 0.67


Epoch[2] Batch[430] Speed: 1.2766463099864613 samples/sec                   batch loss = 1065.9235925674438 | accuracy = 0.6697674418604651


Epoch[2] Batch[435] Speed: 1.2809521726240445 samples/sec                   batch loss = 1075.7536998987198 | accuracy = 0.6718390804597701


Epoch[2] Batch[440] Speed: 1.2864979571730597 samples/sec                   batch loss = 1085.774685382843 | accuracy = 0.6732954545454546


Epoch[2] Batch[445] Speed: 1.2836394201761312 samples/sec                   batch loss = 1098.2594819068909 | accuracy = 0.6735955056179775


Epoch[2] Batch[450] Speed: 1.2910118965172293 samples/sec                   batch loss = 1105.5779275894165 | accuracy = 0.6766666666666666


Epoch[2] Batch[455] Speed: 1.285433908232678 samples/sec                   batch loss = 1116.4860223531723 | accuracy = 0.6774725274725275


Epoch[2] Batch[460] Speed: 1.2866870975559626 samples/sec                   batch loss = 1127.0437850952148 | accuracy = 0.6782608695652174


Epoch[2] Batch[465] Speed: 1.2845303332301505 samples/sec                   batch loss = 1137.2938126325607 | accuracy = 0.678494623655914


Epoch[2] Batch[470] Speed: 1.2861222093971953 samples/sec                   batch loss = 1150.9916412830353 | accuracy = 0.6771276595744681


Epoch[2] Batch[475] Speed: 1.2871103735295508 samples/sec                   batch loss = 1165.563493013382 | accuracy = 0.6747368421052632


Epoch[2] Batch[480] Speed: 1.2895525985722176 samples/sec                   batch loss = 1175.9566580057144 | accuracy = 0.6755208333333333


Epoch[2] Batch[485] Speed: 1.2845476428429292 samples/sec                   batch loss = 1185.8569122552872 | accuracy = 0.6762886597938145


Epoch[2] Batch[490] Speed: 1.292068486680514 samples/sec                   batch loss = 1199.4295711517334 | accuracy = 0.6760204081632653


Epoch[2] Batch[495] Speed: 1.2863807713201432 samples/sec                   batch loss = 1210.4521998167038 | accuracy = 0.6757575757575758


Epoch[2] Batch[500] Speed: 1.287577603237805 samples/sec                   batch loss = 1220.9094092845917 | accuracy = 0.676


Epoch[2] Batch[505] Speed: 1.2901586012606125 samples/sec                   batch loss = 1232.9145699739456 | accuracy = 0.6767326732673268


Epoch[2] Batch[510] Speed: 1.2814387246912045 samples/sec                   batch loss = 1243.1831240653992 | accuracy = 0.6774509803921569


Epoch[2] Batch[515] Speed: 1.2810268974007408 samples/sec                   batch loss = 1254.6404206752777 | accuracy = 0.6776699029126214


Epoch[2] Batch[520] Speed: 1.2809617572519774 samples/sec                   batch loss = 1265.0159275531769 | accuracy = 0.6778846153846154


Epoch[2] Batch[525] Speed: 1.2869367058817933 samples/sec                   batch loss = 1277.0244485139847 | accuracy = 0.6780952380952381


Epoch[2] Batch[530] Speed: 1.2845411516834684 samples/sec                   batch loss = 1290.7394762039185 | accuracy = 0.6773584905660377


Epoch[2] Batch[535] Speed: 1.2842629786434034 samples/sec                   batch loss = 1302.444581747055 | accuracy = 0.6785046728971963


Epoch[2] Batch[540] Speed: 1.2906539608726946 samples/sec                   batch loss = 1315.2991116046906 | accuracy = 0.6782407407407407


Epoch[2] Batch[545] Speed: 1.291634585366924 samples/sec                   batch loss = 1326.9295434951782 | accuracy = 0.6788990825688074


Epoch[2] Batch[550] Speed: 1.2911458257978343 samples/sec                   batch loss = 1339.8850498199463 | accuracy = 0.6781818181818182


Epoch[2] Batch[555] Speed: 1.2852637449515534 samples/sec                   batch loss = 1351.764819085598 | accuracy = 0.6783783783783783


Epoch[2] Batch[560] Speed: 1.2855543693152383 samples/sec                   batch loss = 1360.8059436678886 | accuracy = 0.6794642857142857


Epoch[2] Batch[565] Speed: 1.2874263338334768 samples/sec                   batch loss = 1372.8443577885628 | accuracy = 0.6792035398230089


Epoch[2] Batch[570] Speed: 1.2916637218313995 samples/sec                   batch loss = 1382.208841741085 | accuracy = 0.6798245614035088


Epoch[2] Batch[575] Speed: 1.2877974073672085 samples/sec                   batch loss = 1396.0224897265434 | accuracy = 0.678695652173913


Epoch[2] Batch[580] Speed: 1.2850771882778143 samples/sec                   batch loss = 1407.2731476426125 | accuracy = 0.6788793103448276


Epoch[2] Batch[585] Speed: 1.2811018265450014 samples/sec                   batch loss = 1418.3925378918648 | accuracy = 0.6790598290598291


Epoch[2] Batch[590] Speed: 1.2742429053792075 samples/sec                   batch loss = 1432.4052503705025 | accuracy = 0.6783898305084746


Epoch[2] Batch[595] Speed: 1.2841994749370385 samples/sec                   batch loss = 1445.1752267479897 | accuracy = 0.6781512605042017


Epoch[2] Batch[600] Speed: 1.2741710988420358 samples/sec                   batch loss = 1455.0123363137245 | accuracy = 0.6795833333333333


Epoch[2] Batch[605] Speed: 1.2767645466296156 samples/sec                   batch loss = 1469.2819479107857 | accuracy = 0.6789256198347108


Epoch[2] Batch[610] Speed: 1.278553686865604 samples/sec                   batch loss = 1478.37897503376 | accuracy = 0.6799180327868852


Epoch[2] Batch[615] Speed: 1.2769028248133494 samples/sec                   batch loss = 1489.2558584213257 | accuracy = 0.6804878048780488


Epoch[2] Batch[620] Speed: 1.2757719225658921 samples/sec                   batch loss = 1499.179472208023 | accuracy = 0.6814516129032258


Epoch[2] Batch[625] Speed: 1.2788633150009339 samples/sec                   batch loss = 1509.6092946529388 | accuracy = 0.6812


Epoch[2] Batch[630] Speed: 1.2812151172731645 samples/sec                   batch loss = 1518.4471280574799 | accuracy = 0.6825396825396826


Epoch[2] Batch[635] Speed: 1.2827185545904516 samples/sec                   batch loss = 1532.4816644191742 | accuracy = 0.6822834645669291


Epoch[2] Batch[640] Speed: 1.2876290883922925 samples/sec                   batch loss = 1539.6015006899834 | accuracy = 0.68359375


Epoch[2] Batch[645] Speed: 1.2838984584616329 samples/sec                   batch loss = 1555.1760992407799 | accuracy = 0.6821705426356589


Epoch[2] Batch[650] Speed: 1.2857832386054033 samples/sec                   batch loss = 1566.2934097647667 | accuracy = 0.6826923076923077


Epoch[2] Batch[655] Speed: 1.2896310067055798 samples/sec                   batch loss = 1574.5726808905602 | accuracy = 0.684351145038168


Epoch[2] Batch[660] Speed: 1.2848789757922483 samples/sec                   batch loss = 1585.2610090374947 | accuracy = 0.684469696969697


Epoch[2] Batch[665] Speed: 1.2757006226117515 samples/sec                   batch loss = 1596.9313141703606 | accuracy = 0.6842105263157895


Epoch[2] Batch[670] Speed: 1.2836003328427605 samples/sec                   batch loss = 1607.2232927680016 | accuracy = 0.6843283582089552


Epoch[2] Batch[675] Speed: 1.279650876500369 samples/sec                   batch loss = 1620.0501998066902 | accuracy = 0.6837037037037037


Epoch[2] Batch[680] Speed: 1.279026327156649 samples/sec                   batch loss = 1629.891074359417 | accuracy = 0.6841911764705882


Epoch[2] Batch[685] Speed: 1.277846986933941 samples/sec                   batch loss = 1642.7796285748482 | accuracy = 0.6839416058394161


Epoch[2] Batch[690] Speed: 1.279370329437622 samples/sec                   batch loss = 1651.8127899765968 | accuracy = 0.6844202898550724


Epoch[2] Batch[695] Speed: 1.283450094652939 samples/sec                   batch loss = 1662.406202852726 | accuracy = 0.685251798561151


Epoch[2] Batch[700] Speed: 1.2840170595966967 samples/sec                   batch loss = 1674.8669031262398 | accuracy = 0.6839285714285714


Epoch[2] Batch[705] Speed: 1.2853635922481093 samples/sec                   batch loss = 1687.3902412056923 | accuracy = 0.6829787234042554


Epoch[2] Batch[710] Speed: 1.2908726315904482 samples/sec                   batch loss = 1698.0593233704567 | accuracy = 0.6838028169014084


Epoch[2] Batch[715] Speed: 1.2868655344461633 samples/sec                   batch loss = 1710.845091164112 | accuracy = 0.6832167832167833


Epoch[2] Batch[720] Speed: 1.2898152190701073 samples/sec                   batch loss = 1722.6029737591743 | accuracy = 0.6829861111111111


Epoch[2] Batch[725] Speed: 1.2902058281300623 samples/sec                   batch loss = 1732.2464460730553 | accuracy = 0.6841379310344827


Epoch[2] Batch[730] Speed: 1.280602529586789 samples/sec                   batch loss = 1747.5321303009987 | accuracy = 0.6839041095890411


Epoch[2] Batch[735] Speed: 1.2770527004860153 samples/sec                   batch loss = 1758.8338186144829 | accuracy = 0.6840136054421768


Epoch[2] Batch[740] Speed: 1.2773422487024984 samples/sec                   batch loss = 1770.257569015026 | accuracy = 0.6831081081081081


Epoch[2] Batch[745] Speed: 1.2845942630974747 samples/sec                   batch loss = 1782.1396747231483 | accuracy = 0.6838926174496645


Epoch[2] Batch[750] Speed: 1.28030173126586 samples/sec                   batch loss = 1796.744176208973 | accuracy = 0.6826666666666666


Epoch[2] Batch[755] Speed: 1.2820132200705499 samples/sec                   batch loss = 1808.0242587924004 | accuracy = 0.6831125827814569


Epoch[2] Batch[760] Speed: 1.2850752196307516 samples/sec                   batch loss = 1821.9709163308144 | accuracy = 0.6828947368421052


Epoch[2] Batch[765] Speed: 1.2792294680290155 samples/sec                   batch loss = 1831.3752695918083 | accuracy = 0.6836601307189543


Epoch[2] Batch[770] Speed: 1.283017055640331 samples/sec                   batch loss = 1842.19505661726 | accuracy = 0.6840909090909091


Epoch[2] Batch[775] Speed: 1.2745927634725274 samples/sec                   batch loss = 1855.0771998763084 | accuracy = 0.6838709677419355


Epoch[2] Batch[780] Speed: 1.2806881627449198 samples/sec                   batch loss = 1869.0116379857063 | accuracy = 0.6826923076923077


Epoch[2] Batch[785] Speed: 1.2797893901279827 samples/sec                   batch loss = 1878.8666794896126 | accuracy = 0.6837579617834395


[Epoch 2] training: accuracy=0.6833756345177665
[Epoch 2] time cost: 630.1503601074219
[Epoch 2] validation: validation accuracy=0.7222222222222222
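
The per-epoch summary lines are easy to lose among the per-batch output above. As a minimal sketch (plain Python, not part of MXNet), the snippet below pulls those summary lines out of the log text with regular expressions. The names `LOG`, `PATTERNS`, and `summarize` are hypothetical and used only for illustration; the log excerpt is copied from the output above, and the patterns assume the lines keep exactly this format.

```python
import re

# Summary lines copied from the training output above.
LOG = """\
[Epoch 1] training: accuracy=0.5999365482233503
[Epoch 1] time cost: 641.4754412174225
[Epoch 1] validation: validation accuracy=0.7133333333333334
[Epoch 2] training: accuracy=0.6833756345177665
[Epoch 2] time cost: 630.1503601074219
[Epoch 2] validation: validation accuracy=0.7222222222222222
"""

# One pattern per summary line type (assumed format, matching the log above).
PATTERNS = {
    "train_acc": re.compile(r"^\[Epoch (\d+)\] training: accuracy=([0-9.]+)$"),
    "seconds": re.compile(r"^\[Epoch (\d+)\] time cost: ([0-9.]+)$"),
    "val_acc": re.compile(
        r"^\[Epoch (\d+)\] validation: validation accuracy=([0-9.]+)$"
    ),
}


def summarize(log_text):
    """Collect the per-epoch summary values found in the log text."""
    epochs = {}
    for line in log_text.splitlines():
        line = line.strip()
        for key, pattern in PATTERNS.items():
            match = pattern.match(line)
            if match:
                epoch = int(match.group(1))
                epochs.setdefault(epoch, {})[key] = float(match.group(2))
    return epochs


summary = summarize(LOG)
for epoch, values in sorted(summary.items()):
    print(
        f"epoch {epoch}: train={values['train_acc']:.3f} "
        f"val={values['val_acc']:.3f} time={values['seconds']:.0f}s"
    )
```

Per-batch lines are ignored because they do not match any of the three patterns, so the full notebook output could be passed in unchanged. In the run shown here, both training and validation accuracy are higher in the second epoch than in the first.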


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).