<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training neural networks and running inference on GPUs is often faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[10:32:53] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `ctx` shown in the output tells you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you only have one GPU, the code falls back to copying `x` to the CPU, which is what the output below shows:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[10:32:53] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
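
The zero-based numbering above can be sketched without MXNet. The helper below is hypothetical (`pick_gpu_ids` is not an MXNet function); it only computes which indices you would pass to `npx.gpu(i)`, assuming you already know how many GPUs are available, for example from `npx.num_gpus()`.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return the zero-based ids of the first GPUs to use.

    Hypothetical helper for illustration; not part of MXNet.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    # Never ask for more GPUs than the machine has.
    count = min(num_available, num_requested)
    return list(range(count))


# On a machine with eight GPUs, the valid ids run from 0 to 7.
all_ids = pick_gpu_ids(num_available=8, num_requested=8)
print(all_ids[0], all_ids[-1])

# Requesting two GPUs on a single-GPU machine yields only id 0.
print(pick_gpu_ids(num_available=1, num_requested=2))
```

Each returned id `i` would then map to a device with `npx.gpu(i)`. An empty list means no GPU is available, in which case you would fall back to `npx.cpu()`.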

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the previously defined `LeafNetwork` from [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device of parameters that are already loaded.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[10:32:53] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 9.125309, -5.445491]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, assuming there are *n* GPUs, you split each data batch into *n* parts and let each GPU run the forward and backward passes on its own part of the data.
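
The idea can be checked with a framework-free sketch. It assumes a toy one-parameter model `w * x` with a squared-error loss; `split_batch` and `grad_sum` are hypothetical helpers, not MXNet APIs. Each shard stands in for the part of the batch one GPU would process, and summing the per-shard gradients gives the same result as processing the whole batch on one device.

```python
def split_batch(batch, num_parts):
    """Split a list of samples into num_parts equally sized shards."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    size = len(batch) // num_parts
    return [batch[i * size:(i + 1) * size] for i in range(num_parts)]


def grad_sum(w, samples):
    """Sum of d/dw (w * x - y)**2 over the given (x, y) samples."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


w = 0.5
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 9.0)]

# One "device" processes the whole batch.
full_grad = grad_sum(w, batch)

# Two "devices" each process one shard; their gradients are then summed.
shards = split_batch(batch, num_parts=2)
sharded_grad = sum(grad_sum(w, shard) for shard in shards)

# Dividing by the full batch size gives the average gradient for the update.
print(full_grad / len(batch), sharded_grad / len(batch))
```

In the training loop later in this step, `gluon.utils.split_and_load` plays the role of the split, and `trainer.step(batch_size)` is passed the full batch size rather than the shard size.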

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
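
How a single metric can cover outputs from several devices can be sketched in plain Python. The snippet assumes each device returns a list of per-class scores for its shard; `count_correct` is a hypothetical helper, and the numbers are made up for illustration.

```python
def count_correct(labels, scores):
    """Count samples whose highest-scoring class matches the label."""
    correct = 0
    for label, row in zip(labels, scores):
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return correct


# Illustrative per-device shards: labels and two-class scores.
label_list = [[0, 1], [1, 0]]
outputs = [
    [[2.0, -1.0], [0.3, 0.9]],  # device 0: both predictions correct
    [[1.5, 0.2], [0.8, 0.1]],   # device 1: first wrong, second correct
]

# Accumulate over devices, then divide once by the total sample count.
correct = sum(count_correct(labels, scores)
              for labels, scores in zip(label_list, outputs))
total = sum(len(labels) for labels in label_list)
accuracy = correct / total
print(accuracy)
```

Dividing once at the end, rather than averaging per-device accuracies, keeps the result correct even when shards differ in size.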

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.779084056547484 samples/sec                   batch loss = 13.893839359283447 | accuracy = 0.6


Epoch[1] Batch[10] Speed: 1.2579253638560566 samples/sec                   batch loss = 28.327918529510498 | accuracy = 0.55


Epoch[1] Batch[15] Speed: 1.2547966540582804 samples/sec                   batch loss = 42.23283314704895 | accuracy = 0.55


Epoch[1] Batch[20] Speed: 1.2544280320717311 samples/sec                   batch loss = 55.841066122055054 | accuracy = 0.5875


Epoch[1] Batch[25] Speed: 1.2562183768672501 samples/sec                   batch loss = 71.20261549949646 | accuracy = 0.52


Epoch[1] Batch[30] Speed: 1.2580166693123394 samples/sec                   batch loss = 84.79775977134705 | accuracy = 0.55


Epoch[1] Batch[35] Speed: 1.2591519247769793 samples/sec                   batch loss = 98.95876145362854 | accuracy = 0.5285714285714286


Epoch[1] Batch[40] Speed: 1.2585619481390935 samples/sec                   batch loss = 112.20642232894897 | accuracy = 0.53125


Epoch[1] Batch[45] Speed: 1.2565586882945468 samples/sec                   batch loss = 126.09463143348694 | accuracy = 0.5333333333333333


Epoch[1] Batch[50] Speed: 1.2491363112035967 samples/sec                   batch loss = 139.75531148910522 | accuracy = 0.525


Epoch[1] Batch[55] Speed: 1.2582032830455971 samples/sec                   batch loss = 152.61923217773438 | accuracy = 0.55


Epoch[1] Batch[60] Speed: 1.2588211637099218 samples/sec                   batch loss = 167.58035039901733 | accuracy = 0.5416666666666666


Epoch[1] Batch[65] Speed: 1.255924880534463 samples/sec                   batch loss = 180.96926283836365 | accuracy = 0.5461538461538461


Epoch[1] Batch[70] Speed: 1.2563731265570135 samples/sec                   batch loss = 195.66354060173035 | accuracy = 0.5357142857142857


Epoch[1] Batch[75] Speed: 1.2626202830334226 samples/sec                   batch loss = 208.35689663887024 | accuracy = 0.5466666666666666


Epoch[1] Batch[80] Speed: 1.251755471743161 samples/sec                   batch loss = 223.47933650016785 | accuracy = 0.534375


Epoch[1] Batch[85] Speed: 1.258918550586721 samples/sec                   batch loss = 238.09435629844666 | accuracy = 0.5264705882352941


Epoch[1] Batch[90] Speed: 1.2596931755196947 samples/sec                   batch loss = 252.58704805374146 | accuracy = 0.5166666666666667


Epoch[1] Batch[95] Speed: 1.2574459448358812 samples/sec                   batch loss = 266.6777741909027 | accuracy = 0.5157894736842106


Epoch[1] Batch[100] Speed: 1.257569229505341 samples/sec                   batch loss = 280.9998290538788 | accuracy = 0.51


Epoch[1] Batch[105] Speed: 1.257633520550428 samples/sec                   batch loss = 294.9898376464844 | accuracy = 0.5119047619047619


Epoch[1] Batch[110] Speed: 1.2571218252781262 samples/sec                   batch loss = 308.7079927921295 | accuracy = 0.5113636363636364


Epoch[1] Batch[115] Speed: 1.2555683744232258 samples/sec                   batch loss = 322.29104495048523 | accuracy = 0.5173913043478261


Epoch[1] Batch[120] Speed: 1.2607963077036841 samples/sec                   batch loss = 336.5152587890625 | accuracy = 0.5166666666666667


Epoch[1] Batch[125] Speed: 1.2614893909784781 samples/sec                   batch loss = 350.3045246601105 | accuracy = 0.52


Epoch[1] Batch[130] Speed: 1.259963171128087 samples/sec                   batch loss = 364.0364887714386 | accuracy = 0.5211538461538462


Epoch[1] Batch[135] Speed: 1.2577052665589419 samples/sec                   batch loss = 378.19627141952515 | accuracy = 0.5222222222222223


Epoch[1] Batch[140] Speed: 1.255727663368425 samples/sec                   batch loss = 391.8351995944977 | accuracy = 0.5267857142857143


Epoch[1] Batch[145] Speed: 1.2541382778601509 samples/sec                   batch loss = 405.76047682762146 | accuracy = 0.5258620689655172


Epoch[1] Batch[150] Speed: 1.257667082710205 samples/sec                   batch loss = 418.9886429309845 | accuracy = 0.535


Epoch[1] Batch[155] Speed: 1.2571239918017119 samples/sec                   batch loss = 432.14765095710754 | accuracy = 0.5435483870967742


Epoch[1] Batch[160] Speed: 1.2591912384174397 samples/sec                   batch loss = 445.8019685745239 | accuracy = 0.546875


Epoch[1] Batch[165] Speed: 1.258542027426066 samples/sec                   batch loss = 459.8444356918335 | accuracy = 0.5454545454545454


Epoch[1] Batch[170] Speed: 1.2582370643503062 samples/sec                   batch loss = 473.30423617362976 | accuracy = 0.55


Epoch[1] Batch[175] Speed: 1.2554457635630285 samples/sec                   batch loss = 487.27900528907776 | accuracy = 0.5514285714285714


Epoch[1] Batch[180] Speed: 1.2545774625243087 samples/sec                   batch loss = 500.6699106693268 | accuracy = 0.55


Epoch[1] Batch[185] Speed: 1.2570268823617206 samples/sec                   batch loss = 513.7876086235046 | accuracy = 0.5513513513513514


Epoch[1] Batch[190] Speed: 1.2579498867469783 samples/sec                   batch loss = 528.6383955478668 | accuracy = 0.5447368421052632


Epoch[1] Batch[195] Speed: 1.2571679832653515 samples/sec                   batch loss = 541.912716627121 | accuracy = 0.5461538461538461


Epoch[1] Batch[200] Speed: 1.2609504815179975 samples/sec                   batch loss = 555.1733078956604 | accuracy = 0.55


Epoch[1] Batch[205] Speed: 1.2564112318737473 samples/sec                   batch loss = 569.2497148513794 | accuracy = 0.5487804878048781


Epoch[1] Batch[210] Speed: 1.260481822872015 samples/sec                   batch loss = 583.5297837257385 | accuracy = 0.5464285714285714


Epoch[1] Batch[215] Speed: 1.260928305458035 samples/sec                   batch loss = 596.5941789150238 | accuracy = 0.5534883720930233


Epoch[1] Batch[220] Speed: 1.2653103828176426 samples/sec                   batch loss = 610.1526176929474 | accuracy = 0.5545454545454546


Epoch[1] Batch[225] Speed: 1.2572093399327544 samples/sec                   batch loss = 623.5648100376129 | accuracy = 0.5544444444444444


Epoch[1] Batch[230] Speed: 1.257147635653228 samples/sec                   batch loss = 636.5460064411163 | accuracy = 0.5565217391304348


Epoch[1] Batch[235] Speed: 1.25629372438557 samples/sec                   batch loss = 649.7193415164948 | accuracy = 0.5574468085106383


Epoch[1] Batch[240] Speed: 1.2576922555056218 samples/sec                   batch loss = 663.7401652336121 | accuracy = 0.5520833333333334


Epoch[1] Batch[245] Speed: 1.260385519598641 samples/sec                   batch loss = 677.2514729499817 | accuracy = 0.5551020408163265


Epoch[1] Batch[250] Speed: 1.2566125227461764 samples/sec                   batch loss = 690.8680696487427 | accuracy = 0.556


Epoch[1] Batch[255] Speed: 1.2519825538100426 samples/sec                   batch loss = 704.6534433364868 | accuracy = 0.5568627450980392


Epoch[1] Batch[260] Speed: 1.2589173225303605 samples/sec                   batch loss = 718.394299030304 | accuracy = 0.5538461538461539


Epoch[1] Batch[265] Speed: 1.2594709464355263 samples/sec                   batch loss = 732.2215070724487 | accuracy = 0.5528301886792453


Epoch[1] Batch[270] Speed: 1.2582323461820073 samples/sec                   batch loss = 745.7694289684296 | accuracy = 0.5537037037037037


Epoch[1] Batch[275] Speed: 1.256225995868442 samples/sec                   batch loss = 759.7513320446014 | accuracy = 0.5545454545454546


Epoch[1] Batch[280] Speed: 1.2533221536573946 samples/sec                   batch loss = 773.1545805931091 | accuracy = 0.5526785714285715


Epoch[1] Batch[285] Speed: 1.2546399468267448 samples/sec                   batch loss = 787.4526569843292 | accuracy = 0.55


Epoch[1] Batch[290] Speed: 1.2563700217811933 samples/sec                   batch loss = 801.127053976059 | accuracy = 0.5482758620689655


Epoch[1] Batch[295] Speed: 1.254718483222909 samples/sec                   batch loss = 814.9561803340912 | accuracy = 0.5474576271186441


Epoch[1] Batch[300] Speed: 1.2581499726655683 samples/sec                   batch loss = 828.5425581932068 | accuracy = 0.5475


Epoch[1] Batch[305] Speed: 1.2561428502867258 samples/sec                   batch loss = 841.7822933197021 | accuracy = 0.5475409836065573


Epoch[1] Batch[310] Speed: 1.2587392797150965 samples/sec                   batch loss = 855.1281168460846 | accuracy = 0.5491935483870968


Epoch[1] Batch[315] Speed: 1.2544845919541743 samples/sec                   batch loss = 868.9138445854187 | accuracy = 0.55


Epoch[1] Batch[320] Speed: 1.259164777043193 samples/sec                   batch loss = 882.4509842395782 | accuracy = 0.55


Epoch[1] Batch[325] Speed: 1.2602028962885974 samples/sec                   batch loss = 896.4609527587891 | accuracy = 0.5484615384615384


Epoch[1] Batch[330] Speed: 1.257503059957172 samples/sec                   batch loss = 910.3303167819977 | accuracy = 0.5492424242424242


Epoch[1] Batch[335] Speed: 1.260576151847242 samples/sec                   batch loss = 924.087464094162 | accuracy = 0.5477611940298508


Epoch[1] Batch[340] Speed: 1.2600266662280626 samples/sec                   batch loss = 937.9796962738037 | accuracy = 0.5477941176470589


Epoch[1] Batch[345] Speed: 1.2541167157470465 samples/sec                   batch loss = 951.2536072731018 | accuracy = 0.5485507246376812


Epoch[1] Batch[350] Speed: 1.2590815254735397 samples/sec                   batch loss = 964.4617774486542 | accuracy = 0.5492857142857143


Epoch[1] Batch[355] Speed: 1.2606587486464267 samples/sec                   batch loss = 978.5028100013733 | accuracy = 0.547887323943662


Epoch[1] Batch[360] Speed: 1.2582912314572807 samples/sec                   batch loss = 992.2801158428192 | accuracy = 0.5479166666666667


Epoch[1] Batch[365] Speed: 1.2529220219363457 samples/sec                   batch loss = 1006.21102643013 | accuracy = 0.547945205479452


Epoch[1] Batch[370] Speed: 1.2598925864720352 samples/sec                   batch loss = 1019.8435757160187 | accuracy = 0.5486486486486486


Epoch[1] Batch[375] Speed: 1.2563679519391715 samples/sec                   batch loss = 1034.0384485721588 | accuracy = 0.546


Epoch[1] Batch[380] Speed: 1.2544073978951917 samples/sec                   batch loss = 1046.9642350673676 | accuracy = 0.5486842105263158


Epoch[1] Batch[385] Speed: 1.2555586962204481 samples/sec                   batch loss = 1060.6168735027313 | accuracy = 0.55


Epoch[1] Batch[390] Speed: 1.2570437412173094 samples/sec                   batch loss = 1075.1594524383545 | accuracy = 0.5493589743589744


Epoch[1] Batch[395] Speed: 1.258466787578537 samples/sec                   batch loss = 1087.4792709350586 | accuracy = 0.5518987341772152


Epoch[1] Batch[400] Speed: 1.256265785487348 samples/sec                   batch loss = 1101.767804145813 | accuracy = 0.549375


Epoch[1] Batch[405] Speed: 1.256491778124548 samples/sec                   batch loss = 1115.1739959716797 | accuracy = 0.5487654320987654


Epoch[1] Batch[410] Speed: 1.2594463642052491 samples/sec                   batch loss = 1128.8730165958405 | accuracy = 0.5481707317073171


Epoch[1] Batch[415] Speed: 1.2591188502982724 samples/sec                   batch loss = 1142.1375098228455 | accuracy = 0.55


Epoch[1] Batch[420] Speed: 1.2526244522405092 samples/sec                   batch loss = 1154.8977131843567 | accuracy = 0.5529761904761905


Epoch[1] Batch[425] Speed: 1.2539374976596882 samples/sec                   batch loss = 1168.0775599479675 | accuracy = 0.5541176470588235


Epoch[1] Batch[430] Speed: 1.2569210303847775 samples/sec                   batch loss = 1180.7899224758148 | accuracy = 0.5575581395348838


Epoch[1] Batch[435] Speed: 1.260498490419035 samples/sec                   batch loss = 1194.4054307937622 | accuracy = 0.5591954022988506


Epoch[1] Batch[440] Speed: 1.2593042788096955 samples/sec                   batch loss = 1206.932211637497 | accuracy = 0.5602272727272727


Epoch[1] Batch[445] Speed: 1.2600499462136765 samples/sec                   batch loss = 1220.759155511856 | accuracy = 0.5589887640449438


Epoch[1] Batch[450] Speed: 1.259654870871834 samples/sec                   batch loss = 1234.4098901748657 | accuracy = 0.5588888888888889


Epoch[1] Batch[455] Speed: 1.256884118229642 samples/sec                   batch loss = 1248.3567118644714 | accuracy = 0.5598901098901099


Epoch[1] Batch[460] Speed: 1.2558558756822737 samples/sec                   batch loss = 1262.8487186431885 | accuracy = 0.5581521739130435


Epoch[1] Batch[465] Speed: 1.2593381192619955 samples/sec                   batch loss = 1276.1510572433472 | accuracy = 0.5580645161290323


Epoch[1] Batch[470] Speed: 1.2529121973543185 samples/sec                   batch loss = 1288.5373525619507 | accuracy = 0.5606382978723404


Epoch[1] Batch[475] Speed: 1.2598226719355716 samples/sec                   batch loss = 1301.9056866168976 | accuracy = 0.5610526315789474


Epoch[1] Batch[480] Speed: 1.2513717399638697 samples/sec                   batch loss = 1315.497751235962 | accuracy = 0.5619791666666667


Epoch[1] Batch[485] Speed: 1.2532157077141504 samples/sec                   batch loss = 1330.848390340805 | accuracy = 0.5603092783505155


Epoch[1] Batch[490] Speed: 1.2606188697119798 samples/sec                   batch loss = 1345.0308730602264 | accuracy = 0.5586734693877551


Epoch[1] Batch[495] Speed: 1.255409783464991 samples/sec                   batch loss = 1358.5687217712402 | accuracy = 0.5585858585858586


Epoch[1] Batch[500] Speed: 1.2580232724935407 samples/sec                   batch loss = 1371.3506038188934 | accuracy = 0.5605


Epoch[1] Batch[505] Speed: 1.2592440699877598 samples/sec                   batch loss = 1384.0668952465057 | accuracy = 0.560891089108911


Epoch[1] Batch[510] Speed: 1.2616093903979204 samples/sec                   batch loss = 1397.6847231388092 | accuracy = 0.5612745098039216


Epoch[1] Batch[515] Speed: 1.2603545579556932 samples/sec                   batch loss = 1411.3693714141846 | accuracy = 0.5616504854368932


Epoch[1] Batch[520] Speed: 1.2593643988565957 samples/sec                   batch loss = 1425.3769500255585 | accuracy = 0.5610576923076923


Epoch[1] Batch[525] Speed: 1.2578008777728222 samples/sec                   batch loss = 1439.865980386734 | accuracy = 0.560952380952381


Epoch[1] Batch[530] Speed: 1.2504678666243414 samples/sec                   batch loss = 1453.5646374225616 | accuracy = 0.5617924528301886


Epoch[1] Batch[535] Speed: 1.2548129838899553 samples/sec                   batch loss = 1466.9032790660858 | accuracy = 0.5630841121495327


Epoch[1] Batch[540] Speed: 1.2543141773955861 samples/sec                   batch loss = 1480.627013206482 | accuracy = 0.5634259259259259


Epoch[1] Batch[545] Speed: 1.2574229494534024 samples/sec                   batch loss = 1493.4329447746277 | accuracy = 0.5628440366972477


Epoch[1] Batch[550] Speed: 1.2525465518084495 samples/sec                   batch loss = 1506.608605146408 | accuracy = 0.5636363636363636


Epoch[1] Batch[555] Speed: 1.2536690475884462 samples/sec                   batch loss = 1520.5886137485504 | accuracy = 0.5635135135135135


Epoch[1] Batch[560] Speed: 1.2551865275038823 samples/sec                   batch loss = 1534.1027238368988 | accuracy = 0.5647321428571429


Epoch[1] Batch[565] Speed: 1.2603881708192004 samples/sec                   batch loss = 1546.9003047943115 | accuracy = 0.5654867256637168


Epoch[1] Batch[570] Speed: 1.2641535223368086 samples/sec                   batch loss = 1560.1147468090057 | accuracy = 0.5649122807017544


Epoch[1] Batch[575] Speed: 1.260508434335997 samples/sec                   batch loss = 1572.772231578827 | accuracy = 0.5660869565217391


Epoch[1] Batch[580] Speed: 1.2621985248634937 samples/sec                   batch loss = 1585.971363544464 | accuracy = 0.565948275862069


Epoch[1] Batch[585] Speed: 1.2619916434467056 samples/sec                   batch loss = 1599.349336385727 | accuracy = 0.5658119658119658


Epoch[1] Batch[590] Speed: 1.260746282755066 samples/sec                   batch loss = 1612.9955070018768 | accuracy = 0.565677966101695


Epoch[1] Batch[595] Speed: 1.2594894782896449 samples/sec                   batch loss = 1625.5649123191833 | accuracy = 0.5663865546218487


Epoch[1] Batch[600] Speed: 1.2533486509287786 samples/sec                   batch loss = 1638.4113008975983 | accuracy = 0.5670833333333334


Epoch[1] Batch[605] Speed: 1.2601501735360003 samples/sec                   batch loss = 1651.8358883857727 | accuracy = 0.5673553719008264


Epoch[1] Batch[610] Speed: 1.2629696817405556 samples/sec                   batch loss = 1665.770580291748 | accuracy = 0.5663934426229508


Epoch[1] Batch[615] Speed: 1.261714040935953 samples/sec                   batch loss = 1679.5991597175598 | accuracy = 0.5654471544715447


Epoch[1] Batch[620] Speed: 1.2537767886061055 samples/sec                   batch loss = 1692.4227793216705 | accuracy = 0.5661290322580645


Epoch[1] Batch[625] Speed: 1.2612516413839474 samples/sec                   batch loss = 1707.090324640274 | accuracy = 0.5652


Epoch[1] Batch[630] Speed: 1.2606776943924272 samples/sec                   batch loss = 1720.0732035636902 | accuracy = 0.5662698412698413


Epoch[1] Batch[635] Speed: 1.2588578119011367 samples/sec                   batch loss = 1732.8718172311783 | accuracy = 0.5665354330708662


Epoch[1] Batch[640] Speed: 1.2609945516446797 samples/sec                   batch loss = 1745.5538421869278 | accuracy = 0.567578125


Epoch[1] Batch[645] Speed: 1.2586591060567003 samples/sec                   batch loss = 1758.1333659887314 | accuracy = 0.5678294573643411


Epoch[1] Batch[650] Speed: 1.2587964179423412 samples/sec                   batch loss = 1770.5845206975937 | accuracy = 0.5688461538461539


Epoch[1] Batch[655] Speed: 1.256286198634135 samples/sec                   batch loss = 1783.7852627038956 | accuracy = 0.5690839694656489


Epoch[1] Batch[660] Speed: 1.2596770021620034 samples/sec                   batch loss = 1798.5590234994888 | accuracy = 0.5693181818181818


Epoch[1] Batch[665] Speed: 1.2588447769906816 samples/sec                   batch loss = 1811.2180749177933 | accuracy = 0.5691729323308271


Epoch[1] Batch[670] Speed: 1.2614332409883158 samples/sec                   batch loss = 1824.5661715269089 | accuracy = 0.5694029850746268


Epoch[1] Batch[675] Speed: 1.2593911522323598 samples/sec                   batch loss = 1837.3125177621841 | accuracy = 0.5692592592592592


Epoch[1] Batch[680] Speed: 1.2599845562443646 samples/sec                   batch loss = 1849.6453510522842 | accuracy = 0.5702205882352941


Epoch[1] Batch[685] Speed: 1.255132439567651 samples/sec                   batch loss = 1863.1532872915268 | accuracy = 0.5708029197080292


Epoch[1] Batch[690] Speed: 1.260920060711576 samples/sec                   batch loss = 1876.4985226392746 | accuracy = 0.5717391304347826


Epoch[1] Batch[695] Speed: 1.2586338001203927 samples/sec                   batch loss = 1889.9434012174606 | accuracy = 0.5712230215827339


Epoch[1] Batch[700] Speed: 1.2604478262306806 samples/sec                   batch loss = 1903.2067974805832 | accuracy = 0.5717857142857142


Epoch[1] Batch[705] Speed: 1.2604866526272134 samples/sec                   batch loss = 1915.970827460289 | accuracy = 0.5723404255319149


Epoch[1] Batch[710] Speed: 1.25749825303909 samples/sec                   batch loss = 1928.110334277153 | accuracy = 0.5728873239436619


Epoch[1] Batch[715] Speed: 1.254789427774175 samples/sec                   batch loss = 1940.7804297208786 | accuracy = 0.5734265734265734


Epoch[1] Batch[720] Speed: 1.250295559894475 samples/sec                   batch loss = 1953.5678135156631 | accuracy = 0.5739583333333333


Epoch[1] Batch[725] Speed: 1.2504331032492508 samples/sec                   batch loss = 1966.9552561044693 | accuracy = 0.5737931034482758


Epoch[1] Batch[730] Speed: 1.2519621868558424 samples/sec                   batch loss = 1980.2497090101242 | accuracy = 0.5743150684931507


Epoch[1] Batch[735] Speed: 1.255004656555645 samples/sec                   batch loss = 1991.4346044063568 | accuracy = 0.5761904761904761


Epoch[1] Batch[740] Speed: 1.2511682055680855 samples/sec                   batch loss = 2004.0334417819977 | accuracy = 0.5763513513513514


Epoch[1] Batch[745] Speed: 1.2578886756238004 samples/sec                   batch loss = 2015.7528715133667 | accuracy = 0.5778523489932886


Epoch[1] Batch[750] Speed: 1.2491657010333834 samples/sec                   batch loss = 2028.3628520965576 | accuracy = 0.578


Epoch[1] Batch[755] Speed: 1.2537276001921107 samples/sec                   batch loss = 2041.3550565242767 | accuracy = 0.578476821192053


Epoch[1] Batch[760] Speed: 1.2544204348641397 samples/sec                   batch loss = 2053.4565930366516 | accuracy = 0.5792763157894737


Epoch[1] Batch[765] Speed: 1.2559837380760002 samples/sec                   batch loss = 2065.243881344795 | accuracy = 0.5800653594771242


Epoch[1] Batch[770] Speed: 1.2577301580197613 samples/sec                   batch loss = 2077.1053594350815 | accuracy = 0.5811688311688312


Epoch[1] Batch[775] Speed: 1.2578385039224382 samples/sec                   batch loss = 2085.889726281166 | accuracy = 0.5835483870967741


Epoch[1] Batch[780] Speed: 1.2529059284158348 samples/sec                   batch loss = 2097.8521217107773 | accuracy = 0.5836538461538462


Epoch[1] Batch[785] Speed: 1.253155050040779 samples/sec                   batch loss = 2110.141570687294 | accuracy = 0.5840764331210191


[Epoch 1] training: accuracy=0.583756345177665
[Epoch 1] time cost: 644.8289625644684
[Epoch 1] validation: validation accuracy=0.6822222222222222


Epoch[2] Batch[5] Speed: 1.2540287875020004 samples/sec                   batch loss = 12.303969621658325 | accuracy = 0.65


Epoch[2] Batch[10] Speed: 1.2506078716839322 samples/sec                   batch loss = 24.96553063392639 | accuracy = 0.675


Epoch[2] Batch[15] Speed: 1.2500883701328622 samples/sec                   batch loss = 38.265788316726685 | accuracy = 0.6333333333333333


Epoch[2] Batch[20] Speed: 1.2514967303884663 samples/sec                   batch loss = 50.50007224082947 | accuracy = 0.65


Epoch[2] Batch[25] Speed: 1.256552006367986 samples/sec                   batch loss = 62.82657015323639 | accuracy = 0.65


Epoch[2] Batch[30] Speed: 1.2486024198030543 samples/sec                   batch loss = 76.56475400924683 | accuracy = 0.6166666666666667


Epoch[2] Batch[35] Speed: 1.2519251916983438 samples/sec                   batch loss = 88.24485778808594 | accuracy = 0.6357142857142857


Epoch[2] Batch[40] Speed: 1.248072791232701 samples/sec                   batch loss = 98.6074743270874 | accuracy = 0.6625


Epoch[2] Batch[45] Speed: 1.246401958676479 samples/sec                   batch loss = 110.23847019672394 | accuracy = 0.6777777777777778


Epoch[2] Batch[50] Speed: 1.2482312054230138 samples/sec                   batch loss = 119.77240657806396 | accuracy = 0.69


Epoch[2] Batch[55] Speed: 1.246758651306449 samples/sec                   batch loss = 131.58920550346375 | accuracy = 0.6863636363636364


Epoch[2] Batch[60] Speed: 1.2447475069201417 samples/sec                   batch loss = 143.742631316185 | accuracy = 0.6875


Epoch[2] Batch[65] Speed: 1.251925004859663 samples/sec                   batch loss = 157.6032119989395 | accuracy = 0.6807692307692308


Epoch[2] Batch[70] Speed: 1.2556277623904175 samples/sec                   batch loss = 169.38401854038239 | accuracy = 0.6821428571428572


Epoch[2] Batch[75] Speed: 1.252831548038381 samples/sec                   batch loss = 183.03102362155914 | accuracy = 0.6766666666666666


Epoch[2] Batch[80] Speed: 1.2521353270914204 samples/sec                   batch loss = 194.3750776052475 | accuracy = 0.68125


Epoch[2] Batch[85] Speed: 1.2401568208634017 samples/sec                   batch loss = 207.51883924007416 | accuracy = 0.6823529411764706


Epoch[2] Batch[90] Speed: 1.2444876848339352 samples/sec                   batch loss = 218.73864090442657 | accuracy = 0.6833333333333333


Epoch[2] Batch[95] Speed: 1.2455684038176251 samples/sec                   batch loss = 229.94203460216522 | accuracy = 0.6894736842105263


Epoch[2] Batch[100] Speed: 1.2433188129418804 samples/sec                   batch loss = 240.10575878620148 | accuracy = 0.6875


Epoch[2] Batch[105] Speed: 1.245516990913641 samples/sec                   batch loss = 252.87171363830566 | accuracy = 0.6833333333333333


Epoch[2] Batch[110] Speed: 1.2484394521730395 samples/sec                   batch loss = 267.9990611076355 | accuracy = 0.6772727272727272


Epoch[2] Batch[115] Speed: 1.2513821004316492 samples/sec                   batch loss = 282.91153216362 | accuracy = 0.6739130434782609


Epoch[2] Batch[120] Speed: 1.246027886988352 samples/sec                   batch loss = 298.07673835754395 | accuracy = 0.66875


Epoch[2] Batch[125] Speed: 1.2463127016772744 samples/sec                   batch loss = 310.12142169475555 | accuracy = 0.672


Epoch[2] Batch[130] Speed: 1.2526676616997146 samples/sec                   batch loss = 323.14906990528107 | accuracy = 0.6692307692307692


Epoch[2] Batch[135] Speed: 1.252340765469908 samples/sec                   batch loss = 334.32966232299805 | accuracy = 0.6685185185185185


Epoch[2] Batch[140] Speed: 1.2544183714409318 samples/sec                   batch loss = 346.55856943130493 | accuracy = 0.6696428571428571


Epoch[2] Batch[145] Speed: 1.253191649878731 samples/sec                   batch loss = 359.0639958381653 | accuracy = 0.6706896551724137


Epoch[2] Batch[150] Speed: 1.2550713145425412 samples/sec                   batch loss = 372.16350078582764 | accuracy = 0.67


Epoch[2] Batch[155] Speed: 1.2556610296242974 samples/sec                   batch loss = 382.59514832496643 | accuracy = 0.6741935483870968


Epoch[2] Batch[160] Speed: 1.2566735156002058 samples/sec                   batch loss = 394.2079323530197 | accuracy = 0.6734375


Epoch[2] Batch[165] Speed: 1.25277008563634 samples/sec                   batch loss = 408.75967013835907 | accuracy = 0.6681818181818182


Epoch[2] Batch[170] Speed: 1.2542810752602318 samples/sec                   batch loss = 418.8090124130249 | accuracy = 0.6735294117647059


Epoch[2] Batch[175] Speed: 1.2527803757247338 samples/sec                   batch loss = 430.8806519508362 | accuracy = 0.6742857142857143


Epoch[2] Batch[180] Speed: 1.2526468048136796 samples/sec                   batch loss = 443.6649272441864 | accuracy = 0.6736111111111112


Epoch[2] Batch[185] Speed: 1.2550563862935573 samples/sec                   batch loss = 454.84197223186493 | accuracy = 0.677027027027027


Epoch[2] Batch[190] Speed: 1.2520235700085374 samples/sec                   batch loss = 467.0524207353592 | accuracy = 0.675


Epoch[2] Batch[195] Speed: 1.2505407548273255 samples/sec                   batch loss = 478.8289016485214 | accuracy = 0.6717948717948717


Epoch[2] Batch[200] Speed: 1.2507914544391827 samples/sec                   batch loss = 491.99474227428436 | accuracy = 0.6675


Epoch[2] Batch[205] Speed: 1.2577375125035581 samples/sec                   batch loss = 505.6198877096176 | accuracy = 0.6658536585365854


Epoch[2] Batch[210] Speed: 1.260762199353568 samples/sec                   batch loss = 515.5060091018677 | accuracy = 0.6714285714285714


Epoch[2] Batch[215] Speed: 1.2566653264007028 samples/sec                   batch loss = 526.2085744142532 | accuracy = 0.672093023255814


Epoch[2] Batch[220] Speed: 1.2544227796714795 samples/sec                   batch loss = 540.2259556055069 | accuracy = 0.6727272727272727


Epoch[2] Batch[225] Speed: 1.252523641760603 samples/sec                   batch loss = 553.1951186656952 | accuracy = 0.6733333333333333


Epoch[2] Batch[230] Speed: 1.2470584457557001 samples/sec                   batch loss = 563.6534128189087 | accuracy = 0.6728260869565217


Epoch[2] Batch[235] Speed: 1.2572878214540886 samples/sec                   batch loss = 576.5831003189087 | accuracy = 0.6712765957446809


Epoch[2] Batch[240] Speed: 1.248474197688734 samples/sec                   batch loss = 590.8348764181137 | accuracy = 0.6697916666666667


Epoch[2] Batch[245] Speed: 1.2475100311840033 samples/sec                   batch loss = 602.6521400213242 | accuracy = 0.6683673469387755


Epoch[2] Batch[250] Speed: 1.244150096689767 samples/sec                   batch loss = 618.4026168584824 | accuracy = 0.664


Epoch[2] Batch[255] Speed: 1.2456767916851605 samples/sec                   batch loss = 632.6865092515945 | accuracy = 0.6627450980392157


Epoch[2] Batch[260] Speed: 1.24665100143567 samples/sec                   batch loss = 647.5581119060516 | accuracy = 0.6605769230769231


Epoch[2] Batch[265] Speed: 1.2464909506308417 samples/sec                   batch loss = 658.2873878479004 | accuracy = 0.6622641509433962


Epoch[2] Batch[270] Speed: 1.2491112937356428 samples/sec                   batch loss = 669.2757343053818 | accuracy = 0.662962962962963


Epoch[2] Batch[275] Speed: 1.2493131360788827 samples/sec                   batch loss = 678.3972765207291 | accuracy = 0.6663636363636364


Epoch[2] Batch[280] Speed: 1.2436036809068338 samples/sec                   batch loss = 690.9849650859833 | accuracy = 0.6651785714285714


Epoch[2] Batch[285] Speed: 1.2458689213061636 samples/sec                   batch loss = 701.600606918335 | accuracy = 0.6666666666666666


Epoch[2] Batch[290] Speed: 1.2493536984474167 samples/sec                   batch loss = 712.2821278572083 | accuracy = 0.6681034482758621


Epoch[2] Batch[295] Speed: 1.2573146751246975 samples/sec                   batch loss = 724.8577592372894 | accuracy = 0.6677966101694915


Epoch[2] Batch[300] Speed: 1.2440027707752905 samples/sec                   batch loss = 735.6342942714691 | accuracy = 0.6675


Epoch[2] Batch[305] Speed: 1.2425055705405923 samples/sec                   batch loss = 748.0502083301544 | accuracy = 0.6655737704918033


Epoch[2] Batch[310] Speed: 1.2426801548230508 samples/sec                   batch loss = 760.7734153270721 | accuracy = 0.6669354838709678


Epoch[2] Batch[315] Speed: 1.242645086778499 samples/sec                   batch loss = 775.4969749450684 | accuracy = 0.6674603174603174


Epoch[2] Batch[320] Speed: 1.2497375897708913 samples/sec                   batch loss = 786.0904257297516 | accuracy = 0.66953125


Epoch[2] Batch[325] Speed: 1.242961138794676 samples/sec                   batch loss = 800.238950252533 | accuracy = 0.6676923076923077


Epoch[2] Batch[330] Speed: 1.2517563122899669 samples/sec                   batch loss = 814.2528033256531 | accuracy = 0.6651515151515152


Epoch[2] Batch[335] Speed: 1.2497703594222362 samples/sec                   batch loss = 824.7299246788025 | accuracy = 0.6671641791044776


Epoch[2] Batch[340] Speed: 1.2490532645083594 samples/sec                   batch loss = 842.0314993858337 | accuracy = 0.663235294117647


Epoch[2] Batch[345] Speed: 1.246041953425799 samples/sec                   batch loss = 854.4792015552521 | accuracy = 0.6615942028985508


Epoch[2] Batch[350] Speed: 1.2444907311594897 samples/sec                   batch loss = 867.2185142040253 | accuracy = 0.6621428571428571


Epoch[2] Batch[355] Speed: 1.2465957940843746 samples/sec                   batch loss = 879.5555322170258 | accuracy = 0.6619718309859155


Epoch[2] Batch[360] Speed: 1.249933488480935 samples/sec                   batch loss = 890.3411755561829 | accuracy = 0.6631944444444444


Epoch[2] Batch[365] Speed: 1.2468819805020106 samples/sec                   batch loss = 901.4952501058578 | accuracy = 0.6623287671232877


Epoch[2] Batch[370] Speed: 1.248646839111727 samples/sec                   batch loss = 914.3676201105118 | accuracy = 0.6608108108108108


Epoch[2] Batch[375] Speed: 1.244417254253194 samples/sec                   batch loss = 925.6965211629868 | accuracy = 0.6606666666666666


Epoch[2] Batch[380] Speed: 1.2459253596572677 samples/sec                   batch loss = 938.2439560890198 | accuracy = 0.6605263157894737


Epoch[2] Batch[385] Speed: 1.2510221047845964 samples/sec                   batch loss = 949.4791904687881 | accuracy = 0.6616883116883117


Epoch[2] Batch[390] Speed: 1.2477588674002742 samples/sec                   batch loss = 960.8542567491531 | accuracy = 0.6621794871794872


Epoch[2] Batch[395] Speed: 1.2479066202687714 samples/sec                   batch loss = 972.6776978969574 | accuracy = 0.6632911392405063


Epoch[2] Batch[400] Speed: 1.2474374030854292 samples/sec                   batch loss = 986.1517840623856 | accuracy = 0.6625


Epoch[2] Batch[405] Speed: 1.248317950978351 samples/sec                   batch loss = 998.5117764472961 | accuracy = 0.6623456790123456


Epoch[2] Batch[410] Speed: 1.2462986291263007 samples/sec                   batch loss = 1010.8921320438385 | accuracy = 0.6634146341463415


Epoch[2] Batch[415] Speed: 1.2495491045547142 samples/sec                   batch loss = 1020.8487224578857 | accuracy = 0.6650602409638554


Epoch[2] Batch[420] Speed: 1.249732283483336 samples/sec                   batch loss = 1031.7226732969284 | accuracy = 0.6666666666666666


Epoch[2] Batch[425] Speed: 1.2491055277698857 samples/sec                   batch loss = 1043.0308630466461 | accuracy = 0.6652941176470588


Epoch[2] Batch[430] Speed: 1.2482876722521061 samples/sec                   batch loss = 1054.7746840715408 | accuracy = 0.6651162790697674


Epoch[2] Batch[435] Speed: 1.253214022699487 samples/sec                   batch loss = 1066.9968843460083 | accuracy = 0.6660919540229885


Epoch[2] Batch[440] Speed: 1.2441106095529586 samples/sec                   batch loss = 1079.6326768398285 | accuracy = 0.6664772727272728


Epoch[2] Batch[445] Speed: 1.2424061061285667 samples/sec                   batch loss = 1091.6830492019653 | accuracy = 0.6668539325842696


Epoch[2] Batch[450] Speed: 1.2478221592571563 samples/sec                   batch loss = 1103.1786705255508 | accuracy = 0.6661111111111111


Epoch[2] Batch[455] Speed: 1.2463365887084266 samples/sec                   batch loss = 1113.030912399292 | accuracy = 0.6675824175824175


Epoch[2] Batch[460] Speed: 1.2474696811963004 samples/sec                   batch loss = 1124.5894567966461 | accuracy = 0.6673913043478261


Epoch[2] Batch[465] Speed: 1.2482751339458535 samples/sec                   batch loss = 1136.5318806171417 | accuracy = 0.6682795698924732


Epoch[2] Batch[470] Speed: 1.2427801234186184 samples/sec                   batch loss = 1150.7060297727585 | accuracy = 0.6680851063829787


Epoch[2] Batch[475] Speed: 1.244213207505052 samples/sec                   batch loss = 1160.4925241470337 | accuracy = 0.6694736842105263


Epoch[2] Batch[480] Speed: 1.2474405566203355 samples/sec                   batch loss = 1172.6414321660995 | accuracy = 0.6692708333333334


Epoch[2] Batch[485] Speed: 1.2498852529668283 samples/sec                   batch loss = 1184.6883473396301 | accuracy = 0.6695876288659793


Epoch[2] Batch[490] Speed: 1.2465654136010966 samples/sec                   batch loss = 1196.7338579893112 | accuracy = 0.6688775510204081


Epoch[2] Batch[495] Speed: 1.2470219252129469 samples/sec                   batch loss = 1208.1770459413528 | accuracy = 0.6686868686868687


Epoch[2] Batch[500] Speed: 1.2502884785354382 samples/sec                   batch loss = 1217.3645536899567 | accuracy = 0.67


Epoch[2] Batch[505] Speed: 1.2494476718510221 samples/sec                   batch loss = 1230.5544316768646 | accuracy = 0.6702970297029703


Epoch[2] Batch[510] Speed: 1.2406484654484553 samples/sec                   batch loss = 1246.7746996879578 | accuracy = 0.6696078431372549


Epoch[2] Batch[515] Speed: 1.2416319141992107 samples/sec                   batch loss = 1259.5465537309647 | accuracy = 0.6689320388349514


Epoch[2] Batch[520] Speed: 1.2446632883065847 samples/sec                   batch loss = 1271.307033598423 | accuracy = 0.66875


Epoch[2] Batch[525] Speed: 1.242282280930335 samples/sec                   batch loss = 1281.4440621733665 | accuracy = 0.670952380952381


Epoch[2] Batch[530] Speed: 1.2463033508020565 samples/sec                   batch loss = 1294.4560411572456 | accuracy = 0.6716981132075471


Epoch[2] Batch[535] Speed: 1.2372298973194262 samples/sec                   batch loss = 1306.938035428524 | accuracy = 0.6714953271028037


Epoch[2] Batch[540] Speed: 1.2323606376871847 samples/sec                   batch loss = 1319.483906686306 | accuracy = 0.6712962962962963


Epoch[2] Batch[545] Speed: 1.2369533222573972 samples/sec                   batch loss = 1329.657145678997 | accuracy = 0.673394495412844


Epoch[2] Batch[550] Speed: 1.242041692738077 samples/sec                   batch loss = 1340.7961454987526 | accuracy = 0.6736363636363636


Epoch[2] Batch[555] Speed: 1.2416111475219604 samples/sec                   batch loss = 1352.3001779913902 | accuracy = 0.6738738738738739


Epoch[2] Batch[560] Speed: 1.237640152994166 samples/sec                   batch loss = 1365.241120994091 | accuracy = 0.6736607142857143


Epoch[2] Batch[565] Speed: 1.235530079545108 samples/sec                   batch loss = 1374.5645642876625 | accuracy = 0.6752212389380531


Epoch[2] Batch[570] Speed: 1.2338884644497312 samples/sec                   batch loss = 1386.8178326487541 | accuracy = 0.6758771929824562


Epoch[2] Batch[575] Speed: 1.2476600447164765 samples/sec                   batch loss = 1398.228181540966 | accuracy = 0.6756521739130434


Epoch[2] Batch[580] Speed: 1.2392535087662928 samples/sec                   batch loss = 1409.227659523487 | accuracy = 0.6762931034482759


Epoch[2] Batch[585] Speed: 1.243202083353711 samples/sec                   batch loss = 1422.5210632681847 | accuracy = 0.6756410256410257


Epoch[2] Batch[590] Speed: 1.2403466094622657 samples/sec                   batch loss = 1434.709094464779 | accuracy = 0.6754237288135593


Epoch[2] Batch[595] Speed: 1.2402152182213264 samples/sec                   batch loss = 1446.9694470763206 | accuracy = 0.6752100840336135


Epoch[2] Batch[600] Speed: 1.2456927925159307 samples/sec                   batch loss = 1457.5504580140114 | accuracy = 0.67625


Epoch[2] Batch[605] Speed: 1.2448363551346187 samples/sec                   batch loss = 1469.127216041088 | accuracy = 0.6764462809917355


Epoch[2] Batch[610] Speed: 1.2429199775644608 samples/sec                   batch loss = 1481.7231637835503 | accuracy = 0.6770491803278689


Epoch[2] Batch[615] Speed: 1.2468184133468272 samples/sec                   batch loss = 1491.806378185749 | accuracy = 0.6784552845528455


Epoch[2] Batch[620] Speed: 1.2484044299072388 samples/sec                   batch loss = 1502.7134105563164 | accuracy = 0.6794354838709677


Epoch[2] Batch[625] Speed: 1.2453845944858166 samples/sec                   batch loss = 1512.9153081774712 | accuracy = 0.6812


Epoch[2] Batch[630] Speed: 1.2466959304897014 samples/sec                   batch loss = 1524.058805525303 | accuracy = 0.680952380952381


Epoch[2] Batch[635] Speed: 1.2380117626206284 samples/sec                   batch loss = 1534.086114704609 | accuracy = 0.6822834645669291


Epoch[2] Batch[640] Speed: 1.2420244982976332 samples/sec                   batch loss = 1545.781265437603 | accuracy = 0.68203125


Epoch[2] Batch[645] Speed: 1.2457726178293258 samples/sec                   batch loss = 1556.4481586813927 | accuracy = 0.6825581395348838


Epoch[2] Batch[650] Speed: 1.2395537338923628 samples/sec                   batch loss = 1568.1886630654335 | accuracy = 0.6823076923076923


Epoch[2] Batch[655] Speed: 1.2479244420747964 samples/sec                   batch loss = 1578.0768211483955 | accuracy = 0.683206106870229


Epoch[2] Batch[660] Speed: 1.2480156939839753 samples/sec                   batch loss = 1590.8696401715279 | accuracy = 0.6821969696969697


Epoch[2] Batch[665] Speed: 1.2429215429305072 samples/sec                   batch loss = 1602.246688902378 | accuracy = 0.6823308270676691


Epoch[2] Batch[670] Speed: 1.247689272330688 samples/sec                   batch loss = 1612.6171801686287 | accuracy = 0.6828358208955224


Epoch[2] Batch[675] Speed: 1.2472248544916875 samples/sec                   batch loss = 1623.1044906973839 | accuracy = 0.6833333333333333


Epoch[2] Batch[680] Speed: 1.2435896694660231 samples/sec                   batch loss = 1635.2133492827415 | accuracy = 0.6823529411764706


Epoch[2] Batch[685] Speed: 1.245140123402136 samples/sec                   batch loss = 1646.684849202633 | accuracy = 0.6821167883211678


Epoch[2] Batch[690] Speed: 1.2457859384604373 samples/sec                   batch loss = 1658.6572888493538 | accuracy = 0.6818840579710145


Epoch[2] Batch[695] Speed: 1.2476393542491624 samples/sec                   batch loss = 1670.164569914341 | accuracy = 0.6816546762589928


Epoch[2] Batch[700] Speed: 1.2446029940413832 samples/sec                   batch loss = 1680.0594565272331 | accuracy = 0.6825


Epoch[2] Batch[705] Speed: 1.2452019484187096 samples/sec                   batch loss = 1694.2453848719597 | accuracy = 0.6819148936170213


Epoch[2] Batch[710] Speed: 1.2491752809286591 samples/sec                   batch loss = 1707.1207453608513 | accuracy = 0.6816901408450704


Epoch[2] Batch[715] Speed: 1.2518527024769381 samples/sec                   batch loss = 1715.0927058458328 | accuracy = 0.6832167832167833


Epoch[2] Batch[720] Speed: 1.2569784745301245 samples/sec                   batch loss = 1729.8081978559494 | accuracy = 0.6819444444444445


Epoch[2] Batch[725] Speed: 1.2512265247956986 samples/sec                   batch loss = 1739.6902356147766 | accuracy = 0.6827586206896552


Epoch[2] Batch[730] Speed: 1.2468165601742325 samples/sec                   batch loss = 1753.0230071544647 | accuracy = 0.6818493150684931


Epoch[2] Batch[735] Speed: 1.2549547146479014 samples/sec                   batch loss = 1765.4564381241798 | accuracy = 0.6816326530612244


Epoch[2] Batch[740] Speed: 1.2460602772877338 samples/sec                   batch loss = 1776.3719990849495 | accuracy = 0.6810810810810811


Epoch[2] Batch[745] Speed: 1.2491138047373689 samples/sec                   batch loss = 1784.2203589081764 | accuracy = 0.6822147651006711


Epoch[2] Batch[750] Speed: 1.2491684912789773 samples/sec                   batch loss = 1794.2417910695076 | accuracy = 0.6826666666666666


Epoch[2] Batch[755] Speed: 1.253205129641606 samples/sec                   batch loss = 1804.830973803997 | accuracy = 0.6827814569536423


Epoch[2] Batch[760] Speed: 1.25195181678055 samples/sec                   batch loss = 1816.844179213047 | accuracy = 0.6832236842105263


Epoch[2] Batch[765] Speed: 1.247447049242407 samples/sec                   batch loss = 1827.6121797561646 | accuracy = 0.6830065359477124


Epoch[2] Batch[770] Speed: 1.2416628817294513 samples/sec                   batch loss = 1833.9655002355576 | accuracy = 0.6844155844155844


Epoch[2] Batch[775] Speed: 1.242120039209531 samples/sec                   batch loss = 1849.0692502260208 | accuracy = 0.6835483870967742


Epoch[2] Batch[780] Speed: 1.2424610350144074 samples/sec                   batch loss = 1862.171859383583 | accuracy = 0.6833333333333333


Epoch[2] Batch[785] Speed: 1.2390077785379467 samples/sec                   batch loss = 1872.8169717788696 | accuracy = 0.6843949044585987


[Epoch 2] training: accuracy=0.6849619289340102
[Epoch 2] time cost: 648.0971446037292
[Epoch 2] validation: validation accuracy=0.7422222222222222


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen on studying more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).