<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure that your machine has at least one NVIDIA GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet; see the [installation instructions](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html) for your system.

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument when you create the array.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[09:37:26] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `ctx` shown in the output tells you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If fewer than two GPUs are available, the code falls back to `npx.cpu()` and copies `x` to main memory instead, which is why the output shown here carries no GPU context:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[09:37:26] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
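
As a minimal sketch of the 0-based indexing described above, the hypothetical helper `select_gpu_ids` (not part of MXNet) returns the ids you would pass to `npx.gpu(i)`, capped at the number of GPUs actually present. The GPU count is passed in as a plain integer so that the snippet runs on any machine; in practice you would likely pass `npx.num_gpus()`.

```python
def select_gpu_ids(num_available, num_requested):
    """Return the 0-based ids of the GPUs to use (hypothetical helper).

    The result is capped at num_available, so asking for more GPUs than
    the machine has yields fewer ids instead of an invalid index.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, the first id is 0 and the last is 7.
all_ids = select_gpu_ids(8, 8)
print(all_ids[0], all_ids[-1])

# Requesting two GPUs when only one is present returns just the first id.
print(select_gpu_ids(1, 2))
```

With MXNet, the ids could then be turned into devices with `[npx.gpu(i) for i in select_gpu_ids(npx.num_gpus(), 2)]`. An empty list means that no GPU is available, in which case `npx.cpu()` is the natural fallback.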

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to move parameters that are already loaded to a different device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[09:37:26] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 2.0092325, -4.848022 ]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
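
The splitting idea can be sketched in plain Python, without MXNet. In the sketch below, `split_batch` and `grad_sum` are hypothetical helpers written for illustration (they are not MXNet functions), and a one-parameter squared-error model stands in for a neural network. It suggests why data parallelism works: summing the gradients computed on each part and dividing by the full batch size gives the same result as processing the whole batch on a single device.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous, near-equal slices."""
    size, rem = divmod(len(batch), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        # The first `rem` parts each receive one extra sample.
        stop = start + size + (1 if i < rem else 0)
        parts.append(batch[start:stop])
        start = stop
    return parts


def grad_sum(w, samples):
    """Sum over the samples of d/dw (w * x - y) ** 2."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


# A toy batch of (x, y) pairs and a single model parameter w.
batch = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w = 0.5

# Each "device" handles one part of the batch independently.
shards = split_batch(batch, 2)
per_device = [grad_sum(w, shard) for shard in shards]

# Combine the per-device gradients and normalize by the full batch size.
combined = sum(per_device) / len(batch)
full = grad_sum(w, batch) / len(batch)

print(abs(combined - full) < 1e-9)
```

In the MXNet training loop, `gluon.utils.split_and_load` plays the role of the split step and also copies each part to its device.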

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.796626831880997 samples/sec                   batch loss = 15.023828506469727 | accuracy = 0.3


Epoch[1] Batch[10] Speed: 1.2822887531570517 samples/sec                   batch loss = 30.489282846450806 | accuracy = 0.325


Epoch[1] Batch[15] Speed: 1.2765528631562384 samples/sec                   batch loss = 44.15068864822388 | accuracy = 0.38333333333333336


Epoch[1] Batch[20] Speed: 1.2769402418384233 samples/sec                   batch loss = 59.38434720039368 | accuracy = 0.375


Epoch[1] Batch[25] Speed: 1.2762871681426027 samples/sec                   batch loss = 73.20957112312317 | accuracy = 0.4


Epoch[1] Batch[30] Speed: 1.281882549835341 samples/sec                   batch loss = 86.37341499328613 | accuracy = 0.4583333333333333


Epoch[1] Batch[35] Speed: 1.2819605177748716 samples/sec                   batch loss = 99.0113697052002 | accuracy = 0.5


Epoch[1] Batch[40] Speed: 1.275618758752326 samples/sec                   batch loss = 112.90507364273071 | accuracy = 0.5


Epoch[1] Batch[45] Speed: 1.2824923435191435 samples/sec                   batch loss = 127.99438333511353 | accuracy = 0.49444444444444446


Epoch[1] Batch[50] Speed: 1.2880021575696847 samples/sec                   batch loss = 140.9621901512146 | accuracy = 0.505


Epoch[1] Batch[55] Speed: 1.2889952934645008 samples/sec                   batch loss = 155.50482320785522 | accuracy = 0.509090909090909


Epoch[1] Batch[60] Speed: 1.2870626818383195 samples/sec                   batch loss = 170.5485715866089 | accuracy = 0.49583333333333335


Epoch[1] Batch[65] Speed: 1.2852127441332712 samples/sec                   batch loss = 184.61263799667358 | accuracy = 0.48846153846153845


Epoch[1] Batch[70] Speed: 1.2878010648151594 samples/sec                   batch loss = 198.7211720943451 | accuracy = 0.4857142857142857


Epoch[1] Batch[75] Speed: 1.2852336165657485 samples/sec                   batch loss = 212.33066034317017 | accuracy = 0.49666666666666665


Epoch[1] Batch[80] Speed: 1.2855055123853878 samples/sec                   batch loss = 226.30350065231323 | accuracy = 0.50625


Epoch[1] Batch[85] Speed: 1.2859916859642142 samples/sec                   batch loss = 239.84506177902222 | accuracy = 0.5117647058823529


Epoch[1] Batch[90] Speed: 1.2873426618499682 samples/sec                   batch loss = 253.83693385124207 | accuracy = 0.5111111111111111


Epoch[1] Batch[95] Speed: 1.2829126674773248 samples/sec                   batch loss = 267.8445837497711 | accuracy = 0.5026315789473684


Epoch[1] Batch[100] Speed: 1.2829941947496948 samples/sec                   batch loss = 281.13391375541687 | accuracy = 0.5175


Epoch[1] Batch[105] Speed: 1.2824795008110086 samples/sec                   batch loss = 295.12882709503174 | accuracy = 0.5142857142857142


Epoch[1] Batch[110] Speed: 1.291106677422031 samples/sec                   batch loss = 308.98426055908203 | accuracy = 0.5136363636363637


Epoch[1] Batch[115] Speed: 1.2852587234558268 samples/sec                   batch loss = 322.67653369903564 | accuracy = 0.5152173913043478


Epoch[1] Batch[120] Speed: 1.2891157296295712 samples/sec                   batch loss = 336.51906418800354 | accuracy = 0.5145833333333333


Epoch[1] Batch[125] Speed: 1.2887709222855097 samples/sec                   batch loss = 349.81298446655273 | accuracy = 0.522


Epoch[1] Batch[130] Speed: 1.2827247331252802 samples/sec                   batch loss = 363.27484822273254 | accuracy = 0.5307692307692308


Epoch[1] Batch[135] Speed: 1.2882331852489564 samples/sec                   batch loss = 376.80388498306274 | accuracy = 0.5314814814814814


Epoch[1] Batch[140] Speed: 1.285468970617486 samples/sec                   batch loss = 389.76513838768005 | accuracy = 0.5357142857142857


Epoch[1] Batch[145] Speed: 1.2779941638230767 samples/sec                   batch loss = 404.7666928768158 | accuracy = 0.5293103448275862


Epoch[1] Batch[150] Speed: 1.2783262150190615 samples/sec                   batch loss = 418.58613443374634 | accuracy = 0.53


Epoch[1] Batch[155] Speed: 1.280182838613425 samples/sec                   batch loss = 432.1659827232361 | accuracy = 0.532258064516129


Epoch[1] Batch[160] Speed: 1.282829188793981 samples/sec                   batch loss = 445.72558784484863 | accuracy = 0.534375


Epoch[1] Batch[165] Speed: 1.2568011679553015 samples/sec                   batch loss = 459.5230321884155 | accuracy = 0.5333333333333333


Epoch[1] Batch[170] Speed: 1.2525021351314225 samples/sec                   batch loss = 473.06875467300415 | accuracy = 0.5367647058823529


Epoch[1] Batch[175] Speed: 1.2549336876653359 samples/sec                   batch loss = 486.63896918296814 | accuracy = 0.54


Epoch[1] Batch[180] Speed: 1.249016069121507 samples/sec                   batch loss = 500.73171639442444 | accuracy = 0.5388888888888889


Epoch[1] Batch[185] Speed: 1.2518173950460842 samples/sec                   batch loss = 514.0126583576202 | accuracy = 0.5418918918918919


Epoch[1] Batch[190] Speed: 1.2491891394837937 samples/sec                   batch loss = 527.4937624931335 | accuracy = 0.5407894736842105


Epoch[1] Batch[195] Speed: 1.2557756929207258 samples/sec                   batch loss = 542.0154693126678 | accuracy = 0.5358974358974359


Epoch[1] Batch[200] Speed: 1.2543194288864168 samples/sec                   batch loss = 556.4968798160553 | accuracy = 0.5325


Epoch[1] Batch[205] Speed: 1.2516648860930235 samples/sec                   batch loss = 570.4217054843903 | accuracy = 0.5292682926829269


Epoch[1] Batch[210] Speed: 1.2543856389534773 samples/sec                   batch loss = 584.1222584247589 | accuracy = 0.5297619047619048


Epoch[1] Batch[215] Speed: 1.2488698197450796 samples/sec                   batch loss = 597.5464143753052 | accuracy = 0.5313953488372093


Epoch[1] Batch[220] Speed: 1.251381447063748 samples/sec                   batch loss = 610.8744683265686 | accuracy = 0.5340909090909091


Epoch[1] Batch[225] Speed: 1.2495358894529864 samples/sec                   batch loss = 624.7358326911926 | accuracy = 0.5333333333333333


Epoch[1] Batch[230] Speed: 1.2465830118713268 samples/sec                   batch loss = 638.383056640625 | accuracy = 0.5358695652173913


Epoch[1] Batch[235] Speed: 1.2472460875471179 samples/sec                   batch loss = 651.8793699741364 | accuracy = 0.5361702127659574


Epoch[1] Batch[240] Speed: 1.2525087740555372 samples/sec                   batch loss = 665.3453071117401 | accuracy = 0.5385416666666667


Epoch[1] Batch[245] Speed: 1.2531961431008023 samples/sec                   batch loss = 678.7674820423126 | accuracy = 0.539795918367347


Epoch[1] Batch[250] Speed: 1.2538242005208045 samples/sec                   batch loss = 691.931370973587 | accuracy = 0.544


Epoch[1] Batch[255] Speed: 1.2477699105316538 samples/sec                   batch loss = 705.4944961071014 | accuracy = 0.5450980392156862


Epoch[1] Batch[260] Speed: 1.2445951460598206 samples/sec                   batch loss = 717.8297364711761 | accuracy = 0.55


Epoch[1] Batch[265] Speed: 1.2448454992777511 samples/sec                   batch loss = 732.163382768631 | accuracy = 0.5490566037735849


Epoch[1] Batch[270] Speed: 1.2464161261600262 samples/sec                   batch loss = 745.6676335334778 | accuracy = 0.5509259259259259


Epoch[1] Batch[275] Speed: 1.2476018719215631 samples/sec                   batch loss = 759.2717940807343 | accuracy = 0.55


Epoch[1] Batch[280] Speed: 1.2442165293042704 samples/sec                   batch loss = 773.1891782283783 | accuracy = 0.5491071428571429


Epoch[1] Batch[285] Speed: 1.246906167375507 samples/sec                   batch loss = 786.5476443767548 | accuracy = 0.5491228070175439


Epoch[1] Batch[290] Speed: 1.2555107772205367 samples/sec                   batch loss = 800.2388741970062 | accuracy = 0.5491379310344827


Epoch[1] Batch[295] Speed: 1.25155153242034 samples/sec                   batch loss = 813.7529902458191 | accuracy = 0.5491525423728814


Epoch[1] Batch[300] Speed: 1.251983394661845 samples/sec                   batch loss = 826.703339099884 | accuracy = 0.5525


Epoch[1] Batch[305] Speed: 1.2468037734334707 samples/sec                   batch loss = 840.8761010169983 | accuracy = 0.55


Epoch[1] Batch[310] Speed: 1.2470902407618056 samples/sec                   batch loss = 854.3306565284729 | accuracy = 0.55


Epoch[1] Batch[315] Speed: 1.2446708601279346 samples/sec                   batch loss = 867.7215211391449 | accuracy = 0.5515873015873016


Epoch[1] Batch[320] Speed: 1.248436850982299 samples/sec                   batch loss = 882.149500131607 | accuracy = 0.54921875


Epoch[1] Batch[325] Speed: 1.2454514363707372 samples/sec                   batch loss = 895.7065649032593 | accuracy = 0.5484615384615384


Epoch[1] Batch[330] Speed: 1.2411495899272948 samples/sec                   batch loss = 909.149448633194 | accuracy = 0.5492424242424242


Epoch[1] Batch[335] Speed: 1.2414791207833302 samples/sec                   batch loss = 922.9924545288086 | accuracy = 0.5485074626865671


Epoch[1] Batch[340] Speed: 1.2514298914767585 samples/sec                   batch loss = 935.6626889705658 | accuracy = 0.55


Epoch[1] Batch[345] Speed: 1.2532229158835837 samples/sec                   batch loss = 949.5620832443237 | accuracy = 0.5485507246376812


Epoch[1] Batch[350] Speed: 1.2513237667008215 samples/sec                   batch loss = 962.6886234283447 | accuracy = 0.55


Epoch[1] Batch[355] Speed: 1.2542851074382253 samples/sec                   batch loss = 975.5060756206512 | accuracy = 0.5521126760563381


Epoch[1] Batch[360] Speed: 1.2443406481257329 samples/sec                   batch loss = 988.6725523471832 | accuracy = 0.5534722222222223


Epoch[1] Batch[365] Speed: 1.2431141130276044 samples/sec                   batch loss = 1002.6017928123474 | accuracy = 0.5534246575342465


Epoch[1] Batch[370] Speed: 1.2347443474761517 samples/sec                   batch loss = 1015.2455823421478 | accuracy = 0.5560810810810811


Epoch[1] Batch[375] Speed: 1.2363697462147396 samples/sec                   batch loss = 1028.7837438583374 | accuracy = 0.5566666666666666


Epoch[1] Batch[380] Speed: 1.2394018177314547 samples/sec                   batch loss = 1042.3758690357208 | accuracy = 0.5578947368421052


Epoch[1] Batch[385] Speed: 1.2488474158414073 samples/sec                   batch loss = 1056.0950610637665 | accuracy = 0.5584415584415584


Epoch[1] Batch[390] Speed: 1.247605490166858 samples/sec                   batch loss = 1069.4427292346954 | accuracy = 0.5583333333333333


Epoch[1] Batch[395] Speed: 1.2492282985397538 samples/sec                   batch loss = 1082.7691230773926 | accuracy = 0.5582278481012658


Epoch[1] Batch[400] Speed: 1.2574022166328407 samples/sec                   batch loss = 1097.0155248641968 | accuracy = 0.558125


Epoch[1] Batch[405] Speed: 1.2485946142259214 samples/sec                   batch loss = 1111.0000252723694 | accuracy = 0.5580246913580247


Epoch[1] Batch[410] Speed: 1.2505345095973206 samples/sec                   batch loss = 1125.032236814499 | accuracy = 0.5585365853658537


Epoch[1] Batch[415] Speed: 1.2484562672746968 samples/sec                   batch loss = 1139.0211279392242 | accuracy = 0.558433734939759


Epoch[1] Batch[420] Speed: 1.2517540708343267 samples/sec                   batch loss = 1151.8580174446106 | accuracy = 0.5589285714285714


Epoch[1] Batch[425] Speed: 1.2520487042690378 samples/sec                   batch loss = 1165.448002576828 | accuracy = 0.5594117647058824


Epoch[1] Batch[430] Speed: 1.2570071043925077 samples/sec                   batch loss = 1178.1030538082123 | accuracy = 0.5622093023255814


Epoch[1] Batch[435] Speed: 1.254054845973446 samples/sec                   batch loss = 1191.9855706691742 | accuracy = 0.5626436781609195


Epoch[1] Batch[440] Speed: 1.2478539004062223 samples/sec                   batch loss = 1206.176954984665 | accuracy = 0.5625


Epoch[1] Batch[445] Speed: 1.2507619880604723 samples/sec                   batch loss = 1219.7267820835114 | accuracy = 0.5623595505617978


Epoch[1] Batch[450] Speed: 1.2483361560729727 samples/sec                   batch loss = 1233.2063167095184 | accuracy = 0.5638888888888889


Epoch[1] Batch[455] Speed: 1.249865978426028 samples/sec                   batch loss = 1246.442881822586 | accuracy = 0.5648351648351648


Epoch[1] Batch[460] Speed: 1.254047159549117 samples/sec                   batch loss = 1259.8098485469818 | accuracy = 0.5641304347826087


Epoch[1] Batch[465] Speed: 1.2513468195412851 samples/sec                   batch loss = 1272.803035736084 | accuracy = 0.5650537634408602


Epoch[1] Batch[470] Speed: 1.2551345053407055 samples/sec                   batch loss = 1286.242508649826 | accuracy = 0.5648936170212766


Epoch[1] Batch[475] Speed: 1.2545616078880302 samples/sec                   batch loss = 1300.1606867313385 | accuracy = 0.5636842105263158


Epoch[1] Batch[480] Speed: 1.2498809696841648 samples/sec                   batch loss = 1312.758383989334 | accuracy = 0.5651041666666666


Epoch[1] Batch[485] Speed: 1.2500988956546342 samples/sec                   batch loss = 1326.0408446788788 | accuracy = 0.5649484536082474


Epoch[1] Batch[490] Speed: 1.2468051632838926 samples/sec                   batch loss = 1339.9617788791656 | accuracy = 0.5642857142857143


Epoch[1] Batch[495] Speed: 1.2503657255892096 samples/sec                   batch loss = 1352.945778131485 | accuracy = 0.5656565656565656


Epoch[1] Batch[500] Speed: 1.2490419196805733 samples/sec                   batch loss = 1366.4231765270233 | accuracy = 0.566


Epoch[1] Batch[505] Speed: 1.2513968480600615 samples/sec                   batch loss = 1378.4334049224854 | accuracy = 0.5693069306930693


Epoch[1] Batch[510] Speed: 1.2519108052831045 samples/sec                   batch loss = 1391.78559923172 | accuracy = 0.5700980392156862


Epoch[1] Batch[515] Speed: 1.2538361009121415 samples/sec                   batch loss = 1404.4219918251038 | accuracy = 0.5713592233009709


Epoch[1] Batch[520] Speed: 1.2585750716074011 samples/sec                   batch loss = 1417.099394083023 | accuracy = 0.5725961538461538


Epoch[1] Batch[525] Speed: 1.2507043647879779 samples/sec                   batch loss = 1430.4551877975464 | accuracy = 0.5728571428571428


Epoch[1] Batch[530] Speed: 1.251911272369315 samples/sec                   batch loss = 1443.2184386253357 | accuracy = 0.5731132075471698


Epoch[1] Batch[535] Speed: 1.2528544693516104 samples/sec                   batch loss = 1456.861367225647 | accuracy = 0.5728971962616822


Epoch[1] Batch[540] Speed: 1.2511066263651014 samples/sec                   batch loss = 1470.1138694286346 | accuracy = 0.5726851851851852


Epoch[1] Batch[545] Speed: 1.255071220653073 samples/sec                   batch loss = 1481.2114322185516 | accuracy = 0.5743119266055046


Epoch[1] Batch[550] Speed: 1.2532354601683617 samples/sec                   batch loss = 1495.1202583312988 | accuracy = 0.5745454545454546


Epoch[1] Batch[555] Speed: 1.249730980191531 samples/sec                   batch loss = 1508.4603924751282 | accuracy = 0.5752252252252252


Epoch[1] Batch[560] Speed: 1.2585330585905345 samples/sec                   batch loss = 1521.538002729416 | accuracy = 0.575


Epoch[1] Batch[565] Speed: 1.2586869626482307 samples/sec                   batch loss = 1535.1691677570343 | accuracy = 0.5743362831858407


Epoch[1] Batch[570] Speed: 1.2587219031672163 samples/sec                   batch loss = 1547.7329368591309 | accuracy = 0.5754385964912281


Epoch[1] Batch[575] Speed: 1.2535278880400396 samples/sec                   batch loss = 1560.2187314033508 | accuracy = 0.5760869565217391


Epoch[1] Batch[580] Speed: 1.250699889416205 samples/sec                   batch loss = 1574.532320022583 | accuracy = 0.5758620689655173


Epoch[1] Batch[585] Speed: 1.250315313583122 samples/sec                   batch loss = 1586.4637068510056 | accuracy = 0.5773504273504273


Epoch[1] Batch[590] Speed: 1.2474169054971738 samples/sec                   batch loss = 1599.2429648637772 | accuracy = 0.5775423728813559


Epoch[1] Batch[595] Speed: 1.2532605494913145 samples/sec                   batch loss = 1611.4707788228989 | accuracy = 0.5794117647058824


Epoch[1] Batch[600] Speed: 1.253609657414867 samples/sec                   batch loss = 1624.6633040904999 | accuracy = 0.5795833333333333


Epoch[1] Batch[605] Speed: 1.2632602024247326 samples/sec                   batch loss = 1636.4987089633942 | accuracy = 0.5809917355371901


Epoch[1] Batch[610] Speed: 1.254487593616156 samples/sec                   batch loss = 1649.0233249664307 | accuracy = 0.5819672131147541


Epoch[1] Batch[615] Speed: 1.2553656331602019 samples/sec                   batch loss = 1660.9506808519363 | accuracy = 0.5829268292682926


Epoch[1] Batch[620] Speed: 1.2533285203906612 samples/sec                   batch loss = 1675.1826039552689 | accuracy = 0.5818548387096775


Epoch[1] Batch[625] Speed: 1.2530649168448236 samples/sec                   batch loss = 1688.452266573906 | accuracy = 0.5808


Epoch[1] Batch[630] Speed: 1.248050880126022 samples/sec                   batch loss = 1702.6545034646988 | accuracy = 0.578968253968254


Epoch[1] Batch[635] Speed: 1.2545785883125362 samples/sec                   batch loss = 1715.7227982282639 | accuracy = 0.5795275590551181


Epoch[1] Batch[640] Speed: 1.2486907039571606 samples/sec                   batch loss = 1727.7573844194412 | accuracy = 0.5796875


Epoch[1] Batch[645] Speed: 1.2487349436599067 samples/sec                   batch loss = 1740.9655698537827 | accuracy = 0.5806201550387597


Epoch[1] Batch[650] Speed: 1.249934978440259 samples/sec                   batch loss = 1753.77141559124 | accuracy = 0.5815384615384616


Epoch[1] Batch[655] Speed: 1.2446787090643818 samples/sec                   batch loss = 1766.8356658220291 | accuracy = 0.5820610687022901


Epoch[1] Batch[660] Speed: 1.246219939797143 samples/sec                   batch loss = 1778.6838678121567 | accuracy = 0.5833333333333334


Epoch[1] Batch[665] Speed: 1.2516049386768475 samples/sec                   batch loss = 1788.9830104112625 | accuracy = 0.5857142857142857


Epoch[1] Batch[670] Speed: 1.250025872676628 samples/sec                   batch loss = 1799.9401619434357 | accuracy = 0.5869402985074627


Epoch[1] Batch[675] Speed: 1.2488916666492973 samples/sec                   batch loss = 1811.703308224678 | accuracy = 0.5877777777777777


Epoch[1] Batch[680] Speed: 1.243800335288727 samples/sec                   batch loss = 1823.492367863655 | accuracy = 0.5893382352941177


Epoch[1] Batch[685] Speed: 1.246231973992067 samples/sec                   batch loss = 1836.7719415426254 | accuracy = 0.5901459854014599


Epoch[1] Batch[690] Speed: 1.2474560462732796 samples/sec                   batch loss = 1850.6648082733154 | accuracy = 0.5902173913043478


Epoch[1] Batch[695] Speed: 1.2432834324407502 samples/sec                   batch loss = 1861.4528645277023 | accuracy = 0.5910071942446044


Epoch[1] Batch[700] Speed: 1.246353624979942 samples/sec                   batch loss = 1875.113946557045 | accuracy = 0.5910714285714286


Epoch[1] Batch[705] Speed: 1.2435798985165292 samples/sec                   batch loss = 1887.6076403856277 | accuracy = 0.5911347517730496


Epoch[1] Batch[710] Speed: 1.2450924419327367 samples/sec                   batch loss = 1899.2044755220413 | accuracy = 0.592605633802817


Epoch[1] Batch[715] Speed: 1.24049939938923 samples/sec                   batch loss = 1914.2022627592087 | accuracy = 0.5916083916083916


Epoch[1] Batch[720] Speed: 1.2473333453775248 samples/sec                   batch loss = 1927.2959088087082 | accuracy = 0.5909722222222222


Epoch[1] Batch[725] Speed: 1.2452202475831444 samples/sec                   batch loss = 1939.4303258657455 | accuracy = 0.593103448275862


Epoch[1] Batch[730] Speed: 1.2452276413374925 samples/sec                   batch loss = 1950.9960800409317 | accuracy = 0.5941780821917808


Epoch[1] Batch[735] Speed: 1.2463494584574721 samples/sec                   batch loss = 1962.8232468366623 | accuracy = 0.5952380952380952


Epoch[1] Batch[740] Speed: 1.244408947108739 samples/sec                   batch loss = 1975.2717934846878 | accuracy = 0.595945945945946


Epoch[1] Batch[745] Speed: 1.2465839381098398 samples/sec                   batch loss = 1988.1849662065506 | accuracy = 0.5956375838926175


Epoch[1] Batch[750] Speed: 1.2460413981656682 samples/sec                   batch loss = 2001.589326262474 | accuracy = 0.5953333333333334


Epoch[1] Batch[755] Speed: 1.2417932931146605 samples/sec                   batch loss = 2014.9376195669174 | accuracy = 0.5950331125827815


Epoch[1] Batch[760] Speed: 1.2453827455729987 samples/sec                   batch loss = 2027.8298687934875 | accuracy = 0.5950657894736842


Epoch[1] Batch[765] Speed: 1.2413329777905104 samples/sec                   batch loss = 2039.671788096428 | accuracy = 0.5957516339869281


Epoch[1] Batch[770] Speed: 1.2432450137083468 samples/sec                   batch loss = 2050.5615504980087 | accuracy = 0.5964285714285714


Epoch[1] Batch[775] Speed: 1.2432899739962546 samples/sec                   batch loss = 2062.6764413118362 | accuracy = 0.5967741935483871


Epoch[1] Batch[780] Speed: 1.2463013139963623 samples/sec                   batch loss = 2076.1804229021072 | accuracy = 0.596474358974359


Epoch[1] Batch[785] Speed: 1.2441750080888503 samples/sec                   batch loss = 2090.1225925683975 | accuracy = 0.5964968152866242


[Epoch 1] training: accuracy=0.5961294416243654
[Epoch 1] time cost: 645.6145884990692
[Epoch 1] validation: validation accuracy=0.6922222222222222


Epoch[2] Batch[5] Speed: 1.251343366227694 samples/sec                   batch loss = 12.429806232452393 | accuracy = 0.7


Epoch[2] Batch[10] Speed: 1.2522014002352708 samples/sec                   batch loss = 23.308876633644104 | accuracy = 0.725


Epoch[2] Batch[15] Speed: 1.254555040996285 samples/sec                   batch loss = 37.309513449668884 | accuracy = 0.6833333333333333


Epoch[2] Batch[20] Speed: 1.2523634818585798 samples/sec                   batch loss = 50.10183370113373 | accuracy = 0.6875


Epoch[2] Batch[25] Speed: 1.2594798340934257 samples/sec                   batch loss = 62.79719913005829 | accuracy = 0.66


Epoch[2] Batch[30] Speed: 1.2585661023008952 samples/sec                   batch loss = 74.00566852092743 | accuracy = 0.675


Epoch[2] Batch[35] Speed: 1.2544486669271189 samples/sec                   batch loss = 86.06416261196136 | accuracy = 0.6857142857142857


Epoch[2] Batch[40] Speed: 1.248592291156334 samples/sec                   batch loss = 97.14889168739319 | accuracy = 0.69375


Epoch[2] Batch[45] Speed: 1.2445893293844077 samples/sec                   batch loss = 108.52588939666748 | accuracy = 0.6888888888888889


Epoch[2] Batch[50] Speed: 1.2467419745502049 samples/sec                   batch loss = 120.95412409305573 | accuracy = 0.69


Epoch[2] Batch[55] Speed: 1.2557751289516326 samples/sec                   batch loss = 132.40898251533508 | accuracy = 0.6909090909090909


Epoch[2] Batch[60] Speed: 1.247327874017715 samples/sec                   batch loss = 144.32677447795868 | accuracy = 0.6875


Epoch[2] Batch[65] Speed: 1.2511962914294492 samples/sec                   batch loss = 158.7556449174881 | accuracy = 0.6730769230769231


Epoch[2] Batch[70] Speed: 1.250762640781673 samples/sec                   batch loss = 170.86998581886292 | accuracy = 0.6785714285714286


Epoch[2] Batch[75] Speed: 1.2523565640233203 samples/sec                   batch loss = 182.42513346672058 | accuracy = 0.6833333333333333


Epoch[2] Batch[80] Speed: 1.2528840344064096 samples/sec                   batch loss = 194.3407562971115 | accuracy = 0.68125


Epoch[2] Batch[85] Speed: 1.2523092630355266 samples/sec                   batch loss = 204.79554545879364 | accuracy = 0.6852941176470588


Epoch[2] Batch[90] Speed: 1.2471314932588717 samples/sec                   batch loss = 217.31427347660065 | accuracy = 0.6861111111111111


Epoch[2] Batch[95] Speed: 1.2572999761313175 samples/sec                   batch loss = 230.87040579319 | accuracy = 0.6815789473684211


Epoch[2] Batch[100] Speed: 1.2536847859908338 samples/sec                   batch loss = 242.4303959608078 | accuracy = 0.68


Epoch[2] Batch[105] Speed: 1.250882006968375 samples/sec                   batch loss = 254.2897001504898 | accuracy = 0.6833333333333333


Epoch[2] Batch[110] Speed: 1.2488106974886775 samples/sec                   batch loss = 265.9800560474396 | accuracy = 0.6863636363636364


Epoch[2] Batch[115] Speed: 1.2525273821193323 samples/sec                   batch loss = 279.82808554172516 | accuracy = 0.6826086956521739


Epoch[2] Batch[120] Speed: 1.2531616958883052 samples/sec                   batch loss = 292.42854130268097 | accuracy = 0.6791666666666667


Epoch[2] Batch[125] Speed: 1.249431574413204 samples/sec                   batch loss = 305.2334749698639 | accuracy = 0.678


Epoch[2] Batch[130] Speed: 1.2568061578438916 samples/sec                   batch loss = 316.84602892398834 | accuracy = 0.6807692307692308


Epoch[2] Batch[135] Speed: 1.2558098139932854 samples/sec                   batch loss = 329.31089425086975 | accuracy = 0.6759259259259259


Epoch[2] Batch[140] Speed: 1.2594842779694135 samples/sec                   batch loss = 342.9763469696045 | accuracy = 0.6696428571428571


Epoch[2] Batch[145] Speed: 1.2538717097315022 samples/sec                   batch loss = 355.25752210617065 | accuracy = 0.6672413793103448


Epoch[2] Batch[150] Speed: 1.2518060933527766 samples/sec                   batch loss = 369.724773645401 | accuracy = 0.665


Epoch[2] Batch[155] Speed: 1.2554664319148523 samples/sec                   batch loss = 380.7427022457123 | accuracy = 0.6661290322580645


Epoch[2] Batch[160] Speed: 1.252806662990759 samples/sec                   batch loss = 392.73044097423553 | accuracy = 0.671875


Epoch[2] Batch[165] Speed: 1.2585862126082354 samples/sec                   batch loss = 405.25255811214447 | accuracy = 0.6696969696969697


Epoch[2] Batch[170] Speed: 1.2521667273200052 samples/sec                   batch loss = 416.5829850435257 | accuracy = 0.6691176470588235


Epoch[2] Batch[175] Speed: 1.2555230854772939 samples/sec                   batch loss = 430.36199629306793 | accuracy = 0.6685714285714286


Epoch[2] Batch[180] Speed: 1.2516677808955297 samples/sec                   batch loss = 440.27871203422546 | accuracy = 0.6708333333333333


Epoch[2] Batch[185] Speed: 1.252705542573856 samples/sec                   batch loss = 453.2020888328552 | accuracy = 0.668918918918919


Epoch[2] Batch[190] Speed: 1.251033299039788 samples/sec                   batch loss = 465.2908775806427 | accuracy = 0.6657894736842105


Epoch[2] Batch[195] Speed: 1.260115153679193 samples/sec                   batch loss = 477.8037066459656 | accuracy = 0.6641025641025641


Epoch[2] Batch[200] Speed: 1.2555899864825808 samples/sec                   batch loss = 492.7035825252533 | accuracy = 0.66


Epoch[2] Batch[205] Speed: 1.2569790395805818 samples/sec                   batch loss = 505.4168030023575 | accuracy = 0.6585365853658537


Epoch[2] Batch[210] Speed: 1.2648288473461629 samples/sec                   batch loss = 517.8728076219559 | accuracy = 0.6595238095238095


Epoch[2] Batch[215] Speed: 1.2535625427154324 samples/sec                   batch loss = 532.2746583223343 | accuracy = 0.6558139534883721


Epoch[2] Batch[220] Speed: 1.2596311326173217 samples/sec                   batch loss = 545.6143780946732 | accuracy = 0.6545454545454545


Epoch[2] Batch[225] Speed: 1.2556169556175019 samples/sec                   batch loss = 559.7224771976471 | accuracy = 0.6555555555555556


Epoch[2] Batch[230] Speed: 1.2554373085243298 samples/sec                   batch loss = 570.673159122467 | accuracy = 0.657608695652174


Epoch[2] Batch[235] Speed: 1.2573796940253088 samples/sec                   batch loss = 580.8451937437057 | accuracy = 0.6606382978723404


Epoch[2] Batch[240] Speed: 1.2584133604769683 samples/sec                   batch loss = 595.0508712530136 | accuracy = 0.659375


Epoch[2] Batch[245] Speed: 1.2590969276013952 samples/sec                   batch loss = 608.0460859537125 | accuracy = 0.6561224489795918


Epoch[2] Batch[250] Speed: 1.2620830654660395 samples/sec                   batch loss = 621.0942540168762 | accuracy = 0.656


Epoch[2] Batch[255] Speed: 1.2556649767012293 samples/sec                   batch loss = 632.9361438751221 | accuracy = 0.6568627450980392


Epoch[2] Batch[260] Speed: 1.2623220784188591 samples/sec                   batch loss = 645.4427700042725 | accuracy = 0.6576923076923077


Epoch[2] Batch[265] Speed: 1.2541652784211819 samples/sec                   batch loss = 657.1847324371338 | accuracy = 0.6575471698113208


Epoch[2] Batch[270] Speed: 1.25831557986436 samples/sec                   batch loss = 667.6365699768066 | accuracy = 0.6592592592592592


Epoch[2] Batch[275] Speed: 1.2542485374702395 samples/sec                   batch loss = 676.7240791320801 | accuracy = 0.6636363636363637


Epoch[2] Batch[280] Speed: 1.255246630678455 samples/sec                   batch loss = 688.2363815307617 | accuracy = 0.6651785714285714


Epoch[2] Batch[285] Speed: 1.2525230807087202 samples/sec                   batch loss = 702.2599238157272 | accuracy = 0.6622807017543859


Epoch[2] Batch[290] Speed: 1.2520762690234484 samples/sec                   batch loss = 714.7781708240509 | accuracy = 0.6603448275862069


Epoch[2] Batch[295] Speed: 1.2540669382219463 samples/sec                   batch loss = 726.0888677835464 | accuracy = 0.6601694915254237


Epoch[2] Batch[300] Speed: 1.2558692248032364 samples/sec                   batch loss = 738.1717617511749 | accuracy = 0.6591666666666667


Epoch[2] Batch[305] Speed: 1.2603807853039555 samples/sec                   batch loss = 749.6645658016205 | accuracy = 0.6598360655737705


Epoch[2] Batch[310] Speed: 1.253011947412564 samples/sec                   batch loss = 761.0241984128952 | accuracy = 0.6629032258064517


Epoch[2] Batch[315] Speed: 1.2552557405681581 samples/sec                   batch loss = 772.2332335710526 | accuracy = 0.6619047619047619


Epoch[2] Batch[320] Speed: 1.2569539894982722 samples/sec                   batch loss = 784.017930150032 | accuracy = 0.66328125


Epoch[2] Batch[325] Speed: 1.2535872704659852 samples/sec                   batch loss = 794.5789102315903 | accuracy = 0.6646153846153846


Epoch[2] Batch[330] Speed: 1.2623818220561909 samples/sec                   batch loss = 807.2598642110825 | accuracy = 0.6651515151515152


Epoch[2] Batch[335] Speed: 1.2579420581817518 samples/sec                   batch loss = 819.760944724083 | accuracy = 0.6656716417910448


Epoch[2] Batch[340] Speed: 1.255757458176637 samples/sec                   batch loss = 834.0608006715775 | accuracy = 0.6647058823529411


Epoch[2] Batch[345] Speed: 1.264473940165987 samples/sec                   batch loss = 843.9892152547836 | accuracy = 0.6673913043478261


Epoch[2] Batch[350] Speed: 1.2561945798413803 samples/sec                   batch loss = 857.5918251276016 | accuracy = 0.665


Epoch[2] Batch[355] Speed: 1.2557556723281105 samples/sec                   batch loss = 868.4256063699722 | accuracy = 0.6669014084507042


Epoch[2] Batch[360] Speed: 1.2609374032344278 samples/sec                   batch loss = 879.0943081378937 | accuracy = 0.6680555555555555


Epoch[2] Batch[365] Speed: 1.2558328444154145 samples/sec                   batch loss = 892.3732825517654 | accuracy = 0.6691780821917809


Epoch[2] Batch[370] Speed: 1.2626182875678666 samples/sec                   batch loss = 902.773319363594 | accuracy = 0.6695945945945946


Epoch[2] Batch[375] Speed: 1.253219733045299 samples/sec                   batch loss = 915.260933637619 | accuracy = 0.6693333333333333


Epoch[2] Batch[380] Speed: 1.258019121914408 samples/sec                   batch loss = 929.5655789375305 | accuracy = 0.6684210526315789


Epoch[2] Batch[385] Speed: 1.2629531389217017 samples/sec                   batch loss = 940.4383866786957 | accuracy = 0.6688311688311688


Epoch[2] Batch[390] Speed: 1.2572222467846714 samples/sec                   batch loss = 953.4350473880768 | accuracy = 0.6673076923076923


Epoch[2] Batch[395] Speed: 1.260410706644293 samples/sec                   batch loss = 966.4501595497131 | accuracy = 0.6670886075949367


Epoch[2] Batch[400] Speed: 1.2597619405715688 samples/sec                   batch loss = 978.3565669059753 | accuracy = 0.668125


Epoch[2] Batch[405] Speed: 1.2652292745111733 samples/sec                   batch loss = 989.0149029493332 | accuracy = 0.6685185185185185


Epoch[2] Batch[410] Speed: 1.2565049525594414 samples/sec                   batch loss = 1000.5373474359512 | accuracy = 0.6682926829268293


Epoch[2] Batch[415] Speed: 1.2590777458678437 samples/sec                   batch loss = 1011.8096808195114 | accuracy = 0.6686746987951807


Epoch[2] Batch[420] Speed: 1.2628485679118804 samples/sec                   batch loss = 1022.8018417358398 | accuracy = 0.6696428571428571


Epoch[2] Batch[425] Speed: 1.261715369340854 samples/sec                   batch loss = 1035.1321096420288 | accuracy = 0.6694117647058824


Epoch[2] Batch[430] Speed: 1.2646716264393398 samples/sec                   batch loss = 1045.4759097099304 | accuracy = 0.6703488372093023


Epoch[2] Batch[435] Speed: 1.2579601677818482 samples/sec                   batch loss = 1057.789824604988 | accuracy = 0.6706896551724137


Epoch[2] Batch[440] Speed: 1.2562997450516378 samples/sec                   batch loss = 1069.4736964702606 | accuracy = 0.6715909090909091


Epoch[2] Batch[445] Speed: 1.2772523949791845 samples/sec                   batch loss = 1081.9319376945496 | accuracy = 0.6713483146067416


Epoch[2] Batch[450] Speed: 1.2811732423727569 samples/sec                   batch loss = 1090.9354568719864 | accuracy = 0.6733333333333333


Epoch[2] Batch[455] Speed: 1.2834827905587618 samples/sec                   batch loss = 1103.27663064003 | accuracy = 0.6736263736263737


Epoch[2] Batch[460] Speed: 1.2887238993946448 samples/sec                   batch loss = 1110.815789937973 | accuracy = 0.6755434782608696


Epoch[2] Batch[465] Speed: 1.2866249326096977 samples/sec                   batch loss = 1121.9620826244354 | accuracy = 0.6758064516129032


Epoch[2] Batch[470] Speed: 1.2855756468809694 samples/sec                   batch loss = 1130.7271522283554 | accuracy = 0.6776595744680851


Epoch[2] Batch[475] Speed: 1.2871999407697705 samples/sec                   batch loss = 1144.27015042305 | accuracy = 0.6768421052631579


Epoch[2] Batch[480] Speed: 1.2873419703912856 samples/sec                   batch loss = 1155.1823824644089 | accuracy = 0.678125


Epoch[2] Batch[485] Speed: 1.290165645385193 samples/sec                   batch loss = 1164.7636477947235 | accuracy = 0.6788659793814433


Epoch[2] Batch[490] Speed: 1.2858059033649254 samples/sec                   batch loss = 1173.4090030193329 | accuracy = 0.6801020408163265


Epoch[2] Batch[495] Speed: 1.2853482301158186 samples/sec                   batch loss = 1186.1132888793945 | accuracy = 0.6797979797979798


Epoch[2] Batch[500] Speed: 1.280193193217177 samples/sec                   batch loss = 1194.870841741562 | accuracy = 0.681


Epoch[2] Batch[505] Speed: 1.2844590343460704 samples/sec                   batch loss = 1204.4135256409645 | accuracy = 0.6826732673267327


Epoch[2] Batch[510] Speed: 1.2831811285087404 samples/sec                   batch loss = 1212.93621134758 | accuracy = 0.6838235294117647


Epoch[2] Batch[515] Speed: 1.2848733668946688 samples/sec                   batch loss = 1225.766355395317 | accuracy = 0.6825242718446602


Epoch[2] Batch[520] Speed: 1.2879567727444619 samples/sec                   batch loss = 1239.0046021938324 | accuracy = 0.6822115384615385


Epoch[2] Batch[525] Speed: 1.289490156436961 samples/sec                   batch loss = 1253.1651604175568 | accuracy = 0.680952380952381


Epoch[2] Batch[530] Speed: 1.2843060389088354 samples/sec                   batch loss = 1268.5238444805145 | accuracy = 0.6792452830188679


Epoch[2] Batch[535] Speed: 1.284829678210865 samples/sec                   batch loss = 1280.0874512195587 | accuracy = 0.6794392523364486


Epoch[2] Batch[540] Speed: 1.2894063150814405 samples/sec                   batch loss = 1292.7273029088974 | accuracy = 0.6791666666666667


Epoch[2] Batch[545] Speed: 1.2893079194548351 samples/sec                   batch loss = 1305.630106329918 | accuracy = 0.6775229357798165


Epoch[2] Batch[550] Speed: 1.2888133943298778 samples/sec                   batch loss = 1316.0180052518845 | accuracy = 0.6781818181818182


Epoch[2] Batch[555] Speed: 1.28824554993947 samples/sec                   batch loss = 1326.376986503601 | accuracy = 0.6792792792792792


Epoch[2] Batch[560] Speed: 1.2818533632726579 samples/sec                   batch loss = 1338.2525433301926 | accuracy = 0.6799107142857143


Epoch[2] Batch[565] Speed: 1.2859193377106302 samples/sec                   batch loss = 1349.4382910728455 | accuracy = 0.6805309734513274


Epoch[2] Batch[570] Speed: 1.2795087823339306 samples/sec                   batch loss = 1362.9969727993011 | accuracy = 0.6793859649122806


Epoch[2] Batch[575] Speed: 1.2831955555824102 samples/sec                   batch loss = 1374.594249367714 | accuracy = 0.6791304347826087


Epoch[2] Batch[580] Speed: 1.2759469565174406 samples/sec                   batch loss = 1384.0883705615997 | accuracy = 0.6797413793103448


Epoch[2] Batch[585] Speed: 1.2791288159988081 samples/sec                   batch loss = 1395.978800535202 | accuracy = 0.6803418803418804


Epoch[2] Batch[590] Speed: 1.2784868494799817 samples/sec                   batch loss = 1404.7205144166946 | accuracy = 0.6813559322033899


Epoch[2] Batch[595] Speed: 1.2788328035866312 samples/sec                   batch loss = 1415.3219230175018 | accuracy = 0.6815126050420168


Epoch[2] Batch[600] Speed: 1.278792351400327 samples/sec                   batch loss = 1426.2916893959045 | accuracy = 0.6816666666666666


Epoch[2] Batch[605] Speed: 1.2793591101137538 samples/sec                   batch loss = 1437.2741760015488 | accuracy = 0.6818181818181818


Epoch[2] Batch[610] Speed: 1.2797166642944502 samples/sec                   batch loss = 1446.5902161598206 | accuracy = 0.6831967213114755


Epoch[2] Batch[615] Speed: 1.2753363896822 samples/sec                   batch loss = 1456.4703367948532 | accuracy = 0.6845528455284553


Epoch[2] Batch[620] Speed: 1.2777639715411957 samples/sec                   batch loss = 1469.2217742204666 | accuracy = 0.6834677419354839


Epoch[2] Batch[625] Speed: 1.2813297980684757 samples/sec                   batch loss = 1479.4535768032074 | accuracy = 0.684


Epoch[2] Batch[630] Speed: 1.282900208746748 samples/sec                   batch loss = 1492.8545988798141 | accuracy = 0.6833333333333333


Epoch[2] Batch[635] Speed: 1.2771232767012537 samples/sec                   batch loss = 1504.360562324524 | accuracy = 0.6826771653543308


Epoch[2] Batch[640] Speed: 1.2826895260807247 samples/sec                   batch loss = 1515.019785284996 | accuracy = 0.683203125


Epoch[2] Batch[645] Speed: 1.2744013535507082 samples/sec                   batch loss = 1524.8508160114288 | accuracy = 0.6844961240310078


Epoch[2] Batch[650] Speed: 1.2602346077886708 samples/sec                   batch loss = 1534.9473885297775 | accuracy = 0.6846153846153846


Epoch[2] Batch[655] Speed: 1.2493068100753433 samples/sec                   batch loss = 1544.1793069839478 | accuracy = 0.6854961832061068


Epoch[2] Batch[660] Speed: 1.2507687017967837 samples/sec                   batch loss = 1557.5457736253738 | accuracy = 0.6859848484848485


Epoch[2] Batch[665] Speed: 1.2547833277290008 samples/sec                   batch loss = 1567.7120428085327 | accuracy = 0.6860902255639098


Epoch[2] Batch[670] Speed: 1.2562440560922934 samples/sec                   batch loss = 1584.3382811546326 | accuracy = 0.6850746268656717


Epoch[2] Batch[675] Speed: 1.2484607266041643 samples/sec                   batch loss = 1593.5375186800957 | accuracy = 0.6862962962962963


Epoch[2] Batch[680] Speed: 1.254809229868039 samples/sec                   batch loss = 1606.314856827259 | accuracy = 0.6856617647058824


Epoch[2] Batch[685] Speed: 1.2562388825375983 samples/sec                   batch loss = 1617.9745813012123 | accuracy = 0.6854014598540146


Epoch[2] Batch[690] Speed: 1.2558693188121368 samples/sec                   batch loss = 1628.5318946242332 | accuracy = 0.6858695652173913


Epoch[2] Batch[695] Speed: 1.2517517359932373 samples/sec                   batch loss = 1639.7193230986595 | accuracy = 0.6863309352517986


Epoch[2] Batch[700] Speed: 1.2514124361211294 samples/sec                   batch loss = 1653.8787543177605 | accuracy = 0.6853571428571429


Epoch[2] Batch[705] Speed: 1.2499192409242892 samples/sec                   batch loss = 1670.348520576954 | accuracy = 0.6833333333333333


Epoch[2] Batch[710] Speed: 1.2531346449212475 samples/sec                   batch loss = 1682.6964256167412 | accuracy = 0.6838028169014084


Epoch[2] Batch[715] Speed: 1.2564673120498284 samples/sec                   batch loss = 1695.8216948509216 | accuracy = 0.6839160839160839


Epoch[2] Batch[720] Speed: 1.2542723545918435 samples/sec                   batch loss = 1707.9977707862854 | accuracy = 0.6840277777777778


Epoch[2] Batch[725] Speed: 1.2535191778418737 samples/sec                   batch loss = 1719.8979835510254 | accuracy = 0.6837931034482758


Epoch[2] Batch[730] Speed: 1.2647020378650398 samples/sec                   batch loss = 1733.1712406873703 | accuracy = 0.6832191780821918


Epoch[2] Batch[735] Speed: 1.2544163080245123 samples/sec                   batch loss = 1742.633981704712 | accuracy = 0.6843537414965987


Epoch[2] Batch[740] Speed: 1.2626318758628412 samples/sec                   batch loss = 1750.5940076708794 | accuracy = 0.6847972972972973


Epoch[2] Batch[745] Speed: 1.2619040312414442 samples/sec                   batch loss = 1760.9561948180199 | accuracy = 0.6852348993288591


Epoch[2] Batch[750] Speed: 1.2585070968924719 samples/sec                   batch loss = 1773.187005341053 | accuracy = 0.6856666666666666


Epoch[2] Batch[755] Speed: 1.2565984048738343 samples/sec                   batch loss = 1785.0357586741447 | accuracy = 0.686092715231788


Epoch[2] Batch[760] Speed: 1.2566523368548908 samples/sec                   batch loss = 1799.323617875576 | accuracy = 0.6855263157894737


Epoch[2] Batch[765] Speed: 1.2571910635300814 samples/sec                   batch loss = 1808.5532235503197 | accuracy = 0.6862745098039216


Epoch[2] Batch[770] Speed: 1.2540615951066438 samples/sec                   batch loss = 1821.132268846035 | accuracy = 0.6857142857142857


Epoch[2] Batch[775] Speed: 1.249106736758295 samples/sec                   batch loss = 1830.6087077260017 | accuracy = 0.6870967741935484


Epoch[2] Batch[780] Speed: 1.255081830251863 samples/sec                   batch loss = 1840.7225026488304 | accuracy = 0.6871794871794872


Epoch[2] Batch[785] Speed: 1.2544472599835952 samples/sec                   batch loss = 1852.1698707342148 | accuracy = 0.6872611464968152


[Epoch 2] training: accuracy=0.6875
[Epoch 2] time cost: 641.3668866157532
[Epoch 2] validation: validation accuracy=0.7488888888888889
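
The `accuracy` column in the log above appears to be a running value accumulated over the epoch rather than a per-batch value. For example, 0.7 after 5 batches and 0.725 after 10 batches are consistent with 14/20 and 29/40 correct predictions at a batch size of 4. This reading is inferred from the numbers and is not stated by the log itself. The sketch below shows how such a running accuracy could be computed in plain Python; the function name `running_accuracy` and the per-batch counts are hypothetical and are not taken from the training code.

```python
def running_accuracy(correct_per_batch, batch_size):
    """Return the cumulative accuracy after each batch.

    correct_per_batch: number of correct predictions in each batch
    batch_size: samples per batch (assumed constant for this sketch)
    """
    total_correct = 0
    total_seen = 0
    history = []
    for correct in correct_per_batch:
        total_correct += correct
        total_seen += batch_size
        history.append(total_correct / total_seen)
    return history


# Hypothetical counts for the first five batches, assuming a batch size of 4
history = running_accuracy([3, 2, 4, 3, 2], batch_size=4)
print(history[-1])  # 14 correct out of 20 samples
```

Because each value averages over every sample seen so far in the epoch, the number changes in smaller steps as the epoch progresses, which matches the behavior of the log.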


## Next steps

Now that you have trained a neural network and run predictions on a GPU, you have reached the end of the crash course. Congratulations!
If you want to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).