<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference quicker than with central processing units (CPUs).

## Prerequisites

Before you start the steps, make sure you have at least one Nvidia GPU on your machine and make sure that you have CUDA properly installed. GPUs from AMD and Intel are not supported. Additionally, you will need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `device` attribute specifieing which device an array is on. By default, arrays are stored on `npx.cpu()`. To change it to the first GPU, you can use the following code, `npx.gpu()` or `npx.gpu(0)` to indicate the first GPU.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), device=gpu)
x

[05:36:02] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], device=gpu(0))

If you're using a CPU, MXNet allocates data on the main memory and tries to use as many CPU cores as possible.  If there are multiple GPUs, MXNet will tell you which GPUs the ndarray is allocated on.

Assuming there is at least two GPUs. You can create another ndarray and assign it to a different GPU. If you only have one GPU, then you will get an error trying to run this code. In the example code here, you will copy `x` to the second GPU, `npx.gpu(1)`:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[05:36:02] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requries that users explicitly move data between devices. But several operators such as `print`, and `asnumpy`, will implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the input of an operation is already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` module support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), device=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], device=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy and move the input data and parameters to the GPU. To demonstrate this you can reuse the previously defined LeafNetwork in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below; additionally, you could use `net.collect_params().reset_device(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', device=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), device=gpu)
net(x)

[05:36:02] /work/mxnet/src/operator/cudnn_ops.cc:353: Auto-tuning cuDNN op, set MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable


[05:36:03] /work/mxnet/src/operator/cudnn_ops.cc:353: Auto-tuning cuDNN op, set MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable


array([[ 10.139104, -14.296497]], device=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. To elaborate on what data parallelism is, assume there are *n* GPUs, then you can split each data batch into *n* parts, and use a GPU on each of these parts to run the forward and backward passes on the seperate chunks of the data.

First copy the data definitions with the following commands, and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md).

In [8]:
# Import transforms as compose a series of transformations to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in the **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, device=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each devices.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.573228532411715 samples/sec                   batch loss = 12.921398162841797 | accuracy = 0.6


Epoch[1] Batch[10] Speed: 1.247850002283394 samples/sec                   batch loss = 26.740209341049194 | accuracy = 0.625


Epoch[1] Batch[15] Speed: 1.2476663540538075 samples/sec                   batch loss = 40.39805006980896 | accuracy = 0.6166666666666667


Epoch[1] Batch[20] Speed: 1.2479038356575618 samples/sec                   batch loss = 53.04903244972229 | accuracy = 0.65


Epoch[1] Batch[25] Speed: 1.2489754356934077 samples/sec                   batch loss = 66.34776759147644 | accuracy = 0.62


Epoch[1] Batch[30] Speed: 1.2458476425995249 samples/sec                   batch loss = 79.99511480331421 | accuracy = 0.6166666666666667


Epoch[1] Batch[35] Speed: 1.2457071288695165 samples/sec                   batch loss = 94.20178508758545 | accuracy = 0.6142857142857143


Epoch[1] Batch[40] Speed: 1.242016223068187 samples/sec                   batch loss = 106.74906802177429 | accuracy = 0.6125


Epoch[1] Batch[45] Speed: 1.2485523357124013 samples/sec                   batch loss = 121.09439873695374 | accuracy = 0.5944444444444444


Epoch[1] Batch[50] Speed: 1.243646823378332 samples/sec                   batch loss = 133.67806482315063 | accuracy = 0.605


Epoch[1] Batch[55] Speed: 1.2467139956613396 samples/sec                   batch loss = 148.512362241745 | accuracy = 0.5954545454545455


Epoch[1] Batch[60] Speed: 1.2476249733868965 samples/sec                   batch loss = 162.86343836784363 | accuracy = 0.5916666666666667


Epoch[1] Batch[65] Speed: 1.2412769547178348 samples/sec                   batch loss = 177.75258874893188 | accuracy = 0.5846153846153846


Epoch[1] Batch[70] Speed: 1.2365352282338846 samples/sec                   batch loss = 191.48908257484436 | accuracy = 0.5857142857142857


Epoch[1] Batch[75] Speed: 1.241435761132691 samples/sec                   batch loss = 204.80410766601562 | accuracy = 0.59


Epoch[1] Batch[80] Speed: 1.2384777542303285 samples/sec                   batch loss = 218.6534903049469 | accuracy = 0.584375


Epoch[1] Batch[85] Speed: 1.2375159067097474 samples/sec                   batch loss = 232.62955713272095 | accuracy = 0.5735294117647058


Epoch[1] Batch[90] Speed: 1.2407446205941728 samples/sec                   batch loss = 246.92934942245483 | accuracy = 0.5694444444444444


Epoch[1] Batch[95] Speed: 1.2503483930915955 samples/sec                   batch loss = 259.95805406570435 | accuracy = 0.5736842105263158


Epoch[1] Batch[100] Speed: 1.2510080189647899 samples/sec                   batch loss = 275.433557510376 | accuracy = 0.565


Epoch[1] Batch[105] Speed: 1.2474294265683061 samples/sec                   batch loss = 288.82896399497986 | accuracy = 0.5714285714285714


Epoch[1] Batch[110] Speed: 1.2470637293615914 samples/sec                   batch loss = 302.224662065506 | accuracy = 0.5727272727272728


Epoch[1] Batch[115] Speed: 1.2440323807590257 samples/sec                   batch loss = 316.73043942451477 | accuracy = 0.5739130434782609


Epoch[1] Batch[120] Speed: 1.2425313363523907 samples/sec                   batch loss = 330.2368085384369 | accuracy = 0.575


Epoch[1] Batch[125] Speed: 1.2427418278388067 samples/sec                   batch loss = 343.92630648612976 | accuracy = 0.572


Epoch[1] Batch[130] Speed: 1.2439179150789499 samples/sec                   batch loss = 357.68956184387207 | accuracy = 0.5711538461538461


Epoch[1] Batch[135] Speed: 1.2446142583761306 samples/sec                   batch loss = 371.22042417526245 | accuracy = 0.5740740740740741


Epoch[1] Batch[140] Speed: 1.2447700410256144 samples/sec                   batch loss = 384.7407877445221 | accuracy = 0.5714285714285714


Epoch[1] Batch[145] Speed: 1.2407642571606676 samples/sec                   batch loss = 398.5598855018616 | accuracy = 0.5706896551724138


Epoch[1] Batch[150] Speed: 1.2442076712124368 samples/sec                   batch loss = 412.3822286128998 | accuracy = 0.5683333333333334


Epoch[1] Batch[155] Speed: 1.2462190140994678 samples/sec                   batch loss = 426.221853017807 | accuracy = 0.567741935483871


Epoch[1] Batch[160] Speed: 1.2483513893152853 samples/sec                   batch loss = 439.46734285354614 | accuracy = 0.571875


Epoch[1] Batch[165] Speed: 1.2437218689110157 samples/sec                   batch loss = 453.73809123039246 | accuracy = 0.5651515151515152


Epoch[1] Batch[170] Speed: 1.2409103580472522 samples/sec                   batch loss = 467.7569217681885 | accuracy = 0.5632352941176471


Epoch[1] Batch[175] Speed: 1.2415844091508559 samples/sec                   batch loss = 481.95768117904663 | accuracy = 0.56


Epoch[1] Batch[180] Speed: 1.2448161276588987 samples/sec                   batch loss = 495.5177345275879 | accuracy = 0.5597222222222222


Epoch[1] Batch[185] Speed: 1.2462593757952907 samples/sec                   batch loss = 509.40370893478394 | accuracy = 0.5608108108108109


Epoch[1] Batch[190] Speed: 1.2413769732234654 samples/sec                   batch loss = 522.9807877540588 | accuracy = 0.5578947368421052


Epoch[1] Batch[195] Speed: 1.2357262820136046 samples/sec                   batch loss = 535.7462441921234 | accuracy = 0.5641025641025641


Epoch[1] Batch[200] Speed: 1.2398237768786373 samples/sec                   batch loss = 549.3937137126923 | accuracy = 0.5625


Epoch[1] Batch[205] Speed: 1.2434912295603344 samples/sec                   batch loss = 562.9309940338135 | accuracy = 0.5621951219512196


Epoch[1] Batch[210] Speed: 1.2440612541565808 samples/sec                   batch loss = 576.6637682914734 | accuracy = 0.5607142857142857


Epoch[1] Batch[215] Speed: 1.2450758097211436 samples/sec                   batch loss = 590.2072811126709 | accuracy = 0.5616279069767441


Epoch[1] Batch[220] Speed: 1.2419571043591786 samples/sec                   batch loss = 603.6356687545776 | accuracy = 0.5613636363636364


Epoch[1] Batch[225] Speed: 1.2446474062488848 samples/sec                   batch loss = 617.2630739212036 | accuracy = 0.5633333333333334


Epoch[1] Batch[230] Speed: 1.2461046084554577 samples/sec                   batch loss = 631.2444198131561 | accuracy = 0.5608695652173913


Epoch[1] Batch[235] Speed: 1.2439373755253782 samples/sec                   batch loss = 643.8625164031982 | accuracy = 0.5659574468085107


Epoch[1] Batch[240] Speed: 1.2482691899479352 samples/sec                   batch loss = 657.7369167804718 | accuracy = 0.565625


Epoch[1] Batch[245] Speed: 1.2461257108291737 samples/sec                   batch loss = 671.6635735034943 | accuracy = 0.5653061224489796


Epoch[1] Batch[250] Speed: 1.2403014033330315 samples/sec                   batch loss = 685.0621161460876 | accuracy = 0.567


Epoch[1] Batch[255] Speed: 1.2413465710360438 samples/sec                   batch loss = 698.4190909862518 | accuracy = 0.5686274509803921


Epoch[1] Batch[260] Speed: 1.2424752970771509 samples/sec                   batch loss = 711.6010711193085 | accuracy = 0.5701923076923077


Epoch[1] Batch[265] Speed: 1.24100288192709 samples/sec                   batch loss = 724.4408791065216 | accuracy = 0.5716981132075472


Epoch[1] Batch[270] Speed: 1.2339644243008943 samples/sec                   batch loss = 738.1809487342834 | accuracy = 0.5703703703703704


Epoch[1] Batch[275] Speed: 1.2379718420189108 samples/sec                   batch loss = 751.6024925708771 | accuracy = 0.5736363636363636


Epoch[1] Batch[280] Speed: 1.240223377787713 samples/sec                   batch loss = 765.7248542308807 | accuracy = 0.5696428571428571


Epoch[1] Batch[285] Speed: 1.2444668225203388 samples/sec                   batch loss = 779.7595913410187 | accuracy = 0.5675438596491228


Epoch[1] Batch[290] Speed: 1.2479251846610941 samples/sec                   batch loss = 793.06907081604 | accuracy = 0.5698275862068966


Epoch[1] Batch[295] Speed: 1.2476819421019671 samples/sec                   batch loss = 806.408988237381 | accuracy = 0.5711864406779661


Epoch[1] Batch[300] Speed: 1.2507017541505514 samples/sec                   batch loss = 819.5428602695465 | accuracy = 0.5708333333333333


Epoch[1] Batch[305] Speed: 1.2497218572250082 samples/sec                   batch loss = 833.2020003795624 | accuracy = 0.5704918032786885


Epoch[1] Batch[310] Speed: 1.2478013706077802 samples/sec                   batch loss = 847.2846875190735 | accuracy = 0.5685483870967742


Epoch[1] Batch[315] Speed: 1.2458542111661912 samples/sec                   batch loss = 860.7817754745483 | accuracy = 0.5706349206349206


Epoch[1] Batch[320] Speed: 1.2438696816210593 samples/sec                   batch loss = 873.4943671226501 | accuracy = 0.5734375


Epoch[1] Batch[325] Speed: 1.242195268479388 samples/sec                   batch loss = 886.3080310821533 | accuracy = 0.5753846153846154


Epoch[1] Batch[330] Speed: 1.2447433511291428 samples/sec                   batch loss = 898.7381041049957 | accuracy = 0.578030303030303


Epoch[1] Batch[335] Speed: 1.2383799391085417 samples/sec                   batch loss = 912.0928082466125 | accuracy = 0.5791044776119403


Epoch[1] Batch[340] Speed: 1.2412926590363085 samples/sec                   batch loss = 926.5812950134277 | accuracy = 0.5779411764705882


Epoch[1] Batch[345] Speed: 1.237182454815068 samples/sec                   batch loss = 939.2737348079681 | accuracy = 0.5797101449275363


Epoch[1] Batch[350] Speed: 1.2395984275914491 samples/sec                   batch loss = 953.2564146518707 | accuracy = 0.58


Epoch[1] Batch[355] Speed: 1.2400369264154696 samples/sec                   batch loss = 968.1696465015411 | accuracy = 0.5774647887323944


Epoch[1] Batch[360] Speed: 1.2409431253434795 samples/sec                   batch loss = 981.2724325656891 | accuracy = 0.5763888888888888


Epoch[1] Batch[365] Speed: 1.2456816936301023 samples/sec                   batch loss = 994.6143839359283 | accuracy = 0.576027397260274


Epoch[1] Batch[370] Speed: 1.2479356737869567 samples/sec                   batch loss = 1006.8934400081635 | accuracy = 0.5777027027027027


Epoch[1] Batch[375] Speed: 1.2462366951627817 samples/sec                   batch loss = 1021.1433427333832 | accuracy = 0.5786666666666667


Epoch[1] Batch[380] Speed: 1.2417763812835763 samples/sec                   batch loss = 1034.7322881221771 | accuracy = 0.5796052631578947


Epoch[1] Batch[385] Speed: 1.2405226054735645 samples/sec                   batch loss = 1047.5069284439087 | accuracy = 0.5792207792207792


Epoch[1] Batch[390] Speed: 1.2417500034231506 samples/sec                   batch loss = 1060.3233296871185 | accuracy = 0.5788461538461539


Epoch[1] Batch[395] Speed: 1.243141377891515 samples/sec                   batch loss = 1074.4577100276947 | accuracy = 0.5784810126582278


Epoch[1] Batch[400] Speed: 1.2400074146714115 samples/sec                   batch loss = 1087.5281691551208 | accuracy = 0.58


Epoch[1] Batch[405] Speed: 1.243228430816492 samples/sec                   batch loss = 1099.502660036087 | accuracy = 0.5814814814814815


Epoch[1] Batch[410] Speed: 1.249165235993663 samples/sec                   batch loss = 1111.7188799381256 | accuracy = 0.5835365853658536


Epoch[1] Batch[415] Speed: 1.2466344201905661 samples/sec                   batch loss = 1125.5644314289093 | accuracy = 0.5825301204819278


Epoch[1] Batch[420] Speed: 1.2443831956657423 samples/sec                   batch loss = 1139.6072709560394 | accuracy = 0.5815476190476191


Epoch[1] Batch[425] Speed: 1.2522385051851874 samples/sec                   batch loss = 1153.2635958194733 | accuracy = 0.5841176470588235


Epoch[1] Batch[430] Speed: 1.24998815368912 samples/sec                   batch loss = 1166.7645425796509 | accuracy = 0.5837209302325581


Epoch[1] Batch[435] Speed: 1.2518812861416653 samples/sec                   batch loss = 1180.8644244670868 | accuracy = 0.5810344827586207


Epoch[1] Batch[440] Speed: 1.2545741789868523 samples/sec                   batch loss = 1194.584415435791 | accuracy = 0.5806818181818182


Epoch[1] Batch[445] Speed: 1.2451940928818075 samples/sec                   batch loss = 1207.383390903473 | accuracy = 0.5808988764044943


Epoch[1] Batch[450] Speed: 1.2478944608911697 samples/sec                   batch loss = 1221.6060030460358 | accuracy = 0.5811111111111111


Epoch[1] Batch[455] Speed: 1.2505498897537104 samples/sec                   batch loss = 1235.0302448272705 | accuracy = 0.5824175824175825


Epoch[1] Batch[460] Speed: 1.2505914646759235 samples/sec                   batch loss = 1246.980705499649 | accuracy = 0.5853260869565218


Epoch[1] Batch[465] Speed: 1.2502792542538346 samples/sec                   batch loss = 1261.272985458374 | accuracy = 0.5844086021505376


Epoch[1] Batch[470] Speed: 1.2515181091451175 samples/sec                   batch loss = 1274.3669118881226 | accuracy = 0.5845744680851064


Epoch[1] Batch[475] Speed: 1.2439459530718775 samples/sec                   batch loss = 1288.2489590644836 | accuracy = 0.5836842105263158


Epoch[1] Batch[480] Speed: 1.2490222991943511 samples/sec                   batch loss = 1301.210470199585 | accuracy = 0.5833333333333334


Epoch[1] Batch[485] Speed: 1.251038709668279 samples/sec                   batch loss = 1312.7334246635437 | accuracy = 0.5865979381443299


Epoch[1] Batch[490] Speed: 1.242889407676775 samples/sec                   batch loss = 1326.395665884018 | accuracy = 0.5867346938775511


Epoch[1] Batch[495] Speed: 1.240646447084119 samples/sec                   batch loss = 1339.550134897232 | accuracy = 0.5868686868686869


Epoch[1] Batch[500] Speed: 1.2453649038464036 samples/sec                   batch loss = 1352.2529890537262 | accuracy = 0.588


Epoch[1] Batch[505] Speed: 1.2445897910233272 samples/sec                   batch loss = 1366.047037601471 | accuracy = 0.5881188118811881


Epoch[1] Batch[510] Speed: 1.2398454916858554 samples/sec                   batch loss = 1379.2251381874084 | accuracy = 0.5872549019607843


Epoch[1] Batch[515] Speed: 1.2376411572906063 samples/sec                   batch loss = 1392.6196331977844 | accuracy = 0.5868932038834952


Epoch[1] Batch[520] Speed: 1.2445791734148195 samples/sec                   batch loss = 1404.9402437210083 | accuracy = 0.5879807692307693


Epoch[1] Batch[525] Speed: 1.2473944609469518 samples/sec                   batch loss = 1418.3163928985596 | accuracy = 0.5885714285714285


Epoch[1] Batch[530] Speed: 1.2529799433182682 samples/sec                   batch loss = 1431.2115857601166 | accuracy = 0.5886792452830188


Epoch[1] Batch[535] Speed: 1.2519934849715701 samples/sec                   batch loss = 1443.8346326351166 | accuracy = 0.5901869158878504


Epoch[1] Batch[540] Speed: 1.2497442925143165 samples/sec                   batch loss = 1456.3006219863892 | accuracy = 0.5907407407407408


Epoch[1] Batch[545] Speed: 1.2439043576870366 samples/sec                   batch loss = 1470.3852753639221 | accuracy = 0.5899082568807339


Epoch[1] Batch[550] Speed: 1.2447212797264542 samples/sec                   batch loss = 1483.7155594825745 | accuracy = 0.5895454545454546


Epoch[1] Batch[555] Speed: 1.2415150417708156 samples/sec                   batch loss = 1496.2572314739227 | accuracy = 0.590990990990991


Epoch[1] Batch[560] Speed: 1.2449882207390535 samples/sec                   batch loss = 1508.8038380146027 | accuracy = 0.5915178571428571


Epoch[1] Batch[565] Speed: 1.2431838434567373 samples/sec                   batch loss = 1522.557144165039 | accuracy = 0.5915929203539823


Epoch[1] Batch[570] Speed: 1.246642942339661 samples/sec                   batch loss = 1535.2051055431366 | accuracy = 0.5929824561403508


Epoch[1] Batch[575] Speed: 1.2421814726987177 samples/sec                   batch loss = 1548.7091331481934 | accuracy = 0.5917391304347827


Epoch[1] Batch[580] Speed: 1.2443486774821328 samples/sec                   batch loss = 1562.7885987758636 | accuracy = 0.5905172413793104


Epoch[1] Batch[585] Speed: 1.2414940033955093 samples/sec                   batch loss = 1574.6825776100159 | accuracy = 0.5923076923076923


Epoch[1] Batch[590] Speed: 1.2412048668927198 samples/sec                   batch loss = 1587.3441758155823 | accuracy = 0.5932203389830508


Epoch[1] Batch[595] Speed: 1.247080785517726 samples/sec                   batch loss = 1601.0221424102783 | accuracy = 0.5936974789915966


Epoch[1] Batch[600] Speed: 1.2488543879244474 samples/sec                   batch loss = 1613.2121243476868 | accuracy = 0.5941666666666666


Epoch[1] Batch[605] Speed: 1.2396887406795478 samples/sec                   batch loss = 1624.862622141838 | accuracy = 0.5950413223140496


Epoch[1] Batch[610] Speed: 1.242861141076591 samples/sec                   batch loss = 1637.5756987333298 | accuracy = 0.5959016393442623


Epoch[1] Batch[615] Speed: 1.2382286760533776 samples/sec                   batch loss = 1651.319644331932 | accuracy = 0.5955284552845529


Epoch[1] Batch[620] Speed: 1.2424467732790812 samples/sec                   batch loss = 1663.9947391748428 | accuracy = 0.5959677419354839


Epoch[1] Batch[625] Speed: 1.238488542264529 samples/sec                   batch loss = 1675.6839810609818 | accuracy = 0.598


Epoch[1] Batch[630] Speed: 1.2406256216173848 samples/sec                   batch loss = 1688.864219546318 | accuracy = 0.5972222222222222


Epoch[1] Batch[635] Speed: 1.2414687399132427 samples/sec                   batch loss = 1701.353478550911 | accuracy = 0.5980314960629921


Epoch[1] Batch[640] Speed: 1.2338761229915296 samples/sec                   batch loss = 1714.975334763527 | accuracy = 0.596875


Epoch[1] Batch[645] Speed: 1.2402154932611416 samples/sec                   batch loss = 1727.6086348295212 | accuracy = 0.5961240310077519


Epoch[1] Batch[650] Speed: 1.244460360883666 samples/sec                   batch loss = 1739.402572631836 | accuracy = 0.5973076923076923


Epoch[1] Batch[655] Speed: 1.2439564676450918 samples/sec                   batch loss = 1753.4494173526764 | accuracy = 0.5969465648854961


Epoch[1] Batch[660] Speed: 1.2479062489865587 samples/sec                   batch loss = 1765.7717757225037 | accuracy = 0.5977272727272728


Epoch[1] Batch[665] Speed: 1.2512703844364463 samples/sec                   batch loss = 1778.946878194809 | accuracy = 0.5977443609022557


Epoch[1] Batch[670] Speed: 1.2492266242305892 samples/sec                   batch loss = 1792.016053557396 | accuracy = 0.5977611940298507


Epoch[1] Batch[675] Speed: 1.2476067890292626 samples/sec                   batch loss = 1805.9702116250992 | accuracy = 0.5988888888888889


Epoch[1] Batch[680] Speed: 1.2439029742963146 samples/sec                   batch loss = 1819.6420670747757 | accuracy = 0.5985294117647059


Epoch[1] Batch[685] Speed: 1.2395432936155588 samples/sec                   batch loss = 1833.7699352502823 | accuracy = 0.5974452554744526


Epoch[1] Batch[690] Speed: 1.2438064212264468 samples/sec                   batch loss = 1845.9693652391434 | accuracy = 0.5985507246376811


Epoch[1] Batch[695] Speed: 1.2428806605458589 samples/sec                   batch loss = 1860.6005145311356 | accuracy = 0.5974820143884892


Epoch[1] Batch[700] Speed: 1.2437782051081507 samples/sec                   batch loss = 1872.4793342351913 | accuracy = 0.5985714285714285


Epoch[1] Batch[705] Speed: 1.243558421403072 samples/sec                   batch loss = 1887.005257487297 | accuracy = 0.597872340425532


Epoch[1] Batch[710] Speed: 1.2447081665440671 samples/sec                   batch loss = 1900.3200422525406 | accuracy = 0.5964788732394366


Epoch[1] Batch[715] Speed: 1.2433323575729505 samples/sec                   batch loss = 1912.6676369905472 | accuracy = 0.5975524475524475


Epoch[1] Batch[720] Speed: 1.2479936920484849 samples/sec                   batch loss = 1925.8464003801346 | accuracy = 0.5979166666666667


Epoch[1] Batch[725] Speed: 1.2414565219434253 samples/sec                   batch loss = 1940.5251280069351 | accuracy = 0.5968965517241379


Epoch[1] Batch[730] Speed: 1.243197477268608 samples/sec                   batch loss = 1953.7862263917923 | accuracy = 0.5969178082191781


Epoch[1] Batch[735] Speed: 1.2484717821625244 samples/sec                   batch loss = 1967.0580557584763 | accuracy = 0.5972789115646259


Epoch[1] Batch[740] Speed: 1.2455087615443041 samples/sec                   batch loss = 1980.6911767721176 | accuracy = 0.5966216216216216


Epoch[1] Batch[745] Speed: 1.2411839308514758 samples/sec                   batch loss = 1993.3442405462265 | accuracy = 0.5973154362416108


Epoch[1] Batch[750] Speed: 1.2478698644014243 samples/sec                   batch loss = 2005.5702608823776 | accuracy = 0.5976666666666667


Epoch[1] Batch[755] Speed: 1.2425242506476195 samples/sec                   batch loss = 2018.453757405281 | accuracy = 0.5983443708609272


Epoch[1] Batch[760] Speed: 1.2419397283905307 samples/sec                   batch loss = 2031.4024835824966 | accuracy = 0.5976973684210526


Epoch[1] Batch[765] Speed: 1.245913423907506 samples/sec                   batch loss = 2042.5242944955826 | accuracy = 0.5990196078431372


Epoch[1] Batch[770] Speed: 1.2496255156584768 samples/sec                   batch loss = 2055.5288039445877 | accuracy = 0.5987012987012987


Epoch[1] Batch[775] Speed: 1.2408050004308038 samples/sec                   batch loss = 2068.4423773288727 | accuracy = 0.5996774193548388


Epoch[1] Batch[780] Speed: 1.2411443563169566 samples/sec                   batch loss = 2078.566201686859 | accuracy = 0.6009615384615384


Epoch[1] Batch[785] Speed: 1.247479698880703 samples/sec                   batch loss = 2093.217521429062 | accuracy = 0.6003184713375797


[Epoch 1] training: accuracy=0.600253807106599
[Epoch 1] time cost: 653.1779236793518
[Epoch 1] validation: validation accuracy=0.6766666666666666


Epoch[2] Batch[5] Speed: 1.2515223102967026 samples/sec                   batch loss = 11.014700770378113 | accuracy = 0.85


Epoch[2] Batch[10] Speed: 1.2469011631177969 samples/sec                   batch loss = 24.19795334339142 | accuracy = 0.725


Epoch[2] Batch[15] Speed: 1.2517521095672264 samples/sec                   batch loss = 35.450042486190796 | accuracy = 0.7


Epoch[2] Batch[20] Speed: 1.2530870043621833 samples/sec                   batch loss = 47.77067947387695 | accuracy = 0.7


Epoch[2] Batch[25] Speed: 1.2516802940668492 samples/sec                   batch loss = 60.50522565841675 | accuracy = 0.67


Epoch[2] Batch[30] Speed: 1.2516459301365708 samples/sec                   batch loss = 72.74204659461975 | accuracy = 0.6583333333333333


Epoch[2] Batch[35] Speed: 1.2530257040607533 samples/sec                   batch loss = 84.90400695800781 | accuracy = 0.65


Epoch[2] Batch[40] Speed: 1.251986477794783 samples/sec                   batch loss = 97.21616244316101 | accuracy = 0.65625


Epoch[2] Batch[45] Speed: 1.2492222524444851 samples/sec                   batch loss = 111.04751253128052 | accuracy = 0.6444444444444445


Epoch[2] Batch[50] Speed: 1.2503606935242704 samples/sec                   batch loss = 125.25404024124146 | accuracy = 0.64


Epoch[2] Batch[55] Speed: 1.2495780484653074 samples/sec                   batch loss = 137.44389355182648 | accuracy = 0.6454545454545455


Epoch[2] Batch[60] Speed: 1.2460492643970051 samples/sec                   batch loss = 151.94642198085785 | accuracy = 0.6375


Epoch[2] Batch[65] Speed: 1.2468927301077202 samples/sec                   batch loss = 166.24957478046417 | accuracy = 0.6269230769230769


Epoch[2] Batch[70] Speed: 1.2472754812230473 samples/sec                   batch loss = 178.0897787809372 | accuracy = 0.6392857142857142


Epoch[2] Batch[75] Speed: 1.249324299770825 samples/sec                   batch loss = 191.89419078826904 | accuracy = 0.64


Epoch[2] Batch[80] Speed: 1.2399275017212577 samples/sec                   batch loss = 204.3253788948059 | accuracy = 0.640625


Epoch[2] Batch[85] Speed: 1.2533567969582153 samples/sec                   batch loss = 217.41290950775146 | accuracy = 0.6411764705882353


Epoch[2] Batch[90] Speed: 1.2526588699231644 samples/sec                   batch loss = 229.6152696609497 | accuracy = 0.6444444444444445


Epoch[2] Batch[95] Speed: 1.2514997177690754 samples/sec                   batch loss = 242.43747889995575 | accuracy = 0.6447368421052632


Epoch[2] Batch[100] Speed: 1.25044959930154 samples/sec                   batch loss = 257.3459290266037 | accuracy = 0.6425


Epoch[2] Batch[105] Speed: 1.2531475618463346 samples/sec                   batch loss = 269.32181119918823 | accuracy = 0.6428571428571429


Epoch[2] Batch[110] Speed: 1.2464790966504788 samples/sec                   batch loss = 281.9052565097809 | accuracy = 0.6454545454545455


Epoch[2] Batch[115] Speed: 1.243027352399163 samples/sec                   batch loss = 294.7360849380493 | accuracy = 0.6391304347826087


Epoch[2] Batch[120] Speed: 1.245856061479164 samples/sec                   batch loss = 305.31230676174164 | accuracy = 0.6416666666666667


Epoch[2] Batch[125] Speed: 1.2517146598841025 samples/sec                   batch loss = 317.2718094587326 | accuracy = 0.644


Epoch[2] Batch[130] Speed: 1.2420210042984168 samples/sec                   batch loss = 328.53346586227417 | accuracy = 0.6519230769230769


Epoch[2] Batch[135] Speed: 1.2498484735610949 samples/sec                   batch loss = 341.7994718551636 | accuracy = 0.6444444444444445


Epoch[2] Batch[140] Speed: 1.2468030321811794 samples/sec                   batch loss = 353.80774569511414 | accuracy = 0.6464285714285715


Epoch[2] Batch[145] Speed: 1.252176820585083 samples/sec                   batch loss = 365.07868015766144 | accuracy = 0.65


Epoch[2] Batch[150] Speed: 1.2589423564603202 samples/sec                   batch loss = 375.95770514011383 | accuracy = 0.65


Epoch[2] Batch[155] Speed: 1.2534115747336936 samples/sec                   batch loss = 386.37900161743164 | accuracy = 0.6580645161290323


Epoch[2] Batch[160] Speed: 1.2489065415746456 samples/sec                   batch loss = 399.346777677536 | accuracy = 0.65625


Epoch[2] Batch[165] Speed: 1.2463572359886164 samples/sec                   batch loss = 413.5871765613556 | accuracy = 0.6545454545454545


Epoch[2] Batch[170] Speed: 1.2496247710472141 samples/sec                   batch loss = 426.047199010849 | accuracy = 0.6558823529411765


Epoch[2] Batch[175] Speed: 1.2459736601425384 samples/sec                   batch loss = 437.79631090164185 | accuracy = 0.6571428571428571


Epoch[2] Batch[180] Speed: 1.258952086898017 samples/sec                   batch loss = 450.35739159584045 | accuracy = 0.6583333333333333


Epoch[2] Batch[185] Speed: 1.260032533460429 samples/sec                   batch loss = 461.79343521595 | accuracy = 0.6648648648648648


Epoch[2] Batch[190] Speed: 1.257337006945429 samples/sec                   batch loss = 472.6772129535675 | accuracy = 0.6671052631578948


Epoch[2] Batch[195] Speed: 1.2527540895618412 samples/sec                   batch loss = 484.62208342552185 | accuracy = 0.6666666666666666


Epoch[2] Batch[200] Speed: 1.2484232878061936 samples/sec                   batch loss = 494.3515303134918 | accuracy = 0.67125


Epoch[2] Batch[205] Speed: 1.2497314456525779 samples/sec                   batch loss = 505.94894540309906 | accuracy = 0.6719512195121952


Epoch[2] Batch[210] Speed: 1.2400186876252282 samples/sec                   batch loss = 517.6146359443665 | accuracy = 0.6726190476190477


Epoch[2] Batch[215] Speed: 1.2446782473594977 samples/sec                   batch loss = 529.0067434310913 | accuracy = 0.6732558139534883


Epoch[2] Batch[220] Speed: 1.2460352903368932 samples/sec                   batch loss = 539.6843845844269 | accuracy = 0.6761363636363636


Epoch[2] Batch[225] Speed: 1.2445476909623945 samples/sec                   batch loss = 551.7104487419128 | accuracy = 0.6766666666666666


Epoch[2] Batch[230] Speed: 1.2408596960543063 samples/sec                   batch loss = 561.52532351017 | accuracy = 0.6793478260869565


Epoch[2] Batch[235] Speed: 1.2400225369734748 samples/sec                   batch loss = 572.3558342456818 | accuracy = 0.6829787234042554


Epoch[2] Batch[240] Speed: 1.2419275011484925 samples/sec                   batch loss = 585.5565050840378 | accuracy = 0.68125


Epoch[2] Batch[245] Speed: 1.2421558133616988 samples/sec                   batch loss = 598.0544432401657 | accuracy = 0.6795918367346939


Epoch[2] Batch[250] Speed: 1.2482250761074924 samples/sec                   batch loss = 609.6015598773956 | accuracy = 0.681


Epoch[2] Batch[255] Speed: 1.2455854190947115 samples/sec                   batch loss = 621.6671451330185 | accuracy = 0.6803921568627451


Epoch[2] Batch[260] Speed: 1.244981661300781 samples/sec                   batch loss = 631.2586911916733 | accuracy = 0.6826923076923077


Epoch[2] Batch[265] Speed: 1.2390604856386933 samples/sec                   batch loss = 642.3766462802887 | accuracy = 0.6849056603773584


Epoch[2] Batch[270] Speed: 1.2368711580502232 samples/sec                   batch loss = 654.85125041008 | accuracy = 0.6814814814814815


Epoch[2] Batch[275] Speed: 1.2384387177448704 samples/sec                   batch loss = 664.1729162931442 | accuracy = 0.6854545454545454


Epoch[2] Batch[280] Speed: 1.2445426132966506 samples/sec                   batch loss = 679.3204520940781 | accuracy = 0.6848214285714286


Epoch[2] Batch[285] Speed: 1.2457826082759533 samples/sec                   batch loss = 689.2169378995895 | accuracy = 0.6885964912280702


Epoch[2] Batch[290] Speed: 1.2395914668854462 samples/sec                   batch loss = 703.0048956871033 | accuracy = 0.6870689655172414


Epoch[2] Batch[295] Speed: 1.2373872135006447 samples/sec                   batch loss = 714.1541543006897 | accuracy = 0.688135593220339


Epoch[2] Batch[300] Speed: 1.2390960837697047 samples/sec                   batch loss = 724.8731496334076 | accuracy = 0.6883333333333334


Epoch[2] Batch[305] Speed: 1.2400466417689813 samples/sec                   batch loss = 735.340703368187 | accuracy = 0.6885245901639344


Epoch[2] Batch[310] Speed: 1.2420914397628944 samples/sec                   batch loss = 745.3272750377655 | accuracy = 0.6895161290322581


Epoch[2] Batch[315] Speed: 1.239569119884008 samples/sec                   batch loss = 757.6395133733749 | accuracy = 0.6888888888888889


Epoch[2] Batch[320] Speed: 1.2421800011668656 samples/sec                   batch loss = 769.4003953933716 | accuracy = 0.690625


Epoch[2] Batch[325] Speed: 1.2408326229534143 samples/sec                   batch loss = 780.3837133646011 | accuracy = 0.6915384615384615


Epoch[2] Batch[330] Speed: 1.2397364672096254 samples/sec                   batch loss = 793.323849439621 | accuracy = 0.6901515151515152


Epoch[2] Batch[335] Speed: 1.2384802226622502 samples/sec                   batch loss = 804.8158632516861 | accuracy = 0.6888059701492537


Epoch[2] Batch[340] Speed: 1.2413179153526341 samples/sec                   batch loss = 816.023305773735 | accuracy = 0.6897058823529412


Epoch[2] Batch[345] Speed: 1.240831062844618 samples/sec                   batch loss = 825.1376589536667 | accuracy = 0.691304347826087


Epoch[2] Batch[350] Speed: 1.2462132748045764 samples/sec                   batch loss = 835.0885465145111 | accuracy = 0.6928571428571428


Epoch[2] Batch[355] Speed: 1.2484178068802807 samples/sec                   batch loss = 848.7447797060013 | accuracy = 0.6908450704225352


Epoch[2] Batch[360] Speed: 1.2464271455365674 samples/sec                   batch loss = 860.7936267852783 | accuracy = 0.6909722222222222


Epoch[2] Batch[365] Speed: 1.2552491664076693 samples/sec                   batch loss = 869.4232674241066 | accuracy = 0.6924657534246575


Epoch[2] Batch[370] Speed: 1.2503058093719934 samples/sec                   batch loss = 880.9054165482521 | accuracy = 0.6925675675675675


Epoch[2] Batch[375] Speed: 1.2432264961745916 samples/sec                   batch loss = 892.5210171341896 | accuracy = 0.692


Epoch[2] Batch[380] Speed: 1.2428001927178491 samples/sec                   batch loss = 903.9056087136269 | accuracy = 0.6907894736842105


Epoch[2] Batch[385] Speed: 1.2379002289019752 samples/sec                   batch loss = 914.0554613471031 | accuracy = 0.6922077922077922


Epoch[2] Batch[390] Speed: 1.2482729978150853 samples/sec                   batch loss = 928.0167439579964 | accuracy = 0.6923076923076923


Epoch[2] Batch[395] Speed: 1.249658000188597 samples/sec                   batch loss = 939.1848283410072 | accuracy = 0.6917721518987342


Epoch[2] Batch[400] Speed: 1.2442612830507052 samples/sec                   batch loss = 951.3173150420189 | accuracy = 0.691875


Epoch[2] Batch[405] Speed: 1.240034635080695 samples/sec                   batch loss = 965.3392364382744 | accuracy = 0.6901234567901234


Epoch[2] Batch[410] Speed: 1.2410612671888324 samples/sec                   batch loss = 980.1312338709831 | accuracy = 0.6890243902439024


Epoch[2] Batch[415] Speed: 1.2460816558078154 samples/sec                   batch loss = 990.4989894032478 | accuracy = 0.6897590361445783


Epoch[2] Batch[420] Speed: 1.2474059613593895 samples/sec                   batch loss = 1000.3177977204323 | accuracy = 0.6916666666666667


Epoch[2] Batch[425] Speed: 1.2449605976565032 samples/sec                   batch loss = 1013.9153969883919 | accuracy = 0.6911764705882353


Epoch[2] Batch[430] Speed: 1.2430155642037761 samples/sec                   batch loss = 1026.4585401415825 | accuracy = 0.6918604651162791


Epoch[2] Batch[435] Speed: 1.2533464973984636 samples/sec                   batch loss = 1038.3027474284172 | accuracy = 0.6913793103448276


Epoch[2] Batch[440] Speed: 1.242064680701612 samples/sec                   batch loss = 1048.5306349396706 | accuracy = 0.6920454545454545


Epoch[2] Batch[445] Speed: 1.2452836518784858 samples/sec                   batch loss = 1060.6440407633781 | accuracy = 0.6915730337078652


Epoch[2] Batch[450] Speed: 1.2446857270207947 samples/sec                   batch loss = 1069.6664915680885 | accuracy = 0.6927777777777778


Epoch[2] Batch[455] Speed: 1.2476256228380445 samples/sec                   batch loss = 1081.2379546761513 | accuracy = 0.6928571428571428


Epoch[2] Batch[460] Speed: 1.2416250225257646 samples/sec                   batch loss = 1093.6584789156914 | accuracy = 0.6929347826086957


Epoch[2] Batch[465] Speed: 1.2407997697261475 samples/sec                   batch loss = 1107.0874773859978 | accuracy = 0.6913978494623656


Epoch[2] Batch[470] Speed: 1.24049618912972 samples/sec                   batch loss = 1118.4027106165886 | accuracy = 0.6909574468085107


Epoch[2] Batch[475] Speed: 1.2418011977154173 samples/sec                   batch loss = 1130.4376977086067 | accuracy = 0.6910526315789474


Epoch[2] Batch[480] Speed: 1.2447228496330054 samples/sec                   batch loss = 1143.6849526762962 | accuracy = 0.6895833333333333


Epoch[2] Batch[485] Speed: 1.2411202088268647 samples/sec                   batch loss = 1156.188955128193 | accuracy = 0.6896907216494845


Epoch[2] Batch[490] Speed: 1.2414068257628001 samples/sec                   batch loss = 1166.6968304514885 | accuracy = 0.689795918367347


Epoch[2] Batch[495] Speed: 1.2393953170268428 samples/sec                   batch loss = 1178.6110424399376 | accuracy = 0.6893939393939394


Epoch[2] Batch[500] Speed: 1.247189158626116 samples/sec                   batch loss = 1190.4937466979027 | accuracy = 0.6885


Epoch[2] Batch[505] Speed: 1.2378133725800533 samples/sec                   batch loss = 1201.790021598339 | accuracy = 0.6881188118811881


Epoch[2] Batch[510] Speed: 1.2453116590385473 samples/sec                   batch loss = 1212.5140253901482 | accuracy = 0.6887254901960784


Epoch[2] Batch[515] Speed: 1.245578668412845 samples/sec                   batch loss = 1224.876092016697 | accuracy = 0.6893203883495146


Epoch[2] Batch[520] Speed: 1.2448778281335353 samples/sec                   batch loss = 1237.2007736563683 | accuracy = 0.6889423076923077


Epoch[2] Batch[525] Speed: 1.2453446591902908 samples/sec                   batch loss = 1247.9306114912033 | accuracy = 0.689047619047619


Epoch[2] Batch[530] Speed: 1.2491455186280884 samples/sec                   batch loss = 1265.8660680055618 | accuracy = 0.6877358490566038


Epoch[2] Batch[535] Speed: 1.247063265885599 samples/sec                   batch loss = 1277.912060379982 | accuracy = 0.6873831775700935


Epoch[2] Batch[540] Speed: 1.2456235202053911 samples/sec                   batch loss = 1289.8129349946976 | accuracy = 0.687962962962963


Epoch[2] Batch[545] Speed: 1.240762788984665 samples/sec                   batch loss = 1300.5725679397583 | accuracy = 0.6885321100917431


Epoch[2] Batch[550] Speed: 1.245649507979605 samples/sec                   batch loss = 1313.1358497142792 | accuracy = 0.6881818181818182


Epoch[2] Batch[555] Speed: 1.2321131088331758 samples/sec                   batch loss = 1322.0451647043228 | accuracy = 0.6896396396396396


Epoch[2] Batch[560] Speed: 1.2424158586414513 samples/sec                   batch loss = 1332.0729883909225 | accuracy = 0.690625


Epoch[2] Batch[565] Speed: 1.2338275762686024 samples/sec                   batch loss = 1345.6940355300903 | accuracy = 0.6907079646017699


Epoch[2] Batch[570] Speed: 1.2423521020597958 samples/sec                   batch loss = 1358.3596115112305 | accuracy = 0.6903508771929825


Epoch[2] Batch[575] Speed: 1.2444128237623493 samples/sec                   batch loss = 1370.39972448349 | accuracy = 0.6895652173913044


Epoch[2] Batch[580] Speed: 1.2418193970625135 samples/sec                   batch loss = 1385.0112438201904 | accuracy = 0.6879310344827586


Epoch[2] Batch[585] Speed: 1.2470631731904418 samples/sec                   batch loss = 1400.548562169075 | accuracy = 0.6863247863247863


Epoch[2] Batch[590] Speed: 1.244948865146177 samples/sec                   batch loss = 1413.8559211492538 | accuracy = 0.6864406779661016


Epoch[2] Batch[595] Speed: 1.2379798807431266 samples/sec                   batch loss = 1426.1651686429977 | accuracy = 0.6861344537815126


Epoch[2] Batch[600] Speed: 1.2458563390265842 samples/sec                   batch loss = 1438.4509370326996 | accuracy = 0.68625


Epoch[2] Batch[605] Speed: 1.2438523443130538 samples/sec                   batch loss = 1448.7427932024002 | accuracy = 0.6863636363636364


Epoch[2] Batch[610] Speed: 1.2453114741686742 samples/sec                   batch loss = 1459.347729921341 | accuracy = 0.6877049180327869


Epoch[2] Batch[615] Speed: 1.2400623149714771 samples/sec                   batch loss = 1471.7032914161682 | accuracy = 0.6878048780487804


Epoch[2] Batch[620] Speed: 1.2437162447918153 samples/sec                   batch loss = 1484.4632683992386 | accuracy = 0.6875


Epoch[2] Batch[625] Speed: 1.2415286390040547 samples/sec                   batch loss = 1496.9954410791397 | accuracy = 0.6872


Epoch[2] Batch[630] Speed: 1.2375356237786712 samples/sec                   batch loss = 1507.2334859371185 | accuracy = 0.6884920634920635


Epoch[2] Batch[635] Speed: 1.2429751360757608 samples/sec                   batch loss = 1515.6569011211395 | accuracy = 0.6893700787401574


Epoch[2] Batch[640] Speed: 1.2419402800011103 samples/sec                   batch loss = 1530.4408506155014 | accuracy = 0.688671875


Epoch[2] Batch[645] Speed: 1.2452635022917244 samples/sec                   batch loss = 1540.3167731761932 | accuracy = 0.6891472868217055


Epoch[2] Batch[650] Speed: 1.2391409275142542 samples/sec                   batch loss = 1551.3155326843262 | accuracy = 0.6896153846153846


Epoch[2] Batch[655] Speed: 1.2432596622974672 samples/sec                   batch loss = 1567.3369796276093 | accuracy = 0.6889312977099237


Epoch[2] Batch[660] Speed: 1.241659757338831 samples/sec                   batch loss = 1578.2076848745346 | accuracy = 0.6897727272727273


Epoch[2] Batch[665] Speed: 1.2388765792943908 samples/sec                   batch loss = 1588.363493680954 | accuracy = 0.6906015037593985


Epoch[2] Batch[670] Speed: 1.2341259948665668 samples/sec                   batch loss = 1599.332969069481 | accuracy = 0.6914179104477612


Epoch[2] Batch[675] Speed: 1.2427502968503126 samples/sec                   batch loss = 1611.634248971939 | accuracy = 0.6918518518518518


Epoch[2] Batch[680] Speed: 1.2454819474997922 samples/sec                   batch loss = 1623.1288537979126 | accuracy = 0.6915441176470588


Epoch[2] Batch[685] Speed: 1.2375415572788775 samples/sec                   batch loss = 1637.9401919841766 | accuracy = 0.6908759124087591


Epoch[2] Batch[690] Speed: 1.2438503155107257 samples/sec                   batch loss = 1652.294041633606 | accuracy = 0.6898550724637681


Epoch[2] Batch[695] Speed: 1.242816027668283 samples/sec                   batch loss = 1664.72130048275 | accuracy = 0.689568345323741


Epoch[2] Batch[700] Speed: 1.2422977347332704 samples/sec                   batch loss = 1678.225025177002 | accuracy = 0.6889285714285714


Epoch[2] Batch[705] Speed: 1.2445215645059928 samples/sec                   batch loss = 1688.6155424118042 | accuracy = 0.6897163120567376


Epoch[2] Batch[710] Speed: 1.2506984908690941 samples/sec                   batch loss = 1698.9655836820602 | accuracy = 0.6908450704225352


Epoch[2] Batch[715] Speed: 1.2532123376893547 samples/sec                   batch loss = 1709.8005112409592 | accuracy = 0.691958041958042


Epoch[2] Batch[720] Speed: 1.2417270271060405 samples/sec                   batch loss = 1717.7964539527893 | accuracy = 0.6930555555555555


Epoch[2] Batch[725] Speed: 1.2457643850817013 samples/sec                   batch loss = 1727.0870541334152 | accuracy = 0.6934482758620689


Epoch[2] Batch[730] Speed: 1.24785789136673 samples/sec                   batch loss = 1738.0016664266586 | accuracy = 0.6938356164383561


Epoch[2] Batch[735] Speed: 1.24665952381147 samples/sec                   batch loss = 1751.7336364984512 | accuracy = 0.6928571428571428


Epoch[2] Batch[740] Speed: 1.2469390667364315 samples/sec                   batch loss = 1762.2557100057602 | accuracy = 0.6932432432432433


Epoch[2] Batch[745] Speed: 1.2520202063922432 samples/sec                   batch loss = 1773.9759184122086 | accuracy = 0.6932885906040268


Epoch[2] Batch[750] Speed: 1.2419279608148903 samples/sec                   batch loss = 1784.8580263853073 | accuracy = 0.693


Epoch[2] Batch[755] Speed: 1.2484964952951108 samples/sec                   batch loss = 1795.2715920209885 | accuracy = 0.6930463576158941


Epoch[2] Batch[760] Speed: 1.2510754658661836 samples/sec                   batch loss = 1805.8796472549438 | accuracy = 0.6934210526315789


Epoch[2] Batch[765] Speed: 1.2516115680931115 samples/sec                   batch loss = 1816.2615855932236 | accuracy = 0.6931372549019608


Epoch[2] Batch[770] Speed: 1.2517442645602836 samples/sec                   batch loss = 1829.93006336689 | accuracy = 0.6918831168831169


Epoch[2] Batch[775] Speed: 1.2528847829069196 samples/sec                   batch loss = 1842.9826999902725 | accuracy = 0.6919354838709677


Epoch[2] Batch[780] Speed: 1.2552057786754378 samples/sec                   batch loss = 1854.3277629613876 | accuracy = 0.6923076923076923


Epoch[2] Batch[785] Speed: 1.2566256055910694 samples/sec                   batch loss = 1865.3179550170898 | accuracy = 0.6926751592356688


[Epoch 2] training: accuracy=0.692258883248731
[Epoch 2] time cost: 648.7394227981567
[Epoch 2] validation: validation accuracy=0.71


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you reached the conclusion of the crash course. Congratulations.
If you are keen on studying more, checkout [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), [AutoGluon](https://auto.gluon.ai).