<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[03:31:52] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `ctx` shown in the output tells you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you have fewer than two GPUs, the code falls back to `npx.cpu()`, as in the output shown here:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[03:31:52] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through zero-based indexing with `npx.gpu`. As you saw before, the first GPU is accessed using `npx.gpu(0)` and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has, so on a machine with eight GPUs the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training, which is particularly useful when you want to use multiple GPUs to train a neural network.
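
As a minimal sketch of this zero-based numbering, the plain-Python helper below picks the device indices to use when you ask for a certain number of GPUs. The name `pick_gpu_ids` is hypothetical and is not part of MXNet. On a real machine you would likely pass `npx.num_gpus()` as the first argument and wrap each returned index with `npx.gpu(i)`.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return the zero-based GPU indices to use (hypothetical helper, not an MXNet API).

    An empty list means no GPU is available and the caller should fall back to the CPU.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # Indices start at 0, so the last valid index is num_available - 1.
    return list(range(min(num_available, num_requested)))


print(pick_gpu_ids(8, 2))  # the first two GPUs on an eight-GPU machine
print(pick_gpu_ids(1, 2))  # only one GPU is available
print(pick_gpu_ids(0, 2))  # no GPU: empty list, so use the CPU instead
```

Capping the request at the number of available devices means the same script can run unchanged on machines with fewer GPUs than requested.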

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.
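
The placement rule can be illustrated with a toy model in plain Python. `TaggedArray` below is a hypothetical stand-in, not an MXNet class, and the `ValueError` it raises is an assumption of this sketch; MXNet reports the mismatch with its own error type and message.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaggedArray:
    """Toy stand-in for an array that remembers its device (not an MXNet class)."""
    values: List[float]
    device: str

    def copyto(self, device):
        # In this model, an explicit copy is the only way data changes device.
        return TaggedArray(list(self.values), device)

    def __add__(self, other):
        if self.device != other.device:
            raise ValueError(
                f"inputs on different devices: {self.device} vs {other.device}")
        # The result stays on the device shared by both inputs.
        return TaggedArray(
            [a + b for a, b in zip(self.values, other.values)], self.device)


x = TaggedArray([1.0, 1.0], "gpu(0)")
y = TaggedArray([0.5, 0.25], "gpu(1)")

try:
    x + y  # inputs live on different devices, so this fails
except ValueError as err:
    print("error:", err)

z = x + y.copyto("gpu(0)")  # move y explicitly, then add
print(z)
```

The sketch mirrors the two points above: the output lands on the device of its inputs, and mixing devices requires an explicit copy first.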

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[03:31:52] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 5.3163333, -5.633159 ]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, assuming there are *n* GPUs, you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own separate chunk of the data.
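
The idea can be sketched in plain Python without any GPU. The example below is a simplified illustration under stated assumptions, not MXNet's implementation: it uses a one-parameter model `w * x` with a squared-error loss, runs the "devices" one after another, and assumes the batch divides evenly. All function names are hypothetical.

```python
def split_batch(batch, num_devices):
    """Split a list of samples into num_devices equal, contiguous shards."""
    if len(batch) % num_devices != 0:
        raise ValueError("this sketch assumes the batch divides evenly")
    shard_size = len(batch) // num_devices
    return [batch[i * shard_size:(i + 1) * shard_size]
            for i in range(num_devices)]


def shard_gradient(w, shard):
    """Sum of d/dw 0.5 * (w * x - y)**2 over the samples in one shard."""
    return sum((w * x - y) * x for x, y in shard)


def data_parallel_step(w, batch, num_devices, lr):
    # Each shard would be handled by its own GPU; here they run sequentially.
    grads = [shard_gradient(w, shard)
             for shard in split_batch(batch, num_devices)]
    # Sum the per-device gradients, then normalize by the full batch size.
    mean_grad = sum(grads) / len(batch)
    return w - lr * mean_grad


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # samples of y = 2 * x
w_two_devices = data_parallel_step(0.0, batch, num_devices=2, lr=0.1)
w_one_device = data_parallel_step(0.0, batch, num_devices=1, lr=0.1)
print(w_two_devices, w_one_device)
```

Because the per-shard gradients are summed before the update, splitting the batch across devices gives the same parameter update as processing the whole batch on one device.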

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations on the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7760893004211932 samples/sec                   batch loss = 12.41059923171997 | accuracy = 0.7


Epoch[1] Batch[10] Speed: 1.2518538233803767 samples/sec                   batch loss = 27.457735061645508 | accuracy = 0.575


Epoch[1] Batch[15] Speed: 1.2555181997281704 samples/sec                   batch loss = 40.555607080459595 | accuracy = 0.5666666666666667


Epoch[1] Batch[20] Speed: 1.2508534689171351 samples/sec                   batch loss = 54.52211785316467 | accuracy = 0.5625


Epoch[1] Batch[25] Speed: 1.258318788639441 samples/sec                   batch loss = 68.79746103286743 | accuracy = 0.57


Epoch[1] Batch[30] Speed: 1.2531886544152502 samples/sec                   batch loss = 83.18084049224854 | accuracy = 0.5416666666666666


Epoch[1] Batch[35] Speed: 1.2521192537742958 samples/sec                   batch loss = 97.50966906547546 | accuracy = 0.5357142857142857


Epoch[1] Batch[40] Speed: 1.2550382663175317 samples/sec                   batch loss = 112.81985759735107 | accuracy = 0.51875


Epoch[1] Batch[45] Speed: 1.2551525341941807 samples/sec                   batch loss = 126.76397180557251 | accuracy = 0.5222222222222223


Epoch[1] Batch[50] Speed: 1.2541964055105588 samples/sec                   batch loss = 140.57818150520325 | accuracy = 0.52


Epoch[1] Batch[55] Speed: 1.253382640235179 samples/sec                   batch loss = 154.3870553970337 | accuracy = 0.5181818181818182


Epoch[1] Batch[60] Speed: 1.25945222603457 samples/sec                   batch loss = 167.89437317848206 | accuracy = 0.5208333333333334


Epoch[1] Batch[65] Speed: 1.2570675704269465 samples/sec                   batch loss = 181.8481628894806 | accuracy = 0.5192307692307693


Epoch[1] Batch[70] Speed: 1.2540196954125011 samples/sec                   batch loss = 196.01586651802063 | accuracy = 0.5142857142857142


Epoch[1] Batch[75] Speed: 1.2638120365002476 samples/sec                   batch loss = 209.76587867736816 | accuracy = 0.5166666666666667


Epoch[1] Batch[80] Speed: 1.2547513269252617 samples/sec                   batch loss = 224.31753706932068 | accuracy = 0.50625


Epoch[1] Batch[85] Speed: 1.2525323381290587 samples/sec                   batch loss = 238.13580179214478 | accuracy = 0.5147058823529411


Epoch[1] Batch[90] Speed: 1.2569236670502493 samples/sec                   batch loss = 252.66896677017212 | accuracy = 0.5083333333333333


Epoch[1] Batch[95] Speed: 1.2559117182607902 samples/sec                   batch loss = 266.32033824920654 | accuracy = 0.5131578947368421


Epoch[1] Batch[100] Speed: 1.2534784381995718 samples/sec                   batch loss = 280.65442728996277 | accuracy = 0.5025


Epoch[1] Batch[105] Speed: 1.2568772445218441 samples/sec                   batch loss = 294.902526140213 | accuracy = 0.49523809523809526


Epoch[1] Batch[110] Speed: 1.2582409332747158 samples/sec                   batch loss = 308.58187437057495 | accuracy = 0.5


Epoch[1] Batch[115] Speed: 1.2560292483274185 samples/sec                   batch loss = 322.57900857925415 | accuracy = 0.5043478260869565


Epoch[1] Batch[120] Speed: 1.2555927115341885 samples/sec                   batch loss = 335.7632484436035 | accuracy = 0.5104166666666666


Epoch[1] Batch[125] Speed: 1.253112836514388 samples/sec                   batch loss = 349.2412521839142 | accuracy = 0.516


Epoch[1] Batch[130] Speed: 1.2586014137834622 samples/sec                   batch loss = 362.96885776519775 | accuracy = 0.5192307692307693


Epoch[1] Batch[135] Speed: 1.2572879156754888 samples/sec                   batch loss = 376.20839977264404 | accuracy = 0.5277777777777778


Epoch[1] Batch[140] Speed: 1.2564564908206914 samples/sec                   batch loss = 390.1280777454376 | accuracy = 0.5285714285714286


Epoch[1] Batch[145] Speed: 1.2584120390160465 samples/sec                   batch loss = 403.7228317260742 | accuracy = 0.5293103448275862


Epoch[1] Batch[150] Speed: 1.256483685394376 samples/sec                   batch loss = 417.8767650127411 | accuracy = 0.5266666666666666


Epoch[1] Batch[155] Speed: 1.2531205114894666 samples/sec                   batch loss = 430.8851544857025 | accuracy = 0.532258064516129


Epoch[1] Batch[160] Speed: 1.2535325709923133 samples/sec                   batch loss = 444.4122338294983 | accuracy = 0.5328125


Epoch[1] Batch[165] Speed: 1.2562423629242478 samples/sec                   batch loss = 458.48582673072815 | accuracy = 0.5318181818181819


Epoch[1] Batch[170] Speed: 1.2587441905655798 samples/sec                   batch loss = 471.95116901397705 | accuracy = 0.5323529411764706


Epoch[1] Batch[175] Speed: 1.2587093432438257 samples/sec                   batch loss = 485.44575214385986 | accuracy = 0.5328571428571428


Epoch[1] Batch[180] Speed: 1.256111908644098 samples/sec                   batch loss = 499.197172164917 | accuracy = 0.5333333333333333


Epoch[1] Batch[185] Speed: 1.2583819293371052 samples/sec                   batch loss = 513.4240343570709 | accuracy = 0.5297297297297298


Epoch[1] Batch[190] Speed: 1.2572949823207582 samples/sec                   batch loss = 527.3872587680817 | accuracy = 0.5289473684210526


Epoch[1] Batch[195] Speed: 1.2523445982184345 samples/sec                   batch loss = 540.4508602619171 | accuracy = 0.5333333333333333


Epoch[1] Batch[200] Speed: 1.2520979479202023 samples/sec                   batch loss = 554.1622200012207 | accuracy = 0.53625


Epoch[1] Batch[205] Speed: 1.2569745191911472 samples/sec                   batch loss = 568.1703708171844 | accuracy = 0.5329268292682927


Epoch[1] Batch[210] Speed: 1.2577443013341725 samples/sec                   batch loss = 582.1075232028961 | accuracy = 0.5333333333333333


Epoch[1] Batch[215] Speed: 1.253149059478064 samples/sec                   batch loss = 595.1954205036163 | accuracy = 0.5383720930232558


Epoch[1] Batch[220] Speed: 1.2622833288127866 samples/sec                   batch loss = 609.5404570102692 | accuracy = 0.5375


Epoch[1] Batch[225] Speed: 1.2568947585009365 samples/sec                   batch loss = 622.7466950416565 | accuracy = 0.5422222222222223


Epoch[1] Batch[230] Speed: 1.2566999665024146 samples/sec                   batch loss = 636.5732052326202 | accuracy = 0.5413043478260869


Epoch[1] Batch[235] Speed: 1.2599450038120037 samples/sec                   batch loss = 649.8846061229706 | accuracy = 0.5436170212765957


Epoch[1] Batch[240] Speed: 1.2568728190237317 samples/sec                   batch loss = 663.5258903503418 | accuracy = 0.5416666666666666


Epoch[1] Batch[245] Speed: 1.2523031870806551 samples/sec                   batch loss = 676.6005055904388 | accuracy = 0.5438775510204081


Epoch[1] Batch[250] Speed: 1.2494728889024598 samples/sec                   batch loss = 690.0755343437195 | accuracy = 0.545


Epoch[1] Batch[255] Speed: 1.2481498575920786 samples/sec                   batch loss = 702.767160654068 | accuracy = 0.5490196078431373


Epoch[1] Batch[260] Speed: 1.2442463340185488 samples/sec                   batch loss = 716.4991238117218 | accuracy = 0.5509615384615385


Epoch[1] Batch[265] Speed: 1.250116686983813 samples/sec                   batch loss = 730.7308645248413 | accuracy = 0.5490566037735849


Epoch[1] Batch[270] Speed: 1.2503058093719934 samples/sec                   batch loss = 744.2284910678864 | accuracy = 0.55


Epoch[1] Batch[275] Speed: 1.2506480520272434 samples/sec                   batch loss = 756.7112708091736 | accuracy = 0.5545454545454546


Epoch[1] Batch[280] Speed: 1.2520287089016624 samples/sec                   batch loss = 770.2246565818787 | accuracy = 0.5589285714285714


Epoch[1] Batch[285] Speed: 1.2590928644267805 samples/sec                   batch loss = 783.0700891017914 | accuracy = 0.5631578947368421


Epoch[1] Batch[290] Speed: 1.2495920090301356 samples/sec                   batch loss = 797.053624868393 | accuracy = 0.5612068965517242


Epoch[1] Batch[295] Speed: 1.2533695311933863 samples/sec                   batch loss = 810.5347528457642 | accuracy = 0.5635593220338984


Epoch[1] Batch[300] Speed: 1.250107185792134 samples/sec                   batch loss = 823.4875245094299 | accuracy = 0.5641666666666667


Epoch[1] Batch[305] Speed: 1.2473883398461576 samples/sec                   batch loss = 836.7059609889984 | accuracy = 0.5663934426229508


Epoch[1] Batch[310] Speed: 1.2523923693208803 samples/sec                   batch loss = 851.2233364582062 | accuracy = 0.5645161290322581


Epoch[1] Batch[315] Speed: 1.2568913686604386 samples/sec                   batch loss = 865.6053130626678 | accuracy = 0.5642857142857143


Epoch[1] Batch[320] Speed: 1.252244113177466 samples/sec                   batch loss = 879.4867424964905 | accuracy = 0.56484375


Epoch[1] Batch[325] Speed: 1.2500322059643882 samples/sec                   batch loss = 893.7935676574707 | accuracy = 0.5623076923076923


Epoch[1] Batch[330] Speed: 1.2457341376062223 samples/sec                   batch loss = 906.9756338596344 | accuracy = 0.5651515151515152


Epoch[1] Batch[335] Speed: 1.2566817049064412 samples/sec                   batch loss = 919.7562611103058 | accuracy = 0.5664179104477612


Epoch[1] Batch[340] Speed: 1.2534923923995571 samples/sec                   batch loss = 933.0041675567627 | accuracy = 0.5669117647058823


Epoch[1] Batch[345] Speed: 1.2493808655431196 samples/sec                   batch loss = 946.4480817317963 | accuracy = 0.5673913043478261


Epoch[1] Batch[350] Speed: 1.2483721034058757 samples/sec                   batch loss = 960.5143048763275 | accuracy = 0.5671428571428572


Epoch[1] Batch[355] Speed: 1.2546664060196888 samples/sec                   batch loss = 973.5466001033783 | accuracy = 0.5690140845070423


Epoch[1] Batch[360] Speed: 1.25047504321861 samples/sec                   batch loss = 987.3183767795563 | accuracy = 0.5708333333333333


Epoch[1] Batch[365] Speed: 1.2482843286791436 samples/sec                   batch loss = 1001.4645411968231 | accuracy = 0.5698630136986301


Epoch[1] Batch[370] Speed: 1.2462658561276387 samples/sec                   batch loss = 1015.3348965644836 | accuracy = 0.5695945945945946


Epoch[1] Batch[375] Speed: 1.2511682055680855 samples/sec                   batch loss = 1028.945695400238 | accuracy = 0.5693333333333334


Epoch[1] Batch[380] Speed: 1.249170165432317 samples/sec                   batch loss = 1042.408656835556 | accuracy = 0.5703947368421053


Epoch[1] Batch[385] Speed: 1.253671576947894 samples/sec                   batch loss = 1055.6219975948334 | accuracy = 0.5707792207792208


Epoch[1] Batch[390] Speed: 1.2489314578684423 samples/sec                   batch loss = 1069.4675018787384 | accuracy = 0.5705128205128205


Epoch[1] Batch[395] Speed: 1.2479400365726445 samples/sec                   batch loss = 1082.4260079860687 | accuracy = 0.5689873417721519


Epoch[1] Batch[400] Speed: 1.2517237185795427 samples/sec                   batch loss = 1095.5640456676483 | accuracy = 0.569375


Epoch[1] Batch[405] Speed: 1.2455040458872912 samples/sec                   batch loss = 1108.4356751441956 | accuracy = 0.5709876543209876


Epoch[1] Batch[410] Speed: 1.2506115073860193 samples/sec                   batch loss = 1122.249473810196 | accuracy = 0.5695121951219512


Epoch[1] Batch[415] Speed: 1.2508769707470637 samples/sec                   batch loss = 1136.1915624141693 | accuracy = 0.5692771084337349


Epoch[1] Batch[420] Speed: 1.2566215583509581 samples/sec                   batch loss = 1150.3052184581757 | accuracy = 0.5672619047619047


Epoch[1] Batch[425] Speed: 1.2563649412720417 samples/sec                   batch loss = 1163.9120297431946 | accuracy = 0.5682352941176471


Epoch[1] Batch[430] Speed: 1.2526191214099096 samples/sec                   batch loss = 1177.186427116394 | accuracy = 0.5686046511627907


Epoch[1] Batch[435] Speed: 1.2511581285564273 samples/sec                   batch loss = 1190.332290649414 | accuracy = 0.5678160919540229


Epoch[1] Batch[440] Speed: 1.2614520203273343 samples/sec                   batch loss = 1203.5693860054016 | accuracy = 0.5698863636363637


Epoch[1] Batch[445] Speed: 1.2523227238224663 samples/sec                   batch loss = 1216.9336223602295 | accuracy = 0.5691011235955056


Epoch[1] Batch[450] Speed: 1.2553827292927744 samples/sec                   batch loss = 1230.4611551761627 | accuracy = 0.5694444444444444


Epoch[1] Batch[455] Speed: 1.253724414789797 samples/sec                   batch loss = 1244.5903928279877 | accuracy = 0.5681318681318681


Epoch[1] Batch[460] Speed: 1.2502641602684326 samples/sec                   batch loss = 1257.3005638122559 | accuracy = 0.5695652173913044


Epoch[1] Batch[465] Speed: 1.2521245803511143 samples/sec                   batch loss = 1270.0542097091675 | accuracy = 0.5709677419354838


Epoch[1] Batch[470] Speed: 1.2602772078379714 samples/sec                   batch loss = 1283.411018371582 | accuracy = 0.5707446808510638


Epoch[1] Batch[475] Speed: 1.2487292741111171 samples/sec                   batch loss = 1296.348878145218 | accuracy = 0.5715789473684211


Epoch[1] Batch[480] Speed: 1.2549492700937404 samples/sec                   batch loss = 1309.1934876441956 | accuracy = 0.5723958333333333


Epoch[1] Batch[485] Speed: 1.2566211818648312 samples/sec                   batch loss = 1323.8633751869202 | accuracy = 0.5711340206185567


Epoch[1] Batch[490] Speed: 1.2512063690558581 samples/sec                   batch loss = 1336.757224559784 | accuracy = 0.5719387755102041


Epoch[1] Batch[495] Speed: 1.2516933677966728 samples/sec                   batch loss = 1350.9200267791748 | accuracy = 0.5707070707070707


Epoch[1] Batch[500] Speed: 1.2504028150209476 samples/sec                   batch loss = 1364.6852660179138 | accuracy = 0.571


Epoch[1] Batch[505] Speed: 1.2461734714720791 samples/sec                   batch loss = 1378.5757365226746 | accuracy = 0.5693069306930693


Epoch[1] Batch[510] Speed: 1.2461053488775244 samples/sec                   batch loss = 1392.1846137046814 | accuracy = 0.5686274509803921


Epoch[1] Batch[515] Speed: 1.251480206697082 samples/sec                   batch loss = 1404.8880622386932 | accuracy = 0.5689320388349515


Epoch[1] Batch[520] Speed: 1.2583478571122664 samples/sec                   batch loss = 1418.8153800964355 | accuracy = 0.5692307692307692


Epoch[1] Batch[525] Speed: 1.2552773418660468 samples/sec                   batch loss = 1432.1941964626312 | accuracy = 0.57


Epoch[1] Batch[530] Speed: 1.2488976165769168 samples/sec                   batch loss = 1444.9969482421875 | accuracy = 0.5702830188679245


Epoch[1] Batch[535] Speed: 1.2523046826948456 samples/sec                   batch loss = 1456.6551365852356 | accuracy = 0.5733644859813084


Epoch[1] Batch[540] Speed: 1.2533322655580927 samples/sec                   batch loss = 1470.6740577220917 | accuracy = 0.5731481481481482


Epoch[1] Batch[545] Speed: 1.2563622128674188 samples/sec                   batch loss = 1483.6230084896088 | accuracy = 0.5743119266055046


Epoch[1] Batch[550] Speed: 1.2518512079421427 samples/sec                   batch loss = 1497.6260602474213 | accuracy = 0.5727272727272728


Epoch[1] Batch[555] Speed: 1.2563398215806254 samples/sec                   batch loss = 1510.3304812908173 | accuracy = 0.5743243243243243


Epoch[1] Batch[560] Speed: 1.2577217664702307 samples/sec                   batch loss = 1524.0620188713074 | accuracy = 0.5732142857142857


Epoch[1] Batch[565] Speed: 1.250856826267342 samples/sec                   batch loss = 1538.4048073291779 | accuracy = 0.5734513274336284


Epoch[1] Batch[570] Speed: 1.2525801235556542 samples/sec                   batch loss = 1552.8825335502625 | accuracy = 0.5714912280701754


Epoch[1] Batch[575] Speed: 1.2552784689106908 samples/sec                   batch loss = 1565.8451635837555 | accuracy = 0.5721739130434783


Epoch[1] Batch[580] Speed: 1.2597337526134615 samples/sec                   batch loss = 1578.6956334114075 | accuracy = 0.5732758620689655


Epoch[1] Batch[585] Speed: 1.2537928108085608 samples/sec                   batch loss = 1591.6694462299347 | accuracy = 0.5747863247863247


Epoch[1] Batch[590] Speed: 1.2587102875902532 samples/sec                   batch loss = 1603.4935111999512 | accuracy = 0.576271186440678


Epoch[1] Batch[595] Speed: 1.258055723419969 samples/sec                   batch loss = 1615.7400045394897 | accuracy = 0.5777310924369747


Epoch[1] Batch[600] Speed: 1.25828226619591 samples/sec                   batch loss = 1628.3385500907898 | accuracy = 0.5783333333333334


Epoch[1] Batch[605] Speed: 1.2514617230847818 samples/sec                   batch loss = 1640.635451078415 | accuracy = 0.5785123966942148


Epoch[1] Batch[610] Speed: 1.255682174968904 samples/sec                   batch loss = 1655.2068989276886 | accuracy = 0.5774590163934427


Epoch[1] Batch[615] Speed: 1.2559681299476455 samples/sec                   batch loss = 1669.1457772254944 | accuracy = 0.5780487804878048


Epoch[1] Batch[620] Speed: 1.2561972134599533 samples/sec                   batch loss = 1681.6056693792343 | accuracy = 0.5782258064516129


Epoch[1] Batch[625] Speed: 1.2569575680203304 samples/sec                   batch loss = 1695.4087044000626 | accuracy = 0.578


Epoch[1] Batch[630] Speed: 1.2551326273649206 samples/sec                   batch loss = 1709.580173611641 | accuracy = 0.5761904761904761


Epoch[1] Batch[635] Speed: 1.2467722709883386 samples/sec                   batch loss = 1722.616692185402 | accuracy = 0.5763779527559055


Epoch[1] Batch[640] Speed: 1.2483737754252338 samples/sec                   batch loss = 1735.2838951349258 | accuracy = 0.57734375


Epoch[1] Batch[645] Speed: 1.2454159343932507 samples/sec                   batch loss = 1748.2854619026184 | accuracy = 0.577906976744186


Epoch[1] Batch[650] Speed: 1.2554611708152774 samples/sec                   batch loss = 1762.2195618152618 | accuracy = 0.5773076923076923


Epoch[1] Batch[655] Speed: 1.253403053431921 samples/sec                   batch loss = 1773.9773120880127 | accuracy = 0.5786259541984733


Epoch[1] Batch[660] Speed: 1.253890170889196 samples/sec                   batch loss = 1786.0907369852066 | accuracy = 0.5799242424242425


Epoch[1] Batch[665] Speed: 1.2522843987317136 samples/sec                   batch loss = 1801.0310181379318 | accuracy = 0.5789473684210527


Epoch[1] Batch[670] Speed: 1.2537466192545261 samples/sec                   batch loss = 1813.5296493768692 | accuracy = 0.5794776119402985


Epoch[1] Batch[675] Speed: 1.2551721599411543 samples/sec                   batch loss = 1824.848366856575 | accuracy = 0.5814814814814815


Epoch[1] Batch[680] Speed: 1.2499506232276234 samples/sec                   batch loss = 1837.3786388635635 | accuracy = 0.5823529411764706


Epoch[1] Batch[685] Speed: 1.2498897225104721 samples/sec                   batch loss = 1850.8306349515915 | accuracy = 0.5824817518248175


Epoch[1] Batch[690] Speed: 1.2580430824530633 samples/sec                   batch loss = 1863.5644022226334 | accuracy = 0.5829710144927536


Epoch[1] Batch[695] Speed: 1.2560639474084554 samples/sec                   batch loss = 1877.8611768484116 | accuracy = 0.5823741007194244


Epoch[1] Batch[700] Speed: 1.2503729010111218 samples/sec                   batch loss = 1891.554807305336 | accuracy = 0.5825


Epoch[1] Batch[705] Speed: 1.2551579805121968 samples/sec                   batch loss = 1905.1290692090988 | accuracy = 0.5819148936170213


Epoch[1] Batch[710] Speed: 1.2549449520326525 samples/sec                   batch loss = 1918.3624867200851 | accuracy = 0.581338028169014


Epoch[1] Batch[715] Speed: 1.2506777927375448 samples/sec                   batch loss = 1930.5384718179703 | accuracy = 0.5814685314685315


Epoch[1] Batch[720] Speed: 1.2505251884745354 samples/sec                   batch loss = 1942.7453311681747 | accuracy = 0.5822916666666667


Epoch[1] Batch[725] Speed: 1.248653065501679 samples/sec                   batch loss = 1955.2496081590652 | accuracy = 0.5824137931034483


Epoch[1] Batch[730] Speed: 1.2568570946335957 samples/sec                   batch loss = 1966.6478962898254 | accuracy = 0.583904109589041


Epoch[1] Batch[735] Speed: 1.2595697577718703 samples/sec                   batch loss = 1980.2245376110077 | accuracy = 0.5836734693877551


Epoch[1] Batch[740] Speed: 1.2476128194972818 samples/sec                   batch loss = 1992.6089698076248 | accuracy = 0.5837837837837838


Epoch[1] Batch[745] Speed: 1.2502558680488436 samples/sec                   batch loss = 2005.8541473150253 | accuracy = 0.5838926174496645


Epoch[1] Batch[750] Speed: 1.2498206343869898 samples/sec                   batch loss = 2018.2661110162735 | accuracy = 0.584


Epoch[1] Batch[755] Speed: 1.2549175424248222 samples/sec                   batch loss = 2031.563966870308 | accuracy = 0.5844370860927153


Epoch[1] Batch[760] Speed: 1.258560154304975 samples/sec                   batch loss = 2044.7178755998611 | accuracy = 0.5848684210526316


Epoch[1] Batch[765] Speed: 1.2616042674265822 samples/sec                   batch loss = 2058.647358775139 | accuracy = 0.584640522875817


Epoch[1] Batch[770] Speed: 1.2583289813276644 samples/sec                   batch loss = 2071.809439063072 | accuracy = 0.5847402597402598


Epoch[1] Batch[775] Speed: 1.2525035377151605 samples/sec                   batch loss = 2085.5380724668503 | accuracy = 0.584516129032258


Epoch[1] Batch[780] Speed: 1.2554271626282172 samples/sec                   batch loss = 2098.174038529396 | accuracy = 0.5852564102564103


Epoch[1] Batch[785] Speed: 1.2630857788852148 samples/sec                   batch loss = 2111.012000322342 | accuracy = 0.585031847133758


[Epoch 1] training: accuracy=0.5853426395939086
[Epoch 1] time cost: 646.8763015270233
[Epoch 1] validation: validation accuracy=0.6877777777777778


Epoch[2] Batch[5] Speed: 1.257010306497679 samples/sec                   batch loss = 12.962445616722107 | accuracy = 0.65


Epoch[2] Batch[10] Speed: 1.2562005995572019 samples/sec                   batch loss = 25.591745734214783 | accuracy = 0.65


Epoch[2] Batch[15] Speed: 1.2537430589920475 samples/sec                   batch loss = 37.25240969657898 | accuracy = 0.6833333333333333


Epoch[2] Batch[20] Speed: 1.2566997782358311 samples/sec                   batch loss = 50.6472704410553 | accuracy = 0.6375


Epoch[2] Batch[25] Speed: 1.2543925792235875 samples/sec                   batch loss = 61.863974809646606 | accuracy = 0.67


Epoch[2] Batch[30] Speed: 1.2509550367348206 samples/sec                   batch loss = 74.89184927940369 | accuracy = 0.6583333333333333


Epoch[2] Batch[35] Speed: 1.2518958587083917 samples/sec                   batch loss = 87.50037336349487 | accuracy = 0.6714285714285714


Epoch[2] Batch[40] Speed: 1.2589350823478669 samples/sec                   batch loss = 98.24707210063934 | accuracy = 0.7


Epoch[2] Batch[45] Speed: 1.256907564730056 samples/sec                   batch loss = 110.84087765216827 | accuracy = 0.7111111111111111


Epoch[2] Batch[50] Speed: 1.2519021175431186 samples/sec                   batch loss = 122.01080679893494 | accuracy = 0.7


Epoch[2] Batch[55] Speed: 1.2596418194498595 samples/sec                   batch loss = 133.83479619026184 | accuracy = 0.7090909090909091


Epoch[2] Batch[60] Speed: 1.2559578814478463 samples/sec                   batch loss = 146.96848618984222 | accuracy = 0.7


Epoch[2] Batch[65] Speed: 1.257242973670702 samples/sec                   batch loss = 160.81986320018768 | accuracy = 0.6923076923076923


Epoch[2] Batch[70] Speed: 1.2554505548021768 samples/sec                   batch loss = 172.2520055770874 | accuracy = 0.7


Epoch[2] Batch[75] Speed: 1.2539663640578418 samples/sec                   batch loss = 183.23508286476135 | accuracy = 0.7066666666666667


Epoch[2] Batch[80] Speed: 1.2501940991329978 samples/sec                   batch loss = 196.51949226856232 | accuracy = 0.696875


Epoch[2] Batch[85] Speed: 1.2523331000432343 samples/sec                   batch loss = 207.60488724708557 | accuracy = 0.6970588235294117


Epoch[2] Batch[90] Speed: 1.2502800928192652 samples/sec                   batch loss = 219.34406423568726 | accuracy = 0.6944444444444444


Epoch[2] Batch[95] Speed: 1.2554486758805052 samples/sec                   batch loss = 231.1934003829956 | accuracy = 0.6947368421052632


Epoch[2] Batch[100] Speed: 1.245608815859781 samples/sec                   batch loss = 245.56959533691406 | accuracy = 0.6875


Epoch[2] Batch[105] Speed: 1.2490517766503508 samples/sec                   batch loss = 256.751611828804 | accuracy = 0.6880952380952381


Epoch[2] Batch[110] Speed: 1.2505662024542 samples/sec                   batch loss = 270.6025677919388 | accuracy = 0.6795454545454546


Epoch[2] Batch[115] Speed: 1.2584336546897108 samples/sec                   batch loss = 283.11128866672516 | accuracy = 0.6760869565217391


Epoch[2] Batch[120] Speed: 1.2572666219980682 samples/sec                   batch loss = 295.1307262182236 | accuracy = 0.675


Epoch[2] Batch[125] Speed: 1.2541450278914166 samples/sec                   batch loss = 306.66631853580475 | accuracy = 0.672


Epoch[2] Batch[130] Speed: 1.2540182894310499 samples/sec                   batch loss = 321.33007752895355 | accuracy = 0.6634615384615384


Epoch[2] Batch[135] Speed: 1.2543218670935359 samples/sec                   batch loss = 335.9263949394226 | accuracy = 0.6611111111111111


Epoch[2] Batch[140] Speed: 1.250213290631241 samples/sec                   batch loss = 349.8025549650192 | accuracy = 0.6589285714285714


Epoch[2] Batch[145] Speed: 1.2470323991599939 samples/sec                   batch loss = 362.2917060852051 | accuracy = 0.6586206896551724


Epoch[2] Batch[150] Speed: 1.2558665925597394 samples/sec                   batch loss = 375.5362253189087 | accuracy = 0.6566666666666666


Epoch[2] Batch[155] Speed: 1.2478658733643309 samples/sec                   batch loss = 388.51021575927734 | accuracy = 0.6548387096774193


Epoch[2] Batch[160] Speed: 1.24262253750669 samples/sec                   batch loss = 401.68605637550354 | accuracy = 0.653125


Epoch[2] Batch[165] Speed: 1.2483789772918792 samples/sec                   batch loss = 413.4268503189087 | accuracy = 0.656060606060606


Epoch[2] Batch[170] Speed: 1.255891411294014 samples/sec                   batch loss = 426.8521103858948 | accuracy = 0.6573529411764706


Epoch[2] Batch[175] Speed: 1.2468128538455696 samples/sec                   batch loss = 438.2398668527603 | accuracy = 0.6571428571428571


Epoch[2] Batch[180] Speed: 1.2566607141399213 samples/sec                   batch loss = 448.12640714645386 | accuracy = 0.6625


Epoch[2] Batch[185] Speed: 1.2536979016280152 samples/sec                   batch loss = 461.7426242828369 | accuracy = 0.6581081081081082


Epoch[2] Batch[190] Speed: 1.2559066414575306 samples/sec                   batch loss = 473.11202323436737 | accuracy = 0.6592105263157895


Epoch[2] Batch[195] Speed: 1.2553556762865714 samples/sec                   batch loss = 484.9751478433609 | accuracy = 0.6576923076923077


Epoch[2] Batch[200] Speed: 1.2471758079702555 samples/sec                   batch loss = 498.4894675016403 | accuracy = 0.6575


Epoch[2] Batch[205] Speed: 1.2510727603871183 samples/sec                   batch loss = 511.63473665714264 | accuracy = 0.6573170731707317


Epoch[2] Batch[210] Speed: 1.2426261269278436 samples/sec                   batch loss = 522.4700828790665 | accuracy = 0.6583333333333333


Epoch[2] Batch[215] Speed: 1.2490082583717168 samples/sec                   batch loss = 536.9609376192093 | accuracy = 0.6558139534883721


Epoch[2] Batch[220] Speed: 1.2572282763537568 samples/sec                   batch loss = 551.5434814691544 | accuracy = 0.6534090909090909


Epoch[2] Batch[225] Speed: 1.2543463434671132 samples/sec                   batch loss = 564.4894105195999 | accuracy = 0.6522222222222223


Epoch[2] Batch[230] Speed: 1.2563417031706474 samples/sec                   batch loss = 575.8652871847153 | accuracy = 0.6532608695652173


Epoch[2] Batch[235] Speed: 1.2582471613493529 samples/sec                   batch loss = 588.9567016363144 | accuracy = 0.652127659574468


Epoch[2] Batch[240] Speed: 1.2532062529682686 samples/sec                   batch loss = 602.5222702026367 | accuracy = 0.65


Epoch[2] Batch[245] Speed: 1.2605459385202311 samples/sec                   batch loss = 613.0986233949661 | accuracy = 0.6530612244897959


Epoch[2] Batch[250] Speed: 1.2535886754811014 samples/sec                   batch loss = 625.6638902425766 | accuracy = 0.654


Epoch[2] Batch[255] Speed: 1.2571130650632494 samples/sec                   batch loss = 639.0789704322815 | accuracy = 0.6509803921568628


Epoch[2] Batch[260] Speed: 1.2576041079955966 samples/sec                   batch loss = 649.6432905197144 | accuracy = 0.6528846153846154


Epoch[2] Batch[265] Speed: 1.2546836707756024 samples/sec                   batch loss = 662.7647691965103 | accuracy = 0.6537735849056604


Epoch[2] Batch[270] Speed: 1.2535458707675626 samples/sec                   batch loss = 677.0931898355484 | accuracy = 0.6509259259259259


Epoch[2] Batch[275] Speed: 1.2531656272679115 samples/sec                   batch loss = 689.0648930072784 | accuracy = 0.6509090909090909


Epoch[2] Batch[280] Speed: 1.256650642586104 samples/sec                   batch loss = 702.6550323963165 | accuracy = 0.6526785714285714


Epoch[2] Batch[285] Speed: 1.2548058512675193 samples/sec                   batch loss = 714.01793217659 | accuracy = 0.6552631578947369


Epoch[2] Batch[290] Speed: 1.2492849486446025 samples/sec                   batch loss = 725.7729350328445 | accuracy = 0.6560344827586206


Epoch[2] Batch[295] Speed: 1.2527774757736252 samples/sec                   batch loss = 736.1896872520447 | accuracy = 0.6584745762711864


Epoch[2] Batch[300] Speed: 1.2516985039795399 samples/sec                   batch loss = 747.6969923973083 | accuracy = 0.6575


Epoch[2] Batch[305] Speed: 1.247584337645501 samples/sec                   batch loss = 759.1908198595047 | accuracy = 0.6573770491803279


Epoch[2] Batch[310] Speed: 1.246700099328996 samples/sec                   batch loss = 771.7138102054596 | accuracy = 0.6580645161290323


Epoch[2] Batch[315] Speed: 1.250351281807828 samples/sec                   batch loss = 781.5626850128174 | accuracy = 0.6603174603174603


Epoch[2] Batch[320] Speed: 1.2470092269467368 samples/sec                   batch loss = 794.76635825634 | accuracy = 0.65859375


Epoch[2] Batch[325] Speed: 1.2496257018114312 samples/sec                   batch loss = 806.198119521141 | accuracy = 0.6592307692307692


Epoch[2] Batch[330] Speed: 1.2489520982378217 samples/sec                   batch loss = 818.6377502679825 | accuracy = 0.6590909090909091


Epoch[2] Batch[335] Speed: 1.244952745164489 samples/sec                   batch loss = 829.5378770828247 | accuracy = 0.6597014925373135


Epoch[2] Batch[340] Speed: 1.250423690442696 samples/sec                   batch loss = 841.8537540435791 | accuracy = 0.6595588235294118


Epoch[2] Batch[345] Speed: 1.2567860102348423 samples/sec                   batch loss = 853.3978017568588 | accuracy = 0.6615942028985508


Epoch[2] Batch[350] Speed: 1.2481676863463507 samples/sec                   batch loss = 863.6343767642975 | accuracy = 0.6621428571428571


Epoch[2] Batch[355] Speed: 1.2463408477326428 samples/sec                   batch loss = 877.8433554172516 | accuracy = 0.6612676056338028


Epoch[2] Batch[360] Speed: 1.249464141913229 samples/sec                   batch loss = 888.2475144863129 | accuracy = 0.6638888888888889


Epoch[2] Batch[365] Speed: 1.2520822493337787 samples/sec                   batch loss = 898.5019053220749 | accuracy = 0.665068493150685


Epoch[2] Batch[370] Speed: 1.247851487279695 samples/sec                   batch loss = 910.4421279430389 | accuracy = 0.6655405405405406


Epoch[2] Batch[375] Speed: 1.2511553294153295 samples/sec                   batch loss = 918.505498945713 | accuracy = 0.668


Epoch[2] Batch[380] Speed: 1.24888534491332 samples/sec                   batch loss = 930.3205514550209 | accuracy = 0.6671052631578948


Epoch[2] Batch[385] Speed: 1.2451618400103786 samples/sec                   batch loss = 942.9398712515831 | accuracy = 0.6662337662337663


Epoch[2] Batch[390] Speed: 1.2449148698295274 samples/sec                   batch loss = 953.5861820578575 | accuracy = 0.6673076923076923


Epoch[2] Batch[395] Speed: 1.24742914831955 samples/sec                   batch loss = 964.645646750927 | accuracy = 0.6677215189873418


Epoch[2] Batch[400] Speed: 1.252171026283386 samples/sec                   batch loss = 976.5738129019737 | accuracy = 0.668125


Epoch[2] Batch[405] Speed: 1.2538651500601738 samples/sec                   batch loss = 988.8004927039146 | accuracy = 0.6679012345679012


Epoch[2] Batch[410] Speed: 1.2510945911381257 samples/sec                   batch loss = 1000.5410313010216 | accuracy = 0.6670731707317074


Epoch[2] Batch[415] Speed: 1.2560960152014335 samples/sec                   batch loss = 1012.5645381808281 | accuracy = 0.6680722891566265


Epoch[2] Batch[420] Speed: 1.2518758682226032 samples/sec                   batch loss = 1024.5836201310158 | accuracy = 0.6678571428571428


Epoch[2] Batch[425] Speed: 1.2509127847575663 samples/sec                   batch loss = 1033.7135543227196 | accuracy = 0.67


Epoch[2] Batch[430] Speed: 1.2489111900614838 samples/sec                   batch loss = 1047.5648924708366 | accuracy = 0.6680232558139535


Epoch[2] Batch[435] Speed: 1.24990797347878 samples/sec                   batch loss = 1058.5300522446632 | accuracy = 0.6683908045977012


Epoch[2] Batch[440] Speed: 1.2518419605874456 samples/sec                   batch loss = 1069.3934921622276 | accuracy = 0.6710227272727273


Epoch[2] Batch[445] Speed: 1.2550976041462498 samples/sec                   batch loss = 1082.4550133347511 | accuracy = 0.6702247191011236


Epoch[2] Batch[450] Speed: 1.2565464538351885 samples/sec                   batch loss = 1091.6333395838737 | accuracy = 0.6722222222222223


Epoch[2] Batch[455] Speed: 1.2529866808864671 samples/sec                   batch loss = 1101.9042028784752 | accuracy = 0.6725274725274726


Epoch[2] Batch[460] Speed: 1.2556224999386305 samples/sec                   batch loss = 1112.3228654265404 | accuracy = 0.6739130434782609


Epoch[2] Batch[465] Speed: 1.2563931668428505 samples/sec                   batch loss = 1125.8247107863426 | accuracy = 0.6736559139784947


Epoch[2] Batch[470] Speed: 1.2539342174712635 samples/sec                   batch loss = 1137.8104726672173 | accuracy = 0.673936170212766


Epoch[2] Batch[475] Speed: 1.257366501240825 samples/sec                   batch loss = 1147.7218328118324 | accuracy = 0.6736842105263158


Epoch[2] Batch[480] Speed: 1.249243274504663 samples/sec                   batch loss = 1157.1870413422585 | accuracy = 0.675


Epoch[2] Batch[485] Speed: 1.2529260453810738 samples/sec                   batch loss = 1170.5382505059242 | accuracy = 0.6731958762886598


Epoch[2] Batch[490] Speed: 1.254285576297814 samples/sec                   batch loss = 1182.4898042082787 | accuracy = 0.6729591836734694


Epoch[2] Batch[495] Speed: 1.255521300295321 samples/sec                   batch loss = 1196.5591389536858 | accuracy = 0.6732323232323232


Epoch[2] Batch[500] Speed: 1.2495165326254642 samples/sec                   batch loss = 1212.2514342665672 | accuracy = 0.671


Epoch[2] Batch[505] Speed: 1.2542419738555555 samples/sec                   batch loss = 1223.2752950787544 | accuracy = 0.6717821782178218


Epoch[2] Batch[510] Speed: 1.2511303243105505 samples/sec                   batch loss = 1233.8588513731956 | accuracy = 0.6725490196078432


Epoch[2] Batch[515] Speed: 1.2496038292188874 samples/sec                   batch loss = 1244.38153475523 | accuracy = 0.6742718446601942


Epoch[2] Batch[520] Speed: 1.261760821450846 samples/sec                   batch loss = 1253.4004234671593 | accuracy = 0.6764423076923077


Epoch[2] Batch[525] Speed: 1.25961505542325 samples/sec                   batch loss = 1265.92885607481 | accuracy = 0.6757142857142857


Epoch[2] Batch[530] Speed: 1.2525992948592126 samples/sec                   batch loss = 1275.8038333058357 | accuracy = 0.6768867924528302


Epoch[2] Batch[535] Speed: 1.2571172096658234 samples/sec                   batch loss = 1287.6020982861519 | accuracy = 0.6766355140186916


Epoch[2] Batch[540] Speed: 1.254146527908232 samples/sec                   batch loss = 1298.1767181754112 | accuracy = 0.6773148148148148


Epoch[2] Batch[545] Speed: 1.2528123696248474 samples/sec                   batch loss = 1308.204549252987 | accuracy = 0.6788990825688074


Epoch[2] Batch[550] Speed: 1.2552551770659894 samples/sec                   batch loss = 1319.1921206116676 | accuracy = 0.6786363636363636


Epoch[2] Batch[555] Speed: 1.2554975296718507 samples/sec                   batch loss = 1330.4895817637444 | accuracy = 0.6788288288288288


Epoch[2] Batch[560] Speed: 1.2565216092047167 samples/sec                   batch loss = 1341.4065485596657 | accuracy = 0.6799107142857143


Epoch[2] Batch[565] Speed: 1.2546473590413203 samples/sec                   batch loss = 1352.693546116352 | accuracy = 0.6792035398230089


Epoch[2] Batch[570] Speed: 1.2566045225795677 samples/sec                   batch loss = 1364.4474239945412 | accuracy = 0.6793859649122806


Epoch[2] Batch[575] Speed: 1.2518102030358227 samples/sec                   batch loss = 1376.6548102498055 | accuracy = 0.6791304347826087


Epoch[2] Batch[580] Speed: 1.25183719685198 samples/sec                   batch loss = 1388.8639717698097 | accuracy = 0.6788793103448276


Epoch[2] Batch[585] Speed: 1.2487709139639402 samples/sec                   batch loss = 1402.458060324192 | accuracy = 0.6786324786324787


Epoch[2] Batch[590] Speed: 1.2559602320064136 samples/sec                   batch loss = 1414.8303019404411 | accuracy = 0.6788135593220339


Epoch[2] Batch[595] Speed: 1.2503043185284561 samples/sec                   batch loss = 1425.3263667821884 | accuracy = 0.6789915966386555


Epoch[2] Batch[600] Speed: 1.2561045731590448 samples/sec                   batch loss = 1437.248480439186 | accuracy = 0.67875


Epoch[2] Batch[605] Speed: 1.250787910940455 samples/sec                   batch loss = 1446.978549838066 | accuracy = 0.6793388429752066


Epoch[2] Batch[610] Speed: 1.2504824995076993 samples/sec                   batch loss = 1457.9487804174423 | accuracy = 0.6790983606557377


Epoch[2] Batch[615] Speed: 1.2564845323031146 samples/sec                   batch loss = 1469.3240568637848 | accuracy = 0.6792682926829269


Epoch[2] Batch[620] Speed: 1.251574687020837 samples/sec                   batch loss = 1480.0750637054443 | accuracy = 0.6798387096774193


Epoch[2] Batch[625] Speed: 1.2585752604362805 samples/sec                   batch loss = 1493.3884094953537 | accuracy = 0.68


Epoch[2] Batch[630] Speed: 1.2523892841885294 samples/sec                   batch loss = 1503.5747616291046 | accuracy = 0.6805555555555556


Epoch[2] Batch[635] Speed: 1.2540501591181323 samples/sec                   batch loss = 1516.081068277359 | accuracy = 0.6795275590551181


Epoch[2] Batch[640] Speed: 1.254322898645555 samples/sec                   batch loss = 1528.7753256559372 | accuracy = 0.679296875


Epoch[2] Batch[645] Speed: 1.247197688361447 samples/sec                   batch loss = 1542.3396402597427 | accuracy = 0.6794573643410853


Epoch[2] Batch[650] Speed: 1.2544989437803726 samples/sec                   batch loss = 1553.5847265720367 | accuracy = 0.6796153846153846


Epoch[2] Batch[655] Speed: 1.2507711262192756 samples/sec                   batch loss = 1566.5705738067627 | accuracy = 0.6793893129770993


Epoch[2] Batch[660] Speed: 1.2468597405269013 samples/sec                   batch loss = 1578.606542468071 | accuracy = 0.6795454545454546


Epoch[2] Batch[665] Speed: 1.2520367443463736 samples/sec                   batch loss = 1588.6382246017456 | accuracy = 0.6804511278195489


Epoch[2] Batch[670] Speed: 1.2448027353996878 samples/sec                   batch loss = 1602.111221909523 | accuracy = 0.6802238805970149


Epoch[2] Batch[675] Speed: 1.2553279669508135 samples/sec                   batch loss = 1611.9383506774902 | accuracy = 0.6811111111111111


Epoch[2] Batch[680] Speed: 1.2504850160253365 samples/sec                   batch loss = 1621.3423410654068 | accuracy = 0.6819852941176471


Epoch[2] Batch[685] Speed: 1.2510275153162587 samples/sec                   batch loss = 1634.7361050844193 | accuracy = 0.6813868613138686


Epoch[2] Batch[690] Speed: 1.2470729989537863 samples/sec                   batch loss = 1647.711884856224 | accuracy = 0.6793478260869565


Epoch[2] Batch[695] Speed: 1.2453640718612657 samples/sec                   batch loss = 1658.083557009697 | accuracy = 0.6794964028776979


Epoch[2] Batch[700] Speed: 1.2459806927094097 samples/sec                   batch loss = 1669.8093972206116 | accuracy = 0.6796428571428571


Epoch[2] Batch[705] Speed: 1.249093717006225 samples/sec                   batch loss = 1682.1909042596817 | accuracy = 0.6790780141843972


Epoch[2] Batch[710] Speed: 1.2465573556116278 samples/sec                   batch loss = 1693.8725041151047 | accuracy = 0.6788732394366197


Epoch[2] Batch[715] Speed: 1.243810201914718 samples/sec                   batch loss = 1704.3704404830933 | accuracy = 0.679020979020979


Epoch[2] Batch[720] Speed: 1.2508582251685807 samples/sec                   batch loss = 1716.139067530632 | accuracy = 0.6791666666666667


Epoch[2] Batch[725] Speed: 1.2531746133710673 samples/sec                   batch loss = 1727.7804045677185 | accuracy = 0.68


Epoch[2] Batch[730] Speed: 1.2473735938045654 samples/sec                   batch loss = 1737.2554550170898 | accuracy = 0.6808219178082192


Epoch[2] Batch[735] Speed: 1.2452791227861175 samples/sec                   batch loss = 1745.852030158043 | accuracy = 0.6812925170068027


Epoch[2] Batch[740] Speed: 1.246267800240485 samples/sec                   batch loss = 1757.9111560583115 | accuracy = 0.6814189189189189


Epoch[2] Batch[745] Speed: 1.2456536698235392 samples/sec                   batch loss = 1769.9868153333664 | accuracy = 0.6812080536912751


Epoch[2] Batch[750] Speed: 1.2530269206496207 samples/sec                   batch loss = 1782.1301176548004 | accuracy = 0.681


Epoch[2] Batch[755] Speed: 1.2475079904323012 samples/sec                   batch loss = 1792.2249233722687 | accuracy = 0.6824503311258279


Epoch[2] Batch[760] Speed: 1.2502596880464507 samples/sec                   batch loss = 1805.1982262134552 | accuracy = 0.6815789473684211


Epoch[2] Batch[765] Speed: 1.2474821105694958 samples/sec                   batch loss = 1817.1887669563293 | accuracy = 0.680718954248366


Epoch[2] Batch[770] Speed: 1.2531340833218494 samples/sec                   batch loss = 1829.5699157714844 | accuracy = 0.6801948051948052


Epoch[2] Batch[775] Speed: 1.255837826617865 samples/sec                   batch loss = 1838.1681036949158 | accuracy = 0.6812903225806451


Epoch[2] Batch[780] Speed: 1.2469160833381394 samples/sec                   batch loss = 1848.6856924891472 | accuracy = 0.6817307692307693


Epoch[2] Batch[785] Speed: 1.251300248078995 samples/sec                   batch loss = 1861.1953191161156 | accuracy = 0.6812101910828026


[Epoch 2] training: accuracy=0.6814720812182741
[Epoch 2] time cost: 645.7363798618317
[Epoch 2] validation: validation accuracy=0.7377777777777778
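
With this much output, the per-epoch summary lines are easy to lose among the batch lines. The following is a minimal sketch, not part of the original tutorial, of one way to extract them with plain Python. The helper name `parse_epoch_summaries` and the regular expression are illustrative assumptions; the values in `LOG` are copied from the output above.

```python
import re

# Per-epoch summary lines copied verbatim from the training output above.
LOG = """\
[Epoch 1] training: accuracy=0.5853426395939086
[Epoch 1] time cost: 646.8763015270233
[Epoch 1] validation: validation accuracy=0.6877777777777778
[Epoch 2] training: accuracy=0.6814720812182741
[Epoch 2] time cost: 645.7363798618317
[Epoch 2] validation: validation accuracy=0.7377777777777778
"""

# Matches "[Epoch N] <key>: [<label>=]<number>". Per-batch lines begin with
# "Epoch[" rather than "[Epoch", so they do not match and are skipped.
SUMMARY_RE = re.compile(
    r"\[Epoch (\d+)\] (training|time cost|validation): (?:.*=)?\s*([\d.]+)"
)


def parse_epoch_summaries(log_text):
    """Hypothetical helper: return {epoch: {key: value}} for summary lines."""
    summaries = {}
    for line in log_text.splitlines():
        match = SUMMARY_RE.match(line.strip())
        if match is None:
            continue  # not a per-epoch summary line
        epoch = int(match.group(1))
        summaries.setdefault(epoch, {})[match.group(2)] = float(match.group(3))
    return summaries


summaries = parse_epoch_summaries(LOG)
for epoch, stats in sorted(summaries.items()):
    print(
        f"epoch {epoch}: train acc {stats['training']:.3f}, "
        f"val acc {stats['validation']:.3f}, "
        f"time cost {stats['time cost']:.1f}"
    )
```

The same function can be applied to the full captured output, since it ignores every line that is not a per-epoch summary.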


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you want to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).