<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).
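
Before installing the GPU build, it can help to confirm that the NVIDIA driver tools are visible from Python. The sketch below is not part of MXNet. It assumes the driver installation placed the `nvidia-smi` utility on your `PATH`, and the helper name `nvidia_smi_path` is hypothetical. It only checks that the tool can be found, not that CUDA itself is configured correctly.

```python
import shutil


def nvidia_smi_path():
    """Return the path to nvidia-smi if it is on PATH, otherwise None."""
    # shutil.which searches the directories listed in PATH, like a shell would.
    return shutil.which("nvidia-smi")


path = nvidia_smi_path()
if path is None:
    print("nvidia-smi not found; check the NVIDIA driver installation")
else:
    print(f"nvidia-smi found at {path}")
```

If the tool is found, running it from a terminal lists the GPUs that the driver can see, which is a useful cross-check before moving on.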

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[15:32:27] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the printed array shows which GPU the ndarray is allocated on, for example `ctx=gpu(0)`.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The following code copies `x` to the second GPU, `npx.gpu(1)`. If only one GPU is available, the code falls back to copying `x` to the CPU, which is the case in the output shown below.

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[15:32:27] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids

If you have multiple GPUs on your machine, MXNet can access each of them through zero-based indexing with `npx`. As you saw before, the first GPU is accessed with `npx.gpu(0)` and the second with `npx.gpu(1)`. This extends to however many GPUs your machine has, so on a machine with eight GPUs the last one is accessed with `npx.gpu(7)`. This lets you select which GPUs to use for operations and training, which is particularly useful when you want to leverage multiple GPUs while training neural networks.
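
The index arithmetic above can be sketched in plain Python. `select_gpu_ids` is a hypothetical helper, not an MXNet function. Given the number of GPUs on the machine (for example, the value returned by `npx.num_gpus()`) and the number of GPUs you would like to use, it returns the zero-based ids to pick. Each id could then be passed to `npx.gpu(i)`, and an empty list signals that the caller should fall back to `npx.cpu()`.

```python
def select_gpu_ids(num_available, num_requested):
    """Return the zero-based ids of the GPUs to use.

    Hypothetical helper for illustration; it performs no MXNet calls.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    # Never request more GPUs than the machine exposes.
    count = min(num_available, num_requested)
    return list(range(count))


print(select_gpu_ids(8, 8))  # ids of all GPUs on an eight-GPU machine
print(select_gpu_ids(1, 2))  # two requested, but only one available
print(select_gpu_ids(0, 2))  # no GPU: the caller should use the CPU
```

Clamping the request to the available count avoids asking for a GPU id that does not exist on the machine.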

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to ensure that the inputs of the operation are already on that GPU. The output is allocated on the same GPU. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[15:32:28] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 5.024439, -4.458021]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, you split each data batch into *n* parts and have each GPU run the forward and backward passes on its own part of the data.
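
To make the arithmetic concrete, here is a minimal plain-Python sketch of data parallelism for a one-parameter model with a squared-error loss. It is an illustration under simplifying assumptions, not how MXNet implements it: `split_batch` and `grad_sum` are hypothetical helpers, each "device" is just a list slice, and the batch size is assumed to be divisible by the number of devices. Each shard computes the sum of its per-sample gradients, the partial sums are added, and the total is divided by the full batch size, which gives the same gradient as processing the whole batch at once.

```python
def split_batch(batch, num_devices):
    """Split a list into num_devices equal contiguous shards."""
    if len(batch) % num_devices != 0:
        raise ValueError("batch size must be divisible by num_devices")
    step = len(batch) // num_devices
    return [batch[i * step:(i + 1) * step] for i in range(num_devices)]


def grad_sum(w, samples):
    """Sum of d/dw (w*x - y)^2 over the (x, y) pairs in samples."""
    return sum(2 * (w * x - y) * x for x, y in samples)


w = 0.5
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

# Single-device reference: mean gradient over the whole batch.
full_grad = grad_sum(w, batch) / len(batch)

# Data parallel: one shard per device, then combine the partial sums.
shards = split_batch(batch, num_devices=2)
partial_sums = [grad_sum(w, shard) for shard in shards]
parallel_grad = sum(partial_sums) / len(batch)

print(shards)
print(full_grad, parallel_grad)
```

In Gluon, `trainer.step(batch_size)` plays the role of that final division, since it normalizes the gradients by `1/batch_size`.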

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same `test` function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7708974763208359 samples/sec                   batch loss = 14.146923542022705 | accuracy = 0.55


Epoch[1] Batch[10] Speed: 1.2608286175929058 samples/sec                   batch loss = 28.940996170043945 | accuracy = 0.5


Epoch[1] Batch[15] Speed: 1.2637960428032424 samples/sec                   batch loss = 42.64250826835632 | accuracy = 0.5


Epoch[1] Batch[20] Speed: 1.2437338549085304 samples/sec                   batch loss = 58.63893389701843 | accuracy = 0.475


Epoch[1] Batch[25] Speed: 1.2456745719491984 samples/sec                   batch loss = 72.15288019180298 | accuracy = 0.49


Epoch[1] Batch[30] Speed: 1.2550595784678924 samples/sec                   batch loss = 85.74534058570862 | accuracy = 0.49166666666666664


Epoch[1] Batch[35] Speed: 1.247919243995456 samples/sec                   batch loss = 100.68868327140808 | accuracy = 0.4714285714285714


Epoch[1] Batch[40] Speed: 1.2502485076314702 samples/sec                   batch loss = 114.30545258522034 | accuracy = 0.49375


Epoch[1] Batch[45] Speed: 1.2472490546692026 samples/sec                   batch loss = 127.72080039978027 | accuracy = 0.5055555555555555


Epoch[1] Batch[50] Speed: 1.2565817461924798 samples/sec                   batch loss = 141.66975021362305 | accuracy = 0.505


Epoch[1] Batch[55] Speed: 1.2493024377279862 samples/sec                   batch loss = 155.8163914680481 | accuracy = 0.5045454545454545


Epoch[1] Batch[60] Speed: 1.2586816745224787 samples/sec                   batch loss = 169.20685482025146 | accuracy = 0.5166666666666667


Epoch[1] Batch[65] Speed: 1.2503055298385595 samples/sec                   batch loss = 182.7382471561432 | accuracy = 0.5192307692307693


Epoch[1] Batch[70] Speed: 1.257152722494509 samples/sec                   batch loss = 197.05877161026 | accuracy = 0.5107142857142857


Epoch[1] Batch[75] Speed: 1.250497598765454 samples/sec                   batch loss = 211.52742767333984 | accuracy = 0.5033333333333333


Epoch[1] Batch[80] Speed: 1.248333833965032 samples/sec                   batch loss = 225.56001949310303 | accuracy = 0.5125


Epoch[1] Batch[85] Speed: 1.2571228614406487 samples/sec                   batch loss = 239.35807180404663 | accuracy = 0.5117647058823529


Epoch[1] Batch[90] Speed: 1.2572858428079448 samples/sec                   batch loss = 252.1621856689453 | accuracy = 0.525


Epoch[1] Batch[95] Speed: 1.2630555402269015 samples/sec                   batch loss = 266.3341407775879 | accuracy = 0.5210526315789473


Epoch[1] Batch[100] Speed: 1.2486258371200214 samples/sec                   batch loss = 280.64311599731445 | accuracy = 0.52


Epoch[1] Batch[105] Speed: 1.2563961776452577 samples/sec                   batch loss = 294.2069556713104 | accuracy = 0.5142857142857142


Epoch[1] Batch[110] Speed: 1.2498635575112353 samples/sec                   batch loss = 308.2631893157959 | accuracy = 0.5113636363636364


Epoch[1] Batch[115] Speed: 1.2458122105400726 samples/sec                   batch loss = 322.0647728443146 | accuracy = 0.508695652173913


Epoch[1] Batch[120] Speed: 1.2641953398910095 samples/sec                   batch loss = 336.2479293346405 | accuracy = 0.5083333333333333


Epoch[1] Batch[125] Speed: 1.2604594738984871 samples/sec                   batch loss = 350.1795928478241 | accuracy = 0.508


Epoch[1] Batch[130] Speed: 1.2547135098978772 samples/sec                   batch loss = 364.27485847473145 | accuracy = 0.5057692307692307


Epoch[1] Batch[135] Speed: 1.2604800235609135 samples/sec                   batch loss = 377.86392092704773 | accuracy = 0.512962962962963


Epoch[1] Batch[140] Speed: 1.2548771811409878 samples/sec                   batch loss = 391.98741030693054 | accuracy = 0.5107142857142857


Epoch[1] Batch[145] Speed: 1.26119361639926 samples/sec                   batch loss = 405.47666454315186 | accuracy = 0.5137931034482759


Epoch[1] Batch[150] Speed: 1.2608432096985904 samples/sec                   batch loss = 419.0911765098572 | accuracy = 0.5183333333333333


Epoch[1] Batch[155] Speed: 1.2621318672721091 samples/sec                   batch loss = 433.24174880981445 | accuracy = 0.5209677419354839


Epoch[1] Batch[160] Speed: 1.2492550880967783 samples/sec                   batch loss = 447.5937166213989 | accuracy = 0.515625


Epoch[1] Batch[165] Speed: 1.25866741569055 samples/sec                   batch loss = 461.3152735233307 | accuracy = 0.5151515151515151


Epoch[1] Batch[170] Speed: 1.2576150432719848 samples/sec                   batch loss = 474.92298221588135 | accuracy = 0.5176470588235295


Epoch[1] Batch[175] Speed: 1.2524104129750944 samples/sec                   batch loss = 488.5824065208435 | accuracy = 0.5185714285714286


Epoch[1] Batch[180] Speed: 1.2520035754439418 samples/sec                   batch loss = 502.5283203125 | accuracy = 0.5125


Epoch[1] Batch[185] Speed: 1.2543726964927437 samples/sec                   batch loss = 516.000550031662 | accuracy = 0.518918918918919


Epoch[1] Batch[190] Speed: 1.2495830742327103 samples/sec                   batch loss = 529.89142537117 | accuracy = 0.5184210526315789


Epoch[1] Batch[195] Speed: 1.2465592080137784 samples/sec                   batch loss = 544.297440290451 | accuracy = 0.5153846153846153


Epoch[1] Batch[200] Speed: 1.2585791314407955 samples/sec                   batch loss = 557.5053825378418 | accuracy = 0.51625


Epoch[1] Batch[205] Speed: 1.2465794921775302 samples/sec                   batch loss = 570.8390424251556 | accuracy = 0.5170731707317073


Epoch[1] Batch[210] Speed: 1.262829651915277 samples/sec                   batch loss = 584.3062422275543 | accuracy = 0.5190476190476191


Epoch[1] Batch[215] Speed: 1.2516738506861922 samples/sec                   batch loss = 597.6816396713257 | accuracy = 0.5197674418604651


Epoch[1] Batch[220] Speed: 1.2627308036643334 samples/sec                   batch loss = 611.7423121929169 | accuracy = 0.5159090909090909


Epoch[1] Batch[225] Speed: 1.2531709627511174 samples/sec                   batch loss = 625.1763129234314 | accuracy = 0.5155555555555555


Epoch[1] Batch[230] Speed: 1.2572689774578778 samples/sec                   batch loss = 638.2157790660858 | accuracy = 0.5184782608695652


Epoch[1] Batch[235] Speed: 1.2586727981255716 samples/sec                   batch loss = 652.258088350296 | accuracy = 0.5170212765957447


Epoch[1] Batch[240] Speed: 1.253277026608095 samples/sec                   batch loss = 665.8839037418365 | accuracy = 0.5177083333333333


Epoch[1] Batch[245] Speed: 1.271834649220891 samples/sec                   batch loss = 679.5999207496643 | accuracy = 0.5173469387755102


Epoch[1] Batch[250] Speed: 1.2705947180686057 samples/sec                   batch loss = 692.9893007278442 | accuracy = 0.521


Epoch[1] Batch[255] Speed: 1.2862599580937808 samples/sec                   batch loss = 706.5926313400269 | accuracy = 0.5196078431372549


Epoch[1] Batch[260] Speed: 1.2813088565277138 samples/sec                   batch loss = 719.4164891242981 | accuracy = 0.5230769230769231


Epoch[1] Batch[265] Speed: 1.2684725313492482 samples/sec                   batch loss = 733.096732378006 | accuracy = 0.5226415094339623


Epoch[1] Batch[270] Speed: 1.2842411546799835 samples/sec                   batch loss = 746.7912704944611 | accuracy = 0.5231481481481481


Epoch[1] Batch[275] Speed: 1.2709882133781207 samples/sec                   batch loss = 760.5426642894745 | accuracy = 0.5254545454545455


Epoch[1] Batch[280] Speed: 1.2815632349395916 samples/sec                   batch loss = 774.365877866745 | accuracy = 0.525


Epoch[1] Batch[285] Speed: 1.276294158699662 samples/sec                   batch loss = 787.8687000274658 | accuracy = 0.525438596491228


Epoch[1] Batch[290] Speed: 1.285248483664562 samples/sec                   batch loss = 801.4669020175934 | accuracy = 0.5275862068965518


Epoch[1] Batch[295] Speed: 1.2829264998947412 samples/sec                   batch loss = 814.8826897144318 | accuracy = 0.5288135593220339


Epoch[1] Batch[300] Speed: 1.2822106472936663 samples/sec                   batch loss = 829.0210816860199 | accuracy = 0.5291666666666667


Epoch[1] Batch[305] Speed: 1.283439981842201 samples/sec                   batch loss = 843.309029340744 | accuracy = 0.5278688524590164


Epoch[1] Batch[310] Speed: 1.288120628022317 samples/sec                   batch loss = 857.1752033233643 | accuracy = 0.5266129032258065


Epoch[1] Batch[315] Speed: 1.2866386478512215 samples/sec                   batch loss = 871.0015351772308 | accuracy = 0.526984126984127


Epoch[1] Batch[320] Speed: 1.2831419709440193 samples/sec                   batch loss = 883.7281987667084 | accuracy = 0.53046875


Epoch[1] Batch[325] Speed: 1.2746340156289375 samples/sec                   batch loss = 896.052102804184 | accuracy = 0.5330769230769231


Epoch[1] Batch[330] Speed: 1.276381255810001 samples/sec                   batch loss = 909.1331303119659 | accuracy = 0.5348484848484848


Epoch[1] Batch[335] Speed: 1.2796310634011387 samples/sec                   batch loss = 922.428925037384 | accuracy = 0.5365671641791044


Epoch[1] Batch[340] Speed: 1.2851886235345136 samples/sec                   batch loss = 935.6175169944763 | accuracy = 0.5375


Epoch[1] Batch[345] Speed: 1.2757909372330651 samples/sec                   batch loss = 949.2111115455627 | accuracy = 0.5355072463768116


Epoch[1] Batch[350] Speed: 1.2783225137911067 samples/sec                   batch loss = 961.6875133514404 | accuracy = 0.5392857142857143


Epoch[1] Batch[355] Speed: 1.279881065739588 samples/sec                   batch loss = 975.7481660842896 | accuracy = 0.5380281690140845


Epoch[1] Batch[360] Speed: 1.2794870220717631 samples/sec                   batch loss = 989.9220020771027 | accuracy = 0.5375


Epoch[1] Batch[365] Speed: 1.2795356177571358 samples/sec                   batch loss = 1002.7127754688263 | accuracy = 0.5404109589041096


Epoch[1] Batch[370] Speed: 1.2825545019399753 samples/sec                   batch loss = 1015.9122006893158 | accuracy = 0.5412162162162162


Epoch[1] Batch[375] Speed: 1.2806116202348838 samples/sec                   batch loss = 1029.7878077030182 | accuracy = 0.54


Epoch[1] Batch[380] Speed: 1.282930227831313 samples/sec                   batch loss = 1043.5378665924072 | accuracy = 0.5414473684210527


Epoch[1] Batch[385] Speed: 1.2758293563513499 samples/sec                   batch loss = 1057.1021583080292 | accuracy = 0.5415584415584416


Epoch[1] Batch[390] Speed: 1.276944809781285 samples/sec                   batch loss = 1070.5999774932861 | accuracy = 0.5423076923076923


Epoch[1] Batch[395] Speed: 1.278295729227581 samples/sec                   batch loss = 1084.389487028122 | accuracy = 0.5424050632911392


Epoch[1] Batch[400] Speed: 1.2762486243628823 samples/sec                   batch loss = 1097.6518347263336 | accuracy = 0.543125


Epoch[1] Batch[405] Speed: 1.2804271933072153 samples/sec                   batch loss = 1112.109491109848 | accuracy = 0.5432098765432098


Epoch[1] Batch[410] Speed: 1.2766370812603245 samples/sec                   batch loss = 1125.7554404735565 | accuracy = 0.5439024390243903


Epoch[1] Batch[415] Speed: 1.287630867223612 samples/sec                   batch loss = 1139.1045744419098 | accuracy = 0.5445783132530121


Epoch[1] Batch[420] Speed: 1.288586810131632 samples/sec                   batch loss = 1153.155190706253 | accuracy = 0.544047619047619


Epoch[1] Batch[425] Speed: 1.2823045322847366 samples/sec                   batch loss = 1167.1412661075592 | accuracy = 0.5435294117647059


Epoch[1] Batch[430] Speed: 1.2775880495996594 samples/sec                   batch loss = 1180.8331224918365 | accuracy = 0.5436046511627907


Epoch[1] Batch[435] Speed: 1.2735460852341076 samples/sec                   batch loss = 1194.1735877990723 | accuracy = 0.5442528735632184


Epoch[1] Batch[440] Speed: 1.2792558040233095 samples/sec                   batch loss = 1207.0057849884033 | accuracy = 0.5465909090909091


Epoch[1] Batch[445] Speed: 1.2846168859593878 samples/sec                   batch loss = 1220.387853860855 | accuracy = 0.5460674157303371


Epoch[1] Batch[450] Speed: 1.2790274972505493 samples/sec                   batch loss = 1233.7855160236359 | accuracy = 0.5461111111111111


Epoch[1] Batch[455] Speed: 1.2840575481819612 samples/sec                   batch loss = 1247.369393825531 | accuracy = 0.5467032967032966


Epoch[1] Batch[460] Speed: 1.2820701394983172 samples/sec                   batch loss = 1260.6315865516663 | accuracy = 0.5467391304347826


Epoch[1] Batch[465] Speed: 1.2782846261376088 samples/sec                   batch loss = 1274.4049100875854 | accuracy = 0.5451612903225806


Epoch[1] Batch[470] Speed: 1.2845013210049745 samples/sec                   batch loss = 1287.8816604614258 | accuracy = 0.5452127659574468


Epoch[1] Batch[475] Speed: 1.2807493642329637 samples/sec                   batch loss = 1301.2942335605621 | accuracy = 0.5457894736842105


Epoch[1] Batch[480] Speed: 1.280249462671483 samples/sec                   batch loss = 1313.4471340179443 | accuracy = 0.5479166666666667


Epoch[1] Batch[485] Speed: 1.2771514704827809 samples/sec                   batch loss = 1327.1338255405426 | accuracy = 0.5463917525773195


Epoch[1] Batch[490] Speed: 1.2771734430514499 samples/sec                   batch loss = 1340.299024105072 | accuracy = 0.5474489795918367


Epoch[1] Batch[495] Speed: 1.2831943778499502 samples/sec                   batch loss = 1354.0447709560394 | accuracy = 0.5484848484848485


Epoch[1] Batch[500] Speed: 1.2808563342328554 samples/sec                   batch loss = 1367.0835320949554 | accuracy = 0.5495


Epoch[1] Batch[505] Speed: 1.2827903469529707 samples/sec                   batch loss = 1380.26438498497 | accuracy = 0.5504950495049505


Epoch[1] Batch[510] Speed: 1.2846595764866449 samples/sec                   batch loss = 1394.777744293213 | accuracy = 0.5490196078431373


Epoch[1] Batch[515] Speed: 1.2768946613901921 samples/sec                   batch loss = 1408.3757569789886 | accuracy = 0.5490291262135922


Epoch[1] Batch[520] Speed: 1.2811997563025557 samples/sec                   batch loss = 1421.8991205692291 | accuracy = 0.5485576923076924


Epoch[1] Batch[525] Speed: 1.284231815823152 samples/sec                   batch loss = 1435.203153848648 | accuracy = 0.549047619047619


Epoch[1] Batch[530] Speed: 1.2826524577994274 samples/sec                   batch loss = 1447.8345465660095 | accuracy = 0.5504716981132075


Epoch[1] Batch[535] Speed: 1.2854552803014196 samples/sec                   batch loss = 1460.1504929065704 | accuracy = 0.5518691588785046


Epoch[1] Batch[540] Speed: 1.2847905183573272 samples/sec                   batch loss = 1472.526076078415 | accuracy = 0.5537037037037037


Epoch[1] Batch[545] Speed: 1.2819310338082512 samples/sec                   batch loss = 1485.2175085544586 | accuracy = 0.5545871559633028


Epoch[1] Batch[550] Speed: 1.2843013198347555 samples/sec                   batch loss = 1497.812839269638 | accuracy = 0.5563636363636364


Epoch[1] Batch[555] Speed: 1.2896174258531412 samples/sec                   batch loss = 1510.894053220749 | accuracy = 0.5554054054054054


Epoch[1] Batch[560] Speed: 1.2862334315341553 samples/sec                   batch loss = 1525.399429321289 | accuracy = 0.5549107142857143


Epoch[1] Batch[565] Speed: 1.284210386117755 samples/sec                   batch loss = 1537.48783826828 | accuracy = 0.5566371681415929


Epoch[1] Batch[570] Speed: 1.2820226246532596 samples/sec                   batch loss = 1550.310096502304 | accuracy = 0.5574561403508772


Epoch[1] Batch[575] Speed: 1.2851457993960498 samples/sec                   batch loss = 1563.7715034484863 | accuracy = 0.5573913043478261


Epoch[1] Batch[580] Speed: 1.2850509073367442 samples/sec                   batch loss = 1576.9496626853943 | accuracy = 0.5573275862068966


Epoch[1] Batch[585] Speed: 1.2872589019746974 samples/sec                   batch loss = 1589.8866052627563 | accuracy = 0.5576923076923077


Epoch[1] Batch[590] Speed: 1.285129654944411 samples/sec                   batch loss = 1601.2560222148895 | accuracy = 0.5601694915254237


Epoch[1] Batch[595] Speed: 1.2852812712637116 samples/sec                   batch loss = 1613.8995933532715 | accuracy = 0.5617647058823529


Epoch[1] Batch[600] Speed: 1.2838230053498696 samples/sec                   batch loss = 1627.361655831337 | accuracy = 0.5620833333333334


Epoch[1] Batch[605] Speed: 1.2816761177317841 samples/sec                   batch loss = 1638.85011947155 | accuracy = 0.5644628099173554


Epoch[1] Batch[610] Speed: 1.2832902719787689 samples/sec                   batch loss = 1651.2027081251144 | accuracy = 0.5655737704918032


Epoch[1] Batch[615] Speed: 1.2788544441201466 samples/sec                   batch loss = 1665.241593003273 | accuracy = 0.5654471544715447


Epoch[1] Batch[620] Speed: 1.2783470591767176 samples/sec                   batch loss = 1678.670848250389 | accuracy = 0.5661290322580645


Epoch[1] Batch[625] Speed: 1.2783582607571926 samples/sec                   batch loss = 1691.6446384191513 | accuracy = 0.5664


Epoch[1] Batch[630] Speed: 1.2827970165967804 samples/sec                   batch loss = 1705.309618115425 | accuracy = 0.5658730158730159


Epoch[1] Batch[635] Speed: 1.2799473654457112 samples/sec                   batch loss = 1717.7558780908585 | accuracy = 0.5669291338582677


Epoch[1] Batch[640] Speed: 1.2775026359526813 samples/sec                   batch loss = 1731.920468211174 | accuracy = 0.56640625


Epoch[1] Batch[645] Speed: 1.2826483392337558 samples/sec                   batch loss = 1743.3461116552353 | accuracy = 0.5678294573643411


Epoch[1] Batch[650] Speed: 1.2808601479356483 samples/sec                   batch loss = 1756.619965672493 | accuracy = 0.5676923076923077


Epoch[1] Batch[655] Speed: 1.2857163329546304 samples/sec                   batch loss = 1768.360130906105 | accuracy = 0.5687022900763359


Epoch[1] Batch[660] Speed: 1.2850392928813248 samples/sec                   batch loss = 1781.4781740903854 | accuracy = 0.5696969696969697


Epoch[1] Batch[665] Speed: 1.2802318779356552 samples/sec                   batch loss = 1798.0264493227005 | accuracy = 0.5676691729323309


Epoch[1] Batch[670] Speed: 1.2853212487979715 samples/sec                   batch loss = 1812.557920575142 | accuracy = 0.567910447761194


Epoch[1] Batch[675] Speed: 1.2822322063051965 samples/sec                   batch loss = 1826.7950471639633 | accuracy = 0.567037037037037


Epoch[1] Batch[680] Speed: 1.286431075762891 samples/sec                   batch loss = 1838.9658244848251 | accuracy = 0.5691176470588235


Epoch[1] Batch[685] Speed: 1.2869330533418712 samples/sec                   batch loss = 1852.762156367302 | accuracy = 0.5682481751824817


Epoch[1] Batch[690] Speed: 1.2796365290224037 samples/sec                   batch loss = 1865.3702915906906 | accuracy = 0.5688405797101449


Epoch[1] Batch[695] Speed: 1.2770486178031546 samples/sec                   batch loss = 1876.8056738376617 | accuracy = 0.570863309352518


Epoch[1] Batch[700] Speed: 1.2821487181658349 samples/sec                   batch loss = 1890.8706710338593 | accuracy = 0.5707142857142857


Epoch[1] Batch[705] Speed: 1.2765964764503817 samples/sec                   batch loss = 1904.120572566986 | accuracy = 0.5702127659574469


Epoch[1] Batch[710] Speed: 1.2800195315480278 samples/sec                   batch loss = 1918.0391840934753 | accuracy = 0.5704225352112676


Epoch[1] Batch[715] Speed: 1.2780610470641425 samples/sec                   batch loss = 1931.6813814640045 | accuracy = 0.5702797202797203


Epoch[1] Batch[720] Speed: 1.2757695942782505 samples/sec                   batch loss = 1944.350025653839 | accuracy = 0.5711805555555556


Epoch[1] Batch[725] Speed: 1.2747270848779364 samples/sec                   batch loss = 1957.694075345993 | accuracy = 0.5717241379310345


Epoch[1] Batch[730] Speed: 1.2775316247256334 samples/sec                   batch loss = 1970.2007462978363 | accuracy = 0.571917808219178


Epoch[1] Batch[735] Speed: 1.279905573372692 samples/sec                   batch loss = 1985.270444393158 | accuracy = 0.5700680272108843


Epoch[1] Batch[740] Speed: 1.2802785762404967 samples/sec                   batch loss = 1998.0188839435577 | accuracy = 0.5709459459459459


Epoch[1] Batch[745] Speed: 1.2744128733109232 samples/sec                   batch loss = 2010.832494020462 | accuracy = 0.5714765100671141


Epoch[1] Batch[750] Speed: 1.2806975478969145 samples/sec                   batch loss = 2024.4458612203598 | accuracy = 0.5723333333333334


Epoch[1] Batch[755] Speed: 1.277110541297746 samples/sec                   batch loss = 2037.3155895471573 | accuracy = 0.5728476821192053


Epoch[1] Batch[760] Speed: 1.280898872973145 samples/sec                   batch loss = 2050.180343747139 | accuracy = 0.5740131578947368


Epoch[1] Batch[765] Speed: 1.2785856465690915 samples/sec                   batch loss = 2060.7790018320084 | accuracy = 0.5751633986928104


Epoch[1] Batch[770] Speed: 1.2811629697428895 samples/sec                   batch loss = 2073.5706288814545 | accuracy = 0.5756493506493506


Epoch[1] Batch[775] Speed: 1.2918907926264378 samples/sec                   batch loss = 2086.6138553619385 | accuracy = 0.5758064516129032


Epoch[1] Batch[780] Speed: 1.2816846361474263 samples/sec                   batch loss = 2097.5481592416763 | accuracy = 0.5772435897435897


Epoch[1] Batch[785] Speed: 1.2752393541839921 samples/sec                   batch loss = 2109.20323574543 | accuracy = 0.578343949044586


[Epoch 1] training: accuracy=0.578997461928934
[Epoch 1] time cost: 637.2946498394012
[Epoch 1] validation: validation accuracy=0.6911111111111111


Epoch[2] Batch[5] Speed: 1.236534043458486 samples/sec                   batch loss = 14.097741842269897 | accuracy = 0.55


Epoch[2] Batch[10] Speed: 1.2574479239859768 samples/sec                   batch loss = 28.02176308631897 | accuracy = 0.525


Epoch[2] Batch[15] Speed: 1.2442915513268662 samples/sec                   batch loss = 41.6613826751709 | accuracy = 0.5


Epoch[2] Batch[20] Speed: 1.2494276664272899 samples/sec                   batch loss = 53.732855916023254 | accuracy = 0.5375


Epoch[2] Batch[25] Speed: 1.2583581446532957 samples/sec                   batch loss = 65.9274492263794 | accuracy = 0.56


Epoch[2] Batch[30] Speed: 1.2481028738357351 samples/sec                   batch loss = 77.01523470878601 | accuracy = 0.5916666666666667


Epoch[2] Batch[35] Speed: 1.2423914776463127 samples/sec                   batch loss = 90.20037066936493 | accuracy = 0.5928571428571429


Epoch[2] Batch[40] Speed: 1.2479909070486739 samples/sec                   batch loss = 102.18166923522949 | accuracy = 0.6125


Epoch[2] Batch[45] Speed: 1.2474764523912842 samples/sec                   batch loss = 114.70586466789246 | accuracy = 0.6222222222222222


Epoch[2] Batch[50] Speed: 1.2509592341069566 samples/sec                   batch loss = 127.49497723579407 | accuracy = 0.625


Epoch[2] Batch[55] Speed: 1.2451972350846745 samples/sec                   batch loss = 140.73393309116364 | accuracy = 0.6318181818181818


Epoch[2] Batch[60] Speed: 1.2425612444382825 samples/sec                   batch loss = 152.00941240787506 | accuracy = 0.6458333333333334


Epoch[2] Batch[65] Speed: 1.2510833024922687 samples/sec                   batch loss = 165.47770178318024 | accuracy = 0.6384615384615384


Epoch[2] Batch[70] Speed: 1.2388699926172362 samples/sec                   batch loss = 177.613476395607 | accuracy = 0.6428571428571429


Epoch[2] Batch[75] Speed: 1.2511232334925935 samples/sec                   batch loss = 189.2204818725586 | accuracy = 0.6533333333333333


Epoch[2] Batch[80] Speed: 1.247864852405454 samples/sec                   batch loss = 200.5622205734253 | accuracy = 0.65625


Epoch[2] Batch[85] Speed: 1.246084339743117 samples/sec                   batch loss = 213.91012954711914 | accuracy = 0.6529411764705882


Epoch[2] Batch[90] Speed: 1.249229414748357 samples/sec                   batch loss = 224.93893194198608 | accuracy = 0.6583333333333333


Epoch[2] Batch[95] Speed: 1.250489396654274 samples/sec                   batch loss = 237.6433663368225 | accuracy = 0.6526315789473685


Epoch[2] Batch[100] Speed: 1.2531523355599525 samples/sec                   batch loss = 249.140318274498 | accuracy = 0.6625


Epoch[2] Batch[105] Speed: 1.249017928838237 samples/sec                   batch loss = 262.3770695924759 | accuracy = 0.6619047619047619


Epoch[2] Batch[110] Speed: 1.2464194597323734 samples/sec                   batch loss = 274.47626662254333 | accuracy = 0.6613636363636364


Epoch[2] Batch[115] Speed: 1.244466637901217 samples/sec                   batch loss = 286.84412693977356 | accuracy = 0.6608695652173913


Epoch[2] Batch[120] Speed: 1.2406858063732729 samples/sec                   batch loss = 300.3105471134186 | accuracy = 0.6583333333333333


Epoch[2] Batch[125] Speed: 1.2474872122495912 samples/sec                   batch loss = 312.97582018375397 | accuracy = 0.66


Epoch[2] Batch[130] Speed: 1.2432682305001628 samples/sec                   batch loss = 327.2831919193268 | accuracy = 0.6557692307692308


Epoch[2] Batch[135] Speed: 1.2436204581789545 samples/sec                   batch loss = 339.5704392194748 | accuracy = 0.6555555555555556


Epoch[2] Batch[140] Speed: 1.2420575083656609 samples/sec                   batch loss = 353.6614247560501 | accuracy = 0.6553571428571429


Epoch[2] Batch[145] Speed: 1.2412252526641758 samples/sec                   batch loss = 365.9633187055588 | accuracy = 0.656896551724138


Epoch[2] Batch[150] Speed: 1.2404450106649394 samples/sec                   batch loss = 377.17498004436493 | accuracy = 0.6616666666666666


Epoch[2] Batch[155] Speed: 1.2380019877730337 samples/sec                   batch loss = 389.21450996398926 | accuracy = 0.6612903225806451


Epoch[2] Batch[160] Speed: 1.2474469564901873 samples/sec                   batch loss = 402.9012334346771 | accuracy = 0.65625


Epoch[2] Batch[165] Speed: 1.2359643385009096 samples/sec                   batch loss = 415.74409008026123 | accuracy = 0.656060606060606


Epoch[2] Batch[170] Speed: 1.2393940352058135 samples/sec                   batch loss = 427.69325256347656 | accuracy = 0.6573529411764706


Epoch[2] Batch[175] Speed: 1.2366803348171045 samples/sec                   batch loss = 441.04713451862335 | accuracy = 0.6557142857142857


Epoch[2] Batch[180] Speed: 1.2457092562275194 samples/sec                   batch loss = 455.14703476428986 | accuracy = 0.6472222222222223


Epoch[2] Batch[185] Speed: 1.2419822957803053 samples/sec                   batch loss = 467.9511947631836 | accuracy = 0.6432432432432432


Epoch[2] Batch[190] Speed: 1.2440718629300649 samples/sec                   batch loss = 479.83997642993927 | accuracy = 0.6447368421052632


Epoch[2] Batch[195] Speed: 1.2433365039475814 samples/sec                   batch loss = 492.67970311641693 | accuracy = 0.6410256410256411


Epoch[2] Batch[200] Speed: 1.241466718879079 samples/sec                   batch loss = 503.5106108188629 | accuracy = 0.64375


Epoch[2] Batch[205] Speed: 1.2448104936388755 samples/sec                   batch loss = 517.4352233409882 | accuracy = 0.6414634146341464


Epoch[2] Batch[210] Speed: 1.25242088414237 samples/sec                   batch loss = 530.6963908672333 | accuracy = 0.6392857142857142


Epoch[2] Batch[215] Speed: 1.2516471440476094 samples/sec                   batch loss = 544.4232615232468 | accuracy = 0.6372093023255814


Epoch[2] Batch[220] Speed: 1.250111656923169 samples/sec                   batch loss = 555.2355723381042 | accuracy = 0.6431818181818182


Epoch[2] Batch[225] Speed: 1.252355816153816 samples/sec                   batch loss = 568.0134773254395 | accuracy = 0.6422222222222222


Epoch[2] Batch[230] Speed: 1.2511395611544849 samples/sec                   batch loss = 578.5836153030396 | accuracy = 0.6456521739130435


Epoch[2] Batch[235] Speed: 1.251070614670663 samples/sec                   batch loss = 590.9335162639618 | accuracy = 0.6457446808510638


Epoch[2] Batch[240] Speed: 1.253667642393163 samples/sec                   batch loss = 603.0847916603088 | accuracy = 0.646875


Epoch[2] Batch[245] Speed: 1.2479989835823708 samples/sec                   batch loss = 614.2853918075562 | accuracy = 0.6489795918367347


Epoch[2] Batch[250] Speed: 1.2451924293690628 samples/sec                   batch loss = 628.0762023925781 | accuracy = 0.647


Epoch[2] Batch[255] Speed: 1.2475401794302108 samples/sec                   batch loss = 638.1949564218521 | accuracy = 0.6480392156862745


Epoch[2] Batch[260] Speed: 1.2471125816630055 samples/sec                   batch loss = 648.6995091438293 | accuracy = 0.6519230769230769


Epoch[2] Batch[265] Speed: 1.236170424717844 samples/sec                   batch loss = 661.63898229599 | accuracy = 0.6490566037735849


Epoch[2] Batch[270] Speed: 1.2322321093798487 samples/sec                   batch loss = 672.3909013271332 | accuracy = 0.6509259259259259


Epoch[2] Batch[275] Speed: 1.238333322384788 samples/sec                   batch loss = 684.9472634792328 | accuracy = 0.65


Epoch[2] Batch[280] Speed: 1.232782253866108 samples/sec                   batch loss = 697.9416065216064 | accuracy = 0.6508928571428572


Epoch[2] Batch[285] Speed: 1.231856182256859 samples/sec                   batch loss = 710.3167216777802 | accuracy = 0.6517543859649123


Epoch[2] Batch[290] Speed: 1.2310117939933385 samples/sec                   batch loss = 723.898678779602 | accuracy = 0.6508620689655172


Epoch[2] Batch[295] Speed: 1.2299179461683054 samples/sec                   batch loss = 737.1280014514923 | accuracy = 0.6491525423728813


Epoch[2] Batch[300] Speed: 1.2349689351564637 samples/sec                   batch loss = 749.8018555641174 | accuracy = 0.6491666666666667


Epoch[2] Batch[305] Speed: 1.246515400177409 samples/sec                   batch loss = 762.0367982387543 | accuracy = 0.6475409836065574


Epoch[2] Batch[310] Speed: 1.237221138429377 samples/sec                   batch loss = 773.562283873558 | accuracy = 0.646774193548387


Epoch[2] Batch[315] Speed: 1.2327226522894716 samples/sec                   batch loss = 787.651981472969 | accuracy = 0.6444444444444445


Epoch[2] Batch[320] Speed: 1.2349353008817783 samples/sec                   batch loss = 798.4376890659332 | accuracy = 0.6453125


Epoch[2] Batch[325] Speed: 1.2346825570235769 samples/sec                   batch loss = 811.2080736160278 | accuracy = 0.6438461538461538


Epoch[2] Batch[330] Speed: 1.2332205658290352 samples/sec                   batch loss = 823.0172301530838 | accuracy = 0.6462121212121212


Epoch[2] Batch[335] Speed: 1.2372961404409282 samples/sec                   batch loss = 836.4586709737778 | accuracy = 0.6455223880597015


Epoch[2] Batch[340] Speed: 1.2317264027070174 samples/sec                   batch loss = 848.3659374713898 | accuracy = 0.6455882352941177


Epoch[2] Batch[345] Speed: 1.2338090660008385 samples/sec                   batch loss = 859.592143535614 | accuracy = 0.6471014492753623


Epoch[2] Batch[350] Speed: 1.2351823282191392 samples/sec                   batch loss = 872.7384402751923 | accuracy = 0.6478571428571429


Epoch[2] Batch[355] Speed: 1.2373535386328927 samples/sec                   batch loss = 887.1196026802063 | accuracy = 0.6450704225352113


Epoch[2] Batch[360] Speed: 1.230859977595805 samples/sec                   batch loss = 901.4001603126526 | accuracy = 0.64375


Epoch[2] Batch[365] Speed: 1.2322542831247163 samples/sec                   batch loss = 915.5960133075714 | accuracy = 0.6431506849315068


Epoch[2] Batch[370] Speed: 1.2359303768430425 samples/sec                   batch loss = 926.3733942508698 | accuracy = 0.6452702702702703


Epoch[2] Batch[375] Speed: 1.2332141298043149 samples/sec                   batch loss = 939.9602572917938 | accuracy = 0.6446666666666667


Epoch[2] Batch[380] Speed: 1.2301632407143375 samples/sec                   batch loss = 951.7232066392899 | accuracy = 0.6453947368421052


Epoch[2] Batch[385] Speed: 1.2343807813720307 samples/sec                   batch loss = 964.5162043571472 | accuracy = 0.6448051948051948


Epoch[2] Batch[390] Speed: 1.2386445326438102 samples/sec                   batch loss = 975.2319293022156 | accuracy = 0.6461538461538462


Epoch[2] Batch[395] Speed: 1.2425849878824156 samples/sec                   batch loss = 985.1506103277206 | accuracy = 0.6487341772151899


Epoch[2] Batch[400] Speed: 1.2405814959562385 samples/sec                   batch loss = 997.9656068086624 | accuracy = 0.648125


Epoch[2] Batch[405] Speed: 1.2416203362315203 samples/sec                   batch loss = 1011.1871322393417 | accuracy = 0.6475308641975308


Epoch[2] Batch[410] Speed: 1.2402196188730061 samples/sec                   batch loss = 1022.799081325531 | accuracy = 0.6481707317073171


Epoch[2] Batch[415] Speed: 1.239924844242775 samples/sec                   batch loss = 1034.529993891716 | accuracy = 0.6475903614457831


Epoch[2] Batch[420] Speed: 1.2409219227777357 samples/sec                   batch loss = 1047.6858652830124 | accuracy = 0.6482142857142857


Epoch[2] Batch[425] Speed: 1.2371103856649532 samples/sec                   batch loss = 1057.5315811634064 | accuracy = 0.65


Epoch[2] Batch[430] Speed: 1.2334675423631867 samples/sec                   batch loss = 1068.6298636198044 | accuracy = 0.6511627906976745


Epoch[2] Batch[435] Speed: 1.2352577198748105 samples/sec                   batch loss = 1081.6259496212006 | accuracy = 0.6505747126436782


Epoch[2] Batch[440] Speed: 1.2298952253013395 samples/sec                   batch loss = 1093.82803606987 | accuracy = 0.6505681818181818


Epoch[2] Batch[445] Speed: 1.2315575945612205 samples/sec                   batch loss = 1106.0186141729355 | accuracy = 0.650561797752809


Epoch[2] Batch[450] Speed: 1.233469537436866 samples/sec                   batch loss = 1117.7191783189774 | accuracy = 0.6522222222222223


Epoch[2] Batch[455] Speed: 1.23170822670951 samples/sec                   batch loss = 1128.501198887825 | accuracy = 0.6527472527472528


Epoch[2] Batch[460] Speed: 1.2365098927625047 samples/sec                   batch loss = 1137.5612968206406 | accuracy = 0.6543478260869565


Epoch[2] Batch[465] Speed: 1.2310904714736228 samples/sec                   batch loss = 1147.5202357769012 | accuracy = 0.6548387096774193


Epoch[2] Batch[470] Speed: 1.2317150991629113 samples/sec                   batch loss = 1162.2295429706573 | accuracy = 0.6537234042553192


Epoch[2] Batch[475] Speed: 1.2351478639291995 samples/sec                   batch loss = 1172.5761464834213 | accuracy = 0.6542105263157895


Epoch[2] Batch[480] Speed: 1.2291180823162244 samples/sec                   batch loss = 1185.1516412496567 | accuracy = 0.6536458333333334


Epoch[2] Batch[485] Speed: 1.2321570865856637 samples/sec                   batch loss = 1201.1590458154678 | accuracy = 0.6515463917525773


Epoch[2] Batch[490] Speed: 1.2323133868408664 samples/sec                   batch loss = 1211.4420635700226 | accuracy = 0.6535714285714286


Epoch[2] Batch[495] Speed: 1.226721043263204 samples/sec                   batch loss = 1225.1409628391266 | accuracy = 0.6525252525252525


Epoch[2] Batch[500] Speed: 1.2306957401618597 samples/sec                   batch loss = 1236.6119998693466 | accuracy = 0.654


Epoch[2] Batch[505] Speed: 1.2346110513817354 samples/sec                   batch loss = 1249.3972119092941 | accuracy = 0.653960396039604


Epoch[2] Batch[510] Speed: 1.2337152528907223 samples/sec                   batch loss = 1258.446849822998 | accuracy = 0.6563725490196078


Epoch[2] Batch[515] Speed: 1.2342139687652325 samples/sec                   batch loss = 1270.4796314239502 | accuracy = 0.6572815533980583


Epoch[2] Batch[520] Speed: 1.227253798040823 samples/sec                   batch loss = 1280.7988632917404 | accuracy = 0.6591346153846154


Epoch[2] Batch[525] Speed: 1.2339229492445491 samples/sec                   batch loss = 1292.3629097938538 | accuracy = 0.659047619047619


Epoch[2] Batch[530] Speed: 1.2332153082263524 samples/sec                   batch loss = 1308.8618199825287 | accuracy = 0.6570754716981132


Epoch[2] Batch[535] Speed: 1.2270826235722923 samples/sec                   batch loss = 1320.0655601024628 | accuracy = 0.658411214953271


Epoch[2] Batch[540] Speed: 1.2323539390709197 samples/sec                   batch loss = 1331.3706502914429 | accuracy = 0.6592592592592592


Epoch[2] Batch[545] Speed: 1.235200152194843 samples/sec                   batch loss = 1341.9426901340485 | accuracy = 0.6600917431192661


Epoch[2] Batch[550] Speed: 1.227924054859401 samples/sec                   batch loss = 1357.941674232483 | accuracy = 0.6595454545454545


Epoch[2] Batch[555] Speed: 1.2329761347191894 samples/sec                   batch loss = 1368.184870839119 | accuracy = 0.6608108108108108


Epoch[2] Batch[560] Speed: 1.2340782454527006 samples/sec                   batch loss = 1382.7540603876114 | accuracy = 0.6598214285714286


Epoch[2] Batch[565] Speed: 1.2332354323964578 samples/sec                   batch loss = 1395.498222231865 | accuracy = 0.6601769911504425


Epoch[2] Batch[570] Speed: 1.2298523103984995 samples/sec                   batch loss = 1408.3681547641754 | accuracy = 0.6587719298245615


Epoch[2] Batch[575] Speed: 1.2318666743420241 samples/sec                   batch loss = 1419.9668971300125 | accuracy = 0.6595652173913044


Epoch[2] Batch[580] Speed: 1.2313084904368405 samples/sec                   batch loss = 1431.1098812818527 | accuracy = 0.6603448275862069


Epoch[2] Batch[585] Speed: 1.2352078821401415 samples/sec                   batch loss = 1441.783334851265 | accuracy = 0.6606837606837607


Epoch[2] Batch[590] Speed: 1.2296795090551094 samples/sec                   batch loss = 1453.3079053163528 | accuracy = 0.6610169491525424


Epoch[2] Batch[595] Speed: 1.2296734704497738 samples/sec                   batch loss = 1465.1260513067245 | accuracy = 0.6609243697478991


Epoch[2] Batch[600] Speed: 1.2269468489548196 samples/sec                   batch loss = 1477.4483276605606 | accuracy = 0.66125


Epoch[2] Batch[605] Speed: 1.235707805747201 samples/sec                   batch loss = 1488.1341753005981 | accuracy = 0.6619834710743802


Epoch[2] Batch[610] Speed: 1.2351943320645826 samples/sec                   batch loss = 1498.4338791370392 | accuracy = 0.6622950819672131


Epoch[2] Batch[615] Speed: 1.2346687458857477 samples/sec                   batch loss = 1507.501838684082 | accuracy = 0.6634146341463415


Epoch[2] Batch[620] Speed: 1.2430356411079198 samples/sec                   batch loss = 1518.0043544769287 | accuracy = 0.6637096774193548


Epoch[2] Batch[625] Speed: 1.2413616341693616 samples/sec                   batch loss = 1529.4864090681076 | accuracy = 0.6644


Epoch[2] Batch[630] Speed: 1.235757774854417 samples/sec                   batch loss = 1540.5167902708054 | accuracy = 0.6642857142857143


Epoch[2] Batch[635] Speed: 1.2360826270480896 samples/sec                   batch loss = 1552.2742502093315 | accuracy = 0.6653543307086615


Epoch[2] Batch[640] Speed: 1.2328297218581783 samples/sec                   batch loss = 1562.1216649413109 | accuracy = 0.666015625


Epoch[2] Batch[645] Speed: 1.2339834837956893 samples/sec                   batch loss = 1573.1327821612358 | accuracy = 0.6662790697674419


Epoch[2] Batch[650] Speed: 1.2305081712553119 samples/sec                   batch loss = 1584.186533510685 | accuracy = 0.6657692307692308


Epoch[2] Batch[655] Speed: 1.2329407061410342 samples/sec                   batch loss = 1596.9563210606575 | accuracy = 0.665648854961832


Epoch[2] Batch[660] Speed: 1.2346303125693123 samples/sec                   batch loss = 1607.2258881926537 | accuracy = 0.6659090909090909


Epoch[2] Batch[665] Speed: 1.2318475897340257 samples/sec                   batch loss = 1617.0499352812767 | accuracy = 0.6665413533834587


Epoch[2] Batch[670] Speed: 1.2260129457174174 samples/sec                   batch loss = 1631.8889073729515 | accuracy = 0.6652985074626866


Epoch[2] Batch[675] Speed: 1.228118657998429 samples/sec                   batch loss = 1640.3706349730492 | accuracy = 0.6666666666666666


Epoch[2] Batch[680] Speed: 1.2366536260860752 samples/sec                   batch loss = 1651.4988804459572 | accuracy = 0.6676470588235294


Epoch[2] Batch[685] Speed: 1.2330353983243958 samples/sec                   batch loss = 1662.2067915797234 | accuracy = 0.6682481751824818


Epoch[2] Batch[690] Speed: 1.2304755014799693 samples/sec                   batch loss = 1672.0145558714867 | accuracy = 0.6692028985507247


Epoch[2] Batch[695] Speed: 1.2335326576452381 samples/sec                   batch loss = 1687.8218526244164 | accuracy = 0.6676258992805756


Epoch[2] Batch[700] Speed: 1.2359862825485581 samples/sec                   batch loss = 1697.0568119883537 | accuracy = 0.6692857142857143


Epoch[2] Batch[705] Speed: 1.2321227909083625 samples/sec                   batch loss = 1706.8245041966438 | accuracy = 0.6698581560283688


Epoch[2] Batch[710] Speed: 1.231951431617516 samples/sec                   batch loss = 1717.7082894444466 | accuracy = 0.6697183098591549


Epoch[2] Batch[715] Speed: 1.2287799603268188 samples/sec                   batch loss = 1728.5450939536095 | accuracy = 0.6695804195804196


Epoch[2] Batch[720] Speed: 1.2367337557403772 samples/sec                   batch loss = 1738.0735538601875 | accuracy = 0.6704861111111111


Epoch[2] Batch[725] Speed: 1.231700992630886 samples/sec                   batch loss = 1750.6768464446068 | accuracy = 0.6706896551724137


Epoch[2] Batch[730] Speed: 1.2403615566355606 samples/sec                   batch loss = 1763.2141546607018 | accuracy = 0.6702054794520548


Epoch[2] Batch[735] Speed: 1.235913533281536 samples/sec                   batch loss = 1775.003159224987 | accuracy = 0.6707482993197279


Epoch[2] Batch[740] Speed: 1.2336400493949675 samples/sec                   batch loss = 1784.104247391224 | accuracy = 0.6712837837837838


Epoch[2] Batch[745] Speed: 1.229688251618668 samples/sec                   batch loss = 1794.5303285717964 | accuracy = 0.6714765100671141


Epoch[2] Batch[750] Speed: 1.2424094182855845 samples/sec                   batch loss = 1805.6702784895897 | accuracy = 0.6716666666666666


Epoch[2] Batch[755] Speed: 1.2345304698167918 samples/sec                   batch loss = 1816.5843500494957 | accuracy = 0.6721854304635762


Epoch[2] Batch[760] Speed: 1.2343036805568737 samples/sec                   batch loss = 1825.6738249659538 | accuracy = 0.6726973684210527


Epoch[2] Batch[765] Speed: 1.2389875570385174 samples/sec                   batch loss = 1838.908575475216 | accuracy = 0.6728758169934641


Epoch[2] Batch[770] Speed: 1.2394912780152674 samples/sec                   batch loss = 1850.4660479426384 | accuracy = 0.6724025974025974


Epoch[2] Batch[775] Speed: 1.2381277936296242 samples/sec                   batch loss = 1860.666991531849 | accuracy = 0.6725806451612903


Epoch[2] Batch[780] Speed: 1.2358998766767988 samples/sec                   batch loss = 1873.7406377196312 | accuracy = 0.6721153846153847


Epoch[2] Batch[785] Speed: 1.2327236486210456 samples/sec                   batch loss = 1885.4261954426765 | accuracy = 0.6726114649681528


[Epoch 2] training: accuracy=0.6732233502538071
[Epoch 2] time cost: 653.643975019455
[Epoch 2] validation: validation accuracy=0.7455555555555555
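
The per-epoch summary lines are easy to lose among the batch-level output above. As a minimal sketch, and not part of the original tutorial, the snippet below pulls those summaries out of the log text with a regular expression so the two epochs can be compared side by side. The names `log_text`, `SUMMARY`, and `summarize` are hypothetical helpers introduced only for this illustration; the values are copied from the output above.

```python
import re

# Summary lines copied verbatim from the training output above.
log_text = """\
[Epoch 1] training: accuracy=0.578997461928934
[Epoch 1] time cost: 637.2946498394012
[Epoch 1] validation: validation accuracy=0.6911111111111111
[Epoch 2] training: accuracy=0.6732233502538071
[Epoch 2] time cost: 653.643975019455
[Epoch 2] validation: validation accuracy=0.7455555555555555
"""

# Hypothetical pattern: "[Epoch N] <label>: ...<number>" with the number at line end.
SUMMARY = re.compile(
    r"^\[Epoch (\d+)\] (training|time cost|validation):.*?([0-9.]+)$",
    re.MULTILINE,
)


def summarize(text):
    """Return {epoch: {label: value}} for every per-epoch summary line in text."""
    epochs = {}
    for epoch, label, value in SUMMARY.findall(text):
        epochs.setdefault(int(epoch), {})[label] = float(value)
    return epochs


summary = summarize(log_text)
for epoch, stats in sorted(summary.items()):
    print(
        f"epoch {epoch}: "
        f"train acc {stats['training']:.3f}, "
        f"val acc {stats['validation']:.3f}, "
        f"time {stats['time cost']:.0f}s"
    )
```

Batch-level lines such as `Epoch[2] Batch[5] ...` do not start with `[Epoch`, so the pattern skips them and the same function can be run on the full captured output.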


## Next steps

Now that you have trained a neural network and run predictions with it on GPUs, you have reached the end of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).