<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one Nvidia GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. Additionally, you will need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[22:32:59] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `ctx` shown with the array tells you which GPU the ndarray is allocated on.

If at least two GPUs are available, you can create another ndarray and assign it to a different GPU. In the example code here, you will copy `x` to the second GPU, `npx.gpu(1)`. If you only have one GPU, the code falls back to `npx.cpu()` and copies `x` to main memory instead:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[22:32:59] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
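
The 0-based indexing described above can be sketched in plain Python. The helper `select_gpu_ids` is hypothetical (it is not part of MXNet) and only illustrates how a requested GPU count maps to valid ids; it assumes that an empty result means the caller falls back to the CPU.

```python
def select_gpu_ids(num_available, num_requested):
    """Return the 0-based ids of the GPUs to use.

    Hypothetical helper, not an MXNet API: the request is capped at the
    number of GPUs available, and an empty list means "no GPU, use the CPU".
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, the last valid id is 7.
print(select_gpu_ids(num_available=8, num_requested=8))  # [0, 1, 2, 3, 4, 5, 6, 7]
# Asking for two GPUs when only one exists yields just the first GPU.
print(select_gpu_ids(num_available=1, num_requested=2))  # [0]
# With no GPU at all, the list is empty.
print(select_gpu_ids(num_available=0, num_requested=2))  # []
```

With MXNet, `num_available` would come from `npx.num_gpus()`, and each returned id `i` would be passed to `npx.gpu(i)`.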

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the previously defined `LeafNetwork` from [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[22:33:00] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 8.873018 , -4.7415466]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, if there are *n* GPUs, you split each data batch into *n* parts and have each GPU run the forward and backward passes on its own separate chunk of the data.
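
To make the idea concrete, here is a minimal pure-Python sketch of data parallelism on a toy problem. It is not MXNet code: the one-parameter model `w * x`, the squared-error loss, and the helper names `split_batch` and `grad_w` are all assumptions chosen for illustration. It shows that summing the gradients computed on each chunk gives the same result as computing the gradient on the whole batch.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts equal, contiguous chunks."""
    if num_parts <= 0 or len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by num_parts")
    size = len(batch) // num_parts
    return [batch[i * size:(i + 1) * size] for i in range(num_parts)]


def grad_w(w, xs, ys):
    """Gradient of sum((w * x - y) ** 2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))


xs = [1.0, 2.0, 3.0, 4.0]   # toy inputs (one batch of four samples)
ys = [2.0, 4.0, 6.0, 8.0]   # toy targets
w = 0.5                     # current value of the single parameter
num_devices = 2             # stands in for the number of GPUs

x_parts = split_batch(xs, num_devices)
y_parts = split_batch(ys, num_devices)

# Each "device" computes the gradient on its own chunk only.
partial_grads = [grad_w(w, xp, yp) for xp, yp in zip(x_parts, y_parts)]

# Summing the per-device gradients recovers the full-batch gradient.
total_grad = sum(partial_grads)
full_grad = grad_w(w, xs, ys)
print(partial_grads, total_grad, full_grad)  # [-15.0, -75.0] -90.0 -90.0
```

In the MXNet training loop, the splitting is done by `gluon.utils.split_and_load`, and the per-device backward passes take the place of the `grad_w` calls.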

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations on the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
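
As a rough illustration of what the helper computes, the following pure-Python sketch aggregates accuracy over per-device chunks of labels and scores. It is an assumption-laden stand-in, not the implementation of `gluon.metric.Accuracy`: the function name `accuracy_over_devices` is hypothetical, and it assumes the predicted class is the index of the highest score in each row.

```python
def accuracy_over_devices(label_parts, score_parts):
    """Fraction of correct predictions across all per-device chunks.

    label_parts: one list of integer labels per device.
    score_parts: one list of per-class score rows per device.
    Assumes the predicted class is the index of the largest score.
    """
    correct = 0
    total = 0
    for labels, scores in zip(label_parts, score_parts):
        for label, row in zip(labels, scores):
            predicted = max(range(len(row)), key=row.__getitem__)
            correct += int(predicted == label)
            total += 1
    return correct / total if total else 0.0


# Two "devices", two samples each, two classes.
label_parts = [[0, 1], [1, 0]]
score_parts = [
    [[2.0, -1.0], [0.3, 0.9]],  # device 0: predicts 0 and 1, both correct
    [[1.5, 0.2], [0.8, 0.1]],   # device 1: predicts 0 and 0, one correct
]
print(accuracy_over_devices(label_parts, score_parts))  # 0.75
```

The point of the sketch is that the metric is accumulated across every chunk, so the result does not depend on how the batch was split between devices.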

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7698605472267367 samples/sec                   batch loss = 13.963647365570068 | accuracy = 0.5


Epoch[1] Batch[10] Speed: 1.2549895421378052 samples/sec                   batch loss = 26.97698450088501 | accuracy = 0.6


Epoch[1] Batch[15] Speed: 1.2591182833223573 samples/sec                   batch loss = 41.753692626953125 | accuracy = 0.5666666666666667


Epoch[1] Batch[20] Speed: 1.257999878678095 samples/sec                   batch loss = 55.500237464904785 | accuracy = 0.5625


Epoch[1] Batch[25] Speed: 1.2605104231382165 samples/sec                   batch loss = 69.0060646533966 | accuracy = 0.56


Epoch[1] Batch[30] Speed: 1.2584099624402052 samples/sec                   batch loss = 83.82849955558777 | accuracy = 0.5166666666666667


Epoch[1] Batch[35] Speed: 1.253863650716364 samples/sec                   batch loss = 98.6709258556366 | accuracy = 0.4928571428571429


Epoch[1] Batch[40] Speed: 1.2568747963701006 samples/sec                   batch loss = 112.45660853385925 | accuracy = 0.49375


Epoch[1] Batch[45] Speed: 1.2565528533688282 samples/sec                   batch loss = 127.07320308685303 | accuracy = 0.48333333333333334


Epoch[1] Batch[50] Speed: 1.252950748026686 samples/sec                   batch loss = 140.8690972328186 | accuracy = 0.485


Epoch[1] Batch[55] Speed: 1.2555286289694627 samples/sec                   batch loss = 154.18187618255615 | accuracy = 0.5


Epoch[1] Batch[60] Speed: 1.2581116674650528 samples/sec                   batch loss = 168.85232996940613 | accuracy = 0.48333333333333334


Epoch[1] Batch[65] Speed: 1.2504094317086394 samples/sec                   batch loss = 182.192476272583 | accuracy = 0.5


Epoch[1] Batch[70] Speed: 1.2518156203865216 samples/sec                   batch loss = 197.60520815849304 | accuracy = 0.48928571428571427


Epoch[1] Batch[75] Speed: 1.2503836177036896 samples/sec                   batch loss = 211.5127317905426 | accuracy = 0.49333333333333335


Epoch[1] Batch[80] Speed: 1.2553886472933093 samples/sec                   batch loss = 224.91156840324402 | accuracy = 0.496875


Epoch[1] Batch[85] Speed: 1.2559430261212126 samples/sec                   batch loss = 238.68600630760193 | accuracy = 0.49117647058823527


Epoch[1] Batch[90] Speed: 1.2490208114101127 samples/sec                   batch loss = 252.42686676979065 | accuracy = 0.4861111111111111


Epoch[1] Batch[95] Speed: 1.2476169944708309 samples/sec                   batch loss = 266.6068186759949 | accuracy = 0.48947368421052634


Epoch[1] Batch[100] Speed: 1.2475300680098504 samples/sec                   batch loss = 280.6754674911499 | accuracy = 0.495


Epoch[1] Batch[105] Speed: 1.2515396753558174 samples/sec                   batch loss = 294.45465302467346 | accuracy = 0.5047619047619047


Epoch[1] Batch[110] Speed: 1.2517809688318238 samples/sec                   batch loss = 308.8199985027313 | accuracy = 0.49772727272727274


Epoch[1] Batch[115] Speed: 1.2560582111137373 samples/sec                   batch loss = 322.0061593055725 | accuracy = 0.508695652173913


Epoch[1] Batch[120] Speed: 1.2564436937815713 samples/sec                   batch loss = 335.900630235672 | accuracy = 0.5083333333333333


Epoch[1] Batch[125] Speed: 1.2542468496770451 samples/sec                   batch loss = 349.8234131336212 | accuracy = 0.508


Epoch[1] Batch[130] Speed: 1.2569016324003 samples/sec                   batch loss = 363.7409827709198 | accuracy = 0.5076923076923077


Epoch[1] Batch[135] Speed: 1.2534549321624908 samples/sec                   batch loss = 377.74674820899963 | accuracy = 0.5092592592592593


Epoch[1] Batch[140] Speed: 1.2558575678086612 samples/sec                   batch loss = 390.87261724472046 | accuracy = 0.5125


Epoch[1] Batch[145] Speed: 1.2556553909860118 samples/sec                   batch loss = 404.45459508895874 | accuracy = 0.5172413793103449


Epoch[1] Batch[150] Speed: 1.2620277170124516 samples/sec                   batch loss = 418.29915714263916 | accuracy = 0.5183333333333333


Epoch[1] Batch[155] Speed: 1.257943190016482 samples/sec                   batch loss = 432.1511299610138 | accuracy = 0.5209677419354839


Epoch[1] Batch[160] Speed: 1.254278168357295 samples/sec                   batch loss = 446.27147483825684 | accuracy = 0.51875


Epoch[1] Batch[165] Speed: 1.253935529544574 samples/sec                   batch loss = 459.63064217567444 | accuracy = 0.5227272727272727


Epoch[1] Batch[170] Speed: 1.2486115264330566 samples/sec                   batch loss = 472.252658367157 | accuracy = 0.5294117647058824


Epoch[1] Batch[175] Speed: 1.2545403126462347 samples/sec                   batch loss = 486.3736469745636 | accuracy = 0.5271428571428571


Epoch[1] Batch[180] Speed: 1.2505645245567996 samples/sec                   batch loss = 499.1418447494507 | accuracy = 0.5319444444444444


Epoch[1] Batch[185] Speed: 1.2423546779564452 samples/sec                   batch loss = 513.538987159729 | accuracy = 0.5283783783783784


Epoch[1] Batch[190] Speed: 1.2536076903287396 samples/sec                   batch loss = 527.4068911075592 | accuracy = 0.5289473684210526


Epoch[1] Batch[195] Speed: 1.2555714752381997 samples/sec                   batch loss = 540.8835287094116 | accuracy = 0.5282051282051282


Epoch[1] Batch[200] Speed: 1.2552921814492994 samples/sec                   batch loss = 555.1171679496765 | accuracy = 0.525


Epoch[1] Batch[205] Speed: 1.2504793305739714 samples/sec                   batch loss = 568.1593346595764 | accuracy = 0.5280487804878049


Epoch[1] Batch[210] Speed: 1.254872675856432 samples/sec                   batch loss = 581.211832523346 | accuracy = 0.5333333333333333


Epoch[1] Batch[215] Speed: 1.254710788661246 samples/sec                   batch loss = 595.048680305481 | accuracy = 0.536046511627907


Epoch[1] Batch[220] Speed: 1.2525658155993316 samples/sec                   batch loss = 608.9643356800079 | accuracy = 0.5386363636363637


Epoch[1] Batch[225] Speed: 1.2470470444429 samples/sec                   batch loss = 623.071282863617 | accuracy = 0.5388888888888889


Epoch[1] Batch[230] Speed: 1.2472742757769177 samples/sec                   batch loss = 636.8081429004669 | accuracy = 0.5391304347826087


Epoch[1] Batch[235] Speed: 1.2452514867934679 samples/sec                   batch loss = 650.2427780628204 | accuracy = 0.5393617021276595


Epoch[1] Batch[240] Speed: 1.2470113587543872 samples/sec                   batch loss = 665.1807506084442 | accuracy = 0.5333333333333333


Epoch[1] Batch[245] Speed: 1.2486690499512543 samples/sec                   batch loss = 679.0663433074951 | accuracy = 0.5306122448979592


Epoch[1] Batch[250] Speed: 1.244789251071212 samples/sec                   batch loss = 691.9047141075134 | accuracy = 0.534


Epoch[1] Batch[255] Speed: 1.2432681383683228 samples/sec                   batch loss = 705.3700234889984 | accuracy = 0.5343137254901961


Epoch[1] Batch[260] Speed: 1.248559397418774 samples/sec                   batch loss = 719.9753828048706 | accuracy = 0.5317307692307692


Epoch[1] Batch[265] Speed: 1.2404106189079116 samples/sec                   batch loss = 734.1984016895294 | accuracy = 0.530188679245283


Epoch[1] Batch[270] Speed: 1.2464839123028095 samples/sec                   batch loss = 748.1975634098053 | accuracy = 0.5296296296296297


Epoch[1] Batch[275] Speed: 1.2482217328698522 samples/sec                   batch loss = 762.081759929657 | accuracy = 0.53


Epoch[1] Batch[280] Speed: 1.2554970599057533 samples/sec                   batch loss = 775.9713280200958 | accuracy = 0.5321428571428571


Epoch[1] Batch[285] Speed: 1.2427066642647882 samples/sec                   batch loss = 789.8683657646179 | accuracy = 0.5324561403508772


Epoch[1] Batch[290] Speed: 1.24354431880932 samples/sec                   batch loss = 803.3173582553864 | accuracy = 0.5336206896551724


Epoch[1] Batch[295] Speed: 1.2477445765792352 samples/sec                   batch loss = 817.4034879207611 | accuracy = 0.5322033898305085


Epoch[1] Batch[300] Speed: 1.241890728939094 samples/sec                   batch loss = 830.5680437088013 | accuracy = 0.535


Epoch[1] Batch[305] Speed: 1.2462418792345735 samples/sec                   batch loss = 843.9055116176605 | accuracy = 0.5377049180327869


Epoch[1] Batch[310] Speed: 1.2470822686837866 samples/sec                   batch loss = 857.552473783493 | accuracy = 0.5379032258064517


Epoch[1] Batch[315] Speed: 1.2502631353814024 samples/sec                   batch loss = 871.2468495368958 | accuracy = 0.5404761904761904


Epoch[1] Batch[320] Speed: 1.2497456889282463 samples/sec                   batch loss = 885.0845820903778 | accuracy = 0.54296875


Epoch[1] Batch[325] Speed: 1.2467037123456037 samples/sec                   batch loss = 898.5446367263794 | accuracy = 0.5446153846153846


Epoch[1] Batch[330] Speed: 1.248373032304966 samples/sec                   batch loss = 912.4217381477356 | accuracy = 0.543939393939394


Epoch[1] Batch[335] Speed: 1.2481151301268065 samples/sec                   batch loss = 925.2454915046692 | accuracy = 0.5455223880597015


Epoch[1] Batch[340] Speed: 1.2460418608824093 samples/sec                   batch loss = 938.8994972705841 | accuracy = 0.5455882352941176


Epoch[1] Batch[345] Speed: 1.2455547179538273 samples/sec                   batch loss = 951.9914298057556 | accuracy = 0.5478260869565217


Epoch[1] Batch[350] Speed: 1.2472969941919694 samples/sec                   batch loss = 965.1570253372192 | accuracy = 0.55


Epoch[1] Batch[355] Speed: 1.24584884528965 samples/sec                   batch loss = 978.2806570529938 | accuracy = 0.5514084507042254


Epoch[1] Batch[360] Speed: 1.2453171127245024 samples/sec                   batch loss = 991.5173404216766 | accuracy = 0.5534722222222223


Epoch[1] Batch[365] Speed: 1.2527201343365817 samples/sec                   batch loss = 1004.7722425460815 | accuracy = 0.5547945205479452


Epoch[1] Batch[370] Speed: 1.246416589155119 samples/sec                   batch loss = 1018.3055744171143 | accuracy = 0.5540540540540541


Epoch[1] Batch[375] Speed: 1.2505904392522167 samples/sec                   batch loss = 1032.285276889801 | accuracy = 0.554


Epoch[1] Batch[380] Speed: 1.2444385764316293 samples/sec                   batch loss = 1045.4219744205475 | accuracy = 0.555921052631579


Epoch[1] Batch[385] Speed: 1.2492567624822428 samples/sec                   batch loss = 1058.095012664795 | accuracy = 0.5590909090909091


Epoch[1] Batch[390] Speed: 1.2481614647961174 samples/sec                   batch loss = 1071.8688979148865 | accuracy = 0.5602564102564103


Epoch[1] Batch[395] Speed: 1.2519548997579626 samples/sec                   batch loss = 1085.7414908409119 | accuracy = 0.560126582278481


Epoch[1] Batch[400] Speed: 1.256704296649406 samples/sec                   batch loss = 1099.1059308052063 | accuracy = 0.56


Epoch[1] Batch[405] Speed: 1.2519165971767505 samples/sec                   batch loss = 1112.9417617321014 | accuracy = 0.5598765432098766


Epoch[1] Batch[410] Speed: 1.2486461885969564 samples/sec                   batch loss = 1126.8862693309784 | accuracy = 0.5597560975609757


Epoch[1] Batch[415] Speed: 1.2461381134349145 samples/sec                   batch loss = 1140.1735458374023 | accuracy = 0.5614457831325301


Epoch[1] Batch[420] Speed: 1.2483212018491612 samples/sec                   batch loss = 1153.031236410141 | accuracy = 0.5630952380952381


Epoch[1] Batch[425] Speed: 1.248908958783483 samples/sec                   batch loss = 1166.6840398311615 | accuracy = 0.5629411764705883


Epoch[1] Batch[430] Speed: 1.2481211655101032 samples/sec                   batch loss = 1180.2384586334229 | accuracy = 0.5633720930232559


Epoch[1] Batch[435] Speed: 1.2551436136029872 samples/sec                   batch loss = 1193.51939868927 | accuracy = 0.5643678160919541


Epoch[1] Batch[440] Speed: 1.2478467538663078 samples/sec                   batch loss = 1207.6335623264313 | accuracy = 0.5642045454545455


Epoch[1] Batch[445] Speed: 1.2520751477216212 samples/sec                   batch loss = 1221.2174010276794 | accuracy = 0.5646067415730337


Epoch[1] Batch[450] Speed: 1.2464529816454974 samples/sec                   batch loss = 1234.628142118454 | accuracy = 0.5655555555555556


Epoch[1] Batch[455] Speed: 1.2503562206116599 samples/sec                   batch loss = 1247.6256742477417 | accuracy = 0.5675824175824176


Epoch[1] Batch[460] Speed: 1.24837535455874 samples/sec                   batch loss = 1260.6285061836243 | accuracy = 0.5690217391304347


Epoch[1] Batch[465] Speed: 1.2499346059500949 samples/sec                   batch loss = 1274.0647988319397 | accuracy = 0.5704301075268817


Epoch[1] Batch[470] Speed: 1.2507512648811354 samples/sec                   batch loss = 1287.8245887756348 | accuracy = 0.5718085106382979


Epoch[1] Batch[475] Speed: 1.2506734107885575 samples/sec                   batch loss = 1301.7426960468292 | accuracy = 0.5710526315789474


Epoch[1] Batch[480] Speed: 1.2474026225081265 samples/sec                   batch loss = 1314.7582881450653 | accuracy = 0.5708333333333333


Epoch[1] Batch[485] Speed: 1.248577981238305 samples/sec                   batch loss = 1326.6040744781494 | accuracy = 0.572680412371134


Epoch[1] Batch[490] Speed: 1.2489689271519608 samples/sec                   batch loss = 1340.6844413280487 | accuracy = 0.5719387755102041


Epoch[1] Batch[495] Speed: 1.2501371802418724 samples/sec                   batch loss = 1354.701828956604 | accuracy = 0.5707070707070707


Epoch[1] Batch[500] Speed: 1.250697651742329 samples/sec                   batch loss = 1368.2087771892548 | accuracy = 0.571


Epoch[1] Batch[505] Speed: 1.251594854629021 samples/sec                   batch loss = 1381.075689315796 | accuracy = 0.5722772277227722


Epoch[1] Batch[510] Speed: 1.2494881499003037 samples/sec                   batch loss = 1394.265076637268 | accuracy = 0.5740196078431372


Epoch[1] Batch[515] Speed: 1.251340379593472 samples/sec                   batch loss = 1407.824105978012 | accuracy = 0.5742718446601942


Epoch[1] Batch[520] Speed: 1.2441217727183336 samples/sec                   batch loss = 1422.420815229416 | accuracy = 0.573076923076923


Epoch[1] Batch[525] Speed: 1.255724467794716 samples/sec                   batch loss = 1435.5455718040466 | accuracy = 0.5738095238095238


Epoch[1] Batch[530] Speed: 1.2559922945037552 samples/sec                   batch loss = 1448.5916244983673 | accuracy = 0.5740566037735849


Epoch[1] Batch[535] Speed: 1.2549601592493045 samples/sec                   batch loss = 1461.5192658901215 | accuracy = 0.5747663551401869


Epoch[1] Batch[540] Speed: 1.246000958050034 samples/sec                   batch loss = 1473.8654811382294 | accuracy = 0.5773148148148148


Epoch[1] Batch[545] Speed: 1.249857319043261 samples/sec                   batch loss = 1487.9425642490387 | accuracy = 0.5775229357798165


Epoch[1] Batch[550] Speed: 1.2506615703565789 samples/sec                   batch loss = 1501.692836523056 | accuracy = 0.5763636363636364


Epoch[1] Batch[555] Speed: 1.248156729030798 samples/sec                   batch loss = 1513.3941448926926 | accuracy = 0.5788288288288288


Epoch[1] Batch[560] Speed: 1.2554504608559598 samples/sec                   batch loss = 1525.8403273820877 | accuracy = 0.5794642857142858


Epoch[1] Batch[565] Speed: 1.2488993829725923 samples/sec                   batch loss = 1540.0610188245773 | accuracy = 0.5787610619469027


Epoch[1] Batch[570] Speed: 1.2528013306091708 samples/sec                   batch loss = 1553.338070511818 | accuracy = 0.5789473684210527


Epoch[1] Batch[575] Speed: 1.2454384002532268 samples/sec                   batch loss = 1566.821790933609 | accuracy = 0.578695652173913


Epoch[1] Batch[580] Speed: 1.2496695423767117 samples/sec                   batch loss = 1579.8506734371185 | accuracy = 0.5801724137931035


Epoch[1] Batch[585] Speed: 1.250777000820446 samples/sec                   batch loss = 1592.8937001228333 | accuracy = 0.5816239316239317


Epoch[1] Batch[590] Speed: 1.252274397203755 samples/sec                   batch loss = 1605.9097211360931 | accuracy = 0.5830508474576271


Epoch[1] Batch[595] Speed: 1.252210279053375 samples/sec                   batch loss = 1620.1446104049683 | accuracy = 0.5823529411764706


Epoch[1] Batch[600] Speed: 1.2544977243313944 samples/sec                   batch loss = 1632.6098957061768 | accuracy = 0.5825


Epoch[1] Batch[605] Speed: 1.2500993613897808 samples/sec                   batch loss = 1646.1647508144379 | accuracy = 0.5822314049586776


Epoch[1] Batch[610] Speed: 1.2448021812434462 samples/sec                   batch loss = 1658.5070028305054 | accuracy = 0.5831967213114754


Epoch[1] Batch[615] Speed: 1.2540779995527027 samples/sec                   batch loss = 1671.17072224617 | accuracy = 0.5841463414634146


Epoch[1] Batch[620] Speed: 1.2561023161039573 samples/sec                   batch loss = 1685.4459881782532 | accuracy = 0.5834677419354839


Epoch[1] Batch[625] Speed: 1.2513896608812738 samples/sec                   batch loss = 1697.911277770996 | accuracy = 0.5832


Epoch[1] Batch[630] Speed: 1.2430456798032017 samples/sec                   batch loss = 1710.121768951416 | accuracy = 0.5845238095238096


Epoch[1] Batch[635] Speed: 1.2466431276050658 samples/sec                   batch loss = 1723.8385274410248 | accuracy = 0.5846456692913385


Epoch[1] Batch[640] Speed: 1.2456497854350022 samples/sec                   batch loss = 1737.5871698856354 | accuracy = 0.584765625


Epoch[1] Batch[645] Speed: 1.2468170234668647 samples/sec                   batch loss = 1751.7308330535889 | accuracy = 0.5852713178294574


Epoch[1] Batch[650] Speed: 1.2426710424636642 samples/sec                   batch loss = 1765.6663959026337 | accuracy = 0.5838461538461538


Epoch[1] Batch[655] Speed: 1.2514873015619632 samples/sec                   batch loss = 1778.8157563209534 | accuracy = 0.583587786259542


Epoch[1] Batch[660] Speed: 1.2512118744944074 samples/sec                   batch loss = 1791.4616088867188 | accuracy = 0.5840909090909091


Epoch[1] Batch[665] Speed: 1.2493187178999146 samples/sec                   batch loss = 1804.2310655117035 | accuracy = 0.5845864661654135


Epoch[1] Batch[670] Speed: 1.2419580237361425 samples/sec                   batch loss = 1816.7993786334991 | accuracy = 0.5854477611940299


Epoch[1] Batch[675] Speed: 1.244056826200299 samples/sec                   batch loss = 1830.1168792247772 | accuracy = 0.5851851851851851


Epoch[1] Batch[680] Speed: 1.2463527916732196 samples/sec                   batch loss = 1842.5143424272537 | accuracy = 0.5856617647058824


Epoch[1] Batch[685] Speed: 1.2483430295631086 samples/sec                   batch loss = 1855.6563857793808 | accuracy = 0.5857664233576643


Epoch[1] Batch[690] Speed: 1.2432821425646912 samples/sec                   batch loss = 1867.466847062111 | accuracy = 0.586231884057971


Epoch[1] Batch[695] Speed: 1.2468215637528808 samples/sec                   batch loss = 1882.4265941381454 | accuracy = 0.5856115107913669


Epoch[1] Batch[700] Speed: 1.2445364278325413 samples/sec                   batch loss = 1894.9433199167252 | accuracy = 0.5860714285714286


Epoch[1] Batch[705] Speed: 1.2532930360399468 samples/sec                   batch loss = 1907.6669083833694 | accuracy = 0.5865248226950355


Epoch[1] Batch[710] Speed: 1.2477786338156374 samples/sec                   batch loss = 1919.079707980156 | accuracy = 0.5876760563380282


Epoch[1] Batch[715] Speed: 1.2491427284851206 samples/sec                   batch loss = 1932.6006063222885 | accuracy = 0.5874125874125874


Epoch[1] Batch[720] Speed: 1.2526408191091827 samples/sec                   batch loss = 1945.8783363103867 | accuracy = 0.5878472222222222


Epoch[1] Batch[725] Speed: 1.2549720812139076 samples/sec                   batch loss = 1957.4512957334518 | accuracy = 0.5886206896551724


Epoch[1] Batch[730] Speed: 1.2413081800689774 samples/sec                   batch loss = 1970.309106707573 | accuracy = 0.588013698630137


Epoch[1] Batch[735] Speed: 1.2412883426050811 samples/sec                   batch loss = 1982.0315767526627 | accuracy = 0.5891156462585034


Epoch[1] Batch[740] Speed: 1.24623697287982 samples/sec                   batch loss = 1996.4415093660355 | accuracy = 0.5891891891891892


Epoch[1] Batch[745] Speed: 1.2479251846610941 samples/sec                   batch loss = 2009.8833760023117 | accuracy = 0.5899328859060403


Epoch[1] Batch[750] Speed: 1.242467567918637 samples/sec                   batch loss = 2023.353263258934 | accuracy = 0.59


Epoch[1] Batch[755] Speed: 1.2385805224591246 samples/sec                   batch loss = 2034.8874751329422 | accuracy = 0.5910596026490066


Epoch[1] Batch[760] Speed: 1.2455960538789643 samples/sec                   batch loss = 2047.086707830429 | accuracy = 0.5917763157894737


Epoch[1] Batch[765] Speed: 1.2428072815701912 samples/sec                   batch loss = 2059.0762560367584 | accuracy = 0.592483660130719


Epoch[1] Batch[770] Speed: 1.2463049247019256 samples/sec                   batch loss = 2072.4311673641205 | accuracy = 0.5931818181818181


Epoch[1] Batch[775] Speed: 1.2544246555236631 samples/sec                   batch loss = 2086.2095839977264 | accuracy = 0.5929032258064516


Epoch[1] Batch[780] Speed: 1.2500172110781527 samples/sec                   batch loss = 2098.11364197731 | accuracy = 0.5935897435897436


Epoch[1] Batch[785] Speed: 1.2511291114017804 samples/sec                   batch loss = 2111.2869906425476 | accuracy = 0.5936305732484076


[Epoch 1] training: accuracy=0.5939086294416244
[Epoch 1] time cost: 649.0454640388489
[Epoch 1] validation: validation accuracy=0.72


Epoch[2] Batch[5] Speed: 1.249680619354508 samples/sec                   batch loss = 12.10738742351532 | accuracy = 0.65


Epoch[2] Batch[10] Speed: 1.2513539128938513 samples/sec                   batch loss = 25.51451599597931 | accuracy = 0.575


Epoch[2] Batch[15] Speed: 1.2470078366413264 samples/sec                   batch loss = 38.74583554267883 | accuracy = 0.5833333333333334


Epoch[2] Batch[20] Speed: 1.2486721167819759 samples/sec                   batch loss = 50.31645131111145 | accuracy = 0.6


Epoch[2] Batch[25] Speed: 1.2471263944878273 samples/sec                   batch loss = 63.01626706123352 | accuracy = 0.6


Epoch[2] Batch[30] Speed: 1.2420442673476786 samples/sec                   batch loss = 76.59259843826294 | accuracy = 0.5833333333333334


Epoch[2] Batch[35] Speed: 1.2433954776027538 samples/sec                   batch loss = 90.26212644577026 | accuracy = 0.5857142857142857


Epoch[2] Batch[40] Speed: 1.2405223302975175 samples/sec                   batch loss = 103.3860239982605 | accuracy = 0.59375


Epoch[2] Batch[45] Speed: 1.2445006087418942 samples/sec                   batch loss = 115.74459052085876 | accuracy = 0.6


Epoch[2] Batch[50] Speed: 1.2451097214231814 samples/sec                   batch loss = 128.6469020843506 | accuracy = 0.61


Epoch[2] Batch[55] Speed: 1.2440404983839248 samples/sec                   batch loss = 140.35508024692535 | accuracy = 0.6227272727272727


Epoch[2] Batch[60] Speed: 1.2426056030814623 samples/sec                   batch loss = 154.49914383888245 | accuracy = 0.6125


Epoch[2] Batch[65] Speed: 1.2441888481860714 samples/sec                   batch loss = 167.58592820167542 | accuracy = 0.6153846153846154


Epoch[2] Batch[70] Speed: 1.243609304008044 samples/sec                   batch loss = 179.49205482006073 | accuracy = 0.625


Epoch[2] Batch[75] Speed: 1.249629611036284 samples/sec                   batch loss = 194.39044320583344 | accuracy = 0.6233333333333333


Epoch[2] Batch[80] Speed: 1.2449178258703164 samples/sec                   batch loss = 209.02076590061188 | accuracy = 0.615625


Epoch[2] Batch[85] Speed: 1.2420062009935033 samples/sec                   batch loss = 220.68177449703217 | accuracy = 0.6235294117647059


Epoch[2] Batch[90] Speed: 1.2368893955554094 samples/sec                   batch loss = 233.6940575838089 | accuracy = 0.6277777777777778


Epoch[2] Batch[95] Speed: 1.242809306971428 samples/sec                   batch loss = 246.84345877170563 | accuracy = 0.6210526315789474


Epoch[2] Batch[100] Speed: 1.2434783266081961 samples/sec                   batch loss = 257.16595554351807 | accuracy = 0.63


Epoch[2] Batch[105] Speed: 1.2431774872547605 samples/sec                   batch loss = 269.23400819301605 | accuracy = 0.6357142857142857


Epoch[2] Batch[110] Speed: 1.2438798260147625 samples/sec                   batch loss = 279.9032007455826 | accuracy = 0.6477272727272727


Epoch[2] Batch[115] Speed: 1.2471591200523686 samples/sec                   batch loss = 290.339387178421 | accuracy = 0.65


Epoch[2] Batch[120] Speed: 1.2445805583100034 samples/sec                   batch loss = 303.6089518070221 | accuracy = 0.6479166666666667


Epoch[2] Batch[125] Speed: 1.2489565611100912 samples/sec                   batch loss = 317.5035767555237 | accuracy = 0.648


Epoch[2] Batch[130] Speed: 1.2415033741121648 samples/sec                   batch loss = 329.7710528373718 | accuracy = 0.6480769230769231


Epoch[2] Batch[135] Speed: 1.2494850790598022 samples/sec                   batch loss = 340.59211230278015 | accuracy = 0.6518518518518519


Epoch[2] Batch[140] Speed: 1.251889786764359 samples/sec                   batch loss = 350.8910595178604 | accuracy = 0.6553571428571429


Epoch[2] Batch[145] Speed: 1.2566193935588093 samples/sec                   batch loss = 363.4760261774063 | accuracy = 0.6551724137931034


Epoch[2] Batch[150] Speed: 1.2475185653089162 samples/sec                   batch loss = 375.7197917699814 | accuracy = 0.655


Epoch[2] Batch[155] Speed: 1.2478564991683048 samples/sec                   batch loss = 386.68258917331696 | accuracy = 0.6596774193548387


Epoch[2] Batch[160] Speed: 1.2483661584844314 samples/sec                   batch loss = 400.4588986635208 | accuracy = 0.65625


Epoch[2] Batch[165] Speed: 1.2519278074457292 samples/sec                   batch loss = 415.8539079427719 | accuracy = 0.6530303030303031


Epoch[2] Batch[170] Speed: 1.2455857889972024 samples/sec                   batch loss = 426.30367255210876 | accuracy = 0.6573529411764706


Epoch[2] Batch[175] Speed: 1.250996731881156 samples/sec                   batch loss = 438.5087124109268 | accuracy = 0.6628571428571428


Epoch[2] Batch[180] Speed: 1.2544061786241831 samples/sec                   batch loss = 448.91229355335236 | accuracy = 0.6652777777777777


Epoch[2] Batch[185] Speed: 1.254349063123143 samples/sec                   batch loss = 462.61088836193085 | accuracy = 0.6648648648648648


Epoch[2] Batch[190] Speed: 1.2581478969545368 samples/sec                   batch loss = 474.1703646183014 | accuracy = 0.6657894736842105


Epoch[2] Batch[195] Speed: 1.2504799830001905 samples/sec                   batch loss = 486.80103945732117 | accuracy = 0.6641025641025641


Epoch[2] Batch[200] Speed: 1.2517723762903024 samples/sec                   batch loss = 496.1787874698639 | accuracy = 0.6675


Epoch[2] Batch[205] Speed: 1.255321110253141 samples/sec                   batch loss = 508.25904297828674 | accuracy = 0.6695121951219513


Epoch[2] Batch[210] Speed: 1.2559878752353029 samples/sec                   batch loss = 518.9501875638962 | accuracy = 0.6702380952380952


Epoch[2] Batch[215] Speed: 1.2479085695040124 samples/sec                   batch loss = 531.3686133623123 | accuracy = 0.6732558139534883


Epoch[2] Batch[220] Speed: 1.249894564552159 samples/sec                   batch loss = 545.3082226514816 | accuracy = 0.6693181818181818


Epoch[2] Batch[225] Speed: 1.2520368377823838 samples/sec                   batch loss = 556.8260815143585 | accuracy = 0.6711111111111111


Epoch[2] Batch[230] Speed: 1.2544859051795236 samples/sec                   batch loss = 568.1197636127472 | accuracy = 0.6717391304347826


Epoch[2] Batch[235] Speed: 1.2465333675010488 samples/sec                   batch loss = 581.2196574211121 | accuracy = 0.6702127659574468


Epoch[2] Batch[240] Speed: 1.2504417706122912 samples/sec                   batch loss = 593.1079519987106 | accuracy = 0.6697916666666667


Epoch[2] Batch[245] Speed: 1.2548293141466664 samples/sec                   batch loss = 606.3312023878098 | accuracy = 0.6673469387755102


Epoch[2] Batch[250] Speed: 1.2491935110381358 samples/sec                   batch loss = 620.1852852106094 | accuracy = 0.665


Epoch[2] Batch[255] Speed: 1.251097669895079 samples/sec                   batch loss = 632.1625558137894 | accuracy = 0.6627450980392157


Epoch[2] Batch[260] Speed: 1.252363107919585 samples/sec                   batch loss = 644.1125413179398 | accuracy = 0.6615384615384615


Epoch[2] Batch[265] Speed: 1.2602194617984979 samples/sec                   batch loss = 656.0101634263992 | accuracy = 0.6622641509433962


Epoch[2] Batch[270] Speed: 1.256932424625669 samples/sec                   batch loss = 666.3937504291534 | accuracy = 0.6657407407407407


Epoch[2] Batch[275] Speed: 1.2582916089447729 samples/sec                   batch loss = 680.52587723732 | accuracy = 0.6645454545454546


Epoch[2] Batch[280] Speed: 1.2484378728773686 samples/sec                   batch loss = 692.8741281032562 | accuracy = 0.6660714285714285


Epoch[2] Batch[285] Speed: 1.2496197449444009 samples/sec                   batch loss = 705.0244621038437 | accuracy = 0.6657894736842105


Epoch[2] Batch[290] Speed: 1.249612298940476 samples/sec                   batch loss = 718.7156507968903 | accuracy = 0.6637931034482759


Epoch[2] Batch[295] Speed: 1.2531634743665003 samples/sec                   batch loss = 732.6768362522125 | accuracy = 0.6627118644067796


Epoch[2] Batch[300] Speed: 1.2535579532024008 samples/sec                   batch loss = 745.5177464485168 | accuracy = 0.6608333333333334


Epoch[2] Batch[305] Speed: 1.2490921360548108 samples/sec                   batch loss = 757.3030602931976 | accuracy = 0.6598360655737705


Epoch[2] Batch[310] Speed: 1.2515449036400412 samples/sec                   batch loss = 768.4340101480484 | accuracy = 0.6596774193548387


Epoch[2] Batch[315] Speed: 1.2551253033130558 samples/sec                   batch loss = 781.2956117391586 | accuracy = 0.6595238095238095


Epoch[2] Batch[320] Speed: 1.2520046031863656 samples/sec                   batch loss = 794.7452116012573 | accuracy = 0.659375


Epoch[2] Batch[325] Speed: 1.2496593033281909 samples/sec                   batch loss = 806.3865876197815 | accuracy = 0.6607692307692308


Epoch[2] Batch[330] Speed: 1.2527872046948934 samples/sec                   batch loss = 818.0477170944214 | accuracy = 0.6598484848484848


Epoch[2] Batch[335] Speed: 1.2516119415834415 samples/sec                   batch loss = 830.9374482631683 | accuracy = 0.6597014925373135


Epoch[2] Batch[340] Speed: 1.252258974599929 samples/sec                   batch loss = 844.3274540901184 | accuracy = 0.6602941176470588


Epoch[2] Batch[345] Speed: 1.2375598145594673 samples/sec                   batch loss = 855.203781247139 | accuracy = 0.6615942028985508


Epoch[2] Batch[350] Speed: 1.2475869352849287 samples/sec                   batch loss = 867.8083463907242 | accuracy = 0.6614285714285715


Epoch[2] Batch[355] Speed: 1.2457461624205761 samples/sec                   batch loss = 877.8038403987885 | accuracy = 0.6626760563380282


Epoch[2] Batch[360] Speed: 1.2467195542810188 samples/sec                   batch loss = 887.8509857654572 | accuracy = 0.6652777777777777


Epoch[2] Batch[365] Speed: 1.2539662703336194 samples/sec                   batch loss = 899.1517472267151 | accuracy = 0.665068493150685


Epoch[2] Batch[370] Speed: 1.2509827399114577 samples/sec                   batch loss = 909.5155826807022 | accuracy = 0.6662162162162162


Epoch[2] Batch[375] Speed: 1.2479327962216025 samples/sec                   batch loss = 921.6669836044312 | accuracy = 0.6666666666666666


Epoch[2] Batch[380] Speed: 1.2532997769757221 samples/sec                   batch loss = 936.0715115070343 | accuracy = 0.6631578947368421


Epoch[2] Batch[385] Speed: 1.2503758830288683 samples/sec                   batch loss = 946.7920684814453 | accuracy = 0.6649350649350649


Epoch[2] Batch[390] Speed: 1.2501219033857547 samples/sec                   batch loss = 958.4052456617355 | accuracy = 0.666025641025641


Epoch[2] Batch[395] Speed: 1.2444715303264542 samples/sec                   batch loss = 969.727087855339 | accuracy = 0.6658227848101266


Epoch[2] Batch[400] Speed: 1.2526940377698406 samples/sec                   batch loss = 982.2518662214279 | accuracy = 0.666875


Epoch[2] Batch[405] Speed: 1.2532710348791365 samples/sec                   batch loss = 995.4781392812729 | accuracy = 0.6666666666666666


Epoch[2] Batch[410] Speed: 1.2484536660138854 samples/sec                   batch loss = 1006.0724660158157 | accuracy = 0.6670731707317074


Epoch[2] Batch[415] Speed: 1.2483306759120967 samples/sec                   batch loss = 1016.6509130001068 | accuracy = 0.6680722891566265


Epoch[2] Batch[420] Speed: 1.2551806114088067 samples/sec                   batch loss = 1028.7896103858948 | accuracy = 0.6678571428571428


Epoch[2] Batch[425] Speed: 1.2514672307707342 samples/sec                   batch loss = 1042.5448514223099 | accuracy = 0.668235294117647


Epoch[2] Batch[430] Speed: 1.2581119504998526 samples/sec                   batch loss = 1056.1110752820969 | accuracy = 0.6691860465116279


Epoch[2] Batch[435] Speed: 1.2549534943126885 samples/sec                   batch loss = 1070.0830187797546 | accuracy = 0.667816091954023


Epoch[2] Batch[440] Speed: 1.2492313681182121 samples/sec                   batch loss = 1082.3586382865906 | accuracy = 0.6676136363636364


Epoch[2] Batch[445] Speed: 1.2491035747935528 samples/sec                   batch loss = 1093.5360367298126 | accuracy = 0.6679775280898876


Epoch[2] Batch[450] Speed: 1.2533119482934238 samples/sec                   batch loss = 1106.1705503463745 | accuracy = 0.6666666666666666


Epoch[2] Batch[455] Speed: 1.2504712219049148 samples/sec                   batch loss = 1118.1527645587921 | accuracy = 0.6664835164835164


Epoch[2] Batch[460] Speed: 1.2502239114223628 samples/sec                   batch loss = 1130.6673327684402 | accuracy = 0.6657608695652174


Epoch[2] Batch[465] Speed: 1.2502864286833193 samples/sec                   batch loss = 1141.9369988441467 | accuracy = 0.6661290322580645


Epoch[2] Batch[470] Speed: 1.2501697845372002 samples/sec                   batch loss = 1154.5317194461823 | accuracy = 0.6654255319148936


Epoch[2] Batch[475] Speed: 1.2489919862900756 samples/sec                   batch loss = 1165.8462995290756 | accuracy = 0.6673684210526316


Epoch[2] Batch[480] Speed: 1.2520257189950759 samples/sec                   batch loss = 1176.9876450300217 | accuracy = 0.6682291666666667


Epoch[2] Batch[485] Speed: 1.24917453685386 samples/sec                   batch loss = 1190.6004000902176 | accuracy = 0.6670103092783505


Epoch[2] Batch[490] Speed: 1.2485431370304776 samples/sec                   batch loss = 1203.724514722824 | accuracy = 0.6663265306122449


Epoch[2] Batch[495] Speed: 1.2496862975453071 samples/sec                   batch loss = 1217.1834651231766 | accuracy = 0.6656565656565656


Epoch[2] Batch[500] Speed: 1.2478879636109637 samples/sec                   batch loss = 1227.1066138744354 | accuracy = 0.668


Epoch[2] Batch[505] Speed: 1.2503220225250322 samples/sec                   batch loss = 1238.5703999996185 | accuracy = 0.6688118811881189


Epoch[2] Batch[510] Speed: 1.2490676783163168 samples/sec                   batch loss = 1250.5394541025162 | accuracy = 0.6681372549019607


Epoch[2] Batch[515] Speed: 1.2510349781953458 samples/sec                   batch loss = 1264.3482493162155 | accuracy = 0.666504854368932


Epoch[2] Batch[520] Speed: 1.2496781991577202 samples/sec                   batch loss = 1278.066722393036 | accuracy = 0.6653846153846154


Epoch[2] Batch[525] Speed: 1.2504389746756015 samples/sec                   batch loss = 1290.435180068016 | accuracy = 0.6642857142857143


Epoch[2] Batch[530] Speed: 1.245322936204613 samples/sec                   batch loss = 1300.9422680139542 | accuracy = 0.664622641509434


Epoch[2] Batch[535] Speed: 1.2467745872903755 samples/sec                   batch loss = 1314.1541857719421 | accuracy = 0.664018691588785


Epoch[2] Batch[540] Speed: 1.2435657954356527 samples/sec                   batch loss = 1326.2523357868195 | accuracy = 0.6652777777777777


Epoch[2] Batch[545] Speed: 1.243929628165698 samples/sec                   batch loss = 1336.2136497497559 | accuracy = 0.6660550458715596


Epoch[2] Batch[550] Speed: 1.2431577742301312 samples/sec                   batch loss = 1349.272320151329 | accuracy = 0.6654545454545454


Epoch[2] Batch[555] Speed: 1.2482373347987303 samples/sec                   batch loss = 1361.4336527585983 | accuracy = 0.6644144144144144


Epoch[2] Batch[560] Speed: 1.2433700445278897 samples/sec                   batch loss = 1372.7516548633575 | accuracy = 0.6647321428571429


Epoch[2] Batch[565] Speed: 1.2449305739570289 samples/sec                   batch loss = 1384.054517030716 | accuracy = 0.6650442477876106


Epoch[2] Batch[570] Speed: 1.2489565611100912 samples/sec                   batch loss = 1395.9531157016754 | accuracy = 0.6649122807017543


Epoch[2] Batch[575] Speed: 1.247337611216779 samples/sec                   batch loss = 1406.7796490192413 | accuracy = 0.6652173913043479


Epoch[2] Batch[580] Speed: 1.2465460560777273 samples/sec                   batch loss = 1419.6410834789276 | accuracy = 0.6642241379310345


Epoch[2] Batch[585] Speed: 1.2566747392828606 samples/sec                   batch loss = 1431.6608654260635 | accuracy = 0.6645299145299145


Epoch[2] Batch[590] Speed: 1.2516596568063088 samples/sec                   batch loss = 1442.623359322548 | accuracy = 0.6648305084745763


Epoch[2] Batch[595] Speed: 1.2434827504473354 samples/sec                   batch loss = 1456.8995794057846 | accuracy = 0.6638655462184874


Epoch[2] Batch[600] Speed: 1.2497789245007522 samples/sec                   batch loss = 1471.10866689682 | accuracy = 0.6633333333333333


Epoch[2] Batch[605] Speed: 1.2493922165289402 samples/sec                   batch loss = 1480.403029203415 | accuracy = 0.6640495867768595


Epoch[2] Batch[610] Speed: 1.2534657017513253 samples/sec                   batch loss = 1492.138962507248 | accuracy = 0.6647540983606557


Epoch[2] Batch[615] Speed: 1.2562992746850232 samples/sec                   batch loss = 1501.8965587615967 | accuracy = 0.666260162601626


Epoch[2] Batch[620] Speed: 1.2509897358571826 samples/sec                   batch loss = 1513.4199602603912 | accuracy = 0.6669354838709678


Epoch[2] Batch[625] Speed: 1.2518949245593223 samples/sec                   batch loss = 1521.896831870079 | accuracy = 0.6688


Epoch[2] Batch[630] Speed: 1.252637826278386 samples/sec                   batch loss = 1535.4028841257095 | accuracy = 0.6686507936507936


Epoch[2] Batch[635] Speed: 1.2547280546386386 samples/sec                   batch loss = 1546.4248088598251 | accuracy = 0.668503937007874


Epoch[2] Batch[640] Speed: 1.2452362366872882 samples/sec                   batch loss = 1558.7125207185745 | accuracy = 0.66796875


Epoch[2] Batch[645] Speed: 1.2497156201714927 samples/sec                   batch loss = 1569.4529166221619 | accuracy = 0.6689922480620155


Epoch[2] Batch[650] Speed: 1.2483356916506934 samples/sec                   batch loss = 1582.3893103599548 | accuracy = 0.6684615384615384


Epoch[2] Batch[655] Speed: 1.2509044839319334 samples/sec                   batch loss = 1597.8229944705963 | accuracy = 0.667175572519084


Epoch[2] Batch[660] Speed: 1.249257785720014 samples/sec                   batch loss = 1609.0182616710663 | accuracy = 0.6674242424242425


Epoch[2] Batch[665] Speed: 1.2465352198319068 samples/sec                   batch loss = 1617.691868185997 | accuracy = 0.6684210526315789


Epoch[2] Batch[670] Speed: 1.2470659540511506 samples/sec                   batch loss = 1629.7967817783356 | accuracy = 0.6682835820895522


Epoch[2] Batch[675] Speed: 1.2544231548414675 samples/sec                   batch loss = 1643.205821275711 | accuracy = 0.6681481481481482


Epoch[2] Batch[680] Speed: 1.2460265914114024 samples/sec                   batch loss = 1654.2235667705536 | accuracy = 0.669485294117647


Epoch[2] Batch[685] Speed: 1.248090246393663 samples/sec                   batch loss = 1666.1170649528503 | accuracy = 0.668978102189781


Epoch[2] Batch[690] Speed: 1.2515726329490662 samples/sec                   batch loss = 1674.356940805912 | accuracy = 0.6702898550724637


Epoch[2] Batch[695] Speed: 1.2497044494849774 samples/sec                   batch loss = 1684.3729051947594 | accuracy = 0.670863309352518


Epoch[2] Batch[700] Speed: 1.2536647383328932 samples/sec                   batch loss = 1697.5461272597313 | accuracy = 0.6696428571428571


Epoch[2] Batch[705] Speed: 1.2553556762865714 samples/sec                   batch loss = 1706.2870036959648 | accuracy = 0.6709219858156028


Epoch[2] Batch[710] Speed: 1.2548827189309077 samples/sec                   batch loss = 1716.6445694565773 | accuracy = 0.6721830985915493


Epoch[2] Batch[715] Speed: 1.2529642226075839 samples/sec                   batch loss = 1729.8035302758217 | accuracy = 0.6716783216783216


Epoch[2] Batch[720] Speed: 1.248993195058705 samples/sec                   batch loss = 1741.3595123887062 | accuracy = 0.6715277777777777


Epoch[2] Batch[725] Speed: 1.2454697428687809 samples/sec                   batch loss = 1752.3948952555656 | accuracy = 0.6717241379310345


Epoch[2] Batch[730] Speed: 1.2498481942321182 samples/sec                   batch loss = 1762.449046075344 | accuracy = 0.6726027397260274


Epoch[2] Batch[735] Speed: 1.2502471100936574 samples/sec                   batch loss = 1776.6949114203453 | accuracy = 0.6727891156462585


Epoch[2] Batch[740] Speed: 1.2549005528238115 samples/sec                   batch loss = 1784.8078858852386 | accuracy = 0.6736486486486486


Epoch[2] Batch[745] Speed: 1.2551037072474815 samples/sec                   batch loss = 1793.2587444782257 | accuracy = 0.674496644295302


Epoch[2] Batch[750] Speed: 1.24689569554881 samples/sec                   batch loss = 1806.1936346292496 | accuracy = 0.6743333333333333


Epoch[2] Batch[755] Speed: 1.2485978665378665 samples/sec                   batch loss = 1816.54250061512 | accuracy = 0.6748344370860927


Epoch[2] Batch[760] Speed: 1.257039314546643 samples/sec                   batch loss = 1825.1144934892654 | accuracy = 0.6759868421052632


Epoch[2] Batch[765] Speed: 1.2465772692232675 samples/sec                   batch loss = 1837.3678495883942 | accuracy = 0.6751633986928105


Epoch[2] Batch[770] Speed: 1.2520808476934195 samples/sec                   batch loss = 1847.1899735331535 | accuracy = 0.675974025974026


Epoch[2] Batch[775] Speed: 1.2513840605394464 samples/sec                   batch loss = 1857.6198962330818 | accuracy = 0.6764516129032258


Epoch[2] Batch[780] Speed: 1.2475781219093522 samples/sec                   batch loss = 1868.3620491623878 | accuracy = 0.6762820512820513


Epoch[2] Batch[785] Speed: 1.2449304815786497 samples/sec                   batch loss = 1876.3603829741478 | accuracy = 0.6777070063694267


[Epoch 2] training: accuracy=0.6779822335025381
[Epoch 2] time cost: 646.4355101585388
[Epoch 2] validation: validation accuracy=0.7677777777777778


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).