<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and running inference on GPUs is often faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure your machine has at least one NVIDIA GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find instructions for installing the GPU version of MXNet on your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or `npx.gpu(0)` as the context, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[21:32:30] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU each ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to `npx.cpu()`, which is why the output below shows an array without a GPU context:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[21:32:30] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, some operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
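
As a small illustration of the 0-based indexing described above, the following sketch picks which GPU ids to use given how many are available. It uses only the Python standard library; `select_device_ids` is a hypothetical helper written for this example, not an MXNet function. In a real script you might pass each returned id to `npx.gpu(i)` and fall back to `npx.cpu()` when the list is empty.

```python
def select_device_ids(num_available, num_requested):
    """Return the 0-based GPU ids to use.

    Hypothetical helper for illustration only. An empty list signals
    that no GPU is available and the caller should fall back to the CPU.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    # Never ask for more GPUs than the machine has.
    return list(range(min(num_available, num_requested)))


print(select_device_ids(8, 2))  # first two of eight GPUs
print(select_device_ids(8, 8))  # all eight GPUs; the last id is 7
print(select_device_ids(1, 2))  # only one GPU exists
print(select_device_ids(0, 2))  # no GPU: empty list, use the CPU
```

Clamping the request to the number of available devices mirrors the slicing used in the multi-GPU training code later in this step, where asking for two GPUs on a one-GPU machine simply yields one device.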

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that all inputs to an operation must be on the same device. If they are not, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you can use `net.collect_params().reset_ctx(gpu)` to change the device of parameters that are already loaded.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[21:32:31] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[10.447684, -4.917999]], ctx=gpu(0))
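
The two numbers in the output above are the raw scores produced by the final `nn.Dense(2)` layer, one per class. As a small illustration, the sketch below converts such scores into probabilities with a softmax, using only the Python standard library. The `softmax` helper is written for this example and is not part of MXNet, and the scores are copied from the sample output, so they will differ on your machine because the input is random.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtract the maximum first so that exp() cannot overflow.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Scores copied from the sample output above (illustrative values only).
logits = [10.447684, -4.917999]
probs = softmax(logits)

# The predicted class is the index with the largest probability.
predicted_class = probs.index(max(probs))
print(probs)
print(predicted_class)
```

Because the first score is much larger than the second, nearly all of the probability mass lands on class 0.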

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
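
To make the idea concrete, here is a minimal, framework-free sketch of data parallelism, assuming a toy one-parameter model `w * x` with a squared-error loss. The helpers `split_evenly` and `grad_sum` are hypothetical names written for this example and are not MXNet functions. Each shard stands in for the slice of the batch that one GPU would process, and dividing the summed gradient by the full batch size plays the role that `trainer.step(batch_size)` plays in the training loop later in this step.

```python
def split_evenly(batch, num_parts):
    """Split a list into num_parts equal, contiguous slices."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    step = len(batch) // num_parts
    return [batch[i * step:(i + 1) * step] for i in range(num_parts)]


def grad_sum(shard, w):
    """Sum of d/dw (w * x - y)**2 over the (x, y) pairs in one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard)


# A toy batch of (input, target) pairs and a single model parameter.
batch = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.0)]
w = 1.0

# Each simulated device computes the gradient on its own shard.
shards = split_evenly(batch, 2)
per_device = [grad_sum(shard, w) for shard in shards]

# Summing the per-device gradients and normalizing by the full batch
# size gives the same result as processing the whole batch at once.
combined = sum(per_device) / len(batch)
single_device = grad_sum(batch, w) / len(batch)
print(per_device)
print(combined, single_device)
```

The point of the sketch is that splitting the batch changes where the work happens, not the gradient that is applied to the parameters.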

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
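
The helper above passes per-device lists of labels and outputs to `acc.update`. As a rough, framework-free sketch of the bookkeeping involved, the following code pools correct predictions across shards before dividing by the total. This is not how `gluon.metric.Accuracy` is implemented; `accuracy_over_shards` is a hypothetical function written for this example, and it assumes the predictions are already class indices rather than raw network outputs.

```python
def accuracy_over_shards(label_shards, prediction_shards):
    """Fraction of correct predictions, pooled across all shards."""
    correct = 0
    total = 0
    for labels, predictions in zip(label_shards, prediction_shards):
        # Count matches within this shard, then add them to the running totals.
        correct += sum(1 for l, p in zip(labels, predictions) if l == p)
        total += len(labels)
    return correct / total if total else 0.0


# Two shards, as if a batch of four samples had been split across two devices.
label_shards = [[0, 1], [1, 1]]
prediction_shards = [[0, 0], [1, 1]]
print(accuracy_over_shards(label_shards, prediction_shards))
```

Pooling counts before dividing keeps the result identical to evaluating the whole batch on one device, even if the shards differ in size.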

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7746763827266706 samples/sec                   batch loss = 15.31525993347168 | accuracy = 0.5


Epoch[1] Batch[10] Speed: 1.2575054163029877 samples/sec                   batch loss = 29.646886587142944 | accuracy = 0.5


Epoch[1] Batch[15] Speed: 1.251442213197447 samples/sec                   batch loss = 43.48821806907654 | accuracy = 0.5


Epoch[1] Batch[20] Speed: 1.2375977001353469 samples/sec                   batch loss = 57.78222060203552 | accuracy = 0.5


Epoch[1] Batch[25] Speed: 1.2436364062254108 samples/sec                   batch loss = 71.39983201026917 | accuracy = 0.52


Epoch[1] Batch[30] Speed: 1.2424848666447357 samples/sec                   batch loss = 84.63610553741455 | accuracy = 0.5333333333333333


Epoch[1] Batch[35] Speed: 1.2467572615597218 samples/sec                   batch loss = 97.39195489883423 | accuracy = 0.55


Epoch[1] Batch[40] Speed: 1.2437943416203947 samples/sec                   batch loss = 111.16032481193542 | accuracy = 0.55625


Epoch[1] Batch[45] Speed: 1.247718686899524 samples/sec                   batch loss = 125.67544555664062 | accuracy = 0.5444444444444444


Epoch[1] Batch[50] Speed: 1.2448906677735516 samples/sec                   batch loss = 139.47091341018677 | accuracy = 0.535


Epoch[1] Batch[55] Speed: 1.2500215884301213 samples/sec                   batch loss = 153.68166494369507 | accuracy = 0.5272727272727272


Epoch[1] Batch[60] Speed: 1.259960710940031 samples/sec                   batch loss = 167.98063135147095 | accuracy = 0.5208333333333334


Epoch[1] Batch[65] Speed: 1.2543823564200416 samples/sec                   batch loss = 181.02225756645203 | accuracy = 0.5269230769230769


Epoch[1] Batch[70] Speed: 1.248838956485181 samples/sec                   batch loss = 194.8302445411682 | accuracy = 0.5321428571428571


Epoch[1] Batch[75] Speed: 1.2464176077455347 samples/sec                   batch loss = 209.50186157226562 | accuracy = 0.52


Epoch[1] Batch[80] Speed: 1.2615958241016518 samples/sec                   batch loss = 224.16187000274658 | accuracy = 0.515625


Epoch[1] Batch[85] Speed: 1.2554751691954944 samples/sec                   batch loss = 237.38893604278564 | accuracy = 0.5205882352941177


Epoch[1] Batch[90] Speed: 1.2571441502487093 samples/sec                   batch loss = 251.58146715164185 | accuracy = 0.5166666666666667


Epoch[1] Batch[95] Speed: 1.2510496243543872 samples/sec                   batch loss = 265.73703384399414 | accuracy = 0.5105263157894737


Epoch[1] Batch[100] Speed: 1.2534726318454754 samples/sec                   batch loss = 279.55168628692627 | accuracy = 0.5125


Epoch[1] Batch[105] Speed: 1.2557068924300343 samples/sec                   batch loss = 293.1127345561981 | accuracy = 0.5095238095238095


Epoch[1] Batch[110] Speed: 1.2498818077154195 samples/sec                   batch loss = 306.41791892051697 | accuracy = 0.5136363636363637


Epoch[1] Batch[115] Speed: 1.2533646621905608 samples/sec                   batch loss = 319.60915541648865 | accuracy = 0.5173913043478261


Epoch[1] Batch[120] Speed: 1.250812342839693 samples/sec                   batch loss = 333.0114109516144 | accuracy = 0.51875


Epoch[1] Batch[125] Speed: 1.2446734456490072 samples/sec                   batch loss = 347.02222180366516 | accuracy = 0.518


Epoch[1] Batch[130] Speed: 1.2585801700070336 samples/sec                   batch loss = 360.56211829185486 | accuracy = 0.5211538461538462


Epoch[1] Batch[135] Speed: 1.251937710016992 samples/sec                   batch loss = 374.2312841415405 | accuracy = 0.524074074074074


Epoch[1] Batch[140] Speed: 1.2500037066748384 samples/sec                   batch loss = 388.12001514434814 | accuracy = 0.5232142857142857


Epoch[1] Batch[145] Speed: 1.2441182669092827 samples/sec                   batch loss = 401.781779050827 | accuracy = 0.5258620689655172


Epoch[1] Batch[150] Speed: 1.2384769314218746 samples/sec                   batch loss = 416.079621553421 | accuracy = 0.5216666666666666


Epoch[1] Batch[155] Speed: 1.2454403417853477 samples/sec                   batch loss = 429.27804493904114 | accuracy = 0.5258064516129032


Epoch[1] Batch[160] Speed: 1.2409360577410609 samples/sec                   batch loss = 443.7520053386688 | accuracy = 0.521875


Epoch[1] Batch[165] Speed: 1.2451332851376402 samples/sec                   batch loss = 457.4839653968811 | accuracy = 0.5257575757575758


Epoch[1] Batch[170] Speed: 1.2435579605289393 samples/sec                   batch loss = 471.256383895874 | accuracy = 0.5191176470588236


Epoch[1] Batch[175] Speed: 1.2477905125261248 samples/sec                   batch loss = 485.5043475627899 | accuracy = 0.5171428571428571


Epoch[1] Batch[180] Speed: 1.2530691283876292 samples/sec                   batch loss = 499.0984196662903 | accuracy = 0.5180555555555556


Epoch[1] Batch[185] Speed: 1.2543195226634385 samples/sec                   batch loss = 513.0319707393646 | accuracy = 0.5202702702702703


Epoch[1] Batch[190] Speed: 1.2520534696143812 samples/sec                   batch loss = 527.2712678909302 | accuracy = 0.5157894736842106


Epoch[1] Batch[195] Speed: 1.2593447363227892 samples/sec                   batch loss = 541.1122586727142 | accuracy = 0.5128205128205128


Epoch[1] Batch[200] Speed: 1.2514785263461297 samples/sec                   batch loss = 554.6486148834229 | accuracy = 0.515


Epoch[1] Batch[205] Speed: 1.2474008603438327 samples/sec                   batch loss = 567.7324523925781 | accuracy = 0.5207317073170732


Epoch[1] Batch[210] Speed: 1.2486640315334776 samples/sec                   batch loss = 581.6592972278595 | accuracy = 0.5214285714285715


Epoch[1] Batch[215] Speed: 1.2514623765365127 samples/sec                   batch loss = 596.2614786624908 | accuracy = 0.5197674418604651


Epoch[1] Batch[220] Speed: 1.252948876580038 samples/sec                   batch loss = 609.9732966423035 | accuracy = 0.5204545454545455


Epoch[1] Batch[225] Speed: 1.2518285101335844 samples/sec                   batch loss = 622.9373090267181 | accuracy = 0.5244444444444445


Epoch[1] Batch[230] Speed: 1.2545974455657016 samples/sec                   batch loss = 636.9940092563629 | accuracy = 0.525


Epoch[1] Batch[235] Speed: 1.2566926241474532 samples/sec                   batch loss = 650.7826056480408 | accuracy = 0.5234042553191489


Epoch[1] Batch[240] Speed: 1.2542252838305104 samples/sec                   batch loss = 663.59468126297 | accuracy = 0.5260416666666666


Epoch[1] Batch[245] Speed: 1.2552922753718299 samples/sec                   batch loss = 678.2846448421478 | accuracy = 0.5204081632653061


Epoch[1] Batch[250] Speed: 1.251088060491719 samples/sec                   batch loss = 692.0837087631226 | accuracy = 0.521


Epoch[1] Batch[255] Speed: 1.2455731199619882 samples/sec                   batch loss = 705.3011794090271 | accuracy = 0.5235294117647059


Epoch[1] Batch[260] Speed: 1.243188725800943 samples/sec                   batch loss = 718.9010140895844 | accuracy = 0.5230769230769231


Epoch[1] Batch[265] Speed: 1.2504568688864741 samples/sec                   batch loss = 732.5783996582031 | accuracy = 0.5245283018867924


Epoch[1] Batch[270] Speed: 1.2475254297984681 samples/sec                   batch loss = 745.9842653274536 | accuracy = 0.525


Epoch[1] Batch[275] Speed: 1.2417859400880231 samples/sec                   batch loss = 759.4149732589722 | accuracy = 0.5263636363636364


Epoch[1] Batch[280] Speed: 1.2454876800605534 samples/sec                   batch loss = 773.1984288692474 | accuracy = 0.5258928571428572


Epoch[1] Batch[285] Speed: 1.246177636817988 samples/sec                   batch loss = 787.0843412876129 | accuracy = 0.5245614035087719


Epoch[1] Batch[290] Speed: 1.2516380864604673 samples/sec                   batch loss = 800.3349590301514 | accuracy = 0.5275862068965518


Epoch[1] Batch[295] Speed: 1.2490812554977444 samples/sec                   batch loss = 813.9881842136383 | accuracy = 0.5288135593220339


Epoch[1] Batch[300] Speed: 1.2424154906193172 samples/sec                   batch loss = 827.9326455593109 | accuracy = 0.5291666666666667


Epoch[1] Batch[305] Speed: 1.2460397323882457 samples/sec                   batch loss = 842.111396074295 | accuracy = 0.5295081967213114


Epoch[1] Batch[310] Speed: 1.2445058706955985 samples/sec                   batch loss = 856.1748280525208 | accuracy = 0.5274193548387097


Epoch[1] Batch[315] Speed: 1.2479529394577675 samples/sec                   batch loss = 869.8443155288696 | accuracy = 0.5285714285714286


Epoch[1] Batch[320] Speed: 1.2545420012293262 samples/sec                   batch loss = 883.7665922641754 | accuracy = 0.52890625


Epoch[1] Batch[325] Speed: 1.2527298623672534 samples/sec                   batch loss = 897.5885345935822 | accuracy = 0.5307692307692308


Epoch[1] Batch[330] Speed: 1.2515182025037352 samples/sec                   batch loss = 911.0372433662415 | accuracy = 0.5318181818181819


Epoch[1] Batch[335] Speed: 1.2496645159137443 samples/sec                   batch loss = 924.6884422302246 | accuracy = 0.5305970149253731


Epoch[1] Batch[340] Speed: 1.2511593415214595 samples/sec                   batch loss = 938.7858090400696 | accuracy = 0.5308823529411765


Epoch[1] Batch[345] Speed: 1.2435908678005878 samples/sec                   batch loss = 952.4993271827698 | accuracy = 0.5311594202898551


Epoch[1] Batch[350] Speed: 1.2405903942256684 samples/sec                   batch loss = 966.0854063034058 | accuracy = 0.5314285714285715


Epoch[1] Batch[355] Speed: 1.239622332716498 samples/sec                   batch loss = 979.093367099762 | accuracy = 0.5338028169014084


Epoch[1] Batch[360] Speed: 1.2432180206715218 samples/sec                   batch loss = 993.0666058063507 | accuracy = 0.5326388888888889


Epoch[1] Batch[365] Speed: 1.2472982924121123 samples/sec                   batch loss = 1007.1304297447205 | accuracy = 0.5321917808219178


Epoch[1] Batch[370] Speed: 1.2498962406522531 samples/sec                   batch loss = 1020.2495687007904 | accuracy = 0.5358108108108108


Epoch[1] Batch[375] Speed: 1.2480233994865741 samples/sec                   batch loss = 1033.5265100002289 | accuracy = 0.5373333333333333


Epoch[1] Batch[380] Speed: 1.250359388921453 samples/sec                   batch loss = 1046.4700260162354 | accuracy = 0.5394736842105263


Epoch[1] Batch[385] Speed: 1.242256801392447 samples/sec                   batch loss = 1059.4195303916931 | accuracy = 0.5409090909090909


Epoch[1] Batch[390] Speed: 1.2494082199085055 samples/sec                   batch loss = 1071.6936399936676 | accuracy = 0.5435897435897435


Epoch[1] Batch[395] Speed: 1.2494450664572294 samples/sec                   batch loss = 1085.9846193790436 | accuracy = 0.5417721518987342


Epoch[1] Batch[400] Speed: 1.2461456106521387 samples/sec                   batch loss = 1099.1854388713837 | accuracy = 0.54375


Epoch[1] Batch[405] Speed: 1.245957004379768 samples/sec                   batch loss = 1113.3659446239471 | accuracy = 0.5425925925925926


Epoch[1] Batch[410] Speed: 1.2531819146747751 samples/sec                   batch loss = 1126.625437259674 | accuracy = 0.5439024390243903


Epoch[1] Batch[415] Speed: 1.2500262452211903 samples/sec                   batch loss = 1141.2071537971497 | accuracy = 0.5421686746987951


Epoch[1] Batch[420] Speed: 1.256629841102824 samples/sec                   batch loss = 1154.7498953342438 | accuracy = 0.5434523809523809


Epoch[1] Batch[425] Speed: 1.253147749050105 samples/sec                   batch loss = 1167.7295849323273 | accuracy = 0.5452941176470588


Epoch[1] Batch[430] Speed: 1.2517104574410494 samples/sec                   batch loss = 1180.4458165168762 | accuracy = 0.547093023255814


Epoch[1] Batch[435] Speed: 1.2532617665362575 samples/sec                   batch loss = 1194.2419209480286 | accuracy = 0.5471264367816092


Epoch[1] Batch[440] Speed: 1.2499587251464288 samples/sec                   batch loss = 1208.067370414734 | accuracy = 0.5460227272727273


Epoch[1] Batch[445] Speed: 1.2528103114885267 samples/sec                   batch loss = 1221.7537052631378 | accuracy = 0.547191011235955


Epoch[1] Batch[450] Speed: 1.2543854513796755 samples/sec                   batch loss = 1234.6557266712189 | accuracy = 0.5488888888888889


Epoch[1] Batch[455] Speed: 1.2589542597347458 samples/sec                   batch loss = 1248.361938238144 | accuracy = 0.548901098901099


Epoch[1] Batch[460] Speed: 1.2605149689954276 samples/sec                   batch loss = 1261.6413791179657 | accuracy = 0.5489130434782609


Epoch[1] Batch[465] Speed: 1.2577242179225605 samples/sec                   batch loss = 1274.5757722854614 | accuracy = 0.5516129032258065


Epoch[1] Batch[470] Speed: 1.2576583148944873 samples/sec                   batch loss = 1287.675131559372 | accuracy = 0.551063829787234


Epoch[1] Batch[475] Speed: 1.2615870962890552 samples/sec                   batch loss = 1301.6103904247284 | accuracy = 0.55


Epoch[1] Batch[480] Speed: 1.2542952348833278 samples/sec                   batch loss = 1315.611144065857 | accuracy = 0.5489583333333333


Epoch[1] Batch[485] Speed: 1.2513495262058023 samples/sec                   batch loss = 1328.828292608261 | accuracy = 0.5494845360824743


Epoch[1] Batch[490] Speed: 1.2483760047913448 samples/sec                   batch loss = 1341.9601690769196 | accuracy = 0.5494897959183673


Epoch[1] Batch[495] Speed: 1.2521045825601846 samples/sec                   batch loss = 1355.357587814331 | accuracy = 0.5494949494949495


Epoch[1] Batch[500] Speed: 1.2504847364117655 samples/sec                   batch loss = 1368.462115764618 | accuracy = 0.55


Epoch[1] Batch[505] Speed: 1.245904819206515 samples/sec                   batch loss = 1381.6105020046234 | accuracy = 0.5514851485148515


Epoch[1] Batch[510] Speed: 1.244476976656399 samples/sec                   batch loss = 1395.6124539375305 | accuracy = 0.5514705882352942


Epoch[1] Batch[515] Speed: 1.2436048792684018 samples/sec                   batch loss = 1408.551035642624 | accuracy = 0.5533980582524272


Epoch[1] Batch[520] Speed: 1.253180791391744 samples/sec                   batch loss = 1421.810714483261 | accuracy = 0.5538461538461539


Epoch[1] Batch[525] Speed: 1.2517499615198349 samples/sec                   batch loss = 1434.7433500289917 | accuracy = 0.5552380952380952


Epoch[1] Batch[530] Speed: 1.2528021725611407 samples/sec                   batch loss = 1446.7562808990479 | accuracy = 0.5556603773584906


Epoch[1] Batch[535] Speed: 1.2508002200222974 samples/sec                   batch loss = 1459.6404452323914 | accuracy = 0.5565420560747664


Epoch[1] Batch[540] Speed: 1.2526273514832174 samples/sec                   batch loss = 1472.2783217430115 | accuracy = 0.5587962962962963


Epoch[1] Batch[545] Speed: 1.251452294785762 samples/sec                   batch loss = 1486.736204624176 | accuracy = 0.5587155963302752


Epoch[1] Batch[550] Speed: 1.2577009295112571 samples/sec                   batch loss = 1500.7257196903229 | accuracy = 0.5595454545454546


Epoch[1] Batch[555] Speed: 1.254679636034665 samples/sec                   batch loss = 1514.0352396965027 | accuracy = 0.55990990990991


Epoch[1] Batch[560] Speed: 1.2564115141439771 samples/sec                   batch loss = 1528.309900045395 | accuracy = 0.5598214285714286


Epoch[1] Batch[565] Speed: 1.2522938395729015 samples/sec                   batch loss = 1541.5559861660004 | accuracy = 0.5615044247787611


Epoch[1] Batch[570] Speed: 1.2469813287445526 samples/sec                   batch loss = 1554.442177772522 | accuracy = 0.5618421052631579


Epoch[1] Batch[575] Speed: 1.2471225009089226 samples/sec                   batch loss = 1568.2090156078339 | accuracy = 0.561304347826087


Epoch[1] Batch[580] Speed: 1.2409012716245176 samples/sec                   batch loss = 1581.3974854946136 | accuracy = 0.5616379310344828


Epoch[1] Batch[585] Speed: 1.2478418348669162 samples/sec                   batch loss = 1594.146300792694 | accuracy = 0.5628205128205128


Epoch[1] Batch[590] Speed: 1.2441727014359156 samples/sec                   batch loss = 1609.3831429481506 | accuracy = 0.5614406779661016


Epoch[1] Batch[595] Speed: 1.2453347682049647 samples/sec                   batch loss = 1622.5640869140625 | accuracy = 0.5617647058823529


Epoch[1] Batch[600] Speed: 1.2530250489754187 samples/sec                   batch loss = 1636.3329286575317 | accuracy = 0.5608333333333333


Epoch[1] Batch[605] Speed: 1.2597153081558703 samples/sec                   batch loss = 1648.719333410263 | accuracy = 0.5628099173553719


Epoch[1] Batch[610] Speed: 1.2595747696690311 samples/sec                   batch loss = 1661.8527946472168 | accuracy = 0.5631147540983606


Epoch[1] Batch[615] Speed: 1.248920394167512 samples/sec                   batch loss = 1675.3115034103394 | accuracy = 0.5638211382113821


Epoch[1] Batch[620] Speed: 1.2490050969055269 samples/sec                   batch loss = 1688.6407845020294 | accuracy = 0.5641129032258064


Epoch[1] Batch[625] Speed: 1.2496772683152977 samples/sec                   batch loss = 1703.18901181221 | accuracy = 0.5632


Epoch[1] Batch[630] Speed: 1.2424217470252465 samples/sec                   batch loss = 1717.5375826358795 | accuracy = 0.5630952380952381


Epoch[1] Batch[635] Speed: 1.2432158096897261 samples/sec                   batch loss = 1730.0563523769379 | accuracy = 0.5641732283464567


Epoch[1] Batch[640] Speed: 1.2412968836576421 samples/sec                   batch loss = 1742.5273649692535 | accuracy = 0.564453125


Epoch[1] Batch[645] Speed: 1.2483165577531872 samples/sec                   batch loss = 1756.5957961082458 | accuracy = 0.563953488372093


Epoch[1] Batch[650] Speed: 1.24285092122166 samples/sec                   batch loss = 1769.4290199279785 | accuracy = 0.5642307692307692


Epoch[1] Batch[655] Speed: 1.2505705836519725 samples/sec                   batch loss = 1781.1706684827805 | accuracy = 0.566412213740458


Epoch[1] Batch[660] Speed: 1.2440907746706764 samples/sec                   batch loss = 1794.6743105649948 | accuracy = 0.5662878787878788


Epoch[1] Batch[665] Speed: 1.2484446545870393 samples/sec                   batch loss = 1807.7066713571548 | accuracy = 0.5665413533834587


Epoch[1] Batch[670] Speed: 1.25011631438534 samples/sec                   batch loss = 1820.2385576963425 | accuracy = 0.567910447761194


Epoch[1] Batch[675] Speed: 1.2448781976158778 samples/sec                   batch loss = 1833.6943165063858 | accuracy = 0.5674074074074074


Epoch[1] Batch[680] Speed: 1.2416185903662385 samples/sec                   batch loss = 1845.9086478948593 | accuracy = 0.56875


Epoch[1] Batch[685] Speed: 1.2445054091189316 samples/sec                   batch loss = 1858.468402504921 | accuracy = 0.5704379562043795


Epoch[1] Batch[690] Speed: 1.2485381196247394 samples/sec                   batch loss = 1871.3077193498611 | accuracy = 0.571376811594203


Epoch[1] Batch[695] Speed: 1.245275980169957 samples/sec                   batch loss = 1882.2810062170029 | accuracy = 0.5726618705035971


Epoch[1] Batch[700] Speed: 1.2544501676703559 samples/sec                   batch loss = 1894.6416741609573 | accuracy = 0.5728571428571428


Epoch[1] Batch[705] Speed: 1.2533544561343168 samples/sec                   batch loss = 1907.5200463533401 | accuracy = 0.573404255319149


Epoch[1] Batch[710] Speed: 1.2548813110135828 samples/sec                   batch loss = 1919.3365474939346 | accuracy = 0.5742957746478873


Epoch[1] Batch[715] Speed: 1.2450745161233108 samples/sec                   batch loss = 1932.4345113039017 | accuracy = 0.5737762237762237


Epoch[1] Batch[720] Speed: 1.2484697382630323 samples/sec                   batch loss = 1945.5896719694138 | accuracy = 0.5736111111111111


Epoch[1] Batch[725] Speed: 1.2475387879406186 samples/sec                   batch loss = 1956.1601777076721 | accuracy = 0.5755172413793104


Epoch[1] Batch[730] Speed: 1.2501999683150633 samples/sec                   batch loss = 1969.4043412208557 | accuracy = 0.5753424657534246


Epoch[1] Batch[735] Speed: 1.2453846869316014 samples/sec                   batch loss = 1982.3219723701477 | accuracy = 0.5748299319727891


Epoch[1] Batch[740] Speed: 1.2466870370590544 samples/sec                   batch loss = 1994.6354694366455 | accuracy = 0.5753378378378379


Epoch[1] Batch[745] Speed: 1.2438459812733789 samples/sec                   batch loss = 2007.6787803173065 | accuracy = 0.575503355704698


Epoch[1] Batch[750] Speed: 1.243434274260849 samples/sec                   batch loss = 2019.2215057611465 | accuracy = 0.5763333333333334


Epoch[1] Batch[755] Speed: 1.2508450756204659 samples/sec                   batch loss = 2032.6291283369064 | accuracy = 0.5764900662251655


Epoch[1] Batch[760] Speed: 1.2478666158809268 samples/sec                   batch loss = 2045.8994257450104 | accuracy = 0.5763157894736842


Epoch[1] Batch[765] Speed: 1.2508109440410775 samples/sec                   batch loss = 2059.0196781158447 | accuracy = 0.5771241830065359


Epoch[1] Batch[770] Speed: 1.255182583434302 samples/sec                   batch loss = 2072.2185938358307 | accuracy = 0.5775974025974026


Epoch[1] Batch[775] Speed: 1.2476872309925178 samples/sec                   batch loss = 2083.8715603351593 | accuracy = 0.5780645161290323


Epoch[1] Batch[780] Speed: 1.2528515690574662 samples/sec                   batch loss = 2096.3759372234344 | accuracy = 0.5788461538461539


Epoch[1] Batch[785] Speed: 1.2614185403807756 samples/sec                   batch loss = 2109.01348900795 | accuracy = 0.5796178343949044


[Epoch 1] training: accuracy=0.578997461928934
[Epoch 1] time cost: 648.5814869403839
[Epoch 1] validation: validation accuracy=0.7166666666666667


Epoch[2] Batch[5] Speed: 1.2491083177466695 samples/sec                   batch loss = 12.68394935131073 | accuracy = 0.75


Epoch[2] Batch[10] Speed: 1.2535864274584272 samples/sec                   batch loss = 23.873944997787476 | accuracy = 0.775


Epoch[2] Batch[15] Speed: 1.247991835380563 samples/sec                   batch loss = 37.02005648612976 | accuracy = 0.7


Epoch[2] Batch[20] Speed: 1.2567158752324992 samples/sec                   batch loss = 49.27128028869629 | accuracy = 0.7125


Epoch[2] Batch[25] Speed: 1.2569748017145344 samples/sec                   batch loss = 61.28008460998535 | accuracy = 0.73


Epoch[2] Batch[30] Speed: 1.2524514572130256 samples/sec                   batch loss = 72.49426937103271 | accuracy = 0.7333333333333333


Epoch[2] Batch[35] Speed: 1.2479516398745207 samples/sec                   batch loss = 85.23160684108734 | accuracy = 0.7285714285714285


Epoch[2] Batch[40] Speed: 1.2493715616101964 samples/sec                   batch loss = 98.25592267513275 | accuracy = 0.71875


Epoch[2] Batch[45] Speed: 1.246468168962057 samples/sec                   batch loss = 110.38783085346222 | accuracy = 0.7166666666666667


Epoch[2] Batch[50] Speed: 1.2455264224594247 samples/sec                   batch loss = 122.69540452957153 | accuracy = 0.715


Epoch[2] Batch[55] Speed: 1.254808948317301 samples/sec                   batch loss = 134.740669131279 | accuracy = 0.7136363636363636


Epoch[2] Batch[60] Speed: 1.2474926850073564 samples/sec                   batch loss = 147.04583084583282 | accuracy = 0.7125


Epoch[2] Batch[65] Speed: 1.2562589185410034 samples/sec                   batch loss = 158.69727194309235 | accuracy = 0.7192307692307692


Epoch[2] Batch[70] Speed: 1.2586681711172496 samples/sec                   batch loss = 170.27659559249878 | accuracy = 0.7178571428571429


Epoch[2] Batch[75] Speed: 1.2523179564273308 samples/sec                   batch loss = 184.1981430053711 | accuracy = 0.7033333333333334


Epoch[2] Batch[80] Speed: 1.2456337857090514 samples/sec                   batch loss = 196.06092882156372 | accuracy = 0.7


Epoch[2] Batch[85] Speed: 1.251510360428404 samples/sec                   batch loss = 210.0355930328369 | accuracy = 0.6970588235294117


Epoch[2] Batch[90] Speed: 1.2509891761786454 samples/sec                   batch loss = 222.56821179389954 | accuracy = 0.6944444444444444


Epoch[2] Batch[95] Speed: 1.2460744370047183 samples/sec                   batch loss = 233.41212952136993 | accuracy = 0.7026315789473684


Epoch[2] Batch[100] Speed: 1.245925822287831 samples/sec                   batch loss = 244.7408071756363 | accuracy = 0.705


Epoch[2] Batch[105] Speed: 1.2484166921215778 samples/sec                   batch loss = 257.3265817165375 | accuracy = 0.7047619047619048


Epoch[2] Batch[110] Speed: 1.248389752724981 samples/sec                   batch loss = 271.8094801902771 | accuracy = 0.6931818181818182


Epoch[2] Batch[115] Speed: 1.2574380282978 samples/sec                   batch loss = 283.8230872154236 | accuracy = 0.6934782608695652


Epoch[2] Batch[120] Speed: 1.2606485181803584 samples/sec                   batch loss = 295.3539928197861 | accuracy = 0.69375


Epoch[2] Batch[125] Speed: 1.2541245905198013 samples/sec                   batch loss = 309.68990790843964 | accuracy = 0.686


Epoch[2] Batch[130] Speed: 1.2614794315783855 samples/sec                   batch loss = 319.1900027990341 | accuracy = 0.6903846153846154


Epoch[2] Batch[135] Speed: 1.249020160505623 samples/sec                   batch loss = 331.2067713737488 | accuracy = 0.6888888888888889


Epoch[2] Batch[140] Speed: 1.247045190590605 samples/sec                   batch loss = 342.31612944602966 | accuracy = 0.6875


Epoch[2] Batch[145] Speed: 1.2454131608864982 samples/sec                   batch loss = 351.69208776950836 | accuracy = 0.6948275862068966


Epoch[2] Batch[150] Speed: 1.2532673836973953 samples/sec                   batch loss = 364.741574883461 | accuracy = 0.6916666666666667


Epoch[2] Batch[155] Speed: 1.24733278896586 samples/sec                   batch loss = 378.9371300935745 | accuracy = 0.6854838709677419


Epoch[2] Batch[160] Speed: 1.2511713779942393 samples/sec                   batch loss = 392.2107471227646 | accuracy = 0.6828125


Epoch[2] Batch[165] Speed: 1.2553410230617026 samples/sec                   batch loss = 405.03915083408356 | accuracy = 0.6833333333333333


Epoch[2] Batch[170] Speed: 1.2576651971481194 samples/sec                   batch loss = 417.3823307752609 | accuracy = 0.6808823529411765


Epoch[2] Batch[175] Speed: 1.2553498525280342 samples/sec                   batch loss = 427.755686879158 | accuracy = 0.6842857142857143


Epoch[2] Batch[180] Speed: 1.247109522488157 samples/sec                   batch loss = 439.26357769966125 | accuracy = 0.6888888888888889


Epoch[2] Batch[185] Speed: 1.2351164021670393 samples/sec                   batch loss = 452.06978738307953 | accuracy = 0.6851351351351351


Epoch[2] Batch[190] Speed: 1.2475048365564403 samples/sec                   batch loss = 463.292995095253 | accuracy = 0.6842105263157895


Epoch[2] Batch[195] Speed: 1.2500626625239362 samples/sec                   batch loss = 473.37433886528015 | accuracy = 0.6846153846153846


Epoch[2] Batch[200] Speed: 1.248710035206367 samples/sec                   batch loss = 484.68042266368866 | accuracy = 0.68375


Epoch[2] Batch[205] Speed: 1.246463446037379 samples/sec                   batch loss = 496.6360454559326 | accuracy = 0.6829268292682927


Epoch[2] Batch[210] Speed: 1.2514540684153233 samples/sec                   batch loss = 508.20509600639343 | accuracy = 0.6833333333333333


Epoch[2] Batch[215] Speed: 1.2525963022268332 samples/sec                   batch loss = 522.8843252658844 | accuracy = 0.6813953488372093


Epoch[2] Batch[220] Speed: 1.259658559366192 samples/sec                   batch loss = 533.474858880043 | accuracy = 0.6829545454545455


Epoch[2] Batch[225] Speed: 1.2492808555255859 samples/sec                   batch loss = 545.9932783842087 | accuracy = 0.6822222222222222


Epoch[2] Batch[230] Speed: 1.2496472028508006 samples/sec                   batch loss = 561.9492717981339 | accuracy = 0.6804347826086956


Epoch[2] Batch[235] Speed: 1.249277971753293 samples/sec                   batch loss = 576.5850757360458 | accuracy = 0.6776595744680851


Epoch[2] Batch[240] Speed: 1.2502839129649055 samples/sec                   batch loss = 589.2489289045334 | accuracy = 0.678125


Epoch[2] Batch[245] Speed: 1.2565800521140338 samples/sec                   batch loss = 601.7336709499359 | accuracy = 0.6785714285714286


Epoch[2] Batch[250] Speed: 1.257180512446766 samples/sec                   batch loss = 612.8749191761017 | accuracy = 0.68


Epoch[2] Batch[255] Speed: 1.2588513888680464 samples/sec                   batch loss = 628.2090398073196 | accuracy = 0.6784313725490196


Epoch[2] Batch[260] Speed: 1.2504473625231805 samples/sec                   batch loss = 641.5780106782913 | accuracy = 0.6759615384615385


Epoch[2] Batch[265] Speed: 1.2554941473637955 samples/sec                   batch loss = 654.509675860405 | accuracy = 0.6754716981132075


Epoch[2] Batch[270] Speed: 1.2538504379040751 samples/sec                   batch loss = 666.6059651374817 | accuracy = 0.675


Epoch[2] Batch[275] Speed: 1.2493973338440973 samples/sec                   batch loss = 677.1022822856903 | accuracy = 0.6763636363636364


Epoch[2] Batch[280] Speed: 1.2532778691995745 samples/sec                   batch loss = 689.6886003017426 | accuracy = 0.6741071428571429


Epoch[2] Batch[285] Speed: 1.2492281125051805 samples/sec                   batch loss = 701.2291557788849 | accuracy = 0.6754385964912281


Epoch[2] Batch[290] Speed: 1.2610586246783222 samples/sec                   batch loss = 713.7151575088501 | accuracy = 0.6741379310344827


Epoch[2] Batch[295] Speed: 1.2515123209380232 samples/sec                   batch loss = 728.8380873203278 | accuracy = 0.6703389830508475


Epoch[2] Batch[300] Speed: 1.243971778657287 samples/sec                   batch loss = 737.8864916563034 | accuracy = 0.6725


Epoch[2] Batch[305] Speed: 1.2507749493661477 samples/sec                   batch loss = 748.0773532390594 | accuracy = 0.6737704918032786


Epoch[2] Batch[310] Speed: 1.2469142298751956 samples/sec                   batch loss = 760.640643954277 | accuracy = 0.6733870967741935


Epoch[2] Batch[315] Speed: 1.2476696943170518 samples/sec                   batch loss = 773.414673447609 | accuracy = 0.6722222222222223


Epoch[2] Batch[320] Speed: 1.2476275711955453 samples/sec                   batch loss = 785.6603957414627 | accuracy = 0.67265625


Epoch[2] Batch[325] Speed: 1.2500636870823225 samples/sec                   batch loss = 798.1125465631485 | accuracy = 0.6723076923076923


Epoch[2] Batch[330] Speed: 1.2505895070503061 samples/sec                   batch loss = 812.1684726476669 | accuracy = 0.6704545454545454


Epoch[2] Batch[335] Speed: 1.2500986162137127 samples/sec                   batch loss = 824.6260635852814 | accuracy = 0.6701492537313433


Epoch[2] Batch[340] Speed: 1.2524319164544533 samples/sec                   batch loss = 836.8158222436905 | accuracy = 0.6713235294117647


Epoch[2] Batch[345] Speed: 1.2497192506876087 samples/sec                   batch loss = 847.1585607528687 | accuracy = 0.6731884057971015


Epoch[2] Batch[350] Speed: 1.2533181276682719 samples/sec                   batch loss = 857.122083067894 | accuracy = 0.6757142857142857


Epoch[2] Batch[355] Speed: 1.2573139213218045 samples/sec                   batch loss = 869.5172934532166 | accuracy = 0.676056338028169


Epoch[2] Batch[360] Speed: 1.2562411400834976 samples/sec                   batch loss = 880.0667021274567 | accuracy = 0.675


Epoch[2] Batch[365] Speed: 1.254166215961547 samples/sec                   batch loss = 890.2227157354355 | accuracy = 0.6753424657534246


Epoch[2] Batch[370] Speed: 1.2501701571675474 samples/sec                   batch loss = 900.6415786743164 | accuracy = 0.677027027027027


Epoch[2] Batch[375] Speed: 1.2506520608806335 samples/sec                   batch loss = 912.7635096311569 | accuracy = 0.678


Epoch[2] Batch[380] Speed: 1.2519516299339222 samples/sec                   batch loss = 923.0860460996628 | accuracy = 0.6789473684210526


Epoch[2] Batch[385] Speed: 1.2533897566869876 samples/sec                   batch loss = 936.7472029924393 | accuracy = 0.6785714285714286


Epoch[2] Batch[390] Speed: 1.2533089522548253 samples/sec                   batch loss = 949.2755080461502 | accuracy = 0.6775641025641026


Epoch[2] Batch[395] Speed: 1.2553370780211468 samples/sec                   batch loss = 961.8700470924377 | accuracy = 0.6772151898734177


Epoch[2] Batch[400] Speed: 1.2595484812940276 samples/sec                   batch loss = 974.015283703804 | accuracy = 0.676875


Epoch[2] Batch[405] Speed: 1.254436473521428 samples/sec                   batch loss = 985.5154011249542 | accuracy = 0.6759259259259259


Epoch[2] Batch[410] Speed: 1.2510744396486089 samples/sec                   batch loss = 1000.2151367664337 | accuracy = 0.675


Epoch[2] Batch[415] Speed: 1.2457266453392775 samples/sec                   batch loss = 1014.3594617843628 | accuracy = 0.6728915662650602


Epoch[2] Batch[420] Speed: 1.2512865292911912 samples/sec                   batch loss = 1029.0423393249512 | accuracy = 0.6714285714285714


Epoch[2] Batch[425] Speed: 1.2463502917597375 samples/sec                   batch loss = 1041.243234872818 | accuracy = 0.6711764705882353


Epoch[2] Batch[430] Speed: 1.2533275841023004 samples/sec                   batch loss = 1051.8606840968132 | accuracy = 0.6709302325581395


Epoch[2] Batch[435] Speed: 1.2515094268545535 samples/sec                   batch loss = 1061.7759920954704 | accuracy = 0.6729885057471264


Epoch[2] Batch[440] Speed: 1.2599169022769305 samples/sec                   batch loss = 1075.137929737568 | accuracy = 0.6727272727272727


Epoch[2] Batch[445] Speed: 1.2510611922637884 samples/sec                   batch loss = 1086.4001544117928 | accuracy = 0.6747191011235955


Epoch[2] Batch[450] Speed: 1.2566613730322458 samples/sec                   batch loss = 1099.390157520771 | accuracy = 0.6733333333333333


Epoch[2] Batch[455] Speed: 1.2527133060976046 samples/sec                   batch loss = 1109.6082690358162 | accuracy = 0.6741758241758242


Epoch[2] Batch[460] Speed: 1.2528199473668569 samples/sec                   batch loss = 1123.7909262776375 | accuracy = 0.6728260869565217


Epoch[2] Batch[465] Speed: 1.251117355638636 samples/sec                   batch loss = 1134.5675199627876 | accuracy = 0.6725806451612903


Epoch[2] Batch[470] Speed: 1.2534817160038647 samples/sec                   batch loss = 1147.017658174038 | accuracy = 0.6718085106382978


Epoch[2] Batch[475] Speed: 1.256239164730392 samples/sec                   batch loss = 1158.7842923998833 | accuracy = 0.6726315789473685


Epoch[2] Batch[480] Speed: 1.2613710265670173 samples/sec                   batch loss = 1169.7349991202354 | accuracy = 0.6723958333333333


Epoch[2] Batch[485] Speed: 1.2586383324520902 samples/sec                   batch loss = 1180.3623953461647 | accuracy = 0.6721649484536083


Epoch[2] Batch[490] Speed: 1.256487073036181 samples/sec                   batch loss = 1190.5798552632332 | accuracy = 0.673469387755102


Epoch[2] Batch[495] Speed: 1.252962444694896 samples/sec                   batch loss = 1201.1391914486885 | accuracy = 0.6742424242424242


Epoch[2] Batch[500] Speed: 1.2591763065052501 samples/sec                   batch loss = 1217.0589239001274 | accuracy = 0.672


Epoch[2] Batch[505] Speed: 1.2503737397021752 samples/sec                   batch loss = 1228.2778142094612 | accuracy = 0.6717821782178218


Epoch[2] Batch[510] Speed: 1.249345046161577 samples/sec                   batch loss = 1242.5369529128075 | accuracy = 0.6705882352941176


Epoch[2] Batch[515] Speed: 1.2512782235046347 samples/sec                   batch loss = 1252.099929511547 | accuracy = 0.6723300970873787


Epoch[2] Batch[520] Speed: 1.2495912644588034 samples/sec                   batch loss = 1264.3290240168571 | accuracy = 0.6721153846153847


Epoch[2] Batch[525] Speed: 1.2621948214792595 samples/sec                   batch loss = 1275.9843510985374 | accuracy = 0.6719047619047619


Epoch[2] Batch[530] Speed: 1.2561727588551477 samples/sec                   batch loss = 1289.108823955059 | accuracy = 0.6707547169811321


Epoch[2] Batch[535] Speed: 1.2578441621847596 samples/sec                   batch loss = 1304.2044666409492 | accuracy = 0.6691588785046729


Epoch[2] Batch[540] Speed: 1.2538039610039629 samples/sec                   batch loss = 1317.2125070691109 | accuracy = 0.6685185185185185


Epoch[2] Batch[545] Speed: 1.253760017265601 samples/sec                   batch loss = 1328.9619697928429 | accuracy = 0.668348623853211


Epoch[2] Batch[550] Speed: 1.245753192419574 samples/sec                   batch loss = 1341.9715937972069 | accuracy = 0.6681818181818182


Epoch[2] Batch[555] Speed: 1.2509017791923216 samples/sec                   batch loss = 1353.8886916041374 | accuracy = 0.6693693693693694


Epoch[2] Batch[560] Speed: 1.250723385475335 samples/sec                   batch loss = 1363.8628460764885 | accuracy = 0.6705357142857142


Epoch[2] Batch[565] Speed: 1.2444139313821028 samples/sec                   batch loss = 1376.174248635769 | accuracy = 0.6699115044247788


Epoch[2] Batch[570] Speed: 1.2487973120945992 samples/sec                   batch loss = 1391.15324562788 | accuracy = 0.6684210526315789


Epoch[2] Batch[575] Speed: 1.2481600719202341 samples/sec                   batch loss = 1401.694705426693 | accuracy = 0.6695652173913044


Epoch[2] Batch[580] Speed: 1.2554703777684704 samples/sec                   batch loss = 1416.1484623551369 | accuracy = 0.6689655172413793


Epoch[2] Batch[585] Speed: 1.2575685696606587 samples/sec                   batch loss = 1425.6677741408348 | accuracy = 0.6705128205128205


Epoch[2] Batch[590] Speed: 1.2516124084466673 samples/sec                   batch loss = 1435.0156967043877 | accuracy = 0.6711864406779661


Epoch[2] Batch[595] Speed: 1.2563884624929775 samples/sec                   batch loss = 1446.6217311024666 | accuracy = 0.6710084033613445


Epoch[2] Batch[600] Speed: 1.2491453326181696 samples/sec                   batch loss = 1461.0197523236275 | accuracy = 0.6708333333333333


Epoch[2] Batch[605] Speed: 1.250256892923959 samples/sec                   batch loss = 1472.1370113492012 | accuracy = 0.6710743801652893


Epoch[2] Batch[610] Speed: 1.2461894850652684 samples/sec                   batch loss = 1486.990414917469 | accuracy = 0.6692622950819672


Epoch[2] Batch[615] Speed: 1.2476106856327027 samples/sec                   batch loss = 1497.8820558190346 | accuracy = 0.6703252032520325


Epoch[2] Batch[620] Speed: 1.2551454916116482 samples/sec                   batch loss = 1508.3525353074074 | accuracy = 0.6713709677419355


Epoch[2] Batch[625] Speed: 1.2489185347412177 samples/sec                   batch loss = 1523.788760960102 | accuracy = 0.6708


Epoch[2] Batch[630] Speed: 1.2566148757557418 samples/sec                   batch loss = 1534.8214368224144 | accuracy = 0.6714285714285714


Epoch[2] Batch[635] Speed: 1.2488117199958555 samples/sec                   batch loss = 1548.4048723578453 | accuracy = 0.6696850393700787


Epoch[2] Batch[640] Speed: 1.2518211311879268 samples/sec                   batch loss = 1561.13388723135 | accuracy = 0.6703125


Epoch[2] Batch[645] Speed: 1.2560903726558104 samples/sec                   batch loss = 1571.400138437748 | accuracy = 0.6709302325581395


Epoch[2] Batch[650] Speed: 1.2493609552956897 samples/sec                   batch loss = 1583.06126588583 | accuracy = 0.6711538461538461


Epoch[2] Batch[655] Speed: 1.243366082222221 samples/sec                   batch loss = 1594.4551529288292 | accuracy = 0.6709923664122137


Epoch[2] Batch[660] Speed: 1.2462504885919445 samples/sec                   batch loss = 1605.0653852820396 | accuracy = 0.6708333333333333


Epoch[2] Batch[665] Speed: 1.2401545290855256 samples/sec                   batch loss = 1618.9410683512688 | accuracy = 0.6695488721804511


Epoch[2] Batch[670] Speed: 1.2454790812392005 samples/sec                   batch loss = 1631.4768406748772 | accuracy = 0.6690298507462686


Epoch[2] Batch[675] Speed: 1.2459677380425362 samples/sec                   batch loss = 1641.1598116755486 | accuracy = 0.6703703703703704


Epoch[2] Batch[680] Speed: 1.2466722149566045 samples/sec                   batch loss = 1651.489242374897 | accuracy = 0.6713235294117647


Epoch[2] Batch[685] Speed: 1.2488849730484908 samples/sec                   batch loss = 1665.3662262558937 | accuracy = 0.67007299270073


Epoch[2] Batch[690] Speed: 1.2493206715490595 samples/sec                   batch loss = 1678.614798605442 | accuracy = 0.6692028985507247


Epoch[2] Batch[695] Speed: 1.2408054592666495 samples/sec                   batch loss = 1688.1796944737434 | accuracy = 0.6701438848920863


Epoch[2] Batch[700] Speed: 1.242826615214885 samples/sec                   batch loss = 1698.1305707097054 | accuracy = 0.6714285714285714


Epoch[2] Batch[705] Speed: 1.2405038020578627 samples/sec                   batch loss = 1710.8902449011803 | accuracy = 0.6712765957446809


Epoch[2] Batch[710] Speed: 1.2552064360429418 samples/sec                   batch loss = 1719.275331914425 | accuracy = 0.6725352112676056


Epoch[2] Batch[715] Speed: 1.2449098815425286 samples/sec                   batch loss = 1728.26321464777 | accuracy = 0.6727272727272727


Epoch[2] Batch[720] Speed: 1.2515437832897445 samples/sec                   batch loss = 1739.7529769539833 | accuracy = 0.671875


Epoch[2] Batch[725] Speed: 1.2540335679318362 samples/sec                   batch loss = 1749.8746200203896 | accuracy = 0.6717241379310345


Epoch[2] Batch[730] Speed: 1.2539431208797713 samples/sec                   batch loss = 1761.7689011693 | accuracy = 0.6719178082191781


Epoch[2] Batch[735] Speed: 1.25174828044441 samples/sec                   batch loss = 1772.957852423191 | accuracy = 0.6717687074829932


Epoch[2] Batch[740] Speed: 1.241852304306967 samples/sec                   batch loss = 1784.7212712168694 | accuracy = 0.6719594594594595


Epoch[2] Batch[745] Speed: 1.2459674604454798 samples/sec                   batch loss = 1796.5503911376 | accuracy = 0.6718120805369128


Epoch[2] Batch[750] Speed: 1.2433318047250892 samples/sec                   batch loss = 1807.9004557728767 | accuracy = 0.6713333333333333


Epoch[2] Batch[755] Speed: 1.2470046852938736 samples/sec                   batch loss = 1817.0786293148994 | accuracy = 0.6721854304635762


Epoch[2] Batch[760] Speed: 1.2551727233687902 samples/sec                   batch loss = 1828.6685082316399 | accuracy = 0.6720394736842106


Epoch[2] Batch[765] Speed: 1.2512897956419549 samples/sec                   batch loss = 1838.8039708733559 | accuracy = 0.6722222222222223


Epoch[2] Batch[770] Speed: 1.254781826188516 samples/sec                   batch loss = 1855.1759892106056 | accuracy = 0.6714285714285714


Epoch[2] Batch[775] Speed: 1.2519930178240108 samples/sec                   batch loss = 1865.7278136014938 | accuracy = 0.6712903225806451


Epoch[2] Batch[780] Speed: 1.2416810770225348 samples/sec                   batch loss = 1881.4135352373123 | accuracy = 0.6701923076923076


Epoch[2] Batch[785] Speed: 1.245997811789541 samples/sec                   batch loss = 1891.3888438940048 | accuracy = 0.6713375796178344


[Epoch 2] training: accuracy=0.671002538071066
[Epoch 2] time cost: 646.3019113540649
[Epoch 2] validation: validation accuracy=0.7244444444444444
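
The `batch loss` values in the log above grow steadily within each epoch, which suggests they are a running total rather than a per-batch figure. The following is a minimal sketch, assuming that interpretation, of how one progress line could be parsed and turned into an approximate mean loss per batch. The names `LOG_PATTERN` and `parse_log_line` are hypothetical helpers introduced for illustration; they are not part of MXNet or of the training script.

```python
import re

# Pattern for the per-batch progress lines shown in the log above.
LOG_PATTERN = re.compile(
    r"Epoch\[(\d+)\] Batch\[(\d+)\] Speed: ([\d.]+) samples/sec\s+"
    r"batch loss = ([\d.]+) \| accuracy = ([\d.]+)"
)

def parse_log_line(line):
    """Return (epoch, batch, speed, cumulative_loss, accuracy), or None if no match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    epoch, batch, speed, loss, acc = match.groups()
    return int(epoch), int(batch), float(speed), float(loss), float(acc)

# Last progress line of epoch 2, copied from the log above.
line = ("Epoch[2] Batch[785] Speed: 1.245997811789541 samples/sec"
        "                   batch loss = 1891.3888438940048 | accuracy = 0.6713375796178344")
epoch, batch, speed, cumulative_loss, accuracy = parse_log_line(line)

# Assumption: "batch loss" is a running sum over the epoch, so dividing by the
# batch index gives an approximate mean loss per batch.
mean_loss = cumulative_loss / batch
print(f"epoch {epoch}, batch {batch}: mean loss per batch ~ {mean_loss:.3f}")
```

Comparing this per-batch mean between epochs is usually more informative than comparing the raw running totals, because the totals depend on how many batches have been processed so far.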


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).