<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or run inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following code to check how many GPUs are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[09:36:34] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. In the example code here, you will copy `x` to the second GPU, `npx.gpu(1)`. If you only have one GPU, using `npx.gpu(1)` directly would raise an error, so the code falls back to `npx.cpu()` in that case:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[09:36:34] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
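
The zero-based indexing described above can be sketched without any GPU at all. The helper below, `select_device_ids`, is a hypothetical name introduced for illustration and is not part of MXNet; it only computes which ids you would pass to `npx.gpu(i)` when you ask for a given number of GPUs.

```python
def select_device_ids(num_available, num_requested):
    """Return zero-based ids of the devices to use, at most num_requested of them."""
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# On a hypothetical eight-GPU machine, asking for two GPUs gives ids 0 and 1
ids_two_of_eight = select_device_ids(num_available=8, num_requested=2)

# Asking for all eight gives ids 0 through 7, so the last GPU has id 7
ids_all_eight = select_device_ids(num_available=8, num_requested=8)

# Asking for more GPUs than exist returns only the ids that are available
ids_one_available = select_device_ids(num_available=1, num_requested=2)
```

In a real script, `num_available` would presumably come from `npx.num_gpus()`, and an empty result would be a signal to fall back to `npx.cpu()`.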

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[09:36:34] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[10.068427 , -4.9949946]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, assuming there are *n* GPUs, you split each data batch into *n* parts and have each GPU run the forward and backward passes on its own part of the data.
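
The idea can be illustrated with a minimal pure-Python sketch that needs neither MXNet nor a GPU. It assumes a toy one-parameter model `w * x` with a summed squared-error loss; the names `split_batch` and `grad` are hypothetical helpers, not MXNet functions. The point it shows is that gradients computed independently on each part of the batch add up to the gradient of the whole batch, which is what makes splitting the work across devices valid.

```python
import math


def split_batch(batch, n_parts):
    """Split a list into n_parts contiguous chunks of equal size."""
    if len(batch) % n_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    size = len(batch) // n_parts
    return [batch[i * size:(i + 1) * size] for i in range(n_parts)]


def grad(w, samples):
    """Gradient with respect to w of sum((w * x - y) ** 2) over (x, y) samples."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


batch = [(1.0, 2.0), (2.0, 3.5), (3.0, 6.5), (4.0, 8.0)]  # toy (x, y) pairs
w = 0.5

# One shard per (hypothetical) GPU; each gradient is computed independently
shards = split_batch(batch, n_parts=2)
shard_grads = [grad(w, shard) for shard in shards]

# Summing the per-shard gradients reproduces the full-batch gradient
combined = sum(shard_grads)
full = grad(w, batch)

# A plain SGD step on the combined gradient, normalized by the full batch size
learning_rate = 0.01
w_new = w - learning_rate * combined / len(batch)
```

Here `combined` and `full` agree, so the update computed from two shards matches the update a single device would have computed on the whole batch.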

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
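
To make concrete what such a helper accumulates, here is a simplified pure-Python stand-in for a running accuracy metric fed with per-device lists of labels and outputs. `RunningAccuracy` and `argmax` are hypothetical names for illustration only; this is a sketch of the concept and not MXNet's implementation of `gluon.metric.Accuracy`.

```python
def argmax(scores):
    """Index of the largest score, used as the predicted class."""
    return max(range(len(scores)), key=lambda i: scores[i])


class RunningAccuracy:
    """Accumulates correct and total counts across batches and devices."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, label_shards, output_shards):
        # Each shard holds the labels and raw class scores from one device
        for labels, outputs in zip(label_shards, output_shards):
            for label, scores in zip(labels, outputs):
                self.correct += int(argmax(scores) == label)
                self.total += 1

    def get(self):
        return self.correct / self.total if self.total else float("nan")


# Toy example: a batch of four samples split across two hypothetical devices
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],   # device 0 predicts classes 0 and 1
    [[1.5, 0.2], [0.8, 0.1]],    # device 1 predicts classes 0 and 0
]

metric = RunningAccuracy()
metric.update(label_shards, output_shards)
accuracy_value = metric.get()
```

Three of the four toy predictions match their labels, so `accuracy_value` is 0.75 regardless of how the samples were distributed across devices.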

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7767226393285299 samples/sec                   batch loss = 14.515136480331421 | accuracy = 0.6


Epoch[1] Batch[10] Speed: 1.2648274170222604 samples/sec                   batch loss = 28.558231592178345 | accuracy = 0.525


Epoch[1] Batch[15] Speed: 1.2621738360457522 samples/sec                   batch loss = 41.22479724884033 | accuracy = 0.5666666666666667


Epoch[1] Batch[20] Speed: 1.2595943448192914 samples/sec                   batch loss = 55.27118110656738 | accuracy = 0.55


Epoch[1] Batch[25] Speed: 1.2577732489760716 samples/sec                   batch loss = 69.70912981033325 | accuracy = 0.55


Epoch[1] Batch[30] Speed: 1.2544280320717311 samples/sec                   batch loss = 83.4534912109375 | accuracy = 0.575


Epoch[1] Batch[35] Speed: 1.260009916526638 samples/sec                   batch loss = 96.94073176383972 | accuracy = 0.5785714285714286


Epoch[1] Batch[40] Speed: 1.2582080009959342 samples/sec                   batch loss = 110.83357548713684 | accuracy = 0.58125


Epoch[1] Batch[45] Speed: 1.2585976370596832 samples/sec                   batch loss = 125.21331286430359 | accuracy = 0.5666666666666667


Epoch[1] Batch[50] Speed: 1.260775558237507 samples/sec                   batch loss = 139.3997938632965 | accuracy = 0.575


Epoch[1] Batch[55] Speed: 1.260972279260163 samples/sec                   batch loss = 153.5342366695404 | accuracy = 0.5727272727272728


Epoch[1] Batch[60] Speed: 1.2582480106370348 samples/sec                   batch loss = 168.45573782920837 | accuracy = 0.5541666666666667


Epoch[1] Batch[65] Speed: 1.2573217420707763 samples/sec                   batch loss = 182.9184160232544 | accuracy = 0.5461538461538461


Epoch[1] Batch[70] Speed: 1.2552004258513942 samples/sec                   batch loss = 196.47592163085938 | accuracy = 0.55


Epoch[1] Batch[75] Speed: 1.257227711079198 samples/sec                   batch loss = 210.63712096214294 | accuracy = 0.54


Epoch[1] Batch[80] Speed: 1.2613453269738348 samples/sec                   batch loss = 224.32840299606323 | accuracy = 0.54375


Epoch[1] Batch[85] Speed: 1.26135082716769 samples/sec                   batch loss = 238.42165350914001 | accuracy = 0.5441176470588235


Epoch[1] Batch[90] Speed: 1.263522403868436 samples/sec                   batch loss = 251.88676762580872 | accuracy = 0.5472222222222223


Epoch[1] Batch[95] Speed: 1.2637009457862929 samples/sec                   batch loss = 265.16444635391235 | accuracy = 0.5473684210526316


Epoch[1] Batch[100] Speed: 1.261914661759777 samples/sec                   batch loss = 278.5253348350525 | accuracy = 0.545


Epoch[1] Batch[105] Speed: 1.253635511120576 samples/sec                   batch loss = 292.1576313972473 | accuracy = 0.55


Epoch[1] Batch[110] Speed: 1.2590372110234618 samples/sec                   batch loss = 305.891339302063 | accuracy = 0.5522727272727272


Epoch[1] Batch[115] Speed: 1.253338351502907 samples/sec                   batch loss = 319.64624977111816 | accuracy = 0.5521739130434783


Epoch[1] Batch[120] Speed: 1.2591298119329928 samples/sec                   batch loss = 332.9454753398895 | accuracy = 0.5583333333333333


Epoch[1] Batch[125] Speed: 1.2527095646283637 samples/sec                   batch loss = 346.4424843788147 | accuracy = 0.558


Epoch[1] Batch[130] Speed: 1.2551619244271626 samples/sec                   batch loss = 360.7708406448364 | accuracy = 0.5538461538461539


Epoch[1] Batch[135] Speed: 1.2521769140420014 samples/sec                   batch loss = 374.4974799156189 | accuracy = 0.5537037037037037


Epoch[1] Batch[140] Speed: 1.260095562338101 samples/sec                   batch loss = 388.27991676330566 | accuracy = 0.5535714285714286


Epoch[1] Batch[145] Speed: 1.2573288090962975 samples/sec                   batch loss = 401.86100339889526 | accuracy = 0.5551724137931034


Epoch[1] Batch[150] Speed: 1.2516178240855407 samples/sec                   batch loss = 415.0481626987457 | accuracy = 0.5583333333333333


Epoch[1] Batch[155] Speed: 1.2529442915593776 samples/sec                   batch loss = 428.4890167713165 | accuracy = 0.5629032258064516


Epoch[1] Batch[160] Speed: 1.257470449037913 samples/sec                   batch loss = 441.92360615730286 | accuracy = 0.5640625


Epoch[1] Batch[165] Speed: 1.2536660498423167 samples/sec                   batch loss = 456.59467458724976 | accuracy = 0.5606060606060606


Epoch[1] Batch[170] Speed: 1.2629054143672027 samples/sec                   batch loss = 469.98371601104736 | accuracy = 0.5617647058823529


Epoch[1] Batch[175] Speed: 1.263107365228247 samples/sec                   batch loss = 483.5625081062317 | accuracy = 0.5614285714285714


Epoch[1] Batch[180] Speed: 1.265046961916899 samples/sec                   batch loss = 497.38216400146484 | accuracy = 0.5611111111111111


Epoch[1] Batch[185] Speed: 1.2625551960999084 samples/sec                   batch loss = 510.50047612190247 | accuracy = 0.5635135135135135


Epoch[1] Batch[190] Speed: 1.2565450421821316 samples/sec                   batch loss = 524.1706063747406 | accuracy = 0.5618421052631579


Epoch[1] Batch[195] Speed: 1.2554446362179552 samples/sec                   batch loss = 538.5306572914124 | accuracy = 0.5602564102564103


Epoch[1] Batch[200] Speed: 1.2626659904236706 samples/sec                   batch loss = 551.7989985942841 | accuracy = 0.55875


Epoch[1] Batch[205] Speed: 1.256632005930969 samples/sec                   batch loss = 565.0104231834412 | accuracy = 0.5597560975609757


Epoch[1] Batch[210] Speed: 1.256667961993493 samples/sec                   batch loss = 578.2180182933807 | accuracy = 0.5619047619047619


Epoch[1] Batch[215] Speed: 1.2543310573440298 samples/sec                   batch loss = 593.1047821044922 | accuracy = 0.5593023255813954


Epoch[1] Batch[220] Speed: 1.2589431122170838 samples/sec                   batch loss = 607.3756890296936 | accuracy = 0.5545454545454546


Epoch[1] Batch[225] Speed: 1.258187619704208 samples/sec                   batch loss = 621.0029594898224 | accuracy = 0.5544444444444444


Epoch[1] Batch[230] Speed: 1.2523615186813475 samples/sec                   batch loss = 635.0693638324738 | accuracy = 0.5554347826086956


Epoch[1] Batch[235] Speed: 1.248907564238781 samples/sec                   batch loss = 649.1984686851501 | accuracy = 0.5531914893617021


Epoch[1] Batch[240] Speed: 1.2513140604957846 samples/sec                   batch loss = 662.5380029678345 | accuracy = 0.5541666666666667


Epoch[1] Batch[245] Speed: 1.2630160800044927 samples/sec                   batch loss = 676.4074559211731 | accuracy = 0.5540816326530612


Epoch[1] Batch[250] Speed: 1.2641017066999316 samples/sec                   batch loss = 690.2597570419312 | accuracy = 0.551


Epoch[1] Batch[255] Speed: 1.2616993337828977 samples/sec                   batch loss = 704.2818465232849 | accuracy = 0.5490196078431373


Epoch[1] Batch[260] Speed: 1.2595812000860682 samples/sec                   batch loss = 718.0293891429901 | accuracy = 0.5490384615384616


Epoch[1] Batch[265] Speed: 1.259810752234076 samples/sec                   batch loss = 731.9689252376556 | accuracy = 0.5509433962264151


Epoch[1] Batch[270] Speed: 1.259286792137864 samples/sec                   batch loss = 745.6918025016785 | accuracy = 0.549074074074074


Epoch[1] Batch[275] Speed: 1.2608920104276087 samples/sec                   batch loss = 759.2701835632324 | accuracy = 0.5472727272727272


Epoch[1] Batch[280] Speed: 1.263242986181533 samples/sec                   batch loss = 772.7378587722778 | accuracy = 0.55


Epoch[1] Batch[285] Speed: 1.258466221189712 samples/sec                   batch loss = 786.1422402858734 | accuracy = 0.5526315789473685


Epoch[1] Batch[290] Speed: 1.2579119709885462 samples/sec                   batch loss = 799.7659833431244 | accuracy = 0.5517241379310345


Epoch[1] Batch[295] Speed: 1.254698590159332 samples/sec                   batch loss = 813.5875816345215 | accuracy = 0.5508474576271186


Epoch[1] Batch[300] Speed: 1.2560036719124377 samples/sec                   batch loss = 826.6182773113251 | accuracy = 0.5541666666666667


Epoch[1] Batch[305] Speed: 1.2612787594655772 samples/sec                   batch loss = 839.4813239574432 | accuracy = 0.5565573770491803


Epoch[1] Batch[310] Speed: 1.2535147759598133 samples/sec                   batch loss = 853.3248646259308 | accuracy = 0.5556451612903226


Epoch[1] Batch[315] Speed: 1.2614394058611929 samples/sec                   batch loss = 866.529926776886 | accuracy = 0.5587301587301587


Epoch[1] Batch[320] Speed: 1.2617594929502343 samples/sec                   batch loss = 879.5894572734833 | accuracy = 0.56015625


Epoch[1] Batch[325] Speed: 1.2616384213547358 samples/sec                   batch loss = 893.4089059829712 | accuracy = 0.5592307692307692


Epoch[1] Batch[330] Speed: 1.2538474392904109 samples/sec                   batch loss = 907.0977394580841 | accuracy = 0.5606060606060606


Epoch[1] Batch[335] Speed: 1.2546171477790153 samples/sec                   batch loss = 921.1927120685577 | accuracy = 0.5574626865671641


Epoch[1] Batch[340] Speed: 1.2582995362344276 samples/sec                   batch loss = 934.5049731731415 | accuracy = 0.5566176470588236


Epoch[1] Batch[345] Speed: 1.2497247430469547 samples/sec                   batch loss = 948.4570508003235 | accuracy = 0.5565217391304348


Epoch[1] Batch[350] Speed: 1.2474435246677535 samples/sec                   batch loss = 962.1439616680145 | accuracy = 0.5578571428571428


Epoch[1] Batch[355] Speed: 1.2532614856795223 samples/sec                   batch loss = 975.9243984222412 | accuracy = 0.5591549295774648


Epoch[1] Batch[360] Speed: 1.2518369166333758 samples/sec                   batch loss = 990.261595249176 | accuracy = 0.5576388888888889


Epoch[1] Batch[365] Speed: 1.2536600543930665 samples/sec                   batch loss = 1004.1897964477539 | accuracy = 0.5575342465753425


Epoch[1] Batch[370] Speed: 1.2587393741541677 samples/sec                   batch loss = 1017.2464847564697 | accuracy = 0.5594594594594594


Epoch[1] Batch[375] Speed: 1.2564021992933625 samples/sec                   batch loss = 1030.464432477951 | accuracy = 0.5626666666666666


Epoch[1] Batch[380] Speed: 1.2595492377786934 samples/sec                   batch loss = 1043.2547011375427 | accuracy = 0.5644736842105263


Epoch[1] Batch[385] Speed: 1.2567380916478892 samples/sec                   batch loss = 1056.6168024539948 | accuracy = 0.5636363636363636


Epoch[1] Batch[390] Speed: 1.2581260080551424 samples/sec                   batch loss = 1070.4555275440216 | accuracy = 0.5634615384615385


Epoch[1] Batch[395] Speed: 1.262114112092676 samples/sec                   batch loss = 1084.2076811790466 | accuracy = 0.5651898734177215


Epoch[1] Batch[400] Speed: 1.2580223291776964 samples/sec                   batch loss = 1097.4124948978424 | accuracy = 0.566875


Epoch[1] Batch[405] Speed: 1.258794151203113 samples/sec                   batch loss = 1110.5672993659973 | accuracy = 0.5685185185185185


Epoch[1] Batch[410] Speed: 1.2693041908817666 samples/sec                   batch loss = 1123.8311641216278 | accuracy = 0.5689024390243902


Epoch[1] Batch[415] Speed: 1.2626983962122997 samples/sec                   batch loss = 1137.4692635536194 | accuracy = 0.5692771084337349


Epoch[1] Batch[420] Speed: 1.2560260512185812 samples/sec                   batch loss = 1151.1512355804443 | accuracy = 0.5708333333333333


Epoch[1] Batch[425] Speed: 1.2611392939244344 samples/sec                   batch loss = 1164.781189918518 | accuracy = 0.57


Epoch[1] Batch[430] Speed: 1.2586793137663999 samples/sec                   batch loss = 1178.5972378253937 | accuracy = 0.5697674418604651


Epoch[1] Batch[435] Speed: 1.2582630149084661 samples/sec                   batch loss = 1191.6384453773499 | accuracy = 0.5712643678160919


Epoch[1] Batch[440] Speed: 1.2571611064523822 samples/sec                   batch loss = 1205.4176371097565 | accuracy = 0.571590909090909


Epoch[1] Batch[445] Speed: 1.2553340722926039 samples/sec                   batch loss = 1218.7923774719238 | accuracy = 0.5713483146067416


Epoch[1] Batch[450] Speed: 1.250845262136946 samples/sec                   batch loss = 1232.492154121399 | accuracy = 0.5688888888888889


Epoch[1] Batch[455] Speed: 1.2603919582964906 samples/sec                   batch loss = 1246.820771932602 | accuracy = 0.567032967032967


Epoch[1] Batch[460] Speed: 1.2569153804245872 samples/sec                   batch loss = 1260.316213130951 | accuracy = 0.5668478260869565


Epoch[1] Batch[465] Speed: 1.252197287983233 samples/sec                   batch loss = 1274.2929587364197 | accuracy = 0.5655913978494623


Epoch[1] Batch[470] Speed: 1.2495587833981008 samples/sec                   batch loss = 1287.146867275238 | accuracy = 0.5670212765957446


Epoch[1] Batch[475] Speed: 1.2432851829911113 samples/sec                   batch loss = 1301.796527147293 | accuracy = 0.5663157894736842


Epoch[1] Batch[480] Speed: 1.2568537049962532 samples/sec                   batch loss = 1315.3374524116516 | accuracy = 0.565625


Epoch[1] Batch[485] Speed: 1.2531767663107518 samples/sec                   batch loss = 1328.5271253585815 | accuracy = 0.5644329896907216


Epoch[1] Batch[490] Speed: 1.2548899462895768 samples/sec                   batch loss = 1340.991914987564 | accuracy = 0.5663265306122449


Epoch[1] Batch[495] Speed: 1.2516148361410564 samples/sec                   batch loss = 1354.4897117614746 | accuracy = 0.5666666666666667


Epoch[1] Batch[500] Speed: 1.2534858366964405 samples/sec                   batch loss = 1368.3416957855225 | accuracy = 0.5665


Epoch[1] Batch[505] Speed: 1.254197061821244 samples/sec                   batch loss = 1382.3834218978882 | accuracy = 0.5668316831683168


Epoch[1] Batch[510] Speed: 1.2516348182911052 samples/sec                   batch loss = 1395.4466948509216 | accuracy = 0.5676470588235294


Epoch[1] Batch[515] Speed: 1.2536343870242403 samples/sec                   batch loss = 1407.7928974628448 | accuracy = 0.5699029126213592


Epoch[1] Batch[520] Speed: 1.2510156681785738 samples/sec                   batch loss = 1421.1435151100159 | accuracy = 0.5692307692307692


Epoch[1] Batch[525] Speed: 1.256297016930176 samples/sec                   batch loss = 1435.5897204875946 | accuracy = 0.5676190476190476


Epoch[1] Batch[530] Speed: 1.2544488545198271 samples/sec                   batch loss = 1448.21955037117 | accuracy = 0.5693396226415094


Epoch[1] Batch[535] Speed: 1.2566670207090846 samples/sec                   batch loss = 1459.938754081726 | accuracy = 0.5705607476635514


Epoch[1] Batch[540] Speed: 1.2629560861737936 samples/sec                   batch loss = 1473.6653718948364 | accuracy = 0.5712962962962963


Epoch[1] Batch[545] Speed: 1.2572891405549778 samples/sec                   batch loss = 1486.6134450435638 | accuracy = 0.5711009174311926


Epoch[1] Batch[550] Speed: 1.259806968249023 samples/sec                   batch loss = 1500.835455417633 | accuracy = 0.5704545454545454


Epoch[1] Batch[555] Speed: 1.261037487440147 samples/sec                   batch loss = 1514.6285531520844 | accuracy = 0.5698198198198198


Epoch[1] Batch[560] Speed: 1.2619048854729067 samples/sec                   batch loss = 1527.8563797473907 | accuracy = 0.5700892857142857


Epoch[1] Batch[565] Speed: 1.2578517066137092 samples/sec                   batch loss = 1540.0406517982483 | accuracy = 0.5716814159292035


Epoch[1] Batch[570] Speed: 1.2550347925995118 samples/sec                   batch loss = 1554.1629405021667 | accuracy = 0.5706140350877194


Epoch[1] Batch[575] Speed: 1.2606477603746624 samples/sec                   batch loss = 1567.8663160800934 | accuracy = 0.571304347826087


Epoch[1] Batch[580] Speed: 1.2625532958523058 samples/sec                   batch loss = 1580.7575008869171 | accuracy = 0.571551724137931


Epoch[1] Batch[585] Speed: 1.263604340350087 samples/sec                   batch loss = 1594.735162973404 | accuracy = 0.5722222222222222


Epoch[1] Batch[590] Speed: 1.2582974600298638 samples/sec                   batch loss = 1609.3823890686035 | accuracy = 0.5716101694915254


Epoch[1] Batch[595] Speed: 1.2592134479292634 samples/sec                   batch loss = 1622.7494497299194 | accuracy = 0.5714285714285714


Epoch[1] Batch[600] Speed: 1.2534592399758173 samples/sec                   batch loss = 1635.0756077766418 | accuracy = 0.5733333333333334


Epoch[1] Batch[605] Speed: 1.256153101805417 samples/sec                   batch loss = 1648.7530438899994 | accuracy = 0.5743801652892562


Epoch[1] Batch[610] Speed: 1.258027517432346 samples/sec                   batch loss = 1662.0809302330017 | accuracy = 0.575


Epoch[1] Batch[615] Speed: 1.2562304168127396 samples/sec                   batch loss = 1674.676831483841 | accuracy = 0.5760162601626017


Epoch[1] Batch[620] Speed: 1.2611421379169752 samples/sec                   batch loss = 1687.8403877019882 | accuracy = 0.5762096774193548


Epoch[1] Batch[625] Speed: 1.2593307460475935 samples/sec                   batch loss = 1700.3416553735733 | accuracy = 0.5764


Epoch[1] Batch[630] Speed: 1.2583083129930568 samples/sec                   batch loss = 1712.196811556816 | accuracy = 0.5785714285714286


Epoch[1] Batch[635] Speed: 1.2562079361638643 samples/sec                   batch loss = 1726.2250841856003 | accuracy = 0.5779527559055118


Epoch[1] Batch[640] Speed: 1.2486237927160853 samples/sec                   batch loss = 1742.2417925596237 | accuracy = 0.575390625


Epoch[1] Batch[645] Speed: 1.260578046147968 samples/sec                   batch loss = 1755.595465540886 | accuracy = 0.5751937984496124


Epoch[1] Batch[650] Speed: 1.2541136221133782 samples/sec                   batch loss = 1768.0968016386032 | accuracy = 0.5761538461538461


Epoch[1] Batch[655] Speed: 1.2539669264034707 samples/sec                   batch loss = 1783.2462114095688 | accuracy = 0.5751908396946565


Epoch[1] Batch[660] Speed: 1.2609420469417345 samples/sec                   batch loss = 1796.92966735363 | accuracy = 0.5753787878787879


Epoch[1] Batch[665] Speed: 1.25635007631459 samples/sec                   batch loss = 1810.1739262342453 | accuracy = 0.5759398496240602


Epoch[1] Batch[670] Speed: 1.2546080471559642 samples/sec                   batch loss = 1823.341584801674 | accuracy = 0.576865671641791


Epoch[1] Batch[675] Speed: 1.254761649586836 samples/sec                   batch loss = 1836.4137696027756 | accuracy = 0.5774074074074074


Epoch[1] Batch[680] Speed: 1.2540435975795483 samples/sec                   batch loss = 1849.1090923547745 | accuracy = 0.5775735294117647


Epoch[1] Batch[685] Speed: 1.2522308410103045 samples/sec                   batch loss = 1861.6966091394424 | accuracy = 0.5784671532846716


Epoch[1] Batch[690] Speed: 1.2521517746337931 samples/sec                   batch loss = 1875.3020907640457 | accuracy = 0.5778985507246377


Epoch[1] Batch[695] Speed: 1.2540789369625833 samples/sec                   batch loss = 1888.300539135933 | accuracy = 0.5784172661870504


Epoch[1] Batch[700] Speed: 1.2546541145536731 samples/sec                   batch loss = 1902.5934118032455 | accuracy = 0.5775


Epoch[1] Batch[705] Speed: 1.2586137827125092 samples/sec                   batch loss = 1914.9877179861069 | accuracy = 0.5783687943262411


Epoch[1] Batch[710] Speed: 1.2584881219288788 samples/sec                   batch loss = 1925.8436485528946 | accuracy = 0.5802816901408451


Epoch[1] Batch[715] Speed: 1.2549022423767933 samples/sec                   batch loss = 1937.9826501607895 | accuracy = 0.5807692307692308


Epoch[1] Batch[720] Speed: 1.2517928304688286 samples/sec                   batch loss = 1950.9060579538345 | accuracy = 0.5822916666666667


Epoch[1] Batch[725] Speed: 1.2564350371088033 samples/sec                   batch loss = 1964.7794548273087 | accuracy = 0.5817241379310345


Epoch[1] Batch[730] Speed: 1.263830601120476 samples/sec                   batch loss = 1977.62404692173 | accuracy = 0.5818493150684931


Epoch[1] Batch[735] Speed: 1.255184555465994 samples/sec                   batch loss = 1990.2452384233475 | accuracy = 0.5829931972789115


Epoch[1] Batch[740] Speed: 1.258360504204769 samples/sec                   batch loss = 2003.3691893815994 | accuracy = 0.5827702702702703


Epoch[1] Batch[745] Speed: 1.253464016064335 samples/sec                   batch loss = 2016.8703657388687 | accuracy = 0.5812080536912752


Epoch[1] Batch[750] Speed: 1.2572952649882212 samples/sec                   batch loss = 2029.4250180721283 | accuracy = 0.5816666666666667


Epoch[1] Batch[755] Speed: 1.2570311205789537 samples/sec                   batch loss = 2042.8190649747849 | accuracy = 0.5817880794701987


Epoch[1] Batch[760] Speed: 1.2594024967631439 samples/sec                   batch loss = 2055.867865204811 | accuracy = 0.5822368421052632


Epoch[1] Batch[765] Speed: 1.261529325009668 samples/sec                   batch loss = 2068.4115155935287 | accuracy = 0.5823529411764706


Epoch[1] Batch[770] Speed: 1.2589613451240096 samples/sec                   batch loss = 2080.527846455574 | accuracy = 0.5834415584415584


Epoch[1] Batch[775] Speed: 1.2568779036412505 samples/sec                   batch loss = 2094.218401312828 | accuracy = 0.5835483870967741


Epoch[1] Batch[780] Speed: 1.2504062631452324 samples/sec                   batch loss = 2106.620847582817 | accuracy = 0.5842948717948718


Epoch[1] Batch[785] Speed: 1.256889862070531 samples/sec                   batch loss = 2119.854264855385 | accuracy = 0.5840764331210191


[Epoch 1] training: accuracy=0.5843908629441624
[Epoch 1] time cost: 644.9156265258789
[Epoch 1] validation: validation accuracy=0.6644444444444444


Epoch[2] Batch[5] Speed: 1.2541050912632452 samples/sec                   batch loss = 15.133585691452026 | accuracy = 0.4


Epoch[2] Batch[10] Speed: 1.2571074133765088 samples/sec                   batch loss = 26.665149211883545 | accuracy = 0.575


Epoch[2] Batch[15] Speed: 1.2570463783976322 samples/sec                   batch loss = 37.93652892112732 | accuracy = 0.6666666666666666


Epoch[2] Batch[20] Speed: 1.2587092488092608 samples/sec                   batch loss = 50.00120520591736 | accuracy = 0.675


Epoch[2] Batch[25] Speed: 1.2612481331858476 samples/sec                   batch loss = 61.96054124832153 | accuracy = 0.67


Epoch[2] Batch[30] Speed: 1.263553711676074 samples/sec                   batch loss = 73.96652483940125 | accuracy = 0.6583333333333333


Epoch[2] Batch[35] Speed: 1.2597061334217126 samples/sec                   batch loss = 88.40493845939636 | accuracy = 0.6357142857142857


Epoch[2] Batch[40] Speed: 1.2595342028164045 samples/sec                   batch loss = 100.94038605690002 | accuracy = 0.6375


Epoch[2] Batch[45] Speed: 1.25539090178438 samples/sec                   batch loss = 113.8200261592865 | accuracy = 0.6444444444444445


Epoch[2] Batch[50] Speed: 1.256903704001153 samples/sec                   batch loss = 127.92928719520569 | accuracy = 0.63


Epoch[2] Batch[55] Speed: 1.265331758973279 samples/sec                   batch loss = 141.37231135368347 | accuracy = 0.6318181818181818


Epoch[2] Batch[60] Speed: 1.2664114300897353 samples/sec                   batch loss = 155.33521580696106 | accuracy = 0.625


Epoch[2] Batch[65] Speed: 1.2656585032228727 samples/sec                   batch loss = 168.24506258964539 | accuracy = 0.6192307692307693


Epoch[2] Batch[70] Speed: 1.251089366615546 samples/sec                   batch loss = 182.04802775382996 | accuracy = 0.6071428571428571


Epoch[2] Batch[75] Speed: 1.2528539080033088 samples/sec                   batch loss = 195.26456308364868 | accuracy = 0.6066666666666667


Epoch[2] Batch[80] Speed: 1.253162257512453 samples/sec                   batch loss = 207.94978749752045 | accuracy = 0.6125


Epoch[2] Batch[85] Speed: 1.2469404568885012 samples/sec                   batch loss = 219.1090282201767 | accuracy = 0.6264705882352941


Epoch[2] Batch[90] Speed: 1.2501960555208966 samples/sec                   batch loss = 233.01092171669006 | accuracy = 0.6277777777777778


Epoch[2] Batch[95] Speed: 1.2481554290230752 samples/sec                   batch loss = 245.7668948173523 | accuracy = 0.6263157894736842


Epoch[2] Batch[100] Speed: 1.257824735696767 samples/sec                   batch loss = 258.5285177230835 | accuracy = 0.6275


Epoch[2] Batch[105] Speed: 1.2561147300073163 samples/sec                   batch loss = 272.32355892658234 | accuracy = 0.6214285714285714


Epoch[2] Batch[110] Speed: 1.2568174558512688 samples/sec                   batch loss = 285.17943143844604 | accuracy = 0.625


Epoch[2] Batch[115] Speed: 1.2528824438457955 samples/sec                   batch loss = 298.162579536438 | accuracy = 0.6239130434782608


Epoch[2] Batch[120] Speed: 1.2508252119337493 samples/sec                   batch loss = 310.348557472229 | accuracy = 0.6270833333333333


Epoch[2] Batch[125] Speed: 1.2522688824108759 samples/sec                   batch loss = 323.4511992931366 | accuracy = 0.63


Epoch[2] Batch[130] Speed: 1.2639245750271246 samples/sec                   batch loss = 336.0146543979645 | accuracy = 0.6326923076923077


Epoch[2] Batch[135] Speed: 1.254652613322419 samples/sec                   batch loss = 349.21203088760376 | accuracy = 0.6314814814814815


Epoch[2] Batch[140] Speed: 1.2561987183898102 samples/sec                   batch loss = 362.5808937549591 | accuracy = 0.6285714285714286


Epoch[2] Batch[145] Speed: 1.2590039536132394 samples/sec                   batch loss = 375.3208655118942 | accuracy = 0.6275862068965518


Epoch[2] Batch[150] Speed: 1.2546762581319766 samples/sec                   batch loss = 389.4473088979721 | accuracy = 0.625


Epoch[2] Batch[155] Speed: 1.2584691475374619 samples/sec                   batch loss = 402.6025747060776 | accuracy = 0.6258064516129033


Epoch[2] Batch[160] Speed: 1.2527739210131619 samples/sec                   batch loss = 414.9434577226639 | accuracy = 0.628125


Epoch[2] Batch[165] Speed: 1.2524519247027563 samples/sec                   batch loss = 426.8235012292862 | accuracy = 0.6318181818181818


Epoch[2] Batch[170] Speed: 1.2578197377168103 samples/sec                   batch loss = 440.37158477306366 | accuracy = 0.6264705882352941


Epoch[2] Batch[175] Speed: 1.2574654538730778 samples/sec                   batch loss = 453.01424610614777 | accuracy = 0.6271428571428571


Epoch[2] Batch[180] Speed: 1.2609578737113247 samples/sec                   batch loss = 464.9991112947464 | accuracy = 0.6277777777777778


Epoch[2] Batch[185] Speed: 1.2607136928787088 samples/sec                   batch loss = 477.6082696914673 | accuracy = 0.6270270270270271


Epoch[2] Batch[190] Speed: 1.2534702905888806 samples/sec                   batch loss = 489.9575026035309 | accuracy = 0.6263157894736842


Epoch[2] Batch[195] Speed: 1.2581934698222315 samples/sec                   batch loss = 501.3187954425812 | accuracy = 0.6294871794871795


Epoch[2] Batch[200] Speed: 1.2525139169324533 samples/sec                   batch loss = 515.2121660709381 | accuracy = 0.62875


Epoch[2] Batch[205] Speed: 1.2559534624215583 samples/sec                   batch loss = 527.1453108787537 | accuracy = 0.6329268292682927


Epoch[2] Batch[210] Speed: 1.2576575606794906 samples/sec                   batch loss = 542.0993976593018 | accuracy = 0.6297619047619047


Epoch[2] Batch[215] Speed: 1.2597358335605313 samples/sec                   batch loss = 555.4636945724487 | accuracy = 0.6290697674418605


Epoch[2] Batch[220] Speed: 1.2569876095748003 samples/sec                   batch loss = 567.3935673236847 | accuracy = 0.6306818181818182


Epoch[2] Batch[225] Speed: 1.2556032359824896 samples/sec                   batch loss = 580.6441645622253 | accuracy = 0.63


Epoch[2] Batch[230] Speed: 1.2597548461680204 samples/sec                   batch loss = 590.9419023990631 | accuracy = 0.6347826086956522


Epoch[2] Batch[235] Speed: 1.2623746031287406 samples/sec                   batch loss = 604.4935004711151 | accuracy = 0.6351063829787233


Epoch[2] Batch[240] Speed: 1.2588304200104115 samples/sec                   batch loss = 617.4533383846283 | accuracy = 0.6354166666666666


Epoch[2] Batch[245] Speed: 1.2630101849540438 samples/sec                   batch loss = 631.6967804431915 | accuracy = 0.6316326530612245


Epoch[2] Batch[250] Speed: 1.2554066834486073 samples/sec                   batch loss = 642.3408591747284 | accuracy = 0.635


Epoch[2] Batch[255] Speed: 1.2627331796426882 samples/sec                   batch loss = 655.5189834833145 | accuracy = 0.6362745098039215


Epoch[2] Batch[260] Speed: 1.2622519889598574 samples/sec                   batch loss = 668.1458624601364 | accuracy = 0.6346153846153846


Epoch[2] Batch[265] Speed: 1.2665370528091666 samples/sec                   batch loss = 678.8213546276093 | accuracy = 0.6386792452830189


Epoch[2] Batch[270] Speed: 1.2585782817060571 samples/sec                   batch loss = 692.935097694397 | accuracy = 0.6361111111111111


Epoch[2] Batch[275] Speed: 1.2599264585505527 samples/sec                   batch loss = 706.3906997442245 | accuracy = 0.6363636363636364


Epoch[2] Batch[280] Speed: 1.2610263029998916 samples/sec                   batch loss = 717.3666325807571 | accuracy = 0.6375


Epoch[2] Batch[285] Speed: 1.2650902695172368 samples/sec                   batch loss = 728.361763715744 | accuracy = 0.6412280701754386


Epoch[2] Batch[290] Speed: 1.2590057487126818 samples/sec                   batch loss = 740.8170833587646 | accuracy = 0.6422413793103449


Epoch[2] Batch[295] Speed: 1.2605489692588394 samples/sec                   batch loss = 754.6551699638367 | accuracy = 0.6415254237288136


Epoch[2] Batch[300] Speed: 1.2628070295076952 samples/sec                   batch loss = 764.8891801834106 | accuracy = 0.6441666666666667


Epoch[2] Batch[305] Speed: 1.2664789229126843 samples/sec                   batch loss = 779.2626392841339 | accuracy = 0.6409836065573771


Epoch[2] Batch[310] Speed: 1.2594375715635764 samples/sec                   batch loss = 792.1900032758713 | accuracy = 0.6403225806451613


Epoch[2] Batch[315] Speed: 1.2595735403320736 samples/sec                   batch loss = 803.9863282442093 | accuracy = 0.6420634920634921


Epoch[2] Batch[320] Speed: 1.2612318250617227 samples/sec                   batch loss = 815.3004069328308 | accuracy = 0.64140625


Epoch[2] Batch[325] Speed: 1.2560855765318888 samples/sec                   batch loss = 827.4210866689682 | accuracy = 0.6423076923076924


Epoch[2] Batch[330] Speed: 1.2638028019948497 samples/sec                   batch loss = 841.2906478643417 | accuracy = 0.6401515151515151


Epoch[2] Batch[335] Speed: 1.2591205512290808 samples/sec                   batch loss = 852.4411059617996 | accuracy = 0.641044776119403


Epoch[2] Batch[340] Speed: 1.2561056076453374 samples/sec                   batch loss = 864.5936771631241 | accuracy = 0.6404411764705882


Epoch[2] Batch[345] Speed: 1.2564973301741436 samples/sec                   batch loss = 875.229042172432 | accuracy = 0.6427536231884058


Epoch[2] Batch[350] Speed: 1.2580977045730366 samples/sec                   batch loss = 885.2107417583466 | accuracy = 0.6457142857142857


Epoch[2] Batch[355] Speed: 1.252569743241067 samples/sec                   batch loss = 897.1749163866043 | accuracy = 0.645774647887324


Epoch[2] Batch[360] Speed: 1.2499515544773163 samples/sec                   batch loss = 906.8058032989502 | accuracy = 0.6486111111111111


Epoch[2] Batch[365] Speed: 1.2454508816367733 samples/sec                   batch loss = 921.4330130815506 | accuracy = 0.6458904109589041


Epoch[2] Batch[370] Speed: 1.2499640333570776 samples/sec                   batch loss = 935.2425113916397 | accuracy = 0.6452702702702703


Epoch[2] Batch[375] Speed: 1.2497121758551892 samples/sec                   batch loss = 945.8955141305923 | accuracy = 0.648


Epoch[2] Batch[380] Speed: 1.2540487530683697 samples/sec                   batch loss = 958.1031621694565 | accuracy = 0.6486842105263158


Epoch[2] Batch[385] Speed: 1.2565434423125015 samples/sec                   batch loss = 968.8294608592987 | accuracy = 0.6487012987012987


Epoch[2] Batch[390] Speed: 1.261360879370057 samples/sec                   batch loss = 982.1795001029968 | accuracy = 0.6474358974358975


Epoch[2] Batch[395] Speed: 1.26176442739518 samples/sec                   batch loss = 995.3241473436356 | accuracy = 0.6455696202531646


Epoch[2] Batch[400] Speed: 1.256151973189658 samples/sec                   batch loss = 1004.2909345626831 | accuracy = 0.648125


Epoch[2] Batch[405] Speed: 1.2593290445488388 samples/sec                   batch loss = 1019.8786659240723 | accuracy = 0.6444444444444445


Epoch[2] Batch[410] Speed: 1.2580905345597377 samples/sec                   batch loss = 1030.712239265442 | accuracy = 0.6451219512195122


Epoch[2] Batch[415] Speed: 1.2627135068113926 samples/sec                   batch loss = 1041.4956800937653 | accuracy = 0.6469879518072289


Epoch[2] Batch[420] Speed: 1.2604663868437762 samples/sec                   batch loss = 1052.1965728998184 | accuracy = 0.6482142857142857


Epoch[2] Batch[425] Speed: 1.2596988504804694 samples/sec                   batch loss = 1064.8591361045837 | accuracy = 0.648235294117647


Epoch[2] Batch[430] Speed: 1.2551771369027727 samples/sec                   batch loss = 1076.5272200107574 | accuracy = 0.65


Epoch[2] Batch[435] Speed: 1.2550757273633881 samples/sec                   batch loss = 1087.0142221450806 | accuracy = 0.6505747126436782


Epoch[2] Batch[440] Speed: 1.2559962436635779 samples/sec                   batch loss = 1099.7520506381989 | accuracy = 0.6517045454545455


Epoch[2] Batch[445] Speed: 1.252113273110473 samples/sec                   batch loss = 1111.7770788669586 | accuracy = 0.652247191011236


Epoch[2] Batch[450] Speed: 1.2519841420866173 samples/sec                   batch loss = 1121.4865614175797 | accuracy = 0.6538888888888889


Epoch[2] Batch[455] Speed: 1.2525944318388578 samples/sec                   batch loss = 1134.5558935403824 | accuracy = 0.6516483516483517


Epoch[2] Batch[460] Speed: 1.2591063769460944 samples/sec                   batch loss = 1146.7930456399918 | accuracy = 0.6521739130434783


Epoch[2] Batch[465] Speed: 1.2569626533235696 samples/sec                   batch loss = 1159.1997522115707 | accuracy = 0.6516129032258065


Epoch[2] Batch[470] Speed: 1.262191118116757 samples/sec                   batch loss = 1170.432383298874 | accuracy = 0.652127659574468


Epoch[2] Batch[475] Speed: 1.2540286000349063 samples/sec                   batch loss = 1181.5316886901855 | accuracy = 0.6521052631578947


Epoch[2] Batch[480] Speed: 1.2609571155336612 samples/sec                   batch loss = 1197.598387479782 | accuracy = 0.65


Epoch[2] Batch[485] Speed: 1.2613864847602911 samples/sec                   batch loss = 1209.6218674182892 | accuracy = 0.65


Epoch[2] Batch[490] Speed: 1.2583911791791857 samples/sec                   batch loss = 1220.4691331386566 | accuracy = 0.6515306122448979


Epoch[2] Batch[495] Speed: 1.2593851964355136 samples/sec                   batch loss = 1231.7689218521118 | accuracy = 0.6515151515151515


Epoch[2] Batch[500] Speed: 1.2586374826374107 samples/sec                   batch loss = 1241.6793367862701 | accuracy = 0.6525


Epoch[2] Batch[505] Speed: 1.2551719721320547 samples/sec                   batch loss = 1254.6903767585754 | accuracy = 0.651980198019802


Epoch[2] Batch[510] Speed: 1.2584056205310468 samples/sec                   batch loss = 1267.3378989696503 | accuracy = 0.6524509803921569


Epoch[2] Batch[515] Speed: 1.2561028803669687 samples/sec                   batch loss = 1279.0480756759644 | accuracy = 0.6519417475728155


Epoch[2] Batch[520] Speed: 1.2558705409291226 samples/sec                   batch loss = 1291.5857385396957 | accuracy = 0.6528846153846154


Epoch[2] Batch[525] Speed: 1.2526372651242037 samples/sec                   batch loss = 1304.0213842391968 | accuracy = 0.6533333333333333


Epoch[2] Batch[530] Speed: 1.2571604470359135 samples/sec                   batch loss = 1318.7457669973373 | accuracy = 0.6514150943396226


Epoch[2] Batch[535] Speed: 1.2585056808306556 samples/sec                   batch loss = 1333.7779933214188 | accuracy = 0.65


Epoch[2] Batch[540] Speed: 1.2539498688104447 samples/sec                   batch loss = 1344.3855273723602 | accuracy = 0.6513888888888889


Epoch[2] Batch[545] Speed: 1.2537080195927748 samples/sec                   batch loss = 1356.3961032629013 | accuracy = 0.6513761467889908


Epoch[2] Batch[550] Speed: 1.2533492127205128 samples/sec                   batch loss = 1370.2742491960526 | accuracy = 0.65


Epoch[2] Batch[555] Speed: 1.2562424569890194 samples/sec                   batch loss = 1382.723426103592 | accuracy = 0.65


Epoch[2] Batch[560] Speed: 1.2595578428557217 samples/sec                   batch loss = 1393.9446926116943 | accuracy = 0.65


Epoch[2] Batch[565] Speed: 1.2576325778189963 samples/sec                   batch loss = 1406.1384296417236 | accuracy = 0.6513274336283186


Epoch[2] Batch[570] Speed: 1.2519816195315872 samples/sec                   batch loss = 1418.5955290794373 | accuracy = 0.650438596491228


Epoch[2] Batch[575] Speed: 1.2536213663886 samples/sec                   batch loss = 1428.6545624732971 | accuracy = 0.6517391304347826


Epoch[2] Batch[580] Speed: 1.2570019245510362 samples/sec                   batch loss = 1438.6408202648163 | accuracy = 0.6521551724137931


Epoch[2] Batch[585] Speed: 1.2512500406462712 samples/sec                   batch loss = 1452.4683694839478 | accuracy = 0.6517094017094017


Epoch[2] Batch[590] Speed: 1.2505331114200446 samples/sec                   batch loss = 1463.6250284910202 | accuracy = 0.6521186440677966


Epoch[2] Batch[595] Speed: 1.255428759656545 samples/sec                   batch loss = 1476.321340560913 | accuracy = 0.6521008403361345


Epoch[2] Batch[600] Speed: 1.2471197197960182 samples/sec                   batch loss = 1486.8653991222382 | accuracy = 0.65375


Epoch[2] Batch[605] Speed: 1.2528251863532003 samples/sec                   batch loss = 1500.3465173244476 | accuracy = 0.6524793388429752


Epoch[2] Batch[610] Speed: 1.2534723508942223 samples/sec                   batch loss = 1512.8923735618591 | accuracy = 0.6516393442622951


Epoch[2] Batch[615] Speed: 1.2558115059955488 samples/sec                   batch loss = 1525.3879843950272 | accuracy = 0.6524390243902439


Epoch[2] Batch[620] Speed: 1.2587342744646115 samples/sec                   batch loss = 1539.7287834882736 | accuracy = 0.6508064516129032


Epoch[2] Batch[625] Speed: 1.2505585587358408 samples/sec                   batch loss = 1550.6621655225754 | accuracy = 0.652


Epoch[2] Batch[630] Speed: 1.253715139739949 samples/sec                   batch loss = 1561.2559759616852 | accuracy = 0.6531746031746032


Epoch[2] Batch[635] Speed: 1.251979377268983 samples/sec                   batch loss = 1572.9590381383896 | accuracy = 0.6535433070866141


Epoch[2] Batch[640] Speed: 1.2511325635329196 samples/sec                   batch loss = 1583.948105096817 | accuracy = 0.65390625


Epoch[2] Batch[645] Speed: 1.256950505167426 samples/sec                   batch loss = 1595.018796801567 | accuracy = 0.6546511627906977


Epoch[2] Batch[650] Speed: 1.256611958025192 samples/sec                   batch loss = 1605.2357441186905 | accuracy = 0.655


Epoch[2] Batch[655] Speed: 1.2543229924230954 samples/sec                   batch loss = 1617.4088398218155 | accuracy = 0.6553435114503817


Epoch[2] Batch[660] Speed: 1.254027568966891 samples/sec                   batch loss = 1631.2739725112915 | accuracy = 0.6545454545454545


Epoch[2] Batch[665] Speed: 1.2636945684203802 samples/sec                   batch loss = 1642.9457010030746 | accuracy = 0.6545112781954887


Epoch[2] Batch[670] Speed: 1.2633147077697109 samples/sec                   batch loss = 1654.0718928575516 | accuracy = 0.6555970149253731


Epoch[2] Batch[675] Speed: 1.2595618144693914 samples/sec                   batch loss = 1663.4910216331482 | accuracy = 0.6570370370370371


Epoch[2] Batch[680] Speed: 1.2554255656039524 samples/sec                   batch loss = 1673.2753059864044 | accuracy = 0.6580882352941176


Epoch[2] Batch[685] Speed: 1.252623236433045 samples/sec                   batch loss = 1683.999107003212 | accuracy = 0.6591240875912409


Epoch[2] Batch[690] Speed: 1.2568397700125953 samples/sec                   batch loss = 1693.9609558582306 | accuracy = 0.6597826086956522


Epoch[2] Batch[695] Speed: 1.2560367710006501 samples/sec                   batch loss = 1706.948156118393 | accuracy = 0.6600719424460432


Epoch[2] Batch[700] Speed: 1.2534372329793566 samples/sec                   batch loss = 1717.791449189186 | accuracy = 0.6610714285714285


Epoch[2] Batch[705] Speed: 1.2609392038515683 samples/sec                   batch loss = 1728.5557404756546 | accuracy = 0.6617021276595745


Epoch[2] Batch[710] Speed: 1.2616030341248812 samples/sec                   batch loss = 1740.5590584278107 | accuracy = 0.6616197183098591


Epoch[2] Batch[715] Speed: 1.2570199129110773 samples/sec                   batch loss = 1752.1297906637192 | accuracy = 0.6618881118881119


Epoch[2] Batch[720] Speed: 1.2603823002743852 samples/sec                   batch loss = 1763.352411031723 | accuracy = 0.6618055555555555


Epoch[2] Batch[725] Speed: 1.2545387178774887 samples/sec                   batch loss = 1773.7675354480743 | accuracy = 0.6620689655172414


Epoch[2] Batch[730] Speed: 1.2609175020190804 samples/sec                   batch loss = 1786.2253569364548 | accuracy = 0.663013698630137


Epoch[2] Batch[735] Speed: 1.256986291106545 samples/sec                   batch loss = 1797.4267055988312 | accuracy = 0.6629251700680272


Epoch[2] Batch[740] Speed: 1.2538279486511863 samples/sec                   batch loss = 1807.9769555330276 | accuracy = 0.6635135135135135


Epoch[2] Batch[745] Speed: 1.2570021129081599 samples/sec                   batch loss = 1818.1426140069962 | accuracy = 0.6644295302013423


Epoch[2] Batch[750] Speed: 1.2575208741505173 samples/sec                   batch loss = 1828.4376085996628 | accuracy = 0.6656666666666666


Epoch[2] Batch[755] Speed: 1.2548566260431044 samples/sec                   batch loss = 1842.1603314876556 | accuracy = 0.6645695364238411


Epoch[2] Batch[760] Speed: 1.2554223715676123 samples/sec                   batch loss = 1852.131209731102 | accuracy = 0.6654605263157894


Epoch[2] Batch[765] Speed: 1.2577134693178689 samples/sec                   batch loss = 1865.5782326459885 | accuracy = 0.665359477124183


Epoch[2] Batch[770] Speed: 1.2587661954238432 samples/sec                   batch loss = 1877.1915340423584 | accuracy = 0.6659090909090909


Epoch[2] Batch[775] Speed: 1.2534365774636544 samples/sec                   batch loss = 1887.0162091255188 | accuracy = 0.6661290322580645


Epoch[2] Batch[780] Speed: 1.2503675893272435 samples/sec                   batch loss = 1898.383751630783 | accuracy = 0.6666666666666666


Epoch[2] Batch[785] Speed: 1.2572581424158304 samples/sec                   batch loss = 1912.0126665830612 | accuracy = 0.6671974522292994


[Epoch 2] training: accuracy=0.6671954314720813
[Epoch 2] time cost: 643.3774552345276
[Epoch 2] validation: validation accuracy=0.74
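
A note on reading the log above: the `batch loss` column grows steadily within an epoch and restarts at the beginning of epoch 2, so it appears to be a cumulative sum rather than a per-batch value. The `accuracy` column likewise appears to be a running average over all batches seen so far in the epoch. The sketch below shows one way such running totals could be tracked. It is a minimal illustration in plain Python, not the tutorial's training loop: the `RunningMetrics` class and the per-batch numbers are hypothetical.

```python
# Hypothetical helper, not part of MXNet: keeps running totals in the way the
# log above appears to (loss summed over the epoch, accuracy averaged over
# every sample seen since the epoch started).
class RunningMetrics:
    def __init__(self):
        self.loss_sum = 0.0   # cumulative loss since the start of the epoch
        self.correct = 0      # correctly classified samples so far
        self.seen = 0         # total samples so far

    def update(self, batch_loss, num_correct, batch_size):
        self.loss_sum += batch_loss
        self.correct += num_correct
        self.seen += batch_size

    @property
    def accuracy(self):
        # Guard against division by zero before the first batch.
        return self.correct / self.seen if self.seen else 0.0


metrics = RunningMetrics()
# Made-up per-batch values for illustration: (loss, correct, batch size).
for batch_loss, num_correct, batch_size in [(3.0, 1, 4), (2.5, 2, 4), (2.0, 3, 4)]:
    metrics.update(batch_loss, num_correct, batch_size)

print(f"batch loss = {metrics.loss_sum} | accuracy = {metrics.accuracy}")
```

Creating a fresh `RunningMetrics` object at the start of each epoch would reproduce the reset visible between the last line of epoch 1 and the first line of epoch 2.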


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen on studying more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).