<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference more quickly than with central processing units (CPUs).

## Prerequisites

Before you begin, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. Additionally, you will need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[15:37:23] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data on the main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `ctx` shown in the output tells you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you only have one GPU, copying to `npx.gpu(1)` would fail, so the example code falls back to `npx.cpu()` in that case:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[15:37:23] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])


MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
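As a minimal sketch of the zero-based numbering described above, the following plain-Python helper computes which GPU IDs to use. The function name `pick_gpu_ids` and its parameters are hypothetical and not part of MXNet; `num_available` stands in for the value that `npx.num_gpus()` would return on your machine.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return the zero-based IDs of the GPUs to use.

    Hypothetical helper: num_available stands in for npx.num_gpus().
    If fewer GPUs are available than requested, only the available
    IDs are returned.
    """
    if num_requested < 0:
        raise ValueError("num_requested must be non-negative")
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, the IDs run from 0 to 7.
ids = pick_gpu_ids(num_available=8, num_requested=8)
print(ids[-1])  # 7

# Requesting two GPUs when only one is available yields just ID 0.
print(pick_gpu_ids(num_available=1, num_requested=2))  # [0]
```

Each ID `i` in the returned list would then correspond to the device `npx.gpu(i)`.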

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and the parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters directly onto GPU 0 as shown below. Alternatively, you can use `net.collect_params().reset_ctx(gpu)` to change the device of the parameters.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[15:37:24] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 4.7622  , -8.098564]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, you split each data batch into *n* parts and have each GPU run the forward and backward passes on its own part of the data.
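To make the idea concrete, here is a small plain-Python sketch of data parallelism on a toy problem. It is an illustration under stated assumptions, not MXNet's implementation: the helpers `split_batch` and `grad_sum` are hypothetical, the "model" is a single scalar weight `w` with a squared-error loss, and the "devices" are just list slices processed one after another.

```python
def split_batch(batch, num_devices):
    """Split a batch into num_devices contiguous, equally sized parts."""
    if len(batch) % num_devices != 0:
        raise ValueError("batch size must be divisible by the number of devices")
    part = len(batch) // num_devices
    return [batch[i * part:(i + 1) * part] for i in range(num_devices)]


def grad_sum(w, samples):
    """Sum of d/dw (w*x - t)^2 over the (x, t) pairs in samples."""
    return sum(2.0 * (w * x - t) * x for x, t in samples)


# A toy batch of (input, target) pairs and a single scalar weight.
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 9.0)]
w = 0.5

# Each "device" handles one part of the batch independently.
parts = split_batch(batch, num_devices=2)
per_device = [grad_sum(w, part) for part in parts]

# Summing the per-device results and normalizing by the full batch size
# gives the same mean gradient as processing the whole batch at once.
parallel_grad = sum(per_device) / len(batch)
full_grad = grad_sum(w, batch) / len(batch)
print(abs(parallel_grad - full_grad) < 1e-9)  # True
```

The point of the sketch is that splitting the batch changes where the work happens, not the gradient that results from it, which is why the parts can be processed on separate GPUs.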

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7793601227784144 samples/sec                   batch loss = 14.600656270980835 | accuracy = 0.35


Epoch[1] Batch[10] Speed: 1.2506347204438388 samples/sec                   batch loss = 27.218080520629883 | accuracy = 0.575


Epoch[1] Batch[15] Speed: 1.2516212789141345 samples/sec                   batch loss = 40.710931062698364 | accuracy = 0.5833333333333334


Epoch[1] Batch[20] Speed: 1.2571428314520394 samples/sec                   batch loss = 55.73762392997742 | accuracy = 0.575


Epoch[1] Batch[25] Speed: 1.2569814881384342 samples/sec                   batch loss = 70.54274868965149 | accuracy = 0.54


Epoch[1] Batch[30] Speed: 1.2548652609793687 samples/sec                   batch loss = 84.52632451057434 | accuracy = 0.55


Epoch[1] Batch[35] Speed: 1.2562991806117425 samples/sec                   batch loss = 99.51118063926697 | accuracy = 0.5214285714285715


Epoch[1] Batch[40] Speed: 1.2531111517763445 samples/sec                   batch loss = 113.96372699737549 | accuracy = 0.50625


Epoch[1] Batch[45] Speed: 1.2526136971025403 samples/sec                   batch loss = 126.20503091812134 | accuracy = 0.5277777777777778


Epoch[1] Batch[50] Speed: 1.2530872851407213 samples/sec                   batch loss = 141.27309894561768 | accuracy = 0.52


Epoch[1] Batch[55] Speed: 1.252508213016974 samples/sec                   batch loss = 155.12097883224487 | accuracy = 0.5181818181818182


Epoch[1] Batch[60] Speed: 1.2507957439645054 samples/sec                   batch loss = 169.3127875328064 | accuracy = 0.5125


Epoch[1] Batch[65] Speed: 1.2581776180155337 samples/sec                   batch loss = 183.68286204338074 | accuracy = 0.5115384615384615


Epoch[1] Batch[70] Speed: 1.2509350762839124 samples/sec                   batch loss = 197.1446042060852 | accuracy = 0.5142857142857142


Epoch[1] Batch[75] Speed: 1.2521940168929449 samples/sec                   batch loss = 212.05407071113586 | accuracy = 0.5066666666666667


Epoch[1] Batch[80] Speed: 1.2529985653847788 samples/sec                   batch loss = 225.62531352043152 | accuracy = 0.50625


Epoch[1] Batch[85] Speed: 1.2598590946432835 samples/sec                   batch loss = 239.72443890571594 | accuracy = 0.5


Epoch[1] Batch[90] Speed: 1.2581324236881306 samples/sec                   batch loss = 253.92298865318298 | accuracy = 0.49722222222222223


Epoch[1] Batch[95] Speed: 1.2587282304416094 samples/sec                   batch loss = 267.3561441898346 | accuracy = 0.5026315789473684


Epoch[1] Batch[100] Speed: 1.2513252599765 samples/sec                   batch loss = 281.09439635276794 | accuracy = 0.515


Epoch[1] Batch[105] Speed: 1.2559088037946418 samples/sec                   batch loss = 294.3013801574707 | accuracy = 0.5285714285714286


Epoch[1] Batch[110] Speed: 1.254688550031866 samples/sec                   batch loss = 308.34918785095215 | accuracy = 0.5272727272727272


Epoch[1] Batch[115] Speed: 1.2588861495942982 samples/sec                   batch loss = 321.9359655380249 | accuracy = 0.5282608695652173


Epoch[1] Batch[120] Speed: 1.2589223292367047 samples/sec                   batch loss = 335.76987409591675 | accuracy = 0.525


Epoch[1] Batch[125] Speed: 1.2548445185608688 samples/sec                   batch loss = 349.0611298084259 | accuracy = 0.534


Epoch[1] Batch[130] Speed: 1.2543066753421077 samples/sec                   batch loss = 362.47563004493713 | accuracy = 0.5326923076923077


Epoch[1] Batch[135] Speed: 1.2543193351094093 samples/sec                   batch loss = 376.01384472846985 | accuracy = 0.5370370370370371


Epoch[1] Batch[140] Speed: 1.2640987540985484 samples/sec                   batch loss = 390.1278147697449 | accuracy = 0.5357142857142857


Epoch[1] Batch[145] Speed: 1.2522314952654734 samples/sec                   batch loss = 403.65084767341614 | accuracy = 0.5396551724137931


Epoch[1] Batch[150] Speed: 1.2559308976657444 samples/sec                   batch loss = 416.6754777431488 | accuracy = 0.5483333333333333


Epoch[1] Batch[155] Speed: 1.2559550607891792 samples/sec                   batch loss = 429.7677643299103 | accuracy = 0.5516129032258065


Epoch[1] Batch[160] Speed: 1.249446090003495 samples/sec                   batch loss = 443.0274815559387 | accuracy = 0.55625


Epoch[1] Batch[165] Speed: 1.25662635856885 samples/sec                   batch loss = 457.12379217147827 | accuracy = 0.5545454545454546


Epoch[1] Batch[170] Speed: 1.2479547960099582 samples/sec                   batch loss = 470.40449714660645 | accuracy = 0.5514705882352942


Epoch[1] Batch[175] Speed: 1.2568914628224277 samples/sec                   batch loss = 484.5512912273407 | accuracy = 0.5471428571428572


Epoch[1] Batch[180] Speed: 1.2544966924918026 samples/sec                   batch loss = 498.24540305137634 | accuracy = 0.55


Epoch[1] Batch[185] Speed: 1.25385718483726 samples/sec                   batch loss = 510.9836325645447 | accuracy = 0.5554054054054054


Epoch[1] Batch[190] Speed: 1.256291654794937 samples/sec                   batch loss = 525.689710855484 | accuracy = 0.5486842105263158


Epoch[1] Batch[195] Speed: 1.2527067585410996 samples/sec                   batch loss = 539.0740447044373 | accuracy = 0.5487179487179488


Epoch[1] Batch[200] Speed: 1.2535954195975008 samples/sec                   batch loss = 552.9936110973358 | accuracy = 0.545


Epoch[1] Batch[205] Speed: 1.2568981483597192 samples/sec                   batch loss = 565.9253170490265 | accuracy = 0.5475609756097561


Epoch[1] Batch[210] Speed: 1.255310872309734 samples/sec                   batch loss = 578.6631553173065 | accuracy = 0.55


Epoch[1] Batch[215] Speed: 1.2547177325298118 samples/sec                   batch loss = 591.4822540283203 | accuracy = 0.5558139534883721


Epoch[1] Batch[220] Speed: 1.256387239367781 samples/sec                   batch loss = 604.8302767276764 | accuracy = 0.5568181818181818


Epoch[1] Batch[225] Speed: 1.2511967579826437 samples/sec                   batch loss = 617.8738603591919 | accuracy = 0.5611111111111111


Epoch[1] Batch[230] Speed: 1.2589364049075151 samples/sec                   batch loss = 632.4372582435608 | accuracy = 0.5554347826086956


Epoch[1] Batch[235] Speed: 1.2556568006408355 samples/sec                   batch loss = 646.1121189594269 | accuracy = 0.5542553191489362


Epoch[1] Batch[240] Speed: 1.25628582234893 samples/sec                   batch loss = 660.6904804706573 | accuracy = 0.5510416666666667


Epoch[1] Batch[245] Speed: 1.2565129514582725 samples/sec                   batch loss = 674.1151099205017 | accuracy = 0.55


Epoch[1] Batch[250] Speed: 1.2539079765816623 samples/sec                   batch loss = 687.6146972179413 | accuracy = 0.55


Epoch[1] Batch[255] Speed: 1.2506996097065326 samples/sec                   batch loss = 701.1110599040985 | accuracy = 0.5509803921568628


Epoch[1] Batch[260] Speed: 1.2506471197394406 samples/sec                   batch loss = 715.0576496124268 | accuracy = 0.55


Epoch[1] Batch[265] Speed: 1.250393868624998 samples/sec                   batch loss = 727.7209203243256 | accuracy = 0.5537735849056604


Epoch[1] Batch[270] Speed: 1.2538025555062238 samples/sec                   batch loss = 741.239251613617 | accuracy = 0.5527777777777778


Epoch[1] Batch[275] Speed: 1.2629278501626948 samples/sec                   batch loss = 754.7564260959625 | accuracy = 0.55


Epoch[1] Batch[280] Speed: 1.248059143122503 samples/sec                   batch loss = 769.1823151111603 | accuracy = 0.5491071428571429


Epoch[1] Batch[285] Speed: 1.2591217796819671 samples/sec                   batch loss = 782.3742499351501 | accuracy = 0.5517543859649123


Epoch[1] Batch[290] Speed: 1.254355440293876 samples/sec                   batch loss = 795.350213766098 | accuracy = 0.553448275862069


Epoch[1] Batch[295] Speed: 1.2509698675757661 samples/sec                   batch loss = 810.0853023529053 | accuracy = 0.5491525423728814


Epoch[1] Batch[300] Speed: 1.2530257040607533 samples/sec                   batch loss = 823.6164383888245 | accuracy = 0.55


Epoch[1] Batch[305] Speed: 1.245153615333457 samples/sec                   batch loss = 837.9252517223358 | accuracy = 0.5483606557377049


Epoch[1] Batch[310] Speed: 1.2514641501946506 samples/sec                   batch loss = 851.1876881122589 | accuracy = 0.5483870967741935


Epoch[1] Batch[315] Speed: 1.2587270027565294 samples/sec                   batch loss = 864.9131338596344 | accuracy = 0.5492063492063493


Epoch[1] Batch[320] Speed: 1.2543110827876525 samples/sec                   batch loss = 877.9422507286072 | accuracy = 0.55


Epoch[1] Batch[325] Speed: 1.2525803105899274 samples/sec                   batch loss = 891.3103747367859 | accuracy = 0.5523076923076923


Epoch[1] Batch[330] Speed: 1.2549587511582057 samples/sec                   batch loss = 905.1819875240326 | accuracy = 0.5507575757575758


Epoch[1] Batch[335] Speed: 1.2524567866166447 samples/sec                   batch loss = 917.0332133769989 | accuracy = 0.5552238805970149


Epoch[1] Batch[340] Speed: 1.255265132345455 samples/sec                   batch loss = 930.0427243709564 | accuracy = 0.5580882352941177


Epoch[1] Batch[345] Speed: 1.2526821590475945 samples/sec                   batch loss = 943.721360206604 | accuracy = 0.5565217391304348


Epoch[1] Batch[350] Speed: 1.2494153843449978 samples/sec                   batch loss = 956.3332834243774 | accuracy = 0.5557142857142857


Epoch[1] Batch[355] Speed: 1.2540440662585988 samples/sec                   batch loss = 969.8011019229889 | accuracy = 0.5570422535211268


Epoch[1] Batch[360] Speed: 1.25249755337977 samples/sec                   batch loss = 982.7165520191193 | accuracy = 0.5576388888888889


Epoch[1] Batch[365] Speed: 1.2542859513857374 samples/sec                   batch loss = 996.147588968277 | accuracy = 0.5595890410958904


Epoch[1] Batch[370] Speed: 1.260637624811061 samples/sec                   batch loss = 1009.8096299171448 | accuracy = 0.5587837837837838


Epoch[1] Batch[375] Speed: 1.2533811420450955 samples/sec                   batch loss = 1022.9300806522369 | accuracy = 0.5593333333333333


Epoch[1] Batch[380] Speed: 1.2554600434425338 samples/sec                   batch loss = 1035.7940356731415 | accuracy = 0.5625


Epoch[1] Batch[385] Speed: 1.2491824426939255 samples/sec                   batch loss = 1048.7681198120117 | accuracy = 0.5642857142857143


Epoch[1] Batch[390] Speed: 1.2537936540937178 samples/sec                   batch loss = 1061.172295331955 | accuracy = 0.5666666666666667


Epoch[1] Batch[395] Speed: 1.2529504673093323 samples/sec                   batch loss = 1074.4652516841888 | accuracy = 0.5670886075949367


Epoch[1] Batch[400] Speed: 1.253293878652953 samples/sec                   batch loss = 1087.6787991523743 | accuracy = 0.566875


Epoch[1] Batch[405] Speed: 1.253089624966763 samples/sec                   batch loss = 1100.8695635795593 | accuracy = 0.5648148148148148


Epoch[1] Batch[410] Speed: 1.256534313722553 samples/sec                   batch loss = 1113.999785900116 | accuracy = 0.5634146341463414


Epoch[1] Batch[415] Speed: 1.252598359660058 samples/sec                   batch loss = 1127.9531562328339 | accuracy = 0.5644578313253013


Epoch[1] Batch[420] Speed: 1.2568933460651737 samples/sec                   batch loss = 1140.657824754715 | accuracy = 0.5672619047619047


Epoch[1] Batch[425] Speed: 1.2543642559013688 samples/sec                   batch loss = 1153.7915122509003 | accuracy = 0.5658823529411765


Epoch[1] Batch[430] Speed: 1.2522437393097514 samples/sec                   batch loss = 1167.463476896286 | accuracy = 0.5656976744186046


Epoch[1] Batch[435] Speed: 1.2597410359582843 samples/sec                   batch loss = 1180.259851694107 | accuracy = 0.5672413793103448


Epoch[1] Batch[440] Speed: 1.2514386660105594 samples/sec                   batch loss = 1194.8751201629639 | accuracy = 0.5664772727272728


Epoch[1] Batch[445] Speed: 1.25066912210226 samples/sec                   batch loss = 1206.653832435608 | accuracy = 0.5662921348314607


Epoch[1] Batch[450] Speed: 1.2539931697604751 samples/sec                   batch loss = 1221.2374942302704 | accuracy = 0.5672222222222222


Epoch[1] Batch[455] Speed: 1.251083955548874 samples/sec                   batch loss = 1234.457091808319 | accuracy = 0.5681318681318681


Epoch[1] Batch[460] Speed: 1.2561844217017535 samples/sec                   batch loss = 1248.2215983867645 | accuracy = 0.5673913043478261


Epoch[1] Batch[465] Speed: 1.252009648491828 samples/sec                   batch loss = 1262.7615435123444 | accuracy = 0.5666666666666667


Epoch[1] Batch[470] Speed: 1.2597236317416534 samples/sec                   batch loss = 1276.3364644050598 | accuracy = 0.5664893617021277


Epoch[1] Batch[475] Speed: 1.2527968402177778 samples/sec                   batch loss = 1289.7886002063751 | accuracy = 0.5663157894736842


Epoch[1] Batch[480] Speed: 1.2528530659817994 samples/sec                   batch loss = 1303.807157754898 | accuracy = 0.5661458333333333


Epoch[1] Batch[485] Speed: 1.2588358038396361 samples/sec                   batch loss = 1317.1526408195496 | accuracy = 0.5670103092783505


Epoch[1] Batch[490] Speed: 1.2601240504313524 samples/sec                   batch loss = 1330.930816411972 | accuracy = 0.5673469387755102


Epoch[1] Batch[495] Speed: 1.2512008636657572 samples/sec                   batch loss = 1343.2842094898224 | accuracy = 0.5686868686868687


Epoch[1] Batch[500] Speed: 1.2547794800387027 samples/sec                   batch loss = 1356.5673568248749 | accuracy = 0.569


Epoch[1] Batch[505] Speed: 1.2529805047795173 samples/sec                   batch loss = 1369.226120710373 | accuracy = 0.5707920792079207


Epoch[1] Batch[510] Speed: 1.2543937046800313 samples/sec                   batch loss = 1381.8362696170807 | accuracy = 0.571078431372549


Epoch[1] Batch[515] Speed: 1.258515782141323 samples/sec                   batch loss = 1394.0637352466583 | accuracy = 0.5728155339805825


Epoch[1] Batch[520] Speed: 1.2503568729094223 samples/sec                   batch loss = 1406.0337448120117 | accuracy = 0.5745192307692307


Epoch[1] Batch[525] Speed: 1.2585853628639354 samples/sec                   batch loss = 1419.2387845516205 | accuracy = 0.5742857142857143


Epoch[1] Batch[530] Speed: 1.2528853442828891 samples/sec                   batch loss = 1432.5743100643158 | accuracy = 0.5745283018867925


Epoch[1] Batch[535] Speed: 1.2556193049001378 samples/sec                   batch loss = 1445.826954126358 | accuracy = 0.5747663551401869


Epoch[1] Batch[540] Speed: 1.2551377918118518 samples/sec                   batch loss = 1459.7974681854248 | accuracy = 0.575


Epoch[1] Batch[545] Speed: 1.255620996389079 samples/sec                   batch loss = 1471.838791847229 | accuracy = 0.5770642201834862


Epoch[1] Batch[550] Speed: 1.2559699164005085 samples/sec                   batch loss = 1484.6309950351715 | accuracy = 0.5772727272727273


Epoch[1] Batch[555] Speed: 1.2528041371201377 samples/sec                   batch loss = 1497.1793286800385 | accuracy = 0.5774774774774775


Epoch[1] Batch[560] Speed: 1.2555980677045804 samples/sec                   batch loss = 1510.408703804016 | accuracy = 0.5767857142857142


Epoch[1] Batch[565] Speed: 1.2576856557988743 samples/sec                   batch loss = 1521.5033439397812 | accuracy = 0.5778761061946903


Epoch[1] Batch[570] Speed: 1.2567550368995217 samples/sec                   batch loss = 1534.4042636156082 | accuracy = 0.5789473684210527


Epoch[1] Batch[575] Speed: 1.257652846856257 samples/sec                   batch loss = 1549.5020629167557 | accuracy = 0.5782608695652174


Epoch[1] Batch[580] Speed: 1.2526354881392767 samples/sec                   batch loss = 1561.3295735120773 | accuracy = 0.5793103448275863


Epoch[1] Batch[585] Speed: 1.2534038961928673 samples/sec                   batch loss = 1573.9279462099075 | accuracy = 0.5803418803418804


Epoch[1] Batch[590] Speed: 1.255169155002305 samples/sec                   batch loss = 1586.9344846010208 | accuracy = 0.5817796610169491


Epoch[1] Batch[595] Speed: 1.2506646469826994 samples/sec                   batch loss = 1598.7442291975021 | accuracy = 0.5831932773109244


Epoch[1] Batch[600] Speed: 1.2586555178396386 samples/sec                   batch loss = 1611.2591418027878 | accuracy = 0.5833333333333334


Epoch[1] Batch[605] Speed: 1.2492517393393145 samples/sec                   batch loss = 1623.6390248537064 | accuracy = 0.5838842975206612


Epoch[1] Batch[610] Speed: 1.2510755591587739 samples/sec                   batch loss = 1636.230968117714 | accuracy = 0.5836065573770491


Epoch[1] Batch[615] Speed: 1.248499189643149 samples/sec                   batch loss = 1649.8563491106033 | accuracy = 0.582520325203252


Epoch[1] Batch[620] Speed: 1.2498465182608816 samples/sec                   batch loss = 1662.2184120416641 | accuracy = 0.5826612903225806


Epoch[1] Batch[625] Speed: 1.2516997179925582 samples/sec                   batch loss = 1673.2516306638718 | accuracy = 0.5844


Epoch[1] Batch[630] Speed: 1.254475962255991 samples/sec                   batch loss = 1685.5247818231583 | accuracy = 0.5857142857142857


Epoch[1] Batch[635] Speed: 1.2589330985136047 samples/sec                   batch loss = 1698.0656852722168 | accuracy = 0.5862204724409449


Epoch[1] Batch[640] Speed: 1.254769720149643 samples/sec                   batch loss = 1713.6747360229492 | accuracy = 0.586328125


Epoch[1] Batch[645] Speed: 1.2594314262417479 samples/sec                   batch loss = 1726.4274944067001 | accuracy = 0.5864341085271317


Epoch[1] Batch[650] Speed: 1.2608915366159632 samples/sec                   batch loss = 1741.8466197252274 | accuracy = 0.5853846153846154


Epoch[1] Batch[655] Speed: 1.2558220341120796 samples/sec                   batch loss = 1753.6937981843948 | accuracy = 0.5862595419847328


Epoch[1] Batch[660] Speed: 1.2576048621464835 samples/sec                   batch loss = 1766.6883434057236 | accuracy = 0.5863636363636363


Epoch[1] Batch[665] Speed: 1.2561549828361889 samples/sec                   batch loss = 1778.846718788147 | accuracy = 0.587593984962406


Epoch[1] Batch[670] Speed: 1.2544670512792042 samples/sec                   batch loss = 1790.7333670854568 | accuracy = 0.5880597014925373


Epoch[1] Batch[675] Speed: 1.26524931206678 samples/sec                   batch loss = 1803.520276427269 | accuracy = 0.5892592592592593


Epoch[1] Batch[680] Speed: 1.257215557799136 samples/sec                   batch loss = 1816.787803053856 | accuracy = 0.5889705882352941


Epoch[1] Batch[685] Speed: 1.2538378813050748 samples/sec                   batch loss = 1827.885251402855 | accuracy = 0.5897810218978102


Epoch[1] Batch[690] Speed: 1.2504887442182318 samples/sec                   batch loss = 1844.0947505235672 | accuracy = 0.5898550724637681


Epoch[1] Batch[695] Speed: 1.2475664327822655 samples/sec                   batch loss = 1855.9712308645248 | accuracy = 0.5910071942446044


Epoch[1] Batch[700] Speed: 1.2522653305360936 samples/sec                   batch loss = 1868.5827013254166 | accuracy = 0.5917857142857142


Epoch[1] Batch[705] Speed: 1.2526117331407665 samples/sec                   batch loss = 1882.6284214258194 | accuracy = 0.5925531914893617


Epoch[1] Batch[710] Speed: 1.2593295171869205 samples/sec                   batch loss = 1895.4457043409348 | accuracy = 0.5929577464788732


Epoch[1] Batch[715] Speed: 1.2537384681571193 samples/sec                   batch loss = 1907.730400800705 | accuracy = 0.5923076923076923


Epoch[1] Batch[720] Speed: 1.257334085860491 samples/sec                   batch loss = 1920.0012592077255 | accuracy = 0.5927083333333333


Epoch[1] Batch[725] Speed: 1.253938059979427 samples/sec                   batch loss = 1932.6340979337692 | accuracy = 0.5927586206896551


Epoch[1] Batch[730] Speed: 1.2584878387248009 samples/sec                   batch loss = 1946.347904086113 | accuracy = 0.5924657534246576


Epoch[1] Batch[735] Speed: 1.2544783072709231 samples/sec                   batch loss = 1959.025430560112 | accuracy = 0.5931972789115646


Epoch[1] Batch[740] Speed: 1.2549121920588575 samples/sec                   batch loss = 1970.9578667879105 | accuracy = 0.5935810810810811


Epoch[1] Batch[745] Speed: 1.2561439788860915 samples/sec                   batch loss = 1983.9269009828568 | accuracy = 0.5926174496644295


Epoch[1] Batch[750] Speed: 1.2574685640653884 samples/sec                   batch loss = 1998.3782578706741 | accuracy = 0.592


Epoch[1] Batch[755] Speed: 1.254961848962794 samples/sec                   batch loss = 2009.4426889419556 | accuracy = 0.5927152317880795


Epoch[1] Batch[760] Speed: 1.2587070768181807 samples/sec                   batch loss = 2024.5708980560303 | accuracy = 0.5911184210526316


Epoch[1] Batch[765] Speed: 1.2564418118860397 samples/sec                   batch loss = 2036.7313148975372 | accuracy = 0.5915032679738562


Epoch[1] Batch[770] Speed: 1.2516240801405096 samples/sec                   batch loss = 2048.8800790309906 | accuracy = 0.5918831168831169


Epoch[1] Batch[775] Speed: 1.251921735191778 samples/sec                   batch loss = 2061.93976521492 | accuracy = 0.5912903225806452


Epoch[1] Batch[780] Speed: 1.2545354345428374 samples/sec                   batch loss = 2074.549728155136 | accuracy = 0.5919871794871795


Epoch[1] Batch[785] Speed: 1.2505689989598727 samples/sec                   batch loss = 2086.8756597042084 | accuracy = 0.5923566878980892


[Epoch 1] training: accuracy=0.5926395939086294
[Epoch 1] time cost: 646.4754803180695
[Epoch 1] validation: validation accuracy=0.67


Epoch[2] Batch[5] Speed: 1.2580004446472308 samples/sec                   batch loss = 12.218496918678284 | accuracy = 0.8


Epoch[2] Batch[10] Speed: 1.2561586508623972 samples/sec                   batch loss = 23.480989456176758 | accuracy = 0.8


Epoch[2] Batch[15] Speed: 1.2540689067495874 samples/sec                   batch loss = 34.81950104236603 | accuracy = 0.7666666666666667


Epoch[2] Batch[20] Speed: 1.2553780325066666 samples/sec                   batch loss = 47.13966703414917 | accuracy = 0.7375


Epoch[2] Batch[25] Speed: 1.2530189660728777 samples/sec                   batch loss = 59.47846698760986 | accuracy = 0.71


Epoch[2] Batch[30] Speed: 1.2599372450137483 samples/sec                   batch loss = 69.20812845230103 | accuracy = 0.7333333333333333


Epoch[2] Batch[35] Speed: 1.2571798530099392 samples/sec                   batch loss = 81.2097817659378 | accuracy = 0.7214285714285714


Epoch[2] Batch[40] Speed: 1.254064313528038 samples/sec                   batch loss = 91.12993657588959 | accuracy = 0.73125


Epoch[2] Batch[45] Speed: 1.2604538867789352 samples/sec                   batch loss = 104.11789965629578 | accuracy = 0.7222222222222222


Epoch[2] Batch[50] Speed: 1.2540241008414643 samples/sec                   batch loss = 115.38134431838989 | accuracy = 0.735


Epoch[2] Batch[55] Speed: 1.2554969659525759 samples/sec                   batch loss = 128.19062781333923 | accuracy = 0.7227272727272728


Epoch[2] Batch[60] Speed: 1.2565732758459272 samples/sec                   batch loss = 142.3450608253479 | accuracy = 0.7041666666666667


Epoch[2] Batch[65] Speed: 1.2549797789806871 samples/sec                   batch loss = 154.1677964925766 | accuracy = 0.7076923076923077


Epoch[2] Batch[70] Speed: 1.25840854659697 samples/sec                   batch loss = 166.43748652935028 | accuracy = 0.7071428571428572


Epoch[2] Batch[75] Speed: 1.259876218773934 samples/sec                   batch loss = 179.89811170101166 | accuracy = 0.7033333333333334


Epoch[2] Batch[80] Speed: 1.2566012284229247 samples/sec                   batch loss = 189.6925928592682 | accuracy = 0.70625


Epoch[2] Batch[85] Speed: 1.2597807647762715 samples/sec                   batch loss = 204.26584601402283 | accuracy = 0.7


Epoch[2] Batch[90] Speed: 1.2520287089016624 samples/sec                   batch loss = 217.82788932323456 | accuracy = 0.6972222222222222


Epoch[2] Batch[95] Speed: 1.2581431794549371 samples/sec                   batch loss = 228.12001180648804 | accuracy = 0.7078947368421052


Epoch[2] Batch[100] Speed: 1.259441069696633 samples/sec                   batch loss = 240.16750156879425 | accuracy = 0.7025


Epoch[2] Batch[105] Speed: 1.258105440731676 samples/sec                   batch loss = 252.91837286949158 | accuracy = 0.7


Epoch[2] Batch[110] Speed: 1.2487961037050888 samples/sec                   batch loss = 264.27465319633484 | accuracy = 0.6977272727272728


Epoch[2] Batch[115] Speed: 1.255719016559455 samples/sec                   batch loss = 276.36707735061646 | accuracy = 0.6956521739130435


Epoch[2] Batch[120] Speed: 1.251787693511821 samples/sec                   batch loss = 288.1386151313782 | accuracy = 0.6916666666666667


Epoch[2] Batch[125] Speed: 1.2577255379393117 samples/sec                   batch loss = 299.00651931762695 | accuracy = 0.696


Epoch[2] Batch[130] Speed: 1.257244952181918 samples/sec                   batch loss = 311.8520122766495 | accuracy = 0.6903846153846154


Epoch[2] Batch[135] Speed: 1.2597460492185912 samples/sec                   batch loss = 324.10840809345245 | accuracy = 0.687037037037037


Epoch[2] Batch[140] Speed: 1.258319449271636 samples/sec                   batch loss = 336.33336865901947 | accuracy = 0.6875


Epoch[2] Batch[145] Speed: 1.2549363159996225 samples/sec                   batch loss = 347.76386618614197 | accuracy = 0.6913793103448276


Epoch[2] Batch[150] Speed: 1.2498230551355058 samples/sec                   batch loss = 362.4940378665924 | accuracy = 0.69


Epoch[2] Batch[155] Speed: 1.248697581352229 samples/sec                   batch loss = 376.5340030193329 | accuracy = 0.6806451612903226


Epoch[2] Batch[160] Speed: 1.2557975001141484 samples/sec                   batch loss = 390.06546115875244 | accuracy = 0.678125


Epoch[2] Batch[165] Speed: 1.251165406381898 samples/sec                   batch loss = 402.5973938703537 | accuracy = 0.6818181818181818


Epoch[2] Batch[170] Speed: 1.2482898084330984 samples/sec                   batch loss = 413.7215050458908 | accuracy = 0.6838235294117647


Epoch[2] Batch[175] Speed: 1.2565969931040475 samples/sec                   batch loss = 425.852884888649 | accuracy = 0.6828571428571428


Epoch[2] Batch[180] Speed: 1.2475790496272332 samples/sec                   batch loss = 440.5500396490097 | accuracy = 0.6777777777777778


Epoch[2] Batch[185] Speed: 1.24650410140411 samples/sec                   batch loss = 454.1270900964737 | accuracy = 0.6743243243243243


Epoch[2] Batch[190] Speed: 1.2528413713556372 samples/sec                   batch loss = 468.22741997241974 | accuracy = 0.6684210526315789


Epoch[2] Batch[195] Speed: 1.2524568801153586 samples/sec                   batch loss = 476.98319387435913 | accuracy = 0.676923076923077


Epoch[2] Batch[200] Speed: 1.2478588195007394 samples/sec                   batch loss = 488.61597061157227 | accuracy = 0.67625


Epoch[2] Batch[205] Speed: 1.253928219456811 samples/sec                   batch loss = 500.8040237426758 | accuracy = 0.6780487804878049


Epoch[2] Batch[210] Speed: 1.2494139886691071 samples/sec                   batch loss = 511.5552363395691 | accuracy = 0.6797619047619048


Epoch[2] Batch[215] Speed: 1.2498576914873623 samples/sec                   batch loss = 522.802544593811 | accuracy = 0.6802325581395349


Epoch[2] Batch[220] Speed: 1.2523990070814088 samples/sec                   batch loss = 533.5002650022507 | accuracy = 0.678409090909091


Epoch[2] Batch[225] Speed: 1.2570096472393983 samples/sec                   batch loss = 544.0658326148987 | accuracy = 0.6811111111111111


Epoch[2] Batch[230] Speed: 1.2503737397021752 samples/sec                   batch loss = 554.87149477005 | accuracy = 0.683695652173913


Epoch[2] Batch[235] Speed: 1.2475324798933993 samples/sec                   batch loss = 567.2018309831619 | accuracy = 0.6851063829787234


Epoch[2] Batch[240] Speed: 1.2535728458262734 samples/sec                   batch loss = 577.0223401784897 | accuracy = 0.6885416666666667


Epoch[2] Batch[245] Speed: 1.256848903041315 samples/sec                   batch loss = 587.8672757148743 | accuracy = 0.689795918367347


Epoch[2] Batch[250] Speed: 1.2550464343248573 samples/sec                   batch loss = 599.6482925415039 | accuracy = 0.69


Epoch[2] Batch[255] Speed: 1.2552014588489704 samples/sec                   batch loss = 612.3863751888275 | accuracy = 0.6892156862745098


Epoch[2] Batch[260] Speed: 1.2542133760504472 samples/sec                   batch loss = 623.1817920207977 | accuracy = 0.6865384615384615


Epoch[2] Batch[265] Speed: 1.2516889787283505 samples/sec                   batch loss = 635.9156537055969 | accuracy = 0.6858490566037736


Epoch[2] Batch[270] Speed: 1.2575467951745964 samples/sec                   batch loss = 647.0535465478897 | accuracy = 0.6861111111111111


Epoch[2] Batch[275] Speed: 1.2578446337089175 samples/sec                   batch loss = 658.4451093673706 | accuracy = 0.6881818181818182


Epoch[2] Batch[280] Speed: 1.25930985574231 samples/sec                   batch loss = 670.6756978034973 | accuracy = 0.6883928571428571


Epoch[2] Batch[285] Speed: 1.2642761251419061 samples/sec                   batch loss = 679.8319231271744 | accuracy = 0.6929824561403509


Epoch[2] Batch[290] Speed: 1.259506592373976 samples/sec                   batch loss = 696.6869432926178 | accuracy = 0.6887931034482758


Epoch[2] Batch[295] Speed: 1.2606594117379202 samples/sec                   batch loss = 706.1853861808777 | accuracy = 0.6923728813559322


Epoch[2] Batch[300] Speed: 1.2626740679572865 samples/sec                   batch loss = 717.0485211610794 | accuracy = 0.6941666666666667


Epoch[2] Batch[305] Speed: 1.2620386344191123 samples/sec                   batch loss = 729.295281291008 | accuracy = 0.6942622950819672


Epoch[2] Batch[310] Speed: 1.2545389054971308 samples/sec                   batch loss = 742.3649941682816 | accuracy = 0.692741935483871


Epoch[2] Batch[315] Speed: 1.2568655688069414 samples/sec                   batch loss = 752.3978708982468 | accuracy = 0.6936507936507936


Epoch[2] Batch[320] Speed: 1.2522018675383477 samples/sec                   batch loss = 762.4186936616898 | accuracy = 0.69609375


Epoch[2] Batch[325] Speed: 1.2551457733134321 samples/sec                   batch loss = 772.3374679088593 | accuracy = 0.6953846153846154


Epoch[2] Batch[330] Speed: 1.254773380090248 samples/sec                   batch loss = 784.0803509950638 | accuracy = 0.6931818181818182


Epoch[2] Batch[335] Speed: 1.2549344386168657 samples/sec                   batch loss = 800.2877615690231 | accuracy = 0.6888059701492537


Epoch[2] Batch[340] Speed: 1.2557481530216121 samples/sec                   batch loss = 811.190133690834 | accuracy = 0.6897058823529412


Epoch[2] Batch[345] Speed: 1.2567189817179218 samples/sec                   batch loss = 821.7324895858765 | accuracy = 0.6898550724637681


Epoch[2] Batch[350] Speed: 1.2541053724982127 samples/sec                   batch loss = 834.509113907814 | accuracy = 0.6892857142857143


Epoch[2] Batch[355] Speed: 1.2547513269252617 samples/sec                   batch loss = 849.874426484108 | accuracy = 0.6866197183098591


Epoch[2] Batch[360] Speed: 1.2517885341019008 samples/sec                   batch loss = 863.815732717514 | accuracy = 0.6854166666666667


Epoch[2] Batch[365] Speed: 1.2502494393250811 samples/sec                   batch loss = 875.4000424146652 | accuracy = 0.6842465753424658


Epoch[2] Batch[370] Speed: 1.2539431208797713 samples/sec                   batch loss = 890.0746084451675 | accuracy = 0.6824324324324325


Epoch[2] Batch[375] Speed: 1.2556248492420135 samples/sec                   batch loss = 902.8491023778915 | accuracy = 0.6813333333333333


Epoch[2] Batch[380] Speed: 1.2542552886884406 samples/sec                   batch loss = 913.6067626476288 | accuracy = 0.6828947368421052


Epoch[2] Batch[385] Speed: 1.2556399789664114 samples/sec                   batch loss = 924.9717185497284 | accuracy = 0.6844155844155844


Epoch[2] Batch[390] Speed: 1.2598568240752783 samples/sec                   batch loss = 936.7972195148468 | accuracy = 0.6858974358974359


Epoch[2] Batch[395] Speed: 1.2561831049179044 samples/sec                   batch loss = 946.1881363391876 | accuracy = 0.6873417721518987


Epoch[2] Batch[400] Speed: 1.250115476039588 samples/sec                   batch loss = 957.926309466362 | accuracy = 0.686875


Epoch[2] Batch[405] Speed: 1.2603197160910415 samples/sec                   batch loss = 973.0180732011795 | accuracy = 0.6845679012345679


Epoch[2] Batch[410] Speed: 1.2484975172878252 samples/sec                   batch loss = 982.7817621231079 | accuracy = 0.6853658536585366


Epoch[2] Batch[415] Speed: 1.2493861688468895 samples/sec                   batch loss = 993.5692987442017 | accuracy = 0.686144578313253


Epoch[2] Batch[420] Speed: 1.2549377240403763 samples/sec                   batch loss = 1006.0936269760132 | accuracy = 0.6863095238095238


Epoch[2] Batch[425] Speed: 1.2566846229608912 samples/sec                   batch loss = 1018.3880970478058 | accuracy = 0.6864705882352942


Epoch[2] Batch[430] Speed: 1.2566083814707438 samples/sec                   batch loss = 1031.8001956939697 | accuracy = 0.6854651162790698


Epoch[2] Batch[435] Speed: 1.2536104067826346 samples/sec                   batch loss = 1041.4881726503372 | accuracy = 0.6879310344827586


Epoch[2] Batch[440] Speed: 1.257683298777536 samples/sec                   batch loss = 1054.943308711052 | accuracy = 0.6880681818181819


Epoch[2] Batch[445] Speed: 1.2542663533421017 samples/sec                   batch loss = 1065.3445521593094 | accuracy = 0.6898876404494382


Epoch[2] Batch[450] Speed: 1.2528741168590836 samples/sec                   batch loss = 1078.5893961191177 | accuracy = 0.6894444444444444


Epoch[2] Batch[455] Speed: 1.257515972840643 samples/sec                   batch loss = 1088.4864609241486 | accuracy = 0.6895604395604396


Epoch[2] Batch[460] Speed: 1.2523351566119116 samples/sec                   batch loss = 1103.4477937221527 | accuracy = 0.6891304347826087


Epoch[2] Batch[465] Speed: 1.2565072110617346 samples/sec                   batch loss = 1115.0603840351105 | accuracy = 0.6876344086021505


Epoch[2] Batch[470] Speed: 1.2597188078319457 samples/sec                   batch loss = 1125.5927605628967 | accuracy = 0.6877659574468085


Epoch[2] Batch[475] Speed: 1.2548477096575432 samples/sec                   batch loss = 1136.9776631593704 | accuracy = 0.6873684210526316


Epoch[2] Batch[480] Speed: 1.2537436211374104 samples/sec                   batch loss = 1147.7064059972763 | accuracy = 0.6875


Epoch[2] Batch[485] Speed: 1.2581180829517857 samples/sec                   batch loss = 1158.7611691951752 | accuracy = 0.688659793814433


Epoch[2] Batch[490] Speed: 1.2575383118127177 samples/sec                   batch loss = 1170.0600299835205 | accuracy = 0.6887755102040817


Epoch[2] Batch[495] Speed: 1.2476811998051272 samples/sec                   batch loss = 1180.4685473442078 | accuracy = 0.6898989898989899


Epoch[2] Batch[500] Speed: 1.259558504789594 samples/sec                   batch loss = 1190.9390803575516 | accuracy = 0.6895


Epoch[2] Batch[505] Speed: 1.254035442620152 samples/sec                   batch loss = 1203.5169246196747 | accuracy = 0.6886138613861386


Epoch[2] Batch[510] Speed: 1.2511739905925634 samples/sec                   batch loss = 1216.6416300535202 | accuracy = 0.6877450980392157


Epoch[2] Batch[515] Speed: 1.2502949076606833 samples/sec                   batch loss = 1226.1459928750992 | accuracy = 0.6893203883495146


Epoch[2] Batch[520] Speed: 1.2535554243014173 samples/sec                   batch loss = 1236.3488212823868 | accuracy = 0.6903846153846154


Epoch[2] Batch[525] Speed: 1.2608563807852982 samples/sec                   batch loss = 1252.0189179182053 | accuracy = 0.6885714285714286


Epoch[2] Batch[530] Speed: 1.2539702067630256 samples/sec                   batch loss = 1263.7251924276352 | accuracy = 0.6886792452830188


Epoch[2] Batch[535] Speed: 1.2566730449536654 samples/sec                   batch loss = 1275.3526015281677 | accuracy = 0.6883177570093458


Epoch[2] Batch[540] Speed: 1.2516563885243006 samples/sec                   batch loss = 1284.8036239147186 | accuracy = 0.6893518518518519


Epoch[2] Batch[545] Speed: 1.252628193201356 samples/sec                   batch loss = 1296.4730931520462 | accuracy = 0.689908256880734


Epoch[2] Batch[550] Speed: 1.2571664760122356 samples/sec                   batch loss = 1307.0276018381119 | accuracy = 0.6904545454545454


Epoch[2] Batch[555] Speed: 1.2570282950976228 samples/sec                   batch loss = 1317.0997321605682 | accuracy = 0.6914414414414415


Epoch[2] Batch[560] Speed: 1.2612978186752672 samples/sec                   batch loss = 1326.5582257509232 | accuracy = 0.6928571428571428


Epoch[2] Batch[565] Speed: 1.2622279627931556 samples/sec                   batch loss = 1338.1399548053741 | accuracy = 0.6933628318584071


Epoch[2] Batch[570] Speed: 1.2575085266929862 samples/sec                   batch loss = 1349.658128976822 | accuracy = 0.6929824561403509


Epoch[2] Batch[575] Speed: 1.2606872622105574 samples/sec                   batch loss = 1362.4299765825272 | accuracy = 0.6921739130434783


Epoch[2] Batch[580] Speed: 1.2601069195630714 samples/sec                   batch loss = 1374.8217384815216 | accuracy = 0.6913793103448276


Epoch[2] Batch[585] Speed: 1.2598946679438923 samples/sec                   batch loss = 1387.7522320747375 | accuracy = 0.6901709401709402


Epoch[2] Batch[590] Speed: 1.2591540983035674 samples/sec                   batch loss = 1399.3655492067337 | accuracy = 0.690677966101695


Epoch[2] Batch[595] Speed: 1.254982783012852 samples/sec                   batch loss = 1412.5275608301163 | accuracy = 0.6899159663865546


Epoch[2] Batch[600] Speed: 1.2558961119223557 samples/sec                   batch loss = 1424.6328891515732 | accuracy = 0.6891666666666667


Epoch[2] Batch[605] Speed: 1.2550058769903616 samples/sec                   batch loss = 1434.0745587348938 | accuracy = 0.6900826446280992


Epoch[2] Batch[610] Speed: 1.2554110046877702 samples/sec                   batch loss = 1443.4148732423782 | accuracy = 0.6901639344262295


Epoch[2] Batch[615] Speed: 1.2542298782312993 samples/sec                   batch loss = 1457.0229482650757 | accuracy = 0.6894308943089431


Epoch[2] Batch[620] Speed: 1.2580232724935407 samples/sec                   batch loss = 1469.5505878925323 | accuracy = 0.6887096774193548


Epoch[2] Batch[625] Speed: 1.2527125578019684 samples/sec                   batch loss = 1480.0442427396774 | accuracy = 0.6896


Epoch[2] Batch[630] Speed: 1.2537066143101254 samples/sec                   batch loss = 1492.1615203619003 | accuracy = 0.6892857142857143


Epoch[2] Batch[635] Speed: 1.2558775916504865 samples/sec                   batch loss = 1501.482983827591 | accuracy = 0.6897637795275591


Epoch[2] Batch[640] Speed: 1.2556658225066575 samples/sec                   batch loss = 1513.3570890426636 | accuracy = 0.688671875


Epoch[2] Batch[645] Speed: 1.2577367581935692 samples/sec                   batch loss = 1527.2675411701202 | accuracy = 0.687984496124031


Epoch[2] Batch[650] Speed: 1.2559607021191828 samples/sec                   batch loss = 1539.1940612792969 | accuracy = 0.6873076923076923


Epoch[2] Batch[655] Speed: 1.2551714087050931 samples/sec                   batch loss = 1550.2086312770844 | accuracy = 0.6870229007633588


Epoch[2] Batch[660] Speed: 1.2565380780734718 samples/sec                   batch loss = 1560.0528057813644 | accuracy = 0.6878787878787879


Epoch[2] Batch[665] Speed: 1.2553542673077769 samples/sec                   batch loss = 1570.9794945716858 | accuracy = 0.6879699248120301


Epoch[2] Batch[670] Speed: 1.2524936261907476 samples/sec                   batch loss = 1581.4313997030258 | accuracy = 0.6880597014925374


Epoch[2] Batch[675] Speed: 1.2550668078639169 samples/sec                   batch loss = 1594.8822001218796 | accuracy = 0.687037037037037


Epoch[2] Batch[680] Speed: 1.2604519928513451 samples/sec                   batch loss = 1608.7455060482025 | accuracy = 0.6871323529411765


Epoch[2] Batch[685] Speed: 1.2585576051810767 samples/sec                   batch loss = 1616.9142957925797 | accuracy = 0.6883211678832116


Epoch[2] Batch[690] Speed: 1.2553556762865714 samples/sec                   batch loss = 1631.8385939598083 | accuracy = 0.6876811594202898


Epoch[2] Batch[695] Speed: 1.251711484702301 samples/sec                   batch loss = 1641.481758594513 | accuracy = 0.6881294964028777


Epoch[2] Batch[700] Speed: 1.2591416242530205 samples/sec                   batch loss = 1650.5304408073425 | accuracy = 0.6889285714285714


Epoch[2] Batch[705] Speed: 1.2565088108391045 samples/sec                   batch loss = 1665.196779847145 | accuracy = 0.6886524822695036


Epoch[2] Batch[710] Speed: 1.2549043073921715 samples/sec                   batch loss = 1676.825716972351 | accuracy = 0.6887323943661972


Epoch[2] Batch[715] Speed: 1.2547339664683639 samples/sec                   batch loss = 1687.7822624444962 | accuracy = 0.6881118881118881


Epoch[2] Batch[720] Speed: 1.2565879578525287 samples/sec                   batch loss = 1698.4467021226883 | accuracy = 0.6888888888888889


Epoch[2] Batch[725] Speed: 1.2584106231681387 samples/sec                   batch loss = 1708.4842520952225 | accuracy = 0.6886206896551724


Epoch[2] Batch[730] Speed: 1.2552493542398706 samples/sec                   batch loss = 1718.6877687573433 | accuracy = 0.6893835616438356


Epoch[2] Batch[735] Speed: 1.2570079520069948 samples/sec                   batch loss = 1729.2538713812828 | accuracy = 0.689795918367347


Epoch[2] Batch[740] Speed: 1.2521189734294027 samples/sec                   batch loss = 1741.0337223410606 | accuracy = 0.6898648648648649


Epoch[2] Batch[745] Speed: 1.25943861154705 samples/sec                   batch loss = 1751.9498141407967 | accuracy = 0.6902684563758389


Epoch[2] Batch[750] Speed: 1.2537085817067166 samples/sec                   batch loss = 1764.8036742806435 | accuracy = 0.69


Epoch[2] Batch[755] Speed: 1.2500877181152195 samples/sec                   batch loss = 1778.584716141224 | accuracy = 0.6900662251655629


Epoch[2] Batch[760] Speed: 1.2533979032507614 samples/sec                   batch loss = 1789.104186475277 | accuracy = 0.6901315789473684


Epoch[2] Batch[765] Speed: 1.2500071525983005 samples/sec                   batch loss = 1802.6267475485802 | accuracy = 0.6888888888888889


Epoch[2] Batch[770] Speed: 1.2518974467650095 samples/sec                   batch loss = 1812.9027509093285 | accuracy = 0.6896103896103896


Epoch[2] Batch[775] Speed: 1.2503333906187375 samples/sec                   batch loss = 1821.2383486628532 | accuracy = 0.6903225806451613


Epoch[2] Batch[780] Speed: 1.2546179921734142 samples/sec                   batch loss = 1830.1134052872658 | accuracy = 0.6916666666666667


Epoch[2] Batch[785] Speed: 1.2552713309954577 samples/sec                   batch loss = 1841.5686871409416 | accuracy = 0.6914012738853503


[Epoch 2] training: accuracy=0.6913071065989848
[Epoch 2] time cost: 643.6363046169281
[Epoch 2] validation: validation accuracy=0.7288888888888889
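
The output above interleaves per-batch progress lines with three summary lines per epoch (training accuracy, time cost, and validation accuracy). As a minimal sketch for comparing epochs, the summary lines could be pulled out of captured output with a regular expression. The helper name `parse_epoch_summaries` and the `log_text` variable are hypothetical and not part of MXNet; the pattern assumes the exact line format shown in this log.

```python
import re

# Hypothetical helper, not part of MXNet: matches only the per-epoch summary
# lines ("[Epoch N] ..."), not the per-batch lines ("Epoch[N] Batch[...]").
SUMMARY = re.compile(
    r"\[Epoch (\d+)\] (training|time cost|validation): (?:[\w ]+=)?([0-9.]+)"
)

def parse_epoch_summaries(log_text):
    """Return {epoch: {"training": ..., "time cost": ..., "validation": ...}}."""
    epochs = {}
    for line in log_text.splitlines():
        m = SUMMARY.match(line.strip())
        if m:
            epoch, key, value = int(m.group(1)), m.group(2), float(m.group(3))
            epochs.setdefault(epoch, {})[key] = value
    return epochs

# Summary lines copied from the output above.
log_text = """
[Epoch 1] training: accuracy=0.5926395939086294
[Epoch 1] time cost: 646.4754803180695
[Epoch 1] validation: validation accuracy=0.67
[Epoch 2] training: accuracy=0.6913071065989848
[Epoch 2] time cost: 643.6363046169281
[Epoch 2] validation: validation accuracy=0.7288888888888889
"""

summaries = parse_epoch_summaries(log_text)
for epoch, stats in sorted(summaries.items()):
    print(epoch, stats)
```

Collecting the summaries this way makes it easier to see that both training and validation accuracy rose between the two epochs, while the time per epoch stayed roughly constant.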


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen to study more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).