<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and running inference on GPUs may be considerably faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure your machine has at least one NVIDIA GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[03:22:51] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the output shows which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If only one GPU is available, the code falls back to copying `x` to the CPU, which is the case in the output shown below:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[03:22:51] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
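
The indexing scheme above can be sketched in plain Python. This is a minimal illustration with no MXNet calls: `pick_device_ids` is a hypothetical helper, not part of MXNet. In a real script, `num_available` would come from `npx.num_gpus()` and each returned id would be passed to `npx.gpu(i)`.

```python
def pick_device_ids(num_available, num_requested):
    """Return the 0-based ids of the GPUs to use (hypothetical helper).

    An empty list signals that the caller should fall back to the CPU.
    """
    if num_requested < 0:
        raise ValueError("num_requested must be non-negative")
    # Never ask for more devices than the machine exposes.
    count = min(num_available, num_requested)
    return list(range(count))


# On a machine with eight GPUs, the valid ids run from 0 to 7.
all_ids = pick_device_ids(num_available=8, num_requested=8)
first_two = pick_device_ids(num_available=8, num_requested=2)
none_found = pick_device_ids(num_available=0, num_requested=2)
print(all_ids[-1], first_two, none_found)
```

Capping the request at the number of available devices avoids referring to a GPU id that does not exist on the machine.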

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch
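
To see what the three convolutional blocks do to the spatial size of a 128 x 128 input, the arithmetic can be sketched in plain Python. This sketch assumes the Gluon defaults for the layers above (convolution stride 1, no padding, and floor rounding in pooling); `out_size` and `conv_block_size` are hypothetical helpers introduced only for this illustration.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_block_size(size, kernel_size=2, stride=2, pool_size=4):
    """Spatial size after one conv_block: Conv2D followed by MaxPool2D."""
    # Conv2D with kernel 2; stride 1 and zero padding are assumed defaults.
    size = out_size(size, kernel_size)
    # MaxPool2D with window 4 and stride 2; zero padding is an assumed default.
    return out_size(size, pool_size, stride)


size = 128  # height and width of the input image
sizes = []
for _ in range(3):  # conv1, conv2, conv3
    size = conv_block_size(size)
    sizes.append(size)

# conv3 produces 128 channels, so Flatten yields 128 * size * size features.
flattened = 128 * size * size
print(sizes, flattened)
```

Batch normalization does not change the shape, so under these assumptions the flattened vector is what the first dense block receives.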

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[03:22:51] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 5.6812434, -5.7443237]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, assuming there are *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own separate chunk of the data.

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.
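
The idea can be checked with a toy example in plain Python. This is only a sketch with no MXNet calls: it uses a one-parameter model `w * x` with a squared-error loss, and `split_batch` and `grad_sum` are hypothetical helpers. It shows that summing the gradients computed on each chunk and dividing by the full batch size gives the same mean gradient as processing the whole batch on one device.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous, equally sized chunks."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    step = len(batch) // num_parts
    return [batch[i * step:(i + 1) * step] for i in range(num_parts)]


def grad_sum(w, samples):
    """Sum over samples of d/dw (w*x - y)**2, which is 2*x*(w*x - y)."""
    return sum(2.0 * x * (w * x - y) for x, y in samples)


w = 0.5
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # (x, y) pairs

# Each "device" processes its own chunk independently.
shards = split_batch(batch, num_parts=2)
per_device = [grad_sum(w, shard) for shard in shards]

# Summing the per-device gradients and dividing by the full batch size
# matches the mean gradient computed on a single device.
parallel_grad = sum(per_device) / len(batch)
single_grad = grad_sum(w, batch) / len(batch)
print(per_device, parallel_grad, single_grad)
```

Because the chunks are independent until the gradients are combined, the per-chunk work can run on different devices at the same time.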

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
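
As a small worked example of the normalization step, the per-channel arithmetic `(value - mean) / std` can be reproduced in plain Python. This sketch assumes pixel values have already been scaled to the range 0 to 1, as `ToTensor` does; `normalize_pixel` is a hypothetical helper and is not part of Gluon.

```python
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]


def normalize_pixel(pixel, mean, std):
    """Apply (value - mean) / std to each channel of one RGB pixel."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


# A mid-gray pixel and a pixel that sits exactly at the channel means.
mid_gray = normalize_pixel([0.5, 0.5, 0.5], mean, std)
at_mean = normalize_pixel(mean, mean, std)
print([round(v, 4) for v in mid_gray], at_mean)
```

A pixel equal to the channel means maps to zero in every channel, and values above or below the mean become positive or negative multiples of the channel's standard deviation.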

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
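
The bookkeeping in the helper above can be sketched in plain Python: the metric receives one list of labels and one list of outputs per device and keeps a running count of correct predictions. `RunningAccuracy` is a hypothetical stand-in written for this illustration and is not how `gluon.metric.Accuracy` is implemented.

```python
def argmax(scores):
    """Index of the largest score in a list."""
    return max(range(len(scores)), key=lambda i: scores[i])


class RunningAccuracy:
    """Hypothetical stand-in that counts correct predictions across updates."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, label_list, output_list):
        # One (labels, outputs) pair per device shard.
        for labels, outputs in zip(label_list, output_list):
            for label, scores in zip(labels, outputs):
                self.correct += int(argmax(scores) == label)
                self.total += 1

    def get(self):
        return "accuracy", self.correct / self.total


acc = RunningAccuracy()
# Two shards of two samples each; every sample has one score per class.
label_list = [[0, 1], [1, 0]]
output_list = [[[2.0, -1.0], [0.3, 0.9]], [[1.5, 0.2], [0.8, 0.1]]]
acc.update(label_list, output_list)
_, value = acc.get()
print(value)
```

Since the counts accumulate across calls to `update`, the reported value is a running accuracy over everything seen since the metric was created or reset.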

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7972792204869942 samples/sec                   batch loss = 14.841939449310303 | accuracy = 0.4


Epoch[1] Batch[10] Speed: 1.281832894295111 samples/sec                   batch loss = 28.594372749328613 | accuracy = 0.475


Epoch[1] Batch[15] Speed: 1.290702614160357 samples/sec                   batch loss = 42.15288686752319 | accuracy = 0.5


Epoch[1] Batch[20] Speed: 1.2873457240330626 samples/sec                   batch loss = 55.967594623565674 | accuracy = 0.5


Epoch[1] Batch[25] Speed: 1.2887968605905713 samples/sec                   batch loss = 68.56163668632507 | accuracy = 0.53


Epoch[1] Batch[30] Speed: 1.28505553349186 samples/sec                   batch loss = 82.47399830818176 | accuracy = 0.5333333333333333


Epoch[1] Batch[35] Speed: 1.283337488171333 samples/sec                   batch loss = 96.67923951148987 | accuracy = 0.5285714285714286


Epoch[1] Batch[40] Speed: 1.281172851031458 samples/sec                   batch loss = 112.2089307308197 | accuracy = 0.51875


Epoch[1] Batch[45] Speed: 1.2911255557840327 samples/sec                   batch loss = 127.21167397499084 | accuracy = 0.5055555555555555


Epoch[1] Batch[50] Speed: 1.2788097016631859 samples/sec                   batch loss = 142.24410200119019 | accuracy = 0.485


Epoch[1] Batch[55] Speed: 1.2789235622605242 samples/sec                   batch loss = 156.16020131111145 | accuracy = 0.4863636363636364


Epoch[1] Batch[60] Speed: 1.2774079938503689 samples/sec                   batch loss = 170.5307924747467 | accuracy = 0.4791666666666667


Epoch[1] Batch[65] Speed: 1.281774918729778 samples/sec                   batch loss = 184.8703043460846 | accuracy = 0.4653846153846154


Epoch[1] Batch[70] Speed: 1.2835132296860845 samples/sec                   batch loss = 199.47185158729553 | accuracy = 0.45357142857142857


Epoch[1] Batch[75] Speed: 1.2890282725122966 samples/sec                   batch loss = 213.3975489139557 | accuracy = 0.4633333333333333


Epoch[1] Batch[80] Speed: 1.2796256954257001 samples/sec                   batch loss = 226.98507714271545 | accuracy = 0.471875


Epoch[1] Batch[85] Speed: 1.282166159625148 samples/sec                   batch loss = 240.25204181671143 | accuracy = 0.4852941176470588


Epoch[1] Batch[90] Speed: 1.2919250143325056 samples/sec                   batch loss = 253.99934840202332 | accuracy = 0.4861111111111111


Epoch[1] Batch[95] Speed: 1.2885529629769867 samples/sec                   batch loss = 267.53197836875916 | accuracy = 0.4921052631578947


Epoch[1] Batch[100] Speed: 1.2891714984041742 samples/sec                   batch loss = 281.45235085487366 | accuracy = 0.4975


Epoch[1] Batch[105] Speed: 1.290021206608342 samples/sec                   batch loss = 295.1386649608612 | accuracy = 0.5


Epoch[1] Batch[110] Speed: 1.2859956288822452 samples/sec                   batch loss = 308.73785948753357 | accuracy = 0.5022727272727273


Epoch[1] Batch[115] Speed: 1.2818048851746202 samples/sec                   batch loss = 322.51047015190125 | accuracy = 0.508695652173913


Epoch[1] Batch[120] Speed: 1.2864117426096302 samples/sec                   batch loss = 336.432199716568 | accuracy = 0.5104166666666666


Epoch[1] Batch[125] Speed: 1.285011045039758 samples/sec                   batch loss = 350.19575476646423 | accuracy = 0.51


Epoch[1] Batch[130] Speed: 1.2803609415953976 samples/sec                   batch loss = 364.6628930568695 | accuracy = 0.5076923076923077


Epoch[1] Batch[135] Speed: 1.2809025891352275 samples/sec                   batch loss = 377.96391105651855 | accuracy = 0.5166666666666667


Epoch[1] Batch[140] Speed: 1.282834387501682 samples/sec                   batch loss = 391.5704391002655 | accuracy = 0.5196428571428572


Epoch[1] Batch[145] Speed: 1.283993278656858 samples/sec                   batch loss = 405.59026741981506 | accuracy = 0.5172413793103449


Epoch[1] Batch[150] Speed: 1.2799120177667231 samples/sec                   batch loss = 418.90224027633667 | accuracy = 0.52


Epoch[1] Batch[155] Speed: 1.2816344085035811 samples/sec                   batch loss = 432.15021872520447 | accuracy = 0.5274193548387097


Epoch[1] Batch[160] Speed: 1.2841185807048225 samples/sec                   batch loss = 446.1586196422577 | accuracy = 0.525


Epoch[1] Batch[165] Speed: 1.289107112149876 samples/sec                   batch loss = 460.279767036438 | accuracy = 0.5257575757575758


Epoch[1] Batch[170] Speed: 1.2877694335199987 samples/sec                   batch loss = 473.57067918777466 | accuracy = 0.5279411764705882


Epoch[1] Batch[175] Speed: 1.2809555956889838 samples/sec                   batch loss = 487.0465977191925 | accuracy = 0.5328571428571428


Epoch[1] Batch[180] Speed: 1.2886302598165293 samples/sec                   batch loss = 501.14383125305176 | accuracy = 0.5305555555555556


Epoch[1] Batch[185] Speed: 1.2846219024410488 samples/sec                   batch loss = 515.2260768413544 | accuracy = 0.5297297297297298


Epoch[1] Batch[190] Speed: 1.289827614148697 samples/sec                   batch loss = 528.6917455196381 | accuracy = 0.5328947368421053


Epoch[1] Batch[195] Speed: 1.2865962203530505 samples/sec                   batch loss = 541.9125468730927 | accuracy = 0.5346153846153846


Epoch[1] Batch[200] Speed: 1.287748478642663 samples/sec                   batch loss = 556.3418834209442 | accuracy = 0.53125


Epoch[1] Batch[205] Speed: 1.2903126966457532 samples/sec                   batch loss = 571.0809624195099 | accuracy = 0.526829268292683


Epoch[1] Batch[210] Speed: 1.2884255077780287 samples/sec                   batch loss = 584.6727414131165 | accuracy = 0.5297619047619048


Epoch[1] Batch[215] Speed: 1.2851966964504378 samples/sec                   batch loss = 598.1385922431946 | accuracy = 0.5348837209302325


Epoch[1] Batch[220] Speed: 1.2873144114285837 samples/sec                   batch loss = 611.8303575515747 | accuracy = 0.5386363636363637


Epoch[1] Batch[225] Speed: 1.2835012502443304 samples/sec                   batch loss = 625.553200006485 | accuracy = 0.5366666666666666


Epoch[1] Batch[230] Speed: 1.284033765742356 samples/sec                   batch loss = 639.0494229793549 | accuracy = 0.5358695652173913


Epoch[1] Batch[235] Speed: 1.282562051559141 samples/sec                   batch loss = 653.1101195812225 | accuracy = 0.5361702127659574


Epoch[1] Batch[240] Speed: 1.28312882081926 samples/sec                   batch loss = 667.2274279594421 | accuracy = 0.5354166666666667


Epoch[1] Batch[245] Speed: 1.284651903786424 samples/sec                   batch loss = 680.6875858306885 | accuracy = 0.5377551020408163


Epoch[1] Batch[250] Speed: 1.2859933617014232 samples/sec                   batch loss = 694.3583579063416 | accuracy = 0.538


Epoch[1] Batch[255] Speed: 1.2871381212536472 samples/sec                   batch loss = 708.6948926448822 | accuracy = 0.5352941176470588


Epoch[1] Batch[260] Speed: 1.2859751259725076 samples/sec                   batch loss = 722.604875087738 | accuracy = 0.5355769230769231


Epoch[1] Batch[265] Speed: 1.289116819204174 samples/sec                   batch loss = 736.2615675926208 | accuracy = 0.5386792452830189


Epoch[1] Batch[270] Speed: 1.2817977361896038 samples/sec                   batch loss = 749.8526268005371 | accuracy = 0.5388888888888889


Epoch[1] Batch[275] Speed: 1.2826518694312834 samples/sec                   batch loss = 763.4712307453156 | accuracy = 0.5390909090909091


Epoch[1] Batch[280] Speed: 1.2793632075831396 samples/sec                   batch loss = 777.0166082382202 | accuracy = 0.5410714285714285


Epoch[1] Batch[285] Speed: 1.278879399878875 samples/sec                   batch loss = 791.0524210929871 | accuracy = 0.5403508771929825


Epoch[1] Batch[290] Speed: 1.2815469845878715 samples/sec                   batch loss = 805.258162021637 | accuracy = 0.5387931034482759


Epoch[1] Batch[295] Speed: 1.2818742246735875 samples/sec                   batch loss = 818.7969696521759 | accuracy = 0.5415254237288135


Epoch[1] Batch[300] Speed: 1.2836874478224154 samples/sec                   batch loss = 831.7919313907623 | accuracy = 0.5458333333333333


Epoch[1] Batch[305] Speed: 1.2864513959278667 samples/sec                   batch loss = 845.8297011852264 | accuracy = 0.5434426229508197


Epoch[1] Batch[310] Speed: 1.2813510338812437 samples/sec                   batch loss = 859.4251778125763 | accuracy = 0.5435483870967742


Epoch[1] Batch[315] Speed: 1.284992738739455 samples/sec                   batch loss = 873.0186576843262 | accuracy = 0.5452380952380952


Epoch[1] Batch[320] Speed: 1.2774163583633262 samples/sec                   batch loss = 886.5165266990662 | accuracy = 0.5453125


Epoch[1] Batch[325] Speed: 1.281535531274312 samples/sec                   batch loss = 900.7493019104004 | accuracy = 0.5430769230769231


Epoch[1] Batch[330] Speed: 1.2839538749890276 samples/sec                   batch loss = 914.8472681045532 | accuracy = 0.5424242424242425


Epoch[1] Batch[335] Speed: 1.2813777509035862 samples/sec                   batch loss = 928.6298475265503 | accuracy = 0.5432835820895522


Epoch[1] Batch[340] Speed: 1.2807262907805916 samples/sec                   batch loss = 941.9589359760284 | accuracy = 0.5433823529411764


Epoch[1] Batch[345] Speed: 1.28284831632245 samples/sec                   batch loss = 955.6706800460815 | accuracy = 0.5456521739130434


Epoch[1] Batch[350] Speed: 1.2850997297166045 samples/sec                   batch loss = 968.9836254119873 | accuracy = 0.5471428571428572


Epoch[1] Batch[355] Speed: 1.2834764083441952 samples/sec                   batch loss = 982.4949791431427 | accuracy = 0.5471830985915493


Epoch[1] Batch[360] Speed: 1.290508618612807 samples/sec                   batch loss = 995.1912190914154 | accuracy = 0.5506944444444445


Epoch[1] Batch[365] Speed: 1.2833669386994215 samples/sec                   batch loss = 1008.7355065345764 | accuracy = 0.5493150684931507


Epoch[1] Batch[370] Speed: 1.2843554929585659 samples/sec                   batch loss = 1022.8087604045868 | accuracy = 0.5472972972972973


Epoch[1] Batch[375] Speed: 1.28605191667331 samples/sec                   batch loss = 1036.5893487930298 | accuracy = 0.548


Epoch[1] Batch[380] Speed: 1.2814136689493567 samples/sec                   batch loss = 1049.5056886672974 | accuracy = 0.5493421052631579


Epoch[1] Batch[385] Speed: 1.2831504107183325 samples/sec                   batch loss = 1063.2192358970642 | accuracy = 0.5487012987012987


Epoch[1] Batch[390] Speed: 1.2826154897162731 samples/sec                   batch loss = 1076.807168006897 | accuracy = 0.5474358974358975


Epoch[1] Batch[395] Speed: 1.2912028634673487 samples/sec                   batch loss = 1090.4510283470154 | accuracy = 0.5481012658227848


Epoch[1] Batch[400] Speed: 1.2802360786897424 samples/sec                   batch loss = 1104.2474265098572 | accuracy = 0.548125


Epoch[1] Batch[405] Speed: 1.2867803562396003 samples/sec                   batch loss = 1118.381433725357 | accuracy = 0.5469135802469136


Epoch[1] Batch[410] Speed: 1.2796443371337756 samples/sec                   batch loss = 1132.4397912025452 | accuracy = 0.5457317073170732


Epoch[1] Batch[415] Speed: 1.2832338330642659 samples/sec                   batch loss = 1145.4886538982391 | accuracy = 0.5475903614457831


Epoch[1] Batch[420] Speed: 1.2829254207592509 samples/sec                   batch loss = 1159.0477087497711 | accuracy = 0.5482142857142858


Epoch[1] Batch[425] Speed: 1.2856031314458447 samples/sec                   batch loss = 1172.5956802368164 | accuracy = 0.548235294117647


Epoch[1] Batch[430] Speed: 1.2845194166119749 samples/sec                   batch loss = 1185.6093022823334 | accuracy = 0.5482558139534883


Epoch[1] Batch[435] Speed: 1.2825641105616117 samples/sec                   batch loss = 1199.3530020713806 | accuracy = 0.5488505747126436


Epoch[1] Batch[440] Speed: 1.2836603397520625 samples/sec                   batch loss = 1212.366149187088 | accuracy = 0.55


Epoch[1] Batch[445] Speed: 1.2833244322029893 samples/sec                   batch loss = 1226.1121594905853 | accuracy = 0.551123595505618


Epoch[1] Batch[450] Speed: 1.2860352565510338 samples/sec                   batch loss = 1239.8831858634949 | accuracy = 0.5516666666666666


Epoch[1] Batch[455] Speed: 1.2830829937595722 samples/sec                   batch loss = 1252.7080166339874 | accuracy = 0.5538461538461539


Epoch[1] Batch[460] Speed: 1.2887098426913957 samples/sec                   batch loss = 1266.6083748340607 | accuracy = 0.5543478260869565


Epoch[1] Batch[465] Speed: 1.2842483309464652 samples/sec                   batch loss = 1279.7758979797363 | accuracy = 0.5553763440860215


Epoch[1] Batch[470] Speed: 1.2869905092096459 samples/sec                   batch loss = 1292.5226521492004 | accuracy = 0.5569148936170213


Epoch[1] Batch[475] Speed: 1.282944551157568 samples/sec                   batch loss = 1306.102344751358 | accuracy = 0.5573684210526316


Epoch[1] Batch[480] Speed: 1.2838356784950153 samples/sec                   batch loss = 1319.3911294937134 | accuracy = 0.559375


Epoch[1] Batch[485] Speed: 1.2902365869840284 samples/sec                   batch loss = 1332.556572675705 | accuracy = 0.5587628865979382


Epoch[1] Batch[490] Speed: 1.2835907086887193 samples/sec                   batch loss = 1346.3019530773163 | accuracy = 0.5581632653061225


Epoch[1] Batch[495] Speed: 1.285233321196409 samples/sec                   batch loss = 1359.4836299419403 | accuracy = 0.5575757575757576


Epoch[1] Batch[500] Speed: 1.2836807689161818 samples/sec                   batch loss = 1372.8230352401733 | accuracy = 0.559


Epoch[1] Batch[505] Speed: 1.283342396498816 samples/sec                   batch loss = 1386.4237267971039 | accuracy = 0.558910891089109


Epoch[1] Batch[510] Speed: 1.2806036048180058 samples/sec                   batch loss = 1399.226907491684 | accuracy = 0.5588235294117647


Epoch[1] Batch[515] Speed: 1.2836530718408765 samples/sec                   batch loss = 1413.931205034256 | accuracy = 0.558252427184466


Epoch[1] Batch[520] Speed: 1.2767830079013107 samples/sec                   batch loss = 1427.2842829227448 | accuracy = 0.5586538461538462


Epoch[1] Batch[525] Speed: 1.2851749391891882 samples/sec                   batch loss = 1439.074514389038 | accuracy = 0.56


Epoch[1] Batch[530] Speed: 1.282797997432602 samples/sec                   batch loss = 1453.1633639335632 | accuracy = 0.5589622641509434


Epoch[1] Batch[535] Speed: 1.2888141863759415 samples/sec                   batch loss = 1467.4315965175629 | accuracy = 0.5579439252336449


Epoch[1] Batch[540] Speed: 1.2840597102656102 samples/sec                   batch loss = 1480.6051616668701 | accuracy = 0.5592592592592592


Epoch[1] Batch[545] Speed: 1.2853725536615255 samples/sec                   batch loss = 1493.111522436142 | accuracy = 0.560091743119266


Epoch[1] Batch[550] Speed: 1.279978125373834 samples/sec                   batch loss = 1506.2278401851654 | accuracy = 0.5604545454545454


Epoch[1] Batch[555] Speed: 1.2832814377340007 samples/sec                   batch loss = 1520.4957029819489 | accuracy = 0.5603603603603604


Epoch[1] Batch[560] Speed: 1.282092085687919 samples/sec                   batch loss = 1534.1726405620575 | accuracy = 0.5602678571428571


Epoch[1] Batch[565] Speed: 1.278654345852151 samples/sec                   batch loss = 1547.7943530082703 | accuracy = 0.5606194690265487


Epoch[1] Batch[570] Speed: 1.2783258254151095 samples/sec                   batch loss = 1561.9517962932587 | accuracy = 0.5596491228070175


Epoch[1] Batch[575] Speed: 1.2799862306168734 samples/sec                   batch loss = 1574.6910302639008 | accuracy = 0.561304347826087


Epoch[1] Batch[580] Speed: 1.2820272290305241 samples/sec                   batch loss = 1587.69735622406 | accuracy = 0.5616379310344828


Epoch[1] Batch[585] Speed: 1.2826343166965846 samples/sec                   batch loss = 1601.1820707321167 | accuracy = 0.5611111111111111


Epoch[1] Batch[590] Speed: 1.2864078957760021 samples/sec                   batch loss = 1614.4740781784058 | accuracy = 0.5605932203389831


Epoch[1] Batch[595] Speed: 1.2866346023179218 samples/sec                   batch loss = 1628.1403512954712 | accuracy = 0.5617647058823529


Epoch[1] Batch[600] Speed: 1.284754607524658 samples/sec                   batch loss = 1641.7291550636292 | accuracy = 0.5616666666666666


Epoch[1] Batch[605] Speed: 1.2885473219574177 samples/sec                   batch loss = 1655.9245953559875 | accuracy = 0.5619834710743802


Epoch[1] Batch[610] Speed: 1.2863646944553235 samples/sec                   batch loss = 1668.8986585140228 | accuracy = 0.5627049180327869


Epoch[1] Batch[615] Speed: 1.2870373069545609 samples/sec                   batch loss = 1682.9183626174927 | accuracy = 0.5626016260162602


Epoch[1] Batch[620] Speed: 1.2838118060345036 samples/sec                   batch loss = 1696.2331011295319 | accuracy = 0.5629032258064516


Epoch[1] Batch[625] Speed: 1.2832504206822701 samples/sec                   batch loss = 1709.5558290481567 | accuracy = 0.5636


Epoch[1] Batch[630] Speed: 1.2834519601402912 samples/sec                   batch loss = 1721.474762916565 | accuracy = 0.5654761904761905


Epoch[1] Batch[635] Speed: 1.287997806816021 samples/sec                   batch loss = 1735.0037288665771 | accuracy = 0.5649606299212598


Epoch[1] Batch[640] Speed: 1.283170921795422 samples/sec                   batch loss = 1747.276819229126 | accuracy = 0.566015625


Epoch[1] Batch[645] Speed: 1.280586499080962 samples/sec                   batch loss = 1759.8892676830292 | accuracy = 0.5666666666666667


Epoch[1] Batch[650] Speed: 1.2847336522676556 samples/sec                   batch loss = 1773.4744882583618 | accuracy = 0.566923076923077


Epoch[1] Batch[655] Speed: 1.2779686585211465 samples/sec                   batch loss = 1786.0486886501312 | accuracy = 0.5675572519083969


Epoch[1] Batch[660] Speed: 1.2782133372077191 samples/sec                   batch loss = 1800.4788239002228 | accuracy = 0.5662878787878788


Epoch[1] Batch[665] Speed: 1.2810269952137532 samples/sec                   batch loss = 1813.247543811798 | accuracy = 0.5669172932330827


Epoch[1] Batch[670] Speed: 1.2866354903596375 samples/sec                   batch loss = 1825.75852227211 | accuracy = 0.567910447761194


Epoch[1] Batch[675] Speed: 1.2855226512887128 samples/sec                   batch loss = 1838.5297632217407 | accuracy = 0.5685185185185185


Epoch[1] Batch[680] Speed: 1.282526951486369 samples/sec                   batch loss = 1850.747991323471 | accuracy = 0.5698529411764706


Epoch[1] Batch[685] Speed: 1.2870361221595354 samples/sec                   batch loss = 1863.6601779460907 | accuracy = 0.5708029197080292


Epoch[1] Batch[690] Speed: 1.2839953422572297 samples/sec                   batch loss = 1876.8441410064697 | accuracy = 0.571376811594203


Epoch[1] Batch[695] Speed: 1.2904375478417967 samples/sec                   batch loss = 1890.475111246109 | accuracy = 0.570863309352518


Epoch[1] Batch[700] Speed: 1.283350740741704 samples/sec                   batch loss = 1903.5624403953552 | accuracy = 0.5714285714285714


Epoch[1] Batch[705] Speed: 1.281466913572024 samples/sec                   batch loss = 1917.0417058467865 | accuracy = 0.5723404255319149


Epoch[1] Batch[710] Speed: 1.2888107211816007 samples/sec                   batch loss = 1929.9207048416138 | accuracy = 0.5725352112676056


Epoch[1] Batch[715] Speed: 1.2834115097359244 samples/sec                   batch loss = 1943.787088394165 | accuracy = 0.5723776223776224


Epoch[1] Batch[720] Speed: 1.2873402911375778 samples/sec                   batch loss = 1957.0452287197113 | accuracy = 0.5732638888888889


Epoch[1] Batch[725] Speed: 1.28284184232588 samples/sec                   batch loss = 1970.6376082897186 | accuracy = 0.5727586206896552


Epoch[1] Batch[730] Speed: 1.2894769749814559 samples/sec                   batch loss = 1983.566602230072 | accuracy = 0.5736301369863014


Epoch[1] Batch[735] Speed: 1.283562033493522 samples/sec                   batch loss = 1995.0540716648102 | accuracy = 0.5748299319727891


Epoch[1] Batch[740] Speed: 1.28372654046158 samples/sec                   batch loss = 2007.8806190490723 | accuracy = 0.5746621621621621


Epoch[1] Batch[745] Speed: 1.288237933262041 samples/sec                   batch loss = 2020.9414956569672 | accuracy = 0.5758389261744966


Epoch[1] Batch[750] Speed: 1.2876906585796304 samples/sec                   batch loss = 2033.3455675840378 | accuracy = 0.5766666666666667


Epoch[1] Batch[755] Speed: 1.2831993832278354 samples/sec                   batch loss = 2045.4687558412552 | accuracy = 0.578476821192053


Epoch[1] Batch[760] Speed: 1.2899145847196642 samples/sec                   batch loss = 2058.1604162454605 | accuracy = 0.5792763157894737


Epoch[1] Batch[765] Speed: 1.2908999458085995 samples/sec                   batch loss = 2070.2251707315445 | accuracy = 0.5807189542483661


Epoch[1] Batch[770] Speed: 1.2878137177683229 samples/sec                   batch loss = 2082.573443174362 | accuracy = 0.5814935064935065


Epoch[1] Batch[775] Speed: 1.2919874935187758 samples/sec                   batch loss = 2095.4089527130127 | accuracy = 0.5812903225806452


Epoch[1] Batch[780] Speed: 1.288612048202395 samples/sec                   batch loss = 2109.120193004608 | accuracy = 0.5810897435897436


Epoch[1] Batch[785] Speed: 1.2990894953276122 samples/sec                   batch loss = 2121.10879778862 | accuracy = 0.5821656050955414


[Epoch 1] training: accuracy=0.5818527918781726
[Epoch 1] time cost: 630.8468821048737
[Epoch 1] validation: validation accuracy=0.6533333333333333


Epoch[2] Batch[5] Speed: 1.289092948037015 samples/sec                   batch loss = 12.229369640350342 | accuracy = 0.55


Epoch[2] Batch[10] Speed: 1.2876926352481999 samples/sec                   batch loss = 25.324400305747986 | accuracy = 0.55


Epoch[2] Batch[15] Speed: 1.2838203528627823 samples/sec                   batch loss = 36.20944941043854 | accuracy = 0.65


Epoch[2] Batch[20] Speed: 1.290201462476092 samples/sec                   batch loss = 48.81137835979462 | accuracy = 0.6375


Epoch[2] Batch[25] Speed: 1.2894887689026382 samples/sec                   batch loss = 60.72331368923187 | accuracy = 0.64


Epoch[2] Batch[30] Speed: 1.2880865086709978 samples/sec                   batch loss = 75.41748201847076 | accuracy = 0.6


Epoch[2] Batch[35] Speed: 1.2780120765520673 samples/sec                   batch loss = 88.49802124500275 | accuracy = 0.6071428571428571


Epoch[2] Batch[40] Speed: 1.2796439467259437 samples/sec                   batch loss = 100.86852538585663 | accuracy = 0.60625


Epoch[2] Batch[45] Speed: 1.2823252123506117 samples/sec                   batch loss = 114.38182580471039 | accuracy = 0.5777777777777777


Epoch[2] Batch[50] Speed: 1.2881808605063283 samples/sec                   batch loss = 126.24532639980316 | accuracy = 0.59


Epoch[2] Batch[55] Speed: 1.2851122312047767 samples/sec                   batch loss = 138.06960880756378 | accuracy = 0.6


Epoch[2] Batch[60] Speed: 1.2836501254138413 samples/sec                   batch loss = 151.7269730567932 | accuracy = 0.6125


Epoch[2] Batch[65] Speed: 1.2864001035424888 samples/sec                   batch loss = 163.55204963684082 | accuracy = 0.6192307692307693


Epoch[2] Batch[70] Speed: 1.2902238863762432 samples/sec                   batch loss = 177.34261536598206 | accuracy = 0.6214285714285714


Epoch[2] Batch[75] Speed: 1.285278120430779 samples/sec                   batch loss = 190.0626392364502 | accuracy = 0.62


Epoch[2] Batch[80] Speed: 1.2848366642647926 samples/sec                   batch loss = 201.35783398151398 | accuracy = 0.625


Epoch[2] Batch[85] Speed: 1.287281322476738 samples/sec                   batch loss = 212.65572905540466 | accuracy = 0.6323529411764706


Epoch[2] Batch[90] Speed: 1.288725681252167 samples/sec                   batch loss = 223.76767492294312 | accuracy = 0.6388888888888888


Epoch[2] Batch[95] Speed: 1.293745952035393 samples/sec                   batch loss = 235.90982174873352 | accuracy = 0.6447368421052632


Epoch[2] Batch[100] Speed: 1.291547183859848 samples/sec                   batch loss = 248.57716369628906 | accuracy = 0.6475


Epoch[2] Batch[105] Speed: 1.290642542775399 samples/sec                   batch loss = 260.2808916568756 | accuracy = 0.6476190476190476


Epoch[2] Batch[110] Speed: 1.2870855892074924 samples/sec                   batch loss = 273.11314964294434 | accuracy = 0.65


Epoch[2] Batch[115] Speed: 1.2838127884227701 samples/sec                   batch loss = 285.01577591896057 | accuracy = 0.65


Epoch[2] Batch[120] Speed: 1.289514636997887 samples/sec                   batch loss = 298.7531032562256 | accuracy = 0.6458333333333334


Epoch[2] Batch[125] Speed: 1.2884096765946735 samples/sec                   batch loss = 313.826073884964 | accuracy = 0.64


Epoch[2] Batch[130] Speed: 1.2866690395001619 samples/sec                   batch loss = 325.22569239139557 | accuracy = 0.6403846153846153


Epoch[2] Batch[135] Speed: 1.28315443437075 samples/sec                   batch loss = 336.4248481988907 | accuracy = 0.6425925925925926


Epoch[2] Batch[140] Speed: 1.288344377075429 samples/sec                   batch loss = 348.66361832618713 | accuracy = 0.6392857142857142


Epoch[2] Batch[145] Speed: 1.288796365574977 samples/sec                   batch loss = 359.8920520544052 | accuracy = 0.6431034482758621


Epoch[2] Batch[150] Speed: 1.2840464430488687 samples/sec                   batch loss = 371.4925733804703 | accuracy = 0.645


Epoch[2] Batch[155] Speed: 1.2796466795857682 samples/sec                   batch loss = 382.8137010335922 | accuracy = 0.6483870967741936


Epoch[2] Batch[160] Speed: 1.2795022444195332 samples/sec                   batch loss = 396.28300869464874 | accuracy = 0.6484375


Epoch[2] Batch[165] Speed: 1.2871568837379408 samples/sec                   batch loss = 409.04469430446625 | accuracy = 0.646969696969697


Epoch[2] Batch[170] Speed: 1.2844923717355905 samples/sec                   batch loss = 422.7417720556259 | accuracy = 0.6455882352941177


Epoch[2] Batch[175] Speed: 1.2863031523796877 samples/sec                   batch loss = 435.6516044139862 | accuracy = 0.6442857142857142


Epoch[2] Batch[180] Speed: 1.281303670177115 samples/sec                   batch loss = 446.88496363162994 | accuracy = 0.6444444444444445


Epoch[2] Batch[185] Speed: 1.2871343688224273 samples/sec                   batch loss = 461.1435023546219 | accuracy = 0.6472972972972973


Epoch[2] Batch[190] Speed: 1.2799321325048758 samples/sec                   batch loss = 471.31514513492584 | accuracy = 0.65


Epoch[2] Batch[195] Speed: 1.287226210924226 samples/sec                   batch loss = 484.0693506002426 | accuracy = 0.6448717948717949


Epoch[2] Batch[200] Speed: 1.281733203070925 samples/sec                   batch loss = 497.66407573223114 | accuracy = 0.64375


Epoch[2] Batch[205] Speed: 1.282013318034242 samples/sec                   batch loss = 511.8837558031082 | accuracy = 0.6390243902439025


Epoch[2] Batch[210] Speed: 1.287002257690635 samples/sec                   batch loss = 523.6077910661697 | accuracy = 0.6404761904761904


Epoch[2] Batch[215] Speed: 1.2819237854660124 samples/sec                   batch loss = 536.9899572134018 | accuracy = 0.6395348837209303


Epoch[2] Batch[220] Speed: 1.287900219225016 samples/sec                   batch loss = 547.1747236251831 | accuracy = 0.6443181818181818


Epoch[2] Batch[225] Speed: 1.285854980234555 samples/sec                   batch loss = 558.1606973409653 | accuracy = 0.6466666666666666


Epoch[2] Batch[230] Speed: 1.2858392121571363 samples/sec                   batch loss = 567.7828164100647 | accuracy = 0.65


Epoch[2] Batch[235] Speed: 1.2836782152351063 samples/sec                   batch loss = 577.5637035369873 | accuracy = 0.652127659574468


Epoch[2] Batch[240] Speed: 1.2877787250517672 samples/sec                   batch loss = 588.2703351974487 | accuracy = 0.6552083333333333


Epoch[2] Batch[245] Speed: 1.278846645645281 samples/sec                   batch loss = 601.0752561092377 | accuracy = 0.6540816326530612


Epoch[2] Batch[250] Speed: 1.28681055718709 samples/sec                   batch loss = 613.6717482805252 | accuracy = 0.656


Epoch[2] Batch[255] Speed: 1.283210571860779 samples/sec                   batch loss = 625.3323264122009 | accuracy = 0.6568627450980392


Epoch[2] Batch[260] Speed: 1.2837588575313061 samples/sec                   batch loss = 637.520383477211 | accuracy = 0.6557692307692308


Epoch[2] Batch[265] Speed: 1.28579250149753 samples/sec                   batch loss = 649.1156142950058 | accuracy = 0.6566037735849056


Epoch[2] Batch[270] Speed: 1.2779038290158202 samples/sec                   batch loss = 660.9341419935226 | accuracy = 0.6537037037037037


Epoch[2] Batch[275] Speed: 1.2828457659523644 samples/sec                   batch loss = 674.0897251367569 | accuracy = 0.6536363636363637


Epoch[2] Batch[280] Speed: 1.2864965760706744 samples/sec                   batch loss = 685.4431297779083 | accuracy = 0.65625


Epoch[2] Batch[285] Speed: 1.2797068054552625 samples/sec                   batch loss = 696.3337113857269 | accuracy = 0.6570175438596492


Epoch[2] Batch[290] Speed: 1.2792379539532115 samples/sec                   batch loss = 710.6912069320679 | accuracy = 0.6543103448275862


Epoch[2] Batch[295] Speed: 1.2770195537438525 samples/sec                   batch loss = 723.5457751750946 | accuracy = 0.6550847457627119


Epoch[2] Batch[300] Speed: 1.277762511812506 samples/sec                   batch loss = 737.0525461435318 | accuracy = 0.6558333333333334


Epoch[2] Batch[305] Speed: 1.2794325758694818 samples/sec                   batch loss = 750.4178160429001 | accuracy = 0.6557377049180327


Epoch[2] Batch[310] Speed: 1.280257962133636 samples/sec                   batch loss = 761.8971868753433 | accuracy = 0.6556451612903226


Epoch[2] Batch[315] Speed: 1.274408904294392 samples/sec                   batch loss = 775.3392368555069 | accuracy = 0.653968253968254


Epoch[2] Batch[320] Speed: 1.2838534606166168 samples/sec                   batch loss = 786.9754832983017 | accuracy = 0.6546875


Epoch[2] Batch[325] Speed: 1.2782793668465215 samples/sec                   batch loss = 800.5214475393295 | accuracy = 0.6538461538461539


Epoch[2] Batch[330] Speed: 1.2842027187822085 samples/sec                   batch loss = 813.9901303052902 | accuracy = 0.6530303030303031


Epoch[2] Batch[335] Speed: 1.2821610643183075 samples/sec                   batch loss = 826.3019050359726 | accuracy = 0.6529850746268657


Epoch[2] Batch[340] Speed: 1.2778763806668332 samples/sec                   batch loss = 838.5447804927826 | accuracy = 0.6514705882352941


Epoch[2] Batch[345] Speed: 1.2811714813387944 samples/sec                   batch loss = 854.0961806774139 | accuracy = 0.65


Epoch[2] Batch[350] Speed: 1.286128026390391 samples/sec                   batch loss = 866.8449976444244 | accuracy = 0.6507142857142857


Epoch[2] Batch[355] Speed: 1.2824563649782394 samples/sec                   batch loss = 877.6255097389221 | accuracy = 0.6528169014084507


Epoch[2] Batch[360] Speed: 1.2852534066207664 samples/sec                   batch loss = 893.0200109481812 | accuracy = 0.6513888888888889


Epoch[2] Batch[365] Speed: 1.284815903115851 samples/sec                   batch loss = 904.3758842945099 | accuracy = 0.6520547945205479


Epoch[2] Batch[370] Speed: 1.2914109843878065 samples/sec                   batch loss = 915.9348278045654 | accuracy = 0.652027027027027


Epoch[2] Batch[375] Speed: 1.2898507192111015 samples/sec                   batch loss = 929.352126121521 | accuracy = 0.6513333333333333


Epoch[2] Batch[380] Speed: 1.2901665383078864 samples/sec                   batch loss = 939.9898612499237 | accuracy = 0.6526315789473685


Epoch[2] Batch[385] Speed: 1.2916146977501157 samples/sec                   batch loss = 952.2877572774887 | accuracy = 0.6525974025974026


Epoch[2] Batch[390] Speed: 1.2915659756771165 samples/sec                   batch loss = 964.1497396230698 | accuracy = 0.6532051282051282


Epoch[2] Batch[395] Speed: 1.287866803570639 samples/sec                   batch loss = 974.632182598114 | accuracy = 0.6544303797468355


Epoch[2] Batch[400] Speed: 1.287528295980503 samples/sec                   batch loss = 986.7718331813812 | accuracy = 0.655625


Epoch[2] Batch[405] Speed: 1.2936199611386978 samples/sec                   batch loss = 998.7132196426392 | accuracy = 0.6555555555555556


Epoch[2] Batch[410] Speed: 1.2905974680860672 samples/sec                   batch loss = 1008.4253482818604 | accuracy = 0.6579268292682927


Epoch[2] Batch[415] Speed: 1.2893961081904457 samples/sec                   batch loss = 1023.2681986093521 | accuracy = 0.6578313253012048


Epoch[2] Batch[420] Speed: 1.2905948868115085 samples/sec                   batch loss = 1032.0134811401367 | accuracy = 0.6601190476190476


Epoch[2] Batch[425] Speed: 1.287990094188581 samples/sec                   batch loss = 1045.1099343299866 | accuracy = 0.6611764705882353


Epoch[2] Batch[430] Speed: 1.2882045000810445 samples/sec                   batch loss = 1057.2596986293793 | accuracy = 0.6616279069767442


Epoch[2] Batch[435] Speed: 1.281591918795006 samples/sec                   batch loss = 1066.7955451011658 | accuracy = 0.6626436781609195


Epoch[2] Batch[440] Speed: 1.282220348759139 samples/sec                   batch loss = 1077.8234686851501 | accuracy = 0.6630681818181818


Epoch[2] Batch[445] Speed: 1.2786869927798596 samples/sec                   batch loss = 1090.9214128255844 | accuracy = 0.6617977528089888


Epoch[2] Batch[450] Speed: 1.2855524977112143 samples/sec                   batch loss = 1104.9610821008682 | accuracy = 0.6611111111111111


Epoch[2] Batch[455] Speed: 1.2843444809916478 samples/sec                   batch loss = 1118.6180795431137 | accuracy = 0.6598901098901099


Epoch[2] Batch[460] Speed: 1.2896019618731018 samples/sec                   batch loss = 1129.8768646717072 | accuracy = 0.6603260869565217


Epoch[2] Batch[465] Speed: 1.2867219322867947 samples/sec                   batch loss = 1140.4050390720367 | accuracy = 0.660752688172043


Epoch[2] Batch[470] Speed: 1.2877337513303546 samples/sec                   batch loss = 1151.9609359502792 | accuracy = 0.6611702127659574


Epoch[2] Batch[475] Speed: 1.2865842819568714 samples/sec                   batch loss = 1162.1502161026 | accuracy = 0.6626315789473685


Epoch[2] Batch[480] Speed: 1.2944192169979427 samples/sec                   batch loss = 1178.7169246673584 | accuracy = 0.6609375


Epoch[2] Batch[485] Speed: 1.285764516145843 samples/sec                   batch loss = 1190.1787289381027 | accuracy = 0.661340206185567


Epoch[2] Batch[490] Speed: 1.289434459190774 samples/sec                   batch loss = 1200.605305671692 | accuracy = 0.6627551020408163


Epoch[2] Batch[495] Speed: 1.2916195701595916 samples/sec                   batch loss = 1212.9703195095062 | accuracy = 0.6631313131313131


Epoch[2] Batch[500] Speed: 1.288878147308955 samples/sec                   batch loss = 1222.7904158830643 | accuracy = 0.664


Epoch[2] Batch[505] Speed: 1.2912570241068742 samples/sec                   batch loss = 1232.3807760477066 | accuracy = 0.6658415841584159


Epoch[2] Batch[510] Speed: 1.2924837609599125 samples/sec                   batch loss = 1244.4274278879166 | accuracy = 0.6666666666666666


Epoch[2] Batch[515] Speed: 1.2884913103028366 samples/sec                   batch loss = 1254.8301049470901 | accuracy = 0.6684466019417475


Epoch[2] Batch[520] Speed: 1.288767259326562 samples/sec                   batch loss = 1267.6383947134018 | accuracy = 0.6673076923076923


Epoch[2] Batch[525] Speed: 1.288206577239955 samples/sec                   batch loss = 1280.1331387758255 | accuracy = 0.6680952380952381


Epoch[2] Batch[530] Speed: 1.289441693614552 samples/sec                   batch loss = 1291.6321215629578 | accuracy = 0.6683962264150943


Epoch[2] Batch[535] Speed: 1.2861486327726874 samples/sec                   batch loss = 1304.2571634054184 | accuracy = 0.6672897196261682


Epoch[2] Batch[540] Speed: 1.2840981376033085 samples/sec                   batch loss = 1314.5462223291397 | accuracy = 0.6680555555555555


Epoch[2] Batch[545] Speed: 1.2908658777074788 samples/sec                   batch loss = 1326.5768048763275 | accuracy = 0.6678899082568808


Epoch[2] Batch[550] Speed: 1.2917544212203176 samples/sec                   batch loss = 1334.9680671691895 | accuracy = 0.67


Epoch[2] Batch[555] Speed: 1.2899924417580948 samples/sec                   batch loss = 1346.456369638443 | accuracy = 0.6702702702702703


Epoch[2] Batch[560] Speed: 1.2869398648519594 samples/sec                   batch loss = 1360.1232670545578 | accuracy = 0.6691964285714286


Epoch[2] Batch[565] Speed: 1.2839246922852403 samples/sec                   batch loss = 1371.8568774461746 | accuracy = 0.668141592920354


Epoch[2] Batch[570] Speed: 1.284596328630073 samples/sec                   batch loss = 1384.304539680481 | accuracy = 0.6675438596491228


Epoch[2] Batch[575] Speed: 1.285349313331062 samples/sec                   batch loss = 1396.0950411558151 | accuracy = 0.6673913043478261


Epoch[2] Batch[580] Speed: 1.2881216170159773 samples/sec                   batch loss = 1407.4655927419662 | accuracy = 0.6689655172413793


Epoch[2] Batch[585] Speed: 1.2873038425336993 samples/sec                   batch loss = 1418.2439374923706 | accuracy = 0.6696581196581196


Epoch[2] Batch[590] Speed: 1.2835718536215968 samples/sec                   batch loss = 1432.679869055748 | accuracy = 0.6690677966101695


Epoch[2] Batch[595] Speed: 1.2867004193974283 samples/sec                   batch loss = 1443.065831065178 | accuracy = 0.6697478991596638


Epoch[2] Batch[600] Speed: 1.2848002589018772 samples/sec                   batch loss = 1452.6760358810425 | accuracy = 0.6708333333333333


Epoch[2] Batch[605] Speed: 1.2872789519902617 samples/sec                   batch loss = 1463.6627566814423 | accuracy = 0.6714876033057852


Epoch[2] Batch[610] Speed: 1.2815023471883313 samples/sec                   batch loss = 1480.0981771945953 | accuracy = 0.6700819672131147


Epoch[2] Batch[615] Speed: 1.2894975897073946 samples/sec                   batch loss = 1491.5036840438843 | accuracy = 0.6707317073170732


Epoch[2] Batch[620] Speed: 1.2889439960775182 samples/sec                   batch loss = 1507.193041563034 | accuracy = 0.6689516129032258


Epoch[2] Batch[625] Speed: 1.2869354225545935 samples/sec                   batch loss = 1521.240968465805 | accuracy = 0.6676


Epoch[2] Batch[630] Speed: 1.288854383992077 samples/sec                   batch loss = 1534.6120207309723 | accuracy = 0.6674603174603174


Epoch[2] Batch[635] Speed: 1.2859467384075238 samples/sec                   batch loss = 1545.3642246723175 | accuracy = 0.668503937007874


Epoch[2] Batch[640] Speed: 1.2857177123822823 samples/sec                   batch loss = 1557.2818216085434 | accuracy = 0.66875


Epoch[2] Batch[645] Speed: 1.284466704742597 samples/sec                   batch loss = 1567.8506435155869 | accuracy = 0.6689922480620155


Epoch[2] Batch[650] Speed: 1.283696189437054 samples/sec                   batch loss = 1579.1469420194626 | accuracy = 0.6688461538461539


Epoch[2] Batch[655] Speed: 1.278577461614782 samples/sec                   batch loss = 1591.7904282808304 | accuracy = 0.6687022900763359


Epoch[2] Batch[660] Speed: 1.2810399066625129 samples/sec                   batch loss = 1603.233071088791 | accuracy = 0.668939393939394


Epoch[2] Batch[665] Speed: 1.2780327157537044 samples/sec                   batch loss = 1615.5096011161804 | accuracy = 0.6687969924812031


Epoch[2] Batch[670] Speed: 1.2802056971128937 samples/sec                   batch loss = 1627.5332264900208 | accuracy = 0.6694029850746268


Epoch[2] Batch[675] Speed: 1.280960388010634 samples/sec                   batch loss = 1638.7647253274918 | accuracy = 0.6692592592592592


Epoch[2] Batch[680] Speed: 1.2832623954422704 samples/sec                   batch loss = 1648.8445188999176 | accuracy = 0.6705882352941176


Epoch[2] Batch[685] Speed: 1.286811741566859 samples/sec                   batch loss = 1658.2628645896912 | accuracy = 0.6715328467153284


Epoch[2] Batch[690] Speed: 1.284503681272626 samples/sec                   batch loss = 1671.704501748085 | accuracy = 0.6717391304347826


Epoch[2] Batch[695] Speed: 1.2918136014817512 samples/sec                   batch loss = 1682.4067202210426 | accuracy = 0.6723021582733812


Epoch[2] Batch[700] Speed: 1.28871588109677 samples/sec                   batch loss = 1692.8781281113625 | accuracy = 0.6728571428571428


Epoch[2] Batch[705] Speed: 1.284617279407514 samples/sec                   batch loss = 1704.2442848086357 | accuracy = 0.673049645390071


Epoch[2] Batch[710] Speed: 1.2924656393994107 samples/sec                   batch loss = 1715.680477321148 | accuracy = 0.672887323943662


Epoch[2] Batch[715] Speed: 1.2827625902096893 samples/sec                   batch loss = 1726.9360154271126 | accuracy = 0.6730769230769231


Epoch[2] Batch[720] Speed: 1.283934321448261 samples/sec                   batch loss = 1739.4547813534737 | accuracy = 0.6739583333333333


Epoch[2] Batch[725] Speed: 1.2865330776301565 samples/sec                   batch loss = 1751.7789333462715 | accuracy = 0.6741379310344827


Epoch[2] Batch[730] Speed: 1.290446282374963 samples/sec                   batch loss = 1762.8813289999962 | accuracy = 0.673972602739726


Epoch[2] Batch[735] Speed: 1.2878924089019954 samples/sec                   batch loss = 1776.1053332686424 | accuracy = 0.6738095238095239


Epoch[2] Batch[740] Speed: 1.2794857535591317 samples/sec                   batch loss = 1787.8895129561424 | accuracy = 0.6733108108108108


Epoch[2] Batch[745] Speed: 1.2850276786358492 samples/sec                   batch loss = 1798.0961995124817 | accuracy = 0.6734899328859061


Epoch[2] Batch[750] Speed: 1.2797192998514502 samples/sec                   batch loss = 1807.881465792656 | accuracy = 0.6746666666666666


Epoch[2] Batch[755] Speed: 1.278519195563671 samples/sec                   batch loss = 1817.501327753067 | accuracy = 0.6754966887417219


Epoch[2] Batch[760] Speed: 1.2804874902783576 samples/sec                   batch loss = 1832.18545460701 | accuracy = 0.6746710526315789


Epoch[2] Batch[765] Speed: 1.284335140632007 samples/sec                   batch loss = 1841.530129134655 | accuracy = 0.6751633986928105


Epoch[2] Batch[770] Speed: 1.2801074308909592 samples/sec                   batch loss = 1853.747889816761 | accuracy = 0.6746753246753247


Epoch[2] Batch[775] Speed: 1.2831343163610023 samples/sec                   batch loss = 1866.8906357884407 | accuracy = 0.6745161290322581


Epoch[2] Batch[780] Speed: 1.2853108110737712 samples/sec                   batch loss = 1884.8909559845924 | accuracy = 0.6721153846153847


Epoch[2] Batch[785] Speed: 1.2864868097884306 samples/sec                   batch loss = 1896.9065216183662 | accuracy = 0.6722929936305733


[Epoch 2] training: accuracy=0.6725888324873096
[Epoch 2] time cost: 628.9886920452118
[Epoch 2] validation: validation accuracy=0.7411111111111112
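
If you want to compare epochs without scrolling through the per-batch lines, you can pull the per-epoch summary lines out of the log. The following is a minimal sketch, not part of the original tutorial: the helper name `parse_epoch_summaries` and the dictionary keys are hypothetical, and the regular expression assumes the summary lines keep exactly the format printed above.

```python
import re

# Summary lines copied verbatim from the training output above.
log = """\
[Epoch 1] training: accuracy=0.5818527918781726
[Epoch 1] time cost: 630.8468821048737
[Epoch 1] validation: validation accuracy=0.6533333333333333
[Epoch 2] training: accuracy=0.6725888324873096
[Epoch 2] time cost: 628.9886920452118
[Epoch 2] validation: validation accuracy=0.7411111111111112
"""

# Map the label printed in the log to a short key (keys are illustrative).
LABELS = {
    "training: accuracy": "train_acc",
    "time cost": "time_s",
    "validation: validation accuracy": "val_acc",
}

PATTERN = re.compile(
    r"\[Epoch (\d+)\] "
    r"(training: accuracy|time cost|validation: validation accuracy)"
    r"[=:]\s*([\d.]+)"
)


def parse_epoch_summaries(text):
    """Hypothetical helper: collect per-epoch metrics from summary lines."""
    summaries = {}
    for epoch, label, value in PATTERN.findall(text):
        summaries.setdefault(int(epoch), {})[LABELS[label]] = float(value)
    return summaries


summaries = parse_epoch_summaries(log)
for epoch in sorted(summaries):
    metrics = summaries[epoch]
    print(
        f"epoch {epoch}: train={metrics['train_acc']:.3f} "
        f"val={metrics['val_acc']:.3f} time={metrics['time_s']:.0f}s"
    )
```

In this run, both training and validation accuracy rise from the first epoch to the second, while the time per epoch stays at roughly 630 seconds.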


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you want to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).