<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and inference on GPUs are often faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device the array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[09:32:19] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet reports which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If only one GPU is available, the code falls back to `npx.cpu()`, which is why the output below shows no `ctx=gpu(1)`:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[09:32:19] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
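
As a rough sketch of the zero-based indexing rule above, the helper below picks which device IDs to use given how many GPUs are visible. `pick_device_ids` and its arguments are hypothetical names introduced for illustration only. In real code the count would come from `npx.num_gpus()` and each ID would be passed to `npx.gpu(i)`.

```python
def pick_device_ids(num_available, num_requested):
    """Return the zero-based GPU IDs to use, or [] to signal a CPU fallback.

    Hypothetical helper: `num_available` stands in for npx.num_gpus().
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    # Valid IDs run from 0 to num_available - 1, so never hand out more than exist.
    return list(range(min(num_available, num_requested)))


# On an eight-GPU machine the last valid ID is 7.
all_ids = pick_device_ids(num_available=8, num_requested=8)
print(all_ids[-1])

# Asking for two GPUs when only one (or none) is present returns fewer IDs.
print(pick_device_ids(8, 2))
print(pick_device_ids(1, 2))
print(pick_device_ids(0, 2))
```

An empty result is a signal to fall back to the CPU, which mirrors the `if npx.num_gpus() > 0 else npx.cpu()` pattern used earlier in this step.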

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and parameters to the GPU. To demonstrate this, you can reuse the previously defined LeafNetwork in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. If the parameters are already loaded on another device, you can instead use `net.collect_params().reset_ctx(gpu)` to move them.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[09:32:19] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 5.915921 , -6.1112623]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
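
To make the idea concrete, here is a minimal pure-Python sketch of data parallelism on a toy one-parameter model. It is not MXNet code: `split_batch` and `grad_sum` are hypothetical helpers, and plain Python lists stand in for per-GPU data. It only illustrates that summing the gradients computed on each part gives the same result as processing the whole batch at once.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(batch), num_parts)
    chunks, start = [], 0
    for i in range(num_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(batch[start:end])
        start = end
    return chunks


def grad_sum(w, chunk):
    """Sum of d/dw (w*x - y)^2 over one chunk (stands in for one device's backward pass)."""
    return sum(2.0 * (w * x - y) * x for x, y in chunk)


# Toy data for the model y = w * x (the true w is 2.0).
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w, lr, n_devices = 0.0, 0.01, 2

# Each "device" handles its own chunk; the partial gradients are then added.
partials = [grad_sum(w, chunk) for chunk in split_batch(batch, n_devices)]
total_grad = sum(partials)

# Summing per-chunk gradients matches the gradient of the whole batch.
assert abs(total_grad - grad_sum(w, batch)) < 1e-12

# Normalize by the full batch size before applying the update.
w -= lr * total_grad / len(batch)
print(w)
```

Dividing by the full batch size in the last step plays the same role as the `batch_size` argument passed to `trainer.step` in the training loop later in this section.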

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
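
The helper above passes lists of per-device labels and outputs to a single metric. As a hedged illustration of how such a metric can accumulate over several shards, here is a small pure-Python stand-in. `RunningAccuracy` is a hypothetical class written for this sketch, not the `gluon.metric.Accuracy` implementation, and nested lists stand in for the per-device arrays.

```python
def argmax(scores):
    """Index of the largest score in a list."""
    return max(range(len(scores)), key=lambda i: scores[i])


class RunningAccuracy:
    """Hypothetical stand-in for an accuracy metric that accumulates across updates."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, label_shards, output_shards):
        # One shard per device; every shard adds to the same counters.
        for labels, outputs in zip(label_shards, output_shards):
            for label, scores in zip(labels, outputs):
                self.correct += int(argmax(scores) == label)
                self.total += 1

    def get(self):
        return self.correct / self.total if self.total else float("nan")


metric = RunningAccuracy()
# Two shards, as if a batch of four samples were split across two devices.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],  # predicted classes 0 and 1: both correct
    [[1.5, 0.2], [0.8, 0.1]],   # predicted classes 0 and 0: the first is wrong
]
metric.update(label_shards, output_shards)
print(metric.get())  # 0.75
```

Because the counters are shared, the reported accuracy covers every sample in the batch regardless of which device produced the prediction.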

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7798545483314177 samples/sec                   batch loss = 13.635015964508057 | accuracy = 0.6


Epoch[1] Batch[10] Speed: 1.2585888562645065 samples/sec                   batch loss = 28.671812295913696 | accuracy = 0.525


Epoch[1] Batch[15] Speed: 1.25679156488499 samples/sec                   batch loss = 44.5319881439209 | accuracy = 0.45


Epoch[1] Batch[20] Speed: 1.2564236518838399 samples/sec                   batch loss = 59.37200427055359 | accuracy = 0.4125


Epoch[1] Batch[25] Speed: 1.2531735837068767 samples/sec                   batch loss = 74.44500398635864 | accuracy = 0.41


Epoch[1] Batch[30] Speed: 1.2571011965797876 samples/sec                   batch loss = 88.4896171092987 | accuracy = 0.4166666666666667


Epoch[1] Batch[35] Speed: 1.2565899343026958 samples/sec                   batch loss = 101.99705338478088 | accuracy = 0.45


Epoch[1] Batch[40] Speed: 1.258452911199089 samples/sec                   batch loss = 115.22491335868835 | accuracy = 0.48125


Epoch[1] Batch[45] Speed: 1.2598087656390888 samples/sec                   batch loss = 129.54189348220825 | accuracy = 0.4888888888888889


Epoch[1] Batch[50] Speed: 1.2617847349402473 samples/sec                   batch loss = 143.5057291984558 | accuracy = 0.49


Epoch[1] Batch[55] Speed: 1.2542556637582494 samples/sec                   batch loss = 157.93718647956848 | accuracy = 0.4772727272727273


Epoch[1] Batch[60] Speed: 1.2589481191285612 samples/sec                   batch loss = 171.95459175109863 | accuracy = 0.4791666666666667


Epoch[1] Batch[65] Speed: 1.2481059378859365 samples/sec                   batch loss = 185.95533204078674 | accuracy = 0.4807692307692308


Epoch[1] Batch[70] Speed: 1.2549413849611233 samples/sec                   batch loss = 199.9077501296997 | accuracy = 0.48214285714285715


Epoch[1] Batch[75] Speed: 1.2590646119144837 samples/sec                   batch loss = 214.38027238845825 | accuracy = 0.47


Epoch[1] Batch[80] Speed: 1.2535952322599542 samples/sec                   batch loss = 228.27655601501465 | accuracy = 0.46875


Epoch[1] Batch[85] Speed: 1.2555491121278406 samples/sec                   batch loss = 241.95308232307434 | accuracy = 0.47058823529411764


Epoch[1] Batch[90] Speed: 1.2595696632081563 samples/sec                   batch loss = 255.02279829978943 | accuracy = 0.4888888888888889


Epoch[1] Batch[95] Speed: 1.2850604549700377 samples/sec                   batch loss = 268.5839579105377 | accuracy = 0.49736842105263157


Epoch[1] Batch[100] Speed: 1.2862889511984712 samples/sec                   batch loss = 282.2711992263794 | accuracy = 0.4975


Epoch[1] Batch[105] Speed: 1.2805540483099338 samples/sec                   batch loss = 296.51146054267883 | accuracy = 0.4928571428571429


Epoch[1] Batch[110] Speed: 1.2789652902269313 samples/sec                   batch loss = 309.8777139186859 | accuracy = 0.5022727272727273


Epoch[1] Batch[115] Speed: 1.2747851026596446 samples/sec                   batch loss = 323.922251701355 | accuracy = 0.5043478260869565


Epoch[1] Batch[120] Speed: 1.2771583732879488 samples/sec                   batch loss = 337.7584912776947 | accuracy = 0.5041666666666667


Epoch[1] Batch[125] Speed: 1.272886820370115 samples/sec                   batch loss = 351.12332820892334 | accuracy = 0.51


Epoch[1] Batch[130] Speed: 1.2795031226429556 samples/sec                   batch loss = 365.04604744911194 | accuracy = 0.5076923076923077


Epoch[1] Batch[135] Speed: 1.282223680609431 samples/sec                   batch loss = 379.0030517578125 | accuracy = 0.5018518518518519


Epoch[1] Batch[140] Speed: 1.2801567574764823 samples/sec                   batch loss = 392.55414938926697 | accuracy = 0.5


Epoch[1] Batch[145] Speed: 1.2753216540702055 samples/sec                   batch loss = 406.4163303375244 | accuracy = 0.5


Epoch[1] Batch[150] Speed: 1.2808841063316332 samples/sec                   batch loss = 419.9822008609772 | accuracy = 0.5066666666666667


Epoch[1] Batch[155] Speed: 1.2782558953527459 samples/sec                   batch loss = 434.0248644351959 | accuracy = 0.5048387096774194


Epoch[1] Batch[160] Speed: 1.2757272015268686 samples/sec                   batch loss = 447.8020088672638 | accuracy = 0.503125


Epoch[1] Batch[165] Speed: 1.2799039134635277 samples/sec                   batch loss = 461.4947853088379 | accuracy = 0.503030303030303


Epoch[1] Batch[170] Speed: 1.2793760855149374 samples/sec                   batch loss = 475.3832802772522 | accuracy = 0.5044117647058823


Epoch[1] Batch[175] Speed: 1.2815388595665527 samples/sec                   batch loss = 488.69961428642273 | accuracy = 0.5057142857142857


Epoch[1] Batch[180] Speed: 1.282963780235424 samples/sec                   batch loss = 502.70529890060425 | accuracy = 0.5027777777777778


Epoch[1] Batch[185] Speed: 1.2802899093972084 samples/sec                   batch loss = 516.7195692062378 | accuracy = 0.49864864864864866


Epoch[1] Batch[190] Speed: 1.2802048179247427 samples/sec                   batch loss = 530.8903551101685 | accuracy = 0.49605263157894736


Epoch[1] Batch[195] Speed: 1.2782514154145521 samples/sec                   batch loss = 543.6266527175903 | accuracy = 0.5038461538461538


Epoch[1] Batch[200] Speed: 1.277913660094272 samples/sec                   batch loss = 557.7509415149689 | accuracy = 0.50125


Epoch[1] Batch[205] Speed: 1.2773624772579568 samples/sec                   batch loss = 571.3440001010895 | accuracy = 0.5060975609756098


Epoch[1] Batch[210] Speed: 1.2838351872831781 samples/sec                   batch loss = 584.8592855930328 | accuracy = 0.5107142857142857


Epoch[1] Batch[215] Speed: 1.2802731051332057 samples/sec                   batch loss = 598.6884667873383 | accuracy = 0.5104651162790698


Epoch[1] Batch[220] Speed: 1.2850307298155996 samples/sec                   batch loss = 612.2026858329773 | accuracy = 0.5136363636363637


Epoch[1] Batch[225] Speed: 1.2801139749915174 samples/sec                   batch loss = 625.6061131954193 | accuracy = 0.5133333333333333


Epoch[1] Batch[230] Speed: 1.2830754380062745 samples/sec                   batch loss = 639.1732225418091 | accuracy = 0.5173913043478261


Epoch[1] Batch[235] Speed: 1.2822985538117835 samples/sec                   batch loss = 652.8283216953278 | accuracy = 0.5170212765957447


Epoch[1] Batch[240] Speed: 1.2782624205362934 samples/sec                   batch loss = 666.1051094532013 | accuracy = 0.51875


Epoch[1] Batch[245] Speed: 1.2793528664013487 samples/sec                   batch loss = 679.341070652008 | accuracy = 0.5224489795918368


Epoch[1] Batch[250] Speed: 1.2836886264601417 samples/sec                   batch loss = 693.3758594989777 | accuracy = 0.521


Epoch[1] Batch[255] Speed: 1.2765678215004215 samples/sec                   batch loss = 707.0643410682678 | accuracy = 0.5225490196078432


Epoch[1] Batch[260] Speed: 1.2808957435902213 samples/sec                   batch loss = 720.4392416477203 | accuracy = 0.5240384615384616


Epoch[1] Batch[265] Speed: 1.2796519501341073 samples/sec                   batch loss = 733.7820687294006 | accuracy = 0.5245283018867924


Epoch[1] Batch[270] Speed: 1.2768144904592509 samples/sec                   batch loss = 747.2908506393433 | accuracy = 0.524074074074074


Epoch[1] Batch[275] Speed: 1.2827103166366005 samples/sec                   batch loss = 760.5352108478546 | accuracy = 0.5245454545454545


Epoch[1] Batch[280] Speed: 1.2834517637729461 samples/sec                   batch loss = 774.6790432929993 | accuracy = 0.5258928571428572


Epoch[1] Batch[285] Speed: 1.2841924958105644 samples/sec                   batch loss = 788.585057258606 | accuracy = 0.525438596491228


Epoch[1] Batch[290] Speed: 1.2801427893644752 samples/sec                   batch loss = 802.308385848999 | accuracy = 0.525


Epoch[1] Batch[295] Speed: 1.2767232536124022 samples/sec                   batch loss = 816.0400819778442 | accuracy = 0.5245762711864407


Epoch[1] Batch[300] Speed: 1.2836006274619975 samples/sec                   batch loss = 830.4563798904419 | accuracy = 0.5233333333333333


Epoch[1] Batch[305] Speed: 1.2819950970450993 samples/sec                   batch loss = 844.154655456543 | accuracy = 0.5270491803278688


Epoch[1] Batch[310] Speed: 1.2778836806445502 samples/sec                   batch loss = 857.6954283714294 | accuracy = 0.5290322580645161


Epoch[1] Batch[315] Speed: 1.284086638644747 samples/sec                   batch loss = 871.0018005371094 | accuracy = 0.530952380952381


Epoch[1] Batch[320] Speed: 1.2915767141039332 samples/sec                   batch loss = 883.617607831955 | accuracy = 0.5359375


Epoch[1] Batch[325] Speed: 1.2907533564963787 samples/sec                   batch loss = 897.6894359588623 | accuracy = 0.5338461538461539


Epoch[1] Batch[330] Speed: 1.282323840192879 samples/sec                   batch loss = 911.2792844772339 | accuracy = 0.5333333333333333


Epoch[1] Batch[335] Speed: 1.2794325758694818 samples/sec                   batch loss = 925.395003080368 | accuracy = 0.5313432835820896


Epoch[1] Batch[340] Speed: 1.2822628801483698 samples/sec                   batch loss = 939.1130776405334 | accuracy = 0.5308823529411765


Epoch[1] Batch[345] Speed: 1.2717681268401249 samples/sec                   batch loss = 953.1018970012665 | accuracy = 0.5297101449275362


Epoch[1] Batch[350] Speed: 1.2727570383081508 samples/sec                   batch loss = 967.4261088371277 | accuracy = 0.5278571428571428


Epoch[1] Batch[355] Speed: 1.277701108913307 samples/sec                   batch loss = 980.971999168396 | accuracy = 0.530281690140845


Epoch[1] Batch[360] Speed: 1.2739563081970215 samples/sec                   batch loss = 994.62579870224 | accuracy = 0.5305555555555556


Epoch[1] Batch[365] Speed: 1.2731307161425538 samples/sec                   batch loss = 1007.3229820728302 | accuracy = 0.5335616438356164


Epoch[1] Batch[370] Speed: 1.273631067300784 samples/sec                   batch loss = 1020.5808215141296 | accuracy = 0.5358108108108108


Epoch[1] Batch[375] Speed: 1.2769355767390194 samples/sec                   batch loss = 1034.258803844452 | accuracy = 0.5346666666666666


Epoch[1] Batch[380] Speed: 1.279500487976305 samples/sec                   batch loss = 1048.0707728862762 | accuracy = 0.5342105263157895


Epoch[1] Batch[385] Speed: 1.2786564897783488 samples/sec                   batch loss = 1060.8690972328186 | accuracy = 0.5363636363636364


Epoch[1] Batch[390] Speed: 1.275482503945871 samples/sec                   batch loss = 1074.899501800537 | accuracy = 0.5365384615384615


Epoch[1] Batch[395] Speed: 1.2768930092815387 samples/sec                   batch loss = 1088.9969975948334 | accuracy = 0.5354430379746835


Epoch[1] Batch[400] Speed: 1.2816230515314002 samples/sec                   batch loss = 1102.402592420578 | accuracy = 0.53625


Epoch[1] Batch[405] Speed: 1.2816433179933315 samples/sec                   batch loss = 1115.7412214279175 | accuracy = 0.5388888888888889


Epoch[1] Batch[410] Speed: 1.2808882135752246 samples/sec                   batch loss = 1129.3910641670227 | accuracy = 0.5384146341463415


Epoch[1] Batch[415] Speed: 1.281677292678934 samples/sec                   batch loss = 1144.2486741542816 | accuracy = 0.536144578313253


Epoch[1] Batch[420] Speed: 1.2793160882697674 samples/sec                   batch loss = 1157.7228546142578 | accuracy = 0.5375


Epoch[1] Batch[425] Speed: 1.2878273595119876 samples/sec                   batch loss = 1171.5465989112854 | accuracy = 0.5376470588235294


Epoch[1] Batch[430] Speed: 1.2817841239591607 samples/sec                   batch loss = 1185.393919467926 | accuracy = 0.538953488372093


Epoch[1] Batch[435] Speed: 1.2813557312989543 samples/sec                   batch loss = 1198.9520428180695 | accuracy = 0.5402298850574713


Epoch[1] Batch[440] Speed: 1.2833529004457753 samples/sec                   batch loss = 1213.1596796512604 | accuracy = 0.540340909090909


Epoch[1] Batch[445] Speed: 1.280449474184473 samples/sec                   batch loss = 1226.8903081417084 | accuracy = 0.5398876404494382


Epoch[1] Batch[450] Speed: 1.2734024439570102 samples/sec                   batch loss = 1240.1389091014862 | accuracy = 0.5411111111111111


Epoch[1] Batch[455] Speed: 1.2841051157042112 samples/sec                   batch loss = 1253.4927062988281 | accuracy = 0.5417582417582417


Epoch[1] Batch[460] Speed: 1.28088400854044 samples/sec                   batch loss = 1266.8886141777039 | accuracy = 0.5423913043478261


Epoch[1] Batch[465] Speed: 1.2826175488903249 samples/sec                   batch loss = 1280.83291721344 | accuracy = 0.5424731182795699


Epoch[1] Batch[470] Speed: 1.2784556740808317 samples/sec                   batch loss = 1294.4083762168884 | accuracy = 0.5425531914893617


Epoch[1] Batch[475] Speed: 1.2837455965307714 samples/sec                   batch loss = 1308.3597722053528 | accuracy = 0.5426315789473685


Epoch[1] Batch[480] Speed: 1.2845027961712403 samples/sec                   batch loss = 1321.7491488456726 | accuracy = 0.54375


Epoch[1] Batch[485] Speed: 1.2785867184161062 samples/sec                   batch loss = 1334.7662618160248 | accuracy = 0.5448453608247422


Epoch[1] Batch[490] Speed: 1.2807257041782327 samples/sec                   batch loss = 1348.1790082454681 | accuracy = 0.5464285714285714


Epoch[1] Batch[495] Speed: 1.2833164809539301 samples/sec                   batch loss = 1360.9770922660828 | accuracy = 0.547979797979798


Epoch[1] Batch[500] Speed: 1.283162874309019 samples/sec                   batch loss = 1374.764883518219 | accuracy = 0.548


Epoch[1] Batch[505] Speed: 1.2805909954018544 samples/sec                   batch loss = 1387.8508455753326 | accuracy = 0.549009900990099


Epoch[1] Batch[510] Speed: 1.2805619653644391 samples/sec                   batch loss = 1402.0969138145447 | accuracy = 0.5495098039215687


Epoch[1] Batch[515] Speed: 1.2729159863503767 samples/sec                   batch loss = 1414.6315023899078 | accuracy = 0.5514563106796116


Epoch[1] Batch[520] Speed: 1.2804808446286975 samples/sec                   batch loss = 1427.3373093605042 | accuracy = 0.5533653846153846


Epoch[1] Batch[525] Speed: 1.2755732725944136 samples/sec                   batch loss = 1440.0642719268799 | accuracy = 0.5528571428571428


Epoch[1] Batch[530] Speed: 1.2804345224575722 samples/sec                   batch loss = 1452.633812904358 | accuracy = 0.5533018867924528


Epoch[1] Batch[535] Speed: 1.2835335559718084 samples/sec                   batch loss = 1466.3788442611694 | accuracy = 0.5542056074766355


Epoch[1] Batch[540] Speed: 1.2760665195016916 samples/sec                   batch loss = 1479.6561906337738 | accuracy = 0.5537037037037037


Epoch[1] Batch[545] Speed: 1.2772907076831812 samples/sec                   batch loss = 1492.649676322937 | accuracy = 0.5532110091743119


Epoch[1] Batch[550] Speed: 1.2843737811065439 samples/sec                   batch loss = 1506.5183866024017 | accuracy = 0.5531818181818182


Epoch[1] Batch[555] Speed: 1.2721492304267306 samples/sec                   batch loss = 1519.274717092514 | accuracy = 0.5536036036036036


Epoch[1] Batch[560] Speed: 1.2880965959431891 samples/sec                   batch loss = 1531.4159104824066 | accuracy = 0.55625


Epoch[1] Batch[565] Speed: 1.2761808628581535 samples/sec                   batch loss = 1544.2184960842133 | accuracy = 0.5579646017699115


Epoch[1] Batch[570] Speed: 1.280518667114541 samples/sec                   batch loss = 1556.9832181930542 | accuracy = 0.5583333333333333


Epoch[1] Batch[575] Speed: 1.2767424909680973 samples/sec                   batch loss = 1569.2548325061798 | accuracy = 0.5595652173913044


Epoch[1] Batch[580] Speed: 1.2796052975297614 samples/sec                   batch loss = 1581.408712387085 | accuracy = 0.5607758620689656


Epoch[1] Batch[585] Speed: 1.2802825819022707 samples/sec                   batch loss = 1593.8273603916168 | accuracy = 0.5615384615384615


Epoch[1] Batch[590] Speed: 1.283632348924378 samples/sec                   batch loss = 1605.9044387340546 | accuracy = 0.5627118644067797


Epoch[1] Batch[595] Speed: 1.2810007818608287 samples/sec                   batch loss = 1619.677412033081 | accuracy = 0.5617647058823529


Epoch[1] Batch[600] Speed: 1.2872929774955664 samples/sec                   batch loss = 1633.0613782405853 | accuracy = 0.5625


Epoch[1] Batch[605] Speed: 1.2733072487187782 samples/sec                   batch loss = 1645.5421133041382 | accuracy = 0.5640495867768595


Epoch[1] Batch[610] Speed: 1.2728110143885651 samples/sec                   batch loss = 1658.7228260040283 | accuracy = 0.5639344262295082


Epoch[1] Batch[615] Speed: 1.2783267020243354 samples/sec                   batch loss = 1671.2234094142914 | accuracy = 0.5654471544715447


Epoch[1] Batch[620] Speed: 1.2765623820619767 samples/sec                   batch loss = 1683.552670121193 | accuracy = 0.5669354838709677


Epoch[1] Batch[625] Speed: 1.2792968708299668 samples/sec                   batch loss = 1695.1377614736557 | accuracy = 0.5684


Epoch[1] Batch[630] Speed: 1.2789682151865096 samples/sec                   batch loss = 1709.2205618619919 | accuracy = 0.5674603174603174


Epoch[1] Batch[635] Speed: 1.2797074887362476 samples/sec                   batch loss = 1722.4730303287506 | accuracy = 0.5677165354330709


Epoch[1] Batch[640] Speed: 1.2757430135966739 samples/sec                   batch loss = 1735.6744601726532 | accuracy = 0.5671875


Epoch[1] Batch[645] Speed: 1.273034498965279 samples/sec                   batch loss = 1749.7080881595612 | accuracy = 0.5670542635658915


Epoch[1] Batch[650] Speed: 1.2791611945749382 samples/sec                   batch loss = 1762.221833229065 | accuracy = 0.5676923076923077


Epoch[1] Batch[655] Speed: 1.283307351863552 samples/sec                   batch loss = 1777.8268358707428 | accuracy = 0.566030534351145


Epoch[1] Batch[660] Speed: 1.276705376993452 samples/sec                   batch loss = 1790.9566161632538 | accuracy = 0.5666666666666667


Epoch[1] Batch[665] Speed: 1.282228090437913 samples/sec                   batch loss = 1805.4581208229065 | accuracy = 0.5661654135338345


Epoch[1] Batch[670] Speed: 1.2813547526674245 samples/sec                   batch loss = 1817.0466121435165 | accuracy = 0.567910447761194


Epoch[1] Batch[675] Speed: 1.278080616847075 samples/sec                   batch loss = 1830.1401742696762 | accuracy = 0.5685185185185185


Epoch[1] Batch[680] Speed: 1.2748765470796646 samples/sec                   batch loss = 1843.8549019098282 | accuracy = 0.5683823529411764


Epoch[1] Batch[685] Speed: 1.2736390923595469 samples/sec                   batch loss = 1857.4623678922653 | accuracy = 0.5686131386861314


Epoch[1] Batch[690] Speed: 1.2726071073171652 samples/sec                   batch loss = 1871.111298918724 | accuracy = 0.5684782608695652


Epoch[1] Batch[695] Speed: 1.2809414145390514 samples/sec                   batch loss = 1884.9367226362228 | accuracy = 0.5690647482014388


Epoch[1] Batch[700] Speed: 1.2788307565470536 samples/sec                   batch loss = 1896.4720126390457 | accuracy = 0.5710714285714286


Epoch[1] Batch[705] Speed: 1.2812023979719125 samples/sec                   batch loss = 1909.8418000936508 | accuracy = 0.5716312056737589


Epoch[1] Batch[710] Speed: 1.2784370669890264 samples/sec                   batch loss = 1922.9312435388565 | accuracy = 0.5714788732394366


Epoch[1] Batch[715] Speed: 1.2786403130582118 samples/sec                   batch loss = 1936.889863371849 | accuracy = 0.570979020979021


Epoch[1] Batch[720] Speed: 1.2837492309999603 samples/sec                   batch loss = 1949.9926429986954 | accuracy = 0.5715277777777777


Epoch[1] Batch[725] Speed: 1.2819944113190618 samples/sec                   batch loss = 1962.3423773050308 | accuracy = 0.5720689655172414


Epoch[1] Batch[730] Speed: 1.2802245511049646 samples/sec                   batch loss = 1974.8947954177856 | accuracy = 0.572945205479452


Epoch[1] Batch[735] Speed: 1.2846873170136677 samples/sec                   batch loss = 1987.1919302940369 | accuracy = 0.5738095238095238


Epoch[1] Batch[740] Speed: 1.2881673101789706 samples/sec                   batch loss = 2000.9442493915558 | accuracy = 0.5733108108108108


Epoch[1] Batch[745] Speed: 1.290033109680237 samples/sec                   batch loss = 2013.1578421592712 | accuracy = 0.5738255033557047


Epoch[1] Batch[750] Speed: 1.2861641126425154 samples/sec                   batch loss = 2025.791952252388 | accuracy = 0.5733333333333334


Epoch[1] Batch[755] Speed: 1.2860988433615095 samples/sec                   batch loss = 2038.0293926000595 | accuracy = 0.573841059602649


Epoch[1] Batch[760] Speed: 1.2815294620800262 samples/sec                   batch loss = 2049.8071843385696 | accuracy = 0.5746710526315789


Epoch[1] Batch[765] Speed: 1.2798521655137185 samples/sec                   batch loss = 2062.831607222557 | accuracy = 0.5748366013071895


Epoch[1] Batch[770] Speed: 1.2738146053791837 samples/sec                   batch loss = 2074.9680486917496 | accuracy = 0.5756493506493506


Epoch[1] Batch[775] Speed: 1.2683129651410063 samples/sec                   batch loss = 2088.201954603195 | accuracy = 0.5764516129032258


Epoch[1] Batch[780] Speed: 1.276691095489131 samples/sec                   batch loss = 2100.895369052887 | accuracy = 0.5772435897435897


Epoch[1] Batch[785] Speed: 1.2812538637779756 samples/sec                   batch loss = 2114.563069820404 | accuracy = 0.5777070063694267


[Epoch 1] training: accuracy=0.5783629441624365
[Epoch 1] time cost: 634.5635509490967
[Epoch 1] validation: validation accuracy=0.6777777777777778


Epoch[2] Batch[5] Speed: 1.2796162283787722 samples/sec                   batch loss = 13.20943009853363 | accuracy = 0.6


Epoch[2] Batch[10] Speed: 1.2843480205318496 samples/sec                   batch loss = 27.93158257007599 | accuracy = 0.55


Epoch[2] Batch[15] Speed: 1.2801589064438466 samples/sec                   batch loss = 42.44639551639557 | accuracy = 0.5166666666666667


Epoch[2] Batch[20] Speed: 1.2873853361661534 samples/sec                   batch loss = 53.91759669780731 | accuracy = 0.5875


Epoch[2] Batch[25] Speed: 1.2852723111230586 samples/sec                   batch loss = 64.93324446678162 | accuracy = 0.61


Epoch[2] Batch[30] Speed: 1.2850543523427065 samples/sec                   batch loss = 77.27899968624115 | accuracy = 0.6333333333333333


Epoch[2] Batch[35] Speed: 1.2849553403923655 samples/sec                   batch loss = 89.37643074989319 | accuracy = 0.6428571428571429


Epoch[2] Batch[40] Speed: 1.2829031517321416 samples/sec                   batch loss = 105.57362008094788 | accuracy = 0.61875


Epoch[2] Batch[45] Speed: 1.2847941587455816 samples/sec                   batch loss = 115.77367031574249 | accuracy = 0.6388888888888888


Epoch[2] Batch[50] Speed: 1.2835807900687304 samples/sec                   batch loss = 128.62781167030334 | accuracy = 0.64


Epoch[2] Batch[55] Speed: 1.2815288747416944 samples/sec                   batch loss = 143.1627037525177 | accuracy = 0.6136363636363636


Epoch[2] Batch[60] Speed: 1.279595635595502 samples/sec                   batch loss = 155.7120988368988 | accuracy = 0.6125


Epoch[2] Batch[65] Speed: 1.2888467601135756 samples/sec                   batch loss = 168.16333901882172 | accuracy = 0.6230769230769231


Epoch[2] Batch[70] Speed: 1.2852806804813601 samples/sec                   batch loss = 181.41164815425873 | accuracy = 0.6285714285714286


Epoch[2] Batch[75] Speed: 1.2782661214162705 samples/sec                   batch loss = 194.13837337493896 | accuracy = 0.6233333333333333


Epoch[2] Batch[80] Speed: 1.287442140860487 samples/sec                   batch loss = 207.4734148979187 | accuracy = 0.621875


Epoch[2] Batch[85] Speed: 1.2799453148364022 samples/sec                   batch loss = 219.78116476535797 | accuracy = 0.6147058823529412


Epoch[2] Batch[90] Speed: 1.2837108245415538 samples/sec                   batch loss = 230.53790271282196 | accuracy = 0.6305555555555555


Epoch[2] Batch[95] Speed: 1.2831269562711496 samples/sec                   batch loss = 241.03961193561554 | accuracy = 0.6342105263157894


Epoch[2] Batch[100] Speed: 1.28655359833527 samples/sec                   batch loss = 253.54887461662292 | accuracy = 0.6375


Epoch[2] Batch[105] Speed: 1.2843694548309605 samples/sec                   batch loss = 264.1215522289276 | accuracy = 0.6452380952380953


Epoch[2] Batch[110] Speed: 1.2795000000762642 samples/sec                   batch loss = 274.30764985084534 | accuracy = 0.65


Epoch[2] Batch[115] Speed: 1.2837271298160668 samples/sec                   batch loss = 285.6853264570236 | accuracy = 0.6543478260869565


Epoch[2] Batch[120] Speed: 1.280089752386661 samples/sec                   batch loss = 298.3537392616272 | accuracy = 0.6541666666666667


Epoch[2] Batch[125] Speed: 1.2826084297411484 samples/sec                   batch loss = 311.2059648036957 | accuracy = 0.652


Epoch[2] Batch[130] Speed: 1.2877529265548935 samples/sec                   batch loss = 323.3215024471283 | accuracy = 0.6557692307692308


Epoch[2] Batch[135] Speed: 1.2818148743009397 samples/sec                   batch loss = 334.5871444940567 | accuracy = 0.6611111111111111


Epoch[2] Batch[140] Speed: 1.2815817373477993 samples/sec                   batch loss = 348.02734076976776 | accuracy = 0.6571428571428571


Epoch[2] Batch[145] Speed: 1.288325283163755 samples/sec                   batch loss = 361.4815193414688 | accuracy = 0.6551724137931034


Epoch[2] Batch[150] Speed: 1.285336117924259 samples/sec                   batch loss = 373.29035234451294 | accuracy = 0.6566666666666666


Epoch[2] Batch[155] Speed: 1.2800051758021789 samples/sec                   batch loss = 385.47798895835876 | accuracy = 0.6532258064516129


Epoch[2] Batch[160] Speed: 1.281026212710072 samples/sec                   batch loss = 401.20396876335144 | accuracy = 0.64375


Epoch[2] Batch[165] Speed: 1.2857477649336642 samples/sec                   batch loss = 412.97808134555817 | accuracy = 0.6454545454545455


Epoch[2] Batch[170] Speed: 1.2844213720616187 samples/sec                   batch loss = 425.7405515909195 | accuracy = 0.6470588235294118


Epoch[2] Batch[175] Speed: 1.281066806350045 samples/sec                   batch loss = 436.87861001491547 | accuracy = 0.6471428571428571


Epoch[2] Batch[180] Speed: 1.2792439039212282 samples/sec                   batch loss = 448.3998934030533 | accuracy = 0.6513888888888889


Epoch[2] Batch[185] Speed: 1.279540692218255 samples/sec                   batch loss = 462.26992404460907 | accuracy = 0.6527027027027027


Epoch[2] Batch[190] Speed: 1.2789765026446351 samples/sec                   batch loss = 477.5535935163498 | accuracy = 0.6473684210526316


Epoch[2] Batch[195] Speed: 1.2844328770160445 samples/sec                   batch loss = 490.41107165813446 | accuracy = 0.6461538461538462


Epoch[2] Batch[200] Speed: 1.280121007532374 samples/sec                   batch loss = 502.73161494731903 | accuracy = 0.645


Epoch[2] Batch[205] Speed: 1.2700502139155803 samples/sec                   batch loss = 515.0190547704697 | accuracy = 0.6463414634146342


Epoch[2] Batch[210] Speed: 1.2736734175473907 samples/sec                   batch loss = 528.5305181741714 | accuracy = 0.6428571428571429


Epoch[2] Batch[215] Speed: 1.2747362860101525 samples/sec                   batch loss = 539.1758435964584 | accuracy = 0.6441860465116279


Epoch[2] Batch[220] Speed: 1.2784198242513507 samples/sec                   batch loss = 550.1832336187363 | accuracy = 0.6454545454545455


Epoch[2] Batch[225] Speed: 1.2734467121747666 samples/sec                   batch loss = 564.6963675022125 | accuracy = 0.6444444444444445


Epoch[2] Batch[230] Speed: 1.2847596250819955 samples/sec                   batch loss = 575.6157684326172 | accuracy = 0.6445652173913043


Epoch[2] Batch[235] Speed: 1.2829846777950535 samples/sec                   batch loss = 587.2462937831879 | accuracy = 0.6478723404255319


Epoch[2] Batch[240] Speed: 1.2779783932490825 samples/sec                   batch loss = 599.1082487106323 | accuracy = 0.65


Epoch[2] Batch[245] Speed: 1.2803225421931703 samples/sec                   batch loss = 610.6130822896957 | accuracy = 0.6520408163265307


Epoch[2] Batch[250] Speed: 1.2818753999839854 samples/sec                   batch loss = 623.6477261781693 | accuracy = 0.651


Epoch[2] Batch[255] Speed: 1.2782660240244175 samples/sec                   batch loss = 636.5144785642624 | accuracy = 0.65


Epoch[2] Batch[260] Speed: 1.288222700177379 samples/sec                   batch loss = 649.7204457521439 | accuracy = 0.65


Epoch[2] Batch[265] Speed: 1.280749168691739 samples/sec                   batch loss = 662.1883682012558 | accuracy = 0.65


Epoch[2] Batch[270] Speed: 1.2797763086301368 samples/sec                   batch loss = 674.9256035089493 | accuracy = 0.649074074074074


Epoch[2] Batch[275] Speed: 1.2766399955804795 samples/sec                   batch loss = 687.0192947387695 | accuracy = 0.649090909090909


Epoch[2] Batch[280] Speed: 1.2762698862607669 samples/sec                   batch loss = 700.1538712978363 | accuracy = 0.6464285714285715


Epoch[2] Batch[285] Speed: 1.2822189768258163 samples/sec                   batch loss = 711.7841355800629 | accuracy = 0.6482456140350877


Epoch[2] Batch[290] Speed: 1.2833051923127818 samples/sec                   batch loss = 725.5551764965057 | accuracy = 0.6474137931034483


Epoch[2] Batch[295] Speed: 1.2873516508804361 samples/sec                   batch loss = 739.9706590175629 | accuracy = 0.6457627118644068


Epoch[2] Batch[300] Speed: 1.273778338393988 samples/sec                   batch loss = 751.9359321594238 | accuracy = 0.6483333333333333


Epoch[2] Batch[305] Speed: 1.2780306712747393 samples/sec                   batch loss = 764.148598909378 | accuracy = 0.6491803278688525


Epoch[2] Batch[310] Speed: 1.27591056801131 samples/sec                   batch loss = 778.8348090648651 | accuracy = 0.646774193548387


Epoch[2] Batch[315] Speed: 1.2755586284876312 samples/sec                   batch loss = 789.5206569433212 | accuracy = 0.6476190476190476


Epoch[2] Batch[320] Speed: 1.2790979018398585 samples/sec                   batch loss = 800.4978359937668 | accuracy = 0.64921875


Epoch[2] Batch[325] Speed: 1.2776469118917033 samples/sec                   batch loss = 814.9388879537582 | accuracy = 0.6476923076923077


Epoch[2] Batch[330] Speed: 1.2764350541665743 samples/sec                   batch loss = 825.9320709705353 | accuracy = 0.646969696969697


Epoch[2] Batch[335] Speed: 1.2819595382193163 samples/sec                   batch loss = 838.3748590946198 | accuracy = 0.6462686567164179


Epoch[2] Batch[340] Speed: 1.2813251986994605 samples/sec                   batch loss = 849.6522351503372 | accuracy = 0.6485294117647059


Epoch[2] Batch[345] Speed: 1.2749326407734882 samples/sec                   batch loss = 860.0203813314438 | accuracy = 0.6514492753623189


Epoch[2] Batch[350] Speed: 1.2803584011057783 samples/sec                   batch loss = 874.0904177427292 | accuracy = 0.6507142857142857


Epoch[2] Batch[355] Speed: 1.278419239759919 samples/sec                   batch loss = 885.4685142040253 | accuracy = 0.652112676056338


Epoch[2] Batch[360] Speed: 1.2761160204769222 samples/sec                   batch loss = 899.6486749649048 | accuracy = 0.6506944444444445


Epoch[2] Batch[365] Speed: 1.2810474384611503 samples/sec                   batch loss = 911.3780955076218 | accuracy = 0.6506849315068494


Epoch[2] Batch[370] Speed: 1.2814892306488985 samples/sec                   batch loss = 925.0690702199936 | accuracy = 0.6506756756756756


Epoch[2] Batch[375] Speed: 1.2785797027228054 samples/sec                   batch loss = 937.1319597959518 | accuracy = 0.652


Epoch[2] Batch[380] Speed: 1.28147983389021 samples/sec                   batch loss = 950.4403451681137 | accuracy = 0.6513157894736842


Epoch[2] Batch[385] Speed: 1.2806202222573004 samples/sec                   batch loss = 962.1769820451736 | accuracy = 0.6519480519480519


Epoch[2] Batch[390] Speed: 1.2835001701417195 samples/sec                   batch loss = 974.7374602556229 | accuracy = 0.6512820512820513


Epoch[2] Batch[395] Speed: 1.275070716227572 samples/sec                   batch loss = 988.0731381177902 | accuracy = 0.6506329113924051


Epoch[2] Batch[400] Speed: 1.2803274274854226 samples/sec                   batch loss = 1003.3345047235489 | accuracy = 0.648125


Epoch[2] Batch[405] Speed: 1.2756030467489534 samples/sec                   batch loss = 1014.4850820302963 | accuracy = 0.6475308641975308


Epoch[2] Batch[410] Speed: 1.2831620891937987 samples/sec                   batch loss = 1028.3994253873825 | accuracy = 0.6463414634146342


Epoch[2] Batch[415] Speed: 1.2829263036881535 samples/sec                   batch loss = 1041.9410518407822 | accuracy = 0.6451807228915662


Epoch[2] Batch[420] Speed: 1.2818645284450383 samples/sec                   batch loss = 1055.3529698848724 | accuracy = 0.6452380952380953


Epoch[2] Batch[425] Speed: 1.2807580658779205 samples/sec                   batch loss = 1067.7069469690323 | accuracy = 0.6458823529411765


Epoch[2] Batch[430] Speed: 1.277938189781242 samples/sec                   batch loss = 1081.69129550457 | accuracy = 0.6459302325581395


Epoch[2] Batch[435] Speed: 1.2759714107596234 samples/sec                   batch loss = 1092.30238199234 | accuracy = 0.6471264367816092


Epoch[2] Batch[440] Speed: 1.2763617380328078 samples/sec                   batch loss = 1104.898662686348 | accuracy = 0.6465909090909091


Epoch[2] Batch[445] Speed: 1.2784152457494402 samples/sec                   batch loss = 1117.1312124729156 | accuracy = 0.6477528089887641


Epoch[2] Batch[450] Speed: 1.2799376006983303 samples/sec                   batch loss = 1129.8323743343353 | accuracy = 0.6477777777777778


Epoch[2] Batch[455] Speed: 1.2824672465583202 samples/sec                   batch loss = 1140.4458357095718 | accuracy = 0.65


Epoch[2] Batch[460] Speed: 1.2750890316394636 samples/sec                   batch loss = 1154.3905481100082 | accuracy = 0.6489130434782608


Epoch[2] Batch[465] Speed: 1.2772755379113063 samples/sec                   batch loss = 1167.329473376274 | accuracy = 0.6473118279569893


Epoch[2] Batch[470] Speed: 1.2783983932483267 samples/sec                   batch loss = 1177.295955836773 | accuracy = 0.6478723404255319


Epoch[2] Batch[475] Speed: 1.2705656583633262 samples/sec                   batch loss = 1187.5780166983604 | accuracy = 0.6505263157894737


Epoch[2] Batch[480] Speed: 1.2737498098168807 samples/sec                   batch loss = 1199.2260589003563 | accuracy = 0.6510416666666666


Epoch[2] Batch[485] Speed: 1.2691061098083007 samples/sec                   batch loss = 1214.32534044981 | accuracy = 0.6494845360824743


Epoch[2] Batch[490] Speed: 1.281005476710846 samples/sec                   batch loss = 1224.24444013834 | accuracy = 0.6510204081632653


Epoch[2] Batch[495] Speed: 1.2735046134628891 samples/sec                   batch loss = 1236.7948155999184 | accuracy = 0.6515151515151515


Epoch[2] Batch[500] Speed: 1.277209417473023 samples/sec                   batch loss = 1250.495760023594 | accuracy = 0.65


Epoch[2] Batch[505] Speed: 1.2769054487929605 samples/sec                   batch loss = 1262.3178904652596 | accuracy = 0.6504950495049505


Epoch[2] Batch[510] Speed: 1.274926149540825 samples/sec                   batch loss = 1275.3140545487404 | accuracy = 0.6490196078431373


Epoch[2] Batch[515] Speed: 1.2757140089342174 samples/sec                   batch loss = 1286.1669571995735 | accuracy = 0.6504854368932039


Epoch[2] Batch[520] Speed: 1.2778074729867956 samples/sec                   batch loss = 1298.127646267414 | accuracy = 0.6509615384615385


Epoch[2] Batch[525] Speed: 1.2826534384141999 samples/sec                   batch loss = 1308.9750730395317 | accuracy = 0.6514285714285715


Epoch[2] Batch[530] Speed: 1.2787561902870201 samples/sec                   batch loss = 1324.127345263958 | accuracy = 0.65


Epoch[2] Batch[535] Speed: 1.2778706380763074 samples/sec                   batch loss = 1334.780650317669 | accuracy = 0.6509345794392524


Epoch[2] Batch[540] Speed: 1.2732691746085907 samples/sec                   batch loss = 1349.4397649168968 | accuracy = 0.6504629629629629


Epoch[2] Batch[545] Speed: 1.2800611357323433 samples/sec                   batch loss = 1361.0668138861656 | accuracy = 0.6509174311926605


Epoch[2] Batch[550] Speed: 1.2737078412839404 samples/sec                   batch loss = 1375.5376362204552 | accuracy = 0.65


Epoch[2] Batch[555] Speed: 1.2771362067982477 samples/sec                   batch loss = 1388.6529821753502 | accuracy = 0.6495495495495496


Epoch[2] Batch[560] Speed: 1.2823650062023109 samples/sec                   batch loss = 1399.4324741959572 | accuracy = 0.65


Epoch[2] Batch[565] Speed: 1.2815543265632152 samples/sec                   batch loss = 1411.4261842370033 | accuracy = 0.6508849557522124


Epoch[2] Batch[570] Speed: 1.279164120430642 samples/sec                   batch loss = 1424.248241007328 | accuracy = 0.6508771929824562


Epoch[2] Batch[575] Speed: 1.2782888141225743 samples/sec                   batch loss = 1433.9419657588005 | accuracy = 0.6521739130434783


Epoch[2] Batch[580] Speed: 1.283408662594776 samples/sec                   batch loss = 1444.2351949810982 | accuracy = 0.653448275862069


Epoch[2] Batch[585] Speed: 1.2805098709882117 samples/sec                   batch loss = 1458.480570614338 | accuracy = 0.6525641025641026


Epoch[2] Batch[590] Speed: 1.2813562206152798 samples/sec                   batch loss = 1470.849937736988 | accuracy = 0.652542372881356


Epoch[2] Batch[595] Speed: 1.2852405085555452 samples/sec                   batch loss = 1483.7687893509865 | accuracy = 0.6525210084033614


Epoch[2] Batch[600] Speed: 1.283266714263008 samples/sec                   batch loss = 1495.3184151053429 | accuracy = 0.6533333333333333


Epoch[2] Batch[605] Speed: 1.2863830398671632 samples/sec                   batch loss = 1504.3140476346016 | accuracy = 0.6545454545454545


Epoch[2] Batch[610] Speed: 1.2820323232736954 samples/sec                   batch loss = 1517.4969283938408 | accuracy = 0.6540983606557377


Epoch[2] Batch[615] Speed: 1.2833777375650621 samples/sec                   batch loss = 1527.681730210781 | accuracy = 0.6548780487804878


Epoch[2] Batch[620] Speed: 1.2817057858068555 samples/sec                   batch loss = 1540.6859955191612 | accuracy = 0.6544354838709677


Epoch[2] Batch[625] Speed: 1.2882516829140267 samples/sec                   batch loss = 1553.8020831942558 | accuracy = 0.654


Epoch[2] Batch[630] Speed: 1.278063383723135 samples/sec                   batch loss = 1565.6610817313194 | accuracy = 0.6535714285714286


Epoch[2] Batch[635] Speed: 1.2770243166595903 samples/sec                   batch loss = 1576.9723961949348 | accuracy = 0.6543307086614173


Epoch[2] Batch[640] Speed: 1.2760678783007287 samples/sec                   batch loss = 1589.6668806672096 | accuracy = 0.65234375


Epoch[2] Batch[645] Speed: 1.2788612678636753 samples/sec                   batch loss = 1601.9987266659737 | accuracy = 0.6527131782945736


Epoch[2] Batch[650] Speed: 1.274415874291055 samples/sec                   batch loss = 1613.4356063008308 | accuracy = 0.6526923076923077


Epoch[2] Batch[655] Speed: 1.2801888950594364 samples/sec                   batch loss = 1624.2860435843468 | accuracy = 0.6534351145038167


Epoch[2] Batch[660] Speed: 1.2774144131179794 samples/sec                   batch loss = 1637.8951866030693 | accuracy = 0.6530303030303031


Epoch[2] Batch[665] Speed: 1.2794913155177945 samples/sec                   batch loss = 1646.4470275044441 | accuracy = 0.6545112781954887


Epoch[2] Batch[670] Speed: 1.2764654512719373 samples/sec                   batch loss = 1657.3747008442879 | accuracy = 0.6548507462686567


Epoch[2] Batch[675] Speed: 1.2785675228825737 samples/sec                   batch loss = 1668.321094572544 | accuracy = 0.6555555555555556


Epoch[2] Batch[680] Speed: 1.2723315697216135 samples/sec                   batch loss = 1679.9455713629723 | accuracy = 0.6551470588235294


Epoch[2] Batch[685] Speed: 1.2649096182181432 samples/sec                   batch loss = 1692.744323670864 | accuracy = 0.6558394160583941


Epoch[2] Batch[690] Speed: 1.277368993323307 samples/sec                   batch loss = 1706.0940670371056 | accuracy = 0.6554347826086957


Epoch[2] Batch[695] Speed: 1.2822819907926353 samples/sec                   batch loss = 1717.0972065329552 | accuracy = 0.6561151079136691


Epoch[2] Batch[700] Speed: 1.276054775716314 samples/sec                   batch loss = 1729.0932175517082 | accuracy = 0.6560714285714285


Epoch[2] Batch[705] Speed: 1.2824158792794165 samples/sec                   batch loss = 1739.3765957951546 | accuracy = 0.6563829787234042


Epoch[2] Batch[710] Speed: 1.2753503500497834 samples/sec                   batch loss = 1750.4543334841728 | accuracy = 0.6570422535211268


Epoch[2] Batch[715] Speed: 1.2772897352510983 samples/sec                   batch loss = 1762.4564660191536 | accuracy = 0.6576923076923077


Epoch[2] Batch[720] Speed: 1.2783168645897607 samples/sec                   batch loss = 1771.0561538338661 | accuracy = 0.6586805555555556


Epoch[2] Batch[725] Speed: 1.2778567198083293 samples/sec                   batch loss = 1782.480486214161 | accuracy = 0.6589655172413793


Epoch[2] Batch[730] Speed: 1.2776921568440793 samples/sec                   batch loss = 1793.4310520291328 | accuracy = 0.6589041095890411


Epoch[2] Batch[735] Speed: 1.27788377797815 samples/sec                   batch loss = 1804.1432012915611 | accuracy = 0.6598639455782312


Epoch[2] Batch[740] Speed: 1.2768837769883497 samples/sec                   batch loss = 1814.6617713570595 | accuracy = 0.6601351351351351


Epoch[2] Batch[745] Speed: 1.2781090476492298 samples/sec                   batch loss = 1826.2989960312843 | accuracy = 0.6600671140939597


Epoch[2] Batch[750] Speed: 1.276628532664628 samples/sec                   batch loss = 1838.0752080082893 | accuracy = 0.6606666666666666


Epoch[2] Batch[755] Speed: 1.2819089951840972 samples/sec                   batch loss = 1850.8649992346764 | accuracy = 0.6602649006622516


Epoch[2] Batch[760] Speed: 1.2791263779193776 samples/sec                   batch loss = 1863.4801546931267 | accuracy = 0.6595394736842105


Epoch[2] Batch[765] Speed: 1.2791226720564433 samples/sec                   batch loss = 1876.5069087147713 | accuracy = 0.6594771241830065


Epoch[2] Batch[770] Speed: 1.2807254108772548 samples/sec                   batch loss = 1887.1428273320198 | accuracy = 0.660064935064935


Epoch[2] Batch[775] Speed: 1.2777109368728905 samples/sec                   batch loss = 1898.4748818278313 | accuracy = 0.66


Epoch[2] Batch[780] Speed: 1.2751234350259746 samples/sec                   batch loss = 1908.769997060299 | accuracy = 0.6605769230769231


Epoch[2] Batch[785] Speed: 1.2808721759162944 samples/sec                   batch loss = 1918.4917230010033 | accuracy = 0.6611464968152866


[Epoch 2] training: accuracy=0.6602157360406091
[Epoch 2] time cost: 631.1979684829712
[Epoch 2] validation: validation accuracy=0.7588888888888888


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).