<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference may run faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[03:37:53] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to `npx.cpu()` instead, because copying to a GPU that does not exist raises an error.

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[03:37:53] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])


MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through zero-based indexing with `npx`. As you saw before, the first GPU is accessed using `npx.gpu(0)` and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has, so if your machine has eight GPUs, the last one is accessed using `npx.gpu(7)`. This lets you select which GPUs to use for operations and training, which is particularly useful when you want to use multiple GPUs to train a neural network.
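
As a rough sketch of the indexing described above, the hypothetical helper below (`select_gpu_ids` is not part of MXNet) returns the zero-based IDs to use when you ask for a certain number of GPUs but fewer may be installed. It is plain Python, so it runs without a GPU; the comment at the end shows how the IDs could be turned into devices with `npx.gpu`.

```python
def select_gpu_ids(num_available, num_requested):
    """Return the zero-based IDs of the GPUs to use.

    Hypothetical helper for illustration; it is not an MXNet function.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # IDs start at 0, so a machine with eight GPUs exposes IDs 0 through 7.
    return list(range(min(num_available, num_requested)))


print(select_gpu_ids(num_available=8, num_requested=8))  # IDs 0 through 7
print(select_gpu_ids(num_available=1, num_requested=2))  # only ID 0 exists
print(select_gpu_ids(num_available=0, num_requested=2))  # empty: no GPU found

# With MXNet, the IDs could be mapped to devices like this (CPU fallback
# when the list is empty):
#   ids = select_gpu_ids(npx.num_gpus(), 2)
#   devices = [npx.gpu(i) for i in ids] or [npx.cpu()]
```

Capping the request at the number of installed GPUs avoids referring to a device ID that does not exist on the machine.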

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch
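
To see what the flatten layer receives, you can trace the spatial size of an input through the three convolutional blocks. The sketch below is plain Python and rests on a few assumptions: a 128 x 128 input image (the size used later in this step), and the layer defaults that `conv_block` relies on, namely no padding, a convolution stride of 1, and output sizes rounded down. The helper name `out_size` is made up for this illustration.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling layer along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


size = 128  # assumed input height and width
for filters in (32, 64, 128):
    size = out_size(size, kernel=2, stride=1)  # nn.Conv2D(kernel_size=2)
    size = out_size(size, kernel=4, stride=2)  # nn.MaxPool2D(pool_size=4, strides=2)
    print(f"after conv_block({filters}): {filters} x {size} x {size}")

# Number of features per sample that nn.Flatten passes on to dense1:
# channels * height * width of the last convolutional block.
flattened = 128 * size * size
print(flattened)
```

Batch normalization does not change the shape, so it is left out of the trace.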

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device of parameters that are already loaded.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[03:37:54] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 3.593981 , -5.5166664]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
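
The following plain-Python sketch illustrates the idea on a toy problem. It splits a batch into parts and checks that summing the gradients computed on each part gives the same result as computing the gradient on the whole batch, which is why the parts can be processed on different GPUs. The helpers `split_batch` and `grad_sum`, the linear model `w * x`, and the sample values are assumptions made for this illustration; they are not MXNet APIs.

```python
def split_batch(batch, num_parts):
    """Split a list of samples into num_parts contiguous, near-equal chunks."""
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    base, extra = divmod(len(batch), num_parts)
    chunks, start = [], 0
    for i in range(num_parts):
        size = base + (1 if i < extra else 0)  # first chunks absorb the remainder
        chunks.append(batch[start:start + size])
        start += size
    return chunks


def grad_sum(samples, w):
    """Sum over the samples of d/dw (w * x - y) ** 2."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


batch = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.5)]  # (x, y) pairs
w = 1.5

shards = split_batch(batch, num_parts=2)           # one shard per "GPU"
per_device = [grad_sum(shard, w) for shard in shards]
full_batch = grad_sum(batch, w)

print(per_device, sum(per_device), full_batch)
```

In this toy setup the per-shard gradients add up to the full-batch gradient, so each device only ever needs to see its own part of the batch.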

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
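
To make the last two steps of each transformer concrete: `ToTensor` rescales 8-bit pixel values from [0, 255] to [0, 1], and `Normalize` then subtracts the per-channel mean and divides by the per-channel standard deviation. The plain-Python sketch below applies that arithmetic to a single RGB pixel. The function name `normalize_pixel` is hypothetical, and the sketch ignores the layout change from height x width x channel to channel x height x width that `ToTensor` also performs.

```python
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Scale one 8-bit RGB pixel to [0, 1], then standardize each channel."""
    scaled = [value / 255.0 for value in rgb]  # the scaling done by ToTensor
    # The per-channel arithmetic done by Normalize(mean, std).
    return [(v - m) / s for v, m, s in zip(scaled, mean, std)]


print(normalize_pixel([0, 0, 0]))        # black pixel: negative in every channel
print(normalize_pixel([255, 255, 255]))  # white pixel: positive in every channel
```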

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
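
Because `split_and_load` returns one slice per device, `acc.update` receives a list of label slices and a list of output slices. As a framework-free sketch of what accumulating accuracy over such slices amounts to (an illustration with made-up scores, not MXNet's `Accuracy` implementation), you can count the correct predictions in each slice and divide by the total number of samples:

```python
def count_correct(labels, scores):
    """Count samples whose highest-scoring class matches the label."""
    correct = 0
    for label, row in zip(labels, scores):
        predicted = max(range(len(row)), key=row.__getitem__)  # argmax
        correct += int(predicted == label)
    return correct


# Two hypothetical device slices of one batch of four samples.
label_slices = [[0, 1], [1, 0]]
score_slices = [
    [[2.0, -1.0], [0.3, 0.9]],  # slice on device 0: both predictions correct
    [[1.5, 0.2], [0.8, -0.4]],  # slice on device 1: first wrong, second correct
]

num_correct = sum(count_correct(labels, scores)
                  for labels, scores in zip(label_slices, score_slices))
num_samples = sum(len(labels) for labels in label_slices)
accuracy = num_correct / num_samples
print(accuracy)
```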

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7748451114862924 samples/sec                   batch loss = 14.02406096458435 | accuracy = 0.55


Epoch[1] Batch[10] Speed: 1.258566668779629 samples/sec                   batch loss = 29.910136699676514 | accuracy = 0.475


Epoch[1] Batch[15] Speed: 1.2589648406120906 samples/sec                   batch loss = 44.39330744743347 | accuracy = 0.4666666666666667


Epoch[1] Batch[20] Speed: 1.257236378678289 samples/sec                   batch loss = 57.61926603317261 | accuracy = 0.5375


Epoch[1] Batch[25] Speed: 1.2591807482395938 samples/sec                   batch loss = 70.62174034118652 | accuracy = 0.55


Epoch[1] Batch[30] Speed: 1.259186607596662 samples/sec                   batch loss = 84.09721565246582 | accuracy = 0.5583333333333333


Epoch[1] Batch[35] Speed: 1.2630844475930598 samples/sec                   batch loss = 98.45238399505615 | accuracy = 0.55


Epoch[1] Batch[40] Speed: 1.2617970716185654 samples/sec                   batch loss = 112.49129724502563 | accuracy = 0.54375


Epoch[1] Batch[45] Speed: 1.2641426635610915 samples/sec                   batch loss = 125.45684790611267 | accuracy = 0.55


Epoch[1] Batch[50] Speed: 1.2594082636446462 samples/sec                   batch loss = 138.5456109046936 | accuracy = 0.55


Epoch[1] Batch[55] Speed: 1.2616239057093217 samples/sec                   batch loss = 153.81155610084534 | accuracy = 0.5409090909090909


Epoch[1] Batch[60] Speed: 1.2630795979096554 samples/sec                   batch loss = 168.0045382976532 | accuracy = 0.5333333333333333


Epoch[1] Batch[65] Speed: 1.265810716918587 samples/sec                   batch loss = 182.34842777252197 | accuracy = 0.5307692307692308


Epoch[1] Batch[70] Speed: 1.2643262400497857 samples/sec                   batch loss = 195.177419424057 | accuracy = 0.5357142857142857


Epoch[1] Batch[75] Speed: 1.2480833756869316 samples/sec                   batch loss = 208.98584938049316 | accuracy = 0.5366666666666666


Epoch[1] Batch[80] Speed: 1.25878385653185 samples/sec                   batch loss = 223.09712481498718 | accuracy = 0.5375


Epoch[1] Batch[85] Speed: 1.2595989786395725 samples/sec                   batch loss = 236.61818385124207 | accuracy = 0.5441176470588235


Epoch[1] Batch[90] Speed: 1.2572656798166155 samples/sec                   batch loss = 249.63495087623596 | accuracy = 0.55


Epoch[1] Batch[95] Speed: 1.2605744469814564 samples/sec                   batch loss = 262.6284511089325 | accuracy = 0.5552631578947368


Epoch[1] Batch[100] Speed: 1.252550572842079 samples/sec                   batch loss = 276.4494411945343 | accuracy = 0.5575


Epoch[1] Batch[105] Speed: 1.2534933289341716 samples/sec                   batch loss = 290.1241919994354 | accuracy = 0.5571428571428572


Epoch[1] Batch[110] Speed: 1.2557459912376598 samples/sec                   batch loss = 304.2481322288513 | accuracy = 0.55


Epoch[1] Batch[115] Speed: 1.2553437470327482 samples/sec                   batch loss = 317.5513277053833 | accuracy = 0.5543478260869565


Epoch[1] Batch[120] Speed: 1.2565524769238685 samples/sec                   batch loss = 331.1546437740326 | accuracy = 0.5541666666666667


Epoch[1] Batch[125] Speed: 1.2562879859919365 samples/sec                   batch loss = 344.5840275287628 | accuracy = 0.554


Epoch[1] Batch[130] Speed: 1.257894805899291 samples/sec                   batch loss = 358.39085578918457 | accuracy = 0.551923076923077


Epoch[1] Batch[135] Speed: 1.2632724728513236 samples/sec                   batch loss = 372.4779794216156 | accuracy = 0.5462962962962963


Epoch[1] Batch[140] Speed: 1.2571825849670115 samples/sec                   batch loss = 386.0726330280304 | accuracy = 0.5482142857142858


Epoch[1] Batch[145] Speed: 1.2543660377945334 samples/sec                   batch loss = 400.1064147949219 | accuracy = 0.5448275862068965


Epoch[1] Batch[150] Speed: 1.2521743907101006 samples/sec                   batch loss = 414.068984746933 | accuracy = 0.5433333333333333


Epoch[1] Batch[155] Speed: 1.2570926250366588 samples/sec                   batch loss = 428.73340678215027 | accuracy = 0.535483870967742


Epoch[1] Batch[160] Speed: 1.2599543712689343 samples/sec                   batch loss = 443.08155250549316 | accuracy = 0.534375


Epoch[1] Batch[165] Speed: 1.264437250141764 samples/sec                   batch loss = 456.4150276184082 | accuracy = 0.5333333333333333


Epoch[1] Batch[170] Speed: 1.249163282830618 samples/sec                   batch loss = 470.0918242931366 | accuracy = 0.5352941176470588


Epoch[1] Batch[175] Speed: 1.256625417346765 samples/sec                   batch loss = 484.02769684791565 | accuracy = 0.5314285714285715


Epoch[1] Batch[180] Speed: 1.2574058919466278 samples/sec                   batch loss = 497.9468286037445 | accuracy = 0.5305555555555556


Epoch[1] Batch[185] Speed: 1.2596997017289957 samples/sec                   batch loss = 513.0096170902252 | accuracy = 0.5243243243243243


Epoch[1] Batch[190] Speed: 1.2642311584923975 samples/sec                   batch loss = 527.5739715099335 | accuracy = 0.5223684210526316


Epoch[1] Batch[195] Speed: 1.2554528095156066 samples/sec                   batch loss = 540.233047246933 | accuracy = 0.5256410256410257


Epoch[1] Batch[200] Speed: 1.2522660782975312 samples/sec                   batch loss = 554.2354023456573 | accuracy = 0.525


Epoch[1] Batch[205] Speed: 1.2495118796313343 samples/sec                   batch loss = 568.024327993393 | accuracy = 0.526829268292683


Epoch[1] Batch[210] Speed: 1.260872300163858 samples/sec                   batch loss = 582.0187501907349 | accuracy = 0.525


Epoch[1] Batch[215] Speed: 1.2544943474081345 samples/sec                   batch loss = 595.5882313251495 | accuracy = 0.5244186046511627


Epoch[1] Batch[220] Speed: 1.2473776744352727 samples/sec                   batch loss = 608.9115178585052 | accuracy = 0.5284090909090909


Epoch[1] Batch[225] Speed: 1.2533058626050229 samples/sec                   batch loss = 623.2507729530334 | accuracy = 0.5255555555555556


Epoch[1] Batch[230] Speed: 1.2564214877735072 samples/sec                   batch loss = 636.8056724071503 | accuracy = 0.5260869565217391


Epoch[1] Batch[235] Speed: 1.2580970441736579 samples/sec                   batch loss = 649.8135228157043 | accuracy = 0.5308510638297872


Epoch[1] Batch[240] Speed: 1.247453541932064 samples/sec                   batch loss = 663.4030637741089 | accuracy = 0.5333333333333333


Epoch[1] Batch[245] Speed: 1.2547968417550805 samples/sec                   batch loss = 677.1313939094543 | accuracy = 0.5336734693877551


Epoch[1] Batch[250] Speed: 1.2542023123314525 samples/sec                   batch loss = 690.8730449676514 | accuracy = 0.535


Epoch[1] Batch[255] Speed: 1.2602677409114584 samples/sec                   batch loss = 705.3036713600159 | accuracy = 0.5303921568627451


Epoch[1] Batch[260] Speed: 1.2562246789974323 samples/sec                   batch loss = 719.3935360908508 | accuracy = 0.5307692307692308


Epoch[1] Batch[265] Speed: 1.251681788193578 samples/sec                   batch loss = 733.303852558136 | accuracy = 0.529245283018868


Epoch[1] Batch[270] Speed: 1.25597800251387 samples/sec                   batch loss = 747.0756375789642 | accuracy = 0.5287037037037037


Epoch[1] Batch[275] Speed: 1.257582049482297 samples/sec                   batch loss = 760.6404500007629 | accuracy = 0.53


Epoch[1] Batch[280] Speed: 1.2582290434852519 samples/sec                   batch loss = 774.2167553901672 | accuracy = 0.5321428571428571


Epoch[1] Batch[285] Speed: 1.2572324217160518 samples/sec                   batch loss = 787.8085474967957 | accuracy = 0.5342105263157895


Epoch[1] Batch[290] Speed: 1.2547476671133053 samples/sec                   batch loss = 801.0047147274017 | accuracy = 0.5362068965517242


Epoch[1] Batch[295] Speed: 1.2574331276337027 samples/sec                   batch loss = 814.2517831325531 | accuracy = 0.538135593220339


Epoch[1] Batch[300] Speed: 1.253309701263132 samples/sec                   batch loss = 828.5281426906586 | accuracy = 0.5366666666666666


Epoch[1] Batch[305] Speed: 1.2538705852115428 samples/sec                   batch loss = 841.7423288822174 | accuracy = 0.5385245901639344


Epoch[1] Batch[310] Speed: 1.2532416387889913 samples/sec                   batch loss = 855.5383009910583 | accuracy = 0.5395161290322581


Epoch[1] Batch[315] Speed: 1.2506640875950061 samples/sec                   batch loss = 869.1541845798492 | accuracy = 0.5373015873015873


Epoch[1] Batch[320] Speed: 1.2539087263061237 samples/sec                   batch loss = 882.734710931778 | accuracy = 0.5390625


Epoch[1] Batch[325] Speed: 1.2585748827785783 samples/sec                   batch loss = 896.3083651065826 | accuracy = 0.5392307692307692


Epoch[1] Batch[330] Speed: 1.2538799562728355 samples/sec                   batch loss = 910.3218140602112 | accuracy = 0.5371212121212121


Epoch[1] Batch[335] Speed: 1.251699624606858 samples/sec                   batch loss = 923.3862564563751 | accuracy = 0.5388059701492537


Epoch[1] Batch[340] Speed: 1.252386946977171 samples/sec                   batch loss = 936.7384035587311 | accuracy = 0.5389705882352941


Epoch[1] Batch[345] Speed: 1.2621226573038438 samples/sec                   batch loss = 950.2890276908875 | accuracy = 0.5398550724637681


Epoch[1] Batch[350] Speed: 1.260962422796452 samples/sec                   batch loss = 964.209221124649 | accuracy = 0.5407142857142857


Epoch[1] Batch[355] Speed: 1.2568029567787304 samples/sec                   batch loss = 978.406325340271 | accuracy = 0.5394366197183098


Epoch[1] Batch[360] Speed: 1.2466757351740234 samples/sec                   batch loss = 991.313218832016 | accuracy = 0.5409722222222222


Epoch[1] Batch[365] Speed: 1.2521608396571853 samples/sec                   batch loss = 1004.5345039367676 | accuracy = 0.5438356164383562


Epoch[1] Batch[370] Speed: 1.2525599241829124 samples/sec                   batch loss = 1017.5937523841858 | accuracy = 0.5452702702702703


Epoch[1] Batch[375] Speed: 1.2504630201397458 samples/sec                   batch loss = 1031.3242502212524 | accuracy = 0.5446666666666666


Epoch[1] Batch[380] Speed: 1.2575872341060284 samples/sec                   batch loss = 1045.1657552719116 | accuracy = 0.5427631578947368


Epoch[1] Batch[385] Speed: 1.248946891593818 samples/sec                   batch loss = 1058.1067049503326 | accuracy = 0.5461038961038961


Epoch[1] Batch[390] Speed: 1.2486362450984012 samples/sec                   batch loss = 1071.597669839859 | accuracy = 0.5474358974358975


Epoch[1] Batch[395] Speed: 1.2511223004962142 samples/sec                   batch loss = 1085.39684176445 | accuracy = 0.5468354430379747


Epoch[1] Batch[400] Speed: 1.2557036029779334 samples/sec                   batch loss = 1098.4475061893463 | accuracy = 0.549375


Epoch[1] Batch[405] Speed: 1.2506365849838623 samples/sec                   batch loss = 1112.6801545619965 | accuracy = 0.55


Epoch[1] Batch[410] Speed: 1.2471797945945038 samples/sec                   batch loss = 1126.7336802482605 | accuracy = 0.5487804878048781


Epoch[1] Batch[415] Speed: 1.2547289930205898 samples/sec                   batch loss = 1140.583277463913 | accuracy = 0.5481927710843374


Epoch[1] Batch[420] Speed: 1.2513165803614668 samples/sec                   batch loss = 1154.7197093963623 | accuracy = 0.5470238095238096


Epoch[1] Batch[425] Speed: 1.2505267730556058 samples/sec                   batch loss = 1168.6580846309662 | accuracy = 0.5470588235294118


Epoch[1] Batch[430] Speed: 1.2541141845910917 samples/sec                   batch loss = 1181.8811597824097 | accuracy = 0.5476744186046512


Epoch[1] Batch[435] Speed: 1.2472828066765533 samples/sec                   batch loss = 1195.424123287201 | accuracy = 0.5477011494252874


Epoch[1] Batch[440] Speed: 1.2629380225868556 samples/sec                   batch loss = 1208.2713015079498 | accuracy = 0.5488636363636363


Epoch[1] Batch[445] Speed: 1.2607389877817277 samples/sec                   batch loss = 1221.275928735733 | accuracy = 0.55


Epoch[1] Batch[450] Speed: 1.2647835553266138 samples/sec                   batch loss = 1234.3766748905182 | accuracy = 0.5522222222222222


Epoch[1] Batch[455] Speed: 1.2575780903438705 samples/sec                   batch loss = 1246.8686664104462 | accuracy = 0.5538461538461539


Epoch[1] Batch[460] Speed: 1.2518453232460736 samples/sec                   batch loss = 1261.1191930770874 | accuracy = 0.5527173913043478


Epoch[1] Batch[465] Speed: 1.2648263681201215 samples/sec                   batch loss = 1274.5972261428833 | accuracy = 0.5543010752688172


Epoch[1] Batch[470] Speed: 1.2603808799895007 samples/sec                   batch loss = 1287.4765915870667 | accuracy = 0.5558510638297872


Epoch[1] Batch[475] Speed: 1.2606569488301747 samples/sec                   batch loss = 1299.0885252952576 | accuracy = 0.5584210526315789


Epoch[1] Batch[480] Speed: 1.2495327253151758 samples/sec                   batch loss = 1311.9126133918762 | accuracy = 0.5598958333333334


Epoch[1] Batch[485] Speed: 1.256274627967187 samples/sec                   batch loss = 1325.8804094791412 | accuracy = 0.5608247422680412


Epoch[1] Batch[490] Speed: 1.2569926009439647 samples/sec                   batch loss = 1340.8217689990997 | accuracy = 0.5591836734693878


Epoch[1] Batch[495] Speed: 1.2613337577571178 samples/sec                   batch loss = 1354.1346645355225 | accuracy = 0.5590909090909091


Epoch[1] Batch[500] Speed: 1.2542592269326218 samples/sec                   batch loss = 1368.8737530708313 | accuracy = 0.557


Epoch[1] Batch[505] Speed: 1.2485299431980887 samples/sec                   batch loss = 1382.1791589260101 | accuracy = 0.5574257425742575


Epoch[1] Batch[510] Speed: 1.2496841565820607 samples/sec                   batch loss = 1395.9595787525177 | accuracy = 0.5573529411764706


Epoch[1] Batch[515] Speed: 1.2537581434006742 samples/sec                   batch loss = 1409.079209804535 | accuracy = 0.5577669902912621


Epoch[1] Batch[520] Speed: 1.251939111336899 samples/sec                   batch loss = 1422.7122745513916 | accuracy = 0.5576923076923077


Epoch[1] Batch[525] Speed: 1.249894564552159 samples/sec                   batch loss = 1435.5515549182892 | accuracy = 0.5580952380952381


Epoch[1] Batch[530] Speed: 1.2479136746727886 samples/sec                   batch loss = 1448.4594719409943 | accuracy = 0.5580188679245283


Epoch[1] Batch[535] Speed: 1.2492185317995625 samples/sec                   batch loss = 1461.3055930137634 | accuracy = 0.5593457943925234


Epoch[1] Batch[540] Speed: 1.2567984376556744 samples/sec                   batch loss = 1474.9741163253784 | accuracy = 0.5583333333333333


Epoch[1] Batch[545] Speed: 1.2510244369045742 samples/sec                   batch loss = 1488.8422646522522 | accuracy = 0.5587155963302752


Epoch[1] Batch[550] Speed: 1.247547879062063 samples/sec                   batch loss = 1502.7044463157654 | accuracy = 0.5581818181818182


Epoch[1] Batch[555] Speed: 1.2459064846232375 samples/sec                   batch loss = 1515.6917157173157 | accuracy = 0.5594594594594594


Epoch[1] Batch[560] Speed: 1.2533613849984204 samples/sec                   batch loss = 1528.2524180412292 | accuracy = 0.5598214285714286


Epoch[1] Batch[565] Speed: 1.254636193840105 samples/sec                   batch loss = 1542.561956167221 | accuracy = 0.5579646017699115


Epoch[1] Batch[570] Speed: 1.253851562387897 samples/sec                   batch loss = 1555.34108376503 | accuracy = 0.5587719298245614


Epoch[1] Batch[575] Speed: 1.2488004725089907 samples/sec                   batch loss = 1568.3103778362274 | accuracy = 0.5595652173913044


Epoch[1] Batch[580] Speed: 1.252075054279893 samples/sec                   batch loss = 1581.299254655838 | accuracy = 0.5599137931034482


Epoch[1] Batch[585] Speed: 1.2556482487835479 samples/sec                   batch loss = 1594.7744009494781 | accuracy = 0.5606837606837607


Epoch[1] Batch[590] Speed: 1.2528410906872922 samples/sec                   batch loss = 1607.470621585846 | accuracy = 0.5614406779661016


Epoch[1] Batch[595] Speed: 1.2544872184076226 samples/sec                   batch loss = 1619.9078295230865 | accuracy = 0.5638655462184874


Epoch[1] Batch[600] Speed: 1.2495887515371078 samples/sec                   batch loss = 1631.3140633106232 | accuracy = 0.5654166666666667


Epoch[1] Batch[605] Speed: 1.249573301944325 samples/sec                   batch loss = 1644.1527206897736 | accuracy = 0.565702479338843


Epoch[1] Batch[610] Speed: 1.2511062531762902 samples/sec                   batch loss = 1656.531664133072 | accuracy = 0.5672131147540984


Epoch[1] Batch[615] Speed: 1.2506096429207698 samples/sec                   batch loss = 1669.7119553089142 | accuracy = 0.567479674796748


Epoch[1] Batch[620] Speed: 1.247314242194424 samples/sec                   batch loss = 1681.4959609508514 | accuracy = 0.5685483870967742


Epoch[1] Batch[625] Speed: 1.2463035359665406 samples/sec                   batch loss = 1694.6958074569702 | accuracy = 0.5692


Epoch[1] Batch[630] Speed: 1.2527080680469251 samples/sec                   batch loss = 1706.1317158937454 | accuracy = 0.5702380952380952


Epoch[1] Batch[635] Speed: 1.252450241741359 samples/sec                   batch loss = 1719.095797419548 | accuracy = 0.5708661417322834


Epoch[1] Batch[640] Speed: 1.2524319164544533 samples/sec                   batch loss = 1731.8078898191452 | accuracy = 0.570703125


Epoch[1] Batch[645] Speed: 1.2496123920149775 samples/sec                   batch loss = 1745.3490966558456 | accuracy = 0.5709302325581396


Epoch[1] Batch[650] Speed: 1.2531695586721847 samples/sec                   batch loss = 1759.3134390115738 | accuracy = 0.5707692307692308


Epoch[1] Batch[655] Speed: 1.256547300828545 samples/sec                   batch loss = 1772.7388767004013 | accuracy = 0.5717557251908397


Epoch[1] Batch[660] Speed: 1.2605889384876283 samples/sec                   batch loss = 1785.6061109304428 | accuracy = 0.571969696969697


Epoch[1] Batch[665] Speed: 1.2585300375378738 samples/sec                   batch loss = 1796.1234179735184 | accuracy = 0.574812030075188


Epoch[1] Batch[670] Speed: 1.2527440805300152 samples/sec                   batch loss = 1809.3375500440598 | accuracy = 0.5753731343283582


Epoch[1] Batch[675] Speed: 1.2554129774372784 samples/sec                   batch loss = 1822.3543154001236 | accuracy = 0.5755555555555556


Epoch[1] Batch[680] Speed: 1.2587920733659923 samples/sec                   batch loss = 1836.6241890192032 | accuracy = 0.575


Epoch[1] Batch[685] Speed: 1.2492281125051805 samples/sec                   batch loss = 1849.7004154920578 | accuracy = 0.5755474452554744


Epoch[1] Batch[690] Speed: 1.249162817792698 samples/sec                   batch loss = 1863.2952696084976 | accuracy = 0.5760869565217391


Epoch[1] Batch[695] Speed: 1.252434440824373 samples/sec                   batch loss = 1876.5100957155228 | accuracy = 0.5762589928057554


Epoch[1] Batch[700] Speed: 1.262686802161344 samples/sec                   batch loss = 1888.1325235366821 | accuracy = 0.5778571428571428


Epoch[1] Batch[705] Speed: 1.255695238448793 samples/sec                   batch loss = 1900.7929413318634 | accuracy = 0.5787234042553191


Epoch[1] Batch[710] Speed: 1.2586468306620036 samples/sec                   batch loss = 1911.965781569481 | accuracy = 0.5802816901408451


Epoch[1] Batch[715] Speed: 1.2513191002372979 samples/sec                   batch loss = 1925.1437538862228 | accuracy = 0.5814685314685315


Epoch[1] Batch[720] Speed: 1.2515408890607618 samples/sec                   batch loss = 1937.863453745842 | accuracy = 0.5819444444444445


Epoch[1] Batch[725] Speed: 1.257700175245148 samples/sec                   batch loss = 1950.556788802147 | accuracy = 0.5827586206896552


Epoch[1] Batch[730] Speed: 1.2550936606354821 samples/sec                   batch loss = 1963.102943778038 | accuracy = 0.5835616438356165


Epoch[1] Batch[735] Speed: 1.2535712535162606 samples/sec                   batch loss = 1976.178991675377 | accuracy = 0.5843537414965987


Epoch[1] Batch[740] Speed: 1.2492857858767965 samples/sec                   batch loss = 1988.0448402166367 | accuracy = 0.5851351351351352


Epoch[1] Batch[745] Speed: 1.253141477754243 samples/sec                   batch loss = 1999.745105266571 | accuracy = 0.5859060402684564


Epoch[1] Batch[750] Speed: 1.2555243069205162 samples/sec                   batch loss = 2013.673569202423 | accuracy = 0.5853333333333334


Epoch[1] Batch[755] Speed: 1.2558854885524033 samples/sec                   batch loss = 2026.4397976398468 | accuracy = 0.5864238410596027


Epoch[1] Batch[760] Speed: 1.2527995531587288 samples/sec                   batch loss = 2039.9955542087555 | accuracy = 0.5865131578947368


Epoch[1] Batch[765] Speed: 1.2519080027731604 samples/sec                   batch loss = 2053.344165325165 | accuracy = 0.5866013071895425


Epoch[1] Batch[770] Speed: 1.2545860936190196 samples/sec                   batch loss = 2063.556301355362 | accuracy = 0.5886363636363636


Epoch[1] Batch[775] Speed: 1.2557112157361485 samples/sec                   batch loss = 2079.0298957824707 | accuracy = 0.5880645161290322


Epoch[1] Batch[780] Speed: 1.253970956561905 samples/sec                   batch loss = 2091.4720752239227 | accuracy = 0.5887820512820513


Epoch[1] Batch[785] Speed: 1.2549764933369738 samples/sec                   batch loss = 2104.342184782028 | accuracy = 0.5885350318471337


[Epoch 1] training: accuracy=0.5888324873096447
[Epoch 1] time cost: 646.1648120880127
[Epoch 1] validation: validation accuracy=0.62


Epoch[2] Batch[5] Speed: 1.2578418045692725 samples/sec                   batch loss = 12.449939012527466 | accuracy = 0.7


Epoch[2] Batch[10] Speed: 1.251768266855622 samples/sec                   batch loss = 24.75379455089569 | accuracy = 0.7


Epoch[2] Batch[15] Speed: 1.2463264967889882 samples/sec                   batch loss = 37.033769607543945 | accuracy = 0.6833333333333333


Epoch[2] Batch[20] Speed: 1.2453570462533177 samples/sec                   batch loss = 49.10374927520752 | accuracy = 0.7


Epoch[2] Batch[25] Speed: 1.2523386154017377 samples/sec                   batch loss = 60.61322617530823 | accuracy = 0.72


Epoch[2] Batch[30] Speed: 1.2530801721232128 samples/sec                   batch loss = 74.3806540966034 | accuracy = 0.7083333333333334


Epoch[2] Batch[35] Speed: 1.2477357609762632 samples/sec                   batch loss = 87.09304618835449 | accuracy = 0.6928571428571428


Epoch[2] Batch[40] Speed: 1.2476195922462523 samples/sec                   batch loss = 100.75763082504272 | accuracy = 0.6875


Epoch[2] Batch[45] Speed: 1.2494698181369712 samples/sec                   batch loss = 113.72945046424866 | accuracy = 0.6833333333333333


Epoch[2] Batch[50] Speed: 1.246927574931378 samples/sec                   batch loss = 124.60495722293854 | accuracy = 0.695


Epoch[2] Batch[55] Speed: 1.2535751874659935 samples/sec                   batch loss = 139.17555558681488 | accuracy = 0.6727272727272727


Epoch[2] Batch[60] Speed: 1.2471636628302214 samples/sec                   batch loss = 152.2571552991867 | accuracy = 0.6708333333333333


Epoch[2] Batch[65] Speed: 1.2439078622906368 samples/sec                   batch loss = 161.64744007587433 | accuracy = 0.6846153846153846


Epoch[2] Batch[70] Speed: 1.2551280263479978 samples/sec                   batch loss = 175.85004341602325 | accuracy = 0.6714285714285714


Epoch[2] Batch[75] Speed: 1.2485581894896525 samples/sec                   batch loss = 188.84581196308136 | accuracy = 0.6633333333333333


Epoch[2] Batch[80] Speed: 1.2552199591843667 samples/sec                   batch loss = 201.41106855869293 | accuracy = 0.6625


Epoch[2] Batch[85] Speed: 1.2454075214608635 samples/sec                   batch loss = 215.59698235988617 | accuracy = 0.6558823529411765


Epoch[2] Batch[90] Speed: 1.2501111911788605 samples/sec                   batch loss = 227.0189105272293 | accuracy = 0.6527777777777778


Epoch[2] Batch[95] Speed: 1.249865978426028 samples/sec                   batch loss = 239.304736495018 | accuracy = 0.65


Epoch[2] Batch[100] Speed: 1.2525806846586414 samples/sec                   batch loss = 252.85282385349274 | accuracy = 0.6425


Epoch[2] Batch[105] Speed: 1.246969928841797 samples/sec                   batch loss = 266.57383620738983 | accuracy = 0.6357142857142857


Epoch[2] Batch[110] Speed: 1.2448784747277786 samples/sec                   batch loss = 278.27203392982483 | accuracy = 0.6386363636363637


Epoch[2] Batch[115] Speed: 1.2545634841554418 samples/sec                   batch loss = 289.02948701381683 | accuracy = 0.6456521739130435


Epoch[2] Batch[120] Speed: 1.2544587032158054 samples/sec                   batch loss = 302.16619622707367 | accuracy = 0.64375


Epoch[2] Batch[125] Speed: 1.2530261719792688 samples/sec                   batch loss = 315.12170803546906 | accuracy = 0.644


Epoch[2] Batch[130] Speed: 1.2482233116187285 samples/sec                   batch loss = 329.95438253879547 | accuracy = 0.6365384615384615


Epoch[2] Batch[135] Speed: 1.2491876513020623 samples/sec                   batch loss = 342.96792352199554 | accuracy = 0.6407407407407407


Epoch[2] Batch[140] Speed: 1.2525319640892205 samples/sec                   batch loss = 354.3616341352463 | accuracy = 0.6410714285714286


Epoch[2] Batch[145] Speed: 1.25365340325865 samples/sec                   batch loss = 366.5108426809311 | accuracy = 0.6396551724137931


Epoch[2] Batch[150] Speed: 1.2497996859934546 samples/sec                   batch loss = 378.9120293855667 | accuracy = 0.6383333333333333


Epoch[2] Batch[155] Speed: 1.2459300785051348 samples/sec                   batch loss = 390.6937355995178 | accuracy = 0.6387096774193548


Epoch[2] Batch[160] Speed: 1.2468283279137837 samples/sec                   batch loss = 402.8094322681427 | accuracy = 0.6390625


Epoch[2] Batch[165] Speed: 1.2538763015422696 samples/sec                   batch loss = 413.0171387195587 | accuracy = 0.6454545454545455


Epoch[2] Batch[170] Speed: 1.2556793555484753 samples/sec                   batch loss = 423.61213171482086 | accuracy = 0.65


Epoch[2] Batch[175] Speed: 1.257492032376124 samples/sec                   batch loss = 434.2512859106064 | accuracy = 0.6514285714285715


Epoch[2] Batch[180] Speed: 1.2529318466724508 samples/sec                   batch loss = 447.83983290195465 | accuracy = 0.6472222222222223


Epoch[2] Batch[185] Speed: 1.2487660806242782 samples/sec                   batch loss = 458.7363260984421 | accuracy = 0.6513513513513514


Epoch[2] Batch[190] Speed: 1.253256336661687 samples/sec                   batch loss = 470.98737347126007 | accuracy = 0.6513157894736842


Epoch[2] Batch[195] Speed: 1.25362220944315 samples/sec                   batch loss = 482.0123413801193 | accuracy = 0.6512820512820513


Epoch[2] Batch[200] Speed: 1.263585592041012 samples/sec                   batch loss = 492.15055549144745 | accuracy = 0.65375


Epoch[2] Batch[205] Speed: 1.2481645291340036 samples/sec                   batch loss = 506.9236065149307 | accuracy = 0.651219512195122


Epoch[2] Batch[210] Speed: 1.2527032977174541 samples/sec                   batch loss = 519.6035429239273 | accuracy = 0.6511904761904762


Epoch[2] Batch[215] Speed: 1.258866029700812 samples/sec                   batch loss = 531.4889277219772 | accuracy = 0.65


Epoch[2] Batch[220] Speed: 1.2591443647429408 samples/sec                   batch loss = 543.0307132005692 | accuracy = 0.6534090909090909


Epoch[2] Batch[225] Speed: 1.2623863814213307 samples/sec                   batch loss = 555.2837691307068 | accuracy = 0.6533333333333333


Epoch[2] Batch[230] Speed: 1.2529913597959033 samples/sec                   batch loss = 569.031821012497 | accuracy = 0.6510869565217391


Epoch[2] Batch[235] Speed: 1.2576379514070903 samples/sec                   batch loss = 580.7883312702179 | accuracy = 0.6510638297872341


Epoch[2] Batch[240] Speed: 1.2630720857284932 samples/sec                   batch loss = 590.152337551117 | accuracy = 0.65625


Epoch[2] Batch[245] Speed: 1.2559465048687233 samples/sec                   batch loss = 602.0386958122253 | accuracy = 0.6551020408163265


Epoch[2] Batch[250] Speed: 1.2579567721921128 samples/sec                   batch loss = 614.0001081228256 | accuracy = 0.66


Epoch[2] Batch[255] Speed: 1.24637742098658 samples/sec                   batch loss = 624.7702860832214 | accuracy = 0.6588235294117647


Epoch[2] Batch[260] Speed: 1.2575495287266951 samples/sec                   batch loss = 641.1188061237335 | accuracy = 0.6548076923076923


Epoch[2] Batch[265] Speed: 1.251859988385169 samples/sec                   batch loss = 652.2255024909973 | accuracy = 0.6575471698113208


Epoch[2] Batch[270] Speed: 1.2557014413473653 samples/sec                   batch loss = 662.2843642234802 | accuracy = 0.6611111111111111


Epoch[2] Batch[275] Speed: 1.2495793514381597 samples/sec                   batch loss = 674.0623691082001 | accuracy = 0.6618181818181819


Epoch[2] Batch[280] Speed: 1.2477746433620835 samples/sec                   batch loss = 688.2158033847809 | accuracy = 0.6580357142857143


Epoch[2] Batch[285] Speed: 1.2556163917909775 samples/sec                   batch loss = 697.6295504570007 | accuracy = 0.6622807017543859


Epoch[2] Batch[290] Speed: 1.2529280103285882 samples/sec                   batch loss = 710.6478521823883 | accuracy = 0.6603448275862069


Epoch[2] Batch[295] Speed: 1.2574512225833143 samples/sec                   batch loss = 720.3175059556961 | accuracy = 0.6644067796610169


Epoch[2] Batch[300] Speed: 1.242753702898356 samples/sec                   batch loss = 735.0741087198257 | accuracy = 0.6625


Epoch[2] Batch[305] Speed: 1.246745217217867 samples/sec                   batch loss = 746.7922002077103 | accuracy = 0.6647540983606557


Epoch[2] Batch[310] Speed: 1.252034875629097 samples/sec                   batch loss = 759.7912880182266 | accuracy = 0.6637096774193548


Epoch[2] Batch[315] Speed: 1.2539439643671337 samples/sec                   batch loss = 769.1771804094315 | accuracy = 0.665079365079365


Epoch[2] Batch[320] Speed: 1.2504928452560928 samples/sec                   batch loss = 781.6752207279205 | accuracy = 0.6640625


Epoch[2] Batch[325] Speed: 1.2478303264160724 samples/sec                   batch loss = 794.327475309372 | accuracy = 0.6607692307692308


Epoch[2] Batch[330] Speed: 1.2548211489651804 samples/sec                   batch loss = 808.4719843864441 | accuracy = 0.6613636363636364


Epoch[2] Batch[335] Speed: 1.250328545176471 samples/sec                   batch loss = 820.4625149965286 | accuracy = 0.6619402985074627


Epoch[2] Batch[340] Speed: 1.2491138047373689 samples/sec                   batch loss = 832.8118678331375 | accuracy = 0.6617647058823529


Epoch[2] Batch[345] Speed: 1.2541683723097063 samples/sec                   batch loss = 844.7282376289368 | accuracy = 0.6623188405797101


Epoch[2] Batch[350] Speed: 1.2438835149262246 samples/sec                   batch loss = 854.6420559883118 | accuracy = 0.6635714285714286


Epoch[2] Batch[355] Speed: 1.2515720727488442 samples/sec                   batch loss = 869.3985426425934 | accuracy = 0.6633802816901408


Epoch[2] Batch[360] Speed: 1.2535615124136632 samples/sec                   batch loss = 883.6339154243469 | accuracy = 0.6618055555555555


Epoch[2] Batch[365] Speed: 1.2531850973209657 samples/sec                   batch loss = 896.3790646791458 | accuracy = 0.660958904109589


Epoch[2] Batch[370] Speed: 1.2472583270167084 samples/sec                   batch loss = 908.3626637458801 | accuracy = 0.6621621621621622


Epoch[2] Batch[375] Speed: 1.2437603171210667 samples/sec                   batch loss = 921.1058874130249 | accuracy = 0.6626666666666666


Epoch[2] Batch[380] Speed: 1.2564747458717618 samples/sec                   batch loss = 932.9671820402145 | accuracy = 0.6631578947368421


Epoch[2] Batch[385] Speed: 1.2547651217929934 samples/sec                   batch loss = 944.4182916879654 | accuracy = 0.6636363636363637


Epoch[2] Batch[390] Speed: 1.2489560032493134 samples/sec                   batch loss = 954.5518690347672 | accuracy = 0.6653846153846154


Epoch[2] Batch[395] Speed: 1.2475534451201784 samples/sec                   batch loss = 965.5661215782166 | accuracy = 0.6664556962025316


Epoch[2] Batch[400] Speed: 1.2508829396064016 samples/sec                   batch loss = 975.2469067573547 | accuracy = 0.66875


Epoch[2] Batch[405] Speed: 1.2545437836275208 samples/sec                   batch loss = 987.1317467689514 | accuracy = 0.6679012345679012


Epoch[2] Batch[410] Speed: 1.2524727750996358 samples/sec                   batch loss = 1002.8034057617188 | accuracy = 0.6640243902439025


Epoch[2] Batch[415] Speed: 1.2513062209785646 samples/sec                   batch loss = 1015.1879422664642 | accuracy = 0.6638554216867469


Epoch[2] Batch[420] Speed: 1.2466431276050658 samples/sec                   batch loss = 1025.8986639976501 | accuracy = 0.6642857142857143


Epoch[2] Batch[425] Speed: 1.2483316976333567 samples/sec                   batch loss = 1036.2444701194763 | accuracy = 0.6652941176470588


Epoch[2] Batch[430] Speed: 1.2487180281093029 samples/sec                   batch loss = 1051.9244320392609 | accuracy = 0.6622093023255814


Epoch[2] Batch[435] Speed: 1.2494765180084062 samples/sec                   batch loss = 1063.6774278879166 | accuracy = 0.6632183908045977


Epoch[2] Batch[440] Speed: 1.250842371137756 samples/sec                   batch loss = 1073.1205440759659 | accuracy = 0.6647727272727273


Epoch[2] Batch[445] Speed: 1.2429440109717271 samples/sec                   batch loss = 1085.6175147294998 | accuracy = 0.6646067415730337


Epoch[2] Batch[450] Speed: 1.2482191325862912 samples/sec                   batch loss = 1097.9537544250488 | accuracy = 0.6655555555555556


Epoch[2] Batch[455] Speed: 1.2531322113274912 samples/sec                   batch loss = 1111.303230524063 | accuracy = 0.6637362637362637


Epoch[2] Batch[460] Speed: 1.2580606289382315 samples/sec                   batch loss = 1125.044445991516 | accuracy = 0.6614130434782609


Epoch[2] Batch[465] Speed: 1.2447790918386146 samples/sec                   batch loss = 1139.8815784454346 | accuracy = 0.6591397849462366


Epoch[2] Batch[470] Speed: 1.2461479246262903 samples/sec                   batch loss = 1151.5751235485077 | accuracy = 0.6585106382978724


Epoch[2] Batch[475] Speed: 1.2548314727756236 samples/sec                   batch loss = 1163.8665661811829 | accuracy = 0.6584210526315789


Epoch[2] Batch[480] Speed: 1.250831180299137 samples/sec                   batch loss = 1175.4809964895248 | accuracy = 0.659375


Epoch[2] Batch[485] Speed: 1.2537016490033286 samples/sec                   batch loss = 1189.0788642168045 | accuracy = 0.6582474226804124


Epoch[2] Batch[490] Speed: 1.246365383982498 samples/sec                   batch loss = 1202.1264671087265 | accuracy = 0.6586734693877551


Epoch[2] Batch[495] Speed: 1.252017963991421 samples/sec                   batch loss = 1213.8987509012222 | accuracy = 0.6585858585858586


Epoch[2] Batch[500] Speed: 1.2563521460977127 samples/sec                   batch loss = 1224.426315665245 | accuracy = 0.66


Epoch[2] Batch[505] Speed: 1.2633904332973531 samples/sec                   batch loss = 1236.0717918872833 | accuracy = 0.6613861386138614


Epoch[2] Batch[510] Speed: 1.2562358724890208 samples/sec                   batch loss = 1248.7747684717178 | accuracy = 0.6612745098039216


Epoch[2] Batch[515] Speed: 1.2515234306085554 samples/sec                   batch loss = 1260.656576514244 | accuracy = 0.662621359223301


Epoch[2] Batch[520] Speed: 1.2538691795644301 samples/sec                   batch loss = 1273.1677734851837 | accuracy = 0.6634615384615384


Epoch[2] Batch[525] Speed: 1.2564043633372493 samples/sec                   batch loss = 1282.607722401619 | accuracy = 0.6647619047619048


Epoch[2] Batch[530] Speed: 1.251343832890579 samples/sec                   batch loss = 1293.391214132309 | accuracy = 0.6660377358490566


Epoch[2] Batch[535] Speed: 1.251894270655803 samples/sec                   batch loss = 1305.7401266098022 | accuracy = 0.6654205607476635


Epoch[2] Batch[540] Speed: 1.2457419074597276 samples/sec                   batch loss = 1313.832899928093 | accuracy = 0.6675925925925926


Epoch[2] Batch[545] Speed: 1.2565430658731807 samples/sec                   batch loss = 1324.3628170490265 | accuracy = 0.668348623853211


Epoch[2] Batch[550] Speed: 1.2623309113826067 samples/sec                   batch loss = 1339.050388097763 | accuracy = 0.6672727272727272


Epoch[2] Batch[555] Speed: 1.2530710937839602 samples/sec                   batch loss = 1351.3242859840393 | accuracy = 0.6671171171171171


Epoch[2] Batch[560] Speed: 1.2547398783537982 samples/sec                   batch loss = 1365.6260610818863 | accuracy = 0.6660714285714285


Epoch[2] Batch[565] Speed: 1.2573691397755742 samples/sec                   batch loss = 1375.4282242059708 | accuracy = 0.668141592920354


Epoch[2] Batch[570] Speed: 1.2635981542982835 samples/sec                   batch loss = 1386.4048029184341 | accuracy = 0.6684210526315789


Epoch[2] Batch[575] Speed: 1.2552428740614137 samples/sec                   batch loss = 1397.3592110872269 | accuracy = 0.6678260869565218


Epoch[2] Batch[580] Speed: 1.2555294745912153 samples/sec                   batch loss = 1410.0155173540115 | accuracy = 0.6676724137931035


Epoch[2] Batch[585] Speed: 1.2474497390627792 samples/sec                   batch loss = 1421.5358477830887 | accuracy = 0.6675213675213675


Epoch[2] Batch[590] Speed: 1.2508456351700727 samples/sec                   batch loss = 1434.8291298151016 | accuracy = 0.6673728813559322


Epoch[2] Batch[595] Speed: 1.258980145410072 samples/sec                   batch loss = 1447.337724328041 | accuracy = 0.6668067226890756


Epoch[2] Batch[600] Speed: 1.2603814481030704 samples/sec                   batch loss = 1459.2692770957947 | accuracy = 0.6675


Epoch[2] Batch[605] Speed: 1.2625471200871017 samples/sec                   batch loss = 1470.1777049303055 | accuracy = 0.6690082644628099


Epoch[2] Batch[610] Speed: 1.250133081536509 samples/sec                   batch loss = 1480.831899523735 | accuracy = 0.6700819672131147


Epoch[2] Batch[615] Speed: 1.2523630144348712 samples/sec                   batch loss = 1491.0960932970047 | accuracy = 0.6707317073170732


Epoch[2] Batch[620] Speed: 1.2568676402889178 samples/sec                   batch loss = 1499.3107900619507 | accuracy = 0.6721774193548387


Epoch[2] Batch[625] Speed: 1.2568017328463605 samples/sec                   batch loss = 1512.1825335025787 | accuracy = 0.672


Epoch[2] Batch[630] Speed: 1.2538390994715738 samples/sec                   batch loss = 1522.9600278139114 | accuracy = 0.6718253968253968


Epoch[2] Batch[635] Speed: 1.2482811708766814 samples/sec                   batch loss = 1532.925854563713 | accuracy = 0.6732283464566929


Epoch[2] Batch[640] Speed: 1.2479965698946816 samples/sec                   batch loss = 1545.2410566806793 | accuracy = 0.67265625


Epoch[2] Batch[645] Speed: 1.2553347297944928 samples/sec                   batch loss = 1555.091195344925 | accuracy = 0.6736434108527132


Epoch[2] Batch[650] Speed: 1.2574974047632432 samples/sec                   batch loss = 1567.3819992542267 | accuracy = 0.6742307692307692


Epoch[2] Batch[655] Speed: 1.253738561847292 samples/sec                   batch loss = 1577.3522552251816 | accuracy = 0.6744274809160306


Epoch[2] Batch[660] Speed: 1.2524673520596579 samples/sec                   batch loss = 1590.1178570985794 | accuracy = 0.6738636363636363


Epoch[2] Batch[665] Speed: 1.2595712707932254 samples/sec                   batch loss = 1600.9358155727386 | accuracy = 0.674812030075188


Epoch[2] Batch[670] Speed: 1.2563158318018663 samples/sec                   batch loss = 1612.7813384532928 | accuracy = 0.6746268656716418


Epoch[2] Batch[675] Speed: 1.260870973533624 samples/sec                   batch loss = 1622.7159397602081 | accuracy = 0.6751851851851852


Epoch[2] Batch[680] Speed: 1.2529999690805682 samples/sec                   batch loss = 1632.2396553754807 | accuracy = 0.6761029411764706


Epoch[2] Batch[685] Speed: 1.2510638044019697 samples/sec                   batch loss = 1643.8578723669052 | accuracy = 0.6766423357664234


Epoch[2] Batch[690] Speed: 1.2552041822143665 samples/sec                   batch loss = 1656.610785126686 | accuracy = 0.6771739130434783


Epoch[2] Batch[695] Speed: 1.2496667498922933 samples/sec                   batch loss = 1665.2622419595718 | accuracy = 0.6784172661870503


Epoch[2] Batch[700] Speed: 1.2615881398255095 samples/sec                   batch loss = 1676.8529212474823 | accuracy = 0.6792857142857143


Epoch[2] Batch[705] Speed: 1.2521915869512226 samples/sec                   batch loss = 1688.4277575016022 | accuracy = 0.6790780141843972


Epoch[2] Batch[710] Speed: 1.2626905084331692 samples/sec                   batch loss = 1699.4423376321793 | accuracy = 0.6795774647887324


Epoch[2] Batch[715] Speed: 1.2591714867864123 samples/sec                   batch loss = 1712.2016295194626 | accuracy = 0.6793706293706294


Epoch[2] Batch[720] Speed: 1.2511451593080958 samples/sec                   batch loss = 1722.2089537382126 | accuracy = 0.6795138888888889


Epoch[2] Batch[725] Speed: 1.2513638063883101 samples/sec                   batch loss = 1735.4005883932114 | accuracy = 0.6786206896551724


Epoch[2] Batch[730] Speed: 1.2469565828790317 samples/sec                   batch loss = 1747.4443491697311 | accuracy = 0.6787671232876712


Epoch[2] Batch[735] Speed: 1.254738846115819 samples/sec                   batch loss = 1760.3715802431107 | accuracy = 0.6785714285714286


Epoch[2] Batch[740] Speed: 1.2567387506213574 samples/sec                   batch loss = 1769.9805798530579 | accuracy = 0.6790540540540541


Epoch[2] Batch[745] Speed: 1.2560886799020088 samples/sec                   batch loss = 1777.275699198246 | accuracy = 0.6802013422818792


Epoch[2] Batch[750] Speed: 1.2551818321857635 samples/sec                   batch loss = 1787.8903627991676 | accuracy = 0.6806666666666666


Epoch[2] Batch[755] Speed: 1.247187489778502 samples/sec                   batch loss = 1797.848326265812 | accuracy = 0.6814569536423841


Epoch[2] Batch[760] Speed: 1.2519387376512836 samples/sec                   batch loss = 1807.467605650425 | accuracy = 0.6819078947368421


Epoch[2] Batch[765] Speed: 1.249677547567879 samples/sec                   batch loss = 1824.7959859967232 | accuracy = 0.6803921568627451


Epoch[2] Batch[770] Speed: 1.2568919336325852 samples/sec                   batch loss = 1839.3831871151924 | accuracy = 0.6801948051948052


Epoch[2] Batch[775] Speed: 1.252098508591284 samples/sec                   batch loss = 1849.9422149062157 | accuracy = 0.68


Epoch[2] Batch[780] Speed: 1.2497003536166245 samples/sec                   batch loss = 1863.9251693487167 | accuracy = 0.6801282051282052


Epoch[2] Batch[785] Speed: 1.2565014707176465 samples/sec                   batch loss = 1873.55448448658 | accuracy = 0.6808917197452229


[Epoch 2] training: accuracy=0.680520304568528
[Epoch 2] time cost: 645.1565608978271
[Epoch 2] validation: validation accuracy=0.7033333333333334


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).