<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference may run faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure that your machine has at least one Nvidia GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system in the [installation instructions](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)`, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[03:32:41] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on, as `ctx=gpu(0)` does in the output above.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. In the example code here, you will copy `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to `npx.cpu()` rather than failing:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[03:32:41] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them by its zero-based index through `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
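
As a minimal sketch of this indexing scheme, the hypothetical helper below (it is not part of MXNet) returns the zero-based IDs that you would pass to `npx.gpu(i)`. The function name and the convention that an empty list means "fall back to the CPU" are assumptions made for illustration.

```python
def select_gpu_ids(num_available, num_requested):
    """Return zero-based GPU IDs to use, or an empty list if there is no GPU.

    Hypothetical helper for illustration; it only does index arithmetic
    and does not call MXNet.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # Never ask for more GPUs than the machine has
    count = min(num_available, num_requested)
    return list(range(count))


# On a machine with eight GPUs, the last valid ID is 7
print(select_gpu_ids(8, 8))   # [0, 1, 2, 3, 4, 5, 6, 7]
# Requesting two GPUs when only one exists yields just the first one
print(select_gpu_ids(1, 2))   # [0]
# No GPUs: the caller would fall back to the CPU
print(select_gpu_ids(0, 2))   # []
```

With MXNet installed, each returned ID `i` would correspond to `npx.gpu(i)`, and an empty list would mean using `npx.cpu()` instead.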

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.
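
The rule can be illustrated with a toy model in plain Python. The `ToyArray` class below is a hypothetical stand-in, not MXNet's ndarray, and the exception type and message are assumptions; MXNet reports its own error. The sketch only shows the idea that both inputs must share a device and that an explicit copy resolves a mismatch.

```python
class ToyArray:
    """Toy stand-in for a device-tagged array (not MXNet's ndarray)."""

    def __init__(self, values, device="cpu(0)"):
        self.values = list(values)
        self.device = device

    def copyto(self, device):
        # Explicit move: returns a new array tagged with the target device
        return ToyArray(self.values, device)

    def __add__(self, other):
        # Both inputs must already live on the same device
        if self.device != other.device:
            raise ValueError(
                f"inputs are on different devices: {self.device} vs {other.device}"
            )
        # The result is allocated on the same device as the inputs
        summed = [a + b for a, b in zip(self.values, other.values)]
        return ToyArray(summed, self.device)


x = ToyArray([1.0, 1.0], device="gpu(0)")
y = ToyArray([0.5, 0.25], device="gpu(1)")

try:
    x + y
except ValueError as err:
    print("error:", err)

# Copy y to x's device explicitly, then the operation succeeds
z = x + y.copyto(x.device)
print(z.values, z.device)  # [1.5, 1.25] gpu(0)
```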

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the previously defined `LeafNetwork` from [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[03:32:42] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 6.788857, -5.161715]], ctx=gpu(0))
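
To see how the `(1, 3, 128, 128)` input passes through the three convolutional blocks, the following sketch traces the spatial size with plain arithmetic. It assumes the Gluon defaults of stride 1 and no padding for `Conv2D`, no padding and floor rounding for `MaxPool2D`, and that `BatchNorm` leaves shapes unchanged; the helper `conv_out` is a hypothetical name, not an MXNet function.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


size = 128  # height and width of the random input above
channels = [32, 64, 128]  # filters passed to the three conv_block calls

for filters in channels:
    size = conv_out(size, kernel=2)            # Conv2D(kernel_size=2), stride 1, no padding
    size = conv_out(size, kernel=4, stride=2)  # MaxPool2D(pool_size=4, strides=2)
    print(f"after block with {filters} filters: {filters} x {size} x {size}")

flattened = channels[-1] * size * size
print("features after Flatten:", flattened)  # 21632
```

Under these assumptions, the spatial size shrinks from 128 to 62, 29, and finally 13, so the first dense block receives 128 x 13 x 13 = 21632 features per image.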

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism with *n* GPUs, you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own separate part of the data.
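
The idea can be checked with a small worked example in plain Python. The sketch below uses a hypothetical one-parameter model `w * x` with a squared-error loss (not the network in this tutorial) and simulates two devices with ordinary lists. It shows that summing the per-shard gradients and dividing by the full batch size gives the same result as processing the whole batch on one device, which is similar in spirit to passing `batch_size` to `trainer.step`.

```python
import math


def split_batch(samples, num_parts):
    """Split a list into num_parts contiguous chunks of equal size."""
    if len(samples) % num_parts != 0:
        raise ValueError("this sketch assumes the batch divides evenly")
    size = len(samples) // num_parts
    return [samples[i * size:(i + 1) * size] for i in range(num_parts)]


def grad_sum(w, samples):
    """Summed gradient of the squared error (w * x - t) ** 2 with respect to w."""
    return sum(2.0 * (w * x - t) * x for x, t in samples)


w = 0.5
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 9.0)]  # (input, target) pairs

# Reference: gradient averaged over the whole batch on one device
full_grad = grad_sum(w, batch) / len(batch)

# Data parallelism: each simulated device handles one shard, then results are summed
shards = split_batch(batch, num_parts=2)
per_device = [grad_sum(w, shard) for shard in shards]
parallel_grad = sum(per_device) / len(batch)

print(full_grad, parallel_grad)  # -22.0 -22.0
```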

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
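
As a rough illustration of what accumulating accuracy over per-device shards amounts to, the sketch below computes it in plain Python. It is not the implementation of `gluon.metric.Accuracy`; the function names and the sample scores are hypothetical, and each output row is assumed to hold one score per class.

```python
def argmax(scores):
    """Index of the largest score in a list."""
    return max(range(len(scores)), key=lambda i: scores[i])


def sharded_accuracy(label_shards, output_shards):
    """Accuracy over all shards: correct predictions divided by total samples."""
    correct = 0
    total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            correct += int(argmax(scores) == label)
            total += 1
    return correct / total


# Two simulated devices, two samples each; each output row holds two class scores
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],   # both predictions correct
    [[1.5, 0.2], [0.8, 0.1]],    # first wrong, second correct
]
print(sharded_accuracy(label_shards, output_shards))  # 0.75
```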

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7777688820702544 samples/sec                   batch loss = 13.702746391296387 | accuracy = 0.65


Epoch[1] Batch[10] Speed: 1.2604474474483498 samples/sec                   batch loss = 29.56662678718567 | accuracy = 0.5


Epoch[1] Batch[15] Speed: 1.253917441918779 samples/sec                   batch loss = 43.2096893787384 | accuracy = 0.48333333333333334


Epoch[1] Batch[20] Speed: 1.252621272441358 samples/sec                   batch loss = 57.313788652420044 | accuracy = 0.5125


Epoch[1] Batch[25] Speed: 1.2488822770351022 samples/sec                   batch loss = 70.82450032234192 | accuracy = 0.53


Epoch[1] Batch[30] Speed: 1.2542984231869296 samples/sec                   batch loss = 84.41898584365845 | accuracy = 0.5416666666666666


Epoch[1] Batch[35] Speed: 1.25363841504544 samples/sec                   batch loss = 97.73467421531677 | accuracy = 0.5571428571428572


Epoch[1] Batch[40] Speed: 1.2483786057285382 samples/sec                   batch loss = 112.03243041038513 | accuracy = 0.5625


Epoch[1] Batch[45] Speed: 1.252665604038822 samples/sec                   batch loss = 125.42032504081726 | accuracy = 0.5611111111111111


Epoch[1] Batch[50] Speed: 1.2516063392518972 samples/sec                   batch loss = 140.82039284706116 | accuracy = 0.55


Epoch[1] Batch[55] Speed: 1.253404458034128 samples/sec                   batch loss = 154.63822603225708 | accuracy = 0.5545454545454546


Epoch[1] Batch[60] Speed: 1.2529872423537545 samples/sec                   batch loss = 167.91706466674805 | accuracy = 0.5666666666666667


Epoch[1] Batch[65] Speed: 1.2516322971435443 samples/sec                   batch loss = 181.4944806098938 | accuracy = 0.5653846153846154


Epoch[1] Batch[70] Speed: 1.2514355855751025 samples/sec                   batch loss = 195.72915196418762 | accuracy = 0.5535714285714286


Epoch[1] Batch[75] Speed: 1.25094832099799 samples/sec                   batch loss = 209.5375111103058 | accuracy = 0.55


Epoch[1] Batch[80] Speed: 1.2509564358557361 samples/sec                   batch loss = 223.6807279586792 | accuracy = 0.54375


Epoch[1] Batch[85] Speed: 1.252879543422131 samples/sec                   batch loss = 237.33524131774902 | accuracy = 0.55


Epoch[1] Batch[90] Speed: 1.2502593153627297 samples/sec                   batch loss = 250.27009987831116 | accuracy = 0.5527777777777778


Epoch[1] Batch[95] Speed: 1.2551015476817988 samples/sec                   batch loss = 264.22504591941833 | accuracy = 0.5526315789473685


Epoch[1] Batch[100] Speed: 1.2535420306626315 samples/sec                   batch loss = 278.9493832588196 | accuracy = 0.545


Epoch[1] Batch[105] Speed: 1.2500926548371487 samples/sec                   batch loss = 292.8836991786957 | accuracy = 0.5452380952380952


Epoch[1] Batch[110] Speed: 1.2527352876807536 samples/sec                   batch loss = 307.22399139404297 | accuracy = 0.5340909090909091


Epoch[1] Batch[115] Speed: 1.24873949792491 samples/sec                   batch loss = 321.53403663635254 | accuracy = 0.532608695652174


Epoch[1] Batch[120] Speed: 1.2574097557612114 samples/sec                   batch loss = 335.2283227443695 | accuracy = 0.5333333333333333


Epoch[1] Batch[125] Speed: 1.2563077413379755 samples/sec                   batch loss = 349.17567920684814 | accuracy = 0.534


Epoch[1] Batch[130] Speed: 1.257711300762001 samples/sec                   batch loss = 362.7900559902191 | accuracy = 0.5384615384615384


Epoch[1] Batch[135] Speed: 1.2550044687966684 samples/sec                   batch loss = 376.6299264431 | accuracy = 0.5407407407407407


Epoch[1] Batch[140] Speed: 1.2617203983276046 samples/sec                   batch loss = 391.33673667907715 | accuracy = 0.5375


Epoch[1] Batch[145] Speed: 1.2571263467271228 samples/sec                   batch loss = 405.15162444114685 | accuracy = 0.5379310344827586


Epoch[1] Batch[150] Speed: 1.2631413152868398 samples/sec                   batch loss = 418.4237961769104 | accuracy = 0.54


Epoch[1] Batch[155] Speed: 1.255196857327395 samples/sec                   batch loss = 432.4768650531769 | accuracy = 0.5370967741935484


Epoch[1] Batch[160] Speed: 1.2581127052599412 samples/sec                   batch loss = 446.08612608909607 | accuracy = 0.5421875


Epoch[1] Batch[165] Speed: 1.2542135635727902 samples/sec                   batch loss = 459.1042764186859 | accuracy = 0.543939393939394


Epoch[1] Batch[170] Speed: 1.2546497985235003 samples/sec                   batch loss = 473.1543002128601 | accuracy = 0.5426470588235294


Epoch[1] Batch[175] Speed: 1.2551377918118518 samples/sec                   batch loss = 486.7597463130951 | accuracy = 0.5428571428571428


Epoch[1] Batch[180] Speed: 1.263399186079278 samples/sec                   batch loss = 500.08212971687317 | accuracy = 0.5458333333333333


Epoch[1] Batch[185] Speed: 1.2594062783188236 samples/sec                   batch loss = 513.604877948761 | accuracy = 0.5445945945945946


Epoch[1] Batch[190] Speed: 1.253798151633713 samples/sec                   batch loss = 528.4572360515594 | accuracy = 0.5394736842105263


Epoch[1] Batch[195] Speed: 1.2583527649090231 samples/sec                   batch loss = 542.5061905384064 | accuracy = 0.5384615384615384


Epoch[1] Batch[200] Speed: 1.2611172060190379 samples/sec                   batch loss = 556.9451339244843 | accuracy = 0.5325


Epoch[1] Batch[205] Speed: 1.2688018594434904 samples/sec                   batch loss = 570.724390745163 | accuracy = 0.5329268292682927


Epoch[1] Batch[210] Speed: 1.2573627319246785 samples/sec                   batch loss = 584.2454605102539 | accuracy = 0.5380952380952381


Epoch[1] Batch[215] Speed: 1.2559349404582063 samples/sec                   batch loss = 597.4367187023163 | accuracy = 0.5406976744186046


Epoch[1] Batch[220] Speed: 1.2550783562926853 samples/sec                   batch loss = 611.1655640602112 | accuracy = 0.5431818181818182


Epoch[1] Batch[225] Speed: 1.2551564780749198 samples/sec                   batch loss = 624.4588088989258 | accuracy = 0.5422222222222223


Epoch[1] Batch[230] Speed: 1.2569838425299806 samples/sec                   batch loss = 637.407684803009 | accuracy = 0.5445652173913044


Epoch[1] Batch[235] Speed: 1.2563948604174289 samples/sec                   batch loss = 650.8603281974792 | accuracy = 0.55


Epoch[1] Batch[240] Speed: 1.2560197510811573 samples/sec                   batch loss = 664.1463627815247 | accuracy = 0.5510416666666667


Epoch[1] Batch[245] Speed: 1.2603929051693699 samples/sec                   batch loss = 677.7050342559814 | accuracy = 0.5520408163265306


Epoch[1] Batch[250] Speed: 1.2542179703639966 samples/sec                   batch loss = 690.8258943557739 | accuracy = 0.553


Epoch[1] Batch[255] Speed: 1.2592478505920948 samples/sec                   batch loss = 704.2446706295013 | accuracy = 0.5529411764705883


Epoch[1] Batch[260] Speed: 1.2589472688955032 samples/sec                   batch loss = 717.9100184440613 | accuracy = 0.5538461538461539


Epoch[1] Batch[265] Speed: 1.25475376681176 samples/sec                   batch loss = 730.7826318740845 | accuracy = 0.5575471698113208


Epoch[1] Batch[270] Speed: 1.2609292531369505 samples/sec                   batch loss = 744.1595923900604 | accuracy = 0.5592592592592592


Epoch[1] Batch[275] Speed: 1.265646472835723 samples/sec                   batch loss = 757.6543416976929 | accuracy = 0.5581818181818182


Epoch[1] Batch[280] Speed: 1.2595333517915368 samples/sec                   batch loss = 771.5385885238647 | accuracy = 0.5571428571428572


Epoch[1] Batch[285] Speed: 1.2615676488800978 samples/sec                   batch loss = 784.4022271633148 | accuracy = 0.5587719298245614


Epoch[1] Batch[290] Speed: 1.2593932320477033 samples/sec                   batch loss = 799.9854173660278 | accuracy = 0.553448275862069


Epoch[1] Batch[295] Speed: 1.2607250612488674 samples/sec                   batch loss = 813.9523649215698 | accuracy = 0.5525423728813559


Epoch[1] Batch[300] Speed: 1.263840407268213 samples/sec                   batch loss = 827.9971601963043 | accuracy = 0.5516666666666666


Epoch[1] Batch[305] Speed: 1.2581933754650576 samples/sec                   batch loss = 841.6731803417206 | accuracy = 0.5524590163934426


Epoch[1] Batch[310] Speed: 1.2606601695576243 samples/sec                   batch loss = 855.0831744670868 | accuracy = 0.5548387096774193


Epoch[1] Batch[315] Speed: 1.2609939829782661 samples/sec                   batch loss = 868.6065666675568 | accuracy = 0.5547619047619048


Epoch[1] Batch[320] Speed: 1.2669796076831579 samples/sec                   batch loss = 882.9705612659454 | accuracy = 0.55234375


Epoch[1] Batch[325] Speed: 1.25912877245944 samples/sec                   batch loss = 896.3359642028809 | accuracy = 0.553076923076923


Epoch[1] Batch[330] Speed: 1.264221155753176 samples/sec                   batch loss = 910.8172526359558 | accuracy = 0.55


Epoch[1] Batch[335] Speed: 1.2586467362368186 samples/sec                   batch loss = 925.1533846855164 | accuracy = 0.5485074626865671


Epoch[1] Batch[340] Speed: 1.2702085827200102 samples/sec                   batch loss = 939.2891023159027 | accuracy = 0.5470588235294118


Epoch[1] Batch[345] Speed: 1.2636819090638276 samples/sec                   batch loss = 952.5772089958191 | accuracy = 0.5463768115942029


Epoch[1] Batch[350] Speed: 1.2621074658973204 samples/sec                   batch loss = 965.6595160961151 | accuracy = 0.5492857142857143


Epoch[1] Batch[355] Speed: 1.2697349400945905 samples/sec                   batch loss = 978.975667476654 | accuracy = 0.5507042253521127


Epoch[1] Batch[360] Speed: 1.261690604538072 samples/sec                   batch loss = 992.2336401939392 | accuracy = 0.5527777777777778


Epoch[1] Batch[365] Speed: 1.259148239249045 samples/sec                   batch loss = 1005.4905602931976 | accuracy = 0.5534246575342465


Epoch[1] Batch[370] Speed: 1.2563393511840006 samples/sec                   batch loss = 1019.7466208934784 | accuracy = 0.5513513513513514


Epoch[1] Batch[375] Speed: 1.2530695027483598 samples/sec                   batch loss = 1033.6881155967712 | accuracy = 0.5493333333333333


Epoch[1] Batch[380] Speed: 1.255659901892589 samples/sec                   batch loss = 1047.6392259597778 | accuracy = 0.5473684210526316


Epoch[1] Batch[385] Speed: 1.2572795300261645 samples/sec                   batch loss = 1060.9374418258667 | accuracy = 0.548051948051948


Epoch[1] Batch[390] Speed: 1.253483308086421 samples/sec                   batch loss = 1074.772985458374 | accuracy = 0.5467948717948717


Epoch[1] Batch[395] Speed: 1.2509843256522974 samples/sec                   batch loss = 1088.7052402496338 | accuracy = 0.5468354430379747


Epoch[1] Batch[400] Speed: 1.265056214613266 samples/sec                   batch loss = 1101.5714039802551 | accuracy = 0.54875


Epoch[1] Batch[405] Speed: 1.2648852046826613 samples/sec                   batch loss = 1114.6575167179108 | accuracy = 0.5487654320987654


Epoch[1] Batch[410] Speed: 1.2665675540175387 samples/sec                   batch loss = 1127.8901317119598 | accuracy = 0.5493902439024391


Epoch[1] Batch[415] Speed: 1.2663919292538581 samples/sec                   batch loss = 1141.7081291675568 | accuracy = 0.5493975903614458


Epoch[1] Batch[420] Speed: 1.2660192346119181 samples/sec                   batch loss = 1154.8949522972107 | accuracy = 0.5517857142857143


Epoch[1] Batch[425] Speed: 1.261643259977434 samples/sec                   batch loss = 1168.7395803928375 | accuracy = 0.5505882352941176


Epoch[1] Batch[430] Speed: 1.257110427603113 samples/sec                   batch loss = 1182.000785112381 | accuracy = 0.5511627906976744


Epoch[1] Batch[435] Speed: 1.260921292678333 samples/sec                   batch loss = 1195.2301998138428 | accuracy = 0.5522988505747126


Epoch[1] Batch[440] Speed: 1.2639741859367448 samples/sec                   batch loss = 1207.7758884429932 | accuracy = 0.5545454545454546


Epoch[1] Batch[445] Speed: 1.2541286216917429 samples/sec                   batch loss = 1221.4522721767426 | accuracy = 0.5544943820224719


Epoch[1] Batch[450] Speed: 1.257758162090906 samples/sec                   batch loss = 1233.6898760795593 | accuracy = 0.5566666666666666


Epoch[1] Batch[455] Speed: 1.2603878867593212 samples/sec                   batch loss = 1247.9209439754486 | accuracy = 0.554945054945055


Epoch[1] Batch[460] Speed: 1.2617203034406912 samples/sec                   batch loss = 1262.0986588001251 | accuracy = 0.5548913043478261


Epoch[1] Batch[465] Speed: 1.2589526537242668 samples/sec                   batch loss = 1275.8467903137207 | accuracy = 0.5543010752688172


Epoch[1] Batch[470] Speed: 1.2597909811630745 samples/sec                   batch loss = 1290.09694314003 | accuracy = 0.5531914893617021


Epoch[1] Batch[475] Speed: 1.2628941967683813 samples/sec                   batch loss = 1302.987122297287 | accuracy = 0.5552631578947368


Epoch[1] Batch[480] Speed: 1.264299467142729 samples/sec                   batch loss = 1315.2765884399414 | accuracy = 0.5572916666666666


Epoch[1] Batch[485] Speed: 1.264363400056024 samples/sec                   batch loss = 1329.010437488556 | accuracy = 0.5561855670103093


Epoch[1] Batch[490] Speed: 1.261691838010991 samples/sec                   batch loss = 1342.5027253627777 | accuracy = 0.5566326530612244


Epoch[1] Batch[495] Speed: 1.2660006056380786 samples/sec                   batch loss = 1355.5998277664185 | accuracy = 0.557070707070707


Epoch[1] Batch[500] Speed: 1.2654112578894854 samples/sec                   batch loss = 1369.4694123268127 | accuracy = 0.557


Epoch[1] Batch[505] Speed: 1.26092081884468 samples/sec                   batch loss = 1381.8520579338074 | accuracy = 0.558910891089109


Epoch[1] Batch[510] Speed: 1.2594083581841273 samples/sec                   batch loss = 1395.5631220340729 | accuracy = 0.557843137254902


Epoch[1] Batch[515] Speed: 1.258340873005962 samples/sec                   batch loss = 1410.0433099269867 | accuracy = 0.5558252427184466


Epoch[1] Batch[520] Speed: 1.261045923293286 samples/sec                   batch loss = 1422.9769535064697 | accuracy = 0.5567307692307693


Epoch[1] Batch[525] Speed: 1.2595175607609739 samples/sec                   batch loss = 1436.1412377357483 | accuracy = 0.5571428571428572


Epoch[1] Batch[530] Speed: 1.2610276299570067 samples/sec                   batch loss = 1450.7010853290558 | accuracy = 0.555188679245283


Epoch[1] Batch[535] Speed: 1.2580542140374307 samples/sec                   batch loss = 1463.6958751678467 | accuracy = 0.5560747663551402


Epoch[1] Batch[540] Speed: 1.2545156410894682 samples/sec                   batch loss = 1477.267261505127 | accuracy = 0.5564814814814815


Epoch[1] Batch[545] Speed: 1.2512558264275373 samples/sec                   batch loss = 1490.8306047916412 | accuracy = 0.5573394495412844


Epoch[1] Batch[550] Speed: 1.255127087369097 samples/sec                   batch loss = 1503.852875471115 | accuracy = 0.5577272727272727


Epoch[1] Batch[555] Speed: 1.2617284637674184 samples/sec                   batch loss = 1517.5037243366241 | accuracy = 0.5581081081081081


Epoch[1] Batch[560] Speed: 1.2563882743197154 samples/sec                   batch loss = 1530.754763841629 | accuracy = 0.559375


Epoch[1] Batch[565] Speed: 1.257954225511841 samples/sec                   batch loss = 1544.5551452636719 | accuracy = 0.5584070796460177


Epoch[1] Batch[570] Speed: 1.258943206686743 samples/sec                   batch loss = 1557.8828301429749 | accuracy = 0.5600877192982456


Epoch[1] Batch[575] Speed: 1.256279331464626 samples/sec                   batch loss = 1570.6795558929443 | accuracy = 0.5604347826086956


Epoch[1] Batch[580] Speed: 1.2589787282835156 samples/sec                   batch loss = 1584.7651934623718 | accuracy = 0.5594827586206896


Epoch[1] Batch[585] Speed: 1.260479644759231 samples/sec                   batch loss = 1597.6716330051422 | accuracy = 0.5606837606837607


Epoch[1] Batch[590] Speed: 1.2574745017479771 samples/sec                   batch loss = 1611.0367889404297 | accuracy = 0.5610169491525424


Epoch[1] Batch[595] Speed: 1.2562308871277938 samples/sec                   batch loss = 1624.7412045001984 | accuracy = 0.5617647058823529


Epoch[1] Batch[600] Speed: 1.2551077447163015 samples/sec                   batch loss = 1637.4599492549896 | accuracy = 0.5620833333333334


Epoch[1] Batch[605] Speed: 1.2606103448514232 samples/sec                   batch loss = 1652.3395035266876 | accuracy = 0.5599173553719008


Epoch[1] Batch[610] Speed: 1.2580861005134638 samples/sec                   batch loss = 1666.56436753273 | accuracy = 0.559016393442623


Epoch[1] Batch[615] Speed: 1.2572130141190914 samples/sec                   batch loss = 1678.4118589162827 | accuracy = 0.5601626016260163


Epoch[1] Batch[620] Speed: 1.2603003077355768 samples/sec                   batch loss = 1690.9912155866623 | accuracy = 0.5612903225806452


Epoch[1] Batch[625] Speed: 1.2599895714430873 samples/sec                   batch loss = 1703.4568020105362 | accuracy = 0.5636


Epoch[1] Batch[630] Speed: 1.2554608889719017 samples/sec                   batch loss = 1717.1200414896011 | accuracy = 0.5634920634920635


Epoch[1] Batch[635] Speed: 1.2587628899239267 samples/sec                   batch loss = 1730.5476070642471 | accuracy = 0.5633858267716535


Epoch[1] Batch[640] Speed: 1.2541204656265008 samples/sec                   batch loss = 1743.881111741066 | accuracy = 0.563671875


Epoch[1] Batch[645] Speed: 1.2572362844646128 samples/sec                   batch loss = 1757.1169573068619 | accuracy = 0.5643410852713179


Epoch[1] Batch[650] Speed: 1.2580939308716392 samples/sec                   batch loss = 1771.0630048513412 | accuracy = 0.5646153846153846


Epoch[1] Batch[655] Speed: 1.2647266350095827 samples/sec                   batch loss = 1784.3967267274857 | accuracy = 0.565267175572519


Epoch[1] Batch[660] Speed: 1.2631668028046354 samples/sec                   batch loss = 1796.8025625944138 | accuracy = 0.5659090909090909


Epoch[1] Batch[665] Speed: 1.2580313850681988 samples/sec                   batch loss = 1809.6359394788742 | accuracy = 0.5672932330827067


Epoch[1] Batch[670] Speed: 1.261699808201838 samples/sec                   batch loss = 1822.5549288988113 | accuracy = 0.5675373134328359


Epoch[1] Batch[675] Speed: 1.2623065023108633 samples/sec                   batch loss = 1835.8269098997116 | accuracy = 0.5677777777777778


Epoch[1] Batch[680] Speed: 1.25585145737376 samples/sec                   batch loss = 1848.7914346456528 | accuracy = 0.5680147058823529


Epoch[1] Batch[685] Speed: 1.2598008193217933 samples/sec                   batch loss = 1861.442690730095 | accuracy = 0.5682481751824817


Epoch[1] Batch[690] Speed: 1.2542174077931822 samples/sec                   batch loss = 1872.8758817911148 | accuracy = 0.5702898550724638


Epoch[1] Batch[695] Speed: 1.2639718052859061 samples/sec                   batch loss = 1885.3874017000198 | accuracy = 0.570863309352518


Epoch[1] Batch[700] Speed: 1.2580737419537034 samples/sec                   batch loss = 1896.7893069982529 | accuracy = 0.5725


Epoch[1] Batch[705] Speed: 1.2552082203296373 samples/sec                   batch loss = 1910.331141114235 | accuracy = 0.573404255319149


Epoch[1] Batch[710] Speed: 1.2650515405538543 samples/sec                   batch loss = 1923.4646192789078 | accuracy = 0.5735915492957746


Epoch[1] Batch[715] Speed: 1.262965688606671 samples/sec                   batch loss = 1937.8131130933762 | accuracy = 0.5734265734265734


Epoch[1] Batch[720] Speed: 1.2593047514291877 samples/sec                   batch loss = 1951.5551973581314 | accuracy = 0.5725694444444445


Epoch[1] Batch[725] Speed: 1.2640398954394718 samples/sec                   batch loss = 1965.1857460737228 | accuracy = 0.5727586206896552


Epoch[1] Batch[730] Speed: 1.2606183013843555 samples/sec                   batch loss = 1977.9324816465378 | accuracy = 0.5732876712328767


Epoch[1] Batch[735] Speed: 1.2592150546051875 samples/sec                   batch loss = 1991.420982003212 | accuracy = 0.573469387755102


Epoch[1] Batch[740] Speed: 1.2595059304946656 samples/sec                   batch loss = 2002.9288738965988 | accuracy = 0.5753378378378379


Epoch[1] Batch[745] Speed: 1.2623432587434034 samples/sec                   batch loss = 2014.765988111496 | accuracy = 0.5765100671140939


Epoch[1] Batch[750] Speed: 1.2614169280769283 samples/sec                   batch loss = 2028.611351966858 | accuracy = 0.5766666666666667


Epoch[1] Batch[755] Speed: 1.252907144772128 samples/sec                   batch loss = 2042.950814962387 | accuracy = 0.5761589403973509


Epoch[1] Batch[760] Speed: 1.2597035796535074 samples/sec                   batch loss = 2053.95369553566 | accuracy = 0.5776315789473684


Epoch[1] Batch[765] Speed: 1.2581331784727838 samples/sec                   batch loss = 2068.163328051567 | accuracy = 0.5777777777777777


Epoch[1] Batch[770] Speed: 1.2561870552777334 samples/sec                   batch loss = 2081.019329190254 | accuracy = 0.5785714285714286


Epoch[1] Batch[775] Speed: 1.2546828262927938 samples/sec                   batch loss = 2092.340374350548 | accuracy = 0.5796774193548387


Epoch[1] Batch[780] Speed: 1.2606023884189272 samples/sec                   batch loss = 2105.5629926919937 | accuracy = 0.5801282051282052


Epoch[1] Batch[785] Speed: 1.2538546547288068 samples/sec                   batch loss = 2118.260651230812 | accuracy = 0.5808917197452229


[Epoch 1] training: accuracy=0.5815355329949239
[Epoch 1] time cost: 644.6514472961426
[Epoch 1] validation: validation accuracy=0.6922222222222222


Epoch[2] Batch[5] Speed: 1.2519343468620148 samples/sec                   batch loss = 11.857275009155273 | accuracy = 0.75


Epoch[2] Batch[10] Speed: 1.258531831286142 samples/sec                   batch loss = 23.846375465393066 | accuracy = 0.75


Epoch[2] Batch[15] Speed: 1.2572039700067381 samples/sec                   batch loss = 37.51780915260315 | accuracy = 0.65


Epoch[2] Batch[20] Speed: 1.2588097352001724 samples/sec                   batch loss = 50.911168932914734 | accuracy = 0.6125


Epoch[2] Batch[25] Speed: 1.2572014263735838 samples/sec                   batch loss = 61.75223243236542 | accuracy = 0.65


Epoch[2] Batch[30] Speed: 1.2553294697986672 samples/sec                   batch loss = 73.21168959140778 | accuracy = 0.6583333333333333


Epoch[2] Batch[35] Speed: 1.2511253793895463 samples/sec                   batch loss = 84.20527946949005 | accuracy = 0.6714285714285714


Epoch[2] Batch[40] Speed: 1.2514280245705482 samples/sec                   batch loss = 97.37880325317383 | accuracy = 0.65625


Epoch[2] Batch[45] Speed: 1.251378926936806 samples/sec                   batch loss = 109.6189752817154 | accuracy = 0.6722222222222223


Epoch[2] Batch[50] Speed: 1.2478893558795978 samples/sec                   batch loss = 122.21114456653595 | accuracy = 0.665


Epoch[2] Batch[55] Speed: 1.2577695715144537 samples/sec                   batch loss = 135.99033629894257 | accuracy = 0.6545454545454545


Epoch[2] Batch[60] Speed: 1.2544940659986836 samples/sec                   batch loss = 147.9133073091507 | accuracy = 0.6625


Epoch[2] Batch[65] Speed: 1.2510401089763674 samples/sec                   batch loss = 161.85486829280853 | accuracy = 0.6538461538461539


Epoch[2] Batch[70] Speed: 1.2551203267624835 samples/sec                   batch loss = 173.63746917247772 | accuracy = 0.6535714285714286


Epoch[2] Batch[75] Speed: 1.2499227795028118 samples/sec                   batch loss = 186.75453507900238 | accuracy = 0.6466666666666666


Epoch[2] Batch[80] Speed: 1.2494419958284948 samples/sec                   batch loss = 199.48765528202057 | accuracy = 0.640625


Epoch[2] Batch[85] Speed: 1.2507188167280314 samples/sec                   batch loss = 210.52761924266815 | accuracy = 0.65


Epoch[2] Batch[90] Speed: 1.2538967308223234 samples/sec                   batch loss = 222.64129626750946 | accuracy = 0.6472222222222223


Epoch[2] Batch[95] Speed: 1.2509750911003406 samples/sec                   batch loss = 236.32118999958038 | accuracy = 0.6394736842105263


Epoch[2] Batch[100] Speed: 1.2519482667041568 samples/sec                   batch loss = 247.43798768520355 | accuracy = 0.6425


Epoch[2] Batch[105] Speed: 1.255530696046869 samples/sec                   batch loss = 260.1928938627243 | accuracy = 0.6404761904761904


Epoch[2] Batch[110] Speed: 1.2543240239769657 samples/sec                   batch loss = 273.09263730049133 | accuracy = 0.6431818181818182


Epoch[2] Batch[115] Speed: 1.2557488109573358 samples/sec                   batch loss = 283.6248381137848 | accuracy = 0.6478260869565218


Epoch[2] Batch[120] Speed: 1.2587386186419962 samples/sec                   batch loss = 295.9944885969162 | accuracy = 0.6520833333333333


Epoch[2] Batch[125] Speed: 1.253834976456052 samples/sec                   batch loss = 307.79068553447723 | accuracy = 0.654


Epoch[2] Batch[130] Speed: 1.2561276143937778 samples/sec                   batch loss = 319.8487995862961 | accuracy = 0.6538461538461539


Epoch[2] Batch[135] Speed: 1.2518793244758994 samples/sec                   batch loss = 333.2480961084366 | accuracy = 0.6481481481481481


Epoch[2] Batch[140] Speed: 1.2570042790191402 samples/sec                   batch loss = 344.6905987262726 | accuracy = 0.6535714285714286


Epoch[2] Batch[145] Speed: 1.2555991953252228 samples/sec                   batch loss = 357.4394303560257 | accuracy = 0.6551724137931034


Epoch[2] Batch[150] Speed: 1.2556896934852795 samples/sec                   batch loss = 368.9337486028671 | accuracy = 0.6583333333333333


Epoch[2] Batch[155] Speed: 1.254191623839157 samples/sec                   batch loss = 380.40487492084503 | accuracy = 0.6596774193548387


Epoch[2] Batch[160] Speed: 1.257054949310101 samples/sec                   batch loss = 393.60875737667084 | accuracy = 0.659375


Epoch[2] Batch[165] Speed: 1.258428463084611 samples/sec                   batch loss = 405.9857199192047 | accuracy = 0.656060606060606


Epoch[2] Batch[170] Speed: 1.2542610085273997 samples/sec                   batch loss = 419.28751826286316 | accuracy = 0.6558823529411765


Epoch[2] Batch[175] Speed: 1.2588314589914127 samples/sec                   batch loss = 431.6249086856842 | accuracy = 0.6557142857142857


Epoch[2] Batch[180] Speed: 1.254025225636798 samples/sec                   batch loss = 446.5198736190796 | accuracy = 0.6541666666666667


Epoch[2] Batch[185] Speed: 1.2567222764919705 samples/sec                   batch loss = 458.0463447570801 | accuracy = 0.6581081081081082


Epoch[2] Batch[190] Speed: 1.2583841945919803 samples/sec                   batch loss = 471.04189002513885 | accuracy = 0.656578947368421


Epoch[2] Batch[195] Speed: 1.2545617017012676 samples/sec                   batch loss = 480.6282252073288 | accuracy = 0.6628205128205128


Epoch[2] Batch[200] Speed: 1.2521168241227272 samples/sec                   batch loss = 494.17504346370697 | accuracy = 0.6625


Epoch[2] Batch[205] Speed: 1.257253620018739 samples/sec                   batch loss = 505.45300686359406 | accuracy = 0.6597560975609756


Epoch[2] Batch[210] Speed: 1.253794591078556 samples/sec                   batch loss = 516.6294537782669 | accuracy = 0.6607142857142857


Epoch[2] Batch[215] Speed: 1.2553584942536493 samples/sec                   batch loss = 526.6470452547073 | accuracy = 0.663953488372093


Epoch[2] Batch[220] Speed: 1.2531631935538181 samples/sec                   batch loss = 539.0305043458939 | accuracy = 0.6659090909090909


Epoch[2] Batch[225] Speed: 1.2545148906390706 samples/sec                   batch loss = 549.5848525762558 | accuracy = 0.6677777777777778


Epoch[2] Batch[230] Speed: 1.2603889283128376 samples/sec                   batch loss = 559.0668312311172 | accuracy = 0.6684782608695652


Epoch[2] Batch[235] Speed: 1.2621632011603596 samples/sec                   batch loss = 569.6603873968124 | accuracy = 0.6691489361702128


Epoch[2] Batch[240] Speed: 1.2627398324296497 samples/sec                   batch loss = 581.2110805511475 | accuracy = 0.6708333333333333


Epoch[2] Batch[245] Speed: 1.2602092384609376 samples/sec                   batch loss = 591.5924026966095 | accuracy = 0.673469387755102


Epoch[2] Batch[250] Speed: 1.257868304675968 samples/sec                   batch loss = 603.0369024276733 | accuracy = 0.673


Epoch[2] Batch[255] Speed: 1.2584126053561018 samples/sec                   batch loss = 616.3768448829651 | accuracy = 0.6715686274509803


Epoch[2] Batch[260] Speed: 1.2600140802526334 samples/sec                   batch loss = 626.1886097192764 | accuracy = 0.6721153846153847


Epoch[2] Batch[265] Speed: 1.2597178619716294 samples/sec                   batch loss = 636.5508555173874 | accuracy = 0.6745283018867925


Epoch[2] Batch[270] Speed: 1.254362567796725 samples/sec                   batch loss = 649.0398503541946 | accuracy = 0.675


Epoch[2] Batch[275] Speed: 1.2544867493972716 samples/sec                   batch loss = 657.8715898394585 | accuracy = 0.6763636363636364


Epoch[2] Batch[280] Speed: 1.2568875080310309 samples/sec                   batch loss = 672.6213063597679 | accuracy = 0.6741071428571429


Epoch[2] Batch[285] Speed: 1.2571252163618245 samples/sec                   batch loss = 684.4081441760063 | accuracy = 0.6728070175438596


Epoch[2] Batch[290] Speed: 1.2593229002860593 samples/sec                   batch loss = 696.0719042420387 | accuracy = 0.6741379310344827


Epoch[2] Batch[295] Speed: 1.2587123651573817 samples/sec                   batch loss = 706.846771299839 | accuracy = 0.6745762711864407


Epoch[2] Batch[300] Speed: 1.256381500119818 samples/sec                   batch loss = 718.9660688042641 | accuracy = 0.6758333333333333


Epoch[2] Batch[305] Speed: 1.2582931188970063 samples/sec                   batch loss = 732.4971850514412 | accuracy = 0.6721311475409836


Epoch[2] Batch[310] Speed: 1.2588653684945703 samples/sec                   batch loss = 745.2903559803963 | accuracy = 0.6733870967741935


Epoch[2] Batch[315] Speed: 1.2507891231877601 samples/sec                   batch loss = 757.0601113438606 | accuracy = 0.6730158730158731


Epoch[2] Batch[320] Speed: 1.2481705649950583 samples/sec                   batch loss = 768.5171170830727 | accuracy = 0.671875


Epoch[2] Batch[325] Speed: 1.2473315834089564 samples/sec                   batch loss = 779.2766560912132 | accuracy = 0.6730769230769231


Epoch[2] Batch[330] Speed: 1.2612486072655313 samples/sec                   batch loss = 791.0975776314735 | accuracy = 0.6727272727272727


Epoch[2] Batch[335] Speed: 1.2556020143856894 samples/sec                   batch loss = 802.5042055249214 | accuracy = 0.6738805970149254


Epoch[2] Batch[340] Speed: 1.2575989232327334 samples/sec                   batch loss = 812.2818587422371 | accuracy = 0.6757352941176471


Epoch[2] Batch[345] Speed: 1.253079236205868 samples/sec                   batch loss = 826.0744070410728 | accuracy = 0.6746376811594202


Epoch[2] Batch[350] Speed: 1.2518751209271024 samples/sec                   batch loss = 841.5804058909416 | accuracy = 0.6728571428571428


Epoch[2] Batch[355] Speed: 1.2560316931863347 samples/sec                   batch loss = 856.0765840411186 | accuracy = 0.6711267605633803


Epoch[2] Batch[360] Speed: 1.2538622450847992 samples/sec                   batch loss = 868.4860063195229 | accuracy = 0.6722222222222223


Epoch[2] Batch[365] Speed: 1.2487008342006098 samples/sec                   batch loss = 876.5001621842384 | accuracy = 0.6753424657534246


Epoch[2] Batch[370] Speed: 1.2527446417799035 samples/sec                   batch loss = 887.8969529271126 | accuracy = 0.677027027027027


Epoch[2] Batch[375] Speed: 1.2483122852154094 samples/sec                   batch loss = 898.6728604435921 | accuracy = 0.678


Epoch[2] Batch[380] Speed: 1.250437017527358 samples/sec                   batch loss = 912.0301364660263 | accuracy = 0.6769736842105263


Epoch[2] Batch[385] Speed: 1.251026489177347 samples/sec                   batch loss = 925.0934969186783 | accuracy = 0.6785714285714286


Epoch[2] Batch[390] Speed: 1.2507842742126374 samples/sec                   batch loss = 937.6152331829071 | accuracy = 0.6775641025641026


Epoch[2] Batch[395] Speed: 1.2474377740887082 samples/sec                   batch loss = 947.7515758275986 | accuracy = 0.6778481012658227


Epoch[2] Batch[400] Speed: 1.2480204286791208 samples/sec                   batch loss = 961.8513292074203 | accuracy = 0.676875


Epoch[2] Batch[405] Speed: 1.2497584429865483 samples/sec                   batch loss = 975.458349108696 | accuracy = 0.6734567901234568


Epoch[2] Batch[410] Speed: 1.2466350686108583 samples/sec                   batch loss = 988.4199296236038 | accuracy = 0.6719512195121952


Epoch[2] Batch[415] Speed: 1.2529121973543185 samples/sec                   batch loss = 999.6038615703583 | accuracy = 0.672289156626506


Epoch[2] Batch[420] Speed: 1.256880069324168 samples/sec                   batch loss = 1011.0845830440521 | accuracy = 0.6726190476190477


Epoch[2] Batch[425] Speed: 1.2462384540394482 samples/sec                   batch loss = 1023.8507808446884 | accuracy = 0.6723529411764706


Epoch[2] Batch[430] Speed: 1.248783090427792 samples/sec                   batch loss = 1037.4335602521896 | accuracy = 0.672093023255814


Epoch[2] Batch[435] Speed: 1.2541255279993355 samples/sec                   batch loss = 1047.0104525089264 | accuracy = 0.6729885057471264


Epoch[2] Batch[440] Speed: 1.2491703514496317 samples/sec                   batch loss = 1059.0984292030334 | accuracy = 0.6721590909090909


Epoch[2] Batch[445] Speed: 1.2500634076571415 samples/sec                   batch loss = 1069.386946797371 | accuracy = 0.6730337078651686


Epoch[2] Batch[450] Speed: 1.2542819192023178 samples/sec                   batch loss = 1078.6266074180603 | accuracy = 0.6744444444444444


Epoch[2] Batch[455] Speed: 1.2527547443638896 samples/sec                   batch loss = 1090.7855477333069 | accuracy = 0.6741758241758242


Epoch[2] Batch[460] Speed: 1.2493084845994573 samples/sec                   batch loss = 1101.1217546463013 | accuracy = 0.6744565217391304


Epoch[2] Batch[465] Speed: 1.2516867375138452 samples/sec                   batch loss = 1112.5214142799377 | accuracy = 0.6741935483870968


Epoch[2] Batch[470] Speed: 1.2514426799340608 samples/sec                   batch loss = 1121.6786340475082 | accuracy = 0.675


Epoch[2] Batch[475] Speed: 1.2528993788456224 samples/sec                   batch loss = 1133.2716467380524 | accuracy = 0.6752631578947368


Epoch[2] Batch[480] Speed: 1.2511098917766963 samples/sec                   batch loss = 1145.489668250084 | accuracy = 0.6744791666666666


Epoch[2] Batch[485] Speed: 1.251969941168543 samples/sec                   batch loss = 1156.5882060527802 | accuracy = 0.6747422680412372


Epoch[2] Batch[490] Speed: 1.2507755088530164 samples/sec                   batch loss = 1169.3083692789078 | accuracy = 0.6744897959183673


Epoch[2] Batch[495] Speed: 1.2513991815774215 samples/sec                   batch loss = 1181.6055186986923 | accuracy = 0.6747474747474748


Epoch[2] Batch[500] Speed: 1.254674475357335 samples/sec                   batch loss = 1196.7973903417587 | accuracy = 0.674


Epoch[2] Batch[505] Speed: 1.252514571483454 samples/sec                   batch loss = 1205.7902955412865 | accuracy = 0.6752475247524753


Epoch[2] Batch[510] Speed: 1.2551284958379751 samples/sec                   batch loss = 1217.0444085001945 | accuracy = 0.6754901960784314


Epoch[2] Batch[515] Speed: 1.2570376192342188 samples/sec                   batch loss = 1230.3102216124535 | accuracy = 0.6762135922330097


Epoch[2] Batch[520] Speed: 1.2570436470325026 samples/sec                   batch loss = 1243.4039737582207 | accuracy = 0.6754807692307693


Epoch[2] Batch[525] Speed: 1.2545397497862143 samples/sec                   batch loss = 1256.7304189801216 | accuracy = 0.6747619047619048


Epoch[2] Batch[530] Speed: 1.253598791682912 samples/sec                   batch loss = 1269.0801535248756 | accuracy = 0.6745283018867925


Epoch[2] Batch[535] Speed: 1.2555228036061186 samples/sec                   batch loss = 1281.631197154522 | accuracy = 0.6742990654205607


Epoch[2] Batch[540] Speed: 1.2511320037265758 samples/sec                   batch loss = 1294.2463175058365 | accuracy = 0.674074074074074


Epoch[2] Batch[545] Speed: 1.254102185175966 samples/sec                   batch loss = 1306.1396164894104 | accuracy = 0.6752293577981652


Epoch[2] Batch[550] Speed: 1.2587411684992094 samples/sec                   batch loss = 1315.2376118898392 | accuracy = 0.6763636363636364


Epoch[2] Batch[555] Speed: 1.2540455660339138 samples/sec                   batch loss = 1325.9946327209473 | accuracy = 0.6765765765765765


Epoch[2] Batch[560] Speed: 1.2560528510168454 samples/sec                   batch loss = 1339.9657382965088 | accuracy = 0.6758928571428572


Epoch[2] Batch[565] Speed: 1.252393958637417 samples/sec                   batch loss = 1351.870371222496 | accuracy = 0.6747787610619469


Epoch[2] Batch[570] Speed: 1.2533685948437512 samples/sec                   batch loss = 1363.3956019878387 | accuracy = 0.6754385964912281


Epoch[2] Batch[575] Speed: 1.2525162546177404 samples/sec                   batch loss = 1374.95964717865 | accuracy = 0.6760869565217391


Epoch[2] Batch[580] Speed: 1.2537217915294574 samples/sec                   batch loss = 1387.0872361660004 | accuracy = 0.6762931034482759


Epoch[2] Batch[585] Speed: 1.257008799622625 samples/sec                   batch loss = 1397.0637902617455 | accuracy = 0.6777777777777778


Epoch[2] Batch[590] Speed: 1.2516462102696784 samples/sec                   batch loss = 1409.5670081973076 | accuracy = 0.6779661016949152


Epoch[2] Batch[595] Speed: 1.2468574239084527 samples/sec                   batch loss = 1419.8002413511276 | accuracy = 0.6785714285714286


Epoch[2] Batch[600] Speed: 1.2533388196549564 samples/sec                   batch loss = 1431.4167147874832 | accuracy = 0.6791666666666667


Epoch[2] Batch[605] Speed: 1.2557882883624647 samples/sec                   batch loss = 1441.7063641548157 | accuracy = 0.6801652892561983


Epoch[2] Batch[610] Speed: 1.2553816020609034 samples/sec                   batch loss = 1450.9277729988098 | accuracy = 0.6807377049180328


Epoch[2] Batch[615] Speed: 1.2539865150909957 samples/sec                   batch loss = 1465.6000690460205 | accuracy = 0.6788617886178862


Epoch[2] Batch[620] Speed: 1.2546479219979045 samples/sec                   batch loss = 1476.1483421325684 | accuracy = 0.6794354838709677


Epoch[2] Batch[625] Speed: 1.2576601061587296 samples/sec                   batch loss = 1489.7855986356735 | accuracy = 0.6788


Epoch[2] Batch[630] Speed: 1.2596220536808165 samples/sec                   batch loss = 1502.407923579216 | accuracy = 0.6785714285714286


Epoch[2] Batch[635] Speed: 1.2602295906391017 samples/sec                   batch loss = 1517.048398733139 | accuracy = 0.6779527559055119


Epoch[2] Batch[640] Speed: 1.259629430306758 samples/sec                   batch loss = 1528.170728802681 | accuracy = 0.678125


Epoch[2] Batch[645] Speed: 1.2571517804837586 samples/sec                   batch loss = 1540.6218110322952 | accuracy = 0.6767441860465117


Epoch[2] Batch[650] Speed: 1.2574315255018025 samples/sec                   batch loss = 1550.5432653427124 | accuracy = 0.6776923076923077


Epoch[2] Batch[655] Speed: 1.2628932461335731 samples/sec                   batch loss = 1561.3179448843002 | accuracy = 0.6786259541984733


Epoch[2] Batch[660] Speed: 1.258630589738523 samples/sec                   batch loss = 1572.874749302864 | accuracy = 0.6787878787878788


Epoch[2] Batch[665] Speed: 1.258085534467253 samples/sec                   batch loss = 1584.4391185641289 | accuracy = 0.6789473684210526


Epoch[2] Batch[670] Speed: 1.266657631991508 samples/sec                   batch loss = 1595.9778082966805 | accuracy = 0.6783582089552239


Epoch[2] Batch[675] Speed: 1.2597459546284049 samples/sec                   batch loss = 1606.9993919730186 | accuracy = 0.6781481481481482


Epoch[2] Batch[680] Speed: 1.2525689016014787 samples/sec                   batch loss = 1616.7941223978996 | accuracy = 0.6790441176470589


Epoch[2] Batch[685] Speed: 1.254329744442857 samples/sec                   batch loss = 1628.436923444271 | accuracy = 0.6788321167883211


Epoch[2] Batch[690] Speed: 1.2525481415163335 samples/sec                   batch loss = 1643.010230600834 | accuracy = 0.6778985507246377


Epoch[2] Batch[695] Speed: 1.2581706357988245 samples/sec                   batch loss = 1655.6874139904976 | accuracy = 0.6776978417266187


Epoch[2] Batch[700] Speed: 1.2531093734466576 samples/sec                   batch loss = 1666.912180364132 | accuracy = 0.6775


Epoch[2] Batch[705] Speed: 1.255479678807279 samples/sec                   batch loss = 1675.0541833043098 | accuracy = 0.6787234042553192


Epoch[2] Batch[710] Speed: 1.254815517867447 samples/sec                   batch loss = 1687.3540283441544 | accuracy = 0.6792253521126761


Epoch[2] Batch[715] Speed: 1.2559550607891792 samples/sec                   batch loss = 1699.0206780433655 | accuracy = 0.679020979020979


Epoch[2] Batch[720] Speed: 1.2525431853815452 samples/sec                   batch loss = 1708.7737171649933 | accuracy = 0.6795138888888889


Epoch[2] Batch[725] Speed: 1.2598789624624434 samples/sec                   batch loss = 1719.147077679634 | accuracy = 0.6803448275862068


Epoch[2] Batch[730] Speed: 1.2561081468461888 samples/sec                   batch loss = 1728.7425006628036 | accuracy = 0.6811643835616439


Epoch[2] Batch[735] Speed: 1.2626004236810127 samples/sec                   batch loss = 1736.7607560157776 | accuracy = 0.682312925170068


Epoch[2] Batch[740] Speed: 1.2559794128931436 samples/sec                   batch loss = 1747.745106101036 | accuracy = 0.6820945945945946


Epoch[2] Batch[745] Speed: 1.2541749351540776 samples/sec                   batch loss = 1760.5215406417847 | accuracy = 0.6818791946308724


Epoch[2] Batch[750] Speed: 1.2600924391371247 samples/sec                   batch loss = 1768.1331266760826 | accuracy = 0.6823333333333333


Epoch[2] Batch[755] Speed: 1.2620020855826035 samples/sec                   batch loss = 1777.2821368575096 | accuracy = 0.6834437086092715


Epoch[2] Batch[760] Speed: 1.2588944622262987 samples/sec                   batch loss = 1787.8973085284233 | accuracy = 0.6832236842105263


Epoch[2] Batch[765] Speed: 1.2579346069872843 samples/sec                   batch loss = 1798.1759443879128 | accuracy = 0.6839869281045752


Epoch[2] Batch[770] Speed: 1.2621336713022544 samples/sec                   batch loss = 1806.324726998806 | accuracy = 0.6853896103896104


Epoch[2] Batch[775] Speed: 1.2630242571010606 samples/sec                   batch loss = 1816.624566257 | accuracy = 0.6861290322580645


Epoch[2] Batch[780] Speed: 1.2562468780491787 samples/sec                   batch loss = 1830.7239237427711 | accuracy = 0.6852564102564103


Epoch[2] Batch[785] Speed: 1.2551772308080882 samples/sec                   batch loss = 1842.1672549843788 | accuracy = 0.6859872611464968


[Epoch 2] training: accuracy=0.686230964467005
[Epoch 2] time cost: 644.1225340366364
[Epoch 2] validation: validation accuracy=0.7533333333333333
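
The log lines above follow a regular pattern, so they can be summarized programmatically rather than read line by line. The following is a minimal sketch, assuming the exact line format shown in this output. `parse_log_line` and `LOG_PATTERN` are hypothetical names introduced here for illustration and are not part of MXNet. Because the logged `batch loss` only grows within an epoch, it appears to be a running total; dividing it by the batch index to get a mean is an assumption about how the training loop accumulates it.

```python
import re

# Matches lines such as:
# Epoch[2] Batch[785] Speed: 1.25 samples/sec    batch loss = 1842.16 | accuracy = 0.68
LOG_PATTERN = re.compile(
    r"Epoch\[(?P<epoch>\d+)\] Batch\[(?P<batch>\d+)\] "
    r"Speed: (?P<speed>[\d.]+) samples/sec\s+"
    r"batch loss = (?P<loss>[\d.]+) \| accuracy = (?P<acc>[\d.]+)"
)


def parse_log_line(line):
    """Return a dict of the fields in one batch log line, or None if it does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    batch = int(match.group("batch"))
    loss = float(match.group("loss"))
    return {
        "epoch": int(match.group("epoch")),
        "batch": batch,
        "speed": float(match.group("speed")),
        "cumulative_loss": loss,
        # Assumption: the logged loss is a running sum over the epoch so far.
        "mean_loss": loss / batch,
        "accuracy": float(match.group("acc")),
    }


# Last batch line of epoch 2, copied from the output above.
sample = (
    "Epoch[2] Batch[785] Speed: 1.2551772308080882 samples/sec"
    "                   batch loss = 1842.1672549843788 | accuracy = 0.6859872611464968"
)
record = parse_log_line(sample)
print(record["epoch"], record["batch"], round(record["mean_loss"], 3))
```

Applying the same helper to every line of a saved log would give one record per logged batch, which can then be plotted or compared across epochs.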


## Next steps

Now that you have trained a neural network and run predictions with it on GPUs, you have reached the end of the crash course. Congratulations!
If you want to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).