<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference can often run faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, use `npx.gpu()` or, equivalently, `npx.gpu(0)`, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[21:32:27] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you only have one GPU, copying to `npx.gpu(1)` would fail, so the code falls back to `npx.cpu()` in that case:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[21:32:27] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])


MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
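
The 0-based numbering described above can be sketched in plain Python, without MXNet. The helper name `pick_device_ids` is hypothetical and only illustrates the indexing; in a real script each returned ID would be passed to `npx.gpu(i)`, and the available count would come from `npx.num_gpus()`.

```python
def pick_device_ids(num_available, num_requested):
    """Return the 0-based GPU IDs to use, capped at the number available.

    Hypothetical helper for illustration only; it is not part of MXNet.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, the valid IDs run from 0 to 7.
all_ids = pick_device_ids(8, 8)
first_two = pick_device_ids(8, 2)

# Asking for two GPUs when only one is present yields just ID 0.
single = pick_device_ids(1, 2)

print(all_ids[-1], first_two, single)
```

Capping the request at the available count mirrors the slicing pattern `available_gpus[:num_gpus]`, which silently returns fewer devices when the machine has fewer GPUs than requested.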

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the previously defined LeafNetwork from [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[21:32:27] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 4.4527636, -7.3940105]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, you split each data batch into *n* parts and have each GPU run the forward and backward passes on its own part of the data.
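
To make the idea concrete, here is a minimal plain-Python sketch of data parallelism on a toy one-parameter model. It does not use MXNet: `split_batch` and `grad_sum` are hypothetical stand-ins, and the sketch assumes that per-device gradients are summed and then rescaled by the full batch size. Under that assumption, the combined gradient matches the one computed on the whole batch at once.

```python
def split_batch(batch, num_devices):
    """Split a list of samples into num_devices nearly equal contiguous parts.

    Plain-Python stand-in for illustration; not the MXNet implementation.
    """
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    size, extra = divmod(len(batch), num_devices)
    parts, start = [], 0
    for i in range(num_devices):
        end = start + size + (1 if i < extra else 0)
        parts.append(batch[start:end])
        start = end
    return parts


def grad_sum(shard, w):
    """Sum over one shard of d/dw of the loss 0.5 * (w * x - y) ** 2."""
    return sum((w * x - y) * x for x, y in shard)


batch = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]  # toy (x, y) pairs
w = 1.5                                                   # toy model weight

# Each "device" processes its own shard independently.
shards = split_batch(batch, 2)
per_device = [grad_sum(shard, w) for shard in shards]

# Sum the per-device results, then normalize by the full batch size.
data_parallel_grad = sum(per_device) / len(batch)
full_batch_grad = grad_sum(batch, w) / len(batch)

print(abs(data_parallel_grad - full_batch_grad) < 1e-12)
```

Because the combined result is the same, splitting the batch changes where the work happens rather than what is computed, which is what lets the devices run in parallel.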

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image values in the range (0, 1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
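
The way `test` combines results from several devices can be illustrated without MXNet. The sketch below is a plain-Python approximation, not the `gluon.metric.Accuracy` implementation: the hypothetical `sharded_accuracy` takes one list of labels and one list of score rows per device, treats the highest-scoring index as the prediction, and counts matches across all shards.

```python
def argmax(row):
    """Index of the largest score in a row."""
    return max(range(len(row)), key=row.__getitem__)


def sharded_accuracy(label_shards, output_shards):
    """Fraction of correct predictions across all device shards.

    Hypothetical helper for illustration only; it is not part of MXNet.
    """
    correct = total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            correct += int(argmax(scores) == label)
            total += 1
    return correct / total if total else 0.0


# Two shards, as if a batch of four samples was split across two devices.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],  # device 0: both predictions correct
    [[1.2, 0.1], [0.8, 0.2]],   # device 1: first wrong, second correct
]

acc = sharded_accuracy(label_shards, output_shards)
print(acc)
```

Counting over all shards before dividing gives one accuracy for the whole batch, regardless of how many devices it was split across.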

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7698543650873406 samples/sec                   batch loss = 13.433427333831787 | accuracy = 0.65


Epoch[1] Batch[10] Speed: 1.2546800113583088 samples/sec                   batch loss = 27.80432152748108 | accuracy = 0.6


Epoch[1] Batch[15] Speed: 1.2474947257089835 samples/sec                   batch loss = 40.561909914016724 | accuracy = 0.6666666666666666


Epoch[1] Batch[20] Speed: 1.244800334059538 samples/sec                   batch loss = 54.46092748641968 | accuracy = 0.65


Epoch[1] Batch[25] Speed: 1.247314242194424 samples/sec                   batch loss = 68.69972252845764 | accuracy = 0.62


Epoch[1] Batch[30] Speed: 1.2533055817285235 samples/sec                   batch loss = 82.0565733909607 | accuracy = 0.625


Epoch[1] Batch[35] Speed: 1.2440145777026401 samples/sec                   batch loss = 95.24518036842346 | accuracy = 0.6142857142857143


Epoch[1] Batch[40] Speed: 1.2443986094838613 samples/sec                   batch loss = 109.09039855003357 | accuracy = 0.60625


Epoch[1] Batch[45] Speed: 1.2462262345778055 samples/sec                   batch loss = 124.365243434906 | accuracy = 0.5888888888888889


Epoch[1] Batch[50] Speed: 1.2473461429828223 samples/sec                   batch loss = 137.64806628227234 | accuracy = 0.59


Epoch[1] Batch[55] Speed: 1.2436792743891654 samples/sec                   batch loss = 152.76800441741943 | accuracy = 0.5681818181818182


Epoch[1] Batch[60] Speed: 1.253794591078556 samples/sec                   batch loss = 166.85911893844604 | accuracy = 0.5583333333333333


Epoch[1] Batch[65] Speed: 1.2421344773725578 samples/sec                   batch loss = 180.73202180862427 | accuracy = 0.55


Epoch[1] Batch[70] Speed: 1.2456792888975743 samples/sec                   batch loss = 194.58788561820984 | accuracy = 0.55


Epoch[1] Batch[75] Speed: 1.2444401456251413 samples/sec                   batch loss = 208.07134890556335 | accuracy = 0.5466666666666666


Epoch[1] Batch[80] Speed: 1.2448295202062754 samples/sec                   batch loss = 220.65806317329407 | accuracy = 0.565625


Epoch[1] Batch[85] Speed: 1.2432101901463912 samples/sec                   batch loss = 234.93653082847595 | accuracy = 0.5647058823529412


Epoch[1] Batch[90] Speed: 1.2513342197054147 samples/sec                   batch loss = 247.82828330993652 | accuracy = 0.575


Epoch[1] Batch[95] Speed: 1.247470608752912 samples/sec                   batch loss = 261.53685665130615 | accuracy = 0.5710526315789474


Epoch[1] Batch[100] Speed: 1.2461065520652592 samples/sec                   batch loss = 276.7088372707367 | accuracy = 0.5625


Epoch[1] Batch[105] Speed: 1.2492198340227645 samples/sec                   batch loss = 290.05366563796997 | accuracy = 0.5595238095238095


Epoch[1] Batch[110] Speed: 1.2509448698835004 samples/sec                   batch loss = 303.7250599861145 | accuracy = 0.5613636363636364


Epoch[1] Batch[115] Speed: 1.2433684780319325 samples/sec                   batch loss = 317.8309347629547 | accuracy = 0.5565217391304348


Epoch[1] Batch[120] Speed: 1.244072047432074 samples/sec                   batch loss = 332.401300907135 | accuracy = 0.5479166666666667


Epoch[1] Batch[125] Speed: 1.2429362759806046 samples/sec                   batch loss = 347.17897152900696 | accuracy = 0.538


Epoch[1] Batch[130] Speed: 1.2451623020741034 samples/sec                   batch loss = 361.11727118492126 | accuracy = 0.5403846153846154


Epoch[1] Batch[135] Speed: 1.244663750000371 samples/sec                   batch loss = 375.58397698402405 | accuracy = 0.5351851851851852


Epoch[1] Batch[140] Speed: 1.243994745882291 samples/sec                   batch loss = 389.22415828704834 | accuracy = 0.5410714285714285


Epoch[1] Batch[145] Speed: 1.242169608572422 samples/sec                   batch loss = 402.9936263561249 | accuracy = 0.5396551724137931


Epoch[1] Batch[150] Speed: 1.244333541770301 samples/sec                   batch loss = 417.0474941730499 | accuracy = 0.5383333333333333


Epoch[1] Batch[155] Speed: 1.2519607854842774 samples/sec                   batch loss = 430.88763189315796 | accuracy = 0.5338709677419354


Epoch[1] Batch[160] Speed: 1.2472482201646893 samples/sec                   batch loss = 444.3121531009674 | accuracy = 0.534375


Epoch[1] Batch[165] Speed: 1.2454967413126106 samples/sec                   batch loss = 457.1275131702423 | accuracy = 0.5409090909090909


Epoch[1] Batch[170] Speed: 1.2437025995992657 samples/sec                   batch loss = 471.07177114486694 | accuracy = 0.5397058823529411


Epoch[1] Batch[175] Speed: 1.2463514954205326 samples/sec                   batch loss = 484.83756589889526 | accuracy = 0.5414285714285715


Epoch[1] Batch[180] Speed: 1.2446157356810572 samples/sec                   batch loss = 499.04159235954285 | accuracy = 0.5347222222222222


Epoch[1] Batch[185] Speed: 1.249451952164413 samples/sec                   batch loss = 512.5269730091095 | accuracy = 0.5364864864864864


Epoch[1] Batch[190] Speed: 1.2527197601845732 samples/sec                   batch loss = 526.4162559509277 | accuracy = 0.5355263157894737


Epoch[1] Batch[195] Speed: 1.2550047504351542 samples/sec                   batch loss = 540.4452166557312 | accuracy = 0.532051282051282


Epoch[1] Batch[200] Speed: 1.2518663402717987 samples/sec                   batch loss = 554.3826291561127 | accuracy = 0.53125


Epoch[1] Batch[205] Speed: 1.2480009331061663 samples/sec                   batch loss = 567.6621856689453 | accuracy = 0.5341463414634147


Epoch[1] Batch[210] Speed: 1.2495854940610633 samples/sec                   batch loss = 581.3380255699158 | accuracy = 0.5333333333333333


Epoch[1] Batch[215] Speed: 1.2467070474564441 samples/sec                   batch loss = 594.9711155891418 | accuracy = 0.5348837209302325


Epoch[1] Batch[220] Speed: 1.251256293025132 samples/sec                   batch loss = 608.3390824794769 | accuracy = 0.5375


Epoch[1] Batch[225] Speed: 1.2458779881063515 samples/sec                   batch loss = 621.583199262619 | accuracy = 0.5411111111111111


Epoch[1] Batch[230] Speed: 1.2506762077737281 samples/sec                   batch loss = 635.4045562744141 | accuracy = 0.5413043478260869


Epoch[1] Batch[235] Speed: 1.2509449631566145 samples/sec                   batch loss = 648.643406867981 | accuracy = 0.5446808510638298


Epoch[1] Batch[240] Speed: 1.2551780759565607 samples/sec                   batch loss = 662.5067524909973 | accuracy = 0.54375


Epoch[1] Batch[245] Speed: 1.2473796220184317 samples/sec                   batch loss = 676.2653074264526 | accuracy = 0.5418367346938775


Epoch[1] Batch[250] Speed: 1.2452344806384377 samples/sec                   batch loss = 689.8405101299286 | accuracy = 0.542


Epoch[1] Batch[255] Speed: 1.247668673679163 samples/sec                   batch loss = 703.602931022644 | accuracy = 0.5441176470588235


Epoch[1] Batch[260] Speed: 1.2505550165565777 samples/sec                   batch loss = 717.4051804542542 | accuracy = 0.5451923076923076


Epoch[1] Batch[265] Speed: 1.2407383810676573 samples/sec                   batch loss = 730.5049879550934 | accuracy = 0.55


Epoch[1] Batch[270] Speed: 1.250360973082371 samples/sec                   batch loss = 744.5086522102356 | accuracy = 0.549074074074074


Epoch[1] Batch[275] Speed: 1.2480155083103879 samples/sec                   batch loss = 757.7001695632935 | accuracy = 0.5481818181818182


Epoch[1] Batch[280] Speed: 1.2530991715475683 samples/sec                   batch loss = 771.4155752658844 | accuracy = 0.5482142857142858


Epoch[1] Batch[285] Speed: 1.2540495966978489 samples/sec                   batch loss = 785.3757297992706 | accuracy = 0.5456140350877193


Epoch[1] Batch[290] Speed: 1.2515063460707294 samples/sec                   batch loss = 799.3309586048126 | accuracy = 0.5456896551724137


Epoch[1] Batch[295] Speed: 1.2491385432940394 samples/sec                   batch loss = 813.2658569812775 | accuracy = 0.5449152542372881


Epoch[1] Batch[300] Speed: 1.2531927731812282 samples/sec                   batch loss = 826.4668824672699 | accuracy = 0.5475


Epoch[1] Batch[305] Speed: 1.2491309170179592 samples/sec                   batch loss = 839.636146068573 | accuracy = 0.5491803278688525


Epoch[1] Batch[310] Speed: 1.2456281442851966 samples/sec                   batch loss = 852.7693421840668 | accuracy = 0.5491935483870968


Epoch[1] Batch[315] Speed: 1.2506986773418616 samples/sec                   batch loss = 866.5193409919739 | accuracy = 0.5484126984126985


Epoch[1] Batch[320] Speed: 1.2480542224484996 samples/sec                   batch loss = 880.1979048252106 | accuracy = 0.546875


Epoch[1] Batch[325] Speed: 1.2470432440516275 samples/sec                   batch loss = 893.2161636352539 | accuracy = 0.5492307692307692


Epoch[1] Batch[330] Speed: 1.2482776415869536 samples/sec                   batch loss = 906.5730743408203 | accuracy = 0.5507575757575758


Epoch[1] Batch[335] Speed: 1.2465318856403262 samples/sec                   batch loss = 919.6024172306061 | accuracy = 0.5537313432835821


Epoch[1] Batch[340] Speed: 1.2491749088911488 samples/sec                   batch loss = 932.7246377468109 | accuracy = 0.5566176470588236


Epoch[1] Batch[345] Speed: 1.2469170100716778 samples/sec                   batch loss = 946.6802940368652 | accuracy = 0.5565217391304348


Epoch[1] Batch[350] Speed: 1.2498094618230475 samples/sec                   batch loss = 959.9793260097504 | accuracy = 0.5585714285714286


Epoch[1] Batch[355] Speed: 1.2504989968633204 samples/sec                   batch loss = 972.8556988239288 | accuracy = 0.5612676056338028


Epoch[1] Batch[360] Speed: 1.2462510440384402 samples/sec                   batch loss = 986.5277695655823 | accuracy = 0.5604166666666667


Epoch[1] Batch[365] Speed: 1.2520656167371786 samples/sec                   batch loss = 999.7450852394104 | accuracy = 0.5623287671232877


Epoch[1] Batch[370] Speed: 1.245556752319956 samples/sec                   batch loss = 1013.5613305568695 | accuracy = 0.5601351351351351


Epoch[1] Batch[375] Speed: 1.2374906223139766 samples/sec                   batch loss = 1026.4532527923584 | accuracy = 0.56


Epoch[1] Batch[380] Speed: 1.2443431399838714 samples/sec                   batch loss = 1041.0519433021545 | accuracy = 0.5572368421052631


Epoch[1] Batch[385] Speed: 1.2429391305494533 samples/sec                   batch loss = 1053.9969141483307 | accuracy = 0.5577922077922078


Epoch[1] Batch[390] Speed: 1.244921613318093 samples/sec                   batch loss = 1067.4332721233368 | accuracy = 0.5576923076923077


Epoch[1] Batch[395] Speed: 1.2396608942674567 samples/sec                   batch loss = 1080.9871094226837 | accuracy = 0.5594936708860759


Epoch[1] Batch[400] Speed: 1.2469912459026604 samples/sec                   batch loss = 1095.2730832099915 | accuracy = 0.5575


Epoch[1] Batch[405] Speed: 1.2463211268434295 samples/sec                   batch loss = 1109.418069601059 | accuracy = 0.558641975308642


Epoch[1] Batch[410] Speed: 1.2459331318963316 samples/sec                   batch loss = 1123.0530211925507 | accuracy = 0.5579268292682927


Epoch[1] Batch[415] Speed: 1.2472863303430666 samples/sec                   batch loss = 1136.1391637325287 | accuracy = 0.5602409638554217


Epoch[1] Batch[420] Speed: 1.2487499077982338 samples/sec                   batch loss = 1148.675461769104 | accuracy = 0.5642857142857143


Epoch[1] Batch[425] Speed: 1.2440427123000108 samples/sec                   batch loss = 1161.3663024902344 | accuracy = 0.5652941176470588


Epoch[1] Batch[430] Speed: 1.2447461216533922 samples/sec                   batch loss = 1174.6422613859177 | accuracy = 0.5656976744186046


Epoch[1] Batch[435] Speed: 1.2366513472375273 samples/sec                   batch loss = 1187.811560511589 | accuracy = 0.5666666666666667


Epoch[1] Batch[440] Speed: 1.2442622981214677 samples/sec                   batch loss = 1201.1597174406052 | accuracy = 0.5676136363636364


Epoch[1] Batch[445] Speed: 1.2436022060034593 samples/sec                   batch loss = 1214.5227929353714 | accuracy = 0.5679775280898877


Epoch[1] Batch[450] Speed: 1.2445957000317431 samples/sec                   batch loss = 1228.4680479764938 | accuracy = 0.5683333333333334


Epoch[1] Batch[455] Speed: 1.240502151053463 samples/sec                   batch loss = 1242.1183129549026 | accuracy = 0.5692307692307692


Epoch[1] Batch[460] Speed: 1.2401318867753541 samples/sec                   batch loss = 1255.047061085701 | accuracy = 0.5717391304347826


Epoch[1] Batch[465] Speed: 1.2411714430228116 samples/sec                   batch loss = 1268.379175543785 | accuracy = 0.5731182795698925


Epoch[1] Batch[470] Speed: 1.241175207709406 samples/sec                   batch loss = 1282.0578244924545 | accuracy = 0.5723404255319149


Epoch[1] Batch[475] Speed: 1.2436324422219083 samples/sec                   batch loss = 1294.3834954500198 | accuracy = 0.5747368421052632


Epoch[1] Batch[480] Speed: 1.2430640077476929 samples/sec                   batch loss = 1307.8986073732376 | accuracy = 0.5755208333333334


Epoch[1] Batch[485] Speed: 1.2397400399767973 samples/sec                   batch loss = 1322.1994568109512 | accuracy = 0.5737113402061855


Epoch[1] Batch[490] Speed: 1.238984263101128 samples/sec                   batch loss = 1336.2555183172226 | accuracy = 0.5724489795918367


Epoch[1] Batch[495] Speed: 1.2425417350005215 samples/sec                   batch loss = 1349.421520113945 | accuracy = 0.5737373737373738


Epoch[1] Batch[500] Speed: 1.2391313178671766 samples/sec                   batch loss = 1362.4219762086868 | accuracy = 0.5755


Epoch[1] Batch[505] Speed: 1.2381849034900543 samples/sec                   batch loss = 1376.068484902382 | accuracy = 0.5762376237623762


Epoch[1] Batch[510] Speed: 1.2367741435631774 samples/sec                   batch loss = 1389.4205981492996 | accuracy = 0.5759803921568627


Epoch[1] Batch[515] Speed: 1.241492257885487 samples/sec                   batch loss = 1401.8076162338257 | accuracy = 0.5776699029126213


Epoch[1] Batch[520] Speed: 1.2396198597297712 samples/sec                   batch loss = 1414.4914531707764 | accuracy = 0.5793269230769231


Epoch[1] Batch[525] Speed: 1.2441785142176915 samples/sec                   batch loss = 1427.993782043457 | accuracy = 0.5795238095238096


Epoch[1] Batch[530] Speed: 1.2376650782875604 samples/sec                   batch loss = 1439.8431675434113 | accuracy = 0.5806603773584905


Epoch[1] Batch[535] Speed: 1.2409906731250313 samples/sec                   batch loss = 1453.5859076976776 | accuracy = 0.5803738317757009


Epoch[1] Batch[540] Speed: 1.2398718803234932 samples/sec                   batch loss = 1465.8324434757233 | accuracy = 0.5819444444444445


Epoch[1] Batch[545] Speed: 1.2388340415722157 samples/sec                   batch loss = 1479.155781030655 | accuracy = 0.5821100917431192


Epoch[1] Batch[550] Speed: 1.2389920404814407 samples/sec                   batch loss = 1493.166662454605 | accuracy = 0.5822727272727273


Epoch[1] Batch[555] Speed: 1.242306381671711 samples/sec                   batch loss = 1507.9458694458008 | accuracy = 0.581081081081081


Epoch[1] Batch[560] Speed: 1.2444474378116224 samples/sec                   batch loss = 1520.1695783138275 | accuracy = 0.5830357142857143


Epoch[1] Batch[565] Speed: 1.2427101621228123 samples/sec                   batch loss = 1533.043375968933 | accuracy = 0.5831858407079646


Epoch[1] Batch[570] Speed: 1.242873202563055 samples/sec                   batch loss = 1545.7172679901123 | accuracy = 0.5842105263157895


Epoch[1] Batch[575] Speed: 1.2443445243538163 samples/sec                   batch loss = 1558.7513790130615 | accuracy = 0.5847826086956521


Epoch[1] Batch[580] Speed: 1.2444621147498425 samples/sec                   batch loss = 1571.8955459594727 | accuracy = 0.5849137931034483


Epoch[1] Batch[585] Speed: 1.246753555583596 samples/sec                   batch loss = 1584.244690656662 | accuracy = 0.5854700854700855


Epoch[1] Batch[590] Speed: 1.2400340851616092 samples/sec                   batch loss = 1597.7729798555374 | accuracy = 0.5860169491525423


Epoch[1] Batch[595] Speed: 1.2387159575305384 samples/sec                   batch loss = 1610.3874822854996 | accuracy = 0.5861344537815126


Epoch[1] Batch[600] Speed: 1.2456902027582137 samples/sec                   batch loss = 1624.7964931726456 | accuracy = 0.5854166666666667


Epoch[1] Batch[605] Speed: 1.2403872336024169 samples/sec                   batch loss = 1637.5689934492111 | accuracy = 0.5859504132231405


Epoch[1] Batch[610] Speed: 1.245372484206643 samples/sec                   batch loss = 1650.672087073326 | accuracy = 0.5856557377049181


Epoch[1] Batch[615] Speed: 1.2447485227843844 samples/sec                   batch loss = 1664.7898354530334 | accuracy = 0.5853658536585366


Epoch[1] Batch[620] Speed: 1.245162948963894 samples/sec                   batch loss = 1677.3519926071167 | accuracy = 0.5854838709677419


Epoch[1] Batch[625] Speed: 1.2467277070128775 samples/sec                   batch loss = 1688.8707840442657 | accuracy = 0.5868


Epoch[1] Batch[630] Speed: 1.2439882891469962 samples/sec                   batch loss = 1701.6007931232452 | accuracy = 0.5873015873015873


Epoch[1] Batch[635] Speed: 1.2454203720297492 samples/sec                   batch loss = 1715.1436471939087 | accuracy = 0.5874015748031496


Epoch[1] Batch[640] Speed: 1.2397000078399283 samples/sec                   batch loss = 1729.7475397586823 | accuracy = 0.58671875


Epoch[1] Batch[645] Speed: 1.2484608195072002 samples/sec                   batch loss = 1742.2826523780823 | accuracy = 0.5872093023255814


Epoch[1] Batch[650] Speed: 1.2495397050522477 samples/sec                   batch loss = 1754.0923020839691 | accuracy = 0.5873076923076923


Epoch[1] Batch[655] Speed: 1.2424658196699276 samples/sec                   batch loss = 1767.7983829975128 | accuracy = 0.5862595419847328


Epoch[1] Batch[660] Speed: 1.2467031564955318 samples/sec                   batch loss = 1780.514190196991 | accuracy = 0.5867424242424243


Epoch[1] Batch[665] Speed: 1.2459520077377695 samples/sec                   batch loss = 1794.202793121338 | accuracy = 0.5868421052631579


Epoch[1] Batch[670] Speed: 1.24528485347995 samples/sec                   batch loss = 1805.3802783489227 | accuracy = 0.5884328358208956


Epoch[1] Batch[675] Speed: 1.2414330971851197 samples/sec                   batch loss = 1817.8424781560898 | accuracy = 0.5896296296296296


Epoch[1] Batch[680] Speed: 1.2473980779892948 samples/sec                   batch loss = 1830.7436311244965 | accuracy = 0.5897058823529412


Epoch[1] Batch[685] Speed: 1.24666072806962 samples/sec                   batch loss = 1842.8487403392792 | accuracy = 0.5912408759124088


Epoch[1] Batch[690] Speed: 1.247348461417675 samples/sec                   batch loss = 1855.3472316265106 | accuracy = 0.5927536231884057


Epoch[1] Batch[695] Speed: 1.246752721742004 samples/sec                   batch loss = 1868.2564233541489 | accuracy = 0.5938848920863309


Epoch[1] Batch[700] Speed: 1.242877069654077 samples/sec                   batch loss = 1881.5185941457748 | accuracy = 0.5939285714285715


Epoch[1] Batch[705] Speed: 1.2518606422528755 samples/sec                   batch loss = 1893.301881313324 | accuracy = 0.5957446808510638


Epoch[1] Batch[710] Speed: 1.2467017668725204 samples/sec                   batch loss = 1905.5141973495483 | accuracy = 0.5964788732394366


Epoch[1] Batch[715] Speed: 1.2457610550124474 samples/sec                   batch loss = 1918.73956990242 | accuracy = 0.5961538461538461


Epoch[1] Batch[720] Speed: 1.2425362135727616 samples/sec                   batch loss = 1931.8613106012344 | accuracy = 0.5958333333333333


Epoch[1] Batch[725] Speed: 1.2441190972306935 samples/sec                   batch loss = 1944.785924077034 | accuracy = 0.5972413793103448


Epoch[1] Batch[730] Speed: 1.247961386814848 samples/sec                   batch loss = 1956.2828580141068 | accuracy = 0.5989726027397261


Epoch[1] Batch[735] Speed: 1.2476682097534018 samples/sec                   batch loss = 1968.9793276786804 | accuracy = 0.5993197278911565


Epoch[1] Batch[740] Speed: 1.2440857007326123 samples/sec                   batch loss = 1981.3181564807892 | accuracy = 0.5993243243243244


Epoch[1] Batch[745] Speed: 1.241525239667823 samples/sec                   batch loss = 1993.2688893079758 | accuracy = 0.6


Epoch[1] Batch[750] Speed: 1.2442395055678204 samples/sec                   batch loss = 2006.8035000562668 | accuracy = 0.5993333333333334


Epoch[1] Batch[755] Speed: 1.2486076235753614 samples/sec                   batch loss = 2020.9768010377884 | accuracy = 0.6


Epoch[1] Batch[760] Speed: 1.2451030683005604 samples/sec                   batch loss = 2033.2896777391434 | accuracy = 0.6009868421052632


Epoch[1] Batch[765] Speed: 1.2466798112400816 samples/sec                   batch loss = 2046.4530202150345 | accuracy = 0.6016339869281045


Epoch[1] Batch[770] Speed: 1.248523903860804 samples/sec                   batch loss = 2061.5704661607742 | accuracy = 0.6009740259740259


Epoch[1] Batch[775] Speed: 1.2476415809929602 samples/sec                   batch loss = 2074.676290512085 | accuracy = 0.6009677419354839


Epoch[1] Batch[780] Speed: 1.242767143162013 samples/sec                   batch loss = 2087.8385438919067 | accuracy = 0.6006410256410256


Epoch[1] Batch[785] Speed: 1.2433351218196316 samples/sec                   batch loss = 2099.484544277191 | accuracy = 0.6009554140127389


[Epoch 1] training: accuracy=0.6015228426395939
[Epoch 1] time cost: 650.8629906177521
[Epoch 1] validation: validation accuracy=0.6677777777777778


Epoch[2] Batch[5] Speed: 1.2493850523581438 samples/sec                   batch loss = 12.92100214958191 | accuracy = 0.6


Epoch[2] Batch[10] Speed: 1.2494047772864307 samples/sec                   batch loss = 25.21601414680481 | accuracy = 0.65


Epoch[2] Batch[15] Speed: 1.2493864479693877 samples/sec                   batch loss = 36.32449555397034 | accuracy = 0.7


Epoch[2] Batch[20] Speed: 1.2533056753540093 samples/sec                   batch loss = 49.047390937805176 | accuracy = 0.675


Epoch[2] Batch[25] Speed: 1.24765856017579 samples/sec                   batch loss = 61.50941205024719 | accuracy = 0.68


Epoch[2] Batch[30] Speed: 1.2496699147089105 samples/sec                   batch loss = 75.65195274353027 | accuracy = 0.6666666666666666


Epoch[2] Batch[35] Speed: 1.24500734512967 samples/sec                   batch loss = 89.44622194766998 | accuracy = 0.65


Epoch[2] Batch[40] Speed: 1.2483966268053506 samples/sec                   batch loss = 103.42567718029022 | accuracy = 0.64375


Epoch[2] Batch[45] Speed: 1.2538608394563857 samples/sec                   batch loss = 114.49702596664429 | accuracy = 0.6555555555555556


Epoch[2] Batch[50] Speed: 1.2478826730184864 samples/sec                   batch loss = 128.23220205307007 | accuracy = 0.65


Epoch[2] Batch[55] Speed: 1.2488748398171339 samples/sec                   batch loss = 140.41411530971527 | accuracy = 0.6545454545454545


Epoch[2] Batch[60] Speed: 1.2521885962666646 samples/sec                   batch loss = 151.5510606765747 | accuracy = 0.6625


Epoch[2] Batch[65] Speed: 1.2468563119346552 samples/sec                   batch loss = 161.74524593353271 | accuracy = 0.6653846153846154


Epoch[2] Batch[70] Speed: 1.2465813446454719 samples/sec                   batch loss = 175.50211763381958 | accuracy = 0.6642857142857143


Epoch[2] Batch[75] Speed: 1.2444920235445336 samples/sec                   batch loss = 189.79043841362 | accuracy = 0.65


Epoch[2] Batch[80] Speed: 1.244809108231928 samples/sec                   batch loss = 201.97483158111572 | accuracy = 0.65625


Epoch[2] Batch[85] Speed: 1.2488347733293952 samples/sec                   batch loss = 214.46970558166504 | accuracy = 0.6588235294117647


Epoch[2] Batch[90] Speed: 1.244935747168136 samples/sec                   batch loss = 225.59611058235168 | accuracy = 0.6666666666666666


Epoch[2] Batch[95] Speed: 1.2468523273781673 samples/sec                   batch loss = 239.40997552871704 | accuracy = 0.6631578947368421


Epoch[2] Batch[100] Speed: 1.2385174331800137 samples/sec                   batch loss = 251.67988169193268 | accuracy = 0.6575


Epoch[2] Batch[105] Speed: 1.239892863757209 samples/sec                   batch loss = 263.97595036029816 | accuracy = 0.6619047619047619


Epoch[2] Batch[110] Speed: 1.2370250994816536 samples/sec                   batch loss = 274.8288735151291 | accuracy = 0.6636363636363637


Epoch[2] Batch[115] Speed: 1.2391938290020212 samples/sec                   batch loss = 286.1340311765671 | accuracy = 0.6717391304347826


Epoch[2] Batch[120] Speed: 1.2413268241894098 samples/sec                   batch loss = 299.84784162044525 | accuracy = 0.6666666666666666


Epoch[2] Batch[125] Speed: 1.241024270858622 samples/sec                   batch loss = 311.5459009408951 | accuracy = 0.668


Epoch[2] Batch[130] Speed: 1.2391444053280234 samples/sec                   batch loss = 325.057688832283 | accuracy = 0.6711538461538461


Epoch[2] Batch[135] Speed: 1.2389972559558726 samples/sec                   batch loss = 336.6802942752838 | accuracy = 0.6722222222222223


Epoch[2] Batch[140] Speed: 1.2431527078995532 samples/sec                   batch loss = 350.63120460510254 | accuracy = 0.6660714285714285


Epoch[2] Batch[145] Speed: 1.2396493530291477 samples/sec                   batch loss = 363.25526666641235 | accuracy = 0.6655172413793103


Epoch[2] Batch[150] Speed: 1.2431039811194668 samples/sec                   batch loss = 375.0928478240967 | accuracy = 0.6633333333333333


Epoch[2] Batch[155] Speed: 1.2446485142862695 samples/sec                   batch loss = 385.94411063194275 | accuracy = 0.6645161290322581


Epoch[2] Batch[160] Speed: 1.2458438495151096 samples/sec                   batch loss = 396.4057364463806 | accuracy = 0.66875


Epoch[2] Batch[165] Speed: 1.2508811675953404 samples/sec                   batch loss = 410.37325406074524 | accuracy = 0.6636363636363637


Epoch[2] Batch[170] Speed: 1.244228063467051 samples/sec                   batch loss = 422.1276675462723 | accuracy = 0.6661764705882353


Epoch[2] Batch[175] Speed: 1.2471024771728625 samples/sec                   batch loss = 436.3793879747391 | accuracy = 0.6614285714285715


Epoch[2] Batch[180] Speed: 1.2452329094410328 samples/sec                   batch loss = 447.487392783165 | accuracy = 0.6638888888888889


Epoch[2] Batch[185] Speed: 1.250247016924581 samples/sec                   batch loss = 458.58429968357086 | accuracy = 0.668918918918919


Epoch[2] Batch[190] Speed: 1.2483704313909965 samples/sec                   batch loss = 469.8669763803482 | accuracy = 0.6657894736842105


Epoch[2] Batch[195] Speed: 1.2441986287071203 samples/sec                   batch loss = 479.9579300880432 | accuracy = 0.6679487179487179


Epoch[2] Batch[200] Speed: 1.2517543510158426 samples/sec                   batch loss = 490.08276081085205 | accuracy = 0.67


Epoch[2] Batch[205] Speed: 1.2500314608683796 samples/sec                   batch loss = 501.79782032966614 | accuracy = 0.6682926829268293


Epoch[2] Batch[210] Speed: 1.2486797374568732 samples/sec                   batch loss = 514.1210925579071 | accuracy = 0.6666666666666666


Epoch[2] Batch[215] Speed: 1.249807692852074 samples/sec                   batch loss = 527.5733106136322 | accuracy = 0.6627906976744186


Epoch[2] Batch[220] Speed: 1.2468650224491526 samples/sec                   batch loss = 539.76890873909 | accuracy = 0.6613636363636364


Epoch[2] Batch[225] Speed: 1.2497641218845876 samples/sec                   batch loss = 549.7216262817383 | accuracy = 0.6633333333333333


Epoch[2] Batch[230] Speed: 1.2462835385201545 samples/sec                   batch loss = 561.5223951339722 | accuracy = 0.6652173913043479


Epoch[2] Batch[235] Speed: 1.2526657910986239 samples/sec                   batch loss = 573.7140318155289 | accuracy = 0.6670212765957447


Epoch[2] Batch[240] Speed: 1.245780388162855 samples/sec                   batch loss = 584.7850898504257 | accuracy = 0.6697916666666667


Epoch[2] Batch[245] Speed: 1.2472889267416583 samples/sec                   batch loss = 596.8097981214523 | accuracy = 0.6693877551020408


Epoch[2] Batch[250] Speed: 1.2461938356501183 samples/sec                   batch loss = 607.3106098175049 | accuracy = 0.672


Epoch[2] Batch[255] Speed: 1.2459498795505697 samples/sec                   batch loss = 622.7596175670624 | accuracy = 0.6676470588235294


Epoch[2] Batch[260] Speed: 1.2441177133622914 samples/sec                   batch loss = 635.6210739612579 | accuracy = 0.6653846153846154


Epoch[2] Batch[265] Speed: 1.246215589030027 samples/sec                   batch loss = 648.0352803468704 | accuracy = 0.6669811320754717


Epoch[2] Batch[270] Speed: 1.244140778237598 samples/sec                   batch loss = 661.4735810756683 | accuracy = 0.6648148148148149


Epoch[2] Batch[275] Speed: 1.2477495876090847 samples/sec                   batch loss = 671.6494810581207 | accuracy = 0.6672727272727272


Epoch[2] Batch[280] Speed: 1.2451237671377646 samples/sec                   batch loss = 683.4410794973373 | accuracy = 0.6660714285714285


Epoch[2] Batch[285] Speed: 1.241860669275599 samples/sec                   batch loss = 696.1339854001999 | accuracy = 0.6666666666666666


Epoch[2] Batch[290] Speed: 1.240434096807474 samples/sec                   batch loss = 707.6651338338852 | accuracy = 0.6663793103448276


Epoch[2] Batch[295] Speed: 1.246302424980461 samples/sec                   batch loss = 720.7990757226944 | accuracy = 0.6661016949152543


Epoch[2] Batch[300] Speed: 1.2432920931064613 samples/sec                   batch loss = 734.6656920909882 | accuracy = 0.6641666666666667


Epoch[2] Batch[305] Speed: 1.240098795655386 samples/sec                   batch loss = 745.2733575105667 | accuracy = 0.6655737704918033


Epoch[2] Batch[310] Speed: 1.243109968136131 samples/sec                   batch loss = 757.4176822900772 | accuracy = 0.6645161290322581


Epoch[2] Batch[315] Speed: 1.237173879058353 samples/sec                   batch loss = 769.6192249059677 | accuracy = 0.6666666666666666


Epoch[2] Batch[320] Speed: 1.2415056708780279 samples/sec                   batch loss = 780.0383801460266 | accuracy = 0.6671875


Epoch[2] Batch[325] Speed: 1.2445943151028616 samples/sec                   batch loss = 796.539835691452 | accuracy = 0.6630769230769231


Epoch[2] Batch[330] Speed: 1.2424065661493187 samples/sec                   batch loss = 809.7307002544403 | accuracy = 0.6643939393939394


Epoch[2] Batch[335] Speed: 1.2401831310802065 samples/sec                   batch loss = 823.7711827754974 | accuracy = 0.664179104477612


Epoch[2] Batch[340] Speed: 1.2421616073128816 samples/sec                   batch loss = 834.8364317417145 | accuracy = 0.6639705882352941


Epoch[2] Batch[345] Speed: 1.2422424523654614 samples/sec                   batch loss = 845.6120233535767 | accuracy = 0.6652173913043479


Epoch[2] Batch[350] Speed: 1.2422590089644823 samples/sec                   batch loss = 857.2458478212357 | accuracy = 0.6657142857142857


Epoch[2] Batch[355] Speed: 1.2503513749924449 samples/sec                   batch loss = 867.9379855394363 | accuracy = 0.6676056338028169


Epoch[2] Batch[360] Speed: 1.248126550978324 samples/sec                   batch loss = 878.0393122434616 | accuracy = 0.6708333333333333


Epoch[2] Batch[365] Speed: 1.2452843913252662 samples/sec                   batch loss = 888.6002013683319 | accuracy = 0.6712328767123288


Epoch[2] Batch[370] Speed: 1.2470081147021606 samples/sec                   batch loss = 898.2792766094208 | accuracy = 0.672972972972973


Epoch[2] Batch[375] Speed: 1.2456417392786685 samples/sec                   batch loss = 908.3624373674393 | accuracy = 0.674


Epoch[2] Batch[380] Speed: 1.249495315253507 samples/sec                   batch loss = 922.6995695829391 | accuracy = 0.6710526315789473


Epoch[2] Batch[385] Speed: 1.246958065749676 samples/sec                   batch loss = 935.9989695549011 | accuracy = 0.6714285714285714


Epoch[2] Batch[390] Speed: 1.242250362685492 samples/sec                   batch loss = 948.6676865816116 | accuracy = 0.6724358974358975


Epoch[2] Batch[395] Speed: 1.241272730232152 samples/sec                   batch loss = 959.1758248806 | accuracy = 0.6727848101265823


Epoch[2] Batch[400] Speed: 1.2438953195899387 samples/sec                   batch loss = 969.0021271705627 | accuracy = 0.673125


Epoch[2] Batch[405] Speed: 1.2513673531508327 samples/sec                   batch loss = 981.1809517145157 | accuracy = 0.6734567901234568


Epoch[2] Batch[410] Speed: 1.2443739660173165 samples/sec                   batch loss = 992.0265336036682 | accuracy = 0.6743902439024391


Epoch[2] Batch[415] Speed: 1.2469802165497423 samples/sec                   batch loss = 1006.3173878192902 | accuracy = 0.6734939759036145


Epoch[2] Batch[420] Speed: 1.251381540401978 samples/sec                   batch loss = 1020.3577061891556 | accuracy = 0.6726190476190477


Epoch[2] Batch[425] Speed: 1.2492534137158025 samples/sec                   batch loss = 1031.278721690178 | accuracy = 0.6735294117647059


Epoch[2] Batch[430] Speed: 1.2449576414125891 samples/sec                   batch loss = 1040.5624010562897 | accuracy = 0.6761627906976744


Epoch[2] Batch[435] Speed: 1.244326158629851 samples/sec                   batch loss = 1053.5841785669327 | accuracy = 0.6764367816091954


Epoch[2] Batch[440] Speed: 1.239729046912836 samples/sec                   batch loss = 1066.0356570482254 | accuracy = 0.6761363636363636


Epoch[2] Batch[445] Speed: 1.2448201915738548 samples/sec                   batch loss = 1077.4157147407532 | accuracy = 0.6764044943820224


Epoch[2] Batch[450] Speed: 1.2450863434035577 samples/sec                   batch loss = 1087.429073214531 | accuracy = 0.6772222222222222


Epoch[2] Batch[455] Speed: 1.2519421008298515 samples/sec                   batch loss = 1101.1605209112167 | accuracy = 0.6758241758241759


Epoch[2] Batch[460] Speed: 1.254589189584105 samples/sec                   batch loss = 1113.2064534425735 | accuracy = 0.6760869565217391


Epoch[2] Batch[465] Speed: 1.2470003290456806 samples/sec                   batch loss = 1125.7604767084122 | accuracy = 0.6752688172043011


Epoch[2] Batch[470] Speed: 1.2468963442409284 samples/sec                   batch loss = 1137.2906956672668 | accuracy = 0.676595744680851


Epoch[2] Batch[475] Speed: 1.2486061367788475 samples/sec                   batch loss = 1149.2855451107025 | accuracy = 0.6757894736842105


Epoch[2] Batch[480] Speed: 1.2422128355503008 samples/sec                   batch loss = 1162.4866344928741 | accuracy = 0.6755208333333333


Epoch[2] Batch[485] Speed: 1.2407604949666167 samples/sec                   batch loss = 1174.2703399658203 | accuracy = 0.6762886597938145


Epoch[2] Batch[490] Speed: 1.2386090518956263 samples/sec                   batch loss = 1184.8983486890793 | accuracy = 0.6775510204081633


Epoch[2] Batch[495] Speed: 1.2452386397095307 samples/sec                   batch loss = 1195.2240039110184 | accuracy = 0.6782828282828283


Epoch[2] Batch[500] Speed: 1.2488095820282112 samples/sec                   batch loss = 1209.6209143400192 | accuracy = 0.677


Epoch[2] Batch[505] Speed: 1.247233848317739 samples/sec                   batch loss = 1220.8398963212967 | accuracy = 0.6772277227722773


Epoch[2] Batch[510] Speed: 1.2478266140578573 samples/sec                   batch loss = 1231.3163820505142 | accuracy = 0.6784313725490196


Epoch[2] Batch[515] Speed: 1.2443843955300942 samples/sec                   batch loss = 1245.9312263727188 | accuracy = 0.6786407766990291


Epoch[2] Batch[520] Speed: 1.2443828264771766 samples/sec                   batch loss = 1258.8685938119888 | accuracy = 0.6778846153846154


Epoch[2] Batch[525] Speed: 1.24350542311697 samples/sec                   batch loss = 1269.2760655879974 | accuracy = 0.6785714285714286


Epoch[2] Batch[530] Speed: 1.2417263837813985 samples/sec                   batch loss = 1280.9677780866623 | accuracy = 0.6787735849056604


Epoch[2] Batch[535] Speed: 1.240382373237836 samples/sec                   batch loss = 1291.3046513795853 | accuracy = 0.6789719626168225


Epoch[2] Batch[540] Speed: 1.242213663330298 samples/sec                   batch loss = 1301.6836683750153 | accuracy = 0.6800925925925926


Epoch[2] Batch[545] Speed: 1.242356517889163 samples/sec                   batch loss = 1315.5973510742188 | accuracy = 0.6798165137614679


Epoch[2] Batch[550] Speed: 1.2412433433014265 samples/sec                   batch loss = 1326.4947445392609 | accuracy = 0.6804545454545454


Epoch[2] Batch[555] Speed: 1.24995770076009 samples/sec                   batch loss = 1340.077332496643 | accuracy = 0.6806306306306307


Epoch[2] Batch[560] Speed: 1.2412349866471113 samples/sec                   batch loss = 1351.2840259075165 | accuracy = 0.6803571428571429


Epoch[2] Batch[565] Speed: 1.2403257940603671 samples/sec                   batch loss = 1363.951330780983 | accuracy = 0.6792035398230089


Epoch[2] Batch[570] Speed: 1.2411478453856113 samples/sec                   batch loss = 1375.1550002098083 | accuracy = 0.6793859649122806


Epoch[2] Batch[575] Speed: 1.2451318990121008 samples/sec                   batch loss = 1387.7286854982376 | accuracy = 0.68


Epoch[2] Batch[580] Speed: 1.2476617148289717 samples/sec                   batch loss = 1399.1839318275452 | accuracy = 0.6793103448275862


Epoch[2] Batch[585] Speed: 1.2416436761652627 samples/sec                   batch loss = 1409.500412106514 | accuracy = 0.6807692307692308


Epoch[2] Batch[590] Speed: 1.2462574317087278 samples/sec                   batch loss = 1420.0099000930786 | accuracy = 0.6817796610169492


Epoch[2] Batch[595] Speed: 1.2516274416287105 samples/sec                   batch loss = 1430.5507568120956 | accuracy = 0.6819327731092437


Epoch[2] Batch[600] Speed: 1.2538065846081734 samples/sec                   batch loss = 1443.2953052520752 | accuracy = 0.68125


Epoch[2] Batch[605] Speed: 1.2441442841735197 samples/sec                   batch loss = 1457.1000072956085 | accuracy = 0.6814049586776859


Epoch[2] Batch[610] Speed: 1.2478657805498186 samples/sec                   batch loss = 1472.0457047224045 | accuracy = 0.680327868852459


Epoch[2] Batch[615] Speed: 1.244899258471951 samples/sec                   batch loss = 1486.2378739118576 | accuracy = 0.6796747967479675


Epoch[2] Batch[620] Speed: 1.247967420711351 samples/sec                   batch loss = 1497.9845885038376 | accuracy = 0.6786290322580645


Epoch[2] Batch[625] Speed: 1.2435451483648041 samples/sec                   batch loss = 1509.386741399765 | accuracy = 0.6784


Epoch[2] Batch[630] Speed: 1.2392983639064115 samples/sec                   batch loss = 1523.5211617946625 | accuracy = 0.6773809523809524


Epoch[2] Batch[635] Speed: 1.2462167924306582 samples/sec                   batch loss = 1538.2868356704712 | accuracy = 0.6751968503937008


Epoch[2] Batch[640] Speed: 1.2428349933143938 samples/sec                   batch loss = 1549.5390754938126 | accuracy = 0.67578125


Epoch[2] Batch[645] Speed: 1.2432838009772587 samples/sec                   batch loss = 1563.4069695472717 | accuracy = 0.6748062015503876


Epoch[2] Batch[650] Speed: 1.2431453387651548 samples/sec                   batch loss = 1573.3372867107391 | accuracy = 0.676923076923077


Epoch[2] Batch[655] Speed: 1.2406000265168649 samples/sec                   batch loss = 1584.2229663729668 | accuracy = 0.6767175572519084


Epoch[2] Batch[660] Speed: 1.24032451031396 samples/sec                   batch loss = 1598.8463669419289 | accuracy = 0.6761363636363636


Epoch[2] Batch[665] Speed: 1.2437012166572718 samples/sec                   batch loss = 1611.374073445797 | accuracy = 0.675187969924812


Epoch[2] Batch[670] Speed: 1.251963961930995 samples/sec                   batch loss = 1623.5934893488884 | accuracy = 0.6753731343283582


Epoch[2] Batch[675] Speed: 1.2433764948469501 samples/sec                   batch loss = 1633.5226592421532 | accuracy = 0.6759259259259259


Epoch[2] Batch[680] Speed: 1.2458065674999486 samples/sec                   batch loss = 1644.8325925469398 | accuracy = 0.6761029411764706


Epoch[2] Batch[685] Speed: 1.2458189637536472 samples/sec                   batch loss = 1659.6598972678185 | accuracy = 0.6759124087591241


Epoch[2] Batch[690] Speed: 1.2401347284774125 samples/sec                   batch loss = 1670.943198621273 | accuracy = 0.6764492753623188


Epoch[2] Batch[695] Speed: 1.2458407965615028 samples/sec                   batch loss = 1680.2416977286339 | accuracy = 0.6776978417266187


Epoch[2] Batch[700] Speed: 1.2425926264701483 samples/sec                   batch loss = 1690.504624426365 | accuracy = 0.6789285714285714


Epoch[2] Batch[705] Speed: 1.2415754047352723 samples/sec                   batch loss = 1702.9913471341133 | accuracy = 0.6787234042553192


Epoch[2] Batch[710] Speed: 1.2431033363672657 samples/sec                   batch loss = 1711.4179690480232 | accuracy = 0.680281690140845


Epoch[2] Batch[715] Speed: 1.244733284994847 samples/sec                   batch loss = 1721.805930197239 | accuracy = 0.6814685314685315


Epoch[2] Batch[720] Speed: 1.247896410088425 samples/sec                   batch loss = 1730.3753586411476 | accuracy = 0.6826388888888889


Epoch[2] Batch[725] Speed: 1.2549001773682111 samples/sec                   batch loss = 1740.3700349926949 | accuracy = 0.6841379310344827


Epoch[2] Batch[730] Speed: 1.249534214319208 samples/sec                   batch loss = 1750.5867392420769 | accuracy = 0.6842465753424658


Epoch[2] Batch[735] Speed: 1.252071690386964 samples/sec                   batch loss = 1761.1066150069237 | accuracy = 0.6846938775510204


Epoch[2] Batch[740] Speed: 1.2504959210521411 samples/sec                   batch loss = 1774.5485922694206 | accuracy = 0.6837837837837838


Epoch[2] Batch[745] Speed: 1.247894924985202 samples/sec                   batch loss = 1784.4148178696632 | accuracy = 0.684228187919463


Epoch[2] Batch[750] Speed: 1.2389882890269825 samples/sec                   batch loss = 1793.8302993178368 | accuracy = 0.6853333333333333


Epoch[2] Batch[755] Speed: 1.2487989852531667 samples/sec                   batch loss = 1805.4656366705894 | accuracy = 0.6857615894039735


Epoch[2] Batch[760] Speed: 1.2528913323245312 samples/sec                   batch loss = 1818.3427218794823 | accuracy = 0.6855263157894737


Epoch[2] Batch[765] Speed: 1.2536783219563197 samples/sec                   batch loss = 1833.3198986649513 | accuracy = 0.6852941176470588


Epoch[2] Batch[770] Speed: 1.245133100320723 samples/sec                   batch loss = 1841.9872809052467 | accuracy = 0.686038961038961


Epoch[2] Batch[775] Speed: 1.243880471572689 samples/sec                   batch loss = 1853.8923206925392 | accuracy = 0.6854838709677419


Epoch[2] Batch[780] Speed: 1.2433555776270913 samples/sec                   batch loss = 1863.6045300364494 | accuracy = 0.6865384615384615


Epoch[2] Batch[785] Speed: 1.2440118104339342 samples/sec                   batch loss = 1871.9161626696587 | accuracy = 0.6878980891719745


[Epoch 2] training: accuracy=0.6881345177664975
[Epoch 2] time cost: 649.1745903491974
[Epoch 2] validation: validation accuracy=0.7
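
The per-epoch summary lines in the output above can be collected into a small table to compare epochs side by side. The following is a minimal sketch that is not part of the original tutorial: the `parse_epoch_summaries` helper and the regular expression are illustrative assumptions based only on the log format shown above.

```python
import re

# Summary lines copied verbatim from the training output above.
LOG = """\
[Epoch 1] training: accuracy=0.6015228426395939
[Epoch 1] time cost: 650.8629906177521
[Epoch 1] validation: validation accuracy=0.6677777777777778
[Epoch 2] training: accuracy=0.6881345177664975
[Epoch 2] time cost: 649.1745903491974
[Epoch 2] validation: validation accuracy=0.7
"""

# Hypothetical pattern for "[Epoch N] <label>: ...<number>" summary lines.
SUMMARY_RE = re.compile(
    r"^\[Epoch (\d+)\] (training|time cost|validation):.*?([0-9.]+)$"
)


def parse_epoch_summaries(text):
    """Return {epoch: {label: value}} for every summary line in `text`."""
    summaries = {}
    for line in text.splitlines():
        match = SUMMARY_RE.match(line.strip())
        if match is None:
            continue  # skip per-batch lines and blank lines
        epoch, label, value = match.groups()
        summaries.setdefault(int(epoch), {})[label] = float(value)
    return summaries


summaries = parse_epoch_summaries(LOG)
for epoch, stats in sorted(summaries.items()):
    print(
        f"epoch {epoch}: train={stats['training']:.3f} "
        f"val={stats['validation']:.3f} time={stats['time cost']:.1f}s"
    )
```

In this run, validation accuracy rises from about 0.668 after the first epoch to 0.7 after the second, while the time per epoch stays near 650 seconds.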


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).