<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `device` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the device, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), device=gpu)
x

[23:32:59] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], device=gpu(0))

If you're using a CPU, MXNet allocates data on the main memory and tries to use as many CPU cores as possible.  If there are multiple GPUs, MXNet will tell you which GPUs the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If only one GPU is available, the code falls back to `npx.cpu()`, as in the output shown here:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[23:32:59] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operators, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
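
The zero-based indexing described above can be sketched without MXNet. The helper below is hypothetical (`pick_gpu_ids` is not an MXNet function); it only computes which indices you would pass to `npx.gpu(i)`, assuming you already know how many GPUs are visible, for example from `npx.num_gpus()`.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return the zero-based GPU indices to use.

    Hypothetical helper for illustration; not part of MXNet.
    An empty list means no GPU is available and the caller
    should fall back to the CPU.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # Never request more devices than the machine exposes.
    count = min(num_available, num_requested)
    return list(range(count))


# On a machine with eight GPUs, the last valid index is 7.
print(pick_gpu_ids(8, 8))
# Requesting two GPUs when only one exists yields just index 0.
print(pick_gpu_ids(1, 2))
# With no GPUs the list is empty, signalling a CPU fallback.
print(pick_gpu_ids(0, 2))
```

Capping the request at the number of visible devices is one possible design choice; it avoids asking for an index that does not exist on the machine.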

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the input of an operation is already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` module support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), device=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], device=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_device(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', device=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), device=gpu)
net(x)

[23:32:59] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 7.234633 , -4.2729955]], device=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own chunk of the data.
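
To make the idea concrete, here is a minimal pure-Python sketch of data parallelism. It is not how MXNet implements it: `split_batch` and `grad_sum` are hypothetical helpers, and the toy model is an assumed one-parameter linear fit with a squared loss. The sketch only shows that splitting a batch into chunks and adding the per-chunk gradients gives the same result as processing the whole batch at once.

```python
def split_batch(samples, num_parts):
    """Split a list of samples into num_parts contiguous, near-equal chunks."""
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    base, extra = divmod(len(samples), num_parts)
    chunks, start = [], 0
    for i in range(num_parts):
        # The first `extra` chunks take one additional sample.
        size = base + (1 if i < extra else 0)
        chunks.append(samples[start:start + size])
        start += size
    return chunks


def grad_sum(samples, w):
    """Summed gradient of 0.5 * (w * x - y)**2 with respect to w."""
    return sum((w * x - y) * x for x, y in samples)


# Toy batch of (x, y) pairs and an assumed current weight.
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 4.0)]
w = 0.5

# Each chunk stands in for the slice of the batch that one GPU would process.
chunks = split_batch(batch, 2)
per_device = [grad_sum(chunk, w) for chunk in chunks]

# Adding the per-device gradients recovers the full-batch gradient.
full = grad_sum(batch, w)
print(chunks)
print(sum(per_device), full)
```

Because the loss is a sum over samples, its gradient is also a sum over samples, which is why the work can be divided across devices and combined afterwards.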

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
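
As a rough illustration of what the helper above computes, the following pure-Python sketch accumulates accuracy over per-device shards. It is an assumption about the overall behavior (count how often the highest-scoring class matches the label), not the `gluon.metric.Accuracy` implementation, and `argmax` and `accuracy_over_shards` are hypothetical names.

```python
def argmax(values):
    """Index of the largest value in a list of scores."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy_over_shards(label_shards, output_shards):
    """Fraction of correct predictions across all shards."""
    correct = total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            # A prediction is correct when the top-scoring class is the label.
            correct += int(argmax(scores) == label)
            total += 1
    return correct / total if total else 0.0


# Two shards, as if a batch of four had been split over two devices.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],  # both predictions match their labels
    [[1.5, 0.2], [0.8, 0.1]],   # first is wrong, second is right
]
print(accuracy_over_shards(label_shards, output_shards))
```

Keeping running counts of correct and total predictions means the result does not depend on how the batch was divided among devices.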

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, device=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.773188489285553 samples/sec                   batch loss = 14.366095066070557 | accuracy = 0.5


Epoch[1] Batch[10] Speed: 1.2556587741629062 samples/sec                   batch loss = 28.227754831314087 | accuracy = 0.475


Epoch[1] Batch[15] Speed: 1.2580100662004559 samples/sec                   batch loss = 42.7236111164093 | accuracy = 0.5


Epoch[1] Batch[20] Speed: 1.251808241592821 samples/sec                   batch loss = 56.65704107284546 | accuracy = 0.525


Epoch[1] Batch[25] Speed: 1.249639384205423 samples/sec                   batch loss = 71.61127424240112 | accuracy = 0.49


Epoch[1] Batch[30] Speed: 1.253672888471626 samples/sec                   batch loss = 85.88313436508179 | accuracy = 0.475


Epoch[1] Batch[35] Speed: 1.2530701578801764 samples/sec                   batch loss = 100.05028128623962 | accuracy = 0.4642857142857143


Epoch[1] Batch[40] Speed: 1.2448459611067255 samples/sec                   batch loss = 112.66678404808044 | accuracy = 0.48125


Epoch[1] Batch[45] Speed: 1.2505252816850754 samples/sec                   batch loss = 127.1012065410614 | accuracy = 0.4888888888888889


Epoch[1] Batch[50] Speed: 1.2540341303377425 samples/sec                   batch loss = 142.51887583732605 | accuracy = 0.485


Epoch[1] Batch[55] Speed: 1.2496486921276795 samples/sec                   batch loss = 157.4189932346344 | accuracy = 0.4590909090909091


Epoch[1] Batch[60] Speed: 1.250508597219773 samples/sec                   batch loss = 170.67400765419006 | accuracy = 0.475


Epoch[1] Batch[65] Speed: 1.2409158650348848 samples/sec                   batch loss = 185.24740862846375 | accuracy = 0.4653846153846154


Epoch[1] Batch[70] Speed: 1.248727322311151 samples/sec                   batch loss = 199.37777733802795 | accuracy = 0.46785714285714286


Epoch[1] Batch[75] Speed: 1.2523638557977983 samples/sec                   batch loss = 213.35941576957703 | accuracy = 0.4633333333333333


Epoch[1] Batch[80] Speed: 1.249670845540378 samples/sec                   batch loss = 227.4596848487854 | accuracy = 0.459375


Epoch[1] Batch[85] Speed: 1.2450856041919593 samples/sec                   batch loss = 241.97452640533447 | accuracy = 0.4588235294117647


Epoch[1] Batch[90] Speed: 1.248200930904706 samples/sec                   batch loss = 255.43078303337097 | accuracy = 0.46111111111111114


Epoch[1] Batch[95] Speed: 1.2429910675753317 samples/sec                   batch loss = 269.4919168949127 | accuracy = 0.4605263157894737


Epoch[1] Batch[100] Speed: 1.250108676165592 samples/sec                   batch loss = 283.53633069992065 | accuracy = 0.4625


Epoch[1] Batch[105] Speed: 1.2523544139009029 samples/sec                   batch loss = 297.5276575088501 | accuracy = 0.46190476190476193


Epoch[1] Batch[110] Speed: 1.2505167063735578 samples/sec                   batch loss = 311.2199800014496 | accuracy = 0.4681818181818182


Epoch[1] Batch[115] Speed: 1.2497667286091663 samples/sec                   batch loss = 325.0555772781372 | accuracy = 0.4717391304347826


Epoch[1] Batch[120] Speed: 1.253361853167677 samples/sec                   batch loss = 338.51838970184326 | accuracy = 0.48125


Epoch[1] Batch[125] Speed: 1.2490476850591044 samples/sec                   batch loss = 352.9447064399719 | accuracy = 0.482


Epoch[1] Batch[130] Speed: 1.2482185753840807 samples/sec                   batch loss = 366.75488233566284 | accuracy = 0.48653846153846153


Epoch[1] Batch[135] Speed: 1.2394227852574933 samples/sec                   batch loss = 380.87523126602173 | accuracy = 0.4888888888888889


Epoch[1] Batch[140] Speed: 1.251417663342326 samples/sec                   batch loss = 394.2146728038788 | accuracy = 0.4928571428571429


Epoch[1] Batch[145] Speed: 1.249078744626879 samples/sec                   batch loss = 408.13131737709045 | accuracy = 0.49310344827586206


Epoch[1] Batch[150] Speed: 1.2512115012428053 samples/sec                   batch loss = 421.76404094696045 | accuracy = 0.49333333333333335


Epoch[1] Batch[155] Speed: 1.2494965249965722 samples/sec                   batch loss = 435.946569442749 | accuracy = 0.4870967741935484


Epoch[1] Batch[160] Speed: 1.246799511244832 samples/sec                   batch loss = 449.3518068790436 | accuracy = 0.496875


Epoch[1] Batch[165] Speed: 1.250108955611011 samples/sec                   batch loss = 463.53675055503845 | accuracy = 0.49393939393939396


Epoch[1] Batch[170] Speed: 1.247756733036078 samples/sec                   batch loss = 477.4158446788788 | accuracy = 0.49264705882352944


Epoch[1] Batch[175] Speed: 1.2488227817714566 samples/sec                   batch loss = 491.19169187545776 | accuracy = 0.4942857142857143


Epoch[1] Batch[180] Speed: 1.2447967320666846 samples/sec                   batch loss = 504.7963545322418 | accuracy = 0.49722222222222223


Epoch[1] Batch[185] Speed: 1.2418148012174468 samples/sec                   batch loss = 518.7704603672028 | accuracy = 0.49864864864864866


Epoch[1] Batch[190] Speed: 1.2419185836877054 samples/sec                   batch loss = 532.7875640392303 | accuracy = 0.49736842105263157


Epoch[1] Batch[195] Speed: 1.2538541861913248 samples/sec                   batch loss = 546.7906546592712 | accuracy = 0.4987179487179487


Epoch[1] Batch[200] Speed: 1.2525131688750042 samples/sec                   batch loss = 560.7452929019928 | accuracy = 0.4975


Epoch[1] Batch[205] Speed: 1.2445312579395613 samples/sec                   batch loss = 574.9814233779907 | accuracy = 0.49634146341463414


Epoch[1] Batch[210] Speed: 1.2512390291458795 samples/sec                   batch loss = 588.9052214622498 | accuracy = 0.49523809523809526


Epoch[1] Batch[215] Speed: 1.2472969941919694 samples/sec                   batch loss = 602.4760546684265 | accuracy = 0.4965116279069767


Epoch[1] Batch[220] Speed: 1.2516212789141345 samples/sec                   batch loss = 616.2430174350739 | accuracy = 0.49204545454545456


Epoch[1] Batch[225] Speed: 1.251201983402192 samples/sec                   batch loss = 630.1550192832947 | accuracy = 0.4911111111111111


Epoch[1] Batch[230] Speed: 1.2459950356905363 samples/sec                   batch loss = 644.220828294754 | accuracy = 0.4923913043478261


Epoch[1] Batch[235] Speed: 1.2494807054645403 samples/sec                   batch loss = 658.0660111904144 | accuracy = 0.4957446808510638


Epoch[1] Batch[240] Speed: 1.2511850942571665 samples/sec                   batch loss = 671.888927936554 | accuracy = 0.49895833333333334


Epoch[1] Batch[245] Speed: 1.2505778546436823 samples/sec                   batch loss = 685.8221473693848 | accuracy = 0.5


Epoch[1] Batch[250] Speed: 1.243498694962068 samples/sec                   batch loss = 699.5997142791748 | accuracy = 0.504


Epoch[1] Batch[255] Speed: 1.248416413432213 samples/sec                   batch loss = 713.0653078556061 | accuracy = 0.5078431372549019


Epoch[1] Batch[260] Speed: 1.2545408755067602 samples/sec                   batch loss = 726.7322719097137 | accuracy = 0.510576923076923


Epoch[1] Batch[265] Speed: 1.2570086112634973 samples/sec                   batch loss = 740.612589597702 | accuracy = 0.5084905660377359


Epoch[1] Batch[270] Speed: 1.262932223334309 samples/sec                   batch loss = 754.2870831489563 | accuracy = 0.5101851851851852


Epoch[1] Batch[275] Speed: 1.249706031986759 samples/sec                   batch loss = 768.126946926117 | accuracy = 0.5127272727272727


Epoch[1] Batch[280] Speed: 1.2531650656407431 samples/sec                   batch loss = 781.7809948921204 | accuracy = 0.5133928571428571


Epoch[1] Batch[285] Speed: 1.2598426332106674 samples/sec                   batch loss = 795.2507085800171 | accuracy = 0.5149122807017544


Epoch[1] Batch[290] Speed: 1.2538583093331834 samples/sec                   batch loss = 808.7526292800903 | accuracy = 0.5146551724137931


Epoch[1] Batch[295] Speed: 1.255013387410097 samples/sec                   batch loss = 822.6233093738556 | accuracy = 0.5152542372881356


Epoch[1] Batch[300] Speed: 1.2538440658671823 samples/sec                   batch loss = 836.1310181617737 | accuracy = 0.5158333333333334


Epoch[1] Batch[305] Speed: 1.2546966196543523 samples/sec                   batch loss = 849.161238193512 | accuracy = 0.521311475409836


Epoch[1] Batch[310] Speed: 1.258258957114224 samples/sec                   batch loss = 862.7419047355652 | accuracy = 0.5233870967741936


Epoch[1] Batch[315] Speed: 1.2582088502307525 samples/sec                   batch loss = 876.4466171264648 | accuracy = 0.5238095238095238


Epoch[1] Batch[320] Speed: 1.2528939521107485 samples/sec                   batch loss = 889.9614078998566 | accuracy = 0.5265625


Epoch[1] Batch[325] Speed: 1.2547684063300184 samples/sec                   batch loss = 903.5666553974152 | accuracy = 0.5284615384615384


Epoch[1] Batch[330] Speed: 1.2572553159138358 samples/sec                   batch loss = 917.0733461380005 | accuracy = 0.5318181818181819


Epoch[1] Batch[335] Speed: 1.2559245984828489 samples/sec                   batch loss = 930.0376086235046 | accuracy = 0.5335820895522388


Epoch[1] Batch[340] Speed: 1.252003482012896 samples/sec                   batch loss = 943.6359570026398 | accuracy = 0.5330882352941176


Epoch[1] Batch[345] Speed: 1.245898712716625 samples/sec                   batch loss = 956.9352633953094 | accuracy = 0.5333333333333333


Epoch[1] Batch[350] Speed: 1.2494840554496562 samples/sec                   batch loss = 970.2725973129272 | accuracy = 0.5357142857142857


Epoch[1] Batch[355] Speed: 1.2551869970376262 samples/sec                   batch loss = 984.0668115615845 | accuracy = 0.5359154929577464


Epoch[1] Batch[360] Speed: 1.2543964245414412 samples/sec                   batch loss = 997.7254981994629 | accuracy = 0.5354166666666667


Epoch[1] Batch[365] Speed: 1.2595929263096948 samples/sec                   batch loss = 1011.2718453407288 | accuracy = 0.536986301369863


Epoch[1] Batch[370] Speed: 1.2490432215355893 samples/sec                   batch loss = 1024.924124956131 | accuracy = 0.5378378378378378


Epoch[1] Batch[375] Speed: 1.2568036158202247 samples/sec                   batch loss = 1038.785530090332 | accuracy = 0.5373333333333333


Epoch[1] Batch[380] Speed: 1.2531939900945392 samples/sec                   batch loss = 1052.3012330532074 | accuracy = 0.5394736842105263


Epoch[1] Batch[385] Speed: 1.2519414468770482 samples/sec                   batch loss = 1065.4554042816162 | accuracy = 0.538961038961039


Epoch[1] Batch[390] Speed: 1.2559307096295191 samples/sec                   batch loss = 1079.2561202049255 | accuracy = 0.5384615384615384


Epoch[1] Batch[395] Speed: 1.2459524703881228 samples/sec                   batch loss = 1092.9471077919006 | accuracy = 0.5392405063291139


Epoch[1] Batch[400] Speed: 1.251952751014526 samples/sec                   batch loss = 1105.7821779251099 | accuracy = 0.543125


Epoch[1] Batch[405] Speed: 1.2551150685632473 samples/sec                   batch loss = 1119.4519393444061 | accuracy = 0.5432098765432098


Epoch[1] Batch[410] Speed: 1.2566892353971775 samples/sec                   batch loss = 1133.702541589737 | accuracy = 0.5445121951219513


Epoch[1] Batch[415] Speed: 1.2464752071122989 samples/sec                   batch loss = 1146.4988567829132 | accuracy = 0.5481927710843374


Epoch[1] Batch[420] Speed: 1.25396776992286 samples/sec                   batch loss = 1159.7774641513824 | accuracy = 0.55


Epoch[1] Batch[425] Speed: 1.2519166905948294 samples/sec                   batch loss = 1173.1718442440033 | accuracy = 0.5511764705882353


Epoch[1] Batch[430] Speed: 1.2520963593548649 samples/sec                   batch loss = 1186.9004311561584 | accuracy = 0.5517441860465117


Epoch[1] Batch[435] Speed: 1.255036858051152 samples/sec                   batch loss = 1200.0745823383331 | accuracy = 0.5522988505747126


Epoch[1] Batch[440] Speed: 1.2406766314905808 samples/sec                   batch loss = 1213.36851978302 | accuracy = 0.5528409090909091


Epoch[1] Batch[445] Speed: 1.2471167532892589 samples/sec                   batch loss = 1227.7155420780182 | accuracy = 0.551123595505618


Epoch[1] Batch[450] Speed: 1.2492264381965146 samples/sec                   batch loss = 1240.730102300644 | accuracy = 0.5516666666666666


Epoch[1] Batch[455] Speed: 1.2502538183036531 samples/sec                   batch loss = 1254.0116231441498 | accuracy = 0.5527472527472528


Epoch[1] Batch[460] Speed: 1.2563494177486632 samples/sec                   batch loss = 1267.134156703949 | accuracy = 0.5532608695652174


Epoch[1] Batch[465] Speed: 1.2498422352436966 samples/sec                   batch loss = 1280.1920750141144 | accuracy = 0.5553763440860215


Epoch[1] Batch[470] Speed: 1.2525746060697236 samples/sec                   batch loss = 1293.7535693645477 | accuracy = 0.5547872340425531


Epoch[1] Batch[475] Speed: 1.2523135629776778 samples/sec                   batch loss = 1306.0804941654205 | accuracy = 0.5568421052631579


Epoch[1] Batch[480] Speed: 1.2512283911007536 samples/sec                   batch loss = 1320.0451576709747 | accuracy = 0.5552083333333333


Epoch[1] Batch[485] Speed: 1.2447403959168708 samples/sec                   batch loss = 1333.7223598957062 | accuracy = 0.5556701030927835


Epoch[1] Batch[490] Speed: 1.254573053206538 samples/sec                   batch loss = 1347.2317380905151 | accuracy = 0.5556122448979591


Epoch[1] Batch[495] Speed: 1.2570259405395499 samples/sec                   batch loss = 1361.3953840732574 | accuracy = 0.555050505050505


Epoch[1] Batch[500] Speed: 1.2598742319724734 samples/sec                   batch loss = 1374.974256515503 | accuracy = 0.5545


Epoch[1] Batch[505] Speed: 1.2616024649095248 samples/sec                   batch loss = 1387.6374111175537 | accuracy = 0.5554455445544555


Epoch[1] Batch[510] Speed: 1.2454573535637683 samples/sec                   batch loss = 1401.5972561836243 | accuracy = 0.5549019607843138


Epoch[1] Batch[515] Speed: 1.2521190668776865 samples/sec                   batch loss = 1414.3395547866821 | accuracy = 0.5567961165048544


Epoch[1] Batch[520] Speed: 1.2507905219375708 samples/sec                   batch loss = 1428.0486931800842 | accuracy = 0.55625


Epoch[1] Batch[525] Speed: 1.2482941736952529 samples/sec                   batch loss = 1441.163515329361 | accuracy = 0.5585714285714286


Epoch[1] Batch[530] Speed: 1.2547197969379906 samples/sec                   batch loss = 1455.1611151695251 | accuracy = 0.5589622641509434


Epoch[1] Batch[535] Speed: 1.2433134688829184 samples/sec                   batch loss = 1467.8424880504608 | accuracy = 0.5602803738317756


Epoch[1] Batch[540] Speed: 1.2566069696785398 samples/sec                   batch loss = 1480.712520122528 | accuracy = 0.5615740740740741


Epoch[1] Batch[545] Speed: 1.2556421403853755 samples/sec                   batch loss = 1493.974430322647 | accuracy = 0.5614678899082569


Epoch[1] Batch[550] Speed: 1.2630641932799858 samples/sec                   batch loss = 1507.0644998550415 | accuracy = 0.5627272727272727


Epoch[1] Batch[555] Speed: 1.2513300198165103 samples/sec                   batch loss = 1519.8737292289734 | accuracy = 0.5635135135135135


Epoch[1] Batch[560] Speed: 1.2483923537194612 samples/sec                   batch loss = 1533.1297050714493 | accuracy = 0.5647321428571429


Epoch[1] Batch[565] Speed: 1.2530482581307423 samples/sec                   batch loss = 1546.41404235363 | accuracy = 0.5646017699115045


Epoch[1] Batch[570] Speed: 1.2562641863287538 samples/sec                   batch loss = 1560.5846036672592 | accuracy = 0.5640350877192982


Epoch[1] Batch[575] Speed: 1.2596914730414328 samples/sec                   batch loss = 1573.4452096223831 | accuracy = 0.5652173913043478


Epoch[1] Batch[580] Speed: 1.2456186187182803 samples/sec                   batch loss = 1587.668994307518 | accuracy = 0.5642241379310344


Epoch[1] Batch[585] Speed: 1.2529882717117546 samples/sec                   batch loss = 1600.6080461740494 | accuracy = 0.564957264957265


Epoch[1] Batch[590] Speed: 1.2569361913623067 samples/sec                   batch loss = 1614.3747645616531 | accuracy = 0.5648305084745763


Epoch[1] Batch[595] Speed: 1.2535609504315939 samples/sec                   batch loss = 1626.7163747549057 | accuracy = 0.5651260504201681


Epoch[1] Batch[600] Speed: 1.2566265468134363 samples/sec                   batch loss = 1641.3355125188828 | accuracy = 0.5633333333333334


Epoch[1] Batch[605] Speed: 1.2470824540797922 samples/sec                   batch loss = 1654.8278583288193 | accuracy = 0.5648760330578513


Epoch[1] Batch[610] Speed: 1.2513136871832133 samples/sec                   batch loss = 1667.581249833107 | accuracy = 0.5651639344262295


Epoch[1] Batch[615] Speed: 1.2524611810712902 samples/sec                   batch loss = 1680.7252708673477 | accuracy = 0.5654471544715447


Epoch[1] Batch[620] Speed: 1.2514712448774779 samples/sec                   batch loss = 1694.6604303121567 | accuracy = 0.5649193548387097


Epoch[1] Batch[625] Speed: 1.2509257491887764 samples/sec                   batch loss = 1707.6447857618332 | accuracy = 0.5656


Epoch[1] Batch[630] Speed: 1.2508471273048054 samples/sec                   batch loss = 1721.133137345314 | accuracy = 0.5654761904761905


Epoch[1] Batch[635] Speed: 1.250183665167613 samples/sec                   batch loss = 1733.793002963066 | accuracy = 0.5661417322834645


Epoch[1] Batch[640] Speed: 1.2521518680869728 samples/sec                   batch loss = 1745.557020187378 | accuracy = 0.56796875


Epoch[1] Batch[645] Speed: 1.256159685437765 samples/sec                   batch loss = 1758.5706334114075 | accuracy = 0.5686046511627907


Epoch[1] Batch[650] Speed: 1.2477760353778662 samples/sec                   batch loss = 1772.320409655571 | accuracy = 0.5684615384615385


Epoch[1] Batch[655] Speed: 1.243940142462937 samples/sec                   batch loss = 1784.7580988407135 | accuracy = 0.5690839694656489


Epoch[1] Batch[660] Speed: 1.248615429315151 samples/sec                   batch loss = 1799.268223285675 | accuracy = 0.5685606060606061


Epoch[1] Batch[665] Speed: 1.2486775070058116 samples/sec                   batch loss = 1812.5054268836975 | accuracy = 0.5684210526315789


Epoch[1] Batch[670] Speed: 1.240486650170928 samples/sec                   batch loss = 1826.1707952022552 | accuracy = 0.5675373134328359


Epoch[1] Batch[675] Speed: 1.2377310940798527 samples/sec                   batch loss = 1839.783362865448 | accuracy = 0.5681481481481482


Epoch[1] Batch[680] Speed: 1.2478586338738271 samples/sec                   batch loss = 1852.7415618896484 | accuracy = 0.5680147058823529


Epoch[1] Batch[685] Speed: 1.2529343730580682 samples/sec                   batch loss = 1865.8155279159546 | accuracy = 0.5678832116788322


Epoch[1] Batch[690] Speed: 1.257575356667601 samples/sec                   batch loss = 1878.329853773117 | accuracy = 0.5681159420289855


Epoch[1] Batch[695] Speed: 1.2493161130438928 samples/sec                   batch loss = 1892.9565300941467 | accuracy = 0.5676258992805755


Epoch[1] Batch[700] Speed: 1.252037398398738 samples/sec                   batch loss = 1904.5296819210052 | accuracy = 0.5696428571428571


Epoch[1] Batch[705] Speed: 1.2580610062873745 samples/sec                   batch loss = 1916.5196895599365 | accuracy = 0.5712765957446808


Epoch[1] Batch[710] Speed: 1.262034647170437 samples/sec                   batch loss = 1929.9305284023285 | accuracy = 0.5711267605633803


Epoch[1] Batch[715] Speed: 1.2555867916095522 samples/sec                   batch loss = 1941.7058979272842 | accuracy = 0.573076923076923


Epoch[1] Batch[720] Speed: 1.2549183872236187 samples/sec                   batch loss = 1954.3640152215958 | accuracy = 0.5736111111111111


Epoch[1] Batch[725] Speed: 1.2568518218722964 samples/sec                   batch loss = 1968.1469930410385 | accuracy = 0.5737931034482758


Epoch[1] Batch[730] Speed: 1.2501425831218245 samples/sec                   batch loss = 1981.137172818184 | accuracy = 0.5743150684931507


Epoch[1] Batch[735] Speed: 1.2496972817329786 samples/sec                   batch loss = 1994.2885049581528 | accuracy = 0.5748299319727891


Epoch[1] Batch[740] Speed: 1.252959637474609 samples/sec                   batch loss = 2005.9198240041733 | accuracy = 0.5753378378378379


Epoch[1] Batch[745] Speed: 1.2469475003732915 samples/sec                   batch loss = 2017.8764564990997 | accuracy = 0.5765100671140939


Epoch[1] Batch[750] Speed: 1.2404737177783762 samples/sec                   batch loss = 2030.0760972499847 | accuracy = 0.5773333333333334


Epoch[1] Batch[755] Speed: 1.2555457295418542 samples/sec                   batch loss = 2043.4528477191925 | accuracy = 0.5771523178807947


Epoch[1] Batch[760] Speed: 1.2523774112452295 samples/sec                   batch loss = 2055.935953140259 | accuracy = 0.5773026315789473


Epoch[1] Batch[765] Speed: 1.2490772567081665 samples/sec                   batch loss = 2068.7448210716248 | accuracy = 0.5777777777777777


Epoch[1] Batch[770] Speed: 1.2425141283524195 samples/sec                   batch loss = 2081.308810710907 | accuracy = 0.5788961038961039


Epoch[1] Batch[775] Speed: 1.2548892892535304 samples/sec                   batch loss = 2094.0955929756165 | accuracy = 0.5783870967741935


Epoch[1] Batch[780] Speed: 1.2554808062152871 samples/sec                   batch loss = 2107.1799993515015 | accuracy = 0.5785256410256411


Epoch[1] Batch[785] Speed: 1.255900812585885 samples/sec                   batch loss = 2119.5214030742645 | accuracy = 0.5796178343949044


[Epoch 1] training: accuracy=0.5793147208121827
[Epoch 1] time cost: 647.9390182495117
[Epoch 1] validation: validation accuracy=0.67


Epoch[2] Batch[5] Speed: 1.2592434083843351 samples/sec                   batch loss = 12.99663233757019 | accuracy = 0.65


Epoch[2] Batch[10] Speed: 1.2547131345541918 samples/sec                   batch loss = 25.589734435081482 | accuracy = 0.625


Epoch[2] Batch[15] Speed: 1.2473495742694656 samples/sec                   batch loss = 37.94275987148285 | accuracy = 0.6333333333333333


Epoch[2] Batch[20] Speed: 1.2514046887128958 samples/sec                   batch loss = 49.6178617477417 | accuracy = 0.675


Epoch[2] Batch[25] Speed: 1.2501514327674652 samples/sec                   batch loss = 62.488473892211914 | accuracy = 0.67


Epoch[2] Batch[30] Speed: 1.254990762543126 samples/sec                   batch loss = 75.57708954811096 | accuracy = 0.6583333333333333


Epoch[2] Batch[35] Speed: 1.257044871436093 samples/sec                   batch loss = 87.82659840583801 | accuracy = 0.6714285714285714


Epoch[2] Batch[40] Speed: 1.2453209950385244 samples/sec                   batch loss = 98.43832421302795 | accuracy = 0.68125


Epoch[2] Batch[45] Speed: 1.252449306764759 samples/sec                   batch loss = 112.29322385787964 | accuracy = 0.6611111111111111


Epoch[2] Batch[50] Speed: 1.2518138457319905 samples/sec                   batch loss = 124.4899787902832 | accuracy = 0.665


Epoch[2] Batch[55] Speed: 1.2537253516711502 samples/sec                   batch loss = 137.83885860443115 | accuracy = 0.6681818181818182


Epoch[2] Batch[60] Speed: 1.247789862903299 samples/sec                   batch loss = 152.21715140342712 | accuracy = 0.6583333333333333


Epoch[2] Batch[65] Speed: 1.2430023948436624 samples/sec                   batch loss = 164.28235161304474 | accuracy = 0.6576923076923077


Epoch[2] Batch[70] Speed: 1.2461357069398697 samples/sec                   batch loss = 175.92566013336182 | accuracy = 0.6607142857142857


Epoch[2] Batch[75] Speed: 1.2534913622130976 samples/sec                   batch loss = 187.4073305130005 | accuracy = 0.6633333333333333


Epoch[2] Batch[80] Speed: 1.2528322029214054 samples/sec                   batch loss = 201.93730354309082 | accuracy = 0.653125


Epoch[2] Batch[85] Speed: 1.2462536361286336 samples/sec                   batch loss = 215.10918593406677 | accuracy = 0.6470588235294118


Epoch[2] Batch[90] Speed: 1.2465950530793592 samples/sec                   batch loss = 227.09483766555786 | accuracy = 0.65


Epoch[2] Batch[95] Speed: 1.2505663888875223 samples/sec                   batch loss = 238.508984208107 | accuracy = 0.65


Epoch[2] Batch[100] Speed: 1.2491074807523257 samples/sec                   batch loss = 251.67268979549408 | accuracy = 0.65


Epoch[2] Batch[105] Speed: 1.251553679786762 samples/sec                   batch loss = 264.5577162504196 | accuracy = 0.6523809523809524


Epoch[2] Batch[110] Speed: 1.2399208122282095 samples/sec                   batch loss = 277.79894053936005 | accuracy = 0.6522727272727272


Epoch[2] Batch[115] Speed: 1.2385664411063222 samples/sec                   batch loss = 292.1388646364212 | accuracy = 0.6456521739130435


Epoch[2] Batch[120] Speed: 1.2495715336418058 samples/sec                   batch loss = 302.96128737926483 | accuracy = 0.6520833333333333


Epoch[2] Batch[125] Speed: 1.2516948619546138 samples/sec                   batch loss = 315.0295695066452 | accuracy = 0.652


Epoch[2] Batch[130] Speed: 1.2424707883896646 samples/sec                   batch loss = 326.44426369667053 | accuracy = 0.6557692307692308


Epoch[2] Batch[135] Speed: 1.2409210049341082 samples/sec                   batch loss = 341.31280851364136 | accuracy = 0.6481481481481481


Epoch[2] Batch[140] Speed: 1.2455760791296269 samples/sec                   batch loss = 353.9874802827835 | accuracy = 0.6517857142857143


Epoch[2] Batch[145] Speed: 1.2471884169155139 samples/sec                   batch loss = 365.82144606113434 | accuracy = 0.6482758620689655


Epoch[2] Batch[150] Speed: 1.2485121040934775 samples/sec                   batch loss = 376.90294277668 | accuracy = 0.6483333333333333


Epoch[2] Batch[155] Speed: 1.2427947610545418 samples/sec                   batch loss = 389.6006861925125 | accuracy = 0.6483870967741936


Epoch[2] Batch[160] Speed: 1.2441628290579954 samples/sec                   batch loss = 401.7069183588028 | accuracy = 0.64375


Epoch[2] Batch[165] Speed: 1.2468826291798587 samples/sec                   batch loss = 414.6771881580353 | accuracy = 0.6454545454545455


Epoch[2] Batch[170] Speed: 1.2433721639110558 samples/sec                   batch loss = 428.174439907074 | accuracy = 0.6426470588235295


Epoch[2] Batch[175] Speed: 1.2567635096967005 samples/sec                   batch loss = 441.99170565605164 | accuracy = 0.6428571428571429


Epoch[2] Batch[180] Speed: 1.2429632567840716 samples/sec                   batch loss = 453.81136977672577 | accuracy = 0.6486111111111111


Epoch[2] Batch[185] Speed: 1.2449253084113037 samples/sec                   batch loss = 465.9104516506195 | accuracy = 0.6513513513513514


Epoch[2] Batch[190] Speed: 1.254100591520918 samples/sec                   batch loss = 477.04071271419525 | accuracy = 0.6526315789473685


Epoch[2] Batch[195] Speed: 1.2622619605624392 samples/sec                   batch loss = 488.85040414333344 | accuracy = 0.6538461538461539


Epoch[2] Batch[200] Speed: 1.2556927948995145 samples/sec                   batch loss = 500.42338740825653 | accuracy = 0.65375


Epoch[2] Batch[205] Speed: 1.2457198006309675 samples/sec                   batch loss = 512.5269985198975 | accuracy = 0.6536585365853659


Epoch[2] Batch[210] Speed: 1.2462108680192254 samples/sec                   batch loss = 524.3188011646271 | accuracy = 0.6571428571428571


Epoch[2] Batch[215] Speed: 1.2520405752342294 samples/sec                   batch loss = 536.39386677742 | accuracy = 0.6581395348837209


Epoch[2] Batch[220] Speed: 1.256100059056981 samples/sec                   batch loss = 549.334127664566 | accuracy = 0.6625


Epoch[2] Batch[225] Speed: 1.247982366460093 samples/sec                   batch loss = 561.8343671560287 | accuracy = 0.6611111111111111


Epoch[2] Batch[230] Speed: 1.250541966595597 samples/sec                   batch loss = 573.8985908031464 | accuracy = 0.6608695652173913


Epoch[2] Batch[235] Speed: 1.2537575812422883 samples/sec                   batch loss = 584.8531587123871 | accuracy = 0.6638297872340425


Epoch[2] Batch[240] Speed: 1.249935537175921 samples/sec                   batch loss = 595.2029039859772 | accuracy = 0.6666666666666666


Epoch[2] Batch[245] Speed: 1.2493054146420106 samples/sec                   batch loss = 609.7838340997696 | accuracy = 0.6642857142857143


Epoch[2] Batch[250] Speed: 1.2425226862821321 samples/sec                   batch loss = 623.6851243972778 | accuracy = 0.663


Epoch[2] Batch[255] Speed: 1.2470943195388107 samples/sec                   batch loss = 636.2400885820389 | accuracy = 0.6607843137254902


Epoch[2] Batch[260] Speed: 1.2538140806807054 samples/sec                   batch loss = 647.5079337358475 | accuracy = 0.6615384615384615


Epoch[2] Batch[265] Speed: 1.2533366661584258 samples/sec                   batch loss = 662.0155736207962 | accuracy = 0.6556603773584906


Epoch[2] Batch[270] Speed: 1.2545705202082185 samples/sec                   batch loss = 672.4375169277191 | accuracy = 0.6574074074074074


Epoch[2] Batch[275] Speed: 1.2443547687871455 samples/sec                   batch loss = 684.9499541521072 | accuracy = 0.6563636363636364


Epoch[2] Batch[280] Speed: 1.2505574401507478 samples/sec                   batch loss = 701.0542186498642 | accuracy = 0.6535714285714286


Epoch[2] Batch[285] Speed: 1.2570349820906477 samples/sec                   batch loss = 712.1177884340286 | accuracy = 0.6552631578947369


Epoch[2] Batch[290] Speed: 1.247164682642043 samples/sec                   batch loss = 723.6100006103516 | accuracy = 0.656896551724138


Epoch[2] Batch[295] Speed: 1.2451120315518325 samples/sec                   batch loss = 735.1796607971191 | accuracy = 0.6584745762711864


Epoch[2] Batch[300] Speed: 1.239269802706976 samples/sec                   batch loss = 746.8630747795105 | accuracy = 0.66


Epoch[2] Batch[305] Speed: 1.245074238924125 samples/sec                   batch loss = 759.1841067075729 | accuracy = 0.659016393442623


Epoch[2] Batch[310] Speed: 1.2546331914669588 samples/sec                   batch loss = 771.9537620544434 | accuracy = 0.6580645161290323


Epoch[2] Batch[315] Speed: 1.2504309597263858 samples/sec                   batch loss = 783.1380594968796 | accuracy = 0.6587301587301587


Epoch[2] Batch[320] Speed: 1.2426044986784581 samples/sec                   batch loss = 795.3018709421158 | accuracy = 0.659375


Epoch[2] Batch[325] Speed: 1.2555901744167945 samples/sec                   batch loss = 808.0801701545715 | accuracy = 0.66


Epoch[2] Batch[330] Speed: 1.2535344441830198 samples/sec                   batch loss = 816.8514256477356 | accuracy = 0.6636363636363637


Epoch[2] Batch[335] Speed: 1.2575935499781496 samples/sec                   batch loss = 827.8291794061661 | accuracy = 0.6656716417910448


Epoch[2] Batch[340] Speed: 1.2520602906620133 samples/sec                   batch loss = 840.3745813369751 | accuracy = 0.6647058823529411


Epoch[2] Batch[345] Speed: 1.2467817215547259 samples/sec                   batch loss = 850.9830521345139 | accuracy = 0.6652173913043479


Epoch[2] Batch[350] Speed: 1.2495610169986373 samples/sec                   batch loss = 863.6514329910278 | accuracy = 0.6664285714285715


Epoch[2] Batch[355] Speed: 1.2545767119999462 samples/sec                   batch loss = 875.0768766403198 | accuracy = 0.6676056338028169


Epoch[2] Batch[360] Speed: 1.2508500183259805 samples/sec                   batch loss = 888.7914828062057 | accuracy = 0.6659722222222222


Epoch[2] Batch[365] Speed: 1.248745074621178 samples/sec                   batch loss = 898.8623272180557 | accuracy = 0.6684931506849315


Epoch[2] Batch[370] Speed: 1.2479480196211767 samples/sec                   batch loss = 909.0700262784958 | accuracy = 0.6695945945945946


Epoch[2] Batch[375] Speed: 1.2562005995572019 samples/sec                   batch loss = 922.042726278305 | accuracy = 0.6686666666666666


Epoch[2] Batch[380] Speed: 1.2456210232165712 samples/sec                   batch loss = 936.0522776842117 | accuracy = 0.6664473684210527


Epoch[2] Batch[385] Speed: 1.2515397687176526 samples/sec                   batch loss = 949.5574787855148 | accuracy = 0.6655844155844156


Epoch[2] Batch[390] Speed: 1.2429278965155275 samples/sec                   batch loss = 961.0124324560165 | accuracy = 0.6666666666666666


Epoch[2] Batch[395] Speed: 1.2461826352698349 samples/sec                   batch loss = 972.1491636037827 | accuracy = 0.6683544303797468


Epoch[2] Batch[400] Speed: 1.2504275114658934 samples/sec                   batch loss = 982.8907089233398 | accuracy = 0.67


Epoch[2] Batch[405] Speed: 1.2472675995020808 samples/sec                   batch loss = 997.1736311912537 | accuracy = 0.667283950617284


Epoch[2] Batch[410] Speed: 1.253233587871108 samples/sec                   batch loss = 1007.0698395371437 | accuracy = 0.6682926829268293


Epoch[2] Batch[415] Speed: 1.2403204756996942 samples/sec                   batch loss = 1018.3310827612877 | accuracy = 0.6692771084337349


Epoch[2] Batch[420] Speed: 1.2560421309603025 samples/sec                   batch loss = 1029.574285686016 | accuracy = 0.669047619047619


Epoch[2] Batch[425] Speed: 1.2475097528992871 samples/sec                   batch loss = 1039.4790877699852 | accuracy = 0.6711764705882353


Epoch[2] Batch[430] Speed: 1.2527286463551568 samples/sec                   batch loss = 1052.8785949349403 | accuracy = 0.6709302325581395


Epoch[2] Batch[435] Speed: 1.247898452111128 samples/sec                   batch loss = 1065.5442669987679 | accuracy = 0.6706896551724137


Epoch[2] Batch[440] Speed: 1.2458109154115897 samples/sec                   batch loss = 1076.6866236329079 | accuracy = 0.6704545454545454


Epoch[2] Batch[445] Speed: 1.249864674855359 samples/sec                   batch loss = 1086.8075407147408 | accuracy = 0.6713483146067416


Epoch[2] Batch[450] Speed: 1.2564462343494813 samples/sec                   batch loss = 1096.9470133185387 | accuracy = 0.6722222222222223


Epoch[2] Batch[455] Speed: 1.2519753599019028 samples/sec                   batch loss = 1109.6695426106453 | accuracy = 0.6730769230769231


Epoch[2] Batch[460] Speed: 1.2459275802871055 samples/sec                   batch loss = 1121.9665009379387 | accuracy = 0.6744565217391304


Epoch[2] Batch[465] Speed: 1.2490219272479592 samples/sec                   batch loss = 1132.7152723670006 | accuracy = 0.6763440860215054


Epoch[2] Batch[470] Speed: 1.2551077447163015 samples/sec                   batch loss = 1141.6659943461418 | accuracy = 0.6781914893617021


Epoch[2] Batch[475] Speed: 1.253047696608771 samples/sec                   batch loss = 1150.0822016000748 | accuracy = 0.6789473684210526


Epoch[2] Batch[480] Speed: 1.2603098698262474 samples/sec                   batch loss = 1159.6042841672897 | accuracy = 0.6802083333333333


Epoch[2] Batch[485] Speed: 1.2425318884886238 samples/sec                   batch loss = 1171.5084832906723 | accuracy = 0.6798969072164949


Epoch[2] Batch[490] Speed: 1.247483223659776 samples/sec                   batch loss = 1183.0740513205528 | accuracy = 0.6806122448979591


Epoch[2] Batch[495] Speed: 1.2529648776293196 samples/sec                   batch loss = 1194.9221295714378 | accuracy = 0.6797979797979798


Epoch[2] Batch[500] Speed: 1.2575410453279356 samples/sec                   batch loss = 1205.963783442974 | accuracy = 0.6805


Epoch[2] Batch[505] Speed: 1.2550332904571346 samples/sec                   batch loss = 1217.7883680462837 | accuracy = 0.6821782178217822


Epoch[2] Batch[510] Speed: 1.247232735672434 samples/sec                   batch loss = 1233.8922974467278 | accuracy = 0.6808823529411765


Epoch[2] Batch[515] Speed: 1.2548894769780448 samples/sec                   batch loss = 1243.2237259149551 | accuracy = 0.6815533980582524


Epoch[2] Batch[520] Speed: 1.257614666190323 samples/sec                   batch loss = 1253.5181500911713 | accuracy = 0.6826923076923077


Epoch[2] Batch[525] Speed: 1.2568823291751694 samples/sec                   batch loss = 1264.6349226236343 | accuracy = 0.6833333333333333


Epoch[2] Batch[530] Speed: 1.2504024422518938 samples/sec                   batch loss = 1275.8538579940796 | accuracy = 0.6839622641509434


Epoch[2] Batch[535] Speed: 1.2455077444387908 samples/sec                   batch loss = 1288.0896680355072 | accuracy = 0.6827102803738317


Epoch[2] Batch[540] Speed: 1.252549170152997 samples/sec                   batch loss = 1302.4264302253723 | accuracy = 0.6814814814814815


Epoch[2] Batch[545] Speed: 1.2593625082016644 samples/sec                   batch loss = 1315.0046744346619 | accuracy = 0.6821100917431193


Epoch[2] Batch[550] Speed: 1.2581985651306422 samples/sec                   batch loss = 1327.7654874324799 | accuracy = 0.6809090909090909


Epoch[2] Batch[555] Speed: 1.2441983518978064 samples/sec                   batch loss = 1339.181479692459 | accuracy = 0.6815315315315316


Epoch[2] Batch[560] Speed: 1.252690389949277 samples/sec                   batch loss = 1348.5670535564423 | accuracy = 0.6821428571428572


Epoch[2] Batch[565] Speed: 1.259338591906889 samples/sec                   batch loss = 1363.213510274887 | accuracy = 0.6805309734513274


Epoch[2] Batch[570] Speed: 1.2532046615894243 samples/sec                   batch loss = 1375.9602794647217 | accuracy = 0.6802631578947368


Epoch[2] Batch[575] Speed: 1.2591154484504419 samples/sec                   batch loss = 1390.415182352066 | accuracy = 0.678695652173913


Epoch[2] Batch[580] Speed: 1.248173536517329 samples/sec                   batch loss = 1401.2992850542068 | accuracy = 0.6788793103448276


Epoch[2] Batch[585] Speed: 1.2457521749148082 samples/sec                   batch loss = 1411.2868818044662 | accuracy = 0.6790598290598291


Epoch[2] Batch[590] Speed: 1.253783815837469 samples/sec                   batch loss = 1423.1725325584412 | accuracy = 0.6796610169491526


Epoch[2] Batch[595] Speed: 1.2565121986163938 samples/sec                   batch loss = 1435.9446196556091 | accuracy = 0.6789915966386555


Epoch[2] Batch[600] Speed: 1.2525153195425784 samples/sec                   batch loss = 1444.6187900304794 | accuracy = 0.6808333333333333


Epoch[2] Batch[605] Speed: 1.2538759266993138 samples/sec                   batch loss = 1455.0965377092361 | accuracy = 0.6814049586776859


Epoch[2] Batch[610] Speed: 1.2591135585429234 samples/sec                   batch loss = 1470.3595234155655 | accuracy = 0.680327868852459


Epoch[2] Batch[615] Speed: 1.2595833750949255 samples/sec                   batch loss = 1482.7807856798172 | accuracy = 0.6808943089430894


Epoch[2] Batch[620] Speed: 1.2586200144787827 samples/sec                   batch loss = 1492.3090175390244 | accuracy = 0.682258064516129


Epoch[2] Batch[625] Speed: 1.2485637645666379 samples/sec                   batch loss = 1503.7085543870926 | accuracy = 0.6816


Epoch[2] Batch[630] Speed: 1.250792293691822 samples/sec                   batch loss = 1515.1412756443024 | accuracy = 0.6817460317460318


Epoch[2] Batch[635] Speed: 1.2579827111896484 samples/sec                   batch loss = 1525.9663887023926 | accuracy = 0.6826771653543308


Epoch[2] Batch[640] Speed: 1.2500355589074188 samples/sec                   batch loss = 1539.962130188942 | accuracy = 0.6828125


Epoch[2] Batch[645] Speed: 1.2589647461391844 samples/sec                   batch loss = 1552.1066665649414 | accuracy = 0.6829457364341085


Epoch[2] Batch[650] Speed: 1.2456765142177326 samples/sec                   batch loss = 1563.3915075063705 | accuracy = 0.683076923076923


Epoch[2] Batch[655] Speed: 1.2545230518353547 samples/sec                   batch loss = 1572.8454778194427 | accuracy = 0.683969465648855


Epoch[2] Batch[660] Speed: 1.261107726468602 samples/sec                   batch loss = 1582.245181620121 | accuracy = 0.6848484848484848


Epoch[2] Batch[665] Speed: 1.2539486504287003 samples/sec                   batch loss = 1592.7969593405724 | accuracy = 0.6853383458646617


Epoch[2] Batch[670] Speed: 1.2517926436696534 samples/sec                   batch loss = 1606.4490128159523 | accuracy = 0.685820895522388


Epoch[2] Batch[675] Speed: 1.2400997122831268 samples/sec                   batch loss = 1615.7008444666862 | accuracy = 0.6862962962962963


Epoch[2] Batch[680] Speed: 1.2436591766885068 samples/sec                   batch loss = 1624.4756826758385 | accuracy = 0.6875


Epoch[2] Batch[685] Speed: 1.2526168768632315 samples/sec                   batch loss = 1636.0540217757225 | accuracy = 0.6879562043795621


Epoch[2] Batch[690] Speed: 1.251932104768735 samples/sec                   batch loss = 1646.4730085730553 | accuracy = 0.6884057971014492


Epoch[2] Batch[695] Speed: 1.2450039267172088 samples/sec                   batch loss = 1657.1324053406715 | accuracy = 0.6888489208633094


Epoch[2] Batch[700] Speed: 1.2456158443087106 samples/sec                   batch loss = 1665.4065122008324 | accuracy = 0.69


Epoch[2] Batch[705] Speed: 1.251714846660004 samples/sec                   batch loss = 1678.7333744168282 | accuracy = 0.6900709219858157


Epoch[2] Batch[710] Speed: 1.250035372632334 samples/sec                   batch loss = 1692.4281826615334 | accuracy = 0.6897887323943662


Epoch[2] Batch[715] Speed: 1.2502111478619258 samples/sec                   batch loss = 1703.799772799015 | accuracy = 0.6898601398601398


Epoch[2] Batch[720] Speed: 1.2370956991845539 samples/sec                   batch loss = 1717.0439626574516 | accuracy = 0.6892361111111112


Epoch[2] Batch[725] Speed: 1.2499000584525815 samples/sec                   batch loss = 1728.1981005072594 | accuracy = 0.6896551724137931


Epoch[2] Batch[730] Speed: 1.2498832975515357 samples/sec                   batch loss = 1736.7333253026009 | accuracy = 0.6907534246575342


Epoch[2] Batch[735] Speed: 1.248626208830547 samples/sec                   batch loss = 1747.6719453930855 | accuracy = 0.691156462585034


Epoch[2] Batch[740] Speed: 1.2453138774813082 samples/sec                   batch loss = 1758.5313554406166 | accuracy = 0.6912162162162162


Epoch[2] Batch[745] Speed: 1.2432677698410999 samples/sec                   batch loss = 1772.7808062434196 | accuracy = 0.6899328859060403


Epoch[2] Batch[750] Speed: 1.2503443861847257 samples/sec                   batch loss = 1784.534007012844 | accuracy = 0.6893333333333334


Epoch[2] Batch[755] Speed: 1.256273405063622 samples/sec                   batch loss = 1797.8092568516731 | accuracy = 0.6887417218543046


Epoch[2] Batch[760] Speed: 1.2596811636880425 samples/sec                   batch loss = 1811.7916772961617 | accuracy = 0.6881578947368421


Epoch[2] Batch[765] Speed: 1.2516134355469903 samples/sec                   batch loss = 1822.0172870755196 | accuracy = 0.6875816993464052


Epoch[2] Batch[770] Speed: 1.246114881890226 samples/sec                   batch loss = 1832.5891042351723 | accuracy = 0.687987012987013


Epoch[2] Batch[775] Speed: 1.2496319379674514 samples/sec                   batch loss = 1848.221980392933 | accuracy = 0.6870967741935484


Epoch[2] Batch[780] Speed: 1.253626143712709 samples/sec                   batch loss = 1861.3718805909157 | accuracy = 0.6871794871794872


Epoch[2] Batch[785] Speed: 1.2501779824561534 samples/sec                   batch loss = 1874.9383442997932 | accuracy = 0.6872611464968152


[Epoch 2] training: accuracy=0.6871827411167513
[Epoch 2] time cost: 646.244785785675
[Epoch 2] validation: validation accuracy=0.7033333333333334
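
The `batch loss` values in the log above keep growing within each epoch, which suggests they are running totals rather than per-batch losses. The following is a minimal sketch, assuming that interpretation, of how one progress line could be parsed and converted into a mean loss per batch. The helper names `parse_log_line` and `mean_loss_per_batch` are hypothetical and are not part of MXNet.

```python
import re

# Pattern for the per-batch progress lines printed in the log above.
LOG_PATTERN = re.compile(
    r"Epoch\[(\d+)\] Batch\[(\d+)\] Speed: ([\d.]+) samples/sec\s+"
    r"batch loss = ([\d.]+) \| accuracy = ([\d.]+)"
)


def parse_log_line(line):
    """Return the fields of one progress line as a dict, or None if it does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    epoch, batch, speed, loss, accuracy = match.groups()
    return {
        "epoch": int(epoch),
        "batch": int(batch),
        "speed": float(speed),
        "batch_loss": float(loss),
        "accuracy": float(accuracy),
    }


def mean_loss_per_batch(record):
    """Divide the logged loss by the batch index.

    Assumption: the logged "batch loss" is a running sum over all batches
    seen so far in the current epoch.
    """
    return record["batch_loss"] / record["batch"]


# Last progress line of epoch 2, copied from the log above.
line = (
    "Epoch[2] Batch[785] Speed: 1.2501779824561534 samples/sec"
    "                   batch loss = 1874.9383442997932"
    " | accuracy = 0.6872611464968152"
)
record = parse_log_line(line)
print(f"mean loss per batch: {mean_loss_per_batch(record):.3f}")
```

Comparing this value at the same batch index across epochs gives a rough view of whether the loss is falling, without having to change the training loop.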


## Next steps

Now that you have trained a neural network and run predictions with it on GPUs, you have reached the end of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).