<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference can be faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[03:32:06] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet tells you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If fewer than two GPUs are available, the code falls back to copying `x` to the CPU, which is what the output below shows:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[03:32:06] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
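
As a minimal sketch of this indexing scheme, the hypothetical helper below (`select_gpu_ids` is not part of MXNet) checks a list of requested GPU indices against the number of GPUs available. It is plain Python so that it runs without a GPU; in an MXNet session you would presumably pass `npx.num_gpus()` as the count and map each returned index `i` to `npx.gpu(i)`.

```python
def select_gpu_ids(num_available, requested):
    """Return the requested GPU indices that exist on this machine.

    `num_available` is the GPU count (for example, the value returned by
    `npx.num_gpus()`); `requested` is an iterable of 0-based indices.
    This helper is illustrative and not part of MXNet.
    """
    valid = []
    for gpu_id in requested:
        if gpu_id < 0:
            raise ValueError(f"GPU index must be non-negative, got {gpu_id}")
        # Valid indices run from 0 to num_available - 1.
        if gpu_id < num_available:
            valid.append(gpu_id)
    return valid


# On a machine with eight GPUs, the last valid index is 7.
print(select_gpu_ids(8, [0, 7, 8]))  # [0, 7]
# With no GPUs, nothing is selected and the caller can fall back to the CPU.
print(select_gpu_ids(0, [0, 1]))     # []
```

Returning an empty list instead of raising an error for unavailable indices is a design choice made here so that the caller can fall back to the CPU, in the same spirit as the `npx.gpu() if npx.num_gpus() > 0 else npx.cpu()` pattern used earlier.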

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the input of an operation is already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` module support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and the parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below; alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[03:32:06] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 5.8964143, -0.8121536]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own separate chunk of the data.
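
To make the arithmetic behind data parallelism concrete, here is a minimal sketch in plain Python rather than MXNet. The model (a single weight `w` with a squared-error loss), the data values, and the helper names `split_batch` and `grad_sum` are all assumptions chosen for illustration. It shows that summing the per-shard gradients and dividing by the full batch size gives the same result as computing the mean gradient on a single device.

```python
import math


def split_batch(batch, num_parts):
    """Split a list into `num_parts` contiguous, nearly equal shards."""
    size, extra = divmod(len(batch), num_parts)
    shards, start = [], 0
    for i in range(num_parts):
        # The first `extra` shards each take one additional sample.
        end = start + size + (1 if i < extra else 0)
        shards.append(batch[start:end])
        start = end
    return shards


def grad_sum(w, samples):
    """Sum of d/dw (w * x - y)^2 over the samples in one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


w = 0.5
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 7.0)]  # (x, y) pairs

# Single-device reference: mean gradient over the whole batch.
reference = grad_sum(w, batch) / len(batch)

# Data parallel: each "device" handles one shard, then results are combined.
shards = split_batch(batch, num_parts=2)
per_device = [grad_sum(w, shard) for shard in shards]
combined = sum(per_device) / len(batch)

print(math.isclose(reference, combined))  # True
```

In Gluon, `gluon.utils.split_and_load` performs the splitting step, and `trainer.step(batch_size)` rescales the gradients by `1/batch_size`, which plays a role similar to the final division in this sketch.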

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations on the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            # Note: the timer spans the last `log_interval` batches, and
            # `train_loss` is the running total for the epoch so far.
            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7731159475587417 samples/sec                   batch loss = 14.009833574295044 | accuracy = 0.55


Epoch[1] Batch[10] Speed: 1.2524134047193065 samples/sec                   batch loss = 26.74721646308899 | accuracy = 0.6


Epoch[1] Batch[15] Speed: 1.2517163408692247 samples/sec                   batch loss = 40.02427864074707 | accuracy = 0.6


Epoch[1] Batch[20] Speed: 1.2564621366561162 samples/sec                   batch loss = 55.44215536117554 | accuracy = 0.5375


Epoch[1] Batch[25] Speed: 1.2587522179998425 samples/sec                   batch loss = 69.69773077964783 | accuracy = 0.52


Epoch[1] Batch[30] Speed: 1.2517666791267217 samples/sec                   batch loss = 83.22654676437378 | accuracy = 0.5416666666666666


Epoch[1] Batch[35] Speed: 1.254802003439087 samples/sec                   batch loss = 96.73352026939392 | accuracy = 0.5571428571428572


Epoch[1] Batch[40] Speed: 1.2524450058903798 samples/sec                   batch loss = 110.20593810081482 | accuracy = 0.56875


Epoch[1] Batch[45] Speed: 1.2501152897406846 samples/sec                   batch loss = 124.30359816551208 | accuracy = 0.5555555555555556


Epoch[1] Batch[50] Speed: 1.2611029867468253 samples/sec                   batch loss = 137.989031791687 | accuracy = 0.55


Epoch[1] Batch[55] Speed: 1.2524479977998646 samples/sec                   batch loss = 152.61743927001953 | accuracy = 0.5409090909090909


Epoch[1] Batch[60] Speed: 1.2528863734734728 samples/sec                   batch loss = 167.704252243042 | accuracy = 0.5416666666666666


Epoch[1] Batch[65] Speed: 1.253767419086822 samples/sec                   batch loss = 181.31278610229492 | accuracy = 0.5461538461538461


Epoch[1] Batch[70] Speed: 1.252732949177793 samples/sec                   batch loss = 194.5258274078369 | accuracy = 0.5464285714285714


Epoch[1] Batch[75] Speed: 1.2515467708949932 samples/sec                   batch loss = 208.05165362358093 | accuracy = 0.5533333333333333


Epoch[1] Batch[80] Speed: 1.2467957123621283 samples/sec                   batch loss = 222.04365420341492 | accuracy = 0.55


Epoch[1] Batch[85] Speed: 1.2565688524881158 samples/sec                   batch loss = 236.36031532287598 | accuracy = 0.5441176470588235


Epoch[1] Batch[90] Speed: 1.2517905888824041 samples/sec                   batch loss = 251.8419439792633 | accuracy = 0.5361111111111111


Epoch[1] Batch[95] Speed: 1.254441350855259 samples/sec                   batch loss = 264.93115282058716 | accuracy = 0.5447368421052632


Epoch[1] Batch[100] Speed: 1.257866041278048 samples/sec                   batch loss = 278.25319027900696 | accuracy = 0.5475


Epoch[1] Batch[105] Speed: 1.2538410672839957 samples/sec                   batch loss = 291.56984877586365 | accuracy = 0.5523809523809524


Epoch[1] Batch[110] Speed: 1.2535039118725697 samples/sec                   batch loss = 305.4121632575989 | accuracy = 0.55


Epoch[1] Batch[115] Speed: 1.2467553159195086 samples/sec                   batch loss = 319.78242659568787 | accuracy = 0.5478260869565217


Epoch[1] Batch[120] Speed: 1.2551697184272432 samples/sec                   batch loss = 332.96024465560913 | accuracy = 0.55


Epoch[1] Batch[125] Speed: 1.2555742002094568 samples/sec                   batch loss = 347.179315328598 | accuracy = 0.55


Epoch[1] Batch[130] Speed: 1.2564573376927706 samples/sec                   batch loss = 361.113214969635 | accuracy = 0.5480769230769231


Epoch[1] Batch[135] Speed: 1.2513115406402513 samples/sec                   batch loss = 375.1576738357544 | accuracy = 0.5481481481481482


Epoch[1] Batch[140] Speed: 1.2468949541872152 samples/sec                   batch loss = 389.29177069664 | accuracy = 0.5446428571428571


Epoch[1] Batch[145] Speed: 1.248250615319291 samples/sec                   batch loss = 403.06412053108215 | accuracy = 0.5448275862068965


Epoch[1] Batch[150] Speed: 1.251721197073829 samples/sec                   batch loss = 416.3118028640747 | accuracy = 0.5466666666666666


Epoch[1] Batch[155] Speed: 1.2426021978451718 samples/sec                   batch loss = 429.86732506752014 | accuracy = 0.5451612903225806


Epoch[1] Batch[160] Speed: 1.2470121929419735 samples/sec                   batch loss = 443.0743601322174 | accuracy = 0.546875


Epoch[1] Batch[165] Speed: 1.2517835839765705 samples/sec                   batch loss = 457.2453656196594 | accuracy = 0.5409090909090909


Epoch[1] Batch[170] Speed: 1.257898955658152 samples/sec                   batch loss = 471.31799602508545 | accuracy = 0.5367647058823529


Epoch[1] Batch[175] Speed: 1.2572194204440714 samples/sec                   batch loss = 484.78939986228943 | accuracy = 0.5414285714285715


Epoch[1] Batch[180] Speed: 1.2605294591341114 samples/sec                   batch loss = 498.44624185562134 | accuracy = 0.5416666666666666


Epoch[1] Batch[185] Speed: 1.2555796501874552 samples/sec                   batch loss = 512.2093081474304 | accuracy = 0.5432432432432432


Epoch[1] Batch[190] Speed: 1.249527513829009 samples/sec                   batch loss = 525.5005757808685 | accuracy = 0.5447368421052632


Epoch[1] Batch[195] Speed: 1.2472345900823787 samples/sec                   batch loss = 538.9055542945862 | accuracy = 0.5448717948717948


Epoch[1] Batch[200] Speed: 1.2473990054394284 samples/sec                   batch loss = 552.7554321289062 | accuracy = 0.54375


Epoch[1] Batch[205] Speed: 1.2451235823236733 samples/sec                   batch loss = 566.2408752441406 | accuracy = 0.5439024390243903


Epoch[1] Batch[210] Speed: 1.2448273958519445 samples/sec                   batch loss = 579.909307718277 | accuracy = 0.5452380952380952


Epoch[1] Batch[215] Speed: 1.2414694748363881 samples/sec                   batch loss = 593.001754283905 | accuracy = 0.55


Epoch[1] Batch[220] Speed: 1.2417435699686514 samples/sec                   batch loss = 607.3188171386719 | accuracy = 0.5477272727272727


Epoch[1] Batch[225] Speed: 1.2433120868061809 samples/sec                   batch loss = 621.0826575756073 | accuracy = 0.5477777777777778


Epoch[1] Batch[230] Speed: 1.243629307911447 samples/sec                   batch loss = 635.2299337387085 | accuracy = 0.5434782608695652


Epoch[1] Batch[235] Speed: 1.24149574891044 samples/sec                   batch loss = 648.799957036972 | accuracy = 0.5457446808510639


Epoch[1] Batch[240] Speed: 1.2400321604486497 samples/sec                   batch loss = 662.4207129478455 | accuracy = 0.5479166666666667


Epoch[1] Batch[245] Speed: 1.2422988385910496 samples/sec                   batch loss = 676.6489291191101 | accuracy = 0.5448979591836735


Epoch[1] Batch[250] Speed: 1.250635839167186 samples/sec                   batch loss = 689.867388010025 | accuracy = 0.548


Epoch[1] Batch[255] Speed: 1.2516632052461 samples/sec                   batch loss = 703.4317116737366 | accuracy = 0.5529411764705883


Epoch[1] Batch[260] Speed: 1.2542385045441171 samples/sec                   batch loss = 717.3475213050842 | accuracy = 0.5509615384615385


Epoch[1] Batch[265] Speed: 1.2512124343722282 samples/sec                   batch loss = 731.2013983726501 | accuracy = 0.5509433962264151


Epoch[1] Batch[270] Speed: 1.2516577892145 samples/sec                   batch loss = 744.9599642753601 | accuracy = 0.5518518518518518


Epoch[1] Batch[275] Speed: 1.2552019283938851 samples/sec                   batch loss = 757.5326561927795 | accuracy = 0.5554545454545454


Epoch[1] Batch[280] Speed: 1.2563715271251292 samples/sec                   batch loss = 771.1557641029358 | accuracy = 0.5544642857142857


Epoch[1] Batch[285] Speed: 1.253397996890041 samples/sec                   batch loss = 785.499454498291 | accuracy = 0.55


Epoch[1] Batch[290] Speed: 1.254055033448387 samples/sec                   batch loss = 798.5858075618744 | accuracy = 0.5543103448275862


Epoch[1] Batch[295] Speed: 1.2585189919375408 samples/sec                   batch loss = 812.201516866684 | accuracy = 0.5542372881355933


Epoch[1] Batch[300] Speed: 1.2404038324931492 samples/sec                   batch loss = 825.2934110164642 | accuracy = 0.555


Epoch[1] Batch[305] Speed: 1.2516247337618012 samples/sec                   batch loss = 838.6682212352753 | accuracy = 0.5565573770491803


Epoch[1] Batch[310] Speed: 1.2462476187929374 samples/sec                   batch loss = 851.9941871166229 | accuracy = 0.5588709677419355


Epoch[1] Batch[315] Speed: 1.253310450272334 samples/sec                   batch loss = 865.5917203426361 | accuracy = 0.5595238095238095


Epoch[1] Batch[320] Speed: 1.2442493791626326 samples/sec                   batch loss = 878.7559058666229 | accuracy = 0.5609375


Epoch[1] Batch[325] Speed: 1.2505334842670124 samples/sec                   batch loss = 892.4815928936005 | accuracy = 0.56


Epoch[1] Batch[330] Speed: 1.2534918304794602 samples/sec                   batch loss = 906.8747100830078 | accuracy = 0.5590909090909091


Epoch[1] Batch[335] Speed: 1.2469800311841335 samples/sec                   batch loss = 920.4548392295837 | accuracy = 0.5582089552238806


Epoch[1] Batch[340] Speed: 1.2465837528620272 samples/sec                   batch loss = 933.204874753952 | accuracy = 0.5602941176470588


Epoch[1] Batch[345] Speed: 1.2463823284472684 samples/sec                   batch loss = 946.7821242809296 | accuracy = 0.5601449275362319


Epoch[1] Batch[350] Speed: 1.2485827202008066 samples/sec                   batch loss = 960.7065801620483 | accuracy = 0.5607142857142857


Epoch[1] Batch[355] Speed: 1.2451286647311772 samples/sec                   batch loss = 973.5545876026154 | accuracy = 0.5612676056338028


Epoch[1] Batch[360] Speed: 1.2491860701128579 samples/sec                   batch loss = 986.2970743179321 | accuracy = 0.5631944444444444


Epoch[1] Batch[365] Speed: 1.24816712919007 samples/sec                   batch loss = 1000.1743788719177 | accuracy = 0.5616438356164384


Epoch[1] Batch[370] Speed: 1.2528607377751504 samples/sec                   batch loss = 1013.0043358802795 | accuracy = 0.5628378378378378


Epoch[1] Batch[375] Speed: 1.250229035552786 samples/sec                   batch loss = 1026.9967126846313 | accuracy = 0.5626666666666666


Epoch[1] Batch[380] Speed: 1.2492585298939913 samples/sec                   batch loss = 1040.358009338379 | accuracy = 0.5631578947368421


Epoch[1] Batch[385] Speed: 1.248621841245846 samples/sec                   batch loss = 1053.563447713852 | accuracy = 0.562987012987013


Epoch[1] Batch[390] Speed: 1.254498849976521 samples/sec                   batch loss = 1066.7707042694092 | accuracy = 0.5634615384615385


Epoch[1] Batch[395] Speed: 1.2513625930267995 samples/sec                   batch loss = 1079.5986597537994 | accuracy = 0.5651898734177215


Epoch[1] Batch[400] Speed: 1.2474454724565474 samples/sec                   batch loss = 1092.201106786728 | accuracy = 0.565


Epoch[1] Batch[405] Speed: 1.251373980050475 samples/sec                   batch loss = 1105.3489217758179 | accuracy = 0.5641975308641975


Epoch[1] Batch[410] Speed: 1.24902657659378 samples/sec                   batch loss = 1118.928053855896 | accuracy = 0.5634146341463414


Epoch[1] Batch[415] Speed: 1.2526054672086564 samples/sec                   batch loss = 1131.6489081382751 | accuracy = 0.5650602409638554


Epoch[1] Batch[420] Speed: 1.2500444070378895 samples/sec                   batch loss = 1145.1578669548035 | accuracy = 0.5666666666666667


Epoch[1] Batch[425] Speed: 1.2545762429226757 samples/sec                   batch loss = 1158.0436615943909 | accuracy = 0.5682352941176471


Epoch[1] Batch[430] Speed: 1.2503287315389406 samples/sec                   batch loss = 1171.0098948478699 | accuracy = 0.5697674418604651


Epoch[1] Batch[435] Speed: 1.244196414236058 samples/sec                   batch loss = 1182.9653301239014 | accuracy = 0.5729885057471265


Epoch[1] Batch[440] Speed: 1.255333414791404 samples/sec                   batch loss = 1195.6903719902039 | accuracy = 0.5738636363636364


Epoch[1] Batch[445] Speed: 1.251921735191778 samples/sec                   batch loss = 1208.8250592947006 | accuracy = 0.5747191011235955


Epoch[1] Batch[450] Speed: 1.2508162594924601 samples/sec                   batch loss = 1221.5659011602402 | accuracy = 0.5755555555555556


Epoch[1] Batch[455] Speed: 1.2468900426889165 samples/sec                   batch loss = 1233.9264379739761 | accuracy = 0.5763736263736263


Epoch[1] Batch[460] Speed: 1.2343204802671284 samples/sec                   batch loss = 1246.8918360471725 | accuracy = 0.5782608695652174


Epoch[1] Batch[465] Speed: 1.2503376770028174 samples/sec                   batch loss = 1259.1083275079727 | accuracy = 0.582258064516129


Epoch[1] Batch[470] Speed: 1.247752464329591 samples/sec                   batch loss = 1272.4113155603409 | accuracy = 0.5829787234042553


Epoch[1] Batch[475] Speed: 1.250908867500282 samples/sec                   batch loss = 1286.7510949373245 | accuracy = 0.5815789473684211


Epoch[1] Batch[480] Speed: 1.2537156081735197 samples/sec                   batch loss = 1299.5146080255508 | accuracy = 0.58125


Epoch[1] Batch[485] Speed: 1.2540076040750499 samples/sec                   batch loss = 1313.8385220766068 | accuracy = 0.5804123711340207


Epoch[1] Batch[490] Speed: 1.2584031664217323 samples/sec                   batch loss = 1327.4630984067917 | accuracy = 0.5795918367346938


Epoch[1] Batch[495] Speed: 1.2545424702809915 samples/sec                   batch loss = 1341.2654625177383 | accuracy = 0.5797979797979798


Epoch[1] Batch[500] Speed: 1.2552970654395137 samples/sec                   batch loss = 1353.7294090986252 | accuracy = 0.582


Epoch[1] Batch[505] Speed: 1.2552608121104436 samples/sec                   batch loss = 1366.5751193761826 | accuracy = 0.5831683168316831


Epoch[1] Batch[510] Speed: 1.2580726098840327 samples/sec                   batch loss = 1378.7611292600632 | accuracy = 0.5843137254901961


Epoch[1] Batch[515] Speed: 1.2515729130493651 samples/sec                   batch loss = 1391.5201658010483 | accuracy = 0.5849514563106796


Epoch[1] Batch[520] Speed: 1.2528424004739787 samples/sec                   batch loss = 1404.492494225502 | accuracy = 0.5850961538461539


Epoch[1] Batch[525] Speed: 1.2482959383843641 samples/sec                   batch loss = 1418.2357298135757 | accuracy = 0.5842857142857143


Epoch[1] Batch[530] Speed: 1.2559980301964182 samples/sec                   batch loss = 1430.4304233789444 | accuracy = 0.5849056603773585


Epoch[1] Batch[535] Speed: 1.252184764472952 samples/sec                   batch loss = 1445.0192297697067 | accuracy = 0.5841121495327103


Epoch[1] Batch[540] Speed: 1.2502526070937452 samples/sec                   batch loss = 1457.6668692827225 | accuracy = 0.5856481481481481


Epoch[1] Batch[545] Speed: 1.251256759623075 samples/sec                   batch loss = 1471.127982020378 | accuracy = 0.5853211009174312


Epoch[1] Batch[550] Speed: 1.2525771310148797 samples/sec                   batch loss = 1484.266247868538 | accuracy = 0.5845454545454546


Epoch[1] Batch[555] Speed: 1.2264104159551623 samples/sec                   batch loss = 1497.05249106884 | accuracy = 0.5842342342342343


Epoch[1] Batch[560] Speed: 1.214291587196261 samples/sec                   batch loss = 1510.211512207985 | accuracy = 0.5848214285714286


Epoch[1] Batch[565] Speed: 1.2179877420190415 samples/sec                   batch loss = 1524.5663489103317 | accuracy = 0.5831858407079646


Epoch[1] Batch[570] Speed: 1.2181520543789668 samples/sec                   batch loss = 1536.0978437662125 | accuracy = 0.5850877192982457


Epoch[1] Batch[575] Speed: 1.2206299743195914 samples/sec                   batch loss = 1548.7763994932175 | accuracy = 0.5856521739130435


Epoch[1] Batch[580] Speed: 1.2453692486757406 samples/sec                   batch loss = 1561.839407324791 | accuracy = 0.5862068965517241


Epoch[1] Batch[585] Speed: 1.2483842720935203 samples/sec                   batch loss = 1573.790396809578 | accuracy = 0.588034188034188


Epoch[1] Batch[590] Speed: 1.2560962973300456 samples/sec                   batch loss = 1586.862920641899 | accuracy = 0.589406779661017


Epoch[1] Batch[595] Speed: 1.2518863304532932 samples/sec                   batch loss = 1598.3418048620224 | accuracy = 0.5911764705882353


Epoch[1] Batch[600] Speed: 1.2311394354413747 samples/sec                   batch loss = 1609.858589053154 | accuracy = 0.5929166666666666


Epoch[1] Batch[605] Speed: 1.215433436949686 samples/sec                   batch loss = 1622.2940685749054 | accuracy = 0.5925619834710744


Epoch[1] Batch[610] Speed: 1.2082617624320195 samples/sec                   batch loss = 1633.3728880882263 | accuracy = 0.5938524590163935


Epoch[1] Batch[615] Speed: 1.2102151315988319 samples/sec                   batch loss = 1646.9514558315277 | accuracy = 0.5939024390243902


Epoch[1] Batch[620] Speed: 1.218893071608047 samples/sec                   batch loss = 1657.781082034111 | accuracy = 0.5951612903225807


Epoch[1] Batch[625] Speed: 1.255456003706829 samples/sec                   batch loss = 1668.6084742546082 | accuracy = 0.5964


Epoch[1] Batch[630] Speed: 1.2482244260321043 samples/sec                   batch loss = 1681.333462357521 | accuracy = 0.5976190476190476


Epoch[1] Batch[635] Speed: 1.2489301562463941 samples/sec                   batch loss = 1693.8107969760895 | accuracy = 0.5976377952755906


Epoch[1] Batch[640] Speed: 1.252188783184031 samples/sec                   batch loss = 1710.7874524593353 | accuracy = 0.59609375


Epoch[1] Batch[645] Speed: 1.2504912607610033 samples/sec                   batch loss = 1725.0930216312408 | accuracy = 0.5957364341085272


Epoch[1] Batch[650] Speed: 1.2522512167064597 samples/sec                   batch loss = 1737.679137468338 | accuracy = 0.5961538461538461


Epoch[1] Batch[655] Speed: 1.2539975750030683 samples/sec                   batch loss = 1750.6588838100433 | accuracy = 0.5958015267175573


Epoch[1] Batch[660] Speed: 1.2496360333873542 samples/sec                   batch loss = 1762.5094857215881 | accuracy = 0.5962121212121212


Epoch[1] Batch[665] Speed: 1.249068794237904 samples/sec                   batch loss = 1777.288006901741 | accuracy = 0.5962406015037593


Epoch[1] Batch[670] Speed: 1.2496457135774712 samples/sec                   batch loss = 1789.3809254169464 | accuracy = 0.5966417910447761


Epoch[1] Batch[675] Speed: 1.2605944320868658 samples/sec                   batch loss = 1802.6599478721619 | accuracy = 0.5966666666666667


Epoch[1] Batch[680] Speed: 1.2488170184689598 samples/sec                   batch loss = 1814.070057272911 | accuracy = 0.5981617647058823


Epoch[1] Batch[685] Speed: 1.255139951502262 samples/sec                   batch loss = 1826.6273602247238 | accuracy = 0.5989051094890511


Epoch[1] Batch[690] Speed: 1.2519040792803178 samples/sec                   batch loss = 1837.7135635614395 | accuracy = 0.5996376811594203


Epoch[1] Batch[695] Speed: 1.2572712387075984 samples/sec                   batch loss = 1851.8250099420547 | accuracy = 0.5996402877697842


Epoch[1] Batch[700] Speed: 1.252100284053021 samples/sec                   batch loss = 1865.219383597374 | accuracy = 0.5996428571428571


Epoch[1] Batch[705] Speed: 1.251421210410151 samples/sec                   batch loss = 1877.9261893033981 | accuracy = 0.5989361702127659


Epoch[1] Batch[710] Speed: 1.2538679613394816 samples/sec                   batch loss = 1891.4956070184708 | accuracy = 0.5985915492957746


Epoch[1] Batch[715] Speed: 1.2529447594170913 samples/sec                   batch loss = 1903.69000518322 | accuracy = 0.5996503496503497


Epoch[1] Batch[720] Speed: 1.2572979974469167 samples/sec                   batch loss = 1916.8012419939041 | accuracy = 0.5993055555555555


Epoch[1] Batch[725] Speed: 1.24797261919205 samples/sec                   batch loss = 1929.4809342622757 | accuracy = 0.6006896551724138


Epoch[1] Batch[730] Speed: 1.2539327179622708 samples/sec                   batch loss = 1941.3849110603333 | accuracy = 0.6013698630136987


Epoch[1] Batch[735] Speed: 1.2508449823622467 samples/sec                   batch loss = 1954.705255150795 | accuracy = 0.6006802721088436


Epoch[1] Batch[740] Speed: 1.2502535387934661 samples/sec                   batch loss = 1966.6822618246078 | accuracy = 0.6013513513513513


Epoch[1] Batch[745] Speed: 1.2531241618154823 samples/sec                   batch loss = 1978.614158987999 | accuracy = 0.602013422818792


Epoch[1] Batch[750] Speed: 1.2494017068557215 samples/sec                   batch loss = 1991.2151011228561 | accuracy = 0.6026666666666667


Epoch[1] Batch[755] Speed: 1.2471860990755688 samples/sec                   batch loss = 2004.8989924192429 | accuracy = 0.6023178807947019


Epoch[1] Batch[760] Speed: 1.2511165159497217 samples/sec                   batch loss = 2016.4831142425537 | accuracy = 0.6032894736842105


Epoch[1] Batch[765] Speed: 1.2568995608062752 samples/sec                   batch loss = 2029.5760550498962 | accuracy = 0.6042483660130719


Epoch[1] Batch[770] Speed: 1.254087748683961 samples/sec                   batch loss = 2041.218596816063 | accuracy = 0.6042207792207792


Epoch[1] Batch[775] Speed: 1.252128411776498 samples/sec                   batch loss = 2055.127261996269 | accuracy = 0.604516129032258


Epoch[1] Batch[780] Speed: 1.2573630146226056 samples/sec                   batch loss = 2067.288051366806 | accuracy = 0.6048076923076923


Epoch[1] Batch[785] Speed: 1.2544614233590914 samples/sec                   batch loss = 2079.885646224022 | accuracy = 0.6050955414012739


[Epoch 1] training: accuracy=0.6046954314720813
[Epoch 1] time cost: 648.80730509758
[Epoch 1] validation: validation accuracy=0.6844444444444444


Epoch[2] Batch[5] Speed: 1.2483473023113436 samples/sec                   batch loss = 12.281871318817139 | accuracy = 0.65


Epoch[2] Batch[10] Speed: 1.2526206177788306 samples/sec                   batch loss = 25.704954147338867 | accuracy = 0.625


Epoch[2] Batch[15] Speed: 1.2517522029607584 samples/sec                   batch loss = 37.29379427433014 | accuracy = 0.65


Epoch[2] Batch[20] Speed: 1.2549867258268934 samples/sec                   batch loss = 48.42569363117218 | accuracy = 0.65


Epoch[2] Batch[25] Speed: 1.2549360343918508 samples/sec                   batch loss = 61.44555485248566 | accuracy = 0.65


Epoch[2] Batch[30] Speed: 1.2473545821270968 samples/sec                   batch loss = 72.14350724220276 | accuracy = 0.6583333333333333


Epoch[2] Batch[35] Speed: 1.242155629427626 samples/sec                   batch loss = 85.27463066577911 | accuracy = 0.6428571428571429


Epoch[2] Batch[40] Speed: 1.248753067972691 samples/sec                   batch loss = 96.83810985088348 | accuracy = 0.65


Epoch[2] Batch[45] Speed: 1.2473007961300198 samples/sec                   batch loss = 109.0670028924942 | accuracy = 0.65


Epoch[2] Batch[50] Speed: 1.2555845364148506 samples/sec                   batch loss = 121.5298523902893 | accuracy = 0.65


Epoch[2] Batch[55] Speed: 1.2531893096716629 samples/sec                   batch loss = 134.47120583057404 | accuracy = 0.6545454545454545


Epoch[2] Batch[60] Speed: 1.2464159409620852 samples/sec                   batch loss = 145.2609419822693 | accuracy = 0.6583333333333333


Epoch[2] Batch[65] Speed: 1.249781065788725 samples/sec                   batch loss = 156.7414983510971 | accuracy = 0.6615384615384615


Epoch[2] Batch[70] Speed: 1.2476652406367004 samples/sec                   batch loss = 169.1893824338913 | accuracy = 0.6535714285714286


Epoch[2] Batch[75] Speed: 1.2473546748655804 samples/sec                   batch loss = 182.7602959871292 | accuracy = 0.6433333333333333


Epoch[2] Batch[80] Speed: 1.2469515782166452 samples/sec                   batch loss = 195.32419753074646 | accuracy = 0.65


Epoch[2] Batch[85] Speed: 1.252640257952319 samples/sec                   batch loss = 206.52678549289703 | accuracy = 0.6529411764705882


Epoch[2] Batch[90] Speed: 1.2520297366853497 samples/sec                   batch loss = 221.69937765598297 | accuracy = 0.6444444444444445


Epoch[2] Batch[95] Speed: 1.2522570117503293 samples/sec                   batch loss = 234.09408974647522 | accuracy = 0.6447368421052632


Epoch[2] Batch[100] Speed: 1.2469646460307093 samples/sec                   batch loss = 245.1553064584732 | accuracy = 0.6475


Epoch[2] Batch[105] Speed: 1.2509803146685377 samples/sec                   batch loss = 258.765438079834 | accuracy = 0.6476190476190476


Epoch[2] Batch[110] Speed: 1.252197287983233 samples/sec                   batch loss = 276.48085618019104 | accuracy = 0.6386363636363637


Epoch[2] Batch[115] Speed: 1.2526117331407665 samples/sec                   batch loss = 290.4401683807373 | accuracy = 0.6369565217391304


Epoch[2] Batch[120] Speed: 1.2497290252589204 samples/sec                   batch loss = 302.5786589384079 | accuracy = 0.6395833333333333


Epoch[2] Batch[125] Speed: 1.2487510231520982 samples/sec                   batch loss = 314.45017743110657 | accuracy = 0.64


Epoch[2] Batch[130] Speed: 1.2541312466548458 samples/sec                   batch loss = 327.56102669239044 | accuracy = 0.6403846153846153


Epoch[2] Batch[135] Speed: 1.2494340867027727 samples/sec                   batch loss = 341.702388882637 | accuracy = 0.6444444444444445


Epoch[2] Batch[140] Speed: 1.2532036318758553 samples/sec                   batch loss = 352.8643938302994 | accuracy = 0.6464285714285715


Epoch[2] Batch[145] Speed: 1.2546107679469585 samples/sec                   batch loss = 363.7825765609741 | accuracy = 0.6517241379310345


Epoch[2] Batch[150] Speed: 1.251781996208814 samples/sec                   batch loss = 375.5315318107605 | accuracy = 0.6533333333333333


Epoch[2] Batch[155] Speed: 1.2513822871083178 samples/sec                   batch loss = 385.4037481546402 | accuracy = 0.6580645161290323


Epoch[2] Batch[160] Speed: 1.2499746499136308 samples/sec                   batch loss = 396.9465663433075 | accuracy = 0.6609375


Epoch[2] Batch[165] Speed: 1.2472357954517999 samples/sec                   batch loss = 408.7002999782562 | accuracy = 0.6651515151515152


Epoch[2] Batch[170] Speed: 1.2517995552762706 samples/sec                   batch loss = 420.8730182647705 | accuracy = 0.6705882352941176


Epoch[2] Batch[175] Speed: 1.2567883638941644 samples/sec                   batch loss = 434.17088747024536 | accuracy = 0.6671428571428571


Epoch[2] Batch[180] Speed: 1.2486116193585373 samples/sec                   batch loss = 448.2304787635803 | accuracy = 0.6666666666666666


Epoch[2] Batch[185] Speed: 1.24733297443636 samples/sec                   batch loss = 459.5407415628433 | accuracy = 0.6675675675675675


Epoch[2] Batch[190] Speed: 1.2480720484707242 samples/sec                   batch loss = 471.72054278850555 | accuracy = 0.6657894736842105


Epoch[2] Batch[195] Speed: 1.2468080356512503 samples/sec                   batch loss = 483.7717332839966 | accuracy = 0.6666666666666666


Epoch[2] Batch[200] Speed: 1.2439914252671966 samples/sec                   batch loss = 497.2259598970413 | accuracy = 0.6625


Epoch[2] Batch[205] Speed: 1.2493523029093367 samples/sec                   batch loss = 506.38860976696014 | accuracy = 0.6670731707317074


Epoch[2] Batch[210] Speed: 1.2493055076708024 samples/sec                   batch loss = 520.5625227689743 | accuracy = 0.6666666666666666


Epoch[2] Batch[215] Speed: 1.2526262291941253 samples/sec                   batch loss = 531.3146367073059 | accuracy = 0.6674418604651163


Epoch[2] Batch[220] Speed: 1.2508703491103852 samples/sec                   batch loss = 543.0205117464066 | accuracy = 0.6681818181818182


Epoch[2] Batch[225] Speed: 1.2537022111115579 samples/sec                   batch loss = 556.8971780538559 | accuracy = 0.6666666666666666


Epoch[2] Batch[230] Speed: 1.2467490157928673 samples/sec                   batch loss = 568.8418165445328 | accuracy = 0.6652173913043479


Epoch[2] Batch[235] Speed: 1.2442199433941723 samples/sec                   batch loss = 580.9887729883194 | accuracy = 0.6648936170212766


Epoch[2] Batch[240] Speed: 1.2469170100716778 samples/sec                   batch loss = 592.7745358943939 | accuracy = 0.665625


Epoch[2] Batch[245] Speed: 1.254417808690326 samples/sec                   batch loss = 605.6110793352127 | accuracy = 0.6642857142857143


Epoch[2] Batch[250] Speed: 1.2518978204259756 samples/sec                   batch loss = 617.3188034296036 | accuracy = 0.666


Epoch[2] Batch[255] Speed: 1.2506157956771904 samples/sec                   batch loss = 629.8008249998093 | accuracy = 0.6666666666666666


Epoch[2] Batch[260] Speed: 1.2470228521038194 samples/sec                   batch loss = 640.045761346817 | accuracy = 0.6653846153846154


Epoch[2] Batch[265] Speed: 1.2466283065466872 samples/sec                   batch loss = 651.6331702470779 | accuracy = 0.6660377358490566


Epoch[2] Batch[270] Speed: 1.2485362613365514 samples/sec                   batch loss = 663.7133718729019 | accuracy = 0.6657407407407407


Epoch[2] Batch[275] Speed: 1.2487311329739001 samples/sec                   batch loss = 673.9964796304703 | accuracy = 0.6672727272727272


Epoch[2] Batch[280] Speed: 1.2495778623265503 samples/sec                   batch loss = 683.6613472700119 | accuracy = 0.6714285714285714


Epoch[2] Batch[285] Speed: 1.2535647906524439 samples/sec                   batch loss = 697.4998527765274 | accuracy = 0.6719298245614035


Epoch[2] Batch[290] Speed: 1.2487241622669996 samples/sec                   batch loss = 711.3814591169357 | accuracy = 0.6706896551724137


Epoch[2] Batch[295] Speed: 1.2469836458234465 samples/sec                   batch loss = 722.0947587490082 | accuracy = 0.6711864406779661


Epoch[2] Batch[300] Speed: 1.2506432973739812 samples/sec                   batch loss = 733.8418478965759 | accuracy = 0.6716666666666666


Epoch[2] Batch[305] Speed: 1.2516324838948671 samples/sec                   batch loss = 747.031321644783 | accuracy = 0.6704918032786885


Epoch[2] Batch[310] Speed: 1.247185450081928 samples/sec                   batch loss = 756.7489227056503 | accuracy = 0.6717741935483871


Epoch[2] Batch[315] Speed: 1.251269917828342 samples/sec                   batch loss = 767.9921094179153 | accuracy = 0.6722222222222223


Epoch[2] Batch[320] Speed: 1.24886684490661 samples/sec                   batch loss = 779.8130711317062 | accuracy = 0.671875


Epoch[2] Batch[325] Speed: 1.245620930735696 samples/sec                   batch loss = 791.2806195020676 | accuracy = 0.6707692307692308


Epoch[2] Batch[330] Speed: 1.2516763720011577 samples/sec                   batch loss = 803.1040921211243 | accuracy = 0.6696969696969697


Epoch[2] Batch[335] Speed: 1.2479664924157834 samples/sec                   batch loss = 814.2890578508377 | accuracy = 0.6694029850746268


Epoch[2] Batch[340] Speed: 1.2541141845910917 samples/sec                   batch loss = 824.7075085639954 | accuracy = 0.6720588235294118


Epoch[2] Batch[345] Speed: 1.2464286271482732 samples/sec                   batch loss = 836.1748439073563 | accuracy = 0.6739130434782609


Epoch[2] Batch[350] Speed: 1.2466745308868796 samples/sec                   batch loss = 848.1487102508545 | accuracy = 0.6742857142857143


Epoch[2] Batch[355] Speed: 1.2499676653166565 samples/sec                   batch loss = 860.8038605451584 | accuracy = 0.6746478873239437


Epoch[2] Batch[360] Speed: 1.2549838156522302 samples/sec                   batch loss = 871.4559568166733 | accuracy = 0.6756944444444445


Epoch[2] Batch[365] Speed: 1.2452674766998737 samples/sec                   batch loss = 882.6263190507889 | accuracy = 0.6767123287671233


Epoch[2] Batch[370] Speed: 1.2508292219229658 samples/sec                   batch loss = 892.4347846508026 | accuracy = 0.6783783783783783


Epoch[2] Batch[375] Speed: 1.2521061711463881 samples/sec                   batch loss = 901.9483821988106 | accuracy = 0.6793333333333333


Epoch[2] Batch[380] Speed: 1.2480065032077154 samples/sec                   batch loss = 915.6664300560951 | accuracy = 0.6789473684210526


Epoch[2] Batch[385] Speed: 1.248526691239983 samples/sec                   batch loss = 928.5046721100807 | accuracy = 0.6792207792207792


Epoch[2] Batch[390] Speed: 1.2505157742815665 samples/sec                   batch loss = 940.8451567292213 | accuracy = 0.6788461538461539


Epoch[2] Batch[395] Speed: 1.2496452481802838 samples/sec                   batch loss = 951.6690428853035 | accuracy = 0.6791139240506329


Epoch[2] Batch[400] Speed: 1.2486309481591509 samples/sec                   batch loss = 962.8384553790092 | accuracy = 0.67875


Epoch[2] Batch[405] Speed: 1.2458738247633487 samples/sec                   batch loss = 974.7823620438576 | accuracy = 0.6771604938271605


Epoch[2] Batch[410] Speed: 1.2511428267380031 samples/sec                   batch loss = 986.9229391217232 | accuracy = 0.676219512195122


Epoch[2] Batch[415] Speed: 1.2542039062450356 samples/sec                   batch loss = 1000.0678740143776 | accuracy = 0.6753012048192771


Epoch[2] Batch[420] Speed: 1.253435079144624 samples/sec                   batch loss = 1009.647445499897 | accuracy = 0.6779761904761905


Epoch[2] Batch[425] Speed: 1.2521441115205179 samples/sec                   batch loss = 1018.6605795025826 | accuracy = 0.68


Epoch[2] Batch[430] Speed: 1.250797981989397 samples/sec                   batch loss = 1030.6181141734123 | accuracy = 0.6802325581395349


Epoch[2] Batch[435] Speed: 1.251089366615546 samples/sec                   batch loss = 1041.0412629246712 | accuracy = 0.6810344827586207


Epoch[2] Batch[440] Speed: 1.246801086398108 samples/sec                   batch loss = 1050.5738497376442 | accuracy = 0.6818181818181818


Epoch[2] Batch[445] Speed: 1.2481654577242123 samples/sec                   batch loss = 1063.7317778468132 | accuracy = 0.6808988764044944


Epoch[2] Batch[450] Speed: 1.2532723455649506 samples/sec                   batch loss = 1074.6502113938332 | accuracy = 0.6811111111111111


Epoch[2] Batch[455] Speed: 1.2513055676898968 samples/sec                   batch loss = 1085.7015689015388 | accuracy = 0.6818681318681319


Epoch[2] Batch[460] Speed: 1.2467598557561188 samples/sec                   batch loss = 1097.2849410176277 | accuracy = 0.6820652173913043


Epoch[2] Batch[465] Speed: 1.2489121197630038 samples/sec                   batch loss = 1113.207691013813 | accuracy = 0.6790322580645162


Epoch[2] Batch[470] Speed: 1.2514691911452291 samples/sec                   batch loss = 1123.4788845181465 | accuracy = 0.6792553191489362


Epoch[2] Batch[475] Speed: 1.2471777549230996 samples/sec                   batch loss = 1132.0456775426865 | accuracy = 0.6810526315789474


Epoch[2] Batch[480] Speed: 1.2520974806946845 samples/sec                   batch loss = 1142.9418977499008 | accuracy = 0.6822916666666666


Epoch[2] Batch[485] Speed: 1.2496615372881033 samples/sec                   batch loss = 1156.869723558426 | accuracy = 0.6809278350515464


Epoch[2] Batch[490] Speed: 1.2528629831958564 samples/sec                   batch loss = 1168.7230341434479 | accuracy = 0.6811224489795918


Epoch[2] Batch[495] Speed: 1.2507478148538351 samples/sec                   batch loss = 1178.9449802041054 | accuracy = 0.6818181818181818


Epoch[2] Batch[500] Speed: 1.248881812206384 samples/sec                   batch loss = 1189.679171025753 | accuracy = 0.682


Epoch[2] Batch[505] Speed: 1.2502802791672913 samples/sec                   batch loss = 1200.053940474987 | accuracy = 0.6816831683168317


Epoch[2] Batch[510] Speed: 1.249019602588029 samples/sec                   batch loss = 1211.2340922951698 | accuracy = 0.6808823529411765


Epoch[2] Batch[515] Speed: 1.2527959982729753 samples/sec                   batch loss = 1221.215616762638 | accuracy = 0.6825242718446602


Epoch[2] Batch[520] Speed: 1.2578173801928814 samples/sec                   batch loss = 1233.5464238524437 | accuracy = 0.6826923076923077


Epoch[2] Batch[525] Speed: 1.2539005731007447 samples/sec                   batch loss = 1248.5659573674202 | accuracy = 0.680952380952381


Epoch[2] Batch[530] Speed: 1.2504930316675438 samples/sec                   batch loss = 1258.5323441624641 | accuracy = 0.6820754716981132


Epoch[2] Batch[535] Speed: 1.2504221061228529 samples/sec                   batch loss = 1272.8759794831276 | accuracy = 0.6813084112149532


Epoch[2] Batch[540] Speed: 1.2531404481445028 samples/sec                   batch loss = 1283.7542089819908 | accuracy = 0.6810185185185185


Epoch[2] Batch[545] Speed: 1.251589345819673 samples/sec                   batch loss = 1294.5754824280739 | accuracy = 0.6821100917431193


Epoch[2] Batch[550] Speed: 1.249896520002706 samples/sec                   batch loss = 1304.1976550221443 | accuracy = 0.6831818181818182


Epoch[2] Batch[555] Speed: 1.2520068455393287 samples/sec                   batch loss = 1314.1165803074837 | accuracy = 0.6837837837837838


Epoch[2] Batch[560] Speed: 1.2518329935860895 samples/sec                   batch loss = 1324.5265416502953 | accuracy = 0.6839285714285714


Epoch[2] Batch[565] Speed: 1.2508855510002743 samples/sec                   batch loss = 1338.156675040722 | accuracy = 0.6823008849557523


Epoch[2] Batch[570] Speed: 1.254154403055382 samples/sec                   batch loss = 1348.7517609000206 | accuracy = 0.6833333333333333


Epoch[2] Batch[575] Speed: 1.2542133760504472 samples/sec                   batch loss = 1357.7552203536034 | accuracy = 0.6847826086956522


Epoch[2] Batch[580] Speed: 1.2530453569393074 samples/sec                   batch loss = 1365.04117333889 | accuracy = 0.6862068965517242


Epoch[2] Batch[585] Speed: 1.2491766760712966 samples/sec                   batch loss = 1374.5221724510193 | accuracy = 0.6867521367521368


Epoch[2] Batch[590] Speed: 1.2570018303724955 samples/sec                   batch loss = 1386.7037609815598 | accuracy = 0.6864406779661016


Epoch[2] Batch[595] Speed: 1.2520641216939612 samples/sec                   batch loss = 1401.3695378303528 | accuracy = 0.6861344537815126


Epoch[2] Batch[600] Speed: 1.2559149147876032 samples/sec                   batch loss = 1412.5948008298874 | accuracy = 0.6854166666666667


Epoch[2] Batch[605] Speed: 1.2584834018775517 samples/sec                   batch loss = 1425.2094634771347 | accuracy = 0.684297520661157


Epoch[2] Batch[610] Speed: 1.2593008759598228 samples/sec                   batch loss = 1438.5057235956192 | accuracy = 0.684016393442623


Epoch[2] Batch[615] Speed: 1.2538582156251128 samples/sec                   batch loss = 1446.9430470466614 | accuracy = 0.6853658536585366


Epoch[2] Batch[620] Speed: 1.2468050706271014 samples/sec                   batch loss = 1461.7723367214203 | accuracy = 0.6838709677419355


Epoch[2] Batch[625] Speed: 1.2522550489068829 samples/sec                   batch loss = 1477.6485973596573 | accuracy = 0.6836


Epoch[2] Batch[630] Speed: 1.2475161534791697 samples/sec                   batch loss = 1485.5427463054657 | accuracy = 0.6853174603174603


Epoch[2] Batch[635] Speed: 1.252111310717605 samples/sec                   batch loss = 1495.9496289491653 | accuracy = 0.6854330708661417


Epoch[2] Batch[640] Speed: 1.250930785802875 samples/sec                   batch loss = 1507.1966267824173 | accuracy = 0.6859375


Epoch[2] Batch[645] Speed: 1.2481012953915027 samples/sec                   batch loss = 1521.6999105215073 | accuracy = 0.6841085271317829


Epoch[2] Batch[650] Speed: 1.2579550744041192 samples/sec                   batch loss = 1533.1559020280838 | accuracy = 0.6846153846153846


Epoch[2] Batch[655] Speed: 1.249000726669794 samples/sec                   batch loss = 1546.9778147935867 | accuracy = 0.683969465648855


Epoch[2] Batch[660] Speed: 1.2482671467117248 samples/sec                   batch loss = 1557.3223741054535 | accuracy = 0.6852272727272727


Epoch[2] Batch[665] Speed: 1.2438268925449563 samples/sec                   batch loss = 1570.456768810749 | accuracy = 0.6853383458646617


Epoch[2] Batch[670] Speed: 1.243414184477689 samples/sec                   batch loss = 1582.5847592949867 | accuracy = 0.6843283582089552


Epoch[2] Batch[675] Speed: 1.2522191579973918 samples/sec                   batch loss = 1593.6519933342934 | accuracy = 0.6840740740740741


Epoch[2] Batch[680] Speed: 1.2507093063815173 samples/sec                   batch loss = 1606.7494103312492 | accuracy = 0.6823529411764706


Epoch[2] Batch[685] Speed: 1.2517112045399743 samples/sec                   batch loss = 1616.4662920832634 | accuracy = 0.6828467153284672


Epoch[2] Batch[690] Speed: 1.2545100127333706 samples/sec                   batch loss = 1628.0814017653465 | accuracy = 0.6840579710144927


Epoch[2] Batch[695] Speed: 1.252641847898072 samples/sec                   batch loss = 1640.9310552477837 | accuracy = 0.6827338129496403


Epoch[2] Batch[700] Speed: 1.2496141604331383 samples/sec                   batch loss = 1652.317817389965 | accuracy = 0.6825


Epoch[2] Batch[705] Speed: 1.2500199119938302 samples/sec                   batch loss = 1662.115803182125 | accuracy = 0.6829787234042554


Epoch[2] Batch[710] Speed: 1.2473387240492093 samples/sec                   batch loss = 1676.7224188446999 | accuracy = 0.6823943661971831


Epoch[2] Batch[715] Speed: 1.2531458770149175 samples/sec                   batch loss = 1691.6287450194359 | accuracy = 0.6811188811188811


Epoch[2] Batch[720] Speed: 1.2492540648634263 samples/sec                   batch loss = 1702.1873360276222 | accuracy = 0.6819444444444445


Epoch[2] Batch[725] Speed: 1.245016029829538 samples/sec                   batch loss = 1713.287182867527 | accuracy = 0.6817241379310345


Epoch[2] Batch[730] Speed: 1.2486363380275611 samples/sec                   batch loss = 1723.076854288578 | accuracy = 0.6825342465753425


Epoch[2] Batch[735] Speed: 1.2465793995542775 samples/sec                   batch loss = 1733.622380912304 | accuracy = 0.6826530612244898


Epoch[2] Batch[740] Speed: 1.2500082701991793 samples/sec                   batch loss = 1742.462296307087 | accuracy = 0.683445945945946


Epoch[2] Batch[745] Speed: 1.2501422105079147 samples/sec                   batch loss = 1754.1187925934792 | accuracy = 0.6838926174496645


Epoch[2] Batch[750] Speed: 1.249878176254766 samples/sec                   batch loss = 1763.7725608944893 | accuracy = 0.6843333333333333


Epoch[2] Batch[755] Speed: 1.246602463169156 samples/sec                   batch loss = 1775.04855042696 | accuracy = 0.6847682119205298


Epoch[2] Batch[760] Speed: 1.246817208784014 samples/sec                   batch loss = 1786.094969689846 | accuracy = 0.6845394736842105


Epoch[2] Batch[765] Speed: 1.2510301273138111 samples/sec                   batch loss = 1799.344588458538 | accuracy = 0.6833333333333333


Epoch[2] Batch[770] Speed: 1.2526733670675463 samples/sec                   batch loss = 1809.6968007683754 | accuracy = 0.6834415584415584


Epoch[2] Batch[775] Speed: 1.2437969235012096 samples/sec                   batch loss = 1823.4919473528862 | accuracy = 0.6835483870967742


Epoch[2] Batch[780] Speed: 1.2510622184596303 samples/sec                   batch loss = 1833.7537632584572 | accuracy = 0.6836538461538462


Epoch[2] Batch[785] Speed: 1.2490126286601566 samples/sec                   batch loss = 1841.162395298481 | accuracy = 0.6847133757961783


[Epoch 2] training: accuracy=0.6846446700507615
[Epoch 2] time cost: 646.846382856369
[Epoch 2] validation: validation accuracy=0.7066666666666667
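
The per-epoch summary lines in the output above can be collected programmatically, for example to compare epochs side by side. The following is a minimal sketch, assuming the log text has been captured into a string. The helper name `parse_epoch_summaries` and the `sample_log` variable are hypothetical and are not part of MXNet.

```python
import re

# Matches summary lines such as "[Epoch 1] training: accuracy=0.60"
# or "[Epoch 1] time cost: 648.8". Per-batch lines do not match.
SUMMARY_LINE = re.compile(r"^\[Epoch (\d+)\] ([a-z ]+): (?:[a-z ]+=)?([0-9.]+)$")


def parse_epoch_summaries(log_text):
    """Return {epoch: {metric_name: value}} built from the summary lines."""
    summaries = {}
    for line in log_text.splitlines():
        match = SUMMARY_LINE.match(line.strip())
        if match is None:
            continue  # skip per-batch lines and blank lines
        epoch, name, value = match.groups()
        summaries.setdefault(int(epoch), {})[name] = float(value)
    return summaries


# Summary lines copied from the training output above.
sample_log = """\
[Epoch 1] training: accuracy=0.6046954314720813
[Epoch 1] time cost: 648.80730509758
[Epoch 1] validation: validation accuracy=0.6844444444444444
[Epoch 2] training: accuracy=0.6846446700507615
[Epoch 2] time cost: 646.846382856369
[Epoch 2] validation: validation accuracy=0.7066666666666667
"""

summaries = parse_epoch_summaries(sample_log)
for epoch, metrics in sorted(summaries.items()):
    print(epoch, metrics)
```

The regular expression is tied to the exact line format shown in this output, so it would need adjusting if the logging statements in the training loop change.
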


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen on studying more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).