<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and inference on GPUs are often faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. Additionally, you will need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[09:37:06] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet shows which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you only have one GPU, the code falls back to `npx.cpu()` and copies `x` to main memory instead.

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[09:37:06] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through zero-based indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
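
The zero-based numbering described above can be sketched in plain Python. The helper `select_device_ids` is hypothetical (it is not part of MXNet); it only computes which GPU ids you would ask for, capped at the number of GPUs that are actually present.

```python
def select_device_ids(num_available, num_requested):
    """Return the zero-based ids of the first `num_requested` devices.

    Hypothetical helper for illustration only; it is not an MXNet API.
    The result is capped at `num_available`, so asking for two GPUs on
    a one-GPU machine yields just [0].
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("device counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, the first two ids are 0 and 1,
# and the last valid id is 7.
ids = select_device_ids(num_available=8, num_requested=2)
print(ids)  # [0, 1]
```

With MXNet installed, each id `i` in the returned list would correspond to the device `npx.gpu(i)`, and `npx.num_gpus()` would supply `num_available`.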

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the input of an operation is already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[09:37:06] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[3.6369002 , 0.73093164]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, assuming there are *n* GPUs, you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
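
To make the idea concrete, here is a minimal sketch in plain NumPy rather than MXNet. It assumes a toy linear model with a summed squared-error loss, and the helpers `split_batch` and `squared_error_grad` are hypothetical names introduced only for this illustration. It shows that gradients computed on *n* equal parts of a batch add up to the gradient of the whole batch, which is why the per-device backward passes can be combined into a single update.

```python
import numpy as np


def split_batch(batch, num_parts):
    """Split a batch along axis 0 into `num_parts` equally sized chunks."""
    if batch.shape[0] % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    return np.split(batch, num_parts, axis=0)


def squared_error_grad(w, X, y):
    """Gradient of sum((X @ w - y) ** 2) with respect to w."""
    return 2.0 * X.T @ (X @ w - y)


rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))   # one batch of 8 samples with 3 features each
y = rng.normal(size=8)        # one target per sample
w = rng.normal(size=3)        # parameters of the toy linear model

n = 2                         # pretend there are two devices
X_parts = split_batch(X, n)
y_parts = split_batch(y, n)

# Each "device" computes the gradient on its own part of the batch.
part_grads = [squared_error_grad(w, Xp, yp) for Xp, yp in zip(X_parts, y_parts)]

# Summing the per-part gradients recovers the full-batch gradient.
combined_grad = sum(part_grads)
full_grad = squared_error_grad(w, X, y)
print(np.allclose(combined_grad, full_grad))  # True
```

In this sketch the parts are processed one after another on the CPU. The point is only the arithmetic: because the loss is a sum over samples, its gradient is the sum of the per-part gradients.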

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.774415563110617 samples/sec                   batch loss = 15.116132497787476 | accuracy = 0.35


Epoch[1] Batch[10] Speed: 1.2534293668361816 samples/sec                   batch loss = 29.7722110748291 | accuracy = 0.325


Epoch[1] Batch[15] Speed: 1.2581235550362915 samples/sec                   batch loss = 43.73451352119446 | accuracy = 0.4


Epoch[1] Batch[20] Speed: 1.2592637293781987 samples/sec                   batch loss = 59.26272130012512 | accuracy = 0.3875


Epoch[1] Batch[25] Speed: 1.2657997341530356 samples/sec                   batch loss = 73.11564564704895 | accuracy = 0.43


Epoch[1] Batch[30] Speed: 1.2625913018911985 samples/sec                   batch loss = 87.26424241065979 | accuracy = 0.45


Epoch[1] Batch[35] Speed: 1.26205800141392 samples/sec                   batch loss = 102.32423305511475 | accuracy = 0.45


Epoch[1] Batch[40] Speed: 1.260718903356248 samples/sec                   batch loss = 115.89332175254822 | accuracy = 0.45625


Epoch[1] Batch[45] Speed: 1.258169314847624 samples/sec                   batch loss = 129.61419367790222 | accuracy = 0.4722222222222222


Epoch[1] Batch[50] Speed: 1.265909761421676 samples/sec                   batch loss = 143.9354112148285 | accuracy = 0.465


Epoch[1] Batch[55] Speed: 1.2619530090463498 samples/sec                   batch loss = 157.85897302627563 | accuracy = 0.4727272727272727


Epoch[1] Batch[60] Speed: 1.263364080544918 samples/sec                   batch loss = 172.04329466819763 | accuracy = 0.4791666666666667


Epoch[1] Batch[65] Speed: 1.258023366825203 samples/sec                   batch loss = 185.74364185333252 | accuracy = 0.4807692307692308


Epoch[1] Batch[70] Speed: 1.2630496448080912 samples/sec                   batch loss = 199.60882902145386 | accuracy = 0.4785714285714286


Epoch[1] Batch[75] Speed: 1.2558460050358948 samples/sec                   batch loss = 213.7389271259308 | accuracy = 0.4766666666666667


Epoch[1] Batch[80] Speed: 1.2590207710613108 samples/sec                   batch loss = 227.39530730247498 | accuracy = 0.48125


Epoch[1] Batch[85] Speed: 1.25996354961941 samples/sec                   batch loss = 241.2197437286377 | accuracy = 0.4823529411764706


Epoch[1] Batch[90] Speed: 1.2560732572441897 samples/sec                   batch loss = 255.33425045013428 | accuracy = 0.475


Epoch[1] Batch[95] Speed: 1.259588481666981 samples/sec                   batch loss = 269.4222221374512 | accuracy = 0.4763157894736842


Epoch[1] Batch[100] Speed: 1.258690078885999 samples/sec                   batch loss = 282.80901741981506 | accuracy = 0.4825


Epoch[1] Batch[105] Speed: 1.2632091257684417 samples/sec                   batch loss = 296.10325598716736 | accuracy = 0.4880952380952381


Epoch[1] Batch[110] Speed: 1.2608528747857022 samples/sec                   batch loss = 309.8442368507385 | accuracy = 0.4863636363636364


Epoch[1] Batch[115] Speed: 1.2592626896835235 samples/sec                   batch loss = 323.93343234062195 | accuracy = 0.48478260869565215


Epoch[1] Batch[120] Speed: 1.2598506746613627 samples/sec                   batch loss = 337.3938944339752 | accuracy = 0.48541666666666666


Epoch[1] Batch[125] Speed: 1.2626014688945122 samples/sec                   batch loss = 351.6073303222656 | accuracy = 0.482


Epoch[1] Batch[130] Speed: 1.2597245776106343 samples/sec                   batch loss = 365.442902803421 | accuracy = 0.49038461538461536


Epoch[1] Batch[135] Speed: 1.2548720188384703 samples/sec                   batch loss = 379.4810001850128 | accuracy = 0.48703703703703705


Epoch[1] Batch[140] Speed: 1.260071996749406 samples/sec                   batch loss = 392.9206476211548 | accuracy = 0.49642857142857144


Epoch[1] Batch[145] Speed: 1.2563552507851838 samples/sec                   batch loss = 406.3527443408966 | accuracy = 0.4982758620689655


Epoch[1] Batch[150] Speed: 1.2548435800061122 samples/sec                   batch loss = 420.0838508605957 | accuracy = 0.5016666666666667


Epoch[1] Batch[155] Speed: 1.2504558436834794 samples/sec                   batch loss = 433.99112343788147 | accuracy = 0.5016129032258064


Epoch[1] Batch[160] Speed: 1.2572038757979194 samples/sec                   batch loss = 447.7388508319855 | accuracy = 0.50625


Epoch[1] Batch[165] Speed: 1.257224507866301 samples/sec                   batch loss = 461.3890812397003 | accuracy = 0.5075757575757576


Epoch[1] Batch[170] Speed: 1.2660731184223832 samples/sec                   batch loss = 475.45163345336914 | accuracy = 0.5044117647058823


Epoch[1] Batch[175] Speed: 1.2506888875967432 samples/sec                   batch loss = 489.36034631729126 | accuracy = 0.5028571428571429


Epoch[1] Batch[180] Speed: 1.249638081107372 samples/sec                   batch loss = 503.7218716144562 | accuracy = 0.49722222222222223


Epoch[1] Batch[185] Speed: 1.2547858615867178 samples/sec                   batch loss = 517.5584743022919 | accuracy = 0.49864864864864866


Epoch[1] Batch[190] Speed: 1.2507898691872699 samples/sec                   batch loss = 530.7493941783905 | accuracy = 0.5052631578947369


Epoch[1] Batch[195] Speed: 1.2523531986175847 samples/sec                   batch loss = 544.0380792617798 | accuracy = 0.5153846153846153


Epoch[1] Batch[200] Speed: 1.2518198235357452 samples/sec                   batch loss = 557.5599403381348 | accuracy = 0.51625


Epoch[1] Batch[205] Speed: 1.2608401775449336 samples/sec                   batch loss = 571.5451865196228 | accuracy = 0.5170731707317073


Epoch[1] Batch[210] Speed: 1.2645094886358412 samples/sec                   batch loss = 585.640597820282 | accuracy = 0.5190476190476191


Epoch[1] Batch[215] Speed: 1.2591368992984138 samples/sec                   batch loss = 599.5066380500793 | accuracy = 0.5174418604651163


Epoch[1] Batch[220] Speed: 1.26174487962814 samples/sec                   batch loss = 613.2678384780884 | accuracy = 0.5193181818181818


Epoch[1] Batch[225] Speed: 1.2632775142584673 samples/sec                   batch loss = 627.4006271362305 | accuracy = 0.5144444444444445


Epoch[1] Batch[230] Speed: 1.2660282149284816 samples/sec                   batch loss = 641.2061369419098 | accuracy = 0.5173913043478261


Epoch[1] Batch[235] Speed: 1.262778705114899 samples/sec                   batch loss = 655.0024676322937 | accuracy = 0.5148936170212766


Epoch[1] Batch[240] Speed: 1.2631519666617101 samples/sec                   batch loss = 668.666543006897 | accuracy = 0.5145833333333333


Epoch[1] Batch[245] Speed: 1.2569384514151258 samples/sec                   batch loss = 682.1368243694305 | accuracy = 0.5163265306122449


Epoch[1] Batch[250] Speed: 1.2670992184748648 samples/sec                   batch loss = 695.8162899017334 | accuracy = 0.519


Epoch[1] Batch[255] Speed: 1.2568844948733378 samples/sec                   batch loss = 709.4140465259552 | accuracy = 0.5205882352941177


Epoch[1] Batch[260] Speed: 1.2555248706643436 samples/sec                   batch loss = 723.1966276168823 | accuracy = 0.5230769230769231


Epoch[1] Batch[265] Speed: 1.2575373692240186 samples/sec                   batch loss = 736.8211057186127 | accuracy = 0.5235849056603774


Epoch[1] Batch[270] Speed: 1.2588878498964584 samples/sec                   batch loss = 750.9079718589783 | accuracy = 0.5203703703703704


Epoch[1] Batch[275] Speed: 1.2623262574400635 samples/sec                   batch loss = 764.3865420818329 | accuracy = 0.5227272727272727


Epoch[1] Batch[280] Speed: 1.258367299762453 samples/sec                   batch loss = 778.2561273574829 | accuracy = 0.5214285714285715


Epoch[1] Batch[285] Speed: 1.2501600962261403 samples/sec                   batch loss = 792.822164773941 | accuracy = 0.5219298245614035


Epoch[1] Batch[290] Speed: 1.2501424899683262 samples/sec                   batch loss = 806.2950792312622 | accuracy = 0.5258620689655172


Epoch[1] Batch[295] Speed: 1.2542628838957919 samples/sec                   batch loss = 820.1474199295044 | accuracy = 0.5271186440677966


Epoch[1] Batch[300] Speed: 1.2519454640264889 samples/sec                   batch loss = 832.9901087284088 | accuracy = 0.5316666666666666


Epoch[1] Batch[305] Speed: 1.2553907139098148 samples/sec                   batch loss = 846.7173693180084 | accuracy = 0.5311475409836065


Epoch[1] Batch[310] Speed: 1.2592536160574124 samples/sec                   batch loss = 860.047833442688 | accuracy = 0.5306451612903226


Epoch[1] Batch[315] Speed: 1.2623091616191628 samples/sec                   batch loss = 873.6702237129211 | accuracy = 0.530952380952381


Epoch[1] Batch[320] Speed: 1.258602830060723 samples/sec                   batch loss = 888.0591802597046 | accuracy = 0.5296875


Epoch[1] Batch[325] Speed: 1.257884903176274 samples/sec                   batch loss = 901.6323871612549 | accuracy = 0.5315384615384615


Epoch[1] Batch[330] Speed: 1.2676139938793853 samples/sec                   batch loss = 914.8198044300079 | accuracy = 0.5348484848484848


Epoch[1] Batch[335] Speed: 1.2661015907983968 samples/sec                   batch loss = 928.2455351352692 | accuracy = 0.5350746268656716


Epoch[1] Batch[340] Speed: 1.2657699383924952 samples/sec                   batch loss = 942.1666650772095 | accuracy = 0.5367647058823529


Epoch[1] Batch[345] Speed: 1.2597630756835518 samples/sec                   batch loss = 955.4697897434235 | accuracy = 0.5369565217391304


Epoch[1] Batch[350] Speed: 1.2600288427756297 samples/sec                   batch loss = 969.1069197654724 | accuracy = 0.5378571428571428


Epoch[1] Batch[355] Speed: 1.2588951234631034 samples/sec                   batch loss = 982.6548826694489 | accuracy = 0.5401408450704225


Epoch[1] Batch[360] Speed: 1.2588252251311312 samples/sec                   batch loss = 996.767893075943 | accuracy = 0.5381944444444444


Epoch[1] Batch[365] Speed: 1.256823669841909 samples/sec                   batch loss = 1009.7236006259918 | accuracy = 0.5404109589041096


Epoch[1] Batch[370] Speed: 1.2631830659001582 samples/sec                   batch loss = 1023.1017079353333 | accuracy = 0.5412162162162162


Epoch[1] Batch[375] Speed: 1.2645745865914912 samples/sec                   batch loss = 1036.4459178447723 | accuracy = 0.5426666666666666


Epoch[1] Batch[380] Speed: 1.2620061675554328 samples/sec                   batch loss = 1049.463461637497 | accuracy = 0.5447368421052632


Epoch[1] Batch[385] Speed: 1.2581675221325743 samples/sec                   batch loss = 1062.7304151058197 | accuracy = 0.5461038961038961


Epoch[1] Batch[390] Speed: 1.2614730766147992 samples/sec                   batch loss = 1076.398520231247 | accuracy = 0.5467948717948717


Epoch[1] Batch[395] Speed: 1.258122234183949 samples/sec                   batch loss = 1089.914011001587 | accuracy = 0.5462025316455696


Epoch[1] Batch[400] Speed: 1.265738807141219 samples/sec                   batch loss = 1103.4811840057373 | accuracy = 0.546875


Epoch[1] Batch[405] Speed: 1.2580790249724385 samples/sec                   batch loss = 1117.644226551056 | accuracy = 0.5450617283950617


Epoch[1] Batch[410] Speed: 1.2622630052156363 samples/sec                   batch loss = 1131.2340350151062 | accuracy = 0.5451219512195122


Epoch[1] Batch[415] Speed: 1.2582591458482595 samples/sec                   batch loss = 1144.6152458190918 | accuracy = 0.5457831325301205


Epoch[1] Batch[420] Speed: 1.2596230939689854 samples/sec                   batch loss = 1157.6342232227325 | accuracy = 0.5458333333333333


Epoch[1] Batch[425] Speed: 1.2616567324165577 samples/sec                   batch loss = 1171.1606123447418 | accuracy = 0.5470588235294118


Epoch[1] Batch[430] Speed: 1.2591131805621005 samples/sec                   batch loss = 1184.7827444076538 | accuracy = 0.547093023255814


Epoch[1] Batch[435] Speed: 1.2622498996916605 samples/sec                   batch loss = 1198.5125737190247 | accuracy = 0.5477011494252874


Epoch[1] Batch[440] Speed: 1.2531234130289246 samples/sec                   batch loss = 1211.8521783351898 | accuracy = 0.5488636363636363


Epoch[1] Batch[445] Speed: 1.2546001663107138 samples/sec                   batch loss = 1225.1981337070465 | accuracy = 0.550561797752809


Epoch[1] Batch[450] Speed: 1.2538553106818697 samples/sec                   batch loss = 1238.641077041626 | accuracy = 0.5538888888888889


Epoch[1] Batch[455] Speed: 1.2554738538981598 samples/sec                   batch loss = 1251.914205789566 | accuracy = 0.5543956043956044


Epoch[1] Batch[460] Speed: 1.254538436448131 samples/sec                   batch loss = 1265.4708569049835 | accuracy = 0.5543478260869565


Epoch[1] Batch[465] Speed: 1.2520197392247427 samples/sec                   batch loss = 1279.929165840149 | accuracy = 0.5532258064516129


Epoch[1] Batch[470] Speed: 1.2518399990449218 samples/sec                   batch loss = 1293.564210176468 | accuracy = 0.5531914893617021


Epoch[1] Batch[475] Speed: 1.255246536762755 samples/sec                   batch loss = 1306.7318496704102 | accuracy = 0.5547368421052632


Epoch[1] Batch[480] Speed: 1.2512261515353555 samples/sec                   batch loss = 1319.8991146087646 | accuracy = 0.5567708333333333


Epoch[1] Batch[485] Speed: 1.2625777144695096 samples/sec                   batch loss = 1331.7825934886932 | accuracy = 0.5608247422680412


Epoch[1] Batch[490] Speed: 1.2573485029599059 samples/sec                   batch loss = 1345.0467176437378 | accuracy = 0.5607142857142857


Epoch[1] Batch[495] Speed: 1.2593876543765916 samples/sec                   batch loss = 1358.803662776947 | accuracy = 0.5606060606060606


Epoch[1] Batch[500] Speed: 1.2562904318582226 samples/sec                   batch loss = 1372.8880467414856 | accuracy = 0.5595


Epoch[1] Batch[505] Speed: 1.2559064534285684 samples/sec                   batch loss = 1385.2636947631836 | accuracy = 0.5613861386138614


Epoch[1] Batch[510] Speed: 1.2602790065700877 samples/sec                   batch loss = 1398.9276554584503 | accuracy = 0.5622549019607843


Epoch[1] Batch[515] Speed: 1.2627162628716073 samples/sec                   batch loss = 1411.920289158821 | accuracy = 0.5635922330097087


Epoch[1] Batch[520] Speed: 1.2636103361196371 samples/sec                   batch loss = 1424.55013525486 | accuracy = 0.5639423076923077


Epoch[1] Batch[525] Speed: 1.2570313089448273 samples/sec                   batch loss = 1438.5490547418594 | accuracy = 0.5642857142857143


Epoch[1] Batch[530] Speed: 1.2597009313122316 samples/sec                   batch loss = 1452.5821415185928 | accuracy = 0.5636792452830188


Epoch[1] Batch[535] Speed: 1.2606060824643814 samples/sec                   batch loss = 1466.2382558584213 | accuracy = 0.5635514018691589


Epoch[1] Batch[540] Speed: 1.2566995899693039 samples/sec                   batch loss = 1479.2769805192947 | accuracy = 0.5652777777777778


Epoch[1] Batch[545] Speed: 1.262393885448165 samples/sec                   batch loss = 1492.7056375741959 | accuracy = 0.5655963302752294


Epoch[1] Batch[550] Speed: 1.2576682140501703 samples/sec                   batch loss = 1506.5602954626083 | accuracy = 0.5659090909090909


Epoch[1] Batch[555] Speed: 1.2579491321822314 samples/sec                   batch loss = 1519.4458476305008 | accuracy = 0.5666666666666667


Epoch[1] Batch[560] Speed: 1.257071714729541 samples/sec                   batch loss = 1533.1802431344986 | accuracy = 0.565625


Epoch[1] Batch[565] Speed: 1.260979292607015 samples/sec                   batch loss = 1546.786578297615 | accuracy = 0.5654867256637168


Epoch[1] Batch[570] Speed: 1.256252616067277 samples/sec                   batch loss = 1560.7133806943893 | accuracy = 0.5653508771929825


Epoch[1] Batch[575] Speed: 1.2563910028375334 samples/sec                   batch loss = 1573.198227763176 | accuracy = 0.5660869565217391


Epoch[1] Batch[580] Speed: 1.257485906025781 samples/sec                   batch loss = 1586.111637711525 | accuracy = 0.5676724137931034


Epoch[1] Batch[585] Speed: 1.2608419778844084 samples/sec                   batch loss = 1599.399961590767 | accuracy = 0.5683760683760684


Epoch[1] Batch[590] Speed: 1.2642954655956604 samples/sec                   batch loss = 1613.716071486473 | accuracy = 0.5677966101694916


Epoch[1] Batch[595] Speed: 1.2612634935488973 samples/sec                   batch loss = 1626.8994752168655 | accuracy = 0.5676470588235294


Epoch[1] Batch[600] Speed: 1.2597258072424329 samples/sec                   batch loss = 1639.2759623527527 | accuracy = 0.5691666666666667


Epoch[1] Batch[605] Speed: 1.2573199517702562 samples/sec                   batch loss = 1651.3744835853577 | accuracy = 0.5706611570247934


Epoch[1] Batch[610] Speed: 1.2577595764710152 samples/sec                   batch loss = 1664.3500685691833 | accuracy = 0.5713114754098361


Epoch[1] Batch[615] Speed: 1.2595877251351748 samples/sec                   batch loss = 1676.979657649994 | accuracy = 0.5719512195121951


Epoch[1] Batch[620] Speed: 1.258380891098013 samples/sec                   batch loss = 1689.9517676830292 | accuracy = 0.5725806451612904


Epoch[1] Batch[625] Speed: 1.2584447931870084 samples/sec                   batch loss = 1703.946177482605 | accuracy = 0.5728


Epoch[1] Batch[630] Speed: 1.258999418647975 samples/sec                   batch loss = 1715.4439380168915 | accuracy = 0.5738095238095238


Epoch[1] Batch[635] Speed: 1.2594442842144125 samples/sec                   batch loss = 1729.0166084766388 | accuracy = 0.5748031496062992


Epoch[1] Batch[640] Speed: 1.2580144997107887 samples/sec                   batch loss = 1741.2471482753754 | accuracy = 0.576171875


Epoch[1] Batch[645] Speed: 1.2584206285616475 samples/sec                   batch loss = 1754.287945508957 | accuracy = 0.5763565891472868


Epoch[1] Batch[650] Speed: 1.2643652104711032 samples/sec                   batch loss = 1768.5263526439667 | accuracy = 0.5765384615384616


Epoch[1] Batch[655] Speed: 1.2595281511090004 samples/sec                   batch loss = 1781.0482790470123 | accuracy = 0.5770992366412214


Epoch[1] Batch[660] Speed: 1.2550089750276103 samples/sec                   batch loss = 1795.424328804016 | accuracy = 0.5772727272727273


Epoch[1] Batch[665] Speed: 1.2603708434008825 samples/sec                   batch loss = 1808.2609913349152 | accuracy = 0.5778195488721805


Epoch[1] Batch[670] Speed: 1.2560550138574542 samples/sec                   batch loss = 1821.3655235767365 | accuracy = 0.5779850746268657


Epoch[1] Batch[675] Speed: 1.2574253997410083 samples/sec                   batch loss = 1835.0114340782166 | accuracy = 0.5785185185185185


Epoch[1] Batch[680] Speed: 1.2620156606176147 samples/sec                   batch loss = 1847.9744472503662 | accuracy = 0.5786764705882353


Epoch[1] Batch[685] Speed: 1.253148310661752 samples/sec                   batch loss = 1860.7772099971771 | accuracy = 0.5791970802919708


Epoch[1] Batch[690] Speed: 1.2567092857688065 samples/sec                   batch loss = 1873.7707660198212 | accuracy = 0.5797101449275363


Epoch[1] Batch[695] Speed: 1.2590905966197938 samples/sec                   batch loss = 1886.484581232071 | accuracy = 0.5805755395683453


Epoch[1] Batch[700] Speed: 1.2559884393955012 samples/sec                   batch loss = 1900.2541377544403 | accuracy = 0.5807142857142857


Epoch[1] Batch[705] Speed: 1.2542972979014546 samples/sec                   batch loss = 1915.7028665542603 | accuracy = 0.5797872340425532


Epoch[1] Batch[710] Speed: 1.2602313892352972 samples/sec                   batch loss = 1927.79283452034 | accuracy = 0.5806338028169014


Epoch[1] Batch[715] Speed: 1.254978934099232 samples/sec                   batch loss = 1941.581839799881 | accuracy = 0.5814685314685315


Epoch[1] Batch[720] Speed: 1.2569312004411226 samples/sec                   batch loss = 1956.2374284267426 | accuracy = 0.58125


Epoch[1] Batch[725] Speed: 1.2546292508740116 samples/sec                   batch loss = 1968.9220116138458 | accuracy = 0.5820689655172414


Epoch[1] Batch[730] Speed: 1.259675299727458 samples/sec                   batch loss = 1982.685483455658 | accuracy = 0.5815068493150685


Epoch[1] Batch[735] Speed: 1.2568876021924416 samples/sec                   batch loss = 1996.166879415512 | accuracy = 0.5819727891156462


Epoch[1] Batch[740] Speed: 1.252188409349354 samples/sec                   batch loss = 2008.0226500034332 | accuracy = 0.5820945945945946


Epoch[1] Batch[745] Speed: 1.253590267835374 samples/sec                   batch loss = 2021.6198754310608 | accuracy = 0.5828859060402685


Epoch[1] Batch[750] Speed: 1.2520798198258172 samples/sec                   batch loss = 2034.3556344509125 | accuracy = 0.583


Epoch[1] Batch[755] Speed: 1.2566881058178134 samples/sec                   batch loss = 2047.1136617660522 | accuracy = 0.583112582781457


Epoch[1] Batch[760] Speed: 1.2516038182190645 samples/sec                   batch loss = 2060.0497267246246 | accuracy = 0.5832236842105263


Epoch[1] Batch[765] Speed: 1.2539670201277913 samples/sec                   batch loss = 2072.8819451332092 | accuracy = 0.5839869281045752


Epoch[1] Batch[770] Speed: 1.2550754456930453 samples/sec                   batch loss = 2085.771409034729 | accuracy = 0.5844155844155844


Epoch[1] Batch[775] Speed: 1.2592931250926518 samples/sec                   batch loss = 2098.758940935135 | accuracy = 0.584516129032258


Epoch[1] Batch[780] Speed: 1.2574630034293663 samples/sec                   batch loss = 2112.2099475860596 | accuracy = 0.5849358974358975


Epoch[1] Batch[785] Speed: 1.2549840972814463 samples/sec                   batch loss = 2123.8400382995605 | accuracy = 0.5863057324840765


[Epoch 1] training: accuracy=0.5866116751269036
[Epoch 1] time cost: 644.0594365596771
[Epoch 1] validation: validation accuracy=0.6822222222222222


Epoch[2] Batch[5] Speed: 1.2585166317916696 samples/sec                   batch loss = 10.60991108417511 | accuracy = 0.85


Epoch[2] Batch[10] Speed: 1.2596115563236274 samples/sec                   batch loss = 23.411736488342285 | accuracy = 0.725


Epoch[2] Batch[15] Speed: 1.2595008245919381 samples/sec                   batch loss = 38.67470693588257 | accuracy = 0.6333333333333333


Epoch[2] Batch[20] Speed: 1.2630195980446899 samples/sec                   batch loss = 50.95856690406799 | accuracy = 0.6375


Epoch[2] Batch[25] Speed: 1.2575576351924513 samples/sec                   batch loss = 63.67573857307434 | accuracy = 0.62


Epoch[2] Batch[30] Speed: 1.256135984684278 samples/sec                   batch loss = 77.37428736686707 | accuracy = 0.6083333333333333


Epoch[2] Batch[35] Speed: 1.2563846049523697 samples/sec                   batch loss = 90.027658700943 | accuracy = 0.6214285714285714


Epoch[2] Batch[40] Speed: 1.2583526705279553 samples/sec                   batch loss = 101.65738463401794 | accuracy = 0.6375


Epoch[2] Batch[45] Speed: 1.2561833870856396 samples/sec                   batch loss = 115.58172464370728 | accuracy = 0.6222222222222222


Epoch[2] Batch[50] Speed: 1.2616438292296033 samples/sec                   batch loss = 126.6704568862915 | accuracy = 0.635


Epoch[2] Batch[55] Speed: 1.260314319561476 samples/sec                   batch loss = 139.43964171409607 | accuracy = 0.6318181818181818


Epoch[2] Batch[60] Speed: 1.258576582240023 samples/sec                   batch loss = 152.3073959350586 | accuracy = 0.6375


Epoch[2] Batch[65] Speed: 1.2584543271421558 samples/sec                   batch loss = 163.93145883083344 | accuracy = 0.65


Epoch[2] Batch[70] Speed: 1.2542101881791932 samples/sec                   batch loss = 175.86735343933105 | accuracy = 0.6571428571428571


Epoch[2] Batch[75] Speed: 1.2619136176830805 samples/sec                   batch loss = 188.60848450660706 | accuracy = 0.66


Epoch[2] Batch[80] Speed: 1.2558565337308827 samples/sec                   batch loss = 203.55985164642334 | accuracy = 0.65


Epoch[2] Batch[85] Speed: 1.2608887885154403 samples/sec                   batch loss = 215.07292652130127 | accuracy = 0.6529411764705882


Epoch[2] Batch[90] Speed: 1.2569876095748003 samples/sec                   batch loss = 227.28148937225342 | accuracy = 0.6611111111111111


Epoch[2] Batch[95] Speed: 1.258974476922988 samples/sec                   batch loss = 241.65908408164978 | accuracy = 0.65


Epoch[2] Batch[100] Speed: 1.2558950777811015 samples/sec                   batch loss = 254.1524442434311 | accuracy = 0.655


Epoch[2] Batch[105] Speed: 1.2590076382965727 samples/sec                   batch loss = 265.17526853084564 | accuracy = 0.6642857142857143


Epoch[2] Batch[110] Speed: 1.2608608343804675 samples/sec                   batch loss = 278.31397449970245 | accuracy = 0.6613636363636364


Epoch[2] Batch[115] Speed: 1.2567335729912952 samples/sec                   batch loss = 291.97756683826447 | accuracy = 0.658695652173913


Epoch[2] Batch[120] Speed: 1.2583135979820441 samples/sec                   batch loss = 304.9429076910019 | accuracy = 0.6520833333333333


Epoch[2] Batch[125] Speed: 1.2558739252654971 samples/sec                   batch loss = 317.1253937482834 | accuracy = 0.652


Epoch[2] Batch[130] Speed: 1.2559740534687556 samples/sec                   batch loss = 331.2891014814377 | accuracy = 0.65


Epoch[2] Batch[135] Speed: 1.260830702158753 samples/sec                   batch loss = 343.47903275489807 | accuracy = 0.6518518518518519


Epoch[2] Batch[140] Speed: 1.2574384995175147 samples/sec                   batch loss = 356.08651328086853 | accuracy = 0.65


Epoch[2] Batch[145] Speed: 1.260291976527336 samples/sec                   batch loss = 367.9326331615448 | accuracy = 0.653448275862069


Epoch[2] Batch[150] Speed: 1.2581877140605184 samples/sec                   batch loss = 380.86533880233765 | accuracy = 0.6516666666666666


Epoch[2] Batch[155] Speed: 1.2616252339244305 samples/sec                   batch loss = 394.9320821762085 | accuracy = 0.65


Epoch[2] Batch[160] Speed: 1.260048432042115 samples/sec                   batch loss = 406.201033949852 | accuracy = 0.6546875


Epoch[2] Batch[165] Speed: 1.2612582785688764 samples/sec                   batch loss = 420.7247964143753 | accuracy = 0.6484848484848484


Epoch[2] Batch[170] Speed: 1.2583920286612964 samples/sec                   batch loss = 433.7289777994156 | accuracy = 0.6441176470588236


Epoch[2] Batch[175] Speed: 1.255449239556416 samples/sec                   batch loss = 447.08768010139465 | accuracy = 0.6385714285714286


Epoch[2] Batch[180] Speed: 1.2621012944928798 samples/sec                   batch loss = 459.65987396240234 | accuracy = 0.6388888888888888


Epoch[2] Batch[185] Speed: 1.2594929767110532 samples/sec                   batch loss = 473.38772463798523 | accuracy = 0.6337837837837837


Epoch[2] Batch[190] Speed: 1.2649553961373554 samples/sec                   batch loss = 486.4638321399689 | accuracy = 0.631578947368421


Epoch[2] Batch[195] Speed: 1.2566762453570917 samples/sec                   batch loss = 497.08651638031006 | accuracy = 0.6358974358974359


Epoch[2] Batch[200] Speed: 1.2553796294100001 samples/sec                   batch loss = 510.11814308166504 | accuracy = 0.635


Epoch[2] Batch[205] Speed: 1.2587793231520352 samples/sec                   batch loss = 524.066162109375 | accuracy = 0.6353658536585366


Epoch[2] Batch[210] Speed: 1.2578439735751954 samples/sec                   batch loss = 535.3856892585754 | accuracy = 0.6404761904761904


Epoch[2] Batch[215] Speed: 1.2627104656554626 samples/sec                   batch loss = 547.8344630002975 | accuracy = 0.641860465116279


Epoch[2] Batch[220] Speed: 1.258372585246951 samples/sec                   batch loss = 560.6121178865433 | accuracy = 0.6431818181818182


Epoch[2] Batch[225] Speed: 1.2581549732702693 samples/sec                   batch loss = 571.905137181282 | accuracy = 0.6455555555555555


Epoch[2] Batch[230] Speed: 1.2635570423852143 samples/sec                   batch loss = 586.0077143907547 | accuracy = 0.6434782608695652


Epoch[2] Batch[235] Speed: 1.2670913713147711 samples/sec                   batch loss = 597.0051745176315 | accuracy = 0.6468085106382979


Epoch[2] Batch[240] Speed: 1.2637829054263852 samples/sec                   batch loss = 608.4615279436111 | accuracy = 0.6510416666666666


Epoch[2] Batch[245] Speed: 1.2593185520747827 samples/sec                   batch loss = 620.7627588510513 | accuracy = 0.65


Epoch[2] Batch[250] Speed: 1.2565536062594242 samples/sec                   batch loss = 632.4055659770966 | accuracy = 0.651


Epoch[2] Batch[255] Speed: 1.2649119070354118 samples/sec                   batch loss = 644.8456311225891 | accuracy = 0.6509803921568628


Epoch[2] Batch[260] Speed: 1.2599801088373883 samples/sec                   batch loss = 657.2566599845886 | accuracy = 0.65


Epoch[2] Batch[265] Speed: 1.2594183794496074 samples/sec                   batch loss = 669.6300110816956 | accuracy = 0.6509433962264151


Epoch[2] Batch[270] Speed: 1.2581543128107662 samples/sec                   batch loss = 681.6787602901459 | accuracy = 0.6509259259259259


Epoch[2] Batch[275] Speed: 1.262308116889566 samples/sec                   batch loss = 693.1892809867859 | accuracy = 0.6536363636363637


Epoch[2] Batch[280] Speed: 1.258726530570598 samples/sec                   batch loss = 706.9605264663696 | accuracy = 0.6526785714285714


Epoch[2] Batch[285] Speed: 1.2587534457341096 samples/sec                   batch loss = 720.4839715957642 | accuracy = 0.6517543859649123


Epoch[2] Batch[290] Speed: 1.2592987964495093 samples/sec                   batch loss = 732.9245532751083 | accuracy = 0.6525862068965518


Epoch[2] Batch[295] Speed: 1.2575079611663875 samples/sec                   batch loss = 745.4615931510925 | accuracy = 0.652542372881356


Epoch[2] Batch[300] Speed: 1.2555570988615192 samples/sec                   batch loss = 759.4054956436157 | accuracy = 0.6491666666666667


Epoch[2] Batch[305] Speed: 1.2609555044091536 samples/sec                   batch loss = 774.1045007705688 | accuracy = 0.6467213114754098


Epoch[2] Batch[310] Speed: 1.2581340276066013 samples/sec                   batch loss = 785.8413679599762 | accuracy = 0.6459677419354839


Epoch[2] Batch[315] Speed: 1.2565934166347847 samples/sec                   batch loss = 796.7806928157806 | accuracy = 0.6476190476190476


Epoch[2] Batch[320] Speed: 1.2554550642371352 samples/sec                   batch loss = 806.2186484336853 | accuracy = 0.6515625


Epoch[2] Batch[325] Speed: 1.2578089874794252 samples/sec                   batch loss = 819.0209345817566 | accuracy = 0.6507692307692308


Epoch[2] Batch[330] Speed: 1.2619969594215694 samples/sec                   batch loss = 830.6160018444061 | accuracy = 0.6515151515151515


Epoch[2] Batch[335] Speed: 1.2620608495515255 samples/sec                   batch loss = 842.1577169895172 | accuracy = 0.6522388059701493


Epoch[2] Batch[340] Speed: 1.2618301920481279 samples/sec                   batch loss = 857.0045664310455 | accuracy = 0.65


Epoch[2] Batch[345] Speed: 1.259400322378911 samples/sec                   batch loss = 867.2694625854492 | accuracy = 0.6521739130434783


Epoch[2] Batch[350] Speed: 1.258939522380538 samples/sec                   batch loss = 881.3757855892181 | accuracy = 0.6507142857142857


Epoch[2] Batch[355] Speed: 1.2644488763605444 samples/sec                   batch loss = 890.7998868227005 | accuracy = 0.6535211267605634


Epoch[2] Batch[360] Speed: 1.2639184810572872 samples/sec                   batch loss = 903.8073387145996 | accuracy = 0.6520833333333333


Epoch[2] Batch[365] Speed: 1.2588659352427352 samples/sec                   batch loss = 913.3921999931335 | accuracy = 0.6547945205479452


Epoch[2] Batch[370] Speed: 1.267165349013947 samples/sec                   batch loss = 924.9669191837311 | accuracy = 0.6533783783783784


Epoch[2] Batch[375] Speed: 1.262029805545202 samples/sec                   batch loss = 937.7164598703384 | accuracy = 0.6533333333333333


Epoch[2] Batch[380] Speed: 1.2559872170490451 samples/sec                   batch loss = 946.9175704717636 | accuracy = 0.6539473684210526


Epoch[2] Batch[385] Speed: 1.259418568531621 samples/sec                   batch loss = 959.1198306083679 | accuracy = 0.6532467532467533


Epoch[2] Batch[390] Speed: 1.258707926726841 samples/sec                   batch loss = 970.9478578567505 | accuracy = 0.6532051282051282


Epoch[2] Batch[395] Speed: 1.2550623012176574 samples/sec                   batch loss = 984.7900972366333 | accuracy = 0.6525316455696203


Epoch[2] Batch[400] Speed: 1.2583439875302769 samples/sec                   batch loss = 996.8611364364624 | accuracy = 0.651875


Epoch[2] Batch[405] Speed: 1.2557390359832676 samples/sec                   batch loss = 1009.2965136766434 | accuracy = 0.6512345679012346


Epoch[2] Batch[410] Speed: 1.26202259064319 samples/sec                   batch loss = 1020.4750679731369 | accuracy = 0.650609756097561


Epoch[2] Batch[415] Speed: 1.259507727025841 samples/sec                   batch loss = 1029.5481296777725 | accuracy = 0.6524096385542169


Epoch[2] Batch[420] Speed: 1.2591489952530162 samples/sec                   batch loss = 1039.2238929271698 | accuracy = 0.6547619047619048


Epoch[2] Batch[425] Speed: 1.2559662494764379 samples/sec                   batch loss = 1053.3565833568573 | accuracy = 0.6529411764705882


Epoch[2] Batch[430] Speed: 1.2570889515539576 samples/sec                   batch loss = 1066.447050333023 | accuracy = 0.6517441860465116


Epoch[2] Batch[435] Speed: 1.2623218884640073 samples/sec                   batch loss = 1080.9744950532913 | accuracy = 0.6505747126436782


Epoch[2] Batch[440] Speed: 1.2577701372763799 samples/sec                   batch loss = 1093.3922501802444 | accuracy = 0.6511363636363636


Epoch[2] Batch[445] Speed: 1.2573379492337884 samples/sec                   batch loss = 1106.9873231649399 | accuracy = 0.650561797752809


Epoch[2] Batch[450] Speed: 1.2592312161028207 samples/sec                   batch loss = 1117.3754180669785 | accuracy = 0.6516666666666666


Epoch[2] Batch[455] Speed: 1.2581499726655683 samples/sec                   batch loss = 1129.4519406557083 | accuracy = 0.6516483516483517


Epoch[2] Batch[460] Speed: 1.26165682729391 samples/sec                   batch loss = 1143.5657893419266 | accuracy = 0.6505434782608696


Epoch[2] Batch[465] Speed: 1.2559142566778116 samples/sec                   batch loss = 1154.9598726034164 | accuracy = 0.6510752688172043


Epoch[2] Batch[470] Speed: 1.259156177336021 samples/sec                   batch loss = 1166.2894769906998 | accuracy = 0.6526595744680851


Epoch[2] Batch[475] Speed: 1.2559214019067422 samples/sec                   batch loss = 1178.8260024785995 | accuracy = 0.6521052631578947


Epoch[2] Batch[480] Speed: 1.2542206894633812 samples/sec                   batch loss = 1191.5798555612564 | accuracy = 0.6505208333333333


Epoch[2] Batch[485] Speed: 1.2534170995956573 samples/sec                   batch loss = 1204.0771292448044 | accuracy = 0.6510309278350516


Epoch[2] Batch[490] Speed: 1.2552362061215414 samples/sec                   batch loss = 1215.8888672590256 | accuracy = 0.6505102040816326


Epoch[2] Batch[495] Speed: 1.2556922310048777 samples/sec                   batch loss = 1228.0776816606522 | accuracy = 0.65


Epoch[2] Batch[500] Speed: 1.253799744520199 samples/sec                   batch loss = 1240.4044502973557 | accuracy = 0.65


Epoch[2] Batch[505] Speed: 1.249787303495508 samples/sec                   batch loss = 1251.4900938272476 | accuracy = 0.651980198019802


Epoch[2] Batch[510] Speed: 1.2547260840411099 samples/sec                   batch loss = 1263.5711501836777 | accuracy = 0.6514705882352941


Epoch[2] Batch[515] Speed: 1.2489533069292456 samples/sec                   batch loss = 1273.9647878408432 | accuracy = 0.6529126213592233


Epoch[2] Batch[520] Speed: 1.2526122007501541 samples/sec                   batch loss = 1286.8217033147812 | accuracy = 0.6528846153846154


Epoch[2] Batch[525] Speed: 1.2602258987998376 samples/sec                   batch loss = 1299.8490422964096 | accuracy = 0.6523809523809524


Epoch[2] Batch[530] Speed: 1.2608942847284648 samples/sec                   batch loss = 1311.6022540330887 | accuracy = 0.6533018867924528


Epoch[2] Batch[535] Speed: 1.2589692808546684 samples/sec                   batch loss = 1322.0489077568054 | accuracy = 0.6546728971962616


Epoch[2] Batch[540] Speed: 1.2614447171846077 samples/sec                   batch loss = 1333.8081946372986 | accuracy = 0.6555555555555556


Epoch[2] Batch[545] Speed: 1.2595158587573663 samples/sec                   batch loss = 1346.9174671173096 | accuracy = 0.6555045871559633


Epoch[2] Batch[550] Speed: 1.2552232461030373 samples/sec                   batch loss = 1356.73175740242 | accuracy = 0.6559090909090909


Epoch[2] Batch[555] Speed: 1.2531658144770794 samples/sec                   batch loss = 1368.5856236219406 | accuracy = 0.6558558558558558


Epoch[2] Batch[560] Speed: 1.257480345236383 samples/sec                   batch loss = 1380.1627514362335 | accuracy = 0.6553571428571429


Epoch[2] Batch[565] Speed: 1.2629289909871526 samples/sec                   batch loss = 1387.6127403974533 | accuracy = 0.6584070796460177


Epoch[2] Batch[570] Speed: 1.2542976729963886 samples/sec                   batch loss = 1397.0255063772202 | accuracy = 0.6596491228070176


Epoch[2] Batch[575] Speed: 1.257961488293918 samples/sec                   batch loss = 1411.1202701330185 | accuracy = 0.66


Epoch[2] Batch[580] Speed: 1.2627868791380914 samples/sec                   batch loss = 1423.0380411148071 | accuracy = 0.6607758620689655


Epoch[2] Batch[585] Speed: 1.2625323935061956 samples/sec                   batch loss = 1433.7535898685455 | accuracy = 0.661965811965812


Epoch[2] Batch[590] Speed: 1.2624355866656853 samples/sec                   batch loss = 1446.6178574562073 | accuracy = 0.6614406779661017


Epoch[2] Batch[595] Speed: 1.2680918060344126 samples/sec                   batch loss = 1458.7062604427338 | accuracy = 0.6626050420168067


Epoch[2] Batch[600] Speed: 1.2657449186771057 samples/sec                   batch loss = 1469.4931894540787 | accuracy = 0.6633333333333333


Epoch[2] Batch[605] Speed: 1.2603484036819097 samples/sec                   batch loss = 1480.7598584890366 | accuracy = 0.6636363636363637


Epoch[2] Batch[610] Speed: 1.2636820042460133 samples/sec                   batch loss = 1491.0758057832718 | accuracy = 0.6643442622950819


Epoch[2] Batch[615] Speed: 1.2578536870413048 samples/sec                   batch loss = 1501.4751896858215 | accuracy = 0.6646341463414634


Epoch[2] Batch[620] Speed: 1.2608054983247592 samples/sec                   batch loss = 1509.4500303268433 | accuracy = 0.6665322580645161


Epoch[2] Batch[625] Speed: 1.2606964512419132 samples/sec                   batch loss = 1520.851152896881 | accuracy = 0.6664


Epoch[2] Batch[630] Speed: 1.2601990152892464 samples/sec                   batch loss = 1533.0790643692017 | accuracy = 0.6666666666666666


Epoch[2] Batch[635] Speed: 1.2607507355724916 samples/sec                   batch loss = 1546.0439548492432 | accuracy = 0.665748031496063


Epoch[2] Batch[640] Speed: 1.2629865102250755 samples/sec                   batch loss = 1559.4857251644135 | accuracy = 0.666015625


Epoch[2] Batch[645] Speed: 1.2650318908043536 samples/sec                   batch loss = 1574.7261147499084 | accuracy = 0.6655038759689923


Epoch[2] Batch[650] Speed: 1.2609383509270187 samples/sec                   batch loss = 1588.0222063064575 | accuracy = 0.6646153846153846


Epoch[2] Batch[655] Speed: 1.2621412672752281 samples/sec                   batch loss = 1601.9104821681976 | accuracy = 0.6645038167938931


Epoch[2] Batch[660] Speed: 1.2625298282659458 samples/sec                   batch loss = 1614.602427482605 | accuracy = 0.6647727272727273


Epoch[2] Batch[665] Speed: 1.2657032850073315 samples/sec                   batch loss = 1627.1094017028809 | accuracy = 0.6642857142857143


Epoch[2] Batch[670] Speed: 1.2629498113955495 samples/sec                   batch loss = 1638.6748574972153 | accuracy = 0.6652985074626866


Epoch[2] Batch[675] Speed: 1.2683655101557898 samples/sec                   batch loss = 1647.6863757371902 | accuracy = 0.6659259259259259


Epoch[2] Batch[680] Speed: 1.2674518667987715 samples/sec                   batch loss = 1657.895235657692 | accuracy = 0.6676470588235294


Epoch[2] Batch[685] Speed: 1.257189933048398 samples/sec                   batch loss = 1674.5439232587814 | accuracy = 0.6667883211678832


Epoch[2] Batch[690] Speed: 1.2573314474728579 samples/sec                   batch loss = 1686.6301527023315 | accuracy = 0.6681159420289855


Epoch[2] Batch[695] Speed: 1.2627306135864513 samples/sec                   batch loss = 1699.5907621383667 | accuracy = 0.6676258992805756


Epoch[2] Batch[700] Speed: 1.2556232517147565 samples/sec                   batch loss = 1713.4680460691452 | accuracy = 0.6667857142857143


Epoch[2] Batch[705] Speed: 1.2597295907399466 samples/sec                   batch loss = 1726.0071437358856 | accuracy = 0.6670212765957447


Epoch[2] Batch[710] Speed: 1.262687657452911 samples/sec                   batch loss = 1738.011211514473 | accuracy = 0.6676056338028169


Epoch[2] Batch[715] Speed: 1.2569637833965441 samples/sec                   batch loss = 1748.2816815376282 | accuracy = 0.6681818181818182


Epoch[2] Batch[720] Speed: 1.257599677377402 samples/sec                   batch loss = 1760.571576476097 | accuracy = 0.6684027777777778


Epoch[2] Batch[725] Speed: 1.2534947337387172 samples/sec                   batch loss = 1770.3899322748184 | accuracy = 0.6689655172413793


Epoch[2] Batch[730] Speed: 1.2590720765018464 samples/sec                   batch loss = 1782.0606372356415 | accuracy = 0.6702054794520548


Epoch[2] Batch[735] Speed: 1.2523860120950698 samples/sec                   batch loss = 1795.3329555988312 | accuracy = 0.6693877551020408


Epoch[2] Batch[740] Speed: 1.2537130786363975 samples/sec                   batch loss = 1808.4429856538773 | accuracy = 0.668918918918919


Epoch[2] Batch[745] Speed: 1.2634831999263774 samples/sec                   batch loss = 1818.1781741976738 | accuracy = 0.6697986577181209


Epoch[2] Batch[750] Speed: 1.2541267467248256 samples/sec                   batch loss = 1828.9599284529686 | accuracy = 0.67


Epoch[2] Batch[755] Speed: 1.2614805697875797 samples/sec                   batch loss = 1839.215443789959 | accuracy = 0.6708609271523179


Epoch[2] Batch[760] Speed: 1.264621674795232 samples/sec                   batch loss = 1848.3751347661018 | accuracy = 0.6723684210526316


Epoch[2] Batch[765] Speed: 1.2602798586028825 samples/sec                   batch loss = 1860.248600423336 | accuracy = 0.6725490196078432


Epoch[2] Batch[770] Speed: 1.2630467922058437 samples/sec                   batch loss = 1874.7966156601906 | accuracy = 0.6717532467532468


Epoch[2] Batch[775] Speed: 1.2617637631407277 samples/sec                   batch loss = 1885.2677918076515 | accuracy = 0.672258064516129


Epoch[2] Batch[780] Speed: 1.262202987945165 samples/sec                   batch loss = 1899.8516576886177 | accuracy = 0.6717948717948717


Epoch[2] Batch[785] Speed: 1.2598157660492846 samples/sec                   batch loss = 1908.0684517025948 | accuracy = 0.6729299363057325


[Epoch 2] training: accuracy=0.6738578680203046
[Epoch 2] time cost: 641.4686095714569
[Epoch 2] validation: validation accuracy=0.7366666666666667
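
The `batch loss` values in the log above grow steadily within each epoch, which suggests they are running totals rather than per-batch losses. Under that assumption, dividing by the batch index gives a rough mean loss per batch that is easier to compare across epochs. The sketch below parses one progress line in the format shown above; `LINE_RE` and `parse_progress_line` are hypothetical helper names for illustration and are not part of MXNet.

```python
import re

# Pattern for the per-batch progress lines printed during training.
LINE_RE = re.compile(
    r"Epoch\[(\d+)\] Batch\[(\d+)\] Speed: ([\d.]+) samples/sec\s+"
    r"batch loss = ([\d.]+) \| accuracy = ([\d.]+)"
)


def parse_progress_line(line):
    """Return a dict of fields from one progress line, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    epoch, batch, speed, loss_sum, accuracy = match.groups()
    return {
        "epoch": int(epoch),
        "batch": int(batch),
        "speed": float(speed),
        "loss_sum": float(loss_sum),
        "accuracy": float(accuracy),
        # Assumption: the logged loss is a running sum over the epoch,
        # so dividing by the batch index approximates the mean loss per batch.
        "mean_batch_loss": float(loss_sum) / int(batch),
    }


# Last progress line of epoch 2, copied from the log above.
sample = (
    "Epoch[2] Batch[785] Speed: 1.2598157660492846 samples/sec"
    "                   batch loss = 1908.0684517025948 | accuracy = 0.6729299363057325"
)
record = parse_progress_line(sample)
print(record["epoch"], record["batch"], record["mean_batch_loss"])
```

Epoch summary lines such as `[Epoch 2] time cost: ...` use a different format, so the helper returns `None` for them.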


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).