<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference may be faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[10:27:16] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

If there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you only have one GPU, the code falls back to `npx.cpu()` rather than raising an error, which is the case in the output shown below:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[10:27:16] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
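
As a rough illustration of this indexing scheme, the sketch below picks GPU ids in plain Python. The helper name `pick_gpu_ids` and its arguments are hypothetical and not part of MXNet; the function only computes which ids to request, so it runs without a GPU.

```python
def pick_gpu_ids(num_available, num_wanted):
    """Return the 0-based ids of the first num_wanted GPUs, capped at num_available.

    Hypothetical helper for illustration; it is not part of MXNet.
    """
    if num_available < 0 or num_wanted < 0:
        raise ValueError("GPU counts must be non-negative")
    return list(range(min(num_available, num_wanted)))


# On a machine with eight GPUs, the valid ids run from 0 to 7.
all_ids = pick_gpu_ids(num_available=8, num_wanted=8)
print(all_ids[0], all_ids[-1])

# Asking for two GPUs when only one exists yields just the first GPU.
print(pick_gpu_ids(num_available=1, num_wanted=2))
```

With MXNet installed, the resulting ids could be turned into devices with `[npx.gpu(i) for i in pick_gpu_ids(npx.num_gpus(), 2)]`. An empty list means that no GPU is available, in which case `npx.cpu()` would have to be used instead.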

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example repeats its definition.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to move parameters that are already loaded to another device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[10:27:16] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 3.866651, -2.572975]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, assuming there are *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
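
To make the idea concrete, here is a minimal plain-Python sketch of data parallelism on a toy one-parameter model. The names `split_batch` and `grad_sum`, the toy loss `0.5 * (w * x - y) ** 2`, and the sample values are all assumptions chosen for illustration; nothing here calls MXNet. The sketch shows that summing the gradients computed on each part and dividing by the full batch size gives the same result as processing the whole batch on one device.

```python
def split_batch(batch, num_parts):
    """Split a list of samples into num_parts contiguous, equal-sized parts."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    step = len(batch) // num_parts
    return [batch[i * step:(i + 1) * step] for i in range(num_parts)]


def grad_sum(samples, w):
    """Sum over samples of the gradient of 0.5 * (w * x - y) ** 2 with respect to w."""
    return sum((w * x - y) * x for x, y in samples)


# A batch of four (x, y) samples and a single weight.
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 4.0)]
w = 0.5

# Each "device" receives one part and computes gradients on it independently.
parts = split_batch(batch, num_parts=2)
per_device = [grad_sum(part, w) for part in parts]

# Combine the per-device gradients and normalize by the full batch size.
combined = sum(per_device) / len(batch)

# Reference: the same gradient computed on the whole batch at once.
full = grad_sum(batch, w) / len(batch)

print(per_device)       # [-5.5, -18.5]
print(combined, full)   # -6.0 -6.0
```

In the training loop of this tutorial, the corresponding pieces are `gluon.utils.split_and_load` for the split, the per-device `backward` calls for the gradients, and `trainer.step(batch_size)` for the normalized update.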

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md), using the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
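
To make explicit what pooling accuracy over per-device lists amounts to, the following plain-Python sketch mimics the calculation. The function `accuracy_over_devices` and the sample values are hypothetical stand-ins written for illustration, not the MXNet implementation; it assumes that each output is a list of per-class scores and that the predicted class is the one with the highest score.

```python
def accuracy_over_devices(label_list, output_list):
    """Fraction of correct predictions, pooled over per-device lists.

    label_list:  one list of integer class labels per device.
    output_list: one list of per-class score lists per device.
    Hypothetical plain-Python stand-in, not the MXNet implementation.
    """
    correct = 0
    total = 0
    for labels, outputs in zip(label_list, output_list):
        for label, scores in zip(labels, outputs):
            # The predicted class is the index of the largest score.
            predicted = max(range(len(scores)), key=lambda i: scores[i])
            correct += int(predicted == label)
            total += 1
    return correct / total if total else 0.0


# Two devices, two samples each, two classes per sample.
labels = [[0, 1], [1, 0]]
outputs = [
    [[2.0, -1.0], [0.3, 0.9]],  # device 0: both predictions correct
    [[1.5, 0.2], [0.8, 0.1]],   # device 1: first wrong, second correct
]
print(accuracy_over_devices(labels, outputs))  # 0.75
```

The per-device results are pooled into a single count, so the reported accuracy does not depend on how the batch was split across devices.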

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7650909243471763 samples/sec                   batch loss = 13.80301547050476 | accuracy = 0.5


Epoch[1] Batch[10] Speed: 1.2408486831249637 samples/sec                   batch loss = 27.588540077209473 | accuracy = 0.55


Epoch[1] Batch[15] Speed: 1.2415711782178958 samples/sec                   batch loss = 41.787606954574585 | accuracy = 0.5333333333333333


Epoch[1] Batch[20] Speed: 1.2422518343840545 samples/sec                   batch loss = 55.83917737007141 | accuracy = 0.525


Epoch[1] Batch[25] Speed: 1.2388384324276867 samples/sec                   batch loss = 71.64052391052246 | accuracy = 0.48


Epoch[1] Batch[30] Speed: 1.2425471644523365 samples/sec                   batch loss = 86.08654117584229 | accuracy = 0.4666666666666667


Epoch[1] Batch[35] Speed: 1.241517981687901 samples/sec                   batch loss = 100.26733756065369 | accuracy = 0.45


Epoch[1] Batch[40] Speed: 1.2434912295603344 samples/sec                   batch loss = 114.70433759689331 | accuracy = 0.45


Epoch[1] Batch[45] Speed: 1.2411865937297988 samples/sec                   batch loss = 128.59148979187012 | accuracy = 0.45555555555555555


Epoch[1] Batch[50] Speed: 1.2387295849810938 samples/sec                   batch loss = 142.7716646194458 | accuracy = 0.46


Epoch[1] Batch[55] Speed: 1.2454788963196153 samples/sec                   batch loss = 157.76846528053284 | accuracy = 0.4636363636363636


Epoch[1] Batch[60] Speed: 1.2442084093819388 samples/sec                   batch loss = 172.08155846595764 | accuracy = 0.4666666666666667


Epoch[1] Batch[65] Speed: 1.250864380371203 samples/sec                   batch loss = 186.13901805877686 | accuracy = 0.4653846153846154


Epoch[1] Batch[70] Speed: 1.2480405747444085 samples/sec                   batch loss = 199.67015409469604 | accuracy = 0.4857142857142857


Epoch[1] Batch[75] Speed: 1.2419956274203223 samples/sec                   batch loss = 213.6150336265564 | accuracy = 0.49


Epoch[1] Batch[80] Speed: 1.2497145961835816 samples/sec                   batch loss = 227.15890669822693 | accuracy = 0.496875


Epoch[1] Batch[85] Speed: 1.250749120267332 samples/sec                   batch loss = 241.08093452453613 | accuracy = 0.5058823529411764


Epoch[1] Batch[90] Speed: 1.244526272726295 samples/sec                   batch loss = 254.7852213382721 | accuracy = 0.5083333333333333


Epoch[1] Batch[95] Speed: 1.245705186505512 samples/sec                   batch loss = 268.789781332016 | accuracy = 0.5078947368421053


Epoch[1] Batch[100] Speed: 1.2478446192014314 samples/sec                   batch loss = 283.1458852291107 | accuracy = 0.505


Epoch[1] Batch[105] Speed: 1.2441547098365564 samples/sec                   batch loss = 296.7635385990143 | accuracy = 0.5119047619047619


Epoch[1] Batch[110] Speed: 1.2487078046470654 samples/sec                   batch loss = 310.3498303890228 | accuracy = 0.5159090909090909


Epoch[1] Batch[115] Speed: 1.2412621691437458 samples/sec                   batch loss = 323.88828349113464 | accuracy = 0.5217391304347826


Epoch[1] Batch[120] Speed: 1.2372215946185057 samples/sec                   batch loss = 337.7505877017975 | accuracy = 0.51875


Epoch[1] Batch[125] Speed: 1.2458548587751066 samples/sec                   batch loss = 351.5334348678589 | accuracy = 0.516


Epoch[1] Batch[130] Speed: 1.2478926973369955 samples/sec                   batch loss = 365.0920832157135 | accuracy = 0.5211538461538462


Epoch[1] Batch[135] Speed: 1.2465277179259329 samples/sec                   batch loss = 379.00218081474304 | accuracy = 0.524074074074074


Epoch[1] Batch[140] Speed: 1.2453836700287213 samples/sec                   batch loss = 393.0914888381958 | accuracy = 0.5214285714285715


Epoch[1] Batch[145] Speed: 1.2376985874984536 samples/sec                   batch loss = 406.5972583293915 | accuracy = 0.5258620689655172


Epoch[1] Batch[150] Speed: 1.2426242861964565 samples/sec                   batch loss = 420.30612444877625 | accuracy = 0.5283333333333333


Epoch[1] Batch[155] Speed: 1.2466452581611784 samples/sec                   batch loss = 434.10001850128174 | accuracy = 0.5274193548387097


Epoch[1] Batch[160] Speed: 1.2432822346986068 samples/sec                   batch loss = 447.5930390357971 | accuracy = 0.5296875


Epoch[1] Batch[165] Speed: 1.2512641319167588 samples/sec                   batch loss = 460.91113209724426 | accuracy = 0.5333333333333333


Epoch[1] Batch[170] Speed: 1.2441670732386496 samples/sec                   batch loss = 474.8105525970459 | accuracy = 0.5323529411764706


Epoch[1] Batch[175] Speed: 1.2438378661868026 samples/sec                   batch loss = 488.8967123031616 | accuracy = 0.5285714285714286


Epoch[1] Batch[180] Speed: 1.2437738713733304 samples/sec                   batch loss = 502.52232813835144 | accuracy = 0.5305555555555556


Epoch[1] Batch[185] Speed: 1.2382804943404218 samples/sec                   batch loss = 515.6696622371674 | accuracy = 0.5364864864864864


Epoch[1] Batch[190] Speed: 1.2469440712983881 samples/sec                   batch loss = 528.8822455406189 | accuracy = 0.5407894736842105


Epoch[1] Batch[195] Speed: 1.248288693902973 samples/sec                   batch loss = 542.7496540546417 | accuracy = 0.541025641025641


Epoch[1] Batch[200] Speed: 1.2459880954971445 samples/sec                   batch loss = 556.7951879501343 | accuracy = 0.54


Epoch[1] Batch[205] Speed: 1.2528656028635194 samples/sec                   batch loss = 569.8661673069 | accuracy = 0.5463414634146342


Epoch[1] Batch[210] Speed: 1.246292148453123 samples/sec                   batch loss = 582.8705570697784 | accuracy = 0.5511904761904762


Epoch[1] Batch[215] Speed: 1.2452743164384115 samples/sec                   batch loss = 597.0988676548004 | accuracy = 0.5511627906976744


Epoch[1] Batch[220] Speed: 1.2449699283934454 samples/sec                   batch loss = 611.2187395095825 | accuracy = 0.5511363636363636


Epoch[1] Batch[225] Speed: 1.24455239938038 samples/sec                   batch loss = 624.4252319335938 | accuracy = 0.5533333333333333


Epoch[1] Batch[230] Speed: 1.2469800311841335 samples/sec                   batch loss = 638.4293739795685 | accuracy = 0.55


Epoch[1] Batch[235] Speed: 1.2485107104348985 samples/sec                   batch loss = 652.4042973518372 | accuracy = 0.5468085106382978


Epoch[1] Batch[240] Speed: 1.2504791441666052 samples/sec                   batch loss = 666.0667941570282 | accuracy = 0.54375


Epoch[1] Batch[245] Speed: 1.246186985806858 samples/sec                   batch loss = 680.1642301082611 | accuracy = 0.5408163265306123


Epoch[1] Batch[250] Speed: 1.2468056265680552 samples/sec                   batch loss = 694.3499202728271 | accuracy = 0.539


Epoch[1] Batch[255] Speed: 1.2451138796609256 samples/sec                   batch loss = 707.9682130813599 | accuracy = 0.5372549019607843


Epoch[1] Batch[260] Speed: 1.2458543036817094 samples/sec                   batch loss = 721.7634701728821 | accuracy = 0.5403846153846154


Epoch[1] Batch[265] Speed: 1.2512087951753497 samples/sec                   batch loss = 734.9966833591461 | accuracy = 0.5424528301886793


Epoch[1] Batch[270] Speed: 1.2495146714236538 samples/sec                   batch loss = 748.2313530445099 | accuracy = 0.5444444444444444


Epoch[1] Batch[275] Speed: 1.2486702580948918 samples/sec                   batch loss = 762.1458122730255 | accuracy = 0.5463636363636364


Epoch[1] Batch[280] Speed: 1.2487911772182096 samples/sec                   batch loss = 775.0839025974274 | accuracy = 0.5473214285714286


Epoch[1] Batch[285] Speed: 1.2476373130743301 samples/sec                   batch loss = 788.5152535438538 | accuracy = 0.5491228070175439


Epoch[1] Batch[290] Speed: 1.2511324702318276 samples/sec                   batch loss = 801.8554363250732 | accuracy = 0.5517241379310345


Epoch[1] Batch[295] Speed: 1.2440777670215002 samples/sec                   batch loss = 815.3819043636322 | accuracy = 0.5508474576271186


Epoch[1] Batch[300] Speed: 1.2455768189237337 samples/sec                   batch loss = 827.8430125713348 | accuracy = 0.5558333333333333


Epoch[1] Batch[305] Speed: 1.2487078975868773 samples/sec                   batch loss = 840.7328402996063 | accuracy = 0.559016393442623


Epoch[1] Batch[310] Speed: 1.2508340712465973 samples/sec                   batch loss = 854.4546051025391 | accuracy = 0.5588709677419355


Epoch[1] Batch[315] Speed: 1.2477962663578244 samples/sec                   batch loss = 868.8019602298737 | accuracy = 0.5563492063492064


Epoch[1] Batch[320] Speed: 1.243173710411907 samples/sec                   batch loss = 882.8110997676849 | accuracy = 0.55546875


Epoch[1] Batch[325] Speed: 1.2425917981850112 samples/sec                   batch loss = 896.622061252594 | accuracy = 0.5553846153846154


Epoch[1] Batch[330] Speed: 1.2430528635562368 samples/sec                   batch loss = 910.1321487426758 | accuracy = 0.5568181818181818


Epoch[1] Batch[335] Speed: 1.2400032904712255 samples/sec                   batch loss = 923.2075939178467 | accuracy = 0.5567164179104478


Epoch[1] Batch[340] Speed: 1.2412464655967488 samples/sec                   batch loss = 936.8227727413177 | accuracy = 0.5566176470588236


Epoch[1] Batch[345] Speed: 1.2443373256637393 samples/sec                   batch loss = 951.3768508434296 | accuracy = 0.5550724637681159


Epoch[1] Batch[350] Speed: 1.2446627342744931 samples/sec                   batch loss = 964.4730472564697 | accuracy = 0.5571428571428572


Epoch[1] Batch[355] Speed: 1.2511010285563098 samples/sec                   batch loss = 978.7543482780457 | accuracy = 0.5563380281690141


Epoch[1] Batch[360] Speed: 1.2447300527844216 samples/sec                   batch loss = 992.0780231952667 | accuracy = 0.5569444444444445


Epoch[1] Batch[365] Speed: 1.2430693496623781 samples/sec                   batch loss = 1005.1689839363098 | accuracy = 0.5595890410958904


Epoch[1] Batch[370] Speed: 1.2413004654243414 samples/sec                   batch loss = 1017.5180975198746 | accuracy = 0.5621621621621622


Epoch[1] Batch[375] Speed: 1.2437326562983828 samples/sec                   batch loss = 1031.1811026334763 | accuracy = 0.5606666666666666


Epoch[1] Batch[380] Speed: 1.2422082367926282 samples/sec                   batch loss = 1044.438864827156 | accuracy = 0.5598684210526316


Epoch[1] Batch[385] Speed: 1.2491629108002544 samples/sec                   batch loss = 1058.0986198186874 | accuracy = 0.5603896103896104


Epoch[1] Batch[390] Speed: 1.2539257827798302 samples/sec                   batch loss = 1071.2234259843826 | accuracy = 0.5615384615384615


Epoch[1] Batch[395] Speed: 1.2493230903612282 samples/sec                   batch loss = 1084.733278632164 | accuracy = 0.5626582278481013


Epoch[1] Batch[400] Speed: 1.2474464927292956 samples/sec                   batch loss = 1097.4086743593216 | accuracy = 0.563125


Epoch[1] Batch[405] Speed: 1.2388342245239057 samples/sec                   batch loss = 1111.133400797844 | accuracy = 0.5635802469135802


Epoch[1] Batch[410] Speed: 1.241902863527462 samples/sec                   batch loss = 1123.9615067243576 | accuracy = 0.5652439024390243


Epoch[1] Batch[415] Speed: 1.2388628571293545 samples/sec                   batch loss = 1136.4519857168198 | accuracy = 0.5662650602409639


Epoch[1] Batch[420] Speed: 1.2398182795797457 samples/sec                   batch loss = 1150.6796153783798 | accuracy = 0.5660714285714286


Epoch[1] Batch[425] Speed: 1.2342270433562186 samples/sec                   batch loss = 1164.4138296842575 | accuracy = 0.5658823529411765


Epoch[1] Batch[430] Speed: 1.2384001407788245 samples/sec                   batch loss = 1178.0567930936813 | accuracy = 0.5668604651162791


Epoch[1] Batch[435] Speed: 1.2345834326143352 samples/sec                   batch loss = 1191.868743777275 | accuracy = 0.5666666666666667


Epoch[1] Batch[440] Speed: 1.2353235700560892 samples/sec                   batch loss = 1204.8505736589432 | accuracy = 0.5670454545454545


Epoch[1] Batch[445] Speed: 1.2328243769970586 samples/sec                   batch loss = 1218.576424241066 | accuracy = 0.5668539325842696


Epoch[1] Batch[450] Speed: 1.2402206273600813 samples/sec                   batch loss = 1232.5916584730148 | accuracy = 0.5666666666666667


Epoch[1] Batch[455] Speed: 1.2439643075741302 samples/sec                   batch loss = 1246.364993929863 | accuracy = 0.5653846153846154


Epoch[1] Batch[460] Speed: 1.2406169063212082 samples/sec                   batch loss = 1260.316577076912 | accuracy = 0.5652173913043478


Epoch[1] Batch[465] Speed: 1.2427738634028587 samples/sec                   batch loss = 1274.3917516469955 | accuracy = 0.5650537634408602


Epoch[1] Batch[470] Speed: 1.2373163980182542 samples/sec                   batch loss = 1287.9273487329483 | accuracy = 0.5648936170212766


Epoch[1] Batch[475] Speed: 1.240447853802197 samples/sec                   batch loss = 1301.6620181798935 | accuracy = 0.5647368421052632


Epoch[1] Batch[480] Speed: 1.236196657146815 samples/sec                   batch loss = 1314.1571635007858 | accuracy = 0.5661458333333333


Epoch[1] Batch[485] Speed: 1.2447910982222279 samples/sec                   batch loss = 1327.3013993501663 | accuracy = 0.5664948453608247


Epoch[1] Batch[490] Speed: 1.2372607368984943 samples/sec                   batch loss = 1340.9011505842209 | accuracy = 0.5668367346938775


Epoch[1] Batch[495] Speed: 1.2402783889022941 samples/sec                   batch loss = 1353.1447209119797 | accuracy = 0.5691919191919191


Epoch[1] Batch[500] Speed: 1.2400219870651192 samples/sec                   batch loss = 1366.3836597204208 | accuracy = 0.5685


Epoch[1] Batch[505] Speed: 1.2471055363131465 samples/sec                   batch loss = 1378.564441561699 | accuracy = 0.5693069306930693


Epoch[1] Batch[510] Speed: 1.247679993574646 samples/sec                   batch loss = 1392.2041708230972 | accuracy = 0.5696078431372549


Epoch[1] Batch[515] Speed: 1.2454580932168506 samples/sec                   batch loss = 1406.5943096876144 | accuracy = 0.5694174757281554


Epoch[1] Batch[520] Speed: 1.2463207565040622 samples/sec                   batch loss = 1419.3124624490738 | accuracy = 0.5706730769230769


Epoch[1] Batch[525] Speed: 1.2470295257587776 samples/sec                   batch loss = 1433.3159943819046 | accuracy = 0.5714285714285714


Epoch[1] Batch[530] Speed: 1.2398128739500394 samples/sec                   batch loss = 1446.2513366937637 | accuracy = 0.5712264150943396


Epoch[1] Batch[535] Speed: 1.2449775963314706 samples/sec                   batch loss = 1459.6877292394638 | accuracy = 0.5710280373831775


Epoch[1] Batch[540] Speed: 1.2447982098048207 samples/sec                   batch loss = 1472.9428533315659 | accuracy = 0.5708333333333333


Epoch[1] Batch[545] Speed: 1.2518651259355476 samples/sec                   batch loss = 1484.761907696724 | accuracy = 0.5738532110091743


Epoch[1] Batch[550] Speed: 1.2478612326556253 samples/sec                   batch loss = 1498.4683619737625 | accuracy = 0.5740909090909091


Epoch[1] Batch[555] Speed: 1.247402158780198 samples/sec                   batch loss = 1512.4109565019608 | accuracy = 0.5738738738738739


Epoch[1] Batch[560] Speed: 1.246606353501565 samples/sec                   batch loss = 1526.4527198076248 | accuracy = 0.5723214285714285


Epoch[1] Batch[565] Speed: 1.2412962407786277 samples/sec                   batch loss = 1538.8217269182205 | accuracy = 0.5725663716814159


Epoch[1] Batch[570] Speed: 1.2459038939769265 samples/sec                   batch loss = 1552.0439332723618 | accuracy = 0.5728070175438597


Epoch[1] Batch[575] Speed: 1.2416446869696185 samples/sec                   batch loss = 1563.883267879486 | accuracy = 0.5743478260869566


Epoch[1] Batch[580] Speed: 1.2454022518797894 samples/sec                   batch loss = 1578.4137649536133 | accuracy = 0.5737068965517241


Epoch[1] Batch[585] Speed: 1.2522619656206753 samples/sec                   batch loss = 1591.7832317352295 | accuracy = 0.5735042735042735


Epoch[1] Batch[590] Speed: 1.2503223952461382 samples/sec                   batch loss = 1604.7888524532318 | accuracy = 0.5733050847457627


Epoch[1] Batch[595] Speed: 1.2518313122876366 samples/sec                   batch loss = 1618.3275980949402 | accuracy = 0.5726890756302521


Epoch[1] Batch[600] Speed: 1.2570743520272376 samples/sec                   batch loss = 1631.9331703186035 | accuracy = 0.5729166666666666


Epoch[1] Batch[605] Speed: 1.2489449391135083 samples/sec                   batch loss = 1643.9198914766312 | accuracy = 0.574793388429752


Epoch[1] Batch[610] Speed: 1.256934213822757 samples/sec                   batch loss = 1656.585888504982 | accuracy = 0.575


Epoch[1] Batch[615] Speed: 1.2454109420899904 samples/sec                   batch loss = 1669.7039910554886 | accuracy = 0.5752032520325203


Epoch[1] Batch[620] Speed: 1.2478013706077802 samples/sec                   batch loss = 1682.698427081108 | accuracy = 0.5762096774193548


Epoch[1] Batch[625] Speed: 1.2507957439645054 samples/sec                   batch loss = 1694.815730214119 | accuracy = 0.5768


Epoch[1] Batch[630] Speed: 1.2506892605366065 samples/sec                   batch loss = 1708.0949152708054 | accuracy = 0.5761904761904761


Epoch[1] Batch[635] Speed: 1.2466127448146644 samples/sec                   batch loss = 1720.314408659935 | accuracy = 0.5771653543307087


Epoch[1] Batch[640] Speed: 1.2479983337424592 samples/sec                   batch loss = 1733.3933998346329 | accuracy = 0.57734375


Epoch[1] Batch[645] Speed: 1.2540930920221287 samples/sec                   batch loss = 1747.697349190712 | accuracy = 0.5771317829457364


Epoch[1] Batch[650] Speed: 1.2558100019933116 samples/sec                   batch loss = 1759.6738830804825 | accuracy = 0.5784615384615385


Epoch[1] Batch[655] Speed: 1.2524874549434692 samples/sec                   batch loss = 1772.0093120336533 | accuracy = 0.5797709923664122


Epoch[1] Batch[660] Speed: 1.2490316909141614 samples/sec                   batch loss = 1785.632872223854 | accuracy = 0.5799242424242425


Epoch[1] Batch[665] Speed: 1.2534517481455847 samples/sec                   batch loss = 1798.6738353967667 | accuracy = 0.5796992481203007


Epoch[1] Batch[670] Speed: 1.2460024386487052 samples/sec                   batch loss = 1811.3285511732101 | accuracy = 0.5802238805970149


Epoch[1] Batch[675] Speed: 1.2506728513930245 samples/sec                   batch loss = 1822.9837839603424 | accuracy = 0.5814814814814815


Epoch[1] Batch[680] Speed: 1.246154496359741 samples/sec                   batch loss = 1836.7101271152496 | accuracy = 0.5808823529411765


Epoch[1] Batch[685] Speed: 1.2477984008572711 samples/sec                   batch loss = 1849.2647161483765 | accuracy = 0.5806569343065694


Epoch[1] Batch[690] Speed: 1.2479809739839076 samples/sec                   batch loss = 1863.557879447937 | accuracy = 0.5800724637681159


Epoch[1] Batch[695] Speed: 1.2504342216119237 samples/sec                   batch loss = 1876.3824775218964 | accuracy = 0.5809352517985612


Epoch[1] Batch[700] Speed: 1.2439250166878952 samples/sec                   batch loss = 1891.5442118644714 | accuracy = 0.5803571428571429


Epoch[1] Batch[705] Speed: 1.2555920537620278 samples/sec                   batch loss = 1904.7678322792053 | accuracy = 0.5801418439716312


Epoch[1] Batch[710] Speed: 1.2495857732726303 samples/sec                   batch loss = 1917.4424904584885 | accuracy = 0.5816901408450704


Epoch[1] Batch[715] Speed: 1.2518734395154878 samples/sec                   batch loss = 1929.5698239803314 | accuracy = 0.5821678321678322


Epoch[1] Batch[720] Speed: 1.2543005799897338 samples/sec                   batch loss = 1943.7760200500488 | accuracy = 0.5819444444444445


Epoch[1] Batch[725] Speed: 1.2474526144008322 samples/sec                   batch loss = 1957.7029564380646 | accuracy = 0.5813793103448276


Epoch[1] Batch[730] Speed: 1.2435166675931588 samples/sec                   batch loss = 1968.7307364940643 | accuracy = 0.5832191780821918


Epoch[1] Batch[735] Speed: 1.2453731313148408 samples/sec                   batch loss = 1981.7898695468903 | accuracy = 0.5833333333333334


Epoch[1] Batch[740] Speed: 1.2453266336805708 samples/sec                   batch loss = 1993.6677870750427 | accuracy = 0.5844594594594594


Epoch[1] Batch[745] Speed: 1.2457775205284816 samples/sec                   batch loss = 2006.0058212280273 | accuracy = 0.5852348993288591


Epoch[1] Batch[750] Speed: 1.2501331746886053 samples/sec                   batch loss = 2018.8352259397507 | accuracy = 0.585


Epoch[1] Batch[755] Speed: 1.2461700466529695 samples/sec                   batch loss = 2031.5484846830368 | accuracy = 0.5850993377483443


Epoch[1] Batch[760] Speed: 1.2497367519330185 samples/sec                   batch loss = 2044.3158241510391 | accuracy = 0.5855263157894737


Epoch[1] Batch[765] Speed: 1.245522261465736 samples/sec                   batch loss = 2057.042708516121 | accuracy = 0.5859477124183007


Epoch[1] Batch[770] Speed: 1.2480239565145461 samples/sec                   batch loss = 2071.422014594078 | accuracy = 0.586038961038961


Epoch[1] Batch[775] Speed: 1.2365821653935545 samples/sec                   batch loss = 2084.739803671837 | accuracy = 0.5861290322580646


Epoch[1] Batch[780] Speed: 1.2411644646397706 samples/sec                   batch loss = 2095.279389023781 | accuracy = 0.5868589743589744


Epoch[1] Batch[785] Speed: 1.2443750735678987 samples/sec                   batch loss = 2106.5286297798157 | accuracy = 0.5878980891719745


[Epoch 1] training: accuracy=0.5885152284263959
[Epoch 1] time cost: 650.5266296863556
[Epoch 1] validation: validation accuracy=0.6844444444444444


Epoch[2] Batch[5] Speed: 1.2560721287719019 samples/sec                   batch loss = 9.956449151039124 | accuracy = 0.85


Epoch[2] Batch[10] Speed: 1.2534326443838393 samples/sec                   batch loss = 21.47770130634308 | accuracy = 0.7


Epoch[2] Batch[15] Speed: 1.2520685133935725 samples/sec                   batch loss = 34.550150752067566 | accuracy = 0.6833333333333333


Epoch[2] Batch[20] Speed: 1.25636898685933 samples/sec                   batch loss = 45.764949798583984 | accuracy = 0.7125


Epoch[2] Batch[25] Speed: 1.2472053837662591 samples/sec                   batch loss = 59.31677722930908 | accuracy = 0.66


Epoch[2] Batch[30] Speed: 1.2493212297356517 samples/sec                   batch loss = 71.12079417705536 | accuracy = 0.675


Epoch[2] Batch[35] Speed: 1.255835476517444 samples/sec                   batch loss = 81.06557607650757 | accuracy = 0.6928571428571428


Epoch[2] Batch[40] Speed: 1.2495823296720256 samples/sec                   batch loss = 93.7652736902237 | accuracy = 0.6875


Epoch[2] Batch[45] Speed: 1.2519882529388175 samples/sec                   batch loss = 106.75615632534027 | accuracy = 0.6777777777777778


Epoch[2] Batch[50] Speed: 1.244584343705898 samples/sec                   batch loss = 119.89782011508942 | accuracy = 0.675


Epoch[2] Batch[55] Speed: 1.2518188895001452 samples/sec                   batch loss = 134.00385808944702 | accuracy = 0.6681818181818182


Epoch[2] Batch[60] Speed: 1.2509325579545185 samples/sec                   batch loss = 147.12284803390503 | accuracy = 0.6625


Epoch[2] Batch[65] Speed: 1.2487269505404208 samples/sec                   batch loss = 159.6483178138733 | accuracy = 0.6653846153846154


Epoch[2] Batch[70] Speed: 1.2475670821724685 samples/sec                   batch loss = 173.12582087516785 | accuracy = 0.6785714285714286


Epoch[2] Batch[75] Speed: 1.2425832393032528 samples/sec                   batch loss = 184.9779918193817 | accuracy = 0.6766666666666666


Epoch[2] Batch[80] Speed: 1.2482469933311122 samples/sec                   batch loss = 199.58518242835999 | accuracy = 0.6625


Epoch[2] Batch[85] Speed: 1.245018154827743 samples/sec                   batch loss = 211.37602579593658 | accuracy = 0.6617647058823529


Epoch[2] Batch[90] Speed: 1.2502248430793155 samples/sec                   batch loss = 223.18482542037964 | accuracy = 0.6611111111111111


Epoch[2] Batch[95] Speed: 1.246737527490554 samples/sec                   batch loss = 234.51321876049042 | accuracy = 0.6578947368421053


Epoch[2] Batch[100] Speed: 1.2509339570251523 samples/sec                   batch loss = 248.3683886528015 | accuracy = 0.65


Epoch[2] Batch[105] Speed: 1.2465239206993046 samples/sec                   batch loss = 260.79820585250854 | accuracy = 0.6476190476190476


Epoch[2] Batch[110] Speed: 1.2483233381449128 samples/sec                   batch loss = 272.1531310081482 | accuracy = 0.6545454545454545


Epoch[2] Batch[115] Speed: 1.2478461969965056 samples/sec                   batch loss = 283.47072541713715 | accuracy = 0.6565217391304348


Epoch[2] Batch[120] Speed: 1.2469916166406017 samples/sec                   batch loss = 295.4585005044937 | accuracy = 0.65625


Epoch[2] Batch[125] Speed: 1.2471208322396914 samples/sec                   batch loss = 308.232542514801 | accuracy = 0.654


Epoch[2] Batch[130] Speed: 1.2501810567034797 samples/sec                   batch loss = 322.2406806945801 | accuracy = 0.6519230769230769


Epoch[2] Batch[135] Speed: 1.246105811641763 samples/sec                   batch loss = 336.02516984939575 | accuracy = 0.65


Epoch[2] Batch[140] Speed: 1.2471327911345216 samples/sec                   batch loss = 346.98084700107574 | accuracy = 0.6535714285714286


Epoch[2] Batch[145] Speed: 1.251599990003443 samples/sec                   batch loss = 359.21286153793335 | accuracy = 0.6586206896551724


Epoch[2] Batch[150] Speed: 1.2472160462310726 samples/sec                   batch loss = 370.2395431995392 | accuracy = 0.66


Epoch[2] Batch[155] Speed: 1.245475567776472 samples/sec                   batch loss = 381.35877788066864 | accuracy = 0.6661290322580645


Epoch[2] Batch[160] Speed: 1.24892615842421 samples/sec                   batch loss = 394.6761482954025 | accuracy = 0.665625


Epoch[2] Batch[165] Speed: 1.2464326089971856 samples/sec                   batch loss = 404.4070839881897 | accuracy = 0.6696969696969697


Epoch[2] Batch[170] Speed: 1.2521447656850626 samples/sec                   batch loss = 416.6704376935959 | accuracy = 0.6720588235294118


Epoch[2] Batch[175] Speed: 1.2552742425037104 samples/sec                   batch loss = 427.85290467739105 | accuracy = 0.6742857142857143


Epoch[2] Batch[180] Speed: 1.247369884163454 samples/sec                   batch loss = 440.92072665691376 | accuracy = 0.6722222222222223


Epoch[2] Batch[185] Speed: 1.2407149834021287 samples/sec                   batch loss = 451.65712904930115 | accuracy = 0.6756756756756757


Epoch[2] Batch[190] Speed: 1.2381262403150175 samples/sec                   batch loss = 464.8359237909317 | accuracy = 0.675


Epoch[2] Batch[195] Speed: 1.2392995539849576 samples/sec                   batch loss = 476.0997533798218 | accuracy = 0.676923076923077


Epoch[2] Batch[200] Speed: 1.2438769671234287 samples/sec                   batch loss = 488.52544260025024 | accuracy = 0.67875


Epoch[2] Batch[205] Speed: 1.2457761329682016 samples/sec                   batch loss = 503.0742769241333 | accuracy = 0.6719512195121952


Epoch[2] Batch[210] Speed: 1.24267426398959 samples/sec                   batch loss = 514.2607290744781 | accuracy = 0.6738095238095239


Epoch[2] Batch[215] Speed: 1.240050033013126 samples/sec                   batch loss = 525.8826514482498 | accuracy = 0.6744186046511628


Epoch[2] Batch[220] Speed: 1.2528136793514815 samples/sec                   batch loss = 535.9667794704437 | accuracy = 0.6772727272727272


Epoch[2] Batch[225] Speed: 1.2440793353052075 samples/sec                   batch loss = 552.6095597743988 | accuracy = 0.6733333333333333


Epoch[2] Batch[230] Speed: 1.2473490178433222 samples/sec                   batch loss = 565.2728507518768 | accuracy = 0.6728260869565217


Epoch[2] Batch[235] Speed: 1.2502870809082645 samples/sec                   batch loss = 578.2764689922333 | accuracy = 0.6702127659574468


Epoch[2] Batch[240] Speed: 1.247634715225112 samples/sec                   batch loss = 588.9286665916443 | accuracy = 0.6729166666666667


Epoch[2] Batch[245] Speed: 1.2499059248741486 samples/sec                   batch loss = 598.6572731733322 | accuracy = 0.6744897959183673


Epoch[2] Batch[250] Speed: 1.2501425831218245 samples/sec                   batch loss = 608.7797865867615 | accuracy = 0.677


Epoch[2] Batch[255] Speed: 1.2522843987317136 samples/sec                   batch loss = 618.7932466268539 | accuracy = 0.6774509803921569


Epoch[2] Batch[260] Speed: 1.2523151520942064 samples/sec                   batch loss = 630.1145278215408 | accuracy = 0.6788461538461539


Epoch[2] Batch[265] Speed: 1.2489493089588497 samples/sec                   batch loss = 641.0496475696564 | accuracy = 0.6792452830188679


Epoch[2] Batch[270] Speed: 1.250108676165592 samples/sec                   batch loss = 652.1690149307251 | accuracy = 0.6787037037037037


Epoch[2] Batch[275] Speed: 1.252237570524691 samples/sec                   batch loss = 663.933820605278 | accuracy = 0.6781818181818182


Epoch[2] Batch[280] Speed: 1.2620714827121677 samples/sec                   batch loss = 675.9836200475693 | accuracy = 0.6776785714285715


Epoch[2] Batch[285] Speed: 1.2444554685462497 samples/sec                   batch loss = 690.2683811187744 | accuracy = 0.6736842105263158


Epoch[2] Batch[290] Speed: 1.2439251089171162 samples/sec                   batch loss = 700.7272682189941 | accuracy = 0.6758620689655173


Epoch[2] Batch[295] Speed: 1.2449813841429394 samples/sec                   batch loss = 712.4951890707016 | accuracy = 0.673728813559322


Epoch[2] Batch[300] Speed: 1.2398001388392748 samples/sec                   batch loss = 724.5050228238106 | accuracy = 0.6733333333333333


Epoch[2] Batch[305] Speed: 1.2442159756698357 samples/sec                   batch loss = 738.7951354384422 | accuracy = 0.671311475409836


Epoch[2] Batch[310] Speed: 1.2415200947619065 samples/sec                   batch loss = 751.1407759785652 | accuracy = 0.6717741935483871


Epoch[2] Batch[315] Speed: 1.237433576479963 samples/sec                   batch loss = 762.717181622982 | accuracy = 0.6714285714285714


Epoch[2] Batch[320] Speed: 1.2451054708088607 samples/sec                   batch loss = 774.9273309111595 | accuracy = 0.6734375


Epoch[2] Batch[325] Speed: 1.2451697875542218 samples/sec                   batch loss = 789.3297256827354 | accuracy = 0.6730769230769231


Epoch[2] Batch[330] Speed: 1.2381349206524916 samples/sec                   batch loss = 801.568788588047 | accuracy = 0.671969696969697


Epoch[2] Batch[335] Speed: 1.2407613208121369 samples/sec                   batch loss = 813.8871783614159 | accuracy = 0.6723880597014925


Epoch[2] Batch[340] Speed: 1.245742092457421 samples/sec                   batch loss = 825.601142346859 | accuracy = 0.6705882352941176


Epoch[2] Batch[345] Speed: 1.2417124146107044 samples/sec                   batch loss = 839.6446043848991 | accuracy = 0.6702898550724637


Epoch[2] Batch[350] Speed: 1.2410134386037273 samples/sec                   batch loss = 851.184519469738 | accuracy = 0.6707142857142857


Epoch[2] Batch[355] Speed: 1.2420086835106945 samples/sec                   batch loss = 864.054821908474 | accuracy = 0.6697183098591549


Epoch[2] Batch[360] Speed: 1.2433529054338424 samples/sec                   batch loss = 876.028810441494 | accuracy = 0.66875


Epoch[2] Batch[365] Speed: 1.2399600338466226 samples/sec                   batch loss = 887.6619331240654 | accuracy = 0.6698630136986301


Epoch[2] Batch[370] Speed: 1.2398458581870195 samples/sec                   batch loss = 902.749454677105 | accuracy = 0.6682432432432432


Epoch[2] Batch[375] Speed: 1.2445103018490202 samples/sec                   batch loss = 916.2167446017265 | accuracy = 0.668


Epoch[2] Batch[380] Speed: 1.2421701603872246 samples/sec                   batch loss = 927.7370597720146 | accuracy = 0.6684210526315789


Epoch[2] Batch[385] Speed: 1.2463106648410163 samples/sec                   batch loss = 939.1551260352135 | accuracy = 0.6681818181818182


Epoch[2] Batch[390] Speed: 1.2412522510678354 samples/sec                   batch loss = 953.059950530529 | accuracy = 0.6679487179487179


Epoch[2] Batch[395] Speed: 1.2458941791492115 samples/sec                   batch loss = 964.8258470892906 | accuracy = 0.6677215189873418


Epoch[2] Batch[400] Speed: 1.2438029171944254 samples/sec                   batch loss = 977.5735751986504 | accuracy = 0.666875


Epoch[2] Batch[405] Speed: 1.2434975889709698 samples/sec                   batch loss = 992.1562914252281 | accuracy = 0.6660493827160494


Epoch[2] Batch[410] Speed: 1.2444563916258369 samples/sec                   batch loss = 1002.3886477351189 | accuracy = 0.6670731707317074


Epoch[2] Batch[415] Speed: 1.2423517340754322 samples/sec                   batch loss = 1014.8454449772835 | accuracy = 0.6668674698795181


Epoch[2] Batch[420] Speed: 1.2489457758900364 samples/sec                   batch loss = 1028.2977731823921 | accuracy = 0.6666666666666666


Epoch[2] Batch[425] Speed: 1.2381302606667162 samples/sec                   batch loss = 1041.4942860007286 | accuracy = 0.6652941176470588


Epoch[2] Batch[430] Speed: 1.241453490454317 samples/sec                   batch loss = 1051.879863679409 | accuracy = 0.6656976744186046


Epoch[2] Batch[435] Speed: 1.238574487554432 samples/sec                   batch loss = 1062.3838198781013 | accuracy = 0.6672413793103448


Epoch[2] Batch[440] Speed: 1.2430831651718914 samples/sec                   batch loss = 1073.9848131537437 | accuracy = 0.6670454545454545


Epoch[2] Batch[445] Speed: 1.244623399256648 samples/sec                   batch loss = 1087.1204156279564 | accuracy = 0.6668539325842696


Epoch[2] Batch[450] Speed: 1.2407809579072495 samples/sec                   batch loss = 1098.4182042479515 | accuracy = 0.6677777777777778


Epoch[2] Batch[455] Speed: 1.2389216816182855 samples/sec                   batch loss = 1111.3780570626259 | accuracy = 0.6681318681318681


Epoch[2] Batch[460] Speed: 1.2405958066375495 samples/sec                   batch loss = 1124.9415583014488 | accuracy = 0.6668478260869565


Epoch[2] Batch[465] Speed: 1.2382053729465388 samples/sec                   batch loss = 1134.39831584692 | accuracy = 0.6682795698924732


Epoch[2] Batch[470] Speed: 1.239616837203838 samples/sec                   batch loss = 1149.6618112921715 | accuracy = 0.6659574468085107


Epoch[2] Batch[475] Speed: 1.2360747040130247 samples/sec                   batch loss = 1159.7749986052513 | accuracy = 0.6673684210526316


Epoch[2] Batch[480] Speed: 1.2426472957307686 samples/sec                   batch loss = 1170.303707063198 | accuracy = 0.6677083333333333


Epoch[2] Batch[485] Speed: 1.2440402216449682 samples/sec                   batch loss = 1184.267170369625 | accuracy = 0.6675257731958762


Epoch[2] Batch[490] Speed: 1.2455955914932642 samples/sec                   batch loss = 1197.4069640040398 | accuracy = 0.6663265306122449


Epoch[2] Batch[495] Speed: 1.2425109076567147 samples/sec                   batch loss = 1211.6858264803886 | accuracy = 0.6651515151515152


Epoch[2] Batch[500] Speed: 1.2414054479216883 samples/sec                   batch loss = 1224.1231445670128 | accuracy = 0.6655


Epoch[2] Batch[505] Speed: 1.2383958444408243 samples/sec                   batch loss = 1236.7278025746346 | accuracy = 0.6653465346534654


Epoch[2] Batch[510] Speed: 1.2422516504215435 samples/sec                   batch loss = 1250.1176021695137 | accuracy = 0.6651960784313725


Epoch[2] Batch[515] Speed: 1.2438771515676343 samples/sec                   batch loss = 1260.149984896183 | accuracy = 0.6669902912621359


Epoch[2] Batch[520] Speed: 1.234233126767286 samples/sec                   batch loss = 1273.5893002152443 | accuracy = 0.666826923076923


Epoch[2] Batch[525] Speed: 1.2369243218789978 samples/sec                   batch loss = 1283.6462588906288 | accuracy = 0.6676190476190477


Epoch[2] Batch[530] Speed: 1.2416626979413323 samples/sec                   batch loss = 1294.8992382884026 | accuracy = 0.6679245283018868


Epoch[2] Batch[535] Speed: 1.2373567326473591 samples/sec                   batch loss = 1303.3451798558235 | accuracy = 0.6696261682242991


Epoch[2] Batch[540] Speed: 1.2388835319740372 samples/sec                   batch loss = 1314.2972411513329 | accuracy = 0.6699074074074074


Epoch[2] Batch[545] Speed: 1.239239137500074 samples/sec                   batch loss = 1327.6253027319908 | accuracy = 0.6697247706422018


Epoch[2] Batch[550] Speed: 1.2421366845098585 samples/sec                   batch loss = 1338.1802445054054 | accuracy = 0.6704545454545454


Epoch[2] Batch[555] Speed: 1.2380986467874828 samples/sec                   batch loss = 1347.8891332745552 | accuracy = 0.6716216216216216


Epoch[2] Batch[560] Speed: 1.236714246540606 samples/sec                   batch loss = 1361.9496195912361 | accuracy = 0.671875


Epoch[2] Batch[565] Speed: 1.2355735736036513 samples/sec                   batch loss = 1378.5179317593575 | accuracy = 0.6703539823008849


Epoch[2] Batch[570] Speed: 1.2285025620139773 samples/sec                   batch loss = 1391.2639853358269 | accuracy = 0.6697368421052632


Epoch[2] Batch[575] Speed: 1.236029444484228 samples/sec                   batch loss = 1401.6611832976341 | accuracy = 0.67


Epoch[2] Batch[580] Speed: 1.2362470301752542 samples/sec                   batch loss = 1414.237987458706 | accuracy = 0.6693965517241379


Epoch[2] Batch[585] Speed: 1.2257866775792936 samples/sec                   batch loss = 1428.9652653336525 | accuracy = 0.6679487179487179


Epoch[2] Batch[590] Speed: 1.2346552984958035 samples/sec                   batch loss = 1439.8940506577492 | accuracy = 0.6690677966101695


Epoch[2] Batch[595] Speed: 1.2394542836604987 samples/sec                   batch loss = 1452.828599512577 | accuracy = 0.6680672268907563


Epoch[2] Batch[600] Speed: 1.23284811215812 samples/sec                   batch loss = 1464.4465431571007 | accuracy = 0.6679166666666667


Epoch[2] Batch[605] Speed: 1.2342609114668108 samples/sec                   batch loss = 1479.0552986264229 | accuracy = 0.6665289256198347


Epoch[2] Batch[610] Speed: 1.2343296521938223 samples/sec                   batch loss = 1490.7964497208595 | accuracy = 0.6672131147540984


Epoch[2] Batch[615] Speed: 1.2350793048073465 samples/sec                   batch loss = 1501.4036211371422 | accuracy = 0.6686991869918699


Epoch[2] Batch[620] Speed: 1.2294417950307945 samples/sec                   batch loss = 1513.4575110077858 | accuracy = 0.6689516129032258


Epoch[2] Batch[625] Speed: 1.23425746101126 samples/sec                   batch loss = 1525.018444597721 | accuracy = 0.6696


Epoch[2] Batch[630] Speed: 1.2317134714696487 samples/sec                   batch loss = 1535.6953988671303 | accuracy = 0.6694444444444444


Epoch[2] Batch[635] Speed: 1.2333804908874961 samples/sec                   batch loss = 1544.0761323571205 | accuracy = 0.6708661417322834


Epoch[2] Batch[640] Speed: 1.2377859756382774 samples/sec                   batch loss = 1554.0565281510353 | accuracy = 0.671484375


Epoch[2] Batch[645] Speed: 1.2388632230498107 samples/sec                   batch loss = 1563.4561168551445 | accuracy = 0.6724806201550387


Epoch[2] Batch[650] Speed: 1.2346350371033334 samples/sec                   batch loss = 1575.1096858382225 | accuracy = 0.6730769230769231


Epoch[2] Batch[655] Speed: 1.233763700046248 samples/sec                   batch loss = 1587.3257619738579 | accuracy = 0.6732824427480916


Epoch[2] Batch[660] Speed: 1.2399984331150782 samples/sec                   batch loss = 1598.1076006293297 | accuracy = 0.6727272727272727


Epoch[2] Batch[665] Speed: 1.2332532002276972 samples/sec                   batch loss = 1608.6900416016579 | accuracy = 0.6733082706766917


Epoch[2] Batch[670] Speed: 1.2339224047328115 samples/sec                   batch loss = 1622.1647390723228 | accuracy = 0.6716417910447762


Epoch[2] Batch[675] Speed: 1.2338381927022024 samples/sec                   batch loss = 1632.0992652773857 | accuracy = 0.6725925925925926


Epoch[2] Batch[680] Speed: 1.238192122551471 samples/sec                   batch loss = 1649.1347642540932 | accuracy = 0.6705882352941176


Epoch[2] Batch[685] Speed: 1.2456941798905656 samples/sec                   batch loss = 1661.418808043003 | accuracy = 0.6708029197080292


Epoch[2] Batch[690] Speed: 1.244821577005473 samples/sec                   batch loss = 1672.3837495446205 | accuracy = 0.6717391304347826


Epoch[2] Batch[695] Speed: 1.2441390252770468 samples/sec                   batch loss = 1685.0383275151253 | accuracy = 0.670863309352518


Epoch[2] Batch[700] Speed: 1.246304554372187 samples/sec                   batch loss = 1695.1925523877144 | accuracy = 0.6714285714285714


Epoch[2] Batch[705] Speed: 1.2419756760378102 samples/sec                   batch loss = 1707.8173173069954 | accuracy = 0.6709219858156028


Epoch[2] Batch[710] Speed: 1.2389721854805436 samples/sec                   batch loss = 1722.3729166388512 | accuracy = 0.6700704225352113


Epoch[2] Batch[715] Speed: 1.2447486151357614 samples/sec                   batch loss = 1732.6596028208733 | accuracy = 0.670979020979021


Epoch[2] Batch[720] Speed: 1.2470331406850381 samples/sec                   batch loss = 1743.3859540820122 | accuracy = 0.6715277777777777


Epoch[2] Batch[725] Speed: 1.2491714675546837 samples/sec                   batch loss = 1753.240348637104 | accuracy = 0.6731034482758621


Epoch[2] Batch[730] Speed: 1.2406148880595553 samples/sec                   batch loss = 1763.6639822125435 | accuracy = 0.6736301369863014


Epoch[2] Batch[735] Speed: 1.241099000209498 samples/sec                   batch loss = 1775.0124214291573 | accuracy = 0.673469387755102


Epoch[2] Batch[740] Speed: 1.2389937789680394 samples/sec                   batch loss = 1784.936281979084 | accuracy = 0.6739864864864865


Epoch[2] Batch[745] Speed: 1.2393548493852726 samples/sec                   batch loss = 1793.3660607933998 | accuracy = 0.6748322147651007


Epoch[2] Batch[750] Speed: 1.2342324003866922 samples/sec                   batch loss = 1805.7227751612663 | accuracy = 0.6756666666666666


Epoch[2] Batch[755] Speed: 1.2337526312596403 samples/sec                   batch loss = 1814.6924807429314 | accuracy = 0.676158940397351


Epoch[2] Batch[760] Speed: 1.239257536442101 samples/sec                   batch loss = 1829.124026954174 | accuracy = 0.6759868421052632


Epoch[2] Batch[765] Speed: 1.2396739929173164 samples/sec                   batch loss = 1839.9949223399162 | accuracy = 0.6761437908496732


Epoch[2] Batch[770] Speed: 1.246548093683455 samples/sec                   batch loss = 1850.928317129612 | accuracy = 0.6762987012987013


Epoch[2] Batch[775] Speed: 1.2422065812482013 samples/sec                   batch loss = 1861.1978284716606 | accuracy = 0.6767741935483871


Epoch[2] Batch[780] Speed: 1.2493335099668752 samples/sec                   batch loss = 1872.7061923146248 | accuracy = 0.6775641025641026


Epoch[2] Batch[785] Speed: 1.2436671972526256 samples/sec                   batch loss = 1881.2538782954216 | accuracy = 0.6789808917197452


[Epoch 2] training: accuracy=0.6798857868020305
[Epoch 2] time cost: 650.252863407135
[Epoch 2] validation: validation accuracy=0.7244444444444444
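
To compare epochs without scrolling through the per-batch lines, the three summary lines printed at the end of each epoch can be collected with a short script. The following is a minimal sketch using only the Python standard library. The helper name `parse_epoch_summaries` and the dictionary keys are hypothetical and not part of MXNet, and the regular expressions assume the exact line format shown in the output above.

```python
import re

# Hypothetical helper, not part of MXNet. The patterns assume the summary
# line format shown in the training output above.
PATTERNS = {
    "train_acc": re.compile(r"\[Epoch (\d+)\] training: accuracy=([0-9.]+)"),
    "seconds": re.compile(r"\[Epoch (\d+)\] time cost: ([0-9.]+)"),
    "val_acc": re.compile(r"\[Epoch (\d+)\] validation: validation accuracy=([0-9.]+)"),
}


def parse_epoch_summaries(log_text):
    """Return {epoch: {"train_acc": ..., "seconds": ..., "val_acc": ...}}."""
    summaries = {}
    for line in log_text.splitlines():
        line = line.strip()
        for key, pattern in PATTERNS.items():
            match = pattern.match(line)
            if match:
                epoch = int(match.group(1))
                summaries.setdefault(epoch, {})[key] = float(match.group(2))
    return summaries


# Summary lines copied from the output above.
log = """\
[Epoch 1] training: accuracy=0.5885152284263959
[Epoch 1] time cost: 650.5266296863556
[Epoch 1] validation: validation accuracy=0.6844444444444444
[Epoch 2] training: accuracy=0.6798857868020305
[Epoch 2] time cost: 650.252863407135
[Epoch 2] validation: validation accuracy=0.7244444444444444
"""

summaries = parse_epoch_summaries(log)
for epoch in sorted(summaries):
    s = summaries[epoch]
    print(f"epoch {epoch}: train={s['train_acc']:.3f} "
          f"val={s['val_acc']:.3f} time={s['seconds']:.0f}s")
```

Per-batch lines do not match any of the patterns, so the full log can be passed in unchanged.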


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).