<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and inference on GPUs are often faster than on central processing units (CPUs), although the speedup depends on the model and the hardware.

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[15:37:26] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `ctx` field in the array's output tells you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to `npx.cpu()` and copies `x` to main memory instead:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[15:37:26] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])


MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
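
The zero-based numbering described above can be sketched in plain Python. The helper `pick_gpu_ids` below is hypothetical and is not part of MXNet; it only works out which indices are valid, with `num_available` standing in for the value returned by `npx.num_gpus()`. With MXNet, you would pass each resulting index to `npx.gpu(i)`.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return the zero-based IDs of the GPUs to use.

    Hypothetical helper, not an MXNet API. `num_available` stands in for
    the value returned by `npx.num_gpus()`.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # Never ask for more devices than the machine exposes.
    count = min(num_available, num_requested)
    return list(range(count))


# On an eight-GPU machine the valid IDs run from 0 to 7.
all_ids = pick_gpu_ids(num_available=8, num_requested=8)
print(all_ids[0], all_ids[-1])

# Requesting two GPUs on a single-GPU machine yields only ID 0.
print(pick_gpu_ids(num_available=1, num_requested=2))
```

Capping the request at the number of available devices avoids referring to a GPU ID that does not exist on the machine.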

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly, as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[15:37:27] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 5.3023715, -1.2948772]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, you split each data batch into *n* parts and assign one part to each GPU. Each GPU then runs the forward and backward passes on its own chunk of the data.
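
To make the idea of data parallelism concrete, here is a minimal sketch in plain Python, with no MXNet involved. It assumes a toy one-parameter model `w * x` with a squared-error loss, and the helper names `split_batch` and `shard_gradient_sum` are illustrative only. It shows that summing the per-shard gradients and dividing by the full batch size gives the same value as the mean gradient over the whole batch, which is why each device can work on its own shard independently.

```python
def split_batch(batch, num_parts):
    """Split a list into `num_parts` contiguous, near-equal shards."""
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    size, extra = divmod(len(batch), num_parts)
    shards, start = [], 0
    for i in range(num_parts):
        # The first `extra` shards get one additional sample.
        end = start + size + (1 if i < extra else 0)
        shards.append(batch[start:end])
        start = end
    return shards


def shard_gradient_sum(w, shard):
    """Sum of d/dw (w*x - y)^2 over the (x, y) pairs in one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard)


# Toy batch of (input, target) pairs and an arbitrary parameter value.
batch = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.0)]
w = 1.5

# Reference: mean gradient computed on the whole batch at once.
full_grad = shard_gradient_sum(w, batch) / len(batch)

# Data parallelism: each shard would be processed by a different device.
shards = split_batch(batch, num_parts=2)
parallel_grad = sum(shard_gradient_sum(w, s) for s in shards) / len(batch)

print(shards)
print(abs(full_grad - parallel_grad) < 1e-12)
```

In the training code later in this step, `gluon.utils.split_and_load` plays the role of `split_batch`, and `trainer.step(batch_size)` performs the normalization by the full batch size.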

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations on the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
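
As a rough illustration of what this helper aggregates, the plain-Python sketch below computes accuracy over several shards of labels and raw network outputs, taking the highest-scoring class as the prediction. The function `sharded_accuracy` and the sample values are hypothetical. The sketch mirrors the idea of updating a metric with per-device lists and is not the actual `gluon.metric.Accuracy` implementation.

```python
def sharded_accuracy(label_shards, output_shards):
    """Fraction of correct predictions across all shards.

    Each entry of `output_shards` is a list of per-class score lists, and
    the matching entry of `label_shards` is a list of integer class labels.
    """
    correct = 0
    total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            # The predicted class is the index of the largest score.
            predicted = max(range(len(scores)), key=scores.__getitem__)
            correct += int(predicted == label)
            total += 1
    if total == 0:
        raise ValueError("no samples to evaluate")
    return correct / total


# Two shards, as if a batch of four samples were split across two devices.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],  # both predictions correct
    [[1.2, 0.1], [0.8, 0.2]],   # first wrong, second correct
]
print(sharded_accuracy(label_shards, output_shards))
```

Counting correct predictions and samples across all shards before dividing keeps the result independent of how the batch was split.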

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7725577322561287 samples/sec                   batch loss = 13.923776149749756 | accuracy = 0.5


Epoch[1] Batch[10] Speed: 1.251989841229852 samples/sec                   batch loss = 28.23363208770752 | accuracy = 0.525


Epoch[1] Batch[15] Speed: 1.2531316497302742 samples/sec                   batch loss = 40.287203550338745 | accuracy = 0.6333333333333333


Epoch[1] Batch[20] Speed: 1.2476728490265385 samples/sec                   batch loss = 54.96332573890686 | accuracy = 0.6


Epoch[1] Batch[25] Speed: 1.2552860765149751 samples/sec                   batch loss = 69.14568161964417 | accuracy = 0.58


Epoch[1] Batch[30] Speed: 1.2534627986265505 samples/sec                   batch loss = 84.0847315788269 | accuracy = 0.5583333333333333


Epoch[1] Batch[35] Speed: 1.2522095313585742 samples/sec                   batch loss = 98.05533361434937 | accuracy = 0.5714285714285714


Epoch[1] Batch[40] Speed: 1.2521238327586515 samples/sec                   batch loss = 111.21208548545837 | accuracy = 0.5875


Epoch[1] Batch[45] Speed: 1.257156678955077 samples/sec                   batch loss = 126.12146162986755 | accuracy = 0.5666666666666667


Epoch[1] Batch[50] Speed: 1.255313314372812 samples/sec                   batch loss = 141.2678394317627 | accuracy = 0.545


Epoch[1] Batch[55] Speed: 1.2558570977730963 samples/sec                   batch loss = 155.54972219467163 | accuracy = 0.5409090909090909


Epoch[1] Batch[60] Speed: 1.2601009569944355 samples/sec                   batch loss = 169.25931906700134 | accuracy = 0.5375


Epoch[1] Batch[65] Speed: 1.256004236086827 samples/sec                   batch loss = 183.26995396614075 | accuracy = 0.5346153846153846


Epoch[1] Batch[70] Speed: 1.2560008510480924 samples/sec                   batch loss = 197.58949971199036 | accuracy = 0.5285714285714286


Epoch[1] Batch[75] Speed: 1.2524407985418513 samples/sec                   batch loss = 211.12483191490173 | accuracy = 0.53


Epoch[1] Batch[80] Speed: 1.2585205024355437 samples/sec                   batch loss = 224.9580864906311 | accuracy = 0.528125


Epoch[1] Batch[85] Speed: 1.2487473053136315 samples/sec                   batch loss = 239.18260073661804 | accuracy = 0.5264705882352941


Epoch[1] Batch[90] Speed: 1.2527825273100386 samples/sec                   batch loss = 253.0506248474121 | accuracy = 0.5194444444444445


Epoch[1] Batch[95] Speed: 1.2547044078768128 samples/sec                   batch loss = 266.42354130744934 | accuracy = 0.5184210526315789


Epoch[1] Batch[100] Speed: 1.2522267285648967 samples/sec                   batch loss = 280.195675611496 | accuracy = 0.5225


Epoch[1] Batch[105] Speed: 1.2514409996838805 samples/sec                   batch loss = 294.61537432670593 | accuracy = 0.5142857142857142


Epoch[1] Batch[110] Speed: 1.2586784638963793 samples/sec                   batch loss = 308.70677185058594 | accuracy = 0.5136363636363637


Epoch[1] Batch[115] Speed: 1.2558328444154145 samples/sec                   batch loss = 322.49143290519714 | accuracy = 0.5108695652173914


Epoch[1] Batch[120] Speed: 1.2585515628545656 samples/sec                   batch loss = 336.62222599983215 | accuracy = 0.5104166666666666


Epoch[1] Batch[125] Speed: 1.258521446498637 samples/sec                   batch loss = 349.8914864063263 | accuracy = 0.518


Epoch[1] Batch[130] Speed: 1.254389859378853 samples/sec                   batch loss = 363.7918403148651 | accuracy = 0.5134615384615384


Epoch[1] Batch[135] Speed: 1.2530572425507278 samples/sec                   batch loss = 377.94187235832214 | accuracy = 0.5111111111111111


Epoch[1] Batch[140] Speed: 1.2585455205860725 samples/sec                   batch loss = 392.13701725006104 | accuracy = 0.5089285714285714


Epoch[1] Batch[145] Speed: 1.255280441243688 samples/sec                   batch loss = 406.2752993106842 | accuracy = 0.506896551724138


Epoch[1] Batch[150] Speed: 1.2476771171882954 samples/sec                   batch loss = 420.2909173965454 | accuracy = 0.505


Epoch[1] Batch[155] Speed: 1.2604602314778328 samples/sec                   batch loss = 434.2844111919403 | accuracy = 0.5032258064516129


Epoch[1] Batch[160] Speed: 1.2569106721632295 samples/sec                   batch loss = 447.8209617137909 | accuracy = 0.5078125


Epoch[1] Batch[165] Speed: 1.2507052971611285 samples/sec                   batch loss = 461.5936393737793 | accuracy = 0.5075757575757576


Epoch[1] Batch[170] Speed: 1.2522053255919574 samples/sec                   batch loss = 475.59805607795715 | accuracy = 0.5073529411764706


Epoch[1] Batch[175] Speed: 1.254920076824624 samples/sec                   batch loss = 489.2856707572937 | accuracy = 0.51


Epoch[1] Batch[180] Speed: 1.2473657108435778 samples/sec                   batch loss = 503.5267131328583 | accuracy = 0.5083333333333333


Epoch[1] Batch[185] Speed: 1.250184689924358 samples/sec                   batch loss = 517.6177926063538 | accuracy = 0.5054054054054054


Epoch[1] Batch[190] Speed: 1.2504426093957361 samples/sec                   batch loss = 531.7030010223389 | accuracy = 0.5039473684210526


Epoch[1] Batch[195] Speed: 1.2575066416063019 samples/sec                   batch loss = 545.1488049030304 | accuracy = 0.5076923076923077


Epoch[1] Batch[200] Speed: 1.2590018750834377 samples/sec                   batch loss = 558.2552011013031 | accuracy = 0.51


Epoch[1] Batch[205] Speed: 1.2547831400362437 samples/sec                   batch loss = 572.0510318279266 | accuracy = 0.5097560975609756


Epoch[1] Batch[210] Speed: 1.2543422171149679 samples/sec                   batch loss = 585.9396002292633 | accuracy = 0.5107142857142857


Epoch[1] Batch[215] Speed: 1.2553745569075727 samples/sec                   batch loss = 600.1849610805511 | accuracy = 0.5104651162790698


Epoch[1] Batch[220] Speed: 1.25206393481381 samples/sec                   batch loss = 613.4947099685669 | accuracy = 0.5125


Epoch[1] Batch[225] Speed: 1.2520799132682567 samples/sec                   batch loss = 627.316477060318 | accuracy = 0.5133333333333333


Epoch[1] Batch[230] Speed: 1.2514969170993364 samples/sec                   batch loss = 641.1058783531189 | accuracy = 0.5119565217391304


Epoch[1] Batch[235] Speed: 1.2565457009531635 samples/sec                   batch loss = 654.786999464035 | accuracy = 0.5127659574468085


Epoch[1] Batch[240] Speed: 1.2563902501417792 samples/sec                   batch loss = 668.6646900177002 | accuracy = 0.5145833333333333


Epoch[1] Batch[245] Speed: 1.252201119853592 samples/sec                   batch loss = 682.4536213874817 | accuracy = 0.5153061224489796


Epoch[1] Batch[250] Speed: 1.2562437738973022 samples/sec                   batch loss = 696.5101327896118 | accuracy = 0.512


Epoch[1] Batch[255] Speed: 1.2542826693740143 samples/sec                   batch loss = 710.3429653644562 | accuracy = 0.5107843137254902


Epoch[1] Batch[260] Speed: 1.2557091480642928 samples/sec                   batch loss = 724.7216403484344 | accuracy = 0.5086538461538461


Epoch[1] Batch[265] Speed: 1.2495512450552422 samples/sec                   batch loss = 738.1413278579712 | accuracy = 0.5113207547169811


Epoch[1] Batch[270] Speed: 1.2591351983234362 samples/sec                   batch loss = 751.4167609214783 | accuracy = 0.5175925925925926


Epoch[1] Batch[275] Speed: 1.252177755154895 samples/sec                   batch loss = 765.686774969101 | accuracy = 0.5163636363636364


Epoch[1] Batch[280] Speed: 1.2493842149928942 samples/sec                   batch loss = 779.4821870326996 | accuracy = 0.5151785714285714


Epoch[1] Batch[285] Speed: 1.2529804112026075 samples/sec                   batch loss = 793.0628378391266 | accuracy = 0.5149122807017544


Epoch[1] Batch[290] Speed: 1.2562189412345395 samples/sec                   batch loss = 806.7102808952332 | accuracy = 0.5172413793103449


Epoch[1] Batch[295] Speed: 1.2499362821575812 samples/sec                   batch loss = 820.4538657665253 | accuracy = 0.5169491525423728


Epoch[1] Batch[300] Speed: 1.2539523055810362 samples/sec                   batch loss = 834.1630651950836 | accuracy = 0.5166666666666667


Epoch[1] Batch[305] Speed: 1.2553054246647992 samples/sec                   batch loss = 847.2544605731964 | accuracy = 0.519672131147541


Epoch[1] Batch[310] Speed: 1.254827437084045 samples/sec                   batch loss = 860.6943681240082 | accuracy = 0.5201612903225806


Epoch[1] Batch[315] Speed: 1.2532266604199112 samples/sec                   batch loss = 874.0084083080292 | accuracy = 0.5206349206349207


Epoch[1] Batch[320] Speed: 1.250233321221396 samples/sec                   batch loss = 887.67919754982 | accuracy = 0.52109375


Epoch[1] Batch[325] Speed: 1.2566914004299676 samples/sec                   batch loss = 901.4579863548279 | accuracy = 0.5207692307692308


Epoch[1] Batch[330] Speed: 1.2546429492322226 samples/sec                   batch loss = 914.8806281089783 | accuracy = 0.5227272727272727


Epoch[1] Batch[335] Speed: 1.257453295996213 samples/sec                   batch loss = 928.5025923252106 | accuracy = 0.5223880597014925


Epoch[1] Batch[340] Speed: 1.2524265872624847 samples/sec                   batch loss = 942.4339010715485 | accuracy = 0.5198529411764706


Epoch[1] Batch[345] Speed: 1.244489623403017 samples/sec                   batch loss = 955.4329748153687 | accuracy = 0.5231884057971015


Epoch[1] Batch[350] Speed: 1.2456730921296184 samples/sec                   batch loss = 969.0059747695923 | accuracy = 0.5242857142857142


Epoch[1] Batch[355] Speed: 1.2492869021881339 samples/sec                   batch loss = 982.1738095283508 | accuracy = 0.528169014084507


Epoch[1] Batch[360] Speed: 1.243779219390874 samples/sec                   batch loss = 995.523138999939 | accuracy = 0.5284722222222222


Epoch[1] Batch[365] Speed: 1.2470311941836782 samples/sec                   batch loss = 1008.5811610221863 | accuracy = 0.5321917808219178


Epoch[1] Batch[370] Speed: 1.2499236175901334 samples/sec                   batch loss = 1020.9367520809174 | accuracy = 0.5344594594594595


Epoch[1] Batch[375] Speed: 1.2442288939350241 samples/sec                   batch loss = 1035.0606789588928 | accuracy = 0.534


Epoch[1] Batch[380] Speed: 1.25517582222983 samples/sec                   batch loss = 1047.9630825519562 | accuracy = 0.5348684210526315


Epoch[1] Batch[385] Speed: 1.2522548619697327 samples/sec                   batch loss = 1061.0061302185059 | accuracy = 0.537012987012987


Epoch[1] Batch[390] Speed: 1.2524891380048777 samples/sec                   batch loss = 1074.3770098686218 | accuracy = 0.5378205128205128


Epoch[1] Batch[395] Speed: 1.2570907411966943 samples/sec                   batch loss = 1088.2623405456543 | accuracy = 0.5386075949367088


Epoch[1] Batch[400] Speed: 1.2533417222054675 samples/sec                   batch loss = 1102.4442746639252 | accuracy = 0.53625


Epoch[1] Batch[405] Speed: 1.2491570513512569 samples/sec                   batch loss = 1116.1047756671906 | accuracy = 0.5364197530864198


Epoch[1] Batch[410] Speed: 1.2506728513930245 samples/sec                   batch loss = 1129.972051858902 | accuracy = 0.5365853658536586


Epoch[1] Batch[415] Speed: 1.2477791906251392 samples/sec                   batch loss = 1144.0078933238983 | accuracy = 0.5373493975903615


Epoch[1] Batch[420] Speed: 1.2510107241638517 samples/sec                   batch loss = 1156.6639420986176 | accuracy = 0.5392857142857143


Epoch[1] Batch[425] Speed: 1.254285670069774 samples/sec                   batch loss = 1169.1796045303345 | accuracy = 0.5411764705882353


Epoch[1] Batch[430] Speed: 1.2514604161833671 samples/sec                   batch loss = 1182.793560743332 | accuracy = 0.5412790697674419


Epoch[1] Batch[435] Speed: 1.2492727623919309 samples/sec                   batch loss = 1195.9627528190613 | accuracy = 0.542528735632184


Epoch[1] Batch[440] Speed: 1.2554526216225115 samples/sec                   batch loss = 1209.6168420314789 | accuracy = 0.5420454545454545


Epoch[1] Batch[445] Speed: 1.2569651959906196 samples/sec                   batch loss = 1223.301104068756 | accuracy = 0.5426966292134832


Epoch[1] Batch[450] Speed: 1.2557093360341802 samples/sec                   batch loss = 1235.8421609401703 | accuracy = 0.545


Epoch[1] Batch[455] Speed: 1.2540416291313625 samples/sec                   batch loss = 1249.3526530265808 | accuracy = 0.545054945054945


Epoch[1] Batch[460] Speed: 1.2502051854120513 samples/sec                   batch loss = 1262.548986673355 | accuracy = 0.5456521739130434


Epoch[1] Batch[465] Speed: 1.2496974679073152 samples/sec                   batch loss = 1275.1805629730225 | accuracy = 0.5473118279569893


Epoch[1] Batch[470] Speed: 1.252860457098128 samples/sec                   batch loss = 1288.6748805046082 | accuracy = 0.5473404255319149


Epoch[1] Batch[475] Speed: 1.2505954731665487 samples/sec                   batch loss = 1302.1718699932098 | accuracy = 0.5489473684210526


Epoch[1] Batch[480] Speed: 1.2488022386298827 samples/sec                   batch loss = 1316.0707411766052 | accuracy = 0.5494791666666666


Epoch[1] Batch[485] Speed: 1.2520400146150301 samples/sec                   batch loss = 1329.8425033092499 | accuracy = 0.5489690721649485


Epoch[1] Batch[490] Speed: 1.2493945425760877 samples/sec                   batch loss = 1343.7036006450653 | accuracy = 0.55


Epoch[1] Batch[495] Speed: 1.254481027499228 samples/sec                   batch loss = 1356.8577661514282 | accuracy = 0.5510101010101011


Epoch[1] Batch[500] Speed: 1.2547948709414805 samples/sec                   batch loss = 1370.8418595790863 | accuracy = 0.551


Epoch[1] Batch[505] Speed: 1.2474155142825638 samples/sec                   batch loss = 1385.191279888153 | accuracy = 0.5495049504950495


Epoch[1] Batch[510] Speed: 1.246164030124441 samples/sec                   batch loss = 1398.0074614286423 | accuracy = 0.5504901960784314


Epoch[1] Batch[515] Speed: 1.2484353645924566 samples/sec                   batch loss = 1411.2646933794022 | accuracy = 0.5509708737864077


Epoch[1] Batch[520] Speed: 1.2521914000330192 samples/sec                   batch loss = 1424.0302990674973 | accuracy = 0.551923076923077


Epoch[1] Batch[525] Speed: 1.2565315845822407 samples/sec                   batch loss = 1435.847343325615 | accuracy = 0.5542857142857143


Epoch[1] Batch[530] Speed: 1.2530172815872314 samples/sec                   batch loss = 1448.718515753746 | accuracy = 0.555188679245283


Epoch[1] Batch[535] Speed: 1.257169867336829 samples/sec                   batch loss = 1463.2419155836105 | accuracy = 0.5532710280373832


Epoch[1] Batch[540] Speed: 1.2577696658080726 samples/sec                   batch loss = 1476.701139330864 | accuracy = 0.5527777777777778


Epoch[1] Batch[545] Speed: 1.2545653604284654 samples/sec                   batch loss = 1490.61481320858 | accuracy = 0.5532110091743119


Epoch[1] Batch[550] Speed: 1.2556841485707375 samples/sec                   batch loss = 1503.8441542387009 | accuracy = 0.5545454545454546


Epoch[1] Batch[555] Speed: 1.2467061210349764 samples/sec                   batch loss = 1517.8546417951584 | accuracy = 0.5545045045045045


Epoch[1] Batch[560] Speed: 1.2537894376792766 samples/sec                   batch loss = 1531.123179078102 | accuracy = 0.5549107142857143


Epoch[1] Batch[565] Speed: 1.2501127747109249 samples/sec                   batch loss = 1543.9273034334183 | accuracy = 0.5561946902654867


Epoch[1] Batch[570] Speed: 1.2548968921412915 samples/sec                   batch loss = 1556.0085567235947 | accuracy = 0.5587719298245614


Epoch[1] Batch[575] Speed: 1.25314409858667 samples/sec                   batch loss = 1568.3579388856888 | accuracy = 0.5595652173913044


Epoch[1] Batch[580] Speed: 1.252541502174879 samples/sec                   batch loss = 1582.329518198967 | accuracy = 0.5594827586206896


Epoch[1] Batch[585] Speed: 1.2522048582862995 samples/sec                   batch loss = 1595.1120692491531 | accuracy = 0.5611111111111111


Epoch[1] Batch[590] Speed: 1.2530605181525627 samples/sec                   batch loss = 1608.0607324838638 | accuracy = 0.5614406779661016


Epoch[1] Batch[595] Speed: 1.2535998220459619 samples/sec                   batch loss = 1621.3785492181778 | accuracy = 0.5609243697478992


Epoch[1] Batch[600] Speed: 1.2501832925292133 samples/sec                   batch loss = 1634.169730067253 | accuracy = 0.5625


Epoch[1] Batch[605] Speed: 1.251441466419589 samples/sec                   batch loss = 1647.4492639303207 | accuracy = 0.562396694214876


Epoch[1] Batch[610] Speed: 1.2513111673291835 samples/sec                   batch loss = 1660.1290925741196 | accuracy = 0.5631147540983606


Epoch[1] Batch[615] Speed: 1.2532487536397305 samples/sec                   batch loss = 1673.8231564760208 | accuracy = 0.5630081300813008


Epoch[1] Batch[620] Speed: 1.247611706175721 samples/sec                   batch loss = 1687.0332645177841 | accuracy = 0.5629032258064516


Epoch[1] Batch[625] Speed: 1.2515873850687256 samples/sec                   batch loss = 1699.3251413106918 | accuracy = 0.5632


Epoch[1] Batch[630] Speed: 1.2481810582463106 samples/sec                   batch loss = 1711.8825410604477 | accuracy = 0.5646825396825397


Epoch[1] Batch[635] Speed: 1.2455979034251978 samples/sec                   batch loss = 1724.320496082306 | accuracy = 0.5661417322834645


Epoch[1] Batch[640] Speed: 1.2484013643912921 samples/sec                   batch loss = 1738.3067824840546 | accuracy = 0.566015625


Epoch[1] Batch[645] Speed: 1.2499726942186098 samples/sec                   batch loss = 1750.2247836589813 | accuracy = 0.5678294573643411


Epoch[1] Batch[650] Speed: 1.2485678529880702 samples/sec                   batch loss = 1763.2463550567627 | accuracy = 0.5684615384615385


Epoch[1] Batch[655] Speed: 1.252143550808595 samples/sec                   batch loss = 1775.3606398105621 | accuracy = 0.5698473282442749


Epoch[1] Batch[660] Speed: 1.2581671447195306 samples/sec                   batch loss = 1789.0373599529266 | accuracy = 0.5700757575757576


Epoch[1] Batch[665] Speed: 1.2530951469908664 samples/sec                   batch loss = 1801.0069353580475 | accuracy = 0.5714285714285714


Epoch[1] Batch[670] Speed: 1.2506560697597238 samples/sec                   batch loss = 1813.7041280269623 | accuracy = 0.5708955223880597


Epoch[1] Batch[675] Speed: 1.2566973307753775 samples/sec                   batch loss = 1826.0603696107864 | accuracy = 0.5711111111111111


Epoch[1] Batch[680] Speed: 1.2502927646115884 samples/sec                   batch loss = 1838.8008922338486 | accuracy = 0.5716911764705882


Epoch[1] Batch[685] Speed: 1.2537601109589942 samples/sec                   batch loss = 1851.923083305359 | accuracy = 0.5722627737226277


Epoch[1] Batch[690] Speed: 1.2509550367348206 samples/sec                   batch loss = 1866.562475681305 | accuracy = 0.5717391304347826


Epoch[1] Batch[695] Speed: 1.251211594555685 samples/sec                   batch loss = 1878.8027309179306 | accuracy = 0.5715827338129497


Epoch[1] Batch[700] Speed: 1.252180185042935 samples/sec                   batch loss = 1890.6252473592758 | accuracy = 0.5732142857142857


Epoch[1] Batch[705] Speed: 1.252325247752233 samples/sec                   batch loss = 1902.987906575203 | accuracy = 0.574468085106383


Epoch[1] Batch[710] Speed: 1.2567511771075723 samples/sec                   batch loss = 1915.8986402750015 | accuracy = 0.575


Epoch[1] Batch[715] Speed: 1.2568939110390978 samples/sec                   batch loss = 1928.7727509737015 | accuracy = 0.5748251748251748


Epoch[1] Batch[720] Speed: 1.2562851638503638 samples/sec                   batch loss = 1941.6942743062973 | accuracy = 0.5756944444444444


Epoch[1] Batch[725] Speed: 1.2521050497910027 samples/sec                   batch loss = 1955.1281605958939 | accuracy = 0.576551724137931


Epoch[1] Batch[730] Speed: 1.2584735842842023 samples/sec                   batch loss = 1969.146705031395 | accuracy = 0.5746575342465754


Epoch[1] Batch[735] Speed: 1.2598659063964048 samples/sec                   batch loss = 1982.9123216867447 | accuracy = 0.5738095238095238


Epoch[1] Batch[740] Speed: 1.2561262036816445 samples/sec                   batch loss = 1995.2124651670456 | accuracy = 0.5739864864864865


Epoch[1] Batch[745] Speed: 1.256927527901791 samples/sec                   batch loss = 2007.556876540184 | accuracy = 0.5751677852348993


Epoch[1] Batch[750] Speed: 1.2526063088973847 samples/sec                   batch loss = 2020.4789842367172 | accuracy = 0.5756666666666667


Epoch[1] Batch[755] Speed: 1.256816985096906 samples/sec                   batch loss = 2031.6283569335938 | accuracy = 0.5771523178807947


Epoch[1] Batch[760] Speed: 1.2513691265396334 samples/sec                   batch loss = 2043.4873715639114 | accuracy = 0.5779605263157894


Epoch[1] Batch[765] Speed: 1.2552705796406816 samples/sec                   batch loss = 2055.325684428215 | accuracy = 0.5790849673202615


Epoch[1] Batch[770] Speed: 1.2528032951655275 samples/sec                   batch loss = 2068.774722933769 | accuracy = 0.5785714285714286


Epoch[1] Batch[775] Speed: 1.2604382620465266 samples/sec                   batch loss = 2082.033818125725 | accuracy = 0.5774193548387097


Epoch[1] Batch[780] Speed: 1.257329562917042 samples/sec                   batch loss = 2094.633656144142 | accuracy = 0.5782051282051283


Epoch[1] Batch[785] Speed: 1.2585154045193152 samples/sec                   batch loss = 2108.6585236787796 | accuracy = 0.5786624203821656


[Epoch 1] training: accuracy=0.5774111675126904
[Epoch 1] time cost: 646.6779227256775
[Epoch 1] validation: validation accuracy=0.6722222222222223


Epoch[2] Batch[5] Speed: 1.2600591259566736 samples/sec                   batch loss = 10.562422513961792 | accuracy = 0.9


Epoch[2] Batch[10] Speed: 1.260198541998278 samples/sec                   batch loss = 24.840868711471558 | accuracy = 0.675


Epoch[2] Batch[15] Speed: 1.254683858438603 samples/sec                   batch loss = 35.91372513771057 | accuracy = 0.7166666666666667


Epoch[2] Batch[20] Speed: 1.2583521042418453 samples/sec                   batch loss = 48.000590801239014 | accuracy = 0.7


Epoch[2] Batch[25] Speed: 1.2562512991404557 samples/sec                   batch loss = 58.789289474487305 | accuracy = 0.7


Epoch[2] Batch[30] Speed: 1.2566578903234993 samples/sec                   batch loss = 69.87059664726257 | accuracy = 0.6916666666666667


Epoch[2] Batch[35] Speed: 1.2579429070576085 samples/sec                   batch loss = 85.05704879760742 | accuracy = 0.6714285714285714


Epoch[2] Batch[40] Speed: 1.269216808671566 samples/sec                   batch loss = 97.21443772315979 | accuracy = 0.68125


Epoch[2] Batch[45] Speed: 1.2597939136750795 samples/sec                   batch loss = 111.09638786315918 | accuracy = 0.6555555555555556


Epoch[2] Batch[50] Speed: 1.2557843405102564 samples/sec                   batch loss = 124.65238440036774 | accuracy = 0.645


Epoch[2] Batch[55] Speed: 1.2566940361321288 samples/sec                   batch loss = 137.0520474910736 | accuracy = 0.6454545454545455


Epoch[2] Batch[60] Speed: 1.2583012349523488 samples/sec                   batch loss = 147.85526168346405 | accuracy = 0.6583333333333333


Epoch[2] Batch[65] Speed: 1.253456430528985 samples/sec                   batch loss = 160.19658029079437 | accuracy = 0.6615384615384615


Epoch[2] Batch[70] Speed: 1.2591901988424636 samples/sec                   batch loss = 171.6188942193985 | accuracy = 0.6678571428571428


Epoch[2] Batch[75] Speed: 1.254818708816624 samples/sec                   batch loss = 183.44579112529755 | accuracy = 0.67


Epoch[2] Batch[80] Speed: 1.2541609657535562 samples/sec                   batch loss = 195.4360227584839 | accuracy = 0.678125


Epoch[2] Batch[85] Speed: 1.2573861962852237 samples/sec                   batch loss = 205.8693083524704 | accuracy = 0.6852941176470588


Epoch[2] Batch[90] Speed: 1.2589387666280842 samples/sec                   batch loss = 219.12867724895477 | accuracy = 0.6833333333333333


Epoch[2] Batch[95] Speed: 1.2569433482241155 samples/sec                   batch loss = 230.0714329481125 | accuracy = 0.6921052631578948


Epoch[2] Batch[100] Speed: 1.2533727147926108 samples/sec                   batch loss = 244.0218335390091 | accuracy = 0.685


Epoch[2] Batch[105] Speed: 1.2542878268287194 samples/sec                   batch loss = 258.15139520168304 | accuracy = 0.6785714285714286


Epoch[2] Batch[110] Speed: 1.255957223293024 samples/sec                   batch loss = 270.8558248281479 | accuracy = 0.675


Epoch[2] Batch[115] Speed: 1.25619918868113 samples/sec                   batch loss = 281.34133434295654 | accuracy = 0.6782608695652174


Epoch[2] Batch[120] Speed: 1.2607174823126477 samples/sec                   batch loss = 292.6453572511673 | accuracy = 0.6833333333333333


Epoch[2] Batch[125] Speed: 1.2533539879705862 samples/sec                   batch loss = 304.6760460138321 | accuracy = 0.684


Epoch[2] Batch[130] Speed: 1.252131589073869 samples/sec                   batch loss = 318.75045597553253 | accuracy = 0.6788461538461539


Epoch[2] Batch[135] Speed: 1.258958794374687 samples/sec                   batch loss = 329.9679614305496 | accuracy = 0.6796296296296296


Epoch[2] Batch[140] Speed: 1.2569757434600755 samples/sec                   batch loss = 341.80643117427826 | accuracy = 0.6785714285714286


Epoch[2] Batch[145] Speed: 1.2544750242524727 samples/sec                   batch loss = 354.45241487026215 | accuracy = 0.6827586206896552


Epoch[2] Batch[150] Speed: 1.255223903488837 samples/sec                   batch loss = 367.9265025854111 | accuracy = 0.6816666666666666


Epoch[2] Batch[155] Speed: 1.2559255386553887 samples/sec                   batch loss = 382.3105808496475 | accuracy = 0.6774193548387096


Epoch[2] Batch[160] Speed: 1.2499312535486096 samples/sec                   batch loss = 395.3954191207886 | accuracy = 0.6765625


Epoch[2] Batch[165] Speed: 1.2494356685199066 samples/sec                   batch loss = 409.53805804252625 | accuracy = 0.6712121212121213


Epoch[2] Batch[170] Speed: 1.2504516496887328 samples/sec                   batch loss = 421.1955535411835 | accuracy = 0.6735294117647059


Epoch[2] Batch[175] Speed: 1.2505357213534891 samples/sec                   batch loss = 434.8806662559509 | accuracy = 0.6728571428571428


Epoch[2] Batch[180] Speed: 1.2546039190818186 samples/sec                   batch loss = 446.80841636657715 | accuracy = 0.6722222222222223


Epoch[2] Batch[185] Speed: 1.2522372901268142 samples/sec                   batch loss = 458.3931996822357 | accuracy = 0.6716216216216216


Epoch[2] Batch[190] Speed: 1.2479114469576404 samples/sec                   batch loss = 473.01322770118713 | accuracy = 0.6697368421052632


Epoch[2] Batch[195] Speed: 1.2448956559065742 samples/sec                   batch loss = 485.2099301815033 | accuracy = 0.6717948717948717


Epoch[2] Batch[200] Speed: 1.2541747476432852 samples/sec                   batch loss = 497.69207775592804 | accuracy = 0.67


Epoch[2] Batch[205] Speed: 1.2561759567108934 samples/sec                   batch loss = 510.8397146463394 | accuracy = 0.6670731707317074


Epoch[2] Batch[210] Speed: 1.2500212158883348 samples/sec                   batch loss = 523.5584388971329 | accuracy = 0.6666666666666666


Epoch[2] Batch[215] Speed: 1.2586635441429461 samples/sec                   batch loss = 537.5662621259689 | accuracy = 0.663953488372093


Epoch[2] Batch[220] Speed: 1.2633276451565634 samples/sec                   batch loss = 549.423600435257 | accuracy = 0.6613636363636364


Epoch[2] Batch[225] Speed: 1.2533573587572522 samples/sec                   batch loss = 560.9659572839737 | accuracy = 0.66


Epoch[2] Batch[230] Speed: 1.2576492643742319 samples/sec                   batch loss = 575.3406916856766 | accuracy = 0.658695652173913


Epoch[2] Batch[235] Speed: 1.252261311333666 samples/sec                   batch loss = 587.9842916727066 | accuracy = 0.6595744680851063


Epoch[2] Batch[240] Speed: 1.2532757159124899 samples/sec                   batch loss = 602.3081556558609 | accuracy = 0.6541666666666667


Epoch[2] Batch[245] Speed: 1.251017347286803 samples/sec                   batch loss = 614.1071346998215 | accuracy = 0.6551020408163265


Epoch[2] Batch[250] Speed: 1.242990699212348 samples/sec                   batch loss = 627.1102176904678 | accuracy = 0.655


Epoch[2] Batch[255] Speed: 1.2505423394478454 samples/sec                   batch loss = 639.5146440267563 | accuracy = 0.6568627450980392


Epoch[2] Batch[260] Speed: 1.2484386160748335 samples/sec                   batch loss = 653.8401087522507 | accuracy = 0.6567307692307692


Epoch[2] Batch[265] Speed: 1.2558407407546026 samples/sec                   batch loss = 664.5113818645477 | accuracy = 0.659433962264151


Epoch[2] Batch[270] Speed: 1.2527395905490128 samples/sec                   batch loss = 676.2238036394119 | accuracy = 0.6601851851851852


Epoch[2] Batch[275] Speed: 1.249780507191155 samples/sec                   batch loss = 687.2652734518051 | accuracy = 0.6609090909090909


Epoch[2] Batch[280] Speed: 1.24971655107105 samples/sec                   batch loss = 698.788231253624 | accuracy = 0.6607142857142857


Epoch[2] Batch[285] Speed: 1.2508634477608513 samples/sec                   batch loss = 710.0475908517838 | accuracy = 0.6622807017543859


Epoch[2] Batch[290] Speed: 1.2505489576122508 samples/sec                   batch loss = 720.9463347196579 | accuracy = 0.6655172413793103


Epoch[2] Batch[295] Speed: 1.2498835768962029 samples/sec                   batch loss = 732.8669142723083 | accuracy = 0.6686440677966101


Epoch[2] Batch[300] Speed: 1.2512400556334873 samples/sec                   batch loss = 744.3232536315918 | accuracy = 0.6691666666666667


Epoch[2] Batch[305] Speed: 1.25186148294093 samples/sec                   batch loss = 755.6573832035065 | accuracy = 0.6680327868852459


Epoch[2] Batch[310] Speed: 1.252348711438073 samples/sec                   batch loss = 769.5086594820023 | accuracy = 0.667741935483871


Epoch[2] Batch[315] Speed: 1.2510454263756532 samples/sec                   batch loss = 781.9802523851395 | accuracy = 0.6674603174603174


Epoch[2] Batch[320] Speed: 1.2543062064667447 samples/sec                   batch loss = 792.6561497449875 | accuracy = 0.66796875


Epoch[2] Batch[325] Speed: 1.2530707194222788 samples/sec                   batch loss = 802.8208047151566 | accuracy = 0.67


Epoch[2] Batch[330] Speed: 1.2562654092143715 samples/sec                   batch loss = 813.7401438951492 | accuracy = 0.6712121212121213


Epoch[2] Batch[335] Speed: 1.2586659048398707 samples/sec                   batch loss = 825.1791013479233 | accuracy = 0.6723880597014925


Epoch[2] Batch[340] Speed: 1.258189223763414 samples/sec                   batch loss = 836.6386682987213 | accuracy = 0.6735294117647059


Epoch[2] Batch[345] Speed: 1.2504167940209128 samples/sec                   batch loss = 850.0521001815796 | accuracy = 0.6717391304347826


Epoch[2] Batch[350] Speed: 1.2586709095466375 samples/sec                   batch loss = 861.085587978363 | accuracy = 0.6728571428571428


Epoch[2] Batch[355] Speed: 1.2579995013656209 samples/sec                   batch loss = 872.0143229961395 | accuracy = 0.673943661971831


Epoch[2] Batch[360] Speed: 1.2575693237689236 samples/sec                   batch loss = 884.3408976793289 | accuracy = 0.6736111111111112


Epoch[2] Batch[365] Speed: 1.2516024176496572 samples/sec                   batch loss = 897.7513498067856 | accuracy = 0.6726027397260274


Epoch[2] Batch[370] Speed: 1.2539393720607788 samples/sec                   batch loss = 909.0637751817703 | accuracy = 0.672972972972973


Epoch[2] Batch[375] Speed: 1.2599931672705198 samples/sec                   batch loss = 922.0815192461014 | accuracy = 0.672


Epoch[2] Batch[380] Speed: 1.256057740927969 samples/sec                   batch loss = 938.0072172880173 | accuracy = 0.6703947368421053


Epoch[2] Batch[385] Speed: 1.2560220078395368 samples/sec                   batch loss = 950.3112443685532 | accuracy = 0.6694805194805195


Epoch[2] Batch[390] Speed: 1.252503724726563 samples/sec                   batch loss = 961.8688472509384 | accuracy = 0.6698717948717948


Epoch[2] Batch[395] Speed: 1.2524279896770267 samples/sec                   batch loss = 974.9059580564499 | accuracy = 0.6683544303797468


Epoch[2] Batch[400] Speed: 1.2555094618485918 samples/sec                   batch loss = 986.2746385335922 | accuracy = 0.66875


Epoch[2] Batch[405] Speed: 1.256248477163708 samples/sec                   batch loss = 1000.7222751379013 | accuracy = 0.6685185185185185


Epoch[2] Batch[410] Speed: 1.2582310251012245 samples/sec                   batch loss = 1015.1007677316666 | accuracy = 0.6670731707317074


Epoch[2] Batch[415] Speed: 1.253605161227075 samples/sec                   batch loss = 1029.4304465055466 | accuracy = 0.6650602409638554


Epoch[2] Batch[420] Speed: 1.262172031900786 samples/sec                   batch loss = 1042.259026169777 | accuracy = 0.6636904761904762


Epoch[2] Batch[425] Speed: 1.2555858519441085 samples/sec                   batch loss = 1056.7829591035843 | accuracy = 0.6629411764705883


Epoch[2] Batch[430] Speed: 1.2527670921878205 samples/sec                   batch loss = 1069.433829665184 | accuracy = 0.661046511627907


Epoch[2] Batch[435] Speed: 1.250827356808504 samples/sec                   batch loss = 1082.7543090581894 | accuracy = 0.6609195402298851


Epoch[2] Batch[440] Speed: 1.2508154202076593 samples/sec                   batch loss = 1099.32792365551 | accuracy = 0.6585227272727273


Epoch[2] Batch[445] Speed: 1.250913997246936 samples/sec                   batch loss = 1110.992627978325 | accuracy = 0.6589887640449438


Epoch[2] Batch[450] Speed: 1.248560233678766 samples/sec                   batch loss = 1121.9133108854294 | accuracy = 0.66


Epoch[2] Batch[455] Speed: 1.2546330976430295 samples/sec                   batch loss = 1134.3634346723557 | accuracy = 0.6593406593406593


Epoch[2] Batch[460] Speed: 1.2534117620163525 samples/sec                   batch loss = 1147.6181427240372 | accuracy = 0.658695652173913


Epoch[2] Batch[465] Speed: 1.25659831075575 samples/sec                   batch loss = 1161.7173565626144 | accuracy = 0.6580645161290323


Epoch[2] Batch[470] Speed: 1.26151708842266 samples/sec                   batch loss = 1172.3998779058456 | accuracy = 0.6590425531914894


Epoch[2] Batch[475] Speed: 1.2594555351558632 samples/sec                   batch loss = 1186.070221543312 | accuracy = 0.6584210526315789


Epoch[2] Batch[480] Speed: 1.2582519739947107 samples/sec                   batch loss = 1197.9398401975632 | accuracy = 0.659375


Epoch[2] Batch[485] Speed: 1.2556095319421545 samples/sec                   batch loss = 1210.632018327713 | accuracy = 0.6587628865979381


Epoch[2] Batch[490] Speed: 1.2620280018119652 samples/sec                   batch loss = 1223.351403951645 | accuracy = 0.6586734693877551


Epoch[2] Batch[495] Speed: 1.2638550691710115 samples/sec                   batch loss = 1236.8372211456299 | accuracy = 0.6585858585858586


Epoch[2] Batch[500] Speed: 1.2507410080990347 samples/sec                   batch loss = 1247.9048943519592 | accuracy = 0.6595


Epoch[2] Batch[505] Speed: 1.2529630061404629 samples/sec                   batch loss = 1258.9146686792374 | accuracy = 0.6599009900990099


Epoch[2] Batch[510] Speed: 1.2590005523862282 samples/sec                   batch loss = 1271.1531629562378 | accuracy = 0.6602941176470588


Epoch[2] Batch[515] Speed: 1.253988858276418 samples/sec                   batch loss = 1283.7306557893753 | accuracy = 0.6601941747572816


Epoch[2] Batch[520] Speed: 1.257213862011296 samples/sec                   batch loss = 1295.7592812776566 | accuracy = 0.6600961538461538


Epoch[2] Batch[525] Speed: 1.259904318494184 samples/sec                   batch loss = 1309.6130384206772 | accuracy = 0.66


Epoch[2] Batch[530] Speed: 1.2546835769441231 samples/sec                   batch loss = 1320.0750517845154 | accuracy = 0.6608490566037736


Epoch[2] Batch[535] Speed: 1.2500785899396383 samples/sec                   batch loss = 1332.7849606275558 | accuracy = 0.6612149532710281


Epoch[2] Batch[540] Speed: 1.2587014107897765 samples/sec                   batch loss = 1343.3641778230667 | accuracy = 0.6625


Epoch[2] Batch[545] Speed: 1.2610899052995836 samples/sec                   batch loss = 1355.7542192935944 | accuracy = 0.6614678899082569


Epoch[2] Batch[550] Speed: 1.2580852514443386 samples/sec                   batch loss = 1365.360844373703 | accuracy = 0.6631818181818182


Epoch[2] Batch[555] Speed: 1.2533651303622655 samples/sec                   batch loss = 1378.1331586837769 | accuracy = 0.6635135135135135


Epoch[2] Batch[560] Speed: 1.2622221700597713 samples/sec                   batch loss = 1389.9289120435715 | accuracy = 0.6629464285714286


Epoch[2] Batch[565] Speed: 1.262954660082354 samples/sec                   batch loss = 1402.5654458999634 | accuracy = 0.6623893805309734


Epoch[2] Batch[570] Speed: 1.2573166538615919 samples/sec                   batch loss = 1413.9973174333572 | accuracy = 0.6618421052631579


Epoch[2] Batch[575] Speed: 1.2529780717844055 samples/sec                   batch loss = 1423.0269792079926 | accuracy = 0.6643478260869565


Epoch[2] Batch[580] Speed: 1.2586750644277738 samples/sec                   batch loss = 1433.8854874372482 | accuracy = 0.6650862068965517


Epoch[2] Batch[585] Speed: 1.2557829305690547 samples/sec                   batch loss = 1443.9513436555862 | accuracy = 0.6658119658119658


Epoch[2] Batch[590] Speed: 1.2605915905640408 samples/sec                   batch loss = 1453.6349969506264 | accuracy = 0.6665254237288135


Epoch[2] Batch[595] Speed: 1.2532818949300593 samples/sec                   batch loss = 1462.2085773348808 | accuracy = 0.6680672268907563


Epoch[2] Batch[600] Speed: 1.2604370310234658 samples/sec                   batch loss = 1476.194514453411 | accuracy = 0.6670833333333334


Epoch[2] Batch[605] Speed: 1.2531229450377808 samples/sec                   batch loss = 1487.4532089829445 | accuracy = 0.6681818181818182


Epoch[2] Batch[610] Speed: 1.2565160569405571 samples/sec                   batch loss = 1498.7392804026604 | accuracy = 0.6684426229508197


Epoch[2] Batch[615] Speed: 1.256597275457751 samples/sec                   batch loss = 1507.2811061739922 | accuracy = 0.6695121951219513


Epoch[2] Batch[620] Speed: 1.2505007677917723 samples/sec                   batch loss = 1518.273982822895 | accuracy = 0.6705645161290322


Epoch[2] Batch[625] Speed: 1.2496090413416623 samples/sec                   batch loss = 1529.2271204590797 | accuracy = 0.6708


Epoch[2] Batch[630] Speed: 1.2504262999194242 samples/sec                   batch loss = 1538.4075053334236 | accuracy = 0.6718253968253968


Epoch[2] Batch[635] Speed: 1.2520236634425812 samples/sec                   batch loss = 1552.5348594784737 | accuracy = 0.6704724409448819


Epoch[2] Batch[640] Speed: 1.2540491280146648 samples/sec                   batch loss = 1562.9313890337944 | accuracy = 0.67109375


Epoch[2] Batch[645] Speed: 1.249648133648434 samples/sec                   batch loss = 1575.9251863360405 | accuracy = 0.6709302325581395


Epoch[2] Batch[650] Speed: 1.245234850332521 samples/sec                   batch loss = 1587.659991323948 | accuracy = 0.6703846153846154


Epoch[2] Batch[655] Speed: 1.2501344788194089 samples/sec                   batch loss = 1600.8160930275917 | accuracy = 0.6702290076335878


Epoch[2] Batch[660] Speed: 1.2562234561911094 samples/sec                   batch loss = 1612.7856860756874 | accuracy = 0.6704545454545454


Epoch[2] Batch[665] Speed: 1.251272157550417 samples/sec                   batch loss = 1622.5792501568794 | accuracy = 0.6710526315789473


Epoch[2] Batch[670] Speed: 1.2518566256477548 samples/sec                   batch loss = 1631.9612618088722 | accuracy = 0.6727611940298508


Epoch[2] Batch[675] Speed: 1.2519413534552748 samples/sec                   batch loss = 1641.7401247620583 | accuracy = 0.6733333333333333


Epoch[2] Batch[680] Speed: 1.254409273701372 samples/sec                   batch loss = 1652.8196610808372 | accuracy = 0.6738970588235295


Epoch[2] Batch[685] Speed: 1.260164465982795 samples/sec                   batch loss = 1667.427711904049 | accuracy = 0.672992700729927


Epoch[2] Batch[690] Speed: 1.2610464920065474 samples/sec                   batch loss = 1678.8410241007805 | accuracy = 0.6728260869565217


Epoch[2] Batch[695] Speed: 1.2553366083751132 samples/sec                   batch loss = 1688.302135527134 | accuracy = 0.6733812949640288


Epoch[2] Batch[700] Speed: 1.2480696345004052 samples/sec                   batch loss = 1702.4153980612755 | accuracy = 0.6732142857142858


Epoch[2] Batch[705] Speed: 1.2545478174947682 samples/sec                   batch loss = 1713.4940895438194 | accuracy = 0.6723404255319149


Epoch[2] Batch[710] Speed: 1.2503522136546212 samples/sec                   batch loss = 1724.8389174342155 | accuracy = 0.671830985915493


Epoch[2] Batch[715] Speed: 1.2517567792609023 samples/sec                   batch loss = 1735.3090680837631 | accuracy = 0.672027972027972


Epoch[2] Batch[720] Speed: 1.2533945322460112 samples/sec                   batch loss = 1748.276074051857 | accuracy = 0.6708333333333333


Epoch[2] Batch[725] Speed: 1.2540566269876507 samples/sec                   batch loss = 1759.9265016317368 | accuracy = 0.6706896551724137


Epoch[2] Batch[730] Speed: 1.251736232869333 samples/sec                   batch loss = 1776.3309165239334 | accuracy = 0.6695205479452054


Epoch[2] Batch[735] Speed: 1.2511751102808997 samples/sec                   batch loss = 1788.9728999137878 | accuracy = 0.669047619047619


Epoch[2] Batch[740] Speed: 1.2534511862619275 samples/sec                   batch loss = 1799.3483628034592 | accuracy = 0.6692567567567568


Epoch[2] Batch[745] Speed: 1.2555192332488523 samples/sec                   batch loss = 1808.5525344610214 | accuracy = 0.6701342281879195


Epoch[2] Batch[750] Speed: 1.254302736799955 samples/sec                   batch loss = 1820.3671602010727 | accuracy = 0.67


Epoch[2] Batch[755] Speed: 1.2519000623964567 samples/sec                   batch loss = 1830.6055009365082 | accuracy = 0.6705298013245033


Epoch[2] Batch[760] Speed: 1.2547289930205898 samples/sec                   batch loss = 1842.195504307747 | accuracy = 0.6707236842105263


Epoch[2] Batch[765] Speed: 1.2533048327251406 samples/sec                   batch loss = 1854.2115721702576 | accuracy = 0.6709150326797385


Epoch[2] Batch[770] Speed: 1.2509499999254377 samples/sec                   batch loss = 1865.5265446901321 | accuracy = 0.6714285714285714


Epoch[2] Batch[775] Speed: 1.2590481712367518 samples/sec                   batch loss = 1878.6411689519882 | accuracy = 0.6716129032258065


Epoch[2] Batch[780] Speed: 1.2543990506257185 samples/sec                   batch loss = 1887.247725725174 | accuracy = 0.6730769230769231


Epoch[2] Batch[785] Speed: 1.2571903098753998 samples/sec                   batch loss = 1900.0083544254303 | accuracy = 0.6729299363057325


[Epoch 2] training: accuracy=0.6729060913705583
[Epoch 2] time cost: 644.1332046985626
[Epoch 2] validation: validation accuracy=0.7322222222222222
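
With this much per-batch output, the per-epoch summary lines are easy to miss. The sketch below is one possible way to pull them out of a captured log. It assumes the log text matches the format printed above; `parse_epoch_summaries`, `SUMMARY_RE`, `KEY_NAMES`, and `sample_log` are hypothetical names introduced for illustration and are not part of MXNet.

```python
import re

# Matches the three per-epoch summary lines shown in the output above, e.g.
# "[Epoch 1] training: accuracy=0.57" or "[Epoch 1] time cost: 646.67".
SUMMARY_RE = re.compile(
    r"^\[Epoch (\d+)\] "
    r"(training: accuracy|time cost|validation: validation accuracy)"
    r"\s*[=:]\s*([\d.]+)\s*$"
)

# Short dictionary keys for each kind of summary line (illustrative names).
KEY_NAMES = {
    "training: accuracy": "train_acc",
    "time cost": "seconds",
    "validation: validation accuracy": "val_acc",
}


def parse_epoch_summaries(log_text):
    """Return {epoch: {"train_acc": ..., "seconds": ..., "val_acc": ...}}."""
    summaries = {}
    for line in log_text.splitlines():
        match = SUMMARY_RE.match(line.strip())
        if match is None:
            continue  # per-batch lines and blank lines are skipped
        epoch = int(match.group(1))
        key = KEY_NAMES[match.group(2)]
        summaries.setdefault(epoch, {})[key] = float(match.group(3))
    return summaries


# A few lines copied from the output above, including one per-batch line
# to show that it is ignored.
sample_log = """\
[Epoch 1] training: accuracy=0.5774111675126904
[Epoch 1] time cost: 646.6779227256775
[Epoch 1] validation: validation accuracy=0.6722222222222223
Epoch[2] Batch[5] Speed: 1.2600591259566736 samples/sec                   batch loss = 10.562422513961792 | accuracy = 0.9
[Epoch 2] training: accuracy=0.6729060913705583
[Epoch 2] time cost: 644.1332046985626
[Epoch 2] validation: validation accuracy=0.7322222222222222
"""

for epoch, stats in sorted(parse_epoch_summaries(sample_log).items()):
    print(
        f"epoch {epoch}: train={stats['train_acc']:.3f} "
        f"val={stats['val_acc']:.3f} time={stats['seconds']:.0f}s"
    )
```

In the run shown above, validation accuracy rises from about 0.672 after epoch 1 to about 0.732 after epoch 2, while each epoch takes roughly 645 seconds.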


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).