<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference may run faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure that your machine has at least one NVIDIA GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need the GPU-enabled version of MXNet; the [installation guide](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html) explains how to install it for your system.

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Number of GPUs that MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that an MXNet ndarray has a `context` attribute specifying which device the array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[03:32:27] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the output shows which GPU the ndarray is allocated on, for example `ctx=gpu(0)` above.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If your machine has fewer than two GPUs, the code falls back to copying `x` to the CPU instead:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[03:32:27] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
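
As a minimal sketch of this zero-based numbering, the helper below checks a list of requested GPU ids against the number of GPUs on the machine. The name `select_gpu_ids` is hypothetical and not part of MXNet. It uses plain Python so that it runs without a GPU. In practice, `num_available` would come from `npx.num_gpus()`, and each returned id could be passed to `npx.gpu(i)`.

```python
def select_gpu_ids(requested, num_available):
    """Return the requested GPU ids if they all exist on this machine.

    Hypothetical helper for illustration only. Ids are zero-based, so the
    valid ids run from 0 to num_available - 1.
    """
    valid = range(num_available)
    missing = [i for i in requested if i not in valid]
    if missing:
        raise ValueError(
            f"GPU ids {missing} do not exist; valid ids are {list(valid)}"
        )
    return list(requested)


# On a machine with eight GPUs, the first id is 0 and the last valid id is 7.
ids = select_gpu_ids([0, 7], num_available=8)
print(ids)
```

Asking for id 8 on the same eight-GPU machine would raise a `ValueError`, because the ids stop at 7.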

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, after loading, you can call `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[03:32:27] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 3.7292671, -5.4460793]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. Assume there are *n* GPUs: you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
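
The arithmetic behind data parallelism can be illustrated without any GPU. The sketch below is an assumption-laden toy, not MXNet code. It uses plain NumPy (imported as `onp` to avoid clashing with `mxnet.np`), a linear model with a squared-error loss, and two simulated devices. It shows that summing the gradients computed on each shard and dividing by the full batch size gives the same result as computing the mean gradient on the whole batch at once.

```python
import numpy as onp  # plain NumPy, named `onp` to avoid clashing with `mxnet.np`


def summed_grad(w, X, y):
    """Gradient of sum_i 0.5 * (x_i . w - y_i)**2 with respect to w."""
    return X.T @ (X @ w - y)


rng = onp.random.default_rng(0)
batch_size, num_features, num_devices = 4, 3, 2
X = rng.normal(size=(batch_size, num_features))
y = rng.normal(size=batch_size)
w = rng.normal(size=num_features)

# Split the batch into one shard per simulated device.
X_shards = onp.array_split(X, num_devices)
y_shards = onp.array_split(y, num_devices)

# Each device computes the summed gradient on its own shard.
shard_grads = [summed_grad(w, Xs, ys) for Xs, ys in zip(X_shards, y_shards)]

# The shard gradients are added up and normalized by the full batch size.
parallel_grad = sum(shard_grads) / batch_size

# Reference: the mean gradient computed on the whole batch in one go.
full_grad = summed_grad(w, X, y) / batch_size

print(onp.allclose(parallel_grad, full_grad))
```

The division by the full batch size here corresponds to passing the batch size to `trainer.step` in the training loop later in this step.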

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same `test` function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
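
To make the per-device bookkeeping concrete, the sketch below mimics in plain NumPy roughly what happens when an accuracy metric is updated with a list of label shards and a list of output shards. It is an illustration under stated assumptions, not the `gluon.metric.Accuracy` implementation. The name `accumulate_accuracy` is hypothetical, and the predicted class is assumed to be the index of the largest output in each row.

```python
import numpy as onp  # plain NumPy, named `onp` to avoid clashing with `mxnet.np`


def accumulate_accuracy(label_shards, output_shards):
    """Accuracy over all shards: correct predictions / total samples.

    Hypothetical helper for illustration only.
    """
    correct, total = 0, 0
    for labels, outputs in zip(label_shards, output_shards):
        predictions = outputs.argmax(axis=1)  # class with the largest score
        correct += int((predictions == labels).sum())
        total += labels.shape[0]
    return correct / total


# Two shards of two samples each, as if a batch of four was split over two devices.
label_shards = [onp.array([0, 1]), onp.array([1, 1])]
output_shards = [
    onp.array([[2.0, -1.0], [0.5, 1.5]]),  # predictions 0, 1: both correct
    onp.array([[0.1, 0.9], [3.0, -2.0]]),  # predictions 1, 0: one correct
]
print(accumulate_accuracy(label_shards, output_shards))  # 0.75
```

Because the counts are accumulated across shards before dividing, the result does not depend on how the batch was split.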

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, calling
        # float() copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7654645039512142 samples/sec                   batch loss = 14.26544189453125 | accuracy = 0.5


Epoch[1] Batch[10] Speed: 1.2507287934234446 samples/sec                   batch loss = 28.195366144180298 | accuracy = 0.5


Epoch[1] Batch[15] Speed: 1.2547673740451004 samples/sec                   batch loss = 42.0564239025116 | accuracy = 0.55


Epoch[1] Batch[20] Speed: 1.2468009010857515 samples/sec                   batch loss = 55.23058080673218 | accuracy = 0.5375


Epoch[1] Batch[25] Speed: 1.2483652295955705 samples/sec                   batch loss = 69.1113908290863 | accuracy = 0.53


Epoch[1] Batch[30] Speed: 1.255639227170255 samples/sec                   batch loss = 82.61458969116211 | accuracy = 0.5666666666666667


Epoch[1] Batch[35] Speed: 1.2493530471959247 samples/sec                   batch loss = 95.67316341400146 | accuracy = 0.5714285714285714


Epoch[1] Batch[40] Speed: 1.251155795937976 samples/sec                   batch loss = 111.16531252861023 | accuracy = 0.54375


Epoch[1] Batch[45] Speed: 1.25280497907588 samples/sec                   batch loss = 125.3194751739502 | accuracy = 0.5444444444444444


Epoch[1] Batch[50] Speed: 1.2487589236323449 samples/sec                   batch loss = 138.48406171798706 | accuracy = 0.55


Epoch[1] Batch[55] Speed: 1.2563733147257383 samples/sec                   batch loss = 152.28498649597168 | accuracy = 0.5590909090909091


Epoch[1] Batch[60] Speed: 1.253236021858629 samples/sec                   batch loss = 166.81483697891235 | accuracy = 0.5333333333333333


Epoch[1] Batch[65] Speed: 1.2554812759692213 samples/sec                   batch loss = 181.07622146606445 | accuracy = 0.5307692307692308


Epoch[1] Batch[70] Speed: 1.2534891145394274 samples/sec                   batch loss = 194.16267132759094 | accuracy = 0.5428571428571428


Epoch[1] Batch[75] Speed: 1.2433330025627434 samples/sec                   batch loss = 208.22943758964539 | accuracy = 0.5333333333333333


Epoch[1] Batch[80] Speed: 1.2463311260895413 samples/sec                   batch loss = 222.00569987297058 | accuracy = 0.534375


Epoch[1] Batch[85] Speed: 1.2537420283935248 samples/sec                   batch loss = 236.94022631645203 | accuracy = 0.5235294117647059


Epoch[1] Batch[90] Speed: 1.253515056930085 samples/sec                   batch loss = 250.5303180217743 | accuracy = 0.5277777777777778


Epoch[1] Batch[95] Speed: 1.2542368167779245 samples/sec                   batch loss = 264.6327877044678 | accuracy = 0.5184210526315789


Epoch[1] Batch[100] Speed: 1.2450562212417942 samples/sec                   batch loss = 278.15051198005676 | accuracy = 0.5275


Epoch[1] Batch[105] Speed: 1.2485553090527168 samples/sec                   batch loss = 291.7213749885559 | accuracy = 0.5357142857142857


Epoch[1] Batch[110] Speed: 1.2503351610781646 samples/sec                   batch loss = 305.1374235153198 | accuracy = 0.5363636363636364


Epoch[1] Batch[115] Speed: 1.2500644322167491 samples/sec                   batch loss = 319.7601692676544 | accuracy = 0.5282608695652173


Epoch[1] Batch[120] Speed: 1.249326625565092 samples/sec                   batch loss = 333.54053354263306 | accuracy = 0.5291666666666667


Epoch[1] Batch[125] Speed: 1.2451200708663615 samples/sec                   batch loss = 347.1267259120941 | accuracy = 0.534


Epoch[1] Batch[130] Speed: 1.2543123956497595 samples/sec                   batch loss = 361.37751054763794 | accuracy = 0.5307692307692308


Epoch[1] Batch[135] Speed: 1.2437420608400507 samples/sec                   batch loss = 375.4329581260681 | accuracy = 0.5277777777777778


Epoch[1] Batch[140] Speed: 1.2469954167172088 samples/sec                   batch loss = 388.98929357528687 | accuracy = 0.5285714285714286


Epoch[1] Batch[145] Speed: 1.2512621721846442 samples/sec                   batch loss = 401.6978199481964 | accuracy = 0.5379310344827586


Epoch[1] Batch[150] Speed: 1.2474898094845779 samples/sec                   batch loss = 414.55207562446594 | accuracy = 0.545


Epoch[1] Batch[155] Speed: 1.248964371213309 samples/sec                   batch loss = 428.1368751525879 | accuracy = 0.5483870967741935


Epoch[1] Batch[160] Speed: 1.25127579711589 samples/sec                   batch loss = 441.9705994129181 | accuracy = 0.5484375


Epoch[1] Batch[165] Speed: 1.2541433403767883 samples/sec                   batch loss = 455.48066902160645 | accuracy = 0.5515151515151515


Epoch[1] Batch[170] Speed: 1.243865162808043 samples/sec                   batch loss = 469.34409499168396 | accuracy = 0.55


Epoch[1] Batch[175] Speed: 1.2497313525603408 samples/sec                   batch loss = 483.156236410141 | accuracy = 0.5471428571428572


Epoch[1] Batch[180] Speed: 1.249677547567879 samples/sec                   batch loss = 497.2753965854645 | accuracy = 0.5430555555555555


Epoch[1] Batch[185] Speed: 1.2493474650681302 samples/sec                   batch loss = 510.20525550842285 | accuracy = 0.5472972972972973


Epoch[1] Batch[190] Speed: 1.2537802553637332 samples/sec                   batch loss = 523.8543961048126 | accuracy = 0.5486842105263158


Epoch[1] Batch[195] Speed: 1.2499483882340223 samples/sec                   batch loss = 537.877593755722 | accuracy = 0.5474358974358975


Epoch[1] Batch[200] Speed: 1.2471457700395987 samples/sec                   batch loss = 552.0634515285492 | accuracy = 0.54375


Epoch[1] Batch[205] Speed: 1.2423360949414832 samples/sec                   batch loss = 565.7721364498138 | accuracy = 0.5414634146341464


Epoch[1] Batch[210] Speed: 1.2422981026856454 samples/sec                   batch loss = 579.1647655963898 | accuracy = 0.5416666666666666


Epoch[1] Batch[215] Speed: 1.24749945645207 samples/sec                   batch loss = 592.3774666786194 | accuracy = 0.5430232558139535


Epoch[1] Batch[220] Speed: 1.2527539960187473 samples/sec                   batch loss = 605.6905777454376 | accuracy = 0.5454545454545454


Epoch[1] Batch[225] Speed: 1.2433937267417887 samples/sec                   batch loss = 618.9203517436981 | accuracy = 0.5477777777777778


Epoch[1] Batch[230] Speed: 1.2521346729367377 samples/sec                   batch loss = 632.951155424118 | accuracy = 0.5456521739130434


Epoch[1] Batch[235] Speed: 1.2474039209481604 samples/sec                   batch loss = 646.7138996124268 | accuracy = 0.5468085106382978


Epoch[1] Batch[240] Speed: 1.2466450728951404 samples/sec                   batch loss = 659.6321527957916 | accuracy = 0.5520833333333334


Epoch[1] Batch[245] Speed: 1.2425948352359126 samples/sec                   batch loss = 674.3625016212463 | accuracy = 0.5479591836734694


Epoch[1] Batch[250] Speed: 1.249855270604672 samples/sec                   batch loss = 687.9422812461853 | accuracy = 0.548


Epoch[1] Batch[255] Speed: 1.2549520862365462 samples/sec                   batch loss = 701.2313117980957 | accuracy = 0.5519607843137255


Epoch[1] Batch[260] Speed: 1.2453153564475397 samples/sec                   batch loss = 714.211422920227 | accuracy = 0.5548076923076923


Epoch[1] Batch[265] Speed: 1.2514321317715633 samples/sec                   batch loss = 727.9827513694763 | accuracy = 0.5537735849056604


Epoch[1] Batch[270] Speed: 1.250397130317026 samples/sec                   batch loss = 741.0426788330078 | accuracy = 0.5564814814814815


Epoch[1] Batch[275] Speed: 1.2510954307976103 samples/sec                   batch loss = 754.5164937973022 | accuracy = 0.5572727272727273


Epoch[1] Batch[280] Speed: 1.2542471309755954 samples/sec                   batch loss = 767.91885638237 | accuracy = 0.5589285714285714


Epoch[1] Batch[285] Speed: 1.260062059717175 samples/sec                   batch loss = 780.7578887939453 | accuracy = 0.5631578947368421


Epoch[1] Batch[290] Speed: 1.2566333236561005 samples/sec                   batch loss = 794.5708570480347 | accuracy = 0.5637931034482758


Epoch[1] Batch[295] Speed: 1.2556888476476926 samples/sec                   batch loss = 807.8415765762329 | accuracy = 0.5652542372881356


Epoch[1] Batch[300] Speed: 1.2601690093450504 samples/sec                   batch loss = 822.4325094223022 | accuracy = 0.5633333333333334


Epoch[1] Batch[305] Speed: 1.2510014892221535 samples/sec                   batch loss = 836.1450278759003 | accuracy = 0.5622950819672131


Epoch[1] Batch[310] Speed: 1.2563666347705267 samples/sec                   batch loss = 849.8585875034332 | accuracy = 0.5612903225806452


Epoch[1] Batch[315] Speed: 1.2587456071641863 samples/sec                   batch loss = 863.7289566993713 | accuracy = 0.5603174603174603


Epoch[1] Batch[320] Speed: 1.2502078871401041 samples/sec                   batch loss = 877.9493358135223 | accuracy = 0.55859375


Epoch[1] Batch[325] Speed: 1.247102384471876 samples/sec                   batch loss = 890.7916958332062 | accuracy = 0.56


Epoch[1] Batch[330] Speed: 1.2492299728534064 samples/sec                   batch loss = 904.6014482975006 | accuracy = 0.5590909090909091


Epoch[1] Batch[335] Speed: 1.2429941986695088 samples/sec                   batch loss = 917.1458783149719 | accuracy = 0.5611940298507463


Epoch[1] Batch[340] Speed: 1.2545268041451418 samples/sec                   batch loss = 930.3049552440643 | accuracy = 0.5625


Epoch[1] Batch[345] Speed: 1.246367698772919 samples/sec                   batch loss = 943.5244197845459 | accuracy = 0.5644927536231884


Epoch[1] Batch[350] Speed: 1.2493298816916143 samples/sec                   batch loss = 956.5226395130157 | accuracy = 0.5635714285714286


Epoch[1] Batch[355] Speed: 1.2559077696324856 samples/sec                   batch loss = 970.8919367790222 | accuracy = 0.5612676056338028


Epoch[1] Batch[360] Speed: 1.2497341453334871 samples/sec                   batch loss = 984.0320346355438 | accuracy = 0.5631944444444444


Epoch[1] Batch[365] Speed: 1.2553981349978973 samples/sec                   batch loss = 997.7889273166656 | accuracy = 0.5623287671232877


Epoch[1] Batch[370] Speed: 1.2512390291458795 samples/sec                   batch loss = 1011.001002073288 | accuracy = 0.5628378378378378


Epoch[1] Batch[375] Speed: 1.2562797077459427 samples/sec                   batch loss = 1024.1715655326843 | accuracy = 0.5626666666666666


Epoch[1] Batch[380] Speed: 1.2607138823498647 samples/sec                   batch loss = 1038.7492825984955 | accuracy = 0.5611842105263158


Epoch[1] Batch[385] Speed: 1.2561772734797565 samples/sec                   batch loss = 1051.9411404132843 | accuracy = 0.5623376623376624


Epoch[1] Batch[390] Speed: 1.2575248329286923 samples/sec                   batch loss = 1064.5833160877228 | accuracy = 0.5634615384615385


Epoch[1] Batch[395] Speed: 1.254036473701115 samples/sec                   batch loss = 1077.661480665207 | accuracy = 0.5639240506329114


Epoch[1] Batch[400] Speed: 1.2507101455239213 samples/sec                   batch loss = 1091.0435569286346 | accuracy = 0.564375


Epoch[1] Batch[405] Speed: 1.2514907556700334 samples/sec                   batch loss = 1104.3806507587433 | accuracy = 0.5648148148148148


Epoch[1] Batch[410] Speed: 1.2511922790863408 samples/sec                   batch loss = 1118.4841408729553 | accuracy = 0.5634146341463414


Epoch[1] Batch[415] Speed: 1.251984048658472 samples/sec                   batch loss = 1131.159794807434 | accuracy = 0.5644578313253013


Epoch[1] Batch[420] Speed: 1.254520331424795 samples/sec                   batch loss = 1145.7594962120056 | accuracy = 0.5630952380952381


Epoch[1] Batch[425] Speed: 1.2585243731032267 samples/sec                   batch loss = 1158.8067042827606 | accuracy = 0.5652941176470588


Epoch[1] Batch[430] Speed: 1.2513546595672205 samples/sec                   batch loss = 1171.1070446968079 | accuracy = 0.5680232558139535


Epoch[1] Batch[435] Speed: 1.2539095697472147 samples/sec                   batch loss = 1184.4902980327606 | accuracy = 0.5689655172413793


Epoch[1] Batch[440] Speed: 1.2553660088943748 samples/sec                   batch loss = 1197.476662158966 | accuracy = 0.5698863636363637


Epoch[1] Batch[445] Speed: 1.2562281593053808 samples/sec                   batch loss = 1209.9577198028564 | accuracy = 0.5707865168539326


Epoch[1] Batch[450] Speed: 1.2558420568207882 samples/sec                   batch loss = 1222.6923907995224 | accuracy = 0.5716666666666667


Epoch[1] Batch[455] Speed: 1.2571304914172745 samples/sec                   batch loss = 1236.1331428289413 | accuracy = 0.5719780219780219


Epoch[1] Batch[460] Speed: 1.261627131379437 samples/sec                   batch loss = 1249.5168615579605 | accuracy = 0.5728260869565217


Epoch[1] Batch[465] Speed: 1.2479421715639913 samples/sec                   batch loss = 1263.4853817224503 | accuracy = 0.5731182795698925


Epoch[1] Batch[470] Speed: 1.2524926911493721 samples/sec                   batch loss = 1276.1285897493362 | accuracy = 0.575


Epoch[1] Batch[475] Speed: 1.244569202260477 samples/sec                   batch loss = 1289.3316074609756 | accuracy = 0.5747368421052632


Epoch[1] Batch[480] Speed: 1.245574876966078 samples/sec                   batch loss = 1301.6770921945572 | accuracy = 0.5770833333333333


Epoch[1] Batch[485] Speed: 1.2482693756970134 samples/sec                   batch loss = 1315.4139763116837 | accuracy = 0.5752577319587628


Epoch[1] Batch[490] Speed: 1.252769524363653 samples/sec                   batch loss = 1328.6240805387497 | accuracy = 0.5770408163265306


Epoch[1] Batch[495] Speed: 1.254845269405685 samples/sec                   batch loss = 1342.4574438333511 | accuracy = 0.5767676767676768


Epoch[1] Batch[500] Speed: 1.2546999976666784 samples/sec                   batch loss = 1354.7398866415024 | accuracy = 0.5785


Epoch[1] Batch[505] Speed: 1.2521156093004788 samples/sec                   batch loss = 1367.3735972642899 | accuracy = 0.5797029702970297


Epoch[1] Batch[510] Speed: 1.2598466066205336 samples/sec                   batch loss = 1379.9802819490433 | accuracy = 0.5794117647058824


Epoch[1] Batch[515] Speed: 1.2551581683171094 samples/sec                   batch loss = 1394.0191978216171 | accuracy = 0.5776699029126213


Epoch[1] Batch[520] Speed: 1.2575103175305715 samples/sec                   batch loss = 1407.0488971471786 | accuracy = 0.5774038461538461


Epoch[1] Batch[525] Speed: 1.2578695306865757 samples/sec                   batch loss = 1419.4634996652603 | accuracy = 0.5776190476190476


Epoch[1] Batch[530] Speed: 1.2577802267828695 samples/sec                   batch loss = 1434.094784617424 | accuracy = 0.5768867924528301


Epoch[1] Batch[535] Speed: 1.2568768678824935 samples/sec                   batch loss = 1445.5071334838867 | accuracy = 0.5789719626168224


Epoch[1] Batch[540] Speed: 1.2560200331755111 samples/sec                   batch loss = 1458.7766680717468 | accuracy = 0.5796296296296296


Epoch[1] Batch[545] Speed: 1.2570187827371744 samples/sec                   batch loss = 1472.2868783473969 | accuracy = 0.5793577981651377


Epoch[1] Batch[550] Speed: 1.260503130894093 samples/sec                   batch loss = 1486.7540347576141 | accuracy = 0.5781818181818181


Epoch[1] Batch[555] Speed: 1.2471872116376672 samples/sec                   batch loss = 1500.2159459590912 | accuracy = 0.5779279279279279


Epoch[1] Batch[560] Speed: 1.2490871142356872 samples/sec                   batch loss = 1513.9626126289368 | accuracy = 0.5785714285714286


Epoch[1] Batch[565] Speed: 1.2474091147353277 samples/sec                   batch loss = 1527.3803055286407 | accuracy = 0.5783185840707965


Epoch[1] Batch[570] Speed: 1.2535132774538251 samples/sec                   batch loss = 1541.9691655635834 | accuracy = 0.5780701754385965


Epoch[1] Batch[575] Speed: 1.2514559353992396 samples/sec                   batch loss = 1554.2879695892334 | accuracy = 0.5795652173913044


Epoch[1] Batch[580] Speed: 1.246489098431548 samples/sec                   batch loss = 1566.3847242593765 | accuracy = 0.5801724137931035


Epoch[1] Batch[585] Speed: 1.2484224517296385 samples/sec                   batch loss = 1578.7561749219894 | accuracy = 0.5816239316239317


Epoch[1] Batch[590] Speed: 1.245663380900871 samples/sec                   batch loss = 1591.4426169395447 | accuracy = 0.5822033898305085


Epoch[1] Batch[595] Speed: 1.2554286657135898 samples/sec                   batch loss = 1605.596316576004 | accuracy = 0.5806722689075631


Epoch[1] Batch[600] Speed: 1.2427109905657547 samples/sec                   batch loss = 1617.533213019371 | accuracy = 0.5816666666666667


Epoch[1] Batch[605] Speed: 1.2474772872012352 samples/sec                   batch loss = 1630.9163571596146 | accuracy = 0.5805785123966942


Epoch[1] Batch[610] Speed: 1.2407782967691665 samples/sec                   batch loss = 1644.0046368837357 | accuracy = 0.5807377049180328


Epoch[1] Batch[615] Speed: 1.248905240004532 samples/sec                   batch loss = 1657.33485019207 | accuracy = 0.5804878048780487


Epoch[1] Batch[620] Speed: 1.2475258936180544 samples/sec                   batch loss = 1671.520586848259 | accuracy = 0.5806451612903226


Epoch[1] Batch[625] Speed: 1.2520082470140097 samples/sec                   batch loss = 1685.9624415636063 | accuracy = 0.582


Epoch[1] Batch[630] Speed: 1.2517040137498405 samples/sec                   batch loss = 1698.4657599925995 | accuracy = 0.582936507936508


Epoch[1] Batch[635] Speed: 1.2510805036859682 samples/sec                   batch loss = 1713.7752735614777 | accuracy = 0.5811023622047244


Epoch[1] Batch[640] Speed: 1.252870280868737 samples/sec                   batch loss = 1726.9766354560852 | accuracy = 0.58125


Epoch[1] Batch[645] Speed: 1.2519232298949776 samples/sec                   batch loss = 1740.2958971261978 | accuracy = 0.5813953488372093


Epoch[1] Batch[650] Speed: 1.2513086474853026 samples/sec                   batch loss = 1752.4195846319199 | accuracy = 0.5823076923076923


Epoch[1] Batch[655] Speed: 1.2530007177196083 samples/sec                   batch loss = 1766.346773982048 | accuracy = 0.5820610687022901


Epoch[1] Batch[660] Speed: 1.2520680461900218 samples/sec                   batch loss = 1779.8698021173477 | accuracy = 0.5814393939393939


Epoch[1] Batch[665] Speed: 1.2516192246894287 samples/sec                   batch loss = 1793.6919672489166 | accuracy = 0.5812030075187969


Epoch[1] Batch[670] Speed: 1.2535543940113494 samples/sec                   batch loss = 1808.3866094350815 | accuracy = 0.5805970149253732


Epoch[1] Batch[675] Speed: 1.2524576281055726 samples/sec                   batch loss = 1824.1067479848862 | accuracy = 0.5796296296296296


Epoch[1] Batch[680] Speed: 1.2507133156276096 samples/sec                   batch loss = 1836.3120896816254 | accuracy = 0.58125


Epoch[1] Batch[685] Speed: 1.249315926983164 samples/sec                   batch loss = 1849.348352432251 | accuracy = 0.5813868613138686


Epoch[1] Batch[690] Speed: 1.2566548782666371 samples/sec                   batch loss = 1861.9295341968536 | accuracy = 0.5822463768115942


Epoch[1] Batch[695] Speed: 1.2522512167064597 samples/sec                   batch loss = 1872.3284285068512 | accuracy = 0.5838129496402877


Epoch[1] Batch[700] Speed: 1.255338486961356 samples/sec                   batch loss = 1886.9189298152924 | accuracy = 0.5832142857142857


Epoch[1] Batch[705] Speed: 1.254756675919602 samples/sec                   batch loss = 1900.416194677353 | accuracy = 0.5819148936170213


Epoch[1] Batch[710] Speed: 1.255134787037558 samples/sec                   batch loss = 1913.777153968811 | accuracy = 0.5823943661971831


Epoch[1] Batch[715] Speed: 1.2559969018592962 samples/sec                   batch loss = 1925.5608422756195 | accuracy = 0.5835664335664336


Epoch[1] Batch[720] Speed: 1.2531381081286588 samples/sec                   batch loss = 1938.4188976287842 | accuracy = 0.5836805555555555


Epoch[1] Batch[725] Speed: 1.2558651824338334 samples/sec                   batch loss = 1950.760814189911 | accuracy = 0.5844827586206897


Epoch[1] Batch[730] Speed: 1.250687489074237 samples/sec                   batch loss = 1962.8677337169647 | accuracy = 0.5856164383561644


Epoch[1] Batch[735] Speed: 1.2544697714586939 samples/sec                   batch loss = 1974.8554666042328 | accuracy = 0.5863945578231292


Epoch[1] Batch[740] Speed: 1.251382567123425 samples/sec                   batch loss = 1986.4084312915802 | accuracy = 0.5864864864864865


Epoch[1] Batch[745] Speed: 1.2501181773799255 samples/sec                   batch loss = 1999.7013533115387 | accuracy = 0.5859060402684564


Epoch[1] Batch[750] Speed: 1.2507746696229012 samples/sec                   batch loss = 2012.888057589531 | accuracy = 0.5853333333333334


Epoch[1] Batch[755] Speed: 1.2519358415953294 samples/sec                   batch loss = 2025.0474747419357 | accuracy = 0.5857615894039735


Epoch[1] Batch[760] Speed: 1.2527768209478134 samples/sec                   batch loss = 2036.781732082367 | accuracy = 0.587171052631579


Epoch[1] Batch[765] Speed: 1.251180708752641 samples/sec                   batch loss = 2049.997947335243 | accuracy = 0.5866013071895425


Epoch[1] Batch[770] Speed: 1.2499088115462462 samples/sec                   batch loss = 2062.8378225564957 | accuracy = 0.5863636363636363


Epoch[1] Batch[775] Speed: 1.2488255704853102 samples/sec                   batch loss = 2076.2641764879227 | accuracy = 0.5864516129032258


Epoch[1] Batch[780] Speed: 1.2488828348300207 samples/sec                   batch loss = 2086.8533046245575 | accuracy = 0.5881410256410257


Epoch[1] Batch[785] Speed: 1.249243274504663 samples/sec                   batch loss = 2101.643754720688 | accuracy = 0.5885350318471337


[Epoch 1] training: accuracy=0.5885152284263959
[Epoch 1] time cost: 648.2725083827972
[Epoch 1] validation: validation accuracy=0.71


Epoch[2] Batch[5] Speed: 1.2508116900666164 samples/sec                   batch loss = 11.963411808013916 | accuracy = 0.7


Epoch[2] Batch[10] Speed: 1.2522884180823433 samples/sec                   batch loss = 26.07426416873932 | accuracy = 0.625


Epoch[2] Batch[15] Speed: 1.2453405918502019 samples/sec                   batch loss = 40.1668940782547 | accuracy = 0.5833333333333334


Epoch[2] Batch[20] Speed: 1.2478155699227735 samples/sec                   batch loss = 53.0569714307785 | accuracy = 0.5875


Epoch[2] Batch[25] Speed: 1.2509479479036136 samples/sec                   batch loss = 65.55414855480194 | accuracy = 0.6


Epoch[2] Batch[30] Speed: 1.2481930374833266 samples/sec                   batch loss = 80.53346288204193 | accuracy = 0.5583333333333333


Epoch[2] Batch[35] Speed: 1.2476596735809737 samples/sec                   batch loss = 92.10110676288605 | accuracy = 0.5928571428571429


Epoch[2] Batch[40] Speed: 1.2428574582266148 samples/sec                   batch loss = 103.6598870754242 | accuracy = 0.60625


Epoch[2] Batch[45] Speed: 1.2447194327826403 samples/sec                   batch loss = 116.90700054168701 | accuracy = 0.6


Epoch[2] Batch[50] Speed: 1.2525400059949734 samples/sec                   batch loss = 129.63822102546692 | accuracy = 0.615


Epoch[2] Batch[55] Speed: 1.2502893171132423 samples/sec                   batch loss = 142.45393085479736 | accuracy = 0.6272727272727273


Epoch[2] Batch[60] Speed: 1.24683194167366 samples/sec                   batch loss = 155.13422012329102 | accuracy = 0.625


Epoch[2] Batch[65] Speed: 1.2506941087750676 samples/sec                   batch loss = 168.53061938285828 | accuracy = 0.6230769230769231


Epoch[2] Batch[70] Speed: 1.2545818718730732 samples/sec                   batch loss = 182.71180963516235 | accuracy = 0.6142857142857143


Epoch[2] Batch[75] Speed: 1.2448556595943472 samples/sec                   batch loss = 194.35320949554443 | accuracy = 0.6266666666666667


Epoch[2] Batch[80] Speed: 1.2443188678646349 samples/sec                   batch loss = 208.98275923728943 | accuracy = 0.61875


Epoch[2] Batch[85] Speed: 1.2457687326989106 samples/sec                   batch loss = 221.13109135627747 | accuracy = 0.6264705882352941


Epoch[2] Batch[90] Speed: 1.2503939618159627 samples/sec                   batch loss = 232.80453634262085 | accuracy = 0.6277777777777778


Epoch[2] Batch[95] Speed: 1.2469225705018352 samples/sec                   batch loss = 247.64872002601624 | accuracy = 0.618421052631579


Epoch[2] Batch[100] Speed: 1.2530800785314153 samples/sec                   batch loss = 260.1974022388458 | accuracy = 0.6225


Epoch[2] Batch[105] Speed: 1.2516139024113306 samples/sec                   batch loss = 271.65989422798157 | accuracy = 0.6285714285714286


Epoch[2] Batch[110] Speed: 1.2528992852808296 samples/sec                   batch loss = 284.08521699905396 | accuracy = 0.6295454545454545


Epoch[2] Batch[115] Speed: 1.241405264209771 samples/sec                   batch loss = 295.67923164367676 | accuracy = 0.6347826086956522


Epoch[2] Batch[120] Speed: 1.2507186302492586 samples/sec                   batch loss = 308.92911541461945 | accuracy = 0.63125


Epoch[2] Batch[125] Speed: 1.2503963847859194 samples/sec                   batch loss = 320.88033080101013 | accuracy = 0.63


Epoch[2] Batch[130] Speed: 1.2543560029885243 samples/sec                   batch loss = 333.0008547306061 | accuracy = 0.6365384615384615


Epoch[2] Batch[135] Speed: 1.2442172674842833 samples/sec                   batch loss = 348.5568835735321 | accuracy = 0.6277777777777778


Epoch[2] Batch[140] Speed: 1.248189787280011 samples/sec                   batch loss = 360.13611114025116 | accuracy = 0.6321428571428571


Epoch[2] Batch[145] Speed: 1.2479984265766908 samples/sec                   batch loss = 372.4243447780609 | accuracy = 0.6310344827586207


Epoch[2] Batch[150] Speed: 1.2573206113540651 samples/sec                   batch loss = 383.1954538822174 | accuracy = 0.6383333333333333


Epoch[2] Batch[155] Speed: 1.2470326772317821 samples/sec                   batch loss = 394.85123121738434 | accuracy = 0.6387096774193548


Epoch[2] Batch[160] Speed: 1.25189623236841 samples/sec                   batch loss = 407.0363048315048 | accuracy = 0.640625


Epoch[2] Batch[165] Speed: 1.2522944004194314 samples/sec                   batch loss = 419.8518010377884 | accuracy = 0.6424242424242425


Epoch[2] Batch[170] Speed: 1.2498515461880764 samples/sec                   batch loss = 432.77344024181366 | accuracy = 0.6441176470588236


Epoch[2] Batch[175] Speed: 1.2508984216008 samples/sec                   batch loss = 442.9678316116333 | accuracy = 0.6485714285714286


Epoch[2] Batch[180] Speed: 1.2529241740082184 samples/sec                   batch loss = 455.2299017906189 | accuracy = 0.6486111111111111


Epoch[2] Batch[185] Speed: 1.2544047717759643 samples/sec                   batch loss = 468.00932335853577 | accuracy = 0.6472972972972973


Epoch[2] Batch[190] Speed: 1.2497828346843234 samples/sec                   batch loss = 480.1530296802521 | accuracy = 0.6460526315789473


Epoch[2] Batch[195] Speed: 1.2447554491756876 samples/sec                   batch loss = 495.66745233535767 | accuracy = 0.6397435897435897


Epoch[2] Batch[200] Speed: 1.2433564990757022 samples/sec                   batch loss = 505.0887258052826 | accuracy = 0.645


Epoch[2] Batch[205] Speed: 1.2387869333104535 samples/sec                   batch loss = 516.7433032989502 | accuracy = 0.6463414634146342


Epoch[2] Batch[210] Speed: 1.2488770709732235 samples/sec                   batch loss = 530.8408415317535 | accuracy = 0.6416666666666667


Epoch[2] Batch[215] Speed: 1.2431144814637398 samples/sec                   batch loss = 542.8309046030045 | accuracy = 0.6406976744186047


Epoch[2] Batch[220] Speed: 1.250135969257945 samples/sec                   batch loss = 553.2503457069397 | accuracy = 0.6454545454545455


Epoch[2] Batch[225] Speed: 1.2433013067130811 samples/sec                   batch loss = 565.8809393644333 | accuracy = 0.6455555555555555


Epoch[2] Batch[230] Speed: 1.2477407719353046 samples/sec                   batch loss = 576.6338475942612 | accuracy = 0.6467391304347826


Epoch[2] Batch[235] Speed: 1.24536813935473 samples/sec                   batch loss = 588.2073795795441 | accuracy = 0.6478723404255319


Epoch[2] Batch[240] Speed: 1.2447638533006407 samples/sec                   batch loss = 599.0987256765366 | accuracy = 0.6520833333333333


Epoch[2] Batch[245] Speed: 1.2533234644501323 samples/sec                   batch loss = 611.448267698288 | accuracy = 0.6530612244897959


Epoch[2] Batch[250] Speed: 1.2489792478706134 samples/sec                   batch loss = 623.2237592935562 | accuracy = 0.653


Epoch[2] Batch[255] Speed: 1.256840994019051 samples/sec                   batch loss = 635.8658660650253 | accuracy = 0.6529411764705882


Epoch[2] Batch[260] Speed: 1.2586983889288186 samples/sec                   batch loss = 646.7664428949356 | accuracy = 0.6576923076923077


Epoch[2] Batch[265] Speed: 1.2424735488066905 samples/sec                   batch loss = 658.2242060899734 | accuracy = 0.659433962264151


Epoch[2] Batch[270] Speed: 1.2460069730040249 samples/sec                   batch loss = 674.7615438699722 | accuracy = 0.6574074074074074


Epoch[2] Batch[275] Speed: 1.2488965009611697 samples/sec                   batch loss = 686.7233054637909 | accuracy = 0.6563636363636364


Epoch[2] Batch[280] Speed: 1.2494414375339828 samples/sec                   batch loss = 699.0566575527191 | accuracy = 0.6553571428571429


Epoch[2] Batch[285] Speed: 1.2496645159137443 samples/sec                   batch loss = 710.9207923412323 | accuracy = 0.6570175438596492


Epoch[2] Batch[290] Speed: 1.2562211987087677 samples/sec                   batch loss = 723.2617144584656 | accuracy = 0.6560344827586206


Epoch[2] Batch[295] Speed: 1.2510966436410775 samples/sec                   batch loss = 736.3878639936447 | accuracy = 0.6550847457627119


Epoch[2] Batch[300] Speed: 1.2549715179663874 samples/sec                   batch loss = 748.2441694736481 | accuracy = 0.655


Epoch[2] Batch[305] Speed: 1.2497588153717238 samples/sec                   batch loss = 758.969463467598 | accuracy = 0.6573770491803279


Epoch[2] Batch[310] Speed: 1.2587820620651022 samples/sec                   batch loss = 769.6285628080368 | accuracy = 0.6588709677419354


Epoch[2] Batch[315] Speed: 1.246573471694743 samples/sec                   batch loss = 780.9889343976974 | accuracy = 0.6595238095238095


Epoch[2] Batch[320] Speed: 1.2505801851076355 samples/sec                   batch loss = 793.4140083789825 | accuracy = 0.659375


Epoch[2] Batch[325] Speed: 1.2574284154927098 samples/sec                   batch loss = 808.3701117038727 | accuracy = 0.6584615384615384


Epoch[2] Batch[330] Speed: 1.2488714001337808 samples/sec                   batch loss = 819.3797235488892 | accuracy = 0.6583333333333333


Epoch[2] Batch[335] Speed: 1.2533833893315638 samples/sec                   batch loss = 828.6089705228806 | accuracy = 0.6604477611940298


Epoch[2] Batch[340] Speed: 1.2477493092174816 samples/sec                   batch loss = 840.118048787117 | accuracy = 0.6602941176470588


Epoch[2] Batch[345] Speed: 1.2544769002609117 samples/sec                   batch loss = 850.0056356191635 | accuracy = 0.6623188405797101


Epoch[2] Batch[350] Speed: 1.2521689702537384 samples/sec                   batch loss = 862.5063971281052 | accuracy = 0.6614285714285715


Epoch[2] Batch[355] Speed: 1.252544775080884 samples/sec                   batch loss = 872.5504256486893 | accuracy = 0.6626760563380282


Epoch[2] Batch[360] Speed: 1.2572528662890536 samples/sec                   batch loss = 885.2601108551025 | accuracy = 0.6625


Epoch[2] Batch[365] Speed: 1.2486540877507397 samples/sec                   batch loss = 898.1704289913177 | accuracy = 0.6616438356164384


Epoch[2] Batch[370] Speed: 1.2537062395686176 samples/sec                   batch loss = 912.3003213405609 | accuracy = 0.6601351351351351


Epoch[2] Batch[375] Speed: 1.251744357952645 samples/sec                   batch loss = 922.8782349824905 | accuracy = 0.662


Epoch[2] Batch[380] Speed: 1.2492902511341164 samples/sec                   batch loss = 933.2918826341629 | accuracy = 0.6631578947368421


Epoch[2] Batch[385] Speed: 1.2521353270914204 samples/sec                   batch loss = 947.6190614700317 | accuracy = 0.6616883116883117


Epoch[2] Batch[390] Speed: 1.2550815485787807 samples/sec                   batch loss = 959.372855424881 | accuracy = 0.6608974358974359


Epoch[2] Batch[395] Speed: 1.2559306156114276 samples/sec                   batch loss = 972.5079653263092 | accuracy = 0.660759493670886


Epoch[2] Batch[400] Speed: 1.256415371849828 samples/sec                   batch loss = 981.9521021842957 | accuracy = 0.6625


Epoch[2] Batch[405] Speed: 1.2508065611590087 samples/sec                   batch loss = 995.8931537866592 | accuracy = 0.6598765432098765


Epoch[2] Batch[410] Speed: 1.2386343820160015 samples/sec                   batch loss = 1006.4152084589005 | accuracy = 0.6603658536585366


Epoch[2] Batch[415] Speed: 1.2421695166033362 samples/sec                   batch loss = 1017.7067978382111 | accuracy = 0.6608433734939759


Epoch[2] Batch[420] Speed: 1.2451965881592584 samples/sec                   batch loss = 1030.7703738212585 | accuracy = 0.6607142857142857


Epoch[2] Batch[425] Speed: 1.2457522674151726 samples/sec                   batch loss = 1041.7734699249268 | accuracy = 0.6617647058823529


Epoch[2] Batch[430] Speed: 1.2445973619504698 samples/sec                   batch loss = 1054.1695448160172 | accuracy = 0.6616279069767442


Epoch[2] Batch[435] Speed: 1.2488068863403112 samples/sec                   batch loss = 1067.4262603521347 | accuracy = 0.6614942528735632


Epoch[2] Batch[440] Speed: 1.248165922019835 samples/sec                   batch loss = 1079.8446938991547 | accuracy = 0.6613636363636364


Epoch[2] Batch[445] Speed: 1.2543599418651987 samples/sec                   batch loss = 1089.782860994339 | accuracy = 0.6629213483146067


Epoch[2] Batch[450] Speed: 1.2597807647762715 samples/sec                   batch loss = 1101.9938652515411 | accuracy = 0.6622222222222223


Epoch[2] Batch[455] Speed: 1.2584076970925584 samples/sec                   batch loss = 1113.9910733699799 | accuracy = 0.6620879120879121


Epoch[2] Batch[460] Speed: 1.255477142146664 samples/sec                   batch loss = 1123.644107580185 | accuracy = 0.6635869565217392


Epoch[2] Batch[465] Speed: 1.2515234306085554 samples/sec                   batch loss = 1136.9708304405212 | accuracy = 0.6634408602150538


Epoch[2] Batch[470] Speed: 1.2486771352647426 samples/sec                   batch loss = 1147.3818222284317 | accuracy = 0.6643617021276595


Epoch[2] Batch[475] Speed: 1.250483804367432 samples/sec                   batch loss = 1157.2488071918488 | accuracy = 0.6663157894736842


Epoch[2] Batch[480] Speed: 1.2536416000107602 samples/sec                   batch loss = 1169.6618077754974 | accuracy = 0.6661458333333333


Epoch[2] Batch[485] Speed: 1.2508262377424968 samples/sec                   batch loss = 1181.5876613855362 | accuracy = 0.6654639175257732


Epoch[2] Batch[490] Speed: 1.2476349007854117 samples/sec                   batch loss = 1191.8693784475327 | accuracy = 0.6653061224489796


Epoch[2] Batch[495] Speed: 1.254048659331831 samples/sec                   batch loss = 1202.4165465831757 | accuracy = 0.6666666666666666


Epoch[2] Batch[500] Speed: 1.2523315108811555 samples/sec                   batch loss = 1212.1880904436111 | accuracy = 0.668


Epoch[2] Batch[505] Speed: 1.2574713915262945 samples/sec                   batch loss = 1222.7440674304962 | accuracy = 0.6688118811881189


Epoch[2] Batch[510] Speed: 1.2521828953137975 samples/sec                   batch loss = 1231.4516279101372 | accuracy = 0.6700980392156862


Epoch[2] Batch[515] Speed: 1.2498574121542654 samples/sec                   batch loss = 1244.4445378184319 | accuracy = 0.6689320388349514


Epoch[2] Batch[520] Speed: 1.2509452429760408 samples/sec                   batch loss = 1257.0119296908379 | accuracy = 0.6677884615384615


Epoch[2] Batch[525] Speed: 1.2496693562106955 samples/sec                   batch loss = 1268.066387116909 | accuracy = 0.6680952380952381


Epoch[2] Batch[530] Speed: 1.2499070422940242 samples/sec                   batch loss = 1278.0669167637825 | accuracy = 0.6693396226415095


Epoch[2] Batch[535] Speed: 1.2485150772421783 samples/sec                   batch loss = 1288.5400967001915 | accuracy = 0.6696261682242991


Epoch[2] Batch[540] Speed: 1.2546503614822737 samples/sec                   batch loss = 1300.5767355561256 | accuracy = 0.6694444444444444


Epoch[2] Batch[545] Speed: 1.2533825465981938 samples/sec                   batch loss = 1308.4996653795242 | accuracy = 0.6711009174311927


Epoch[2] Batch[550] Speed: 1.2531255657926894 samples/sec                   batch loss = 1317.0933929085732 | accuracy = 0.6727272727272727


Epoch[2] Batch[555] Speed: 1.257366783940447 samples/sec                   batch loss = 1328.0554646849632 | accuracy = 0.672972972972973


Epoch[2] Batch[560] Speed: 1.2615386211801765 samples/sec                   batch loss = 1342.7481066584587 | accuracy = 0.671875


Epoch[2] Batch[565] Speed: 1.2587008441897416 samples/sec                   batch loss = 1353.9123172163963 | accuracy = 0.672566371681416


Epoch[2] Batch[570] Speed: 1.256047867108148 samples/sec                   batch loss = 1365.497175514698 | accuracy = 0.6719298245614035


Epoch[2] Batch[575] Speed: 1.2584344098358393 samples/sec                   batch loss = 1380.0327425599098 | accuracy = 0.671304347826087


Epoch[2] Batch[580] Speed: 1.2623019435227032 samples/sec                   batch loss = 1392.3428043723106 | accuracy = 0.6706896551724137


Epoch[2] Batch[585] Speed: 1.258649474572937 samples/sec                   batch loss = 1403.3380087018013 | accuracy = 0.6713675213675213


Epoch[2] Batch[590] Speed: 1.2603194320620175 samples/sec                   batch loss = 1414.5722224116325 | accuracy = 0.6716101694915254


Epoch[2] Batch[595] Speed: 1.2540716252026805 samples/sec                   batch loss = 1424.7483823895454 | accuracy = 0.6718487394957983


Epoch[2] Batch[600] Speed: 1.2543124894257294 samples/sec                   batch loss = 1436.4810400605202 | accuracy = 0.67125


Epoch[2] Batch[605] Speed: 1.2527577377533998 samples/sec                   batch loss = 1450.1060770154 | accuracy = 0.6710743801652893


Epoch[2] Batch[610] Speed: 1.252076829675115 samples/sec                   batch loss = 1458.6873771548271 | accuracy = 0.6729508196721311


Epoch[2] Batch[615] Speed: 1.2537967461489983 samples/sec                   batch loss = 1473.177779853344 | accuracy = 0.6727642276422764


Epoch[2] Batch[620] Speed: 1.2546171477790153 samples/sec                   batch loss = 1483.1213032603264 | accuracy = 0.6733870967741935


Epoch[2] Batch[625] Speed: 1.2449609671879798 samples/sec                   batch loss = 1495.4508309960365 | accuracy = 0.6728


Epoch[2] Batch[630] Speed: 1.2448539969857366 samples/sec                   batch loss = 1508.0656529068947 | accuracy = 0.6722222222222223


Epoch[2] Batch[635] Speed: 1.2491428214896854 samples/sec                   batch loss = 1515.393029808998 | accuracy = 0.6736220472440945


Epoch[2] Batch[640] Speed: 1.2460549096360016 samples/sec                   batch loss = 1524.2118604183197 | accuracy = 0.674609375


Epoch[2] Batch[645] Speed: 1.2408128006863322 samples/sec                   batch loss = 1538.3997789621353 | accuracy = 0.6736434108527132


Epoch[2] Batch[650] Speed: 1.2430774547240595 samples/sec                   batch loss = 1548.4902255535126 | accuracy = 0.6742307692307692


Epoch[2] Batch[655] Speed: 1.2369900762629917 samples/sec                   batch loss = 1559.852504491806 | accuracy = 0.6740458015267176


Epoch[2] Batch[660] Speed: 1.2421388916550027 samples/sec                   batch loss = 1571.2590301036835 | accuracy = 0.675


Epoch[2] Batch[665] Speed: 1.241513020585896 samples/sec                   batch loss = 1582.328066945076 | accuracy = 0.6759398496240602


Epoch[2] Batch[670] Speed: 1.244506516903507 samples/sec                   batch loss = 1593.9831010103226 | accuracy = 0.6764925373134328


Epoch[2] Batch[675] Speed: 1.244962537699202 samples/sec                   batch loss = 1605.9563179016113 | accuracy = 0.6759259259259259


Epoch[2] Batch[680] Speed: 1.2466060756198734 samples/sec                   batch loss = 1615.0009993314743 | accuracy = 0.6768382352941177


Epoch[2] Batch[685] Speed: 1.2467685649229818 samples/sec                   batch loss = 1628.6563659906387 | accuracy = 0.6759124087591241


Epoch[2] Batch[690] Speed: 1.2435978734950284 samples/sec                   batch loss = 1640.7256129980087 | accuracy = 0.6757246376811594


Epoch[2] Batch[695] Speed: 1.2438726327003433 samples/sec                   batch loss = 1651.355083823204 | accuracy = 0.6751798561151079


Epoch[2] Batch[700] Speed: 1.24327707522037 samples/sec                   batch loss = 1665.2966414690018 | accuracy = 0.6742857142857143


Epoch[2] Batch[705] Speed: 1.2453192387506113 samples/sec                   batch loss = 1676.8285827636719 | accuracy = 0.674468085106383


Epoch[2] Batch[710] Speed: 1.2372137482123609 samples/sec                   batch loss = 1690.2420305013657 | accuracy = 0.675


Epoch[2] Batch[715] Speed: 1.235950498672385 samples/sec                   batch loss = 1702.2055337429047 | accuracy = 0.6741258741258741


Epoch[2] Batch[720] Speed: 1.2389859100676335 samples/sec                   batch loss = 1713.2870469093323 | accuracy = 0.6746527777777778


Epoch[2] Batch[725] Speed: 1.2448131721010516 samples/sec                   batch loss = 1726.3230041265488 | accuracy = 0.6744827586206896


Epoch[2] Batch[730] Speed: 1.2480099381284555 samples/sec                   batch loss = 1742.5203465223312 | accuracy = 0.6732876712328767


Epoch[2] Batch[735] Speed: 1.2446672588843408 samples/sec                   batch loss = 1755.8837932348251 | accuracy = 0.673469387755102


Epoch[2] Batch[740] Speed: 1.2470729989537863 samples/sec                   batch loss = 1768.076938033104 | accuracy = 0.6726351351351352


Epoch[2] Batch[745] Speed: 1.242090796060601 samples/sec                   batch loss = 1779.09831905365 | accuracy = 0.6724832214765101


Epoch[2] Batch[750] Speed: 1.240707276129061 samples/sec                   batch loss = 1790.4609289169312 | accuracy = 0.673


Epoch[2] Batch[755] Speed: 1.238558943373773 samples/sec                   batch loss = 1801.3443932533264 | accuracy = 0.6738410596026491


Epoch[2] Batch[760] Speed: 1.2436377890232755 samples/sec                   batch loss = 1811.3822131156921 | accuracy = 0.675


Epoch[2] Batch[765] Speed: 1.24518928719045 samples/sec                   batch loss = 1823.1918141841888 | accuracy = 0.6758169934640523


Epoch[2] Batch[770] Speed: 1.2415246884258635 samples/sec                   batch loss = 1832.531551361084 | accuracy = 0.676948051948052


Epoch[2] Batch[775] Speed: 1.2420884051722096 samples/sec                   batch loss = 1840.8701560497284 | accuracy = 0.677741935483871


Epoch[2] Batch[780] Speed: 1.2362051282731439 samples/sec                   batch loss = 1848.542314171791 | accuracy = 0.6791666666666667


Epoch[2] Batch[785] Speed: 1.2375535157367985 samples/sec                   batch loss = 1861.2848227024078 | accuracy = 0.6789808917197452


[Epoch 2] training: accuracy=0.679251269035533
[Epoch 2] time cost: 647.8564803600311
[Epoch 2] validation: validation accuracy=0.7233333333333334
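
The per-epoch summary lines are easy to lose among the per-batch output. As a minimal sketch, assuming the log text has been captured as a string in the format shown above, the snippet below pulls out only the `[Epoch N]` summary lines so that epochs can be compared side by side. The helper `summarize_epochs` and the `PATTERNS` table are hypothetical names introduced for illustration; they are not part of MXNet.

```python
import re

# Hypothetical patterns (not an MXNet API): one per summary line in the log above.
PATTERNS = {
    "train_acc": re.compile(r"\[Epoch (\d+)\] training: accuracy=([0-9.]+)"),
    "seconds": re.compile(r"\[Epoch (\d+)\] time cost: ([0-9.]+)"),
    "val_acc": re.compile(r"\[Epoch (\d+)\] validation: validation accuracy=([0-9.]+)"),
}


def summarize_epochs(log_text):
    """Return {epoch: {"train_acc": ..., "seconds": ..., "val_acc": ...}}."""
    summary = {}
    for line in log_text.splitlines():
        for key, pattern in PATTERNS.items():
            # Per-batch lines start with "Epoch[", so they never match here.
            match = pattern.match(line.strip())
            if match:
                epoch = int(match.group(1))
                summary.setdefault(epoch, {})[key] = float(match.group(2))
    return summary


# Summary lines copied from the training output above.
log_text = """
[Epoch 1] training: accuracy=0.5885152284263959
[Epoch 1] time cost: 648.2725083827972
[Epoch 1] validation: validation accuracy=0.71
[Epoch 2] training: accuracy=0.679251269035533
[Epoch 2] time cost: 647.8564803600311
[Epoch 2] validation: validation accuracy=0.7233333333333334
"""

summary = summarize_epochs(log_text)
for epoch, stats in sorted(summary.items()):
    print(f"epoch {epoch}: train={stats['train_acc']:.3f} "
          f"val={stats['val_acc']:.3f} time={stats['seconds']:.0f}s")
```

For the run shown here, the summary makes it easy to see that training accuracy rises from about 0.59 to about 0.68 between the two epochs, validation accuracy rises from 0.71 to about 0.72, and each epoch takes roughly 648 seconds.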


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen on studying more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).