<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and running inference on GPUs is often faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. Installation instructions for the GPU version of MXNet on your system are available [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[21:39:22] /work/mxnet/src/storage/storage.cc:205: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If you're using GPUs, the `ctx` field in the output shows which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If fewer than two GPUs are available, the code falls back to `npx.cpu()` instead of raising an error, which is why the output below shows no GPU context:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[21:39:22] /work/mxnet/src/storage/storage.cc:205: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
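
The zero-based indexing described above can be sketched in plain Python, so it runs without a GPU or MXNet installed. The helper `select_device_ids` is a hypothetical name introduced here for illustration and is not part of MXNet. With MXNet, each returned index `i` would be passed to `npx.gpu(i)`, and an empty list would mean falling back to `npx.cpu()`.

```python
def select_device_ids(num_available, num_requested):
    """Return the zero-based GPU indices to use.

    Hypothetical helper for illustration only; not an MXNet API.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # Never ask for more GPUs than the machine reports.
    count = min(num_available, num_requested)
    return list(range(count))


# On a machine with eight GPUs, the last valid index is 7.
print(select_device_ids(num_available=8, num_requested=8))  # [0, 1, 2, 3, 4, 5, 6, 7]
# Requesting two GPUs when only one exists yields just the first GPU.
print(select_device_ids(num_available=1, num_requested=2))  # [0]
# With no GPUs, the empty list signals that a CPU fallback is needed.
print(select_device_ids(num_available=0, num_requested=2))  # []
```

Clamping the request to the available count avoids referring to a GPU index that does not exist on the machine.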

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.
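
The same-device rule can be illustrated with a toy model in plain Python. `DeviceArray` below is a hypothetical stand-in written for this sketch, not an MXNet class: it only tracks a device label, and it raises `ValueError`, whereas the exception MXNet raises is of a different type. The point is the pattern: copy one operand explicitly so that both share a device, then run the operation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceArray:
    """Toy array that remembers which device holds it (illustration only)."""
    values: tuple
    device: str = "cpu(0)"

    def copyto(self, device):
        # Explicit transfer: same values, new device label.
        return DeviceArray(self.values, device)

    def __add__(self, other):
        # Mirror the rule above: both operands must share a device.
        if self.device != other.device:
            raise ValueError(f"device mismatch: {self.device} vs {other.device}")
        # The result is allocated on the operands' device.
        summed = tuple(a + b for a, b in zip(self.values, other.values))
        return DeviceArray(summed, self.device)


x = DeviceArray((1.0, 1.0), "gpu(0)")
y = DeviceArray((0.5, 0.25), "gpu(1)")

try:
    x + y
except ValueError as err:
    print(err)  # device mismatch: gpu(0) vs gpu(1)

# Copy y to x's device explicitly, then add.
z = x + y.copyto("gpu(0)")
print(z.values, z.device)  # (1.5, 1.25) gpu(0)
```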

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters directly onto GPU 0 as shown below. Alternatively, if the parameters are already initialized on another device, you can use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[21:39:22] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 4.8225746, -3.266057 ]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With *n* GPUs, data parallelism splits each data batch into *n* parts and assigns each part to one GPU, which runs the forward and backward passes on its own chunk of the data.
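
To make the idea concrete, here is a minimal sketch in plain Python that needs neither MXNet nor a GPU. The helpers `split_batch` and `grad_sum` are hypothetical names introduced for illustration, and the one-parameter model `w * x` with a squared-error loss is an assumption chosen to keep the arithmetic small. The sketch shows that summing gradients computed on separate chunks gives the same result as computing the gradient on the whole batch, which is what allows each device to work on its own chunk.

```python
import math


def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous chunks of near-equal size."""
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    base, extra = divmod(len(batch), num_parts)
    chunks, start = [], 0
    for i in range(num_parts):
        # The first `extra` chunks take one additional sample.
        size = base + (1 if i < extra else 0)
        chunks.append(batch[start:start + size])
        start += size
    return chunks


def grad_sum(w, samples):
    """Sum over samples of d/dw (w*x - y)^2, which is 2*(w*x - y)*x."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


# Four (x, y) samples and a single weight (assumed toy values).
batch = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.0)]
w = 1.5

# Each "device" receives one chunk and computes its own gradient.
shards = split_batch(batch, num_parts=2)
per_device = [grad_sum(w, shard) for shard in shards]
combined = sum(per_device)

print(per_device)                                   # [-7.0, -22.0]
print(combined)                                     # -29.0
print(math.isclose(combined, grad_sum(w, batch)))   # True
```

In the training loop of this tutorial, `gluon.utils.split_and_load` takes the role of `split_batch` and also places each chunk on its device, while the per-device `backward()` calls take the role of `grad_sum`.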

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations on the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7207339165144232 samples/sec                   batch loss = 12.325985193252563 | accuracy = 0.65


Epoch[1] Batch[10] Speed: 1.1675652774161878 samples/sec                   batch loss = 24.63808584213257 | accuracy = 0.7


Epoch[1] Batch[15] Speed: 1.1867058122938083 samples/sec                   batch loss = 40.46247911453247 | accuracy = 0.6


Epoch[1] Batch[20] Speed: 1.1722651382674414 samples/sec                   batch loss = 54.093900203704834 | accuracy = 0.6


Epoch[1] Batch[25] Speed: 1.162950349792135 samples/sec                   batch loss = 68.19164395332336 | accuracy = 0.57


Epoch[1] Batch[30] Speed: 1.177982452451077 samples/sec                   batch loss = 79.83681321144104 | accuracy = 0.625


Epoch[1] Batch[35] Speed: 1.1861315246468551 samples/sec                   batch loss = 94.77859616279602 | accuracy = 0.6071428571428571


Epoch[1] Batch[40] Speed: 1.159991918810876 samples/sec                   batch loss = 108.14993214607239 | accuracy = 0.60625


Epoch[1] Batch[45] Speed: 1.1803790419388334 samples/sec                   batch loss = 122.20559477806091 | accuracy = 0.5944444444444444


Epoch[1] Batch[50] Speed: 1.1751275414073556 samples/sec                   batch loss = 135.45279240608215 | accuracy = 0.595


Epoch[1] Batch[55] Speed: 1.1825814765027252 samples/sec                   batch loss = 149.44382619857788 | accuracy = 0.5909090909090909


Epoch[1] Batch[60] Speed: 1.1789071305501448 samples/sec                   batch loss = 163.6742992401123 | accuracy = 0.5791666666666667


Epoch[1] Batch[65] Speed: 1.1805195739122958 samples/sec                   batch loss = 177.94338274002075 | accuracy = 0.573076923076923


Epoch[1] Batch[70] Speed: 1.167781370643202 samples/sec                   batch loss = 191.71987128257751 | accuracy = 0.5678571428571428


Epoch[1] Batch[75] Speed: 1.1726503191564015 samples/sec                   batch loss = 205.66123914718628 | accuracy = 0.57


Epoch[1] Batch[80] Speed: 1.1755683919465667 samples/sec                   batch loss = 220.53425908088684 | accuracy = 0.559375


Epoch[1] Batch[85] Speed: 1.188818798981811 samples/sec                   batch loss = 235.07670950889587 | accuracy = 0.55


Epoch[1] Batch[90] Speed: 1.1636069867953773 samples/sec                   batch loss = 247.66224718093872 | accuracy = 0.5527777777777778


Epoch[1] Batch[95] Speed: 1.1628421778643916 samples/sec                   batch loss = 261.0108165740967 | accuracy = 0.5552631578947368


Epoch[1] Batch[100] Speed: 1.1835785163053354 samples/sec                   batch loss = 273.629408121109 | accuracy = 0.5625


Epoch[1] Batch[105] Speed: 1.185787470210804 samples/sec                   batch loss = 286.9314408302307 | accuracy = 0.5666666666666667


Epoch[1] Batch[110] Speed: 1.1913375529871035 samples/sec                   batch loss = 301.56324911117554 | accuracy = 0.5613636363636364


Epoch[1] Batch[115] Speed: 1.1713591060669923 samples/sec                   batch loss = 316.063179731369 | accuracy = 0.5543478260869565


Epoch[1] Batch[120] Speed: 1.176394744667709 samples/sec                   batch loss = 330.73852133750916 | accuracy = 0.55


Epoch[1] Batch[125] Speed: 1.1806309767486178 samples/sec                   batch loss = 344.29423117637634 | accuracy = 0.554


Epoch[1] Batch[130] Speed: 1.188989575267521 samples/sec                   batch loss = 358.2592248916626 | accuracy = 0.5557692307692308


Epoch[1] Batch[135] Speed: 1.183736431427392 samples/sec                   batch loss = 372.38331866264343 | accuracy = 0.5518518518518518


Epoch[1] Batch[140] Speed: 1.1662989342026646 samples/sec                   batch loss = 386.341299533844 | accuracy = 0.5517857142857143


Epoch[1] Batch[145] Speed: 1.1767107552868015 samples/sec                   batch loss = 398.9925057888031 | accuracy = 0.5655172413793104


Epoch[1] Batch[150] Speed: 1.181097501909218 samples/sec                   batch loss = 412.8539454936981 | accuracy = 0.5666666666666667


Epoch[1] Batch[155] Speed: 1.1829677957963125 samples/sec                   batch loss = 426.3788299560547 | accuracy = 0.5709677419354838


Epoch[1] Batch[160] Speed: 1.1835676617104247 samples/sec                   batch loss = 440.13805079460144 | accuracy = 0.5671875


Epoch[1] Batch[165] Speed: 1.1677610500846594 samples/sec                   batch loss = 453.61172890663147 | accuracy = 0.5651515151515152


Epoch[1] Batch[170] Speed: 1.1752817273820024 samples/sec                   batch loss = 467.917032957077 | accuracy = 0.5632352941176471


Epoch[1] Batch[175] Speed: 1.185134952617817 samples/sec                   batch loss = 481.776159286499 | accuracy = 0.5614285714285714


Epoch[1] Batch[180] Speed: 1.186091693315151 samples/sec                   batch loss = 496.07519698143005 | accuracy = 0.5555555555555556


Epoch[1] Batch[185] Speed: 1.1844283807045637 samples/sec                   batch loss = 509.4571306705475 | accuracy = 0.5554054054054054


Epoch[1] Batch[190] Speed: 1.1620914134484739 samples/sec                   batch loss = 523.5146813392639 | accuracy = 0.5513157894736842


Epoch[1] Batch[195] Speed: 1.1767492162061748 samples/sec                   batch loss = 537.0832688808441 | accuracy = 0.5512820512820513


Epoch[1] Batch[200] Speed: 1.1825032094290042 samples/sec                   batch loss = 550.6629185676575 | accuracy = 0.55125


Epoch[1] Batch[205] Speed: 1.1861672493378137 samples/sec                   batch loss = 564.3111696243286 | accuracy = 0.55


Epoch[1] Batch[210] Speed: 1.1837327565617934 samples/sec                   batch loss = 577.7302865982056 | accuracy = 0.5511904761904762


Epoch[1] Batch[215] Speed: 1.169381270958957 samples/sec                   batch loss = 591.7026422023773 | accuracy = 0.55


Epoch[1] Batch[220] Speed: 1.161236477707862 samples/sec                   batch loss = 605.3829348087311 | accuracy = 0.5511363636363636


Epoch[1] Batch[225] Speed: 1.1723726127724148 samples/sec                   batch loss = 618.5279018878937 | accuracy = 0.5533333333333333


Epoch[1] Batch[230] Speed: 1.1839387512764914 samples/sec                   batch loss = 631.7606430053711 | accuracy = 0.5554347826086956


Epoch[1] Batch[235] Speed: 1.1868742187738206 samples/sec                   batch loss = 644.9916784763336 | accuracy = 0.5595744680851064


Epoch[1] Batch[240] Speed: 1.1581292349832395 samples/sec                   batch loss = 658.0707383155823 | accuracy = 0.5614583333333333


Epoch[1] Batch[245] Speed: 1.1589754366273386 samples/sec                   batch loss = 672.3865201473236 | accuracy = 0.5591836734693878


Epoch[1] Batch[250] Speed: 1.1698621125679172 samples/sec                   batch loss = 685.7505469322205 | accuracy = 0.56


Epoch[1] Batch[255] Speed: 1.1829099944489894 samples/sec                   batch loss = 699.0713455677032 | accuracy = 0.5617647058823529


Epoch[1] Batch[260] Speed: 1.1818133405615772 samples/sec                   batch loss = 713.3907015323639 | accuracy = 0.5605769230769231


Epoch[1] Batch[265] Speed: 1.1545236796109697 samples/sec                   batch loss = 726.5061573982239 | accuracy = 0.5622641509433962


Epoch[1] Batch[270] Speed: 1.160410568543367 samples/sec                   batch loss = 739.824271440506 | accuracy = 0.562962962962963


Epoch[1] Batch[275] Speed: 1.1711156085007506 samples/sec                   batch loss = 752.5452573299408 | accuracy = 0.5636363636363636


Epoch[1] Batch[280] Speed: 1.1779109126912246 samples/sec                   batch loss = 765.7170102596283 | accuracy = 0.5651785714285714


Epoch[1] Batch[285] Speed: 1.1760216087420803 samples/sec                   batch loss = 778.4823362827301 | accuracy = 0.5684210526315789


Epoch[1] Batch[290] Speed: 1.1551591406255433 samples/sec                   batch loss = 793.123780965805 | accuracy = 0.5637931034482758


Epoch[1] Batch[295] Speed: 1.1595725291632848 samples/sec                   batch loss = 806.8370020389557 | accuracy = 0.5635593220338984


Epoch[1] Batch[300] Speed: 1.1741818258421797 samples/sec                   batch loss = 820.6657643318176 | accuracy = 0.5616666666666666


Epoch[1] Batch[305] Speed: 1.1787509981174091 samples/sec                   batch loss = 834.0726239681244 | accuracy = 0.5606557377049181


Epoch[1] Batch[310] Speed: 1.1642151637888154 samples/sec                   batch loss = 847.8575911521912 | accuracy = 0.5596774193548387


Epoch[1] Batch[315] Speed: 1.1587979654817062 samples/sec                   batch loss = 861.7489721775055 | accuracy = 0.5603174603174603


Epoch[1] Batch[320] Speed: 1.1576784378214893 samples/sec                   batch loss = 875.9104044437408 | accuracy = 0.55859375


Epoch[1] Batch[325] Speed: 1.1606337364633676 samples/sec                   batch loss = 889.7084650993347 | accuracy = 0.556923076923077


Epoch[1] Batch[330] Speed: 1.1757081102702882 samples/sec                   batch loss = 903.392653465271 | accuracy = 0.5575757575757576


Epoch[1] Batch[335] Speed: 1.1640399610876935 samples/sec                   batch loss = 917.5416746139526 | accuracy = 0.5552238805970149


Epoch[1] Batch[340] Speed: 1.146121231928175 samples/sec                   batch loss = 930.9773285388947 | accuracy = 0.5536764705882353


Epoch[1] Batch[345] Speed: 1.1605419703420563 samples/sec                   batch loss = 944.7302281856537 | accuracy = 0.5528985507246377


Epoch[1] Batch[350] Speed: 1.1727767195249397 samples/sec                   batch loss = 957.6570312976837 | accuracy = 0.5535714285714286


Epoch[1] Batch[355] Speed: 1.1883602963998892 samples/sec                   batch loss = 971.0846583843231 | accuracy = 0.5556338028169014


Epoch[1] Batch[360] Speed: 1.1566395300754808 samples/sec                   batch loss = 984.3573441505432 | accuracy = 0.5569444444444445


Epoch[1] Batch[365] Speed: 1.149523646192455 samples/sec                   batch loss = 997.7095739841461 | accuracy = 0.5582191780821918


Epoch[1] Batch[370] Speed: 1.162829443596359 samples/sec                   batch loss = 1012.1809575557709 | accuracy = 0.5560810810810811


Epoch[1] Batch[375] Speed: 1.1731884883419348 samples/sec                   batch loss = 1025.3778817653656 | accuracy = 0.5573333333333333


Epoch[1] Batch[380] Speed: 1.1788170905128332 samples/sec                   batch loss = 1037.942565202713 | accuracy = 0.5585526315789474


Epoch[1] Batch[385] Speed: 1.144332764436912 samples/sec                   batch loss = 1052.0525135993958 | accuracy = 0.5558441558441558


Epoch[1] Batch[390] Speed: 1.1527692662766527 samples/sec                   batch loss = 1065.463523864746 | accuracy = 0.5570512820512821


Epoch[1] Batch[395] Speed: 1.1623759470467219 samples/sec                   batch loss = 1079.3595776557922 | accuracy = 0.5569620253164557


Epoch[1] Batch[400] Speed: 1.17362484725655 samples/sec                   batch loss = 1093.2402656078339 | accuracy = 0.556875


Epoch[1] Batch[405] Speed: 1.1773439470935376 samples/sec                   batch loss = 1106.3308954238892 | accuracy = 0.5580246913580247


Epoch[1] Batch[410] Speed: 1.1419038750792165 samples/sec                   batch loss = 1119.3490035533905 | accuracy = 0.5585365853658537


Epoch[1] Batch[415] Speed: 1.145569430534983 samples/sec                   batch loss = 1132.2640120983124 | accuracy = 0.5590361445783133


Epoch[1] Batch[420] Speed: 1.1598809286283853 samples/sec                   batch loss = 1145.104006767273 | accuracy = 0.5589285714285714


Epoch[1] Batch[425] Speed: 1.1698577076118566 samples/sec                   batch loss = 1158.6139879226685 | accuracy = 0.5594117647058824


Epoch[1] Batch[430] Speed: 1.1753295637150445 samples/sec                   batch loss = 1172.7029857635498 | accuracy = 0.5581395348837209


Epoch[1] Batch[435] Speed: 1.14003773621061 samples/sec                   batch loss = 1185.9970190525055 | accuracy = 0.5591954022988506


Epoch[1] Batch[440] Speed: 1.1477812436610992 samples/sec                   batch loss = 1198.2421901226044 | accuracy = 0.5625


Epoch[1] Batch[445] Speed: 1.1589502175222839 samples/sec                   batch loss = 1211.782234430313 | accuracy = 0.5617977528089888


Epoch[1] Batch[450] Speed: 1.1664358900270286 samples/sec                   batch loss = 1225.0772614479065 | accuracy = 0.5611111111111111


Epoch[1] Batch[455] Speed: 1.162937048879881 samples/sec                   batch loss = 1239.011472940445 | accuracy = 0.560989010989011


Epoch[1] Batch[460] Speed: 1.139493400946344 samples/sec                   batch loss = 1251.299128293991 | accuracy = 0.5630434782608695


Epoch[1] Batch[465] Speed: 1.147221096616285 samples/sec                   batch loss = 1264.3883113861084 | accuracy = 0.5629032258064516


Epoch[1] Batch[470] Speed: 1.162564263468858 samples/sec                   batch loss = 1277.21160197258 | accuracy = 0.5638297872340425


Epoch[1] Batch[475] Speed: 1.1730474817665026 samples/sec                   batch loss = 1290.8321673870087 | accuracy = 0.5636842105263158


Epoch[1] Batch[480] Speed: 1.164241258882716 samples/sec                   batch loss = 1303.8597056865692 | accuracy = 0.5635416666666667


Epoch[1] Batch[485] Speed: 1.1365364637756954 samples/sec                   batch loss = 1318.1261644363403 | accuracy = 0.5628865979381443


Epoch[1] Batch[490] Speed: 1.1555610963121634 samples/sec                   batch loss = 1331.2320623397827 | accuracy = 0.5627551020408164


Epoch[1] Batch[495] Speed: 1.157365220288047 samples/sec                   batch loss = 1345.6515617370605 | accuracy = 0.5611111111111111


Epoch[1] Batch[500] Speed: 1.1737347062105066 samples/sec                   batch loss = 1357.5129930973053 | accuracy = 0.563


Epoch[1] Batch[505] Speed: 1.1570961419544266 samples/sec                   batch loss = 1370.4115796089172 | accuracy = 0.5653465346534653


Epoch[1] Batch[510] Speed: 1.1423640141652132 samples/sec                   batch loss = 1383.7701892852783 | accuracy = 0.5651960784313725


Epoch[1] Batch[515] Speed: 1.1521804237609687 samples/sec                   batch loss = 1396.6417832374573 | accuracy = 0.566990291262136


Epoch[1] Batch[520] Speed: 1.1541495983436374 samples/sec                   batch loss = 1410.8433344364166 | accuracy = 0.5673076923076923


Epoch[1] Batch[525] Speed: 1.1805385134033057 samples/sec                   batch loss = 1424.0846672058105 | accuracy = 0.5676190476190476


Epoch[1] Batch[530] Speed: 1.1402243855495053 samples/sec                   batch loss = 1437.2884783744812 | accuracy = 0.5688679245283019


Epoch[1] Batch[535] Speed: 1.1319333397878184 samples/sec                   batch loss = 1450.0281410217285 | accuracy = 0.5705607476635514


Epoch[1] Batch[540] Speed: 1.1491650765989454 samples/sec                   batch loss = 1463.1252450942993 | accuracy = 0.5703703703703704


Epoch[1] Batch[545] Speed: 1.1566673600008934 samples/sec                   batch loss = 1476.0351219177246 | accuracy = 0.5715596330275229


Epoch[1] Batch[550] Speed: 1.1704530035019653 samples/sec                   batch loss = 1488.7443552017212 | accuracy = 0.5718181818181818


Epoch[1] Batch[555] Speed: 1.136651116758725 samples/sec                   batch loss = 1502.103182554245 | accuracy = 0.5711711711711712


Epoch[1] Batch[560] Speed: 1.1454525021912634 samples/sec                   batch loss = 1514.4060796499252 | accuracy = 0.5732142857142857


Epoch[1] Batch[565] Speed: 1.1532590542421404 samples/sec                   batch loss = 1529.4495257139206 | accuracy = 0.5716814159292035


Epoch[1] Batch[570] Speed: 1.1604234104126203 samples/sec                   batch loss = 1541.6363364458084 | accuracy = 0.5723684210526315


Epoch[1] Batch[575] Speed: 1.166286934869584 samples/sec                   batch loss = 1553.7965372800827 | accuracy = 0.5734782608695652


Epoch[1] Batch[580] Speed: 1.1460390267134248 samples/sec                   batch loss = 1567.5724333524704 | accuracy = 0.5732758620689655


Epoch[1] Batch[585] Speed: 1.1454836286094525 samples/sec                   batch loss = 1580.856304526329 | accuracy = 0.5739316239316239


Epoch[1] Batch[590] Speed: 1.1549452284069774 samples/sec                   batch loss = 1594.4435765743256 | accuracy = 0.573728813559322


Epoch[1] Batch[595] Speed: 1.1612932252906847 samples/sec                   batch loss = 1607.8963251113892 | accuracy = 0.5743697478991596


Epoch[1] Batch[600] Speed: 1.1652850515029636 samples/sec                   batch loss = 1621.331496477127 | accuracy = 0.5745833333333333


Epoch[1] Batch[605] Speed: 1.1211440828351555 samples/sec                   batch loss = 1635.380511045456 | accuracy = 0.5739669421487603


Epoch[1] Batch[610] Speed: 1.1369529132577831 samples/sec                   batch loss = 1647.4653205871582 | accuracy = 0.575


Epoch[1] Batch[615] Speed: 1.1478071569236812 samples/sec                   batch loss = 1660.8166863918304 | accuracy = 0.5747967479674797


Epoch[1] Batch[620] Speed: 1.1728769902981104 samples/sec                   batch loss = 1673.832753419876 | accuracy = 0.5758064516129032


Epoch[1] Batch[625] Speed: 1.1592948134369225 samples/sec                   batch loss = 1686.6989495754242 | accuracy = 0.576


Epoch[1] Batch[630] Speed: 1.1372729864029942 samples/sec                   batch loss = 1698.8327646255493 | accuracy = 0.5773809523809523


Epoch[1] Batch[635] Speed: 1.1492614290512313 samples/sec                   batch loss = 1710.6574537754059 | accuracy = 0.5783464566929134


Epoch[1] Batch[640] Speed: 1.1493459869146003 samples/sec                   batch loss = 1726.2112448215485 | accuracy = 0.57734375


Epoch[1] Batch[645] Speed: 1.1739459427839396 samples/sec                   batch loss = 1738.7789285182953 | accuracy = 0.5782945736434109


Epoch[1] Batch[650] Speed: 1.1458132188003831 samples/sec                   batch loss = 1753.777268409729 | accuracy = 0.5761538461538461


Epoch[1] Batch[655] Speed: 1.128280407663304 samples/sec                   batch loss = 1767.3993787765503 | accuracy = 0.5759541984732824


Epoch[1] Batch[660] Speed: 1.140427142482699 samples/sec                   batch loss = 1779.638353586197 | accuracy = 0.5768939393939394


Epoch[1] Batch[665] Speed: 1.1501473845161183 samples/sec                   batch loss = 1790.8996423482895 | accuracy = 0.5789473684210527


Epoch[1] Batch[670] Speed: 1.169552785837893 samples/sec                   batch loss = 1803.2857137918472 | accuracy = 0.5798507462686567


Epoch[1] Batch[675] Speed: 1.1357649868004647 samples/sec                   batch loss = 1815.838918685913 | accuracy = 0.5796296296296296


Epoch[1] Batch[680] Speed: 1.1297626505424534 samples/sec                   batch loss = 1827.118525505066 | accuracy = 0.580514705882353


Epoch[1] Batch[685] Speed: 1.1416093878998856 samples/sec                   batch loss = 1841.0215435028076 | accuracy = 0.5802919708029197


Epoch[1] Batch[690] Speed: 1.1549800533237726 samples/sec                   batch loss = 1853.798945426941 | accuracy = 0.5811594202898551


Epoch[1] Batch[695] Speed: 1.1682529248597497 samples/sec                   batch loss = 1868.0156099796295 | accuracy = 0.5809352517985612


Epoch[1] Batch[700] Speed: 1.1301077660447707 samples/sec                   batch loss = 1880.5605854988098 | accuracy = 0.5821428571428572


Epoch[1] Batch[705] Speed: 1.137656417267228 samples/sec                   batch loss = 1890.3597201108932 | accuracy = 0.5843971631205673


Epoch[1] Batch[710] Speed: 1.1508666853662088 samples/sec                   batch loss = 1905.0694397687912 | accuracy = 0.5841549295774648


Epoch[1] Batch[715] Speed: 1.153168529798819 samples/sec                   batch loss = 1917.356962800026 | accuracy = 0.5849650349650349


Epoch[1] Batch[720] Speed: 1.172590079541778 samples/sec                   batch loss = 1930.1438845396042 | accuracy = 0.5854166666666667


Epoch[1] Batch[725] Speed: 1.1316362627790557 samples/sec                   batch loss = 1942.2598005533218 | accuracy = 0.5868965517241379


Epoch[1] Batch[730] Speed: 1.1336074713209854 samples/sec                   batch loss = 1953.8433326482773 | accuracy = 0.5876712328767123


Epoch[1] Batch[735] Speed: 1.1499592066871012 samples/sec                   batch loss = 1967.3423672914505 | accuracy = 0.5877551020408164


Epoch[1] Batch[740] Speed: 1.161975514401468 samples/sec                   batch loss = 1978.4263550043106 | accuracy = 0.5891891891891892


Epoch[1] Batch[745] Speed: 1.15941682851439 samples/sec                   batch loss = 1991.654944062233 | accuracy = 0.5885906040268456


Epoch[1] Batch[750] Speed: 1.1198929530423323 samples/sec                   batch loss = 2003.3903003931046 | accuracy = 0.5886666666666667


Epoch[1] Batch[755] Speed: 1.1360186191711321 samples/sec                   batch loss = 2016.368897676468 | accuracy = 0.5894039735099338


Epoch[1] Batch[760] Speed: 1.1436035980276584 samples/sec                   batch loss = 2029.56278860569 | accuracy = 0.5894736842105263


Epoch[1] Batch[765] Speed: 1.1650031369338458 samples/sec                   batch loss = 2042.691497206688 | accuracy = 0.5895424836601307


Epoch[1] Batch[770] Speed: 1.1452402150869547 samples/sec                   batch loss = 2054.8451319932938 | accuracy = 0.5902597402597403


Epoch[1] Batch[775] Speed: 1.1312727446816975 samples/sec                   batch loss = 2068.4855164289474 | accuracy = 0.59


Epoch[1] Batch[780] Speed: 1.1426580347842503 samples/sec                   batch loss = 2081.983533024788 | accuracy = 0.591025641025641


Epoch[1] Batch[785] Speed: 1.1454416318055762 samples/sec                   batch loss = 2094.3601351976395 | accuracy = 0.5914012738853504


[Epoch 1] training: accuracy=0.5913705583756346
[Epoch 1] time cost: 699.4585700035095
[Epoch 1] validation: validation accuracy=0.6722222222222223


Epoch[2] Batch[5] Speed: 1.1761331536359796 samples/sec                   batch loss = 12.592769384384155 | accuracy = 0.6


Epoch[2] Batch[10] Speed: 1.1718724341391653 samples/sec                   batch loss = 23.818468809127808 | accuracy = 0.65


Epoch[2] Batch[15] Speed: 1.1526855503035611 samples/sec                   batch loss = 36.59362244606018 | accuracy = 0.6666666666666666


Epoch[2] Batch[20] Speed: 1.159440144889106 samples/sec                   batch loss = 50.138067960739136 | accuracy = 0.6375


Epoch[2] Batch[25] Speed: 1.1706741697143785 samples/sec                   batch loss = 61.687657594680786 | accuracy = 0.64


Epoch[2] Batch[30] Speed: 1.1586311905550501 samples/sec                   batch loss = 74.9232268333435 | accuracy = 0.6416666666666667


Epoch[2] Batch[35] Speed: 1.1598883861180493 samples/sec                   batch loss = 89.88202857971191 | accuracy = 0.6357142857142857


Epoch[2] Batch[40] Speed: 1.1521588226746222 samples/sec                   batch loss = 104.44343400001526 | accuracy = 0.625


Epoch[2] Batch[45] Speed: 1.1664894972451192 samples/sec                   batch loss = 116.42166423797607 | accuracy = 0.6388888888888888


Epoch[2] Batch[50] Speed: 1.1557097124932698 samples/sec                   batch loss = 125.99994730949402 | accuracy = 0.665


Epoch[2] Batch[55] Speed: 1.1536899138460468 samples/sec                   batch loss = 137.53375351428986 | accuracy = 0.6727272727272727


Epoch[2] Batch[60] Speed: 1.1616158112314485 samples/sec                   batch loss = 151.30997598171234 | accuracy = 0.6666666666666666


Epoch[2] Batch[65] Speed: 1.158074714702576 samples/sec                   batch loss = 162.64247024059296 | accuracy = 0.676923076923077


Epoch[2] Batch[70] Speed: 1.167081937599589 samples/sec                   batch loss = 175.51999163627625 | accuracy = 0.6642857142857143


Epoch[2] Batch[75] Speed: 1.1489513326040823 samples/sec                   batch loss = 188.35716462135315 | accuracy = 0.66


Epoch[2] Batch[80] Speed: 1.166474979826646 samples/sec                   batch loss = 204.63516235351562 | accuracy = 0.65


Epoch[2] Batch[85] Speed: 1.1612054537540326 samples/sec                   batch loss = 215.6713161468506 | accuracy = 0.6588235294117647


Epoch[2] Batch[90] Speed: 1.1663311228223947 samples/sec                   batch loss = 227.65603852272034 | accuracy = 0.6583333333333333


Epoch[2] Batch[95] Speed: 1.1649294441100753 samples/sec                   batch loss = 241.45284986495972 | accuracy = 0.65


Epoch[2] Batch[100] Speed: 1.1546993670123769 samples/sec                   batch loss = 254.08277010917664 | accuracy = 0.65


Epoch[2] Batch[105] Speed: 1.1668259321481755 samples/sec                   batch loss = 264.13035476207733 | accuracy = 0.6523809523809524


Epoch[2] Batch[110] Speed: 1.1693710827322352 samples/sec                   batch loss = 274.3968003988266 | accuracy = 0.6613636363636364


Epoch[2] Batch[115] Speed: 1.1681940309305276 samples/sec                   batch loss = 288.34481155872345 | accuracy = 0.658695652173913


Epoch[2] Batch[120] Speed: 1.1618457188195046 samples/sec                   batch loss = 301.3337426185608 | accuracy = 0.6583333333333333


Epoch[2] Batch[125] Speed: 1.1271792837636136 samples/sec                   batch loss = 313.9494286775589 | accuracy = 0.658


Epoch[2] Batch[130] Speed: 1.1356689621410871 samples/sec                   batch loss = 327.57298719882965 | accuracy = 0.6538461538461539


Epoch[2] Batch[135] Speed: 1.1524795989969427 samples/sec                   batch loss = 339.1183547973633 | accuracy = 0.6592592592592592


Epoch[2] Batch[140] Speed: 1.1701182280549687 samples/sec                   batch loss = 352.02465903759 | accuracy = 0.6553571428571429


Epoch[2] Batch[145] Speed: 1.1614669588123103 samples/sec                   batch loss = 364.9028435945511 | accuracy = 0.656896551724138


Epoch[2] Batch[150] Speed: 1.1338627463198427 samples/sec                   batch loss = 375.8397847414017 | accuracy = 0.6583333333333333


Epoch[2] Batch[155] Speed: 1.1423703146874098 samples/sec                   batch loss = 388.1294685602188 | accuracy = 0.6580645161290323


Epoch[2] Batch[160] Speed: 1.1645009809321232 samples/sec                   batch loss = 402.9303766489029 | accuracy = 0.6515625


Epoch[2] Batch[165] Speed: 1.1748208538555538 samples/sec                   batch loss = 417.654797911644 | accuracy = 0.646969696969697


Epoch[2] Batch[170] Speed: 1.1497866921973547 samples/sec                   batch loss = 428.9595218896866 | accuracy = 0.6529411764705882


Epoch[2] Batch[175] Speed: 1.1325415702614627 samples/sec                   batch loss = 441.8724821805954 | accuracy = 0.6514285714285715


Epoch[2] Batch[180] Speed: 1.147768208947086 samples/sec                   batch loss = 454.57437479496 | accuracy = 0.6527777777777778


Epoch[2] Batch[185] Speed: 1.1645480245106543 samples/sec                   batch loss = 467.08018028736115 | accuracy = 0.654054054054054


Epoch[2] Batch[190] Speed: 1.171579060422031 samples/sec                   batch loss = 478.8860626220703 | accuracy = 0.6526315789473685


Epoch[2] Batch[195] Speed: 1.1448257979285996 samples/sec                   batch loss = 492.39081287384033 | accuracy = 0.6525641025641026


Epoch[2] Batch[200] Speed: 1.1326390548523206 samples/sec                   batch loss = 504.04914569854736 | accuracy = 0.65375


Epoch[2] Batch[205] Speed: 1.1494847391541856 samples/sec                   batch loss = 514.8244352340698 | accuracy = 0.6560975609756098


Epoch[2] Batch[210] Speed: 1.1638004654170777 samples/sec                   batch loss = 525.0532438158989 | accuracy = 0.6571428571428571


Epoch[2] Batch[215] Speed: 1.1688431680961826 samples/sec                   batch loss = 536.9634385704994 | accuracy = 0.6569767441860465


Epoch[2] Batch[220] Speed: 1.1408642916658138 samples/sec                   batch loss = 550.5137532353401 | accuracy = 0.6534090909090909


Epoch[2] Batch[225] Speed: 1.1340312049573633 samples/sec                   batch loss = 562.560618698597 | accuracy = 0.6522222222222223


Epoch[2] Batch[230] Speed: 1.1564428449768243 samples/sec                   batch loss = 573.3191921114922 | accuracy = 0.6532608695652173


Epoch[2] Batch[235] Speed: 1.1674146527335778 samples/sec                   batch loss = 586.0912639498711 | accuracy = 0.6563829787234042


Epoch[2] Batch[240] Speed: 1.1682251854047054 samples/sec                   batch loss = 598.6537567973137 | accuracy = 0.65625


Epoch[2] Batch[245] Speed: 1.1327765557635283 samples/sec                   batch loss = 612.0364584326744 | accuracy = 0.6551020408163265


Epoch[2] Batch[250] Speed: 1.131364060067193 samples/sec                   batch loss = 623.5701896548271 | accuracy = 0.656


Epoch[2] Batch[255] Speed: 1.158420469800187 samples/sec                   batch loss = 636.0876169800758 | accuracy = 0.6549019607843137


Epoch[2] Batch[260] Speed: 1.164387590815441 samples/sec                   batch loss = 648.4588342308998 | accuracy = 0.6557692307692308


Epoch[2] Batch[265] Speed: 1.1623925370607013 samples/sec                   batch loss = 663.9328172802925 | accuracy = 0.6518867924528302


Epoch[2] Batch[270] Speed: 1.1359423175958225 samples/sec                   batch loss = 675.2259535193443 | accuracy = 0.6537037037037037


Epoch[2] Batch[275] Speed: 1.138136996092381 samples/sec                   batch loss = 685.8586239814758 | accuracy = 0.6536363636363637


Epoch[2] Batch[280] Speed: 1.1559070242994338 samples/sec                   batch loss = 696.303685426712 | accuracy = 0.6571428571428571


Epoch[2] Batch[285] Speed: 1.168900987341795 samples/sec                   batch loss = 711.606173992157 | accuracy = 0.6543859649122807


Epoch[2] Batch[290] Speed: 1.1595363849359661 samples/sec                   batch loss = 724.6831440925598 | accuracy = 0.6551724137931034


Epoch[2] Batch[295] Speed: 1.1406667303950557 samples/sec                   batch loss = 735.9228148460388 | accuracy = 0.6576271186440678


Epoch[2] Batch[300] Speed: 1.1418775282185156 samples/sec                   batch loss = 747.2592650651932 | accuracy = 0.66


Epoch[2] Batch[305] Speed: 1.1649995774613364 samples/sec                   batch loss = 760.4967044591904 | accuracy = 0.660655737704918


Epoch[2] Batch[310] Speed: 1.1763162221503165 samples/sec                   batch loss = 770.7820039987564 | accuracy = 0.6612903225806451


Epoch[2] Batch[315] Speed: 1.1489112054482493 samples/sec                   batch loss = 783.6160563230515 | accuracy = 0.6595238095238095


Epoch[2] Batch[320] Speed: 1.1372409941054298 samples/sec                   batch loss = 795.1245547533035 | accuracy = 0.6609375


Epoch[2] Batch[325] Speed: 1.1507268100530952 samples/sec                   batch loss = 808.9297087192535 | accuracy = 0.6592307692307692


Epoch[2] Batch[330] Speed: 1.1640708136158382 samples/sec                   batch loss = 823.0873143672943 | accuracy = 0.656060606060606


Epoch[2] Batch[335] Speed: 1.173337160250112 samples/sec                   batch loss = 837.002778172493 | accuracy = 0.6544776119402985


Epoch[2] Batch[340] Speed: 1.1506935828242486 samples/sec                   batch loss = 848.6234192848206 | accuracy = 0.6544117647058824


Epoch[2] Batch[345] Speed: 1.1410255250591537 samples/sec                   batch loss = 860.7091505527496 | accuracy = 0.6543478260869565


Epoch[2] Batch[350] Speed: 1.1604710080695169 samples/sec                   batch loss = 870.4676049947739 | accuracy = 0.6571428571428571


Epoch[2] Batch[355] Speed: 1.172297820883137 samples/sec                   batch loss = 882.9107922315598 | accuracy = 0.6577464788732394


Epoch[2] Batch[360] Speed: 1.179469549942936 samples/sec                   batch loss = 895.1704965829849 | accuracy = 0.6590277777777778


Epoch[2] Batch[365] Speed: 1.1502758411930847 samples/sec                   batch loss = 907.2835694551468 | accuracy = 0.6602739726027397


Epoch[2] Batch[370] Speed: 1.146366821727461 samples/sec                   batch loss = 918.9207216501236 | accuracy = 0.6621621621621622


Epoch[2] Batch[375] Speed: 1.1553509337583454 samples/sec                   batch loss = 934.4252153635025 | accuracy = 0.6593333333333333


Epoch[2] Batch[380] Speed: 1.1663279606374768 samples/sec                   batch loss = 946.1833930015564 | accuracy = 0.6585526315789474


Epoch[2] Batch[385] Speed: 1.1675490269285798 samples/sec                   batch loss = 956.9651812314987 | accuracy = 0.6597402597402597


Epoch[2] Batch[390] Speed: 1.1346568865286033 samples/sec                   batch loss = 970.0917454957962 | accuracy = 0.657051282051282


Epoch[2] Batch[395] Speed: 1.1380303801428364 samples/sec                   batch loss = 979.4977223873138 | accuracy = 0.660126582278481


Epoch[2] Batch[400] Speed: 1.1454772936987954 samples/sec                   batch loss = 991.130664229393 | accuracy = 0.66


Epoch[2] Batch[405] Speed: 1.1573252217907937 samples/sec                   batch loss = 1001.7278664112091 | accuracy = 0.6611111111111111


Epoch[2] Batch[410] Speed: 1.1598969663592038 samples/sec                   batch loss = 1012.7740038633347 | accuracy = 0.6615853658536586


Epoch[2] Batch[415] Speed: 1.1464779825691396 samples/sec                   batch loss = 1024.0329457521439 | accuracy = 0.6626506024096386


Epoch[2] Batch[420] Speed: 1.137620237865823 samples/sec                   batch loss = 1039.3993538618088 | accuracy = 0.6613095238095238


Epoch[2] Batch[425] Speed: 1.1557761921987593 samples/sec                   batch loss = 1050.7209326028824 | accuracy = 0.6617647058823529


Epoch[2] Batch[430] Speed: 1.1759006892916757 samples/sec                   batch loss = 1061.673870921135 | accuracy = 0.6616279069767442


Epoch[2] Batch[435] Speed: 1.161752312914722 samples/sec                   batch loss = 1074.9373832941055 | accuracy = 0.6603448275862069


Epoch[2] Batch[440] Speed: 1.1498047372003202 samples/sec                   batch loss = 1086.3320932388306 | accuracy = 0.6596590909090909


Epoch[2] Batch[445] Speed: 1.1466850075582362 samples/sec                   batch loss = 1098.6748888492584 | accuracy = 0.6578651685393259


Epoch[2] Batch[450] Speed: 1.1577420284204516 samples/sec                   batch loss = 1109.729014992714 | accuracy = 0.6577777777777778


Epoch[2] Batch[455] Speed: 1.1825367988209885 samples/sec                   batch loss = 1120.7803575992584 | accuracy = 0.6576923076923077


Epoch[2] Batch[460] Speed: 1.1676267083463043 samples/sec                   batch loss = 1131.3230702877045 | accuracy = 0.6592391304347827


Epoch[2] Batch[465] Speed: 1.15629483712077 samples/sec                   batch loss = 1144.3482211828232 | accuracy = 0.6586021505376344


Epoch[2] Batch[470] Speed: 1.1543713968931937 samples/sec                   batch loss = 1157.7486449480057 | accuracy = 0.6579787234042553


Epoch[2] Batch[475] Speed: 1.158637431746105 samples/sec                   batch loss = 1169.157572388649 | accuracy = 0.6573684210526316


Epoch[2] Batch[480] Speed: 1.177716186618545 samples/sec                   batch loss = 1179.8611197471619 | accuracy = 0.6578125


Epoch[2] Batch[485] Speed: 1.1716379688778635 samples/sec                   batch loss = 1191.9486999511719 | accuracy = 0.6582474226804124


Epoch[2] Batch[490] Speed: 1.1595461620877727 samples/sec                   batch loss = 1202.5500844717026 | accuracy = 0.6586734693877551


Epoch[2] Batch[495] Speed: 1.1600116490580221 samples/sec                   batch loss = 1212.224925518036 | accuracy = 0.6601010101010101


Epoch[2] Batch[500] Speed: 1.1712333381339994 samples/sec                   batch loss = 1222.1763058900833 | accuracy = 0.6605


Epoch[2] Batch[505] Speed: 1.1732634759747027 samples/sec                   batch loss = 1233.7747728824615 | accuracy = 0.6594059405940594


Epoch[2] Batch[510] Speed: 1.1639110765903569 samples/sec                   batch loss = 1246.0064101219177 | accuracy = 0.6598039215686274


Epoch[2] Batch[515] Speed: 1.162320865154087 samples/sec                   batch loss = 1258.9559780359268 | accuracy = 0.6597087378640777


Epoch[2] Batch[520] Speed: 1.1664650854803422 samples/sec                   batch loss = 1272.0859124660492 | accuracy = 0.6605769230769231


Epoch[2] Batch[525] Speed: 1.167650762429175 samples/sec                   batch loss = 1285.7189211845398 | accuracy = 0.660952380952381


Epoch[2] Batch[530] Speed: 1.1731868475818656 samples/sec                   batch loss = 1298.1231617927551 | accuracy = 0.6608490566037736


Epoch[2] Batch[535] Speed: 1.1692679878660301 samples/sec                   batch loss = 1308.3131242990494 | accuracy = 0.6626168224299065


Epoch[2] Batch[540] Speed: 1.167506940268631 samples/sec                   batch loss = 1321.985745549202 | accuracy = 0.662962962962963


Epoch[2] Batch[545] Speed: 1.175560896206323 samples/sec                   batch loss = 1332.5482144355774 | accuracy = 0.6637614678899083


Epoch[2] Batch[550] Speed: 1.1619172516692693 samples/sec                   batch loss = 1343.5155093669891 | accuracy = 0.6640909090909091


Epoch[2] Batch[555] Speed: 1.1687323502927958 samples/sec                   batch loss = 1355.0930037498474 | accuracy = 0.6648648648648648


Epoch[2] Batch[560] Speed: 1.164679394832686 samples/sec                   batch loss = 1364.5721744298935 | accuracy = 0.6651785714285714


Epoch[2] Batch[565] Speed: 1.1671977207622022 samples/sec                   batch loss = 1377.8742154836655 | accuracy = 0.6641592920353983


Epoch[2] Batch[570] Speed: 1.1698280973793176 samples/sec                   batch loss = 1390.2057147026062 | accuracy = 0.6635964912280702


Epoch[2] Batch[575] Speed: 1.1684496606093369 samples/sec                   batch loss = 1401.9755799770355 | accuracy = 0.6643478260869565


Epoch[2] Batch[580] Speed: 1.1696581326481355 samples/sec                   batch loss = 1412.2606498003006 | accuracy = 0.6655172413793103


Epoch[2] Batch[585] Speed: 1.180137425117773 samples/sec                   batch loss = 1423.3678983449936 | accuracy = 0.6662393162393162


Epoch[2] Batch[590] Speed: 1.1847324087120952 samples/sec                   batch loss = 1431.7552886009216 | accuracy = 0.6677966101694915


Epoch[2] Batch[595] Speed: 1.1788607420606927 samples/sec                   batch loss = 1445.0556172132492 | accuracy = 0.6668067226890756


Epoch[2] Batch[600] Speed: 1.1802309880053017 samples/sec                   batch loss = 1455.6693587303162 | accuracy = 0.6675


Epoch[2] Batch[605] Speed: 1.1813623043739345 samples/sec                   batch loss = 1468.9649324417114 | accuracy = 0.6673553719008265


Epoch[2] Batch[610] Speed: 1.1673844349864584 samples/sec                   batch loss = 1483.2171325683594 | accuracy = 0.6676229508196722


Epoch[2] Batch[615] Speed: 1.176172896101111 samples/sec                   batch loss = 1493.0351068973541 | accuracy = 0.6682926829268293


Epoch[2] Batch[620] Speed: 1.1782320412345708 samples/sec                   batch loss = 1504.1441791057587 | accuracy = 0.6685483870967742


Epoch[2] Batch[625] Speed: 1.1864136909118501 samples/sec                   batch loss = 1513.1630282402039 | accuracy = 0.67


Epoch[2] Batch[630] Speed: 1.1829090770128268 samples/sec                   batch loss = 1524.1381362080574 | accuracy = 0.6706349206349206


Epoch[2] Batch[635] Speed: 1.1735910233789024 samples/sec                   batch loss = 1532.4054921269417 | accuracy = 0.6716535433070866


Epoch[2] Batch[640] Speed: 1.1751561035079467 samples/sec                   batch loss = 1542.6858729720116 | accuracy = 0.671875


Epoch[2] Batch[645] Speed: 1.1891530676947342 samples/sec                   batch loss = 1553.0844013094902 | accuracy = 0.6717054263565891


Epoch[2] Batch[650] Speed: 1.1838613904353106 samples/sec                   batch loss = 1563.9835250973701 | accuracy = 0.6723076923076923


Epoch[2] Batch[655] Speed: 1.1758479442199552 samples/sec                   batch loss = 1580.2183546423912 | accuracy = 0.6725190839694657


Epoch[2] Batch[660] Speed: 1.1757404083817993 samples/sec                   batch loss = 1591.9509537816048 | accuracy = 0.671969696969697


Epoch[2] Batch[665] Speed: 1.1783170264793792 samples/sec                   batch loss = 1603.2985818982124 | accuracy = 0.6714285714285714


Epoch[2] Batch[670] Speed: 1.1856390619685426 samples/sec                   batch loss = 1615.8536711335182 | accuracy = 0.6708955223880597


Epoch[2] Batch[675] Speed: 1.1862391245431816 samples/sec                   batch loss = 1629.1559219956398 | accuracy = 0.6703703703703704


Epoch[2] Batch[680] Speed: 1.170861834499202 samples/sec                   batch loss = 1642.2389683127403 | accuracy = 0.6702205882352941


Epoch[2] Batch[685] Speed: 1.147086419545439 samples/sec                   batch loss = 1655.6125966906548 | accuracy = 0.6697080291970803


Epoch[2] Batch[690] Speed: 1.147417482171556 samples/sec                   batch loss = 1668.5543655753136 | accuracy = 0.6699275362318841


Epoch[2] Batch[695] Speed: 1.170069754145545 samples/sec                   batch loss = 1680.08453053236 | accuracy = 0.670863309352518


Epoch[2] Batch[700] Speed: 1.1806926270259224 samples/sec                   batch loss = 1692.326437175274 | accuracy = 0.6707142857142857


Epoch[2] Batch[705] Speed: 1.1566804381361118 samples/sec                   batch loss = 1704.7610681653023 | accuracy = 0.6705673758865248


Epoch[2] Batch[710] Speed: 1.1443248031748483 samples/sec                   batch loss = 1713.9500148892403 | accuracy = 0.6721830985915493


Epoch[2] Batch[715] Speed: 1.1512464619715466 samples/sec                   batch loss = 1726.4584420323372 | accuracy = 0.6723776223776223


Epoch[2] Batch[720] Speed: 1.1768521486727352 samples/sec                   batch loss = 1740.3215414881706 | accuracy = 0.6725694444444444


Epoch[2] Batch[725] Speed: 1.1806554034468615 samples/sec                   batch loss = 1752.5948931574821 | accuracy = 0.6724137931034483


Epoch[2] Batch[730] Speed: 1.1573218687489593 samples/sec                   batch loss = 1764.2123547196388 | accuracy = 0.672945205479452


Epoch[2] Batch[735] Speed: 1.1472225871044048 samples/sec                   batch loss = 1776.5004277825356 | accuracy = 0.6724489795918367


Epoch[2] Batch[740] Speed: 1.1483719639584091 samples/sec                   batch loss = 1787.3878765702248 | accuracy = 0.6733108108108108


Epoch[2] Batch[745] Speed: 1.172893307442439 samples/sec                   batch loss = 1801.1749157309532 | accuracy = 0.6728187919463087


Epoch[2] Batch[750] Speed: 1.1747451734853689 samples/sec                   batch loss = 1812.3804178833961 | accuracy = 0.6726666666666666


Epoch[2] Batch[755] Speed: 1.1500520659712976 samples/sec                   batch loss = 1823.5001474022865 | accuracy = 0.6735099337748345


Epoch[2] Batch[760] Speed: 1.1471045367513149 samples/sec                   batch loss = 1833.5745418667793 | accuracy = 0.6736842105263158


Epoch[2] Batch[765] Speed: 1.156582997187272 samples/sec                   batch loss = 1844.377526819706 | accuracy = 0.6745098039215687


Epoch[2] Batch[770] Speed: 1.1892060859745874 samples/sec                   batch loss = 1854.9958909153938 | accuracy = 0.6753246753246753


Epoch[2] Batch[775] Speed: 1.179484475535134 samples/sec                   batch loss = 1864.058714568615 | accuracy = 0.6764516129032258


Epoch[2] Batch[780] Speed: 1.1545544270058068 samples/sec                   batch loss = 1874.2061901688576 | accuracy = 0.6772435897435898


Epoch[2] Batch[785] Speed: 1.1577300447331966 samples/sec                   batch loss = 1887.2078105807304 | accuracy = 0.6770700636942675


[Epoch 2] training: accuracy=0.6773477157360406
[Epoch 2] time cost: 695.6619911193848
[Epoch 2] validation: validation accuracy=0.7488888888888889
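
The `batch loss` value in the log above appears to be a running sum that resets at the start of each epoch, so it is hard to compare across batches directly. The following is a minimal sketch, not part of the original training script, that parses one progress line and divides the running loss by the batch index to get a mean loss per batch. The names `LINE_RE` and `parse_line` are hypothetical helpers introduced here for illustration, and the running-sum interpretation is an assumption inferred from the output.

```python
import re

# Pattern for the per-batch progress lines shown in the log above.
# Assumption: every progress line follows exactly this layout.
LINE_RE = re.compile(
    r"Epoch\[(\d+)\] Batch\[(\d+)\] Speed: ([\d.]+) samples/sec\s+"
    r"batch loss = ([\d.]+) \| accuracy = ([\d.]+)"
)


def parse_line(line):
    """Return (epoch, batch, speed, running_loss, accuracy), or None if no match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    epoch, batch, speed, loss, acc = match.groups()
    return int(epoch), int(batch), float(speed), float(loss), float(acc)


# Last progress line of epoch 2, copied from the log above.
sample = (
    "Epoch[2] Batch[785] Speed: 1.1577300447331966 samples/sec"
    "                   batch loss = 1887.2078105807304 | accuracy = 0.6770700636942675"
)

epoch, batch, speed, running_loss, acc = parse_line(sample)

# Assumption: the logged loss is cumulative within the epoch,
# so dividing by the batch index gives the mean loss per batch.
mean_loss = running_loss / batch
print(f"epoch {epoch}: mean loss per batch = {mean_loss:.4f}, accuracy = {acc:.4f}")
```

Applying the same helper to every progress line would give a per-epoch curve of mean loss, which is easier to compare between epochs than the raw running sum.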


## Next steps

Now that you have trained a neural network and run predictions with it on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).