<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference may be considerably faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[21:38:36] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. In the example code here, you will copy `x` to the second GPU, `npx.gpu(1)`. If you only have one GPU, the code falls back to `npx.cpu()` and copies `x` to main memory instead:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[21:38:36] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
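
A small helper can make this selection explicit. The following is a minimal sketch and not part of MXNet: `select_devices` and its parameters are hypothetical names, and the string-returning stand-in factories exist only so the example runs on a machine without MXNet or a GPU. With MXNet, you would presumably pass `npx.num_gpus()` as the count and `npx.gpu` and `npx.cpu` as the factories.

```python
def select_devices(num_available, num_requested, make_gpu, make_cpu):
    """Return up to num_requested GPU devices, or one CPU device if no GPU exists.

    make_gpu(i) and make_cpu() are factories supplied by the caller
    (hypothetical parameters; with MXNet they could be npx.gpu and npx.cpu).
    """
    if num_requested < 1:
        raise ValueError("num_requested must be at least 1")
    count = min(num_available, num_requested)
    if count == 0:
        # No GPU available: fall back to a single CPU device.
        return [make_cpu()]
    # GPU IDs are 0-indexed, so the i-th GPU has ID i.
    return [make_gpu(i) for i in range(count)]


# Stand-in factories that return strings, so the sketch runs without MXNet.
def fake_gpu(i):
    return f"gpu({i})"


def fake_cpu():
    return "cpu(0)"


print(select_devices(8, 2, fake_gpu, fake_cpu))  # ['gpu(0)', 'gpu(1)']
print(select_devices(0, 2, fake_gpu, fake_cpu))  # ['cpu(0)']
```

Falling back to a single CPU device keeps the rest of a script runnable on machines without GPUs, in the same spirit as the `npx.gpu() if npx.num_gpus() > 0 else npx.cpu()` pattern used earlier.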

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the input of an operation is already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` module support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that all inputs to an operation must be on the same device. If they are not, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[21:38:36] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 6.0209208, -4.5477695]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
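
To make the idea concrete, here is a plain-Python sketch (no MXNet required) that simulates two devices. The helpers `split_batch` and `grad_sum`, the one-parameter model `w * x`, the squared-error loss, and the data values are all assumptions chosen for illustration. The sketch mirrors the idea of data parallelism, not the implementation of any MXNet function.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous, equally sized slices."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    step = len(batch) // num_parts
    return [batch[i * step:(i + 1) * step] for i in range(num_parts)]


def grad_sum(w, samples):
    """Sum over samples of d/dw (w*x - y)**2, which is 2*(w*x - y)*x."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


# A toy batch of (x, y) pairs and a current weight value.
batch = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.0)]
w = 1.5

# Each simulated device computes the gradient on its own shard.
shards = split_batch(batch, 2)
per_device = [grad_sum(w, shard) for shard in shards]

# Summing the per-device gradients gives the full-batch gradient.
combined = sum(per_device)
full = grad_sum(w, batch)

print(per_device)      # [-7.0, -22.0]
print(combined, full)  # -29.0 -29.0
```

Because the gradient of a summed loss is the sum of the per-sample gradients, the shards can be processed independently and combined afterward. Dividing the combined value by the batch size then gives the mean gradient over the whole batch.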

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
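
The `acc.update(label_list, outputs)` call above receives one entry per device. As a rough illustration of what accumulating accuracy over several shards involves, here is a plain-Python sketch. `accuracy_over_shards` is a hypothetical helper, not the implementation of `gluon.metric.Accuracy`, and the labels and scores are made up for the example.

```python
def accuracy_over_shards(label_shards, output_shards):
    """Fraction of samples whose highest-scoring class equals the label.

    label_shards: one list of integer labels per device.
    output_shards: one list of per-class score lists per device.
    """
    correct = 0
    total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            # The predicted class is the index of the largest score.
            predicted = max(range(len(scores)), key=lambda i: scores[i])
            correct += int(predicted == label)
            total += 1
    if total == 0:
        raise ValueError("no samples provided")
    return correct / total


# Two simulated devices, two samples each, two classes.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],  # both predictions match the labels
    [[1.5, 0.2], [0.8, 0.1]],   # first prediction is wrong, second is right
]

result = accuracy_over_shards(label_shards, output_shards)
print(result)  # 0.75
```

The counts are accumulated across all shards before dividing, so the result does not depend on how the batch was split between devices.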

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # conversion copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7708740984950941 samples/sec                   batch loss = 13.648914575576782 | accuracy = 0.65


Epoch[1] Batch[10] Speed: 1.2506082445754811 samples/sec                   batch loss = 28.191324472427368 | accuracy = 0.55


Epoch[1] Batch[15] Speed: 1.2529224897774283 samples/sec                   batch loss = 41.98419237136841 | accuracy = 0.5833333333333334


Epoch[1] Batch[20] Speed: 1.243557776179382 samples/sec                   batch loss = 55.26859164237976 | accuracy = 0.575


Epoch[1] Batch[25] Speed: 1.2536016954376643 samples/sec                   batch loss = 70.78349995613098 | accuracy = 0.55


Epoch[1] Batch[30] Speed: 1.253118826730786 samples/sec                   batch loss = 85.39969968795776 | accuracy = 0.55


Epoch[1] Batch[35] Speed: 1.2392600079833997 samples/sec                   batch loss = 99.57779002189636 | accuracy = 0.5357142857142857


Epoch[1] Batch[40] Speed: 1.2386008221156022 samples/sec                   batch loss = 113.1980767250061 | accuracy = 0.55625


Epoch[1] Batch[45] Speed: 1.2350264812857654 samples/sec                   batch loss = 127.18209052085876 | accuracy = 0.5666666666666667


Epoch[1] Batch[50] Speed: 1.2460695319682868 samples/sec                   batch loss = 141.41956686973572 | accuracy = 0.565


Epoch[1] Batch[55] Speed: 1.252685526221573 samples/sec                   batch loss = 155.30158352851868 | accuracy = 0.5681818181818182


Epoch[1] Batch[60] Speed: 1.2476652406367004 samples/sec                   batch loss = 170.0306203365326 | accuracy = 0.5583333333333333


Epoch[1] Batch[65] Speed: 1.2497092900912818 samples/sec                   batch loss = 183.26158118247986 | accuracy = 0.5692307692307692


Epoch[1] Batch[70] Speed: 1.2440638371456352 samples/sec                   batch loss = 197.02024292945862 | accuracy = 0.5678571428571428


Epoch[1] Batch[75] Speed: 1.2471683910627267 samples/sec                   batch loss = 211.57759284973145 | accuracy = 0.5566666666666666


Epoch[1] Batch[80] Speed: 1.2493727711136402 samples/sec                   batch loss = 224.83905410766602 | accuracy = 0.56875


Epoch[1] Batch[85] Speed: 1.235450287739052 samples/sec                   batch loss = 238.88493967056274 | accuracy = 0.5647058823529412


Epoch[1] Batch[90] Speed: 1.2410059112159246 samples/sec                   batch loss = 252.48031854629517 | accuracy = 0.5638888888888889


Epoch[1] Batch[95] Speed: 1.2370834759201708 samples/sec                   batch loss = 266.0102458000183 | accuracy = 0.5684210526315789


Epoch[1] Batch[100] Speed: 1.2358580893409956 samples/sec                   batch loss = 279.83588194847107 | accuracy = 0.5675


Epoch[1] Batch[105] Speed: 1.234695732410136 samples/sec                   batch loss = 294.01812386512756 | accuracy = 0.5595238095238095


Epoch[1] Batch[110] Speed: 1.235203971685134 samples/sec                   batch loss = 308.20148038864136 | accuracy = 0.55


Epoch[1] Batch[115] Speed: 1.2506308049278856 samples/sec                   batch loss = 321.51720690727234 | accuracy = 0.5543478260869565


Epoch[1] Batch[120] Speed: 1.2519599446628442 samples/sec                   batch loss = 335.0112669467926 | accuracy = 0.5583333333333333


Epoch[1] Batch[125] Speed: 1.2444705149142794 samples/sec                   batch loss = 348.9748685359955 | accuracy = 0.558


Epoch[1] Batch[130] Speed: 1.237382650403079 samples/sec                   batch loss = 362.8165090084076 | accuracy = 0.5576923076923077


Epoch[1] Batch[135] Speed: 1.2387980926136775 samples/sec                   batch loss = 375.91596269607544 | accuracy = 0.5648148148148148


Epoch[1] Batch[140] Speed: 1.2378539222796818 samples/sec                   batch loss = 389.65317821502686 | accuracy = 0.5642857142857143


Epoch[1] Batch[145] Speed: 1.2508150471925539 samples/sec                   batch loss = 403.2741401195526 | accuracy = 0.5637931034482758


Epoch[1] Batch[150] Speed: 1.2592521037991065 samples/sec                   batch loss = 416.07664728164673 | accuracy = 0.5683333333333334


Epoch[1] Batch[155] Speed: 1.246161901212877 samples/sec                   batch loss = 429.54346561431885 | accuracy = 0.567741935483871


Epoch[1] Batch[160] Speed: 1.2519778824317087 samples/sec                   batch loss = 443.28702783584595 | accuracy = 0.5671875


Epoch[1] Batch[165] Speed: 1.2472836412273494 samples/sec                   batch loss = 456.904910326004 | accuracy = 0.5681818181818182


Epoch[1] Batch[170] Speed: 1.2483347628071717 samples/sec                   batch loss = 471.3045163154602 | accuracy = 0.5632352941176471


Epoch[1] Batch[175] Speed: 1.2410952359851881 samples/sec                   batch loss = 485.23282837867737 | accuracy = 0.5614285714285714


Epoch[1] Batch[180] Speed: 1.2355823091813256 samples/sec                   batch loss = 498.9750304222107 | accuracy = 0.5597222222222222


Epoch[1] Batch[185] Speed: 1.236959979768883 samples/sec                   batch loss = 512.6900110244751 | accuracy = 0.5581081081081081


Epoch[1] Batch[190] Speed: 1.2475579907711494 samples/sec                   batch loss = 526.32155418396 | accuracy = 0.5578947368421052


Epoch[1] Batch[195] Speed: 1.248003346810733 samples/sec                   batch loss = 539.8674750328064 | accuracy = 0.5615384615384615


Epoch[1] Batch[200] Speed: 1.2503946141531042 samples/sec                   batch loss = 553.9391067028046 | accuracy = 0.56125


Epoch[1] Batch[205] Speed: 1.2453062978345073 samples/sec                   batch loss = 567.2073874473572 | accuracy = 0.5634146341463414


Epoch[1] Batch[210] Speed: 1.239250579563967 samples/sec                   batch loss = 581.4925103187561 | accuracy = 0.5619047619047619


Epoch[1] Batch[215] Speed: 1.2299972954630716 samples/sec                   batch loss = 595.6836054325104 | accuracy = 0.5616279069767441


Epoch[1] Batch[220] Speed: 1.2284789039271131 samples/sec                   batch loss = 609.8116042613983 | accuracy = 0.5579545454545455


Epoch[1] Batch[225] Speed: 1.2352788202734792 samples/sec                   batch loss = 623.334450006485 | accuracy = 0.56


Epoch[1] Batch[230] Speed: 1.2332745041696815 samples/sec                   batch loss = 637.0354912281036 | accuracy = 0.5565217391304348


Epoch[1] Batch[235] Speed: 1.2362506739531594 samples/sec                   batch loss = 651.4510564804077 | accuracy = 0.5521276595744681


Epoch[1] Batch[240] Speed: 1.2581652576577085 samples/sec                   batch loss = 664.1265068054199 | accuracy = 0.559375


Epoch[1] Batch[245] Speed: 1.2548729574357689 samples/sec                   batch loss = 677.6868088245392 | accuracy = 0.5591836734693878


Epoch[1] Batch[250] Speed: 1.2543620988795745 samples/sec                   batch loss = 691.1919691562653 | accuracy = 0.56


Epoch[1] Batch[255] Speed: 1.2570052208088516 samples/sec                   batch loss = 705.5231168270111 | accuracy = 0.5588235294117647


Epoch[1] Batch[260] Speed: 1.2546525194955849 samples/sec                   batch loss = 719.2712888717651 | accuracy = 0.5596153846153846


Epoch[1] Batch[265] Speed: 1.2556134786953344 samples/sec                   batch loss = 731.8392543792725 | accuracy = 0.5660377358490566


Epoch[1] Batch[270] Speed: 1.2452464033832382 samples/sec                   batch loss = 745.44051861763 | accuracy = 0.5675925925925925


Epoch[1] Batch[275] Speed: 1.2454494023486185 samples/sec                   batch loss = 759.1354570388794 | accuracy = 0.5663636363636364


Epoch[1] Batch[280] Speed: 1.240996364416402 samples/sec                   batch loss = 772.3857998847961 | accuracy = 0.5669642857142857


Epoch[1] Batch[285] Speed: 1.252577037498211 samples/sec                   batch loss = 786.602082490921 | accuracy = 0.5640350877192982


Epoch[1] Batch[290] Speed: 1.2552581824167355 samples/sec                   batch loss = 800.0212473869324 | accuracy = 0.5663793103448276


Epoch[1] Batch[295] Speed: 1.2559022227918055 samples/sec                   batch loss = 814.2971692085266 | accuracy = 0.5652542372881356


Epoch[1] Batch[300] Speed: 1.2548425475975016 samples/sec                   batch loss = 827.6204545497894 | accuracy = 0.5641666666666667


Epoch[1] Batch[305] Speed: 1.2597524813845942 samples/sec                   batch loss = 840.6177132129669 | accuracy = 0.5663934426229508


Epoch[1] Batch[310] Speed: 1.2570126609971832 samples/sec                   batch loss = 853.8333513736725 | accuracy = 0.5669354838709677


Epoch[1] Batch[315] Speed: 1.2497138514661545 samples/sec                   batch loss = 867.7705028057098 | accuracy = 0.5674603174603174


Epoch[1] Batch[320] Speed: 1.244278447197276 samples/sec                   batch loss = 880.7287180423737 | accuracy = 0.57109375


Epoch[1] Batch[325] Speed: 1.2445157485183318 samples/sec                   batch loss = 893.9915609359741 | accuracy = 0.5707692307692308


Epoch[1] Batch[330] Speed: 1.2471742318700243 samples/sec                   batch loss = 906.819249868393 | accuracy = 0.571969696969697


Epoch[1] Batch[335] Speed: 1.24455239938038 samples/sec                   batch loss = 921.1751322746277 | accuracy = 0.5701492537313433


Epoch[1] Batch[340] Speed: 1.2442928432982525 samples/sec                   batch loss = 934.7090172767639 | accuracy = 0.5683823529411764


Epoch[1] Batch[345] Speed: 1.2412971591774236 samples/sec                   batch loss = 947.8685984611511 | accuracy = 0.5688405797101449


Epoch[1] Batch[350] Speed: 1.258379380935119 samples/sec                   batch loss = 960.7775845527649 | accuracy = 0.5707142857142857


Epoch[1] Batch[355] Speed: 1.2602159593400992 samples/sec                   batch loss = 973.5555393695831 | accuracy = 0.571830985915493


Epoch[1] Batch[360] Speed: 1.2534595209211932 samples/sec                   batch loss = 986.9878013134003 | accuracy = 0.5729166666666666


Epoch[1] Batch[365] Speed: 1.245302970213838 samples/sec                   batch loss = 1000.5695207118988 | accuracy = 0.571917808219178


Epoch[1] Batch[370] Speed: 1.2447436281810127 samples/sec                   batch loss = 1013.9347221851349 | accuracy = 0.572972972972973


Epoch[1] Batch[375] Speed: 1.241156200866601 samples/sec                   batch loss = 1027.7644138336182 | accuracy = 0.572


Epoch[1] Batch[380] Speed: 1.2429127032680989 samples/sec                   batch loss = 1042.633730173111 | accuracy = 0.5677631578947369


Epoch[1] Batch[385] Speed: 1.2451045467661095 samples/sec                   batch loss = 1055.877587556839 | accuracy = 0.5681818181818182


Epoch[1] Batch[390] Speed: 1.2442713415522761 samples/sec                   batch loss = 1070.073879957199 | accuracy = 0.566025641025641


Epoch[1] Batch[395] Speed: 1.246654984705508 samples/sec                   batch loss = 1084.1397132873535 | accuracy = 0.5658227848101266


Epoch[1] Batch[400] Speed: 1.2437381883644925 samples/sec                   batch loss = 1097.6599016189575 | accuracy = 0.56375


Epoch[1] Batch[405] Speed: 1.2424803578877972 samples/sec                   batch loss = 1110.6996686458588 | accuracy = 0.5654320987654321


Epoch[1] Batch[410] Speed: 1.237998242311589 samples/sec                   batch loss = 1123.1297204494476 | accuracy = 0.5664634146341463


Epoch[1] Batch[415] Speed: 1.2372467767802366 samples/sec                   batch loss = 1136.3648445606232 | accuracy = 0.5662650602409639


Epoch[1] Batch[420] Speed: 1.228749902042568 samples/sec                   batch loss = 1149.5991916656494 | accuracy = 0.5678571428571428


Epoch[1] Batch[425] Speed: 1.236040189907317 samples/sec                   batch loss = 1162.6480355262756 | accuracy = 0.5688235294117647


Epoch[1] Batch[430] Speed: 1.2414682805867185 samples/sec                   batch loss = 1175.5161244869232 | accuracy = 0.5697674418604651


Epoch[1] Batch[435] Speed: 1.2386207566600387 samples/sec                   batch loss = 1187.7404732704163 | accuracy = 0.5712643678160919


Epoch[1] Batch[440] Speed: 1.2352119745032912 samples/sec                   batch loss = 1201.4827172756195 | accuracy = 0.5698863636363637


Epoch[1] Batch[445] Speed: 1.2264544358342024 samples/sec                   batch loss = 1215.2405219078064 | accuracy = 0.5685393258426966


Epoch[1] Batch[450] Speed: 1.2331801378303806 samples/sec                   batch loss = 1228.0010824203491 | accuracy = 0.5683333333333334


Epoch[1] Batch[455] Speed: 1.2349405731605092 samples/sec                   batch loss = 1242.3414375782013 | accuracy = 0.5664835164835165


Epoch[1] Batch[460] Speed: 1.231185060355691 samples/sec                   batch loss = 1254.932861328125 | accuracy = 0.5673913043478261


Epoch[1] Batch[465] Speed: 1.2319698861883612 samples/sec                   batch loss = 1268.8180930614471 | accuracy = 0.567741935483871


Epoch[1] Batch[470] Speed: 1.2372941329693896 samples/sec                   batch loss = 1282.8170096874237 | accuracy = 0.5675531914893617


Epoch[1] Batch[475] Speed: 1.2402822398630762 samples/sec                   batch loss = 1297.1787748336792 | accuracy = 0.5668421052631579


Epoch[1] Batch[480] Speed: 1.2373268008157388 samples/sec                   batch loss = 1310.0387570858002 | accuracy = 0.5677083333333334


Epoch[1] Batch[485] Speed: 1.2366788762900671 samples/sec                   batch loss = 1323.6764166355133 | accuracy = 0.5675257731958763


Epoch[1] Batch[490] Speed: 1.243755338082183 samples/sec                   batch loss = 1336.726016998291 | accuracy = 0.5678571428571428


Epoch[1] Batch[495] Speed: 1.2398251512109775 samples/sec                   batch loss = 1349.5420157909393 | accuracy = 0.5696969696969697


Epoch[1] Batch[500] Speed: 1.2454547647848988 samples/sec                   batch loss = 1362.558641910553 | accuracy = 0.5705


Epoch[1] Batch[505] Speed: 1.2329205915884989 samples/sec                   batch loss = 1376.0803775787354 | accuracy = 0.5707920792079207


Epoch[1] Batch[510] Speed: 1.230406918645166 samples/sec                   batch loss = 1390.02152967453 | accuracy = 0.5700980392156862


Epoch[1] Batch[515] Speed: 1.2338058902754225 samples/sec                   batch loss = 1404.1575796604156 | accuracy = 0.5699029126213592


Epoch[1] Batch[520] Speed: 1.246260394128768 samples/sec                   batch loss = 1417.960827589035 | accuracy = 0.5701923076923077


Epoch[1] Batch[525] Speed: 1.252378533088509 samples/sec                   batch loss = 1431.0491960048676 | accuracy = 0.5714285714285714


Epoch[1] Batch[530] Speed: 1.251890440663194 samples/sec                   batch loss = 1443.5271270275116 | accuracy = 0.5726415094339623


Epoch[1] Batch[535] Speed: 1.2563369051272293 samples/sec                   batch loss = 1457.161981344223 | accuracy = 0.5719626168224299


Epoch[1] Batch[540] Speed: 1.2546846090911674 samples/sec                   batch loss = 1470.1510424613953 | accuracy = 0.5717592592592593


Epoch[1] Batch[545] Speed: 1.2460201135670659 samples/sec                   batch loss = 1483.2493615150452 | accuracy = 0.5729357798165138


Epoch[1] Batch[550] Speed: 1.2419248350900949 samples/sec                   batch loss = 1495.8802189826965 | accuracy = 0.5736363636363636


Epoch[1] Batch[555] Speed: 1.2369110988873988 samples/sec                   batch loss = 1509.821887254715 | accuracy = 0.572972972972973


Epoch[1] Batch[560] Speed: 1.2366421407748989 samples/sec                   batch loss = 1523.7202589511871 | accuracy = 0.5727678571428572


Epoch[1] Batch[565] Speed: 1.236677508924094 samples/sec                   batch loss = 1536.7339441776276 | accuracy = 0.5725663716814159


Epoch[1] Batch[570] Speed: 1.2335381900403717 samples/sec                   batch loss = 1549.5614839792252 | accuracy = 0.5732456140350877


Epoch[1] Batch[575] Speed: 1.2319785708836037 samples/sec                   batch loss = 1563.3240352869034 | accuracy = 0.5726086956521739


Epoch[1] Batch[580] Speed: 1.2435669015480826 samples/sec                   batch loss = 1575.3252457380295 | accuracy = 0.5741379310344827


Epoch[1] Batch[585] Speed: 1.2469266481821355 samples/sec                   batch loss = 1588.3953238725662 | accuracy = 0.5743589743589743


Epoch[1] Batch[590] Speed: 1.244967895943781 samples/sec                   batch loss = 1602.2829459905624 | accuracy = 0.5741525423728814


Epoch[1] Batch[595] Speed: 1.2377699033302125 samples/sec                   batch loss = 1614.5166002511978 | accuracy = 0.5760504201680672


Epoch[1] Batch[600] Speed: 1.2332367921613694 samples/sec                   batch loss = 1626.7893325090408 | accuracy = 0.5770833333333333


Epoch[1] Batch[605] Speed: 1.2381923966947028 samples/sec                   batch loss = 1638.9445884227753 | accuracy = 0.5776859504132231


Epoch[1] Batch[610] Speed: 1.2370676042806104 samples/sec                   batch loss = 1652.4712274074554 | accuracy = 0.5770491803278689


Epoch[1] Batch[615] Speed: 1.2505693718282407 samples/sec                   batch loss = 1665.543686389923 | accuracy = 0.5776422764227642


Epoch[1] Batch[620] Speed: 1.2504565892854906 samples/sec                   batch loss = 1680.1032013893127 | accuracy = 0.5766129032258065


Epoch[1] Batch[625] Speed: 1.2501382049224126 samples/sec                   batch loss = 1693.8930921554565 | accuracy = 0.5776


Epoch[1] Batch[630] Speed: 1.2473955738808231 samples/sec                   batch loss = 1707.5959649085999 | accuracy = 0.5765873015873015


Epoch[1] Batch[635] Speed: 1.2514293314043106 samples/sec                   batch loss = 1721.1302144527435 | accuracy = 0.5763779527559055


Epoch[1] Batch[640] Speed: 1.254395486656864 samples/sec                   batch loss = 1733.7397079467773 | accuracy = 0.576953125


Epoch[1] Batch[645] Speed: 1.2387451334452524 samples/sec                   batch loss = 1746.8067212104797 | accuracy = 0.5775193798449613


Epoch[1] Batch[650] Speed: 1.2391105432348146 samples/sec                   batch loss = 1758.433559179306 | accuracy = 0.5792307692307692


Epoch[1] Batch[655] Speed: 1.2448840169917024 samples/sec                   batch loss = 1771.383994102478 | accuracy = 0.5790076335877863


Epoch[1] Batch[660] Speed: 1.248109280503314 samples/sec                   batch loss = 1783.5831401348114 | accuracy = 0.5806818181818182


Epoch[1] Batch[665] Speed: 1.247559382303579 samples/sec                   batch loss = 1796.7169275283813 | accuracy = 0.5812030075187969


Epoch[1] Batch[670] Speed: 1.2488488102517876 samples/sec                   batch loss = 1811.898182630539 | accuracy = 0.5802238805970149


Epoch[1] Batch[675] Speed: 1.2355052401700095 samples/sec                   batch loss = 1824.2556902170181 | accuracy = 0.5807407407407408


Epoch[1] Batch[680] Speed: 1.241275026144717 samples/sec                   batch loss = 1838.1479295492172 | accuracy = 0.5801470588235295


Epoch[1] Batch[685] Speed: 1.2371060982712814 samples/sec                   batch loss = 1850.638892531395 | accuracy = 0.581021897810219


Epoch[1] Batch[690] Speed: 1.241763789621558 samples/sec                   batch loss = 1861.115183711052 | accuracy = 0.5826086956521739


Epoch[1] Batch[695] Speed: 1.239589268784002 samples/sec                   batch loss = 1872.5050716400146 | accuracy = 0.5838129496402877


Epoch[1] Batch[700] Speed: 1.2443499695721523 samples/sec                   batch loss = 1886.5184364318848 | accuracy = 0.5835714285714285


Epoch[1] Batch[705] Speed: 1.2462294745632292 samples/sec                   batch loss = 1898.756558895111 | accuracy = 0.5851063829787234


Epoch[1] Batch[710] Speed: 1.25082157498902 samples/sec                   batch loss = 1912.6528408527374 | accuracy = 0.5855633802816902


Epoch[1] Batch[715] Speed: 1.2499458738757747 samples/sec                   batch loss = 1925.5732967853546 | accuracy = 0.5856643356643356


Epoch[1] Batch[720] Speed: 1.2516745043594673 samples/sec                   batch loss = 1937.0051342248917 | accuracy = 0.5868055555555556


Epoch[1] Batch[725] Speed: 1.252412282813552 samples/sec                   batch loss = 1949.1120884418488 | accuracy = 0.5875862068965517


Epoch[1] Batch[730] Speed: 1.2564304265382968 samples/sec                   batch loss = 1961.359723687172 | accuracy = 0.5883561643835616


Epoch[1] Batch[735] Speed: 1.2503132636430099 samples/sec                   batch loss = 1974.4790774583817 | accuracy = 0.5880952380952381


Epoch[1] Batch[740] Speed: 1.2385213646429551 samples/sec                   batch loss = 1987.0507587194443 | accuracy = 0.5888513513513514


Epoch[1] Batch[745] Speed: 1.2455320629619229 samples/sec                   batch loss = 1999.4119666814804 | accuracy = 0.5902684563758389


Epoch[1] Batch[750] Speed: 1.2478891702436004 samples/sec                   batch loss = 2011.3106843233109 | accuracy = 0.591


Epoch[1] Batch[755] Speed: 1.2518659666296243 samples/sec                   batch loss = 2025.1634529829025 | accuracy = 0.5913907284768212


Epoch[1] Batch[760] Speed: 1.2494217114481436 samples/sec                   batch loss = 2038.796444773674 | accuracy = 0.5911184210526316


Epoch[1] Batch[765] Speed: 1.2527418355354907 samples/sec                   batch loss = 2054.000499844551 | accuracy = 0.5905228758169935


Epoch[1] Batch[770] Speed: 1.2498298519026192 samples/sec                   batch loss = 2065.760013818741 | accuracy = 0.5915584415584415


Epoch[1] Batch[775] Speed: 1.253288729368942 samples/sec                   batch loss = 2080.6802320480347 | accuracy = 0.5903225806451613


Epoch[1] Batch[780] Speed: 1.246426126930563 samples/sec                   batch loss = 2095.187212705612 | accuracy = 0.5897435897435898


Epoch[1] Batch[785] Speed: 1.2446193366265086 samples/sec                   batch loss = 2107.8100004196167 | accuracy = 0.5898089171974522


[Epoch 1] training: accuracy=0.5904187817258884
[Epoch 1] time cost: 652.1395258903503
[Epoch 1] validation: validation accuracy=0.6833333333333333


Epoch[2] Batch[5] Speed: 1.2315368923021384 samples/sec                   batch loss = 13.664393663406372 | accuracy = 0.55


Epoch[2] Batch[10] Speed: 1.2326287326268224 samples/sec                   batch loss = 25.929492235183716 | accuracy = 0.6


Epoch[2] Batch[15] Speed: 1.2282832873700031 samples/sec                   batch loss = 39.008296251297 | accuracy = 0.6


Epoch[2] Batch[20] Speed: 1.2282457900709938 samples/sec                   batch loss = 51.14298379421234 | accuracy = 0.625


Epoch[2] Batch[25] Speed: 1.2349296650473989 samples/sec                   batch loss = 62.955394983291626 | accuracy = 0.66


Epoch[2] Batch[30] Speed: 1.2305098860151935 samples/sec                   batch loss = 73.99101889133453 | accuracy = 0.675


Epoch[2] Batch[35] Speed: 1.2376608783509437 samples/sec                   batch loss = 85.52898418903351 | accuracy = 0.6857142857142857


Epoch[2] Batch[40] Speed: 1.2388988098468776 samples/sec                   batch loss = 98.34790527820587 | accuracy = 0.69375


Epoch[2] Batch[45] Speed: 1.2446866504420038 samples/sec                   batch loss = 113.05873358249664 | accuracy = 0.6722222222222223


Epoch[2] Batch[50] Speed: 1.2422989305792864 samples/sec                   batch loss = 123.54446804523468 | accuracy = 0.695


Epoch[2] Batch[55] Speed: 1.2391128311557864 samples/sec                   batch loss = 136.6345897912979 | accuracy = 0.6818181818181818


Epoch[2] Batch[60] Speed: 1.2374858758999907 samples/sec                   batch loss = 150.60611951351166 | accuracy = 0.6708333333333333


Epoch[2] Batch[65] Speed: 1.2417438456867627 samples/sec                   batch loss = 161.88404750823975 | accuracy = 0.6730769230769231


Epoch[2] Batch[70] Speed: 1.2449600433597 samples/sec                   batch loss = 174.0607945919037 | accuracy = 0.6642857142857143


Epoch[2] Batch[75] Speed: 1.2528473589769573 samples/sec                   batch loss = 185.9655145406723 | accuracy = 0.67


Epoch[2] Batch[80] Speed: 1.2521518680869728 samples/sec                   batch loss = 198.7836582660675 | accuracy = 0.671875


Epoch[2] Batch[85] Speed: 1.2467177940460907 samples/sec                   batch loss = 211.66595029830933 | accuracy = 0.6676470588235294


Epoch[2] Batch[90] Speed: 1.2470384240765033 samples/sec                   batch loss = 225.39545345306396 | accuracy = 0.6611111111111111


Epoch[2] Batch[95] Speed: 1.2436551203506414 samples/sec                   batch loss = 239.86748039722443 | accuracy = 0.6552631578947369


Epoch[2] Batch[100] Speed: 1.2375166369603554 samples/sec                   batch loss = 251.93265759944916 | accuracy = 0.655


Epoch[2] Batch[105] Speed: 1.2372893880625568 samples/sec                   batch loss = 265.80320370197296 | accuracy = 0.6452380952380953


Epoch[2] Batch[110] Speed: 1.2429239370273808 samples/sec                   batch loss = 278.43359100818634 | accuracy = 0.6477272727272727


Epoch[2] Batch[115] Speed: 1.2484888768566913 samples/sec                   batch loss = 289.66729164123535 | accuracy = 0.65


Epoch[2] Batch[120] Speed: 1.2494007764251527 samples/sec                   batch loss = 301.70473074913025 | accuracy = 0.65


Epoch[2] Batch[125] Speed: 1.2473231445758795 samples/sec                   batch loss = 314.19748067855835 | accuracy = 0.65


Epoch[2] Batch[130] Speed: 1.2390430076135506 samples/sec                   batch loss = 326.65878343582153 | accuracy = 0.6519230769230769


Epoch[2] Batch[135] Speed: 1.2485373763088004 samples/sec                   batch loss = 340.1927250623703 | accuracy = 0.65


Epoch[2] Batch[140] Speed: 1.2408430849608234 samples/sec                   batch loss = 351.34906005859375 | accuracy = 0.6535714285714286


Epoch[2] Batch[145] Speed: 1.23944851493787 samples/sec                   batch loss = 360.77632880210876 | accuracy = 0.6620689655172414


Epoch[2] Batch[150] Speed: 1.2335700249504835 samples/sec                   batch loss = 374.500182390213 | accuracy = 0.6566666666666666


Epoch[2] Batch[155] Speed: 1.2344789649023098 samples/sec                   batch loss = 386.2305040359497 | accuracy = 0.6596774193548387


Epoch[2] Batch[160] Speed: 1.2413913022631626 samples/sec                   batch loss = 398.5877151489258 | accuracy = 0.6609375


Epoch[2] Batch[165] Speed: 1.2535108423892354 samples/sec                   batch loss = 412.45518255233765 | accuracy = 0.656060606060606


Epoch[2] Batch[170] Speed: 1.2569419356791451 samples/sec                   batch loss = 422.9719524383545 | accuracy = 0.6617647058823529


Epoch[2] Batch[175] Speed: 1.2583098229853802 samples/sec                   batch loss = 435.3437783718109 | accuracy = 0.6614285714285715


Epoch[2] Batch[180] Speed: 1.2585869679374653 samples/sec                   batch loss = 450.19769763946533 | accuracy = 0.6569444444444444


Epoch[2] Batch[185] Speed: 1.2503369315426054 samples/sec                   batch loss = 460.9178421497345 | accuracy = 0.6608108108108108


Epoch[2] Batch[190] Speed: 1.2395904594213167 samples/sec                   batch loss = 473.8668911457062 | accuracy = 0.6605263157894737


Epoch[2] Batch[195] Speed: 1.23548167553224 samples/sec                   batch loss = 485.55266296863556 | accuracy = 0.6602564102564102


Epoch[2] Batch[200] Speed: 1.240903841508361 samples/sec                   batch loss = 497.4577851295471 | accuracy = 0.66375


Epoch[2] Batch[205] Speed: 1.2407731580520682 samples/sec                   batch loss = 510.4490580558777 | accuracy = 0.6646341463414634


Epoch[2] Batch[210] Speed: 1.2485780741587928 samples/sec                   batch loss = 522.4863723516464 | accuracy = 0.6654761904761904


Epoch[2] Batch[215] Speed: 1.2499798651302725 samples/sec                   batch loss = 534.049795627594 | accuracy = 0.6651162790697674


Epoch[2] Batch[220] Speed: 1.2499769781299686 samples/sec                   batch loss = 545.1492568254471 | accuracy = 0.6659090909090909


Epoch[2] Batch[225] Speed: 1.2532722519444444 samples/sec                   batch loss = 558.9844797849655 | accuracy = 0.6633333333333333


Epoch[2] Batch[230] Speed: 1.248115780088357 samples/sec                   batch loss = 570.7228856086731 | accuracy = 0.6641304347826087


Epoch[2] Batch[235] Speed: 1.2441215882015753 samples/sec                   batch loss = 582.8295170068741 | accuracy = 0.6659574468085107


Epoch[2] Batch[240] Speed: 1.2410836680241206 samples/sec                   batch loss = 597.2036665678024 | accuracy = 0.6645833333333333


Epoch[2] Batch[245] Speed: 1.2398125990887712 samples/sec                   batch loss = 611.7836121320724 | accuracy = 0.6632653061224489


Epoch[2] Batch[250] Speed: 1.2395719590075425 samples/sec                   batch loss = 622.8655380010605 | accuracy = 0.664


Epoch[2] Batch[255] Speed: 1.2450815385438574 samples/sec                   batch loss = 633.5538531541824 | accuracy = 0.6666666666666666


Epoch[2] Batch[260] Speed: 1.2383709811127421 samples/sec                   batch loss = 646.1359921693802 | accuracy = 0.6663461538461538


Epoch[2] Batch[265] Speed: 1.2468574239084527 samples/sec                   batch loss = 659.459282875061 | accuracy = 0.6669811320754717


Epoch[2] Batch[270] Speed: 1.2576931040443717 samples/sec                   batch loss = 670.9289977550507 | accuracy = 0.6666666666666666


Epoch[2] Batch[275] Speed: 1.254493409377122 samples/sec                   batch loss = 680.550389289856 | accuracy = 0.6681818181818182


Epoch[2] Batch[280] Speed: 1.2571486718582985 samples/sec                   batch loss = 693.9878270626068 | accuracy = 0.6678571428571428


Epoch[2] Batch[285] Speed: 1.245681231180818 samples/sec                   batch loss = 707.4227466583252 | accuracy = 0.6666666666666666


Epoch[2] Batch[290] Speed: 1.2388076970930075 samples/sec                   batch loss = 721.0494117736816 | accuracy = 0.6646551724137931


Epoch[2] Batch[295] Speed: 1.237437774874045 samples/sec                   batch loss = 733.1348965167999 | accuracy = 0.6644067796610169


Epoch[2] Batch[300] Speed: 1.250218787333942 samples/sec                   batch loss = 743.8888491392136 | accuracy = 0.6658333333333334


Epoch[2] Batch[305] Speed: 1.252873087688636 samples/sec                   batch loss = 755.054563164711 | accuracy = 0.6672131147540984


Epoch[2] Batch[310] Speed: 1.2535960752793545 samples/sec                   batch loss = 768.5989328622818 | accuracy = 0.6637096774193548


Epoch[2] Batch[315] Speed: 1.2531390441339476 samples/sec                   batch loss = 783.1275721788406 | accuracy = 0.6619047619047619


Epoch[2] Batch[320] Speed: 1.2532683198958137 samples/sec                   batch loss = 794.6429400444031 | accuracy = 0.66328125


Epoch[2] Batch[325] Speed: 1.2557238098845007 samples/sec                   batch loss = 807.0401196479797 | accuracy = 0.6638461538461539


Epoch[2] Batch[330] Speed: 1.2490628426791468 samples/sec                   batch loss = 818.1071472167969 | accuracy = 0.6651515151515152


Epoch[2] Batch[335] Speed: 1.2442860143370282 samples/sec                   batch loss = 831.1202726364136 | accuracy = 0.6634328358208955


Epoch[2] Batch[340] Speed: 1.238668949700219 samples/sec                   batch loss = 843.1309235095978 | accuracy = 0.663235294117647


Epoch[2] Batch[345] Speed: 1.2477055104858492 samples/sec                   batch loss = 853.110503911972 | accuracy = 0.6659420289855073


Epoch[2] Batch[350] Speed: 1.2469836458234465 samples/sec                   batch loss = 865.1676375865936 | accuracy = 0.6657142857142857


Epoch[2] Batch[355] Speed: 1.2573986355785345 samples/sec                   batch loss = 876.1319187879562 | accuracy = 0.6669014084507042


Epoch[2] Batch[360] Speed: 1.2529419522760503 samples/sec                   batch loss = 886.3666989803314 | accuracy = 0.66875


Epoch[2] Batch[365] Speed: 1.2437063796564072 samples/sec                   batch loss = 899.0691509246826 | accuracy = 0.6698630136986301


Epoch[2] Batch[370] Speed: 1.248800286601819 samples/sec                   batch loss = 909.6426055431366 | accuracy = 0.6722972972972973


Epoch[2] Batch[375] Speed: 1.2466969495367322 samples/sec                   batch loss = 922.0354561805725 | accuracy = 0.6713333333333333


Epoch[2] Batch[380] Speed: 1.2514498677218782 samples/sec                   batch loss = 937.557626247406 | accuracy = 0.6677631578947368


Epoch[2] Batch[385] Speed: 1.2414153684459615 samples/sec                   batch loss = 949.043997168541 | accuracy = 0.6675324675324675


Epoch[2] Batch[390] Speed: 1.247855756663748 samples/sec                   batch loss = 961.171155333519 | accuracy = 0.666025641025641


Epoch[2] Batch[395] Speed: 1.2535764987880582 samples/sec                   batch loss = 971.9623147249222 | accuracy = 0.6683544303797468


Epoch[2] Batch[400] Speed: 1.2555934632746446 samples/sec                   batch loss = 985.0649312734604 | accuracy = 0.666875


Epoch[2] Batch[405] Speed: 1.2560735393625784 samples/sec                   batch loss = 998.8821326494217 | accuracy = 0.6654320987654321


Epoch[2] Batch[410] Speed: 1.2575718688909943 samples/sec                   batch loss = 1014.2802877426147 | accuracy = 0.6634146341463415


Epoch[2] Batch[415] Speed: 1.254993203360888 samples/sec                   batch loss = 1026.2747530937195 | accuracy = 0.6632530120481928


Epoch[2] Batch[420] Speed: 1.2515562006176164 samples/sec                   batch loss = 1037.1440734863281 | accuracy = 0.6654761904761904


Epoch[2] Batch[425] Speed: 1.253388071204256 samples/sec                   batch loss = 1047.3185864686966 | accuracy = 0.6670588235294118


Epoch[2] Batch[430] Speed: 1.2452153492692424 samples/sec                   batch loss = 1059.588943719864 | accuracy = 0.666860465116279


Epoch[2] Batch[435] Speed: 1.2398155309485808 samples/sec                   batch loss = 1070.8940169811249 | accuracy = 0.667816091954023


Epoch[2] Batch[440] Speed: 1.254236910542594 samples/sec                   batch loss = 1083.1437932252884 | accuracy = 0.6681818181818182


Epoch[2] Batch[445] Speed: 1.2561087111144387 samples/sec                   batch loss = 1094.0882724523544 | accuracy = 0.6685393258426966


Epoch[2] Batch[450] Speed: 1.2507921071911383 samples/sec                   batch loss = 1108.9168157577515 | accuracy = 0.6672222222222223


Epoch[2] Batch[455] Speed: 1.257614100568254 samples/sec                   batch loss = 1118.0253400802612 | accuracy = 0.6692307692307692


Epoch[2] Batch[460] Speed: 1.2558995904099823 samples/sec                   batch loss = 1129.9758899211884 | accuracy = 0.6695652173913044


Epoch[2] Batch[465] Speed: 1.2571720340260093 samples/sec                   batch loss = 1142.623574256897 | accuracy = 0.6693548387096774


Epoch[2] Batch[470] Speed: 1.25376638844825 samples/sec                   batch loss = 1155.535206437111 | accuracy = 0.6696808510638298


Epoch[2] Batch[475] Speed: 1.2446089031751695 samples/sec                   batch loss = 1170.6878031492233 | accuracy = 0.6673684210526316


Epoch[2] Batch[480] Speed: 1.2462989068709431 samples/sec                   batch loss = 1185.3139659166336 | accuracy = 0.665625


Epoch[2] Batch[485] Speed: 1.2436166786440845 samples/sec                   batch loss = 1200.7887215614319 | accuracy = 0.6654639175257732


Epoch[2] Batch[490] Speed: 1.2450452260918698 samples/sec                   batch loss = 1211.178249001503 | accuracy = 0.6663265306122449


Epoch[2] Batch[495] Speed: 1.2289559297843529 samples/sec                   batch loss = 1225.3722339868546 | accuracy = 0.6651515151515152


Epoch[2] Batch[500] Speed: 1.2387844636557634 samples/sec                   batch loss = 1236.9501655101776 | accuracy = 0.666


Epoch[2] Batch[505] Speed: 1.2414204206257382 samples/sec                   batch loss = 1249.3169984817505 | accuracy = 0.6663366336633664


Epoch[2] Batch[510] Speed: 1.2393754490988327 samples/sec                   batch loss = 1260.473522901535 | accuracy = 0.6666666666666666


Epoch[2] Batch[515] Speed: 1.2341242700135275 samples/sec                   batch loss = 1275.42325258255 | accuracy = 0.6655339805825242


Epoch[2] Batch[520] Speed: 1.2250619459928205 samples/sec                   batch loss = 1286.2552012205124 | accuracy = 0.6663461538461538


Epoch[2] Batch[525] Speed: 1.233265257261907 samples/sec                   batch loss = 1295.6768206357956 | accuracy = 0.6671428571428571


Epoch[2] Batch[530] Speed: 1.2399775377527908 samples/sec                   batch loss = 1308.877673983574 | accuracy = 0.6674528301886793


Epoch[2] Batch[535] Speed: 1.2448324758417537 samples/sec                   batch loss = 1323.4346281290054 | accuracy = 0.6663551401869159


Epoch[2] Batch[540] Speed: 1.2488988251628919 samples/sec                   batch loss = 1338.6046417951584 | accuracy = 0.6643518518518519


Epoch[2] Batch[545] Speed: 1.2500729081833755 samples/sec                   batch loss = 1350.2907606363297 | accuracy = 0.6651376146788991


Epoch[2] Batch[550] Speed: 1.246862891141806 samples/sec                   batch loss = 1361.4936368465424 | accuracy = 0.6654545454545454


Epoch[2] Batch[555] Speed: 1.2502908079210049 samples/sec                   batch loss = 1371.5481402873993 | accuracy = 0.6657657657657657


Epoch[2] Batch[560] Speed: 1.2456556120268925 samples/sec                   batch loss = 1383.4208053350449 | accuracy = 0.665625


Epoch[2] Batch[565] Speed: 1.233942824251868 samples/sec                   batch loss = 1393.3125886917114 | accuracy = 0.6663716814159292


Epoch[2] Batch[570] Speed: 1.2340530105629075 samples/sec                   batch loss = 1406.2055995464325 | accuracy = 0.6666666666666666


Epoch[2] Batch[575] Speed: 1.2384233598021974 samples/sec                   batch loss = 1417.512740969658 | accuracy = 0.6678260869565218


Epoch[2] Batch[580] Speed: 1.2519450903370812 samples/sec                   batch loss = 1427.9865436553955 | accuracy = 0.6681034482758621


Epoch[2] Batch[585] Speed: 1.2476572612055872 samples/sec                   batch loss = 1440.222863793373 | accuracy = 0.6675213675213675


Epoch[2] Batch[590] Speed: 1.2489260654519057 samples/sec                   batch loss = 1449.6070337295532 | accuracy = 0.6686440677966101


Epoch[2] Batch[595] Speed: 1.2371151291989626 samples/sec                   batch loss = 1459.7424194812775 | accuracy = 0.6697478991596638


Epoch[2] Batch[600] Speed: 1.233459108713938 samples/sec                   batch loss = 1469.9641599655151 | accuracy = 0.6708333333333333


Epoch[2] Batch[605] Speed: 1.2324889211860774 samples/sec                   batch loss = 1483.1289142370224 | accuracy = 0.6706611570247933


Epoch[2] Batch[610] Speed: 1.2384865309218651 samples/sec                   batch loss = 1497.013930797577 | accuracy = 0.6704918032786885


Epoch[2] Batch[615] Speed: 1.2373495233238156 samples/sec                   batch loss = 1509.989940404892 | accuracy = 0.6711382113821138


Epoch[2] Batch[620] Speed: 1.248348881377876 samples/sec                   batch loss = 1521.8089544773102 | accuracy = 0.6709677419354839


Epoch[2] Batch[625] Speed: 1.2455698833884825 samples/sec                   batch loss = 1532.548740386963 | accuracy = 0.672


Epoch[2] Batch[630] Speed: 1.2590560135754465 samples/sec                   batch loss = 1543.3882848024368 | accuracy = 0.6726190476190477


Epoch[2] Batch[635] Speed: 1.2566606200125028 samples/sec                   batch loss = 1553.9303073883057 | accuracy = 0.6724409448818898


Epoch[2] Batch[640] Speed: 1.2489086798742934 samples/sec                   batch loss = 1565.2504291534424 | accuracy = 0.672265625


Epoch[2] Batch[645] Speed: 1.248313585549898 samples/sec                   batch loss = 1578.5974732637405 | accuracy = 0.6713178294573643


Epoch[2] Batch[650] Speed: 1.252091593683038 samples/sec                   batch loss = 1592.3668867349625 | accuracy = 0.6707692307692308


Epoch[2] Batch[655] Speed: 1.248467229850075 samples/sec                   batch loss = 1603.5256621837616 | accuracy = 0.6717557251908397


Epoch[2] Batch[660] Speed: 1.2411709839162528 samples/sec                   batch loss = 1617.665232539177 | accuracy = 0.6708333333333333


Epoch[2] Batch[665] Speed: 1.2318796087964339 samples/sec                   batch loss = 1628.5647231340408 | accuracy = 0.6714285714285714


Epoch[2] Batch[670] Speed: 1.2384468539521831 samples/sec                   batch loss = 1640.469537138939 | accuracy = 0.6712686567164179


Epoch[2] Batch[675] Speed: 1.2437474085840374 samples/sec                   batch loss = 1652.5365072488785 | accuracy = 0.6711111111111111


Epoch[2] Batch[680] Speed: 1.2462955739434054 samples/sec                   batch loss = 1665.7460391521454 | accuracy = 0.6713235294117647


Epoch[2] Batch[685] Speed: 1.249338626801137 samples/sec                   batch loss = 1675.5349569320679 | accuracy = 0.6722627737226278


Epoch[2] Batch[690] Speed: 1.2412469247591504 samples/sec                   batch loss = 1684.258256316185 | accuracy = 0.6728260869565217


Epoch[2] Batch[695] Speed: 1.246217162708243 samples/sec                   batch loss = 1696.147322177887 | accuracy = 0.6733812949640288


Epoch[2] Batch[700] Speed: 1.2479258344248298 samples/sec                   batch loss = 1708.8551697731018 | accuracy = 0.6725


Epoch[2] Batch[705] Speed: 1.2358728374897865 samples/sec                   batch loss = 1718.509373307228 | accuracy = 0.6734042553191489


Epoch[2] Batch[710] Speed: 1.235743575571409 samples/sec                   batch loss = 1726.5797996520996 | accuracy = 0.6742957746478874


Epoch[2] Batch[715] Speed: 1.2357058034263897 samples/sec                   batch loss = 1740.317934513092 | accuracy = 0.6737762237762238


Epoch[2] Batch[720] Speed: 1.237914477790468 samples/sec                   batch loss = 1750.2811081409454 | accuracy = 0.6746527777777778


Epoch[2] Batch[725] Speed: 1.2415170629623162 samples/sec                   batch loss = 1762.3253433704376 | accuracy = 0.6737931034482758


Epoch[2] Batch[730] Speed: 1.2412729139048446 samples/sec                   batch loss = 1772.0367549657822 | accuracy = 0.6756849315068493


Epoch[2] Batch[735] Speed: 1.2520694478017198 samples/sec                   batch loss = 1782.5570681095123 | accuracy = 0.676530612244898


Epoch[2] Batch[740] Speed: 1.2593990933824486 samples/sec                   batch loss = 1793.5515804290771 | accuracy = 0.6766891891891892


Epoch[2] Batch[745] Speed: 1.263969043742168 samples/sec                   batch loss = 1803.6500970125198 | accuracy = 0.6768456375838926


Epoch[2] Batch[750] Speed: 1.2502115205169313 samples/sec                   batch loss = 1817.3239954710007 | accuracy = 0.6766666666666666


Epoch[2] Batch[755] Speed: 1.2432615970425918 samples/sec                   batch loss = 1827.0032024383545 | accuracy = 0.6771523178807947


Epoch[2] Batch[760] Speed: 1.2547658725427975 samples/sec                   batch loss = 1841.7959411144257 | accuracy = 0.6759868421052632


Epoch[2] Batch[765] Speed: 1.2469089475360358 samples/sec                   batch loss = 1851.2227538824081 | accuracy = 0.6764705882352942


Epoch[2] Batch[770] Speed: 1.2528583988038071 samples/sec                   batch loss = 1862.0236102342606 | accuracy = 0.6756493506493506


Epoch[2] Batch[775] Speed: 1.260719282301749 samples/sec                   batch loss = 1872.1744621992111 | accuracy = 0.6764516129032258


Epoch[2] Batch[780] Speed: 1.256355721193715 samples/sec                   batch loss = 1885.6251035928726 | accuracy = 0.6762820512820513


Epoch[2] Batch[785] Speed: 1.2552525473958913 samples/sec                   batch loss = 1894.8939805030823 | accuracy = 0.6770700636942675


[Epoch 2] training: accuracy=0.6773477157360406
[Epoch 2] time cost: 648.4251916408539
[Epoch 2] validation: validation accuracy=0.7566666666666667


## Next steps

Now that you have completed training and prediction with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).