<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start the steps, make sure you have at least one Nvidia GPU on your machine and make sure that you have CUDA properly installed. GPUs from AMD and Intel are not supported. Additionally, you will need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[21:28:03] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data on the main memory and tries to use as many CPU cores as possible.  If there are multiple GPUs, MXNet will tell you which GPUs the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to `npx.cpu()` instead, because requesting a GPU that does not exist results in an error:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[21:28:03] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
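
The zero-based indexing described above can be sketched in plain Python. The helper `select_device_ids` is hypothetical and is not part of MXNet; it only computes which IDs would be used, given a GPU count such as the one returned by `npx.num_gpus()`.

```python
def select_device_ids(num_available, num_requested):
    """Return the zero-based GPU IDs to use, capped at what is available.

    Hypothetical helper for illustration only; it does not call MXNet.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, the IDs run from 0 to 7.
all_ids = select_device_ids(8, 8)

# Asking for two GPUs when only one is present yields just the first ID.
capped_ids = select_device_ids(1, 2)

# With no GPU at all, the list is empty.
no_ids = select_device_ids(0, 2)
```

Each resulting ID would then be passed to `npx.gpu(i)`. An empty list is the signal to fall back to `npx.cpu()`, as the earlier cells in this step do.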

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and the parameters to the GPU. To demonstrate this, you can reuse the LeafNetwork defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[21:28:03] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 3.9456928, -3.2526836]], ctx=gpu(0))
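
To see why a `(1, 3, 128, 128)` input fits this network, the spatial size can be traced through the three convolutional blocks with plain arithmetic. This is a sketch that assumes Gluon's defaults for `Conv2D` (stride 1, no padding) and `MaxPool2D` (no padding, sizes rounded down); the helper names `conv_out` and `conv_block_out` are hypothetical and do not call MXNet. Batch normalization does not change the shape, so it is omitted.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_block_out(size, kernel_size=2, pool_size=4, pool_stride=2):
    """Spatial size after one conv_block: convolution, then max pooling."""
    size = conv_out(size, kernel_size)  # Conv2D, assumed stride 1
    return conv_out(size, pool_size, stride=pool_stride)  # MaxPool2D


size = 128  # height and width of the input image
sizes = []
for _ in range(3):  # conv1, conv2, conv3
    size = conv_block_out(size)
    sizes.append(size)

# conv3 has 128 filters, so this many features reach dense1 after flattening.
flat_features = 128 * size * size
```

Under these assumptions the spatial size shrinks from 128 to 62, 29, and finally 13, so the flatten layer hands 128 × 13 × 13 = 21632 features to the first dense block.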

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, assuming there are *n* GPUs, you split each data batch into *n* parts and have each GPU run the forward and backward passes on its own part of the data.
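
The idea can be illustrated without any GPU. The following plain-Python sketch uses a toy one-parameter model with squared-error loss (an assumption made for illustration, not the network from this tutorial). It splits a batch across two simulated devices, sums the gradients of each part, and checks that normalizing by the full batch size gives the same gradient as processing the whole batch on one device. The helpers `split_batch` and `grad_sum` are hypothetical and, as a simplification, `split_batch` requires an even split.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous chunks of equal size."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by num_parts")
    step = len(batch) // num_parts
    return [batch[i * step:(i + 1) * step] for i in range(num_parts)]


def grad_sum(w, samples):
    """Sum of d/dw [0.5 * (w * x - t) ** 2] over (x, t) samples."""
    return sum((w * x - t) * x for x, t in samples)


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # (input, target)
w = 1.0
num_devices = 2

# Each simulated device handles one part of the batch.
parts = split_batch(batch, num_devices)
per_device = [grad_sum(w, part) for part in parts]

# Sum across devices, then normalize by the full batch size.
parallel_grad = sum(per_device) / len(batch)

# Reference: the same batch processed on a single device.
single_grad = grad_sum(w, batch) / len(batch)
```

Both paths yield the same gradient, which is why splitting the batch changes where the work runs but not the update that is applied.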

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations for the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.777311776633763 samples/sec                   batch loss = 13.168399572372437 | accuracy = 0.6


Epoch[1] Batch[10] Speed: 1.2509144635896272 samples/sec                   batch loss = 28.403834104537964 | accuracy = 0.5


Epoch[1] Batch[15] Speed: 1.257235059688109 samples/sec                   batch loss = 42.122119665145874 | accuracy = 0.5


Epoch[1] Batch[20] Speed: 1.2546311273437565 samples/sec                   batch loss = 56.32066893577576 | accuracy = 0.4875


Epoch[1] Batch[25] Speed: 1.252189157018931 samples/sec                   batch loss = 70.28226590156555 | accuracy = 0.49


Epoch[1] Batch[30] Speed: 1.255516414560091 samples/sec                   batch loss = 84.2173068523407 | accuracy = 0.49166666666666664


Epoch[1] Batch[35] Speed: 1.2573399280439401 samples/sec                   batch loss = 99.19894623756409 | accuracy = 0.4714285714285714


Epoch[1] Batch[40] Speed: 1.2523861055832173 samples/sec                   batch loss = 113.12689161300659 | accuracy = 0.4625


Epoch[1] Batch[45] Speed: 1.2545270855692807 samples/sec                   batch loss = 126.87460947036743 | accuracy = 0.45555555555555555


Epoch[1] Batch[50] Speed: 1.2539640209564833 samples/sec                   batch loss = 141.27507519721985 | accuracy = 0.46


Epoch[1] Batch[55] Speed: 1.2577281779811267 samples/sec                   batch loss = 155.62069177627563 | accuracy = 0.4590909090909091


Epoch[1] Batch[60] Speed: 1.2552892698578302 samples/sec                   batch loss = 169.10122108459473 | accuracy = 0.4791666666666667


Epoch[1] Batch[65] Speed: 1.2503417770498724 samples/sec                   batch loss = 183.66919112205505 | accuracy = 0.4807692307692308


Epoch[1] Batch[70] Speed: 1.2566151581174818 samples/sec                   batch loss = 197.6464605331421 | accuracy = 0.48928571428571427


Epoch[1] Batch[75] Speed: 1.2527574571225255 samples/sec                   batch loss = 211.47417092323303 | accuracy = 0.49666666666666665


Epoch[1] Batch[80] Speed: 1.2539169733343718 samples/sec                   batch loss = 225.05627608299255 | accuracy = 0.50625


Epoch[1] Batch[85] Speed: 1.2480515300206572 samples/sec                   batch loss = 238.61396431922913 | accuracy = 0.5088235294117647


Epoch[1] Batch[90] Speed: 1.2461765260563562 samples/sec                   batch loss = 252.28146886825562 | accuracy = 0.5166666666666667


Epoch[1] Batch[95] Speed: 1.2490276924419275 samples/sec                   batch loss = 265.9791696071625 | accuracy = 0.5184210526315789


Epoch[1] Batch[100] Speed: 1.2525110182148158 samples/sec                   batch loss = 280.1192219257355 | accuracy = 0.5125


Epoch[1] Batch[105] Speed: 1.2505394498487363 samples/sec                   batch loss = 294.3241641521454 | accuracy = 0.5071428571428571


Epoch[1] Batch[110] Speed: 1.2527286463551568 samples/sec                   batch loss = 308.3444378376007 | accuracy = 0.509090909090909


Epoch[1] Batch[115] Speed: 1.2523601164156641 samples/sec                   batch loss = 322.33045053482056 | accuracy = 0.508695652173913


Epoch[1] Batch[120] Speed: 1.257706869388482 samples/sec                   batch loss = 336.2550194263458 | accuracy = 0.50625


Epoch[1] Batch[125] Speed: 1.2565282908079964 samples/sec                   batch loss = 350.27051973342896 | accuracy = 0.504


Epoch[1] Batch[130] Speed: 1.2527758854835558 samples/sec                   batch loss = 363.6388394832611 | accuracy = 0.5115384615384615


Epoch[1] Batch[135] Speed: 1.2536521854529499 samples/sec                   batch loss = 377.1932668685913 | accuracy = 0.5148148148148148


Epoch[1] Batch[140] Speed: 1.2474503883315036 samples/sec                   batch loss = 390.96305751800537 | accuracy = 0.5178571428571429


Epoch[1] Batch[145] Speed: 1.2516881382719705 samples/sec                   batch loss = 404.70260095596313 | accuracy = 0.5206896551724138


Epoch[1] Batch[150] Speed: 1.2523815246804142 samples/sec                   batch loss = 418.2294239997864 | accuracy = 0.5233333333333333


Epoch[1] Batch[155] Speed: 1.2549612857244588 samples/sec                   batch loss = 431.7369749546051 | accuracy = 0.5225806451612903


Epoch[1] Batch[160] Speed: 1.2569748017145344 samples/sec                   batch loss = 445.3668625354767 | accuracy = 0.5296875


Epoch[1] Batch[165] Speed: 1.257372437959583 samples/sec                   batch loss = 458.882404088974 | accuracy = 0.5333333333333333


Epoch[1] Batch[170] Speed: 1.2548204920011738 samples/sec                   batch loss = 472.29491686820984 | accuracy = 0.5338235294117647


Epoch[1] Batch[175] Speed: 1.2569580388800128 samples/sec                   batch loss = 485.51265954971313 | accuracy = 0.54


Epoch[1] Batch[180] Speed: 1.2622201758523846 samples/sec                   batch loss = 499.42882776260376 | accuracy = 0.5388888888888889


Epoch[1] Batch[185] Speed: 1.2501470545060676 samples/sec                   batch loss = 513.1596992015839 | accuracy = 0.5364864864864864


Epoch[1] Batch[190] Speed: 1.255904385113701 samples/sec                   batch loss = 526.8384144306183 | accuracy = 0.5394736842105263


Epoch[1] Batch[195] Speed: 1.2538947628351782 samples/sec                   batch loss = 539.7628173828125 | accuracy = 0.5448717948717948


Epoch[1] Batch[200] Speed: 1.2520347821933797 samples/sec                   batch loss = 553.1697010993958 | accuracy = 0.5475


Epoch[1] Batch[205] Speed: 1.2572036873803238 samples/sec                   batch loss = 565.9766366481781 | accuracy = 0.5536585365853659


Epoch[1] Batch[210] Speed: 1.2565521945902967 samples/sec                   batch loss = 580.075216293335 | accuracy = 0.5547619047619048


Epoch[1] Batch[215] Speed: 1.2549739587092927 samples/sec                   batch loss = 593.6630268096924 | accuracy = 0.5534883720930233


Epoch[1] Batch[220] Speed: 1.2566659852978639 samples/sec                   batch loss = 607.1716816425323 | accuracy = 0.553409090909091


Epoch[1] Batch[225] Speed: 1.2593017266705668 samples/sec                   batch loss = 621.0578262805939 | accuracy = 0.5522222222222222


Epoch[1] Batch[230] Speed: 1.2498131859888337 samples/sec                   batch loss = 634.6848409175873 | accuracy = 0.5521739130434783


Epoch[1] Batch[235] Speed: 1.2559952093574145 samples/sec                   batch loss = 648.1981358528137 | accuracy = 0.5531914893617021


Epoch[1] Batch[240] Speed: 1.251873252692254 samples/sec                   batch loss = 662.308441400528 | accuracy = 0.5510416666666667


Epoch[1] Batch[245] Speed: 1.255435429642307 samples/sec                   batch loss = 676.775032043457 | accuracy = 0.5469387755102041


Epoch[1] Batch[250] Speed: 1.2564473634940725 samples/sec                   batch loss = 690.2798278331757 | accuracy = 0.547


Epoch[1] Batch[255] Speed: 1.2548035988772805 samples/sec                   batch loss = 704.3925666809082 | accuracy = 0.5441176470588235


Epoch[1] Batch[260] Speed: 1.2547228935629962 samples/sec                   batch loss = 717.7986662387848 | accuracy = 0.5442307692307692


Epoch[1] Batch[265] Speed: 1.2550776990593284 samples/sec                   batch loss = 731.4611928462982 | accuracy = 0.5443396226415095


Epoch[1] Batch[270] Speed: 1.2548636653843692 samples/sec                   batch loss = 744.8938114643097 | accuracy = 0.5453703703703704


Epoch[1] Batch[275] Speed: 1.2550957262808888 samples/sec                   batch loss = 758.43967461586 | accuracy = 0.5454545454545454


Epoch[1] Batch[280] Speed: 1.25591350455318 samples/sec                   batch loss = 771.8884336948395 | accuracy = 0.5464285714285714


Epoch[1] Batch[285] Speed: 1.254383388071556 samples/sec                   batch loss = 784.6753087043762 | accuracy = 0.5508771929824562


Epoch[1] Batch[290] Speed: 1.25776655412611 samples/sec                   batch loss = 797.9473721981049 | accuracy = 0.5525862068965517


Epoch[1] Batch[295] Speed: 1.2622487600937409 samples/sec                   batch loss = 810.9908263683319 | accuracy = 0.5542372881355933


Epoch[1] Batch[300] Speed: 1.2653013172554575 samples/sec                   batch loss = 824.2154200077057 | accuracy = 0.5558333333333333


Epoch[1] Batch[305] Speed: 1.256444070161354 samples/sec                   batch loss = 837.5409719944 | accuracy = 0.5549180327868852


Epoch[1] Batch[310] Speed: 1.2549781830944489 samples/sec                   batch loss = 851.022629737854 | accuracy = 0.5556451612903226


Epoch[1] Batch[315] Speed: 1.2594018349932345 samples/sec                   batch loss = 864.2437126636505 | accuracy = 0.5587301587301587


Epoch[1] Batch[320] Speed: 1.2564757809678686 samples/sec                   batch loss = 877.2673239707947 | accuracy = 0.559375


Epoch[1] Batch[325] Speed: 1.266163412748336 samples/sec                   batch loss = 890.912232875824 | accuracy = 0.5607692307692308


Epoch[1] Batch[330] Speed: 1.2565042938311355 samples/sec                   batch loss = 903.5475056171417 | accuracy = 0.5613636363636364


Epoch[1] Batch[335] Speed: 1.2577212950381857 samples/sec                   batch loss = 917.7267305850983 | accuracy = 0.5604477611940298


Epoch[1] Batch[340] Speed: 1.2604314441028728 samples/sec                   batch loss = 930.921097278595 | accuracy = 0.5610294117647059


Epoch[1] Batch[345] Speed: 1.2584815138669345 samples/sec                   batch loss = 944.2484002113342 | accuracy = 0.5630434782608695


Epoch[1] Batch[350] Speed: 1.2625322984971115 samples/sec                   batch loss = 956.9226500988007 | accuracy = 0.5657142857142857


Epoch[1] Batch[355] Speed: 1.2577009295112571 samples/sec                   batch loss = 970.3902468681335 | accuracy = 0.5683098591549296


Epoch[1] Batch[360] Speed: 1.2611540828257186 samples/sec                   batch loss = 985.3641285896301 | accuracy = 0.5673611111111111


Epoch[1] Batch[365] Speed: 1.2609988166591304 samples/sec                   batch loss = 998.307576417923 | accuracy = 0.5691780821917808


Epoch[1] Batch[370] Speed: 1.256948339241771 samples/sec                   batch loss = 1011.0694303512573 | accuracy = 0.5709459459459459


Epoch[1] Batch[375] Speed: 1.2581592190979167 samples/sec                   batch loss = 1024.2311012744904 | accuracy = 0.572


Epoch[1] Batch[380] Speed: 1.2649232558767245 samples/sec                   batch loss = 1037.6608004570007 | accuracy = 0.5717105263157894


Epoch[1] Batch[385] Speed: 1.256878280281222 samples/sec                   batch loss = 1051.1929314136505 | accuracy = 0.5707792207792208


Epoch[1] Batch[390] Speed: 1.2568629323852107 samples/sec                   batch loss = 1064.152624130249 | accuracy = 0.5711538461538461


Epoch[1] Batch[395] Speed: 1.2578490660532822 samples/sec                   batch loss = 1076.5965638160706 | accuracy = 0.5727848101265823


Epoch[1] Batch[400] Speed: 1.2463414958475751 samples/sec                   batch loss = 1090.333404302597 | accuracy = 0.575


Epoch[1] Batch[405] Speed: 1.2503731805746814 samples/sec                   batch loss = 1102.575386762619 | accuracy = 0.578395061728395


Epoch[1] Batch[410] Speed: 1.2544986623688597 samples/sec                   batch loss = 1116.1399810314178 | accuracy = 0.5774390243902439


Epoch[1] Batch[415] Speed: 1.2529222090727368 samples/sec                   batch loss = 1129.3328263759613 | accuracy = 0.5789156626506025


Epoch[1] Batch[420] Speed: 1.2520421636580206 samples/sec                   batch loss = 1142.1544961929321 | accuracy = 0.5797619047619048


Epoch[1] Batch[425] Speed: 1.2524009703761754 samples/sec                   batch loss = 1156.175153017044 | accuracy = 0.5794117647058824


Epoch[1] Batch[430] Speed: 1.2507877244410779 samples/sec                   batch loss = 1167.965791940689 | accuracy = 0.5813953488372093


Epoch[1] Batch[435] Speed: 1.2583458751282743 samples/sec                   batch loss = 1181.7599573135376 | accuracy = 0.5810344827586207


Epoch[1] Batch[440] Speed: 1.249032341830669 samples/sec                   batch loss = 1195.8103439807892 | accuracy = 0.5801136363636363


Epoch[1] Batch[445] Speed: 1.2516216524102601 samples/sec                   batch loss = 1208.7485382556915 | accuracy = 0.5803370786516854


Epoch[1] Batch[450] Speed: 1.249905552401301 samples/sec                   batch loss = 1222.5280482769012 | accuracy = 0.58


Epoch[1] Batch[455] Speed: 1.250250091511435 samples/sec                   batch loss = 1236.3952300548553 | accuracy = 0.5796703296703297


Epoch[1] Batch[460] Speed: 1.2554158896026457 samples/sec                   batch loss = 1250.3832142353058 | accuracy = 0.5798913043478261


Epoch[1] Batch[465] Speed: 1.247809073464046 samples/sec                   batch loss = 1263.7848000526428 | accuracy = 0.5801075268817204


Epoch[1] Batch[470] Speed: 1.2503633027380718 samples/sec                   batch loss = 1276.2208015918732 | accuracy = 0.5803191489361702


Epoch[1] Batch[475] Speed: 1.2546446380916185 samples/sec                   batch loss = 1289.5982979536057 | accuracy = 0.58


Epoch[1] Batch[480] Speed: 1.2501035530217177 samples/sec                   batch loss = 1303.3593097925186 | accuracy = 0.58125


Epoch[1] Batch[485] Speed: 1.2466764762749605 samples/sec                   batch loss = 1316.4200092554092 | accuracy = 0.5814432989690722


Epoch[1] Batch[490] Speed: 1.2533302057132356 samples/sec                   batch loss = 1329.8617986440659 | accuracy = 0.5816326530612245


Epoch[1] Batch[495] Speed: 1.2537030542748466 samples/sec                   batch loss = 1343.8482221364975 | accuracy = 0.5818181818181818


Epoch[1] Batch[500] Speed: 1.2457512499119179 samples/sec                   batch loss = 1356.564616560936 | accuracy = 0.582


Epoch[1] Batch[505] Speed: 1.2503065547950953 samples/sec                   batch loss = 1368.511054635048 | accuracy = 0.5826732673267326


Epoch[1] Batch[510] Speed: 1.2511888266256594 samples/sec                   batch loss = 1380.6292071342468 | accuracy = 0.5843137254901961


Epoch[1] Batch[515] Speed: 1.2535743444746865 samples/sec                   batch loss = 1394.7755479812622 | accuracy = 0.5844660194174758


Epoch[1] Batch[520] Speed: 1.247060763121193 samples/sec                   batch loss = 1407.8454024791718 | accuracy = 0.5846153846153846


Epoch[1] Batch[525] Speed: 1.2514816069929895 samples/sec                   batch loss = 1420.8600800037384 | accuracy = 0.5842857142857143


Epoch[1] Batch[530] Speed: 1.2573827095577768 samples/sec                   batch loss = 1434.371388912201 | accuracy = 0.5844339622641509


Epoch[1] Batch[535] Speed: 1.2487784428944848 samples/sec                   batch loss = 1447.990067243576 | accuracy = 0.5845794392523365


Epoch[1] Batch[540] Speed: 1.2547125715390848 samples/sec                   batch loss = 1460.8261023759842 | accuracy = 0.5847222222222223


Epoch[1] Batch[545] Speed: 1.2541286216917429 samples/sec                   batch loss = 1474.090680718422 | accuracy = 0.5848623853211009


Epoch[1] Batch[550] Speed: 1.2455162511906008 samples/sec                   batch loss = 1487.204502940178 | accuracy = 0.585


Epoch[1] Batch[555] Speed: 1.2493860758060846 samples/sec                   batch loss = 1499.226642370224 | accuracy = 0.586036036036036


Epoch[1] Batch[560] Speed: 1.251859427927679 samples/sec                   batch loss = 1511.2329564094543 | accuracy = 0.5879464285714285


Epoch[1] Batch[565] Speed: 1.2516851499917598 samples/sec                   batch loss = 1525.7885088920593 | accuracy = 0.5876106194690266


Epoch[1] Batch[570] Speed: 1.2454677087868669 samples/sec                   batch loss = 1539.1065595149994 | accuracy = 0.5864035087719298


Epoch[1] Batch[575] Speed: 1.2458919586384345 samples/sec                   batch loss = 1550.4599964618683 | accuracy = 0.5882608695652174


Epoch[1] Batch[580] Speed: 1.2418265666487442 samples/sec                   batch loss = 1563.2723352909088 | accuracy = 0.5883620689655172


Epoch[1] Batch[585] Speed: 1.2429647301722562 samples/sec                   batch loss = 1577.430334329605 | accuracy = 0.5876068376068376


Epoch[1] Batch[590] Speed: 1.2559469749712162 samples/sec                   batch loss = 1590.4067137241364 | accuracy = 0.5877118644067797


Epoch[1] Batch[595] Speed: 1.255764977594573 samples/sec                   batch loss = 1604.3063914775848 | accuracy = 0.5865546218487395


Epoch[1] Batch[600] Speed: 1.2495009917604898 samples/sec                   batch loss = 1616.3133790493011 | accuracy = 0.5879166666666666


Epoch[1] Batch[605] Speed: 1.2489731112065325 samples/sec                   batch loss = 1629.785187959671 | accuracy = 0.5867768595041323


Epoch[1] Batch[610] Speed: 1.25057002434842 samples/sec                   batch loss = 1642.0721637010574 | accuracy = 0.5885245901639344


Epoch[1] Batch[615] Speed: 1.252732949177793 samples/sec                   batch loss = 1653.6514383554459 | accuracy = 0.5898373983739837


Epoch[1] Batch[620] Speed: 1.2478800741473846 samples/sec                   batch loss = 1665.6983896493912 | accuracy = 0.5911290322580646


Epoch[1] Batch[625] Speed: 1.247269732193218 samples/sec                   batch loss = 1678.8075073957443 | accuracy = 0.5912


Epoch[1] Batch[630] Speed: 1.2518574663304147 samples/sec                   batch loss = 1692.2255996465683 | accuracy = 0.5908730158730159


Epoch[1] Batch[635] Speed: 1.25244893277451 samples/sec                   batch loss = 1705.9621440172195 | accuracy = 0.5905511811023622


Epoch[1] Batch[640] Speed: 1.2451918748658024 samples/sec                   batch loss = 1718.510390162468 | accuracy = 0.59140625


Epoch[1] Batch[645] Speed: 1.2415505054496683 samples/sec                   batch loss = 1731.151905655861 | accuracy = 0.5918604651162791


Epoch[1] Batch[650] Speed: 1.2503124250330833 samples/sec                   batch loss = 1744.5066012144089 | accuracy = 0.5926923076923077


Epoch[1] Batch[655] Speed: 1.2510054070595555 samples/sec                   batch loss = 1758.090313076973 | accuracy = 0.5916030534351145


Epoch[1] Batch[660] Speed: 1.2522990741600515 samples/sec                   batch loss = 1771.3199416399002 | accuracy = 0.5920454545454545


Epoch[1] Batch[665] Speed: 1.250068809899446 samples/sec                   batch loss = 1784.7157024145126 | accuracy = 0.5917293233082707


Epoch[1] Batch[670] Speed: 1.2531507443180387 samples/sec                   batch loss = 1797.6750048398972 | accuracy = 0.5925373134328358


Epoch[1] Batch[675] Speed: 1.250674809279579 samples/sec                   batch loss = 1810.35225045681 | accuracy = 0.5933333333333334


Epoch[1] Batch[680] Speed: 1.250079614524133 samples/sec                   batch loss = 1823.591420531273 | accuracy = 0.5941176470588235


Epoch[1] Batch[685] Speed: 1.2528845022191235 samples/sec                   batch loss = 1838.0871909856796 | accuracy = 0.5937956204379562


Epoch[1] Batch[690] Speed: 1.2501331746886053 samples/sec                   batch loss = 1851.448715686798 | accuracy = 0.5931159420289855


Epoch[1] Batch[695] Speed: 1.2500933068599411 samples/sec                   batch loss = 1865.102334022522 | accuracy = 0.5935251798561151


Epoch[1] Batch[700] Speed: 1.2533156933618153 samples/sec                   batch loss = 1879.1628568172455 | accuracy = 0.5928571428571429


Epoch[1] Batch[705] Speed: 1.249191743810356 samples/sec                   batch loss = 1892.6813116073608 | accuracy = 0.5921985815602837


Epoch[1] Batch[710] Speed: 1.2517197028530143 samples/sec                   batch loss = 1905.4697592258453 | accuracy = 0.5922535211267606


Epoch[1] Batch[715] Speed: 1.2541336841304234 samples/sec                   batch loss = 1918.5508453845978 | accuracy = 0.5926573426573427


Epoch[1] Batch[720] Speed: 1.2546048572781026 samples/sec                   batch loss = 1929.7283980846405 | accuracy = 0.5947916666666667


Epoch[1] Batch[725] Speed: 1.2529417651337613 samples/sec                   batch loss = 1942.0551936626434 | accuracy = 0.5962068965517241


Epoch[1] Batch[730] Speed: 1.2483977415282195 samples/sec                   batch loss = 1953.6178323030472 | accuracy = 0.5972602739726027


Epoch[1] Batch[735] Speed: 1.2483995994040917 samples/sec                   batch loss = 1968.6754795312881 | accuracy = 0.5969387755102041


Epoch[1] Batch[740] Speed: 1.2560353604925554 samples/sec                   batch loss = 1982.0004049539566 | accuracy = 0.5976351351351351


Epoch[1] Batch[745] Speed: 1.2505524065425924 samples/sec                   batch loss = 1994.1253570318222 | accuracy = 0.5986577181208054


Epoch[1] Batch[750] Speed: 1.252774950020695 samples/sec                   batch loss = 2009.3467475175858 | accuracy = 0.5976666666666667


Epoch[1] Batch[755] Speed: 1.2486419138024694 samples/sec                   batch loss = 2020.9080828428268 | accuracy = 0.5986754966887418


Epoch[1] Batch[760] Speed: 1.2524213516092777 samples/sec                   batch loss = 2034.5814863443375 | accuracy = 0.5990131578947369


Epoch[1] Batch[765] Speed: 1.2510722939263932 samples/sec                   batch loss = 2047.5682762861252 | accuracy = 0.5993464052287582


Epoch[1] Batch[770] Speed: 1.2523886297684697 samples/sec                   batch loss = 2060.255677342415 | accuracy = 0.6


Epoch[1] Batch[775] Speed: 1.25057067686928 samples/sec                   batch loss = 2071.7348380088806 | accuracy = 0.6009677419354839


Epoch[1] Batch[780] Speed: 1.251180708752641 samples/sec                   batch loss = 2085.497878551483 | accuracy = 0.6003205128205128


Epoch[1] Batch[785] Speed: 1.252752592874015 samples/sec                   batch loss = 2098.374334335327 | accuracy = 0.6006369426751592


[Epoch 1] training: accuracy=0.5999365482233503
[Epoch 1] time cost: 647.0654435157776
[Epoch 1] validation: validation accuracy=0.6711111111111111


Epoch[2] Batch[5] Speed: 1.254619118034381 samples/sec                   batch loss = 14.874561786651611 | accuracy = 0.45


Epoch[2] Batch[10] Speed: 1.2539798604921455 samples/sec                   batch loss = 28.953179359436035 | accuracy = 0.5


Epoch[2] Batch[15] Speed: 1.2584188351303531 samples/sec                   batch loss = 43.706462144851685 | accuracy = 0.48333333333333334


Epoch[2] Batch[20] Speed: 1.258562136964035 samples/sec                   batch loss = 55.6160089969635 | accuracy = 0.5375


Epoch[2] Batch[25] Speed: 1.2489518193093638 samples/sec                   batch loss = 67.80717587471008 | accuracy = 0.57


Epoch[2] Batch[30] Speed: 1.254341560652357 samples/sec                   batch loss = 80.73383712768555 | accuracy = 0.5833333333333334


Epoch[2] Batch[35] Speed: 1.2578087045809228 samples/sec                   batch loss = 91.78354549407959 | accuracy = 0.6214285714285714


Epoch[2] Batch[40] Speed: 1.2557901683008121 samples/sec                   batch loss = 104.69293427467346 | accuracy = 0.6125


Epoch[2] Batch[45] Speed: 1.252792536956232 samples/sec                   batch loss = 117.54995846748352 | accuracy = 0.6111111111111112


Epoch[2] Batch[50] Speed: 1.2521138337952793 samples/sec                   batch loss = 128.70347797870636 | accuracy = 0.63


Epoch[2] Batch[55] Speed: 1.2483460019066106 samples/sec                   batch loss = 140.90209591388702 | accuracy = 0.6363636363636364


Epoch[2] Batch[60] Speed: 1.2467873734324975 samples/sec                   batch loss = 155.07283771038055 | accuracy = 0.6375


Epoch[2] Batch[65] Speed: 1.2459105556606365 samples/sec                   batch loss = 167.2412155866623 | accuracy = 0.6384615384615384


Epoch[2] Batch[70] Speed: 1.250458080492181 samples/sec                   batch loss = 181.76237642765045 | accuracy = 0.6285714285714286


Epoch[2] Batch[75] Speed: 1.250819523388503 samples/sec                   batch loss = 194.61679863929749 | accuracy = 0.63


Epoch[2] Batch[80] Speed: 1.246037326273165 samples/sec                   batch loss = 208.25501489639282 | accuracy = 0.61875


Epoch[2] Batch[85] Speed: 1.2575287917317928 samples/sec                   batch loss = 221.60600447654724 | accuracy = 0.611764705882353


Epoch[2] Batch[90] Speed: 1.2562803662387891 samples/sec                   batch loss = 233.53740739822388 | accuracy = 0.6222222222222222


Epoch[2] Batch[95] Speed: 1.2551054912421293 samples/sec                   batch loss = 244.7654948234558 | accuracy = 0.6263157894736842


Epoch[2] Batch[100] Speed: 1.2560973318027067 samples/sec                   batch loss = 256.9420325756073 | accuracy = 0.6275


Epoch[2] Batch[105] Speed: 1.2484414959733685 samples/sec                   batch loss = 270.9559106826782 | accuracy = 0.6214285714285714


Epoch[2] Batch[110] Speed: 1.2546247473695247 samples/sec                   batch loss = 282.8140540122986 | accuracy = 0.625


Epoch[2] Batch[115] Speed: 1.2524321969394974 samples/sec                   batch loss = 294.588472366333 | accuracy = 0.6282608695652174


Epoch[2] Batch[120] Speed: 1.2528366935663031 samples/sec                   batch loss = 307.19733238220215 | accuracy = 0.6291666666666667


Epoch[2] Batch[125] Speed: 1.2513691265396334 samples/sec                   batch loss = 322.06237173080444 | accuracy = 0.622


Epoch[2] Batch[130] Speed: 1.2470991399460327 samples/sec                   batch loss = 335.5516582727432 | accuracy = 0.6211538461538462


Epoch[2] Batch[135] Speed: 1.2535127155150032 samples/sec                   batch loss = 346.71857810020447 | accuracy = 0.6240740740740741


Epoch[2] Batch[140] Speed: 1.2529900496977389 samples/sec                   batch loss = 357.7306959629059 | accuracy = 0.6285714285714286


Epoch[2] Batch[145] Speed: 1.2554662440176798 samples/sec                   batch loss = 369.36275708675385 | accuracy = 0.6293103448275862


Epoch[2] Batch[150] Speed: 1.2567837507301867 samples/sec                   batch loss = 383.19632971286774 | accuracy = 0.63


Epoch[2] Batch[155] Speed: 1.251068282378516 samples/sec                   batch loss = 397.2142206430435 | accuracy = 0.6290322580645161


Epoch[2] Batch[160] Speed: 1.2493876575016545 samples/sec                   batch loss = 409.7553013563156 | accuracy = 0.63125


Epoch[2] Batch[165] Speed: 1.246068328852041 samples/sec                   batch loss = 423.1991573572159 | accuracy = 0.6303030303030303


Epoch[2] Batch[170] Speed: 1.2519837683741195 samples/sec                   batch loss = 436.52091658115387 | accuracy = 0.6323529411764706


Epoch[2] Batch[175] Speed: 1.256240199438387 samples/sec                   batch loss = 447.50117778778076 | accuracy = 0.6357142857142857


Epoch[2] Batch[180] Speed: 1.2503437338999914 samples/sec                   batch loss = 459.6153407096863 | accuracy = 0.6361111111111111


Epoch[2] Batch[185] Speed: 1.2589813735890014 samples/sec                   batch loss = 470.8785116672516 | accuracy = 0.6391891891891892


Epoch[2] Batch[190] Speed: 1.2500718836098739 samples/sec                   batch loss = 483.3240475654602 | accuracy = 0.6381578947368421


Epoch[2] Batch[195] Speed: 1.254873520594822 samples/sec                   batch loss = 497.0109453201294 | accuracy = 0.6358974358974359


Epoch[2] Batch[200] Speed: 1.2544868431993137 samples/sec                   batch loss = 507.2700265645981 | accuracy = 0.64


Epoch[2] Batch[205] Speed: 1.2465273474637801 samples/sec                   batch loss = 517.1544996500015 | accuracy = 0.6451219512195122


Epoch[2] Batch[210] Speed: 1.2500701138959631 samples/sec                   batch loss = 531.3068424463272 | accuracy = 0.6440476190476191


Epoch[2] Batch[215] Speed: 1.2479002156815682 samples/sec                   batch loss = 541.2287693023682 | accuracy = 0.6465116279069767


Epoch[2] Batch[220] Speed: 1.2531736773126423 samples/sec                   batch loss = 552.2849935293198 | accuracy = 0.6477272727272727


Epoch[2] Batch[225] Speed: 1.2507269286084688 samples/sec                   batch loss = 563.6290014982224 | accuracy = 0.6511111111111111


Epoch[2] Batch[230] Speed: 1.2495461264792225 samples/sec                   batch loss = 575.7810705900192 | accuracy = 0.6489130434782608


Epoch[2] Batch[235] Speed: 1.2543459683430689 samples/sec                   batch loss = 588.9701812267303 | accuracy = 0.6446808510638298


Epoch[2] Batch[240] Speed: 1.2472526708683307 samples/sec                   batch loss = 601.2741665840149 | accuracy = 0.6447916666666667


Epoch[2] Batch[245] Speed: 1.2533048327251406 samples/sec                   batch loss = 615.4192321300507 | accuracy = 0.6428571428571429


Epoch[2] Batch[250] Speed: 1.2496421765675407 samples/sec                   batch loss = 627.2531348466873 | accuracy = 0.644


Epoch[2] Batch[255] Speed: 1.2534789064562784 samples/sec                   batch loss = 639.0994647741318 | accuracy = 0.6450980392156863


Epoch[2] Batch[260] Speed: 1.2509333973965233 samples/sec                   batch loss = 650.1248379945755 | accuracy = 0.6480769230769231


Epoch[2] Batch[265] Speed: 1.2515038254407005 samples/sec                   batch loss = 660.3393722772598 | accuracy = 0.6518867924528302


Epoch[2] Batch[270] Speed: 1.2507158330743395 samples/sec                   batch loss = 671.9643669128418 | accuracy = 0.6537037037037037


Epoch[2] Batch[275] Speed: 1.2526932894972238 samples/sec                   batch loss = 683.9541091918945 | accuracy = 0.6518181818181819


Epoch[2] Batch[280] Speed: 1.2504499720987112 samples/sec                   batch loss = 697.1990442276001 | accuracy = 0.65


Epoch[2] Batch[285] Speed: 1.2514495876766512 samples/sec                   batch loss = 709.0538853406906 | accuracy = 0.6491228070175439


Epoch[2] Batch[290] Speed: 1.2533167232595468 samples/sec                   batch loss = 722.5504211187363 | accuracy = 0.6491379310344828


Epoch[2] Batch[295] Speed: 1.2579858239412383 samples/sec                   batch loss = 732.4828680753708 | accuracy = 0.65


Epoch[2] Batch[300] Speed: 1.2535688182264184 samples/sec                   batch loss = 743.3608506917953 | accuracy = 0.6508333333333334


Epoch[2] Batch[305] Speed: 1.254321116874951 samples/sec                   batch loss = 754.3163684606552 | accuracy = 0.6524590163934426


Epoch[2] Batch[310] Speed: 1.2540752810719742 samples/sec                   batch loss = 766.8565627336502 | accuracy = 0.6548387096774193


Epoch[2] Batch[315] Speed: 1.2565077756885765 samples/sec                   batch loss = 781.5284172296524 | accuracy = 0.653968253968254


Epoch[2] Batch[320] Speed: 1.2507164857473683 samples/sec                   batch loss = 794.4897207021713 | accuracy = 0.653125


Epoch[2] Batch[325] Speed: 1.2486678418099548 samples/sec                   batch loss = 809.5416983366013 | accuracy = 0.6507692307692308


Epoch[2] Batch[330] Speed: 1.2479291760775868 samples/sec                   batch loss = 822.5655821561813 | accuracy = 0.6492424242424243


Epoch[2] Batch[335] Speed: 1.2527143350055638 samples/sec                   batch loss = 832.666535615921 | accuracy = 0.6514925373134328


Epoch[2] Batch[340] Speed: 1.2601919159620476 samples/sec                   batch loss = 845.1879217624664 | accuracy = 0.6514705882352941


Epoch[2] Batch[345] Speed: 1.2574197452455576 samples/sec                   batch loss = 855.7923440933228 | accuracy = 0.6543478260869565


Epoch[2] Batch[350] Speed: 1.2577771150486516 samples/sec                   batch loss = 869.0136475563049 | accuracy = 0.6521428571428571


Epoch[2] Batch[355] Speed: 1.2592658087726996 samples/sec                   batch loss = 880.0193095207214 | accuracy = 0.6535211267605634


Epoch[2] Batch[360] Speed: 1.2585993365825787 samples/sec                   batch loss = 892.3209681510925 | accuracy = 0.6534722222222222


Epoch[2] Batch[365] Speed: 1.2581104409823918 samples/sec                   batch loss = 905.0741387605667 | accuracy = 0.6513698630136986


Epoch[2] Batch[370] Speed: 1.254573803726523 samples/sec                   batch loss = 916.5822447538376 | accuracy = 0.6506756756756756


Epoch[2] Batch[375] Speed: 1.2535757494608282 samples/sec                   batch loss = 928.8019759654999 | accuracy = 0.65


Epoch[2] Batch[380] Speed: 1.2546203377193754 samples/sec                   batch loss = 937.0928055047989 | accuracy = 0.6526315789473685


Epoch[2] Batch[385] Speed: 1.2491590975018032 samples/sec                   batch loss = 947.3441718816757 | accuracy = 0.6532467532467533


Epoch[2] Batch[390] Speed: 1.2562393528589917 samples/sec                   batch loss = 959.7482079267502 | accuracy = 0.6525641025641026


Epoch[2] Batch[395] Speed: 1.2549467355760215 samples/sec                   batch loss = 972.1740210056305 | accuracy = 0.6518987341772152


Epoch[2] Batch[400] Speed: 1.25216925062102 samples/sec                   batch loss = 986.8836567401886 | accuracy = 0.65125


Epoch[2] Batch[405] Speed: 1.2505630330962236 samples/sec                   batch loss = 1002.5775253772736 | accuracy = 0.65


Epoch[2] Batch[410] Speed: 1.2529303495598558 samples/sec                   batch loss = 1015.0626094341278 | accuracy = 0.649390243902439


Epoch[2] Batch[415] Speed: 1.2492828090563162 samples/sec                   batch loss = 1027.244924545288 | accuracy = 0.6506024096385542


Epoch[2] Batch[420] Speed: 1.2510753725736072 samples/sec                   batch loss = 1036.8415969610214 | accuracy = 0.6517857142857143


Epoch[2] Batch[425] Speed: 1.2478013706077802 samples/sec                   batch loss = 1047.3155612945557 | accuracy = 0.6523529411764706


Epoch[2] Batch[430] Speed: 1.2515439700146547 samples/sec                   batch loss = 1059.7607026100159 | accuracy = 0.6534883720930232


Epoch[2] Batch[435] Speed: 1.251815713789531 samples/sec                   batch loss = 1070.6588340997696 | accuracy = 0.6551724137931034


Epoch[2] Batch[440] Speed: 1.2505832613333534 samples/sec                   batch loss = 1082.2744307518005 | accuracy = 0.6556818181818181


Epoch[2] Batch[445] Speed: 1.2545072923793608 samples/sec                   batch loss = 1094.710518360138 | accuracy = 0.6561797752808989


Epoch[2] Batch[450] Speed: 1.2544780258686683 samples/sec                   batch loss = 1108.9811092615128 | accuracy = 0.6561111111111111


Epoch[2] Batch[455] Speed: 1.2640040876749623 samples/sec                   batch loss = 1122.7550612688065 | accuracy = 0.6565934065934066


Epoch[2] Batch[460] Speed: 1.255490671121729 samples/sec                   batch loss = 1134.4787732362747 | accuracy = 0.6565217391304348


Epoch[2] Batch[465] Speed: 1.2580635534000122 samples/sec                   batch loss = 1148.181186079979 | accuracy = 0.6553763440860215


Epoch[2] Batch[470] Speed: 1.2603235031569282 samples/sec                   batch loss = 1160.7100526094437 | accuracy = 0.6542553191489362


Epoch[2] Batch[475] Speed: 1.2578832055822693 samples/sec                   batch loss = 1172.5986334085464 | accuracy = 0.6536842105263158


Epoch[2] Batch[480] Speed: 1.2536104067826346 samples/sec                   batch loss = 1183.0755097866058 | accuracy = 0.6546875


Epoch[2] Batch[485] Speed: 1.2564108555136382 samples/sec                   batch loss = 1195.989607334137 | accuracy = 0.6551546391752577


Epoch[2] Batch[490] Speed: 1.253542873610479 samples/sec                   batch loss = 1207.9256689548492 | accuracy = 0.6540816326530612


Epoch[2] Batch[495] Speed: 1.2512850361079502 samples/sec                   batch loss = 1223.654380083084 | accuracy = 0.6525252525252525


Epoch[2] Batch[500] Speed: 1.2499464326211778 samples/sec                   batch loss = 1237.6357171535492 | accuracy = 0.652


Epoch[2] Batch[505] Speed: 1.2513567129235805 samples/sec                   batch loss = 1250.3346154689789 | accuracy = 0.6514851485148515


Epoch[2] Batch[510] Speed: 1.2540094786857385 samples/sec                   batch loss = 1262.9229675531387 | accuracy = 0.6509803921568628


Epoch[2] Batch[515] Speed: 1.251675718325932 samples/sec                   batch loss = 1275.1992114782333 | accuracy = 0.6514563106796116


Epoch[2] Batch[520] Speed: 1.2493360218620937 samples/sec                   batch loss = 1288.7599931955338 | accuracy = 0.6504807692307693


Epoch[2] Batch[525] Speed: 1.2511077459328708 samples/sec                   batch loss = 1300.2287847995758 | accuracy = 0.65


Epoch[2] Batch[530] Speed: 1.2588568673333362 samples/sec                   batch loss = 1312.0479165315628 | accuracy = 0.6504716981132076


Epoch[2] Batch[535] Speed: 1.2613733974309576 samples/sec                   batch loss = 1322.4122023582458 | accuracy = 0.6504672897196262


Epoch[2] Batch[540] Speed: 1.25703554719191 samples/sec                   batch loss = 1334.1610666513443 | accuracy = 0.649074074074074


Epoch[2] Batch[545] Speed: 1.252208596741329 samples/sec                   batch loss = 1346.268651843071 | accuracy = 0.6495412844036698


Epoch[2] Batch[550] Speed: 1.253188373591283 samples/sec                   batch loss = 1358.9734463691711 | accuracy = 0.65


Epoch[2] Batch[555] Speed: 1.2545444403018167 samples/sec                   batch loss = 1369.7912142276764 | accuracy = 0.6513513513513514


Epoch[2] Batch[560] Speed: 1.260905940649124 samples/sec                   batch loss = 1381.5782877206802 | accuracy = 0.6517857142857143


Epoch[2] Batch[565] Speed: 1.250320718002911 samples/sec                   batch loss = 1397.3797625303268 | accuracy = 0.6491150442477877


Epoch[2] Batch[570] Speed: 1.265457549374988 samples/sec                   batch loss = 1411.6205343008041 | accuracy = 0.6491228070175439


Epoch[2] Batch[575] Speed: 1.2535859591213847 samples/sec                   batch loss = 1425.1163917779922 | accuracy = 0.6491304347826087


Epoch[2] Batch[580] Speed: 1.2529017179699804 samples/sec                   batch loss = 1436.8643887043 | accuracy = 0.6491379310344828


Epoch[2] Batch[585] Speed: 1.2511041073449463 samples/sec                   batch loss = 1444.5033629536629 | accuracy = 0.6512820512820513


Epoch[2] Batch[590] Speed: 1.2544898448720687 samples/sec                   batch loss = 1455.271005809307 | accuracy = 0.6512711864406779


Epoch[2] Batch[595] Speed: 1.256842500491808 samples/sec                   batch loss = 1466.5144906640053 | accuracy = 0.6516806722689076


Epoch[2] Batch[600] Speed: 1.2597287394510246 samples/sec                   batch loss = 1477.0372053980827 | accuracy = 0.6525


Epoch[2] Batch[605] Speed: 1.257285465923671 samples/sec                   batch loss = 1489.8538146615028 | accuracy = 0.6516528925619834


Epoch[2] Batch[610] Speed: 1.2550873698481728 samples/sec                   batch loss = 1500.9089244008064 | accuracy = 0.6524590163934426


Epoch[2] Batch[615] Speed: 1.256192322462821 samples/sec                   batch loss = 1512.246182024479 | accuracy = 0.6528455284552845


Epoch[2] Batch[620] Speed: 1.2525262600093678 samples/sec                   batch loss = 1523.971190392971 | accuracy = 0.6532258064516129


Epoch[2] Batch[625] Speed: 1.2496986780418546 samples/sec                   batch loss = 1538.446222126484 | accuracy = 0.6516


Epoch[2] Batch[630] Speed: 1.2564958245287265 samples/sec                   batch loss = 1552.8578869700432 | accuracy = 0.6507936507936508


Epoch[2] Batch[635] Speed: 1.2504854820482327 samples/sec                   batch loss = 1564.0347563624382 | accuracy = 0.6515748031496063


Epoch[2] Batch[640] Speed: 1.2490801395538913 samples/sec                   batch loss = 1575.1699988245964 | accuracy = 0.65234375


Epoch[2] Batch[645] Speed: 1.2472240200195575 samples/sec                   batch loss = 1587.8783760666847 | accuracy = 0.6511627906976745


Epoch[2] Batch[650] Speed: 1.256317525167961 samples/sec                   batch loss = 1600.8631755709648 | accuracy = 0.6511538461538462


Epoch[2] Batch[655] Speed: 1.2532107462950572 samples/sec                   batch loss = 1610.5279323458672 | accuracy = 0.6522900763358779


Epoch[2] Batch[660] Speed: 1.2468148923236073 samples/sec                   batch loss = 1622.5821242928505 | accuracy = 0.6522727272727272


Epoch[2] Batch[665] Speed: 1.248536540079427 samples/sec                   batch loss = 1634.7046832442284 | accuracy = 0.6526315789473685


Epoch[2] Batch[670] Speed: 1.2515972822553139 samples/sec                   batch loss = 1645.452811896801 | accuracy = 0.6526119402985074


Epoch[2] Batch[675] Speed: 1.2485519640458576 samples/sec                   batch loss = 1658.7566227316856 | accuracy = 0.652962962962963


Epoch[2] Batch[680] Speed: 1.2564121727750064 samples/sec                   batch loss = 1670.3589776158333 | accuracy = 0.6533088235294118


Epoch[2] Batch[685] Speed: 1.2477946886890137 samples/sec                   batch loss = 1682.390096962452 | accuracy = 0.6543795620437957


Epoch[2] Batch[690] Speed: 1.2506940155393582 samples/sec                   batch loss = 1694.6378237605095 | accuracy = 0.6543478260869565


Epoch[2] Batch[695] Speed: 1.25362511330639 samples/sec                   batch loss = 1706.4167851805687 | accuracy = 0.6550359712230216


Epoch[2] Batch[700] Speed: 1.25915901239134 samples/sec                   batch loss = 1717.9721539616585 | accuracy = 0.6560714285714285


Epoch[2] Batch[705] Speed: 1.2658860733801554 samples/sec                   batch loss = 1727.6236153244972 | accuracy = 0.6574468085106383


Epoch[2] Batch[710] Speed: 1.2585986756564629 samples/sec                   batch loss = 1738.01718634367 | accuracy = 0.6584507042253521


Epoch[2] Batch[715] Speed: 1.261720113666907 samples/sec                   batch loss = 1749.9773584008217 | accuracy = 0.6587412587412588


Epoch[2] Batch[720] Speed: 1.2603152663177062 samples/sec                   batch loss = 1763.7980709671974 | accuracy = 0.6576388888888889


Epoch[2] Batch[725] Speed: 1.2559426500415547 samples/sec                   batch loss = 1776.89247494936 | accuracy = 0.6575862068965517


Epoch[2] Batch[730] Speed: 1.2568946643384535 samples/sec                   batch loss = 1788.9810450673103 | accuracy = 0.6575342465753424


Epoch[2] Batch[735] Speed: 1.258887472051137 samples/sec                   batch loss = 1801.3879547715187 | accuracy = 0.6571428571428571


Epoch[2] Batch[740] Speed: 1.2610471555060005 samples/sec                   batch loss = 1815.5583427548409 | accuracy = 0.6564189189189189


Epoch[2] Batch[745] Speed: 1.2549717057155045 samples/sec                   batch loss = 1826.6149916052818 | accuracy = 0.6560402684563759


Epoch[2] Batch[750] Speed: 1.2559513939518525 samples/sec                   batch loss = 1840.310980975628 | accuracy = 0.6556666666666666


Epoch[2] Batch[755] Speed: 1.2578527439797673 samples/sec                   batch loss = 1853.0263211131096 | accuracy = 0.6549668874172185


Epoch[2] Batch[760] Speed: 1.2549274923495064 samples/sec                   batch loss = 1865.6841421723366 | accuracy = 0.6546052631578947


Epoch[2] Batch[765] Speed: 1.254239911019414 samples/sec                   batch loss = 1875.2445812821388 | accuracy = 0.6562091503267974


Epoch[2] Batch[770] Speed: 1.2528688774635037 samples/sec                   batch loss = 1885.0990853905678 | accuracy = 0.6568181818181819


Epoch[2] Batch[775] Speed: 1.2460364008467621 samples/sec                   batch loss = 1897.6856229901314 | accuracy = 0.6561290322580645


Epoch[2] Batch[780] Speed: 1.2502770180848524 samples/sec                   batch loss = 1910.212866127491 | accuracy = 0.6560897435897436


Epoch[2] Batch[785] Speed: 1.245170157209893 samples/sec                   batch loss = 1919.9306083321571 | accuracy = 0.6566878980891719


[Epoch 2] training: accuracy=0.6560913705583756
[Epoch 2] time cost: 644.9487953186035
[Epoch 2] validation: validation accuracy=0.7211111111111111


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations.
If you are keen on studying more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).