<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference may run faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)`, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[03:31:34] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The following example copies `x` to the second GPU, `npx.gpu(1)`. If fewer than two GPUs are available, the code falls back to `npx.cpu()` so that it still runs:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[03:31:34] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])


MXNet requires that users explicitly move data between devices. However, some operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through zero-based indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has, so if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
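
As a rough illustration of the zero-based numbering, the sketch below computes which GPU ids a program would use, given how many GPUs are present and how many it wants. The helper `select_gpu_ids` and its parameters are hypothetical names introduced here, not part of MXNet, and the sketch uses plain Python so it runs without a GPU.

```python
from typing import List


def select_gpu_ids(num_available: int, num_requested: int) -> List[int]:
    """Return the 0-based ids of the GPUs to use, lowest ids first.

    Hypothetical helper for illustration; it never returns more ids
    than there are GPUs available.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, the valid ids run from 0 to 7.
all_ids = select_gpu_ids(num_available=8, num_requested=8)
print(all_ids)  # [0, 1, 2, 3, 4, 5, 6, 7]

# Asking for more GPUs than exist returns only the ones that are present.
print(select_gpu_ids(num_available=1, num_requested=2))  # [0]
```

In MXNet code, each returned id `i` would be passed to `npx.gpu(i)`, and an empty list would be a reasonable signal to fall back to `npx.cpu()`.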

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to ensure that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and the parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[03:31:35] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 7.2568784, -4.3731008]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, if there are *n* GPUs, you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
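
To make the idea concrete, here is a minimal sketch in plain Python, under the assumption of a one-parameter model `w * x` with a squared-error loss. The helpers `split_batch` and `grad_sum` are hypothetical stand-ins for the split-and-load step and for one device's backward pass; they are not MXNet functions. The sketch checks that summing the per-shard gradients and dividing by the full batch size gives the same result as processing the whole batch on one device.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # an (x, y) training pair


def split_batch(batch: Sequence[Sample], num_parts: int) -> List[List[Sample]]:
    """Split a batch into num_parts contiguous shards of near-equal size."""
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    base, extra = divmod(len(batch), num_parts)
    shards, start = [], 0
    for i in range(num_parts):
        # The first `extra` shards take one additional sample (a choice of this sketch).
        size = base + (1 if i < extra else 0)
        shards.append(list(batch[start:start + size]))
        start += size
    return shards


def grad_sum(w: float, shard: Sequence[Sample]) -> float:
    """Sum of d/dw (w*x - y)**2 over a shard; stands in for one device's backward pass."""
    return sum(2.0 * (w * x - y) * x for x, y in shard)


batch = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w = 0.5

# Each shard plays the role of the data handled by one GPU.
shards = split_batch(batch, num_parts=2)
per_device = [grad_sum(w, shard) for shard in shards]

# Sum the per-device gradients, then normalize by the full batch size.
parallel_grad = sum(per_device) / len(batch)
single_grad = grad_sum(w, batch) / len(batch)

print(shards)
print(abs(parallel_grad - single_grad) < 1e-9)  # True
```

How the sketch distributes leftover samples when the batch size is not divisible by the number of parts is a choice made here for illustration and may differ from how MXNet splits a batch.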

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# Mean and std for normalizing image values in the range (0, 1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
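
The following plain-Python sketch illustrates one way to think about an accuracy computation over per-device lists: predictions and labels are compared shard by shard, and the counts are pooled into one ratio. The function `sharded_accuracy` and the sample scores are hypothetical; the sketch assumes the predicted class is the index of the largest score, and it is not the implementation of `gluon.metric.Accuracy`.

```python
from typing import List, Sequence


def argmax(scores: Sequence[float]) -> int:
    """Index of the largest score (the first one if there is a tie)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def sharded_accuracy(label_shards: List[List[int]],
                     output_shards: List[List[List[float]]]) -> float:
    """Pool correct-prediction counts over all shards into one accuracy value."""
    correct = total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            correct += int(argmax(scores) == label)
            total += 1
    return correct / total if total else 0.0


# Two shards, as if a batch of four samples had been split across two devices.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],  # predicted classes: 0, 1
    [[1.5, 0.2], [0.8, 0.1]],   # predicted classes: 0, 0
]

print(sharded_accuracy(label_shards, output_shards))  # 0.75
```

Pooling the counts before dividing keeps the result independent of how the batch was split, which is why the metric can be updated with one list entry per device.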

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7759204241575878 samples/sec                   batch loss = 14.031952381134033 | accuracy = 0.5


Epoch[1] Batch[10] Speed: 1.2520843050786488 samples/sec                   batch loss = 28.898024559020996 | accuracy = 0.525


Epoch[1] Batch[15] Speed: 1.253249783427451 samples/sec                   batch loss = 41.386584758758545 | accuracy = 0.6166666666666667


Epoch[1] Batch[20] Speed: 1.256125733444971 samples/sec                   batch loss = 57.50089693069458 | accuracy = 0.55


Epoch[1] Batch[25] Speed: 1.2570162398533227 samples/sec                   batch loss = 71.10232973098755 | accuracy = 0.55


Epoch[1] Batch[30] Speed: 1.2578742461342627 samples/sec                   batch loss = 84.15004706382751 | accuracy = 0.5666666666666667


Epoch[1] Batch[35] Speed: 1.2526175315218484 samples/sec                   batch loss = 98.03580641746521 | accuracy = 0.55


Epoch[1] Batch[40] Speed: 1.248025906116364 samples/sec                   batch loss = 111.69763422012329 | accuracy = 0.55625


Epoch[1] Batch[45] Speed: 1.257354345277334 samples/sec                   batch loss = 125.38323259353638 | accuracy = 0.5555555555555556


Epoch[1] Batch[50] Speed: 1.2561324108387326 samples/sec                   batch loss = 139.54295945167542 | accuracy = 0.56


Epoch[1] Batch[55] Speed: 1.2602291173251667 samples/sec                   batch loss = 152.57507646083832 | accuracy = 0.5681818181818182


Epoch[1] Batch[60] Speed: 1.2565361017864272 samples/sec                   batch loss = 166.53607547283173 | accuracy = 0.5625


Epoch[1] Batch[65] Speed: 1.2576038251892472 samples/sec                   batch loss = 180.58126032352448 | accuracy = 0.5538461538461539


Epoch[1] Batch[70] Speed: 1.2599706463738178 samples/sec                   batch loss = 195.08377015590668 | accuracy = 0.5464285714285714


Epoch[1] Batch[75] Speed: 1.257740529752561 samples/sec                   batch loss = 209.16265738010406 | accuracy = 0.5366666666666666


Epoch[1] Batch[80] Speed: 1.251075932329274 samples/sec                   batch loss = 224.08251678943634 | accuracy = 0.528125


Epoch[1] Batch[85] Speed: 1.255182771246577 samples/sec                   batch loss = 237.86851346492767 | accuracy = 0.5264705882352941


Epoch[1] Batch[90] Speed: 1.2519221088672434 samples/sec                   batch loss = 252.07555067539215 | accuracy = 0.5222222222222223


Epoch[1] Batch[95] Speed: 1.2588034071063516 samples/sec                   batch loss = 266.6105226278305 | accuracy = 0.5157894736842106


Epoch[1] Batch[100] Speed: 1.2588595121267496 samples/sec                   batch loss = 280.87939870357513 | accuracy = 0.5125


Epoch[1] Batch[105] Speed: 1.2625589966122734 samples/sec                   batch loss = 295.17215597629547 | accuracy = 0.5071428571428571


Epoch[1] Batch[110] Speed: 1.254233535023331 samples/sec                   batch loss = 308.76447093486786 | accuracy = 0.5136363636363637


Epoch[1] Batch[115] Speed: 1.2544774630645374 samples/sec                   batch loss = 322.4364572763443 | accuracy = 0.5130434782608696


Epoch[1] Batch[120] Speed: 1.251662738344978 samples/sec                   batch loss = 336.5463091135025 | accuracy = 0.5125


Epoch[1] Batch[125] Speed: 1.258976838786403 samples/sec                   batch loss = 350.33824050426483 | accuracy = 0.512


Epoch[1] Batch[130] Speed: 1.2568102062731827 samples/sec                   batch loss = 364.13840091228485 | accuracy = 0.5153846153846153


Epoch[1] Batch[135] Speed: 1.2504988104500911 samples/sec                   batch loss = 377.87993609905243 | accuracy = 0.5148148148148148


Epoch[1] Batch[140] Speed: 1.256712580492015 samples/sec                   batch loss = 392.085924744606 | accuracy = 0.5089285714285714


Epoch[1] Batch[145] Speed: 1.2580141223895438 samples/sec                   batch loss = 405.42241084575653 | accuracy = 0.5172413793103449


Epoch[1] Batch[150] Speed: 1.2508315533238648 samples/sec                   batch loss = 419.6384974718094 | accuracy = 0.51


Epoch[1] Batch[155] Speed: 1.251776392354809 samples/sec                   batch loss = 433.42499697208405 | accuracy = 0.5080645161290323


Epoch[1] Batch[160] Speed: 1.255278750672168 samples/sec                   batch loss = 447.1120604276657 | accuracy = 0.509375


Epoch[1] Batch[165] Speed: 1.256322793447227 samples/sec                   batch loss = 461.05530083179474 | accuracy = 0.5106060606060606


Epoch[1] Batch[170] Speed: 1.2565628292424478 samples/sec                   batch loss = 474.70050275325775 | accuracy = 0.5117647058823529


Epoch[1] Batch[175] Speed: 1.2546095482805706 samples/sec                   batch loss = 488.6838847398758 | accuracy = 0.51


Epoch[1] Batch[180] Speed: 1.2563601430511269 samples/sec                   batch loss = 502.876189827919 | accuracy = 0.5083333333333333


Epoch[1] Batch[185] Speed: 1.2527472609526984 samples/sec                   batch loss = 516.8835846185684 | accuracy = 0.5121621621621621


Epoch[1] Batch[190] Speed: 1.2563356821024156 samples/sec                   batch loss = 530.3068333864212 | accuracy = 0.5144736842105263


Epoch[1] Batch[195] Speed: 1.253983234646116 samples/sec                   batch loss = 543.937176823616 | accuracy = 0.5128205128205128


Epoch[1] Batch[200] Speed: 1.2576206995240486 samples/sec                   batch loss = 557.91395008564 | accuracy = 0.51375


Epoch[1] Batch[205] Speed: 1.2556481548077414 samples/sec                   batch loss = 571.6639693975449 | accuracy = 0.5158536585365854


Epoch[1] Batch[210] Speed: 1.2566251349804145 samples/sec                   batch loss = 585.2972484827042 | accuracy = 0.5190476190476191


Epoch[1] Batch[215] Speed: 1.2554578826504346 samples/sec                   batch loss = 598.5771871805191 | accuracy = 0.5232558139534884


Epoch[1] Batch[220] Speed: 1.2589655963958493 samples/sec                   batch loss = 612.2313121557236 | accuracy = 0.525


Epoch[1] Batch[225] Speed: 1.2553539855123974 samples/sec                   batch loss = 626.3419333696365 | accuracy = 0.5222222222222223


Epoch[1] Batch[230] Speed: 1.2480038109857585 samples/sec                   batch loss = 640.1264218091965 | accuracy = 0.5217391304347826


Epoch[1] Batch[235] Speed: 1.2548368224533093 samples/sec                   batch loss = 653.604155421257 | accuracy = 0.5234042553191489


Epoch[1] Batch[240] Speed: 1.258397219965782 samples/sec                   batch loss = 666.932331442833 | accuracy = 0.528125


Epoch[1] Batch[245] Speed: 1.2554323294992642 samples/sec                   batch loss = 681.176939368248 | accuracy = 0.5265306122448979


Epoch[1] Batch[250] Speed: 1.2540368486400673 samples/sec                   batch loss = 694.8873366117477 | accuracy = 0.529


Epoch[1] Batch[255] Speed: 1.2604641140862733 samples/sec                   batch loss = 708.4831677675247 | accuracy = 0.5303921568627451


Epoch[1] Batch[260] Speed: 1.261143749518442 samples/sec                   batch loss = 721.6566687822342 | accuracy = 0.5326923076923077


Epoch[1] Batch[265] Speed: 1.2588671631988426 samples/sec                   batch loss = 735.5427747964859 | accuracy = 0.5311320754716982


Epoch[1] Batch[270] Speed: 1.2610344543519973 samples/sec                   batch loss = 749.2291587591171 | accuracy = 0.5324074074074074


Epoch[1] Batch[275] Speed: 1.257621830780567 samples/sec                   batch loss = 763.0294083356857 | accuracy = 0.5318181818181819


Epoch[1] Batch[280] Speed: 1.2582833986429285 samples/sec                   batch loss = 776.3450487852097 | accuracy = 0.5348214285714286


Epoch[1] Batch[285] Speed: 1.2582837761257208 samples/sec                   batch loss = 789.8877311944962 | accuracy = 0.5368421052631579


Epoch[1] Batch[290] Speed: 1.2612303080483604 samples/sec                   batch loss = 803.6950870752335 | accuracy = 0.5370689655172414


Epoch[1] Batch[295] Speed: 1.2587126484625215 samples/sec                   batch loss = 817.7262808084488 | accuracy = 0.5330508474576271


Epoch[1] Batch[300] Speed: 1.2543751349069556 samples/sec                   batch loss = 830.9996825456619 | accuracy = 0.5341666666666667


Epoch[1] Batch[305] Speed: 1.262159118171615 samples/sec                   batch loss = 844.8954545259476 | accuracy = 0.5344262295081967


Epoch[1] Batch[310] Speed: 1.2591728098428114 samples/sec                   batch loss = 858.7358430624008 | accuracy = 0.5346774193548387


Epoch[1] Batch[315] Speed: 1.259349557368012 samples/sec                   batch loss = 872.1121619939804 | accuracy = 0.5365079365079365


Epoch[1] Batch[320] Speed: 1.2615850092213254 samples/sec                   batch loss = 885.8399962186813 | accuracy = 0.5390625


Epoch[1] Batch[325] Speed: 1.2565748757915283 samples/sec                   batch loss = 899.5840631723404 | accuracy = 0.5384615384615384


Epoch[1] Batch[330] Speed: 1.2596292411614238 samples/sec                   batch loss = 912.2996298074722 | accuracy = 0.5424242424242425


Epoch[1] Batch[335] Speed: 1.2579211196066118 samples/sec                   batch loss = 926.0795065164566 | accuracy = 0.5417910447761194


Epoch[1] Batch[340] Speed: 1.2593989988443586 samples/sec                   batch loss = 939.7548369169235 | accuracy = 0.5426470588235294


Epoch[1] Batch[345] Speed: 1.2607171033682287 samples/sec                   batch loss = 954.0139364004135 | accuracy = 0.5405797101449276


Epoch[1] Batch[350] Speed: 1.2679852323983627 samples/sec                   batch loss = 967.4313932657242 | accuracy = 0.5421428571428571


Epoch[1] Batch[355] Speed: 1.259567771936858 samples/sec                   batch loss = 981.5526608228683 | accuracy = 0.5408450704225352


Epoch[1] Batch[360] Speed: 1.2599294863110502 samples/sec                   batch loss = 995.4689799547195 | accuracy = 0.5416666666666666


Epoch[1] Batch[365] Speed: 1.26124367685424 samples/sec                   batch loss = 1009.5020679235458 | accuracy = 0.5417808219178082


Epoch[1] Batch[370] Speed: 1.2600652774056913 samples/sec                   batch loss = 1023.0310553312302 | accuracy = 0.5418918918918919


Epoch[1] Batch[375] Speed: 1.26655942659274 samples/sec                   batch loss = 1036.874957203865 | accuracy = 0.542


Epoch[1] Batch[380] Speed: 1.2556465572211808 samples/sec                   batch loss = 1050.0394798517227 | accuracy = 0.5434210526315789


Epoch[1] Batch[385] Speed: 1.2569763085080774 samples/sec                   batch loss = 1063.1559401750565 | accuracy = 0.5448051948051948


Epoch[1] Batch[390] Speed: 1.255283634557872 samples/sec                   batch loss = 1076.8297358751297 | accuracy = 0.5448717948717948


Epoch[1] Batch[395] Speed: 1.2560366769666786 samples/sec                   batch loss = 1090.3078068494797 | accuracy = 0.5449367088607595


Epoch[1] Batch[400] Speed: 1.25694005229079 samples/sec                   batch loss = 1103.9963649511337 | accuracy = 0.544375


Epoch[1] Batch[405] Speed: 1.2584798146622225 samples/sec                   batch loss = 1117.2996579408646 | accuracy = 0.5450617283950617


Epoch[1] Batch[410] Speed: 1.2630156996769968 samples/sec                   batch loss = 1130.7555245161057 | accuracy = 0.5445121951219513


Epoch[1] Batch[415] Speed: 1.2536015080982421 samples/sec                   batch loss = 1143.9021850824356 | accuracy = 0.5463855421686747


Epoch[1] Batch[420] Speed: 1.2523620795885002 samples/sec                   batch loss = 1157.3950406312943 | accuracy = 0.5464285714285714


Epoch[1] Batch[425] Speed: 1.2557625337738663 samples/sec                   batch loss = 1171.3619836568832 | accuracy = 0.5458823529411765


Epoch[1] Batch[430] Speed: 1.248307176784717 samples/sec                   batch loss = 1184.596143603325 | accuracy = 0.547093023255814


Epoch[1] Batch[435] Speed: 1.2538321653246518 samples/sec                   batch loss = 1197.771015048027 | accuracy = 0.5477011494252874


Epoch[1] Batch[440] Speed: 1.252422940999374 samples/sec                   batch loss = 1211.1798506975174 | accuracy = 0.5482954545454546


Epoch[1] Batch[445] Speed: 1.2577336466744293 samples/sec                   batch loss = 1224.393049120903 | accuracy = 0.55


Epoch[1] Batch[450] Speed: 1.2593780117396811 samples/sec                   batch loss = 1238.0850018262863 | accuracy = 0.5505555555555556


Epoch[1] Batch[455] Speed: 1.2517386610440995 samples/sec                   batch loss = 1251.597206711769 | accuracy = 0.5505494505494506


Epoch[1] Batch[460] Speed: 1.253578465776299 samples/sec                   batch loss = 1264.4995468854904 | accuracy = 0.5521739130434783


Epoch[1] Batch[465] Speed: 1.2527451094885278 samples/sec                   batch loss = 1278.219680905342 | accuracy = 0.5516129032258065


Epoch[1] Batch[470] Speed: 1.258324356846803 samples/sec                   batch loss = 1292.4086645841599 | accuracy = 0.551595744680851


Epoch[1] Batch[475] Speed: 1.2557933642089203 samples/sec                   batch loss = 1305.4990109205246 | accuracy = 0.5531578947368421


Epoch[1] Batch[480] Speed: 1.2578915049742945 samples/sec                   batch loss = 1318.5020297765732 | accuracy = 0.5557291666666667


Epoch[1] Batch[485] Speed: 1.2616650816781696 samples/sec                   batch loss = 1332.1562258005142 | accuracy = 0.5556701030927835


Epoch[1] Batch[490] Speed: 1.2533349808184768 samples/sec                   batch loss = 1345.5161646604538 | accuracy = 0.5561224489795918


Epoch[1] Batch[495] Speed: 1.2506738769518837 samples/sec                   batch loss = 1358.8768433332443 | accuracy = 0.5565656565656566


Epoch[1] Batch[500] Speed: 1.260355315408928 samples/sec                   batch loss = 1372.3766378164291 | accuracy = 0.5565


Epoch[1] Batch[505] Speed: 1.2586876236670739 samples/sec                   batch loss = 1385.2893780469894 | accuracy = 0.5584158415841585


Epoch[1] Batch[510] Speed: 1.2557727790858628 samples/sec                   batch loss = 1399.022380232811 | accuracy = 0.5568627450980392


Epoch[1] Batch[515] Speed: 1.2561391823527932 samples/sec                   batch loss = 1412.8079940080643 | accuracy = 0.5567961165048544


Epoch[1] Batch[520] Speed: 1.2572378860990263 samples/sec                   batch loss = 1428.0029512643814 | accuracy = 0.5543269230769231


Epoch[1] Batch[525] Speed: 1.2538110822409367 samples/sec                   batch loss = 1441.613734126091 | accuracy = 0.5552380952380952


Epoch[1] Batch[530] Speed: 1.2567360205929168 samples/sec                   batch loss = 1454.7853590250015 | accuracy = 0.5556603773584906


Epoch[1] Batch[535] Speed: 1.2590533679562923 samples/sec                   batch loss = 1468.9822198152542 | accuracy = 0.5560747663551402


Epoch[1] Batch[540] Speed: 1.2582014902337464 samples/sec                   batch loss = 1482.7669750452042 | accuracy = 0.5555555555555556


Epoch[1] Batch[545] Speed: 1.2552712370760615 samples/sec                   batch loss = 1495.8847142457962 | accuracy = 0.5559633027522936


Epoch[1] Batch[550] Speed: 1.255647872880406 samples/sec                   batch loss = 1509.158663392067 | accuracy = 0.5568181818181818


Epoch[1] Batch[555] Speed: 1.2665758727838794 samples/sec                   batch loss = 1522.8452025651932 | accuracy = 0.5567567567567567


Epoch[1] Batch[560] Speed: 1.2618027655514605 samples/sec                   batch loss = 1535.6774796247482 | accuracy = 0.5575892857142857


Epoch[1] Batch[565] Speed: 1.2586545735753911 samples/sec                   batch loss = 1548.7070986032486 | accuracy = 0.5584070796460177


Epoch[1] Batch[570] Speed: 1.2531701203033803 samples/sec                   batch loss = 1562.4890028238297 | accuracy = 0.5574561403508772


Epoch[1] Batch[575] Speed: 1.2629622659405813 samples/sec                   batch loss = 1575.8895260095596 | accuracy = 0.5582608695652174


Epoch[1] Batch[580] Speed: 1.2577716459773378 samples/sec                   batch loss = 1589.0149787664413 | accuracy = 0.559051724137931


Epoch[1] Batch[585] Speed: 1.2632028484755853 samples/sec                   batch loss = 1602.6785284280777 | accuracy = 0.5598290598290598


Epoch[1] Batch[590] Speed: 1.2559290173060254 samples/sec                   batch loss = 1615.5883818864822 | accuracy = 0.5605932203389831


Epoch[1] Batch[595] Speed: 1.2591462547429395 samples/sec                   batch loss = 1629.4941500425339 | accuracy = 0.5605042016806723


Epoch[1] Batch[600] Speed: 1.2581793164042887 samples/sec                   batch loss = 1641.8315876722336 | accuracy = 0.5620833333333334


Epoch[1] Batch[605] Speed: 1.2500464560964928 samples/sec                   batch loss = 1655.1932801008224 | accuracy = 0.5628099173553719


Epoch[1] Batch[610] Speed: 1.2623616852541018 samples/sec                   batch loss = 1669.7013379335403 | accuracy = 0.5618852459016394


Epoch[1] Batch[615] Speed: 1.2591698802216658 samples/sec                   batch loss = 1683.9316452741623 | accuracy = 0.5621951219512196


Epoch[1] Batch[620] Speed: 1.2635696992400904 samples/sec                   batch loss = 1697.6253913640976 | accuracy = 0.5608870967741936


Epoch[1] Batch[625] Speed: 1.2642239183986632 samples/sec                   batch loss = 1710.1349285840988 | accuracy = 0.562


Epoch[1] Batch[630] Speed: 1.2622069762576185 samples/sec                   batch loss = 1723.8231424093246 | accuracy = 0.5615079365079365


Epoch[1] Batch[635] Speed: 1.2545500689668374 samples/sec                   batch loss = 1736.9638067483902 | accuracy = 0.5618110236220473


Epoch[1] Batch[640] Speed: 1.2558663105343049 samples/sec                   batch loss = 1750.30963909626 | accuracy = 0.562109375


Epoch[1] Batch[645] Speed: 1.2653262240065428 samples/sec                   batch loss = 1763.507682442665 | accuracy = 0.5624031007751938


Epoch[1] Batch[650] Speed: 1.2591598629104253 samples/sec                   batch loss = 1777.3184224367142 | accuracy = 0.5615384615384615


Epoch[1] Batch[655] Speed: 1.2566581727045705 samples/sec                   batch loss = 1789.9197944402695 | accuracy = 0.5625954198473282


Epoch[1] Batch[660] Speed: 1.253128654553619 samples/sec                   batch loss = 1803.4193481206894 | accuracy = 0.5632575757575757


Epoch[1] Batch[665] Speed: 1.2612836901511604 samples/sec                   batch loss = 1816.6078609228134 | accuracy = 0.562781954887218


Epoch[1] Batch[670] Speed: 1.2589624787936928 samples/sec                   batch loss = 1829.8103481531143 | accuracy = 0.5645522388059702


Epoch[1] Batch[675] Speed: 1.2592001221281552 samples/sec                   batch loss = 1843.2291024923325 | accuracy = 0.5644444444444444


Epoch[1] Batch[680] Speed: 1.2597627919053642 samples/sec                   batch loss = 1855.9924589395523 | accuracy = 0.5639705882352941


Epoch[1] Batch[685] Speed: 1.2631264797426167 samples/sec                   batch loss = 1870.074802994728 | accuracy = 0.5635036496350365


Epoch[1] Batch[690] Speed: 1.2623386996897503 samples/sec                   batch loss = 1882.737107872963 | accuracy = 0.5648550724637681


Epoch[1] Batch[695] Speed: 1.2638129885187868 samples/sec                   batch loss = 1895.5295165777206 | accuracy = 0.5654676258992806


Epoch[1] Batch[700] Speed: 1.256619111195166 samples/sec                   batch loss = 1909.0712965726852 | accuracy = 0.5657142857142857


Epoch[1] Batch[705] Speed: 1.2646358782009341 samples/sec                   batch loss = 1921.4801436662674 | accuracy = 0.5670212765957446


Epoch[1] Batch[710] Speed: 1.2641909579674162 samples/sec                   batch loss = 1934.1094838380814 | accuracy = 0.5690140845070423


Epoch[1] Batch[715] Speed: 1.262539709248607 samples/sec                   batch loss = 1946.1774011850357 | accuracy = 0.5702797202797203


Epoch[1] Batch[720] Speed: 1.2600401041635765 samples/sec                   batch loss = 1959.9085952043533 | accuracy = 0.5708333333333333


Epoch[1] Batch[725] Speed: 1.2617831217001492 samples/sec                   batch loss = 1972.1976245641708 | accuracy = 0.5717241379310345


Epoch[1] Batch[730] Speed: 1.264484232814952 samples/sec                   batch loss = 1985.6970485448837 | accuracy = 0.5726027397260274


Epoch[1] Batch[735] Speed: 1.2634859593475596 samples/sec                   batch loss = 1999.6395770311356 | accuracy = 0.5731292517006803


Epoch[1] Batch[740] Speed: 1.2654170799313635 samples/sec                   batch loss = 2015.0547369718552 | accuracy = 0.5712837837837837


Epoch[1] Batch[745] Speed: 1.2574121117573676 samples/sec                   batch loss = 2028.6841136217117 | accuracy = 0.5711409395973155


Epoch[1] Batch[750] Speed: 1.2594104380563007 samples/sec                   batch loss = 2040.815478682518 | accuracy = 0.5723333333333334


Epoch[1] Batch[755] Speed: 1.2669167493042883 samples/sec                   batch loss = 2052.731186270714 | accuracy = 0.573841059602649


Epoch[1] Batch[760] Speed: 1.258197715909708 samples/sec                   batch loss = 2065.3781876564026 | accuracy = 0.5746710526315789


Epoch[1] Batch[765] Speed: 1.2646715311080166 samples/sec                   batch loss = 2079.1494886875153 | accuracy = 0.5748366013071895


Epoch[1] Batch[770] Speed: 1.2629272797512388 samples/sec                   batch loss = 2091.231722831726 | accuracy = 0.575


Epoch[1] Batch[775] Speed: 1.2645501859868056 samples/sec                   batch loss = 2105.294734477997 | accuracy = 0.5751612903225807


Epoch[1] Batch[780] Speed: 1.2625311583892185 samples/sec                   batch loss = 2117.5538415908813 | accuracy = 0.5756410256410256


Epoch[1] Batch[785] Speed: 1.2622572121606086 samples/sec                   batch loss = 2131.7821819782257 | accuracy = 0.575796178343949


[Epoch 1] training: accuracy=0.575507614213198
[Epoch 1] time cost: 644.741313457489
[Epoch 1] validation: validation accuracy=0.6677777777777778


Epoch[2] Batch[5] Speed: 1.2545625460210346 samples/sec                   batch loss = 13.653967142105103 | accuracy = 0.6


Epoch[2] Batch[10] Speed: 1.2515881320207431 samples/sec                   batch loss = 27.170570135116577 | accuracy = 0.55


Epoch[2] Batch[15] Speed: 1.248816646644819 samples/sec                   batch loss = 39.56160259246826 | accuracy = 0.6166666666666667


Epoch[2] Batch[20] Speed: 1.2600604508790794 samples/sec                   batch loss = 52.291465520858765 | accuracy = 0.6


Epoch[2] Batch[25] Speed: 1.2514139296084439 samples/sec                   batch loss = 65.53275895118713 | accuracy = 0.59


Epoch[2] Batch[30] Speed: 1.2594580879184565 samples/sec                   batch loss = 78.82899355888367 | accuracy = 0.5916666666666667


Epoch[2] Batch[35] Speed: 1.2613789927051926 samples/sec                   batch loss = 90.1775290966034 | accuracy = 0.6285714285714286


Epoch[2] Batch[40] Speed: 1.2566544076340567 samples/sec                   batch loss = 103.39138305187225 | accuracy = 0.63125


Epoch[2] Batch[45] Speed: 1.2545912535693173 samples/sec                   batch loss = 117.25112473964691 | accuracy = 0.6388888888888888


Epoch[2] Batch[50] Speed: 1.2545712707251728 samples/sec                   batch loss = 131.14141881465912 | accuracy = 0.625


Epoch[2] Batch[55] Speed: 1.2552081264196848 samples/sec                   batch loss = 144.783265709877 | accuracy = 0.6272727272727273


Epoch[2] Batch[60] Speed: 1.2601188448696279 samples/sec                   batch loss = 157.02533221244812 | accuracy = 0.6333333333333333


Epoch[2] Batch[65] Speed: 1.2554069652676458 samples/sec                   batch loss = 168.45425868034363 | accuracy = 0.6461538461538462


Epoch[2] Batch[70] Speed: 1.2570498632600307 samples/sec                   batch loss = 181.5585594177246 | accuracy = 0.65


Epoch[2] Batch[75] Speed: 1.2610185308785842 samples/sec                   batch loss = 194.0905783176422 | accuracy = 0.6566666666666666


Epoch[2] Batch[80] Speed: 1.256072504929106 samples/sec                   batch loss = 206.20623064041138 | accuracy = 0.659375


Epoch[2] Batch[85] Speed: 1.2618097881394716 samples/sec                   batch loss = 217.49620413780212 | accuracy = 0.6647058823529411


Epoch[2] Batch[90] Speed: 1.2580224235092172 samples/sec                   batch loss = 228.49606800079346 | accuracy = 0.675


Epoch[2] Batch[95] Speed: 1.259507727025841 samples/sec                   batch loss = 241.0132851600647 | accuracy = 0.6763157894736842


Epoch[2] Batch[100] Speed: 1.259417339499547 samples/sec                   batch loss = 254.0226445198059 | accuracy = 0.6725


Epoch[2] Batch[105] Speed: 1.2590488326350382 samples/sec                   batch loss = 269.1344208717346 | accuracy = 0.6666666666666666


Epoch[2] Batch[110] Speed: 1.2525961151877842 samples/sec                   batch loss = 280.959148645401 | accuracy = 0.6704545454545454


Epoch[2] Batch[115] Speed: 1.2577675913517203 samples/sec                   batch loss = 294.4727187156677 | accuracy = 0.6630434782608695


Epoch[2] Batch[120] Speed: 1.2560272736406264 samples/sec                   batch loss = 306.4552402496338 | accuracy = 0.6666666666666666


Epoch[2] Batch[125] Speed: 1.267271689143322 samples/sec                   batch loss = 318.80508756637573 | accuracy = 0.672


Epoch[2] Batch[130] Speed: 1.260943658031853 samples/sec                   batch loss = 330.2840712070465 | accuracy = 0.675


Epoch[2] Batch[135] Speed: 1.2583809854833143 samples/sec                   batch loss = 342.80627512931824 | accuracy = 0.6722222222222223


Epoch[2] Batch[140] Speed: 1.2610445962976713 samples/sec                   batch loss = 355.19769382476807 | accuracy = 0.6714285714285714


Epoch[2] Batch[145] Speed: 1.2545638594095976 samples/sec                   batch loss = 368.73746490478516 | accuracy = 0.6655172413793103


Epoch[2] Batch[150] Speed: 1.2620363559882128 samples/sec                   batch loss = 384.26941990852356 | accuracy = 0.6566666666666666


Epoch[2] Batch[155] Speed: 1.2632253899537942 samples/sec                   batch loss = 398.44217109680176 | accuracy = 0.6516129032258065


Epoch[2] Batch[160] Speed: 1.2503230475086087 samples/sec                   batch loss = 410.1895307302475 | accuracy = 0.653125


Epoch[2] Batch[165] Speed: 1.2544869370013698 samples/sec                   batch loss = 423.62779009342194 | accuracy = 0.6515151515151515


Epoch[2] Batch[170] Speed: 1.2565661231977678 samples/sec                   batch loss = 435.863160610199 | accuracy = 0.65


Epoch[2] Batch[175] Speed: 1.2589556768062182 samples/sec                   batch loss = 447.73512983322144 | accuracy = 0.65


Epoch[2] Batch[180] Speed: 1.262628359982248 samples/sec                   batch loss = 461.0057953596115 | accuracy = 0.6472222222222223


Epoch[2] Batch[185] Speed: 1.2606152703123479 samples/sec                   batch loss = 473.71568405628204 | accuracy = 0.65


Epoch[2] Batch[190] Speed: 1.2574488664405932 samples/sec                   batch loss = 483.63972437381744 | accuracy = 0.656578947368421


Epoch[2] Batch[195] Speed: 1.2544155576929519 samples/sec                   batch loss = 497.6393848657608 | accuracy = 0.6525641025641026


Epoch[2] Batch[200] Speed: 1.261153608817092 samples/sec                   batch loss = 510.40891349315643 | accuracy = 0.65375


Epoch[2] Batch[205] Speed: 1.2557683613620236 samples/sec                   batch loss = 523.7761846780777 | accuracy = 0.6524390243902439


Epoch[2] Batch[210] Speed: 1.2643391028816793 samples/sec                   batch loss = 537.2352434396744 | accuracy = 0.6511904761904762


Epoch[2] Batch[215] Speed: 1.2659117673046145 samples/sec                   batch loss = 549.9136327505112 | accuracy = 0.6523255813953488


Epoch[2] Batch[220] Speed: 1.262323693037408 samples/sec                   batch loss = 560.3086656332016 | accuracy = 0.6568181818181819


Epoch[2] Batch[225] Speed: 1.2612357124251206 samples/sec                   batch loss = 572.4072580337524 | accuracy = 0.6577777777777778


Epoch[2] Batch[230] Speed: 1.2577393982824885 samples/sec                   batch loss = 583.5177398920059 | accuracy = 0.658695652173913


Epoch[2] Batch[235] Speed: 1.2590771789289463 samples/sec                   batch loss = 595.7594922780991 | accuracy = 0.6574468085106383


Epoch[2] Batch[240] Speed: 1.258881615477659 samples/sec                   batch loss = 606.6780942678452 | accuracy = 0.6614583333333334


Epoch[2] Batch[245] Speed: 1.2597341309669627 samples/sec                   batch loss = 619.0913323163986 | accuracy = 0.6591836734693878


Epoch[2] Batch[250] Speed: 1.2579815792837987 samples/sec                   batch loss = 634.5859102010727 | accuracy = 0.656


Epoch[2] Batch[255] Speed: 1.2514563087966915 samples/sec                   batch loss = 648.245344042778 | accuracy = 0.6558823529411765


Epoch[2] Batch[260] Speed: 1.256537042874721 samples/sec                   batch loss = 661.1864011287689 | accuracy = 0.6567307692307692


Epoch[2] Batch[265] Speed: 1.265659935427053 samples/sec                   batch loss = 671.665874838829 | accuracy = 0.659433962264151


Epoch[2] Batch[270] Speed: 1.261126496116732 samples/sec                   batch loss = 681.6918392181396 | accuracy = 0.6611111111111111


Epoch[2] Batch[275] Speed: 1.2573664070076458 samples/sec                   batch loss = 692.3365414142609 | accuracy = 0.6636363636363637


Epoch[2] Batch[280] Speed: 1.2552911483023934 samples/sec                   batch loss = 702.525202870369 | accuracy = 0.6660714285714285


Epoch[2] Batch[285] Speed: 1.2609712367422912 samples/sec                   batch loss = 713.4423167705536 | accuracy = 0.6692982456140351


Epoch[2] Batch[290] Speed: 1.2534421025460378 samples/sec                   batch loss = 727.413337469101 | accuracy = 0.6681034482758621


Epoch[2] Batch[295] Speed: 1.2476684881088171 samples/sec                   batch loss = 741.0384523868561 | accuracy = 0.6644067796610169


Epoch[2] Batch[300] Speed: 1.253813143666737 samples/sec                   batch loss = 753.0120568275452 | accuracy = 0.6633333333333333


Epoch[2] Batch[305] Speed: 1.2582437642100892 samples/sec                   batch loss = 766.1393988132477 | accuracy = 0.6622950819672131


Epoch[2] Batch[310] Speed: 1.2595519800432238 samples/sec                   batch loss = 777.3630573749542 | accuracy = 0.6645161290322581


Epoch[2] Batch[315] Speed: 1.2560521927624864 samples/sec                   batch loss = 790.0551514625549 | accuracy = 0.6634920634920635


Epoch[2] Batch[320] Speed: 1.2555573807480935 samples/sec                   batch loss = 801.0783828496933 | accuracy = 0.66484375


Epoch[2] Batch[325] Speed: 1.2480420601943236 samples/sec                   batch loss = 813.4922465085983 | accuracy = 0.6638461538461539


Epoch[2] Batch[330] Speed: 1.2520210472946223 samples/sec                   batch loss = 827.7193206548691 | accuracy = 0.6606060606060606


Epoch[2] Batch[335] Speed: 1.2627809862270054 samples/sec                   batch loss = 838.9088913202286 | accuracy = 0.6604477611940298


Epoch[2] Batch[340] Speed: 1.2596720840303162 samples/sec                   batch loss = 852.1363041400909 | accuracy = 0.6595588235294118


Epoch[2] Batch[345] Speed: 1.2633104270750513 samples/sec                   batch loss = 862.0990505218506 | accuracy = 0.6623188405797101


Epoch[2] Batch[350] Speed: 1.2496714971232303 samples/sec                   batch loss = 873.8195116519928 | accuracy = 0.6635714285714286


Epoch[2] Batch[355] Speed: 1.2605227349106807 samples/sec                   batch loss = 885.5170608758926 | accuracy = 0.6647887323943662


Epoch[2] Batch[360] Speed: 1.262755324190963 samples/sec                   batch loss = 899.3015834093094 | accuracy = 0.6625


Epoch[2] Batch[365] Speed: 1.2645148258440981 samples/sec                   batch loss = 911.0186356306076 | accuracy = 0.6616438356164384


Epoch[2] Batch[370] Speed: 1.2576669884319664 samples/sec                   batch loss = 922.0681089162827 | accuracy = 0.6608108108108108


Epoch[2] Batch[375] Speed: 1.257487696798938 samples/sec                   batch loss = 931.800507068634 | accuracy = 0.662


Epoch[2] Batch[380] Speed: 1.2604334326621527 samples/sec                   batch loss = 944.6126853227615 | accuracy = 0.6631578947368421


Epoch[2] Batch[385] Speed: 1.2542792936084464 samples/sec                   batch loss = 959.3260422945023 | accuracy = 0.6616883116883117


Epoch[2] Batch[390] Speed: 1.2489585136267385 samples/sec                   batch loss = 975.2804979085922 | accuracy = 0.6576923076923077


Epoch[2] Batch[395] Speed: 1.2527549314503148 samples/sec                   batch loss = 986.3889791965485 | accuracy = 0.6594936708860759


Epoch[2] Batch[400] Speed: 1.2558527734624068 samples/sec                   batch loss = 998.123509645462 | accuracy = 0.65875


Epoch[2] Batch[405] Speed: 1.2611659331571867 samples/sec                   batch loss = 1009.3857772350311 | accuracy = 0.6598765432098765


Epoch[2] Batch[410] Speed: 1.2569680211882956 samples/sec                   batch loss = 1020.8831419944763 | accuracy = 0.6621951219512195


Epoch[2] Batch[415] Speed: 1.2567946717446201 samples/sec                   batch loss = 1031.8772300481796 | accuracy = 0.6632530120481928


Epoch[2] Batch[420] Speed: 1.2579214968720693 samples/sec                   batch loss = 1044.0632580518723 | accuracy = 0.6625


Epoch[2] Batch[425] Speed: 1.258539572784689 samples/sec                   batch loss = 1057.7998205423355 | accuracy = 0.6617647058823529


Epoch[2] Batch[430] Speed: 1.2587767731402388 samples/sec                   batch loss = 1069.9166022539139 | accuracy = 0.6633720930232558


Epoch[2] Batch[435] Speed: 1.2545951939235809 samples/sec                   batch loss = 1082.7246562242508 | accuracy = 0.6626436781609195


Epoch[2] Batch[440] Speed: 1.25843063411426 samples/sec                   batch loss = 1092.5789657831192 | accuracy = 0.665340909090909


Epoch[2] Batch[445] Speed: 1.2465852348460706 samples/sec                   batch loss = 1104.1267473697662 | accuracy = 0.6651685393258427


Epoch[2] Batch[450] Speed: 1.2524896055227384 samples/sec                   batch loss = 1113.5923340320587 | accuracy = 0.6666666666666666


Epoch[2] Batch[455] Speed: 1.2584808530645568 samples/sec                   batch loss = 1128.386362671852 | accuracy = 0.6653846153846154


Epoch[2] Batch[460] Speed: 1.2541629345764016 samples/sec                   batch loss = 1139.5316754579544 | accuracy = 0.6652173913043479


Epoch[2] Batch[465] Speed: 1.2555046701598138 samples/sec                   batch loss = 1149.4476158618927 | accuracy = 0.6666666666666666


Epoch[2] Batch[470] Speed: 1.2560612203109904 samples/sec                   batch loss = 1161.8791844844818 | accuracy = 0.6659574468085107


Epoch[2] Batch[475] Speed: 1.2566703152106833 samples/sec                   batch loss = 1177.3251994848251 | accuracy = 0.6647368421052632


Epoch[2] Batch[480] Speed: 1.2593288554937054 samples/sec                   batch loss = 1186.86925303936 | accuracy = 0.6661458333333333


Epoch[2] Batch[485] Speed: 1.2553165078542505 samples/sec                   batch loss = 1201.100334763527 | accuracy = 0.6649484536082474


Epoch[2] Batch[490] Speed: 1.2558528674688443 samples/sec                   batch loss = 1213.1333084106445 | accuracy = 0.6642857142857143


Epoch[2] Batch[495] Speed: 1.2572641723292286 samples/sec                   batch loss = 1225.179312825203 | accuracy = 0.6656565656565656


Epoch[2] Batch[500] Speed: 1.2522751449760208 samples/sec                   batch loss = 1236.4925869703293 | accuracy = 0.667


Epoch[2] Batch[505] Speed: 1.2526173444764594 samples/sec                   batch loss = 1248.2416373491287 | accuracy = 0.6673267326732674


Epoch[2] Batch[510] Speed: 1.2521346729367377 samples/sec                   batch loss = 1261.4256423711777 | accuracy = 0.667156862745098


Epoch[2] Batch[515] Speed: 1.2534585844370967 samples/sec                   batch loss = 1272.729885339737 | accuracy = 0.666504854368932


Epoch[2] Batch[520] Speed: 1.2508325791430137 samples/sec                   batch loss = 1286.1222739219666 | accuracy = 0.6658653846153846


Epoch[2] Batch[525] Speed: 1.2530173751696485 samples/sec                   batch loss = 1301.6443674564362 | accuracy = 0.6652380952380952


Epoch[2] Batch[530] Speed: 1.2617963124313953 samples/sec                   batch loss = 1315.1307237148285 | accuracy = 0.6650943396226415


Epoch[2] Batch[535] Speed: 1.2565189742257776 samples/sec                   batch loss = 1326.678913474083 | accuracy = 0.6663551401869159


Epoch[2] Batch[540] Speed: 1.2464602048352702 samples/sec                   batch loss = 1338.0634433031082 | accuracy = 0.6680555555555555


Epoch[2] Batch[545] Speed: 1.2511627938194234 samples/sec                   batch loss = 1351.918356537819 | accuracy = 0.6660550458715596


Epoch[2] Batch[550] Speed: 1.2552504812342582 samples/sec                   batch loss = 1366.7586406469345 | accuracy = 0.6645454545454546


Epoch[2] Batch[555] Speed: 1.256668432636226 samples/sec                   batch loss = 1379.7516845464706 | accuracy = 0.663963963963964


Epoch[2] Batch[560] Speed: 1.261161572209309 samples/sec                   batch loss = 1391.3093165159225 | accuracy = 0.6638392857142857


Epoch[2] Batch[565] Speed: 1.2569370388811614 samples/sec                   batch loss = 1405.3116760253906 | accuracy = 0.6619469026548672


Epoch[2] Batch[570] Speed: 1.2625587115730526 samples/sec                   batch loss = 1417.0551471710205 | accuracy = 0.6614035087719298


Epoch[2] Batch[575] Speed: 1.2540567207253808 samples/sec                   batch loss = 1430.4243268966675 | accuracy = 0.66


Epoch[2] Batch[580] Speed: 1.2554317658385379 samples/sec                   batch loss = 1443.7584369182587 | accuracy = 0.6594827586206896


Epoch[2] Batch[585] Speed: 1.262308686741859 samples/sec                   batch loss = 1453.6986498832703 | accuracy = 0.6611111111111111


Epoch[2] Batch[590] Speed: 1.2531383889300987 samples/sec                   batch loss = 1464.661394238472 | accuracy = 0.6627118644067796


Epoch[2] Batch[595] Speed: 1.2556335887277783 samples/sec                   batch loss = 1473.8701565265656 | accuracy = 0.6647058823529411


Epoch[2] Batch[600] Speed: 1.258751556913152 samples/sec                   batch loss = 1487.3471865653992 | accuracy = 0.6629166666666667


Epoch[2] Batch[605] Speed: 1.2610701888484106 samples/sec                   batch loss = 1500.2596422433853 | accuracy = 0.6632231404958677


Epoch[2] Batch[610] Speed: 1.2587499514197948 samples/sec                   batch loss = 1512.6802462339401 | accuracy = 0.6627049180327869


Epoch[2] Batch[615] Speed: 1.2578964092119689 samples/sec                   batch loss = 1524.6218472719193 | accuracy = 0.6630081300813008


Epoch[2] Batch[620] Speed: 1.2543266497581096 samples/sec                   batch loss = 1534.3106429576874 | accuracy = 0.6649193548387097


Epoch[2] Batch[625] Speed: 1.2587119874173935 samples/sec                   batch loss = 1544.7728354930878 | accuracy = 0.6664


Epoch[2] Batch[630] Speed: 1.2539072268580973 samples/sec                   batch loss = 1557.363424897194 | accuracy = 0.6666666666666666


Epoch[2] Batch[635] Speed: 1.2512863426430911 samples/sec                   batch loss = 1569.9268223047256 | accuracy = 0.6661417322834645


Epoch[2] Batch[640] Speed: 1.2600187171616966 samples/sec                   batch loss = 1583.5083869695663 | accuracy = 0.66484375


Epoch[2] Batch[645] Speed: 1.2641500932295866 samples/sec                   batch loss = 1599.182096362114 | accuracy = 0.6647286821705426


Epoch[2] Batch[650] Speed: 1.2594168667955419 samples/sec                   batch loss = 1612.5331239700317 | accuracy = 0.6638461538461539


Epoch[2] Batch[655] Speed: 1.2609636548459893 samples/sec                   batch loss = 1624.5283534526825 | accuracy = 0.6637404580152672


Epoch[2] Batch[660] Speed: 1.260942426021392 samples/sec                   batch loss = 1637.532103419304 | accuracy = 0.6628787878787878


Epoch[2] Batch[665] Speed: 1.2636203291953318 samples/sec                   batch loss = 1647.1639577150345 | accuracy = 0.6642857142857143


Epoch[2] Batch[670] Speed: 1.2580745910072932 samples/sec                   batch loss = 1659.9473613500595 | accuracy = 0.6634328358208955


Epoch[2] Batch[675] Speed: 1.2624299820137836 samples/sec                   batch loss = 1670.6333770751953 | accuracy = 0.664074074074074


Epoch[2] Batch[680] Speed: 1.2611065889321262 samples/sec                   batch loss = 1682.6307159662247 | accuracy = 0.6639705882352941


Epoch[2] Batch[685] Speed: 1.2612814144453317 samples/sec                   batch loss = 1696.0409628152847 | accuracy = 0.6638686131386862


Epoch[2] Batch[690] Speed: 1.2593820767548851 samples/sec                   batch loss = 1707.2393915653229 | accuracy = 0.6634057971014493


Epoch[2] Batch[695] Speed: 1.2599001554935731 samples/sec                   batch loss = 1718.8246405124664 | accuracy = 0.6640287769784172


Epoch[2] Batch[700] Speed: 1.257306194894265 samples/sec                   batch loss = 1732.1111488342285 | accuracy = 0.6628571428571428


Epoch[2] Batch[705] Speed: 1.2608325972245957 samples/sec                   batch loss = 1741.3510538339615 | accuracy = 0.6638297872340425


Epoch[2] Batch[710] Speed: 1.2607696840834555 samples/sec                   batch loss = 1753.2810847759247 | accuracy = 0.6647887323943662


Epoch[2] Batch[715] Speed: 1.2565877696195038 samples/sec                   batch loss = 1765.3828338384628 | accuracy = 0.6646853146853147


Epoch[2] Batch[720] Speed: 1.2623162848215963 samples/sec                   batch loss = 1776.7180625200272 | accuracy = 0.6652777777777777


Epoch[2] Batch[725] Speed: 1.2532689752355386 samples/sec                   batch loss = 1788.4065425395966 | accuracy = 0.6651724137931034


Epoch[2] Batch[730] Speed: 1.2601874670911053 samples/sec                   batch loss = 1800.9218318462372 | accuracy = 0.665068493150685


Epoch[2] Batch[735] Speed: 1.2622329009029396 samples/sec                   batch loss = 1810.1731985807419 | accuracy = 0.6663265306122449


Epoch[2] Batch[740] Speed: 1.2531082502936666 samples/sec                   batch loss = 1821.1334298849106 | accuracy = 0.6662162162162162


Epoch[2] Batch[745] Speed: 1.2591819768099448 samples/sec                   batch loss = 1832.9411498308182 | accuracy = 0.6664429530201342


Epoch[2] Batch[750] Speed: 1.2625958627696303 samples/sec                   batch loss = 1841.2657178640366 | accuracy = 0.6683333333333333


Epoch[2] Batch[755] Speed: 1.2566241937601625 samples/sec                   batch loss = 1852.9590820074081 | accuracy = 0.6685430463576159


Epoch[2] Batch[760] Speed: 1.2557784187784846 samples/sec                   batch loss = 1864.729146361351 | accuracy = 0.66875


Epoch[2] Batch[765] Speed: 1.2528477332051902 samples/sec                   batch loss = 1875.877959370613 | accuracy = 0.6686274509803921


Epoch[2] Batch[770] Speed: 1.2555907382197737 samples/sec                   batch loss = 1883.6458076238632 | accuracy = 0.6704545454545454


Epoch[2] Batch[775] Speed: 1.2561060778669426 samples/sec                   batch loss = 1895.4794834852219 | accuracy = 0.6712903225806451


Epoch[2] Batch[780] Speed: 1.2554467030188028 samples/sec                   batch loss = 1905.314455986023 | accuracy = 0.6714743589743589


Epoch[2] Batch[785] Speed: 1.2608159208406875 samples/sec                   batch loss = 1916.2188985347748 | accuracy = 0.6719745222929936


[Epoch 2] training: accuracy=0.6719543147208121
[Epoch 2] time cost: 642.7746260166168
[Epoch 2] validation: validation accuracy=0.7277777777777777


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen on studying more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).