<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[15:31:55] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If only one GPU is available, the code falls back to `npx.cpu()` instead, which is the case in the output shown here:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[15:31:55] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
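
The 0-based selection described above can be sketched in plain Python, so the example runs even without a GPU. The helper name `pick_device_ids` is hypothetical and not part of MXNet. The commented lines at the end show how its result could feed `npx.gpu(i)`, assuming MXNet is installed.

```python
def pick_device_ids(num_available, num_requested):
    """Return the 0-based IDs of the GPUs to use.

    An empty list means no GPU is available and the caller should
    fall back to the CPU.
    """
    if num_requested < 0:
        raise ValueError("num_requested must be non-negative")
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, the last ID is 7.
print(pick_device_ids(num_available=8, num_requested=8))
# Asking for two GPUs on a one-GPU machine yields only ID 0.
print(pick_device_ids(num_available=1, num_requested=2))

# With MXNet installed, the IDs could be turned into devices like this
# (an assumption-based sketch, left as comments so the block runs anywhere):
# from mxnet import npx
# ids = pick_device_ids(npx.num_gpus(), 2)
# devices = [npx.gpu(i) for i in ids] or [npx.cpu()]
```

Capping the request at the number of available GPUs, and falling back to the CPU when the list is empty, keeps later code from addressing a device that does not exist.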

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[15:31:55] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 8.986392 , -2.3077493]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
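
To make the idea concrete, here is a conceptual sketch in plain Python that needs no GPU. It uses a toy one-parameter model with a squared-error loss, and the helper names `split_batch` and `grad_sum` are hypothetical, chosen only for illustration. It shows that summing the gradients computed on each part and normalizing by the full batch size gives the same result as processing the whole batch at once.

```python
def split_batch(batch, num_parts):
    """Split a list of samples into num_parts nearly equal contiguous chunks."""
    size, rest = divmod(len(batch), num_parts)
    chunks, start = [], 0
    for i in range(num_parts):
        end = start + size + (1 if i < rest else 0)
        chunks.append(batch[start:end])
        start = end
    return chunks


def grad_sum(w, samples):
    """Summed gradient of the squared error (w*x - y)**2 with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in samples)


batch = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]  # (x, y) pairs
w = 0.5

# Each simulated device computes the gradient on its own chunk.
chunks = split_batch(batch, num_parts=2)
per_device = [grad_sum(w, chunk) for chunk in chunks]

# The per-device gradients are summed, then normalized by the full batch size.
parallel_grad = sum(per_device) / len(batch)
full_grad = grad_sum(w, batch) / len(batch)
print(parallel_grad, full_grad)
```

The two printed values should agree up to floating-point rounding. In the MXNet training loop later in this step, splitting is handled by `gluon.utils.split_and_load`, and the normalization by the full batch size is assumed to correspond to the `batch_size` argument passed to `trainer.step`.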

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
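
The call `acc.update(label_list, outputs)` above receives one shard of labels and one shard of network outputs per device. As a rough illustration of what accumulating accuracy over such shards means, here is a plain Python sketch. The function name `shard_accuracy` is hypothetical, and the code is a simplified stand-in rather than the implementation of `gluon.metric.Accuracy`: it assumes that the predicted class is the index of the highest score.

```python
def shard_accuracy(label_shards, output_shards):
    """Accuracy over per-device shards: count argmax matches, divide by total."""
    correct = total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            # Predicted class is the index of the largest score.
            predicted = max(range(len(scores)), key=scores.__getitem__)
            correct += int(predicted == label)
            total += 1
    return correct / total


# Two simulated devices, each holding two samples with two class scores.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],  # both predictions match their labels
    [[1.5, 0.2], [0.8, 0.1]],   # first prediction is wrong, second is right
]
print(shard_accuracy(label_shards, output_shards))  # 3 of 4 correct: 0.75
```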

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7775741186324512 samples/sec                   batch loss = 15.73336148262024 | accuracy = 0.4


Epoch[1] Batch[10] Speed: 1.2570303671160235 samples/sec                   batch loss = 30.13128685951233 | accuracy = 0.5


Epoch[1] Batch[15] Speed: 1.2553432773817246 samples/sec                   batch loss = 44.83116674423218 | accuracy = 0.45


Epoch[1] Batch[20] Speed: 1.2563743496547317 samples/sec                   batch loss = 58.97359108924866 | accuracy = 0.475


Epoch[1] Batch[25] Speed: 1.260740503613395 samples/sec                   batch loss = 73.43080186843872 | accuracy = 0.45


Epoch[1] Batch[30] Speed: 1.2493768648349808 samples/sec                   batch loss = 87.59851932525635 | accuracy = 0.48333333333333334


Epoch[1] Batch[35] Speed: 1.2570966753117019 samples/sec                   batch loss = 101.4458019733429 | accuracy = 0.4928571428571429


Epoch[1] Batch[40] Speed: 1.256504199727148 samples/sec                   batch loss = 114.79361844062805 | accuracy = 0.50625


Epoch[1] Batch[45] Speed: 1.2488150663946904 samples/sec                   batch loss = 128.6646728515625 | accuracy = 0.5055555555555555


Epoch[1] Batch[50] Speed: 1.2520266533393503 samples/sec                   batch loss = 141.97873902320862 | accuracy = 0.52


Epoch[1] Batch[55] Speed: 1.2548472403776028 samples/sec                   batch loss = 155.17685914039612 | accuracy = 0.5272727272727272


Epoch[1] Batch[60] Speed: 1.2510094182042233 samples/sec                   batch loss = 169.9106523990631 | accuracy = 0.5166666666666667


Epoch[1] Batch[65] Speed: 1.2510940313657621 samples/sec                   batch loss = 183.48654103279114 | accuracy = 0.5115384615384615


Epoch[1] Batch[70] Speed: 1.2487820679674961 samples/sec                   batch loss = 197.83540964126587 | accuracy = 0.5071428571428571


Epoch[1] Batch[75] Speed: 1.253686940683818 samples/sec                   batch loss = 210.93641352653503 | accuracy = 0.5166666666666667


Epoch[1] Batch[80] Speed: 1.2519458377161197 samples/sec                   batch loss = 224.64772725105286 | accuracy = 0.525


Epoch[1] Batch[85] Speed: 1.2490156971788258 samples/sec                   batch loss = 238.44157791137695 | accuracy = 0.5294117647058824


Epoch[1] Batch[90] Speed: 1.2569715056162536 samples/sec                   batch loss = 252.9315218925476 | accuracy = 0.5222222222222223


Epoch[1] Batch[95] Speed: 1.2510095114869635 samples/sec                   batch loss = 267.3356137275696 | accuracy = 0.5210526315789473


Epoch[1] Batch[100] Speed: 1.2544779320679447 samples/sec                   batch loss = 280.8889617919922 | accuracy = 0.5175


Epoch[1] Batch[105] Speed: 1.2565262204444527 samples/sec                   batch loss = 294.7234523296356 | accuracy = 0.5214285714285715


Epoch[1] Batch[110] Speed: 1.2551762917555649 samples/sec                   batch loss = 308.59452080726624 | accuracy = 0.5272727272727272


Epoch[1] Batch[115] Speed: 1.2576548266577112 samples/sec                   batch loss = 322.50207471847534 | accuracy = 0.5260869565217391


Epoch[1] Batch[120] Speed: 1.2510933782986378 samples/sec                   batch loss = 336.0408139228821 | accuracy = 0.53125


Epoch[1] Batch[125] Speed: 1.2539038531131497 samples/sec                   batch loss = 348.9274160861969 | accuracy = 0.538


Epoch[1] Batch[130] Speed: 1.2460271466583366 samples/sec                   batch loss = 362.9370901584625 | accuracy = 0.5403846153846154


Epoch[1] Batch[135] Speed: 1.247713119366836 samples/sec                   batch loss = 377.34066224098206 | accuracy = 0.5370370370370371


Epoch[1] Batch[140] Speed: 1.2471325130180835 samples/sec                   batch loss = 390.1668059825897 | accuracy = 0.5410714285714285


Epoch[1] Batch[145] Speed: 1.2452039816326428 samples/sec                   batch loss = 404.7207043170929 | accuracy = 0.5379310344827586


Epoch[1] Batch[150] Speed: 1.2439828470935872 samples/sec                   batch loss = 418.5169417858124 | accuracy = 0.5383333333333333


Epoch[1] Batch[155] Speed: 1.2412182736760429 samples/sec                   batch loss = 432.97782850265503 | accuracy = 0.532258064516129


Epoch[1] Batch[160] Speed: 1.2521673815081806 samples/sec                   batch loss = 446.7047824859619 | accuracy = 0.534375


Epoch[1] Batch[165] Speed: 1.2484462338998683 samples/sec                   batch loss = 461.3100299835205 | accuracy = 0.5333333333333333


Epoch[1] Batch[170] Speed: 1.2496052253189387 samples/sec                   batch loss = 475.2970473766327 | accuracy = 0.5338235294117647


Epoch[1] Batch[175] Speed: 1.247705139323297 samples/sec                   batch loss = 489.0940487384796 | accuracy = 0.53


Epoch[1] Batch[180] Speed: 1.2503074865752226 samples/sec                   batch loss = 502.94205689430237 | accuracy = 0.5277777777777778


Epoch[1] Batch[185] Speed: 1.2500783105077948 samples/sec                   batch loss = 516.8491914272308 | accuracy = 0.522972972972973


Epoch[1] Batch[190] Speed: 1.2482692828224673 samples/sec                   batch loss = 530.853963136673 | accuracy = 0.5197368421052632


Epoch[1] Batch[195] Speed: 1.2478148274669238 samples/sec                   batch loss = 545.2329986095428 | accuracy = 0.5153846153846153


Epoch[1] Batch[200] Speed: 1.2491993708291838 samples/sec                   batch loss = 559.2917864322662 | accuracy = 0.515


Epoch[1] Batch[205] Speed: 1.2521029005321276 samples/sec                   batch loss = 573.1706202030182 | accuracy = 0.5134146341463415


Epoch[1] Batch[210] Speed: 1.2602793852511875 samples/sec                   batch loss = 586.9535481929779 | accuracy = 0.5154761904761904


Epoch[1] Batch[215] Speed: 1.252624545764258 samples/sec                   batch loss = 600.3554797172546 | accuracy = 0.5174418604651163


Epoch[1] Batch[220] Speed: 1.2570950740370537 samples/sec                   batch loss = 613.8131203651428 | accuracy = 0.5193181818181818


Epoch[1] Batch[225] Speed: 1.252566002629332 samples/sec                   batch loss = 626.8764481544495 | accuracy = 0.5233333333333333


Epoch[1] Batch[230] Speed: 1.257516915397267 samples/sec                   batch loss = 640.7015800476074 | accuracy = 0.5228260869565218


Epoch[1] Batch[235] Speed: 1.2555223338211077 samples/sec                   batch loss = 653.8847498893738 | accuracy = 0.5255319148936171


Epoch[1] Batch[240] Speed: 1.2562909022182156 samples/sec                   batch loss = 668.0007071495056 | accuracy = 0.521875


Epoch[1] Batch[245] Speed: 1.2615656567403524 samples/sec                   batch loss = 681.2731192111969 | accuracy = 0.5244897959183673


Epoch[1] Batch[250] Speed: 1.2551955426131418 samples/sec                   batch loss = 695.277574300766 | accuracy = 0.525


Epoch[1] Batch[255] Speed: 1.2490312259742142 samples/sec                   batch loss = 708.9543454647064 | accuracy = 0.5254901960784314


Epoch[1] Batch[260] Speed: 1.2533009004720679 samples/sec                   batch loss = 721.814728975296 | accuracy = 0.5317307692307692


Epoch[1] Batch[265] Speed: 1.257426436404024 samples/sec                   batch loss = 735.9568295478821 | accuracy = 0.5311320754716982


Epoch[1] Batch[270] Speed: 1.251980311544092 samples/sec                   batch loss = 749.7623016834259 | accuracy = 0.5324074074074074


Epoch[1] Batch[275] Speed: 1.2546046696387336 samples/sec                   batch loss = 763.2220754623413 | accuracy = 0.5336363636363637


Epoch[1] Batch[280] Speed: 1.2519339731792438 samples/sec                   batch loss = 777.4609973430634 | accuracy = 0.5330357142857143


Epoch[1] Batch[285] Speed: 1.2555358637701632 samples/sec                   batch loss = 791.075686454773 | accuracy = 0.5350877192982456


Epoch[1] Batch[290] Speed: 1.2565601940906257 samples/sec                   batch loss = 804.5320522785187 | accuracy = 0.5353448275862069


Epoch[1] Batch[295] Speed: 1.2594543060516388 samples/sec                   batch loss = 818.1232614517212 | accuracy = 0.5372881355932203


Epoch[1] Batch[300] Speed: 1.2552918057593188 samples/sec                   batch loss = 831.4595098495483 | accuracy = 0.5366666666666666


Epoch[1] Batch[305] Speed: 1.2546724110982899 samples/sec                   batch loss = 845.3564698696136 | accuracy = 0.5368852459016393


Epoch[1] Batch[310] Speed: 1.2549626938212453 samples/sec                   batch loss = 859.9732823371887 | accuracy = 0.5346774193548387


Epoch[1] Batch[315] Speed: 1.2550808913420806 samples/sec                   batch loss = 873.8722772598267 | accuracy = 0.5349206349206349


Epoch[1] Batch[320] Speed: 1.2525547809281747 samples/sec                   batch loss = 887.4717254638672 | accuracy = 0.53515625


Epoch[1] Batch[325] Speed: 1.2596920405336756 samples/sec                   batch loss = 901.6439833641052 | accuracy = 0.5330769230769231


Epoch[1] Batch[330] Speed: 1.250263321724374 samples/sec                   batch loss = 915.2508165836334 | accuracy = 0.5348484848484848


Epoch[1] Batch[335] Speed: 1.2528440844894568 samples/sec                   batch loss = 929.4797875881195 | accuracy = 0.5350746268656716


Epoch[1] Batch[340] Speed: 1.2557947741735511 samples/sec                   batch loss = 943.0050065517426 | accuracy = 0.5360294117647059


Epoch[1] Batch[345] Speed: 1.259658843097422 samples/sec                   batch loss = 956.8372690677643 | accuracy = 0.5347826086956522


Epoch[1] Batch[350] Speed: 1.2553241159196125 samples/sec                   batch loss = 971.1848273277283 | accuracy = 0.5342857142857143


Epoch[1] Batch[355] Speed: 1.2583776820061958 samples/sec                   batch loss = 984.9542999267578 | accuracy = 0.5359154929577464


Epoch[1] Batch[360] Speed: 1.25496428966809 samples/sec                   batch loss = 999.0542359352112 | accuracy = 0.5375


Epoch[1] Batch[365] Speed: 1.2525813392794283 samples/sec                   batch loss = 1013.0153682231903 | accuracy = 0.5376712328767124


Epoch[1] Batch[370] Speed: 1.2541366841133719 samples/sec                   batch loss = 1026.863165140152 | accuracy = 0.5378378378378378


Epoch[1] Batch[375] Speed: 1.2518348617007782 samples/sec                   batch loss = 1040.225819349289 | accuracy = 0.538


Epoch[1] Batch[380] Speed: 1.2481968448858811 samples/sec                   batch loss = 1053.3253600597382 | accuracy = 0.5421052631578948


Epoch[1] Batch[385] Speed: 1.2479600872140075 samples/sec                   batch loss = 1067.0337846279144 | accuracy = 0.5402597402597402


Epoch[1] Batch[390] Speed: 1.2414159195908478 samples/sec                   batch loss = 1081.1614289283752 | accuracy = 0.5397435897435897


Epoch[1] Batch[395] Speed: 1.2445295038784876 samples/sec                   batch loss = 1094.822985649109 | accuracy = 0.540506329113924


Epoch[1] Batch[400] Speed: 1.2487602248975036 samples/sec                   batch loss = 1108.7737472057343 | accuracy = 0.539375


Epoch[1] Batch[405] Speed: 1.2460939649854852 samples/sec                   batch loss = 1121.9626536369324 | accuracy = 0.5407407407407407


Epoch[1] Batch[410] Speed: 1.2432031888192145 samples/sec                   batch loss = 1134.9624242782593 | accuracy = 0.5432926829268293


Epoch[1] Batch[415] Speed: 1.2447314380154055 samples/sec                   batch loss = 1148.5795328617096 | accuracy = 0.5439759036144578


Epoch[1] Batch[420] Speed: 1.2502060238781283 samples/sec                   batch loss = 1162.1167919635773 | accuracy = 0.5446428571428571


Epoch[1] Batch[425] Speed: 1.255031225017237 samples/sec                   batch loss = 1175.6556887626648 | accuracy = 0.5458823529411765


Epoch[1] Batch[430] Speed: 1.25395249302532 samples/sec                   batch loss = 1189.709478378296 | accuracy = 0.5447674418604651


Epoch[1] Batch[435] Speed: 1.257723557915224 samples/sec                   batch loss = 1203.4644751548767 | accuracy = 0.5459770114942529


Epoch[1] Batch[440] Speed: 1.2577212950381857 samples/sec                   batch loss = 1216.8253207206726 | accuracy = 0.5477272727272727


Epoch[1] Batch[445] Speed: 1.261179585015091 samples/sec                   batch loss = 1230.1047360897064 | accuracy = 0.55


Epoch[1] Batch[450] Speed: 1.2598548373349874 samples/sec                   batch loss = 1243.5774834156036 | accuracy = 0.55


Epoch[1] Batch[455] Speed: 1.2604907248025123 samples/sec                   batch loss = 1257.0957868099213 | accuracy = 0.5516483516483517


Epoch[1] Batch[460] Speed: 1.2615522811078568 samples/sec                   batch loss = 1270.785878419876 | accuracy = 0.5516304347826086


Epoch[1] Batch[465] Speed: 1.2565614175525972 samples/sec                   batch loss = 1284.2943978309631 | accuracy = 0.5521505376344086


Epoch[1] Batch[470] Speed: 1.2596374690359688 samples/sec                   batch loss = 1297.453776359558 | accuracy = 0.5531914893617021


Epoch[1] Batch[475] Speed: 1.264611475230909 samples/sec                   batch loss = 1310.5806350708008 | accuracy = 0.5542105263157895


Epoch[1] Batch[480] Speed: 1.2566095109067912 samples/sec                   batch loss = 1324.216775894165 | accuracy = 0.5541666666666667


Epoch[1] Batch[485] Speed: 1.260237731694432 samples/sec                   batch loss = 1337.4704895019531 | accuracy = 0.5551546391752578


Epoch[1] Batch[490] Speed: 1.262683666102178 samples/sec                   batch loss = 1350.748062133789 | accuracy = 0.5561224489795918


Epoch[1] Batch[495] Speed: 1.2680250037374399 samples/sec                   batch loss = 1363.794589996338 | accuracy = 0.557070707070707


Epoch[1] Batch[500] Speed: 1.265008044840893 samples/sec                   batch loss = 1376.6798582077026 | accuracy = 0.558


Epoch[1] Batch[505] Speed: 1.26594453095914 samples/sec                   batch loss = 1389.8120658397675 | accuracy = 0.557920792079208


Epoch[1] Batch[510] Speed: 1.26267701390703 samples/sec                   batch loss = 1403.160136461258 | accuracy = 0.5598039215686275


Epoch[1] Batch[515] Speed: 1.262789635518608 samples/sec                   batch loss = 1417.080931186676 | accuracy = 0.5597087378640777


Epoch[1] Batch[520] Speed: 1.2613448528214382 samples/sec                   batch loss = 1430.6306953430176 | accuracy = 0.5591346153846154


Epoch[1] Batch[525] Speed: 1.2580409127604064 samples/sec                   batch loss = 1444.6600978374481 | accuracy = 0.559047619047619


Epoch[1] Batch[530] Speed: 1.2502632285528812 samples/sec                   batch loss = 1457.1207926273346 | accuracy = 0.5608490566037736


Epoch[1] Batch[535] Speed: 1.2533426585149519 samples/sec                   batch loss = 1469.2811193466187 | accuracy = 0.5630841121495327


Epoch[1] Batch[540] Speed: 1.2535829617726004 samples/sec                   batch loss = 1482.3105299472809 | accuracy = 0.5634259259259259


Epoch[1] Batch[545] Speed: 1.2609904762133823 samples/sec                   batch loss = 1496.1508781909943 | accuracy = 0.563302752293578


Epoch[1] Batch[550] Speed: 1.2531489658759762 samples/sec                   batch loss = 1509.8986175060272 | accuracy = 0.5636363636363636


Epoch[1] Batch[555] Speed: 1.25434962581207 samples/sec                   batch loss = 1523.14559674263 | accuracy = 0.5648648648648649


Epoch[1] Batch[560] Speed: 1.2515670309694138 samples/sec                   batch loss = 1536.229599237442 | accuracy = 0.565625


Epoch[1] Batch[565] Speed: 1.2629975393056532 samples/sec                   batch loss = 1549.9982781410217 | accuracy = 0.5668141592920354


Epoch[1] Batch[570] Speed: 1.2570764241974768 samples/sec                   batch loss = 1562.6624448299408 | accuracy = 0.5684210526315789


Epoch[1] Batch[575] Speed: 1.256111062237604 samples/sec                   batch loss = 1576.1572036743164 | accuracy = 0.568695652173913


Epoch[1] Batch[580] Speed: 1.253925689061674 samples/sec                   batch loss = 1588.8378131389618 | accuracy = 0.5698275862068966


Epoch[1] Batch[585] Speed: 1.2537430589920475 samples/sec                   batch loss = 1600.9226441383362 | accuracy = 0.5717948717948718


Epoch[1] Batch[590] Speed: 1.251571605915709 samples/sec                   batch loss = 1613.727796792984 | accuracy = 0.5711864406779661


Epoch[1] Batch[595] Speed: 1.252348898403244 samples/sec                   batch loss = 1626.0771362781525 | accuracy = 0.5722689075630252


Epoch[1] Batch[600] Speed: 1.251900996553194 samples/sec                   batch loss = 1639.3764209747314 | accuracy = 0.5729166666666666


Epoch[1] Batch[605] Speed: 1.2550131057677347 samples/sec                   batch loss = 1652.6395075321198 | accuracy = 0.5727272727272728


Epoch[1] Batch[610] Speed: 1.2544340348687348 samples/sec                   batch loss = 1664.6339423656464 | accuracy = 0.5737704918032787


Epoch[1] Batch[615] Speed: 1.2505568808589516 samples/sec                   batch loss = 1679.7351295948029 | accuracy = 0.5727642276422764


Epoch[1] Batch[620] Speed: 1.2520654298565812 samples/sec                   batch loss = 1692.6843750476837 | accuracy = 0.5733870967741935


Epoch[1] Batch[625] Speed: 1.256515962834808 samples/sec                   batch loss = 1705.0375294685364 | accuracy = 0.5748


Epoch[1] Batch[630] Speed: 1.25944125878546 samples/sec                   batch loss = 1719.0683317184448 | accuracy = 0.575


Epoch[1] Batch[635] Speed: 1.2573042161902905 samples/sec                   batch loss = 1733.0444297790527 | accuracy = 0.574015748031496


Epoch[1] Batch[640] Speed: 1.2587248307041776 samples/sec                   batch loss = 1746.1211013793945 | accuracy = 0.57421875


Epoch[1] Batch[645] Speed: 1.2624730156169295 samples/sec                   batch loss = 1759.2401304244995 | accuracy = 0.5740310077519379


Epoch[1] Batch[650] Speed: 1.2593995660731119 samples/sec                   batch loss = 1772.2139019966125 | accuracy = 0.5742307692307692


Epoch[1] Batch[655] Speed: 1.2573752649882193 samples/sec                   batch loss = 1784.7898094654083 | accuracy = 0.5748091603053435


Epoch[1] Batch[660] Speed: 1.2571623310848024 samples/sec                   batch loss = 1798.192550420761 | accuracy = 0.5761363636363637


Epoch[1] Batch[665] Speed: 1.2606412243383502 samples/sec                   batch loss = 1810.491760969162 | accuracy = 0.5766917293233083


Epoch[1] Batch[670] Speed: 1.2572465538384157 samples/sec                   batch loss = 1823.290512561798 | accuracy = 0.5772388059701492


Epoch[1] Batch[675] Speed: 1.2534111065272913 samples/sec                   batch loss = 1836.7074999809265 | accuracy = 0.577037037037037


Epoch[1] Batch[680] Speed: 1.2518203839577742 samples/sec                   batch loss = 1849.4785096645355 | accuracy = 0.5772058823529411


Epoch[1] Batch[685] Speed: 1.2532427621811058 samples/sec                   batch loss = 1863.0982496738434 | accuracy = 0.5762773722627738


Epoch[1] Batch[690] Speed: 1.2523135629776778 samples/sec                   batch loss = 1877.076791524887 | accuracy = 0.5757246376811594


Epoch[1] Batch[695] Speed: 1.2570097414191104 samples/sec                   batch loss = 1890.103447675705 | accuracy = 0.576978417266187


Epoch[1] Batch[700] Speed: 1.2582936851300281 samples/sec                   batch loss = 1903.2697393894196 | accuracy = 0.5785714285714286


Epoch[1] Batch[705] Speed: 1.2525845188758407 samples/sec                   batch loss = 1915.9136757850647 | accuracy = 0.5794326241134752


Epoch[1] Batch[710] Speed: 1.2569845959371382 samples/sec                   batch loss = 1928.5575649738312 | accuracy = 0.5799295774647887


Epoch[1] Batch[715] Speed: 1.2519293021634286 samples/sec                   batch loss = 1940.20123898983 | accuracy = 0.5807692307692308


Epoch[1] Batch[720] Speed: 1.2594493896587302 samples/sec                   batch loss = 1951.3410662412643 | accuracy = 0.5829861111111111


Epoch[1] Batch[725] Speed: 1.2554110046877702 samples/sec                   batch loss = 1965.9963101148605 | accuracy = 0.5827586206896552


Epoch[1] Batch[730] Speed: 1.261599523971699 samples/sec                   batch loss = 1977.744250535965 | accuracy = 0.5845890410958904


Epoch[1] Batch[735] Speed: 1.2571150431656142 samples/sec                   batch loss = 1989.6934517621994 | accuracy = 0.5853741496598639


Epoch[1] Batch[740] Speed: 1.2611379667323048 samples/sec                   batch loss = 2002.140342593193 | accuracy = 0.585472972972973


Epoch[1] Batch[745] Speed: 1.2588334425053591 samples/sec                   batch loss = 2015.583558678627 | accuracy = 0.5855704697986577


Epoch[1] Batch[750] Speed: 1.2631378916687803 samples/sec                   batch loss = 2030.1137894392014 | accuracy = 0.5856666666666667


Epoch[1] Batch[755] Speed: 1.2599354472570568 samples/sec                   batch loss = 2041.7741235494614 | accuracy = 0.5870860927152318


Epoch[1] Batch[760] Speed: 1.2562453730039287 samples/sec                   batch loss = 2054.8790711164474 | accuracy = 0.5865131578947368


Epoch[1] Batch[765] Speed: 1.2547255210143813 samples/sec                   batch loss = 2067.712924838066 | accuracy = 0.5879084967320262


Epoch[1] Batch[770] Speed: 1.2546159281002232 samples/sec                   batch loss = 2079.8994106054306 | accuracy = 0.5883116883116883


Epoch[1] Batch[775] Speed: 1.2562683253358362 samples/sec                   batch loss = 2092.2828134298325 | accuracy = 0.5890322580645161


Epoch[1] Batch[780] Speed: 1.2552887063255576 samples/sec                   batch loss = 2104.318611264229 | accuracy = 0.5894230769230769


Epoch[1] Batch[785] Speed: 1.2578379380990061 samples/sec                   batch loss = 2116.0241878032684 | accuracy = 0.5907643312101911


[Epoch 1] training: accuracy=0.5916878172588832
[Epoch 1] time cost: 646.0949518680573
[Epoch 1] validation: validation accuracy=0.6722222222222223


Epoch[2] Batch[5] Speed: 1.2551697184272432 samples/sec                   batch loss = 14.142788171768188 | accuracy = 0.65


Epoch[2] Batch[10] Speed: 1.2530211184777942 samples/sec                   batch loss = 25.308789491653442 | accuracy = 0.7


Epoch[2] Batch[15] Speed: 1.2588076573116036 samples/sec                   batch loss = 36.21313464641571 | accuracy = 0.7333333333333333


Epoch[2] Batch[20] Speed: 1.2555691261345403 samples/sec                   batch loss = 49.73885142803192 | accuracy = 0.6875


Epoch[2] Batch[25] Speed: 1.2554773300470987 samples/sec                   batch loss = 61.746466636657715 | accuracy = 0.69


Epoch[2] Batch[30] Speed: 1.2556207144739389 samples/sec                   batch loss = 74.16657948493958 | accuracy = 0.675


Epoch[2] Batch[35] Speed: 1.2624200077563676 samples/sec                   batch loss = 85.31881666183472 | accuracy = 0.6857142857142857


Epoch[2] Batch[40] Speed: 1.2565383604007 samples/sec                   batch loss = 98.9369285106659 | accuracy = 0.65625


Epoch[2] Batch[45] Speed: 1.2633256474545298 samples/sec                   batch loss = 114.80808186531067 | accuracy = 0.6444444444444445


Epoch[2] Batch[50] Speed: 1.2620685395872713 samples/sec                   batch loss = 129.1366159915924 | accuracy = 0.625


Epoch[2] Batch[55] Speed: 1.2556507861356443 samples/sec                   batch loss = 142.5361258983612 | accuracy = 0.6136363636363636


Epoch[2] Batch[60] Speed: 1.257708849360025 samples/sec                   batch loss = 154.25192427635193 | accuracy = 0.6208333333333333


Epoch[2] Batch[65] Speed: 1.253722353655749 samples/sec                   batch loss = 166.82114934921265 | accuracy = 0.6230769230769231


Epoch[2] Batch[70] Speed: 1.2533919103659639 samples/sec                   batch loss = 178.06597447395325 | accuracy = 0.6357142857142857


Epoch[2] Batch[75] Speed: 1.2589355546902794 samples/sec                   batch loss = 190.49734950065613 | accuracy = 0.64


Epoch[2] Batch[80] Speed: 1.2552609060282798 samples/sec                   batch loss = 201.5907006263733 | accuracy = 0.640625


Epoch[2] Batch[85] Speed: 1.2587994402740106 samples/sec                   batch loss = 215.87272095680237 | accuracy = 0.638235294117647


Epoch[2] Batch[90] Speed: 1.2592419906650516 samples/sec                   batch loss = 231.9356393814087 | accuracy = 0.6222222222222222


Epoch[2] Batch[95] Speed: 1.2603096804765093 samples/sec                   batch loss = 244.11721682548523 | accuracy = 0.6263157894736842


Epoch[2] Batch[100] Speed: 1.2561560174055146 samples/sec                   batch loss = 255.69890189170837 | accuracy = 0.63


Epoch[2] Batch[105] Speed: 1.2580806287547621 samples/sec                   batch loss = 268.3046684265137 | accuracy = 0.6285714285714286


Epoch[2] Batch[110] Speed: 1.265019872326401 samples/sec                   batch loss = 279.6109952926636 | accuracy = 0.6318181818181818


Epoch[2] Batch[115] Speed: 1.2546537392455228 samples/sec                   batch loss = 291.34339916706085 | accuracy = 0.6304347826086957


Epoch[2] Batch[120] Speed: 1.2597526705669415 samples/sec                   batch loss = 305.0925546884537 | accuracy = 0.63125


Epoch[2] Batch[125] Speed: 1.2588927619062769 samples/sec                   batch loss = 316.69594609737396 | accuracy = 0.638


Epoch[2] Batch[130] Speed: 1.2589853415680736 samples/sec                   batch loss = 326.26313996315 | accuracy = 0.6442307692307693


Epoch[2] Batch[135] Speed: 1.2548446124164216 samples/sec                   batch loss = 340.05866730213165 | accuracy = 0.6388888888888888


Epoch[2] Batch[140] Speed: 1.2557221181185412 samples/sec                   batch loss = 353.7709504365921 | accuracy = 0.6392857142857142


Epoch[2] Batch[145] Speed: 1.2586374826374107 samples/sec                   batch loss = 367.9436591863632 | accuracy = 0.6327586206896552


Epoch[2] Batch[150] Speed: 1.2583465357889114 samples/sec                   batch loss = 380.75857961177826 | accuracy = 0.6333333333333333


Epoch[2] Batch[155] Speed: 1.2600593152311323 samples/sec                   batch loss = 393.57950615882874 | accuracy = 0.632258064516129


Epoch[2] Batch[160] Speed: 1.2566046166985825 samples/sec                   batch loss = 406.89292097091675 | accuracy = 0.6296875


Epoch[2] Batch[165] Speed: 1.2566203347718705 samples/sec                   batch loss = 418.53063917160034 | accuracy = 0.6318181818181818


Epoch[2] Batch[170] Speed: 1.2617885308155747 samples/sec                   batch loss = 428.7598489522934 | accuracy = 0.6367647058823529


Epoch[2] Batch[175] Speed: 1.2578964092119689 samples/sec                   batch loss = 441.2339553833008 | accuracy = 0.6385714285714286


Epoch[2] Batch[180] Speed: 1.2592361307925468 samples/sec                   batch loss = 454.366192817688 | accuracy = 0.6361111111111111


Epoch[2] Batch[185] Speed: 1.2542495689016497 samples/sec                   batch loss = 465.4012632369995 | accuracy = 0.6378378378378379


Epoch[2] Batch[190] Speed: 1.2536391644475888 samples/sec                   batch loss = 478.9857223033905 | accuracy = 0.6381578947368421


Epoch[2] Batch[195] Speed: 1.2602843081261939 samples/sec                   batch loss = 492.33982384204865 | accuracy = 0.6346153846153846


Epoch[2] Batch[200] Speed: 1.2579344183505954 samples/sec                   batch loss = 502.95148253440857 | accuracy = 0.6375


Epoch[2] Batch[205] Speed: 1.2551956365212114 samples/sec                   batch loss = 516.919392824173 | accuracy = 0.6353658536585366


Epoch[2] Batch[210] Speed: 1.2536319514890957 samples/sec                   batch loss = 528.3252127170563 | accuracy = 0.6369047619047619


Epoch[2] Batch[215] Speed: 1.2577117721865536 samples/sec                   batch loss = 540.4565081596375 | accuracy = 0.6360465116279069


Epoch[2] Batch[220] Speed: 1.2561409692922898 samples/sec                   batch loss = 552.8536990880966 | accuracy = 0.6363636363636364


Epoch[2] Batch[225] Speed: 1.2574750672444817 samples/sec                   batch loss = 565.0710700750351 | accuracy = 0.6366666666666667


Epoch[2] Batch[230] Speed: 1.2589991352137306 samples/sec                   batch loss = 577.4030722379684 | accuracy = 0.6347826086956522


Epoch[2] Batch[235] Speed: 1.2570403505731533 samples/sec                   batch loss = 590.6692920923233 | accuracy = 0.6319148936170212


Epoch[2] Batch[240] Speed: 1.254369882949664 samples/sec                   batch loss = 603.9551147222519 | accuracy = 0.6322916666666667


Epoch[2] Batch[245] Speed: 1.2611773096849157 samples/sec                   batch loss = 613.8841584920883 | accuracy = 0.6357142857142857


Epoch[2] Batch[250] Speed: 1.2580198765631223 samples/sec                   batch loss = 623.4702085256577 | accuracy = 0.639


Epoch[2] Batch[255] Speed: 1.2555938391452104 samples/sec                   batch loss = 637.0688155889511 | accuracy = 0.6401960784313725


Epoch[2] Batch[260] Speed: 1.261315835373225 samples/sec                   batch loss = 648.6234756708145 | accuracy = 0.6423076923076924


Epoch[2] Batch[265] Speed: 1.2589124103288767 samples/sec                   batch loss = 663.175420165062 | accuracy = 0.6424528301886793


Epoch[2] Batch[270] Speed: 1.2573756419263313 samples/sec                   batch loss = 674.5738769769669 | accuracy = 0.6444444444444445


Epoch[2] Batch[275] Speed: 1.2594039148438625 samples/sec                   batch loss = 687.0773614645004 | accuracy = 0.6445454545454545


Epoch[2] Batch[280] Speed: 1.259867703954651 samples/sec                   batch loss = 698.4553006887436 | accuracy = 0.6446428571428572


Epoch[2] Batch[285] Speed: 1.257047226065086 samples/sec                   batch loss = 711.0380322933197 | accuracy = 0.6447368421052632


Epoch[2] Batch[290] Speed: 1.258209322028369 samples/sec                   batch loss = 721.4416393041611 | accuracy = 0.646551724137931


Epoch[2] Batch[295] Speed: 1.2571943607799376 samples/sec                   batch loss = 737.2938264608383 | accuracy = 0.6432203389830509


Epoch[2] Batch[300] Speed: 1.2580259137854306 samples/sec                   batch loss = 753.6774911880493 | accuracy = 0.6425


Epoch[2] Batch[305] Speed: 1.2626812903101503 samples/sec                   batch loss = 765.2598024606705 | accuracy = 0.6442622950819672


Epoch[2] Batch[310] Speed: 1.253944714134631 samples/sec                   batch loss = 775.5386273860931 | accuracy = 0.6459677419354839


Epoch[2] Batch[315] Speed: 1.2518778298775375 samples/sec                   batch loss = 786.451728105545 | accuracy = 0.6476190476190476


Epoch[2] Batch[320] Speed: 1.2527131190236116 samples/sec                   batch loss = 797.294278383255 | accuracy = 0.65


Epoch[2] Batch[325] Speed: 1.247892511700004 samples/sec                   batch loss = 809.7630071640015 | accuracy = 0.6507692307692308


Epoch[2] Batch[330] Speed: 1.2477575682211983 samples/sec                   batch loss = 822.0308500528336 | accuracy = 0.6484848484848484


Epoch[2] Batch[335] Speed: 1.2518580267861488 samples/sec                   batch loss = 835.4193818569183 | accuracy = 0.6492537313432836


Epoch[2] Batch[340] Speed: 1.2529834992480104 samples/sec                   batch loss = 849.5963151454926 | accuracy = 0.6463235294117647


Epoch[2] Batch[345] Speed: 1.2494566977635408 samples/sec                   batch loss = 860.4191504716873 | accuracy = 0.6478260869565218


Epoch[2] Batch[350] Speed: 1.2484018288624377 samples/sec                   batch loss = 869.9234981536865 | accuracy = 0.6514285714285715


Epoch[2] Batch[355] Speed: 1.2513897542207288 samples/sec                   batch loss = 882.2852170467377 | accuracy = 0.6507042253521127


Epoch[2] Batch[360] Speed: 1.2534747858092672 samples/sec                   batch loss = 894.7824966907501 | accuracy = 0.65


Epoch[2] Batch[365] Speed: 1.246313072011855 samples/sec                   batch loss = 906.7637786865234 | accuracy = 0.6506849315068494


Epoch[2] Batch[370] Speed: 1.2568853423224788 samples/sec                   batch loss = 917.9650665521622 | accuracy = 0.6513513513513514


Epoch[2] Batch[375] Speed: 1.2506268894364498 samples/sec                   batch loss = 929.405711889267 | accuracy = 0.6526666666666666


Epoch[2] Batch[380] Speed: 1.2514139296084439 samples/sec                   batch loss = 944.5834400653839 | accuracy = 0.65


Epoch[2] Batch[385] Speed: 1.2500751436222997 samples/sec                   batch loss = 956.1019794940948 | accuracy = 0.6506493506493507


Epoch[2] Batch[390] Speed: 1.2516506924164537 samples/sec                   batch loss = 968.6867878437042 | accuracy = 0.6512820512820513


Epoch[2] Batch[395] Speed: 1.2477832739100034 samples/sec                   batch loss = 982.1390886306763 | accuracy = 0.6518987341772152


Epoch[2] Batch[400] Speed: 1.2488153452620696 samples/sec                   batch loss = 992.7039473056793 | accuracy = 0.65375


Epoch[2] Batch[405] Speed: 1.2525286912501654 samples/sec                   batch loss = 1004.2467179298401 | accuracy = 0.6537037037037037


Epoch[2] Batch[410] Speed: 1.2499846147404348 samples/sec                   batch loss = 1018.1321358680725 | accuracy = 0.6530487804878049


Epoch[2] Batch[415] Speed: 1.2576923497876487 samples/sec                   batch loss = 1028.4382907152176 | accuracy = 0.6542168674698795


Epoch[2] Batch[420] Speed: 1.253903571968557 samples/sec                   batch loss = 1037.5697515010834 | accuracy = 0.655952380952381


Epoch[2] Batch[425] Speed: 1.260125848726307 samples/sec                   batch loss = 1048.2193723917007 | accuracy = 0.6558823529411765


Epoch[2] Batch[430] Speed: 1.261205562283185 samples/sec                   batch loss = 1058.79463160038 | accuracy = 0.6575581395348837


Epoch[2] Batch[435] Speed: 1.2569962738635725 samples/sec                   batch loss = 1069.569092988968 | accuracy = 0.6591954022988505


Epoch[2] Batch[440] Speed: 1.260605229990432 samples/sec                   batch loss = 1079.3211938142776 | accuracy = 0.6607954545454545


Epoch[2] Batch[445] Speed: 1.2546113308707063 samples/sec                   batch loss = 1093.4486606121063 | accuracy = 0.6612359550561798


Epoch[2] Batch[450] Speed: 1.2584345042291691 samples/sec                   batch loss = 1103.5228908061981 | accuracy = 0.6616666666666666


Epoch[2] Batch[455] Speed: 1.2570056917042363 samples/sec                   batch loss = 1114.2347769737244 | accuracy = 0.6626373626373626


Epoch[2] Batch[460] Speed: 1.2575317137214979 samples/sec                   batch loss = 1124.8262687921524 | accuracy = 0.6641304347826087


Epoch[2] Batch[465] Speed: 1.2575090922200938 samples/sec                   batch loss = 1137.2511245012283 | accuracy = 0.6639784946236559


Epoch[2] Batch[470] Speed: 1.2617911879418928 samples/sec                   batch loss = 1146.512799859047 | accuracy = 0.6659574468085107


Epoch[2] Batch[475] Speed: 1.2650900787283539 samples/sec                   batch loss = 1158.6227650642395 | accuracy = 0.6652631578947369


Epoch[2] Batch[480] Speed: 1.2592176063930234 samples/sec                   batch loss = 1167.5871515274048 | accuracy = 0.6677083333333333


Epoch[2] Batch[485] Speed: 1.2570628610253491 samples/sec                   batch loss = 1181.2851057052612 | accuracy = 0.6670103092783505


Epoch[2] Batch[490] Speed: 1.2558207180878596 samples/sec                   batch loss = 1191.5496846437454 | accuracy = 0.6673469387755102


Epoch[2] Batch[495] Speed: 1.2582860410272318 samples/sec                   batch loss = 1204.8854686021805 | accuracy = 0.6666666666666666


Epoch[2] Batch[500] Speed: 1.2519207075853986 samples/sec                   batch loss = 1213.6793533563614 | accuracy = 0.669


Epoch[2] Batch[505] Speed: 1.2565807109217755 samples/sec                   batch loss = 1225.911232471466 | accuracy = 0.6693069306930693


Epoch[2] Batch[510] Speed: 1.25912877245944 samples/sec                   batch loss = 1239.1036685705185 | accuracy = 0.6691176470588235


Epoch[2] Batch[515] Speed: 1.2592254508426164 samples/sec                   batch loss = 1249.1362953186035 | accuracy = 0.670873786407767


Epoch[2] Batch[520] Speed: 1.2605462226513566 samples/sec                   batch loss = 1260.6467306613922 | accuracy = 0.6721153846153847


Epoch[2] Batch[525] Speed: 1.255435429642307 samples/sec                   batch loss = 1274.409621477127 | accuracy = 0.6719047619047619


Epoch[2] Batch[530] Speed: 1.258941789643343 samples/sec                   batch loss = 1283.3745900392532 | accuracy = 0.6735849056603773


Epoch[2] Batch[535] Speed: 1.2523069261228295 samples/sec                   batch loss = 1297.9149156808853 | accuracy = 0.6733644859813084


Epoch[2] Batch[540] Speed: 1.2528148019764902 samples/sec                   batch loss = 1308.0886729955673 | accuracy = 0.674537037037037


Epoch[2] Batch[545] Speed: 1.257820492126334 samples/sec                   batch loss = 1316.7342633008957 | accuracy = 0.6752293577981652


Epoch[2] Batch[550] Speed: 1.2616480986372467 samples/sec                   batch loss = 1328.0029306411743 | accuracy = 0.675


Epoch[2] Batch[555] Speed: 1.2602478608287226 samples/sec                   batch loss = 1338.9862485527992 | accuracy = 0.6756756756756757


Epoch[2] Batch[560] Speed: 1.2482265619966368 samples/sec                   batch loss = 1346.5790370106697 | accuracy = 0.678125


Epoch[2] Batch[565] Speed: 1.2533715911675085 samples/sec                   batch loss = 1360.97774964571 | accuracy = 0.6778761061946903


Epoch[2] Batch[570] Speed: 1.2471676493768256 samples/sec                   batch loss = 1370.8840389847755 | accuracy = 0.6793859649122806


Epoch[2] Batch[575] Speed: 1.2519247246017464 samples/sec                   batch loss = 1385.6132989525795 | accuracy = 0.6782608695652174


Epoch[2] Batch[580] Speed: 1.2607482723079009 samples/sec                   batch loss = 1398.3517456650734 | accuracy = 0.678448275862069


Epoch[2] Batch[585] Speed: 1.2577709859202233 samples/sec                   batch loss = 1409.3151579499245 | accuracy = 0.6790598290598291


Epoch[2] Batch[590] Speed: 1.2537303171656988 samples/sec                   batch loss = 1420.270888209343 | accuracy = 0.6788135593220339


Epoch[2] Batch[595] Speed: 1.2566264526911362 samples/sec                   batch loss = 1430.803921341896 | accuracy = 0.6785714285714286


Epoch[2] Batch[600] Speed: 1.2557218361579912 samples/sec                   batch loss = 1442.9514886140823 | accuracy = 0.6791666666666667


Epoch[2] Batch[605] Speed: 1.2570675704269465 samples/sec                   batch loss = 1454.269182920456 | accuracy = 0.6789256198347108


Epoch[2] Batch[610] Speed: 1.2593430347862293 samples/sec                   batch loss = 1465.2857064008713 | accuracy = 0.6790983606557377


Epoch[2] Batch[615] Speed: 1.2507160195522782 samples/sec                   batch loss = 1478.8477774858475 | accuracy = 0.6780487804878049


Epoch[2] Batch[620] Speed: 1.2540345052752935 samples/sec                   batch loss = 1487.8907722830772 | accuracy = 0.6786290322580645


Epoch[2] Batch[625] Speed: 1.2501591646656827 samples/sec                   batch loss = 1498.8450148701668 | accuracy = 0.6792


Epoch[2] Batch[630] Speed: 1.250659798972649 samples/sec                   batch loss = 1509.1096048355103 | accuracy = 0.6793650793650794


Epoch[2] Batch[635] Speed: 1.25905875369272 samples/sec                   batch loss = 1521.399863600731 | accuracy = 0.6787401574803149


Epoch[2] Batch[640] Speed: 1.2592271520614484 samples/sec                   batch loss = 1531.6741845607758 | accuracy = 0.679296875


Epoch[2] Batch[645] Speed: 1.2570761416284055 samples/sec                   batch loss = 1541.571043729782 | accuracy = 0.6798449612403101


Epoch[2] Batch[650] Speed: 1.2595602069084604 samples/sec                   batch loss = 1556.6031069755554 | accuracy = 0.6788461538461539


Epoch[2] Batch[655] Speed: 1.2640267529923717 samples/sec                   batch loss = 1565.8361518383026 | accuracy = 0.6793893129770993


Epoch[2] Batch[660] Speed: 1.2539688008925434 samples/sec                   batch loss = 1574.8510669469833 | accuracy = 0.6799242424242424


Epoch[2] Batch[665] Speed: 1.2523655385270438 samples/sec                   batch loss = 1584.6937551498413 | accuracy = 0.6808270676691729


Epoch[2] Batch[670] Speed: 1.2518160874017081 samples/sec                   batch loss = 1595.9334661960602 | accuracy = 0.6805970149253732


Epoch[2] Batch[675] Speed: 1.2507815699928415 samples/sec                   batch loss = 1610.6214451789856 | accuracy = 0.6792592592592592


Epoch[2] Batch[680] Speed: 1.2480439170116902 samples/sec                   batch loss = 1625.8212493658066 | accuracy = 0.6783088235294118


Epoch[2] Batch[685] Speed: 1.2550386418557666 samples/sec                   batch loss = 1636.3948838114738 | accuracy = 0.6791970802919708


Epoch[2] Batch[690] Speed: 1.2544700528572454 samples/sec                   batch loss = 1645.1196158528328 | accuracy = 0.6800724637681159


Epoch[2] Batch[695] Speed: 1.2509536376170345 samples/sec                   batch loss = 1659.495018541813 | accuracy = 0.6794964028776979


Epoch[2] Batch[700] Speed: 1.250659985433879 samples/sec                   batch loss = 1670.3588927388191 | accuracy = 0.68


Epoch[2] Batch[705] Speed: 1.2568618024935692 samples/sec                   batch loss = 1678.5473304390907 | accuracy = 0.6815602836879433


Epoch[2] Batch[710] Speed: 1.2625807549860273 samples/sec                   batch loss = 1691.7343146204948 | accuracy = 0.6809859154929577


Epoch[2] Batch[715] Speed: 1.2580127074368912 samples/sec                   batch loss = 1702.46269851923 | accuracy = 0.6811188811188811


Epoch[2] Batch[720] Speed: 1.2533884457555824 samples/sec                   batch loss = 1713.418395936489 | accuracy = 0.6815972222222222


Epoch[2] Batch[725] Speed: 1.2562922192280699 samples/sec                   batch loss = 1721.8875405192375 | accuracy = 0.6820689655172414


Epoch[2] Batch[730] Speed: 1.250981060896127 samples/sec                   batch loss = 1731.5618817210197 | accuracy = 0.6821917808219178


Epoch[2] Batch[735] Speed: 1.2515527461482816 samples/sec                   batch loss = 1746.6822701096535 | accuracy = 0.6819727891156463


Epoch[2] Batch[740] Speed: 1.2641001827749148 samples/sec                   batch loss = 1755.2571821808815 | accuracy = 0.6831081081081081


Epoch[2] Batch[745] Speed: 1.2594583715593834 samples/sec                   batch loss = 1765.4377226233482 | accuracy = 0.6828859060402684


Epoch[2] Batch[750] Speed: 1.2548441431387978 samples/sec                   batch loss = 1778.6651132702827 | accuracy = 0.683


Epoch[2] Batch[755] Speed: 1.2564598783158576 samples/sec                   batch loss = 1790.7556021809578 | accuracy = 0.6827814569536423


Epoch[2] Batch[760] Speed: 1.2500852031963978 samples/sec                   batch loss = 1803.2462424635887 | accuracy = 0.6828947368421052


Epoch[2] Batch[765] Speed: 1.2528806661651823 samples/sec                   batch loss = 1817.429215848446 | accuracy = 0.6830065359477124


Epoch[2] Batch[770] Speed: 1.2554174866022931 samples/sec                   batch loss = 1826.6103268265724 | accuracy = 0.6837662337662338


Epoch[2] Batch[775] Speed: 1.2525336472702517 samples/sec                   batch loss = 1839.9440311789513 | accuracy = 0.6835483870967742


Epoch[2] Batch[780] Speed: 1.2481617433716672 samples/sec                   batch loss = 1850.9631307721138 | accuracy = 0.683974358974359


Epoch[2] Batch[785] Speed: 1.2570423284466883 samples/sec                   batch loss = 1864.8720198273659 | accuracy = 0.6831210191082803


[Epoch 2] training: accuracy=0.6836928934010152
[Epoch 2] time cost: 643.2863523960114
[Epoch 2] validation: validation accuracy=0.7111111111111111


## Next steps

Now that you have trained a neural network and run predictions with it on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).