<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and inference on GPUs are often faster than on central processing units (CPUs), so using them can shorten the time it takes to train and deploy neural networks.

## Prerequisites

Before you start, make sure that your machine has at least one NVIDIA GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute that specifies which device the array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[15:42:48] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet reports which GPU each ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If fewer than two GPUs are available, the code falls back to copying `x` to the CPU, which is what the output shown here reflects.

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[15:42:48] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, some operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
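
As a minimal sketch of this indexing scheme, the plain-Python helper below takes a GPU count (for example, the value reported by `npx.num_gpus()`) and returns the 0-based ids to use, capped at a requested number. The name `select_gpu_ids` is hypothetical and not part of MXNet; an empty result would mean falling back to the CPU.

```python
def select_gpu_ids(num_available, num_requested):
    """Return the 0-based ids of the GPUs to use.

    num_available: GPU count, e.g. the value reported by npx.num_gpus().
    num_requested: how many GPUs the caller would like to use.
    Hypothetical helper for illustration; it is not an MXNet function.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # GPU ids are 0-indexed, so n GPUs have ids 0 .. n-1.
    return list(range(min(num_available, num_requested)))


# On an eight-GPU machine the last valid id is 7.
print(select_gpu_ids(8, 8)[-1])
# Asking for two GPUs when only one exists yields just id 0.
print(select_gpu_ids(1, 2))
```

Each returned id `i` corresponds to the device `npx.gpu(i)`.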

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.
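
The same-device rule can be mimicked with a toy stand-in written in plain Python. The `ToyArray` class below is an assumption made purely for illustration: it is not MXNet's ndarray, and the error type it raises is not the one MXNet raises. It only shows the pattern of moving an operand explicitly before combining it with another.

```python
from dataclasses import dataclass


@dataclass
class ToyArray:
    """Plain-Python stand-in for an array that remembers its device."""
    values: list
    device: str  # e.g. "gpu(0)", "gpu(1)" or "cpu(0)"

    def copyto(self, device):
        # Return a copy tagged with the target device (an explicit move).
        return ToyArray(list(self.values), device)

    def __add__(self, other):
        # Mimic the rule from the text: operands must share a device.
        if self.device != other.device:
            raise ValueError(
                f"device mismatch: {self.device} vs {other.device}")
        return ToyArray(
            [a + b for a, b in zip(self.values, other.values)], self.device)


x = ToyArray([1.0, 1.0], "gpu(0)")
y = ToyArray([0.5, 0.25], "gpu(1)")

try:
    x + y  # fails: the operands live on different devices
except ValueError as err:
    print(err)

# Moving y to x's device first makes the operation valid.
z = x + y.copyto(x.device)
print(z.values, z.device)
```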

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and the parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined in [Training Neural Networks](6-train-nn.md), as the following code example shows.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device of parameters that are already loaded.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[15:42:49] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 3.5666127 , -0.56949866]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. With *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.
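
To make the idea concrete, here is a plain-Python sketch of data parallelism on a one-parameter model with a squared-error loss. Everything in it (the model, the data, and the helper names `split_batch` and `grad_sum`) is made up for illustration and does not use MXNet. It checks that summing the gradients computed on each chunk and normalizing by the full batch size gives the same result as processing the whole batch on one device, which is roughly the role `trainer.step(batch_size)` plays in the training loop.

```python
import math


def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous, equally sized chunks."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by num_parts")
    size = len(batch) // num_parts
    return [batch[i * size:(i + 1) * size] for i in range(num_parts)]


def grad_sum(w, samples):
    """Sum of d/dw (w*x - t)^2 over (x, t) pairs: the 'backward pass'."""
    return sum(2.0 * (w * x - t) * x for x, t in samples)


w = 0.5
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 9.0)]

# Single-device reference: mean gradient over the whole batch.
full_grad = grad_sum(w, batch) / len(batch)

# Data parallelism: each of the n "devices" handles one chunk ...
num_devices = 2
per_device = [grad_sum(w, part) for part in split_batch(batch, num_devices)]
# ... then the partial gradients are summed and normalized by the batch size.
parallel_grad = sum(per_device) / len(batch)

print(full_grad, parallel_grad)
assert math.isclose(full_grad, parallel_grad)
```

The sketch requires the batch size to be divisible by the number of devices, which keeps every chunk the same size.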

In [8]:
# Import transforms to compose a series of transformations on the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
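
The helper above feeds the labels and outputs from every device into a single accuracy metric. A rough plain-Python sketch of that bookkeeping follows, assuming each device returns one list of per-class scores per sample and that the predicted class is the one with the highest score. The function name and the data are made up for illustration; this is not how `gluon.metric.Accuracy` is implemented.

```python
def accuracy_over_shards(label_shards, output_shards):
    """Fraction of correct predictions pooled across all device shards.

    label_shards: one list of integer labels per device.
    output_shards: one list of per-class score lists per device.
    """
    correct = 0
    total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            # The predicted class is the index of the highest score.
            predicted = max(range(len(scores)), key=scores.__getitem__)
            correct += int(predicted == label)
            total += 1
    return correct / total if total else 0.0


# Two devices, two samples each, two classes.
labels = [[0, 1], [1, 0]]
outputs = [[[2.0, 0.1], [0.3, 0.9]],   # device 0: both correct
           [[1.5, 0.2], [0.8, 0.4]]]   # device 1: first wrong, second correct
print(accuracy_over_shards(labels, outputs))
```

Pooling the counts before dividing gives the accuracy over the whole batch, regardless of how the samples were distributed across devices.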

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7739850981760953 samples/sec                   batch loss = 15.523548126220703 | accuracy = 0.3


Epoch[1] Batch[10] Speed: 1.2526645752109107 samples/sec                   batch loss = 28.474279403686523 | accuracy = 0.475


Epoch[1] Batch[15] Speed: 1.2603146982637974 samples/sec                   batch loss = 43.82032537460327 | accuracy = 0.43333333333333335


Epoch[1] Batch[20] Speed: 1.2552568675740134 samples/sec                   batch loss = 57.676121950149536 | accuracy = 0.475


Epoch[1] Batch[25] Speed: 1.258756656742747 samples/sec                   batch loss = 72.2971658706665 | accuracy = 0.47


Epoch[1] Batch[30] Speed: 1.2600864767055242 samples/sec                   batch loss = 85.25092816352844 | accuracy = 0.5083333333333333


Epoch[1] Batch[35] Speed: 1.2578912220386722 samples/sec                   batch loss = 98.86494851112366 | accuracy = 0.5214285714285715


Epoch[1] Batch[40] Speed: 1.254894076246194 samples/sec                   batch loss = 112.20695781707764 | accuracy = 0.51875


Epoch[1] Batch[45] Speed: 1.263612905752577 samples/sec                   batch loss = 126.98311734199524 | accuracy = 0.5222222222222223


Epoch[1] Batch[50] Speed: 1.2586255853523884 samples/sec                   batch loss = 140.51558208465576 | accuracy = 0.525


Epoch[1] Batch[55] Speed: 1.2584763218669466 samples/sec                   batch loss = 155.72565937042236 | accuracy = 0.509090909090909


Epoch[1] Batch[60] Speed: 1.2547048770499283 samples/sec                   batch loss = 170.08948159217834 | accuracy = 0.5041666666666667


Epoch[1] Batch[65] Speed: 1.254832129751128 samples/sec                   batch loss = 183.76722812652588 | accuracy = 0.5115384615384615


Epoch[1] Batch[70] Speed: 1.2550518797221402 samples/sec                   batch loss = 197.52350044250488 | accuracy = 0.5214285714285715


Epoch[1] Batch[75] Speed: 1.2545575739320993 samples/sec                   batch loss = 210.46696257591248 | accuracy = 0.5366666666666666


Epoch[1] Batch[80] Speed: 1.2631183013223761 samples/sec                   batch loss = 224.23673391342163 | accuracy = 0.540625


Epoch[1] Batch[85] Speed: 1.2595368504566804 samples/sec                   batch loss = 238.0006160736084 | accuracy = 0.5441176470588235


Epoch[1] Batch[90] Speed: 1.2601969328116445 samples/sec                   batch loss = 251.17317867279053 | accuracy = 0.5527777777777778


Epoch[1] Batch[95] Speed: 1.2608945690166486 samples/sec                   batch loss = 265.0791573524475 | accuracy = 0.55


Epoch[1] Batch[100] Speed: 1.2607332087069352 samples/sec                   batch loss = 278.3944900035858 | accuracy = 0.5525


Epoch[1] Batch[105] Speed: 1.2550895293650661 samples/sec                   batch loss = 292.10586953163147 | accuracy = 0.55


Epoch[1] Batch[110] Speed: 1.2539329054006987 samples/sec                   batch loss = 306.14982652664185 | accuracy = 0.5477272727272727


Epoch[1] Batch[115] Speed: 1.2551561024661626 samples/sec                   batch loss = 319.96114325523376 | accuracy = 0.5456521739130434


Epoch[1] Batch[120] Speed: 1.25923310636356 samples/sec                   batch loss = 334.0013966560364 | accuracy = 0.55


Epoch[1] Batch[125] Speed: 1.2539477132135852 samples/sec                   batch loss = 347.39787244796753 | accuracy = 0.556


Epoch[1] Batch[130] Speed: 1.2570838652287377 samples/sec                   batch loss = 360.1203308105469 | accuracy = 0.5615384615384615


Epoch[1] Batch[135] Speed: 1.2509052300690573 samples/sec                   batch loss = 373.28498435020447 | accuracy = 0.562962962962963


Epoch[1] Batch[140] Speed: 1.254656647889561 samples/sec                   batch loss = 386.89374589920044 | accuracy = 0.5607142857142857


Epoch[1] Batch[145] Speed: 1.2623715636050405 samples/sec                   batch loss = 402.1300518512726 | accuracy = 0.5586206896551724


Epoch[1] Batch[150] Speed: 1.2566654205288263 samples/sec                   batch loss = 416.6015841960907 | accuracy = 0.5533333333333333


Epoch[1] Batch[155] Speed: 1.2576000544500756 samples/sec                   batch loss = 430.8978908061981 | accuracy = 0.55


Epoch[1] Batch[160] Speed: 1.2543280564311017 samples/sec                   batch loss = 444.8104910850525 | accuracy = 0.5515625


Epoch[1] Batch[165] Speed: 1.2570495807029003 samples/sec                   batch loss = 459.5966432094574 | accuracy = 0.546969696969697


Epoch[1] Batch[170] Speed: 1.2582835873842964 samples/sec                   batch loss = 473.99665546417236 | accuracy = 0.5426470588235294


Epoch[1] Batch[175] Speed: 1.248905240004532 samples/sec                   batch loss = 487.82536697387695 | accuracy = 0.54


Epoch[1] Batch[180] Speed: 1.2518889460382892 samples/sec                   batch loss = 501.411434173584 | accuracy = 0.5388888888888889


Epoch[1] Batch[185] Speed: 1.2508945044334718 samples/sec                   batch loss = 515.65873503685 | accuracy = 0.5337837837837838


Epoch[1] Batch[190] Speed: 1.2582171539204918 samples/sec                   batch loss = 528.9328184127808 | accuracy = 0.5394736842105263


Epoch[1] Batch[195] Speed: 1.25420146849649 samples/sec                   batch loss = 542.4352841377258 | accuracy = 0.5397435897435897


Epoch[1] Batch[200] Speed: 1.2561073004447643 samples/sec                   batch loss = 555.8866338729858 | accuracy = 0.53875


Epoch[1] Batch[205] Speed: 1.2605554096267717 samples/sec                   batch loss = 569.3960974216461 | accuracy = 0.5390243902439025


Epoch[1] Batch[210] Speed: 1.2592662813633166 samples/sec                   batch loss = 582.6978454589844 | accuracy = 0.5452380952380952


Epoch[1] Batch[215] Speed: 1.2527444546965514 samples/sec                   batch loss = 596.7272765636444 | accuracy = 0.5453488372093023


Epoch[1] Batch[220] Speed: 1.2649668411348922 samples/sec                   batch loss = 610.1450746059418 | accuracy = 0.5465909090909091


Epoch[1] Batch[225] Speed: 1.258903058358939 samples/sec                   batch loss = 623.2860999107361 | accuracy = 0.5511111111111111


Epoch[1] Batch[230] Speed: 1.2529435429877624 samples/sec                   batch loss = 637.2220194339752 | accuracy = 0.5510869565217391


Epoch[1] Batch[235] Speed: 1.2580193105765016 samples/sec                   batch loss = 650.5322451591492 | accuracy = 0.5531914893617021


Epoch[1] Batch[240] Speed: 1.2532260987380355 samples/sec                   batch loss = 663.9748275279999 | accuracy = 0.553125


Epoch[1] Batch[245] Speed: 1.2537321909474481 samples/sec                   batch loss = 678.127201795578 | accuracy = 0.5520408163265306


Epoch[1] Batch[250] Speed: 1.2575073013859985 samples/sec                   batch loss = 692.0397264957428 | accuracy = 0.55


Epoch[1] Batch[255] Speed: 1.252179624298704 samples/sec                   batch loss = 705.4326684474945 | accuracy = 0.5490196078431373


Epoch[1] Batch[260] Speed: 1.253782222991456 samples/sec                   batch loss = 719.1702756881714 | accuracy = 0.5471153846153847


Epoch[1] Batch[265] Speed: 1.2530443274875116 samples/sec                   batch loss = 733.6471366882324 | accuracy = 0.5424528301886793


Epoch[1] Batch[270] Speed: 1.256228817744189 samples/sec                   batch loss = 747.7705760002136 | accuracy = 0.5407407407407407


Epoch[1] Batch[275] Speed: 1.2547161373099627 samples/sec                   batch loss = 761.4553306102753 | accuracy = 0.5390909090909091


Epoch[1] Batch[280] Speed: 1.2549230805709786 samples/sec                   batch loss = 774.7928152084351 | accuracy = 0.5383928571428571


Epoch[1] Batch[285] Speed: 1.2508497385491537 samples/sec                   batch loss = 788.8628606796265 | accuracy = 0.5368421052631579


Epoch[1] Batch[290] Speed: 1.2616236210921623 samples/sec                   batch loss = 802.8146374225616 | accuracy = 0.5362068965517242


Epoch[1] Batch[295] Speed: 1.2525492636655047 samples/sec                   batch loss = 816.9608476161957 | accuracy = 0.5338983050847458


Epoch[1] Batch[300] Speed: 1.251546864257887 samples/sec                   batch loss = 830.5605397224426 | accuracy = 0.5333333333333333


Epoch[1] Batch[305] Speed: 1.256257883802172 samples/sec                   batch loss = 844.3131172657013 | accuracy = 0.5344262295081967


Epoch[1] Batch[310] Speed: 1.252359742478679 samples/sec                   batch loss = 857.8117227554321 | accuracy = 0.5338709677419354


Epoch[1] Batch[315] Speed: 1.250774016889146 samples/sec                   batch loss = 871.2318584918976 | accuracy = 0.5341269841269841


Epoch[1] Batch[320] Speed: 1.2586716649775311 samples/sec                   batch loss = 884.3264486789703 | accuracy = 0.53515625


Epoch[1] Batch[325] Speed: 1.255215827082461 samples/sec                   batch loss = 897.9479782581329 | accuracy = 0.5338461538461539


Epoch[1] Batch[330] Speed: 1.2550807035602927 samples/sec                   batch loss = 911.7661514282227 | accuracy = 0.5371212121212121


Epoch[1] Batch[335] Speed: 1.2606586539191273 samples/sec                   batch loss = 925.2998509407043 | accuracy = 0.5373134328358209


Epoch[1] Batch[340] Speed: 1.2578468970297967 samples/sec                   batch loss = 939.1161103248596 | accuracy = 0.538235294117647


Epoch[1] Batch[345] Speed: 1.2569741424934953 samples/sec                   batch loss = 952.8943445682526 | accuracy = 0.5391304347826087


Epoch[1] Batch[350] Speed: 1.2578992385972532 samples/sec                   batch loss = 965.9795413017273 | accuracy = 0.5407142857142857


Epoch[1] Batch[355] Speed: 1.2550809852329956 samples/sec                   batch loss = 979.3502235412598 | accuracy = 0.5429577464788733


Epoch[1] Batch[360] Speed: 1.2595008245919381 samples/sec                   batch loss = 992.7948832511902 | accuracy = 0.5444444444444444


Epoch[1] Batch[365] Speed: 1.2507108914292258 samples/sec                   batch loss = 1006.3501720428467 | accuracy = 0.5458904109589041


Epoch[1] Batch[370] Speed: 1.2520942101258243 samples/sec                   batch loss = 1020.7348806858063 | accuracy = 0.5459459459459459


Epoch[1] Batch[375] Speed: 1.2566568549273243 samples/sec                   batch loss = 1034.5403339862823 | accuracy = 0.5446666666666666


Epoch[1] Batch[380] Speed: 1.2543158653699862 samples/sec                   batch loss = 1048.2237939834595 | accuracy = 0.5427631578947368


Epoch[1] Batch[385] Speed: 1.2609499128913313 samples/sec                   batch loss = 1061.0562090873718 | accuracy = 0.5428571428571428


Epoch[1] Batch[390] Speed: 1.2524743646202532 samples/sec                   batch loss = 1074.4025976657867 | accuracy = 0.541025641025641


Epoch[1] Batch[395] Speed: 1.2481222797410945 samples/sec                   batch loss = 1087.92222738266 | accuracy = 0.5430379746835443


Epoch[1] Batch[400] Speed: 1.2485319872947116 samples/sec                   batch loss = 1101.750004529953 | accuracy = 0.543125


Epoch[1] Batch[405] Speed: 1.2508139281485724 samples/sec                   batch loss = 1115.3309047222137 | accuracy = 0.5432098765432098


Epoch[1] Batch[410] Speed: 1.2512284844161525 samples/sec                   batch loss = 1127.9272482395172 | accuracy = 0.5457317073170732


Epoch[1] Batch[415] Speed: 1.2541198093959776 samples/sec                   batch loss = 1141.2044649124146 | accuracy = 0.5469879518072289


Epoch[1] Batch[420] Speed: 1.254416120441538 samples/sec                   batch loss = 1153.993570804596 | accuracy = 0.55


Epoch[1] Batch[425] Speed: 1.2534406042137984 samples/sec                   batch loss = 1166.8600010871887 | accuracy = 0.5523529411764706


Epoch[1] Batch[430] Speed: 1.2636713439303506 samples/sec                   batch loss = 1180.1878905296326 | accuracy = 0.5540697674418604


Epoch[1] Batch[435] Speed: 1.25674251619729 samples/sec                   batch loss = 1193.0209367275238 | accuracy = 0.5540229885057472


Epoch[1] Batch[440] Speed: 1.2533242134757847 samples/sec                   batch loss = 1206.331434726715 | accuracy = 0.5539772727272727


Epoch[1] Batch[445] Speed: 1.248235013058422 samples/sec                   batch loss = 1219.7743031978607 | accuracy = 0.5544943820224719


Epoch[1] Batch[450] Speed: 1.2523807767810982 samples/sec                   batch loss = 1232.0537312030792 | accuracy = 0.5566666666666666


Epoch[1] Batch[455] Speed: 1.254064594744717 samples/sec                   batch loss = 1244.7474961280823 | accuracy = 0.5576923076923077


Epoch[1] Batch[460] Speed: 1.254195374166584 samples/sec                   batch loss = 1258.5644063949585 | accuracy = 0.5581521739130435


Epoch[1] Batch[465] Speed: 1.25591848739565 samples/sec                   batch loss = 1271.4783759117126 | accuracy = 0.5591397849462365


Epoch[1] Batch[470] Speed: 1.252656812291155 samples/sec                   batch loss = 1285.0315880775452 | accuracy = 0.5590425531914893


Epoch[1] Batch[475] Speed: 1.2564079387304385 samples/sec                   batch loss = 1298.3230452537537 | accuracy = 0.5584210526315789


Epoch[1] Batch[480] Speed: 1.2507091199055804 samples/sec                   batch loss = 1311.5057322978973 | accuracy = 0.5578125


Epoch[1] Batch[485] Speed: 1.255068028419515 samples/sec                   batch loss = 1324.1981790065765 | accuracy = 0.5603092783505155


Epoch[1] Batch[490] Speed: 1.25173679321652 samples/sec                   batch loss = 1337.7023630142212 | accuracy = 0.5591836734693878


Epoch[1] Batch[495] Speed: 1.2524517377068223 samples/sec                   batch loss = 1350.7281184196472 | accuracy = 0.5606060606060606


Epoch[1] Batch[500] Speed: 1.2542664471111873 samples/sec                   batch loss = 1363.8042137622833 | accuracy = 0.5615


Epoch[1] Batch[505] Speed: 1.252101311954223 samples/sec                   batch loss = 1376.9469394683838 | accuracy = 0.5623762376237624


Epoch[1] Batch[510] Speed: 1.2518230926713219 samples/sec                   batch loss = 1390.1635158061981 | accuracy = 0.5632352941176471


Epoch[1] Batch[515] Speed: 1.2550823935984066 samples/sec                   batch loss = 1402.9990854263306 | accuracy = 0.5635922330097087


Epoch[1] Batch[520] Speed: 1.2643023254061434 samples/sec                   batch loss = 1417.3564944267273 | accuracy = 0.5625


Epoch[1] Batch[525] Speed: 1.2541330278860667 samples/sec                   batch loss = 1431.046059846878 | accuracy = 0.5623809523809524


Epoch[1] Batch[530] Speed: 1.2500474806283135 samples/sec                   batch loss = 1445.1442112922668 | accuracy = 0.5627358490566038


Epoch[1] Batch[535] Speed: 1.2507481878288418 samples/sec                   batch loss = 1457.8292920589447 | accuracy = 0.5649532710280374


Epoch[1] Batch[540] Speed: 1.2541349966211974 samples/sec                   batch loss = 1471.861449956894 | accuracy = 0.5652777777777778


Epoch[1] Batch[545] Speed: 1.2576281470001978 samples/sec                   batch loss = 1485.9916915893555 | accuracy = 0.5642201834862385


Epoch[1] Batch[550] Speed: 1.253632888232262 samples/sec                   batch loss = 1499.0815370082855 | accuracy = 0.5645454545454546


Epoch[1] Batch[555] Speed: 1.2611464987304506 samples/sec                   batch loss = 1511.4797587394714 | accuracy = 0.5662162162162162


Epoch[1] Batch[560] Speed: 1.252127010032784 samples/sec                   batch loss = 1523.9935557842255 | accuracy = 0.5669642857142857


Epoch[1] Batch[565] Speed: 1.258691212067195 samples/sec                   batch loss = 1536.9370896816254 | accuracy = 0.5672566371681416


Epoch[1] Batch[570] Speed: 1.259377633600066 samples/sec                   batch loss = 1551.1136054992676 | accuracy = 0.5671052631578948


Epoch[1] Batch[575] Speed: 1.2545055100848548 samples/sec                   batch loss = 1564.4303243160248 | accuracy = 0.5673913043478261


Epoch[1] Batch[580] Speed: 1.259139923265277 samples/sec                   batch loss = 1578.0301127433777 | accuracy = 0.5672413793103448


Epoch[1] Batch[585] Speed: 1.257316088507558 samples/sec                   batch loss = 1591.249581336975 | accuracy = 0.5670940170940171


Epoch[1] Batch[590] Speed: 1.2589025860409109 samples/sec                   batch loss = 1603.3219743967056 | accuracy = 0.5686440677966101


Epoch[1] Batch[595] Speed: 1.2586087784600248 samples/sec                   batch loss = 1614.9443103075027 | accuracy = 0.5697478991596638


Epoch[1] Batch[600] Speed: 1.2544584218223456 samples/sec                   batch loss = 1628.5837179422379 | accuracy = 0.57


Epoch[1] Batch[605] Speed: 1.2524908210708094 samples/sec                   batch loss = 1641.2425645589828 | accuracy = 0.5702479338842975


Epoch[1] Batch[610] Speed: 1.255267386392918 samples/sec                   batch loss = 1655.386597275734 | accuracy = 0.569672131147541


Epoch[1] Batch[615] Speed: 1.2515970021441072 samples/sec                   batch loss = 1667.6651583909988 | accuracy = 0.5699186991869919


Epoch[1] Batch[620] Speed: 1.252545242640281 samples/sec                   batch loss = 1680.492800116539 | accuracy = 0.5709677419354838


Epoch[1] Batch[625] Speed: 1.2562349318517985 samples/sec                   batch loss = 1693.6545685529709 | accuracy = 0.5712


Epoch[1] Batch[630] Speed: 1.2479416146090225 samples/sec                   batch loss = 1706.016160607338 | accuracy = 0.5718253968253968


Epoch[1] Batch[635] Speed: 1.2579619599060436 samples/sec                   batch loss = 1718.6033337116241 | accuracy = 0.5724409448818898


Epoch[1] Batch[640] Speed: 1.2564992122359893 samples/sec                   batch loss = 1731.1273076534271 | accuracy = 0.573046875


Epoch[1] Batch[645] Speed: 1.2529059284158348 samples/sec                   batch loss = 1743.8327083587646 | accuracy = 0.5740310077519379


Epoch[1] Batch[650] Speed: 1.260806540568598 samples/sec                   batch loss = 1758.7554211616516 | accuracy = 0.5734615384615385


Epoch[1] Batch[655] Speed: 1.2558018240441973 samples/sec                   batch loss = 1772.156550168991 | accuracy = 0.5744274809160306


Epoch[1] Batch[660] Speed: 1.2621227522512843 samples/sec                   batch loss = 1784.7210031747818 | accuracy = 0.5753787878787879


Epoch[1] Batch[665] Speed: 1.2578132309722305 samples/sec                   batch loss = 1797.4154597520828 | accuracy = 0.5766917293233083


Epoch[1] Batch[670] Speed: 1.2522204664839973 samples/sec                   batch loss = 1810.758706688881 | accuracy = 0.5772388059701492


Epoch[1] Batch[675] Speed: 1.250824745657602 samples/sec                   batch loss = 1823.031078696251 | accuracy = 0.5781481481481482


Epoch[1] Batch[680] Speed: 1.2557852804728165 samples/sec                   batch loss = 1834.5895351171494 | accuracy = 0.5794117647058824


Epoch[1] Batch[685] Speed: 1.255800884056871 samples/sec                   batch loss = 1847.4959307909012 | accuracy = 0.5806569343065694


Epoch[1] Batch[690] Speed: 1.257620605252764 samples/sec                   batch loss = 1859.6240431070328 | accuracy = 0.5818840579710145


Epoch[1] Batch[695] Speed: 1.256917263739007 samples/sec                   batch loss = 1872.6628721952438 | accuracy = 0.5816546762589928


Epoch[1] Batch[700] Speed: 1.261981296409767 samples/sec                   batch loss = 1886.5539969205856 | accuracy = 0.5810714285714286


Epoch[1] Batch[705] Speed: 1.2579700716899493 samples/sec                   batch loss = 1899.88991189003 | accuracy = 0.5801418439716312


Epoch[1] Batch[710] Speed: 1.2542396297241023 samples/sec                   batch loss = 1911.322031378746 | accuracy = 0.581338028169014


Epoch[1] Batch[715] Speed: 1.254849962206144 samples/sec                   batch loss = 1925.450565457344 | accuracy = 0.5807692307692308


Epoch[1] Batch[720] Speed: 1.2539628025472376 samples/sec                   batch loss = 1938.3147069215775 | accuracy = 0.58125


Epoch[1] Batch[725] Speed: 1.2594726483177545 samples/sec                   batch loss = 1952.612187743187 | accuracy = 0.5817241379310345


Epoch[1] Batch[730] Speed: 1.2633162298014697 samples/sec                   batch loss = 1967.4671314954758 | accuracy = 0.5811643835616438


Epoch[1] Batch[735] Speed: 1.2609535142022108 samples/sec                   batch loss = 1980.2535773515701 | accuracy = 0.5819727891156462


Epoch[1] Batch[740] Speed: 1.2531689970414925 samples/sec                   batch loss = 1992.6758798360825 | accuracy = 0.5827702702702703


Epoch[1] Batch[745] Speed: 1.2514620964856873 samples/sec                   batch loss = 2004.8226433992386 | accuracy = 0.5835570469798658


Epoch[1] Batch[750] Speed: 1.2568103945728544 samples/sec                   batch loss = 2017.3838008642197 | accuracy = 0.5836666666666667


Epoch[1] Batch[755] Speed: 1.2518130985105353 samples/sec                   batch loss = 2029.4575318098068 | accuracy = 0.5844370860927153


Epoch[1] Batch[760] Speed: 1.254660776310706 samples/sec                   batch loss = 2040.7532371282578 | accuracy = 0.5855263157894737


Epoch[1] Batch[765] Speed: 1.2499308810606655 samples/sec                   batch loss = 2053.199287176132 | accuracy = 0.5859477124183007


Epoch[1] Batch[770] Speed: 1.2447254353700756 samples/sec                   batch loss = 2063.947456598282 | accuracy = 0.5866883116883117


Epoch[1] Batch[775] Speed: 1.2503298497149251 samples/sec                   batch loss = 2075.9517953395844 | accuracy = 0.5874193548387097


Epoch[1] Batch[780] Speed: 1.2507307514851538 samples/sec                   batch loss = 2088.0536618232727 | accuracy = 0.5884615384615385


Epoch[1] Batch[785] Speed: 1.246207165290684 samples/sec                   batch loss = 2102.341906785965 | accuracy = 0.5878980891719745


[Epoch 1] training: accuracy=0.5885152284263959
[Epoch 1] time cost: 646.0354740619659
[Epoch 1] validation: validation accuracy=0.69


Epoch[2] Batch[5] Speed: 1.2593121243392018 samples/sec                   batch loss = 10.303808212280273 | accuracy = 0.75


Epoch[2] Batch[10] Speed: 1.2567835624384989 samples/sec                   batch loss = 24.428004384040833 | accuracy = 0.65


Epoch[2] Batch[15] Speed: 1.254051565171048 samples/sec                   batch loss = 38.553964495658875 | accuracy = 0.6333333333333333


Epoch[2] Batch[20] Speed: 1.2547168880011512 samples/sec                   batch loss = 50.54850506782532 | accuracy = 0.625


Epoch[2] Batch[25] Speed: 1.2584536663683281 samples/sec                   batch loss = 63.177685141563416 | accuracy = 0.63


Epoch[2] Batch[30] Speed: 1.2543261808678132 samples/sec                   batch loss = 76.17354130744934 | accuracy = 0.6166666666666667


Epoch[2] Batch[35] Speed: 1.2527645664767595 samples/sec                   batch loss = 89.93654072284698 | accuracy = 0.6142857142857143


Epoch[2] Batch[40] Speed: 1.2612808455201576 samples/sec                   batch loss = 101.90638291835785 | accuracy = 0.63125


Epoch[2] Batch[45] Speed: 1.2496212341558344 samples/sec                   batch loss = 114.68869352340698 | accuracy = 0.6444444444444445


Epoch[2] Batch[50] Speed: 1.2562908081461888 samples/sec                   batch loss = 127.48016309738159 | accuracy = 0.65


Epoch[2] Batch[55] Speed: 1.2533129781850003 samples/sec                   batch loss = 139.57357680797577 | accuracy = 0.6454545454545455


Epoch[2] Batch[60] Speed: 1.252456132126038 samples/sec                   batch loss = 152.2580041885376 | accuracy = 0.6416666666666667


Epoch[2] Batch[65] Speed: 1.250599761347759 samples/sec                   batch loss = 166.0307240486145 | accuracy = 0.6307692307692307


Epoch[2] Batch[70] Speed: 1.2615991444968493 samples/sec                   batch loss = 178.94749546051025 | accuracy = 0.625


Epoch[2] Batch[75] Speed: 1.2477880068418095 samples/sec                   batch loss = 193.08938455581665 | accuracy = 0.6266666666666667


Epoch[2] Batch[80] Speed: 1.2527030171109697 samples/sec                   batch loss = 206.19180345535278 | accuracy = 0.621875


Epoch[2] Batch[85] Speed: 1.2529997819209477 samples/sec                   batch loss = 217.16556298732758 | accuracy = 0.6323529411764706


Epoch[2] Batch[90] Speed: 1.2516815080445445 samples/sec                   batch loss = 230.75711977481842 | accuracy = 0.625


Epoch[2] Batch[95] Speed: 1.2510477585825828 samples/sec                   batch loss = 243.41503489017487 | accuracy = 0.6236842105263158


Epoch[2] Batch[100] Speed: 1.255008505627044 samples/sec                   batch loss = 256.77934992313385 | accuracy = 0.62


Epoch[2] Batch[105] Speed: 1.255170657469931 samples/sec                   batch loss = 270.83372485637665 | accuracy = 0.6142857142857143


Epoch[2] Batch[110] Speed: 1.2564366367023954 samples/sec                   batch loss = 282.74533808231354 | accuracy = 0.6181818181818182


Epoch[2] Batch[115] Speed: 1.247596398205751 samples/sec                   batch loss = 294.2189186811447 | accuracy = 0.6217391304347826


Epoch[2] Batch[120] Speed: 1.2527743887436493 samples/sec                   batch loss = 305.94711112976074 | accuracy = 0.6229166666666667


Epoch[2] Batch[125] Speed: 1.2483407074295965 samples/sec                   batch loss = 319.85563611984253 | accuracy = 0.618


Epoch[2] Batch[130] Speed: 1.2486096679263492 samples/sec                   batch loss = 332.65484857559204 | accuracy = 0.6230769230769231


Epoch[2] Batch[135] Speed: 1.2463098315917387 samples/sec                   batch loss = 343.89817702770233 | accuracy = 0.6240740740740741


Epoch[2] Batch[140] Speed: 1.2491777921876501 samples/sec                   batch loss = 356.22949624061584 | accuracy = 0.6214285714285714


Epoch[2] Batch[145] Speed: 1.2558229741310685 samples/sec                   batch loss = 370.94882321357727 | accuracy = 0.6206896551724138


Epoch[2] Batch[150] Speed: 1.252113646900288 samples/sec                   batch loss = 382.31958878040314 | accuracy = 0.6233333333333333


Epoch[2] Batch[155] Speed: 1.2543635056320772 samples/sec                   batch loss = 395.1849604845047 | accuracy = 0.6209677419354839


Epoch[2] Batch[160] Speed: 1.2496410596211962 samples/sec                   batch loss = 406.67417645454407 | accuracy = 0.625


Epoch[2] Batch[165] Speed: 1.2471484585608537 samples/sec                   batch loss = 418.3227870464325 | accuracy = 0.6257575757575757


Epoch[2] Batch[170] Speed: 1.2551607036889294 samples/sec                   batch loss = 431.1902813911438 | accuracy = 0.6294117647058823


Epoch[2] Batch[175] Speed: 1.2568408057102105 samples/sec                   batch loss = 443.3213315010071 | accuracy = 0.6342857142857142


Epoch[2] Batch[180] Speed: 1.2544683644678298 samples/sec                   batch loss = 456.52732479572296 | accuracy = 0.6305555555555555


Epoch[2] Batch[185] Speed: 1.2546475466934588 samples/sec                   batch loss = 469.18589973449707 | accuracy = 0.6310810810810811


Epoch[2] Batch[190] Speed: 1.2600408612388947 samples/sec                   batch loss = 481.4205598831177 | accuracy = 0.6302631578947369


Epoch[2] Batch[195] Speed: 1.2568333675561374 samples/sec                   batch loss = 493.21354591846466 | accuracy = 0.6307692307692307


Epoch[2] Batch[200] Speed: 1.2583396460763738 samples/sec                   batch loss = 507.39067685604095 | accuracy = 0.6275


Epoch[2] Batch[205] Speed: 1.2587721453674412 samples/sec                   batch loss = 521.0792099237442 | accuracy = 0.6268292682926829


Epoch[2] Batch[210] Speed: 1.2577940883314391 samples/sec                   batch loss = 533.5087931156158 | accuracy = 0.6273809523809524


Epoch[2] Batch[215] Speed: 1.2492926698284923 samples/sec                   batch loss = 545.7875288724899 | accuracy = 0.6267441860465116


Epoch[2] Batch[220] Speed: 1.2503209975431364 samples/sec                   batch loss = 559.2961503267288 | accuracy = 0.6272727272727273


Epoch[2] Batch[225] Speed: 1.2593140148428554 samples/sec                   batch loss = 573.2881489992142 | accuracy = 0.6244444444444445


Epoch[2] Batch[230] Speed: 1.255544602017242 samples/sec                   batch loss = 585.9680935144424 | accuracy = 0.6239130434782608


Epoch[2] Batch[235] Speed: 1.2565198211820885 samples/sec                   batch loss = 598.6145783662796 | accuracy = 0.625531914893617


Epoch[2] Batch[240] Speed: 1.2514602294833879 samples/sec                   batch loss = 610.3524081707001 | accuracy = 0.6260416666666667


Epoch[2] Batch[245] Speed: 1.2561075825784456 samples/sec                   batch loss = 620.435759305954 | accuracy = 0.6275510204081632


Epoch[2] Batch[250] Speed: 1.253052188798633 samples/sec                   batch loss = 632.0832825899124 | accuracy = 0.627


Epoch[2] Batch[255] Speed: 1.2518220652268721 samples/sec                   batch loss = 645.32259786129 | accuracy = 0.6274509803921569


Epoch[2] Batch[260] Speed: 1.2571584687906578 samples/sec                   batch loss = 656.483535528183 | accuracy = 0.6298076923076923


Epoch[2] Batch[265] Speed: 1.2568603901318738 samples/sec                   batch loss = 670.5154178142548 | accuracy = 0.6273584905660378


Epoch[2] Batch[270] Speed: 1.259401173224021 samples/sec                   batch loss = 681.8241016864777 | accuracy = 0.6287037037037037


Epoch[2] Batch[275] Speed: 1.2551448343079776 samples/sec                   batch loss = 692.8985395431519 | accuracy = 0.6281818181818182


Epoch[2] Batch[280] Speed: 1.2554449180540337 samples/sec                   batch loss = 708.3194072246552 | accuracy = 0.625


Epoch[2] Batch[285] Speed: 1.256322887524044 samples/sec                   batch loss = 718.3853734731674 | accuracy = 0.6289473684210526


Epoch[2] Batch[290] Speed: 1.2613661900322153 samples/sec                   batch loss = 731.5176849365234 | accuracy = 0.6293103448275862


Epoch[2] Batch[295] Speed: 1.2597466167600082 samples/sec                   batch loss = 741.969923377037 | accuracy = 0.6313559322033898


Epoch[2] Batch[300] Speed: 1.2566704093395542 samples/sec                   batch loss = 756.6545914411545 | accuracy = 0.6283333333333333


Epoch[2] Batch[305] Speed: 1.2559755578639662 samples/sec                   batch loss = 771.3881697654724 | accuracy = 0.6254098360655738


Epoch[2] Batch[310] Speed: 1.258689229001441 samples/sec                   batch loss = 781.9636620283127 | accuracy = 0.6266129032258064


Epoch[2] Batch[315] Speed: 1.2660509528953912 samples/sec                   batch loss = 794.2194510698318 | accuracy = 0.6277777777777778


Epoch[2] Batch[320] Speed: 1.2581918657521989 samples/sec                   batch loss = 805.928941488266 | accuracy = 0.628125


Epoch[2] Batch[325] Speed: 1.2565716759044006 samples/sec                   batch loss = 818.2647680044174 | accuracy = 0.6292307692307693


Epoch[2] Batch[330] Speed: 1.2630881561996148 samples/sec                   batch loss = 830.0171858072281 | accuracy = 0.6310606060606061


Epoch[2] Batch[335] Speed: 1.2559289232881872 samples/sec                   batch loss = 840.7099621295929 | accuracy = 0.6328358208955224


Epoch[2] Batch[340] Speed: 1.2556825508925844 samples/sec                   batch loss = 854.3773326873779 | accuracy = 0.6323529411764706


Epoch[2] Batch[345] Speed: 1.2590134015624712 samples/sec                   batch loss = 866.7114442586899 | accuracy = 0.6318840579710145


Epoch[2] Batch[350] Speed: 1.2542134698116116 samples/sec                   batch loss = 878.4378659725189 | accuracy = 0.6321428571428571


Epoch[2] Batch[355] Speed: 1.2519051068593996 samples/sec                   batch loss = 892.0194928646088 | accuracy = 0.6302816901408451


Epoch[2] Batch[360] Speed: 1.2483472094252015 samples/sec                   batch loss = 905.7173118591309 | accuracy = 0.63125


Epoch[2] Batch[365] Speed: 1.258514649275979 samples/sec                   batch loss = 919.5037417411804 | accuracy = 0.6308219178082192


Epoch[2] Batch[370] Speed: 1.2573902484522277 samples/sec                   batch loss = 930.6316303014755 | accuracy = 0.6331081081081081


Epoch[2] Batch[375] Speed: 1.256075514194848 samples/sec                   batch loss = 942.1642979383469 | accuracy = 0.634


Epoch[2] Batch[380] Speed: 1.2574956139624363 samples/sec                   batch loss = 953.2464249134064 | accuracy = 0.6355263157894737


Epoch[2] Batch[385] Speed: 1.2582570697769824 samples/sec                   batch loss = 964.3810842037201 | accuracy = 0.6376623376623377


Epoch[2] Batch[390] Speed: 1.2531159252125665 samples/sec                   batch loss = 974.1310777664185 | accuracy = 0.6410256410256411


Epoch[2] Batch[395] Speed: 1.249891212365457 samples/sec                   batch loss = 982.1258752346039 | accuracy = 0.6436708860759494


Epoch[2] Batch[400] Speed: 1.2545100127333706 samples/sec                   batch loss = 996.4605777263641 | accuracy = 0.64375


Epoch[2] Batch[405] Speed: 1.2499869429938937 samples/sec                   batch loss = 1007.6139624118805 | accuracy = 0.6450617283950617


Epoch[2] Batch[410] Speed: 1.2555150052204564 samples/sec                   batch loss = 1016.4047193527222 | accuracy = 0.6481707317073171


Epoch[2] Batch[415] Speed: 1.266602455321451 samples/sec                   batch loss = 1027.4961904287338 | accuracy = 0.65


Epoch[2] Batch[420] Speed: 1.25374971107782 samples/sec                   batch loss = 1042.5250413417816 | accuracy = 0.65


Epoch[2] Batch[425] Speed: 1.258808035109015 samples/sec                   batch loss = 1055.5583493709564 | accuracy = 0.6505882352941177


Epoch[2] Batch[430] Speed: 1.2571507442735637 samples/sec                   batch loss = 1066.4644685983658 | accuracy = 0.6511627906976745


Epoch[2] Batch[435] Speed: 1.255440126857909 samples/sec                   batch loss = 1079.69544672966 | accuracy = 0.6505747126436782


Epoch[2] Batch[440] Speed: 1.2577558047977928 samples/sec                   batch loss = 1089.5843371152878 | accuracy = 0.6517045454545455


Epoch[2] Batch[445] Speed: 1.2542928905527848 samples/sec                   batch loss = 1102.3199926614761 | accuracy = 0.652247191011236


Epoch[2] Batch[450] Speed: 1.2545944433780034 samples/sec                   batch loss = 1115.134631037712 | accuracy = 0.6516666666666666


Epoch[2] Batch[455] Speed: 1.2598332674151478 samples/sec                   batch loss = 1127.601912856102 | accuracy = 0.6516483516483517


Epoch[2] Batch[460] Speed: 1.2548142978029329 samples/sec                   batch loss = 1142.220578789711 | accuracy = 0.6510869565217391


Epoch[2] Batch[465] Speed: 1.2585562834171948 samples/sec                   batch loss = 1153.1733144521713 | accuracy = 0.6521505376344086


Epoch[2] Batch[470] Speed: 1.2602316732246395 samples/sec                   batch loss = 1164.0545901060104 | accuracy = 0.6531914893617021


Epoch[2] Batch[475] Speed: 1.257153287701637 samples/sec                   batch loss = 1173.478629231453 | accuracy = 0.6547368421052632


Epoch[2] Batch[480] Speed: 1.2578621746587784 samples/sec                   batch loss = 1185.793200969696 | accuracy = 0.6546875


Epoch[2] Batch[485] Speed: 1.256318654081227 samples/sec                   batch loss = 1199.283333659172 | accuracy = 0.6536082474226804


Epoch[2] Batch[490] Speed: 1.255280910847696 samples/sec                   batch loss = 1208.5426942110062 | accuracy = 0.6551020408163265


Epoch[2] Batch[495] Speed: 1.2580168579736974 samples/sec                   batch loss = 1220.3125644922256 | accuracy = 0.656060606060606


Epoch[2] Batch[500] Speed: 1.2617166977485523 samples/sec                   batch loss = 1231.3631765842438 | accuracy = 0.6575


Epoch[2] Batch[505] Speed: 1.2562062430931822 samples/sec                   batch loss = 1243.4145647287369 | accuracy = 0.6574257425742575


Epoch[2] Batch[510] Speed: 1.2578466141142413 samples/sec                   batch loss = 1256.7531996965408 | accuracy = 0.6568627450980392


Epoch[2] Batch[515] Speed: 1.2536224904615851 samples/sec                   batch loss = 1272.0699719190598 | accuracy = 0.6563106796116505


Epoch[2] Batch[520] Speed: 1.2595864012067008 samples/sec                   batch loss = 1286.1140793561935 | accuracy = 0.6548076923076923


Epoch[2] Batch[525] Speed: 1.2528086275638395 samples/sec                   batch loss = 1297.5341846942902 | accuracy = 0.6557142857142857


Epoch[2] Batch[530] Speed: 1.256308493934872 samples/sec                   batch loss = 1308.3930391073227 | accuracy = 0.6561320754716982


Epoch[2] Batch[535] Speed: 1.2601121250262692 samples/sec                   batch loss = 1321.8389061689377 | accuracy = 0.655607476635514


Epoch[2] Batch[540] Speed: 1.2616605275217734 samples/sec                   batch loss = 1334.4080970287323 | accuracy = 0.6564814814814814


Epoch[2] Batch[545] Speed: 1.2530541541418259 samples/sec                   batch loss = 1348.6821599006653 | accuracy = 0.6568807339449542


Epoch[2] Batch[550] Speed: 1.2557679853869619 samples/sec                   batch loss = 1359.9797856807709 | accuracy = 0.6568181818181819


Epoch[2] Batch[555] Speed: 1.2615573087946037 samples/sec                   batch loss = 1369.675957083702 | accuracy = 0.6585585585585586


Epoch[2] Batch[560] Speed: 1.2550979797199964 samples/sec                   batch loss = 1380.6251035928726 | accuracy = 0.6589285714285714


Epoch[2] Batch[565] Speed: 1.2595899001665667 samples/sec                   batch loss = 1393.1984658241272 | accuracy = 0.6584070796460177


Epoch[2] Batch[570] Speed: 1.2568654746488321 samples/sec                   batch loss = 1403.9637079238892 | accuracy = 0.6596491228070176


Epoch[2] Batch[575] Speed: 1.2522269154936474 samples/sec                   batch loss = 1413.2183122634888 | accuracy = 0.6604347826086957


Epoch[2] Batch[580] Speed: 1.2577044180037804 samples/sec                   batch loss = 1424.109612584114 | accuracy = 0.6607758620689655


Epoch[2] Batch[585] Speed: 1.2552785628311693 samples/sec                   batch loss = 1433.684933423996 | accuracy = 0.661965811965812


Epoch[2] Batch[590] Speed: 1.250648518171666 samples/sec                   batch loss = 1443.291578412056 | accuracy = 0.6635593220338983


Epoch[2] Batch[595] Speed: 1.258449701739934 samples/sec                   batch loss = 1453.3473734855652 | accuracy = 0.6647058823529411


Epoch[2] Batch[600] Speed: 1.2548789644917282 samples/sec                   batch loss = 1466.107364654541 | accuracy = 0.6645833333333333


Epoch[2] Batch[605] Speed: 1.2552649445085315 samples/sec                   batch loss = 1476.505822300911 | accuracy = 0.6648760330578513


Epoch[2] Batch[610] Speed: 1.2538682424681058 samples/sec                   batch loss = 1485.1059892177582 | accuracy = 0.6663934426229509


Epoch[2] Batch[615] Speed: 1.2552743364235563 samples/sec                   batch loss = 1496.9428873062134 | accuracy = 0.6666666666666666


Epoch[2] Batch[620] Speed: 1.2647333088184631 samples/sec                   batch loss = 1510.8060007095337 | accuracy = 0.6653225806451613


Epoch[2] Batch[625] Speed: 1.2611159736694222 samples/sec                   batch loss = 1523.0853480100632 | accuracy = 0.6656


Epoch[2] Batch[630] Speed: 1.2613635346955463 samples/sec                   batch loss = 1533.723149061203 | accuracy = 0.6658730158730158


Epoch[2] Batch[635] Speed: 1.260177149617689 samples/sec                   batch loss = 1544.4508855342865 | accuracy = 0.6665354330708662


Epoch[2] Batch[640] Speed: 1.2635393422465044 samples/sec                   batch loss = 1554.234910607338 | accuracy = 0.667578125


Epoch[2] Batch[645] Speed: 1.2594941113383864 samples/sec                   batch loss = 1566.2273310422897 | accuracy = 0.6674418604651163


Epoch[2] Batch[650] Speed: 1.263660778973534 samples/sec                   batch loss = 1579.2420449256897 | accuracy = 0.6676923076923077


Epoch[2] Batch[655] Speed: 1.260011998386196 samples/sec                   batch loss = 1592.0713139772415 | accuracy = 0.667175572519084


Epoch[2] Batch[660] Speed: 1.2590298413321634 samples/sec                   batch loss = 1602.2028506994247 | accuracy = 0.6670454545454545


Epoch[2] Batch[665] Speed: 1.257501080633511 samples/sec                   batch loss = 1611.426932811737 | accuracy = 0.6680451127819549


Epoch[2] Batch[670] Speed: 1.2588547892891648 samples/sec                   batch loss = 1622.527083158493 | accuracy = 0.6686567164179105


Epoch[2] Batch[675] Speed: 1.25409656052917 samples/sec                   batch loss = 1634.7908090353012 | accuracy = 0.6681481481481482


Epoch[2] Batch[680] Speed: 1.2566295587344853 samples/sec                   batch loss = 1648.0793360471725 | accuracy = 0.6672794117647058


Epoch[2] Batch[685] Speed: 1.2653465509323525 samples/sec                   batch loss = 1658.810961484909 | accuracy = 0.6682481751824818


Epoch[2] Batch[690] Speed: 1.263057156722318 samples/sec                   batch loss = 1674.1977496147156 | accuracy = 0.6673913043478261


Epoch[2] Batch[695] Speed: 1.258568462632318 samples/sec                   batch loss = 1682.8180428147316 | accuracy = 0.6676258992805756


Epoch[2] Batch[700] Speed: 1.2536885332877408 samples/sec                   batch loss = 1690.8949380517006 | accuracy = 0.6692857142857143


Epoch[2] Batch[705] Speed: 1.259818320272376 samples/sec                   batch loss = 1701.9028733372688 | accuracy = 0.6691489361702128


Epoch[2] Batch[710] Speed: 1.2584461147167216 samples/sec                   batch loss = 1717.300470173359 | accuracy = 0.6683098591549296


Epoch[2] Batch[715] Speed: 1.2511682988745074 samples/sec                   batch loss = 1728.878840982914 | accuracy = 0.6681818181818182


Epoch[2] Batch[720] Speed: 1.2557865964227646 samples/sec                   batch loss = 1742.251471221447 | accuracy = 0.6677083333333333


Epoch[2] Batch[725] Speed: 1.2568560589091347 samples/sec                   batch loss = 1757.0903314948082 | accuracy = 0.666896551724138


Epoch[2] Batch[730] Speed: 1.26127686305831 samples/sec                   batch loss = 1770.5774880051613 | accuracy = 0.666095890410959


Epoch[2] Batch[735] Speed: 1.2575214396887306 samples/sec                   batch loss = 1783.957990348339 | accuracy = 0.6659863945578232


Epoch[2] Batch[740] Speed: 1.2588218248697283 samples/sec                   batch loss = 1794.3260239958763 | accuracy = 0.6662162162162162


Epoch[2] Batch[745] Speed: 1.2593092885943642 samples/sec                   batch loss = 1804.3437663912773 | accuracy = 0.6667785234899329


Epoch[2] Batch[750] Speed: 1.2537794120965997 samples/sec                   batch loss = 1815.8725643754005 | accuracy = 0.667


Epoch[2] Batch[755] Speed: 1.2579388513277432 samples/sec                   batch loss = 1827.443786561489 | accuracy = 0.6665562913907285


Epoch[2] Batch[760] Speed: 1.2660548700225807 samples/sec                   batch loss = 1838.316045820713 | accuracy = 0.6677631578947368


Epoch[2] Batch[765] Speed: 1.2545799955506618 samples/sec                   batch loss = 1849.7990545630455 | accuracy = 0.6676470588235294


Epoch[2] Batch[770] Speed: 1.2600561922098332 samples/sec                   batch loss = 1860.0937111973763 | accuracy = 0.6681818181818182


Epoch[2] Batch[775] Speed: 1.258731913511218 samples/sec                   batch loss = 1872.789742410183 | accuracy = 0.6680645161290323


Epoch[2] Batch[780] Speed: 1.252697685611649 samples/sec                   batch loss = 1881.5183121562004 | accuracy = 0.6692307692307692


Epoch[2] Batch[785] Speed: 1.2550065341485 samples/sec                   batch loss = 1894.113130390644 | accuracy = 0.6697452229299363


[Epoch 2] training: accuracy=0.6700507614213198
[Epoch 2] time cost: 643.5273740291595
[Epoch 2] validation: validation accuracy=0.7244444444444444


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).