<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find instructions for installing the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs that MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the context, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[09:31:57] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to `npx.cpu()` instead, so `x` is copied to main memory.

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[09:31:57] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])


MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
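
The 0-based numbering described above can be sketched without MXNet. The helper below is hypothetical (the name `select_gpu_ids` and its parameters are assumptions made for illustration, not part of MXNet); it only shows which ids you would end up with when you request more or fewer GPUs than the machine has.

```python
def select_gpu_ids(num_available, num_requested):
    """Return the 0-based ids of the GPUs to use, capped at what is available.

    Hypothetical helper for illustration only; it is not an MXNet function.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# On an eight-GPU machine the ids run from 0 through 7.
all_ids = select_gpu_ids(num_available=8, num_requested=8)

# Asking for two GPUs when only one is present yields just id 0.
capped_ids = select_gpu_ids(num_available=1, num_requested=2)

print(all_ids[-1], capped_ids)
```

With MXNet, each id in the returned list would be passed to `npx.gpu(i)`. An empty list would mean no GPU is present, in which case falling back to `npx.cpu()` is one reasonable choice.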

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the input of an operation is already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` module support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the LeafNetwork defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[09:31:58] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 9.002522, -8.992892]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, if there are *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own separate chunk of the data.
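
To make the idea concrete, here is a minimal, framework-free sketch of data parallelism. It assumes a toy one-parameter model `w * x` with a squared-error loss, and the helpers `split_batch` and `grad_sum` are hypothetical names introduced only for this illustration. It shows that summing the gradients computed on each chunk gives the same result as computing the gradient on the whole batch.

```python
def split_batch(batch, num_parts):
    """Split a list of samples into num_parts equal, contiguous chunks."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    step = len(batch) // num_parts
    return [batch[i * step:(i + 1) * step] for i in range(num_parts)]


def grad_sum(samples, w):
    """Gradient of sum((w * x - y) ** 2) with respect to w over the samples."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


# A toy batch of (x, y) pairs and a toy parameter value.
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 9.0)]
w = 0.5

# Each "device" receives one chunk and computes its own gradient.
shards = split_batch(batch, num_parts=2)
per_device = [grad_sum(shard, w) for shard in shards]

# Combining the per-device gradients matches the full-batch gradient.
combined = sum(per_device)
full = grad_sum(batch, w)
print(per_device, combined, full)
```

In the training loop later in this step, `gluon.utils.split_and_load` does the splitting and `trainer.step(batch_size)` combines and normalizes the per-device gradients, so you do not need to write this by hand.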

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
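
The helper above hands the metric one list of labels and one list of outputs, with one entry per device. As a rough, framework-free sketch of how such per-device results can be pooled into a single accuracy figure, consider the following. The function `update_counts` and the toy data are assumptions made for illustration; this is not how `gluon.metric.Accuracy` is implemented internally.

```python
def update_counts(label_shards, output_shards, correct=0, total=0):
    """Accumulate correct/total counts over per-device label and score lists."""
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            # The predicted class is the index of the largest score.
            predicted = max(range(len(scores)), key=scores.__getitem__)
            correct += int(predicted == label)
            total += 1
    return correct, total


# Two "devices", each holding two samples of a two-class problem.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],  # predictions 0 and 1: both match the labels
    [[1.5, 0.2], [0.8, 0.1]],   # predictions 0 and 0: only the second matches
]

correct, total = update_counts(label_shards, output_shards)
accuracy = correct / total
print(correct, total, accuracy)
```

Because the counts are pooled across devices before dividing, the result is the accuracy over the whole batch rather than an average of per-device accuracies.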

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if the machine has fewer).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7948305270875627 samples/sec                   batch loss = 13.942870616912842 | accuracy = 0.55


Epoch[1] Batch[10] Speed: 1.279212398982255 samples/sec                   batch loss = 27.884786128997803 | accuracy = 0.575


Epoch[1] Batch[15] Speed: 1.281100554828443 samples/sec                   batch loss = 44.10267448425293 | accuracy = 0.5166666666666667


Epoch[1] Batch[20] Speed: 1.2728650916178699 samples/sec                   batch loss = 58.83742952346802 | accuracy = 0.475


Epoch[1] Batch[25] Speed: 1.2804823105748984 samples/sec                   batch loss = 74.16756963729858 | accuracy = 0.42


Epoch[1] Batch[30] Speed: 1.280073148711272 samples/sec                   batch loss = 87.79708051681519 | accuracy = 0.43333333333333335


Epoch[1] Batch[35] Speed: 1.2780359285195793 samples/sec                   batch loss = 101.70813655853271 | accuracy = 0.45714285714285713


Epoch[1] Batch[40] Speed: 1.2785355640851128 samples/sec                   batch loss = 115.1469554901123 | accuracy = 0.475


Epoch[1] Batch[45] Speed: 1.277897502164141 samples/sec                   batch loss = 128.8992087841034 | accuracy = 0.4888888888888889


Epoch[1] Batch[50] Speed: 1.2800712930331355 samples/sec                   batch loss = 143.4904158115387 | accuracy = 0.48


Epoch[1] Batch[55] Speed: 1.2766565103126903 samples/sec                   batch loss = 156.51796865463257 | accuracy = 0.5045454545454545


Epoch[1] Batch[60] Speed: 1.2801005938424421 samples/sec                   batch loss = 170.72039222717285 | accuracy = 0.5041666666666667


Epoch[1] Batch[65] Speed: 1.2774341576333608 samples/sec                   batch loss = 185.65517687797546 | accuracy = 0.48846153846153845


Epoch[1] Batch[70] Speed: 1.2820169427013737 samples/sec                   batch loss = 199.98298525810242 | accuracy = 0.4857142857142857


Epoch[1] Batch[75] Speed: 1.273995971287354 samples/sec                   batch loss = 213.75895500183105 | accuracy = 0.49


Epoch[1] Batch[80] Speed: 1.2516914067197886 samples/sec                   batch loss = 227.6776602268219 | accuracy = 0.484375


Epoch[1] Batch[85] Speed: 1.2576084443755484 samples/sec                   batch loss = 241.69982647895813 | accuracy = 0.4852941176470588


Epoch[1] Batch[90] Speed: 1.2557428895606424 samples/sec                   batch loss = 255.61768436431885 | accuracy = 0.4888888888888889


Epoch[1] Batch[95] Speed: 1.2518652193459447 samples/sec                   batch loss = 268.97856426239014 | accuracy = 0.49736842105263157


Epoch[1] Batch[100] Speed: 1.253686097542203 samples/sec                   batch loss = 283.5565938949585 | accuracy = 0.495


Epoch[1] Batch[105] Speed: 1.257309021625032 samples/sec                   batch loss = 297.4716947078705 | accuracy = 0.5


Epoch[1] Batch[110] Speed: 1.2573294686893994 samples/sec                   batch loss = 310.79661989212036 | accuracy = 0.5045454545454545


Epoch[1] Batch[115] Speed: 1.251482353818755 samples/sec                   batch loss = 325.08864307403564 | accuracy = 0.5043478260869565


Epoch[1] Batch[120] Speed: 1.2594059001622335 samples/sec                   batch loss = 338.6726129055023 | accuracy = 0.5020833333333333


Epoch[1] Batch[125] Speed: 1.2554895436960039 samples/sec                   batch loss = 351.9383680820465 | accuracy = 0.506


Epoch[1] Batch[130] Speed: 1.2514423998920507 samples/sec                   batch loss = 365.8541944026947 | accuracy = 0.5038461538461538


Epoch[1] Batch[135] Speed: 1.2527684953650329 samples/sec                   batch loss = 378.9077661037445 | accuracy = 0.5148148148148148


Epoch[1] Batch[140] Speed: 1.2525810587275785 samples/sec                   batch loss = 392.97757601737976 | accuracy = 0.5142857142857142


Epoch[1] Batch[145] Speed: 1.2461240448263158 samples/sec                   batch loss = 407.0592505931854 | accuracy = 0.5137931034482759


Epoch[1] Batch[150] Speed: 1.2466031115562046 samples/sec                   batch loss = 420.80392241477966 | accuracy = 0.5166666666666667


Epoch[1] Batch[155] Speed: 1.2514382926236356 samples/sec                   batch loss = 434.29606771469116 | accuracy = 0.5241935483870968


Epoch[1] Batch[160] Speed: 1.2579011248611804 samples/sec                   batch loss = 447.79701828956604 | accuracy = 0.5296875


Epoch[1] Batch[165] Speed: 1.258644753311205 samples/sec                   batch loss = 462.32505321502686 | accuracy = 0.5242424242424243


Epoch[1] Batch[170] Speed: 1.2556401669155912 samples/sec                   batch loss = 476.48215222358704 | accuracy = 0.5235294117647059


Epoch[1] Batch[175] Speed: 1.251766772522428 samples/sec                   batch loss = 490.44753408432007 | accuracy = 0.5228571428571429


Epoch[1] Batch[180] Speed: 1.2584322387931628 samples/sec                   batch loss = 504.92588090896606 | accuracy = 0.5208333333333334


Epoch[1] Batch[185] Speed: 1.2586594837649496 samples/sec                   batch loss = 518.3818299770355 | accuracy = 0.5256756756756756


Epoch[1] Batch[190] Speed: 1.2501581399507826 samples/sec                   batch loss = 532.1863656044006 | accuracy = 0.5263157894736842


Epoch[1] Batch[195] Speed: 1.2567459993751182 samples/sec                   batch loss = 545.8448820114136 | accuracy = 0.5269230769230769


Epoch[1] Batch[200] Speed: 1.253445192867595 samples/sec                   batch loss = 559.6016359329224 | accuracy = 0.52375


Epoch[1] Batch[205] Speed: 1.2551357260279794 samples/sec                   batch loss = 572.6836867332458 | accuracy = 0.5280487804878049


Epoch[1] Batch[210] Speed: 1.2575327505598186 samples/sec                   batch loss = 586.3651475906372 | accuracy = 0.5285714285714286


Epoch[1] Batch[215] Speed: 1.2499870361242125 samples/sec                   batch loss = 600.4666864871979 | accuracy = 0.5255813953488372


Epoch[1] Batch[220] Speed: 1.260550105789575 samples/sec                   batch loss = 614.5181424617767 | accuracy = 0.5261363636363636


Epoch[1] Batch[225] Speed: 1.2534414470252422 samples/sec                   batch loss = 628.5303020477295 | accuracy = 0.5255555555555556


Epoch[1] Batch[230] Speed: 1.257328903323841 samples/sec                   batch loss = 642.5607943534851 | accuracy = 0.5271739130434783


Epoch[1] Batch[235] Speed: 1.2585971649708043 samples/sec                   batch loss = 656.6243104934692 | accuracy = 0.524468085106383


Epoch[1] Batch[240] Speed: 1.2574766694873396 samples/sec                   batch loss = 669.9917168617249 | accuracy = 0.5260416666666666


Epoch[1] Batch[245] Speed: 1.2568814817300915 samples/sec                   batch loss = 683.3968217372894 | accuracy = 0.5275510204081633


Epoch[1] Batch[250] Speed: 1.2591198897554425 samples/sec                   batch loss = 696.7603619098663 | accuracy = 0.532


Epoch[1] Batch[255] Speed: 1.2553596214440228 samples/sec                   batch loss = 710.1695091724396 | accuracy = 0.5333333333333333


Epoch[1] Batch[260] Speed: 1.2603598601474566 samples/sec                   batch loss = 723.6194653511047 | accuracy = 0.5336538461538461


Epoch[1] Batch[265] Speed: 1.2551697184272432 samples/sec                   batch loss = 737.262209892273 | accuracy = 0.5349056603773585


Epoch[1] Batch[270] Speed: 1.2520847722943336 samples/sec                   batch loss = 752.2941002845764 | accuracy = 0.5305555555555556


Epoch[1] Batch[275] Speed: 1.2546814188239725 samples/sec                   batch loss = 765.3946795463562 | accuracy = 0.5345454545454545


Epoch[1] Batch[280] Speed: 1.2568693351428861 samples/sec                   batch loss = 779.3252308368683 | accuracy = 0.5348214285714286


Epoch[1] Batch[285] Speed: 1.2534948273924655 samples/sec                   batch loss = 792.5471432209015 | accuracy = 0.5359649122807018


Epoch[1] Batch[290] Speed: 1.2551562902705131 samples/sec                   batch loss = 806.0717313289642 | accuracy = 0.5370689655172414


Epoch[1] Batch[295] Speed: 1.2556709913423658 samples/sec                   batch loss = 819.7703716754913 | accuracy = 0.5364406779661017


Epoch[1] Batch[300] Speed: 1.2556501283026347 samples/sec                   batch loss = 833.229171037674 | accuracy = 0.5383333333333333


Epoch[1] Batch[305] Speed: 1.2518756813986442 samples/sec                   batch loss = 846.4327387809753 | accuracy = 0.5409836065573771


Epoch[1] Batch[310] Speed: 1.2512265247956986 samples/sec                   batch loss = 859.9244089126587 | accuracy = 0.5403225806451613


Epoch[1] Batch[315] Speed: 1.2525700237878479 samples/sec                   batch loss = 872.8954386711121 | accuracy = 0.5436507936507936


Epoch[1] Batch[320] Speed: 1.2504564960851907 samples/sec                   batch loss = 887.0423455238342 | accuracy = 0.54140625


Epoch[1] Batch[325] Speed: 1.2517411826201739 samples/sec                   batch loss = 900.4373705387115 | accuracy = 0.5430769230769231


Epoch[1] Batch[330] Speed: 1.2497671940968424 samples/sec                   batch loss = 913.5119907855988 | accuracy = 0.543939393939394


Epoch[1] Batch[335] Speed: 1.25445382574704 samples/sec                   batch loss = 926.4978048801422 | accuracy = 0.5455223880597015


Epoch[1] Batch[340] Speed: 1.2500301569525019 samples/sec                   batch loss = 940.3639035224915 | accuracy = 0.5455882352941176


Epoch[1] Batch[345] Speed: 1.2569947670221626 samples/sec                   batch loss = 953.701033115387 | accuracy = 0.5471014492753623


Epoch[1] Batch[350] Speed: 1.2536206170077298 samples/sec                   batch loss = 966.4098482131958 | accuracy = 0.5507142857142857


Epoch[1] Batch[355] Speed: 1.2529280103285882 samples/sec                   batch loss = 979.8152658939362 | accuracy = 0.55


Epoch[1] Batch[360] Speed: 1.2571294552421743 samples/sec                   batch loss = 992.999650478363 | accuracy = 0.5506944444444445


Epoch[1] Batch[365] Speed: 1.2586416372978646 samples/sec                   batch loss = 1006.7059350013733 | accuracy = 0.5513698630136986


Epoch[1] Batch[370] Speed: 1.258623224806704 samples/sec                   batch loss = 1019.5934491157532 | accuracy = 0.5527027027027027


Epoch[1] Batch[375] Speed: 1.2564725815854132 samples/sec                   batch loss = 1033.0779511928558 | accuracy = 0.552


Epoch[1] Batch[380] Speed: 1.2553855473813087 samples/sec                   batch loss = 1046.8076255321503 | accuracy = 0.5519736842105263


Epoch[1] Batch[385] Speed: 1.254232222326302 samples/sec                   batch loss = 1059.9398548603058 | accuracy = 0.5512987012987013


Epoch[1] Batch[390] Speed: 1.2572322332899 samples/sec                   batch loss = 1072.905106306076 | accuracy = 0.5532051282051282


Epoch[1] Batch[395] Speed: 1.2566806694710226 samples/sec                   batch loss = 1085.9342317581177 | accuracy = 0.5544303797468354


Epoch[1] Batch[400] Speed: 1.2550180814680791 samples/sec                   batch loss = 1099.3646667003632 | accuracy = 0.555


Epoch[1] Batch[405] Speed: 1.2549382872575625 samples/sec                   batch loss = 1113.1173288822174 | accuracy = 0.5567901234567901


Epoch[1] Batch[410] Speed: 1.2529674977231118 samples/sec                   batch loss = 1126.3690311908722 | accuracy = 0.5554878048780488


Epoch[1] Batch[415] Speed: 1.2558037040230712 samples/sec                   batch loss = 1139.730426311493 | accuracy = 0.5566265060240964


Epoch[1] Batch[420] Speed: 1.2583185055115695 samples/sec                   batch loss = 1152.4068965911865 | accuracy = 0.5577380952380953


Epoch[1] Batch[425] Speed: 1.2536469395477514 samples/sec                   batch loss = 1167.0207133293152 | accuracy = 0.5558823529411765


Epoch[1] Batch[430] Speed: 1.2539197848460697 samples/sec                   batch loss = 1180.7788944244385 | accuracy = 0.5552325581395349


Epoch[1] Batch[435] Speed: 1.251517829069348 samples/sec                   batch loss = 1193.6753208637238 | accuracy = 0.5574712643678161


Epoch[1] Batch[440] Speed: 1.2524735231088355 samples/sec                   batch loss = 1207.4353954792023 | accuracy = 0.5573863636363636


Epoch[1] Batch[445] Speed: 1.2507090266676328 samples/sec                   batch loss = 1219.8028852939606 | accuracy = 0.5595505617977528


Epoch[1] Batch[450] Speed: 1.2455609135440997 samples/sec                   batch loss = 1232.376226902008 | accuracy = 0.5616666666666666


Epoch[1] Batch[455] Speed: 1.2507017541505514 samples/sec                   batch loss = 1245.7204031944275 | accuracy = 0.5620879120879121


Epoch[1] Batch[460] Speed: 1.248701020078172 samples/sec                   batch loss = 1259.5307261943817 | accuracy = 0.5614130434782608


Epoch[1] Batch[465] Speed: 1.2519183721226284 samples/sec                   batch loss = 1272.3138407468796 | accuracy = 0.5634408602150538


Epoch[1] Batch[470] Speed: 1.2589534094933936 samples/sec                   batch loss = 1285.8432334661484 | accuracy = 0.5643617021276596


Epoch[1] Batch[475] Speed: 1.2573702705795708 samples/sec                   batch loss = 1298.9800740480423 | accuracy = 0.5668421052631579


Epoch[1] Batch[480] Speed: 1.2494205018499407 samples/sec                   batch loss = 1312.5840839147568 | accuracy = 0.5677083333333334


Epoch[1] Batch[485] Speed: 1.2517819028108361 samples/sec                   batch loss = 1326.2195967435837 | accuracy = 0.5685567010309278


Epoch[1] Batch[490] Speed: 1.2480170865376419 samples/sec                   batch loss = 1339.7729676961899 | accuracy = 0.5668367346938775


Epoch[1] Batch[495] Speed: 1.252656905819736 samples/sec                   batch loss = 1352.8186115026474 | accuracy = 0.5676767676767677


Epoch[1] Batch[500] Speed: 1.247136592071608 samples/sec                   batch loss = 1365.7992433309555 | accuracy = 0.5685


Epoch[1] Batch[505] Speed: 1.255514065662458 samples/sec                   batch loss = 1379.3043917417526 | accuracy = 0.5698019801980198


Epoch[1] Batch[510] Speed: 1.2497145030938546 samples/sec                   batch loss = 1392.913124680519 | accuracy = 0.5696078431372549


Epoch[1] Batch[515] Speed: 1.2574548039371611 samples/sec                   batch loss = 1405.478988289833 | accuracy = 0.5703883495145631


Epoch[1] Batch[520] Speed: 1.2485084805876434 samples/sec                   batch loss = 1417.0499995946884 | accuracy = 0.5716346153846154


Epoch[1] Batch[525] Speed: 1.2559246925000396 samples/sec                   batch loss = 1430.6207889318466 | accuracy = 0.5719047619047619


Epoch[1] Batch[530] Speed: 1.255658492230802 samples/sec                   batch loss = 1444.1496757268906 | accuracy = 0.5716981132075472


Epoch[1] Batch[535] Speed: 1.2567535306364281 samples/sec                   batch loss = 1455.8143781423569 | accuracy = 0.572429906542056


Epoch[1] Batch[540] Speed: 1.2475952849134975 samples/sec                   batch loss = 1469.7800952196121 | accuracy = 0.5722222222222222


Epoch[1] Batch[545] Speed: 1.2500586574481771 samples/sec                   batch loss = 1483.6583367586136 | accuracy = 0.5720183486238533


Epoch[1] Batch[550] Speed: 1.2472268943171507 samples/sec                   batch loss = 1497.0394061803818 | accuracy = 0.5713636363636364


Epoch[1] Batch[555] Speed: 1.2527268691109548 samples/sec                   batch loss = 1510.3968809843063 | accuracy = 0.5716216216216217


Epoch[1] Batch[560] Speed: 1.2538013374107322 samples/sec                   batch loss = 1522.8229558467865 | accuracy = 0.5727678571428572


Epoch[1] Batch[565] Speed: 1.2479950845530519 samples/sec                   batch loss = 1535.7893931865692 | accuracy = 0.5725663716814159


Epoch[1] Batch[570] Speed: 1.2472692685640894 samples/sec                   batch loss = 1548.6499860286713 | accuracy = 0.5736842105263158


Epoch[1] Batch[575] Speed: 1.2457687326989106 samples/sec                   batch loss = 1561.9602403640747 | accuracy = 0.5739130434782609


Epoch[1] Batch[580] Speed: 1.2475418492218189 samples/sec                   batch loss = 1575.97052359581 | accuracy = 0.5745689655172413


Epoch[1] Batch[585] Speed: 1.2456590340189653 samples/sec                   batch loss = 1588.5830655097961 | accuracy = 0.5756410256410256


Epoch[1] Batch[590] Speed: 1.250429748173235 samples/sec                   batch loss = 1601.793557882309 | accuracy = 0.5758474576271186


Epoch[1] Batch[595] Speed: 1.2476959531204936 samples/sec                   batch loss = 1615.3782706260681 | accuracy = 0.5756302521008403


Epoch[1] Batch[600] Speed: 1.252343943845074 samples/sec                   batch loss = 1628.3752477169037 | accuracy = 0.57625


Epoch[1] Batch[605] Speed: 1.2484708531164712 samples/sec                   batch loss = 1640.7183368206024 | accuracy = 0.5768595041322314


Epoch[1] Batch[610] Speed: 1.2452961301605325 samples/sec                   batch loss = 1653.5326528549194 | accuracy = 0.578688524590164


Epoch[1] Batch[615] Speed: 1.2474940763941051 samples/sec                   batch loss = 1666.6304650306702 | accuracy = 0.5792682926829268


Epoch[1] Batch[620] Speed: 1.2478443407674207 samples/sec                   batch loss = 1680.0070762634277 | accuracy = 0.5790322580645161


Epoch[1] Batch[625] Speed: 1.2500936794446993 samples/sec                   batch loss = 1694.0490794181824 | accuracy = 0.5784


Epoch[1] Batch[630] Speed: 1.2452178446318865 samples/sec                   batch loss = 1706.052092552185 | accuracy = 0.5793650793650794


Epoch[1] Batch[635] Speed: 1.24301611677044 samples/sec                   batch loss = 1720.1754252910614 | accuracy = 0.5795275590551181


Epoch[1] Batch[640] Speed: 1.2578518009196438 samples/sec                   batch loss = 1732.7818938493729 | accuracy = 0.580078125


Epoch[1] Batch[645] Speed: 1.252326556460562 samples/sec                   batch loss = 1747.3392766714096 | accuracy = 0.5790697674418605


Epoch[1] Batch[650] Speed: 1.2570589051552763 samples/sec                   batch loss = 1758.7459448575974 | accuracy = 0.5811538461538461


Epoch[1] Batch[655] Speed: 1.2555146293970882 samples/sec                   batch loss = 1771.8084036111832 | accuracy = 0.582442748091603


Epoch[1] Batch[660] Speed: 1.2479851514217861 samples/sec                   batch loss = 1783.849130988121 | accuracy = 0.5837121212121212


Epoch[1] Batch[665] Speed: 1.2459018584766686 samples/sec                   batch loss = 1795.7059046030045 | accuracy = 0.5842105263157895


Epoch[1] Batch[670] Speed: 1.2499434526514672 samples/sec                   batch loss = 1808.5151294469833 | accuracy = 0.5847014925373134


Epoch[1] Batch[675] Speed: 1.2516680610384183 samples/sec                   batch loss = 1821.3595093488693 | accuracy = 0.5844444444444444


Epoch[1] Batch[680] Speed: 1.2510228510620434 samples/sec                   batch loss = 1833.8409888744354 | accuracy = 0.5845588235294118


Epoch[1] Batch[685] Speed: 1.250023637413917 samples/sec                   batch loss = 1847.2441902160645 | accuracy = 0.5850364963503649


Epoch[1] Batch[690] Speed: 1.2466656377615413 samples/sec                   batch loss = 1859.9947509765625 | accuracy = 0.5851449275362319


Epoch[1] Batch[695] Speed: 1.246924053291586 samples/sec                   batch loss = 1872.2806688547134 | accuracy = 0.5856115107913669


Epoch[1] Batch[700] Speed: 1.2534130729965316 samples/sec                   batch loss = 1884.9165314435959 | accuracy = 0.5853571428571429


Epoch[1] Batch[705] Speed: 1.2549894482605712 samples/sec                   batch loss = 1897.8279958963394 | accuracy = 0.5865248226950355


Epoch[1] Batch[710] Speed: 1.2515709523499048 samples/sec                   batch loss = 1909.8757318258286 | accuracy = 0.5876760563380282


Epoch[1] Batch[715] Speed: 1.2563197829965218 samples/sec                   batch loss = 1922.8271533250809 | accuracy = 0.5884615384615385


Epoch[1] Batch[720] Speed: 1.2466317338851127 samples/sec                   batch loss = 1934.9664225578308 | accuracy = 0.5892361111111111


Epoch[1] Batch[725] Speed: 1.2487004624456515 samples/sec                   batch loss = 1947.3577885627747 | accuracy = 0.5896551724137931


Epoch[1] Batch[730] Speed: 1.2486013976385908 samples/sec                   batch loss = 1960.2138916254044 | accuracy = 0.5904109589041096


Epoch[1] Batch[735] Speed: 1.2549082497131296 samples/sec                   batch loss = 1973.6309686899185 | accuracy = 0.5904761904761905


Epoch[1] Batch[740] Speed: 1.2547066599109669 samples/sec                   batch loss = 1988.0550495386124 | accuracy = 0.589527027027027


Epoch[1] Batch[745] Speed: 1.2479587876158738 samples/sec                   batch loss = 2001.1876674890518 | accuracy = 0.5892617449664429


Epoch[1] Batch[750] Speed: 1.2444174388576636 samples/sec                   batch loss = 2013.5893810987473 | accuracy = 0.5906666666666667


Epoch[1] Batch[755] Speed: 1.2466835167778094 samples/sec                   batch loss = 2025.5057128667831 | accuracy = 0.5913907284768212


Epoch[1] Batch[760] Speed: 1.2557006894776077 samples/sec                   batch loss = 2038.3074048757553 | accuracy = 0.5914473684210526


Epoch[1] Batch[765] Speed: 1.252906676942505 samples/sec                   batch loss = 2050.290736436844 | accuracy = 0.592156862745098


Epoch[1] Batch[770] Speed: 1.2452019484187096 samples/sec                   batch loss = 2062.370718359947 | accuracy = 0.5931818181818181


Epoch[1] Batch[775] Speed: 1.2504931248732902 samples/sec                   batch loss = 2073.8324452638626 | accuracy = 0.5932258064516129


Epoch[1] Batch[780] Speed: 1.2577382668144517 samples/sec                   batch loss = 2086.3830696344376 | accuracy = 0.5939102564102564


Epoch[1] Batch[785] Speed: 1.255980071071222 samples/sec                   batch loss = 2099.8179066181183 | accuracy = 0.5933121019108281


[Epoch 1] training: accuracy=0.5935913705583756
[Epoch 1] time cost: 646.3287630081177
[Epoch 1] validation: validation accuracy=0.68


Epoch[2] Batch[5] Speed: 1.2510478518710408 samples/sec                   batch loss = 10.484230637550354 | accuracy = 0.7


Epoch[2] Batch[10] Speed: 1.2520672986650658 samples/sec                   batch loss = 22.34666132926941 | accuracy = 0.7


Epoch[2] Batch[15] Speed: 1.2516612442637283 samples/sec                   batch loss = 33.57773232460022 | accuracy = 0.7166666666666667


Epoch[2] Batch[20] Speed: 1.2484590543518808 samples/sec                   batch loss = 44.80154728889465 | accuracy = 0.7125


Epoch[2] Batch[25] Speed: 1.2498401868545503 samples/sec                   batch loss = 57.61061072349548 | accuracy = 0.7


Epoch[2] Batch[30] Speed: 1.2448952864138685 samples/sec                   batch loss = 70.82096910476685 | accuracy = 0.6916666666666667


Epoch[2] Batch[35] Speed: 1.2573584914710267 samples/sec                   batch loss = 82.67075598239899 | accuracy = 0.7


Epoch[2] Batch[40] Speed: 1.2486993471821044 samples/sec                   batch loss = 94.90346825122833 | accuracy = 0.69375


Epoch[2] Batch[45] Speed: 1.2497401963847912 samples/sec                   batch loss = 108.9869350194931 | accuracy = 0.6777777777777778


Epoch[2] Batch[50] Speed: 1.251548638155517 samples/sec                   batch loss = 119.96137011051178 | accuracy = 0.685


Epoch[2] Batch[55] Speed: 1.2508291286671107 samples/sec                   batch loss = 129.8817514181137 | accuracy = 0.6954545454545454


Epoch[2] Batch[60] Speed: 1.2498165377570176 samples/sec                   batch loss = 143.75412476062775 | accuracy = 0.6875


Epoch[2] Batch[65] Speed: 1.25241677044863 samples/sec                   batch loss = 156.47128331661224 | accuracy = 0.6923076923076923


Epoch[2] Batch[70] Speed: 1.2461034052714761 samples/sec                   batch loss = 168.19879281520844 | accuracy = 0.6964285714285714


Epoch[2] Batch[75] Speed: 1.2516593766671822 samples/sec                   batch loss = 178.62796783447266 | accuracy = 0.7033333333333334


Epoch[2] Batch[80] Speed: 1.246215496460844 samples/sec                   batch loss = 193.27698850631714 | accuracy = 0.690625


Epoch[2] Batch[85] Speed: 1.2585610984275577 samples/sec                   batch loss = 206.04286766052246 | accuracy = 0.6882352941176471


Epoch[2] Batch[90] Speed: 1.248980084693115 samples/sec                   batch loss = 217.44404757022858 | accuracy = 0.6861111111111111


Epoch[2] Batch[95] Speed: 1.2545895648536385 samples/sec                   batch loss = 228.85012531280518 | accuracy = 0.6894736842105263


Epoch[2] Batch[100] Speed: 1.259057714336488 samples/sec                   batch loss = 238.30230236053467 | accuracy = 0.6975


Epoch[2] Batch[105] Speed: 1.2552128219345262 samples/sec                   batch loss = 248.46085619926453 | accuracy = 0.7023809523809523


Epoch[2] Batch[110] Speed: 1.253180510571301 samples/sec                   batch loss = 261.4984333515167 | accuracy = 0.6931818181818182


Epoch[2] Batch[115] Speed: 1.252940361568377 samples/sec                   batch loss = 271.91276001930237 | accuracy = 0.6956521739130435


Epoch[2] Batch[120] Speed: 1.2578142682748268 samples/sec                   batch loss = 283.0423607826233 | accuracy = 0.6958333333333333


Epoch[2] Batch[125] Speed: 1.2516216524102601 samples/sec                   batch loss = 296.4994559288025 | accuracy = 0.688


Epoch[2] Batch[130] Speed: 1.2491304520041373 samples/sec                   batch loss = 309.61190831661224 | accuracy = 0.6846153846153846


Epoch[2] Batch[135] Speed: 1.253869929242498 samples/sec                   batch loss = 322.6825941801071 | accuracy = 0.6814814814814815


Epoch[2] Batch[140] Speed: 1.261679693151898 samples/sec                   batch loss = 334.4365861415863 | accuracy = 0.6803571428571429


Epoch[2] Batch[145] Speed: 1.2528735554931754 samples/sec                   batch loss = 346.9508068561554 | accuracy = 0.6775862068965517


Epoch[2] Batch[150] Speed: 1.2565478654914173 samples/sec                   batch loss = 358.3803448677063 | accuracy = 0.6783333333333333


Epoch[2] Batch[155] Speed: 1.251577768141135 samples/sec                   batch loss = 369.04446172714233 | accuracy = 0.6838709677419355


Epoch[2] Batch[160] Speed: 1.254258570556874 samples/sec                   batch loss = 382.8539261817932 | accuracy = 0.68125


Epoch[2] Batch[165] Speed: 1.253499322788843 samples/sec                   batch loss = 394.93288803100586 | accuracy = 0.6803030303030303


Epoch[2] Batch[170] Speed: 1.2508249321680192 samples/sec                   batch loss = 407.6256651878357 | accuracy = 0.6808823529411765


Epoch[2] Batch[175] Speed: 1.252938490152756 samples/sec                   batch loss = 420.0124719142914 | accuracy = 0.68


Epoch[2] Batch[180] Speed: 1.252859802185565 samples/sec                   batch loss = 431.0782907009125 | accuracy = 0.6819444444444445


Epoch[2] Batch[185] Speed: 1.2524293920947096 samples/sec                   batch loss = 442.526288151741 | accuracy = 0.6824324324324325


Epoch[2] Batch[190] Speed: 1.25215747530329 samples/sec                   batch loss = 453.09364128112793 | accuracy = 0.6855263157894737


Epoch[2] Batch[195] Speed: 1.2528936714188441 samples/sec                   batch loss = 464.56714367866516 | accuracy = 0.6858974358974359


Epoch[2] Batch[200] Speed: 1.2543412793114483 samples/sec                   batch loss = 479.2159136533737 | accuracy = 0.685


Epoch[2] Batch[205] Speed: 1.2623891360537292 samples/sec                   batch loss = 491.2415224313736 | accuracy = 0.6853658536585366


Epoch[2] Batch[210] Speed: 1.2533964986632453 samples/sec                   batch loss = 503.5381826162338 | accuracy = 0.6857142857142857


Epoch[2] Batch[215] Speed: 1.2636485010024248 samples/sec                   batch loss = 514.6250100135803 | accuracy = 0.6883720930232559


Epoch[2] Batch[220] Speed: 1.2660736916790667 samples/sec                   batch loss = 527.180374622345 | accuracy = 0.6875


Epoch[2] Batch[225] Speed: 1.2578612315845137 samples/sec                   batch loss = 537.4324157238007 | accuracy = 0.6888888888888889


Epoch[2] Batch[230] Speed: 1.2569467383449977 samples/sec                   batch loss = 549.3647131919861 | accuracy = 0.6869565217391305


Epoch[2] Batch[235] Speed: 1.261216465470011 samples/sec                   batch loss = 561.4340360164642 | accuracy = 0.6872340425531915


Epoch[2] Batch[240] Speed: 1.2557200504107815 samples/sec                   batch loss = 571.3958320617676 | accuracy = 0.6885416666666667


Epoch[2] Batch[245] Speed: 1.2598574863234344 samples/sec                   batch loss = 582.0627952814102 | accuracy = 0.6887755102040817


Epoch[2] Batch[250] Speed: 1.2643204280412215 samples/sec                   batch loss = 593.7142015695572 | accuracy = 0.69


Epoch[2] Batch[255] Speed: 1.2580130847572872 samples/sec                   batch loss = 605.7093560695648 | accuracy = 0.6901960784313725


Epoch[2] Batch[260] Speed: 1.2528387517893065 samples/sec                   batch loss = 618.9612774848938 | accuracy = 0.6875


Epoch[2] Batch[265] Speed: 1.252543091869945 samples/sec                   batch loss = 635.0511648654938 | accuracy = 0.6830188679245283


Epoch[2] Batch[270] Speed: 1.2539193162599112 samples/sec                   batch loss = 647.516129732132 | accuracy = 0.6824074074074075


Epoch[2] Batch[275] Speed: 1.259061115872356 samples/sec                   batch loss = 655.8604209423065 | accuracy = 0.6872727272727273


Epoch[2] Batch[280] Speed: 1.2555119986398109 samples/sec                   batch loss = 666.9563812017441 | accuracy = 0.6857142857142857


Epoch[2] Batch[285] Speed: 1.2575050392870637 samples/sec                   batch loss = 678.997061252594 | accuracy = 0.6842105263157895


Epoch[2] Batch[290] Speed: 1.2502471100936574 samples/sec                   batch loss = 689.1465361118317 | accuracy = 0.6870689655172414


Epoch[2] Batch[295] Speed: 1.2474174619838865 samples/sec                   batch loss = 701.6723871231079 | accuracy = 0.6872881355932203


Epoch[2] Batch[300] Speed: 1.2482139320516707 samples/sec                   batch loss = 714.8335082530975 | accuracy = 0.6875


Epoch[2] Batch[305] Speed: 1.248913607288315 samples/sec                   batch loss = 727.0069621801376 | accuracy = 0.6860655737704918


Epoch[2] Batch[310] Speed: 1.254018008235138 samples/sec                   batch loss = 740.8747395277023 | accuracy = 0.6846774193548387


Epoch[2] Batch[315] Speed: 1.252377878679685 samples/sec                   batch loss = 752.5151886940002 | accuracy = 0.6841269841269841


Epoch[2] Batch[320] Speed: 1.2477338122808388 samples/sec                   batch loss = 763.2511838674545 | accuracy = 0.68515625


Epoch[2] Batch[325] Speed: 1.2497762246263755 samples/sec                   batch loss = 776.775159239769 | accuracy = 0.6823076923076923


Epoch[2] Batch[330] Speed: 1.2539425585554935 samples/sec                   batch loss = 789.0969904661179 | accuracy = 0.6825757575757576


Epoch[2] Batch[335] Speed: 1.2510233174858998 samples/sec                   batch loss = 802.1435035467148 | accuracy = 0.682089552238806


Epoch[2] Batch[340] Speed: 1.2581735607720679 samples/sec                   batch loss = 816.8849680423737 | accuracy = 0.6801470588235294


Epoch[2] Batch[345] Speed: 1.2473403933015776 samples/sec                   batch loss = 827.5939862728119 | accuracy = 0.6797101449275362


Epoch[2] Batch[350] Speed: 1.2550853981218473 samples/sec                   batch loss = 839.7301516532898 | accuracy = 0.6807142857142857


Epoch[2] Batch[355] Speed: 1.2477923685950687 samples/sec                   batch loss = 853.1944514513016 | accuracy = 0.680281690140845


Epoch[2] Batch[360] Speed: 1.2481287794615268 samples/sec                   batch loss = 863.6078976392746 | accuracy = 0.68125


Epoch[2] Batch[365] Speed: 1.2516332309007159 samples/sec                   batch loss = 875.2494575977325 | accuracy = 0.6815068493150684


Epoch[2] Batch[370] Speed: 1.2467631911673498 samples/sec                   batch loss = 889.1555511951447 | accuracy = 0.6797297297297298


Epoch[2] Batch[375] Speed: 1.2502791610799673 samples/sec                   batch loss = 902.2870352268219 | accuracy = 0.678


Epoch[2] Batch[380] Speed: 1.2511513173349307 samples/sec                   batch loss = 916.7416400909424 | accuracy = 0.6763157894736842


Epoch[2] Batch[385] Speed: 1.2521407472565134 samples/sec                   batch loss = 926.0808601379395 | accuracy = 0.6792207792207792


Epoch[2] Batch[390] Speed: 1.2485131261117461 samples/sec                   batch loss = 937.1112275123596 | accuracy = 0.6807692307692308


Epoch[2] Batch[395] Speed: 1.2514810468742505 samples/sec                   batch loss = 948.7538723945618 | accuracy = 0.680379746835443


Epoch[2] Batch[400] Speed: 1.2480882037432195 samples/sec                   batch loss = 961.343533873558 | accuracy = 0.681875


Epoch[2] Batch[405] Speed: 1.2590948487645952 samples/sec                   batch loss = 976.6044473648071 | accuracy = 0.6796296296296296


Epoch[2] Batch[410] Speed: 1.2484769848459751 samples/sec                   batch loss = 988.0308884382248 | accuracy = 0.6798780487804879


Epoch[2] Batch[415] Speed: 1.2489869652761973 samples/sec                   batch loss = 999.8591237068176 | accuracy = 0.6795180722891566


Epoch[2] Batch[420] Speed: 1.2545372169223727 samples/sec                   batch loss = 1012.3712675571442 | accuracy = 0.6797619047619048


Epoch[2] Batch[425] Speed: 1.2489054259429535 samples/sec                   batch loss = 1023.617197394371 | accuracy = 0.6817647058823529


Epoch[2] Batch[430] Speed: 1.247085049629659 samples/sec                   batch loss = 1035.568844795227 | accuracy = 0.6831395348837209


Epoch[2] Batch[435] Speed: 1.2466495192952511 samples/sec                   batch loss = 1047.9631043672562 | accuracy = 0.6827586206896552


Epoch[2] Batch[440] Speed: 1.2482416068234643 samples/sec                   batch loss = 1059.403757572174 | accuracy = 0.6823863636363636


Epoch[2] Batch[445] Speed: 1.2508241861266842 samples/sec                   batch loss = 1071.873568058014 | accuracy = 0.6808988764044944


Epoch[2] Batch[450] Speed: 1.2477531139134765 samples/sec                   batch loss = 1084.3823260068893 | accuracy = 0.68


Epoch[2] Batch[455] Speed: 1.2451383676243553 samples/sec                   batch loss = 1098.1955935955048 | accuracy = 0.6796703296703297


Epoch[2] Batch[460] Speed: 1.249249320803423 samples/sec                   batch loss = 1108.125414967537 | accuracy = 0.6804347826086956


Epoch[2] Batch[465] Speed: 1.247707551883834 samples/sec                   batch loss = 1119.9760711193085 | accuracy = 0.6811827956989247


Epoch[2] Batch[470] Speed: 1.2490958559468617 samples/sec                   batch loss = 1129.6213393211365 | accuracy = 0.6824468085106383


Epoch[2] Batch[475] Speed: 1.2500961943976263 samples/sec                   batch loss = 1143.375987291336 | accuracy = 0.6815789473684211


Epoch[2] Batch[480] Speed: 1.2492456930171438 samples/sec                   batch loss = 1156.8321328163147 | accuracy = 0.68125


Epoch[2] Batch[485] Speed: 1.2522028956063447 samples/sec                   batch loss = 1167.2789324522018 | accuracy = 0.6829896907216495


Epoch[2] Batch[490] Speed: 1.248266217970204 samples/sec                   batch loss = 1178.5235201120377 | accuracy = 0.6821428571428572


Epoch[2] Batch[495] Speed: 1.2543268373143261 samples/sec                   batch loss = 1189.1216888427734 | accuracy = 0.6823232323232323


Epoch[2] Batch[500] Speed: 1.2449748247767058 samples/sec                   batch loss = 1198.2986500263214 | accuracy = 0.683


Epoch[2] Batch[505] Speed: 1.2488455566324108 samples/sec                   batch loss = 1211.9802953004837 | accuracy = 0.6831683168316832


Epoch[2] Batch[510] Speed: 1.2481383434589675 samples/sec                   batch loss = 1225.7960171699524 | accuracy = 0.6813725490196079


Epoch[2] Batch[515] Speed: 1.2477085725853316 samples/sec                   batch loss = 1234.887536406517 | accuracy = 0.683495145631068


Epoch[2] Batch[520] Speed: 1.246394087991452 samples/sec                   batch loss = 1247.445671081543 | accuracy = 0.6836538461538462


Epoch[2] Batch[525] Speed: 1.2486894028368858 samples/sec                   batch loss = 1258.8604480028152 | accuracy = 0.6842857142857143


Epoch[2] Batch[530] Speed: 1.2490796745778743 samples/sec                   batch loss = 1267.7102487683296 | accuracy = 0.6858490566037736


Epoch[2] Batch[535] Speed: 1.2501632635420812 samples/sec                   batch loss = 1277.6733545660973 | accuracy = 0.6850467289719626


Epoch[2] Batch[540] Speed: 1.2462207729262265 samples/sec                   batch loss = 1290.0737176537514 | accuracy = 0.6851851851851852


Epoch[2] Batch[545] Speed: 1.2477966375745948 samples/sec                   batch loss = 1303.5724329352379 | accuracy = 0.6844036697247706


Epoch[2] Batch[550] Speed: 1.2497714765997328 samples/sec                   batch loss = 1313.6083126664162 | accuracy = 0.6859090909090909


Epoch[2] Batch[555] Speed: 1.2485360755080368 samples/sec                   batch loss = 1324.535262644291 | accuracy = 0.6864864864864865


Epoch[2] Batch[560] Speed: 1.255626352800792 samples/sec                   batch loss = 1334.8461109995842 | accuracy = 0.6866071428571429


Epoch[2] Batch[565] Speed: 1.2503643277893304 samples/sec                   batch loss = 1346.1824166178703 | accuracy = 0.6871681415929204


Epoch[2] Batch[570] Speed: 1.249384308033422 samples/sec                   batch loss = 1358.693315923214 | accuracy = 0.6877192982456141


Epoch[2] Batch[575] Speed: 1.2507086537159815 samples/sec                   batch loss = 1372.763788998127 | accuracy = 0.6873913043478261


Epoch[2] Batch[580] Speed: 1.2487919208363798 samples/sec                   batch loss = 1383.4708705544472 | accuracy = 0.6883620689655172


Epoch[2] Batch[585] Speed: 1.263293304586508 samples/sec                   batch loss = 1398.9296343922615 | accuracy = 0.6876068376068376


Epoch[2] Batch[590] Speed: 1.2477764993838174 samples/sec                   batch loss = 1414.1203466057777 | accuracy = 0.6855932203389831


Epoch[2] Batch[595] Speed: 1.251651159308589 samples/sec                   batch loss = 1423.5098739266396 | accuracy = 0.6865546218487395


Epoch[2] Batch[600] Speed: 1.251355872913346 samples/sec                   batch loss = 1432.9861637949944 | accuracy = 0.6879166666666666


Epoch[2] Batch[605] Speed: 1.2526824396447347 samples/sec                   batch loss = 1442.858401119709 | accuracy = 0.6888429752066115


Epoch[2] Batch[610] Speed: 1.2503769081007536 samples/sec                   batch loss = 1455.7758902907372 | accuracy = 0.6885245901639344


Epoch[2] Batch[615] Speed: 1.2495596209973663 samples/sec                   batch loss = 1466.1530784964561 | accuracy = 0.6890243902439024


Epoch[2] Batch[620] Speed: 1.2549862564429703 samples/sec                   batch loss = 1478.3131304383278 | accuracy = 0.6895161290322581


Epoch[2] Batch[625] Speed: 1.2504232244658546 samples/sec                   batch loss = 1493.5515734553337 | accuracy = 0.6888


Epoch[2] Batch[630] Speed: 1.2490900901236268 samples/sec                   batch loss = 1505.654366672039 | accuracy = 0.6888888888888889


Epoch[2] Batch[635] Speed: 1.248177993827264 samples/sec                   batch loss = 1517.0846745371819 | accuracy = 0.6877952755905512


Epoch[2] Batch[640] Speed: 1.2522040171369941 samples/sec                   batch loss = 1527.2784433960915 | accuracy = 0.687890625


Epoch[2] Batch[645] Speed: 1.250500860998672 samples/sec                   batch loss = 1539.4042750000954 | accuracy = 0.6872093023255814


Epoch[2] Batch[650] Speed: 1.2543569408140602 samples/sec                   batch loss = 1554.5729663968086 | accuracy = 0.6853846153846154


Epoch[2] Batch[655] Speed: 1.2640642763549337 samples/sec                   batch loss = 1565.3405626416206 | accuracy = 0.6862595419847328


Epoch[2] Batch[660] Speed: 1.2554167350725411 samples/sec                   batch loss = 1577.6402485966682 | accuracy = 0.6856060606060606


Epoch[2] Batch[665] Speed: 1.2546378826813145 samples/sec                   batch loss = 1589.9530792832375 | accuracy = 0.6860902255639098


Epoch[2] Batch[670] Speed: 1.2544426639900783 samples/sec                   batch loss = 1600.1169154047966 | accuracy = 0.6865671641791045


Epoch[2] Batch[675] Speed: 1.2594971370212695 samples/sec                   batch loss = 1612.7985075116158 | accuracy = 0.6862962962962963


Epoch[2] Batch[680] Speed: 1.2508607431986953 samples/sec                   batch loss = 1625.6947059035301 | accuracy = 0.6863970588235294


Epoch[2] Batch[685] Speed: 1.2536689539086625 samples/sec                   batch loss = 1636.0265337824821 | accuracy = 0.6875912408759124


Epoch[2] Batch[690] Speed: 1.2513121006072705 samples/sec                   batch loss = 1647.2299835085869 | accuracy = 0.6880434782608695


Epoch[2] Batch[695] Speed: 1.2528306124923907 samples/sec                   batch loss = 1659.9360056519508 | accuracy = 0.6870503597122302


Epoch[2] Batch[700] Speed: 1.2485580965721244 samples/sec                   batch loss = 1670.2407470345497 | accuracy = 0.6875


Epoch[2] Batch[705] Speed: 1.2499789338383946 samples/sec                   batch loss = 1683.2054792046547 | accuracy = 0.6868794326241134


Epoch[2] Batch[710] Speed: 1.2530559323147887 samples/sec                   batch loss = 1695.2505509257317 | accuracy = 0.6873239436619718


Epoch[2] Batch[715] Speed: 1.2509120386114303 samples/sec                   batch loss = 1706.486323773861 | accuracy = 0.6870629370629371


Epoch[2] Batch[720] Speed: 1.2504194034688574 samples/sec                   batch loss = 1721.2007440924644 | accuracy = 0.6864583333333333


Epoch[2] Batch[725] Speed: 1.2517979674679975 samples/sec                   batch loss = 1732.2419348359108 | accuracy = 0.6865517241379311


Epoch[2] Batch[730] Speed: 1.2532996833511174 samples/sec                   batch loss = 1744.516754090786 | accuracy = 0.6866438356164384


Epoch[2] Batch[735] Speed: 1.252840903575132 samples/sec                   batch loss = 1760.4601516127586 | accuracy = 0.6846938775510204


Epoch[2] Batch[740] Speed: 1.2528585859211931 samples/sec                   batch loss = 1771.472307741642 | accuracy = 0.6851351351351351


Epoch[2] Batch[745] Speed: 1.2516362189330277 samples/sec                   batch loss = 1782.2178615927696 | accuracy = 0.6855704697986578


Epoch[2] Batch[750] Speed: 1.252093836347614 samples/sec                   batch loss = 1793.4736949801445 | accuracy = 0.6863333333333334


Epoch[2] Batch[755] Speed: 1.2474598491810198 samples/sec                   batch loss = 1808.3351051211357 | accuracy = 0.686092715231788


Epoch[2] Batch[760] Speed: 1.2550209918016624 samples/sec                   batch loss = 1821.036797940731 | accuracy = 0.6855263157894737


Epoch[2] Batch[765] Speed: 1.2545131083229746 samples/sec                   batch loss = 1831.347050845623 | accuracy = 0.6859477124183007


Epoch[2] Batch[770] Speed: 1.2517065351863255 samples/sec                   batch loss = 1844.0396775603294 | accuracy = 0.6863636363636364


Epoch[2] Batch[775] Speed: 1.25201403980931 samples/sec                   batch loss = 1852.5028476119041 | accuracy = 0.687741935483871


Epoch[2] Batch[780] Speed: 1.246079989922754 samples/sec                   batch loss = 1864.8468419909477 | accuracy = 0.6865384615384615


Epoch[2] Batch[785] Speed: 1.2523726434337128 samples/sec                   batch loss = 1877.264324605465 | accuracy = 0.6872611464968152


[Epoch 2] training: accuracy=0.6868654822335025
[Epoch 2] time cost: 645.3435025215149
[Epoch 2] validation: validation accuracy=0.7388888888888889


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).