<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference may run faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `device` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the device, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), device=gpu)
x

[11:38:08] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], device=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the output shows which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. In the example code here, you copy `x` to the second GPU, `npx.gpu(1)`. On a machine with fewer than two GPUs, `npx.gpu(1)` is not available, so the code falls back to `npx.cpu()`:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[11:38:08] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
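The zero-based indexing rule can be sketched without a GPU. The helper below, `choose_gpu_ids`, is hypothetical and not part of MXNet: it assumes you pass in the GPU count (for example, the value returned by `npx.num_gpus()`) and the number of GPUs you would like to use, and it returns the ids that would be valid arguments to `npx.gpu(i)`.

```python
def choose_gpu_ids(num_available, num_requested):
    """Return zero-based GPU ids to use, capped at what is available.

    Hypothetical helper for illustration only; not part of MXNet.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# A machine with eight GPUs: the last valid id is 7.
print(choose_gpu_ids(8, 8))
# Asking for two GPUs on a single-GPU machine yields only id 0.
print(choose_gpu_ids(1, 2))
# An empty list means no GPU is available, so a CPU fallback is needed.
print(choose_gpu_ids(0, 2))
```

In an MXNet session, each returned id `i` would typically be turned into a device with `npx.gpu(i)`.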

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), device=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], device=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and the parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_device(gpu)` to change the device of parameters that are already loaded.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', device=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), device=gpu)
net(x)

[11:38:08] /work/mxnet/src/operator/cudnn_ops.cc:353: Auto-tuning cuDNN op, set MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable


[11:38:09] /work/mxnet/src/operator/cudnn_ops.cc:353: Auto-tuning cuDNN op, set MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable


array([[ 5.3043594, -4.0917797]], device=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, if there are *n* GPUs, you split each data batch into *n* parts and have each GPU run the forward and backward passes on its own part of the data.
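The idea can be illustrated with a minimal pure-Python sketch that uses no GPU and no MXNet. Everything in it is an assumption made for illustration: `split_batch` and `grad_sum` are hypothetical helpers, the "devices" are simulated by plain list chunks, and the model is a single weight with a squared-error loss. It shows that summing the per-part gradients and dividing by the full batch size gives the same result as computing the gradient on the whole batch at once.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous, nearly equal chunks."""
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    size, extra = divmod(len(batch), num_parts)
    chunks, start = [], 0
    for i in range(num_parts):
        # The first `extra` chunks take one additional sample.
        end = start + size + (1 if i < extra else 0)
        chunks.append(batch[start:end])
        start = end
    return chunks


def grad_sum(w, samples):
    """Sum of d/dw (w*x - y)**2 over the samples, for a one-weight model."""
    return sum(2 * (w * x - y) * x for x, y in samples)


# Toy batch of (input, target) pairs and a single weight.
batch = [(1, 2), (2, 4), (3, 7), (4, 8)]
w = 1

# Each simulated device computes the gradient sum for its own chunk.
chunks = split_batch(batch, 2)
per_device = [grad_sum(w, chunk) for chunk in chunks]

# The per-device sums are added and normalized by the full batch size.
parallel_grad = sum(per_device) / len(batch)
full_grad = grad_sum(w, batch) / len(batch)

print(chunks)
print(parallel_grad == full_grad)
```

This sketch accepts batch sizes that do not divide evenly, which is a simplification chosen here and not a statement about how MXNet's own splitting utility behaves.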

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# Mean and std used to normalize image values, which are in the range (0, 1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
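The metric update in `test` receives one shard of labels and one shard of outputs per device. As a rough illustration of how accuracy can be accumulated across such shards, here is a pure-Python sketch. `ShardedAccuracy` is a hypothetical class, not `gluon.metric.Accuracy`, and it assumes the predictions are already class ids, not raw network outputs.

```python
class ShardedAccuracy:
    """Running accuracy accumulated over per-device shards (illustrative)."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, label_shards, prediction_shards):
        # One (labels, predictions) pair per simulated device.
        for labels, preds in zip(label_shards, prediction_shards):
            self.correct += sum(int(l == p) for l, p in zip(labels, preds))
            self.total += len(labels)

    def get(self):
        # Undefined until at least one sample has been seen.
        return self.correct / self.total if self.total else float("nan")


metric = ShardedAccuracy()
# A batch of four samples split across two simulated devices:
# device 0 gets both right, device 1 gets one of two right.
metric.update([[0, 1], [1, 0]], [[0, 1], [0, 0]])
print(metric.get())
```

Because the counts are summed over all shards before dividing, the result is the same as if the whole batch had been evaluated on a single device.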

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if the machine has fewer).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, device=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.5857363360176356 samples/sec                   batch loss = 13.333740711212158 | accuracy = 0.65


Epoch[1] Batch[10] Speed: 1.2620584761026283 samples/sec                   batch loss = 28.073069095611572 | accuracy = 0.6


Epoch[1] Batch[15] Speed: 1.2647742112714906 samples/sec                   batch loss = 40.980440616607666 | accuracy = 0.6666666666666666


Epoch[1] Batch[20] Speed: 1.2687218380008947 samples/sec                   batch loss = 54.91207456588745 | accuracy = 0.65


Epoch[1] Batch[25] Speed: 1.2689795930407535 samples/sec                   batch loss = 68.61178040504456 | accuracy = 0.65


Epoch[1] Batch[30] Speed: 1.2660193301465301 samples/sec                   batch loss = 82.75757956504822 | accuracy = 0.625


Epoch[1] Batch[35] Speed: 1.2664164965816556 samples/sec                   batch loss = 96.6297676563263 | accuracy = 0.5928571428571429


Epoch[1] Batch[40] Speed: 1.2654313966556294 samples/sec                   batch loss = 110.083172082901 | accuracy = 0.5875


Epoch[1] Batch[45] Speed: 1.265775668237505 samples/sec                   batch loss = 124.2527379989624 | accuracy = 0.5666666666666667


Epoch[1] Batch[50] Speed: 1.2650325585042692 samples/sec                   batch loss = 137.73289275169373 | accuracy = 0.565


Epoch[1] Batch[55] Speed: 1.2640842770268657 samples/sec                   batch loss = 150.93075466156006 | accuracy = 0.5772727272727273


Epoch[1] Batch[60] Speed: 1.25707868475462 samples/sec                   batch loss = 164.64132833480835 | accuracy = 0.5708333333333333


Epoch[1] Batch[65] Speed: 1.2619772145978276 samples/sec                   batch loss = 178.53408551216125 | accuracy = 0.5615384615384615


Epoch[1] Batch[70] Speed: 1.2664181216914243 samples/sec                   batch loss = 192.85364532470703 | accuracy = 0.5535714285714286


Epoch[1] Batch[75] Speed: 1.2667775645912707 samples/sec                   batch loss = 207.14948225021362 | accuracy = 0.5466666666666666


Epoch[1] Batch[80] Speed: 1.2626969707027518 samples/sec                   batch loss = 221.89357113838196 | accuracy = 0.528125


Epoch[1] Batch[85] Speed: 1.265931348876741 samples/sec                   batch loss = 235.48512601852417 | accuracy = 0.5352941176470588


Epoch[1] Batch[90] Speed: 1.2650747204120576 samples/sec                   batch loss = 249.60612630844116 | accuracy = 0.5333333333333333


Epoch[1] Batch[95] Speed: 1.2687275946022487 samples/sec                   batch loss = 263.4053843021393 | accuracy = 0.531578947368421


Epoch[1] Batch[100] Speed: 1.2686212977880964 samples/sec                   batch loss = 277.05040669441223 | accuracy = 0.5375


Epoch[1] Batch[105] Speed: 1.2689421612007001 samples/sec                   batch loss = 291.4759418964386 | accuracy = 0.5380952380952381


Epoch[1] Batch[110] Speed: 1.2654490543950538 samples/sec                   batch loss = 305.4304578304291 | accuracy = 0.5431818181818182


Epoch[1] Batch[115] Speed: 1.2676498149060158 samples/sec                   batch loss = 319.62489581108093 | accuracy = 0.5347826086956522


Epoch[1] Batch[120] Speed: 1.2621609222796633 samples/sec                   batch loss = 333.747567653656 | accuracy = 0.5354166666666667


Epoch[1] Batch[125] Speed: 1.2687234690326412 samples/sec                   batch loss = 348.09001898765564 | accuracy = 0.53


Epoch[1] Batch[130] Speed: 1.2599729173521128 samples/sec                   batch loss = 361.87299251556396 | accuracy = 0.5307692307692308


Epoch[1] Batch[135] Speed: 1.2662500883050325 samples/sec                   batch loss = 375.71484994888306 | accuracy = 0.5296296296296297


Epoch[1] Batch[140] Speed: 1.2642647879096336 samples/sec                   batch loss = 389.22209763526917 | accuracy = 0.5339285714285714


Epoch[1] Batch[145] Speed: 1.2602838347711567 samples/sec                   batch loss = 403.1518416404724 | accuracy = 0.5362068965517242


Epoch[1] Batch[150] Speed: 1.2623259725059206 samples/sec                   batch loss = 416.3816788196564 | accuracy = 0.54


Epoch[1] Batch[155] Speed: 1.259734698497641 samples/sec                   batch loss = 430.40098333358765 | accuracy = 0.5370967741935484


Epoch[1] Batch[160] Speed: 1.2651759395395912 samples/sec                   batch loss = 443.651771068573 | accuracy = 0.5421875


Epoch[1] Batch[165] Speed: 1.2688536772724286 samples/sec                   batch loss = 457.63401055336 | accuracy = 0.543939393939394


Epoch[1] Batch[170] Speed: 1.2640974206701867 samples/sec                   batch loss = 471.58194732666016 | accuracy = 0.5411764705882353


Epoch[1] Batch[175] Speed: 1.2697310001623376 samples/sec                   batch loss = 485.30334734916687 | accuracy = 0.5428571428571428


Epoch[1] Batch[180] Speed: 1.2689851600158566 samples/sec                   batch loss = 499.1781988143921 | accuracy = 0.5416666666666666


Epoch[1] Batch[185] Speed: 1.2653863477906973 samples/sec                   batch loss = 512.951623916626 | accuracy = 0.5378378378378378


Epoch[1] Batch[190] Speed: 1.2628633969285799 samples/sec                   batch loss = 526.0344653129578 | accuracy = 0.5421052631578948


Epoch[1] Batch[195] Speed: 1.264231063227468 samples/sec                   batch loss = 539.6907732486725 | accuracy = 0.541025641025641


Epoch[1] Batch[200] Speed: 1.2608439677392833 samples/sec                   batch loss = 553.6563055515289 | accuracy = 0.5375


Epoch[1] Batch[205] Speed: 1.2662719740870818 samples/sec                   batch loss = 566.9424059391022 | accuracy = 0.5426829268292683


Epoch[1] Batch[210] Speed: 1.2669544444479686 samples/sec                   batch loss = 580.7857785224915 | accuracy = 0.5416666666666666


Epoch[1] Batch[215] Speed: 1.2646050886715923 samples/sec                   batch loss = 595.0433824062347 | accuracy = 0.5372093023255814


Epoch[1] Batch[220] Speed: 1.2714531559248181 samples/sec                   batch loss = 608.7788956165314 | accuracy = 0.5375


Epoch[1] Batch[225] Speed: 1.2613739664396293 samples/sec                   batch loss = 622.7149307727814 | accuracy = 0.5377777777777778


Epoch[1] Batch[230] Speed: 1.2631299983978626 samples/sec                   batch loss = 635.9986686706543 | accuracy = 0.5391304347826087


Epoch[1] Batch[235] Speed: 1.2658536947897554 samples/sec                   batch loss = 649.5096428394318 | accuracy = 0.5414893617021277


Epoch[1] Batch[240] Speed: 1.2645848808790248 samples/sec                   batch loss = 663.2735333442688 | accuracy = 0.54375


Epoch[1] Batch[245] Speed: 1.2602461568508148 samples/sec                   batch loss = 676.7741248607635 | accuracy = 0.5448979591836735


Epoch[1] Batch[250] Speed: 1.2605997362970809 samples/sec                   batch loss = 689.904780626297 | accuracy = 0.548


Epoch[1] Batch[255] Speed: 1.2627260517344479 samples/sec                   batch loss = 703.2992463111877 | accuracy = 0.55


Epoch[1] Batch[260] Speed: 1.2618706222069522 samples/sec                   batch loss = 716.4840984344482 | accuracy = 0.5509615384615385


Epoch[1] Batch[265] Speed: 1.2665010078204695 samples/sec                   batch loss = 730.4803364276886 | accuracy = 0.55


Epoch[1] Batch[270] Speed: 1.2610989106145922 samples/sec                   batch loss = 743.8904747962952 | accuracy = 0.549074074074074


Epoch[1] Batch[275] Speed: 1.261367897040263 samples/sec                   batch loss = 758.1301462650299 | accuracy = 0.5454545454545454


Epoch[1] Batch[280] Speed: 1.2630047653594405 samples/sec                   batch loss = 771.7843189239502 | accuracy = 0.5455357142857142


Epoch[1] Batch[285] Speed: 1.2630491693734884 samples/sec                   batch loss = 785.2327990531921 | accuracy = 0.5473684210526316


Epoch[1] Batch[290] Speed: 1.2577311008974927 samples/sec                   batch loss = 799.0659325122833 | accuracy = 0.5465517241379311


Epoch[1] Batch[295] Speed: 1.2584056205310468 samples/sec                   batch loss = 813.9019784927368 | accuracy = 0.5423728813559322


Epoch[1] Batch[300] Speed: 1.2629426810414095 samples/sec                   batch loss = 827.4488084316254 | accuracy = 0.5416666666666666


Epoch[1] Batch[305] Speed: 1.263675055984144 samples/sec                   batch loss = 841.0252957344055 | accuracy = 0.5426229508196722


Epoch[1] Batch[310] Speed: 1.259620351394792 samples/sec                   batch loss = 854.6051366329193 | accuracy = 0.542741935483871


Epoch[1] Batch[315] Speed: 1.267722133575253 samples/sec                   batch loss = 867.7780215740204 | accuracy = 0.5428571428571428


Epoch[1] Batch[320] Speed: 1.2664777756657926 samples/sec                   batch loss = 881.0163912773132 | accuracy = 0.5453125


Epoch[1] Batch[325] Speed: 1.2645488516058931 samples/sec                   batch loss = 894.566618680954 | accuracy = 0.5446153846153846


Epoch[1] Batch[330] Speed: 1.2617747708761735 samples/sec                   batch loss = 907.672366142273 | accuracy = 0.546969696969697


Epoch[1] Batch[335] Speed: 1.2618037145452725 samples/sec                   batch loss = 921.3284733295441 | accuracy = 0.5485074626865671


Epoch[1] Batch[340] Speed: 1.2657820666257396 samples/sec                   batch loss = 935.2372422218323 | accuracy = 0.5470588235294118


Epoch[1] Batch[345] Speed: 1.2615998085779863 samples/sec                   batch loss = 949.036164522171 | accuracy = 0.5463768115942029


Epoch[1] Batch[350] Speed: 1.2615292301514913 samples/sec                   batch loss = 961.7139639854431 | accuracy = 0.5478571428571428


Epoch[1] Batch[355] Speed: 1.2643388170381271 samples/sec                   batch loss = 975.0182337760925 | accuracy = 0.547887323943662


Epoch[1] Batch[360] Speed: 1.2585628922643675 samples/sec                   batch loss = 988.5209803581238 | accuracy = 0.5493055555555556


Epoch[1] Batch[365] Speed: 1.2699353323070735 samples/sec                   batch loss = 1002.375818490982 | accuracy = 0.55


Epoch[1] Batch[370] Speed: 1.2628519899617467 samples/sec                   batch loss = 1016.9029855728149 | accuracy = 0.5486486486486486


Epoch[1] Batch[375] Speed: 1.26243226186618 samples/sec                   batch loss = 1029.6698062419891 | accuracy = 0.5526666666666666


Epoch[1] Batch[380] Speed: 1.2642028654393676 samples/sec                   batch loss = 1043.7601838111877 | accuracy = 0.5519736842105263


Epoch[1] Batch[385] Speed: 1.2647035632415253 samples/sec                   batch loss = 1057.1460082530975 | accuracy = 0.5525974025974026


Epoch[1] Batch[390] Speed: 1.2623218884640073 samples/sec                   batch loss = 1070.6135840415955 | accuracy = 0.5525641025641026


Epoch[1] Batch[395] Speed: 1.2649774279420016 samples/sec                   batch loss = 1083.830885887146 | accuracy = 0.5544303797468354


Epoch[1] Batch[400] Speed: 1.262649265506126 samples/sec                   batch loss = 1096.1574022769928 | accuracy = 0.55875


Epoch[1] Batch[405] Speed: 1.2645750631751769 samples/sec                   batch loss = 1110.789568901062 | accuracy = 0.5598765432098766


Epoch[1] Batch[410] Speed: 1.2621657649109796 samples/sec                   batch loss = 1124.7046756744385 | accuracy = 0.5603658536585366


Epoch[1] Batch[415] Speed: 1.2654476226680937 samples/sec                   batch loss = 1138.0272858142853 | accuracy = 0.560843373493976


Epoch[1] Batch[420] Speed: 1.26267995987052 samples/sec                   batch loss = 1151.743744134903 | accuracy = 0.5601190476190476


Epoch[1] Batch[425] Speed: 1.2626168622391885 samples/sec                   batch loss = 1165.585125207901 | accuracy = 0.558235294117647


Epoch[1] Batch[430] Speed: 1.2605060667224897 samples/sec                   batch loss = 1179.4483525753021 | accuracy = 0.5587209302325581


Epoch[1] Batch[435] Speed: 1.2622648096206859 samples/sec                   batch loss = 1194.2336769104004 | accuracy = 0.5580459770114943


Epoch[1] Batch[440] Speed: 1.2637277885394915 samples/sec                   batch loss = 1207.849463224411 | accuracy = 0.5585227272727272


Epoch[1] Batch[445] Speed: 1.261365810697721 samples/sec                   batch loss = 1220.700388431549 | accuracy = 0.5595505617977528


Epoch[1] Batch[450] Speed: 1.2631527274810745 samples/sec                   batch loss = 1233.8930673599243 | accuracy = 0.5611111111111111


Epoch[1] Batch[455] Speed: 1.2611875487353594 samples/sec                   batch loss = 1246.769696712494 | accuracy = 0.5631868131868132


Epoch[1] Batch[460] Speed: 1.2612976290285525 samples/sec                   batch loss = 1260.2257432937622 | accuracy = 0.5625


Epoch[1] Batch[465] Speed: 1.2654188933651858 samples/sec                   batch loss = 1273.9841096401215 | accuracy = 0.5629032258064516


Epoch[1] Batch[470] Speed: 1.2602099010796413 samples/sec                   batch loss = 1287.1930708885193 | accuracy = 0.5627659574468085


Epoch[1] Batch[475] Speed: 1.257754390426166 samples/sec                   batch loss = 1299.8719506263733 | accuracy = 0.5642105263157895


Epoch[1] Batch[480] Speed: 1.2557606539182566 samples/sec                   batch loss = 1313.0050148963928 | accuracy = 0.5640625


Epoch[1] Batch[485] Speed: 1.259646831920513 samples/sec                   batch loss = 1326.4855284690857 | accuracy = 0.5644329896907216


Epoch[1] Batch[490] Speed: 1.264894264283124 samples/sec                   batch loss = 1340.6757822036743 | accuracy = 0.5622448979591836


Epoch[1] Batch[495] Speed: 1.2626491704794514 samples/sec                   batch loss = 1354.3651812076569 | accuracy = 0.5626262626262626


Epoch[1] Batch[500] Speed: 1.2607812429548357 samples/sec                   batch loss = 1367.650505065918 | accuracy = 0.563


Epoch[1] Batch[505] Speed: 1.261096730368633 samples/sec                   batch loss = 1381.3666915893555 | accuracy = 0.5623762376237624


Epoch[1] Batch[510] Speed: 1.2592153381366584 samples/sec                   batch loss = 1394.0111439228058 | accuracy = 0.5642156862745098


Epoch[1] Batch[515] Speed: 1.2669178973466126 samples/sec                   batch loss = 1407.7078156471252 | accuracy = 0.5650485436893203


Epoch[1] Batch[520] Speed: 1.264133805224331 samples/sec                   batch loss = 1421.3840823173523 | accuracy = 0.5653846153846154


Epoch[1] Batch[525] Speed: 1.2596721786094094 samples/sec                   batch loss = 1434.667594909668 | accuracy = 0.5652380952380952


Epoch[1] Batch[530] Speed: 1.25664414793141 samples/sec                   batch loss = 1447.743402004242 | accuracy = 0.5669811320754717


Epoch[1] Batch[535] Speed: 1.2639379057910167 samples/sec                   batch loss = 1461.2493968009949 | accuracy = 0.5654205607476636


Epoch[1] Batch[540] Speed: 1.2618602771544105 samples/sec                   batch loss = 1474.8428721427917 | accuracy = 0.5666666666666667


Epoch[1] Batch[545] Speed: 1.2665038760468226 samples/sec                   batch loss = 1488.4420223236084 | accuracy = 0.5669724770642202


Epoch[1] Batch[550] Speed: 1.2621268350047503 samples/sec                   batch loss = 1503.004476070404 | accuracy = 0.5663636363636364


Epoch[1] Batch[555] Speed: 1.2643134727569463 samples/sec                   batch loss = 1516.0034427642822 | accuracy = 0.5675675675675675


Epoch[1] Batch[560] Speed: 1.263077696083192 samples/sec                   batch loss = 1529.3293206691742 | accuracy = 0.5678571428571428


Epoch[1] Batch[565] Speed: 1.26518996457936 samples/sec                   batch loss = 1541.0588021278381 | accuracy = 0.570353982300885


Epoch[1] Batch[570] Speed: 1.257967430632541 samples/sec                   batch loss = 1553.537037372589 | accuracy = 0.5719298245614035


Epoch[1] Batch[575] Speed: 1.2548262169963524 samples/sec                   batch loss = 1566.0017552375793 | accuracy = 0.5730434782608695


Epoch[1] Batch[580] Speed: 1.258461690097466 samples/sec                   batch loss = 1581.550961971283 | accuracy = 0.571551724137931


Epoch[1] Batch[585] Speed: 1.2576711366878346 samples/sec                   batch loss = 1594.7294435501099 | accuracy = 0.5717948717948718


Epoch[1] Batch[590] Speed: 1.2603750095126025 samples/sec                   batch loss = 1607.6223287582397 | accuracy = 0.5720338983050848


Epoch[1] Batch[595] Speed: 1.2611722850265288 samples/sec                   batch loss = 1622.2428143024445 | accuracy = 0.5714285714285714


Epoch[1] Batch[600] Speed: 1.2595088616797505 samples/sec                   batch loss = 1635.501209974289 | accuracy = 0.5725


Epoch[1] Batch[605] Speed: 1.2618043788417903 samples/sec                   batch loss = 1650.9903683662415 | accuracy = 0.5702479338842975


Epoch[1] Batch[610] Speed: 1.263129047408025 samples/sec                   batch loss = 1663.9804112911224 | accuracy = 0.5717213114754098


Epoch[1] Batch[615] Speed: 1.2648455346973868 samples/sec                   batch loss = 1677.7371661663055 | accuracy = 0.5723577235772358


Epoch[1] Batch[620] Speed: 1.2608341132813716 samples/sec                   batch loss = 1690.8839807510376 | accuracy = 0.572983870967742


Epoch[1] Batch[625] Speed: 1.2620254386209715 samples/sec                   batch loss = 1704.358938217163 | accuracy = 0.5728


Epoch[1] Batch[630] Speed: 1.2615177524174142 samples/sec                   batch loss = 1717.2267487049103 | accuracy = 0.573015873015873


Epoch[1] Batch[635] Speed: 1.264401610442706 samples/sec                   batch loss = 1731.4482390880585 | accuracy = 0.5724409448818898


Epoch[1] Batch[640] Speed: 1.2631734601615339 samples/sec                   batch loss = 1743.058584690094 | accuracy = 0.5734375


Epoch[1] Batch[645] Speed: 1.2618235488426301 samples/sec                   batch loss = 1756.8772995471954 | accuracy = 0.5728682170542636


Epoch[1] Batch[650] Speed: 1.2652781290927566 samples/sec                   batch loss = 1770.4783508777618 | accuracy = 0.5726923076923077


Epoch[1] Batch[655] Speed: 1.2618114963483749 samples/sec                   batch loss = 1782.672120332718 | accuracy = 0.5740458015267176


Epoch[1] Batch[660] Speed: 1.259232161232481 samples/sec                   batch loss = 1796.8602213859558 | accuracy = 0.5742424242424242


Epoch[1] Batch[665] Speed: 1.260940720164728 samples/sec                   batch loss = 1808.7181475162506 | accuracy = 0.5763157894736842


Epoch[1] Batch[670] Speed: 1.2666062802388258 samples/sec                   batch loss = 1820.4301133155823 | accuracy = 0.5776119402985075


Epoch[1] Batch[675] Speed: 1.2706194487322975 samples/sec                   batch loss = 1833.9597136974335 | accuracy = 0.5777777777777777


Epoch[1] Batch[680] Speed: 1.2652680143579713 samples/sec                   batch loss = 1847.9915421009064 | accuracy = 0.5772058823529411


Epoch[1] Batch[685] Speed: 1.2670307026503687 samples/sec                   batch loss = 1860.46444272995 | accuracy = 0.5788321167883211


Epoch[1] Batch[690] Speed: 1.2623928405783231 samples/sec                   batch loss = 1874.0147924423218 | accuracy = 0.5789855072463768


Epoch[1] Batch[695] Speed: 1.2641139934796157 samples/sec                   batch loss = 1884.9742401838303 | accuracy = 0.5802158273381295


Epoch[1] Batch[700] Speed: 1.2636670608184573 samples/sec                   batch loss = 1897.6340121030807 | accuracy = 0.5817857142857142


Epoch[1] Batch[705] Speed: 1.2603797437638975 samples/sec                   batch loss = 1908.220809340477 | accuracy = 0.5840425531914893


Epoch[1] Batch[710] Speed: 1.2648587895358465 samples/sec                   batch loss = 1920.228216767311 | accuracy = 0.5852112676056338


Epoch[1] Batch[715] Speed: 1.259137749787626 samples/sec                   batch loss = 1931.7992876768112 | accuracy = 0.5863636363636363


Epoch[1] Batch[720] Speed: 1.261933645273718 samples/sec                   batch loss = 1943.9895083904266 | accuracy = 0.5864583333333333


Epoch[1] Batch[725] Speed: 1.2624424263083083 samples/sec                   batch loss = 1955.6580390930176 | accuracy = 0.5872413793103448


Epoch[1] Batch[730] Speed: 1.2614150312541494 samples/sec                   batch loss = 1969.4915790557861 | accuracy = 0.5866438356164384


Epoch[1] Batch[735] Speed: 1.258198470772704 samples/sec                   batch loss = 1980.9583677053452 | accuracy = 0.5884353741496599


Epoch[1] Batch[740] Speed: 1.2613664745332358 samples/sec                   batch loss = 1993.5105801820755 | accuracy = 0.5888513513513514


Epoch[1] Batch[745] Speed: 1.2592471889846975 samples/sec                   batch loss = 2005.9874264001846 | accuracy = 0.5892617449664429


Epoch[1] Batch[750] Speed: 1.2606256896834402 samples/sec                   batch loss = 2018.12702190876 | accuracy = 0.5896666666666667


Epoch[1] Batch[755] Speed: 1.2618530641906174 samples/sec                   batch loss = 2031.0966829061508 | accuracy = 0.5903973509933775


Epoch[1] Batch[760] Speed: 1.2608280490761437 samples/sec                   batch loss = 2043.8757687807083 | accuracy = 0.5914473684210526


Epoch[1] Batch[765] Speed: 1.2585795091010474 samples/sec                   batch loss = 2056.8090134859085 | accuracy = 0.5915032679738562


Epoch[1] Batch[770] Speed: 1.256004518174212 samples/sec                   batch loss = 2068.3000906705856 | accuracy = 0.5925324675324676


Epoch[1] Batch[775] Speed: 1.2551834285899826 samples/sec                   batch loss = 2080.393654227257 | accuracy = 0.5929032258064516


Epoch[1] Batch[780] Speed: 1.2595016755728512 samples/sec                   batch loss = 2095.929500937462 | accuracy = 0.592948717948718


Epoch[1] Batch[785] Speed: 1.2583601266759388 samples/sec                   batch loss = 2107.265301346779 | accuracy = 0.5939490445859873


[Epoch 1] training: accuracy=0.5945431472081218
[Epoch 1] time cost: 642.3377957344055
[Epoch 1] validation: validation accuracy=0.6533333333333333


Epoch[2] Batch[5] Speed: 1.2740554705538034 samples/sec                   batch loss = 14.269534587860107 | accuracy = 0.55


Epoch[2] Batch[10] Speed: 1.2761308715183914 samples/sec                   batch loss = 27.061687231063843 | accuracy = 0.575


Epoch[2] Batch[15] Speed: 1.2743562445614394 samples/sec                   batch loss = 37.93883574008942 | accuracy = 0.6333333333333333


Epoch[2] Batch[20] Speed: 1.2825167551773007 samples/sec                   batch loss = 52.14588272571564 | accuracy = 0.5875


Epoch[2] Batch[25] Speed: 1.2770397720798414 samples/sec                   batch loss = 63.74185264110565 | accuracy = 0.62


Epoch[2] Batch[30] Speed: 1.2798964927457954 samples/sec                   batch loss = 78.67044270038605 | accuracy = 0.625


Epoch[2] Batch[35] Speed: 1.2797687917710179 samples/sec                   batch loss = 92.98069560527802 | accuracy = 0.6071428571428571


Epoch[2] Batch[40] Speed: 1.2745137527128743 samples/sec                   batch loss = 106.50858986377716 | accuracy = 0.60625


Epoch[2] Batch[45] Speed: 1.2753907786716538 samples/sec                   batch loss = 119.28255927562714 | accuracy = 0.6111111111111112


Epoch[2] Batch[50] Speed: 1.277311420838203 samples/sec                   batch loss = 131.52421176433563 | accuracy = 0.61


Epoch[2] Batch[55] Speed: 1.2745375711016362 samples/sec                   batch loss = 144.14698612689972 | accuracy = 0.6136363636363636


Epoch[2] Batch[60] Speed: 1.2797221306469864 samples/sec                   batch loss = 156.58753907680511 | accuracy = 0.6208333333333333


Epoch[2] Batch[65] Speed: 1.2772507419446957 samples/sec                   batch loss = 169.9317637681961 | accuracy = 0.6192307692307693


Epoch[2] Batch[70] Speed: 1.2714992159808045 samples/sec                   batch loss = 182.55906236171722 | accuracy = 0.625


Epoch[2] Batch[75] Speed: 1.2724543163913093 samples/sec                   batch loss = 193.81723392009735 | accuracy = 0.6366666666666667


Epoch[2] Batch[80] Speed: 1.274476961829462 samples/sec                   batch loss = 206.53819334506989 | accuracy = 0.6375


Epoch[2] Batch[85] Speed: 1.2756330162755356 samples/sec                   batch loss = 217.10594260692596 | accuracy = 0.6470588235294118


Epoch[2] Batch[90] Speed: 1.269910339898174 samples/sec                   batch loss = 231.2882125377655 | accuracy = 0.6416666666666667


Epoch[2] Batch[95] Speed: 1.2714598045552643 samples/sec                   batch loss = 243.28709375858307 | accuracy = 0.6473684210526316


Epoch[2] Batch[100] Speed: 1.2740665002966987 samples/sec                   batch loss = 255.27224791049957 | accuracy = 0.6525


Epoch[2] Batch[105] Speed: 1.2755529067094495 samples/sec                   batch loss = 267.13123512268066 | accuracy = 0.6547619047619048


Epoch[2] Batch[110] Speed: 1.2769386868015011 samples/sec                   batch loss = 279.8666467666626 | accuracy = 0.6545454545454545


Epoch[2] Batch[115] Speed: 1.2775828933266666 samples/sec                   batch loss = 294.13449597358704 | accuracy = 0.6521739130434783


Epoch[2] Batch[120] Speed: 1.2777504448497503 samples/sec                   batch loss = 305.0038709640503 | accuracy = 0.6583333333333333


Epoch[2] Batch[125] Speed: 1.2741406173817973 samples/sec                   batch loss = 318.113489151001 | accuracy = 0.652


Epoch[2] Batch[130] Speed: 1.2737055205241068 samples/sec                   batch loss = 328.90991735458374 | accuracy = 0.6557692307692308


Epoch[2] Batch[135] Speed: 1.2745515139861938 samples/sec                   batch loss = 341.4581849575043 | accuracy = 0.6555555555555556


Epoch[2] Batch[140] Speed: 1.2739909407098764 samples/sec                   batch loss = 353.4274568557739 | accuracy = 0.6571428571428571


Epoch[2] Batch[145] Speed: 1.2735108002223776 samples/sec                   batch loss = 365.09819972515106 | accuracy = 0.656896551724138


Epoch[2] Batch[150] Speed: 1.2679408639450538 samples/sec                   batch loss = 375.64871299266815 | accuracy = 0.6566666666666666


Epoch[2] Batch[155] Speed: 1.268910489830052 samples/sec                   batch loss = 387.7373938560486 | accuracy = 0.6564516129032258


Epoch[2] Batch[160] Speed: 1.2681186439460908 samples/sec                   batch loss = 399.3971610069275 | accuracy = 0.65625


Epoch[2] Batch[165] Speed: 1.2745723320552067 samples/sec                   batch loss = 413.55317878723145 | accuracy = 0.6515151515151515


Epoch[2] Batch[170] Speed: 1.2680112032792976 samples/sec                   batch loss = 426.201722741127 | accuracy = 0.6529411764705882


Epoch[2] Batch[175] Speed: 1.2787422527220083 samples/sec                   batch loss = 439.4911412000656 | accuracy = 0.6514285714285715


Epoch[2] Batch[180] Speed: 1.2766408698791205 samples/sec                   batch loss = 451.0153692960739 | accuracy = 0.6513888888888889


Epoch[2] Batch[185] Speed: 1.2789588553629514 samples/sec                   batch loss = 462.63008415699005 | accuracy = 0.6554054054054054


Epoch[2] Batch[190] Speed: 1.2758873774863113 samples/sec                   batch loss = 476.2006161212921 | accuracy = 0.6526315789473685


Epoch[2] Batch[195] Speed: 1.2795140517463994 samples/sec                   batch loss = 489.0830622911453 | accuracy = 0.6538461538461539


Epoch[2] Batch[200] Speed: 1.2747881053930814 samples/sec                   batch loss = 500.0815818309784 | accuracy = 0.655


Epoch[2] Batch[205] Speed: 1.2805111415323256 samples/sec                   batch loss = 511.39278268814087 | accuracy = 0.6585365853658537


Epoch[2] Batch[210] Speed: 1.279551524375688 samples/sec                   batch loss = 523.1670936346054 | accuracy = 0.6595238095238095


Epoch[2] Batch[215] Speed: 1.282898442961993 samples/sec                   batch loss = 535.1857171058655 | accuracy = 0.6593023255813953


Epoch[2] Batch[220] Speed: 1.2804025679765803 samples/sec                   batch loss = 546.2896945476532 | accuracy = 0.6613636363636364


Epoch[2] Batch[225] Speed: 1.2800710976989078 samples/sec                   batch loss = 558.5008063316345 | accuracy = 0.6622222222222223


Epoch[2] Batch[230] Speed: 1.2774454404876234 samples/sec                   batch loss = 570.0714280605316 | accuracy = 0.6619565217391304


Epoch[2] Batch[235] Speed: 1.2816156108656565 samples/sec                   batch loss = 582.9284199476242 | accuracy = 0.6617021276595745


Epoch[2] Batch[240] Speed: 1.2815150724458788 samples/sec                   batch loss = 595.2668451070786 | accuracy = 0.6614583333333334


Epoch[2] Batch[245] Speed: 1.2766326126619292 samples/sec                   batch loss = 608.4355269670486 | accuracy = 0.6581632653061225


Epoch[2] Batch[250] Speed: 1.2795688952452542 samples/sec                   batch loss = 623.1352007389069 | accuracy = 0.655


Epoch[2] Batch[255] Speed: 1.2690360329129717 samples/sec                   batch loss = 636.5752750635147 | accuracy = 0.653921568627451


Epoch[2] Batch[260] Speed: 1.2681567940755911 samples/sec                   batch loss = 651.4434872865677 | accuracy = 0.6538461538461539


Epoch[2] Batch[265] Speed: 1.2706970149570254 samples/sec                   batch loss = 661.1900554895401 | accuracy = 0.6556603773584906


Epoch[2] Batch[270] Speed: 1.2681117426603035 samples/sec                   batch loss = 674.887192606926 | accuracy = 0.6574074074074074


Epoch[2] Batch[275] Speed: 1.2683614828290697 samples/sec                   batch loss = 683.8960421085358 | accuracy = 0.6609090909090909


Epoch[2] Batch[280] Speed: 1.2683121980927405 samples/sec                   batch loss = 694.4764605760574 | accuracy = 0.6633928571428571


Epoch[2] Batch[285] Speed: 1.2694398011657781 samples/sec                   batch loss = 707.1693441867828 | accuracy = 0.6631578947368421


Epoch[2] Batch[290] Speed: 1.2775309437662135 samples/sec                   batch loss = 717.7248466014862 | accuracy = 0.6655172413793103


Epoch[2] Batch[295] Speed: 1.276696730330224 samples/sec                   batch loss = 728.1669781208038 | accuracy = 0.6677966101694915


Epoch[2] Batch[300] Speed: 1.2703937329361898 samples/sec                   batch loss = 738.9920643568039 | accuracy = 0.6666666666666666


Epoch[2] Batch[305] Speed: 1.2740058390739555 samples/sec                   batch loss = 753.752091050148 | accuracy = 0.6663934426229509


Epoch[2] Batch[310] Speed: 1.2692926672642462 samples/sec                   batch loss = 765.5480329990387 | accuracy = 0.6661290322580645


Epoch[2] Batch[315] Speed: 1.2734731972525153 samples/sec                   batch loss = 779.6720465421677 | accuracy = 0.6658730158730158


Epoch[2] Batch[320] Speed: 1.272516760302175 samples/sec                   batch loss = 793.5923074483871 | accuracy = 0.665625


Epoch[2] Batch[325] Speed: 1.2714702112468608 samples/sec                   batch loss = 808.3111757040024 | accuracy = 0.6646153846153846


Epoch[2] Batch[330] Speed: 1.2782327168811554 samples/sec                   batch loss = 821.284630894661 | accuracy = 0.6628787878787878


Epoch[2] Batch[335] Speed: 1.269381788687765 samples/sec                   batch loss = 831.474597454071 | accuracy = 0.6634328358208955


Epoch[2] Batch[340] Speed: 1.2707917240309476 samples/sec                   batch loss = 841.4879602193832 | accuracy = 0.6647058823529411


Epoch[2] Batch[345] Speed: 1.2718467975375143 samples/sec                   batch loss = 854.0548313856125 | accuracy = 0.6659420289855073


Epoch[2] Batch[350] Speed: 1.2726524787123414 samples/sec                   batch loss = 865.4959179162979 | accuracy = 0.6671428571428571


Epoch[2] Batch[355] Speed: 1.2742084527152624 samples/sec                   batch loss = 879.9777210950851 | accuracy = 0.6654929577464789


Epoch[2] Batch[360] Speed: 1.2696851642108535 samples/sec                   batch loss = 891.3494341373444 | accuracy = 0.6659722222222222


Epoch[2] Batch[365] Speed: 1.2696396198137816 samples/sec                   batch loss = 907.6003065109253 | accuracy = 0.6636986301369863


Epoch[2] Batch[370] Speed: 1.274662486984395 samples/sec                   batch loss = 919.4154930114746 | accuracy = 0.6635135135135135


Epoch[2] Batch[375] Speed: 1.2703874802301949 samples/sec                   batch loss = 931.7604135274887 | accuracy = 0.6626666666666666


Epoch[2] Batch[380] Speed: 1.2690923818069397 samples/sec                   batch loss = 943.2568509578705 | accuracy = 0.6618421052631579


Epoch[2] Batch[385] Speed: 1.2646726750848443 samples/sec                   batch loss = 954.1401835680008 | accuracy = 0.662987012987013


Epoch[2] Batch[390] Speed: 1.2707681416915881 samples/sec                   batch loss = 966.571634054184 | accuracy = 0.6647435897435897


Epoch[2] Batch[395] Speed: 1.2742993304792798 samples/sec                   batch loss = 978.5516724586487 | accuracy = 0.6639240506329114


Epoch[2] Batch[400] Speed: 1.2692246821797006 samples/sec                   batch loss = 991.5515886545181 | accuracy = 0.665625


Epoch[2] Batch[405] Speed: 1.2714351375223134 samples/sec                   batch loss = 1000.2655403614044 | accuracy = 0.6691358024691358


Epoch[2] Batch[410] Speed: 1.267813333911677 samples/sec                   batch loss = 1012.039573431015 | accuracy = 0.6689024390243903


Epoch[2] Batch[415] Speed: 1.2723999845587828 samples/sec                   batch loss = 1026.1093493700027 | accuracy = 0.6686746987951807


Epoch[2] Batch[420] Speed: 1.2763375601216327 samples/sec                   batch loss = 1036.8267220258713 | accuracy = 0.669047619047619


Epoch[2] Batch[425] Speed: 1.2780517978728105 samples/sec                   batch loss = 1050.419953942299 | accuracy = 0.6688235294117647


Epoch[2] Batch[430] Speed: 1.270920239096888 samples/sec                   batch loss = 1062.7441695928574 | accuracy = 0.6680232558139535


Epoch[2] Batch[435] Speed: 1.2775350295336223 samples/sec                   batch loss = 1075.7937635183334 | accuracy = 0.6672413793103448


Epoch[2] Batch[440] Speed: 1.277389611759112 samples/sec                   batch loss = 1089.7401332855225 | accuracy = 0.6659090909090909


Epoch[2] Batch[445] Speed: 1.2762784300539656 samples/sec                   batch loss = 1101.1482813358307 | accuracy = 0.6651685393258427


Epoch[2] Batch[450] Speed: 1.2783246566046786 samples/sec                   batch loss = 1113.6552156209946 | accuracy = 0.6638888888888889


Epoch[2] Batch[455] Speed: 1.2767405477745337 samples/sec                   batch loss = 1127.5940064191818 | accuracy = 0.6637362637362637


Epoch[2] Batch[460] Speed: 1.2787293875470418 samples/sec                   batch loss = 1138.7724332809448 | accuracy = 0.6641304347826087


Epoch[2] Batch[465] Speed: 1.2764870117578029 samples/sec                   batch loss = 1150.3456492424011 | accuracy = 0.6645161290322581


Epoch[2] Batch[470] Speed: 1.278959147855364 samples/sec                   batch loss = 1160.5730201005936 | accuracy = 0.6659574468085107


Epoch[2] Batch[475] Speed: 1.278738451620643 samples/sec                   batch loss = 1170.6726105213165 | accuracy = 0.6678947368421052


Epoch[2] Batch[480] Speed: 1.273864415404594 samples/sec                   batch loss = 1183.8956534862518 | accuracy = 0.6671875


Epoch[2] Batch[485] Speed: 1.271181103507678 samples/sec                   batch loss = 1197.284516096115 | accuracy = 0.6659793814432989


Epoch[2] Batch[490] Speed: 1.272035029681699 samples/sec                   batch loss = 1209.6456481218338 | accuracy = 0.6658163265306123


Epoch[2] Batch[495] Speed: 1.2748265610041525 samples/sec                   batch loss = 1223.6033633947372 | accuracy = 0.6661616161616162


Epoch[2] Batch[500] Speed: 1.277613442419114 samples/sec                   batch loss = 1232.5417189598083 | accuracy = 0.668


Epoch[2] Batch[505] Speed: 1.2803519522158686 samples/sec                   batch loss = 1241.6285064220428 | accuracy = 0.6698019801980198


Epoch[2] Batch[510] Speed: 1.274858722137948 samples/sec                   batch loss = 1251.7190914154053 | accuracy = 0.6715686274509803


Epoch[2] Batch[515] Speed: 1.2825154806500665 samples/sec                   batch loss = 1263.9576964378357 | accuracy = 0.6718446601941748


Epoch[2] Batch[520] Speed: 1.2781688343527622 samples/sec                   batch loss = 1274.6157034635544 | accuracy = 0.6716346153846153


Epoch[2] Batch[525] Speed: 1.270503886849746 samples/sec                   batch loss = 1288.937532544136 | accuracy = 0.6704761904761904


Epoch[2] Batch[530] Speed: 1.2666809663797438 samples/sec                   batch loss = 1302.7506295442581 | accuracy = 0.6693396226415095


Epoch[2] Batch[535] Speed: 1.2739529224378667 samples/sec                   batch loss = 1314.568249464035 | accuracy = 0.6696261682242991


Epoch[2] Batch[540] Speed: 1.2758948488186357 samples/sec                   batch loss = 1324.6670850515366 | accuracy = 0.6708333333333333


Epoch[2] Batch[545] Speed: 1.2738976887372782 samples/sec                   batch loss = 1335.2654007673264 | accuracy = 0.6720183486238532


Epoch[2] Batch[550] Speed: 1.2727165833890208 samples/sec                   batch loss = 1345.279053926468 | accuracy = 0.6731818181818182


Epoch[2] Batch[555] Speed: 1.2762120244902977 samples/sec                   batch loss = 1356.5315656661987 | accuracy = 0.6734234234234234


Epoch[2] Batch[560] Speed: 1.2790970241726174 samples/sec                   batch loss = 1368.693314909935 | accuracy = 0.6727678571428571


Epoch[2] Batch[565] Speed: 1.2773053915876167 samples/sec                   batch loss = 1378.916067957878 | accuracy = 0.6743362831858407


Epoch[2] Batch[570] Speed: 1.2792977487714836 samples/sec                   batch loss = 1388.780586719513 | accuracy = 0.6754385964912281


Epoch[2] Batch[575] Speed: 1.281243687701311 samples/sec                   batch loss = 1402.8362766504288 | accuracy = 0.6752173913043479


Epoch[2] Batch[580] Speed: 1.27553089288902 samples/sec                   batch loss = 1412.9027143716812 | accuracy = 0.6754310344827587


Epoch[2] Batch[585] Speed: 1.2773969061764547 samples/sec                   batch loss = 1425.2929872274399 | accuracy = 0.6756410256410257


Epoch[2] Batch[590] Speed: 1.2760232335611161 samples/sec                   batch loss = 1439.2507899999619 | accuracy = 0.6741525423728814


Epoch[2] Batch[595] Speed: 1.2773588788623012 samples/sec                   batch loss = 1450.6691395044327 | accuracy = 0.6743697478991597


Epoch[2] Batch[600] Speed: 1.2781791564019518 samples/sec                   batch loss = 1461.9208706617355 | accuracy = 0.675


Epoch[2] Batch[605] Speed: 1.2751045371613288 samples/sec                   batch loss = 1476.662559747696 | accuracy = 0.6735537190082644


Epoch[2] Batch[610] Speed: 1.2803673905758632 samples/sec                   batch loss = 1489.216113448143 | accuracy = 0.6733606557377049


Epoch[2] Batch[615] Speed: 1.2784235260428292 samples/sec                   batch loss = 1501.6309980154037 | accuracy = 0.6727642276422764


Epoch[2] Batch[620] Speed: 1.2831144933782064 samples/sec                   batch loss = 1512.1544299125671 | accuracy = 0.6737903225806452


Epoch[2] Batch[625] Speed: 1.2806295086612172 samples/sec                   batch loss = 1523.727609872818 | accuracy = 0.674


Epoch[2] Batch[630] Speed: 1.2761931914216458 samples/sec                   batch loss = 1535.8354307413101 | accuracy = 0.6738095238095239


Epoch[2] Batch[635] Speed: 1.2756629872103828 samples/sec                   batch loss = 1549.4079884290695 | accuracy = 0.6732283464566929


Epoch[2] Batch[640] Speed: 1.276738313109248 samples/sec                   batch loss = 1558.3223161697388 | accuracy = 0.67421875


Epoch[2] Batch[645] Speed: 1.2814233583583328 samples/sec                   batch loss = 1569.45339179039 | accuracy = 0.6759689922480621


Epoch[2] Batch[650] Speed: 1.2839188952057263 samples/sec                   batch loss = 1580.6010760068893 | accuracy = 0.6765384615384615


Epoch[2] Batch[655] Speed: 1.2799882813572874 samples/sec                   batch loss = 1590.6237167716026 | accuracy = 0.6770992366412214


Epoch[2] Batch[660] Speed: 1.2794292585087204 samples/sec                   batch loss = 1604.7812688946724 | accuracy = 0.6757575757575758


Epoch[2] Batch[665] Speed: 1.2824040183360699 samples/sec                   batch loss = 1618.5874403119087 | accuracy = 0.674436090225564


Epoch[2] Batch[670] Speed: 1.2772014445885784 samples/sec                   batch loss = 1633.7082994580269 | accuracy = 0.6735074626865671


Epoch[2] Batch[675] Speed: 1.278849765023814 samples/sec                   batch loss = 1645.9089099764824 | accuracy = 0.6733333333333333


Epoch[2] Batch[680] Speed: 1.2776894323261698 samples/sec                   batch loss = 1656.6149935126305 | accuracy = 0.6731617647058824


Epoch[2] Batch[685] Speed: 1.278249175457231 samples/sec                   batch loss = 1672.225914657116 | accuracy = 0.6722627737226278


Epoch[2] Batch[690] Speed: 1.27614281084159 samples/sec                   batch loss = 1686.7565531134605 | accuracy = 0.6710144927536232


Epoch[2] Batch[695] Speed: 1.2684702296281858 samples/sec                   batch loss = 1699.9696816802025 | accuracy = 0.6705035971223021


Epoch[2] Batch[700] Speed: 1.2702660937356025 samples/sec                   batch loss = 1711.193070113659 | accuracy = 0.6710714285714285


Epoch[2] Batch[705] Speed: 1.2711157088316516 samples/sec                   batch loss = 1720.4522797465324 | accuracy = 0.6716312056737589


Epoch[2] Batch[710] Speed: 1.2791072636987064 samples/sec                   batch loss = 1731.63368922472 | accuracy = 0.6725352112676056


Epoch[2] Batch[715] Speed: 1.2721242472616165 samples/sec                   batch loss = 1747.4575110077858 | accuracy = 0.6706293706293707


Epoch[2] Batch[720] Speed: 1.2770184845227535 samples/sec                   batch loss = 1757.8693946003914 | accuracy = 0.6711805555555556


Epoch[2] Batch[725] Speed: 1.270169539747573 samples/sec                   batch loss = 1769.8037530779839 | accuracy = 0.6710344827586207


Epoch[2] Batch[730] Speed: 1.2706165618347658 samples/sec                   batch loss = 1779.7583815455437 | accuracy = 0.6712328767123288


Epoch[2] Batch[735] Speed: 1.2741598737537 samples/sec                   batch loss = 1789.9670750498772 | accuracy = 0.6717687074829932


Epoch[2] Batch[740] Speed: 1.274157938413632 samples/sec                   batch loss = 1800.5211285948753 | accuracy = 0.6726351351351352


Epoch[2] Batch[745] Speed: 1.2747025814579318 samples/sec                   batch loss = 1811.4830172657967 | accuracy = 0.6728187919463087


Epoch[2] Batch[750] Speed: 1.2726667665448532 samples/sec                   batch loss = 1821.150502026081 | accuracy = 0.6736666666666666


Epoch[2] Batch[755] Speed: 1.2766392184271371 samples/sec                   batch loss = 1831.652453839779 | accuracy = 0.6738410596026491


Epoch[2] Batch[760] Speed: 1.2764986663777456 samples/sec                   batch loss = 1845.6614431738853 | accuracy = 0.6726973684210527


Epoch[2] Batch[765] Speed: 1.276400579968506 samples/sec                   batch loss = 1855.9428220391273 | accuracy = 0.673202614379085


Epoch[2] Batch[770] Speed: 1.2729755779551117 samples/sec                   batch loss = 1865.2938460707664 | accuracy = 0.674025974025974


Epoch[2] Batch[775] Speed: 1.2707454264698386 samples/sec                   batch loss = 1879.0480462908745 | accuracy = 0.6735483870967742


Epoch[2] Batch[780] Speed: 1.2803351463231685 samples/sec                   batch loss = 1890.0793880820274 | accuracy = 0.673397435897436


Epoch[2] Batch[785] Speed: 1.2737322097728023 samples/sec                   batch loss = 1900.5496299862862 | accuracy = 0.674203821656051


[Epoch 2] training: accuracy=0.674492385786802
[Epoch 2] time cost: 634.5267686843872
[Epoch 2] validation: validation accuracy=0.7066666666666667

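The progress lines above follow a regular format, so they can be post-processed if you want to plot or compare runs. The snippet below is a minimal sketch, not part of the tutorial's training code: `LOG_PATTERN` and `parse_log_line` are hypothetical names introduced here. It also assumes that the logged `batch loss` is a running total over the epoch (the values grow steadily within each epoch and reset at the start of the next), so dividing by the batch index gives an approximate mean loss per batch.

```python
import re

# Hypothetical helper: regular expression for the per-batch progress lines
# shown in the training output above.
LOG_PATTERN = re.compile(
    r"Epoch\[(?P<epoch>\d+)\] Batch\[(?P<batch>\d+)\] "
    r"Speed: (?P<speed>[\d.]+) samples/sec\s+"
    r"batch loss = (?P<loss>[\d.]+) \| accuracy = (?P<acc>[\d.]+)"
)


def parse_log_line(line):
    """Return the fields of one progress line as a dict, or None if it does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    batch = int(match.group("batch"))
    loss_total = float(match.group("loss"))
    return {
        "epoch": int(match.group("epoch")),
        "batch": batch,
        "speed": float(match.group("speed")),
        "loss_total": loss_total,
        # Assumption: the logged loss is cumulative within the epoch.
        "mean_loss": loss_total / batch,
        "accuracy": float(match.group("acc")),
    }


sample = (
    "Epoch[2] Batch[785] Speed: 1.2737322097728023 samples/sec"
    "                   batch loss = 1900.5496299862862 | accuracy = 0.674203821656051"
)
record = parse_log_line(sample)
print(record["epoch"], record["batch"], record["mean_loss"], record["accuracy"])
```

Epoch summary lines such as `[Epoch 2] time cost: ...` do not match the pattern, so `parse_log_line` returns `None` for them and they can be filtered out when iterating over a full log.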

## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).