<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure that your machine has at least one Nvidia GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. For instructions on installing the GPU version of MXNet for your system, see the [installation guide](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as its context, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[04:33:03] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If only one GPU is available, the code falls back to `npx.cpu()` and copies `x` to main memory instead:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[04:33:03] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])


MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
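
The 0-based numbering described above can be sketched in plain Python, without MXNet or a GPU. The helper `pick_gpu_ids` and its parameters are hypothetical names introduced only for this illustration. It returns the integer IDs that you could then pass to `npx.gpu(i)`, and it never returns more IDs than the machine reports.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return the 0-based IDs of the GPUs to use (hypothetical helper)."""
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    # Never ask for more devices than the machine reports.
    count = min(num_available, num_requested)
    return list(range(count))


# On an eight-GPU machine the valid IDs run from 0 to 7.
all_ids = pick_gpu_ids(num_available=8, num_requested=8)
print(all_ids[0], all_ids[-1])

# Requesting two GPUs when only one is present yields just ID 0.
print(pick_gpu_ids(num_available=1, num_requested=2))
```

Capping the request at the available count mirrors the slicing idiom `available_gpus[:num_gpus]` used in the training loop later in this step.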

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to ensure that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.
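
To make the same-device rule concrete, here is a toy model in plain Python. It is not how MXNet implements device placement. The class `TaggedArray`, the device strings, and the error type are assumptions made purely for illustration. The sketch shows an operation rejecting operands on different devices and succeeding after an explicit copy.

```python
class TaggedArray:
    """Toy stand-in for an array that remembers which device holds it."""

    def __init__(self, values, device):
        self.values = list(values)
        self.device = device

    def copyto(self, device):
        # An explicit copy produces a new array on the target device.
        return TaggedArray(self.values, device)

    def __add__(self, other):
        # Refuse to combine operands that live on different devices.
        if self.device != other.device:
            raise ValueError(
                f"operands on different devices: {self.device} vs {other.device}")
        total = [a + b for a, b in zip(self.values, other.values)]
        # The result stays on the device of its inputs.
        return TaggedArray(total, self.device)


a = TaggedArray([1.0, 2.0], device="gpu(0)")
b = TaggedArray([3.0, 4.0], device="gpu(1)")

try:
    a + b
except ValueError as err:
    print("error:", err)

# Copying b to the same device first makes the addition valid.
c = a + b.copyto("gpu(0)")
print(c.values, c.device)
```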

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the previously defined LeafNetwork from [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[04:33:04] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 7.0109706, -9.595711 ]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, if there are *n* GPUs, you split each data batch into *n* parts and assign one part to each GPU, which runs the forward and backward passes on its chunk of the data.
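
The arithmetic behind data parallelism can be checked with a small, framework-free sketch. The one-parameter model `w * x`, the squared-error loss, and the helpers `split_batch` and `grad_sum` are all assumptions chosen for illustration. The slicing rule in `split_batch` is not claimed to match `gluon.utils.split_and_load`. The point is only that summing per-shard gradients and normalizing by the full batch size reproduces the full-batch gradient.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous, nearly equal slices."""
    size, extra = divmod(len(batch), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(batch[start:end])
        start = end
    return parts


def grad_sum(w, samples):
    """Summed gradient of (w*x - y)**2 with respect to w over the samples."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


w = 0.5
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

# Each "device" handles one shard of the batch.
shards = split_batch(batch, num_parts=2)
per_device = [grad_sum(w, shard) for shard in shards]

# Summing the per-device gradients and dividing by the batch size
# gives the same mean gradient as processing the whole batch at once.
parallel_grad = sum(per_device) / len(batch)
full_grad = grad_sum(w, batch) / len(batch)
print(parallel_grad, full_grad)
```

Dividing by the full batch size rather than the shard size plays the same role as passing `batch_size` to `trainer.step` in the training loop of this step.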

First, copy the data definitions and the transform functions from the [Training Neural Networks](6-train-nn.md) tutorial with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
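
The way an accuracy metric accumulates results from several device shards can be sketched without MXNet. `RunningAccuracy` and `argmax` below are hypothetical stand-ins written for this illustration, not the implementation of `gluon.metric.Accuracy`. The sketch assumes each output is a list of class scores and that the predicted class is the index of the largest score.

```python
def argmax(scores):
    """Index of the largest score in a non-empty list."""
    return max(range(len(scores)), key=scores.__getitem__)


class RunningAccuracy:
    """Toy accuracy metric that accumulates counts across updates."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, label_shards, output_shards):
        # Walk the shards pairwise, one shard per device.
        for labels, outputs in zip(label_shards, output_shards):
            for label, scores in zip(labels, outputs):
                self.correct += int(argmax(scores) == label)
                self.total += 1

    def get(self):
        # Fraction of correct predictions seen so far.
        return self.correct / self.total if self.total else float("nan")


metric = RunningAccuracy()

# A batch of four samples split into two shards of two samples each.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],  # predicted classes 0 and 1
    [[1.5, 0.2], [0.8, 0.1]],   # predicted classes 0 and 0
]
metric.update(label_shards, output_shards)
print(metric.get())
```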

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7752198977168342 samples/sec                   batch loss = 14.948371887207031 | accuracy = 0.35


Epoch[1] Batch[10] Speed: 1.2527217244851117 samples/sec                   batch loss = 28.60189437866211 | accuracy = 0.425


Epoch[1] Batch[15] Speed: 1.252763537486284 samples/sec                   batch loss = 42.59649682044983 | accuracy = 0.4666666666666667


Epoch[1] Batch[20] Speed: 1.2558362285486218 samples/sec                   batch loss = 57.149203300476074 | accuracy = 0.475


Epoch[1] Batch[25] Speed: 1.2556643188533507 samples/sec                   batch loss = 70.79609823226929 | accuracy = 0.49


Epoch[1] Batch[30] Speed: 1.26203407756554 samples/sec                   batch loss = 84.37476420402527 | accuracy = 0.5083333333333333


Epoch[1] Batch[35] Speed: 1.258328886950164 samples/sec                   batch loss = 98.97106981277466 | accuracy = 0.5142857142857142


Epoch[1] Batch[40] Speed: 1.2586580673601837 samples/sec                   batch loss = 113.42605519294739 | accuracy = 0.5


Epoch[1] Batch[45] Speed: 1.2557591500378213 samples/sec                   batch loss = 127.16371822357178 | accuracy = 0.5


Epoch[1] Batch[50] Speed: 1.2537140155008082 samples/sec                   batch loss = 141.38866233825684 | accuracy = 0.505


Epoch[1] Batch[55] Speed: 1.2530675373570197 samples/sec                   batch loss = 155.45744800567627 | accuracy = 0.4954545454545455


Epoch[1] Batch[60] Speed: 1.252122337576404 samples/sec                   batch loss = 169.69632720947266 | accuracy = 0.4791666666666667


Epoch[1] Batch[65] Speed: 1.2577915423098174 samples/sec                   batch loss = 183.59954404830933 | accuracy = 0.4807692307692308


Epoch[1] Batch[70] Speed: 1.2541709036843982 samples/sec                   batch loss = 197.92356824874878 | accuracy = 0.48214285714285715


Epoch[1] Batch[75] Speed: 1.2545160163150038 samples/sec                   batch loss = 212.59254336357117 | accuracy = 0.47333333333333333


Epoch[1] Batch[80] Speed: 1.2531525227651492 samples/sec                   batch loss = 226.37469840049744 | accuracy = 0.478125


Epoch[1] Batch[85] Speed: 1.2536294222895386 samples/sec                   batch loss = 240.56752109527588 | accuracy = 0.47352941176470587


Epoch[1] Batch[90] Speed: 1.2537304108546532 samples/sec                   batch loss = 254.8400537967682 | accuracy = 0.4722222222222222


Epoch[1] Batch[95] Speed: 1.2549243947145303 samples/sec                   batch loss = 268.67669224739075 | accuracy = 0.4710526315789474


Epoch[1] Batch[100] Speed: 1.2538688047257323 samples/sec                   batch loss = 282.41464352607727 | accuracy = 0.47


Epoch[1] Batch[105] Speed: 1.2550481242706815 samples/sec                   batch loss = 295.902952671051 | accuracy = 0.48333333333333334


Epoch[1] Batch[110] Speed: 1.25742360914528 samples/sec                   batch loss = 309.63650488853455 | accuracy = 0.4863636363636364


Epoch[1] Batch[115] Speed: 1.2623211286451714 samples/sec                   batch loss = 323.33065843582153 | accuracy = 0.4891304347826087


Epoch[1] Batch[120] Speed: 1.2604931870608063 samples/sec                   batch loss = 337.5122911930084 | accuracy = 0.48541666666666666


Epoch[1] Batch[125] Speed: 1.2552276599923313 samples/sec                   batch loss = 351.35674929618835 | accuracy = 0.486


Epoch[1] Batch[130] Speed: 1.2547787292726154 samples/sec                   batch loss = 365.2474453449249 | accuracy = 0.48653846153846153


Epoch[1] Batch[135] Speed: 1.252737626192445 samples/sec                   batch loss = 379.1621904373169 | accuracy = 0.48703703703703705


Epoch[1] Batch[140] Speed: 1.2556230637706405 samples/sec                   batch loss = 393.21327233314514 | accuracy = 0.4857142857142857


Epoch[1] Batch[145] Speed: 1.251749120981558 samples/sec                   batch loss = 406.67724537849426 | accuracy = 0.49137931034482757


Epoch[1] Batch[150] Speed: 1.253426838453986 samples/sec                   batch loss = 420.0728144645691 | accuracy = 0.49666666666666665


Epoch[1] Batch[155] Speed: 1.2547404413934118 samples/sec                   batch loss = 434.17571449279785 | accuracy = 0.49838709677419357


Epoch[1] Batch[160] Speed: 1.2559241283971068 samples/sec                   batch loss = 448.2790048122406 | accuracy = 0.496875


Epoch[1] Batch[165] Speed: 1.255798252099844 samples/sec                   batch loss = 462.06314301490784 | accuracy = 0.5


Epoch[1] Batch[170] Speed: 1.255552682655044 samples/sec                   batch loss = 475.9062707424164 | accuracy = 0.5029411764705882


Epoch[1] Batch[175] Speed: 1.2501831993696482 samples/sec                   batch loss = 489.4553506374359 | accuracy = 0.5071428571428571


Epoch[1] Batch[180] Speed: 1.2569012557463328 samples/sec                   batch loss = 503.5686876773834 | accuracy = 0.5083333333333333


Epoch[1] Batch[185] Speed: 1.2564968596595631 samples/sec                   batch loss = 517.1429741382599 | accuracy = 0.5108108108108108


Epoch[1] Batch[190] Speed: 1.2583455919882136 samples/sec                   batch loss = 530.9204001426697 | accuracy = 0.5118421052631579


Epoch[1] Batch[195] Speed: 1.2610063041987278 samples/sec                   batch loss = 544.4736258983612 | accuracy = 0.5128205128205128


Epoch[1] Batch[200] Speed: 1.2582408389104227 samples/sec                   batch loss = 558.4720029830933 | accuracy = 0.50875


Epoch[1] Batch[205] Speed: 1.2599749044648363 samples/sec                   batch loss = 571.5487577915192 | accuracy = 0.5097560975609756


Epoch[1] Batch[210] Speed: 1.2614174971248744 samples/sec                   batch loss = 584.9579703807831 | accuracy = 0.5142857142857142


Epoch[1] Batch[215] Speed: 1.2611057357811164 samples/sec                   batch loss = 598.860347032547 | accuracy = 0.5127906976744186


Epoch[1] Batch[220] Speed: 1.258772523143537 samples/sec                   batch loss = 612.6220986843109 | accuracy = 0.509090909090909


Epoch[1] Batch[225] Speed: 1.2566461245583407 samples/sec                   batch loss = 626.4483797550201 | accuracy = 0.5077777777777778


Epoch[1] Batch[230] Speed: 1.260695030248927 samples/sec                   batch loss = 639.8706531524658 | accuracy = 0.5119565217391304


Epoch[1] Batch[235] Speed: 1.2596799341433964 samples/sec                   batch loss = 653.9047634601593 | accuracy = 0.5117021276595745


Epoch[1] Batch[240] Speed: 1.2618092187375316 samples/sec                   batch loss = 667.8032021522522 | accuracy = 0.5135416666666667


Epoch[1] Batch[245] Speed: 1.2620989208918578 samples/sec                   batch loss = 681.4419567584991 | accuracy = 0.5183673469387755


Epoch[1] Batch[250] Speed: 1.2576042965332335 samples/sec                   batch loss = 695.1888501644135 | accuracy = 0.517


Epoch[1] Batch[255] Speed: 1.2533359171178888 samples/sec                   batch loss = 709.0937860012054 | accuracy = 0.5137254901960784


Epoch[1] Batch[260] Speed: 1.2562664439655995 samples/sec                   batch loss = 722.9246172904968 | accuracy = 0.5134615384615384


Epoch[1] Batch[265] Speed: 1.26195927392355 samples/sec                   batch loss = 737.2122654914856 | accuracy = 0.5141509433962265


Epoch[1] Batch[270] Speed: 1.261708632459163 samples/sec                   batch loss = 750.566442489624 | accuracy = 0.5175925925925926


Epoch[1] Batch[275] Speed: 1.26109464492303 samples/sec                   batch loss = 764.1500797271729 | accuracy = 0.519090909090909


Epoch[1] Batch[280] Speed: 1.2567040142476056 samples/sec                   batch loss = 778.446052312851 | accuracy = 0.5142857142857142


Epoch[1] Batch[285] Speed: 1.2623627300723708 samples/sec                   batch loss = 791.7693011760712 | accuracy = 0.5175438596491229


Epoch[1] Batch[290] Speed: 1.2523644167070442 samples/sec                   batch loss = 805.5407111644745 | accuracy = 0.5172413793103449


Epoch[1] Batch[295] Speed: 1.2524264002741163 samples/sec                   batch loss = 819.3036487102509 | accuracy = 0.5169491525423728


Epoch[1] Batch[300] Speed: 1.254091685876091 samples/sec                   batch loss = 832.6437270641327 | accuracy = 0.5191666666666667


Epoch[1] Batch[305] Speed: 1.2536932174403765 samples/sec                   batch loss = 846.1892998218536 | accuracy = 0.5229508196721312


Epoch[1] Batch[310] Speed: 1.2513100473973169 samples/sec                   batch loss = 859.4250798225403 | accuracy = 0.5266129032258065


Epoch[1] Batch[315] Speed: 1.2498368349595206 samples/sec                   batch loss = 872.7077963352203 | accuracy = 0.5277777777777778


Epoch[1] Batch[320] Speed: 1.2522931852525845 samples/sec                   batch loss = 886.7241685390472 | accuracy = 0.52734375


Epoch[1] Batch[325] Speed: 1.249604108438648 samples/sec                   batch loss = 899.9732022285461 | accuracy = 0.5276923076923077


Epoch[1] Batch[330] Speed: 1.2549836278994895 samples/sec                   batch loss = 913.3735408782959 | accuracy = 0.5295454545454545


Epoch[1] Batch[335] Speed: 1.2563459367688041 samples/sec                   batch loss = 926.7302808761597 | accuracy = 0.5305970149253731


Epoch[1] Batch[340] Speed: 1.2562555321293494 samples/sec                   batch loss = 940.6615521907806 | accuracy = 0.5286764705882353


Epoch[1] Batch[345] Speed: 1.2603747254586557 samples/sec                   batch loss = 954.6795125007629 | accuracy = 0.527536231884058


Epoch[1] Batch[350] Speed: 1.2559633347571943 samples/sec                   batch loss = 967.5440363883972 | accuracy = 0.5328571428571428


Epoch[1] Batch[355] Speed: 1.2589764608876612 samples/sec                   batch loss = 981.2294297218323 | accuracy = 0.5330985915492957


Epoch[1] Batch[360] Speed: 1.2608450100467243 samples/sec                   batch loss = 994.8885960578918 | accuracy = 0.5347222222222222


Epoch[1] Batch[365] Speed: 1.2540630011852039 samples/sec                   batch loss = 1008.4893288612366 | accuracy = 0.5335616438356164


Epoch[1] Batch[370] Speed: 1.2522596288844965 samples/sec                   batch loss = 1022.4109649658203 | accuracy = 0.5331081081081082


Epoch[1] Batch[375] Speed: 1.2575660245519422 samples/sec                   batch loss = 1035.8034439086914 | accuracy = 0.5346666666666666


Epoch[1] Batch[380] Speed: 1.2559886274490135 samples/sec                   batch loss = 1049.0927674770355 | accuracy = 0.5368421052631579


Epoch[1] Batch[385] Speed: 1.2535960752793545 samples/sec                   batch loss = 1063.2881731987 | accuracy = 0.537012987012987


Epoch[1] Batch[390] Speed: 1.25583575852903 samples/sec                   batch loss = 1076.5402030944824 | accuracy = 0.5371794871794872


Epoch[1] Batch[395] Speed: 1.2569251737211997 samples/sec                   batch loss = 1089.552363872528 | accuracy = 0.539873417721519


Epoch[1] Batch[400] Speed: 1.2580477992020493 samples/sec                   batch loss = 1102.7697138786316 | accuracy = 0.539375


Epoch[1] Batch[405] Speed: 1.2580068589996947 samples/sec                   batch loss = 1115.2431151866913 | accuracy = 0.5419753086419753


Epoch[1] Batch[410] Speed: 1.259575053362516 samples/sec                   batch loss = 1128.6060700416565 | accuracy = 0.5426829268292683


Epoch[1] Batch[415] Speed: 1.2617600623072969 samples/sec                   batch loss = 1141.7301394939423 | accuracy = 0.5439759036144578


Epoch[1] Batch[420] Speed: 1.258656462105303 samples/sec                   batch loss = 1155.0493655204773 | accuracy = 0.5458333333333333


Epoch[1] Batch[425] Speed: 1.2555808717407386 samples/sec                   batch loss = 1169.1493310928345 | accuracy = 0.5441176470588235


Epoch[1] Batch[430] Speed: 1.2546426677560987 samples/sec                   batch loss = 1182.8184144496918 | accuracy = 0.5459302325581395


Epoch[1] Batch[435] Speed: 1.2554594797569212 samples/sec                   batch loss = 1196.6188142299652 | accuracy = 0.5459770114942529


Epoch[1] Batch[440] Speed: 1.2515283786765643 samples/sec                   batch loss = 1209.4705572128296 | accuracy = 0.5471590909090909


Epoch[1] Batch[445] Speed: 1.2570678529921644 samples/sec                   batch loss = 1223.385472536087 | accuracy = 0.5466292134831461


Epoch[1] Batch[450] Speed: 1.256690082583033 samples/sec                   batch loss = 1237.0885286331177 | accuracy = 0.5483333333333333


Epoch[1] Batch[455] Speed: 1.256619958286477 samples/sec                   batch loss = 1250.4679350852966 | accuracy = 0.5494505494505495


Epoch[1] Batch[460] Speed: 1.2541632158373126 samples/sec                   batch loss = 1263.991191148758 | accuracy = 0.5510869565217391


Epoch[1] Batch[465] Speed: 1.2521183192918073 samples/sec                   batch loss = 1277.6540660858154 | accuracy = 0.55


Epoch[1] Batch[470] Speed: 1.2569330838029509 samples/sec                   batch loss = 1290.2449307441711 | accuracy = 0.5526595744680851


Epoch[1] Batch[475] Speed: 1.2590053707965843 samples/sec                   batch loss = 1303.9478907585144 | accuracy = 0.5531578947368421


Epoch[1] Batch[480] Speed: 1.2526644816811845 samples/sec                   batch loss = 1317.4927265644073 | accuracy = 0.5526041666666667


Epoch[1] Batch[485] Speed: 1.2616489525222427 samples/sec                   batch loss = 1331.1133000850677 | accuracy = 0.5515463917525774


Epoch[1] Batch[490] Speed: 1.2523743261865734 samples/sec                   batch loss = 1345.0416922569275 | accuracy = 0.551530612244898


Epoch[1] Batch[495] Speed: 1.2567238768170277 samples/sec                   batch loss = 1358.9198718070984 | accuracy = 0.5525252525252525


Epoch[1] Batch[500] Speed: 1.257867833134063 samples/sec                   batch loss = 1372.0288252830505 | accuracy = 0.554


Epoch[1] Batch[505] Speed: 1.2560092196492803 samples/sec                   batch loss = 1384.8808009624481 | accuracy = 0.553960396039604


Epoch[1] Batch[510] Speed: 1.2540478157036128 samples/sec                   batch loss = 1396.9615337848663 | accuracy = 0.5553921568627451


Epoch[1] Batch[515] Speed: 1.2591882142040935 samples/sec                   batch loss = 1410.2085001468658 | accuracy = 0.5553398058252427


Epoch[1] Batch[520] Speed: 1.2561021280163993 samples/sec                   batch loss = 1423.8749220371246 | accuracy = 0.55625


Epoch[1] Batch[525] Speed: 1.2479533107677636 samples/sec                   batch loss = 1436.9835455417633 | accuracy = 0.5561904761904762


Epoch[1] Batch[530] Speed: 1.2474060541055132 samples/sec                   batch loss = 1449.6943163871765 | accuracy = 0.5561320754716981


Epoch[1] Batch[535] Speed: 1.2547258963654775 samples/sec                   batch loss = 1463.9165489673615 | accuracy = 0.5551401869158878


Epoch[1] Batch[540] Speed: 1.2586011305283926 samples/sec                   batch loss = 1477.9182472229004 | accuracy = 0.5546296296296296


Epoch[1] Batch[545] Speed: 1.2628917251208565 samples/sec                   batch loss = 1492.0151793956757 | accuracy = 0.5541284403669725


Epoch[1] Batch[550] Speed: 1.2543963307529205 samples/sec                   batch loss = 1505.2770278453827 | accuracy = 0.5540909090909091


Epoch[1] Batch[555] Speed: 1.2591707307554327 samples/sec                   batch loss = 1518.5932350158691 | accuracy = 0.5545045045045045


Epoch[1] Batch[560] Speed: 1.2591710142669437 samples/sec                   batch loss = 1531.2347366809845 | accuracy = 0.55625


Epoch[1] Batch[565] Speed: 1.2585083241486237 samples/sec                   batch loss = 1544.3832199573517 | accuracy = 0.5570796460176991


Epoch[1] Batch[570] Speed: 1.2506524337985345 samples/sec                   batch loss = 1557.336194038391 | accuracy = 0.5565789473684211


Epoch[1] Batch[575] Speed: 1.255827016228756 samples/sec                   batch loss = 1571.7712960243225 | accuracy = 0.5560869565217391


Epoch[1] Batch[580] Speed: 1.2481877443037335 samples/sec                   batch loss = 1584.2595109939575 | accuracy = 0.5577586206896552


Epoch[1] Batch[585] Speed: 1.2515808492766034 samples/sec                   batch loss = 1597.614629983902 | accuracy = 0.5572649572649573


Epoch[1] Batch[590] Speed: 1.2556673261635654 samples/sec                   batch loss = 1611.7594766616821 | accuracy = 0.5576271186440678


Epoch[1] Batch[595] Speed: 1.262404334241718 samples/sec                   batch loss = 1625.1863451004028 | accuracy = 0.5579831932773109


Epoch[1] Batch[600] Speed: 1.2544586094179715 samples/sec                   batch loss = 1637.8540315628052 | accuracy = 0.55875


Epoch[1] Batch[605] Speed: 1.2530000626603992 samples/sec                   batch loss = 1650.540503025055 | accuracy = 0.5599173553719008


Epoch[1] Batch[610] Speed: 1.252297111184742 samples/sec                   batch loss = 1664.195796251297 | accuracy = 0.5602459016393443


Epoch[1] Batch[615] Speed: 1.2558202480798781 samples/sec                   batch loss = 1676.7698311805725 | accuracy = 0.5605691056910569


Epoch[1] Batch[620] Speed: 1.2558499532758272 samples/sec                   batch loss = 1689.0919885635376 | accuracy = 0.5616935483870967


Epoch[1] Batch[625] Speed: 1.2570038081248138 samples/sec                   batch loss = 1702.5829231739044 | accuracy = 0.5628


Epoch[1] Batch[630] Speed: 1.2532558685712545 samples/sec                   batch loss = 1715.128858089447 | accuracy = 0.5634920634920635


Epoch[1] Batch[635] Speed: 1.2526148193691762 samples/sec                   batch loss = 1729.1550664901733 | accuracy = 0.562992125984252


Epoch[1] Batch[640] Speed: 1.2537010868956036 samples/sec                   batch loss = 1742.3244998455048 | accuracy = 0.563671875


Epoch[1] Batch[645] Speed: 1.2575390658846943 samples/sec                   batch loss = 1755.4258341789246 | accuracy = 0.5643410852713179


Epoch[1] Batch[650] Speed: 1.2547130407183056 samples/sec                   batch loss = 1768.8524878025055 | accuracy = 0.5642307692307692


Epoch[1] Batch[655] Speed: 1.250599388461269 samples/sec                   batch loss = 1781.7363214492798 | accuracy = 0.565267175572519


Epoch[1] Batch[660] Speed: 1.2455979959026537 samples/sec                   batch loss = 1795.4839310646057 | accuracy = 0.5643939393939394


Epoch[1] Batch[665] Speed: 1.246289093302001 samples/sec                   batch loss = 1807.925864458084 | accuracy = 0.5661654135338345


Epoch[1] Batch[670] Speed: 1.2436017450968722 samples/sec                   batch loss = 1821.494236946106 | accuracy = 0.566044776119403


Epoch[1] Batch[675] Speed: 1.2489730182272376 samples/sec                   batch loss = 1833.592481136322 | accuracy = 0.5681481481481482


Epoch[1] Batch[680] Speed: 1.2532187033069624 samples/sec                   batch loss = 1847.2428588867188 | accuracy = 0.5676470588235294


Epoch[1] Batch[685] Speed: 1.250196241843873 samples/sec                   batch loss = 1860.2856740951538 | accuracy = 0.5686131386861314


Epoch[1] Batch[690] Speed: 1.2546701591870781 samples/sec                   batch loss = 1873.1739146709442 | accuracy = 0.5688405797101449


Epoch[1] Batch[695] Speed: 1.245147793436819 samples/sec                   batch loss = 1885.6190192699432 | accuracy = 0.5697841726618705


Epoch[1] Batch[700] Speed: 1.2561799070257642 samples/sec                   batch loss = 1897.293045759201 | accuracy = 0.5710714285714286


Epoch[1] Batch[705] Speed: 1.2571695847257474 samples/sec                   batch loss = 1909.6886613368988 | accuracy = 0.5716312056737589


Epoch[1] Batch[710] Speed: 1.2524541686583222 samples/sec                   batch loss = 1922.9381031990051 | accuracy = 0.5721830985915493


Epoch[1] Batch[715] Speed: 1.2495907060308864 samples/sec                   batch loss = 1936.5784695148468 | accuracy = 0.5716783216783217


Epoch[1] Batch[720] Speed: 1.2507396094599637 samples/sec                   batch loss = 1949.5735132694244 | accuracy = 0.5722222222222222


Epoch[1] Batch[725] Speed: 1.2497674733896145 samples/sec                   batch loss = 1962.4325094223022 | accuracy = 0.5734482758620689


Epoch[1] Batch[730] Speed: 1.2544952854405498 samples/sec                   batch loss = 1976.153127193451 | accuracy = 0.5743150684931507


Epoch[1] Batch[735] Speed: 1.2549566859636432 samples/sec                   batch loss = 1989.905478477478 | accuracy = 0.5738095238095238


Epoch[1] Batch[740] Speed: 1.255850799330471 samples/sec                   batch loss = 2004.4072427749634 | accuracy = 0.5733108108108108


Epoch[1] Batch[745] Speed: 1.25682310493113 samples/sec                   batch loss = 2017.1999063491821 | accuracy = 0.5741610738255034


Epoch[1] Batch[750] Speed: 1.2531592621894816 samples/sec                   batch loss = 2030.3080914020538 | accuracy = 0.5743333333333334


Epoch[1] Batch[755] Speed: 1.2497039840440398 samples/sec                   batch loss = 2044.806049823761 | accuracy = 0.5731788079470199


Epoch[1] Batch[760] Speed: 1.250083247155421 samples/sec                   batch loss = 2057.988770723343 | accuracy = 0.5726973684210527


Epoch[1] Batch[765] Speed: 1.2561529137026497 samples/sec                   batch loss = 2069.4621357917786 | accuracy = 0.5741830065359477


Epoch[1] Batch[770] Speed: 1.2574567831151444 samples/sec                   batch loss = 2082.491267681122 | accuracy = 0.575


Epoch[1] Batch[775] Speed: 1.254468739665085 samples/sec                   batch loss = 2095.5948433876038 | accuracy = 0.5751612903225807


Epoch[1] Batch[780] Speed: 1.2545810275272935 samples/sec                   batch loss = 2107.979836702347 | accuracy = 0.5756410256410256


Epoch[1] Batch[785] Speed: 1.2505379584479694 samples/sec                   batch loss = 2121.254874229431 | accuracy = 0.575796178343949


[Epoch 1] training: accuracy=0.5764593908629442
[Epoch 1] time cost: 645.9716329574585
[Epoch 1] validation: validation accuracy=0.6544444444444445


Epoch[2] Batch[5] Speed: 1.2592605157820231 samples/sec                   batch loss = 14.283987641334534 | accuracy = 0.6


Epoch[2] Batch[10] Speed: 1.2624314069205664 samples/sec                   batch loss = 26.593279123306274 | accuracy = 0.675


Epoch[2] Batch[15] Speed: 1.2518295375886135 samples/sec                   batch loss = 39.32531487941742 | accuracy = 0.65


Epoch[2] Batch[20] Speed: 1.2580324227306443 samples/sec                   batch loss = 51.831812024116516 | accuracy = 0.6625


Epoch[2] Batch[25] Speed: 1.2558789077939094 samples/sec                   batch loss = 66.74547958374023 | accuracy = 0.64


Epoch[2] Batch[30] Speed: 1.2592215758613252 samples/sec                   batch loss = 79.80010843276978 | accuracy = 0.625


Epoch[2] Batch[35] Speed: 1.2566928124118932 samples/sec                   batch loss = 92.43169951438904 | accuracy = 0.6214285714285714


Epoch[2] Batch[40] Speed: 1.2599663883115797 samples/sec                   batch loss = 104.51234412193298 | accuracy = 0.65


Epoch[2] Batch[45] Speed: 1.2535624490515653 samples/sec                   batch loss = 117.24924421310425 | accuracy = 0.6388888888888888


Epoch[2] Batch[50] Speed: 1.2555472329111534 samples/sec                   batch loss = 129.69120121002197 | accuracy = 0.65


Epoch[2] Batch[55] Speed: 1.255323458428842 samples/sec                   batch loss = 142.6815688610077 | accuracy = 0.6545454545454545


Epoch[2] Batch[60] Speed: 1.2547761954437036 samples/sec                   batch loss = 155.31690394878387 | accuracy = 0.6458333333333334


Epoch[2] Batch[65] Speed: 1.2541639658670256 samples/sec                   batch loss = 168.36378419399261 | accuracy = 0.65


Epoch[2] Batch[70] Speed: 1.2533072669894083 samples/sec                   batch loss = 181.09814202785492 | accuracy = 0.6428571428571429


Epoch[2] Batch[75] Speed: 1.2556219361071266 samples/sec                   batch loss = 193.27497160434723 | accuracy = 0.65


Epoch[2] Batch[80] Speed: 1.2551288714302098 samples/sec                   batch loss = 206.48844945430756 | accuracy = 0.646875


Epoch[2] Batch[85] Speed: 1.2544099302348604 samples/sec                   batch loss = 218.40766441822052 | accuracy = 0.6529411764705882


Epoch[2] Batch[90] Speed: 1.260437883269944 samples/sec                   batch loss = 232.6859780550003 | accuracy = 0.6472222222222223


Epoch[2] Batch[95] Speed: 1.254738752276087 samples/sec                   batch loss = 244.7830754518509 | accuracy = 0.6526315789473685


Epoch[2] Batch[100] Speed: 1.2594955296254278 samples/sec                   batch loss = 257.79610681533813 | accuracy = 0.6475


Epoch[2] Batch[105] Speed: 1.2569442899225263 samples/sec                   batch loss = 270.5964901447296 | accuracy = 0.65


Epoch[2] Batch[110] Speed: 1.2594454187540172 samples/sec                   batch loss = 282.0552110671997 | accuracy = 0.6545454545454545


Epoch[2] Batch[115] Speed: 1.260264238184696 samples/sec                   batch loss = 294.34465193748474 | accuracy = 0.6543478260869565


Epoch[2] Batch[120] Speed: 1.2574275673110815 samples/sec                   batch loss = 306.7582402229309 | accuracy = 0.65


Epoch[2] Batch[125] Speed: 1.2596366178715486 samples/sec                   batch loss = 321.56388211250305 | accuracy = 0.642


Epoch[2] Batch[130] Speed: 1.2628814583808499 samples/sec                   batch loss = 334.9535083770752 | accuracy = 0.6403846153846153


Epoch[2] Batch[135] Speed: 1.2601674948873252 samples/sec                   batch loss = 348.8676335811615 | accuracy = 0.6388888888888888


Epoch[2] Batch[140] Speed: 1.2587382408862509 samples/sec                   batch loss = 360.56756806373596 | accuracy = 0.6410714285714286


Epoch[2] Batch[145] Speed: 1.2566358649909442 samples/sec                   batch loss = 372.6108675003052 | accuracy = 0.6396551724137931


Epoch[2] Batch[150] Speed: 1.2570695483861396 samples/sec                   batch loss = 384.22078120708466 | accuracy = 0.6433333333333333


Epoch[2] Batch[155] Speed: 1.2591437032442816 samples/sec                   batch loss = 397.8081065416336 | accuracy = 0.6403225806451613


Epoch[2] Batch[160] Speed: 1.2626405231119244 samples/sec                   batch loss = 412.681933760643 | accuracy = 0.6375


Epoch[2] Batch[165] Speed: 1.2628998055428822 samples/sec                   batch loss = 424.38802671432495 | accuracy = 0.6424242424242425


Epoch[2] Batch[170] Speed: 1.2585164429803823 samples/sec                   batch loss = 436.91947722435 | accuracy = 0.6441176470588236


Epoch[2] Batch[175] Speed: 1.2552097228907892 samples/sec                   batch loss = 452.11203241348267 | accuracy = 0.6371428571428571


Epoch[2] Batch[180] Speed: 1.256580993268162 samples/sec                   batch loss = 462.22359824180603 | accuracy = 0.6458333333333334


Epoch[2] Batch[185] Speed: 1.2642439241228305 samples/sec                   batch loss = 476.7693691253662 | accuracy = 0.6391891891891892


Epoch[2] Batch[190] Speed: 1.2536494688179913 samples/sec                   batch loss = 490.580970287323 | accuracy = 0.6368421052631579


Epoch[2] Batch[195] Speed: 1.250110911732442 samples/sec                   batch loss = 503.51983523368835 | accuracy = 0.6358974358974359


Epoch[2] Batch[200] Speed: 1.2546930539944074 samples/sec                   batch loss = 517.8862478733063 | accuracy = 0.6325


Epoch[2] Batch[205] Speed: 1.2516605906043032 samples/sec                   batch loss = 532.7462391853333 | accuracy = 0.6292682926829268


Epoch[2] Batch[210] Speed: 1.2532331197976687 samples/sec                   batch loss = 545.3762565851212 | accuracy = 0.6285714285714286


Epoch[2] Batch[215] Speed: 1.2499820071068575 samples/sec                   batch loss = 557.7150055170059 | accuracy = 0.6302325581395349


Epoch[2] Batch[220] Speed: 1.2519700345945828 samples/sec                   batch loss = 570.2050951719284 | accuracy = 0.6306818181818182


Epoch[2] Batch[225] Speed: 1.2530867235837715 samples/sec                   batch loss = 580.6921987533569 | accuracy = 0.6344444444444445


Epoch[2] Batch[230] Speed: 1.2557919542474556 samples/sec                   batch loss = 593.1795945167542 | accuracy = 0.6369565217391304


Epoch[2] Batch[235] Speed: 1.258371169487821 samples/sec                   batch loss = 605.8498675823212 | accuracy = 0.6372340425531915


Epoch[2] Batch[240] Speed: 1.254189936199132 samples/sec                   batch loss = 619.3256590366364 | accuracy = 0.6375


Epoch[2] Batch[245] Speed: 1.2588270197209441 samples/sec                   batch loss = 634.5651681423187 | accuracy = 0.6346938775510204


Epoch[2] Batch[250] Speed: 1.256061502423972 samples/sec                   batch loss = 646.5881236791611 | accuracy = 0.636


Epoch[2] Batch[255] Speed: 1.2504427957922102 samples/sec                   batch loss = 658.4199162721634 | accuracy = 0.6362745098039215


Epoch[2] Batch[260] Speed: 1.2573768669767562 samples/sec                   batch loss = 670.1869940757751 | accuracy = 0.6375


Epoch[2] Batch[265] Speed: 1.2514291447136061 samples/sec                   batch loss = 681.8034892082214 | accuracy = 0.6386792452830189


Epoch[2] Batch[270] Speed: 1.2558710109747506 samples/sec                   batch loss = 692.0139825344086 | accuracy = 0.6425925925925926


Epoch[2] Batch[275] Speed: 1.2558437489099368 samples/sec                   batch loss = 703.0262916088104 | accuracy = 0.6472727272727272


Epoch[2] Batch[280] Speed: 1.2576987609986316 samples/sec                   batch loss = 713.4995647668839 | accuracy = 0.6491071428571429


Epoch[2] Batch[285] Speed: 1.2565181272706085 samples/sec                   batch loss = 726.0586124658585 | accuracy = 0.6456140350877193


Epoch[2] Batch[290] Speed: 1.2509337704822203 samples/sec                   batch loss = 738.5396819114685 | accuracy = 0.6439655172413793


Epoch[2] Batch[295] Speed: 1.2508286623880434 samples/sec                   batch loss = 750.8051860332489 | accuracy = 0.6449152542372881


Epoch[2] Batch[300] Speed: 1.2531933348332318 samples/sec                   batch loss = 764.6120903491974 | accuracy = 0.6416666666666667


Epoch[2] Batch[305] Speed: 1.2516496652549824 samples/sec                   batch loss = 776.7548500299454 | accuracy = 0.6434426229508197


Epoch[2] Batch[310] Speed: 1.2571792877789238 samples/sec                   batch loss = 790.0671020746231 | accuracy = 0.6395161290322581


Epoch[2] Batch[315] Speed: 1.262118384683796 samples/sec                   batch loss = 800.5372694730759 | accuracy = 0.6396825396825396


Epoch[2] Batch[320] Speed: 1.252828647850359 samples/sec                   batch loss = 811.5193133354187 | accuracy = 0.64296875


Epoch[2] Batch[325] Speed: 1.2466495192952511 samples/sec                   batch loss = 822.8387025594711 | accuracy = 0.6446153846153846


Epoch[2] Batch[330] Speed: 1.2485834635707376 samples/sec                   batch loss = 834.8879655599594 | accuracy = 0.6446969696969697


Epoch[2] Batch[335] Speed: 1.25062344009526 samples/sec                   batch loss = 845.5399533510208 | accuracy = 0.6477611940298508


Epoch[2] Batch[340] Speed: 1.2497326558529225 samples/sec                   batch loss = 855.5811938047409 | accuracy = 0.6485294117647059


Epoch[2] Batch[345] Speed: 1.2455377035155084 samples/sec                   batch loss = 869.6267772912979 | accuracy = 0.6478260869565218


Epoch[2] Batch[350] Speed: 1.2506721987655354 samples/sec                   batch loss = 883.0953327417374 | accuracy = 0.6478571428571429


Epoch[2] Batch[355] Speed: 1.2478193750228705 samples/sec                   batch loss = 897.3582507371902 | accuracy = 0.6443661971830986


Epoch[2] Batch[360] Speed: 1.2493867270920107 samples/sec                   batch loss = 909.2230110168457 | accuracy = 0.6444444444444445


Epoch[2] Batch[365] Speed: 1.2534632668715724 samples/sec                   batch loss = 921.2718939781189 | accuracy = 0.6438356164383562


Epoch[2] Batch[370] Speed: 1.2527078809744967 samples/sec                   batch loss = 935.1488900184631 | accuracy = 0.6432432432432432


Epoch[2] Batch[375] Speed: 1.2529882717117546 samples/sec                   batch loss = 947.4849826097488 | accuracy = 0.6426666666666667


Epoch[2] Batch[380] Speed: 1.2579934643968138 samples/sec                   batch loss = 961.7927047014236 | accuracy = 0.6414473684210527


Epoch[2] Batch[385] Speed: 1.2578634949651244 samples/sec                   batch loss = 972.5821179151535 | accuracy = 0.6435064935064935


Epoch[2] Batch[390] Speed: 1.2573894945587145 samples/sec                   batch loss = 984.5610686540604 | accuracy = 0.6442307692307693


Epoch[2] Batch[395] Speed: 1.2543672569875102 samples/sec                   batch loss = 998.0240675210953 | accuracy = 0.6443037974683544


Epoch[2] Batch[400] Speed: 1.2507895894373495 samples/sec                   batch loss = 1011.1019669771194 | accuracy = 0.64375


Epoch[2] Batch[405] Speed: 1.254263634044719 samples/sec                   batch loss = 1026.832890868187 | accuracy = 0.6407407407407407


Epoch[2] Batch[410] Speed: 1.253707457479337 samples/sec                   batch loss = 1038.3316017389297 | accuracy = 0.6408536585365854


Epoch[2] Batch[415] Speed: 1.2498322726868418 samples/sec                   batch loss = 1049.6954694986343 | accuracy = 0.641566265060241


Epoch[2] Batch[420] Speed: 1.251827856299444 samples/sec                   batch loss = 1062.558354973793 | accuracy = 0.6410714285714286


Epoch[2] Batch[425] Speed: 1.250224284084977 samples/sec                   batch loss = 1077.4610034227371 | accuracy = 0.64


Epoch[2] Batch[430] Speed: 1.252361144743525 samples/sec                   batch loss = 1094.8981779813766 | accuracy = 0.6366279069767442


Epoch[2] Batch[435] Speed: 1.2573868559385375 samples/sec                   batch loss = 1105.1859741210938 | accuracy = 0.639080459770115


Epoch[2] Batch[440] Speed: 1.25358923748803 samples/sec                   batch loss = 1116.9093223810196 | accuracy = 0.6397727272727273


Epoch[2] Batch[445] Speed: 1.2591490897535764 samples/sec                   batch loss = 1128.9516615867615 | accuracy = 0.6398876404494382


Epoch[2] Batch[450] Speed: 1.262666940716379 samples/sec                   batch loss = 1140.029715180397 | accuracy = 0.6405555555555555


Epoch[2] Batch[455] Speed: 1.2625648874516573 samples/sec                   batch loss = 1149.4468581676483 | accuracy = 0.6428571428571429


Epoch[2] Batch[460] Speed: 1.2559433081811036 samples/sec                   batch loss = 1163.9700450897217 | accuracy = 0.6423913043478261


Epoch[2] Batch[465] Speed: 1.2566953539873553 samples/sec                   batch loss = 1176.237719297409 | accuracy = 0.6435483870967742


Epoch[2] Batch[470] Speed: 1.2535048484243976 samples/sec                   batch loss = 1188.0297597646713 | accuracy = 0.6436170212765957


Epoch[2] Batch[475] Speed: 1.2547478547954505 samples/sec                   batch loss = 1201.9631630182266 | accuracy = 0.6426315789473684


Epoch[2] Batch[480] Speed: 1.259059226127938 samples/sec                   batch loss = 1213.3216586112976 | accuracy = 0.6442708333333333


Epoch[2] Batch[485] Speed: 1.2580998744615912 samples/sec                   batch loss = 1222.8564363718033 | accuracy = 0.6458762886597939


Epoch[2] Batch[490] Speed: 1.257551979507646 samples/sec                   batch loss = 1236.097174525261 | accuracy = 0.6448979591836734


Epoch[2] Batch[495] Speed: 1.2600021569288937 samples/sec                   batch loss = 1248.1168903112411 | accuracy = 0.6464646464646465


Epoch[2] Batch[500] Speed: 1.2572423141683473 samples/sec                   batch loss = 1259.65478849411 | accuracy = 0.647


Epoch[2] Batch[505] Speed: 1.2583753223903018 samples/sec                   batch loss = 1273.1795054674149 | accuracy = 0.6460396039603961


Epoch[2] Batch[510] Speed: 1.2593805641880227 samples/sec                   batch loss = 1285.8553618192673 | accuracy = 0.6460784313725491


Epoch[2] Batch[515] Speed: 1.254867889027037 samples/sec                   batch loss = 1298.5003691911697 | accuracy = 0.6456310679611651


Epoch[2] Batch[520] Speed: 1.2609330438668576 samples/sec                   batch loss = 1309.8390175104141 | accuracy = 0.6466346153846154


Epoch[2] Batch[525] Speed: 1.2585810197443217 samples/sec                   batch loss = 1322.2629135847092 | accuracy = 0.6461904761904762


Epoch[2] Batch[530] Speed: 1.2637690067598004 samples/sec                   batch loss = 1336.5623453855515 | accuracy = 0.6443396226415095


Epoch[2] Batch[535] Speed: 1.2575294515347328 samples/sec                   batch loss = 1349.0283943414688 | accuracy = 0.6439252336448598


Epoch[2] Batch[540] Speed: 1.254133965378215 samples/sec                   batch loss = 1362.1141167879105 | accuracy = 0.6439814814814815


Epoch[2] Batch[545] Speed: 1.2563905324025815 samples/sec                   batch loss = 1372.8763808012009 | accuracy = 0.6449541284403669


Epoch[2] Batch[550] Speed: 1.257899615849586 samples/sec                   batch loss = 1387.6296268701553 | accuracy = 0.6436363636363637


Epoch[2] Batch[555] Speed: 1.2560701539502774 samples/sec                   batch loss = 1399.485409617424 | accuracy = 0.6441441441441441


Epoch[2] Batch[560] Speed: 1.258370791952591 samples/sec                   batch loss = 1410.2325465679169 | accuracy = 0.6464285714285715


Epoch[2] Batch[565] Speed: 1.2536187435594737 samples/sec                   batch loss = 1422.3115253448486 | accuracy = 0.6473451327433628


Epoch[2] Batch[570] Speed: 1.2627544688077244 samples/sec                   batch loss = 1434.6863446235657 | accuracy = 0.6486842105263158


Epoch[2] Batch[575] Speed: 1.2554076228458948 samples/sec                   batch loss = 1447.6736170053482 | accuracy = 0.648695652173913


Epoch[2] Batch[580] Speed: 1.2585414609695138 samples/sec                   batch loss = 1459.3324452638626 | accuracy = 0.6495689655172414


Epoch[2] Batch[585] Speed: 1.253591579188989 samples/sec                   batch loss = 1469.4185367822647 | accuracy = 0.6508547008547009


Epoch[2] Batch[590] Speed: 1.259488060016232 samples/sec                   batch loss = 1479.8265752792358 | accuracy = 0.6521186440677966


Epoch[2] Batch[595] Speed: 1.2536524664848245 samples/sec                   batch loss = 1489.8871763944626 | accuracy = 0.6529411764705882


Epoch[2] Batch[600] Speed: 1.259847268857948 samples/sec                   batch loss = 1503.445172905922 | accuracy = 0.6533333333333333


Epoch[2] Batch[605] Speed: 1.253046386392795 samples/sec                   batch loss = 1516.3475595712662 | accuracy = 0.6516528925619834


Epoch[2] Batch[610] Speed: 1.2569796046315471 samples/sec                   batch loss = 1526.3869735002518 | accuracy = 0.6532786885245901


Epoch[2] Batch[615] Speed: 1.253124629807535 samples/sec                   batch loss = 1535.6636258363724 | accuracy = 0.6548780487804878


Epoch[2] Batch[620] Speed: 1.255195730429295 samples/sec                   batch loss = 1545.8412219285965 | accuracy = 0.655241935483871


Epoch[2] Batch[625] Speed: 1.2537329404617163 samples/sec                   batch loss = 1553.8676273822784 | accuracy = 0.6568


Epoch[2] Batch[630] Speed: 1.254061407629741 samples/sec                   batch loss = 1563.6149243116379 | accuracy = 0.6575396825396825


Epoch[2] Batch[635] Speed: 1.2504529544840837 samples/sec                   batch loss = 1573.8998042345047 | accuracy = 0.658267716535433


Epoch[2] Batch[640] Speed: 1.2510763987927123 samples/sec                   batch loss = 1582.5402263402939 | accuracy = 0.659375


Epoch[2] Batch[645] Speed: 1.2543573159446673 samples/sec                   batch loss = 1595.615415096283 | accuracy = 0.6593023255813953


Epoch[2] Batch[650] Speed: 1.2559890035562065 samples/sec                   batch loss = 1607.8665634393692 | accuracy = 0.6584615384615384


Epoch[2] Batch[655] Speed: 1.25094832099799 samples/sec                   batch loss = 1619.7760084867477 | accuracy = 0.6591603053435114


Epoch[2] Batch[660] Speed: 1.2502896898148497 samples/sec                   batch loss = 1632.7111951112747 | accuracy = 0.6587121212121212


Epoch[2] Batch[665] Speed: 1.2575655532366437 samples/sec                   batch loss = 1642.9876997470856 | accuracy = 0.6593984962406015


Epoch[2] Batch[670] Speed: 1.2561449193871121 samples/sec                   batch loss = 1654.4425531625748 | accuracy = 0.6600746268656716


Epoch[2] Batch[675] Speed: 1.2597053767485724 samples/sec                   batch loss = 1665.7521039247513 | accuracy = 0.66


Epoch[2] Batch[680] Speed: 1.2529820020119748 samples/sec                   batch loss = 1677.2012286186218 | accuracy = 0.6602941176470588


Epoch[2] Batch[685] Speed: 1.2479237923125106 samples/sec                   batch loss = 1689.7081583738327 | accuracy = 0.6605839416058394


Epoch[2] Batch[690] Speed: 1.2497481093864506 samples/sec                   batch loss = 1703.3808139562607 | accuracy = 0.6605072463768116


Epoch[2] Batch[695] Speed: 1.2537767886061055 samples/sec                   batch loss = 1714.314592719078 | accuracy = 0.6607913669064748


Epoch[2] Batch[700] Speed: 1.257012755177347 samples/sec                   batch loss = 1725.3014847040176 | accuracy = 0.6610714285714285


Epoch[2] Batch[705] Speed: 1.2544899386745738 samples/sec                   batch loss = 1742.036800980568 | accuracy = 0.6595744680851063


Epoch[2] Batch[710] Speed: 1.2564096323448413 samples/sec                   batch loss = 1755.5325291156769 | accuracy = 0.6591549295774648


Epoch[2] Batch[715] Speed: 1.2578269046438233 samples/sec                   batch loss = 1767.0839340686798 | accuracy = 0.6594405594405595


Epoch[2] Batch[720] Speed: 1.254849962206144 samples/sec                   batch loss = 1782.5687869787216 | accuracy = 0.6590277777777778


Epoch[2] Batch[725] Speed: 1.2562210105856055 samples/sec                   batch loss = 1795.217255473137 | accuracy = 0.6579310344827586


Epoch[2] Batch[730] Speed: 1.252805353278795 samples/sec                   batch loss = 1805.5497615337372 | accuracy = 0.6582191780821918


Epoch[2] Batch[735] Speed: 1.2499567695012388 samples/sec                   batch loss = 1816.2416316270828 | accuracy = 0.658843537414966


Epoch[2] Batch[740] Speed: 1.2503624640610194 samples/sec                   batch loss = 1826.6547940969467 | accuracy = 0.6601351351351351


Epoch[2] Batch[745] Speed: 1.251326286605596 samples/sec                   batch loss = 1838.2804690599442 | accuracy = 0.6604026845637584


Epoch[2] Batch[750] Speed: 1.2488583852871682 samples/sec                   batch loss = 1848.0938680171967 | accuracy = 0.6613333333333333


Epoch[2] Batch[755] Speed: 1.2458667934027907 samples/sec                   batch loss = 1861.1689140796661 | accuracy = 0.6612582781456954


Epoch[2] Batch[760] Speed: 1.250423131270528 samples/sec                   batch loss = 1872.9166518449783 | accuracy = 0.6611842105263158


Epoch[2] Batch[765] Speed: 1.2529084546969311 samples/sec                   batch loss = 1886.0709252357483 | accuracy = 0.6611111111111111


Epoch[2] Batch[770] Speed: 1.2462834459408765 samples/sec                   batch loss = 1898.561942100525 | accuracy = 0.662012987012987


Epoch[2] Batch[775] Speed: 1.2491275689261685 samples/sec                   batch loss = 1910.1427505016327 | accuracy = 0.6619354838709678


Epoch[2] Batch[780] Speed: 1.2518810993160894 samples/sec                   batch loss = 1923.2012622356415 | accuracy = 0.6615384615384615


Epoch[2] Batch[785] Speed: 1.2468248994942908 samples/sec                   batch loss = 1934.367444038391 | accuracy = 0.6621019108280255


[Epoch 2] training: accuracy=0.6621192893401016
[Epoch 2] time cost: 644.2844243049622
[Epoch 2] validation: validation accuracy=0.7333333333333333
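
The `batch loss` value in the log above appears to be a running total within each epoch: it grows steadily and resets at the start of epoch 2. Under that assumption, dividing it by the batch index gives an average loss per batch, which is easier to compare across epochs. The following is a minimal sketch of that calculation; `parse_log_line` and `LOG_PATTERN` are hypothetical names introduced here for illustration and are not part of MXNet.

```python
import re

# Pattern for the per-batch log lines printed by the training loop above.
# The field layout is taken from the log output; it is not an MXNet API.
LOG_PATTERN = re.compile(
    r"Epoch\[(\d+)\] Batch\[(\d+)\] Speed: ([\d.]+) samples/sec\s+"
    r"batch loss = ([\d.]+) \| accuracy = ([\d.]+)"
)


def parse_log_line(line):
    """Hypothetical helper: return the fields of one log line, or None."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    epoch, batch, speed, loss, accuracy = match.groups()
    return {
        "epoch": int(epoch),
        "batch": int(batch),
        "speed": float(speed),
        "running_loss": float(loss),  # assumed to be cumulative per epoch
        "accuracy": float(accuracy),
    }


# Last per-batch line of epoch 2, copied from the log above.
line = (
    "Epoch[2] Batch[785] Speed: 1.2468248994942908 samples/sec"
    "                   batch loss = 1934.367444038391 | accuracy = 0.6621019108280255"
)
record = parse_log_line(line)

# Average loss per batch, assuming the logged value is a running sum.
mean_loss = record["running_loss"] / record["batch"]
print(f"Epoch {record['epoch']}: mean loss per batch = {mean_loss:.3f}")
```

Applying the same calculation to the last logged line of each epoch gives a quick check of whether the loss is actually decreasing from one epoch to the next.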


## Next steps

Now that you have trained a neural network and run predictions with it on GPUs, you have reached the end of the crash course. Congratulations!
If you want to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).