<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `device` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or `npx.gpu(0)` as the device, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), device=gpu)
x

[10:26:57] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], device=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. In the example code here, you will copy `x` to the second GPU, `npx.gpu(1)`. If you only have one GPU, copying to `npx.gpu(1)` would result in an error, so the code falls back to `npx.cpu()` in that case:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[10:26:57] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, will implicitly move data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through zero-based indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has, so if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
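
The zero-based numbering can be sketched in plain Python. The helper `select_gpu_ids` below is hypothetical and not part of MXNet; it assumes you already know how many GPUs are visible (in this tutorial that number would come from `npx.num_gpus()`) and returns the ids you would pass to `npx.gpu(i)`.

```python
def select_gpu_ids(num_available, num_requested):
    """Return the zero-based ids of the GPUs to use.

    Hypothetical helper for illustration; not part of MXNet.
    """
    if num_requested < 0:
        raise ValueError("num_requested must be non-negative")
    # Never ask for more GPUs than the machine exposes.
    count = min(num_available, num_requested)
    return list(range(count))


# On a machine with eight GPUs, the ids run from 0 to 7.
all_ids = select_gpu_ids(num_available=8, num_requested=8)
print(all_ids)

# Requesting two GPUs on a single-GPU machine yields only id 0.
single = select_gpu_ids(num_available=1, num_requested=2)
print(single)

# With no GPU at all the list is empty.
none = select_gpu_ids(num_available=0, num_requested=2)
print(none)
```

With MXNet installed, a list such as `[npx.gpu(i) for i in select_gpu_ids(npx.num_gpus(), 2)]` would then hold the devices to use. An empty list signals that no GPU is available, in which case you may want to fall back to `npx.cpu()`.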

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), device=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], device=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.
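
The rule can be illustrated with a toy model in plain Python. `ToyArray` below is a hypothetical stand-in, not MXNet's ndarray: it only tags a list of values with a device name, refuses to add operands on different devices, and offers a `copyto` method modeled loosely on the one used earlier. The `ValueError` is this sketch's choice; the exception MXNet raises may differ.

```python
class ToyArray:
    """Toy device-tagged array for illustration; not MXNet's ndarray."""

    def __init__(self, values, device="cpu(0)"):
        self.values = list(values)
        self.device = device

    def copyto(self, device):
        """Return a copy of this array tagged with another device."""
        return ToyArray(self.values, device)

    def __add__(self, other):
        # Operands must live on the same device, as in the text above.
        if self.device != other.device:
            raise ValueError(
                f"operands on different devices: {self.device} vs {other.device}"
            )
        summed = [a + b for a, b in zip(self.values, other.values)]
        return ToyArray(summed, self.device)


x = ToyArray([1.0, 1.0], device="gpu(0)")
y = ToyArray([2.0, 3.0], device="gpu(1)")

# Adding arrays on different devices fails in this toy model.
try:
    x + y
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("error:", err)

# Copying y to x's device first makes the addition valid.
z = x + y.copyto(x.device)
print(z.values, z.device)
```

The pattern to take away is the last step: copy one operand to the other's device explicitly, then run the operation.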

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below; alternatively, you could use `net.collect_params().reset_device(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', device=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), device=gpu)
net(x)

[10:26:58] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 6.6895742, -3.056898 ]], device=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, assuming there are *n* GPUs, you split each data batch into *n* parts and have each GPU run the forward and backward passes on its own part of the data.
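
Why splitting a batch this way still gives the same update can be checked with a small plain-Python sketch. It assumes a toy scalar model `w * x` with squared-error loss; `split_batch` and `grad_sum` are hypothetical helpers written for this illustration, not MXNet APIs, and the "devices" are just list chunks.

```python
def split_batch(samples, num_parts):
    """Split a list into num_parts equal, contiguous chunks.

    Simplified stand-in for illustration; requires an even split.
    """
    if len(samples) % num_parts != 0:
        raise ValueError("batch size must be divisible by num_parts")
    step = len(samples) // num_parts
    return [samples[i * step:(i + 1) * step] for i in range(num_parts)]


def grad_sum(w, chunk):
    """Sum of d/dw (w*x - y)^2 over the (x, y) pairs in chunk."""
    return sum(2.0 * (w * x - y) * x for x, y in chunk)


# A toy batch of (input, target) pairs and a scalar weight.
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 9.0)]
w = 0.5
num_devices = 2  # stands in for the number of GPUs

# Each "device" computes the gradient on its own chunk.
chunks = split_batch(batch, num_devices)
per_device = [grad_sum(w, chunk) for chunk in chunks]

# Summing the per-device gradients reproduces the full-batch gradient.
combined = sum(per_device)
full = grad_sum(w, batch)
print(per_device, combined, full)

# Dividing by the batch size gives the averaged gradient for the update.
averaged = combined / len(batch)
print(averaged)
```

Because the loss is a sum over samples, its gradient is a sum too, so each chunk can be processed independently and the partial results added afterwards.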

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations for the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, device=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.77374825912342 samples/sec                   batch loss = 15.640437126159668 | accuracy = 0.35


Epoch[1] Batch[10] Speed: 1.2512556397885968 samples/sec                   batch loss = 29.599234104156494 | accuracy = 0.475


Epoch[1] Batch[15] Speed: 1.2477035618849133 samples/sec                   batch loss = 42.889941692352295 | accuracy = 0.5


Epoch[1] Batch[20] Speed: 1.2554344902034045 samples/sec                   batch loss = 56.58572483062744 | accuracy = 0.525


Epoch[1] Batch[25] Speed: 1.244434791922382 samples/sec                   batch loss = 71.60910487174988 | accuracy = 0.49


Epoch[1] Batch[30] Speed: 1.257400426103138 samples/sec                   batch loss = 85.54032826423645 | accuracy = 0.5


Epoch[1] Batch[35] Speed: 1.2493990086108897 samples/sec                   batch loss = 99.50774192810059 | accuracy = 0.5


Epoch[1] Batch[40] Speed: 1.254771033972019 samples/sec                   batch loss = 113.82843708992004 | accuracy = 0.4875


Epoch[1] Batch[45] Speed: 1.2561651404997043 samples/sec                   batch loss = 128.81016874313354 | accuracy = 0.4722222222222222


Epoch[1] Batch[50] Speed: 1.2458887204077391 samples/sec                   batch loss = 143.27457642555237 | accuracy = 0.46


Epoch[1] Batch[55] Speed: 1.260268782266413 samples/sec                   batch loss = 156.86026549339294 | accuracy = 0.4681818181818182


Epoch[1] Batch[60] Speed: 1.260818479120936 samples/sec                   batch loss = 170.72841024398804 | accuracy = 0.4791666666666667


Epoch[1] Batch[65] Speed: 1.258545803816112 samples/sec                   batch loss = 183.93394207954407 | accuracy = 0.49230769230769234


Epoch[1] Batch[70] Speed: 1.2548806539865789 samples/sec                   batch loss = 197.56521725654602 | accuracy = 0.49642857142857144


Epoch[1] Batch[75] Speed: 1.2547147297664056 samples/sec                   batch loss = 211.38890600204468 | accuracy = 0.49666666666666665


Epoch[1] Batch[80] Speed: 1.2531548628348272 samples/sec                   batch loss = 225.64129519462585 | accuracy = 0.496875


Epoch[1] Batch[85] Speed: 1.2602669835634817 samples/sec                   batch loss = 239.605126619339 | accuracy = 0.49411764705882355


Epoch[1] Batch[90] Speed: 1.2536270804471963 samples/sec                   batch loss = 252.6695294380188 | accuracy = 0.5055555555555555


Epoch[1] Batch[95] Speed: 1.257958752950564 samples/sec                   batch loss = 266.7318468093872 | accuracy = 0.5026315789473684


Epoch[1] Batch[100] Speed: 1.2512300707800634 samples/sec                   batch loss = 280.6677174568176 | accuracy = 0.4975


Epoch[1] Batch[105] Speed: 1.253248191938051 samples/sec                   batch loss = 294.38169026374817 | accuracy = 0.5


Epoch[1] Batch[110] Speed: 1.2628406782564698 samples/sec                   batch loss = 308.0403769016266 | accuracy = 0.5


Epoch[1] Batch[115] Speed: 1.2521382240703738 samples/sec                   batch loss = 321.7123968601227 | accuracy = 0.49782608695652175


Epoch[1] Batch[120] Speed: 1.2548120453823706 samples/sec                   batch loss = 334.9118101596832 | accuracy = 0.5020833333333333


Epoch[1] Batch[125] Speed: 1.2485418362177103 samples/sec                   batch loss = 348.58449387550354 | accuracy = 0.508


Epoch[1] Batch[130] Speed: 1.2413462954943268 samples/sec                   batch loss = 361.8364074230194 | accuracy = 0.5096153846153846


Epoch[1] Batch[135] Speed: 1.2569996642699541 samples/sec                   batch loss = 375.2692196369171 | accuracy = 0.5111111111111111


Epoch[1] Batch[140] Speed: 1.2590367386047647 samples/sec                   batch loss = 388.9134178161621 | accuracy = 0.5160714285714286


Epoch[1] Batch[145] Speed: 1.2510574606567366 samples/sec                   batch loss = 402.15671157836914 | accuracy = 0.5172413793103449


Epoch[1] Batch[150] Speed: 1.2577026266133107 samples/sec                   batch loss = 415.7288064956665 | accuracy = 0.5183333333333333


Epoch[1] Batch[155] Speed: 1.2542415050285638 samples/sec                   batch loss = 429.1588544845581 | accuracy = 0.5209677419354839


Epoch[1] Batch[160] Speed: 1.2619312723032405 samples/sec                   batch loss = 442.47604155540466 | accuracy = 0.525


Epoch[1] Batch[165] Speed: 1.2562445264175606 samples/sec                   batch loss = 455.95080518722534 | accuracy = 0.5287878787878788


Epoch[1] Batch[170] Speed: 1.2550660567539593 samples/sec                   batch loss = 469.72021532058716 | accuracy = 0.5294117647058824


Epoch[1] Batch[175] Speed: 1.25863701051864 samples/sec                   batch loss = 483.31555438041687 | accuracy = 0.5314285714285715


Epoch[1] Batch[180] Speed: 1.2571635557196084 samples/sec                   batch loss = 497.4079542160034 | accuracy = 0.5305555555555556


Epoch[1] Batch[185] Speed: 1.2542212520371394 samples/sec                   batch loss = 511.19031620025635 | accuracy = 0.5283783783783784


Epoch[1] Batch[190] Speed: 1.2531334281231878 samples/sec                   batch loss = 525.253279209137 | accuracy = 0.5276315789473685


Epoch[1] Batch[195] Speed: 1.2455445462228645 samples/sec                   batch loss = 539.7338716983795 | accuracy = 0.5243589743589744


Epoch[1] Batch[200] Speed: 1.2469020898291572 samples/sec                   batch loss = 553.0027496814728 | accuracy = 0.5275


Epoch[1] Batch[205] Speed: 1.251096830232589 samples/sec                   batch loss = 566.6938879489899 | accuracy = 0.526829268292683


Epoch[1] Batch[210] Speed: 1.250566761754334 samples/sec                   batch loss = 579.7701337337494 | accuracy = 0.5321428571428571


Epoch[1] Batch[215] Speed: 1.2523481505428948 samples/sec                   batch loss = 593.1957769393921 | accuracy = 0.5325581395348837


Epoch[1] Batch[220] Speed: 1.2439834005205725 samples/sec                   batch loss = 606.6920788288116 | accuracy = 0.5340909090909091


Epoch[1] Batch[225] Speed: 1.244687573864583 samples/sec                   batch loss = 620.705265045166 | accuracy = 0.5333333333333333


Epoch[1] Batch[230] Speed: 1.2466680463037925 samples/sec                   batch loss = 634.2473084926605 | accuracy = 0.5369565217391304


Epoch[1] Batch[235] Speed: 1.2453829304640334 samples/sec                   batch loss = 647.7565445899963 | accuracy = 0.5414893617021277


Epoch[1] Batch[240] Speed: 1.2466042230784287 samples/sec                   batch loss = 660.550281047821 | accuracy = 0.5447916666666667


Epoch[1] Batch[245] Speed: 1.2484268179195497 samples/sec                   batch loss = 674.688761472702 | accuracy = 0.5469387755102041


Epoch[1] Batch[250] Speed: 1.2487171916379227 samples/sec                   batch loss = 688.7875769138336 | accuracy = 0.544


Epoch[1] Batch[255] Speed: 1.2412616181353704 samples/sec                   batch loss = 702.2243006229401 | accuracy = 0.546078431372549


Epoch[1] Batch[260] Speed: 1.2515661906767912 samples/sec                   batch loss = 715.6220045089722 | accuracy = 0.5471153846153847


Epoch[1] Batch[265] Speed: 1.2471810925706892 samples/sec                   batch loss = 729.5506916046143 | accuracy = 0.5471698113207547


Epoch[1] Batch[270] Speed: 1.2435273592509255 samples/sec                   batch loss = 743.6441819667816 | accuracy = 0.5435185185185185


Epoch[1] Batch[275] Speed: 1.2414056316336597 samples/sec                   batch loss = 757.5111515522003 | accuracy = 0.5436363636363636


Epoch[1] Batch[280] Speed: 1.2509445900642413 samples/sec                   batch loss = 771.1391091346741 | accuracy = 0.54375


Epoch[1] Batch[285] Speed: 1.2464096442648398 samples/sec                   batch loss = 784.4929518699646 | accuracy = 0.5456140350877193


Epoch[1] Batch[290] Speed: 1.2573792228496672 samples/sec                   batch loss = 798.238392829895 | accuracy = 0.5448275862068965


Epoch[1] Batch[295] Speed: 1.2497278150656075 samples/sec                   batch loss = 811.4184381961823 | accuracy = 0.5457627118644067


Epoch[1] Batch[300] Speed: 1.2519851697971365 samples/sec                   batch loss = 825.0927972793579 | accuracy = 0.5458333333333333


Epoch[1] Batch[305] Speed: 1.2572572002470868 samples/sec                   batch loss = 838.6960246562958 | accuracy = 0.5467213114754098


Epoch[1] Batch[310] Speed: 1.254287264195236 samples/sec                   batch loss = 852.5009138584137 | accuracy = 0.5451612903225806


Epoch[1] Batch[315] Speed: 1.255081360796796 samples/sec                   batch loss = 866.4874129295349 | accuracy = 0.5452380952380952


Epoch[1] Batch[320] Speed: 1.252705729645586 samples/sec                   batch loss = 880.1505975723267 | accuracy = 0.5453125


Epoch[1] Batch[325] Speed: 1.2541044350488113 samples/sec                   batch loss = 893.7415418624878 | accuracy = 0.5446153846153846


Epoch[1] Batch[330] Speed: 1.2566946009268873 samples/sec                   batch loss = 907.322434425354 | accuracy = 0.5462121212121213


Epoch[1] Batch[335] Speed: 1.2531390441339476 samples/sec                   batch loss = 920.3616938591003 | accuracy = 0.5485074626865671


Epoch[1] Batch[340] Speed: 1.2559762160380041 samples/sec                   batch loss = 934.4873189926147 | accuracy = 0.5485294117647059


Epoch[1] Batch[345] Speed: 1.2493469068580942 samples/sec                   batch loss = 947.5673322677612 | accuracy = 0.5507246376811594


Epoch[1] Batch[350] Speed: 1.2494196644371722 samples/sec                   batch loss = 961.6787619590759 | accuracy = 0.5464285714285714


Epoch[1] Batch[355] Speed: 1.2490287153044808 samples/sec                   batch loss = 974.9409956932068 | accuracy = 0.547887323943662


Epoch[1] Batch[360] Speed: 1.2517551915611433 samples/sec                   batch loss = 988.2868217229843 | accuracy = 0.5479166666666667


Epoch[1] Batch[365] Speed: 1.2545889081321022 samples/sec                   batch loss = 1002.0738252401352 | accuracy = 0.5486301369863014


Epoch[1] Batch[370] Speed: 1.252637732752654 samples/sec                   batch loss = 1015.856432557106 | accuracy = 0.5486486486486486


Epoch[1] Batch[375] Speed: 1.2488647997133833 samples/sec                   batch loss = 1029.553455233574 | accuracy = 0.5486666666666666


Epoch[1] Batch[380] Speed: 1.2478940896161925 samples/sec                   batch loss = 1043.3101960420609 | accuracy = 0.5486842105263158


Epoch[1] Batch[385] Speed: 1.2515969087737329 samples/sec                   batch loss = 1055.6442059278488 | accuracy = 0.551948051948052


Epoch[1] Batch[390] Speed: 1.2574058919466278 samples/sec                   batch loss = 1068.7740021944046 | accuracy = 0.5525641025641026


Epoch[1] Batch[395] Speed: 1.256496577350984 samples/sec                   batch loss = 1082.4915384054184 | accuracy = 0.5518987341772152


Epoch[1] Batch[400] Speed: 1.2523484309904211 samples/sec                   batch loss = 1096.0654083490372 | accuracy = 0.551875


Epoch[1] Batch[405] Speed: 1.2498698891543503 samples/sec                   batch loss = 1109.550622344017 | accuracy = 0.5530864197530864


Epoch[1] Batch[410] Speed: 1.2514660172086483 samples/sec                   batch loss = 1122.500064253807 | accuracy = 0.5560975609756098


Epoch[1] Batch[415] Speed: 1.2380958144088972 samples/sec                   batch loss = 1135.667196393013 | accuracy = 0.5572289156626506


Epoch[1] Batch[420] Speed: 1.240876032259551 samples/sec                   batch loss = 1149.606073975563 | accuracy = 0.555952380952381


Epoch[1] Batch[425] Speed: 1.2527945014850095 samples/sec                   batch loss = 1164.3859215974808 | accuracy = 0.5547058823529412


Epoch[1] Batch[430] Speed: 1.2509937469013566 samples/sec                   batch loss = 1177.4489489793777 | accuracy = 0.5558139534883721


Epoch[1] Batch[435] Speed: 1.2430417195645964 samples/sec                   batch loss = 1190.9284440279007 | accuracy = 0.5557471264367816


Epoch[1] Batch[440] Speed: 1.2371407631564064 samples/sec                   batch loss = 1204.856372475624 | accuracy = 0.5545454545454546


Epoch[1] Batch[445] Speed: 1.2514882351028047 samples/sec                   batch loss = 1217.5938938856125 | accuracy = 0.5550561797752809


Epoch[1] Batch[450] Speed: 1.2514964503222654 samples/sec                   batch loss = 1231.8344568014145 | accuracy = 0.5544444444444444


Epoch[1] Batch[455] Speed: 1.2561831049179044 samples/sec                   batch loss = 1243.8904474973679 | accuracy = 0.5582417582417583


Epoch[1] Batch[460] Speed: 1.253041426314278 samples/sec                   batch loss = 1256.1596423387527 | accuracy = 0.5597826086956522


Epoch[1] Batch[465] Speed: 1.254493127968092 samples/sec                   batch loss = 1269.1208287477493 | accuracy = 0.5612903225806452


Epoch[1] Batch[470] Speed: 1.2635798820066417 samples/sec                   batch loss = 1282.7326644659042 | accuracy = 0.5617021276595745


Epoch[1] Batch[475] Speed: 1.2559292993596243 samples/sec                   batch loss = 1295.8889216184616 | accuracy = 0.5615789473684211


Epoch[1] Batch[480] Speed: 1.2670393145675172 samples/sec                   batch loss = 1308.4765187501907 | accuracy = 0.5630208333333333


Epoch[1] Batch[485] Speed: 1.2564596901212026 samples/sec                   batch loss = 1321.7855595350266 | accuracy = 0.5628865979381443


Epoch[1] Batch[490] Speed: 1.2485567957281853 samples/sec                   batch loss = 1335.5130466222763 | accuracy = 0.5622448979591836


Epoch[1] Batch[495] Speed: 1.2529275424833828 samples/sec                   batch loss = 1349.549853682518 | accuracy = 0.5616161616161616


Epoch[1] Batch[500] Speed: 1.2554869130438524 samples/sec                   batch loss = 1363.211259007454 | accuracy = 0.5605


Epoch[1] Batch[505] Speed: 1.2500127406226782 samples/sec                   batch loss = 1376.437909245491 | accuracy = 0.55990099009901


Epoch[1] Batch[510] Speed: 1.2459927222841485 samples/sec                   batch loss = 1391.2304691076279 | accuracy = 0.557843137254902


Epoch[1] Batch[515] Speed: 1.248504950012446 samples/sec                   batch loss = 1405.0634907484055 | accuracy = 0.5577669902912621


Epoch[1] Batch[520] Speed: 1.2506126260678374 samples/sec                   batch loss = 1417.0014315843582 | accuracy = 0.5591346153846154


Epoch[1] Batch[525] Speed: 1.2579138572906592 samples/sec                   batch loss = 1429.7937088012695 | accuracy = 0.559047619047619


Epoch[1] Batch[530] Speed: 1.2580530820029039 samples/sec                   batch loss = 1442.348168849945 | accuracy = 0.5599056603773584


Epoch[1] Batch[535] Speed: 1.2512600258184192 samples/sec                   batch loss = 1454.7592673301697 | accuracy = 0.5607476635514018


Epoch[1] Batch[540] Speed: 1.2565157746233515 samples/sec                   batch loss = 1467.4020676612854 | accuracy = 0.562037037037037


Epoch[1] Batch[545] Speed: 1.260353421777548 samples/sec                   batch loss = 1480.5033367872238 | accuracy = 0.5628440366972477


Epoch[1] Batch[550] Speed: 1.2529451337035138 samples/sec                   batch loss = 1493.257986664772 | accuracy = 0.5636363636363636


Epoch[1] Batch[555] Speed: 1.257672550872226 samples/sec                   batch loss = 1506.567840218544 | accuracy = 0.563963963963964


Epoch[1] Batch[560] Speed: 1.260800287131413 samples/sec                   batch loss = 1519.679094195366 | accuracy = 0.5633928571428571


Epoch[1] Batch[565] Speed: 1.2572999761313175 samples/sec                   batch loss = 1533.8530036211014 | accuracy = 0.5623893805309734


Epoch[1] Batch[570] Speed: 1.2581853551569975 samples/sec                   batch loss = 1545.694511294365 | accuracy = 0.5640350877192982


Epoch[1] Batch[575] Speed: 1.2475027030612238 samples/sec                   batch loss = 1558.286682009697 | accuracy = 0.5652173913043478


Epoch[1] Batch[580] Speed: 1.247977817716058 samples/sec                   batch loss = 1570.7854036092758 | accuracy = 0.565948275862069


Epoch[1] Batch[585] Speed: 1.256413960491281 samples/sec                   batch loss = 1583.283427119255 | accuracy = 0.5670940170940171


Epoch[1] Batch[590] Speed: 1.254141465365861 samples/sec                   batch loss = 1596.4934606552124 | accuracy = 0.5669491525423729


Epoch[1] Batch[595] Speed: 1.2508224142820803 samples/sec                   batch loss = 1609.8509185314178 | accuracy = 0.5676470588235294


Epoch[1] Batch[600] Speed: 1.2490634936281235 samples/sec                   batch loss = 1620.6334954500198 | accuracy = 0.5691666666666667


Epoch[1] Batch[605] Speed: 1.2445591389426272 samples/sec                   batch loss = 1633.9644051790237 | accuracy = 0.5702479338842975


Epoch[1] Batch[610] Speed: 1.2492359260049641 samples/sec                   batch loss = 1646.9822472333908 | accuracy = 0.5709016393442623


Epoch[1] Batch[615] Speed: 1.2527365972462252 samples/sec                   batch loss = 1659.8778845071793 | accuracy = 0.5711382113821138


Epoch[1] Batch[620] Speed: 1.2445026396661212 samples/sec                   batch loss = 1672.4132169485092 | accuracy = 0.5717741935483871


Epoch[1] Batch[625] Speed: 1.239166279393721 samples/sec                   batch loss = 1686.6076477766037 | accuracy = 0.5704


Epoch[1] Batch[630] Speed: 1.240041509110422 samples/sec                   batch loss = 1699.7905360460281 | accuracy = 0.5706349206349206


Epoch[1] Batch[635] Speed: 1.2585208800606111 samples/sec                   batch loss = 1711.3795405626297 | accuracy = 0.5724409448818898


Epoch[1] Batch[640] Speed: 1.254881592596795 samples/sec                   batch loss = 1722.8731026649475 | accuracy = 0.573828125


Epoch[1] Batch[645] Speed: 1.2642746960678148 samples/sec                   batch loss = 1737.7333455085754 | accuracy = 0.5724806201550388


Epoch[1] Batch[650] Speed: 1.2631395083749957 samples/sec                   batch loss = 1748.5812017917633 | accuracy = 0.5746153846153846


Epoch[1] Batch[655] Speed: 1.2592241276755916 samples/sec                   batch loss = 1761.7097959518433 | accuracy = 0.5751908396946565


Epoch[1] Batch[660] Speed: 1.2588355204790551 samples/sec                   batch loss = 1776.0801618099213 | accuracy = 0.5746212121212121


Epoch[1] Batch[665] Speed: 1.251649291742138 samples/sec                   batch loss = 1788.3076076507568 | accuracy = 0.5763157894736842


Epoch[1] Batch[670] Speed: 1.2527587667343478 samples/sec                   batch loss = 1800.646168231964 | accuracy = 0.5776119402985075


Epoch[1] Batch[675] Speed: 1.2586853573196446 samples/sec                   batch loss = 1813.5210065841675 | accuracy = 0.5781481481481482


Epoch[1] Batch[680] Speed: 1.2575455697930304 samples/sec                   batch loss = 1827.065193414688 | accuracy = 0.5772058823529411


Epoch[1] Batch[685] Speed: 1.2543985816812913 samples/sec                   batch loss = 1839.8045725822449 | accuracy = 0.5777372262773722


Epoch[1] Batch[690] Speed: 1.2570335693397148 samples/sec                   batch loss = 1852.2179284095764 | accuracy = 0.5789855072463768


Epoch[1] Batch[695] Speed: 1.251377246857817 samples/sec                   batch loss = 1867.235164642334 | accuracy = 0.5787769784172662


Epoch[1] Batch[700] Speed: 1.2461309865342785 samples/sec                   batch loss = 1878.5815831422806 | accuracy = 0.58


Epoch[1] Batch[705] Speed: 1.251067069590036 samples/sec                   batch loss = 1891.7489923238754 | accuracy = 0.5801418439716312


Epoch[1] Batch[710] Speed: 1.2516216524102601 samples/sec                   batch loss = 1904.23386657238 | accuracy = 0.5795774647887324


Epoch[1] Batch[715] Speed: 1.2595615307818702 samples/sec                   batch loss = 1916.5081218481064 | accuracy = 0.5793706293706293


Epoch[1] Batch[720] Speed: 1.2605345733802449 samples/sec                   batch loss = 1929.678840994835 | accuracy = 0.5798611111111112


Epoch[1] Batch[725] Speed: 1.2507644124569373 samples/sec                   batch loss = 1941.4491964578629 | accuracy = 0.5813793103448276


Epoch[1] Batch[730] Speed: 1.2537004311038946 samples/sec                   batch loss = 1954.6656388044357 | accuracy = 0.5811643835616438


Epoch[1] Batch[735] Speed: 1.2541352778695778 samples/sec                   batch loss = 1969.458338856697 | accuracy = 0.5802721088435374


Epoch[1] Batch[740] Speed: 1.2615210724016699 samples/sec                   batch loss = 1985.1321095228195 | accuracy = 0.5790540540540541


Epoch[1] Batch[745] Speed: 1.2553526704689661 samples/sec                   batch loss = 1998.8826991319656 | accuracy = 0.5791946308724832


Epoch[1] Batch[750] Speed: 1.2503287315389406 samples/sec                   batch loss = 2011.289453625679 | accuracy = 0.58


Epoch[1] Batch[755] Speed: 1.2483878019862333 samples/sec                   batch loss = 2024.107571721077 | accuracy = 0.5798013245033112


Epoch[1] Batch[760] Speed: 1.2524384611382164 samples/sec                   batch loss = 2037.1211808919907 | accuracy = 0.5799342105263158


Epoch[1] Batch[765] Speed: 1.2503094433180104 samples/sec                   batch loss = 2052.4107764959335 | accuracy = 0.5803921568627451


Epoch[1] Batch[770] Speed: 1.2539193162599112 samples/sec                   batch loss = 2065.283624768257 | accuracy = 0.5805194805194805


Epoch[1] Batch[775] Speed: 1.2576871642971672 samples/sec                   batch loss = 2077.0959399938583 | accuracy = 0.5816129032258065


Epoch[1] Batch[780] Speed: 1.256636523857211 samples/sec                   batch loss = 2090.142548441887 | accuracy = 0.5820512820512821


Epoch[1] Batch[785] Speed: 1.2576603889903921 samples/sec                   batch loss = 2102.0085450410843 | accuracy = 0.5821656050955414


[Epoch 1] training: accuracy=0.5815355329949239
[Epoch 1] time cost: 647.5373051166534
[Epoch 1] validation: validation accuracy=0.6811111111111111


Epoch[2] Batch[5] Speed: 1.2466758278115924 samples/sec                   batch loss = 12.125620365142822 | accuracy = 0.7


Epoch[2] Batch[10] Speed: 1.2537952469687759 samples/sec                   batch loss = 24.914395570755005 | accuracy = 0.625


Epoch[2] Batch[15] Speed: 1.23841193300866 samples/sec                   batch loss = 37.31949806213379 | accuracy = 0.6


Epoch[2] Batch[20] Speed: 1.2430735863859819 samples/sec                   batch loss = 49.903637647628784 | accuracy = 0.625


Epoch[2] Batch[25] Speed: 1.2485737069108265 samples/sec                   batch loss = 62.446730971336365 | accuracy = 0.62


Epoch[2] Batch[30] Speed: 1.2524078887020333 samples/sec                   batch loss = 75.13451445102692 | accuracy = 0.6333333333333333


Epoch[2] Batch[35] Speed: 1.2517366998252875 samples/sec                   batch loss = 87.01233541965485 | accuracy = 0.6285714285714286


Epoch[2] Batch[40] Speed: 1.2519924572473997 samples/sec                   batch loss = 98.58435952663422 | accuracy = 0.6375


Epoch[2] Batch[45] Speed: 1.252301878421169 samples/sec                   batch loss = 111.16254508495331 | accuracy = 0.6333333333333333


Epoch[2] Batch[50] Speed: 1.258849877575668 samples/sec                   batch loss = 122.58353984355927 | accuracy = 0.645


Epoch[2] Batch[55] Speed: 1.2579994070375375 samples/sec                   batch loss = 134.7803134918213 | accuracy = 0.6545454545454545


Epoch[2] Batch[60] Speed: 1.2610154978816235 samples/sec                   batch loss = 148.39240503311157 | accuracy = 0.6541666666666667


Epoch[2] Batch[65] Speed: 1.2569587922562386 samples/sec                   batch loss = 163.4282829761505 | accuracy = 0.6384615384615384


Epoch[2] Batch[70] Speed: 1.2480856040159616 samples/sec                   batch loss = 176.6856973171234 | accuracy = 0.6357142857142857


Epoch[2] Batch[75] Speed: 1.2606831887652836 samples/sec                   batch loss = 189.72855043411255 | accuracy = 0.6333333333333333


Epoch[2] Batch[80] Speed: 1.2505009542055854 samples/sec                   batch loss = 202.49244046211243 | accuracy = 0.640625


Epoch[2] Batch[85] Speed: 1.2443670438708463 samples/sec                   batch loss = 215.58602452278137 | accuracy = 0.638235294117647


Epoch[2] Batch[90] Speed: 1.241822154585882 samples/sec                   batch loss = 227.56260240077972 | accuracy = 0.6472222222222223


Epoch[2] Batch[95] Speed: 1.2427369490041456 samples/sec                   batch loss = 240.88402044773102 | accuracy = 0.6421052631578947


Epoch[2] Batch[100] Speed: 1.2496896486329419 samples/sec                   batch loss = 254.82741796970367 | accuracy = 0.64


Epoch[2] Batch[105] Speed: 1.2480818901386657 samples/sec                   batch loss = 267.9483822584152 | accuracy = 0.6357142857142857


Epoch[2] Batch[110] Speed: 1.256129401300362 samples/sec                   batch loss = 281.3328741788864 | accuracy = 0.6386363636363637


Epoch[2] Batch[115] Speed: 1.2545978208401742 samples/sec                   batch loss = 293.92617201805115 | accuracy = 0.6347826086956522


Epoch[2] Batch[120] Speed: 1.2601202645640146 samples/sec                   batch loss = 306.26461267471313 | accuracy = 0.6354166666666666


Epoch[2] Batch[125] Speed: 1.252253646879617 samples/sec                   batch loss = 319.0681788921356 | accuracy = 0.638


Epoch[2] Batch[130] Speed: 1.2453489114378817 samples/sec                   batch loss = 331.4164457321167 | accuracy = 0.6403846153846153


Epoch[2] Batch[135] Speed: 1.2552332947897304 samples/sec                   batch loss = 345.34391951560974 | accuracy = 0.6351851851851852


Epoch[2] Batch[140] Speed: 1.2493251370557634 samples/sec                   batch loss = 359.0312170982361 | accuracy = 0.6357142857142857


Epoch[2] Batch[145] Speed: 1.2521940168929449 samples/sec                   batch loss = 370.04884254932404 | accuracy = 0.6379310344827587


Epoch[2] Batch[150] Speed: 1.2494194783455983 samples/sec                   batch loss = 381.8090569972992 | accuracy = 0.6416666666666667


Epoch[2] Batch[155] Speed: 1.2537792247040576 samples/sec                   batch loss = 394.7558219432831 | accuracy = 0.6435483870967742


Epoch[2] Batch[160] Speed: 1.2470373117798428 samples/sec                   batch loss = 407.04191291332245 | accuracy = 0.64375


Epoch[2] Batch[165] Speed: 1.2526875839479155 samples/sec                   batch loss = 420.13886272907257 | accuracy = 0.6439393939393939


Epoch[2] Batch[170] Speed: 1.2590924864583821 samples/sec                   batch loss = 432.0245887041092 | accuracy = 0.6455882352941177


Epoch[2] Batch[175] Speed: 1.2650507774453934 samples/sec                   batch loss = 444.7587698698044 | accuracy = 0.6485714285714286


Epoch[2] Batch[180] Speed: 1.25437438462465 samples/sec                   batch loss = 457.5024884939194 | accuracy = 0.6486111111111111


Epoch[2] Batch[185] Speed: 1.2546196809656223 samples/sec                   batch loss = 471.3016333580017 | accuracy = 0.6418918918918919


Epoch[2] Batch[190] Speed: 1.2525868568246776 samples/sec                   batch loss = 482.3207937479019 | accuracy = 0.6460526315789473


Epoch[2] Batch[195] Speed: 1.2475155969043987 samples/sec                   batch loss = 492.5149738788605 | accuracy = 0.6512820512820513


Epoch[2] Batch[200] Speed: 1.252803201615085 samples/sec                   batch loss = 506.23302364349365 | accuracy = 0.6475


Epoch[2] Batch[205] Speed: 1.2568529517459932 samples/sec                   batch loss = 516.2491763830185 | accuracy = 0.6536585365853659


Epoch[2] Batch[210] Speed: 1.2401422453004818 samples/sec                   batch loss = 530.1074289083481 | accuracy = 0.6488095238095238


Epoch[2] Batch[215] Speed: 1.2462151261842493 samples/sec                   batch loss = 543.201543211937 | accuracy = 0.6441860465116279


Epoch[2] Batch[220] Speed: 1.2529153786306784 samples/sec                   batch loss = 554.9040203094482 | accuracy = 0.6443181818181818


Epoch[2] Batch[225] Speed: 1.252774108105315 samples/sec                   batch loss = 565.575471162796 | accuracy = 0.6466666666666666


Epoch[2] Batch[230] Speed: 1.2525782532159946 samples/sec                   batch loss = 578.4780812263489 | accuracy = 0.6434782608695652


Epoch[2] Batch[235] Speed: 1.258601980293984 samples/sec                   batch loss = 591.2884585857391 | accuracy = 0.6457446808510638


Epoch[2] Batch[240] Speed: 1.2585147436813464 samples/sec                   batch loss = 603.4299250841141 | accuracy = 0.6510416666666666


Epoch[2] Batch[245] Speed: 1.2592529544439566 samples/sec                   batch loss = 617.4835910797119 | accuracy = 0.65


Epoch[2] Batch[250] Speed: 1.2538453777518364 samples/sec                   batch loss = 629.9840438365936 | accuracy = 0.65


Epoch[2] Batch[255] Speed: 1.2583493671994987 samples/sec                   batch loss = 641.0587395429611 | accuracy = 0.6529411764705882


Epoch[2] Batch[260] Speed: 1.2577770207539158 samples/sec                   batch loss = 652.5652990341187 | accuracy = 0.6538461538461539


Epoch[2] Batch[265] Speed: 1.2541686535730563 samples/sec                   batch loss = 667.0240437984467 | accuracy = 0.6528301886792452


Epoch[2] Batch[270] Speed: 1.2523938651480975 samples/sec                   batch loss = 677.90849006176 | accuracy = 0.6555555555555556


Epoch[2] Batch[275] Speed: 1.2470062609656092 samples/sec                   batch loss = 689.5184632539749 | accuracy = 0.6554545454545454


Epoch[2] Batch[280] Speed: 1.2398947880377744 samples/sec                   batch loss = 700.7945252656937 | accuracy = 0.65625


Epoch[2] Batch[285] Speed: 1.2351459543545935 samples/sec                   batch loss = 712.0632051229477 | accuracy = 0.6578947368421053


Epoch[2] Batch[290] Speed: 1.250235464066719 samples/sec                   batch loss = 723.5788526535034 | accuracy = 0.6603448275862069


Epoch[2] Batch[295] Speed: 1.2527917885659374 samples/sec                   batch loss = 736.6539227962494 | accuracy = 0.6601694915254237


Epoch[2] Batch[300] Speed: 1.2553275912394122 samples/sec                   batch loss = 749.4176051616669 | accuracy = 0.6583333333333333


Epoch[2] Batch[305] Speed: 1.2520293629456318 samples/sec                   batch loss = 762.8549973964691 | accuracy = 0.6581967213114754


Epoch[2] Batch[310] Speed: 1.2516393003562916 samples/sec                   batch loss = 773.959896504879 | accuracy = 0.657258064516129


Epoch[2] Batch[315] Speed: 1.258714631602071 samples/sec                   batch loss = 784.0912482142448 | accuracy = 0.6595238095238095


Epoch[2] Batch[320] Speed: 1.2493168572873619 samples/sec                   batch loss = 797.7584719061852 | accuracy = 0.65859375


Epoch[2] Batch[325] Speed: 1.2430778231384654 samples/sec                   batch loss = 811.1397570967674 | accuracy = 0.6569230769230769


Epoch[2] Batch[330] Speed: 1.24844679130535 samples/sec                   batch loss = 823.461876809597 | accuracy = 0.6568181818181819


Epoch[2] Batch[335] Speed: 1.2517140061688852 samples/sec                   batch loss = 831.9852893948555 | accuracy = 0.6611940298507463


Epoch[2] Batch[340] Speed: 1.2450990025384048 samples/sec                   batch loss = 842.5456736683846 | accuracy = 0.6625


Epoch[2] Batch[345] Speed: 1.2498574121542654 samples/sec                   batch loss = 853.7333583235741 | accuracy = 0.6623188405797101


Epoch[2] Batch[350] Speed: 1.2419813763674057 samples/sec                   batch loss = 867.9154986739159 | accuracy = 0.6614285714285715


Epoch[2] Batch[355] Speed: 1.2453291294892546 samples/sec                   batch loss = 879.665889441967 | accuracy = 0.6612676056338028


Epoch[2] Batch[360] Speed: 1.2565351606995434 samples/sec                   batch loss = 892.1139308810234 | accuracy = 0.6611111111111111


Epoch[2] Batch[365] Speed: 1.2548539041856546 samples/sec                   batch loss = 903.9974283576012 | accuracy = 0.6616438356164384


Epoch[2] Batch[370] Speed: 1.2505874562109947 samples/sec                   batch loss = 916.41402810812 | accuracy = 0.6614864864864864


Epoch[2] Batch[375] Speed: 1.2513353396805507 samples/sec                   batch loss = 928.6880374550819 | accuracy = 0.6606666666666666


Epoch[2] Batch[380] Speed: 1.2461133084703122 samples/sec                   batch loss = 940.3468454480171 | accuracy = 0.6605263157894737


Epoch[2] Batch[385] Speed: 1.2523173955597011 samples/sec                   batch loss = 951.1371210217476 | accuracy = 0.6597402597402597


Epoch[2] Batch[390] Speed: 1.2569264920612444 samples/sec                   batch loss = 962.9989177584648 | accuracy = 0.6602564102564102


Epoch[2] Batch[395] Speed: 1.2524309815052135 samples/sec                   batch loss = 974.4781041741371 | accuracy = 0.660126582278481


Epoch[2] Batch[400] Speed: 1.238708732359592 samples/sec                   batch loss = 985.3924036622047 | accuracy = 0.66


Epoch[2] Batch[405] Speed: 1.241936602606501 samples/sec                   batch loss = 999.9812706112862 | accuracy = 0.6567901234567901


Epoch[2] Batch[410] Speed: 1.2554935836475583 samples/sec                   batch loss = 1012.5551176667213 | accuracy = 0.6573170731707317


Epoch[2] Batch[415] Speed: 1.2566731390829453 samples/sec                   batch loss = 1023.326212823391 | accuracy = 0.658433734939759


Epoch[2] Batch[420] Speed: 1.2583899521514952 samples/sec                   batch loss = 1035.4445578455925 | accuracy = 0.6577380952380952


Epoch[2] Batch[425] Speed: 1.2555943089837338 samples/sec                   batch loss = 1046.309073984623 | accuracy = 0.6588235294117647


Epoch[2] Batch[430] Speed: 1.2563954249433031 samples/sec                   batch loss = 1056.880069077015 | accuracy = 0.661046511627907


Epoch[2] Batch[435] Speed: 1.255608404302946 samples/sec                   batch loss = 1067.6521474719048 | accuracy = 0.6626436781609195


Epoch[2] Batch[440] Speed: 1.2577645739728776 samples/sec                   batch loss = 1078.832138478756 | accuracy = 0.6630681818181818


Epoch[2] Batch[445] Speed: 1.2570714321625869 samples/sec                   batch loss = 1092.2032950520515 | accuracy = 0.6623595505617977


Epoch[2] Batch[450] Speed: 1.249517649349212 samples/sec                   batch loss = 1103.3971933722496 | accuracy = 0.6627777777777778


Epoch[2] Batch[455] Speed: 1.2517652781928017 samples/sec                   batch loss = 1113.8265466094017 | accuracy = 0.6637362637362637


Epoch[2] Batch[460] Speed: 1.2546684702589743 samples/sec                   batch loss = 1123.9760112166405 | accuracy = 0.6657608695652174


Epoch[2] Batch[465] Speed: 1.2437442736940538 samples/sec                   batch loss = 1135.7795647978783 | accuracy = 0.6655913978494624


Epoch[2] Batch[470] Speed: 1.236503969150961 samples/sec                   batch loss = 1147.6889809966087 | accuracy = 0.6654255319148936


Epoch[2] Batch[475] Speed: 1.2411009282356702 samples/sec                   batch loss = 1160.3144057393074 | accuracy = 0.6668421052631579


Epoch[2] Batch[480] Speed: 1.2387859271536503 samples/sec                   batch loss = 1172.6380153298378 | accuracy = 0.6666666666666666


Epoch[2] Batch[485] Speed: 1.2504088725492242 samples/sec                   batch loss = 1185.6184282898903 | accuracy = 0.6664948453608247


Epoch[2] Batch[490] Speed: 1.2592946374494858 samples/sec                   batch loss = 1197.4536600708961 | accuracy = 0.6668367346938775


Epoch[2] Batch[495] Speed: 1.2450818157462937 samples/sec                   batch loss = 1208.2976155877113 | accuracy = 0.6686868686868687


Epoch[2] Batch[500] Speed: 1.2533264605581143 samples/sec                   batch loss = 1219.3853451609612 | accuracy = 0.669


Epoch[2] Batch[505] Speed: 1.2535870831308744 samples/sec                   batch loss = 1230.4509019255638 | accuracy = 0.6688118811881189


Epoch[2] Batch[510] Speed: 1.2473085855389008 samples/sec                   batch loss = 1244.7078284621239 | accuracy = 0.6661764705882353


Epoch[2] Batch[515] Speed: 1.2557267234921163 samples/sec                   batch loss = 1255.2949258685112 | accuracy = 0.6684466019417475


Epoch[2] Batch[520] Speed: 1.2514328785382807 samples/sec                   batch loss = 1265.6871793866158 | accuracy = 0.6682692307692307


Epoch[2] Batch[525] Speed: 1.246015579116106 samples/sec                   batch loss = 1277.8047215342522 | accuracy = 0.6680952380952381


Epoch[2] Batch[530] Speed: 1.2535639476751197 samples/sec                   batch loss = 1288.0017611384392 | accuracy = 0.6688679245283019


Epoch[2] Batch[535] Speed: 1.2480201501666472 samples/sec                   batch loss = 1299.5758679509163 | accuracy = 0.6686915887850468


Epoch[2] Batch[540] Speed: 1.2443100083155654 samples/sec                   batch loss = 1310.1018899083138 | accuracy = 0.6689814814814815


Epoch[2] Batch[545] Speed: 1.2430210898925245 samples/sec                   batch loss = 1322.180213034153 | accuracy = 0.6692660550458716


Epoch[2] Batch[550] Speed: 1.248419479022068 samples/sec                   batch loss = 1337.099966108799 | accuracy = 0.6668181818181819


Epoch[2] Batch[555] Speed: 1.251258346058684 samples/sec                   batch loss = 1348.9825343489647 | accuracy = 0.6675675675675675


Epoch[2] Batch[560] Speed: 1.2437852129134632 samples/sec                   batch loss = 1360.4151157736778 | accuracy = 0.6683035714285714


Epoch[2] Batch[565] Speed: 1.2483368991493382 samples/sec                   batch loss = 1372.545677125454 | accuracy = 0.6690265486725664


Epoch[2] Batch[570] Speed: 1.2379208716291035 samples/sec                   batch loss = 1383.6700736880302 | accuracy = 0.6692982456140351


Epoch[2] Batch[575] Speed: 1.238633467553197 samples/sec                   batch loss = 1395.6267630457878 | accuracy = 0.668695652173913


Epoch[2] Batch[580] Speed: 1.2473217535704004 samples/sec                   batch loss = 1411.345169723034 | accuracy = 0.6668103448275862


Epoch[2] Batch[585] Speed: 1.2425962157185027 samples/sec                   batch loss = 1420.928703725338 | accuracy = 0.6679487179487179


Epoch[2] Batch[590] Speed: 1.237932106676985 samples/sec                   batch loss = 1430.6198554635048 | accuracy = 0.6686440677966101


Epoch[2] Batch[595] Speed: 1.2424162266638032 samples/sec                   batch loss = 1440.4096861481667 | accuracy = 0.6697478991596638


Epoch[2] Batch[600] Speed: 1.255691197199359 samples/sec                   batch loss = 1451.383668601513 | accuracy = 0.6708333333333333


Epoch[2] Batch[605] Speed: 1.259597370983776 samples/sec                   batch loss = 1460.4727203249931 | accuracy = 0.6714876033057852


Epoch[2] Batch[610] Speed: 1.2563595785569581 samples/sec                   batch loss = 1472.933660686016 | accuracy = 0.6709016393442623


Epoch[2] Batch[615] Speed: 1.2530109180156632 samples/sec                   batch loss = 1483.9640145897865 | accuracy = 0.6707317073170732


Epoch[2] Batch[620] Speed: 1.252120188258179 samples/sec                   batch loss = 1492.5863474011421 | accuracy = 0.6725806451612903


Epoch[2] Batch[625] Speed: 1.2522731820757282 samples/sec                   batch loss = 1502.3008478283882 | accuracy = 0.6732


Epoch[2] Batch[630] Speed: 1.2603834365045976 samples/sec                   batch loss = 1514.5112366080284 | accuracy = 0.6726190476190477


Epoch[2] Batch[635] Speed: 1.2570811337006997 samples/sec                   batch loss = 1526.871205151081 | accuracy = 0.6728346456692913


Epoch[2] Batch[640] Speed: 1.250590346031963 samples/sec                   batch loss = 1536.0642256736755 | accuracy = 0.673046875


Epoch[2] Batch[645] Speed: 1.261893780554104 samples/sec                   batch loss = 1550.9029293060303 | accuracy = 0.6728682170542636


Epoch[2] Batch[650] Speed: 1.256158462757968 samples/sec                   batch loss = 1559.9759432077408 | accuracy = 0.6746153846153846


Epoch[2] Batch[655] Speed: 1.2487043658837549 samples/sec                   batch loss = 1572.5323004722595 | accuracy = 0.6748091603053435


Epoch[2] Batch[660] Speed: 1.2422359218283296 samples/sec                   batch loss = 1586.803660273552 | accuracy = 0.6734848484848485


Epoch[2] Batch[665] Speed: 1.2464892836512298 samples/sec                   batch loss = 1595.9703385829926 | accuracy = 0.675187969924812


Epoch[2] Batch[670] Speed: 1.2453490038783694 samples/sec                   batch loss = 1607.0226695537567 | accuracy = 0.6757462686567164


Epoch[2] Batch[675] Speed: 1.2385871060585893 samples/sec                   batch loss = 1616.729057610035 | accuracy = 0.677037037037037


Epoch[2] Batch[680] Speed: 1.243497773302683 samples/sec                   batch loss = 1628.6591383814812 | accuracy = 0.6775735294117647


Epoch[2] Batch[685] Speed: 1.2458837243134009 samples/sec                   batch loss = 1640.9776132702827 | accuracy = 0.6773722627737226


Epoch[2] Batch[690] Speed: 1.247269824919085 samples/sec                   batch loss = 1652.6755152344704 | accuracy = 0.677536231884058


Epoch[2] Batch[695] Speed: 1.2546656553889055 samples/sec                   batch loss = 1661.8684812486172 | accuracy = 0.6784172661870503


Epoch[2] Batch[700] Speed: 1.2477187796921565 samples/sec                   batch loss = 1670.9078830182552 | accuracy = 0.6796428571428571


Epoch[2] Batch[705] Speed: 1.2551748831794143 samples/sec                   batch loss = 1681.9700587689877 | accuracy = 0.6794326241134752


Epoch[2] Batch[710] Speed: 1.2563147028936725 samples/sec                   batch loss = 1695.017477363348 | accuracy = 0.6788732394366197


Epoch[2] Batch[715] Speed: 1.2497873965960808 samples/sec                   batch loss = 1708.6853183209896 | accuracy = 0.679020979020979


Epoch[2] Batch[720] Speed: 1.2510345117627943 samples/sec                   batch loss = 1717.0721316039562 | accuracy = 0.6798611111111111


Epoch[2] Batch[725] Speed: 1.2506110412691858 samples/sec                   batch loss = 1726.703076094389 | accuracy = 0.6803448275862068


Epoch[2] Batch[730] Speed: 1.2459277653399492 samples/sec                   batch loss = 1738.3548119962215 | accuracy = 0.6804794520547945


Epoch[2] Batch[735] Speed: 1.247238484361212 samples/sec                   batch loss = 1750.1223059594631 | accuracy = 0.6799319727891157


Epoch[2] Batch[740] Speed: 1.2427079529470324 samples/sec                   batch loss = 1759.5681428015232 | accuracy = 0.6810810810810811


Epoch[2] Batch[745] Speed: 1.2400646980662056 samples/sec                   batch loss = 1768.0116999447346 | accuracy = 0.6815436241610738


Epoch[2] Batch[750] Speed: 1.244497562367982 samples/sec                   batch loss = 1780.718396216631 | accuracy = 0.681


Epoch[2] Batch[755] Speed: 1.2500631282320853 samples/sec                   batch loss = 1793.6740590631962 | accuracy = 0.6804635761589404


Epoch[2] Batch[760] Speed: 1.2532671964578794 samples/sec                   batch loss = 1803.8562882244587 | accuracy = 0.68125


Epoch[2] Batch[765] Speed: 1.2487544621723894 samples/sec                   batch loss = 1814.8998611271381 | accuracy = 0.6820261437908497


Epoch[2] Batch[770] Speed: 1.2507045512624968 samples/sec                   batch loss = 1825.0915204584599 | accuracy = 0.6827922077922078


Epoch[2] Batch[775] Speed: 1.2461288577356147 samples/sec                   batch loss = 1837.3722907602787 | accuracy = 0.6832258064516129


Epoch[2] Batch[780] Speed: 1.2539091948843675 samples/sec                   batch loss = 1853.3851985037327 | accuracy = 0.6823717948717949


Epoch[2] Batch[785] Speed: 1.2533380706118455 samples/sec                   batch loss = 1869.1285254061222 | accuracy = 0.6821656050955414


[Epoch 2] training: accuracy=0.6817893401015228
[Epoch 2] time cost: 646.3719055652618
[Epoch 2] validation: validation accuracy=0.7388888888888889
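
If you want to compare epochs without scrolling through the per-batch lines, the `[Epoch N]` summary lines in the output above can be collected with a few lines of plain Python. The following is a minimal sketch, not part of MXNet: the `parse_epoch_summaries` helper and the `SUMMARY` pattern are hypothetical names introduced here for illustration, and the sketch assumes the log text has exactly the format shown above.

```python
import re

# Summary lines copied from the training output above.
log = """\
[Epoch 1] training: accuracy=0.5815355329949239
[Epoch 1] time cost: 647.5373051166534
[Epoch 1] validation: validation accuracy=0.6811111111111111
[Epoch 2] training: accuracy=0.6817893401015228
[Epoch 2] time cost: 646.3719055652618
[Epoch 2] validation: validation accuracy=0.7388888888888889
"""

# Hypothetical pattern: "[Epoch N] <label>: ... <number at end of line>".
# Per-batch lines start with "Epoch[" rather than "[Epoch", so they are skipped.
SUMMARY = re.compile(
    r"\[Epoch (\d+)\] (training|time cost|validation):.*?([0-9.]+)$"
)


def parse_epoch_summaries(text):
    """Return {epoch: {"training": acc, "time cost": secs, "validation": acc}}."""
    results = {}
    for line in text.splitlines():
        match = SUMMARY.match(line.strip())
        if match:
            epoch, label, value = match.groups()
            results.setdefault(int(epoch), {})[label] = float(value)
    return results


summaries = parse_epoch_summaries(log)
for epoch, stats in sorted(summaries.items()):
    print(
        f"epoch {epoch}: "
        f"train acc {stats['training']:.3f}, "
        f"val acc {stats['validation']:.3f}, "
        f"time {stats['time cost']:.1f} s"
    )
```

Keeping the result as a dictionary keyed by epoch number makes it straightforward to check, for example, whether validation accuracy is still improving from one epoch to the next.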


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).