<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. Additionally, you will need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[03:32:11] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to `npx.cpu()` rather than failing, which is why the output below shows an array without a GPU context:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[03:32:11] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
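
As a rough illustration of the indexing scheme above, the following sketch picks which device indices to use given how many GPUs are visible. The helper name `choose_gpu_ids` and its arguments are hypothetical and not part of MXNet; in a real script you would presumably obtain the count from `npx.num_gpus()` and turn each index `i` into `npx.gpu(i)`.

```python
def choose_gpu_ids(num_available, num_requested):
    """Return the 0-based GPU indices to use (hypothetical helper, not an MXNet API).

    An empty list means no GPU is available and the caller should fall back to the CPU.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # Never ask for more GPUs than the machine exposes.
    count = min(num_available, num_requested)
    return list(range(count))


# On a machine with eight GPUs, the last valid index is 7.
print(choose_gpu_ids(num_available=8, num_requested=8)[-1])  # 7
# Requesting two GPUs when only one is visible yields just index 0.
print(choose_gpu_ids(num_available=1, num_requested=2))  # [0]
```

With MXNet installed, `[npx.gpu(i) for i in choose_gpu_ids(npx.num_gpus(), 2)]` would give a device list of at most two GPUs.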

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that all inputs of an operation must be on the same device; otherwise, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you can use `net.collect_params().reset_ctx(gpu)` to move parameters that are already loaded to a different device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[03:32:12] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[3.3484986 , 0.32047993]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
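
To make the idea concrete, here is a minimal, framework-free sketch in plain Python. It is not how MXNet implements data parallelism; the toy one-parameter model, the squared-error loss, and the helper names `split_batch` and `grad` are all assumptions chosen for illustration. It splits a batch into *n* parts, computes a gradient on each part as a separate device would, and checks that the per-part gradients add up to the full-batch gradient.

```python
import math


def split_batch(batch, n):
    """Split a list into n contiguous parts of equal size (hypothetical helper)."""
    if n <= 0 or len(batch) % n != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    size = len(batch) // n
    return [batch[i * size:(i + 1) * size] for i in range(n)]


def grad(w, samples):
    """Gradient of sum((w * x - y) ** 2) with respect to w over (x, y) samples."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


w = 0.5
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 9.0)]

# Each part stands in for the slice of the batch handled by one GPU.
parts = split_batch(batch, 2)
per_device_grads = [grad(w, part) for part in parts]

# Summing the per-device gradients recovers the full-batch gradient.
full_grad = grad(w, batch)
print(math.isclose(sum(per_device_grads), full_grad))  # True
```

Because the loss is a sum over samples, the gradient is too, which is why the work can be divided across devices and combined afterwards.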

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are marked with `Diff` comments in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7671578190720652 samples/sec                   batch loss = 14.46235728263855 | accuracy = 0.5


Epoch[1] Batch[10] Speed: 1.2425536062363731 samples/sec                   batch loss = 29.16131567955017 | accuracy = 0.475


Epoch[1] Batch[15] Speed: 1.249653159979615 samples/sec                   batch loss = 43.76539731025696 | accuracy = 0.48333333333333334


Epoch[1] Batch[20] Speed: 1.2558585078808466 samples/sec                   batch loss = 58.26655411720276 | accuracy = 0.5


Epoch[1] Batch[25] Speed: 1.2517594876991984 samples/sec                   batch loss = 72.30762887001038 | accuracy = 0.49


Epoch[1] Batch[30] Speed: 1.2519284613842836 samples/sec                   batch loss = 87.11149001121521 | accuracy = 0.4666666666666667


Epoch[1] Batch[35] Speed: 1.2452098040547286 samples/sec                   batch loss = 101.10451221466064 | accuracy = 0.45714285714285713


Epoch[1] Batch[40] Speed: 1.2552668228802935 samples/sec                   batch loss = 114.52256536483765 | accuracy = 0.4875


Epoch[1] Batch[45] Speed: 1.254458140429012 samples/sec                   batch loss = 128.31622648239136 | accuracy = 0.5


Epoch[1] Batch[50] Speed: 1.2570501458172882 samples/sec                   batch loss = 142.01433205604553 | accuracy = 0.505


Epoch[1] Batch[55] Speed: 1.2562927836617097 samples/sec                   batch loss = 155.5735628604889 | accuracy = 0.5136363636363637


Epoch[1] Batch[60] Speed: 1.2553648816925307 samples/sec                   batch loss = 169.52203011512756 | accuracy = 0.5166666666666667


Epoch[1] Batch[65] Speed: 1.2517259599264836 samples/sec                   batch loss = 182.96009516716003 | accuracy = 0.5307692307692308


Epoch[1] Batch[70] Speed: 1.2508276365753188 samples/sec                   batch loss = 196.82732462882996 | accuracy = 0.5321428571428571


Epoch[1] Batch[75] Speed: 1.2573280552764567 samples/sec                   batch loss = 210.68321466445923 | accuracy = 0.54


Epoch[1] Batch[80] Speed: 1.2529611346571967 samples/sec                   batch loss = 224.6830871105194 | accuracy = 0.534375


Epoch[1] Batch[85] Speed: 1.2544365673159514 samples/sec                   batch loss = 238.88994431495667 | accuracy = 0.5264705882352941


Epoch[1] Batch[90] Speed: 1.2460266839525238 samples/sec                   batch loss = 252.96554350852966 | accuracy = 0.5194444444444445


Epoch[1] Batch[95] Speed: 1.25062344009526 samples/sec                   batch loss = 266.63367199897766 | accuracy = 0.5210526315789473


Epoch[1] Batch[100] Speed: 1.2480687988974704 samples/sec                   batch loss = 280.04082107543945 | accuracy = 0.5225


Epoch[1] Batch[105] Speed: 1.2599845562443646 samples/sec                   batch loss = 293.6716802120209 | accuracy = 0.5238095238095238


Epoch[1] Batch[110] Speed: 1.2524948417466228 samples/sec                   batch loss = 307.88012528419495 | accuracy = 0.5136363636363637


Epoch[1] Batch[115] Speed: 1.2536865659540715 samples/sec                   batch loss = 321.56142687797546 | accuracy = 0.5130434782608696


Epoch[1] Batch[120] Speed: 1.2438855438368552 samples/sec                   batch loss = 335.1376919746399 | accuracy = 0.5104166666666666


Epoch[1] Batch[125] Speed: 1.2574019339172853 samples/sec                   batch loss = 349.34788942337036 | accuracy = 0.502


Epoch[1] Batch[130] Speed: 1.261975790716154 samples/sec                   batch loss = 363.0167558193207 | accuracy = 0.5057692307692307


Epoch[1] Batch[135] Speed: 1.2543160529229778 samples/sec                   batch loss = 376.66442036628723 | accuracy = 0.512962962962963


Epoch[1] Batch[140] Speed: 1.2562562846636947 samples/sec                   batch loss = 390.359028339386 | accuracy = 0.5142857142857142


Epoch[1] Batch[145] Speed: 1.2645094886358412 samples/sec                   batch loss = 403.77488374710083 | accuracy = 0.5172413793103449


Epoch[1] Batch[150] Speed: 1.2510314333166774 samples/sec                   batch loss = 417.3549253940582 | accuracy = 0.5166666666666667


Epoch[1] Batch[155] Speed: 1.2506589598978017 samples/sec                   batch loss = 431.024578332901 | accuracy = 0.5225806451612903


Epoch[1] Batch[160] Speed: 1.2551772308080882 samples/sec                   batch loss = 444.94727087020874 | accuracy = 0.5203125


Epoch[1] Batch[165] Speed: 1.2540349739475476 samples/sec                   batch loss = 458.7099702358246 | accuracy = 0.5196969696969697


Epoch[1] Batch[170] Speed: 1.2590865334860397 samples/sec                   batch loss = 471.82191252708435 | accuracy = 0.5220588235294118


Epoch[1] Batch[175] Speed: 1.2503644209758917 samples/sec                   batch loss = 485.48092889785767 | accuracy = 0.5214285714285715


Epoch[1] Batch[180] Speed: 1.2473486468928356 samples/sec                   batch loss = 498.93238735198975 | accuracy = 0.5236111111111111


Epoch[1] Batch[185] Speed: 1.2478656877353198 samples/sec                   batch loss = 512.4557869434357 | accuracy = 0.5256756756756756


Epoch[1] Batch[190] Speed: 1.2488133931930312 samples/sec                   batch loss = 527.337788105011 | accuracy = 0.5210526315789473


Epoch[1] Batch[195] Speed: 1.2524216320895898 samples/sec                   batch loss = 541.0449812412262 | accuracy = 0.5217948717948718


Epoch[1] Batch[200] Speed: 1.247513370610281 samples/sec                   batch loss = 554.4718651771545 | accuracy = 0.525


Epoch[1] Batch[205] Speed: 1.254966918130564 samples/sec                   batch loss = 568.4333961009979 | accuracy = 0.524390243902439


Epoch[1] Batch[210] Speed: 1.2438047614193442 samples/sec                   batch loss = 581.0524516105652 | accuracy = 0.5321428571428571


Epoch[1] Batch[215] Speed: 1.2509419784238587 samples/sec                   batch loss = 594.7258048057556 | accuracy = 0.5313953488372093


Epoch[1] Batch[220] Speed: 1.2522501885591173 samples/sec                   batch loss = 608.787145614624 | accuracy = 0.5284090909090909


Epoch[1] Batch[225] Speed: 1.2534893018452586 samples/sec                   batch loss = 622.0358498096466 | accuracy = 0.5322222222222223


Epoch[1] Batch[230] Speed: 1.2549197952241407 samples/sec                   batch loss = 635.5550048351288 | accuracy = 0.5369565217391304


Epoch[1] Batch[235] Speed: 1.2573418126308278 samples/sec                   batch loss = 649.2682483196259 | accuracy = 0.5351063829787234


Epoch[1] Batch[240] Speed: 1.2489018001537193 samples/sec                   batch loss = 663.2942705154419 | accuracy = 0.5333333333333333


Epoch[1] Batch[245] Speed: 1.247810836781869 samples/sec                   batch loss = 676.7439565658569 | accuracy = 0.5377551020408163


Epoch[1] Batch[250] Speed: 1.2511512240310414 samples/sec                   batch loss = 690.7119567394257 | accuracy = 0.539


Epoch[1] Batch[255] Speed: 1.2548112007267445 samples/sec                   batch loss = 704.7300305366516 | accuracy = 0.5362745098039216


Epoch[1] Batch[260] Speed: 1.2492692274928918 samples/sec                   batch loss = 718.0423045158386 | accuracy = 0.5375


Epoch[1] Batch[265] Speed: 1.2445450136425658 samples/sec                   batch loss = 731.3157386779785 | accuracy = 0.5377358490566038


Epoch[1] Batch[270] Speed: 1.245670872406841 samples/sec                   batch loss = 744.2656149864197 | accuracy = 0.5398148148148149


Epoch[1] Batch[275] Speed: 1.249538402162075 samples/sec                   batch loss = 757.6487650871277 | accuracy = 0.5409090909090909


Epoch[1] Batch[280] Speed: 1.2507591906916147 samples/sec                   batch loss = 771.0805432796478 | accuracy = 0.5455357142857142


Epoch[1] Batch[285] Speed: 1.2468340728752016 samples/sec                   batch loss = 784.6448309421539 | accuracy = 0.5456140350877193


Epoch[1] Batch[290] Speed: 1.2482196897889994 samples/sec                   batch loss = 799.2450201511383 | accuracy = 0.5431034482758621


Epoch[1] Batch[295] Speed: 1.2527720500947053 samples/sec                   batch loss = 813.1626615524292 | accuracy = 0.5406779661016949


Epoch[1] Batch[300] Speed: 1.2384314958077156 samples/sec                   batch loss = 827.4862613677979 | accuracy = 0.5391666666666667


Epoch[1] Batch[305] Speed: 1.252997348850972 samples/sec                   batch loss = 841.0055913925171 | accuracy = 0.5401639344262295


Epoch[1] Batch[310] Speed: 1.2521161699873775 samples/sec                   batch loss = 854.5739314556122 | accuracy = 0.5411290322580645


Epoch[1] Batch[315] Speed: 1.2594104380563007 samples/sec                   batch loss = 868.8460686206818 | accuracy = 0.5404761904761904


Epoch[1] Batch[320] Speed: 1.2589878924249707 samples/sec                   batch loss = 881.722469329834 | accuracy = 0.5421875


Epoch[1] Batch[325] Speed: 1.2560951688163584 samples/sec                   batch loss = 895.6819369792938 | accuracy = 0.5415384615384615


Epoch[1] Batch[330] Speed: 1.2514725518015095 samples/sec                   batch loss = 909.2866544723511 | accuracy = 0.5424242424242425


Epoch[1] Batch[335] Speed: 1.249349046665941 samples/sec                   batch loss = 922.7747631072998 | accuracy = 0.5440298507462686


Epoch[1] Batch[340] Speed: 1.257452542027095 samples/sec                   batch loss = 936.3823487758636 | accuracy = 0.5448529411764705


Epoch[1] Batch[345] Speed: 1.259682582396386 samples/sec                   batch loss = 949.4224064350128 | accuracy = 0.5456521739130434


Epoch[1] Batch[350] Speed: 1.250082688287695 samples/sec                   batch loss = 963.1131575107574 | accuracy = 0.5457142857142857


Epoch[1] Batch[355] Speed: 1.248651113939938 samples/sec                   batch loss = 976.6550259590149 | accuracy = 0.547887323943662


Epoch[1] Batch[360] Speed: 1.2543154902641713 samples/sec                   batch loss = 990.0974409580231 | accuracy = 0.5513888888888889


Epoch[1] Batch[365] Speed: 1.2532736562535063 samples/sec                   batch loss = 1003.9138011932373 | accuracy = 0.552054794520548


Epoch[1] Batch[370] Speed: 1.2577386439702378 samples/sec                   batch loss = 1016.9767122268677 | accuracy = 0.5547297297297298


Epoch[1] Batch[375] Speed: 1.2562469721146265 samples/sec                   batch loss = 1030.0937571525574 | accuracy = 0.556


Epoch[1] Batch[380] Speed: 1.2516786131785431 samples/sec                   batch loss = 1043.6350982189178 | accuracy = 0.5578947368421052


Epoch[1] Batch[385] Speed: 1.2537556136919081 samples/sec                   batch loss = 1056.7230896949768 | accuracy = 0.5590909090909091


Epoch[1] Batch[390] Speed: 1.2501092350565548 samples/sec                   batch loss = 1069.7024283409119 | accuracy = 0.5615384615384615


Epoch[1] Batch[395] Speed: 1.257965921461858 samples/sec                   batch loss = 1083.0241198539734 | accuracy = 0.5639240506329114


Epoch[1] Batch[400] Speed: 1.2595362831042556 samples/sec                   batch loss = 1097.027227640152 | accuracy = 0.563125


Epoch[1] Batch[405] Speed: 1.2598941948815026 samples/sec                   batch loss = 1110.8834826946259 | accuracy = 0.562962962962963


Epoch[1] Batch[410] Speed: 1.2498238930891 samples/sec                   batch loss = 1123.7540233135223 | accuracy = 0.5646341463414634


Epoch[1] Batch[415] Speed: 1.2460676810211786 samples/sec                   batch loss = 1137.1471922397614 | accuracy = 0.5650602409638554


Epoch[1] Batch[420] Speed: 1.2562561905968521 samples/sec                   batch loss = 1149.3647682666779 | accuracy = 0.5678571428571428


Epoch[1] Batch[425] Speed: 1.2534982925909917 samples/sec                   batch loss = 1162.9827449321747 | accuracy = 0.5670588235294117


Epoch[1] Batch[430] Speed: 1.2603450898670747 samples/sec                   batch loss = 1176.9977095127106 | accuracy = 0.5668604651162791


Epoch[1] Batch[435] Speed: 1.2538556855124996 samples/sec                   batch loss = 1190.6633248329163 | accuracy = 0.5666666666666667


Epoch[1] Batch[440] Speed: 1.2479017936172396 samples/sec                   batch loss = 1202.6866500377655 | accuracy = 0.5681818181818182


Epoch[1] Batch[445] Speed: 1.2528270574303704 samples/sec                   batch loss = 1216.0129880905151 | accuracy = 0.5685393258426966


Epoch[1] Batch[450] Speed: 1.2579875218122345 samples/sec                   batch loss = 1228.8018655776978 | accuracy = 0.57


Epoch[1] Batch[455] Speed: 1.2467952490853336 samples/sec                   batch loss = 1241.535817861557 | accuracy = 0.5714285714285714


Epoch[1] Batch[460] Speed: 1.2541199031431531 samples/sec                   batch loss = 1255.4356331825256 | accuracy = 0.5717391304347826


Epoch[1] Batch[465] Speed: 1.2446817563252084 samples/sec                   batch loss = 1268.834228515625 | accuracy = 0.5720430107526882


Epoch[1] Batch[470] Speed: 1.2514213970984882 samples/sec                   batch loss = 1281.4221634864807 | accuracy = 0.5728723404255319


Epoch[1] Batch[475] Speed: 1.2549844727872645 samples/sec                   batch loss = 1294.920670747757 | accuracy = 0.5742105263157895


Epoch[1] Batch[480] Speed: 1.2510539156506568 samples/sec                   batch loss = 1308.7544796466827 | accuracy = 0.5739583333333333


Epoch[1] Batch[485] Speed: 1.2547073167557845 samples/sec                   batch loss = 1322.4282622337341 | accuracy = 0.5737113402061855


Epoch[1] Batch[490] Speed: 1.2530289794977014 samples/sec                   batch loss = 1334.3537857532501 | accuracy = 0.5755102040816327


Epoch[1] Batch[495] Speed: 1.2495628783385175 samples/sec                   batch loss = 1346.2963058948517 | accuracy = 0.5782828282828283


Epoch[1] Batch[500] Speed: 1.2558522094240776 samples/sec                   batch loss = 1359.7695381641388 | accuracy = 0.578


Epoch[1] Batch[505] Speed: 1.2546580553028084 samples/sec                   batch loss = 1371.8545987606049 | accuracy = 0.5782178217821782


Epoch[1] Batch[510] Speed: 1.258045723828138 samples/sec                   batch loss = 1386.1609320640564 | accuracy = 0.5774509803921568


Epoch[1] Batch[515] Speed: 1.2556106595833885 samples/sec                   batch loss = 1399.0561759471893 | accuracy = 0.579126213592233


Epoch[1] Batch[520] Speed: 1.2530877531052305 samples/sec                   batch loss = 1412.0006694793701 | accuracy = 0.5788461538461539


Epoch[1] Batch[525] Speed: 1.2477476388704716 samples/sec                   batch loss = 1424.3372159004211 | accuracy = 0.579047619047619


Epoch[1] Batch[530] Speed: 1.2467943225327769 samples/sec                   batch loss = 1438.1500177383423 | accuracy = 0.5806603773584905


Epoch[1] Batch[535] Speed: 1.246510491668763 samples/sec                   batch loss = 1452.4150912761688 | accuracy = 0.5799065420560747


Epoch[1] Batch[540] Speed: 1.251812911705311 samples/sec                   batch loss = 1466.63720536232 | accuracy = 0.5800925925925926


Epoch[1] Batch[545] Speed: 1.2522768274668834 samples/sec                   batch loss = 1478.9770460128784 | accuracy = 0.581651376146789


Epoch[1] Batch[550] Speed: 1.2500082701991793 samples/sec                   batch loss = 1491.7651851177216 | accuracy = 0.5813636363636364


Epoch[1] Batch[555] Speed: 1.2434016517307718 samples/sec                   batch loss = 1505.385220527649 | accuracy = 0.581081081081081


Epoch[1] Batch[560] Speed: 1.2412166207692958 samples/sec                   batch loss = 1519.5868818759918 | accuracy = 0.5803571428571429


Epoch[1] Batch[565] Speed: 1.2502180420154891 samples/sec                   batch loss = 1532.4806904792786 | accuracy = 0.5805309734513274


Epoch[1] Batch[570] Speed: 1.2474213574047783 samples/sec                   batch loss = 1544.7037026882172 | accuracy = 0.5815789473684211


Epoch[1] Batch[575] Speed: 1.2495812128326622 samples/sec                   batch loss = 1557.3996562957764 | accuracy = 0.5834782608695652


Epoch[1] Batch[580] Speed: 1.2556020143856894 samples/sec                   batch loss = 1570.6475328207016 | accuracy = 0.5831896551724138


Epoch[1] Batch[585] Speed: 1.2573158058307317 samples/sec                   batch loss = 1583.5080324411392 | accuracy = 0.5841880341880342


Epoch[1] Batch[590] Speed: 1.255221555685571 samples/sec                   batch loss = 1596.8526426553726 | accuracy = 0.5838983050847457


Epoch[1] Batch[595] Speed: 1.2477947814929509 samples/sec                   batch loss = 1609.0680040121078 | accuracy = 0.5844537815126051


Epoch[1] Batch[600] Speed: 1.2574452851205673 samples/sec                   batch loss = 1623.4238678216934 | accuracy = 0.58375


Epoch[1] Batch[605] Speed: 1.2555508973889038 samples/sec                   batch loss = 1636.8399478197098 | accuracy = 0.5822314049586776


Epoch[1] Batch[610] Speed: 1.2622618655940525 samples/sec                   batch loss = 1650.6040095090866 | accuracy = 0.5815573770491803


Epoch[1] Batch[615] Speed: 1.251149824474374 samples/sec                   batch loss = 1662.7861450910568 | accuracy = 0.582520325203252


Epoch[1] Batch[620] Speed: 1.2525678729324068 samples/sec                   batch loss = 1675.6633001565933 | accuracy = 0.5830645161290322


Epoch[1] Batch[625] Speed: 1.2561591211237166 samples/sec                   batch loss = 1688.982029080391 | accuracy = 0.584


Epoch[1] Batch[630] Speed: 1.2572732173077763 samples/sec                   batch loss = 1704.6630438566208 | accuracy = 0.582936507936508


Epoch[1] Batch[635] Speed: 1.256868016922738 samples/sec                   batch loss = 1718.8187848329544 | accuracy = 0.5838582677165355


Epoch[1] Batch[640] Speed: 1.2504733655658191 samples/sec                   batch loss = 1730.275032877922 | accuracy = 0.585546875


Epoch[1] Batch[645] Speed: 1.253483214434394 samples/sec                   batch loss = 1743.6333869695663 | accuracy = 0.5852713178294574


Epoch[1] Batch[650] Speed: 1.2549629754409817 samples/sec                   batch loss = 1756.1452581882477 | accuracy = 0.5865384615384616


Epoch[1] Batch[655] Speed: 1.2546099235622836 samples/sec                   batch loss = 1769.2899854183197 | accuracy = 0.5866412213740458


Epoch[1] Batch[660] Speed: 1.2575751681386411 samples/sec                   batch loss = 1781.3925046920776 | accuracy = 0.5882575757575758


Epoch[1] Batch[665] Speed: 1.2460422310560497 samples/sec                   batch loss = 1793.775059223175 | accuracy = 0.5890977443609022


Epoch[1] Batch[670] Speed: 1.2500116230138059 samples/sec                   batch loss = 1805.8284640312195 | accuracy = 0.591044776119403


Epoch[1] Batch[675] Speed: 1.2447507392212147 samples/sec                   batch loss = 1818.3182611465454 | accuracy = 0.5922222222222222


Epoch[1] Batch[680] Speed: 1.2438297512061145 samples/sec                   batch loss = 1831.2951519489288 | accuracy = 0.5933823529411765


Epoch[1] Batch[685] Speed: 1.2539141618352816 samples/sec                   batch loss = 1844.0240515470505 | accuracy = 0.5937956204379562


Epoch[1] Batch[690] Speed: 1.2516123150739942 samples/sec                   batch loss = 1856.2973883152008 | accuracy = 0.5945652173913043


Epoch[1] Batch[695] Speed: 1.2469716897887735 samples/sec                   batch loss = 1869.9836263656616 | accuracy = 0.5942446043165468


Epoch[1] Batch[700] Speed: 1.2486285320263477 samples/sec                   batch loss = 1883.308352470398 | accuracy = 0.5935714285714285


Epoch[1] Batch[705] Speed: 1.2454841665493108 samples/sec                   batch loss = 1895.691713809967 | accuracy = 0.5936170212765958


Epoch[1] Batch[710] Speed: 1.2470093196338743 samples/sec                   batch loss = 1908.1276807785034 | accuracy = 0.5936619718309859


Epoch[1] Batch[715] Speed: 1.2511735240563482 samples/sec                   batch loss = 1922.1306955814362 | accuracy = 0.5919580419580419


Epoch[1] Batch[720] Speed: 1.2526304377885895 samples/sec                   batch loss = 1935.7692918777466 | accuracy = 0.5920138888888888


Epoch[1] Batch[725] Speed: 1.247627292858387 samples/sec                   batch loss = 1949.9829542636871 | accuracy = 0.5917241379310345


Epoch[1] Batch[730] Speed: 1.2474731131626535 samples/sec                   batch loss = 1961.8653528690338 | accuracy = 0.5921232876712329


Epoch[1] Batch[735] Speed: 1.239064512059915 samples/sec                   batch loss = 1975.723129272461 | accuracy = 0.5918367346938775


Epoch[1] Batch[740] Speed: 1.2518320595308359 samples/sec                   batch loss = 1987.830826997757 | accuracy = 0.5922297297297298


Epoch[1] Batch[745] Speed: 1.252193829974016 samples/sec                   batch loss = 2000.756700515747 | accuracy = 0.5929530201342282


Epoch[1] Batch[750] Speed: 1.2566338883962886 samples/sec                   batch loss = 2013.7247591018677 | accuracy = 0.5926666666666667


Epoch[1] Batch[755] Speed: 1.2527447353216004 samples/sec                   batch loss = 2027.8139462471008 | accuracy = 0.5923841059602649


Epoch[1] Batch[760] Speed: 1.2517623829393034 samples/sec                   batch loss = 2041.078934788704 | accuracy = 0.5921052631578947


Epoch[1] Batch[765] Speed: 1.2566491365732397 samples/sec                   batch loss = 2053.3269869089127 | accuracy = 0.592483660130719


Epoch[1] Batch[770] Speed: 1.242594651171799 samples/sec                   batch loss = 2064.714218854904 | accuracy = 0.5938311688311688


Epoch[1] Batch[775] Speed: 1.2575252099564853 samples/sec                   batch loss = 2077.85382604599 | accuracy = 0.5948387096774194


Epoch[1] Batch[780] Speed: 1.254450918043321 samples/sec                   batch loss = 2089.392874598503 | accuracy = 0.5958333333333333


Epoch[1] Batch[785] Speed: 1.253182382709966 samples/sec                   batch loss = 2102.3549369573593 | accuracy = 0.5961783439490446


[Epoch 1] training: accuracy=0.5970812182741116
[Epoch 1] time cost: 647.80606341362
[Epoch 1] validation: validation accuracy=0.7022222222222222


Epoch[2] Batch[5] Speed: 1.2553224252304513 samples/sec                   batch loss = 11.885858297348022 | accuracy = 0.65


Epoch[2] Batch[10] Speed: 1.2557768208604323 samples/sec                   batch loss = 26.54330086708069 | accuracy = 0.55


Epoch[2] Batch[15] Speed: 1.2495123449291878 samples/sec                   batch loss = 40.2301299571991 | accuracy = 0.5833333333333334


Epoch[2] Batch[20] Speed: 1.2493082985410007 samples/sec                   batch loss = 52.21613585948944 | accuracy = 0.6


Epoch[2] Batch[25] Speed: 1.248567481312288 samples/sec                   batch loss = 63.76332628726959 | accuracy = 0.64


Epoch[2] Batch[30] Speed: 1.253453621094747 samples/sec                   batch loss = 76.4599609375 | accuracy = 0.6333333333333333


Epoch[2] Batch[35] Speed: 1.242850184662032 samples/sec                   batch loss = 90.42753720283508 | accuracy = 0.6142857142857143


Epoch[2] Batch[40] Speed: 1.245645531132496 samples/sec                   batch loss = 103.95952141284943 | accuracy = 0.60625


Epoch[2] Batch[45] Speed: 1.2435102158196596 samples/sec                   batch loss = 117.08982813358307 | accuracy = 0.6055555555555555


Epoch[2] Batch[50] Speed: 1.2457843658711012 samples/sec                   batch loss = 130.40216553211212 | accuracy = 0.605


Epoch[2] Batch[55] Speed: 1.2381007482380335 samples/sec                   batch loss = 143.76083958148956 | accuracy = 0.6


Epoch[2] Batch[60] Speed: 1.2457269228290386 samples/sec                   batch loss = 154.61719477176666 | accuracy = 0.6166666666666667


Epoch[2] Batch[65] Speed: 1.2420959456976297 samples/sec                   batch loss = 165.75085580348969 | accuracy = 0.6269230769230769


Epoch[2] Batch[70] Speed: 1.2407896755085013 samples/sec                   batch loss = 177.16422271728516 | accuracy = 0.6357142857142857


Epoch[2] Batch[75] Speed: 1.2452896599089955 samples/sec                   batch loss = 187.10186517238617 | accuracy = 0.6466666666666666


Epoch[2] Batch[80] Speed: 1.2349919347827831 samples/sec                   batch loss = 197.58227336406708 | accuracy = 0.65


Epoch[2] Batch[85] Speed: 1.2510016757852829 samples/sec                   batch loss = 214.15589654445648 | accuracy = 0.6411764705882353


Epoch[2] Batch[90] Speed: 1.2484813514173396 samples/sec                   batch loss = 228.8599852323532 | accuracy = 0.6388888888888888


Epoch[2] Batch[95] Speed: 1.2457948190394383 samples/sec                   batch loss = 241.5384647846222 | accuracy = 0.631578947368421


Epoch[2] Batch[100] Speed: 1.254511326011943 samples/sec                   batch loss = 253.47274267673492 | accuracy = 0.635


Epoch[2] Batch[105] Speed: 1.2547903662479274 samples/sec                   batch loss = 266.7056874036789 | accuracy = 0.6309523809523809


Epoch[2] Batch[110] Speed: 1.2526738347229518 samples/sec                   batch loss = 276.9811432361603 | accuracy = 0.6409090909090909


Epoch[2] Batch[115] Speed: 1.24351556157009 samples/sec                   batch loss = 290.7321078777313 | accuracy = 0.6326086956521739


Epoch[2] Batch[120] Speed: 1.2453877376502063 samples/sec                   batch loss = 304.13851022720337 | accuracy = 0.63125


Epoch[2] Batch[125] Speed: 1.2504935909022303 samples/sec                   batch loss = 315.7533791065216 | accuracy = 0.632


Epoch[2] Batch[130] Speed: 1.2476142111520268 samples/sec                   batch loss = 327.0596116781235 | accuracy = 0.6365384615384615


Epoch[2] Batch[135] Speed: 1.2521371026575758 samples/sec                   batch loss = 340.1415368318558 | accuracy = 0.6407407407407407


Epoch[2] Batch[140] Speed: 1.2497042633085609 samples/sec                   batch loss = 353.9911333322525 | accuracy = 0.6428571428571429


Epoch[2] Batch[145] Speed: 1.247915995217859 samples/sec                   batch loss = 365.69098913669586 | accuracy = 0.6431034482758621


Epoch[2] Batch[150] Speed: 1.2543480315280884 samples/sec                   batch loss = 376.97064554691315 | accuracy = 0.6466666666666666


Epoch[2] Batch[155] Speed: 1.2508405992416323 samples/sec                   batch loss = 389.2907705307007 | accuracy = 0.646774193548387


Epoch[2] Batch[160] Speed: 1.2514986908553825 samples/sec                   batch loss = 399.3075684309006 | accuracy = 0.653125


Epoch[2] Batch[165] Speed: 1.251335246349213 samples/sec                   batch loss = 412.7130342721939 | accuracy = 0.6530303030303031


Epoch[2] Batch[170] Speed: 1.239179916754358 samples/sec                   batch loss = 423.33349573612213 | accuracy = 0.6558823529411765


Epoch[2] Batch[175] Speed: 1.2535883008100959 samples/sec                   batch loss = 435.07620346546173 | accuracy = 0.6542857142857142


Epoch[2] Batch[180] Speed: 1.2580209142065828 samples/sec                   batch loss = 445.74575662612915 | accuracy = 0.6569444444444444


Epoch[2] Batch[185] Speed: 1.2542233148119049 samples/sec                   batch loss = 457.064838886261 | accuracy = 0.6581081081081082


Epoch[2] Batch[190] Speed: 1.25902870754116 samples/sec                   batch loss = 469.33361434936523 | accuracy = 0.656578947368421


Epoch[2] Batch[195] Speed: 1.248732620068111 samples/sec                   batch loss = 483.83916532993317 | accuracy = 0.6564102564102564


Epoch[2] Batch[200] Speed: 1.251025369755 samples/sec                   batch loss = 494.7316436767578 | accuracy = 0.65875


Epoch[2] Batch[205] Speed: 1.2518620434002603 samples/sec                   batch loss = 507.4723262786865 | accuracy = 0.6597560975609756


Epoch[2] Batch[210] Speed: 1.2543823564200416 samples/sec                   batch loss = 517.9544956684113 | accuracy = 0.6607142857142857


Epoch[2] Batch[215] Speed: 1.2595133057605794 samples/sec                   batch loss = 530.7701427936554 | accuracy = 0.6593023255813953


Epoch[2] Batch[220] Speed: 1.2514395061319523 samples/sec                   batch loss = 540.3940188884735 | accuracy = 0.6613636363636364


Epoch[2] Batch[225] Speed: 1.252492223629208 samples/sec                   batch loss = 551.6625690460205 | accuracy = 0.6622222222222223


Epoch[2] Batch[230] Speed: 1.245883076674476 samples/sec                   batch loss = 562.9952325820923 | accuracy = 0.6619565217391304


Epoch[2] Batch[235] Speed: 1.2522698171181146 samples/sec                   batch loss = 573.7786288261414 | accuracy = 0.6648936170212766


Epoch[2] Batch[240] Speed: 1.2523281456100668 samples/sec                   batch loss = 584.4681098461151 | accuracy = 0.6677083333333333


Epoch[2] Batch[245] Speed: 1.2526101432714603 samples/sec                   batch loss = 597.7349224090576 | accuracy = 0.6653061224489796


Epoch[2] Batch[250] Speed: 1.2508120630797195 samples/sec                   batch loss = 611.6155260801315 | accuracy = 0.663


Epoch[2] Batch[255] Speed: 1.2501977324296834 samples/sec                   batch loss = 625.681688785553 | accuracy = 0.6598039215686274


Epoch[2] Batch[260] Speed: 1.250581676609124 samples/sec                   batch loss = 636.0444625616074 | accuracy = 0.6615384615384615


Epoch[2] Batch[265] Speed: 1.2459163846923205 samples/sec                   batch loss = 650.662407040596 | accuracy = 0.659433962264151


Epoch[2] Batch[270] Speed: 1.2593327311352855 samples/sec                   batch loss = 661.8690682649612 | accuracy = 0.662962962962963


Epoch[2] Batch[275] Speed: 1.2434492959449044 samples/sec                   batch loss = 672.7052094936371 | accuracy = 0.6654545454545454


Epoch[2] Batch[280] Speed: 1.2496570693762654 samples/sec                   batch loss = 683.5158231258392 | accuracy = 0.6660714285714285


Epoch[2] Batch[285] Speed: 1.2502917396776618 samples/sec                   batch loss = 695.365146279335 | accuracy = 0.6666666666666666


Epoch[2] Batch[290] Speed: 1.254221533324208 samples/sec                   batch loss = 705.8461155891418 | accuracy = 0.6681034482758621


Epoch[2] Batch[295] Speed: 1.2478829514696037 samples/sec                   batch loss = 721.1269298791885 | accuracy = 0.6677966101694915


Epoch[2] Batch[300] Speed: 1.249568648527387 samples/sec                   batch loss = 735.2201620340347 | accuracy = 0.665


Epoch[2] Batch[305] Speed: 1.2449875740308003 samples/sec                   batch loss = 746.3893132209778 | accuracy = 0.6655737704918033


Epoch[2] Batch[310] Speed: 1.2552391174668573 samples/sec                   batch loss = 756.3236808776855 | accuracy = 0.6653225806451613


Epoch[2] Batch[315] Speed: 1.251912019707977 samples/sec                   batch loss = 767.1819276809692 | accuracy = 0.6658730158730158


Epoch[2] Batch[320] Speed: 1.242127672081924 samples/sec                   batch loss = 780.9575288295746 | accuracy = 0.66484375


Epoch[2] Batch[325] Speed: 1.2457198931265243 samples/sec                   batch loss = 792.77372443676 | accuracy = 0.6646153846153846


Epoch[2] Batch[330] Speed: 1.2469651094333996 samples/sec                   batch loss = 804.0630472898483 | accuracy = 0.6666666666666666


Epoch[2] Batch[335] Speed: 1.2485391416856007 samples/sec                   batch loss = 815.3297798633575 | accuracy = 0.6656716417910448


Epoch[2] Batch[340] Speed: 1.246394365778643 samples/sec                   batch loss = 826.946004986763 | accuracy = 0.6661764705882353


Epoch[2] Batch[345] Speed: 1.2501949375842047 samples/sec                   batch loss = 840.1518037319183 | accuracy = 0.6652173913043479


Epoch[2] Batch[350] Speed: 1.2435831247564586 samples/sec                   batch loss = 851.9014194011688 | accuracy = 0.665


Epoch[2] Batch[355] Speed: 1.2552945295167743 samples/sec                   batch loss = 863.990696310997 | accuracy = 0.6661971830985915


Epoch[2] Batch[360] Speed: 1.2601651285544169 samples/sec                   batch loss = 875.6472412347794 | accuracy = 0.6666666666666666


Epoch[2] Batch[365] Speed: 1.255408562244588 samples/sec                   batch loss = 888.1649688482285 | accuracy = 0.6664383561643835


Epoch[2] Batch[370] Speed: 1.2586490968706951 samples/sec                   batch loss = 901.7408317327499 | accuracy = 0.6668918918918919


Epoch[2] Batch[375] Speed: 1.252258039908876 samples/sec                   batch loss = 913.6991803646088 | accuracy = 0.668


Epoch[2] Batch[380] Speed: 1.2456846533136514 samples/sec                   batch loss = 924.1850333213806 | accuracy = 0.6703947368421053


Epoch[2] Batch[385] Speed: 1.2494271081455834 samples/sec                   batch loss = 935.7931777238846 | accuracy = 0.6701298701298701


Epoch[2] Batch[390] Speed: 1.242895576779984 samples/sec                   batch loss = 950.1945937871933 | accuracy = 0.6685897435897435


Epoch[2] Batch[395] Speed: 1.2548388872533884 samples/sec                   batch loss = 961.7799367904663 | accuracy = 0.6689873417721519


Epoch[2] Batch[400] Speed: 1.2476762821108713 samples/sec                   batch loss = 975.0887750387192 | accuracy = 0.6675


Epoch[2] Batch[405] Speed: 1.2406543370911383 samples/sec                   batch loss = 984.4800171256065 | accuracy = 0.6691358024691358


Epoch[2] Batch[410] Speed: 1.2407578339163077 samples/sec                   batch loss = 996.0826626420021 | accuracy = 0.6689024390243903


Epoch[2] Batch[415] Speed: 1.2377250674498728 samples/sec                   batch loss = 1006.3977977633476 | accuracy = 0.6704819277108434


Epoch[2] Batch[420] Speed: 1.2418435717676344 samples/sec                   batch loss = 1018.3525702357292 | accuracy = 0.6696428571428571


Epoch[2] Batch[425] Speed: 1.2421544338574793 samples/sec                   batch loss = 1028.9388177990913 | accuracy = 0.6705882352941176


Epoch[2] Batch[430] Speed: 1.2482265619966368 samples/sec                   batch loss = 1038.8036994338036 | accuracy = 0.6715116279069767


Epoch[2] Batch[435] Speed: 1.2581650689518376 samples/sec                   batch loss = 1049.0875633358955 | accuracy = 0.6729885057471264


Epoch[2] Batch[440] Speed: 1.2496270979603565 samples/sec                   batch loss = 1060.0242367386818 | accuracy = 0.6727272727272727


Epoch[2] Batch[445] Speed: 1.2346284954504707 samples/sec                   batch loss = 1074.1812716126442 | accuracy = 0.6719101123595506


Epoch[2] Batch[450] Speed: 1.2469344329185859 samples/sec                   batch loss = 1083.6741343140602 | accuracy = 0.6738888888888889


Epoch[2] Batch[455] Speed: 1.2453606514895998 samples/sec                   batch loss = 1096.0791847109795 | accuracy = 0.6736263736263737


Epoch[2] Batch[460] Speed: 1.2487402414815345 samples/sec                   batch loss = 1109.0551871657372 | accuracy = 0.6739130434782609


Epoch[2] Batch[465] Speed: 1.2529913597959033 samples/sec                   batch loss = 1121.2392665147781 | accuracy = 0.6752688172043011


Epoch[2] Batch[470] Speed: 1.2475572486217894 samples/sec                   batch loss = 1137.3683822154999 | accuracy = 0.673936170212766


Epoch[2] Batch[475] Speed: 1.2508492722547206 samples/sec                   batch loss = 1149.082405924797 | accuracy = 0.6731578947368421


Epoch[2] Batch[480] Speed: 1.2429628884375712 samples/sec                   batch loss = 1161.201579451561 | accuracy = 0.671875


Epoch[2] Batch[485] Speed: 1.2386778205546107 samples/sec                   batch loss = 1174.894558429718 | accuracy = 0.6711340206185566


Epoch[2] Batch[490] Speed: 1.2518285101335844 samples/sec                   batch loss = 1190.8177634477615 | accuracy = 0.6698979591836735


Epoch[2] Batch[495] Speed: 1.2454784340208926 samples/sec                   batch loss = 1200.0748080015182 | accuracy = 0.6712121212121213


Epoch[2] Batch[500] Speed: 1.2517274541622367 samples/sec                   batch loss = 1212.2271857261658 | accuracy = 0.67


Epoch[2] Batch[505] Speed: 1.2505196890772683 samples/sec                   batch loss = 1221.1807152032852 | accuracy = 0.6717821782178218


Epoch[2] Batch[510] Speed: 1.2391308602686524 samples/sec                   batch loss = 1233.4435795545578 | accuracy = 0.6715686274509803


Epoch[2] Batch[515] Speed: 1.2523230042586049 samples/sec                   batch loss = 1243.8319519758224 | accuracy = 0.6728155339805825


Epoch[2] Batch[520] Speed: 1.2494386460689064 samples/sec                   batch loss = 1253.25068461895 | accuracy = 0.675


Epoch[2] Batch[525] Speed: 1.2485585611599024 samples/sec                   batch loss = 1266.0130735635757 | accuracy = 0.6757142857142857


Epoch[2] Batch[530] Speed: 1.2441098714994983 samples/sec                   batch loss = 1281.9938334226608 | accuracy = 0.6745283018867925


Epoch[2] Batch[535] Speed: 1.241657184323068 samples/sec                   batch loss = 1295.169538617134 | accuracy = 0.6738317757009346


Epoch[2] Batch[540] Speed: 1.255388459419419 samples/sec                   batch loss = 1309.6116203069687 | accuracy = 0.6731481481481482


Epoch[2] Batch[545] Speed: 1.2530489132403448 samples/sec                   batch loss = 1321.7401005029678 | accuracy = 0.6729357798165138


Epoch[2] Batch[550] Speed: 1.2560491836084884 samples/sec                   batch loss = 1332.3422356843948 | accuracy = 0.6736363636363636


Epoch[2] Batch[555] Speed: 1.2528155503942804 samples/sec                   batch loss = 1341.5191999673843 | accuracy = 0.6747747747747748


Epoch[2] Batch[560] Speed: 1.2592632567894972 samples/sec                   batch loss = 1355.111303448677 | accuracy = 0.6736607142857143


Epoch[2] Batch[565] Speed: 1.249467677915405 samples/sec                   batch loss = 1364.991776227951 | accuracy = 0.6747787610619469


Epoch[2] Batch[570] Speed: 1.2500897673158153 samples/sec                   batch loss = 1375.502044081688 | accuracy = 0.6741228070175439


Epoch[2] Batch[575] Speed: 1.2473555095125526 samples/sec                   batch loss = 1389.8188942670822 | accuracy = 0.6734782608695652


Epoch[2] Batch[580] Speed: 1.2496470166914404 samples/sec                   batch loss = 1405.757420182228 | accuracy = 0.6728448275862069


Epoch[2] Batch[585] Speed: 1.251407395627761 samples/sec                   batch loss = 1416.2700202465057 | accuracy = 0.6730769230769231


Epoch[2] Batch[590] Speed: 1.2464618717371065 samples/sec                   batch loss = 1428.3711709976196 | accuracy = 0.6720338983050848


Epoch[2] Batch[595] Speed: 1.2434213724906586 samples/sec                   batch loss = 1439.4346606731415 | accuracy = 0.6718487394957983


Epoch[2] Batch[600] Speed: 1.2432600308198838 samples/sec                   batch loss = 1455.2891891002655 | accuracy = 0.6708333333333333


Epoch[2] Batch[605] Speed: 1.2457332126301184 samples/sec                   batch loss = 1469.2818319797516 | accuracy = 0.6702479338842975


Epoch[2] Batch[610] Speed: 1.2485123828255666 samples/sec                   batch loss = 1479.6950166225433 | accuracy = 0.6704918032786885


Epoch[2] Batch[615] Speed: 1.2497214848618567 samples/sec                   batch loss = 1490.4873900413513 | accuracy = 0.6703252032520325


Epoch[2] Batch[620] Speed: 1.2469126544360258 samples/sec                   batch loss = 1503.3185641765594 | accuracy = 0.6697580645161291


Epoch[2] Batch[625] Speed: 1.2486243502801313 samples/sec                   batch loss = 1515.2028694152832 | accuracy = 0.6704


Epoch[2] Batch[630] Speed: 1.2447610826974576 samples/sec                   batch loss = 1525.3275521993637 | accuracy = 0.6714285714285714


Epoch[2] Batch[635] Speed: 1.248639962275587 samples/sec                   batch loss = 1537.0619666576385 | accuracy = 0.6712598425196851


Epoch[2] Batch[640] Speed: 1.2534931416271367 samples/sec                   batch loss = 1550.0028018951416 | accuracy = 0.67109375


Epoch[2] Batch[645] Speed: 1.249670566290792 samples/sec                   batch loss = 1562.9935995340347 | accuracy = 0.6705426356589147


Epoch[2] Batch[650] Speed: 1.2469307258891058 samples/sec                   batch loss = 1573.468890786171 | accuracy = 0.6703846153846154


Epoch[2] Batch[655] Speed: 1.2490768847290423 samples/sec                   batch loss = 1585.1061809062958 | accuracy = 0.6713740458015267


Epoch[2] Batch[660] Speed: 1.245423885181072 samples/sec                   batch loss = 1595.47648280859 | accuracy = 0.6715909090909091


Epoch[2] Batch[665] Speed: 1.2431654198617295 samples/sec                   batch loss = 1605.5859021544456 | accuracy = 0.6725563909774436


Epoch[2] Batch[670] Speed: 1.2392641272408051 samples/sec                   batch loss = 1616.1702002882957 | accuracy = 0.6727611940298508


Epoch[2] Batch[675] Speed: 1.2460567605451682 samples/sec                   batch loss = 1626.77305239439 | accuracy = 0.6737037037037037


Epoch[2] Batch[680] Speed: 1.2510410418501656 samples/sec                   batch loss = 1637.6628968119621 | accuracy = 0.674264705882353


Epoch[2] Batch[685] Speed: 1.2487108716681594 samples/sec                   batch loss = 1648.6684470772743 | accuracy = 0.6744525547445256


Epoch[2] Batch[690] Speed: 1.2432951335815494 samples/sec                   batch loss = 1658.4854059815407 | accuracy = 0.6753623188405797


Epoch[2] Batch[695] Speed: 1.2377952904624072 samples/sec                   batch loss = 1672.087635576725 | accuracy = 0.6755395683453237


Epoch[2] Batch[700] Speed: 1.2503869725321608 samples/sec                   batch loss = 1683.1942860484123 | accuracy = 0.6753571428571429


Epoch[2] Batch[705] Speed: 1.240717369006056 samples/sec                   batch loss = 1695.7344797253609 | accuracy = 0.6758865248226951


Epoch[2] Batch[710] Speed: 1.2560859526971473 samples/sec                   batch loss = 1705.7069460749626 | accuracy = 0.6767605633802817


Epoch[2] Batch[715] Speed: 1.2322985424953896 samples/sec                   batch loss = 1717.0514549613 | accuracy = 0.6776223776223776


Epoch[2] Batch[720] Speed: 1.2346320388371663 samples/sec                   batch loss = 1730.084523499012 | accuracy = 0.6770833333333334


Epoch[2] Batch[725] Speed: 1.2417677417215562 samples/sec                   batch loss = 1741.5405604243279 | accuracy = 0.676896551724138


Epoch[2] Batch[730] Speed: 1.2439744535113446 samples/sec                   batch loss = 1754.6642786860466 | accuracy = 0.6760273972602739


Epoch[2] Batch[735] Speed: 1.2480347258197408 samples/sec                   batch loss = 1765.3710382580757 | accuracy = 0.676530612244898


Epoch[2] Batch[740] Speed: 1.2435137182026808 samples/sec                   batch loss = 1776.3792288899422 | accuracy = 0.6766891891891892


Epoch[2] Batch[745] Speed: 1.2443822726947387 samples/sec                   batch loss = 1787.7944319844246 | accuracy = 0.6768456375838926


Epoch[2] Batch[750] Speed: 1.2448144651558828 samples/sec                   batch loss = 1797.8254006505013 | accuracy = 0.6776666666666666


Epoch[2] Batch[755] Speed: 1.238494393480334 samples/sec                   batch loss = 1810.3450637459755 | accuracy = 0.6771523178807947


Epoch[2] Batch[760] Speed: 1.2364347126732078 samples/sec                   batch loss = 1822.4952928423882 | accuracy = 0.6773026315789473


Epoch[2] Batch[765] Speed: 1.2407181030398793 samples/sec                   batch loss = 1835.3383205533028 | accuracy = 0.6774509803921569


Epoch[2] Batch[770] Speed: 1.240850335051903 samples/sec                   batch loss = 1844.855450630188 | accuracy = 0.6779220779220779


Epoch[2] Batch[775] Speed: 1.2390162882350357 samples/sec                   batch loss = 1857.8349952697754 | accuracy = 0.6783870967741935


Epoch[2] Batch[780] Speed: 1.238134006927026 samples/sec                   batch loss = 1868.3992017507553 | accuracy = 0.6794871794871795


Epoch[2] Batch[785] Speed: 1.2348714008069215 samples/sec                   batch loss = 1879.4646835327148 | accuracy = 0.6792993630573249


[Epoch 2] training: accuracy=0.6802030456852792
[Epoch 2] time cost: 648.393618106842
[Epoch 2] validation: validation accuracy=0.7622222222222222
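
With several hundred batch lines per epoch, the per-epoch summary lines are easy to miss. The sketch below shows one way to pull them out of captured log text. The `summarize` helper and its regular expression are illustrative and not part of MXNet; they assume the log uses exactly the `[Epoch N] ...` summary format printed above.

```python
import re

# Matches the per-epoch summary lines shown in the log above, for example
# "[Epoch 1] validation: validation accuracy=0.7022222222222222".
# The "time cost" lines are intentionally not matched.
SUMMARY = re.compile(
    r"^\[Epoch (\d+)\] (training|validation): .*accuracy=([0-9.]+)$"
)


def summarize(log_text):
    """Return {epoch: {"training": acc, "validation": acc}} from log text."""
    results = {}
    for line in log_text.splitlines():
        match = SUMMARY.match(line.strip())
        if match:
            epoch = int(match.group(1))
            phase = match.group(2)
            accuracy = float(match.group(3))
            results.setdefault(epoch, {})[phase] = accuracy
    return results


# Summary lines copied from the output above.
sample = """\
[Epoch 1] training: accuracy=0.5970812182741116
[Epoch 1] time cost: 647.80606341362
[Epoch 1] validation: validation accuracy=0.7022222222222222
[Epoch 2] training: accuracy=0.6802030456852792
[Epoch 2] time cost: 648.393618106842
[Epoch 2] validation: validation accuracy=0.7622222222222222
"""

summary = summarize(sample)
for epoch in sorted(summary):
    print(epoch, summary[epoch])
```

In this run, both training and validation accuracy improve from epoch 1 to epoch 2, while the time per epoch stays at roughly 648 seconds.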


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).