<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and inference on GPUs are often faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure your machine has at least one NVIDIA GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need the GPU-enabled version of MXNet; installation instructions for your system are available [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the context when you create it.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[03:23:56] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If fewer than two GPUs are available, the code falls back to copying `x` to the CPU instead:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[03:23:56] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet addresses each of them by a zero-based index through `npx`. As you saw before, the first GPU is accessed using `npx.gpu(0)` and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has, so on a machine with eight GPUs the last one is accessed using `npx.gpu(7)`. This lets you select which GPUs to use for operations and training, which is particularly useful when you want to use multiple GPUs to train a neural network.
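
As a rough illustration of this indexing scheme, the sketch below picks the first few device indices and falls back to the CPU when no GPU is present. The helper `pick_device_ids` and its string labels are hypothetical and deliberately MXNet-free so that the snippet runs anywhere. In a real script you would pass `npx.num_gpus()` as `num_available` and build the devices with `npx.gpu(i)` or `npx.cpu()`.

```python
def pick_device_ids(num_requested, num_available):
    """Return zero-based labels for the devices to use (hypothetical helper)."""
    if num_requested < 1:
        raise ValueError("num_requested must be at least 1")
    if num_available == 0:
        # No GPU visible: fall back to the CPU, as the earlier cells do.
        return ["cpu(0)"]
    # GPU ids are zero-indexed, so the valid range is 0 .. num_available - 1.
    count = min(num_requested, num_available)
    return [f"gpu({i})" for i in range(count)]


print(pick_device_ids(2, 8))  # first two of eight GPUs
print(pick_device_ids(8, 8))  # all eight GPUs; the last id is 7
print(pick_device_ids(2, 0))  # no GPU available
```

Clamping the request to the number of available devices means the same script can run unchanged on machines with fewer GPUs than it asks for.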

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to ensure that the inputs of the operation are already on that GPU. The output is allocated on the same GPU. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly, as shown below. Alternatively, you can use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[03:23:57] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[4.5923743, 1.8361223]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. Assume there are *n* GPUs: each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
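
To make the idea concrete, here is a minimal, MXNet-free sketch of data parallelism on a toy scalar model. It assumes a sum-of-squared-errors loss, and the helpers `split_batch` and `grad_squared_error` are hypothetical names introduced only for this illustration. Each chunk stands in for the part of the batch that one GPU would process, and the check at the end shows that summing the per-chunk gradients gives the same result as computing the gradient on the whole batch.

```python
import math


def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(batch), num_parts)
    chunks, start = [], 0
    for i in range(num_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(batch[start:end])
        start = end
    return chunks


def grad_squared_error(w, xs, ys):
    """Gradient of sum((w * x - y) ** 2) with respect to the scalar weight w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))


# Toy batch of four samples and a scalar model y = w * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]
w = 0.5
n = 2  # pretend there are two devices

# Each "device" computes the gradient on its own chunk ...
per_device = [grad_squared_error(w, cx, cy)
              for cx, cy in zip(split_batch(xs, n), split_batch(ys, n))]
# ... and the per-device gradients are summed before the weight update.
combined = sum(per_device)

full = grad_squared_error(w, xs, ys)
print(math.isclose(combined, full))
```

In the training loop later in this step, `gluon.utils.split_and_load` does the splitting and places each part on its device, so the sketch only mirrors the arithmetic, not the device handling.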

First, copy the data definitions and the transform functions from the [Training Neural Networks](6-train-nn.md) tutorial with the following commands.

In [8]:
# Import transforms to compose a series of transformations on the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
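
The helper above passes a list of label shards and a list of output shards to `acc.update`, one pair per device. As a rough mental model of how such shard lists can be pooled into a single accuracy figure, consider the toy metric below. `ShardedAccuracy` is a hypothetical, MXNet-free stand-in written for illustration, not the implementation behind `gluon.metric.Accuracy`, and plain Python lists play the role of arrays.

```python
class ShardedAccuracy:
    """Toy accuracy metric that accepts one shard of data per device."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, label_shards, prediction_shards):
        # Counts are pooled across every shard, so the result does not
        # depend on how the batch was divided among devices.
        for labels, predictions in zip(label_shards, prediction_shards):
            for label, scores in zip(labels, predictions):
                # The predicted class is the index of the highest score.
                predicted = max(range(len(scores)), key=scores.__getitem__)
                self.correct += int(predicted == label)
                self.total += 1

    def get(self):
        # Mirrors the (name, value) pair unpacked in the helper above.
        return "accuracy", self.correct / self.total


metric = ShardedAccuracy()
label_shards = [[0, 1], [1, 0]]            # two devices, two samples each
prediction_shards = [
    [[0.9, 0.1], [0.2, 0.8]],              # device 0: both predictions correct
    [[0.6, 0.4], [0.7, 0.3]],              # device 1: only the second is correct
]
metric.update(label_shards, prediction_shards)
_, value = metric.get()
print(value)
```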

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7664747694757558 samples/sec                   batch loss = 14.730656623840332 | accuracy = 0.5


Epoch[1] Batch[10] Speed: 1.249769987030181 samples/sec                   batch loss = 28.886590003967285 | accuracy = 0.45


Epoch[1] Batch[15] Speed: 1.2454143627379077 samples/sec                   batch loss = 43.413069009780884 | accuracy = 0.43333333333333335


Epoch[1] Batch[20] Speed: 1.236977855085748 samples/sec                   batch loss = 58.10424304008484 | accuracy = 0.4375


Epoch[1] Batch[25] Speed: 1.239886174637903 samples/sec                   batch loss = 70.89628028869629 | accuracy = 0.49


Epoch[1] Batch[30] Speed: 1.2411384800351424 samples/sec                   batch loss = 84.45326089859009 | accuracy = 0.5083333333333333


Epoch[1] Batch[35] Speed: 1.2465517984382082 samples/sec                   batch loss = 97.6656436920166 | accuracy = 0.5214285714285715


Epoch[1] Batch[40] Speed: 1.2449413823220155 samples/sec                   batch loss = 111.5806679725647 | accuracy = 0.5125


Epoch[1] Batch[45] Speed: 1.2413676044148225 samples/sec                   batch loss = 126.11131286621094 | accuracy = 0.5111111111111111


Epoch[1] Batch[50] Speed: 1.2327121456181898 samples/sec                   batch loss = 139.3288016319275 | accuracy = 0.53


Epoch[1] Batch[55] Speed: 1.2155287173928133 samples/sec                   batch loss = 152.78706979751587 | accuracy = 0.5409090909090909


Epoch[1] Batch[60] Speed: 1.2218603937781691 samples/sec                   batch loss = 165.1695737838745 | accuracy = 0.5666666666666667


Epoch[1] Batch[65] Speed: 1.2308751485200453 samples/sec                   batch loss = 179.27496194839478 | accuracy = 0.5576923076923077


Epoch[1] Batch[70] Speed: 1.2483518537492448 samples/sec                   batch loss = 193.09162545204163 | accuracy = 0.5571428571428572


Epoch[1] Batch[75] Speed: 1.2468583505547992 samples/sec                   batch loss = 207.01684188842773 | accuracy = 0.5666666666666667


Epoch[1] Batch[80] Speed: 1.2435896694660231 samples/sec                   batch loss = 221.32405948638916 | accuracy = 0.565625


Epoch[1] Batch[85] Speed: 1.2471985227983424 samples/sec                   batch loss = 235.22331523895264 | accuracy = 0.5647058823529412


Epoch[1] Batch[90] Speed: 1.2462020740748665 samples/sec                   batch loss = 249.86572909355164 | accuracy = 0.5555555555555556


Epoch[1] Batch[95] Speed: 1.2470384240765033 samples/sec                   batch loss = 264.77660059928894 | accuracy = 0.55


Epoch[1] Batch[100] Speed: 1.2508510441754164 samples/sec                   batch loss = 278.77413606643677 | accuracy = 0.5475


Epoch[1] Batch[105] Speed: 1.2489569330175534 samples/sec                   batch loss = 292.5979194641113 | accuracy = 0.5452380952380952


Epoch[1] Batch[110] Speed: 1.245749677410157 samples/sec                   batch loss = 306.50454902648926 | accuracy = 0.5431818181818182


Epoch[1] Batch[115] Speed: 1.2406931463771296 samples/sec                   batch loss = 319.8668587207794 | accuracy = 0.5478260869565217


Epoch[1] Batch[120] Speed: 1.2385772306856382 samples/sec                   batch loss = 334.75466752052307 | accuracy = 0.5416666666666666


Epoch[1] Batch[125] Speed: 1.2377720949840074 samples/sec                   batch loss = 349.1131241321564 | accuracy = 0.534


Epoch[1] Batch[130] Speed: 1.243603035636177 samples/sec                   batch loss = 362.7412004470825 | accuracy = 0.5346153846153846


Epoch[1] Batch[135] Speed: 1.2407558151961011 samples/sec                   batch loss = 375.9789743423462 | accuracy = 0.5370370370370371


Epoch[1] Batch[140] Speed: 1.2408749309288218 samples/sec                   batch loss = 389.8241608142853 | accuracy = 0.5357142857142857


Epoch[1] Batch[145] Speed: 1.2414605639519332 samples/sec                   batch loss = 403.5866632461548 | accuracy = 0.5327586206896552


Epoch[1] Batch[150] Speed: 1.2494881499003037 samples/sec                   batch loss = 417.33054304122925 | accuracy = 0.53


Epoch[1] Batch[155] Speed: 1.2409948956910508 samples/sec                   batch loss = 431.2161362171173 | accuracy = 0.532258064516129


Epoch[1] Batch[160] Speed: 1.2459161071181464 samples/sec                   batch loss = 445.0208508968353 | accuracy = 0.5328125


Epoch[1] Batch[165] Speed: 1.2496798746775748 samples/sec                   batch loss = 458.4980947971344 | accuracy = 0.5333333333333333


Epoch[1] Batch[170] Speed: 1.2463636247475287 samples/sec                   batch loss = 472.2501850128174 | accuracy = 0.5323529411764706


Epoch[1] Batch[175] Speed: 1.2461272842804074 samples/sec                   batch loss = 486.1038775444031 | accuracy = 0.5328571428571428


Epoch[1] Batch[180] Speed: 1.2335526107789803 samples/sec                   batch loss = 500.39170837402344 | accuracy = 0.5319444444444444


Epoch[1] Batch[185] Speed: 1.235984188269713 samples/sec                   batch loss = 514.0676419734955 | accuracy = 0.5310810810810811


Epoch[1] Batch[190] Speed: 1.2461209904993131 samples/sec                   batch loss = 528.0404207706451 | accuracy = 0.5328947368421053


Epoch[1] Batch[195] Speed: 1.2348444974905606 samples/sec                   batch loss = 541.6465470790863 | accuracy = 0.5333333333333333


Epoch[1] Batch[200] Speed: 1.2262541753830067 samples/sec                   batch loss = 554.3075861930847 | accuracy = 0.54


Epoch[1] Batch[205] Speed: 1.2225817866589093 samples/sec                   batch loss = 567.7212898731232 | accuracy = 0.5426829268292683


Epoch[1] Batch[210] Speed: 1.2285186644269452 samples/sec                   batch loss = 581.2095663547516 | accuracy = 0.544047619047619


Epoch[1] Batch[215] Speed: 1.235200152194843 samples/sec                   batch loss = 594.7796802520752 | accuracy = 0.5430232558139535


Epoch[1] Batch[220] Speed: 1.2459745854757491 samples/sec                   batch loss = 608.9466829299927 | accuracy = 0.5431818181818182


Epoch[1] Batch[225] Speed: 1.2470640074473522 samples/sec                   batch loss = 622.602587223053 | accuracy = 0.5433333333333333


Epoch[1] Batch[230] Speed: 1.2489966354129949 samples/sec                   batch loss = 636.0103509426117 | accuracy = 0.5467391304347826


Epoch[1] Batch[235] Speed: 1.248334112617529 samples/sec                   batch loss = 650.3506100177765 | accuracy = 0.5414893617021277


Epoch[1] Batch[240] Speed: 1.252669345245473 samples/sec                   batch loss = 663.9986686706543 | accuracy = 0.5427083333333333


Epoch[1] Batch[245] Speed: 1.2440630068979837 samples/sec                   batch loss = 677.5415751934052 | accuracy = 0.5459183673469388


Epoch[1] Batch[250] Speed: 1.2441194662627872 samples/sec                   batch loss = 691.6117327213287 | accuracy = 0.545


Epoch[1] Batch[255] Speed: 1.2442312930709514 samples/sec                   batch loss = 704.8002150058746 | accuracy = 0.5470588235294118


Epoch[1] Batch[260] Speed: 1.23855684036753 samples/sec                   batch loss = 717.8707656860352 | accuracy = 0.5490384615384616


Epoch[1] Batch[265] Speed: 1.2408872292329356 samples/sec                   batch loss = 730.645346403122 | accuracy = 0.5547169811320755


Epoch[1] Batch[270] Speed: 1.2352471699457932 samples/sec                   batch loss = 744.3857896327972 | accuracy = 0.5527777777777778


Epoch[1] Batch[275] Speed: 1.2412697914764623 samples/sec                   batch loss = 758.7220463752747 | accuracy = 0.5518181818181818


Epoch[1] Batch[280] Speed: 1.2423821855188844 samples/sec                   batch loss = 772.1296372413635 | accuracy = 0.5526785714285715


Epoch[1] Batch[285] Speed: 1.2443888258185143 samples/sec                   batch loss = 785.3644268512726 | accuracy = 0.5526315789473685


Epoch[1] Batch[290] Speed: 1.2462911300677515 samples/sec                   batch loss = 798.9220805168152 | accuracy = 0.5517241379310345


Epoch[1] Batch[295] Speed: 1.2473025580116095 samples/sec                   batch loss = 812.4992165565491 | accuracy = 0.5542372881355933


Epoch[1] Batch[300] Speed: 1.2427062040217771 samples/sec                   batch loss = 825.0975520610809 | accuracy = 0.5575


Epoch[1] Batch[305] Speed: 1.2445438134684508 samples/sec                   batch loss = 837.9406278133392 | accuracy = 0.559016393442623


Epoch[1] Batch[310] Speed: 1.2379046131404203 samples/sec                   batch loss = 851.3551278114319 | accuracy = 0.5588709677419355


Epoch[1] Batch[315] Speed: 1.2396532000847065 samples/sec                   batch loss = 864.4122221469879 | accuracy = 0.5611111111111111


Epoch[1] Batch[320] Speed: 1.2294548588050427 samples/sec                   batch loss = 877.6429116725922 | accuracy = 0.56171875


Epoch[1] Batch[325] Speed: 1.220761156647027 samples/sec                   batch loss = 891.5794086456299 | accuracy = 0.5615384615384615


Epoch[1] Batch[330] Speed: 1.232555200894287 samples/sec                   batch loss = 905.2984645366669 | accuracy = 0.5606060606060606


Epoch[1] Batch[335] Speed: 1.2386352964801564 samples/sec                   batch loss = 918.1848981380463 | accuracy = 0.5611940298507463


Epoch[1] Batch[340] Speed: 1.2419912141560747 samples/sec                   batch loss = 932.2561779022217 | accuracy = 0.5625


Epoch[1] Batch[345] Speed: 1.2411342564919579 samples/sec                   batch loss = 945.2798619270325 | accuracy = 0.5615942028985508


Epoch[1] Batch[350] Speed: 1.249909090902318 samples/sec                   batch loss = 958.1808786392212 | accuracy = 0.5635714285714286


Epoch[1] Batch[355] Speed: 1.2474239543655559 samples/sec                   batch loss = 971.5916540622711 | accuracy = 0.5633802816901409


Epoch[1] Batch[360] Speed: 1.2378086236899222 samples/sec                   batch loss = 985.6943509578705 | accuracy = 0.5631944444444444


Epoch[1] Batch[365] Speed: 1.2504144641658799 samples/sec                   batch loss = 999.9920806884766 | accuracy = 0.563013698630137


Epoch[1] Batch[370] Speed: 1.2506637146701554 samples/sec                   batch loss = 1013.586311340332 | accuracy = 0.5635135135135135


Epoch[1] Batch[375] Speed: 1.2473578279822246 samples/sec                   batch loss = 1026.0625216960907 | accuracy = 0.5653333333333334


Epoch[1] Batch[380] Speed: 1.2435354702863557 samples/sec                   batch loss = 1038.2939997911453 | accuracy = 0.5664473684210526


Epoch[1] Batch[385] Speed: 1.2404967394587416 samples/sec                   batch loss = 1052.6056205034256 | accuracy = 0.564935064935065


Epoch[1] Batch[390] Speed: 1.2410741200282314 samples/sec                   batch loss = 1066.1005469560623 | accuracy = 0.5634615384615385


Epoch[1] Batch[395] Speed: 1.2421715399263755 samples/sec                   batch loss = 1078.2502747774124 | accuracy = 0.5651898734177215


Epoch[1] Batch[400] Speed: 1.2474182967148866 samples/sec                   batch loss = 1092.9513639211655 | accuracy = 0.5625


Epoch[1] Batch[405] Speed: 1.247358477155277 samples/sec                   batch loss = 1106.7492719888687 | accuracy = 0.562962962962963


Epoch[1] Batch[410] Speed: 1.2474749682874635 samples/sec                   batch loss = 1119.2352381944656 | accuracy = 0.5652439024390243


Epoch[1] Batch[415] Speed: 1.2459973491055143 samples/sec                   batch loss = 1132.537501692772 | accuracy = 0.5644578313253013


Epoch[1] Batch[420] Speed: 1.2465843086056305 samples/sec                   batch loss = 1146.8727074861526 | accuracy = 0.5642857142857143


Epoch[1] Batch[425] Speed: 1.2419013926556361 samples/sec                   batch loss = 1159.949644446373 | accuracy = 0.5641176470588235


Epoch[1] Batch[430] Speed: 1.2458464399117217 samples/sec                   batch loss = 1172.97503221035 | accuracy = 0.5656976744186046


Epoch[1] Batch[435] Speed: 1.2482952882351637 samples/sec                   batch loss = 1186.8726328611374 | accuracy = 0.5632183908045977


Epoch[1] Batch[440] Speed: 1.2472642613914642 samples/sec                   batch loss = 1199.7209309339523 | accuracy = 0.5647727272727273


Epoch[1] Batch[445] Speed: 1.2480244207049027 samples/sec                   batch loss = 1213.6011947393417 | accuracy = 0.5662921348314607


Epoch[1] Batch[450] Speed: 1.248635780452809 samples/sec                   batch loss = 1227.982299208641 | accuracy = 0.565


Epoch[1] Batch[455] Speed: 1.249781065788725 samples/sec                   batch loss = 1241.1823843717575 | accuracy = 0.5664835164835165


Epoch[1] Batch[460] Speed: 1.239472231140975 samples/sec                   batch loss = 1254.0520334243774 | accuracy = 0.5668478260869565


Epoch[1] Batch[465] Speed: 1.246757817458041 samples/sec                   batch loss = 1266.7337913513184 | accuracy = 0.567741935483871


Epoch[1] Batch[470] Speed: 1.2460612027496027 samples/sec                   batch loss = 1280.9725184440613 | accuracy = 0.5664893617021277


Epoch[1] Batch[475] Speed: 1.246990041005874 samples/sec                   batch loss = 1294.2529528141022 | accuracy = 0.5668421052631579


Epoch[1] Batch[480] Speed: 1.2448478084260501 samples/sec                   batch loss = 1306.945606470108 | accuracy = 0.5682291666666667


Epoch[1] Batch[485] Speed: 1.2427335430479372 samples/sec                   batch loss = 1320.0286464691162 | accuracy = 0.5685567010309278


Epoch[1] Batch[490] Speed: 1.2484213369626405 samples/sec                   batch loss = 1333.0662007331848 | accuracy = 0.5693877551020409


Epoch[1] Batch[495] Speed: 1.2385297762288912 samples/sec                   batch loss = 1347.1763303279877 | accuracy = 0.5691919191919191


Epoch[1] Batch[500] Speed: 1.2466583195556464 samples/sec                   batch loss = 1361.4361140727997 | accuracy = 0.5695


Epoch[1] Batch[505] Speed: 1.240394936899777 samples/sec                   batch loss = 1373.0680304765701 | accuracy = 0.5712871287128712


Epoch[1] Batch[510] Speed: 1.2435595275023839 samples/sec                   batch loss = 1386.1656666994095 | accuracy = 0.5715686274509804


Epoch[1] Batch[515] Speed: 1.2479525681479922 samples/sec                   batch loss = 1399.4452985525131 | accuracy = 0.5728155339805825


Epoch[1] Batch[520] Speed: 1.2458023121264525 samples/sec                   batch loss = 1411.259721159935 | accuracy = 0.5745192307692307


Epoch[1] Batch[525] Speed: 1.2404350139309623 samples/sec                   batch loss = 1425.5465528964996 | accuracy = 0.5747619047619048


Epoch[1] Batch[530] Speed: 1.2437584730279432 samples/sec                   batch loss = 1438.9823598861694 | accuracy = 0.5740566037735849


Epoch[1] Batch[535] Speed: 1.2449046161718427 samples/sec                   batch loss = 1452.4626326560974 | accuracy = 0.5733644859813084


Epoch[1] Batch[540] Speed: 1.2433741911538885 samples/sec                   batch loss = 1464.936573266983 | accuracy = 0.5745370370370371


Epoch[1] Batch[545] Speed: 1.2317035245487384 samples/sec                   batch loss = 1476.9132151603699 | accuracy = 0.5752293577981651


Epoch[1] Batch[550] Speed: 1.2278085805421668 samples/sec                   batch loss = 1489.9589591026306 | accuracy = 0.5754545454545454


Epoch[1] Batch[555] Speed: 1.2510604459405972 samples/sec                   batch loss = 1503.8942332267761 | accuracy = 0.5752252252252252


Epoch[1] Batch[560] Speed: 1.2444966392273167 samples/sec                   batch loss = 1517.4471983909607 | accuracy = 0.5754464285714286


Epoch[1] Batch[565] Speed: 1.245054650494201 samples/sec                   batch loss = 1532.0670874118805 | accuracy = 0.5752212389380531


Epoch[1] Batch[570] Speed: 1.245571917804151 samples/sec                   batch loss = 1546.5794396400452 | accuracy = 0.5741228070175438


Epoch[1] Batch[575] Speed: 1.2493677470377513 samples/sec                   batch loss = 1560.1047863960266 | accuracy = 0.5743478260869566


Epoch[1] Batch[580] Speed: 1.2380657554476369 samples/sec                   batch loss = 1572.365427017212 | accuracy = 0.5754310344827587


Epoch[1] Batch[585] Speed: 1.2406507590522635 samples/sec                   batch loss = 1584.9981389045715 | accuracy = 0.5756410256410256


Epoch[1] Batch[590] Speed: 1.2396451396147419 samples/sec                   batch loss = 1597.7973823547363 | accuracy = 0.5766949152542373


Epoch[1] Batch[595] Speed: 1.2485108962558624 samples/sec                   batch loss = 1609.655210018158 | accuracy = 0.5777310924369747


Epoch[1] Batch[600] Speed: 1.2418243606133943 samples/sec                   batch loss = 1622.4517455101013 | accuracy = 0.5783333333333334


Epoch[1] Batch[605] Speed: 1.2409013634059 samples/sec                   batch loss = 1634.888026714325 | accuracy = 0.5785123966942148


Epoch[1] Batch[610] Speed: 1.2321591679151265 samples/sec                   batch loss = 1648.046043395996 | accuracy = 0.5778688524590164


Epoch[1] Batch[615] Speed: 1.2419772390262023 samples/sec                   batch loss = 1659.2752656936646 | accuracy = 0.5804878048780487


Epoch[1] Batch[620] Speed: 1.2412721792144004 samples/sec                   batch loss = 1672.2248466014862 | accuracy = 0.5810483870967742


Epoch[1] Batch[625] Speed: 1.2413988343269255 samples/sec                   batch loss = 1685.5815016031265 | accuracy = 0.582


Epoch[1] Batch[630] Speed: 1.247496395378917 samples/sec                   batch loss = 1698.4762860536575 | accuracy = 0.5825396825396826


Epoch[1] Batch[635] Speed: 1.2371350159490069 samples/sec                   batch loss = 1711.137734055519 | accuracy = 0.5830708661417323


Epoch[1] Batch[640] Speed: 1.2371163150881488 samples/sec                   batch loss = 1724.0101292133331 | accuracy = 0.5828125


Epoch[1] Batch[645] Speed: 1.2425673182562242 samples/sec                   batch loss = 1739.2800209522247 | accuracy = 0.5821705426356589


Epoch[1] Batch[650] Speed: 1.2393527436753526 samples/sec                   batch loss = 1751.2928493022919 | accuracy = 0.583076923076923


Epoch[1] Batch[655] Speed: 1.2461897627612661 samples/sec                   batch loss = 1764.0719575881958 | accuracy = 0.583587786259542


Epoch[1] Batch[660] Speed: 1.2477713025368766 samples/sec                   batch loss = 1776.125406742096 | accuracy = 0.5840909090909091


Epoch[1] Batch[665] Speed: 1.2249039024627792 samples/sec                   batch loss = 1787.397543668747 | accuracy = 0.5857142857142857


Epoch[1] Batch[670] Speed: 1.2326383322442487 samples/sec                   batch loss = 1801.836100101471 | accuracy = 0.5854477611940299


Epoch[1] Batch[675] Speed: 1.2462990920341068 samples/sec                   batch loss = 1813.4156501293182 | accuracy = 0.585925925925926


Epoch[1] Batch[680] Speed: 1.2440801655746532 samples/sec                   batch loss = 1826.9392445087433 | accuracy = 0.5856617647058824


Epoch[1] Batch[685] Speed: 1.2453750726434696 samples/sec                   batch loss = 1839.2800269126892 | accuracy = 0.5857664233576643


Epoch[1] Batch[690] Speed: 1.2400927459462823 samples/sec                   batch loss = 1851.7358281612396 | accuracy = 0.5869565217391305


Epoch[1] Batch[695] Speed: 1.2375273169739693 samples/sec                   batch loss = 1864.1708822250366 | accuracy = 0.5870503597122302


Epoch[1] Batch[700] Speed: 1.2416027859144543 samples/sec                   batch loss = 1876.0042171478271 | accuracy = 0.5878571428571429


Epoch[1] Batch[705] Speed: 1.249882738862576 samples/sec                   batch loss = 1887.9874358177185 | accuracy = 0.5882978723404255


Epoch[1] Batch[710] Speed: 1.2412895365085876 samples/sec                   batch loss = 1903.6629796028137 | accuracy = 0.5894366197183099


Epoch[1] Batch[715] Speed: 1.242551397617179 samples/sec                   batch loss = 1916.082358598709 | accuracy = 0.5902097902097903


Epoch[1] Batch[720] Speed: 1.2401760721319754 samples/sec                   batch loss = 1929.5032498836517 | accuracy = 0.5902777777777778


Epoch[1] Batch[725] Speed: 1.243491137395441 samples/sec                   batch loss = 1941.8703119754791 | accuracy = 0.5910344827586207


Epoch[1] Batch[730] Speed: 1.2345002203779316 samples/sec                   batch loss = 1953.8880851268768 | accuracy = 0.5914383561643836


Epoch[1] Batch[735] Speed: 1.2407704969474422 samples/sec                   batch loss = 1966.1485102176666 | accuracy = 0.5921768707482993


Epoch[1] Batch[740] Speed: 1.2406390158902094 samples/sec                   batch loss = 1978.527678847313 | accuracy = 0.5922297297297298


Epoch[1] Batch[745] Speed: 1.235457020058889 samples/sec                   batch loss = 1992.0709627866745 | accuracy = 0.5922818791946308


Epoch[1] Batch[750] Speed: 1.2402414392322416 samples/sec                   batch loss = 2003.766357421875 | accuracy = 0.5926666666666667


Epoch[1] Batch[755] Speed: 1.2364808220439059 samples/sec                   batch loss = 2016.2267570495605 | accuracy = 0.5920529801324503


Epoch[1] Batch[760] Speed: 1.238499879045364 samples/sec                   batch loss = 2030.5804269313812 | accuracy = 0.5914473684210526


Epoch[1] Batch[765] Speed: 1.2344393626351633 samples/sec                   batch loss = 2041.3397772312164 | accuracy = 0.592483660130719


Epoch[1] Batch[770] Speed: 1.2433548404691857 samples/sec                   batch loss = 2053.7978225946426 | accuracy = 0.5935064935064935


Epoch[1] Batch[775] Speed: 1.2277432595552826 samples/sec                   batch loss = 2067.5802146196365 | accuracy = 0.5929032258064516


Epoch[1] Batch[780] Speed: 1.2226897750139853 samples/sec                   batch loss = 2080.869543671608 | accuracy = 0.5932692307692308


Epoch[1] Batch[785] Speed: 1.246906723406621 samples/sec                   batch loss = 2093.7651530504227 | accuracy = 0.5936305732484076


[Epoch 1] training: accuracy=0.5948604060913706
[Epoch 1] time cost: 653.1453068256378
[Epoch 1] validation: validation accuracy=0.6677777777777778


Epoch[2] Batch[5] Speed: 1.239217718524445 samples/sec                   batch loss = 12.699517726898193 | accuracy = 0.5


Epoch[2] Batch[10] Speed: 1.2389700810707422 samples/sec                   batch loss = 24.078656435012817 | accuracy = 0.6


Epoch[2] Batch[15] Speed: 1.2443117617579718 samples/sec                   batch loss = 36.90675365924835 | accuracy = 0.6333333333333333


Epoch[2] Batch[20] Speed: 1.2386364852855762 samples/sec                   batch loss = 47.7091498374939 | accuracy = 0.6875


Epoch[2] Batch[25] Speed: 1.2392951598601476 samples/sec                   batch loss = 60.60352158546448 | accuracy = 0.67


Epoch[2] Batch[30] Speed: 1.2388634060101196 samples/sec                   batch loss = 73.73001027107239 | accuracy = 0.6583333333333333


Epoch[2] Batch[35] Speed: 1.2343656147263211 samples/sec                   batch loss = 85.97596263885498 | accuracy = 0.6571428571428571


Epoch[2] Batch[40] Speed: 1.2324838509008322 samples/sec                   batch loss = 99.58826756477356 | accuracy = 0.64375


Epoch[2] Batch[45] Speed: 1.2385322448681892 samples/sec                   batch loss = 111.94752562046051 | accuracy = 0.6388888888888888


Epoch[2] Batch[50] Speed: 1.2377615933800785 samples/sec                   batch loss = 124.3618311882019 | accuracy = 0.64


Epoch[2] Batch[55] Speed: 1.2344637050122245 samples/sec                   batch loss = 136.92615509033203 | accuracy = 0.6363636363636364


Epoch[2] Batch[60] Speed: 1.235415535579281 samples/sec                   batch loss = 147.01645243167877 | accuracy = 0.65


Epoch[2] Batch[65] Speed: 1.233801262819083 samples/sec                   batch loss = 160.59912741184235 | accuracy = 0.6384615384615384


Epoch[2] Batch[70] Speed: 1.2394047476487848 samples/sec                   batch loss = 173.63934981822968 | accuracy = 0.6357142857142857


Epoch[2] Batch[75] Speed: 1.2412335173568851 samples/sec                   batch loss = 183.65532457828522 | accuracy = 0.6466666666666666


Epoch[2] Batch[80] Speed: 1.239064146020541 samples/sec                   batch loss = 195.49911773204803 | accuracy = 0.646875


Epoch[2] Batch[85] Speed: 1.2279557804009744 samples/sec                   batch loss = 208.11761593818665 | accuracy = 0.6470588235294118


Epoch[2] Batch[90] Speed: 1.24095092733574 samples/sec                   batch loss = 220.26464593410492 | accuracy = 0.6527777777777778


Epoch[2] Batch[95] Speed: 1.2383431024653169 samples/sec                   batch loss = 232.14056539535522 | accuracy = 0.6526315789473685


Epoch[2] Batch[100] Speed: 1.2286944690639119 samples/sec                   batch loss = 245.57396912574768 | accuracy = 0.6475


Epoch[2] Batch[105] Speed: 1.2204311673899055 samples/sec                   batch loss = 257.85176718235016 | accuracy = 0.65


Epoch[2] Batch[110] Speed: 1.2164136365050189 samples/sec                   batch loss = 270.24858486652374 | accuracy = 0.6477272727272727


Epoch[2] Batch[115] Speed: 1.232415406132835 samples/sec                   batch loss = 282.23118364810944 | accuracy = 0.6478260869565218


Epoch[2] Batch[120] Speed: 1.2430393250139606 samples/sec                   batch loss = 295.5593602657318 | accuracy = 0.6458333333333334


Epoch[2] Batch[125] Speed: 1.2403006697921812 samples/sec                   batch loss = 307.69521701335907 | accuracy = 0.646


Epoch[2] Batch[130] Speed: 1.2385529086793632 samples/sec                   batch loss = 320.13206362724304 | accuracy = 0.6442307692307693


Epoch[2] Batch[135] Speed: 1.2481267366849536 samples/sec                   batch loss = 331.10525381565094 | accuracy = 0.6481481481481481


Epoch[2] Batch[140] Speed: 1.242390281622117 samples/sec                   batch loss = 343.7841693162918 | accuracy = 0.6446428571428572


Epoch[2] Batch[145] Speed: 1.2394802893625043 samples/sec                   batch loss = 354.24451518058777 | accuracy = 0.6517241379310345


Epoch[2] Batch[150] Speed: 1.2497952170937185 samples/sec                   batch loss = 364.1863247156143 | accuracy = 0.6583333333333333


Epoch[2] Batch[155] Speed: 1.2384129385580296 samples/sec                   batch loss = 374.5940898656845 | accuracy = 0.6596774193548387


Epoch[2] Batch[160] Speed: 1.2338540723242306 samples/sec                   batch loss = 384.99112033843994 | accuracy = 0.665625


Epoch[2] Batch[165] Speed: 1.2397012902939297 samples/sec                   batch loss = 395.4120590686798 | accuracy = 0.6696969696969697


Epoch[2] Batch[170] Speed: 1.2410902782611821 samples/sec                   batch loss = 407.0079073905945 | accuracy = 0.6705882352941176


Epoch[2] Batch[175] Speed: 1.2370527363900143 samples/sec                   batch loss = 421.6656653881073 | accuracy = 0.6685714285714286


Epoch[2] Batch[180] Speed: 1.236412479343953 samples/sec                   batch loss = 434.2531044483185 | accuracy = 0.6694444444444444


Epoch[2] Batch[185] Speed: 1.23801404648577 samples/sec                   batch loss = 446.2445673942566 | accuracy = 0.6716216216216216


Epoch[2] Batch[190] Speed: 1.2436013763718483 samples/sec                   batch loss = 459.60950219631195 | accuracy = 0.6723684210526316


Epoch[2] Batch[195] Speed: 1.2466000548803091 samples/sec                   batch loss = 471.94547641277313 | accuracy = 0.6743589743589744


Epoch[2] Batch[200] Speed: 1.249885997888549 samples/sec                   batch loss = 483.4843474626541 | accuracy = 0.67625


Epoch[2] Batch[205] Speed: 1.2450337691674627 samples/sec                   batch loss = 494.64835262298584 | accuracy = 0.6792682926829269


Epoch[2] Batch[210] Speed: 1.2449435994457227 samples/sec                   batch loss = 504.95687186717987 | accuracy = 0.6821428571428572


Epoch[2] Batch[215] Speed: 1.2396998246324304 samples/sec                   batch loss = 514.961502790451 | accuracy = 0.6837209302325581


Epoch[2] Batch[220] Speed: 1.2451575890402011 samples/sec                   batch loss = 526.8783098459244 | accuracy = 0.6863636363636364


Epoch[2] Batch[225] Speed: 1.248694607334254 samples/sec                   batch loss = 538.2540445327759 | accuracy = 0.6844444444444444


Epoch[2] Batch[230] Speed: 1.2471161043677863 samples/sec                   batch loss = 552.469069480896 | accuracy = 0.6815217391304348


Epoch[2] Batch[235] Speed: 1.2504669346051553 samples/sec                   batch loss = 566.031492471695 | accuracy = 0.6797872340425531


Epoch[2] Batch[240] Speed: 1.2444834384656223 samples/sec                   batch loss = 579.3385862112045 | accuracy = 0.678125


Epoch[2] Batch[245] Speed: 1.2475730194856691 samples/sec                   batch loss = 591.7505089044571 | accuracy = 0.6775510204081633


Epoch[2] Batch[250] Speed: 1.2526758924108843 samples/sec                   batch loss = 603.6447854042053 | accuracy = 0.678


Epoch[2] Batch[255] Speed: 1.248238542107104 samples/sec                   batch loss = 617.7274411916733 | accuracy = 0.6774509803921569


Epoch[2] Batch[260] Speed: 1.2506091768053262 samples/sec                   batch loss = 628.5540388822556 | accuracy = 0.6788461538461539


Epoch[2] Batch[265] Speed: 1.2423292874414693 samples/sec                   batch loss = 642.3924502134323 | accuracy = 0.6764150943396227


Epoch[2] Batch[270] Speed: 1.247782809899014 samples/sec                   batch loss = 660.2238210439682 | accuracy = 0.674074074074074


Epoch[2] Batch[275] Speed: 1.248003903820805 samples/sec                   batch loss = 673.6691509485245 | accuracy = 0.6727272727272727


Epoch[2] Batch[280] Speed: 1.2387683654072277 samples/sec                   batch loss = 686.6774619817734 | accuracy = 0.66875


Epoch[2] Batch[285] Speed: 1.2486476754888562 samples/sec                   batch loss = 699.7056535482407 | accuracy = 0.6666666666666666


Epoch[2] Batch[290] Speed: 1.2225597814708902 samples/sec                   batch loss = 711.6691809892654 | accuracy = 0.6681034482758621


Epoch[2] Batch[295] Speed: 1.2258989054515812 samples/sec                   batch loss = 724.4852646589279 | accuracy = 0.6669491525423729


Epoch[2] Batch[300] Speed: 1.2426641392500677 samples/sec                   batch loss = 737.6566270589828 | accuracy = 0.6658333333333334


Epoch[2] Batch[305] Speed: 1.2413865258798418 samples/sec                   batch loss = 748.3565037250519 | accuracy = 0.6672131147540984


Epoch[2] Batch[310] Speed: 1.2366869893903691 samples/sec                   batch loss = 760.8213819265366 | accuracy = 0.6669354838709678


Epoch[2] Batch[315] Speed: 1.2376179675885033 samples/sec                   batch loss = 774.783461689949 | accuracy = 0.6642857142857143


Epoch[2] Batch[320] Speed: 1.2425702631590252 samples/sec                   batch loss = 784.5046252012253 | accuracy = 0.66640625


Epoch[2] Batch[325] Speed: 1.246438165107952 samples/sec                   batch loss = 795.161102771759 | accuracy = 0.6676923076923077


Epoch[2] Batch[330] Speed: 1.2482429998813303 samples/sec                   batch loss = 805.5227926969528 | accuracy = 0.6681818181818182


Epoch[2] Batch[335] Speed: 1.2426472957307686 samples/sec                   batch loss = 817.8781732320786 | accuracy = 0.6671641791044776


Epoch[2] Batch[340] Speed: 1.2371019933477405 samples/sec                   batch loss = 830.9537814855576 | accuracy = 0.6661764705882353


Epoch[2] Batch[345] Speed: 1.2445633858275602 samples/sec                   batch loss = 844.5034452676773 | accuracy = 0.6659420289855073


Epoch[2] Batch[350] Speed: 1.2466050567213973 samples/sec                   batch loss = 858.1100181341171 | accuracy = 0.6671428571428571


Epoch[2] Batch[355] Speed: 1.2456742944827592 samples/sec                   batch loss = 870.2116799354553 | accuracy = 0.6669014084507042


Epoch[2] Batch[360] Speed: 1.2414962082572902 samples/sec                   batch loss = 881.4494384527206 | accuracy = 0.6666666666666666


Epoch[2] Batch[365] Speed: 1.2402079755500992 samples/sec                   batch loss = 892.5209232568741 | accuracy = 0.6664383561643835


Epoch[2] Batch[370] Speed: 1.2409048511084975 samples/sec                   batch loss = 903.0061178207397 | accuracy = 0.6655405405405406


Epoch[2] Batch[375] Speed: 1.238374454605934 samples/sec                   batch loss = 913.7301669120789 | accuracy = 0.666


Epoch[2] Batch[380] Speed: 1.2377959297201844 samples/sec                   batch loss = 926.0488196611404 | accuracy = 0.6664473684210527


Epoch[2] Batch[385] Speed: 1.2421480881775337 samples/sec                   batch loss = 938.8281933069229 | accuracy = 0.6662337662337663


Epoch[2] Batch[390] Speed: 1.2420013279330624 samples/sec                   batch loss = 948.624269247055 | accuracy = 0.6679487179487179


Epoch[2] Batch[395] Speed: 1.2372809933165414 samples/sec                   batch loss = 959.6072080135345 | accuracy = 0.6683544303797468


Epoch[2] Batch[400] Speed: 1.2347560701864084 samples/sec                   batch loss = 969.8366570472717 | accuracy = 0.67


Epoch[2] Batch[405] Speed: 1.2194818900512072 samples/sec                   batch loss = 980.0953900814056 | accuracy = 0.6722222222222223


Epoch[2] Batch[410] Speed: 1.2365272082601342 samples/sec                   batch loss = 994.7505950927734 | accuracy = 0.6695121951219513


Epoch[2] Batch[415] Speed: 1.2441440996500839 samples/sec                   batch loss = 1005.9922885894775 | accuracy = 0.6698795180722892


Epoch[2] Batch[420] Speed: 1.2462077206985627 samples/sec                   batch loss = 1019.3566071987152 | accuracy = 0.669047619047619


Epoch[2] Batch[425] Speed: 1.2441207578768396 samples/sec                   batch loss = 1028.0706828832626 | accuracy = 0.6705882352941176


Epoch[2] Batch[430] Speed: 1.2549552778803386 samples/sec                   batch loss = 1038.6041584014893 | accuracy = 0.6709302325581395


Epoch[2] Batch[435] Speed: 1.2438670072175535 samples/sec                   batch loss = 1048.3803210258484 | accuracy = 0.6729885057471264


Epoch[2] Batch[440] Speed: 1.2449683578635766 samples/sec                   batch loss = 1058.9971115589142 | accuracy = 0.6744318181818182


Epoch[2] Batch[445] Speed: 1.236534499141063 samples/sec                   batch loss = 1072.337368130684 | accuracy = 0.6747191011235955


Epoch[2] Batch[450] Speed: 1.2472859594298644 samples/sec                   batch loss = 1082.1302456855774 | accuracy = 0.6761111111111111


Epoch[2] Batch[455] Speed: 1.2427301371103983 samples/sec                   batch loss = 1092.7819582223892 | accuracy = 0.676923076923077


Epoch[2] Batch[460] Speed: 1.2400534242758194 samples/sec                   batch loss = 1104.2885392904282 | accuracy = 0.6766304347826086


Epoch[2] Batch[465] Speed: 1.24014847883263 samples/sec                   batch loss = 1119.5519616603851 | accuracy = 0.6752688172043011


Epoch[2] Batch[470] Speed: 1.2409477147388104 samples/sec                   batch loss = 1132.7094848155975 | accuracy = 0.674468085106383


Epoch[2] Batch[475] Speed: 1.24127135268869 samples/sec                   batch loss = 1140.544673204422 | accuracy = 0.6763157894736842


Epoch[2] Batch[480] Speed: 1.2392514949382063 samples/sec                   batch loss = 1154.6794862747192 | accuracy = 0.6755208333333333


Epoch[2] Batch[485] Speed: 1.242413558506691 samples/sec                   batch loss = 1167.1719096899033 | accuracy = 0.6752577319587629


Epoch[2] Batch[490] Speed: 1.239709168283864 samples/sec                   batch loss = 1176.6566338539124 | accuracy = 0.6770408163265306


Epoch[2] Batch[495] Speed: 1.2425014297062218 samples/sec                   batch loss = 1186.8021194934845 | accuracy = 0.6772727272727272


Epoch[2] Batch[500] Speed: 1.2377352944930304 samples/sec                   batch loss = 1197.6712574958801 | accuracy = 0.677


Epoch[2] Batch[505] Speed: 1.243238564753392 samples/sec                   batch loss = 1209.902864933014 | accuracy = 0.6777227722772278


Epoch[2] Batch[510] Speed: 1.243903896556454 samples/sec                   batch loss = 1223.3660905361176 | accuracy = 0.6764705882352942


Epoch[2] Batch[515] Speed: 1.234242842189919 samples/sec                   batch loss = 1233.0989699959755 | accuracy = 0.6766990291262136


Epoch[2] Batch[520] Speed: 1.214973095974035 samples/sec                   batch loss = 1242.7335680127144 | accuracy = 0.6783653846153846


Epoch[2] Batch[525] Speed: 1.2458616124510595 samples/sec                   batch loss = 1257.4738023877144 | accuracy = 0.6766666666666666


Epoch[2] Batch[530] Speed: 1.242536121549381 samples/sec                   batch loss = 1271.3412753343582 | accuracy = 0.6754716981132075


Epoch[2] Batch[535] Speed: 1.2476239528221729 samples/sec                   batch loss = 1287.5661388635635 | accuracy = 0.6752336448598131


Epoch[2] Batch[540] Speed: 1.2527829950469525 samples/sec                   batch loss = 1298.642008781433 | accuracy = 0.675


Epoch[2] Batch[545] Speed: 1.242377677507137 samples/sec                   batch loss = 1312.456124305725 | accuracy = 0.6743119266055045


Epoch[2] Batch[550] Speed: 1.2417773003929937 samples/sec                   batch loss = 1325.585344672203 | accuracy = 0.6731818181818182


Epoch[2] Batch[555] Speed: 1.2417659954417108 samples/sec                   batch loss = 1335.0611040592194 | accuracy = 0.6752252252252252


Epoch[2] Batch[560] Speed: 1.2487887604652725 samples/sec                   batch loss = 1344.7332532405853 | accuracy = 0.6763392857142857


Epoch[2] Batch[565] Speed: 1.2437409544160019 samples/sec                   batch loss = 1357.1537120342255 | accuracy = 0.6761061946902654


Epoch[2] Batch[570] Speed: 1.2409898469741711 samples/sec                   batch loss = 1367.0983288288116 | accuracy = 0.6767543859649123


Epoch[2] Batch[575] Speed: 1.2485713839190362 samples/sec                   batch loss = 1375.7352652549744 | accuracy = 0.6791304347826087


Epoch[2] Batch[580] Speed: 1.2331170536930047 samples/sec                   batch loss = 1388.2852269411087 | accuracy = 0.6801724137931034


Epoch[2] Batch[585] Speed: 1.241712138906551 samples/sec                   batch loss = 1400.6305505037308 | accuracy = 0.6786324786324787


Epoch[2] Batch[590] Speed: 1.2404239168278266 samples/sec                   batch loss = 1411.2979415655136 | accuracy = 0.6792372881355933


Epoch[2] Batch[595] Speed: 1.2390379747658102 samples/sec                   batch loss = 1422.7913571596146 | accuracy = 0.6794117647058824


Epoch[2] Batch[600] Speed: 1.237144047298895 samples/sec                   batch loss = 1436.1827721595764 | accuracy = 0.6795833333333333


Epoch[2] Batch[605] Speed: 1.2482111460688068 samples/sec                   batch loss = 1447.9727492332458 | accuracy = 0.6793388429752066


Epoch[2] Batch[610] Speed: 1.2408875045708923 samples/sec                   batch loss = 1459.4476836919785 | accuracy = 0.6782786885245902


Epoch[2] Batch[615] Speed: 1.2405939719163666 samples/sec                   batch loss = 1471.012971162796 | accuracy = 0.6776422764227642


Epoch[2] Batch[620] Speed: 1.2369490359523083 samples/sec                   batch loss = 1481.6455672979355 | accuracy = 0.6782258064516129


Epoch[2] Batch[625] Speed: 1.2412775975768726 samples/sec                   batch loss = 1492.0903805494308 | accuracy = 0.6788


Epoch[2] Batch[630] Speed: 1.2441893095279484 samples/sec                   batch loss = 1502.0238338708878 | accuracy = 0.678968253968254


Epoch[2] Batch[635] Speed: 1.2510158547459322 samples/sec                   batch loss = 1511.3510156869888 | accuracy = 0.6799212598425197


Epoch[2] Batch[640] Speed: 1.2405277421155188 samples/sec                   batch loss = 1520.4581052064896 | accuracy = 0.68125


Epoch[2] Batch[645] Speed: 1.2393848794184237 samples/sec                   batch loss = 1532.8708893060684 | accuracy = 0.6802325581395349


Epoch[2] Batch[650] Speed: 1.2427151327970352 samples/sec                   batch loss = 1547.2557901144028 | accuracy = 0.68


Epoch[2] Batch[655] Speed: 1.2445061476417627 samples/sec                   batch loss = 1557.1813102960587 | accuracy = 0.6809160305343511


Epoch[2] Batch[660] Speed: 1.2434656082460045 samples/sec                   batch loss = 1566.4672607183456 | accuracy = 0.6814393939393939


Epoch[2] Batch[665] Speed: 1.2396706953290428 samples/sec                   batch loss = 1576.1812080144882 | accuracy = 0.6819548872180451


Epoch[2] Batch[670] Speed: 1.233204521216365 samples/sec                   batch loss = 1587.1462055444717 | accuracy = 0.6817164179104478


Epoch[2] Batch[675] Speed: 1.2364315234170202 samples/sec                   batch loss = 1599.8547592163086 | accuracy = 0.6811111111111111


Epoch[2] Batch[680] Speed: 1.2384179663293717 samples/sec                   batch loss = 1611.8392852544785 | accuracy = 0.680514705882353


Epoch[2] Batch[685] Speed: 1.2425295879242224 samples/sec                   batch loss = 1624.0694352388382 | accuracy = 0.6810218978102189


Epoch[2] Batch[690] Speed: 1.2393212504329665 samples/sec                   batch loss = 1641.3028038740158 | accuracy = 0.678623188405797


Epoch[2] Batch[695] Speed: 1.242420182917857 samples/sec                   batch loss = 1651.1782191991806 | accuracy = 0.6794964028776979


Epoch[2] Batch[700] Speed: 1.2455465805557664 samples/sec                   batch loss = 1665.1505690813065 | accuracy = 0.6789285714285714


Epoch[2] Batch[705] Speed: 1.2468775324435186 samples/sec                   batch loss = 1677.3807190656662 | accuracy = 0.6787234042553192


Epoch[2] Batch[710] Speed: 1.2455745070700681 samples/sec                   batch loss = 1688.5993914604187 | accuracy = 0.6785211267605634


Epoch[2] Batch[715] Speed: 1.242187818719767 samples/sec                   batch loss = 1700.522240281105 | accuracy = 0.6783216783216783


Epoch[2] Batch[720] Speed: 1.2400361931874204 samples/sec                   batch loss = 1714.0199192762375 | accuracy = 0.6770833333333334


Epoch[2] Batch[725] Speed: 1.2395143548202872 samples/sec                   batch loss = 1726.8171286582947 | accuracy = 0.6772413793103448


Epoch[2] Batch[730] Speed: 1.2432546872662258 samples/sec                   batch loss = 1739.993864417076 | accuracy = 0.6767123287671233


Epoch[2] Batch[735] Speed: 1.240700669971215 samples/sec                   batch loss = 1752.506270647049 | accuracy = 0.6768707482993197


Epoch[2] Batch[740] Speed: 1.2436893234830828 samples/sec                   batch loss = 1763.660361289978 | accuracy = 0.677027027027027


Epoch[2] Batch[745] Speed: 1.246544944659229 samples/sec                   batch loss = 1773.7258747816086 | accuracy = 0.6778523489932886


Epoch[2] Batch[750] Speed: 1.242554526496688 samples/sec                   batch loss = 1784.902359366417 | accuracy = 0.6776666666666666


Epoch[2] Batch[755] Speed: 1.2452717284203987 samples/sec                   batch loss = 1796.3886115550995 | accuracy = 0.6778145695364238


Epoch[2] Batch[760] Speed: 1.24207948540059 samples/sec                   batch loss = 1810.078531742096 | accuracy = 0.6769736842105263


Epoch[2] Batch[765] Speed: 1.24013216177821 samples/sec                   batch loss = 1822.200966000557 | accuracy = 0.6764705882352942


Epoch[2] Batch[770] Speed: 1.2421353970121465 samples/sec                   batch loss = 1832.8373454809189 | accuracy = 0.6766233766233766


Epoch[2] Batch[775] Speed: 1.2395960462884932 samples/sec                   batch loss = 1846.2803509235382 | accuracy = 0.6767741935483871


Epoch[2] Batch[780] Speed: 1.2431357590210468 samples/sec                   batch loss = 1857.6967873573303 | accuracy = 0.6772435897435898


Epoch[2] Batch[785] Speed: 1.2294731485554078 samples/sec                   batch loss = 1871.8175737857819 | accuracy = 0.6761146496815287


[Epoch 2] training: accuracy=0.6763959390862944
[Epoch 2] time cost: 651.161759853363
[Epoch 2] validation: validation accuracy=0.7066666666666667
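To compare epochs at a glance, you can pull the per-epoch summary lines out of a log like the one above. The following is a minimal sketch that uses only the Python standard library. The helper name `summarize_log` and the regular expression are assumptions for illustration and are not part of MXNet. The sketch also assumes the log has been captured as a string in exactly the format printed above.

```python
import re

# Hypothetical helper, not part of MXNet: matches only the per-epoch summary
# lines, e.g. "[Epoch 2] validation: validation accuracy=0.7066666666666667".
# Per-batch lines start with "Epoch[" and are therefore ignored.
SUMMARY_RE = re.compile(
    r"^\[Epoch (\d+)\] (training|time cost|validation): (?:[a-z ]+=)?([0-9.]+)$"
)


def summarize_log(text):
    """Return {epoch: {"training": ..., "time cost": ..., "validation": ...}}."""
    summary = {}
    for line in text.splitlines():
        match = SUMMARY_RE.match(line.strip())
        if match is None:
            continue  # skip blank lines and per-batch progress lines
        epoch, key, value = match.groups()
        summary.setdefault(int(epoch), {})[key] = float(value)
    return summary


# Usage with the epoch 2 summary lines copied from the log above.
sample_log = """
[Epoch 2] training: accuracy=0.6763959390862944
[Epoch 2] time cost: 651.161759853363
[Epoch 2] validation: validation accuracy=0.7066666666666667
"""

for epoch, stats in sorted(summarize_log(sample_log).items()):
    print(epoch, stats)
```

In practice you would more likely collect these values in a list inside the training loop; parsing is only convenient when the printed log is all you have.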


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).