<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and deploying neural networks on GPUs can make training and inference faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one Nvidia GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Number of GPUs that MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the context, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[21:36:13] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the printed array shows which GPU it is allocated on (for example, `ctx=gpu(0)`).

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If only one GPU is available, the code falls back to `npx.cpu()` instead, which is what the output below shows:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[21:36:13] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through `npx` using zero-based indices. As you saw before, the first GPU is accessed with `npx.gpu(0)` and the second with `npx.gpu(1)`. This extends to however many GPUs your machine has, so on a machine with eight GPUs the last one is `npx.gpu(7)`. This lets you select which GPUs to use for operations and training, which is particularly useful when you want to use multiple GPUs to train neural networks.
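
The index arithmetic above can be sketched in plain Python, so that it runs without a GPU. The helper name `select_gpu_ids` is hypothetical and not part of MXNet; it only illustrates how a requested GPU count maps to zero-based ids.

```python
def select_gpu_ids(num_available, num_requested):
    """Return the 0-based ids of the GPUs to use.

    Hypothetical helper (not an MXNet API): it takes the first
    `num_requested` ids and never returns more ids than there are GPUs.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs the valid ids run from 0 to 7.
all_ids = select_gpu_ids(num_available=8, num_requested=8)
print(all_ids[0], all_ids[-1])

# Asking for two GPUs on a single-GPU machine yields only id 0.
print(select_gpu_ids(num_available=1, num_requested=2))
```

With MXNet installed, each returned id `i` would be passed to `npx.gpu(i)`, and `npx.num_gpus()` would supply the number of available GPUs.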

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and the parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[21:36:14] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 3.6910949, -1.6766056]], ctx=gpu(0))
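
The size of the tensor that reaches `nn.Flatten` for this 128×128 input can be checked by hand. The sketch below assumes the Gluon defaults that `conv_block` relies on (convolution with stride 1 and no padding, pooling with no padding and the output size rounded down); the function names are illustrative only and are not part of MXNet.

```python
def window_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window (rounded down)."""
    return (size + 2 * padding - kernel) // stride + 1


def leaf_feature_size(size=128, channels=(32, 64, 128)):
    """Trace one spatial dimension through the three convolutional blocks."""
    for _ in channels:
        size = window_out(size, kernel=2)            # Conv2D(kernel_size=2)
        size = window_out(size, kernel=4, stride=2)  # MaxPool2D(pool_size=4, strides=2)
    # Batch normalization keeps the shape, so it is not traced here.
    return size, channels[-1] * size * size


side, flat = leaf_feature_size()
print(side, flat)
```

Under these assumptions the third block produces a 13×13 map with 128 channels, so the first dense block would receive 21632 features per sample.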

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
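
The idea can be illustrated without any GPU. The following sketch uses NumPy on the CPU as a stand-in for real devices; the linear model and squared-error loss are assumptions chosen for brevity and are not the model used in this tutorial. It shows that summing the gradients computed on the separate parts gives the same result as computing the gradient on the whole batch.

```python
import numpy as np

rng = np.random.default_rng(0)
n_devices = 2                      # stand-in for the number of GPUs
X = rng.normal(size=(4, 3))        # one batch: 4 samples, 3 features
y = rng.normal(size=(4,))
w = rng.normal(size=(3,))          # parameters, replicated on every device


def grad_sum(X_part, y_part, w):
    """Gradient of the summed loss 0.5 * (X w - y)**2 with respect to w."""
    residual = X_part @ w - y_part
    return X_part.T @ residual


# Split the batch into one part per device.
X_parts = np.array_split(X, n_devices)
y_parts = np.array_split(y, n_devices)

# Each "device" computes the gradient on its own part...
part_grads = [grad_sum(Xp, yp, w) for Xp, yp in zip(X_parts, y_parts)]
# ...and the per-device gradients are summed before the parameter update.
combined = sum(part_grads)

full = grad_sum(X, y, w)
print(np.allclose(combined, full))
```

Because the loss is a sum over samples, the split only changes where the work happens, not the gradient used for the update.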

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations on the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image values in the range (0, 1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
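
The `test` function passes per-device lists of labels and outputs to the metric, which keeps running totals across them. The plain-Python sketch below mimics that bookkeeping with predicted class ids instead of network outputs; `accuracy_over_shards` is a hypothetical name, not a Gluon API.

```python
def accuracy_over_shards(label_shards, prediction_shards):
    """Accumulate correct and total counts over per-device shards."""
    correct = 0
    total = 0
    for labels, preds in zip(label_shards, prediction_shards):
        correct += sum(int(l == p) for l, p in zip(labels, preds))
        total += len(labels)
    return correct / total if total else 0.0


# A batch of 4 samples split across two devices (2 samples each).
label_shards = [[0, 1], [1, 1]]
prediction_shards = [[0, 0], [1, 1]]   # predicted class ids per device
print(accuracy_over_shards(label_shards, prediction_shards))
```

The result is the same as if the batch had been evaluated on a single device, because only the counts are combined.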

The training loop is quite similar to the one shown earlier. The major differences are marked with `Diff` comments in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7699976395977375 samples/sec                   batch loss = 14.416633367538452 | accuracy = 0.4


Epoch[1] Batch[10] Speed: 1.2558457230197069 samples/sec                   batch loss = 28.218145608901978 | accuracy = 0.45


Epoch[1] Batch[15] Speed: 1.2557594320151284 samples/sec                   batch loss = 42.10918712615967 | accuracy = 0.48333333333333334


Epoch[1] Batch[20] Speed: 1.2535720965034105 samples/sec                   batch loss = 56.90426516532898 | accuracy = 0.475


Epoch[1] Batch[25] Speed: 1.2586125552506693 samples/sec                   batch loss = 70.0362491607666 | accuracy = 0.51


Epoch[1] Batch[30] Speed: 1.2653603888845815 samples/sec                   batch loss = 84.6483223438263 | accuracy = 0.5


Epoch[1] Batch[35] Speed: 1.2531152700328927 samples/sec                   batch loss = 98.22530341148376 | accuracy = 0.5


Epoch[1] Batch[40] Speed: 1.2610361604622862 samples/sec                   batch loss = 111.68948769569397 | accuracy = 0.525


Epoch[1] Batch[45] Speed: 1.2573273014575201 samples/sec                   batch loss = 125.24406576156616 | accuracy = 0.5277777777777778


Epoch[1] Batch[50] Speed: 1.2571187168008078 samples/sec                   batch loss = 139.654048204422 | accuracy = 0.53


Epoch[1] Batch[55] Speed: 1.2620552482264542 samples/sec                   batch loss = 153.64407873153687 | accuracy = 0.5272727272727272


Epoch[1] Batch[60] Speed: 1.2561344799047276 samples/sec                   batch loss = 168.19084882736206 | accuracy = 0.5333333333333333


Epoch[1] Batch[65] Speed: 1.2567891170670094 samples/sec                   batch loss = 182.24214887619019 | accuracy = 0.5307692307692308


Epoch[1] Batch[70] Speed: 1.2567624741264725 samples/sec                   batch loss = 196.00545740127563 | accuracy = 0.525


Epoch[1] Batch[75] Speed: 1.2515092401399506 samples/sec                   batch loss = 210.03429865837097 | accuracy = 0.5266666666666666


Epoch[1] Batch[80] Speed: 1.2539781734219695 samples/sec                   batch loss = 223.40359926223755 | accuracy = 0.53125


Epoch[1] Batch[85] Speed: 1.246090077850475 samples/sec                   batch loss = 237.68634152412415 | accuracy = 0.5235294117647059


Epoch[1] Batch[90] Speed: 1.2561151061900364 samples/sec                   batch loss = 251.30895113945007 | accuracy = 0.525


Epoch[1] Batch[95] Speed: 1.245698527017767 samples/sec                   batch loss = 264.7777969837189 | accuracy = 0.5236842105263158


Epoch[1] Batch[100] Speed: 1.2536848796729836 samples/sec                   batch loss = 277.8605523109436 | accuracy = 0.5325


Epoch[1] Batch[105] Speed: 1.25401978914471 samples/sec                   batch loss = 292.28744530677795 | accuracy = 0.5285714285714286


Epoch[1] Batch[110] Speed: 1.2505296625961309 samples/sec                   batch loss = 305.7731308937073 | accuracy = 0.5295454545454545


Epoch[1] Batch[115] Speed: 1.2539648644719636 samples/sec                   batch loss = 319.33072781562805 | accuracy = 0.532608695652174


Epoch[1] Batch[120] Speed: 1.2544034587204742 samples/sec                   batch loss = 333.37473154067993 | accuracy = 0.53125


Epoch[1] Batch[125] Speed: 1.2529150043620327 samples/sec                   batch loss = 347.43294501304626 | accuracy = 0.532


Epoch[1] Batch[130] Speed: 1.2574641344022004 samples/sec                   batch loss = 361.6797983646393 | accuracy = 0.5269230769230769


Epoch[1] Batch[135] Speed: 1.2471433596510868 samples/sec                   batch loss = 376.22634387016296 | accuracy = 0.5203703703703704


Epoch[1] Batch[140] Speed: 1.2579995013656209 samples/sec                   batch loss = 389.90876483917236 | accuracy = 0.5232142857142857


Epoch[1] Batch[145] Speed: 1.260321230914665 samples/sec                   batch loss = 403.61560583114624 | accuracy = 0.5241379310344828


Epoch[1] Batch[150] Speed: 1.2495566428717462 samples/sec                   batch loss = 417.6324098110199 | accuracy = 0.5216666666666666


Epoch[1] Batch[155] Speed: 1.243178500557967 samples/sec                   batch loss = 430.9324200153351 | accuracy = 0.5209677419354839


Epoch[1] Batch[160] Speed: 1.2480090097696341 samples/sec                   batch loss = 444.8588216304779 | accuracy = 0.51875


Epoch[1] Batch[165] Speed: 1.245908057520897 samples/sec                   batch loss = 459.05465936660767 | accuracy = 0.5151515151515151


Epoch[1] Batch[170] Speed: 1.2481686149412572 samples/sec                   batch loss = 473.3921377658844 | accuracy = 0.5117647058823529


Epoch[1] Batch[175] Speed: 1.2499399139558909 samples/sec                   batch loss = 486.721120595932 | accuracy = 0.5171428571428571


Epoch[1] Batch[180] Speed: 1.2474926850073564 samples/sec                   batch loss = 500.41634011268616 | accuracy = 0.5166666666666667


Epoch[1] Batch[185] Speed: 1.2539917638385034 samples/sec                   batch loss = 514.5723838806152 | accuracy = 0.5148648648648648


Epoch[1] Batch[190] Speed: 1.2585097402163885 samples/sec                   batch loss = 527.9100649356842 | accuracy = 0.5197368421052632


Epoch[1] Batch[195] Speed: 1.2552324495668954 samples/sec                   batch loss = 542.1451044082642 | accuracy = 0.517948717948718


Epoch[1] Batch[200] Speed: 1.2470963589373183 samples/sec                   batch loss = 555.9987206459045 | accuracy = 0.52


Epoch[1] Batch[205] Speed: 1.257609858418955 samples/sec                   batch loss = 570.232931137085 | accuracy = 0.5170731707317073


Epoch[1] Batch[210] Speed: 1.2490883231884524 samples/sec                   batch loss = 584.2500364780426 | accuracy = 0.5166666666666667


Epoch[1] Batch[215] Speed: 1.2549474865431671 samples/sec                   batch loss = 597.7812435626984 | accuracy = 0.5209302325581395


Epoch[1] Batch[220] Speed: 1.255539810060234 samples/sec                   batch loss = 611.6490738391876 | accuracy = 0.5204545454545455


Epoch[1] Batch[225] Speed: 1.2498986616936583 samples/sec                   batch loss = 625.3743455410004 | accuracy = 0.5211111111111111


Epoch[1] Batch[230] Speed: 1.2590937148565067 samples/sec                   batch loss = 639.4753425121307 | accuracy = 0.5163043478260869


Epoch[1] Batch[235] Speed: 1.2605300273816324 samples/sec                   batch loss = 653.5447955131531 | accuracy = 0.5117021276595745


Epoch[1] Batch[240] Speed: 1.2643299559521164 samples/sec                   batch loss = 667.1035187244415 | accuracy = 0.5145833333333333


Epoch[1] Batch[245] Speed: 1.2516264145054001 samples/sec                   batch loss = 680.7842466831207 | accuracy = 0.513265306122449


Epoch[1] Batch[250] Speed: 1.2544082420072786 samples/sec                   batch loss = 695.0189921855927 | accuracy = 0.509


Epoch[1] Batch[255] Speed: 1.2562910903623115 samples/sec                   batch loss = 708.3138294219971 | accuracy = 0.5107843137254902


Epoch[1] Batch[260] Speed: 1.245978656958204 samples/sec                   batch loss = 722.0259544849396 | accuracy = 0.5134615384615384


Epoch[1] Batch[265] Speed: 1.2476549416226668 samples/sec                   batch loss = 735.8140556812286 | accuracy = 0.5132075471698113


Epoch[1] Batch[270] Speed: 1.251102054817505 samples/sec                   batch loss = 749.4876117706299 | accuracy = 0.5138888888888888


Epoch[1] Batch[275] Speed: 1.2477091293323068 samples/sec                   batch loss = 763.4052908420563 | accuracy = 0.5163636363636364


Epoch[1] Batch[280] Speed: 1.2476829627615642 samples/sec                   batch loss = 777.0913696289062 | accuracy = 0.5142857142857142


Epoch[1] Batch[285] Speed: 1.2562414222773057 samples/sec                   batch loss = 789.8873314857483 | accuracy = 0.519298245614035


Epoch[1] Batch[290] Speed: 1.2540510027495033 samples/sec                   batch loss = 803.9122006893158 | accuracy = 0.521551724137931


Epoch[1] Batch[295] Speed: 1.2492222524444851 samples/sec                   batch loss = 817.0682077407837 | accuracy = 0.523728813559322


Epoch[1] Batch[300] Speed: 1.2441126392044901 samples/sec                   batch loss = 831.2370219230652 | accuracy = 0.5208333333333334


Epoch[1] Batch[305] Speed: 1.251642848680663 samples/sec                   batch loss = 845.1659314632416 | accuracy = 0.521311475409836


Epoch[1] Batch[310] Speed: 1.2523883493029393 samples/sec                   batch loss = 858.4629917144775 | accuracy = 0.5233870967741936


Epoch[1] Batch[315] Speed: 1.2576300324511895 samples/sec                   batch loss = 872.0582818984985 | accuracy = 0.5253968253968254


Epoch[1] Batch[320] Speed: 1.257339174210767 samples/sec                   batch loss = 884.727255821228 | accuracy = 0.528125


Epoch[1] Batch[325] Speed: 1.247553166816093 samples/sec                   batch loss = 899.2604854106903 | accuracy = 0.5230769230769231


Epoch[1] Batch[330] Speed: 1.255386768556936 samples/sec                   batch loss = 912.4037940502167 | accuracy = 0.5265151515151515


Epoch[1] Batch[335] Speed: 1.2512983815595733 samples/sec                   batch loss = 925.4428839683533 | accuracy = 0.5283582089552239


Epoch[1] Batch[340] Speed: 1.2534849001730215 samples/sec                   batch loss = 939.5665493011475 | accuracy = 0.5279411764705882


Epoch[1] Batch[345] Speed: 1.2573776208551273 samples/sec                   batch loss = 953.8695032596588 | accuracy = 0.5289855072463768


Epoch[1] Batch[350] Speed: 1.2554300748593954 samples/sec                   batch loss = 967.4813735485077 | accuracy = 0.53


Epoch[1] Batch[355] Speed: 1.253224600922162 samples/sec                   batch loss = 981.1891491413116 | accuracy = 0.5309859154929577


Epoch[1] Batch[360] Speed: 1.2519551800293893 samples/sec                   batch loss = 994.881157875061 | accuracy = 0.5333333333333333


Epoch[1] Batch[365] Speed: 1.2601552847048998 samples/sec                   batch loss = 1008.892552614212 | accuracy = 0.5321917808219178


Epoch[1] Batch[370] Speed: 1.2563133858366773 samples/sec                   batch loss = 1022.6810522079468 | accuracy = 0.5324324324324324


Epoch[1] Batch[375] Speed: 1.2541467154105863 samples/sec                   batch loss = 1036.2099630832672 | accuracy = 0.5333333333333333


Epoch[1] Batch[380] Speed: 1.2594864526435587 samples/sec                   batch loss = 1049.510442018509 | accuracy = 0.5335526315789474


Epoch[1] Batch[385] Speed: 1.2607935600182463 samples/sec                   batch loss = 1062.845920085907 | accuracy = 0.5357142857142857


Epoch[1] Batch[390] Speed: 1.2579089529169323 samples/sec                   batch loss = 1076.4641454219818 | accuracy = 0.5352564102564102


Epoch[1] Batch[395] Speed: 1.2527072262214374 samples/sec                   batch loss = 1089.954463481903 | accuracy = 0.5360759493670886


Epoch[1] Batch[400] Speed: 1.2571562079465606 samples/sec                   batch loss = 1103.3273394107819 | accuracy = 0.538125


Epoch[1] Batch[405] Speed: 1.2561066421333338 samples/sec                   batch loss = 1117.2138350009918 | accuracy = 0.5376543209876543


Epoch[1] Batch[410] Speed: 1.254119528154535 samples/sec                   batch loss = 1130.6448321342468 | accuracy = 0.5390243902439025


Epoch[1] Batch[415] Speed: 1.2493195551773715 samples/sec                   batch loss = 1144.6629257202148 | accuracy = 0.5391566265060241


Epoch[1] Batch[420] Speed: 1.2563249572175827 samples/sec                   batch loss = 1158.073191165924 | accuracy = 0.5404761904761904


Epoch[1] Batch[425] Speed: 1.2638972478402266 samples/sec                   batch loss = 1171.5531024932861 | accuracy = 0.5435294117647059


Epoch[1] Batch[430] Speed: 1.2561127550517326 samples/sec                   batch loss = 1184.36381316185 | accuracy = 0.5447674418604651


Epoch[1] Batch[435] Speed: 1.2550454954680328 samples/sec                   batch loss = 1197.8785696029663 | accuracy = 0.5442528735632184


Epoch[1] Batch[440] Speed: 1.2508505778800096 samples/sec                   batch loss = 1211.8703124523163 | accuracy = 0.5431818181818182


Epoch[1] Batch[445] Speed: 1.25397142518666 samples/sec                   batch loss = 1226.003823041916 | accuracy = 0.5432584269662921


Epoch[1] Batch[450] Speed: 1.2505025387252402 samples/sec                   batch loss = 1239.5125601291656 | accuracy = 0.5433333333333333


Epoch[1] Batch[455] Speed: 1.2547753508363404 samples/sec                   batch loss = 1252.7734243869781 | accuracy = 0.5434065934065934


Epoch[1] Batch[460] Speed: 1.253893263420547 samples/sec                   batch loss = 1266.212506532669 | accuracy = 0.5434782608695652


Epoch[1] Batch[465] Speed: 1.2575298285652952 samples/sec                   batch loss = 1280.008585691452 | accuracy = 0.5435483870967742


Epoch[1] Batch[470] Speed: 1.259336134157317 samples/sec                   batch loss = 1293.633306503296 | accuracy = 0.5446808510638298


Epoch[1] Batch[475] Speed: 1.2547151051110454 samples/sec                   batch loss = 1306.9367825984955 | accuracy = 0.5468421052631579


Epoch[1] Batch[480] Speed: 1.2489240200647107 samples/sec                   batch loss = 1320.8278653621674 | accuracy = 0.5458333333333333


Epoch[1] Batch[485] Speed: 1.2522295325020174 samples/sec                   batch loss = 1333.6931347846985 | accuracy = 0.5474226804123712


Epoch[1] Batch[490] Speed: 1.2446145353705371 samples/sec                   batch loss = 1346.9630444049835 | accuracy = 0.548469387755102


Epoch[1] Batch[495] Speed: 1.2522215880461214 samples/sec                   batch loss = 1361.0615012645721 | accuracy = 0.547979797979798


Epoch[1] Batch[500] Speed: 1.2459068547164471 samples/sec                   batch loss = 1374.5580432415009 | accuracy = 0.549


Epoch[1] Batch[505] Speed: 1.2554113804491034 samples/sec                   batch loss = 1388.4081139564514 | accuracy = 0.5504950495049505


Epoch[1] Batch[510] Speed: 1.2588767980145164 samples/sec                   batch loss = 1402.4820408821106 | accuracy = 0.5485294117647059


Epoch[1] Batch[515] Speed: 1.2643490122047258 samples/sec                   batch loss = 1416.3378217220306 | accuracy = 0.5485436893203883


Epoch[1] Batch[520] Speed: 1.2650558330561008 samples/sec                   batch loss = 1429.4980418682098 | accuracy = 0.5495192307692308


Epoch[1] Batch[525] Speed: 1.2509930939389244 samples/sec                   batch loss = 1443.2843351364136 | accuracy = 0.549047619047619


Epoch[1] Batch[530] Speed: 1.2439580356229938 samples/sec                   batch loss = 1456.5337400436401 | accuracy = 0.55


Epoch[1] Batch[535] Speed: 1.2481023167373149 samples/sec                   batch loss = 1470.2981069087982 | accuracy = 0.55


Epoch[1] Batch[540] Speed: 1.2502595017045623 samples/sec                   batch loss = 1484.083263874054 | accuracy = 0.55


Epoch[1] Batch[545] Speed: 1.2572885752256864 samples/sec                   batch loss = 1497.911268234253 | accuracy = 0.55


Epoch[1] Batch[550] Speed: 1.253176204673599 samples/sec                   batch loss = 1512.1605653762817 | accuracy = 0.5490909090909091


Epoch[1] Batch[555] Speed: 1.2514560287485816 samples/sec                   batch loss = 1525.2341153621674 | accuracy = 0.5509009009009009


Epoch[1] Batch[560] Speed: 1.2525855475722536 samples/sec                   batch loss = 1537.5350937843323 | accuracy = 0.5522321428571428


Epoch[1] Batch[565] Speed: 1.2543345271675028 samples/sec                   batch loss = 1551.6539657115936 | accuracy = 0.5517699115044248


Epoch[1] Batch[570] Speed: 1.2555370852382486 samples/sec                   batch loss = 1565.895459651947 | accuracy = 0.5504385964912281


Epoch[1] Batch[575] Speed: 1.2495327253151758 samples/sec                   batch loss = 1578.582978963852 | accuracy = 0.5521739130434783


Epoch[1] Batch[580] Speed: 1.2568287540617737 samples/sec                   batch loss = 1591.9067101478577 | accuracy = 0.5517241379310345


Epoch[1] Batch[585] Speed: 1.2551015476817988 samples/sec                   batch loss = 1604.7325177192688 | accuracy = 0.552991452991453


Epoch[1] Batch[590] Speed: 1.2544987561726832 samples/sec                   batch loss = 1617.7961912155151 | accuracy = 0.5538135593220339


Epoch[1] Batch[595] Speed: 1.2515210032687425 samples/sec                   batch loss = 1631.489862203598 | accuracy = 0.553781512605042


Epoch[1] Batch[600] Speed: 1.2563295670140642 samples/sec                   batch loss = 1644.3225116729736 | accuracy = 0.5541666666666667


Epoch[1] Batch[605] Speed: 1.2559584455811 samples/sec                   batch loss = 1656.9733383655548 | accuracy = 0.5557851239669421


Epoch[1] Batch[610] Speed: 1.264626536327714 samples/sec                   batch loss = 1669.9410772323608 | accuracy = 0.5565573770491803


Epoch[1] Batch[615] Speed: 1.2630195029622646 samples/sec                   batch loss = 1683.1020650863647 | accuracy = 0.5573170731707318


Epoch[1] Batch[620] Speed: 1.245123120288685 samples/sec                   batch loss = 1696.499873161316 | accuracy = 0.5568548387096774


Epoch[1] Batch[625] Speed: 1.2525152260351389 samples/sec                   batch loss = 1710.8364028930664 | accuracy = 0.5576


Epoch[1] Batch[630] Speed: 1.2589192118488308 samples/sec                   batch loss = 1723.241018295288 | accuracy = 0.5583333333333333


Epoch[1] Batch[635] Speed: 1.2515885054970861 samples/sec                   batch loss = 1736.6534039974213 | accuracy = 0.5590551181102362


Epoch[1] Batch[640] Speed: 1.2511249128895825 samples/sec                   batch loss = 1749.8866984844208 | accuracy = 0.559765625


Epoch[1] Batch[645] Speed: 1.2679170040089602 samples/sec                   batch loss = 1763.679086446762 | accuracy = 0.5596899224806201


Epoch[1] Batch[650] Speed: 1.2593711107274488 samples/sec                   batch loss = 1776.8355963230133 | accuracy = 0.5603846153846154


Epoch[1] Batch[655] Speed: 1.2589501974808714 samples/sec                   batch loss = 1790.3411934375763 | accuracy = 0.5606870229007633


Epoch[1] Batch[660] Speed: 1.2552747121030807 samples/sec                   batch loss = 1802.4607903957367 | accuracy = 0.5617424242424243


Epoch[1] Batch[665] Speed: 1.2497199954114704 samples/sec                   batch loss = 1814.4988465309143 | accuracy = 0.562406015037594


Epoch[1] Batch[670] Speed: 1.255333414791404 samples/sec                   batch loss = 1829.1796004772186 | accuracy = 0.5615671641791045


Epoch[1] Batch[675] Speed: 1.2556677960570877 samples/sec                   batch loss = 1842.4970746040344 | accuracy = 0.5618518518518518


Epoch[1] Batch[680] Speed: 1.2518687689513681 samples/sec                   batch loss = 1855.6852989196777 | accuracy = 0.5628676470588235


Epoch[1] Batch[685] Speed: 1.2538177350485635 samples/sec                   batch loss = 1870.1540169715881 | accuracy = 0.5620437956204379


Epoch[1] Batch[690] Speed: 1.2578078558861794 samples/sec                   batch loss = 1883.114837884903 | accuracy = 0.5615942028985508


Epoch[1] Batch[695] Speed: 1.2567987200999133 samples/sec                   batch loss = 1894.696691274643 | accuracy = 0.5633093525179856


Epoch[1] Batch[700] Speed: 1.2558342544687011 samples/sec                   batch loss = 1908.4921159744263 | accuracy = 0.5632142857142857


Epoch[1] Batch[705] Speed: 1.2529906111680453 samples/sec                   batch loss = 1920.3200192451477 | accuracy = 0.5645390070921986


Epoch[1] Batch[710] Speed: 1.2489744129181166 samples/sec                   batch loss = 1932.1104907989502 | accuracy = 0.5654929577464789


Epoch[1] Batch[715] Speed: 1.2557234339361156 samples/sec                   batch loss = 1944.504005908966 | accuracy = 0.5667832167832167


Epoch[1] Batch[720] Speed: 1.2625957677509934 samples/sec                   batch loss = 1957.3903863430023 | accuracy = 0.5673611111111111


Epoch[1] Batch[725] Speed: 1.2567913765909609 samples/sec                   batch loss = 1970.172905921936 | accuracy = 0.5682758620689655


Epoch[1] Batch[730] Speed: 1.2503328315273285 samples/sec                   batch loss = 1984.083503484726 | accuracy = 0.5681506849315069


Epoch[1] Batch[735] Speed: 1.2473612593331562 samples/sec                   batch loss = 1997.5531811714172 | accuracy = 0.5683673469387756


Epoch[1] Batch[740] Speed: 1.2586022635494363 samples/sec                   batch loss = 2010.6273797750473 | accuracy = 0.5695945945945946


Epoch[1] Batch[745] Speed: 1.258466221189712 samples/sec                   batch loss = 2024.0346735715866 | accuracy = 0.5694630872483222


Epoch[1] Batch[750] Speed: 1.257929985404691 samples/sec                   batch loss = 2036.0182238817215 | accuracy = 0.5706666666666667


Epoch[1] Batch[755] Speed: 1.2512775702452008 samples/sec                   batch loss = 2047.7364966869354 | accuracy = 0.5711920529801324


Epoch[1] Batch[760] Speed: 1.2473204553014192 samples/sec                   batch loss = 2060.793578147888 | accuracy = 0.5713815789473684


Epoch[1] Batch[765] Speed: 1.2533297375676213 samples/sec                   batch loss = 2074.353405714035 | accuracy = 0.5718954248366013


Epoch[1] Batch[770] Speed: 1.2521742037970311 samples/sec                   batch loss = 2087.744249343872 | accuracy = 0.5720779220779221


Epoch[1] Batch[775] Speed: 1.257031874042787 samples/sec                   batch loss = 2099.8371307849884 | accuracy = 0.5732258064516129


Epoch[1] Batch[780] Speed: 1.2509497201038833 samples/sec                   batch loss = 2112.7089519500732 | accuracy = 0.573076923076923


Epoch[1] Batch[785] Speed: 1.256127144156048 samples/sec                   batch loss = 2125.217690229416 | accuracy = 0.5735668789808918


[Epoch 1] training: accuracy=0.5736040609137056
[Epoch 1] time cost: 646.6003189086914
[Epoch 1] validation: validation accuracy=0.6588888888888889


Epoch[2] Batch[5] Speed: 1.2603182012710592 samples/sec                   batch loss = 13.339862585067749 | accuracy = 0.6


Epoch[2] Batch[10] Speed: 1.2590422186834447 samples/sec                   batch loss = 24.650697946548462 | accuracy = 0.675


Epoch[2] Batch[15] Speed: 1.2587287026288163 samples/sec                   batch loss = 37.81382775306702 | accuracy = 0.6166666666666667


Epoch[2] Batch[20] Speed: 1.259159201395482 samples/sec                   batch loss = 49.53587985038757 | accuracy = 0.675


Epoch[2] Batch[25] Speed: 1.2610269664781 samples/sec                   batch loss = 61.01367735862732 | accuracy = 0.7


Epoch[2] Batch[30] Speed: 1.261481992551961 samples/sec                   batch loss = 71.78980720043182 | accuracy = 0.7333333333333333


Epoch[2] Batch[35] Speed: 1.2496925343061398 samples/sec                   batch loss = 84.22361886501312 | accuracy = 0.7142857142857143


Epoch[2] Batch[40] Speed: 1.2549862564429703 samples/sec                   batch loss = 95.63807678222656 | accuracy = 0.71875


Epoch[2] Batch[45] Speed: 1.2501583262624578 samples/sec                   batch loss = 109.54416227340698 | accuracy = 0.7111111111111111


Epoch[2] Batch[50] Speed: 1.2626765387529157 samples/sec                   batch loss = 122.12033450603485 | accuracy = 0.71


Epoch[2] Batch[55] Speed: 1.254978652472333 samples/sec                   batch loss = 136.43047761917114 | accuracy = 0.7045454545454546


Epoch[2] Batch[60] Speed: 1.2599055484769022 samples/sec                   batch loss = 149.18159174919128 | accuracy = 0.6958333333333333


Epoch[2] Batch[65] Speed: 1.2507237584357722 samples/sec                   batch loss = 160.61046409606934 | accuracy = 0.7076923076923077


Epoch[2] Batch[70] Speed: 1.2600008321290743 samples/sec                   batch loss = 173.50469732284546 | accuracy = 0.7


Epoch[2] Batch[75] Speed: 1.2602813733306952 samples/sec                   batch loss = 187.5605342388153 | accuracy = 0.6933333333333334


Epoch[2] Batch[80] Speed: 1.2653559034462234 samples/sec                   batch loss = 200.84383940696716 | accuracy = 0.6875


Epoch[2] Batch[85] Speed: 1.2520369312184079 samples/sec                   batch loss = 212.86575269699097 | accuracy = 0.6911764705882353


Epoch[2] Batch[90] Speed: 1.2592754496914789 samples/sec                   batch loss = 223.78035640716553 | accuracy = 0.7


Epoch[2] Batch[95] Speed: 1.2560267094455406 samples/sec                   batch loss = 234.78444576263428 | accuracy = 0.7026315789473684


Epoch[2] Batch[100] Speed: 1.2572538084513016 samples/sec                   batch loss = 245.57312881946564 | accuracy = 0.71


Epoch[2] Batch[105] Speed: 1.2572849948186466 samples/sec                   batch loss = 256.32514810562134 | accuracy = 0.7142857142857143


Epoch[2] Batch[110] Speed: 1.255122016907309 samples/sec                   batch loss = 270.2275650501251 | accuracy = 0.7068181818181818


Epoch[2] Batch[115] Speed: 1.253598791682912 samples/sec                   batch loss = 281.86260306835175 | accuracy = 0.7043478260869566


Epoch[2] Batch[120] Speed: 1.26228627293954 samples/sec                   batch loss = 295.77724635601044 | accuracy = 0.6958333333333333


Epoch[2] Batch[125] Speed: 1.2597124705949487 samples/sec                   batch loss = 308.4854737520218 | accuracy = 0.696


Epoch[2] Batch[130] Speed: 1.253410638321239 samples/sec                   batch loss = 320.342667222023 | accuracy = 0.6942307692307692


Epoch[2] Batch[135] Speed: 1.2552646627532522 samples/sec                   batch loss = 334.30538618564606 | accuracy = 0.6888888888888889


Epoch[2] Batch[140] Speed: 1.2575960951982816 samples/sec                   batch loss = 348.2264121770859 | accuracy = 0.6821428571428572


Epoch[2] Batch[145] Speed: 1.2545359035895924 samples/sec                   batch loss = 359.64061737060547 | accuracy = 0.6844827586206896


Epoch[2] Batch[150] Speed: 1.2622546480384746 samples/sec                   batch loss = 369.87384736537933 | accuracy = 0.6866666666666666


Epoch[2] Batch[155] Speed: 1.2540636573562778 samples/sec                   batch loss = 382.6199184656143 | accuracy = 0.6887096774193548


Epoch[2] Batch[160] Speed: 1.2530451697661278 samples/sec                   batch loss = 395.3402998447418 | accuracy = 0.6890625


Epoch[2] Batch[165] Speed: 1.2613817429429282 samples/sec                   batch loss = 408.8494119644165 | accuracy = 0.6848484848484848


Epoch[2] Batch[170] Speed: 1.2624377715434674 samples/sec                   batch loss = 422.1427574157715 | accuracy = 0.6823529411764706


Epoch[2] Batch[175] Speed: 1.2580983649731088 samples/sec                   batch loss = 436.53117656707764 | accuracy = 0.6785714285714286


Epoch[2] Batch[180] Speed: 1.2515542399705186 samples/sec                   batch loss = 450.10135424137115 | accuracy = 0.675


Epoch[2] Batch[185] Speed: 1.2511192216180358 samples/sec                   batch loss = 462.37704253196716 | accuracy = 0.672972972972973


Epoch[2] Batch[190] Speed: 1.257178816753466 samples/sec                   batch loss = 475.31589806079865 | accuracy = 0.6723684210526316


Epoch[2] Batch[195] Speed: 1.2569216895501085 samples/sec                   batch loss = 490.9780203104019 | accuracy = 0.6692307692307692


Epoch[2] Batch[200] Speed: 1.2509624987492027 samples/sec                   batch loss = 503.5408630371094 | accuracy = 0.66875


Epoch[2] Batch[205] Speed: 1.2466854621939345 samples/sec                   batch loss = 513.8277606964111 | accuracy = 0.6731707317073171


Epoch[2] Batch[210] Speed: 1.2603521909202022 samples/sec                   batch loss = 525.0240786075592 | accuracy = 0.6761904761904762


Epoch[2] Batch[215] Speed: 1.2604491519706313 samples/sec                   batch loss = 537.6719959974289 | accuracy = 0.6755813953488372


Epoch[2] Batch[220] Speed: 1.264487091913846 samples/sec                   batch loss = 549.1962629556656 | accuracy = 0.6738636363636363


Epoch[2] Batch[225] Speed: 1.250181522499848 samples/sec                   batch loss = 561.8530541658401 | accuracy = 0.6733333333333333


Epoch[2] Batch[230] Speed: 1.2475162462416798 samples/sec                   batch loss = 574.3035998344421 | accuracy = 0.6728260869565217


Epoch[2] Batch[235] Speed: 1.254782483112036 samples/sec                   batch loss = 585.5733470916748 | accuracy = 0.675531914893617


Epoch[2] Batch[240] Speed: 1.250912504952661 samples/sec                   batch loss = 598.60116314888 | accuracy = 0.6729166666666667


Epoch[2] Batch[245] Speed: 1.259610232345056 samples/sec                   batch loss = 608.4983403682709 | accuracy = 0.6744897959183673


Epoch[2] Batch[250] Speed: 1.2524185468130662 samples/sec                   batch loss = 621.284175992012 | accuracy = 0.677


Epoch[2] Batch[255] Speed: 1.2546741938670651 samples/sec                   batch loss = 633.1661142110825 | accuracy = 0.6774509803921569


Epoch[2] Batch[260] Speed: 1.2509991571877317 samples/sec                   batch loss = 644.4947077035904 | accuracy = 0.6798076923076923


Epoch[2] Batch[265] Speed: 1.2548933253429688 samples/sec                   batch loss = 657.70936024189 | accuracy = 0.6792452830188679


Epoch[2] Batch[270] Speed: 1.253280209737407 samples/sec                   batch loss = 670.1847997903824 | accuracy = 0.6787037037037037


Epoch[2] Batch[275] Speed: 1.250844329555102 samples/sec                   batch loss = 681.608983874321 | accuracy = 0.6781818181818182


Epoch[2] Batch[280] Speed: 1.2518007694853148 samples/sec                   batch loss = 693.9864802360535 | accuracy = 0.6776785714285715


Epoch[2] Batch[285] Speed: 1.2476250661655905 samples/sec                   batch loss = 705.1318401098251 | accuracy = 0.6798245614035088


Epoch[2] Batch[290] Speed: 1.2570132260783768 samples/sec                   batch loss = 714.7547394037247 | accuracy = 0.6827586206896552


Epoch[2] Batch[295] Speed: 1.2511004687781857 samples/sec                   batch loss = 729.0537180900574 | accuracy = 0.6796610169491526


Epoch[2] Batch[300] Speed: 1.2517434240296577 samples/sec                   batch loss = 740.1830171346664 | accuracy = 0.6808333333333333


Epoch[2] Batch[305] Speed: 1.251585984536096 samples/sec                   batch loss = 750.2454768419266 | accuracy = 0.6811475409836065


Epoch[2] Batch[310] Speed: 1.2550955384946618 samples/sec                   batch loss = 759.9886120557785 | accuracy = 0.6838709677419355


Epoch[2] Batch[315] Speed: 1.2595620035944766 samples/sec                   batch loss = 772.267273068428 | accuracy = 0.6825396825396826


Epoch[2] Batch[320] Speed: 1.2561675858876775 samples/sec                   batch loss = 784.5504416227341 | accuracy = 0.6828125


Epoch[2] Batch[325] Speed: 1.2507655314123218 samples/sec                   batch loss = 797.3730410337448 | accuracy = 0.6823076923076923


Epoch[2] Batch[330] Speed: 1.2457708602673268 samples/sec                   batch loss = 807.8975985050201 | accuracy = 0.6833333333333333


Epoch[2] Batch[335] Speed: 1.2563262742988397 samples/sec                   batch loss = 818.7478011846542 | accuracy = 0.6835820895522388


Epoch[2] Batch[340] Speed: 1.2546331914669588 samples/sec                   batch loss = 828.0194112062454 | accuracy = 0.6852941176470588


Epoch[2] Batch[345] Speed: 1.2662977792561252 samples/sec                   batch loss = 840.5452610254288 | accuracy = 0.6855072463768116


Epoch[2] Batch[350] Speed: 1.2596770021620034 samples/sec                   batch loss = 852.3015223741531 | accuracy = 0.685


Epoch[2] Batch[355] Speed: 1.2593620355388186 samples/sec                   batch loss = 863.5019198656082 | accuracy = 0.6852112676056338


Epoch[2] Batch[360] Speed: 1.2671860222049645 samples/sec                   batch loss = 876.523766875267 | accuracy = 0.6840277777777778


Epoch[2] Batch[365] Speed: 1.2655566340805933 samples/sec                   batch loss = 886.3657995462418 | accuracy = 0.6856164383561644


Epoch[2] Batch[370] Speed: 1.2508404127265427 samples/sec                   batch loss = 900.5399309396744 | accuracy = 0.6844594594594594


Epoch[2] Batch[375] Speed: 1.2519622802807249 samples/sec                   batch loss = 910.2725125551224 | accuracy = 0.6866666666666666


Epoch[2] Batch[380] Speed: 1.2533802993147474 samples/sec                   batch loss = 922.4414321184158 | accuracy = 0.6868421052631579


Epoch[2] Batch[385] Speed: 1.2572025568759373 samples/sec                   batch loss = 936.7268706560135 | accuracy = 0.685064935064935


Epoch[2] Batch[390] Speed: 1.2529006887541867 samples/sec                   batch loss = 949.9441241025925 | accuracy = 0.6846153846153846


Epoch[2] Batch[395] Speed: 1.2542641966570032 samples/sec                   batch loss = 963.7534719705582 | accuracy = 0.6829113924050633


Epoch[2] Batch[400] Speed: 1.253030570430398 samples/sec                   batch loss = 975.9344815015793 | accuracy = 0.681875


Epoch[2] Batch[405] Speed: 1.2650142447049086 samples/sec                   batch loss = 987.9204894304276 | accuracy = 0.6820987654320988


Epoch[2] Batch[410] Speed: 1.2626719772916144 samples/sec                   batch loss = 1000.8035938739777 | accuracy = 0.6817073170731708


Epoch[2] Batch[415] Speed: 1.2539053525531074 samples/sec                   batch loss = 1013.7576532363892 | accuracy = 0.6825301204819277


Epoch[2] Batch[420] Speed: 1.248390588757739 samples/sec                   batch loss = 1025.9053913354874 | accuracy = 0.6827380952380953


Epoch[2] Batch[425] Speed: 1.25071163733542 samples/sec                   batch loss = 1038.853208899498 | accuracy = 0.6835294117647058


Epoch[2] Batch[430] Speed: 1.2488298465374095 samples/sec                   batch loss = 1051.4985965490341 | accuracy = 0.6831395348837209


Epoch[2] Batch[435] Speed: 1.2555172601653906 samples/sec                   batch loss = 1065.5680953264236 | accuracy = 0.6816091954022988


Epoch[2] Batch[440] Speed: 1.2463064060230806 samples/sec                   batch loss = 1078.3108574151993 | accuracy = 0.6806818181818182


Epoch[2] Batch[445] Speed: 1.2541106222407605 samples/sec                   batch loss = 1089.521750330925 | accuracy = 0.6808988764044944


Epoch[2] Batch[450] Speed: 1.249748574860257 samples/sec                   batch loss = 1103.267952799797 | accuracy = 0.6788888888888889


Epoch[2] Batch[455] Speed: 1.2546124567197179 samples/sec                   batch loss = 1116.8805035352707 | accuracy = 0.6785714285714286


Epoch[2] Batch[460] Speed: 1.2580874212899367 samples/sec                   batch loss = 1129.4662209749222 | accuracy = 0.6771739130434783


Epoch[2] Batch[465] Speed: 1.2509801281117796 samples/sec                   batch loss = 1141.9518364667892 | accuracy = 0.6763440860215054


Epoch[2] Batch[470] Speed: 1.246654984705508 samples/sec                   batch loss = 1157.0860072374344 | accuracy = 0.675


Epoch[2] Batch[475] Speed: 1.2436960537010027 samples/sec                   batch loss = 1166.9272201061249 | accuracy = 0.6763157894736842


Epoch[2] Batch[480] Speed: 1.2479411504802618 samples/sec                   batch loss = 1180.0359477996826 | accuracy = 0.675


Epoch[2] Batch[485] Speed: 1.2574526362731853 samples/sec                   batch loss = 1191.1235882043839 | accuracy = 0.6762886597938145


Epoch[2] Batch[490] Speed: 1.2606109131718732 samples/sec                   batch loss = 1201.0223157405853 | accuracy = 0.6775510204081633


Epoch[2] Batch[495] Speed: 1.2581644084817352 samples/sec                   batch loss = 1211.091460466385 | accuracy = 0.6792929292929293


Epoch[2] Batch[500] Speed: 1.256809358925359 samples/sec                   batch loss = 1225.2724862098694 | accuracy = 0.678


Epoch[2] Batch[505] Speed: 1.2646498912695505 samples/sec                   batch loss = 1237.5653603076935 | accuracy = 0.6772277227722773


Epoch[2] Batch[510] Speed: 1.2638912493476093 samples/sec                   batch loss = 1247.0125819444656 | accuracy = 0.6784313725490196


Epoch[2] Batch[515] Speed: 1.2668964675662973 samples/sec                   batch loss = 1259.4057673215866 | accuracy = 0.6766990291262136


Epoch[2] Batch[520] Speed: 1.2541443716351883 samples/sec                   batch loss = 1272.8428258895874 | accuracy = 0.6778846153846154


Epoch[2] Batch[525] Speed: 1.2457985193180565 samples/sec                   batch loss = 1285.464866399765 | accuracy = 0.6776190476190476


Epoch[2] Batch[530] Speed: 1.2603488770854516 samples/sec                   batch loss = 1301.4267027378082 | accuracy = 0.6764150943396227


Epoch[2] Batch[535] Speed: 1.25584130478263 samples/sec                   batch loss = 1313.091471672058 | accuracy = 0.6771028037383178


Epoch[2] Batch[540] Speed: 1.2571242743922955 samples/sec                   batch loss = 1322.5479307174683 | accuracy = 0.6782407407407407


Epoch[2] Batch[545] Speed: 1.2531865950424137 samples/sec                   batch loss = 1335.0238018035889 | accuracy = 0.6779816513761467


Epoch[2] Batch[550] Speed: 1.2553412109213953 samples/sec                   batch loss = 1348.3818310499191 | accuracy = 0.6772727272727272


Epoch[2] Batch[555] Speed: 1.2549317164218459 samples/sec                   batch loss = 1360.3718551397324 | accuracy = 0.677027027027027


Epoch[2] Batch[560] Speed: 1.254190029956792 samples/sec                   batch loss = 1371.5099779367447 | accuracy = 0.6767857142857143


Epoch[2] Batch[565] Speed: 1.2409066867493184 samples/sec                   batch loss = 1385.5378576517105 | accuracy = 0.6765486725663716


Epoch[2] Batch[570] Speed: 1.2419226287055944 samples/sec                   batch loss = 1395.3806238174438 | accuracy = 0.6771929824561403


Epoch[2] Batch[575] Speed: 1.253259987779061 samples/sec                   batch loss = 1408.371534705162 | accuracy = 0.6765217391304348


Epoch[2] Batch[580] Speed: 1.2519352810699182 samples/sec                   batch loss = 1420.5460114479065 | accuracy = 0.6771551724137931


Epoch[2] Batch[585] Speed: 1.251476939352151 samples/sec                   batch loss = 1431.3220286369324 | accuracy = 0.6773504273504274


Epoch[2] Batch[590] Speed: 1.2498107652785482 samples/sec                   batch loss = 1443.6896584033966 | accuracy = 0.6771186440677966


Epoch[2] Batch[595] Speed: 1.2604211226340831 samples/sec                   batch loss = 1455.5713348388672 | accuracy = 0.6773109243697479


Epoch[2] Batch[600] Speed: 1.2594821033027779 samples/sec                   batch loss = 1465.8714048862457 | accuracy = 0.6779166666666666


Epoch[2] Batch[605] Speed: 1.2648202654512728 samples/sec                   batch loss = 1476.5469448566437 | accuracy = 0.6785123966942149


Epoch[2] Batch[610] Speed: 1.249631193348535 samples/sec                   batch loss = 1488.9615000486374 | accuracy = 0.6782786885245902


Epoch[2] Batch[615] Speed: 1.252258039908876 samples/sec                   batch loss = 1498.533436536789 | accuracy = 0.6796747967479675


Epoch[2] Batch[620] Speed: 1.242751493567768 samples/sec                   batch loss = 1510.7981859445572 | accuracy = 0.6790322580645162


Epoch[2] Batch[625] Speed: 1.2441455758391033 samples/sec                   batch loss = 1520.0712790489197 | accuracy = 0.6804


Epoch[2] Batch[630] Speed: 1.2562800840274846 samples/sec                   batch loss = 1529.5293904542923 | accuracy = 0.6813492063492064


Epoch[2] Batch[635] Speed: 1.252182708398189 samples/sec                   batch loss = 1538.5845860242844 | accuracy = 0.6826771653543308


Epoch[2] Batch[640] Speed: 1.2500149758464185 samples/sec                   batch loss = 1551.015544295311 | accuracy = 0.68203125


Epoch[2] Batch[645] Speed: 1.2570098355988364 samples/sec                   batch loss = 1565.6464215517044 | accuracy = 0.6813953488372093


Epoch[2] Batch[650] Speed: 1.256506928748504 samples/sec                   batch loss = 1577.553832769394 | accuracy = 0.6815384615384615


Epoch[2] Batch[655] Speed: 1.2594911802219564 samples/sec                   batch loss = 1588.8502714633942 | accuracy = 0.6824427480916031


Epoch[2] Batch[660] Speed: 1.258301612445843 samples/sec                   batch loss = 1600.0784356594086 | accuracy = 0.6825757575757576


Epoch[2] Batch[665] Speed: 1.2515410757848084 samples/sec                   batch loss = 1611.5062692165375 | accuracy = 0.6823308270676691


Epoch[2] Batch[670] Speed: 1.252664668740651 samples/sec                   batch loss = 1624.7133893966675 | accuracy = 0.6817164179104478


Epoch[2] Batch[675] Speed: 1.2564402122792702 samples/sec                   batch loss = 1638.6782166957855 | accuracy = 0.6807407407407408


Epoch[2] Batch[680] Speed: 1.2539016976744934 samples/sec                   batch loss = 1650.9267324209213 | accuracy = 0.6801470588235294


Epoch[2] Batch[685] Speed: 1.2604332432752843 samples/sec                   batch loss = 1662.6160290241241 | accuracy = 0.67992700729927


Epoch[2] Batch[690] Speed: 1.2589057978104912 samples/sec                   batch loss = 1671.8052945137024 | accuracy = 0.6807971014492754


Epoch[2] Batch[695] Speed: 1.2584387519436655 samples/sec                   batch loss = 1681.418590426445 | accuracy = 0.6816546762589928


Epoch[2] Batch[700] Speed: 1.267136350257748 samples/sec                   batch loss = 1691.889251589775 | accuracy = 0.6817857142857143


Epoch[2] Batch[705] Speed: 1.2559992525639214 samples/sec                   batch loss = 1699.5664090514183 | accuracy = 0.6833333333333333


Epoch[2] Batch[710] Speed: 1.260168157462132 samples/sec                   batch loss = 1712.9681633114815 | accuracy = 0.6834507042253521


Epoch[2] Batch[715] Speed: 1.2564047396934688 samples/sec                   batch loss = 1728.6687428355217 | accuracy = 0.6825174825174826


Epoch[2] Batch[720] Speed: 1.2481505075897985 samples/sec                   batch loss = 1741.110459268093 | accuracy = 0.6822916666666666


Epoch[2] Batch[725] Speed: 1.2558275802444554 samples/sec                   batch loss = 1749.288783609867 | accuracy = 0.6834482758620689


Epoch[2] Batch[730] Speed: 1.2597369686254671 samples/sec                   batch loss = 1760.618687570095 | accuracy = 0.6842465753424658


Epoch[2] Batch[735] Speed: 1.258130725425972 samples/sec                   batch loss = 1772.3950194716454 | accuracy = 0.6850340136054421


Epoch[2] Batch[740] Speed: 1.2599502079375466 samples/sec                   batch loss = 1783.6928390860558 | accuracy = 0.6854729729729729


Epoch[2] Batch[745] Speed: 1.256603487271333 samples/sec                   batch loss = 1793.5965431332588 | accuracy = 0.6865771812080537


Epoch[2] Batch[750] Speed: 1.2631024202739445 samples/sec                   batch loss = 1804.299020588398 | accuracy = 0.6863333333333334


Epoch[2] Batch[755] Speed: 1.2583569176900264 samples/sec                   batch loss = 1814.9023323655128 | accuracy = 0.6867549668874172


Epoch[2] Batch[760] Speed: 1.2594370043005876 samples/sec                   batch loss = 1825.3394492268562 | accuracy = 0.6868421052631579


Epoch[2] Batch[765] Speed: 1.254885722498427 samples/sec                   batch loss = 1834.9173204302788 | accuracy = 0.6866013071895425


Epoch[2] Batch[770] Speed: 1.2516731970136001 samples/sec                   batch loss = 1849.29584223032 | accuracy = 0.6866883116883117


Epoch[2] Batch[775] Speed: 1.2564498099809824 samples/sec                   batch loss = 1859.3213080763817 | accuracy = 0.687741935483871


Epoch[2] Batch[780] Speed: 1.2463914953169728 samples/sec                   batch loss = 1871.0643268227577 | accuracy = 0.6881410256410256


Epoch[2] Batch[785] Speed: 1.2574191797988095 samples/sec                   batch loss = 1881.7170919775963 | accuracy = 0.6878980891719745


[Epoch 2] training: accuracy=0.6881345177664975
[Epoch 2] time cost: 644.1116154193878
[Epoch 2] validation: validation accuracy=0.7155555555555555


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen on studying more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).