<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and inference on GPUs are often faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure that your machine has at least one NVIDIA GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need the GPU-enabled version of MXNet; see the [installation instructions](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html) for your system.

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)`, as shown in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[15:36:55] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `ctx` shown in the output tells you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to `npx.cpu()` so that it still runs; copying to `npx.gpu(1)` on such a machine would raise an error.

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[15:36:55] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, some operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet addresses each of them by a zero-based index through `npx`. As you saw before, the first GPU is `npx.gpu(0)` and the second is `npx.gpu(1)`. This extends to however many GPUs your machine has, so on a machine with eight GPUs the last one is `npx.gpu(7)`. This lets you select which GPUs to use for operations and training, which is particularly useful when you want to use multiple GPUs to train a neural network.
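
As a rough illustration of the indexing scheme, the plain-Python sketch below picks zero-based device ids. The helper name `select_gpu_ids` and its arguments are hypothetical and not part of MXNet. With MXNet installed, the available count would come from `npx.num_gpus()` and each id `i` would be passed to `npx.gpu(i)`.

```python
def select_gpu_ids(num_available, num_requested):
    """Return the zero-based ids of the GPUs to use.

    Hypothetical helper for illustration only; it is not an MXNet function.
    """
    if num_requested < 0:
        raise ValueError("num_requested must be non-negative")
    # Never ask for more devices than the machine reports.
    count = min(num_available, num_requested)
    return list(range(count))


# On a machine with eight GPUs, the ids run from 0 to 7.
all_ids = select_gpu_ids(num_available=8, num_requested=8)
print(all_ids[0], all_ids[-1])

# Requesting two GPUs on a single-GPU machine yields only id 0.
print(select_gpu_ids(num_available=1, num_requested=2))
```

Capping the request at the available count means the same script runs unchanged on machines with fewer GPUs than requested.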

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to ensure that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.
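
The device rule can be modeled in plain Python. The sketch below uses a hypothetical `DeviceArray` class (a toy stand-in, not an MXNet type) that tags its values with a device label, refuses to add operands that live on different devices, and changes device only through an explicit `copyto`. The error type and message are assumptions made for the illustration.

```python
class DeviceArray:
    """Toy stand-in for an array that remembers which device holds it.

    Hypothetical class for illustration; it is not an MXNet type.
    """

    def __init__(self, values, device):
        self.values = list(values)
        self.device = device

    def copyto(self, device):
        # An explicit copy is the only way data changes device here.
        return DeviceArray(self.values, device)

    def __add__(self, other):
        if self.device != other.device:
            raise ValueError(
                f"operands on different devices: {self.device} vs {other.device}")
        # The result is placed on the device shared by both inputs.
        summed = [a + b for a, b in zip(self.values, other.values)]
        return DeviceArray(summed, self.device)


x = DeviceArray([1.0, 1.0], device="gpu(0)")
y = DeviceArray([0.5, 0.25], device="gpu(1)")

try:
    x + y
except ValueError as err:
    print("error:", err)

# Copy y to x's device first, then the addition succeeds.
z = x + y.copyto(x.device)
print(z.values, z.device)
```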

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and parameters to the GPU. To demonstrate this, you can reuse the previously defined LeafNetwork from [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[15:36:56] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 6.2687106, -2.639347 ]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, assuming there are *n* GPUs, you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
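
To make the idea concrete, here is a minimal plain-Python sketch of data parallelism on a toy problem. It assumes a one-parameter model `w * x` with a summed squared-error loss; the function names `split_batch` and `grad_squared_error` are hypothetical and chosen only for this illustration. Each shard stands in for the part of the batch that one GPU would process.

```python
def split_batch(batch, num_parts):
    """Split a list of samples into num_parts contiguous, equal-sized shards."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    step = len(batch) // num_parts
    return [batch[i * step:(i + 1) * step] for i in range(num_parts)]


def grad_squared_error(w, samples):
    """Gradient of sum((w * x - y) ** 2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


# A toy batch of (x, y) pairs and a single scalar weight.
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 9.0)]
w = 0.5

# Each shard stands in for the work one GPU would do.
shards = split_batch(batch, num_parts=2)
shard_grads = [grad_squared_error(w, shard) for shard in shards]

# Summing the per-shard gradients recovers the full-batch gradient.
full_grad = grad_squared_error(w, batch)
print(sum(shard_grads), full_grad)
```

Because the loss is a sum over samples, the per-shard gradients add up to the gradient of the whole batch, which is why the shards can be processed independently before a single parameter update.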

First, copy the transform functions and the data definitions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations on the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
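
The metric in the helper receives one list of labels and one list of outputs per device. The plain-Python sketch below mimics that accumulation across shards. The class `RunningAccuracy` is hypothetical and is not the `gluon.metric` API, and it assumes that the predicted class is the index of the largest score in each output row.

```python
class RunningAccuracy:
    """Toy accuracy counter; hypothetical, not the gluon.metric API."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, label_shards, output_shards):
        # One (labels, outputs) pair per device shard.
        for labels, outputs in zip(label_shards, output_shards):
            for label, scores in zip(labels, outputs):
                # Assumed prediction rule: index of the largest score.
                predicted = max(range(len(scores)), key=scores.__getitem__)
                self.correct += int(predicted == label)
                self.total += 1

    def get(self):
        return self.correct / self.total


# Two shards of two samples each, with two-class scores per sample.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],   # shard 0: both predictions correct
    [[1.5, 0.2], [0.8, 0.1]],    # shard 1: first wrong, second correct
]

metric = RunningAccuracy()
metric.update(label_shards, output_shards)
print(metric.get())
```

Counting correct predictions and totals across all shards gives the same accuracy as evaluating the undivided batch on a single device.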

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7811172901626907 samples/sec                   batch loss = 14.083472967147827 | accuracy = 0.6


Epoch[1] Batch[10] Speed: 1.2630504055042 samples/sec                   batch loss = 27.875250816345215 | accuracy = 0.6


Epoch[1] Batch[15] Speed: 1.2589202509764066 samples/sec                   batch loss = 41.47236514091492 | accuracy = 0.6


Epoch[1] Batch[20] Speed: 1.266485328412697 samples/sec                   batch loss = 53.40841484069824 | accuracy = 0.6375


Epoch[1] Batch[25] Speed: 1.262315809938933 samples/sec                   batch loss = 68.04085278511047 | accuracy = 0.61


Epoch[1] Batch[30] Speed: 1.2651367283232844 samples/sec                   batch loss = 81.55837202072144 | accuracy = 0.6083333333333333


Epoch[1] Batch[35] Speed: 1.2660835326664137 samples/sec                   batch loss = 95.75003433227539 | accuracy = 0.6071428571428571


Epoch[1] Batch[40] Speed: 1.2614135138000335 samples/sec                   batch loss = 110.27200651168823 | accuracy = 0.59375


Epoch[1] Batch[45] Speed: 1.2601467661131016 samples/sec                   batch loss = 124.20360851287842 | accuracy = 0.5888888888888889


Epoch[1] Batch[50] Speed: 1.2612042349515735 samples/sec                   batch loss = 138.75688862800598 | accuracy = 0.6


Epoch[1] Batch[55] Speed: 1.2669276557903746 samples/sec                   batch loss = 152.28780579566956 | accuracy = 0.5954545454545455


Epoch[1] Batch[60] Speed: 1.2654817941022685 samples/sec                   batch loss = 165.07597064971924 | accuracy = 0.6083333333333333


Epoch[1] Batch[65] Speed: 1.2670798878410459 samples/sec                   batch loss = 178.5398120880127 | accuracy = 0.6076923076923076


Epoch[1] Batch[70] Speed: 1.2600200419991259 samples/sec                   batch loss = 192.37000823020935 | accuracy = 0.5928571428571429


Epoch[1] Batch[75] Speed: 1.2642721237425922 samples/sec                   batch loss = 206.78435707092285 | accuracy = 0.58


Epoch[1] Batch[80] Speed: 1.2632946363189708 samples/sec                   batch loss = 221.10250544548035 | accuracy = 0.578125


Epoch[1] Batch[85] Speed: 1.2655353457879759 samples/sec                   batch loss = 234.38289666175842 | accuracy = 0.5823529411764706


Epoch[1] Batch[90] Speed: 1.25823970654001 samples/sec                   batch loss = 247.65496253967285 | accuracy = 0.5861111111111111


Epoch[1] Batch[95] Speed: 1.2685518499243167 samples/sec                   batch loss = 261.0083169937134 | accuracy = 0.5868421052631579


Epoch[1] Batch[100] Speed: 1.2589429232778078 samples/sec                   batch loss = 274.65615701675415 | accuracy = 0.5875


Epoch[1] Batch[105] Speed: 1.2563108458059848 samples/sec                   batch loss = 288.6103570461273 | accuracy = 0.5857142857142857


Epoch[1] Batch[110] Speed: 1.2578590625190715 samples/sec                   batch loss = 303.1625964641571 | accuracy = 0.5727272727272728


Epoch[1] Batch[115] Speed: 1.2612364709378459 samples/sec                   batch loss = 316.85217237472534 | accuracy = 0.5739130434782609


Epoch[1] Batch[120] Speed: 1.2555425348940465 samples/sec                   batch loss = 329.83667373657227 | accuracy = 0.5770833333333333


Epoch[1] Batch[125] Speed: 1.2586549512809202 samples/sec                   batch loss = 343.82067108154297 | accuracy = 0.578


Epoch[1] Batch[130] Speed: 1.2583177505045355 samples/sec                   batch loss = 356.53723526000977 | accuracy = 0.5826923076923077


Epoch[1] Batch[135] Speed: 1.2552210861259716 samples/sec                   batch loss = 371.5498809814453 | accuracy = 0.575925925925926


Epoch[1] Batch[140] Speed: 1.2557902622978772 samples/sec                   batch loss = 385.20890617370605 | accuracy = 0.575


Epoch[1] Batch[145] Speed: 1.2575877054375781 samples/sec                   batch loss = 398.57837867736816 | accuracy = 0.5741379310344827


Epoch[1] Batch[150] Speed: 1.257102609482734 samples/sec                   batch loss = 411.73987317085266 | accuracy = 0.5766666666666667


Epoch[1] Batch[155] Speed: 1.2536107814668545 samples/sec                   batch loss = 426.1032292842865 | accuracy = 0.5725806451612904


Epoch[1] Batch[160] Speed: 1.2518047857320118 samples/sec                   batch loss = 439.2532386779785 | accuracy = 0.5765625


Epoch[1] Batch[165] Speed: 1.2557090540793703 samples/sec                   batch loss = 453.25072407722473 | accuracy = 0.5757575757575758


Epoch[1] Batch[170] Speed: 1.259554722319695 samples/sec                   batch loss = 468.00138902664185 | accuracy = 0.5705882352941176


Epoch[1] Batch[175] Speed: 1.2535508348405089 samples/sec                   batch loss = 482.79560947418213 | accuracy = 0.5628571428571428


Epoch[1] Batch[180] Speed: 1.2627354605903207 samples/sec                   batch loss = 496.66654682159424 | accuracy = 0.5611111111111111


Epoch[1] Batch[185] Speed: 1.2634864351110504 samples/sec                   batch loss = 510.07603216171265 | accuracy = 0.5608108108108109


Epoch[1] Batch[190] Speed: 1.258240650181879 samples/sec                   batch loss = 524.0390458106995 | accuracy = 0.5605263157894737


Epoch[1] Batch[195] Speed: 1.258585457279912 samples/sec                   batch loss = 537.9568772315979 | accuracy = 0.5602564102564103


Epoch[1] Batch[200] Speed: 1.2657110194957122 samples/sec                   batch loss = 551.4893283843994 | accuracy = 0.56125


Epoch[1] Batch[205] Speed: 1.2527587667343478 samples/sec                   batch loss = 565.0890889167786 | accuracy = 0.5621951219512196


Epoch[1] Batch[210] Speed: 1.25749476569015 samples/sec                   batch loss = 579.1506009101868 | accuracy = 0.5595238095238095


Epoch[1] Batch[215] Speed: 1.2558718570577674 samples/sec                   batch loss = 591.974287033081 | accuracy = 0.5651162790697675


Epoch[1] Batch[220] Speed: 1.2567998498781383 samples/sec                   batch loss = 605.7980003356934 | accuracy = 0.5659090909090909


Epoch[1] Batch[225] Speed: 1.2563370932851043 samples/sec                   batch loss = 619.3238263130188 | accuracy = 0.5666666666666667


Epoch[1] Batch[230] Speed: 1.254947862027077 samples/sec                   batch loss = 633.3510248661041 | accuracy = 0.5652173913043478


Epoch[1] Batch[235] Speed: 1.2548974553218277 samples/sec                   batch loss = 646.5579702854156 | accuracy = 0.5680851063829787


Epoch[1] Batch[240] Speed: 1.2629434416086818 samples/sec                   batch loss = 660.5112817287445 | accuracy = 0.565625


Epoch[1] Batch[245] Speed: 1.2573374780894322 samples/sec                   batch loss = 674.1236753463745 | accuracy = 0.5673469387755102


Epoch[1] Batch[250] Speed: 1.2562592007427986 samples/sec                   batch loss = 688.3449096679688 | accuracy = 0.564


Epoch[1] Batch[255] Speed: 1.2517753649870171 samples/sec                   batch loss = 701.5979945659637 | accuracy = 0.5656862745098039


Epoch[1] Batch[260] Speed: 1.2542488187695466 samples/sec                   batch loss = 714.4379234313965 | accuracy = 0.5682692307692307


Epoch[1] Batch[265] Speed: 1.2581912996108109 samples/sec                   batch loss = 727.8634893894196 | accuracy = 0.5679245283018868


Epoch[1] Batch[270] Speed: 1.254975554584787 samples/sec                   batch loss = 742.4690127372742 | accuracy = 0.5648148148148148


Epoch[1] Batch[275] Speed: 1.258533152998664 samples/sec                   batch loss = 756.4690206050873 | accuracy = 0.5636363636363636


Epoch[1] Batch[280] Speed: 1.2545557914947192 samples/sec                   batch loss = 769.8487825393677 | accuracy = 0.5651785714285714


Epoch[1] Batch[285] Speed: 1.2607754634926525 samples/sec                   batch loss = 783.7576701641083 | accuracy = 0.5640350877192982


Epoch[1] Batch[290] Speed: 1.2608037928385583 samples/sec                   batch loss = 797.2558085918427 | accuracy = 0.5655172413793104


Epoch[1] Batch[295] Speed: 1.2566727625659104 samples/sec                   batch loss = 810.4996979236603 | accuracy = 0.5677966101694916


Epoch[1] Batch[300] Speed: 1.2624825157038853 samples/sec                   batch loss = 824.2694256305695 | accuracy = 0.5658333333333333


Epoch[1] Batch[305] Speed: 1.2562293821208597 samples/sec                   batch loss = 837.7426605224609 | accuracy = 0.5663934426229508


Epoch[1] Batch[310] Speed: 1.2613109044363109 samples/sec                   batch loss = 851.2491776943207 | accuracy = 0.5661290322580645


Epoch[1] Batch[315] Speed: 1.2568946643384535 samples/sec                   batch loss = 864.58518242836 | accuracy = 0.5658730158730159


Epoch[1] Batch[320] Speed: 1.2633760675812025 samples/sec                   batch loss = 878.1841661930084 | accuracy = 0.5640625


Epoch[1] Batch[325] Speed: 1.2571199413506435 samples/sec                   batch loss = 890.7140531539917 | accuracy = 0.5661538461538461


Epoch[1] Batch[330] Speed: 1.2629175828353214 samples/sec                   batch loss = 902.9278972148895 | accuracy = 0.5712121212121212


Epoch[1] Batch[335] Speed: 1.256282623933789 samples/sec                   batch loss = 917.2105648517609 | accuracy = 0.5701492537313433


Epoch[1] Batch[340] Speed: 1.2601704291524736 samples/sec                   batch loss = 930.0028364658356 | accuracy = 0.5720588235294117


Epoch[1] Batch[345] Speed: 1.2565685701471851 samples/sec                   batch loss = 944.4055969715118 | accuracy = 0.5710144927536231


Epoch[1] Batch[350] Speed: 1.2562485712293954 samples/sec                   batch loss = 958.2992894649506 | accuracy = 0.57


Epoch[1] Batch[355] Speed: 1.2627679649894807 samples/sec                   batch loss = 971.8599512577057 | accuracy = 0.5704225352112676


Epoch[1] Batch[360] Speed: 1.2562171540731963 samples/sec                   batch loss = 984.5844686031342 | accuracy = 0.5736111111111111


Epoch[1] Batch[365] Speed: 1.2584810418651653 samples/sec                   batch loss = 997.6327695846558 | accuracy = 0.576027397260274


Epoch[1] Batch[370] Speed: 1.2591161099194137 samples/sec                   batch loss = 1012.1410412788391 | accuracy = 0.575


Epoch[1] Batch[375] Speed: 1.2582170595597655 samples/sec                   batch loss = 1025.6901128292084 | accuracy = 0.574


Epoch[1] Batch[380] Speed: 1.2593963517835989 samples/sec                   batch loss = 1039.50532579422 | accuracy = 0.5736842105263158


Epoch[1] Batch[385] Speed: 1.2600719021102647 samples/sec                   batch loss = 1053.09876704216 | accuracy = 0.574025974025974


Epoch[1] Batch[390] Speed: 1.2569772502558763 samples/sec                   batch loss = 1066.9495058059692 | accuracy = 0.573076923076923


Epoch[1] Batch[395] Speed: 1.258706038042488 samples/sec                   batch loss = 1079.885124206543 | accuracy = 0.5740506329113924


Epoch[1] Batch[400] Speed: 1.2590872894159155 samples/sec                   batch loss = 1093.0078294277191 | accuracy = 0.575625


Epoch[1] Batch[405] Speed: 1.2576340861899657 samples/sec                   batch loss = 1106.6901278495789 | accuracy = 0.5771604938271605


Epoch[1] Batch[410] Speed: 1.258965690868883 samples/sec                   batch loss = 1119.9911065101624 | accuracy = 0.5780487804878048


Epoch[1] Batch[415] Speed: 1.2599425436948937 samples/sec                   batch loss = 1133.3321549892426 | accuracy = 0.5783132530120482


Epoch[1] Batch[420] Speed: 1.254528211267099 samples/sec                   batch loss = 1146.2247898578644 | accuracy = 0.5791666666666667


Epoch[1] Batch[425] Speed: 1.2554657742749942 samples/sec                   batch loss = 1159.3522913455963 | accuracy = 0.581764705882353


Epoch[1] Batch[430] Speed: 1.2536280171830831 samples/sec                   batch loss = 1173.1370561122894 | accuracy = 0.5819767441860465


Epoch[1] Batch[435] Speed: 1.261107726468602 samples/sec                   batch loss = 1186.5961742401123 | accuracy = 0.5827586206896552


Epoch[1] Batch[440] Speed: 1.264654371703947 samples/sec                   batch loss = 1199.5869076251984 | accuracy = 0.5823863636363636


Epoch[1] Batch[445] Speed: 1.2628543663961769 samples/sec                   batch loss = 1213.4154272079468 | accuracy = 0.5820224719101124


Epoch[1] Batch[450] Speed: 1.2663927895721894 samples/sec                   batch loss = 1226.1737563610077 | accuracy = 0.5838888888888889


Epoch[1] Batch[455] Speed: 1.2593017266705668 samples/sec                   batch loss = 1240.0167844295502 | accuracy = 0.5835164835164836


Epoch[1] Batch[460] Speed: 1.2628960029784602 samples/sec                   batch loss = 1252.7056441307068 | accuracy = 0.5842391304347826


Epoch[1] Batch[465] Speed: 1.2632645778983425 samples/sec                   batch loss = 1265.6779997348785 | accuracy = 0.5827956989247312


Epoch[1] Batch[470] Speed: 1.2552385539796131 samples/sec                   batch loss = 1279.2665331363678 | accuracy = 0.5824468085106383


Epoch[1] Batch[475] Speed: 1.2572625706278409 samples/sec                   batch loss = 1292.979299545288 | accuracy = 0.5821052631578948


Epoch[1] Batch[480] Speed: 1.2585241842896167 samples/sec                   batch loss = 1306.4383916854858 | accuracy = 0.5822916666666667


Epoch[1] Batch[485] Speed: 1.262246765802333 samples/sec                   batch loss = 1319.8530521392822 | accuracy = 0.5829896907216495


Epoch[1] Batch[490] Speed: 1.2594711355333243 samples/sec                   batch loss = 1333.7428741455078 | accuracy = 0.5821428571428572


Epoch[1] Batch[495] Speed: 1.2567094740382385 samples/sec                   batch loss = 1346.1072883605957 | accuracy = 0.5843434343434344


Epoch[1] Batch[500] Speed: 1.2555631124692268 samples/sec                   batch loss = 1360.4226133823395 | accuracy = 0.583


Epoch[1] Batch[505] Speed: 1.2603040000108172 samples/sec                   batch loss = 1373.7062265872955 | accuracy = 0.5821782178217821


Epoch[1] Batch[510] Speed: 1.2609986271023201 samples/sec                   batch loss = 1388.2535920143127 | accuracy = 0.5803921568627451


Epoch[1] Batch[515] Speed: 1.2605450861276233 samples/sec                   batch loss = 1402.309329032898 | accuracy = 0.5800970873786407


Epoch[1] Batch[520] Speed: 1.2592645800387554 samples/sec                   batch loss = 1416.5939271450043 | accuracy = 0.5793269230769231


Epoch[1] Batch[525] Speed: 1.2625055065059922 samples/sec                   batch loss = 1430.387314081192 | accuracy = 0.579047619047619


Epoch[1] Batch[530] Speed: 1.2566810459927955 samples/sec                   batch loss = 1442.0560797452927 | accuracy = 0.5811320754716981


Epoch[1] Batch[535] Speed: 1.2583593716189574 samples/sec                   batch loss = 1454.3206514120102 | accuracy = 0.5827102803738318


Epoch[1] Batch[540] Speed: 1.259017275262644 samples/sec                   batch loss = 1467.0930160284042 | accuracy = 0.5837962962962963


Epoch[1] Batch[545] Speed: 1.257410038480284 samples/sec                   batch loss = 1480.3903673887253 | accuracy = 0.5834862385321101


Epoch[1] Batch[550] Speed: 1.2574390649816385 samples/sec                   batch loss = 1493.3426595926285 | accuracy = 0.5840909090909091


Epoch[1] Batch[555] Speed: 1.249646365134116 samples/sec                   batch loss = 1506.5491219758987 | accuracy = 0.5846846846846847


Epoch[1] Batch[560] Speed: 1.253597199306985 samples/sec                   batch loss = 1518.6278237104416 | accuracy = 0.5870535714285714


Epoch[1] Batch[565] Speed: 1.255242028825678 samples/sec                   batch loss = 1531.601611495018 | accuracy = 0.5880530973451328


Epoch[1] Batch[570] Speed: 1.2524345343197507 samples/sec                   batch loss = 1544.6109298467636 | accuracy = 0.5885964912280702


Epoch[1] Batch[575] Speed: 1.2503345088030555 samples/sec                   batch loss = 1558.593141913414 | accuracy = 0.587391304347826


Epoch[1] Batch[580] Speed: 1.2512561997055853 samples/sec                   batch loss = 1573.1461223363876 | accuracy = 0.5866379310344828


Epoch[1] Batch[585] Speed: 1.2513894742024052 samples/sec                   batch loss = 1585.4282168149948 | accuracy = 0.5876068376068376


Epoch[1] Batch[590] Speed: 1.2507135953434125 samples/sec                   batch loss = 1598.3689275979996 | accuracy = 0.588135593220339


Epoch[1] Batch[595] Speed: 1.2444343303984386 samples/sec                   batch loss = 1612.7739897966385 | accuracy = 0.588655462184874


Epoch[1] Batch[600] Speed: 1.2593171341862932 samples/sec                   batch loss = 1625.057709813118 | accuracy = 0.5895833333333333


Epoch[1] Batch[605] Speed: 1.2662569693427412 samples/sec                   batch loss = 1637.316153883934 | accuracy = 0.5900826446280992


Epoch[1] Batch[610] Speed: 1.2641545701232815 samples/sec                   batch loss = 1649.5041412115097 | accuracy = 0.5918032786885246


Epoch[1] Batch[615] Speed: 1.2585834745473767 samples/sec                   batch loss = 1663.5246180295944 | accuracy = 0.5914634146341463


Epoch[1] Batch[620] Speed: 1.2577276122569476 samples/sec                   batch loss = 1675.98816716671 | accuracy = 0.5915322580645161


Epoch[1] Batch[625] Speed: 1.260438072658207 samples/sec                   batch loss = 1687.7115094661713 | accuracy = 0.5924


Epoch[1] Batch[630] Speed: 1.2544382556197766 samples/sec                   batch loss = 1700.9075756072998 | accuracy = 0.5924603174603175


Epoch[1] Batch[635] Speed: 1.259319686387873 samples/sec                   batch loss = 1713.840896844864 | accuracy = 0.5921259842519685


Epoch[1] Batch[640] Speed: 1.2624462261417893 samples/sec                   batch loss = 1726.4482340812683 | accuracy = 0.5921875


Epoch[1] Batch[645] Speed: 1.2549556533689108 samples/sec                   batch loss = 1740.0771718025208 | accuracy = 0.5922480620155038


Epoch[1] Batch[650] Speed: 1.25985947307208 samples/sec                   batch loss = 1753.5142023563385 | accuracy = 0.5919230769230769


Epoch[1] Batch[655] Speed: 1.254954526903868 samples/sec                   batch loss = 1766.7532918453217 | accuracy = 0.5908396946564886


Epoch[1] Batch[660] Speed: 1.2561255453504 samples/sec                   batch loss = 1777.8161858320236 | accuracy = 0.5912878787878788


Epoch[1] Batch[665] Speed: 1.2595475356894734 samples/sec                   batch loss = 1791.18809735775 | accuracy = 0.5913533834586466


Epoch[1] Batch[670] Speed: 1.2667110919638382 samples/sec                   batch loss = 1803.8115402460098 | accuracy = 0.591044776119403


Epoch[1] Batch[675] Speed: 1.262351997021619 samples/sec                   batch loss = 1817.102835059166 | accuracy = 0.5914814814814815


Epoch[1] Batch[680] Speed: 1.262963501901196 samples/sec                   batch loss = 1830.2610222101212 | accuracy = 0.5915441176470588


Epoch[1] Batch[685] Speed: 1.2643839819224885 samples/sec                   batch loss = 1844.263862490654 | accuracy = 0.5908759124087591


Epoch[1] Batch[690] Speed: 1.2586902677493899 samples/sec                   batch loss = 1856.163032412529 | accuracy = 0.591304347826087


Epoch[1] Batch[695] Speed: 1.2526556899492718 samples/sec                   batch loss = 1869.477739930153 | accuracy = 0.591726618705036


Epoch[1] Batch[700] Speed: 1.2578870723308369 samples/sec                   batch loss = 1881.8985568284988 | accuracy = 0.5928571428571429


Epoch[1] Batch[705] Speed: 1.260973985202217 samples/sec                   batch loss = 1896.723053097725 | accuracy = 0.5925531914893617


Epoch[1] Batch[710] Speed: 1.2593023883352734 samples/sec                   batch loss = 1909.3331724405289 | accuracy = 0.592605633802817


Epoch[1] Batch[715] Speed: 1.2625291632053606 samples/sec                   batch loss = 1921.4512318372726 | accuracy = 0.5940559440559441


Epoch[1] Batch[720] Speed: 1.26518815180162 samples/sec                   batch loss = 1934.0876718759537 | accuracy = 0.5944444444444444


Epoch[1] Batch[725] Speed: 1.2637630094843904 samples/sec                   batch loss = 1946.6300839185715 | accuracy = 0.5955172413793104


Epoch[1] Batch[730] Speed: 1.2641741926271646 samples/sec                   batch loss = 1960.6832379102707 | accuracy = 0.5948630136986301


Epoch[1] Batch[735] Speed: 1.263510128586125 samples/sec                   batch loss = 1975.2145906686783 | accuracy = 0.5952380952380952


Epoch[1] Batch[740] Speed: 1.2668637502822222 samples/sec                   batch loss = 1987.3369616270065 | accuracy = 0.5962837837837838


Epoch[1] Batch[745] Speed: 1.2601326633642909 samples/sec                   batch loss = 1999.3154326677322 | accuracy = 0.5973154362416108


Epoch[1] Batch[750] Speed: 1.2617097710820024 samples/sec                   batch loss = 2012.3908060789108 | accuracy = 0.598


Epoch[1] Batch[755] Speed: 1.2567778195691204 samples/sec                   batch loss = 2024.9955459833145 | accuracy = 0.5990066225165563


Epoch[1] Batch[760] Speed: 1.266881352353702 samples/sec                   batch loss = 2038.2876158952713 | accuracy = 0.5990131578947369


Epoch[1] Batch[765] Speed: 1.261528945577047 samples/sec                   batch loss = 2051.78016769886 | accuracy = 0.5983660130718954


Epoch[1] Batch[770] Speed: 1.2585920664334809 samples/sec                   batch loss = 2064.556934952736 | accuracy = 0.5980519480519481


Epoch[1] Batch[775] Speed: 1.257760330808374 samples/sec                   batch loss = 2077.297022461891 | accuracy = 0.5980645161290322


Epoch[1] Batch[780] Speed: 1.2645610517647812 samples/sec                   batch loss = 2089.8903762102127 | accuracy = 0.5983974358974359


Epoch[1] Batch[785] Speed: 1.2608284280872615 samples/sec                   batch loss = 2101.244849085808 | accuracy = 0.5996815286624204


[Epoch 1] training: accuracy=0.5999365482233503
[Epoch 1] time cost: 643.7063925266266
[Epoch 1] validation: validation accuracy=0.6966666666666667


Epoch[2] Batch[5] Speed: 1.2627157876879533 samples/sec                   batch loss = 13.408623933792114 | accuracy = 0.6


Epoch[2] Batch[10] Speed: 1.2527393099262687 samples/sec                   batch loss = 26.82153010368347 | accuracy = 0.625


Epoch[2] Batch[15] Speed: 1.2520856132834446 samples/sec                   batch loss = 40.71001076698303 | accuracy = 0.6666666666666666


Epoch[2] Batch[20] Speed: 1.2579814849584032 samples/sec                   batch loss = 53.847092151641846 | accuracy = 0.6625


Epoch[2] Batch[25] Speed: 1.2544632055283302 samples/sec                   batch loss = 67.93368744850159 | accuracy = 0.65


Epoch[2] Batch[30] Speed: 1.2583736234723375 samples/sec                   batch loss = 80.5290036201477 | accuracy = 0.6416666666666667


Epoch[2] Batch[35] Speed: 1.260914943336969 samples/sec                   batch loss = 92.45705485343933 | accuracy = 0.6428571428571429


Epoch[2] Batch[40] Speed: 1.255444542272624 samples/sec                   batch loss = 106.08574533462524 | accuracy = 0.63125


Epoch[2] Batch[45] Speed: 1.2538408798730227 samples/sec                   batch loss = 118.54692482948303 | accuracy = 0.6277777777777778


Epoch[2] Batch[50] Speed: 1.2531868758655837 samples/sec                   batch loss = 130.4252152442932 | accuracy = 0.645


Epoch[2] Batch[55] Speed: 1.2575788444635536 samples/sec                   batch loss = 142.4797614812851 | accuracy = 0.65


Epoch[2] Batch[60] Speed: 1.2589890261424652 samples/sec                   batch loss = 155.09546744823456 | accuracy = 0.65


Epoch[2] Batch[65] Speed: 1.2568663220723246 samples/sec                   batch loss = 167.25230872631073 | accuracy = 0.6538461538461539


Epoch[2] Batch[70] Speed: 1.2594378551952625 samples/sec                   batch loss = 179.19858157634735 | accuracy = 0.6535714285714286


Epoch[2] Batch[75] Speed: 1.2643399604131114 samples/sec                   batch loss = 190.32911801338196 | accuracy = 0.6666666666666666


Epoch[2] Batch[80] Speed: 1.2577412840670739 samples/sec                   batch loss = 203.60478949546814 | accuracy = 0.665625


Epoch[2] Batch[85] Speed: 1.2553057064383777 samples/sec                   batch loss = 213.14564609527588 | accuracy = 0.6764705882352942


Epoch[2] Batch[90] Speed: 1.2603558834994517 samples/sec                   batch loss = 226.34611439704895 | accuracy = 0.6694444444444444


Epoch[2] Batch[95] Speed: 1.2547591158269031 samples/sec                   batch loss = 238.4548614025116 | accuracy = 0.6657894736842105


Epoch[2] Batch[100] Speed: 1.2542288468322738 samples/sec                   batch loss = 251.62451660633087 | accuracy = 0.6625


Epoch[2] Batch[105] Speed: 1.2581504444190306 samples/sec                   batch loss = 264.93504083156586 | accuracy = 0.6619047619047619


Epoch[2] Batch[110] Speed: 1.2522122417564752 samples/sec                   batch loss = 277.6007524728775 | accuracy = 0.6659090909090909


Epoch[2] Batch[115] Speed: 1.2592459602870896 samples/sec                   batch loss = 289.5062881708145 | accuracy = 0.6652173913043479


Epoch[2] Batch[120] Speed: 1.2521458871115865 samples/sec                   batch loss = 300.3472764492035 | accuracy = 0.6645833333333333


Epoch[2] Batch[125] Speed: 1.2518928694362763 samples/sec                   batch loss = 314.36082649230957 | accuracy = 0.666


Epoch[2] Batch[130] Speed: 1.2570228325364 samples/sec                   batch loss = 328.02014684677124 | accuracy = 0.6653846153846154


Epoch[2] Batch[135] Speed: 1.2587561845345672 samples/sec                   batch loss = 341.784326672554 | accuracy = 0.6648148148148149


Epoch[2] Batch[140] Speed: 1.257668779720916 samples/sec                   batch loss = 353.38922894001007 | accuracy = 0.6642857142857143


Epoch[2] Batch[145] Speed: 1.2580961950897611 samples/sec                   batch loss = 367.1498535871506 | accuracy = 0.6586206896551724


Epoch[2] Batch[150] Speed: 1.2564899901868174 samples/sec                   batch loss = 379.7064299583435 | accuracy = 0.66


Epoch[2] Batch[155] Speed: 1.2503667506444407 samples/sec                   batch loss = 392.9712985754013 | accuracy = 0.6548387096774193


Epoch[2] Batch[160] Speed: 1.2503614390128166 samples/sec                   batch loss = 406.61629700660706 | accuracy = 0.653125


Epoch[2] Batch[165] Speed: 1.2514409996838805 samples/sec                   batch loss = 420.2112898826599 | accuracy = 0.6515151515151515


Epoch[2] Batch[170] Speed: 1.2534245910117083 samples/sec                   batch loss = 432.81515395641327 | accuracy = 0.6529411764705882


Epoch[2] Batch[175] Speed: 1.2570839594195715 samples/sec                   batch loss = 445.011492729187 | accuracy = 0.6514285714285715


Epoch[2] Batch[180] Speed: 1.2570856548569922 samples/sec                   batch loss = 455.31630289554596 | accuracy = 0.6541666666666667


Epoch[2] Batch[185] Speed: 1.2581903560429633 samples/sec                   batch loss = 465.58163344860077 | accuracy = 0.6581081081081082


Epoch[2] Batch[190] Speed: 1.2589564325789748 samples/sec                   batch loss = 479.2919420003891 | accuracy = 0.6539473684210526


Epoch[2] Batch[195] Speed: 1.2538168917310133 samples/sec                   batch loss = 491.99841272830963 | accuracy = 0.6551282051282051


Epoch[2] Batch[200] Speed: 1.2569687745764875 samples/sec                   batch loss = 508.05690693855286 | accuracy = 0.65125


Epoch[2] Batch[205] Speed: 1.24806090714716 samples/sec                   batch loss = 521.1568207740784 | accuracy = 0.6487804878048781


Epoch[2] Batch[210] Speed: 1.2573567010658504 samples/sec                   batch loss = 535.3057563304901 | accuracy = 0.6476190476190476


Epoch[2] Batch[215] Speed: 1.2596583702121098 samples/sec                   batch loss = 546.4403955936432 | accuracy = 0.65


Epoch[2] Batch[220] Speed: 1.257309304298808 samples/sec                   batch loss = 559.0799911022186 | accuracy = 0.6488636363636363


Epoch[2] Batch[225] Speed: 1.2582496148502291 samples/sec                   batch loss = 570.4990130662918 | accuracy = 0.65


Epoch[2] Batch[230] Speed: 1.251051396842756 samples/sec                   batch loss = 581.3374170064926 | accuracy = 0.6532608695652173


Epoch[2] Batch[235] Speed: 1.2584780210622266 samples/sec                   batch loss = 598.8757017850876 | accuracy = 0.648936170212766


Epoch[2] Batch[240] Speed: 1.2655186402302872 samples/sec                   batch loss = 611.4458074569702 | accuracy = 0.6479166666666667


Epoch[2] Batch[245] Speed: 1.2573187268307329 samples/sec                   batch loss = 625.628035068512 | accuracy = 0.6479591836734694


Epoch[2] Batch[250] Speed: 1.2570862200038153 samples/sec                   batch loss = 637.8291380405426 | accuracy = 0.647


Epoch[2] Batch[255] Speed: 1.259957588407488 samples/sec                   batch loss = 649.0467374324799 | accuracy = 0.65


Epoch[2] Batch[260] Speed: 1.2685675805422674 samples/sec                   batch loss = 660.0337998867035 | accuracy = 0.6519230769230769


Epoch[2] Batch[265] Speed: 1.2620175592471892 samples/sec                   batch loss = 671.1503984928131 | accuracy = 0.6537735849056604


Epoch[2] Batch[270] Speed: 1.2625457899301962 samples/sec                   batch loss = 684.6737101078033 | accuracy = 0.6527777777777778


Epoch[2] Batch[275] Speed: 1.2604425232987664 samples/sec                   batch loss = 695.4545097351074 | accuracy = 0.6554545454545454


Epoch[2] Batch[280] Speed: 1.254590690663586 samples/sec                   batch loss = 707.0701824426651 | accuracy = 0.6571428571428571


Epoch[2] Batch[285] Speed: 1.2552625026336455 samples/sec                   batch loss = 718.8621826171875 | accuracy = 0.6578947368421053


Epoch[2] Batch[290] Speed: 1.2555168843406723 samples/sec                   batch loss = 730.2150478363037 | accuracy = 0.6586206896551724


Epoch[2] Batch[295] Speed: 1.254492846559188 samples/sec                   batch loss = 742.1533416509628 | accuracy = 0.6567796610169492


Epoch[2] Batch[300] Speed: 1.255400483461769 samples/sec                   batch loss = 756.502357840538 | accuracy = 0.655


Epoch[2] Batch[305] Speed: 1.2526327759088451 samples/sec                   batch loss = 767.7942932844162 | accuracy = 0.6540983606557377


Epoch[2] Batch[310] Speed: 1.2541556218369925 samples/sec                   batch loss = 778.0841773748398 | accuracy = 0.6556451612903226


Epoch[2] Batch[315] Speed: 1.2615884244266602 samples/sec                   batch loss = 787.8017646074295 | accuracy = 0.6587301587301587


Epoch[2] Batch[320] Speed: 1.2618756524314054 samples/sec                   batch loss = 801.607600569725 | accuracy = 0.65859375


Epoch[2] Batch[325] Speed: 1.2624074689135127 samples/sec                   batch loss = 813.3237200975418 | accuracy = 0.6592307692307692


Epoch[2] Batch[330] Speed: 1.2621403177736061 samples/sec                   batch loss = 824.3355052471161 | accuracy = 0.6590909090909091


Epoch[2] Batch[335] Speed: 1.2508837789818144 samples/sec                   batch loss = 834.5885535478592 | accuracy = 0.6597014925373135


Epoch[2] Batch[340] Speed: 1.2532833928828881 samples/sec                   batch loss = 844.0081322193146 | accuracy = 0.6617647058823529


Epoch[2] Batch[345] Speed: 1.2480650851201864 samples/sec                   batch loss = 856.9743473529816 | accuracy = 0.6601449275362319


Epoch[2] Batch[350] Speed: 1.261274018458102 samples/sec                   batch loss = 867.1940221786499 | accuracy = 0.6607142857142857


Epoch[2] Batch[355] Speed: 1.2596089083692679 samples/sec                   batch loss = 878.2061742544174 | accuracy = 0.6619718309859155


Epoch[2] Batch[360] Speed: 1.2619181736667988 samples/sec                   batch loss = 891.2314096689224 | accuracy = 0.6618055555555555


Epoch[2] Batch[365] Speed: 1.2583800416309392 samples/sec                   batch loss = 902.5312449932098 | accuracy = 0.6623287671232877


Epoch[2] Batch[370] Speed: 1.2611190071497318 samples/sec                   batch loss = 913.6316928863525 | accuracy = 0.6621621621621622


Epoch[2] Batch[375] Speed: 1.2610723690025982 samples/sec                   batch loss = 925.5068129301071 | accuracy = 0.6606666666666666


Epoch[2] Batch[380] Speed: 1.2605161054648535 samples/sec                   batch loss = 939.4888359308243 | accuracy = 0.6598684210526315


Epoch[2] Batch[385] Speed: 1.2560538854179464 samples/sec                   batch loss = 954.3286024332047 | accuracy = 0.6571428571428571


Epoch[2] Batch[390] Speed: 1.2579203650763755 samples/sec                   batch loss = 966.5869096517563 | accuracy = 0.6564102564102564


Epoch[2] Batch[395] Speed: 1.2540186643591287 samples/sec                   batch loss = 978.7414737939835 | accuracy = 0.6556962025316456


Epoch[2] Batch[400] Speed: 1.2544561706792123 samples/sec                   batch loss = 990.1568228006363 | accuracy = 0.655625


Epoch[2] Batch[405] Speed: 1.2651482720055633 samples/sec                   batch loss = 1001.0974227190018 | accuracy = 0.6574074074074074


Epoch[2] Batch[410] Speed: 1.2579516788418805 samples/sec                   batch loss = 1012.512261390686 | accuracy = 0.6591463414634147


Epoch[2] Batch[415] Speed: 1.2549529310818524 samples/sec                   batch loss = 1027.417387008667 | accuracy = 0.6572289156626506


Epoch[2] Batch[420] Speed: 1.2540077915358665 samples/sec                   batch loss = 1039.4340226650238 | accuracy = 0.6571428571428571


Epoch[2] Batch[425] Speed: 1.2544728668497045 samples/sec                   batch loss = 1051.927536725998 | accuracy = 0.6564705882352941


Epoch[2] Batch[430] Speed: 1.250451463289619 samples/sec                   batch loss = 1062.9161727428436 | accuracy = 0.6569767441860465


Epoch[2] Batch[435] Speed: 1.260763715241052 samples/sec                   batch loss = 1074.4454399347305 | accuracy = 0.656896551724138


Epoch[2] Batch[440] Speed: 1.2534894891511459 samples/sec                   batch loss = 1085.397146821022 | accuracy = 0.6573863636363636


Epoch[2] Batch[445] Speed: 1.2543656626587096 samples/sec                   batch loss = 1097.940907716751 | accuracy = 0.6573033707865169


Epoch[2] Batch[450] Speed: 1.253819702793926 samples/sec                   batch loss = 1109.9424183368683 | accuracy = 0.6572222222222223


Epoch[2] Batch[455] Speed: 1.2571631789086448 samples/sec                   batch loss = 1122.4480766057968 | accuracy = 0.6554945054945055


Epoch[2] Batch[460] Speed: 1.2579424354597688 samples/sec                   batch loss = 1137.5032300949097 | accuracy = 0.6532608695652173


Epoch[2] Batch[465] Speed: 1.2626593384147624 samples/sec                   batch loss = 1149.2655056715012 | accuracy = 0.6537634408602151


Epoch[2] Batch[470] Speed: 1.2573811075543522 samples/sec                   batch loss = 1160.6162625551224 | accuracy = 0.6537234042553192


Epoch[2] Batch[475] Speed: 1.26038097467506 samples/sec                   batch loss = 1171.6371092796326 | accuracy = 0.6542105263157895


Epoch[2] Batch[480] Speed: 1.2566947891919198 samples/sec                   batch loss = 1183.5428093671799 | accuracy = 0.6536458333333334


Epoch[2] Batch[485] Speed: 1.2568472082424502 samples/sec                   batch loss = 1196.5360088348389 | accuracy = 0.6536082474226804


Epoch[2] Batch[490] Speed: 1.2628960980422916 samples/sec                   batch loss = 1207.4328428506851 | accuracy = 0.6540816326530612


Epoch[2] Batch[495] Speed: 1.2572093399327544 samples/sec                   batch loss = 1220.2122560739517 | accuracy = 0.6540404040404041


Epoch[2] Batch[500] Speed: 1.2582127189816534 samples/sec                   batch loss = 1229.3135622143745 | accuracy = 0.6555


Epoch[2] Batch[505] Speed: 1.2554322355557748 samples/sec                   batch loss = 1239.002705514431 | accuracy = 0.656930693069307


Epoch[2] Batch[510] Speed: 1.2590940928256427 samples/sec                   batch loss = 1248.004037797451 | accuracy = 0.657843137254902


Epoch[2] Batch[515] Speed: 1.2609383509270187 samples/sec                   batch loss = 1258.7274014353752 | accuracy = 0.6592233009708738


Epoch[2] Batch[520] Speed: 1.2570281067326523 samples/sec                   batch loss = 1269.4407941699028 | accuracy = 0.6605769230769231


Epoch[2] Batch[525] Speed: 1.255936914854682 samples/sec                   batch loss = 1282.664539039135 | accuracy = 0.6604761904761904


Epoch[2] Batch[530] Speed: 1.2589135439104322 samples/sec                   batch loss = 1296.1308956742287 | accuracy = 0.6599056603773585


Epoch[2] Batch[535] Speed: 1.2598090494379892 samples/sec                   batch loss = 1307.3123179674149 | accuracy = 0.6598130841121496


Epoch[2] Batch[540] Speed: 1.2560112883094607 samples/sec                   batch loss = 1320.1566669940948 | accuracy = 0.6592592592592592


Epoch[2] Batch[545] Speed: 1.260527849102251 samples/sec                   batch loss = 1332.3421156406403 | accuracy = 0.6600917431192661


Epoch[2] Batch[550] Speed: 1.256170407501016 samples/sec                   batch loss = 1345.7873141765594 | accuracy = 0.6590909090909091


Epoch[2] Batch[555] Speed: 1.2614463295594942 samples/sec                   batch loss = 1357.6845933198929 | accuracy = 0.6585585585585586


Epoch[2] Batch[560] Speed: 1.2638354565726657 samples/sec                   batch loss = 1368.3376841545105 | accuracy = 0.6584821428571429


Epoch[2] Batch[565] Speed: 1.2605755835581345 samples/sec                   batch loss = 1382.6346662044525 | accuracy = 0.6575221238938053


Epoch[2] Batch[570] Speed: 1.2554139168439853 samples/sec                   batch loss = 1394.2280629873276 | accuracy = 0.6574561403508772


Epoch[2] Batch[575] Speed: 1.2609145642738357 samples/sec                   batch loss = 1408.3963035345078 | accuracy = 0.6565217391304348


Epoch[2] Batch[580] Speed: 1.2583106723576547 samples/sec                   batch loss = 1421.2160577774048 | accuracy = 0.6564655172413794


Epoch[2] Batch[585] Speed: 1.2649323160222723 samples/sec                   batch loss = 1431.8487261533737 | accuracy = 0.6581196581196581


Epoch[2] Batch[590] Speed: 1.2584525336148094 samples/sec                   batch loss = 1443.9540303945541 | accuracy = 0.6580508474576271


Epoch[2] Batch[595] Speed: 1.2508297814583889 samples/sec                   batch loss = 1453.6213697195053 | accuracy = 0.6605042016806723


Epoch[2] Batch[600] Speed: 1.255674092664217 samples/sec                   batch loss = 1465.1875994205475 | accuracy = 0.66125


Epoch[2] Batch[605] Speed: 1.2529305366987344 samples/sec                   batch loss = 1481.3225796222687 | accuracy = 0.6595041322314049


Epoch[2] Batch[610] Speed: 1.262820526813007 samples/sec                   batch loss = 1491.96888589859 | accuracy = 0.6602459016393443


Epoch[2] Batch[615] Speed: 1.2535886754811014 samples/sec                   batch loss = 1502.7827129364014 | accuracy = 0.6609756097560976


Epoch[2] Batch[620] Speed: 1.251795539063136 samples/sec                   batch loss = 1514.5125659704208 | accuracy = 0.6608870967741935


Epoch[2] Batch[625] Speed: 1.2555483604404907 samples/sec                   batch loss = 1524.423896253109 | accuracy = 0.662


Epoch[2] Batch[630] Speed: 1.2578615145066445 samples/sec                   batch loss = 1536.9509525895119 | accuracy = 0.6619047619047619


Epoch[2] Batch[635] Speed: 1.2530982356018423 samples/sec                   batch loss = 1548.1266195178032 | accuracy = 0.6625984251968504


Epoch[2] Batch[640] Speed: 1.2566711623710294 samples/sec                   batch loss = 1557.5877781510353 | accuracy = 0.664453125


Epoch[2] Batch[645] Speed: 1.2524902600483296 samples/sec                   batch loss = 1570.1047244668007 | accuracy = 0.663953488372093


Epoch[2] Batch[650] Speed: 1.2571953028545018 samples/sec                   batch loss = 1583.466949045658 | accuracy = 0.6634615384615384


Epoch[2] Batch[655] Speed: 1.2539329054006987 samples/sec                   batch loss = 1595.5683090090752 | accuracy = 0.6633587786259542


Epoch[2] Batch[660] Speed: 1.258163181896242 samples/sec                   batch loss = 1607.949089229107 | accuracy = 0.6632575757575757


Epoch[2] Batch[665] Speed: 1.255981763532305 samples/sec                   batch loss = 1619.0157743096352 | accuracy = 0.6635338345864662


Epoch[2] Batch[670] Speed: 1.2532835801272435 samples/sec                   batch loss = 1632.7111354470253 | accuracy = 0.6623134328358209


Epoch[2] Batch[675] Speed: 1.2520174033924705 samples/sec                   batch loss = 1644.205479323864 | accuracy = 0.6618518518518518


Epoch[2] Batch[680] Speed: 1.2604618413369664 samples/sec                   batch loss = 1655.497397840023 | accuracy = 0.6613970588235294


Epoch[2] Batch[685] Speed: 1.2610013756816494 samples/sec                   batch loss = 1663.3203803300858 | accuracy = 0.6631386861313868


Epoch[2] Batch[690] Speed: 1.255863020246928 samples/sec                   batch loss = 1674.4510509967804 | accuracy = 0.6634057971014493


Epoch[2] Batch[695] Speed: 1.2613302491021867 samples/sec                   batch loss = 1684.2756134271622 | accuracy = 0.6640287769784172


Epoch[2] Batch[700] Speed: 1.2559024108195007 samples/sec                   batch loss = 1699.8058708906174 | accuracy = 0.6632142857142858


Epoch[2] Batch[705] Speed: 1.2646411211494315 samples/sec                   batch loss = 1711.3220797777176 | accuracy = 0.6634751773049645


Epoch[2] Batch[710] Speed: 1.2556656345497972 samples/sec                   batch loss = 1724.8898738622665 | accuracy = 0.6637323943661971


Epoch[2] Batch[715] Speed: 1.259427833619842 samples/sec                   batch loss = 1736.9790219068527 | accuracy = 0.663986013986014


Epoch[2] Batch[720] Speed: 1.2577991804056048 samples/sec                   batch loss = 1747.810239493847 | accuracy = 0.6649305555555556


Epoch[2] Batch[725] Speed: 1.2567078737498625 samples/sec                   batch loss = 1756.7275513410568 | accuracy = 0.6658620689655173


Epoch[2] Batch[730] Speed: 1.2605864758552372 samples/sec                   batch loss = 1768.4488780498505 | accuracy = 0.6657534246575343


Epoch[2] Batch[735] Speed: 1.2580608176127746 samples/sec                   batch loss = 1778.0310853719711 | accuracy = 0.6663265306122449


Epoch[2] Batch[740] Speed: 1.2570110599365607 samples/sec                   batch loss = 1790.952754497528 | accuracy = 0.6662162162162162


Epoch[2] Batch[745] Speed: 1.252889648182034 samples/sec                   batch loss = 1801.9440222978592 | accuracy = 0.6667785234899329


Epoch[2] Batch[750] Speed: 1.2541330278860667 samples/sec                   batch loss = 1814.5422152280807 | accuracy = 0.6676666666666666


Epoch[2] Batch[755] Speed: 1.2564620425584434 samples/sec                   batch loss = 1827.0616859197617 | accuracy = 0.6682119205298013


Epoch[2] Batch[760] Speed: 1.2510077391172787 samples/sec                   batch loss = 1837.8559937477112 | accuracy = 0.6684210526315789


Epoch[2] Batch[765] Speed: 1.2555608573595858 samples/sec                   batch loss = 1850.099746465683 | accuracy = 0.6686274509803921


Epoch[2] Batch[770] Speed: 1.2538286045763072 samples/sec                   batch loss = 1861.2414529323578 | accuracy = 0.6685064935064935


Epoch[2] Batch[775] Speed: 1.262504271441621 samples/sec                   batch loss = 1874.8773002624512 | accuracy = 0.667741935483871


Epoch[2] Batch[780] Speed: 1.2594064673972036 samples/sec                   batch loss = 1883.9755182266235 | accuracy = 0.6689102564102564


Epoch[2] Batch[785] Speed: 1.2543594729500118 samples/sec                   batch loss = 1893.6669139266014 | accuracy = 0.6694267515923567


[Epoch 2] training: accuracy=0.6684644670050761
[Epoch 2] time cost: 643.0619361400604
[Epoch 2] validation: validation accuracy=0.7411111111111112


## Next steps

Now that you have trained a neural network and run predictions with it on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).