<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference may run faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one Nvidia GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet; see the [installation instructions](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html) for your system.

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument. The following code falls back to the CPU when no GPU is available.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[09:32:03] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `ctx` shown in the output tells you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The following example copies `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to copying `x` to the CPU, as in the output shown here:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[09:32:03] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
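
As a minimal sketch of this zero-based indexing, the hypothetical helper below (`pick_device_ids` is an illustrative name, not part of MXNet) returns the IDs you might pass to `npx.gpu(i)`, given how many GPUs are available and how many you want to use. An empty list signals that you would need to fall back to the CPU.

```python
def pick_device_ids(num_available, num_requested):
    """Return zero-based GPU IDs to use (hypothetical helper, not an MXNet API)."""
    # Never request more GPUs than exist, and never return a negative count.
    count = max(0, min(num_available, num_requested))
    return list(range(count))


# With eight GPUs, asking for two selects the first two IDs.
print(pick_device_ids(8, 2))       # [0, 1]
# Asking for all eight ends at ID 7, the last GPU.
print(pick_device_ids(8, 8)[-1])   # 7
# With no GPUs the list is empty, so the caller would use the CPU instead.
print(pick_device_ids(0, 2))       # []
```

In an MXNet session, the returned IDs could then be mapped to devices with `[npx.gpu(i) for i in ids]`.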

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to ensure that the inputs of the operation are already on that GPU. The output is allocated on the same GPU. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that all inputs must be on the same device; if they are not, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and the parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below; alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[09:32:04] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 4.0864916, -1.744343 ]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With *n* GPUs, you split each data batch into *n* parts and have each GPU run the forward and backward passes on its own part of the data.
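
To make the splitting step concrete, here is a minimal plain-Python sketch of dividing one batch into *n* contiguous parts, one per device. The function `split_batch` is a hypothetical illustration and not MXNet's implementation; the training code later in this tutorial relies on `gluon.utils.split_and_load` for this step, which also copies each part to its device.

```python
def split_batch(batch, num_devices):
    """Split a sequence into num_devices contiguous parts (illustrative only)."""
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    # Each part gets `base` samples; the first `extra` parts get one more.
    base, extra = divmod(len(batch), num_devices)
    parts = []
    start = 0
    for i in range(num_devices):
        size = base + (1 if i < extra else 0)
        parts.append(batch[start:start + size])
        start += size
    return parts


# A batch of 8 samples split across 2 devices gives 4 samples per device.
print(split_batch(list(range(8)), 2))  # [[0, 1, 2, 3], [4, 5, 6, 7]]
```

Each part would then go through its own forward and backward pass, and the gradients from all parts would be combined before the parameter update.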

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
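
As a small illustration of the normalization step in the transforms above, the sketch below applies the same per-channel arithmetic, `(value - mean) / std`, to a single RGB pixel in plain Python. The helper `normalize_pixel` is a hypothetical name used only for this example; the real `transforms.Normalize` operates on whole image tensors.

```python
# Per-channel statistics copied from the transform definitions above.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Normalize one RGB pixel whose values lie in the range (0, 1)."""
    return [(value - m) / s for value, m, s in zip(rgb, mean, std)]


# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel([0.485, 0.456, 0.406]))  # [0.0, 0.0, 0.0]
# A white pixel maps to a positive value in every channel.
print(normalize_pixel([1.0, 1.0, 1.0]))
```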

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
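
For intuition about what the accuracy metric accumulates, here is a simplified plain-Python sketch. `RunningAccuracy` is a hypothetical stand-in and not the `gluon.metric.Accuracy` implementation: it takes the index of the largest score in each row as the predicted class and keeps a running count across calls to `update`. Unlike the helper above, it takes one list of labels and scores at a time rather than one list per device.

```python
class RunningAccuracy:
    """Simplified running accuracy (illustrative, not the MXNet class)."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, labels, scores):
        # The predicted class is the index of the largest score in each row.
        for label, row in zip(labels, scores):
            predicted = max(range(len(row)), key=row.__getitem__)
            self.correct += int(predicted == label)
            self.total += 1

    def get(self):
        # Return NaN when nothing has been counted yet.
        return self.correct / self.total if self.total else float("nan")


acc = RunningAccuracy()
acc.update([0, 1], [[0.9, 0.1], [0.2, 0.8]])  # both predictions correct
acc.update([1], [[0.7, 0.3]])                 # predicted class 0, label 1
print(acc.get())  # 2 of 3 predictions are correct
```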

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

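            # Note: `btic` is reset only every `log_interval` batches, so the
            # elapsed time below spans `log_interval` batches; `train_loss`
            # is the running total for the epoch.
            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \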
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.783094670758603 samples/sec                   batch loss = 14.316141843795776 | accuracy = 0.55


Epoch[1] Batch[10] Speed: 1.2759360882662316 samples/sec                   batch loss = 28.32122778892517 | accuracy = 0.525


Epoch[1] Batch[15] Speed: 1.2714141327566006 samples/sec                   batch loss = 41.935529947280884 | accuracy = 0.5333333333333333


Epoch[1] Batch[20] Speed: 1.2757937506758343 samples/sec                   batch loss = 55.856451749801636 | accuracy = 0.5375


Epoch[1] Batch[25] Speed: 1.265248930393125 samples/sec                   batch loss = 70.08618783950806 | accuracy = 0.53


Epoch[1] Batch[30] Speed: 1.2735307142626164 samples/sec                   batch loss = 84.33653736114502 | accuracy = 0.5166666666666667


Epoch[1] Batch[35] Speed: 1.274145455611446 samples/sec                   batch loss = 97.02404403686523 | accuracy = 0.5285714285714286


Epoch[1] Batch[40] Speed: 1.2724716881159486 samples/sec                   batch loss = 111.41947078704834 | accuracy = 0.53125


Epoch[1] Batch[45] Speed: 1.2690292176278801 samples/sec                   batch loss = 125.09445881843567 | accuracy = 0.5333333333333333


Epoch[1] Batch[50] Speed: 1.2682020404206322 samples/sec                   batch loss = 139.22980451583862 | accuracy = 0.54


Epoch[1] Batch[55] Speed: 1.267361197143477 samples/sec                   batch loss = 154.82052159309387 | accuracy = 0.5227272727272727


Epoch[1] Batch[60] Speed: 1.2690177949902695 samples/sec                   batch loss = 168.444983959198 | accuracy = 0.5375


Epoch[1] Batch[65] Speed: 1.2746877636412703 samples/sec                   batch loss = 182.43665289878845 | accuracy = 0.5384615384615384


Epoch[1] Batch[70] Speed: 1.272390238131172 samples/sec                   batch loss = 196.2878999710083 | accuracy = 0.5428571428571428


Epoch[1] Batch[75] Speed: 1.2734282506006525 samples/sec                   batch loss = 210.58387732505798 | accuracy = 0.54


Epoch[1] Batch[80] Speed: 1.2755898567239912 samples/sec                   batch loss = 225.12832403182983 | accuracy = 0.53125


Epoch[1] Batch[85] Speed: 1.276517605589062 samples/sec                   batch loss = 238.1700475215912 | accuracy = 0.5352941176470588


Epoch[1] Batch[90] Speed: 1.2760431292004986 samples/sec                   batch loss = 252.27430391311646 | accuracy = 0.5333333333333333


Epoch[1] Batch[95] Speed: 1.272409634634215 samples/sec                   batch loss = 265.88613629341125 | accuracy = 0.5394736842105263


Epoch[1] Batch[100] Speed: 1.2653686918011997 samples/sec                   batch loss = 279.88091826438904 | accuracy = 0.535


Epoch[1] Batch[105] Speed: 1.2758821379028547 samples/sec                   batch loss = 294.4944281578064 | accuracy = 0.5166666666666667


Epoch[1] Batch[110] Speed: 1.2763883445037114 samples/sec                   batch loss = 308.1333529949188 | accuracy = 0.5136363636363637


Epoch[1] Batch[115] Speed: 1.2774393127058385 samples/sec                   batch loss = 321.31566429138184 | accuracy = 0.5260869565217391


Epoch[1] Batch[120] Speed: 1.2784794451850137 samples/sec                   batch loss = 335.1029899120331 | accuracy = 0.5291666666666667


Epoch[1] Batch[125] Speed: 1.2801436684674312 samples/sec                   batch loss = 349.2975640296936 | accuracy = 0.526


Epoch[1] Batch[130] Speed: 1.2791587563720759 samples/sec                   batch loss = 363.29272532463074 | accuracy = 0.5288461538461539


Epoch[1] Batch[135] Speed: 1.27344903199157 samples/sec                   batch loss = 376.93960976600647 | accuracy = 0.5333333333333333


Epoch[1] Batch[140] Speed: 1.2726798961873118 samples/sec                   batch loss = 391.0582401752472 | accuracy = 0.5321428571428571


Epoch[1] Batch[145] Speed: 1.272023263540387 samples/sec                   batch loss = 404.6595425605774 | accuracy = 0.5327586206896552


Epoch[1] Batch[150] Speed: 1.2754857039028888 samples/sec                   batch loss = 417.9907784461975 | accuracy = 0.5366666666666666


Epoch[1] Batch[155] Speed: 1.274160744658649 samples/sec                   batch loss = 431.68301010131836 | accuracy = 0.5370967741935484


Epoch[1] Batch[160] Speed: 1.2763609612182065 samples/sec                   batch loss = 446.1188907623291 | accuracy = 0.53125


Epoch[1] Batch[165] Speed: 1.2684288000774187 samples/sec                   batch loss = 460.218213558197 | accuracy = 0.5303030303030303


Epoch[1] Batch[170] Speed: 1.2763535815266536 samples/sec                   batch loss = 473.3367745876312 | accuracy = 0.5338235294117647


Epoch[1] Batch[175] Speed: 1.2853176054221518 samples/sec                   batch loss = 487.3404107093811 | accuracy = 0.5328571428571428


Epoch[1] Batch[180] Speed: 1.2785776564934288 samples/sec                   batch loss = 501.6969380378723 | accuracy = 0.5277777777777778


Epoch[1] Batch[185] Speed: 1.2810607415945836 samples/sec                   batch loss = 515.1824131011963 | accuracy = 0.5310810810810811


Epoch[1] Batch[190] Speed: 1.2804897380872369 samples/sec                   batch loss = 529.306631565094 | accuracy = 0.5289473684210526


Epoch[1] Batch[195] Speed: 1.2815767445818425 samples/sec                   batch loss = 542.6885046958923 | accuracy = 0.532051282051282


Epoch[1] Batch[200] Speed: 1.2802438941195387 samples/sec                   batch loss = 556.3418855667114 | accuracy = 0.53125


Epoch[1] Batch[205] Speed: 1.2736710969129987 samples/sec                   batch loss = 570.4058275222778 | accuracy = 0.5304878048780488


Epoch[1] Batch[210] Speed: 1.278296897985114 samples/sec                   batch loss = 583.668704032898 | accuracy = 0.5333333333333333


Epoch[1] Batch[215] Speed: 1.2834664914905745 samples/sec                   batch loss = 597.6600160598755 | accuracy = 0.5302325581395348


Epoch[1] Batch[220] Speed: 1.2768587046972908 samples/sec                   batch loss = 611.0085022449493 | accuracy = 0.5318181818181819


Epoch[1] Batch[225] Speed: 1.2844483156120743 samples/sec                   batch loss = 624.5536468029022 | accuracy = 0.5355555555555556


Epoch[1] Batch[230] Speed: 1.281829662411045 samples/sec                   batch loss = 637.7556984424591 | accuracy = 0.5402173913043479


Epoch[1] Batch[235] Speed: 1.2870310868050134 samples/sec                   batch loss = 651.9994628429413 | accuracy = 0.5414893617021277


Epoch[1] Batch[240] Speed: 1.2830806387100375 samples/sec                   batch loss = 666.0684244632721 | accuracy = 0.5395833333333333


Epoch[1] Batch[245] Speed: 1.2849131222427799 samples/sec                   batch loss = 678.989604473114 | accuracy = 0.5428571428571428


Epoch[1] Batch[250] Speed: 1.2844702449565075 samples/sec                   batch loss = 692.7898871898651 | accuracy = 0.546


Epoch[1] Batch[255] Speed: 1.2814372565543064 samples/sec                   batch loss = 706.6269490718842 | accuracy = 0.546078431372549


Epoch[1] Batch[260] Speed: 1.27855427147998 samples/sec                   batch loss = 720.5690205097198 | accuracy = 0.5461538461538461


Epoch[1] Batch[265] Speed: 1.278643528879618 samples/sec                   batch loss = 734.6053879261017 | accuracy = 0.5462264150943397


Epoch[1] Batch[270] Speed: 1.2812440790858861 samples/sec                   batch loss = 747.5954768657684 | accuracy = 0.5509259259259259


Epoch[1] Batch[275] Speed: 1.2840119495683764 samples/sec                   batch loss = 761.4906949996948 | accuracy = 0.5527272727272727


Epoch[1] Batch[280] Speed: 1.2855302358917995 samples/sec                   batch loss = 775.1678855419159 | accuracy = 0.5544642857142857


Epoch[1] Batch[285] Speed: 1.2783695599415936 samples/sec                   batch loss = 790.0316479206085 | accuracy = 0.5517543859649123


Epoch[1] Batch[290] Speed: 1.2838739941764588 samples/sec                   batch loss = 803.2871294021606 | accuracy = 0.553448275862069


Epoch[1] Batch[295] Speed: 1.2874802769109803 samples/sec                   batch loss = 816.4376404285431 | accuracy = 0.5542372881355933


Epoch[1] Batch[300] Speed: 1.2840904716080515 samples/sec                   batch loss = 830.3536446094513 | accuracy = 0.5516666666666666


Epoch[1] Batch[305] Speed: 1.2828957942939745 samples/sec                   batch loss = 843.204158782959 | accuracy = 0.5549180327868852


Epoch[1] Batch[310] Speed: 1.2821862472870345 samples/sec                   batch loss = 856.6916389465332 | accuracy = 0.5540322580645162


Epoch[1] Batch[315] Speed: 1.2866991365414322 samples/sec                   batch loss = 870.3208422660828 | accuracy = 0.5555555555555556


Epoch[1] Batch[320] Speed: 1.2833208983023543 samples/sec                   batch loss = 884.5530037879944 | accuracy = 0.553125


Epoch[1] Batch[325] Speed: 1.278523287654739 samples/sec                   batch loss = 897.1394007205963 | accuracy = 0.556923076923077


Epoch[1] Batch[330] Speed: 1.2822445540655885 samples/sec                   batch loss = 910.4793202877045 | accuracy = 0.5568181818181818


Epoch[1] Batch[335] Speed: 1.2765655874397364 samples/sec                   batch loss = 923.6951270103455 | accuracy = 0.5582089552238806


Epoch[1] Batch[340] Speed: 1.2824296029587303 samples/sec                   batch loss = 937.6313240528107 | accuracy = 0.5566176470588236


Epoch[1] Batch[345] Speed: 1.283282222995277 samples/sec                   batch loss = 951.7636306285858 | accuracy = 0.5543478260869565


Epoch[1] Batch[350] Speed: 1.288028856019851 samples/sec                   batch loss = 965.619722366333 | accuracy = 0.5571428571428572


Epoch[1] Batch[355] Speed: 1.2768406299354502 samples/sec                   batch loss = 979.4691593647003 | accuracy = 0.5584507042253521


Epoch[1] Batch[360] Speed: 1.2698198948251171 samples/sec                   batch loss = 992.945657491684 | accuracy = 0.5590277777777778


Epoch[1] Batch[365] Speed: 1.27089655571545 samples/sec                   batch loss = 1005.8907775878906 | accuracy = 0.560958904109589


Epoch[1] Batch[370] Speed: 1.2649850581831348 samples/sec                   batch loss = 1019.9775748252869 | accuracy = 0.5608108108108109


Epoch[1] Batch[375] Speed: 1.2693544170914628 samples/sec                   batch loss = 1033.3677842617035 | accuracy = 0.5613333333333334


Epoch[1] Batch[380] Speed: 1.2686433615463018 samples/sec                   batch loss = 1047.006695985794 | accuracy = 0.5598684210526316


Epoch[1] Batch[385] Speed: 1.2792489760688133 samples/sec                   batch loss = 1059.9737606048584 | accuracy = 0.561038961038961


Epoch[1] Batch[390] Speed: 1.2735163103556562 samples/sec                   batch loss = 1072.7031688690186 | accuracy = 0.5653846153846154


Epoch[1] Batch[395] Speed: 1.2765398477393735 samples/sec                   batch loss = 1086.712244272232 | accuracy = 0.5626582278481013


Epoch[1] Batch[400] Speed: 1.276593756603265 samples/sec                   batch loss = 1099.8353855609894 | accuracy = 0.56375


Epoch[1] Batch[405] Speed: 1.2805194489982852 samples/sec                   batch loss = 1112.934396982193 | accuracy = 0.5648148148148148


Epoch[1] Batch[410] Speed: 1.2765244044173856 samples/sec                   batch loss = 1127.4539153575897 | accuracy = 0.5628048780487804


Epoch[1] Batch[415] Speed: 1.281404664781462 samples/sec                   batch loss = 1140.8783700466156 | accuracy = 0.5620481927710843


Epoch[1] Batch[420] Speed: 1.2798796011697202 samples/sec                   batch loss = 1154.0082113742828 | accuracy = 0.5636904761904762


Epoch[1] Batch[425] Speed: 1.2797552225995592 samples/sec                   batch loss = 1167.467398405075 | accuracy = 0.5647058823529412


Epoch[1] Batch[430] Speed: 1.2590463760163346 samples/sec                   batch loss = 1180.2967567443848 | accuracy = 0.5668604651162791


Epoch[1] Batch[435] Speed: 1.2542648530386393 samples/sec                   batch loss = 1194.2813153266907 | accuracy = 0.5666666666666667


Epoch[1] Batch[440] Speed: 1.2563552507851838 samples/sec                   batch loss = 1207.7566757202148 | accuracy = 0.5664772727272728


Epoch[1] Batch[445] Speed: 1.2452776439059485 samples/sec                   batch loss = 1220.8255305290222 | accuracy = 0.5674157303370787


Epoch[1] Batch[450] Speed: 1.2497807864898776 samples/sec                   batch loss = 1235.2986147403717 | accuracy = 0.5655555555555556


Epoch[1] Batch[455] Speed: 1.2495564567393662 samples/sec                   batch loss = 1249.439046382904 | accuracy = 0.5664835164835165


Epoch[1] Batch[460] Speed: 1.2500703933241424 samples/sec                   batch loss = 1263.249220609665 | accuracy = 0.5668478260869565


Epoch[1] Batch[465] Speed: 1.2502800928192652 samples/sec                   batch loss = 1276.203650712967 | accuracy = 0.5682795698924731


Epoch[1] Batch[470] Speed: 1.2522995415360323 samples/sec                   batch loss = 1288.95596408844 | accuracy = 0.5707446808510638


Epoch[1] Batch[475] Speed: 1.2486541806825555 samples/sec                   batch loss = 1302.8993790149689 | accuracy = 0.5710526315789474


Epoch[1] Batch[480] Speed: 1.2469624217025899 samples/sec                   batch loss = 1316.7458946704865 | accuracy = 0.5703125


Epoch[1] Batch[485] Speed: 1.2469943971821873 samples/sec                   batch loss = 1330.0927748680115 | accuracy = 0.5711340206185567


Epoch[1] Batch[490] Speed: 1.2463942731828987 samples/sec                   batch loss = 1343.8477790355682 | accuracy = 0.5698979591836735


Epoch[1] Batch[495] Speed: 1.2451233051026391 samples/sec                   batch loss = 1357.769701242447 | accuracy = 0.5686868686868687


Epoch[1] Batch[500] Speed: 1.2447862956409892 samples/sec                   batch loss = 1370.8840186595917 | accuracy = 0.5705


Epoch[1] Batch[505] Speed: 1.2585481640713994 samples/sec                   batch loss = 1384.3857989311218 | accuracy = 0.5693069306930693


Epoch[1] Batch[510] Speed: 1.252507745485222 samples/sec                   batch loss = 1398.6625101566315 | accuracy = 0.5686274509803921


Epoch[1] Batch[515] Speed: 1.251207675426723 samples/sec                   batch loss = 1413.0330138206482 | accuracy = 0.566990291262136


Epoch[1] Batch[520] Speed: 1.2529305366987344 samples/sec                   batch loss = 1426.7329688072205 | accuracy = 0.5658653846153846


Epoch[1] Batch[525] Speed: 1.259431048070056 samples/sec                   batch loss = 1440.3711152076721 | accuracy = 0.5661904761904762


Epoch[1] Batch[530] Speed: 1.2538266368030038 samples/sec                   batch loss = 1455.2154698371887 | accuracy = 0.5641509433962264


Epoch[1] Batch[535] Speed: 1.2571298320329223 samples/sec                   batch loss = 1468.303628206253 | accuracy = 0.5658878504672897


Epoch[1] Batch[540] Speed: 1.2561033505865322 samples/sec                   batch loss = 1481.4357500076294 | accuracy = 0.5662037037037037


Epoch[1] Batch[545] Speed: 1.251868395307744 samples/sec                   batch loss = 1494.8466258049011 | accuracy = 0.5665137614678899


Epoch[1] Batch[550] Speed: 1.2475845231908156 samples/sec                   batch loss = 1508.1645979881287 | accuracy = 0.5668181818181818


Epoch[1] Batch[555] Speed: 1.2518044121265806 samples/sec                   batch loss = 1520.6686215400696 | accuracy = 0.568018018018018


Epoch[1] Batch[560] Speed: 1.2542663533421017 samples/sec                   batch loss = 1533.6321334838867 | accuracy = 0.5683035714285715


Epoch[1] Batch[565] Speed: 1.2568246113610029 samples/sec                   batch loss = 1546.6555290222168 | accuracy = 0.5685840707964602


Epoch[1] Batch[570] Speed: 1.255981011326816 samples/sec                   batch loss = 1560.6186497211456 | accuracy = 0.5679824561403509


Epoch[1] Batch[575] Speed: 1.2527666244627815 samples/sec                   batch loss = 1574.2460334300995 | accuracy = 0.568695652173913


Epoch[1] Batch[580] Speed: 1.2557612178743485 samples/sec                   batch loss = 1586.2243587970734 | accuracy = 0.5706896551724138


Epoch[1] Batch[585] Speed: 1.2534391995305771 samples/sec                   batch loss = 1599.4065613746643 | accuracy = 0.5726495726495726


Epoch[1] Batch[590] Speed: 1.2494117555941677 samples/sec                   batch loss = 1613.3414661884308 | accuracy = 0.5724576271186441


Epoch[1] Batch[595] Speed: 1.249009932095591 samples/sec                   batch loss = 1626.5358245372772 | accuracy = 0.573109243697479


Epoch[1] Batch[600] Speed: 1.2486461885969564 samples/sec                   batch loss = 1639.6918065547943 | accuracy = 0.57375


Epoch[1] Batch[605] Speed: 1.2508451688786988 samples/sec                   batch loss = 1653.4058167934418 | accuracy = 0.5731404958677686


Epoch[1] Batch[610] Speed: 1.252352357268977 samples/sec                   batch loss = 1668.202208995819 | accuracy = 0.5725409836065574


Epoch[1] Batch[615] Speed: 1.2534097955512258 samples/sec                   batch loss = 1680.1440188884735 | accuracy = 0.5739837398373984


Epoch[1] Batch[620] Speed: 1.2531876247279858 samples/sec                   batch loss = 1693.7103412151337 | accuracy = 0.5745967741935484


Epoch[1] Batch[625] Speed: 1.2530593950871474 samples/sec                   batch loss = 1707.577833890915 | accuracy = 0.5736


Epoch[1] Batch[630] Speed: 1.253141571355212 samples/sec                   batch loss = 1721.0501866340637 | accuracy = 0.5734126984126984


Epoch[1] Batch[635] Speed: 1.246946573594431 samples/sec                   batch loss = 1733.7913930416107 | accuracy = 0.5744094488188977


Epoch[1] Batch[640] Speed: 1.2477614657665428 samples/sec                   batch loss = 1746.2901062965393 | accuracy = 0.575


Epoch[1] Batch[645] Speed: 1.2530720296891418 samples/sec                   batch loss = 1758.750005722046 | accuracy = 0.5763565891472868


Epoch[1] Batch[650] Speed: 1.248847043999165 samples/sec                   batch loss = 1771.026671886444 | accuracy = 0.5773076923076923


Epoch[1] Batch[655] Speed: 1.256558029509897 samples/sec                   batch loss = 1782.5498745441437 | accuracy = 0.5786259541984733


Epoch[1] Batch[660] Speed: 1.2448225006282647 samples/sec                   batch loss = 1795.1130616664886 | accuracy = 0.5799242424242425


Epoch[1] Batch[665] Speed: 1.2480776192070997 samples/sec                   batch loss = 1808.5899646282196 | accuracy = 0.5796992481203007


Epoch[1] Batch[670] Speed: 1.2503314338009937 samples/sec                   batch loss = 1821.5549533367157 | accuracy = 0.5809701492537314


Epoch[1] Batch[675] Speed: 1.2531780767993987 samples/sec                   batch loss = 1834.7173767089844 | accuracy = 0.5818518518518518


Epoch[1] Batch[680] Speed: 1.246128302398029 samples/sec                   batch loss = 1847.2046718597412 | accuracy = 0.5834558823529412


Epoch[1] Batch[685] Speed: 1.2516105409958536 samples/sec                   batch loss = 1860.8323075771332 | accuracy = 0.583941605839416


Epoch[1] Batch[690] Speed: 1.246846304259727 samples/sec                   batch loss = 1874.5592074394226 | accuracy = 0.5840579710144927


Epoch[1] Batch[695] Speed: 1.2510157614622461 samples/sec                   batch loss = 1886.423435330391 | accuracy = 0.5845323741007195


Epoch[1] Batch[700] Speed: 1.250293416843144 samples/sec                   batch loss = 1899.2781802415848 | accuracy = 0.5846428571428571


Epoch[1] Batch[705] Speed: 1.2512568529427053 samples/sec                   batch loss = 1911.9684666395187 | accuracy = 0.5854609929078014


Epoch[1] Batch[710] Speed: 1.2492087652120873 samples/sec                   batch loss = 1924.5378185510635 | accuracy = 0.5855633802816902


Epoch[1] Batch[715] Speed: 1.2543727902777304 samples/sec                   batch loss = 1936.3422667980194 | accuracy = 0.5867132867132867


Epoch[1] Batch[720] Speed: 1.2493997529531276 samples/sec                   batch loss = 1948.864203095436 | accuracy = 0.5881944444444445


Epoch[1] Batch[725] Speed: 1.2547574266592993 samples/sec                   batch loss = 1962.9324785470963 | accuracy = 0.5872413793103448


Epoch[1] Batch[730] Speed: 1.2513314197763459 samples/sec                   batch loss = 1975.4193657636642 | accuracy = 0.5883561643835616


Epoch[1] Batch[735] Speed: 1.2552073751405703 samples/sec                   batch loss = 1989.046921133995 | accuracy = 0.5880952380952381


Epoch[1] Batch[740] Speed: 1.2542722608218746 samples/sec                   batch loss = 1999.5909126996994 | accuracy = 0.5898648648648649


Epoch[1] Batch[745] Speed: 1.2483599349554586 samples/sec                   batch loss = 2013.2277277708054 | accuracy = 0.5902684563758389


Epoch[1] Batch[750] Speed: 1.2413781672974724 samples/sec                   batch loss = 2026.3577815294266 | accuracy = 0.5896666666666667


Epoch[1] Batch[755] Speed: 1.2454695579519688 samples/sec                   batch loss = 2041.4100040197372 | accuracy = 0.5890728476821192


Epoch[1] Batch[760] Speed: 1.2547103194837095 samples/sec                   batch loss = 2053.247197031975 | accuracy = 0.5898026315789474


Epoch[1] Batch[765] Speed: 1.249960215165919 samples/sec                   batch loss = 2064.9458767175674 | accuracy = 0.5908496732026144


Epoch[1] Batch[770] Speed: 1.247117680321105 samples/sec                   batch loss = 2078.962975859642 | accuracy = 0.5902597402597403


Epoch[1] Batch[775] Speed: 1.246531422559573 samples/sec                   batch loss = 2093.3498128652573 | accuracy = 0.59


Epoch[1] Batch[780] Speed: 1.24441143924043 samples/sec                   batch loss = 2106.160745024681 | accuracy = 0.5897435897435898


Epoch[1] Batch[785] Speed: 1.2444473455049303 samples/sec                   batch loss = 2120.4039767980576 | accuracy = 0.589171974522293


[Epoch 1] training: accuracy=0.5891497461928934
[Epoch 1] time cost: 642.9951708316803
[Epoch 1] validation: validation accuracy=0.6744444444444444


Epoch[2] Batch[5] Speed: 1.2504639521530965 samples/sec                   batch loss = 11.998680472373962 | accuracy = 0.7


Epoch[2] Batch[10] Speed: 1.2494840554496562 samples/sec                   batch loss = 24.415566325187683 | accuracy = 0.725


Epoch[2] Batch[15] Speed: 1.2525604852678027 samples/sec                   batch loss = 37.273019909858704 | accuracy = 0.7


Epoch[2] Batch[20] Speed: 1.2523374001490812 samples/sec                   batch loss = 48.1581369638443 | accuracy = 0.75


Epoch[2] Batch[25] Speed: 1.2457759479603976 samples/sec                   batch loss = 60.8162100315094 | accuracy = 0.74


Epoch[2] Batch[30] Speed: 1.2519082830235901 samples/sec                   batch loss = 73.50262331962585 | accuracy = 0.7416666666666667


Epoch[2] Batch[35] Speed: 1.2514125294639822 samples/sec                   batch loss = 86.13365375995636 | accuracy = 0.7214285714285714


Epoch[2] Batch[40] Speed: 1.2474160707680355 samples/sec                   batch loss = 98.24956464767456 | accuracy = 0.725


Epoch[2] Batch[45] Speed: 1.2430203531311863 samples/sec                   batch loss = 111.34662961959839 | accuracy = 0.7055555555555556


Epoch[2] Batch[50] Speed: 1.2479071771925045 samples/sec                   batch loss = 124.67255771160126 | accuracy = 0.695


Epoch[2] Batch[55] Speed: 1.244007751795435 samples/sec                   batch loss = 135.8932422399521 | accuracy = 0.7090909090909091


Epoch[2] Batch[60] Speed: 1.2444072856931572 samples/sec                   batch loss = 151.2633537054062 | accuracy = 0.6791666666666667


Epoch[2] Batch[65] Speed: 1.250381847107032 samples/sec                   batch loss = 164.12126243114471 | accuracy = 0.6730769230769231


Epoch[2] Batch[70] Speed: 1.254761837273164 samples/sec                   batch loss = 177.7511225938797 | accuracy = 0.6678571428571428


Epoch[2] Batch[75] Speed: 1.2550738495634903 samples/sec                   batch loss = 191.52006423473358 | accuracy = 0.66


Epoch[2] Batch[80] Speed: 1.2529463501359315 samples/sec                   batch loss = 204.0530368089676 | accuracy = 0.653125


Epoch[2] Batch[85] Speed: 1.2495719989840888 samples/sec                   batch loss = 217.3962780237198 | accuracy = 0.6470588235294118


Epoch[2] Batch[90] Speed: 1.2531580453436146 samples/sec                   batch loss = 229.39034688472748 | accuracy = 0.6555555555555556


Epoch[2] Batch[95] Speed: 1.2499119775890397 samples/sec                   batch loss = 240.1694209575653 | accuracy = 0.6657894736842105


Epoch[2] Batch[100] Speed: 1.252683000839392 samples/sec                   batch loss = 253.286776304245 | accuracy = 0.6625


Epoch[2] Batch[105] Speed: 1.255210192441887 samples/sec                   batch loss = 266.0108721256256 | accuracy = 0.6595238095238095


Epoch[2] Batch[110] Speed: 1.2460019759612424 samples/sec                   batch loss = 278.8728610277176 | accuracy = 0.6590909090909091


Epoch[2] Batch[115] Speed: 1.2484438184818651 samples/sec                   batch loss = 290.15080535411835 | accuracy = 0.6608695652173913


Epoch[2] Batch[120] Speed: 1.2469312819421228 samples/sec                   batch loss = 301.43768644332886 | accuracy = 0.6625


Epoch[2] Batch[125] Speed: 1.2522385051851874 samples/sec                   batch loss = 313.5871183872223 | accuracy = 0.668


Epoch[2] Batch[130] Speed: 1.246540221152721 samples/sec                   batch loss = 325.26628851890564 | accuracy = 0.6711538461538461


Epoch[2] Batch[135] Speed: 1.2471168459923814 samples/sec                   batch loss = 338.04854369163513 | accuracy = 0.6703703703703704


Epoch[2] Batch[140] Speed: 1.2494024512011743 samples/sec                   batch loss = 350.6456925868988 | accuracy = 0.6714285714285714


Epoch[2] Batch[145] Speed: 1.2498471700269396 samples/sec                   batch loss = 363.2362561225891 | accuracy = 0.6706896551724137


Epoch[2] Batch[150] Speed: 1.2527781306001213 samples/sec                   batch loss = 376.4659810066223 | accuracy = 0.665


Epoch[2] Batch[155] Speed: 1.2588986185834665 samples/sec                   batch loss = 389.1059238910675 | accuracy = 0.6645161290322581


Epoch[2] Batch[160] Speed: 1.2549740525842095 samples/sec                   batch loss = 402.9290862083435 | accuracy = 0.659375


Epoch[2] Batch[165] Speed: 1.2484130691695425 samples/sec                   batch loss = 415.5451166629791 | accuracy = 0.6575757575757576


Epoch[2] Batch[170] Speed: 1.2525529106641968 samples/sec                   batch loss = 428.05452823638916 | accuracy = 0.6588235294117647


Epoch[2] Batch[175] Speed: 1.2559211198566904 samples/sec                   batch loss = 439.84735679626465 | accuracy = 0.6585714285714286


Epoch[2] Batch[180] Speed: 1.254331901353378 samples/sec                   batch loss = 452.7673399448395 | accuracy = 0.6583333333333333


Epoch[2] Batch[185] Speed: 1.2472664867965566 samples/sec                   batch loss = 464.0045323371887 | accuracy = 0.6608108108108108


Epoch[2] Batch[190] Speed: 1.2522316821956474 samples/sec                   batch loss = 474.31254744529724 | accuracy = 0.6631578947368421


Epoch[2] Batch[195] Speed: 1.2514039419798213 samples/sec                   batch loss = 486.2436407804489 | accuracy = 0.6641025641025641


Epoch[2] Batch[200] Speed: 1.2524797877209588 samples/sec                   batch loss = 498.3195056915283 | accuracy = 0.66125


Epoch[2] Batch[205] Speed: 1.2595059304946656 samples/sec                   batch loss = 512.7024278640747 | accuracy = 0.6585365853658537


Epoch[2] Batch[210] Speed: 1.2507417540411514 samples/sec                   batch loss = 526.1694638729095 | accuracy = 0.6607142857142857


Epoch[2] Batch[215] Speed: 1.249875941520237 samples/sec                   batch loss = 535.1128077507019 | accuracy = 0.6662790697674419


Epoch[2] Batch[220] Speed: 1.254781638496208 samples/sec                   batch loss = 548.6246559619904 | accuracy = 0.6659090909090909


Epoch[2] Batch[225] Speed: 1.2537319098798287 samples/sec                   batch loss = 561.1102534532547 | accuracy = 0.6655555555555556


Epoch[2] Batch[230] Speed: 1.2527120901176498 samples/sec                   batch loss = 574.455414056778 | accuracy = 0.6630434782608695


Epoch[2] Batch[235] Speed: 1.2545100127333706 samples/sec                   batch loss = 586.0812451839447 | accuracy = 0.6670212765957447


Epoch[2] Batch[240] Speed: 1.254101622709016 samples/sec                   batch loss = 599.3795461654663 | accuracy = 0.665625


Epoch[2] Batch[245] Speed: 1.2480670348505047 samples/sec                   batch loss = 612.7197538614273 | accuracy = 0.6632653061224489


Epoch[2] Batch[250] Speed: 1.2591679901507418 samples/sec                   batch loss = 625.8517280817032 | accuracy = 0.661


Epoch[2] Batch[255] Speed: 1.2519932046829927 samples/sec                   batch loss = 638.4908940792084 | accuracy = 0.6598039215686274


Epoch[2] Batch[260] Speed: 1.2542615711373286 samples/sec                   batch loss = 650.185161948204 | accuracy = 0.6625


Epoch[2] Batch[265] Speed: 1.2513777135459725 samples/sec                   batch loss = 663.4297176599503 | accuracy = 0.6622641509433962


Epoch[2] Batch[270] Speed: 1.2549436378460457 samples/sec                   batch loss = 675.9261604547501 | accuracy = 0.6620370370370371


Epoch[2] Batch[275] Speed: 1.2509361955446756 samples/sec                   batch loss = 685.9979239702225 | accuracy = 0.6663636363636364


Epoch[2] Batch[280] Speed: 1.2483990420407494 samples/sec                   batch loss = 693.9919911623001 | accuracy = 0.6696428571428571


Epoch[2] Batch[285] Speed: 1.2496491575274322 samples/sec                   batch loss = 706.1878192424774 | accuracy = 0.6684210526315789


Epoch[2] Batch[290] Speed: 1.253620991698053 samples/sec                   batch loss = 719.4530295133591 | accuracy = 0.6672413793103448


Epoch[2] Batch[295] Speed: 1.2501764919174376 samples/sec                   batch loss = 730.4173021316528 | accuracy = 0.6703389830508475


Epoch[2] Batch[300] Speed: 1.2506096429207698 samples/sec                   batch loss = 742.6221264600754 | accuracy = 0.6708333333333333


Epoch[2] Batch[305] Speed: 1.249956024495157 samples/sec                   batch loss = 755.9530845880508 | accuracy = 0.671311475409836


Epoch[2] Batch[310] Speed: 1.2519674186707377 samples/sec                   batch loss = 766.4226509332657 | accuracy = 0.6733870967741935


Epoch[2] Batch[315] Speed: 1.2528694388252197 samples/sec                   batch loss = 778.5040845870972 | accuracy = 0.6722222222222223


Epoch[2] Batch[320] Speed: 1.2584383743678824 samples/sec                   batch loss = 788.8849945068359 | accuracy = 0.6734375


Epoch[2] Batch[325] Speed: 1.252427428710833 samples/sec                   batch loss = 800.0662466287613 | accuracy = 0.6746153846153846


Epoch[2] Batch[330] Speed: 1.2534401359857081 samples/sec                   batch loss = 812.3295794725418 | accuracy = 0.6765151515151515


Epoch[2] Batch[335] Speed: 1.256527161517945 samples/sec                   batch loss = 825.179176568985 | accuracy = 0.6753731343283582


Epoch[2] Batch[340] Speed: 1.2515152150348778 samples/sec                   batch loss = 838.05466401577 | accuracy = 0.6757352941176471


Epoch[2] Batch[345] Speed: 1.2503359065362654 samples/sec                   batch loss = 850.1767238378525 | accuracy = 0.6746376811594202


Epoch[2] Batch[350] Speed: 1.2504602241080292 samples/sec                   batch loss = 861.2794524431229 | accuracy = 0.6742857142857143


Epoch[2] Batch[355] Speed: 1.2496096928600664 samples/sec                   batch loss = 874.9578086137772 | accuracy = 0.6725352112676056


Epoch[2] Batch[360] Speed: 1.2547805123435396 samples/sec                   batch loss = 885.7679804563522 | accuracy = 0.6743055555555556


Epoch[2] Batch[365] Speed: 1.2525080260042314 samples/sec                   batch loss = 896.8567283153534 | accuracy = 0.6760273972602739


Epoch[2] Batch[370] Speed: 1.2519540589444351 samples/sec                   batch loss = 908.6954832077026 | accuracy = 0.6777027027027027


Epoch[2] Batch[375] Speed: 1.2539951380565322 samples/sec                   batch loss = 921.8601760864258 | accuracy = 0.676


Epoch[2] Batch[380] Speed: 1.254267384802814 samples/sec                   batch loss = 931.9012084007263 | accuracy = 0.6769736842105263


Epoch[2] Batch[385] Speed: 1.2522298128964202 samples/sec                   batch loss = 943.8043565750122 | accuracy = 0.6785714285714286


Epoch[2] Batch[390] Speed: 1.2607967814437633 samples/sec                   batch loss = 957.0947363376617 | accuracy = 0.6782051282051282


Epoch[2] Batch[395] Speed: 1.245845144712069 samples/sec                   batch loss = 970.478404045105 | accuracy = 0.6772151898734177


Epoch[2] Batch[400] Speed: 1.2468952321977098 samples/sec                   batch loss = 984.6071718931198 | accuracy = 0.675625


Epoch[2] Batch[405] Speed: 1.2484094462385351 samples/sec                   batch loss = 997.5084614753723 | accuracy = 0.6753086419753086


Epoch[2] Batch[410] Speed: 1.246709456158704 samples/sec                   batch loss = 1010.7586619853973 | accuracy = 0.6743902439024391


Epoch[2] Batch[415] Speed: 1.243936453215594 samples/sec                   batch loss = 1022.8039237260818 | accuracy = 0.6759036144578313


Epoch[2] Batch[420] Speed: 1.242271150750925 samples/sec                   batch loss = 1033.043203830719 | accuracy = 0.6773809523809524


Epoch[2] Batch[425] Speed: 1.2404195147262156 samples/sec                   batch loss = 1046.6919481754303 | accuracy = 0.6752941176470588


Epoch[2] Batch[430] Speed: 1.2431731577051217 samples/sec                   batch loss = 1057.4338110685349 | accuracy = 0.6767441860465117


Epoch[2] Batch[435] Speed: 1.2405234310024382 samples/sec                   batch loss = 1067.739690065384 | accuracy = 0.6781609195402298


Epoch[2] Batch[440] Speed: 1.244623122258296 samples/sec                   batch loss = 1078.8444557189941 | accuracy = 0.678409090909091


Epoch[2] Batch[445] Speed: 1.24449405444074 samples/sec                   batch loss = 1093.2336766719818 | accuracy = 0.6769662921348315


Epoch[2] Batch[450] Speed: 1.241698629553009 samples/sec                   batch loss = 1104.924897313118 | accuracy = 0.6772222222222222


Epoch[2] Batch[455] Speed: 1.2423587258156183 samples/sec                   batch loss = 1118.284138083458 | accuracy = 0.6758241758241759


Epoch[2] Batch[460] Speed: 1.2455698833884825 samples/sec                   batch loss = 1132.8003661632538 | accuracy = 0.6755434782608696


Epoch[2] Batch[465] Speed: 1.251055408282347 samples/sec                   batch loss = 1144.6917378902435 | accuracy = 0.6758064516129032


Epoch[2] Batch[470] Speed: 1.251277850213446 samples/sec                   batch loss = 1154.8197798132896 | accuracy = 0.6776595744680851


Epoch[2] Batch[475] Speed: 1.2425761530065038 samples/sec                   batch loss = 1165.9226362109184 | accuracy = 0.6773684210526316


Epoch[2] Batch[480] Speed: 1.2376337620550077 samples/sec                   batch loss = 1178.4147219061852 | accuracy = 0.6776041666666667


Epoch[2] Batch[485] Speed: 1.2488357958759995 samples/sec                   batch loss = 1188.8423369526863 | accuracy = 0.6798969072164949


Epoch[2] Batch[490] Speed: 1.2470017193343503 samples/sec                   batch loss = 1198.237576663494 | accuracy = 0.6821428571428572


Epoch[2] Batch[495] Speed: 1.2467769036010188 samples/sec                   batch loss = 1211.2570462822914 | accuracy = 0.6823232323232323


Epoch[2] Batch[500] Speed: 1.2423890856002242 samples/sec                   batch loss = 1223.923408806324 | accuracy = 0.683


Epoch[2] Batch[505] Speed: 1.2543251493103955 samples/sec                   batch loss = 1234.9773289561272 | accuracy = 0.6836633663366337


Epoch[2] Batch[510] Speed: 1.2449532070730418 samples/sec                   batch loss = 1245.8827431797981 | accuracy = 0.6833333333333333


Epoch[2] Batch[515] Speed: 1.2432800234884045 samples/sec                   batch loss = 1258.9170224666595 | accuracy = 0.683009708737864


Epoch[2] Batch[520] Speed: 1.244226033438895 samples/sec                   batch loss = 1270.0733925104141 | accuracy = 0.6836538461538462


Epoch[2] Batch[525] Speed: 1.2478011849979591 samples/sec                   batch loss = 1281.8588197231293 | accuracy = 0.6838095238095238


Epoch[2] Batch[530] Speed: 1.246107662702154 samples/sec                   batch loss = 1291.6347030997276 | accuracy = 0.6849056603773584


Epoch[2] Batch[535] Speed: 1.2408122500768426 samples/sec                   batch loss = 1301.5743821263313 | accuracy = 0.6869158878504673


Epoch[2] Batch[540] Speed: 1.2550583579287293 samples/sec                   batch loss = 1312.5281228423119 | accuracy = 0.6875


Epoch[2] Batch[545] Speed: 1.243770644143615 samples/sec                   batch loss = 1324.7275175452232 | accuracy = 0.686697247706422


Epoch[2] Batch[550] Speed: 1.243364792174662 samples/sec                   batch loss = 1336.1639562249184 | accuracy = 0.6872727272727273


Epoch[2] Batch[555] Speed: 1.2419624367645168 samples/sec                   batch loss = 1345.6960272192955 | accuracy = 0.6882882882882883


Epoch[2] Batch[560] Speed: 1.2472610160232729 samples/sec                   batch loss = 1358.8495970368385 | accuracy = 0.6875


Epoch[2] Batch[565] Speed: 1.2465209570263032 samples/sec                   batch loss = 1369.6014769673347 | accuracy = 0.6871681415929204


Epoch[2] Batch[570] Speed: 1.250942444787413 samples/sec                   batch loss = 1382.090790092945 | accuracy = 0.6868421052631579


Epoch[2] Batch[575] Speed: 1.253661553250003 samples/sec                   batch loss = 1394.5145145058632 | accuracy = 0.6860869565217391


Epoch[2] Batch[580] Speed: 1.2506446957976582 samples/sec                   batch loss = 1408.0015882849693 | accuracy = 0.6862068965517242


Epoch[2] Batch[585] Speed: 1.255086994280776 samples/sec                   batch loss = 1417.5467955470085 | accuracy = 0.6871794871794872


Epoch[2] Batch[590] Speed: 1.2525315900496057 samples/sec                   batch loss = 1428.6916727423668 | accuracy = 0.6872881355932203


Epoch[2] Batch[595] Speed: 1.2531474682444703 samples/sec                   batch loss = 1440.794382750988 | accuracy = 0.6865546218487395


Epoch[2] Batch[600] Speed: 1.248342007823299 samples/sec                   batch loss = 1455.11284917593 | accuracy = 0.68625


Epoch[2] Batch[605] Speed: 1.2482936164260439 samples/sec                   batch loss = 1469.1318796277046 | accuracy = 0.6851239669421487


Epoch[2] Batch[610] Speed: 1.2549274923495064 samples/sec                   batch loss = 1480.516884624958 | accuracy = 0.6852459016393443


Epoch[2] Batch[615] Speed: 1.2492879254752791 samples/sec                   batch loss = 1491.300932586193 | accuracy = 0.6849593495934959


Epoch[2] Batch[620] Speed: 1.2527523122454458 samples/sec                   batch loss = 1500.1725663542747 | accuracy = 0.6858870967741936


Epoch[2] Batch[625] Speed: 1.2498369280674733 samples/sec                   batch loss = 1508.9241977334023 | accuracy = 0.6876


Epoch[2] Batch[630] Speed: 1.2516862705951666 samples/sec                   batch loss = 1518.4544419646263 | accuracy = 0.6884920634920635


Epoch[2] Batch[635] Speed: 1.2525219586064629 samples/sec                   batch loss = 1529.885024368763 | accuracy = 0.6885826771653544


Epoch[2] Batch[640] Speed: 1.2515336068664051 samples/sec                   batch loss = 1539.4770582914352 | accuracy = 0.688671875


Epoch[2] Batch[645] Speed: 1.2470289696181363 samples/sec                   batch loss = 1551.5118092298508 | accuracy = 0.6891472868217055


Epoch[2] Batch[650] Speed: 1.246677402652371 samples/sec                   batch loss = 1560.1552733182907 | accuracy = 0.6896153846153846


Epoch[2] Batch[655] Speed: 1.2498219378657947 samples/sec                   batch loss = 1568.8159106969833 | accuracy = 0.6900763358778625


Epoch[2] Batch[660] Speed: 1.2455239258598751 samples/sec                   batch loss = 1582.0987421274185 | accuracy = 0.6901515151515152


Epoch[2] Batch[665] Speed: 1.2460777687496007 samples/sec                   batch loss = 1596.9388831853867 | accuracy = 0.6890977443609022


Epoch[2] Batch[670] Speed: 1.2430192479908164 samples/sec                   batch loss = 1607.3129291534424 | accuracy = 0.6899253731343283


Epoch[2] Batch[675] Speed: 1.2459381283869353 samples/sec                   batch loss = 1621.0677754878998 | accuracy = 0.69


Epoch[2] Batch[680] Speed: 1.2482881366386565 samples/sec                   batch loss = 1633.1993272304535 | accuracy = 0.6893382352941176


Epoch[2] Batch[685] Speed: 1.2460395473021404 samples/sec                   batch loss = 1644.1864515542984 | accuracy = 0.6894160583941605


Epoch[2] Batch[690] Speed: 1.2498257552122212 samples/sec                   batch loss = 1657.0107561349869 | accuracy = 0.6884057971014492


Epoch[2] Batch[695] Speed: 1.258719258951988 samples/sec                   batch loss = 1665.7298176288605 | accuracy = 0.689568345323741


Epoch[2] Batch[700] Speed: 1.247619313912654 samples/sec                   batch loss = 1681.7330132722855 | accuracy = 0.6882142857142857


Epoch[2] Batch[705] Speed: 1.2498758484064716 samples/sec                   batch loss = 1691.646965265274 | accuracy = 0.6882978723404255


Epoch[2] Batch[710] Speed: 1.2485243684231364 samples/sec                   batch loss = 1703.376137971878 | accuracy = 0.6873239436619718


Epoch[2] Batch[715] Speed: 1.2496067144957652 samples/sec                   batch loss = 1716.709662079811 | accuracy = 0.6870629370629371


Epoch[2] Batch[720] Speed: 1.2521571949412813 samples/sec                   batch loss = 1726.3608657121658 | accuracy = 0.6881944444444444


Epoch[2] Batch[725] Speed: 1.25309009293302 samples/sec                   batch loss = 1737.1635375022888 | accuracy = 0.6886206896551724


Epoch[2] Batch[730] Speed: 1.2583386079070529 samples/sec                   batch loss = 1751.3903875350952 | accuracy = 0.686986301369863


Epoch[2] Batch[735] Speed: 1.2573585857030194 samples/sec                   batch loss = 1761.1932303905487 | accuracy = 0.6877551020408164


Epoch[2] Batch[740] Speed: 1.2477829027011842 samples/sec                   batch loss = 1771.2470978498459 | accuracy = 0.6881756756756757


Epoch[2] Batch[745] Speed: 1.244211638883809 samples/sec                   batch loss = 1781.0348279476166 | accuracy = 0.6889261744966443


Epoch[2] Batch[750] Speed: 1.255982515738695 samples/sec                   batch loss = 1791.8168547153473 | accuracy = 0.6886666666666666


Epoch[2] Batch[755] Speed: 1.249903876276233 samples/sec                   batch loss = 1802.6932682991028 | accuracy = 0.6897350993377483


Epoch[2] Batch[760] Speed: 1.2441978905492235 samples/sec                   batch loss = 1812.3208649158478 | accuracy = 0.6904605263157895


Epoch[2] Batch[765] Speed: 1.2495534786288292 samples/sec                   batch loss = 1821.2799561023712 | accuracy = 0.6915032679738562


Epoch[2] Batch[770] Speed: 1.2473502234339273 samples/sec                   batch loss = 1831.5580554008484 | accuracy = 0.6915584415584416


Epoch[2] Batch[775] Speed: 1.2454718694160678 samples/sec                   batch loss = 1840.1436542272568 | accuracy = 0.6929032258064516


Epoch[2] Batch[780] Speed: 1.2546983086582417 samples/sec                   batch loss = 1855.5730773210526 | accuracy = 0.6923076923076923


Epoch[2] Batch[785] Speed: 1.2513862073312223 samples/sec                   batch loss = 1865.742290019989 | accuracy = 0.6923566878980891


[Epoch 2] training: accuracy=0.692258883248731
[Epoch 2] time cost: 645.5308375358582
[Epoch 2] validation: validation accuracy=0.7133333333333334


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen on studying more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).