<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference may run faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure that you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device the array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[03:32:42] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `ctx` shown in the output tells you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to `npx.cpu()` instead, which is what the output below shows.

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[03:32:42] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operators, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through zero-based indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
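
The zero-based numbering can be sketched without MXNet. The helper below is hypothetical (`device_labels` is not an MXNet function); it only builds the labels that correspond to `npx.gpu(i)` and, as an assumption that mirrors the fallback used in the earlier cells, returns a CPU label when no GPU is present.

```python
def device_labels(num_available, num_requested):
    """Return labels for the first `num_requested` GPUs, or a CPU label if none exist.

    Hypothetical helper for illustration only; the strings stand in for
    npx.gpu(i) and npx.cpu().
    """
    count = min(num_available, num_requested)
    if count <= 0:
        return ["cpu(0)"]
    # GPU IDs are zero-based, so the last of `count` GPUs has ID count - 1
    return [f"gpu({i})" for i in range(count)]


print(device_labels(8, 8)[-1])  # gpu(7): last of eight GPUs
print(device_labels(1, 2))      # ['gpu(0)']: only one GPU present
print(device_labels(0, 2))      # ['cpu(0)']: no GPU, fall back to the CPU
```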

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and the parameters to that GPU. To demonstrate this, you can reuse the `LeafNetwork` defined in [Training Neural Networks](6-train-nn.md), as the following code example shows.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    block = nn.HybridSequential()
    block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        block.add(nn.BatchNorm())
    return block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    block = nn.HybridSequential()
    block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        block.add(nn.Dropout(dropout))
    return block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters directly onto GPU 0 as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[03:32:43] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[  9.084641, -13.851528]], ctx=gpu(0))
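
The spatial sizes inside `LeafNetwork` for this 128×128 input can be traced by hand. The sketch below assumes the Gluon defaults for `Conv2D` (stride 1, no padding) and floor rounding in `MaxPool2D`; `out_size` and `trace_leaf_network` are hypothetical helpers written for this illustration, not MXNet functions.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling window along one axis (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_leaf_network(size=128, channels=(32, 64, 128)):
    """Trace the spatial size through the three conv blocks.

    Returns the final side length and the length of the flattened feature vector.
    """
    for _ in channels:
        size = out_size(size, kernel=2, stride=1)  # Conv2D(kernel_size=2), assumed stride 1, no padding
        size = out_size(size, kernel=4, stride=2)  # MaxPool2D(pool_size=4, strides=2)
    return size, channels[-1] * size * size


side, flat = trace_leaf_network()
print(side, flat)  # 13 21632
```

Under these assumptions the side length goes 128 → 127 → 62 → 61 → 29 → 28 → 13, so the first dense layer receives 128 × 13 × 13 = 21632 features per image.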

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. With *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data. The gradients from all parts are then combined for the parameter update.
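
The idea can be checked on a toy problem in plain Python. The sketch below is not how MXNet implements data parallelism; it uses a hypothetical one-parameter model `w * x` with squared error, and the helpers `split_batch` and `grad_sum` are assumptions made for illustration. It shows that adding the gradients computed on each chunk gives the same result as computing the gradient on the whole batch.

```python
def split_batch(batch, num_parts):
    """Split a list into `num_parts` contiguous, equally sized chunks."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    step = len(batch) // num_parts
    return [batch[i * step:(i + 1) * step] for i in range(num_parts)]


def grad_sum(samples, w):
    """Summed gradient of the squared error (w * x - y) ** 2 with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in samples)


batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 9.0)]
w = 0.5

# Each "device" computes the gradient on its own chunk ...
partial = [grad_sum(chunk, w) for chunk in split_batch(batch, 2)]
# ... and the partial gradients are added before the parameter update.
assert abs(sum(partial) - grad_sum(batch, w)) < 1e-9
print(partial, sum(partial))  # [-11.0, -89.0] -100.0
```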

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations on the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(
    train_dataset.transform_first(training_transformer),
    batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(
    val_dataset.transform_first(validation_transformer),
    batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(
    test_dataset.transform_first(validation_transformer),
    batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
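
As a rough illustration of what is accumulated across the per-device shards, the following plain-Python sketch counts correct predictions shard by shard and divides by the total number of samples. It is an assumption-based stand-in, not the implementation of `gluon.metric.Accuracy`; `argmax` and `sharded_accuracy` are hypothetical helpers.

```python
def argmax(row):
    """Index of the largest value in a list of scores."""
    return max(range(len(row)), key=row.__getitem__)


def sharded_accuracy(label_shards, output_shards):
    """Accuracy over all shards: correct predictions divided by total samples."""
    correct = total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            correct += int(argmax(scores) == label)
            total += 1
    return correct / total


# Two shards of two samples each, with two-class scores per sample
labels = [[0, 1], [1, 0]]
outputs = [[[2.0, -1.0], [0.3, 0.9]], [[1.5, 0.2], [0.8, 0.1]]]
print(sharded_accuracy(labels, outputs))  # 0.75
```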

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7770907139480545 samples/sec                   batch loss = 14.493964910507202 | accuracy = 0.5


Epoch[1] Batch[10] Speed: 1.253665581445779 samples/sec                   batch loss = 28.114454746246338 | accuracy = 0.55


Epoch[1] Batch[15] Speed: 1.2503464362268926 samples/sec                   batch loss = 42.31161832809448 | accuracy = 0.5166666666666667


Epoch[1] Batch[20] Speed: 1.2509992504689416 samples/sec                   batch loss = 56.491451263427734 | accuracy = 0.525


Epoch[1] Batch[25] Speed: 1.2523357174954504 samples/sec                   batch loss = 69.89907360076904 | accuracy = 0.54


Epoch[1] Batch[30] Speed: 1.2507025932828213 samples/sec                   batch loss = 83.65840721130371 | accuracy = 0.5583333333333333


Epoch[1] Batch[35] Speed: 1.2510407619878803 samples/sec                   batch loss = 97.82653951644897 | accuracy = 0.5428571428571428


Epoch[1] Batch[40] Speed: 1.2513744467361938 samples/sec                   batch loss = 111.75997233390808 | accuracy = 0.54375


Epoch[1] Batch[45] Speed: 1.2538978553891802 samples/sec                   batch loss = 126.08608937263489 | accuracy = 0.55


Epoch[1] Batch[50] Speed: 1.254024475773018 samples/sec                   batch loss = 140.73157358169556 | accuracy = 0.545


Epoch[1] Batch[55] Speed: 1.2526179056127937 samples/sec                   batch loss = 154.255615234375 | accuracy = 0.5454545454545454


Epoch[1] Batch[60] Speed: 1.2599466123553082 samples/sec                   batch loss = 167.73535251617432 | accuracy = 0.55


Epoch[1] Batch[65] Speed: 1.2601123143166504 samples/sec                   batch loss = 181.81078600883484 | accuracy = 0.5461538461538461


Epoch[1] Batch[70] Speed: 1.256211980740078 samples/sec                   batch loss = 195.86451649665833 | accuracy = 0.5321428571428571


Epoch[1] Batch[75] Speed: 1.2610939813736028 samples/sec                   batch loss = 209.79889917373657 | accuracy = 0.53


Epoch[1] Batch[80] Speed: 1.2648754776771616 samples/sec                   batch loss = 224.41328191757202 | accuracy = 0.528125


Epoch[1] Batch[85] Speed: 1.2587463626847468 samples/sec                   batch loss = 238.0448513031006 | accuracy = 0.5294117647058824


Epoch[1] Batch[90] Speed: 1.2655866107382703 samples/sec                   batch loss = 252.32894325256348 | accuracy = 0.525


Epoch[1] Batch[95] Speed: 1.259124614582392 samples/sec                   batch loss = 266.37283992767334 | accuracy = 0.5236842105263158


Epoch[1] Batch[100] Speed: 1.2657610572352473 samples/sec                   batch loss = 279.7831926345825 | accuracy = 0.53


Epoch[1] Batch[105] Speed: 1.2582287603977653 samples/sec                   batch loss = 293.3015887737274 | accuracy = 0.530952380952381


Epoch[1] Batch[110] Speed: 1.2625707783460123 samples/sec                   batch loss = 307.61172461509705 | accuracy = 0.5318181818181819


Epoch[1] Batch[115] Speed: 1.2606387615016683 samples/sec                   batch loss = 321.84857845306396 | accuracy = 0.5282608695652173


Epoch[1] Batch[120] Speed: 1.2600926284215914 samples/sec                   batch loss = 335.07821249961853 | accuracy = 0.5333333333333333


Epoch[1] Batch[125] Speed: 1.2617969767201191 samples/sec                   batch loss = 348.5063621997833 | accuracy = 0.54


Epoch[1] Batch[130] Speed: 1.2574933519055498 samples/sec                   batch loss = 362.40886330604553 | accuracy = 0.5423076923076923


Epoch[1] Batch[135] Speed: 1.261517562704556 samples/sec                   batch loss = 375.6475954055786 | accuracy = 0.5481481481481482


Epoch[1] Batch[140] Speed: 1.2590119843610468 samples/sec                   batch loss = 389.88208961486816 | accuracy = 0.5428571428571428


Epoch[1] Batch[145] Speed: 1.2625136769927021 samples/sec                   batch loss = 402.4938862323761 | accuracy = 0.5517241379310345


Epoch[1] Batch[150] Speed: 1.2602772078379714 samples/sec                   batch loss = 415.26873779296875 | accuracy = 0.5516666666666666


Epoch[1] Batch[155] Speed: 1.2621311076820116 samples/sec                   batch loss = 430.1045002937317 | accuracy = 0.5467741935483871


Epoch[1] Batch[160] Speed: 1.261296870442264 samples/sec                   batch loss = 444.29298663139343 | accuracy = 0.54375


Epoch[1] Batch[165] Speed: 1.2588632904223325 samples/sec                   batch loss = 457.07342648506165 | accuracy = 0.546969696969697


Epoch[1] Batch[170] Speed: 1.258294723225225 samples/sec                   batch loss = 471.3399977684021 | accuracy = 0.5441176470588235


Epoch[1] Batch[175] Speed: 1.2630188373856892 samples/sec                   batch loss = 485.3769769668579 | accuracy = 0.5414285714285715


Epoch[1] Batch[180] Speed: 1.2589500085394687 samples/sec                   batch loss = 498.81634974479675 | accuracy = 0.5430555555555555


Epoch[1] Batch[185] Speed: 1.259693837595817 samples/sec                   batch loss = 512.1883153915405 | accuracy = 0.5459459459459459


Epoch[1] Batch[190] Speed: 1.2596169468365455 samples/sec                   batch loss = 526.0504140853882 | accuracy = 0.5447368421052632


Epoch[1] Batch[195] Speed: 1.254642198629506 samples/sec                   batch loss = 540.3388044834137 | accuracy = 0.5423076923076923


Epoch[1] Batch[200] Speed: 1.260839798526752 samples/sec                   batch loss = 553.4657094478607 | accuracy = 0.54625


Epoch[1] Batch[205] Speed: 1.2574768579867677 samples/sec                   batch loss = 566.6000833511353 | accuracy = 0.551219512195122


Epoch[1] Batch[210] Speed: 1.2605546519330109 samples/sec                   batch loss = 580.7141976356506 | accuracy = 0.5464285714285714


Epoch[1] Batch[215] Speed: 1.2541647158976357 samples/sec                   batch loss = 593.842357635498 | accuracy = 0.5465116279069767


Epoch[1] Batch[220] Speed: 1.257080474368124 samples/sec                   batch loss = 608.5174412727356 | accuracy = 0.5409090909090909


Epoch[1] Batch[225] Speed: 1.2584930308198177 samples/sec                   batch loss = 621.4618442058563 | accuracy = 0.5433333333333333


Epoch[1] Batch[230] Speed: 1.2643377689462083 samples/sec                   batch loss = 635.2444953918457 | accuracy = 0.5434782608695652


Epoch[1] Batch[235] Speed: 1.2524383676422526 samples/sec                   batch loss = 648.6421999931335 | accuracy = 0.5457446808510639


Epoch[1] Batch[240] Speed: 1.2517839575695693 samples/sec                   batch loss = 661.7816202640533 | accuracy = 0.546875


Epoch[1] Batch[245] Speed: 1.2542837008615622 samples/sec                   batch loss = 675.8858487606049 | accuracy = 0.5448979591836735


Epoch[1] Batch[250] Speed: 1.2568435361939234 samples/sec                   batch loss = 688.8240730762482 | accuracy = 0.55


Epoch[1] Batch[255] Speed: 1.2531048808467733 samples/sec                   batch loss = 702.269553899765 | accuracy = 0.553921568627451


Epoch[1] Batch[260] Speed: 1.2521180389473328 samples/sec                   batch loss = 716.1563403606415 | accuracy = 0.5557692307692308


Epoch[1] Batch[265] Speed: 1.2505442969257967 samples/sec                   batch loss = 729.6382718086243 | accuracy = 0.5566037735849056


Epoch[1] Batch[270] Speed: 1.2561326929836942 samples/sec                   batch loss = 743.963751077652 | accuracy = 0.5546296296296296


Epoch[1] Batch[275] Speed: 1.2549657916452983 samples/sec                   batch loss = 756.5821521282196 | accuracy = 0.5572727272727273


Epoch[1] Batch[280] Speed: 1.2563864866765362 samples/sec                   batch loss = 769.7387845516205 | accuracy = 0.5589285714285714


Epoch[1] Batch[285] Speed: 1.258525411579095 samples/sec                   batch loss = 784.0967350006104 | accuracy = 0.5552631578947368


Epoch[1] Batch[290] Speed: 1.255381320253252 samples/sec                   batch loss = 796.9785506725311 | accuracy = 0.5586206896551724


Epoch[1] Batch[295] Speed: 1.2583348327603259 samples/sec                   batch loss = 810.8221619129181 | accuracy = 0.5601694915254237


Epoch[1] Batch[300] Speed: 1.2558638663191783 samples/sec                   batch loss = 824.0609495639801 | accuracy = 0.5633333333333334


Epoch[1] Batch[305] Speed: 1.2584350705894451 samples/sec                   batch loss = 838.1752274036407 | accuracy = 0.5606557377049181


Epoch[1] Batch[310] Speed: 1.2487062246723817 samples/sec                   batch loss = 852.2075517177582 | accuracy = 0.5612903225806452


Epoch[1] Batch[315] Speed: 1.2543765416886978 samples/sec                   batch loss = 865.8345758914948 | accuracy = 0.5603174603174603


Epoch[1] Batch[320] Speed: 1.2514866480842028 samples/sec                   batch loss = 879.8248000144958 | accuracy = 0.559375


Epoch[1] Batch[325] Speed: 1.2533010877216546 samples/sec                   batch loss = 892.9482634067535 | accuracy = 0.5607692307692308


Epoch[1] Batch[330] Speed: 1.252579188385126 samples/sec                   batch loss = 906.4420609474182 | accuracy = 0.5598484848484848


Epoch[1] Batch[335] Speed: 1.2520676724274322 samples/sec                   batch loss = 920.4313716888428 | accuracy = 0.5597014925373134


Epoch[1] Batch[340] Speed: 1.2499230588651273 samples/sec                   batch loss = 934.5000734329224 | accuracy = 0.5588235294117647


Epoch[1] Batch[345] Speed: 1.2469325794010906 samples/sec                   batch loss = 947.8422863483429 | accuracy = 0.5594202898550724


Epoch[1] Batch[350] Speed: 1.2527312654610674 samples/sec                   batch loss = 961.5043127536774 | accuracy = 0.5592857142857143


Epoch[1] Batch[355] Speed: 1.2538316030998844 samples/sec                   batch loss = 975.1711332798004 | accuracy = 0.5598591549295775


Epoch[1] Batch[360] Speed: 1.2559564711169289 samples/sec                   batch loss = 989.184065580368 | accuracy = 0.5583333333333333


Epoch[1] Batch[365] Speed: 1.2527274303454212 samples/sec                   batch loss = 1002.2986083030701 | accuracy = 0.5582191780821918


Epoch[1] Batch[370] Speed: 1.2521545782352557 samples/sec                   batch loss = 1016.0631415843964 | accuracy = 0.5581081081081081


Epoch[1] Batch[375] Speed: 1.2511529035031757 samples/sec                   batch loss = 1030.743227481842 | accuracy = 0.556


Epoch[1] Batch[380] Speed: 1.252304776170851 samples/sec                   batch loss = 1043.3090152740479 | accuracy = 0.5578947368421052


Epoch[1] Batch[385] Speed: 1.2560982722338774 samples/sec                   batch loss = 1057.1920101642609 | accuracy = 0.5577922077922078


Epoch[1] Batch[390] Speed: 1.25371851246947 samples/sec                   batch loss = 1071.0944848060608 | accuracy = 0.5564102564102564


Epoch[1] Batch[395] Speed: 1.25383113457963 samples/sec                   batch loss = 1085.5349068641663 | accuracy = 0.5550632911392405


Epoch[1] Batch[400] Speed: 1.254378323616768 samples/sec                   batch loss = 1099.7155303955078 | accuracy = 0.554375


Epoch[1] Batch[405] Speed: 1.2520754280468898 samples/sec                   batch loss = 1112.8253946304321 | accuracy = 0.5555555555555556


Epoch[1] Batch[410] Speed: 1.2549008344156591 samples/sec                   batch loss = 1125.9133658409119 | accuracy = 0.5567073170731708


Epoch[1] Batch[415] Speed: 1.2617917573275599 samples/sec                   batch loss = 1139.6027357578278 | accuracy = 0.5572289156626506


Epoch[1] Batch[420] Speed: 1.2538461274014427 samples/sec                   batch loss = 1152.0985860824585 | accuracy = 0.5601190476190476


Epoch[1] Batch[425] Speed: 1.2480215427302583 samples/sec                   batch loss = 1165.6416690349579 | accuracy = 0.56


Epoch[1] Batch[430] Speed: 1.2493270907249847 samples/sec                   batch loss = 1179.8495652675629 | accuracy = 0.5604651162790698


Epoch[1] Batch[435] Speed: 1.250598269803133 samples/sec                   batch loss = 1193.130981683731 | accuracy = 0.5609195402298851


Epoch[1] Batch[440] Speed: 1.2517939512650513 samples/sec                   batch loss = 1207.07900929451 | accuracy = 0.5607954545454545


Epoch[1] Batch[445] Speed: 1.25357453180599 samples/sec                   batch loss = 1219.9662322998047 | accuracy = 0.5634831460674158


Epoch[1] Batch[450] Speed: 1.25350466111392 samples/sec                   batch loss = 1232.6052508354187 | accuracy = 0.5655555555555556


Epoch[1] Batch[455] Speed: 1.253139324935807 samples/sec                   batch loss = 1246.6566059589386 | accuracy = 0.5648351648351648


Epoch[1] Batch[460] Speed: 1.2568770562021405 samples/sec                   batch loss = 1260.6996731758118 | accuracy = 0.5641304347826087


Epoch[1] Batch[465] Speed: 1.25846527720947 samples/sec                   batch loss = 1273.3695151805878 | accuracy = 0.5661290322580645


Epoch[1] Batch[470] Speed: 1.2623048877362908 samples/sec                   batch loss = 1287.4284348487854 | accuracy = 0.5643617021276596


Epoch[1] Batch[475] Speed: 1.2589125047939285 samples/sec                   batch loss = 1300.461543083191 | accuracy = 0.5652631578947368


Epoch[1] Batch[480] Speed: 1.2577161093090186 samples/sec                   batch loss = 1313.2323718070984 | accuracy = 0.565625


Epoch[1] Batch[485] Speed: 1.259572500125753 samples/sec                   batch loss = 1326.372986793518 | accuracy = 0.565979381443299


Epoch[1] Batch[490] Speed: 1.2550720656587917 samples/sec                   batch loss = 1340.4748346805573 | accuracy = 0.5663265306122449


Epoch[1] Batch[495] Speed: 1.253328052246306 samples/sec                   batch loss = 1352.871691942215 | accuracy = 0.5691919191919191


Epoch[1] Batch[500] Speed: 1.2570402563888545 samples/sec                   batch loss = 1367.5206747055054 | accuracy = 0.5685


Epoch[1] Batch[505] Speed: 1.2559674717820823 samples/sec                   batch loss = 1380.1611077785492 | accuracy = 0.5698019801980198


Epoch[1] Batch[510] Speed: 1.253892982280703 samples/sec                   batch loss = 1393.2917740345001 | accuracy = 0.571078431372549


Epoch[1] Batch[515] Speed: 1.2523148716615848 samples/sec                   batch loss = 1406.4694287776947 | accuracy = 0.5713592233009709


Epoch[1] Batch[520] Speed: 1.2562072777469455 samples/sec                   batch loss = 1419.5306849479675 | accuracy = 0.5711538461538461


Epoch[1] Batch[525] Speed: 1.2498887913529102 samples/sec                   batch loss = 1432.6935217380524 | accuracy = 0.5719047619047619


Epoch[1] Batch[530] Speed: 1.2573622607617492 samples/sec                   batch loss = 1445.937302350998 | accuracy = 0.5716981132075472


Epoch[1] Batch[535] Speed: 1.2476903857906874 samples/sec                   batch loss = 1459.4920530319214 | accuracy = 0.5714953271028037


Epoch[1] Batch[540] Speed: 1.2512251250705608 samples/sec                   batch loss = 1473.7890689373016 | accuracy = 0.5708333333333333


Epoch[1] Batch[545] Speed: 1.25550138176711 samples/sec                   batch loss = 1487.0130441188812 | accuracy = 0.5711009174311926


Epoch[1] Batch[550] Speed: 1.2527555862532436 samples/sec                   batch loss = 1501.5097811222076 | accuracy = 0.5690909090909091


Epoch[1] Batch[555] Speed: 1.2500307157732593 samples/sec                   batch loss = 1514.3614008426666 | accuracy = 0.5698198198198198


Epoch[1] Batch[560] Speed: 1.2545727717617752 samples/sec                   batch loss = 1528.003091096878 | accuracy = 0.5691964285714286


Epoch[1] Batch[565] Speed: 1.2546930539944074 samples/sec                   batch loss = 1540.9183325767517 | accuracy = 0.5690265486725664


Epoch[1] Batch[570] Speed: 1.256798814248021 samples/sec                   batch loss = 1554.6019217967987 | accuracy = 0.5688596491228071


Epoch[1] Batch[575] Speed: 1.254353095738275 samples/sec                   batch loss = 1568.030862569809 | accuracy = 0.5691304347826087


Epoch[1] Batch[580] Speed: 1.2509161424261943 samples/sec                   batch loss = 1580.0013375282288 | accuracy = 0.5706896551724138


Epoch[1] Batch[585] Speed: 1.2520505730275941 samples/sec                   batch loss = 1594.3521254062653 | accuracy = 0.5709401709401709


Epoch[1] Batch[590] Speed: 1.2500980573322444 samples/sec                   batch loss = 1607.0513529777527 | accuracy = 0.5711864406779661


Epoch[1] Batch[595] Speed: 1.2490016565046227 samples/sec                   batch loss = 1620.4003868103027 | accuracy = 0.5718487394957983


Epoch[1] Batch[600] Speed: 1.2489980301566197 samples/sec                   batch loss = 1632.4153668880463 | accuracy = 0.57375


Epoch[1] Batch[605] Speed: 1.2495139269444822 samples/sec                   batch loss = 1645.5705218315125 | accuracy = 0.5739669421487603


Epoch[1] Batch[610] Speed: 1.247228841429499 samples/sec                   batch loss = 1659.0754170417786 | accuracy = 0.5737704918032787


Epoch[1] Batch[615] Speed: 1.2501395090637115 samples/sec                   batch loss = 1671.4780628681183 | accuracy = 0.5747967479674797


Epoch[1] Batch[620] Speed: 1.2516364056855207 samples/sec                   batch loss = 1684.6847472190857 | accuracy = 0.575


Epoch[1] Batch[625] Speed: 1.2540291624363566 samples/sec                   batch loss = 1697.1547157764435 | accuracy = 0.5764


Epoch[1] Batch[630] Speed: 1.2562015401430107 samples/sec                   batch loss = 1710.8639402389526 | accuracy = 0.5753968253968254


Epoch[1] Batch[635] Speed: 1.2610535061789474 samples/sec                   batch loss = 1722.650731086731 | accuracy = 0.5783464566929134


Epoch[1] Batch[640] Speed: 1.262513581986435 samples/sec                   batch loss = 1736.0573801994324 | accuracy = 0.578125


Epoch[1] Batch[645] Speed: 1.2523571249260348 samples/sec                   batch loss = 1748.6354835033417 | accuracy = 0.5786821705426357


Epoch[1] Batch[650] Speed: 1.2494914068688447 samples/sec                   batch loss = 1761.3017311096191 | accuracy = 0.5792307692307692


Epoch[1] Batch[655] Speed: 1.2565196329694763 samples/sec                   batch loss = 1773.7576169967651 | accuracy = 0.5801526717557252


Epoch[1] Batch[660] Speed: 1.2589706034861714 samples/sec                   batch loss = 1786.5916509628296 | accuracy = 0.5799242424242425


Epoch[1] Batch[665] Speed: 1.2547509515589501 samples/sec                   batch loss = 1799.5578827857971 | accuracy = 0.5804511278195489


Epoch[1] Batch[670] Speed: 1.2525708654289442 samples/sec                   batch loss = 1810.837143421173 | accuracy = 0.5828358208955224


Epoch[1] Batch[675] Speed: 1.2594608297861016 samples/sec                   batch loss = 1823.9434671401978 | accuracy = 0.582962962962963


Epoch[1] Batch[680] Speed: 1.2516382732135176 samples/sec                   batch loss = 1836.6174893379211 | accuracy = 0.5830882352941177


Epoch[1] Batch[685] Speed: 1.2515333267836992 samples/sec                   batch loss = 1848.2322733402252 | accuracy = 0.5835766423357664


Epoch[1] Batch[690] Speed: 1.2499076941232075 samples/sec                   batch loss = 1861.1743891239166 | accuracy = 0.5833333333333334


Epoch[1] Batch[695] Speed: 1.257668685442423 samples/sec                   batch loss = 1874.320904493332 | accuracy = 0.5830935251798561


Epoch[1] Batch[700] Speed: 1.2553283426624395 samples/sec                   batch loss = 1887.165498495102 | accuracy = 0.5835714285714285


Epoch[1] Batch[705] Speed: 1.2528292091760251 samples/sec                   batch loss = 1899.0335698127747 | accuracy = 0.5847517730496454


Epoch[1] Batch[710] Speed: 1.2570528772106597 samples/sec                   batch loss = 1912.1723010540009 | accuracy = 0.5845070422535211


Epoch[1] Batch[715] Speed: 1.2509368484477124 samples/sec                   batch loss = 1925.454610824585 | accuracy = 0.5849650349650349


Epoch[1] Batch[720] Speed: 1.2527032041819455 samples/sec                   batch loss = 1937.173839688301 | accuracy = 0.5857638888888889


Epoch[1] Batch[725] Speed: 1.254321116874951 samples/sec                   batch loss = 1952.1095770597458 | accuracy = 0.5851724137931035


Epoch[1] Batch[730] Speed: 1.2505677871392122 samples/sec                   batch loss = 1965.019199848175 | accuracy = 0.5856164383561644


Epoch[1] Batch[735] Speed: 1.2466013516500705 samples/sec                   batch loss = 1978.5866198539734 | accuracy = 0.5857142857142857


Epoch[1] Batch[740] Speed: 1.2506285675012059 samples/sec                   batch loss = 1991.680447101593 | accuracy = 0.5858108108108108


Epoch[1] Batch[745] Speed: 1.2594378551952625 samples/sec                   batch loss = 2004.4579601287842 | accuracy = 0.5859060402684564


Epoch[1] Batch[750] Speed: 1.2598919241869777 samples/sec                   batch loss = 2017.3732769489288 | accuracy = 0.5863333333333334


Epoch[1] Batch[755] Speed: 1.2579027281899666 samples/sec                   batch loss = 2029.4566187858582 | accuracy = 0.5870860927152318


Epoch[1] Batch[760] Speed: 1.2587711064843472 samples/sec                   batch loss = 2043.3705713748932 | accuracy = 0.5868421052631579


Epoch[1] Batch[765] Speed: 1.2613471287561941 samples/sec                   batch loss = 2055.2706620693207 | accuracy = 0.5872549019607843


Epoch[1] Batch[770] Speed: 1.2604392089889789 samples/sec                   batch loss = 2068.2673075199127 | accuracy = 0.5873376623376624


Epoch[1] Batch[775] Speed: 1.2553584942536493 samples/sec                   batch loss = 2080.0830405950546 | accuracy = 0.5880645161290322


Epoch[1] Batch[780] Speed: 1.2576844301466756 samples/sec                   batch loss = 2093.8092809915543 | accuracy = 0.5884615384615385


Epoch[1] Batch[785] Speed: 1.25638535764136 samples/sec                   batch loss = 2106.837630391121 | accuracy = 0.5888535031847134


[Epoch 1] training: accuracy=0.5881979695431472
[Epoch 1] time cost: 645.532957315445
[Epoch 1] validation: validation accuracy=0.7022222222222222


Epoch[2] Batch[5] Speed: 1.258334738381948 samples/sec                   batch loss = 13.221164464950562 | accuracy = 0.55


Epoch[2] Batch[10] Speed: 1.2562324862016128 samples/sec                   batch loss = 27.305179357528687 | accuracy = 0.6


Epoch[2] Batch[15] Speed: 1.2608257750142224 samples/sec                   batch loss = 40.09816551208496 | accuracy = 0.6166666666666667


Epoch[2] Batch[20] Speed: 1.258275188448227 samples/sec                   batch loss = 52.725000619888306 | accuracy = 0.6125


Epoch[2] Batch[25] Speed: 1.2583989189474618 samples/sec                   batch loss = 64.49909591674805 | accuracy = 0.63


Epoch[2] Batch[30] Speed: 1.2652457815942555 samples/sec                   batch loss = 76.05830478668213 | accuracy = 0.6666666666666666


Epoch[2] Batch[35] Speed: 1.2643868405680496 samples/sec                   batch loss = 87.8994791507721 | accuracy = 0.6642857142857143


Epoch[2] Batch[40] Speed: 1.25893158702503 samples/sec                   batch loss = 99.76189243793488 | accuracy = 0.66875


Epoch[2] Batch[45] Speed: 1.255949137447217 samples/sec                   batch loss = 111.62123811244965 | accuracy = 0.6722222222222223


Epoch[2] Batch[50] Speed: 1.2570303671160235 samples/sec                   batch loss = 126.11957776546478 | accuracy = 0.665


Epoch[2] Batch[55] Speed: 1.257273499965453 samples/sec                   batch loss = 139.37452387809753 | accuracy = 0.6636363636363637


Epoch[2] Batch[60] Speed: 1.2589329095773343 samples/sec                   batch loss = 151.84250223636627 | accuracy = 0.6708333333333333


Epoch[2] Batch[65] Speed: 1.2552843859282763 samples/sec                   batch loss = 164.38490569591522 | accuracy = 0.6653846153846154


Epoch[2] Batch[70] Speed: 1.2564147132154448 samples/sec                   batch loss = 178.03787434101105 | accuracy = 0.6642857142857143


Epoch[2] Batch[75] Speed: 1.2572247905020764 samples/sec                   batch loss = 191.78831779956818 | accuracy = 0.6633333333333333


Epoch[2] Batch[80] Speed: 1.2591326468695867 samples/sec                   batch loss = 204.85744750499725 | accuracy = 0.65625


Epoch[2] Batch[85] Speed: 1.2544396625431022 samples/sec                   batch loss = 220.18415915966034 | accuracy = 0.6441176470588236


Epoch[2] Batch[90] Speed: 1.25650579949685 samples/sec                   batch loss = 233.1653665304184 | accuracy = 0.6444444444444445


Epoch[2] Batch[95] Speed: 1.2624309319512814 samples/sec                   batch loss = 244.12157833576202 | accuracy = 0.6473684210526316


Epoch[2] Batch[100] Speed: 1.2577522217291834 samples/sec                   batch loss = 256.32376849651337 | accuracy = 0.6475


Epoch[2] Batch[105] Speed: 1.257929985404691 samples/sec                   batch loss = 269.3550556898117 | accuracy = 0.65


Epoch[2] Batch[110] Speed: 1.2575302055960835 samples/sec                   batch loss = 282.3162624835968 | accuracy = 0.6409090909090909


Epoch[2] Batch[115] Speed: 1.2496138812088855 samples/sec                   batch loss = 294.44556534290314 | accuracy = 0.6413043478260869


Epoch[2] Batch[120] Speed: 1.250171834006859 samples/sec                   batch loss = 306.479318857193 | accuracy = 0.6458333333333334


Epoch[2] Batch[125] Speed: 1.2599074407626965 samples/sec                   batch loss = 317.1319274902344 | accuracy = 0.65


Epoch[2] Batch[130] Speed: 1.258702732858508 samples/sec                   batch loss = 327.22444009780884 | accuracy = 0.6576923076923077


Epoch[2] Batch[135] Speed: 1.253040771212504 samples/sec                   batch loss = 341.72695541381836 | accuracy = 0.65


Epoch[2] Batch[140] Speed: 1.2537648893406248 samples/sec                   batch loss = 352.50780606269836 | accuracy = 0.6553571428571429


Epoch[2] Batch[145] Speed: 1.2531511187275373 samples/sec                   batch loss = 363.19671630859375 | accuracy = 0.6586206896551724


Epoch[2] Batch[150] Speed: 1.2516070862265387 samples/sec                   batch loss = 374.2111711502075 | accuracy = 0.6583333333333333


Epoch[2] Batch[155] Speed: 1.250995146108864 samples/sec                   batch loss = 384.32923567295074 | accuracy = 0.6661290322580645


Epoch[2] Batch[160] Speed: 1.2548051004698746 samples/sec                   batch loss = 399.0871762037277 | accuracy = 0.6640625


Epoch[2] Batch[165] Speed: 1.2621334814041014 samples/sec                   batch loss = 409.6778185367584 | accuracy = 0.6696969696969697


Epoch[2] Batch[170] Speed: 1.2548216182256062 samples/sec                   batch loss = 423.27807998657227 | accuracy = 0.6691176470588235


Epoch[2] Batch[175] Speed: 1.2593761210438763 samples/sec                   batch loss = 437.40411710739136 | accuracy = 0.6614285714285715


Epoch[2] Batch[180] Speed: 1.2515276317958672 samples/sec                   batch loss = 448.68566966056824 | accuracy = 0.6611111111111111


Epoch[2] Batch[185] Speed: 1.2492892278431613 samples/sec                   batch loss = 461.4985353946686 | accuracy = 0.6581081081081082


Epoch[2] Batch[190] Speed: 1.2525201819486669 samples/sec                   batch loss = 473.1203262805939 | accuracy = 0.6578947368421053


Epoch[2] Batch[195] Speed: 1.253012883229396 samples/sec                   batch loss = 485.5848786830902 | accuracy = 0.6576923076923077


Epoch[2] Batch[200] Speed: 1.2542901711403338 samples/sec                   batch loss = 498.66086649894714 | accuracy = 0.6575


Epoch[2] Batch[205] Speed: 1.2523775047320926 samples/sec                   batch loss = 512.5595366954803 | accuracy = 0.6524390243902439


Epoch[2] Batch[210] Speed: 1.2489173261170956 samples/sec                   batch loss = 527.5540888309479 | accuracy = 0.6476190476190476


Epoch[2] Batch[215] Speed: 1.2561429443365955 samples/sec                   batch loss = 537.7430355548859 | accuracy = 0.6534883720930232


Epoch[2] Batch[220] Speed: 1.2539587724412156 samples/sec                   batch loss = 549.7932751178741 | accuracy = 0.6545454545454545


Epoch[2] Batch[225] Speed: 1.2524521116987464 samples/sec                   batch loss = 560.028800368309 | accuracy = 0.6544444444444445


Epoch[2] Batch[230] Speed: 1.2541412778650765 samples/sec                   batch loss = 572.0176945924759 | accuracy = 0.6543478260869565


Epoch[2] Batch[235] Speed: 1.2526648558001736 samples/sec                   batch loss = 582.2260454893112 | accuracy = 0.6563829787234042


Epoch[2] Batch[240] Speed: 1.2505423394478454 samples/sec                   batch loss = 593.989136338234 | accuracy = 0.659375


Epoch[2] Batch[245] Speed: 1.2571137244300126 samples/sec                   batch loss = 603.745358467102 | accuracy = 0.6612244897959184


Epoch[2] Batch[250] Speed: 1.2517197028530143 samples/sec                   batch loss = 616.7729470729828 | accuracy = 0.661


Epoch[2] Batch[255] Speed: 1.2543801055499006 samples/sec                   batch loss = 629.5754842758179 | accuracy = 0.6607843137254902


Epoch[2] Batch[260] Speed: 1.253958678718128 samples/sec                   batch loss = 640.5169943571091 | accuracy = 0.6634615384615384


Epoch[2] Batch[265] Speed: 1.2599485047644545 samples/sec                   batch loss = 653.1600593328476 | accuracy = 0.6613207547169812


Epoch[2] Batch[270] Speed: 1.2614435790401106 samples/sec                   batch loss = 665.538711309433 | accuracy = 0.662962962962963


Epoch[2] Batch[275] Speed: 1.255114599083315 samples/sec                   batch loss = 678.9691694974899 | accuracy = 0.6618181818181819


Epoch[2] Batch[280] Speed: 1.258775639804979 samples/sec                   batch loss = 691.6597820520401 | accuracy = 0.6616071428571428


Epoch[2] Batch[285] Speed: 1.261829148111203 samples/sec                   batch loss = 703.1002376079559 | accuracy = 0.6622807017543859


Epoch[2] Batch[290] Speed: 1.2592593815755244 samples/sec                   batch loss = 716.6579751968384 | accuracy = 0.6629310344827586


Epoch[2] Batch[295] Speed: 1.2569395814445832 samples/sec                   batch loss = 727.6270415782928 | accuracy = 0.6652542372881356


Epoch[2] Batch[300] Speed: 1.2607640942134928 samples/sec                   batch loss = 738.669849395752 | accuracy = 0.6675


Epoch[2] Batch[305] Speed: 1.2553490071482216 samples/sec                   batch loss = 750.6488891839981 | accuracy = 0.6672131147540984


Epoch[2] Batch[310] Speed: 1.2528195731552232 samples/sec                   batch loss = 760.1863371133804 | accuracy = 0.6701612903225806


Epoch[2] Batch[315] Speed: 1.2553344480078845 samples/sec                   batch loss = 770.7402358055115 | accuracy = 0.6722222222222223


Epoch[2] Batch[320] Speed: 1.2494672126508175 samples/sec                   batch loss = 782.7795341014862 | accuracy = 0.67265625


Epoch[2] Batch[325] Speed: 1.250299193780913 samples/sec                   batch loss = 797.8749520778656 | accuracy = 0.6692307692307692


Epoch[2] Batch[330] Speed: 1.2494668404393967 samples/sec                   batch loss = 810.3023363351822 | accuracy = 0.6681818181818182


Epoch[2] Batch[335] Speed: 1.2474339713155593 samples/sec                   batch loss = 820.106054186821 | accuracy = 0.6701492537313433


Epoch[2] Batch[340] Speed: 1.2511799622868474 samples/sec                   batch loss = 833.7897919416428 | accuracy = 0.6691176470588235


Epoch[2] Batch[345] Speed: 1.2484692737413539 samples/sec                   batch loss = 846.6503442525864 | accuracy = 0.6673913043478261


Epoch[2] Batch[350] Speed: 1.2475168028170303 samples/sec                   batch loss = 856.9326649904251 | accuracy = 0.6671428571428571


Epoch[2] Batch[355] Speed: 1.252258881130761 samples/sec                   batch loss = 867.464576125145 | accuracy = 0.6690140845070423


Epoch[2] Batch[360] Speed: 1.246731876064691 samples/sec                   batch loss = 875.9527725577354 | accuracy = 0.6722222222222223


Epoch[2] Batch[365] Speed: 1.2518759616346036 samples/sec                   batch loss = 886.4199283719063 | accuracy = 0.6732876712328767


Epoch[2] Batch[370] Speed: 1.2548715495403475 samples/sec                   batch loss = 898.9342730641365 | accuracy = 0.675


Epoch[2] Batch[375] Speed: 1.253145596210122 samples/sec                   batch loss = 912.0250533223152 | accuracy = 0.674


Epoch[2] Batch[380] Speed: 1.253582868105682 samples/sec                   batch loss = 927.6518996357918 | accuracy = 0.6723684210526316


Epoch[2] Batch[385] Speed: 1.2552147001502991 samples/sec                   batch loss = 939.6190914511681 | accuracy = 0.6733766233766234


Epoch[2] Batch[390] Speed: 1.251844949616445 samples/sec                   batch loss = 953.1624851822853 | accuracy = 0.6724358974358975


Epoch[2] Batch[395] Speed: 1.2509751843779615 samples/sec                   batch loss = 964.522872030735 | accuracy = 0.6734177215189874


Epoch[2] Batch[400] Speed: 1.255541689254701 samples/sec                   batch loss = 977.7795262932777 | accuracy = 0.6725


Epoch[2] Batch[405] Speed: 1.2528231281747684 samples/sec                   batch loss = 989.1257295012474 | accuracy = 0.674074074074074


Epoch[2] Batch[410] Speed: 1.2603164024270601 samples/sec                   batch loss = 1000.47115701437 | accuracy = 0.6743902439024391


Epoch[2] Batch[415] Speed: 1.2560603739728056 samples/sec                   batch loss = 1012.3470925688744 | accuracy = 0.6746987951807228


Epoch[2] Batch[420] Speed: 1.255237990492875 samples/sec                   batch loss = 1022.4445597529411 | accuracy = 0.6767857142857143


Epoch[2] Batch[425] Speed: 1.25410724740122 samples/sec                   batch loss = 1034.6008905768394 | accuracy = 0.6764705882352942


Epoch[2] Batch[430] Speed: 1.2521761663870448 samples/sec                   batch loss = 1047.5803828835487 | accuracy = 0.675


Epoch[2] Batch[435] Speed: 1.2580091229044168 samples/sec                   batch loss = 1058.777609527111 | accuracy = 0.674712643678161


Epoch[2] Batch[440] Speed: 1.2603870345804522 samples/sec                   batch loss = 1071.4933241009712 | accuracy = 0.6738636363636363


Epoch[2] Batch[445] Speed: 1.2583847609069734 samples/sec                   batch loss = 1083.9610909819603 | accuracy = 0.6735955056179775


Epoch[2] Batch[450] Speed: 1.2588180468230465 samples/sec                   batch loss = 1093.510519683361 | accuracy = 0.675


Epoch[2] Batch[455] Speed: 1.2586416372978646 samples/sec                   batch loss = 1106.1816768050194 | accuracy = 0.6741758241758242


Epoch[2] Batch[460] Speed: 1.2558382966390047 samples/sec                   batch loss = 1121.1861068606377 | accuracy = 0.6722826086956522


Epoch[2] Batch[465] Speed: 1.2532041935375922 samples/sec                   batch loss = 1133.828378856182 | accuracy = 0.6720430107526881


Epoch[2] Batch[470] Speed: 1.256975837634707 samples/sec                   batch loss = 1147.4003221392632 | accuracy = 0.6702127659574468


Epoch[2] Batch[475] Speed: 1.2581671447195306 samples/sec                   batch loss = 1161.7628344893456 | accuracy = 0.6673684210526316


Epoch[2] Batch[480] Speed: 1.258353236814575 samples/sec                   batch loss = 1173.2522402405739 | accuracy = 0.6671875


Epoch[2] Batch[485] Speed: 1.2589011690889529 samples/sec                   batch loss = 1181.908661544323 | accuracy = 0.6695876288659793


Epoch[2] Batch[490] Speed: 1.252258787661607 samples/sec                   batch loss = 1191.8652413487434 | accuracy = 0.6704081632653062


Epoch[2] Batch[495] Speed: 1.259827875050593 samples/sec                   batch loss = 1203.1157966256142 | accuracy = 0.6707070707070707


Epoch[2] Batch[500] Speed: 1.255555125659042 samples/sec                   batch loss = 1216.416205585003 | accuracy = 0.6695


Epoch[2] Batch[505] Speed: 1.2553085241811195 samples/sec                   batch loss = 1230.5012617707253 | accuracy = 0.6688118811881189


Epoch[2] Batch[510] Speed: 1.2538148302928884 samples/sec                   batch loss = 1242.5490569472313 | accuracy = 0.6681372549019607


Epoch[2] Batch[515] Speed: 1.2564837794952906 samples/sec                   batch loss = 1253.536358177662 | accuracy = 0.6699029126213593


Epoch[2] Batch[520] Speed: 1.2556280443087222 samples/sec                   batch loss = 1264.1452792286873 | accuracy = 0.6701923076923076


Epoch[2] Batch[525] Speed: 1.2456654156219742 samples/sec                   batch loss = 1276.6081101298332 | accuracy = 0.67


Epoch[2] Batch[530] Speed: 1.2476711788842347 samples/sec                   batch loss = 1287.934352695942 | accuracy = 0.6702830188679245


Epoch[2] Batch[535] Speed: 1.2476264578476568 samples/sec                   batch loss = 1298.2899332642555 | accuracy = 0.6710280373831776


Epoch[2] Batch[540] Speed: 1.2488311479503824 samples/sec                   batch loss = 1307.7155963778496 | accuracy = 0.6717592592592593


Epoch[2] Batch[545] Speed: 1.2525138234252233 samples/sec                   batch loss = 1317.9351297020912 | accuracy = 0.6724770642201835


Epoch[2] Batch[550] Speed: 1.2515579745417125 samples/sec                   batch loss = 1326.0608202815056 | accuracy = 0.6745454545454546


Epoch[2] Batch[555] Speed: 1.2499373996317362 samples/sec                   batch loss = 1335.9112654328346 | accuracy = 0.6752252252252252


Epoch[2] Batch[560] Speed: 1.2449324215274888 samples/sec                   batch loss = 1347.7690615057945 | accuracy = 0.675


Epoch[2] Batch[565] Speed: 1.2489430796141272 samples/sec                   batch loss = 1360.1649524569511 | accuracy = 0.6738938053097345


Epoch[2] Batch[570] Speed: 1.2494902901920035 samples/sec                   batch loss = 1373.8239635825157 | accuracy = 0.6723684210526316


Epoch[2] Batch[575] Speed: 1.2501552521269206 samples/sec                   batch loss = 1384.0365697741508 | accuracy = 0.6734782608695652


Epoch[2] Batch[580] Speed: 1.2543005799897338 samples/sec                   batch loss = 1395.7508456110954 | accuracy = 0.6732758620689655


Epoch[2] Batch[585] Speed: 1.2548442369942945 samples/sec                   batch loss = 1403.7591810822487 | accuracy = 0.6747863247863248


Epoch[2] Batch[590] Speed: 1.2498521048491507 samples/sec                   batch loss = 1416.1747617125511 | accuracy = 0.675


Epoch[2] Batch[595] Speed: 1.249373515424616 samples/sec                   batch loss = 1428.0648235678673 | accuracy = 0.6756302521008404


Epoch[2] Batch[600] Speed: 1.2491734207433234 samples/sec                   batch loss = 1440.900470674038 | accuracy = 0.675


Epoch[2] Batch[605] Speed: 1.2482326913267507 samples/sec                   batch loss = 1451.3234053254128 | accuracy = 0.675206611570248


Epoch[2] Batch[610] Speed: 1.2489212309111506 samples/sec                   batch loss = 1465.067489206791 | accuracy = 0.6745901639344263


Epoch[2] Batch[615] Speed: 1.250439906653109 samples/sec                   batch loss = 1479.81564027071 | accuracy = 0.6747967479674797


Epoch[2] Batch[620] Speed: 1.2553207345458441 samples/sec                   batch loss = 1492.9644678235054 | accuracy = 0.6733870967741935


Epoch[2] Batch[625] Speed: 1.2565088108391045 samples/sec                   batch loss = 1506.839105784893 | accuracy = 0.6736


Epoch[2] Batch[630] Speed: 1.2575646106071066 samples/sec                   batch loss = 1519.09755975008 | accuracy = 0.6734126984126985


Epoch[2] Batch[635] Speed: 1.2554981873449778 samples/sec                   batch loss = 1529.938444197178 | accuracy = 0.6732283464566929


Epoch[2] Batch[640] Speed: 1.2601403299201248 samples/sec                   batch loss = 1542.7696915268898 | accuracy = 0.6734375


Epoch[2] Batch[645] Speed: 1.2579022566222524 samples/sec                   batch loss = 1554.4552205204964 | accuracy = 0.6736434108527132


Epoch[2] Batch[650] Speed: 1.2584971845266775 samples/sec                   batch loss = 1565.2058615088463 | accuracy = 0.6726923076923077


Epoch[2] Batch[655] Speed: 1.2554507426946533 samples/sec                   batch loss = 1574.5401214957237 | accuracy = 0.6732824427480916


Epoch[2] Batch[660] Speed: 1.251542289492469 samples/sec                   batch loss = 1585.019314467907 | accuracy = 0.6742424242424242


Epoch[2] Batch[665] Speed: 1.2563884624929775 samples/sec                   batch loss = 1597.2992792725563 | accuracy = 0.6740601503759398


Epoch[2] Batch[670] Speed: 1.2540319744511748 samples/sec                   batch loss = 1605.6435484290123 | accuracy = 0.6753731343283582


Epoch[2] Batch[675] Speed: 1.252922676913959 samples/sec                   batch loss = 1617.1821611523628 | accuracy = 0.6755555555555556


Epoch[2] Batch[680] Speed: 1.250723105755153 samples/sec                   batch loss = 1632.5959566235542 | accuracy = 0.6738970588235295


Epoch[2] Batch[685] Speed: 1.255414386547866 samples/sec                   batch loss = 1645.662999689579 | accuracy = 0.672992700729927


Epoch[2] Batch[690] Speed: 1.2539327179622708 samples/sec                   batch loss = 1658.3779289126396 | accuracy = 0.6728260869565217


Epoch[2] Batch[695] Speed: 1.2568894854236183 samples/sec                   batch loss = 1668.6711297631264 | accuracy = 0.6737410071942446


Epoch[2] Batch[700] Speed: 1.257372814896 samples/sec                   batch loss = 1680.6817465424538 | accuracy = 0.6735714285714286


Epoch[2] Batch[705] Speed: 1.259598127527169 samples/sec                   batch loss = 1691.9820529818535 | accuracy = 0.6737588652482269


Epoch[2] Batch[710] Speed: 1.2522355142765103 samples/sec                   batch loss = 1704.5910163521767 | accuracy = 0.673943661971831


Epoch[2] Batch[715] Speed: 1.2582223437822364 samples/sec                   batch loss = 1718.1144261956215 | accuracy = 0.6737762237762238


Epoch[2] Batch[720] Speed: 1.2583578615077133 samples/sec                   batch loss = 1729.370831310749 | accuracy = 0.6746527777777778


Epoch[2] Batch[725] Speed: 1.2543273999833127 samples/sec                   batch loss = 1744.7044617533684 | accuracy = 0.6737931034482758


Epoch[2] Batch[730] Speed: 1.2524390221142931 samples/sec                   batch loss = 1757.4802908301353 | accuracy = 0.6726027397260274


Epoch[2] Batch[735] Speed: 1.2573191037349474 samples/sec                   batch loss = 1771.42827886343 | accuracy = 0.6714285714285714


Epoch[2] Batch[740] Speed: 1.2552047456707516 samples/sec                   batch loss = 1784.2056075930595 | accuracy = 0.6709459459459459


Epoch[2] Batch[745] Speed: 1.2527269626499975 samples/sec                   batch loss = 1796.6521746516228 | accuracy = 0.6701342281879195


Epoch[2] Batch[750] Speed: 1.2613494046991633 samples/sec                   batch loss = 1806.2409625649452 | accuracy = 0.6706666666666666


Epoch[2] Batch[755] Speed: 1.255487758609413 samples/sec                   batch loss = 1817.1499493718147 | accuracy = 0.6708609271523179


Epoch[2] Batch[760] Speed: 1.2482410496011884 samples/sec                   batch loss = 1829.2751982808113 | accuracy = 0.6707236842105263


Epoch[2] Batch[765] Speed: 1.2510550351240906 samples/sec                   batch loss = 1839.0008400082588 | accuracy = 0.6722222222222223


Epoch[2] Batch[770] Speed: 1.2556014505725834 samples/sec                   batch loss = 1847.769140779972 | accuracy = 0.6733766233766234


Epoch[2] Batch[775] Speed: 1.2556742806236094 samples/sec                   batch loss = 1862.4102969765663 | accuracy = 0.672258064516129


Epoch[2] Batch[780] Speed: 1.254055689611123 samples/sec                   batch loss = 1873.9883894324303 | accuracy = 0.6721153846153847


Epoch[2] Batch[785] Speed: 1.255813574004507 samples/sec                   batch loss = 1887.8799069523811 | accuracy = 0.671656050955414


[Epoch 2] training: accuracy=0.6722715736040609
[Epoch 2] time cost: 643.6291356086731
[Epoch 2] validation: validation accuracy=0.7477777777777778
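
The `batch loss` column in the output above appears to be a running total rather than a per-batch value: it grows steadily within an epoch and starts over at the beginning of the next one. Under that assumption, dividing it by the batch index gives an approximate mean loss per batch. The sketch below parses two lines copied from the log; the regular expression and the `parse_line` helper are illustrative names, not part of MXNet.

```python
import re

# Two lines copied verbatim from the training output above.
log_lines = [
    "Epoch[2] Batch[780] Speed: 1.254055689611123 samples/sec                   batch loss = 1873.9883894324303 | accuracy = 0.6721153846153847",
    "Epoch[2] Batch[785] Speed: 1.255813574004507 samples/sec                   batch loss = 1887.8799069523811 | accuracy = 0.671656050955414",
]

# Hypothetical pattern matching the log format shown in this tutorial.
PATTERN = re.compile(
    r"Epoch\[(\d+)\] Batch\[(\d+)\] Speed: ([\d.]+) samples/sec\s+"
    r"batch loss = ([\d.]+) \| accuracy = ([\d.]+)"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = PATTERN.match(line)
    if match is None:
        return None
    epoch, batch, speed, loss, accuracy = match.groups()
    return {
        "epoch": int(epoch),
        "batch": int(batch),
        "speed": float(speed),
        "running_loss": float(loss),  # assumed to be a cumulative sum
        "accuracy": float(accuracy),
    }


records = [parse_line(line) for line in log_lines]
last = records[-1]

# Assumption: the logged loss is accumulated since the start of the epoch.
mean_loss = last["running_loss"] / last["batch"]
print(f"Epoch {last['epoch']}: mean loss per batch ~ {mean_loss:.3f}")
```

If the logged value were instead a per-batch loss, this division would not be meaningful, so check how the training loop accumulates the loss before relying on it.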


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen to study more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).