<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one Nvidia GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[09:31:59] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet reports which GPU each ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. In the example code here, you copy `x` to the second GPU, `npx.gpu(1)`. If only one GPU is available, the code falls back to `npx.cpu()` instead of raising an error:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[09:31:59] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
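The zero-based indexing described above can be sketched without a GPU. The helper below is hypothetical (`select_device_ids` is not part of MXNet). It only computes which indices you would pass to `npx.gpu(i)`, assuming `num_available` holds the value reported by `npx.num_gpus()`.

```python
def select_device_ids(num_available, num_requested):
    """Return the zero-based GPU indices to use (hypothetical helper).

    An empty list means that no GPU is available, so the caller
    should fall back to the CPU.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # Never select more GPUs than the machine reports.
    count = min(num_available, num_requested)
    return list(range(count))


# On an eight-GPU machine, the last valid index is 7.
print(select_device_ids(num_available=8, num_requested=8)[-1])
# Requesting two GPUs on a one-GPU machine yields only index 0.
print(select_device_ids(num_available=1, num_requested=2))
```

Each returned index `i` would correspond to the device `npx.gpu(i)`. Clamping to the available count keeps the selection valid on machines with fewer GPUs than requested.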

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and the parameters to that GPU. To demonstrate this, you can reuse the LeafNetwork defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[09:31:59] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 2.8399243, -0.8696672]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, assuming there are *n* GPUs, you split each data batch into *n* parts and assign one part to each GPU, which then runs the forward and backward passes on its separate chunk of the data.
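To make the idea concrete, here is a minimal sketch in plain Python that needs neither MXNet nor a GPU. The helpers `split_batch` and `grad_sum` are hypothetical and written only for this illustration, and the "model" is a single weight `w` with a squared-error loss. The sketch assumes that per-device gradients are summed and then normalized by the full batch size.

```python
def split_batch(batch, num_devices):
    """Split a batch into equal contiguous shards, one per device."""
    if len(batch) % num_devices != 0:
        raise ValueError("batch size must be divisible by the number of devices")
    shard = len(batch) // num_devices
    return [batch[i * shard:(i + 1) * shard] for i in range(num_devices)]


def grad_sum(w, samples):
    """Sum of d/dw (w * x - y)**2 over the samples of one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


w = 0.5
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

# Each "device" computes the gradient for its own shard only.
shards = split_batch(batch, num_devices=2)
per_device = [grad_sum(w, shard) for shard in shards]

# Summing the per-device gradients and dividing by the full batch size
# gives the same mean gradient as processing the whole batch at once.
parallel_grad = sum(per_device) / len(batch)
single_grad = grad_sum(w, batch) / len(batch)
print(parallel_grad, single_grad)
```

Because the loss is a sum over samples, splitting the batch changes where the work happens but not the resulting gradient.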

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations on the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
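As a rough illustration of what the metric accumulates when labels and outputs arrive as one list per device, the following plain-Python sketch counts correct predictions across all shards. `accuracy_over_shards` is a hypothetical, simplified stand-in written for this illustration. It is not the implementation of `gluon.metric.Accuracy`, and it assumes each output row holds one score per class.

```python
def accuracy_over_shards(label_shards, output_shards):
    """Fraction of correct predictions across all device shards."""
    correct = 0
    total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            # The predicted class is the index of the highest score.
            predicted = max(range(len(scores)), key=lambda i: scores[i])
            correct += int(predicted == label)
            total += 1
    if total == 0:
        raise ValueError("no samples to evaluate")
    return correct / total


label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],  # device 0: both predictions correct
    [[1.5, 0.2], [0.8, 0.1]],   # device 1: first wrong, second correct
]
print(accuracy_over_shards(label_shards, output_shards))  # 0.75
```

The counts are pooled over every shard before dividing, so the result does not depend on how the batch was split across devices.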

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7914026392233005 samples/sec                   batch loss = 12.968069314956665 | accuracy = 0.65


Epoch[1] Batch[10] Speed: 1.2759862584354178 samples/sec                   batch loss = 26.932479858398438 | accuracy = 0.625


Epoch[1] Batch[15] Speed: 1.2768747391885447 samples/sec                   batch loss = 42.26755404472351 | accuracy = 0.55


Epoch[1] Batch[20] Speed: 1.2830368756125643 samples/sec                   batch loss = 56.61911368370056 | accuracy = 0.55


Epoch[1] Batch[25] Speed: 1.283520201431355 samples/sec                   batch loss = 70.03704142570496 | accuracy = 0.56


Epoch[1] Batch[30] Speed: 1.282984874019497 samples/sec                   batch loss = 84.26598715782166 | accuracy = 0.5416666666666666


Epoch[1] Batch[35] Speed: 1.2767258768540424 samples/sec                   batch loss = 98.08407711982727 | accuracy = 0.5357142857142857


Epoch[1] Batch[40] Speed: 1.2814929502372836 samples/sec                   batch loss = 113.1574375629425 | accuracy = 0.525


Epoch[1] Batch[45] Speed: 1.2820221348278373 samples/sec                   batch loss = 127.38133096694946 | accuracy = 0.5055555555555555


Epoch[1] Batch[50] Speed: 1.285433317309998 samples/sec                   batch loss = 141.28444123268127 | accuracy = 0.5


Epoch[1] Batch[55] Speed: 1.2784070629766437 samples/sec                   batch loss = 156.5863676071167 | accuracy = 0.4863636363636364


Epoch[1] Batch[60] Speed: 1.2792094729056902 samples/sec                   batch loss = 170.1928949356079 | accuracy = 0.5083333333333333


Epoch[1] Batch[65] Speed: 1.276539070707956 samples/sec                   batch loss = 183.26196694374084 | accuracy = 0.5307692307692308


Epoch[1] Batch[70] Speed: 1.282849983877604 samples/sec                   batch loss = 196.90106081962585 | accuracy = 0.5321428571428571


Epoch[1] Batch[75] Speed: 1.2834541201850531 samples/sec                   batch loss = 211.33921551704407 | accuracy = 0.5233333333333333


Epoch[1] Batch[80] Speed: 1.273613373853814 samples/sec                   batch loss = 226.11382746696472 | accuracy = 0.515625


Epoch[1] Batch[85] Speed: 1.2799505878450432 samples/sec                   batch loss = 239.0339195728302 | accuracy = 0.5264705882352941


Epoch[1] Batch[90] Speed: 1.2790138462882545 samples/sec                   batch loss = 252.86609506607056 | accuracy = 0.5277777777777778


Epoch[1] Batch[95] Speed: 1.28526965264382 samples/sec                   batch loss = 266.7805299758911 | accuracy = 0.5289473684210526


Epoch[1] Batch[100] Speed: 1.2813965415646553 samples/sec                   batch loss = 280.08078026771545 | accuracy = 0.5325


Epoch[1] Batch[105] Speed: 1.2868212166834987 samples/sec                   batch loss = 293.14868664741516 | accuracy = 0.5404761904761904


Epoch[1] Batch[110] Speed: 1.2801857691446619 samples/sec                   batch loss = 307.79704332351685 | accuracy = 0.5363636363636364


Epoch[1] Batch[115] Speed: 1.2854410978354651 samples/sec                   batch loss = 322.32532691955566 | accuracy = 0.5347826086956522


Epoch[1] Batch[120] Speed: 1.2825575413863537 samples/sec                   batch loss = 335.5798816680908 | accuracy = 0.5416666666666666


Epoch[1] Batch[125] Speed: 1.2844771287616428 samples/sec                   batch loss = 348.7042429447174 | accuracy = 0.548


Epoch[1] Batch[130] Speed: 1.2776643283865 samples/sec                   batch loss = 362.7979187965393 | accuracy = 0.5461538461538461


Epoch[1] Batch[135] Speed: 1.2813444771266116 samples/sec                   batch loss = 376.6070201396942 | accuracy = 0.5462962962962963


Epoch[1] Batch[140] Speed: 1.2818495436530737 samples/sec                   batch loss = 390.1017949581146 | accuracy = 0.5517857142857143


Epoch[1] Batch[145] Speed: 1.2795278109724675 samples/sec                   batch loss = 403.3778464794159 | accuracy = 0.5568965517241379


Epoch[1] Batch[150] Speed: 1.2751695675507764 samples/sec                   batch loss = 416.44438219070435 | accuracy = 0.5633333333333334


Epoch[1] Batch[155] Speed: 1.2851147905945546 samples/sec                   batch loss = 430.56561303138733 | accuracy = 0.5580645161290323


Epoch[1] Batch[160] Speed: 1.2839409047346788 samples/sec                   batch loss = 444.1754093170166 | accuracy = 0.5609375


Epoch[1] Batch[165] Speed: 1.2815777235524934 samples/sec                   batch loss = 458.7004973888397 | accuracy = 0.5545454545454546


Epoch[1] Batch[170] Speed: 1.2824040183360699 samples/sec                   batch loss = 473.0439102649689 | accuracy = 0.5455882352941176


Epoch[1] Batch[175] Speed: 1.2793881832021856 samples/sec                   batch loss = 486.27736353874207 | accuracy = 0.5485714285714286


Epoch[1] Batch[180] Speed: 1.2808696333974918 samples/sec                   batch loss = 500.42449951171875 | accuracy = 0.5416666666666666


Epoch[1] Batch[185] Speed: 1.2822621941358334 samples/sec                   batch loss = 513.8801367282867 | accuracy = 0.5459459459459459


Epoch[1] Batch[190] Speed: 1.2802231834391906 samples/sec                   batch loss = 527.9621317386627 | accuracy = 0.5421052631578948


Epoch[1] Batch[195] Speed: 1.283778995130089 samples/sec                   batch loss = 542.0457456111908 | accuracy = 0.5397435897435897


Epoch[1] Batch[200] Speed: 1.281495984654327 samples/sec                   batch loss = 555.1899628639221 | accuracy = 0.5425


Epoch[1] Batch[205] Speed: 1.2763576597666997 samples/sec                   batch loss = 568.8353078365326 | accuracy = 0.5414634146341464


Epoch[1] Batch[210] Speed: 1.2809182363700051 samples/sec                   batch loss = 582.3516652584076 | accuracy = 0.544047619047619


Epoch[1] Batch[215] Speed: 1.2768048706240487 samples/sec                   batch loss = 595.819117307663 | accuracy = 0.5430232558139535


Epoch[1] Batch[220] Speed: 1.2796775226701829 samples/sec                   batch loss = 609.0972526073456 | accuracy = 0.5443181818181818


Epoch[1] Batch[225] Speed: 1.2816017088006488 samples/sec                   batch loss = 622.9156458377838 | accuracy = 0.5444444444444444


Epoch[1] Batch[230] Speed: 1.285065573347331 samples/sec                   batch loss = 636.5467851161957 | accuracy = 0.5478260869565217


Epoch[1] Batch[235] Speed: 1.2820850314736092 samples/sec                   batch loss = 649.5095415115356 | accuracy = 0.55


Epoch[1] Batch[240] Speed: 1.279396183411363 samples/sec                   batch loss = 663.687173128128 | accuracy = 0.5489583333333333


Epoch[1] Batch[245] Speed: 1.2790861021921516 samples/sec                   batch loss = 677.9578444957733 | accuracy = 0.5469387755102041


Epoch[1] Batch[250] Speed: 1.2888935939348338 samples/sec                   batch loss = 691.1340100765228 | accuracy = 0.551


Epoch[1] Batch[255] Speed: 1.2802144890608198 samples/sec                   batch loss = 705.0846943855286 | accuracy = 0.5490196078431373


Epoch[1] Batch[260] Speed: 1.2810694474711783 samples/sec                   batch loss = 718.8093600273132 | accuracy = 0.55


Epoch[1] Batch[265] Speed: 1.2805176897612034 samples/sec                   batch loss = 732.4804146289825 | accuracy = 0.5518867924528302


Epoch[1] Batch[270] Speed: 1.2755749212897458 samples/sec                   batch loss = 745.4765982627869 | accuracy = 0.5555555555555556


Epoch[1] Batch[275] Speed: 1.2868944561466271 samples/sec                   batch loss = 758.9178714752197 | accuracy = 0.5554545454545454


Epoch[1] Batch[280] Speed: 1.281788041118149 samples/sec                   batch loss = 772.3350625038147 | accuracy = 0.5571428571428572


Epoch[1] Batch[285] Speed: 1.2853549263847477 samples/sec                   batch loss = 786.8950729370117 | accuracy = 0.5535087719298246


Epoch[1] Batch[290] Speed: 1.2848747445140989 samples/sec                   batch loss = 800.8448987007141 | accuracy = 0.5543103448275862


Epoch[1] Batch[295] Speed: 1.2803244963055975 samples/sec                   batch loss = 814.8488490581512 | accuracy = 0.5550847457627118


Epoch[1] Batch[300] Speed: 1.2806934418759914 samples/sec                   batch loss = 828.7897157669067 | accuracy = 0.555


Epoch[1] Batch[305] Speed: 1.2806564888724314 samples/sec                   batch loss = 842.2586464881897 | accuracy = 0.5573770491803278


Epoch[1] Batch[310] Speed: 1.2849735472104706 samples/sec                   batch loss = 856.0435256958008 | accuracy = 0.5564516129032258


Epoch[1] Batch[315] Speed: 1.2831039933334822 samples/sec                   batch loss = 868.7150266170502 | accuracy = 0.5611111111111111


Epoch[1] Batch[320] Speed: 1.2814972571560737 samples/sec                   batch loss = 881.4799790382385 | accuracy = 0.56328125


Epoch[1] Batch[325] Speed: 1.2847703490124043 samples/sec                   batch loss = 895.2329692840576 | accuracy = 0.5607692307692308


Epoch[1] Batch[330] Speed: 1.276355038037964 samples/sec                   batch loss = 909.3858151435852 | accuracy = 0.5613636363636364


Epoch[1] Batch[335] Speed: 1.2807814338017358 samples/sec                   batch loss = 923.662232875824 | accuracy = 0.558955223880597


Epoch[1] Batch[340] Speed: 1.2723694912824814 samples/sec                   batch loss = 937.2172648906708 | accuracy = 0.5588235294117647


Epoch[1] Batch[345] Speed: 1.2735141836318984 samples/sec                   batch loss = 950.7293040752411 | accuracy = 0.5608695652173913


Epoch[1] Batch[350] Speed: 1.2731089790627603 samples/sec                   batch loss = 963.4414753913879 | accuracy = 0.5642857142857143


Epoch[1] Batch[355] Speed: 1.2825708759019907 samples/sec                   batch loss = 976.799569606781 | accuracy = 0.5654929577464789


Epoch[1] Batch[360] Speed: 1.2761929972691937 samples/sec                   batch loss = 991.0554876327515 | accuracy = 0.5638888888888889


Epoch[1] Batch[365] Speed: 1.279576604939604 samples/sec                   batch loss = 1004.7702422142029 | accuracy = 0.563013698630137


Epoch[1] Batch[370] Speed: 1.2788887585395046 samples/sec                   batch loss = 1018.8951570987701 | accuracy = 0.5614864864864865


Epoch[1] Batch[375] Speed: 1.2783829049252577 samples/sec                   batch loss = 1032.592349767685 | accuracy = 0.562


Epoch[1] Batch[380] Speed: 1.2793058454000483 samples/sec                   batch loss = 1045.8033170700073 | accuracy = 0.5605263157894737


Epoch[1] Batch[385] Speed: 1.2778976968355662 samples/sec                   batch loss = 1058.7507209777832 | accuracy = 0.562987012987013


Epoch[1] Batch[390] Speed: 1.2793665246013899 samples/sec                   batch loss = 1072.047311782837 | accuracy = 0.5634615384615385


Epoch[1] Batch[395] Speed: 1.278396445010954 samples/sec                   batch loss = 1085.2281377315521 | accuracy = 0.5658227848101266


Epoch[1] Batch[400] Speed: 1.2765278038587058 samples/sec                   batch loss = 1099.1626143455505 | accuracy = 0.565


Epoch[1] Batch[405] Speed: 1.2869645448402454 samples/sec                   batch loss = 1113.4433863162994 | accuracy = 0.5635802469135802


Epoch[1] Batch[410] Speed: 1.2834122951563953 samples/sec                   batch loss = 1126.5169548988342 | accuracy = 0.5646341463414634


Epoch[1] Batch[415] Speed: 1.2850462812149364 samples/sec                   batch loss = 1140.0870649814606 | accuracy = 0.5650602409638554


Epoch[1] Batch[420] Speed: 1.28439374134632 samples/sec                   batch loss = 1153.3181474208832 | accuracy = 0.5654761904761905


Epoch[1] Batch[425] Speed: 1.282551462508003 samples/sec                   batch loss = 1167.39826130867 | accuracy = 0.5647058823529412


Epoch[1] Batch[430] Speed: 1.282132844890594 samples/sec                   batch loss = 1181.7631704807281 | accuracy = 0.5651162790697675


Epoch[1] Batch[435] Speed: 1.2809655715825834 samples/sec                   batch loss = 1194.9446585178375 | accuracy = 0.5649425287356322


Epoch[1] Batch[440] Speed: 1.275409975893184 samples/sec                   batch loss = 1207.5848207473755 | accuracy = 0.5670454545454545


Epoch[1] Batch[445] Speed: 1.2734229345398298 samples/sec                   batch loss = 1221.1042296886444 | accuracy = 0.5668539325842696


Epoch[1] Batch[450] Speed: 1.2810436236429794 samples/sec                   batch loss = 1234.3161273002625 | accuracy = 0.5677777777777778


Epoch[1] Batch[455] Speed: 1.2819679624460125 samples/sec                   batch loss = 1248.3063898086548 | accuracy = 0.5675824175824176


Epoch[1] Batch[460] Speed: 1.2749633539197835 samples/sec                   batch loss = 1262.4297604560852 | accuracy = 0.5673913043478261


Epoch[1] Batch[465] Speed: 1.274980115977088 samples/sec                   batch loss = 1275.8020856380463 | accuracy = 0.5682795698924731


Epoch[1] Batch[470] Speed: 1.2762331880863718 samples/sec                   batch loss = 1288.7386338710785 | accuracy = 0.5702127659574469


Epoch[1] Batch[475] Speed: 1.2783481306237507 samples/sec                   batch loss = 1302.2526516914368 | accuracy = 0.5689473684210526


Epoch[1] Batch[480] Speed: 1.2736772852901694 samples/sec                   batch loss = 1315.6159582138062 | accuracy = 0.56875


Epoch[1] Batch[485] Speed: 1.2721723817259647 samples/sec                   batch loss = 1330.1320796012878 | accuracy = 0.5670103092783505


Epoch[1] Batch[490] Speed: 1.2755003463352816 samples/sec                   batch loss = 1343.126856803894 | accuracy = 0.5683673469387756


Epoch[1] Batch[495] Speed: 1.2730923627734587 samples/sec                   batch loss = 1357.0656611919403 | accuracy = 0.5681818181818182


Epoch[1] Batch[500] Speed: 1.2764150491938802 samples/sec                   batch loss = 1369.1624376773834 | accuracy = 0.5695


Epoch[1] Batch[505] Speed: 1.2777316636595388 samples/sec                   batch loss = 1382.883413553238 | accuracy = 0.5702970297029702


Epoch[1] Batch[510] Speed: 1.2742932328434238 samples/sec                   batch loss = 1395.2489068508148 | accuracy = 0.5720588235294117


Epoch[1] Batch[515] Speed: 1.2753887426366006 samples/sec                   batch loss = 1408.7077445983887 | accuracy = 0.570873786407767


Epoch[1] Batch[520] Speed: 1.27798783607688 samples/sec                   batch loss = 1423.2737913131714 | accuracy = 0.5692307692307692


Epoch[1] Batch[525] Speed: 1.275050560191955 samples/sec                   batch loss = 1436.2949900627136 | accuracy = 0.5704761904761905


Epoch[1] Batch[530] Speed: 1.2773362191608777 samples/sec                   batch loss = 1449.5327050685883 | accuracy = 0.5707547169811321


Epoch[1] Batch[535] Speed: 1.2717967595335962 samples/sec                   batch loss = 1462.8289539813995 | accuracy = 0.5710280373831775


Epoch[1] Batch[540] Speed: 1.2761945504904655 samples/sec                   batch loss = 1475.374792098999 | accuracy = 0.5726851851851852


Epoch[1] Batch[545] Speed: 1.2878158925196876 samples/sec                   batch loss = 1488.3487615585327 | accuracy = 0.573394495412844


Epoch[1] Batch[550] Speed: 1.277554777777854 samples/sec                   batch loss = 1501.3805561065674 | accuracy = 0.5745454545454546


Epoch[1] Batch[555] Speed: 1.2788914881746558 samples/sec                   batch loss = 1513.5614104270935 | accuracy = 0.5756756756756757


Epoch[1] Batch[560] Speed: 1.2816021004039855 samples/sec                   batch loss = 1527.5324754714966 | accuracy = 0.5754464285714286


Epoch[1] Batch[565] Speed: 1.276243964315531 samples/sec                   batch loss = 1542.089033126831 | accuracy = 0.5743362831858407


Epoch[1] Batch[570] Speed: 1.2801180772976266 samples/sec                   batch loss = 1554.970524072647 | accuracy = 0.5754385964912281


Epoch[1] Batch[575] Speed: 1.2754822130414838 samples/sec                   batch loss = 1568.6839022636414 | accuracy = 0.5756521739130435


Epoch[1] Batch[580] Speed: 1.2768313012263268 samples/sec                   batch loss = 1582.4507083892822 | accuracy = 0.5758620689655173


Epoch[1] Batch[585] Speed: 1.275313220024463 samples/sec                   batch loss = 1596.003521680832 | accuracy = 0.5764957264957264


Epoch[1] Batch[590] Speed: 1.2793292579147042 samples/sec                   batch loss = 1608.5758578777313 | accuracy = 0.5775423728813559


Epoch[1] Batch[595] Speed: 1.2824613646002108 samples/sec                   batch loss = 1622.5042262077332 | accuracy = 0.5756302521008403


Epoch[1] Batch[600] Speed: 1.2802567897873192 samples/sec                   batch loss = 1635.6988551616669 | accuracy = 0.5758333333333333


Epoch[1] Batch[605] Speed: 1.2822810107457387 samples/sec                   batch loss = 1648.7989525794983 | accuracy = 0.5768595041322314


Epoch[1] Batch[610] Speed: 1.2785346871894105 samples/sec                   batch loss = 1662.0491588115692 | accuracy = 0.5770491803278689


Epoch[1] Batch[615] Speed: 1.2798522631474196 samples/sec                   batch loss = 1675.9172041416168 | accuracy = 0.5764227642276423


Epoch[1] Batch[620] Speed: 1.2806376221559288 samples/sec                   batch loss = 1689.0530877113342 | accuracy = 0.5770161290322581


Epoch[1] Batch[625] Speed: 1.2805307864198776 samples/sec                   batch loss = 1701.2443542480469 | accuracy = 0.5776


Epoch[1] Batch[630] Speed: 1.282841351874257 samples/sec                   batch loss = 1714.524531841278 | accuracy = 0.5785714285714286


Epoch[1] Batch[635] Speed: 1.2825124414030384 samples/sec                   batch loss = 1727.7516095638275 | accuracy = 0.5783464566929134


Epoch[1] Batch[640] Speed: 1.2783904055128343 samples/sec                   batch loss = 1741.4538855552673 | accuracy = 0.578515625


Epoch[1] Batch[645] Speed: 1.2828820606351825 samples/sec                   batch loss = 1755.6395392417908 | accuracy = 0.577906976744186


Epoch[1] Batch[650] Speed: 1.2754381910403465 samples/sec                   batch loss = 1769.1902537345886 | accuracy = 0.5776923076923077


Epoch[1] Batch[655] Speed: 1.2865959243570784 samples/sec                   batch loss = 1782.8958253860474 | accuracy = 0.5774809160305343


Epoch[1] Batch[660] Speed: 1.2806271626096184 samples/sec                   batch loss = 1796.2835278511047 | accuracy = 0.5787878787878787


Epoch[1] Batch[665] Speed: 1.2791417867376274 samples/sec                   batch loss = 1810.1255860328674 | accuracy = 0.5781954887218045


Epoch[1] Batch[670] Speed: 1.2757275895484301 samples/sec                   batch loss = 1821.839985370636 | accuracy = 0.5802238805970149


Epoch[1] Batch[675] Speed: 1.283388143916622 samples/sec                   batch loss = 1834.7290151119232 | accuracy = 0.5807407407407408


Epoch[1] Batch[680] Speed: 1.2816465489378024 samples/sec                   batch loss = 1847.275817155838 | accuracy = 0.58125


Epoch[1] Batch[685] Speed: 1.2731316822522174 samples/sec                   batch loss = 1858.8573942184448 | accuracy = 0.5828467153284671


Epoch[1] Batch[690] Speed: 1.2839354022816976 samples/sec                   batch loss = 1869.1211994886398 | accuracy = 0.5855072463768116


Epoch[1] Batch[695] Speed: 1.2636110974912707 samples/sec                   batch loss = 1881.588182091713 | accuracy = 0.5870503597122302


Epoch[1] Batch[700] Speed: 1.2545757738457557 samples/sec                   batch loss = 1894.9556738138199 | accuracy = 0.5871428571428572


Epoch[1] Batch[705] Speed: 1.251415983159322 samples/sec                   batch loss = 1907.6710251569748 | accuracy = 0.5879432624113475


Epoch[1] Batch[710] Speed: 1.2517639706573054 samples/sec                   batch loss = 1919.2978078126907 | accuracy = 0.5890845070422536


Epoch[1] Batch[715] Speed: 1.2530345009873205 samples/sec                   batch loss = 1931.6479700803757 | accuracy = 0.5895104895104896


Epoch[1] Batch[720] Speed: 1.2552236217519814 samples/sec                   batch loss = 1944.3759433031082 | accuracy = 0.5895833333333333


Epoch[1] Batch[725] Speed: 1.2521721477569487 samples/sec                   batch loss = 1956.9075459241867 | accuracy = 0.59


Epoch[1] Batch[730] Speed: 1.2528166730226424 samples/sec                   batch loss = 1969.6953715085983 | accuracy = 0.5900684931506849


Epoch[1] Batch[735] Speed: 1.2491873722683824 samples/sec                   batch loss = 1981.1190519332886 | accuracy = 0.5914965986394558


Epoch[1] Batch[740] Speed: 1.2505946341780123 samples/sec                   batch loss = 1992.6468230485916 | accuracy = 0.5925675675675676


Epoch[1] Batch[745] Speed: 1.2562097232989797 samples/sec                   batch loss = 2004.977090716362 | accuracy = 0.5939597315436241


Epoch[1] Batch[750] Speed: 1.252919963439733 samples/sec                   batch loss = 2016.22163438797 | accuracy = 0.5946666666666667


Epoch[1] Batch[755] Speed: 1.249834693480442 samples/sec                   batch loss = 2029.462243795395 | accuracy = 0.595364238410596


Epoch[1] Batch[760] Speed: 1.2494854512820894 samples/sec                   batch loss = 2043.6699495315552 | accuracy = 0.5947368421052631


Epoch[1] Batch[765] Speed: 1.2499875949064163 samples/sec                   batch loss = 2056.098275542259 | accuracy = 0.5950980392156863


Epoch[1] Batch[770] Speed: 1.2541494342010229 samples/sec                   batch loss = 2068.7004801034927 | accuracy = 0.5957792207792207


Epoch[1] Batch[775] Speed: 1.2521335515303007 samples/sec                   batch loss = 2081.6029633283615 | accuracy = 0.5954838709677419


Epoch[1] Batch[780] Speed: 1.2523299217231096 samples/sec                   batch loss = 2093.6536391973495 | accuracy = 0.596474358974359


Epoch[1] Batch[785] Speed: 1.266444315212475 samples/sec                   batch loss = 2106.6643224954605 | accuracy = 0.5961783439490446


[Epoch 1] training: accuracy=0.5961294416243654
[Epoch 1] time cost: 637.2299082279205
[Epoch 1] validation: validation accuracy=0.6777777777777778


Epoch[2] Batch[5] Speed: 1.2484404740723676 samples/sec                   batch loss = 11.896809577941895 | accuracy = 0.65


Epoch[2] Batch[10] Speed: 1.2477864291938847 samples/sec                   batch loss = 25.887200355529785 | accuracy = 0.525


Epoch[2] Batch[15] Speed: 1.2488336578259185 samples/sec                   batch loss = 36.90951323509216 | accuracy = 0.6166666666666667


Epoch[2] Batch[20] Speed: 1.2490482430017869 samples/sec                   batch loss = 50.58242583274841 | accuracy = 0.5875


Epoch[2] Batch[25] Speed: 1.250119947229925 samples/sec                   batch loss = 63.36326313018799 | accuracy = 0.6


Epoch[2] Batch[30] Speed: 1.2477178517664533 samples/sec                   batch loss = 75.71899509429932 | accuracy = 0.6416666666666667


Epoch[2] Batch[35] Speed: 1.2500811979762014 samples/sec                   batch loss = 86.25183987617493 | accuracy = 0.6571428571428571


Epoch[2] Batch[40] Speed: 1.2395893603714065 samples/sec                   batch loss = 101.73214602470398 | accuracy = 0.63125


Epoch[2] Batch[45] Speed: 1.2480091026054543 samples/sec                   batch loss = 111.49841141700745 | accuracy = 0.6611111111111111


Epoch[2] Batch[50] Speed: 1.253899823386033 samples/sec                   batch loss = 124.0203058719635 | accuracy = 0.665


Epoch[2] Batch[55] Speed: 1.2516477043150367 samples/sec                   batch loss = 136.1659083366394 | accuracy = 0.6681818181818182


Epoch[2] Batch[60] Speed: 1.253481996959316 samples/sec                   batch loss = 149.1214382648468 | accuracy = 0.6541666666666667


Epoch[2] Batch[65] Speed: 1.2535276070640158 samples/sec                   batch loss = 159.44292283058167 | accuracy = 0.6615384615384615


Epoch[2] Batch[70] Speed: 1.250192142751222 samples/sec                   batch loss = 172.64009714126587 | accuracy = 0.6535714285714286


Epoch[2] Batch[75] Speed: 1.2508160729846292 samples/sec                   batch loss = 184.5764455795288 | accuracy = 0.6633333333333333


Epoch[2] Batch[80] Speed: 1.255550991350153 samples/sec                   batch loss = 196.12601912021637 | accuracy = 0.66875


Epoch[2] Batch[85] Speed: 1.2533458419776626 samples/sec                   batch loss = 210.3841973543167 | accuracy = 0.6735294117647059


Epoch[2] Batch[90] Speed: 1.256210569838441 samples/sec                   batch loss = 218.75759494304657 | accuracy = 0.6888888888888889


Epoch[2] Batch[95] Speed: 1.249832086472338 samples/sec                   batch loss = 233.17889142036438 | accuracy = 0.6789473684210526


Epoch[2] Batch[100] Speed: 1.2545173296061456 samples/sec                   batch loss = 246.79881381988525 | accuracy = 0.6725


Epoch[2] Batch[105] Speed: 1.2554355235862744 samples/sec                   batch loss = 257.41655254364014 | accuracy = 0.6714285714285714


Epoch[2] Batch[110] Speed: 1.2500230785994886 samples/sec                   batch loss = 268.824670791626 | accuracy = 0.6727272727272727


Epoch[2] Batch[115] Speed: 1.2528407164630275 samples/sec                   batch loss = 282.335889339447 | accuracy = 0.6695652173913044


Epoch[2] Batch[120] Speed: 1.2550146078617945 samples/sec                   batch loss = 292.45604372024536 | accuracy = 0.675


Epoch[2] Batch[125] Speed: 1.2500051036685467 samples/sec                   batch loss = 302.98465740680695 | accuracy = 0.676


Epoch[2] Batch[130] Speed: 1.2459594101846532 samples/sec                   batch loss = 312.61594891548157 | accuracy = 0.676923076923077


Epoch[2] Batch[135] Speed: 1.2437897311456003 samples/sec                   batch loss = 326.53741812705994 | accuracy = 0.6722222222222223


Epoch[2] Batch[140] Speed: 1.2473625576870804 samples/sec                   batch loss = 339.2913999557495 | accuracy = 0.6696428571428571


Epoch[2] Batch[145] Speed: 1.2464977112049764 samples/sec                   batch loss = 350.6291183233261 | accuracy = 0.6724137931034483


Epoch[2] Batch[150] Speed: 1.249882738862576 samples/sec                   batch loss = 362.94329726696014 | accuracy = 0.675


Epoch[2] Batch[155] Speed: 1.2471588419241875 samples/sec                   batch loss = 375.8286658525467 | accuracy = 0.6725806451612903


Epoch[2] Batch[160] Speed: 1.2521071056107438 samples/sec                   batch loss = 389.3056448698044 | accuracy = 0.6734375


Epoch[2] Batch[165] Speed: 1.252473710111275 samples/sec                   batch loss = 399.436311006546 | accuracy = 0.6772727272727272


Epoch[2] Batch[170] Speed: 1.253905914844016 samples/sec                   batch loss = 412.42099690437317 | accuracy = 0.6779411764705883


Epoch[2] Batch[175] Speed: 1.2465666176774348 samples/sec                   batch loss = 424.2552065849304 | accuracy = 0.6771428571428572


Epoch[2] Batch[180] Speed: 1.2481080734449734 samples/sec                   batch loss = 437.4175281524658 | accuracy = 0.6708333333333333


Epoch[2] Batch[185] Speed: 1.2497567672560046 samples/sec                   batch loss = 451.2747358083725 | accuracy = 0.6675675675675675


Epoch[2] Batch[190] Speed: 1.2515517191475622 samples/sec                   batch loss = 460.46973395347595 | accuracy = 0.6710526315789473


Epoch[2] Batch[195] Speed: 1.245535761679819 samples/sec                   batch loss = 471.6562954187393 | accuracy = 0.6730769230769231


Epoch[2] Batch[200] Speed: 1.2492242057919416 samples/sec                   batch loss = 482.303853392601 | accuracy = 0.67125


Epoch[2] Batch[205] Speed: 1.2500785899396383 samples/sec                   batch loss = 494.83514416217804 | accuracy = 0.6731707317073171


Epoch[2] Batch[210] Speed: 1.2522570117503293 samples/sec                   batch loss = 508.30999767780304 | accuracy = 0.6702380952380952


Epoch[2] Batch[215] Speed: 1.2522637415460036 samples/sec                   batch loss = 521.7217141389847 | accuracy = 0.6686046511627907


Epoch[2] Batch[220] Speed: 1.2476310040307101 samples/sec                   batch loss = 534.2168892621994 | accuracy = 0.6681818181818182


Epoch[2] Batch[225] Speed: 1.2471067414331367 samples/sec                   batch loss = 544.9880423545837 | accuracy = 0.6711111111111111


Epoch[2] Batch[230] Speed: 1.2488117199958555 samples/sec                   batch loss = 558.1911939382553 | accuracy = 0.6706521739130434


Epoch[2] Batch[235] Speed: 1.2529381158703028 samples/sec                   batch loss = 570.8611903190613 | accuracy = 0.6691489361702128


Epoch[2] Batch[240] Speed: 1.2496809916933076 samples/sec                   batch loss = 582.7719550132751 | accuracy = 0.6697916666666667


Epoch[2] Batch[245] Speed: 1.2518863304532932 samples/sec                   batch loss = 596.7056686878204 | accuracy = 0.6683673469387755


Epoch[2] Batch[250] Speed: 1.2535123408894018 samples/sec                   batch loss = 608.3944872617722 | accuracy = 0.668


Epoch[2] Batch[255] Speed: 1.2509084944294393 samples/sec                   batch loss = 618.9314960241318 | accuracy = 0.6686274509803921


Epoch[2] Batch[260] Speed: 1.2542919528230214 samples/sec                   batch loss = 630.6267523765564 | accuracy = 0.6701923076923076


Epoch[2] Batch[265] Speed: 1.2529743287334525 samples/sec                   batch loss = 646.5603384971619 | accuracy = 0.6679245283018868


Epoch[2] Batch[270] Speed: 1.2539100383260886 samples/sec                   batch loss = 660.0439960956573 | accuracy = 0.6657407407407407


Epoch[2] Batch[275] Speed: 1.2567267950673302 samples/sec                   batch loss = 671.6236028671265 | accuracy = 0.6654545454545454


Epoch[2] Batch[280] Speed: 1.2439424482536392 samples/sec                   batch loss = 685.1014404296875 | accuracy = 0.6642857142857143


Epoch[2] Batch[285] Speed: 1.2482599954376636 samples/sec                   batch loss = 696.4866855144501 | accuracy = 0.6666666666666666


Epoch[2] Batch[290] Speed: 1.2521807457876681 samples/sec                   batch loss = 708.9356288909912 | accuracy = 0.6655172413793103


Epoch[2] Batch[295] Speed: 1.2560771128731385 samples/sec                   batch loss = 721.1980695724487 | accuracy = 0.6652542372881356


Epoch[2] Batch[300] Speed: 1.2560017913347996 samples/sec                   batch loss = 732.6147881746292 | accuracy = 0.6675


Epoch[2] Batch[305] Speed: 1.2467201101457128 samples/sec                   batch loss = 742.0072212219238 | accuracy = 0.6680327868852459


Epoch[2] Batch[310] Speed: 1.2546387271036241 samples/sec                   batch loss = 752.3828115463257 | accuracy = 0.6685483870967742


Epoch[2] Batch[315] Speed: 1.2554609829196797 samples/sec                   batch loss = 764.6499339342117 | accuracy = 0.6674603174603174


Epoch[2] Batch[320] Speed: 1.2511446927933816 samples/sec                   batch loss = 775.5399340391159 | accuracy = 0.6703125


Epoch[2] Batch[325] Speed: 1.2535319153768885 samples/sec                   batch loss = 787.3435570001602 | accuracy = 0.67


Epoch[2] Batch[330] Speed: 1.2616157467350086 samples/sec                   batch loss = 800.205689072609 | accuracy = 0.6696969696969697


Epoch[2] Batch[335] Speed: 1.2554213382063937 samples/sec                   batch loss = 810.2207024097443 | accuracy = 0.6716417910447762


Epoch[2] Batch[340] Speed: 1.2581117618099718 samples/sec                   batch loss = 820.9109343290329 | accuracy = 0.6735294117647059


Epoch[2] Batch[345] Speed: 1.2580510066115627 samples/sec                   batch loss = 836.1142872571945 | accuracy = 0.6710144927536232


Epoch[2] Batch[350] Speed: 1.2518759616346036 samples/sec                   batch loss = 851.7970176935196 | accuracy = 0.6685714285714286


Epoch[2] Batch[355] Speed: 1.2555060794762458 samples/sec                   batch loss = 864.9497363567352 | accuracy = 0.6683098591549296


Epoch[2] Batch[360] Speed: 1.2553157564453914 samples/sec                   batch loss = 875.6683630943298 | accuracy = 0.6694444444444444


Epoch[2] Batch[365] Speed: 1.2551187305187725 samples/sec                   batch loss = 888.5554178953171 | accuracy = 0.6691780821917809


Epoch[2] Batch[370] Speed: 1.2553470345997533 samples/sec                   batch loss = 900.6913174390793 | accuracy = 0.6709459459459459


Epoch[2] Batch[375] Speed: 1.2524766086562287 samples/sec                   batch loss = 914.5728467702866 | accuracy = 0.6706666666666666


Epoch[2] Batch[380] Speed: 1.2571967159689952 samples/sec                   batch loss = 926.3562635183334 | accuracy = 0.6697368421052632


Epoch[2] Batch[385] Speed: 1.2549003650959834 samples/sec                   batch loss = 937.7012485265732 | accuracy = 0.6707792207792208


Epoch[2] Batch[390] Speed: 1.2590226606902946 samples/sec                   batch loss = 949.721799492836 | accuracy = 0.6705128205128205


Epoch[2] Batch[395] Speed: 1.255699749641677 samples/sec                   batch loss = 957.9529601335526 | accuracy = 0.6734177215189874


Epoch[2] Batch[400] Speed: 1.2521878485977573 samples/sec                   batch loss = 966.5175673961639 | accuracy = 0.675


Epoch[2] Batch[405] Speed: 1.2557409157741544 samples/sec                   batch loss = 979.2535847425461 | accuracy = 0.6753086419753086


Epoch[2] Batch[410] Speed: 1.257269731206881 samples/sec                   batch loss = 989.5878516435623 | accuracy = 0.675609756097561


Epoch[2] Batch[415] Speed: 1.2558734552176876 samples/sec                   batch loss = 1000.1306793689728 | accuracy = 0.6771084337349398


Epoch[2] Batch[420] Speed: 1.2591068494170523 samples/sec                   batch loss = 1009.9330874681473 | accuracy = 0.6779761904761905


Epoch[2] Batch[425] Speed: 1.2572819797548507 samples/sec                   batch loss = 1021.7060887813568 | accuracy = 0.6770588235294117


Epoch[2] Batch[430] Speed: 1.2584388463376466 samples/sec                   batch loss = 1033.0438485145569 | accuracy = 0.6773255813953488


Epoch[2] Batch[435] Speed: 1.2575089979655405 samples/sec                   batch loss = 1043.4736907482147 | accuracy = 0.6787356321839081


Epoch[2] Batch[440] Speed: 1.2545757738457557 samples/sec                   batch loss = 1053.5612193346024 | accuracy = 0.6789772727272727


Epoch[2] Batch[445] Speed: 1.2593077762023392 samples/sec                   batch loss = 1066.8638838529587 | accuracy = 0.6792134831460674


Epoch[2] Batch[450] Speed: 1.2571745775402288 samples/sec                   batch loss = 1076.4837898015976 | accuracy = 0.6811111111111111


Epoch[2] Batch[455] Speed: 1.2577435470160405 samples/sec                   batch loss = 1088.5381840467453 | accuracy = 0.6813186813186813


Epoch[2] Batch[460] Speed: 1.2591420967504314 samples/sec                   batch loss = 1100.790209889412 | accuracy = 0.6809782608695653


Epoch[2] Batch[465] Speed: 1.2559715148100095 samples/sec                   batch loss = 1117.1079132556915 | accuracy = 0.6790322580645162


Epoch[2] Batch[470] Speed: 1.2552000502163334 samples/sec                   batch loss = 1130.1263514757156 | accuracy = 0.6781914893617021


Epoch[2] Batch[475] Speed: 1.256370209948988 samples/sec                   batch loss = 1140.810925602913 | accuracy = 0.6805263157894736


Epoch[2] Batch[480] Speed: 1.2597469005309085 samples/sec                   batch loss = 1153.039716720581 | accuracy = 0.6796875


Epoch[2] Batch[485] Speed: 1.2556602778029333 samples/sec                   batch loss = 1162.4082679748535 | accuracy = 0.6798969072164949


Epoch[2] Batch[490] Speed: 1.2525357044976555 samples/sec                   batch loss = 1172.1858611106873 | accuracy = 0.6811224489795918


Epoch[2] Batch[495] Speed: 1.254999117689455 samples/sec                   batch loss = 1184.4326337575912 | accuracy = 0.6808080808080809


Epoch[2] Batch[500] Speed: 1.260801329366636 samples/sec                   batch loss = 1196.721829175949 | accuracy = 0.68


Epoch[2] Batch[505] Speed: 1.256461383875128 samples/sec                   batch loss = 1205.6194532513618 | accuracy = 0.6811881188118812


Epoch[2] Batch[510] Speed: 1.2465903291930949 samples/sec                   batch loss = 1216.3573706746101 | accuracy = 0.6823529411764706


Epoch[2] Batch[515] Speed: 1.2520330069173988 samples/sec                   batch loss = 1228.8083736300468 | accuracy = 0.6825242718446602


Epoch[2] Batch[520] Speed: 1.2597716836826638 samples/sec                   batch loss = 1237.1986256241798 | accuracy = 0.6841346153846154


Epoch[2] Batch[525] Speed: 1.256589746069079 samples/sec                   batch loss = 1250.9258933663368 | accuracy = 0.6833333333333333


Epoch[2] Batch[530] Speed: 1.2506860905548585 samples/sec                   batch loss = 1262.3710197806358 | accuracy = 0.6830188679245283


Epoch[2] Batch[535] Speed: 1.2576328606382774 samples/sec                   batch loss = 1271.8478489518166 | accuracy = 0.6845794392523364


Epoch[2] Batch[540] Speed: 1.262626459514404 samples/sec                   batch loss = 1282.3165156245232 | accuracy = 0.6851851851851852


Epoch[2] Batch[545] Speed: 1.2574887335646756 samples/sec                   batch loss = 1294.9305023550987 | accuracy = 0.6848623853211009


Epoch[2] Batch[550] Speed: 1.2602085758429307 samples/sec                   batch loss = 1304.6208493113518 | accuracy = 0.6859090909090909


Epoch[2] Batch[555] Speed: 1.2598938164318467 samples/sec                   batch loss = 1315.6291345953941 | accuracy = 0.686036036036036


Epoch[2] Batch[560] Speed: 1.2532627963453633 samples/sec                   batch loss = 1326.9416916966438 | accuracy = 0.6857142857142857


Epoch[2] Batch[565] Speed: 1.258335493409368 samples/sec                   batch loss = 1338.9755955338478 | accuracy = 0.6849557522123894


Epoch[2] Batch[570] Speed: 1.257539160143755 samples/sec                   batch loss = 1352.7528169751167 | accuracy = 0.6850877192982456


Epoch[2] Batch[575] Speed: 1.2609457429781175 samples/sec                   batch loss = 1363.4998187422752 | accuracy = 0.6856521739130435


Epoch[2] Batch[580] Speed: 1.2576630287587107 samples/sec                   batch loss = 1374.5624640583992 | accuracy = 0.6853448275862069


Epoch[2] Batch[585] Speed: 1.2570525946521744 samples/sec                   batch loss = 1383.3332523107529 | accuracy = 0.6854700854700855


Epoch[2] Batch[590] Speed: 1.2565505005915305 samples/sec                   batch loss = 1396.2575370073318 | accuracy = 0.6860169491525424


Epoch[2] Batch[595] Speed: 1.260737377214575 samples/sec                   batch loss = 1405.7708055973053 | accuracy = 0.6869747899159664


Epoch[2] Batch[600] Speed: 1.2605657332950142 samples/sec                   batch loss = 1420.0565168857574 | accuracy = 0.6866666666666666


Epoch[2] Batch[605] Speed: 1.2610143605115247 samples/sec                   batch loss = 1430.757118344307 | accuracy = 0.687603305785124


Epoch[2] Batch[610] Speed: 1.2620740460901974 samples/sec                   batch loss = 1440.7240487337112 | accuracy = 0.6885245901639344


Epoch[2] Batch[615] Speed: 1.2587032994602432 samples/sec                   batch loss = 1451.6117539405823 | accuracy = 0.6882113821138212


Epoch[2] Batch[620] Speed: 1.2578672672842435 samples/sec                   batch loss = 1466.4460775852203 | accuracy = 0.6858870967741936


Epoch[2] Batch[625] Speed: 1.2538165169235773 samples/sec                   batch loss = 1477.7202198505402 | accuracy = 0.6868


Epoch[2] Batch[630] Speed: 1.261151428381999 samples/sec                   batch loss = 1489.7578312158585 | accuracy = 0.6869047619047619


Epoch[2] Batch[635] Speed: 1.2603024852286142 samples/sec                   batch loss = 1506.089223742485 | accuracy = 0.6854330708661417


Epoch[2] Batch[640] Speed: 1.2557489989390977 samples/sec                   batch loss = 1515.17595911026 | accuracy = 0.6859375


Epoch[2] Batch[645] Speed: 1.2526873968816048 samples/sec                   batch loss = 1527.4072734117508 | accuracy = 0.686046511627907


Epoch[2] Batch[650] Speed: 1.2550094444285278 samples/sec                   batch loss = 1536.0856068134308 | accuracy = 0.6865384615384615


Epoch[2] Batch[655] Speed: 1.2577692886336815 samples/sec                   batch loss = 1550.3065776824951 | accuracy = 0.6858778625954198


Epoch[2] Batch[660] Speed: 1.256069119522381 samples/sec                   batch loss = 1565.398937702179 | accuracy = 0.6837121212121212


Epoch[2] Batch[665] Speed: 1.2601462928613785 samples/sec                   batch loss = 1577.0878945589066 | accuracy = 0.6845864661654135


Epoch[2] Batch[670] Speed: 1.2525802170727838 samples/sec                   batch loss = 1587.1683700084686 | accuracy = 0.6850746268656717


Epoch[2] Batch[675] Speed: 1.2557739070203346 samples/sec                   batch loss = 1599.5993003249168 | accuracy = 0.6844444444444444


Epoch[2] Batch[680] Speed: 1.27646554838961 samples/sec                   batch loss = 1613.6981300711632 | accuracy = 0.6838235294117647


Epoch[2] Batch[685] Speed: 1.279648534033011 samples/sec                   batch loss = 1625.5304680466652 | accuracy = 0.6843065693430657


Epoch[2] Batch[690] Speed: 1.2775842553569972 samples/sec                   batch loss = 1636.9017344117165 | accuracy = 0.6844202898550724


Epoch[2] Batch[695] Speed: 1.281844157048757 samples/sec                   batch loss = 1648.861872971058 | accuracy = 0.6845323741007194


Epoch[2] Batch[700] Speed: 1.2792017676347744 samples/sec                   batch loss = 1662.8889963030815 | accuracy = 0.6839285714285714


Epoch[2] Batch[705] Speed: 1.2849931324178447 samples/sec                   batch loss = 1673.0882732272148 | accuracy = 0.6851063829787234


Epoch[2] Batch[710] Speed: 1.280409505969927 samples/sec                   batch loss = 1682.7240275740623 | accuracy = 0.6859154929577465


Epoch[2] Batch[715] Speed: 1.2776271607990013 samples/sec                   batch loss = 1694.2350437045097 | accuracy = 0.686013986013986


Epoch[2] Batch[720] Speed: 1.2766921641620679 samples/sec                   batch loss = 1706.2264932990074 | accuracy = 0.6861111111111111


Epoch[2] Batch[725] Speed: 1.2801603716530041 samples/sec                   batch loss = 1716.1413396000862 | accuracy = 0.6858620689655173


Epoch[2] Batch[730] Speed: 1.2767095546367504 samples/sec                   batch loss = 1729.7706930041313 | accuracy = 0.685958904109589


Epoch[2] Batch[735] Speed: 1.2751675322219445 samples/sec                   batch loss = 1741.9406171441078 | accuracy = 0.6857142857142857


Epoch[2] Batch[740] Speed: 1.2758062657957603 samples/sec                   batch loss = 1751.0467845797539 | accuracy = 0.6861486486486487


Epoch[2] Batch[745] Speed: 1.272465318428507 samples/sec                   batch loss = 1763.9514032006264 | accuracy = 0.6865771812080537


Epoch[2] Batch[750] Speed: 1.282432739840958 samples/sec                   batch loss = 1778.3868851065636 | accuracy = 0.687


Epoch[2] Batch[755] Speed: 1.2818535591511988 samples/sec                   batch loss = 1790.1711564660072 | accuracy = 0.6870860927152318


Epoch[2] Batch[760] Speed: 1.2759202714100042 samples/sec                   batch loss = 1799.254618048668 | accuracy = 0.687828947368421


Epoch[2] Batch[765] Speed: 1.2812960375110871 samples/sec                   batch loss = 1811.3082344532013 | accuracy = 0.688562091503268


Epoch[2] Batch[770] Speed: 1.2764893426647659 samples/sec                   batch loss = 1823.6352490186691 | accuracy = 0.6883116883116883


Epoch[2] Batch[775] Speed: 1.2802926450168206 samples/sec                   batch loss = 1831.7786782383919 | accuracy = 0.69


Epoch[2] Batch[780] Speed: 1.2811364573356203 samples/sec                   batch loss = 1844.1854240298271 | accuracy = 0.6900641025641026


Epoch[2] Batch[785] Speed: 1.28240784126079 samples/sec                   batch loss = 1854.6378685832024 | accuracy = 0.6910828025477707


[Epoch 2] training: accuracy=0.6903553299492385
[Epoch 2] time cost: 643.1426281929016
[Epoch 2] validation: validation accuracy=0.7244444444444444


## Next steps

Now that you have trained a neural network and run predictions with it on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).