<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and running inference on GPUs is often faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the context, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[09:42:37] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data on the main memory and tries to use as many CPU cores as possible.  If there are multiple GPUs, MXNet will tell you which GPUs the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If only one GPU is available, the code falls back to `npx.cpu()` instead, as in the output shown below:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[09:42:37] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
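
As a minimal sketch of this indexing scheme, the hypothetical helper `pick_gpu_ids` below (not part of MXNet) chooses which zero-based GPU ids to use, given how many GPUs are visible and how many you would like. It is plain Python, so it runs without a GPU. In practice, the visible count would come from `npx.num_gpus()` and each returned id `i` would be passed to `npx.gpu(i)`.

```python
def pick_gpu_ids(num_visible, num_wanted):
    """Return the zero-based ids of the GPUs to use.

    num_visible: number of GPUs available (for example, npx.num_gpus()).
    num_wanted: number of GPUs you would like to use.
    """
    if num_visible < 0 or num_wanted < 0:
        raise ValueError("counts must be non-negative")
    # Never ask for more GPUs than are visible.
    return list(range(min(num_visible, num_wanted)))


# On a machine with eight GPUs, the ids run from 0 to 7.
all_ids = pick_gpu_ids(num_visible=8, num_wanted=8)
print(all_ids[0], all_ids[-1])

# Asking for two GPUs when only one is visible yields a single id.
print(pick_gpu_ids(num_visible=1, num_wanted=2))
```

Capping the request at the number of visible GPUs mirrors the slicing approach used later in this step, where a list of available devices is truncated to the desired length.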

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to make sure that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy and move the input data and parameters to the GPU. To demonstrate this you can reuse the previously defined LeafNetwork in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[09:42:38] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 5.644937 , -4.7139916]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, if there are *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own separate chunk of the data.
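
The idea can be illustrated without any GPU. The following is a minimal plain-Python sketch, assuming a toy one-parameter model `w * x` with a squared-error loss; the helpers `split_batch` and `grad_sum` are hypothetical and are not MXNet functions. It shows that adding the gradients computed on each chunk gives the same result as computing the gradient on the whole batch, which is why the chunks can be processed on separate devices.

```python
def split_batch(batch, num_parts):
    """Split a list of samples into num_parts equally sized chunks."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    size = len(batch) // num_parts
    return [batch[i * size:(i + 1) * size] for i in range(num_parts)]


def grad_sum(w, samples):
    """Summed gradient of 0.5 * (w * x - y) ** 2 with respect to w."""
    return sum((w * x - y) * x for x, y in samples)


# A toy batch of (x, y) pairs and a single scalar weight.
batch = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.5)]
w = 1.0

# Each "device" computes the gradient on its own chunk of the batch.
chunks = split_batch(batch, num_parts=2)
per_device_grads = [grad_sum(w, chunk) for chunk in chunks]

# Adding the per-device gradients recovers the full-batch gradient.
combined = sum(per_device_grads)
full = grad_sum(w, batch)
print(combined, full)

# One SGD step, averaging the summed gradient over the batch size.
learning_rate = 0.01
w_new = w - learning_rate * combined / len(batch)
print(w_new)
```

In this sketch the division by `len(batch)` plays the role of the batch-size argument passed to the trainer's `step` call in the training loop later in this section.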

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# Mean and std for normalizing image values in the range (0, 1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
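
To make the aggregation step concrete, here is a plain-Python sketch of how accuracy can be accumulated over per-device shards. `RunningAccuracy` is a hypothetical stand-in written for illustration, not the `gluon.metric.Accuracy` implementation. It also assumes the predictions are already class ids, whereas the network above returns one score per class.

```python
class RunningAccuracy:
    """Tracks the fraction of correct predictions seen so far."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, label_shards, prediction_shards):
        # Each shard holds the labels and predictions handled by one device.
        for labels, predictions in zip(label_shards, prediction_shards):
            for label, prediction in zip(labels, predictions):
                self.correct += int(label == prediction)
                self.total += 1

    def get(self):
        # Return NaN when nothing has been recorded yet.
        return self.correct / self.total if self.total else float("nan")


running = RunningAccuracy()
# One batch of four samples split across two devices.
running.update([[0, 1], [1, 0]], [[0, 1], [0, 0]])
print(running.get())
```

Because the counts are accumulated across shards, the result does not depend on how the batch was divided among devices.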

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7783121610904042 samples/sec                   batch loss = 13.53131890296936 | accuracy = 0.45


Epoch[1] Batch[10] Speed: 1.2567765956857093 samples/sec                   batch loss = 28.228672742843628 | accuracy = 0.4


Epoch[1] Batch[15] Speed: 1.2543637869829565 samples/sec                   batch loss = 43.166799783706665 | accuracy = 0.45


Epoch[1] Batch[20] Speed: 1.2509862845141793 samples/sec                   batch loss = 57.59353256225586 | accuracy = 0.4625


Epoch[1] Batch[25] Speed: 1.2570984649764303 samples/sec                   batch loss = 73.1167254447937 | accuracy = 0.43


Epoch[1] Batch[30] Speed: 1.255247288088749 samples/sec                   batch loss = 87.56694149971008 | accuracy = 0.4166666666666667


Epoch[1] Batch[35] Speed: 1.2523365588217006 samples/sec                   batch loss = 101.90837693214417 | accuracy = 0.40714285714285714


Epoch[1] Batch[40] Speed: 1.2452216339053965 samples/sec                   batch loss = 115.92053008079529 | accuracy = 0.43125


Epoch[1] Batch[45] Speed: 1.2486901463338536 samples/sec                   batch loss = 129.6356303691864 | accuracy = 0.45555555555555555


Epoch[1] Batch[50] Speed: 1.2404551909908195 samples/sec                   batch loss = 144.2401738166809 | accuracy = 0.455


Epoch[1] Batch[55] Speed: 1.2478835083722108 samples/sec                   batch loss = 158.06878423690796 | accuracy = 0.4772727272727273


Epoch[1] Batch[60] Speed: 1.2448031971969324 samples/sec                   batch loss = 172.9414985179901 | accuracy = 0.4666666666666667


Epoch[1] Batch[65] Speed: 1.2471291756305012 samples/sec                   batch loss = 187.78628945350647 | accuracy = 0.4576923076923077


Epoch[1] Batch[70] Speed: 1.2554933957422583 samples/sec                   batch loss = 201.45126175880432 | accuracy = 0.46785714285714286


Epoch[1] Batch[75] Speed: 1.25488891380467 samples/sec                   batch loss = 215.54725074768066 | accuracy = 0.47


Epoch[1] Batch[80] Speed: 1.2560835076269947 samples/sec                   batch loss = 229.350350856781 | accuracy = 0.465625


Epoch[1] Batch[85] Speed: 1.2508977687378748 samples/sec                   batch loss = 243.41584300994873 | accuracy = 0.4676470588235294


Epoch[1] Batch[90] Speed: 1.2568801634844642 samples/sec                   batch loss = 257.38352060317993 | accuracy = 0.46111111111111114


Epoch[1] Batch[95] Speed: 1.25311966910956 samples/sec                   batch loss = 271.4615089893341 | accuracy = 0.45789473684210524


Epoch[1] Batch[100] Speed: 1.2557213662240225 samples/sec                   batch loss = 285.86881971359253 | accuracy = 0.4625


Epoch[1] Batch[105] Speed: 1.2562246789974323 samples/sec                   batch loss = 299.95198225975037 | accuracy = 0.45714285714285713


Epoch[1] Batch[110] Speed: 1.2557235279231909 samples/sec                   batch loss = 313.3636336326599 | accuracy = 0.4636363636363636


Epoch[1] Batch[115] Speed: 1.2526554093641151 samples/sec                   batch loss = 326.856897354126 | accuracy = 0.47608695652173916


Epoch[1] Batch[120] Speed: 1.25230701959917 samples/sec                   batch loss = 340.2221574783325 | accuracy = 0.48333333333333334


Epoch[1] Batch[125] Speed: 1.2552096289806118 samples/sec                   batch loss = 353.53776931762695 | accuracy = 0.492


Epoch[1] Batch[130] Speed: 1.2482173681143307 samples/sec                   batch loss = 367.36406111717224 | accuracy = 0.49230769230769234


Epoch[1] Batch[135] Speed: 1.2478410923798107 samples/sec                   batch loss = 380.7709152698517 | accuracy = 0.4981481481481482


Epoch[1] Batch[140] Speed: 1.2520887903636153 samples/sec                   batch loss = 394.6596488952637 | accuracy = 0.49642857142857144


Epoch[1] Batch[145] Speed: 1.2515849574808262 samples/sec                   batch loss = 408.2037465572357 | accuracy = 0.5017241379310344


Epoch[1] Batch[150] Speed: 1.2426003571846773 samples/sec                   batch loss = 421.93347358703613 | accuracy = 0.5033333333333333


Epoch[1] Batch[155] Speed: 1.252458937090597 samples/sec                   batch loss = 435.04868721961975 | accuracy = 0.5096774193548387


Epoch[1] Batch[160] Speed: 1.2553490071482216 samples/sec                   batch loss = 448.9508340358734 | accuracy = 0.5109375


Epoch[1] Batch[165] Speed: 1.2540005743348845 samples/sec                   batch loss = 462.60410141944885 | accuracy = 0.5136363636363637


Epoch[1] Batch[170] Speed: 1.2506944817180445 samples/sec                   batch loss = 476.3434636592865 | accuracy = 0.5147058823529411


Epoch[1] Batch[175] Speed: 1.249372957191301 samples/sec                   batch loss = 490.6818037033081 | accuracy = 0.5114285714285715


Epoch[1] Batch[180] Speed: 1.2546227770964782 samples/sec                   batch loss = 504.0051393508911 | accuracy = 0.5166666666666667


Epoch[1] Batch[185] Speed: 1.259430102641819 samples/sec                   batch loss = 518.0955867767334 | accuracy = 0.5135135135135135


Epoch[1] Batch[190] Speed: 1.2503446657355344 samples/sec                   batch loss = 532.0001595020294 | accuracy = 0.5157894736842106


Epoch[1] Batch[195] Speed: 1.2535750938002368 samples/sec                   batch loss = 545.788236618042 | accuracy = 0.517948717948718


Epoch[1] Batch[200] Speed: 1.2536241765748428 samples/sec                   batch loss = 559.0061044692993 | accuracy = 0.52125


Epoch[1] Batch[205] Speed: 1.2526793530831069 samples/sec                   batch loss = 573.3220133781433 | accuracy = 0.5158536585365854


Epoch[1] Batch[210] Speed: 1.2545199561966782 samples/sec                   batch loss = 587.1156756877899 | accuracy = 0.5166666666666667


Epoch[1] Batch[215] Speed: 1.2486355016656197 samples/sec                   batch loss = 600.2636179924011 | accuracy = 0.5197674418604651


Epoch[1] Batch[220] Speed: 1.2405359975220656 samples/sec                   batch loss = 614.0321338176727 | accuracy = 0.5227272727272727


Epoch[1] Batch[225] Speed: 1.2448280423937994 samples/sec                   batch loss = 627.8807065486908 | accuracy = 0.5255555555555556


Epoch[1] Batch[230] Speed: 1.2580894024598455 samples/sec                   batch loss = 640.9111256599426 | accuracy = 0.5293478260869565


Epoch[1] Batch[235] Speed: 1.249358722410319 samples/sec                   batch loss = 653.7402195930481 | accuracy = 0.5340425531914894


Epoch[1] Batch[240] Speed: 1.2454754753171942 samples/sec                   batch loss = 666.8456003665924 | accuracy = 0.5375


Epoch[1] Batch[245] Speed: 1.2524211646224728 samples/sec                   batch loss = 681.0176587104797 | accuracy = 0.536734693877551


Epoch[1] Batch[250] Speed: 1.2456905727229426 samples/sec                   batch loss = 694.2225017547607 | accuracy = 0.542


Epoch[1] Batch[255] Speed: 1.2492166714854125 samples/sec                   batch loss = 708.2870981693268 | accuracy = 0.5401960784313725


Epoch[1] Batch[260] Speed: 1.2514587358855591 samples/sec                   batch loss = 722.8524057865143 | accuracy = 0.5346153846153846


Epoch[1] Batch[265] Speed: 1.2566265468134363 samples/sec                   batch loss = 736.6929497718811 | accuracy = 0.5339622641509434


Epoch[1] Batch[270] Speed: 1.2423398666965826 samples/sec                   batch loss = 750.3725411891937 | accuracy = 0.5351851851851852


Epoch[1] Batch[275] Speed: 1.2449032305553807 samples/sec                   batch loss = 764.1720452308655 | accuracy = 0.5345454545454545


Epoch[1] Batch[280] Speed: 1.2531797617174012 samples/sec                   batch loss = 778.2441737651825 | accuracy = 0.5330357142857143


Epoch[1] Batch[285] Speed: 1.2524548231468766 samples/sec                   batch loss = 791.791995048523 | accuracy = 0.5359649122807018


Epoch[1] Batch[290] Speed: 1.2444104239263138 samples/sec                   batch loss = 805.4197216033936 | accuracy = 0.5379310344827586


Epoch[1] Batch[295] Speed: 1.2434404487741668 samples/sec                   batch loss = 819.1943683624268 | accuracy = 0.538135593220339


Epoch[1] Batch[300] Speed: 1.2457790931005355 samples/sec                   batch loss = 832.7345716953278 | accuracy = 0.5391666666666667


Epoch[1] Batch[305] Speed: 1.2388868253758445 samples/sec                   batch loss = 846.4519529342651 | accuracy = 0.5401639344262295


Epoch[1] Batch[310] Speed: 1.255985712625904 samples/sec                   batch loss = 860.4253468513489 | accuracy = 0.5387096774193548


Epoch[1] Batch[315] Speed: 1.244565601605118 samples/sec                   batch loss = 873.5188972949982 | accuracy = 0.5412698412698412


Epoch[1] Batch[320] Speed: 1.2545667676369163 samples/sec                   batch loss = 887.1732738018036 | accuracy = 0.54140625


Epoch[1] Batch[325] Speed: 1.2563673874379855 samples/sec                   batch loss = 900.7017662525177 | accuracy = 0.5415384615384615


Epoch[1] Batch[330] Speed: 1.2609944568669085 samples/sec                   batch loss = 914.0791084766388 | accuracy = 0.5416666666666666


Epoch[1] Batch[335] Speed: 1.2596260256994336 samples/sec                   batch loss = 927.3292281627655 | accuracy = 0.5425373134328358


Epoch[1] Batch[340] Speed: 1.247964264512056 samples/sec                   batch loss = 941.3312089443207 | accuracy = 0.5433823529411764


Epoch[1] Batch[345] Speed: 1.2520091813322065 samples/sec                   batch loss = 954.7865679264069 | accuracy = 0.5456521739130434


Epoch[1] Batch[350] Speed: 1.256575628708515 samples/sec                   batch loss = 968.1408491134644 | accuracy = 0.5464285714285714


Epoch[1] Batch[355] Speed: 1.2569763085080774 samples/sec                   batch loss = 981.7740905284882 | accuracy = 0.5464788732394367


Epoch[1] Batch[360] Speed: 1.2475295114226634 samples/sec                   batch loss = 995.4010527133942 | accuracy = 0.5451388888888888


Epoch[1] Batch[365] Speed: 1.2497961481118616 samples/sec                   batch loss = 1008.7255659103394 | accuracy = 0.5452054794520548


Epoch[1] Batch[370] Speed: 1.2450516938036007 samples/sec                   batch loss = 1022.5885064601898 | accuracy = 0.5445945945945946


Epoch[1] Batch[375] Speed: 1.2388617593692837 samples/sec                   batch loss = 1036.2531282901764 | accuracy = 0.5433333333333333


Epoch[1] Batch[380] Speed: 1.2378876243894295 samples/sec                   batch loss = 1050.3054831027985 | accuracy = 0.5414473684210527


Epoch[1] Batch[385] Speed: 1.2476014080454791 samples/sec                   batch loss = 1063.7328848838806 | accuracy = 0.5422077922077922


Epoch[1] Batch[390] Speed: 1.2461958721045279 samples/sec                   batch loss = 1077.7871448993683 | accuracy = 0.5416666666666666


Epoch[1] Batch[395] Speed: 1.2491500758884082 samples/sec                   batch loss = 1091.3265895843506 | accuracy = 0.5424050632911392


Epoch[1] Batch[400] Speed: 1.2456025273148126 samples/sec                   batch loss = 1104.9463293552399 | accuracy = 0.544375


Epoch[1] Batch[405] Speed: 1.2428575502975983 samples/sec                   batch loss = 1118.1256103515625 | accuracy = 0.5450617283950617


Epoch[1] Batch[410] Speed: 1.2513939545106216 samples/sec                   batch loss = 1131.5037906169891 | accuracy = 0.5457317073170732


Epoch[1] Batch[415] Speed: 1.2479914640476417 samples/sec                   batch loss = 1145.453803062439 | accuracy = 0.5451807228915663


Epoch[1] Batch[420] Speed: 1.24966879771298 samples/sec                   batch loss = 1158.2156953811646 | accuracy = 0.5464285714285714


Epoch[1] Batch[425] Speed: 1.2517238119688383 samples/sec                   batch loss = 1171.534481048584 | accuracy = 0.5494117647058824


Epoch[1] Batch[430] Speed: 1.246080175020874 samples/sec                   batch loss = 1184.9808557033539 | accuracy = 0.5505813953488372


Epoch[1] Batch[435] Speed: 1.251181361910941 samples/sec                   batch loss = 1198.781919002533 | accuracy = 0.55


Epoch[1] Batch[440] Speed: 1.2573854423965696 samples/sec                   batch loss = 1212.514598608017 | accuracy = 0.5505681818181818


Epoch[1] Batch[445] Speed: 1.2506442296560851 samples/sec                   batch loss = 1225.2104659080505 | accuracy = 0.552247191011236


Epoch[1] Batch[450] Speed: 1.248796754376073 samples/sec                   batch loss = 1238.451891899109 | accuracy = 0.5511111111111111


Epoch[1] Batch[455] Speed: 1.249081162502347 samples/sec                   batch loss = 1252.2551639080048 | accuracy = 0.5516483516483517


Epoch[1] Batch[460] Speed: 1.2511702583125832 samples/sec                   batch loss = 1265.4227149486542 | accuracy = 0.5521739130434783


Epoch[1] Batch[465] Speed: 1.2376912828836588 samples/sec                   batch loss = 1279.2957742214203 | accuracy = 0.5526881720430108


Epoch[1] Batch[470] Speed: 1.2575606515784818 samples/sec                   batch loss = 1291.3867807388306 | accuracy = 0.5558510638297872


Epoch[1] Batch[475] Speed: 1.2516268813794231 samples/sec                   batch loss = 1305.0987808704376 | accuracy = 0.5542105263157895


Epoch[1] Batch[480] Speed: 1.255031788318353 samples/sec                   batch loss = 1318.576533794403 | accuracy = 0.553125


Epoch[1] Batch[485] Speed: 1.2533512726278464 samples/sec                   batch loss = 1330.0926184654236 | accuracy = 0.5567010309278351


Epoch[1] Batch[490] Speed: 1.259892018799086 samples/sec                   batch loss = 1342.511298418045 | accuracy = 0.5586734693877551


Epoch[1] Batch[495] Speed: 1.2597458600382327 samples/sec                   batch loss = 1358.0979437828064 | accuracy = 0.5555555555555556


Epoch[1] Batch[500] Speed: 1.2532436047265134 samples/sec                   batch loss = 1370.969254732132 | accuracy = 0.555


Epoch[1] Batch[505] Speed: 1.2564301442595691 samples/sec                   batch loss = 1384.0232701301575 | accuracy = 0.556930693069307


Epoch[1] Batch[510] Speed: 1.2550283146361931 samples/sec                   batch loss = 1397.5121626853943 | accuracy = 0.557843137254902


Epoch[1] Batch[515] Speed: 1.2589107099603716 samples/sec                   batch loss = 1411.7058911323547 | accuracy = 0.5587378640776699


Epoch[1] Batch[520] Speed: 1.260720419139619 samples/sec                   batch loss = 1424.286449790001 | accuracy = 0.5600961538461539


Epoch[1] Batch[525] Speed: 1.246429645758364 samples/sec                   batch loss = 1437.0858398675919 | accuracy = 0.5619047619047619


Epoch[1] Batch[530] Speed: 1.2534591463273865 samples/sec                   batch loss = 1450.6414450407028 | accuracy = 0.5608490566037736


Epoch[1] Batch[535] Speed: 1.2500872523887483 samples/sec                   batch loss = 1464.2780750989914 | accuracy = 0.5602803738317756


Epoch[1] Batch[540] Speed: 1.2492495998647786 samples/sec                   batch loss = 1477.8871151208878 | accuracy = 0.5601851851851852


Epoch[1] Batch[545] Speed: 1.2517656517748736 samples/sec                   batch loss = 1491.6547178030014 | accuracy = 0.5610091743119267


Epoch[1] Batch[550] Speed: 1.249316299104677 samples/sec                   batch loss = 1505.3658121824265 | accuracy = 0.56


Epoch[1] Batch[555] Speed: 1.2535040055276894 samples/sec                   batch loss = 1517.2913635969162 | accuracy = 0.5621621621621622


Epoch[1] Batch[560] Speed: 1.250643949971308 samples/sec                   batch loss = 1529.567656159401 | accuracy = 0.5633928571428571


Epoch[1] Batch[565] Speed: 1.248793408075377 samples/sec                   batch loss = 1541.7518173456192 | accuracy = 0.565929203539823


Epoch[1] Batch[570] Speed: 1.242874031223391 samples/sec                   batch loss = 1553.9892107248306 | accuracy = 0.5675438596491228


Epoch[1] Batch[575] Speed: 1.2523166477369765 samples/sec                   batch loss = 1567.118120789528 | accuracy = 0.5682608695652174


Epoch[1] Batch[580] Speed: 1.252248506139835 samples/sec                   batch loss = 1582.1985095739365 | accuracy = 0.5668103448275862


Epoch[1] Batch[585] Speed: 1.2505209940146156 samples/sec                   batch loss = 1594.989832997322 | accuracy = 0.5666666666666667


Epoch[1] Batch[590] Speed: 1.2426279676646843 samples/sec                   batch loss = 1608.206777215004 | accuracy = 0.5669491525423729


Epoch[1] Batch[595] Speed: 1.2432331292576038 samples/sec                   batch loss = 1621.5116196870804 | accuracy = 0.5668067226890756


Epoch[1] Batch[600] Speed: 1.2487900617926149 samples/sec                   batch loss = 1634.6442259550095 | accuracy = 0.5670833333333334


Epoch[1] Batch[605] Speed: 1.2494313883180823 samples/sec                   batch loss = 1649.2324603796005 | accuracy = 0.5661157024793388


Epoch[1] Batch[610] Speed: 1.253395374995499 samples/sec                   batch loss = 1661.4254122972488 | accuracy = 0.5672131147540984


Epoch[1] Batch[615] Speed: 1.2474896239674345 samples/sec                   batch loss = 1675.144070982933 | accuracy = 0.5666666666666667


Epoch[1] Batch[620] Speed: 1.2482875793748374 samples/sec                   batch loss = 1688.3435901403427 | accuracy = 0.5665322580645161


Epoch[1] Batch[625] Speed: 1.2512296975176047 samples/sec                   batch loss = 1700.8473246097565 | accuracy = 0.5664


Epoch[1] Batch[630] Speed: 1.2514705914164859 samples/sec                   batch loss = 1713.6726868152618 | accuracy = 0.5678571428571428


Epoch[1] Batch[635] Speed: 1.2455750619141652 samples/sec                   batch loss = 1728.2778089046478 | accuracy = 0.5673228346456692


Epoch[1] Batch[640] Speed: 1.246081378159996 samples/sec                   batch loss = 1741.491028547287 | accuracy = 0.5671875


Epoch[1] Batch[645] Speed: 1.2509232308969362 samples/sec                   batch loss = 1754.7398037910461 | accuracy = 0.5674418604651162


Epoch[1] Batch[650] Speed: 1.246588661947667 samples/sec                   batch loss = 1768.1027309894562 | accuracy = 0.5676923076923077


Epoch[1] Batch[655] Speed: 1.2564797331686908 samples/sec                   batch loss = 1779.684075832367 | accuracy = 0.5690839694656489


Epoch[1] Batch[660] Speed: 1.256556429607195 samples/sec                   batch loss = 1792.5177907943726 | accuracy = 0.5685606060606061


Epoch[1] Batch[665] Speed: 1.2566176993788523 samples/sec                   batch loss = 1805.2781190872192 | accuracy = 0.5691729323308271


Epoch[1] Batch[670] Speed: 1.2570048440927977 samples/sec                   batch loss = 1818.5033950805664 | accuracy = 0.5701492537313433


Epoch[1] Batch[675] Speed: 1.2602609248124357 samples/sec                   batch loss = 1831.4356758594513 | accuracy = 0.5707407407407408


Epoch[1] Batch[680] Speed: 1.2589405615416431 samples/sec                   batch loss = 1843.6680762767792 | accuracy = 0.5724264705882353


Epoch[1] Batch[685] Speed: 1.2552270965153738 samples/sec                   batch loss = 1856.938823223114 | accuracy = 0.5737226277372263


Epoch[1] Batch[690] Speed: 1.2494454386556775 samples/sec                   batch loss = 1869.8769826889038 | accuracy = 0.5746376811594203


Epoch[1] Batch[695] Speed: 1.2523335674446132 samples/sec                   batch loss = 1881.9911966323853 | accuracy = 0.5751798561151079


Epoch[1] Batch[700] Speed: 1.2522292521077403 samples/sec                   batch loss = 1895.3705995082855 | accuracy = 0.5753571428571429


Epoch[1] Batch[705] Speed: 1.2555469510291357 samples/sec                   batch loss = 1907.9452738761902 | accuracy = 0.5758865248226951


Epoch[1] Batch[710] Speed: 1.2422149509902645 samples/sec                   batch loss = 1920.705171585083 | accuracy = 0.5764084507042253


Epoch[1] Batch[715] Speed: 1.2429197934040088 samples/sec                   batch loss = 1934.1140086650848 | accuracy = 0.5762237762237762


Epoch[1] Batch[720] Speed: 1.2410557589105835 samples/sec                   batch loss = 1945.993485569954 | accuracy = 0.5767361111111111


Epoch[1] Batch[725] Speed: 1.2427932880692862 samples/sec                   batch loss = 1960.2661792039871 | accuracy = 0.576551724137931


Epoch[1] Batch[730] Speed: 1.2441553556798368 samples/sec                   batch loss = 1974.0786129236221 | accuracy = 0.577054794520548


Epoch[1] Batch[735] Speed: 1.2485219527027842 samples/sec                   batch loss = 1986.0816620588303 | accuracy = 0.5782312925170068


Epoch[1] Batch[740] Speed: 1.251846350728702 samples/sec                   batch loss = 1998.9782456159592 | accuracy = 0.5783783783783784


Epoch[1] Batch[745] Speed: 1.2481702864155706 samples/sec                   batch loss = 2012.7562893629074 | accuracy = 0.5795302013422818


Epoch[1] Batch[750] Speed: 1.2493483023841194 samples/sec                   batch loss = 2023.7773996591568 | accuracy = 0.581


Epoch[1] Batch[755] Speed: 1.2447429817268414 samples/sec                   batch loss = 2037.5325976610184 | accuracy = 0.5811258278145696


Epoch[1] Batch[760] Speed: 1.250735693287206 samples/sec                   batch loss = 2048.86113011837 | accuracy = 0.5822368421052632


Epoch[1] Batch[765] Speed: 1.2479077341167348 samples/sec                   batch loss = 2062.3603566884995 | accuracy = 0.5820261437908497


Epoch[1] Batch[770] Speed: 1.2489480072995929 samples/sec                   batch loss = 2074.246433734894 | accuracy = 0.5824675324675325


Epoch[1] Batch[775] Speed: 1.2418400787862884 samples/sec                   batch loss = 2087.361131668091 | accuracy = 0.582258064516129


Epoch[1] Batch[780] Speed: 1.2455827373082233 samples/sec                   batch loss = 2098.405119895935 | accuracy = 0.5836538461538462


Epoch[1] Batch[785] Speed: 1.2502603402434969 samples/sec                   batch loss = 2112.05872964859 | accuracy = 0.5840764331210191


[Epoch 1] training: accuracy=0.5834390862944162
[Epoch 1] time cost: 648.5752251148224
[Epoch 1] validation: validation accuracy=0.6988888888888889


Epoch[2] Batch[5] Speed: 1.251833367208581 samples/sec                   batch loss = 14.70193362236023 | accuracy = 0.5


Epoch[2] Batch[10] Speed: 1.247931682328899 samples/sec                   batch loss = 26.4886531829834 | accuracy = 0.65


Epoch[2] Batch[15] Speed: 1.2536241765748428 samples/sec                   batch loss = 40.86658811569214 | accuracy = 0.6166666666666667


Epoch[2] Batch[20] Speed: 1.2569073764012666 samples/sec                   batch loss = 53.002814412117004 | accuracy = 0.6375


Epoch[2] Batch[25] Speed: 1.2514340920360973 samples/sec                   batch loss = 66.45998251438141 | accuracy = 0.62


Epoch[2] Batch[30] Speed: 1.2467744946381287 samples/sec                   batch loss = 77.6368693113327 | accuracy = 0.65


Epoch[2] Batch[35] Speed: 1.2474540984514653 samples/sec                   batch loss = 89.01413762569427 | accuracy = 0.6571428571428571


Epoch[2] Batch[40] Speed: 1.2426976435639134 samples/sec                   batch loss = 100.90769755840302 | accuracy = 0.65625


Epoch[2] Batch[45] Speed: 1.2390479489947461 samples/sec                   batch loss = 115.50411450862885 | accuracy = 0.6388888888888888


Epoch[2] Batch[50] Speed: 1.2479468128747297 samples/sec                   batch loss = 128.2887725830078 | accuracy = 0.64


Epoch[2] Batch[55] Speed: 1.248021635567943 samples/sec                   batch loss = 140.03894066810608 | accuracy = 0.65


Epoch[2] Batch[60] Speed: 1.2478250363123047 samples/sec                   batch loss = 151.7774876356125 | accuracy = 0.65


Epoch[2] Batch[65] Speed: 1.241232139900457 samples/sec                   batch loss = 165.17527067661285 | accuracy = 0.6423076923076924


Epoch[2] Batch[70] Speed: 1.245906669669815 samples/sec                   batch loss = 176.90141379833221 | accuracy = 0.65


Epoch[2] Batch[75] Speed: 1.2563985298446787 samples/sec                   batch loss = 189.73265850543976 | accuracy = 0.6466666666666666


Epoch[2] Batch[80] Speed: 1.2476424160239337 samples/sec                   batch loss = 203.73593127727509 | accuracy = 0.6375


Epoch[2] Batch[85] Speed: 1.2480015829487852 samples/sec                   batch loss = 217.56095039844513 | accuracy = 0.6352941176470588


Epoch[2] Batch[90] Speed: 1.241820591987803 samples/sec                   batch loss = 230.0210062265396 | accuracy = 0.6388888888888888


Epoch[2] Batch[95] Speed: 1.2468965295816576 samples/sec                   batch loss = 241.34298872947693 | accuracy = 0.6394736842105263


Epoch[2] Batch[100] Speed: 1.2494582796379277 samples/sec                   batch loss = 252.70314276218414 | accuracy = 0.6475


Epoch[2] Batch[105] Speed: 1.2516802940668492 samples/sec                   batch loss = 267.8184252977371 | accuracy = 0.6404761904761904


Epoch[2] Batch[110] Speed: 1.2487744460435082 samples/sec                   batch loss = 281.6235371828079 | accuracy = 0.6363636363636364


Epoch[2] Batch[115] Speed: 1.2486502775582022 samples/sec                   batch loss = 293.49135661125183 | accuracy = 0.6391304347826087


Epoch[2] Batch[120] Speed: 1.2534704778890864 samples/sec                   batch loss = 305.9241762161255 | accuracy = 0.6375


Epoch[2] Batch[125] Speed: 1.2375547937536042 samples/sec                   batch loss = 318.42789459228516 | accuracy = 0.636


Epoch[2] Batch[130] Speed: 1.2557814266352625 samples/sec                   batch loss = 330.9405333995819 | accuracy = 0.6384615384615384


Epoch[2] Batch[135] Speed: 1.2519353744907853 samples/sec                   batch loss = 342.762983083725 | accuracy = 0.6444444444444445


Epoch[2] Batch[140] Speed: 1.246745587809531 samples/sec                   batch loss = 353.4342235326767 | accuracy = 0.65


Epoch[2] Batch[145] Speed: 1.251476192532848 samples/sec                   batch loss = 365.72364246845245 | accuracy = 0.6517241379310345


Epoch[2] Batch[150] Speed: 1.2496295179592174 samples/sec                   batch loss = 378.9736202955246 | accuracy = 0.6516666666666666


Epoch[2] Batch[155] Speed: 1.257958469984689 samples/sec                   batch loss = 391.47593557834625 | accuracy = 0.65


Epoch[2] Batch[160] Speed: 1.2473887108202533 samples/sec                   batch loss = 403.87599742412567 | accuracy = 0.6515625


Epoch[2] Batch[165] Speed: 1.2503904205690712 samples/sec                   batch loss = 415.49174320697784 | accuracy = 0.6530303030303031


Epoch[2] Batch[170] Speed: 1.2474366610795329 samples/sec                   batch loss = 426.98912620544434 | accuracy = 0.6544117647058824


Epoch[2] Batch[175] Speed: 1.25265877639429 samples/sec                   batch loss = 438.9896869659424 | accuracy = 0.6542857142857142


Epoch[2] Batch[180] Speed: 1.2462011484037325 samples/sec                   batch loss = 450.72085988521576 | accuracy = 0.6555555555555556


Epoch[2] Batch[185] Speed: 1.2433678330053326 samples/sec                   batch loss = 461.1013959646225 | accuracy = 0.6594594594594595


Epoch[2] Batch[190] Speed: 1.2546124567197179 samples/sec                   batch loss = 475.69059431552887 | accuracy = 0.6552631578947369


Epoch[2] Batch[195] Speed: 1.255860858067468 samples/sec                   batch loss = 487.4354569911957 | accuracy = 0.6551282051282051


Epoch[2] Batch[200] Speed: 1.2437847518712133 samples/sec                   batch loss = 501.1263417005539 | accuracy = 0.65375


Epoch[2] Batch[205] Speed: 1.252803201615085 samples/sec                   batch loss = 512.8858400583267 | accuracy = 0.6548780487804878


Epoch[2] Batch[210] Speed: 1.252441546512857 samples/sec                   batch loss = 524.6524765491486 | accuracy = 0.6571428571428571


Epoch[2] Batch[215] Speed: 1.2483857583617033 samples/sec                   batch loss = 538.9937918186188 | accuracy = 0.6534883720930232


Epoch[2] Batch[220] Speed: 1.2492391816587214 samples/sec                   batch loss = 550.8189567327499 | accuracy = 0.6545454545454545


Epoch[2] Batch[225] Speed: 1.2532824566619505 samples/sec                   batch loss = 565.384925365448 | accuracy = 0.65


Epoch[2] Batch[230] Speed: 1.2494380877773879 samples/sec                   batch loss = 577.2525641918182 | accuracy = 0.6510869565217391


Epoch[2] Batch[235] Speed: 1.2526148193691762 samples/sec                   batch loss = 589.6072082519531 | accuracy = 0.6542553191489362


Epoch[2] Batch[240] Speed: 1.2518795113009455 samples/sec                   batch loss = 601.204668879509 | accuracy = 0.6541666666666667


Epoch[2] Batch[245] Speed: 1.2516100741340208 samples/sec                   batch loss = 609.970424413681 | accuracy = 0.6591836734693878


Epoch[2] Batch[250] Speed: 1.2520030148578762 samples/sec                   batch loss = 622.3375351428986 | accuracy = 0.658


Epoch[2] Batch[255] Speed: 1.2488153452620696 samples/sec                   batch loss = 633.3992414474487 | accuracy = 0.6588235294117647


Epoch[2] Batch[260] Speed: 1.255266353286826 samples/sec                   batch loss = 646.0614290237427 | accuracy = 0.6605769230769231


Epoch[2] Batch[265] Speed: 1.2514652704023808 samples/sec                   batch loss = 660.2590894699097 | accuracy = 0.6613207547169812


Epoch[2] Batch[270] Speed: 1.252234112292986 samples/sec                   batch loss = 675.0241560935974 | accuracy = 0.6611111111111111


Epoch[2] Batch[275] Speed: 1.2511136236965341 samples/sec                   batch loss = 686.7899214029312 | accuracy = 0.6636363636363637


Epoch[2] Batch[280] Speed: 1.257948566259265 samples/sec                   batch loss = 699.4186367988586 | accuracy = 0.6633928571428571


Epoch[2] Batch[285] Speed: 1.2418847536607445 samples/sec                   batch loss = 711.6171205043793 | accuracy = 0.6657894736842105


Epoch[2] Batch[290] Speed: 1.2579871445071726 samples/sec                   batch loss = 724.0752766132355 | accuracy = 0.6655172413793103


Epoch[2] Batch[295] Speed: 1.2555621728392252 samples/sec                   batch loss = 736.06698346138 | accuracy = 0.6677966101694915


Epoch[2] Batch[300] Speed: 1.2561970253439747 samples/sec                   batch loss = 749.3470549583435 | accuracy = 0.665


Epoch[2] Batch[305] Speed: 1.2566364297334163 samples/sec                   batch loss = 763.1311101913452 | accuracy = 0.6639344262295082


Epoch[2] Batch[310] Speed: 1.2549573430702667 samples/sec                   batch loss = 775.9101159572601 | accuracy = 0.6645161290322581


Epoch[2] Batch[315] Speed: 1.2603005917559809 samples/sec                   batch loss = 788.7779524326324 | accuracy = 0.6619047619047619


Epoch[2] Batch[320] Speed: 1.2468072943938908 samples/sec                   batch loss = 801.0763726234436 | accuracy = 0.6625


Epoch[2] Batch[325] Speed: 1.2546817003174842 samples/sec                   batch loss = 810.3755683898926 | accuracy = 0.6646153846153846


Epoch[2] Batch[330] Speed: 1.2535602947864832 samples/sec                   batch loss = 823.382734298706 | accuracy = 0.6628787878787878


Epoch[2] Batch[335] Speed: 1.255787254398773 samples/sec                   batch loss = 836.963559627533 | accuracy = 0.6634328358208955


Epoch[2] Batch[340] Speed: 1.2509767700996466 samples/sec                   batch loss = 846.21595287323 | accuracy = 0.6654411764705882


Epoch[2] Batch[345] Speed: 1.2520837444202875 samples/sec                   batch loss = 860.0261787176132 | accuracy = 0.6644927536231884


Epoch[2] Batch[350] Speed: 1.2573855366326019 samples/sec                   batch loss = 873.0042227506638 | accuracy = 0.6642857142857143


Epoch[2] Batch[355] Speed: 1.255476108695278 samples/sec                   batch loss = 883.9305549860001 | accuracy = 0.6647887323943662


Epoch[2] Batch[360] Speed: 1.2532363027039513 samples/sec                   batch loss = 898.9409421682358 | accuracy = 0.6611111111111111


Epoch[2] Batch[365] Speed: 1.2390438311743461 samples/sec                   batch loss = 909.845577120781 | accuracy = 0.6623287671232877


Epoch[2] Batch[370] Speed: 1.2407972920393344 samples/sec                   batch loss = 921.9043618440628 | accuracy = 0.6621621621621622


Epoch[2] Batch[375] Speed: 1.239348532276979 samples/sec                   batch loss = 933.2584196329117 | accuracy = 0.664


Epoch[2] Batch[380] Speed: 1.2476194994683723 samples/sec                   batch loss = 944.3221905231476 | accuracy = 0.6638157894736842


Epoch[2] Batch[385] Speed: 1.2497005397918761 samples/sec                   batch loss = 955.2474346160889 | accuracy = 0.6649350649350649


Epoch[2] Batch[390] Speed: 1.2395108749295918 samples/sec                   batch loss = 968.8356109857559 | accuracy = 0.6634615384615384


Epoch[2] Batch[395] Speed: 1.2502322032180564 samples/sec                   batch loss = 981.1446839570999 | accuracy = 0.6639240506329114


Epoch[2] Batch[400] Speed: 1.2554804304123928 samples/sec                   batch loss = 990.2009427547455 | accuracy = 0.66625


Epoch[2] Batch[405] Speed: 1.2491775131583749 samples/sec                   batch loss = 1002.2209002971649 | accuracy = 0.6666666666666666


Epoch[2] Batch[410] Speed: 1.2480664777840778 samples/sec                   batch loss = 1016.3420808315277 | accuracy = 0.6652439024390244


Epoch[2] Batch[415] Speed: 1.2509686549780858 samples/sec                   batch loss = 1028.9416334629059 | accuracy = 0.6650602409638554


Epoch[2] Batch[420] Speed: 1.2473024652808755 samples/sec                   batch loss = 1043.563901424408 | accuracy = 0.6636904761904762


Epoch[2] Batch[425] Speed: 1.246920624345639 samples/sec                   batch loss = 1054.8517441749573 | accuracy = 0.6635294117647059


Epoch[2] Batch[430] Speed: 1.250523510687191 samples/sec                   batch loss = 1066.94977414608 | accuracy = 0.663953488372093


Epoch[2] Batch[435] Speed: 1.2435487431180243 samples/sec                   batch loss = 1078.7673910856247 | accuracy = 0.6637931034482759


Epoch[2] Batch[440] Speed: 1.2500656430620871 samples/sec                   batch loss = 1091.7568336725235 | accuracy = 0.6625


Epoch[2] Batch[445] Speed: 1.252538790351432 samples/sec                   batch loss = 1102.6562718153 | accuracy = 0.6640449438202247


Epoch[2] Batch[450] Speed: 1.2418119518105892 samples/sec                   batch loss = 1113.6564083099365 | accuracy = 0.6638888888888889


Epoch[2] Batch[455] Speed: 1.2498675613369246 samples/sec                   batch loss = 1125.0806341171265 | accuracy = 0.6648351648351648


Epoch[2] Batch[460] Speed: 1.2588850160620766 samples/sec                   batch loss = 1137.4253284931183 | accuracy = 0.6646739130434782


Epoch[2] Batch[465] Speed: 1.2567231237223755 samples/sec                   batch loss = 1151.2066890001297 | accuracy = 0.6634408602150538


Epoch[2] Batch[470] Speed: 1.249298623577626 samples/sec                   batch loss = 1161.1861989498138 | accuracy = 0.6643617021276595


Epoch[2] Batch[475] Speed: 1.2543722275680207 samples/sec                   batch loss = 1175.587108373642 | accuracy = 0.6642105263157895


Epoch[2] Batch[480] Speed: 1.2503032003981365 samples/sec                   batch loss = 1186.1618530750275 | accuracy = 0.6651041666666667


Epoch[2] Batch[485] Speed: 1.2518698898835785 samples/sec                   batch loss = 1197.5047110319138 | accuracy = 0.6649484536082474


Epoch[2] Batch[490] Speed: 1.2470600215632983 samples/sec                   batch loss = 1209.1577079296112 | accuracy = 0.664795918367347


Epoch[2] Batch[495] Speed: 1.2539034782537206 samples/sec                   batch loss = 1221.5605379343033 | accuracy = 0.6651515151515152


Epoch[2] Batch[500] Speed: 1.248206781387324 samples/sec                   batch loss = 1229.3529020547867 | accuracy = 0.667


Epoch[2] Batch[505] Speed: 1.2466673052128778 samples/sec                   batch loss = 1239.3190571069717 | accuracy = 0.6678217821782179


Epoch[2] Batch[510] Speed: 1.2556247552697093 samples/sec                   batch loss = 1251.3573969602585 | accuracy = 0.6681372549019607


Epoch[2] Batch[515] Speed: 1.2546696900399266 samples/sec                   batch loss = 1264.5244381427765 | accuracy = 0.6684466019417475


Epoch[2] Batch[520] Speed: 1.2536607101425352 samples/sec                   batch loss = 1274.2248621582985 | accuracy = 0.6692307692307692


Epoch[2] Batch[525] Speed: 1.2509970117237423 samples/sec                   batch loss = 1286.550625860691 | accuracy = 0.6680952380952381


Epoch[2] Batch[530] Speed: 1.2406095672194013 samples/sec                   batch loss = 1296.503300845623 | accuracy = 0.6698113207547169


Epoch[2] Batch[535] Speed: 1.250506919476948 samples/sec                   batch loss = 1310.8979365229607 | accuracy = 0.6682242990654206


Epoch[2] Batch[540] Speed: 1.2530315998595893 samples/sec                   batch loss = 1323.7758727669716 | accuracy = 0.6680555555555555


Epoch[2] Batch[545] Speed: 1.2435603570781593 samples/sec                   batch loss = 1339.535812675953 | accuracy = 0.6665137614678899


Epoch[2] Batch[550] Speed: 1.2435328894908777 samples/sec                   batch loss = 1350.3323731422424 | accuracy = 0.6663636363636364


Epoch[2] Batch[555] Speed: 1.2495789791599246 samples/sec                   batch loss = 1365.2381261587143 | accuracy = 0.6657657657657657


Epoch[2] Batch[560] Speed: 1.2496235610558046 samples/sec                   batch loss = 1378.6152452230453 | accuracy = 0.665625


Epoch[2] Batch[565] Speed: 1.2539416213494845 samples/sec                   batch loss = 1390.6814631223679 | accuracy = 0.6650442477876106


Epoch[2] Batch[570] Speed: 1.2529719893379634 samples/sec                   batch loss = 1402.0642622709274 | accuracy = 0.6653508771929825


Epoch[2] Batch[575] Speed: 1.2413539188602865 samples/sec                   batch loss = 1411.8417618274689 | accuracy = 0.6669565217391304


Epoch[2] Batch[580] Speed: 1.2423464903219366 samples/sec                   batch loss = 1423.848279595375 | accuracy = 0.6676724137931035


Epoch[2] Batch[585] Speed: 1.2505411276788512 samples/sec                   batch loss = 1435.0630762577057 | accuracy = 0.6679487179487179


Epoch[2] Batch[590] Speed: 1.2579510185957963 samples/sec                   batch loss = 1444.3705353736877 | accuracy = 0.6682203389830509


Epoch[2] Batch[595] Speed: 1.2498465182608816 samples/sec                   batch loss = 1460.5258820056915 | accuracy = 0.6668067226890756


Epoch[2] Batch[600] Speed: 1.2471911056206433 samples/sec                   batch loss = 1469.8157826662064 | accuracy = 0.6679166666666667


Epoch[2] Batch[605] Speed: 1.2512258715902447 samples/sec                   batch loss = 1482.7886091470718 | accuracy = 0.6669421487603305


Epoch[2] Batch[610] Speed: 1.2477893060842724 samples/sec                   batch loss = 1495.9558745622635 | accuracy = 0.6676229508196722


Epoch[2] Batch[615] Speed: 1.2470844007411794 samples/sec                   batch loss = 1507.4552009105682 | accuracy = 0.6682926829268293


Epoch[2] Batch[620] Speed: 1.2482309268164569 samples/sec                   batch loss = 1519.1237164735794 | accuracy = 0.6685483870967742


Epoch[2] Batch[625] Speed: 1.2543725089228124 samples/sec                   batch loss = 1529.2013823986053 | accuracy = 0.6684


Epoch[2] Batch[630] Speed: 1.2543600356482782 samples/sec                   batch loss = 1541.512658238411 | accuracy = 0.6674603174603174


Epoch[2] Batch[635] Speed: 1.257296207214016 samples/sec                   batch loss = 1551.4294238090515 | accuracy = 0.6681102362204724


Epoch[2] Batch[640] Speed: 1.250025872676628 samples/sec                   batch loss = 1565.726112961769 | accuracy = 0.6671875


Epoch[2] Batch[645] Speed: 1.251406928917466 samples/sec                   batch loss = 1579.600766301155 | accuracy = 0.6666666666666666


Epoch[2] Batch[650] Speed: 1.2513528862194234 samples/sec                   batch loss = 1592.4989844560623 | accuracy = 0.6676923076923077


Epoch[2] Batch[655] Speed: 1.2537850338989156 samples/sec                   batch loss = 1606.5806201696396 | accuracy = 0.666412213740458


Epoch[2] Batch[660] Speed: 1.2511801489032124 samples/sec                   batch loss = 1616.2494535446167 | accuracy = 0.6670454545454545


Epoch[2] Batch[665] Speed: 1.2523253412313087 samples/sec                   batch loss = 1626.756394982338 | accuracy = 0.6672932330827067


Epoch[2] Batch[670] Speed: 1.2512836362518984 samples/sec                   batch loss = 1637.9275045394897 | accuracy = 0.6671641791044776


Epoch[2] Batch[675] Speed: 1.2562904318582226 samples/sec                   batch loss = 1651.4768934249878 | accuracy = 0.6659259259259259


Epoch[2] Batch[680] Speed: 1.250052789592828 samples/sec                   batch loss = 1660.8472318649292 | accuracy = 0.6672794117647058


Epoch[2] Batch[685] Speed: 1.24326085999612 samples/sec                   batch loss = 1672.0404213666916 | accuracy = 0.6678832116788321


Epoch[2] Batch[690] Speed: 1.2437107129209108 samples/sec                   batch loss = 1682.7287058234215 | accuracy = 0.6684782608695652


Epoch[2] Batch[695] Speed: 1.2489906845418604 samples/sec                   batch loss = 1690.7603141665459 | accuracy = 0.6694244604316547


Epoch[2] Batch[700] Speed: 1.2418922917136863 samples/sec                   batch loss = 1702.3119309544563 | accuracy = 0.67


Epoch[2] Batch[705] Speed: 1.2513759401328335 samples/sec                   batch loss = 1710.8084155917168 | accuracy = 0.6709219858156028


Epoch[2] Batch[710] Speed: 1.2495372854012325 samples/sec                   batch loss = 1725.67548507452 | accuracy = 0.6693661971830986


Epoch[2] Batch[715] Speed: 1.2460082685401779 samples/sec                   batch loss = 1734.6111145615578 | accuracy = 0.6702797202797203


Epoch[2] Batch[720] Speed: 1.2488228747283843 samples/sec                   batch loss = 1744.3140216469765 | accuracy = 0.6711805555555556


Epoch[2] Batch[725] Speed: 1.252258787661607 samples/sec                   batch loss = 1759.214401781559 | accuracy = 0.67


Epoch[2] Batch[730] Speed: 1.2473367765937595 samples/sec                   batch loss = 1767.2039602994919 | accuracy = 0.6712328767123288


Epoch[2] Batch[735] Speed: 1.2466815713677557 samples/sec                   batch loss = 1780.1133497953415 | accuracy = 0.6714285714285714


Epoch[2] Batch[740] Speed: 1.248077154977169 samples/sec                   batch loss = 1789.777567267418 | accuracy = 0.6722972972972973


Epoch[2] Batch[745] Speed: 1.2579275331501663 samples/sec                   batch loss = 1801.5716038942337 | accuracy = 0.6728187919463087


Epoch[2] Batch[750] Speed: 1.255587919209941 samples/sec                   batch loss = 1814.7509223222733 | accuracy = 0.6716666666666666


Epoch[2] Batch[755] Speed: 1.2515253911591242 samples/sec                   batch loss = 1827.2138679027557 | accuracy = 0.6708609271523179


Epoch[2] Batch[760] Speed: 1.2455845868149105 samples/sec                   batch loss = 1837.5067355632782 | accuracy = 0.6717105263157894


Epoch[2] Batch[765] Speed: 1.2538922325750685 samples/sec                   batch loss = 1850.6994067430496 | accuracy = 0.6712418300653594


Epoch[2] Batch[770] Speed: 1.2455520362995383 samples/sec                   batch loss = 1864.0060985088348 | accuracy = 0.6704545454545454


Epoch[2] Batch[775] Speed: 1.2461123829310425 samples/sec                   batch loss = 1876.5177850723267 | accuracy = 0.67


Epoch[2] Batch[780] Speed: 1.250888815257948 samples/sec                   batch loss = 1891.8396747112274 | accuracy = 0.6689102564102564


Epoch[2] Batch[785] Speed: 1.240864468384398 samples/sec                   batch loss = 1900.772320151329 | accuracy = 0.6700636942675159


[Epoch 2] training: accuracy=0.6703680203045685
[Epoch 2] time cost: 646.5560116767883
[Epoch 2] validation: validation accuracy=0.7488888888888889


## Next steps

Now that you have trained a neural network and run predictions with it on GPUs, you have reached the end of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).