<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference may run faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure that you have at least one Nvidia GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system in the [installation instructions](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[21:37:47] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet tells you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, copying to `npx.gpu(1)` would fail, so the code falls back to `npx.cpu()` in that case:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[21:37:47] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through `npx` using a 0-based index. As you saw before, the first GPU is accessed using `npx.gpu(0)` and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has, so if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
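
The indexing rule can be sketched without any GPU at all. The following is a minimal, framework-free illustration; `select_device_ids` is a hypothetical helper and not part of MXNet. It returns the 0-based IDs you would then pass to `npx.gpu(i)`, and it never returns more IDs than there are devices.

```python
def select_device_ids(num_available, num_requested):
    """Return the 0-based IDs of the devices to use.

    Hypothetical helper for illustration only; not an MXNet function.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    # Valid IDs run from 0 to num_available - 1; take at most num_requested.
    return list(range(min(num_available, num_requested)))


print(select_device_ids(8, 2))      # two of eight devices
print(select_device_ids(1, 2))      # only one device exists
print(select_device_ids(8, 8)[-1])  # ID of the last of eight devices
```

Capping the request at the number of available devices is the same choice the training code later in this step makes when it slices the list of available GPUs.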

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to make sure that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.
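
To make the same-device rule concrete, here is a toy model of it in plain Python. `ToyArray` and `add` are hypothetical stand-ins written for this illustration. They do not reproduce MXNet's implementation or its actual error message. They only mimic the behavior described above: inputs must share a device, and the result lands on that device.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyArray:
    """Hypothetical stand-in for an array that remembers its device."""
    values: List[float]
    device: str  # for example "gpu(0)" or "cpu(0)"


def add(a: ToyArray, b: ToyArray) -> ToyArray:
    # Both inputs must already live on the same device.
    if a.device != b.device:
        raise ValueError(f"device mismatch: {a.device} vs {b.device}")
    # The result is allocated on the device of the inputs.
    return ToyArray([p + q for p, q in zip(a.values, b.values)], a.device)


x = ToyArray([1.0, 1.0], "gpu(0)")
y = ToyArray([0.5, 0.25], "gpu(0)")
z = add(x, y)
print(z.device)

try:
    add(x, ToyArray([0.0, 0.0], "gpu(1)"))
except ValueError as err:
    print(err)
```

In MXNet itself, you would resolve such a mismatch by first copying one input to the other's device, for example with `copyto` as shown earlier in this step.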

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters directly onto GPU 0 as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[21:37:47] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[6.063839 , 0.9511585]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
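
The idea can be checked on a tiny example in plain Python, with no GPU involved. The sketch below assumes a one-parameter model `w * x` with a squared-error loss. `split_batch` and `grad_sum` are hypothetical helpers for illustration, and the chunking rule is not claimed to match that of `gluon.utils.split_and_load`. It shows that summing the gradients computed on each part and dividing by the full batch size gives the same average gradient as processing the whole batch on one device.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous chunks of near-equal size."""
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    base, extra = divmod(len(batch), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        # The first `extra` chunks receive one additional element.
        size = base + (1 if i < extra else 0)
        parts.append(batch[start:start + size])
        start += size
    return parts


def grad_sum(w, samples):
    """Gradient of sum((w * x - y) ** 2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


w = 0.5
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # (x, y) pairs

# Each simulated device handles one chunk and computes its own gradient.
chunks = split_batch(batch, 2)
per_device = [grad_sum(w, chunk) for chunk in chunks]

# Sum over devices, then normalize by the full batch size.
parallel_grad = sum(per_device) / len(batch)
single_grad = grad_sum(w, batch) / len(batch)
print(parallel_grad, single_grad)
```

Both printed values agree, which is why the parts can be processed independently and combined afterwards.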

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
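
The helper above passes lists of per-device labels and outputs to the metric. As a rough, framework-free sketch of that bookkeeping, the code below pools correct and total counts over all shards before dividing. `accuracy_over_shards` and `argmax` are hypothetical functions written for this illustration and are not the implementation of `gluon.metric.Accuracy`.

```python
def argmax(scores):
    """Index of the largest score (the first one on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def accuracy_over_shards(label_shards, output_shards):
    """Pool correct/total counts over per-device shards."""
    correct = total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            correct += int(argmax(scores) == label)
            total += 1
    return correct / total if total else 0.0


# Two simulated devices, two samples each, two classes.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, 0.1], [0.3, 0.9]],  # both predictions correct
    [[0.8, 0.2], [1.5, 0.5]],  # first prediction wrong, second correct
]
print(accuracy_over_shards(label_shards, output_shards))  # 0.75
```

Pooling the counts first, rather than averaging per-shard accuracies, keeps the result correct even when shards have different sizes.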

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
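        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()
            # btic is reset only when logging, so the elapsed time below spans
            # log_interval batches; train_loss is the running total for the epoch.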

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7715133547532028 samples/sec                   batch loss = 15.175195693969727 | accuracy = 0.5


Epoch[1] Batch[10] Speed: 1.25646778254228 samples/sec                   batch loss = 29.671375513076782 | accuracy = 0.5


Epoch[1] Batch[15] Speed: 1.258360881733826 samples/sec                   batch loss = 44.327364444732666 | accuracy = 0.45


Epoch[1] Batch[20] Speed: 1.2576453048125809 samples/sec                   batch loss = 57.89108681678772 | accuracy = 0.4625


Epoch[1] Batch[25] Speed: 1.2558248541732677 samples/sec                   batch loss = 71.87422132492065 | accuracy = 0.47


Epoch[1] Batch[30] Speed: 1.2537849402017973 samples/sec                   batch loss = 86.83279776573181 | accuracy = 0.43333333333333335


Epoch[1] Batch[35] Speed: 1.2609048982409412 samples/sec                   batch loss = 101.27051305770874 | accuracy = 0.4357142857142857


Epoch[1] Batch[40] Speed: 1.2552325434804874 samples/sec                   batch loss = 115.4597520828247 | accuracy = 0.425


Epoch[1] Batch[45] Speed: 1.2561781199754833 samples/sec                   batch loss = 129.3857398033142 | accuracy = 0.42777777777777776


Epoch[1] Batch[50] Speed: 1.2590893682277537 samples/sec                   batch loss = 143.71048212051392 | accuracy = 0.435


Epoch[1] Batch[55] Speed: 1.2605875177369976 samples/sec                   batch loss = 157.69439721107483 | accuracy = 0.4318181818181818


Epoch[1] Batch[60] Speed: 1.2554009531555976 samples/sec                   batch loss = 171.49340271949768 | accuracy = 0.4375


Epoch[1] Batch[65] Speed: 1.2600425646616873 samples/sec                   batch loss = 185.23333287239075 | accuracy = 0.4346153846153846


Epoch[1] Batch[70] Speed: 1.2562936303131207 samples/sec                   batch loss = 199.85963892936707 | accuracy = 0.4392857142857143


Epoch[1] Batch[75] Speed: 1.2630965244174939 samples/sec                   batch loss = 213.5314862728119 | accuracy = 0.44


Epoch[1] Batch[80] Speed: 1.2601243343723199 samples/sec                   batch loss = 227.10266947746277 | accuracy = 0.446875


Epoch[1] Batch[85] Speed: 1.2599944920542212 samples/sec                   batch loss = 241.47716164588928 | accuracy = 0.45


Epoch[1] Batch[90] Speed: 1.2609799560357562 samples/sec                   batch loss = 255.46169471740723 | accuracy = 0.4527777777777778


Epoch[1] Batch[95] Speed: 1.2548631960924945 samples/sec                   batch loss = 269.3853671550751 | accuracy = 0.45263157894736844


Epoch[1] Batch[100] Speed: 1.2598876666568042 samples/sec                   batch loss = 283.430536031723 | accuracy = 0.4575


Epoch[1] Batch[105] Speed: 1.2625949125839062 samples/sec                   batch loss = 296.6958420276642 | accuracy = 0.4642857142857143


Epoch[1] Batch[110] Speed: 1.2642955608702962 samples/sec                   batch loss = 310.20425820350647 | accuracy = 0.47954545454545455


Epoch[1] Batch[115] Speed: 1.2573209882594096 samples/sec                   batch loss = 323.98471093177795 | accuracy = 0.4782608695652174


Epoch[1] Batch[120] Speed: 1.2536992132068268 samples/sec                   batch loss = 337.8159968852997 | accuracy = 0.48125


Epoch[1] Batch[125] Speed: 1.2581021387011067 samples/sec                   batch loss = 351.7136483192444 | accuracy = 0.484


Epoch[1] Batch[130] Speed: 1.26008836953484 samples/sec                   batch loss = 365.9684007167816 | accuracy = 0.4826923076923077


Epoch[1] Batch[135] Speed: 1.2554979994382998 samples/sec                   batch loss = 379.51531648635864 | accuracy = 0.4925925925925926


Epoch[1] Batch[140] Speed: 1.2542795749215496 samples/sec                   batch loss = 393.80671739578247 | accuracy = 0.4928571428571429


Epoch[1] Batch[145] Speed: 1.2552887063255576 samples/sec                   batch loss = 407.50600957870483 | accuracy = 0.49310344827586206


Epoch[1] Batch[150] Speed: 1.2592870757016417 samples/sec                   batch loss = 421.5862579345703 | accuracy = 0.49166666666666664


Epoch[1] Batch[155] Speed: 1.2612720272455908 samples/sec                   batch loss = 435.5543510913849 | accuracy = 0.49032258064516127


Epoch[1] Batch[160] Speed: 1.2589742879742976 samples/sec                   batch loss = 448.9578890800476 | accuracy = 0.49375


Epoch[1] Batch[165] Speed: 1.2535303231708552 samples/sec                   batch loss = 462.82022309303284 | accuracy = 0.4954545454545455


Epoch[1] Batch[170] Speed: 1.2597582514717514 samples/sec                   batch loss = 476.3914256095886 | accuracy = 0.4985294117647059


Epoch[1] Batch[175] Speed: 1.2577280836937281 samples/sec                   batch loss = 490.62929821014404 | accuracy = 0.49857142857142855


Epoch[1] Batch[180] Speed: 1.2586311562753099 samples/sec                   batch loss = 504.9969274997711 | accuracy = 0.49444444444444446


Epoch[1] Batch[185] Speed: 1.2454708523708071 samples/sec                   batch loss = 518.4969067573547 | accuracy = 0.49594594594594593


Epoch[1] Batch[190] Speed: 1.2487081764063956 samples/sec                   batch loss = 532.8666331768036 | accuracy = 0.4934210526315789


Epoch[1] Batch[195] Speed: 1.2585731833217233 samples/sec                   batch loss = 546.991128206253 | accuracy = 0.49615384615384617


Epoch[1] Batch[200] Speed: 1.2546834831126579 samples/sec                   batch loss = 560.5147511959076 | accuracy = 0.5


Epoch[1] Batch[205] Speed: 1.2569944844898002 samples/sec                   batch loss = 574.6547174453735 | accuracy = 0.4975609756097561


Epoch[1] Batch[210] Speed: 1.2459012108188905 samples/sec                   batch loss = 588.7733225822449 | accuracy = 0.49523809523809526


Epoch[1] Batch[215] Speed: 1.2527032041819455 samples/sec                   batch loss = 602.712785243988 | accuracy = 0.4930232558139535


Epoch[1] Batch[220] Speed: 1.2586387101478718 samples/sec                   batch loss = 616.5274999141693 | accuracy = 0.49318181818181817


Epoch[1] Batch[225] Speed: 1.2608789333569017 samples/sec                   batch loss = 629.7055275440216 | accuracy = 0.5011111111111111


Epoch[1] Batch[230] Speed: 1.2604497201457496 samples/sec                   batch loss = 643.1405644416809 | accuracy = 0.5032608695652174


Epoch[1] Batch[235] Speed: 1.2556305815791629 samples/sec                   batch loss = 656.4099309444427 | accuracy = 0.5085106382978724


Epoch[1] Batch[240] Speed: 1.2664794009328362 samples/sec                   batch loss = 669.6260702610016 | accuracy = 0.5125


Epoch[1] Batch[245] Speed: 1.25651332787955 samples/sec                   batch loss = 683.2820835113525 | accuracy = 0.5112244897959184


Epoch[1] Batch[250] Speed: 1.254675319828902 samples/sec                   batch loss = 696.8083872795105 | accuracy = 0.512


Epoch[1] Batch[255] Speed: 1.2538481889424824 samples/sec                   batch loss = 710.9141790866852 | accuracy = 0.5107843137254902


Epoch[1] Batch[260] Speed: 1.2518974467650095 samples/sec                   batch loss = 724.3691351413727 | accuracy = 0.5134615384615384


Epoch[1] Batch[265] Speed: 1.2520570202874357 samples/sec                   batch loss = 739.05189204216 | accuracy = 0.5094339622641509


Epoch[1] Batch[270] Speed: 1.2510789177012898 samples/sec                   batch loss = 752.7385756969452 | accuracy = 0.5101851851851852


Epoch[1] Batch[275] Speed: 1.2539773298885837 samples/sec                   batch loss = 766.6584107875824 | accuracy = 0.5118181818181818


Epoch[1] Batch[280] Speed: 1.256668432636226 samples/sec                   batch loss = 780.6572835445404 | accuracy = 0.5116071428571428


Epoch[1] Batch[285] Speed: 1.2501204129807577 samples/sec                   batch loss = 793.8131034374237 | accuracy = 0.5149122807017544


Epoch[1] Batch[290] Speed: 1.255794586178084 samples/sec                   batch loss = 807.6619226932526 | accuracy = 0.5155172413793103


Epoch[1] Batch[295] Speed: 1.255684242551932 samples/sec                   batch loss = 820.9209887981415 | accuracy = 0.5177966101694915


Epoch[1] Batch[300] Speed: 1.251682815407773 samples/sec                   batch loss = 834.8543438911438 | accuracy = 0.5191666666666667


Epoch[1] Batch[305] Speed: 1.2487715646086752 samples/sec                   batch loss = 848.9549090862274 | accuracy = 0.5188524590163934


Epoch[1] Batch[310] Speed: 1.252027307381172 samples/sec                   batch loss = 862.47838306427 | accuracy = 0.5201612903225806


Epoch[1] Batch[315] Speed: 1.252389003722706 samples/sec                   batch loss = 875.9940810203552 | accuracy = 0.5222222222222223


Epoch[1] Batch[320] Speed: 1.2512151404554255 samples/sec                   batch loss = 889.2216556072235 | accuracy = 0.525


Epoch[1] Batch[325] Speed: 1.2577018723451658 samples/sec                   batch loss = 902.8711647987366 | accuracy = 0.5253846153846153


Epoch[1] Batch[330] Speed: 1.25342637023618 samples/sec                   batch loss = 916.2359371185303 | accuracy = 0.5287878787878788


Epoch[1] Batch[335] Speed: 1.2549989299321358 samples/sec                   batch loss = 930.0328698158264 | accuracy = 0.5305970149253731


Epoch[1] Batch[340] Speed: 1.256748447025143 samples/sec                   batch loss = 943.0982663631439 | accuracy = 0.5330882352941176


Epoch[1] Batch[345] Speed: 1.2564295797024942 samples/sec                   batch loss = 956.3680732250214 | accuracy = 0.5355072463768116


Epoch[1] Batch[350] Speed: 1.2532010107944058 samples/sec                   batch loss = 969.9252309799194 | accuracy = 0.5342857142857143


Epoch[1] Batch[355] Speed: 1.2512719709066045 samples/sec                   batch loss = 983.9531605243683 | accuracy = 0.5330985915492957


Epoch[1] Batch[360] Speed: 1.2518666205035756 samples/sec                   batch loss = 997.3326156139374 | accuracy = 0.5347222222222222


Epoch[1] Batch[365] Speed: 1.2571411358603866 samples/sec                   batch loss = 1010.6075081825256 | accuracy = 0.5363013698630137


Epoch[1] Batch[370] Speed: 1.2562600473489451 samples/sec                   batch loss = 1024.960096359253 | accuracy = 0.5344594594594595


Epoch[1] Batch[375] Speed: 1.2491963944205116 samples/sec                   batch loss = 1038.7091677188873 | accuracy = 0.5346666666666666


Epoch[1] Batch[380] Speed: 1.253740060891959 samples/sec                   batch loss = 1052.0103533267975 | accuracy = 0.5355263157894737


Epoch[1] Batch[385] Speed: 1.2584481914118786 samples/sec                   batch loss = 1065.1276588439941 | accuracy = 0.5383116883116883


Epoch[1] Batch[390] Speed: 1.2520537499299427 samples/sec                   batch loss = 1077.9196727275848 | accuracy = 0.5397435897435897


Epoch[1] Batch[395] Speed: 1.2553637544927108 samples/sec                   batch loss = 1091.6653990745544 | accuracy = 0.5392405063291139


Epoch[1] Batch[400] Speed: 1.2493229973298177 samples/sec                   batch loss = 1104.7222893238068 | accuracy = 0.54125


Epoch[1] Batch[405] Speed: 1.2584575366249033 samples/sec                   batch loss = 1118.091617822647 | accuracy = 0.5425925925925926


Epoch[1] Batch[410] Speed: 1.2590409903857995 samples/sec                   batch loss = 1130.9250037670135 | accuracy = 0.5445121951219513


Epoch[1] Batch[415] Speed: 1.2580021425576937 samples/sec                   batch loss = 1145.4142689704895 | accuracy = 0.5427710843373494


Epoch[1] Batch[420] Speed: 1.2520246912179849 samples/sec                   batch loss = 1158.486998796463 | accuracy = 0.5458333333333333


Epoch[1] Batch[425] Speed: 1.2498715651882657 samples/sec                   batch loss = 1171.7507495880127 | accuracy = 0.5470588235294118


Epoch[1] Batch[430] Speed: 1.2558537135274146 samples/sec                   batch loss = 1185.2904796600342 | accuracy = 0.5494186046511628


Epoch[1] Batch[435] Speed: 1.2507831552237174 samples/sec                   batch loss = 1198.6405017375946 | accuracy = 0.5505747126436782


Epoch[1] Batch[440] Speed: 1.25518126874995 samples/sec                   batch loss = 1211.1623747348785 | accuracy = 0.5528409090909091


Epoch[1] Batch[445] Speed: 1.2478171476443878 samples/sec                   batch loss = 1224.9630904197693 | accuracy = 0.553932584269663


Epoch[1] Batch[450] Speed: 1.2552874853407026 samples/sec                   batch loss = 1238.926936864853 | accuracy = 0.5538888888888889


Epoch[1] Batch[455] Speed: 1.2597791566564058 samples/sec                   batch loss = 1252.3454048633575 | accuracy = 0.5543956043956044


Epoch[1] Batch[460] Speed: 1.253325524272831 samples/sec                   batch loss = 1266.310908317566 | accuracy = 0.5543478260869565


Epoch[1] Batch[465] Speed: 1.2563161140292318 samples/sec                   batch loss = 1279.7912151813507 | accuracy = 0.5548387096774193


Epoch[1] Batch[470] Speed: 1.2495092739697584 samples/sec                   batch loss = 1293.482055425644 | accuracy = 0.5547872340425531


Epoch[1] Batch[475] Speed: 1.2530710001935188 samples/sec                   batch loss = 1306.8491878509521 | accuracy = 0.5542105263157895


Epoch[1] Batch[480] Speed: 1.2561990946228379 samples/sec                   batch loss = 1320.3518860340118 | accuracy = 0.5536458333333333


Epoch[1] Batch[485] Speed: 1.2541046225385795 samples/sec                   batch loss = 1333.2322640419006 | accuracy = 0.5551546391752578


Epoch[1] Batch[490] Speed: 1.2522355142765103 samples/sec                   batch loss = 1346.8725852966309 | accuracy = 0.5566326530612244


Epoch[1] Batch[495] Speed: 1.256719358262653 samples/sec                   batch loss = 1360.214293718338 | accuracy = 0.5580808080808081


Epoch[1] Batch[500] Speed: 1.254764464887652 samples/sec                   batch loss = 1373.214162826538 | accuracy = 0.56


Epoch[1] Batch[505] Speed: 1.253723477909844 samples/sec                   batch loss = 1385.9819037914276 | accuracy = 0.5603960396039604


Epoch[1] Batch[510] Speed: 1.2549863503197267 samples/sec                   batch loss = 1399.4049639701843 | accuracy = 0.5598039215686275


Epoch[1] Batch[515] Speed: 1.249342906367434 samples/sec                   batch loss = 1412.4941656589508 | accuracy = 0.5606796116504854


Epoch[1] Batch[520] Speed: 1.2527756048445509 samples/sec                   batch loss = 1426.478277206421 | accuracy = 0.5600961538461539


Epoch[1] Batch[525] Speed: 1.2640638001561733 samples/sec                   batch loss = 1439.4834096431732 | accuracy = 0.560952380952381


Epoch[1] Batch[530] Speed: 1.2604373151054973 samples/sec                   batch loss = 1452.3110854625702 | accuracy = 0.5613207547169812


Epoch[1] Batch[535] Speed: 1.2561417216893882 samples/sec                   batch loss = 1465.919670343399 | accuracy = 0.5607476635514018


Epoch[1] Batch[540] Speed: 1.2483532470531966 samples/sec                   batch loss = 1479.608535528183 | accuracy = 0.5611111111111111


Epoch[1] Batch[545] Speed: 1.2534922050928021 samples/sec                   batch loss = 1492.1557531356812 | accuracy = 0.563302752293578


Epoch[1] Batch[550] Speed: 1.2560402502674957 samples/sec                   batch loss = 1505.1244814395905 | accuracy = 0.5640909090909091


Epoch[1] Batch[555] Speed: 1.2539313121758473 samples/sec                   batch loss = 1518.3425891399384 | accuracy = 0.5644144144144144


Epoch[1] Batch[560] Speed: 1.2473637633038588 samples/sec                   batch loss = 1531.5827674865723 | accuracy = 0.5647321428571429


Epoch[1] Batch[565] Speed: 1.2502363025734102 samples/sec                   batch loss = 1545.1551542282104 | accuracy = 0.5650442477876106


Epoch[1] Batch[570] Speed: 1.2533162551240045 samples/sec                   batch loss = 1557.95370054245 | accuracy = 0.5657894736842105


Epoch[1] Batch[575] Speed: 1.2567395037347386 samples/sec                   batch loss = 1570.699177980423 | accuracy = 0.5669565217391305


Epoch[1] Batch[580] Speed: 1.2575797871144294 samples/sec                   batch loss = 1584.552256345749 | accuracy = 0.5668103448275862


Epoch[1] Batch[585] Speed: 1.2525376682210267 samples/sec                   batch loss = 1596.41151201725 | accuracy = 0.5683760683760684


Epoch[1] Batch[590] Speed: 1.2541258092434693 samples/sec                   batch loss = 1610.672492146492 | accuracy = 0.5682203389830508


Epoch[1] Batch[595] Speed: 1.2623788774837081 samples/sec                   batch loss = 1623.2972980737686 | accuracy = 0.569327731092437


Epoch[1] Batch[600] Speed: 1.2604743415595812 samples/sec                   batch loss = 1636.3618322610855 | accuracy = 0.5691666666666667


Epoch[1] Batch[605] Speed: 1.263401374293709 samples/sec                   batch loss = 1648.9177745580673 | accuracy = 0.5694214876033058


Epoch[1] Batch[610] Speed: 1.2484405669723895 samples/sec                   batch loss = 1662.0242742300034 | accuracy = 0.5692622950819672


Epoch[1] Batch[615] Speed: 1.253535661759981 samples/sec                   batch loss = 1675.480608344078 | accuracy = 0.5691056910569106


Epoch[1] Batch[620] Speed: 1.2584993557934483 samples/sec                   batch loss = 1688.5258983373642 | accuracy = 0.5701612903225807


Epoch[1] Batch[625] Speed: 1.2543840445779486 samples/sec                   batch loss = 1701.1192234754562 | accuracy = 0.5724


Epoch[1] Batch[630] Speed: 1.258355407584672 samples/sec                   batch loss = 1713.8046654462814 | accuracy = 0.5726190476190476


Epoch[1] Batch[635] Speed: 1.2471986155136219 samples/sec                   batch loss = 1727.7775765657425 | accuracy = 0.573228346456693


Epoch[1] Batch[640] Speed: 1.2470652124870822 samples/sec                   batch loss = 1740.8195954561234 | accuracy = 0.5734375


Epoch[1] Batch[645] Speed: 1.2533462165037508 samples/sec                   batch loss = 1754.6558085680008 | accuracy = 0.5724806201550388


Epoch[1] Batch[650] Speed: 1.2558511753551231 samples/sec                   batch loss = 1767.9796302318573 | accuracy = 0.5734615384615385


Epoch[1] Batch[655] Speed: 1.2538793940052078 samples/sec                   batch loss = 1780.844396352768 | accuracy = 0.5740458015267176


Epoch[1] Batch[660] Speed: 1.2504439141722208 samples/sec                   batch loss = 1793.869550228119 | accuracy = 0.5746212121212121


Epoch[1] Batch[665] Speed: 1.2584815138669345 samples/sec                   batch loss = 1807.156266450882 | accuracy = 0.5759398496240602


Epoch[1] Batch[670] Speed: 1.2569304470979716 samples/sec                   batch loss = 1819.7933790683746 | accuracy = 0.5761194029850746


Epoch[1] Batch[675] Speed: 1.2563461249293844 samples/sec                   batch loss = 1833.1259117126465 | accuracy = 0.577037037037037


Epoch[1] Batch[680] Speed: 1.2614160745059717 samples/sec                   batch loss = 1844.8251230716705 | accuracy = 0.5783088235294118


Epoch[1] Batch[685] Speed: 1.2533258051582692 samples/sec                   batch loss = 1858.4096610546112 | accuracy = 0.5777372262773722


Epoch[1] Batch[690] Speed: 1.2551913167645525 samples/sec                   batch loss = 1872.0635781288147 | accuracy = 0.5771739130434783


Epoch[1] Batch[695] Speed: 1.260675989251968 samples/sec                   batch loss = 1882.867354631424 | accuracy = 0.5787769784172662


Epoch[1] Batch[700] Speed: 1.2602830774038367 samples/sec                   batch loss = 1894.8377511501312 | accuracy = 0.58


Epoch[1] Batch[705] Speed: 1.2510429076019358 samples/sec                   batch loss = 1907.0055079460144 | accuracy = 0.5815602836879432


Epoch[1] Batch[710] Speed: 1.256501376614082 samples/sec                   batch loss = 1918.4150054454803 | accuracy = 0.5827464788732394


Epoch[1] Batch[715] Speed: 1.2595790250847223 samples/sec                   batch loss = 1931.86722946167 | accuracy = 0.5818181818181818


Epoch[1] Batch[720] Speed: 1.261803619645827 samples/sec                   batch loss = 1944.049506187439 | accuracy = 0.5833333333333334


Epoch[1] Batch[725] Speed: 1.2512902622648845 samples/sec                   batch loss = 1957.6251566410065 | accuracy = 0.5824137931034483


Epoch[1] Batch[730] Speed: 1.2561215013307514 samples/sec                   batch loss = 1969.0675065517426 | accuracy = 0.583904109589041


Epoch[1] Batch[735] Speed: 1.2567397861524894 samples/sec                   batch loss = 1980.6459200382233 | accuracy = 0.5846938775510204


Epoch[1] Batch[740] Speed: 1.265175367095593 samples/sec                   batch loss = 1993.197240114212 | accuracy = 0.5861486486486487


Epoch[1] Batch[745] Speed: 1.264240875590639 samples/sec                   batch loss = 2005.2329626083374 | accuracy = 0.5869127516778524


Epoch[1] Batch[750] Speed: 1.2582741503852624 samples/sec                   batch loss = 2018.7139070034027 | accuracy = 0.5873333333333334


Epoch[1] Batch[755] Speed: 1.2553573670653 samples/sec                   batch loss = 2031.1556901931763 | accuracy = 0.5877483443708609


Epoch[1] Batch[760] Speed: 1.2636387930081312 samples/sec                   batch loss = 2042.6656551361084 | accuracy = 0.5884868421052631


Epoch[1] Batch[765] Speed: 1.2597018771471231 samples/sec                   batch loss = 2054.516836643219 | accuracy = 0.5892156862745098


Epoch[1] Batch[770] Speed: 1.263486625416547 samples/sec                   batch loss = 2065.808585047722 | accuracy = 0.5909090909090909


Epoch[1] Batch[775] Speed: 1.2598148200433605 samples/sec                   batch loss = 2078.339404463768 | accuracy = 0.5919354838709677


Epoch[1] Batch[780] Speed: 1.2602015710666192 samples/sec                   batch loss = 2092.5913342237473 | accuracy = 0.5916666666666667


Epoch[1] Batch[785] Speed: 1.2589089151319324 samples/sec                   batch loss = 2106.435200691223 | accuracy = 0.5917197452229299


[Epoch 1] training: accuracy=0.5932741116751269
[Epoch 1] time cost: 645.3693602085114
[Epoch 1] validation: validation accuracy=0.69


Epoch[2] Batch[5] Speed: 1.2477174805965585 samples/sec                   batch loss = 10.759482860565186 | accuracy = 0.85


Epoch[2] Batch[10] Speed: 1.2659363160059918 samples/sec                   batch loss = 23.685957431793213 | accuracy = 0.725


Epoch[2] Batch[15] Speed: 1.2672786770134454 samples/sec                   batch loss = 36.93756699562073 | accuracy = 0.65


Epoch[2] Batch[20] Speed: 1.2656638501350177 samples/sec                   batch loss = 48.453449726104736 | accuracy = 0.6625


Epoch[2] Batch[25] Speed: 1.2529982846459984 samples/sec                   batch loss = 62.85991311073303 | accuracy = 0.64


Epoch[2] Batch[30] Speed: 1.2587324801392248 samples/sec                   batch loss = 79.0513002872467 | accuracy = 0.6083333333333333


Epoch[2] Batch[35] Speed: 1.2589049476345955 samples/sec                   batch loss = 92.45491433143616 | accuracy = 0.6071428571428571


Epoch[2] Batch[40] Speed: 1.2570382785218404 samples/sec                   batch loss = 105.11370134353638 | accuracy = 0.625


Epoch[2] Batch[45] Speed: 1.2574398189345948 samples/sec                   batch loss = 116.83342099189758 | accuracy = 0.6333333333333333


Epoch[2] Batch[50] Speed: 1.2594786049417739 samples/sec                   batch loss = 130.48601830005646 | accuracy = 0.63


Epoch[2] Batch[55] Speed: 1.2622755411888742 samples/sec                   batch loss = 144.72779786586761 | accuracy = 0.6181818181818182


Epoch[2] Batch[60] Speed: 1.2638223183763742 samples/sec                   batch loss = 156.86661183834076 | accuracy = 0.6208333333333333


Epoch[2] Batch[65] Speed: 1.2618749880598394 samples/sec                   batch loss = 169.9476079940796 | accuracy = 0.6192307692307693


Epoch[2] Batch[70] Speed: 1.2605157266414837 samples/sec                   batch loss = 183.62481307983398 | accuracy = 0.6142857142857143


Epoch[2] Batch[75] Speed: 1.2544089923300872 samples/sec                   batch loss = 195.70788252353668 | accuracy = 0.62


Epoch[2] Batch[80] Speed: 1.2578856576639692 samples/sec                   batch loss = 208.49223220348358 | accuracy = 0.621875


Epoch[2] Batch[85] Speed: 1.2548800908211233 samples/sec                   batch loss = 219.3850381374359 | accuracy = 0.6294117647058823


Epoch[2] Batch[90] Speed: 1.2538011500116357 samples/sec                   batch loss = 230.66147422790527 | accuracy = 0.6444444444444445


Epoch[2] Batch[95] Speed: 1.256119526353875 samples/sec                   batch loss = 243.75539565086365 | accuracy = 0.6394736842105263


Epoch[2] Batch[100] Speed: 1.2539378725394579 samples/sec                   batch loss = 256.6494927406311 | accuracy = 0.6375


Epoch[2] Batch[105] Speed: 1.256913779611768 samples/sec                   batch loss = 268.2391365766525 | accuracy = 0.6428571428571429


Epoch[2] Batch[110] Speed: 1.2574825129958924 samples/sec                   batch loss = 280.99922025203705 | accuracy = 0.6386363636363637


Epoch[2] Batch[115] Speed: 1.2505537115482233 samples/sec                   batch loss = 291.92499899864197 | accuracy = 0.6456521739130435


Epoch[2] Batch[120] Speed: 1.244336218180351 samples/sec                   batch loss = 305.37932229042053 | accuracy = 0.6416666666666667


Epoch[2] Batch[125] Speed: 1.2540651571784533 samples/sec                   batch loss = 319.30628418922424 | accuracy = 0.644


Epoch[2] Batch[130] Speed: 1.2529633804377869 samples/sec                   batch loss = 330.83927822113037 | accuracy = 0.65


Epoch[2] Batch[135] Speed: 1.25000100582919 samples/sec                   batch loss = 343.25362181663513 | accuracy = 0.65


Epoch[2] Batch[140] Speed: 1.2460736966193862 samples/sec                   batch loss = 354.93348503112793 | accuracy = 0.65


Epoch[2] Batch[145] Speed: 1.2510091383560862 samples/sec                   batch loss = 365.01635777950287 | accuracy = 0.6517241379310345


Epoch[2] Batch[150] Speed: 1.2590075438172434 samples/sec                   batch loss = 376.82393527030945 | accuracy = 0.6566666666666666


Epoch[2] Batch[155] Speed: 1.259697431735482 samples/sec                   batch loss = 390.6221480369568 | accuracy = 0.6564516129032258


Epoch[2] Batch[160] Speed: 1.2636380316031224 samples/sec                   batch loss = 402.6208527088165 | accuracy = 0.65625


Epoch[2] Batch[165] Speed: 1.2543681948298744 samples/sec                   batch loss = 416.3508086204529 | accuracy = 0.6545454545454545


Epoch[2] Batch[170] Speed: 1.2594512805745373 samples/sec                   batch loss = 428.5518971681595 | accuracy = 0.6558823529411765


Epoch[2] Batch[175] Speed: 1.260496028140022 samples/sec                   batch loss = 441.29762744903564 | accuracy = 0.6542857142857142


Epoch[2] Batch[180] Speed: 1.2579736559998842 samples/sec                   batch loss = 453.7778331041336 | accuracy = 0.6541666666666667


Epoch[2] Batch[185] Speed: 1.2614186352223067 samples/sec                   batch loss = 465.3367828130722 | accuracy = 0.654054054054054


Epoch[2] Batch[190] Speed: 1.2522561705309558 samples/sec                   batch loss = 480.07350170612335 | accuracy = 0.656578947368421


Epoch[2] Batch[195] Speed: 1.2545992281214455 samples/sec                   batch loss = 493.09796941280365 | accuracy = 0.6538461538461539


Epoch[2] Batch[200] Speed: 1.2575331275923591 samples/sec                   batch loss = 506.13996517658234 | accuracy = 0.65125


Epoch[2] Batch[205] Speed: 1.2609242304549302 samples/sec                   batch loss = 520.0324162244797 | accuracy = 0.6487804878048781


Epoch[2] Batch[210] Speed: 1.2587185034639634 samples/sec                   batch loss = 533.5602461099625 | accuracy = 0.6428571428571429


Epoch[2] Batch[215] Speed: 1.249302995898285 samples/sec                   batch loss = 543.4556249380112 | accuracy = 0.6476744186046511


Epoch[2] Batch[220] Speed: 1.2569266803957624 samples/sec                   batch loss = 556.0778065919876 | accuracy = 0.6488636363636363


Epoch[2] Batch[225] Speed: 1.2577580677990117 samples/sec                   batch loss = 568.6885269880295 | accuracy = 0.6488888888888888


Epoch[2] Batch[230] Speed: 1.2562017282603413 samples/sec                   batch loss = 582.3709431886673 | accuracy = 0.6467391304347826


Epoch[2] Batch[235] Speed: 1.254334902284704 samples/sec                   batch loss = 594.6900730133057 | accuracy = 0.6468085106382979


Epoch[2] Batch[240] Speed: 1.2476739624572248 samples/sec                   batch loss = 605.4676557779312 | accuracy = 0.65


Epoch[2] Batch[245] Speed: 1.255289082013683 samples/sec                   batch loss = 618.4313617944717 | accuracy = 0.6479591836734694


Epoch[2] Batch[250] Speed: 1.2535463390746515 samples/sec                   batch loss = 630.9717818498611 | accuracy = 0.648


Epoch[2] Batch[255] Speed: 1.2548822496247818 samples/sec                   batch loss = 642.9433648586273 | accuracy = 0.6490196078431373


Epoch[2] Batch[260] Speed: 1.250497039527183 samples/sec                   batch loss = 652.5545052289963 | accuracy = 0.6528846153846154


Epoch[2] Batch[265] Speed: 1.251682815407773 samples/sec                   batch loss = 665.9466178417206 | accuracy = 0.6547169811320754


Epoch[2] Batch[270] Speed: 1.2558414927920851 samples/sec                   batch loss = 676.9771859645844 | accuracy = 0.6564814814814814


Epoch[2] Batch[275] Speed: 1.2554223715676123 samples/sec                   batch loss = 688.7211450338364 | accuracy = 0.6563636363636364


Epoch[2] Batch[280] Speed: 1.2455372411731285 samples/sec                   batch loss = 700.1659604310989 | accuracy = 0.6571428571428571


Epoch[2] Batch[285] Speed: 1.247691035309938 samples/sec                   batch loss = 709.6679981946945 | accuracy = 0.6605263157894737


Epoch[2] Batch[290] Speed: 1.2563021909637115 samples/sec                   batch loss = 721.029382109642 | accuracy = 0.6586206896551724


Epoch[2] Batch[295] Speed: 1.2614565729785752 samples/sec                   batch loss = 734.4424685239792 | accuracy = 0.6567796610169492


Epoch[2] Batch[300] Speed: 1.2540773433666201 samples/sec                   batch loss = 745.6764578819275 | accuracy = 0.6575


Epoch[2] Batch[305] Speed: 1.2507634799956462 samples/sec                   batch loss = 761.954686164856 | accuracy = 0.6540983606557377


Epoch[2] Batch[310] Speed: 1.2578511407783988 samples/sec                   batch loss = 772.5350793600082 | accuracy = 0.6556451612903226


Epoch[2] Batch[315] Speed: 1.2694750530593184 samples/sec                   batch loss = 785.944354891777 | accuracy = 0.653968253968254


Epoch[2] Batch[320] Speed: 1.262814443484757 samples/sec                   batch loss = 799.704448223114 | accuracy = 0.653125


Epoch[2] Batch[325] Speed: 1.2626693164544083 samples/sec                   batch loss = 813.4905680418015 | accuracy = 0.6523076923076923


Epoch[2] Batch[330] Speed: 1.2571580919827436 samples/sec                   batch loss = 825.1620274782181 | accuracy = 0.6522727272727272


Epoch[2] Batch[335] Speed: 1.2588691468253075 samples/sec                   batch loss = 838.0567461252213 | accuracy = 0.6522388059701493


Epoch[2] Batch[340] Speed: 1.2586450365859103 samples/sec                   batch loss = 850.8085408210754 | accuracy = 0.6522058823529412


Epoch[2] Batch[345] Speed: 1.2609594848418868 samples/sec                   batch loss = 861.9317198991776 | accuracy = 0.6528985507246376


Epoch[2] Batch[350] Speed: 1.263785475761235 samples/sec                   batch loss = 874.6673712730408 | accuracy = 0.6514285714285715


Epoch[2] Batch[355] Speed: 1.2544727730497525 samples/sec                   batch loss = 886.2548071146011 | accuracy = 0.6528169014084507


Epoch[2] Batch[360] Speed: 1.2482135605865703 samples/sec                   batch loss = 899.5542260408401 | accuracy = 0.6513888888888889


Epoch[2] Batch[365] Speed: 1.2524089171083483 samples/sec                   batch loss = 910.9926228523254 | accuracy = 0.6520547945205479


Epoch[2] Batch[370] Speed: 1.2610426058094855 samples/sec                   batch loss = 922.5277895927429 | accuracy = 0.6533783783783784


Epoch[2] Batch[375] Speed: 1.254024475773018 samples/sec                   batch loss = 933.0931652784348 | accuracy = 0.6546666666666666


Epoch[2] Batch[380] Speed: 1.255491986454301 samples/sec                   batch loss = 942.525307059288 | accuracy = 0.6578947368421053


Epoch[2] Batch[385] Speed: 1.255279032433772 samples/sec                   batch loss = 953.6529715061188 | accuracy = 0.6571428571428571


Epoch[2] Batch[390] Speed: 1.2583457807482399 samples/sec                   batch loss = 963.5623137950897 | accuracy = 0.658974358974359


Epoch[2] Batch[395] Speed: 1.2584702803208896 samples/sec                   batch loss = 976.9426836967468 | accuracy = 0.6582278481012658


Epoch[2] Batch[400] Speed: 1.2505996681261156 samples/sec                   batch loss = 992.4814265966415 | accuracy = 0.656875


Epoch[2] Batch[405] Speed: 1.2558578498301691 samples/sec                   batch loss = 1005.2925906181335 | accuracy = 0.654320987654321


Epoch[2] Batch[410] Speed: 1.2612177928273662 samples/sec                   batch loss = 1014.8543437719345 | accuracy = 0.6573170731707317


Epoch[2] Batch[415] Speed: 1.2658937145870095 samples/sec                   batch loss = 1026.3036478757858 | accuracy = 0.6572289156626506


Epoch[2] Batch[420] Speed: 1.2602419915908831 samples/sec                   batch loss = 1034.7934952378273 | accuracy = 0.6583333333333333


Epoch[2] Batch[425] Speed: 1.2539797667659054 samples/sec                   batch loss = 1049.3472796082497 | accuracy = 0.6558823529411765


Epoch[2] Batch[430] Speed: 1.2552202409195783 samples/sec                   batch loss = 1058.509192764759 | accuracy = 0.6575581395348837


Epoch[2] Batch[435] Speed: 1.2552240913134778 samples/sec                   batch loss = 1072.5720012784004 | accuracy = 0.6574712643678161


Epoch[2] Batch[440] Speed: 1.250708001051127 samples/sec                   batch loss = 1084.4059779047966 | accuracy = 0.6579545454545455


Epoch[2] Batch[445] Speed: 1.2535241417034126 samples/sec                   batch loss = 1099.0444442629814 | accuracy = 0.6578651685393259


Epoch[2] Batch[450] Speed: 1.2500413334625804 samples/sec                   batch loss = 1111.4381411671638 | accuracy = 0.6588888888888889


Epoch[2] Batch[455] Speed: 1.2643319568316491 samples/sec                   batch loss = 1123.1662561297417 | accuracy = 0.6582417582417582


Epoch[2] Batch[460] Speed: 1.261417686807637 samples/sec                   batch loss = 1135.69323939085 | accuracy = 0.657608695652174


Epoch[2] Batch[465] Speed: 1.2572545621821167 samples/sec                   batch loss = 1148.0616218447685 | accuracy = 0.6586021505376344


Epoch[2] Batch[470] Speed: 1.2525589890425457 samples/sec                   batch loss = 1160.4021417498589 | accuracy = 0.6585106382978724


Epoch[2] Batch[475] Speed: 1.2482114246665335 samples/sec                   batch loss = 1170.2155683636665 | accuracy = 0.6594736842105263


Epoch[2] Batch[480] Speed: 1.2561984362151872 samples/sec                   batch loss = 1182.8159940838814 | accuracy = 0.659375


Epoch[2] Batch[485] Speed: 1.2617197341195097 samples/sec                   batch loss = 1195.2103006243706 | accuracy = 0.6592783505154639


Epoch[2] Batch[490] Speed: 1.2584065644217934 samples/sec                   batch loss = 1208.8842303156853 | accuracy = 0.6591836734693878


Epoch[2] Batch[495] Speed: 1.2507348541105199 samples/sec                   batch loss = 1220.9056307673454 | accuracy = 0.6585858585858586


Epoch[2] Batch[500] Speed: 1.2510047540849512 samples/sec                   batch loss = 1231.1718147397041 | accuracy = 0.66


Epoch[2] Batch[505] Speed: 1.254211969634661 samples/sec                   batch loss = 1240.6542772650719 | accuracy = 0.6618811881188119


Epoch[2] Batch[510] Speed: 1.2507848337078482 samples/sec                   batch loss = 1251.4414692521095 | accuracy = 0.6627450980392157


Epoch[2] Batch[515] Speed: 1.2557270054548613 samples/sec                   batch loss = 1263.4865553975105 | accuracy = 0.6631067961165048


Epoch[2] Batch[520] Speed: 1.2499938346749653 samples/sec                   batch loss = 1273.6378296017647 | accuracy = 0.6649038461538461


Epoch[2] Batch[525] Speed: 1.2546030747063626 samples/sec                   batch loss = 1287.9572635293007 | accuracy = 0.6642857142857143


Epoch[2] Batch[530] Speed: 1.2530008112995512 samples/sec                   batch loss = 1299.5306580662727 | accuracy = 0.6641509433962264


Epoch[2] Batch[535] Speed: 1.2563846049523697 samples/sec                   batch loss = 1311.4659982323647 | accuracy = 0.664018691588785


Epoch[2] Batch[540] Speed: 1.2522422438411254 samples/sec                   batch loss = 1322.1734998822212 | accuracy = 0.6652777777777777


Epoch[2] Batch[545] Speed: 1.254223971150753 samples/sec                   batch loss = 1332.463330090046 | accuracy = 0.6660550458715596


Epoch[2] Batch[550] Speed: 1.2620710080136754 samples/sec                   batch loss = 1341.5380001664162 | accuracy = 0.6677272727272727


Epoch[2] Batch[555] Speed: 1.2614649195911996 samples/sec                   batch loss = 1353.6777037978172 | accuracy = 0.6684684684684684


Epoch[2] Batch[560] Speed: 1.2581526144895139 samples/sec                   batch loss = 1367.6798188090324 | accuracy = 0.6678571428571428


Epoch[2] Batch[565] Speed: 1.249440507044238 samples/sec                   batch loss = 1379.4535046219826 | accuracy = 0.6690265486725664


Epoch[2] Batch[570] Speed: 1.2620955978654256 samples/sec                   batch loss = 1389.0241455435753 | accuracy = 0.6706140350877193


Epoch[2] Batch[575] Speed: 1.261148205143933 samples/sec                   batch loss = 1398.4985939860344 | accuracy = 0.671304347826087


Epoch[2] Batch[580] Speed: 1.2646410258227072 samples/sec                   batch loss = 1409.6514221429825 | accuracy = 0.6715517241379311


Epoch[2] Batch[585] Speed: 1.2535898931634166 samples/sec                   batch loss = 1423.9087573289871 | accuracy = 0.67008547008547


Epoch[2] Batch[590] Speed: 1.254393423315731 samples/sec                   batch loss = 1436.970215678215 | accuracy = 0.6694915254237288


Epoch[2] Batch[595] Speed: 1.2559837380760002 samples/sec                   batch loss = 1449.0354288816452 | accuracy = 0.6697478991596638


Epoch[2] Batch[600] Speed: 1.2633870083287435 samples/sec                   batch loss = 1460.3081586360931 | accuracy = 0.67


Epoch[2] Batch[605] Speed: 1.2622668989382408 samples/sec                   batch loss = 1470.9260835647583 | accuracy = 0.6702479338842975


Epoch[2] Batch[610] Speed: 1.2628049384016995 samples/sec                   batch loss = 1481.5834807157516 | accuracy = 0.6700819672131147


Epoch[2] Batch[615] Speed: 1.2555404677776576 samples/sec                   batch loss = 1496.611556172371 | accuracy = 0.6699186991869919


Epoch[2] Batch[620] Speed: 1.2572203625561928 samples/sec                   batch loss = 1507.4573060274124 | accuracy = 0.6701612903225806


Epoch[2] Batch[625] Speed: 1.2615740996614917 samples/sec                   batch loss = 1522.2986382246017 | accuracy = 0.6696


Epoch[2] Batch[630] Speed: 1.2659853207356702 samples/sec                   batch loss = 1536.2988384962082 | accuracy = 0.6686507936507936


Epoch[2] Batch[635] Speed: 1.2560037659414673 samples/sec                   batch loss = 1547.9526625871658 | accuracy = 0.6688976377952756


Epoch[2] Batch[640] Speed: 1.262132436965282 samples/sec                   batch loss = 1560.2653390169144 | accuracy = 0.669140625


Epoch[2] Batch[645] Speed: 1.2580395920815837 samples/sec                   batch loss = 1570.8911635875702 | accuracy = 0.6697674418604651


Epoch[2] Batch[650] Speed: 1.2552565858223599 samples/sec                   batch loss = 1581.8302252292633 | accuracy = 0.67


Epoch[2] Batch[655] Speed: 1.2614544863426773 samples/sec                   batch loss = 1591.1180238723755 | accuracy = 0.6709923664122137


Epoch[2] Batch[660] Speed: 1.2562431154428157 samples/sec                   batch loss = 1605.4282857179642 | accuracy = 0.6704545454545454


Epoch[2] Batch[665] Speed: 1.2554367448591324 samples/sec                   batch loss = 1616.5878299474716 | accuracy = 0.6710526315789473


Epoch[2] Batch[670] Speed: 1.257250887752929 samples/sec                   batch loss = 1630.324590563774 | accuracy = 0.6708955223880597


Epoch[2] Batch[675] Speed: 1.2559340002715902 samples/sec                   batch loss = 1640.0223875045776 | accuracy = 0.6718518518518518


Epoch[2] Batch[680] Speed: 1.2544664884849206 samples/sec                   batch loss = 1650.5749270915985 | accuracy = 0.6716911764705882


Epoch[2] Batch[685] Speed: 1.2524443514120853 samples/sec                   batch loss = 1662.621887922287 | accuracy = 0.6726277372262773


Epoch[2] Batch[690] Speed: 1.258590178096807 samples/sec                   batch loss = 1673.6339608430862 | accuracy = 0.6728260869565217


Epoch[2] Batch[695] Speed: 1.2605394035396982 samples/sec                   batch loss = 1684.0024337768555 | accuracy = 0.6726618705035972


Epoch[2] Batch[700] Speed: 1.2589310202177502 samples/sec                   batch loss = 1696.0130956172943 | accuracy = 0.6728571428571428


Epoch[2] Batch[705] Speed: 1.2495867039787545 samples/sec                   batch loss = 1708.8590210676193 | accuracy = 0.6719858156028369


Epoch[2] Batch[710] Speed: 1.2490842313577677 samples/sec                   batch loss = 1718.6219960451126 | accuracy = 0.6735915492957747


Epoch[2] Batch[715] Speed: 1.2577060208311577 samples/sec                   batch loss = 1728.5062452554703 | accuracy = 0.6737762237762238


Epoch[2] Batch[720] Speed: 1.2516487314732896 samples/sec                   batch loss = 1741.7480963468552 | accuracy = 0.6732638888888889


Epoch[2] Batch[725] Speed: 1.2538682424681058 samples/sec                   batch loss = 1753.475429058075 | accuracy = 0.6731034482758621


Epoch[2] Batch[730] Speed: 1.2508168190162863 samples/sec                   batch loss = 1766.1126780509949 | accuracy = 0.6726027397260274


Epoch[2] Batch[735] Speed: 1.2532056913046856 samples/sec                   batch loss = 1776.044746875763 | accuracy = 0.673469387755102


Epoch[2] Batch[740] Speed: 1.2538278549476538 samples/sec                   batch loss = 1786.7911373376846 | accuracy = 0.6739864864864865


Epoch[2] Batch[745] Speed: 1.2620413875340923 samples/sec                   batch loss = 1799.0266380310059 | accuracy = 0.6738255033557047


Epoch[2] Batch[750] Speed: 1.2553119054891075 samples/sec                   batch loss = 1809.9500184059143 | accuracy = 0.6743333333333333


Epoch[2] Batch[755] Speed: 1.2507988212507961 samples/sec                   batch loss = 1820.1764986515045 | accuracy = 0.6751655629139073


Epoch[2] Batch[760] Speed: 1.2550534757958158 samples/sec                   batch loss = 1834.0169862508774 | accuracy = 0.6740131578947368


Epoch[2] Batch[765] Speed: 1.260201097773731 samples/sec                   batch loss = 1847.366573691368 | accuracy = 0.6735294117647059


Epoch[2] Batch[770] Speed: 1.253099733115675 samples/sec                   batch loss = 1860.715978860855 | accuracy = 0.673051948051948


Epoch[2] Batch[775] Speed: 1.2531615086803678 samples/sec                   batch loss = 1871.8090090751648 | accuracy = 0.6729032258064516


Epoch[2] Batch[780] Speed: 1.253473381049433 samples/sec                   batch loss = 1883.567234635353 | accuracy = 0.6727564102564103


Epoch[2] Batch[785] Speed: 1.2597594811692898 samples/sec                   batch loss = 1897.9581454992294 | accuracy = 0.6719745222929936


[Epoch 2] training: accuracy=0.6722715736040609
[Epoch 2] time cost: 642.9326295852661
[Epoch 2] validation: validation accuracy=0.7266666666666667


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).