<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or run inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure that you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device the array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[15:37:26] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `ctx` field in the printed array (`ctx=gpu(0)` above) tells you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you only have one GPU, the code falls back to `npx.cpu()` and copies `x` to main memory instead, which is what the output below shows:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[15:37:26] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
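
The 0-based numbering described above can be sketched in plain Python. The helper name `select_gpu_ids` is hypothetical and is not part of MXNet; it only illustrates how a list of valid GPU indices might be built from a device count, under the assumption that you never want to request more devices than the machine reports.

```python
def select_gpu_ids(num_available, num_requested):
    """Return the 0-based ids of the first `num_requested` GPUs.

    Hypothetical helper for illustration; not an MXNet function.
    """
    if num_requested < 0:
        raise ValueError("num_requested must be non-negative")
    # Never ask for more devices than the machine reports.
    count = min(num_available, num_requested)
    return list(range(count))


# On a machine with eight GPUs, the valid ids run from 0 to 7.
all_ids = select_gpu_ids(num_available=8, num_requested=8)
first_two = select_gpu_ids(num_available=8, num_requested=2)
print(all_ids[-1])  # id of the last GPU
print(first_two)    # ids of the first two GPUs
```

With MXNet, `npx.num_gpus()` could supply `num_available`, and each returned id could be passed to `npx.gpu(i)` to obtain the corresponding device.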

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to make sure that all inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not all on the same device, you will get an error. Copy them to a common device first, for example with `copyto`.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device of parameters that are already loaded.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[15:37:27] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 4.9027143, -2.8922734]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. With *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
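
The idea of data parallelism can be sketched without any GPU. The following plain-Python example is only an illustration under stated assumptions: `split_batch` and `grad_sum` are hypothetical helpers (not MXNet functions), the "model" is a single weight `w` with a squared-error loss, and the two "devices" are simulated by processing two chunks of the batch separately. It shows that summing the per-chunk gradients and normalizing by the batch size gives the same result as computing the gradient on the whole batch at once.

```python
import math


def split_batch(batch, num_parts):
    """Split a list into `num_parts` contiguous, equally sized chunks.

    Hypothetical helper for illustration; not an MXNet function.
    """
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    size = len(batch) // num_parts
    return [batch[i * size:(i + 1) * size] for i in range(num_parts)]


def grad_sum(w, samples):
    """Sum of d/dw (w * x - y)^2 over the given (x, y) samples."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


batch = [(1.0, 2.0), (2.0, 3.9), (3.0, 6.1), (4.0, 8.2)]
w = 0.5
n = 2  # pretend there are two devices

shards = split_batch(batch, n)

# One backward pass per simulated device, each on its own shard.
per_device = [grad_sum(w, shard) for shard in shards]

# Aggregate the per-device gradients, then normalize by the full batch size.
combined = sum(per_device) / len(batch)

# Reference: the same gradient computed on the whole batch at once.
full_batch = grad_sum(w, batch) / len(batch)

print(math.isclose(combined, full_batch))
```

Because the per-sample gradients simply add up, the split does not change the update; it only changes where the work is done.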

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations for the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7787937250003679 samples/sec                   batch loss = 14.915806293487549 | accuracy = 0.45


Epoch[1] Batch[10] Speed: 1.2606075979764717 samples/sec                   batch loss = 29.26508355140686 | accuracy = 0.475


Epoch[1] Batch[15] Speed: 1.2579751651891253 samples/sec                   batch loss = 43.111366987228394 | accuracy = 0.5166666666666667


Epoch[1] Batch[20] Speed: 1.2589919549221096 samples/sec                   batch loss = 56.17343187332153 | accuracy = 0.55


Epoch[1] Batch[25] Speed: 1.2612853021144244 samples/sec                   batch loss = 70.73603892326355 | accuracy = 0.51


Epoch[1] Batch[30] Speed: 1.260939109082117 samples/sec                   batch loss = 84.83905911445618 | accuracy = 0.5


Epoch[1] Batch[35] Speed: 1.259953898161709 samples/sec                   batch loss = 97.78402161598206 | accuracy = 0.5285714285714286


Epoch[1] Batch[40] Speed: 1.253781567114862 samples/sec                   batch loss = 111.20469546318054 | accuracy = 0.51875


Epoch[1] Batch[45] Speed: 1.2543548775997324 samples/sec                   batch loss = 124.93979787826538 | accuracy = 0.5111111111111111


Epoch[1] Batch[50] Speed: 1.2502449672084126 samples/sec                   batch loss = 138.1907241344452 | accuracy = 0.525


Epoch[1] Batch[55] Speed: 1.2492687623760865 samples/sec                   batch loss = 151.92137813568115 | accuracy = 0.5227272727272727


Epoch[1] Batch[60] Speed: 1.2499363752803512 samples/sec                   batch loss = 164.9904260635376 | accuracy = 0.525


Epoch[1] Batch[65] Speed: 1.2474755248259826 samples/sec                   batch loss = 178.68589806556702 | accuracy = 0.5230769230769231


Epoch[1] Batch[70] Speed: 1.2608401775449336 samples/sec                   batch loss = 191.09900307655334 | accuracy = 0.5357142857142857


Epoch[1] Batch[75] Speed: 1.264129518976919 samples/sec                   batch loss = 204.09507966041565 | accuracy = 0.5433333333333333


Epoch[1] Batch[80] Speed: 1.259379996976386 samples/sec                   batch loss = 218.72358059883118 | accuracy = 0.54375


Epoch[1] Batch[85] Speed: 1.263588732581913 samples/sec                   batch loss = 231.53893423080444 | accuracy = 0.5470588235294118


Epoch[1] Batch[90] Speed: 1.2590685804182646 samples/sec                   batch loss = 246.82814741134644 | accuracy = 0.5361111111111111


Epoch[1] Batch[95] Speed: 1.261713092077022 samples/sec                   batch loss = 260.95352816581726 | accuracy = 0.5342105263157895


Epoch[1] Batch[100] Speed: 1.2603876973861396 samples/sec                   batch loss = 275.26378059387207 | accuracy = 0.5325


Epoch[1] Batch[105] Speed: 1.26054555967893 samples/sec                   batch loss = 288.82560658454895 | accuracy = 0.5333333333333333


Epoch[1] Batch[110] Speed: 1.2565976519295532 samples/sec                   batch loss = 302.5513186454773 | accuracy = 0.5295454545454545


Epoch[1] Batch[115] Speed: 1.2576210766093288 samples/sec                   batch loss = 316.6993992328644 | accuracy = 0.5304347826086957


Epoch[1] Batch[120] Speed: 1.2568914628224277 samples/sec                   batch loss = 330.83353543281555 | accuracy = 0.5291666666666667


Epoch[1] Batch[125] Speed: 1.2592760168089454 samples/sec                   batch loss = 344.41954278945923 | accuracy = 0.534


Epoch[1] Batch[130] Speed: 1.2622274879769433 samples/sec                   batch loss = 357.8075864315033 | accuracy = 0.5365384615384615


Epoch[1] Batch[135] Speed: 1.2572709560509385 samples/sec                   batch loss = 371.3734176158905 | accuracy = 0.5351851851851852


Epoch[1] Batch[140] Speed: 1.2566718212743104 samples/sec                   batch loss = 386.14883494377136 | accuracy = 0.5285714285714286


Epoch[1] Batch[145] Speed: 1.2601145858056608 samples/sec                   batch loss = 399.6238272190094 | accuracy = 0.5327586206896552


Epoch[1] Batch[150] Speed: 1.2636684885191958 samples/sec                   batch loss = 413.3223628997803 | accuracy = 0.5316666666666666


Epoch[1] Batch[155] Speed: 1.262829651915277 samples/sec                   batch loss = 426.77560591697693 | accuracy = 0.5290322580645161


Epoch[1] Batch[160] Speed: 1.2551789211061712 samples/sec                   batch loss = 439.72404885292053 | accuracy = 0.5359375


Epoch[1] Batch[165] Speed: 1.263163474152501 samples/sec                   batch loss = 454.0397539138794 | accuracy = 0.5378787878787878


Epoch[1] Batch[170] Speed: 1.2609376875420555 samples/sec                   batch loss = 468.525230884552 | accuracy = 0.5352941176470588


Epoch[1] Batch[175] Speed: 1.2620961675258568 samples/sec                   batch loss = 482.36711072921753 | accuracy = 0.5357142857142857


Epoch[1] Batch[180] Speed: 1.262626079421522 samples/sec                   batch loss = 496.23314571380615 | accuracy = 0.5361111111111111


Epoch[1] Batch[185] Speed: 1.26267558844576 samples/sec                   batch loss = 510.7272205352783 | accuracy = 0.5310810810810811


Epoch[1] Batch[190] Speed: 1.2660407302629946 samples/sec                   batch loss = 524.3430542945862 | accuracy = 0.5328947368421053


Epoch[1] Batch[195] Speed: 1.259617325119886 samples/sec                   batch loss = 538.2639012336731 | accuracy = 0.5307692307692308


Epoch[1] Batch[200] Speed: 1.2680390920149787 samples/sec                   batch loss = 552.0663959980011 | accuracy = 0.53


Epoch[1] Batch[205] Speed: 1.261862270224736 samples/sec                   batch loss = 565.7822532653809 | accuracy = 0.5317073170731708


Epoch[1] Batch[210] Speed: 1.2585738442210659 samples/sec                   batch loss = 579.6198182106018 | accuracy = 0.5297619047619048


Epoch[1] Batch[215] Speed: 1.2639028655278002 samples/sec                   batch loss = 593.9051625728607 | accuracy = 0.5279069767441861


Epoch[1] Batch[220] Speed: 1.2614726972160137 samples/sec                   batch loss = 607.0567409992218 | accuracy = 0.5306818181818181


Epoch[1] Batch[225] Speed: 1.2633166103099827 samples/sec                   batch loss = 620.8251612186432 | accuracy = 0.53


Epoch[1] Batch[230] Speed: 1.261263872821854 samples/sec                   batch loss = 634.456431388855 | accuracy = 0.5293478260869565


Epoch[1] Batch[235] Speed: 1.2546633096734963 samples/sec                   batch loss = 648.6594069004059 | accuracy = 0.526595744680851


Epoch[1] Batch[240] Speed: 1.2658797694539456 samples/sec                   batch loss = 662.357001543045 | accuracy = 0.5270833333333333


Epoch[1] Batch[245] Speed: 1.2593272485273634 samples/sec                   batch loss = 676.101485490799 | accuracy = 0.5295918367346939


Epoch[1] Batch[250] Speed: 1.2582853804301155 samples/sec                   batch loss = 690.307373046875 | accuracy = 0.529


Epoch[1] Batch[255] Speed: 1.2644977659972887 samples/sec                   batch loss = 703.0411665439606 | accuracy = 0.5343137254901961


Epoch[1] Batch[260] Speed: 1.257929985404691 samples/sec                   batch loss = 717.1076738834381 | accuracy = 0.5326923076923077


Epoch[1] Batch[265] Speed: 1.2660138846966495 samples/sec                   batch loss = 730.6234872341156 | accuracy = 0.5311320754716982


Epoch[1] Batch[270] Speed: 1.2594965697046216 samples/sec                   batch loss = 744.2892196178436 | accuracy = 0.5324074074074074


Epoch[1] Batch[275] Speed: 1.2593327311352855 samples/sec                   batch loss = 758.0002076625824 | accuracy = 0.5327272727272727


Epoch[1] Batch[280] Speed: 1.255319419572076 samples/sec                   batch loss = 772.0684411525726 | accuracy = 0.53125


Epoch[1] Batch[285] Speed: 1.2627093252257655 samples/sec                   batch loss = 785.5326628684998 | accuracy = 0.5324561403508772


Epoch[1] Batch[290] Speed: 1.2594802122944166 samples/sec                   batch loss = 799.6946425437927 | accuracy = 0.5310344827586206


Epoch[1] Batch[295] Speed: 1.2586644884206533 samples/sec                   batch loss = 812.3795716762543 | accuracy = 0.5347457627118644


Epoch[1] Batch[300] Speed: 1.2586779917468638 samples/sec                   batch loss = 825.42365026474 | accuracy = 0.5383333333333333


Epoch[1] Batch[305] Speed: 1.2624724456162588 samples/sec                   batch loss = 839.1765713691711 | accuracy = 0.5385245901639344


Epoch[1] Batch[310] Speed: 1.2588476106439046 samples/sec                   batch loss = 852.2764666080475 | accuracy = 0.5419354838709678


Epoch[1] Batch[315] Speed: 1.256332953824873 samples/sec                   batch loss = 864.9821033477783 | accuracy = 0.546031746031746


Epoch[1] Batch[320] Speed: 1.2597044309084253 samples/sec                   batch loss = 877.4760127067566 | accuracy = 0.55


Epoch[1] Batch[325] Speed: 1.2550816424697941 samples/sec                   batch loss = 890.4856889247894 | accuracy = 0.5515384615384615


Epoch[1] Batch[330] Speed: 1.2542075628856226 samples/sec                   batch loss = 904.3267030715942 | accuracy = 0.5507575757575758


Epoch[1] Batch[335] Speed: 1.265475589654476 samples/sec                   batch loss = 917.6287286281586 | accuracy = 0.5507462686567164


Epoch[1] Batch[340] Speed: 1.2603706540328223 samples/sec                   batch loss = 931.2415537834167 | accuracy = 0.5507352941176471


Epoch[1] Batch[345] Speed: 1.262735650669662 samples/sec                   batch loss = 944.6661169528961 | accuracy = 0.553623188405797


Epoch[1] Batch[350] Speed: 1.2583737178565484 samples/sec                   batch loss = 959.5120630264282 | accuracy = 0.55


Epoch[1] Batch[355] Speed: 1.2647055653037467 samples/sec                   batch loss = 973.5073068141937 | accuracy = 0.5485915492957747


Epoch[1] Batch[360] Speed: 1.2619883209851581 samples/sec                   batch loss = 987.4917931556702 | accuracy = 0.5479166666666667


Epoch[1] Batch[365] Speed: 1.2613649571959429 samples/sec                   batch loss = 1001.3783910274506 | accuracy = 0.5493150684931507


Epoch[1] Batch[370] Speed: 1.2614571420622909 samples/sec                   batch loss = 1014.1347939968109 | accuracy = 0.5527027027027027


Epoch[1] Batch[375] Speed: 1.2590973055724601 samples/sec                   batch loss = 1027.5418727397919 | accuracy = 0.5526666666666666


Epoch[1] Batch[380] Speed: 1.2607829483800308 samples/sec                   batch loss = 1040.5419926643372 | accuracy = 0.5552631578947368


Epoch[1] Batch[385] Speed: 1.2586914953628128 samples/sec                   batch loss = 1054.0946972370148 | accuracy = 0.5551948051948052


Epoch[1] Batch[390] Speed: 1.2595611525320405 samples/sec                   batch loss = 1068.5778188705444 | accuracy = 0.5544871794871795


Epoch[1] Batch[395] Speed: 1.2607435352876648 samples/sec                   batch loss = 1081.9833760261536 | accuracy = 0.5544303797468354


Epoch[1] Batch[400] Speed: 1.2601127875428522 samples/sec                   batch loss = 1096.186490058899 | accuracy = 0.55375


Epoch[1] Batch[405] Speed: 1.2555951546939623 samples/sec                   batch loss = 1109.91268825531 | accuracy = 0.5530864197530864


Epoch[1] Batch[410] Speed: 1.2589809956875373 samples/sec                   batch loss = 1122.6967809200287 | accuracy = 0.5554878048780488


Epoch[1] Batch[415] Speed: 1.2628364007736645 samples/sec                   batch loss = 1136.2138509750366 | accuracy = 0.5542168674698795


Epoch[1] Batch[420] Speed: 1.2600136071005663 samples/sec                   batch loss = 1150.1211104393005 | accuracy = 0.5541666666666667


Epoch[1] Batch[425] Speed: 1.2580784589325944 samples/sec                   batch loss = 1163.7525930404663 | accuracy = 0.5564705882352942


Epoch[1] Batch[430] Speed: 1.2598442414925939 samples/sec                   batch loss = 1176.173704624176 | accuracy = 0.5587209302325581


Epoch[1] Batch[435] Speed: 1.2531441921880309 samples/sec                   batch loss = 1189.8873834609985 | accuracy = 0.5591954022988506


Epoch[1] Batch[440] Speed: 1.2592518202510785 samples/sec                   batch loss = 1202.0175485610962 | accuracy = 0.5619318181818181


Epoch[1] Batch[445] Speed: 1.26206597623161 samples/sec                   batch loss = 1215.5747339725494 | accuracy = 0.5629213483146067


Epoch[1] Batch[450] Speed: 1.2527027365046108 samples/sec                   batch loss = 1228.218351840973 | accuracy = 0.565


Epoch[1] Batch[455] Speed: 1.2614828462121301 samples/sec                   batch loss = 1241.994904756546 | accuracy = 0.5653846153846154


Epoch[1] Batch[460] Speed: 1.2570897050871204 samples/sec                   batch loss = 1256.3421804904938 | accuracy = 0.5646739130434782


Epoch[1] Batch[465] Speed: 1.2617592082718958 samples/sec                   batch loss = 1269.2822256088257 | accuracy = 0.5655913978494623


Epoch[1] Batch[470] Speed: 1.2568486205745202 samples/sec                   batch loss = 1282.5875136852264 | accuracy = 0.5654255319148936


Epoch[1] Batch[475] Speed: 1.2532287199244292 samples/sec                   batch loss = 1295.9741854667664 | accuracy = 0.5652631578947368


Epoch[1] Batch[480] Speed: 1.258198753846561 samples/sec                   batch loss = 1309.4833445549011 | accuracy = 0.565625


Epoch[1] Batch[485] Speed: 1.256400787964409 samples/sec                   batch loss = 1322.8368065357208 | accuracy = 0.565979381443299


Epoch[1] Batch[490] Speed: 1.2616494269032956 samples/sec                   batch loss = 1336.0335648059845 | accuracy = 0.5678571428571428


Epoch[1] Batch[495] Speed: 1.2656095237908445 samples/sec                   batch loss = 1349.522515296936 | accuracy = 0.5686868686868687


Epoch[1] Batch[500] Speed: 1.2571154199477228 samples/sec                   batch loss = 1362.6223981380463 | accuracy = 0.5695


Epoch[1] Batch[505] Speed: 1.263069423204719 samples/sec                   batch loss = 1376.8681211471558 | accuracy = 0.5693069306930693


Epoch[1] Batch[510] Speed: 1.2625613719441209 samples/sec                   batch loss = 1390.866733789444 | accuracy = 0.5681372549019608


Epoch[1] Batch[515] Speed: 1.2550150772669255 samples/sec                   batch loss = 1404.2808740139008 | accuracy = 0.5674757281553398


Epoch[1] Batch[520] Speed: 1.2563489473448521 samples/sec                   batch loss = 1418.2797520160675 | accuracy = 0.5668269230769231


Epoch[1] Batch[525] Speed: 1.26253030330965 samples/sec                   batch loss = 1431.6195714473724 | accuracy = 0.5666666666666667


Epoch[1] Batch[530] Speed: 1.261164321499023 samples/sec                   batch loss = 1444.5515348911285 | accuracy = 0.5674528301886792


Epoch[1] Batch[535] Speed: 1.2588670687405954 samples/sec                   batch loss = 1457.397129058838 | accuracy = 0.569158878504673


Epoch[1] Batch[540] Speed: 1.2500875318245894 samples/sec                   batch loss = 1471.5113236904144 | accuracy = 0.5689814814814815


Epoch[1] Batch[545] Speed: 1.254380949625257 samples/sec                   batch loss = 1483.9978985786438 | accuracy = 0.5706422018348624


Epoch[1] Batch[550] Speed: 1.2524714660856862 samples/sec                   batch loss = 1497.1006424427032 | accuracy = 0.5722727272727273


Epoch[1] Batch[555] Speed: 1.2578046497163735 samples/sec                   batch loss = 1509.7433483600616 | accuracy = 0.572972972972973


Epoch[1] Batch[560] Speed: 1.257704889423173 samples/sec                   batch loss = 1523.2353191375732 | accuracy = 0.5736607142857143


Epoch[1] Batch[565] Speed: 1.2616661253436412 samples/sec                   batch loss = 1536.8779592514038 | accuracy = 0.5738938053097346


Epoch[1] Batch[570] Speed: 1.2537338773558115 samples/sec                   batch loss = 1550.4532108306885 | accuracy = 0.575


Epoch[1] Batch[575] Speed: 1.2549028994264038 samples/sec                   batch loss = 1563.3619439601898 | accuracy = 0.5756521739130435


Epoch[1] Batch[580] Speed: 1.2579491321822314 samples/sec                   batch loss = 1577.0998740196228 | accuracy = 0.5758620689655173


Epoch[1] Batch[585] Speed: 1.26140346075873 samples/sec                   batch loss = 1589.2962491512299 | accuracy = 0.5782051282051283


Epoch[1] Batch[590] Speed: 1.255032914922102 samples/sec                   batch loss = 1603.0982701778412 | accuracy = 0.5775423728813559


Epoch[1] Batch[595] Speed: 1.2580101605301375 samples/sec                   batch loss = 1615.9070956707 | accuracy = 0.5785714285714286


Epoch[1] Batch[600] Speed: 1.2589064590592036 samples/sec                   batch loss = 1628.4458317756653 | accuracy = 0.5791666666666667


Epoch[1] Batch[605] Speed: 1.254387608481785 samples/sec                   batch loss = 1641.4716308116913 | accuracy = 0.5797520661157025


Epoch[1] Batch[610] Speed: 1.2528293027303516 samples/sec                   batch loss = 1654.687349319458 | accuracy = 0.5795081967213115


Epoch[1] Batch[615] Speed: 1.2527797208958906 samples/sec                   batch loss = 1667.9239065647125 | accuracy = 0.5788617886178862


Epoch[1] Batch[620] Speed: 1.264450591722398 samples/sec                   batch loss = 1680.750890493393 | accuracy = 0.5790322580645161


Epoch[1] Batch[625] Speed: 1.2577759835127535 samples/sec                   batch loss = 1692.9381747245789 | accuracy = 0.58


Epoch[1] Batch[630] Speed: 1.2538063972075086 samples/sec                   batch loss = 1705.3956022262573 | accuracy = 0.580952380952381


Epoch[1] Batch[635] Speed: 1.25405756436558 samples/sec                   batch loss = 1718.980661392212 | accuracy = 0.5803149606299213


Epoch[1] Batch[640] Speed: 1.2629070304783712 samples/sec                   batch loss = 1731.6532378196716 | accuracy = 0.580859375


Epoch[1] Batch[645] Speed: 1.2590631945977677 samples/sec                   batch loss = 1746.606921672821 | accuracy = 0.5786821705426357


Epoch[1] Batch[650] Speed: 1.2604297396284843 samples/sec                   batch loss = 1760.7026028633118 | accuracy = 0.5788461538461539


Epoch[1] Batch[655] Speed: 1.2511939586686966 samples/sec                   batch loss = 1775.2516677379608 | accuracy = 0.5774809160305343


Epoch[1] Batch[660] Speed: 1.2531207922830205 samples/sec                   batch loss = 1788.3030798435211 | accuracy = 0.5776515151515151


Epoch[1] Batch[665] Speed: 1.2536117181783837 samples/sec                   batch loss = 1801.356539964676 | accuracy = 0.5781954887218045


Epoch[1] Batch[670] Speed: 1.2572821681959143 samples/sec                   batch loss = 1814.1154475212097 | accuracy = 0.5779850746268657


Epoch[1] Batch[675] Speed: 1.2517434240296577 samples/sec                   batch loss = 1826.9369690418243 | accuracy = 0.5792592592592593


Epoch[1] Batch[680] Speed: 1.255622124050905 samples/sec                   batch loss = 1839.9533624649048 | accuracy = 0.5801470588235295


Epoch[1] Batch[685] Speed: 1.255777572821362 samples/sec                   batch loss = 1853.167328596115 | accuracy = 0.5806569343065694


Epoch[1] Batch[690] Speed: 1.2576129693256426 samples/sec                   batch loss = 1866.6435190439224 | accuracy = 0.5811594202898551


Epoch[1] Batch[695] Speed: 1.2557470251334042 samples/sec                   batch loss = 1881.0261808633804 | accuracy = 0.5805755395683453


Epoch[1] Batch[700] Speed: 1.2508701625864236 samples/sec                   batch loss = 1894.600449681282 | accuracy = 0.5810714285714286


Epoch[1] Batch[705] Speed: 1.252225420065204 samples/sec                   batch loss = 1908.399993777275 | accuracy = 0.5808510638297872


Epoch[1] Batch[710] Speed: 1.2494990375473463 samples/sec                   batch loss = 1920.867839217186 | accuracy = 0.581338028169014


Epoch[1] Batch[715] Speed: 1.253427962178147 samples/sec                   batch loss = 1932.99518096447 | accuracy = 0.5825174825174825


Epoch[1] Batch[720] Speed: 1.2563166784843431 samples/sec                   batch loss = 1944.4218207597733 | accuracy = 0.584375


Epoch[1] Batch[725] Speed: 1.2571384040834555 samples/sec                   batch loss = 1957.876062989235 | accuracy = 0.5837931034482758


Epoch[1] Batch[730] Speed: 1.2557308589583982 samples/sec                   batch loss = 1970.3007916212082 | accuracy = 0.583904109589041


Epoch[1] Batch[735] Speed: 1.2515535864228513 samples/sec                   batch loss = 1982.5998846292496 | accuracy = 0.5846938775510204


Epoch[1] Batch[740] Speed: 1.2540936544814265 samples/sec                   batch loss = 1994.9360913038254 | accuracy = 0.5851351351351352


Epoch[1] Batch[745] Speed: 1.252933718068226 samples/sec                   batch loss = 2009.510607600212 | accuracy = 0.5838926174496645


Epoch[1] Batch[750] Speed: 1.257496839246648 samples/sec                   batch loss = 2022.257374405861 | accuracy = 0.584


Epoch[1] Batch[755] Speed: 1.2562565668643066 samples/sec                   batch loss = 2034.7671250104904 | accuracy = 0.5837748344370861


Epoch[1] Batch[760] Speed: 1.257833505833065 samples/sec                   batch loss = 2049.531844139099 | accuracy = 0.5828947368421052


Epoch[1] Batch[765] Speed: 1.2549526494666239 samples/sec                   batch loss = 2059.6326707601547 | accuracy = 0.5852941176470589


Epoch[1] Batch[770] Speed: 1.257057303976849 samples/sec                   batch loss = 2071.5088773965836 | accuracy = 0.586038961038961


Epoch[1] Batch[775] Speed: 1.2537952469687759 samples/sec                   batch loss = 2085.621957421303 | accuracy = 0.5858064516129032


Epoch[1] Batch[780] Speed: 1.2551211718343274 samples/sec                   batch loss = 2098.02105987072 | accuracy = 0.5862179487179487


Epoch[1] Batch[785] Speed: 1.2529762002561335 samples/sec                   batch loss = 2111.342239499092 | accuracy = 0.5863057324840765


[Epoch 1] training: accuracy=0.587246192893401
[Epoch 1] time cost: 644.507839679718
[Epoch 1] validation: validation accuracy=0.7111111111111111


Epoch[2] Batch[5] Speed: 1.2589595501511865 samples/sec                   batch loss = 11.13124668598175 | accuracy = 0.65


Epoch[2] Batch[10] Speed: 1.2579188560186185 samples/sec                   batch loss = 22.516185998916626 | accuracy = 0.7


Epoch[2] Batch[15] Speed: 1.258285569172078 samples/sec                   batch loss = 33.35962152481079 | accuracy = 0.7166666666666667


Epoch[2] Batch[20] Speed: 1.2621171503767232 samples/sec                   batch loss = 46.2940559387207 | accuracy = 0.7


Epoch[2] Batch[25] Speed: 1.2608552437022118 samples/sec                   batch loss = 59.32341384887695 | accuracy = 0.69


Epoch[2] Batch[30] Speed: 1.26694392019738 samples/sec                   batch loss = 71.05417132377625 | accuracy = 0.7


Epoch[2] Batch[35] Speed: 1.2603678135187488 samples/sec                   batch loss = 81.4491685628891 | accuracy = 0.7071428571428572


Epoch[2] Batch[40] Speed: 1.259708781784862 samples/sec                   batch loss = 94.22025167942047 | accuracy = 0.7125


Epoch[2] Batch[45] Speed: 1.2544301893202912 samples/sec                   batch loss = 106.29102218151093 | accuracy = 0.7055555555555556


Epoch[2] Batch[50] Speed: 1.260284402797244 samples/sec                   batch loss = 118.4205551147461 | accuracy = 0.705


Epoch[2] Batch[55] Speed: 1.2630272997686727 samples/sec                   batch loss = 130.58600664138794 | accuracy = 0.7


Epoch[2] Batch[60] Speed: 1.255945188583615 samples/sec                   batch loss = 141.76695823669434 | accuracy = 0.7125


Epoch[2] Batch[65] Speed: 1.2574259651933504 samples/sec                   batch loss = 153.0885832309723 | accuracy = 0.7076923076923077


Epoch[2] Batch[70] Speed: 1.2594596952187316 samples/sec                   batch loss = 164.61590886116028 | accuracy = 0.7142857142857143


Epoch[2] Batch[75] Speed: 1.261783880871446 samples/sec                   batch loss = 177.2781707048416 | accuracy = 0.71


Epoch[2] Batch[80] Speed: 1.2629789040748367 samples/sec                   batch loss = 192.1221672296524 | accuracy = 0.703125


Epoch[2] Batch[85] Speed: 1.2591479557477896 samples/sec                   batch loss = 204.47281086444855 | accuracy = 0.7


Epoch[2] Batch[90] Speed: 1.2531179843531446 samples/sec                   batch loss = 217.2872735261917 | accuracy = 0.7


Epoch[2] Batch[95] Speed: 1.25345586864113 samples/sec                   batch loss = 230.8027707338333 | accuracy = 0.7


Epoch[2] Batch[100] Speed: 1.2553862988737212 samples/sec                   batch loss = 241.0946408510208 | accuracy = 0.7


Epoch[2] Batch[105] Speed: 1.2583696593482603 samples/sec                   batch loss = 252.90131747722626 | accuracy = 0.6952380952380952


Epoch[2] Batch[110] Speed: 1.2615414669740486 samples/sec                   batch loss = 264.38817274570465 | accuracy = 0.6954545454545454


Epoch[2] Batch[115] Speed: 1.253749430002345 samples/sec                   batch loss = 277.18022084236145 | accuracy = 0.6934782608695652


Epoch[2] Batch[120] Speed: 1.255835476517444 samples/sec                   batch loss = 289.613862991333 | accuracy = 0.6895833333333333


Epoch[2] Batch[125] Speed: 1.2545011950770228 samples/sec                   batch loss = 302.8949382305145 | accuracy = 0.68


Epoch[2] Batch[130] Speed: 1.25552205195027 samples/sec                   batch loss = 315.8914375305176 | accuracy = 0.675


Epoch[2] Batch[135] Speed: 1.2565457009531635 samples/sec                   batch loss = 328.0462726354599 | accuracy = 0.674074074074074


Epoch[2] Batch[140] Speed: 1.260134272386793 samples/sec                   batch loss = 339.13404750823975 | accuracy = 0.6696428571428571


Epoch[2] Batch[145] Speed: 1.254263727813398 samples/sec                   batch loss = 353.31233286857605 | accuracy = 0.6672413793103448


Epoch[2] Batch[150] Speed: 1.2551544122295368 samples/sec                   batch loss = 365.6356122493744 | accuracy = 0.6716666666666666


Epoch[2] Batch[155] Speed: 1.256227030554744 samples/sec                   batch loss = 377.4962372779846 | accuracy = 0.6693548387096774


Epoch[2] Batch[160] Speed: 1.2534223435774905 samples/sec                   batch loss = 389.7948054075241 | accuracy = 0.6703125


Epoch[2] Batch[165] Speed: 1.2580045007742735 samples/sec                   batch loss = 400.716157913208 | accuracy = 0.6742424242424242


Epoch[2] Batch[170] Speed: 1.2532715029808983 samples/sec                   batch loss = 412.266996383667 | accuracy = 0.675


Epoch[2] Batch[175] Speed: 1.2578971637134668 samples/sec                   batch loss = 425.15285551548004 | accuracy = 0.6742857142857143


Epoch[2] Batch[180] Speed: 1.2594754847983558 samples/sec                   batch loss = 440.9262889623642 | accuracy = 0.6694444444444444


Epoch[2] Batch[185] Speed: 1.2572385455967354 samples/sec                   batch loss = 452.447451710701 | accuracy = 0.6702702702702703


Epoch[2] Batch[190] Speed: 1.2536138726202135 samples/sec                   batch loss = 466.3653119802475 | accuracy = 0.6671052631578948


Epoch[2] Batch[195] Speed: 1.2535229241488304 samples/sec                   batch loss = 477.7546765804291 | accuracy = 0.6666666666666666


Epoch[2] Batch[200] Speed: 1.2624297920264556 samples/sec                   batch loss = 489.4892737865448 | accuracy = 0.6675


Epoch[2] Batch[205] Speed: 1.2641521887929006 samples/sec                   batch loss = 500.68124508857727 | accuracy = 0.6682926829268293


Epoch[2] Batch[210] Speed: 1.2574423635324974 samples/sec                   batch loss = 513.6440577507019 | accuracy = 0.6666666666666666


Epoch[2] Batch[215] Speed: 1.2544707094543568 samples/sec                   batch loss = 525.4271233081818 | accuracy = 0.6662790697674419


Epoch[2] Batch[220] Speed: 1.2579101790067788 samples/sec                   batch loss = 538.8982112407684 | accuracy = 0.6636363636363637


Epoch[2] Batch[225] Speed: 1.257069642574828 samples/sec                   batch loss = 553.531977891922 | accuracy = 0.6611111111111111


Epoch[2] Batch[230] Speed: 1.2572437273885262 samples/sec                   batch loss = 568.3971087932587 | accuracy = 0.6532608695652173


Epoch[2] Batch[235] Speed: 1.2540783745164867 samples/sec                   batch loss = 580.267294883728 | accuracy = 0.6574468085106383


Epoch[2] Batch[240] Speed: 1.2603070295861474 samples/sec                   batch loss = 592.0014100074768 | accuracy = 0.6604166666666667


Epoch[2] Batch[245] Speed: 1.253506346910233 samples/sec                   batch loss = 602.5640398263931 | accuracy = 0.6622448979591836


Epoch[2] Batch[250] Speed: 1.260553136548222 samples/sec                   batch loss = 614.7265515327454 | accuracy = 0.661


Epoch[2] Batch[255] Speed: 1.259145309742231 samples/sec                   batch loss = 628.7885434627533 | accuracy = 0.6598039215686274


Epoch[2] Batch[260] Speed: 1.256686129058813 samples/sec                   batch loss = 640.5710351467133 | accuracy = 0.6605769230769231


Epoch[2] Batch[265] Speed: 1.2529797561646303 samples/sec                   batch loss = 652.4491851329803 | accuracy = 0.660377358490566


Epoch[2] Batch[270] Speed: 1.2559326840126925 samples/sec                   batch loss = 667.3879647254944 | accuracy = 0.6564814814814814


Epoch[2] Batch[275] Speed: 1.2606969249069537 samples/sec                   batch loss = 676.9135380983353 | accuracy = 0.66


Epoch[2] Batch[280] Speed: 1.2552916179144127 samples/sec                   batch loss = 688.6774370670319 | accuracy = 0.6616071428571428


Epoch[2] Batch[285] Speed: 1.2560284960650512 samples/sec                   batch loss = 698.5720902681351 | accuracy = 0.6640350877192982


Epoch[2] Batch[290] Speed: 1.2560555780779685 samples/sec                   batch loss = 713.2228666543961 | accuracy = 0.6629310344827586


Epoch[2] Batch[295] Speed: 1.258383628277497 samples/sec                   batch loss = 723.4116092920303 | accuracy = 0.6644067796610169


Epoch[2] Batch[300] Speed: 1.257813325272396 samples/sec                   batch loss = 734.176798582077 | accuracy = 0.6641666666666667


Epoch[2] Batch[305] Speed: 1.2626159120218572 samples/sec                   batch loss = 747.711960196495 | accuracy = 0.6647540983606557


Epoch[2] Batch[310] Speed: 1.259292747004011 samples/sec                   batch loss = 760.3672802448273 | accuracy = 0.6629032258064517


Epoch[2] Batch[315] Speed: 1.2677794196490813 samples/sec                   batch loss = 770.2660357952118 | accuracy = 0.6658730158730158


Epoch[2] Batch[320] Speed: 1.2594004169171997 samples/sec                   batch loss = 782.4104474782944 | accuracy = 0.665625


Epoch[2] Batch[325] Speed: 1.2628434348718542 samples/sec                   batch loss = 798.8781411647797 | accuracy = 0.6615384615384615


Epoch[2] Batch[330] Speed: 1.2616297878260347 samples/sec                   batch loss = 809.2215585708618 | accuracy = 0.6628787878787878


Epoch[2] Batch[335] Speed: 1.2554598555472734 samples/sec                   batch loss = 819.7376588582993 | accuracy = 0.664179104477612


Epoch[2] Batch[340] Speed: 1.2613946407695513 samples/sec                   batch loss = 830.2747075557709 | accuracy = 0.6654411764705882


Epoch[2] Batch[345] Speed: 1.2586387101478718 samples/sec                   batch loss = 843.3419334888458 | accuracy = 0.6659420289855073


Epoch[2] Batch[350] Speed: 1.2623657695535322 samples/sec                   batch loss = 854.8502168655396 | accuracy = 0.6657142857142857


Epoch[2] Batch[355] Speed: 1.2604340008230994 samples/sec                   batch loss = 868.1676708459854 | accuracy = 0.6647887323943662


Epoch[2] Batch[360] Speed: 1.255736028329555 samples/sec                   batch loss = 879.3905017375946 | accuracy = 0.6652777777777777


Epoch[2] Batch[365] Speed: 1.2615172781353756 samples/sec                   batch loss = 891.7688045501709 | accuracy = 0.665068493150685


Epoch[2] Batch[370] Speed: 1.2605282279329069 samples/sec                   batch loss = 904.0706851482391 | accuracy = 0.6648648648648648


Epoch[2] Batch[375] Speed: 1.2582106430635778 samples/sec                   batch loss = 914.8566591739655 | accuracy = 0.666


Epoch[2] Batch[380] Speed: 1.2553402716234947 samples/sec                   batch loss = 928.5307948589325 | accuracy = 0.6651315789473684


Epoch[2] Batch[385] Speed: 1.2597633594618671 samples/sec                   batch loss = 940.2306475639343 | accuracy = 0.6668831168831169


Epoch[2] Batch[390] Speed: 1.263439716846148 samples/sec                   batch loss = 954.3044593334198 | accuracy = 0.6641025641025641


Epoch[2] Batch[395] Speed: 1.2600843945998452 samples/sec                   batch loss = 969.2994635105133 | accuracy = 0.6613924050632911


Epoch[2] Batch[400] Speed: 1.2601665483530953 samples/sec                   batch loss = 983.7011206150055 | accuracy = 0.660625


Epoch[2] Batch[405] Speed: 1.2586731758420386 samples/sec                   batch loss = 993.9383618831635 | accuracy = 0.6623456790123456


Epoch[2] Batch[410] Speed: 1.259965442079435 samples/sec                   batch loss = 1003.819746017456 | accuracy = 0.6634146341463415


Epoch[2] Batch[415] Speed: 1.2637875701158803 samples/sec                   batch loss = 1015.5810492038727 | accuracy = 0.6626506024096386


Epoch[2] Batch[420] Speed: 1.2595700414630975 samples/sec                   batch loss = 1025.5279808044434 | accuracy = 0.6648809523809524


Epoch[2] Batch[425] Speed: 1.2622443916541117 samples/sec                   batch loss = 1036.831427693367 | accuracy = 0.6658823529411765


Epoch[2] Batch[430] Speed: 1.262198999657916 samples/sec                   batch loss = 1050.8226252794266 | accuracy = 0.6651162790697674


Epoch[2] Batch[435] Speed: 1.2626369121583438 samples/sec                   batch loss = 1061.8674473762512 | accuracy = 0.6672413793103448


Epoch[2] Batch[440] Speed: 1.2653332858691764 samples/sec                   batch loss = 1072.8806401491165 | accuracy = 0.6681818181818182


Epoch[2] Batch[445] Speed: 1.2591106291974816 samples/sec                   batch loss = 1084.0593284368515 | accuracy = 0.6685393258426966


Epoch[2] Batch[450] Speed: 1.2591396397677665 samples/sec                   batch loss = 1097.4339317083359 | accuracy = 0.6677777777777778


Epoch[2] Batch[455] Speed: 1.2598024274969646 samples/sec                   batch loss = 1109.0291842222214 | accuracy = 0.6681318681318681


Epoch[2] Batch[460] Speed: 1.258268393885361 samples/sec                   batch loss = 1120.9716066122055 | accuracy = 0.6663043478260869


Epoch[2] Batch[465] Speed: 1.2625514906223816 samples/sec                   batch loss = 1132.6576862335205 | accuracy = 0.6666666666666666


Epoch[2] Batch[470] Speed: 1.2682004107292744 samples/sec                   batch loss = 1143.6508464813232 | accuracy = 0.6680851063829787


Epoch[2] Batch[475] Speed: 1.2597362119152826 samples/sec                   batch loss = 1157.9483578205109 | accuracy = 0.6668421052631579


Epoch[2] Batch[480] Speed: 1.258162993190994 samples/sec                   batch loss = 1168.2730938196182 | accuracy = 0.6682291666666667


Epoch[2] Batch[485] Speed: 1.2565616998903135 samples/sec                   batch loss = 1180.0271536111832 | accuracy = 0.6706185567010309


Epoch[2] Batch[490] Speed: 1.2623273021996881 samples/sec                   batch loss = 1189.851237297058 | accuracy = 0.6724489795918367


Epoch[2] Batch[495] Speed: 1.2542960788444686 samples/sec                   batch loss = 1200.8938653469086 | accuracy = 0.6732323232323232


Epoch[2] Batch[500] Speed: 1.2598307131314466 samples/sec                   batch loss = 1214.8866143226624 | accuracy = 0.673


Epoch[2] Batch[505] Speed: 1.2572800953473218 samples/sec                   batch loss = 1225.3788480758667 | accuracy = 0.6747524752475248


Epoch[2] Batch[510] Speed: 1.2614003310710322 samples/sec                   batch loss = 1239.6657371520996 | accuracy = 0.6730392156862746


Epoch[2] Batch[515] Speed: 1.2552532987290808 samples/sec                   batch loss = 1251.690071463585 | accuracy = 0.6733009708737864


Epoch[2] Batch[520] Speed: 1.2611914358258252 samples/sec                   batch loss = 1263.9596430063248 | accuracy = 0.6730769230769231


Epoch[2] Batch[525] Speed: 1.2587808342750122 samples/sec                   batch loss = 1274.1406608223915 | accuracy = 0.6728571428571428


Epoch[2] Batch[530] Speed: 1.2597472788923076 samples/sec                   batch loss = 1284.1372728943825 | accuracy = 0.6740566037735849


Epoch[2] Batch[535] Speed: 1.2603449005067533 samples/sec                   batch loss = 1295.449205815792 | accuracy = 0.6747663551401869


Epoch[2] Batch[540] Speed: 1.2504431685853248 samples/sec                   batch loss = 1306.1732488274574 | accuracy = 0.675


Epoch[2] Batch[545] Speed: 1.254220595701137 samples/sec                   batch loss = 1317.0182902812958 | accuracy = 0.6752293577981652


Epoch[2] Batch[550] Speed: 1.2520699150063166 samples/sec                   batch loss = 1331.5998030900955 | accuracy = 0.675


Epoch[2] Batch[555] Speed: 1.255424438295153 samples/sec                   batch loss = 1342.8962639570236 | accuracy = 0.6756756756756757


Epoch[2] Batch[560] Speed: 1.2534919241327747 samples/sec                   batch loss = 1355.5137211084366 | accuracy = 0.6758928571428572


Epoch[2] Batch[565] Speed: 1.2588364650148212 samples/sec                   batch loss = 1365.9108983278275 | accuracy = 0.6765486725663716


Epoch[2] Batch[570] Speed: 1.2555660253311693 samples/sec                   batch loss = 1377.6715455055237 | accuracy = 0.6767543859649123


Epoch[2] Batch[575] Speed: 1.2520094616279376 samples/sec                   batch loss = 1388.1375135183334 | accuracy = 0.6778260869565217


Epoch[2] Batch[580] Speed: 1.253879862561529 samples/sec                   batch loss = 1400.1486856937408 | accuracy = 0.6775862068965517


Epoch[2] Batch[585] Speed: 1.2575772419603082 samples/sec                   batch loss = 1412.0423355698586 | accuracy = 0.6777777777777778


Epoch[2] Batch[590] Speed: 1.2583519154799219 samples/sec                   batch loss = 1421.9232495427132 | accuracy = 0.6783898305084746


Epoch[2] Batch[595] Speed: 1.263257443989556 samples/sec                   batch loss = 1434.3397861123085 | accuracy = 0.6777310924369748


Epoch[2] Batch[600] Speed: 1.263002293279103 samples/sec                   batch loss = 1445.815567791462 | accuracy = 0.67875


Epoch[2] Batch[605] Speed: 1.2588408098976014 samples/sec                   batch loss = 1458.3612497448921 | accuracy = 0.678099173553719


Epoch[2] Batch[610] Speed: 1.2593202535451844 samples/sec                   batch loss = 1471.4773960709572 | accuracy = 0.6770491803278689


Epoch[2] Batch[615] Speed: 1.2615041880918605 samples/sec                   batch loss = 1484.75636190176 | accuracy = 0.6772357723577236


Epoch[2] Batch[620] Speed: 1.2568972067304454 samples/sec                   batch loss = 1497.4635953307152 | accuracy = 0.6766129032258065


Epoch[2] Batch[625] Speed: 1.2602747464233959 samples/sec                   batch loss = 1510.9962723851204 | accuracy = 0.6764


Epoch[2] Batch[630] Speed: 1.2615366291321053 samples/sec                   batch loss = 1521.1124140620232 | accuracy = 0.6765873015873016


Epoch[2] Batch[635] Speed: 1.25548822836855 samples/sec                   batch loss = 1530.7903400063515 | accuracy = 0.6775590551181102


Epoch[2] Batch[640] Speed: 1.2561473646963568 samples/sec                   batch loss = 1542.1835817694664 | accuracy = 0.67734375


Epoch[2] Batch[645] Speed: 1.2558408347592387 samples/sec                   batch loss = 1555.517932832241 | accuracy = 0.6779069767441861


Epoch[2] Batch[650] Speed: 1.25892072330769 samples/sec                   batch loss = 1568.5802311301231 | accuracy = 0.676923076923077


Epoch[2] Batch[655] Speed: 1.2617058807957857 samples/sec                   batch loss = 1582.4941381812096 | accuracy = 0.6755725190839694


Epoch[2] Batch[660] Speed: 1.25874589048429 samples/sec                   batch loss = 1593.3534957766533 | accuracy = 0.675


Epoch[2] Batch[665] Speed: 1.2580069533288953 samples/sec                   batch loss = 1603.9000133872032 | accuracy = 0.6759398496240602


Epoch[2] Batch[670] Speed: 1.2606612115612048 samples/sec                   batch loss = 1615.698564350605 | accuracy = 0.6764925373134328


Epoch[2] Batch[675] Speed: 1.2606464342168866 samples/sec                   batch loss = 1626.4966549277306 | accuracy = 0.6762962962962963


Epoch[2] Batch[680] Speed: 1.260004049504897 samples/sec                   batch loss = 1637.3063055872917 | accuracy = 0.6772058823529412


Epoch[2] Batch[685] Speed: 1.263668678879538 samples/sec                   batch loss = 1649.0568971037865 | accuracy = 0.6773722627737226


Epoch[2] Batch[690] Speed: 1.2563579791595683 samples/sec                   batch loss = 1660.5186198353767 | accuracy = 0.6771739130434783


Epoch[2] Batch[695] Speed: 1.2558131040018732 samples/sec                   batch loss = 1673.3858528733253 | accuracy = 0.6762589928057554


Epoch[2] Batch[700] Speed: 1.260531069170085 samples/sec                   batch loss = 1684.2005869746208 | accuracy = 0.6764285714285714


Epoch[2] Batch[705] Speed: 1.2574320909596541 samples/sec                   batch loss = 1696.100328028202 | accuracy = 0.6758865248226951


Epoch[2] Batch[710] Speed: 1.2576087271839755 samples/sec                   batch loss = 1705.9213923811913 | accuracy = 0.6767605633802817


Epoch[2] Batch[715] Speed: 1.2613009478542936 samples/sec                   batch loss = 1715.6271759867668 | accuracy = 0.6776223776223776


Epoch[2] Batch[720] Speed: 1.2650582177921583 samples/sec                   batch loss = 1726.7706890702248 | accuracy = 0.678125


Epoch[2] Batch[725] Speed: 1.2568902387176693 samples/sec                   batch loss = 1739.4755982756615 | accuracy = 0.6775862068965517


Epoch[2] Batch[730] Speed: 1.2622443916541117 samples/sec                   batch loss = 1752.2009236216545 | accuracy = 0.677054794520548


Epoch[2] Batch[735] Speed: 1.2536435672033073 samples/sec                   batch loss = 1763.8636781573296 | accuracy = 0.6761904761904762


Epoch[2] Batch[740] Speed: 1.2590485491785446 samples/sec                   batch loss = 1774.4682744145393 | accuracy = 0.6766891891891892


Epoch[2] Batch[745] Speed: 1.2605166737003348 samples/sec                   batch loss = 1784.277937233448 | accuracy = 0.6771812080536913


Epoch[2] Batch[750] Speed: 1.2549811871189738 samples/sec                   batch loss = 1799.0269246697426 | accuracy = 0.6756666666666666


Epoch[2] Batch[755] Speed: 1.261829148111203 samples/sec                   batch loss = 1813.4459334015846 | accuracy = 0.6754966887417219


Epoch[2] Batch[760] Speed: 1.2587178424126857 samples/sec                   batch loss = 1821.9058470129967 | accuracy = 0.6773026315789473


Epoch[2] Batch[765] Speed: 1.2589046642428856 samples/sec                   batch loss = 1831.0671353936195 | accuracy = 0.6777777777777778


Epoch[2] Batch[770] Speed: 1.2574531075038486 samples/sec                   batch loss = 1840.41482681036 | accuracy = 0.6788961038961039


Epoch[2] Batch[775] Speed: 1.2622042224201357 samples/sec                   batch loss = 1851.1546447873116 | accuracy = 0.6793548387096774


Epoch[2] Batch[780] Speed: 1.2560233242856698 samples/sec                   batch loss = 1865.4556763768196 | accuracy = 0.6782051282051282


Epoch[2] Batch[785] Speed: 1.2646007039070208 samples/sec                   batch loss = 1877.3214402794838 | accuracy = 0.6773885350318471


[Epoch 2] training: accuracy=0.6779822335025381
[Epoch 2] time cost: 642.2196021080017
[Epoch 2] validation: validation accuracy=0.7711111111111111


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).