<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and deploying neural networks on GPUs can make both training and inference faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure that your machine has at least one NVIDIA GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `device` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the device, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), device=gpu)
x

[16:33:10] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], device=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the printed array tells you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If fewer than two GPUs are available, the code falls back to `npx.cpu()`, so `x` is copied to main memory instead:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[16:33:10] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet addresses them with zero-based indices through `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has, so if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
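
The zero-based numbering can be illustrated with a small plain-Python sketch that does not require MXNet. The helper `pick_gpu_ids` is hypothetical and only computes the indices you would pass to `npx.gpu(i)`; it assumes you already know how many GPUs are available.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return the zero-based GPU indices to use.

    Hypothetical helper for illustration only: it caps the request at the
    number of GPUs that are actually available.
    """
    count = max(0, min(num_available, num_requested))
    return list(range(count))

# On a machine with eight GPUs, the last valid index is 7.
all_ids = pick_gpu_ids(num_available=8, num_requested=8)
print(all_ids)

# Asking for two GPUs when only one exists yields just index 0.
print(pick_gpu_ids(num_available=1, num_requested=2))
```

Each index in the returned list corresponds to one device, so an empty list signals that no GPU can be used and a CPU fallback is needed.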

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), device=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], device=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy and move the input data and parameters to the GPU. To demonstrate this you can reuse the previously defined LeafNetwork in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_device(gpu)` to change the device of parameters that are already loaded.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', device=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), device=gpu)
net(x)

[16:33:10] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 6.4910526, -2.863353 ]], device=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, assuming there are *n* GPUs, you split each data batch into *n* parts and have each GPU run the forward and backward passes on its own separate chunk of the data.
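
To make the idea concrete, here is a minimal plain-Python sketch of data parallelism. It is not MXNet code: `split_batch` and `grad_sum` are hypothetical helpers, and the one-parameter model `w * x` with a squared-error loss is invented purely for illustration. The sketch shows that summing the gradients computed on each chunk gives the same result as computing the gradient on the whole batch.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous chunks of near-equal size."""
    base, extra = divmod(len(batch), num_parts)
    chunks, start = [], 0
    for i in range(num_parts):
        size = base + (1 if i < extra else 0)
        chunks.append(batch[start:start + size])
        start += size
    return chunks

def grad_sum(w, samples):
    """Sum over samples of d/dw (w * x - y)**2, which is 2 * (w * x - y) * x."""
    return sum(2 * (w * x - y) * x for x, y in samples)

# A toy batch of (input, target) pairs and a single weight.
batch = [(1, 2), (2, 4), (3, 7), (4, 8)]
w = 1

# Gradient computed on the whole batch at once.
full_grad = grad_sum(w, batch)

# Gradient computed per chunk (one chunk per hypothetical device), then summed.
chunks = split_batch(batch, num_parts=2)
partial_grads = [grad_sum(w, chunk) for chunk in chunks]
combined_grad = sum(partial_grads)

print(full_grad, partial_grads, combined_grad)
```

Because the loss is a sum over samples, the per-chunk gradients add up to the full-batch gradient, which is why the chunks can be processed independently before the parameters are updated.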

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
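
As a rough illustration of what the helper accumulates, the following plain-Python sketch computes accuracy over per-device shards of labels and outputs. It is a stand-in written for this explanation, not the implementation of `gluon.metric.Accuracy`; the names `argmax` and `sharded_accuracy` and the sample scores are hypothetical.

```python
def argmax(values):
    """Return the index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def sharded_accuracy(label_shards, output_shards):
    """Fraction of correct predictions across all shards."""
    correct = total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            correct += int(argmax(scores) == label)
            total += 1
    return correct / total if total else 0.0

# Two shards, as if one batch of four samples were split across two devices.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.5, 1.5]],  # predicted classes 0 and 1: both correct
    [[0.3, 0.1], [0.9, 0.2]],   # predicted classes 0 and 0: one correct
]

accuracy = sharded_accuracy(label_shards, output_shards)
print(accuracy)
```

The predicted class of each sample is the index of its highest score, and the counts are pooled over all shards, so the result does not depend on how the batch was split.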

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, device=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7762484451634288 samples/sec                   batch loss = 13.561777830123901 | accuracy = 0.6


Epoch[1] Batch[10] Speed: 1.2558699768748336 samples/sec                   batch loss = 26.753849983215332 | accuracy = 0.6


Epoch[1] Batch[15] Speed: 1.2580167636430113 samples/sec                   batch loss = 40.55426645278931 | accuracy = 0.5333333333333333


Epoch[1] Batch[20] Speed: 1.251222418944175 samples/sec                   batch loss = 55.494627237319946 | accuracy = 0.5125


Epoch[1] Batch[25] Speed: 1.2519349073865895 samples/sec                   batch loss = 68.23640608787537 | accuracy = 0.53


Epoch[1] Batch[30] Speed: 1.254268228726477 samples/sec                   batch loss = 83.42013621330261 | accuracy = 0.5


Epoch[1] Batch[35] Speed: 1.254445008880522 samples/sec                   batch loss = 98.0943512916565 | accuracy = 0.5


Epoch[1] Batch[40] Speed: 1.2608869880422826 samples/sec                   batch loss = 112.2752857208252 | accuracy = 0.49375


Epoch[1] Batch[45] Speed: 1.2583003855928148 samples/sec                   batch loss = 126.02422404289246 | accuracy = 0.5


Epoch[1] Batch[50] Speed: 1.261638705978573 samples/sec                   batch loss = 139.67880606651306 | accuracy = 0.51


Epoch[1] Batch[55] Speed: 1.2602306319310101 samples/sec                   batch loss = 154.2315549850464 | accuracy = 0.4954545454545455


Epoch[1] Batch[60] Speed: 1.2618423398047955 samples/sec                   batch loss = 169.1676208972931 | accuracy = 0.475


Epoch[1] Batch[65] Speed: 1.258910804425168 samples/sec                   batch loss = 183.7897243499756 | accuracy = 0.47307692307692306


Epoch[1] Batch[70] Speed: 1.2512170067265194 samples/sec                   batch loss = 197.13091111183167 | accuracy = 0.4857142857142857


Epoch[1] Batch[75] Speed: 1.2529558009605597 samples/sec                   batch loss = 211.58442211151123 | accuracy = 0.4766666666666667


Epoch[1] Batch[80] Speed: 1.2501064406067375 samples/sec                   batch loss = 224.83601665496826 | accuracy = 0.48125


Epoch[1] Batch[85] Speed: 1.2541227155649373 samples/sec                   batch loss = 238.21038460731506 | accuracy = 0.5


Epoch[1] Batch[90] Speed: 1.2547234565873666 samples/sec                   batch loss = 251.87131929397583 | accuracy = 0.5083333333333333


Epoch[1] Batch[95] Speed: 1.249199649868224 samples/sec                   batch loss = 266.1908929347992 | accuracy = 0.5078947368421053


Epoch[1] Batch[100] Speed: 1.2505730073064814 samples/sec                   batch loss = 280.56642413139343 | accuracy = 0.505


Epoch[1] Batch[105] Speed: 1.2544529815736039 samples/sec                   batch loss = 294.5425081253052 | accuracy = 0.5071428571428571


Epoch[1] Batch[110] Speed: 1.2462361397290762 samples/sec                   batch loss = 308.8608181476593 | accuracy = 0.5045454545454545


Epoch[1] Batch[115] Speed: 1.2577918252005997 samples/sec                   batch loss = 322.4736113548279 | accuracy = 0.5021739130434782


Epoch[1] Batch[120] Speed: 1.249651577611734 samples/sec                   batch loss = 336.80572748184204 | accuracy = 0.5083333333333333


Epoch[1] Batch[125] Speed: 1.251190412889012 samples/sec                   batch loss = 350.7967312335968 | accuracy = 0.508


Epoch[1] Batch[130] Speed: 1.2550653056449008 samples/sec                   batch loss = 364.2032504081726 | accuracy = 0.5134615384615384


Epoch[1] Batch[135] Speed: 1.2563307900269722 samples/sec                   batch loss = 377.4985706806183 | accuracy = 0.5185185185185185


Epoch[1] Batch[140] Speed: 1.2569973098191376 samples/sec                   batch loss = 391.0862112045288 | accuracy = 0.5196428571428572


Epoch[1] Batch[145] Speed: 1.2606656637777254 samples/sec                   batch loss = 404.6084861755371 | accuracy = 0.5241379310344828


Epoch[1] Batch[150] Speed: 1.257292155653118 samples/sec                   batch loss = 418.5870234966278 | accuracy = 0.5183333333333333


Epoch[1] Batch[155] Speed: 1.2624247573830996 samples/sec                   batch loss = 432.7243912220001 | accuracy = 0.5145161290322581


Epoch[1] Batch[160] Speed: 1.260582403061244 samples/sec                   batch loss = 446.76080322265625 | accuracy = 0.5125


Epoch[1] Batch[165] Speed: 1.2554172987197707 samples/sec                   batch loss = 460.4511754512787 | accuracy = 0.5151515151515151


Epoch[1] Batch[170] Speed: 1.257074540406068 samples/sec                   batch loss = 474.71106123924255 | accuracy = 0.5176470588235295


Epoch[1] Batch[175] Speed: 1.2593500300214913 samples/sec                   batch loss = 488.20046830177307 | accuracy = 0.5214285714285715


Epoch[1] Batch[180] Speed: 1.2581442173018138 samples/sec                   batch loss = 501.24696469306946 | accuracy = 0.525


Epoch[1] Batch[185] Speed: 1.2609125741961251 samples/sec                   batch loss = 515.2434339523315 | accuracy = 0.5256756756756756


Epoch[1] Batch[190] Speed: 1.2536291412679954 samples/sec                   batch loss = 529.5742309093475 | accuracy = 0.5210526315789473


Epoch[1] Batch[195] Speed: 1.2567252888719334 samples/sec                   batch loss = 543.5553317070007 | accuracy = 0.5205128205128206


Epoch[1] Batch[200] Speed: 1.2542927030067201 samples/sec                   batch loss = 557.2708444595337 | accuracy = 0.52


Epoch[1] Batch[205] Speed: 1.259105432005242 samples/sec                   batch loss = 571.11474776268 | accuracy = 0.5207317073170732


Epoch[1] Batch[210] Speed: 1.2590814309831209 samples/sec                   batch loss = 584.5492920875549 | accuracy = 0.5214285714285715


Epoch[1] Batch[215] Speed: 1.2573820499088135 samples/sec                   batch loss = 598.6377499103546 | accuracy = 0.5197674418604651


Epoch[1] Batch[220] Speed: 1.2571080727373432 samples/sec                   batch loss = 612.2515177726746 | accuracy = 0.5181818181818182


Epoch[1] Batch[225] Speed: 1.2599298647821358 samples/sec                   batch loss = 625.9918723106384 | accuracy = 0.52


Epoch[1] Batch[230] Speed: 1.2621303480928288 samples/sec                   batch loss = 638.9918208122253 | accuracy = 0.525


Epoch[1] Batch[235] Speed: 1.2579298910870325 samples/sec                   batch loss = 652.8526895046234 | accuracy = 0.524468085106383


Epoch[1] Batch[240] Speed: 1.2574855290215559 samples/sec                   batch loss = 666.7667376995087 | accuracy = 0.5239583333333333


Epoch[1] Batch[245] Speed: 1.2535807137704185 samples/sec                   batch loss = 680.3989722728729 | accuracy = 0.5244897959183673


Epoch[1] Batch[250] Speed: 1.2568138781269558 samples/sec                   batch loss = 693.5757434368134 | accuracy = 0.528


Epoch[1] Batch[255] Speed: 1.2591822603265201 samples/sec                   batch loss = 707.6054639816284 | accuracy = 0.5264705882352941


Epoch[1] Batch[260] Speed: 1.2589713592768492 samples/sec                   batch loss = 721.6730849742889 | accuracy = 0.5230769230769231


Epoch[1] Batch[265] Speed: 1.2518794178884154 samples/sec                   batch loss = 735.5754098892212 | accuracy = 0.5226415094339623


Epoch[1] Batch[270] Speed: 1.2568705592069285 samples/sec                   batch loss = 749.2016899585724 | accuracy = 0.524074074074074


Epoch[1] Batch[275] Speed: 1.2547807938815172 samples/sec                   batch loss = 762.6361520290375 | accuracy = 0.5254545454545455


Epoch[1] Batch[280] Speed: 1.2575287917317928 samples/sec                   batch loss = 776.1096434593201 | accuracy = 0.5276785714285714


Epoch[1] Batch[285] Speed: 1.2574523535349569 samples/sec                   batch loss = 790.1728613376617 | accuracy = 0.5271929824561403


Epoch[1] Batch[290] Speed: 1.2548304403869348 samples/sec                   batch loss = 802.9129590988159 | accuracy = 0.531896551724138


Epoch[1] Batch[295] Speed: 1.2623409792124605 samples/sec                   batch loss = 817.1938719749451 | accuracy = 0.5288135593220339


Epoch[1] Batch[300] Speed: 1.259075761583937 samples/sec                   batch loss = 830.8035745620728 | accuracy = 0.53


Epoch[1] Batch[305] Speed: 1.2623002339856326 samples/sec                   batch loss = 844.6995282173157 | accuracy = 0.5286885245901639


Epoch[1] Batch[310] Speed: 1.257410509679021 samples/sec                   batch loss = 858.2442049980164 | accuracy = 0.5290322580645161


Epoch[1] Batch[315] Speed: 1.2510232242011006 samples/sec                   batch loss = 872.177538394928 | accuracy = 0.5285714285714286


Epoch[1] Batch[320] Speed: 1.2486733249315478 samples/sec                   batch loss = 886.1843276023865 | accuracy = 0.5296875


Epoch[1] Batch[325] Speed: 1.2496487852076024 samples/sec                   batch loss = 899.8065993785858 | accuracy = 0.53


Epoch[1] Batch[330] Speed: 1.2564962950425318 samples/sec                   batch loss = 914.0760469436646 | accuracy = 0.5295454545454545


Epoch[1] Batch[335] Speed: 1.2568913686604386 samples/sec                   batch loss = 927.4663074016571 | accuracy = 0.5313432835820896


Epoch[1] Batch[340] Speed: 1.2542385045441171 samples/sec                   batch loss = 941.37841629982 | accuracy = 0.5308823529411765


Epoch[1] Batch[345] Speed: 1.2608433992086783 samples/sec                   batch loss = 955.6353347301483 | accuracy = 0.5297101449275362


Epoch[1] Batch[350] Speed: 1.2589872310907084 samples/sec                   batch loss = 969.1562745571136 | accuracy = 0.5314285714285715


Epoch[1] Batch[355] Speed: 1.2586181260582436 samples/sec                   batch loss = 982.0260198116302 | accuracy = 0.5345070422535211


Epoch[1] Batch[360] Speed: 1.2562559083964093 samples/sec                   batch loss = 995.19549036026 | accuracy = 0.5354166666666667


Epoch[1] Batch[365] Speed: 1.2613221887526274 samples/sec                   batch loss = 1008.3159444332123 | accuracy = 0.536986301369863


Epoch[1] Batch[370] Speed: 1.263473779924182 samples/sec                   batch loss = 1021.4372222423553 | accuracy = 0.5391891891891892


Epoch[1] Batch[375] Speed: 1.2591970033643702 samples/sec                   batch loss = 1035.4515113830566 | accuracy = 0.54


Epoch[1] Batch[380] Speed: 1.260799244897913 samples/sec                   batch loss = 1048.2539944648743 | accuracy = 0.5434210526315789


Epoch[1] Batch[385] Speed: 1.2611796798206933 samples/sec                   batch loss = 1062.0897450447083 | accuracy = 0.5435064935064935


Epoch[1] Batch[390] Speed: 1.2601567044813966 samples/sec                   batch loss = 1075.2756571769714 | accuracy = 0.5455128205128205


Epoch[1] Batch[395] Speed: 1.2575570696216816 samples/sec                   batch loss = 1089.8906989097595 | accuracy = 0.5436708860759494


Epoch[1] Batch[400] Speed: 1.2592406674632672 samples/sec                   batch loss = 1104.5538167953491 | accuracy = 0.540625


Epoch[1] Batch[405] Speed: 1.2536415063350785 samples/sec                   batch loss = 1117.914912700653 | accuracy = 0.5432098765432098


Epoch[1] Batch[410] Speed: 1.2572015205820355 samples/sec                   batch loss = 1131.6664774417877 | accuracy = 0.5439024390243903


Epoch[1] Batch[415] Speed: 1.2611089588020994 samples/sec                   batch loss = 1145.4175388813019 | accuracy = 0.5433734939759036


Epoch[1] Batch[420] Speed: 1.2549576246876017 samples/sec                   batch loss = 1158.308418750763 | accuracy = 0.5452380952380952


Epoch[1] Batch[425] Speed: 1.2561251691614272 samples/sec                   batch loss = 1171.7135241031647 | accuracy = 0.5452941176470588


Epoch[1] Batch[430] Speed: 1.2567204878982003 samples/sec                   batch loss = 1185.7466733455658 | accuracy = 0.5453488372093023


Epoch[1] Batch[435] Speed: 1.2586847907340626 samples/sec                   batch loss = 1199.2010097503662 | accuracy = 0.5454022988505747


Epoch[1] Batch[440] Speed: 1.259242368723215 samples/sec                   batch loss = 1212.8377704620361 | accuracy = 0.5465909090909091


Epoch[1] Batch[445] Speed: 1.258300668712532 samples/sec                   batch loss = 1226.0715115070343 | accuracy = 0.5483146067415731


Epoch[1] Batch[450] Speed: 1.257237509243503 samples/sec                   batch loss = 1239.689448595047 | accuracy = 0.5483333333333333


Epoch[1] Batch[455] Speed: 1.2548621636516055 samples/sec                   batch loss = 1253.0343058109283 | accuracy = 0.5494505494505495


Epoch[1] Batch[460] Speed: 1.2582956669496177 samples/sec                   batch loss = 1266.277466058731 | accuracy = 0.5483695652173913


Epoch[1] Batch[465] Speed: 1.2533457483461756 samples/sec                   batch loss = 1279.711577653885 | accuracy = 0.5478494623655914


Epoch[1] Batch[470] Speed: 1.265109825682975 samples/sec                   batch loss = 1291.3487086296082 | accuracy = 0.550531914893617


Epoch[1] Batch[475] Speed: 1.2590044260073332 samples/sec                   batch loss = 1305.917867898941 | accuracy = 0.5484210526315789


Epoch[1] Batch[480] Speed: 1.2610264925650227 samples/sec                   batch loss = 1319.2453808784485 | accuracy = 0.55


Epoch[1] Batch[485] Speed: 1.2606219008012964 samples/sec                   batch loss = 1333.735369682312 | accuracy = 0.5489690721649485


Epoch[1] Batch[490] Speed: 1.2570511818616525 samples/sec                   batch loss = 1346.7116243839264 | accuracy = 0.5510204081632653


Epoch[1] Batch[495] Speed: 1.2584681091544443 samples/sec                   batch loss = 1359.9054322242737 | accuracy = 0.5505050505050505


Epoch[1] Batch[500] Speed: 1.2609814724469288 samples/sec                   batch loss = 1373.4272894859314 | accuracy = 0.55


Epoch[1] Batch[505] Speed: 1.2589953561026683 samples/sec                   batch loss = 1385.723741054535 | accuracy = 0.552970297029703


Epoch[1] Batch[510] Speed: 1.2570280125501883 samples/sec                   batch loss = 1398.7789132595062 | accuracy = 0.5558823529411765


Epoch[1] Batch[515] Speed: 1.263473779924182 samples/sec                   batch loss = 1411.9719407558441 | accuracy = 0.5572815533980583


Epoch[1] Batch[520] Speed: 1.2625572863788777 samples/sec                   batch loss = 1424.9400911331177 | accuracy = 0.5591346153846154


Epoch[1] Batch[525] Speed: 1.2613397319982667 samples/sec                   batch loss = 1436.7744190692902 | accuracy = 0.560952380952381


Epoch[1] Batch[530] Speed: 1.2600458768858573 samples/sec                   batch loss = 1449.9900369644165 | accuracy = 0.5622641509433962


Epoch[1] Batch[535] Speed: 1.2569382630770805 samples/sec                   batch loss = 1461.9787106513977 | accuracy = 0.5635514018691589


Epoch[1] Batch[540] Speed: 1.2554363690826154 samples/sec                   batch loss = 1476.1324903964996 | accuracy = 0.5634259259259259


Epoch[1] Batch[545] Speed: 1.2563756668395532 samples/sec                   batch loss = 1488.268277645111 | accuracy = 0.5646788990825689


Epoch[1] Batch[550] Speed: 1.2512922220849898 samples/sec                   batch loss = 1499.9289226531982 | accuracy = 0.5668181818181818


Epoch[1] Batch[555] Speed: 1.2545949124688842 samples/sec                   batch loss = 1514.0440213680267 | accuracy = 0.5662162162162162


Epoch[1] Batch[560] Speed: 1.2528062887870615 samples/sec                   batch loss = 1526.9233067035675 | accuracy = 0.5674107142857143


Epoch[1] Batch[565] Speed: 1.2544550451106755 samples/sec                   batch loss = 1539.5452377796173 | accuracy = 0.5694690265486726


Epoch[1] Batch[570] Speed: 1.2592906675205444 samples/sec                   batch loss = 1552.418609380722 | accuracy = 0.5692982456140351


Epoch[1] Batch[575] Speed: 1.2583259612546798 samples/sec                   batch loss = 1565.7336127758026 | accuracy = 0.57


Epoch[1] Batch[580] Speed: 1.2587097209822267 samples/sec                   batch loss = 1576.8122646808624 | accuracy = 0.5724137931034483


Epoch[1] Batch[585] Speed: 1.2583224693133623 samples/sec                   batch loss = 1589.5224615335464 | accuracy = 0.5722222222222222


Epoch[1] Batch[590] Speed: 1.2578303938352184 samples/sec                   batch loss = 1600.6294149160385 | accuracy = 0.5741525423728814


Epoch[1] Batch[595] Speed: 1.2560137330984593 samples/sec                   batch loss = 1612.771470785141 | accuracy = 0.5756302521008403


Epoch[1] Batch[600] Speed: 1.2597624135346464 samples/sec                   batch loss = 1626.5755314826965 | accuracy = 0.57625


Epoch[1] Batch[605] Speed: 1.257717146451431 samples/sec                   batch loss = 1640.3650765419006 | accuracy = 0.5752066115702479


Epoch[1] Batch[610] Speed: 1.256184986038534 samples/sec                   batch loss = 1653.902990102768 | accuracy = 0.5762295081967214


Epoch[1] Batch[615] Speed: 1.252351702887508 samples/sec                   batch loss = 1667.6284306049347 | accuracy = 0.5764227642276423


Epoch[1] Batch[620] Speed: 1.2628791769057397 samples/sec                   batch loss = 1680.4434819221497 | accuracy = 0.5758064516129032


Epoch[1] Batch[625] Speed: 1.2636193774670275 samples/sec                   batch loss = 1694.7912199497223 | accuracy = 0.576


Epoch[1] Batch[630] Speed: 1.2612663381016311 samples/sec                   batch loss = 1706.7010917663574 | accuracy = 0.5769841269841269


Epoch[1] Batch[635] Speed: 1.259392286676241 samples/sec                   batch loss = 1719.9340965747833 | accuracy = 0.5775590551181102


Epoch[1] Batch[640] Speed: 1.2597452924974974 samples/sec                   batch loss = 1733.8898975849152 | accuracy = 0.57734375


Epoch[1] Batch[645] Speed: 1.2547197969379906 samples/sec                   batch loss = 1747.7942979335785 | accuracy = 0.5775193798449613


Epoch[1] Batch[650] Speed: 1.2576377628593622 samples/sec                   batch loss = 1761.0481655597687 | accuracy = 0.5776923076923077


Epoch[1] Batch[655] Speed: 1.2544576714404034 samples/sec                   batch loss = 1774.6289637088776 | accuracy = 0.5774809160305343


Epoch[1] Batch[660] Speed: 1.2588625347614015 samples/sec                   batch loss = 1787.4498014450073 | accuracy = 0.578030303030303


Epoch[1] Batch[665] Speed: 1.2603788915960397 samples/sec                   batch loss = 1800.7536129951477 | accuracy = 0.5778195488721805


Epoch[1] Batch[670] Speed: 1.2582813224916185 samples/sec                   batch loss = 1814.2327105998993 | accuracy = 0.5776119402985075


Epoch[1] Batch[675] Speed: 1.2689697069812176 samples/sec                   batch loss = 1826.0513670444489 | accuracy = 0.5796296296296296


Epoch[1] Batch[680] Speed: 1.2674217059698065 samples/sec                   batch loss = 1838.9248042106628 | accuracy = 0.580514705882353


Epoch[1] Batch[685] Speed: 1.2612568563090825 samples/sec                   batch loss = 1851.356942653656 | accuracy = 0.581021897810219


Epoch[1] Batch[690] Speed: 1.2581187433732521 samples/sec                   batch loss = 1864.2068654298782 | accuracy = 0.5815217391304348


Epoch[1] Batch[695] Speed: 1.2447814930968084 samples/sec                   batch loss = 1875.3959501981735 | accuracy = 0.5830935251798561


Epoch[1] Batch[700] Speed: 1.251109331990641 samples/sec                   batch loss = 1889.1447545289993 | accuracy = 0.5832142857142857


Epoch[1] Batch[705] Speed: 1.2561167049691104 samples/sec                   batch loss = 1902.8938738107681 | accuracy = 0.5829787234042553


Epoch[1] Batch[710] Speed: 1.254176997776494 samples/sec                   batch loss = 1916.2116948366165 | accuracy = 0.5838028169014085


Epoch[1] Batch[715] Speed: 1.2580325170636788 samples/sec                   batch loss = 1929.2149983644485 | accuracy = 0.5842657342657342


Epoch[1] Batch[720] Speed: 1.2569287520791839 samples/sec                   batch loss = 1943.2158344984055 | accuracy = 0.5840277777777778


Epoch[1] Batch[725] Speed: 1.2533214982620537 samples/sec                   batch loss = 1956.2967888116837 | accuracy = 0.5841379310344827


Epoch[1] Batch[730] Speed: 1.255358869983549 samples/sec                   batch loss = 1970.1343771219254 | accuracy = 0.583904109589041


Epoch[1] Batch[735] Speed: 1.262768725346333 samples/sec                   batch loss = 1984.3298548460007 | accuracy = 0.5840136054421768


Epoch[1] Batch[740] Speed: 1.2567368678418531 samples/sec                   batch loss = 1995.9032393693924 | accuracy = 0.5844594594594594


Epoch[1] Batch[745] Speed: 1.256129965587708 samples/sec                   batch loss = 2009.8777347803116 | accuracy = 0.5835570469798658


Epoch[1] Batch[750] Speed: 1.258719353388055 samples/sec                   batch loss = 2021.8409352302551 | accuracy = 0.585


Epoch[1] Batch[755] Speed: 1.2574561233884578 samples/sec                   batch loss = 2035.5651993751526 | accuracy = 0.5847682119205299


Epoch[1] Batch[760] Speed: 1.2573887406661055 samples/sec                   batch loss = 2048.915055036545 | accuracy = 0.5848684210526316


Epoch[1] Batch[765] Speed: 1.2527396840899554 samples/sec                   batch loss = 2060.906399488449 | accuracy = 0.5856209150326798


Epoch[1] Batch[770] Speed: 1.2524265872624847 samples/sec                   batch loss = 2074.554494380951 | accuracy = 0.5853896103896103


Epoch[1] Batch[775] Speed: 1.2547509515589501 samples/sec                   batch loss = 2085.430187225342 | accuracy = 0.5864516129032258


Epoch[1] Batch[780] Speed: 1.25357247116473 samples/sec                   batch loss = 2097.1197538375854 | accuracy = 0.5875


Epoch[1] Batch[785] Speed: 1.2564542325007282 samples/sec                   batch loss = 2109.2657470703125 | accuracy = 0.5885350318471337


[Epoch 1] training: accuracy=0.5878807106598984
[Epoch 1] time cost: 645.0215308666229
[Epoch 1] validation: validation accuracy=0.6633333333333333


Epoch[2] Batch[5] Speed: 1.2556008867599837 samples/sec                   batch loss = 13.172142744064331 | accuracy = 0.7


Epoch[2] Batch[10] Speed: 1.2605283226406063 samples/sec                   batch loss = 26.220729112625122 | accuracy = 0.65


Epoch[2] Batch[15] Speed: 1.263449802370589 samples/sec                   batch loss = 40.21626615524292 | accuracy = 0.6333333333333333


Epoch[2] Batch[20] Speed: 1.2585511852110849 samples/sec                   batch loss = 53.403315782547 | accuracy = 0.625


Epoch[2] Batch[25] Speed: 1.2558366045645484 samples/sec                   batch loss = 67.25272250175476 | accuracy = 0.63


Epoch[2] Batch[30] Speed: 1.2583423830764315 samples/sec                   batch loss = 78.6779717206955 | accuracy = 0.6416666666666667


Epoch[2] Batch[35] Speed: 1.2583173730013584 samples/sec                   batch loss = 89.28706157207489 | accuracy = 0.65


Epoch[2] Batch[40] Speed: 1.2590414628076871 samples/sec                   batch loss = 101.57397949695587 | accuracy = 0.66875


Epoch[2] Batch[45] Speed: 1.2533964986632453 samples/sec                   batch loss = 115.81864774227142 | accuracy = 0.6611111111111111


Epoch[2] Batch[50] Speed: 1.2547347171808867 samples/sec                   batch loss = 127.1613416671753 | accuracy = 0.665


Epoch[2] Batch[55] Speed: 1.2498846942761204 samples/sec                   batch loss = 139.1936137676239 | accuracy = 0.6681818181818182


Epoch[2] Batch[60] Speed: 1.2546244659015677 samples/sec                   batch loss = 151.18537747859955 | accuracy = 0.6708333333333333


Epoch[2] Batch[65] Speed: 1.2628028473026294 samples/sec                   batch loss = 162.06235301494598 | accuracy = 0.6846153846153846


Epoch[2] Batch[70] Speed: 1.2582776420584068 samples/sec                   batch loss = 175.10628354549408 | accuracy = 0.6714285714285714


Epoch[2] Batch[75] Speed: 1.2607542410040815 samples/sec                   batch loss = 186.7517079114914 | accuracy = 0.6733333333333333


Epoch[2] Batch[80] Speed: 1.2556651646578927 samples/sec                   batch loss = 199.65957641601562 | accuracy = 0.671875


Epoch[2] Batch[85] Speed: 1.2568179266059842 samples/sec                   batch loss = 213.125079870224 | accuracy = 0.6676470588235294


Epoch[2] Batch[90] Speed: 1.2581921488230838 samples/sec                   batch loss = 224.74113249778748 | accuracy = 0.6694444444444444


Epoch[2] Batch[95] Speed: 1.2598189824800536 samples/sec                   batch loss = 237.0222294330597 | accuracy = 0.6684210526315789


Epoch[2] Batch[100] Speed: 1.2562408578898165 samples/sec                   batch loss = 248.37049412727356 | accuracy = 0.6775


Epoch[2] Batch[105] Speed: 1.2544548575161156 samples/sec                   batch loss = 261.9659354686737 | accuracy = 0.6738095238095239


Epoch[2] Batch[110] Speed: 1.258026196781648 samples/sec                   batch loss = 275.1186457872391 | accuracy = 0.6727272727272727


Epoch[2] Batch[115] Speed: 1.2579532822995423 samples/sec                   batch loss = 288.3977061510086 | accuracy = 0.6717391304347826


Epoch[2] Batch[120] Speed: 1.253440229631298 samples/sec                   batch loss = 301.8564866781235 | accuracy = 0.6708333333333333


Epoch[2] Batch[125] Speed: 1.257276138109897 samples/sec                   batch loss = 314.0952579975128 | accuracy = 0.674


Epoch[2] Batch[130] Speed: 1.25501291800623 samples/sec                   batch loss = 327.1493990421295 | accuracy = 0.676923076923077


Epoch[2] Batch[135] Speed: 1.2569706580509181 samples/sec                   batch loss = 337.1971181631088 | accuracy = 0.6814814814814815


Epoch[2] Batch[140] Speed: 1.2577447727834643 samples/sec                   batch loss = 348.220730304718 | accuracy = 0.6821428571428572


Epoch[2] Batch[145] Speed: 1.254292140368862 samples/sec                   batch loss = 359.3144768476486 | accuracy = 0.6879310344827586


Epoch[2] Batch[150] Speed: 1.2494976416845576 samples/sec                   batch loss = 373.3908885717392 | accuracy = 0.685


Epoch[2] Batch[155] Speed: 1.257246271192855 samples/sec                   batch loss = 384.2512717247009 | accuracy = 0.6919354838709677


Epoch[2] Batch[160] Speed: 1.2526703740812195 samples/sec                   batch loss = 397.09027433395386 | accuracy = 0.6890625


Epoch[2] Batch[165] Speed: 1.2512392157798646 samples/sec                   batch loss = 410.1105079650879 | accuracy = 0.6893939393939394


Epoch[2] Batch[170] Speed: 1.2515247376415854 samples/sec                   batch loss = 424.00187849998474 | accuracy = 0.6838235294117647


Epoch[2] Batch[175] Speed: 1.25470872428271 samples/sec                   batch loss = 436.67844212055206 | accuracy = 0.68


Epoch[2] Batch[180] Speed: 1.2561588389668827 samples/sec                   batch loss = 447.6339375972748 | accuracy = 0.6847222222222222


Epoch[2] Batch[185] Speed: 1.252746325532586 samples/sec                   batch loss = 460.51434302330017 | accuracy = 0.6824324324324325


Epoch[2] Batch[190] Speed: 1.2552427801462758 samples/sec                   batch loss = 474.2884796857834 | accuracy = 0.6776315789473685


Epoch[2] Batch[195] Speed: 1.2503815675395973 samples/sec                   batch loss = 486.86372315883636 | accuracy = 0.6743589743589744


Epoch[2] Batch[200] Speed: 1.2552634418151698 samples/sec                   batch loss = 501.4899562597275 | accuracy = 0.6725


Epoch[2] Batch[205] Speed: 1.2627403076314008 samples/sec                   batch loss = 513.8706176280975 | accuracy = 0.6731707317073171


Epoch[2] Batch[210] Speed: 1.2636199485038382 samples/sec                   batch loss = 528.1466629505157 | accuracy = 0.6702380952380952


Epoch[2] Batch[215] Speed: 1.2567280188537489 samples/sec                   batch loss = 539.6394245624542 | accuracy = 0.6697674418604651


Epoch[2] Batch[220] Speed: 1.2567547544749167 samples/sec                   batch loss = 552.8648250102997 | accuracy = 0.6659090909090909


Epoch[2] Batch[225] Speed: 1.258190922183502 samples/sec                   batch loss = 563.937368273735 | accuracy = 0.6688888888888889


Epoch[2] Batch[230] Speed: 1.2575014576470611 samples/sec                   batch loss = 575.8376585245132 | accuracy = 0.6673913043478261


Epoch[2] Batch[235] Speed: 1.2584755666705172 samples/sec                   batch loss = 587.0777771472931 | accuracy = 0.6680851063829787


Epoch[2] Batch[240] Speed: 1.2593125024394785 samples/sec                   batch loss = 598.5560871362686 | accuracy = 0.6697916666666667


Epoch[2] Batch[245] Speed: 1.250582795237575 samples/sec                   batch loss = 609.0664263963699 | accuracy = 0.6693877551020408


Epoch[2] Batch[250] Speed: 1.2495522687754783 samples/sec                   batch loss = 618.5375620126724 | accuracy = 0.672


Epoch[2] Batch[255] Speed: 1.2571574325694372 samples/sec                   batch loss = 631.8005939722061 | accuracy = 0.6705882352941176


Epoch[2] Batch[260] Speed: 1.265885022721427 samples/sec                   batch loss = 644.6417778730392 | accuracy = 0.6701923076923076


Epoch[2] Batch[265] Speed: 1.2552208983022304 samples/sec                   batch loss = 657.2396315336227 | accuracy = 0.6698113207547169


Epoch[2] Batch[270] Speed: 1.252549637715675 samples/sec                   batch loss = 667.4198342561722 | accuracy = 0.6722222222222223


Epoch[2] Batch[275] Speed: 1.2610154978816235 samples/sec                   batch loss = 677.7572929859161 | accuracy = 0.6736363636363636


Epoch[2] Batch[280] Speed: 1.259624796262423 samples/sec                   batch loss = 688.8042124509811 | accuracy = 0.6741071428571429


Epoch[2] Batch[285] Speed: 1.2588366539221445 samples/sec                   batch loss = 703.4941166639328 | accuracy = 0.6719298245614035


Epoch[2] Batch[290] Speed: 1.263434388709559 samples/sec                   batch loss = 716.6090266704559 | accuracy = 0.6698275862068965


Epoch[2] Batch[295] Speed: 1.2674925622556952 samples/sec                   batch loss = 728.6960711479187 | accuracy = 0.6694915254237288


Epoch[2] Batch[300] Speed: 1.2591755504684827 samples/sec                   batch loss = 744.5147354602814 | accuracy = 0.6658333333333334


Epoch[2] Batch[305] Speed: 1.2615229695433763 samples/sec                   batch loss = 756.4877233505249 | accuracy = 0.6672131147540984


Epoch[2] Batch[310] Speed: 1.2659286742848277 samples/sec                   batch loss = 767.7443311214447 | accuracy = 0.6693548387096774


Epoch[2] Batch[315] Speed: 1.2596638557035602 samples/sec                   batch loss = 781.8322463035583 | accuracy = 0.6682539682539682


Epoch[2] Batch[320] Speed: 1.262628645052918 samples/sec                   batch loss = 793.5015133619308 | accuracy = 0.66953125


Epoch[2] Batch[325] Speed: 1.26201480623617 samples/sec                   batch loss = 806.2793799638748 | accuracy = 0.6692307692307692


Epoch[2] Batch[330] Speed: 1.2605047408628107 samples/sec                   batch loss = 815.8192045688629 | accuracy = 0.671969696969697


Epoch[2] Batch[335] Speed: 1.2555647098434572 samples/sec                   batch loss = 829.2881369590759 | accuracy = 0.6701492537313433


Epoch[2] Batch[340] Speed: 1.2559582575366257 samples/sec                   batch loss = 843.2415647506714 | accuracy = 0.6676470588235294


Epoch[2] Batch[345] Speed: 1.2527242500234272 samples/sec                   batch loss = 854.8840482234955 | accuracy = 0.6688405797101449


Epoch[2] Batch[350] Speed: 1.2561956144759312 samples/sec                   batch loss = 867.1444759368896 | accuracy = 0.6685714285714286


Epoch[2] Batch[355] Speed: 1.2580687419946968 samples/sec                   batch loss = 879.9810743331909 | accuracy = 0.6697183098591549


Epoch[2] Batch[360] Speed: 1.2526050931251405 samples/sec                   batch loss = 890.1806747913361 | accuracy = 0.6715277777777777


Epoch[2] Batch[365] Speed: 1.2556767241008315 samples/sec                   batch loss = 901.0668650865555 | accuracy = 0.6719178082191781


Epoch[2] Batch[370] Speed: 1.2568839299078787 samples/sec                   batch loss = 914.1588896512985 | accuracy = 0.6702702702702703


Epoch[2] Batch[375] Speed: 1.2517104574410494 samples/sec                   batch loss = 927.183641910553 | accuracy = 0.6693333333333333


Epoch[2] Batch[380] Speed: 1.2540781870345667 samples/sec                   batch loss = 939.0763653516769 | accuracy = 0.6703947368421053


Epoch[2] Batch[385] Speed: 1.2530515336856054 samples/sec                   batch loss = 950.7740111351013 | accuracy = 0.6701298701298701


Epoch[2] Batch[390] Speed: 1.2510592331673105 samples/sec                   batch loss = 962.719352722168 | accuracy = 0.6692307692307692


Epoch[2] Batch[395] Speed: 1.2554033016300128 samples/sec                   batch loss = 974.6868802309036 | accuracy = 0.670253164556962


Epoch[2] Batch[400] Speed: 1.2526321212342937 samples/sec                   batch loss = 985.7767597436905 | accuracy = 0.671875


Epoch[2] Batch[405] Speed: 1.2524559451288475 samples/sec                   batch loss = 996.9517427682877 | accuracy = 0.6728395061728395


Epoch[2] Batch[410] Speed: 1.2472497964519296 samples/sec                   batch loss = 1008.7048263549805 | accuracy = 0.6731707317073171


Epoch[2] Batch[415] Speed: 1.2569216895501085 samples/sec                   batch loss = 1019.4707325696945 | accuracy = 0.6746987951807228


Epoch[2] Batch[420] Speed: 1.2583504053865735 samples/sec                   batch loss = 1030.7080940008163 | accuracy = 0.6767857142857143


Epoch[2] Batch[425] Speed: 1.257788619112518 samples/sec                   batch loss = 1041.8069850206375 | accuracy = 0.6770588235294117


Epoch[2] Batch[430] Speed: 1.2553295637267776 samples/sec                   batch loss = 1052.8945055007935 | accuracy = 0.6767441860465117


Epoch[2] Batch[435] Speed: 1.256119338261163 samples/sec                   batch loss = 1063.5074409246445 | accuracy = 0.6764367816091954


Epoch[2] Batch[440] Speed: 1.2634956649936755 samples/sec                   batch loss = 1077.1994417905807 | accuracy = 0.6755681818181818


Epoch[2] Batch[445] Speed: 1.2579355501715777 samples/sec                   batch loss = 1087.7333121299744 | accuracy = 0.6758426966292135


Epoch[2] Batch[450] Speed: 1.2566394417018412 samples/sec                   batch loss = 1098.319352388382 | accuracy = 0.6761111111111111


Epoch[2] Batch[455] Speed: 1.2587135928139086 samples/sec                   batch loss = 1106.9495315551758 | accuracy = 0.6774725274725275


Epoch[2] Batch[460] Speed: 1.257191722978669 samples/sec                   batch loss = 1120.8337072134018 | accuracy = 0.6771739130434783


Epoch[2] Batch[465] Speed: 1.262276965747043 samples/sec                   batch loss = 1130.9505282640457 | accuracy = 0.6779569892473118


Epoch[2] Batch[470] Speed: 1.26267663378371 samples/sec                   batch loss = 1142.1124700307846 | accuracy = 0.6787234042553192


Epoch[2] Batch[475] Speed: 1.2621696580336887 samples/sec                   batch loss = 1152.706554889679 | accuracy = 0.6794736842105263


Epoch[2] Batch[480] Speed: 1.25964096827956 samples/sec                   batch loss = 1168.6461923122406 | accuracy = 0.6776041666666667


Epoch[2] Batch[485] Speed: 1.2627153125046569 samples/sec                   batch loss = 1180.3837379217148 | accuracy = 0.6778350515463918


Epoch[2] Batch[490] Speed: 1.259289155173246 samples/sec                   batch loss = 1193.131644129753 | accuracy = 0.6780612244897959


Epoch[2] Batch[495] Speed: 1.263558374673787 samples/sec                   batch loss = 1204.509760260582 | accuracy = 0.6777777777777778


Epoch[2] Batch[500] Speed: 1.2571465052496449 samples/sec                   batch loss = 1213.7950080633163 | accuracy = 0.6795


Epoch[2] Batch[505] Speed: 1.252853533771387 samples/sec                   batch loss = 1224.9566770792007 | accuracy = 0.6792079207920793


Epoch[2] Batch[510] Speed: 1.2537078322215727 samples/sec                   batch loss = 1238.3803848028183 | accuracy = 0.6779411764705883


Epoch[2] Batch[515] Speed: 1.2512390291458795 samples/sec                   batch loss = 1254.063522696495 | accuracy = 0.6762135922330097


Epoch[2] Batch[520] Speed: 1.2585865902727371 samples/sec                   batch loss = 1265.1998314857483 | accuracy = 0.6774038461538462


Epoch[2] Batch[525] Speed: 1.2513398196011427 samples/sec                   batch loss = 1278.3088227510452 | accuracy = 0.6771428571428572


Epoch[2] Batch[530] Speed: 1.2553071153081674 samples/sec                   batch loss = 1288.0020183324814 | accuracy = 0.6787735849056604


Epoch[2] Batch[535] Speed: 1.2574974047632432 samples/sec                   batch loss = 1300.373840212822 | accuracy = 0.6794392523364486


Epoch[2] Batch[540] Speed: 1.2535298548757328 samples/sec                   batch loss = 1313.485037446022 | accuracy = 0.6796296296296296


Epoch[2] Batch[545] Speed: 1.2588221082241435 samples/sec                   batch loss = 1327.2227360010147 | accuracy = 0.6788990825688074


Epoch[2] Batch[550] Speed: 1.2527831821418158 samples/sec                   batch loss = 1339.702971637249 | accuracy = 0.6786363636363636


Epoch[2] Batch[555] Speed: 1.252896665472309 samples/sec                   batch loss = 1352.1254466176033 | accuracy = 0.6774774774774774


Epoch[2] Batch[560] Speed: 1.2600731324202121 samples/sec                   batch loss = 1361.602279484272 | accuracy = 0.6785714285714286


Epoch[2] Batch[565] Speed: 1.265612101561178 samples/sec                   batch loss = 1371.8440136313438 | accuracy = 0.6787610619469027


Epoch[2] Batch[570] Speed: 1.2564478339715845 samples/sec                   batch loss = 1382.002269089222 | accuracy = 0.6807017543859649


Epoch[2] Batch[575] Speed: 1.2592192130795594 samples/sec                   batch loss = 1392.3523740172386 | accuracy = 0.6808695652173913


Epoch[2] Batch[580] Speed: 1.2645270253464422 samples/sec                   batch loss = 1407.0771805644035 | accuracy = 0.6801724137931034


Epoch[2] Batch[585] Speed: 1.2601567044813966 samples/sec                   batch loss = 1419.7277604937553 | accuracy = 0.6799145299145299


Epoch[2] Batch[590] Speed: 1.2601180877005962 samples/sec                   batch loss = 1430.2526106238365 | accuracy = 0.6805084745762712


Epoch[2] Batch[595] Speed: 1.2592850907578808 samples/sec                   batch loss = 1443.2906442284584 | accuracy = 0.6802521008403362


Epoch[2] Batch[600] Speed: 1.2591360488103505 samples/sec                   batch loss = 1457.4438059926033 | accuracy = 0.6804166666666667


Epoch[2] Batch[605] Speed: 1.261045923293286 samples/sec                   batch loss = 1469.3442591428757 | accuracy = 0.6805785123966942


Epoch[2] Batch[610] Speed: 1.2596456024428866 samples/sec                   batch loss = 1481.1268239021301 | accuracy = 0.6799180327868852


Epoch[2] Batch[615] Speed: 1.259909238439466 samples/sec                   batch loss = 1490.2282818555832 | accuracy = 0.6808943089430894


Epoch[2] Batch[620] Speed: 1.2586972557347 samples/sec                   batch loss = 1500.5346161723137 | accuracy = 0.6810483870967742


Epoch[2] Batch[625] Speed: 1.2573373838606035 samples/sec                   batch loss = 1510.798007786274 | accuracy = 0.6812


Epoch[2] Batch[630] Speed: 1.2564455756827402 samples/sec                   batch loss = 1523.1309671998024 | accuracy = 0.6821428571428572


Epoch[2] Batch[635] Speed: 1.2530603309748538 samples/sec                   batch loss = 1533.1558602452278 | accuracy = 0.684251968503937


Epoch[2] Batch[640] Speed: 1.2587423017723978 samples/sec                   batch loss = 1545.9774460196495 | accuracy = 0.683984375


Epoch[2] Batch[645] Speed: 1.2585966928822798 samples/sec                   batch loss = 1557.1476470828056 | accuracy = 0.6841085271317829


Epoch[2] Batch[650] Speed: 1.2585164429803823 samples/sec                   batch loss = 1566.8128609061241 | accuracy = 0.6846153846153846


Epoch[2] Batch[655] Speed: 1.260962422796452 samples/sec                   batch loss = 1578.9529946446419 | accuracy = 0.6851145038167938


Epoch[2] Batch[660] Speed: 1.257603542383025 samples/sec                   batch loss = 1588.7304394841194 | accuracy = 0.6852272727272727


Epoch[2] Batch[665] Speed: 1.2585347579390302 samples/sec                   batch loss = 1602.5912664532661 | accuracy = 0.6857142857142857


Epoch[2] Batch[670] Speed: 1.2546437936613521 samples/sec                   batch loss = 1612.736654818058 | accuracy = 0.6861940298507463


Epoch[2] Batch[675] Speed: 1.2580707231057757 samples/sec                   batch loss = 1627.1481578946114 | accuracy = 0.6851851851851852


Epoch[2] Batch[680] Speed: 1.2576420994714104 samples/sec                   batch loss = 1640.0344713330269 | accuracy = 0.6856617647058824


Epoch[2] Batch[685] Speed: 1.2584346930158712 samples/sec                   batch loss = 1649.6130294203758 | accuracy = 0.6861313868613139


Epoch[2] Batch[690] Speed: 1.2544588908115153 samples/sec                   batch loss = 1660.333364546299 | accuracy = 0.686231884057971


Epoch[2] Batch[695] Speed: 1.2532740307364543 samples/sec                   batch loss = 1670.5208334326744 | accuracy = 0.6866906474820144


Epoch[2] Batch[700] Speed: 1.2567962722540607 samples/sec                   batch loss = 1682.2233079075813 | accuracy = 0.6875


Epoch[2] Batch[705] Speed: 1.2561749221087235 samples/sec                   batch loss = 1692.0986500382423 | accuracy = 0.6879432624113475


Epoch[2] Batch[710] Speed: 1.255289739468444 samples/sec                   batch loss = 1702.3748183846474 | accuracy = 0.6887323943661972


Epoch[2] Batch[715] Speed: 1.2517484672303454 samples/sec                   batch loss = 1715.8565321564674 | accuracy = 0.6884615384615385


Epoch[2] Batch[720] Speed: 1.251821037784109 samples/sec                   batch loss = 1730.7412714362144 | accuracy = 0.6878472222222223


Epoch[2] Batch[725] Speed: 1.2492895069223464 samples/sec                   batch loss = 1743.8411473631859 | accuracy = 0.6886206896551724


Epoch[2] Batch[730] Speed: 1.254445102676322 samples/sec                   batch loss = 1757.2580481171608 | accuracy = 0.688013698630137


Epoch[2] Batch[735] Speed: 1.2533911612593935 samples/sec                   batch loss = 1768.0207259058952 | accuracy = 0.6880952380952381


Epoch[2] Batch[740] Speed: 1.2519206141667205 samples/sec                   batch loss = 1779.3777489066124 | accuracy = 0.6875


Epoch[2] Batch[745] Speed: 1.2523066456938918 samples/sec                   batch loss = 1790.5638671517372 | accuracy = 0.687248322147651


Epoch[2] Batch[750] Speed: 1.257771551683422 samples/sec                   batch loss = 1801.7097229361534 | accuracy = 0.6883333333333334


Epoch[2] Batch[755] Speed: 1.2594514696664303 samples/sec                   batch loss = 1814.3281254172325 | accuracy = 0.6874172185430464


Epoch[2] Batch[760] Speed: 1.2527686824555653 samples/sec                   batch loss = 1825.5620965957642 | accuracy = 0.687828947368421


Epoch[2] Batch[765] Speed: 1.2607027036491076 samples/sec                   batch loss = 1839.259145975113 | accuracy = 0.6872549019607843


Epoch[2] Batch[770] Speed: 1.2538275738371398 samples/sec                   batch loss = 1849.4387214183807 | accuracy = 0.687012987012987


Epoch[2] Batch[775] Speed: 1.2551276507562688 samples/sec                   batch loss = 1858.3429446220398 | accuracy = 0.6880645161290323


Epoch[2] Batch[780] Speed: 1.25955519512719 samples/sec                   batch loss = 1870.787281036377 | accuracy = 0.6878205128205128


Epoch[2] Batch[785] Speed: 1.2620543937914073 samples/sec                   batch loss = 1882.230185866356 | accuracy = 0.6885350318471337


[Epoch 2] training: accuracy=0.6890862944162437
[Epoch 2] time cost: 642.7178499698639
[Epoch 2] validation: validation accuracy=0.7455555555555555
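
The per-batch lines in the log above follow a regular pattern, so they can be read back into numbers for plotting or comparison. The logged `batch loss` grows steadily within each epoch, which suggests it is a running sum rather than a per-batch value. The following is a minimal sketch, assuming the log format is exactly as printed above; the names `BATCH_RE` and `parse_batch_line` are hypothetical helpers and not part of MXNet.

```python
import re

# Assumption: each per-batch line looks exactly like the log printed above.
BATCH_RE = re.compile(
    r"Epoch\[(\d+)\] Batch\[(\d+)\] Speed: ([\d.]+) samples/sec\s+"
    r"batch loss = ([\d.]+) \| accuracy = ([\d.]+)"
)

def parse_batch_line(line):
    """Return the fields of one per-batch log line as a dict, or None if it does not match."""
    match = BATCH_RE.search(line)
    if match is None:
        return None
    epoch, batch, speed, loss, accuracy = match.groups()
    return {
        "epoch": int(epoch),
        "batch": int(batch),
        "speed": float(speed),
        "loss": float(loss),
        "accuracy": float(accuracy),
    }

# Last per-batch line of epoch 2, copied from the log above.
sample = (
    "Epoch[2] Batch[785] Speed: 1.2620543937914073 samples/sec"
    "                   batch loss = 1882.230185866356 | accuracy = 0.6885350318471337"
)
record = parse_batch_line(sample)

# Assumption: the logged loss is a running sum, so dividing by the batch
# index gives an average loss per batch up to that point.
mean_loss = record["loss"] / record["batch"]
print(record["epoch"], record["batch"], mean_loss)
```

Summary lines such as `[Epoch 2] validation: ...` do not match this pattern, so `parse_batch_line` returns `None` for them and they can be filtered out or handled separately.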


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).