<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training neural networks and running inference on GPUs is often faster than on central processing units (CPUs).

## Prerequisites

Before you begin, make sure your machine has at least one NVIDIA GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need the GPU-enabled version of MXNet; see the [installation instructions](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html) for your system.

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the context, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[10:26:39] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

If there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to `npx.cpu()`, which is why the output below shows no GPU context.

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[10:26:39] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
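
As a minimal sketch of this indexing, the hypothetical helper below (`pick_gpu_ids` is not part of MXNet) turns a GPU count, such as the one reported by `npx.num_gpus()`, into the list of ids to use. It is plain Python so that it runs without a GPU, and it assumes that an empty list means "fall back to the CPU".

```python
def pick_gpu_ids(num_available, num_requested):
    """Return the 0-based ids of the GPUs to use.

    Hypothetical helper for illustration; it is not an MXNet function.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # Never ask for more GPUs than the machine has.
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, the last valid id is 7.
all_ids = pick_gpu_ids(num_available=8, num_requested=8)
# Requesting two GPUs on a single-GPU machine yields only id 0.
one_id = pick_gpu_ids(num_available=1, num_requested=2)
# With no GPU at all the list is empty (assumed here to mean "use the CPU").
no_ids = pick_gpu_ids(num_available=0, num_requested=2)
print(all_ids, one_id, no_ids)
```

With MXNet, each id would then be passed to `npx.gpu(i)`, for example `[npx.gpu(i) for i in pick_gpu_ids(npx.num_gpus(), 2)]`.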

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and the parameters to that GPU. To demonstrate this, you can reuse the `LeafNetwork` defined in [Training Neural Networks](6-train-nn.md), as the following code example shows.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you can use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[10:26:40] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 2.9002156, -3.6198332]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
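
The idea behind data parallelism can be checked with a small plain-Python sketch. It assumes a toy one-parameter model `w * x` with a squared-error loss, and the helpers `split_batch` and `grad_sum` are hypothetical names used only for illustration. The sketch splits a batch into equal parts, computes a partial gradient per part, and shows that summing the partial gradients and dividing by the full batch size matches the single-device result.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts equal slices (hypothetical helper)."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    step = len(batch) // num_parts
    return [batch[i * step:(i + 1) * step] for i in range(num_parts)]


def grad_sum(w, xs, ys):
    """Sum of d/dw (w*x - y)^2 over the given samples."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))


w = 0.5
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
num_devices = 2  # stands in for two GPUs

# Each "device" gets its own slice of the batch and computes a partial gradient.
x_parts = split_batch(xs, num_devices)
y_parts = split_batch(ys, num_devices)
partial = [grad_sum(w, xp, yp) for xp, yp in zip(x_parts, y_parts)]

# Summing the partial gradients and dividing by the full batch size
# gives the same result as processing the whole batch on one device.
parallel_grad = sum(partial) / len(xs)
single_grad = grad_sum(w, xs, ys) / len(xs)
print(x_parts, parallel_grad, single_grad)
```

This sketch covers only the arithmetic. In MXNet, `gluon.utils.split_and_load` also copies each slice to its device.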

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations on the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
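
To make clear what a call such as `acc.update(label_list, outputs)` accumulates when the labels and outputs arrive as one chunk per device, here is a plain-Python sketch. It is not the implementation of `gluon.metric.Accuracy`. The function name `accuracy_over_devices` is hypothetical, and the sketch assumes that each output row holds one score per class and that the predicted class is the index of the largest score.

```python
def accuracy_over_devices(label_chunks, output_chunks):
    """Fraction of correct predictions across all per-device chunks."""
    correct = 0
    total = 0
    for labels, outputs in zip(label_chunks, output_chunks):
        for label, scores in zip(labels, outputs):
            # Predicted class = index of the highest score.
            predicted = max(range(len(scores)), key=lambda i: scores[i])
            correct += int(predicted == label)
            total += 1
    return correct / total if total else 0.0


# Two "devices", each holding two samples of a two-class problem.
label_chunks = [[0, 1], [1, 0]]
output_chunks = [
    [[2.9, -3.6], [0.1, 0.7]],   # device 0: both predictions correct
    [[1.2, 0.3], [0.8, -0.5]],   # device 1: first wrong, second correct
]
result = accuracy_over_devices(label_chunks, output_chunks)
print(result)
```

The counts are pooled over all chunks, so the result does not depend on how the batch was divided among devices.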

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7756765896079632 samples/sec                   batch loss = 15.607667684555054 | accuracy = 0.25


Epoch[1] Batch[10] Speed: 1.2508767842211272 samples/sec                   batch loss = 29.313542366027832 | accuracy = 0.375


Epoch[1] Batch[15] Speed: 1.2509311588870144 samples/sec                   batch loss = 43.5167396068573 | accuracy = 0.43333333333333335


Epoch[1] Batch[20] Speed: 1.2493058797861083 samples/sec                   batch loss = 58.32186150550842 | accuracy = 0.4


Epoch[1] Batch[25] Speed: 1.249975022427662 samples/sec                   batch loss = 72.45703744888306 | accuracy = 0.42


Epoch[1] Batch[30] Speed: 1.2470128417553124 samples/sec                   batch loss = 87.11119079589844 | accuracy = 0.44166666666666665


Epoch[1] Batch[35] Speed: 1.2566585492128626 samples/sec                   batch loss = 101.08767294883728 | accuracy = 0.45


Epoch[1] Batch[40] Speed: 1.2492989026610088 samples/sec                   batch loss = 116.67891597747803 | accuracy = 0.41875


Epoch[1] Batch[45] Speed: 1.2496069937168153 samples/sec                   batch loss = 130.2375512123108 | accuracy = 0.43333333333333335


Epoch[1] Batch[50] Speed: 1.253976954985221 samples/sec                   batch loss = 144.39885306358337 | accuracy = 0.445


Epoch[1] Batch[55] Speed: 1.2563808415209463 samples/sec                   batch loss = 157.60334873199463 | accuracy = 0.4636363636363636


Epoch[1] Batch[60] Speed: 1.2444563916258369 samples/sec                   batch loss = 171.85973143577576 | accuracy = 0.4583333333333333


Epoch[1] Batch[65] Speed: 1.2527015205451755 samples/sec                   batch loss = 185.98854422569275 | accuracy = 0.45


Epoch[1] Batch[70] Speed: 1.2528800112314913 samples/sec                   batch loss = 199.69034242630005 | accuracy = 0.4607142857142857


Epoch[1] Batch[75] Speed: 1.251474232136418 samples/sec                   batch loss = 213.95502352714539 | accuracy = 0.4533333333333333


Epoch[1] Batch[80] Speed: 1.2486555746614525 samples/sec                   batch loss = 227.75370573997498 | accuracy = 0.4625


Epoch[1] Batch[85] Speed: 1.2587260583850206 samples/sec                   batch loss = 241.4567906856537 | accuracy = 0.47352941176470587


Epoch[1] Batch[90] Speed: 1.2532457579032572 samples/sec                   batch loss = 256.3465983867645 | accuracy = 0.475


Epoch[1] Batch[95] Speed: 1.2496202103225926 samples/sec                   batch loss = 270.0895297527313 | accuracy = 0.4789473684210526


Epoch[1] Batch[100] Speed: 1.2511281783966346 samples/sec                   batch loss = 283.40900182724 | accuracy = 0.4875


Epoch[1] Batch[105] Speed: 1.258117328185245 samples/sec                   batch loss = 297.1597535610199 | accuracy = 0.4928571428571429


Epoch[1] Batch[110] Speed: 1.2492174156104077 samples/sec                   batch loss = 311.355504989624 | accuracy = 0.4909090909090909


Epoch[1] Batch[115] Speed: 1.2550136690525857 samples/sec                   batch loss = 325.1802659034729 | accuracy = 0.4934782608695652


Epoch[1] Batch[120] Speed: 1.2540906547043342 samples/sec                   batch loss = 338.52592945098877 | accuracy = 0.5020833333333333


Epoch[1] Batch[125] Speed: 1.2492955536686423 samples/sec                   batch loss = 352.4100286960602 | accuracy = 0.5


Epoch[1] Batch[130] Speed: 1.2577758892181874 samples/sec                   batch loss = 365.95217967033386 | accuracy = 0.5019230769230769


Epoch[1] Batch[135] Speed: 1.2592325392847423 samples/sec                   batch loss = 379.66756105422974 | accuracy = 0.5055555555555555


Epoch[1] Batch[140] Speed: 1.2553553944905596 samples/sec                   batch loss = 393.28396129608154 | accuracy = 0.5053571428571428


Epoch[1] Batch[145] Speed: 1.2544856237738602 samples/sec                   batch loss = 406.8474164009094 | accuracy = 0.506896551724138


Epoch[1] Batch[150] Speed: 1.2550504714252093 samples/sec                   batch loss = 420.3308494091034 | accuracy = 0.5083333333333333


Epoch[1] Batch[155] Speed: 1.2520026411341116 samples/sec                   batch loss = 434.27745270729065 | accuracy = 0.5080645161290323


Epoch[1] Batch[160] Speed: 1.2553537976488813 samples/sec                   batch loss = 448.4603052139282 | accuracy = 0.50625


Epoch[1] Batch[165] Speed: 1.2551904715982491 samples/sec                   batch loss = 462.2976677417755 | accuracy = 0.509090909090909


Epoch[1] Batch[170] Speed: 1.2518981006718466 samples/sec                   batch loss = 476.095023393631 | accuracy = 0.5088235294117647


Epoch[1] Batch[175] Speed: 1.2490228571143545 samples/sec                   batch loss = 489.7701346874237 | accuracy = 0.5071428571428571


Epoch[1] Batch[180] Speed: 1.2538417232228423 samples/sec                   batch loss = 502.7633697986603 | accuracy = 0.5125


Epoch[1] Batch[185] Speed: 1.2530966444973168 samples/sec                   batch loss = 516.552407503128 | accuracy = 0.5108108108108108


Epoch[1] Batch[190] Speed: 1.2554886981280386 samples/sec                   batch loss = 530.9670288562775 | accuracy = 0.506578947368421


Epoch[1] Batch[195] Speed: 1.2466519277752213 samples/sec                   batch loss = 544.6609344482422 | accuracy = 0.5051282051282051


Epoch[1] Batch[200] Speed: 1.2522933722011769 samples/sec                   batch loss = 558.4391992092133 | accuracy = 0.50625


Epoch[1] Batch[205] Speed: 1.2549383811271426 samples/sec                   batch loss = 572.6081056594849 | accuracy = 0.5036585365853659


Epoch[1] Batch[210] Speed: 1.2575595204320247 samples/sec                   batch loss = 586.4871244430542 | accuracy = 0.5059523809523809


Epoch[1] Batch[215] Speed: 1.2595582211035636 samples/sec                   batch loss = 600.4313395023346 | accuracy = 0.5046511627906977


Epoch[1] Batch[220] Speed: 1.254774318539996 samples/sec                   batch loss = 614.5113410949707 | accuracy = 0.5056818181818182


Epoch[1] Batch[225] Speed: 1.256897865870789 samples/sec                   batch loss = 628.6363158226013 | accuracy = 0.5055555555555555


Epoch[1] Batch[230] Speed: 1.2569842192334466 samples/sec                   batch loss = 641.9996132850647 | accuracy = 0.508695652173913


Epoch[1] Batch[235] Speed: 1.2555799320841559 samples/sec                   batch loss = 655.5090487003326 | accuracy = 0.5127659574468085


Epoch[1] Batch[240] Speed: 1.25189651261357 samples/sec                   batch loss = 669.2746493816376 | accuracy = 0.5145833333333333


Epoch[1] Batch[245] Speed: 1.255012354722053 samples/sec                   batch loss = 683.2123258113861 | accuracy = 0.5142857142857142


Epoch[1] Batch[250] Speed: 1.2565724288175526 samples/sec                   batch loss = 696.56769490242 | accuracy = 0.516


Epoch[1] Batch[255] Speed: 1.2511382549258507 samples/sec                   batch loss = 709.3442766666412 | accuracy = 0.5186274509803922


Epoch[1] Batch[260] Speed: 1.252773453283024 samples/sec                   batch loss = 723.2733063697815 | accuracy = 0.5182692307692308


Epoch[1] Batch[265] Speed: 1.2522276632092084 samples/sec                   batch loss = 736.545976638794 | accuracy = 0.5216981132075472


Epoch[1] Batch[270] Speed: 1.256666738324037 samples/sec                   batch loss = 749.9730522632599 | accuracy = 0.525


Epoch[1] Batch[275] Speed: 1.2533003387236432 samples/sec                   batch loss = 763.1042006015778 | accuracy = 0.5281818181818182


Epoch[1] Batch[280] Speed: 1.2529866808864671 samples/sec                   batch loss = 776.8902025222778 | accuracy = 0.5285714285714286


Epoch[1] Batch[285] Speed: 1.2498994066313622 samples/sec                   batch loss = 790.6936013698578 | accuracy = 0.5280701754385965


Epoch[1] Batch[290] Speed: 1.2516619913039073 samples/sec                   batch loss = 804.3050208091736 | accuracy = 0.5267241379310345


Epoch[1] Batch[295] Speed: 1.2510660433862357 samples/sec                   batch loss = 818.3442170619965 | accuracy = 0.5271186440677966


Epoch[1] Batch[300] Speed: 1.2509214587717195 samples/sec                   batch loss = 831.6771318912506 | accuracy = 0.5283333333333333


Epoch[1] Batch[305] Speed: 1.2486572474402373 samples/sec                   batch loss = 845.3659920692444 | accuracy = 0.5262295081967213


Epoch[1] Batch[310] Speed: 1.2542864202459572 samples/sec                   batch loss = 858.9237735271454 | accuracy = 0.5274193548387097


Epoch[1] Batch[315] Speed: 1.2518207575727391 samples/sec                   batch loss = 872.8527042865753 | accuracy = 0.5261904761904762


Epoch[1] Batch[320] Speed: 1.2547532976020812 samples/sec                   batch loss = 887.5229206085205 | accuracy = 0.5234375


Epoch[1] Batch[325] Speed: 1.2494380877773879 samples/sec                   batch loss = 901.395379781723 | accuracy = 0.5230769230769231


Epoch[1] Batch[330] Speed: 1.249297693300583 samples/sec                   batch loss = 914.1279423236847 | accuracy = 0.5257575757575758


Epoch[1] Batch[335] Speed: 1.2507231989952 samples/sec                   batch loss = 927.6337487697601 | accuracy = 0.5261194029850746


Epoch[1] Batch[340] Speed: 1.2504078474249287 samples/sec                   batch loss = 941.4616093635559 | accuracy = 0.525


Epoch[1] Batch[345] Speed: 1.250345318021241 samples/sec                   batch loss = 955.5198276042938 | accuracy = 0.5239130434782608


Epoch[1] Batch[350] Speed: 1.2455242957258437 samples/sec                   batch loss = 969.1939251422882 | accuracy = 0.5257142857142857


Epoch[1] Batch[355] Speed: 1.2487352224915031 samples/sec                   batch loss = 984.108457326889 | accuracy = 0.5232394366197183


Epoch[1] Batch[360] Speed: 1.2492962978876165 samples/sec                   batch loss = 997.4220106601715 | accuracy = 0.5263888888888889


Epoch[1] Batch[365] Speed: 1.2613835448293262 samples/sec                   batch loss = 1010.5880069732666 | accuracy = 0.5273972602739726


Epoch[1] Batch[370] Speed: 1.253634761722795 samples/sec                   batch loss = 1024.2164480686188 | accuracy = 0.5283783783783784


Epoch[1] Batch[375] Speed: 1.251205342623521 samples/sec                   batch loss = 1037.863327741623 | accuracy = 0.5286666666666666


Epoch[1] Batch[380] Speed: 1.249882738862576 samples/sec                   batch loss = 1052.3521184921265 | accuracy = 0.5276315789473685


Epoch[1] Batch[385] Speed: 1.2497241844997313 samples/sec                   batch loss = 1065.6489057540894 | accuracy = 0.5298701298701298


Epoch[1] Batch[390] Speed: 1.2549675752479028 samples/sec                   batch loss = 1078.5184943675995 | accuracy = 0.5301282051282051


Epoch[1] Batch[395] Speed: 1.2563532750731996 samples/sec                   batch loss = 1091.774816274643 | accuracy = 0.5303797468354431


Epoch[1] Batch[400] Speed: 1.2585611928398939 samples/sec                   batch loss = 1105.1997740268707 | accuracy = 0.531875


Epoch[1] Batch[405] Speed: 1.2539024473914464 samples/sec                   batch loss = 1118.2945358753204 | accuracy = 0.5339506172839507


Epoch[1] Batch[410] Speed: 1.254370914580661 samples/sec                   batch loss = 1131.7078256607056 | accuracy = 0.5347560975609756


Epoch[1] Batch[415] Speed: 1.2526790724873493 samples/sec                   batch loss = 1144.9096076488495 | accuracy = 0.5373493975903615


Epoch[1] Batch[420] Speed: 1.2518569058751827 samples/sec                   batch loss = 1159.4394953250885 | accuracy = 0.5369047619047619


Epoch[1] Batch[425] Speed: 1.255413071377886 samples/sec                   batch loss = 1173.2453706264496 | accuracy = 0.5364705882352941


Epoch[1] Batch[430] Speed: 1.2521110303762688 samples/sec                   batch loss = 1186.4302978515625 | accuracy = 0.5372093023255814


Epoch[1] Batch[435] Speed: 1.2503208111829722 samples/sec                   batch loss = 1199.9440915584564 | accuracy = 0.5373563218390804


Epoch[1] Batch[440] Speed: 1.25396776992286 samples/sec                   batch loss = 1213.0789070129395 | accuracy = 0.5392045454545454


Epoch[1] Batch[445] Speed: 1.253479468364788 samples/sec                   batch loss = 1227.1838872432709 | accuracy = 0.5393258426966292


Epoch[1] Batch[450] Speed: 1.251970968855748 samples/sec                   batch loss = 1240.6651060581207 | accuracy = 0.54


Epoch[1] Batch[455] Speed: 1.25404734702176 samples/sec                   batch loss = 1254.346792459488 | accuracy = 0.5406593406593406


Epoch[1] Batch[460] Speed: 1.256427980126871 samples/sec                   batch loss = 1267.1274662017822 | accuracy = 0.5423913043478261


Epoch[1] Batch[465] Speed: 1.257712055041455 samples/sec                   batch loss = 1281.460092306137 | accuracy = 0.543010752688172


Epoch[1] Batch[470] Speed: 1.2589184561207623 samples/sec                   batch loss = 1295.8127133846283 | accuracy = 0.5425531914893617


Epoch[1] Batch[475] Speed: 1.2567126746272177 samples/sec                   batch loss = 1309.2367932796478 | accuracy = 0.5436842105263158


Epoch[1] Batch[480] Speed: 1.251047105563766 samples/sec                   batch loss = 1323.4114928245544 | accuracy = 0.5432291666666667


Epoch[1] Batch[485] Speed: 1.2508569195273274 samples/sec                   batch loss = 1337.2914040088654 | accuracy = 0.5438144329896907


Epoch[1] Batch[490] Speed: 1.2529872423537545 samples/sec                   batch loss = 1351.3799324035645 | accuracy = 0.5438775510204081


Epoch[1] Batch[495] Speed: 1.2527826208573936 samples/sec                   batch loss = 1364.581606388092 | accuracy = 0.543939393939394


Epoch[1] Batch[500] Speed: 1.2528473589769573 samples/sec                   batch loss = 1377.2288105487823 | accuracy = 0.546


Epoch[1] Batch[505] Speed: 1.249007142558293 samples/sec                   batch loss = 1390.7381854057312 | accuracy = 0.5475247524752476


Epoch[1] Batch[510] Speed: 1.2583414392819634 samples/sec                   batch loss = 1404.0807359218597 | accuracy = 0.5470588235294118


Epoch[1] Batch[515] Speed: 1.2534592399758173 samples/sec                   batch loss = 1417.1063237190247 | accuracy = 0.5485436893203883


Epoch[1] Batch[520] Speed: 1.2507364392229834 samples/sec                   batch loss = 1429.6540236473083 | accuracy = 0.5509615384615385


Epoch[1] Batch[525] Speed: 1.248071212864557 samples/sec                   batch loss = 1442.5462009906769 | accuracy = 0.5519047619047619


Epoch[1] Batch[530] Speed: 1.2522093444350135 samples/sec                   batch loss = 1455.5202016830444 | accuracy = 0.5528301886792453


Epoch[1] Batch[535] Speed: 1.254042472751257 samples/sec                   batch loss = 1468.052169084549 | accuracy = 0.5537383177570093


Epoch[1] Batch[540] Speed: 1.2512287643624327 samples/sec                   batch loss = 1483.5805177688599 | accuracy = 0.5527777777777778


Epoch[1] Batch[545] Speed: 1.252927074638527 samples/sec                   batch loss = 1497.4403157234192 | accuracy = 0.5518348623853211


Epoch[1] Batch[550] Speed: 1.2584139268182128 samples/sec                   batch loss = 1510.8202447891235 | accuracy = 0.5527272727272727


Epoch[1] Batch[555] Speed: 1.2551128150627757 samples/sec                   batch loss = 1523.965814590454 | accuracy = 0.5527027027027027


Epoch[1] Batch[560] Speed: 1.2537740714310963 samples/sec                   batch loss = 1537.1476480960846 | accuracy = 0.553125


Epoch[1] Batch[565] Speed: 1.2528141471116572 samples/sec                   batch loss = 1550.4616532325745 | accuracy = 0.5530973451327433


Epoch[1] Batch[570] Speed: 1.2506412463582448 samples/sec                   batch loss = 1563.314516544342 | accuracy = 0.5539473684210526


Epoch[1] Batch[575] Speed: 1.2536434735273316 samples/sec                   batch loss = 1577.1195299625397 | accuracy = 0.552608695652174


Epoch[1] Batch[580] Speed: 1.2496110889732197 samples/sec                   batch loss = 1589.4682397842407 | accuracy = 0.5551724137931034


Epoch[1] Batch[585] Speed: 1.2534632668715724 samples/sec                   batch loss = 1602.6005041599274 | accuracy = 0.5555555555555556


Epoch[1] Batch[590] Speed: 1.2542610085273997 samples/sec                   batch loss = 1616.645807981491 | accuracy = 0.5563559322033899


Epoch[1] Batch[595] Speed: 1.255102486622487 samples/sec                   batch loss = 1629.7445788383484 | accuracy = 0.557563025210084


Epoch[1] Batch[600] Speed: 1.2557445813825707 samples/sec                   batch loss = 1643.1141633987427 | accuracy = 0.5575


Epoch[1] Batch[605] Speed: 1.2535561736043486 samples/sec                   batch loss = 1656.9239506721497 | accuracy = 0.5566115702479338


Epoch[1] Batch[610] Speed: 1.2564189473056724 samples/sec                   batch loss = 1671.4087624549866 | accuracy = 0.5565573770491803


Epoch[1] Batch[615] Speed: 1.2540274752335192 samples/sec                   batch loss = 1685.0094683170319 | accuracy = 0.556910569105691


Epoch[1] Batch[620] Speed: 1.2541049975182839 samples/sec                   batch loss = 1698.6579554080963 | accuracy = 0.5576612903225806


Epoch[1] Batch[625] Speed: 1.2527156445272294 samples/sec                   batch loss = 1711.3642945289612 | accuracy = 0.5596


Epoch[1] Batch[630] Speed: 1.2589495361862102 samples/sec                   batch loss = 1725.3156034946442 | accuracy = 0.5591269841269841


Epoch[1] Batch[635] Speed: 1.250631457512175 samples/sec                   batch loss = 1738.5500597953796 | accuracy = 0.5594488188976378


Epoch[1] Batch[640] Speed: 1.2492338796027165 samples/sec                   batch loss = 1751.1088095903397 | accuracy = 0.560546875


Epoch[1] Batch[645] Speed: 1.2505593976759737 samples/sec                   batch loss = 1764.0710154771805 | accuracy = 0.562015503875969


Epoch[1] Batch[650] Speed: 1.258220267832401 samples/sec                   batch loss = 1777.5702286958694 | accuracy = 0.561923076923077


Epoch[1] Batch[655] Speed: 1.2486907039571606 samples/sec                   batch loss = 1792.0606282949448 | accuracy = 0.5606870229007633


Epoch[1] Batch[660] Speed: 1.2583970311903232 samples/sec                   batch loss = 1804.382237792015 | accuracy = 0.5628787878787879


Epoch[1] Batch[665] Speed: 1.2571258757413346 samples/sec                   batch loss = 1817.0730239152908 | accuracy = 0.5631578947368421


Epoch[1] Batch[670] Speed: 1.2617884359184133 samples/sec                   batch loss = 1830.7215665578842 | accuracy = 0.5638059701492537


Epoch[1] Batch[675] Speed: 1.2572442926774874 samples/sec                   batch loss = 1842.6380873918533 | accuracy = 0.5651851851851852


Epoch[1] Batch[680] Speed: 1.254642573930752 samples/sec                   batch loss = 1854.312690615654 | accuracy = 0.5665441176470588


Epoch[1] Batch[685] Speed: 1.2546286879341826 samples/sec                   batch loss = 1866.8220442533493 | accuracy = 0.5686131386861314


Epoch[1] Batch[690] Speed: 1.2527452965720753 samples/sec                   batch loss = 1880.3988691568375 | accuracy = 0.5677536231884058


Epoch[1] Batch[695] Speed: 1.2570327216906794 samples/sec                   batch loss = 1893.26449406147 | accuracy = 0.5683453237410072


Epoch[1] Batch[700] Speed: 1.2537193556546857 samples/sec                   batch loss = 1906.0927294492722 | accuracy = 0.5685714285714286


Epoch[1] Batch[705] Speed: 1.251538928461637 samples/sec                   batch loss = 1918.708569407463 | accuracy = 0.5691489361702128


Epoch[1] Batch[710] Speed: 1.2537422157749483 samples/sec                   batch loss = 1931.844946026802 | accuracy = 0.569718309859155


Epoch[1] Batch[715] Speed: 1.2580894968014253 samples/sec                   batch loss = 1944.5746842622757 | accuracy = 0.570979020979021


Epoch[1] Batch[720] Speed: 1.253308671376941 samples/sec                   batch loss = 1960.0200587511063 | accuracy = 0.5701388888888889


Epoch[1] Batch[725] Speed: 1.2545052286703957 samples/sec                   batch loss = 1972.3560019731522 | accuracy = 0.5710344827586207


Epoch[1] Batch[730] Speed: 1.2545397497862143 samples/sec                   batch loss = 1984.8123327493668 | accuracy = 0.5715753424657535


Epoch[1] Batch[735] Speed: 1.2536755115273222 samples/sec                   batch loss = 1997.7654885053635 | accuracy = 0.5724489795918367


Epoch[1] Batch[740] Speed: 1.2538173602406235 samples/sec                   batch loss = 2010.9391714334488 | accuracy = 0.5733108108108108


Epoch[1] Batch[745] Speed: 1.2571105217979275 samples/sec                   batch loss = 2023.6834625005722 | accuracy = 0.5738255033557047


Epoch[1] Batch[750] Speed: 1.2580410070947141 samples/sec                   batch loss = 2036.1030614376068 | accuracy = 0.5753333333333334


Epoch[1] Batch[755] Speed: 1.2490340156190896 samples/sec                   batch loss = 2048.946727514267 | accuracy = 0.5751655629139073


Epoch[1] Batch[760] Speed: 1.255278093228918 samples/sec                   batch loss = 2060.54741191864 | accuracy = 0.5769736842105263


Epoch[1] Batch[765] Speed: 1.256613652189668 samples/sec                   batch loss = 2072.7402007579803 | accuracy = 0.5774509803921568


Epoch[1] Batch[770] Speed: 1.2522031759888188 samples/sec                   batch loss = 2086.116361141205 | accuracy = 0.5775974025974026


Epoch[1] Batch[775] Speed: 1.256108334935549 samples/sec                   batch loss = 2099.105335712433 | accuracy = 0.577741935483871


Epoch[1] Batch[780] Speed: 1.246184301429361 samples/sec                   batch loss = 2113.3675141334534 | accuracy = 0.5772435897435897


Epoch[1] Batch[785] Speed: 1.2491542611567688 samples/sec                   batch loss = 2126.124972820282 | accuracy = 0.5780254777070064


[Epoch 1] training: accuracy=0.5780456852791879
[Epoch 1] time cost: 646.5554411411285
[Epoch 1] validation: validation accuracy=0.6722222222222223


Epoch[2] Batch[5] Speed: 1.2572685063652098 samples/sec                   batch loss = 13.623588800430298 | accuracy = 0.65


Epoch[2] Batch[10] Speed: 1.2544117122577931 samples/sec                   batch loss = 25.648406744003296 | accuracy = 0.65


Epoch[2] Batch[15] Speed: 1.2602282653609798 samples/sec                   batch loss = 37.29794383049011 | accuracy = 0.7


Epoch[2] Batch[20] Speed: 1.2581388393865502 samples/sec                   batch loss = 49.99523210525513 | accuracy = 0.7


Epoch[2] Batch[25] Speed: 1.2557668574631276 samples/sec                   batch loss = 61.51836037635803 | accuracy = 0.72


Epoch[2] Batch[30] Speed: 1.254068344312488 samples/sec                   batch loss = 73.6521247625351 | accuracy = 0.7166666666666667


Epoch[2] Batch[35] Speed: 1.2574663963539716 samples/sec                   batch loss = 85.32923984527588 | accuracy = 0.7142857142857143


Epoch[2] Batch[40] Speed: 1.2602541087871415 samples/sec                   batch loss = 96.56857061386108 | accuracy = 0.725


Epoch[2] Batch[45] Speed: 1.2495830742327103 samples/sec                   batch loss = 109.8023829460144 | accuracy = 0.7222222222222222


Epoch[2] Batch[50] Speed: 1.2499895506480647 samples/sec                   batch loss = 122.88727855682373 | accuracy = 0.715


Epoch[2] Batch[55] Speed: 1.251202170025126 samples/sec                   batch loss = 137.03227543830872 | accuracy = 0.6954545454545454


Epoch[2] Batch[60] Speed: 1.2596715165560541 samples/sec                   batch loss = 150.01429688930511 | accuracy = 0.6916666666666667


Epoch[2] Batch[65] Speed: 1.2510251831848036 samples/sec                   batch loss = 162.81396448612213 | accuracy = 0.6884615384615385


Epoch[2] Batch[70] Speed: 1.2576586920023247 samples/sec                   batch loss = 176.8549393415451 | accuracy = 0.6785714285714286


Epoch[2] Batch[75] Speed: 1.2535749064687651 samples/sec                   batch loss = 188.44614815711975 | accuracy = 0.6833333333333333


Epoch[2] Batch[80] Speed: 1.2546964319875342 samples/sec                   batch loss = 201.47057461738586 | accuracy = 0.678125


Epoch[2] Batch[85] Speed: 1.2478216952172458 samples/sec                   batch loss = 214.27798926830292 | accuracy = 0.6735294117647059


Epoch[2] Batch[90] Speed: 1.2585354187980178 samples/sec                   batch loss = 224.44841265678406 | accuracy = 0.6888888888888889


Epoch[2] Batch[95] Speed: 1.2617229602796647 samples/sec                   batch loss = 237.9928114414215 | accuracy = 0.6842105263157895


Epoch[2] Batch[100] Speed: 1.2545989466649388 samples/sec                   batch loss = 250.29149889945984 | accuracy = 0.685


Epoch[2] Batch[105] Speed: 1.255476484495585 samples/sec                   batch loss = 261.4312914609909 | accuracy = 0.6880952380952381


Epoch[2] Batch[110] Speed: 1.2545951001053346 samples/sec                   batch loss = 274.85562121868134 | accuracy = 0.6886363636363636


Epoch[2] Batch[115] Speed: 1.2532424813328884 samples/sec                   batch loss = 288.41172564029694 | accuracy = 0.6804347826086956


Epoch[2] Batch[120] Speed: 1.2556568946179363 samples/sec                   batch loss = 299.7343109846115 | accuracy = 0.6854166666666667


Epoch[2] Batch[125] Speed: 1.25246613655714 samples/sec                   batch loss = 311.44332814216614 | accuracy = 0.682


Epoch[2] Batch[130] Speed: 1.2488013090919485 samples/sec                   batch loss = 323.5936846733093 | accuracy = 0.6826923076923077


Epoch[2] Batch[135] Speed: 1.2537632965427008 samples/sec                   batch loss = 335.4753818511963 | accuracy = 0.6851851851851852


Epoch[2] Batch[140] Speed: 1.2587406963026497 samples/sec                   batch loss = 348.0233829021454 | accuracy = 0.6803571428571429


Epoch[2] Batch[145] Speed: 1.2575181407229905 samples/sec                   batch loss = 363.3652255535126 | accuracy = 0.6689655172413793


Epoch[2] Batch[150] Speed: 1.2520553383871627 samples/sec                   batch loss = 374.8636567592621 | accuracy = 0.6733333333333333


Epoch[2] Batch[155] Speed: 1.2516727301050237 samples/sec                   batch loss = 385.58123791217804 | accuracy = 0.6790322580645162


Epoch[2] Batch[160] Speed: 1.254396612118525 samples/sec                   batch loss = 398.31693625450134 | accuracy = 0.6734375


Epoch[2] Batch[165] Speed: 1.2564188532144684 samples/sec                   batch loss = 410.8638868331909 | accuracy = 0.6757575757575758


Epoch[2] Batch[170] Speed: 1.2599113199663448 samples/sec                   batch loss = 423.23866522312164 | accuracy = 0.6779411764705883


Epoch[2] Batch[175] Speed: 1.254200905940479 samples/sec                   batch loss = 435.5666881799698 | accuracy = 0.6757142857142857


Epoch[2] Batch[180] Speed: 1.2541178407085283 samples/sec                   batch loss = 446.9693069458008 | accuracy = 0.6763888888888889


Epoch[2] Batch[185] Speed: 1.2551437075032867 samples/sec                   batch loss = 460.7198498249054 | accuracy = 0.6716216216216216


Epoch[2] Batch[190] Speed: 1.256306894667545 samples/sec                   batch loss = 474.98432970046997 | accuracy = 0.6697368421052632


Epoch[2] Batch[195] Speed: 1.25608454207859 samples/sec                   batch loss = 489.0579208135605 | accuracy = 0.6666666666666666


Epoch[2] Batch[200] Speed: 1.2528708422317107 samples/sec                   batch loss = 500.6479682922363 | accuracy = 0.66875


Epoch[2] Batch[205] Speed: 1.258929319798973 samples/sec                   batch loss = 513.443282365799 | accuracy = 0.6658536585365854


Epoch[2] Batch[210] Speed: 1.2582907595982342 samples/sec                   batch loss = 523.626855134964 | accuracy = 0.6702380952380952


Epoch[2] Batch[215] Speed: 1.255253204812383 samples/sec                   batch loss = 533.7229105234146 | accuracy = 0.6755813953488372


Epoch[2] Batch[220] Speed: 1.2565947342789867 samples/sec                   batch loss = 546.6857236623764 | accuracy = 0.6727272727272727


Epoch[2] Batch[225] Speed: 1.2529811598182772 samples/sec                   batch loss = 559.4919397830963 | accuracy = 0.6722222222222223


Epoch[2] Batch[230] Speed: 1.2591928450366885 samples/sec                   batch loss = 574.7223422527313 | accuracy = 0.6728260869565217


Epoch[2] Batch[235] Speed: 1.2544596411949163 samples/sec                   batch loss = 584.5354218482971 | accuracy = 0.676595744680851


Epoch[2] Batch[240] Speed: 1.2572164999054682 samples/sec                   batch loss = 597.2022494077682 | accuracy = 0.675


Epoch[2] Batch[245] Speed: 1.2503400997546452 samples/sec                   batch loss = 610.4210113286972 | accuracy = 0.6724489795918367


Epoch[2] Batch[250] Speed: 1.2551563841727094 samples/sec                   batch loss = 620.9077317714691 | accuracy = 0.675


Epoch[2] Batch[255] Speed: 1.2536391644475888 samples/sec                   batch loss = 633.7832343578339 | accuracy = 0.6745098039215687


Epoch[2] Batch[260] Speed: 1.254157684395884 samples/sec                   batch loss = 644.517627120018 | accuracy = 0.676923076923077


Epoch[2] Batch[265] Speed: 1.251476565942388 samples/sec                   batch loss = 657.3525894880295 | accuracy = 0.6735849056603773


Epoch[2] Batch[270] Speed: 1.2490941819926968 samples/sec                   batch loss = 667.646680355072 | accuracy = 0.6759259259259259


Epoch[2] Batch[275] Speed: 1.2532630772026858 samples/sec                   batch loss = 683.8904657363892 | accuracy = 0.6736363636363636


Epoch[2] Batch[280] Speed: 1.252016095330205 samples/sec                   batch loss = 696.6885597705841 | accuracy = 0.6732142857142858


Epoch[2] Batch[285] Speed: 1.2544185590245793 samples/sec                   batch loss = 708.5829669237137 | accuracy = 0.6719298245614035


Epoch[2] Batch[290] Speed: 1.2533239325910601 samples/sec                   batch loss = 719.016855597496 | accuracy = 0.6732758620689655


Epoch[2] Batch[295] Speed: 1.2539874523641137 samples/sec                   batch loss = 730.9273697137833 | accuracy = 0.673728813559322


Epoch[2] Batch[300] Speed: 1.2566414183139665 samples/sec                   batch loss = 744.1282724142075 | accuracy = 0.6725


Epoch[2] Batch[305] Speed: 1.2513047277483265 samples/sec                   batch loss = 755.6245385408401 | accuracy = 0.6754098360655738


Epoch[2] Batch[310] Speed: 1.2501700640099398 samples/sec                   batch loss = 767.1164971590042 | accuracy = 0.6766129032258065


Epoch[2] Batch[315] Speed: 1.247185450081928 samples/sec                   batch loss = 778.7740111351013 | accuracy = 0.6793650793650794


Epoch[2] Batch[320] Speed: 1.2509488806399718 samples/sec                   batch loss = 790.4672323465347 | accuracy = 0.6796875


Epoch[2] Batch[325] Speed: 1.2497738040592654 samples/sec                   batch loss = 802.4756137132645 | accuracy = 0.68


Epoch[2] Batch[330] Speed: 1.2561748280540652 samples/sec                   batch loss = 816.4735788106918 | accuracy = 0.6765151515151515


Epoch[2] Batch[335] Speed: 1.255534454386864 samples/sec                   batch loss = 828.5024877786636 | accuracy = 0.6753731343283582


Epoch[2] Batch[340] Speed: 1.2540216637918296 samples/sec                   batch loss = 838.6884520053864 | accuracy = 0.6772058823529412


Epoch[2] Batch[345] Speed: 1.2538326338456764 samples/sec                   batch loss = 849.5264894962311 | accuracy = 0.6768115942028986


Epoch[2] Batch[350] Speed: 1.2536447849922636 samples/sec                   batch loss = 861.5578970909119 | accuracy = 0.6764285714285714


Epoch[2] Batch[355] Speed: 1.256884965678275 samples/sec                   batch loss = 875.2136640548706 | accuracy = 0.6746478873239437


Epoch[2] Batch[360] Speed: 1.257110333408313 samples/sec                   batch loss = 884.8980011940002 | accuracy = 0.6763888888888889


Epoch[2] Batch[365] Speed: 1.2558133860034113 samples/sec                   batch loss = 898.3656640052795 | accuracy = 0.6753424657534246


Epoch[2] Batch[370] Speed: 1.2595044176302823 samples/sec                   batch loss = 912.3099075555801 | accuracy = 0.6722972972972973


Epoch[2] Batch[375] Speed: 1.255037796895054 samples/sec                   batch loss = 924.9266647100449 | accuracy = 0.672


Epoch[2] Batch[380] Speed: 1.256681140123274 samples/sec                   batch loss = 935.4980609416962 | accuracy = 0.6730263157894737


Epoch[2] Batch[385] Speed: 1.2548505253445577 samples/sec                   batch loss = 946.5569266080856 | accuracy = 0.6733766233766234


Epoch[2] Batch[390] Speed: 1.2591376552887679 samples/sec                   batch loss = 957.513880610466 | accuracy = 0.6743589743589744


Epoch[2] Batch[395] Speed: 1.2606284366372364 samples/sec                   batch loss = 968.6220411062241 | accuracy = 0.6746835443037975


Epoch[2] Batch[400] Speed: 1.2518050659362314 samples/sec                   batch loss = 981.9496909379959 | accuracy = 0.675625


Epoch[2] Batch[405] Speed: 1.258685640612627 samples/sec                   batch loss = 994.1689978837967 | accuracy = 0.6759259259259259


Epoch[2] Batch[410] Speed: 1.2576453048125809 samples/sec                   batch loss = 1008.2062939405441 | accuracy = 0.6737804878048781


Epoch[2] Batch[415] Speed: 1.2600278018171596 samples/sec                   batch loss = 1017.6673434972763 | accuracy = 0.6753012048192771


Epoch[2] Batch[420] Speed: 1.2588941788393095 samples/sec                   batch loss = 1029.987641453743 | accuracy = 0.6755952380952381


Epoch[2] Batch[425] Speed: 1.2570259405395499 samples/sec                   batch loss = 1042.3106458187103 | accuracy = 0.6752941176470588


Epoch[2] Batch[430] Speed: 1.2498195171216067 samples/sec                   batch loss = 1057.6642212867737 | accuracy = 0.6738372093023256


Epoch[2] Batch[435] Speed: 1.2524480912972662 samples/sec                   batch loss = 1070.761964917183 | accuracy = 0.6724137931034483


Epoch[2] Batch[440] Speed: 1.252165699311396 samples/sec                   batch loss = 1081.0029002428055 | accuracy = 0.6727272727272727


Epoch[2] Batch[445] Speed: 1.2502677008001923 samples/sec                   batch loss = 1094.291694521904 | accuracy = 0.6730337078651686


Epoch[2] Batch[450] Speed: 1.254284919894488 samples/sec                   batch loss = 1103.96149289608 | accuracy = 0.6738888888888889


Epoch[2] Batch[455] Speed: 1.2489184417700483 samples/sec                   batch loss = 1113.9764287471771 | accuracy = 0.6741758241758242


Epoch[2] Batch[460] Speed: 1.2530177494994565 samples/sec                   batch loss = 1127.3276969194412 | accuracy = 0.6733695652173913


Epoch[2] Batch[465] Speed: 1.2520780444221153 samples/sec                   batch loss = 1140.312487244606 | accuracy = 0.671505376344086


Epoch[2] Batch[470] Speed: 1.2577483558096325 samples/sec                   batch loss = 1150.977647781372 | accuracy = 0.6723404255319149


Epoch[2] Batch[475] Speed: 1.2592945429270772 samples/sec                   batch loss = 1163.7825454473495 | accuracy = 0.6721052631578948


Epoch[2] Batch[480] Speed: 1.257349256804265 samples/sec                   batch loss = 1175.4188684225082 | accuracy = 0.6729166666666667


Epoch[2] Batch[485] Speed: 1.2546448257429432 samples/sec                   batch loss = 1189.2885662317276 | accuracy = 0.6716494845360824


Epoch[2] Batch[490] Speed: 1.257006727675325 samples/sec                   batch loss = 1202.5413392782211 | accuracy = 0.6704081632653062


Epoch[2] Batch[495] Speed: 1.2555051399316062 samples/sec                   batch loss = 1213.4285093545914 | accuracy = 0.6707070707070707


Epoch[2] Batch[500] Speed: 1.2553423380807314 samples/sec                   batch loss = 1228.3849867582321 | accuracy = 0.669


Epoch[2] Batch[505] Speed: 1.2528724327628682 samples/sec                   batch loss = 1242.2979663610458 | accuracy = 0.6673267326732674


Epoch[2] Batch[510] Speed: 1.2526909511510578 samples/sec                   batch loss = 1257.199537038803 | accuracy = 0.6661764705882353


Epoch[2] Batch[515] Speed: 1.2573303167386904 samples/sec                   batch loss = 1269.689116358757 | accuracy = 0.6650485436893204


Epoch[2] Batch[520] Speed: 1.2547453210912292 samples/sec                   batch loss = 1279.4574673175812 | accuracy = 0.666826923076923


Epoch[2] Batch[525] Speed: 1.251859334518146 samples/sec                   batch loss = 1288.8620977401733 | accuracy = 0.6685714285714286


Epoch[2] Batch[530] Speed: 1.2539318744900385 samples/sec                   batch loss = 1300.558711051941 | accuracy = 0.6688679245283019


Epoch[2] Batch[535] Speed: 1.2530330036293946 samples/sec                   batch loss = 1311.9459968805313 | accuracy = 0.6700934579439253


Epoch[2] Batch[540] Speed: 1.2576396483391867 samples/sec                   batch loss = 1321.9280276298523 | accuracy = 0.6717592592592593


Epoch[2] Batch[545] Speed: 1.2486231422253278 samples/sec                   batch loss = 1333.9058318138123 | accuracy = 0.6711009174311927


Epoch[2] Batch[550] Speed: 1.256355438948554 samples/sec                   batch loss = 1346.0297874212265 | accuracy = 0.6713636363636364


Epoch[2] Batch[555] Speed: 1.2503082320003243 samples/sec                   batch loss = 1358.2496730089188 | accuracy = 0.6716216216216216


Epoch[2] Batch[560] Speed: 1.2553883654824949 samples/sec                   batch loss = 1369.213599205017 | accuracy = 0.6714285714285714


Epoch[2] Batch[565] Speed: 1.2514214904426777 samples/sec                   batch loss = 1381.104442358017 | accuracy = 0.672566371681416


Epoch[2] Batch[570] Speed: 1.2518683018968728 samples/sec                   batch loss = 1391.748746752739 | accuracy = 0.6732456140350878


Epoch[2] Batch[575] Speed: 1.252244300111407 samples/sec                   batch loss = 1403.210837483406 | accuracy = 0.6730434782608695


Epoch[2] Batch[580] Speed: 1.2483048547845874 samples/sec                   batch loss = 1414.1278225183487 | accuracy = 0.6732758620689655


Epoch[2] Batch[585] Speed: 1.2532287199244292 samples/sec                   batch loss = 1429.4409168958664 | accuracy = 0.6722222222222223


Epoch[2] Batch[590] Speed: 1.2573015779280128 samples/sec                   batch loss = 1440.2724777460098 | accuracy = 0.6724576271186441


Epoch[2] Batch[595] Speed: 1.2550097260692468 samples/sec                   batch loss = 1451.456133365631 | accuracy = 0.6735294117647059


Epoch[2] Batch[600] Speed: 1.2573588683990822 samples/sec                   batch loss = 1462.166230082512 | accuracy = 0.6733333333333333


Epoch[2] Batch[605] Speed: 1.2563648471889264 samples/sec                   batch loss = 1471.7373168468475 | accuracy = 0.6743801652892562


Epoch[2] Batch[610] Speed: 1.2513588596215257 samples/sec                   batch loss = 1487.2535691261292 | accuracy = 0.6729508196721311


Epoch[2] Batch[615] Speed: 1.2520843050786488 samples/sec                   batch loss = 1498.6847747564316 | accuracy = 0.6735772357723577


Epoch[2] Batch[620] Speed: 1.2457136959545825 samples/sec                   batch loss = 1507.923175573349 | accuracy = 0.675


Epoch[2] Batch[625] Speed: 1.2454780641821617 samples/sec                   batch loss = 1516.6428295373917 | accuracy = 0.6764


Epoch[2] Batch[630] Speed: 1.2463643654774368 samples/sec                   batch loss = 1532.2258368730545 | accuracy = 0.6742063492063493


Epoch[2] Batch[635] Speed: 1.2546915526699676 samples/sec                   batch loss = 1546.129117488861 | accuracy = 0.6728346456692913


Epoch[2] Batch[640] Speed: 1.2579362104014247 samples/sec                   batch loss = 1557.6111830472946 | accuracy = 0.673046875


Epoch[2] Batch[645] Speed: 1.257315994281935 samples/sec                   batch loss = 1567.3961381912231 | accuracy = 0.674031007751938


Epoch[2] Batch[650] Speed: 1.2565746875624226 samples/sec                   batch loss = 1579.1695557832718 | accuracy = 0.6746153846153846


Epoch[2] Batch[655] Speed: 1.250194658100344 samples/sec                   batch loss = 1589.010326743126 | accuracy = 0.6748091603053435


Epoch[2] Batch[660] Speed: 1.2572786820453818 samples/sec                   batch loss = 1601.0801748037338 | accuracy = 0.675


Epoch[2] Batch[665] Speed: 1.260402279287636 samples/sec                   batch loss = 1613.1577396988869 | accuracy = 0.674436090225564


Epoch[2] Batch[670] Speed: 1.2572296895423771 samples/sec                   batch loss = 1625.1582098603249 | accuracy = 0.6742537313432836


Epoch[2] Batch[675] Speed: 1.2533801120414907 samples/sec                   batch loss = 1635.61693328619 | accuracy = 0.6751851851851852


Epoch[2] Batch[680] Speed: 1.261610339100882 samples/sec                   batch loss = 1649.3227915167809 | accuracy = 0.674264705882353


Epoch[2] Batch[685] Speed: 1.2537869078442243 samples/sec                   batch loss = 1657.811600625515 | accuracy = 0.6751824817518248


Epoch[2] Batch[690] Speed: 1.2541547780648559 samples/sec                   batch loss = 1669.4364221692085 | accuracy = 0.6757246376811594


Epoch[2] Batch[695] Speed: 1.252841745580293 samples/sec                   batch loss = 1679.9581597447395 | accuracy = 0.6762589928057554


Epoch[2] Batch[700] Speed: 1.2573458645117668 samples/sec                   batch loss = 1689.513281404972 | accuracy = 0.6778571428571428


Epoch[2] Batch[705] Speed: 1.2567469407778458 samples/sec                   batch loss = 1697.8482840657234 | accuracy = 0.6790780141843972


Epoch[2] Batch[710] Speed: 1.2529762002561335 samples/sec                   batch loss = 1709.1151970028877 | accuracy = 0.6792253521126761


Epoch[2] Batch[715] Speed: 1.2543147400532149 samples/sec                   batch loss = 1721.1223078370094 | accuracy = 0.6797202797202797


Epoch[2] Batch[720] Speed: 1.259246621893196 samples/sec                   batch loss = 1734.390294611454 | accuracy = 0.6795138888888889


Epoch[2] Batch[725] Speed: 1.2540943106845786 samples/sec                   batch loss = 1744.0336795449257 | accuracy = 0.6806896551724138


Epoch[2] Batch[730] Speed: 1.2531296841439794 samples/sec                   batch loss = 1754.8273022770882 | accuracy = 0.6804794520547945


Epoch[2] Batch[735] Speed: 1.2556148882560545 samples/sec                   batch loss = 1765.9882826209068 | accuracy = 0.6806122448979591


Epoch[2] Batch[740] Speed: 1.255860858067468 samples/sec                   batch loss = 1777.5563512444496 | accuracy = 0.6810810810810811


Epoch[2] Batch[745] Speed: 1.2546552404794715 samples/sec                   batch loss = 1788.7636632323265 | accuracy = 0.6812080536912751


Epoch[2] Batch[750] Speed: 1.2590591316408661 samples/sec                   batch loss = 1798.603733241558 | accuracy = 0.681


Epoch[2] Batch[755] Speed: 1.2565529474801034 samples/sec                   batch loss = 1812.123361647129 | accuracy = 0.6798013245033112


Epoch[2] Batch[760] Speed: 1.257032815873849 samples/sec                   batch loss = 1823.073538839817 | accuracy = 0.6799342105263158


Epoch[2] Batch[765] Speed: 1.2542942033768092 samples/sec                   batch loss = 1834.418633401394 | accuracy = 0.6797385620915033


Epoch[2] Batch[770] Speed: 1.2538751770140741 samples/sec                   batch loss = 1847.8751351237297 | accuracy = 0.6792207792207792


Epoch[2] Batch[775] Speed: 1.256755507607479 samples/sec                   batch loss = 1857.6353662610054 | accuracy = 0.6796774193548387


Epoch[2] Batch[780] Speed: 1.259664328592991 samples/sec                   batch loss = 1868.0163142085075 | accuracy = 0.680448717948718


Epoch[2] Batch[785] Speed: 1.2570566446692704 samples/sec                   batch loss = 1877.5734224915504 | accuracy = 0.6815286624203821


[Epoch 2] training: accuracy=0.6814720812182741
[Epoch 2] time cost: 643.860823392868
[Epoch 2] validation: validation accuracy=0.7155555555555555


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).