<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[10:23:29] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If only one GPU is available, the code falls back to `npx.cpu()`, so the copy ends up in main memory, as in the output shown below:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[10:23:29] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU IDs
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
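
As a small illustration of the 0-based numbering, the sketch below picks which GPU IDs to use given how many GPUs are available. The helper name `select_gpu_ids` and its arguments are hypothetical and not part of MXNet. In practice the available count would come from `npx.num_gpus()`, and each returned ID would be passed to `npx.gpu(i)`.

```python
def select_gpu_ids(num_available, num_requested):
    """Return the 0-based IDs of the GPUs to use (hypothetical helper).

    An empty list means no GPU is available, in which case the caller
    would fall back to the CPU.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # IDs start at 0, so the last valid ID is num_available - 1.
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, asking for all of them yields IDs 0 through 7.
print(select_gpu_ids(8, 8))
# Asking for two GPUs when only one exists yields just the first GPU.
print(select_gpu_ids(1, 2))
```

With MXNet installed, the IDs could then be turned into devices with an expression along the lines of `[npx.gpu(i) for i in ids] or [npx.cpu()]`, which falls back to the CPU when the list is empty.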

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the previously defined LeafNetwork from [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device of parameters that are already loaded.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[10:23:29] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 6.2162814, -4.7117567]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, you split each data batch into *n* parts and have each GPU run the forward and backward passes on its own part of the data.
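
To make the idea concrete, here is a minimal plain-Python sketch that runs on the CPU only. It splits a toy batch into parts and checks that the gradients computed on each part add up to the gradient of the whole batch. The helper `split_batch`, the scalar linear model, and the sample values are illustrative assumptions; this is not how `gluon.utils.split_and_load` is implemented.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous, near-equal chunks (hypothetical helper)."""
    size, remainder = divmod(len(batch), num_parts)
    chunks, start = [], 0
    for i in range(num_parts):
        # The first `remainder` chunks get one extra sample.
        end = start + size + (1 if i < remainder else 0)
        chunks.append(batch[start:end])
        start = end
    return chunks


def grad_sum(w, samples):
    """Summed gradient of 0.5 * (w * x - y) ** 2 with respect to w."""
    return sum((w * x - y) * x for x, y in samples)


# A toy batch of (x, y) pairs and a single scalar weight.
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 9.0)]
w = 0.5

# Each "device" computes the gradient on its own shard only.
shards = split_batch(batch, 2)
per_device = [grad_sum(w, shard) for shard in shards]

# Adding the per-device gradients recovers the full-batch gradient.
combined = sum(per_device)
full = grad_sum(w, batch)
print(per_device, combined, full)
```

Because the per-part gradients sum to the full-batch gradient, the forward and backward passes can run independently on each device before the results are aggregated for the parameter update.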

First, copy the dataset definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
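
To show what it means to accumulate accuracy over several per-device shards, here is a plain-Python sketch. It assumes the predicted class is the highest-scoring output, and the names `argmax` and `accuracy_over_shards` are hypothetical. It illustrates the bookkeeping only and is not the implementation of `gluon.metric.Accuracy`.

```python
def argmax(scores):
    """Index of the largest score in a list."""
    return max(range(len(scores)), key=lambda i: scores[i])


def accuracy_over_shards(label_shards, output_shards):
    """Fraction of correct predictions pooled across all shards."""
    correct = 0
    total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            # The predicted class is assumed to be the highest-scoring one.
            correct += int(argmax(scores) == label)
            total += 1
    return correct / total if total else 0.0


# Two shards of two samples each, with two class scores per sample.
label_shards = [[0, 1], [1, 1]]
output_shards = [
    [[2.0, 1.0], [0.1, 0.9]],
    [[3.0, -1.0], [0.0, 5.0]],
]
print(accuracy_over_shards(label_shards, output_shards))
```

Pooling the counts across shards gives the same result as evaluating the whole batch on one device, so splitting the data does not change the reported accuracy.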

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7752077906247872 samples/sec                   batch loss = 13.877300262451172 | accuracy = 0.6


Epoch[1] Batch[10] Speed: 1.258125819361045 samples/sec                   batch loss = 28.09047293663025 | accuracy = 0.475


Epoch[1] Batch[15] Speed: 1.2606155544747293 samples/sec                   batch loss = 42.404107093811035 | accuracy = 0.4666666666666667


Epoch[1] Batch[20] Speed: 1.255893573578681 samples/sec                   batch loss = 56.0464870929718 | accuracy = 0.5125


Epoch[1] Batch[25] Speed: 1.2533625086052236 samples/sec                   batch loss = 70.65025615692139 | accuracy = 0.49


Epoch[1] Batch[30] Speed: 1.255240901846468 samples/sec                   batch loss = 85.42351841926575 | accuracy = 0.4583333333333333


Epoch[1] Batch[35] Speed: 1.2598505800554636 samples/sec                   batch loss = 99.09899759292603 | accuracy = 0.4642857142857143


Epoch[1] Batch[40] Speed: 1.2560724108897838 samples/sec                   batch loss = 113.55788850784302 | accuracy = 0.4625


Epoch[1] Batch[45] Speed: 1.2541327466386953 samples/sec                   batch loss = 128.09942889213562 | accuracy = 0.45555555555555555


Epoch[1] Batch[50] Speed: 1.2561855503758212 samples/sec                   batch loss = 142.4230523109436 | accuracy = 0.46


Epoch[1] Batch[55] Speed: 1.2534023043120308 samples/sec                   batch loss = 156.04031777381897 | accuracy = 0.4590909090909091


Epoch[1] Batch[60] Speed: 1.2516138090384348 samples/sec                   batch loss = 170.29343700408936 | accuracy = 0.4583333333333333


Epoch[1] Batch[65] Speed: 1.2524420139951893 samples/sec                   batch loss = 184.33393335342407 | accuracy = 0.46923076923076923


Epoch[1] Batch[70] Speed: 1.2556496584223356 samples/sec                   batch loss = 198.4173412322998 | accuracy = 0.475


Epoch[1] Batch[75] Speed: 1.2518964191985027 samples/sec                   batch loss = 212.1530566215515 | accuracy = 0.4766666666666667


Epoch[1] Batch[80] Speed: 1.2537222599679987 samples/sec                   batch loss = 226.4213788509369 | accuracy = 0.475


Epoch[1] Batch[85] Speed: 1.2553135961499324 samples/sec                   batch loss = 239.95590567588806 | accuracy = 0.4852941176470588


Epoch[1] Batch[90] Speed: 1.2579696943953549 samples/sec                   batch loss = 254.24925255775452 | accuracy = 0.4861111111111111


Epoch[1] Batch[95] Speed: 1.253642068389378 samples/sec                   batch loss = 268.4839816093445 | accuracy = 0.4842105263157895


Epoch[1] Batch[100] Speed: 1.2599661044417871 samples/sec                   batch loss = 282.4853165149689 | accuracy = 0.4825


Epoch[1] Batch[105] Speed: 1.2534731000978439 samples/sec                   batch loss = 296.11456274986267 | accuracy = 0.4857142857142857


Epoch[1] Batch[110] Speed: 1.258439129519675 samples/sec                   batch loss = 309.92406034469604 | accuracy = 0.49318181818181817


Epoch[1] Batch[115] Speed: 1.2578173801928814 samples/sec                   batch loss = 323.56890392303467 | accuracy = 0.4956521739130435


Epoch[1] Batch[120] Speed: 1.2647724950314765 samples/sec                   batch loss = 337.77867317199707 | accuracy = 0.4979166666666667


Epoch[1] Batch[125] Speed: 1.2648216004050514 samples/sec                   batch loss = 351.6968538761139 | accuracy = 0.496


Epoch[1] Batch[130] Speed: 1.2681272706589575 samples/sec                   batch loss = 365.4494950771332 | accuracy = 0.49423076923076925


Epoch[1] Batch[135] Speed: 1.2640617048858893 samples/sec                   batch loss = 379.0601110458374 | accuracy = 0.4925925925925926


Epoch[1] Batch[140] Speed: 1.2587305913811864 samples/sec                   batch loss = 393.18186354637146 | accuracy = 0.49107142857142855


Epoch[1] Batch[145] Speed: 1.2608031295951707 samples/sec                   batch loss = 406.91708636283875 | accuracy = 0.49310344827586206


Epoch[1] Batch[150] Speed: 1.2592556954185135 samples/sec                   batch loss = 420.6697392463684 | accuracy = 0.48833333333333334


Epoch[1] Batch[155] Speed: 1.2623231231715661 samples/sec                   batch loss = 433.8879396915436 | accuracy = 0.49838709677419357


Epoch[1] Batch[160] Speed: 1.2655719086331758 samples/sec                   batch loss = 447.1752219200134 | accuracy = 0.5015625


Epoch[1] Batch[165] Speed: 1.2607635257549172 samples/sec                   batch loss = 460.40339064598083 | accuracy = 0.5060606060606061


Epoch[1] Batch[170] Speed: 1.2559002485044035 samples/sec                   batch loss = 474.8506987094879 | accuracy = 0.5


Epoch[1] Batch[175] Speed: 1.2565008119929912 samples/sec                   batch loss = 489.1517028808594 | accuracy = 0.5014285714285714


Epoch[1] Batch[180] Speed: 1.2570689832543065 samples/sec                   batch loss = 503.1530933380127 | accuracy = 0.5013888888888889


Epoch[1] Batch[185] Speed: 1.2532352729383847 samples/sec                   batch loss = 517.0216965675354 | accuracy = 0.5


Epoch[1] Batch[190] Speed: 1.2495401703708249 samples/sec                   batch loss = 531.3701214790344 | accuracy = 0.4986842105263158


Epoch[1] Batch[195] Speed: 1.2479663067568356 samples/sec                   batch loss = 545.0187220573425 | accuracy = 0.5012820512820513


Epoch[1] Batch[200] Speed: 1.2517411826201739 samples/sec                   batch loss = 558.6912758350372 | accuracy = 0.50375


Epoch[1] Batch[205] Speed: 1.2525976115017396 samples/sec                   batch loss = 572.199844121933 | accuracy = 0.5048780487804878


Epoch[1] Batch[210] Speed: 1.2515262313969626 samples/sec                   batch loss = 585.9217164516449 | accuracy = 0.5059523809523809


Epoch[1] Batch[215] Speed: 1.259340009843698 samples/sec                   batch loss = 599.2685208320618 | accuracy = 0.5104651162790698


Epoch[1] Batch[220] Speed: 1.2519071620226236 samples/sec                   batch loss = 613.0574316978455 | accuracy = 0.5113636363636364


Epoch[1] Batch[225] Speed: 1.2528953555721574 samples/sec                   batch loss = 626.8899719715118 | accuracy = 0.5133333333333333


Epoch[1] Batch[230] Speed: 1.2501225554390585 samples/sec                   batch loss = 640.2946755886078 | accuracy = 0.5184782608695652


Epoch[1] Batch[235] Speed: 1.24909687892106 samples/sec                   batch loss = 654.6935498714447 | accuracy = 0.5148936170212766


Epoch[1] Batch[240] Speed: 1.2506402208528993 samples/sec                   batch loss = 668.5182337760925 | accuracy = 0.5145833333333333


Epoch[1] Batch[245] Speed: 1.2510712677140223 samples/sec                   batch loss = 682.2020082473755 | accuracy = 0.5173469387755102


Epoch[1] Batch[250] Speed: 1.248678715165814 samples/sec                   batch loss = 695.9316763877869 | accuracy = 0.518


Epoch[1] Batch[255] Speed: 1.2539481818209677 samples/sec                   batch loss = 709.2425558567047 | accuracy = 0.5205882352941177


Epoch[1] Batch[260] Speed: 1.2556755963409318 samples/sec                   batch loss = 723.1436507701874 | accuracy = 0.5230769230769231


Epoch[1] Batch[265] Speed: 1.2548209612611083 samples/sec                   batch loss = 736.9441206455231 | accuracy = 0.5235849056603774


Epoch[1] Batch[270] Speed: 1.2565476772704034 samples/sec                   batch loss = 750.3846955299377 | accuracy = 0.5296296296296297


Epoch[1] Batch[275] Speed: 1.2492392746776495 samples/sec                   batch loss = 763.2699501514435 | accuracy = 0.5345454545454545


Epoch[1] Batch[280] Speed: 1.2536004777324208 samples/sec                   batch loss = 777.0909452438354 | accuracy = 0.5348214285714286


Epoch[1] Batch[285] Speed: 1.2543448429722828 samples/sec                   batch loss = 791.060453414917 | accuracy = 0.5307017543859649


Epoch[1] Batch[290] Speed: 1.2526006041403752 samples/sec                   batch loss = 804.4670972824097 | accuracy = 0.5344827586206896


Epoch[1] Batch[295] Speed: 1.2479945275508522 samples/sec                   batch loss = 818.2745079994202 | accuracy = 0.5347457627118644


Epoch[1] Batch[300] Speed: 1.2508665253802895 samples/sec                   batch loss = 831.6983523368835 | accuracy = 0.5375


Epoch[1] Batch[305] Speed: 1.2489969143614705 samples/sec                   batch loss = 845.3734893798828 | accuracy = 0.5377049180327869


Epoch[1] Batch[310] Speed: 1.2570454365462467 samples/sec                   batch loss = 858.916225194931 | accuracy = 0.5411290322580645


Epoch[1] Batch[315] Speed: 1.2520959855753713 samples/sec                   batch loss = 872.3028976917267 | accuracy = 0.542063492063492


Epoch[1] Batch[320] Speed: 1.2515140013797286 samples/sec                   batch loss = 885.7406480312347 | accuracy = 0.54375


Epoch[1] Batch[325] Speed: 1.2533523962165187 samples/sec                   batch loss = 899.2061924934387 | accuracy = 0.5453846153846154


Epoch[1] Batch[330] Speed: 1.2479376231129622 samples/sec                   batch loss = 913.1398479938507 | accuracy = 0.5454545454545454


Epoch[1] Batch[335] Speed: 1.254300017344809 samples/sec                   batch loss = 927.3228833675385 | accuracy = 0.5462686567164179


Epoch[1] Batch[340] Speed: 1.2515039187971873 samples/sec                   batch loss = 941.3251941204071 | accuracy = 0.5448529411764705


Epoch[1] Batch[345] Speed: 1.255742607590764 samples/sec                   batch loss = 954.4567894935608 | accuracy = 0.5471014492753623


Epoch[1] Batch[350] Speed: 1.251935467911666 samples/sec                   batch loss = 968.4834747314453 | accuracy = 0.545


Epoch[1] Batch[355] Speed: 1.249798754970041 samples/sec                   batch loss = 981.3876767158508 | accuracy = 0.5464788732394367


Epoch[1] Batch[360] Speed: 1.2501135199038729 samples/sec                   batch loss = 995.5659430027008 | accuracy = 0.5465277777777777


Epoch[1] Batch[365] Speed: 1.2537824103848945 samples/sec                   batch loss = 1009.4716923236847 | accuracy = 0.547945205479452


Epoch[1] Batch[370] Speed: 1.2492166714854125 samples/sec                   batch loss = 1023.4817109107971 | accuracy = 0.5466216216216216


Epoch[1] Batch[375] Speed: 1.2505515676118395 samples/sec                   batch loss = 1037.4549417495728 | accuracy = 0.546


Epoch[1] Batch[380] Speed: 1.257734400980686 samples/sec                   batch loss = 1051.1646456718445 | accuracy = 0.5467105263157894


Epoch[1] Batch[385] Speed: 1.2561916640622695 samples/sec                   batch loss = 1065.3237347602844 | accuracy = 0.5467532467532468


Epoch[1] Batch[390] Speed: 1.247239226131366 samples/sec                   batch loss = 1079.5059740543365 | accuracy = 0.5442307692307692


Epoch[1] Batch[395] Speed: 1.2543718524284946 samples/sec                   batch loss = 1093.1052594184875 | accuracy = 0.5449367088607595


Epoch[1] Batch[400] Speed: 1.2524674455599492 samples/sec                   batch loss = 1106.1917028427124 | accuracy = 0.54625


Epoch[1] Batch[405] Speed: 1.2538714286013233 samples/sec                   batch loss = 1119.7377293109894 | accuracy = 0.5462962962962963


Epoch[1] Batch[410] Speed: 1.248907471269245 samples/sec                   batch loss = 1132.9459669589996 | accuracy = 0.55


Epoch[1] Batch[415] Speed: 1.2539285943310332 samples/sec                   batch loss = 1146.4869544506073 | accuracy = 0.5493975903614458


Epoch[1] Batch[420] Speed: 1.2483174865696176 samples/sec                   batch loss = 1159.8156716823578 | accuracy = 0.5511904761904762


Epoch[1] Batch[425] Speed: 1.2513166736900203 samples/sec                   batch loss = 1173.5283136367798 | accuracy = 0.55


Epoch[1] Batch[430] Speed: 1.255817710042857 samples/sec                   batch loss = 1187.0631940364838 | accuracy = 0.5505813953488372


Epoch[1] Batch[435] Speed: 1.25192537853708 samples/sec                   batch loss = 1201.0165615081787 | accuracy = 0.5494252873563218


Epoch[1] Batch[440] Speed: 1.2559119062913287 samples/sec                   batch loss = 1215.1868724822998 | accuracy = 0.5477272727272727


Epoch[1] Batch[445] Speed: 1.251653680532138 samples/sec                   batch loss = 1229.2479541301727 | accuracy = 0.547191011235955


Epoch[1] Batch[450] Speed: 1.2573628261573067 samples/sec                   batch loss = 1242.6932978630066 | accuracy = 0.5472222222222223


Epoch[1] Batch[455] Speed: 1.2541107159865625 samples/sec                   batch loss = 1256.0790722370148 | accuracy = 0.5467032967032966


Epoch[1] Batch[460] Speed: 1.2584711299097988 samples/sec                   batch loss = 1270.5034856796265 | accuracy = 0.5451086956521739


Epoch[1] Batch[465] Speed: 1.2541064974393443 samples/sec                   batch loss = 1283.763118505478 | accuracy = 0.5456989247311828


Epoch[1] Batch[470] Speed: 1.2562782966921675 samples/sec                   batch loss = 1297.3363149166107 | accuracy = 0.5468085106382978


Epoch[1] Batch[475] Speed: 1.2519720899709872 samples/sec                   batch loss = 1310.5630476474762 | accuracy = 0.5478947368421052


Epoch[1] Batch[480] Speed: 1.2581185546813338 samples/sec                   batch loss = 1324.6574425697327 | accuracy = 0.5479166666666667


Epoch[1] Batch[485] Speed: 1.2535436228984063 samples/sec                   batch loss = 1338.164847612381 | accuracy = 0.547938144329897


Epoch[1] Batch[490] Speed: 1.2553521068797677 samples/sec                   batch loss = 1351.287563085556 | accuracy = 0.5489795918367347


Epoch[1] Batch[495] Speed: 1.2565201976074822 samples/sec                   batch loss = 1364.430431842804 | accuracy = 0.5494949494949495


Epoch[1] Batch[500] Speed: 1.2523153390493573 samples/sec                   batch loss = 1377.6460525989532 | accuracy = 0.5515


Epoch[1] Batch[505] Speed: 1.2553850776990076 samples/sec                   batch loss = 1390.607105731964 | accuracy = 0.5524752475247525


Epoch[1] Batch[510] Speed: 1.2567646394116228 samples/sec                   batch loss = 1405.1796717643738 | accuracy = 0.5504901960784314


Epoch[1] Batch[515] Speed: 1.2510086719428026 samples/sec                   batch loss = 1418.6493105888367 | accuracy = 0.5504854368932038


Epoch[1] Batch[520] Speed: 1.2545995095780789 samples/sec                   batch loss = 1431.6701936721802 | accuracy = 0.5509615384615385


Epoch[1] Batch[525] Speed: 1.24990797347878 samples/sec                   batch loss = 1445.2248139381409 | accuracy = 0.550952380952381


Epoch[1] Batch[530] Speed: 1.2564647713966786 samples/sec                   batch loss = 1458.0738027095795 | accuracy = 0.5518867924528302


Epoch[1] Batch[535] Speed: 1.2582955725771148 samples/sec                   batch loss = 1471.2211050987244 | accuracy = 0.552803738317757


Epoch[1] Batch[540] Speed: 1.25732588804945 samples/sec                   batch loss = 1484.3600900173187 | accuracy = 0.5546296296296296


Epoch[1] Batch[545] Speed: 1.252956549546822 samples/sec                   batch loss = 1497.2974746227264 | accuracy = 0.5555045871559633


Epoch[1] Batch[550] Speed: 1.2521846710148619 samples/sec                   batch loss = 1509.97935628891 | accuracy = 0.5563636363636364


Epoch[1] Batch[555] Speed: 1.2526417543717396 samples/sec                   batch loss = 1522.0046410560608 | accuracy = 0.5585585585585585


Epoch[1] Batch[560] Speed: 1.2508090789811237 samples/sec                   batch loss = 1534.2141208648682 | accuracy = 0.5602678571428571


Epoch[1] Batch[565] Speed: 1.2497842311843856 samples/sec                   batch loss = 1547.7547256946564 | accuracy = 0.5597345132743363


Epoch[1] Batch[570] Speed: 1.250926681892031 samples/sec                   batch loss = 1560.2649238109589 | accuracy = 0.5618421052631579


Epoch[1] Batch[575] Speed: 1.2526163157278187 samples/sec                   batch loss = 1572.8758022785187 | accuracy = 0.5626086956521739


Epoch[1] Batch[580] Speed: 1.252036370602472 samples/sec                   batch loss = 1586.11771941185 | accuracy = 0.5637931034482758


Epoch[1] Batch[585] Speed: 1.247351892717074 samples/sec                   batch loss = 1599.1200873851776 | accuracy = 0.564957264957265


Epoch[1] Batch[590] Speed: 1.2514907556700334 samples/sec                   batch loss = 1612.730916261673 | accuracy = 0.5648305084745763


Epoch[1] Batch[595] Speed: 1.2483705242805945 samples/sec                   batch loss = 1626.538067817688 | accuracy = 0.565546218487395


Epoch[1] Batch[600] Speed: 1.2528089082176397 samples/sec                   batch loss = 1640.685709476471 | accuracy = 0.5654166666666667


Epoch[1] Batch[605] Speed: 1.2501426762753367 samples/sec                   batch loss = 1652.5658421516418 | accuracy = 0.5677685950413223


Epoch[1] Batch[610] Speed: 1.2535585151817827 samples/sec                   batch loss = 1664.7792258262634 | accuracy = 0.5684426229508197


Epoch[1] Batch[615] Speed: 1.248456824689138 samples/sec                   batch loss = 1678.1707298755646 | accuracy = 0.5691056910569106


Epoch[1] Batch[620] Speed: 1.2517613555945073 samples/sec                   batch loss = 1692.1989319324493 | accuracy = 0.5685483870967742


Epoch[1] Batch[625] Speed: 1.2545088870682115 samples/sec                   batch loss = 1705.3398995399475 | accuracy = 0.5692


Epoch[1] Batch[630] Speed: 1.2615328347722308 samples/sec                   batch loss = 1719.9553887844086 | accuracy = 0.5686507936507936


Epoch[1] Batch[635] Speed: 1.2589946002831782 samples/sec                   batch loss = 1732.996695280075 | accuracy = 0.5688976377952756


Epoch[1] Batch[640] Speed: 1.25665534889957 samples/sec                   batch loss = 1746.4418008327484 | accuracy = 0.56953125


Epoch[1] Batch[645] Speed: 1.2640819912037542 samples/sec                   batch loss = 1758.9091317653656 | accuracy = 0.5701550387596899


Epoch[1] Batch[650] Speed: 1.2547084427770723 samples/sec                   batch loss = 1771.5597183704376 | accuracy = 0.5711538461538461


Epoch[1] Batch[655] Speed: 1.259385858187937 samples/sec                   batch loss = 1782.9727146625519 | accuracy = 0.5721374045801527


Epoch[1] Batch[660] Speed: 1.2610436484453704 samples/sec                   batch loss = 1796.0085672140121 | accuracy = 0.5723484848484849


Epoch[1] Batch[665] Speed: 1.25543993896861 samples/sec                   batch loss = 1808.8274780511856 | accuracy = 0.5729323308270676


Epoch[1] Batch[670] Speed: 1.2597324283779965 samples/sec                   batch loss = 1822.7099262475967 | accuracy = 0.5720149253731344


Epoch[1] Batch[675] Speed: 1.2651264250492562 samples/sec                   batch loss = 1837.386082291603 | accuracy = 0.5703703703703704


Epoch[1] Batch[680] Speed: 1.2571239918017119 samples/sec                   batch loss = 1850.6005183458328 | accuracy = 0.5705882352941176


Epoch[1] Batch[685] Speed: 1.2576355002910358 samples/sec                   batch loss = 1864.4806028604507 | accuracy = 0.5704379562043795


Epoch[1] Batch[690] Speed: 1.2489017071850415 samples/sec                   batch loss = 1876.3331065177917 | accuracy = 0.571376811594203


Epoch[1] Batch[695] Speed: 1.2611569268849607 samples/sec                   batch loss = 1888.9951664209366 | accuracy = 0.5715827338129497


Epoch[1] Batch[700] Speed: 1.2610887677952578 samples/sec                   batch loss = 1901.0401686429977 | accuracy = 0.5732142857142857


Epoch[1] Batch[705] Speed: 1.2578214351395112 samples/sec                   batch loss = 1913.7248224020004 | accuracy = 0.573758865248227


Epoch[1] Batch[710] Speed: 1.2559721729798101 samples/sec                   batch loss = 1926.2857300043106 | accuracy = 0.573943661971831


Epoch[1] Batch[715] Speed: 1.2522093444350135 samples/sec                   batch loss = 1939.32006752491 | accuracy = 0.5744755244755245


Epoch[1] Batch[720] Speed: 1.2544936907862785 samples/sec                   batch loss = 1951.7799713611603 | accuracy = 0.575


Epoch[1] Batch[725] Speed: 1.2512719709066045 samples/sec                   batch loss = 1964.7396540641785 | accuracy = 0.5758620689655173


Epoch[1] Batch[730] Speed: 1.252629128445059 samples/sec                   batch loss = 1977.9428033828735 | accuracy = 0.5763698630136986


Epoch[1] Batch[735] Speed: 1.249517184047408 samples/sec                   batch loss = 1991.3066540956497 | accuracy = 0.5761904761904761


Epoch[1] Batch[740] Speed: 1.2503100955669675 samples/sec                   batch loss = 2004.3174086809158 | accuracy = 0.577027027027027


Epoch[1] Batch[745] Speed: 1.2483957907645051 samples/sec                   batch loss = 2016.6490062475204 | accuracy = 0.5775167785234899


Epoch[1] Batch[750] Speed: 1.252389751631848 samples/sec                   batch loss = 2027.6362730264664 | accuracy = 0.579


Epoch[1] Batch[755] Speed: 1.2542694477293284 samples/sec                   batch loss = 2040.455903172493 | accuracy = 0.5794701986754967


Epoch[1] Batch[760] Speed: 1.252455571134634 samples/sec                   batch loss = 2054.640923142433 | accuracy = 0.5796052631578947


Epoch[1] Batch[765] Speed: 1.2537546767653263 samples/sec                   batch loss = 2067.025635957718 | accuracy = 0.5807189542483661


Epoch[1] Batch[770] Speed: 1.2494028233742334 samples/sec                   batch loss = 2080.2648916244507 | accuracy = 0.5805194805194805


Epoch[1] Batch[775] Speed: 1.2528695323855548 samples/sec                   batch loss = 2092.5491211414337 | accuracy = 0.5812903225806452


Epoch[1] Batch[780] Speed: 1.248035932736215 samples/sec                   batch loss = 2105.316916704178 | accuracy = 0.5814102564102565


Epoch[1] Batch[785] Speed: 1.246489098431548 samples/sec                   batch loss = 2118.2253847122192 | accuracy = 0.5818471337579618


[Epoch 1] training: accuracy=0.5824873096446701
[Epoch 1] time cost: 646.6458299160004
[Epoch 1] validation: validation accuracy=0.64


Epoch[2] Batch[5] Speed: 1.2565390191647257 samples/sec                   batch loss = 12.86359190940857 | accuracy = 0.55


Epoch[2] Batch[10] Speed: 1.2630341458243999 samples/sec                   batch loss = 27.755083560943604 | accuracy = 0.55


Epoch[2] Batch[15] Speed: 1.2571464110494381 samples/sec                   batch loss = 40.52572536468506 | accuracy = 0.5833333333333334


Epoch[2] Batch[20] Speed: 1.2558282382634118 samples/sec                   batch loss = 52.76760172843933 | accuracy = 0.6125


Epoch[2] Batch[25] Speed: 1.264587645114407 samples/sec                   batch loss = 65.69901490211487 | accuracy = 0.63


Epoch[2] Batch[30] Speed: 1.2529034021449263 samples/sec                   batch loss = 80.78092885017395 | accuracy = 0.5916666666666667


Epoch[2] Batch[35] Speed: 1.2548955780553401 samples/sec                   batch loss = 92.59211897850037 | accuracy = 0.5928571428571429


Epoch[2] Batch[40] Speed: 1.2549426052727854 samples/sec                   batch loss = 105.79826188087463 | accuracy = 0.6


Epoch[2] Batch[45] Speed: 1.2600674540866532 samples/sec                   batch loss = 120.42024517059326 | accuracy = 0.5888888888888889


Epoch[2] Batch[50] Speed: 1.2568185856631784 samples/sec                   batch loss = 132.95625007152557 | accuracy = 0.6


Epoch[2] Batch[55] Speed: 1.2503589229925354 samples/sec                   batch loss = 147.28208434581757 | accuracy = 0.5909090909090909


Epoch[2] Batch[60] Speed: 1.2492205781515273 samples/sec                   batch loss = 159.76299059391022 | accuracy = 0.5958333333333333


Epoch[2] Batch[65] Speed: 1.2548493052119667 samples/sec                   batch loss = 173.54934966564178 | accuracy = 0.5846153846153846


Epoch[2] Batch[70] Speed: 1.2532362090888298 samples/sec                   batch loss = 186.0791791677475 | accuracy = 0.5892857142857143


Epoch[2] Batch[75] Speed: 1.2468113713202742 samples/sec                   batch loss = 200.48107373714447 | accuracy = 0.58


Epoch[2] Batch[80] Speed: 1.2536517170667723 samples/sec                   batch loss = 212.986634850502 | accuracy = 0.584375


Epoch[2] Batch[85] Speed: 1.2540417228668503 samples/sec                   batch loss = 225.76611268520355 | accuracy = 0.5823529411764706


Epoch[2] Batch[90] Speed: 1.253988858276418 samples/sec                   batch loss = 237.48981940746307 | accuracy = 0.5833333333333334


Epoch[2] Batch[95] Speed: 1.2577519388562155 samples/sec                   batch loss = 251.5313354730606 | accuracy = 0.5763157894736842


Epoch[2] Batch[100] Speed: 1.254161340766955 samples/sec                   batch loss = 264.03161454200745 | accuracy = 0.585


Epoch[2] Batch[105] Speed: 1.2598188878789143 samples/sec                   batch loss = 276.54955077171326 | accuracy = 0.5928571428571429


Epoch[2] Batch[110] Speed: 1.2539242832910107 samples/sec                   batch loss = 288.5868978500366 | accuracy = 0.5977272727272728


Epoch[2] Batch[115] Speed: 1.2583633356782178 samples/sec                   batch loss = 302.07425832748413 | accuracy = 0.6021739130434782


Epoch[2] Batch[120] Speed: 1.250921272232515 samples/sec                   batch loss = 313.6469442844391 | accuracy = 0.6083333333333333


Epoch[2] Batch[125] Speed: 1.2515054125028682 samples/sec                   batch loss = 325.1834304332733 | accuracy = 0.614


Epoch[2] Batch[130] Speed: 1.249382819386639 samples/sec                   batch loss = 338.281179189682 | accuracy = 0.6192307692307693


Epoch[2] Batch[135] Speed: 1.2512847561364893 samples/sec                   batch loss = 353.71549916267395 | accuracy = 0.6148148148148148


Epoch[2] Batch[140] Speed: 1.251624173514938 samples/sec                   batch loss = 365.83789896965027 | accuracy = 0.6160714285714286


Epoch[2] Batch[145] Speed: 1.2585668576059872 samples/sec                   batch loss = 377.6493835449219 | accuracy = 0.6189655172413793


Epoch[2] Batch[150] Speed: 1.2488487172909986 samples/sec                   batch loss = 389.02157950401306 | accuracy = 0.6233333333333333


Epoch[2] Batch[155] Speed: 1.2551222047014594 samples/sec                   batch loss = 403.07002902030945 | accuracy = 0.6209677419354839


Epoch[2] Batch[160] Speed: 1.2548991448664686 samples/sec                   batch loss = 415.068865776062 | accuracy = 0.6265625


Epoch[2] Batch[165] Speed: 1.2574259651933504 samples/sec                   batch loss = 425.8666943311691 | accuracy = 0.6287878787878788


Epoch[2] Batch[170] Speed: 1.2554575068612635 samples/sec                   batch loss = 437.1314194202423 | accuracy = 0.6294117647058823


Epoch[2] Batch[175] Speed: 1.2576017512799045 samples/sec                   batch loss = 450.33915317058563 | accuracy = 0.6285714285714286


Epoch[2] Batch[180] Speed: 1.2537010868956036 samples/sec                   batch loss = 463.91154992580414 | accuracy = 0.6305555555555555


Epoch[2] Batch[185] Speed: 1.2553201709853206 samples/sec                   batch loss = 475.78470957279205 | accuracy = 0.6310810810810811


Epoch[2] Batch[190] Speed: 1.2535216129388482 samples/sec                   batch loss = 486.48013520240784 | accuracy = 0.6355263157894737


Epoch[2] Batch[195] Speed: 1.2584310116853983 samples/sec                   batch loss = 498.1932530403137 | accuracy = 0.6371794871794871


Epoch[2] Batch[200] Speed: 1.2529572045605353 samples/sec                   batch loss = 510.38563323020935 | accuracy = 0.63625


Epoch[2] Batch[205] Speed: 1.2487497219061168 samples/sec                   batch loss = 522.367636680603 | accuracy = 0.6365853658536585


Epoch[2] Batch[210] Speed: 1.249681084778042 samples/sec                   batch loss = 535.789999961853 | accuracy = 0.6357142857142857


Epoch[2] Batch[215] Speed: 1.2524416400092955 samples/sec                   batch loss = 549.4604312181473 | accuracy = 0.6360465116279069


Epoch[2] Batch[220] Speed: 1.2535074707769607 samples/sec                   batch loss = 560.0560946464539 | accuracy = 0.6375


Epoch[2] Batch[225] Speed: 1.2531997938674597 samples/sec                   batch loss = 573.2740993499756 | accuracy = 0.6333333333333333


Epoch[2] Batch[230] Speed: 1.254215157514971 samples/sec                   batch loss = 586.0341399908066 | accuracy = 0.6326086956521739


Epoch[2] Batch[235] Speed: 1.255608968122297 samples/sec                   batch loss = 599.1927984952927 | accuracy = 0.6308510638297873


Epoch[2] Batch[240] Speed: 1.2551407966005326 samples/sec                   batch loss = 610.2161602973938 | accuracy = 0.634375


Epoch[2] Batch[245] Speed: 1.260074362732562 samples/sec                   batch loss = 622.5647013187408 | accuracy = 0.6336734693877552


Epoch[2] Batch[250] Speed: 1.253798151633713 samples/sec                   batch loss = 633.2073653936386 | accuracy = 0.636


Epoch[2] Batch[255] Speed: 1.2481148515720637 samples/sec                   batch loss = 646.7224645614624 | accuracy = 0.6362745098039215


Epoch[2] Batch[260] Speed: 1.2502553090267614 samples/sec                   batch loss = 660.9424631595612 | accuracy = 0.6346153846153846


Epoch[2] Batch[265] Speed: 1.255389492726512 samples/sec                   batch loss = 672.0229803323746 | accuracy = 0.6349056603773585


Epoch[2] Batch[270] Speed: 1.257281508652439 samples/sec                   batch loss = 685.0276226997375 | accuracy = 0.6361111111111111


Epoch[2] Batch[275] Speed: 1.2529179985174579 samples/sec                   batch loss = 695.1157855987549 | accuracy = 0.6381818181818182


Epoch[2] Batch[280] Speed: 1.2540839052583463 samples/sec                   batch loss = 707.7830281257629 | accuracy = 0.6392857142857142


Epoch[2] Batch[285] Speed: 1.2572296895423771 samples/sec                   batch loss = 721.02570271492 | accuracy = 0.6385964912280702


Epoch[2] Batch[290] Speed: 1.259033620650257 samples/sec                   batch loss = 732.3005127906799 | accuracy = 0.6405172413793103


Epoch[2] Batch[295] Speed: 1.2502846583618985 samples/sec                   batch loss = 744.7941105365753 | accuracy = 0.6415254237288136


Epoch[2] Batch[300] Speed: 1.2557028511055874 samples/sec                   batch loss = 758.1571333408356 | accuracy = 0.6416666666666667


Epoch[2] Batch[305] Speed: 1.2602489021508236 samples/sec                   batch loss = 768.5582838058472 | accuracy = 0.6442622950819672


Epoch[2] Batch[310] Speed: 1.256762850697267 samples/sec                   batch loss = 781.7004888057709 | accuracy = 0.6435483870967742


Epoch[2] Batch[315] Speed: 1.2581941303228457 samples/sec                   batch loss = 795.9143490791321 | accuracy = 0.6428571428571429


Epoch[2] Batch[320] Speed: 1.2582757546651115 samples/sec                   batch loss = 807.6343258619308 | accuracy = 0.64453125


Epoch[2] Batch[325] Speed: 1.2564713582964273 samples/sec                   batch loss = 820.5126506090164 | accuracy = 0.6438461538461538


Epoch[2] Batch[330] Speed: 1.2520587021922276 samples/sec                   batch loss = 833.8587840795517 | accuracy = 0.6424242424242425


Epoch[2] Batch[335] Speed: 1.2573211767121666 samples/sec                   batch loss = 847.2861415147781 | accuracy = 0.6425373134328358


Epoch[2] Batch[340] Speed: 1.2581592190979167 samples/sec                   batch loss = 856.7108447551727 | accuracy = 0.6470588235294118


Epoch[2] Batch[345] Speed: 1.255072253437995 samples/sec                   batch loss = 868.0438905954361 | accuracy = 0.6478260869565218


Epoch[2] Batch[350] Speed: 1.2612473746590946 samples/sec                   batch loss = 881.6870459318161 | accuracy = 0.6478571428571429


Epoch[2] Batch[355] Speed: 1.2593218604936744 samples/sec                   batch loss = 891.8349516391754 | accuracy = 0.65


Epoch[2] Batch[360] Speed: 1.2545278360342684 samples/sec                   batch loss = 903.9331052303314 | accuracy = 0.65


Epoch[2] Batch[365] Speed: 1.2571579035788711 samples/sec                   batch loss = 914.6405713558197 | accuracy = 0.65


Epoch[2] Batch[370] Speed: 1.2556094379721432 samples/sec                   batch loss = 926.5665663480759 | accuracy = 0.6493243243243243


Epoch[2] Batch[375] Speed: 1.2602586527958122 samples/sec                   batch loss = 937.2196513414383 | accuracy = 0.6506666666666666


Epoch[2] Batch[380] Speed: 1.2538688047257323 samples/sec                   batch loss = 949.5043227672577 | accuracy = 0.6513157894736842


Epoch[2] Batch[385] Speed: 1.2651923498211433 samples/sec                   batch loss = 959.7241183519363 | accuracy = 0.6525974025974026


Epoch[2] Batch[390] Speed: 1.2564419059806824 samples/sec                   batch loss = 973.8124071359634 | accuracy = 0.6532051282051282


Epoch[2] Batch[395] Speed: 1.2601672109269069 samples/sec                   batch loss = 985.0323415994644 | accuracy = 0.6537974683544304


Epoch[2] Batch[400] Speed: 1.2516444360954477 samples/sec                   batch loss = 1000.681260228157 | accuracy = 0.65


Epoch[2] Batch[405] Speed: 1.253751491225422 samples/sec                   batch loss = 1016.3080273866653 | accuracy = 0.6481481481481481


Epoch[2] Batch[410] Speed: 1.2587580733694128 samples/sec                   batch loss = 1028.1531575918198 | accuracy = 0.6481707317073171


Epoch[2] Batch[415] Speed: 1.2618310461796238 samples/sec                   batch loss = 1039.903116941452 | accuracy = 0.6493975903614457


Epoch[2] Batch[420] Speed: 1.2590261565188678 samples/sec                   batch loss = 1049.9404920339584 | accuracy = 0.6511904761904762


Epoch[2] Batch[425] Speed: 1.2558343484723662 samples/sec                   batch loss = 1060.1528601646423 | accuracy = 0.6535294117647059


Epoch[2] Batch[430] Speed: 1.250968748254747 samples/sec                   batch loss = 1073.08577978611 | accuracy = 0.6546511627906977


Epoch[2] Batch[435] Speed: 1.2524277091938671 samples/sec                   batch loss = 1083.13516831398 | accuracy = 0.6551724137931034


Epoch[2] Batch[440] Speed: 1.2476424160239337 samples/sec                   batch loss = 1093.4478458166122 | accuracy = 0.65625


Epoch[2] Batch[445] Speed: 1.2495360755792388 samples/sec                   batch loss = 1103.5029277801514 | accuracy = 0.6578651685393259


Epoch[2] Batch[450] Speed: 1.2468824438432617 samples/sec                   batch loss = 1116.5333080291748 | accuracy = 0.6577777777777778


Epoch[2] Batch[455] Speed: 1.254041910337868 samples/sec                   batch loss = 1129.9010145664215 | accuracy = 0.6576923076923077


Epoch[2] Batch[460] Speed: 1.2506410599026023 samples/sec                   batch loss = 1141.2656973600388 | accuracy = 0.657608695652174


Epoch[2] Batch[465] Speed: 1.2507964899719128 samples/sec                   batch loss = 1153.181091427803 | accuracy = 0.6591397849462366


Epoch[2] Batch[470] Speed: 1.2520495452096978 samples/sec                   batch loss = 1159.8142845630646 | accuracy = 0.6622340425531915


Epoch[2] Batch[475] Speed: 1.2507197491227295 samples/sec                   batch loss = 1172.1007368564606 | accuracy = 0.6610526315789473


Epoch[2] Batch[480] Speed: 1.251673570540712 samples/sec                   batch loss = 1181.430580496788 | accuracy = 0.6619791666666667


Epoch[2] Batch[485] Speed: 1.2516195048105825 samples/sec                   batch loss = 1191.925421833992 | accuracy = 0.6623711340206185


Epoch[2] Batch[490] Speed: 1.2550001503557147 samples/sec                   batch loss = 1206.9035719633102 | accuracy = 0.6612244897959184


Epoch[2] Batch[495] Speed: 1.2550325393872943 samples/sec                   batch loss = 1218.7570799589157 | accuracy = 0.6616161616161617


Epoch[2] Batch[500] Speed: 1.252814989080854 samples/sec                   batch loss = 1232.1658881902695 | accuracy = 0.6605


Epoch[2] Batch[505] Speed: 1.2514673241217613 samples/sec                   batch loss = 1243.8416566848755 | accuracy = 0.6608910891089109


Epoch[2] Batch[510] Speed: 1.2493437436773125 samples/sec                   batch loss = 1255.3628772497177 | accuracy = 0.6607843137254902


Epoch[2] Batch[515] Speed: 1.254226408986775 samples/sec                   batch loss = 1268.22822701931 | accuracy = 0.6606796116504854


Epoch[2] Batch[520] Speed: 1.2487267646551388 samples/sec                   batch loss = 1280.457295536995 | accuracy = 0.6605769230769231


Epoch[2] Batch[525] Speed: 1.245203796794738 samples/sec                   batch loss = 1291.7263848781586 | accuracy = 0.660952380952381


Epoch[2] Batch[530] Speed: 1.257332578209059 samples/sec                   batch loss = 1304.547733783722 | accuracy = 0.6613207547169812


Epoch[2] Batch[535] Speed: 1.2489679973658632 samples/sec                   batch loss = 1314.2560496330261 | accuracy = 0.6621495327102803


Epoch[2] Batch[540] Speed: 1.2582936851300281 samples/sec                   batch loss = 1325.08205640316 | accuracy = 0.663425925925926


Epoch[2] Batch[545] Speed: 1.2537038037542778 samples/sec                   batch loss = 1337.7947766780853 | accuracy = 0.6628440366972477


Epoch[2] Batch[550] Speed: 1.2554559097597962 samples/sec                   batch loss = 1350.0848722457886 | accuracy = 0.6631818181818182


Epoch[2] Batch[555] Speed: 1.2570700193297222 samples/sec                   batch loss = 1361.0595602989197 | accuracy = 0.6644144144144144


Epoch[2] Batch[560] Speed: 1.2467932106715267 samples/sec                   batch loss = 1371.582801938057 | accuracy = 0.6642857142857143


Epoch[2] Batch[565] Speed: 1.2573051584295016 samples/sec                   batch loss = 1384.871463418007 | accuracy = 0.6641592920353983


Epoch[2] Batch[570] Speed: 1.252637545701232 samples/sec                   batch loss = 1395.726288676262 | accuracy = 0.6644736842105263


Epoch[2] Batch[575] Speed: 1.2510886202587395 samples/sec                   batch loss = 1406.1689838171005 | accuracy = 0.6647826086956522


Epoch[2] Batch[580] Speed: 1.2515994297787307 samples/sec                   batch loss = 1416.110517501831 | accuracy = 0.665948275862069


Epoch[2] Batch[585] Speed: 1.2509206193457374 samples/sec                   batch loss = 1428.1845690011978 | accuracy = 0.6662393162393162


Epoch[2] Batch[590] Speed: 1.2531933348332318 samples/sec                   batch loss = 1439.3707201480865 | accuracy = 0.6661016949152543


Epoch[2] Batch[595] Speed: 1.252258787661607 samples/sec                   batch loss = 1452.496024608612 | accuracy = 0.6663865546218487


Epoch[2] Batch[600] Speed: 1.2539637397849088 samples/sec                   batch loss = 1466.0974864959717 | accuracy = 0.665


Epoch[2] Batch[605] Speed: 1.2506522473395563 samples/sec                   batch loss = 1477.6968054771423 | accuracy = 0.6652892561983471


Epoch[2] Batch[610] Speed: 1.2521146748234304 samples/sec                   batch loss = 1489.3260890245438 | accuracy = 0.6647540983606557


Epoch[2] Batch[615] Speed: 1.2498852529668283 samples/sec                   batch loss = 1498.542409658432 | accuracy = 0.6658536585365854


Epoch[2] Batch[620] Speed: 1.254835696201585 samples/sec                   batch loss = 1511.4943350553513 | accuracy = 0.6649193548387097


Epoch[2] Batch[625] Speed: 1.2538399428189984 samples/sec                   batch loss = 1521.7435740232468 | accuracy = 0.6656


Epoch[2] Batch[630] Speed: 1.253155986071377 samples/sec                   batch loss = 1535.0429193973541 | accuracy = 0.6638888888888889


Epoch[2] Batch[635] Speed: 1.2514003016688455 samples/sec                   batch loss = 1546.7511403560638 | accuracy = 0.6637795275590551


Epoch[2] Batch[640] Speed: 1.2477259247663008 samples/sec                   batch loss = 1555.614905834198 | accuracy = 0.665234375


Epoch[2] Batch[645] Speed: 1.254145965401506 samples/sec                   batch loss = 1565.8664045333862 | accuracy = 0.665891472868217


Epoch[2] Batch[650] Speed: 1.2473162823122836 samples/sec                   batch loss = 1576.12195789814 | accuracy = 0.666923076923077


Epoch[2] Batch[655] Speed: 1.2517979674679975 samples/sec                   batch loss = 1589.1795222759247 | accuracy = 0.667557251908397


Epoch[2] Batch[660] Speed: 1.2560939462621523 samples/sec                   batch loss = 1602.030465245247 | accuracy = 0.6674242424242425


Epoch[2] Batch[665] Speed: 1.2528755202760546 samples/sec                   batch loss = 1613.2389924526215 | accuracy = 0.6669172932330827


Epoch[2] Batch[670] Speed: 1.2539225963703755 samples/sec                   batch loss = 1627.6345217227936 | accuracy = 0.6656716417910448


Epoch[2] Batch[675] Speed: 1.2538284171690597 samples/sec                   batch loss = 1640.0214315652847 | accuracy = 0.6651851851851852


Epoch[2] Batch[680] Speed: 1.254432721751981 samples/sec                   batch loss = 1652.3091450929642 | accuracy = 0.6654411764705882


Epoch[2] Batch[685] Speed: 1.2535550496502876 samples/sec                   batch loss = 1662.2597187757492 | accuracy = 0.666058394160584


Epoch[2] Batch[690] Speed: 1.2523569379584076 samples/sec                   batch loss = 1672.0092356204987 | accuracy = 0.6663043478260869


Epoch[2] Batch[695] Speed: 1.2549155712320537 samples/sec                   batch loss = 1683.6475546360016 | accuracy = 0.6661870503597123


Epoch[2] Batch[700] Speed: 1.2511443195818606 samples/sec                   batch loss = 1694.6768202781677 | accuracy = 0.6664285714285715


Epoch[2] Batch[705] Speed: 1.2508944111678821 samples/sec                   batch loss = 1705.80322599411 | accuracy = 0.6663120567375886


Epoch[2] Batch[710] Speed: 1.2520246912179849 samples/sec                   batch loss = 1718.1983925104141 | accuracy = 0.6669014084507042


Epoch[2] Batch[715] Speed: 1.2475198639903347 samples/sec                   batch loss = 1727.1266663074493 | accuracy = 0.6678321678321678


Epoch[2] Batch[720] Speed: 1.25496823236593 samples/sec                   batch loss = 1735.2707425355911 | accuracy = 0.6690972222222222


Epoch[2] Batch[725] Speed: 1.2540083539186528 samples/sec                   batch loss = 1746.9094514846802 | accuracy = 0.6686206896551724


Epoch[2] Batch[730] Speed: 1.260850411121975 samples/sec                   batch loss = 1755.373864531517 | accuracy = 0.6698630136986301


Epoch[2] Batch[735] Speed: 1.259768372899853 samples/sec                   batch loss = 1769.4757288694382 | accuracy = 0.669047619047619


Epoch[2] Batch[740] Speed: 1.2561016577977513 samples/sec                   batch loss = 1780.5044351816177 | accuracy = 0.6692567567567568


Epoch[2] Batch[745] Speed: 1.2649255447433472 samples/sec                   batch loss = 1793.5268646478653 | accuracy = 0.6694630872483222


Epoch[2] Batch[750] Speed: 1.2583459695083228 samples/sec                   batch loss = 1805.3569012880325 | accuracy = 0.6693333333333333


Epoch[2] Batch[755] Speed: 1.2622295771709493 samples/sec                   batch loss = 1814.4383660554886 | accuracy = 0.6708609271523179


Epoch[2] Batch[760] Speed: 1.25861387713275 samples/sec                   batch loss = 1822.478872179985 | accuracy = 0.6723684210526316


Epoch[2] Batch[765] Speed: 1.256796554697326 samples/sec                   batch loss = 1836.2055401802063 | accuracy = 0.6722222222222223


Epoch[2] Batch[770] Speed: 1.2564165950298007 samples/sec                   batch loss = 1849.9090013504028 | accuracy = 0.6724025974025974


Epoch[2] Batch[775] Speed: 1.2579528106939233 samples/sec                   batch loss = 1861.5961911678314 | accuracy = 0.6719354838709677


Epoch[2] Batch[780] Speed: 1.2558351005021928 samples/sec                   batch loss = 1874.487429857254 | accuracy = 0.6714743589743589


Epoch[2] Batch[785] Speed: 1.256724159227755 samples/sec                   batch loss = 1884.9904057979584 | accuracy = 0.6719745222929936


[Epoch 2] training: accuracy=0.6725888324873096
[Epoch 2] time cost: 642.5773108005524
[Epoch 2] validation: validation accuracy=0.7211111111111111

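To compare epochs without scanning the full log, you can extract the per-epoch summary lines. The following is a minimal sketch, not part of the original training script. It assumes the summary lines keep the exact `[Epoch N] ...` format printed above, and the names `log_text`, `SUMMARY_RE`, and `parse_epoch_summaries` are hypothetical helpers introduced only for illustration.

```python
import re

# Epoch summary lines copied verbatim from the training output above.
log_text = """\
[Epoch 1] training: accuracy=0.5824873096446701
[Epoch 1] time cost: 646.6458299160004
[Epoch 1] validation: validation accuracy=0.64
[Epoch 2] training: accuracy=0.6725888324873096
[Epoch 2] time cost: 642.5773108005524
[Epoch 2] validation: validation accuracy=0.7211111111111111
"""

# Captures the epoch number, the summary label, and the trailing number.
# The pattern is an assumption based on the log format shown in this tutorial.
SUMMARY_RE = re.compile(
    r"^\[Epoch (\d+)\] (training|time cost|validation):.*?([0-9]+(?:\.[0-9]+)?)$"
)


def parse_epoch_summaries(text):
    """Return {epoch: {label: value}} for every epoch summary line in text."""
    summaries = {}
    for line in text.splitlines():
        match = SUMMARY_RE.match(line.strip())
        if match is None:
            continue  # skip per-batch lines and blank lines
        epoch = int(match.group(1))
        label = match.group(2)
        value = float(match.group(3))
        summaries.setdefault(epoch, {})[label] = value
    return summaries


summaries = parse_epoch_summaries(log_text)
for epoch, stats in sorted(summaries.items()):
    print(
        f"epoch {epoch}: train acc={stats['training']:.3f}, "
        f"val acc={stats['validation']:.3f}, time={stats['time cost']:.1f}s"
    )
```

Per-batch lines do not match the pattern and are skipped, so you can also pass the complete log text to the same function.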

## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you want to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).