<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a Neural Network Using a GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference can be faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one Nvidia GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find instructions for installing the GPU version of MXNet on your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).
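
Before installing MXNet, it can help to confirm that the Nvidia driver tools can see a GPU. The following is a minimal sketch using only the Python standard library; it assumes the `nvidia-smi` utility that ships with the Nvidia driver is on your `PATH`, and the helper name `list_nvidia_gpus` is hypothetical, not part of MXNet.

```python
import shutil
import subprocess


def list_nvidia_gpus():
    """Return the lines printed by `nvidia-smi -L`, or [] if it is unavailable.

    Hypothetical helper: an empty list only means nvidia-smi could not be
    found or run, not that MXNet itself is misconfigured.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=10, check=False
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    # Each detected GPU is reported on its own line starting with "GPU ".
    return [line for line in result.stdout.splitlines() if line.startswith("GPU ")]


detected = list_nvidia_gpus()
print(f"nvidia-smi reports {len(detected)} GPU(s)")
```

This check is independent of MXNet; the count that MXNet itself can use is the one reported by `npx.num_gpus()`.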

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # This call returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, use `npx.gpu()` or, equivalently, `npx.gpu(0)`, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[10:27:02] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. In the example code here, you will copy `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the conditional in this code falls back to `npx.cpu()` instead of raising an error:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[10:27:03] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through 0-based indexing with `npx`. As you saw before, the first GPU is accessed using `npx.gpu(0)` and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has, so if your machine has eight GPUs, the last one is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training, which is particularly useful when you want to leverage multiple GPUs while training neural networks.
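
The 0-based numbering can be illustrated without any GPU at all. The following plain-Python sketch computes which GPU ids to use when you ask for a certain number of devices; the helper `select_gpu_ids` is hypothetical and not part of MXNet.

```python
def select_gpu_ids(num_available, num_requested):
    """Return the 0-based ids of the GPUs to use (hypothetical helper).

    At most num_requested ids are returned, and never more than the
    number of GPUs that are actually available.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, the last id is 7.
ids_on_eight_gpu_machine = select_gpu_ids(8, 8)

# Asking for two GPUs on a single-GPU machine yields only id 0.
ids_on_single_gpu_machine = select_gpu_ids(1, 2)

print(ids_on_eight_gpu_machine, ids_on_single_gpu_machine)
```

With MXNet, such a list of ids could then be mapped to devices with `[npx.gpu(i) for i in ids]`, falling back to `[npx.cpu()]` when the list is empty.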

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[10:27:03] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 3.968991 , -4.6689606]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
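
To make the idea concrete, here is a small plain-Python sketch of data parallelism on a toy one-parameter model. It is an illustration under stated assumptions, not how MXNet implements it: the helpers `split_batch` and `grad_sum`, the squared-error loss, and the sample values are all made up for this example. It shows that summing the gradients computed on each part and dividing by the full batch size gives the same result as processing the whole batch on one device.

```python
import math


def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous slices of near-equal size."""
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    base, extra = divmod(len(batch), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        size = base + (1 if i < extra else 0)  # first `extra` parts get one more
        parts.append(batch[start:start + size])
        start += size
    return parts


def grad_sum(w, samples):
    """Sum over samples of d/dw (w * x - y)**2 for a one-parameter model."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


# Toy batch of (input, target) pairs and a toy weight (illustrative values).
batch = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.5)]
w = 1.5

# Each "device" computes the gradient on its own shard.
shards = split_batch(batch, 2)
per_device_grads = [grad_sum(w, shard) for shard in shards]

# Aggregate across devices and normalize by the full batch size.
data_parallel_grad = sum(per_device_grads) / len(batch)
full_batch_grad = grad_sum(w, batch) / len(batch)

print(math.isclose(data_parallel_grad, full_batch_grad))
```

In the training loop later in this step, `gluon.utils.split_and_load` plays the role of the splitting step, and `trainer.step(batch_size)` is called with the full batch size.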

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
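
The helper above passes a list of per-device labels and a list of per-device outputs to the metric. As a rough illustration of what aggregating accuracy over several shards means, here is a plain-Python sketch; it is not the implementation of `gluon.metric.Accuracy`, and the function name and sample values are hypothetical.

```python
def accuracy_across_devices(label_shards, output_shards):
    """Fraction of correct predictions over all shards (hypothetical helper).

    label_shards:  one list of integer class labels per device
    output_shards: one list of per-class score lists per device
    """
    correct = 0
    total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            # The predicted class is the index of the highest score.
            predicted = max(range(len(scores)), key=scores.__getitem__)
            correct += int(predicted == label)
            total += 1
    return correct / total if total else 0.0


# Two "devices", two samples each, two classes (illustrative values).
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],
    [[1.2, 0.1], [0.8, 0.2]],
]
overall_accuracy = accuracy_across_devices(label_shards, output_shards)
print(overall_accuracy)
```

The counts are accumulated over every shard before dividing, so the result does not depend on how the batch was split.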

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
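            _, acc = accuracy.get()

            # Note: train_loss is the cumulative loss for the epoch so far, and
            # btic was reset log_interval batches ago, so the elapsed time
            # below covers log_interval batches rather than a single batch.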
            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7755869076772223 samples/sec                   batch loss = 14.49501895904541 | accuracy = 0.5


Epoch[1] Batch[10] Speed: 1.2632253899537942 samples/sec                   batch loss = 27.57714581489563 | accuracy = 0.525


Epoch[1] Batch[15] Speed: 1.2647515191409842 samples/sec                   batch loss = 42.012858152389526 | accuracy = 0.5333333333333333


Epoch[1] Batch[20] Speed: 1.2588669742823624 samples/sec                   batch loss = 54.32433342933655 | accuracy = 0.5875


Epoch[1] Batch[25] Speed: 1.263576646343406 samples/sec                   batch loss = 68.45992112159729 | accuracy = 0.56


Epoch[1] Batch[30] Speed: 1.2612185513185379 samples/sec                   batch loss = 83.73548197746277 | accuracy = 0.5166666666666667


Epoch[1] Batch[35] Speed: 1.261603508471404 samples/sec                   batch loss = 98.02770137786865 | accuracy = 0.5214285714285715


Epoch[1] Batch[40] Speed: 1.2598390382423845 samples/sec                   batch loss = 112.16803979873657 | accuracy = 0.51875


Epoch[1] Batch[45] Speed: 1.268503221761631 samples/sec                   batch loss = 126.336665391922 | accuracy = 0.5055555555555555


Epoch[1] Batch[50] Speed: 1.2591428527470268 samples/sec                   batch loss = 139.8995656967163 | accuracy = 0.52


Epoch[1] Batch[55] Speed: 1.2538619639588644 samples/sec                   batch loss = 153.32241201400757 | accuracy = 0.5318181818181819


Epoch[1] Batch[60] Speed: 1.2592921798714753 samples/sec                   batch loss = 167.6457724571228 | accuracy = 0.5125


Epoch[1] Batch[65] Speed: 1.2600495676704448 samples/sec                   batch loss = 181.80406713485718 | accuracy = 0.5


Epoch[1] Batch[70] Speed: 1.2645676285105867 samples/sec                   batch loss = 196.6247661113739 | accuracy = 0.5


Epoch[1] Batch[75] Speed: 1.263701897637472 samples/sec                   batch loss = 209.96219539642334 | accuracy = 0.51


Epoch[1] Batch[80] Speed: 1.2590972110796728 samples/sec                   batch loss = 224.40687561035156 | accuracy = 0.509375


Epoch[1] Batch[85] Speed: 1.256758520146755 samples/sec                   batch loss = 238.8946931362152 | accuracy = 0.4970588235294118


Epoch[1] Batch[90] Speed: 1.2541214030998684 samples/sec                   batch loss = 252.31680965423584 | accuracy = 0.5055555555555555


Epoch[1] Batch[95] Speed: 1.2547073167557845 samples/sec                   batch loss = 266.03530621528625 | accuracy = 0.5105263157894737


Epoch[1] Batch[100] Speed: 1.2529174371122256 samples/sec                   batch loss = 279.41274404525757 | accuracy = 0.515


Epoch[1] Batch[105] Speed: 1.2578311482575248 samples/sec                   batch loss = 293.36857748031616 | accuracy = 0.5142857142857142


Epoch[1] Batch[110] Speed: 1.2540323493874366 samples/sec                   batch loss = 307.3271267414093 | accuracy = 0.5113636363636364


Epoch[1] Batch[115] Speed: 1.246912283745035 samples/sec                   batch loss = 321.10740303993225 | accuracy = 0.5130434782608696


Epoch[1] Batch[120] Speed: 1.2555204546845793 samples/sec                   batch loss = 334.7415373325348 | accuracy = 0.5145833333333333


Epoch[1] Batch[125] Speed: 1.253315506107864 samples/sec                   batch loss = 348.11017751693726 | accuracy = 0.52


Epoch[1] Batch[130] Speed: 1.251163167041966 samples/sec                   batch loss = 362.35156440734863 | accuracy = 0.5173076923076924


Epoch[1] Batch[135] Speed: 1.25401660225747 samples/sec                   batch loss = 375.9869945049286 | accuracy = 0.524074074074074


Epoch[1] Batch[140] Speed: 1.2575620655144149 samples/sec                   batch loss = 389.1055610179901 | accuracy = 0.5267857142857143


Epoch[1] Batch[145] Speed: 1.2604656292570313 samples/sec                   batch loss = 402.7354462146759 | accuracy = 0.5275862068965518


Epoch[1] Batch[150] Speed: 1.2577984260216453 samples/sec                   batch loss = 416.1617703437805 | accuracy = 0.53


Epoch[1] Batch[155] Speed: 1.2613825016313105 samples/sec                   batch loss = 429.55040526390076 | accuracy = 0.5370967741935484


Epoch[1] Batch[160] Speed: 1.260628152469047 samples/sec                   batch loss = 443.53018021583557 | accuracy = 0.5359375


Epoch[1] Batch[165] Speed: 1.2629727241452253 samples/sec                   batch loss = 456.6404526233673 | accuracy = 0.5424242424242425


Epoch[1] Batch[170] Speed: 1.2584783042618861 samples/sec                   batch loss = 469.86747550964355 | accuracy = 0.5455882352941176


Epoch[1] Batch[175] Speed: 1.259587630568763 samples/sec                   batch loss = 483.19159746170044 | accuracy = 0.55


Epoch[1] Batch[180] Speed: 1.2592885880439455 samples/sec                   batch loss = 497.24535846710205 | accuracy = 0.55


Epoch[1] Batch[185] Speed: 1.2624884058296009 samples/sec                   batch loss = 510.86818194389343 | accuracy = 0.55


Epoch[1] Batch[190] Speed: 1.26314483402474 samples/sec                   batch loss = 524.8391461372375 | accuracy = 0.5486842105263158


Epoch[1] Batch[195] Speed: 1.2590177476667348 samples/sec                   batch loss = 538.3814206123352 | accuracy = 0.5474358974358975


Epoch[1] Batch[200] Speed: 1.2635032773694768 samples/sec                   batch loss = 551.9571163654327 | accuracy = 0.54625


Epoch[1] Batch[205] Speed: 1.2620666408043 samples/sec                   batch loss = 564.9939124584198 | accuracy = 0.552439024390244


Epoch[1] Batch[210] Speed: 1.2587627010387417 samples/sec                   batch loss = 578.4918775558472 | accuracy = 0.5523809523809524


Epoch[1] Batch[215] Speed: 1.2555943089837338 samples/sec                   batch loss = 592.0113801956177 | accuracy = 0.5534883720930233


Epoch[1] Batch[220] Speed: 1.2660508573560063 samples/sec                   batch loss = 605.6076934337616 | accuracy = 0.553409090909091


Epoch[1] Batch[225] Speed: 1.2582354601691146 samples/sec                   batch loss = 618.8210785388947 | accuracy = 0.5544444444444444


Epoch[1] Batch[230] Speed: 1.2609488704104423 samples/sec                   batch loss = 631.9612345695496 | accuracy = 0.5576086956521739


Epoch[1] Batch[235] Speed: 1.2649574943714 samples/sec                   batch loss = 645.3291981220245 | accuracy = 0.5585106382978723


Epoch[1] Batch[240] Speed: 1.2719789977947442 samples/sec                   batch loss = 658.3184404373169 | accuracy = 0.5614583333333333


Epoch[1] Batch[245] Speed: 1.2598508638732033 samples/sec                   batch loss = 671.4879791736603 | accuracy = 0.5612244897959183


Epoch[1] Batch[250] Speed: 1.2576408739041043 samples/sec                   batch loss = 685.4396710395813 | accuracy = 0.561


Epoch[1] Batch[255] Speed: 1.2550191141655476 samples/sec                   batch loss = 698.9500122070312 | accuracy = 0.5617647058823529


Epoch[1] Batch[260] Speed: 1.258260561355331 samples/sec                   batch loss = 712.0916931629181 | accuracy = 0.5644230769230769


Epoch[1] Batch[265] Speed: 1.2621490532424144 samples/sec                   batch loss = 725.4225850105286 | accuracy = 0.5641509433962264


Epoch[1] Batch[270] Speed: 1.2590302192629519 samples/sec                   batch loss = 738.9206488132477 | accuracy = 0.5648148148148148


Epoch[1] Batch[275] Speed: 1.2577366639048844 samples/sec                   batch loss = 752.7672154903412 | accuracy = 0.5654545454545454


Epoch[1] Batch[280] Speed: 1.2610449754389905 samples/sec                   batch loss = 766.0111916065216 | accuracy = 0.5669642857142857


Epoch[1] Batch[285] Speed: 1.2613575602289164 samples/sec                   batch loss = 779.7230176925659 | accuracy = 0.5675438596491228


Epoch[1] Batch[290] Speed: 1.2580833646281635 samples/sec                   batch loss = 792.2547569274902 | accuracy = 0.5698275862068966


Epoch[1] Batch[295] Speed: 1.2621564594954704 samples/sec                   batch loss = 805.3456194400787 | accuracy = 0.5711864406779661


Epoch[1] Batch[300] Speed: 1.2643989423107793 samples/sec                   batch loss = 819.4486701488495 | accuracy = 0.5683333333333334


Epoch[1] Batch[305] Speed: 1.2618688189287062 samples/sec                   batch loss = 833.7827870845795 | accuracy = 0.5680327868852459


Epoch[1] Batch[310] Speed: 1.2653874930610867 samples/sec                   batch loss = 847.5943133831024 | accuracy = 0.5669354838709677


Epoch[1] Batch[315] Speed: 1.2627984750269519 samples/sec                   batch loss = 861.084047794342 | accuracy = 0.5682539682539682


Epoch[1] Batch[320] Speed: 1.2631148778290695 samples/sec                   batch loss = 874.4999787807465 | accuracy = 0.5703125


Epoch[1] Batch[325] Speed: 1.2579924267986353 samples/sec                   batch loss = 887.1784009933472 | accuracy = 0.5723076923076923


Epoch[1] Batch[330] Speed: 1.2638959148369466 samples/sec                   batch loss = 900.7878260612488 | accuracy = 0.5712121212121212


Epoch[1] Batch[335] Speed: 1.2659428115415192 samples/sec                   batch loss = 914.4591820240021 | accuracy = 0.5708955223880597


Epoch[1] Batch[340] Speed: 1.2620428115637952 samples/sec                   batch loss = 929.3430035114288 | accuracy = 0.5661764705882353


Epoch[1] Batch[345] Speed: 1.26411923210172 samples/sec                   batch loss = 942.814290523529 | accuracy = 0.5659420289855073


Epoch[1] Batch[350] Speed: 1.2616213441595099 samples/sec                   batch loss = 956.3260016441345 | accuracy = 0.5664285714285714


Epoch[1] Batch[355] Speed: 1.2636806716967193 samples/sec                   batch loss = 969.0517494678497 | accuracy = 0.5690140845070423


Epoch[1] Batch[360] Speed: 1.262393125542654 samples/sec                   batch loss = 981.7171382904053 | accuracy = 0.5715277777777777


Epoch[1] Batch[365] Speed: 1.2538838921603483 samples/sec                   batch loss = 995.5691065788269 | accuracy = 0.5705479452054795


Epoch[1] Batch[370] Speed: 1.2568705592069285 samples/sec                   batch loss = 1008.7463445663452 | accuracy = 0.572972972972973


Epoch[1] Batch[375] Speed: 1.2559858066522447 samples/sec                   batch loss = 1022.2658891677856 | accuracy = 0.5726666666666667


Epoch[1] Batch[380] Speed: 1.2518211311879268 samples/sec                   batch loss = 1035.6620800495148 | accuracy = 0.5743421052631579


Epoch[1] Batch[385] Speed: 1.2497553708173177 samples/sec                   batch loss = 1049.8133399486542 | accuracy = 0.5753246753246753


Epoch[1] Batch[390] Speed: 1.257780321078086 samples/sec                   batch loss = 1063.202333688736 | accuracy = 0.5756410256410256


Epoch[1] Batch[395] Speed: 1.25592958141335 samples/sec                   batch loss = 1076.4950654506683 | accuracy = 0.5765822784810126


Epoch[1] Batch[400] Speed: 1.2567049555874337 samples/sec                   batch loss = 1089.692544221878 | accuracy = 0.5775


Epoch[1] Batch[405] Speed: 1.2507868851945703 samples/sec                   batch loss = 1102.8687415122986 | accuracy = 0.578395061728395


Epoch[1] Batch[410] Speed: 1.2525687145706128 samples/sec                   batch loss = 1117.1577730178833 | accuracy = 0.5756097560975609


Epoch[1] Batch[415] Speed: 1.2513966613790486 samples/sec                   batch loss = 1128.8950505256653 | accuracy = 0.5789156626506025


Epoch[1] Batch[420] Speed: 1.2512249384407792 samples/sec                   batch loss = 1142.2244853973389 | accuracy = 0.5791666666666667


Epoch[1] Batch[425] Speed: 1.2587217142943314 samples/sec                   batch loss = 1155.8879318237305 | accuracy = 0.5788235294117647


Epoch[1] Batch[430] Speed: 1.2555187634665133 samples/sec                   batch loss = 1169.1601951122284 | accuracy = 0.577906976744186


Epoch[1] Batch[435] Speed: 1.253847251877533 samples/sec                   batch loss = 1182.6111435890198 | accuracy = 0.5781609195402299


Epoch[1] Batch[440] Speed: 1.255040144010954 samples/sec                   batch loss = 1194.9180252552032 | accuracy = 0.5789772727272727


Epoch[1] Batch[445] Speed: 1.2572109414985173 samples/sec                   batch loss = 1207.2419191598892 | accuracy = 0.5808988764044943


Epoch[1] Batch[450] Speed: 1.257705832263019 samples/sec                   batch loss = 1220.7636402845383 | accuracy = 0.58


Epoch[1] Batch[455] Speed: 1.2643872217217678 samples/sec                   batch loss = 1234.649293065071 | accuracy = 0.5796703296703297


Epoch[1] Batch[460] Speed: 1.259100707322255 samples/sec                   batch loss = 1247.4485553503036 | accuracy = 0.5798913043478261


Epoch[1] Batch[465] Speed: 1.264470604621254 samples/sec                   batch loss = 1260.3211396932602 | accuracy = 0.5795698924731183


Epoch[1] Batch[470] Speed: 1.2659944916328252 samples/sec                   batch loss = 1274.9487360715866 | accuracy = 0.5792553191489361


Epoch[1] Batch[475] Speed: 1.2580954403494817 samples/sec                   batch loss = 1287.749762415886 | accuracy = 0.5794736842105264


Epoch[1] Batch[480] Speed: 1.263161857384937 samples/sec                   batch loss = 1301.202162861824 | accuracy = 0.5791666666666667


Epoch[1] Batch[485] Speed: 1.2566982721051931 samples/sec                   batch loss = 1314.5361887216568 | accuracy = 0.5793814432989691


Epoch[1] Batch[490] Speed: 1.2634476139883999 samples/sec                   batch loss = 1327.9915786981583 | accuracy = 0.5801020408163265


Epoch[1] Batch[495] Speed: 1.2634675951506422 samples/sec                   batch loss = 1341.6945320367813 | accuracy = 0.5797979797979798


Epoch[1] Batch[500] Speed: 1.2571783457283612 samples/sec                   batch loss = 1356.1407154798508 | accuracy = 0.5785


Epoch[1] Batch[505] Speed: 1.2630670459608349 samples/sec                   batch loss = 1368.9318701028824 | accuracy = 0.5782178217821782


Epoch[1] Batch[510] Speed: 1.2665088476699429 samples/sec                   batch loss = 1381.902784705162 | accuracy = 0.5779411764705882


Epoch[1] Batch[515] Speed: 1.2605795615926483 samples/sec                   batch loss = 1394.420098900795 | accuracy = 0.579126213592233


Epoch[1] Batch[520] Speed: 1.2624410013766503 samples/sec                   batch loss = 1409.3800555467606 | accuracy = 0.5788461538461539


Epoch[1] Batch[525] Speed: 1.2588580952717532 samples/sec                   batch loss = 1423.0198348760605 | accuracy = 0.5785714285714286


Epoch[1] Batch[530] Speed: 1.2579495094644917 samples/sec                   batch loss = 1435.7170621156693 | accuracy = 0.5806603773584905


Epoch[1] Batch[535] Speed: 1.2589205343751342 samples/sec                   batch loss = 1449.5150104761124 | accuracy = 0.5808411214953271


Epoch[1] Batch[540] Speed: 1.2579781835784711 samples/sec                   batch loss = 1462.660993218422 | accuracy = 0.5824074074074074


Epoch[1] Batch[545] Speed: 1.2601637087589377 samples/sec                   batch loss = 1477.4573110342026 | accuracy = 0.5802752293577982


Epoch[1] Batch[550] Speed: 1.2628211921805903 samples/sec                   batch loss = 1491.6362944841385 | accuracy = 0.5786363636363636


Epoch[1] Batch[555] Speed: 1.2555076767053603 samples/sec                   batch loss = 1504.1511589288712 | accuracy = 0.5797297297297297


Epoch[1] Batch[560] Speed: 1.2653645880471838 samples/sec                   batch loss = 1517.3568977117538 | accuracy = 0.5794642857142858


Epoch[1] Batch[565] Speed: 1.2580382714055338 samples/sec                   batch loss = 1530.1451643705368 | accuracy = 0.5796460176991151


Epoch[1] Batch[570] Speed: 1.2541664034697884 samples/sec                   batch loss = 1542.80147087574 | accuracy = 0.5802631578947368


Epoch[1] Batch[575] Speed: 1.251945744293691 samples/sec                   batch loss = 1555.8338660001755 | accuracy = 0.5808695652173913


Epoch[1] Batch[580] Speed: 1.2522090640497772 samples/sec                   batch loss = 1569.3774982690811 | accuracy = 0.5810344827586207


Epoch[1] Batch[585] Speed: 1.2569507876800181 samples/sec                   batch loss = 1581.3599319458008 | accuracy = 0.582905982905983


Epoch[1] Batch[590] Speed: 1.2560581170765555 samples/sec                   batch loss = 1595.272893667221 | accuracy = 0.5826271186440678


Epoch[1] Batch[595] Speed: 1.2525398189727364 samples/sec                   batch loss = 1608.3802773952484 | accuracy = 0.5831932773109244


Epoch[1] Batch[600] Speed: 1.2510247167595558 samples/sec                   batch loss = 1622.0938956737518 | accuracy = 0.5829166666666666


Epoch[1] Batch[605] Speed: 1.252112245189632 samples/sec                   batch loss = 1636.5878326892853 | accuracy = 0.5822314049586776


Epoch[1] Batch[610] Speed: 1.2526640140327623 samples/sec                   batch loss = 1649.1270151138306 | accuracy = 0.5827868852459016


Epoch[1] Batch[615] Speed: 1.2524830603044501 samples/sec                   batch loss = 1661.3761830329895 | accuracy = 0.583739837398374


Epoch[1] Batch[620] Speed: 1.2561618486463135 samples/sec                   batch loss = 1675.123544216156 | accuracy = 0.5830645161290322


Epoch[1] Batch[625] Speed: 1.25206393481381 samples/sec                   batch loss = 1688.7224667072296 | accuracy = 0.5824


Epoch[1] Batch[630] Speed: 1.2519783495679735 samples/sec                   batch loss = 1700.265953540802 | accuracy = 0.5837301587301588


Epoch[1] Batch[635] Speed: 1.2551630512645622 samples/sec                   batch loss = 1713.5846881866455 | accuracy = 0.5834645669291338


Epoch[1] Batch[640] Speed: 1.2614246102675197 samples/sec                   batch loss = 1727.16703915596 | accuracy = 0.583984375


Epoch[1] Batch[645] Speed: 1.2594662190090244 samples/sec                   batch loss = 1739.5551497936249 | accuracy = 0.5844961240310077


Epoch[1] Batch[650] Speed: 1.261839018129348 samples/sec                   batch loss = 1752.1068495512009 | accuracy = 0.5857692307692308


Epoch[1] Batch[655] Speed: 1.2648667997887217 samples/sec                   batch loss = 1765.406274318695 | accuracy = 0.584351145038168


Epoch[1] Batch[660] Speed: 1.2610645015251416 samples/sec                   batch loss = 1777.3413157463074 | accuracy = 0.5856060606060606


Epoch[1] Batch[665] Speed: 1.2618048533397312 samples/sec                   batch loss = 1788.141405582428 | accuracy = 0.587218045112782


Epoch[1] Batch[670] Speed: 1.25713077401078 samples/sec                   batch loss = 1800.6722226142883 | accuracy = 0.5884328358208956


Epoch[1] Batch[675] Speed: 1.2547161373099627 samples/sec                   batch loss = 1812.4024460315704 | accuracy = 0.5888888888888889


Epoch[1] Batch[680] Speed: 1.2591328358658136 samples/sec                   batch loss = 1825.7049670219421 | accuracy = 0.5889705882352941


Epoch[1] Batch[685] Speed: 1.2622468607684476 samples/sec                   batch loss = 1838.6697006225586 | accuracy = 0.5894160583941606


Epoch[1] Batch[690] Speed: 1.265618307347643 samples/sec                   batch loss = 1850.2630701065063 | accuracy = 0.5898550724637681


Epoch[1] Batch[695] Speed: 1.268013886678188 samples/sec                   batch loss = 1862.6890538930893 | accuracy = 0.5906474820143884


Epoch[1] Batch[700] Speed: 1.2637162707645788 samples/sec                   batch loss = 1875.1492677927017 | accuracy = 0.5917857142857142


Epoch[1] Batch[705] Speed: 1.2661668527868482 samples/sec                   batch loss = 1889.0613228082657 | accuracy = 0.5918439716312057


Epoch[1] Batch[710] Speed: 1.2653723183966716 samples/sec                   batch loss = 1901.7576030492783 | accuracy = 0.5919014084507043


Epoch[1] Batch[715] Speed: 1.2607513040195333 samples/sec                   batch loss = 1916.0912536382675 | accuracy = 0.5916083916083916


Epoch[1] Batch[720] Speed: 1.2591838669228583 samples/sec                   batch loss = 1929.2601381540298 | accuracy = 0.5923611111111111


Epoch[1] Batch[725] Speed: 1.263418309426943 samples/sec                   batch loss = 1942.4972108602524 | accuracy = 0.5913793103448276


Epoch[1] Batch[730] Speed: 1.2601558526151146 samples/sec                   batch loss = 1957.0978368520737 | accuracy = 0.5910958904109589


Epoch[1] Batch[735] Speed: 1.2619175092504575 samples/sec                   batch loss = 1971.2392131090164 | accuracy = 0.5908163265306122


Epoch[1] Batch[740] Speed: 1.2589766498370039 samples/sec                   batch loss = 1983.895282626152 | accuracy = 0.5912162162162162


Epoch[1] Batch[745] Speed: 1.2576485101700903 samples/sec                   batch loss = 1997.199403643608 | accuracy = 0.5916107382550335


Epoch[1] Batch[750] Speed: 1.2573911908203905 samples/sec                   batch loss = 2009.1614747047424 | accuracy = 0.5926666666666667


Epoch[1] Batch[755] Speed: 1.2564572435958166 samples/sec                   batch loss = 2020.9812778234482 | accuracy = 0.5930463576158941


Epoch[1] Batch[760] Speed: 1.2650208261655147 samples/sec                   batch loss = 2035.1397250890732 | accuracy = 0.593421052631579


Epoch[1] Batch[765] Speed: 1.263417167718296 samples/sec                   batch loss = 2047.554582953453 | accuracy = 0.5941176470588235


Epoch[1] Batch[770] Speed: 1.260839893281276 samples/sec                   batch loss = 2061.749813556671 | accuracy = 0.5931818181818181


Epoch[1] Batch[775] Speed: 1.2627706262424698 samples/sec                   batch loss = 2075.7136430740356 | accuracy = 0.5935483870967742


Epoch[1] Batch[780] Speed: 1.262790300853639 samples/sec                   batch loss = 2088.215914964676 | accuracy = 0.5942307692307692


Epoch[1] Batch[785] Speed: 1.2541977181326158 samples/sec                   batch loss = 2102.0657737255096 | accuracy = 0.5936305732484076


[Epoch 1] training: accuracy=0.5942258883248731
[Epoch 1] time cost: 643.5522925853729
[Epoch 1] validation: validation accuracy=0.6955555555555556


Epoch[2] Batch[5] Speed: 1.2585767710693556 samples/sec                   batch loss = 10.475207448005676 | accuracy = 0.85


Epoch[2] Batch[10] Speed: 1.266674367633591 samples/sec                   batch loss = 21.618335366249084 | accuracy = 0.85


Epoch[2] Batch[15] Speed: 1.2570211372684323 samples/sec                   batch loss = 35.654226183891296 | accuracy = 0.7


Epoch[2] Batch[20] Speed: 1.2599443414718368 samples/sec                   batch loss = 50.3257931470871 | accuracy = 0.6625


Epoch[2] Batch[25] Speed: 1.2566670207090846 samples/sec                   batch loss = 63.87125015258789 | accuracy = 0.65


Epoch[2] Batch[30] Speed: 1.2598117928339505 samples/sec                   batch loss = 75.88893818855286 | accuracy = 0.6583333333333333


Epoch[2] Batch[35] Speed: 1.2640251340141717 samples/sec                   batch loss = 85.81483221054077 | accuracy = 0.7


Epoch[2] Batch[40] Speed: 1.2566817049064412 samples/sec                   batch loss = 97.06008279323578 | accuracy = 0.6875


Epoch[2] Batch[45] Speed: 1.2613307232436044 samples/sec                   batch loss = 109.4322122335434 | accuracy = 0.6833333333333333


Epoch[2] Batch[50] Speed: 1.2563719034616767 samples/sec                   batch loss = 121.04369068145752 | accuracy = 0.675


Epoch[2] Batch[55] Speed: 1.2630396607565975 samples/sec                   batch loss = 133.52168083190918 | accuracy = 0.6772727272727272


Epoch[2] Batch[60] Speed: 1.2586788416162467 samples/sec                   batch loss = 146.2907716035843 | accuracy = 0.675


Epoch[2] Batch[65] Speed: 1.2562833765005923 samples/sec                   batch loss = 159.60027539730072 | accuracy = 0.6846153846153846


Epoch[2] Batch[70] Speed: 1.2576652914260895 samples/sec                   batch loss = 172.22050511837006 | accuracy = 0.6892857142857143


Epoch[2] Batch[75] Speed: 1.2599001554935731 samples/sec                   batch loss = 187.32965171337128 | accuracy = 0.68


Epoch[2] Batch[80] Speed: 1.2593183630227995 samples/sec                   batch loss = 198.22772300243378 | accuracy = 0.68125


Epoch[2] Batch[85] Speed: 1.2574497146509556 samples/sec                   batch loss = 210.18180561065674 | accuracy = 0.6764705882352942


Epoch[2] Batch[90] Speed: 1.2636582091458843 samples/sec                   batch loss = 223.774920463562 | accuracy = 0.6666666666666666


Epoch[2] Batch[95] Speed: 1.2563266506082775 samples/sec                   batch loss = 237.60485649108887 | accuracy = 0.6578947368421053


Epoch[2] Batch[100] Speed: 1.262065216720821 samples/sec                   batch loss = 249.46771335601807 | accuracy = 0.6575


Epoch[2] Batch[105] Speed: 1.2588171023149197 samples/sec                   batch loss = 258.86047065258026 | accuracy = 0.6619047619047619


Epoch[2] Batch[110] Speed: 1.2547039387040482 samples/sec                   batch loss = 270.666500210762 | accuracy = 0.6636363636363637


Epoch[2] Batch[115] Speed: 1.2587793231520352 samples/sec                   batch loss = 284.6543813943863 | accuracy = 0.6608695652173913


Epoch[2] Batch[120] Speed: 1.2550816424697941 samples/sec                   batch loss = 295.2204450368881 | accuracy = 0.66875


Epoch[2] Batch[125] Speed: 1.2560738214810938 samples/sec                   batch loss = 307.7155796289444 | accuracy = 0.664


Epoch[2] Batch[130] Speed: 1.2552525473958913 samples/sec                   batch loss = 322.6384109258652 | accuracy = 0.6596153846153846


Epoch[2] Batch[135] Speed: 1.2531141468693314 samples/sec                   batch loss = 333.85382664203644 | accuracy = 0.6592592592592592


Epoch[2] Batch[140] Speed: 1.2588093574017405 samples/sec                   batch loss = 347.9792686700821 | accuracy = 0.6535714285714286


Epoch[2] Batch[145] Speed: 1.2600862874229055 samples/sec                   batch loss = 361.7156471014023 | accuracy = 0.6482758620689655


Epoch[2] Batch[150] Speed: 1.2631609063471267 samples/sec                   batch loss = 372.82489490509033 | accuracy = 0.6466666666666666


Epoch[2] Batch[155] Speed: 1.2601702398446324 samples/sec                   batch loss = 386.16647708415985 | accuracy = 0.6435483870967742


Epoch[2] Batch[160] Speed: 1.2613589827158378 samples/sec                   batch loss = 400.2750149965286 | accuracy = 0.6375


Epoch[2] Batch[165] Speed: 1.259133780847799 samples/sec                   batch loss = 412.32290637493134 | accuracy = 0.6363636363636364


Epoch[2] Batch[170] Speed: 1.2606022937000978 samples/sec                   batch loss = 423.5044975280762 | accuracy = 0.6397058823529411


Epoch[2] Batch[175] Speed: 1.262564317368024 samples/sec                   batch loss = 438.18006014823914 | accuracy = 0.6357142857142857


Epoch[2] Batch[180] Speed: 1.267025631242817 samples/sec                   batch loss = 451.2690073251724 | accuracy = 0.6347222222222222


Epoch[2] Batch[185] Speed: 1.2604434702476213 samples/sec                   batch loss = 464.3350808620453 | accuracy = 0.6351351351351351


Epoch[2] Batch[190] Speed: 1.2548882567697048 samples/sec                   batch loss = 478.69348907470703 | accuracy = 0.6302631578947369


Epoch[2] Batch[195] Speed: 1.260780011261731 samples/sec                   batch loss = 494.2056174278259 | accuracy = 0.6269230769230769


Epoch[2] Batch[200] Speed: 1.2616393701013593 samples/sec                   batch loss = 506.65671932697296 | accuracy = 0.6275


Epoch[2] Batch[205] Speed: 1.256077865193742 samples/sec                   batch loss = 518.9682954549789 | accuracy = 0.6268292682926829


Epoch[2] Batch[210] Speed: 1.2611987360360892 samples/sec                   batch loss = 529.0607328414917 | accuracy = 0.6309523809523809


Epoch[2] Batch[215] Speed: 1.2640370383625066 samples/sec                   batch loss = 541.2332479953766 | accuracy = 0.6337209302325582


Epoch[2] Batch[220] Speed: 1.258076760816119 samples/sec                   batch loss = 553.5695335865021 | accuracy = 0.6329545454545454


Epoch[2] Batch[225] Speed: 1.2591352928219255 samples/sec                   batch loss = 565.302561879158 | accuracy = 0.6344444444444445


Epoch[2] Batch[230] Speed: 1.2660634686794632 samples/sec                   batch loss = 577.9597836732864 | accuracy = 0.6369565217391304


Epoch[2] Batch[235] Speed: 1.262597953183257 samples/sec                   batch loss = 589.4076042175293 | accuracy = 0.6382978723404256


Epoch[2] Batch[240] Speed: 1.2595081997980548 samples/sec                   batch loss = 599.9921122789383 | accuracy = 0.6416666666666667


Epoch[2] Batch[245] Speed: 1.2638167966072866 samples/sec                   batch loss = 612.2042020559311 | accuracy = 0.6408163265306123


Epoch[2] Batch[250] Speed: 1.2502358367362207 samples/sec                   batch loss = 623.8160735368729 | accuracy = 0.642


Epoch[2] Batch[255] Speed: 1.2536902195786572 samples/sec                   batch loss = 637.3185838460922 | accuracy = 0.6411764705882353


Epoch[2] Batch[260] Speed: 1.2577637253376865 samples/sec                   batch loss = 649.7305417060852 | accuracy = 0.6384615384615384


Epoch[2] Batch[265] Speed: 1.2607422089283777 samples/sec                   batch loss = 660.9667484760284 | accuracy = 0.6386792452830189


Epoch[2] Batch[270] Speed: 1.2615112073567782 samples/sec                   batch loss = 675.5094172954559 | accuracy = 0.6351851851851852


Epoch[2] Batch[275] Speed: 1.262935455698011 samples/sec                   batch loss = 687.2782808542252 | accuracy = 0.6381818181818182


Epoch[2] Batch[280] Speed: 1.256647724689458 samples/sec                   batch loss = 699.8356131315231 | accuracy = 0.6375


Epoch[2] Batch[285] Speed: 1.259286319531852 samples/sec                   batch loss = 712.1227352619171 | accuracy = 0.6385964912280702


Epoch[2] Batch[290] Speed: 1.2555291927171712 samples/sec                   batch loss = 724.0823405981064 | accuracy = 0.6413793103448275


Epoch[2] Batch[295] Speed: 1.2571302088238958 samples/sec                   batch loss = 735.2040468454361 | accuracy = 0.6440677966101694


Epoch[2] Batch[300] Speed: 1.253767138003407 samples/sec                   batch loss = 747.839775800705 | accuracy = 0.6433333333333333


Epoch[2] Batch[305] Speed: 1.257128795858909 samples/sec                   batch loss = 760.2491261959076 | accuracy = 0.6418032786885246


Epoch[2] Batch[310] Speed: 1.2601915373335104 samples/sec                   batch loss = 776.3387503623962 | accuracy = 0.6395161290322581


Epoch[2] Batch[315] Speed: 1.2589595501511865 samples/sec                   batch loss = 788.5784504413605 | accuracy = 0.6412698412698413


Epoch[2] Batch[320] Speed: 1.2596803124645704 samples/sec                   batch loss = 801.2918872833252 | accuracy = 0.63984375


Epoch[2] Batch[325] Speed: 1.2507302852794289 samples/sec                   batch loss = 812.2395881414413 | accuracy = 0.6423076923076924


Epoch[2] Batch[330] Speed: 1.2555024152601035 samples/sec                   batch loss = 825.1309357881546 | accuracy = 0.6416666666666667


Epoch[2] Batch[335] Speed: 1.252522987200122 samples/sec                   batch loss = 838.3178144693375 | accuracy = 0.6425373134328358


Epoch[2] Batch[340] Speed: 1.2551118761066342 samples/sec                   batch loss = 850.2344318628311 | accuracy = 0.6426470588235295


Epoch[2] Batch[345] Speed: 1.2589063645950593 samples/sec                   batch loss = 861.6482344865799 | accuracy = 0.6420289855072464


Epoch[2] Batch[350] Speed: 1.25593635074077 samples/sec                   batch loss = 873.4109671115875 | accuracy = 0.6428571428571429


Epoch[2] Batch[355] Speed: 1.25572014439735 samples/sec                   batch loss = 885.3824682235718 | accuracy = 0.6422535211267606


Epoch[2] Batch[360] Speed: 1.260722503347706 samples/sec                   batch loss = 898.0604455471039 | accuracy = 0.6409722222222223


Epoch[2] Batch[365] Speed: 1.262703718143051 samples/sec                   batch loss = 910.823322057724 | accuracy = 0.6424657534246575


Epoch[2] Batch[370] Speed: 1.2508410655295992 samples/sec                   batch loss = 921.8961718082428 | accuracy = 0.6425675675675676


Epoch[2] Batch[375] Speed: 1.2611310464186016 samples/sec                   batch loss = 934.3310122489929 | accuracy = 0.6433333333333333


Epoch[2] Batch[380] Speed: 1.2646408351693015 samples/sec                   batch loss = 944.8720248937607 | accuracy = 0.6440789473684211


Epoch[2] Batch[385] Speed: 1.2596729352426677 samples/sec                   batch loss = 957.4162447452545 | accuracy = 0.6428571428571429


Epoch[2] Batch[390] Speed: 1.2632001854004942 samples/sec                   batch loss = 968.7239589691162 | accuracy = 0.6455128205128206


Epoch[2] Batch[395] Speed: 1.2620710080136754 samples/sec                   batch loss = 980.2459892034531 | accuracy = 0.6455696202531646


Epoch[2] Batch[400] Speed: 1.2630543991738044 samples/sec                   batch loss = 991.4904210567474 | accuracy = 0.646875


Epoch[2] Batch[405] Speed: 1.2604711217815652 samples/sec                   batch loss = 1003.5421317815781 | accuracy = 0.6475308641975308


Epoch[2] Batch[410] Speed: 1.2605150637011346 samples/sec                   batch loss = 1016.5031439065933 | accuracy = 0.6475609756097561


Epoch[2] Batch[415] Speed: 1.2612742080977166 samples/sec                   batch loss = 1028.8775284290314 | accuracy = 0.6493975903614457


Epoch[2] Batch[420] Speed: 1.2535088756132042 samples/sec                   batch loss = 1043.880872964859 | accuracy = 0.6482142857142857


Epoch[2] Batch[425] Speed: 1.2651382547485246 samples/sec                   batch loss = 1055.5774029493332 | accuracy = 0.6494117647058824


Epoch[2] Batch[430] Speed: 1.2603509600652607 samples/sec                   batch loss = 1068.4297810792923 | accuracy = 0.65


Epoch[2] Batch[435] Speed: 1.2523004762890404 samples/sec                   batch loss = 1082.248678445816 | accuracy = 0.6482758620689655


Epoch[2] Batch[440] Speed: 1.2526603663870535 samples/sec                   batch loss = 1094.1492453813553 | accuracy = 0.6488636363636363


Epoch[2] Batch[445] Speed: 1.2566882940808999 samples/sec                   batch loss = 1104.8723423480988 | accuracy = 0.6494382022471911


Epoch[2] Batch[450] Speed: 1.2561808475805858 samples/sec                   batch loss = 1116.1714315414429 | accuracy = 0.6505555555555556


Epoch[2] Batch[455] Speed: 1.2604929976559804 samples/sec                   batch loss = 1128.1575206518173 | accuracy = 0.6516483516483517


Epoch[2] Batch[460] Speed: 1.2575779018140911 samples/sec                   batch loss = 1139.970435500145 | accuracy = 0.6527173913043478


Epoch[2] Batch[465] Speed: 1.2576822616909458 samples/sec                   batch loss = 1151.4176677465439 | accuracy = 0.6521505376344086


Epoch[2] Batch[470] Speed: 1.25677443035936 samples/sec                   batch loss = 1163.4406189918518 | accuracy = 0.651595744680851


Epoch[2] Batch[475] Speed: 1.2619662982537747 samples/sec                   batch loss = 1177.5466982126236 | accuracy = 0.6505263157894737


Epoch[2] Batch[480] Speed: 1.2625177622757144 samples/sec                   batch loss = 1188.6814725399017 | accuracy = 0.65


Epoch[2] Batch[485] Speed: 1.2594978934442616 samples/sec                   batch loss = 1201.6364002227783 | accuracy = 0.6489690721649485


Epoch[2] Batch[490] Speed: 1.2632525929776046 samples/sec                   batch loss = 1211.335406422615 | accuracy = 0.6510204081632653


Epoch[2] Batch[495] Speed: 1.2624567707995107 samples/sec                   batch loss = 1222.3684267997742 | accuracy = 0.652020202020202


Epoch[2] Batch[500] Speed: 1.2640711336568677 samples/sec                   batch loss = 1237.0810627937317 | accuracy = 0.6515


Epoch[2] Batch[505] Speed: 1.2558612340981434 samples/sec                   batch loss = 1247.8527952432632 | accuracy = 0.651980198019802


Epoch[2] Batch[510] Speed: 1.2651845262617238 samples/sec                   batch loss = 1258.3180623054504 | accuracy = 0.6524509803921569


Epoch[2] Batch[515] Speed: 1.2644256241367808 samples/sec                   batch loss = 1272.274736404419 | accuracy = 0.6519417475728155


Epoch[2] Batch[520] Speed: 1.257787487555921 samples/sec                   batch loss = 1285.2371983528137 | accuracy = 0.6509615384615385


Epoch[2] Batch[525] Speed: 1.2572164999054682 samples/sec                   batch loss = 1295.1119825839996 | accuracy = 0.6528571428571428


Epoch[2] Batch[530] Speed: 1.262151711887357 samples/sec                   batch loss = 1304.4048249721527 | accuracy = 0.654245283018868


Epoch[2] Batch[535] Speed: 1.2593905850111855 samples/sec                   batch loss = 1317.6620472669601 | accuracy = 0.6546728971962616


Epoch[2] Batch[540] Speed: 1.2576481330683589 samples/sec                   batch loss = 1329.6352126598358 | accuracy = 0.6550925925925926


Epoch[2] Batch[545] Speed: 1.2562607998886999 samples/sec                   batch loss = 1342.5983002185822 | accuracy = 0.655045871559633


Epoch[2] Batch[550] Speed: 1.2556756903208461 samples/sec                   batch loss = 1355.75918507576 | accuracy = 0.6554545454545454


Epoch[2] Batch[555] Speed: 1.2538843607200314 samples/sec                   batch loss = 1364.8015086650848 | accuracy = 0.6576576576576577


Epoch[2] Batch[560] Speed: 1.2570655924739778 samples/sec                   batch loss = 1373.9883316755295 | accuracy = 0.6584821428571429


Epoch[2] Batch[565] Speed: 1.2561750161633958 samples/sec                   batch loss = 1387.4339039325714 | accuracy = 0.6579646017699115


Epoch[2] Batch[570] Speed: 1.25517478927445 samples/sec                   batch loss = 1396.206617474556 | accuracy = 0.6596491228070176


Epoch[2] Batch[575] Speed: 1.2570918714999955 samples/sec                   batch loss = 1405.4480127096176 | accuracy = 0.6608695652173913


Epoch[2] Batch[580] Speed: 1.2578150226777898 samples/sec                   batch loss = 1417.9967597723007 | accuracy = 0.6607758620689655


Epoch[2] Batch[585] Speed: 1.257094226305068 samples/sec                   batch loss = 1430.744077205658 | accuracy = 0.6615384615384615


Epoch[2] Batch[590] Speed: 1.2605077714033184 samples/sec                   batch loss = 1445.8246326446533 | accuracy = 0.660593220338983


Epoch[2] Batch[595] Speed: 1.2573583972390483 samples/sec                   batch loss = 1456.339029431343 | accuracy = 0.661344537815126


Epoch[2] Batch[600] Speed: 1.2571598818223482 samples/sec                   batch loss = 1472.5056949853897 | accuracy = 0.6591666666666667


Epoch[2] Batch[605] Speed: 1.2575070186231865 samples/sec                   batch loss = 1482.8523347377777 | accuracy = 0.6603305785123967


Epoch[2] Batch[610] Speed: 1.2591699747253609 samples/sec                   batch loss = 1498.3743832111359 | accuracy = 0.659016393442623


Epoch[2] Batch[615] Speed: 1.259478415841733 samples/sec                   batch loss = 1508.9903119802475 | accuracy = 0.6601626016260163


Epoch[2] Batch[620] Speed: 1.2613131802485575 samples/sec                   batch loss = 1520.8599792718887 | accuracy = 0.6600806451612903


Epoch[2] Batch[625] Speed: 1.2611665967823338 samples/sec                   batch loss = 1531.011670947075 | accuracy = 0.6608


Epoch[2] Batch[630] Speed: 1.2561843276456728 samples/sec                   batch loss = 1540.7918241024017 | accuracy = 0.6619047619047619


Epoch[2] Batch[635] Speed: 1.2579163094918637 samples/sec                   batch loss = 1554.32517516613 | accuracy = 0.6610236220472441


Epoch[2] Batch[640] Speed: 1.2659079465806853 samples/sec                   batch loss = 1565.6147904396057 | accuracy = 0.661328125


Epoch[2] Batch[645] Speed: 1.258508512957475 samples/sec                   batch loss = 1579.9097646474838 | accuracy = 0.6604651162790698


Epoch[2] Batch[650] Speed: 1.2571775920889277 samples/sec                   batch loss = 1590.3939218521118 | accuracy = 0.6607692307692308


Epoch[2] Batch[655] Speed: 1.2612564770403456 samples/sec                   batch loss = 1601.3781193494797 | accuracy = 0.6610687022900763


Epoch[2] Batch[660] Speed: 1.2614102892221597 samples/sec                   batch loss = 1611.3964240550995 | accuracy = 0.6613636363636364


Epoch[2] Batch[665] Speed: 1.2632503101613426 samples/sec                   batch loss = 1625.1551837921143 | accuracy = 0.6612781954887218


Epoch[2] Batch[670] Speed: 1.2581913939676734 samples/sec                   batch loss = 1639.7057042121887 | accuracy = 0.6600746268656716


Epoch[2] Batch[675] Speed: 1.2539877335463223 samples/sec                   batch loss = 1650.024175643921 | accuracy = 0.66


Epoch[2] Batch[680] Speed: 1.255270203963631 samples/sec                   batch loss = 1659.0511915683746 | accuracy = 0.6602941176470588


Epoch[2] Batch[685] Speed: 1.2624068039818312 samples/sec                   batch loss = 1670.4159158468246 | accuracy = 0.6605839416058394


Epoch[2] Batch[690] Speed: 1.2584742450789468 samples/sec                   batch loss = 1681.7130621671677 | accuracy = 0.6608695652173913


Epoch[2] Batch[695] Speed: 1.2561847979262173 samples/sec                   batch loss = 1691.5883826613426 | accuracy = 0.6615107913669065


Epoch[2] Batch[700] Speed: 1.260690956753454 samples/sec                   batch loss = 1703.5653194785118 | accuracy = 0.6610714285714285


Epoch[2] Batch[705] Speed: 1.2566899884512148 samples/sec                   batch loss = 1715.9694347977638 | accuracy = 0.6609929078014184


Epoch[2] Batch[710] Speed: 1.2570436470325026 samples/sec                   batch loss = 1729.8345150351524 | accuracy = 0.6598591549295775


Epoch[2] Batch[715] Speed: 1.2562927836617097 samples/sec                   batch loss = 1740.7317556738853 | accuracy = 0.6604895104895104


Epoch[2] Batch[720] Speed: 1.2621232269887024 samples/sec                   batch loss = 1753.749844968319 | accuracy = 0.6590277777777778


Epoch[2] Batch[725] Speed: 1.2570528772106597 samples/sec                   batch loss = 1764.3364973664284 | accuracy = 0.6586206896551724


Epoch[2] Batch[730] Speed: 1.258902680504488 samples/sec                   batch loss = 1774.9054390788078 | accuracy = 0.6589041095890411


Epoch[2] Batch[735] Speed: 1.2595138730923043 samples/sec                   batch loss = 1791.3452349305153 | accuracy = 0.6578231292517007


Epoch[2] Batch[740] Speed: 1.2554459514540708 samples/sec                   batch loss = 1803.5633482336998 | accuracy = 0.6574324324324324


Epoch[2] Batch[745] Speed: 1.2571725050463844 samples/sec                   batch loss = 1814.4155413508415 | accuracy = 0.6580536912751678


Epoch[2] Batch[750] Speed: 1.2547523591837761 samples/sec                   batch loss = 1821.73489767313 | accuracy = 0.6596666666666666


Epoch[2] Batch[755] Speed: 1.2608155418370892 samples/sec                   batch loss = 1833.6678023934364 | accuracy = 0.659933774834437


Epoch[2] Batch[760] Speed: 1.2567243475016436 samples/sec                   batch loss = 1842.892033278942 | accuracy = 0.6618421052631579


Epoch[2] Batch[765] Speed: 1.2565703583085555 samples/sec                   batch loss = 1854.3491247296333 | accuracy = 0.6620915032679738


Epoch[2] Batch[770] Speed: 1.2534077354515205 samples/sec                   batch loss = 1867.7199514508247 | accuracy = 0.6613636363636364


Epoch[2] Batch[775] Speed: 1.257850009109305 samples/sec                   batch loss = 1877.756023466587 | accuracy = 0.662258064516129


Epoch[2] Batch[780] Speed: 1.2548550304700639 samples/sec                   batch loss = 1887.8720207214355 | accuracy = 0.6625


Epoch[2] Batch[785] Speed: 1.2517550047732011 samples/sec                   batch loss = 1898.397999405861 | accuracy = 0.6627388535031847


[Epoch 2] training: accuracy=0.6630710659898477
[Epoch 2] time cost: 642.1893019676208
[Epoch 2] validation: validation accuracy=0.75


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).