<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference may run faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find installation instructions for the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)`, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[15:36:35] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `ctx` field in the output tells you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If only one GPU is available, it falls back to `npx.cpu()` so that the code still runs:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[15:36:35] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])


MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through 0-based indexing with `npx`. As you saw before, the first GPU is accessed using `npx.gpu(0)` and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has, so if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training, which is particularly useful when you want to leverage multiple GPUs while training neural networks.
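
The indexing scheme described above can be sketched in plain Python. This is a minimal illustration rather than MXNet code: `pick_device_ids` is a hypothetical helper, and it returns bare integer IDs instead of `npx.gpu(i)` objects so that it runs on a machine without a GPU.

```python
def pick_device_ids(num_available, num_wanted):
    """Return the 0-indexed IDs of the first `num_wanted` GPUs.

    Hypothetical helper for illustration only. The request is capped at
    the number of GPUs that are actually available.
    """
    if num_available < 0 or num_wanted < 0:
        raise ValueError("counts must be non-negative")
    return list(range(min(num_available, num_wanted)))


# On a machine with eight GPUs, the valid IDs run from 0 to 7.
all_ids = pick_device_ids(num_available=8, num_wanted=8)
print(all_ids[0], all_ids[-1])

# Asking for two GPUs when only one is present yields a single ID.
print(pick_device_ids(num_available=1, num_wanted=2))
```

Each returned ID `i` would correspond to the device `npx.gpu(i)`.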

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to ensure that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[15:36:35] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 4.7445846, -2.9175136]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, assuming there are *n* GPUs, you split each data batch into *n* parts and have each GPU run the forward and backward passes on its own part of the data.
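
To make the idea concrete, here is a minimal pure-Python sketch of data parallelism on a toy one-parameter model. It is an illustration under stated assumptions, not how MXNet implements it: `split_batch` and `grad_sum` are hypothetical helpers, the "devices" are simulated with an ordinary loop, and the model is `y = w * x` with a squared-error loss.

```python
def split_batch(batch, num_parts):
    """Split a list of samples into `num_parts` contiguous slices.

    Assumption of this sketch: when the batch does not divide evenly,
    the earlier slices receive one extra sample each.
    """
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    size, remainder = divmod(len(batch), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        end = start + size + (1 if i < remainder else 0)
        parts.append(batch[start:end])
        start = end
    return parts


def grad_sum(w, samples):
    """Gradient of sum((w * x - y) ** 2) with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in samples)


batch = [(1, 2), (2, 4), (3, 7), (4, 8)]  # (x, y) pairs
w = 1

# Each simulated device computes the gradient on its own shard.
shards = split_batch(batch, num_parts=2)
per_device = [grad_sum(w, shard) for shard in shards]

# Summing the per-device gradients recovers the full-batch gradient.
print(per_device, sum(per_device), grad_sum(w, batch))
```

Because the loss is a sum over samples, the per-shard gradients add up to the gradient of the whole batch, which is why the shards can be processed independently.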

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
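
As a rough illustration of what an accuracy metric accumulates when it receives one list entry per device, the sketch below counts correct predictions across shards in plain Python. It is not the `gluon.metric.Accuracy` implementation: `shard_accuracy` is a hypothetical helper, and it assumes each output row is a list of class scores whose largest entry marks the predicted class.

```python
def shard_accuracy(label_shards, output_shards):
    """Fraction of correct predictions, pooled over all shards.

    Hypothetical helper for illustration. `label_shards` holds one list
    of integer labels per device and `output_shards` holds the matching
    lists of per-class scores.
    """
    correct = 0
    total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            # The predicted class is the index of the highest score.
            predicted = max(range(len(scores)), key=scores.__getitem__)
            correct += int(predicted == label)
            total += 1
    if total == 0:
        raise ValueError("no samples to evaluate")
    return correct / total


# Two simulated devices with two samples each (made-up scores).
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[4.7, -2.9], [0.3, 0.1]],
    [[-1.0, 2.0], [0.5, 0.2]],
]
print(shard_accuracy(label_shards, output_shards))
```

Pooling the counts before dividing gives the same result as evaluating the whole batch on one device, regardless of how the batch was split.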

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7751587930022317 samples/sec                   batch loss = 13.958090543746948 | accuracy = 0.7


Epoch[1] Batch[10] Speed: 1.2515850508494175 samples/sec                   batch loss = 26.43302297592163 | accuracy = 0.7


Epoch[1] Batch[15] Speed: 1.2522473845294915 samples/sec                   batch loss = 40.55722260475159 | accuracy = 0.6666666666666666


Epoch[1] Batch[20] Speed: 1.2545060729141515 samples/sec                   batch loss = 55.301658391952515 | accuracy = 0.6125


Epoch[1] Batch[25] Speed: 1.2527289269731234 samples/sec                   batch loss = 70.71040964126587 | accuracy = 0.54


Epoch[1] Batch[30] Speed: 1.2549036503410869 samples/sec                   batch loss = 84.27719306945801 | accuracy = 0.5416666666666666


Epoch[1] Batch[35] Speed: 1.24881116226446 samples/sec                   batch loss = 97.60018587112427 | accuracy = 0.55


Epoch[1] Batch[40] Speed: 1.246829069196153 samples/sec                   batch loss = 112.53622889518738 | accuracy = 0.525


Epoch[1] Batch[45] Speed: 1.253159449396748 samples/sec                   batch loss = 126.62827706336975 | accuracy = 0.5277777777777778


Epoch[1] Batch[50] Speed: 1.248285721832368 samples/sec                   batch loss = 141.14671063423157 | accuracy = 0.515


Epoch[1] Batch[55] Speed: 1.252736877867754 samples/sec                   batch loss = 155.8516764640808 | accuracy = 0.5181818181818182


Epoch[1] Batch[60] Speed: 1.2392208306373222 samples/sec                   batch loss = 168.92618203163147 | accuracy = 0.5291666666666667


Epoch[1] Batch[65] Speed: 1.2432175600496658 samples/sec                   batch loss = 183.06797814369202 | accuracy = 0.5269230769230769


Epoch[1] Batch[70] Speed: 1.2410142647859985 samples/sec                   batch loss = 196.77875661849976 | accuracy = 0.5357142857142857


Epoch[1] Batch[75] Speed: 1.2362203401570069 samples/sec                   batch loss = 210.49744153022766 | accuracy = 0.53


Epoch[1] Batch[80] Speed: 1.2451663682496683 samples/sec                   batch loss = 224.64659786224365 | accuracy = 0.525


Epoch[1] Batch[85] Speed: 1.246273725193236 samples/sec                   batch loss = 238.63794589042664 | accuracy = 0.5235294117647059


Epoch[1] Batch[90] Speed: 1.2397662408986205 samples/sec                   batch loss = 252.35099077224731 | accuracy = 0.5277777777777778


Epoch[1] Batch[95] Speed: 1.2470025535090403 samples/sec                   batch loss = 265.7899172306061 | accuracy = 0.5289473684210526


Epoch[1] Batch[100] Speed: 1.2491729556978552 samples/sec                   batch loss = 279.9449498653412 | accuracy = 0.5275


Epoch[1] Batch[105] Speed: 1.2505880155301399 samples/sec                   batch loss = 293.9123492240906 | accuracy = 0.5285714285714286


Epoch[1] Batch[110] Speed: 1.2567160635039059 samples/sec                   batch loss = 307.87377095222473 | accuracy = 0.5272727272727272


Epoch[1] Batch[115] Speed: 1.25501291800623 samples/sec                   batch loss = 321.38527846336365 | accuracy = 0.5304347826086957


Epoch[1] Batch[120] Speed: 1.2564977065860616 samples/sec                   batch loss = 335.4969403743744 | accuracy = 0.5375


Epoch[1] Batch[125] Speed: 1.2471065560299097 samples/sec                   batch loss = 349.33520460128784 | accuracy = 0.536


Epoch[1] Batch[130] Speed: 1.252712651338874 samples/sec                   batch loss = 363.1489555835724 | accuracy = 0.5384615384615384


Epoch[1] Batch[135] Speed: 1.2513091141222925 samples/sec                   batch loss = 376.8597857952118 | accuracy = 0.5370370370370371


Epoch[1] Batch[140] Speed: 1.2560426951692427 samples/sec                   batch loss = 390.49469208717346 | accuracy = 0.5410714285714285


Epoch[1] Batch[145] Speed: 1.258439412701831 samples/sec                   batch loss = 404.48049545288086 | accuracy = 0.5431034482758621


Epoch[1] Batch[150] Speed: 1.2576955553849696 samples/sec                   batch loss = 417.8813199996948 | accuracy = 0.5433333333333333


Epoch[1] Batch[155] Speed: 1.253713640754876 samples/sec                   batch loss = 432.13878893852234 | accuracy = 0.5403225806451613


Epoch[1] Batch[160] Speed: 1.2551113127336238 samples/sec                   batch loss = 446.7791566848755 | accuracy = 0.5375


Epoch[1] Batch[165] Speed: 1.2575293572771276 samples/sec                   batch loss = 460.57649874687195 | accuracy = 0.5348484848484848


Epoch[1] Batch[170] Speed: 1.257558766335517 samples/sec                   batch loss = 474.21558022499084 | accuracy = 0.5323529411764706


Epoch[1] Batch[175] Speed: 1.2563756668395532 samples/sec                   batch loss = 488.5533616542816 | accuracy = 0.5271428571428571


Epoch[1] Batch[180] Speed: 1.2590559190888566 samples/sec                   batch loss = 502.12095975875854 | accuracy = 0.5263888888888889


Epoch[1] Batch[185] Speed: 1.2579148004438372 samples/sec                   batch loss = 516.1500577926636 | accuracy = 0.522972972972973


Epoch[1] Batch[190] Speed: 1.2546794483729276 samples/sec                   batch loss = 529.4797580242157 | accuracy = 0.5263157894736842


Epoch[1] Batch[195] Speed: 1.2619346893835555 samples/sec                   batch loss = 543.0733706951141 | accuracy = 0.5269230769230769


Epoch[1] Batch[200] Speed: 1.2602675515743789 samples/sec                   batch loss = 556.6576235294342 | accuracy = 0.5275


Epoch[1] Batch[205] Speed: 1.2589345155374396 samples/sec                   batch loss = 570.8103342056274 | accuracy = 0.526829268292683


Epoch[1] Batch[210] Speed: 1.2530981420073466 samples/sec                   batch loss = 584.5757856369019 | accuracy = 0.5273809523809524


Epoch[1] Batch[215] Speed: 1.2540010429817767 samples/sec                   batch loss = 598.2661967277527 | accuracy = 0.5279069767441861


Epoch[1] Batch[220] Speed: 1.2511849076393262 samples/sec                   batch loss = 611.9367008209229 | accuracy = 0.5295454545454545


Epoch[1] Batch[225] Speed: 1.250170995586641 samples/sec                   batch loss = 625.9792175292969 | accuracy = 0.5288888888888889


Epoch[1] Batch[230] Speed: 1.2530646360763098 samples/sec                   batch loss = 639.865972995758 | accuracy = 0.5293478260869565


Epoch[1] Batch[235] Speed: 1.249811789424063 samples/sec                   batch loss = 653.4523367881775 | accuracy = 0.5319148936170213


Epoch[1] Batch[240] Speed: 1.2552598729328543 samples/sec                   batch loss = 666.7561852931976 | accuracy = 0.5333333333333333


Epoch[1] Batch[245] Speed: 1.251205902495496 samples/sec                   batch loss = 680.890593290329 | accuracy = 0.5316326530612245


Epoch[1] Batch[250] Speed: 1.2515156818259465 samples/sec                   batch loss = 694.2715358734131 | accuracy = 0.538


Epoch[1] Batch[255] Speed: 1.2508769707470637 samples/sec                   batch loss = 707.5623393058777 | accuracy = 0.5392156862745098


Epoch[1] Batch[260] Speed: 1.2527926305050816 samples/sec                   batch loss = 720.6758999824524 | accuracy = 0.5423076923076923


Epoch[1] Batch[265] Speed: 1.2510131495246815 samples/sec                   batch loss = 734.190271615982 | accuracy = 0.5424528301886793


Epoch[1] Batch[270] Speed: 1.2545844987338783 samples/sec                   batch loss = 748.237630367279 | accuracy = 0.5407407407407407


Epoch[1] Batch[275] Speed: 1.2542459120161222 samples/sec                   batch loss = 761.595540523529 | accuracy = 0.5436363636363636


Epoch[1] Batch[280] Speed: 1.2495735811504434 samples/sec                   batch loss = 775.4470028877258 | accuracy = 0.5419642857142857


Epoch[1] Batch[285] Speed: 1.2507592839370416 samples/sec                   batch loss = 788.93217253685 | accuracy = 0.543859649122807


Epoch[1] Batch[290] Speed: 1.250211427353159 samples/sec                   batch loss = 802.8129572868347 | accuracy = 0.5431034482758621


Epoch[1] Batch[295] Speed: 1.252392930255668 samples/sec                   batch loss = 816.4646906852722 | accuracy = 0.5440677966101695


Epoch[1] Batch[300] Speed: 1.2563337064519773 samples/sec                   batch loss = 830.3181505203247 | accuracy = 0.5433333333333333


Epoch[1] Batch[305] Speed: 1.2530460120458766 samples/sec                   batch loss = 843.7232985496521 | accuracy = 0.5450819672131147


Epoch[1] Batch[310] Speed: 1.2560090315896018 samples/sec                   batch loss = 857.3819444179535 | accuracy = 0.5451612903225806


Epoch[1] Batch[315] Speed: 1.2489022649973165 samples/sec                   batch loss = 870.7272262573242 | accuracy = 0.5476190476190477


Epoch[1] Batch[320] Speed: 1.2476208911380198 samples/sec                   batch loss = 883.0896480083466 | accuracy = 0.5515625


Epoch[1] Batch[325] Speed: 1.2464553893661199 samples/sec                   batch loss = 896.8509938716888 | accuracy = 0.5507692307692308


Epoch[1] Batch[330] Speed: 1.2609500076624067 samples/sec                   batch loss = 910.6044023036957 | accuracy = 0.5492424242424242


Epoch[1] Batch[335] Speed: 1.253400056957734 samples/sec                   batch loss = 925.2275264263153 | accuracy = 0.5477611940298508


Epoch[1] Batch[340] Speed: 1.2543024554764621 samples/sec                   batch loss = 939.6547291278839 | accuracy = 0.5463235294117647


Epoch[1] Batch[345] Speed: 1.252845768509462 samples/sec                   batch loss = 952.8804767131805 | accuracy = 0.5485507246376812


Epoch[1] Batch[350] Speed: 1.2544545761243813 samples/sec                   batch loss = 966.5602774620056 | accuracy = 0.5478571428571428


Epoch[1] Batch[355] Speed: 1.2545374045415658 samples/sec                   batch loss = 980.2438662052155 | accuracy = 0.5471830985915493


Epoch[1] Batch[360] Speed: 1.2527449224050362 samples/sec                   batch loss = 993.2690145969391 | accuracy = 0.5472222222222223


Epoch[1] Batch[365] Speed: 1.2479510829111005 samples/sec                   batch loss = 1006.6416370868683 | accuracy = 0.5486301369863014


Epoch[1] Batch[370] Speed: 1.2523429155454608 samples/sec                   batch loss = 1021.2098095417023 | accuracy = 0.5486486486486486


Epoch[1] Batch[375] Speed: 1.2505348824451223 samples/sec                   batch loss = 1034.7048864364624 | accuracy = 0.5493333333333333


Epoch[1] Batch[380] Speed: 1.2485504773818954 samples/sec                   batch loss = 1048.280611038208 | accuracy = 0.5493421052631579


Epoch[1] Batch[385] Speed: 1.2486661690027843 samples/sec                   batch loss = 1062.5070519447327 | accuracy = 0.5487012987012987


Epoch[1] Batch[390] Speed: 1.2521452329458702 samples/sec                   batch loss = 1075.5339665412903 | accuracy = 0.5512820512820513


Epoch[1] Batch[395] Speed: 1.2581719567527987 samples/sec                   batch loss = 1088.0642437934875 | accuracy = 0.5537974683544303


Epoch[1] Batch[400] Speed: 1.2506610109716378 samples/sec                   batch loss = 1101.9410457611084 | accuracy = 0.553125


Epoch[1] Batch[405] Speed: 1.2494208740337536 samples/sec                   batch loss = 1115.857344865799 | accuracy = 0.5530864197530864


Epoch[1] Batch[410] Speed: 1.2595364722216738 samples/sec                   batch loss = 1128.8702101707458 | accuracy = 0.5542682926829269


Epoch[1] Batch[415] Speed: 1.2585071912967063 samples/sec                   batch loss = 1142.4502396583557 | accuracy = 0.5542168674698795


Epoch[1] Batch[420] Speed: 1.2504262067236394 samples/sec                   batch loss = 1155.5674223899841 | accuracy = 0.555952380952381


Epoch[1] Batch[425] Speed: 1.2507815699928415 samples/sec                   batch loss = 1168.4389719963074 | accuracy = 0.5570588235294117


Epoch[1] Batch[430] Speed: 1.2531148020478309 samples/sec                   batch loss = 1182.0806550979614 | accuracy = 0.5575581395348838


Epoch[1] Batch[435] Speed: 1.2569932601843652 samples/sec                   batch loss = 1195.7828407287598 | accuracy = 0.5580459770114943


Epoch[1] Batch[440] Speed: 1.2509913216157587 samples/sec                   batch loss = 1209.3936088085175 | accuracy = 0.5579545454545455


Epoch[1] Batch[445] Speed: 1.2572292184791507 samples/sec                   batch loss = 1222.5730633735657 | accuracy = 0.5584269662921348


Epoch[1] Batch[450] Speed: 1.2534023979519682 samples/sec                   batch loss = 1235.8196127414703 | accuracy = 0.5577777777777778


Epoch[1] Batch[455] Speed: 1.2572749132557426 samples/sec                   batch loss = 1249.0992078781128 | accuracy = 0.5593406593406594


Epoch[1] Batch[460] Speed: 1.2520128251864993 samples/sec                   batch loss = 1262.338526725769 | accuracy = 0.5597826086956522


Epoch[1] Batch[465] Speed: 1.2508297814583889 samples/sec                   batch loss = 1275.6941709518433 | accuracy = 0.560752688172043


Epoch[1] Batch[470] Speed: 1.254451480823634 samples/sec                   batch loss = 1288.5055470466614 | accuracy = 0.5622340425531915


Epoch[1] Batch[475] Speed: 1.256304825033519 samples/sec                   batch loss = 1302.438247680664 | accuracy = 0.5626315789473684


Epoch[1] Batch[480] Speed: 1.2539637397849088 samples/sec                   batch loss = 1316.2850484848022 | accuracy = 0.5630208333333333


Epoch[1] Batch[485] Speed: 1.2483185082692876 samples/sec                   batch loss = 1330.2534265518188 | accuracy = 0.5623711340206186


Epoch[1] Batch[490] Speed: 1.2527073197575467 samples/sec                   batch loss = 1343.6449613571167 | accuracy = 0.563265306122449


Epoch[1] Batch[495] Speed: 1.2564850969095747 samples/sec                   batch loss = 1356.845359325409 | accuracy = 0.5646464646464646


Epoch[1] Batch[500] Speed: 1.2557273814053846 samples/sec                   batch loss = 1370.3832790851593 | accuracy = 0.5655


Epoch[1] Batch[505] Speed: 1.2556855582901312 samples/sec                   batch loss = 1383.4960260391235 | accuracy = 0.5663366336633663


Epoch[1] Batch[510] Speed: 1.2574207818992498 samples/sec                   batch loss = 1396.4512567520142 | accuracy = 0.5661764705882353


Epoch[1] Batch[515] Speed: 1.2568448543627206 samples/sec                   batch loss = 1410.090342760086 | accuracy = 0.5650485436893203


Epoch[1] Batch[520] Speed: 1.255006909667745 samples/sec                   batch loss = 1424.3677024841309 | accuracy = 0.5649038461538461


Epoch[1] Batch[525] Speed: 1.2583925949833408 samples/sec                   batch loss = 1437.464588880539 | accuracy = 0.5652380952380952


Epoch[1] Batch[530] Speed: 1.259905643091057 samples/sec                   batch loss = 1450.437124490738 | accuracy = 0.5660377358490566


Epoch[1] Batch[535] Speed: 1.2546072027749517 samples/sec                   batch loss = 1464.0654714107513 | accuracy = 0.5663551401869159


Epoch[1] Batch[540] Speed: 1.2587405074241251 samples/sec                   batch loss = 1477.9180808067322 | accuracy = 0.5662037037037037


Epoch[1] Batch[545] Speed: 1.2585299431302117 samples/sec                   batch loss = 1491.4273142814636 | accuracy = 0.5660550458715596


Epoch[1] Batch[550] Speed: 1.2612276532837579 samples/sec                   batch loss = 1504.986421585083 | accuracy = 0.5659090909090909


Epoch[1] Batch[555] Speed: 1.2597485085684246 samples/sec                   batch loss = 1518.0196542739868 | accuracy = 0.5653153153153153


Epoch[1] Batch[560] Speed: 1.2646189104110783 samples/sec                   batch loss = 1530.7897815704346 | accuracy = 0.5665178571428572


Epoch[1] Batch[565] Speed: 1.2517078426018355 samples/sec                   batch loss = 1544.0618915557861 | accuracy = 0.5672566371681416


Epoch[1] Batch[570] Speed: 1.2549266475384508 samples/sec                   batch loss = 1556.5381028652191 | accuracy = 0.5688596491228071


Epoch[1] Batch[575] Speed: 1.2479919282138279 samples/sec                   batch loss = 1570.0139079093933 | accuracy = 0.5691304347826087


Epoch[1] Batch[580] Speed: 1.2568117126721352 samples/sec                   batch loss = 1582.9365985393524 | accuracy = 0.5706896551724138


Epoch[1] Batch[585] Speed: 1.2575951525229574 samples/sec                   batch loss = 1595.1502225399017 | accuracy = 0.5713675213675213


Epoch[1] Batch[590] Speed: 1.2585409889227765 samples/sec                   batch loss = 1608.5831320285797 | accuracy = 0.5720338983050848


Epoch[1] Batch[595] Speed: 1.2483688522699454 samples/sec                   batch loss = 1620.911391735077 | accuracy = 0.573109243697479


Epoch[1] Batch[600] Speed: 1.2483234310275027 samples/sec                   batch loss = 1634.118575334549 | accuracy = 0.57375


Epoch[1] Batch[605] Speed: 1.2519188392144853 samples/sec                   batch loss = 1647.913109779358 | accuracy = 0.5743801652892562


Epoch[1] Batch[610] Speed: 1.2525334602499136 samples/sec                   batch loss = 1661.0686039924622 | accuracy = 0.5754098360655737


Epoch[1] Batch[615] Speed: 1.250391725236645 samples/sec                   batch loss = 1675.060338973999 | accuracy = 0.5756097560975609


Epoch[1] Batch[620] Speed: 1.2560877394852015 samples/sec                   batch loss = 1687.3893043994904 | accuracy = 0.5766129032258065


Epoch[1] Batch[625] Speed: 1.250560423048776 samples/sec                   batch loss = 1700.2463352680206 | accuracy = 0.5776


Epoch[1] Batch[630] Speed: 1.2544048655657474 samples/sec                   batch loss = 1712.8009660243988 | accuracy = 0.5785714285714286


Epoch[1] Batch[635] Speed: 1.2545774625243087 samples/sec                   batch loss = 1725.6017560958862 | accuracy = 0.5791338582677166


Epoch[1] Batch[640] Speed: 1.2544477289644191 samples/sec                   batch loss = 1737.866370677948 | accuracy = 0.580859375


Epoch[1] Batch[645] Speed: 1.258795756809223 samples/sec                   batch loss = 1752.1208803653717 | accuracy = 0.5806201550387597


Epoch[1] Batch[650] Speed: 1.2609997644440376 samples/sec                   batch loss = 1765.6588928699493 | accuracy = 0.5807692307692308


Epoch[1] Batch[655] Speed: 1.2576038251892472 samples/sec                   batch loss = 1776.8728086948395 | accuracy = 0.583206106870229


Epoch[1] Batch[660] Speed: 1.2530885954422282 samples/sec                   batch loss = 1789.052508354187 | accuracy = 0.584469696969697


Epoch[1] Batch[665] Speed: 1.2548937007944692 samples/sec                   batch loss = 1802.4343910217285 | accuracy = 0.5842105263157895


Epoch[1] Batch[670] Speed: 1.2593751756981026 samples/sec                   batch loss = 1814.9063551425934 | accuracy = 0.5850746268656717


Epoch[1] Batch[675] Speed: 1.2540402231007275 samples/sec                   batch loss = 1829.518771648407 | accuracy = 0.5840740740740741


Epoch[1] Batch[680] Speed: 1.2579079154581598 samples/sec                   batch loss = 1842.7166457176208 | accuracy = 0.5841911764705883


Epoch[1] Batch[685] Speed: 1.2486344794469908 samples/sec                   batch loss = 1855.3099937438965 | accuracy = 0.583941605839416


Epoch[1] Batch[690] Speed: 1.2525152260351389 samples/sec                   batch loss = 1867.0598449707031 | accuracy = 0.5847826086956521


Epoch[1] Batch[695] Speed: 1.2554625800340542 samples/sec                   batch loss = 1880.7999465465546 | accuracy = 0.5845323741007195


Epoch[1] Batch[700] Speed: 1.255899872450364 samples/sec                   batch loss = 1893.2965214252472 | accuracy = 0.5842857142857143


Epoch[1] Batch[705] Speed: 1.2551016415758045 samples/sec                   batch loss = 1904.00970184803 | accuracy = 0.5861702127659575


Epoch[1] Batch[710] Speed: 1.2515700186856542 samples/sec                   batch loss = 1915.764233469963 | accuracy = 0.5873239436619718


Epoch[1] Batch[715] Speed: 1.2495383090985952 samples/sec                   batch loss = 1929.23595058918 | accuracy = 0.5874125874125874


Epoch[1] Batch[720] Speed: 1.2545126392932646 samples/sec                   batch loss = 1940.606264948845 | accuracy = 0.5888888888888889


Epoch[1] Batch[725] Speed: 1.2528653221843171 samples/sec                   batch loss = 1954.895344376564 | accuracy = 0.5882758620689655


Epoch[1] Batch[730] Speed: 1.2534215007917364 samples/sec                   batch loss = 1967.627583861351 | accuracy = 0.5886986301369863


Epoch[1] Batch[735] Speed: 1.2511552361108418 samples/sec                   batch loss = 1981.3823009729385 | accuracy = 0.5887755102040816


Epoch[1] Batch[740] Speed: 1.25878725658814 samples/sec                   batch loss = 1993.8928161859512 | accuracy = 0.589527027027027


Epoch[1] Batch[745] Speed: 1.2556036118589062 samples/sec                   batch loss = 2005.9498010873795 | accuracy = 0.5902684563758389


Epoch[1] Batch[750] Speed: 1.253534537842702 samples/sec                   batch loss = 2018.6611119508743 | accuracy = 0.591


Epoch[1] Batch[755] Speed: 1.2606641481260177 samples/sec                   batch loss = 2029.0401319265366 | accuracy = 0.5927152317880795


Epoch[1] Batch[760] Speed: 1.2558764635297481 samples/sec                   batch loss = 2042.1994901895523 | accuracy = 0.593421052631579


Epoch[1] Batch[765] Speed: 1.2565896519522914 samples/sec                   batch loss = 2055.609755039215 | accuracy = 0.5931372549019608


Epoch[1] Batch[770] Speed: 1.2565680995792494 samples/sec                   batch loss = 2067.622058033943 | accuracy = 0.5935064935064935


Epoch[1] Batch[775] Speed: 1.2559261967770041 samples/sec                   batch loss = 2080.437858939171 | accuracy = 0.5935483870967742


Epoch[1] Batch[780] Speed: 1.2619363979306526 samples/sec                   batch loss = 2096.754199385643 | accuracy = 0.5935897435897436


Epoch[1] Batch[785] Speed: 1.2518227190549633 samples/sec                   batch loss = 2108.367845416069 | accuracy = 0.5939490445859873


[Epoch 1] training: accuracy=0.5942258883248731
[Epoch 1] time cost: 647.0973417758942
[Epoch 1] validation: validation accuracy=0.6644444444444444


Epoch[2] Batch[5] Speed: 1.2376607870482905 samples/sec                   batch loss = 12.11466133594513 | accuracy = 0.65


Epoch[2] Batch[10] Speed: 1.234287880165631 samples/sec                   batch loss = 23.836285710334778 | accuracy = 0.7


Epoch[2] Batch[15] Speed: 1.2285682337221198 samples/sec                   batch loss = 35.5488406419754 | accuracy = 0.6666666666666666


Epoch[2] Batch[20] Speed: 1.2347687927594921 samples/sec                   batch loss = 45.48794412612915 | accuracy = 0.7125


Epoch[2] Batch[25] Speed: 1.2381272454004943 samples/sec                   batch loss = 58.9440484046936 | accuracy = 0.69


Epoch[2] Batch[30] Speed: 1.2361539389346194 samples/sec                   batch loss = 68.92395615577698 | accuracy = 0.7


Epoch[2] Batch[35] Speed: 1.2394507125401586 samples/sec                   batch loss = 81.76442074775696 | accuracy = 0.7


Epoch[2] Batch[40] Speed: 1.2406159889287338 samples/sec                   batch loss = 92.52876615524292 | accuracy = 0.7125


Epoch[2] Batch[45] Speed: 1.2394198552548819 samples/sec                   batch loss = 107.78976202011108 | accuracy = 0.6944444444444444


Epoch[2] Batch[50] Speed: 1.2419050698417338 samples/sec                   batch loss = 120.73640394210815 | accuracy = 0.685


Epoch[2] Batch[55] Speed: 1.237453747285798 samples/sec                   batch loss = 132.43037009239197 | accuracy = 0.6863636363636364


Epoch[2] Batch[60] Speed: 1.241112863769107 samples/sec                   batch loss = 147.80032873153687 | accuracy = 0.6666666666666666


Epoch[2] Batch[65] Speed: 1.2374320249067512 samples/sec                   batch loss = 162.28209161758423 | accuracy = 0.65


Epoch[2] Batch[70] Speed: 1.2379825298910232 samples/sec                   batch loss = 174.6361733675003 | accuracy = 0.6535714285714286


Epoch[2] Batch[75] Speed: 1.2418761125906792 samples/sec                   batch loss = 188.25207912921906 | accuracy = 0.65


Epoch[2] Batch[80] Speed: 1.2370162522834773 samples/sec                   batch loss = 202.37376189231873 | accuracy = 0.64375


Epoch[2] Batch[85] Speed: 1.2416150067634102 samples/sec                   batch loss = 214.66852486133575 | accuracy = 0.65


Epoch[2] Batch[90] Speed: 1.2447812160280824 samples/sec                   batch loss = 226.88099563121796 | accuracy = 0.6444444444444445


Epoch[2] Batch[95] Speed: 1.2448706232716948 samples/sec                   batch loss = 235.48253071308136 | accuracy = 0.6578947368421053


Epoch[2] Batch[100] Speed: 1.2483662513733935 samples/sec                   batch loss = 250.2752047777176 | accuracy = 0.6525


Epoch[2] Batch[105] Speed: 1.2442391364645107 samples/sec                   batch loss = 263.8727787733078 | accuracy = 0.6428571428571429


Epoch[2] Batch[110] Speed: 1.252609301577579 samples/sec                   batch loss = 278.51846754550934 | accuracy = 0.6386363636363637


Epoch[2] Batch[115] Speed: 1.2434923355401202 samples/sec                   batch loss = 290.8853713274002 | accuracy = 0.6391304347826087


Epoch[2] Batch[120] Speed: 1.248008359919281 samples/sec                   batch loss = 302.8404299020767 | accuracy = 0.6416666666666667


Epoch[2] Batch[125] Speed: 1.241382208795772 samples/sec                   batch loss = 313.45209884643555 | accuracy = 0.646


Epoch[2] Batch[130] Speed: 1.2498218447600755 samples/sec                   batch loss = 324.51807379722595 | accuracy = 0.6480769230769231


Epoch[2] Batch[135] Speed: 1.2445170409553366 samples/sec                   batch loss = 337.8345273733139 | accuracy = 0.6425925925925926


Epoch[2] Batch[140] Speed: 1.2452371609255146 samples/sec                   batch loss = 349.43974339962006 | accuracy = 0.6482142857142857


Epoch[2] Batch[145] Speed: 1.2539272822722363 samples/sec                   batch loss = 363.198321223259 | accuracy = 0.6396551724137931


Epoch[2] Batch[150] Speed: 1.2504228516846319 samples/sec                   batch loss = 374.5067901611328 | accuracy = 0.6416666666666667


Epoch[2] Batch[155] Speed: 1.248194244706012 samples/sec                   batch loss = 385.7485566139221 | accuracy = 0.6483870967741936


Epoch[2] Batch[160] Speed: 1.2470341602834136 samples/sec                   batch loss = 399.44677221775055 | accuracy = 0.64375


Epoch[2] Batch[165] Speed: 1.247064934400784 samples/sec                   batch loss = 412.0380334854126 | accuracy = 0.646969696969697


Epoch[2] Batch[170] Speed: 1.24781965344574 samples/sec                   batch loss = 424.0366668701172 | accuracy = 0.6455882352941177


Epoch[2] Batch[175] Speed: 1.251393487810352 samples/sec                   batch loss = 433.04247868061066 | accuracy = 0.6528571428571428


Epoch[2] Batch[180] Speed: 1.2474581796089128 samples/sec                   batch loss = 443.17391216754913 | accuracy = 0.6555555555555556


Epoch[2] Batch[185] Speed: 1.2437404012047157 samples/sec                   batch loss = 453.235871553421 | accuracy = 0.6567567567567567


Epoch[2] Batch[190] Speed: 1.2468499181238082 samples/sec                   batch loss = 468.48828530311584 | accuracy = 0.6552631578947369


Epoch[2] Batch[195] Speed: 1.243340097494632 samples/sec                   batch loss = 481.3687696456909 | accuracy = 0.6551282051282051


Epoch[2] Batch[200] Speed: 1.2464732623523116 samples/sec                   batch loss = 491.60658276081085 | accuracy = 0.65875


Epoch[2] Batch[205] Speed: 1.252674582972327 samples/sec                   batch loss = 505.8667711019516 | accuracy = 0.6585365853658537


Epoch[2] Batch[210] Speed: 1.2496271910370624 samples/sec                   batch loss = 517.4227575063705 | accuracy = 0.6607142857142857


Epoch[2] Batch[215] Speed: 1.2502365820758907 samples/sec                   batch loss = 528.2507011890411 | accuracy = 0.6627906976744186


Epoch[2] Batch[220] Speed: 1.2468186913231916 samples/sec                   batch loss = 540.1804713010788 | accuracy = 0.6647727272727273


Epoch[2] Batch[225] Speed: 1.2388360540437762 samples/sec                   batch loss = 553.6943792104721 | accuracy = 0.6655555555555556


Epoch[2] Batch[230] Speed: 1.2441507425282579 samples/sec                   batch loss = 564.0716832876205 | accuracy = 0.6673913043478261


Epoch[2] Batch[235] Speed: 1.2409337630822632 samples/sec                   batch loss = 578.5651692152023 | accuracy = 0.6638297872340425


Epoch[2] Batch[240] Speed: 1.2476207983599468 samples/sec                   batch loss = 588.6167857646942 | accuracy = 0.6677083333333333


Epoch[2] Batch[245] Speed: 1.247012656380004 samples/sec                   batch loss = 601.1084163188934 | accuracy = 0.6642857142857143


Epoch[2] Batch[250] Speed: 1.2368625866085712 samples/sec                   batch loss = 613.2579332590103 | accuracy = 0.663


Epoch[2] Batch[255] Speed: 1.2377921855054534 samples/sec                   batch loss = 629.416601061821 | accuracy = 0.6598039215686274


Epoch[2] Batch[260] Speed: 1.2507002623626293 samples/sec                   batch loss = 643.2766436338425 | accuracy = 0.6586538461538461


Epoch[2] Batch[265] Speed: 1.2471892513400031 samples/sec                   batch loss = 655.8545600175858 | accuracy = 0.659433962264151


Epoch[2] Batch[270] Speed: 1.2460996106295228 samples/sec                   batch loss = 667.2517377138138 | accuracy = 0.6601851851851852


Epoch[2] Batch[275] Speed: 1.249709755536172 samples/sec                   batch loss = 679.2930123806 | accuracy = 0.6590909090909091


Epoch[2] Batch[280] Speed: 1.242915097330919 samples/sec                   batch loss = 692.741628408432 | accuracy = 0.6589285714285714


Epoch[2] Batch[285] Speed: 1.237099256747181 samples/sec                   batch loss = 704.7700068950653 | accuracy = 0.6587719298245615


Epoch[2] Batch[290] Speed: 1.2458267346653416 samples/sec                   batch loss = 718.1180262565613 | accuracy = 0.6586206896551724


Epoch[2] Batch[295] Speed: 1.248336527611045 samples/sec                   batch loss = 729.0225316286087 | accuracy = 0.6601694915254237


Epoch[2] Batch[300] Speed: 1.2528944199308687 samples/sec                   batch loss = 745.4594649076462 | accuracy = 0.6575


Epoch[2] Batch[305] Speed: 1.2442696805041094 samples/sec                   batch loss = 755.9700382947922 | accuracy = 0.6598360655737705


Epoch[2] Batch[310] Speed: 1.2406093837429688 samples/sec                   batch loss = 768.4062069654465 | accuracy = 0.6596774193548387


Epoch[2] Batch[315] Speed: 1.2398956127312737 samples/sec                   batch loss = 781.8183954954147 | accuracy = 0.6571428571428571


Epoch[2] Batch[320] Speed: 1.2418893500236021 samples/sec                   batch loss = 794.8347301483154 | accuracy = 0.65625


Epoch[2] Batch[325] Speed: 1.243528649635848 samples/sec                   batch loss = 804.841615319252 | accuracy = 0.6584615384615384


Epoch[2] Batch[330] Speed: 1.25021412910819 samples/sec                   batch loss = 815.4023171663284 | accuracy = 0.6606060606060606


Epoch[2] Batch[335] Speed: 1.2531553308498116 samples/sec                   batch loss = 828.6516366004944 | accuracy = 0.658955223880597


Epoch[2] Batch[340] Speed: 1.2546954936542856 samples/sec                   batch loss = 840.1683104038239 | accuracy = 0.6595588235294118


Epoch[2] Batch[345] Speed: 1.2539068519966512 samples/sec                   batch loss = 854.3478873968124 | accuracy = 0.6565217391304348


Epoch[2] Batch[350] Speed: 1.2506947614254234 samples/sec                   batch loss = 868.4427710771561 | accuracy = 0.6535714285714286


Epoch[2] Batch[355] Speed: 1.2467973801614403 samples/sec                   batch loss = 878.7114852666855 | accuracy = 0.6549295774647887


Epoch[2] Batch[360] Speed: 1.242353941984884 samples/sec                   batch loss = 887.622532427311 | accuracy = 0.6569444444444444


Epoch[2] Batch[365] Speed: 1.2478178901029986 samples/sec                   batch loss = 897.3652215600014 | accuracy = 0.6589041095890411


Epoch[2] Batch[370] Speed: 1.2410394179728037 samples/sec                   batch loss = 908.5368857979774 | accuracy = 0.6587837837837838


Epoch[2] Batch[375] Speed: 1.2363262871920175 samples/sec                   batch loss = 921.696673810482 | accuracy = 0.6586666666666666


Epoch[2] Batch[380] Speed: 1.2488383057702181 samples/sec                   batch loss = 933.1706357598305 | accuracy = 0.6592105263157895


Epoch[2] Batch[385] Speed: 1.2473361274432941 samples/sec                   batch loss = 943.7960520386696 | accuracy = 0.6610389610389611


Epoch[2] Batch[390] Speed: 1.249725767051494 samples/sec                   batch loss = 954.4610329270363 | accuracy = 0.6628205128205128


Epoch[2] Batch[395] Speed: 1.2497819967857842 samples/sec                   batch loss = 968.1720076203346 | accuracy = 0.6620253164556962


Epoch[2] Batch[400] Speed: 1.2449438765867413 samples/sec                   batch loss = 981.7505155205727 | accuracy = 0.661875


Epoch[2] Batch[405] Speed: 1.2389454691564004 samples/sec                   batch loss = 991.7759039998055 | accuracy = 0.6635802469135802


Epoch[2] Batch[410] Speed: 1.2430070915764442 samples/sec                   batch loss = 1002.411738216877 | accuracy = 0.6646341463414634


Epoch[2] Batch[415] Speed: 1.2429114141611697 samples/sec                   batch loss = 1014.2319377064705 | accuracy = 0.6650602409638554


Epoch[2] Batch[420] Speed: 1.2391006595133043 samples/sec                   batch loss = 1024.8557206988335 | accuracy = 0.6660714285714285


Epoch[2] Batch[425] Speed: 1.2423483302304028 samples/sec                   batch loss = 1034.6879406571388 | accuracy = 0.6676470588235294


Epoch[2] Batch[430] Speed: 1.2498647679674595 samples/sec                   batch loss = 1046.5482260584831 | accuracy = 0.6680232558139535


Epoch[2] Batch[435] Speed: 1.2455077444387908 samples/sec                   batch loss = 1057.524119079113 | accuracy = 0.6683908045977012


Epoch[2] Batch[440] Speed: 1.2444985778242956 samples/sec                   batch loss = 1071.1618080735207 | accuracy = 0.6681818181818182


Epoch[2] Batch[445] Speed: 1.24992939111111 samples/sec                   batch loss = 1083.4377824664116 | accuracy = 0.6662921348314607


Epoch[2] Batch[450] Speed: 1.2467467922339606 samples/sec                   batch loss = 1096.6043996214867 | accuracy = 0.665


Epoch[2] Batch[455] Speed: 1.2521760729302378 samples/sec                   batch loss = 1110.4641781449318 | accuracy = 0.6648351648351648


Epoch[2] Batch[460] Speed: 1.2398673905107063 samples/sec                   batch loss = 1122.6764894127846 | accuracy = 0.6641304347826087


Epoch[2] Batch[465] Speed: 1.2434519685524914 samples/sec                   batch loss = 1132.2732852101326 | accuracy = 0.6655913978494624


Epoch[2] Batch[470] Speed: 1.2374727321743562 samples/sec                   batch loss = 1146.1141396164894 | accuracy = 0.6654255319148936


Epoch[2] Batch[475] Speed: 1.241943497739258 samples/sec                   batch loss = 1158.522703230381 | accuracy = 0.6657894736842105


Epoch[2] Batch[480] Speed: 1.2404808718345484 samples/sec                   batch loss = 1172.0677521824837 | accuracy = 0.6645833333333333


Epoch[2] Batch[485] Speed: 1.2456355428842956 samples/sec                   batch loss = 1182.7486054301262 | accuracy = 0.6649484536082474


Epoch[2] Batch[490] Speed: 1.249879852310907 samples/sec                   batch loss = 1194.8895476460457 | accuracy = 0.6663265306122449


Epoch[2] Batch[495] Speed: 1.244851318347872 samples/sec                   batch loss = 1205.3412891030312 | accuracy = 0.6666666666666666


Epoch[2] Batch[500] Speed: 1.2487277870248754 samples/sec                   batch loss = 1216.836915075779 | accuracy = 0.6665


Epoch[2] Batch[505] Speed: 1.2399782709114762 samples/sec                   batch loss = 1228.7865681052208 | accuracy = 0.6663366336633664


Epoch[2] Batch[510] Speed: 1.2411032235127333 samples/sec                   batch loss = 1243.0861259102821 | accuracy = 0.6651960784313725


Epoch[2] Batch[515] Speed: 1.2404560164299712 samples/sec                   batch loss = 1254.3946207165718 | accuracy = 0.6660194174757281


Epoch[2] Batch[520] Speed: 1.2367333910767926 samples/sec                   batch loss = 1266.74812489748 | accuracy = 0.6649038461538461


Epoch[2] Batch[525] Speed: 1.2426518056993705 samples/sec                   batch loss = 1280.4553771615028 | accuracy = 0.6638095238095238


Epoch[2] Batch[530] Speed: 1.243720024931903 samples/sec                   batch loss = 1293.9105033278465 | accuracy = 0.6636792452830189


Epoch[2] Batch[535] Speed: 1.247852879466937 samples/sec                   batch loss = 1306.2381491065025 | accuracy = 0.664018691588785


Epoch[2] Batch[540] Speed: 1.2469546366166633 samples/sec                   batch loss = 1316.0097056031227 | accuracy = 0.6652777777777777


Epoch[2] Batch[545] Speed: 1.2496360333873542 samples/sec                   batch loss = 1327.0171956419945 | accuracy = 0.6655963302752294


Epoch[2] Batch[550] Speed: 1.2413695332756298 samples/sec                   batch loss = 1340.204135119915 | accuracy = 0.6645454545454546


Epoch[2] Batch[555] Speed: 1.2441955838114418 samples/sec                   batch loss = 1353.9289559721947 | accuracy = 0.6635135135135135


Epoch[2] Batch[560] Speed: 1.2348417708685169 samples/sec                   batch loss = 1365.284280717373 | accuracy = 0.6638392857142857


Epoch[2] Batch[565] Speed: 1.2372050807863795 samples/sec                   batch loss = 1377.8321378827095 | accuracy = 0.6637168141592921


Epoch[2] Batch[570] Speed: 1.2428918937259252 samples/sec                   batch loss = 1390.296528995037 | accuracy = 0.6644736842105263


Epoch[2] Batch[575] Speed: 1.2393239968729577 samples/sec                   batch loss = 1403.836031615734 | accuracy = 0.6639130434782609


Epoch[2] Batch[580] Speed: 1.2472403387882514 samples/sec                   batch loss = 1416.2829142212868 | accuracy = 0.6646551724137931


Epoch[2] Batch[585] Speed: 1.2424883632542778 samples/sec                   batch loss = 1430.4356778264046 | accuracy = 0.6628205128205128


Epoch[2] Batch[590] Speed: 1.2482478291726724 samples/sec                   batch loss = 1440.1244146227837 | accuracy = 0.6644067796610169


Epoch[2] Batch[595] Speed: 1.244512148172259 samples/sec                   batch loss = 1450.9607283473015 | accuracy = 0.665126050420168


Epoch[2] Batch[600] Speed: 1.240432812836869 samples/sec                   batch loss = 1461.4396811127663 | accuracy = 0.6654166666666667


Epoch[2] Batch[605] Speed: 1.2432903425366413 samples/sec                   batch loss = 1471.4796513915062 | accuracy = 0.6661157024793388


Epoch[2] Batch[610] Speed: 1.2463459400824137 samples/sec                   batch loss = 1485.0884019732475 | accuracy = 0.6647540983606557


Epoch[2] Batch[615] Speed: 1.2424303956901395 samples/sec                   batch loss = 1498.1994522213936 | accuracy = 0.6646341463414634


Epoch[2] Batch[620] Speed: 1.2404181390758697 samples/sec                   batch loss = 1507.258984863758 | accuracy = 0.665725806451613


Epoch[2] Batch[625] Speed: 1.2498250103623068 samples/sec                   batch loss = 1517.0366721749306 | accuracy = 0.6668


Epoch[2] Batch[630] Speed: 1.2421776099350421 samples/sec                   batch loss = 1528.4643278717995 | accuracy = 0.6674603174603174


Epoch[2] Batch[635] Speed: 1.2399035848249935 samples/sec                   batch loss = 1538.2332111001015 | accuracy = 0.668503937007874


Epoch[2] Batch[640] Speed: 1.2423231239585117 samples/sec                   batch loss = 1549.442228615284 | accuracy = 0.66875


Epoch[2] Batch[645] Speed: 1.2424747449912108 samples/sec                   batch loss = 1560.0594826340675 | accuracy = 0.6697674418604651


Epoch[2] Batch[650] Speed: 1.2470211837012408 samples/sec                   batch loss = 1570.2894992232323 | accuracy = 0.6707692307692308


Epoch[2] Batch[655] Speed: 1.2460786016886052 samples/sec                   batch loss = 1583.4409913420677 | accuracy = 0.6709923664122137


Epoch[2] Batch[660] Speed: 1.242377953506916 samples/sec                   batch loss = 1598.0889033675194 | accuracy = 0.6708333333333333


Epoch[2] Batch[665] Speed: 1.2397094430992737 samples/sec                   batch loss = 1609.3261266350746 | accuracy = 0.6703007518796993


Epoch[2] Batch[670] Speed: 1.2421250051640402 samples/sec                   batch loss = 1619.84366440773 | accuracy = 0.6708955223880597


Epoch[2] Batch[675] Speed: 1.247423676119241 samples/sec                   batch loss = 1631.4615927934647 | accuracy = 0.6711111111111111


Epoch[2] Batch[680] Speed: 1.2475851725998517 samples/sec                   batch loss = 1643.0203385353088 | accuracy = 0.6716911764705882


Epoch[2] Batch[685] Speed: 1.2448223159035967 samples/sec                   batch loss = 1655.1865819692612 | accuracy = 0.6726277372262773


Epoch[2] Batch[690] Speed: 1.2425013376879939 samples/sec                   batch loss = 1667.4438165426254 | accuracy = 0.6717391304347826


Epoch[2] Batch[695] Speed: 1.2450539113202344 samples/sec                   batch loss = 1677.4274014234543 | accuracy = 0.6726618705035972


Epoch[2] Batch[700] Speed: 1.2412835670140223 samples/sec                   batch loss = 1687.5953795909882 | accuracy = 0.6735714285714286


Epoch[2] Batch[705] Speed: 1.2430115120632503 samples/sec                   batch loss = 1703.0187034606934 | accuracy = 0.673049645390071


Epoch[2] Batch[710] Speed: 1.2443056708740028 samples/sec                   batch loss = 1714.6890585422516 | accuracy = 0.6721830985915493


Epoch[2] Batch[715] Speed: 1.2361305316994842 samples/sec                   batch loss = 1725.3568542003632 | accuracy = 0.6723776223776223


Epoch[2] Batch[720] Speed: 1.247187489778502 samples/sec                   batch loss = 1737.0568841695786 | accuracy = 0.6722222222222223


Epoch[2] Batch[725] Speed: 1.2417488086337032 samples/sec                   batch loss = 1750.029645562172 | accuracy = 0.6724137931034483


Epoch[2] Batch[730] Speed: 1.2428738470765541 samples/sec                   batch loss = 1760.6917123794556 | accuracy = 0.6722602739726027


Epoch[2] Batch[735] Speed: 1.2407049823162806 samples/sec                   batch loss = 1770.7323096990585 | accuracy = 0.6727891156462585


Epoch[2] Batch[740] Speed: 1.2377698120114728 samples/sec                   batch loss = 1780.636755347252 | accuracy = 0.6733108108108108


Epoch[2] Batch[745] Speed: 1.2332433190746848 samples/sec                   batch loss = 1792.3139731884003 | accuracy = 0.6734899328859061


Epoch[2] Batch[750] Speed: 1.2376445353905936 samples/sec                   batch loss = 1801.9485161304474 | accuracy = 0.6736666666666666


Epoch[2] Batch[755] Speed: 1.241974572754255 samples/sec                   batch loss = 1813.5255250930786 | accuracy = 0.6731788079470199


Epoch[2] Batch[760] Speed: 1.2389514161836563 samples/sec                   batch loss = 1822.4335215091705 | accuracy = 0.6740131578947368


Epoch[2] Batch[765] Speed: 1.234543551114353 samples/sec                   batch loss = 1833.1742033958435 | accuracy = 0.6738562091503268


Epoch[2] Batch[770] Speed: 1.2406212180840122 samples/sec                   batch loss = 1843.7998107671738 | accuracy = 0.6737012987012987


Epoch[2] Batch[775] Speed: 1.2394177493240617 samples/sec                   batch loss = 1856.6919606924057 | accuracy = 0.6745161290322581


Epoch[2] Batch[780] Speed: 1.2389345816698234 samples/sec                   batch loss = 1865.643525838852 | accuracy = 0.6753205128205129


Epoch[2] Batch[785] Speed: 1.2411929295896749 samples/sec                   batch loss = 1876.6852729320526 | accuracy = 0.6764331210191082


[Epoch 2] training: accuracy=0.6767131979695431
[Epoch 2] time cost: 649.6327369213104
[Epoch 2] validation: validation accuracy=0.7255555555555555
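
The log above is easier to compare across epochs once it is parsed into numbers. The following is a minimal sketch, not part of the original training script: the `parse_log` and `interval_losses` helpers are hypothetical names, and the regular expression simply mirrors the line format printed above. It also assumes that the `batch loss` column is a running total within an epoch (the values grow monotonically in the log), so the loss over each logging interval is taken as the difference between consecutive entries.

```python
import re

# Pattern mirroring the per-batch log lines printed above (assumed format).
LOG_LINE = re.compile(
    r"Epoch\[(\d+)\] Batch\[(\d+)\] Speed: ([\d.]+) samples/sec\s+"
    r"batch loss = ([\d.]+) \| accuracy = ([\d.]+)"
)


def parse_log(text):
    """Return one dict per per-batch log line found in `text`."""
    records = []
    for line in text.splitlines():
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip blank lines and the per-epoch summary lines
        epoch, batch, speed, loss, accuracy = match.groups()
        records.append({
            "epoch": int(epoch),
            "batch": int(batch),
            "speed": float(speed),
            "loss": float(loss),          # assumed to be cumulative within an epoch
            "accuracy": float(accuracy),  # running accuracy reported by the log
        })
    return records


def interval_losses(records):
    """Difference of the (assumed cumulative) loss between consecutive entries of one epoch."""
    deltas = []
    for prev, curr in zip(records, records[1:]):
        if prev["epoch"] == curr["epoch"]:
            deltas.append(curr["loss"] - prev["loss"])
    return deltas


# Three lines copied from the log above.
sample = """\
Epoch[2] Batch[775] Speed: 1.2394177493240617 samples/sec                   batch loss = 1856.6919606924057 | accuracy = 0.6745161290322581
Epoch[2] Batch[780] Speed: 1.2389345816698234 samples/sec                   batch loss = 1865.643525838852 | accuracy = 0.6753205128205129
Epoch[2] Batch[785] Speed: 1.2411929295896749 samples/sec                   batch loss = 1876.6852729320526 | accuracy = 0.6764331210191082
"""

records = parse_log(sample)
deltas = interval_losses(records)
print(records[-1]["accuracy"], deltas)
```

If the cumulative-loss assumption does not hold for your training loop, use the `loss` values directly instead of their differences.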


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen on studying more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).