<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and running inference on GPUs can often be faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet; see the [installation instructions](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html) for your system.

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[03:31:15] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the printed array shows which GPU it is allocated on, as in `ctx=gpu(0)` above.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to `npx.cpu()` and copies `x` to main memory instead:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[03:31:15] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet addresses each of them through `npx` with a zero-based index. As you saw before, the first GPU is accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has, so on a machine with eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This lets you select which GPUs to use for operations and training, which is particularly useful when you want to leverage multiple GPUs while training neural networks.
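
As a minimal sketch of this zero-based numbering, the helper below picks the first few GPU IDs from however many are available. The name `select_gpu_ids` is hypothetical and not part of MXNet; each returned ID `i` would then be passed to `npx.gpu(i)`.

```python
def select_gpu_ids(num_available, num_requested):
    """Return zero-based IDs for up to num_requested of the available GPUs.

    Hypothetical helper for illustration only; it is not part of MXNet.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # IDs start at 0, so n GPUs are numbered 0 through n - 1.
    return list(range(min(num_available, num_requested)))


# On an eight-GPU machine, the IDs run from 0 through 7.
all_ids = select_gpu_ids(num_available=8, num_requested=8)
print(all_ids[-1])  # 7

# Asking for more GPUs than exist yields only the ones that are present.
print(select_gpu_ids(num_available=1, num_requested=2))  # [0]
```

Capping the request at the number of available GPUs keeps later code from referring to a device that does not exist.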

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[03:31:15] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 2.4428618, -3.128041 ]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With *n* GPUs, data parallelism splits each data batch into *n* parts and assigns one part to each GPU, which then runs the forward and backward passes on its own chunk of the data.
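
The splitting step can be sketched in plain Python as follows. The function `split_batch` is a hypothetical illustration that works on ordinary lists; the training loop in this tutorial uses `gluon.utils.split_and_load` instead, and this sketch does not claim to reproduce its exact slicing behavior.

```python
def split_batch(batch, num_parts):
    """Split a list of samples into num_parts contiguous, near-equal slices.

    Illustrative only; this is not MXNet's implementation.
    """
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    if len(batch) < num_parts:
        raise ValueError("batch has fewer samples than parts")
    base, extra = divmod(len(batch), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        # The first `extra` parts receive one additional sample.
        size = base + (1 if i < extra else 0)
        parts.append(batch[start:start + size])
        start += size
    return parts


# A batch of four samples shared between two devices.
print(split_batch([0, 1, 2, 3], 2))  # [[0, 1], [2, 3]]
```

Each slice would then be processed by a different GPU, so a batch size that divides evenly by the number of GPUs keeps the work balanced.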

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# Mean and std for normalizing image values in the range (0, 1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
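
The metric update in `test` receives one list of labels and one list of outputs, each with one entry per device. As a rough sketch of the bookkeeping involved, the plain-Python function below counts correct predictions across all shards. The name `sharded_accuracy` is hypothetical and not an MXNet API, and the sketch assumes the predicted class is the index of the largest score in each output row.

```python
def sharded_accuracy(label_shards, prediction_shards):
    """Compute accuracy over per-device shards of labels and score rows.

    Illustrative only; assumes the predicted class is the arg-max of each row.
    """
    correct = 0
    total = 0
    for labels, predictions in zip(label_shards, prediction_shards):
        for label, scores in zip(labels, predictions):
            # Index of the largest score is taken as the predicted class.
            predicted = max(range(len(scores)), key=scores.__getitem__)
            correct += int(predicted == label)
            total += 1
    return correct / total if total else 0.0


# Two devices, two samples each, two classes.
label_shards = [[0, 1], [1, 0]]
prediction_shards = [
    [[2.0, -1.0], [0.1, 0.9]],   # device 0: both predictions correct
    [[0.3, 0.2], [1.5, -0.5]],   # device 1: first wrong, second correct
]
print(sharded_accuracy(label_shards, prediction_shards))  # 0.75
```
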

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.795947678867688 samples/sec                   batch loss = 13.542342185974121 | accuracy = 0.65


Epoch[1] Batch[10] Speed: 1.2851784833082154 samples/sec                   batch loss = 27.872493505477905 | accuracy = 0.575


Epoch[1] Batch[15] Speed: 1.281515953434601 samples/sec                   batch loss = 41.06647491455078 | accuracy = 0.5833333333333334


Epoch[1] Batch[20] Speed: 1.2860145552253623 samples/sec                   batch loss = 54.50868344306946 | accuracy = 0.5875


Epoch[1] Batch[25] Speed: 1.284254425920708 samples/sec                   batch loss = 69.29692316055298 | accuracy = 0.56


Epoch[1] Batch[30] Speed: 1.2795357153425468 samples/sec                   batch loss = 84.04071640968323 | accuracy = 0.5083333333333333


Epoch[1] Batch[35] Speed: 1.2902995976027516 samples/sec                   batch loss = 98.51258397102356 | accuracy = 0.5142857142857142


Epoch[1] Batch[40] Speed: 1.28552708383809 samples/sec                   batch loss = 112.6721544265747 | accuracy = 0.51875


Epoch[1] Batch[45] Speed: 1.2862240636797295 samples/sec                   batch loss = 127.90080547332764 | accuracy = 0.49444444444444446


Epoch[1] Batch[50] Speed: 1.2824815595484365 samples/sec                   batch loss = 142.63250374794006 | accuracy = 0.485


Epoch[1] Batch[55] Speed: 1.2846156072546424 samples/sec                   batch loss = 157.4992265701294 | accuracy = 0.4818181818181818


Epoch[1] Batch[60] Speed: 1.2878660126885833 samples/sec                   batch loss = 172.2491958141327 | accuracy = 0.4625


Epoch[1] Batch[65] Speed: 1.278243721680927 samples/sec                   batch loss = 186.37569046020508 | accuracy = 0.46153846153846156


Epoch[1] Batch[70] Speed: 1.2812582669381736 samples/sec                   batch loss = 200.61721086502075 | accuracy = 0.4607142857142857


Epoch[1] Batch[75] Speed: 1.2827296367667158 samples/sec                   batch loss = 214.52684020996094 | accuracy = 0.47


Epoch[1] Batch[80] Speed: 1.2818200647907885 samples/sec                   batch loss = 228.4606683254242 | accuracy = 0.471875


Epoch[1] Batch[85] Speed: 1.2891906174073149 samples/sec                   batch loss = 242.24371910095215 | accuracy = 0.47352941176470587


Epoch[1] Batch[90] Speed: 1.280507720842378 samples/sec                   batch loss = 256.1293408870697 | accuracy = 0.46944444444444444


Epoch[1] Batch[95] Speed: 1.2867152217670421 samples/sec                   batch loss = 269.3319306373596 | accuracy = 0.4789473684210526


Epoch[1] Batch[100] Speed: 1.2820200775651514 samples/sec                   batch loss = 283.0921061038971 | accuracy = 0.485


Epoch[1] Batch[105] Speed: 1.283341905664378 samples/sec                   batch loss = 297.17990589141846 | accuracy = 0.4857142857142857


Epoch[1] Batch[110] Speed: 1.277048034564877 samples/sec                   batch loss = 310.7965307235718 | accuracy = 0.48863636363636365


Epoch[1] Batch[115] Speed: 1.2856270705727912 samples/sec                   batch loss = 324.4803249835968 | accuracy = 0.4934782608695652


Epoch[1] Batch[120] Speed: 1.2815577528471649 samples/sec                   batch loss = 338.24477672576904 | accuracy = 0.4979166666666667


Epoch[1] Batch[125] Speed: 1.2798357632634119 samples/sec                   batch loss = 352.17310786247253 | accuracy = 0.5


Epoch[1] Batch[130] Speed: 1.2797566868847872 samples/sec                   batch loss = 366.42383646965027 | accuracy = 0.5019230769230769


Epoch[1] Batch[135] Speed: 1.28375424070744 samples/sec                   batch loss = 380.0402042865753 | accuracy = 0.5074074074074074


Epoch[1] Batch[140] Speed: 1.2855698348833737 samples/sec                   batch loss = 393.7982726097107 | accuracy = 0.5071428571428571


Epoch[1] Batch[145] Speed: 1.284664396563144 samples/sec                   batch loss = 407.48131799697876 | accuracy = 0.5103448275862069


Epoch[1] Batch[150] Speed: 1.285015178792659 samples/sec                   batch loss = 421.3677954673767 | accuracy = 0.515


Epoch[1] Batch[155] Speed: 1.2885079351620894 samples/sec                   batch loss = 435.1887466907501 | accuracy = 0.5161290322580645


Epoch[1] Batch[160] Speed: 1.2828146718710587 samples/sec                   batch loss = 449.2495732307434 | accuracy = 0.509375


Epoch[1] Batch[165] Speed: 1.2832576840346197 samples/sec                   batch loss = 462.97998309135437 | accuracy = 0.5106060606060606


Epoch[1] Batch[170] Speed: 1.2769379092844604 samples/sec                   batch loss = 476.7701139450073 | accuracy = 0.5073529411764706


Epoch[1] Batch[175] Speed: 1.2829469057075606 samples/sec                   batch loss = 490.8973548412323 | accuracy = 0.5057142857142857


Epoch[1] Batch[180] Speed: 1.2833323835505785 samples/sec                   batch loss = 504.30614280700684 | accuracy = 0.5083333333333333


Epoch[1] Batch[185] Speed: 1.283812395467283 samples/sec                   batch loss = 518.0039122104645 | accuracy = 0.5108108108108108


Epoch[1] Batch[190] Speed: 1.283682242198347 samples/sec                   batch loss = 532.0575523376465 | accuracy = 0.5078947368421053


Epoch[1] Batch[195] Speed: 1.2769783414261373 samples/sec                   batch loss = 545.7815339565277 | accuracy = 0.5102564102564102


Epoch[1] Batch[200] Speed: 1.279240392458 samples/sec                   batch loss = 559.6697990894318 | accuracy = 0.51125


Epoch[1] Batch[205] Speed: 1.2813401712346997 samples/sec                   batch loss = 573.3981850147247 | accuracy = 0.5121951219512195


Epoch[1] Batch[210] Speed: 1.2837985439400545 samples/sec                   batch loss = 587.2610490322113 | accuracy = 0.513095238095238


Epoch[1] Batch[215] Speed: 1.2789885927761746 samples/sec                   batch loss = 600.289687871933 | accuracy = 0.5174418604651163


Epoch[1] Batch[220] Speed: 1.2749236305728175 samples/sec                   batch loss = 614.0166265964508 | accuracy = 0.5181818181818182


Epoch[1] Batch[225] Speed: 1.277586298407938 samples/sec                   batch loss = 627.6416163444519 | accuracy = 0.5188888888888888


Epoch[1] Batch[230] Speed: 1.2803637752303283 samples/sec                   batch loss = 641.162558555603 | accuracy = 0.5217391304347826


Epoch[1] Batch[235] Speed: 1.277747525457518 samples/sec                   batch loss = 655.1166143417358 | accuracy = 0.5202127659574468


Epoch[1] Batch[240] Speed: 1.2868646460868642 samples/sec                   batch loss = 669.1621477603912 | accuracy = 0.5177083333333333


Epoch[1] Batch[245] Speed: 1.2820937512776258 samples/sec                   batch loss = 683.0448970794678 | accuracy = 0.5193877551020408


Epoch[1] Batch[250] Speed: 1.2788950952104063 samples/sec                   batch loss = 697.1794328689575 | accuracy = 0.52


Epoch[1] Batch[255] Speed: 1.2804500605338196 samples/sec                   batch loss = 710.318514585495 | accuracy = 0.5215686274509804


Epoch[1] Batch[260] Speed: 1.286651376646332 samples/sec                   batch loss = 724.2249567508698 | accuracy = 0.5192307692307693


Epoch[1] Batch[265] Speed: 1.2903291700622261 samples/sec                   batch loss = 738.4224121570587 | accuracy = 0.5169811320754717


Epoch[1] Batch[270] Speed: 1.2795749458836077 samples/sec                   batch loss = 751.27649974823 | accuracy = 0.5212962962962963


Epoch[1] Batch[275] Speed: 1.2851161687317345 samples/sec                   batch loss = 764.7953779697418 | accuracy = 0.5218181818181818


Epoch[1] Batch[280] Speed: 1.2887275621071197 samples/sec                   batch loss = 777.7780907154083 | accuracy = 0.5258928571428572


Epoch[1] Batch[285] Speed: 1.2839425751315225 samples/sec                   batch loss = 791.4636731147766 | accuracy = 0.5263157894736842


Epoch[1] Batch[290] Speed: 1.2837158339492825 samples/sec                   batch loss = 804.5220913887024 | accuracy = 0.5301724137931034


Epoch[1] Batch[295] Speed: 1.2842552123732114 samples/sec                   batch loss = 818.2813446521759 | accuracy = 0.5288135593220339


Epoch[1] Batch[300] Speed: 1.2810392219579376 samples/sec                   batch loss = 831.660728931427 | accuracy = 0.5283333333333333


Epoch[1] Batch[305] Speed: 1.2820607342184842 samples/sec                   batch loss = 845.3029055595398 | accuracy = 0.5278688524590164


Epoch[1] Batch[310] Speed: 1.2799677742488478 samples/sec                   batch loss = 858.6685876846313 | accuracy = 0.5298387096774193


Epoch[1] Batch[315] Speed: 1.2829058004305407 samples/sec                   batch loss = 872.3277733325958 | accuracy = 0.530952380952381


Epoch[1] Batch[320] Speed: 1.286392903208097 samples/sec                   batch loss = 885.761810541153 | accuracy = 0.53203125


Epoch[1] Batch[325] Speed: 1.2897801175001697 samples/sec                   batch loss = 899.8294603824615 | accuracy = 0.5338461538461539


Epoch[1] Batch[330] Speed: 1.2869267354839264 samples/sec                   batch loss = 912.6060962677002 | accuracy = 0.5386363636363637


Epoch[1] Batch[335] Speed: 1.287666247223057 samples/sec                   batch loss = 926.4720237255096 | accuracy = 0.5395522388059701


Epoch[1] Batch[340] Speed: 1.2808949612468794 samples/sec                   batch loss = 940.0598568916321 | accuracy = 0.5419117647058823


Epoch[1] Batch[345] Speed: 1.283976671828681 samples/sec                   batch loss = 954.0555250644684 | accuracy = 0.5413043478260869


Epoch[1] Batch[350] Speed: 1.286306012377728 samples/sec                   batch loss = 967.5111985206604 | accuracy = 0.5435714285714286


Epoch[1] Batch[355] Speed: 1.2842650431107765 samples/sec                   batch loss = 980.7161705493927 | accuracy = 0.5443661971830986


Epoch[1] Batch[360] Speed: 1.282847727774607 samples/sec                   batch loss = 994.2145822048187 | accuracy = 0.5472222222222223


Epoch[1] Batch[365] Speed: 1.2857200771222863 samples/sec                   batch loss = 1008.1237580776215 | accuracy = 0.5465753424657535


Epoch[1] Batch[370] Speed: 1.2803643615011902 samples/sec                   batch loss = 1022.1102499961853 | accuracy = 0.5459459459459459


Epoch[1] Batch[375] Speed: 1.2888205227794962 samples/sec                   batch loss = 1036.0262608528137 | accuracy = 0.5466666666666666


Epoch[1] Batch[380] Speed: 1.2842654363434096 samples/sec                   batch loss = 1049.5222308635712 | accuracy = 0.5473684210526316


Epoch[1] Batch[385] Speed: 1.2843002383851632 samples/sec                   batch loss = 1063.1896641254425 | accuracy = 0.5467532467532468


Epoch[1] Batch[390] Speed: 1.2929335774233908 samples/sec                   batch loss = 1076.563437461853 | accuracy = 0.5480769230769231


Epoch[1] Batch[395] Speed: 1.2817645385233394 samples/sec                   batch loss = 1089.304992198944 | accuracy = 0.5512658227848102


Epoch[1] Batch[400] Speed: 1.2872422105662373 samples/sec                   batch loss = 1102.3372659683228 | accuracy = 0.551875


Epoch[1] Batch[405] Speed: 1.2831521771967742 samples/sec                   batch loss = 1116.0395379066467 | accuracy = 0.5512345679012346


Epoch[1] Batch[410] Speed: 1.2828918703614578 samples/sec                   batch loss = 1129.6283650398254 | accuracy = 0.5530487804878049


Epoch[1] Batch[415] Speed: 1.2850927408016557 samples/sec                   batch loss = 1143.3432021141052 | accuracy = 0.5536144578313253


Epoch[1] Batch[420] Speed: 1.2827249292702179 samples/sec                   batch loss = 1157.2536182403564 | accuracy = 0.5529761904761905


Epoch[1] Batch[425] Speed: 1.288924686342906 samples/sec                   batch loss = 1171.0238211154938 | accuracy = 0.5535294117647059


Epoch[1] Batch[430] Speed: 1.2826119597189956 samples/sec                   batch loss = 1185.0213332176208 | accuracy = 0.5529069767441861


Epoch[1] Batch[435] Speed: 1.2777147318680433 samples/sec                   batch loss = 1197.3855090141296 | accuracy = 0.5563218390804597


Epoch[1] Batch[440] Speed: 1.282450777211578 samples/sec                   batch loss = 1210.6218287944794 | accuracy = 0.5579545454545455


Epoch[1] Batch[445] Speed: 1.2859337278590885 samples/sec                   batch loss = 1224.0993728637695 | accuracy = 0.5584269662921348


Epoch[1] Batch[450] Speed: 1.2858681862969619 samples/sec                   batch loss = 1238.7295382022858 | accuracy = 0.5555555555555556


Epoch[1] Batch[455] Speed: 1.2783937174886078 samples/sec                   batch loss = 1252.0141360759735 | accuracy = 0.5565934065934066


Epoch[1] Batch[460] Speed: 1.284805768770256 samples/sec                   batch loss = 1265.2935373783112 | accuracy = 0.5570652173913043


Epoch[1] Batch[465] Speed: 1.289792214435631 samples/sec                   batch loss = 1278.917492866516 | accuracy = 0.5591397849462365


Epoch[1] Batch[470] Speed: 1.2876653577579054 samples/sec                   batch loss = 1292.4330995082855 | accuracy = 0.5585106382978723


Epoch[1] Batch[475] Speed: 1.2875216758547632 samples/sec                   batch loss = 1304.9732129573822 | accuracy = 0.5605263157894737


Epoch[1] Batch[480] Speed: 1.2852976164574812 samples/sec                   batch loss = 1318.9043264389038 | accuracy = 0.5604166666666667


Epoch[1] Batch[485] Speed: 1.2798398637865747 samples/sec                   batch loss = 1332.3025031089783 | accuracy = 0.5608247422680412


Epoch[1] Batch[490] Speed: 1.2862002995243413 samples/sec                   batch loss = 1345.9039316177368 | accuracy = 0.5607142857142857


Epoch[1] Batch[495] Speed: 1.2786327120900989 samples/sec                   batch loss = 1359.7617630958557 | accuracy = 0.5606060606060606


Epoch[1] Batch[500] Speed: 1.2762609542356378 samples/sec                   batch loss = 1371.6281275749207 | accuracy = 0.5635


Epoch[1] Batch[505] Speed: 1.2784082319377728 samples/sec                   batch loss = 1386.0414414405823 | accuracy = 0.5623762376237624


Epoch[1] Batch[510] Speed: 1.2810590786877947 samples/sec                   batch loss = 1399.161556005478 | accuracy = 0.5627450980392157


Epoch[1] Batch[515] Speed: 1.2806089810011734 samples/sec                   batch loss = 1412.282728433609 | accuracy = 0.5635922330097087


Epoch[1] Batch[520] Speed: 1.2769566671471906 samples/sec                   batch loss = 1426.6810746192932 | accuracy = 0.5625


Epoch[1] Batch[525] Speed: 1.2849209948807545 samples/sec                   batch loss = 1438.7163679599762 | accuracy = 0.5638095238095238


Epoch[1] Batch[530] Speed: 1.2787970300765348 samples/sec                   batch loss = 1452.684916973114 | accuracy = 0.5641509433962264


Epoch[1] Batch[535] Speed: 1.2802679271640176 samples/sec                   batch loss = 1467.7469384670258 | accuracy = 0.5626168224299065


Epoch[1] Batch[540] Speed: 1.2761644575013118 samples/sec                   batch loss = 1480.9796032905579 | accuracy = 0.5634259259259259


Epoch[1] Batch[545] Speed: 1.274932931427243 samples/sec                   batch loss = 1493.6663687229156 | accuracy = 0.5651376146788991


Epoch[1] Batch[550] Speed: 1.278272159740134 samples/sec                   batch loss = 1505.4861643314362 | accuracy = 0.5672727272727273


Epoch[1] Batch[555] Speed: 1.2755832618138003 samples/sec                   batch loss = 1518.179181098938 | accuracy = 0.5684684684684684


Epoch[1] Batch[560] Speed: 1.2816824820546135 samples/sec                   batch loss = 1531.2882325649261 | accuracy = 0.5691964285714286


Epoch[1] Batch[565] Speed: 1.2774806520557411 samples/sec                   batch loss = 1544.5305874347687 | accuracy = 0.570353982300885


Epoch[1] Batch[570] Speed: 1.2804999999236764 samples/sec                   batch loss = 1557.017321586609 | accuracy = 0.5714912280701754


Epoch[1] Batch[575] Speed: 1.2760264362319 samples/sec                   batch loss = 1570.0166308879852 | accuracy = 0.5721739130434783


Epoch[1] Batch[580] Speed: 1.2825469524096886 samples/sec                   batch loss = 1583.1511466503143 | accuracy = 0.5728448275862069


Epoch[1] Batch[585] Speed: 1.2853884087759861 samples/sec                   batch loss = 1596.624575138092 | accuracy = 0.5735042735042735


Epoch[1] Batch[590] Speed: 1.2819226100668866 samples/sec                   batch loss = 1610.004857301712 | accuracy = 0.573728813559322


Epoch[1] Batch[595] Speed: 1.2793642807342207 samples/sec                   batch loss = 1623.7503571510315 | accuracy = 0.573109243697479


Epoch[1] Batch[600] Speed: 1.2772164181365593 samples/sec                   batch loss = 1636.0908470153809 | accuracy = 0.5741666666666667


Epoch[1] Batch[605] Speed: 1.28419780387168 samples/sec                   batch loss = 1649.8217108249664 | accuracy = 0.5739669421487603


Epoch[1] Batch[610] Speed: 1.278101550362853 samples/sec                   batch loss = 1663.4881093502045 | accuracy = 0.5737704918032787


Epoch[1] Batch[615] Speed: 1.277765236642091 samples/sec                   batch loss = 1676.6438982486725 | accuracy = 0.5739837398373984


Epoch[1] Batch[620] Speed: 1.2787672041022464 samples/sec                   batch loss = 1689.5315763950348 | accuracy = 0.5741935483870968


Epoch[1] Batch[625] Speed: 1.2794776546531639 samples/sec                   batch loss = 1702.320787191391 | accuracy = 0.574


Epoch[1] Batch[630] Speed: 1.2778972101571144 samples/sec                   batch loss = 1714.585744857788 | accuracy = 0.5746031746031746


Epoch[1] Batch[635] Speed: 1.285401309809906 samples/sec                   batch loss = 1727.7890827655792 | accuracy = 0.5744094488188977


Epoch[1] Batch[640] Speed: 1.2819148720764453 samples/sec                   batch loss = 1740.788206577301 | accuracy = 0.5734375


Epoch[1] Batch[645] Speed: 1.280038575381262 samples/sec                   batch loss = 1754.1090693473816 | accuracy = 0.5736434108527132


Epoch[1] Batch[650] Speed: 1.2839339284183717 samples/sec                   batch loss = 1768.2388091087341 | accuracy = 0.5734615384615385


Epoch[1] Batch[655] Speed: 1.2780603655402125 samples/sec                   batch loss = 1781.045095205307 | accuracy = 0.5740458015267176


Epoch[1] Batch[660] Speed: 1.2727231486903379 samples/sec                   batch loss = 1795.3694043159485 | accuracy = 0.5734848484848485


Epoch[1] Batch[665] Speed: 1.2785278669305638 samples/sec                   batch loss = 1809.4942755699158 | accuracy = 0.5718045112781955


Epoch[1] Batch[670] Speed: 1.282910214952177 samples/sec                   batch loss = 1823.1312448978424 | accuracy = 0.5712686567164179


Epoch[1] Batch[675] Speed: 1.280224746486028 samples/sec                   batch loss = 1836.094326019287 | accuracy = 0.5718518518518518


Epoch[1] Batch[680] Speed: 1.2790293499036032 samples/sec                   batch loss = 1849.520614862442 | accuracy = 0.5716911764705882


Epoch[1] Batch[685] Speed: 1.2829105092546995 samples/sec                   batch loss = 1861.7859320640564 | accuracy = 0.572992700729927


Epoch[1] Batch[690] Speed: 1.2756844235972042 samples/sec                   batch loss = 1875.274497270584 | accuracy = 0.5728260869565217


Epoch[1] Batch[695] Speed: 1.2764090284025311 samples/sec                   batch loss = 1888.299064397812 | accuracy = 0.573021582733813


Epoch[1] Batch[700] Speed: 1.2838363661922187 samples/sec                   batch loss = 1901.4161014556885 | accuracy = 0.5725


Epoch[1] Batch[705] Speed: 1.2868947522799765 samples/sec                   batch loss = 1914.0451555252075 | accuracy = 0.573404255319149


Epoch[1] Batch[710] Speed: 1.2854158853353612 samples/sec                   batch loss = 1927.8228657245636 | accuracy = 0.5732394366197183


Epoch[1] Batch[715] Speed: 1.2825307751440678 samples/sec                   batch loss = 1939.8714418411255 | accuracy = 0.5741258741258741


Epoch[1] Batch[720] Speed: 1.2828788234584338 samples/sec                   batch loss = 1954.077684879303 | accuracy = 0.5739583333333333


Epoch[1] Batch[725] Speed: 1.2783230007935604 samples/sec                   batch loss = 1966.9973328113556 | accuracy = 0.5741379310344827


Epoch[1] Batch[730] Speed: 1.280253272761251 samples/sec                   batch loss = 1979.5036953687668 | accuracy = 0.5739726027397261


Epoch[1] Batch[735] Speed: 1.28068288365737 samples/sec                   batch loss = 1993.8371304273605 | accuracy = 0.5724489795918367


Epoch[1] Batch[740] Speed: 1.2783260202170559 samples/sec                   batch loss = 2006.4048601388931 | accuracy = 0.5736486486486486


Epoch[1] Batch[745] Speed: 1.2773393311752512 samples/sec                   batch loss = 2019.4983328580856 | accuracy = 0.5741610738255034


Epoch[1] Batch[750] Speed: 1.2817289924862785 samples/sec                   batch loss = 2032.5129268169403 | accuracy = 0.5743333333333334


Epoch[1] Batch[755] Speed: 1.280720229248794 samples/sec                   batch loss = 2046.3172669410706 | accuracy = 0.573841059602649


Epoch[1] Batch[760] Speed: 1.280108700636582 samples/sec                   batch loss = 2058.6000096797943 | accuracy = 0.5743421052631579


Epoch[1] Batch[765] Speed: 1.280586694572518 samples/sec                   batch loss = 2071.5567860603333 | accuracy = 0.5751633986928104


Epoch[1] Batch[770] Speed: 1.2805055707037654 samples/sec                   batch loss = 2083.246687889099 | accuracy = 0.5762987012987013


Epoch[1] Batch[775] Speed: 1.2813497616698744 samples/sec                   batch loss = 2098.8450717926025 | accuracy = 0.5748387096774193


Epoch[1] Batch[780] Speed: 1.2839295068486964 samples/sec                   batch loss = 2113.193065404892 | accuracy = 0.5743589743589743


Epoch[1] Batch[785] Speed: 1.279797590606283 samples/sec                   batch loss = 2124.1610847711563 | accuracy = 0.5764331210191083


[Epoch 1] training: accuracy=0.5774111675126904
[Epoch 1] time cost: 632.5103185176849
[Epoch 1] validation: validation accuracy=0.6855555555555556


Epoch[2] Batch[5] Speed: 1.280920779081765 samples/sec                   batch loss = 13.171717405319214 | accuracy = 0.5


Epoch[2] Batch[10] Speed: 1.286831580252095 samples/sec                   batch loss = 25.022517800331116 | accuracy = 0.55


Epoch[2] Batch[15] Speed: 1.282382159486326 samples/sec                   batch loss = 37.97721874713898 | accuracy = 0.5333333333333333


Epoch[2] Batch[20] Speed: 1.2802057948005405 samples/sec                   batch loss = 51.056973814964294 | accuracy = 0.575


Epoch[2] Batch[25] Speed: 1.2745698144851318 samples/sec                   batch loss = 64.4782646894455 | accuracy = 0.6


Epoch[2] Batch[30] Speed: 1.2758890269937135 samples/sec                   batch loss = 75.6305627822876 | accuracy = 0.6333333333333333


Epoch[2] Batch[35] Speed: 1.2796609296868284 samples/sec                   batch loss = 87.11169481277466 | accuracy = 0.6428571428571429


Epoch[2] Batch[40] Speed: 1.2745276950763085 samples/sec                   batch loss = 98.29937386512756 | accuracy = 0.65625


Epoch[2] Batch[45] Speed: 1.2779242700417184 samples/sec                   batch loss = 108.81941568851471 | accuracy = 0.6833333333333333


Epoch[2] Batch[50] Speed: 1.2814605513898631 samples/sec                   batch loss = 119.6975736618042 | accuracy = 0.7


Epoch[2] Batch[55] Speed: 1.2760974814265822 samples/sec                   batch loss = 134.24386310577393 | accuracy = 0.6727272727272727


Epoch[2] Batch[60] Speed: 1.28623086768675 samples/sec                   batch loss = 145.34063482284546 | accuracy = 0.6791666666666667


Epoch[2] Batch[65] Speed: 1.2846600683295284 samples/sec                   batch loss = 155.95414769649506 | accuracy = 0.6923076923076923


Epoch[2] Batch[70] Speed: 1.2827840697045048 samples/sec                   batch loss = 167.4092630147934 | accuracy = 0.7035714285714286


Epoch[2] Batch[75] Speed: 1.2779612602271004 samples/sec                   batch loss = 178.24490928649902 | accuracy = 0.71


Epoch[2] Batch[80] Speed: 1.2772860400226873 samples/sec                   batch loss = 190.37709665298462 | accuracy = 0.715625


Epoch[2] Batch[85] Speed: 1.2760992285392263 samples/sec                   batch loss = 202.6653928756714 | accuracy = 0.711764705882353


Epoch[2] Batch[90] Speed: 1.2751703429158499 samples/sec                   batch loss = 215.0337084531784 | accuracy = 0.7055555555555556


Epoch[2] Batch[95] Speed: 1.2747171090636105 samples/sec                   batch loss = 228.76322960853577 | accuracy = 0.7078947368421052


Epoch[2] Batch[100] Speed: 1.2775404772641539 samples/sec                   batch loss = 240.66943740844727 | accuracy = 0.7025


Epoch[2] Batch[105] Speed: 1.2790189166116628 samples/sec                   batch loss = 252.75361561775208 | accuracy = 0.7023809523809523


Epoch[2] Batch[110] Speed: 1.273513603617561 samples/sec                   batch loss = 265.87684059143066 | accuracy = 0.6954545454545454


Epoch[2] Batch[115] Speed: 1.273447292128175 samples/sec                   batch loss = 278.66286635398865 | accuracy = 0.6956521739130435


Epoch[2] Batch[120] Speed: 1.2858740009925407 samples/sec                   batch loss = 291.00090432167053 | accuracy = 0.69375


Epoch[2] Batch[125] Speed: 1.2852573450128835 samples/sec                   batch loss = 303.13709449768066 | accuracy = 0.692


Epoch[2] Batch[130] Speed: 1.2861084065730586 samples/sec                   batch loss = 314.4847756624222 | accuracy = 0.6961538461538461


Epoch[2] Batch[135] Speed: 1.2875784925817262 samples/sec                   batch loss = 326.6378872394562 | accuracy = 0.6944444444444444


Epoch[2] Batch[140] Speed: 1.2804996089935603 samples/sec                   batch loss = 338.57199907302856 | accuracy = 0.6964285714285714


Epoch[2] Batch[145] Speed: 1.2911861689936759 samples/sec                   batch loss = 350.21933698654175 | accuracy = 0.7


Epoch[2] Batch[150] Speed: 1.2830522806673805 samples/sec                   batch loss = 360.34326171875 | accuracy = 0.7


Epoch[2] Batch[155] Speed: 1.2829413136654393 samples/sec                   batch loss = 372.4848698377609 | accuracy = 0.6983870967741935


Epoch[2] Batch[160] Speed: 1.2847526398658122 samples/sec                   batch loss = 383.6256836652756 | accuracy = 0.7


Epoch[2] Batch[165] Speed: 1.2747320244166127 samples/sec                   batch loss = 396.35436499118805 | accuracy = 0.696969696969697


Epoch[2] Batch[170] Speed: 1.273659010412258 samples/sec                   batch loss = 407.8842782974243 | accuracy = 0.6985294117647058


Epoch[2] Batch[175] Speed: 1.27894072109465 samples/sec                   batch loss = 418.6397044658661 | accuracy = 0.7014285714285714


Epoch[2] Batch[180] Speed: 1.273285988782599 samples/sec                   batch loss = 432.185697555542 | accuracy = 0.6958333333333333


Epoch[2] Batch[185] Speed: 1.2739829112166003 samples/sec                   batch loss = 446.7812759876251 | accuracy = 0.6905405405405406


Epoch[2] Batch[190] Speed: 1.2817497519158196 samples/sec                   batch loss = 457.9534822702408 | accuracy = 0.6907894736842105


Epoch[2] Batch[195] Speed: 1.2759064926278696 samples/sec                   batch loss = 473.2010592222214 | accuracy = 0.6871794871794872


Epoch[2] Batch[200] Speed: 1.2780088639061107 samples/sec                   batch loss = 490.1632763147354 | accuracy = 0.68375


Epoch[2] Batch[205] Speed: 1.2810156490039406 samples/sec                   batch loss = 503.605912566185 | accuracy = 0.6804878048780488


Epoch[2] Batch[210] Speed: 1.2826564783295238 samples/sec                   batch loss = 514.7444231510162 | accuracy = 0.6821428571428572


Epoch[2] Batch[215] Speed: 1.278211779068544 samples/sec                   batch loss = 529.8325283527374 | accuracy = 0.6755813953488372


Epoch[2] Batch[220] Speed: 1.279660051246762 samples/sec                   batch loss = 540.845908999443 | accuracy = 0.6795454545454546


Epoch[2] Batch[225] Speed: 1.272488577747429 samples/sec                   batch loss = 554.8361263275146 | accuracy = 0.6777777777777778


Epoch[2] Batch[230] Speed: 1.274307267172089 samples/sec                   batch loss = 564.2666432857513 | accuracy = 0.6804347826086956


Epoch[2] Batch[235] Speed: 1.2749665512724195 samples/sec                   batch loss = 578.1427662372589 | accuracy = 0.676595744680851


Epoch[2] Batch[240] Speed: 1.2746078695861296 samples/sec                   batch loss = 591.3635355234146 | accuracy = 0.678125


Epoch[2] Batch[245] Speed: 1.270197138841503 samples/sec                   batch loss = 604.5355995893478 | accuracy = 0.6785714285714286


Epoch[2] Batch[250] Speed: 1.2753138016792087 samples/sec                   batch loss = 620.4464517831802 | accuracy = 0.673


Epoch[2] Batch[255] Speed: 1.2757262314739972 samples/sec                   batch loss = 633.6770728826523 | accuracy = 0.6745098039215687


Epoch[2] Batch[260] Speed: 1.2792307360335222 samples/sec                   batch loss = 646.8883261680603 | accuracy = 0.6740384615384616


Epoch[2] Batch[265] Speed: 1.2786949842221116 samples/sec                   batch loss = 658.0044203996658 | accuracy = 0.6745283018867925


Epoch[2] Batch[270] Speed: 1.2769794105800154 samples/sec                   batch loss = 670.6670345067978 | accuracy = 0.6731481481481482


Epoch[2] Batch[275] Speed: 1.2791406164346066 samples/sec                   batch loss = 681.8275436162949 | accuracy = 0.6727272727272727


Epoch[2] Batch[280] Speed: 1.282993409840877 samples/sec                   batch loss = 692.784854054451 | accuracy = 0.6732142857142858


Epoch[2] Batch[285] Speed: 1.2771480677189586 samples/sec                   batch loss = 702.6885524988174 | accuracy = 0.6763157894736842


Epoch[2] Batch[290] Speed: 1.274958897030745 samples/sec                   batch loss = 714.6569241285324 | accuracy = 0.6758620689655173


Epoch[2] Batch[295] Speed: 1.2800527365476788 samples/sec                   batch loss = 730.2240210771561 | accuracy = 0.6711864406779661


Epoch[2] Batch[300] Speed: 1.279378134300718 samples/sec                   batch loss = 743.4497419595718 | accuracy = 0.6675


Epoch[2] Batch[305] Speed: 1.2798307840920387 samples/sec                   batch loss = 755.2442594766617 | accuracy = 0.6688524590163935


Epoch[2] Batch[310] Speed: 1.2773791079444923 samples/sec                   batch loss = 768.2419422864914 | accuracy = 0.6685483870967742


Epoch[2] Batch[315] Speed: 1.2796130076627827 samples/sec                   batch loss = 779.1522151231766 | accuracy = 0.6706349206349206


Epoch[2] Batch[320] Speed: 1.2824370530790783 samples/sec                   batch loss = 790.3363121747971 | accuracy = 0.66953125


Epoch[2] Batch[325] Speed: 1.2814712203158045 samples/sec                   batch loss = 800.3268601894379 | accuracy = 0.6715384615384615


Epoch[2] Batch[330] Speed: 1.2767838823958448 samples/sec                   batch loss = 810.8592059612274 | accuracy = 0.6727272727272727


Epoch[2] Batch[335] Speed: 1.2712797379547094 samples/sec                   batch loss = 824.6040043830872 | accuracy = 0.6716417910447762


Epoch[2] Batch[340] Speed: 1.2802355902285485 samples/sec                   batch loss = 835.9654077291489 | accuracy = 0.6735294117647059


Epoch[2] Batch[345] Speed: 1.281324905123836 samples/sec                   batch loss = 845.9795650243759 | accuracy = 0.6753623188405797


Epoch[2] Batch[350] Speed: 1.2735527557711457 samples/sec                   batch loss = 860.9047154188156 | accuracy = 0.6728571428571428


Epoch[2] Batch[355] Speed: 1.2729456365810898 samples/sec                   batch loss = 875.4633753299713 | accuracy = 0.6711267605633803


Epoch[2] Batch[360] Speed: 1.2702127179537188 samples/sec                   batch loss = 884.6398248672485 | accuracy = 0.6743055555555556


Epoch[2] Batch[365] Speed: 1.2670529981877683 samples/sec                   batch loss = 895.1586503982544 | accuracy = 0.6760273972602739


Epoch[2] Batch[370] Speed: 1.2724996768922434 samples/sec                   batch loss = 905.4967501163483 | accuracy = 0.6763513513513514


Epoch[2] Batch[375] Speed: 1.274612905036901 samples/sec                   batch loss = 917.4611448049545 | accuracy = 0.6773333333333333


Epoch[2] Batch[380] Speed: 1.27684529434113 samples/sec                   batch loss = 933.069503068924 | accuracy = 0.675


Epoch[2] Batch[385] Speed: 1.2672773368679973 samples/sec                   batch loss = 944.5153841972351 | accuracy = 0.674025974025974


Epoch[2] Batch[390] Speed: 1.2710166183832532 samples/sec                   batch loss = 957.5344016551971 | accuracy = 0.6724358974358975


Epoch[2] Batch[395] Speed: 1.271380603651347 samples/sec                   batch loss = 969.2618192434311 | accuracy = 0.6740506329113924


Epoch[2] Batch[400] Speed: 1.277760760142481 samples/sec                   batch loss = 981.4833467006683 | accuracy = 0.67375


Epoch[2] Batch[405] Speed: 1.2725891527223099 samples/sec                   batch loss = 991.3605136871338 | accuracy = 0.6753086419753086


Epoch[2] Batch[410] Speed: 1.2707661203889402 samples/sec                   batch loss = 1003.2012227773666 | accuracy = 0.675609756097561


Epoch[2] Batch[415] Speed: 1.2770178041102591 samples/sec                   batch loss = 1014.7749654054642 | accuracy = 0.677710843373494


Epoch[2] Batch[420] Speed: 1.2778105872978565 samples/sec                   batch loss = 1027.0647835731506 | accuracy = 0.6767857142857143


Epoch[2] Batch[425] Speed: 1.2802032549265672 samples/sec                   batch loss = 1039.3794585466385 | accuracy = 0.6770588235294117


Epoch[2] Batch[430] Speed: 1.2825913684554624 samples/sec                   batch loss = 1055.8060499429703 | accuracy = 0.675


Epoch[2] Batch[435] Speed: 1.2809516836162609 samples/sec                   batch loss = 1065.2200791835785 | accuracy = 0.6764367816091954


Epoch[2] Batch[440] Speed: 1.2824546004150859 samples/sec                   batch loss = 1079.4665648937225 | accuracy = 0.675


Epoch[2] Batch[445] Speed: 1.279225468954344 samples/sec                   batch loss = 1090.6028109788895 | accuracy = 0.6752808988764045


Epoch[2] Batch[450] Speed: 1.2789929803804378 samples/sec                   batch loss = 1102.3498138189316 | accuracy = 0.675


Epoch[2] Batch[455] Speed: 1.282392059590525 samples/sec                   batch loss = 1115.45188498497 | accuracy = 0.6741758241758242


Epoch[2] Batch[460] Speed: 1.2764037845337641 samples/sec                   batch loss = 1128.6386189460754 | accuracy = 0.6722826086956522


Epoch[2] Batch[465] Speed: 1.2715088523978892 samples/sec                   batch loss = 1140.9724180698395 | accuracy = 0.6725806451612903


Epoch[2] Batch[470] Speed: 1.2736620078430516 samples/sec                   batch loss = 1151.7864093780518 | accuracy = 0.6734042553191489


Epoch[2] Batch[475] Speed: 1.2712251210914043 samples/sec                   batch loss = 1163.2386177778244 | accuracy = 0.6736842105263158


Epoch[2] Batch[480] Speed: 1.2728668298907497 samples/sec                   batch loss = 1174.7947174310684 | accuracy = 0.6734375


Epoch[2] Batch[485] Speed: 1.275270954532999 samples/sec                   batch loss = 1186.232114493847 | accuracy = 0.6742268041237114


Epoch[2] Batch[490] Speed: 1.276134560065938 samples/sec                   batch loss = 1199.2271283268929 | accuracy = 0.6739795918367347


Epoch[2] Batch[495] Speed: 1.2764219440408364 samples/sec                   batch loss = 1209.7473248839378 | accuracy = 0.6752525252525252


Epoch[2] Batch[500] Speed: 1.276497015293648 samples/sec                   batch loss = 1219.8001942038536 | accuracy = 0.6765


Epoch[2] Batch[505] Speed: 1.2832867382662658 samples/sec                   batch loss = 1229.9751735329628 | accuracy = 0.6767326732673268


Epoch[2] Batch[510] Speed: 1.2865316964523639 samples/sec                   batch loss = 1239.6884271502495 | accuracy = 0.6779411764705883


Epoch[2] Batch[515] Speed: 1.281536705963717 samples/sec                   batch loss = 1248.333406507969 | accuracy = 0.6800970873786408


Epoch[2] Batch[520] Speed: 1.2864201268139195 samples/sec                   batch loss = 1261.815870821476 | accuracy = 0.6798076923076923


Epoch[2] Batch[525] Speed: 1.2797434108212555 samples/sec                   batch loss = 1275.0013536810875 | accuracy = 0.6785714285714286


Epoch[2] Batch[530] Speed: 1.2857251022236773 samples/sec                   batch loss = 1285.852501809597 | accuracy = 0.6787735849056604


Epoch[2] Batch[535] Speed: 1.2816253033288523 samples/sec                   batch loss = 1300.7529396414757 | accuracy = 0.6785046728971963


Epoch[2] Batch[540] Speed: 1.2800472673704077 samples/sec                   batch loss = 1314.4525930285454 | accuracy = 0.6773148148148148


Epoch[2] Batch[545] Speed: 1.2822840488959941 samples/sec                   batch loss = 1328.8361207842827 | accuracy = 0.676605504587156


Epoch[2] Batch[550] Speed: 1.2790462190441179 samples/sec                   batch loss = 1342.9775809645653 | accuracy = 0.6754545454545454


Epoch[2] Batch[555] Speed: 1.2792291754129477 samples/sec                   batch loss = 1354.7802074551582 | accuracy = 0.6756756756756757


Epoch[2] Batch[560] Speed: 1.2772037781054295 samples/sec                   batch loss = 1366.8899312615395 | accuracy = 0.675


Epoch[2] Batch[565] Speed: 1.2715071178320343 samples/sec                   batch loss = 1377.926851093769 | accuracy = 0.6756637168141593


Epoch[2] Batch[570] Speed: 1.2710130556519579 samples/sec                   batch loss = 1388.5202433466911 | accuracy = 0.675


Epoch[2] Batch[575] Speed: 1.2823219779835098 samples/sec                   batch loss = 1399.7419261336327 | accuracy = 0.6747826086956522


Epoch[2] Batch[580] Speed: 1.2802954783494573 samples/sec                   batch loss = 1411.4883092045784 | accuracy = 0.6758620689655173


Epoch[2] Batch[585] Speed: 1.2881984664669601 samples/sec                   batch loss = 1426.9462417960167 | accuracy = 0.6756410256410257


Epoch[2] Batch[590] Speed: 1.2868879412474223 samples/sec                   batch loss = 1435.626420557499 | accuracy = 0.6766949152542373


Epoch[2] Batch[595] Speed: 1.2796839647679172 samples/sec                   batch loss = 1446.4842622876167 | accuracy = 0.6785714285714286


Epoch[2] Batch[600] Speed: 1.2824023519401198 samples/sec                   batch loss = 1457.6653724312782 | accuracy = 0.67875


Epoch[2] Batch[605] Speed: 1.2845097786708737 samples/sec                   batch loss = 1471.828231394291 | accuracy = 0.6772727272727272


Epoch[2] Batch[610] Speed: 1.2880420078855377 samples/sec                   batch loss = 1484.8228099942207 | accuracy = 0.6770491803278689


Epoch[2] Batch[615] Speed: 1.2884756754099922 samples/sec                   batch loss = 1497.4633323550224 | accuracy = 0.6764227642276422


Epoch[2] Batch[620] Speed: 1.2838470747151576 samples/sec                   batch loss = 1510.9887728095055 | accuracy = 0.6762096774193549


Epoch[2] Batch[625] Speed: 1.2815968137791311 samples/sec                   batch loss = 1518.3107758164406 | accuracy = 0.6784


Epoch[2] Batch[630] Speed: 1.2829425890391757 samples/sec                   batch loss = 1530.23973184824 | accuracy = 0.6793650793650794


Epoch[2] Batch[635] Speed: 1.2910811428318139 samples/sec                   batch loss = 1541.7871118187904 | accuracy = 0.6791338582677166


Epoch[2] Batch[640] Speed: 1.2816134570048552 samples/sec                   batch loss = 1554.6644276976585 | accuracy = 0.678515625


Epoch[2] Batch[645] Speed: 1.285234896501122 samples/sec                   batch loss = 1567.0332141518593 | accuracy = 0.6790697674418604


Epoch[2] Batch[650] Speed: 1.284100791238377 samples/sec                   batch loss = 1580.5471239686012 | accuracy = 0.6792307692307692


Epoch[2] Batch[655] Speed: 1.2878864770743883 samples/sec                   batch loss = 1594.1532617211342 | accuracy = 0.6786259541984733


Epoch[2] Batch[660] Speed: 1.2874864026011847 samples/sec                   batch loss = 1609.008455812931 | accuracy = 0.6776515151515151


Epoch[2] Batch[665] Speed: 1.291504531446695 samples/sec                   batch loss = 1623.3715750575066 | accuracy = 0.6755639097744361


Epoch[2] Batch[670] Speed: 1.2868498402791164 samples/sec                   batch loss = 1634.4330791831017 | accuracy = 0.6757462686567164


Epoch[2] Batch[675] Speed: 1.2835521153166871 samples/sec                   batch loss = 1645.8933572173119 | accuracy = 0.6755555555555556


Epoch[2] Batch[680] Speed: 1.2842979771782643 samples/sec                   batch loss = 1657.368384540081 | accuracy = 0.6761029411764706


Epoch[2] Batch[685] Speed: 1.2897898346926935 samples/sec                   batch loss = 1670.5148729681969 | accuracy = 0.6762773722627737


Epoch[2] Batch[690] Speed: 1.2906228842638425 samples/sec                   batch loss = 1681.0338866114616 | accuracy = 0.6771739130434783


Epoch[2] Batch[695] Speed: 1.2833541766379613 samples/sec                   batch loss = 1691.8515747189522 | accuracy = 0.6784172661870503


Epoch[2] Batch[700] Speed: 1.2884333245682802 samples/sec                   batch loss = 1707.7643442749977 | accuracy = 0.6771428571428572


Epoch[2] Batch[705] Speed: 1.2887070709832869 samples/sec                   batch loss = 1717.8716061711311 | accuracy = 0.6783687943262411


Epoch[2] Batch[710] Speed: 1.2804583672072554 samples/sec                   batch loss = 1729.1932333111763 | accuracy = 0.6792253521126761


Epoch[2] Batch[715] Speed: 1.2841218241413326 samples/sec                   batch loss = 1742.4594017863274 | accuracy = 0.6783216783216783


Epoch[2] Batch[720] Speed: 1.2852423792512444 samples/sec                   batch loss = 1755.4518937468529 | accuracy = 0.6784722222222223


Epoch[2] Batch[725] Speed: 1.2837155392771573 samples/sec                   batch loss = 1769.2481679320335 | accuracy = 0.6775862068965517


Epoch[2] Batch[730] Speed: 1.2859150995776163 samples/sec                   batch loss = 1777.9537344574928 | accuracy = 0.6794520547945205


Epoch[2] Batch[735] Speed: 1.2856077615778112 samples/sec                   batch loss = 1790.290187895298 | accuracy = 0.6795918367346939


Epoch[2] Batch[740] Speed: 1.2819032162924997 samples/sec                   batch loss = 1802.5011695027351 | accuracy = 0.6800675675675676


Epoch[2] Batch[745] Speed: 1.2882900647907034 samples/sec                   batch loss = 1813.9748136401176 | accuracy = 0.6795302013422819


Epoch[2] Batch[750] Speed: 1.2793573540634802 samples/sec                   batch loss = 1826.4277120232582 | accuracy = 0.6796666666666666


Epoch[2] Batch[755] Speed: 1.2829716290042832 samples/sec                   batch loss = 1837.0804451107979 | accuracy = 0.6798013245033112


Epoch[2] Batch[760] Speed: 1.28385807815417 samples/sec                   batch loss = 1846.4982112050056 | accuracy = 0.6802631578947368


Epoch[2] Batch[765] Speed: 1.2780962925781247 samples/sec                   batch loss = 1858.2292652726173 | accuracy = 0.6803921568627451


Epoch[2] Batch[770] Speed: 1.2819695297562181 samples/sec                   batch loss = 1870.4609666466713 | accuracy = 0.6795454545454546


Epoch[2] Batch[775] Speed: 1.2816518359729836 samples/sec                   batch loss = 1881.5777097344398 | accuracy = 0.6796774193548387


Epoch[2] Batch[780] Speed: 1.2790426111558495 samples/sec                   batch loss = 1892.033530652523 | accuracy = 0.6798076923076923


Epoch[2] Batch[785] Speed: 1.2799699225817753 samples/sec                   batch loss = 1903.2398366332054 | accuracy = 0.6799363057324841


[Epoch 2] training: accuracy=0.6802030456852792
[Epoch 2] time cost: 631.2220981121063
[Epoch 2] validation: validation accuracy=0.7277777777777777
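
The progress lines above are easier to compare once they are parsed into numbers. In this log, the `batch loss` value appears to be a running total within each epoch (it grows steadily and starts over at the beginning of epoch 2), so dividing it by the batch index gives a rough mean loss per batch. The snippet below is a minimal sketch under that assumption, not part of the training code itself; `LINE_RE` and `parse_progress_line` are hypothetical names, and the pattern assumes the exact line format printed above.

```python
import re

# Matches per-batch progress lines in the format printed above (assumed format).
LINE_RE = re.compile(
    r"Epoch\[(\d+)\] Batch\[(\d+)\] Speed: ([\d.]+) samples/sec\s+"
    r"batch loss = ([\d.]+) \| accuracy = ([\d.]+)"
)

def parse_progress_line(line):
    """Return the fields of a progress line as a dict, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    epoch, batch, speed, loss, accuracy = match.groups()
    return {
        "epoch": int(epoch),
        "batch": int(batch),
        "speed": float(speed),
        "running_loss": float(loss),
        "accuracy": float(accuracy),
        # Assumes the logged loss is a running sum over the epoch so far.
        "mean_loss_per_batch": float(loss) / int(batch),
    }

# Last progress line of epoch 2, copied from the log above.
sample = (
    "Epoch[2] Batch[785] Speed: 1.2799699225817753 samples/sec"
    "                   batch loss = 1903.2398366332054 | accuracy = 0.6799363057324841"
)
record = parse_progress_line(sample)
print(record["epoch"], record["batch"], round(record["mean_loss_per_batch"], 4))
```

Summary lines such as `[Epoch 2] time cost: ...` do not match the pattern and return `None`, so they can be filtered out when looping over a saved log.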


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you want to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).