<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and inference on GPUs are often faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find instructions for installing the GPU version of MXNet on your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Number of GPUs that MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `device` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the device, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), device=gpu)
x

[22:33:30] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], device=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `device` shown with the array tells you which GPU the ndarray is allocated on.

If there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to `npx.cpu()`, which is why the output shown below carries no `device=gpu(1)` suffix:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[22:33:30] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several functions, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
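
As a minimal sketch of the zero-based numbering described above, the hypothetical helper `select_gpu_ids` below (it is not part of MXNet) takes a GPU count, standing in for the value `npx.num_gpus()` would report, and returns the ids you could pass to `npx.gpu(i)`. Using a plain integer as input is an assumption made so that the example runs without a GPU.

```python
def select_gpu_ids(num_available, num_requested=None):
    """Return zero-based GPU ids, capped at the number of GPUs available.

    Hypothetical helper for illustration only; `num_available` stands in
    for the value returned by `npx.num_gpus()`.
    """
    if num_available < 0:
        raise ValueError("num_available must be non-negative")
    if num_requested is None:
        # Default to using every available GPU.
        num_requested = num_available
    if num_requested < 0:
        raise ValueError("num_requested must be non-negative")
    return list(range(min(num_available, num_requested)))


# Eight GPUs: the ids run from 0 to 7.
print(select_gpu_ids(8))

# Two GPUs requested but only one available: only id 0 is returned.
print(select_gpu_ids(1, 2))
```

An empty list would mean that no GPU is available, in which case you could fall back to `npx.cpu()` as the earlier cells do.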

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), device=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], device=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.
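
The same-device rule can be illustrated with a toy model. The `ToyArray` class below is a hypothetical stand-in written for this sketch, not an MXNet class, and the `ValueError` it raises is an assumption chosen for the example rather than the error MXNet itself reports. It shows the pattern the text describes: an operation on inputs from different devices fails, and an explicit copy onto a common device resolves it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyArray:
    """Toy stand-in for an array that remembers its device (not an MXNet class)."""
    values: tuple
    device: str

    def copyto(self, device):
        # In this model, an explicit copy is the only way data changes device.
        return ToyArray(self.values, device)

    def __add__(self, other):
        if self.device != other.device:
            raise ValueError(
                f"inputs are on different devices: {self.device} vs {other.device}"
            )
        summed = tuple(a + b for a, b in zip(self.values, other.values))
        # The result is allocated on the same device as the inputs.
        return ToyArray(summed, self.device)


a = ToyArray((1.0, 2.0), "gpu(0)")
b = ToyArray((3.0, 4.0), "gpu(1)")

caught = None
try:
    a + b  # inputs live on different devices
except ValueError as err:
    caught = str(err)
print("error:", caught)

# Copy b onto a's device first, then the addition succeeds.
result = a + b.copyto("gpu(0)")
print(result)
```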

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and parameters to the GPU. To demonstrate this, you can reuse the previously defined LeafNetwork in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_device(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', device=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), device=gpu)
net(x)

[22:33:30] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 7.500602 , -2.5558558]], device=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
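
To make the idea concrete, here is a minimal pure-Python sketch of data parallelism, assuming a one-parameter linear model with a squared-error loss. The helpers `split_batch` and `grad_sum` are hypothetical names introduced for this illustration and are not MXNet functions. The sketch checks that summing the per-shard gradients and dividing by the full batch size gives the same mean gradient as processing the whole batch in one place.

```python
def split_batch(batch, num_parts):
    """Split a list into `num_parts` equal, contiguous shards."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    size = len(batch) // num_parts
    return [batch[i * size:(i + 1) * size] for i in range(num_parts)]


def grad_sum(w, shard):
    """Sum of d/dw [0.5 * (w * x - y)^2] over the (x, y) pairs in a shard."""
    return sum((w * x - y) * x for x, y in shard)


w = 0.5
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 7.0)]

# In the real setting, each shard would be processed by a different GPU.
shards = split_batch(batch, num_parts=2)
per_device = [grad_sum(w, shard) for shard in shards]

# Aggregate the per-device gradients and normalize by the full batch size.
parallel_grad = sum(per_device) / len(batch)

# Reference: the same mean gradient computed on the whole batch at once.
single_grad = grad_sum(w, batch) / len(batch)

print(parallel_grad, single_grad)
```

In the training loop later in this step, `gluon.utils.split_and_load` plays the role of the splitting step and `trainer.step(batch_size)` applies the normalization by the batch size.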

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md), using the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
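
The helper above passes one list of labels and one list of outputs per device to the metric. As a rough sketch of how a metric can accumulate over such per-device shards, the hypothetical `RunningAccuracy` class below mimics that bookkeeping. It is a simplified stand-in and not `gluon.metric.Accuracy`: it assumes the predictions are already class indices, whereas the helper above passes the raw network outputs to the metric.

```python
class RunningAccuracy:
    """Hypothetical minimal accuracy tracker (not `gluon.metric.Accuracy`)."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, label_shards, prediction_shards):
        # Each argument holds one list per device; shards are paired in order.
        for labels, predictions in zip(label_shards, prediction_shards):
            self.correct += sum(int(l == p) for l, p in zip(labels, predictions))
            self.total += len(labels)

    def get(self):
        # Fraction of correct predictions seen so far.
        return self.correct / self.total if self.total else float("nan")


metric = RunningAccuracy()

# First batch of four samples, split across two devices.
metric.update([[0, 1], [1, 1]], [[0, 1], [0, 1]])

# Second batch of four samples, split the same way.
metric.update([[1, 0], [0, 0]], [[1, 0], [0, 1]])

print(metric.get())
```

Because the counts are accumulated across calls, the final value reflects all batches and all devices together, not only the last batch.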

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if the machine has fewer).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, device=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7708247971838449 samples/sec                   batch loss = 14.007229328155518 | accuracy = 0.5


Epoch[1] Batch[10] Speed: 1.2534943591238639 samples/sec                   batch loss = 27.021639585494995 | accuracy = 0.575


Epoch[1] Batch[15] Speed: 1.2588374095520045 samples/sec                   batch loss = 42.08128571510315 | accuracy = 0.5166666666666667


Epoch[1] Batch[20] Speed: 1.2535465263975851 samples/sec                   batch loss = 56.850080728530884 | accuracy = 0.4875


Epoch[1] Batch[25] Speed: 1.2548927621661399 samples/sec                   batch loss = 71.48263597488403 | accuracy = 0.49


Epoch[1] Batch[30] Speed: 1.2528565276330204 samples/sec                   batch loss = 86.15578889846802 | accuracy = 0.4666666666666667


Epoch[1] Batch[35] Speed: 1.2576535067893824 samples/sec                   batch loss = 100.45981740951538 | accuracy = 0.4642857142857143


Epoch[1] Batch[40] Speed: 1.2539219403469097 samples/sec                   batch loss = 115.13668036460876 | accuracy = 0.45


Epoch[1] Batch[45] Speed: 1.2554900134564766 samples/sec                   batch loss = 128.72842025756836 | accuracy = 0.46111111111111114


Epoch[1] Batch[50] Speed: 1.258543821208512 samples/sec                   batch loss = 143.1917018890381 | accuracy = 0.46


Epoch[1] Batch[55] Speed: 1.2576212651520535 samples/sec                   batch loss = 157.34535813331604 | accuracy = 0.4590909090909091


Epoch[1] Batch[60] Speed: 1.2609310537308145 samples/sec                   batch loss = 170.4355001449585 | accuracy = 0.4708333333333333


Epoch[1] Batch[65] Speed: 1.254685735071698 samples/sec                   batch loss = 184.1295862197876 | accuracy = 0.47692307692307695


Epoch[1] Batch[70] Speed: 1.253855966635619 samples/sec                   batch loss = 198.21501183509827 | accuracy = 0.48214285714285715


Epoch[1] Batch[75] Speed: 1.2556105656132082 samples/sec                   batch loss = 211.29746174812317 | accuracy = 0.49333333333333335


Epoch[1] Batch[80] Speed: 1.255666762291803 samples/sec                   batch loss = 225.8105492591858 | accuracy = 0.4875


Epoch[1] Batch[85] Speed: 1.25465392689957 samples/sec                   batch loss = 240.19067001342773 | accuracy = 0.48823529411764705


Epoch[1] Batch[90] Speed: 1.2547574266592993 samples/sec                   batch loss = 254.5199384689331 | accuracy = 0.4861111111111111


Epoch[1] Batch[95] Speed: 1.2577072465254382 samples/sec                   batch loss = 268.0873739719391 | accuracy = 0.4921052631578947


Epoch[1] Batch[100] Speed: 1.2549507720349973 samples/sec                   batch loss = 281.70852756500244 | accuracy = 0.49


Epoch[1] Batch[105] Speed: 1.2540708752834087 samples/sec                   batch loss = 295.2496178150177 | accuracy = 0.49523809523809526


Epoch[1] Batch[110] Speed: 1.249879945425269 samples/sec                   batch loss = 309.4216260910034 | accuracy = 0.4954545454545455


Epoch[1] Batch[115] Speed: 1.2456260172042222 samples/sec                   batch loss = 322.6258587837219 | accuracy = 0.5


Epoch[1] Batch[120] Speed: 1.251044866647284 samples/sec                   batch loss = 336.87170481681824 | accuracy = 0.4979166666666667


Epoch[1] Batch[125] Speed: 1.2517656517748736 samples/sec                   batch loss = 350.3957335948944 | accuracy = 0.506


Epoch[1] Batch[130] Speed: 1.2524197622232154 samples/sec                   batch loss = 364.25640058517456 | accuracy = 0.5057692307692307


Epoch[1] Batch[135] Speed: 1.2534112001685438 samples/sec                   batch loss = 377.6032373905182 | accuracy = 0.5074074074074074


Epoch[1] Batch[140] Speed: 1.2585020934883162 samples/sec                   batch loss = 391.666978597641 | accuracy = 0.5035714285714286


Epoch[1] Batch[145] Speed: 1.2565062700181258 samples/sec                   batch loss = 405.2932653427124 | accuracy = 0.5051724137931034


Epoch[1] Batch[150] Speed: 1.2504086861628636 samples/sec                   batch loss = 419.3421628475189 | accuracy = 0.5066666666666667


Epoch[1] Batch[155] Speed: 1.2499469913670804 samples/sec                   batch loss = 432.62311339378357 | accuracy = 0.5129032258064516


Epoch[1] Batch[160] Speed: 1.2536144346497349 samples/sec                   batch loss = 446.7067630290985 | accuracy = 0.5125


Epoch[1] Batch[165] Speed: 1.250166803502419 samples/sec                   batch loss = 460.72845244407654 | accuracy = 0.5121212121212121


Epoch[1] Batch[170] Speed: 1.2474150505450503 samples/sec                   batch loss = 474.5195822715759 | accuracy = 0.513235294117647


Epoch[1] Batch[175] Speed: 1.254569957321092 samples/sec                   batch loss = 488.05433797836304 | accuracy = 0.5157142857142857


Epoch[1] Batch[180] Speed: 1.2512267114259534 samples/sec                   batch loss = 501.7450876235962 | accuracy = 0.5152777777777777


Epoch[1] Batch[185] Speed: 1.2539050714078421 samples/sec                   batch loss = 515.763603925705 | accuracy = 0.5135135135135135


Epoch[1] Batch[190] Speed: 1.251753603865412 samples/sec                   batch loss = 529.6352717876434 | accuracy = 0.5144736842105263


Epoch[1] Batch[195] Speed: 1.247341320665935 samples/sec                   batch loss = 543.6908264160156 | accuracy = 0.5166666666666667


Epoch[1] Batch[200] Speed: 1.2484666724263422 samples/sec                   batch loss = 557.4336910247803 | accuracy = 0.51625


Epoch[1] Batch[205] Speed: 1.2531249106029345 samples/sec                   batch loss = 571.2540466785431 | accuracy = 0.5170731707317073


Epoch[1] Batch[210] Speed: 1.248961395924375 samples/sec                   batch loss = 584.2854526042938 | accuracy = 0.5190476190476191


Epoch[1] Batch[215] Speed: 1.2522499081555896 samples/sec                   batch loss = 597.8800103664398 | accuracy = 0.522093023255814


Epoch[1] Batch[220] Speed: 1.2547504823513769 samples/sec                   batch loss = 611.4712085723877 | accuracy = 0.5238636363636363


Epoch[1] Batch[225] Speed: 1.2551954487050863 samples/sec                   batch loss = 625.3295872211456 | accuracy = 0.5211111111111111


Epoch[1] Batch[230] Speed: 1.249061540783229 samples/sec                   batch loss = 638.9481885433197 | accuracy = 0.5217391304347826


Epoch[1] Batch[235] Speed: 1.2518909077342089 samples/sec                   batch loss = 652.3123366832733 | accuracy = 0.5234042553191489


Epoch[1] Batch[240] Speed: 1.2560870811942744 samples/sec                   batch loss = 666.462117433548 | accuracy = 0.521875


Epoch[1] Batch[245] Speed: 1.2525827420405615 samples/sec                   batch loss = 680.2535803318024 | accuracy = 0.5204081632653061


Epoch[1] Batch[250] Speed: 1.2549644774150444 samples/sec                   batch loss = 693.6180613040924 | accuracy = 0.525


Epoch[1] Batch[255] Speed: 1.2521753252762855 samples/sec                   batch loss = 707.1605257987976 | accuracy = 0.5264705882352941


Epoch[1] Batch[260] Speed: 1.2496900209771218 samples/sec                   batch loss = 720.3583989143372 | accuracy = 0.5269230769230769


Epoch[1] Batch[265] Speed: 1.2591370882959172 samples/sec                   batch loss = 734.2998061180115 | accuracy = 0.5254716981132076


Epoch[1] Batch[270] Speed: 1.257836334935379 samples/sec                   batch loss = 747.9213199615479 | accuracy = 0.525


Epoch[1] Batch[275] Speed: 1.2610690513796528 samples/sec                   batch loss = 761.2537078857422 | accuracy = 0.5245454545454545


Epoch[1] Batch[280] Speed: 1.2556611236020312 samples/sec                   batch loss = 775.2212460041046 | accuracy = 0.5241071428571429


Epoch[1] Batch[285] Speed: 1.2534784381995718 samples/sec                   batch loss = 788.42231798172 | accuracy = 0.5236842105263158


Epoch[1] Batch[290] Speed: 1.256007715173429 samples/sec                   batch loss = 802.310106754303 | accuracy = 0.5206896551724138


Epoch[1] Batch[295] Speed: 1.2524401440679542 samples/sec                   batch loss = 815.435741186142 | accuracy = 0.5228813559322034


Epoch[1] Batch[300] Speed: 1.2519506022784654 samples/sec                   batch loss = 828.8989489078522 | accuracy = 0.5233333333333333


Epoch[1] Batch[305] Speed: 1.2541386528599816 samples/sec                   batch loss = 843.2875392436981 | accuracy = 0.5221311475409836


Epoch[1] Batch[310] Speed: 1.2513745400733793 samples/sec                   batch loss = 856.6108558177948 | accuracy = 0.5241935483870968


Epoch[1] Batch[315] Speed: 1.2509199664596415 samples/sec                   batch loss = 870.7326440811157 | accuracy = 0.5253968253968254


Epoch[1] Batch[320] Speed: 1.2525025091534456 samples/sec                   batch loss = 884.6304919719696 | accuracy = 0.52421875


Epoch[1] Batch[325] Speed: 1.254133965378215 samples/sec                   batch loss = 898.8497142791748 | accuracy = 0.5238461538461539


Epoch[1] Batch[330] Speed: 1.2538402239350586 samples/sec                   batch loss = 912.5559847354889 | accuracy = 0.5234848484848484


Epoch[1] Batch[335] Speed: 1.254379824191701 samples/sec                   batch loss = 926.1834452152252 | accuracy = 0.5238805970149254


Epoch[1] Batch[340] Speed: 1.258808035109015 samples/sec                   batch loss = 938.8514838218689 | accuracy = 0.5264705882352941


Epoch[1] Batch[345] Speed: 1.2606608326506126 samples/sec                   batch loss = 952.2404327392578 | accuracy = 0.527536231884058


Epoch[1] Batch[350] Speed: 1.2563877098002671 samples/sec                   batch loss = 965.638035774231 | accuracy = 0.5271428571428571


Epoch[1] Batch[355] Speed: 1.259627822573228 samples/sec                   batch loss = 979.1824996471405 | accuracy = 0.528169014084507


Epoch[1] Batch[360] Speed: 1.2519272469275122 samples/sec                   batch loss = 993.1082231998444 | accuracy = 0.5284722222222222


Epoch[1] Batch[365] Speed: 1.2506973720336576 samples/sec                   batch loss = 1005.9632971286774 | accuracy = 0.5308219178082192


Epoch[1] Batch[370] Speed: 1.2593198754402535 samples/sec                   batch loss = 1019.561005115509 | accuracy = 0.5304054054054054


Epoch[1] Batch[375] Speed: 1.250292205556511 samples/sec                   batch loss = 1033.3161008358002 | accuracy = 0.5306666666666666


Epoch[1] Batch[380] Speed: 1.2521372895595693 samples/sec                   batch loss = 1047.0965962409973 | accuracy = 0.5296052631578947


Epoch[1] Batch[385] Speed: 1.2593296117145796 samples/sec                   batch loss = 1060.8548364639282 | accuracy = 0.5298701298701298


Epoch[1] Batch[390] Speed: 1.253671202227332 samples/sec                   batch loss = 1074.3189511299133 | accuracy = 0.5314102564102564


Epoch[1] Batch[395] Speed: 1.249923431348409 samples/sec                   batch loss = 1087.8925755023956 | accuracy = 0.5316455696202531


Epoch[1] Batch[400] Speed: 1.2520968265795456 samples/sec                   batch loss = 1101.4378514289856 | accuracy = 0.53125


Epoch[1] Batch[405] Speed: 1.25113386974967 samples/sec                   batch loss = 1114.9149343967438 | accuracy = 0.5314814814814814


Epoch[1] Batch[410] Speed: 1.2549776198414517 samples/sec                   batch loss = 1127.539482831955 | accuracy = 0.5329268292682927


Epoch[1] Batch[415] Speed: 1.2546317841094932 samples/sec                   batch loss = 1141.3513326644897 | accuracy = 0.5325301204819277


Epoch[1] Batch[420] Speed: 1.2530911224600154 samples/sec                   batch loss = 1155.0574407577515 | accuracy = 0.5321428571428571


Epoch[1] Batch[425] Speed: 1.2508807012774745 samples/sec                   batch loss = 1168.167673587799 | accuracy = 0.5335294117647059


Epoch[1] Batch[430] Speed: 1.255612726930911 samples/sec                   batch loss = 1182.0185766220093 | accuracy = 0.5337209302325582


Epoch[1] Batch[435] Speed: 1.2559518640580056 samples/sec                   batch loss = 1195.966097831726 | accuracy = 0.532183908045977


Epoch[1] Batch[440] Speed: 1.2542363479547882 samples/sec                   batch loss = 1208.5943446159363 | accuracy = 0.5340909090909091


Epoch[1] Batch[445] Speed: 1.2507080942889215 samples/sec                   batch loss = 1221.873690366745 | accuracy = 0.5353932584269663


Epoch[1] Batch[450] Speed: 1.2554785514012956 samples/sec                   batch loss = 1234.660914182663 | accuracy = 0.5377777777777778


Epoch[1] Batch[455] Speed: 1.2571380272875692 samples/sec                   batch loss = 1247.5689182281494 | accuracy = 0.5395604395604395


Epoch[1] Batch[460] Speed: 1.25480960526922 samples/sec                   batch loss = 1259.3152282238007 | accuracy = 0.5434782608695652


Epoch[1] Batch[465] Speed: 1.2529578595749333 samples/sec                   batch loss = 1271.9762370586395 | accuracy = 0.5440860215053763


Epoch[1] Batch[470] Speed: 1.2548081036658443 samples/sec                   batch loss = 1284.395753145218 | accuracy = 0.5452127659574468


Epoch[1] Batch[475] Speed: 1.2530951469908664 samples/sec                   batch loss = 1296.455769777298 | accuracy = 0.5457894736842105


Epoch[1] Batch[480] Speed: 1.2532661666415423 samples/sec                   batch loss = 1310.5695481300354 | accuracy = 0.5463541666666667


Epoch[1] Batch[485] Speed: 1.2543194288864168 samples/sec                   batch loss = 1324.8971889019012 | accuracy = 0.545360824742268


Epoch[1] Batch[490] Speed: 1.2554765784456972 samples/sec                   batch loss = 1339.5896286964417 | accuracy = 0.5438775510204081


Epoch[1] Batch[495] Speed: 1.2512822363989788 samples/sec                   batch loss = 1353.1581168174744 | accuracy = 0.5444444444444444


Epoch[1] Batch[500] Speed: 1.254284075948364 samples/sec                   batch loss = 1365.540771484375 | accuracy = 0.546


Epoch[1] Batch[505] Speed: 1.254772160107674 samples/sec                   batch loss = 1378.5880477428436 | accuracy = 0.5475247524752476


Epoch[1] Batch[510] Speed: 1.2561111562827136 samples/sec                   batch loss = 1391.9247159957886 | accuracy = 0.5485294117647059


Epoch[1] Batch[515] Speed: 1.253218048019811 samples/sec                   batch loss = 1405.2724571228027 | accuracy = 0.5480582524271844


Epoch[1] Batch[520] Speed: 1.257217630434931 samples/sec                   batch loss = 1418.2852852344513 | accuracy = 0.5485576923076924


Epoch[1] Batch[525] Speed: 1.2583633356782178 samples/sec                   batch loss = 1431.6798927783966 | accuracy = 0.549047619047619


Epoch[1] Batch[530] Speed: 1.2536287665728008 samples/sec                   batch loss = 1443.7104120254517 | accuracy = 0.5504716981132075


Epoch[1] Batch[535] Speed: 1.2528379097881694 samples/sec                   batch loss = 1456.0039293766022 | accuracy = 0.5518691588785046


Epoch[1] Batch[540] Speed: 1.255794586178084 samples/sec                   batch loss = 1468.0113985538483 | accuracy = 0.5541666666666667


Epoch[1] Batch[545] Speed: 1.2517795678659183 samples/sec                   batch loss = 1481.4136159420013 | accuracy = 0.5545871559633028


Epoch[1] Batch[550] Speed: 1.2528004886583326 samples/sec                   batch loss = 1493.9152963161469 | accuracy = 0.5554545454545454


Epoch[1] Batch[555] Speed: 1.2588602677840517 samples/sec                   batch loss = 1506.5056583881378 | accuracy = 0.5572072072072072


Epoch[1] Batch[560] Speed: 1.2568842123905448 samples/sec                   batch loss = 1520.1978240013123 | accuracy = 0.5575892857142857


Epoch[1] Batch[565] Speed: 1.254630658225794 samples/sec                   batch loss = 1532.917381286621 | accuracy = 0.5588495575221238


Epoch[1] Batch[570] Speed: 1.2439138570531016 samples/sec                   batch loss = 1545.7109143733978 | accuracy = 0.5596491228070175


Epoch[1] Batch[575] Speed: 1.2512050626877211 samples/sec                   batch loss = 1558.951473236084 | accuracy = 0.558695652173913


Epoch[1] Batch[580] Speed: 1.2498602986022858 samples/sec                   batch loss = 1571.3216009140015 | accuracy = 0.5599137931034482


Epoch[1] Batch[585] Speed: 1.2557028511055874 samples/sec                   batch loss = 1584.2194175720215 | accuracy = 0.5598290598290598


Epoch[1] Batch[590] Speed: 1.2523058978840058 samples/sec                   batch loss = 1598.03058218956 | accuracy = 0.5601694915254237


Epoch[1] Batch[595] Speed: 1.2563921318828557 samples/sec                   batch loss = 1612.3004305362701 | accuracy = 0.5592436974789916


Epoch[1] Batch[600] Speed: 1.2586572175188544 samples/sec                   batch loss = 1624.0628700256348 | accuracy = 0.5608333333333333


Epoch[1] Batch[605] Speed: 1.2539073205734939 samples/sec                   batch loss = 1636.382645368576 | accuracy = 0.5615702479338843


Epoch[1] Batch[610] Speed: 1.2584881219288788 samples/sec                   batch loss = 1648.670901298523 | accuracy = 0.5627049180327869


Epoch[1] Batch[615] Speed: 1.252808814666359 samples/sec                   batch loss = 1662.047171831131 | accuracy = 0.5630081300813008


Epoch[1] Batch[620] Speed: 1.2482748553196865 samples/sec                   batch loss = 1674.0408959388733 | accuracy = 0.5637096774193548


Epoch[1] Batch[625] Speed: 1.2520280548583764 samples/sec                   batch loss = 1686.2741997241974 | accuracy = 0.5652


Epoch[1] Batch[630] Speed: 1.2551518768831338 samples/sec                   batch loss = 1698.1490675210953 | accuracy = 0.5658730158730159


Epoch[1] Batch[635] Speed: 1.2546750383382534 samples/sec                   batch loss = 1710.5819990634918 | accuracy = 0.5661417322834645


Epoch[1] Batch[640] Speed: 1.256634170766573 samples/sec                   batch loss = 1724.977531194687 | accuracy = 0.5671875


Epoch[1] Batch[645] Speed: 1.2523077674103957 samples/sec                   batch loss = 1737.5806283950806 | accuracy = 0.5678294573643411


Epoch[1] Batch[650] Speed: 1.2538238257089989 samples/sec                   batch loss = 1750.9977930784225 | accuracy = 0.568076923076923


Epoch[1] Batch[655] Speed: 1.2513506462083377 samples/sec                   batch loss = 1762.2606699466705 | accuracy = 0.5702290076335877


Epoch[1] Batch[660] Speed: 1.2533674712260356 samples/sec                   batch loss = 1775.2297666072845 | accuracy = 0.5712121212121212


Epoch[1] Batch[665] Speed: 1.2469867043807714 samples/sec                   batch loss = 1789.7105436325073 | accuracy = 0.5710526315789474


Epoch[1] Batch[670] Speed: 1.2534957639307185 samples/sec                   batch loss = 1802.1154236793518 | accuracy = 0.5720149253731344


Epoch[1] Batch[675] Speed: 1.2560667685562255 samples/sec                   batch loss = 1814.501380443573 | accuracy = 0.572962962962963


Epoch[1] Batch[680] Speed: 1.253259332448735 samples/sec                   batch loss = 1827.811170578003 | accuracy = 0.5724264705882353


Epoch[1] Batch[685] Speed: 1.2589330040454623 samples/sec                   batch loss = 1842.518758058548 | accuracy = 0.572992700729927


Epoch[1] Batch[690] Speed: 1.2538783631825337 samples/sec                   batch loss = 1856.3298199176788 | accuracy = 0.5731884057971014


Epoch[1] Batch[695] Speed: 1.2570311205789537 samples/sec                   batch loss = 1869.4939451217651 | accuracy = 0.573021582733813


Epoch[1] Batch[700] Speed: 1.2549120981931952 samples/sec                   batch loss = 1884.5667004585266 | accuracy = 0.5714285714285714


Epoch[1] Batch[705] Speed: 1.2592888716085318 samples/sec                   batch loss = 1897.7232632637024 | accuracy = 0.5719858156028369


Epoch[1] Batch[710] Speed: 1.2507422202554264 samples/sec                   batch loss = 1911.3825647830963 | accuracy = 0.5714788732394366


Epoch[1] Batch[715] Speed: 1.25711127535695 samples/sec                   batch loss = 1923.731171131134 | accuracy = 0.5720279720279721


Epoch[1] Batch[720] Speed: 1.2540140715056123 samples/sec                   batch loss = 1936.552857875824 | accuracy = 0.5732638888888889


Epoch[1] Batch[725] Speed: 1.2580590252068982 samples/sec                   batch loss = 1948.542323589325 | accuracy = 0.5744827586206896


Epoch[1] Batch[730] Speed: 1.2568166084936696 samples/sec                   batch loss = 1963.0090742111206 | accuracy = 0.5739726027397261


Epoch[1] Batch[735] Speed: 1.2521635498443044 samples/sec                   batch loss = 1977.3845295906067 | accuracy = 0.5744897959183674


Epoch[1] Batch[740] Speed: 1.259243597413814 samples/sec                   batch loss = 1991.7873344421387 | accuracy = 0.5739864864864865


Epoch[1] Batch[745] Speed: 1.260178474788305 samples/sec                   batch loss = 2003.8493008613586 | accuracy = 0.575503355704698


Epoch[1] Batch[750] Speed: 1.2565687583744583 samples/sec                   batch loss = 2017.331371307373 | accuracy = 0.576


Epoch[1] Batch[755] Speed: 1.259941124400931 samples/sec                   batch loss = 2030.6365337371826 | accuracy = 0.5758278145695365


Epoch[1] Batch[760] Speed: 1.2569830891237261 samples/sec                   batch loss = 2043.6184844970703 | accuracy = 0.5759868421052632


Epoch[1] Batch[765] Speed: 1.2610914219752105 samples/sec                   batch loss = 2057.4697300195694 | accuracy = 0.5758169934640522


Epoch[1] Batch[770] Speed: 1.260003197844992 samples/sec                   batch loss = 2069.715476632118 | accuracy = 0.5766233766233766


Epoch[1] Batch[775] Speed: 1.2609542723755436 samples/sec                   batch loss = 2081.2288551330566 | accuracy = 0.5783870967741935


Epoch[1] Batch[780] Speed: 1.260251458130549 samples/sec                   batch loss = 2094.9218316078186 | accuracy = 0.5785256410256411


Epoch[1] Batch[785] Speed: 1.2559277010575722 samples/sec                   batch loss = 2108.0472841262817 | accuracy = 0.5792993630573249


[Epoch 1] training: accuracy=0.5796319796954315
[Epoch 1] time cost: 646.2207770347595
[Epoch 1] validation: validation accuracy=0.6811111111111111


Epoch[2] Batch[5] Speed: 1.2507755088530164 samples/sec                   batch loss = 13.160124778747559 | accuracy = 0.55


Epoch[2] Batch[10] Speed: 1.2512577861397745 samples/sec                   batch loss = 26.376669764518738 | accuracy = 0.625


Epoch[2] Batch[15] Speed: 1.2503812879722873 samples/sec                   batch loss = 38.49019742012024 | accuracy = 0.6333333333333333


Epoch[2] Batch[20] Speed: 1.2562565668643066 samples/sec                   batch loss = 50.72647523880005 | accuracy = 0.675


Epoch[2] Batch[25] Speed: 1.2509611928902595 samples/sec                   batch loss = 61.694268107414246 | accuracy = 0.7


Epoch[2] Batch[30] Speed: 1.2542978605439397 samples/sec                   batch loss = 73.3012946844101 | accuracy = 0.7


Epoch[2] Batch[35] Speed: 1.253013912629526 samples/sec                   batch loss = 86.21855962276459 | accuracy = 0.6928571428571428


Epoch[2] Batch[40] Speed: 1.2503086047132086 samples/sec                   batch loss = 100.1827529668808 | accuracy = 0.66875


Epoch[2] Batch[45] Speed: 1.2538622450847992 samples/sec                   batch loss = 111.61031758785248 | accuracy = 0.6777777777777778


Epoch[2] Batch[50] Speed: 1.2520029214269142 samples/sec                   batch loss = 124.06376993656158 | accuracy = 0.68


Epoch[2] Batch[55] Speed: 1.2508886287284793 samples/sec                   batch loss = 136.96120512485504 | accuracy = 0.6818181818181818


Epoch[2] Batch[60] Speed: 1.2501682940180325 samples/sec                   batch loss = 150.01824247837067 | accuracy = 0.6666666666666666


Epoch[2] Batch[65] Speed: 1.250553618333445 samples/sec                   batch loss = 165.86087834835052 | accuracy = 0.6461538461538462


Epoch[2] Batch[70] Speed: 1.257743829885234 samples/sec                   batch loss = 177.40999567508698 | accuracy = 0.6535714285714286


Epoch[2] Batch[75] Speed: 1.2505118595103755 samples/sec                   batch loss = 190.01752698421478 | accuracy = 0.6566666666666666


Epoch[2] Batch[80] Speed: 1.2541996870708532 samples/sec                   batch loss = 204.17372906208038 | accuracy = 0.640625


Epoch[2] Batch[85] Speed: 1.2549054337670607 samples/sec                   batch loss = 217.17198932170868 | accuracy = 0.638235294117647


Epoch[2] Batch[90] Speed: 1.24993134667063 samples/sec                   batch loss = 230.33619105815887 | accuracy = 0.6361111111111111


Epoch[2] Batch[95] Speed: 1.2559704805445737 samples/sec                   batch loss = 242.26612436771393 | accuracy = 0.6342105263157894


Epoch[2] Batch[100] Speed: 1.2542967352594743 samples/sec                   batch loss = 255.7056452035904 | accuracy = 0.63


Epoch[2] Batch[105] Speed: 1.251834955106659 samples/sec                   batch loss = 269.147541642189 | accuracy = 0.6333333333333333


Epoch[2] Batch[110] Speed: 1.2458742873556417 samples/sec                   batch loss = 281.9100910425186 | accuracy = 0.6363636363636364


Epoch[2] Batch[115] Speed: 1.2557401638571242 samples/sec                   batch loss = 293.34091329574585 | accuracy = 0.6369565217391304


Epoch[2] Batch[120] Speed: 1.2548723004175124 samples/sec                   batch loss = 305.7912678718567 | accuracy = 0.6375


Epoch[2] Batch[125] Speed: 1.2508808878045792 samples/sec                   batch loss = 317.357679605484 | accuracy = 0.64


Epoch[2] Batch[130] Speed: 1.2511535566324454 samples/sec                   batch loss = 330.33301758766174 | accuracy = 0.6403846153846153


Epoch[2] Batch[135] Speed: 1.252944478702421 samples/sec                   batch loss = 345.39578795433044 | accuracy = 0.6314814814814815


Epoch[2] Batch[140] Speed: 1.2529188406262497 samples/sec                   batch loss = 358.0516892671585 | accuracy = 0.6303571428571428


Epoch[2] Batch[145] Speed: 1.2524500547459274 samples/sec                   batch loss = 372.37373530864716 | accuracy = 0.6275862068965518


Epoch[2] Batch[150] Speed: 1.2501858078427228 samples/sec                   batch loss = 385.8650658130646 | accuracy = 0.6266666666666667


Epoch[2] Batch[155] Speed: 1.250728979905248 samples/sec                   batch loss = 398.81597566604614 | accuracy = 0.6274193548387097


Epoch[2] Batch[160] Speed: 1.2536450660208203 samples/sec                   batch loss = 411.17901968955994 | accuracy = 0.6296875


Epoch[2] Batch[165] Speed: 1.2524759541449053 samples/sec                   batch loss = 424.47402691841125 | accuracy = 0.6303030303030303


Epoch[2] Batch[170] Speed: 1.2540710627631426 samples/sec                   batch loss = 435.713011264801 | accuracy = 0.6367647058823529


Epoch[2] Batch[175] Speed: 1.2546025117900235 samples/sec                   batch loss = 448.1417586803436 | accuracy = 0.6371428571428571


Epoch[2] Batch[180] Speed: 1.2501276787386846 samples/sec                   batch loss = 461.2769367694855 | accuracy = 0.6375


Epoch[2] Batch[185] Speed: 1.2498325520087017 samples/sec                   batch loss = 472.78080010414124 | accuracy = 0.6378378378378379


Epoch[2] Batch[190] Speed: 1.2517126987404994 samples/sec                   batch loss = 484.9931466579437 | accuracy = 0.6421052631578947


Epoch[2] Batch[195] Speed: 1.2572945112086023 samples/sec                   batch loss = 495.66217386722565 | accuracy = 0.6448717948717949


Epoch[2] Batch[200] Speed: 1.2573456760516093 samples/sec                   batch loss = 508.24482572078705 | accuracy = 0.645


Epoch[2] Batch[205] Speed: 1.2529475665707108 samples/sec                   batch loss = 519.5119171142578 | accuracy = 0.6487804878048781


Epoch[2] Batch[210] Speed: 1.2581008178937323 samples/sec                   batch loss = 530.3911865949631 | accuracy = 0.6523809523809524


Epoch[2] Batch[215] Speed: 1.256097707975006 samples/sec                   batch loss = 543.1590082645416 | accuracy = 0.6534883720930232


Epoch[2] Batch[220] Speed: 1.2624131683566597 samples/sec                   batch loss = 558.326030254364 | accuracy = 0.65


Epoch[2] Batch[225] Speed: 1.2513851806038019 samples/sec                   batch loss = 573.2052040100098 | accuracy = 0.6444444444444445


Epoch[2] Batch[230] Speed: 1.2521696244442575 samples/sec                   batch loss = 585.1034197807312 | accuracy = 0.6456521739130435


Epoch[2] Batch[235] Speed: 1.2488339367016008 samples/sec                   batch loss = 596.4741053581238 | accuracy = 0.65


Epoch[2] Batch[240] Speed: 1.2527869240508172 samples/sec                   batch loss = 606.4568229913712 | accuracy = 0.6510416666666666


Epoch[2] Batch[245] Speed: 1.2546984024919245 samples/sec                   batch loss = 619.460421204567 | accuracy = 0.6520408163265307


Epoch[2] Batch[250] Speed: 1.246553743443269 samples/sec                   batch loss = 631.8351631164551 | accuracy = 0.651


Epoch[2] Batch[255] Speed: 1.252267106470966 samples/sec                   batch loss = 644.0804342031479 | accuracy = 0.6509803921568628


Epoch[2] Batch[260] Speed: 1.254532526460784 samples/sec                   batch loss = 656.4891020059586 | accuracy = 0.6528846153846154


Epoch[2] Batch[265] Speed: 1.2484409385726152 samples/sec                   batch loss = 666.95936191082 | accuracy = 0.6556603773584906


Epoch[2] Batch[270] Speed: 1.247540364962391 samples/sec                   batch loss = 682.668329834938 | accuracy = 0.6518518518518519


Epoch[2] Batch[275] Speed: 1.254202406090963 samples/sec                   batch loss = 694.9119470119476 | accuracy = 0.6536363636363637


Epoch[2] Batch[280] Speed: 1.2499987706554105 samples/sec                   batch loss = 706.4329996109009 | accuracy = 0.6535714285714286


Epoch[2] Batch[285] Speed: 1.2515437832897445 samples/sec                   batch loss = 716.584500670433 | accuracy = 0.6570175438596492


Epoch[2] Batch[290] Speed: 1.2482113318006107 samples/sec                   batch loss = 728.4460793733597 | accuracy = 0.6577586206896552


Epoch[2] Batch[295] Speed: 1.2485164709105063 samples/sec                   batch loss = 740.7882921695709 | accuracy = 0.6584745762711864


Epoch[2] Batch[300] Speed: 1.2523202933811952 samples/sec                   batch loss = 752.5133593082428 | accuracy = 0.6591666666666667


Epoch[2] Batch[305] Speed: 1.252410319483318 samples/sec                   batch loss = 764.8244831562042 | accuracy = 0.6598360655737705


Epoch[2] Batch[310] Speed: 1.25251709618658 samples/sec                   batch loss = 777.972885966301 | accuracy = 0.660483870967742


Epoch[2] Batch[315] Speed: 1.25068347999372 samples/sec                   batch loss = 791.6218379735947 | accuracy = 0.6587301587301587


Epoch[2] Batch[320] Speed: 1.255277623627018 samples/sec                   batch loss = 802.3679382801056 | accuracy = 0.65859375


Epoch[2] Batch[325] Speed: 1.247830976081031 samples/sec                   batch loss = 814.5603368282318 | accuracy = 0.6592307692307692


Epoch[2] Batch[330] Speed: 1.2533375088301 samples/sec                   batch loss = 828.5077575445175 | accuracy = 0.6575757575757576


Epoch[2] Batch[335] Speed: 1.252505501337671 samples/sec                   batch loss = 838.767332315445 | accuracy = 0.6582089552238806


Epoch[2] Batch[340] Speed: 1.256070436067272 samples/sec                   batch loss = 849.2583595514297 | accuracy = 0.6602941176470588


Epoch[2] Batch[345] Speed: 1.2529261389498632 samples/sec                   batch loss = 860.1749091148376 | accuracy = 0.6615942028985508


Epoch[2] Batch[350] Speed: 1.252667474639354 samples/sec                   batch loss = 872.4455280303955 | accuracy = 0.6614285714285715


Epoch[2] Batch[355] Speed: 1.2558275802444554 samples/sec                   batch loss = 883.7316876649857 | accuracy = 0.6626760563380282


Epoch[2] Batch[360] Speed: 1.2555435684547935 samples/sec                   batch loss = 895.7502974271774 | accuracy = 0.6618055555555555


Epoch[2] Batch[365] Speed: 1.2580927987656347 samples/sec                   batch loss = 911.5231913328171 | accuracy = 0.660958904109589


Epoch[2] Batch[370] Speed: 1.2538179224526178 samples/sec                   batch loss = 925.3189491033554 | accuracy = 0.6601351351351351


Epoch[2] Batch[375] Speed: 1.2548142978029329 samples/sec                   batch loss = 942.2397304773331 | accuracy = 0.6546666666666666


Epoch[2] Batch[380] Speed: 1.2587426795305807 samples/sec                   batch loss = 953.6675864458084 | accuracy = 0.655921052631579


Epoch[2] Batch[385] Speed: 1.249521650959033 samples/sec                   batch loss = 965.9371095895767 | accuracy = 0.6558441558441559


Epoch[2] Batch[390] Speed: 1.2490569841688886 samples/sec                   batch loss = 978.2671786546707 | accuracy = 0.6564102564102564


Epoch[2] Batch[395] Speed: 1.2466925954383394 samples/sec                   batch loss = 988.294956445694 | accuracy = 0.6575949367088607


Epoch[2] Batch[400] Speed: 1.2497043563967623 samples/sec                   batch loss = 999.8470017910004 | accuracy = 0.6575


Epoch[2] Batch[405] Speed: 1.255555125659042 samples/sec                   batch loss = 1012.09914290905 | accuracy = 0.6580246913580247


Epoch[2] Batch[410] Speed: 1.249126731906025 samples/sec                   batch loss = 1022.5268251895905 | accuracy = 0.6591463414634147


Epoch[2] Batch[415] Speed: 1.2476862103259374 samples/sec                   batch loss = 1034.0794578790665 | accuracy = 0.6596385542168675


Epoch[2] Batch[420] Speed: 1.2497317249293725 samples/sec                   batch loss = 1046.4910752773285 | accuracy = 0.6601190476190476


Epoch[2] Batch[425] Speed: 1.2478396074082503 samples/sec                   batch loss = 1056.4729055166245 | accuracy = 0.6623529411764706


Epoch[2] Batch[430] Speed: 1.248886181609995 samples/sec                   batch loss = 1068.602576494217 | accuracy = 0.6627906976744186


Epoch[2] Batch[435] Speed: 1.2482570235037143 samples/sec                   batch loss = 1081.6189856529236 | accuracy = 0.6626436781609195


Epoch[2] Batch[440] Speed: 1.2480355613770513 samples/sec                   batch loss = 1094.904993057251 | accuracy = 0.6625


Epoch[2] Batch[445] Speed: 1.2468607598417458 samples/sec                   batch loss = 1107.5277633666992 | accuracy = 0.6634831460674158


Epoch[2] Batch[450] Speed: 1.2488707493850078 samples/sec                   batch loss = 1121.5717334747314 | accuracy = 0.6627777777777778


Epoch[2] Batch[455] Speed: 1.2444602685750568 samples/sec                   batch loss = 1131.8650209903717 | accuracy = 0.6642857142857143


Epoch[2] Batch[460] Speed: 1.2558417748063733 samples/sec                   batch loss = 1143.860013961792 | accuracy = 0.6657608695652174


Epoch[2] Batch[465] Speed: 1.2507736438987342 samples/sec                   batch loss = 1154.0340348482132 | accuracy = 0.6672043010752688


Epoch[2] Batch[470] Speed: 1.253170026698146 samples/sec                   batch loss = 1164.9866186380386 | accuracy = 0.6670212765957447


Epoch[2] Batch[475] Speed: 1.2488265000593626 samples/sec                   batch loss = 1178.7488120794296 | accuracy = 0.6652631578947369


Epoch[2] Batch[480] Speed: 1.2516109144855705 samples/sec                   batch loss = 1189.5034905672073 | accuracy = 0.6671875


Epoch[2] Batch[485] Speed: 1.248971251623262 samples/sec                   batch loss = 1203.2347549200058 | accuracy = 0.6659793814432989


Epoch[2] Batch[490] Speed: 1.2540871862299605 samples/sec                   batch loss = 1212.7371529340744 | accuracy = 0.6673469387755102


Epoch[2] Batch[495] Speed: 1.2496660983143915 samples/sec                   batch loss = 1225.1655279397964 | accuracy = 0.6671717171717172


Epoch[2] Batch[500] Speed: 1.24981216184103 samples/sec                   batch loss = 1238.8820320367813 | accuracy = 0.666


Epoch[2] Batch[505] Speed: 1.2590184090330574 samples/sec                   batch loss = 1250.0729439258575 | accuracy = 0.6653465346534654


Epoch[2] Batch[510] Speed: 1.2520678593086991 samples/sec                   batch loss = 1261.7786655426025 | accuracy = 0.6656862745098039


Epoch[2] Batch[515] Speed: 1.2540891548211697 samples/sec                   batch loss = 1276.057474732399 | accuracy = 0.6655339805825242


Epoch[2] Batch[520] Speed: 1.2503861338233553 samples/sec                   batch loss = 1286.7674797773361 | accuracy = 0.6658653846153846


Epoch[2] Batch[525] Speed: 1.2460061401607778 samples/sec                   batch loss = 1302.821604847908 | accuracy = 0.6628571428571428


Epoch[2] Batch[530] Speed: 1.254189748683854 samples/sec                   batch loss = 1315.7310317754745 | accuracy = 0.6622641509433962


Epoch[2] Batch[535] Speed: 1.2527131190236116 samples/sec                   batch loss = 1324.412835240364 | accuracy = 0.664018691588785


Epoch[2] Batch[540] Speed: 1.2531870630811004 samples/sec                   batch loss = 1335.4211072921753 | accuracy = 0.6652777777777777


Epoch[2] Batch[545] Speed: 1.2511445994904804 samples/sec                   batch loss = 1349.1480400562286 | accuracy = 0.6651376146788991


Epoch[2] Batch[550] Speed: 1.2573881752472418 samples/sec                   batch loss = 1360.847175002098 | accuracy = 0.665


Epoch[2] Batch[555] Speed: 1.2563963658208872 samples/sec                   batch loss = 1372.4936608076096 | accuracy = 0.6648648648648648


Epoch[2] Batch[560] Speed: 1.259216283242484 samples/sec                   batch loss = 1387.367621421814 | accuracy = 0.6638392857142857


Epoch[2] Batch[565] Speed: 1.2565710171061326 samples/sec                   batch loss = 1399.2958344221115 | accuracy = 0.6650442477876106


Epoch[2] Batch[570] Speed: 1.2543997071485058 samples/sec                   batch loss = 1412.1910382509232 | accuracy = 0.6644736842105263


Epoch[2] Batch[575] Speed: 1.2523170216482271 samples/sec                   batch loss = 1423.813557624817 | accuracy = 0.6643478260869565


Epoch[2] Batch[580] Speed: 1.2522004656301633 samples/sec                   batch loss = 1437.8547562360764 | accuracy = 0.6629310344827586


Epoch[2] Batch[585] Speed: 1.2571626136926228 samples/sec                   batch loss = 1449.3882330656052 | accuracy = 0.6628205128205128


Epoch[2] Batch[590] Speed: 1.2588747199045438 samples/sec                   batch loss = 1460.4513518810272 | accuracy = 0.6635593220338983


Epoch[2] Batch[595] Speed: 1.259301632147094 samples/sec                   batch loss = 1470.4236649274826 | accuracy = 0.6647058823529411


Epoch[2] Batch[600] Speed: 1.2611627098449782 samples/sec                   batch loss = 1481.476460814476 | accuracy = 0.665


Epoch[2] Batch[605] Speed: 1.2567604971335038 samples/sec                   batch loss = 1494.9753690958023 | accuracy = 0.6644628099173554


Epoch[2] Batch[610] Speed: 1.2535263895027016 samples/sec                   batch loss = 1508.085016131401 | accuracy = 0.6647540983606557


Epoch[2] Batch[615] Speed: 1.2521848579310564 samples/sec                   batch loss = 1518.0896579027176 | accuracy = 0.6654471544715447


Epoch[2] Batch[620] Speed: 1.2534124175060992 samples/sec                   batch loss = 1529.6621096134186 | accuracy = 0.6653225806451613


Epoch[2] Batch[625] Speed: 1.2511943319098326 samples/sec                   batch loss = 1540.651870727539 | accuracy = 0.666


Epoch[2] Batch[630] Speed: 1.252987990977587 samples/sec                   batch loss = 1551.8313952684402 | accuracy = 0.6654761904761904


Epoch[2] Batch[635] Speed: 1.2566345472604827 samples/sec                   batch loss = 1563.2613071203232 | accuracy = 0.6665354330708662


Epoch[2] Batch[640] Speed: 1.2489764584703738 samples/sec                   batch loss = 1575.357927441597 | accuracy = 0.665234375


Epoch[2] Batch[645] Speed: 1.249120035785261 samples/sec                   batch loss = 1586.6620906591415 | accuracy = 0.665891472868217


Epoch[2] Batch[650] Speed: 1.2483763763531375 samples/sec                   batch loss = 1599.5556108951569 | accuracy = 0.6657692307692308


Epoch[2] Batch[655] Speed: 1.255087463740057 samples/sec                   batch loss = 1610.9622699022293 | accuracy = 0.666412213740458


Epoch[2] Batch[660] Speed: 1.2548660118490078 samples/sec                   batch loss = 1622.4168199896812 | accuracy = 0.6670454545454545


Epoch[2] Batch[665] Speed: 1.247761094570699 samples/sec                   batch loss = 1630.7864146232605 | accuracy = 0.6687969924812031


Epoch[2] Batch[670] Speed: 1.2474142158183945 samples/sec                   batch loss = 1644.6185557842255 | accuracy = 0.6667910447761194


Epoch[2] Batch[675] Speed: 1.2503952664909264 samples/sec                   batch loss = 1655.7155793905258 | accuracy = 0.667037037037037


Epoch[2] Batch[680] Speed: 1.2503873452919911 samples/sec                   batch loss = 1666.941347360611 | accuracy = 0.6669117647058823


Epoch[2] Batch[685] Speed: 1.2525645063908935 samples/sec                   batch loss = 1676.1778860092163 | accuracy = 0.6675182481751825


Epoch[2] Batch[690] Speed: 1.2520742133049663 samples/sec                   batch loss = 1687.6447744369507 | accuracy = 0.6670289855072464


Epoch[2] Batch[695] Speed: 1.2608305126524821 samples/sec                   batch loss = 1698.3429173231125 | accuracy = 0.6672661870503597


Epoch[2] Batch[700] Speed: 1.2546649985877067 samples/sec                   batch loss = 1710.5261226892471 | accuracy = 0.6675


Epoch[2] Batch[705] Speed: 1.254229034359243 samples/sec                   batch loss = 1722.1251204013824 | accuracy = 0.6680851063829787


Epoch[2] Batch[710] Speed: 1.2500932137137863 samples/sec                   batch loss = 1736.179675102234 | accuracy = 0.6683098591549296


Epoch[2] Batch[715] Speed: 1.2538066783085269 samples/sec                   batch loss = 1746.951907992363 | accuracy = 0.6685314685314685


Epoch[2] Batch[720] Speed: 1.2514518280419764 samples/sec                   batch loss = 1758.949630498886 | accuracy = 0.6680555555555555


Epoch[2] Batch[725] Speed: 1.2570165223954648 samples/sec                   batch loss = 1769.745511174202 | accuracy = 0.6679310344827586


Epoch[2] Batch[730] Speed: 1.2582230986747842 samples/sec                   batch loss = 1780.9408921003342 | accuracy = 0.6684931506849315


Epoch[2] Batch[735] Speed: 1.2490464761850024 samples/sec                   batch loss = 1794.6623755693436 | accuracy = 0.667687074829932


Epoch[2] Batch[740] Speed: 1.251640887762076 samples/sec                   batch loss = 1806.427010178566 | accuracy = 0.6675675675675675


Epoch[2] Batch[745] Speed: 1.249720367773734 samples/sec                   batch loss = 1817.3896989822388 | accuracy = 0.6681208053691275


Epoch[2] Batch[750] Speed: 1.250491726788554 samples/sec                   batch loss = 1828.6662640571594 | accuracy = 0.6686666666666666


Epoch[2] Batch[755] Speed: 1.253454182980587 samples/sec                   batch loss = 1842.1138402223587 | accuracy = 0.669205298013245


Epoch[2] Batch[760] Speed: 1.2507930396951137 samples/sec                   batch loss = 1852.054855465889 | accuracy = 0.6703947368421053


Epoch[2] Batch[765] Speed: 1.2449785201858012 samples/sec                   batch loss = 1862.938053369522 | accuracy = 0.6712418300653594


Epoch[2] Batch[770] Speed: 1.2533931276660502 samples/sec                   batch loss = 1874.4883933067322 | accuracy = 0.6711038961038961


Epoch[2] Batch[775] Speed: 1.2501269335288565 samples/sec                   batch loss = 1885.8340368270874 | accuracy = 0.6716129032258065


Epoch[2] Batch[780] Speed: 1.244261190771627 samples/sec                   batch loss = 1895.761410832405 | accuracy = 0.6727564102564103


Epoch[2] Batch[785] Speed: 1.2519815261038183 samples/sec                   batch loss = 1907.2781574726105 | accuracy = 0.6722929936305733


[Epoch 2] training: accuracy=0.6719543147208121
[Epoch 2] time cost: 645.1444997787476
[Epoch 2] validation: validation accuracy=0.7522222222222222


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).