<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure your machine has at least one NVIDIA GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[15:35:30] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If you're using GPUs, the array's printed output shows which GPU it is allocated on, such as `ctx=gpu(0)` in the output above.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you only have one GPU, the code falls back to `npx.cpu()` instead, which is why the output below shows no GPU context.

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[15:35:30] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet addresses each of them through a zero-based index passed to `npx.gpu`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
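
The index arithmetic behind choosing devices can be sketched without MXNet. The helper below, `select_gpu_ids`, is a hypothetical name introduced here for illustration only; it returns the zero-based IDs to use, given how many GPUs you would like and how many the machine has.

```python
def select_gpu_ids(requested, available):
    """Return the zero-based GPU IDs to use.

    `requested` is how many GPUs you would like to use; `available` is how
    many the machine has (for example, the value of npx.num_gpus()).
    Both names are assumptions made for this sketch.
    """
    if requested < 0 or available < 0:
        raise ValueError("requested and available must be non-negative")
    # IDs start at 0, so the first k GPUs are 0, 1, ..., k - 1.
    return list(range(min(requested, available)))


# A machine with eight GPUs: the last valid ID is 7.
print(select_gpu_ids(8, 8))  # [0, 1, 2, 3, 4, 5, 6, 7]
# Asking for two GPUs when only one exists yields just the first GPU.
print(select_gpu_ids(2, 1))  # [0]
```

With MXNet installed, each ID `i` would map to a device through `npx.gpu(i)`, for instance `[npx.gpu(i) for i in select_gpu_ids(2, npx.num_gpus())]`.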

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.
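
One way to surface such a mismatch early is to compare the inputs' devices before operating on them. The following is a minimal sketch, not part of MXNet: `check_same_device` is a hypothetical helper, and it only assumes that each input exposes a `context` attribute, as described earlier for MXNet arrays. Stand-in objects are used so the check can be followed without a GPU.

```python
from types import SimpleNamespace


def check_same_device(*arrays):
    """Return the shared device name, or raise ValueError on a mismatch."""
    if not arrays:
        raise ValueError("expected at least one array")
    # Compare the string form of each input's `context` attribute.
    devices = [str(a.context) for a in arrays]
    if any(d != devices[0] for d in devices):
        raise ValueError(f"inputs are on different devices: {devices}")
    return devices[0]


# Stand-ins that mimic only the `context` attribute (an assumption of this
# sketch); real MXNet arrays are not needed to follow the logic.
a = SimpleNamespace(context="gpu(0)")
b = SimpleNamespace(context="gpu(0)")
c = SimpleNamespace(context="gpu(1)")

print(check_same_device(a, b))
try:
    check_same_device(a, c)
except ValueError as err:
    print(err)
```

If the check fails, you could copy one input to the other's device with `copyto`, as shown earlier, before running the operation.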

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and the parameters to that GPU. To demonstrate this, you can reuse the `LeafNetwork` defined in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[15:35:31] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 7.286863, -3.681819]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. With *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
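
To make the idea concrete, here is a small pure-Python sketch of data parallelism, independent of MXNet. It assumes a toy one-parameter model with squared-error loss; `split_batch` and `summed_gradient` are hypothetical helpers written for this illustration. The point is that summing the per-part gradients and dividing by the full batch size gives the same averaged gradient as processing the whole batch in one place.

```python
def split_batch(batch, num_parts):
    """Split a list into `num_parts` contiguous chunks of near-equal size."""
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    size, extra = divmod(len(batch), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        # The first `extra` chunks take one additional sample each.
        end = start + size + (1 if i < extra else 0)
        parts.append(batch[start:end])
        start = end
    return parts


def summed_gradient(w, samples):
    """Sum of d/dw (w * x - t)**2 over (x, t) pairs."""
    return sum(2.0 * (w * x - t) * x for x, t in samples)


w = 0.5
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 9.0)]

# Each "device" handles one part of the batch independently.
parts = split_batch(batch, 2)
per_device = [summed_gradient(w, part) for part in parts]

# Aggregate across devices, then normalize by the full batch size.
parallel_grad = sum(per_device) / len(batch)
single_grad = summed_gradient(w, batch) / len(batch)
print(parallel_grad, single_grad)
```

In the training loop later in this step, `gluon.utils.split_and_load` plays the role of the splitting, and the `batch_size` argument to `trainer.step` plays the role of the normalization.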

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
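
To show what the helper accumulates, the sketch below mimics the update-then-get pattern with plain Python lists standing in for per-device arrays. `RunningAccuracy` is a hypothetical class written for illustration; it is not the implementation of `gluon.metric.Accuracy`.

```python
def argmax(values):
    """Index of the largest entry in a list of scores."""
    return max(range(len(values)), key=lambda i: values[i])


class RunningAccuracy:
    """Accumulate correct predictions over shards and batches."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, label_shards, output_shards):
        # One (labels, outputs) pair per device shard.
        for labels, outputs in zip(label_shards, output_shards):
            for label, scores in zip(labels, outputs):
                self.correct += int(argmax(scores) == label)
                self.total += 1

    def get(self):
        return self.correct / self.total if self.total else float("nan")


# Two shards of two samples each, with two class scores per sample.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],  # both predictions match their labels
    [[1.5, 0.2], [0.8, 0.1]],   # the first prediction is wrong
]

acc = RunningAccuracy()
acc.update(label_shards, output_shards)
print(acc.get())  # 0.75
```

Because the counts are kept across calls to `update`, the final value covers every shard of every batch, which is how the helper above reports one accuracy for the whole dataset.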

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training, limited by how many are available.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, float()
        # copies each summed loss value to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7945953999974046 samples/sec                   batch loss = 12.536965608596802 | accuracy = 0.55


Epoch[1] Batch[10] Speed: 1.2849607531762868 samples/sec                   batch loss = 26.341429948806763 | accuracy = 0.525


Epoch[1] Batch[15] Speed: 1.2838839173283376 samples/sec                   batch loss = 40.8190438747406 | accuracy = 0.5166666666666667


Epoch[1] Batch[20] Speed: 1.2847872715419646 samples/sec                   batch loss = 55.35334014892578 | accuracy = 0.525


Epoch[1] Batch[25] Speed: 1.2890010374415521 samples/sec                   batch loss = 69.11587953567505 | accuracy = 0.54


Epoch[1] Batch[30] Speed: 1.2883959235699312 samples/sec                   batch loss = 82.78548192977905 | accuracy = 0.5583333333333333


Epoch[1] Batch[35] Speed: 1.2814878602796735 samples/sec                   batch loss = 97.43494749069214 | accuracy = 0.5285714285714286


Epoch[1] Batch[40] Speed: 1.2880190664596076 samples/sec                   batch loss = 111.39654231071472 | accuracy = 0.51875


Epoch[1] Batch[45] Speed: 1.291615194933072 samples/sec                   batch loss = 125.70336413383484 | accuracy = 0.5166666666666667


Epoch[1] Batch[50] Speed: 1.2933338544529789 samples/sec                   batch loss = 139.04121160507202 | accuracy = 0.525


Epoch[1] Batch[55] Speed: 1.2906154380143318 samples/sec                   batch loss = 151.76326513290405 | accuracy = 0.5409090909090909


Epoch[1] Batch[60] Speed: 1.2875078429759474 samples/sec                   batch loss = 164.64332151412964 | accuracy = 0.55


Epoch[1] Batch[65] Speed: 1.2850081907973243 samples/sec                   batch loss = 177.78224778175354 | accuracy = 0.5576923076923077


Epoch[1] Batch[70] Speed: 1.2835440630482313 samples/sec                   batch loss = 190.36680936813354 | accuracy = 0.5642857142857143


Epoch[1] Batch[75] Speed: 1.2854669022776537 samples/sec                   batch loss = 203.78908896446228 | accuracy = 0.5733333333333334


Epoch[1] Batch[80] Speed: 1.2873607388190702 samples/sec                   batch loss = 218.5575737953186 | accuracy = 0.5625


Epoch[1] Batch[85] Speed: 1.2862512801397294 samples/sec                   batch loss = 232.71652102470398 | accuracy = 0.5647058823529412


Epoch[1] Batch[90] Speed: 1.2926408025749856 samples/sec                   batch loss = 248.22601556777954 | accuracy = 0.5638888888888889


Epoch[1] Batch[95] Speed: 1.2840556809337598 samples/sec                   batch loss = 261.9549169540405 | accuracy = 0.5657894736842105


Epoch[1] Batch[100] Speed: 1.2872738159661792 samples/sec                   batch loss = 276.69475173950195 | accuracy = 0.555


Epoch[1] Batch[105] Speed: 1.28945437882755 samples/sec                   batch loss = 290.8048560619354 | accuracy = 0.5547619047619048


Epoch[1] Batch[110] Speed: 1.2857326892158865 samples/sec                   batch loss = 304.57271814346313 | accuracy = 0.5522727272727272


Epoch[1] Batch[115] Speed: 1.2849060369510845 samples/sec                   batch loss = 318.38974690437317 | accuracy = 0.5543478260869565


Epoch[1] Batch[120] Speed: 1.2872632477378618 samples/sec                   batch loss = 332.3851499557495 | accuracy = 0.5520833333333334


Epoch[1] Batch[125] Speed: 1.288184124496398 samples/sec                   batch loss = 346.8480761051178 | accuracy = 0.546


Epoch[1] Batch[130] Speed: 1.287539461421434 samples/sec                   batch loss = 360.51228761672974 | accuracy = 0.551923076923077


Epoch[1] Batch[135] Speed: 1.2774998149678378 samples/sec                   batch loss = 374.332400560379 | accuracy = 0.5537037037037037


Epoch[1] Batch[140] Speed: 1.2826678536243825 samples/sec                   batch loss = 388.6712303161621 | accuracy = 0.55


Epoch[1] Batch[145] Speed: 1.2828308562994082 samples/sec                   batch loss = 402.5942289829254 | accuracy = 0.5517241379310345


Epoch[1] Batch[150] Speed: 1.2815830100198533 samples/sec                   batch loss = 416.27846789360046 | accuracy = 0.5533333333333333


Epoch[1] Batch[155] Speed: 1.2805039092379646 samples/sec                   batch loss = 430.02572560310364 | accuracy = 0.5516129032258065


Epoch[1] Batch[160] Speed: 1.2765906482207514 samples/sec                   batch loss = 443.9106934070587 | accuracy = 0.5515625


Epoch[1] Batch[165] Speed: 1.2796833791199886 samples/sec                   batch loss = 457.6037473678589 | accuracy = 0.55


Epoch[1] Batch[170] Speed: 1.282200945974998 samples/sec                   batch loss = 471.53317761421204 | accuracy = 0.5470588235294118


Epoch[1] Batch[175] Speed: 1.2826593221343232 samples/sec                   batch loss = 485.41896414756775 | accuracy = 0.5428571428571428


Epoch[1] Batch[180] Speed: 1.284359622494844 samples/sec                   batch loss = 498.7577154636383 | accuracy = 0.5472222222222223


Epoch[1] Batch[185] Speed: 1.2813566120686093 samples/sec                   batch loss = 512.7447879314423 | accuracy = 0.5472972972972973


Epoch[1] Batch[190] Speed: 1.2823496176465912 samples/sec                   batch loss = 526.3106479644775 | accuracy = 0.5473684210526316


Epoch[1] Batch[195] Speed: 1.2841070813801283 samples/sec                   batch loss = 539.8202164173126 | accuracy = 0.5487179487179488


Epoch[1] Batch[200] Speed: 1.2790571403434754 samples/sec                   batch loss = 553.7710537910461 | accuracy = 0.54625


Epoch[1] Batch[205] Speed: 1.2887586465051293 samples/sec                   batch loss = 567.0355083942413 | accuracy = 0.5475609756097561


Epoch[1] Batch[210] Speed: 1.2862902332366768 samples/sec                   batch loss = 580.9814057350159 | accuracy = 0.5464285714285714


Epoch[1] Batch[215] Speed: 1.284409965643605 samples/sec                   batch loss = 594.3674690723419 | accuracy = 0.5511627906976744


Epoch[1] Batch[220] Speed: 1.2887290469964865 samples/sec                   batch loss = 608.0555477142334 | accuracy = 0.553409090909091


Epoch[1] Batch[225] Speed: 1.2809626374801009 samples/sec                   batch loss = 620.8848581314087 | accuracy = 0.5577777777777778


Epoch[1] Batch[230] Speed: 1.287345230131578 samples/sec                   batch loss = 634.7085011005402 | accuracy = 0.5608695652173913


Epoch[1] Batch[235] Speed: 1.2884744879652794 samples/sec                   batch loss = 648.6344180107117 | accuracy = 0.5595744680851064


Epoch[1] Batch[240] Speed: 1.2830750455020412 samples/sec                   batch loss = 662.5503859519958 | accuracy = 0.559375


Epoch[1] Batch[245] Speed: 1.2838267384984126 samples/sec                   batch loss = 676.1475074291229 | accuracy = 0.5561224489795918


Epoch[1] Batch[250] Speed: 1.2817051983068906 samples/sec                   batch loss = 689.9416801929474 | accuracy = 0.554


Epoch[1] Batch[255] Speed: 1.2875569510411564 samples/sec                   batch loss = 703.8948204517365 | accuracy = 0.5519607843137255


Epoch[1] Batch[260] Speed: 1.289064818833673 samples/sec                   batch loss = 717.502213716507 | accuracy = 0.5538461538461539


Epoch[1] Batch[265] Speed: 1.2854145065552844 samples/sec                   batch loss = 730.5272815227509 | accuracy = 0.5566037735849056


Epoch[1] Batch[270] Speed: 1.2909276585057472 samples/sec                   batch loss = 744.84965467453 | accuracy = 0.5537037037037037


Epoch[1] Batch[275] Speed: 1.2877723000420938 samples/sec                   batch loss = 758.5542168617249 | accuracy = 0.5536363636363636


Epoch[1] Batch[280] Speed: 1.2868783664395156 samples/sec                   batch loss = 772.1576581001282 | accuracy = 0.5553571428571429


Epoch[1] Batch[285] Speed: 1.280978677404444 samples/sec                   batch loss = 786.0742719173431 | accuracy = 0.5543859649122806


Epoch[1] Batch[290] Speed: 1.2827237524014923 samples/sec                   batch loss = 799.7067537307739 | accuracy = 0.5551724137931034


Epoch[1] Batch[295] Speed: 1.2839924925251278 samples/sec                   batch loss = 813.7052764892578 | accuracy = 0.5542372881355933


Epoch[1] Batch[300] Speed: 1.2845043696856584 samples/sec                   batch loss = 826.5365283489227 | accuracy = 0.5566666666666666


Epoch[1] Batch[305] Speed: 1.2845900336943084 samples/sec                   batch loss = 840.4762406349182 | accuracy = 0.5549180327868852


Epoch[1] Batch[310] Speed: 1.2841092436305879 samples/sec                   batch loss = 853.7930977344513 | accuracy = 0.5580645161290323


Epoch[1] Batch[315] Speed: 1.2799959961062743 samples/sec                   batch loss = 867.4744603633881 | accuracy = 0.557936507936508


Epoch[1] Batch[320] Speed: 1.2847484094197028 samples/sec                   batch loss = 881.13915848732 | accuracy = 0.55703125


Epoch[1] Batch[325] Speed: 1.2848620508468593 samples/sec                   batch loss = 895.1962239742279 | accuracy = 0.5561538461538461


Epoch[1] Batch[330] Speed: 1.283353194951439 samples/sec                   batch loss = 908.467588186264 | accuracy = 0.5568181818181818


Epoch[1] Batch[335] Speed: 1.2889242902518414 samples/sec                   batch loss = 922.1838726997375 | accuracy = 0.5552238805970149


Epoch[1] Batch[340] Speed: 1.295781061620057 samples/sec                   batch loss = 935.5686428546906 | accuracy = 0.5558823529411765


Epoch[1] Batch[345] Speed: 1.2880598078307377 samples/sec                   batch loss = 949.5562891960144 | accuracy = 0.5557971014492754


Epoch[1] Batch[350] Speed: 1.2841318494123166 samples/sec                   batch loss = 963.4220781326294 | accuracy = 0.5542857142857143


Epoch[1] Batch[355] Speed: 1.2891630782976873 samples/sec                   batch loss = 976.7738077640533 | accuracy = 0.5549295774647888


Epoch[1] Batch[360] Speed: 1.2830008665134203 samples/sec                   batch loss = 990.7796387672424 | accuracy = 0.5513888888888889


Epoch[1] Batch[365] Speed: 1.277890104693933 samples/sec                   batch loss = 1004.6806044578552 | accuracy = 0.5506849315068493


Epoch[1] Batch[370] Speed: 1.2768335362171372 samples/sec                   batch loss = 1017.7572321891785 | accuracy = 0.5533783783783783


Epoch[1] Batch[375] Speed: 1.2760879694527778 samples/sec                   batch loss = 1032.1698377132416 | accuracy = 0.5506666666666666


Epoch[1] Batch[380] Speed: 1.2812952546786875 samples/sec                   batch loss = 1046.1882610321045 | accuracy = 0.5486842105263158


Epoch[1] Batch[385] Speed: 1.2816684806278194 samples/sec                   batch loss = 1059.9129748344421 | accuracy = 0.548051948051948


Epoch[1] Batch[390] Speed: 1.2812866435854242 samples/sec                   batch loss = 1073.8691234588623 | accuracy = 0.5480769230769231


Epoch[1] Batch[395] Speed: 1.2783798852330317 samples/sec                   batch loss = 1087.282991886139 | accuracy = 0.5481012658227848


Epoch[1] Batch[400] Speed: 1.2849540610137382 samples/sec                   batch loss = 1101.6495079994202 | accuracy = 0.545


Epoch[1] Batch[405] Speed: 1.2886695551071006 samples/sec                   batch loss = 1116.0823571681976 | accuracy = 0.5438271604938272


Epoch[1] Batch[410] Speed: 1.2817570962146596 samples/sec                   batch loss = 1129.8247320652008 | accuracy = 0.5445121951219513


Epoch[1] Batch[415] Speed: 1.2804984362046445 samples/sec                   batch loss = 1143.59787774086 | accuracy = 0.5445783132530121


Epoch[1] Batch[420] Speed: 1.2758636056474013 samples/sec                   batch loss = 1156.6495110988617 | accuracy = 0.5458333333333333


Epoch[1] Batch[425] Speed: 1.2831982054883495 samples/sec                   batch loss = 1170.2520668506622 | accuracy = 0.5476470588235294


Epoch[1] Batch[430] Speed: 1.2849385118463725 samples/sec                   batch loss = 1184.0235056877136 | accuracy = 0.5465116279069767


Epoch[1] Batch[435] Speed: 1.2860608876871737 samples/sec                   batch loss = 1197.5474722385406 | accuracy = 0.5471264367816092


Epoch[1] Batch[440] Speed: 1.2859039622914687 samples/sec                   batch loss = 1211.1630256175995 | accuracy = 0.5482954545454546


Epoch[1] Batch[445] Speed: 1.2864564267474594 samples/sec                   batch loss = 1224.7662694454193 | accuracy = 0.547752808988764


Epoch[1] Batch[450] Speed: 1.2791639253731788 samples/sec                   batch loss = 1237.9264249801636 | accuracy = 0.5494444444444444


Epoch[1] Batch[455] Speed: 1.2819102685061998 samples/sec                   batch loss = 1251.0017783641815 | accuracy = 0.5505494505494506


Epoch[1] Batch[460] Speed: 1.280955693491108 samples/sec                   batch loss = 1265.220428943634 | accuracy = 0.5505434782608696


Epoch[1] Batch[465] Speed: 1.2779841368081277 samples/sec                   batch loss = 1278.769428730011 | accuracy = 0.5516129032258065


Epoch[1] Batch[470] Speed: 1.2819311317593833 samples/sec                   batch loss = 1292.6548812389374 | accuracy = 0.551063829787234


Epoch[1] Batch[475] Speed: 1.277154678819564 samples/sec                   batch loss = 1305.3624868392944 | accuracy = 0.5531578947368421


Epoch[1] Batch[480] Speed: 1.275845850117077 samples/sec                   batch loss = 1319.2522401809692 | accuracy = 0.5520833333333334


Epoch[1] Batch[485] Speed: 1.2849070210147027 samples/sec                   batch loss = 1333.0675785541534 | accuracy = 0.5515463917525774


Epoch[1] Batch[490] Speed: 1.2829011897403788 samples/sec                   batch loss = 1345.87704372406 | accuracy = 0.5530612244897959


Epoch[1] Batch[495] Speed: 1.2796017840822393 samples/sec                   batch loss = 1359.5058317184448 | accuracy = 0.5525252525252525


Epoch[1] Batch[500] Speed: 1.282708159094744 samples/sec                   batch loss = 1372.4217052459717 | accuracy = 0.554


Epoch[1] Batch[505] Speed: 1.2841696913064813 samples/sec                   batch loss = 1386.3935387134552 | accuracy = 0.5524752475247525


Epoch[1] Batch[510] Speed: 1.2792621443320351 samples/sec                   batch loss = 1399.585482120514 | accuracy = 0.5529411764705883


Epoch[1] Batch[515] Speed: 1.2859167751152363 samples/sec                   batch loss = 1412.6282172203064 | accuracy = 0.5533980582524272


Epoch[1] Batch[520] Speed: 1.2863856043212554 samples/sec                   batch loss = 1426.1772158145905 | accuracy = 0.5524038461538462


Epoch[1] Batch[525] Speed: 1.2797645940829212 samples/sec                   batch loss = 1440.2768473625183 | accuracy = 0.5514285714285714


Epoch[1] Batch[530] Speed: 1.2839409047346788 samples/sec                   batch loss = 1453.5642352104187 | accuracy = 0.5523584905660377


Epoch[1] Batch[535] Speed: 1.28222005477318 samples/sec                   batch loss = 1467.0453307628632 | accuracy = 0.5523364485981308


Epoch[1] Batch[540] Speed: 1.2853393675142224 samples/sec                   batch loss = 1481.0511348247528 | accuracy = 0.5523148148148148


Epoch[1] Batch[545] Speed: 1.290987855775036 samples/sec                   batch loss = 1494.3431248664856 | accuracy = 0.5532110091743119


Epoch[1] Batch[550] Speed: 1.2822087854130433 samples/sec                   batch loss = 1508.063781261444 | accuracy = 0.5522727272727272


Epoch[1] Batch[555] Speed: 1.2848687420510445 samples/sec                   batch loss = 1521.6956117153168 | accuracy = 0.5527027027027027


Epoch[1] Batch[560] Speed: 1.2839407082176888 samples/sec                   batch loss = 1533.3654282093048 | accuracy = 0.5549107142857143


Epoch[1] Batch[565] Speed: 1.2846080334402665 samples/sec                   batch loss = 1547.077969789505 | accuracy = 0.5553097345132744


Epoch[1] Batch[570] Speed: 1.2850492340547985 samples/sec                   batch loss = 1560.7458369731903 | accuracy = 0.5561403508771929


Epoch[1] Batch[575] Speed: 1.28713723251796 samples/sec                   batch loss = 1573.5124716758728 | accuracy = 0.5573913043478261


Epoch[1] Batch[580] Speed: 1.289621291906099 samples/sec                   batch loss = 1587.5070786476135 | accuracy = 0.5577586206896552


Epoch[1] Batch[585] Speed: 1.2873236963953252 samples/sec                   batch loss = 1601.1747374534607 | accuracy = 0.5572649572649573


Epoch[1] Batch[590] Speed: 1.286679005905441 samples/sec                   batch loss = 1614.8742544651031 | accuracy = 0.5563559322033899


Epoch[1] Batch[595] Speed: 1.2853560096112777 samples/sec                   batch loss = 1627.2577695846558 | accuracy = 0.5579831932773109


Epoch[1] Batch[600] Speed: 1.2871650801527486 samples/sec                   batch loss = 1640.2406356334686 | accuracy = 0.5595833333333333


Epoch[1] Batch[605] Speed: 1.2863125213861941 samples/sec                   batch loss = 1653.7115423679352 | accuracy = 0.5599173553719008


Epoch[1] Batch[610] Speed: 1.2855763364434936 samples/sec                   batch loss = 1665.6247172355652 | accuracy = 0.5602459016393443


Epoch[1] Batch[615] Speed: 1.2901564185897327 samples/sec                   batch loss = 1677.5168030261993 | accuracy = 0.5621951219512196


Epoch[1] Batch[620] Speed: 1.2895242510756018 samples/sec                   batch loss = 1689.4711782932281 | accuracy = 0.5637096774193548


Epoch[1] Batch[625] Speed: 1.295843413895183 samples/sec                   batch loss = 1702.3228740692139 | accuracy = 0.5628


Epoch[1] Batch[630] Speed: 1.288635604627077 samples/sec                   batch loss = 1716.5982222557068 | accuracy = 0.5626984126984127


Epoch[1] Batch[635] Speed: 1.2948517935901456 samples/sec                   batch loss = 1731.0506165027618 | accuracy = 0.5618110236220473


Epoch[1] Batch[640] Speed: 1.293064915924887 samples/sec                   batch loss = 1743.0680303573608 | accuracy = 0.56484375


Epoch[1] Batch[645] Speed: 1.2870326665198377 samples/sec                   batch loss = 1756.5930397510529 | accuracy = 0.5647286821705426


Epoch[1] Batch[650] Speed: 1.2857886583665123 samples/sec                   batch loss = 1770.305079460144 | accuracy = 0.565


Epoch[1] Batch[655] Speed: 1.2896942556110818 samples/sec                   batch loss = 1782.8590404987335 | accuracy = 0.5648854961832062


Epoch[1] Batch[660] Speed: 1.2853077585638264 samples/sec                   batch loss = 1797.4133484363556 | accuracy = 0.5640151515151515


Epoch[1] Batch[665] Speed: 1.2834289855686667 samples/sec                   batch loss = 1811.228848695755 | accuracy = 0.562781954887218


Epoch[1] Batch[670] Speed: 1.2767873803859613 samples/sec                   batch loss = 1824.5027849674225 | accuracy = 0.5634328358208955


Epoch[1] Batch[675] Speed: 1.283531690247251 samples/sec                   batch loss = 1837.6536929607391 | accuracy = 0.5644444444444444


Epoch[1] Batch[680] Speed: 1.283968909020835 samples/sec                   batch loss = 1850.1505224704742 | accuracy = 0.5650735294117647


Epoch[1] Batch[685] Speed: 1.2868670150477206 samples/sec                   batch loss = 1863.688069343567 | accuracy = 0.5664233576642336


Epoch[1] Batch[690] Speed: 1.274673236704464 samples/sec                   batch loss = 1877.0606603622437 | accuracy = 0.5670289855072463


Epoch[1] Batch[695] Speed: 1.2788975324082428 samples/sec                   batch loss = 1889.4952056407928 | accuracy = 0.5676258992805755


Epoch[1] Batch[700] Speed: 1.2778862113229603 samples/sec                   batch loss = 1902.1216747760773 | accuracy = 0.5682142857142857


Epoch[1] Batch[705] Speed: 1.2814299159202767 samples/sec                   batch loss = 1915.60617852211 | accuracy = 0.5684397163120567


Epoch[1] Batch[710] Speed: 1.2818396519235484 samples/sec                   batch loss = 1929.4556419849396 | accuracy = 0.5686619718309859


Epoch[1] Batch[715] Speed: 1.2761029169038522 samples/sec                   batch loss = 1941.9449036121368 | accuracy = 0.5695804195804196


Epoch[1] Batch[720] Speed: 1.2821080559321132 samples/sec                   batch loss = 1954.9237430095673 | accuracy = 0.5697916666666667


Epoch[1] Batch[725] Speed: 1.2849542578410533 samples/sec                   batch loss = 1968.9494156837463 | accuracy = 0.5689655172413793


Epoch[1] Batch[730] Speed: 1.288293922889341 samples/sec                   batch loss = 1981.185736656189 | accuracy = 0.5705479452054795


Epoch[1] Batch[735] Speed: 1.2791008274060274 samples/sec                   batch loss = 1994.003059387207 | accuracy = 0.5710884353741497


Epoch[1] Batch[740] Speed: 1.2790834692426203 samples/sec                   batch loss = 2008.22815823555 | accuracy = 0.5709459459459459


Epoch[1] Batch[745] Speed: 1.2821303953460879 samples/sec                   batch loss = 2021.667636871338 | accuracy = 0.5711409395973155


Epoch[1] Batch[750] Speed: 1.2811400770461019 samples/sec                   batch loss = 2034.9926488399506 | accuracy = 0.5713333333333334


Epoch[1] Batch[755] Speed: 1.2840020244372816 samples/sec                   batch loss = 2047.545977473259 | accuracy = 0.5718543046357616


Epoch[1] Batch[760] Speed: 1.2800777390960938 samples/sec                   batch loss = 2059.7892194986343 | accuracy = 0.5723684210526315


Epoch[1] Batch[765] Speed: 1.28204094439283 samples/sec                   batch loss = 2073.422515273094 | accuracy = 0.5725490196078431


Epoch[1] Batch[770] Speed: 1.2821952624423576 samples/sec                   batch loss = 2087.122237801552 | accuracy = 0.5724025974025974


Epoch[1] Batch[775] Speed: 1.2799197315964224 samples/sec                   batch loss = 2100.436494231224 | accuracy = 0.572258064516129


Epoch[1] Batch[780] Speed: 1.2863793904693526 samples/sec                   batch loss = 2113.703612446785 | accuracy = 0.573076923076923


Epoch[1] Batch[785] Speed: 1.2829223795689997 samples/sec                   batch loss = 2125.959320664406 | accuracy = 0.5735668789808918


[Epoch 1] training: accuracy=0.5736040609137056
[Epoch 1] time cost: 630.949196100235
[Epoch 1] validation: validation accuracy=0.6711111111111111


Epoch[2] Batch[5] Speed: 1.2887388473521277 samples/sec                   batch loss = 14.797056436538696 | accuracy = 0.6


Epoch[2] Batch[10] Speed: 1.2854504542882939 samples/sec                   batch loss = 28.38944721221924 | accuracy = 0.55


Epoch[2] Batch[15] Speed: 1.2790735226422145 samples/sec                   batch loss = 39.64159178733826 | accuracy = 0.6


Epoch[2] Batch[20] Speed: 1.2808276832944874 samples/sec                   batch loss = 52.67197251319885 | accuracy = 0.575


Epoch[2] Batch[25] Speed: 1.283562131694059 samples/sec                   batch loss = 66.32958793640137 | accuracy = 0.57


Epoch[2] Batch[30] Speed: 1.288603239508526 samples/sec                   batch loss = 79.0871889591217 | accuracy = 0.5666666666666667


Epoch[2] Batch[35] Speed: 1.2851298518255243 samples/sec                   batch loss = 92.51889538764954 | accuracy = 0.5714285714285714


Epoch[2] Batch[40] Speed: 1.279839473259427 samples/sec                   batch loss = 105.23770880699158 | accuracy = 0.575


Epoch[2] Batch[45] Speed: 1.2888158694770588 samples/sec                   batch loss = 118.83701372146606 | accuracy = 0.5666666666666667


Epoch[2] Batch[50] Speed: 1.2852953517371488 samples/sec                   batch loss = 131.58655881881714 | accuracy = 0.58


Epoch[2] Batch[55] Speed: 1.2942332879273666 samples/sec                   batch loss = 144.102445602417 | accuracy = 0.5818181818181818


Epoch[2] Batch[60] Speed: 1.2926460811107445 samples/sec                   batch loss = 156.43407201766968 | accuracy = 0.5875


Epoch[2] Batch[65] Speed: 1.2875976631826944 samples/sec                   batch loss = 170.00121545791626 | accuracy = 0.573076923076923


Epoch[2] Batch[70] Speed: 1.2837263440102047 samples/sec                   batch loss = 183.1708381175995 | accuracy = 0.5821428571428572


Epoch[2] Batch[75] Speed: 1.2877446238102503 samples/sec                   batch loss = 195.69089698791504 | accuracy = 0.5833333333333334


Epoch[2] Batch[80] Speed: 1.2868234867842467 samples/sec                   batch loss = 208.18349313735962 | accuracy = 0.584375


Epoch[2] Batch[85] Speed: 1.2806361558540929 samples/sec                   batch loss = 220.8052635192871 | accuracy = 0.5823529411764706


Epoch[2] Batch[90] Speed: 1.282579798416554 samples/sec                   batch loss = 234.09202194213867 | accuracy = 0.5777777777777777


Epoch[2] Batch[95] Speed: 1.2845451840626876 samples/sec                   batch loss = 247.13344740867615 | accuracy = 0.5815789473684211


Epoch[2] Batch[100] Speed: 1.2825762686157314 samples/sec                   batch loss = 260.9277219772339 | accuracy = 0.58


Epoch[2] Batch[105] Speed: 1.2823684368224026 samples/sec                   batch loss = 273.09935688972473 | accuracy = 0.5857142857142857


Epoch[2] Batch[110] Speed: 1.2859800544968902 samples/sec                   batch loss = 285.30635380744934 | accuracy = 0.5863636363636363


Epoch[2] Batch[115] Speed: 1.2829366046151476 samples/sec                   batch loss = 296.5836991071701 | accuracy = 0.5978260869565217


Epoch[2] Batch[120] Speed: 1.2839438524967477 samples/sec                   batch loss = 308.37708485126495 | accuracy = 0.6020833333333333


Epoch[2] Batch[125] Speed: 1.2920774423218495 samples/sec                   batch loss = 320.9783104658127 | accuracy = 0.604


Epoch[2] Batch[130] Speed: 1.283687742481644 samples/sec                   batch loss = 334.7413386106491 | accuracy = 0.5980769230769231


Epoch[2] Batch[135] Speed: 1.2854570531316685 samples/sec                   batch loss = 347.6317995786667 | accuracy = 0.6


Epoch[2] Batch[140] Speed: 1.2842723179534703 samples/sec                   batch loss = 359.0689586400986 | accuracy = 0.6


Epoch[2] Batch[145] Speed: 1.2817462266822712 samples/sec                   batch loss = 371.8066145181656 | accuracy = 0.6


Epoch[2] Batch[150] Speed: 1.2850881143786463 samples/sec                   batch loss = 384.834997177124 | accuracy = 0.6


Epoch[2] Batch[155] Speed: 1.2885500929784026 samples/sec                   batch loss = 397.2987971305847 | accuracy = 0.6016129032258064


Epoch[2] Batch[160] Speed: 1.2827184565189502 samples/sec                   batch loss = 410.08729553222656 | accuracy = 0.6015625


Epoch[2] Batch[165] Speed: 1.2880158033059286 samples/sec                   batch loss = 423.44304895401 | accuracy = 0.603030303030303


Epoch[2] Batch[170] Speed: 1.2872752975073816 samples/sec                   batch loss = 437.1807105541229 | accuracy = 0.6014705882352941


Epoch[2] Batch[175] Speed: 1.2819641421434422 samples/sec                   batch loss = 450.0457134246826 | accuracy = 0.6014285714285714


Epoch[2] Batch[180] Speed: 1.2841260504619756 samples/sec                   batch loss = 460.8205007314682 | accuracy = 0.6069444444444444


Epoch[2] Batch[185] Speed: 1.2855019664624463 samples/sec                   batch loss = 472.91938388347626 | accuracy = 0.6081081081081081


Epoch[2] Batch[190] Speed: 1.2950141093509822 samples/sec                   batch loss = 484.5355659723282 | accuracy = 0.6105263157894737


Epoch[2] Batch[195] Speed: 1.283198303633224 samples/sec                   batch loss = 498.0755704641342 | accuracy = 0.6102564102564103


Epoch[2] Batch[200] Speed: 1.2809110972717181 samples/sec                   batch loss = 511.566623210907 | accuracy = 0.60875


Epoch[2] Batch[205] Speed: 1.2841815849338514 samples/sec                   batch loss = 524.5026378631592 | accuracy = 0.6097560975609756


Epoch[2] Batch[210] Speed: 1.283856309731608 samples/sec                   batch loss = 536.1092507839203 | accuracy = 0.6154761904761905


Epoch[2] Batch[215] Speed: 1.2882607835645319 samples/sec                   batch loss = 548.5947906970978 | accuracy = 0.6174418604651163


Epoch[2] Batch[220] Speed: 1.2827058054209965 samples/sec                   batch loss = 560.1068760156631 | accuracy = 0.6193181818181818


Epoch[2] Batch[225] Speed: 1.284411243938992 samples/sec                   batch loss = 569.9869515895844 | accuracy = 0.6244444444444445


Epoch[2] Batch[230] Speed: 1.284097941038184 samples/sec                   batch loss = 581.7055020332336 | accuracy = 0.6260869565217392


Epoch[2] Batch[235] Speed: 1.2857118991000822 samples/sec                   batch loss = 594.5038893222809 | accuracy = 0.6297872340425532


Epoch[2] Batch[240] Speed: 1.2886142256509043 samples/sec                   batch loss = 609.1954019069672 | accuracy = 0.6270833333333333


Epoch[2] Batch[245] Speed: 1.2908915031086563 samples/sec                   batch loss = 626.2977802753448 | accuracy = 0.6234693877551021


Epoch[2] Batch[250] Speed: 1.2807126035322236 samples/sec                   batch loss = 638.5089230537415 | accuracy = 0.623


Epoch[2] Batch[255] Speed: 1.2807515151903786 samples/sec                   batch loss = 653.3083367347717 | accuracy = 0.6225490196078431


Epoch[2] Batch[260] Speed: 1.2806706637142742 samples/sec                   batch loss = 665.5741248130798 | accuracy = 0.6240384615384615


Epoch[2] Batch[265] Speed: 1.2880647523472695 samples/sec                   batch loss = 678.1419396400452 | accuracy = 0.6273584905660378


Epoch[2] Batch[270] Speed: 1.2847866812136617 samples/sec                   batch loss = 690.7783887386322 | accuracy = 0.6277777777777778


Epoch[2] Batch[275] Speed: 1.2839423786140214 samples/sec                   batch loss = 702.326784491539 | accuracy = 0.6290909090909091


Epoch[2] Batch[280] Speed: 1.2814365714249052 samples/sec                   batch loss = 715.6576076745987 | accuracy = 0.6303571428571428


Epoch[2] Batch[285] Speed: 1.2815324966699833 samples/sec                   batch loss = 728.8980730772018 | accuracy = 0.6307017543859649


Epoch[2] Batch[290] Speed: 1.284581771684743 samples/sec                   batch loss = 743.5057255029678 | accuracy = 0.6301724137931034


Epoch[2] Batch[295] Speed: 1.2829598558870061 samples/sec                   batch loss = 756.0220892429352 | accuracy = 0.6305084745762712


Epoch[2] Batch[300] Speed: 1.2821921267217795 samples/sec                   batch loss = 767.7719479799271 | accuracy = 0.6316666666666667


Epoch[2] Batch[305] Speed: 1.2880724658688607 samples/sec                   batch loss = 779.9078123569489 | accuracy = 0.6311475409836066


Epoch[2] Batch[310] Speed: 1.2845032878940819 samples/sec                   batch loss = 792.4523267745972 | accuracy = 0.6290322580645161


Epoch[2] Batch[315] Speed: 1.2861975386005704 samples/sec                   batch loss = 805.5067110061646 | accuracy = 0.6277777777777778


Epoch[2] Batch[320] Speed: 1.2845240389411652 samples/sec                   batch loss = 818.7769758701324 | accuracy = 0.62734375


Epoch[2] Batch[325] Speed: 1.2862080893374515 samples/sec                   batch loss = 833.3263907432556 | accuracy = 0.6261538461538462


Epoch[2] Batch[330] Speed: 1.2838320435363213 samples/sec                   batch loss = 846.9571017026901 | accuracy = 0.6242424242424243


Epoch[2] Batch[335] Speed: 1.2888322056868902 samples/sec                   batch loss = 859.3763399124146 | accuracy = 0.6238805970149254


Epoch[2] Batch[340] Speed: 1.2835928692005332 samples/sec                   batch loss = 873.5036878585815 | accuracy = 0.6227941176470588


Epoch[2] Batch[345] Speed: 1.2837153428291492 samples/sec                   batch loss = 886.5743156671524 | accuracy = 0.6231884057971014


Epoch[2] Batch[350] Speed: 1.282303552203383 samples/sec                   batch loss = 897.4585021734238 | accuracy = 0.6257142857142857


Epoch[2] Batch[355] Speed: 1.2858504468726883 samples/sec                   batch loss = 910.286346077919 | accuracy = 0.6274647887323944


Epoch[2] Batch[360] Speed: 1.2806813194916393 samples/sec                   batch loss = 923.2402189970016 | accuracy = 0.6284722222222222


Epoch[2] Batch[365] Speed: 1.2875622869519363 samples/sec                   batch loss = 933.9731949567795 | accuracy = 0.6308219178082192


Epoch[2] Batch[370] Speed: 1.2851384162123445 samples/sec                   batch loss = 946.9464794397354 | accuracy = 0.6297297297297297


Epoch[2] Batch[375] Speed: 1.2823174694990152 samples/sec                   batch loss = 957.3369668722153 | accuracy = 0.6306666666666667


Epoch[2] Batch[380] Speed: 1.2813285258992746 samples/sec                   batch loss = 969.0686115026474 | accuracy = 0.631578947368421


Epoch[2] Batch[385] Speed: 1.283307842671559 samples/sec                   batch loss = 981.1208683252335 | accuracy = 0.6318181818181818


Epoch[2] Batch[390] Speed: 1.2841350929158555 samples/sec                   batch loss = 993.1456018686295 | accuracy = 0.632051282051282


Epoch[2] Batch[395] Speed: 1.2869852767619416 samples/sec                   batch loss = 1004.3872617483139 | accuracy = 0.6335443037974684


Epoch[2] Batch[400] Speed: 1.28331108201382 samples/sec                   batch loss = 1013.9028615951538 | accuracy = 0.63625


Epoch[2] Batch[405] Speed: 1.285323809020364 samples/sec                   batch loss = 1026.7807579040527 | accuracy = 0.6364197530864197


Epoch[2] Batch[410] Speed: 1.2893467606473104 samples/sec                   batch loss = 1039.9212629795074 | accuracy = 0.6365853658536585


Epoch[2] Batch[415] Speed: 1.2831932982637608 samples/sec                   batch loss = 1051.5151880979538 | accuracy = 0.636144578313253


Epoch[2] Batch[420] Speed: 1.287539461421434 samples/sec                   batch loss = 1063.4063435792923 | accuracy = 0.638095238095238


Epoch[2] Batch[425] Speed: 1.2902906665895542 samples/sec                   batch loss = 1073.6669428348541 | accuracy = 0.6394117647058823


Epoch[2] Batch[430] Speed: 1.2870654464775388 samples/sec                   batch loss = 1086.4196510314941 | accuracy = 0.6377906976744186


Epoch[2] Batch[435] Speed: 1.2848275135335159 samples/sec                   batch loss = 1099.4663753509521 | accuracy = 0.6367816091954023


Epoch[2] Batch[440] Speed: 1.2814520359526038 samples/sec                   batch loss = 1110.689601302147 | accuracy = 0.6386363636363637


Epoch[2] Batch[445] Speed: 1.2882946153710297 samples/sec                   batch loss = 1124.698910355568 | accuracy = 0.6365168539325843


Epoch[2] Batch[450] Speed: 1.2852306628784629 samples/sec                   batch loss = 1137.801898598671 | accuracy = 0.6372222222222222


Epoch[2] Batch[455] Speed: 1.2858615832318512 samples/sec                   batch loss = 1151.4514080286026 | accuracy = 0.6357142857142857


Epoch[2] Batch[460] Speed: 1.2854380447066485 samples/sec                   batch loss = 1165.8954952955246 | accuracy = 0.6353260869565217


Epoch[2] Batch[465] Speed: 1.2805308841571434 samples/sec                   batch loss = 1178.2305483818054 | accuracy = 0.6349462365591397


Epoch[2] Batch[470] Speed: 1.2829641726715861 samples/sec                   batch loss = 1191.0852558612823 | accuracy = 0.6361702127659574


Epoch[2] Batch[475] Speed: 1.286132167332863 samples/sec                   batch loss = 1202.2117255926132 | accuracy = 0.6378947368421053


Epoch[2] Batch[480] Speed: 1.2855573244905822 samples/sec                   batch loss = 1213.4696856737137 | accuracy = 0.6385416666666667


Epoch[2] Batch[485] Speed: 1.2878672978724173 samples/sec                   batch loss = 1226.885793685913 | accuracy = 0.6376288659793814


Epoch[2] Batch[490] Speed: 1.2811080873402538 samples/sec                   batch loss = 1239.4441912174225 | accuracy = 0.6387755102040816


Epoch[2] Batch[495] Speed: 1.280333583006742 samples/sec                   batch loss = 1252.5752696990967 | accuracy = 0.6388888888888888


Epoch[2] Batch[500] Speed: 1.2826693245824154 samples/sec                   batch loss = 1262.3394191265106 | accuracy = 0.6405


Epoch[2] Batch[505] Speed: 1.2814205200314375 samples/sec                   batch loss = 1274.4590759277344 | accuracy = 0.6415841584158416


Epoch[2] Batch[510] Speed: 1.2843682749426781 samples/sec                   batch loss = 1287.1598858833313 | accuracy = 0.6421568627450981


Epoch[2] Batch[515] Speed: 1.2821916367668247 samples/sec                   batch loss = 1298.2744505405426 | accuracy = 0.6432038834951457


Epoch[2] Batch[520] Speed: 1.2806317569687318 samples/sec                   batch loss = 1309.7684359550476 | accuracy = 0.6442307692307693


Epoch[2] Batch[525] Speed: 1.2842209042782828 samples/sec                   batch loss = 1319.5076591968536 | accuracy = 0.6461904761904762


Epoch[2] Batch[530] Speed: 1.282240242122238 samples/sec                   batch loss = 1334.106315255165 | accuracy = 0.6462264150943396


Epoch[2] Batch[535] Speed: 1.2808637660849322 samples/sec                   batch loss = 1345.5163258314133 | accuracy = 0.6457943925233645


Epoch[2] Batch[540] Speed: 1.28082210971138 samples/sec                   batch loss = 1357.4263756275177 | accuracy = 0.6458333333333334


Epoch[2] Batch[545] Speed: 1.2867887452494813 samples/sec                   batch loss = 1370.317489862442 | accuracy = 0.6458715596330276


Epoch[2] Batch[550] Speed: 1.2855960385424041 samples/sec                   batch loss = 1381.3831087350845 | accuracy = 0.6472727272727272


Epoch[2] Batch[555] Speed: 1.2879250349305922 samples/sec                   batch loss = 1392.5328557491302 | accuracy = 0.6481981981981982


Epoch[2] Batch[560] Speed: 1.28411071789644 samples/sec                   batch loss = 1405.2196741104126 | accuracy = 0.6482142857142857


Epoch[2] Batch[565] Speed: 1.2869980124164522 samples/sec                   batch loss = 1416.6474286317825 | accuracy = 0.6491150442477877


Epoch[2] Batch[570] Speed: 1.2826490256595309 samples/sec                   batch loss = 1426.664099574089 | accuracy = 0.65


Epoch[2] Batch[575] Speed: 1.2883682205266047 samples/sec                   batch loss = 1436.7558380365372 | accuracy = 0.6508695652173913


Epoch[2] Batch[580] Speed: 1.2861604644867664 samples/sec                   batch loss = 1445.6343752145767 | accuracy = 0.6525862068965518


Epoch[2] Batch[585] Speed: 1.2886175908130744 samples/sec                   batch loss = 1453.1974851489067 | accuracy = 0.6551282051282051


Epoch[2] Batch[590] Speed: 1.2874905522954094 samples/sec                   batch loss = 1466.7946402430534 | accuracy = 0.6546610169491526


Epoch[2] Batch[595] Speed: 1.2888562652227238 samples/sec                   batch loss = 1479.5704093575478 | accuracy = 0.653781512605042


Epoch[2] Batch[600] Speed: 1.2833381753349205 samples/sec                   batch loss = 1489.5645617842674 | accuracy = 0.655


Epoch[2] Batch[605] Speed: 1.2876133755957544 samples/sec                   batch loss = 1501.3565738797188 | accuracy = 0.6553719008264463


Epoch[2] Batch[610] Speed: 1.283626456274096 samples/sec                   batch loss = 1513.716631948948 | accuracy = 0.655327868852459


Epoch[2] Batch[615] Speed: 1.2873966967040016 samples/sec                   batch loss = 1525.6214419007301 | accuracy = 0.6560975609756098


Epoch[2] Batch[620] Speed: 1.292803560654338 samples/sec                   batch loss = 1535.6800958514214 | accuracy = 0.657258064516129


Epoch[2] Batch[625] Speed: 1.280854280709989 samples/sec                   batch loss = 1548.478084743023 | accuracy = 0.6568


Epoch[2] Batch[630] Speed: 1.2801011798723105 samples/sec                   batch loss = 1560.4297763705254 | accuracy = 0.6567460317460317


Epoch[2] Batch[635] Speed: 1.2873692341820868 samples/sec                   batch loss = 1572.03334826231 | accuracy = 0.6566929133858268


Epoch[2] Batch[640] Speed: 1.2809782861819978 samples/sec                   batch loss = 1584.7314419150352 | accuracy = 0.65703125


Epoch[2] Batch[645] Speed: 1.2828138871818822 samples/sec                   batch loss = 1596.6156688332558 | accuracy = 0.6577519379844962


Epoch[2] Batch[650] Speed: 1.2810256258329384 samples/sec                   batch loss = 1607.1006299853325 | accuracy = 0.6592307692307692


Epoch[2] Batch[655] Speed: 1.2846790537526778 samples/sec                   batch loss = 1618.8393501639366 | accuracy = 0.6595419847328244


Epoch[2] Batch[660] Speed: 1.2850936267162447 samples/sec                   batch loss = 1631.4043251872063 | accuracy = 0.6602272727272728


Epoch[2] Batch[665] Speed: 1.2822135871161853 samples/sec                   batch loss = 1643.2109377980232 | accuracy = 0.6597744360902256


Epoch[2] Batch[670] Speed: 1.284232700551133 samples/sec                   batch loss = 1655.4785229563713 | accuracy = 0.6597014925373135


Epoch[2] Batch[675] Speed: 1.2815474740502772 samples/sec                   batch loss = 1664.2888707518578 | accuracy = 0.6611111111111111


Epoch[2] Batch[680] Speed: 1.2816988337584598 samples/sec                   batch loss = 1674.7629291415215 | accuracy = 0.6610294117647059


Epoch[2] Batch[685] Speed: 1.287575330475588 samples/sec                   batch loss = 1688.217974126339 | accuracy = 0.6605839416058394


Epoch[2] Batch[690] Speed: 1.2965228719702273 samples/sec                   batch loss = 1698.143614590168 | accuracy = 0.6619565217391304


Epoch[2] Batch[695] Speed: 1.281632352485076 samples/sec                   batch loss = 1708.4688729047775 | accuracy = 0.6622302158273381


Epoch[2] Batch[700] Speed: 1.289532477668286 samples/sec                   batch loss = 1719.1849702596664 | accuracy = 0.6625


Epoch[2] Batch[705] Speed: 1.2887693383007068 samples/sec                   batch loss = 1731.6104459762573 | accuracy = 0.6631205673758865


Epoch[2] Batch[710] Speed: 1.2910016641910667 samples/sec                   batch loss = 1746.9006447792053 | accuracy = 0.6626760563380282


Epoch[2] Batch[715] Speed: 1.2880354813622277 samples/sec                   batch loss = 1758.7852112054825 | accuracy = 0.6622377622377622


Epoch[2] Batch[720] Speed: 1.285143732096061 samples/sec                   batch loss = 1771.2069506645203 | accuracy = 0.6625


Epoch[2] Batch[725] Speed: 1.2897638564034755 samples/sec                   batch loss = 1784.4708228111267 | accuracy = 0.6610344827586206


Epoch[2] Batch[730] Speed: 1.2910137840514258 samples/sec                   batch loss = 1795.9284064769745 | accuracy = 0.661986301369863


Epoch[2] Batch[735] Speed: 1.2913364347997027 samples/sec                   batch loss = 1808.6607775688171 | accuracy = 0.6619047619047619


Epoch[2] Batch[740] Speed: 1.2901211992397634 samples/sec                   batch loss = 1821.050708413124 | accuracy = 0.6618243243243244


Epoch[2] Batch[745] Speed: 1.2855397906490502 samples/sec                   batch loss = 1832.4184052944183 | accuracy = 0.6620805369127517


Epoch[2] Batch[750] Speed: 1.2812878178185991 samples/sec                   batch loss = 1843.7336047887802 | accuracy = 0.6613333333333333


Epoch[2] Batch[755] Speed: 1.2888587405345733 samples/sec                   batch loss = 1855.6776424646378 | accuracy = 0.6612582781456954


Epoch[2] Batch[760] Speed: 1.286991496468701 samples/sec                   batch loss = 1867.4381755590439 | accuracy = 0.6621710526315789


Epoch[2] Batch[765] Speed: 1.2819857908257397 samples/sec                   batch loss = 1877.3169379234314 | accuracy = 0.6637254901960784


Epoch[2] Batch[770] Speed: 1.2869427276820677 samples/sec                   batch loss = 1888.3034131526947 | accuracy = 0.6642857142857143


Epoch[2] Batch[775] Speed: 1.2829803608724857 samples/sec                   batch loss = 1898.9875371456146 | accuracy = 0.6648387096774193


Epoch[2] Batch[780] Speed: 1.2870461929867818 samples/sec                   batch loss = 1911.3902304172516 | accuracy = 0.6647435897435897


Epoch[2] Batch[785] Speed: 1.2962032326554023 samples/sec                   batch loss = 1921.7829699516296 | accuracy = 0.664968152866242


[Epoch 2] training: accuracy=0.6652918781725888
[Epoch 2] time cost: 628.8195941448212
[Epoch 2] validation: validation accuracy=0.7477777777777778


## Next steps

Now that you have trained a neural network and run predictions with it on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).