<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference may run faster than on central processing units (CPUs).

## Prerequisites

Before you start the steps, make sure you have at least one Nvidia GPU on your machine and make sure that you have CUDA properly installed. GPUs from AMD and Intel are not supported. Additionally, you will need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet:

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[15:37:00] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If only one GPU is available, the code falls back to `npx.cpu()` instead, in which case the printed array shows no GPU context:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[15:37:00] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids

If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
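
As a small illustration of the zero-based indexing described above, the following sketch computes which GPU indices a script would use, given how many GPUs are present and how many are wanted. It is plain Python and does not call MXNet. The function name `pick_gpu_ids` and its parameters are hypothetical; in a real script the available count would come from `npx.num_gpus()` and each index would be passed to `npx.gpu(i)`.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return the zero-based GPU indices to use, lowest index first.

    Hypothetical helper for illustration only; it is not part of MXNet.
    An empty list means no GPU is available, so the caller should
    fall back to the CPU.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, the valid indices run from 0 to 7.
print(pick_gpu_ids(8, 8))
# Requesting two GPUs when only one is present yields just index 0.
print(pick_gpu_ids(1, 2))
```

Capping the request at the number of available devices avoids asking for an index that does not exist on the machine.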

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to ensure that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the LeafNetwork defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[15:37:01] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 3.8325076, -2.841408 ]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
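
To make the idea concrete, here is a minimal plain-Python sketch of data parallelism on a toy one-parameter model. It does not use MXNet, and it is not how `gluon.utils.split_and_load` or `gluon.Trainer` are implemented. The helper names `split_batch` and `grad_sum`, the squared-error loss, and the sample values are all assumptions chosen for illustration.

```python
def split_batch(batch, num_parts):
    """Split a list of samples into num_parts contiguous, near-equal slices."""
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    base, extra = divmod(len(batch), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        # The first `extra` slices take one additional sample each.
        size = base + (1 if i < extra else 0)
        parts.append(batch[start:start + size])
        start += size
    return parts


def grad_sum(samples, w):
    """Sum over samples of d/dw (w * x - y) ** 2."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


# Toy batch of (x, y) pairs and a single scalar weight.
batch = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.0)]
w = 1.0

# Each simulated device computes the gradient on its own slice.
parts = split_batch(batch, 2)
per_device = [grad_sum(part, w) for part in parts]

# Summing the per-device gradients and dividing by the batch size gives
# the same average gradient as processing the whole batch in one place.
parallel_grad = sum(per_device) / len(batch)
single_grad = grad_sum(batch, w) / len(batch)
print(parts)
print(parallel_grad, single_grad)
```

Because the gradient of a summed loss is the sum of the per-sample gradients, splitting the batch changes where the work is done but not the resulting update.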

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# Mean and std for normalizing image values in the range (0, 1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
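
The helper above passes lists of per-device labels and outputs to the metric. As a rough illustration of what accumulating accuracy over several shards amounts to, the following plain-Python sketch counts correct predictions shard by shard. `ShardedAccuracy` is a hypothetical stand-in written for this example, not MXNet's `gluon.metric.Accuracy`, and the labels and scores are made-up values.

```python
class ShardedAccuracy:
    """Running accuracy over batches that arrive as per-device shards."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, label_shards, output_shards):
        # Each shard pairs a list of integer labels with a list of score rows.
        for labels, outputs in zip(label_shards, output_shards):
            for label, scores in zip(labels, outputs):
                # The predicted class is the index of the highest score.
                predicted = max(range(len(scores)), key=scores.__getitem__)
                self.correct += int(predicted == label)
                self.total += 1

    def get(self):
        return self.correct / self.total if self.total else float("nan")


acc = ShardedAccuracy()
label_shards = [[0, 1], [1, 0]]            # labels held by two devices
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],             # scores from the first device
    [[1.5, 0.2], [0.8, 0.1]],              # scores from the second device
]
acc.update(label_shards, output_shards)
print(acc.get())
```

Keeping running counts rather than per-shard ratios means shards of different sizes are weighted correctly in the final figure.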

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.779516338107339 samples/sec                   batch loss = 14.624186992645264 | accuracy = 0.4


Epoch[1] Batch[10] Speed: 1.2531180779506044 samples/sec                   batch loss = 29.871703624725342 | accuracy = 0.35


Epoch[1] Batch[15] Speed: 1.254644356614737 samples/sec                   batch loss = 45.128093242645264 | accuracy = 0.36666666666666664


Epoch[1] Batch[20] Speed: 1.2525917197862149 samples/sec                   batch loss = 59.062973976135254 | accuracy = 0.4125


Epoch[1] Batch[25] Speed: 1.2510216383616444 samples/sec                   batch loss = 73.81603026390076 | accuracy = 0.42


Epoch[1] Batch[30] Speed: 1.2542914839586654 samples/sec                   batch loss = 87.36227321624756 | accuracy = 0.44166666666666665


Epoch[1] Batch[35] Speed: 1.251794605063775 samples/sec                   batch loss = 100.88459491729736 | accuracy = 0.45714285714285713


Epoch[1] Batch[40] Speed: 1.2487022282836755 samples/sec                   batch loss = 114.93676495552063 | accuracy = 0.475


Epoch[1] Batch[45] Speed: 1.2508371487214816 samples/sec                   batch loss = 128.34864568710327 | accuracy = 0.4888888888888889


Epoch[1] Batch[50] Speed: 1.2559309916838781 samples/sec                   batch loss = 142.96037888526917 | accuracy = 0.48


Epoch[1] Batch[55] Speed: 1.2520046031863656 samples/sec                   batch loss = 157.43460130691528 | accuracy = 0.4863636363636364


Epoch[1] Batch[60] Speed: 1.2510560613098314 samples/sec                   batch loss = 171.3926179409027 | accuracy = 0.49583333333333335


Epoch[1] Batch[65] Speed: 1.248671187437742 samples/sec                   batch loss = 185.3381073474884 | accuracy = 0.5


Epoch[1] Batch[70] Speed: 1.250472340335774 samples/sec                   batch loss = 199.99222707748413 | accuracy = 0.5


Epoch[1] Batch[75] Speed: 1.2523177694713983 samples/sec                   batch loss = 213.9950578212738 | accuracy = 0.49666666666666665


Epoch[1] Batch[80] Speed: 1.2567347026523974 samples/sec                   batch loss = 228.2658715248108 | accuracy = 0.484375


Epoch[1] Batch[85] Speed: 1.2546228709188638 samples/sec                   batch loss = 242.3204345703125 | accuracy = 0.47941176470588237


Epoch[1] Batch[90] Speed: 1.2563080235617061 samples/sec                   batch loss = 256.1294434070587 | accuracy = 0.48333333333333334


Epoch[1] Batch[95] Speed: 1.2553072092329325 samples/sec                   batch loss = 269.74929904937744 | accuracy = 0.4842105263157895


Epoch[1] Batch[100] Speed: 1.2528754267148254 samples/sec                   batch loss = 283.4947497844696 | accuracy = 0.4925


Epoch[1] Batch[105] Speed: 1.2517649980063938 samples/sec                   batch loss = 297.31545329093933 | accuracy = 0.49523809523809526


Epoch[1] Batch[110] Speed: 1.2520446864570316 samples/sec                   batch loss = 310.5470907688141 | accuracy = 0.49772727272727274


Epoch[1] Batch[115] Speed: 1.251989841229852 samples/sec                   batch loss = 324.2488613128662 | accuracy = 0.5043478260869565


Epoch[1] Batch[120] Speed: 1.2541262779839721 samples/sec                   batch loss = 337.6373507976532 | accuracy = 0.5104166666666666


Epoch[1] Batch[125] Speed: 1.2540904672187425 samples/sec                   batch loss = 351.3149869441986 | accuracy = 0.514


Epoch[1] Batch[130] Speed: 1.2520331003528513 samples/sec                   batch loss = 364.72583961486816 | accuracy = 0.5192307692307693


Epoch[1] Batch[135] Speed: 1.2489856635384484 samples/sec                   batch loss = 378.32367277145386 | accuracy = 0.5222222222222223


Epoch[1] Batch[140] Speed: 1.251216166903838 samples/sec                   batch loss = 392.2602035999298 | accuracy = 0.5232142857142857


Epoch[1] Batch[145] Speed: 1.2504501584973804 samples/sec                   batch loss = 405.4762964248657 | accuracy = 0.5310344827586206


Epoch[1] Batch[150] Speed: 1.2552943416710531 samples/sec                   batch loss = 418.7456088066101 | accuracy = 0.5333333333333333


Epoch[1] Batch[155] Speed: 1.2619115295348708 samples/sec                   batch loss = 432.46896052360535 | accuracy = 0.535483870967742


Epoch[1] Batch[160] Speed: 1.2545077614050726 samples/sec                   batch loss = 447.08835315704346 | accuracy = 0.53125


Epoch[1] Batch[165] Speed: 1.2544081482169906 samples/sec                   batch loss = 460.5231647491455 | accuracy = 0.5333333333333333


Epoch[1] Batch[170] Speed: 1.2563491355063343 samples/sec                   batch loss = 473.9419355392456 | accuracy = 0.5367647058823529


Epoch[1] Batch[175] Speed: 1.2598717721317279 samples/sec                   batch loss = 487.22366976737976 | accuracy = 0.5414285714285715


Epoch[1] Batch[180] Speed: 1.2616997133180214 samples/sec                   batch loss = 502.1451120376587 | accuracy = 0.5388888888888889


Epoch[1] Batch[185] Speed: 1.2568574712610956 samples/sec                   batch loss = 515.6420147418976 | accuracy = 0.5418918918918919


Epoch[1] Batch[190] Speed: 1.258865179578629 samples/sec                   batch loss = 529.7615671157837 | accuracy = 0.5394736842105263


Epoch[1] Batch[195] Speed: 1.2613442838390323 samples/sec                   batch loss = 543.1777565479279 | accuracy = 0.5371794871794872


Epoch[1] Batch[200] Speed: 1.2564024815595336 samples/sec                   batch loss = 556.35790848732 | accuracy = 0.535


Epoch[1] Batch[205] Speed: 1.2596713273980806 samples/sec                   batch loss = 569.8373634815216 | accuracy = 0.5329268292682927


Epoch[1] Batch[210] Speed: 1.2624206727019585 samples/sec                   batch loss = 583.5066165924072 | accuracy = 0.5333333333333333


Epoch[1] Batch[215] Speed: 1.2582854748010897 samples/sec                   batch loss = 597.1209199428558 | accuracy = 0.5337209302325582


Epoch[1] Batch[220] Speed: 1.2582638642175499 samples/sec                   batch loss = 610.8530395030975 | accuracy = 0.5340909090909091


Epoch[1] Batch[225] Speed: 1.2644286735600268 samples/sec                   batch loss = 624.4084095954895 | accuracy = 0.5333333333333333


Epoch[1] Batch[230] Speed: 1.2637091317532942 samples/sec                   batch loss = 638.5197358131409 | accuracy = 0.5304347826086957


Epoch[1] Batch[235] Speed: 1.2596081518129243 samples/sec                   batch loss = 652.4413113594055 | accuracy = 0.5308510638297872


Epoch[1] Batch[240] Speed: 1.2534616748399254 samples/sec                   batch loss = 665.9073007106781 | accuracy = 0.5333333333333333


Epoch[1] Batch[245] Speed: 1.2523962958749462 samples/sec                   batch loss = 679.782844543457 | accuracy = 0.5336734693877551


Epoch[1] Batch[250] Speed: 1.2577837157153018 samples/sec                   batch loss = 693.956794500351 | accuracy = 0.533


Epoch[1] Batch[255] Speed: 1.2580929874498272 samples/sec                   batch loss = 707.5730941295624 | accuracy = 0.5362745098039216


Epoch[1] Batch[260] Speed: 1.2556218421352585 samples/sec                   batch loss = 721.7928812503815 | accuracy = 0.5346153846153846


Epoch[1] Batch[265] Speed: 1.2568691468255528 samples/sec                   batch loss = 735.4499049186707 | accuracy = 0.5349056603773585


Epoch[1] Batch[270] Speed: 1.2582155497900689 samples/sec                   batch loss = 748.7143709659576 | accuracy = 0.5370370370370371


Epoch[1] Batch[275] Speed: 1.2567643569827016 samples/sec                   batch loss = 762.4234654903412 | accuracy = 0.5372727272727272


Epoch[1] Batch[280] Speed: 1.254858409335422 samples/sec                   batch loss = 776.0863509178162 | accuracy = 0.5357142857142857


Epoch[1] Batch[285] Speed: 1.253392004004348 samples/sec                   batch loss = 789.6021010875702 | accuracy = 0.5359649122807018


Epoch[1] Batch[290] Speed: 1.2535434355763404 samples/sec                   batch loss = 802.7900264263153 | accuracy = 0.5405172413793103


Epoch[1] Batch[295] Speed: 1.2572251673499748 samples/sec                   batch loss = 816.2262654304504 | accuracy = 0.5432203389830509


Epoch[1] Batch[300] Speed: 1.2555279712644427 samples/sec                   batch loss = 830.1254909038544 | accuracy = 0.5416666666666666


Epoch[1] Batch[305] Speed: 1.2637027543047592 samples/sec                   batch loss = 843.7709386348724 | accuracy = 0.5426229508196722


Epoch[1] Batch[310] Speed: 1.2531811658191974 samples/sec                   batch loss = 857.5104458332062 | accuracy = 0.5419354838709678


Epoch[1] Batch[315] Speed: 1.2578159656827659 samples/sec                   batch loss = 871.2651743888855 | accuracy = 0.5444444444444444


Epoch[1] Batch[320] Speed: 1.256337845917168 samples/sec                   batch loss = 885.0011477470398 | accuracy = 0.5453125


Epoch[1] Batch[325] Speed: 1.2601476179670987 samples/sec                   batch loss = 899.4759087562561 | accuracy = 0.5438461538461539


Epoch[1] Batch[330] Speed: 1.2606395192965456 samples/sec                   batch loss = 913.3489718437195 | accuracy = 0.5446969696969697


Epoch[1] Batch[335] Speed: 1.2583379472547405 samples/sec                   batch loss = 926.9488160610199 | accuracy = 0.5447761194029851


Epoch[1] Batch[340] Speed: 1.2640015164507938 samples/sec                   batch loss = 941.0407631397247 | accuracy = 0.5419117647058823


Epoch[1] Batch[345] Speed: 1.258942545399426 samples/sec                   batch loss = 955.0483338832855 | accuracy = 0.5413043478260869


Epoch[1] Batch[350] Speed: 1.2612759148568151 samples/sec                   batch loss = 968.378006696701 | accuracy = 0.5407142857142857


Epoch[1] Batch[355] Speed: 1.261227748096587 samples/sec                   batch loss = 981.8188331127167 | accuracy = 0.5429577464788733


Epoch[1] Batch[360] Speed: 1.2618668258376937 samples/sec                   batch loss = 995.8689386844635 | accuracy = 0.5423611111111111


Epoch[1] Batch[365] Speed: 1.2674793479149375 samples/sec                   batch loss = 1010.1038174629211 | accuracy = 0.541095890410959


Epoch[1] Batch[370] Speed: 1.263710178803235 samples/sec                   batch loss = 1023.6155829429626 | accuracy = 0.5418918918918919


Epoch[1] Batch[375] Speed: 1.2614304905260099 samples/sec                   batch loss = 1037.4353799819946 | accuracy = 0.54


Epoch[1] Batch[380] Speed: 1.2584138324279701 samples/sec                   batch loss = 1051.1963338851929 | accuracy = 0.5394736842105263


Epoch[1] Batch[385] Speed: 1.2574851520175567 samples/sec                   batch loss = 1064.5178384780884 | accuracy = 0.5422077922077922


Epoch[1] Batch[390] Speed: 1.2581074219583375 samples/sec                   batch loss = 1077.8892147541046 | accuracy = 0.5455128205128205


Epoch[1] Batch[395] Speed: 1.2575288859893132 samples/sec                   batch loss = 1091.3672380447388 | accuracy = 0.5462025316455696


Epoch[1] Batch[400] Speed: 1.26333021363989 samples/sec                   batch loss = 1105.4261455535889 | accuracy = 0.5475


Epoch[1] Batch[405] Speed: 1.2696503810642676 samples/sec                   batch loss = 1119.463062763214 | accuracy = 0.5456790123456791


Epoch[1] Batch[410] Speed: 1.2587299303172121 samples/sec                   batch loss = 1133.1930508613586 | accuracy = 0.5457317073170732


Epoch[1] Batch[415] Speed: 1.2618956788169728 samples/sec                   batch loss = 1147.3529605865479 | accuracy = 0.5451807228915663


Epoch[1] Batch[420] Speed: 1.258205170221486 samples/sec                   batch loss = 1160.6914839744568 | accuracy = 0.5482142857142858


Epoch[1] Batch[425] Speed: 1.2580424221110276 samples/sec                   batch loss = 1174.1926951408386 | accuracy = 0.5476470588235294


Epoch[1] Batch[430] Speed: 1.2598888019953702 samples/sec                   batch loss = 1187.5228612422943 | accuracy = 0.5476744186046512


Epoch[1] Batch[435] Speed: 1.2627385018666493 samples/sec                   batch loss = 1200.7266190052032 | accuracy = 0.5494252873563218


Epoch[1] Batch[440] Speed: 1.2596549654483427 samples/sec                   batch loss = 1214.6973435878754 | accuracy = 0.5482954545454546


Epoch[1] Batch[445] Speed: 1.262631495766698 samples/sec                   batch loss = 1228.6558785438538 | accuracy = 0.5483146067415731


Epoch[1] Batch[450] Speed: 1.2560974258457605 samples/sec                   batch loss = 1242.1169307231903 | accuracy = 0.5494444444444444


Epoch[1] Batch[455] Speed: 1.2636799102412417 samples/sec                   batch loss = 1255.1268610954285 | accuracy = 0.5527472527472528


Epoch[1] Batch[460] Speed: 1.263399186079278 samples/sec                   batch loss = 1268.3888261318207 | accuracy = 0.5543478260869565


Epoch[1] Batch[465] Speed: 1.2571068482106296 samples/sec                   batch loss = 1281.9938752651215 | accuracy = 0.5548387096774193


Epoch[1] Batch[470] Speed: 1.259450713299199 samples/sec                   batch loss = 1295.584941148758 | accuracy = 0.5558510638297872


Epoch[1] Batch[475] Speed: 1.2623811571515262 samples/sec                   batch loss = 1309.6428530216217 | accuracy = 0.5557894736842105


Epoch[1] Batch[480] Speed: 1.2522322429865038 samples/sec                   batch loss = 1322.9686715602875 | accuracy = 0.55625


Epoch[1] Batch[485] Speed: 1.2565048584553555 samples/sec                   batch loss = 1336.7673046588898 | accuracy = 0.5567010309278351


Epoch[1] Batch[490] Speed: 1.2559140686465693 samples/sec                   batch loss = 1350.0243980884552 | accuracy = 0.5571428571428572


Epoch[1] Batch[495] Speed: 1.2611326579917213 samples/sec                   batch loss = 1363.863319158554 | accuracy = 0.5575757575757576


Epoch[1] Batch[500] Speed: 1.2617505730900052 samples/sec                   batch loss = 1377.5954947471619 | accuracy = 0.5565


Epoch[1] Batch[505] Speed: 1.2655453693342926 samples/sec                   batch loss = 1390.911687374115 | accuracy = 0.5574257425742575


Epoch[1] Batch[510] Speed: 1.2619313672218884 samples/sec                   batch loss = 1403.6563639640808 | accuracy = 0.5602941176470588


Epoch[1] Batch[515] Speed: 1.259402118608825 samples/sec                   batch loss = 1417.4543342590332 | accuracy = 0.5601941747572815


Epoch[1] Batch[520] Speed: 1.2589369717196437 samples/sec                   batch loss = 1430.8811559677124 | accuracy = 0.5610576923076923


Epoch[1] Batch[525] Speed: 1.2580811947965587 samples/sec                   batch loss = 1445.0162847042084 | accuracy = 0.56


Epoch[1] Batch[530] Speed: 1.2568365687762126 samples/sec                   batch loss = 1458.2962381839752 | accuracy = 0.560377358490566


Epoch[1] Batch[535] Speed: 1.257866041278048 samples/sec                   batch loss = 1471.027664422989 | accuracy = 0.5630841121495327


Epoch[1] Batch[540] Speed: 1.2563137621383942 samples/sec                   batch loss = 1484.4343128204346 | accuracy = 0.5634259259259259


Epoch[1] Batch[545] Speed: 1.2586061347200603 samples/sec                   batch loss = 1497.5051345825195 | accuracy = 0.5651376146788991


Epoch[1] Batch[550] Speed: 1.2562585422721402 samples/sec                   batch loss = 1511.050938129425 | accuracy = 0.5663636363636364


Epoch[1] Batch[555] Speed: 1.2626389076827695 samples/sec                   batch loss = 1524.1985366344452 | accuracy = 0.5662162162162162


Epoch[1] Batch[560] Speed: 1.2650317954187091 samples/sec                   batch loss = 1537.3785684108734 | accuracy = 0.5665178571428572


Epoch[1] Batch[565] Speed: 1.2573233439229317 samples/sec                   batch loss = 1550.5680859088898 | accuracy = 0.5676991150442477


Epoch[1] Batch[570] Speed: 1.2511465588543262 samples/sec                   batch loss = 1564.4989984035492 | accuracy = 0.5666666666666667


Epoch[1] Batch[575] Speed: 1.2554711293624479 samples/sec                   batch loss = 1577.2429234981537 | accuracy = 0.5682608695652174


Epoch[1] Batch[580] Speed: 1.257868493292829 samples/sec                   batch loss = 1590.3744280338287 | accuracy = 0.569396551724138


Epoch[1] Batch[585] Speed: 1.2527364101652758 samples/sec                   batch loss = 1603.8579497337341 | accuracy = 0.5700854700854701


Epoch[1] Batch[590] Speed: 1.2532837673716548 samples/sec                   batch loss = 1617.4520119428635 | accuracy = 0.5703389830508474


Epoch[1] Batch[595] Speed: 1.2564145250343193 samples/sec                   batch loss = 1630.3358458280563 | accuracy = 0.5714285714285714


Epoch[1] Batch[600] Speed: 1.2582221550592412 samples/sec                   batch loss = 1643.3692945241928 | accuracy = 0.57125


Epoch[1] Batch[605] Speed: 1.2578840543786989 samples/sec                   batch loss = 1656.5720361471176 | accuracy = 0.5710743801652892


Epoch[1] Batch[610] Speed: 1.2563257098351057 samples/sec                   batch loss = 1669.664367556572 | accuracy = 0.5709016393442623


Epoch[1] Batch[615] Speed: 1.2584134548671404 samples/sec                   batch loss = 1682.1128712892532 | accuracy = 0.5719512195121951


Epoch[1] Batch[620] Speed: 1.255730201041486 samples/sec                   batch loss = 1695.503482222557 | accuracy = 0.5709677419354838


Epoch[1] Batch[625] Speed: 1.2697003464026704 samples/sec                   batch loss = 1709.6242340803146 | accuracy = 0.5712


Epoch[1] Batch[630] Speed: 1.259933176413832 samples/sec                   batch loss = 1722.9438527822495 | accuracy = 0.5714285714285714


Epoch[1] Batch[635] Speed: 1.2671327135437047 samples/sec                   batch loss = 1736.3993493318558 | accuracy = 0.5712598425196851


Epoch[1] Batch[640] Speed: 1.2616596736211092 samples/sec                   batch loss = 1748.0282250642776 | accuracy = 0.5734375


Epoch[1] Batch[645] Speed: 1.261207079237019 samples/sec                   batch loss = 1761.5206462144852 | accuracy = 0.5732558139534883


Epoch[1] Batch[650] Speed: 1.2624299820137836 samples/sec                   batch loss = 1774.9380124807358 | accuracy = 0.573076923076923


Epoch[1] Batch[655] Speed: 1.2663800760980688 samples/sec                   batch loss = 1787.375432252884 | accuracy = 0.5725190839694656


Epoch[1] Batch[660] Speed: 1.2609801455869531 samples/sec                   batch loss = 1800.0619838237762 | accuracy = 0.5738636363636364


Epoch[1] Batch[665] Speed: 1.26820788817116 samples/sec                   batch loss = 1811.841768026352 | accuracy = 0.575187969924812


Epoch[1] Batch[670] Speed: 1.254230815868249 samples/sec                   batch loss = 1824.0607714653015 | accuracy = 0.5764925373134329


Epoch[1] Batch[675] Speed: 1.2674921792274478 samples/sec                   batch loss = 1836.9152755737305 | accuracy = 0.5777777777777777


Epoch[1] Batch[680] Speed: 1.257831996983701 samples/sec                   batch loss = 1850.231142282486 | accuracy = 0.5779411764705882


Epoch[1] Batch[685] Speed: 1.254819553482358 samples/sec                   batch loss = 1863.1072525978088 | accuracy = 0.5777372262773722


Epoch[1] Batch[690] Speed: 1.261904980387585 samples/sec                   batch loss = 1876.6453177928925 | accuracy = 0.5786231884057971


Epoch[1] Batch[695] Speed: 1.2569622766330297 samples/sec                   batch loss = 1891.7959249019623 | accuracy = 0.5776978417266188


Epoch[1] Batch[700] Speed: 1.253886141250021 samples/sec                   batch loss = 1904.4755640029907 | accuracy = 0.5785714285714286


Epoch[1] Batch[705] Speed: 1.248776583890848 samples/sec                   batch loss = 1918.5880670547485 | accuracy = 0.5780141843971631


Epoch[1] Batch[710] Speed: 1.2517891878949656 samples/sec                   batch loss = 1931.436686515808 | accuracy = 0.5788732394366197


Epoch[1] Batch[715] Speed: 1.2535612314225655 samples/sec                   batch loss = 1944.8541793823242 | accuracy = 0.5783216783216784


Epoch[1] Batch[720] Speed: 1.2536570566899456 samples/sec                   batch loss = 1958.8860402107239 | accuracy = 0.578125


Epoch[1] Batch[725] Speed: 1.2545770872620152 samples/sec                   batch loss = 1972.4041934013367 | accuracy = 0.5786206896551724


Epoch[1] Batch[730] Speed: 1.2558797538875666 samples/sec                   batch loss = 1985.7188098430634 | accuracy = 0.5780821917808219


Epoch[1] Batch[735] Speed: 1.2521689702537384 samples/sec                   batch loss = 1999.4359140396118 | accuracy = 0.5772108843537415


Epoch[1] Batch[740] Speed: 1.2544719288508155 samples/sec                   batch loss = 2011.2685091495514 | accuracy = 0.5790540540540541


Epoch[1] Batch[745] Speed: 1.2585917831826186 samples/sec                   batch loss = 2025.4915978908539 | accuracy = 0.5788590604026845


Epoch[1] Batch[750] Speed: 1.2523715216009845 samples/sec                   batch loss = 2038.595088481903 | accuracy = 0.5786666666666667


Epoch[1] Batch[755] Speed: 1.2488962220575444 samples/sec                   batch loss = 2050.3664890527725 | accuracy = 0.5798013245033112


Epoch[1] Batch[760] Speed: 1.258065911846836 samples/sec                   batch loss = 2061.8483978509903 | accuracy = 0.58125


Epoch[1] Batch[765] Speed: 1.2588144576997042 samples/sec                   batch loss = 2074.64173412323 | accuracy = 0.5813725490196079


Epoch[1] Batch[770] Speed: 1.2571719398219765 samples/sec                   batch loss = 2086.513407945633 | accuracy = 0.5824675324675325


Epoch[1] Batch[775] Speed: 1.2572392993092505 samples/sec                   batch loss = 2098.131780385971 | accuracy = 0.5835483870967741


Epoch[1] Batch[780] Speed: 1.2601204538568413 samples/sec                   batch loss = 2111.8287909030914 | accuracy = 0.5833333333333334


Epoch[1] Batch[785] Speed: 1.2588371261907008 samples/sec                   batch loss = 2123.4427983760834 | accuracy = 0.5847133757961783


[Epoch 1] training: accuracy=0.5850253807106599
[Epoch 1] time cost: 644.4986162185669
[Epoch 1] validation: validation accuracy=0.6666666666666666


Epoch[2] Batch[5] Speed: 1.2626433738793403 samples/sec                   batch loss = 9.974749207496643 | accuracy = 0.85


Epoch[2] Batch[10] Speed: 1.2657890380776007 samples/sec                   batch loss = 23.594755172729492 | accuracy = 0.7


Epoch[2] Batch[15] Speed: 1.2607466617170267 samples/sec                   batch loss = 35.22732329368591 | accuracy = 0.7166666666666667


Epoch[2] Batch[20] Speed: 1.2665921281599282 samples/sec                   batch loss = 47.079455852508545 | accuracy = 0.7125


Epoch[2] Batch[25] Speed: 1.2632360427466285 samples/sec                   batch loss = 59.49352717399597 | accuracy = 0.69


Epoch[2] Batch[30] Speed: 1.2624621856921991 samples/sec                   batch loss = 70.2931478023529 | accuracy = 0.7083333333333334


Epoch[2] Batch[35] Speed: 1.267636597286173 samples/sec                   batch loss = 84.27076005935669 | accuracy = 0.6857142857142857


Epoch[2] Batch[40] Speed: 1.264361017938819 samples/sec                   batch loss = 96.49757754802704 | accuracy = 0.6875


Epoch[2] Batch[45] Speed: 1.2666875651946492 samples/sec                   batch loss = 108.99847662448883 | accuracy = 0.6888888888888889


Epoch[2] Batch[50] Speed: 1.266258211760298 samples/sec                   batch loss = 118.60130763053894 | accuracy = 0.7


Epoch[2] Batch[55] Speed: 1.2666016903407482 samples/sec                   batch loss = 130.4490520954132 | accuracy = 0.6954545454545454


Epoch[2] Batch[60] Speed: 1.2651606745435862 samples/sec                   batch loss = 142.52118682861328 | accuracy = 0.6958333333333333


Epoch[2] Batch[65] Speed: 1.2697936575921398 samples/sec                   batch loss = 153.42998468875885 | accuracy = 0.7115384615384616


Epoch[2] Batch[70] Speed: 1.266172490668164 samples/sec                   batch loss = 166.16372859477997 | accuracy = 0.7142857142857143


Epoch[2] Batch[75] Speed: 1.269355089362144 samples/sec                   batch loss = 176.84346842765808 | accuracy = 0.72


Epoch[2] Batch[80] Speed: 1.265358671053368 samples/sec                   batch loss = 188.80896627902985 | accuracy = 0.71875


Epoch[2] Batch[85] Speed: 1.262868910369754 samples/sec                   batch loss = 202.71030712127686 | accuracy = 0.7176470588235294


Epoch[2] Batch[90] Speed: 1.2640708479344684 samples/sec                   batch loss = 216.01990044116974 | accuracy = 0.7111111111111111


Epoch[2] Batch[95] Speed: 1.2632389913192497 samples/sec                   batch loss = 227.2017627954483 | accuracy = 0.7052631578947368


Epoch[2] Batch[100] Speed: 1.2681306255234364 samples/sec                   batch loss = 239.65432381629944 | accuracy = 0.705


Epoch[2] Batch[105] Speed: 1.2642444957242531 samples/sec                   batch loss = 249.95434069633484 | accuracy = 0.7023809523809523


Epoch[2] Batch[110] Speed: 1.2681251618960847 samples/sec                   batch loss = 260.3481113910675 | accuracy = 0.7022727272727273


Epoch[2] Batch[115] Speed: 1.267123813252659 samples/sec                   batch loss = 272.0985018014908 | accuracy = 0.6978260869565217


Epoch[2] Batch[120] Speed: 1.2608682255227346 samples/sec                   batch loss = 284.6304278373718 | accuracy = 0.6979166666666666


Epoch[2] Batch[125] Speed: 1.2564450111117975 samples/sec                   batch loss = 297.7426367998123 | accuracy = 0.696


Epoch[2] Batch[130] Speed: 1.2567395037347386 samples/sec                   batch loss = 311.6333295106888 | accuracy = 0.6884615384615385


Epoch[2] Batch[135] Speed: 1.252406299350141 samples/sec                   batch loss = 323.3313707113266 | accuracy = 0.6907407407407408


Epoch[2] Batch[140] Speed: 1.2549956441881442 samples/sec                   batch loss = 335.36145544052124 | accuracy = 0.6910714285714286


Epoch[2] Batch[145] Speed: 1.2541604969871232 samples/sec                   batch loss = 351.2245659828186 | accuracy = 0.6844827586206896


Epoch[2] Batch[150] Speed: 1.2559168891211165 samples/sec                   batch loss = 364.1351320743561 | accuracy = 0.68


Epoch[2] Batch[155] Speed: 1.2532193585947988 samples/sec                   batch loss = 375.96171736717224 | accuracy = 0.6806451612903226


Epoch[2] Batch[160] Speed: 1.2518586806518057 samples/sec                   batch loss = 387.38009238243103 | accuracy = 0.6875


Epoch[2] Batch[165] Speed: 1.2585337194477375 samples/sec                   batch loss = 399.4566762447357 | accuracy = 0.6909090909090909


Epoch[2] Batch[170] Speed: 1.2552460671844656 samples/sec                   batch loss = 412.61040329933167 | accuracy = 0.6867647058823529


Epoch[2] Batch[175] Speed: 1.255265977612305 samples/sec                   batch loss = 426.43248438835144 | accuracy = 0.6842857142857143


Epoch[2] Batch[180] Speed: 1.2554017046664545 samples/sec                   batch loss = 437.98321425914764 | accuracy = 0.6833333333333333


Epoch[2] Batch[185] Speed: 1.2563370932851043 samples/sec                   batch loss = 451.56063067913055 | accuracy = 0.6824324324324325


Epoch[2] Batch[190] Speed: 1.2554679350942584 samples/sec                   batch loss = 462.93566393852234 | accuracy = 0.6842105263157895


Epoch[2] Batch[195] Speed: 1.252406299350141 samples/sec                   batch loss = 473.68660950660706 | accuracy = 0.6871794871794872


Epoch[2] Batch[200] Speed: 1.2584705635170652 samples/sec                   batch loss = 485.9646873474121 | accuracy = 0.68625


Epoch[2] Batch[205] Speed: 1.2614505976305606 samples/sec                   batch loss = 498.55645751953125 | accuracy = 0.6841463414634147


Epoch[2] Batch[210] Speed: 1.264811588320399 samples/sec                   batch loss = 512.2143520116806 | accuracy = 0.6821428571428572


Epoch[2] Batch[215] Speed: 1.2645064388227798 samples/sec                   batch loss = 525.1423602104187 | accuracy = 0.6790697674418604


Epoch[2] Batch[220] Speed: 1.2631843974002093 samples/sec                   batch loss = 537.1416584253311 | accuracy = 0.678409090909091


Epoch[2] Batch[225] Speed: 1.260832123457601 samples/sec                   batch loss = 549.528604388237 | accuracy = 0.6766666666666666


Epoch[2] Batch[230] Speed: 1.260951903086906 samples/sec                   batch loss = 559.1833139657974 | accuracy = 0.6804347826086956


Epoch[2] Batch[235] Speed: 1.257171186190223 samples/sec                   batch loss = 569.9882345199585 | accuracy = 0.6819148936170213


Epoch[2] Batch[240] Speed: 1.2585980147310412 samples/sec                   batch loss = 583.7520725727081 | accuracy = 0.678125


Epoch[2] Batch[245] Speed: 1.257626355826992 samples/sec                   batch loss = 595.254105091095 | accuracy = 0.6806122448979591


Epoch[2] Batch[250] Speed: 1.2611626150419275 samples/sec                   batch loss = 606.6062461137772 | accuracy = 0.682


Epoch[2] Batch[255] Speed: 1.2602477661631624 samples/sec                   batch loss = 618.4182150363922 | accuracy = 0.6823529411764706


Epoch[2] Batch[260] Speed: 1.2640286576779165 samples/sec                   batch loss = 631.8186657428741 | accuracy = 0.6788461538461539


Epoch[2] Batch[265] Speed: 1.2643401509758099 samples/sec                   batch loss = 645.2918980121613 | accuracy = 0.6773584905660377


Epoch[2] Batch[270] Speed: 1.2647378851852606 samples/sec                   batch loss = 657.9975918531418 | accuracy = 0.6768518518518518


Epoch[2] Batch[275] Speed: 1.261214095195969 samples/sec                   batch loss = 671.6749976873398 | accuracy = 0.6781818181818182


Epoch[2] Batch[280] Speed: 1.2622406880007198 samples/sec                   batch loss = 683.4595816135406 | accuracy = 0.6776785714285715


Epoch[2] Batch[285] Speed: 1.2613243697782242 samples/sec                   batch loss = 695.0213998556137 | accuracy = 0.6780701754385965


Epoch[2] Batch[290] Speed: 1.262963501901196 samples/sec                   batch loss = 708.077831864357 | accuracy = 0.6775862068965517


Epoch[2] Batch[295] Speed: 1.2634901460785768 samples/sec                   batch loss = 720.944256901741 | accuracy = 0.6779661016949152


Epoch[2] Batch[300] Speed: 1.2608939056777524 samples/sec                   batch loss = 732.6829166412354 | accuracy = 0.6783333333333333


Epoch[2] Batch[305] Speed: 1.2603379888939639 samples/sec                   batch loss = 745.1206158399582 | accuracy = 0.6778688524590164


Epoch[2] Batch[310] Speed: 1.2650317954187091 samples/sec                   batch loss = 756.5191091299057 | accuracy = 0.6782258064516129


Epoch[2] Batch[315] Speed: 1.2645313142899093 samples/sec                   batch loss = 768.3501108884811 | accuracy = 0.6785714285714286


Epoch[2] Batch[320] Speed: 1.262210014988786 samples/sec                   batch loss = 780.368190407753 | accuracy = 0.67734375


Epoch[2] Batch[325] Speed: 1.2592433138696169 samples/sec                   batch loss = 793.8154963254929 | accuracy = 0.6753846153846154


Epoch[2] Batch[330] Speed: 1.26389962821022 samples/sec                   batch loss = 806.7329787015915 | accuracy = 0.6734848484848485


Epoch[2] Batch[335] Speed: 1.270843704522682 samples/sec                   batch loss = 819.2583068609238 | accuracy = 0.673134328358209


Epoch[2] Batch[340] Speed: 1.263753204537672 samples/sec                   batch loss = 830.7971112728119 | accuracy = 0.6735294117647059


Epoch[2] Batch[345] Speed: 1.2628380167082074 samples/sec                   batch loss = 841.9247858524323 | accuracy = 0.6753623188405797


Epoch[2] Batch[350] Speed: 1.263753299730598 samples/sec                   batch loss = 851.6910725831985 | accuracy = 0.6785714285714286


Epoch[2] Batch[355] Speed: 1.2649508181963287 samples/sec                   batch loss = 865.4638148546219 | accuracy = 0.678169014084507


Epoch[2] Batch[360] Speed: 1.2645444672312938 samples/sec                   batch loss = 877.2779568433762 | accuracy = 0.6784722222222223


Epoch[2] Batch[365] Speed: 1.2662541978046693 samples/sec                   batch loss = 888.5112663507462 | accuracy = 0.6787671232876712


Epoch[2] Batch[370] Speed: 1.2605015209294879 samples/sec                   batch loss = 902.4599956274033 | accuracy = 0.6777027027027027


Epoch[2] Batch[375] Speed: 1.2623470579799367 samples/sec                   batch loss = 916.3987302780151 | accuracy = 0.6766666666666666


Epoch[2] Batch[380] Speed: 1.2657084413224156 samples/sec                   batch loss = 927.9574820995331 | accuracy = 0.6763157894736842


Epoch[2] Batch[385] Speed: 1.2609699099038565 samples/sec                   batch loss = 941.0231692790985 | accuracy = 0.6766233766233766


Epoch[2] Batch[390] Speed: 1.266223711686603 samples/sec                   batch loss = 953.5678218603134 | accuracy = 0.675


Epoch[2] Batch[395] Speed: 1.2619638302369471 samples/sec                   batch loss = 967.0099085569382 | accuracy = 0.6734177215189874


Epoch[2] Batch[400] Speed: 1.2578098361756955 samples/sec                   batch loss = 976.0060361623764 | accuracy = 0.67625


Epoch[2] Batch[405] Speed: 1.2561081468461888 samples/sec                   batch loss = 984.411728322506 | accuracy = 0.678395061728395


Epoch[2] Batch[410] Speed: 1.2595844153192082 samples/sec                   batch loss = 996.0330345034599 | accuracy = 0.6786585365853659


Epoch[2] Batch[415] Speed: 1.259684190269705 samples/sec                   batch loss = 1009.7003654837608 | accuracy = 0.677710843373494


Epoch[2] Batch[420] Speed: 1.2633087148053102 samples/sec                   batch loss = 1020.4406358599663 | accuracy = 0.6785714285714286


Epoch[2] Batch[425] Speed: 1.2621861803338923 samples/sec                   batch loss = 1032.989272415638 | accuracy = 0.6794117647058824


Epoch[2] Batch[430] Speed: 1.2624231425060006 samples/sec                   batch loss = 1046.9997436404228 | accuracy = 0.6784883720930233


Epoch[2] Batch[435] Speed: 1.2600775804839126 samples/sec                   batch loss = 1058.690079152584 | accuracy = 0.6775862068965517


Epoch[2] Batch[440] Speed: 1.2593021992881435 samples/sec                   batch loss = 1069.1756774783134 | accuracy = 0.6789772727272727


Epoch[2] Batch[445] Speed: 1.2620119583063758 samples/sec                   batch loss = 1081.2056083083153 | accuracy = 0.6792134831460674


Epoch[2] Batch[450] Speed: 1.2659110031579837 samples/sec                   batch loss = 1092.9958651661873 | accuracy = 0.68


Epoch[2] Batch[455] Speed: 1.2602967101548694 samples/sec                   batch loss = 1109.2664989829063 | accuracy = 0.6780219780219781


Epoch[2] Batch[460] Speed: 1.2631557707676977 samples/sec                   batch loss = 1120.4690029025078 | accuracy = 0.6777173913043478


Epoch[2] Batch[465] Speed: 1.266392693981206 samples/sec                   batch loss = 1131.0794044137 | accuracy = 0.6779569892473118


Epoch[2] Batch[470] Speed: 1.2630666656026437 samples/sec                   batch loss = 1143.8959680199623 | accuracy = 0.676595744680851


Epoch[2] Batch[475] Speed: 1.2603136568329614 samples/sec                   batch loss = 1156.0355173945427 | accuracy = 0.6768421052631579


Epoch[2] Batch[480] Speed: 1.2670912756182966 samples/sec                   batch loss = 1168.231484591961 | accuracy = 0.6776041666666667


Epoch[2] Batch[485] Speed: 1.2679858073879005 samples/sec                   batch loss = 1180.9607452750206 | accuracy = 0.677319587628866


Epoch[2] Batch[490] Speed: 1.2675575846558074 samples/sec                   batch loss = 1193.3091499209404 | accuracy = 0.676530612244898


Epoch[2] Batch[495] Speed: 1.2634931909912899 samples/sec                   batch loss = 1203.983184158802 | accuracy = 0.6777777777777778


Epoch[2] Batch[500] Speed: 1.2643111861048586 samples/sec                   batch loss = 1214.7075620293617 | accuracy = 0.679


Epoch[2] Batch[505] Speed: 1.2616748542497411 samples/sec                   batch loss = 1230.6488791108131 | accuracy = 0.6772277227722773


Epoch[2] Batch[510] Speed: 1.2630322441348087 samples/sec                   batch loss = 1244.6845551133156 | accuracy = 0.6754901960784314


Epoch[2] Batch[515] Speed: 1.260029694470203 samples/sec                   batch loss = 1259.3309378027916 | accuracy = 0.6737864077669903


Epoch[2] Batch[520] Speed: 1.2565082462113322 samples/sec                   batch loss = 1271.9093807339668 | accuracy = 0.6735576923076924


Epoch[2] Batch[525] Speed: 1.2649241142007372 samples/sec                   batch loss = 1283.2521602511406 | accuracy = 0.6747619047619048


Epoch[2] Batch[530] Speed: 1.2590277627168835 samples/sec                   batch loss = 1298.6396114230156 | accuracy = 0.6745283018867925


Epoch[2] Batch[535] Speed: 1.2642193457506101 samples/sec                   batch loss = 1310.5499976277351 | accuracy = 0.6742990654205607


Epoch[2] Batch[540] Speed: 1.263294065576143 samples/sec                   batch loss = 1324.5331793427467 | accuracy = 0.6731481481481482


Epoch[2] Batch[545] Speed: 1.2598746104103637 samples/sec                   batch loss = 1337.7418182492256 | accuracy = 0.6729357798165138


Epoch[2] Batch[550] Speed: 1.261818613935256 samples/sec                   batch loss = 1349.1010285019875 | accuracy = 0.6722727272727272


Epoch[2] Batch[555] Speed: 1.258610761272287 samples/sec                   batch loss = 1362.4474497437477 | accuracy = 0.6716216216216216


Epoch[2] Batch[560] Speed: 1.262220365776626 samples/sec                   batch loss = 1376.2684057354927 | accuracy = 0.6709821428571429


Epoch[2] Batch[565] Speed: 1.2685892587404084 samples/sec                   batch loss = 1388.9025287032127 | accuracy = 0.6707964601769911


Epoch[2] Batch[570] Speed: 1.2605169578182676 samples/sec                   batch loss = 1399.721035540104 | accuracy = 0.6710526315789473


Epoch[2] Batch[575] Speed: 1.26295408964668 samples/sec                   batch loss = 1412.449464738369 | accuracy = 0.6704347826086956


Epoch[2] Batch[580] Speed: 1.2651218458702225 samples/sec                   batch loss = 1425.7092009186745 | accuracy = 0.6698275862068965


Epoch[2] Batch[585] Speed: 1.2635370583937597 samples/sec                   batch loss = 1438.828899204731 | accuracy = 0.67008547008547


Epoch[2] Batch[590] Speed: 1.262921480597351 samples/sec                   batch loss = 1453.3501092791557 | accuracy = 0.6677966101694915


Epoch[2] Batch[595] Speed: 1.2606057983062702 samples/sec                   batch loss = 1466.0682947039604 | accuracy = 0.6672268907563025


Epoch[2] Batch[600] Speed: 1.2693891840233498 samples/sec                   batch loss = 1479.10992115736 | accuracy = 0.6666666666666666


Epoch[2] Batch[605] Speed: 1.2610593829780596 samples/sec                   batch loss = 1489.0748506188393 | accuracy = 0.6681818181818182


Epoch[2] Batch[610] Speed: 1.2581359145747437 samples/sec                   batch loss = 1500.6331614851952 | accuracy = 0.6692622950819672


Epoch[2] Batch[615] Speed: 1.2627446795042927 samples/sec                   batch loss = 1514.364553630352 | accuracy = 0.6686991869918699


Epoch[2] Batch[620] Speed: 1.2625341986815088 samples/sec                   batch loss = 1524.9779906868935 | accuracy = 0.6697580645161291


Epoch[2] Batch[625] Speed: 1.2626019439921294 samples/sec                   batch loss = 1538.9954074025154 | accuracy = 0.6688


Epoch[2] Batch[630] Speed: 1.2671781739696841 samples/sec                   batch loss = 1549.5226774811745 | accuracy = 0.6694444444444444


Epoch[2] Batch[635] Speed: 1.2673034702155037 samples/sec                   batch loss = 1559.41747289896 | accuracy = 0.6700787401574804


Epoch[2] Batch[640] Speed: 1.26221675722581 samples/sec                   batch loss = 1571.886278808117 | accuracy = 0.669921875


Epoch[2] Batch[645] Speed: 1.2607178612572947 samples/sec                   batch loss = 1583.631286084652 | accuracy = 0.67015503875969


Epoch[2] Batch[650] Speed: 1.2517820896068057 samples/sec                   batch loss = 1599.0025414824486 | accuracy = 0.6696153846153846


Epoch[2] Batch[655] Speed: 1.2526323082840958 samples/sec                   batch loss = 1614.329133927822 | accuracy = 0.6690839694656489


Epoch[2] Batch[660] Speed: 1.2601132607694094 samples/sec                   batch loss = 1625.5091868042946 | accuracy = 0.6704545454545454


Epoch[2] Batch[665] Speed: 1.2575309596583386 samples/sec                   batch loss = 1635.771133363247 | accuracy = 0.6706766917293233


Epoch[2] Batch[670] Speed: 1.254959689885254 samples/sec                   batch loss = 1646.3063886761665 | accuracy = 0.6720149253731343


Epoch[2] Batch[675] Speed: 1.2571708093746852 samples/sec                   batch loss = 1658.172508776188 | accuracy = 0.6725925925925926


Epoch[2] Batch[680] Speed: 1.257706303683472 samples/sec                   batch loss = 1669.6019912362099 | accuracy = 0.6724264705882353


Epoch[2] Batch[685] Speed: 1.259141340754744 samples/sec                   batch loss = 1680.6050935387611 | accuracy = 0.6733576642335767


Epoch[2] Batch[690] Speed: 1.2592129753783012 samples/sec                   batch loss = 1691.4922574162483 | accuracy = 0.6731884057971015


Epoch[2] Batch[695] Speed: 1.2575961894658916 samples/sec                   batch loss = 1706.0607009530067 | accuracy = 0.6726618705035972


Epoch[2] Batch[700] Speed: 1.2590415572921072 samples/sec                   batch loss = 1715.806392967701 | accuracy = 0.6742857142857143


Epoch[2] Batch[705] Speed: 1.2635359164704836 samples/sec                   batch loss = 1727.487644970417 | accuracy = 0.6737588652482269


Epoch[2] Batch[710] Speed: 1.2606968301739172 samples/sec                   batch loss = 1740.595166862011 | accuracy = 0.673943661971831


Epoch[2] Batch[715] Speed: 1.2625314434159987 samples/sec                   batch loss = 1750.391698539257 | accuracy = 0.6748251748251748


Epoch[2] Batch[720] Speed: 1.2665821836450313 samples/sec                   batch loss = 1761.3959657549858 | accuracy = 0.6756944444444445


Epoch[2] Batch[725] Speed: 1.2507109846674513 samples/sec                   batch loss = 1775.7875033020973 | accuracy = 0.6748275862068965


Epoch[2] Batch[730] Speed: 1.2532737498742224 samples/sec                   batch loss = 1784.706524193287 | accuracy = 0.6753424657534246


Epoch[2] Batch[735] Speed: 1.2583776820061958 samples/sec                   batch loss = 1795.8510140776634 | accuracy = 0.6755102040816326


Epoch[2] Batch[740] Speed: 1.2516690882300827 samples/sec                   batch loss = 1809.2449014782906 | accuracy = 0.6753378378378379


Epoch[2] Batch[745] Speed: 1.255860858067468 samples/sec                   batch loss = 1822.3972801566124 | accuracy = 0.674496644295302


Epoch[2] Batch[750] Speed: 1.256619111195166 samples/sec                   batch loss = 1833.8354116082191 | accuracy = 0.6743333333333333


Epoch[2] Batch[755] Speed: 1.25448374773933 samples/sec                   batch loss = 1843.5684166550636 | accuracy = 0.6754966887417219


Epoch[2] Batch[760] Speed: 1.2564417177914111 samples/sec                   batch loss = 1856.6778561472893 | accuracy = 0.6756578947368421


Epoch[2] Batch[765] Speed: 1.2623946453545913 samples/sec                   batch loss = 1870.4129611849785 | accuracy = 0.6754901960784314


Epoch[2] Batch[770] Speed: 1.2555339845931341 samples/sec                   batch loss = 1881.930552303791 | accuracy = 0.6753246753246753


Epoch[2] Batch[775] Speed: 1.2587081155955882 samples/sec                   batch loss = 1894.4950467944145 | accuracy = 0.6748387096774193


Epoch[2] Batch[780] Speed: 1.2610863980011695 samples/sec                   batch loss = 1905.051635324955 | accuracy = 0.6753205128205129


Epoch[2] Batch[785] Speed: 1.259977364708321 samples/sec                   batch loss = 1918.0761018395424 | accuracy = 0.6745222929936305


[Epoch 2] training: accuracy=0.674492385786802
[Epoch 2] time cost: 640.7119791507721
[Epoch 2] validation: validation accuracy=0.7266666666666667
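
The progress lines printed above follow a regular pattern, so they can be turned into structured records for plotting or comparing runs. The following is a minimal sketch, assuming the line format shown in this output; the regular expression and the `parse_log_line` helper are hypothetical names introduced here for illustration and are not part of MXNet.

```python
import re

# Pattern inferred from the progress lines shown above. This is an assumption
# about the log format of this tutorial, not an MXNet-defined format.
LOG_PATTERN = re.compile(
    r"Epoch\[(?P<epoch>\d+)\] Batch\[(?P<batch>\d+)\] "
    r"Speed: (?P<speed>[\d.]+) samples/sec\s+"
    r"batch loss = (?P<loss>[\d.]+) \| accuracy = (?P<accuracy>[\d.]+)"
)


def parse_log_line(line):
    """Return the fields of one progress line as a dict, or None if it does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return {
        "epoch": int(match["epoch"]),
        "batch": int(match["batch"]),
        "speed": float(match["speed"]),
        "loss": float(match["loss"]),
        "accuracy": float(match["accuracy"]),
    }


# One line copied from the output above.
sample = (
    "Epoch[2] Batch[785] Speed: 1.259977364708321 samples/sec"
    "                   batch loss = 1918.0761018395424 | accuracy = 0.6745222929936305"
)
record = parse_log_line(sample)
print(record["epoch"], record["batch"], record["accuracy"])
```

Summary lines such as `[Epoch 2] time cost: ...` do not match the pattern and return `None`, so they can be filtered out or handled by a separate pattern.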


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).