<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Number of GPUs that MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or `npx.gpu(0)` as the context, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[09:31:48] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data on the main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU each ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. In the example code here, you copy `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to `npx.cpu()` so that it still runs:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[09:31:48] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
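
The indexing rule above can be sketched in plain Python. The helper name `pick_gpu_ids` is hypothetical and not part of MXNet. In a real script, `num_available` would come from `npx.num_gpus()`, each returned ID would be passed to `npx.gpu(i)`, and an empty result would mean falling back to `npx.cpu()`.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return the 0-based IDs of the GPUs to use.

    Hypothetical helper for illustration only. num_available stands in
    for the value returned by npx.num_gpus().
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # Never ask for more GPUs than the machine has.
    count = min(num_available, num_requested)
    return list(range(count))


# On a machine with eight GPUs, the IDs run from 0 to 7.
all_ids = pick_gpu_ids(num_available=8, num_requested=8)
print(all_ids[-1])  # 7

# Requesting two GPUs on a one-GPU machine yields only GPU 0.
print(pick_gpu_ids(num_available=1, num_requested=2))  # [0]
```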

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.
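
To make the same-device rule concrete, here is a toy model in plain Python. `ToyArray` is a hypothetical stand-in, not MXNet's ndarray, and the exception type and message MXNet actually raises may differ. The sketch only shows the pattern: an operation checks that its inputs share a device, and an explicit `copyto` resolves a mismatch.

```python
class ToyArray:
    """Toy stand-in for a device-tagged array (not MXNet's ndarray)."""

    def __init__(self, values, device="cpu(0)"):
        self.values = list(values)
        self.device = device

    def copyto(self, device):
        # An explicit copy is the only way data changes device here.
        return ToyArray(self.values, device)

    def __add__(self, other):
        if self.device != other.device:
            raise ValueError(
                f"device mismatch: {self.device} vs {other.device}")
        # The result lives on the same device as the inputs.
        return ToyArray(
            [a + b for a, b in zip(self.values, other.values)], self.device)


x = ToyArray([1.0, 1.0], device="gpu(0)")
y = ToyArray([0.5, 0.25], device="gpu(1)")

try:
    x + y
except ValueError as err:
    print(err)  # device mismatch: gpu(0) vs gpu(1)

# Copy y to the device that holds x, then add.
z = x + y.copyto("gpu(0)")
print(z.values, z.device)  # [1.5, 1.25] gpu(0)
```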

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[09:31:48] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 8.953545, -7.410872]], ctx=gpu(0))
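
The two numbers in the output are raw scores from the final `nn.Dense(2)` layer, one per class. Assuming they are treated as logits, which is consistent with the `SoftmaxCrossEntropyLoss` used for training in this tutorial, a softmax turns them into class probabilities. The sketch below does this in plain Python on the values printed above. Because the input here is random noise, the resulting prediction carries no meaning; the point is only how to read the output.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtract the maximum for numerical stability.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Scores copied from the network output shown above.
logits = [8.953545, -7.410872]
probs = softmax(logits)

# The predicted class is the index of the largest probability.
predicted_class = probs.index(max(probs))
print(predicted_class)  # 0
```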

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, if there are *n* GPUs, you split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.
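
The idea can be checked on a tiny example in plain Python, without MXNet or GPUs. The sketch assumes a one-parameter model `w * x` with a squared-error loss; `split_batch` and `grad_sum` are hypothetical helpers written for this illustration. In the tutorial code, `gluon.utils.split_and_load` plays the role of the splitting step. The sketch shows that summing per-device gradients and dividing by the full batch size gives the same mean gradient as processing the whole batch on one device.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts equal, contiguous chunks."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by num_parts")
    size = len(batch) // num_parts
    return [batch[i * size:(i + 1) * size] for i in range(num_parts)]


def grad_sum(w, samples):
    """Sum of d/dw (w*x - y)**2 over the given (x, y) samples."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


w = 0.5
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 6.0)]

# Each "device" handles one shard and computes its own gradient sum.
shards = split_batch(batch, num_parts=2)
per_device = [grad_sum(w, shard) for shard in shards]

# Adding the per-device sums and dividing by the full batch size
# gives the same mean gradient as processing the batch on one device.
parallel_grad = sum(per_device) / len(batch)
single_grad = grad_sum(w, batch) / len(batch)
print(abs(parallel_grad - single_grad) < 1e-12)  # True
```

This is why the training loop in this tutorial passes the full `batch_size`, not the per-device shard size, to `trainer.step`.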

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
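
The helper above updates a single accuracy metric with per-device lists of labels and outputs. The following plain-Python sketch shows what that accumulation amounts to. `RunningAccuracy` is a hypothetical class written for illustration and is not the implementation of `gluon.metric.Accuracy`; it assumes each output row holds one score per class and that the predicted class is the index of the largest score.

```python
class RunningAccuracy:
    """Minimal running accuracy (a sketch, not gluon.metric.Accuracy)."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, label_list, output_list):
        # One (labels, scores) pair per device.
        for labels, scores in zip(label_list, output_list):
            for label, row in zip(labels, scores):
                predicted = max(range(len(row)), key=row.__getitem__)
                self.correct += int(predicted == label)
                self.total += 1

    def get(self):
        if self.total == 0:
            return "accuracy", float("nan")
        return "accuracy", self.correct / self.total


acc = RunningAccuracy()

# Two devices, two samples each: labels and per-class scores.
label_list = [[0, 1], [1, 0]]
output_list = [
    [[2.0, -1.0], [0.3, 0.9]],  # device 0: both predictions correct
    [[1.5, 0.2], [0.7, 0.1]],   # device 1: first wrong, second correct
]
acc.update(label_list, output_list)

_, accuracy = acc.get()
print(accuracy)  # 0.75
```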

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7729595089980413 samples/sec                   batch loss = 12.945581197738647 | accuracy = 0.6


Epoch[1] Batch[10] Speed: 1.2554215260901251 samples/sec                   batch loss = 26.36283564567566 | accuracy = 0.6


Epoch[1] Batch[15] Speed: 1.2525616074390913 samples/sec                   batch loss = 40.9498610496521 | accuracy = 0.5166666666666667


Epoch[1] Batch[20] Speed: 1.251707749214923 samples/sec                   batch loss = 55.34710097312927 | accuracy = 0.4875


Epoch[1] Batch[25] Speed: 1.2512312838845916 samples/sec                   batch loss = 69.31519365310669 | accuracy = 0.49


Epoch[1] Batch[30] Speed: 1.2543183035632512 samples/sec                   batch loss = 83.86077928543091 | accuracy = 0.4583333333333333


Epoch[1] Batch[35] Speed: 1.2556329309127878 samples/sec                   batch loss = 97.86843156814575 | accuracy = 0.4857142857142857


Epoch[1] Batch[40] Speed: 1.252352357268977 samples/sec                   batch loss = 112.73854064941406 | accuracy = 0.46875


Epoch[1] Batch[45] Speed: 1.2504338488241438 samples/sec                   batch loss = 126.92554473876953 | accuracy = 0.48333333333333334


Epoch[1] Batch[50] Speed: 1.260407297812297 samples/sec                   batch loss = 141.2408127784729 | accuracy = 0.475


Epoch[1] Batch[55] Speed: 1.2562767915716346 samples/sec                   batch loss = 155.4229211807251 | accuracy = 0.4772727272727273


Epoch[1] Batch[60] Speed: 1.2569359088562757 samples/sec                   batch loss = 170.0300154685974 | accuracy = 0.4708333333333333


Epoch[1] Batch[65] Speed: 1.252510644187711 samples/sec                   batch loss = 184.2542061805725 | accuracy = 0.46153846153846156


Epoch[1] Batch[70] Speed: 1.252823595942 samples/sec                   batch loss = 197.76717042922974 | accuracy = 0.4642857142857143


Epoch[1] Batch[75] Speed: 1.2631947641751826 samples/sec                   batch loss = 211.55777287483215 | accuracy = 0.47


Epoch[1] Batch[80] Speed: 1.2573409645660292 samples/sec                   batch loss = 225.92957878112793 | accuracy = 0.471875


Epoch[1] Batch[85] Speed: 1.258525033951308 samples/sec                   batch loss = 239.37354612350464 | accuracy = 0.4852941176470588


Epoch[1] Batch[90] Speed: 1.2515398620795017 samples/sec                   batch loss = 253.03183913230896 | accuracy = 0.4888888888888889


Epoch[1] Batch[95] Speed: 1.2575973206783158 samples/sec                   batch loss = 266.8067669868469 | accuracy = 0.48947368421052634


Epoch[1] Batch[100] Speed: 1.2555167903845281 samples/sec                   batch loss = 280.36735129356384 | accuracy = 0.4975


Epoch[1] Batch[105] Speed: 1.256206619330717 samples/sec                   batch loss = 293.3711197376251 | accuracy = 0.5071428571428571


Epoch[1] Batch[110] Speed: 1.2570285776451844 samples/sec                   batch loss = 306.86503052711487 | accuracy = 0.5113636363636364


Epoch[1] Batch[115] Speed: 1.2502511163770802 samples/sec                   batch loss = 320.4290204048157 | accuracy = 0.5173913043478261


Epoch[1] Batch[120] Speed: 1.2580623270111584 samples/sec                   batch loss = 334.5875265598297 | accuracy = 0.5083333333333333


Epoch[1] Batch[125] Speed: 1.2619260517996036 samples/sec                   batch loss = 348.9551293849945 | accuracy = 0.502


Epoch[1] Batch[130] Speed: 1.2571335999528235 samples/sec                   batch loss = 363.4392521381378 | accuracy = 0.49615384615384617


Epoch[1] Batch[135] Speed: 1.2574043841229876 samples/sec                   batch loss = 377.21850323677063 | accuracy = 0.5


Epoch[1] Batch[140] Speed: 1.2495250942252083 samples/sec                   batch loss = 390.91833424568176 | accuracy = 0.5017857142857143


Epoch[1] Batch[145] Speed: 1.258323507456054 samples/sec                   batch loss = 404.97875690460205 | accuracy = 0.5051724137931034


Epoch[1] Batch[150] Speed: 1.261309007932379 samples/sec                   batch loss = 418.7842710018158 | accuracy = 0.5066666666666667


Epoch[1] Batch[155] Speed: 1.2554787393021523 samples/sec                   batch loss = 432.5603837966919 | accuracy = 0.5080645161290323


Epoch[1] Batch[160] Speed: 1.2522352338795542 samples/sec                   batch loss = 446.1847631931305 | accuracy = 0.509375


Epoch[1] Batch[165] Speed: 1.2553308787217983 samples/sec                   batch loss = 459.9629633426666 | accuracy = 0.5106060606060606


Epoch[1] Batch[170] Speed: 1.2588800096525148 samples/sec                   batch loss = 474.02438163757324 | accuracy = 0.5102941176470588


Epoch[1] Batch[175] Speed: 1.2572331754212243 samples/sec                   batch loss = 487.9858033657074 | accuracy = 0.5071428571428571


Epoch[1] Batch[180] Speed: 1.2521279445282445 samples/sec                   batch loss = 501.4550795555115 | accuracy = 0.5069444444444444


Epoch[1] Batch[185] Speed: 1.2509817138459978 samples/sec                   batch loss = 514.5141754150391 | accuracy = 0.5108108108108108


Epoch[1] Batch[190] Speed: 1.2590094334065225 samples/sec                   batch loss = 528.2011172771454 | accuracy = 0.5105263157894737


Epoch[1] Batch[195] Speed: 1.2557109337804946 samples/sec                   batch loss = 542.0453655719757 | accuracy = 0.5115384615384615


Epoch[1] Batch[200] Speed: 1.257352743346185 samples/sec                   batch loss = 555.7910766601562 | accuracy = 0.5125


Epoch[1] Batch[205] Speed: 1.2538572785451765 samples/sec                   batch loss = 569.2419226169586 | accuracy = 0.5158536585365854


Epoch[1] Batch[210] Speed: 1.251481887052547 samples/sec                   batch loss = 582.7275421619415 | accuracy = 0.5190476190476191


Epoch[1] Batch[215] Speed: 1.2606152703123479 samples/sec                   batch loss = 596.3273992538452 | accuracy = 0.5209302325581395


Epoch[1] Batch[220] Speed: 1.2558662165258547 samples/sec                   batch loss = 610.1314053535461 | accuracy = 0.5204545454545455


Epoch[1] Batch[225] Speed: 1.2529825634750689 samples/sec                   batch loss = 623.9098644256592 | accuracy = 0.5211111111111111


Epoch[1] Batch[230] Speed: 1.247705603276522 samples/sec                   batch loss = 637.7132856845856 | accuracy = 0.5206521739130435


Epoch[1] Batch[235] Speed: 1.2482226615451781 samples/sec                   batch loss = 651.597838640213 | accuracy = 0.5191489361702127


Epoch[1] Batch[240] Speed: 1.2500377010748829 samples/sec                   batch loss = 665.0846436023712 | accuracy = 0.5239583333333333


Epoch[1] Batch[245] Speed: 1.25015674261499 samples/sec                   batch loss = 678.6808466911316 | accuracy = 0.5244897959183673


Epoch[1] Batch[250] Speed: 1.250439906653109 samples/sec                   batch loss = 692.8011212348938 | accuracy = 0.525


Epoch[1] Batch[255] Speed: 1.2484818159480058 samples/sec                   batch loss = 706.6410131454468 | accuracy = 0.5284313725490196


Epoch[1] Batch[260] Speed: 1.2524859588926813 samples/sec                   batch loss = 719.7585451602936 | accuracy = 0.5317307692307692


Epoch[1] Batch[265] Speed: 1.2531689970414925 samples/sec                   batch loss = 733.6823215484619 | accuracy = 0.5320754716981132


Epoch[1] Batch[270] Speed: 1.2483030900702663 samples/sec                   batch loss = 747.1896545886993 | accuracy = 0.5324074074074074


Epoch[1] Batch[275] Speed: 1.2496580932699064 samples/sec                   batch loss = 760.8956336975098 | accuracy = 0.5309090909090909


Epoch[1] Batch[280] Speed: 1.2498144894521024 samples/sec                   batch loss = 774.6622886657715 | accuracy = 0.5330357142857143


Epoch[1] Batch[285] Speed: 1.252387320930402 samples/sec                   batch loss = 788.1747448444366 | accuracy = 0.5342105263157895


Epoch[1] Batch[290] Speed: 1.2547983433315024 samples/sec                   batch loss = 801.9466533660889 | accuracy = 0.5327586206896552


Epoch[1] Batch[295] Speed: 1.2526845908936521 samples/sec                   batch loss = 815.3949599266052 | accuracy = 0.5338983050847458


Epoch[1] Batch[300] Speed: 1.2503705713196542 samples/sec                   batch loss = 828.9565093517303 | accuracy = 0.5341666666666667


Epoch[1] Batch[305] Speed: 1.2486030702722206 samples/sec                   batch loss = 842.4790105819702 | accuracy = 0.5336065573770492


Epoch[1] Batch[310] Speed: 1.2525601112111533 samples/sec                   batch loss = 856.0282373428345 | accuracy = 0.5362903225806451


Epoch[1] Batch[315] Speed: 1.2594857907854151 samples/sec                   batch loss = 870.1147828102112 | accuracy = 0.5365079365079365


Epoch[1] Batch[320] Speed: 1.2534815287003005 samples/sec                   batch loss = 883.7643976211548 | accuracy = 0.5359375


Epoch[1] Batch[325] Speed: 1.255468404838561 samples/sec                   batch loss = 896.8416223526001 | accuracy = 0.5384615384615384


Epoch[1] Batch[330] Speed: 1.259879624733874 samples/sec                   batch loss = 910.4500024318695 | accuracy = 0.5386363636363637


Epoch[1] Batch[335] Speed: 1.25787490629976 samples/sec                   batch loss = 924.3290829658508 | accuracy = 0.5395522388059701


Epoch[1] Batch[340] Speed: 1.2578775469686794 samples/sec                   batch loss = 937.6883821487427 | accuracy = 0.5389705882352941


Epoch[1] Batch[345] Speed: 1.2479969412306413 samples/sec                   batch loss = 950.8677089214325 | accuracy = 0.5413043478260869


Epoch[1] Batch[350] Speed: 1.2488663800893796 samples/sec                   batch loss = 964.7545025348663 | accuracy = 0.5414285714285715


Epoch[1] Batch[355] Speed: 1.253185471750992 samples/sec                   batch loss = 978.1973533630371 | accuracy = 0.5408450704225352


Epoch[1] Batch[360] Speed: 1.2563110339560544 samples/sec                   batch loss = 992.153314113617 | accuracy = 0.5423611111111111


Epoch[1] Batch[365] Speed: 1.2524663235573736 samples/sec                   batch loss = 1005.8900573253632 | accuracy = 0.5417808219178082


Epoch[1] Batch[370] Speed: 1.2504205218070248 samples/sec                   batch loss = 1018.0433187484741 | accuracy = 0.5452702702702703


Epoch[1] Batch[375] Speed: 1.2510551284136338 samples/sec                   batch loss = 1032.133089542389 | accuracy = 0.5453333333333333


Epoch[1] Batch[380] Speed: 1.2551306554963946 samples/sec                   batch loss = 1046.1579880714417 | accuracy = 0.5447368421052632


Epoch[1] Batch[385] Speed: 1.256642077186057 samples/sec                   batch loss = 1058.831505537033 | accuracy = 0.5461038961038961


Epoch[1] Batch[390] Speed: 1.2607171033682287 samples/sec                   batch loss = 1071.909690618515 | accuracy = 0.5480769230769231


Epoch[1] Batch[395] Speed: 1.2536583681832971 samples/sec                   batch loss = 1085.1166546344757 | accuracy = 0.55


Epoch[1] Batch[400] Speed: 1.2581483687064425 samples/sec                   batch loss = 1098.7825281620026 | accuracy = 0.55


Epoch[1] Batch[405] Speed: 1.2593154327243206 samples/sec                   batch loss = 1111.8015327453613 | accuracy = 0.5487654320987654


Epoch[1] Batch[410] Speed: 1.249066655389947 samples/sec                   batch loss = 1125.892987728119 | accuracy = 0.5475609756097561


Epoch[1] Batch[415] Speed: 1.2462588203413683 samples/sec                   batch loss = 1139.1785247325897 | accuracy = 0.5487951807228916


Epoch[1] Batch[420] Speed: 1.246859462532233 samples/sec                   batch loss = 1153.6842811107635 | accuracy = 0.5488095238095239


Epoch[1] Batch[425] Speed: 1.247570514674771 samples/sec                   batch loss = 1167.4782445430756 | accuracy = 0.5494117647058824


Epoch[1] Batch[430] Speed: 1.2569834658267405 samples/sec                   batch loss = 1181.3802726268768 | accuracy = 0.5476744186046512


Epoch[1] Batch[435] Speed: 1.2606225638540276 samples/sec                   batch loss = 1194.3111641407013 | accuracy = 0.5494252873563218


Epoch[1] Batch[440] Speed: 1.2448261951331387 samples/sec                   batch loss = 1208.8208439350128 | accuracy = 0.5477272727272727


Epoch[1] Batch[445] Speed: 1.248474197688734 samples/sec                   batch loss = 1223.5289883613586 | accuracy = 0.5466292134831461


Epoch[1] Batch[450] Speed: 1.2505051485310723 samples/sec                   batch loss = 1238.3821814060211 | accuracy = 0.5427777777777778


Epoch[1] Batch[455] Speed: 1.2599856917575603 samples/sec                   batch loss = 1251.4008932113647 | accuracy = 0.5439560439560439


Epoch[1] Batch[460] Speed: 1.2541712787037402 samples/sec                   batch loss = 1264.6984009742737 | accuracy = 0.5456521739130434


Epoch[1] Batch[465] Speed: 1.2549054337670607 samples/sec                   batch loss = 1278.3162412643433 | accuracy = 0.5451612903225806


Epoch[1] Batch[470] Speed: 1.2601683467693476 samples/sec                   batch loss = 1291.893390417099 | accuracy = 0.5462765957446809


Epoch[1] Batch[475] Speed: 1.2547030003595716 samples/sec                   batch loss = 1306.3366858959198 | accuracy = 0.5452631578947369


Epoch[1] Batch[480] Speed: 1.2583350215171243 samples/sec                   batch loss = 1319.9476721286774 | accuracy = 0.5453125


Epoch[1] Batch[485] Speed: 1.2459338721146331 samples/sec                   batch loss = 1333.7146849632263 | accuracy = 0.5458762886597938


Epoch[1] Batch[490] Speed: 1.2442838918373225 samples/sec                   batch loss = 1347.6564559936523 | accuracy = 0.5464285714285714


Epoch[1] Batch[495] Speed: 1.2548331621425965 samples/sec                   batch loss = 1361.1933119297028 | accuracy = 0.5464646464646464


Epoch[1] Batch[500] Speed: 1.2553820717406037 samples/sec                   batch loss = 1374.7036962509155 | accuracy = 0.548


Epoch[1] Batch[505] Speed: 1.2522484126722295 samples/sec                   batch loss = 1387.9078569412231 | accuracy = 0.5495049504950495


Epoch[1] Batch[510] Speed: 1.2475982536972552 samples/sec                   batch loss = 1400.5831487178802 | accuracy = 0.5519607843137255


Epoch[1] Batch[515] Speed: 1.248587366277397 samples/sec                   batch loss = 1413.549870967865 | accuracy = 0.5524271844660195


Epoch[1] Batch[520] Speed: 1.2538972931054997 samples/sec                   batch loss = 1426.0916168689728 | accuracy = 0.5543269230769231


Epoch[1] Batch[525] Speed: 1.2493510004099415 samples/sec                   batch loss = 1438.4779777526855 | accuracy = 0.5552380952380952


Epoch[1] Batch[530] Speed: 1.25473931531469 samples/sec                   batch loss = 1452.142517566681 | accuracy = 0.5547169811320755


Epoch[1] Batch[535] Speed: 1.249501084818411 samples/sec                   batch loss = 1465.589514017105 | accuracy = 0.555607476635514


Epoch[1] Batch[540] Speed: 1.2582903821112517 samples/sec                   batch loss = 1479.8799328804016 | accuracy = 0.5550925925925926


Epoch[1] Batch[545] Speed: 1.2620786032324052 samples/sec                   batch loss = 1493.1110589504242 | accuracy = 0.5564220183486238


Epoch[1] Batch[550] Speed: 1.2645475172277967 samples/sec                   batch loss = 1506.2170412540436 | accuracy = 0.5563636363636364


Epoch[1] Batch[555] Speed: 1.257591664636546 samples/sec                   batch loss = 1520.3123533725739 | accuracy = 0.5558558558558558


Epoch[1] Batch[560] Speed: 1.2548027542330256 samples/sec                   batch loss = 1533.5597026348114 | accuracy = 0.5566964285714285


Epoch[1] Batch[565] Speed: 1.2582523514586488 samples/sec                   batch loss = 1546.4843764305115 | accuracy = 0.5566371681415929


Epoch[1] Batch[570] Speed: 1.260879501919553 samples/sec                   batch loss = 1559.3333187103271 | accuracy = 0.5570175438596491


Epoch[1] Batch[575] Speed: 1.2675387188542087 samples/sec                   batch loss = 1571.5888590812683 | accuracy = 0.5591304347826087


Epoch[1] Batch[580] Speed: 1.2485311510725567 samples/sec                   batch loss = 1584.9736659526825 | accuracy = 0.5594827586206896


Epoch[1] Batch[585] Speed: 1.2481500433056438 samples/sec                   batch loss = 1598.0562644004822 | accuracy = 0.5602564102564103


Epoch[1] Batch[590] Speed: 1.2564785098657794 samples/sec                   batch loss = 1611.8187618255615 | accuracy = 0.5601694915254237


Epoch[1] Batch[595] Speed: 1.2590146298062868 samples/sec                   batch loss = 1625.4577796459198 | accuracy = 0.5609243697478992


Epoch[1] Batch[600] Speed: 1.2509943998644706 samples/sec                   batch loss = 1638.9762139320374 | accuracy = 0.56125


Epoch[1] Batch[605] Speed: 1.2524392091064305 samples/sec                   batch loss = 1652.7258105278015 | accuracy = 0.5619834710743802


Epoch[1] Batch[610] Speed: 1.2595133057605794 samples/sec                   batch loss = 1665.495005607605 | accuracy = 0.5614754098360656


Epoch[1] Batch[615] Speed: 1.2554648347906774 samples/sec                   batch loss = 1679.743088722229 | accuracy = 0.5609756097560976


Epoch[1] Batch[620] Speed: 1.258260183886467 samples/sec                   batch loss = 1692.7283532619476 | accuracy = 0.5620967741935484


Epoch[1] Batch[625] Speed: 1.2589450960829054 samples/sec                   batch loss = 1706.6468183994293 | accuracy = 0.5616


Epoch[1] Batch[630] Speed: 1.2523669408048697 samples/sec                   batch loss = 1720.654042005539 | accuracy = 0.5619047619047619


Epoch[1] Batch[635] Speed: 1.260321041561513 samples/sec                   batch loss = 1733.2737925052643 | accuracy = 0.5633858267716535


Epoch[1] Batch[640] Speed: 1.2536227714801464 samples/sec                   batch loss = 1745.4002294540405 | accuracy = 0.5640625


Epoch[1] Batch[645] Speed: 1.2518978204259756 samples/sec                   batch loss = 1758.5195307731628 | accuracy = 0.5647286821705426


Epoch[1] Batch[650] Speed: 1.2529147236606952 samples/sec                   batch loss = 1770.8023726940155 | accuracy = 0.5665384615384615


Epoch[1] Batch[655] Speed: 1.2534987608625323 samples/sec                   batch loss = 1783.226356267929 | accuracy = 0.5675572519083969


Epoch[1] Batch[660] Speed: 1.2563763254329994 samples/sec                   batch loss = 1795.2087433338165 | accuracy = 0.568939393939394


Epoch[1] Batch[665] Speed: 1.2572531489375798 samples/sec                   batch loss = 1807.7226185798645 | accuracy = 0.5695488721804511


Epoch[1] Batch[670] Speed: 1.2559937049151266 samples/sec                   batch loss = 1821.2805829048157 | accuracy = 0.5697761194029851


Epoch[1] Batch[675] Speed: 1.2492893208695424 samples/sec                   batch loss = 1835.8359870910645 | accuracy = 0.5696296296296296


Epoch[1] Batch[680] Speed: 1.2523940521267507 samples/sec                   batch loss = 1848.5471503734589 | accuracy = 0.5702205882352941


Epoch[1] Batch[685] Speed: 1.2605261443671165 samples/sec                   batch loss = 1861.159915447235 | accuracy = 0.5715328467153284


Epoch[1] Batch[690] Speed: 1.2488808825499857 samples/sec                   batch loss = 1875.1263431310654 | accuracy = 0.5717391304347826


Epoch[1] Batch[695] Speed: 1.2528458620662617 samples/sec                   batch loss = 1887.0472744703293 | accuracy = 0.573021582733813


Epoch[1] Batch[700] Speed: 1.2484224517296385 samples/sec                   batch loss = 1900.3581107854843 | accuracy = 0.5728571428571428


Epoch[1] Batch[705] Speed: 1.2535870831308744 samples/sec                   batch loss = 1913.538232922554 | accuracy = 0.573049645390071


Epoch[1] Batch[710] Speed: 1.2513851806038019 samples/sec                   batch loss = 1924.8464714288712 | accuracy = 0.5742957746478873


Epoch[1] Batch[715] Speed: 1.2499487607324007 samples/sec                   batch loss = 1936.433962225914 | accuracy = 0.5755244755244755


Epoch[1] Batch[720] Speed: 1.2531935220506782 samples/sec                   batch loss = 1949.2145422697067 | accuracy = 0.5767361111111111


Epoch[1] Batch[725] Speed: 1.2525369201352068 samples/sec                   batch loss = 1962.2451688051224 | accuracy = 0.5768965517241379


Epoch[1] Batch[730] Speed: 1.2575253042134689 samples/sec                   batch loss = 1974.6359091997147 | accuracy = 0.5773972602739726


Epoch[1] Batch[735] Speed: 1.255376153802064 samples/sec                   batch loss = 1986.3214489221573 | accuracy = 0.5785714285714286


Epoch[1] Batch[740] Speed: 1.2585528846085323 samples/sec                   batch loss = 2001.1099325418472 | accuracy = 0.577027027027027


Epoch[1] Batch[745] Speed: 1.2497455958338874 samples/sec                   batch loss = 2013.5984441041946 | accuracy = 0.5781879194630872


Epoch[1] Batch[750] Speed: 1.2499918789199111 samples/sec                   batch loss = 2026.2231477499008 | accuracy = 0.5783333333333334


Epoch[1] Batch[755] Speed: 1.2553314422919362 samples/sec                   batch loss = 2040.511035323143 | accuracy = 0.5774834437086093


Epoch[1] Batch[760] Speed: 1.2529453208468089 samples/sec                   batch loss = 2054.1368688344955 | accuracy = 0.5776315789473684


Epoch[1] Batch[765] Speed: 1.2515992430372715 samples/sec                   batch loss = 2066.3291009664536 | accuracy = 0.5787581699346406


Epoch[1] Batch[770] Speed: 1.2490089092638015 samples/sec                   batch loss = 2079.2199479341507 | accuracy = 0.5785714285714286


Epoch[1] Batch[775] Speed: 1.2530527503246303 samples/sec                   batch loss = 2091.536945939064 | accuracy = 0.5787096774193549


Epoch[1] Batch[780] Speed: 1.2573072313607372 samples/sec                   batch loss = 2105.6438170671463 | accuracy = 0.5785256410256411


Epoch[1] Batch[785] Speed: 1.2607790638071341 samples/sec                   batch loss = 2118.253204226494 | accuracy = 0.5789808917197452


[Epoch 1] training: accuracy=0.5799492385786802
[Epoch 1] time cost: 646.7024936676025
[Epoch 1] validation: validation accuracy=0.6577777777777778


Epoch[2] Batch[5] Speed: 1.2529477537147329 samples/sec                   batch loss = 13.531935214996338 | accuracy = 0.65


Epoch[2] Batch[10] Speed: 1.2501080241267668 samples/sec                   batch loss = 27.399807572364807 | accuracy = 0.55


Epoch[2] Batch[15] Speed: 1.2408207845788457 samples/sec                   batch loss = 40.2297967672348 | accuracy = 0.6


Epoch[2] Batch[20] Speed: 1.2459050967756595 samples/sec                   batch loss = 53.11086142063141 | accuracy = 0.6375


Epoch[2] Batch[25] Speed: 1.2581390280845532 samples/sec                   batch loss = 64.59736156463623 | accuracy = 0.67


Epoch[2] Batch[30] Speed: 1.256393355017578 samples/sec                   batch loss = 77.25893568992615 | accuracy = 0.6333333333333333


Epoch[2] Batch[35] Speed: 1.2513828471386574 samples/sec                   batch loss = 90.12297987937927 | accuracy = 0.6357142857142857


Epoch[2] Batch[40] Speed: 1.2453643491895214 samples/sec                   batch loss = 102.79773759841919 | accuracy = 0.6375


Epoch[2] Batch[45] Speed: 1.2578787729973036 samples/sec                   batch loss = 115.28043282032013 | accuracy = 0.65


Epoch[2] Batch[50] Speed: 1.2618858079124193 samples/sec                   batch loss = 129.43616271018982 | accuracy = 0.645


Epoch[2] Batch[55] Speed: 1.2577384553923165 samples/sec                   batch loss = 143.26749348640442 | accuracy = 0.6227272727272727


Epoch[2] Batch[60] Speed: 1.2512847561364893 samples/sec                   batch loss = 156.1745330095291 | accuracy = 0.625


Epoch[2] Batch[65] Speed: 1.2518353287303217 samples/sec                   batch loss = 168.54897713661194 | accuracy = 0.6307692307692307


Epoch[2] Batch[70] Speed: 1.258966730073189 samples/sec                   batch loss = 179.54537785053253 | accuracy = 0.6392857142857142


Epoch[2] Batch[75] Speed: 1.2524643600577035 samples/sec                   batch loss = 190.65863835811615 | accuracy = 0.6466666666666666


Epoch[2] Batch[80] Speed: 1.2544533567616574 samples/sec                   batch loss = 205.74750244617462 | accuracy = 0.6375


Epoch[2] Batch[85] Speed: 1.246149590693001 samples/sec                   batch loss = 217.2720685005188 | accuracy = 0.6441176470588236


Epoch[2] Batch[90] Speed: 1.2502454330524062 samples/sec                   batch loss = 228.57961177825928 | accuracy = 0.65


Epoch[2] Batch[95] Speed: 1.251216166903838 samples/sec                   batch loss = 242.98456358909607 | accuracy = 0.6421052631578947


Epoch[2] Batch[100] Speed: 1.259942070596551 samples/sec                   batch loss = 257.21791088581085 | accuracy = 0.64


Epoch[2] Batch[105] Speed: 1.2554951808448789 samples/sec                   batch loss = 270.64985382556915 | accuracy = 0.638095238095238


Epoch[2] Batch[110] Speed: 1.242718906854021 samples/sec                   batch loss = 284.423530459404 | accuracy = 0.6386363636363637


Epoch[2] Batch[115] Speed: 1.2590134960426798 samples/sec                   batch loss = 297.08395850658417 | accuracy = 0.6391304347826087


Epoch[2] Batch[120] Speed: 1.254063376140019 samples/sec                   batch loss = 310.18693220615387 | accuracy = 0.6395833333333333


Epoch[2] Batch[125] Speed: 1.2618356964713882 samples/sec                   batch loss = 322.1040815114975 | accuracy = 0.644


Epoch[2] Batch[130] Speed: 1.253258396263744 samples/sec                   batch loss = 334.5102428197861 | accuracy = 0.6442307692307693


Epoch[2] Batch[135] Speed: 1.2505624737994248 samples/sec                   batch loss = 348.0294123888016 | accuracy = 0.6444444444444445


Epoch[2] Batch[140] Speed: 1.255424438295153 samples/sec                   batch loss = 362.03456819057465 | accuracy = 0.6428571428571429


Epoch[2] Batch[145] Speed: 1.2654311103179694 samples/sec                   batch loss = 373.310324549675 | accuracy = 0.65


Epoch[2] Batch[150] Speed: 1.2595554788118575 samples/sec                   batch loss = 385.45362186431885 | accuracy = 0.655


Epoch[2] Batch[155] Speed: 1.252354039967323 samples/sec                   batch loss = 397.0558224916458 | accuracy = 0.6580645161290323


Epoch[2] Batch[160] Speed: 1.2551271812669238 samples/sec                   batch loss = 408.88772344589233 | accuracy = 0.6609375


Epoch[2] Batch[165] Speed: 1.2604830539825702 samples/sec                   batch loss = 421.4129545688629 | accuracy = 0.6575757575757576


Epoch[2] Batch[170] Speed: 1.2632139764465617 samples/sec                   batch loss = 433.16978108882904 | accuracy = 0.6558823529411765


Epoch[2] Batch[175] Speed: 1.2566228760541787 samples/sec                   batch loss = 444.4383841753006 | accuracy = 0.6585714285714286


Epoch[2] Batch[180] Speed: 1.2463699209798123 samples/sec                   batch loss = 456.9411382675171 | accuracy = 0.6611111111111111


Epoch[2] Batch[185] Speed: 1.2624256123197066 samples/sec                   batch loss = 466.71051716804504 | accuracy = 0.6662162162162162


Epoch[2] Batch[190] Speed: 1.2602922605439852 samples/sec                   batch loss = 479.3076640367508 | accuracy = 0.6631578947368421


Epoch[2] Batch[195] Speed: 1.2529003144943174 samples/sec                   batch loss = 491.77913188934326 | accuracy = 0.6653846153846154


Epoch[2] Batch[200] Speed: 1.2510549418345611 samples/sec                   batch loss = 505.88161540031433 | accuracy = 0.6625


Epoch[2] Batch[205] Speed: 1.244390856377916 samples/sec                   batch loss = 516.7657977342606 | accuracy = 0.6658536585365854


Epoch[2] Batch[210] Speed: 1.2536956532135803 samples/sec                   batch loss = 529.2370833158493 | accuracy = 0.6654761904761904


Epoch[2] Batch[215] Speed: 1.2525151325277135 samples/sec                   batch loss = 543.9056128263474 | accuracy = 0.6627906976744186


Epoch[2] Batch[220] Speed: 1.2508241861266842 samples/sec                   batch loss = 557.2193528413773 | accuracy = 0.6625


Epoch[2] Batch[225] Speed: 1.243169657240227 samples/sec                   batch loss = 568.7784761190414 | accuracy = 0.6644444444444444


Epoch[2] Batch[230] Speed: 1.2430528635562368 samples/sec                   batch loss = 581.7678591012955 | accuracy = 0.6619565217391304


Epoch[2] Batch[235] Speed: 1.254029443637271 samples/sec                   batch loss = 594.788812160492 | accuracy = 0.6617021276595745


Epoch[2] Batch[240] Speed: 1.2510345117627943 samples/sec                   batch loss = 608.4626258611679 | accuracy = 0.6614583333333334


Epoch[2] Batch[245] Speed: 1.2517918030740542 samples/sec                   batch loss = 620.0507917404175 | accuracy = 0.6632653061224489


Epoch[2] Batch[250] Speed: 1.2422648958616014 samples/sec                   batch loss = 633.09037733078 | accuracy = 0.661


Epoch[2] Batch[255] Speed: 1.24752394557811 samples/sec                   batch loss = 647.1391650438309 | accuracy = 0.6588235294117647


Epoch[2] Batch[260] Speed: 1.2556242854083997 samples/sec                   batch loss = 658.5930045843124 | accuracy = 0.6596153846153846


Epoch[2] Batch[265] Speed: 1.2525271951008654 samples/sec                   batch loss = 671.557654261589 | accuracy = 0.6566037735849056


Epoch[2] Batch[270] Speed: 1.2500553975217377 samples/sec                   batch loss = 682.1955723762512 | accuracy = 0.6592592592592592


Epoch[2] Batch[275] Speed: 1.2475434262513307 samples/sec                   batch loss = 694.9672124385834 | accuracy = 0.6572727272727272


Epoch[2] Batch[280] Speed: 1.256276415292065 samples/sec                   batch loss = 707.2489906549454 | accuracy = 0.6589285714285714


Epoch[2] Batch[285] Speed: 1.2544390997733934 samples/sec                   batch loss = 720.8918083906174 | accuracy = 0.6587719298245615


Epoch[2] Batch[290] Speed: 1.2544924713475125 samples/sec                   batch loss = 732.7839312553406 | accuracy = 0.6594827586206896


Epoch[2] Batch[295] Speed: 1.2469668703667642 samples/sec                   batch loss = 743.5233798027039 | accuracy = 0.6610169491525424


Epoch[2] Batch[300] Speed: 1.2496882523442434 samples/sec                   batch loss = 754.5150529146194 | accuracy = 0.6608333333333334


Epoch[2] Batch[305] Speed: 1.250081756842595 samples/sec                   batch loss = 766.2145202159882 | accuracy = 0.6622950819672131


Epoch[2] Batch[310] Speed: 1.2590893682277537 samples/sec                   batch loss = 777.7075756788254 | accuracy = 0.6612903225806451


Epoch[2] Batch[315] Speed: 1.255691009234902 samples/sec                   batch loss = 789.8075367212296 | accuracy = 0.6603174603174603


Epoch[2] Batch[320] Speed: 1.2447706875078923 samples/sec                   batch loss = 801.9307398796082 | accuracy = 0.66015625


Epoch[2] Batch[325] Speed: 1.2542698228076055 samples/sec                   batch loss = 812.087014913559 | accuracy = 0.6623076923076923


Epoch[2] Batch[330] Speed: 1.2502671417675286 samples/sec                   batch loss = 824.5706417560577 | accuracy = 0.6621212121212121


Epoch[2] Batch[335] Speed: 1.2484393592731835 samples/sec                   batch loss = 837.678762793541 | accuracy = 0.6611940298507463


Epoch[2] Batch[340] Speed: 1.24261094105699 samples/sec                   batch loss = 849.3320298194885 | accuracy = 0.6625


Epoch[2] Batch[345] Speed: 1.2443769194899166 samples/sec                   batch loss = 860.6476452350616 | accuracy = 0.663768115942029


Epoch[2] Batch[350] Speed: 1.2533622277033338 samples/sec                   batch loss = 872.3531799316406 | accuracy = 0.6642857142857143


Epoch[2] Batch[355] Speed: 1.2542472247418068 samples/sec                   batch loss = 883.2120699882507 | accuracy = 0.6647887323943662


Epoch[2] Batch[360] Speed: 1.2569114254826763 samples/sec                   batch loss = 896.9446501731873 | accuracy = 0.6618055555555555


Epoch[2] Batch[365] Speed: 1.242507963035253 samples/sec                   batch loss = 909.6124175786972 | accuracy = 0.6616438356164384


Epoch[2] Batch[370] Speed: 1.2485654370994421 samples/sec                   batch loss = 920.5013890266418 | accuracy = 0.6621621621621622


Epoch[2] Batch[375] Speed: 1.251932198189128 samples/sec                   batch loss = 933.4654512405396 | accuracy = 0.6613333333333333


Epoch[2] Batch[380] Speed: 1.2511157695605222 samples/sec                   batch loss = 946.6122851371765 | accuracy = 0.6605263157894737


Epoch[2] Batch[385] Speed: 1.2540457535060805 samples/sec                   batch loss = 955.2351924777031 | accuracy = 0.662987012987013


Epoch[2] Batch[390] Speed: 1.2436745725753013 samples/sec                   batch loss = 966.6917254328728 | accuracy = 0.6628205128205128


Epoch[2] Batch[395] Speed: 1.250008363332676 samples/sec                   batch loss = 977.8614336848259 | accuracy = 0.6632911392405063


Epoch[2] Batch[400] Speed: 1.2553113419365112 samples/sec                   batch loss = 988.8661504387856 | accuracy = 0.665


Epoch[2] Batch[405] Speed: 1.2520132923488398 samples/sec                   batch loss = 1000.7570869326591 | accuracy = 0.6648148148148149


Epoch[2] Batch[410] Speed: 1.253118265145566 samples/sec                   batch loss = 1014.184633910656 | accuracy = 0.6646341463414634


Epoch[2] Batch[415] Speed: 1.2475109588006204 samples/sec                   batch loss = 1024.7514887452126 | accuracy = 0.6656626506024096


Epoch[2] Batch[420] Speed: 1.2563910969245662 samples/sec                   batch loss = 1035.2695879340172 | accuracy = 0.6678571428571428


Epoch[2] Batch[425] Speed: 1.2545451907875678 samples/sec                   batch loss = 1047.2896828055382 | accuracy = 0.6664705882352941


Epoch[2] Batch[430] Speed: 1.252321602079169 samples/sec                   batch loss = 1063.0588263869286 | accuracy = 0.6645348837209303


Epoch[2] Batch[435] Speed: 1.2467125133711299 samples/sec                   batch loss = 1075.341247022152 | accuracy = 0.6660919540229885


Epoch[2] Batch[440] Speed: 1.2475735761116793 samples/sec                   batch loss = 1087.1019622683525 | accuracy = 0.6670454545454545


Epoch[2] Batch[445] Speed: 1.2574745017479771 samples/sec                   batch loss = 1098.0761200785637 | accuracy = 0.6679775280898876


Epoch[2] Batch[450] Speed: 1.2500806391103074 samples/sec                   batch loss = 1108.902893126011 | accuracy = 0.6683333333333333


Epoch[2] Batch[455] Speed: 1.254475868455576 samples/sec                   batch loss = 1120.149533212185 | accuracy = 0.6681318681318681


Epoch[2] Batch[460] Speed: 1.2450981709085078 samples/sec                   batch loss = 1134.6684924960136 | accuracy = 0.6663043478260869


Epoch[2] Batch[465] Speed: 1.2521461674685315 samples/sec                   batch loss = 1148.7919381260872 | accuracy = 0.6666666666666666


Epoch[2] Batch[470] Speed: 1.2504338488241438 samples/sec                   batch loss = 1158.2959485650063 | accuracy = 0.6680851063829787


Epoch[2] Batch[475] Speed: 1.2481451219148385 samples/sec                   batch loss = 1169.2592975497246 | accuracy = 0.6678947368421052


Epoch[2] Batch[480] Speed: 1.2548800908211233 samples/sec                   batch loss = 1180.134543478489 | accuracy = 0.66875


Epoch[2] Batch[485] Speed: 1.2396962520970451 samples/sec                   batch loss = 1192.151910841465 | accuracy = 0.6690721649484536


Epoch[2] Batch[490] Speed: 1.255048030384683 samples/sec                   batch loss = 1204.8704523444176 | accuracy = 0.6683673469387755


Epoch[2] Batch[495] Speed: 1.25628582234893 samples/sec                   batch loss = 1216.634038388729 | accuracy = 0.6676767676767676


Epoch[2] Batch[500] Speed: 1.2596840956888078 samples/sec                   batch loss = 1230.8610028624535 | accuracy = 0.668


Epoch[2] Batch[505] Speed: 1.2521119648478776 samples/sec                   batch loss = 1242.372022330761 | accuracy = 0.6683168316831684


Epoch[2] Batch[510] Speed: 1.255667044227621 samples/sec                   batch loss = 1254.7230556607246 | accuracy = 0.6676470588235294


Epoch[2] Batch[515] Speed: 1.2592281916974422 samples/sec                   batch loss = 1265.6945134997368 | accuracy = 0.6674757281553398


Epoch[2] Batch[520] Speed: 1.2549655100242978 samples/sec                   batch loss = 1278.322603404522 | accuracy = 0.6658653846153846


Epoch[2] Batch[525] Speed: 1.252966655548912 samples/sec                   batch loss = 1289.4192181229591 | accuracy = 0.6661904761904762


Epoch[2] Batch[530] Speed: 1.244375627343929 samples/sec                   batch loss = 1302.8937770724297 | accuracy = 0.664622641509434


Epoch[2] Batch[535] Speed: 1.2499672928070102 samples/sec                   batch loss = 1314.9752803444862 | accuracy = 0.6654205607476635


Epoch[2] Batch[540] Speed: 1.2542603521497875 samples/sec                   batch loss = 1326.5822390913963 | accuracy = 0.6657407407407407


Epoch[2] Batch[545] Speed: 1.2528223797479243 samples/sec                   batch loss = 1341.3106867671013 | accuracy = 0.6646788990825688


Epoch[2] Batch[550] Speed: 1.2441239869237037 samples/sec                   batch loss = 1352.8244954943657 | accuracy = 0.665


Epoch[2] Batch[555] Speed: 1.2445848976682046 samples/sec                   batch loss = 1365.1825392842293 | accuracy = 0.6657657657657657


Epoch[2] Batch[560] Speed: 1.2484333208122027 samples/sec                   batch loss = 1378.3816530108452 | accuracy = 0.6651785714285714


Epoch[2] Batch[565] Speed: 1.2474464927292956 samples/sec                   batch loss = 1389.6983470320702 | accuracy = 0.6650442477876106


Epoch[2] Batch[570] Speed: 1.2513248866572462 samples/sec                   batch loss = 1401.8443917632103 | accuracy = 0.6657894736842105


Epoch[2] Batch[575] Speed: 1.2488624756383022 samples/sec                   batch loss = 1413.3802545666695 | accuracy = 0.6660869565217391


Epoch[2] Batch[580] Speed: 1.2453449365099005 samples/sec                   batch loss = 1422.057218849659 | accuracy = 0.6672413793103448


Epoch[2] Batch[585] Speed: 1.2544427577855277 samples/sec                   batch loss = 1430.6809391379356 | accuracy = 0.6688034188034188


Epoch[2] Batch[590] Speed: 1.251185934038136 samples/sec                   batch loss = 1443.003549516201 | accuracy = 0.6699152542372881


Epoch[2] Batch[595] Speed: 1.2540498779079277 samples/sec                   batch loss = 1452.4609425663948 | accuracy = 0.6710084033613445


Epoch[2] Batch[600] Speed: 1.247843134221477 samples/sec                   batch loss = 1466.5919820666313 | accuracy = 0.6708333333333333


Epoch[2] Batch[605] Speed: 1.2584009010985533 samples/sec                   batch loss = 1477.993296444416 | accuracy = 0.6714876033057852


Epoch[2] Batch[610] Speed: 1.2577431698573138 samples/sec                   batch loss = 1487.5804663300514 | accuracy = 0.6729508196721311


Epoch[2] Batch[615] Speed: 1.257944699132624 samples/sec                   batch loss = 1500.4271319508553 | accuracy = 0.6739837398373983


Epoch[2] Batch[620] Speed: 1.2512659050130093 samples/sec                   batch loss = 1509.2305582165718 | accuracy = 0.675


Epoch[2] Batch[625] Speed: 1.2435068056235699 samples/sec                   batch loss = 1521.3062646985054 | accuracy = 0.6756


Epoch[2] Batch[630] Speed: 1.265783021614606 samples/sec                   batch loss = 1532.2252497076988 | accuracy = 0.6757936507936508


Epoch[2] Batch[635] Speed: 1.250697092325111 samples/sec                   batch loss = 1545.610956132412 | accuracy = 0.6755905511811023


Epoch[2] Batch[640] Speed: 1.260498585122266 samples/sec                   batch loss = 1557.5752550959587 | accuracy = 0.675390625


Epoch[2] Batch[645] Speed: 1.2506744363483342 samples/sec                   batch loss = 1569.6802431941032 | accuracy = 0.6751937984496124


Epoch[2] Batch[650] Speed: 1.2476837050605019 samples/sec                   batch loss = 1583.6850692629814 | accuracy = 0.6746153846153846


Epoch[2] Batch[655] Speed: 1.2542867015622572 samples/sec                   batch loss = 1593.5901809334755 | accuracy = 0.6748091603053435


Epoch[2] Batch[660] Speed: 1.2556786976855303 samples/sec                   batch loss = 1605.9184032082558 | accuracy = 0.675


Epoch[2] Batch[665] Speed: 1.25515366101472 samples/sec                   batch loss = 1616.0971022248268 | accuracy = 0.6759398496240602


Epoch[2] Batch[670] Speed: 1.2463589952055505 samples/sec                   batch loss = 1627.5549181103706 | accuracy = 0.6764925373134328


Epoch[2] Batch[675] Speed: 1.2506325762296846 samples/sec                   batch loss = 1641.5698744654655 | accuracy = 0.6755555555555556


Epoch[2] Batch[680] Speed: 1.2574398189345948 samples/sec                   batch loss = 1652.7965781092644 | accuracy = 0.6757352941176471


Epoch[2] Batch[685] Speed: 1.2621956761044615 samples/sec                   batch loss = 1662.547359764576 | accuracy = 0.6773722627737226


Epoch[2] Batch[690] Speed: 1.2525094286011627 samples/sec                   batch loss = 1674.0332606434822 | accuracy = 0.677536231884058


Epoch[2] Batch[695] Speed: 1.246904313942042 samples/sec                   batch loss = 1686.814582645893 | accuracy = 0.6766187050359712


Epoch[2] Batch[700] Speed: 1.2529601053437833 samples/sec                   batch loss = 1697.4651715159416 | accuracy = 0.6767857142857143


Epoch[2] Batch[705] Speed: 1.26029263923305 samples/sec                   batch loss = 1708.907893359661 | accuracy = 0.676595744680851


Epoch[2] Batch[710] Speed: 1.2583167123713432 samples/sec                   batch loss = 1718.5405717492104 | accuracy = 0.6771126760563381


Epoch[2] Batch[715] Speed: 1.2510107241638517 samples/sec                   batch loss = 1728.1873471140862 | accuracy = 0.6779720279720279


Epoch[2] Batch[720] Speed: 1.2463202935801625 samples/sec                   batch loss = 1740.5571193099022 | accuracy = 0.678125


Epoch[2] Batch[725] Speed: 1.2565899343026958 samples/sec                   batch loss = 1751.463724911213 | accuracy = 0.6782758620689655


Epoch[2] Batch[730] Speed: 1.256087457360434 samples/sec                   batch loss = 1761.3881846666336 | accuracy = 0.6791095890410959


Epoch[2] Batch[735] Speed: 1.2485811405426424 samples/sec                   batch loss = 1776.9415646791458 | accuracy = 0.677891156462585


Epoch[2] Batch[740] Speed: 1.238861576409461 samples/sec                   batch loss = 1788.415517091751 | accuracy = 0.6777027027027027


Epoch[2] Batch[745] Speed: 1.2453111968639672 samples/sec                   batch loss = 1800.416779756546 | accuracy = 0.6771812080536913


Epoch[2] Batch[750] Speed: 1.2488909229123313 samples/sec                   batch loss = 1813.0101778507233 | accuracy = 0.6763333333333333


Epoch[2] Batch[755] Speed: 1.2583665446968637 samples/sec                   batch loss = 1824.7344043254852 | accuracy = 0.676158940397351


Epoch[2] Batch[760] Speed: 1.2504274182699278 samples/sec                   batch loss = 1839.8813253641129 | accuracy = 0.6756578947368421


Epoch[2] Batch[765] Speed: 1.24152092161891 samples/sec                   batch loss = 1851.6704759597778 | accuracy = 0.6764705882352942


Epoch[2] Batch[770] Speed: 1.2505118595103755 samples/sec                   batch loss = 1861.6213029623032 | accuracy = 0.676948051948052


Epoch[2] Batch[775] Speed: 1.255881352067585 samples/sec                   batch loss = 1870.6785953044891 | accuracy = 0.6774193548387096


Epoch[2] Batch[780] Speed: 1.253053124675575 samples/sec                   batch loss = 1883.1156455278397 | accuracy = 0.6775641025641026


Epoch[2] Batch[785] Speed: 1.251960972333638 samples/sec                   batch loss = 1894.4855850934982 | accuracy = 0.6780254777070064


[Epoch 2] training: accuracy=0.679251269035533
[Epoch 2] time cost: 645.6047065258026
[Epoch 2] validation: validation accuracy=0.6888888888888889
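
The `batch loss` column in the output above appears to be a running total within each epoch: it grows steadily and starts again near zero at the beginning of Epoch 2. Under that assumption, the following minimal sketch parses a few of the log lines with Python's standard `re` module and recovers the mean loss per batch between consecutive log entries. The helper name `parse_log` and the regular expression are illustrative and not part of MXNet.

```python
import re

# A few lines copied verbatim from the training output above.
log = """\
Epoch[2] Batch[775] Speed: 1.255881352067585 samples/sec                   batch loss = 1870.6785953044891 | accuracy = 0.6774193548387096
Epoch[2] Batch[780] Speed: 1.253053124675575 samples/sec                   batch loss = 1883.1156455278397 | accuracy = 0.6775641025641026
Epoch[2] Batch[785] Speed: 1.251960972333638 samples/sec                   batch loss = 1894.4855850934982 | accuracy = 0.6780254777070064
"""

# Hypothetical pattern matching the log format shown in this tutorial.
LINE_PATTERN = re.compile(
    r"Epoch\[(\d+)\] Batch\[(\d+)\] Speed: ([\d.]+) samples/sec\s+"
    r"batch loss = ([\d.]+) \| accuracy = ([\d.]+)"
)


def parse_log(text):
    """Return one dict per matching log line (illustrative helper)."""
    records = []
    for match in LINE_PATTERN.finditer(text):
        records.append({
            "epoch": int(match.group(1)),
            "batch": int(match.group(2)),
            "speed": float(match.group(3)),
            "loss": float(match.group(4)),      # assumed running total
            "accuracy": float(match.group(5)),  # running accuracy
        })
    return records


records = parse_log(log)

# Assuming "batch loss" is cumulative, the difference between two
# consecutive entries divided by the number of batches between them
# gives the mean loss per batch over that interval.
interval_losses = []
for prev, cur in zip(records, records[1:]):
    n_batches = cur["batch"] - prev["batch"]
    mean_loss = (cur["loss"] - prev["loss"]) / n_batches
    interval_losses.append(mean_loss)
    print(f"batches {prev['batch']}-{cur['batch']}: mean loss {mean_loss:.4f}")
```

Looking at interval averages rather than the running total makes it easier to see whether the loss is still decreasing late in an epoch.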


## Next steps

Now that you have trained a neural network and run predictions with it on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).