<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference more quickly than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[03:20:23] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If you only have one GPU, the code falls back to `npx.cpu()`, so the copy is placed in main memory instead:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[03:20:23] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through zero-based indexing with `npx`. As you saw before, the first GPU is accessed using `npx.gpu(0)` and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has, so if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training, which is particularly useful when you want to leverage multiple GPUs while training neural networks.
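
The zero-based indexing rule can be illustrated without MXNet. The following is a minimal sketch in plain Python; `pick_device_ids` is a hypothetical helper introduced here for illustration and is not part of MXNet. It assumes you already know how many GPUs are available (for example, from `npx.num_gpus()`) and returns the IDs you would pass to `npx.gpu(i)`.

```python
def pick_device_ids(num_available, num_requested):
    """Return the zero-based GPU IDs to use (hypothetical helper, not an MXNet API).

    The result is capped at the number of available GPUs. An empty list
    signals that no GPU is available, so a caller could fall back to the CPU.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, requesting two selects IDs 0 and 1.
print(pick_device_ids(8, 2))
# Requesting more GPUs than exist is capped at what is available.
print(pick_device_ids(1, 2))
# With no GPUs the list is empty.
print(pick_device_ids(0, 2))
```

With MXNet installed, each returned ID `i` would correspond to the device `npx.gpu(i)`.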

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and parameters to the GPU. To demonstrate this, you can reuse the previously defined `LeafNetwork` from [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[03:20:24] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 3.0093381, -3.1293747]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, assuming there are *n* GPUs, you split each data batch into *n* parts and have each GPU run the forward and backward passes on its own part of the data.
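
To make the idea concrete, here is a minimal sketch in plain Python that simulates data parallelism without MXNet or GPUs. The names `split_batch` and `grad_sum`, the toy one-parameter model, and the sample values are all assumptions made for illustration. For simplicity, the sketch requires the batch size to be divisible by the number of parts. It shows that summing the gradients computed on each part gives the same result as computing the gradient on the whole batch.

```python
def split_batch(batch, num_parts):
    """Split a list of samples into num_parts contiguous, equally sized slices."""
    if num_parts <= 0 or len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    step = len(batch) // num_parts
    return [batch[i * step:(i + 1) * step] for i in range(num_parts)]


def grad_sum(samples, w):
    """Gradient with respect to w of sum((w * x - y) ** 2) over the samples."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


# Toy batch of (x, y) pairs and a toy weight; the values are illustrative only.
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 9.0)]
w = 0.5

# Each slice stands in for the part of the batch sent to one GPU.
shards = split_batch(batch, 2)
per_device = [grad_sum(shard, w) for shard in shards]

# Summing the per-device gradients recovers the full-batch gradient.
combined = sum(per_device)
full = grad_sum(batch, w)
assert abs(combined - full) < 1e-12
print(per_device, combined)
```

In the training loop later in this step, `gluon.utils.split_and_load` plays the role of the splitting function and places each part on its device.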

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
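
The helper above updates a single metric with one list of labels and one list of outputs per device. The following plain-Python sketch illustrates that kind of accumulation. `RunningAccuracy` is a hypothetical stand-in written for this example, not the `gluon.metric.Accuracy` implementation, and it assumes each output row is a list of per-class scores whose largest entry is the predicted class.

```python
class RunningAccuracy:
    """Accumulate accuracy over per-device shards (hypothetical stand-in)."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, label_shards, score_shards):
        # Walk the shards in parallel: one list of labels and scores per device.
        for labels, scores in zip(label_shards, score_shards):
            for label, row in zip(labels, scores):
                # The predicted class is the index of the largest score.
                predicted = max(range(len(row)), key=row.__getitem__)
                self.correct += int(predicted == label)
                self.total += 1

    def get(self):
        # Return NaN before any sample has been seen.
        return self.correct / self.total if self.total else float("nan")


acc = RunningAccuracy()
# Two shards of two samples each, as if a batch of four were split over two devices.
labels = [[0, 1], [1, 0]]
scores = [[[2.0, -1.0], [0.1, 0.9]], [[0.3, 0.2], [1.5, -0.5]]]
acc.update(labels, scores)
print(acc.get())
```

Because the counts are kept across calls, calling `update` once per batch yields the accuracy over every sample seen so far, regardless of how many devices each batch was split across.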

The training loop is quite similar to that shown earlier. The major differences are marked with `Diff` comments in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7752040296228829 samples/sec                   batch loss = 13.755680799484253 | accuracy = 0.55


Epoch[1] Batch[10] Speed: 1.2454177834046152 samples/sec                   batch loss = 27.04723882675171 | accuracy = 0.575


Epoch[1] Batch[15] Speed: 1.2526277255800282 samples/sec                   batch loss = 42.04049038887024 | accuracy = 0.5333333333333333


Epoch[1] Batch[20] Speed: 1.2470911677542367 samples/sec                   batch loss = 57.0247745513916 | accuracy = 0.5


Epoch[1] Batch[25] Speed: 1.2464574266754502 samples/sec                   batch loss = 70.84157514572144 | accuracy = 0.52


Epoch[1] Batch[30] Speed: 1.2510755591587739 samples/sec                   batch loss = 86.28740763664246 | accuracy = 0.475


Epoch[1] Batch[35] Speed: 1.253429085904323 samples/sec                   batch loss = 100.06212306022644 | accuracy = 0.4857142857142857


Epoch[1] Batch[40] Speed: 1.2502170172040672 samples/sec                   batch loss = 113.77732849121094 | accuracy = 0.48125


Epoch[1] Batch[45] Speed: 1.2508389206078296 samples/sec                   batch loss = 128.4558343887329 | accuracy = 0.4722222222222222


Epoch[1] Batch[50] Speed: 1.2553386748202895 samples/sec                   batch loss = 142.44361972808838 | accuracy = 0.475


Epoch[1] Batch[55] Speed: 1.2559961496356744 samples/sec                   batch loss = 157.333829164505 | accuracy = 0.4772727272727273


Epoch[1] Batch[60] Speed: 1.2439655066308135 samples/sec                   batch loss = 170.75579261779785 | accuracy = 0.4875


Epoch[1] Batch[65] Speed: 1.2512170067265194 samples/sec                   batch loss = 183.8112714290619 | accuracy = 0.49615384615384617


Epoch[1] Batch[70] Speed: 1.2533854493512393 samples/sec                   batch loss = 197.8695867061615 | accuracy = 0.5071428571428571


Epoch[1] Batch[75] Speed: 1.254000480605548 samples/sec                   batch loss = 211.76610612869263 | accuracy = 0.51


Epoch[1] Batch[80] Speed: 1.2506201772224572 samples/sec                   batch loss = 226.16784930229187 | accuracy = 0.509375


Epoch[1] Batch[85] Speed: 1.2538480952359243 samples/sec                   batch loss = 240.6449031829834 | accuracy = 0.5


Epoch[1] Batch[90] Speed: 1.2458386687542011 samples/sec                   batch loss = 254.25874018669128 | accuracy = 0.5055555555555555


Epoch[1] Batch[95] Speed: 1.2541232780508078 samples/sec                   batch loss = 268.2461462020874 | accuracy = 0.5157894736842106


Epoch[1] Batch[100] Speed: 1.255697306074841 samples/sec                   batch loss = 282.6224377155304 | accuracy = 0.5075


Epoch[1] Batch[105] Speed: 1.249379842103715 samples/sec                   batch loss = 296.37543177604675 | accuracy = 0.5095238095238095


Epoch[1] Batch[110] Speed: 1.2488155311737248 samples/sec                   batch loss = 309.54642391204834 | accuracy = 0.5181818181818182


Epoch[1] Batch[115] Speed: 1.2478324610320484 samples/sec                   batch loss = 323.655410528183 | accuracy = 0.5173913043478261


Epoch[1] Batch[120] Speed: 1.2464687246026074 samples/sec                   batch loss = 338.3295705318451 | accuracy = 0.5125


Epoch[1] Batch[125] Speed: 1.2573823326154272 samples/sec                   batch loss = 351.842312335968 | accuracy = 0.516


Epoch[1] Batch[130] Speed: 1.2570821697961443 samples/sec                   batch loss = 365.03922390937805 | accuracy = 0.5211538461538462


Epoch[1] Batch[135] Speed: 1.2522352338795542 samples/sec                   batch loss = 378.5598886013031 | accuracy = 0.5203703703703704


Epoch[1] Batch[140] Speed: 1.2562377537676914 samples/sec                   batch loss = 392.2825880050659 | accuracy = 0.525


Epoch[1] Batch[145] Speed: 1.2519561142683848 samples/sec                   batch loss = 406.5701971054077 | accuracy = 0.5206896551724138


Epoch[1] Batch[150] Speed: 1.2564413414130375 samples/sec                   batch loss = 420.3925561904907 | accuracy = 0.5233333333333333


Epoch[1] Batch[155] Speed: 1.2537115796562526 samples/sec                   batch loss = 434.30390191078186 | accuracy = 0.5241935483870968


Epoch[1] Batch[160] Speed: 1.2584160034071379 samples/sec                   batch loss = 447.7418010234833 | accuracy = 0.5296875


Epoch[1] Batch[165] Speed: 1.2548424537422576 samples/sec                   batch loss = 461.59136056900024 | accuracy = 0.5287878787878788


Epoch[1] Batch[170] Speed: 1.257788619112518 samples/sec                   batch loss = 475.78657817840576 | accuracy = 0.5264705882352941


Epoch[1] Batch[175] Speed: 1.258881521017243 samples/sec                   batch loss = 489.09828758239746 | accuracy = 0.5314285714285715


Epoch[1] Batch[180] Speed: 1.2567527775062353 samples/sec                   batch loss = 502.8732531070709 | accuracy = 0.5319444444444444


Epoch[1] Batch[185] Speed: 1.2506190585271302 samples/sec                   batch loss = 516.4080548286438 | accuracy = 0.5378378378378378


Epoch[1] Batch[190] Speed: 1.2565476772704034 samples/sec                   batch loss = 530.3597269058228 | accuracy = 0.5355263157894737


Epoch[1] Batch[195] Speed: 1.2617567410650092 samples/sec                   batch loss = 544.1048781871796 | accuracy = 0.5346153846153846


Epoch[1] Batch[200] Speed: 1.2593903959375743 samples/sec                   batch loss = 557.8402278423309 | accuracy = 0.53875


Epoch[1] Batch[205] Speed: 1.2588760423373688 samples/sec                   batch loss = 571.2031753063202 | accuracy = 0.5414634146341464


Epoch[1] Batch[210] Speed: 1.25605256890775 samples/sec                   batch loss = 585.0086243152618 | accuracy = 0.5404761904761904


Epoch[1] Batch[215] Speed: 1.2558647123925686 samples/sec                   batch loss = 599.0232057571411 | accuracy = 0.5383720930232558


Epoch[1] Batch[220] Speed: 1.2469128397816038 samples/sec                   batch loss = 612.4663045406342 | accuracy = 0.5409090909090909


Epoch[1] Batch[225] Speed: 1.254036286231723 samples/sec                   batch loss = 626.3774590492249 | accuracy = 0.5388888888888889


Epoch[1] Batch[230] Speed: 1.2455723801722753 samples/sec                   batch loss = 640.2829194068909 | accuracy = 0.5369565217391304


Epoch[1] Batch[235] Speed: 1.2517822764028312 samples/sec                   batch loss = 654.0645101070404 | accuracy = 0.5393617021276595


Epoch[1] Batch[240] Speed: 1.2479596230715062 samples/sec                   batch loss = 667.4814507961273 | accuracy = 0.5416666666666666


Epoch[1] Batch[245] Speed: 1.2512813964887304 samples/sec                   batch loss = 680.3109912872314 | accuracy = 0.5459183673469388


Epoch[1] Batch[250] Speed: 1.250928080949521 samples/sec                   batch loss = 693.7297909259796 | accuracy = 0.548


Epoch[1] Batch[255] Speed: 1.2543312449015647 samples/sec                   batch loss = 707.7304270267487 | accuracy = 0.5480392156862746


Epoch[1] Batch[260] Speed: 1.2509181010745514 samples/sec                   batch loss = 721.6067242622375 | accuracy = 0.55


Epoch[1] Batch[265] Speed: 1.2544400377231888 samples/sec                   batch loss = 735.5685641765594 | accuracy = 0.5490566037735849


Epoch[1] Batch[270] Speed: 1.2490912990821528 samples/sec                   batch loss = 749.184716463089 | accuracy = 0.549074074074074


Epoch[1] Batch[275] Speed: 1.2538382561252839 samples/sec                   batch loss = 762.9152674674988 | accuracy = 0.5490909090909091


Epoch[1] Batch[280] Speed: 1.2586722315512964 samples/sec                   batch loss = 776.4531502723694 | accuracy = 0.5491071428571429


Epoch[1] Batch[285] Speed: 1.257931683124963 samples/sec                   batch loss = 789.6883480548859 | accuracy = 0.5517543859649123


Epoch[1] Batch[290] Speed: 1.2511799622868474 samples/sec                   batch loss = 803.755865573883 | accuracy = 0.5482758620689655


Epoch[1] Batch[295] Speed: 1.2563024731849481 samples/sec                   batch loss = 818.7179582118988 | accuracy = 0.5432203389830509


Epoch[1] Batch[300] Speed: 1.2592563570348494 samples/sec                   batch loss = 831.9224662780762 | accuracy = 0.5458333333333333


Epoch[1] Batch[305] Speed: 1.2549660732664252 samples/sec                   batch loss = 845.97780585289 | accuracy = 0.5434426229508197


Epoch[1] Batch[310] Speed: 1.2569855376973553 samples/sec                   batch loss = 859.8643555641174 | accuracy = 0.5443548387096774


Epoch[1] Batch[315] Speed: 1.2524494002623563 samples/sec                   batch loss = 873.1474237442017 | accuracy = 0.546031746031746


Epoch[1] Batch[320] Speed: 1.253532758311132 samples/sec                   batch loss = 886.6222562789917 | accuracy = 0.546875


Epoch[1] Batch[325] Speed: 1.252416583463193 samples/sec                   batch loss = 900.4532523155212 | accuracy = 0.5461538461538461


Epoch[1] Batch[330] Speed: 1.257840578612712 samples/sec                   batch loss = 914.1279149055481 | accuracy = 0.546969696969697


Epoch[1] Batch[335] Speed: 1.2545895648536385 samples/sec                   batch loss = 927.4925315380096 | accuracy = 0.5470149253731343


Epoch[1] Batch[340] Speed: 1.2538976679612306 samples/sec                   batch loss = 941.5246999263763 | accuracy = 0.5448529411764705


Epoch[1] Batch[345] Speed: 1.254755080600728 samples/sec                   batch loss = 955.330573797226 | accuracy = 0.5449275362318841


Epoch[1] Batch[350] Speed: 1.2505834477717621 samples/sec                   batch loss = 968.4939453601837 | accuracy = 0.5478571428571428


Epoch[1] Batch[355] Speed: 1.2530658527407783 samples/sec                   batch loss = 981.4077467918396 | accuracy = 0.5507042253521127


Epoch[1] Batch[360] Speed: 1.254560857382637 samples/sec                   batch loss = 994.7690052986145 | accuracy = 0.5513888888888889


Epoch[1] Batch[365] Speed: 1.2515623626913444 samples/sec                   batch loss = 1008.922310590744 | accuracy = 0.5493150684931507


Epoch[1] Batch[370] Speed: 1.2495763732184904 samples/sec                   batch loss = 1022.8514750003815 | accuracy = 0.55


Epoch[1] Batch[375] Speed: 1.2538300101324482 samples/sec                   batch loss = 1035.9095845222473 | accuracy = 0.5533333333333333


Epoch[1] Batch[380] Speed: 1.249341138716931 samples/sec                   batch loss = 1049.58629155159 | accuracy = 0.5513157894736842


Epoch[1] Batch[385] Speed: 1.2496928135655436 samples/sec                   batch loss = 1062.7992119789124 | accuracy = 0.5532467532467532


Epoch[1] Batch[390] Speed: 1.2519075356893894 samples/sec                   batch loss = 1075.516309261322 | accuracy = 0.5570512820512821


Epoch[1] Batch[395] Speed: 1.2479051351412465 samples/sec                   batch loss = 1088.8606266975403 | accuracy = 0.5575949367088607


Epoch[1] Batch[400] Speed: 1.2546414480276875 samples/sec                   batch loss = 1102.7379505634308 | accuracy = 0.5575


Epoch[1] Batch[405] Speed: 1.254465644294442 samples/sec                   batch loss = 1116.1614763736725 | accuracy = 0.5592592592592592


Epoch[1] Batch[410] Speed: 1.2523362783794916 samples/sec                   batch loss = 1129.3156061172485 | accuracy = 0.5603658536585366


Epoch[1] Batch[415] Speed: 1.2526410996878032 samples/sec                   batch loss = 1142.7380239963531 | accuracy = 0.5614457831325301


Epoch[1] Batch[420] Speed: 1.2512032897638987 samples/sec                   batch loss = 1156.1505398750305 | accuracy = 0.5613095238095238


Epoch[1] Batch[425] Speed: 1.2561072064002319 samples/sec                   batch loss = 1169.6986639499664 | accuracy = 0.5617647058823529


Epoch[1] Batch[430] Speed: 1.2507359730130183 samples/sec                   batch loss = 1182.8617579936981 | accuracy = 0.5627906976744186


Epoch[1] Batch[435] Speed: 1.2584084522075345 samples/sec                   batch loss = 1197.318511724472 | accuracy = 0.5626436781609195


Epoch[1] Batch[440] Speed: 1.256412078684817 samples/sec                   batch loss = 1210.6573641300201 | accuracy = 0.5625


Epoch[1] Batch[445] Speed: 1.2509341435681398 samples/sec                   batch loss = 1223.9064900875092 | accuracy = 0.5634831460674158


Epoch[1] Batch[450] Speed: 1.251709150020077 samples/sec                   batch loss = 1237.2986571788788 | accuracy = 0.565


Epoch[1] Batch[455] Speed: 1.2545856245347335 samples/sec                   batch loss = 1251.3834507465363 | accuracy = 0.5631868131868132


Epoch[1] Batch[460] Speed: 1.2608942847284648 samples/sec                   batch loss = 1264.6631588935852 | accuracy = 0.5646739130434782


Epoch[1] Batch[465] Speed: 1.2497222295883814 samples/sec                   batch loss = 1277.4193296432495 | accuracy = 0.5661290322580645


Epoch[1] Batch[470] Speed: 1.2519950732758798 samples/sec                   batch loss = 1289.6974170207977 | accuracy = 0.5686170212765957


Epoch[1] Batch[475] Speed: 1.2520274942503895 samples/sec                   batch loss = 1303.1850743293762 | accuracy = 0.5689473684210526


Epoch[1] Batch[480] Speed: 1.2585156877358 samples/sec                   batch loss = 1316.1661303043365 | accuracy = 0.5697916666666667


Epoch[1] Batch[485] Speed: 1.2539285006124565 samples/sec                   batch loss = 1328.5654528141022 | accuracy = 0.5721649484536082


Epoch[1] Batch[490] Speed: 1.250544203712422 samples/sec                   batch loss = 1341.1500549316406 | accuracy = 0.5729591836734694


Epoch[1] Batch[495] Speed: 1.2496442243076917 samples/sec                   batch loss = 1354.6717314720154 | accuracy = 0.5742424242424242


Epoch[1] Batch[500] Speed: 1.2538704915016372 samples/sec                   batch loss = 1368.136120557785 | accuracy = 0.575


Epoch[1] Batch[505] Speed: 1.2479504331210722 samples/sec                   batch loss = 1381.0551190376282 | accuracy = 0.5757425742574257


Epoch[1] Batch[510] Speed: 1.2535221748856475 samples/sec                   batch loss = 1393.6078305244446 | accuracy = 0.5774509803921568


Epoch[1] Batch[515] Speed: 1.252017683691883 samples/sec                   batch loss = 1406.614575624466 | accuracy = 0.5771844660194175


Epoch[1] Batch[520] Speed: 1.2548834698214397 samples/sec                   batch loss = 1420.2836246490479 | accuracy = 0.5774038461538461


Epoch[1] Batch[525] Speed: 1.2496098790083066 samples/sec                   batch loss = 1432.9433472156525 | accuracy = 0.5780952380952381


Epoch[1] Batch[530] Speed: 1.2487651511401712 samples/sec                   batch loss = 1444.9728642702103 | accuracy = 0.5787735849056603


Epoch[1] Batch[535] Speed: 1.25379271711028 samples/sec                   batch loss = 1457.788948893547 | accuracy = 0.5794392523364486


Epoch[1] Batch[540] Speed: 1.2531034769159157 samples/sec                   batch loss = 1470.3727797269821 | accuracy = 0.5800925925925926


Epoch[1] Batch[545] Speed: 1.2535689118912374 samples/sec                   batch loss = 1483.0569750070572 | accuracy = 0.581651376146789


Epoch[1] Batch[550] Speed: 1.248299653535644 samples/sec                   batch loss = 1497.4918781518936 | accuracy = 0.5813636363636364


Epoch[1] Batch[555] Speed: 1.2472393188526973 samples/sec                   batch loss = 1509.7200516462326 | accuracy = 0.5828828828828829


Epoch[1] Batch[560] Speed: 1.2544167769821934 samples/sec                   batch loss = 1521.8650068044662 | accuracy = 0.5839285714285715


Epoch[1] Batch[565] Speed: 1.2533458419776626 samples/sec                   batch loss = 1533.8396438360214 | accuracy = 0.5858407079646017


Epoch[1] Batch[570] Speed: 1.2586812023705545 samples/sec                   batch loss = 1545.7393695116043 | accuracy = 0.587280701754386


Epoch[1] Batch[575] Speed: 1.255925820707425 samples/sec                   batch loss = 1559.5944019556046 | accuracy = 0.5869565217391305


Epoch[1] Batch[580] Speed: 1.250206210204076 samples/sec                   batch loss = 1572.5858463048935 | accuracy = 0.5875


Epoch[1] Batch[585] Speed: 1.2512868092634455 samples/sec                   batch loss = 1585.36474275589 | accuracy = 0.588034188034188


Epoch[1] Batch[590] Speed: 1.246214478200738 samples/sec                   batch loss = 1598.2929992675781 | accuracy = 0.5885593220338983


Epoch[1] Batch[595] Speed: 1.254520894267391 samples/sec                   batch loss = 1610.9612028598785 | accuracy = 0.5890756302521009


Epoch[1] Batch[600] Speed: 1.2554731023009202 samples/sec                   batch loss = 1623.0084595680237 | accuracy = 0.5904166666666667


Epoch[1] Batch[605] Speed: 1.2554579765977625 samples/sec                   batch loss = 1635.2550971508026 | accuracy = 0.5909090909090909


Epoch[1] Batch[610] Speed: 1.262005123327312 samples/sec                   batch loss = 1646.803748011589 | accuracy = 0.5926229508196721


Epoch[1] Batch[615] Speed: 1.258526922092508 samples/sec                   batch loss = 1658.2765057086945 | accuracy = 0.5939024390243902


Epoch[1] Batch[620] Speed: 1.2556087801824571 samples/sec                   batch loss = 1670.147872686386 | accuracy = 0.5951612903225807


Epoch[1] Batch[625] Speed: 1.2521804654152386 samples/sec                   batch loss = 1683.3008000850677 | accuracy = 0.5948


Epoch[1] Batch[630] Speed: 1.256037899409407 samples/sec                   batch loss = 1698.0557389259338 | accuracy = 0.5936507936507937


Epoch[1] Batch[635] Speed: 1.258402600090173 samples/sec                   batch loss = 1710.4949615001678 | accuracy = 0.5940944881889764


Epoch[1] Batch[640] Speed: 1.249575442527755 samples/sec                   batch loss = 1722.9190654754639 | accuracy = 0.5953125


Epoch[1] Batch[645] Speed: 1.2545933175613209 samples/sec                   batch loss = 1736.1538746356964 | accuracy = 0.5957364341085272


Epoch[1] Batch[650] Speed: 1.2533170977682324 samples/sec                   batch loss = 1748.2282090187073 | accuracy = 0.5957692307692307


Epoch[1] Batch[655] Speed: 1.2499869429938937 samples/sec                   batch loss = 1761.6216535568237 | accuracy = 0.5958015267175573


Epoch[1] Batch[660] Speed: 1.2560686493284459 samples/sec                   batch loss = 1774.7693803310394 | accuracy = 0.5962121212121212


Epoch[1] Batch[665] Speed: 1.2506972787974615 samples/sec                   batch loss = 1786.835908293724 | accuracy = 0.5969924812030075


Epoch[1] Batch[670] Speed: 1.2637655797383105 samples/sec                   batch loss = 1801.128807425499 | accuracy = 0.5962686567164179


Epoch[1] Batch[675] Speed: 1.2530785810645586 samples/sec                   batch loss = 1815.8838320970535 | accuracy = 0.5955555555555555


Epoch[1] Batch[680] Speed: 1.254599884853786 samples/sec                   batch loss = 1828.2745405435562 | accuracy = 0.5959558823529412


Epoch[1] Batch[685] Speed: 1.260191253362257 samples/sec                   batch loss = 1842.5884016752243 | accuracy = 0.5956204379562043


Epoch[1] Batch[690] Speed: 1.251964522482089 samples/sec                   batch loss = 1854.880275607109 | accuracy = 0.5956521739130435


Epoch[1] Batch[695] Speed: 1.2529689949244842 samples/sec                   batch loss = 1866.7761994600296 | accuracy = 0.5967625899280575


Epoch[1] Batch[700] Speed: 1.2498955888350158 samples/sec                   batch loss = 1878.5595635175705 | accuracy = 0.5978571428571429


Epoch[1] Batch[705] Speed: 1.2547774154341245 samples/sec                   batch loss = 1894.2597893476486 | accuracy = 0.5982269503546099


Epoch[1] Batch[710] Speed: 1.2549106902099452 samples/sec                   batch loss = 1906.9401260614395 | accuracy = 0.598943661971831


Epoch[1] Batch[715] Speed: 1.2540284125678685 samples/sec                   batch loss = 1919.3616203069687 | accuracy = 0.5993006993006993


Epoch[1] Batch[720] Speed: 1.2485697113702998 samples/sec                   batch loss = 1934.5710607767105 | accuracy = 0.5982638888888889


Epoch[1] Batch[725] Speed: 1.2468399105515173 samples/sec                   batch loss = 1946.9699751138687 | accuracy = 0.5989655172413794


Epoch[1] Batch[730] Speed: 1.2548062266666786 samples/sec                   batch loss = 1961.8135682344437 | accuracy = 0.5986301369863014


Epoch[1] Batch[735] Speed: 1.2581074219583375 samples/sec                   batch loss = 1973.6394387483597 | accuracy = 0.5996598639455782


Epoch[1] Batch[740] Speed: 1.246034827625032 samples/sec                   batch loss = 1986.4961179494858 | accuracy = 0.5983108108108108


Epoch[1] Batch[745] Speed: 1.255837826617865 samples/sec                   batch loss = 2000.3220032453537 | accuracy = 0.5979865771812081


Epoch[1] Batch[750] Speed: 1.249570323753492 samples/sec                   batch loss = 2012.29206097126 | accuracy = 0.5986666666666667


Epoch[1] Batch[755] Speed: 1.2497505298540306 samples/sec                   batch loss = 2024.5527898073196 | accuracy = 0.5993377483443708


Epoch[1] Batch[760] Speed: 1.2512940885860357 samples/sec                   batch loss = 2037.5732680559158 | accuracy = 0.5996710526315789


Epoch[1] Batch[765] Speed: 1.2543127707537234 samples/sec                   batch loss = 2053.2560847997665 | accuracy = 0.5986928104575163


Epoch[1] Batch[770] Speed: 1.2521331777286013 samples/sec                   batch loss = 2065.7329584360123 | accuracy = 0.599025974025974


Epoch[1] Batch[775] Speed: 1.2572947938758534 samples/sec                   batch loss = 2079.38119328022 | accuracy = 0.597741935483871


Epoch[1] Batch[780] Speed: 1.2505962189350814 samples/sec                   batch loss = 2092.482768893242 | accuracy = 0.5977564102564102


Epoch[1] Batch[785] Speed: 1.2522583203160453 samples/sec                   batch loss = 2105.0686625242233 | accuracy = 0.5984076433121019


[Epoch 1] training: accuracy=0.5989847715736041
[Epoch 1] time cost: 647.0018103122711
[Epoch 1] validation: validation accuracy=0.6722222222222223


Epoch[2] Batch[5] Speed: 1.2465148444952445 samples/sec                   batch loss = 11.19411587715149 | accuracy = 0.8


Epoch[2] Batch[10] Speed: 1.2505420598086383 samples/sec                   batch loss = 22.681196451187134 | accuracy = 0.725


Epoch[2] Batch[15] Speed: 1.2536236145365862 samples/sec                   batch loss = 34.740025997161865 | accuracy = 0.7


Epoch[2] Batch[20] Speed: 1.242654843043611 samples/sec                   batch loss = 46.97691369056702 | accuracy = 0.675


Epoch[2] Batch[25] Speed: 1.2456971396334489 samples/sec                   batch loss = 61.037177085876465 | accuracy = 0.61


Epoch[2] Batch[30] Speed: 1.2488645208239169 samples/sec                   batch loss = 74.92750263214111 | accuracy = 0.6083333333333333


Epoch[2] Batch[35] Speed: 1.2485834635707376 samples/sec                   batch loss = 89.08564758300781 | accuracy = 0.6071428571428571


Epoch[2] Batch[40] Speed: 1.2518626972701135 samples/sec                   batch loss = 102.05147242546082 | accuracy = 0.6125


Epoch[2] Batch[45] Speed: 1.2522860812474421 samples/sec                   batch loss = 114.31065630912781 | accuracy = 0.6111111111111112


Epoch[2] Batch[50] Speed: 1.2557519126636052 samples/sec                   batch loss = 127.02881264686584 | accuracy = 0.615


Epoch[2] Batch[55] Speed: 1.2508141146557636 samples/sec                   batch loss = 141.25327372550964 | accuracy = 0.6136363636363636


Epoch[2] Batch[60] Speed: 1.2529081739985284 samples/sec                   batch loss = 156.89188361167908 | accuracy = 0.5958333333333333


Epoch[2] Batch[65] Speed: 1.2544829035256222 samples/sec                   batch loss = 169.08815455436707 | accuracy = 0.5923076923076923


Epoch[2] Batch[70] Speed: 1.2535703168652015 samples/sec                   batch loss = 181.2538983821869 | accuracy = 0.6035714285714285


Epoch[2] Batch[75] Speed: 1.2539660828852164 samples/sec                   batch loss = 192.67764043807983 | accuracy = 0.6166666666666667


Epoch[2] Batch[80] Speed: 1.2510698683362298 samples/sec                   batch loss = 204.33799612522125 | accuracy = 0.621875


Epoch[2] Batch[85] Speed: 1.2454707599122294 samples/sec                   batch loss = 217.1349719762802 | accuracy = 0.6235294117647059


Epoch[2] Batch[90] Speed: 1.2461241373819134 samples/sec                   batch loss = 230.35255193710327 | accuracy = 0.6194444444444445


Epoch[2] Batch[95] Speed: 1.250651128586854 samples/sec                   batch loss = 243.9948377609253 | accuracy = 0.6210526315789474


Epoch[2] Batch[100] Speed: 1.2548320358974425 samples/sec                   batch loss = 255.2754179239273 | accuracy = 0.6275


Epoch[2] Batch[105] Speed: 1.252171213195507 samples/sec                   batch loss = 267.1136473417282 | accuracy = 0.6333333333333333


Epoch[2] Batch[110] Speed: 1.2588573396170593 samples/sec                   batch loss = 277.34613609313965 | accuracy = 0.6409090909090909


Epoch[2] Batch[115] Speed: 1.2562682312671904 samples/sec                   batch loss = 288.5773878097534 | accuracy = 0.6456521739130435


Epoch[2] Batch[120] Speed: 1.255838484647559 samples/sec                   batch loss = 300.14944291114807 | accuracy = 0.6479166666666667


Epoch[2] Batch[125] Speed: 1.2560532271625031 samples/sec                   batch loss = 311.9387261867523 | accuracy = 0.646


Epoch[2] Batch[130] Speed: 1.2524625835633068 samples/sec                   batch loss = 323.367800951004 | accuracy = 0.6442307692307693


Epoch[2] Batch[135] Speed: 1.2537790373115711 samples/sec                   batch loss = 335.0249924659729 | accuracy = 0.6462962962962963


Epoch[2] Batch[140] Speed: 1.2557740010149654 samples/sec                   batch loss = 349.90950298309326 | accuracy = 0.6392857142857142


Epoch[2] Batch[145] Speed: 1.251723905358148 samples/sec                   batch loss = 361.24442744255066 | accuracy = 0.6413793103448275


Epoch[2] Batch[150] Speed: 1.2517633168905815 samples/sec                   batch loss = 375.32876312732697 | accuracy = 0.6383333333333333


Epoch[2] Batch[155] Speed: 1.26100175479698 samples/sec                   batch loss = 386.27242505550385 | accuracy = 0.6403225806451613


Epoch[2] Batch[160] Speed: 1.2547793861928926 samples/sec                   batch loss = 398.0620299577713 | accuracy = 0.6421875


Epoch[2] Batch[165] Speed: 1.2493038331546686 samples/sec                   batch loss = 411.19976937770844 | accuracy = 0.6409090909090909


Epoch[2] Batch[170] Speed: 1.251600176745125 samples/sec                   batch loss = 423.00908160209656 | accuracy = 0.6426470588235295


Epoch[2] Batch[175] Speed: 1.248373682535152 samples/sec                   batch loss = 433.5160596370697 | accuracy = 0.6471428571428571


Epoch[2] Batch[180] Speed: 1.2592965279006363 samples/sec                   batch loss = 445.2310446500778 | accuracy = 0.65


Epoch[2] Batch[185] Speed: 1.2539140681188623 samples/sec                   batch loss = 458.88504219055176 | accuracy = 0.6472972972972973


Epoch[2] Batch[190] Speed: 1.2499022001556603 samples/sec                   batch loss = 470.4377874135971 | accuracy = 0.6473684210526316


Epoch[2] Batch[195] Speed: 1.246487894504959 samples/sec                   batch loss = 480.52326345443726 | accuracy = 0.6487179487179487


Epoch[2] Batch[200] Speed: 1.2502540046438473 samples/sec                   batch loss = 492.3882269859314 | accuracy = 0.6525


Epoch[2] Batch[205] Speed: 1.257656617912017 samples/sec                   batch loss = 504.5446263551712 | accuracy = 0.6536585365853659


Epoch[2] Batch[210] Speed: 1.2580192162454478 samples/sec                   batch loss = 514.7301598787308 | accuracy = 0.6547619047619048


Epoch[2] Batch[215] Speed: 1.250049529696993 samples/sec                   batch loss = 528.657998919487 | accuracy = 0.6523255813953488


Epoch[2] Batch[220] Speed: 1.2539453701819263 samples/sec                   batch loss = 542.2418693304062 | accuracy = 0.65


Epoch[2] Batch[225] Speed: 1.2591286779619295 samples/sec                   batch loss = 553.6805076599121 | accuracy = 0.6511111111111111


Epoch[2] Batch[230] Speed: 1.2507350405941307 samples/sec                   batch loss = 565.65971159935 | accuracy = 0.6489130434782608


Epoch[2] Batch[235] Speed: 1.250176678234583 samples/sec                   batch loss = 579.17253947258 | accuracy = 0.6468085106382979


Epoch[2] Batch[240] Speed: 1.2528424004739787 samples/sec                   batch loss = 590.2906821966171 | accuracy = 0.6479166666666667


Epoch[2] Batch[245] Speed: 1.250558745166884 samples/sec                   batch loss = 601.3112304210663 | accuracy = 0.6489795918367347


Epoch[2] Batch[250] Speed: 1.2543955804452587 samples/sec                   batch loss = 612.0106887817383 | accuracy = 0.65


Epoch[2] Batch[255] Speed: 1.2523945195736286 samples/sec                   batch loss = 625.7498557567596 | accuracy = 0.6509803921568628


Epoch[2] Batch[260] Speed: 1.2528328578051147 samples/sec                   batch loss = 637.7902987003326 | accuracy = 0.6519230769230769


Epoch[2] Batch[265] Speed: 1.2536119991921155 samples/sec                   batch loss = 651.2435584068298 | accuracy = 0.65


Epoch[2] Batch[270] Speed: 1.252899472410429 samples/sec                   batch loss = 661.2055238485336 | accuracy = 0.6527777777777778


Epoch[2] Batch[275] Speed: 1.2524767956595897 samples/sec                   batch loss = 671.4255839586258 | accuracy = 0.6554545454545454


Epoch[2] Batch[280] Speed: 1.2530336587230468 samples/sec                   batch loss = 680.8939592838287 | accuracy = 0.6580357142857143


Epoch[2] Batch[285] Speed: 1.2526367039705242 samples/sec                   batch loss = 690.8815860748291 | accuracy = 0.6596491228070176


Epoch[2] Batch[290] Speed: 1.2499091840210366 samples/sec                   batch loss = 700.9331955909729 | accuracy = 0.6620689655172414


Epoch[2] Batch[295] Speed: 1.2528330449148717 samples/sec                   batch loss = 713.1624853610992 | accuracy = 0.661864406779661


Epoch[2] Batch[300] Speed: 1.2478370087165243 samples/sec                   batch loss = 725.792014837265 | accuracy = 0.6591666666666667


Epoch[2] Batch[305] Speed: 1.2522532730062097 samples/sec                   batch loss = 740.6088650226593 | accuracy = 0.6573770491803279


Epoch[2] Batch[310] Speed: 1.25055249975719 samples/sec                   batch loss = 751.599752664566 | accuracy = 0.6580645161290323


Epoch[2] Batch[315] Speed: 1.2548545611846476 samples/sec                   batch loss = 763.0326752662659 | accuracy = 0.6587301587301587


Epoch[2] Batch[320] Speed: 1.2539563356454904 samples/sec                   batch loss = 777.4337418079376 | accuracy = 0.65625


Epoch[2] Batch[325] Speed: 1.2470147881993798 samples/sec                   batch loss = 792.0225059986115 | accuracy = 0.6538461538461539


Epoch[2] Batch[330] Speed: 1.2531218218604612 samples/sec                   batch loss = 803.0538778305054 | accuracy = 0.6530303030303031


Epoch[2] Batch[335] Speed: 1.2483095916740155 samples/sec                   batch loss = 819.2768615484238 | accuracy = 0.6514925373134328


Epoch[2] Batch[340] Speed: 1.2506343475365012 samples/sec                   batch loss = 830.7044483423233 | accuracy = 0.6514705882352941


Epoch[2] Batch[345] Speed: 1.2513578329389807 samples/sec                   batch loss = 843.6490551233292 | accuracy = 0.6492753623188405


Epoch[2] Batch[350] Speed: 1.2418944060620425 samples/sec                   batch loss = 854.4722487926483 | accuracy = 0.65


Epoch[2] Batch[355] Speed: 1.2485288282391156 samples/sec                   batch loss = 865.6836577653885 | accuracy = 0.6514084507042254


Epoch[2] Batch[360] Speed: 1.2479388298416365 samples/sec                   batch loss = 875.6015932559967 | accuracy = 0.6534722222222222


Epoch[2] Batch[365] Speed: 1.2534251528715221 samples/sec                   batch loss = 887.289128780365 | accuracy = 0.6547945205479452


Epoch[2] Batch[370] Speed: 1.249670100875093 samples/sec                   batch loss = 902.107180595398 | accuracy = 0.6547297297297298


Epoch[2] Batch[375] Speed: 1.2497918654398894 samples/sec                   batch loss = 914.6133878231049 | accuracy = 0.6546666666666666


Epoch[2] Batch[380] Speed: 1.2485105246139896 samples/sec                   batch loss = 925.6335682868958 | accuracy = 0.655921052631579


Epoch[2] Batch[385] Speed: 1.2526508264910177 samples/sec                   batch loss = 938.5846565961838 | accuracy = 0.6564935064935065


Epoch[2] Batch[390] Speed: 1.24732101170208 samples/sec                   batch loss = 948.9881248474121 | accuracy = 0.6583333333333333


Epoch[2] Batch[395] Speed: 1.2397357343368534 samples/sec                   batch loss = 963.2814239263535 | accuracy = 0.6563291139240506


Epoch[2] Batch[400] Speed: 1.2519762941710162 samples/sec                   batch loss = 976.7028455734253 | accuracy = 0.65625


Epoch[2] Batch[405] Speed: 1.249932743502605 samples/sec                   batch loss = 987.9738457202911 | accuracy = 0.6561728395061729


Epoch[2] Batch[410] Speed: 1.2483846436602342 samples/sec                   batch loss = 998.387877702713 | accuracy = 0.6579268292682927


Epoch[2] Batch[415] Speed: 1.2468110933471739 samples/sec                   batch loss = 1009.8031964302063 | accuracy = 0.6596385542168675


Epoch[2] Batch[420] Speed: 1.2508779966407084 samples/sec                   batch loss = 1018.71814930439 | accuracy = 0.6625


Epoch[2] Batch[425] Speed: 1.2496974679073152 samples/sec                   batch loss = 1029.0540654659271 | accuracy = 0.6641176470588235


Epoch[2] Batch[430] Speed: 1.2504100840619223 samples/sec                   batch loss = 1041.9704231023788 | accuracy = 0.6651162790697674


Epoch[2] Batch[435] Speed: 1.2482903656989075 samples/sec                   batch loss = 1055.7301223278046 | accuracy = 0.6637931034482759


Epoch[2] Batch[440] Speed: 1.2443125000509896 samples/sec                   batch loss = 1071.5159089565277 | accuracy = 0.6619318181818182


Epoch[2] Batch[445] Speed: 1.246654799436579 samples/sec                   batch loss = 1088.490736246109 | accuracy = 0.6589887640449438


Epoch[2] Batch[450] Speed: 1.2514460404479577 samples/sec                   batch loss = 1099.189883351326 | accuracy = 0.6588888888888889


Epoch[2] Batch[455] Speed: 1.2424314997836863 samples/sec                   batch loss = 1113.8934215307236 | accuracy = 0.6576923076923077


Epoch[2] Batch[460] Speed: 1.2520151610016887 samples/sec                   batch loss = 1124.746345281601 | accuracy = 0.658695652173913


Epoch[2] Batch[465] Speed: 1.2593907740848536 samples/sec                   batch loss = 1135.1627695560455 | accuracy = 0.6596774193548387


Epoch[2] Batch[470] Speed: 1.2511124108201448 samples/sec                   batch loss = 1147.243285894394 | accuracy = 0.6590425531914894


Epoch[2] Batch[475] Speed: 1.253397060497874 samples/sec                   batch loss = 1155.4385796785355 | accuracy = 0.6621052631578948


Epoch[2] Batch[480] Speed: 1.259044486316163 samples/sec                   batch loss = 1168.52119743824 | accuracy = 0.6619791666666667


Epoch[2] Batch[485] Speed: 1.2582773589490515 samples/sec                   batch loss = 1180.6300321817398 | accuracy = 0.6623711340206185


Epoch[2] Batch[490] Speed: 1.2523951739998436 samples/sec                   batch loss = 1191.3956087827682 | accuracy = 0.6622448979591836


Epoch[2] Batch[495] Speed: 1.256216401585731 samples/sec                   batch loss = 1202.1304388046265 | accuracy = 0.6631313131313131


Epoch[2] Batch[500] Speed: 1.2556574584808364 samples/sec                   batch loss = 1214.5769625902176 | accuracy = 0.663


Epoch[2] Batch[505] Speed: 1.2563390689461946 samples/sec                   batch loss = 1224.9152116775513 | accuracy = 0.6638613861386139


Epoch[2] Batch[510] Speed: 1.2556876258843008 samples/sec                   batch loss = 1238.9720585346222 | accuracy = 0.6642156862745098


Epoch[2] Batch[515] Speed: 1.257478460234189 samples/sec                   batch loss = 1248.9039081335068 | accuracy = 0.666504854368932


Epoch[2] Batch[520] Speed: 1.2568161377399416 samples/sec                   batch loss = 1258.3795457482338 | accuracy = 0.6677884615384615


Epoch[2] Batch[525] Speed: 1.2562739694803562 samples/sec                   batch loss = 1270.0997931361198 | accuracy = 0.6680952380952381


Epoch[2] Batch[530] Speed: 1.253492579706368 samples/sec                   batch loss = 1284.817641556263 | accuracy = 0.6674528301886793


Epoch[2] Batch[535] Speed: 1.251815340177577 samples/sec                   batch loss = 1296.2085509896278 | accuracy = 0.6691588785046729


Epoch[2] Batch[540] Speed: 1.2550517858355799 samples/sec                   batch loss = 1309.2920139431953 | accuracy = 0.6685185185185185


Epoch[2] Batch[545] Speed: 1.2549502088066047 samples/sec                   batch loss = 1319.4520558714867 | accuracy = 0.6688073394495413


Epoch[2] Batch[550] Speed: 1.2540319744511748 samples/sec                   batch loss = 1333.5871856808662 | accuracy = 0.6681818181818182


Epoch[2] Batch[555] Speed: 1.251798714671389 samples/sec                   batch loss = 1342.693381011486 | accuracy = 0.6693693693693694


Epoch[2] Batch[560] Speed: 1.2561797189149688 samples/sec                   batch loss = 1353.4532901644707 | accuracy = 0.6705357142857142


Epoch[2] Batch[565] Speed: 1.2529586081636555 samples/sec                   batch loss = 1362.7948736548424 | accuracy = 0.6716814159292035


Epoch[2] Batch[570] Speed: 1.2493737945414602 samples/sec                   batch loss = 1374.304100215435 | accuracy = 0.6723684210526316


Epoch[2] Batch[575] Speed: 1.2519405126599428 samples/sec                   batch loss = 1387.409937798977 | accuracy = 0.672608695652174


Epoch[2] Batch[580] Speed: 1.250384270030038 samples/sec                   batch loss = 1402.349655687809 | accuracy = 0.6715517241379311


Epoch[2] Batch[585] Speed: 1.2472732557858586 samples/sec                   batch loss = 1409.9313188195229 | accuracy = 0.6735042735042736


Epoch[2] Batch[590] Speed: 1.2527077874383035 samples/sec                   batch loss = 1422.6571412682533 | accuracy = 0.6741525423728814


Epoch[2] Batch[595] Speed: 1.2564181004853439 samples/sec                   batch loss = 1434.8286163210869 | accuracy = 0.6743697478991597


Epoch[2] Batch[600] Speed: 1.2499825658845654 samples/sec                   batch loss = 1447.0325353741646 | accuracy = 0.6741666666666667


Epoch[2] Batch[605] Speed: 1.248262688765029 samples/sec                   batch loss = 1458.5355440974236 | accuracy = 0.6743801652892562


Epoch[2] Batch[610] Speed: 1.2487748178425198 samples/sec                   batch loss = 1470.5475202202797 | accuracy = 0.6741803278688525


Epoch[2] Batch[615] Speed: 1.244095202868627 samples/sec                   batch loss = 1482.5586702227592 | accuracy = 0.6743902439024391


Epoch[2] Batch[620] Speed: 1.244304655732472 samples/sec                   batch loss = 1492.5972593426704 | accuracy = 0.6754032258064516


Epoch[2] Batch[625] Speed: 1.2439994501173544 samples/sec                   batch loss = 1499.5163666009903 | accuracy = 0.6772


Epoch[2] Batch[630] Speed: 1.2440295211663963 samples/sec                   batch loss = 1511.3959095478058 | accuracy = 0.6773809523809524


Epoch[2] Batch[635] Speed: 1.2581602569712558 samples/sec                   batch loss = 1521.3656610250473 | accuracy = 0.6779527559055119


Epoch[2] Batch[640] Speed: 1.2595267327484896 samples/sec                   batch loss = 1532.4488652944565 | accuracy = 0.677734375


Epoch[2] Batch[645] Speed: 1.2547364062873474 samples/sec                   batch loss = 1544.908071756363 | accuracy = 0.6775193798449612


Epoch[2] Batch[650] Speed: 1.2540194142159586 samples/sec                   batch loss = 1555.103993654251 | accuracy = 0.6784615384615384


Epoch[2] Batch[655] Speed: 1.2509030849272258 samples/sec                   batch loss = 1566.6384918689728 | accuracy = 0.6778625954198473


Epoch[2] Batch[660] Speed: 1.2631169699616627 samples/sec                   batch loss = 1574.62765532732 | accuracy = 0.6791666666666667


Epoch[2] Batch[665] Speed: 1.2546089853584224 samples/sec                   batch loss = 1591.0748117566109 | accuracy = 0.6778195488721804


Epoch[2] Batch[670] Speed: 1.2530599566196035 samples/sec                   batch loss = 1601.555876672268 | accuracy = 0.6787313432835821


Epoch[2] Batch[675] Speed: 1.2516242668893804 samples/sec                   batch loss = 1610.8008800148964 | accuracy = 0.68


Epoch[2] Batch[680] Speed: 1.2534081100145997 samples/sec                   batch loss = 1623.078875362873 | accuracy = 0.6790441176470589


Epoch[2] Batch[685] Speed: 1.2443025331691688 samples/sec                   batch loss = 1632.7919333577156 | accuracy = 0.6795620437956205


Epoch[2] Batch[690] Speed: 1.2460116924701223 samples/sec                   batch loss = 1645.4137181639671 | accuracy = 0.6793478260869565


Epoch[2] Batch[695] Speed: 1.247724625655824 samples/sec                   batch loss = 1654.8458889126778 | accuracy = 0.6802158273381295


Epoch[2] Batch[700] Speed: 1.2513696865581936 samples/sec                   batch loss = 1664.1922075152397 | accuracy = 0.6810714285714285


Epoch[2] Batch[705] Speed: 1.2597441574175614 samples/sec                   batch loss = 1677.5087210536003 | accuracy = 0.6808510638297872


Epoch[2] Batch[710] Speed: 1.2531126493210483 samples/sec                   batch loss = 1688.8953110575676 | accuracy = 0.6809859154929577


Epoch[2] Batch[715] Speed: 1.2494271081455834 samples/sec                   batch loss = 1697.8178172707558 | accuracy = 0.6821678321678322


Epoch[2] Batch[720] Speed: 1.2583440819100422 samples/sec                   batch loss = 1711.5381332039833 | accuracy = 0.6805555555555556


Epoch[2] Batch[725] Speed: 1.251052702890031 samples/sec                   batch loss = 1725.8781049847603 | accuracy = 0.68


Epoch[2] Batch[730] Speed: 1.2528073178477679 samples/sec                   batch loss = 1736.9094614386559 | accuracy = 0.6791095890410959


Epoch[2] Batch[735] Speed: 1.2569651959906196 samples/sec                   batch loss = 1748.7100263237953 | accuracy = 0.6789115646258503


Epoch[2] Batch[740] Speed: 1.2501171527322164 samples/sec                   batch loss = 1758.8307405114174 | accuracy = 0.6783783783783783


Epoch[2] Batch[745] Speed: 1.2548524024762542 samples/sec                   batch loss = 1769.5289583802223 | accuracy = 0.6788590604026845


Epoch[2] Batch[750] Speed: 1.2575245501579961 samples/sec                   batch loss = 1779.8283156752586 | accuracy = 0.6793333333333333


Epoch[2] Batch[755] Speed: 1.2504546320821073 samples/sec                   batch loss = 1790.8058909773827 | accuracy = 0.6788079470198676


Epoch[2] Batch[760] Speed: 1.2494271081455834 samples/sec                   batch loss = 1801.8631157279015 | accuracy = 0.6789473684210526


Epoch[2] Batch[765] Speed: 1.2432926459190077 samples/sec                   batch loss = 1812.8428091406822 | accuracy = 0.6794117647058824


Epoch[2] Batch[770] Speed: 1.2463970510612068 samples/sec                   batch loss = 1827.3566544651985 | accuracy = 0.6785714285714286


Epoch[2] Batch[775] Speed: 1.2527577377533998 samples/sec                   batch loss = 1837.956410229206 | accuracy = 0.6790322580645162


Epoch[2] Batch[780] Speed: 1.2547810754196211 samples/sec                   batch loss = 1848.7370976805687 | accuracy = 0.6794871794871795


Epoch[2] Batch[785] Speed: 1.2559881573153386 samples/sec                   batch loss = 1861.22921782732 | accuracy = 0.678343949044586


[Epoch 2] training: accuracy=0.6782994923857868
[Epoch 2] time cost: 645.8074812889099
[Epoch 2] validation: validation accuracy=0.7588888888888888
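
The per-epoch summary lines are easier to compare once they are collected in one place. The following is a minimal sketch, not part of MXNet, that parses the summary lines shown above with Python's standard `re` module. The helper name `summarize` and the dictionary keys `train_acc`, `seconds`, and `val_acc` are hypothetical and chosen only for illustration. The log text is copied from the output above.

```python
import re

# Epoch summary lines copied from the training output above.
log_text = """\
[Epoch 1] training: accuracy=0.5989847715736041
[Epoch 1] time cost: 647.0018103122711
[Epoch 1] validation: validation accuracy=0.6722222222222223
[Epoch 2] training: accuracy=0.6782994923857868
[Epoch 2] time cost: 645.8074812889099
[Epoch 2] validation: validation accuracy=0.7588888888888888
"""

# Matches the three kinds of summary line and captures epoch, label, and value.
PATTERN = re.compile(
    r"\[Epoch (\d+)\] "
    r"(training: accuracy|time cost|validation: validation accuracy)"
    r"[=:]\s*([0-9.]+)"
)

# Hypothetical short names for the labels printed in the log.
SHORT_NAMES = {
    "training: accuracy": "train_acc",
    "time cost": "seconds",
    "validation: validation accuracy": "val_acc",
}


def summarize(text):
    """Return {epoch: {metric_name: value}} for every summary line in text."""
    summary = {}
    for epoch, label, value in PATTERN.findall(text):
        summary.setdefault(int(epoch), {})[SHORT_NAMES[label]] = float(value)
    return summary


summary = summarize(log_text)
for epoch, metrics in sorted(summary.items()):
    print(
        f"epoch {epoch}: "
        f"train={metrics['train_acc']:.3f} "
        f"val={metrics['val_acc']:.3f} "
        f"time={metrics['seconds']:.0f}s"
    )
```

In this run, both training and validation accuracy improve from the first epoch to the second, while the time per epoch stays roughly constant.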


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).