<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and inference on GPUs are often faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code. The example falls back to the CPU when no GPU is available.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[15:32:27] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. In the example code here, you will copy `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to `npx.cpu()`, which is what the output shown below reflects:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[15:32:27] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through `npx` by a zero-based index. As you saw before, the first GPU is accessed using `npx.gpu(0)` and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has, so if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This lets you select which GPUs to use for operations and training, which is particularly useful when you want to use multiple GPUs to train a neural network.
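
The zero-based numbering can be sketched in plain Python. Note that `select_device_ids` is a hypothetical helper written for this illustration and is not part of MXNet; it only computes the integer IDs that you would then pass to `npx.gpu(i)`.

```python
def select_device_ids(num_available, num_requested):
    """Return zero-based device IDs, capped at the number of devices available.

    Hypothetical helper for illustration only; it does not touch any GPU.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# On an eight-GPU machine the IDs run from 0 to 7.
print(select_device_ids(8, 8)[-1])  # 7

# Requesting two GPUs when only one exists yields just the first ID.
print(select_device_ids(1, 2))  # [0]
```

Capping the request at the number of available devices is one possible design choice; it keeps the same code usable on machines with fewer GPUs than requested.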

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch
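
As a rough check on the architecture, the spatial size of the feature map after the three convolutional blocks can be worked out by hand. The sketch below is plain Python and does not call MXNet. It assumes that `Conv2D` uses a stride of 1 with no padding, that `MaxPool2D` uses no padding, and that output sizes are rounded down; these are assumptions about the layer defaults, and `out_size` and `leaf_network_feature_shape` are hypothetical helper names. In `conv_block`, the `stride` argument is passed only to the pooling layer.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def leaf_network_feature_shape(height=128, width=128, filters=(32, 64, 128)):
    """Trace (channels, height, width) through the three conv blocks."""
    for _ in filters:
        # Conv2D with kernel_size=2 (assumed stride 1, no padding)
        height, width = out_size(height, 2), out_size(width, 2)
        # MaxPool2D with pool_size=4 and strides=2 (assumed no padding)
        height, width = out_size(height, 4, 2), out_size(width, 4, 2)
    return filters[-1], height, width


channels, h, w = leaf_network_feature_shape()
print((channels, h, w))    # (128, 13, 13)
print(channels * h * w)    # 21632 values per image after Flatten
```

Under these assumptions, a 128 x 128 input shrinks to 13 x 13 after the third block, so the first dense layer would receive 21632 input features per image.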

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to move parameters that are already initialized to a different device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[15:32:28] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 2.7548363, -0.4405467]], ctx=gpu(0))
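
The two numbers in this output are raw scores, one per class. As a minimal sketch in plain Python, without MXNet, a softmax turns such scores into probabilities. This is the same normalization that a softmax cross-entropy loss applies during training. The `softmax` helper is written here for illustration, and the scores are simply copied from the output above, which came from random input data.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.7548363, -0.4405467]  # copied from the output above
probs = softmax(logits)
predicted_class = probs.index(max(probs))

print(probs)
print(predicted_class)  # index of the larger score
```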

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. To elaborate on what data parallelism is, assume there are *n* GPUs. You can then split each data batch into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
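
The idea behind data parallelism can be illustrated without any GPU. The following sketch, in plain Python, uses a toy one-parameter model and a summed squared-error loss; the model, the data, and the helper names `split_batch` and `grad` are all assumptions made for this example. It shows that the gradients computed on separate parts of a batch add up to the gradient of the whole batch, which is why the parts can be processed on different devices.

```python
def split_batch(samples, num_parts):
    """Split a list into num_parts contiguous chunks of equal size."""
    if len(samples) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    size = len(samples) // num_parts
    return [samples[i * size:(i + 1) * size] for i in range(num_parts)]


def grad(w, samples):
    """Gradient of sum((w * x - y) ** 2) with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in samples)


batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 9.0)]  # (x, y) pairs
w = 0.5

full = grad(w, batch)
# Each part would be handled by a different device in real training.
sharded = sum(grad(w, part) for part in split_batch(batch, 2))

print(full, sharded)  # -88.0 -88.0
```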

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
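
The `Normalize(mean, std)` step in both transformers subtracts a per-channel mean and divides by a per-channel standard deviation, after `ToTensor` has scaled pixel values to the range (0,1). As a small sketch of that arithmetic in plain Python, for a single RGB pixel (the `normalize_pixel` helper is hypothetical and only mirrors the formula):

```python
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]


def normalize_pixel(pixel, mean, std):
    """Apply (value - mean) / std to each channel of one RGB pixel."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel([0.485, 0.456, 0.406], mean, std))  # [0.0, 0.0, 0.0]

# A pure white pixel (1.0 in every channel) maps to positive values.
print(normalize_pixel([1.0, 1.0, 1.0], mean, std))
```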

### Define a helper function
This is the same test function that was defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
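
The helper passes one list of labels and one list of outputs per device to the accuracy metric. As a sketch of what that bookkeeping amounts to, the plain-Python function below counts correct predictions across all parts and divides by the total. The `argmax` and `accuracy` functions and the sample scores are made up for this illustration and are not the Gluon implementation.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(label_lists, output_lists):
    """Fraction of correct predictions over all per-device parts."""
    correct = total = 0
    for labels, outputs in zip(label_lists, output_lists):
        for label, scores in zip(labels, outputs):
            correct += int(argmax(scores) == label)
            total += 1
    return correct / total


# Two parts, as if a batch of four had been split across two devices.
labels = [[0, 1], [1, 0]]
outputs = [[[2.0, -1.0], [0.3, 0.9]], [[1.5, 0.2], [0.8, 0.1]]]

print(accuracy(labels, outputs))  # 0.75
```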

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if the machine has fewer).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7792152613314632 samples/sec                   batch loss = 15.378276824951172 | accuracy = 0.3


Epoch[1] Batch[10] Speed: 1.2556966481930871 samples/sec                   batch loss = 30.528947353363037 | accuracy = 0.325


Epoch[1] Batch[15] Speed: 1.2608402722995147 samples/sec                   batch loss = 44.594969749450684 | accuracy = 0.45


Epoch[1] Batch[20] Speed: 1.259102786178401 samples/sec                   batch loss = 58.665791511535645 | accuracy = 0.4625


Epoch[1] Batch[25] Speed: 1.2607450511302674 samples/sec                   batch loss = 73.02931880950928 | accuracy = 0.47


Epoch[1] Batch[30] Speed: 1.2579326263048718 samples/sec                   batch loss = 88.09531092643738 | accuracy = 0.44166666666666665


Epoch[1] Batch[35] Speed: 1.2606686950920742 samples/sec                   batch loss = 101.97206473350525 | accuracy = 0.45714285714285713


Epoch[1] Batch[40] Speed: 1.2535246099942663 samples/sec                   batch loss = 116.25527882575989 | accuracy = 0.48125


Epoch[1] Batch[45] Speed: 1.2570292369233218 samples/sec                   batch loss = 131.0781545639038 | accuracy = 0.4722222222222222


Epoch[1] Batch[50] Speed: 1.2563665406871578 samples/sec                   batch loss = 145.5557780265808 | accuracy = 0.46


Epoch[1] Batch[55] Speed: 1.2491578884120338 samples/sec                   batch loss = 159.6171350479126 | accuracy = 0.4590909090909091


Epoch[1] Batch[60] Speed: 1.255253862229563 samples/sec                   batch loss = 174.18987798690796 | accuracy = 0.45


Epoch[1] Batch[65] Speed: 1.2517325906248473 samples/sec                   batch loss = 188.7183380126953 | accuracy = 0.45


Epoch[1] Batch[70] Speed: 1.2516275350036405 samples/sec                   batch loss = 202.105318069458 | accuracy = 0.4642857142857143


Epoch[1] Batch[75] Speed: 1.2568614258634734 samples/sec                   batch loss = 215.89716362953186 | accuracy = 0.47


Epoch[1] Batch[80] Speed: 1.2513581129431441 samples/sec                   batch loss = 230.20544695854187 | accuracy = 0.46875


Epoch[1] Batch[85] Speed: 1.2525052208197924 samples/sec                   batch loss = 244.03883266448975 | accuracy = 0.47058823529411764


Epoch[1] Batch[90] Speed: 1.2560688374059776 samples/sec                   batch loss = 257.9337213039398 | accuracy = 0.48055555555555557


Epoch[1] Batch[95] Speed: 1.2581846946655966 samples/sec                   batch loss = 272.0046122074127 | accuracy = 0.4763157894736842


Epoch[1] Batch[100] Speed: 1.251638553343198 samples/sec                   batch loss = 286.03939414024353 | accuracy = 0.48


Epoch[1] Batch[105] Speed: 1.2561864909390938 samples/sec                   batch loss = 299.8793263435364 | accuracy = 0.48095238095238096


Epoch[1] Batch[110] Speed: 1.2586104780130098 samples/sec                   batch loss = 313.5127594470978 | accuracy = 0.48409090909090907


Epoch[1] Batch[115] Speed: 1.2532804906025343 samples/sec                   batch loss = 327.1521511077881 | accuracy = 0.49130434782608695


Epoch[1] Batch[120] Speed: 1.2513485928718875 samples/sec                   batch loss = 341.1174018383026 | accuracy = 0.49166666666666664


Epoch[1] Batch[125] Speed: 1.250413252644694 samples/sec                   batch loss = 355.44977736473083 | accuracy = 0.492


Epoch[1] Batch[130] Speed: 1.2483798133102049 samples/sec                   batch loss = 369.64814615249634 | accuracy = 0.49230769230769234


Epoch[1] Batch[135] Speed: 1.2479074556545575 samples/sec                   batch loss = 384.04113268852234 | accuracy = 0.4888888888888889


Epoch[1] Batch[140] Speed: 1.2558952658066573 samples/sec                   batch loss = 397.9202153682709 | accuracy = 0.4857142857142857


Epoch[1] Batch[145] Speed: 1.249075768792999 samples/sec                   batch loss = 411.322988986969 | accuracy = 0.49310344827586206


Epoch[1] Batch[150] Speed: 1.2506251181507595 samples/sec                   batch loss = 425.2924919128418 | accuracy = 0.495


Epoch[1] Batch[155] Speed: 1.2554605131809309 samples/sec                   batch loss = 438.82287526130676 | accuracy = 0.49838709677419357


Epoch[1] Batch[160] Speed: 1.256222139325424 samples/sec                   batch loss = 452.57971119880676 | accuracy = 0.5015625


Epoch[1] Batch[165] Speed: 1.259572121869335 samples/sec                   batch loss = 466.58125257492065 | accuracy = 0.4984848484848485


Epoch[1] Batch[170] Speed: 1.2630688526653706 samples/sec                   batch loss = 480.0818009376526 | accuracy = 0.5014705882352941


Epoch[1] Batch[175] Speed: 1.26226006119742 samples/sec                   batch loss = 494.2538094520569 | accuracy = 0.5028571428571429


Epoch[1] Batch[180] Speed: 1.2579905402608778 samples/sec                   batch loss = 508.2498745918274 | accuracy = 0.5041666666666667


Epoch[1] Batch[185] Speed: 1.2533536134398537 samples/sec                   batch loss = 521.8852934837341 | accuracy = 0.5067567567567568


Epoch[1] Batch[190] Speed: 1.2487222104830134 samples/sec                   batch loss = 535.4524919986725 | accuracy = 0.5052631578947369


Epoch[1] Batch[195] Speed: 1.251349432872348 samples/sec                   batch loss = 548.9670116901398 | accuracy = 0.5102564102564102


Epoch[1] Batch[200] Speed: 1.253206533800249 samples/sec                   batch loss = 563.1144535541534 | accuracy = 0.51125


Epoch[1] Batch[205] Speed: 1.254868546040674 samples/sec                   batch loss = 576.9874744415283 | accuracy = 0.5121951219512195


Epoch[1] Batch[210] Speed: 1.2522405614431906 samples/sec                   batch loss = 590.515531539917 | accuracy = 0.5142857142857142


Epoch[1] Batch[215] Speed: 1.2539875460915026 samples/sec                   batch loss = 604.2459397315979 | accuracy = 0.5151162790697674


Epoch[1] Batch[220] Speed: 1.2566755864491712 samples/sec                   batch loss = 617.757895231247 | accuracy = 0.5215909090909091


Epoch[1] Batch[225] Speed: 1.2521684095195518 samples/sec                   batch loss = 631.4950339794159 | accuracy = 0.5222222222222223


Epoch[1] Batch[230] Speed: 1.2560169301445872 samples/sec                   batch loss = 645.5834505558014 | accuracy = 0.5206521739130435


Epoch[1] Batch[235] Speed: 1.2535129028278877 samples/sec                   batch loss = 659.2450242042542 | accuracy = 0.5212765957446809


Epoch[1] Batch[240] Speed: 1.2462070727227526 samples/sec                   batch loss = 672.943596124649 | accuracy = 0.521875


Epoch[1] Batch[245] Speed: 1.2537210420285194 samples/sec                   batch loss = 686.2864518165588 | accuracy = 0.5244897959183673


Epoch[1] Batch[250] Speed: 1.2553558641506497 samples/sec                   batch loss = 700.3005654811859 | accuracy = 0.524


Epoch[1] Batch[255] Speed: 1.2548420783214218 samples/sec                   batch loss = 713.2279748916626 | accuracy = 0.5274509803921569


Epoch[1] Batch[260] Speed: 1.2529108874216894 samples/sec                   batch loss = 726.4077315330505 | accuracy = 0.5317307692307692


Epoch[1] Batch[265] Speed: 1.2551850249982623 samples/sec                   batch loss = 740.7846693992615 | accuracy = 0.530188679245283


Epoch[1] Batch[270] Speed: 1.2514809535211762 samples/sec                   batch loss = 753.7266917228699 | accuracy = 0.5324074074074074


Epoch[1] Batch[275] Speed: 1.2505846596227737 samples/sec                   batch loss = 767.0583708286285 | accuracy = 0.5345454545454545


Epoch[1] Batch[280] Speed: 1.2536745747204563 samples/sec                   batch loss = 781.0051372051239 | accuracy = 0.5339285714285714


Epoch[1] Batch[285] Speed: 1.253593077882194 samples/sec                   batch loss = 794.9835543632507 | accuracy = 0.5333333333333333


Epoch[1] Batch[290] Speed: 1.2514114093506685 samples/sec                   batch loss = 808.1784930229187 | accuracy = 0.5344827586206896


Epoch[1] Batch[295] Speed: 1.2538524994592892 samples/sec                   batch loss = 822.647439956665 | accuracy = 0.5305084745762711


Epoch[1] Batch[300] Speed: 1.2542257526453753 samples/sec                   batch loss = 836.4903697967529 | accuracy = 0.5316666666666666


Epoch[1] Batch[305] Speed: 1.2499432664038321 samples/sec                   batch loss = 850.1092977523804 | accuracy = 0.5327868852459017


Epoch[1] Batch[310] Speed: 1.261638136731027 samples/sec                   batch loss = 863.2753372192383 | accuracy = 0.5346774193548387


Epoch[1] Batch[315] Speed: 1.2593437910241327 samples/sec                   batch loss = 877.2148950099945 | accuracy = 0.5317460317460317


Epoch[1] Batch[320] Speed: 1.2626348216157108 samples/sec                   batch loss = 890.4576177597046 | accuracy = 0.53359375


Epoch[1] Batch[325] Speed: 1.2541531842761402 samples/sec                   batch loss = 904.032276391983 | accuracy = 0.5330769230769231


Epoch[1] Batch[330] Speed: 1.2574158813695815 samples/sec                   batch loss = 917.3467667102814 | accuracy = 0.5348484848484848


Epoch[1] Batch[335] Speed: 1.2497925171489483 samples/sec                   batch loss = 930.8913540840149 | accuracy = 0.5350746268656716


Epoch[1] Batch[340] Speed: 1.2557052007096585 samples/sec                   batch loss = 944.7354257106781 | accuracy = 0.5352941176470588


Epoch[1] Batch[345] Speed: 1.2563862044175518 samples/sec                   batch loss = 958.1985754966736 | accuracy = 0.5355072463768116


Epoch[1] Batch[350] Speed: 1.2566518662242139 samples/sec                   batch loss = 972.1610062122345 | accuracy = 0.5357142857142857


Epoch[1] Batch[355] Speed: 1.2591785746209998 samples/sec                   batch loss = 985.0243723392487 | accuracy = 0.5380281690140845


Epoch[1] Batch[360] Speed: 1.2601404245695451 samples/sec                   batch loss = 998.9739646911621 | accuracy = 0.5381944444444444


Epoch[1] Batch[365] Speed: 1.2627989502728414 samples/sec                   batch loss = 1011.7106690406799 | accuracy = 0.5417808219178082


Epoch[1] Batch[370] Speed: 1.2551918802093869 samples/sec                   batch loss = 1026.0343911647797 | accuracy = 0.5398648648648648


Epoch[1] Batch[375] Speed: 1.2581966779745672 samples/sec                   batch loss = 1039.1395146846771 | accuracy = 0.5426666666666666


Epoch[1] Batch[380] Speed: 1.2636938069481551 samples/sec                   batch loss = 1052.2695372104645 | accuracy = 0.5434210526315789


Epoch[1] Batch[385] Speed: 1.2656667145708222 samples/sec                   batch loss = 1065.0595359802246 | accuracy = 0.5454545454545454


Epoch[1] Batch[390] Speed: 1.2575957181279822 samples/sec                   batch loss = 1080.2724001407623 | accuracy = 0.5429487179487179


Epoch[1] Batch[395] Speed: 1.2574151274453307 samples/sec                   batch loss = 1093.0337400436401 | accuracy = 0.5443037974683544


Epoch[1] Batch[400] Speed: 1.2603972608029386 samples/sec                   batch loss = 1107.4600756168365 | accuracy = 0.5425


Epoch[1] Batch[405] Speed: 1.258342288696921 samples/sec                   batch loss = 1120.8934338092804 | accuracy = 0.5425925925925926


Epoch[1] Batch[410] Speed: 1.2617018007652836 samples/sec                   batch loss = 1133.9815876483917 | accuracy = 0.5439024390243903


Epoch[1] Batch[415] Speed: 1.26645961120032 samples/sec                   batch loss = 1147.827133178711 | accuracy = 0.5433734939759036


Epoch[1] Batch[420] Speed: 1.2580796853529 samples/sec                   batch loss = 1160.645414352417 | accuracy = 0.5452380952380952


Epoch[1] Batch[425] Speed: 1.252880198355333 samples/sec                   batch loss = 1174.0383651256561 | accuracy = 0.5464705882352942


Epoch[1] Batch[430] Speed: 1.2496636781738475 samples/sec                   batch loss = 1186.964280128479 | accuracy = 0.5482558139534883


Epoch[1] Batch[435] Speed: 1.2529991268627172 samples/sec                   batch loss = 1201.0955193042755 | accuracy = 0.5465517241379311


Epoch[1] Batch[440] Speed: 1.2539205345846518 samples/sec                   batch loss = 1214.2029049396515 | accuracy = 0.5482954545454546


Epoch[1] Batch[445] Speed: 1.2526747700348104 samples/sec                   batch loss = 1227.588739156723 | accuracy = 0.55


Epoch[1] Batch[450] Speed: 1.2638655421670046 samples/sec                   batch loss = 1240.4529361724854 | accuracy = 0.5505555555555556


Epoch[1] Batch[455] Speed: 1.264311757767105 samples/sec                   batch loss = 1253.700339794159 | accuracy = 0.5510989010989011


Epoch[1] Batch[460] Speed: 1.2550962896399072 samples/sec                   batch loss = 1265.931743144989 | accuracy = 0.5527173913043478


Epoch[1] Batch[465] Speed: 1.2535424989668513 samples/sec                   batch loss = 1279.6325993537903 | accuracy = 0.553763440860215


Epoch[1] Batch[470] Speed: 1.2574587622993578 samples/sec                   batch loss = 1292.7325410842896 | accuracy = 0.5542553191489362


Epoch[1] Batch[475] Speed: 1.2582310251012245 samples/sec                   batch loss = 1305.9609599113464 | accuracy = 0.5557894736842105


Epoch[1] Batch[480] Speed: 1.2647328321154931 samples/sec                   batch loss = 1319.922117948532 | accuracy = 0.5557291666666667


Epoch[1] Batch[485] Speed: 1.2627804159482063 samples/sec                   batch loss = 1332.496925830841 | accuracy = 0.5572164948453608


Epoch[1] Batch[490] Speed: 1.2608174368573593 samples/sec                   batch loss = 1346.9512705802917 | accuracy = 0.5566326530612244


Epoch[1] Batch[495] Speed: 1.2572881041183317 samples/sec                   batch loss = 1360.9572460651398 | accuracy = 0.5565656565656566


Epoch[1] Batch[500] Speed: 1.2583547469147192 samples/sec                   batch loss = 1375.1117277145386 | accuracy = 0.556


Epoch[1] Batch[505] Speed: 1.2565583118460908 samples/sec                   batch loss = 1387.4476251602173 | accuracy = 0.556930693069307


Epoch[1] Batch[510] Speed: 1.257011342476374 samples/sec                   batch loss = 1399.7899923324585 | accuracy = 0.5583333333333333


Epoch[1] Batch[515] Speed: 1.2530403968689405 samples/sec                   batch loss = 1412.3674805164337 | accuracy = 0.5606796116504854


Epoch[1] Batch[520] Speed: 1.2581989425625366 samples/sec                   batch loss = 1425.9352297782898 | accuracy = 0.5610576923076923


Epoch[1] Batch[525] Speed: 1.2448402344516618 samples/sec                   batch loss = 1438.575068116188 | accuracy = 0.5628571428571428


Epoch[1] Batch[530] Speed: 1.2597125651801069 samples/sec                   batch loss = 1450.487558722496 | accuracy = 0.5636792452830188


Epoch[1] Batch[535] Speed: 1.2554667137607167 samples/sec                   batch loss = 1465.0000659227371 | accuracy = 0.561214953271028


Epoch[1] Batch[540] Speed: 1.258693856164587 samples/sec                   batch loss = 1476.8834968805313 | accuracy = 0.562962962962963


Epoch[1] Batch[545] Speed: 1.2608719211263635 samples/sec                   batch loss = 1490.0881351232529 | accuracy = 0.563302752293578


Epoch[1] Batch[550] Speed: 1.2589089151319324 samples/sec                   batch loss = 1501.9533935785294 | accuracy = 0.5654545454545454


Epoch[1] Batch[555] Speed: 1.255788852343378 samples/sec                   batch loss = 1514.2235740423203 | accuracy = 0.5662162162162162


Epoch[1] Batch[560] Speed: 1.2585791314407955 samples/sec                   batch loss = 1526.3249155282974 | accuracy = 0.5678571428571428


Epoch[1] Batch[565] Speed: 1.2590853051019277 samples/sec                   batch loss = 1539.81940472126 | accuracy = 0.5685840707964602


Epoch[1] Batch[570] Speed: 1.2540941231978937 samples/sec                   batch loss = 1551.7394001483917 | accuracy = 0.5697368421052632


Epoch[1] Batch[575] Speed: 1.2517748980022148 samples/sec                   batch loss = 1563.037488102913 | accuracy = 0.571304347826087


Epoch[1] Batch[580] Speed: 1.2568731956606556 samples/sec                   batch loss = 1575.9721077680588 | accuracy = 0.5728448275862069


Epoch[1] Batch[585] Speed: 1.256461195680022 samples/sec                   batch loss = 1591.7223340272903 | accuracy = 0.5713675213675213


Epoch[1] Batch[590] Speed: 1.2606652848644568 samples/sec                   batch loss = 1603.5863431692123 | accuracy = 0.5724576271186441


Epoch[1] Batch[595] Speed: 1.263391670096137 samples/sec                   batch loss = 1616.1722575426102 | accuracy = 0.5735294117647058


Epoch[1] Batch[600] Speed: 1.254160121974229 samples/sec                   batch loss = 1630.4629737138748 | accuracy = 0.5733333333333334


Epoch[1] Batch[605] Speed: 1.2581078936798908 samples/sec                   batch loss = 1646.0362530946732 | accuracy = 0.5714876033057851


Epoch[1] Batch[610] Speed: 1.2615485815149006 samples/sec                   batch loss = 1658.7133740186691 | accuracy = 0.5729508196721311


Epoch[1] Batch[615] Speed: 1.2505774817702557 samples/sec                   batch loss = 1672.6604999303818 | accuracy = 0.573170731707317


Epoch[1] Batch[620] Speed: 1.2538778009163347 samples/sec                   batch loss = 1687.387063384056 | accuracy = 0.5725806451612904


Epoch[1] Batch[625] Speed: 1.2573866674660916 samples/sec                   batch loss = 1700.8183671236038 | accuracy = 0.5728


Epoch[1] Batch[630] Speed: 1.2621066113915673 samples/sec                   batch loss = 1714.8351851701736 | accuracy = 0.5726190476190476


Epoch[1] Batch[635] Speed: 1.2654707215918795 samples/sec                   batch loss = 1727.103035569191 | accuracy = 0.574015748031496


Epoch[1] Batch[640] Speed: 1.2643392934441193 samples/sec                   batch loss = 1739.4481555223465 | accuracy = 0.575


Epoch[1] Batch[645] Speed: 1.2569755551108543 samples/sec                   batch loss = 1751.703698515892 | accuracy = 0.5755813953488372


Epoch[1] Batch[650] Speed: 1.2566896119240833 samples/sec                   batch loss = 1764.336299777031 | accuracy = 0.575


Epoch[1] Batch[655] Speed: 1.2604518034588992 samples/sec                   batch loss = 1777.374176621437 | accuracy = 0.5748091603053435


Epoch[1] Batch[660] Speed: 1.253674855762369 samples/sec                   batch loss = 1790.0033086538315 | accuracy = 0.5753787878787879


Epoch[1] Batch[665] Speed: 1.2581127052599412 samples/sec                   batch loss = 1803.1148520708084 | accuracy = 0.575187969924812


Epoch[1] Batch[670] Speed: 1.2609619489318864 samples/sec                   batch loss = 1817.2219554185867 | accuracy = 0.5746268656716418


Epoch[1] Batch[675] Speed: 1.2618084595357444 samples/sec                   batch loss = 1830.4681607484818 | accuracy = 0.5748148148148148


Epoch[1] Batch[680] Speed: 1.2603453739076633 samples/sec                   batch loss = 1842.7507137060165 | accuracy = 0.5753676470588235


Epoch[1] Batch[685] Speed: 1.2552330130486589 samples/sec                   batch loss = 1854.5171180963516 | accuracy = 0.5766423357664233


Epoch[1] Batch[690] Speed: 1.2637958524045259 samples/sec                   batch loss = 1868.6279371976852 | accuracy = 0.5760869565217391


Epoch[1] Batch[695] Speed: 1.2641830515299606 samples/sec                   batch loss = 1879.7171474695206 | accuracy = 0.5776978417266188


Epoch[1] Batch[700] Speed: 1.2683008842385879 samples/sec                   batch loss = 1893.6304248571396 | accuracy = 0.5778571428571428


Epoch[1] Batch[705] Speed: 1.2653124822295156 samples/sec                   batch loss = 1907.0695370435715 | accuracy = 0.577304964539007


Epoch[1] Batch[710] Speed: 1.2581506331205146 samples/sec                   batch loss = 1919.5516349077225 | accuracy = 0.5778169014084507


Epoch[1] Batch[715] Speed: 1.2640480857983536 samples/sec                   batch loss = 1931.7619742155075 | accuracy = 0.5786713286713286


Epoch[1] Batch[720] Speed: 1.2621143969311839 samples/sec                   batch loss = 1943.8583921194077 | accuracy = 0.5791666666666667


Epoch[1] Batch[725] Speed: 1.2558419628159692 samples/sec                   batch loss = 1955.967078447342 | accuracy = 0.579655172413793


Epoch[1] Batch[730] Speed: 1.2570407273104898 samples/sec                   batch loss = 1968.3394796848297 | accuracy = 0.5808219178082191


Epoch[1] Batch[735] Speed: 1.2641553321508983 samples/sec                   batch loss = 1979.8699269294739 | accuracy = 0.5819727891156462


Epoch[1] Batch[740] Speed: 1.2598498232087827 samples/sec                   batch loss = 1995.314671754837 | accuracy = 0.581081081081081


Epoch[1] Batch[745] Speed: 1.2582007353671272 samples/sec                   batch loss = 2008.821436882019 | accuracy = 0.5808724832214766


Epoch[1] Batch[750] Speed: 1.2593517315769542 samples/sec                   batch loss = 2021.4178597927094 | accuracy = 0.581


Epoch[1] Batch[755] Speed: 1.2583268106487413 samples/sec                   batch loss = 2034.856385231018 | accuracy = 0.580794701986755


Epoch[1] Batch[760] Speed: 1.2611348383619094 samples/sec                   batch loss = 2047.6798572540283 | accuracy = 0.5809210526315789


Epoch[1] Batch[765] Speed: 1.2604584322283734 samples/sec                   batch loss = 2061.7991042137146 | accuracy = 0.5803921568627451


Epoch[1] Batch[770] Speed: 1.2657024256255687 samples/sec                   batch loss = 2073.2509956359863 | accuracy = 0.5811688311688312


Epoch[1] Batch[775] Speed: 1.2582268731511101 samples/sec                   batch loss = 2086.1862077713013 | accuracy = 0.5812903225806452


Epoch[1] Batch[780] Speed: 1.2570184060129916 samples/sec                   batch loss = 2100.0218979120255 | accuracy = 0.5807692307692308


Epoch[1] Batch[785] Speed: 1.2576823559714743 samples/sec                   batch loss = 2112.399404168129 | accuracy = 0.5815286624203821


[Epoch 1] training: accuracy=0.5812182741116751
[Epoch 1] time cost: 644.8922591209412
[Epoch 1] validation: validation accuracy=0.6911111111111111


Epoch[2] Batch[5] Speed: 1.2560189047988557 samples/sec                   batch loss = 13.91066312789917 | accuracy = 0.45


Epoch[2] Batch[10] Speed: 1.2591042035842543 samples/sec                   batch loss = 26.55508852005005 | accuracy = 0.525


Epoch[2] Batch[15] Speed: 1.2525300938933803 samples/sec                   batch loss = 38.422876834869385 | accuracy = 0.6


Epoch[2] Batch[20] Speed: 1.2563846990384442 samples/sec                   batch loss = 49.82040512561798 | accuracy = 0.6375


Epoch[2] Batch[25] Speed: 1.2612688033910455 samples/sec                   batch loss = 64.02911126613617 | accuracy = 0.61


Epoch[2] Batch[30] Speed: 1.2593089104960173 samples/sec                   batch loss = 76.85326361656189 | accuracy = 0.6


Epoch[2] Batch[35] Speed: 1.2557968421274035 samples/sec                   batch loss = 88.31074476242065 | accuracy = 0.6214285714285714


Epoch[2] Batch[40] Speed: 1.2550349803675618 samples/sec                   batch loss = 101.83381617069244 | accuracy = 0.63125


Epoch[2] Batch[45] Speed: 1.2616852911438383 samples/sec                   batch loss = 114.02286779880524 | accuracy = 0.6444444444444445


Epoch[2] Batch[50] Speed: 1.2590019695619161 samples/sec                   batch loss = 126.59767091274261 | accuracy = 0.635


Epoch[2] Batch[55] Speed: 1.2590115119612808 samples/sec                   batch loss = 139.3767945766449 | accuracy = 0.6409090909090909


Epoch[2] Batch[60] Speed: 1.2525584279589959 samples/sec                   batch loss = 152.03595280647278 | accuracy = 0.6333333333333333


Epoch[2] Batch[65] Speed: 1.2565848520148277 samples/sec                   batch loss = 165.24707007408142 | accuracy = 0.6269230769230769


Epoch[2] Batch[70] Speed: 1.2574058919466278 samples/sec                   batch loss = 177.4445662498474 | accuracy = 0.6392857142857142


Epoch[2] Batch[75] Speed: 1.2607996238917138 samples/sec                   batch loss = 189.62538075447083 | accuracy = 0.6466666666666666


Epoch[2] Batch[80] Speed: 1.2464330720045242 samples/sec                   batch loss = 203.0333287715912 | accuracy = 0.64375


Epoch[2] Batch[85] Speed: 1.2547689693952349 samples/sec                   batch loss = 216.18557822704315 | accuracy = 0.6411764705882353


Epoch[2] Batch[90] Speed: 1.2589880813777445 samples/sec                   batch loss = 230.52713119983673 | accuracy = 0.6388888888888888


Epoch[2] Batch[95] Speed: 1.2554420996989433 samples/sec                   batch loss = 245.73800241947174 | accuracy = 0.6263157894736842


Epoch[2] Batch[100] Speed: 1.2565971813398358 samples/sec                   batch loss = 258.0427986383438 | accuracy = 0.625


Epoch[2] Batch[105] Speed: 1.2603258700846647 samples/sec                   batch loss = 271.2717663049698 | accuracy = 0.6238095238095238


Epoch[2] Batch[110] Speed: 1.2536816008060763 samples/sec                   batch loss = 284.8659590482712 | accuracy = 0.625


Epoch[2] Batch[115] Speed: 1.2556966481930871 samples/sec                   batch loss = 296.17344295978546 | accuracy = 0.6282608695652174


Epoch[2] Batch[120] Speed: 1.2607678839504293 samples/sec                   batch loss = 307.6349415779114 | accuracy = 0.6291666666666667


Epoch[2] Batch[125] Speed: 1.2556497523983672 samples/sec                   batch loss = 319.4859457015991 | accuracy = 0.634


Epoch[2] Batch[130] Speed: 1.2542184391733942 samples/sec                   batch loss = 334.1848130226135 | accuracy = 0.6307692307692307


Epoch[2] Batch[135] Speed: 1.2572552216973214 samples/sec                   batch loss = 345.99023401737213 | accuracy = 0.6351851851851852


Epoch[2] Batch[140] Speed: 1.2564748399713375 samples/sec                   batch loss = 358.6552723646164 | accuracy = 0.6321428571428571


Epoch[2] Batch[145] Speed: 1.2563830995770948 samples/sec                   batch loss = 371.3759447336197 | accuracy = 0.6344827586206897


Epoch[2] Batch[150] Speed: 1.2569525769293846 samples/sec                   batch loss = 384.0734301805496 | accuracy = 0.635


Epoch[2] Batch[155] Speed: 1.2554110986280824 samples/sec                   batch loss = 397.2121846675873 | accuracy = 0.6338709677419355


Epoch[2] Batch[160] Speed: 1.255923282243658 samples/sec                   batch loss = 411.86797761917114 | accuracy = 0.6328125


Epoch[2] Batch[165] Speed: 1.2636413627568102 samples/sec                   batch loss = 425.3557057380676 | accuracy = 0.6303030303030303


Epoch[2] Batch[170] Speed: 1.2534166313851274 samples/sec                   batch loss = 436.796839594841 | accuracy = 0.6352941176470588


Epoch[2] Batch[175] Speed: 1.2433502332520798 samples/sec                   batch loss = 447.1571923494339 | accuracy = 0.6357142857142857


Epoch[2] Batch[180] Speed: 1.2590709426347737 samples/sec                   batch loss = 458.31077885627747 | accuracy = 0.6375


Epoch[2] Batch[185] Speed: 1.2562453730039287 samples/sec                   batch loss = 470.9559475183487 | accuracy = 0.6405405405405405


Epoch[2] Batch[190] Speed: 1.2608116570633439 samples/sec                   batch loss = 484.6983724832535 | accuracy = 0.6368421052631579


Epoch[2] Batch[195] Speed: 1.2628387771494234 samples/sec                   batch loss = 497.2718685865402 | accuracy = 0.6384615384615384


Epoch[2] Batch[200] Speed: 1.25476455873123 samples/sec                   batch loss = 511.73769986629486 | accuracy = 0.63875


Epoch[2] Batch[205] Speed: 1.257146693650101 samples/sec                   batch loss = 523.0296195745468 | accuracy = 0.6390243902439025


Epoch[2] Batch[210] Speed: 1.2510791975806466 samples/sec                   batch loss = 537.4423540830612 | accuracy = 0.6333333333333333


Epoch[2] Batch[215] Speed: 1.2578145511758319 samples/sec                   batch loss = 549.3053767681122 | accuracy = 0.6360465116279069


Epoch[2] Batch[220] Speed: 1.26136410369532 samples/sec                   batch loss = 562.0242781639099 | accuracy = 0.6352272727272728


Epoch[2] Batch[225] Speed: 1.2573155231540325 samples/sec                   batch loss = 574.153829574585 | accuracy = 0.6366666666666667


Epoch[2] Batch[230] Speed: 1.2603087337286711 samples/sec                   batch loss = 585.7461276054382 | accuracy = 0.6402173913043478


Epoch[2] Batch[235] Speed: 1.2645057716781336 samples/sec                   batch loss = 599.3877220153809 | accuracy = 0.6351063829787233


Epoch[2] Batch[240] Speed: 1.2651804237021642 samples/sec                   batch loss = 612.1726013422012 | accuracy = 0.6364583333333333


Epoch[2] Batch[245] Speed: 1.2627048585626197 samples/sec                   batch loss = 625.9225350618362 | accuracy = 0.6336734693877552


Epoch[2] Batch[250] Speed: 1.2638841083592085 samples/sec                   batch loss = 637.7754774093628 | accuracy = 0.635


Epoch[2] Batch[255] Speed: 1.257387232883599 samples/sec                   batch loss = 650.1585631370544 | accuracy = 0.6362745098039215


Epoch[2] Batch[260] Speed: 1.261186600668149 samples/sec                   batch loss = 663.5652322769165 | accuracy = 0.6336538461538461


Epoch[2] Batch[265] Speed: 1.2532541834485926 samples/sec                   batch loss = 676.6373655796051 | accuracy = 0.6358490566037736


Epoch[2] Batch[270] Speed: 1.2556881897748016 samples/sec                   batch loss = 690.5335638523102 | accuracy = 0.6342592592592593


Epoch[2] Batch[275] Speed: 1.2546957751541128 samples/sec                   batch loss = 699.7443566322327 | accuracy = 0.639090909090909


Epoch[2] Batch[280] Speed: 1.2554622042420711 samples/sec                   batch loss = 712.011766910553 | accuracy = 0.6366071428571428


Epoch[2] Batch[285] Speed: 1.254702625022174 samples/sec                   batch loss = 724.21981549263 | accuracy = 0.6350877192982456


Epoch[2] Batch[290] Speed: 1.262049741887559 samples/sec                   batch loss = 736.1311770677567 | accuracy = 0.6353448275862069


Epoch[2] Batch[295] Speed: 1.2554807122645424 samples/sec                   batch loss = 748.8894768953323 | accuracy = 0.6347457627118644


Epoch[2] Batch[300] Speed: 1.2581212907196884 samples/sec                   batch loss = 761.7467864751816 | accuracy = 0.6333333333333333


Epoch[2] Batch[305] Speed: 1.2554804304123928 samples/sec                   batch loss = 773.9425097703934 | accuracy = 0.6344262295081967


Epoch[2] Batch[310] Speed: 1.2562245849353229 samples/sec                   batch loss = 784.7034673690796 | accuracy = 0.6370967741935484


Epoch[2] Batch[315] Speed: 1.2589330040454623 samples/sec                   batch loss = 797.6315542459488 | accuracy = 0.638095238095238


Epoch[2] Batch[320] Speed: 1.2606666110618934 samples/sec                   batch loss = 811.8326143026352 | accuracy = 0.63515625


Epoch[2] Batch[325] Speed: 1.2572453290419032 samples/sec                   batch loss = 824.0613743066788 | accuracy = 0.6338461538461538


Epoch[2] Batch[330] Speed: 1.2589247853713579 samples/sec                   batch loss = 835.4388999938965 | accuracy = 0.634090909090909


Epoch[2] Batch[335] Speed: 1.2545793388391435 samples/sec                   batch loss = 849.9069128036499 | accuracy = 0.6328358208955224


Epoch[2] Batch[340] Speed: 1.252931098115706 samples/sec                   batch loss = 860.7893087863922 | accuracy = 0.6352941176470588


Epoch[2] Batch[345] Speed: 1.2596440892429492 samples/sec                   batch loss = 873.055274605751 | accuracy = 0.6340579710144928


Epoch[2] Batch[350] Speed: 1.2533376960906257 samples/sec                   batch loss = 885.9219696521759 | accuracy = 0.6342857142857142


Epoch[2] Batch[355] Speed: 1.2588818043985335 samples/sec                   batch loss = 896.6805627346039 | accuracy = 0.6359154929577465


Epoch[2] Batch[360] Speed: 1.2492024402654824 samples/sec                   batch loss = 910.8071746826172 | accuracy = 0.6354166666666666


Epoch[2] Batch[365] Speed: 1.25285475002595 samples/sec                   batch loss = 919.9647604227066 | accuracy = 0.6376712328767123


Epoch[2] Batch[370] Speed: 1.2595208702256013 samples/sec                   batch loss = 932.4860459566116 | accuracy = 0.6371621621621621


Epoch[2] Batch[375] Speed: 1.2600405773355439 samples/sec                   batch loss = 943.5904822349548 | accuracy = 0.6373333333333333


Epoch[2] Batch[380] Speed: 1.2571168328826419 samples/sec                   batch loss = 953.3805056810379 | accuracy = 0.6407894736842106


Epoch[2] Batch[385] Speed: 1.2622338505438664 samples/sec                   batch loss = 966.5467814207077 | accuracy = 0.6415584415584416


Epoch[2] Batch[390] Speed: 1.254025694302116 samples/sec                   batch loss = 980.5361951589584 | accuracy = 0.6416666666666667


Epoch[2] Batch[395] Speed: 1.2538451903395749 samples/sec                   batch loss = 993.3567299842834 | accuracy = 0.6417721518987342


Epoch[2] Batch[400] Speed: 1.2566264526911362 samples/sec                   batch loss = 1006.3842315673828 | accuracy = 0.640625


Epoch[2] Batch[405] Speed: 1.2591207402216766 samples/sec                   batch loss = 1019.9065647125244 | accuracy = 0.6395061728395062


Epoch[2] Batch[410] Speed: 1.250963524783142 samples/sec                   batch loss = 1033.040554523468 | accuracy = 0.6390243902439025


Epoch[2] Batch[415] Speed: 1.2518782035267932 samples/sec                   batch loss = 1044.2591198682785 | accuracy = 0.6403614457831325


Epoch[2] Batch[420] Speed: 1.255400483461769 samples/sec                   batch loss = 1055.5027912855148 | accuracy = 0.6416666666666667


Epoch[2] Batch[425] Speed: 1.2597177673856759 samples/sec                   batch loss = 1066.991764307022 | accuracy = 0.6429411764705882


Epoch[2] Batch[430] Speed: 1.2506294065352725 samples/sec                   batch loss = 1077.7951689958572 | accuracy = 0.6436046511627908


Epoch[2] Batch[435] Speed: 1.2597458600382327 samples/sec                   batch loss = 1090.6105479002 | accuracy = 0.6436781609195402


Epoch[2] Batch[440] Speed: 1.2508164460003466 samples/sec                   batch loss = 1103.0109357833862 | accuracy = 0.64375


Epoch[2] Batch[445] Speed: 1.2537350016305744 samples/sec                   batch loss = 1115.2082866430283 | accuracy = 0.645505617977528


Epoch[2] Batch[450] Speed: 1.2575693237689236 samples/sec                   batch loss = 1124.6749532222748 | accuracy = 0.6466666666666666


Epoch[2] Batch[455] Speed: 1.249826500063023 samples/sec                   batch loss = 1135.905283331871 | accuracy = 0.6478021978021978


Epoch[2] Batch[460] Speed: 1.2565427835438379 samples/sec                   batch loss = 1145.6568166017532 | accuracy = 0.6494565217391305


Epoch[2] Batch[465] Speed: 1.253611249822444 samples/sec                   batch loss = 1156.9860351085663 | accuracy = 0.65


Epoch[2] Batch[470] Speed: 1.2601913480193272 samples/sec                   batch loss = 1167.3044587373734 | accuracy = 0.651595744680851


Epoch[2] Batch[475] Speed: 1.2578441621847596 samples/sec                   batch loss = 1181.2097209692001 | accuracy = 0.651578947368421


Epoch[2] Batch[480] Speed: 1.2576995152630446 samples/sec                   batch loss = 1196.573550105095 | accuracy = 0.65


Epoch[2] Batch[485] Speed: 1.25829925311522 samples/sec                   batch loss = 1206.892825126648 | accuracy = 0.6510309278350516


Epoch[2] Batch[490] Speed: 1.2547137914057886 samples/sec                   batch loss = 1220.1132093667984 | accuracy = 0.6510204081632653


Epoch[2] Batch[495] Speed: 1.2583664503137288 samples/sec                   batch loss = 1232.2613562345505 | accuracy = 0.6505050505050505


Epoch[2] Batch[500] Speed: 1.25603827554611 samples/sec                   batch loss = 1244.1405806541443 | accuracy = 0.651


Epoch[2] Batch[505] Speed: 1.2546672504803937 samples/sec                   batch loss = 1255.9924755096436 | accuracy = 0.6514851485148515


Epoch[2] Batch[510] Speed: 1.268660629370677 samples/sec                   batch loss = 1268.4971846342087 | accuracy = 0.6519607843137255


Epoch[2] Batch[515] Speed: 1.2570454365462467 samples/sec                   batch loss = 1278.8621438741684 | accuracy = 0.6519417475728155


Epoch[2] Batch[520] Speed: 1.255717888723586 samples/sec                   batch loss = 1289.8844829797745 | accuracy = 0.6524038461538462


Epoch[2] Batch[525] Speed: 1.2565448539619637 samples/sec                   batch loss = 1303.1072889566422 | accuracy = 0.6514285714285715


Epoch[2] Batch[530] Speed: 1.2537617974424695 samples/sec                   batch loss = 1316.0157643556595 | accuracy = 0.6518867924528302


Epoch[2] Batch[535] Speed: 1.2546657492177042 samples/sec                   batch loss = 1327.7641543149948 | accuracy = 0.6518691588785047


Epoch[2] Batch[540] Speed: 1.2555824691601574 samples/sec                   batch loss = 1339.515636920929 | accuracy = 0.6513888888888889


Epoch[2] Batch[545] Speed: 1.2597203212114063 samples/sec                   batch loss = 1350.8356165885925 | accuracy = 0.6518348623853211


Epoch[2] Batch[550] Speed: 1.2487240693247665 samples/sec                   batch loss = 1363.5748046636581 | accuracy = 0.6518181818181819


Epoch[2] Batch[555] Speed: 1.256749106009471 samples/sec                   batch loss = 1375.3207267522812 | accuracy = 0.6518018018018018


Epoch[2] Batch[560] Speed: 1.2543550651643909 samples/sec                   batch loss = 1385.7879246473312 | accuracy = 0.6535714285714286


Epoch[2] Batch[565] Speed: 1.2590713205902377 samples/sec                   batch loss = 1398.9187208414078 | accuracy = 0.6526548672566371


Epoch[2] Batch[570] Speed: 1.2528346353500621 samples/sec                   batch loss = 1410.11956179142 | accuracy = 0.6521929824561403


Epoch[2] Batch[575] Speed: 1.2562339912159848 samples/sec                   batch loss = 1419.1436928510666 | accuracy = 0.6543478260869565


Epoch[2] Batch[580] Speed: 1.2511780961262602 samples/sec                   batch loss = 1431.490473985672 | accuracy = 0.6538793103448276


Epoch[2] Batch[585] Speed: 1.2638975334841522 samples/sec                   batch loss = 1442.963933467865 | accuracy = 0.6542735042735043


Epoch[2] Batch[590] Speed: 1.2587718620355182 samples/sec                   batch loss = 1457.6079626083374 | accuracy = 0.6533898305084745


Epoch[2] Batch[595] Speed: 1.2599037508106625 samples/sec                   batch loss = 1471.0422023534775 | accuracy = 0.6542016806722689


Epoch[2] Batch[600] Speed: 1.2603502026172608 samples/sec                   batch loss = 1482.2996295690536 | accuracy = 0.6558333333333334


Epoch[2] Batch[605] Speed: 1.260248807485107 samples/sec                   batch loss = 1493.3457474708557 | accuracy = 0.6553719008264463


Epoch[2] Batch[610] Speed: 1.2545949124688842 samples/sec                   batch loss = 1505.9333038330078 | accuracy = 0.6549180327868852


Epoch[2] Batch[615] Speed: 1.260158881478239 samples/sec                   batch loss = 1519.3503893017769 | accuracy = 0.6544715447154471


Epoch[2] Batch[620] Speed: 1.2584147763310356 samples/sec                   batch loss = 1530.9831777215004 | accuracy = 0.6548387096774193


Epoch[2] Batch[625] Speed: 1.262455725825567 samples/sec                   batch loss = 1544.1542642712593 | accuracy = 0.6548


Epoch[2] Batch[630] Speed: 1.2587991569298 samples/sec                   batch loss = 1556.2987347245216 | accuracy = 0.6543650793650794


Epoch[2] Batch[635] Speed: 1.2600770126442447 samples/sec                   batch loss = 1566.3393823504448 | accuracy = 0.6551181102362205


Epoch[2] Batch[640] Speed: 1.2586347443534713 samples/sec                   batch loss = 1577.9079412817955 | accuracy = 0.65625


Epoch[2] Batch[645] Speed: 1.2522675738230855 samples/sec                   batch loss = 1589.8073686361313 | accuracy = 0.6569767441860465


Epoch[2] Batch[650] Speed: 1.2612487968975048 samples/sec                   batch loss = 1601.3902587890625 | accuracy = 0.6569230769230769


Epoch[2] Batch[655] Speed: 1.2605678169916823 samples/sec                   batch loss = 1614.3804347515106 | accuracy = 0.6561068702290076


Epoch[2] Batch[660] Speed: 1.2598479310960589 samples/sec                   batch loss = 1627.951026916504 | accuracy = 0.656060606060606


Epoch[2] Batch[665] Speed: 1.2569145329349396 samples/sec                   batch loss = 1640.4795463085175 | accuracy = 0.6556390977443609


Epoch[2] Batch[670] Speed: 1.2646555156497314 samples/sec                   batch loss = 1656.305772781372 | accuracy = 0.6537313432835821


Epoch[2] Batch[675] Speed: 1.2632666705268705 samples/sec                   batch loss = 1668.2943559885025 | accuracy = 0.654074074074074


Epoch[2] Batch[680] Speed: 1.262510161770342 samples/sec                   batch loss = 1679.5297176837921 | accuracy = 0.6540441176470588


Epoch[2] Batch[685] Speed: 1.2577375125035581 samples/sec                   batch loss = 1692.8950151205063 | accuracy = 0.654014598540146


Epoch[2] Batch[690] Speed: 1.2544023332465275 samples/sec                   batch loss = 1705.1256530284882 | accuracy = 0.6536231884057971


Epoch[2] Batch[695] Speed: 1.2518323397472655 samples/sec                   batch loss = 1717.4797291755676 | accuracy = 0.6539568345323741


Epoch[2] Batch[700] Speed: 1.2591154484504419 samples/sec                   batch loss = 1726.8265867233276 | accuracy = 0.6546428571428572


Epoch[2] Batch[705] Speed: 1.2656750215079753 samples/sec                   batch loss = 1738.6610295772552 | accuracy = 0.6549645390070922


Epoch[2] Batch[710] Speed: 1.2623118209386672 samples/sec                   batch loss = 1748.0640442371368 | accuracy = 0.6563380281690141


Epoch[2] Batch[715] Speed: 1.256129871539782 samples/sec                   batch loss = 1759.264059305191 | accuracy = 0.6573426573426573


Epoch[2] Batch[720] Speed: 1.2574674330845863 samples/sec                   batch loss = 1769.5925623178482 | accuracy = 0.6583333333333333


Epoch[2] Batch[725] Speed: 1.2557155390720325 samples/sec                   batch loss = 1780.9031257629395 | accuracy = 0.6582758620689655


Epoch[2] Batch[730] Speed: 1.256377830792041 samples/sec                   batch loss = 1791.6995607614517 | accuracy = 0.6578767123287671


Epoch[2] Batch[735] Speed: 1.2511575687272058 samples/sec                   batch loss = 1802.3247683048248 | accuracy = 0.6581632653061225


Epoch[2] Batch[740] Speed: 1.2474188532028407 samples/sec                   batch loss = 1814.2731821537018 | accuracy = 0.6581081081081082


Epoch[2] Batch[745] Speed: 1.2518408397052507 samples/sec                   batch loss = 1822.8470060825348 | accuracy = 0.6593959731543624


Epoch[2] Batch[750] Speed: 1.252057207165523 samples/sec                   batch loss = 1832.1838697195053 | accuracy = 0.6603333333333333


Epoch[2] Batch[755] Speed: 1.2628997104784927 samples/sec                   batch loss = 1844.885344862938 | accuracy = 0.6596026490066225


Epoch[2] Batch[760] Speed: 1.2614009949428824 samples/sec                   batch loss = 1856.813168168068 | accuracy = 0.6598684210526315


Epoch[2] Batch[765] Speed: 1.2618319003122762 samples/sec                   batch loss = 1868.5484442710876 | accuracy = 0.6604575163398693


Epoch[2] Batch[770] Speed: 1.261883530033299 samples/sec                   batch loss = 1878.3102670907974 | accuracy = 0.6610389610389611


Epoch[2] Batch[775] Speed: 1.2600223131554864 samples/sec                   batch loss = 1889.2433907985687 | accuracy = 0.6612903225806451


Epoch[2] Batch[780] Speed: 1.2540003868762257 samples/sec                   batch loss = 1902.5430772304535 | accuracy = 0.6612179487179487


Epoch[2] Batch[785] Speed: 1.2547600542553148 samples/sec                   batch loss = 1913.363815665245 | accuracy = 0.6617834394904458


[Epoch 2] training: accuracy=0.6618020304568528
[Epoch 2] time cost: 642.6276953220367
[Epoch 2] validation: validation accuracy=0.7355555555555555
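
The `batch loss` value in the progress lines above appears to be a running total within each epoch rather than the loss of a single batch: it grows steadily and starts again from a small value at the beginning of epoch 2. Under that assumption, dividing it by the batch index gives an approximate mean loss per batch, which is easier to compare across epochs. The following is a minimal sketch that parses two lines copied from the log; the names `parse_line`, `pattern`, and `running_loss` are illustrative and not part of MXNet.

```python
import re

# Two progress lines copied from the log above (inner whitespace shortened).
log_lines = [
    "Epoch[2] Batch[5] Speed: 1.2560189047988557 samples/sec "
    "batch loss = 13.91066312789917 | accuracy = 0.45",
    "Epoch[2] Batch[785] Speed: 1.2547600542553148 samples/sec "
    "batch loss = 1913.363815665245 | accuracy = 0.6617834394904458",
]

# Regular expression for one progress line (hypothetical helper, not an MXNet API).
pattern = re.compile(
    r"Epoch\[(\d+)\] Batch\[(\d+)\] Speed: ([\d.]+) samples/sec\s+"
    r"batch loss = ([\d.]+) \| accuracy = ([\d.]+)"
)


def parse_line(line):
    """Return the numeric fields of one progress line as a dict, or None."""
    match = pattern.search(line)
    if match is None:
        return None
    epoch, batch, speed, loss, accuracy = match.groups()
    return {
        "epoch": int(epoch),
        "batch": int(batch),
        "speed": float(speed),
        "running_loss": float(loss),  # assumed to be cumulative within the epoch
        "accuracy": float(accuracy),
    }


records = [parse_line(line) for line in log_lines]

for record in records:
    # Assumption: running_loss is the sum over all batches seen so far in the epoch.
    mean_loss = record["running_loss"] / record["batch"]
    print(f"epoch {record['epoch']} batch {record['batch']}: "
          f"mean loss per batch ~ {mean_loss:.4f}, accuracy = {record['accuracy']:.4f}")
```

If the logged value is in fact reset or averaged differently in your own training loop, adjust the division accordingly.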


## Next steps

Now that you have trained a neural network and run predictions with it on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).