<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start the steps, make sure you have at least one Nvidia GPU on your machine and make sure that you have CUDA properly installed. GPUs from AMD and Intel are not supported. Additionally, you will need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[15:36:02] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data on the main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If only one GPU is available, the code falls back to `npx.cpu()`, so the copy is placed in main memory instead:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[15:36:02] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
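
The zero-based numbering described above can be sketched in plain Python. The helper `select_gpu_ids` below is hypothetical and is not part of MXNet; it only filters a list of requested ids down to those that exist on a machine with a given number of GPUs. It runs without MXNet or a GPU.

```python
def select_gpu_ids(num_available, requested):
    """Keep only the requested GPU ids that exist on this machine.

    Hypothetical helper for illustration; valid ids run from 0 to
    num_available - 1.
    """
    return [i for i in requested if 0 <= i < num_available]


# On a machine with eight GPUs, id 7 is the last valid one and id 8 is dropped.
ids = select_gpu_ids(8, [0, 7, 8])
print(ids)
```

With MXNet installed, each surviving id could then be turned into a device with `npx.gpu(i)`, and `npx.num_gpus()` could supply the first argument.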

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy and move the input data and parameters to the GPU. To demonstrate this you can reuse the previously defined LeafNetwork in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device of parameters that are already loaded.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[15:36:02] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 4.0409217, -5.426557 ]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism with *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own separate chunk of the data.
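
To see why splitting a batch does not change the overall gradient, here is a minimal sketch in plain Python. It uses a toy one-parameter model `w * x` with a squared-error loss, which is an assumption for illustration and not the network from this tutorial. The helpers `grad_sum` and `split` are hypothetical names, and the two "devices" are ordinary lists.

```python
def grad_sum(w, xs, ys):
    """Sum of per-sample gradients of (w * x - y) ** 2 with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))


def split(items, n):
    """Split a list into n equal contiguous parts."""
    if len(items) % n != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    size = len(items) // n
    return [items[i * size:(i + 1) * size] for i in range(n)]


w = 0.5
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Gradient computed on the whole batch at once
full_gradient = grad_sum(w, xs, ys)

# Gradient computed shard by shard (one shard per simulated device), then summed
sharded_gradient = sum(
    grad_sum(w, x_part, y_part)
    for x_part, y_part in zip(split(xs, 2), split(ys, 2))
)

print(full_gradient, sharded_gradient)
```

Because the per-sample gradients simply add up, summing the shard gradients gives the same value as processing the full batch on one device. The work can therefore be spread across GPUs without changing the update.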

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) using the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
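
The helper above accumulates accuracy over the shards that were sent to each device. The bookkeeping can be sketched in plain Python as follows. This is an illustration only and not the `gluon.metric.Accuracy` implementation: the function name `accuracy_over_shards` is hypothetical, and the predictions are assumed to be class ids already rather than raw network outputs.

```python
def accuracy_over_shards(label_shards, prediction_shards):
    """Fraction of correct predictions, accumulated across all shards."""
    correct = 0
    total = 0
    for labels, predictions in zip(label_shards, prediction_shards):
        correct += sum(1 for l, p in zip(labels, predictions) if l == p)
        total += len(labels)
    return correct / total if total else 0.0


# Two shards of two samples each, as if a batch of four was split over two devices
label_shards = [[0, 1], [1, 1]]
prediction_shards = [[0, 0], [1, 1]]
print(accuracy_over_shards(label_shards, prediction_shards))
```

The counts are pooled before dividing, so the result matches the accuracy of the unsplit batch, even if shards had different sizes.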

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7884466166197256 samples/sec                   batch loss = 15.068442583084106 | accuracy = 0.4


Epoch[1] Batch[10] Speed: 1.274429524088702 samples/sec                   batch loss = 29.47301435470581 | accuracy = 0.4


Epoch[1] Batch[15] Speed: 1.2750531765644337 samples/sec                   batch loss = 43.91231060028076 | accuracy = 0.4


Epoch[1] Batch[20] Speed: 1.2664906823131648 samples/sec                   batch loss = 59.28009510040283 | accuracy = 0.3625


Epoch[1] Batch[25] Speed: 1.2722244752302745 samples/sec                   batch loss = 73.17195248603821 | accuracy = 0.4


Epoch[1] Batch[30] Speed: 1.2796582943702473 samples/sec                   batch loss = 87.4517810344696 | accuracy = 0.425


Epoch[1] Batch[35] Speed: 1.2663855246985778 samples/sec                   batch loss = 101.5922429561615 | accuracy = 0.4357142857142857


Epoch[1] Batch[40] Speed: 1.2774259874150375 samples/sec                   batch loss = 116.31132674217224 | accuracy = 0.40625


Epoch[1] Batch[45] Speed: 1.2755104313848922 samples/sec                   batch loss = 131.04099798202515 | accuracy = 0.4


Epoch[1] Batch[50] Speed: 1.271358829949667 samples/sec                   batch loss = 145.58574056625366 | accuracy = 0.415


Epoch[1] Batch[55] Speed: 1.28039279628249 samples/sec                   batch loss = 159.54461216926575 | accuracy = 0.4318181818181818


Epoch[1] Batch[60] Speed: 1.2777901499085902 samples/sec                   batch loss = 173.01951336860657 | accuracy = 0.45


Epoch[1] Batch[65] Speed: 1.2692537765998373 samples/sec                   batch loss = 186.31592416763306 | accuracy = 0.4576923076923077


Epoch[1] Batch[70] Speed: 1.2751716998069975 samples/sec                   batch loss = 201.17964339256287 | accuracy = 0.45357142857142857


Epoch[1] Batch[75] Speed: 1.2727675628017399 samples/sec                   batch loss = 214.87493443489075 | accuracy = 0.46


Epoch[1] Batch[80] Speed: 1.2738008720378455 samples/sec                   batch loss = 229.02418494224548 | accuracy = 0.453125


Epoch[1] Batch[85] Speed: 1.2733549895260936 samples/sec                   batch loss = 243.78048086166382 | accuracy = 0.4411764705882353


Epoch[1] Batch[90] Speed: 1.2727186109013162 samples/sec                   batch loss = 257.66797065734863 | accuracy = 0.4444444444444444


Epoch[1] Batch[95] Speed: 1.2737012658197095 samples/sec                   batch loss = 271.1794023513794 | accuracy = 0.45


Epoch[1] Batch[100] Speed: 1.2759906254646483 samples/sec                   batch loss = 284.3906936645508 | accuracy = 0.4625


Epoch[1] Batch[105] Speed: 1.27044385300358 samples/sec                   batch loss = 298.96755504608154 | accuracy = 0.45714285714285713


Epoch[1] Batch[110] Speed: 1.277343026712015 samples/sec                   batch loss = 312.8644688129425 | accuracy = 0.45681818181818185


Epoch[1] Batch[115] Speed: 1.2769071981186924 samples/sec                   batch loss = 326.8023648262024 | accuracy = 0.45652173913043476


Epoch[1] Batch[120] Speed: 1.275645916214021 samples/sec                   batch loss = 341.182404756546 | accuracy = 0.46041666666666664


Epoch[1] Batch[125] Speed: 1.2761522265629754 samples/sec                   batch loss = 354.9674549102783 | accuracy = 0.464


Epoch[1] Batch[130] Speed: 1.2736021585900843 samples/sec                   batch loss = 368.72242975234985 | accuracy = 0.47307692307692306


Epoch[1] Batch[135] Speed: 1.2709040649847296 samples/sec                   batch loss = 382.956782579422 | accuracy = 0.4740740740740741


Epoch[1] Batch[140] Speed: 1.2741612284952462 samples/sec                   batch loss = 397.12598276138306 | accuracy = 0.4732142857142857


Epoch[1] Batch[145] Speed: 1.2711910240536675 samples/sec                   batch loss = 410.72988176345825 | accuracy = 0.47586206896551725


Epoch[1] Batch[150] Speed: 1.2831523734724568 samples/sec                   batch loss = 424.2096951007843 | accuracy = 0.485


Epoch[1] Batch[155] Speed: 1.265346169200029 samples/sec                   batch loss = 437.97712111473083 | accuracy = 0.4838709677419355


Epoch[1] Batch[160] Speed: 1.2818997881610892 samples/sec                   batch loss = 451.74513125419617 | accuracy = 0.484375


Epoch[1] Batch[165] Speed: 1.2750906790832568 samples/sec                   batch loss = 465.40276861190796 | accuracy = 0.48484848484848486


Epoch[1] Batch[170] Speed: 1.2762719251101158 samples/sec                   batch loss = 479.25247406959534 | accuracy = 0.4852941176470588


Epoch[1] Batch[175] Speed: 1.2716868632883351 samples/sec                   batch loss = 493.1587829589844 | accuracy = 0.4857142857142857


Epoch[1] Batch[180] Speed: 1.2746906690683657 samples/sec                   batch loss = 506.85875058174133 | accuracy = 0.4888888888888889


Epoch[1] Batch[185] Speed: 1.270984843377017 samples/sec                   batch loss = 521.1194672584534 | accuracy = 0.4864864864864865


Epoch[1] Batch[190] Speed: 1.2796175947480295 samples/sec                   batch loss = 535.3776450157166 | accuracy = 0.48289473684210527


Epoch[1] Batch[195] Speed: 1.2753683826435978 samples/sec                   batch loss = 549.0086667537689 | accuracy = 0.48717948717948717


Epoch[1] Batch[200] Speed: 1.2750150948687327 samples/sec                   batch loss = 563.1373994350433 | accuracy = 0.485


Epoch[1] Batch[205] Speed: 1.269386014583262 samples/sec                   batch loss = 577.3131000995636 | accuracy = 0.4792682926829268


Epoch[1] Batch[210] Speed: 1.2691814751062227 samples/sec                   batch loss = 590.6513912677765 | accuracy = 0.4869047619047619


Epoch[1] Batch[215] Speed: 1.2687903449451452 samples/sec                   batch loss = 604.0061237812042 | accuracy = 0.49186046511627907


Epoch[1] Batch[220] Speed: 1.2738540662049356 samples/sec                   batch loss = 617.7546455860138 | accuracy = 0.49318181818181817


Epoch[1] Batch[225] Speed: 1.2738765058040853 samples/sec                   batch loss = 631.5348212718964 | accuracy = 0.49777777777777776


Epoch[1] Batch[230] Speed: 1.2723801058647268 samples/sec                   batch loss = 645.366682767868 | accuracy = 0.4956521739130435


Epoch[1] Batch[235] Speed: 1.276974745194414 samples/sec                   batch loss = 659.5267384052277 | accuracy = 0.4925531914893617


Epoch[1] Batch[240] Speed: 1.273749906521763 samples/sec                   batch loss = 672.9818153381348 | accuracy = 0.4979166666666667


Epoch[1] Batch[245] Speed: 1.278787380344368 samples/sec                   batch loss = 686.5014083385468 | accuracy = 0.5


Epoch[1] Batch[250] Speed: 1.2777184295773119 samples/sec                   batch loss = 700.5396263599396 | accuracy = 0.499


Epoch[1] Batch[255] Speed: 1.2729906457599531 samples/sec                   batch loss = 714.3736085891724 | accuracy = 0.5


Epoch[1] Batch[260] Speed: 1.273342522451699 samples/sec                   batch loss = 727.7629187107086 | accuracy = 0.5028846153846154


Epoch[1] Batch[265] Speed: 1.274020834532514 samples/sec                   batch loss = 741.2339069843292 | accuracy = 0.5066037735849057


Epoch[1] Batch[270] Speed: 1.2774240421403646 samples/sec                   batch loss = 754.5718913078308 | accuracy = 0.5083333333333333


Epoch[1] Batch[275] Speed: 1.2742125172584573 samples/sec                   batch loss = 767.6526045799255 | accuracy = 0.5154545454545455


Epoch[1] Batch[280] Speed: 1.2678324953141735 samples/sec                   batch loss = 781.7506368160248 | accuracy = 0.5169642857142858


Epoch[1] Batch[285] Speed: 1.2715246564381124 samples/sec                   batch loss = 795.2299065589905 | accuracy = 0.5166666666666667


Epoch[1] Batch[290] Speed: 1.2707732431030554 samples/sec                   batch loss = 809.3209493160248 | accuracy = 0.5146551724137931


Epoch[1] Batch[295] Speed: 1.2640919917409 samples/sec                   batch loss = 822.8107216358185 | accuracy = 0.514406779661017


Epoch[1] Batch[300] Speed: 1.267499169529401 samples/sec                   batch loss = 835.8151843547821 | accuracy = 0.5191666666666667


Epoch[1] Batch[305] Speed: 1.2672110032107755 samples/sec                   batch loss = 849.477881193161 | accuracy = 0.519672131147541


Epoch[1] Batch[310] Speed: 1.269193092719047 samples/sec                   batch loss = 863.2651131153107 | accuracy = 0.521774193548387


Epoch[1] Batch[315] Speed: 1.275746796910751 samples/sec                   batch loss = 877.3771932125092 | accuracy = 0.5222222222222223


Epoch[1] Batch[320] Speed: 1.275206398433136 samples/sec                   batch loss = 891.317033290863 | accuracy = 0.52265625


Epoch[1] Batch[325] Speed: 1.2801949515627533 samples/sec                   batch loss = 905.0523891448975 | accuracy = 0.5230769230769231


Epoch[1] Batch[330] Speed: 1.2719437995641922 samples/sec                   batch loss = 918.9417049884796 | accuracy = 0.5227272727272727


Epoch[1] Batch[335] Speed: 1.2800733440461256 samples/sec                   batch loss = 932.4948954582214 | accuracy = 0.5261194029850746


Epoch[1] Batch[340] Speed: 1.2793717928421988 samples/sec                   batch loss = 945.4918689727783 | accuracy = 0.5294117647058824


Epoch[1] Batch[345] Speed: 1.2789643152433783 samples/sec                   batch loss = 959.4732222557068 | accuracy = 0.5297101449275362


Epoch[1] Batch[350] Speed: 1.2723863782011173 samples/sec                   batch loss = 973.6576051712036 | accuracy = 0.5292857142857142


Epoch[1] Batch[355] Speed: 1.2708544861941697 samples/sec                   batch loss = 987.2805268764496 | accuracy = 0.5295774647887324


Epoch[1] Batch[360] Speed: 1.2721057275563274 samples/sec                   batch loss = 1000.9457094669342 | accuracy = 0.5319444444444444


Epoch[1] Batch[365] Speed: 1.2747138161106344 samples/sec                   batch loss = 1014.4499781131744 | accuracy = 0.5321917808219178


Epoch[1] Batch[370] Speed: 1.2755630895706402 samples/sec                   batch loss = 1028.795648097992 | accuracy = 0.5297297297297298


Epoch[1] Batch[375] Speed: 1.2729500794056599 samples/sec                   batch loss = 1042.4259622097015 | accuracy = 0.53


Epoch[1] Batch[380] Speed: 1.275340267531427 samples/sec                   batch loss = 1055.9249341487885 | accuracy = 0.5322368421052631


Epoch[1] Batch[385] Speed: 1.281788041118149 samples/sec                   batch loss = 1069.451901435852 | accuracy = 0.5331168831168831


Epoch[1] Batch[390] Speed: 1.275248562712963 samples/sec                   batch loss = 1082.9718482494354 | accuracy = 0.533974358974359


Epoch[1] Batch[395] Speed: 1.2752101785651306 samples/sec                   batch loss = 1096.0745615959167 | accuracy = 0.5360759493670886


Epoch[1] Batch[400] Speed: 1.2724866474811392 samples/sec                   batch loss = 1109.854950428009 | accuracy = 0.536875


Epoch[1] Batch[405] Speed: 1.2761194177434982 samples/sec                   batch loss = 1123.71160197258 | accuracy = 0.5376543209876543


Epoch[1] Batch[410] Speed: 1.273357019072938 samples/sec                   batch loss = 1137.5932114124298 | accuracy = 0.5359756097560976


Epoch[1] Batch[415] Speed: 1.2793656465654732 samples/sec                   batch loss = 1150.8973639011383 | accuracy = 0.5367469879518072


Epoch[1] Batch[420] Speed: 1.2732598980130208 samples/sec                   batch loss = 1163.6072895526886 | accuracy = 0.5398809523809524


Epoch[1] Batch[425] Speed: 1.2744605034377021 samples/sec                   batch loss = 1177.6221182346344 | accuracy = 0.54


Epoch[1] Batch[430] Speed: 1.276530620552373 samples/sec                   batch loss = 1190.929366827011 | accuracy = 0.5406976744186046


Epoch[1] Batch[435] Speed: 1.2724081871135688 samples/sec                   batch loss = 1204.9248704910278 | accuracy = 0.5402298850574713


Epoch[1] Batch[440] Speed: 1.2708293613325774 samples/sec                   batch loss = 1218.9511964321136 | accuracy = 0.540340909090909


Epoch[1] Batch[445] Speed: 1.2696545126642567 samples/sec                   batch loss = 1232.7993683815002 | accuracy = 0.5410112359550562


Epoch[1] Batch[450] Speed: 1.2703333247519863 samples/sec                   batch loss = 1246.8617956638336 | accuracy = 0.5405555555555556


Epoch[1] Batch[455] Speed: 1.2690346890481454 samples/sec                   batch loss = 1260.1462888717651 | accuracy = 0.5412087912087912


Epoch[1] Batch[460] Speed: 1.2691887720943955 samples/sec                   batch loss = 1273.449712753296 | accuracy = 0.5407608695652174


Epoch[1] Batch[465] Speed: 1.2709833990963006 samples/sec                   batch loss = 1287.4019494056702 | accuracy = 0.5397849462365591


Epoch[1] Batch[470] Speed: 1.2725090389301443 samples/sec                   batch loss = 1301.0076835155487 | accuracy = 0.5404255319148936


Epoch[1] Batch[475] Speed: 1.272387922170329 samples/sec                   batch loss = 1314.631847858429 | accuracy = 0.5405263157894736


Epoch[1] Batch[480] Speed: 1.2692055746888398 samples/sec                   batch loss = 1328.7407104969025 | accuracy = 0.5395833333333333


Epoch[1] Batch[485] Speed: 1.2738473925111615 samples/sec                   batch loss = 1342.1405146121979 | accuracy = 0.5396907216494845


Epoch[1] Batch[490] Speed: 1.2743819930783042 samples/sec                   batch loss = 1355.7535059452057 | accuracy = 0.5403061224489796


Epoch[1] Batch[495] Speed: 1.2697297509206311 samples/sec                   batch loss = 1369.371955871582 | accuracy = 0.5409090909090909


Epoch[1] Batch[500] Speed: 1.264802911308581 samples/sec                   batch loss = 1382.8601295948029 | accuracy = 0.5415


Epoch[1] Batch[505] Speed: 1.2739754622591242 samples/sec                   batch loss = 1396.117469549179 | accuracy = 0.5435643564356436


Epoch[1] Batch[510] Speed: 1.2761555269518112 samples/sec                   batch loss = 1410.6062655448914 | accuracy = 0.5416666666666666


Epoch[1] Batch[515] Speed: 1.2766982847778994 samples/sec                   batch loss = 1423.8304393291473 | accuracy = 0.5422330097087379


Epoch[1] Batch[520] Speed: 1.2711574104030057 samples/sec                   batch loss = 1437.5394625663757 | accuracy = 0.541826923076923


Epoch[1] Batch[525] Speed: 1.272085858055486 samples/sec                   batch loss = 1450.7561583518982 | accuracy = 0.5433333333333333


Epoch[1] Batch[530] Speed: 1.2649987928491429 samples/sec                   batch loss = 1464.3754441738129 | accuracy = 0.5433962264150943


Epoch[1] Batch[535] Speed: 1.2729908389392541 samples/sec                   batch loss = 1476.9149560928345 | accuracy = 0.5462616822429907


Epoch[1] Batch[540] Speed: 1.271840916180645 samples/sec                   batch loss = 1490.8033423423767 | accuracy = 0.5462962962962963


Epoch[1] Batch[545] Speed: 1.2665834267008536 samples/sec                   batch loss = 1504.5624372959137 | accuracy = 0.5463302752293578


Epoch[1] Batch[550] Speed: 1.2708513094326677 samples/sec                   batch loss = 1517.8205471038818 | accuracy = 0.5472727272727272


Epoch[1] Batch[555] Speed: 1.2687524445308906 samples/sec                   batch loss = 1530.3494997024536 | accuracy = 0.5486486486486486


Epoch[1] Batch[560] Speed: 1.272611644311104 samples/sec                   batch loss = 1543.7754187583923 | accuracy = 0.5482142857142858


Epoch[1] Batch[565] Speed: 1.2767971942938852 samples/sec                   batch loss = 1557.6777036190033 | accuracy = 0.5473451327433628


Epoch[1] Batch[570] Speed: 1.2697140875601787 samples/sec                   batch loss = 1571.709793806076 | accuracy = 0.5473684210526316


Epoch[1] Batch[575] Speed: 1.277042591033308 samples/sec                   batch loss = 1584.2919926643372 | accuracy = 0.548695652173913


Epoch[1] Batch[580] Speed: 1.2746834055254614 samples/sec                   batch loss = 1597.6250066757202 | accuracy = 0.5491379310344827


Epoch[1] Batch[585] Speed: 1.2722824583878296 samples/sec                   batch loss = 1610.0553696155548 | accuracy = 0.5517094017094017


Epoch[1] Batch[590] Speed: 1.2730060036973379 samples/sec                   batch loss = 1622.574904680252 | accuracy = 0.5533898305084746


Epoch[1] Batch[595] Speed: 1.2793095523244367 samples/sec                   batch loss = 1635.446587562561 | accuracy = 0.553781512605042


Epoch[1] Batch[600] Speed: 1.2760126551567148 samples/sec                   batch loss = 1649.26278424263 | accuracy = 0.5545833333333333


Epoch[1] Batch[605] Speed: 1.2728899107418021 samples/sec                   batch loss = 1661.9611194133759 | accuracy = 0.556198347107438


Epoch[1] Batch[610] Speed: 1.2773580035799368 samples/sec                   batch loss = 1675.3466079235077 | accuracy = 0.5565573770491803


Epoch[1] Batch[615] Speed: 1.2774739403087676 samples/sec                   batch loss = 1688.4745655059814 | accuracy = 0.556910569105691


Epoch[1] Batch[620] Speed: 1.2811850805600518 samples/sec                   batch loss = 1700.6625201702118 | accuracy = 0.5588709677419355


Epoch[1] Batch[625] Speed: 1.272613478424219 samples/sec                   batch loss = 1714.9873571395874 | accuracy = 0.5584


Epoch[1] Batch[630] Speed: 1.2813176636343482 samples/sec                   batch loss = 1728.2322092056274 | accuracy = 0.5599206349206349


Epoch[1] Batch[635] Speed: 1.2742532608979644 samples/sec                   batch loss = 1740.772135257721 | accuracy = 0.5610236220472441


Epoch[1] Batch[640] Speed: 1.275634374151511 samples/sec                   batch loss = 1753.9461691379547 | accuracy = 0.56171875


Epoch[1] Batch[645] Speed: 1.2784247924500984 samples/sec                   batch loss = 1767.0165185928345 | accuracy = 0.5624031007751938


Epoch[1] Batch[650] Speed: 1.2730011741173368 samples/sec                   batch loss = 1778.4886572360992 | accuracy = 0.5634615384615385


Epoch[1] Batch[655] Speed: 1.2716435848048089 samples/sec                   batch loss = 1791.3164467811584 | accuracy = 0.5641221374045802


Epoch[1] Batch[660] Speed: 1.2700143532408188 samples/sec                   batch loss = 1805.62704372406 | accuracy = 0.5640151515151515


Epoch[1] Batch[665] Speed: 1.2748166804997352 samples/sec                   batch loss = 1817.833526134491 | accuracy = 0.5650375939849624


Epoch[1] Batch[670] Speed: 1.2732433744119853 samples/sec                   batch loss = 1831.6546382904053 | accuracy = 0.5645522388059702


Epoch[1] Batch[675] Speed: 1.269380924303516 samples/sec                   batch loss = 1844.2428319454193 | accuracy = 0.5651851851851852


Epoch[1] Batch[680] Speed: 1.2729556812721472 samples/sec                   batch loss = 1857.924108505249 | accuracy = 0.5650735294117647


Epoch[1] Batch[685] Speed: 1.2663150788657247 samples/sec                   batch loss = 1869.429694890976 | accuracy = 0.5660583941605839


Epoch[1] Batch[690] Speed: 1.2694140598754406 samples/sec                   batch loss = 1882.4456007480621 | accuracy = 0.5666666666666667


Epoch[1] Batch[695] Speed: 1.272960220751845 samples/sec                   batch loss = 1896.4524109363556 | accuracy = 0.5661870503597123


Epoch[1] Batch[700] Speed: 1.2713106606933946 samples/sec                   batch loss = 1908.4071016311646 | accuracy = 0.5671428571428572


Epoch[1] Batch[705] Speed: 1.2625818001669624 samples/sec                   batch loss = 1920.7744452953339 | accuracy = 0.5684397163120567


Epoch[1] Batch[710] Speed: 1.26152828157051 samples/sec                   batch loss = 1934.480101108551 | accuracy = 0.5686619718309859


Epoch[1] Batch[715] Speed: 1.2692083591617478 samples/sec                   batch loss = 1947.3491220474243 | accuracy = 0.5685314685314685


Epoch[1] Batch[720] Speed: 1.2690783661119516 samples/sec                   batch loss = 1961.8759739398956 | accuracy = 0.5694444444444444


Epoch[1] Batch[725] Speed: 1.2651373961338737 samples/sec                   batch loss = 1976.510811805725 | accuracy = 0.5682758620689655


Epoch[1] Batch[730] Speed: 1.269643943508572 samples/sec                   batch loss = 1989.5649647712708 | accuracy = 0.5688356164383561


Epoch[1] Batch[735] Speed: 1.2674326211516875 samples/sec                   batch loss = 2004.2958269119263 | accuracy = 0.5683673469387756


Epoch[1] Batch[740] Speed: 1.2652433961510008 samples/sec                   batch loss = 2017.8986673355103 | accuracy = 0.5682432432432433


Epoch[1] Batch[745] Speed: 1.2707361866119893 samples/sec                   batch loss = 2031.2409427165985 | accuracy = 0.5687919463087249


Epoch[1] Batch[750] Speed: 1.2711584698319118 samples/sec                   batch loss = 2044.8205502033234 | accuracy = 0.5683333333333334


Epoch[1] Batch[755] Speed: 1.2649135282859885 samples/sec                   batch loss = 2058.046023130417 | accuracy = 0.5682119205298013


Epoch[1] Batch[760] Speed: 1.2622084956213733 samples/sec                   batch loss = 2070.5838243961334 | accuracy = 0.5684210526315789


Epoch[1] Batch[765] Speed: 1.2606406559905696 samples/sec                   batch loss = 2083.2846603393555 | accuracy = 0.568954248366013


Epoch[1] Batch[770] Speed: 1.2614157899825764 samples/sec                   batch loss = 2096.045416355133 | accuracy = 0.5691558441558442


Epoch[1] Batch[775] Speed: 1.261673431050288 samples/sec                   batch loss = 2109.8136994838715 | accuracy = 0.57


Epoch[1] Batch[780] Speed: 1.2635762656782323 samples/sec                   batch loss = 2122.9174723625183 | accuracy = 0.5698717948717948


Epoch[1] Batch[785] Speed: 1.2613583188882085 samples/sec                   batch loss = 2137.4514338970184 | accuracy = 0.5687898089171974


[Epoch 1] training: accuracy=0.5688451776649747
[Epoch 1] time cost: 636.7285950183868
[Epoch 1] validation: validation accuracy=0.6722222222222223


Epoch[2] Batch[5] Speed: 1.2726567264128208 samples/sec                   batch loss = 12.655039072036743 | accuracy = 0.7


Epoch[2] Batch[10] Speed: 1.2734069866236684 samples/sec                   batch loss = 24.80151927471161 | accuracy = 0.7


Epoch[2] Batch[15] Speed: 1.265675880852525 samples/sec                   batch loss = 37.918285727500916 | accuracy = 0.6833333333333333


Epoch[2] Batch[20] Speed: 1.2752464301996844 samples/sec                   batch loss = 50.47310268878937 | accuracy = 0.675


Epoch[2] Batch[25] Speed: 1.2672661371918599 samples/sec                   batch loss = 64.41476690769196 | accuracy = 0.65


Epoch[2] Batch[30] Speed: 1.2699167801556126 samples/sec                   batch loss = 77.36698997020721 | accuracy = 0.6583333333333333


Epoch[2] Batch[35] Speed: 1.272934626236471 samples/sec                   batch loss = 88.59812712669373 | accuracy = 0.6785714285714286


Epoch[2] Batch[40] Speed: 1.277848836168666 samples/sec                   batch loss = 100.9666041135788 | accuracy = 0.675


Epoch[2] Batch[45] Speed: 1.2646865936353546 samples/sec                   batch loss = 113.05037820339203 | accuracy = 0.6777777777777778


Epoch[2] Batch[50] Speed: 1.2712229056915836 samples/sec                   batch loss = 124.82079529762268 | accuracy = 0.68


Epoch[2] Batch[55] Speed: 1.2689372664310399 samples/sec                   batch loss = 137.7957842350006 | accuracy = 0.6772727272727272


Epoch[2] Batch[60] Speed: 1.2767066399990077 samples/sec                   batch loss = 148.9996658563614 | accuracy = 0.6875


Epoch[2] Batch[65] Speed: 1.2749073544041034 samples/sec                   batch loss = 161.84000492095947 | accuracy = 0.6807692307692308


Epoch[2] Batch[70] Speed: 1.2744654409105964 samples/sec                   batch loss = 173.87807273864746 | accuracy = 0.6857142857142857


Epoch[2] Batch[75] Speed: 1.2721562721910764 samples/sec                   batch loss = 188.0649721622467 | accuracy = 0.6733333333333333


Epoch[2] Batch[80] Speed: 1.2744565341244884 samples/sec                   batch loss = 200.2245945930481 | accuracy = 0.675


Epoch[2] Batch[85] Speed: 1.2724635812520906 samples/sec                   batch loss = 212.8093342781067 | accuracy = 0.6705882352941176


Epoch[2] Batch[90] Speed: 1.279243708839432 samples/sec                   batch loss = 226.6766231060028 | accuracy = 0.6722222222222223


Epoch[2] Batch[95] Speed: 1.2834016920611493 samples/sec                   batch loss = 240.01842260360718 | accuracy = 0.6710526315789473


Epoch[2] Batch[100] Speed: 1.2787785105169927 samples/sec                   batch loss = 252.606675863266 | accuracy = 0.6725


Epoch[2] Batch[105] Speed: 1.2726608776020487 samples/sec                   batch loss = 262.88394021987915 | accuracy = 0.680952380952381


Epoch[2] Batch[110] Speed: 1.2759659764025768 samples/sec                   batch loss = 274.40726339817047 | accuracy = 0.6818181818181818


Epoch[2] Batch[115] Speed: 1.2719871949226098 samples/sec                   batch loss = 284.87313091754913 | accuracy = 0.6891304347826087


Epoch[2] Batch[120] Speed: 1.2746615185500485 samples/sec                   batch loss = 298.48184049129486 | accuracy = 0.6833333333333333


Epoch[2] Batch[125] Speed: 1.268584750381546 samples/sec                   batch loss = 312.51376807689667 | accuracy = 0.674


Epoch[2] Batch[130] Speed: 1.2701400186994425 samples/sec                   batch loss = 325.0872223377228 | accuracy = 0.6711538461538461


Epoch[2] Batch[135] Speed: 1.2738160561015475 samples/sec                   batch loss = 337.17852449417114 | accuracy = 0.6722222222222223


Epoch[2] Batch[140] Speed: 1.2640175153489095 samples/sec                   batch loss = 348.595467209816 | accuracy = 0.6732142857142858


Epoch[2] Batch[145] Speed: 1.2665520641920816 samples/sec                   batch loss = 360.0158165693283 | accuracy = 0.6741379310344827


Epoch[2] Batch[150] Speed: 1.2739197429504974 samples/sec                   batch loss = 371.1607140302658 | accuracy = 0.6783333333333333


Epoch[2] Batch[155] Speed: 1.2743947709240895 samples/sec                   batch loss = 383.52818036079407 | accuracy = 0.6774193548387096


Epoch[2] Batch[160] Speed: 1.2757683331259928 samples/sec                   batch loss = 395.93425035476685 | accuracy = 0.6765625


Epoch[2] Batch[165] Speed: 1.2703315933919075 samples/sec                   batch loss = 407.8206549882889 | accuracy = 0.6742424242424242


Epoch[2] Batch[170] Speed: 1.2734764837988506 samples/sec                   batch loss = 419.39371478557587 | accuracy = 0.6779411764705883


Epoch[2] Batch[175] Speed: 1.2755176073826864 samples/sec                   batch loss = 430.76407492160797 | accuracy = 0.6814285714285714


Epoch[2] Batch[180] Speed: 1.271929045816831 samples/sec                   batch loss = 443.17167115211487 | accuracy = 0.6819444444444445


Epoch[2] Batch[185] Speed: 1.2745982829725222 samples/sec                   batch loss = 456.618013381958 | accuracy = 0.6783783783783783


Epoch[2] Batch[190] Speed: 1.2802062832389984 samples/sec                   batch loss = 469.1737540960312 | accuracy = 0.6736842105263158


Epoch[2] Batch[195] Speed: 1.2734335667058603 samples/sec                   batch loss = 481.7311121225357 | accuracy = 0.6730769230769231


Epoch[2] Batch[200] Speed: 1.2776859293916452 samples/sec                   batch loss = 495.00600588321686 | accuracy = 0.6725


Epoch[2] Batch[205] Speed: 1.2709775257218692 samples/sec                   batch loss = 506.56091248989105 | accuracy = 0.6731707317073171


Epoch[2] Batch[210] Speed: 1.2708331155591086 samples/sec                   batch loss = 520.2225860357285 | accuracy = 0.6738095238095239


Epoch[2] Batch[215] Speed: 1.274897666405743 samples/sec                   batch loss = 533.6362138986588 | accuracy = 0.672093023255814


Epoch[2] Batch[220] Speed: 1.2792902375330093 samples/sec                   batch loss = 548.6033190488815 | accuracy = 0.6704545454545454


Epoch[2] Batch[225] Speed: 1.2770288852039875 samples/sec                   batch loss = 560.6811691522598 | accuracy = 0.6711111111111111


Epoch[2] Batch[230] Speed: 1.2738524219550502 samples/sec                   batch loss = 573.367866396904 | accuracy = 0.6695652173913044


Epoch[2] Batch[235] Speed: 1.268376825162966 samples/sec                   batch loss = 585.4569618701935 | accuracy = 0.6702127659574468


Epoch[2] Batch[240] Speed: 1.2684397326126875 samples/sec                   batch loss = 598.0535743236542 | accuracy = 0.6677083333333333


Epoch[2] Batch[245] Speed: 1.2722374992729677 samples/sec                   batch loss = 613.0935137271881 | accuracy = 0.6642857142857143


Epoch[2] Batch[250] Speed: 1.269038432678669 samples/sec                   batch loss = 622.9135746359825 | accuracy = 0.667


Epoch[2] Batch[255] Speed: 1.2714297417364382 samples/sec                   batch loss = 634.2037430405617 | accuracy = 0.6676470588235294


Epoch[2] Batch[260] Speed: 1.2675603619024478 samples/sec                   batch loss = 645.0783666968346 | accuracy = 0.6692307692307692


Epoch[2] Batch[265] Speed: 1.2677307548942112 samples/sec                   batch loss = 656.4222204089165 | accuracy = 0.6698113207547169


Epoch[2] Batch[270] Speed: 1.2659980262849106 samples/sec                   batch loss = 669.1065642237663 | accuracy = 0.6666666666666666


Epoch[2] Batch[275] Speed: 1.2724065465941512 samples/sec                   batch loss = 681.5822725892067 | accuracy = 0.6663636363636364


Epoch[2] Batch[280] Speed: 1.2739518583458491 samples/sec                   batch loss = 692.9491043686867 | accuracy = 0.6669642857142857


Epoch[2] Batch[285] Speed: 1.2674439195204248 samples/sec                   batch loss = 703.5136559605598 | accuracy = 0.6675438596491228


Epoch[2] Batch[290] Speed: 1.270003970367208 samples/sec                   batch loss = 714.5847645401955 | accuracy = 0.6698275862068965


Epoch[2] Batch[295] Speed: 1.273855130133594 samples/sec                   batch loss = 726.7672181725502 | accuracy = 0.6711864406779661


Epoch[2] Batch[300] Speed: 1.2735157303393814 samples/sec                   batch loss = 739.8671949505806 | accuracy = 0.6708333333333333


Epoch[2] Batch[305] Speed: 1.2717595469237755 samples/sec                   batch loss = 752.7696375250816 | accuracy = 0.6688524590163935


Epoch[2] Batch[310] Speed: 1.2751382629734926 samples/sec                   batch loss = 764.5297294259071 | accuracy = 0.6685483870967742


Epoch[2] Batch[315] Speed: 1.272972873515077 samples/sec                   batch loss = 777.3033803105354 | accuracy = 0.669047619047619


Epoch[2] Batch[320] Speed: 1.278759504158373 samples/sec                   batch loss = 789.9023130536079 | accuracy = 0.66875


Epoch[2] Batch[325] Speed: 1.2705111990519413 samples/sec                   batch loss = 803.6658096909523 | accuracy = 0.6653846153846154


Epoch[2] Batch[330] Speed: 1.259415637766789 samples/sec                   batch loss = 814.5085392594337 | accuracy = 0.6666666666666666


Epoch[2] Batch[335] Speed: 1.2698096112383024 samples/sec                   batch loss = 827.6501985192299 | accuracy = 0.6656716417910448


Epoch[2] Batch[340] Speed: 1.2719420638114436 samples/sec                   batch loss = 841.0577083230019 | accuracy = 0.6639705882352941


Epoch[2] Batch[345] Speed: 1.2670307983376938 samples/sec                   batch loss = 852.8265218138695 | accuracy = 0.6652173913043479


Epoch[2] Batch[350] Speed: 1.2714890979000655 samples/sec                   batch loss = 863.2425289750099 | accuracy = 0.665


Epoch[2] Batch[355] Speed: 1.261855721588737 samples/sec                   batch loss = 875.2614008784294 | accuracy = 0.6654929577464789


Epoch[2] Batch[360] Speed: 1.2692534885297335 samples/sec                   batch loss = 889.2859823107719 | accuracy = 0.6631944444444444


Epoch[2] Batch[365] Speed: 1.2690964137765348 samples/sec                   batch loss = 900.8883835673332 | accuracy = 0.6643835616438356


Epoch[2] Batch[370] Speed: 1.2669796076831579 samples/sec                   batch loss = 913.1760300993919 | accuracy = 0.6641891891891892


Epoch[2] Batch[375] Speed: 1.2653196393678054 samples/sec                   batch loss = 925.6950849890709 | accuracy = 0.6626666666666666


Epoch[2] Batch[380] Speed: 1.2667373932485284 samples/sec                   batch loss = 936.8077556490898 | accuracy = 0.6631578947368421


Epoch[2] Batch[385] Speed: 1.2660294568971835 samples/sec                   batch loss = 951.6877836585045 | accuracy = 0.6597402597402597


Epoch[2] Batch[390] Speed: 1.2723743160706384 samples/sec                   batch loss = 965.6935438513756 | accuracy = 0.6583333333333333


Epoch[2] Batch[395] Speed: 1.2695272140846166 samples/sec                   batch loss = 979.3628308176994 | accuracy = 0.6569620253164556


Epoch[2] Batch[400] Speed: 1.2604234899284994 samples/sec                   batch loss = 991.6731447577477 | accuracy = 0.656875


Epoch[2] Batch[405] Speed: 1.2709742520613199 samples/sec                   batch loss = 1002.4382318854332 | accuracy = 0.658641975308642


Epoch[2] Batch[410] Speed: 1.2694622776237474 samples/sec                   batch loss = 1014.9186901450157 | accuracy = 0.6585365853658537


Epoch[2] Batch[415] Speed: 1.2685273915066542 samples/sec                   batch loss = 1025.6895695328712 | accuracy = 0.6602409638554216


Epoch[2] Batch[420] Speed: 1.2672592451818596 samples/sec                   batch loss = 1038.472430408001 | accuracy = 0.6601190476190476


Epoch[2] Batch[425] Speed: 1.2689000290276904 samples/sec                   batch loss = 1052.0074912905693 | accuracy = 0.6588235294117647


Epoch[2] Batch[430] Speed: 1.2721966914970704 samples/sec                   batch loss = 1062.52577573061 | accuracy = 0.6593023255813953


Epoch[2] Batch[435] Speed: 1.2750749800859975 samples/sec                   batch loss = 1076.092715203762 | accuracy = 0.6580459770114943


Epoch[2] Batch[440] Speed: 1.2721708382798003 samples/sec                   batch loss = 1090.8218691945076 | accuracy = 0.6579545454545455


Epoch[2] Batch[445] Speed: 1.2686000981173209 samples/sec                   batch loss = 1102.4991608262062 | accuracy = 0.6589887640449438


Epoch[2] Batch[450] Speed: 1.2598546481219532 samples/sec                   batch loss = 1115.7279673218727 | accuracy = 0.6583333333333333


Epoch[2] Batch[455] Speed: 1.2701875222963228 samples/sec                   batch loss = 1128.1712449193 | accuracy = 0.6587912087912088


Epoch[2] Batch[460] Speed: 1.2682833385758867 samples/sec                   batch loss = 1140.4610250592232 | accuracy = 0.6597826086956522


Epoch[2] Batch[465] Speed: 1.2664889614116495 samples/sec                   batch loss = 1150.3378509879112 | accuracy = 0.660752688172043


Epoch[2] Batch[470] Speed: 1.269375834064594 samples/sec                   batch loss = 1162.4242191910744 | accuracy = 0.6617021276595745


Epoch[2] Batch[475] Speed: 1.2747088767130699 samples/sec                   batch loss = 1174.8035570979118 | accuracy = 0.6610526315789473


Epoch[2] Batch[480] Speed: 1.266165228321888 samples/sec                   batch loss = 1187.8640379309654 | accuracy = 0.6614583333333334


Epoch[2] Batch[485] Speed: 1.268610170270745 samples/sec                   batch loss = 1200.7378937602043 | accuracy = 0.6608247422680412


Epoch[2] Batch[490] Speed: 1.2715437374491905 samples/sec                   batch loss = 1213.3356140255928 | accuracy = 0.6602040816326531


Epoch[2] Batch[495] Speed: 1.2671020894113787 samples/sec                   batch loss = 1226.7203593850136 | accuracy = 0.658080808080808


Epoch[2] Batch[500] Speed: 1.2654688125579474 samples/sec                   batch loss = 1237.8956995606422 | accuracy = 0.658


Epoch[2] Batch[505] Speed: 1.2642193457506101 samples/sec                   batch loss = 1251.2111293673515 | accuracy = 0.656930693069307


Epoch[2] Batch[510] Speed: 1.2687184800075622 samples/sec                   batch loss = 1262.7359883189201 | accuracy = 0.6568627450980392


Epoch[2] Batch[515] Speed: 1.2652971185127773 samples/sec                   batch loss = 1274.0418350100517 | accuracy = 0.6572815533980583


Epoch[2] Batch[520] Speed: 1.2637625335125895 samples/sec                   batch loss = 1287.002925336361 | accuracy = 0.6557692307692308


Epoch[2] Batch[525] Speed: 1.265228988264977 samples/sec                   batch loss = 1297.5217605233192 | accuracy = 0.6566666666666666


Epoch[2] Batch[530] Speed: 1.264129899975512 samples/sec                   batch loss = 1309.6462463736534 | accuracy = 0.6566037735849056


Epoch[2] Batch[535] Speed: 1.2632277677937613 samples/sec                   batch loss = 1321.5929334759712 | accuracy = 0.6565420560747663


Epoch[2] Batch[540] Speed: 1.2672712105249004 samples/sec                   batch loss = 1334.0621091723442 | accuracy = 0.6564814814814814


Epoch[2] Batch[545] Speed: 1.2706399460819966 samples/sec                   batch loss = 1349.3744658827782 | accuracy = 0.6559633027522935


Epoch[2] Batch[550] Speed: 1.2546756951499636 samples/sec                   batch loss = 1361.3051806092262 | accuracy = 0.6563636363636364


Epoch[2] Batch[555] Speed: 1.273024742814677 samples/sec                   batch loss = 1373.5298382639885 | accuracy = 0.6576576576576577


Epoch[2] Batch[560] Speed: 1.2764648685662114 samples/sec                   batch loss = 1384.4466465115547 | accuracy = 0.6584821428571429


Epoch[2] Batch[565] Speed: 1.2668060686606544 samples/sec                   batch loss = 1398.3593078255653 | accuracy = 0.6570796460176991


Epoch[2] Batch[570] Speed: 1.274420617804605 samples/sec                   batch loss = 1410.680788218975 | accuracy = 0.6574561403508772


Epoch[2] Batch[575] Speed: 1.2788744281461262 samples/sec                   batch loss = 1418.9457432627678 | accuracy = 0.6595652173913044


Epoch[2] Batch[580] Speed: 1.27003233139567 samples/sec                   batch loss = 1430.572943508625 | accuracy = 0.6607758620689655


Epoch[2] Batch[585] Speed: 1.2726029564787074 samples/sec                   batch loss = 1441.400874555111 | accuracy = 0.6611111111111111


Epoch[2] Batch[590] Speed: 1.2722689510236982 samples/sec                   batch loss = 1451.9662566781044 | accuracy = 0.6622881355932203


Epoch[2] Batch[595] Speed: 1.2752812298305092 samples/sec                   batch loss = 1461.0868925452232 | accuracy = 0.6634453781512605


Epoch[2] Batch[600] Speed: 1.2679368393145596 samples/sec                   batch loss = 1472.6348474621773 | accuracy = 0.66375


Epoch[2] Batch[605] Speed: 1.2707697779889178 samples/sec                   batch loss = 1486.6395691037178 | accuracy = 0.6632231404958677


Epoch[2] Batch[610] Speed: 1.26937593010646 samples/sec                   batch loss = 1497.928872525692 | accuracy = 0.6631147540983606


Epoch[2] Batch[615] Speed: 1.2686686878495892 samples/sec                   batch loss = 1512.4013323187828 | accuracy = 0.6626016260162602


Epoch[2] Batch[620] Speed: 1.2656923995912297 samples/sec                   batch loss = 1524.8775368332863 | accuracy = 0.6625


Epoch[2] Batch[625] Speed: 1.2739183887224177 samples/sec                   batch loss = 1537.739311993122 | accuracy = 0.6628


Epoch[2] Batch[630] Speed: 1.2735731545136264 samples/sec                   batch loss = 1549.0907137989998 | accuracy = 0.6623015873015873


Epoch[2] Batch[635] Speed: 1.2711332366414725 samples/sec                   batch loss = 1559.0943550467491 | accuracy = 0.6637795275590551


Epoch[2] Batch[640] Speed: 1.2730683085649601 samples/sec                   batch loss = 1571.1506627202034 | accuracy = 0.6640625


Epoch[2] Batch[645] Speed: 1.276997780867201 samples/sec                   batch loss = 1585.3088404536247 | accuracy = 0.663953488372093


Epoch[2] Batch[650] Speed: 1.2633314503207722 samples/sec                   batch loss = 1595.2552219033241 | accuracy = 0.6646153846153846


Epoch[2] Batch[655] Speed: 1.2667123352727012 samples/sec                   batch loss = 1608.8317635655403 | accuracy = 0.6648854961832061


Epoch[2] Batch[660] Speed: 1.2799255902633835 samples/sec                   batch loss = 1620.8359470963478 | accuracy = 0.6647727272727273


Epoch[2] Batch[665] Speed: 1.2690499516803557 samples/sec                   batch loss = 1631.2256115078926 | accuracy = 0.6654135338345865


Epoch[2] Batch[670] Speed: 1.267830674956038 samples/sec                   batch loss = 1642.065140068531 | accuracy = 0.6671641791044776


Epoch[2] Batch[675] Speed: 1.2722762835572106 samples/sec                   batch loss = 1655.6179247498512 | accuracy = 0.6655555555555556


Epoch[2] Batch[680] Speed: 1.2729116403414245 samples/sec                   batch loss = 1668.3769972920418 | accuracy = 0.6654411764705882


Epoch[2] Batch[685] Speed: 1.2685287342972726 samples/sec                   batch loss = 1679.7744234204292 | accuracy = 0.6656934306569343


Epoch[2] Batch[690] Speed: 1.272412240179678 samples/sec                   batch loss = 1692.3597384095192 | accuracy = 0.6655797101449276


Epoch[2] Batch[695] Speed: 1.2699704194640804 samples/sec                   batch loss = 1704.6562774777412 | accuracy = 0.6658273381294963


Epoch[2] Batch[700] Speed: 1.2717740075238142 samples/sec                   batch loss = 1716.5928710103035 | accuracy = 0.6660714285714285


Epoch[2] Batch[705] Speed: 1.2700858843168557 samples/sec                   batch loss = 1726.5056741833687 | accuracy = 0.6659574468085107


Epoch[2] Batch[710] Speed: 1.2753863187938226 samples/sec                   batch loss = 1736.0193335413933 | accuracy = 0.6672535211267606


Epoch[2] Batch[715] Speed: 1.264596319172856 samples/sec                   batch loss = 1744.5545541644096 | accuracy = 0.6678321678321678


Epoch[2] Batch[720] Speed: 1.2737951660183127 samples/sec                   batch loss = 1755.401827275753 | accuracy = 0.6690972222222222


Epoch[2] Batch[725] Speed: 1.2723005972994086 samples/sec                   batch loss = 1766.6502150893211 | accuracy = 0.6696551724137931


Epoch[2] Batch[730] Speed: 1.271615055388859 samples/sec                   batch loss = 1777.2951640486717 | accuracy = 0.6702054794520548


Epoch[2] Batch[735] Speed: 1.2740591471135494 samples/sec                   batch loss = 1788.2597293257713 | accuracy = 0.6707482993197279


Epoch[2] Batch[740] Speed: 1.2699198561225138 samples/sec                   batch loss = 1800.1620637774467 | accuracy = 0.6716216216216216


Epoch[2] Batch[745] Speed: 1.2712839764965904 samples/sec                   batch loss = 1812.6615596413612 | accuracy = 0.6724832214765101


Epoch[2] Batch[750] Speed: 1.2669989352573163 samples/sec                   batch loss = 1826.491060912609 | accuracy = 0.672


Epoch[2] Batch[755] Speed: 1.2810219089569133 samples/sec                   batch loss = 1838.7215749621391 | accuracy = 0.6718543046357616


Epoch[2] Batch[760] Speed: 1.2657664050139306 samples/sec                   batch loss = 1851.2648159861565 | accuracy = 0.6710526315789473


Epoch[2] Batch[765] Speed: 1.270348907205073 samples/sec                   batch loss = 1863.1416965126991 | accuracy = 0.6705882352941176


Epoch[2] Batch[770] Speed: 1.2747216611148569 samples/sec                   batch loss = 1875.1323671936989 | accuracy = 0.6704545454545454


Epoch[2] Batch[775] Speed: 1.272638577345383 samples/sec                   batch loss = 1887.911197245121 | accuracy = 0.67


Epoch[2] Batch[780] Speed: 1.276907586858395 samples/sec                   batch loss = 1899.3565645813942 | accuracy = 0.6705128205128205


Epoch[2] Batch[785] Speed: 1.2789625602767287 samples/sec                   batch loss = 1911.655762732029 | accuracy = 0.6697452229299363


[Epoch 2] training: accuracy=0.6700507614213198
[Epoch 2] time cost: 635.3373367786407
[Epoch 2] validation: validation accuracy=0.7133333333333334


## Next steps

Now that you have trained a neural network and run predictions with it on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).