<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference may be faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `device` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `device` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), device=gpu)
x

[04:33:18] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], device=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. Copying to `npx.gpu(1)` on a machine with only one GPU would fail, so the code falls back to `npx.cpu()` in that case, which is what the output below shows:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[04:33:18] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
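As a minimal sketch of the zero-based indexing described above, the hypothetical helper `pick_gpu_ids` (not part of MXNet) returns the ids you would pass to `npx.gpu(i)`. It is written in plain Python so that it runs without a GPU; in a real session the available count would come from `npx.num_gpus()`.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return zero-based GPU ids to use (hypothetical helper, not an MXNet API)."""
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    # Never ask for more devices than the machine exposes.
    return list(range(min(num_available, num_requested)))


# On an eight-GPU machine the ids run from 0 to 7.
all_ids = pick_gpu_ids(num_available=8, num_requested=8)
print(all_ids)

# Requesting two GPUs on a single-GPU machine yields only id 0.
print(pick_gpu_ids(num_available=1, num_requested=2))
```

Capping the request at the number of available devices mirrors the slicing pattern used later in this tutorial, where a list of devices is truncated to the requested size.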

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), device=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], device=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy and move the input data and parameters to the GPU. To demonstrate this you can reuse the previously defined LeafNetwork in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters directly onto GPU 0 as shown below. Alternatively, you could use `net.collect_params().reset_device(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', device=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), device=gpu)
net(x)

[04:33:19] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 3.3337808, -0.9091317]], device=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, you split each data batch into *n* parts and assign one part to each GPU, which runs the forward and backward passes on its own chunk of the data.
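The idea can be sketched in plain Python without any GPU. The example below is illustrative only: `split_batch` and `grad_sum` are hypothetical helpers, the "model" is a single weight `w` with a squared-error loss, and none of this reflects how `gluon.utils.split_and_load` or MXNet's gradient aggregation are implemented. It shows that summing the gradients computed on each chunk and dividing by the batch size gives the same result as the mean gradient over the whole batch.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous, near-equal chunks (illustrative only)."""
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    base, extra = divmod(len(batch), num_parts)
    chunks, start = [], 0
    for i in range(num_parts):
        # The first `extra` chunks receive one additional sample.
        size = base + (1 if i < extra else 0)
        chunks.append(batch[start:start + size])
        start += size
    return chunks


def grad_sum(chunk, w):
    """Sum over a chunk of d/dw (w * x - y)**2 for (x, y) pairs."""
    return sum(2.0 * (w * x - y) * x for x, y in chunk)


batch = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.5)]
w = 1.5

# Each simulated device computes the gradient on its own chunk.
per_device = [grad_sum(chunk, w) for chunk in split_batch(batch, 2)]

# Sum the per-device gradients, then normalize by the full batch size.
parallel_grad = sum(per_device) / len(batch)

# Reference: the mean gradient computed on the whole batch at once.
full_grad = grad_sum(batch, w) / len(batch)

print(parallel_grad, full_grad)
```

Because the two values agree, splitting a batch across devices changes where the work happens but not the update that results from it.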

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
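The helper above hands the metric one list of labels and one list of outputs, with one entry per device. As a rough sketch of the bookkeeping this implies, the hypothetical `RunningAccuracy` class below (plain Python, not the `gluon.metric.Accuracy` implementation) accumulates correct predictions and sample counts across shards and divides once at the end.

```python
class RunningAccuracy:
    """Accumulate accuracy over per-device shards (hypothetical, illustrative only)."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, label_shards, score_shards):
        # One (labels, scores) pair per simulated device.
        for labels, scores in zip(label_shards, score_shards):
            for label, row in zip(labels, scores):
                # The predicted class is the index of the largest score.
                predicted = max(range(len(row)), key=row.__getitem__)
                self.correct += int(predicted == label)
                self.total += 1

    def get(self):
        return self.correct / self.total if self.total else float("nan")


metric = RunningAccuracy()
label_shards = [[0, 1], [1, 0]]
score_shards = [
    [[2.0, -1.0], [0.1, 0.9]],   # shard for device 0: both predictions correct
    [[0.3, 0.2], [1.5, -0.5]],   # shard for device 1: first wrong, second correct
]
metric.update(label_shards, score_shards)
print(metric.get())
```

Keeping running counts rather than averaging per-shard accuracies gives the right answer even when the shards differ in size.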

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, device=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

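            # Note: train_loss is the running sum for the epoch, and btic spans
            # log_interval batches, so the printed speed is batch_size divided by
            # the time taken for log_interval batches.
            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")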
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.771559018469258 samples/sec                   batch loss = 14.219425916671753 | accuracy = 0.45


Epoch[1] Batch[10] Speed: 1.245309995211664 samples/sec                   batch loss = 29.866958379745483 | accuracy = 0.475


Epoch[1] Batch[15] Speed: 1.2506267029850884 samples/sec                   batch loss = 44.06512689590454 | accuracy = 0.5


Epoch[1] Batch[20] Speed: 1.2529979103278202 samples/sec                   batch loss = 58.4277663230896 | accuracy = 0.5


Epoch[1] Batch[25] Speed: 1.2504709422975127 samples/sec                   batch loss = 73.25126576423645 | accuracy = 0.49


Epoch[1] Batch[30] Speed: 1.2498802247684377 samples/sec                   batch loss = 87.22441077232361 | accuracy = 0.5


Epoch[1] Batch[35] Speed: 1.2568166084936696 samples/sec                   batch loss = 101.62382245063782 | accuracy = 0.4928571428571429


Epoch[1] Batch[40] Speed: 1.2481844012660683 samples/sec                   batch loss = 115.11426758766174 | accuracy = 0.5


Epoch[1] Batch[45] Speed: 1.2527747629282906 samples/sec                   batch loss = 129.00099539756775 | accuracy = 0.5


Epoch[1] Batch[50] Speed: 1.2524354692742947 samples/sec                   batch loss = 142.51794242858887 | accuracy = 0.51


Epoch[1] Batch[55] Speed: 1.2504704762854537 samples/sec                   batch loss = 157.0192108154297 | accuracy = 0.5


Epoch[1] Batch[60] Speed: 1.2523379610346297 samples/sec                   batch loss = 170.30674576759338 | accuracy = 0.5208333333333334


Epoch[1] Batch[65] Speed: 1.2512329635716681 samples/sec                   batch loss = 184.65893650054932 | accuracy = 0.5230769230769231


Epoch[1] Batch[70] Speed: 1.243166064678333 samples/sec                   batch loss = 199.20412302017212 | accuracy = 0.5178571428571429


Epoch[1] Batch[75] Speed: 1.2421871749175748 samples/sec                   batch loss = 212.90191054344177 | accuracy = 0.5166666666666667


Epoch[1] Batch[80] Speed: 1.2453415162433437 samples/sec                   batch loss = 226.4406452178955 | accuracy = 0.51875


Epoch[1] Batch[85] Speed: 1.2488172973372107 samples/sec                   batch loss = 240.5904040336609 | accuracy = 0.5117647058823529


Epoch[1] Batch[90] Speed: 1.2520267467738544 samples/sec                   batch loss = 254.4157223701477 | accuracy = 0.5194444444444445


Epoch[1] Batch[95] Speed: 1.2483918892553663 samples/sec                   batch loss = 268.2559859752655 | accuracy = 0.5263157894736842


Epoch[1] Batch[100] Speed: 1.2473098837831729 samples/sec                   batch loss = 282.9278917312622 | accuracy = 0.5175


Epoch[1] Batch[105] Speed: 1.2398732547624798 samples/sec                   batch loss = 297.64745235443115 | accuracy = 0.5119047619047619


Epoch[1] Batch[110] Speed: 1.2529436365591655 samples/sec                   batch loss = 311.8339877128601 | accuracy = 0.509090909090909


Epoch[1] Batch[115] Speed: 1.2523286130077476 samples/sec                   batch loss = 325.9719982147217 | accuracy = 0.508695652173913


Epoch[1] Batch[120] Speed: 1.2476168089158575 samples/sec                   batch loss = 339.755318403244 | accuracy = 0.5104166666666666


Epoch[1] Batch[125] Speed: 1.2435816499018424 samples/sec                   batch loss = 353.6996912956238 | accuracy = 0.508


Epoch[1] Batch[130] Speed: 1.2561572400805512 samples/sec                   batch loss = 367.25262355804443 | accuracy = 0.5134615384615384


Epoch[1] Batch[135] Speed: 1.2543891090789328 samples/sec                   batch loss = 380.3480041027069 | accuracy = 0.5203703703703704


Epoch[1] Batch[140] Speed: 1.248415391572274 samples/sec                   batch loss = 394.16697812080383 | accuracy = 0.5214285714285715


Epoch[1] Batch[145] Speed: 1.2557133774003693 samples/sec                   batch loss = 407.79932475090027 | accuracy = 0.5241379310344828


Epoch[1] Batch[150] Speed: 1.2507135021047977 samples/sec                   batch loss = 421.31820797920227 | accuracy = 0.5283333333333333


Epoch[1] Batch[155] Speed: 1.2523222564291816 samples/sec                   batch loss = 435.2847774028778 | accuracy = 0.5258064516129032


Epoch[1] Batch[160] Speed: 1.2561252632086493 samples/sec                   batch loss = 448.90203619003296 | accuracy = 0.5265625


Epoch[1] Batch[165] Speed: 1.2519484535497807 samples/sec                   batch loss = 462.26451659202576 | accuracy = 0.5272727272727272


Epoch[1] Batch[170] Speed: 1.250192329073032 samples/sec                   batch loss = 476.3173279762268 | accuracy = 0.5220588235294118


Epoch[1] Batch[175] Speed: 1.2513894742024052 samples/sec                   batch loss = 489.8279139995575 | accuracy = 0.5228571428571429


Epoch[1] Batch[180] Speed: 1.256425721909395 samples/sec                   batch loss = 504.24291229248047 | accuracy = 0.5194444444444445


Epoch[1] Batch[185] Speed: 1.252321228165183 samples/sec                   batch loss = 517.6277139186859 | accuracy = 0.5243243243243243


Epoch[1] Batch[190] Speed: 1.2514749789533814 samples/sec                   batch loss = 531.5100724697113 | accuracy = 0.525


Epoch[1] Batch[195] Speed: 1.2455281793317738 samples/sec                   batch loss = 545.1518776416779 | accuracy = 0.5256410256410257


Epoch[1] Batch[200] Speed: 1.2479079197582554 samples/sec                   batch loss = 558.6377141475677 | accuracy = 0.5275


Epoch[1] Batch[205] Speed: 1.2517957258631753 samples/sec                   batch loss = 572.3351120948792 | accuracy = 0.526829268292683


Epoch[1] Batch[210] Speed: 1.2504875325531024 samples/sec                   batch loss = 585.7828333377838 | accuracy = 0.530952380952381


Epoch[1] Batch[215] Speed: 1.2471769205139935 samples/sec                   batch loss = 599.708270072937 | accuracy = 0.5313953488372093


Epoch[1] Batch[220] Speed: 1.2513284331991499 samples/sec                   batch loss = 613.1820175647736 | accuracy = 0.5306818181818181


Epoch[1] Batch[225] Speed: 1.2509486008189183 samples/sec                   batch loss = 627.0606596469879 | accuracy = 0.53


Epoch[1] Batch[230] Speed: 1.2500079907987722 samples/sec                   batch loss = 640.7266619205475 | accuracy = 0.5304347826086957


Epoch[1] Batch[235] Speed: 1.2495038765624928 samples/sec                   batch loss = 654.6558036804199 | accuracy = 0.5287234042553192


Epoch[1] Batch[240] Speed: 1.256213861947191 samples/sec                   batch loss = 668.3672297000885 | accuracy = 0.5302083333333333


Epoch[1] Batch[245] Speed: 1.242494896445897 samples/sec                   batch loss = 681.9234318733215 | accuracy = 0.5326530612244897


Epoch[1] Batch[250] Speed: 1.2450690645619775 samples/sec                   batch loss = 695.3654537200928 | accuracy = 0.534


Epoch[1] Batch[255] Speed: 1.2479765180809959 samples/sec                   batch loss = 708.7968018054962 | accuracy = 0.5343137254901961


Epoch[1] Batch[260] Speed: 1.2458138757092219 samples/sec                   batch loss = 722.8165152072906 | accuracy = 0.5336538461538461


Epoch[1] Batch[265] Speed: 1.240922289915567 samples/sec                   batch loss = 736.8291201591492 | accuracy = 0.529245283018868


Epoch[1] Batch[270] Speed: 1.2421626189606942 samples/sec                   batch loss = 750.4475741386414 | accuracy = 0.5296296296296297


Epoch[1] Batch[275] Speed: 1.243516483255844 samples/sec                   batch loss = 763.8802723884583 | accuracy = 0.5336363636363637


Epoch[1] Batch[280] Speed: 1.2494085920850013 samples/sec                   batch loss = 777.1428940296173 | accuracy = 0.5366071428571428


Epoch[1] Batch[285] Speed: 1.2513985281916835 samples/sec                   batch loss = 790.5887765884399 | accuracy = 0.5412280701754386


Epoch[1] Batch[290] Speed: 1.2429107696087078 samples/sec                   batch loss = 804.2935738563538 | accuracy = 0.5413793103448276


Epoch[1] Batch[295] Speed: 1.2491598415582095 samples/sec                   batch loss = 817.5665261745453 | accuracy = 0.5423728813559322


Epoch[1] Batch[300] Speed: 1.2495548746163758 samples/sec                   batch loss = 831.0469219684601 | accuracy = 0.5433333333333333


Epoch[1] Batch[305] Speed: 1.2490965999279304 samples/sec                   batch loss = 844.9208800792694 | accuracy = 0.5418032786885246


Epoch[1] Batch[310] Speed: 1.2500838991683998 samples/sec                   batch loss = 858.6593976020813 | accuracy = 0.5411290322580645


Epoch[1] Batch[315] Speed: 1.254959783758036 samples/sec                   batch loss = 872.1845154762268 | accuracy = 0.5428571428571428


Epoch[1] Batch[320] Speed: 1.2518080547890462 samples/sec                   batch loss = 885.3291869163513 | accuracy = 0.5453125


Epoch[1] Batch[325] Speed: 1.2515102670709561 samples/sec                   batch loss = 899.1472430229187 | accuracy = 0.5453846153846154


Epoch[1] Batch[330] Speed: 1.24837776971183 samples/sec                   batch loss = 913.368465423584 | accuracy = 0.5462121212121213


Epoch[1] Batch[335] Speed: 1.2437769142052588 samples/sec                   batch loss = 926.6629953384399 | accuracy = 0.5462686567164179


Epoch[1] Batch[340] Speed: 1.2475455598858403 samples/sec                   batch loss = 939.8756210803986 | accuracy = 0.549264705882353


Epoch[1] Batch[345] Speed: 1.2486453522218188 samples/sec                   batch loss = 952.7986233234406 | accuracy = 0.5528985507246377


Epoch[1] Batch[350] Speed: 1.2480795689765798 samples/sec                   batch loss = 966.3975164890289 | accuracy = 0.5535714285714286


Epoch[1] Batch[355] Speed: 1.250855707148604 samples/sec                   batch loss = 980.1188366413116 | accuracy = 0.5535211267605634


Epoch[1] Batch[360] Speed: 1.250847966632157 samples/sec                   batch loss = 993.7033376693726 | accuracy = 0.5541666666666667


Epoch[1] Batch[365] Speed: 1.247462260793064 samples/sec                   batch loss = 1007.2140653133392 | accuracy = 0.5534246575342465


Epoch[1] Batch[370] Speed: 1.2470741113141102 samples/sec                   batch loss = 1020.4451684951782 | accuracy = 0.5540540540540541


Epoch[1] Batch[375] Speed: 1.2481632291100324 samples/sec                   batch loss = 1033.6866374015808 | accuracy = 0.5546666666666666


Epoch[1] Batch[380] Speed: 1.2493374173638017 samples/sec                   batch loss = 1047.3694882392883 | accuracy = 0.5539473684210526


Epoch[1] Batch[385] Speed: 1.2446107497909848 samples/sec                   batch loss = 1060.5128681659698 | accuracy = 0.5551948051948052


Epoch[1] Batch[390] Speed: 1.2517677064802561 samples/sec                   batch loss = 1074.3043365478516 | accuracy = 0.5544871794871795


Epoch[1] Batch[395] Speed: 1.2523493658164158 samples/sec                   batch loss = 1086.9149374961853 | accuracy = 0.5569620253164557


Epoch[1] Batch[400] Speed: 1.251933692917312 samples/sec                   batch loss = 1100.396889448166 | accuracy = 0.55625


Epoch[1] Batch[405] Speed: 1.2479363235616152 samples/sec                   batch loss = 1114.0892391204834 | accuracy = 0.5561728395061728


Epoch[1] Batch[410] Speed: 1.2558654644587612 samples/sec                   batch loss = 1127.9184758663177 | accuracy = 0.5573170731707318


Epoch[1] Batch[415] Speed: 1.2490033302108032 samples/sec                   batch loss = 1140.3160161972046 | accuracy = 0.5590361445783133


Epoch[1] Batch[420] Speed: 1.250373646514225 samples/sec                   batch loss = 1152.329603433609 | accuracy = 0.5619047619047619


Epoch[1] Batch[425] Speed: 1.244485007772275 samples/sec                   batch loss = 1165.7920541763306 | accuracy = 0.5617647058823529


Epoch[1] Batch[430] Speed: 1.2451498264739032 samples/sec                   batch loss = 1179.7862060070038 | accuracy = 0.5598837209302325


Epoch[1] Batch[435] Speed: 1.2500994545368518 samples/sec                   batch loss = 1192.759936094284 | accuracy = 0.5614942528735632


Epoch[1] Batch[440] Speed: 1.2419576559851935 samples/sec                   batch loss = 1206.4468500614166 | accuracy = 0.5613636363636364


Epoch[1] Batch[445] Speed: 1.2398904813228797 samples/sec                   batch loss = 1219.4662466049194 | accuracy = 0.5601123595505618


Epoch[1] Batch[450] Speed: 1.243698635173779 samples/sec                   batch loss = 1231.9514610767365 | accuracy = 0.5611111111111111


Epoch[1] Batch[455] Speed: 1.2457154533552877 samples/sec                   batch loss = 1244.9989042282104 | accuracy = 0.5598901098901099


Epoch[1] Batch[460] Speed: 1.2471247258081768 samples/sec                   batch loss = 1257.3569494485855 | accuracy = 0.5625


Epoch[1] Batch[465] Speed: 1.2457766879919425 samples/sec                   batch loss = 1271.0867232084274 | accuracy = 0.5623655913978495


Epoch[1] Batch[470] Speed: 1.2437309966881465 samples/sec                   batch loss = 1284.908546090126 | accuracy = 0.5622340425531915


Epoch[1] Batch[475] Speed: 1.24560696628114 samples/sec                   batch loss = 1298.2068704366684 | accuracy = 0.5615789473684211


Epoch[1] Batch[480] Speed: 1.2411837472051162 samples/sec                   batch loss = 1311.0104607343674 | accuracy = 0.5614583333333333


Epoch[1] Batch[485] Speed: 1.2440438192610088 samples/sec                   batch loss = 1323.6578673124313 | accuracy = 0.5623711340206186


Epoch[1] Batch[490] Speed: 1.2416102286584847 samples/sec                   batch loss = 1336.470292687416 | accuracy = 0.5627551020408164


Epoch[1] Batch[495] Speed: 1.2369186678755781 samples/sec                   batch loss = 1349.0066503286362 | accuracy = 0.5641414141414142


Epoch[1] Batch[500] Speed: 1.2410183957138556 samples/sec                   batch loss = 1361.5030490159988 | accuracy = 0.566


Epoch[1] Batch[505] Speed: 1.2377119186427175 samples/sec                   batch loss = 1374.6093095541 | accuracy = 0.5663366336633663


Epoch[1] Batch[510] Speed: 1.2420492326963382 samples/sec                   batch loss = 1388.7717996835709 | accuracy = 0.5666666666666667


Epoch[1] Batch[515] Speed: 1.2447604362251563 samples/sec                   batch loss = 1401.7119890451431 | accuracy = 0.566990291262136


Epoch[1] Batch[520] Speed: 1.2404255676239861 samples/sec                   batch loss = 1414.5656121969223 | accuracy = 0.5682692307692307


Epoch[1] Batch[525] Speed: 1.236521831290535 samples/sec                   batch loss = 1429.187808394432 | accuracy = 0.5671428571428572


Epoch[1] Batch[530] Speed: 1.2435657954356527 samples/sec                   batch loss = 1442.7974685430527 | accuracy = 0.5683962264150944


Epoch[1] Batch[535] Speed: 1.2436370515306983 samples/sec                   batch loss = 1456.9565972089767 | accuracy = 0.5682242990654206


Epoch[1] Batch[540] Speed: 1.234504035540779 samples/sec                   batch loss = 1469.24438726902 | accuracy = 0.5703703703703704


Epoch[1] Batch[545] Speed: 1.2367646617606092 samples/sec                   batch loss = 1482.5540133714676 | accuracy = 0.5711009174311926


Epoch[1] Batch[550] Speed: 1.2408098641080343 samples/sec                   batch loss = 1495.6539033651352 | accuracy = 0.5718181818181818


Epoch[1] Batch[555] Speed: 1.2372135657389836 samples/sec                   batch loss = 1508.859680056572 | accuracy = 0.5716216216216217


Epoch[1] Batch[560] Speed: 1.2405405839065131 samples/sec                   batch loss = 1521.9178932905197 | accuracy = 0.5727678571428572


Epoch[1] Batch[565] Speed: 1.236101296670291 samples/sec                   batch loss = 1535.5202959775925 | accuracy = 0.5738938053097346


Epoch[1] Batch[570] Speed: 1.2380580810531292 samples/sec                   batch loss = 1548.6769160032272 | accuracy = 0.5741228070175438


Epoch[1] Batch[575] Speed: 1.238615178580657 samples/sec                   batch loss = 1562.1343995332718 | accuracy = 0.5730434782608695


Epoch[1] Batch[580] Speed: 1.240957995107536 samples/sec                   batch loss = 1573.330270767212 | accuracy = 0.5762931034482759


Epoch[1] Batch[585] Speed: 1.2382186236252053 samples/sec                   batch loss = 1587.416910648346 | accuracy = 0.5756410256410256


Epoch[1] Batch[590] Speed: 1.2339119683507163 samples/sec                   batch loss = 1600.8065776824951 | accuracy = 0.5758474576271186


Epoch[1] Batch[595] Speed: 1.236520373137348 samples/sec                   batch loss = 1612.9558336734772 | accuracy = 0.576890756302521


Epoch[1] Batch[600] Speed: 1.2387489748906446 samples/sec                   batch loss = 1625.3525547981262 | accuracy = 0.5783333333333334


Epoch[1] Batch[605] Speed: 1.240420431828141 samples/sec                   batch loss = 1638.4101746082306 | accuracy = 0.5789256198347107


Epoch[1] Batch[610] Speed: 1.2369338973252653 samples/sec                   batch loss = 1651.401529431343 | accuracy = 0.5799180327868853


Epoch[1] Batch[615] Speed: 1.2370617665386217 samples/sec                   batch loss = 1665.0043553113937 | accuracy = 0.5800813008130081


Epoch[1] Batch[620] Speed: 1.2323865276184498 samples/sec                   batch loss = 1677.3737338781357 | accuracy = 0.5818548387096775


Epoch[1] Batch[625] Speed: 1.236277365281251 samples/sec                   batch loss = 1689.6122819185257 | accuracy = 0.5824


Epoch[1] Batch[630] Speed: 1.2471368701898653 samples/sec                   batch loss = 1701.2872585058212 | accuracy = 0.5837301587301588


Epoch[1] Batch[635] Speed: 1.233669258831464 samples/sec                   batch loss = 1712.5207911729813 | accuracy = 0.5850393700787402


Epoch[1] Batch[640] Speed: 1.2390320269075563 samples/sec                   batch loss = 1725.992471575737 | accuracy = 0.584375


Epoch[1] Batch[645] Speed: 1.2345986954646209 samples/sec                   batch loss = 1741.0307978391647 | accuracy = 0.5825581395348837


Epoch[1] Batch[650] Speed: 1.2404358393432613 samples/sec                   batch loss = 1756.3252090215683 | accuracy = 0.5834615384615385


Epoch[1] Batch[655] Speed: 1.2422452117679839 samples/sec                   batch loss = 1770.1278527975082 | accuracy = 0.583587786259542


Epoch[1] Batch[660] Speed: 1.245103715128806 samples/sec                   batch loss = 1783.0092035531998 | accuracy = 0.5840909090909091


Epoch[1] Batch[665] Speed: 1.2440916972093166 samples/sec                   batch loss = 1794.386273264885 | accuracy = 0.5857142857142857


Epoch[1] Batch[670] Speed: 1.245320625293289 samples/sec                   batch loss = 1810.2459791898727 | accuracy = 0.5847014925373134


Epoch[1] Batch[675] Speed: 1.24055938843731 samples/sec                   batch loss = 1824.0500212907791 | accuracy = 0.5848148148148148


Epoch[1] Batch[680] Speed: 1.2438006119209677 samples/sec                   batch loss = 1835.7884491682053 | accuracy = 0.5852941176470589


Epoch[1] Batch[685] Speed: 1.237644900591697 samples/sec                   batch loss = 1849.1349707841873 | accuracy = 0.5854014598540146


Epoch[1] Batch[690] Speed: 1.2424400565752092 samples/sec                   batch loss = 1862.110450387001 | accuracy = 0.5851449275362319


Epoch[1] Batch[695] Speed: 1.24093770990065 samples/sec                   batch loss = 1874.1516516208649 | accuracy = 0.5866906474820144


Epoch[1] Batch[700] Speed: 1.2401350951495944 samples/sec                   batch loss = 1888.4487390518188 | accuracy = 0.5864285714285714


Epoch[1] Batch[705] Speed: 1.2431002968301832 samples/sec                   batch loss = 1901.7079037427902 | accuracy = 0.5861702127659575


Epoch[1] Batch[710] Speed: 1.243606999452218 samples/sec                   batch loss = 1915.4815255403519 | accuracy = 0.5855633802816902


Epoch[1] Batch[715] Speed: 1.2449587500024117 samples/sec                   batch loss = 1928.6742516756058 | accuracy = 0.5856643356643356


Epoch[1] Batch[720] Speed: 1.2468189692996798 samples/sec                   batch loss = 1943.4286552667618 | accuracy = 0.5850694444444444


Epoch[1] Batch[725] Speed: 1.2400831215307204 samples/sec                   batch loss = 1956.6242781877518 | accuracy = 0.5855172413793104


Epoch[1] Batch[730] Speed: 1.2447858338562845 samples/sec                   batch loss = 1969.789981007576 | accuracy = 0.5852739726027397


Epoch[1] Batch[735] Speed: 1.2445791734148195 samples/sec                   batch loss = 1983.0191138982773 | accuracy = 0.5850340136054422


Epoch[1] Batch[740] Speed: 1.2508793955892996 samples/sec                   batch loss = 1994.9318805932999 | accuracy = 0.5858108108108108


Epoch[1] Batch[745] Speed: 1.2497706387164231 samples/sec                   batch loss = 2009.8221377134323 | accuracy = 0.5852348993288591


Epoch[1] Batch[750] Speed: 1.247266579521941 samples/sec                   batch loss = 2023.6359864473343 | accuracy = 0.5846666666666667


Epoch[1] Batch[755] Speed: 1.2506745295811246 samples/sec                   batch loss = 2036.8879039287567 | accuracy = 0.5847682119205299


Epoch[1] Batch[760] Speed: 1.2463081650965258 samples/sec                   batch loss = 2051.5926632881165 | accuracy = 0.5851973684210526


Epoch[1] Batch[765] Speed: 1.245739409996245 samples/sec                   batch loss = 2066.066570043564 | accuracy = 0.584640522875817


Epoch[1] Batch[770] Speed: 1.2517645310293264 samples/sec                   batch loss = 2079.472747564316 | accuracy = 0.5844155844155844


Epoch[1] Batch[775] Speed: 1.245776872999966 samples/sec                   batch loss = 2092.53843665123 | accuracy = 0.5841935483870968


Epoch[1] Batch[780] Speed: 1.2413856073488017 samples/sec                   batch loss = 2106.298091173172 | accuracy = 0.583974358974359


Epoch[1] Batch[785] Speed: 1.2442965346598538 samples/sec                   batch loss = 2120.3632028102875 | accuracy = 0.5831210191082803


[Epoch 1] training: accuracy=0.5840736040609137
[Epoch 1] time cost: 651.0724217891693
[Epoch 1] validation: validation accuracy=0.6955555555555556


Epoch[2] Batch[5] Speed: 1.250259967559387 samples/sec                   batch loss = 13.772523641586304 | accuracy = 0.6


Epoch[2] Batch[10] Speed: 1.2457331201325834 samples/sec                   batch loss = 26.17596745491028 | accuracy = 0.675


Epoch[2] Batch[15] Speed: 1.2485118253615126 samples/sec                   batch loss = 39.37470078468323 | accuracy = 0.65


Epoch[2] Batch[20] Speed: 1.2487686831871394 samples/sec                   batch loss = 53.812700033187866 | accuracy = 0.6


Epoch[2] Batch[25] Speed: 1.2482835856653622 samples/sec                   batch loss = 66.95594000816345 | accuracy = 0.62


Epoch[2] Batch[30] Speed: 1.245388107435295 samples/sec                   batch loss = 78.9243643283844 | accuracy = 0.6333333333333333


Epoch[2] Batch[35] Speed: 1.2398285412437766 samples/sec                   batch loss = 90.69036972522736 | accuracy = 0.65


Epoch[2] Batch[40] Speed: 1.2376196109246096 samples/sec                   batch loss = 102.92504584789276 | accuracy = 0.65625


Epoch[2] Batch[45] Speed: 1.242429567621267 samples/sec                   batch loss = 115.52106416225433 | accuracy = 0.65


Epoch[2] Batch[50] Speed: 1.2446831414486985 samples/sec                   batch loss = 128.77674090862274 | accuracy = 0.635


Epoch[2] Batch[55] Speed: 1.2433511546927694 samples/sec                   batch loss = 141.13938069343567 | accuracy = 0.6409090909090909


Epoch[2] Batch[60] Speed: 1.2383537967488778 samples/sec                   batch loss = 153.8938434123993 | accuracy = 0.6416666666666667


Epoch[2] Batch[65] Speed: 1.2436044183598332 samples/sec                   batch loss = 166.7026355266571 | accuracy = 0.6346153846153846


Epoch[2] Batch[70] Speed: 1.247427757077632 samples/sec                   batch loss = 179.1725025177002 | accuracy = 0.625


Epoch[2] Batch[75] Speed: 1.248496402386765 samples/sec                   batch loss = 190.6940860748291 | accuracy = 0.6466666666666666


Epoch[2] Batch[80] Speed: 1.248076597901708 samples/sec                   batch loss = 201.16365242004395 | accuracy = 0.65625


Epoch[2] Batch[85] Speed: 1.2488613600853355 samples/sec                   batch loss = 214.78412413597107 | accuracy = 0.65


Epoch[2] Batch[90] Speed: 1.2453208101658793 samples/sec                   batch loss = 227.15579748153687 | accuracy = 0.6444444444444445


Epoch[2] Batch[95] Speed: 1.256649230698938 samples/sec                   batch loss = 240.52157330513 | accuracy = 0.6421052631578947


Epoch[2] Batch[100] Speed: 1.2508650331992768 samples/sec                   batch loss = 252.17741417884827 | accuracy = 0.6425


Epoch[2] Batch[105] Speed: 1.2462591906439282 samples/sec                   batch loss = 262.5230610370636 | accuracy = 0.6523809523809524


Epoch[2] Batch[110] Speed: 1.2445584926800697 samples/sec                   batch loss = 275.97486996650696 | accuracy = 0.6568181818181819


Epoch[2] Batch[115] Speed: 1.247764156943015 samples/sec                   batch loss = 288.0620551109314 | accuracy = 0.658695652173913


Epoch[2] Batch[120] Speed: 1.2504396270597107 samples/sec                   batch loss = 299.04809629917145 | accuracy = 0.6645833333333333


Epoch[2] Batch[125] Speed: 1.2411542726686953 samples/sec                   batch loss = 312.40464437007904 | accuracy = 0.664


Epoch[2] Batch[130] Speed: 1.2483040188666554 samples/sec                   batch loss = 324.78167498111725 | accuracy = 0.6634615384615384


Epoch[2] Batch[135] Speed: 1.24302753659149 samples/sec                   batch loss = 337.0209001302719 | accuracy = 0.6666666666666666


Epoch[2] Batch[140] Speed: 1.2442546389920666 samples/sec                   batch loss = 347.31617546081543 | accuracy = 0.6696428571428571


Epoch[2] Batch[145] Speed: 1.2514627499378081 samples/sec                   batch loss = 360.75882720947266 | accuracy = 0.6672413793103448


Epoch[2] Batch[150] Speed: 1.2386778205546107 samples/sec                   batch loss = 371.87637162208557 | accuracy = 0.6683333333333333


Epoch[2] Batch[155] Speed: 1.2418279454248182 samples/sec                   batch loss = 384.38845562934875 | accuracy = 0.6645161290322581


Epoch[2] Batch[160] Speed: 1.2454405266934845 samples/sec                   batch loss = 397.1228530406952 | accuracy = 0.6640625


Epoch[2] Batch[165] Speed: 1.2437140320375484 samples/sec                   batch loss = 411.0785677433014 | accuracy = 0.6606060606060606


Epoch[2] Batch[170] Speed: 1.243785950581774 samples/sec                   batch loss = 424.24697279930115 | accuracy = 0.6588235294117647


Epoch[2] Batch[175] Speed: 1.244492485111271 samples/sec                   batch loss = 436.1501052379608 | accuracy = 0.6571428571428571


Epoch[2] Batch[180] Speed: 1.246744105444197 samples/sec                   batch loss = 447.4847618341446 | accuracy = 0.6611111111111111


Epoch[2] Batch[185] Speed: 1.2508170055243395 samples/sec                   batch loss = 458.4150240421295 | accuracy = 0.6648648648648648


Epoch[2] Batch[190] Speed: 1.244637895675805 samples/sec                   batch loss = 471.7947678565979 | accuracy = 0.6618421052631579


Epoch[2] Batch[195] Speed: 1.2449267864547298 samples/sec                   batch loss = 484.93667459487915 | accuracy = 0.658974358974359


Epoch[2] Batch[200] Speed: 1.2433355825286068 samples/sec                   batch loss = 495.0803933143616 | accuracy = 0.6625


Epoch[2] Batch[205] Speed: 1.2554880404648532 samples/sec                   batch loss = 509.656126499176 | accuracy = 0.6597560975609756


Epoch[2] Batch[210] Speed: 1.2522477583993827 samples/sec                   batch loss = 522.682759642601 | accuracy = 0.6607142857142857


Epoch[2] Batch[215] Speed: 1.2518398122316687 samples/sec                   batch loss = 534.6843918561935 | accuracy = 0.6616279069767442


Epoch[2] Batch[220] Speed: 1.23979757352488 samples/sec                   batch loss = 547.6219054460526 | accuracy = 0.6613636363636364


Epoch[2] Batch[225] Speed: 1.248520280286478 samples/sec                   batch loss = 559.58837723732 | accuracy = 0.6633333333333333


Epoch[2] Batch[230] Speed: 1.2462466930541416 samples/sec                   batch loss = 571.3289011716843 | accuracy = 0.6630434782608695


Epoch[2] Batch[235] Speed: 1.2443867029680433 samples/sec                   batch loss = 582.8595353364944 | accuracy = 0.6638297872340425


Epoch[2] Batch[240] Speed: 1.244371658626603 samples/sec                   batch loss = 596.9826506376266 | accuracy = 0.6614583333333334


Epoch[2] Batch[245] Speed: 1.2424217470252465 samples/sec                   batch loss = 610.6461302042007 | accuracy = 0.6591836734693878


Epoch[2] Batch[250] Speed: 1.2438796415697637 samples/sec                   batch loss = 621.9252713918686 | accuracy = 0.661


Epoch[2] Batch[255] Speed: 1.2459554313585202 samples/sec                   batch loss = 632.6501687765121 | accuracy = 0.6627450980392157


Epoch[2] Batch[260] Speed: 1.2532326517245789 samples/sec                   batch loss = 643.7032668590546 | accuracy = 0.6644230769230769


Epoch[2] Batch[265] Speed: 1.2514896354166782 samples/sec                   batch loss = 657.8218519687653 | accuracy = 0.6613207547169812


Epoch[2] Batch[270] Speed: 1.2488867394084013 samples/sec                   batch loss = 669.6951456069946 | accuracy = 0.6611111111111111


Epoch[2] Batch[275] Speed: 1.2466204330027606 samples/sec                   batch loss = 681.4037318229675 | accuracy = 0.6627272727272727


Epoch[2] Batch[280] Speed: 1.2555367094016614 samples/sec                   batch loss = 693.8917140960693 | accuracy = 0.6625


Epoch[2] Batch[285] Speed: 1.2527825273100386 samples/sec                   batch loss = 706.8712216615677 | accuracy = 0.6614035087719298


Epoch[2] Batch[290] Speed: 1.2538070531100807 samples/sec                   batch loss = 717.2066929340363 | accuracy = 0.6629310344827586


Epoch[2] Batch[295] Speed: 1.2475930583349513 samples/sec                   batch loss = 730.0494594573975 | accuracy = 0.6610169491525424


Epoch[2] Batch[300] Speed: 1.2551060546099138 samples/sec                   batch loss = 742.2168188095093 | accuracy = 0.66


Epoch[2] Batch[305] Speed: 1.2508895613763797 samples/sec                   batch loss = 750.1849707365036 | accuracy = 0.6639344262295082


Epoch[2] Batch[310] Speed: 1.2557965601332952 samples/sec                   batch loss = 764.9466165304184 | accuracy = 0.660483870967742


Epoch[2] Batch[315] Speed: 1.2433737304163006 samples/sec                   batch loss = 777.0776702165604 | accuracy = 0.6587301587301587


Epoch[2] Batch[320] Speed: 1.2476924271391807 samples/sec                   batch loss = 788.3317989110947 | accuracy = 0.659375


Epoch[2] Batch[325] Speed: 1.2480428957614549 samples/sec                   batch loss = 798.6239268779755 | accuracy = 0.6607692307692308


Epoch[2] Batch[330] Speed: 1.251433998690028 samples/sec                   batch loss = 814.0995455980301 | accuracy = 0.6575757575757576


Epoch[2] Batch[335] Speed: 1.2436999259141859 samples/sec                   batch loss = 828.0320500135422 | accuracy = 0.6567164179104478


Epoch[2] Batch[340] Speed: 1.2502761795235464 samples/sec                   batch loss = 842.6885432004929 | accuracy = 0.6529411764705882


Epoch[2] Batch[345] Speed: 1.2485895964064422 samples/sec                   batch loss = 856.4684456586838 | accuracy = 0.6507246376811594


Epoch[2] Batch[350] Speed: 1.251932104768735 samples/sec                   batch loss = 868.0842189788818 | accuracy = 0.6507142857142857


Epoch[2] Batch[355] Speed: 1.2511805221361092 samples/sec                   batch loss = 882.7213382720947 | accuracy = 0.6492957746478873


Epoch[2] Batch[360] Speed: 1.2394941167822329 samples/sec                   batch loss = 895.4513914585114 | accuracy = 0.6479166666666667


Epoch[2] Batch[365] Speed: 1.242931671864929 samples/sec                   batch loss = 906.6653431653976 | accuracy = 0.65


Epoch[2] Batch[370] Speed: 1.2408609809087967 samples/sec                   batch loss = 919.2057758569717 | accuracy = 0.6493243243243243


Epoch[2] Batch[375] Speed: 1.2483614211656928 samples/sec                   batch loss = 929.3504178524017 | accuracy = 0.6506666666666666


Epoch[2] Batch[380] Speed: 1.2425498331833336 samples/sec                   batch loss = 941.1691416501999 | accuracy = 0.6506578947368421


Epoch[2] Batch[385] Speed: 1.2376311143995498 samples/sec                   batch loss = 951.7308964729309 | accuracy = 0.6525974025974026


Epoch[2] Batch[390] Speed: 1.2496480405686083 samples/sec                   batch loss = 964.7481861114502 | accuracy = 0.6519230769230769


Epoch[2] Batch[395] Speed: 1.2458473650559796 samples/sec                   batch loss = 977.6340103149414 | accuracy = 0.6506329113924051


Epoch[2] Batch[400] Speed: 1.2469207170196057 samples/sec                   batch loss = 987.03073990345 | accuracy = 0.653125


Epoch[2] Batch[405] Speed: 1.2478366374757326 samples/sec                   batch loss = 1003.3824979066849 | accuracy = 0.6506172839506172


Epoch[2] Batch[410] Speed: 1.2434280076533168 samples/sec                   batch loss = 1015.5188428163528 | accuracy = 0.65


Epoch[2] Batch[415] Speed: 1.2473285231595896 samples/sec                   batch loss = 1029.3342674970627 | accuracy = 0.6493975903614457


Epoch[2] Batch[420] Speed: 1.2524697830717715 samples/sec                   batch loss = 1038.8476493358612 | accuracy = 0.6511904761904762


Epoch[2] Batch[425] Speed: 1.2489827811288032 samples/sec                   batch loss = 1049.6483273506165 | accuracy = 0.6529411764705882


Epoch[2] Batch[430] Speed: 1.2476106856327027 samples/sec                   batch loss = 1062.6953729391098 | accuracy = 0.6523255813953488


Epoch[2] Batch[435] Speed: 1.2474098567084537 samples/sec                   batch loss = 1074.1014316082 | accuracy = 0.6522988505747126


Epoch[2] Batch[440] Speed: 1.2472530417617347 samples/sec                   batch loss = 1085.4712734222412 | accuracy = 0.6534090909090909


Epoch[2] Batch[445] Speed: 1.252284118312865 samples/sec                   batch loss = 1097.1076091527939 | accuracy = 0.653370786516854


Epoch[2] Batch[450] Speed: 1.2391520016733544 samples/sec                   batch loss = 1110.6879914999008 | accuracy = 0.6516666666666666


Epoch[2] Batch[455] Speed: 1.2383167788619236 samples/sec                   batch loss = 1123.997149348259 | accuracy = 0.6505494505494506


Epoch[2] Batch[460] Speed: 1.242644350462821 samples/sec                   batch loss = 1132.6181204319 | accuracy = 0.6532608695652173


Epoch[2] Batch[465] Speed: 1.2441105172962283 samples/sec                   batch loss = 1145.2054553031921 | accuracy = 0.6521505376344086


Epoch[2] Batch[470] Speed: 1.2470356433385725 samples/sec                   batch loss = 1158.0158631801605 | accuracy = 0.6510638297872341


Epoch[2] Batch[475] Speed: 1.2456843758427214 samples/sec                   batch loss = 1169.9737178087234 | accuracy = 0.65


Epoch[2] Batch[480] Speed: 1.2419020361616313 samples/sec                   batch loss = 1181.8684343099594 | accuracy = 0.6489583333333333


Epoch[2] Batch[485] Speed: 1.2454257342160449 samples/sec                   batch loss = 1194.0988957881927 | accuracy = 0.6489690721649485


Epoch[2] Batch[490] Speed: 1.2463798284152965 samples/sec                   batch loss = 1205.446808218956 | accuracy = 0.65


Epoch[2] Batch[495] Speed: 1.2566354884962445 samples/sec                   batch loss = 1217.17041015625 | accuracy = 0.6505050505050505


Epoch[2] Batch[500] Speed: 1.2510402022636848 samples/sec                   batch loss = 1230.1157346963882 | accuracy = 0.65


Epoch[2] Batch[505] Speed: 1.2497387999831353 samples/sec                   batch loss = 1242.6530044078827 | accuracy = 0.6495049504950495


Epoch[2] Batch[510] Speed: 1.251547517798534 samples/sec                   batch loss = 1252.8779584169388 | accuracy = 0.6504901960784314


Epoch[2] Batch[515] Speed: 1.2502871740833121 samples/sec                   batch loss = 1265.8784228563309 | accuracy = 0.6504854368932039


Epoch[2] Batch[520] Speed: 1.2468929154473751 samples/sec                   batch loss = 1278.9058980941772 | accuracy = 0.6495192307692308


Epoch[2] Batch[525] Speed: 1.2500009126967895 samples/sec                   batch loss = 1293.0402534008026 | accuracy = 0.6490476190476191


Epoch[2] Batch[530] Speed: 1.2481916445369758 samples/sec                   batch loss = 1305.6764616966248 | accuracy = 0.6485849056603774


Epoch[2] Batch[535] Speed: 1.243040798582491 samples/sec                   batch loss = 1314.441398382187 | accuracy = 0.6504672897196262


Epoch[2] Batch[540] Speed: 1.2505701175656438 samples/sec                   batch loss = 1326.527764081955 | accuracy = 0.65


Epoch[2] Batch[545] Speed: 1.2527761661226864 samples/sec                   batch loss = 1340.2006560564041 | accuracy = 0.6490825688073395


Epoch[2] Batch[550] Speed: 1.2524827797966236 samples/sec                   batch loss = 1351.1486271619797 | accuracy = 0.6504545454545455


Epoch[2] Batch[555] Speed: 1.2437139398396246 samples/sec                   batch loss = 1365.3058067560196 | accuracy = 0.6495495495495496


Epoch[2] Batch[560] Speed: 1.24099544646265 samples/sec                   batch loss = 1374.9060627222061 | accuracy = 0.6504464285714285


Epoch[2] Batch[565] Speed: 1.2390703687193154 samples/sec                   batch loss = 1386.2885010242462 | accuracy = 0.6504424778761062


Epoch[2] Batch[570] Speed: 1.2388564535563664 samples/sec                   batch loss = 1400.375761270523 | accuracy = 0.6495614035087719


Epoch[2] Batch[575] Speed: 1.245940534118926 samples/sec                   batch loss = 1413.427764415741 | accuracy = 0.6491304347826087


Epoch[2] Batch[580] Speed: 1.2475616087619235 samples/sec                   batch loss = 1423.3931822776794 | accuracy = 0.6504310344827586


Epoch[2] Batch[585] Speed: 1.2447410423683563 samples/sec                   batch loss = 1439.1825151443481 | accuracy = 0.6495726495726496


Epoch[2] Batch[590] Speed: 1.2460298303588284 samples/sec                   batch loss = 1449.6305153369904 | accuracy = 0.65


Epoch[2] Batch[595] Speed: 1.2505496101111266 samples/sec                   batch loss = 1459.1349732875824 | accuracy = 0.6512605042016807


Epoch[2] Batch[600] Speed: 1.2454396946073014 samples/sec                   batch loss = 1470.8663566112518 | accuracy = 0.6520833333333333


Epoch[2] Batch[605] Speed: 1.243429666455046 samples/sec                   batch loss = 1481.711069703102 | accuracy = 0.6528925619834711


Epoch[2] Batch[610] Speed: 1.2468493621433552 samples/sec                   batch loss = 1495.6530221700668 | accuracy = 0.6532786885245901


Epoch[2] Batch[615] Speed: 1.2413227830517266 samples/sec                   batch loss = 1504.833654999733 | accuracy = 0.6548780487804878


Epoch[2] Batch[620] Speed: 1.2419909383281005 samples/sec                   batch loss = 1521.2284280061722 | accuracy = 0.6536290322580646


Epoch[2] Batch[625] Speed: 1.241878962291986 samples/sec                   batch loss = 1530.130793094635 | accuracy = 0.6556


Epoch[2] Batch[630] Speed: 1.2411112111430926 samples/sec                   batch loss = 1539.6740392446518 | accuracy = 0.6579365079365079


Epoch[2] Batch[635] Speed: 1.239538714602268 samples/sec                   batch loss = 1551.0452721118927 | accuracy = 0.6586614173228347


Epoch[2] Batch[640] Speed: 1.2500026822147703 samples/sec                   batch loss = 1562.2036516666412 | accuracy = 0.659765625


Epoch[2] Batch[645] Speed: 1.2394599608685284 samples/sec                   batch loss = 1576.5777041912079 | accuracy = 0.6589147286821705


Epoch[2] Batch[650] Speed: 1.2440053535214974 samples/sec                   batch loss = 1585.790330529213 | accuracy = 0.6592307692307692


Epoch[2] Batch[655] Speed: 1.248153200444708 samples/sec                   batch loss = 1599.4040486812592 | accuracy = 0.6595419847328244


Epoch[2] Batch[660] Speed: 1.243723805094969 samples/sec                   batch loss = 1607.8407788276672 | accuracy = 0.6613636363636364


Epoch[2] Batch[665] Speed: 1.2395792858380426 samples/sec                   batch loss = 1621.5488004684448 | accuracy = 0.6601503759398496


Epoch[2] Batch[670] Speed: 1.2445896063677182 samples/sec                   batch loss = 1632.6799041032791 | accuracy = 0.6604477611940298


Epoch[2] Batch[675] Speed: 1.2503318065277107 samples/sec                   batch loss = 1643.493418455124 | accuracy = 0.6611111111111111


Epoch[2] Batch[680] Speed: 1.244254454435895 samples/sec                   batch loss = 1656.9666497707367 | accuracy = 0.6606617647058823


Epoch[2] Batch[685] Speed: 1.244607702877828 samples/sec                   batch loss = 1669.5493487119675 | accuracy = 0.6613138686131387


Epoch[2] Batch[690] Speed: 1.2392589095193831 samples/sec                   batch loss = 1680.9587329626083 | accuracy = 0.6612318840579711


Epoch[2] Batch[695] Speed: 1.2430686128438297 samples/sec                   batch loss = 1695.185317516327 | accuracy = 0.660431654676259


Epoch[2] Batch[700] Speed: 1.242153146322972 samples/sec                   batch loss = 1706.660859465599 | accuracy = 0.6603571428571429


Epoch[2] Batch[705] Speed: 1.2495312363146922 samples/sec                   batch loss = 1719.9470180273056 | accuracy = 0.6599290780141844


Epoch[2] Batch[710] Speed: 1.244576311307873 samples/sec                   batch loss = 1730.4782721996307 | accuracy = 0.6602112676056338


Epoch[2] Batch[715] Speed: 1.23959219958766 samples/sec                   batch loss = 1740.9124947786331 | accuracy = 0.6608391608391608


Epoch[2] Batch[720] Speed: 1.2451018670498872 samples/sec                   batch loss = 1754.934198141098 | accuracy = 0.6604166666666667


Epoch[2] Batch[725] Speed: 1.2447633915325869 samples/sec                   batch loss = 1764.2786206007004 | accuracy = 0.6613793103448276


Epoch[2] Batch[730] Speed: 1.250589693490577 samples/sec                   batch loss = 1774.960809826851 | accuracy = 0.661986301369863


Epoch[2] Batch[735] Speed: 1.2450943823863678 samples/sec                   batch loss = 1788.3860429525375 | accuracy = 0.6615646258503401


Epoch[2] Batch[740] Speed: 1.2479768894050232 samples/sec                   batch loss = 1800.3622033596039 | accuracy = 0.6621621621621622


Epoch[2] Batch[745] Speed: 1.2445820355349302 samples/sec                   batch loss = 1809.2634717226028 | accuracy = 0.6634228187919463


Epoch[2] Batch[750] Speed: 1.2436752179201973 samples/sec                   batch loss = 1820.8307002782822 | accuracy = 0.663


Epoch[2] Batch[755] Speed: 1.2438815782449788 samples/sec                   batch loss = 1832.452572107315 | accuracy = 0.6632450331125828


Epoch[2] Batch[760] Speed: 1.2431439570621812 samples/sec                   batch loss = 1843.6326271295547 | accuracy = 0.6638157894736842


Epoch[2] Batch[765] Speed: 1.2431109813292722 samples/sec                   batch loss = 1857.2251108884811 | accuracy = 0.6640522875816993


Epoch[2] Batch[770] Speed: 1.2449957965143386 samples/sec                   batch loss = 1869.134336233139 | accuracy = 0.6646103896103897


Epoch[2] Batch[775] Speed: 1.2448767196878239 samples/sec                   batch loss = 1877.395071387291 | accuracy = 0.665483870967742


Epoch[2] Batch[780] Speed: 1.2446762158620766 samples/sec                   batch loss = 1890.3645615577698 | accuracy = 0.6657051282051282


Epoch[2] Batch[785] Speed: 1.2465154927911513 samples/sec                   batch loss = 1903.654843211174 | accuracy = 0.664968152866242


[Epoch 2] training: accuracy=0.6652918781725888
[Epoch 2] time cost: 649.437905550003
[Epoch 2] validation: validation accuracy=0.7022222222222222
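
The per-epoch summary lines in the output above are easier to compare once they are collected into one structure. The following is a minimal sketch, not part of the original tutorial, that parses those summary lines with regular expressions. The `parse_epoch_summaries` helper, the `PATTERNS` table, and the `log_text` variable are hypothetical names introduced only for illustration, and the sketch assumes the log format shown above.

```python
import re

# Summary lines copied from the training output above.
log_text = """\
[Epoch 1] training: accuracy=0.5840736040609137
[Epoch 1] time cost: 651.0724217891693
[Epoch 1] validation: validation accuracy=0.6955555555555556
[Epoch 2] training: accuracy=0.6652918781725888
[Epoch 2] time cost: 649.437905550003
[Epoch 2] validation: validation accuracy=0.7022222222222222
"""

# One pattern per summary line type (hypothetical helper table).
# Each pattern captures the epoch number and one floating-point value.
PATTERNS = {
    "train_acc": re.compile(r"\[Epoch (\d+)\] training: accuracy=([\d.]+)"),
    "time_cost": re.compile(r"\[Epoch (\d+)\] time cost: ([\d.]+)"),
    "val_acc": re.compile(r"\[Epoch (\d+)\] validation: validation accuracy=([\d.]+)"),
}


def parse_epoch_summaries(text):
    """Return {epoch: {"train_acc": ..., "time_cost": ..., "val_acc": ...}}."""
    summaries = {}
    for line in text.splitlines():
        for key, pattern in PATTERNS.items():
            match = pattern.match(line.strip())
            if match:
                epoch = int(match.group(1))
                summaries.setdefault(epoch, {})[key] = float(match.group(2))
    return summaries


summaries = parse_epoch_summaries(log_text)
for epoch, stats in sorted(summaries.items()):
    print(
        f"epoch {epoch}: train={stats['train_acc']:.3f} "
        f"val={stats['val_acc']:.3f} time={stats['time_cost']:.0f}s"
    )
```

Collecting the numbers this way makes the trend easy to read: between the two epochs logged above, training accuracy rises from about 0.58 to about 0.67, while validation accuracy moves only from about 0.696 to 0.702.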


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).