<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training and inference on GPUs are often faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure your machine has at least one NVIDIA GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need the GPU-enabled version of MXNet; see the [installation instructions](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html) for your system.

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` (equivalently, `npx.gpu(0)`) as the `ctx` argument when you create it.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[04:32:59] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the printed array shows which GPU it is allocated on, for example `ctx=gpu(0)` in the output above.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The following code copies `x` to the second GPU, `npx.gpu(1)`. If fewer than two GPUs are available, it falls back to copying `x` to the CPU instead.

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[04:32:59] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
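
The 0-based indexing described above can be illustrated with a small pure-Python sketch. The helper `pick_gpu_ids` is hypothetical and not part of MXNet; it only computes which ids you would use, and it assumes that an empty list means falling back to the CPU.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return the 0-based ids of the GPUs to use, lowest ids first.

    Hypothetical helper (not an MXNet API): an empty list means
    "no GPU available, fall back to the CPU".
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# On an eight-GPU machine the valid ids run from 0 to 7.
print(pick_gpu_ids(8, 8))  # [0, 1, 2, 3, 4, 5, 6, 7]
# Requesting more GPUs than exist returns only the ids that exist.
print(pick_gpu_ids(1, 2))  # [0]
print(pick_gpu_ids(0, 2))  # []
```

With MXNet, each returned id `i` would be passed to `npx.gpu(i)`, and an empty list would map to `npx.cpu()`.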

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and the parameters to the GPU. To demonstrate this, the following code reuses the `LeafNetwork` defined in [Training Neural Networks](6-train-nn.md).

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you can use `net.collect_params().reset_ctx(gpu)` to change the device of parameters that are already loaded.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[04:33:00] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 9.83634 , -9.954982]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
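
To make the idea concrete, here is a minimal pure-Python sketch of data parallelism. It is an illustration under stated assumptions, not MXNet's implementation: the toy model `w * x` with a squared-error loss, the helper names `split_batch` and `grad_sum`, and the sample values are all made up for this example. Each "worker" computes the gradient sum on its own chunk, and the partial sums are then added and normalized by the full batch size.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(batch), num_parts)
    chunks, start = [], 0
    for i in range(num_parts):
        # The first `extra` chunks receive one additional sample.
        end = start + size + (1 if i < extra else 0)
        chunks.append(batch[start:end])
        start = end
    return chunks


def grad_sum(w, samples):
    """Sum of d/dw (w*x - y)^2 over the given (x, y) samples."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


w = 0.5
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 4.0)]

# Reference: mean gradient computed on the whole batch by a single worker.
full_grad = grad_sum(w, batch) / len(batch)

# Data parallelism: each of the n workers handles one chunk, then the
# partial sums are added and normalized by the full batch size.
n = 2
partial_sums = [grad_sum(w, chunk) for chunk in split_batch(batch, n)]
parallel_grad = sum(partial_sums) / len(batch)

print(full_grad, parallel_grad)  # -12.0 -12.0
```

Both approaches give the same gradient, which is why splitting a batch across devices does not change the update as long as the result is normalized by the size of the whole batch.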

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations on the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# Mean and std for normalizing image values scaled to the range (0, 1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
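
The bookkeeping behind this helper can be sketched in pure Python. This is only an illustration of how accuracy accumulates over per-device chunks, not the implementation of `gluon.metric.Accuracy`; the names `argmax` and `accumulate_accuracy` and the sample scores are hypothetical.

```python
def argmax(values):
    """Index of the largest value in a non-empty list."""
    return max(range(len(values)), key=lambda i: values[i])


def accumulate_accuracy(label_chunks, output_chunks):
    """Accuracy over all chunks, counting every sample exactly once."""
    correct = total = 0
    for labels, outputs in zip(label_chunks, output_chunks):
        for label, scores in zip(labels, outputs):
            # A prediction is the class with the highest score.
            correct += int(argmax(scores) == label)
            total += 1
    return correct / total if total else 0.0


# Two hypothetical devices, two samples each, two classes.
label_chunks = [[0, 1], [1, 0]]
output_chunks = [
    [[2.0, -1.0], [0.3, 0.9]],  # device 0: both predictions correct
    [[1.5, 0.2], [0.8, 0.1]],   # device 1: first wrong, second correct
]
print(accumulate_accuracy(label_chunks, output_chunks))  # 0.75
```

Because the counts are summed before dividing, the result equals the accuracy over the whole batch regardless of how the batch was split.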

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, converting each
        # loss to a Python float copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7711307654922115 samples/sec                   batch loss = 15.547078609466553 | accuracy = 0.35


Epoch[1] Batch[10] Speed: 1.2478396074082503 samples/sec                   batch loss = 29.42835545539856 | accuracy = 0.475


Epoch[1] Batch[15] Speed: 1.2520467420785575 samples/sec                   batch loss = 44.0159969329834 | accuracy = 0.4666666666666667


Epoch[1] Batch[20] Speed: 1.2419070923033693 samples/sec                   batch loss = 58.25520730018616 | accuracy = 0.525


Epoch[1] Batch[25] Speed: 1.2509029916603565 samples/sec                   batch loss = 72.19148778915405 | accuracy = 0.55


Epoch[1] Batch[30] Speed: 1.2418931190665337 samples/sec                   batch loss = 86.19008135795593 | accuracy = 0.5333333333333333


Epoch[1] Batch[35] Speed: 1.2528001144583234 samples/sec                   batch loss = 99.35095071792603 | accuracy = 0.5428571428571428


Epoch[1] Batch[40] Speed: 1.2444771612785883 samples/sec                   batch loss = 113.86033248901367 | accuracy = 0.53125


Epoch[1] Batch[45] Speed: 1.2494198505288017 samples/sec                   batch loss = 127.5017421245575 | accuracy = 0.5388888888888889


Epoch[1] Batch[50] Speed: 1.2448082769892395 samples/sec                   batch loss = 141.6361575126648 | accuracy = 0.54


Epoch[1] Batch[55] Speed: 1.2413077208612553 samples/sec                   batch loss = 156.5672538280487 | accuracy = 0.5454545454545454


Epoch[1] Batch[60] Speed: 1.241632833094784 samples/sec                   batch loss = 170.03934717178345 | accuracy = 0.5541666666666667


Epoch[1] Batch[65] Speed: 1.2525225196573402 samples/sec                   batch loss = 184.64649438858032 | accuracy = 0.5461538461538461


Epoch[1] Batch[70] Speed: 1.2421139697634986 samples/sec                   batch loss = 199.10846590995789 | accuracy = 0.5285714285714286


Epoch[1] Batch[75] Speed: 1.237178988005484 samples/sec                   batch loss = 212.68522953987122 | accuracy = 0.53


Epoch[1] Batch[80] Speed: 1.2437019542259524 samples/sec                   batch loss = 226.89353156089783 | accuracy = 0.521875


Epoch[1] Batch[85] Speed: 1.2467407701350834 samples/sec                   batch loss = 240.76698184013367 | accuracy = 0.5235294117647059


Epoch[1] Batch[90] Speed: 1.2453939315794118 samples/sec                   batch loss = 255.2740399837494 | accuracy = 0.5222222222222223


Epoch[1] Batch[95] Speed: 1.245765310104098 samples/sec                   batch loss = 268.74030566215515 | accuracy = 0.5263157894736842


Epoch[1] Batch[100] Speed: 1.2482306482100245 samples/sec                   batch loss = 282.7219228744507 | accuracy = 0.525


Epoch[1] Batch[105] Speed: 1.24684185645577 samples/sec                   batch loss = 296.6615345478058 | accuracy = 0.5261904761904762


Epoch[1] Batch[110] Speed: 1.2537451201941758 samples/sec                   batch loss = 310.6221182346344 | accuracy = 0.5318181818181819


Epoch[1] Batch[115] Speed: 1.2523334739643095 samples/sec                   batch loss = 324.1974251270294 | accuracy = 0.5391304347826087


Epoch[1] Batch[120] Speed: 1.2479425428675799 samples/sec                   batch loss = 337.5837502479553 | accuracy = 0.54375


Epoch[1] Batch[125] Speed: 1.2485136835769617 samples/sec                   batch loss = 352.0634548664093 | accuracy = 0.536


Epoch[1] Batch[130] Speed: 1.2525541263351472 samples/sec                   batch loss = 366.41591119766235 | accuracy = 0.5307692307692308


Epoch[1] Batch[135] Speed: 1.2469001437368916 samples/sec                   batch loss = 379.9350070953369 | accuracy = 0.5314814814814814


Epoch[1] Batch[140] Speed: 1.2539922324788104 samples/sec                   batch loss = 393.90504598617554 | accuracy = 0.5232142857142857


Epoch[1] Batch[145] Speed: 1.2481393648654149 samples/sec                   batch loss = 407.9877812862396 | accuracy = 0.5206896551724138


Epoch[1] Batch[150] Speed: 1.2487290882251432 samples/sec                   batch loss = 421.5512318611145 | accuracy = 0.525


Epoch[1] Batch[155] Speed: 1.2496278425743923 samples/sec                   batch loss = 435.39785981178284 | accuracy = 0.5225806451612903


Epoch[1] Batch[160] Speed: 1.2511949850823563 samples/sec                   batch loss = 448.84720039367676 | accuracy = 0.5265625


Epoch[1] Batch[165] Speed: 1.251575994160929 samples/sec                   batch loss = 462.6415617465973 | accuracy = 0.5287878787878788


Epoch[1] Batch[170] Speed: 1.2445427979383152 samples/sec                   batch loss = 476.53704714775085 | accuracy = 0.5294117647058824


Epoch[1] Batch[175] Speed: 1.245527994397609 samples/sec                   batch loss = 490.46888041496277 | accuracy = 0.5328571428571428


Epoch[1] Batch[180] Speed: 1.248194337569392 samples/sec                   batch loss = 504.209801197052 | accuracy = 0.5347222222222222


Epoch[1] Batch[185] Speed: 1.247265837719251 samples/sec                   batch loss = 518.454336643219 | accuracy = 0.5337837837837838


Epoch[1] Batch[190] Speed: 1.2527078809744967 samples/sec                   batch loss = 532.1556532382965 | accuracy = 0.5355263157894737


Epoch[1] Batch[195] Speed: 1.2445795427199007 samples/sec                   batch loss = 545.7795686721802 | accuracy = 0.5371794871794872


Epoch[1] Batch[200] Speed: 1.2486187746620803 samples/sec                   batch loss = 559.4982912540436 | accuracy = 0.5375


Epoch[1] Batch[205] Speed: 1.247153557512314 samples/sec                   batch loss = 573.1159427165985 | accuracy = 0.5390243902439025


Epoch[1] Batch[210] Speed: 1.2544996004077278 samples/sec                   batch loss = 586.7131369113922 | accuracy = 0.5380952380952381


Epoch[1] Batch[215] Speed: 1.2467245570811103 samples/sec                   batch loss = 600.5985419750214 | accuracy = 0.5395348837209303


Epoch[1] Batch[220] Speed: 1.248792199693422 samples/sec                   batch loss = 614.4642558097839 | accuracy = 0.5397727272727273


Epoch[1] Batch[225] Speed: 1.2485164709105063 samples/sec                   batch loss = 628.5625202655792 | accuracy = 0.5355555555555556


Epoch[1] Batch[230] Speed: 1.245495169453312 samples/sec                   batch loss = 642.6268548965454 | accuracy = 0.533695652173913


Epoch[1] Batch[235] Speed: 1.2514366123852365 samples/sec                   batch loss = 656.1554884910583 | accuracy = 0.5351063829787234


Epoch[1] Batch[240] Speed: 1.25051726562942 samples/sec                   batch loss = 669.5389561653137 | accuracy = 0.5364583333333334


Epoch[1] Batch[245] Speed: 1.2460582412764605 samples/sec                   batch loss = 683.7295255661011 | accuracy = 0.5336734693877551


Epoch[1] Batch[250] Speed: 1.2502743160580045 samples/sec                   batch loss = 697.2268753051758 | accuracy = 0.537


Epoch[1] Batch[255] Speed: 1.2472860521581444 samples/sec                   batch loss = 710.9873259067535 | accuracy = 0.5362745098039216


Epoch[1] Batch[260] Speed: 1.2417798739066008 samples/sec                   batch loss = 724.4714918136597 | accuracy = 0.5375


Epoch[1] Batch[265] Speed: 1.243449203786227 samples/sec                   batch loss = 737.9097008705139 | accuracy = 0.5386792452830189


Epoch[1] Batch[270] Speed: 1.2423604737629612 samples/sec                   batch loss = 752.2646484375 | accuracy = 0.5333333333333333


Epoch[1] Batch[275] Speed: 1.2421611474735115 samples/sec                   batch loss = 765.8509097099304 | accuracy = 0.5345454545454545


Epoch[1] Batch[280] Speed: 1.2427813201935178 samples/sec                   batch loss = 778.9898645877838 | accuracy = 0.5357142857142857


Epoch[1] Batch[285] Speed: 1.2399922927381821 samples/sec                   batch loss = 792.5826468467712 | accuracy = 0.5385964912280702


Epoch[1] Batch[290] Speed: 1.2415795393996802 samples/sec                   batch loss = 807.0197110176086 | accuracy = 0.5370689655172414


Epoch[1] Batch[295] Speed: 1.240200457930195 samples/sec                   batch loss = 821.6013538837433 | accuracy = 0.535593220338983


Epoch[1] Batch[300] Speed: 1.2400766136772399 samples/sec                   batch loss = 835.0340392589569 | accuracy = 0.5383333333333333


Epoch[1] Batch[305] Speed: 1.2446845265752713 samples/sec                   batch loss = 848.93980717659 | accuracy = 0.5385245901639344


Epoch[1] Batch[310] Speed: 1.2404504218083414 samples/sec                   batch loss = 862.5821435451508 | accuracy = 0.5387096774193548


Epoch[1] Batch[315] Speed: 1.2407176442686378 samples/sec                   batch loss = 876.4650619029999 | accuracy = 0.5388888888888889


Epoch[1] Batch[320] Speed: 1.243686004498324 samples/sec                   batch loss = 889.9729816913605 | accuracy = 0.5390625


Epoch[1] Batch[325] Speed: 1.2415923110925375 samples/sec                   batch loss = 903.582305431366 | accuracy = 0.54


Epoch[1] Batch[330] Speed: 1.240394936899777 samples/sec                   batch loss = 917.1137461662292 | accuracy = 0.5401515151515152


Epoch[1] Batch[335] Speed: 1.243933962986008 samples/sec                   batch loss = 930.795907497406 | accuracy = 0.5425373134328358


Epoch[1] Batch[340] Speed: 1.238676540217265 samples/sec                   batch loss = 944.6767311096191 | accuracy = 0.5426470588235294


Epoch[1] Batch[345] Speed: 1.240835284324533 samples/sec                   batch loss = 958.3455953598022 | accuracy = 0.5434782608695652


Epoch[1] Batch[350] Speed: 1.2426097446102116 samples/sec                   batch loss = 971.7634439468384 | accuracy = 0.545


Epoch[1] Batch[355] Speed: 1.2485124757362907 samples/sec                   batch loss = 985.3819725513458 | accuracy = 0.545774647887324


Epoch[1] Batch[360] Speed: 1.2396269123476025 samples/sec                   batch loss = 999.081533908844 | accuracy = 0.5472222222222223


Epoch[1] Batch[365] Speed: 1.2402577591624462 samples/sec                   batch loss = 1012.3861081600189 | accuracy = 0.5493150684931507


Epoch[1] Batch[370] Speed: 1.2423957097504184 samples/sec                   batch loss = 1026.0175476074219 | accuracy = 0.5493243243243243


Epoch[1] Batch[375] Speed: 1.2401200617679315 samples/sec                   batch loss = 1039.5855641365051 | accuracy = 0.5486666666666666


Epoch[1] Batch[380] Speed: 1.243722514304997 samples/sec                   batch loss = 1053.7587189674377 | accuracy = 0.5473684210526316


Epoch[1] Batch[385] Speed: 1.2408282179504417 samples/sec                   batch loss = 1066.959341287613 | accuracy = 0.55


Epoch[1] Batch[390] Speed: 1.2415904734227057 samples/sec                   batch loss = 1080.3553023338318 | accuracy = 0.5493589743589744


Epoch[1] Batch[395] Speed: 1.2388477630989092 samples/sec                   batch loss = 1093.739228963852 | accuracy = 0.55


Epoch[1] Batch[400] Speed: 1.24210091145949 samples/sec                   batch loss = 1107.6535732746124 | accuracy = 0.550625


Epoch[1] Batch[405] Speed: 1.242927436108633 samples/sec                   batch loss = 1120.6201264858246 | accuracy = 0.5524691358024691


Epoch[1] Batch[410] Speed: 1.251464710298265 samples/sec                   batch loss = 1134.4167704582214 | accuracy = 0.551829268292683


Epoch[1] Batch[415] Speed: 1.2444439301669514 samples/sec                   batch loss = 1148.7026278972626 | accuracy = 0.5493975903614458


Epoch[1] Batch[420] Speed: 1.2437365287394921 samples/sec                   batch loss = 1162.8684639930725 | accuracy = 0.5476190476190477


Epoch[1] Batch[425] Speed: 1.2467575395088193 samples/sec                   batch loss = 1176.6793611049652 | accuracy = 0.5464705882352942


Epoch[1] Batch[430] Speed: 1.250322302065841 samples/sec                   batch loss = 1190.7046241760254 | accuracy = 0.5447674418604651


Epoch[1] Batch[435] Speed: 1.2468238802380773 samples/sec                   batch loss = 1204.6317281723022 | accuracy = 0.5431034482758621


Epoch[1] Batch[440] Speed: 1.24622604943629 samples/sec                   batch loss = 1217.8084924221039 | accuracy = 0.54375


Epoch[1] Batch[445] Speed: 1.2463297372957645 samples/sec                   batch loss = 1231.8451385498047 | accuracy = 0.5432584269662921


Epoch[1] Batch[450] Speed: 1.2512511604707275 samples/sec                   batch loss = 1245.547997713089 | accuracy = 0.5422222222222223


Epoch[1] Batch[455] Speed: 1.2449693740883334 samples/sec                   batch loss = 1259.0972611904144 | accuracy = 0.5434065934065934


Epoch[1] Batch[460] Speed: 1.249888884468604 samples/sec                   batch loss = 1271.9742653369904 | accuracy = 0.5451086956521739


Epoch[1] Batch[465] Speed: 1.2431273768660656 samples/sec                   batch loss = 1284.9124455451965 | accuracy = 0.546236559139785


Epoch[1] Batch[470] Speed: 1.2476131906049102 samples/sec                   batch loss = 1297.1212947368622 | accuracy = 0.5478723404255319


Epoch[1] Batch[475] Speed: 1.2523717085729662 samples/sec                   batch loss = 1310.3015127182007 | accuracy = 0.55


Epoch[1] Batch[480] Speed: 1.2470928363440856 samples/sec                   batch loss = 1323.873692035675 | accuracy = 0.5505208333333333


Epoch[1] Batch[485] Speed: 1.2419661143121186 samples/sec                   batch loss = 1337.5524451732635 | accuracy = 0.5510309278350516


Epoch[1] Batch[490] Speed: 1.2496723348736103 samples/sec                   batch loss = 1350.6867966651917 | accuracy = 0.5525510204081633


Epoch[1] Batch[495] Speed: 1.2489968213786315 samples/sec                   batch loss = 1364.8673701286316 | accuracy = 0.5535353535353535


Epoch[1] Batch[500] Speed: 1.2440377309998985 samples/sec                   batch loss = 1378.2478847503662 | accuracy = 0.5545


Epoch[1] Batch[505] Speed: 1.2509793818853032 samples/sec                   batch loss = 1392.110378742218 | accuracy = 0.554950495049505


Epoch[1] Batch[510] Speed: 1.2465389245101377 samples/sec                   batch loss = 1404.4976041316986 | accuracy = 0.557843137254902


Epoch[1] Batch[515] Speed: 1.2493069961333565 samples/sec                   batch loss = 1416.5391800403595 | accuracy = 0.5592233009708738


Epoch[1] Batch[520] Speed: 1.2471611596628212 samples/sec                   batch loss = 1429.2785489559174 | accuracy = 0.5600961538461539


Epoch[1] Batch[525] Speed: 1.250871281731028 samples/sec                   batch loss = 1442.1569163799286 | accuracy = 0.559047619047619


Epoch[1] Batch[530] Speed: 1.2460898001989025 samples/sec                   batch loss = 1455.5106031894684 | accuracy = 0.5589622641509434


Epoch[1] Batch[535] Speed: 1.2506450687111672 samples/sec                   batch loss = 1468.4715151786804 | accuracy = 0.5602803738317756


Epoch[1] Batch[540] Speed: 1.2460232599401884 samples/sec                   batch loss = 1482.1747648715973 | accuracy = 0.5606481481481481


Epoch[1] Batch[545] Speed: 1.25081420790938 samples/sec                   batch loss = 1495.9574556350708 | accuracy = 0.560091743119266


Epoch[1] Batch[550] Speed: 1.2505614484232597 samples/sec                   batch loss = 1510.0606219768524 | accuracy = 0.5595454545454546


Epoch[1] Batch[555] Speed: 1.2455418646123744 samples/sec                   batch loss = 1523.4786713123322 | accuracy = 0.559009009009009


Epoch[1] Batch[560] Speed: 1.2457198006309675 samples/sec                   batch loss = 1536.8401448726654 | accuracy = 0.559375


Epoch[1] Batch[565] Speed: 1.2524278961826263 samples/sec                   batch loss = 1549.5206182003021 | accuracy = 0.5606194690265487


Epoch[1] Batch[570] Speed: 1.2527185441920887 samples/sec                   batch loss = 1561.7294595241547 | accuracy = 0.5614035087719298


Epoch[1] Batch[575] Speed: 1.2477892132811494 samples/sec                   batch loss = 1575.6201491355896 | accuracy = 0.5621739130434783


Epoch[1] Batch[580] Speed: 1.2488684252877824 samples/sec                   batch loss = 1589.7652311325073 | accuracy = 0.5612068965517242


Epoch[1] Batch[585] Speed: 1.246904869971503 samples/sec                   batch loss = 1602.305146932602 | accuracy = 0.5615384615384615


Epoch[1] Batch[590] Speed: 1.2560283079996002 samples/sec                   batch loss = 1617.163233757019 | accuracy = 0.5601694915254237


Epoch[1] Batch[595] Speed: 1.2482068742525698 samples/sec                   batch loss = 1630.1235084533691 | accuracy = 0.5605042016806723


Epoch[1] Batch[600] Speed: 1.2486385683315488 samples/sec                   batch loss = 1643.6149520874023 | accuracy = 0.5604166666666667


Epoch[1] Batch[605] Speed: 1.2477883780536656 samples/sec                   batch loss = 1657.5236585140228 | accuracy = 0.5599173553719008


Epoch[1] Batch[610] Speed: 1.240472066854055 samples/sec                   batch loss = 1669.3078782558441 | accuracy = 0.5618852459016394


Epoch[1] Batch[615] Speed: 1.2450460576510423 samples/sec                   batch loss = 1682.170957326889 | accuracy = 0.5634146341463414


Epoch[1] Batch[620] Speed: 1.2444298997860013 samples/sec                   batch loss = 1695.1201741695404 | accuracy = 0.5641129032258064


Epoch[1] Batch[625] Speed: 1.2449932096538752 samples/sec                   batch loss = 1707.942638874054 | accuracy = 0.5652


Epoch[1] Batch[630] Speed: 1.2543263684238897 samples/sec                   batch loss = 1720.8839237689972 | accuracy = 0.5658730158730159


Epoch[1] Batch[635] Speed: 1.2452453867061728 samples/sec                   batch loss = 1733.4699838161469 | accuracy = 0.5661417322834645


Epoch[1] Batch[640] Speed: 1.2485194440800051 samples/sec                   batch loss = 1746.844387292862 | accuracy = 0.56640625


Epoch[1] Batch[645] Speed: 1.2476102217500644 samples/sec                   batch loss = 1760.0400385856628 | accuracy = 0.5678294573643411


Epoch[1] Batch[650] Speed: 1.2543942674090105 samples/sec                   batch loss = 1772.0527180433273 | accuracy = 0.5688461538461539


Epoch[1] Batch[655] Speed: 1.2502301535504603 samples/sec                   batch loss = 1786.9973071813583 | accuracy = 0.5687022900763359


Epoch[1] Batch[660] Speed: 1.2477135833259956 samples/sec                   batch loss = 1801.4297345876694 | accuracy = 0.5685606060606061


Epoch[1] Batch[665] Speed: 1.2455867137543908 samples/sec                   batch loss = 1814.1038082838058 | accuracy = 0.5695488721804511


Epoch[1] Batch[670] Speed: 1.2420925432540926 samples/sec                   batch loss = 1828.257249712944 | accuracy = 0.5690298507462687


Epoch[1] Batch[675] Speed: 1.2424117184050147 samples/sec                   batch loss = 1840.2594703435898 | accuracy = 0.5707407407407408


Epoch[1] Batch[680] Speed: 1.2464758553669764 samples/sec                   batch loss = 1853.9874926805496 | accuracy = 0.5716911764705882


Epoch[1] Batch[685] Speed: 1.2417079114581957 samples/sec                   batch loss = 1866.7461282014847 | accuracy = 0.5718978102189781


Epoch[1] Batch[690] Speed: 1.2538256997702681 samples/sec                   batch loss = 1880.1727503538132 | accuracy = 0.572463768115942


Epoch[1] Batch[695] Speed: 1.2488161818649541 samples/sec                   batch loss = 1893.6808544397354 | accuracy = 0.5726618705035971


Epoch[1] Batch[700] Speed: 1.254508230431135 samples/sec                   batch loss = 1906.03671002388 | accuracy = 0.5732142857142857


Epoch[1] Batch[705] Speed: 1.2517431438530333 samples/sec                   batch loss = 1919.0205101966858 | accuracy = 0.573758865248227


Epoch[1] Batch[710] Speed: 1.2503077661095316 samples/sec                   batch loss = 1932.443903684616 | accuracy = 0.573943661971831


Epoch[1] Batch[715] Speed: 1.2530448890064634 samples/sec                   batch loss = 1947.0322525501251 | accuracy = 0.573076923076923


Epoch[1] Batch[720] Speed: 1.2537245084778694 samples/sec                   batch loss = 1959.2954120635986 | accuracy = 0.5736111111111111


Epoch[1] Batch[725] Speed: 1.2537065206247275 samples/sec                   batch loss = 1973.5825600624084 | accuracy = 0.573103448275862


Epoch[1] Batch[730] Speed: 1.2477689825298972 samples/sec                   batch loss = 1986.4203670024872 | accuracy = 0.5732876712328767


Epoch[1] Batch[735] Speed: 1.2476763748971964 samples/sec                   batch loss = 1999.9470493793488 | accuracy = 0.5738095238095238


Epoch[1] Batch[740] Speed: 1.2486594778189222 samples/sec                   batch loss = 2013.0846440792084 | accuracy = 0.5746621621621621


Epoch[1] Batch[745] Speed: 1.251362126349922 samples/sec                   batch loss = 2026.4812891483307 | accuracy = 0.5748322147651007


Epoch[1] Batch[750] Speed: 1.2414049886419973 samples/sec                   batch loss = 2038.7557971477509 | accuracy = 0.576


Epoch[1] Batch[755] Speed: 1.242213295427941 samples/sec                   batch loss = 2052.7610442638397 | accuracy = 0.5758278145695365


Epoch[1] Batch[760] Speed: 1.2332559198388826 samples/sec                   batch loss = 2066.1837182044983 | accuracy = 0.5763157894736842


Epoch[1] Batch[765] Speed: 1.2449591195327914 samples/sec                   batch loss = 2079.5279178619385 | accuracy = 0.5758169934640522


Epoch[1] Batch[770] Speed: 1.2407804990895241 samples/sec                   batch loss = 2092.4243544340134 | accuracy = 0.575974025974026


Epoch[1] Batch[775] Speed: 1.2484397308726902 samples/sec                   batch loss = 2104.7304853200912 | accuracy = 0.5764516129032258


Epoch[1] Batch[780] Speed: 1.2464059403550023 samples/sec                   batch loss = 2117.9500106573105 | accuracy = 0.576602564102564


Epoch[1] Batch[785] Speed: 1.2414352099701866 samples/sec                   batch loss = 2131.3398629426956 | accuracy = 0.5764331210191083


[Epoch 1] training: accuracy=0.5764593908629442
[Epoch 1] time cost: 651.2676434516907
[Epoch 1] validation: validation accuracy=0.6711111111111111


Epoch[2] Batch[5] Speed: 1.2328300842233215 samples/sec                   batch loss = 13.583370208740234 | accuracy = 0.55


Epoch[2] Batch[10] Speed: 1.2398586858642633 samples/sec                   batch loss = 25.392021417617798 | accuracy = 0.625


Epoch[2] Batch[15] Speed: 1.246229937419665 samples/sec                   batch loss = 38.79745411872864 | accuracy = 0.5333333333333333


Epoch[2] Batch[20] Speed: 1.244296996081594 samples/sec                   batch loss = 52.08365440368652 | accuracy = 0.525


Epoch[2] Batch[25] Speed: 1.2441805440907896 samples/sec                   batch loss = 63.68005871772766 | accuracy = 0.57


Epoch[2] Batch[30] Speed: 1.2544589846093914 samples/sec                   batch loss = 77.65504169464111 | accuracy = 0.575


Epoch[2] Batch[35] Speed: 1.2462235500312164 samples/sec                   batch loss = 91.39967823028564 | accuracy = 0.5785714285714286


Epoch[2] Batch[40] Speed: 1.2414115104454588 samples/sec                   batch loss = 105.2856993675232 | accuracy = 0.58125


Epoch[2] Batch[45] Speed: 1.2437523875594336 samples/sec                   batch loss = 119.02133274078369 | accuracy = 0.5722222222222222


Epoch[2] Batch[50] Speed: 1.2408728200503887 samples/sec                   batch loss = 130.85285758972168 | accuracy = 0.585


Epoch[2] Batch[55] Speed: 1.2397448036983927 samples/sec                   batch loss = 143.8151478767395 | accuracy = 0.5954545454545455


Epoch[2] Batch[60] Speed: 1.2437898233547613 samples/sec                   batch loss = 156.6974654197693 | accuracy = 0.6


Epoch[2] Batch[65] Speed: 1.2439564676450918 samples/sec                   batch loss = 168.3002359867096 | accuracy = 0.6115384615384616


Epoch[2] Batch[70] Speed: 1.244761359757221 samples/sec                   batch loss = 181.24164700508118 | accuracy = 0.6071428571428571


Epoch[2] Batch[75] Speed: 1.2417914548498374 samples/sec                   batch loss = 195.00003266334534 | accuracy = 0.5966666666666667


Epoch[2] Batch[80] Speed: 1.2356001447026232 samples/sec                   batch loss = 206.51412057876587 | accuracy = 0.60625


Epoch[2] Batch[85] Speed: 1.236684710418859 samples/sec                   batch loss = 218.22347974777222 | accuracy = 0.6176470588235294


Epoch[2] Batch[90] Speed: 1.235309289793555 samples/sec                   batch loss = 231.43789935112 | accuracy = 0.6166666666666667


Epoch[2] Batch[95] Speed: 1.239066799810904 samples/sec                   batch loss = 245.03038501739502 | accuracy = 0.6131578947368421


Epoch[2] Batch[100] Speed: 1.2393374546048732 samples/sec                   batch loss = 257.6860423088074 | accuracy = 0.6125


Epoch[2] Batch[105] Speed: 1.2388939611398297 samples/sec                   batch loss = 268.12900948524475 | accuracy = 0.6214285714285714


Epoch[2] Batch[110] Speed: 1.2346222264315256 samples/sec                   batch loss = 280.9596962928772 | accuracy = 0.625


Epoch[2] Batch[115] Speed: 1.2401593876646793 samples/sec                   batch loss = 293.13469076156616 | accuracy = 0.6260869565217392


Epoch[2] Batch[120] Speed: 1.2415503216948154 samples/sec                   batch loss = 306.79748916625977 | accuracy = 0.6270833333333333


Epoch[2] Batch[125] Speed: 1.2390307458378682 samples/sec                   batch loss = 317.1285079717636 | accuracy = 0.636


Epoch[2] Batch[130] Speed: 1.24318089564484 samples/sec                   batch loss = 332.074826836586 | accuracy = 0.6326923076923077


Epoch[2] Batch[135] Speed: 1.238285429641575 samples/sec                   batch loss = 344.33104944229126 | accuracy = 0.6351851851851852


Epoch[2] Batch[140] Speed: 1.239900652548718 samples/sec                   batch loss = 357.22445249557495 | accuracy = 0.6321428571428571


Epoch[2] Batch[145] Speed: 1.2351148563992254 samples/sec                   batch loss = 370.59158635139465 | accuracy = 0.6310344827586207


Epoch[2] Batch[150] Speed: 1.2368927695528207 samples/sec                   batch loss = 382.50487315654755 | accuracy = 0.6316666666666667


Epoch[2] Batch[155] Speed: 1.242764657611893 samples/sec                   batch loss = 392.81203734874725 | accuracy = 0.6403225806451613


Epoch[2] Batch[160] Speed: 1.2411561090475172 samples/sec                   batch loss = 404.7967575788498 | accuracy = 0.6421875


Epoch[2] Batch[165] Speed: 1.2421183839001853 samples/sec                   batch loss = 415.59153175354004 | accuracy = 0.6454545454545455


Epoch[2] Batch[170] Speed: 1.237932563390562 samples/sec                   batch loss = 428.3580529689789 | accuracy = 0.6470588235294118


Epoch[2] Batch[175] Speed: 1.2411145163995225 samples/sec                   batch loss = 440.6393778324127 | accuracy = 0.6485714285714286


Epoch[2] Batch[180] Speed: 1.236905171431213 samples/sec                   batch loss = 453.9431850910187 | accuracy = 0.6444444444444445


Epoch[2] Batch[185] Speed: 1.2405222385721955 samples/sec                   batch loss = 467.3944728374481 | accuracy = 0.6445945945945946


Epoch[2] Batch[190] Speed: 1.2432699810077152 samples/sec                   batch loss = 482.90674018859863 | accuracy = 0.6407894736842106


Epoch[2] Batch[195] Speed: 1.2374482709838897 samples/sec                   batch loss = 494.30650413036346 | accuracy = 0.6435897435897436


Epoch[2] Batch[200] Speed: 1.2390097915740148 samples/sec                   batch loss = 508.0294016599655 | accuracy = 0.6425


Epoch[2] Batch[205] Speed: 1.2384379864056294 samples/sec                   batch loss = 521.6505957841873 | accuracy = 0.6426829268292683


Epoch[2] Batch[210] Speed: 1.2333525644757868 samples/sec                   batch loss = 532.0867159366608 | accuracy = 0.6488095238095238


Epoch[2] Batch[215] Speed: 1.232506667512962 samples/sec                   batch loss = 544.1592061519623 | accuracy = 0.6488372093023256


Epoch[2] Batch[220] Speed: 1.2389163752920853 samples/sec                   batch loss = 558.4758403301239 | accuracy = 0.6465909090909091


Epoch[2] Batch[225] Speed: 1.2342185085223762 samples/sec                   batch loss = 570.2071496248245 | accuracy = 0.6477777777777778


Epoch[2] Batch[230] Speed: 1.2356076066551236 samples/sec                   batch loss = 584.6103762388229 | accuracy = 0.6434782608695652


Epoch[2] Batch[235] Speed: 1.2360751593571166 samples/sec                   batch loss = 597.6773985624313 | accuracy = 0.6404255319148936


Epoch[2] Batch[240] Speed: 1.2341203664165576 samples/sec                   batch loss = 612.1397968530655 | accuracy = 0.6385416666666667


Epoch[2] Batch[245] Speed: 1.2409189856829068 samples/sec                   batch loss = 623.8209676742554 | accuracy = 0.6408163265306123


Epoch[2] Batch[250] Speed: 1.234008715842057 samples/sec                   batch loss = 634.4071751832962 | accuracy = 0.644


Epoch[2] Batch[255] Speed: 1.2376186066631372 samples/sec                   batch loss = 646.0299025774002 | accuracy = 0.6460784313725491


Epoch[2] Batch[260] Speed: 1.2393881754863005 samples/sec                   batch loss = 658.1800369024277 | accuracy = 0.6471153846153846


Epoch[2] Batch[265] Speed: 1.2335966913534786 samples/sec                   batch loss = 670.523335814476 | accuracy = 0.6471698113207547


Epoch[2] Batch[270] Speed: 1.2347886950246612 samples/sec                   batch loss = 683.2261862754822 | accuracy = 0.65


Epoch[2] Batch[275] Speed: 1.2344923176156188 samples/sec                   batch loss = 696.3977189064026 | accuracy = 0.6472727272727272


Epoch[2] Batch[280] Speed: 1.234261547079148 samples/sec                   batch loss = 708.2008900642395 | accuracy = 0.6491071428571429


Epoch[2] Batch[285] Speed: 1.2373454167846691 samples/sec                   batch loss = 719.1002584695816 | accuracy = 0.6526315789473685


Epoch[2] Batch[290] Speed: 1.2353554971875536 samples/sec                   batch loss = 732.8473641872406 | accuracy = 0.65


Epoch[2] Batch[295] Speed: 1.2406780994626927 samples/sec                   batch loss = 745.3023227453232 | accuracy = 0.6516949152542373


Epoch[2] Batch[300] Speed: 1.2361364517335711 samples/sec                   batch loss = 760.0480986833572 | accuracy = 0.6491666666666667


Epoch[2] Batch[305] Speed: 1.2356612989706515 samples/sec                   batch loss = 773.9762340784073 | accuracy = 0.6467213114754098


Epoch[2] Batch[310] Speed: 1.2387160489889388 samples/sec                   batch loss = 786.4884567260742 | accuracy = 0.6475806451612903


Epoch[2] Batch[315] Speed: 1.2385109417493216 samples/sec                   batch loss = 797.4754656553268 | accuracy = 0.6492063492063492


Epoch[2] Batch[320] Speed: 1.2347796980308103 samples/sec                   batch loss = 812.3037732839584 | accuracy = 0.64609375


Epoch[2] Batch[325] Speed: 1.2377600409841811 samples/sec                   batch loss = 824.5294916629791 | accuracy = 0.6469230769230769


Epoch[2] Batch[330] Speed: 1.234990207508367 samples/sec                   batch loss = 837.0190238952637 | accuracy = 0.6477272727272727


Epoch[2] Batch[335] Speed: 1.2360339065440142 samples/sec                   batch loss = 849.8746852874756 | accuracy = 0.6492537313432836


Epoch[2] Batch[340] Speed: 1.2394288284316304 samples/sec                   batch loss = 861.7987065315247 | accuracy = 0.649264705882353


Epoch[2] Batch[345] Speed: 1.2371212411137207 samples/sec                   batch loss = 874.4822399616241 | accuracy = 0.6485507246376812


Epoch[2] Batch[350] Speed: 1.2340843273970254 samples/sec                   batch loss = 887.2189313173294 | accuracy = 0.6485714285714286


Epoch[2] Batch[355] Speed: 1.2314512881664612 samples/sec                   batch loss = 900.2476538419724 | accuracy = 0.6485915492957747


Epoch[2] Batch[360] Speed: 1.234242024998946 samples/sec                   batch loss = 914.2248607873917 | accuracy = 0.6458333333333334


Epoch[2] Batch[365] Speed: 1.2372604631676571 samples/sec                   batch loss = 924.8237473964691 | accuracy = 0.6465753424657534


Epoch[2] Batch[370] Speed: 1.2384352438911685 samples/sec                   batch loss = 937.4827542304993 | accuracy = 0.6445945945945946


Epoch[2] Batch[375] Speed: 1.2367331175792453 samples/sec                   batch loss = 949.6805540323257 | accuracy = 0.6466666666666666


Epoch[2] Batch[380] Speed: 1.2353153838779465 samples/sec                   batch loss = 962.1342425346375 | accuracy = 0.6460526315789473


Epoch[2] Batch[385] Speed: 1.2314457744748808 samples/sec                   batch loss = 972.4046490192413 | accuracy = 0.6474025974025974


Epoch[2] Batch[390] Speed: 1.2388824341773261 samples/sec                   batch loss = 983.8219203948975 | accuracy = 0.6474358974358975


Epoch[2] Batch[395] Speed: 1.2326894119683585 samples/sec                   batch loss = 997.3543679714203 | accuracy = 0.6468354430379747


Epoch[2] Batch[400] Speed: 1.234007445138789 samples/sec                   batch loss = 1008.7722057104111 | accuracy = 0.648125


Epoch[2] Batch[405] Speed: 1.2326413208349114 samples/sec                   batch loss = 1025.1819645166397 | accuracy = 0.645679012345679


Epoch[2] Batch[410] Speed: 1.2369011590315773 samples/sec                   batch loss = 1038.1902064085007 | accuracy = 0.6451219512195122


Epoch[2] Batch[415] Speed: 1.232993713848751 samples/sec                   batch loss = 1050.761864900589 | accuracy = 0.6457831325301204


Epoch[2] Batch[420] Speed: 1.2350388457670043 samples/sec                   batch loss = 1061.5535726547241 | accuracy = 0.6470238095238096


Epoch[2] Batch[425] Speed: 1.2319663580658942 samples/sec                   batch loss = 1076.4931230545044 | accuracy = 0.6458823529411765


Epoch[2] Batch[430] Speed: 1.235213520514205 samples/sec                   batch loss = 1089.6519869565964 | accuracy = 0.6453488372093024


Epoch[2] Batch[435] Speed: 1.2358888605688978 samples/sec                   batch loss = 1101.6233077049255 | accuracy = 0.6459770114942529


Epoch[2] Batch[440] Speed: 1.24106273607129 samples/sec                   batch loss = 1112.422502398491 | accuracy = 0.6465909090909091


Epoch[2] Batch[445] Speed: 1.2359364770569399 samples/sec                   batch loss = 1124.144702076912 | accuracy = 0.6477528089887641


Epoch[2] Batch[450] Speed: 1.2346129593027444 samples/sec                   batch loss = 1138.61443567276 | accuracy = 0.6472222222222223


Epoch[2] Batch[455] Speed: 1.2434807228504892 samples/sec                   batch loss = 1149.7646038532257 | accuracy = 0.6478021978021978


Epoch[2] Batch[460] Speed: 1.2304922873909498 samples/sec                   batch loss = 1162.3133496046066 | accuracy = 0.6472826086956521


Epoch[2] Batch[465] Speed: 1.230510608020784 samples/sec                   batch loss = 1172.7795859575272 | accuracy = 0.6473118279569893


Epoch[2] Batch[470] Speed: 1.2437515577274327 samples/sec                   batch loss = 1185.7082248926163 | accuracy = 0.6468085106382979


Epoch[2] Batch[475] Speed: 1.2410561261276123 samples/sec                   batch loss = 1198.0295169353485 | accuracy = 0.6473684210526316


Epoch[2] Batch[480] Speed: 1.2370981621103472 samples/sec                   batch loss = 1212.3530864715576 | accuracy = 0.6463541666666667


Epoch[2] Batch[485] Speed: 1.2386762658596069 samples/sec                   batch loss = 1225.5263316631317 | accuracy = 0.6458762886597939


Epoch[2] Batch[490] Speed: 1.236435441648361 samples/sec                   batch loss = 1236.999035835266 | accuracy = 0.6474489795918368


Epoch[2] Batch[495] Speed: 1.2402846238031633 samples/sec                   batch loss = 1249.0260854959488 | accuracy = 0.6464646464646465


Epoch[2] Batch[500] Speed: 1.2383859721175197 samples/sec                   batch loss = 1262.0318781137466 | accuracy = 0.6465


Epoch[2] Batch[505] Speed: 1.2397162219179478 samples/sec                   batch loss = 1272.7446004152298 | accuracy = 0.6475247524752475


Epoch[2] Batch[510] Speed: 1.2435545500709753 samples/sec                   batch loss = 1285.2787429094315 | accuracy = 0.6485294117647059


Epoch[2] Batch[515] Speed: 1.2409543235277236 samples/sec                   batch loss = 1297.0839228630066 | accuracy = 0.6490291262135922


Epoch[2] Batch[520] Speed: 1.2404994911111744 samples/sec                   batch loss = 1310.85103058815 | accuracy = 0.6475961538461539


Epoch[2] Batch[525] Speed: 1.2403150656899424 samples/sec                   batch loss = 1319.697608590126 | accuracy = 0.6490476190476191


Epoch[2] Batch[530] Speed: 1.2395901846586563 samples/sec                   batch loss = 1331.4370827674866 | accuracy = 0.6495283018867924


Epoch[2] Batch[535] Speed: 1.2387562005310553 samples/sec                   batch loss = 1343.0826598405838 | accuracy = 0.65


Epoch[2] Batch[540] Speed: 1.238943639313346 samples/sec                   batch loss = 1353.0072145462036 | accuracy = 0.6509259259259259


Epoch[2] Batch[545] Speed: 1.2353999796176722 samples/sec                   batch loss = 1365.5025100708008 | accuracy = 0.6504587155963303


Epoch[2] Batch[550] Speed: 1.24194294612582 samples/sec                   batch loss = 1377.7588119506836 | accuracy = 0.6509090909090909


Epoch[2] Batch[555] Speed: 1.2399393230566842 samples/sec                   batch loss = 1387.2101678848267 | accuracy = 0.6531531531531531


Epoch[2] Batch[560] Speed: 1.2424707883896646 samples/sec                   batch loss = 1397.9758944511414 | accuracy = 0.6535714285714286


Epoch[2] Batch[565] Speed: 1.2382129577830687 samples/sec                   batch loss = 1410.4058438539505 | accuracy = 0.6535398230088496


Epoch[2] Batch[570] Speed: 1.235987375218599 samples/sec                   batch loss = 1426.480097413063 | accuracy = 0.6530701754385965


Epoch[2] Batch[575] Speed: 1.2384352438911685 samples/sec                   batch loss = 1437.455704331398 | accuracy = 0.6539130434782608


Epoch[2] Batch[580] Speed: 1.235407803034777 samples/sec                   batch loss = 1449.1901878118515 | accuracy = 0.6543103448275862


Epoch[2] Batch[585] Speed: 1.2404274018470944 samples/sec                   batch loss = 1464.5706306695938 | accuracy = 0.6534188034188034


Epoch[2] Batch[590] Speed: 1.2272165431078308 samples/sec                   batch loss = 1476.5765410661697 | accuracy = 0.6533898305084745


Epoch[2] Batch[595] Speed: 1.233447229253107 samples/sec                   batch loss = 1490.4809929132462 | accuracy = 0.6529411764705882


Epoch[2] Batch[600] Speed: 1.2348830350365305 samples/sec                   batch loss = 1501.1722555160522 | accuracy = 0.6541666666666667


Epoch[2] Batch[605] Speed: 1.2338824751818294 samples/sec                   batch loss = 1513.525995850563 | accuracy = 0.6549586776859504


Epoch[2] Batch[610] Speed: 1.236650709161439 samples/sec                   batch loss = 1525.2955980300903 | accuracy = 0.655327868852459


Epoch[2] Batch[615] Speed: 1.2354426457351784 samples/sec                   batch loss = 1538.195237159729 | accuracy = 0.6544715447154471


Epoch[2] Batch[620] Speed: 1.240944777521888 samples/sec                   batch loss = 1549.1334601640701 | accuracy = 0.6548387096774193


Epoch[2] Batch[625] Speed: 1.240327811381519 samples/sec                   batch loss = 1560.8604884147644 | accuracy = 0.6556


Epoch[2] Batch[630] Speed: 1.2425701711306014 samples/sec                   batch loss = 1573.0274534225464 | accuracy = 0.655952380952381


Epoch[2] Batch[635] Speed: 1.2421457890341343 samples/sec                   batch loss = 1585.2649989128113 | accuracy = 0.6547244094488189


Epoch[2] Batch[640] Speed: 1.2422725305144025 samples/sec                   batch loss = 1597.238172531128 | accuracy = 0.6546875


Epoch[2] Batch[645] Speed: 1.2466796259637734 samples/sec                   batch loss = 1608.6091351509094 | accuracy = 0.6550387596899225


Epoch[2] Batch[650] Speed: 1.241481968662992 samples/sec                   batch loss = 1620.2047570943832 | accuracy = 0.6557692307692308


Epoch[2] Batch[655] Speed: 1.244821484643269 samples/sec                   batch loss = 1633.3787089586258 | accuracy = 0.6564885496183206


Epoch[2] Batch[660] Speed: 1.2425799262193875 samples/sec                   batch loss = 1645.1727446317673 | accuracy = 0.656060606060606


Epoch[2] Batch[665] Speed: 1.2409861752058784 samples/sec                   batch loss = 1656.9274138212204 | accuracy = 0.6571428571428571


Epoch[2] Batch[670] Speed: 1.2396019079737657 samples/sec                   batch loss = 1668.8464612960815 | accuracy = 0.6574626865671642


Epoch[2] Batch[675] Speed: 1.2421151652557534 samples/sec                   batch loss = 1681.6534212827682 | accuracy = 0.6570370370370371


Epoch[2] Batch[680] Speed: 1.2448030124779934 samples/sec                   batch loss = 1695.9895149469376 | accuracy = 0.6566176470588235


Epoch[2] Batch[685] Speed: 1.2413453770228204 samples/sec                   batch loss = 1706.3114436864853 | accuracy = 0.656934306569343


Epoch[2] Batch[690] Speed: 1.2328427671375395 samples/sec                   batch loss = 1717.364765048027 | accuracy = 0.657608695652174


Epoch[2] Batch[695] Speed: 1.2313874769865547 samples/sec                   batch loss = 1729.8667460680008 | accuracy = 0.658273381294964


Epoch[2] Batch[700] Speed: 1.2368000369184886 samples/sec                   batch loss = 1741.5380446910858 | accuracy = 0.6582142857142858


Epoch[2] Batch[705] Speed: 1.2393729770970487 samples/sec                   batch loss = 1751.7748470306396 | accuracy = 0.6581560283687943


Epoch[2] Batch[710] Speed: 1.233325908626113 samples/sec                   batch loss = 1761.471659898758 | accuracy = 0.6588028169014084


Epoch[2] Batch[715] Speed: 1.2369373627614026 samples/sec                   batch loss = 1772.3374283313751 | accuracy = 0.6594405594405595


Epoch[2] Batch[720] Speed: 1.2361585840555647 samples/sec                   batch loss = 1781.6553362607956 | accuracy = 0.6607638888888889


Epoch[2] Batch[725] Speed: 1.2352145208762702 samples/sec                   batch loss = 1791.6037427186966 | accuracy = 0.6613793103448276


Epoch[2] Batch[730] Speed: 1.2322727467701458 samples/sec                   batch loss = 1805.3975286483765 | accuracy = 0.6613013698630137


Epoch[2] Batch[735] Speed: 1.2370716177601881 samples/sec                   batch loss = 1815.61531996727 | accuracy = 0.6619047619047619


Epoch[2] Batch[740] Speed: 1.235051846922385 samples/sec                   batch loss = 1828.118822336197 | accuracy = 0.6621621621621622


Epoch[2] Batch[745] Speed: 1.240606723340794 samples/sec                   batch loss = 1840.5584081411362 | accuracy = 0.6624161073825503


Epoch[2] Batch[750] Speed: 1.2429043241208568 samples/sec                   batch loss = 1850.065127670765 | accuracy = 0.6636666666666666


Epoch[2] Batch[755] Speed: 1.2359082526919167 samples/sec                   batch loss = 1858.8241659998894 | accuracy = 0.6649006622516557


Epoch[2] Batch[760] Speed: 1.2369059009612162 samples/sec                   batch loss = 1868.668095767498 | accuracy = 0.6661184210526315


Epoch[2] Batch[765] Speed: 1.2382436636130063 samples/sec                   batch loss = 1882.421314418316 | accuracy = 0.6663398692810457


Epoch[2] Batch[770] Speed: 1.2352834588198376 samples/sec                   batch loss = 1896.3234669566154 | accuracy = 0.6655844155844156


Epoch[2] Batch[775] Speed: 1.24683129304855 samples/sec                   batch loss = 1912.3441386818886 | accuracy = 0.6641935483870968


Epoch[2] Batch[780] Speed: 1.2459468260772861 samples/sec                   batch loss = 1922.1954842209816 | accuracy = 0.6647435897435897


Epoch[2] Batch[785] Speed: 1.243191673649974 samples/sec                   batch loss = 1934.384036719799 | accuracy = 0.6640127388535032


[Epoch 2] training: accuracy=0.6649746192893401
[Epoch 2] time cost: 652.8920545578003
[Epoch 2] validation: validation accuracy=0.7455555555555555


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you want to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).