<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. Training neural networks and running inference on GPUs can be considerably faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure that your machine has at least one NVIDIA GPU and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need the GPU-enabled version of MXNet; installation instructions for your system are available [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device the array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the context, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[21:38:01] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `ctx` shown in the output tells you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. Using `npx.gpu(1)` on a machine with only one GPU leads to an error, so the code falls back to the CPU when a second GPU is not available:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[21:38:01] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
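
The zero-based numbering can be sketched in plain Python, without MXNet. The helper `gpu_ids` below is hypothetical (it is not an MXNet function) and only computes the integer IDs you would pass to `npx.gpu(i)`; the GPU counts are made-up inputs.

```python
def gpu_ids(num_available, num_requested=None):
    """Return the zero-based IDs of the GPUs to use.

    Hypothetical helper for illustration; not part of MXNet.
    """
    if num_available < 0:
        raise ValueError("num_available must be non-negative")
    if num_requested is not None and num_requested < 0:
        raise ValueError("num_requested must be non-negative")
    ids = list(range(num_available))
    if num_requested is None:
        return ids
    # Slicing never raises: asking for more GPUs than exist returns all of them.
    return ids[:num_requested]


print(gpu_ids(8)[-1])   # ID of the last of eight GPUs: 7
print(gpu_ids(8, 2))    # first two of eight GPUs: [0, 1]
print(gpu_ids(1, 2))    # two requested, one available: [0]
```

Slicing the list, rather than indexing a fixed ID, is one way to keep the same code usable on machines with fewer GPUs than requested.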

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to make sure that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.
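
The rule can be illustrated with a toy model in plain Python. `ToyArray` is a hypothetical stand-in, not an MXNet class, and the error type and message MXNet raises are different; the sketch only mirrors the idea that operands must share a device and that copies are explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyArray:
    """Stand-in for an array that remembers its device (not an MXNet class)."""
    values: tuple
    device: str  # e.g. "gpu(0)", "gpu(1)" or "cpu(0)"

    def copyto(self, device):
        # An explicit copy is the only way data changes device in this model.
        return ToyArray(self.values, device)

    def __add__(self, other):
        if self.device != other.device:
            raise ValueError(
                f"operands live on {self.device} and {other.device}; "
                "copy one of them first"
            )
        # The result stays on the device that holds the inputs.
        return ToyArray(
            tuple(a + b for a, b in zip(self.values, other.values)),
            self.device,
        )


x = ToyArray((1.0, 1.0), "gpu(0)")
y = ToyArray((0.5, 0.25), "gpu(1)")

try:
    x + y  # operands on different devices
except ValueError as err:
    print("error:", err)

# Copy y to the device that holds x, then add.
z = x + y.copyto("gpu(0)")
print(z.device, z.values)
```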

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch
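
To see how many features reach the first dense layer, the spatial size can be traced by hand. The sketch below is plain Python and assumes the Gluon defaults that the code above does not override: stride 1 and no padding for `Conv2D`, and no padding with floor rounding for `MaxPool2D`. The helper names are hypothetical, and the 128 x 128 input size is the one used for the example input later in this step.

```python
def window_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def leaf_feature_shape(size=128, filters=(32, 64, 128)):
    """Trace (channels, height, width) through the three conv blocks."""
    channels = None
    for channels in filters:
        size = window_out(size, kernel=2)            # Conv2D(kernel_size=2), stride 1 assumed
        size = window_out(size, kernel=4, stride=2)  # MaxPool2D(pool_size=4, strides=2)
        # BatchNorm leaves the shape unchanged.
        print(f"after block with {channels} filters: {channels} x {size} x {size}")
    return channels, size, size


c, h, w = leaf_feature_shape()
print("flattened features:", c * h * w)
```

Under these assumptions the three blocks reduce 128 x 128 to 62, 29, and then 13 pixels per side, so `Flatten` produces 128 x 13 x 13 = 21632 features.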

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you can use `net.collect_params().reset_ctx(gpu)` to change the device of parameters that are already loaded.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[21:38:02] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 5.175228 , -1.9694889]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With *n* GPUs, data parallelism splits each data batch into *n* parts and assigns one part to each GPU, which runs the forward and backward passes on its own chunk of the data.
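
A minimal sketch of the idea in plain Python, assuming a toy one-parameter model with a squared-error loss and made-up data. The helpers `split_batch` and `grad_sum` are hypothetical and stand in for what the framework does on real devices: each "device" computes the gradient for its shard, and the summed result, divided by the full batch size, matches the single-device gradient.

```python
def split_batch(batch, num_devices):
    """Split a list of samples into num_devices equal shards."""
    if len(batch) % num_devices != 0:
        raise ValueError("batch size must be divisible by the number of devices")
    shard = len(batch) // num_devices
    return [batch[i * shard:(i + 1) * shard] for i in range(num_devices)]


def grad_sum(samples, w):
    """Sum over samples of d/dw (w*x - y)^2 = 2 * (w*x - y) * x."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


# Made-up (x, y) pairs for the toy model y = w * x.
batch = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.5)]
w = 1.0

# Single-device reference: gradient of the mean loss over the whole batch.
full_grad = grad_sum(batch, w) / len(batch)

# Data parallel: each "device" handles one shard, then the results are summed
# and normalized by the full batch size.
shards = split_batch(batch, num_devices=2)
per_device = [grad_sum(shard, w) for shard in shards]
parallel_grad = sum(per_device) / len(batch)

print("shards:", shards)
print("per-device gradient sums:", per_device)
print("parallel:", parallel_grad, "single device:", full_grad)
```

Because the per-shard gradients are summed before normalizing, the update is the same as for one large batch; only the work is divided.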

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
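
The per-device lists passed to the metric can be pictured with a small plain-Python counterpart. `RunningAccuracy` below is a hypothetical toy class, not the Gluon metric, and the labels and scores are made up; it only shows how correct predictions are counted across the shards of one batch.

```python
def argmax(scores):
    """Index of the largest score (the predicted class)."""
    return max(range(len(scores)), key=scores.__getitem__)


class RunningAccuracy:
    """Toy counterpart of an accuracy metric; not the Gluon class."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, label_shards, output_shards):
        # One entry per device: labels and the matching network outputs.
        for labels, outputs in zip(label_shards, output_shards):
            for label, scores in zip(labels, outputs):
                self.correct += int(argmax(scores) == label)
                self.total += 1

    def get(self):
        return "accuracy", self.correct / self.total


# Made-up outputs for a batch of four samples split across two devices.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],   # device 0: both predictions correct
    [[1.5, 0.2], [0.7, 0.1]],    # device 1: first wrong, second correct
]

metric = RunningAccuracy()
metric.update(label_shards, output_shards)
print(metric.get())
```

Three of the four predictions match their labels, so the toy metric reports an accuracy of 0.75 for this batch.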

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training.
# The slice keeps at most num_gpus devices, so fewer are used if fewer are available.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

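            # btic is reset only at each log step (and at the start of the epoch),
            # so the elapsed time covers log_interval batches while batch_size
            # counts a single batch. The printed "batch loss" is the loss
            # accumulated since the start of the epoch.
            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")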
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7968040480942965 samples/sec                   batch loss = 13.868988513946533 | accuracy = 0.55


Epoch[1] Batch[10] Speed: 1.2853934312851247 samples/sec                   batch loss = 28.791743278503418 | accuracy = 0.5


Epoch[1] Batch[15] Speed: 1.2822040817387126 samples/sec                   batch loss = 42.23243236541748 | accuracy = 0.5333333333333333


Epoch[1] Batch[20] Speed: 1.277956295630439 samples/sec                   batch loss = 55.79063868522644 | accuracy = 0.525


Epoch[1] Batch[25] Speed: 1.2848450279503996 samples/sec                   batch loss = 69.42654538154602 | accuracy = 0.53


Epoch[1] Batch[30] Speed: 1.2869271303482306 samples/sec                   batch loss = 83.42409110069275 | accuracy = 0.5166666666666667


Epoch[1] Batch[35] Speed: 1.281954150690521 samples/sec                   batch loss = 97.4768340587616 | accuracy = 0.5214285714285715


Epoch[1] Batch[40] Speed: 1.2837936321227552 samples/sec                   batch loss = 111.41974925994873 | accuracy = 0.525


Epoch[1] Batch[45] Speed: 1.2858531077551265 samples/sec                   batch loss = 125.40536403656006 | accuracy = 0.5277777777777778


Epoch[1] Batch[50] Speed: 1.2875502318090113 samples/sec                   batch loss = 139.6426739692688 | accuracy = 0.525


Epoch[1] Batch[55] Speed: 1.2848616572487834 samples/sec                   batch loss = 153.13676023483276 | accuracy = 0.5363636363636364


Epoch[1] Batch[60] Speed: 1.2858074800735746 samples/sec                   batch loss = 168.41970348358154 | accuracy = 0.5208333333333334


Epoch[1] Batch[65] Speed: 1.282660400822201 samples/sec                   batch loss = 182.23099422454834 | accuracy = 0.5230769230769231


Epoch[1] Batch[70] Speed: 1.2847782199009332 samples/sec                   batch loss = 196.2162163257599 | accuracy = 0.5285714285714286


Epoch[1] Batch[75] Speed: 1.2906478050038364 samples/sec                   batch loss = 210.32494044303894 | accuracy = 0.5233333333333333


Epoch[1] Batch[80] Speed: 1.2814010435757053 samples/sec                   batch loss = 224.77786946296692 | accuracy = 0.5125


Epoch[1] Batch[85] Speed: 1.2874395721921852 samples/sec                   batch loss = 238.0511598587036 | accuracy = 0.5147058823529411


Epoch[1] Batch[90] Speed: 1.2806238390512346 samples/sec                   batch loss = 251.85908102989197 | accuracy = 0.5111111111111111


Epoch[1] Batch[95] Speed: 1.2801986636414984 samples/sec                   batch loss = 265.2088587284088 | accuracy = 0.5131578947368421


Epoch[1] Batch[100] Speed: 1.2818383787417287 samples/sec                   batch loss = 279.23123931884766 | accuracy = 0.5125


Epoch[1] Batch[105] Speed: 1.2825968593944252 samples/sec                   batch loss = 292.78021001815796 | accuracy = 0.5166666666666667


Epoch[1] Batch[110] Speed: 1.2900967948462616 samples/sec                   batch loss = 306.72673177719116 | accuracy = 0.5159090909090909


Epoch[1] Batch[115] Speed: 1.2849603595177368 samples/sec                   batch loss = 320.96015977859497 | accuracy = 0.508695652173913


Epoch[1] Batch[120] Speed: 1.291717622783582 samples/sec                   batch loss = 333.99686431884766 | accuracy = 0.5166666666666667


Epoch[1] Batch[125] Speed: 1.2877766492724005 samples/sec                   batch loss = 347.5232923030853 | accuracy = 0.522


Epoch[1] Batch[130] Speed: 1.2927655070531368 samples/sec                   batch loss = 360.8697280883789 | accuracy = 0.525


Epoch[1] Batch[135] Speed: 1.2912931005951864 samples/sec                   batch loss = 375.3546280860901 | accuracy = 0.5222222222222223


Epoch[1] Batch[140] Speed: 1.281961203464651 samples/sec                   batch loss = 388.95575761795044 | accuracy = 0.525


Epoch[1] Batch[145] Speed: 1.284347922211025 samples/sec                   batch loss = 403.14034509658813 | accuracy = 0.5241379310344828


Epoch[1] Batch[150] Speed: 1.2840798573041865 samples/sec                   batch loss = 416.73729038238525 | accuracy = 0.525


Epoch[1] Batch[155] Speed: 1.2786929376237273 samples/sec                   batch loss = 429.8574137687683 | accuracy = 0.5338709677419354


Epoch[1] Batch[160] Speed: 1.284230242976418 samples/sec                   batch loss = 443.60844135284424 | accuracy = 0.534375


Epoch[1] Batch[165] Speed: 1.2798706185478534 samples/sec                   batch loss = 457.35822439193726 | accuracy = 0.5363636363636364


Epoch[1] Batch[170] Speed: 1.2823830416676254 samples/sec                   batch loss = 471.0157678127289 | accuracy = 0.538235294117647


Epoch[1] Batch[175] Speed: 1.2803969980926704 samples/sec                   batch loss = 484.8428854942322 | accuracy = 0.5385714285714286


Epoch[1] Batch[180] Speed: 1.282165375729152 samples/sec                   batch loss = 498.9041759967804 | accuracy = 0.5402777777777777


Epoch[1] Batch[185] Speed: 1.2809855238359347 samples/sec                   batch loss = 512.7201464176178 | accuracy = 0.5405405405405406


Epoch[1] Batch[190] Speed: 1.2832579784965843 samples/sec                   batch loss = 527.0156033039093 | accuracy = 0.5368421052631579


Epoch[1] Batch[195] Speed: 1.2848599844596513 samples/sec                   batch loss = 540.769540309906 | accuracy = 0.5333333333333333


Epoch[1] Batch[200] Speed: 1.287537089984154 samples/sec                   batch loss = 554.2284276485443 | accuracy = 0.5375


Epoch[1] Batch[205] Speed: 1.277356350272077 samples/sec                   batch loss = 567.8698432445526 | accuracy = 0.5390243902439025


Epoch[1] Batch[210] Speed: 1.2813092479521044 samples/sec                   batch loss = 581.7088775634766 | accuracy = 0.5380952380952381


Epoch[1] Batch[215] Speed: 1.2786512274268795 samples/sec                   batch loss = 595.0222132205963 | accuracy = 0.5395348837209303


Epoch[1] Batch[220] Speed: 1.2833725344525668 samples/sec                   batch loss = 608.4946765899658 | accuracy = 0.5409090909090909


Epoch[1] Batch[225] Speed: 1.2816235410519123 samples/sec                   batch loss = 622.0478074550629 | accuracy = 0.5433333333333333


Epoch[1] Batch[230] Speed: 1.2779598973928035 samples/sec                   batch loss = 635.5183980464935 | accuracy = 0.5445652173913044


Epoch[1] Batch[235] Speed: 1.2851151843477329 samples/sec                   batch loss = 649.1653754711151 | accuracy = 0.5436170212765957


Epoch[1] Batch[240] Speed: 1.2870628793121555 samples/sec                   batch loss = 662.9701311588287 | accuracy = 0.5427083333333333


Epoch[1] Batch[245] Speed: 1.283164738961518 samples/sec                   batch loss = 676.0012254714966 | accuracy = 0.5459183673469388


Epoch[1] Batch[250] Speed: 1.288873790635262 samples/sec                   batch loss = 689.879513502121 | accuracy = 0.544


Epoch[1] Batch[255] Speed: 1.2888485423108618 samples/sec                   batch loss = 703.5036256313324 | accuracy = 0.5450980392156862


Epoch[1] Batch[260] Speed: 1.2824408762007584 samples/sec                   batch loss = 716.9409883022308 | accuracy = 0.5471153846153847


Epoch[1] Batch[265] Speed: 1.2856278587073557 samples/sec                   batch loss = 730.5543200969696 | accuracy = 0.55


Epoch[1] Batch[270] Speed: 1.2783505657373238 samples/sec                   batch loss = 743.9841086864471 | accuracy = 0.549074074074074


Epoch[1] Batch[275] Speed: 1.2817319301005563 samples/sec                   batch loss = 757.3423316478729 | accuracy = 0.5518181818181818


Epoch[1] Batch[280] Speed: 1.2875360030783203 samples/sec                   batch loss = 770.6130125522614 | accuracy = 0.5544642857142857


Epoch[1] Batch[285] Speed: 1.279849138876376 samples/sec                   batch loss = 784.7285301685333 | accuracy = 0.5526315789473685


Epoch[1] Batch[290] Speed: 1.2841084573568509 samples/sec                   batch loss = 798.3699462413788 | accuracy = 0.553448275862069


Epoch[1] Batch[295] Speed: 1.2827382672667025 samples/sec                   batch loss = 811.5390965938568 | accuracy = 0.5559322033898305


Epoch[1] Batch[300] Speed: 1.2830665085943778 samples/sec                   batch loss = 824.744127035141 | accuracy = 0.5566666666666666


Epoch[1] Batch[305] Speed: 1.2797767967409248 samples/sec                   batch loss = 837.3521056175232 | accuracy = 0.5581967213114755


Epoch[1] Batch[310] Speed: 1.2785912007049118 samples/sec                   batch loss = 850.6977069377899 | accuracy = 0.5588709677419355


Epoch[1] Batch[315] Speed: 1.2814788550691052 samples/sec                   batch loss = 864.3542461395264 | accuracy = 0.5603174603174603


Epoch[1] Batch[320] Speed: 1.2848331220307714 samples/sec                   batch loss = 877.9151916503906 | accuracy = 0.5609375


Epoch[1] Batch[325] Speed: 1.2822922813754967 samples/sec                   batch loss = 891.8163363933563 | accuracy = 0.56


Epoch[1] Batch[330] Speed: 1.2796676644340719 samples/sec                   batch loss = 905.2582106590271 | accuracy = 0.5606060606060606


Epoch[1] Batch[335] Speed: 1.2837298801441597 samples/sec                   batch loss = 919.5359017848969 | accuracy = 0.558955223880597


Epoch[1] Batch[340] Speed: 1.2797946618519689 samples/sec                   batch loss = 933.0872213840485 | accuracy = 0.5580882352941177


Epoch[1] Batch[345] Speed: 1.2837955968451649 samples/sec                   batch loss = 946.4546656608582 | accuracy = 0.5601449275362319


Epoch[1] Batch[350] Speed: 1.2871720916265688 samples/sec                   batch loss = 960.9266624450684 | accuracy = 0.5592857142857143


Epoch[1] Batch[355] Speed: 1.286067394215094 samples/sec                   batch loss = 974.484625339508 | accuracy = 0.5605633802816902


Epoch[1] Batch[360] Speed: 1.2876318554653579 samples/sec                   batch loss = 987.9043393135071 | accuracy = 0.5625


Epoch[1] Batch[365] Speed: 1.2923930586822243 samples/sec                   batch loss = 1000.442699432373 | accuracy = 0.5643835616438356


Epoch[1] Batch[370] Speed: 1.286917949815832 samples/sec                   batch loss = 1014.2777979373932 | accuracy = 0.5635135135135135


Epoch[1] Batch[375] Speed: 1.2836586700895083 samples/sec                   batch loss = 1028.3008964061737 | accuracy = 0.562


Epoch[1] Batch[380] Speed: 1.28386760807073 samples/sec                   batch loss = 1041.259386062622 | accuracy = 0.5625


Epoch[1] Batch[385] Speed: 1.285894894908935 samples/sec                   batch loss = 1055.6808123588562 | accuracy = 0.5603896103896104


Epoch[1] Batch[390] Speed: 1.2866878869907306 samples/sec                   batch loss = 1068.1337096691132 | accuracy = 0.5641025641025641


Epoch[1] Batch[395] Speed: 1.2864196336224065 samples/sec                   batch loss = 1081.2137687206268 | accuracy = 0.5651898734177215


Epoch[1] Batch[400] Speed: 1.281036385332494 samples/sec                   batch loss = 1094.4436793327332 | accuracy = 0.5675


Epoch[1] Batch[405] Speed: 1.2908818686330517 samples/sec                   batch loss = 1107.4197466373444 | accuracy = 0.5679012345679012


Epoch[1] Batch[410] Speed: 1.2821641019002028 samples/sec                   batch loss = 1120.767834663391 | accuracy = 0.5676829268292682


Epoch[1] Batch[415] Speed: 1.2891780364450711 samples/sec                   batch loss = 1134.5366804599762 | accuracy = 0.5662650602409639


Epoch[1] Batch[420] Speed: 1.2842649448026557 samples/sec                   batch loss = 1147.4300413131714 | accuracy = 0.5672619047619047


Epoch[1] Batch[425] Speed: 1.2819595382193163 samples/sec                   batch loss = 1161.0590481758118 | accuracy = 0.5682352941176471


Epoch[1] Batch[430] Speed: 1.2858574440315853 samples/sec                   batch loss = 1174.500292301178 | accuracy = 0.5691860465116279


Epoch[1] Batch[435] Speed: 1.288446484695239 samples/sec                   batch loss = 1187.7270095348358 | accuracy = 0.5706896551724138


Epoch[1] Batch[440] Speed: 1.286270707085576 samples/sec                   batch loss = 1201.2074296474457 | accuracy = 0.5704545454545454


Epoch[1] Batch[445] Speed: 1.2851470791565187 samples/sec                   batch loss = 1213.5148746967316 | accuracy = 0.5719101123595506


Epoch[1] Batch[450] Speed: 1.2846263287809332 samples/sec                   batch loss = 1227.6718263626099 | accuracy = 0.5722222222222222


Epoch[1] Batch[455] Speed: 1.2889706345584429 samples/sec                   batch loss = 1241.0309238433838 | accuracy = 0.5714285714285714


Epoch[1] Batch[460] Speed: 1.2831957518713637 samples/sec                   batch loss = 1253.885986328125 | accuracy = 0.5722826086956522


Epoch[1] Batch[465] Speed: 1.2831738660229848 samples/sec                   batch loss = 1267.5136241912842 | accuracy = 0.5704301075268817


Epoch[1] Batch[470] Speed: 1.2825684246834634 samples/sec                   batch loss = 1280.2047021389008 | accuracy = 0.5707446808510638


Epoch[1] Batch[475] Speed: 1.2826625582033995 samples/sec                   batch loss = 1293.5777950286865 | accuracy = 0.5689473684210526


Epoch[1] Batch[480] Speed: 1.2857868846214833 samples/sec                   batch loss = 1306.8885171413422 | accuracy = 0.56875


Epoch[1] Batch[485] Speed: 1.282030853776318 samples/sec                   batch loss = 1319.8114762306213 | accuracy = 0.5695876288659794


Epoch[1] Batch[490] Speed: 1.2764341801498134 samples/sec                   batch loss = 1333.259569644928 | accuracy = 0.5704081632653061


Epoch[1] Batch[495] Speed: 1.278370923649771 samples/sec                   batch loss = 1346.8629536628723 | accuracy = 0.5702020202020202


Epoch[1] Batch[500] Speed: 1.2841914145442355 samples/sec                   batch loss = 1359.6521439552307 | accuracy = 0.5715


Epoch[1] Batch[505] Speed: 1.2820652408978515 samples/sec                   batch loss = 1371.756709575653 | accuracy = 0.5737623762376237


Epoch[1] Batch[510] Speed: 1.2824138207528528 samples/sec                   batch loss = 1384.3255207538605 | accuracy = 0.5740196078431372


Epoch[1] Batch[515] Speed: 1.2875783937656744 samples/sec                   batch loss = 1399.2698621749878 | accuracy = 0.5718446601941748


Epoch[1] Batch[520] Speed: 1.2837279156228907 samples/sec                   batch loss = 1413.9284799098969 | accuracy = 0.5697115384615384


Epoch[1] Batch[525] Speed: 1.2909888491764 samples/sec                   batch loss = 1427.9729146957397 | accuracy = 0.57


Epoch[1] Batch[530] Speed: 1.2856770205107944 samples/sec                   batch loss = 1441.5595390796661 | accuracy = 0.5693396226415094


Epoch[1] Batch[535] Speed: 1.2903231165289748 samples/sec                   batch loss = 1454.6045320034027 | accuracy = 0.5714953271028037


Epoch[1] Batch[540] Speed: 1.2875502318090113 samples/sec                   batch loss = 1467.8389732837677 | accuracy = 0.5712962962962963


Epoch[1] Batch[545] Speed: 1.2843910865003332 samples/sec                   batch loss = 1480.8766646385193 | accuracy = 0.5729357798165138


Epoch[1] Batch[550] Speed: 1.2884327308846797 samples/sec                   batch loss = 1493.5092589855194 | accuracy = 0.5736363636363636


Epoch[1] Batch[555] Speed: 1.2870380968191228 samples/sec                   batch loss = 1506.1258072853088 | accuracy = 0.5747747747747748


Epoch[1] Batch[560] Speed: 1.2830501219749924 samples/sec                   batch loss = 1519.5853097438812 | accuracy = 0.5758928571428571


Epoch[1] Batch[565] Speed: 1.288075828202056 samples/sec                   batch loss = 1532.455901145935 | accuracy = 0.5769911504424778


Epoch[1] Batch[570] Speed: 1.2824155852037884 samples/sec                   batch loss = 1545.4765040874481 | accuracy = 0.5767543859649122


Epoch[1] Batch[575] Speed: 1.280504593370419 samples/sec                   batch loss = 1559.0271050930023 | accuracy = 0.5760869565217391


Epoch[1] Batch[580] Speed: 1.2815254486121557 samples/sec                   batch loss = 1570.6325507164001 | accuracy = 0.5780172413793103


Epoch[1] Batch[585] Speed: 1.2875293828733245 samples/sec                   batch loss = 1583.4442501068115 | accuracy = 0.5794871794871795


Epoch[1] Batch[590] Speed: 1.2819615952877113 samples/sec                   batch loss = 1596.3739490509033 | accuracy = 0.5796610169491525


Epoch[1] Batch[595] Speed: 1.28740973687286 samples/sec                   batch loss = 1608.4072971343994 | accuracy = 0.5823529411764706


Epoch[1] Batch[600] Speed: 1.288638771943138 samples/sec                   batch loss = 1621.068427324295 | accuracy = 0.5820833333333333


Epoch[1] Batch[605] Speed: 1.2842028170808029 samples/sec                   batch loss = 1634.5187590122223 | accuracy = 0.5814049586776859


Epoch[1] Batch[610] Speed: 1.2796240362424034 samples/sec                   batch loss = 1648.5997321605682 | accuracy = 0.580327868852459


Epoch[1] Batch[615] Speed: 1.2865498493114245 samples/sec                   batch loss = 1660.176743030548 | accuracy = 0.5817073170731707


Epoch[1] Batch[620] Speed: 1.287252087087003 samples/sec                   batch loss = 1673.6910109519958 | accuracy = 0.5818548387096775


Epoch[1] Batch[625] Speed: 1.283427512867769 samples/sec                   batch loss = 1687.9869122505188 | accuracy = 0.582


Epoch[1] Batch[630] Speed: 1.2867230178186257 samples/sec                   batch loss = 1700.661205649376 | accuracy = 0.5833333333333334


Epoch[1] Batch[635] Speed: 1.2857186976895603 samples/sec                   batch loss = 1712.7574751377106 | accuracy = 0.584251968503937


Epoch[1] Batch[640] Speed: 1.2837347914736377 samples/sec                   batch loss = 1723.9648830890656 | accuracy = 0.584765625


Epoch[1] Batch[645] Speed: 1.2835732284515182 samples/sec                   batch loss = 1736.789673089981 | accuracy = 0.5852713178294574


Epoch[1] Batch[650] Speed: 1.2833116709868972 samples/sec                   batch loss = 1748.495190858841 | accuracy = 0.5869230769230769


Epoch[1] Batch[655] Speed: 1.2784083293512967 samples/sec                   batch loss = 1761.5653998851776 | accuracy = 0.5874045801526717


Epoch[1] Batch[660] Speed: 1.2818789259281105 samples/sec                   batch loss = 1774.883192062378 | accuracy = 0.5871212121212122


Epoch[1] Batch[665] Speed: 1.2818249615178592 samples/sec                   batch loss = 1786.5163085460663 | accuracy = 0.5887218045112782


Epoch[1] Batch[670] Speed: 1.2784889928445318 samples/sec                   batch loss = 1798.2770676612854 | accuracy = 0.5891791044776119


Epoch[1] Batch[675] Speed: 1.2840420207042127 samples/sec                   batch loss = 1812.9847404956818 | accuracy = 0.5888888888888889


Epoch[1] Batch[680] Speed: 1.2863400374679952 samples/sec                   batch loss = 1827.3530390262604 | accuracy = 0.5882352941176471


Epoch[1] Batch[685] Speed: 1.2823581450171928 samples/sec                   batch loss = 1840.9166662693024 | accuracy = 0.5886861313868613


Epoch[1] Batch[690] Speed: 1.2811176743015777 samples/sec                   batch loss = 1854.058836698532 | accuracy = 0.5887681159420289


Epoch[1] Batch[695] Speed: 1.2830962410742903 samples/sec                   batch loss = 1865.740556716919 | accuracy = 0.5899280575539568


Epoch[1] Batch[700] Speed: 1.2883113340834629 samples/sec                   batch loss = 1878.1173017024994 | accuracy = 0.59


Epoch[1] Batch[705] Speed: 1.2833645826074325 samples/sec                   batch loss = 1892.1010746955872 | accuracy = 0.5900709219858156


Epoch[1] Batch[710] Speed: 1.282585289256451 samples/sec                   batch loss = 1905.189710855484 | accuracy = 0.5901408450704225


Epoch[1] Batch[715] Speed: 1.279908112065624 samples/sec                   batch loss = 1918.4602332115173 | accuracy = 0.5905594405594405


Epoch[1] Batch[720] Speed: 1.2868387854981571 samples/sec                   batch loss = 1931.8005430698395 | accuracy = 0.5909722222222222


Epoch[1] Batch[725] Speed: 1.2819112479864618 samples/sec                   batch loss = 1943.4133369922638 | accuracy = 0.5924137931034483


Epoch[1] Batch[730] Speed: 1.2813660070202708 samples/sec                   batch loss = 1955.807209968567 | accuracy = 0.5921232876712329


Epoch[1] Batch[735] Speed: 1.284852801355829 samples/sec                   batch loss = 1968.7310461997986 | accuracy = 0.5928571428571429


Epoch[1] Batch[740] Speed: 1.2894045313412648 samples/sec                   batch loss = 1980.069931268692 | accuracy = 0.5939189189189189


Epoch[1] Batch[745] Speed: 1.2860831678893767 samples/sec                   batch loss = 1993.078974723816 | accuracy = 0.5939597315436241


Epoch[1] Batch[750] Speed: 1.280967918874247 samples/sec                   batch loss = 2004.7630152702332 | accuracy = 0.5946666666666667


Epoch[1] Batch[755] Speed: 1.279828538596056 samples/sec                   batch loss = 2018.254644393921 | accuracy = 0.5947019867549669


Epoch[1] Batch[760] Speed: 1.2865248892622834 samples/sec                   batch loss = 2031.858197569847 | accuracy = 0.5944078947368421


Epoch[1] Batch[765] Speed: 1.2849676422399585 samples/sec                   batch loss = 2043.6806761026382 | accuracy = 0.5954248366013072


Epoch[1] Batch[770] Speed: 1.2856901243916194 samples/sec                   batch loss = 2056.356488585472 | accuracy = 0.5961038961038961


Epoch[1] Batch[775] Speed: 1.286111364296147 samples/sec                   batch loss = 2068.806650161743 | accuracy = 0.5974193548387097


Epoch[1] Batch[780] Speed: 1.2925193085738604 samples/sec                   batch loss = 2082.964480161667 | accuracy = 0.596474358974359


Epoch[1] Batch[785] Speed: 1.2913594945528377 samples/sec                   batch loss = 2094.1428126096725 | accuracy = 0.597452229299363


[Epoch 1] training: accuracy=0.5980329949238579
[Epoch 1] time cost: 631.1943199634552
[Epoch 1] validation: validation accuracy=0.7222222222222222


Epoch[2] Batch[5] Speed: 1.2817708057976824 samples/sec                   batch loss = 13.251557350158691 | accuracy = 0.6


Epoch[2] Batch[10] Speed: 1.2782812173328952 samples/sec                   batch loss = 25.86022412776947 | accuracy = 0.625


Epoch[2] Batch[15] Speed: 1.2830453140044158 samples/sec                   batch loss = 38.72248041629791 | accuracy = 0.6333333333333333


Epoch[2] Batch[20] Speed: 1.2804516241347026 samples/sec                   batch loss = 49.68735086917877 | accuracy = 0.6875


Epoch[2] Batch[25] Speed: 1.2813597437038557 samples/sec                   batch loss = 64.42250049114227 | accuracy = 0.65


Epoch[2] Batch[30] Speed: 1.2807393917066225 samples/sec                   batch loss = 76.76492130756378 | accuracy = 0.6416666666666667


Epoch[2] Batch[35] Speed: 1.2802332456200058 samples/sec                   batch loss = 87.04260289669037 | accuracy = 0.6571428571428571


Epoch[2] Batch[40] Speed: 1.2798164325795556 samples/sec                   batch loss = 103.13300621509552 | accuracy = 0.625


Epoch[2] Batch[45] Speed: 1.2875319519000157 samples/sec                   batch loss = 115.34792411327362 | accuracy = 0.6277777777777778


Epoch[2] Batch[50] Speed: 1.2810757079499586 samples/sec                   batch loss = 128.73726451396942 | accuracy = 0.62


Epoch[2] Batch[55] Speed: 1.284741719468767 samples/sec                   batch loss = 141.444398522377 | accuracy = 0.6136363636363636


Epoch[2] Batch[60] Speed: 1.2783432604244362 samples/sec                   batch loss = 153.96537220478058 | accuracy = 0.625


Epoch[2] Batch[65] Speed: 1.2766651564311302 samples/sec                   batch loss = 166.75975835323334 | accuracy = 0.6230769230769231


Epoch[2] Batch[70] Speed: 1.2815045006756896 samples/sec                   batch loss = 180.72166693210602 | accuracy = 0.6142857142857143


Epoch[2] Batch[75] Speed: 1.2820895383237971 samples/sec                   batch loss = 193.15286207199097 | accuracy = 0.6166666666666667


Epoch[2] Batch[80] Speed: 1.2793198928060094 samples/sec                   batch loss = 204.87969434261322 | accuracy = 0.625


Epoch[2] Batch[85] Speed: 1.279838399311 samples/sec                   batch loss = 214.530193567276 | accuracy = 0.6411764705882353


Epoch[2] Batch[90] Speed: 1.2804346201801438 samples/sec                   batch loss = 227.77240002155304 | accuracy = 0.6388888888888888


Epoch[2] Batch[95] Speed: 1.2799530290674606 samples/sec                   batch loss = 239.23224914073944 | accuracy = 0.6473684210526316


Epoch[2] Batch[100] Speed: 1.2765177027146708 samples/sec                   batch loss = 251.73576414585114 | accuracy = 0.6525


Epoch[2] Batch[105] Speed: 1.2836201708400936 samples/sec                   batch loss = 264.76082956790924 | accuracy = 0.6523809523809524


Epoch[2] Batch[110] Speed: 1.2772938194757963 samples/sec                   batch loss = 277.2635453939438 | accuracy = 0.6477272727272727


Epoch[2] Batch[115] Speed: 1.2814290350498447 samples/sec                   batch loss = 289.5698356628418 | accuracy = 0.6434782608695652


Epoch[2] Batch[120] Speed: 1.278797419967764 samples/sec                   batch loss = 302.36757588386536 | accuracy = 0.6416666666666667


Epoch[2] Batch[125] Speed: 1.2811450664102932 samples/sec                   batch loss = 315.497599363327 | accuracy = 0.638


Epoch[2] Batch[130] Speed: 1.2819238834160367 samples/sec                   batch loss = 326.03071224689484 | accuracy = 0.6423076923076924


Epoch[2] Batch[135] Speed: 1.28849467482306 samples/sec                   batch loss = 337.41595101356506 | accuracy = 0.6462962962962963


Epoch[2] Batch[140] Speed: 1.2804011022129582 samples/sec                   batch loss = 350.2749812602997 | accuracy = 0.6464285714285715


Epoch[2] Batch[145] Speed: 1.2768192518454546 samples/sec                   batch loss = 360.27451598644257 | accuracy = 0.65


Epoch[2] Batch[150] Speed: 1.28024799725838 samples/sec                   batch loss = 370.70834743976593 | accuracy = 0.655


Epoch[2] Batch[155] Speed: 1.2834387054793568 samples/sec                   batch loss = 384.051616191864 | accuracy = 0.6564516129032258


Epoch[2] Batch[160] Speed: 1.2829272847216928 samples/sec                   batch loss = 395.121835231781 | accuracy = 0.6609375


Epoch[2] Batch[165] Speed: 1.2862519704272422 samples/sec                   batch loss = 406.78150272369385 | accuracy = 0.6636363636363637


Epoch[2] Batch[170] Speed: 1.2869644461184067 samples/sec                   batch loss = 419.40488374233246 | accuracy = 0.6661764705882353


Epoch[2] Batch[175] Speed: 1.283553293705911 samples/sec                   batch loss = 432.23453426361084 | accuracy = 0.6657142857142857


Epoch[2] Batch[180] Speed: 1.2922764886117917 samples/sec                   batch loss = 445.4357979297638 | accuracy = 0.6666666666666666


Epoch[2] Batch[185] Speed: 1.284263076951223 samples/sec                   batch loss = 458.1189075708389 | accuracy = 0.6621621621621622


Epoch[2] Batch[190] Speed: 1.2815310283182095 samples/sec                   batch loss = 470.48145973682404 | accuracy = 0.6605263157894737


Epoch[2] Batch[195] Speed: 1.2864545525159536 samples/sec                   batch loss = 483.14329862594604 | accuracy = 0.6602564102564102


Epoch[2] Batch[200] Speed: 1.2777379888691924 samples/sec                   batch loss = 495.58112359046936 | accuracy = 0.66


Epoch[2] Batch[205] Speed: 1.2786686713136737 samples/sec                   batch loss = 507.7508578300476 | accuracy = 0.6597560975609756


Epoch[2] Batch[210] Speed: 1.2744957442782319 samples/sec                   batch loss = 521.5952830314636 | accuracy = 0.6583333333333333


Epoch[2] Batch[215] Speed: 1.2792016701002944 samples/sec                   batch loss = 533.0874811410904 | accuracy = 0.6604651162790698


Epoch[2] Batch[220] Speed: 1.2820452549958774 samples/sec                   batch loss = 545.7043324708939 | accuracy = 0.6602272727272728


Epoch[2] Batch[225] Speed: 1.2861505061129117 samples/sec                   batch loss = 558.9896208047867 | accuracy = 0.6611111111111111


Epoch[2] Batch[230] Speed: 1.285350002650804 samples/sec                   batch loss = 569.2463971376419 | accuracy = 0.6663043478260869


Epoch[2] Batch[235] Speed: 1.2812057245340063 samples/sec                   batch loss = 580.2145233154297 | accuracy = 0.6670212765957447


Epoch[2] Batch[240] Speed: 1.2834659023753796 samples/sec                   batch loss = 590.3735233545303 | accuracy = 0.6697916666666667


Epoch[2] Batch[245] Speed: 1.2821959483832785 samples/sec                   batch loss = 604.5495647192001 | accuracy = 0.6683673469387755


Epoch[2] Batch[250] Speed: 1.28863095265984 samples/sec                   batch loss = 620.3240827322006 | accuracy = 0.666


Epoch[2] Batch[255] Speed: 1.2856906170239277 samples/sec                   batch loss = 632.8960316181183 | accuracy = 0.6656862745098039


Epoch[2] Batch[260] Speed: 1.2862462509244992 samples/sec                   batch loss = 641.7068685293198 | accuracy = 0.6711538461538461


Epoch[2] Batch[265] Speed: 1.2831332368758734 samples/sec                   batch loss = 653.2541935443878 | accuracy = 0.6716981132075471


Epoch[2] Batch[270] Speed: 1.2838198616626726 samples/sec                   batch loss = 665.6296430826187 | accuracy = 0.6703703703703704


Epoch[2] Batch[275] Speed: 1.2803604530389185 samples/sec                   batch loss = 676.6663866043091 | accuracy = 0.67


Epoch[2] Batch[280] Speed: 1.281751122845214 samples/sec                   batch loss = 689.5656399726868 | accuracy = 0.6678571428571428


Epoch[2] Batch[285] Speed: 1.2857822531992016 samples/sec                   batch loss = 701.7428847551346 | accuracy = 0.6692982456140351


Epoch[2] Batch[290] Speed: 1.2870449094412124 samples/sec                   batch loss = 713.4847983121872 | accuracy = 0.6698275862068965


Epoch[2] Batch[295] Speed: 1.2820463326511684 samples/sec                   batch loss = 727.4836493730545 | accuracy = 0.6669491525423729


Epoch[2] Batch[300] Speed: 1.2799539079098103 samples/sec                   batch loss = 737.650337934494 | accuracy = 0.6675


Epoch[2] Batch[305] Speed: 1.2786412875478388 samples/sec                   batch loss = 749.0012143850327 | accuracy = 0.6680327868852459


Epoch[2] Batch[310] Speed: 1.2854726148514806 samples/sec                   batch loss = 759.4439949989319 | accuracy = 0.6693548387096774


Epoch[2] Batch[315] Speed: 1.2788097991379026 samples/sec                   batch loss = 772.5144662857056 | accuracy = 0.6674603174603174


Epoch[2] Batch[320] Speed: 1.2801783451582534 samples/sec                   batch loss = 786.0808107852936 | accuracy = 0.6640625


Epoch[2] Batch[325] Speed: 1.285228201482762 samples/sec                   batch loss = 799.3986083269119 | accuracy = 0.6630769230769231


Epoch[2] Batch[330] Speed: 1.2842804776723935 samples/sec                   batch loss = 810.5001627206802 | accuracy = 0.6659090909090909


Epoch[2] Batch[335] Speed: 1.2884218467822968 samples/sec                   batch loss = 821.367805480957 | accuracy = 0.667910447761194


Epoch[2] Batch[340] Speed: 1.2820660246714104 samples/sec                   batch loss = 834.4638254642487 | accuracy = 0.6683823529411764


Epoch[2] Batch[345] Speed: 1.2809516836162609 samples/sec                   batch loss = 846.3425908088684 | accuracy = 0.6695652173913044


Epoch[2] Batch[350] Speed: 1.2827104147068573 samples/sec                   batch loss = 861.137977719307 | accuracy = 0.6685714285714286


Epoch[2] Batch[355] Speed: 1.282261214119198 samples/sec                   batch loss = 870.0982121229172 | accuracy = 0.6704225352112676


Epoch[2] Batch[360] Speed: 1.2855547633378326 samples/sec                   batch loss = 883.300364613533 | accuracy = 0.66875


Epoch[2] Batch[365] Speed: 1.2878823248275753 samples/sec                   batch loss = 895.0188874006271 | accuracy = 0.6684931506849315


Epoch[2] Batch[370] Speed: 1.28094728256301 samples/sec                   batch loss = 906.486034989357 | accuracy = 0.668918918918919


Epoch[2] Batch[375] Speed: 1.282178016168976 samples/sec                   batch loss = 920.5406461954117 | accuracy = 0.6686666666666666


Epoch[2] Batch[380] Speed: 1.284344186030845 samples/sec                   batch loss = 932.8559538125992 | accuracy = 0.6690789473684211


Epoch[2] Batch[385] Speed: 1.2843988544139588 samples/sec                   batch loss = 944.2301760911942 | accuracy = 0.6688311688311688


Epoch[2] Batch[390] Speed: 1.2861158009062872 samples/sec                   batch loss = 955.1660965681076 | accuracy = 0.6705128205128205


Epoch[2] Batch[395] Speed: 1.2832468871892353 samples/sec                   batch loss = 965.640496134758 | accuracy = 0.6721518987341772


Epoch[2] Batch[400] Speed: 1.2847818602194936 samples/sec                   batch loss = 977.5053132772446 | accuracy = 0.6725


Epoch[2] Batch[405] Speed: 1.2852125472267588 samples/sec                   batch loss = 988.7638431787491 | accuracy = 0.6728395061728395


Epoch[2] Batch[410] Speed: 1.2839181091650766 samples/sec                   batch loss = 1000.1368942260742 | accuracy = 0.6719512195121952


Epoch[2] Batch[415] Speed: 1.2888086420739404 samples/sec                   batch loss = 1016.7006704807281 | accuracy = 0.6686746987951807


Epoch[2] Batch[420] Speed: 1.2855584080582794 samples/sec                   batch loss = 1032.7894411087036 | accuracy = 0.6666666666666666


Epoch[2] Batch[425] Speed: 1.278359234816946 samples/sec                   batch loss = 1043.6311111450195 | accuracy = 0.6676470588235294


Epoch[2] Batch[430] Speed: 1.2843335675321714 samples/sec                   batch loss = 1052.9490944743156 | accuracy = 0.6691860465116279


Epoch[2] Batch[435] Speed: 1.2815029345023476 samples/sec                   batch loss = 1066.5134231448174 | accuracy = 0.6683908045977012


Epoch[2] Batch[440] Speed: 1.2823937259597267 samples/sec                   batch loss = 1078.3146763443947 | accuracy = 0.6676136363636364


Epoch[2] Batch[445] Speed: 1.2792775564214607 samples/sec                   batch loss = 1089.6563859581947 | accuracy = 0.6685393258426966


Epoch[2] Batch[450] Speed: 1.286149125756428 samples/sec                   batch loss = 1098.9490253329277 | accuracy = 0.6711111111111111


Epoch[2] Batch[455] Speed: 1.2841097350521624 samples/sec                   batch loss = 1109.761026918888 | accuracy = 0.671978021978022


Epoch[2] Batch[460] Speed: 1.2823198217478398 samples/sec                   batch loss = 1118.9216060042381 | accuracy = 0.6739130434782609


Epoch[2] Batch[465] Speed: 1.2789306792210706 samples/sec                   batch loss = 1129.9215500950813 | accuracy = 0.6741935483870968


Epoch[2] Batch[470] Speed: 1.2842702534627084 samples/sec                   batch loss = 1139.1934581398964 | accuracy = 0.675


Epoch[2] Batch[475] Speed: 1.2787755864250463 samples/sec                   batch loss = 1149.697173178196 | accuracy = 0.6757894736842105


Epoch[2] Batch[480] Speed: 1.281632058768685 samples/sec                   batch loss = 1159.6484132409096 | accuracy = 0.6770833333333334


Epoch[2] Batch[485] Speed: 1.2843109546478797 samples/sec                   batch loss = 1168.3276404738426 | accuracy = 0.6778350515463918


Epoch[2] Batch[490] Speed: 1.2771186102643786 samples/sec                   batch loss = 1178.054974257946 | accuracy = 0.6790816326530612


Epoch[2] Batch[495] Speed: 1.2854238626210113 samples/sec                   batch loss = 1191.862709581852 | accuracy = 0.6782828282828283


Epoch[2] Batch[500] Speed: 1.2812149215896942 samples/sec                   batch loss = 1202.7865672707558 | accuracy = 0.6795


Epoch[2] Batch[505] Speed: 1.2840536171394346 samples/sec                   batch loss = 1214.5664077401161 | accuracy = 0.6797029702970298


Epoch[2] Batch[510] Speed: 1.2815636265194168 samples/sec                   batch loss = 1228.6526837944984 | accuracy = 0.6779411764705883


Epoch[2] Batch[515] Speed: 1.2805688073422505 samples/sec                   batch loss = 1242.574370086193 | accuracy = 0.6771844660194175


Epoch[2] Batch[520] Speed: 1.2828885350376917 samples/sec                   batch loss = 1254.7964077591896 | accuracy = 0.676923076923077


Epoch[2] Batch[525] Speed: 1.2777087961163391 samples/sec                   batch loss = 1266.4549114108086 | accuracy = 0.6766666666666666


Epoch[2] Batch[530] Speed: 1.284830170184007 samples/sec                   batch loss = 1273.9207852482796 | accuracy = 0.6787735849056604


Epoch[2] Batch[535] Speed: 1.2824159773046557 samples/sec                   batch loss = 1286.3999801278114 | accuracy = 0.6785046728971963


Epoch[2] Batch[540] Speed: 1.2877084488152268 samples/sec                   batch loss = 1298.94967097044 | accuracy = 0.6782407407407407


Epoch[2] Batch[545] Speed: 1.2839293103351956 samples/sec                   batch loss = 1310.695609509945 | accuracy = 0.6779816513761467


Epoch[2] Batch[550] Speed: 1.2834985008958057 samples/sec                   batch loss = 1318.6119373440742 | accuracy = 0.6795454545454546


Epoch[2] Batch[555] Speed: 1.2834195603413239 samples/sec                   batch loss = 1330.3829558491707 | accuracy = 0.6806306306306307


Epoch[2] Batch[560] Speed: 1.2838221211862892 samples/sec                   batch loss = 1342.4937450289726 | accuracy = 0.6808035714285714


Epoch[2] Batch[565] Speed: 1.2864510999985277 samples/sec                   batch loss = 1355.4124001860619 | accuracy = 0.6800884955752212


Epoch[2] Batch[570] Speed: 1.2781531568422366 samples/sec                   batch loss = 1366.5232283473015 | accuracy = 0.6807017543859649


Epoch[2] Batch[575] Speed: 1.2818329922312461 samples/sec                   batch loss = 1379.8714115023613 | accuracy = 0.68


Epoch[2] Batch[580] Speed: 1.2898642058103802 samples/sec                   batch loss = 1390.927154123783 | accuracy = 0.680603448275862


Epoch[2] Batch[585] Speed: 1.2860666055415366 samples/sec                   batch loss = 1399.5647435188293 | accuracy = 0.6816239316239316


Epoch[2] Batch[590] Speed: 1.278273036275759 samples/sec                   batch loss = 1410.2568823099136 | accuracy = 0.6817796610169492


Epoch[2] Batch[595] Speed: 1.2828051575795398 samples/sec                   batch loss = 1418.3151071071625 | accuracy = 0.6836134453781513


Epoch[2] Batch[600] Speed: 1.2802345156152337 samples/sec                   batch loss = 1430.4317809343338 | accuracy = 0.6845833333333333


Epoch[2] Batch[605] Speed: 1.2767709594318994 samples/sec                   batch loss = 1440.8341569900513 | accuracy = 0.684297520661157


Epoch[2] Batch[610] Speed: 1.2814728842927452 samples/sec                   batch loss = 1451.0799984931946 | accuracy = 0.6848360655737705


Epoch[2] Batch[615] Speed: 1.2813390947662435 samples/sec                   batch loss = 1460.9048726558685 | accuracy = 0.6849593495934959


Epoch[2] Batch[620] Speed: 1.283440570933601 samples/sec                   batch loss = 1471.2939952611923 | accuracy = 0.6858870967741936


Epoch[2] Batch[625] Speed: 1.281762775863473 samples/sec                   batch loss = 1484.803943514824 | accuracy = 0.6856


Epoch[2] Batch[630] Speed: 1.2860898718181146 samples/sec                   batch loss = 1499.86982524395 | accuracy = 0.6841269841269841


Epoch[2] Batch[635] Speed: 1.2869227868542112 samples/sec                   batch loss = 1513.7064901590347 | accuracy = 0.6834645669291338


Epoch[2] Batch[640] Speed: 1.2810704256669543 samples/sec                   batch loss = 1525.0227110385895 | accuracy = 0.683203125


Epoch[2] Batch[645] Speed: 1.279395305334736 samples/sec                   batch loss = 1537.3965497016907 | accuracy = 0.6833333333333333


Epoch[2] Batch[650] Speed: 1.2803884967843409 samples/sec                   batch loss = 1547.4474363327026 | accuracy = 0.6838461538461539


Epoch[2] Batch[655] Speed: 1.2796364314216144 samples/sec                   batch loss = 1557.2918573617935 | accuracy = 0.684351145038168


Epoch[2] Batch[660] Speed: 1.2773012100440246 samples/sec                   batch loss = 1570.4783668518066 | accuracy = 0.6833333333333333


Epoch[2] Batch[665] Speed: 1.2793901344634913 samples/sec                   batch loss = 1583.3273077011108 | accuracy = 0.6830827067669173


Epoch[2] Batch[670] Speed: 1.2844383837338311 samples/sec                   batch loss = 1597.4868162870407 | accuracy = 0.6832089552238806


Epoch[2] Batch[675] Speed: 1.2859667475677552 samples/sec                   batch loss = 1612.0063499212265 | accuracy = 0.6818518518518518


Epoch[2] Batch[680] Speed: 1.2886444137634392 samples/sec                   batch loss = 1623.0114551782608 | accuracy = 0.6816176470588236


Epoch[2] Batch[685] Speed: 1.2797984692351911 samples/sec                   batch loss = 1633.3045001029968 | accuracy = 0.6817518248175183


Epoch[2] Batch[690] Speed: 1.2911251583391437 samples/sec                   batch loss = 1650.4646298885345 | accuracy = 0.6800724637681159


Epoch[2] Batch[695] Speed: 1.2831014419466977 samples/sec                   batch loss = 1661.0523809194565 | accuracy = 0.681294964028777


Epoch[2] Batch[700] Speed: 1.2794946332003725 samples/sec                   batch loss = 1672.8432260751724 | accuracy = 0.6810714285714285


Epoch[2] Batch[705] Speed: 1.2840148976566743 samples/sec                   batch loss = 1685.5440980196 | accuracy = 0.6804964539007092


Epoch[2] Batch[710] Speed: 1.2844301236748576 samples/sec                   batch loss = 1698.0761951208115 | accuracy = 0.6792253521126761


Epoch[2] Batch[715] Speed: 1.2792358080766877 samples/sec                   batch loss = 1706.7446212768555 | accuracy = 0.6800699300699301


Epoch[2] Batch[720] Speed: 1.290090049079232 samples/sec                   batch loss = 1723.5420136451721 | accuracy = 0.6784722222222223


Epoch[2] Batch[725] Speed: 1.2807028271053595 samples/sec                   batch loss = 1734.5919030308723 | accuracy = 0.6786206896551724


Epoch[2] Batch[730] Speed: 1.2799419948162138 samples/sec                   batch loss = 1748.2047676444054 | accuracy = 0.678082191780822


Epoch[2] Batch[735] Speed: 1.282233186277492 samples/sec                   batch loss = 1757.6420540809631 | accuracy = 0.6795918367346939


Epoch[2] Batch[740] Speed: 1.2825498937745567 samples/sec                   batch loss = 1771.1066763401031 | accuracy = 0.6787162162162163


Epoch[2] Batch[745] Speed: 1.2867471960477574 samples/sec                   batch loss = 1781.6349058151245 | accuracy = 0.6791946308724832


Epoch[2] Batch[750] Speed: 1.2806124022321466 samples/sec                   batch loss = 1793.034314274788 | accuracy = 0.6793333333333333


Epoch[2] Batch[755] Speed: 1.2820965925877017 samples/sec                   batch loss = 1803.3268274068832 | accuracy = 0.6798013245033112


Epoch[2] Batch[760] Speed: 1.2829272847216928 samples/sec                   batch loss = 1812.347177028656 | accuracy = 0.6805921052631579


Epoch[2] Batch[765] Speed: 1.2783014756393634 samples/sec                   batch loss = 1825.6163952350616 | accuracy = 0.6803921568627451


Epoch[2] Batch[770] Speed: 1.2777758441250828 samples/sec                   batch loss = 1838.6570522785187 | accuracy = 0.6805194805194805


Epoch[2] Batch[775] Speed: 1.282922870082581 samples/sec                   batch loss = 1849.657478094101 | accuracy = 0.6806451612903226


Epoch[2] Batch[780] Speed: 1.278704242725229 samples/sec                   batch loss = 1860.7270594835281 | accuracy = 0.6810897435897436


Epoch[2] Batch[785] Speed: 1.2827068841870586 samples/sec                   batch loss = 1871.5304771661758 | accuracy = 0.6818471337579618


[Epoch 2] training: accuracy=0.6817893401015228
[Epoch 2] time cost: 630.2815062999725
[Epoch 2] validation: validation accuracy=0.7777777777777778
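
The per-epoch summary lines are easy to lose among the batch-level output above. As a minimal sketch, the snippet below pulls those summary lines out of captured log text so the two epochs can be compared side by side. The `SUMMARY` pattern and the `summarize` helper are hypothetical names introduced for illustration and are not part of MXNet. The log text is copied from the output above.

```python
import re

# Epoch summary lines copied from the training output above.
log_text = """\
[Epoch 1] training: accuracy=0.5980329949238579
[Epoch 1] time cost: 631.1943199634552
[Epoch 1] validation: validation accuracy=0.7222222222222222
[Epoch 2] training: accuracy=0.6817893401015228
[Epoch 2] time cost: 630.2815062999725
[Epoch 2] validation: validation accuracy=0.7777777777777778
"""

# Hypothetical pattern: matches only "[Epoch N] <label>: ... <number>" lines,
# so the per-batch "Epoch[N] Batch[M] ..." lines are ignored.
SUMMARY = re.compile(
    r"^\[Epoch (\d+)\] (training|time cost|validation):.*?([0-9.]+)$"
)

def summarize(text):
    """Return {epoch: {label: value}} for the per-epoch summary lines."""
    summary = {}
    for line in text.splitlines():
        match = SUMMARY.match(line.strip())
        if match:
            epoch, label, value = match.groups()
            summary.setdefault(int(epoch), {})[label] = float(value)
    return summary

epochs = summarize(log_text)
for epoch, stats in sorted(epochs.items()):
    print(f"epoch {epoch}: train={stats['training']:.3f} "
          f"val={stats['validation']:.3f} time={stats['time cost']:.0f}s")
```

The same helper could be pointed at a saved log file instead of the inline string, assuming the log keeps the line format shown in this tutorial.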


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen on studying more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).