<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or run inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, use `npx.gpu()` or, equivalently, `npx.gpu(0)`, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[21:37:55] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. In the example code here, you will copy `x` to the second GPU, `npx.gpu(1)`. If you have fewer than two GPUs, the code falls back to `npx.cpu()`, which is what the output shown here reflects.

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[21:37:55] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids

If you have multiple GPUs on your machine, MXNet can access each of them through 0-based indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has, so if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to use multiple GPUs while training neural networks.
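The 0-based indexing described above can be sketched in plain Python. This is only an illustration: `select_gpu_ids` is a hypothetical helper, not part of MXNet, and the GPU counts passed to it are made-up values standing in for what `npx.num_gpus()` would report on a real machine.

```python
def select_gpu_ids(num_available, num_requested):
    """Return the 0-indexed ids of the GPUs to use (hypothetical helper)."""
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # Never ask for more GPUs than the machine exposes.
    count = min(num_available, num_requested)
    return list(range(count))


# On an eight-GPU machine, asking for all eight gives ids 0 through 7.
all_ids = select_gpu_ids(num_available=8, num_requested=8)

# Asking for two GPUs on a one-GPU machine is clamped to id 0 only.
clamped_ids = select_gpu_ids(num_available=1, num_requested=2)

# With no GPU the list is empty, so the caller can fall back to the CPU.
no_ids = select_gpu_ids(num_available=0, num_requested=2)

print(all_ids, clamped_ids, no_ids)
```

In an MXNet program, each id in the returned list would be passed to `npx.gpu(i)` to obtain the corresponding device, and an empty list would signal a fallback to `npx.cpu()`.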

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[21:37:55] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 6.378095, -4.610625]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how to use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, assuming there are *n* GPUs, you split each data batch into *n* parts and have each GPU run the forward and backward passes on its own part of the data.
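To make the idea concrete, here is a minimal pure-Python sketch of data parallelism on a toy one-parameter model. Everything in it is an assumption made for illustration: `split_batch` and `grad_sum` are hypothetical helpers (not MXNet functions), the "devices" are simulated by plain lists, and the data values are invented. It only shows that summing the gradients computed on each part of a batch gives the same result as computing the gradient on the whole batch.

```python
def split_batch(batch, num_parts):
    """Split a list of samples into num_parts contiguous, near-equal shards."""
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    base, extra = divmod(len(batch), num_parts)
    shards, start = [], 0
    for i in range(num_parts):
        # The first `extra` shards take one additional sample each.
        size = base + (1 if i < extra else 0)
        shards.append(batch[start:start + size])
        start += size
    return shards


def grad_sum(shard, w):
    """Sum of d/dw (w * x - y)**2 over the (x, y) pairs in one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard)


# Toy batch of (input, target) pairs and a single scalar weight.
batch = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.0)]
w = 1.0

# Each shard would live on its own GPU; here they are just Python lists.
shards = split_batch(batch, num_parts=2)
per_device = [grad_sum(shard, w) for shard in shards]

# Combine the per-device gradients and normalize by the full batch size.
combined = sum(per_device) / len(batch)
full = grad_sum(batch, w) / len(batch)

print(per_device, combined, full)
```

Because the gradient of a summed loss is the sum of the per-sample gradients, the combined value matches the full-batch value, which is why splitting a batch across devices does not change the update that a single device would compute.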

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7762512465804345 samples/sec                   batch loss = 12.845285892486572 | accuracy = 0.7


Epoch[1] Batch[10] Speed: 1.258635027623671 samples/sec                   batch loss = 26.932134866714478 | accuracy = 0.65


Epoch[1] Batch[15] Speed: 1.2537094248785743 samples/sec                   batch loss = 42.12915897369385 | accuracy = 0.5666666666666667


Epoch[1] Batch[20] Speed: 1.2582337616287818 samples/sec                   batch loss = 57.19399881362915 | accuracy = 0.55


Epoch[1] Batch[25] Speed: 1.2571616716670488 samples/sec                   batch loss = 72.32446074485779 | accuracy = 0.51


Epoch[1] Batch[30] Speed: 1.2591897263089509 samples/sec                   batch loss = 87.05168223381042 | accuracy = 0.5083333333333333


Epoch[1] Batch[35] Speed: 1.2518768957553734 samples/sec                   batch loss = 101.99072647094727 | accuracy = 0.4785714285714286


Epoch[1] Batch[40] Speed: 1.252764660021432 samples/sec                   batch loss = 116.67025065422058 | accuracy = 0.475


Epoch[1] Batch[45] Speed: 1.2579230059361626 samples/sec                   batch loss = 130.77496790885925 | accuracy = 0.4722222222222222


Epoch[1] Batch[50] Speed: 1.2592297984109906 samples/sec                   batch loss = 144.73187708854675 | accuracy = 0.48


Epoch[1] Batch[55] Speed: 1.261264821005243 samples/sec                   batch loss = 158.29814529418945 | accuracy = 0.4863636363636364


Epoch[1] Batch[60] Speed: 1.2555840665836406 samples/sec                   batch loss = 172.53113436698914 | accuracy = 0.48333333333333334


Epoch[1] Batch[65] Speed: 1.2473630213855655 samples/sec                   batch loss = 186.10998463630676 | accuracy = 0.4846153846153846


Epoch[1] Batch[70] Speed: 1.2462772431605873 samples/sec                   batch loss = 200.62203407287598 | accuracy = 0.4785714285714286


Epoch[1] Batch[75] Speed: 1.2516469572919116 samples/sec                   batch loss = 215.24292516708374 | accuracy = 0.46


Epoch[1] Batch[80] Speed: 1.2518775496407417 samples/sec                   batch loss = 228.33461952209473 | accuracy = 0.478125


Epoch[1] Batch[85] Speed: 1.2475007550937038 samples/sec                   batch loss = 242.3097128868103 | accuracy = 0.4852941176470588


Epoch[1] Batch[90] Speed: 1.2518231860754465 samples/sec                   batch loss = 255.48023533821106 | accuracy = 0.49722222222222223


Epoch[1] Batch[95] Speed: 1.258132801080344 samples/sec                   batch loss = 269.27763175964355 | accuracy = 0.5


Epoch[1] Batch[100] Speed: 1.2536973395236504 samples/sec                   batch loss = 283.5467896461487 | accuracy = 0.4925


Epoch[1] Batch[105] Speed: 1.2488145086603057 samples/sec                   batch loss = 296.8028562068939 | accuracy = 0.5


Epoch[1] Batch[110] Speed: 1.2489754356934077 samples/sec                   batch loss = 310.86724281311035 | accuracy = 0.4954545454545455


Epoch[1] Batch[115] Speed: 1.2526597116836622 samples/sec                   batch loss = 324.5368468761444 | accuracy = 0.49782608695652175


Epoch[1] Batch[120] Speed: 1.2518522354344312 samples/sec                   batch loss = 338.7569327354431 | accuracy = 0.5


Epoch[1] Batch[125] Speed: 1.2530377764702596 samples/sec                   batch loss = 352.04032039642334 | accuracy = 0.504


Epoch[1] Batch[130] Speed: 1.2511579419466312 samples/sec                   batch loss = 366.3262622356415 | accuracy = 0.5


Epoch[1] Batch[135] Speed: 1.2534893018452586 samples/sec                   batch loss = 380.31596422195435 | accuracy = 0.5018518518518519


Epoch[1] Batch[140] Speed: 1.2587036771950169 samples/sec                   batch loss = 394.56033396720886 | accuracy = 0.5017857142857143


Epoch[1] Batch[145] Speed: 1.2551780759565607 samples/sec                   batch loss = 408.5103635787964 | accuracy = 0.5017241379310344


Epoch[1] Batch[150] Speed: 1.2559489494055298 samples/sec                   batch loss = 422.13892102241516 | accuracy = 0.505


Epoch[1] Batch[155] Speed: 1.2554335507659078 samples/sec                   batch loss = 435.7014663219452 | accuracy = 0.5080645161290323


Epoch[1] Batch[160] Speed: 1.2504664685960833 samples/sec                   batch loss = 449.3415832519531 | accuracy = 0.5171875


Epoch[1] Batch[165] Speed: 1.253135393721228 samples/sec                   batch loss = 463.3342332839966 | accuracy = 0.5136363636363637


Epoch[1] Batch[170] Speed: 1.2534134475628007 samples/sec                   batch loss = 477.28476095199585 | accuracy = 0.5102941176470588


Epoch[1] Batch[175] Speed: 1.24985284973136 samples/sec                   batch loss = 490.65964007377625 | accuracy = 0.51


Epoch[1] Batch[180] Speed: 1.2496025261949872 samples/sec                   batch loss = 504.0242462158203 | accuracy = 0.5152777777777777


Epoch[1] Batch[185] Speed: 1.2454292473976194 samples/sec                   batch loss = 518.0246257781982 | accuracy = 0.5162162162162162


Epoch[1] Batch[190] Speed: 1.2449303892002845 samples/sec                   batch loss = 531.2519941329956 | accuracy = 0.5223684210526316


Epoch[1] Batch[195] Speed: 1.2478062892881556 samples/sec                   batch loss = 545.4101161956787 | accuracy = 0.5205128205128206


Epoch[1] Batch[200] Speed: 1.2437510967101328 samples/sec                   batch loss = 559.0729105472565 | accuracy = 0.52


Epoch[1] Batch[205] Speed: 1.2474153287875172 samples/sec                   batch loss = 572.8998942375183 | accuracy = 0.5219512195121951


Epoch[1] Batch[210] Speed: 1.2506647402140303 samples/sec                   batch loss = 586.1198236942291 | accuracy = 0.5261904761904762


Epoch[1] Batch[215] Speed: 1.248752138507955 samples/sec                   batch loss = 599.7089805603027 | accuracy = 0.5279069767441861


Epoch[1] Batch[220] Speed: 1.2421192115543067 samples/sec                   batch loss = 613.4604806900024 | accuracy = 0.5295454545454545


Epoch[1] Batch[225] Speed: 1.247466434759022 samples/sec                   batch loss = 626.7685983181 | accuracy = 0.5333333333333333


Epoch[1] Batch[230] Speed: 1.2468284205740316 samples/sec                   batch loss = 640.5300769805908 | accuracy = 0.533695652173913


Epoch[1] Batch[235] Speed: 1.2474468637379814 samples/sec                   batch loss = 654.0772085189819 | accuracy = 0.5361702127659574


Epoch[1] Batch[240] Speed: 1.2475390662382888 samples/sec                   batch loss = 668.0172591209412 | accuracy = 0.5364583333333334


Epoch[1] Batch[245] Speed: 1.2493326726707137 samples/sec                   batch loss = 681.5168790817261 | accuracy = 0.5357142857142857


Epoch[1] Batch[250] Speed: 1.2471519814683198 samples/sec                   batch loss = 695.3438229560852 | accuracy = 0.535


Epoch[1] Batch[255] Speed: 1.2608556227296794 samples/sec                   batch loss = 709.1079790592194 | accuracy = 0.5333333333333333


Epoch[1] Batch[260] Speed: 1.251429518095071 samples/sec                   batch loss = 723.0429673194885 | accuracy = 0.5326923076923077


Epoch[1] Batch[265] Speed: 1.2542599770771745 samples/sec                   batch loss = 736.5874032974243 | accuracy = 0.5330188679245284


Epoch[1] Batch[270] Speed: 1.2500722561818618 samples/sec                   batch loss = 750.6144046783447 | accuracy = 0.5314814814814814


Epoch[1] Batch[275] Speed: 1.2546880808709608 samples/sec                   batch loss = 763.8912878036499 | accuracy = 0.5354545454545454


Epoch[1] Batch[280] Speed: 1.2529154721978746 samples/sec                   batch loss = 778.1081323623657 | accuracy = 0.5366071428571428


Epoch[1] Batch[285] Speed: 1.2517641574479232 samples/sec                   batch loss = 791.6752586364746 | accuracy = 0.5385964912280702


Epoch[1] Batch[290] Speed: 1.2584343154425237 samples/sec                   batch loss = 804.8615162372589 | accuracy = 0.5405172413793103


Epoch[1] Batch[295] Speed: 1.2527981499119423 samples/sec                   batch loss = 818.7176141738892 | accuracy = 0.5389830508474577


Epoch[1] Batch[300] Speed: 1.2523861990713785 samples/sec                   batch loss = 831.9858627319336 | accuracy = 0.5391666666666667


Epoch[1] Batch[305] Speed: 1.2512903555895123 samples/sec                   batch loss = 845.257654428482 | accuracy = 0.5409836065573771


Epoch[1] Batch[310] Speed: 1.251417663342326 samples/sec                   batch loss = 859.0507869720459 | accuracy = 0.5403225806451613


Epoch[1] Batch[315] Speed: 1.2558146080115404 samples/sec                   batch loss = 872.3863577842712 | accuracy = 0.5412698412698412


Epoch[1] Batch[320] Speed: 1.2540301935069922 samples/sec                   batch loss = 885.1313261985779 | accuracy = 0.5453125


Epoch[1] Batch[325] Speed: 1.2506697747265383 samples/sec                   batch loss = 899.1899788379669 | accuracy = 0.5430769230769231


Epoch[1] Batch[330] Speed: 1.2594463642052491 samples/sec                   batch loss = 913.057629108429 | accuracy = 0.5416666666666666


Epoch[1] Batch[335] Speed: 1.259366384050383 samples/sec                   batch loss = 926.2237181663513 | accuracy = 0.5432835820895522


Epoch[1] Batch[340] Speed: 1.2498840424709257 samples/sec                   batch loss = 939.4282968044281 | accuracy = 0.5448529411764705


Epoch[1] Batch[345] Speed: 1.245068879764125 samples/sec                   batch loss = 953.6247954368591 | accuracy = 0.5456521739130434


Epoch[1] Batch[350] Speed: 1.243949826841329 samples/sec                   batch loss = 968.042219877243 | accuracy = 0.5435714285714286


Epoch[1] Batch[355] Speed: 1.2503518409157373 samples/sec                   batch loss = 981.5110244750977 | accuracy = 0.5471830985915493


Epoch[1] Batch[360] Speed: 1.2520697281244362 samples/sec                   batch loss = 993.9002656936646 | accuracy = 0.5493055555555556


Epoch[1] Batch[365] Speed: 1.245990316350618 samples/sec                   batch loss = 1007.6010887622833 | accuracy = 0.547945205479452


Epoch[1] Batch[370] Speed: 1.2491111077359165 samples/sec                   batch loss = 1021.1696422100067 | accuracy = 0.5472972972972973


Epoch[1] Batch[375] Speed: 1.2517204499629757 samples/sec                   batch loss = 1034.3963124752045 | accuracy = 0.548


Epoch[1] Batch[380] Speed: 1.2493161130438928 samples/sec                   batch loss = 1047.358692407608 | accuracy = 0.5480263157894737


Epoch[1] Batch[385] Speed: 1.2477352042055208 samples/sec                   batch loss = 1060.3077671527863 | accuracy = 0.5493506493506494


Epoch[1] Batch[390] Speed: 1.2497206470455775 samples/sec                   batch loss = 1074.7618339061737 | accuracy = 0.5487179487179488


Epoch[1] Batch[395] Speed: 1.2518609224821016 samples/sec                   batch loss = 1088.1318945884705 | accuracy = 0.549367088607595


Epoch[1] Batch[400] Speed: 1.253038057226737 samples/sec                   batch loss = 1101.468092918396 | accuracy = 0.55


Epoch[1] Batch[405] Speed: 1.2529664683992436 samples/sec                   batch loss = 1114.911836385727 | accuracy = 0.55


Epoch[1] Batch[410] Speed: 1.251529125558153 samples/sec                   batch loss = 1129.2338736057281 | accuracy = 0.5493902439024391


Epoch[1] Batch[415] Speed: 1.2551899081546791 samples/sec                   batch loss = 1142.711550951004 | accuracy = 0.5481927710843374


Epoch[1] Batch[420] Speed: 1.2576715138033612 samples/sec                   batch loss = 1156.298921585083 | accuracy = 0.5488095238095239


Epoch[1] Batch[425] Speed: 1.2514717116357472 samples/sec                   batch loss = 1169.3256583213806 | accuracy = 0.5494117647058824


Epoch[1] Batch[430] Speed: 1.2516056856498172 samples/sec                   batch loss = 1184.2093987464905 | accuracy = 0.5476744186046512


Epoch[1] Batch[435] Speed: 1.2453750726434696 samples/sec                   batch loss = 1198.4405362606049 | accuracy = 0.5465517241379311


Epoch[1] Batch[440] Speed: 1.2529747966135987 samples/sec                   batch loss = 1212.3859922885895 | accuracy = 0.5454545454545454


Epoch[1] Batch[445] Speed: 1.2478702356619868 samples/sec                   batch loss = 1226.1987845897675 | accuracy = 0.5460674157303371


Epoch[1] Batch[450] Speed: 1.2502097504076337 samples/sec                   batch loss = 1239.603642463684 | accuracy = 0.5472222222222223


Epoch[1] Batch[455] Speed: 1.24793251774824 samples/sec                   batch loss = 1253.2442252635956 | accuracy = 0.5478021978021979


Epoch[1] Batch[460] Speed: 1.2487300176555662 samples/sec                   batch loss = 1266.7780485153198 | accuracy = 0.5478260869565217


Epoch[1] Batch[465] Speed: 1.2481519933014151 samples/sec                   batch loss = 1279.347666501999 | accuracy = 0.55


Epoch[1] Batch[470] Speed: 1.254488250231629 samples/sec                   batch loss = 1292.8541626930237 | accuracy = 0.5521276595744681


Epoch[1] Batch[475] Speed: 1.2581539354056472 samples/sec                   batch loss = 1306.8569648265839 | accuracy = 0.5521052631578948


Epoch[1] Batch[480] Speed: 1.2601446838081793 samples/sec                   batch loss = 1320.6958394050598 | accuracy = 0.5536458333333333


Epoch[1] Batch[485] Speed: 1.2595819566100361 samples/sec                   batch loss = 1333.709733247757 | accuracy = 0.554639175257732


Epoch[1] Batch[490] Speed: 1.263531824595693 samples/sec                   batch loss = 1347.702862739563 | accuracy = 0.5530612244897959


Epoch[1] Batch[495] Speed: 1.260286106878578 samples/sec                   batch loss = 1361.1142945289612 | accuracy = 0.554040404040404


Epoch[1] Batch[500] Speed: 1.2564165950298007 samples/sec                   batch loss = 1374.0879204273224 | accuracy = 0.5555


Epoch[1] Batch[505] Speed: 1.252525418428214 samples/sec                   batch loss = 1388.7411494255066 | accuracy = 0.5544554455445545


Epoch[1] Batch[510] Speed: 1.2563949545050395 samples/sec                   batch loss = 1401.663215637207 | accuracy = 0.5563725490196079


Epoch[1] Batch[515] Speed: 1.252879543422131 samples/sec                   batch loss = 1415.573972940445 | accuracy = 0.5567961165048544


Epoch[1] Batch[520] Speed: 1.2534510926147002 samples/sec                   batch loss = 1428.6511719226837 | accuracy = 0.5586538461538462


Epoch[1] Batch[525] Speed: 1.260083353549586 samples/sec                   batch loss = 1443.1879105567932 | accuracy = 0.5571428571428572


Epoch[1] Batch[530] Speed: 1.2493161130438928 samples/sec                   batch loss = 1455.356122970581 | accuracy = 0.5584905660377358


Epoch[1] Batch[535] Speed: 1.255943966321342 samples/sec                   batch loss = 1468.3298077583313 | accuracy = 0.5593457943925234


Epoch[1] Batch[540] Speed: 1.2589386721590914 samples/sec                   batch loss = 1480.8122363090515 | accuracy = 0.562037037037037


Epoch[1] Batch[545] Speed: 1.2542861389297832 samples/sec                   batch loss = 1493.8271596431732 | accuracy = 0.5619266055045872


Epoch[1] Batch[550] Speed: 1.2444563916258369 samples/sec                   batch loss = 1507.2396488189697 | accuracy = 0.5627272727272727


Epoch[1] Batch[555] Speed: 1.2455464880859448 samples/sec                   batch loss = 1520.2504289150238 | accuracy = 0.5626126126126126


Epoch[1] Batch[560] Speed: 1.2425632690376653 samples/sec                   batch loss = 1532.7787549495697 | accuracy = 0.5629464285714286


Epoch[1] Batch[565] Speed: 1.2440298901453517 samples/sec                   batch loss = 1546.7865142822266 | accuracy = 0.5632743362831858


Epoch[1] Batch[570] Speed: 1.2432293520766562 samples/sec                   batch loss = 1559.3098323345184 | accuracy = 0.5649122807017544


Epoch[1] Batch[575] Speed: 1.246730393731963 samples/sec                   batch loss = 1573.181351184845 | accuracy = 0.5652173913043478


Epoch[1] Batch[580] Speed: 1.2452464033832382 samples/sec                   batch loss = 1586.6992695331573 | accuracy = 0.565948275862069


Epoch[1] Batch[585] Speed: 1.246830737084707 samples/sec                   batch loss = 1599.7808585166931 | accuracy = 0.5658119658119658


Epoch[1] Batch[590] Speed: 1.2453770139781508 samples/sec                   batch loss = 1612.1385457515717 | accuracy = 0.5677966101694916


Epoch[1] Batch[595] Speed: 1.2467711591664181 samples/sec                   batch loss = 1626.5941834449768 | accuracy = 0.5663865546218487


Epoch[1] Batch[600] Speed: 1.2470297111391018 samples/sec                   batch loss = 1638.75914311409 | accuracy = 0.56875


Epoch[1] Batch[605] Speed: 1.2507269286084688 samples/sec                   batch loss = 1651.2436010837555 | accuracy = 0.5690082644628099


Epoch[1] Batch[610] Speed: 1.246744105444197 samples/sec                   batch loss = 1664.3153772354126 | accuracy = 0.5688524590163935


Epoch[1] Batch[615] Speed: 1.2472095560133876 samples/sec                   batch loss = 1676.6608768701553 | accuracy = 0.5699186991869919


Epoch[1] Batch[620] Speed: 1.2496789438326565 samples/sec                   batch loss = 1692.5243266820908 | accuracy = 0.5681451612903226


Epoch[1] Batch[625] Speed: 1.2474063323439666 samples/sec                   batch loss = 1704.6837912797928 | accuracy = 0.5696


Epoch[1] Batch[630] Speed: 1.24819591624897 samples/sec                   batch loss = 1719.0771397352219 | accuracy = 0.569047619047619


Epoch[1] Batch[635] Speed: 1.2598194554859636 samples/sec                   batch loss = 1731.6353250741959 | accuracy = 0.5700787401574803


Epoch[1] Batch[640] Speed: 1.2564181004853439 samples/sec                   batch loss = 1743.7663263082504 | accuracy = 0.571484375


Epoch[1] Batch[645] Speed: 1.2531733028896637 samples/sec                   batch loss = 1756.796413064003 | accuracy = 0.5724806201550388


Epoch[1] Batch[650] Speed: 1.2530424557613071 samples/sec                   batch loss = 1770.8880680799484 | accuracy = 0.5723076923076923


Epoch[1] Batch[655] Speed: 1.2518019836967147 samples/sec                   batch loss = 1783.0666776895523 | accuracy = 0.5729007633587786


Epoch[1] Batch[660] Speed: 1.2534521227349693 samples/sec                   batch loss = 1795.598004937172 | accuracy = 0.5727272727272728


Epoch[1] Batch[665] Speed: 1.2540030113025498 samples/sec                   batch loss = 1809.2328544855118 | accuracy = 0.5733082706766918


Epoch[1] Batch[670] Speed: 1.2567877048686655 samples/sec                   batch loss = 1821.617028594017 | accuracy = 0.575


Epoch[1] Batch[675] Speed: 1.2504290957994337 samples/sec                   batch loss = 1835.0147346258163 | accuracy = 0.5755555555555556


Epoch[1] Batch[680] Speed: 1.2538582156251128 samples/sec                   batch loss = 1847.529564499855 | accuracy = 0.5761029411764705


Epoch[1] Batch[685] Speed: 1.2556950504831261 samples/sec                   batch loss = 1860.8221682310104 | accuracy = 0.577007299270073


Epoch[1] Batch[690] Speed: 1.2557833065530655 samples/sec                   batch loss = 1872.3375252485275 | accuracy = 0.5782608695652174


Epoch[1] Batch[695] Speed: 1.2502045332725464 samples/sec                   batch loss = 1883.959833264351 | accuracy = 0.5794964028776979


Epoch[1] Batch[700] Speed: 1.246569396324016 samples/sec                   batch loss = 1895.6116777658463 | accuracy = 0.5807142857142857


Epoch[1] Batch[705] Speed: 1.2483147930057772 samples/sec                   batch loss = 1908.5085524320602 | accuracy = 0.5808510638297872


Epoch[1] Batch[710] Speed: 1.2462388243299054 samples/sec                   batch loss = 1920.4035543203354 | accuracy = 0.5816901408450704


Epoch[1] Batch[715] Speed: 1.244276878411529 samples/sec                   batch loss = 1933.1182922124863 | accuracy = 0.5818181818181818


Epoch[1] Batch[720] Speed: 1.2480541296059668 samples/sec                   batch loss = 1946.3088850975037 | accuracy = 0.5826388888888889


Epoch[1] Batch[725] Speed: 1.249075954782201 samples/sec                   batch loss = 1959.029657125473 | accuracy = 0.5834482758620689


Epoch[1] Batch[730] Speed: 1.245445796598463 samples/sec                   batch loss = 1971.03981757164 | accuracy = 0.5842465753424657


Epoch[1] Batch[735] Speed: 1.245367954468087 samples/sec                   batch loss = 1985.4727065563202 | accuracy = 0.5836734693877551


Epoch[1] Batch[740] Speed: 1.2516250138854208 samples/sec                   batch loss = 1997.7195498943329 | accuracy = 0.5844594594594594


Epoch[1] Batch[745] Speed: 1.24783004798844 samples/sec                   batch loss = 2010.1072206497192 | accuracy = 0.5848993288590604


Epoch[1] Batch[750] Speed: 1.2535916728572092 samples/sec                   batch loss = 2024.7103147506714 | accuracy = 0.5846666666666667


Epoch[1] Batch[755] Speed: 1.2506107615992523 samples/sec                   batch loss = 2036.2115975618362 | accuracy = 0.5847682119205299


Epoch[1] Batch[760] Speed: 1.2506927102408847 samples/sec                   batch loss = 2050.9389671087265 | accuracy = 0.5848684210526316


Epoch[1] Batch[765] Speed: 1.25117697643258 samples/sec                   batch loss = 2065.7777166366577 | accuracy = 0.5849673202614379


Epoch[1] Batch[770] Speed: 1.2510472921405011 samples/sec                   batch loss = 2078.490462779999 | accuracy = 0.5850649350649351


Epoch[1] Batch[775] Speed: 1.2481831012007014 samples/sec                   batch loss = 2090.7649190425873 | accuracy = 0.5858064516129032


Epoch[1] Batch[780] Speed: 1.2526964696620189 samples/sec                   batch loss = 2103.5300369262695 | accuracy = 0.5862179487179487


Epoch[1] Batch[785] Speed: 1.252271873478952 samples/sec                   batch loss = 2116.0045642852783 | accuracy = 0.5878980891719745


[Epoch 1] training: accuracy=0.5881979695431472
[Epoch 1] time cost: 648.0371339321136
[Epoch 1] validation: validation accuracy=0.6788888888888889


Epoch[2] Batch[5] Speed: 1.245613347350665 samples/sec                   batch loss = 12.991774320602417 | accuracy = 0.5


Epoch[2] Batch[10] Speed: 1.253112836514388 samples/sec                   batch loss = 25.8602511882782 | accuracy = 0.525


Epoch[2] Batch[15] Speed: 1.2477114491167183 samples/sec                   batch loss = 41.21578884124756 | accuracy = 0.5666666666666667


Epoch[2] Batch[20] Speed: 1.2492784368769558 samples/sec                   batch loss = 55.22869944572449 | accuracy = 0.5375


Epoch[2] Batch[25] Speed: 1.2492092302842184 samples/sec                   batch loss = 68.22726154327393 | accuracy = 0.58


Epoch[2] Batch[30] Speed: 1.246624971857118 samples/sec                   batch loss = 80.59114336967468 | accuracy = 0.6083333333333333


Epoch[2] Batch[35] Speed: 1.2405297600936394 samples/sec                   batch loss = 95.55366158485413 | accuracy = 0.6071428571428571


Epoch[2] Batch[40] Speed: 1.2532800224940586 samples/sec                   batch loss = 110.05313301086426 | accuracy = 0.6


Epoch[2] Batch[45] Speed: 1.2444972854256386 samples/sec                   batch loss = 124.14902305603027 | accuracy = 0.6


Epoch[2] Batch[50] Speed: 1.2499043418660787 samples/sec                   batch loss = 135.95464754104614 | accuracy = 0.605


Epoch[2] Batch[55] Speed: 1.2449653091992603 samples/sec                   batch loss = 148.61803019046783 | accuracy = 0.6045454545454545


Epoch[2] Batch[60] Speed: 1.2472484056100404 samples/sec                   batch loss = 161.21175491809845 | accuracy = 0.6083333333333333


Epoch[2] Batch[65] Speed: 1.249884601161051 samples/sec                   batch loss = 173.48492991924286 | accuracy = 0.6115384615384616


Epoch[2] Batch[70] Speed: 1.2461936505182292 samples/sec                   batch loss = 185.40691924095154 | accuracy = 0.6214285714285714


Epoch[2] Batch[75] Speed: 1.2461638450014076 samples/sec                   batch loss = 197.63465762138367 | accuracy = 0.6266666666666667


Epoch[2] Batch[80] Speed: 1.2500709521808753 samples/sec                   batch loss = 211.5373055934906 | accuracy = 0.621875


Epoch[2] Batch[85] Speed: 1.2521548585960924 samples/sec                   batch loss = 223.4416365623474 | accuracy = 0.6235294117647059


Epoch[2] Batch[90] Speed: 1.2531049744422758 samples/sec                   batch loss = 235.48773396015167 | accuracy = 0.6305555555555555


Epoch[2] Batch[95] Speed: 1.251145345914079 samples/sec                   batch loss = 248.30293834209442 | accuracy = 0.631578947368421


Epoch[2] Batch[100] Speed: 1.2528642930283185 samples/sec                   batch loss = 262.2983112335205 | accuracy = 0.625


Epoch[2] Batch[105] Speed: 1.2502383522611682 samples/sec                   batch loss = 274.0129281282425 | accuracy = 0.6285714285714286


Epoch[2] Batch[110] Speed: 1.2525389773733617 samples/sec                   batch loss = 285.5143747329712 | accuracy = 0.634090909090909


Epoch[2] Batch[115] Speed: 1.2568366629299912 samples/sec                   batch loss = 297.69121408462524 | accuracy = 0.6391304347826087


Epoch[2] Batch[120] Speed: 1.255014889504831 samples/sec                   batch loss = 310.5781376361847 | accuracy = 0.6375


Epoch[2] Batch[125] Speed: 1.2555275014755645 samples/sec                   batch loss = 321.36734902858734 | accuracy = 0.644


Epoch[2] Batch[130] Speed: 1.2553354812260662 samples/sec                   batch loss = 333.47772192955017 | accuracy = 0.6461538461538462


Epoch[2] Batch[135] Speed: 1.2508617690657227 samples/sec                   batch loss = 345.62306094169617 | accuracy = 0.6425925925925926


Epoch[2] Batch[140] Speed: 1.2442837072724728 samples/sec                   batch loss = 359.4918191432953 | accuracy = 0.6375


Epoch[2] Batch[145] Speed: 1.2474025297625133 samples/sec                   batch loss = 371.31952250003815 | accuracy = 0.6413793103448275


Epoch[2] Batch[150] Speed: 1.2517709753436301 samples/sec                   batch loss = 384.31914031505585 | accuracy = 0.6383333333333333


Epoch[2] Batch[155] Speed: 1.2553122811911195 samples/sec                   batch loss = 395.8038944005966 | accuracy = 0.6403225806451613


Epoch[2] Batch[160] Speed: 1.2511894797924352 samples/sec                   batch loss = 408.49702084064484 | accuracy = 0.6421875


Epoch[2] Batch[165] Speed: 1.25004803946456 samples/sec                   batch loss = 420.41679668426514 | accuracy = 0.6439393939393939


Epoch[2] Batch[170] Speed: 1.2417488086337032 samples/sec                   batch loss = 431.8102722167969 | accuracy = 0.6455882352941177


Epoch[2] Batch[175] Speed: 1.253603475164968 samples/sec                   batch loss = 442.63342583179474 | accuracy = 0.6485714285714286


Epoch[2] Batch[180] Speed: 1.2554908590262135 samples/sec                   batch loss = 454.7649837732315 | accuracy = 0.65


Epoch[2] Batch[185] Speed: 1.254392016496123 samples/sec                   batch loss = 465.46968162059784 | accuracy = 0.6527027027027027


Epoch[2] Batch[190] Speed: 1.2527946885833097 samples/sec                   batch loss = 479.7954970598221 | accuracy = 0.6513157894736842


Epoch[2] Batch[195] Speed: 1.2499725079622603 samples/sec                   batch loss = 491.0546005964279 | accuracy = 0.6525641025641026


Epoch[2] Batch[200] Speed: 1.250144911963793 samples/sec                   batch loss = 504.03569650650024 | accuracy = 0.65


Epoch[2] Batch[205] Speed: 1.2496085759717896 samples/sec                   batch loss = 515.4544777870178 | accuracy = 0.6548780487804878


Epoch[2] Batch[210] Speed: 1.2559449065228794 samples/sec                   batch loss = 529.238498210907 | accuracy = 0.6535714285714286


Epoch[2] Batch[215] Speed: 1.2506278216940918 samples/sec                   batch loss = 541.3521126508713 | accuracy = 0.6534883720930232


Epoch[2] Batch[220] Speed: 1.24367531011238 samples/sec                   batch loss = 554.286091208458 | accuracy = 0.6545454545454545


Epoch[2] Batch[225] Speed: 1.2437301668846892 samples/sec                   batch loss = 565.1280753612518 | accuracy = 0.6566666666666666


Epoch[2] Batch[230] Speed: 1.2403956705521348 samples/sec                   batch loss = 576.6674289703369 | accuracy = 0.6597826086956522


Epoch[2] Batch[235] Speed: 1.2502563339009605 samples/sec                   batch loss = 589.6395874023438 | accuracy = 0.6606382978723404


Epoch[2] Batch[240] Speed: 1.2566334177794298 samples/sec                   batch loss = 602.9267477989197 | accuracy = 0.65625


Epoch[2] Batch[245] Speed: 1.2488497398604377 samples/sec                   batch loss = 615.6476118564606 | accuracy = 0.6551020408163265


Epoch[2] Batch[250] Speed: 1.2465691184588006 samples/sec                   batch loss = 629.0129978656769 | accuracy = 0.654


Epoch[2] Batch[255] Speed: 1.2428801080996188 samples/sec                   batch loss = 641.1898634433746 | accuracy = 0.6549019607843137


Epoch[2] Batch[260] Speed: 1.243486344839813 samples/sec                   batch loss = 651.4085146188736 | accuracy = 0.6576923076923077


Epoch[2] Batch[265] Speed: 1.2464039032139749 samples/sec                   batch loss = 663.9874882698059 | accuracy = 0.6556603773584906


Epoch[2] Batch[270] Speed: 1.2499086253089344 samples/sec                   batch loss = 674.7918041944504 | accuracy = 0.6555555555555556


Epoch[2] Batch[275] Speed: 1.2417726129491893 samples/sec                   batch loss = 688.9850784540176 | accuracy = 0.6554545454545454


Epoch[2] Batch[280] Speed: 1.2513256332959763 samples/sec                   batch loss = 699.9579972028732 | accuracy = 0.6571428571428571


Epoch[2] Batch[285] Speed: 1.2572905538804304 samples/sec                   batch loss = 710.688559293747 | accuracy = 0.6587719298245615


Epoch[2] Batch[290] Speed: 1.2572140504319411 samples/sec                   batch loss = 720.8906733989716 | accuracy = 0.6603448275862069


Epoch[2] Batch[295] Speed: 1.2613560429130701 samples/sec                   batch loss = 733.5134774446487 | accuracy = 0.6593220338983051


Epoch[2] Batch[300] Speed: 1.2598921134112089 samples/sec                   batch loss = 745.7715820074081 | accuracy = 0.6591666666666667


Epoch[2] Batch[305] Speed: 1.25419518664968 samples/sec                   batch loss = 759.2741377353668 | accuracy = 0.660655737704918


Epoch[2] Batch[310] Speed: 1.25126497180383 samples/sec                   batch loss = 771.8202295303345 | accuracy = 0.6596774193548387


Epoch[2] Batch[315] Speed: 1.2480574719458488 samples/sec                   batch loss = 782.503618478775 | accuracy = 0.6603174603174603


Epoch[2] Batch[320] Speed: 1.2487638498647458 samples/sec                   batch loss = 796.4388425350189 | accuracy = 0.65859375


Epoch[2] Batch[325] Speed: 1.2501618661948348 samples/sec                   batch loss = 809.3122236728668 | accuracy = 0.66


Epoch[2] Batch[330] Speed: 1.2405676442650606 samples/sec                   batch loss = 819.2519541978836 | accuracy = 0.6613636363636364


Epoch[2] Batch[335] Speed: 1.2514386660105594 samples/sec                   batch loss = 827.5450550317764 | accuracy = 0.6649253731343283


Epoch[2] Batch[340] Speed: 1.2424047260683546 samples/sec                   batch loss = 838.1245081424713 | accuracy = 0.6669117647058823


Epoch[2] Batch[345] Speed: 1.2501223691380452 samples/sec                   batch loss = 847.8587359189987 | accuracy = 0.6688405797101449


Epoch[2] Batch[350] Speed: 1.2520746805131193 samples/sec                   batch loss = 861.2358512878418 | accuracy = 0.6671428571428571


Epoch[2] Batch[355] Speed: 1.2493343472641592 samples/sec                   batch loss = 871.8418823480606 | accuracy = 0.6669014084507042


Epoch[2] Batch[360] Speed: 1.2449938563679832 samples/sec                   batch loss = 884.1003087759018 | accuracy = 0.6673611111111111


Epoch[2] Batch[365] Speed: 1.2477778914037414 samples/sec                   batch loss = 896.2760509252548 | accuracy = 0.6684931506849315


Epoch[2] Batch[370] Speed: 1.2463851062723335 samples/sec                   batch loss = 908.0979183912277 | accuracy = 0.668918918918919


Epoch[2] Batch[375] Speed: 1.242838215690436 samples/sec                   batch loss = 920.7463835477829 | accuracy = 0.6686666666666666


Epoch[2] Batch[380] Speed: 1.2459396088362917 samples/sec                   batch loss = 933.6332898139954 | accuracy = 0.6664473684210527


Epoch[2] Batch[385] Speed: 1.2418956011316893 samples/sec                   batch loss = 947.4211723804474 | accuracy = 0.6642857142857143


Epoch[2] Batch[390] Speed: 1.243152615784834 samples/sec                   batch loss = 958.7854390144348 | accuracy = 0.6641025641025641


Epoch[2] Batch[395] Speed: 1.2507514513696398 samples/sec                   batch loss = 970.6941587924957 | accuracy = 0.6639240506329114


Epoch[2] Batch[400] Speed: 1.2469203463238212 samples/sec                   batch loss = 982.1143578290939 | accuracy = 0.664375


Epoch[2] Batch[405] Speed: 1.246847786868098 samples/sec                   batch loss = 991.8315795660019 | accuracy = 0.6660493827160494


Epoch[2] Batch[410] Speed: 1.248348045400979 samples/sec                   batch loss = 1004.3174693584442 | accuracy = 0.6676829268292683


Epoch[2] Batch[415] Speed: 1.2418565327385644 samples/sec                   batch loss = 1015.2338846921921 | accuracy = 0.6686746987951807


Epoch[2] Batch[420] Speed: 1.248389195370431 samples/sec                   batch loss = 1024.6403467655182 | accuracy = 0.669047619047619


Epoch[2] Batch[425] Speed: 1.2431173368511932 samples/sec                   batch loss = 1038.2913324832916 | accuracy = 0.668235294117647


Epoch[2] Batch[430] Speed: 1.2475836882373343 samples/sec                   batch loss = 1049.762257695198 | accuracy = 0.6691860465116279


Epoch[2] Batch[435] Speed: 1.25097285244161 samples/sec                   batch loss = 1061.348869562149 | accuracy = 0.6706896551724137


Epoch[2] Batch[440] Speed: 1.251302114603985 samples/sec                   batch loss = 1072.0114442110062 | accuracy = 0.6715909090909091


Epoch[2] Batch[445] Speed: 1.2449183801295272 samples/sec                   batch loss = 1085.852066397667 | accuracy = 0.6719101123595506


Epoch[2] Batch[450] Speed: 1.2539251267530305 samples/sec                   batch loss = 1098.4189813137054 | accuracy = 0.6716666666666666


Epoch[2] Batch[455] Speed: 1.256182446527015 samples/sec                   batch loss = 1109.11230635643 | accuracy = 0.6730769230769231


Epoch[2] Batch[460] Speed: 1.2505175452575383 samples/sec                   batch loss = 1117.8808953762054 | accuracy = 0.6744565217391304


Epoch[2] Batch[465] Speed: 1.2462388243299054 samples/sec                   batch loss = 1130.1019854545593 | accuracy = 0.6741935483870968


Epoch[2] Batch[470] Speed: 1.2453637945331335 samples/sec                   batch loss = 1142.243086218834 | accuracy = 0.675


Epoch[2] Batch[475] Speed: 1.2455599888252205 samples/sec                   batch loss = 1154.9766563177109 | accuracy = 0.6747368421052632


Epoch[2] Batch[480] Speed: 1.2447061349489796 samples/sec                   batch loss = 1165.5204335451126 | accuracy = 0.6744791666666666


Epoch[2] Batch[485] Speed: 1.2501263746220683 samples/sec                   batch loss = 1179.348559498787 | accuracy = 0.6737113402061856


Epoch[2] Batch[490] Speed: 1.2432773516199043 samples/sec                   batch loss = 1193.3177094459534 | accuracy = 0.673469387755102


Epoch[2] Batch[495] Speed: 1.2457714152863697 samples/sec                   batch loss = 1203.7745773792267 | accuracy = 0.6737373737373737


Epoch[2] Batch[500] Speed: 1.242488823335946 samples/sec                   batch loss = 1217.8727071285248 | accuracy = 0.6735


Epoch[2] Batch[505] Speed: 1.24446543787826 samples/sec                   batch loss = 1231.160281777382 | accuracy = 0.6722772277227723


Epoch[2] Batch[510] Speed: 1.240576542335784 samples/sec                   batch loss = 1245.3676356077194 | accuracy = 0.6715686274509803


Epoch[2] Batch[515] Speed: 1.2392104874989964 samples/sec                   batch loss = 1258.7130147218704 | accuracy = 0.670873786407767


Epoch[2] Batch[520] Speed: 1.2434925198702758 samples/sec                   batch loss = 1268.8517748117447 | accuracy = 0.6725961538461539


Epoch[2] Batch[525] Speed: 1.2486232351525375 samples/sec                   batch loss = 1281.0137429237366 | accuracy = 0.6723809523809524


Epoch[2] Batch[530] Speed: 1.2525201819486669 samples/sec                   batch loss = 1294.4488642215729 | accuracy = 0.6716981132075471


Epoch[2] Batch[535] Speed: 1.253237051625426 samples/sec                   batch loss = 1306.1918950080872 | accuracy = 0.6719626168224299


Epoch[2] Batch[540] Speed: 1.2499833109222862 samples/sec                   batch loss = 1318.5891618728638 | accuracy = 0.6712962962962963


Epoch[2] Batch[545] Speed: 1.251908376440428 samples/sec                   batch loss = 1329.943329334259 | accuracy = 0.6720183486238532


Epoch[2] Batch[550] Speed: 1.249434458894679 samples/sec                   batch loss = 1341.6007568836212 | accuracy = 0.6722727272727272


Epoch[2] Batch[555] Speed: 1.2413488672217754 samples/sec                   batch loss = 1351.5819319486618 | accuracy = 0.6738738738738739


Epoch[2] Batch[560] Speed: 1.2376879958351306 samples/sec                   batch loss = 1366.4216356277466 | accuracy = 0.6727678571428571


Epoch[2] Batch[565] Speed: 1.2463295521234947 samples/sec                   batch loss = 1375.8545610904694 | accuracy = 0.6738938053097345


Epoch[2] Batch[570] Speed: 1.240910449829979 samples/sec                   batch loss = 1387.9006662368774 | accuracy = 0.6745614035087719


Epoch[2] Batch[575] Speed: 1.249156679324605 samples/sec                   batch loss = 1398.3052026033401 | accuracy = 0.6760869565217391


Epoch[2] Batch[580] Speed: 1.245338650629071 samples/sec                   batch loss = 1409.8426595926285 | accuracy = 0.6762931034482759


Epoch[2] Batch[585] Speed: 1.244026292609873 samples/sec                   batch loss = 1421.1949387788773 | accuracy = 0.6760683760683761


Epoch[2] Batch[590] Speed: 1.2518872645895371 samples/sec                   batch loss = 1434.1089162826538 | accuracy = 0.6758474576271186


Epoch[2] Batch[595] Speed: 1.2472826212209727 samples/sec                   batch loss = 1446.1825516223907 | accuracy = 0.6760504201680673


Epoch[2] Batch[600] Speed: 1.246811927266847 samples/sec                   batch loss = 1456.5967055559158 | accuracy = 0.6766666666666666


Epoch[2] Batch[605] Speed: 1.2500211227529228 samples/sec                   batch loss = 1467.8629574775696 | accuracy = 0.6772727272727272


Epoch[2] Batch[610] Speed: 1.2461125680387866 samples/sec                   batch loss = 1479.7495757341385 | accuracy = 0.6774590163934426


Epoch[2] Batch[615] Speed: 1.250896463014069 samples/sec                   batch loss = 1489.8566710948944 | accuracy = 0.6788617886178862


Epoch[2] Batch[620] Speed: 1.248731969463958 samples/sec                   batch loss = 1503.8471529483795 | accuracy = 0.6782258064516129


Epoch[2] Batch[625] Speed: 1.2444466070518871 samples/sec                   batch loss = 1517.0813546180725 | accuracy = 0.6772


Epoch[2] Batch[630] Speed: 1.2560865169454574 samples/sec                   batch loss = 1529.4635607004166 | accuracy = 0.676984126984127


Epoch[2] Batch[635] Speed: 1.2451008506088204 samples/sec                   batch loss = 1538.8188623189926 | accuracy = 0.6775590551181102


Epoch[2] Batch[640] Speed: 1.2503407520355878 samples/sec                   batch loss = 1553.851998925209 | accuracy = 0.67578125


Epoch[2] Batch[645] Speed: 1.2459818031219623 samples/sec                   batch loss = 1568.5098872184753 | accuracy = 0.6751937984496124


Epoch[2] Batch[650] Speed: 1.2482936164260439 samples/sec                   batch loss = 1579.6631480455399 | accuracy = 0.6765384615384615


Epoch[2] Batch[655] Speed: 1.250764225964568 samples/sec                   batch loss = 1591.1742459535599 | accuracy = 0.6767175572519084


Epoch[2] Batch[660] Speed: 1.2537476498605964 samples/sec                   batch loss = 1602.797531247139 | accuracy = 0.6772727272727272


Epoch[2] Batch[665] Speed: 1.2488972447047797 samples/sec                   batch loss = 1613.1421909332275 | accuracy = 0.6781954887218045


Epoch[2] Batch[670] Speed: 1.244611765432019 samples/sec                   batch loss = 1624.195456981659 | accuracy = 0.6787313432835821


Epoch[2] Batch[675] Speed: 1.2471019209671506 samples/sec                   batch loss = 1633.7019565105438 | accuracy = 0.6796296296296296


Epoch[2] Batch[680] Speed: 1.2486887522777652 samples/sec                   batch loss = 1646.3041464090347 | accuracy = 0.6786764705882353


Epoch[2] Batch[685] Speed: 1.2428722818306444 samples/sec                   batch loss = 1660.1931953430176 | accuracy = 0.6788321167883211


Epoch[2] Batch[690] Speed: 1.2451115695254165 samples/sec                   batch loss = 1673.703377008438 | accuracy = 0.6789855072463769


Epoch[2] Batch[695] Speed: 1.2387178781597807 samples/sec                   batch loss = 1686.1394455432892 | accuracy = 0.679136690647482


Epoch[2] Batch[700] Speed: 1.2374605927313458 samples/sec                   batch loss = 1696.8584991693497 | accuracy = 0.6796428571428571


Epoch[2] Batch[705] Speed: 1.2382310521057038 samples/sec                   batch loss = 1710.2094486951828 | accuracy = 0.6787234042553192


Epoch[2] Batch[710] Speed: 1.2385637894588262 samples/sec                   batch loss = 1721.1464974880219 | accuracy = 0.6792253521126761


Epoch[2] Batch[715] Speed: 1.247289297656625 samples/sec                   batch loss = 1734.2160050868988 | accuracy = 0.679020979020979


Epoch[2] Batch[720] Speed: 1.2471778476352915 samples/sec                   batch loss = 1744.4768248796463 | accuracy = 0.6791666666666667


Epoch[2] Batch[725] Speed: 1.2497814381873822 samples/sec                   batch loss = 1757.5626658201218 | accuracy = 0.6782758620689655


Epoch[2] Batch[730] Speed: 1.247289112199114 samples/sec                   batch loss = 1767.640436410904 | accuracy = 0.6791095890410959


Epoch[2] Batch[735] Speed: 1.2518118842775734 samples/sec                   batch loss = 1780.201379418373 | accuracy = 0.6789115646258503


Epoch[2] Batch[740] Speed: 1.248639776416202 samples/sec                   batch loss = 1789.3879970312119 | accuracy = 0.6804054054054054


Epoch[2] Batch[745] Speed: 1.2542242524390412 samples/sec                   batch loss = 1799.29887509346 | accuracy = 0.6805369127516778


Epoch[2] Batch[750] Speed: 1.252705729645586 samples/sec                   batch loss = 1811.280546426773 | accuracy = 0.6803333333333333


Epoch[2] Batch[755] Speed: 1.2446169359938926 samples/sec                   batch loss = 1821.563508629799 | accuracy = 0.680794701986755


Epoch[2] Batch[760] Speed: 1.2476437149632251 samples/sec                   batch loss = 1833.7309464216232 | accuracy = 0.6799342105263158


Epoch[2] Batch[765] Speed: 1.2413064350814416 samples/sec                   batch loss = 1847.5501345396042 | accuracy = 0.6787581699346406


Epoch[2] Batch[770] Speed: 1.2367609237822443 samples/sec                   batch loss = 1859.6397676467896 | accuracy = 0.6782467532467532


Epoch[2] Batch[775] Speed: 1.242321836074275 samples/sec                   batch loss = 1871.1193416118622 | accuracy = 0.6780645161290323


Epoch[2] Batch[780] Speed: 1.2465813446454719 samples/sec                   batch loss = 1882.8088544607162 | accuracy = 0.6778846153846154


Epoch[2] Batch[785] Speed: 1.2468796638009192 samples/sec                   batch loss = 1893.314071059227 | accuracy = 0.6780254777070064


[Epoch 2] training: accuracy=0.6779822335025381
[Epoch 2] time cost: 647.3315858840942
[Epoch 2] validation: validation accuracy=0.7244444444444444
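
The per-batch lines in the log above can be post-processed if you want to track progress programmatically. The following is a minimal sketch that assumes the exact line format shown in this output; the names `LOG_LINE` and `parse_log_line` are hypothetical helpers and not part of MXNet. Because the logged `batch loss` grows steadily within each epoch and starts again at a low value in the next, the sketch treats it as a running total and divides by the batch index to estimate a mean per-batch loss. That interpretation is an assumption drawn from the numbers, not from the training code.

```python
import re

# Pattern for the per-batch log lines shown above (format assumed from this output).
LOG_LINE = re.compile(
    r"Epoch\[(?P<epoch>\d+)\] Batch\[(?P<batch>\d+)\] "
    r"Speed: (?P<speed>[\d.]+) samples/sec\s+"
    r"batch loss = (?P<loss>[\d.]+) \| accuracy = (?P<acc>[\d.]+)"
)


def parse_log_line(line):
    """Return the fields of one per-batch log line, or None if it does not match."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    return {
        "epoch": int(match["epoch"]),
        "batch": int(match["batch"]),
        "speed": float(match["speed"]),
        "loss": float(match["loss"]),
        "accuracy": float(match["acc"]),
    }


# Last per-batch line of epoch 2, copied from the output above.
sample = (
    "Epoch[2] Batch[785] Speed: 1.2468796638009192 samples/sec"
    "                   batch loss = 1893.314071059227 | accuracy = 0.6780254777070064"
)
record = parse_log_line(sample)

# Assumption: the logged loss is a running total over the epoch,
# so dividing by the batch index gives a mean loss per batch.
mean_loss = record["loss"] / record["batch"]
print(f"epoch {record['epoch']}: mean loss per batch = {mean_loss:.3f}")
```

Epoch summary lines such as `[Epoch 2] training: accuracy=...` use a different format, so `parse_log_line` returns `None` for them and they would need a separate pattern.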


## Next steps

Now that you have completed training and prediction with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).