<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one Nvidia GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[16:33:01] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `ctx` field in the printed output tells you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. In the example code here, you will copy `x` to the second GPU, `npx.gpu(1)`. If you have only one GPU, the code falls back to `npx.cpu()` instead, which is the case in the output shown below:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[16:33:01] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
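As a minimal sketch of this zero-based indexing, the helper below picks the first few GPU IDs out of however many are available. Note that `pick_gpu_ids` is a hypothetical name introduced here for illustration; it is not part of MXNet, and the block runs in plain Python without a GPU.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return the zero-based IDs of the GPUs to use.

    Hypothetical helper for illustration; it is not part of MXNet.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # Never ask for more GPUs than the machine has.
    return list(range(min(num_available, num_requested)))


# On an eight-GPU machine, the last valid ID is 7.
print(pick_gpu_ids(8, 8))  # [0, 1, 2, 3, 4, 5, 6, 7]
# Requesting two GPUs when only one is present yields just ID 0.
print(pick_gpu_ids(1, 2))  # [0]
```

With MXNet installed, a list such as `[npx.gpu(i) for i in pick_gpu_ids(npx.num_gpus(), 2)]` would give the matching devices; an empty list means no GPU was found, in which case you could fall back to `npx.cpu()`.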

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that all inputs of an operation must be on the same device; if they are not, you will get an error.
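One way to avoid this error is to copy every input to a common device before combining them. The sketch below is an assumption-laden illustration rather than an MXNet API: `to_common_device` is a hypothetical helper that relies only on the `context` attribute and the `copyto` method mentioned earlier in this tutorial, and `FakeArray` is a stand-in class so that the example runs without MXNet or a GPU.

```python
def to_common_device(arrays, device):
    """Return the arrays, copying any that are not already on `device`.

    Hypothetical helper: it relies only on a `context` attribute and a
    `copyto(device)` method, the two names used earlier in this tutorial.
    """
    return [a if a.context == device else a.copyto(device) for a in arrays]


class FakeArray:
    """Stand-in for an ndarray so that the sketch runs without MXNet or a GPU."""

    def __init__(self, values, context):
        self.values = list(values)
        self.context = context

    def copyto(self, device):
        # Mimic an explicit device-to-device copy.
        return FakeArray(self.values, device)


a = FakeArray([1.0, 2.0], "gpu(0)")
b = FakeArray([3.0, 4.0], "gpu(1)")
a0, b0 = to_common_device([a, b], "gpu(0)")
print(a0.context, b0.context)  # gpu(0) gpu(0)
```

With real MXNet arrays, the same pattern would amount to calling `y.copyto(gpu)` on any input that lives elsewhere before computing `x + y`.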

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and the parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[16:33:02] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 3.8622859, -0.9211854]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, if there are *n* GPUs, you split each data batch into *n* parts and assign one part to each GPU. Each GPU then runs the forward and backward passes on its own separate chunk of the data.
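To make the idea concrete, here is a toy sketch in plain Python (no MXNet and no GPU required). It assumes a one-parameter model `w * x` with a squared-error loss, and the helper names `split_batch` and `grad_sum` are hypothetical. It illustrates that summing the gradients computed on each chunk and dividing by the batch size gives the same mean gradient as processing the whole batch at once.

```python
import math


def split_batch(batch, num_parts):
    """Split a list into `num_parts` contiguous, equally sized chunks."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    size = len(batch) // num_parts
    return [batch[i * size:(i + 1) * size] for i in range(num_parts)]


def grad_sum(w, samples):
    """Summed gradient of the squared error (w * x - y) ** 2 with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


w = 0.5
batch = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 7.0)]

# Each chunk plays the role of the data sent to one GPU.
chunks = split_batch(batch, num_parts=2)
per_device = [grad_sum(w, chunk) for chunk in chunks]

# Summing the per-device gradients and dividing by the batch size gives
# the same mean gradient as processing the whole batch on one device.
parallel = sum(per_device) / len(batch)
single = grad_sum(w, batch) / len(batch)
print(math.isclose(parallel, single))  # True
```

In the training loop later in this step, `gluon.utils.split_and_load` does the splitting and `trainer.step(batch_size)` is passed the full batch size; the sketch only mirrors that arithmetic on a scalar model.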

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations on the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7502279005345747 samples/sec                   batch loss = 12.434579372406006 | accuracy = 0.7


Epoch[1] Batch[10] Speed: 1.23936757534969 samples/sec                   batch loss = 25.997478008270264 | accuracy = 0.675


Epoch[1] Batch[15] Speed: 1.2363257405571006 samples/sec                   batch loss = 41.322017192840576 | accuracy = 0.5833333333333334


Epoch[1] Batch[20] Speed: 1.2397234588467285 samples/sec                   batch loss = 55.4867479801178 | accuracy = 0.575


Epoch[1] Batch[25] Speed: 1.2502379795901668 samples/sec                   batch loss = 70.67591166496277 | accuracy = 0.57


Epoch[1] Batch[30] Speed: 1.2440099655949697 samples/sec                   batch loss = 84.73651432991028 | accuracy = 0.55


Epoch[1] Batch[35] Speed: 1.2478899127879215 samples/sec                   batch loss = 99.3988687992096 | accuracy = 0.5357142857142857


Epoch[1] Batch[40] Speed: 1.2549218602972878 samples/sec                   batch loss = 113.5031168460846 | accuracy = 0.54375


Epoch[1] Batch[45] Speed: 1.2535404384309017 samples/sec                   batch loss = 127.94388484954834 | accuracy = 0.5333333333333333


Epoch[1] Batch[50] Speed: 1.2497361002821163 samples/sec                   batch loss = 141.92772936820984 | accuracy = 0.54


Epoch[1] Batch[55] Speed: 1.2477806754529068 samples/sec                   batch loss = 156.246764421463 | accuracy = 0.5363636363636364


Epoch[1] Batch[60] Speed: 1.2434185157063222 samples/sec                   batch loss = 170.30510330200195 | accuracy = 0.5416666666666666


Epoch[1] Batch[65] Speed: 1.249010675974308 samples/sec                   batch loss = 184.2382242679596 | accuracy = 0.5384615384615384


Epoch[1] Batch[70] Speed: 1.2469151566059788 samples/sec                   batch loss = 198.03854775428772 | accuracy = 0.5392857142857143


Epoch[1] Batch[75] Speed: 1.2463488103342582 samples/sec                   batch loss = 211.8686718940735 | accuracy = 0.54


Epoch[1] Batch[80] Speed: 1.2547872692898714 samples/sec                   batch loss = 225.7096028327942 | accuracy = 0.540625


Epoch[1] Batch[85] Speed: 1.2554806183138119 samples/sec                   batch loss = 239.11686635017395 | accuracy = 0.5441176470588235


Epoch[1] Batch[90] Speed: 1.2488369113832956 samples/sec                   batch loss = 253.39332056045532 | accuracy = 0.5333333333333333


Epoch[1] Batch[95] Speed: 1.249032806771447 samples/sec                   batch loss = 267.71984338760376 | accuracy = 0.5289473684210526


Epoch[1] Batch[100] Speed: 1.2514837541194672 samples/sec                   batch loss = 280.93167185783386 | accuracy = 0.5375


Epoch[1] Batch[105] Speed: 1.2509040175966832 samples/sec                   batch loss = 294.3806188106537 | accuracy = 0.5452380952380952


Epoch[1] Batch[110] Speed: 1.2553857352543274 samples/sec                   batch loss = 307.79624128341675 | accuracy = 0.5454545454545454


Epoch[1] Batch[115] Speed: 1.2481051022343896 samples/sec                   batch loss = 321.63219165802 | accuracy = 0.5478260869565217


Epoch[1] Batch[120] Speed: 1.243666736297863 samples/sec                   batch loss = 335.4827284812927 | accuracy = 0.5458333333333333


Epoch[1] Batch[125] Speed: 1.2447696716074712 samples/sec                   batch loss = 349.12957096099854 | accuracy = 0.542


Epoch[1] Batch[130] Speed: 1.243320379312696 samples/sec                   batch loss = 362.2224917411804 | accuracy = 0.5538461538461539


Epoch[1] Batch[135] Speed: 1.2435651502043106 samples/sec                   batch loss = 376.3501319885254 | accuracy = 0.5518518518518518


Epoch[1] Batch[140] Speed: 1.2500772859254377 samples/sec                   batch loss = 390.2034344673157 | accuracy = 0.55


Epoch[1] Batch[145] Speed: 1.251572446215603 samples/sec                   batch loss = 404.0142364501953 | accuracy = 0.5482758620689655


Epoch[1] Batch[150] Speed: 1.2568119951223407 samples/sec                   batch loss = 417.7042775154114 | accuracy = 0.5483333333333333


Epoch[1] Batch[155] Speed: 1.2556286081457118 samples/sec                   batch loss = 431.02795696258545 | accuracy = 0.5532258064516129


Epoch[1] Batch[160] Speed: 1.2540933732517145 samples/sec                   batch loss = 444.19282150268555 | accuracy = 0.5578125


Epoch[1] Batch[165] Speed: 1.2554842824027252 samples/sec                   batch loss = 457.8341190814972 | accuracy = 0.5575757575757576


Epoch[1] Batch[170] Speed: 1.2529608539351889 samples/sec                   batch loss = 471.8212890625 | accuracy = 0.5529411764705883


Epoch[1] Batch[175] Speed: 1.2562208224624996 samples/sec                   batch loss = 485.72786045074463 | accuracy = 0.5514285714285714


Epoch[1] Batch[180] Speed: 1.2533307674884342 samples/sec                   batch loss = 499.55275774002075 | accuracy = 0.5486111111111112


Epoch[1] Batch[185] Speed: 1.2447430740773961 samples/sec                   batch loss = 513.2501783370972 | accuracy = 0.55


Epoch[1] Batch[190] Speed: 1.2443264354960368 samples/sec                   batch loss = 526.8526673316956 | accuracy = 0.5539473684210526


Epoch[1] Batch[195] Speed: 1.2469647387112197 samples/sec                   batch loss = 540.5782978534698 | accuracy = 0.5538461538461539


Epoch[1] Batch[200] Speed: 1.2498417697001194 samples/sec                   batch loss = 553.8384439945221 | accuracy = 0.555


Epoch[1] Batch[205] Speed: 1.249594894252429 samples/sec                   batch loss = 566.9685370922089 | accuracy = 0.5597560975609757


Epoch[1] Batch[210] Speed: 1.244885587308786 samples/sec                   batch loss = 580.1964440345764 | accuracy = 0.5630952380952381


Epoch[1] Batch[215] Speed: 1.2428761489159366 samples/sec                   batch loss = 593.6587224006653 | accuracy = 0.563953488372093


Epoch[1] Batch[220] Speed: 1.2429848975241529 samples/sec                   batch loss = 606.993222951889 | accuracy = 0.5659090909090909


Epoch[1] Batch[225] Speed: 1.2494700972968464 samples/sec                   batch loss = 620.2246234416962 | accuracy = 0.5655555555555556


Epoch[1] Batch[230] Speed: 1.2453026004793057 samples/sec                   batch loss = 633.8950161933899 | accuracy = 0.5630434782608695


Epoch[1] Batch[235] Speed: 1.2480674062283987 samples/sec                   batch loss = 648.0737822055817 | accuracy = 0.5648936170212766


Epoch[1] Batch[240] Speed: 1.2450590855564319 samples/sec                   batch loss = 661.4831259250641 | accuracy = 0.5645833333333333


Epoch[1] Batch[245] Speed: 1.2531782640122864 samples/sec                   batch loss = 675.6194539070129 | accuracy = 0.5653061224489796


Epoch[1] Batch[250] Speed: 1.250915116469979 samples/sec                   batch loss = 689.0969817638397 | accuracy = 0.566


Epoch[1] Batch[255] Speed: 1.2557770088505802 samples/sec                   batch loss = 702.946809053421 | accuracy = 0.5647058823529412


Epoch[1] Batch[260] Speed: 1.2513192868951706 samples/sec                   batch loss = 717.5512661933899 | accuracy = 0.5615384615384615


Epoch[1] Batch[265] Speed: 1.2526712158571784 samples/sec                   batch loss = 731.8220493793488 | accuracy = 0.5584905660377358


Epoch[1] Batch[270] Speed: 1.2563555330302603 samples/sec                   batch loss = 744.5200018882751 | accuracy = 0.5601851851851852


Epoch[1] Batch[275] Speed: 1.2570803801778128 samples/sec                   batch loss = 758.1689732074738 | accuracy = 0.5609090909090909


Epoch[1] Batch[280] Speed: 1.2516111946030044 samples/sec                   batch loss = 772.4081587791443 | accuracy = 0.5589285714285714


Epoch[1] Batch[285] Speed: 1.2524844628454674 samples/sec                   batch loss = 785.9716882705688 | accuracy = 0.5605263157894737


Epoch[1] Batch[290] Speed: 1.2483952334045636 samples/sec                   batch loss = 799.5976576805115 | accuracy = 0.5612068965517242


Epoch[1] Batch[295] Speed: 1.2555453537000918 samples/sec                   batch loss = 813.287492275238 | accuracy = 0.5610169491525424


Epoch[1] Batch[300] Speed: 1.2544426639900783 samples/sec                   batch loss = 827.3964395523071 | accuracy = 0.5575


Epoch[1] Batch[305] Speed: 1.2485329164317527 samples/sec                   batch loss = 841.5352737903595 | accuracy = 0.5581967213114755


Epoch[1] Batch[310] Speed: 1.2556704274673116 samples/sec                   batch loss = 855.1518087387085 | accuracy = 0.5588709677419355


Epoch[1] Batch[315] Speed: 1.2608390404910723 samples/sec                   batch loss = 868.5967645645142 | accuracy = 0.5587301587301587


Epoch[1] Batch[320] Speed: 1.261777807526648 samples/sec                   batch loss = 881.329258441925 | accuracy = 0.56171875


Epoch[1] Batch[325] Speed: 1.2594145978212565 samples/sec                   batch loss = 895.0259466171265 | accuracy = 0.5607692307692308


Epoch[1] Batch[330] Speed: 1.2591888757495218 samples/sec                   batch loss = 908.5981860160828 | accuracy = 0.5621212121212121


Epoch[1] Batch[335] Speed: 1.2636252782056254 samples/sec                   batch loss = 921.3530297279358 | accuracy = 0.564179104477612


Epoch[1] Batch[340] Speed: 1.253851562387897 samples/sec                   batch loss = 935.2431046962738 | accuracy = 0.5632352941176471


Epoch[1] Batch[345] Speed: 1.2525215845728237 samples/sec                   batch loss = 948.3185744285583 | accuracy = 0.563768115942029


Epoch[1] Batch[350] Speed: 1.2552637235699011 samples/sec                   batch loss = 962.8241002559662 | accuracy = 0.5614285714285714


Epoch[1] Batch[355] Speed: 1.253926813680474 samples/sec                   batch loss = 975.4389827251434 | accuracy = 0.5626760563380282


Epoch[1] Batch[360] Speed: 1.2500811979762014 samples/sec                   batch loss = 988.6668426990509 | accuracy = 0.5645833333333333


Epoch[1] Batch[365] Speed: 1.2533008068472955 samples/sec                   batch loss = 1002.9575140476227 | accuracy = 0.565068493150685


Epoch[1] Batch[370] Speed: 1.2578562333145198 samples/sec                   batch loss = 1016.9178974628448 | accuracy = 0.5635135135135135


Epoch[1] Batch[375] Speed: 1.2520475830165816 samples/sec                   batch loss = 1030.6580352783203 | accuracy = 0.5633333333333334


Epoch[1] Batch[380] Speed: 1.2574481124767871 samples/sec                   batch loss = 1044.87553191185 | accuracy = 0.5638157894736842


Epoch[1] Batch[385] Speed: 1.2605340051286251 samples/sec                   batch loss = 1058.3906552791595 | accuracy = 0.5636363636363636


Epoch[1] Batch[390] Speed: 1.2566865055838574 samples/sec                   batch loss = 1071.5888063907623 | accuracy = 0.5647435897435897


Epoch[1] Batch[395] Speed: 1.2557329267017536 samples/sec                   batch loss = 1085.3751034736633 | accuracy = 0.5613924050632911


Epoch[1] Batch[400] Speed: 1.2522562639997195 samples/sec                   batch loss = 1099.2341048717499 | accuracy = 0.561875


Epoch[1] Batch[405] Speed: 1.2579139516059135 samples/sec                   batch loss = 1112.287341117859 | accuracy = 0.5641975308641975


Epoch[1] Batch[410] Speed: 1.2537824103848945 samples/sec                   batch loss = 1125.367031097412 | accuracy = 0.5652439024390243


Epoch[1] Batch[415] Speed: 1.258394105177951 samples/sec                   batch loss = 1138.4970915317535 | accuracy = 0.5644578313253013


Epoch[1] Batch[420] Speed: 1.2561424740873879 samples/sec                   batch loss = 1153.2082867622375 | accuracy = 0.5613095238095238


Epoch[1] Batch[425] Speed: 1.2591355763174785 samples/sec                   batch loss = 1166.8872122764587 | accuracy = 0.5617647058823529


Epoch[1] Batch[430] Speed: 1.25367597993128 samples/sec                   batch loss = 1180.1648969650269 | accuracy = 0.5616279069767441


Epoch[1] Batch[435] Speed: 1.253684504944469 samples/sec                   batch loss = 1193.8632109165192 | accuracy = 0.5614942528735632


Epoch[1] Batch[440] Speed: 1.252743986988416 samples/sec                   batch loss = 1206.9236192703247 | accuracy = 0.5630681818181819


Epoch[1] Batch[445] Speed: 1.2518811927288704 samples/sec                   batch loss = 1221.3596286773682 | accuracy = 0.5617977528089888


Epoch[1] Batch[450] Speed: 1.2551445526066152 samples/sec                   batch loss = 1234.0593950748444 | accuracy = 0.5627777777777778


Epoch[1] Batch[455] Speed: 1.2505052417386249 samples/sec                   batch loss = 1247.5762269496918 | accuracy = 0.5615384615384615


Epoch[1] Batch[460] Speed: 1.2550857736882888 samples/sec                   batch loss = 1262.29825258255 | accuracy = 0.5603260869565218


Epoch[1] Batch[465] Speed: 1.2594580879184565 samples/sec                   batch loss = 1275.360399723053 | accuracy = 0.560752688172043


Epoch[1] Batch[470] Speed: 1.25510474008587 samples/sec                   batch loss = 1287.8744690418243 | accuracy = 0.5622340425531915


Epoch[1] Batch[475] Speed: 1.2490503817866871 samples/sec                   batch loss = 1302.1079778671265 | accuracy = 0.5621052631578948


Epoch[1] Batch[480] Speed: 1.2577548618830214 samples/sec                   batch loss = 1314.5969319343567 | accuracy = 0.5630208333333333


Epoch[1] Batch[485] Speed: 1.2562374715755316 samples/sec                   batch loss = 1326.9412875175476 | accuracy = 0.5649484536082474


Epoch[1] Batch[490] Speed: 1.254859723343531 samples/sec                   batch loss = 1341.109011888504 | accuracy = 0.5642857142857143


Epoch[1] Batch[495] Speed: 1.2519080027731604 samples/sec                   batch loss = 1354.5120656490326 | accuracy = 0.5641414141414142


Epoch[1] Batch[500] Speed: 1.25502315109014 samples/sec                   batch loss = 1368.0746312141418 | accuracy = 0.5645


Epoch[1] Batch[505] Speed: 1.255240526187181 samples/sec                   batch loss = 1380.2713646888733 | accuracy = 0.5668316831683168


Epoch[1] Batch[510] Speed: 1.262820526813007 samples/sec                   batch loss = 1393.9925396442413 | accuracy = 0.5676470588235294


Epoch[1] Batch[515] Speed: 1.2552455036909818 samples/sec                   batch loss = 1407.1725659370422 | accuracy = 0.566990291262136


Epoch[1] Batch[520] Speed: 1.250187112083342 samples/sec                   batch loss = 1419.7942569255829 | accuracy = 0.56875


Epoch[1] Batch[525] Speed: 1.2551952608890173 samples/sec                   batch loss = 1433.2688927650452 | accuracy = 0.5685714285714286


Epoch[1] Batch[530] Speed: 1.2533766474963346 samples/sec                   batch loss = 1447.3955309391022 | accuracy = 0.5669811320754717


Epoch[1] Batch[535] Speed: 1.2505511947540884 samples/sec                   batch loss = 1460.899088382721 | accuracy = 0.5672897196261683


Epoch[1] Batch[540] Speed: 1.255868378723766 samples/sec                   batch loss = 1473.8879489898682 | accuracy = 0.5689814814814815


Epoch[1] Batch[545] Speed: 1.2541232780508078 samples/sec                   batch loss = 1486.6445806026459 | accuracy = 0.5692660550458716


Epoch[1] Batch[550] Speed: 1.2510506505312515 samples/sec                   batch loss = 1499.8246476650238 | accuracy = 0.5690909090909091


Epoch[1] Batch[555] Speed: 1.2564392713360142 samples/sec                   batch loss = 1513.4562203884125 | accuracy = 0.5689189189189189


Epoch[1] Batch[560] Speed: 1.2521147682710725 samples/sec                   batch loss = 1526.712675333023 | accuracy = 0.5700892857142857


Epoch[1] Batch[565] Speed: 1.2551284019399516 samples/sec                   batch loss = 1540.982438325882 | accuracy = 0.5690265486725664


Epoch[1] Batch[570] Speed: 1.2529532744885283 samples/sec                   batch loss = 1554.800045490265 | accuracy = 0.5697368421052632


Epoch[1] Batch[575] Speed: 1.2503478339867695 samples/sec                   batch loss = 1566.8686172962189 | accuracy = 0.5721739130434783


Epoch[1] Batch[580] Speed: 1.251961439457284 samples/sec                   batch loss = 1580.3327660560608 | accuracy = 0.5728448275862069


Epoch[1] Batch[585] Speed: 1.2540317869831277 samples/sec                   batch loss = 1592.8811922073364 | accuracy = 0.573076923076923


Epoch[1] Batch[590] Speed: 1.2547059092320172 samples/sec                   batch loss = 1606.672605752945 | accuracy = 0.5728813559322034


Epoch[1] Batch[595] Speed: 1.2543054562668936 samples/sec                   batch loss = 1619.0021815299988 | accuracy = 0.5743697478991596


Epoch[1] Batch[600] Speed: 1.2538839858722568 samples/sec                   batch loss = 1632.8556497097015 | accuracy = 0.5754166666666667


Epoch[1] Batch[605] Speed: 1.253103102534885 samples/sec                   batch loss = 1645.6601166725159 | accuracy = 0.5764462809917356


Epoch[1] Batch[610] Speed: 1.2515393952703953 samples/sec                   batch loss = 1658.5821566581726 | accuracy = 0.5774590163934427


Epoch[1] Batch[615] Speed: 1.260851642952635 samples/sec                   batch loss = 1671.9078905582428 | accuracy = 0.5772357723577236


Epoch[1] Batch[620] Speed: 1.2597054713326652 samples/sec                   batch loss = 1684.9910802841187 | accuracy = 0.5766129032258065


Epoch[1] Batch[625] Speed: 1.2548966105512132 samples/sec                   batch loss = 1698.550223350525 | accuracy = 0.5768


Epoch[1] Batch[630] Speed: 1.2555792743253846 samples/sec                   batch loss = 1711.1682260036469 | accuracy = 0.5773809523809523


Epoch[1] Batch[635] Speed: 1.2585184255017243 samples/sec                   batch loss = 1724.1220271587372 | accuracy = 0.5787401574803149


Epoch[1] Batch[640] Speed: 1.2557752229464463 samples/sec                   batch loss = 1736.9448697566986 | accuracy = 0.578515625


Epoch[1] Batch[645] Speed: 1.260573215692368 samples/sec                   batch loss = 1749.0625772476196 | accuracy = 0.5802325581395349


Epoch[1] Batch[650] Speed: 1.2640857056705084 samples/sec                   batch loss = 1761.2237675189972 | accuracy = 0.5803846153846154


Epoch[1] Batch[655] Speed: 1.2531218218604612 samples/sec                   batch loss = 1775.3179528713226 | accuracy = 0.5801526717557252


Epoch[1] Batch[660] Speed: 1.2528570889836694 samples/sec                   batch loss = 1787.9483952522278 | accuracy = 0.5814393939393939


Epoch[1] Batch[665] Speed: 1.2610067780966312 samples/sec                   batch loss = 1800.1215326786041 | accuracy = 0.5823308270676691


Epoch[1] Batch[670] Speed: 1.2598447145174716 samples/sec                   batch loss = 1811.6811270713806 | accuracy = 0.5843283582089552


Epoch[1] Batch[675] Speed: 1.2621742158664022 samples/sec                   batch loss = 1826.6395275592804 | accuracy = 0.5840740740740741


Epoch[1] Batch[680] Speed: 1.2564650536909656 samples/sec                   batch loss = 1838.736199259758 | accuracy = 0.5845588235294118


Epoch[1] Batch[685] Speed: 1.2598671363041245 samples/sec                   batch loss = 1852.3423076868057 | accuracy = 0.5846715328467154


Epoch[1] Batch[690] Speed: 1.2592643910029768 samples/sec                   batch loss = 1863.8589106798172 | accuracy = 0.5847826086956521


Epoch[1] Batch[695] Speed: 1.2552725519488872 samples/sec                   batch loss = 1875.7326880693436 | accuracy = 0.5859712230215828


Epoch[1] Batch[700] Speed: 1.2563205356078455 samples/sec                   batch loss = 1889.1460322141647 | accuracy = 0.585


Epoch[1] Batch[705] Speed: 1.2596573298656726 samples/sec                   batch loss = 1901.119700908661 | accuracy = 0.5854609929078014


Epoch[1] Batch[710] Speed: 1.2542370980719746 samples/sec                   batch loss = 1913.768298625946 | accuracy = 0.5869718309859155


Epoch[1] Batch[715] Speed: 1.2566337001495027 samples/sec                   batch loss = 1927.514529466629 | accuracy = 0.5874125874125874


Epoch[1] Batch[720] Speed: 1.2602663208847487 samples/sec                   batch loss = 1938.5233715772629 | accuracy = 0.5892361111111111


Epoch[1] Batch[725] Speed: 1.2542292218862685 samples/sec                   batch loss = 1951.7387391328812 | accuracy = 0.5889655172413794


Epoch[1] Batch[730] Speed: 1.2548016280424545 samples/sec                   batch loss = 1963.8369717597961 | accuracy = 0.5897260273972603


Epoch[1] Batch[735] Speed: 1.252567685901848 samples/sec                   batch loss = 1976.6273367404938 | accuracy = 0.5901360544217688


Epoch[1] Batch[740] Speed: 1.2501483586658302 samples/sec                   batch loss = 1989.7843890190125 | accuracy = 0.5898648648648649


Epoch[1] Batch[745] Speed: 1.2529889267586354 samples/sec                   batch loss = 2002.7552721500397 | accuracy = 0.5899328859060403


Epoch[1] Batch[750] Speed: 1.2496677738017976 samples/sec                   batch loss = 2016.3172348737717 | accuracy = 0.59


Epoch[1] Batch[755] Speed: 1.2537310666777264 samples/sec                   batch loss = 2029.3355828523636 | accuracy = 0.5903973509933775


Epoch[1] Batch[760] Speed: 1.2511495445634162 samples/sec                   batch loss = 2042.7709101438522 | accuracy = 0.5898026315789474


Epoch[1] Batch[765] Speed: 1.252670561142446 samples/sec                   batch loss = 2056.067029595375 | accuracy = 0.5888888888888889


Epoch[1] Batch[770] Speed: 1.2525257924641426 samples/sec                   batch loss = 2068.676416993141 | accuracy = 0.5892857142857143


Epoch[1] Batch[775] Speed: 1.2558409287638888 samples/sec                   batch loss = 2082.0615490674973 | accuracy = 0.5890322580645161


Epoch[1] Batch[780] Speed: 1.2566508308379656 samples/sec                   batch loss = 2093.696599841118 | accuracy = 0.5903846153846154


Epoch[1] Batch[785] Speed: 1.2579424354597688 samples/sec                   batch loss = 2106.4618190526962 | accuracy = 0.5907643312101911


[Epoch 1] training: accuracy=0.5910532994923858
[Epoch 1] time cost: 646.9665174484253
[Epoch 1] validation: validation accuracy=0.6811111111111111


Epoch[2] Batch[5] Speed: 1.253674387359251 samples/sec                   batch loss = 13.18952488899231 | accuracy = 0.6


Epoch[2] Batch[10] Speed: 1.2560398741296102 samples/sec                   batch loss = 24.16894829273224 | accuracy = 0.675


Epoch[2] Batch[15] Speed: 1.256778478584252 samples/sec                   batch loss = 35.99902594089508 | accuracy = 0.6666666666666666


Epoch[2] Batch[20] Speed: 1.2543539397772816 samples/sec                   batch loss = 49.412450671195984 | accuracy = 0.6125


Epoch[2] Batch[25] Speed: 1.251984048658472 samples/sec                   batch loss = 62.07201540470123 | accuracy = 0.64


Epoch[2] Batch[30] Speed: 1.25715036747028 samples/sec                   batch loss = 75.97901165485382 | accuracy = 0.6166666666666667


Epoch[2] Batch[35] Speed: 1.256001415219948 samples/sec                   batch loss = 90.17657673358917 | accuracy = 0.6142857142857143


Epoch[2] Batch[40] Speed: 1.2523746066458232 samples/sec                   batch loss = 103.27028501033783 | accuracy = 0.60625


Epoch[2] Batch[45] Speed: 1.2587014107897765 samples/sec                   batch loss = 117.4341367483139 | accuracy = 0.5833333333333334


Epoch[2] Batch[50] Speed: 1.253569380215542 samples/sec                   batch loss = 128.24632501602173 | accuracy = 0.6


Epoch[2] Batch[55] Speed: 1.2543736343432417 samples/sec                   batch loss = 141.50438451766968 | accuracy = 0.5954545454545455


Epoch[2] Batch[60] Speed: 1.2539417150700223 samples/sec                   batch loss = 153.40332734584808 | accuracy = 0.6


Epoch[2] Batch[65] Speed: 1.252771208183223 samples/sec                   batch loss = 166.01105165481567 | accuracy = 0.6


Epoch[2] Batch[70] Speed: 1.2547973109973263 samples/sec                   batch loss = 178.4030110836029 | accuracy = 0.6071428571428571


Epoch[2] Batch[75] Speed: 1.2541843107651838 samples/sec                   batch loss = 192.143217086792 | accuracy = 0.6133333333333333


Epoch[2] Batch[80] Speed: 1.253442758067519 samples/sec                   batch loss = 204.9400417804718 | accuracy = 0.615625


Epoch[2] Batch[85] Speed: 1.2508869499657627 samples/sec                   batch loss = 217.9151692390442 | accuracy = 0.611764705882353


Epoch[2] Batch[90] Speed: 1.2573426606967704 samples/sec                   batch loss = 230.86155605316162 | accuracy = 0.6111111111111112


Epoch[2] Batch[95] Speed: 1.2538915765833736 samples/sec                   batch loss = 245.10159301757812 | accuracy = 0.6105263157894737


Epoch[2] Batch[100] Speed: 1.2557143172566996 samples/sec                   batch loss = 259.4807789325714 | accuracy = 0.6075


Epoch[2] Batch[105] Speed: 1.255130937191519 samples/sec                   batch loss = 271.11769664287567 | accuracy = 0.6095238095238096


Epoch[2] Batch[110] Speed: 1.258887472051137 samples/sec                   batch loss = 283.6200045347214 | accuracy = 0.6113636363636363


Epoch[2] Batch[115] Speed: 1.254170622420039 samples/sec                   batch loss = 297.18721854686737 | accuracy = 0.6130434782608696


Epoch[2] Batch[120] Speed: 1.2564040810702326 samples/sec                   batch loss = 312.6936379671097 | accuracy = 0.6020833333333333


Epoch[2] Batch[125] Speed: 1.2532355537833713 samples/sec                   batch loss = 326.8694976568222 | accuracy = 0.602


Epoch[2] Batch[130] Speed: 1.253860745747937 samples/sec                   batch loss = 339.5746999979019 | accuracy = 0.5980769230769231


Epoch[2] Batch[135] Speed: 1.257309304298808 samples/sec                   batch loss = 352.46142995357513 | accuracy = 0.5981481481481481


Epoch[2] Batch[140] Speed: 1.2600798518477014 samples/sec                   batch loss = 364.3563965559006 | accuracy = 0.6035714285714285


Epoch[2] Batch[145] Speed: 1.2551843676531849 samples/sec                   batch loss = 376.7270576953888 | accuracy = 0.6051724137931035


Epoch[2] Batch[150] Speed: 1.2583548412960985 samples/sec                   batch loss = 390.97232818603516 | accuracy = 0.605


Epoch[2] Batch[155] Speed: 1.251653680532138 samples/sec                   batch loss = 403.4546980857849 | accuracy = 0.6048387096774194


Epoch[2] Batch[160] Speed: 1.2522342057584566 samples/sec                   batch loss = 414.8726638555527 | accuracy = 0.609375


Epoch[2] Batch[165] Speed: 1.253377115676993 samples/sec                   batch loss = 426.4612293243408 | accuracy = 0.6121212121212121


Epoch[2] Batch[170] Speed: 1.2539693632403581 samples/sec                   batch loss = 439.0776914358139 | accuracy = 0.611764705882353


Epoch[2] Batch[175] Speed: 1.2525339278008634 samples/sec                   batch loss = 451.6003714799881 | accuracy = 0.6128571428571429


Epoch[2] Batch[180] Speed: 1.2497529503309863 samples/sec                   batch loss = 464.35428416728973 | accuracy = 0.6138888888888889


Epoch[2] Batch[185] Speed: 1.2495248150407574 samples/sec                   batch loss = 478.37346732616425 | accuracy = 0.6135135135135135


Epoch[2] Batch[190] Speed: 1.2559862767841596 samples/sec                   batch loss = 489.1800550222397 | accuracy = 0.6171052631578947


Epoch[2] Batch[195] Speed: 1.2563705862847465 samples/sec                   batch loss = 503.8323343992233 | accuracy = 0.6153846153846154


Epoch[2] Batch[200] Speed: 1.255211976739262 samples/sec                   batch loss = 516.7650860548019 | accuracy = 0.61375


Epoch[2] Batch[205] Speed: 1.2527182635787737 samples/sec                   batch loss = 526.8808326721191 | accuracy = 0.6158536585365854


Epoch[2] Batch[210] Speed: 1.258197715909708 samples/sec                   batch loss = 539.9638472795486 | accuracy = 0.6190476190476191


Epoch[2] Batch[215] Speed: 1.2527322008586896 samples/sec                   batch loss = 551.9035805463791 | accuracy = 0.6232558139534884


Epoch[2] Batch[220] Speed: 1.2547815446500752 samples/sec                   batch loss = 564.3587559461594 | accuracy = 0.6238636363636364


Epoch[2] Batch[225] Speed: 1.2599508702838815 samples/sec                   batch loss = 575.8732032775879 | accuracy = 0.6266666666666667


Epoch[2] Batch[230] Speed: 1.2480896893065154 samples/sec                   batch loss = 588.7378830909729 | accuracy = 0.6239130434782608


Epoch[2] Batch[235] Speed: 1.2565375134193961 samples/sec                   batch loss = 601.6043426990509 | accuracy = 0.625531914893617


Epoch[2] Batch[240] Speed: 1.2566192053163663 samples/sec                   batch loss = 612.5851808786392 | accuracy = 0.628125


Epoch[2] Batch[245] Speed: 1.2516519063367297 samples/sec                   batch loss = 625.6728117465973 | accuracy = 0.6275510204081632


Epoch[2] Batch[250] Speed: 1.2542626025901755 samples/sec                   batch loss = 638.5445177555084 | accuracy = 0.629


Epoch[2] Batch[255] Speed: 1.2559358806462302 samples/sec                   batch loss = 649.7662930488586 | accuracy = 0.6303921568627451


Epoch[2] Batch[260] Speed: 1.2505777614253049 samples/sec                   batch loss = 663.605508685112 | accuracy = 0.6288461538461538


Epoch[2] Batch[265] Speed: 1.2532785245492863 samples/sec                   batch loss = 676.9363330602646 | accuracy = 0.629245283018868


Epoch[2] Batch[270] Speed: 1.2592248837706939 samples/sec                   batch loss = 690.1786807775497 | accuracy = 0.6277777777777778


Epoch[2] Batch[275] Speed: 1.259292274393529 samples/sec                   batch loss = 703.9084483385086 | accuracy = 0.6254545454545455


Epoch[2] Batch[280] Speed: 1.2611775941007386 samples/sec                   batch loss = 714.9660156965256 | accuracy = 0.6276785714285714


Epoch[2] Batch[285] Speed: 1.253489208192336 samples/sec                   batch loss = 727.2207703590393 | accuracy = 0.6263157894736842


Epoch[2] Batch[290] Speed: 1.2590029143474806 samples/sec                   batch loss = 740.9228081703186 | accuracy = 0.6267241379310344


Epoch[2] Batch[295] Speed: 1.2579429070576085 samples/sec                   batch loss = 751.1406599283218 | accuracy = 0.6305084745762712


Epoch[2] Batch[300] Speed: 1.2633816806367055 samples/sec                   batch loss = 761.829264998436 | accuracy = 0.6325


Epoch[2] Batch[305] Speed: 1.258111007051015 samples/sec                   batch loss = 774.8489507436752 | accuracy = 0.6295081967213115


Epoch[2] Batch[310] Speed: 1.2536019764469026 samples/sec                   batch loss = 787.9135746955872 | accuracy = 0.6282258064516129


Epoch[2] Batch[315] Speed: 1.2514794598738794 samples/sec                   batch loss = 801.2969257831573 | accuracy = 0.6277777777777778


Epoch[2] Batch[320] Speed: 1.2547456964541719 samples/sec                   batch loss = 811.2489376068115 | accuracy = 0.63125


Epoch[2] Batch[325] Speed: 1.2575787501985438 samples/sec                   batch loss = 823.0805965662003 | accuracy = 0.6330769230769231


Epoch[2] Batch[330] Speed: 1.2605314480026766 samples/sec                   batch loss = 834.7189205884933 | accuracy = 0.6356060606060606


Epoch[2] Batch[335] Speed: 1.2560899024459635 samples/sec                   batch loss = 845.3835970163345 | accuracy = 0.6373134328358209


Epoch[2] Batch[340] Speed: 1.2568421238732803 samples/sec                   batch loss = 856.7872579097748 | accuracy = 0.6389705882352941


Epoch[2] Batch[345] Speed: 1.259078974237206 samples/sec                   batch loss = 867.6717076301575 | accuracy = 0.6405797101449275


Epoch[2] Batch[350] Speed: 1.260223626909503 samples/sec                   batch loss = 883.4064898490906 | accuracy = 0.6392857142857142


Epoch[2] Batch[355] Speed: 1.260324449926956 samples/sec                   batch loss = 894.0072598457336 | accuracy = 0.6408450704225352


Epoch[2] Batch[360] Speed: 1.2601205485032758 samples/sec                   batch loss = 905.7552194595337 | accuracy = 0.6416666666666667


Epoch[2] Batch[365] Speed: 1.2535951385912019 samples/sec                   batch loss = 916.8822581768036 | accuracy = 0.6438356164383562


Epoch[2] Batch[370] Speed: 1.2489559102725654 samples/sec                   batch loss = 930.2526692152023 | accuracy = 0.6418918918918919


Epoch[2] Batch[375] Speed: 1.2521438311644937 samples/sec                   batch loss = 942.5355437994003 | accuracy = 0.6406666666666667


Epoch[2] Batch[380] Speed: 1.2530449825930041 samples/sec                   batch loss = 953.7622812986374 | accuracy = 0.6401315789473684


Epoch[2] Batch[385] Speed: 1.2525623555544008 samples/sec                   batch loss = 964.7356986999512 | accuracy = 0.6402597402597403


Epoch[2] Batch[390] Speed: 1.2466660083058974 samples/sec                   batch loss = 977.3152797222137 | accuracy = 0.6397435897435897


Epoch[2] Batch[395] Speed: 1.2522085032796813 samples/sec                   batch loss = 988.1107642650604 | accuracy = 0.639873417721519


Epoch[2] Batch[400] Speed: 1.2534235609366913 samples/sec                   batch loss = 1000.9985009431839 | accuracy = 0.640625


Epoch[2] Batch[405] Speed: 1.2553782203774362 samples/sec                   batch loss = 1011.7688771486282 | accuracy = 0.641358024691358


Epoch[2] Batch[410] Speed: 1.2528024532120487 samples/sec                   batch loss = 1025.1264979839325 | accuracy = 0.6402439024390244


Epoch[2] Batch[415] Speed: 1.2497880483004793 samples/sec                   batch loss = 1040.287781715393 | accuracy = 0.6385542168674698


Epoch[2] Batch[420] Speed: 1.2552720823511332 samples/sec                   batch loss = 1051.251152396202 | accuracy = 0.6404761904761904


Epoch[2] Batch[425] Speed: 1.2566323824235817 samples/sec                   batch loss = 1063.1128352880478 | accuracy = 0.6411764705882353


Epoch[2] Batch[430] Speed: 1.2530403032830846 samples/sec                   batch loss = 1074.0183666944504 | accuracy = 0.6430232558139535


Epoch[2] Batch[435] Speed: 1.2549658855189931 samples/sec                   batch loss = 1083.93161714077 | accuracy = 0.6448275862068965


Epoch[2] Batch[440] Speed: 1.2507576987666735 samples/sec                   batch loss = 1095.3807294368744 | accuracy = 0.6443181818181818


Epoch[2] Batch[445] Speed: 1.2569241378845333 samples/sec                   batch loss = 1107.5260653495789 | accuracy = 0.6449438202247191


Epoch[2] Batch[450] Speed: 1.2563584495701425 samples/sec                   batch loss = 1119.8585501909256 | accuracy = 0.6444444444444445


Epoch[2] Batch[455] Speed: 1.2613581292233 samples/sec                   batch loss = 1132.302607536316 | accuracy = 0.6450549450549451


Epoch[2] Batch[460] Speed: 1.2539368416206302 samples/sec                   batch loss = 1142.9119851589203 | accuracy = 0.6456521739130435


Epoch[2] Batch[465] Speed: 1.2493400223086646 samples/sec                   batch loss = 1156.6538434028625 | accuracy = 0.6456989247311828


Epoch[2] Batch[470] Speed: 1.2559560950292192 samples/sec                   batch loss = 1167.704374074936 | accuracy = 0.6462765957446809


Epoch[2] Batch[475] Speed: 1.2591852845112672 samples/sec                   batch loss = 1177.9657413959503 | accuracy = 0.6473684210526316


Epoch[2] Batch[480] Speed: 1.2550437116439357 samples/sec                   batch loss = 1190.2430064678192 | accuracy = 0.6463541666666667


Epoch[2] Batch[485] Speed: 1.2615232541151242 samples/sec                   batch loss = 1202.4863401651382 | accuracy = 0.6463917525773196


Epoch[2] Batch[490] Speed: 1.2585557169478094 samples/sec                   batch loss = 1212.6519919633865 | accuracy = 0.6479591836734694


Epoch[2] Batch[495] Speed: 1.2614524945603052 samples/sec                   batch loss = 1226.0022379159927 | accuracy = 0.6484848484848484


Epoch[2] Batch[500] Speed: 1.263134182770159 samples/sec                   batch loss = 1237.8128658533096 | accuracy = 0.6485


Epoch[2] Batch[505] Speed: 1.262482420702308 samples/sec                   batch loss = 1249.4066120386124 | accuracy = 0.6485148514851485


Epoch[2] Batch[510] Speed: 1.2572999761313175 samples/sec                   batch loss = 1261.0043110847473 | accuracy = 0.6490196078431373


Epoch[2] Batch[515] Speed: 1.2533681266694583 samples/sec                   batch loss = 1275.1218540668488 | accuracy = 0.6485436893203883


Epoch[2] Batch[520] Speed: 1.2561614724355958 samples/sec                   batch loss = 1286.127022266388 | accuracy = 0.6485576923076923


Epoch[2] Batch[525] Speed: 1.2554418178641302 samples/sec                   batch loss = 1298.6187957525253 | accuracy = 0.6490476190476191


Epoch[2] Batch[530] Speed: 1.2515818763251307 samples/sec                   batch loss = 1309.3898527622223 | accuracy = 0.65


Epoch[2] Batch[535] Speed: 1.2515578811771608 samples/sec                   batch loss = 1323.237259030342 | accuracy = 0.6495327102803738


Epoch[2] Batch[540] Speed: 1.2532130865821876 samples/sec                   batch loss = 1333.5642412900925 | accuracy = 0.6504629629629629


Epoch[2] Batch[545] Speed: 1.2540729375635644 samples/sec                   batch loss = 1345.3874994516373 | accuracy = 0.6509174311926605


Epoch[2] Batch[550] Speed: 1.2561135074150322 samples/sec                   batch loss = 1360.0649267435074 | accuracy = 0.65


Epoch[2] Batch[555] Speed: 1.2537298487211366 samples/sec                   batch loss = 1371.6860704421997 | accuracy = 0.65


Epoch[2] Batch[560] Speed: 1.255804268017831 samples/sec                   batch loss = 1383.8272792100906 | accuracy = 0.65


Epoch[2] Batch[565] Speed: 1.264804818333938 samples/sec                   batch loss = 1394.4253907203674 | accuracy = 0.6517699115044248


Epoch[2] Batch[570] Speed: 1.2651148817656515 samples/sec                   batch loss = 1404.527575135231 | accuracy = 0.6526315789473685


Epoch[2] Batch[575] Speed: 1.2564675943452572 samples/sec                   batch loss = 1415.9902037382126 | accuracy = 0.6526086956521739


Epoch[2] Batch[580] Speed: 1.2508225075369344 samples/sec                   batch loss = 1424.843516945839 | accuracy = 0.6543103448275862


Epoch[2] Batch[585] Speed: 1.2544848733593748 samples/sec                   batch loss = 1439.2815304994583 | accuracy = 0.6542735042735043


Epoch[2] Batch[590] Speed: 1.2544080544267167 samples/sec                   batch loss = 1451.2906247377396 | accuracy = 0.6546610169491526


Epoch[2] Batch[595] Speed: 1.2586694931161562 samples/sec                   batch loss = 1462.6390116214752 | accuracy = 0.6546218487394958


Epoch[2] Batch[600] Speed: 1.251287275884148 samples/sec                   batch loss = 1474.0765188932419 | accuracy = 0.655


Epoch[2] Batch[605] Speed: 1.2521597182038782 samples/sec                   batch loss = 1487.6000027656555 | accuracy = 0.6553719008264463


Epoch[2] Batch[610] Speed: 1.2530245810577416 samples/sec                   batch loss = 1496.0834082365036 | accuracy = 0.6569672131147541


Epoch[2] Batch[615] Speed: 1.2566142169121752 samples/sec                   batch loss = 1506.9822891950607 | accuracy = 0.656910569105691


Epoch[2] Batch[620] Speed: 1.2627417332388002 samples/sec                   batch loss = 1522.8092230558395 | accuracy = 0.6560483870967742


Epoch[2] Batch[625] Speed: 1.2503713168199793 samples/sec                   batch loss = 1535.5059022903442 | accuracy = 0.656


Epoch[2] Batch[630] Speed: 1.253494640084983 samples/sec                   batch loss = 1548.4323838949203 | accuracy = 0.6571428571428571


Epoch[2] Batch[635] Speed: 1.2517414627959202 samples/sec                   batch loss = 1563.1078935861588 | accuracy = 0.6562992125984252


Epoch[2] Batch[640] Speed: 1.252378533088509 samples/sec                   batch loss = 1575.5577669143677 | accuracy = 0.656640625


Epoch[2] Batch[645] Speed: 1.256803709969066 samples/sec                   batch loss = 1585.596308350563 | accuracy = 0.6573643410852713


Epoch[2] Batch[650] Speed: 1.2552163905493006 samples/sec                   batch loss = 1601.6847803592682 | accuracy = 0.6576923076923077


Epoch[2] Batch[655] Speed: 1.2491907206807686 samples/sec                   batch loss = 1613.5087258815765 | accuracy = 0.6576335877862596


Epoch[2] Batch[660] Speed: 1.2519733045147619 samples/sec                   batch loss = 1628.3943541049957 | accuracy = 0.6571969696969697


Epoch[2] Batch[665] Speed: 1.2579000874153203 samples/sec                   batch loss = 1639.2412583827972 | accuracy = 0.6578947368421053


Epoch[2] Batch[670] Speed: 1.2557012533798415 samples/sec                   batch loss = 1649.0767549276352 | accuracy = 0.6585820895522388


Epoch[2] Batch[675] Speed: 1.2577569362973844 samples/sec                   batch loss = 1660.8556226491928 | accuracy = 0.6585185185185185


Epoch[2] Batch[680] Speed: 1.2544139632413664 samples/sec                   batch loss = 1674.8609133958817 | accuracy = 0.6573529411764706


Epoch[2] Batch[685] Speed: 1.250287826309035 samples/sec                   batch loss = 1687.7004119157791 | accuracy = 0.6576642335766424


Epoch[2] Batch[690] Speed: 1.2531082502936666 samples/sec                   batch loss = 1699.2643607854843 | accuracy = 0.6579710144927536


Epoch[2] Batch[695] Speed: 1.2529018115351362 samples/sec                   batch loss = 1713.1179047822952 | accuracy = 0.6568345323741007


Epoch[2] Batch[700] Speed: 1.2566628790744392 samples/sec                   batch loss = 1726.1534000635147 | accuracy = 0.6578571428571428


Epoch[2] Batch[705] Speed: 1.2610816584397124 samples/sec                   batch loss = 1739.1272438764572 | accuracy = 0.6578014184397163


Epoch[2] Batch[710] Speed: 1.258178938984169 samples/sec                   batch loss = 1752.916844010353 | accuracy = 0.6566901408450704


Epoch[2] Batch[715] Speed: 1.2641987692437067 samples/sec                   batch loss = 1765.929161310196 | accuracy = 0.6562937062937063


Epoch[2] Batch[720] Speed: 1.2579456423320514 samples/sec                   batch loss = 1776.7682412862778 | accuracy = 0.6572916666666667


Epoch[2] Batch[725] Speed: 1.2546062645751597 samples/sec                   batch loss = 1787.2413322925568 | accuracy = 0.6579310344827586


Epoch[2] Batch[730] Speed: 1.257057963285119 samples/sec                   batch loss = 1800.0193405151367 | accuracy = 0.6582191780821918


Epoch[2] Batch[735] Speed: 1.262511301840314 samples/sec                   batch loss = 1809.0207809209824 | accuracy = 0.6598639455782312


Epoch[2] Batch[740] Speed: 1.2547181078762482 samples/sec                   batch loss = 1818.556238770485 | accuracy = 0.6597972972972973


Epoch[2] Batch[745] Speed: 1.2561859266009612 samples/sec                   batch loss = 1828.8329590559006 | accuracy = 0.6590604026845638


Epoch[2] Batch[750] Speed: 1.2567541896260872 samples/sec                   batch loss = 1840.5689169168472 | accuracy = 0.6593333333333333


Epoch[2] Batch[755] Speed: 1.259980960466081 samples/sec                   batch loss = 1851.2495937347412 | accuracy = 0.659933774834437


Epoch[2] Batch[760] Speed: 1.2590112285215915 samples/sec                   batch loss = 1860.7630738019943 | accuracy = 0.6615131578947369


Epoch[2] Batch[765] Speed: 1.2612619764593525 samples/sec                   batch loss = 1871.855698466301 | accuracy = 0.661437908496732


Epoch[2] Batch[770] Speed: 1.2548761486770865 samples/sec                   batch loss = 1882.5230585336685 | accuracy = 0.6616883116883117


Epoch[2] Batch[775] Speed: 1.2560484313222418 samples/sec                   batch loss = 1893.5525740385056 | accuracy = 0.6616129032258065


Epoch[2] Batch[780] Speed: 1.2642890822277697 samples/sec                   batch loss = 1904.4639892578125 | accuracy = 0.6621794871794872


Epoch[2] Batch[785] Speed: 1.2621558897806109 samples/sec                   batch loss = 1913.9351794719696 | accuracy = 0.663375796178344


[Epoch 2] training: accuracy=0.6637055837563451
[Epoch 2] time cost: 643.5884056091309
[Epoch 2] validation: validation accuracy=0.7555555555555555
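
The per-epoch summary lines at the end of each epoch are the easiest part of this log to compare across runs. The following is a minimal sketch, not part of the original training script, that extracts those lines with a regular expression. The `LOG` string is copied from the output above; the `PATTERN` and `summarize` names are hypothetical helpers introduced only for illustration.

```python
import re

# Epoch summary lines copied from the training output above.
LOG = """\
[Epoch 1] training: accuracy=0.5910532994923858
[Epoch 1] time cost: 646.9665174484253
[Epoch 1] validation: validation accuracy=0.6811111111111111
[Epoch 2] training: accuracy=0.6637055837563451
[Epoch 2] time cost: 643.5884056091309
[Epoch 2] validation: validation accuracy=0.7555555555555555
"""

# Hypothetical pattern: epoch number, metric name, then the numeric value.
PATTERN = re.compile(
    r"\[Epoch (\d+)\] (training|time cost|validation): "
    r"(?:validation )?(?:accuracy=)?([\d.]+)"
)


def summarize(log_text):
    """Return {epoch: {metric_name: value}} for the epoch summary lines."""
    summary = {}
    for line in log_text.splitlines():
        match = PATTERN.match(line)
        if match:
            epoch = int(match.group(1))
            summary.setdefault(epoch, {})[match.group(2)] = float(match.group(3))
    return summary


summary = summarize(LOG)
for epoch, stats in sorted(summary.items()):
    print(
        f"epoch {epoch}: train acc {stats['training']:.3f}, "
        f"val acc {stats['validation']:.3f}, "
        f"time {stats['time cost']:.1f}"
    )
```

Run against the output above, this makes it easier to see that both training and validation accuracy rose from the first epoch to the second while the time per epoch stayed roughly constant.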


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).