<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or run inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You will also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the context, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[03:20:43] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet reports which GPU the ndarray is allocated on, as in the `ctx=gpu(0)` shown above.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. In the example code here, you will copy `x` to the second GPU, `npx.gpu(1)`. Copying to `npx.gpu(1)` fails on a machine with only one GPU, so the code falls back to `npx.cpu()` in that case:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[03:20:43] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several functions, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet addresses each of them by a 0-based index through `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
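
The indexing scheme described above can be sketched in plain Python. The helper `select_gpu_ids` below is hypothetical (it is not part of MXNet) and only computes which 0-based ids you would pass to `npx.gpu`. The GPU counts are made-up example values.

```python
def select_gpu_ids(num_available, num_requested):
    """Return the 0-based ids of the first `num_requested` GPUs.

    Hypothetical helper for illustration: the result is capped at
    `num_available`, so it never names a GPU that does not exist.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, the ids run from 0 to 7.
all_ids = select_gpu_ids(num_available=8, num_requested=8)
print(all_ids[0], all_ids[-1])   # 0 7

# Requesting three GPUs selects the first three ids.
print(select_gpu_ids(num_available=8, num_requested=3))   # [0, 1, 2]

# Requesting more GPUs than exist returns only the valid ids.
print(select_gpu_ids(num_available=1, num_requested=2))   # [0]
```

With MXNet imported, one way to turn these ids into devices would be `[npx.gpu(i) for i in ids]`, falling back to `[npx.cpu()]` when the list is empty.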

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to ensure that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md), as the following code example shows.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[03:20:43] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 7.9173703, -1.7567377]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism with *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the data.
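
To make the idea concrete, here is a minimal sketch in plain NumPy (not MXNet) that simulates two devices on the CPU. The linear model, the squared-error loss, and all sizes are assumptions chosen only for illustration. It checks that summing the per-shard gradients and dividing by the full batch size gives the same result as computing the gradient on the whole batch at once.

```python
import numpy as onp  # plain NumPy, named onp to avoid clashing with mxnet.np

rng = onp.random.default_rng(0)
batch_size, num_features, num_devices = 4, 3, 2

# Illustrative data and weights for a linear model (assumed, not from the tutorial).
X = rng.normal(size=(batch_size, num_features))
y = rng.normal(size=batch_size)
w = rng.normal(size=num_features)


def summed_gradient(X_part, y_part, w):
    """Gradient of sum_i 0.5 * (x_i . w - y_i)**2 with respect to w."""
    residual = X_part @ w - y_part
    return X_part.T @ residual


# Split along the batch axis: one shard per simulated device.
X_shards = onp.split(X, num_devices)
y_shards = onp.split(y, num_devices)

# Each simulated device computes the gradient of its own shard independently.
shard_grads = [summed_gradient(Xs, ys, w) for Xs, ys in zip(X_shards, y_shards)]

# Aggregate across devices, then normalize by the full batch size.
parallel_grad = sum(shard_grads) / batch_size
full_grad = summed_gradient(X, y, w) / batch_size

print(onp.allclose(parallel_grad, full_grad))  # True
```

The normalization by the full batch size plays a role analogous to the `batch_size` argument passed to `trainer.step` in the training loop later in this section.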

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# Mean and std for normalizing image values in the range (0, 1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function copies the data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7723953320428477 samples/sec                   batch loss = 13.909246444702148 | accuracy = 0.55


Epoch[1] Batch[10] Speed: 1.260931148499055 samples/sec                   batch loss = 28.489954471588135 | accuracy = 0.55


Epoch[1] Batch[15] Speed: 1.2572326101422602 samples/sec                   batch loss = 43.4683883190155 | accuracy = 0.4666666666666667


Epoch[1] Batch[20] Speed: 1.257121448492178 samples/sec                   batch loss = 56.3475604057312 | accuracy = 0.5125


Epoch[1] Batch[25] Speed: 1.2608817761752853 samples/sec                   batch loss = 70.33314418792725 | accuracy = 0.51


Epoch[1] Batch[30] Speed: 1.2560644175988702 samples/sec                   batch loss = 83.76938486099243 | accuracy = 0.525


Epoch[1] Batch[35] Speed: 1.2553237402105164 samples/sec                   batch loss = 98.76546239852905 | accuracy = 0.5142857142857142


Epoch[1] Batch[40] Speed: 1.2501074652368867 samples/sec                   batch loss = 114.2679386138916 | accuracy = 0.49375


Epoch[1] Batch[45] Speed: 1.2523804963190848 samples/sec                   batch loss = 127.83131408691406 | accuracy = 0.5055555555555555


Epoch[1] Batch[50] Speed: 1.252775791937207 samples/sec                   batch loss = 142.459814786911 | accuracy = 0.5


Epoch[1] Batch[55] Speed: 1.2488698197450796 samples/sec                   batch loss = 156.5645182132721 | accuracy = 0.5045454545454545


Epoch[1] Batch[60] Speed: 1.2471042384942232 samples/sec                   batch loss = 170.61870646476746 | accuracy = 0.5125


Epoch[1] Batch[65] Speed: 1.255914726756164 samples/sec                   batch loss = 184.35284996032715 | accuracy = 0.5115384615384615


Epoch[1] Batch[70] Speed: 1.2566344531369842 samples/sec                   batch loss = 198.80197286605835 | accuracy = 0.5


Epoch[1] Batch[75] Speed: 1.2542987982825367 samples/sec                   batch loss = 212.55842208862305 | accuracy = 0.49333333333333335


Epoch[1] Batch[80] Speed: 1.2513578329389807 samples/sec                   batch loss = 226.6861288547516 | accuracy = 0.490625


Epoch[1] Batch[85] Speed: 1.2451886402732923 samples/sec                   batch loss = 240.17520761489868 | accuracy = 0.4970588235294118


Epoch[1] Batch[90] Speed: 1.247715067997625 samples/sec                   batch loss = 254.36226558685303 | accuracy = 0.49722222222222223


Epoch[1] Batch[95] Speed: 1.253860183497538 samples/sec                   batch loss = 268.28449034690857 | accuracy = 0.49736842105263157


Epoch[1] Batch[100] Speed: 1.2543115516666608 samples/sec                   batch loss = 282.49228262901306 | accuracy = 0.4925


Epoch[1] Batch[105] Speed: 1.2497052872795391 samples/sec                   batch loss = 295.8616361618042 | accuracy = 0.5023809523809524


Epoch[1] Batch[110] Speed: 1.255561515099061 samples/sec                   batch loss = 309.86733198165894 | accuracy = 0.5


Epoch[1] Batch[115] Speed: 1.259938948156399 samples/sec                   batch loss = 323.20802211761475 | accuracy = 0.5065217391304347


Epoch[1] Batch[120] Speed: 1.2533490254565454 samples/sec                   batch loss = 336.9459686279297 | accuracy = 0.5125


Epoch[1] Batch[125] Speed: 1.2526263227181396 samples/sec                   batch loss = 351.24992084503174 | accuracy = 0.506


Epoch[1] Batch[130] Speed: 1.2504325440686646 samples/sec                   batch loss = 364.8566997051239 | accuracy = 0.5057692307692307


Epoch[1] Batch[135] Speed: 1.2502384454289532 samples/sec                   batch loss = 378.2908458709717 | accuracy = 0.5148148148148148


Epoch[1] Batch[140] Speed: 1.2487271364257584 samples/sec                   batch loss = 392.4264805316925 | accuracy = 0.5125


Epoch[1] Batch[145] Speed: 1.246330940916859 samples/sec                   batch loss = 405.56981658935547 | accuracy = 0.5189655172413793


Epoch[1] Batch[150] Speed: 1.2473120166189158 samples/sec                   batch loss = 419.89360189437866 | accuracy = 0.5183333333333333


Epoch[1] Batch[155] Speed: 1.2572551274808212 samples/sec                   batch loss = 433.53905177116394 | accuracy = 0.5241935483870968


Epoch[1] Batch[160] Speed: 1.2537804427565835 samples/sec                   batch loss = 447.2552936077118 | accuracy = 0.525


Epoch[1] Batch[165] Speed: 1.251863537960928 samples/sec                   batch loss = 461.7283687591553 | accuracy = 0.5196969696969697


Epoch[1] Batch[170] Speed: 1.249068980225029 samples/sec                   batch loss = 475.2056956291199 | accuracy = 0.5235294117647059


Epoch[1] Batch[175] Speed: 1.2522380378547648 samples/sec                   batch loss = 488.9727876186371 | accuracy = 0.5242857142857142


Epoch[1] Batch[180] Speed: 1.2514304515497077 samples/sec                   batch loss = 502.886926651001 | accuracy = 0.5236111111111111


Epoch[1] Batch[185] Speed: 1.2503870657220977 samples/sec                   batch loss = 516.4094350337982 | accuracy = 0.5256756756756756


Epoch[1] Batch[190] Speed: 1.2470105245679168 samples/sec                   batch loss = 529.4596126079559 | accuracy = 0.531578947368421


Epoch[1] Batch[195] Speed: 1.247210019598125 samples/sec                   batch loss = 543.2951550483704 | accuracy = 0.5333333333333333


Epoch[1] Batch[200] Speed: 1.2487185857575123 samples/sec                   batch loss = 556.8107674121857 | accuracy = 0.535


Epoch[1] Batch[205] Speed: 1.2542494751350879 samples/sec                   batch loss = 570.4804360866547 | accuracy = 0.5390243902439025


Epoch[1] Batch[210] Speed: 1.2538765826746336 samples/sec                   batch loss = 584.3314139842987 | accuracy = 0.5392857142857143


Epoch[1] Batch[215] Speed: 1.2495407287535751 samples/sec                   batch loss = 597.3812029361725 | accuracy = 0.5406976744186046


Epoch[1] Batch[220] Speed: 1.2592453931966947 samples/sec                   batch loss = 611.2027847766876 | accuracy = 0.5397727272727273


Epoch[1] Batch[225] Speed: 1.2540616888451166 samples/sec                   batch loss = 625.0891108512878 | accuracy = 0.5433333333333333


Epoch[1] Batch[230] Speed: 1.2559995346490678 samples/sec                   batch loss = 639.2816207408905 | accuracy = 0.5413043478260869


Epoch[1] Batch[235] Speed: 1.2599385696798564 samples/sec                   batch loss = 652.7107481956482 | accuracy = 0.5436170212765957


Epoch[1] Batch[240] Speed: 1.2550672773080969 samples/sec                   batch loss = 666.1662838459015 | accuracy = 0.5479166666666667


Epoch[1] Batch[245] Speed: 1.2570095530597003 samples/sec                   batch loss = 680.3857688903809 | accuracy = 0.5459183673469388


Epoch[1] Batch[250] Speed: 1.2568738547758156 samples/sec                   batch loss = 694.3688242435455 | accuracy = 0.544


Epoch[1] Batch[255] Speed: 1.2548798092385849 samples/sec                   batch loss = 708.1061956882477 | accuracy = 0.542156862745098


Epoch[1] Batch[260] Speed: 1.2504508108931593 samples/sec                   batch loss = 721.4873740673065 | accuracy = 0.5442307692307692


Epoch[1] Batch[265] Speed: 1.2547342479854546 samples/sec                   batch loss = 735.6136820316315 | accuracy = 0.5424528301886793


Epoch[1] Batch[270] Speed: 1.2532482855549625 samples/sec                   batch loss = 749.0624489784241 | accuracy = 0.5444444444444444


Epoch[1] Batch[275] Speed: 1.2489654869503137 samples/sec                   batch loss = 762.8412461280823 | accuracy = 0.5454545454545454


Epoch[1] Batch[280] Speed: 1.246109051001056 samples/sec                   batch loss = 775.8416006565094 | accuracy = 0.5464285714285714


Epoch[1] Batch[285] Speed: 1.2523298282433504 samples/sec                   batch loss = 788.9446432590485 | accuracy = 0.5482456140350878


Epoch[1] Batch[290] Speed: 1.2495714405733909 samples/sec                   batch loss = 802.3874003887177 | accuracy = 0.5491379310344827


Epoch[1] Batch[295] Speed: 1.24830504054428 samples/sec                   batch loss = 815.8599305152893 | accuracy = 0.55


Epoch[1] Batch[300] Speed: 1.254545378409146 samples/sec                   batch loss = 829.0662317276001 | accuracy = 0.5508333333333333


Epoch[1] Batch[305] Speed: 1.2534648589072634 samples/sec                   batch loss = 843.2330496311188 | accuracy = 0.5483606557377049


Epoch[1] Batch[310] Speed: 1.255042209480208 samples/sec                   batch loss = 856.9209897518158 | accuracy = 0.5483870967741935


Epoch[1] Batch[315] Speed: 1.2597518192468258 samples/sec                   batch loss = 869.7664880752563 | accuracy = 0.5515873015873016


Epoch[1] Batch[320] Speed: 1.256620523014652 samples/sec                   batch loss = 883.0736219882965 | accuracy = 0.55234375


Epoch[1] Batch[325] Speed: 1.2592698730635983 samples/sec                   batch loss = 896.2030136585236 | accuracy = 0.5546153846153846


Epoch[1] Batch[330] Speed: 1.26072951391639 samples/sec                   batch loss = 909.8766412734985 | accuracy = 0.5553030303030303


Epoch[1] Batch[335] Speed: 1.2538513749737865 samples/sec                   batch loss = 924.8250563144684 | accuracy = 0.5507462686567164


Epoch[1] Batch[340] Speed: 1.2624821356976619 samples/sec                   batch loss = 939.2527034282684 | accuracy = 0.549264705882353


Epoch[1] Batch[345] Speed: 1.2610880094601806 samples/sec                   batch loss = 952.7966649532318 | accuracy = 0.5485507246376812


Epoch[1] Batch[350] Speed: 1.2548970798680807 samples/sec                   batch loss = 967.0430948734283 | accuracy = 0.5464285714285714


Epoch[1] Batch[355] Speed: 1.252950748026686 samples/sec                   batch loss = 980.3775796890259 | accuracy = 0.5471830985915493


Epoch[1] Batch[360] Speed: 1.254845269405685 samples/sec                   batch loss = 993.6165118217468 | accuracy = 0.5493055555555556


Epoch[1] Batch[365] Speed: 1.261120618692082 samples/sec                   batch loss = 1007.0176742076874 | accuracy = 0.547945205479452


Epoch[1] Batch[370] Speed: 1.2594103435165074 samples/sec                   batch loss = 1020.9034202098846 | accuracy = 0.5472972972972973


Epoch[1] Batch[375] Speed: 1.2506832935254846 samples/sec                   batch loss = 1034.8004612922668 | accuracy = 0.5486666666666666


Epoch[1] Batch[380] Speed: 1.2511626072082356 samples/sec                   batch loss = 1047.8291635513306 | accuracy = 0.55


Epoch[1] Batch[385] Speed: 1.2575039082406483 samples/sec                   batch loss = 1060.812330007553 | accuracy = 0.5506493506493506


Epoch[1] Batch[390] Speed: 1.2564645832005576 samples/sec                   batch loss = 1074.974805355072 | accuracy = 0.55


Epoch[1] Batch[395] Speed: 1.2572455174719805 samples/sec                   batch loss = 1088.862330198288 | accuracy = 0.5512658227848102


Epoch[1] Batch[400] Speed: 1.2544881564293766 samples/sec                   batch loss = 1102.8231358528137 | accuracy = 0.55125


Epoch[1] Batch[405] Speed: 1.257838315314571 samples/sec                   batch loss = 1116.3432619571686 | accuracy = 0.5512345679012346


Epoch[1] Batch[410] Speed: 1.254249193835486 samples/sec                   batch loss = 1129.4341838359833 | accuracy = 0.5530487804878049


Epoch[1] Batch[415] Speed: 1.2573972220100813 samples/sec                   batch loss = 1143.0098023414612 | accuracy = 0.5548192771084337


Epoch[1] Batch[420] Speed: 1.2554360872503751 samples/sec                   batch loss = 1156.5914959907532 | accuracy = 0.5541666666666667


Epoch[1] Batch[425] Speed: 1.2568961709398738 samples/sec                   batch loss = 1169.9483988285065 | accuracy = 0.5547058823529412


Epoch[1] Batch[430] Speed: 1.259667260515387 samples/sec                   batch loss = 1181.941812992096 | accuracy = 0.5569767441860465


Epoch[1] Batch[435] Speed: 1.2595745805401124 samples/sec                   batch loss = 1194.4890582561493 | accuracy = 0.5597701149425287


Epoch[1] Batch[440] Speed: 1.257997048840055 samples/sec                   batch loss = 1207.7860674858093 | accuracy = 0.5613636363636364


Epoch[1] Batch[445] Speed: 1.259345019912663 samples/sec                   batch loss = 1220.360933303833 | accuracy = 0.5646067415730337


Epoch[1] Batch[450] Speed: 1.2560589634116988 samples/sec                   batch loss = 1233.8217039108276 | accuracy = 0.5655555555555556


Epoch[1] Batch[455] Speed: 1.244848639721535 samples/sec                   batch loss = 1246.9489495754242 | accuracy = 0.5659340659340659


Epoch[1] Batch[460] Speed: 1.2439669823960584 samples/sec                   batch loss = 1260.636982679367 | accuracy = 0.5657608695652174


Epoch[1] Batch[465] Speed: 1.2467330804626282 samples/sec                   batch loss = 1273.5360565185547 | accuracy = 0.5672043010752689


Epoch[1] Batch[470] Speed: 1.2442586069629973 samples/sec                   batch loss = 1286.9438898563385 | accuracy = 0.5675531914893617


Epoch[1] Batch[475] Speed: 1.241446876347652 samples/sec                   batch loss = 1300.1354660987854 | accuracy = 0.5689473684210526


Epoch[1] Batch[480] Speed: 1.2487790005966553 samples/sec                   batch loss = 1312.4968328475952 | accuracy = 0.5692708333333333


Epoch[1] Batch[485] Speed: 1.2500988956546342 samples/sec                   batch loss = 1325.7244865894318 | accuracy = 0.5706185567010309


Epoch[1] Batch[490] Speed: 1.2450822777506283 samples/sec                   batch loss = 1338.7300601005554 | accuracy = 0.5719387755102041


Epoch[1] Batch[495] Speed: 1.250362091316024 samples/sec                   batch loss = 1351.6821944713593 | accuracy = 0.5747474747474748


Epoch[1] Batch[500] Speed: 1.2525468323448417 samples/sec                   batch loss = 1366.3468263149261 | accuracy = 0.573


Epoch[1] Batch[505] Speed: 1.2498332037601936 samples/sec                   batch loss = 1379.908641576767 | accuracy = 0.5727722772277227


Epoch[1] Batch[510] Speed: 1.2506805897423148 samples/sec                   batch loss = 1393.8737177848816 | accuracy = 0.5715686274509804


Epoch[1] Batch[515] Speed: 1.252906396244899 samples/sec                   batch loss = 1405.110417842865 | accuracy = 0.574757281553398


Epoch[1] Batch[520] Speed: 1.2504502516967355 samples/sec                   batch loss = 1417.651980638504 | accuracy = 0.5754807692307692


Epoch[1] Batch[525] Speed: 1.2468198032298883 samples/sec                   batch loss = 1430.2220072746277 | accuracy = 0.5776190476190476


Epoch[1] Batch[530] Speed: 1.2504584532944092 samples/sec                   batch loss = 1443.5577855110168 | accuracy = 0.5778301886792453


Epoch[1] Batch[535] Speed: 1.2469817921596424 samples/sec                   batch loss = 1454.6659395694733 | accuracy = 0.5794392523364486


Epoch[1] Batch[540] Speed: 1.2483918892553663 samples/sec                   batch loss = 1466.3521645069122 | accuracy = 0.5814814814814815


Epoch[1] Batch[545] Speed: 1.248533009345533 samples/sec                   batch loss = 1479.1430654525757 | accuracy = 0.5830275229357799


Epoch[1] Batch[550] Speed: 1.251546864257887 samples/sec                   batch loss = 1491.12602186203 | accuracy = 0.5859090909090909


Epoch[1] Batch[555] Speed: 1.247467362310806 samples/sec                   batch loss = 1504.1607847213745 | accuracy = 0.586036036036036


Epoch[1] Batch[560] Speed: 1.2495236052429117 samples/sec                   batch loss = 1516.9646754264832 | accuracy = 0.5866071428571429


Epoch[1] Batch[565] Speed: 1.249430364795904 samples/sec                   batch loss = 1531.302169561386 | accuracy = 0.5858407079646017


Epoch[1] Batch[570] Speed: 1.2510342319034304 samples/sec                   batch loss = 1544.073152065277 | accuracy = 0.5859649122807018


Epoch[1] Batch[575] Speed: 1.2544871246055243 samples/sec                   batch loss = 1557.5953710079193 | accuracy = 0.5856521739130435


Epoch[1] Batch[580] Speed: 1.243247593309062 samples/sec                   batch loss = 1571.8465323448181 | accuracy = 0.5853448275862069


Epoch[1] Batch[585] Speed: 1.2545808398950524 samples/sec                   batch loss = 1583.9330307245255 | accuracy = 0.5871794871794872


Epoch[1] Batch[590] Speed: 1.2562642803967938 samples/sec                   batch loss = 1596.5528024435043 | accuracy = 0.5877118644067797


Epoch[1] Batch[595] Speed: 1.2535569229081758 samples/sec                   batch loss = 1609.0300668478012 | accuracy = 0.5894957983193277


Epoch[1] Batch[600] Speed: 1.2544395687481158 samples/sec                   batch loss = 1622.1296766996384 | accuracy = 0.5891666666666666


Epoch[1] Batch[605] Speed: 1.2565706406502897 samples/sec                   batch loss = 1636.3640052080154 | accuracy = 0.5884297520661157


Epoch[1] Batch[610] Speed: 1.2565099400961712 samples/sec                   batch loss = 1648.5631619691849 | accuracy = 0.5897540983606557


Epoch[1] Batch[615] Speed: 1.250165126676602 samples/sec                   batch loss = 1661.9299358129501 | accuracy = 0.5894308943089431


Epoch[1] Batch[620] Speed: 1.2474037354565617 samples/sec                   batch loss = 1675.5389965772629 | accuracy = 0.5899193548387097


Epoch[1] Batch[625] Speed: 1.2501219033857547 samples/sec                   batch loss = 1689.213469862938 | accuracy = 0.5888


Epoch[1] Batch[630] Speed: 1.2492558322675418 samples/sec                   batch loss = 1701.211027622223 | accuracy = 0.5892857142857143


Epoch[1] Batch[635] Speed: 1.249266808889286 samples/sec                   batch loss = 1714.9627108573914 | accuracy = 0.5881889763779528


Epoch[1] Batch[640] Speed: 1.253123974618759 samples/sec                   batch loss = 1728.7767550945282 | accuracy = 0.5875


Epoch[1] Batch[645] Speed: 1.2479235138431666 samples/sec                   batch loss = 1742.5289492607117 | accuracy = 0.5868217054263566


Epoch[1] Batch[650] Speed: 1.2559277950752274 samples/sec                   batch loss = 1755.8708329200745 | accuracy = 0.5865384615384616


Epoch[1] Batch[655] Speed: 1.2515802890690264 samples/sec                   batch loss = 1768.1870141029358 | accuracy = 0.5877862595419847


Epoch[1] Batch[660] Speed: 1.2523087956522894 samples/sec                   batch loss = 1781.7318050861359 | accuracy = 0.5882575757575758


Epoch[1] Batch[665] Speed: 1.256123946548822 samples/sec                   batch loss = 1791.962931394577 | accuracy = 0.5902255639097744


Epoch[1] Batch[670] Speed: 1.2518924957782658 samples/sec                   batch loss = 1803.7282083034515 | accuracy = 0.5906716417910448


Epoch[1] Batch[675] Speed: 1.252225980850452 samples/sec                   batch loss = 1816.1220541000366 | accuracy = 0.5914814814814815


Epoch[1] Batch[680] Speed: 1.245442190869186 samples/sec                   batch loss = 1826.9316720962524 | accuracy = 0.5930147058823529


Epoch[1] Batch[685] Speed: 1.2543529081742055 samples/sec                   batch loss = 1839.3708817958832 | accuracy = 0.5927007299270073


Epoch[1] Batch[690] Speed: 1.2564275096642272 samples/sec                   batch loss = 1850.9253692626953 | accuracy = 0.5934782608695652


Epoch[1] Batch[695] Speed: 1.2520807542508403 samples/sec                   batch loss = 1863.3925927877426 | accuracy = 0.5935251798561151


Epoch[1] Batch[700] Speed: 1.2527107806034155 samples/sec                   batch loss = 1875.6311081647873 | accuracy = 0.5932142857142857


Epoch[1] Batch[705] Speed: 1.251554426698549 samples/sec                   batch loss = 1888.6703375577927 | accuracy = 0.5929078014184397


Epoch[1] Batch[710] Speed: 1.2538921388619273 samples/sec                   batch loss = 1901.9747956991196 | accuracy = 0.5922535211267606


Epoch[1] Batch[715] Speed: 1.2532742179780123 samples/sec                   batch loss = 1915.0056343078613 | accuracy = 0.5926573426573427


Epoch[1] Batch[720] Speed: 1.2558780617013923 samples/sec                   batch loss = 1927.3088096380234 | accuracy = 0.5930555555555556


Epoch[1] Batch[725] Speed: 1.2552648505900912 samples/sec                   batch loss = 1939.639752626419 | accuracy = 0.5944827586206897


Epoch[1] Batch[730] Speed: 1.2482030667888293 samples/sec                   batch loss = 1952.1973210573196 | accuracy = 0.5955479452054795


Epoch[1] Batch[735] Speed: 1.2438509610380213 samples/sec                   batch loss = 1964.97600543499 | accuracy = 0.595578231292517


Epoch[1] Batch[740] Speed: 1.2473795292762384 samples/sec                   batch loss = 1980.7381793260574 | accuracy = 0.5949324324324324


Epoch[1] Batch[745] Speed: 1.2474015095616775 samples/sec                   batch loss = 1994.0854657888412 | accuracy = 0.5946308724832214


Epoch[1] Batch[750] Speed: 1.2504363651459718 samples/sec                   batch loss = 2008.4352096319199 | accuracy = 0.593


Epoch[1] Batch[755] Speed: 1.241812411391456 samples/sec                   batch loss = 2022.9580799341202 | accuracy = 0.5927152317880795


Epoch[1] Batch[760] Speed: 1.2470786534726943 samples/sec                   batch loss = 2035.370556473732 | accuracy = 0.593421052631579


Epoch[1] Batch[765] Speed: 1.2464802079515602 samples/sec                   batch loss = 2049.58679831028 | accuracy = 0.5937908496732026


Epoch[1] Batch[770] Speed: 1.2377789439521403 samples/sec                   batch loss = 2060.6147295236588 | accuracy = 0.5948051948051948


Epoch[1] Batch[775] Speed: 1.2466036673170688 samples/sec                   batch loss = 2073.854999780655 | accuracy = 0.5958064516129032


Epoch[1] Batch[780] Speed: 1.2472026949995674 samples/sec                   batch loss = 2087.500764608383 | accuracy = 0.5955128205128205


Epoch[1] Batch[785] Speed: 1.2474387015978714 samples/sec                   batch loss = 2100.043662071228 | accuracy = 0.5955414012738853


[Epoch 1] training: accuracy=0.5961294416243654
[Epoch 1] time cost: 647.6567876338959
[Epoch 1] validation: validation accuracy=0.6711111111111111


Epoch[2] Batch[5] Speed: 1.2450905938872827 samples/sec                   batch loss = 11.91425621509552 | accuracy = 0.65


Epoch[2] Batch[10] Speed: 1.2463690876512987 samples/sec                   batch loss = 26.30911648273468 | accuracy = 0.55


Epoch[2] Batch[15] Speed: 1.2466196919684531 samples/sec                   batch loss = 38.781744837760925 | accuracy = 0.5833333333333334


Epoch[2] Batch[20] Speed: 1.2546984963256211 samples/sec                   batch loss = 49.28114199638367 | accuracy = 0.6375


Epoch[2] Batch[25] Speed: 1.2453967974481015 samples/sec                   batch loss = 62.41603755950928 | accuracy = 0.64


Epoch[2] Batch[30] Speed: 1.2492849486446025 samples/sec                   batch loss = 73.07273721694946 | accuracy = 0.6416666666666667


Epoch[2] Batch[35] Speed: 1.2468862432545118 samples/sec                   batch loss = 86.48186433315277 | accuracy = 0.6214285714285714


Epoch[2] Batch[40] Speed: 1.2483412647408514 samples/sec                   batch loss = 97.38442099094391 | accuracy = 0.64375


Epoch[2] Batch[45] Speed: 1.2538769575179818 samples/sec                   batch loss = 110.88504374027252 | accuracy = 0.6277777777777778


Epoch[2] Batch[50] Speed: 1.2506648334453752 samples/sec                   batch loss = 122.10698997974396 | accuracy = 0.635


Epoch[2] Batch[55] Speed: 1.2530808272661857 samples/sec                   batch loss = 137.26574337482452 | accuracy = 0.6181818181818182


Epoch[2] Batch[60] Speed: 1.25606404144651 samples/sec                   batch loss = 149.96273005008698 | accuracy = 0.6125


Epoch[2] Batch[65] Speed: 1.2520694478017198 samples/sec                   batch loss = 161.25914132595062 | accuracy = 0.6192307692307693


Epoch[2] Batch[70] Speed: 1.2479927637138335 samples/sec                   batch loss = 172.49572491645813 | accuracy = 0.6321428571428571


Epoch[2] Batch[75] Speed: 1.253301743095649 samples/sec                   batch loss = 186.4948754310608 | accuracy = 0.63


Epoch[2] Batch[80] Speed: 1.249988898733502 samples/sec                   batch loss = 198.29641199111938 | accuracy = 0.634375


Epoch[2] Batch[85] Speed: 1.2537583307869147 samples/sec                   batch loss = 212.1231211423874 | accuracy = 0.638235294117647


Epoch[2] Batch[90] Speed: 1.2501888821285352 samples/sec                   batch loss = 225.45155894756317 | accuracy = 0.6388888888888888


Epoch[2] Batch[95] Speed: 1.253662771073903 samples/sec                   batch loss = 237.7812637090683 | accuracy = 0.6421052631578947


Epoch[2] Batch[100] Speed: 1.2504553776826737 samples/sec                   batch loss = 249.7281070947647 | accuracy = 0.6425


Epoch[2] Batch[105] Speed: 1.2516369659433346 samples/sec                   batch loss = 261.53814792633057 | accuracy = 0.6428571428571429


Epoch[2] Batch[110] Speed: 1.243391515134986 samples/sec                   batch loss = 272.9194393157959 | accuracy = 0.6477272727272727


Epoch[2] Batch[115] Speed: 1.24270712450814 samples/sec                   batch loss = 288.20023012161255 | accuracy = 0.6391304347826087


Epoch[2] Batch[120] Speed: 1.2471627357300128 samples/sec                   batch loss = 297.1488571166992 | accuracy = 0.6458333333333334


Epoch[2] Batch[125] Speed: 1.244536704792306 samples/sec                   batch loss = 308.7320293188095 | accuracy = 0.65


Epoch[2] Batch[130] Speed: 1.24797261919205 samples/sec                   batch loss = 322.2534729242325 | accuracy = 0.6480769230769231


Epoch[2] Batch[135] Speed: 1.2521868205544675 samples/sec                   batch loss = 336.26467084884644 | accuracy = 0.6444444444444445


Epoch[2] Batch[140] Speed: 1.2523943325948355 samples/sec                   batch loss = 348.1449626684189 | accuracy = 0.6464285714285715


Epoch[2] Batch[145] Speed: 1.2494472066013336 samples/sec                   batch loss = 361.1749209165573 | accuracy = 0.6431034482758621


Epoch[2] Batch[150] Speed: 1.2491203147888594 samples/sec                   batch loss = 374.4726023674011 | accuracy = 0.6433333333333333


Epoch[2] Batch[155] Speed: 1.2507259029626026 samples/sec                   batch loss = 385.56494641304016 | accuracy = 0.646774193548387


Epoch[2] Batch[160] Speed: 1.2484089817617214 samples/sec                   batch loss = 398.0341410636902 | accuracy = 0.6484375


Epoch[2] Batch[165] Speed: 1.25384031764044 samples/sec                   batch loss = 407.9894493818283 | accuracy = 0.6530303030303031


Epoch[2] Batch[170] Speed: 1.2458746574297233 samples/sec                   batch loss = 420.045401930809 | accuracy = 0.6514705882352941


Epoch[2] Batch[175] Speed: 1.2542246274902882 samples/sec                   batch loss = 431.93448281288147 | accuracy = 0.6528571428571428


Epoch[2] Batch[180] Speed: 1.2557124375454458 samples/sec                   batch loss = 442.1126352548599 | accuracy = 0.6583333333333333


Epoch[2] Batch[185] Speed: 1.2614000465547388 samples/sec                   batch loss = 454.1940186023712 | accuracy = 0.6621621621621622


Epoch[2] Batch[190] Speed: 1.2561592151760228 samples/sec                   batch loss = 465.57686722278595 | accuracy = 0.6631578947368421


Epoch[2] Batch[195] Speed: 1.2592194021017742 samples/sec                   batch loss = 477.23020017147064 | accuracy = 0.6641025641025641


Epoch[2] Batch[200] Speed: 1.259841403350788 samples/sec                   batch loss = 488.39728713035583 | accuracy = 0.66375


Epoch[2] Batch[205] Speed: 1.2590134015624712 samples/sec                   batch loss = 502.5257215499878 | accuracy = 0.6621951219512195


Epoch[2] Batch[210] Speed: 1.255909555913643 samples/sec                   batch loss = 515.3315043449402 | accuracy = 0.6619047619047619


Epoch[2] Batch[215] Speed: 1.2554684987874636 samples/sec                   batch loss = 528.4942826032639 | accuracy = 0.663953488372093


Epoch[2] Batch[220] Speed: 1.2583810798686297 samples/sec                   batch loss = 540.4363669157028 | accuracy = 0.6636363636363637


Epoch[2] Batch[225] Speed: 1.249999888241301 samples/sec                   batch loss = 552.4639018774033 | accuracy = 0.6633333333333333


Epoch[2] Batch[230] Speed: 1.2477584034074365 samples/sec                   batch loss = 564.8165255784988 | accuracy = 0.6641304347826087


Epoch[2] Batch[235] Speed: 1.252969462800647 samples/sec                   batch loss = 576.1973514556885 | accuracy = 0.6680851063829787


Epoch[2] Batch[240] Speed: 1.2469807726468995 samples/sec                   batch loss = 588.0769052505493 | accuracy = 0.6666666666666666


Epoch[2] Batch[245] Speed: 1.2425402626149873 samples/sec                   batch loss = 599.7063717842102 | accuracy = 0.6673469387755102


Epoch[2] Batch[250] Speed: 1.244180636357906 samples/sec                   batch loss = 611.274956703186 | accuracy = 0.67


Epoch[2] Batch[255] Speed: 1.2472223510786473 samples/sec                   batch loss = 623.0158624649048 | accuracy = 0.6705882352941176


Epoch[2] Batch[260] Speed: 1.250742313498323 samples/sec                   batch loss = 634.3284604549408 | accuracy = 0.6721153846153847


Epoch[2] Batch[265] Speed: 1.2538111759419623 samples/sec                   batch loss = 645.292189002037 | accuracy = 0.6754716981132075


Epoch[2] Batch[270] Speed: 1.24451944120261 samples/sec                   batch loss = 657.4293661117554 | accuracy = 0.6768518518518518


Epoch[2] Batch[275] Speed: 1.2475672677126506 samples/sec                   batch loss = 667.0395359992981 | accuracy = 0.68


Epoch[2] Batch[280] Speed: 1.2460280720709935 samples/sec                   batch loss = 675.6421916484833 | accuracy = 0.6839285714285714


Epoch[2] Batch[285] Speed: 1.2517427702843957 samples/sec                   batch loss = 689.2915726900101 | accuracy = 0.6833333333333333


Epoch[2] Batch[290] Speed: 1.2447311609689622 samples/sec                   batch loss = 702.0373878479004 | accuracy = 0.6818965517241379


Epoch[2] Batch[295] Speed: 1.2481093733540523 samples/sec                   batch loss = 713.8555694818497 | accuracy = 0.6822033898305084


Epoch[2] Batch[300] Speed: 1.241495840779783 samples/sec                   batch loss = 725.7620936632156 | accuracy = 0.6808333333333333


Epoch[2] Batch[305] Speed: 1.236212779712843 samples/sec                   batch loss = 736.3716354370117 | accuracy = 0.680327868852459


Epoch[2] Batch[310] Speed: 1.2436204581789545 samples/sec                   batch loss = 750.4834862947464 | accuracy = 0.6774193548387096


Epoch[2] Batch[315] Speed: 1.2433011224396104 samples/sec                   batch loss = 759.0912055373192 | accuracy = 0.6793650793650794


Epoch[2] Batch[320] Speed: 1.235126495216698 samples/sec                   batch loss = 770.1443120837212 | accuracy = 0.67890625


Epoch[2] Batch[325] Speed: 1.239704862858353 samples/sec                   batch loss = 782.1372727751732 | accuracy = 0.6776923076923077


Epoch[2] Batch[330] Speed: 1.2400732222877102 samples/sec                   batch loss = 796.0954156517982 | accuracy = 0.6765151515151515


Epoch[2] Batch[335] Speed: 1.2444116238431746 samples/sec                   batch loss = 806.3950467705727 | accuracy = 0.6783582089552239


Epoch[2] Batch[340] Speed: 1.2453632398772394 samples/sec                   batch loss = 818.6293070912361 | accuracy = 0.6772058823529412


Epoch[2] Batch[345] Speed: 1.2395222304345372 samples/sec                   batch loss = 828.8770948052406 | accuracy = 0.6797101449275362


Epoch[2] Batch[350] Speed: 1.2387741278003856 samples/sec                   batch loss = 837.9871248602867 | accuracy = 0.6821428571428572


Epoch[2] Batch[355] Speed: 1.2434858840191034 samples/sec                   batch loss = 849.2380422949791 | accuracy = 0.6809859154929577


Epoch[2] Batch[360] Speed: 1.2442476258960367 samples/sec                   batch loss = 863.07503002882 | accuracy = 0.68125


Epoch[2] Batch[365] Speed: 1.2471243549910833 samples/sec                   batch loss = 874.4210661053658 | accuracy = 0.6808219178082192


Epoch[2] Batch[370] Speed: 1.2440818261168691 samples/sec                   batch loss = 886.6943266987801 | accuracy = 0.6804054054054054


Epoch[2] Batch[375] Speed: 1.2437296136829994 samples/sec                   batch loss = 899.120009958744 | accuracy = 0.6806666666666666


Epoch[2] Batch[380] Speed: 1.2395344103606218 samples/sec                   batch loss = 909.0826231837273 | accuracy = 0.680921052631579


Epoch[2] Batch[385] Speed: 1.2389834396195172 samples/sec                   batch loss = 920.8764888644218 | accuracy = 0.6818181818181818


Epoch[2] Batch[390] Speed: 1.2350998535596949 samples/sec                   batch loss = 934.0637273192406 | accuracy = 0.6807692307692308


Epoch[2] Batch[395] Speed: 1.243479156075637 samples/sec                   batch loss = 944.0948703885078 | accuracy = 0.6829113924050633


Epoch[2] Batch[400] Speed: 1.243288131297599 samples/sec                   batch loss = 956.8980986475945 | accuracy = 0.6825


Epoch[2] Batch[405] Speed: 1.244555076731986 samples/sec                   batch loss = 971.0039778351784 | accuracy = 0.6814814814814815


Epoch[2] Batch[410] Speed: 1.2430544292570214 samples/sec                   batch loss = 984.4797181487083 | accuracy = 0.6817073170731708


Epoch[2] Batch[415] Speed: 1.2418645300683606 samples/sec                   batch loss = 997.5073329806328 | accuracy = 0.6813253012048193


Epoch[2] Batch[420] Speed: 1.2480256276014456 samples/sec                   batch loss = 1010.0559374690056 | accuracy = 0.6815476190476191


Epoch[2] Batch[425] Speed: 1.2478549213471781 samples/sec                   batch loss = 1022.0139098763466 | accuracy = 0.6823529411764706


Epoch[2] Batch[430] Speed: 1.2422951590727476 samples/sec                   batch loss = 1033.8359737992287 | accuracy = 0.6825581395348838


Epoch[2] Batch[435] Speed: 1.2461731937833418 samples/sec                   batch loss = 1044.3904810547829 | accuracy = 0.6821839080459771


Epoch[2] Batch[440] Speed: 1.2493246718973257 samples/sec                   batch loss = 1052.1325773596764 | accuracy = 0.6846590909090909


Epoch[2] Batch[445] Speed: 1.2420890488720249 samples/sec                   batch loss = 1064.7140607237816 | accuracy = 0.6842696629213483


Epoch[2] Batch[450] Speed: 1.2429991716162123 samples/sec                   batch loss = 1079.4259260296822 | accuracy = 0.6833333333333333


Epoch[2] Batch[455] Speed: 1.239798123234214 samples/sec                   batch loss = 1088.759877026081 | accuracy = 0.682967032967033


Epoch[2] Batch[460] Speed: 1.2484920357101184 samples/sec                   batch loss = 1102.7794317603111 | accuracy = 0.6820652173913043


Epoch[2] Batch[465] Speed: 1.240595347956745 samples/sec                   batch loss = 1115.5075325369835 | accuracy = 0.6811827956989247


Epoch[2] Batch[470] Speed: 1.250374112454116 samples/sec                   batch loss = 1128.5620958209038 | accuracy = 0.6803191489361702


Epoch[2] Batch[475] Speed: 1.2442144993137516 samples/sec                   batch loss = 1141.9518867135048 | accuracy = 0.6789473684210526


Epoch[2] Batch[480] Speed: 1.2443626137375212 samples/sec                   batch loss = 1153.1026688218117 | accuracy = 0.6786458333333333


Epoch[2] Batch[485] Speed: 1.248725928172054 samples/sec                   batch loss = 1166.3055084347725 | accuracy = 0.677319587628866


Epoch[2] Batch[490] Speed: 1.242169608572422 samples/sec                   batch loss = 1176.634338080883 | accuracy = 0.6785714285714286


Epoch[2] Batch[495] Speed: 1.2401452703891391 samples/sec                   batch loss = 1190.4290263056755 | accuracy = 0.6777777777777778


Epoch[2] Batch[500] Speed: 1.2423129129496921 samples/sec                   batch loss = 1201.5238071084023 | accuracy = 0.679


Epoch[2] Batch[505] Speed: 1.2467189057728354 samples/sec                   batch loss = 1216.5292111039162 | accuracy = 0.6777227722772278


Epoch[2] Batch[510] Speed: 1.2398629007304358 samples/sec                   batch loss = 1228.9021300673485 | accuracy = 0.6774509803921569


Epoch[2] Batch[515] Speed: 1.241996271023979 samples/sec                   batch loss = 1239.3723732829094 | accuracy = 0.6771844660194175


Epoch[2] Batch[520] Speed: 1.2407293889192468 samples/sec                   batch loss = 1252.870046555996 | accuracy = 0.676923076923077


Epoch[2] Batch[525] Speed: 1.2401470121135456 samples/sec                   batch loss = 1264.1979249119759 | accuracy = 0.6780952380952381


Epoch[2] Batch[530] Speed: 1.2369412841993974 samples/sec                   batch loss = 1273.7719258666039 | accuracy = 0.680188679245283


Epoch[2] Batch[535] Speed: 1.243347561081803 samples/sec                   batch loss = 1285.3141068816185 | accuracy = 0.680841121495327


Epoch[2] Batch[540] Speed: 1.244333541770301 samples/sec                   batch loss = 1296.6255944371223 | accuracy = 0.6814814814814815


Epoch[2] Batch[545] Speed: 1.2458793758935358 samples/sec                   batch loss = 1308.6226795315742 | accuracy = 0.681651376146789


Epoch[2] Batch[550] Speed: 1.2463363109469887 samples/sec                   batch loss = 1323.7080736756325 | accuracy = 0.6804545454545454


Epoch[2] Batch[555] Speed: 1.2460319588191628 samples/sec                   batch loss = 1334.9985145926476 | accuracy = 0.6810810810810811


Epoch[2] Batch[560] Speed: 1.2470448198208075 samples/sec                   batch loss = 1347.9229585528374 | accuracy = 0.6808035714285714


Epoch[2] Batch[565] Speed: 1.2474474202514239 samples/sec                   batch loss = 1362.1288015246391 | accuracy = 0.6787610619469027


Epoch[2] Batch[570] Speed: 1.2496665637271092 samples/sec                   batch loss = 1375.620289504528 | accuracy = 0.6780701754385965


Epoch[2] Batch[575] Speed: 1.2524872679369254 samples/sec                   batch loss = 1384.305327951908 | accuracy = 0.6804347826086956


Epoch[2] Batch[580] Speed: 1.2469032945559848 samples/sec                   batch loss = 1394.349872291088 | accuracy = 0.6818965517241379


Epoch[2] Batch[585] Speed: 1.2515371545915313 samples/sec                   batch loss = 1404.4790942072868 | accuracy = 0.6829059829059829


Epoch[2] Batch[590] Speed: 1.2436933800440673 samples/sec                   batch loss = 1413.988814651966 | accuracy = 0.684322033898305


Epoch[2] Batch[595] Speed: 1.248238727847061 samples/sec                   batch loss = 1425.9431629776955 | accuracy = 0.6840336134453782


Epoch[2] Batch[600] Speed: 1.2483601207315445 samples/sec                   batch loss = 1439.2064512372017 | accuracy = 0.68375


Epoch[2] Batch[605] Speed: 1.2476191283569906 samples/sec                   batch loss = 1450.3117948174477 | accuracy = 0.6830578512396694


Epoch[2] Batch[610] Speed: 1.243619259787362 samples/sec                   batch loss = 1465.210454761982 | accuracy = 0.6819672131147541


Epoch[2] Batch[615] Speed: 1.2529477537147329 samples/sec                   batch loss = 1475.2207971215248 | accuracy = 0.6821138211382114


Epoch[2] Batch[620] Speed: 1.2484686234115847 samples/sec                   batch loss = 1486.4597508311272 | accuracy = 0.682258064516129


Epoch[2] Batch[625] Speed: 1.2516845896908086 samples/sec                   batch loss = 1495.6815556883812 | accuracy = 0.6832


Epoch[2] Batch[630] Speed: 1.2470302672804046 samples/sec                   batch loss = 1507.388451397419 | accuracy = 0.6837301587301587


Epoch[2] Batch[635] Speed: 1.2447656080223692 samples/sec                   batch loss = 1518.5053834319115 | accuracy = 0.684251968503937


Epoch[2] Batch[640] Speed: 1.2463830691994084 samples/sec                   batch loss = 1531.5516577363014 | accuracy = 0.683984375


Epoch[2] Batch[645] Speed: 1.2518923089493441 samples/sec                   batch loss = 1543.2964399456978 | accuracy = 0.6844961240310078


Epoch[2] Batch[650] Speed: 1.2500335098845405 samples/sec                   batch loss = 1554.9420972466469 | accuracy = 0.6842307692307692


Epoch[2] Batch[655] Speed: 1.243559988377678 samples/sec                   batch loss = 1568.1276852488518 | accuracy = 0.684351145038168


Epoch[2] Batch[660] Speed: 1.2374346717104554 samples/sec                   batch loss = 1579.4759972691536 | accuracy = 0.6848484848484848


Epoch[2] Batch[665] Speed: 1.2397550641461716 samples/sec                   batch loss = 1591.3793569207191 | accuracy = 0.6845864661654135


Epoch[2] Batch[670] Speed: 1.2403995222412518 samples/sec                   batch loss = 1604.513236105442 | accuracy = 0.6835820895522388


Epoch[2] Batch[675] Speed: 1.2501271198312303 samples/sec                   batch loss = 1618.2114875912666 | accuracy = 0.6833333333333333


Epoch[2] Batch[680] Speed: 1.2474827598719178 samples/sec                   batch loss = 1629.7933982014656 | accuracy = 0.6834558823529412


Epoch[2] Batch[685] Speed: 1.2469670557284604 samples/sec                   batch loss = 1644.5530344843864 | accuracy = 0.6824817518248175


Epoch[2] Batch[690] Speed: 1.251353166221373 samples/sec                   batch loss = 1655.7955912947655 | accuracy = 0.6818840579710145


Epoch[2] Batch[695] Speed: 1.2502404019556468 samples/sec                   batch loss = 1663.9283441901207 | accuracy = 0.6830935251798561


Epoch[2] Batch[700] Speed: 1.2551093409320695 samples/sec                   batch loss = 1675.2448465824127 | accuracy = 0.6825


Epoch[2] Batch[705] Speed: 1.2529954772651128 samples/sec                   batch loss = 1687.3763303756714 | accuracy = 0.6826241134751773


Epoch[2] Batch[710] Speed: 1.2413157111240936 samples/sec                   batch loss = 1701.863625884056 | accuracy = 0.6816901408450704


Epoch[2] Batch[715] Speed: 1.2559407696466447 samples/sec                   batch loss = 1713.1471276283264 | accuracy = 0.6814685314685315


Epoch[2] Batch[720] Speed: 1.2513318864303207 samples/sec                   batch loss = 1723.4543622732162 | accuracy = 0.6819444444444445


Epoch[2] Batch[725] Speed: 1.2475068772978235 samples/sec                   batch loss = 1740.6860550642014 | accuracy = 0.6803448275862068


Epoch[2] Batch[730] Speed: 1.2464015882890722 samples/sec                   batch loss = 1751.3830674886703 | accuracy = 0.6808219178082192


Epoch[2] Batch[735] Speed: 1.2482913873541832 samples/sec                   batch loss = 1765.4181698560715 | accuracy = 0.6799319727891157


Epoch[2] Batch[740] Speed: 1.241987076749327 samples/sec                   batch loss = 1782.599929690361 | accuracy = 0.677027027027027


Epoch[2] Batch[745] Speed: 1.2448791213226935 samples/sec                   batch loss = 1792.3077948093414 | accuracy = 0.6775167785234899


Epoch[2] Batch[750] Speed: 1.2531035705112084 samples/sec                   batch loss = 1802.869700908661 | accuracy = 0.678


Epoch[2] Batch[755] Speed: 1.2595812000860682 samples/sec                   batch loss = 1813.5100693702698 | accuracy = 0.6781456953642384


Epoch[2] Batch[760] Speed: 1.2538114570451235 samples/sec                   batch loss = 1823.2221837043762 | accuracy = 0.6786184210526316


Epoch[2] Batch[765] Speed: 1.2523732978370656 samples/sec                   batch loss = 1834.0818058252335 | accuracy = 0.6787581699346406


Epoch[2] Batch[770] Speed: 1.2520684199528345 samples/sec                   batch loss = 1845.610126376152 | accuracy = 0.6795454545454546


Epoch[2] Batch[775] Speed: 1.2500646185004947 samples/sec                   batch loss = 1859.3601959943771 | accuracy = 0.6790322580645162


Epoch[2] Batch[780] Speed: 1.2487798371508447 samples/sec                   batch loss = 1875.0069624185562 | accuracy = 0.6782051282051282


Epoch[2] Batch[785] Speed: 1.2564822738823485 samples/sec                   batch loss = 1887.83749461174 | accuracy = 0.6780254777070064


[Epoch 2] training: accuracy=0.6782994923857868
[Epoch 2] time cost: 647.7589917182922
[Epoch 2] validation: validation accuracy=0.7355555555555555


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).