<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train models or run inference faster than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet; see the [installation instructions](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html) for your system.

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, use `npx.gpu()` or, equivalently, `npx.gpu(0)`, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[21:37:53] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet reports which GPU each ndarray is allocated on.

If you have at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. If your machine has only one GPU, the code falls back to copying `x` to the CPU, which is the case in the output shown:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[21:37:53] /work/mxnet/src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly copy data to main memory.

## Choosing GPU Ids

If you have multiple GPUs on your machine, MXNet addresses each of them through `npx` with a zero-based index. As you saw before, the first GPU is accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has, so if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training, which is particularly useful when you want to leverage multiple GPUs while training neural networks.
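
As a minimal sketch of this indexing scheme, the hypothetical helper below (`choose_gpu_ids` is not part of MXNet) turns a GPU count and a requested number of devices into the list of zero-based IDs you would pass to `npx.gpu`. It uses plain Python so that it runs without a GPU; in practice the count would come from `npx.num_gpus()`.

```python
def choose_gpu_ids(num_available, num_requested):
    """Return the zero-based IDs of the GPUs to use.

    Hypothetical helper for illustration; it is not an MXNet function.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("counts must be non-negative")
    # Never ask for more GPUs than the machine has.
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, the last valid ID is 7.
print(choose_gpu_ids(8, 8))   # [0, 1, 2, 3, 4, 5, 6, 7]
# Requesting two GPUs when only one exists yields just the first GPU.
print(choose_gpu_ids(1, 2))   # [0]
# With no GPU the list is empty, so the caller should fall back to the CPU.
print(choose_gpu_ids(0, 2))   # []
```

With MXNet installed, the IDs can be mapped to devices with an expression such as `[npx.gpu(i) for i in ids] or [npx.cpu()]`, which falls back to the CPU when the list is empty.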

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to move the input data and the parameters to the GPU. To demonstrate this, you can reuse the LeafNetwork defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device of parameters that are already loaded.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[21:37:53] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:106: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 3.2892706, -1.2321993]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism with *n* GPUs, each data batch is split into *n* parts, and each GPU runs the forward and backward passes on its own part of the batch.

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.
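
To make the idea concrete, the following pure-Python sketch (no MXNet required) splits a batch into *n* equal parts and checks that summing the per-part gradients and dividing by the full batch size gives the same result as the mean gradient computed on the whole batch. The toy model, the squared-error loss, and the helper names `split_batch` and `grad_sum` are assumptions made for illustration only.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous, equally sized chunks."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by num_parts")
    size = len(batch) // num_parts
    return [batch[i * size:(i + 1) * size] for i in range(num_parts)]


def grad_sum(w, samples):
    """Sum of d/dw (w*x - y)^2 over the samples, i.e. sum of 2*x*(w*x - y)."""
    return sum(2 * x * (w * x - y) for x, y in samples)


# Toy data: (x, y) pairs and a single weight w for the model y_hat = w * x.
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w = 1.0

# Single-device reference: mean gradient over the whole batch.
full_grad = grad_sum(w, batch) / len(batch)

# Data parallelism: each "device" handles one chunk, then the partial
# gradient sums are added and normalized by the full batch size.
chunks = split_batch(batch, num_parts=2)
parallel_grad = sum(grad_sum(w, chunk) for chunk in chunks) / len(batch)

print(chunks)
print(full_grad, parallel_grad)  # both values are -15.0
```

In the training loop of this tutorial, `gluon.utils.split_and_load` plays the role of `split_batch`, and `trainer.step(batch_size)` normalizes the aggregated gradients by the batch size.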

In [8]:
# Import transforms to compose a series of transformations on the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# Mean and std for normalizing image values that are in the range (0, 1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function that was defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
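
The helper above passes one list of labels and one list of outputs per device to the metric. As a rough pure-Python sketch of what such an accuracy computation amounts to (an illustration with the hypothetical name `sharded_accuracy`, not the `gluon.metric.Accuracy` implementation), the predicted class is the index of the largest score, and correct predictions are counted over all shards:

```python
def sharded_accuracy(label_shards, output_shards):
    """Fraction of samples whose highest-scoring class matches the label."""
    correct = 0
    total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            # The predicted class is the index of the largest score.
            predicted = max(range(len(scores)), key=lambda i: scores[i])
            correct += int(predicted == label)
            total += 1
    return correct / total if total else 0.0


# Two shards (for example, two devices) with two samples each.
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.5, -1.0], [0.3, 0.9]],   # predictions: 0, 1 (both correct)
    [[1.2, 0.1], [0.8, 0.2]],    # predictions: 0, 0 (only the second is correct)
]
print(sharded_accuracy(label_shards, output_shards))  # 0.75
```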

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.7748471870642782 samples/sec                   batch loss = 14.978061199188232 | accuracy = 0.4


Epoch[1] Batch[10] Speed: 1.2494245958840788 samples/sec                   batch loss = 28.068971872329712 | accuracy = 0.525


Epoch[1] Batch[15] Speed: 1.249209416313168 samples/sec                   batch loss = 41.93116235733032 | accuracy = 0.5666666666666667


Epoch[1] Batch[20] Speed: 1.2525559966027553 samples/sec                   batch loss = 56.2120463848114 | accuracy = 0.5625


Epoch[1] Batch[25] Speed: 1.247965935439106 samples/sec                   batch loss = 71.55546140670776 | accuracy = 0.51


Epoch[1] Batch[30] Speed: 1.2507904286874862 samples/sec                   batch loss = 84.91923332214355 | accuracy = 0.5166666666666667


Epoch[1] Batch[35] Speed: 1.2480766907475835 samples/sec                   batch loss = 99.43203639984131 | accuracy = 0.5071428571428571


Epoch[1] Batch[40] Speed: 1.2515026118076413 samples/sec                   batch loss = 113.8516321182251 | accuracy = 0.50625


Epoch[1] Batch[45] Speed: 1.2484410314727061 samples/sec                   batch loss = 127.83356070518494 | accuracy = 0.5111111111111111


Epoch[1] Batch[50] Speed: 1.251431104968782 samples/sec                   batch loss = 142.38191080093384 | accuracy = 0.5


Epoch[1] Batch[55] Speed: 1.2523496462644863 samples/sec                   batch loss = 156.34546375274658 | accuracy = 0.5


Epoch[1] Batch[60] Speed: 1.2488284521693769 samples/sec                   batch loss = 170.24147653579712 | accuracy = 0.5041666666666667


Epoch[1] Batch[65] Speed: 1.2542213457994815 samples/sec                   batch loss = 184.17815780639648 | accuracy = 0.5153846153846153


Epoch[1] Batch[70] Speed: 1.2547269285821498 samples/sec                   batch loss = 198.2960765361786 | accuracy = 0.5142857142857142


Epoch[1] Batch[75] Speed: 1.2549783708455604 samples/sec                   batch loss = 212.81066370010376 | accuracy = 0.5133333333333333


Epoch[1] Batch[80] Speed: 1.2529427008457645 samples/sec                   batch loss = 226.1800081729889 | accuracy = 0.521875


Epoch[1] Batch[85] Speed: 1.2478249435038669 samples/sec                   batch loss = 239.982084274292 | accuracy = 0.5235294117647059


Epoch[1] Batch[90] Speed: 1.2457186906853563 samples/sec                   batch loss = 254.59045147895813 | accuracy = 0.5083333333333333


Epoch[1] Batch[95] Speed: 1.2489435444884533 samples/sec                   batch loss = 268.3886227607727 | accuracy = 0.5105263157894737


Epoch[1] Batch[100] Speed: 1.2491641198997465 samples/sec                   batch loss = 282.29727029800415 | accuracy = 0.5025


Epoch[1] Batch[105] Speed: 1.2563064242955766 samples/sec                   batch loss = 295.8602976799011 | accuracy = 0.5071428571428571


Epoch[1] Batch[110] Speed: 1.2529383030115016 samples/sec                   batch loss = 310.13409972190857 | accuracy = 0.5


Epoch[1] Batch[115] Speed: 1.2485093167794308 samples/sec                   batch loss = 323.58422088623047 | accuracy = 0.5065217391304347


Epoch[1] Batch[120] Speed: 1.2453445667504481 samples/sec                   batch loss = 337.32188963890076 | accuracy = 0.5083333333333333


Epoch[1] Batch[125] Speed: 1.2559329660679384 samples/sec                   batch loss = 351.0321578979492 | accuracy = 0.504


Epoch[1] Batch[130] Speed: 1.2590591316408661 samples/sec                   batch loss = 364.99096393585205 | accuracy = 0.5019230769230769


Epoch[1] Batch[135] Speed: 1.2533812356818708 samples/sec                   batch loss = 378.5564396381378 | accuracy = 0.5092592592592593


Epoch[1] Batch[140] Speed: 1.2524271482279246 samples/sec                   batch loss = 392.30170035362244 | accuracy = 0.5107142857142857


Epoch[1] Batch[145] Speed: 1.2499773506453875 samples/sec                   batch loss = 406.0198903083801 | accuracy = 0.5189655172413793


Epoch[1] Batch[150] Speed: 1.2518572795119483 samples/sec                   batch loss = 419.6191396713257 | accuracy = 0.5183333333333333


Epoch[1] Batch[155] Speed: 1.2513619396792686 samples/sec                   batch loss = 433.3164269924164 | accuracy = 0.5241935483870968


Epoch[1] Batch[160] Speed: 1.2513760334702417 samples/sec                   batch loss = 446.85773944854736 | accuracy = 0.5265625


Epoch[1] Batch[165] Speed: 1.2536610848568255 samples/sec                   batch loss = 460.7444796562195 | accuracy = 0.5303030303030303


Epoch[1] Batch[170] Speed: 1.2511189417207709 samples/sec                   batch loss = 474.6348309516907 | accuracy = 0.5279411764705882


Epoch[1] Batch[175] Speed: 1.2512659050130093 samples/sec                   batch loss = 488.70246171951294 | accuracy = 0.5242857142857142


Epoch[1] Batch[180] Speed: 1.2517609820149995 samples/sec                   batch loss = 502.93289589881897 | accuracy = 0.5236111111111111


Epoch[1] Batch[185] Speed: 1.2508876960819693 samples/sec                   batch loss = 516.7833666801453 | accuracy = 0.522972972972973


Epoch[1] Batch[190] Speed: 1.2535448404931986 samples/sec                   batch loss = 530.5182456970215 | accuracy = 0.5276315789473685


Epoch[1] Batch[195] Speed: 1.2551925375623334 samples/sec                   batch loss = 544.4522929191589 | accuracy = 0.5243589743589744


Epoch[1] Batch[200] Speed: 1.253911444064812 samples/sec                   batch loss = 557.8929743766785 | accuracy = 0.5275


Epoch[1] Batch[205] Speed: 1.2550604234579346 samples/sec                   batch loss = 571.2313668727875 | accuracy = 0.5341463414634147


Epoch[1] Batch[210] Speed: 1.252549450690562 samples/sec                   batch loss = 584.8040096759796 | accuracy = 0.5369047619047619


Epoch[1] Batch[215] Speed: 1.2467959903283705 samples/sec                   batch loss = 598.9575207233429 | accuracy = 0.5348837209302325


Epoch[1] Batch[220] Speed: 1.2437790349757103 samples/sec                   batch loss = 612.4720230102539 | accuracy = 0.5363636363636364


Epoch[1] Batch[225] Speed: 1.259968943145469 samples/sec                   batch loss = 626.1586091518402 | accuracy = 0.5388888888888889


Epoch[1] Batch[230] Speed: 1.2583776820061958 samples/sec                   batch loss = 639.462061882019 | accuracy = 0.5391304347826087


Epoch[1] Batch[235] Speed: 1.2570146387835834 samples/sec                   batch loss = 653.5270149707794 | accuracy = 0.5361702127659574


Epoch[1] Batch[240] Speed: 1.2455844943394456 samples/sec                   batch loss = 667.8697490692139 | accuracy = 0.534375


Epoch[1] Batch[245] Speed: 1.24582072145138 samples/sec                   batch loss = 681.4699821472168 | accuracy = 0.5346938775510204


Epoch[1] Batch[250] Speed: 1.2543720399982294 samples/sec                   batch loss = 695.1288909912109 | accuracy = 0.536


Epoch[1] Batch[255] Speed: 1.2471148065268676 samples/sec                   batch loss = 709.2115273475647 | accuracy = 0.5362745098039216


Epoch[1] Batch[260] Speed: 1.2491120377351017 samples/sec                   batch loss = 723.1577394008636 | accuracy = 0.5355769230769231


Epoch[1] Batch[265] Speed: 1.2476387975644547 samples/sec                   batch loss = 736.3011844158173 | accuracy = 0.5405660377358491


Epoch[1] Batch[270] Speed: 1.2498006170182556 samples/sec                   batch loss = 749.6863431930542 | accuracy = 0.5407407407407407


Epoch[1] Batch[275] Speed: 1.2490696311804028 samples/sec                   batch loss = 763.2071948051453 | accuracy = 0.5418181818181819


Epoch[1] Batch[280] Speed: 1.2497891655096005 samples/sec                   batch loss = 777.3214089870453 | accuracy = 0.5410714285714285


Epoch[1] Batch[285] Speed: 1.2500322991014516 samples/sec                   batch loss = 790.9753634929657 | accuracy = 0.5429824561403509


Epoch[1] Batch[290] Speed: 1.2520755214886738 samples/sec                   batch loss = 805.0482106208801 | accuracy = 0.5396551724137931


Epoch[1] Batch[295] Speed: 1.2469375839109738 samples/sec                   batch loss = 818.9014911651611 | accuracy = 0.5372881355932203


Epoch[1] Batch[300] Speed: 1.2469566755583434 samples/sec                   batch loss = 832.1740183830261 | accuracy = 0.54


Epoch[1] Batch[305] Speed: 1.248324174088719 samples/sec                   batch loss = 845.9999868869781 | accuracy = 0.5385245901639344


Epoch[1] Batch[310] Speed: 1.2537116733424067 samples/sec                   batch loss = 859.3748264312744 | accuracy = 0.5395161290322581


Epoch[1] Batch[315] Speed: 1.2479360450866785 samples/sec                   batch loss = 872.6212077140808 | accuracy = 0.5436507936507936


Epoch[1] Batch[320] Speed: 1.2501126815618688 samples/sec                   batch loss = 886.1830658912659 | accuracy = 0.54140625


Epoch[1] Batch[325] Speed: 1.2506783521374978 samples/sec                   batch loss = 900.3268659114838 | accuracy = 0.5392307692307692


Epoch[1] Batch[330] Speed: 1.249104690779281 samples/sec                   batch loss = 913.837201833725 | accuracy = 0.5416666666666666


Epoch[1] Batch[335] Speed: 1.2509591408317142 samples/sec                   batch loss = 928.107417345047 | accuracy = 0.5388059701492537


Epoch[1] Batch[340] Speed: 1.256213861947191 samples/sec                   batch loss = 941.5037200450897 | accuracy = 0.5404411764705882


Epoch[1] Batch[345] Speed: 1.25552205195027 samples/sec                   batch loss = 955.8298194408417 | accuracy = 0.5413043478260869


Epoch[1] Batch[350] Speed: 1.2511301377090482 samples/sec                   batch loss = 969.2359733581543 | accuracy = 0.5428571428571428


Epoch[1] Batch[355] Speed: 1.2510843287243831 samples/sec                   batch loss = 982.8606462478638 | accuracy = 0.5429577464788733


Epoch[1] Batch[360] Speed: 1.2536370099188259 samples/sec                   batch loss = 996.305083990097 | accuracy = 0.5430555555555555


Epoch[1] Batch[365] Speed: 1.253926813680474 samples/sec                   batch loss = 1009.9792370796204 | accuracy = 0.5452054794520548


Epoch[1] Batch[370] Speed: 1.2542303470495988 samples/sec                   batch loss = 1023.4118530750275 | accuracy = 0.547972972972973


Epoch[1] Batch[375] Speed: 1.261184420118974 samples/sec                   batch loss = 1037.3096153736115 | accuracy = 0.546


Epoch[1] Batch[380] Speed: 1.2581092145021222 samples/sec                   batch loss = 1051.4047288894653 | accuracy = 0.5447368421052632


Epoch[1] Batch[385] Speed: 1.2543978313709367 samples/sec                   batch loss = 1064.4315676689148 | accuracy = 0.5441558441558442


Epoch[1] Batch[390] Speed: 1.2557679853869619 samples/sec                   batch loss = 1078.323369026184 | accuracy = 0.5435897435897435


Epoch[1] Batch[395] Speed: 1.2554967780462636 samples/sec                   batch loss = 1091.8864753246307 | accuracy = 0.5455696202531646


Epoch[1] Batch[400] Speed: 1.2462011484037325 samples/sec                   batch loss = 1106.2009353637695 | accuracy = 0.54625


Epoch[1] Batch[405] Speed: 1.2523422611738588 samples/sec                   batch loss = 1119.6193194389343 | accuracy = 0.5444444444444444


Epoch[1] Batch[410] Speed: 1.2493953799551811 samples/sec                   batch loss = 1133.250103712082 | accuracy = 0.5439024390243903


Epoch[1] Batch[415] Speed: 1.2496118335681772 samples/sec                   batch loss = 1146.4844825267792 | accuracy = 0.5469879518072289


Epoch[1] Batch[420] Speed: 1.2548643223935834 samples/sec                   batch loss = 1160.2709202766418 | accuracy = 0.5470238095238096


Epoch[1] Batch[425] Speed: 1.2567307488474249 samples/sec                   batch loss = 1173.7852411270142 | accuracy = 0.548235294117647


Epoch[1] Batch[430] Speed: 1.2512850361079502 samples/sec                   batch loss = 1186.8334035873413 | accuracy = 0.5494186046511628


Epoch[1] Batch[435] Speed: 1.254448760723466 samples/sec                   batch loss = 1201.1226902008057 | accuracy = 0.5488505747126436


Epoch[1] Batch[440] Speed: 1.2547983433315024 samples/sec                   batch loss = 1214.5297343730927 | accuracy = 0.5488636363636363


Epoch[1] Batch[445] Speed: 1.2512606790595342 samples/sec                   batch loss = 1228.5550019741058 | accuracy = 0.547752808988764


Epoch[1] Batch[450] Speed: 1.2586035854098987 samples/sec                   batch loss = 1242.7038917541504 | accuracy = 0.5472222222222223


Epoch[1] Batch[455] Speed: 1.2539194099771147 samples/sec                   batch loss = 1256.0137753486633 | accuracy = 0.548901098901099


Epoch[1] Batch[460] Speed: 1.2526641075624187 samples/sec                   batch loss = 1269.1191062927246 | accuracy = 0.5483695652173913


Epoch[1] Batch[465] Speed: 1.2527633503972881 samples/sec                   batch loss = 1282.2666411399841 | accuracy = 0.5489247311827957


Epoch[1] Batch[470] Speed: 1.2538321653246518 samples/sec                   batch loss = 1295.6954417228699 | accuracy = 0.55


Epoch[1] Batch[475] Speed: 1.2491238488452285 samples/sec                   batch loss = 1309.4045960903168 | accuracy = 0.5505263157894736


Epoch[1] Batch[480] Speed: 1.2512149538286221 samples/sec                   batch loss = 1322.8302054405212 | accuracy = 0.5505208333333333


Epoch[1] Batch[485] Speed: 1.2504034673673268 samples/sec                   batch loss = 1336.0012021064758 | accuracy = 0.5515463917525774


Epoch[1] Batch[490] Speed: 1.2524719335903542 samples/sec                   batch loss = 1348.864188671112 | accuracy = 0.551530612244898


Epoch[1] Batch[495] Speed: 1.2562331446449568 samples/sec                   batch loss = 1362.3805062770844 | accuracy = 0.5515151515151515


Epoch[1] Batch[500] Speed: 1.2559163250251937 samples/sec                   batch loss = 1375.5819220542908 | accuracy = 0.5515


Epoch[1] Batch[505] Speed: 1.255428759656545 samples/sec                   batch loss = 1388.5866992473602 | accuracy = 0.5524752475247525


Epoch[1] Batch[510] Speed: 1.2488514131594954 samples/sec                   batch loss = 1401.3993926048279 | accuracy = 0.553921568627451


Epoch[1] Batch[515] Speed: 1.251496637033052 samples/sec                   batch loss = 1414.5968112945557 | accuracy = 0.5548543689320389


Epoch[1] Batch[520] Speed: 1.2477323275646008 samples/sec                   batch loss = 1428.6266119480133 | accuracy = 0.5538461538461539


Epoch[1] Batch[525] Speed: 1.244860739814841 samples/sec                   batch loss = 1442.422902584076 | accuracy = 0.5523809523809524


Epoch[1] Batch[530] Speed: 1.2461073850427447 samples/sec                   batch loss = 1455.2715764045715 | accuracy = 0.5528301886792453


Epoch[1] Batch[535] Speed: 1.2492145321309895 samples/sec                   batch loss = 1469.5932819843292 | accuracy = 0.5518691588785046


Epoch[1] Batch[540] Speed: 1.2452744088678251 samples/sec                   batch loss = 1482.867469072342 | accuracy = 0.5532407407407407


Epoch[1] Batch[545] Speed: 1.2509918812962157 samples/sec                   batch loss = 1495.5782678127289 | accuracy = 0.5541284403669725


Epoch[1] Batch[550] Speed: 1.2464329794030289 samples/sec                   batch loss = 1509.3609464168549 | accuracy = 0.555


Epoch[1] Batch[555] Speed: 1.2436604673470129 samples/sec                   batch loss = 1521.9949190616608 | accuracy = 0.5567567567567567


Epoch[1] Batch[560] Speed: 1.248236591840892 samples/sec                   batch loss = 1534.9614162445068 | accuracy = 0.5584821428571428


Epoch[1] Batch[565] Speed: 1.2419214335839301 samples/sec                   batch loss = 1548.70663022995 | accuracy = 0.5592920353982301


Epoch[1] Batch[570] Speed: 1.2504131594508539 samples/sec                   batch loss = 1562.3901257514954 | accuracy = 0.5596491228070175


Epoch[1] Batch[575] Speed: 1.252201680617075 samples/sec                   batch loss = 1575.8616557121277 | accuracy = 0.5608695652173913


Epoch[1] Batch[580] Speed: 1.2478075885686917 samples/sec                   batch loss = 1589.4445357322693 | accuracy = 0.5616379310344828


Epoch[1] Batch[585] Speed: 1.2497395447302968 samples/sec                   batch loss = 1603.2419984340668 | accuracy = 0.5623931623931624


Epoch[1] Batch[590] Speed: 1.2472200331125705 samples/sec                   batch loss = 1615.794685602188 | accuracy = 0.5639830508474576


Epoch[1] Batch[595] Speed: 1.2480786405141628 samples/sec                   batch loss = 1630.3146929740906 | accuracy = 0.5621848739495798


Epoch[1] Batch[600] Speed: 1.2487902476967423 samples/sec                   batch loss = 1644.3367426395416 | accuracy = 0.56125


Epoch[1] Batch[605] Speed: 1.2515241774842383 samples/sec                   batch loss = 1657.3241701126099 | accuracy = 0.5615702479338843


Epoch[1] Batch[610] Speed: 1.2528759880824105 samples/sec                   batch loss = 1671.1239612102509 | accuracy = 0.5622950819672131


Epoch[1] Batch[615] Speed: 1.2455262375257814 samples/sec                   batch loss = 1683.060294866562 | accuracy = 0.5642276422764227


Epoch[1] Batch[620] Speed: 1.2476276639746258 samples/sec                   batch loss = 1696.0145616531372 | accuracy = 0.5649193548387097


Epoch[1] Batch[625] Speed: 1.2465378131043563 samples/sec                   batch loss = 1710.0561621189117 | accuracy = 0.5644


Epoch[1] Batch[630] Speed: 1.2528322029214054 samples/sec                   batch loss = 1723.20458984375 | accuracy = 0.5650793650793651


Epoch[1] Batch[635] Speed: 1.2483143285993932 samples/sec                   batch loss = 1736.060614824295 | accuracy = 0.5669291338582677


Epoch[1] Batch[640] Speed: 1.2454039159532058 samples/sec                   batch loss = 1749.8519752025604 | accuracy = 0.566015625


Epoch[1] Batch[645] Speed: 1.250413718614103 samples/sec                   batch loss = 1762.6788399219513 | accuracy = 0.5666666666666667


Epoch[1] Batch[650] Speed: 1.2501548795054587 samples/sec                   batch loss = 1775.0033836364746 | accuracy = 0.568076923076923


Epoch[1] Batch[655] Speed: 1.2572313853729153 samples/sec                   batch loss = 1789.588189125061 | accuracy = 0.5675572519083969


Epoch[1] Batch[660] Speed: 1.2584506456968099 samples/sec                   batch loss = 1802.3905262947083 | accuracy = 0.5674242424242424


Epoch[1] Batch[665] Speed: 1.2556280443087222 samples/sec                   batch loss = 1814.5386588573456 | accuracy = 0.5699248120300752


Epoch[1] Batch[670] Speed: 1.2544164956075425 samples/sec                   batch loss = 1826.083658695221 | accuracy = 0.5727611940298507


Epoch[1] Batch[675] Speed: 1.2549180117573466 samples/sec                   batch loss = 1838.1206781864166 | accuracy = 0.5737037037037037


Epoch[1] Batch[680] Speed: 1.2501546000395078 samples/sec                   batch loss = 1850.4951881170273 | accuracy = 0.5746323529411764


Epoch[1] Batch[685] Speed: 1.2472321793505259 samples/sec                   batch loss = 1865.1379326581955 | accuracy = 0.5755474452554744


Epoch[1] Batch[690] Speed: 1.2536182751982847 samples/sec                   batch loss = 1878.5936933755875 | accuracy = 0.5760869565217391


Epoch[1] Batch[695] Speed: 1.2525000780142892 samples/sec                   batch loss = 1891.6220089197159 | accuracy = 0.5766187050359712


Epoch[1] Batch[700] Speed: 1.256979510456351 samples/sec                   batch loss = 1905.01031768322 | accuracy = 0.5760714285714286


Epoch[1] Batch[705] Speed: 1.2541158720272596 samples/sec                   batch loss = 1917.4475010633469 | accuracy = 0.5765957446808511


Epoch[1] Batch[710] Speed: 1.2583820237225622 samples/sec                   batch loss = 1929.5773562192917 | accuracy = 0.5774647887323944


Epoch[1] Batch[715] Speed: 1.257585537315374 samples/sec                   batch loss = 1943.4802607297897 | accuracy = 0.5772727272727273


Epoch[1] Batch[720] Speed: 1.253950431141279 samples/sec                   batch loss = 1956.798852801323 | accuracy = 0.5777777777777777


Epoch[1] Batch[725] Speed: 1.2620691092232768 samples/sec                   batch loss = 1969.0526329278946 | accuracy = 0.5789655172413793


Epoch[1] Batch[730] Speed: 1.2630508809397334 samples/sec                   batch loss = 1982.1981493234634 | accuracy = 0.5794520547945206


Epoch[1] Batch[735] Speed: 1.2570500516315217 samples/sec                   batch loss = 1996.7697709798813 | accuracy = 0.5792517006802721


Epoch[1] Batch[740] Speed: 1.2560340440211917 samples/sec                   batch loss = 2008.1765798330307 | accuracy = 0.5807432432432432


Epoch[1] Batch[745] Speed: 1.2579063121161502 samples/sec                   batch loss = 2020.4115625619888 | accuracy = 0.5812080536912752


Epoch[1] Batch[750] Speed: 1.25539700773836 samples/sec                   batch loss = 2033.3495124578476 | accuracy = 0.5813333333333334


Epoch[1] Batch[755] Speed: 1.2555897045813653 samples/sec                   batch loss = 2045.5374599695206 | accuracy = 0.5821192052980132


Epoch[1] Batch[760] Speed: 1.2563260861442054 samples/sec                   batch loss = 2057.35500061512 | accuracy = 0.5825657894736842


Epoch[1] Batch[765] Speed: 1.263985803641501 samples/sec                   batch loss = 2069.082074522972 | accuracy = 0.5836601307189543


Epoch[1] Batch[770] Speed: 1.2527190118812264 samples/sec                   batch loss = 2080.9654059410095 | accuracy = 0.5850649350649351


Epoch[1] Batch[775] Speed: 1.257785318744801 samples/sec                   batch loss = 2092.4008135795593 | accuracy = 0.5858064516129032


Epoch[1] Batch[780] Speed: 1.25405756436558 samples/sec                   batch loss = 2104.209526181221 | accuracy = 0.5868589743589744


Epoch[1] Batch[785] Speed: 1.2536946226929928 samples/sec                   batch loss = 2115.004522204399 | accuracy = 0.5882165605095542


[Epoch 1] training: accuracy=0.5881979695431472
[Epoch 1] time cost: 647.299831867218
[Epoch 1] validation: validation accuracy=0.6244444444444445


Epoch[2] Batch[5] Speed: 1.2509695877453235 samples/sec                   batch loss = 13.137036561965942 | accuracy = 0.65


Epoch[2] Batch[10] Speed: 1.2559697283525995 samples/sec                   batch loss = 24.53073525428772 | accuracy = 0.725


Epoch[2] Batch[15] Speed: 1.253012228157467 samples/sec                   batch loss = 36.95272421836853 | accuracy = 0.6833333333333333


Epoch[2] Batch[20] Speed: 1.2529662812496312 samples/sec                   batch loss = 52.37809419631958 | accuracy = 0.6125


Epoch[2] Batch[25] Speed: 1.255048030384683 samples/sec                   batch loss = 64.43830907344818 | accuracy = 0.62


Epoch[2] Batch[30] Speed: 1.2548470526657247 samples/sec                   batch loss = 75.09382486343384 | accuracy = 0.6583333333333333


Epoch[2] Batch[35] Speed: 1.2573549106657727 samples/sec                   batch loss = 85.69986116886139 | accuracy = 0.6785714285714286


Epoch[2] Batch[40] Speed: 1.2529238933027722 samples/sec                   batch loss = 97.08172833919525 | accuracy = 0.7


Epoch[2] Batch[45] Speed: 1.252521304047741 samples/sec                   batch loss = 108.70598447322845 | accuracy = 0.7055555555555556


Epoch[2] Batch[50] Speed: 1.2518754945747412 samples/sec                   batch loss = 123.76937401294708 | accuracy = 0.69


Epoch[2] Batch[55] Speed: 1.2565177508464553 samples/sec                   batch loss = 136.6209832429886 | accuracy = 0.6863636363636364


Epoch[2] Batch[60] Speed: 1.2587434350476268 samples/sec                   batch loss = 148.4261692762375 | accuracy = 0.6916666666666667


Epoch[2] Batch[65] Speed: 1.2617029393757926 samples/sec                   batch loss = 158.92385911941528 | accuracy = 0.7076923076923077


Epoch[2] Batch[70] Speed: 1.2541679972921025 samples/sec                   batch loss = 171.75795447826385 | accuracy = 0.6928571428571428


Epoch[2] Batch[75] Speed: 1.249727163424025 samples/sec                   batch loss = 185.91543924808502 | accuracy = 0.68


Epoch[2] Batch[80] Speed: 1.2561323167904401 samples/sec                   batch loss = 196.8259949684143 | accuracy = 0.690625


Epoch[2] Batch[85] Speed: 1.2537736029538367 samples/sec                   batch loss = 208.7139699459076 | accuracy = 0.6970588235294117


Epoch[2] Batch[90] Speed: 1.2548341945357637 samples/sec                   batch loss = 222.87978386878967 | accuracy = 0.6972222222222222


Epoch[2] Batch[95] Speed: 1.2544547637188568 samples/sec                   batch loss = 235.0228601694107 | accuracy = 0.6973684210526315


Epoch[2] Batch[100] Speed: 1.2487348507160687 samples/sec                   batch loss = 249.94990122318268 | accuracy = 0.6875


Epoch[2] Batch[105] Speed: 1.2485930345376615 samples/sec                   batch loss = 261.3677214384079 | accuracy = 0.6904761904761905


Epoch[2] Batch[110] Speed: 1.2533792693125276 samples/sec                   batch loss = 272.52532851696014 | accuracy = 0.6909090909090909


Epoch[2] Batch[115] Speed: 1.2477766849862946 samples/sec                   batch loss = 288.44126856327057 | accuracy = 0.6760869565217391


Epoch[2] Batch[120] Speed: 1.2527435192806298 samples/sec                   batch loss = 303.07489228248596 | accuracy = 0.675


Epoch[2] Batch[125] Speed: 1.2516674073718732 samples/sec                   batch loss = 316.3656692504883 | accuracy = 0.67


Epoch[2] Batch[130] Speed: 1.2515614290399095 samples/sec                   batch loss = 330.5691797733307 | accuracy = 0.6615384615384615


Epoch[2] Batch[135] Speed: 1.252222429219033 samples/sec                   batch loss = 342.19166481494904 | accuracy = 0.6648148148148149


Epoch[2] Batch[140] Speed: 1.2499040625121296 samples/sec                   batch loss = 354.1301556825638 | accuracy = 0.6678571428571428


Epoch[2] Batch[145] Speed: 1.2497896310139904 samples/sec                   batch loss = 366.716725230217 | accuracy = 0.6689655172413793


Epoch[2] Batch[150] Speed: 1.2543693202425625 samples/sec                   batch loss = 378.46141254901886 | accuracy = 0.6716666666666666


Epoch[2] Batch[155] Speed: 1.2489933810233171 samples/sec                   batch loss = 391.396301984787 | accuracy = 0.6725806451612903


Epoch[2] Batch[160] Speed: 1.2520168427940221 samples/sec                   batch loss = 404.7576222419739 | accuracy = 0.671875


Epoch[2] Batch[165] Speed: 1.2473439172934717 samples/sec                   batch loss = 416.9381295442581 | accuracy = 0.6742424242424242


Epoch[2] Batch[170] Speed: 1.248576680352931 samples/sec                   batch loss = 429.1529014110565 | accuracy = 0.675


Epoch[2] Batch[175] Speed: 1.2550739434533518 samples/sec                   batch loss = 440.49136078357697 | accuracy = 0.6728571428571428


Epoch[2] Batch[180] Speed: 1.2569524827582383 samples/sec                   batch loss = 452.21085798740387 | accuracy = 0.6736111111111112


Epoch[2] Batch[185] Speed: 1.2628808880112994 samples/sec                   batch loss = 466.1006466150284 | accuracy = 0.6702702702702703


Epoch[2] Batch[190] Speed: 1.253896355967153 samples/sec                   batch loss = 478.9214069843292 | accuracy = 0.6710526315789473


Epoch[2] Batch[195] Speed: 1.2495785138124427 samples/sec                   batch loss = 493.2529592514038 | accuracy = 0.6653846153846154


Epoch[2] Batch[200] Speed: 1.2430766257924446 samples/sec                   batch loss = 504.4791897535324 | accuracy = 0.6675


Epoch[2] Batch[205] Speed: 1.2486475825580088 samples/sec                   batch loss = 517.0546993017197 | accuracy = 0.6707317073170732


Epoch[2] Batch[210] Speed: 1.2454929503645862 samples/sec                   batch loss = 528.2730377912521 | accuracy = 0.6714285714285714


Epoch[2] Batch[215] Speed: 1.2575237018466705 samples/sec                   batch loss = 538.1474204063416 | accuracy = 0.6744186046511628


Epoch[2] Batch[220] Speed: 1.2503935890521876 samples/sec                   batch loss = 553.2304203510284 | accuracy = 0.6715909090909091


Epoch[2] Batch[225] Speed: 1.2538185783672484 samples/sec                   batch loss = 566.1916325092316 | accuracy = 0.6722222222222223


Epoch[2] Batch[230] Speed: 1.251986384366289 samples/sec                   batch loss = 575.6360187530518 | accuracy = 0.675


Epoch[2] Batch[235] Speed: 1.2477881924477099 samples/sec                   batch loss = 585.3704507350922 | accuracy = 0.6797872340425531


Epoch[2] Batch[240] Speed: 1.2482005594473438 samples/sec                   batch loss = 596.6168981790543 | accuracy = 0.6833333333333333


Epoch[2] Batch[245] Speed: 1.2502554021970738 samples/sec                   batch loss = 610.3859473466873 | accuracy = 0.6795918367346939


Epoch[2] Batch[250] Speed: 1.2501007585973032 samples/sec                   batch loss = 623.2636978626251 | accuracy = 0.679


Epoch[2] Batch[255] Speed: 1.24774151430303 samples/sec                   batch loss = 637.1734675168991 | accuracy = 0.6774509803921569


Epoch[2] Batch[260] Speed: 1.2441817435643698 samples/sec                   batch loss = 649.2719876766205 | accuracy = 0.676923076923077


Epoch[2] Batch[265] Speed: 1.2459183277150023 samples/sec                   batch loss = 661.8929427862167 | accuracy = 0.6754716981132075


Epoch[2] Batch[270] Speed: 1.248518143316607 samples/sec                   batch loss = 675.1022857427597 | accuracy = 0.674074074074074


Epoch[2] Batch[275] Speed: 1.2433594477204357 samples/sec                   batch loss = 686.6008676290512 | accuracy = 0.6763636363636364


Epoch[2] Batch[280] Speed: 1.244804305511718 samples/sec                   batch loss = 699.5168281793594 | accuracy = 0.675


Epoch[2] Batch[285] Speed: 1.2478500950955593 samples/sec                   batch loss = 714.7310613393784 | accuracy = 0.6710526315789473


Epoch[2] Batch[290] Speed: 1.2495990825020469 samples/sec                   batch loss = 727.0145809650421 | accuracy = 0.6689655172413793


Epoch[2] Batch[295] Speed: 1.2439519482014905 samples/sec                   batch loss = 739.3877013921738 | accuracy = 0.6686440677966101


Epoch[2] Batch[300] Speed: 1.250440093048777 samples/sec                   batch loss = 751.8933864831924 | accuracy = 0.6675


Epoch[2] Batch[305] Speed: 1.2528903031258005 samples/sec                   batch loss = 765.5019373893738 | accuracy = 0.6663934426229509


Epoch[2] Batch[310] Speed: 1.2488887846734875 samples/sec                   batch loss = 777.2845594882965 | accuracy = 0.6669354838709678


Epoch[2] Batch[315] Speed: 1.2555988194514502 samples/sec                   batch loss = 789.8782072067261 | accuracy = 0.6658730158730158


Epoch[2] Batch[320] Speed: 1.2558450649824273 samples/sec                   batch loss = 802.0949462652206 | accuracy = 0.66484375


Epoch[2] Batch[325] Speed: 1.2536732631931964 samples/sec                   batch loss = 813.2941044569016 | accuracy = 0.6661538461538462


Epoch[2] Batch[330] Speed: 1.2537910305436213 samples/sec                   batch loss = 827.958415389061 | accuracy = 0.6636363636363637


Epoch[2] Batch[335] Speed: 1.2570460958420684 samples/sec                   batch loss = 839.8948231935501 | accuracy = 0.6634328358208955


Epoch[2] Batch[340] Speed: 1.2538196090916258 samples/sec                   batch loss = 850.3828785419464 | accuracy = 0.6654411764705882


Epoch[2] Batch[345] Speed: 1.2557532285436213 samples/sec                   batch loss = 861.4282069206238 | accuracy = 0.6681159420289855


Epoch[2] Batch[350] Speed: 1.2509174481910843 samples/sec                   batch loss = 870.9378988742828 | accuracy = 0.6685714285714286


Epoch[2] Batch[355] Speed: 1.2493514655879852 samples/sec                   batch loss = 881.4653389453888 | accuracy = 0.6697183098591549


Epoch[2] Batch[360] Speed: 1.2581929036794 samples/sec                   batch loss = 894.3225963115692 | accuracy = 0.6694444444444444


Epoch[2] Batch[365] Speed: 1.255848261169959 samples/sec                   batch loss = 909.5066964626312 | accuracy = 0.6684931506849315


Epoch[2] Batch[370] Speed: 1.2581153469273831 samples/sec                   batch loss = 921.8825348615646 | accuracy = 0.6675675675675675


Epoch[2] Batch[375] Speed: 1.2527370649488427 samples/sec                   batch loss = 933.6289503574371 | accuracy = 0.6686666666666666


Epoch[2] Batch[380] Speed: 1.2510996291119387 samples/sec                   batch loss = 946.1558697223663 | accuracy = 0.6677631578947368


Epoch[2] Batch[385] Speed: 1.2546662183619088 samples/sec                   batch loss = 957.7480291128159 | accuracy = 0.6681818181818182


Epoch[2] Batch[390] Speed: 1.2554429452041418 samples/sec                   batch loss = 969.8671168088913 | accuracy = 0.6685897435897435


Epoch[2] Batch[395] Speed: 1.2554184260157482 samples/sec                   batch loss = 979.3598792552948 | accuracy = 0.670253164556962


Epoch[2] Batch[400] Speed: 1.2523816181678917 samples/sec                   batch loss = 991.5212118625641 | accuracy = 0.67


Epoch[2] Batch[405] Speed: 1.2574221012791484 samples/sec                   batch loss = 1001.0329924821854 | accuracy = 0.671604938271605


Epoch[2] Batch[410] Speed: 1.2575817666858684 samples/sec                   batch loss = 1010.8689548373222 | accuracy = 0.6725609756097561


Epoch[2] Batch[415] Speed: 1.2553388626792794 samples/sec                   batch loss = 1025.2549251914024 | accuracy = 0.6710843373493975


Epoch[2] Batch[420] Speed: 1.25407284382341 samples/sec                   batch loss = 1036.5199782252312 | accuracy = 0.6714285714285714


Epoch[2] Batch[425] Speed: 1.2557962781393137 samples/sec                   batch loss = 1047.92743319273 | accuracy = 0.6711764705882353


Epoch[2] Batch[430] Speed: 1.2504516496887328 samples/sec                   batch loss = 1060.2184324860573 | accuracy = 0.672093023255814


Epoch[2] Batch[435] Speed: 1.2537318161906503 samples/sec                   batch loss = 1074.9937724471092 | accuracy = 0.6701149425287356


Epoch[2] Batch[440] Speed: 1.2562328624548678 samples/sec                   batch loss = 1085.8308765292168 | accuracy = 0.6715909090909091


Epoch[2] Batch[445] Speed: 1.2562394469233125 samples/sec                   batch loss = 1100.8055602908134 | accuracy = 0.6702247191011236


Epoch[2] Batch[450] Speed: 1.2519318245076398 samples/sec                   batch loss = 1113.6811811327934 | accuracy = 0.6694444444444444


Epoch[2] Batch[455] Speed: 1.2545612326352213 samples/sec                   batch loss = 1124.9800388216972 | accuracy = 0.6708791208791208


Epoch[2] Batch[460] Speed: 1.2587848009900882 samples/sec                   batch loss = 1135.48567456007 | accuracy = 0.6711956521739131


Epoch[2] Batch[465] Speed: 1.2566088520688503 samples/sec                   batch loss = 1150.3611587882042 | accuracy = 0.6688172043010753


Epoch[2] Batch[470] Speed: 1.2568684877153307 samples/sec                   batch loss = 1160.08195489645 | accuracy = 0.6702127659574468


Epoch[2] Batch[475] Speed: 1.2555640521006348 samples/sec                   batch loss = 1170.912946164608 | accuracy = 0.6721052631578948


Epoch[2] Batch[480] Speed: 1.2513479395389755 samples/sec                   batch loss = 1183.9014253020287 | accuracy = 0.6734375


Epoch[2] Batch[485] Speed: 1.2547805123435396 samples/sec                   batch loss = 1195.9707010388374 | accuracy = 0.6726804123711341


Epoch[2] Batch[490] Speed: 1.2524480912972662 samples/sec                   batch loss = 1206.2563994526863 | accuracy = 0.6739795918367347


Epoch[2] Batch[495] Speed: 1.2508147674313708 samples/sec                   batch loss = 1217.7802622914314 | accuracy = 0.6737373737373737


Epoch[2] Batch[500] Speed: 1.2506140244229245 samples/sec                   batch loss = 1228.4528244137764 | accuracy = 0.6745


Epoch[2] Batch[505] Speed: 1.2529715214599138 samples/sec                   batch loss = 1242.2991574406624 | accuracy = 0.6722772277227723


Epoch[2] Batch[510] Speed: 1.2583701312664837 samples/sec                   batch loss = 1252.6181867718697 | accuracy = 0.6730392156862746


Epoch[2] Batch[515] Speed: 1.2548910726386855 samples/sec                   batch loss = 1266.1709845662117 | accuracy = 0.6723300970873787


Epoch[2] Batch[520] Speed: 1.2520740264218024 samples/sec                   batch loss = 1278.112509071827 | accuracy = 0.6721153846153847


Epoch[2] Batch[525] Speed: 1.2626528765303646 samples/sec                   batch loss = 1288.764033138752 | accuracy = 0.6723809523809524


Epoch[2] Batch[530] Speed: 1.261190203331132 samples/sec                   batch loss = 1300.0415812134743 | accuracy = 0.6731132075471699


Epoch[2] Batch[535] Speed: 1.2531845356763456 samples/sec                   batch loss = 1312.8138093352318 | accuracy = 0.6719626168224299


Epoch[2] Batch[540] Speed: 1.2551387308067692 samples/sec                   batch loss = 1328.6040241122246 | accuracy = 0.6703703703703704


Epoch[2] Batch[545] Speed: 1.253156547690407 samples/sec                   batch loss = 1341.4331194758415 | accuracy = 0.6697247706422018


Epoch[2] Batch[550] Speed: 1.2530412391422727 samples/sec                   batch loss = 1352.2403252720833 | accuracy = 0.6695454545454546


Epoch[2] Batch[555] Speed: 1.250197080297954 samples/sec                   batch loss = 1363.1696122288704 | accuracy = 0.6702702702702703


Epoch[2] Batch[560] Speed: 1.2521338318817217 samples/sec                   batch loss = 1375.049706041813 | accuracy = 0.6709821428571429


Epoch[2] Batch[565] Speed: 1.2554992208327123 samples/sec                   batch loss = 1386.2500312924385 | accuracy = 0.6721238938053097


Epoch[2] Batch[570] Speed: 1.254642573930752 samples/sec                   batch loss = 1398.264218866825 | accuracy = 0.6714912280701755


Epoch[2] Batch[575] Speed: 1.2568993724798845 samples/sec                   batch loss = 1412.9129616618156 | accuracy = 0.6704347826086956


Epoch[2] Batch[580] Speed: 1.2515333267836992 samples/sec                   batch loss = 1423.9675136208534 | accuracy = 0.6698275862068965


Epoch[2] Batch[585] Speed: 1.2537090501353865 samples/sec                   batch loss = 1438.9393818974495 | accuracy = 0.6696581196581196


Epoch[2] Batch[590] Speed: 1.255845629014339 samples/sec                   batch loss = 1449.6043755412102 | accuracy = 0.6703389830508475


Epoch[2] Batch[595] Speed: 1.251787600112993 samples/sec                   batch loss = 1462.5122950673103 | accuracy = 0.6693277310924369


Epoch[2] Batch[600] Speed: 1.2590922029822322 samples/sec                   batch loss = 1472.414723932743 | accuracy = 0.6708333333333333


Epoch[2] Batch[605] Speed: 1.2542295969404877 samples/sec                   batch loss = 1485.3947034478188 | accuracy = 0.6702479338842975


Epoch[2] Batch[610] Speed: 1.257672550872226 samples/sec                   batch loss = 1499.1986873745918 | accuracy = 0.6700819672131147


Epoch[2] Batch[615] Speed: 1.2571615774645692 samples/sec                   batch loss = 1510.5494979023933 | accuracy = 0.6707317073170732


Epoch[2] Batch[620] Speed: 1.253290789077469 samples/sec                   batch loss = 1520.0306432843208 | accuracy = 0.6713709677419355


Epoch[2] Batch[625] Speed: 1.2520785116331277 samples/sec                   batch loss = 1531.0325010418892 | accuracy = 0.6712


Epoch[2] Batch[630] Speed: 1.2557120616038702 samples/sec                   batch loss = 1544.7079928517342 | accuracy = 0.6694444444444444


Epoch[2] Batch[635] Speed: 1.2585991477464749 samples/sec                   batch loss = 1555.1428371071815 | accuracy = 0.6692913385826772


Epoch[2] Batch[640] Speed: 1.2531293097472889 samples/sec                   batch loss = 1568.631073653698 | accuracy = 0.66953125


Epoch[2] Batch[645] Speed: 1.250522578585056 samples/sec                   batch loss = 1580.4585673213005 | accuracy = 0.6689922480620155


Epoch[2] Batch[650] Speed: 1.2561768972597995 samples/sec                   batch loss = 1593.1528093218803 | accuracy = 0.6688461538461539


Epoch[2] Batch[655] Speed: 1.2547410982735996 samples/sec                   batch loss = 1603.5215846896172 | accuracy = 0.6690839694656489


Epoch[2] Batch[660] Speed: 1.2559035389868551 samples/sec                   batch loss = 1611.7596800923347 | accuracy = 0.6712121212121213


Epoch[2] Batch[665] Speed: 1.24908441734949 samples/sec                   batch loss = 1621.6298095583916 | accuracy = 0.6718045112781955


Epoch[2] Batch[670] Speed: 1.251056900917599 samples/sec                   batch loss = 1632.3846084475517 | accuracy = 0.6716417910447762


Epoch[2] Batch[675] Speed: 1.252962070398131 samples/sec                   batch loss = 1643.4661864638329 | accuracy = 0.6718518518518518


Epoch[2] Batch[680] Speed: 1.2577767378697928 samples/sec                   batch loss = 1654.1389790177345 | accuracy = 0.6720588235294118


Epoch[2] Batch[685] Speed: 1.255402174361197 samples/sec                   batch loss = 1663.7708213925362 | accuracy = 0.672992700729927


Epoch[2] Batch[690] Speed: 1.258313881107707 samples/sec                   batch loss = 1675.1203303933144 | accuracy = 0.6728260869565217


Epoch[2] Batch[695] Speed: 1.2531151764358524 samples/sec                   batch loss = 1685.2789929509163 | accuracy = 0.673021582733813


Epoch[2] Batch[700] Speed: 1.2565952048668714 samples/sec                   batch loss = 1694.031471669674 | accuracy = 0.6735714285714286


Epoch[2] Batch[705] Speed: 1.2524529531813926 samples/sec                   batch loss = 1706.4023335576057 | accuracy = 0.6734042553191489


Epoch[2] Batch[710] Speed: 1.2550308494834408 samples/sec                   batch loss = 1714.5722371339798 | accuracy = 0.6742957746478874


Epoch[2] Batch[715] Speed: 1.2492882045538825 samples/sec                   batch loss = 1724.9003167748451 | accuracy = 0.6755244755244755


Epoch[2] Batch[720] Speed: 1.2519780692861728 samples/sec                   batch loss = 1733.3986045718193 | accuracy = 0.6763888888888889


Epoch[2] Batch[725] Speed: 1.2494454386556775 samples/sec                   batch loss = 1745.363104045391 | accuracy = 0.6751724137931034


Epoch[2] Batch[730] Speed: 1.2500447795934997 samples/sec                   batch loss = 1754.9768746495247 | accuracy = 0.6756849315068493


Epoch[2] Batch[735] Speed: 1.253553363722975 samples/sec                   batch loss = 1767.7856894135475 | accuracy = 0.6751700680272109


Epoch[2] Batch[740] Speed: 1.252717421739583 samples/sec                   batch loss = 1781.442050755024 | accuracy = 0.6746621621621621


Epoch[2] Batch[745] Speed: 1.2520117039983039 samples/sec                   batch loss = 1789.1005019545555 | accuracy = 0.6761744966442953


Epoch[2] Batch[750] Speed: 1.2525308419710461 samples/sec                   batch loss = 1802.699660718441 | accuracy = 0.6766666666666666


Epoch[2] Batch[755] Speed: 1.2543585351206896 samples/sec                   batch loss = 1816.2445703148842 | accuracy = 0.676158940397351


Epoch[2] Batch[760] Speed: 1.2546989654943157 samples/sec                   batch loss = 1829.4325640797615 | accuracy = 0.6759868421052632


Epoch[2] Batch[765] Speed: 1.2569554020703348 samples/sec                   batch loss = 1839.6287569403648 | accuracy = 0.6767973856209151


Epoch[2] Batch[770] Speed: 1.2593439800837505 samples/sec                   batch loss = 1852.0052393078804 | accuracy = 0.6762987012987013


Epoch[2] Batch[775] Speed: 1.2553969138001566 samples/sec                   batch loss = 1861.6456711888313 | accuracy = 0.677741935483871


Epoch[2] Batch[780] Speed: 1.2536062852709995 samples/sec                   batch loss = 1873.3865614533424 | accuracy = 0.6778846153846154


Epoch[2] Batch[785] Speed: 1.2640250387802892 samples/sec                   batch loss = 1883.0426966547966 | accuracy = 0.6780254777070064


[Epoch 2] training: accuracy=0.6779822335025381
[Epoch 2] time cost: 645.2873799800873
[Epoch 2] validation: validation accuracy=0.6877777777777778


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the conclusion of the crash course. Congratulations!
If you are keen on studying more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).