<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference quicker than with central processing units (CPUs).

## Prerequisites

Before you start the steps, make sure you have at least one Nvidia GPU on your machine and make sure that you have CUDA properly installed. GPUs from AMD and Intel are not supported. Additionally, you will need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus()  # Returns the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or `npx.gpu(0)` as the context, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[03:31:49] /work/mxnet/src/storage/storage.cc:205: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `ctx` value shown in the output tells you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. In the example code here, you will copy `x` to the second GPU, `npx.gpu(1)`. Using `npx.gpu(1)` on a machine with only one GPU would raise an error, so the code falls back to `npx.cpu()` in that case and copies `x` to main memory instead:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[03:31:49] /work/mxnet/src/storage/storage.cc:205: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, will implicitly move data to main memory.

## Choosing GPU Ids

If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
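
The indexing rule above can be sketched in plain Python. The helper `pick_gpu_ids` below is a hypothetical name introduced for illustration and is not part of MXNet. It only computes which zero-based indices to use; each index would then be passed to `npx.gpu(i)` as shown earlier, with `num_available` taken from `npx.num_gpus()`.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return the zero-based ids of the GPUs to use.

    Hypothetical helper for illustration; it does not call MXNet.
    num_available would come from npx.num_gpus().
    """
    if num_requested < 0:
        raise ValueError("num_requested must be non-negative")
    # Never ask for more devices than the machine exposes.
    count = min(num_available, num_requested)
    return list(range(count))


# On an eight-GPU machine the valid ids run from 0 to 7.
all_ids = pick_gpu_ids(num_available=8, num_requested=8)
print(all_ids[0], all_ids[-1])

# Requesting two GPUs when only one is present yields a single id.
print(pick_gpu_ids(num_available=1, num_requested=2))
```

Capping the request at the number of available devices avoids referring to a GPU index that does not exist on the machine.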

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the `LeafNetwork` defined previously in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below; alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[03:31:50] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 5.2582893 , -0.34987912]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. With data parallelism, if there are *n* GPUs, you split each data batch into *n* parts and use one GPU per part to run the forward and backward passes on its separate chunk of the data.
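
To make the idea concrete, the following is a minimal pure-Python sketch, not MXNet code. It assumes a one-parameter linear model `y = w * x` with a summed squared-error loss, and the names `split_batch` and `grad_sum` are hypothetical. It splits a batch across two simulated devices and checks that the per-chunk gradients add up to the full-batch gradient, which is why the chunks can be processed independently.

```python
def split_batch(batch, num_parts):
    """Split a list into num_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(batch), num_parts)
    chunks, start = [], 0
    for i in range(num_parts):
        # The first `extra` chunks take one additional sample.
        end = start + size + (1 if i < extra else 0)
        chunks.append(batch[start:end])
        start = end
    return chunks


def grad_sum(w, samples):
    """Gradient of sum((w * x - y) ** 2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)


w = 0.5
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

# Each simulated device computes the gradient on its own chunk.
chunks = split_batch(batch, num_parts=2)
per_device = [grad_sum(w, chunk) for chunk in chunks]

# Adding the per-device gradients recovers the full-batch gradient.
combined = sum(per_device)
full = grad_sum(w, batch)
print(chunks)
print(per_device, combined, full)
```

Because the loss is a sum over samples, its gradient is also a sum over samples, so each device can work on its chunk without seeing the others. The per-device results only need to be combined before the parameter update.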

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer),batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function

This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data to the CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.780499206019643 samples/sec                   batch loss = 13.533374071121216 | accuracy = 0.65


Epoch[1] Batch[10] Speed: 1.2556562367785262 samples/sec                   batch loss = 26.91742491722107 | accuracy = 0.65


Epoch[1] Batch[15] Speed: 1.2547137914057886 samples/sec                   batch loss = 40.53870332241058 | accuracy = 0.6333333333333333


Epoch[1] Batch[20] Speed: 1.2584250649662858 samples/sec                   batch loss = 53.385477900505066 | accuracy = 0.6375


Epoch[1] Batch[25] Speed: 1.254147090415463 samples/sec                   batch loss = 67.70736110210419 | accuracy = 0.62


Epoch[1] Batch[30] Speed: 1.2538444406910891 samples/sec                   batch loss = 81.33279049396515 | accuracy = 0.6166666666666667


Epoch[1] Batch[35] Speed: 1.2595949122240246 samples/sec                   batch loss = 95.06146657466888 | accuracy = 0.5928571428571429


Epoch[1] Batch[40] Speed: 1.2581519540324875 samples/sec                   batch loss = 109.27478587627411 | accuracy = 0.60625


Epoch[1] Batch[45] Speed: 1.2619945862129713 samples/sec                   batch loss = 122.95663344860077 | accuracy = 0.6166666666666667


Epoch[1] Batch[50] Speed: 1.2564030460922564 samples/sec                   batch loss = 138.06807100772858 | accuracy = 0.6


Epoch[1] Batch[55] Speed: 1.2635308730007029 samples/sec                   batch loss = 152.27074682712555 | accuracy = 0.5818181818181818


Epoch[1] Batch[60] Speed: 1.2604886413607002 samples/sec                   batch loss = 165.82401430606842 | accuracy = 0.5833333333333334


Epoch[1] Batch[65] Speed: 1.2591775350669336 samples/sec                   batch loss = 180.49679672718048 | accuracy = 0.5576923076923077


Epoch[1] Batch[70] Speed: 1.2619507309246083 samples/sec                   batch loss = 195.5519243478775 | accuracy = 0.55


Epoch[1] Batch[75] Speed: 1.2537585181732114 samples/sec                   batch loss = 208.38168799877167 | accuracy = 0.5666666666666667


Epoch[1] Batch[80] Speed: 1.2577059265470814 samples/sec                   batch loss = 221.78008830547333 | accuracy = 0.575


Epoch[1] Batch[85] Speed: 1.2554630497743495 samples/sec                   batch loss = 235.8705974817276 | accuracy = 0.5647058823529412


Epoch[1] Batch[90] Speed: 1.2612918448311494 samples/sec                   batch loss = 249.7232083082199 | accuracy = 0.5666666666666667


Epoch[1] Batch[95] Speed: 1.2595656915449898 samples/sec                   batch loss = 264.3494428396225 | accuracy = 0.5526315789473685


Epoch[1] Batch[100] Speed: 1.258679219336341 samples/sec                   batch loss = 279.1314207315445 | accuracy = 0.54


Epoch[1] Batch[105] Speed: 1.2612225334121565 samples/sec                   batch loss = 292.6632391214371 | accuracy = 0.5428571428571428


Epoch[1] Batch[110] Speed: 1.256931577113037 samples/sec                   batch loss = 306.16659247875214 | accuracy = 0.5454545454545454


Epoch[1] Batch[115] Speed: 1.256136737075405 samples/sec                   batch loss = 320.0802150964737 | accuracy = 0.5434782608695652


Epoch[1] Batch[120] Speed: 1.2532300305217379 samples/sec                   batch loss = 333.8124953508377 | accuracy = 0.5458333333333333


Epoch[1] Batch[125] Speed: 1.254411806048614 samples/sec                   batch loss = 346.8062080144882 | accuracy = 0.552


Epoch[1] Batch[130] Speed: 1.2533583887234607 samples/sec                   batch loss = 360.1819795370102 | accuracy = 0.5557692307692308


Epoch[1] Batch[135] Speed: 1.2581494065618806 samples/sec                   batch loss = 374.98713982105255 | accuracy = 0.5481481481481482


Epoch[1] Batch[140] Speed: 1.2554254716614752 samples/sec                   batch loss = 388.1741899251938 | accuracy = 0.5517857142857143


Epoch[1] Batch[145] Speed: 1.254862445226225 samples/sec                   batch loss = 402.17486011981964 | accuracy = 0.5517241379310345


Epoch[1] Batch[150] Speed: 1.257246271192855 samples/sec                   batch loss = 415.3946248292923 | accuracy = 0.5533333333333333


Epoch[1] Batch[155] Speed: 1.254972362837857 samples/sec                   batch loss = 428.53794944286346 | accuracy = 0.5580645161290323


Epoch[1] Batch[160] Speed: 1.2538402239350586 samples/sec                   batch loss = 441.7243539094925 | accuracy = 0.5609375


Epoch[1] Batch[165] Speed: 1.2543217733161636 samples/sec                   batch loss = 455.6720942258835 | accuracy = 0.5590909090909091


Epoch[1] Batch[170] Speed: 1.2578430305282229 samples/sec                   batch loss = 468.918444275856 | accuracy = 0.5632352941176471


Epoch[1] Batch[175] Speed: 1.2538480015293805 samples/sec                   batch loss = 482.5707117319107 | accuracy = 0.56


Epoch[1] Batch[180] Speed: 1.2599842723663854 samples/sec                   batch loss = 496.0129815340042 | accuracy = 0.5611111111111111


Epoch[1] Batch[185] Speed: 1.2550318822019215 samples/sec                   batch loss = 510.1214522123337 | accuracy = 0.5594594594594594


Epoch[1] Batch[190] Speed: 1.2575208741505173 samples/sec                   batch loss = 523.5339900255203 | accuracy = 0.5592105263157895


Epoch[1] Batch[195] Speed: 1.2570932843819802 samples/sec                   batch loss = 537.408395409584 | accuracy = 0.5564102564102564


Epoch[1] Batch[200] Speed: 1.2564070919249493 samples/sec                   batch loss = 550.9468809366226 | accuracy = 0.55625


Epoch[1] Batch[205] Speed: 1.260703650988941 samples/sec                   batch loss = 565.5677038431168 | accuracy = 0.5536585365853659


Epoch[1] Batch[210] Speed: 1.256755978315789 samples/sec                   batch loss = 578.7859667539597 | accuracy = 0.5583333333333333


Epoch[1] Batch[215] Speed: 1.2575008921268207 samples/sec                   batch loss = 592.1503337621689 | accuracy = 0.5627906976744186


Epoch[1] Batch[220] Speed: 1.2626875624204574 samples/sec                   batch loss = 606.3334006071091 | accuracy = 0.5579545454545455


Epoch[1] Batch[225] Speed: 1.2596571407119594 samples/sec                   batch loss = 620.3110536336899 | accuracy = 0.5566666666666666


Epoch[1] Batch[230] Speed: 1.2617465876613045 samples/sec                   batch loss = 634.1587852239609 | accuracy = 0.5576086956521739


Epoch[1] Batch[235] Speed: 1.2637376882824016 samples/sec                   batch loss = 648.404888510704 | accuracy = 0.5553191489361702


Epoch[1] Batch[240] Speed: 1.26280674435647 samples/sec                   batch loss = 662.6342707872391 | accuracy = 0.5510416666666667


Epoch[1] Batch[245] Speed: 1.2594054274668156 samples/sec                   batch loss = 675.9389976263046 | accuracy = 0.5571428571428572


Epoch[1] Batch[250] Speed: 1.2597593865770862 samples/sec                   batch loss = 688.8719455003738 | accuracy = 0.561


Epoch[1] Batch[255] Speed: 1.2553962562331265 samples/sec                   batch loss = 704.0011109113693 | accuracy = 0.5549019607843138


Epoch[1] Batch[260] Speed: 1.2599676184154922 samples/sec                   batch loss = 717.2375956773758 | accuracy = 0.5576923076923077


Epoch[1] Batch[265] Speed: 1.2572645492007364 samples/sec                   batch loss = 730.3589051961899 | accuracy = 0.5566037735849056


Epoch[1] Batch[270] Speed: 1.2631437879114247 samples/sec                   batch loss = 744.5966514348984 | accuracy = 0.5555555555555556


Epoch[1] Batch[275] Speed: 1.2559579754700199 samples/sec                   batch loss = 757.6060174703598 | accuracy = 0.5563636363636364


Epoch[1] Batch[280] Speed: 1.2551004209548275 samples/sec                   batch loss = 770.833766579628 | accuracy = 0.5589285714285714


Epoch[1] Batch[285] Speed: 1.2580991197168971 samples/sec                   batch loss = 784.9075707197189 | accuracy = 0.5570175438596491


Epoch[1] Batch[290] Speed: 1.2592651471464318 samples/sec                   batch loss = 798.931516289711 | accuracy = 0.5568965517241379


Epoch[1] Batch[295] Speed: 1.266540494878266 samples/sec                   batch loss = 813.3510092496872 | accuracy = 0.5533898305084746


Epoch[1] Batch[300] Speed: 1.2562583541377932 samples/sec                   batch loss = 827.0500503778458 | accuracy = 0.5533333333333333


Epoch[1] Batch[305] Speed: 1.258928091721602 samples/sec                   batch loss = 840.4424756765366 | accuracy = 0.5532786885245902


Epoch[1] Batch[310] Speed: 1.2611815759357294 samples/sec                   batch loss = 853.8690038919449 | accuracy = 0.5540322580645162


Epoch[1] Batch[315] Speed: 1.2636722005562135 samples/sec                   batch loss = 867.145511507988 | accuracy = 0.5555555555555556


Epoch[1] Batch[320] Speed: 1.2592776236445413 samples/sec                   batch loss = 880.877179980278 | accuracy = 0.55546875


Epoch[1] Batch[325] Speed: 1.2611557892597245 samples/sec                   batch loss = 894.2142306566238 | accuracy = 0.5553846153846154


Epoch[1] Batch[330] Speed: 1.2549459846097746 samples/sec                   batch loss = 907.9662743806839 | accuracy = 0.5553030303030303


Epoch[1] Batch[335] Speed: 1.2588233360947507 samples/sec                   batch loss = 921.5309487581253 | accuracy = 0.5544776119402985


Epoch[1] Batch[340] Speed: 1.2578202092226565 samples/sec                   batch loss = 934.8625441789627 | accuracy = 0.5536764705882353


Epoch[1] Batch[345] Speed: 1.2616932612519713 samples/sec                   batch loss = 947.7204011678696 | accuracy = 0.5572463768115942


Epoch[1] Batch[350] Speed: 1.2559191455098755 samples/sec                   batch loss = 961.8640216588974 | accuracy = 0.555


Epoch[1] Batch[355] Speed: 1.254149621704246 samples/sec                   batch loss = 974.8258529901505 | accuracy = 0.5570422535211268


Epoch[1] Batch[360] Speed: 1.2619083024103215 samples/sec                   batch loss = 988.4158507585526 | accuracy = 0.5583333333333333


Epoch[1] Batch[365] Speed: 1.2621135424160455 samples/sec                   batch loss = 1001.9144300222397 | accuracy = 0.5582191780821918


Epoch[1] Batch[370] Speed: 1.2567644511256613 samples/sec                   batch loss = 1014.8303176164627 | accuracy = 0.5601351351351351


Epoch[1] Batch[375] Speed: 1.2637595824954264 samples/sec                   batch loss = 1028.0917354822159 | accuracy = 0.56


Epoch[1] Batch[380] Speed: 1.2602668888950486 samples/sec                   batch loss = 1042.9545718431473 | accuracy = 0.5572368421052631


Epoch[1] Batch[385] Speed: 1.2555324812555613 samples/sec                   batch loss = 1056.4845465421677 | accuracy = 0.5584415584415584


Epoch[1] Batch[390] Speed: 1.2635169798770736 samples/sec                   batch loss = 1070.1329163312912 | accuracy = 0.558974358974359


Epoch[1] Batch[395] Speed: 1.260342722867147 samples/sec                   batch loss = 1083.5687240362167 | accuracy = 0.5607594936708861


Epoch[1] Batch[400] Speed: 1.2573613184369499 samples/sec                   batch loss = 1096.4598177671432 | accuracy = 0.560625


Epoch[1] Batch[405] Speed: 1.2558702589019148 samples/sec                   batch loss = 1109.3513880968094 | accuracy = 0.5617283950617284


Epoch[1] Batch[410] Speed: 1.2541132471285161 samples/sec                   batch loss = 1122.2195295095444 | accuracy = 0.5634146341463414


Epoch[1] Batch[415] Speed: 1.2527996467086262 samples/sec                   batch loss = 1134.5492659807205 | accuracy = 0.5656626506024096


Epoch[1] Batch[420] Speed: 1.2646164320079742 samples/sec                   batch loss = 1147.710446715355 | accuracy = 0.5660714285714286


Epoch[1] Batch[425] Speed: 1.2623765978241233 samples/sec                   batch loss = 1160.478525519371 | accuracy = 0.5670588235294117


Epoch[1] Batch[430] Speed: 1.2574515053210342 samples/sec                   batch loss = 1174.6724776029587 | accuracy = 0.5668604651162791


Epoch[1] Batch[435] Speed: 1.2574534844886336 samples/sec                   batch loss = 1187.8609813451767 | accuracy = 0.5672413793103448


Epoch[1] Batch[440] Speed: 1.2564793567675412 samples/sec                   batch loss = 1199.6823165416718 | accuracy = 0.5698863636363637


Epoch[1] Batch[445] Speed: 1.2538108948389273 samples/sec                   batch loss = 1211.8411922454834 | accuracy = 0.5719101123595506


Epoch[1] Batch[450] Speed: 1.253941433908451 samples/sec                   batch loss = 1224.4061295986176 | accuracy = 0.5738888888888889


Epoch[1] Batch[455] Speed: 1.2522200926304023 samples/sec                   batch loss = 1237.723474264145 | accuracy = 0.5752747252747252


Epoch[1] Batch[460] Speed: 1.2564332493325494 samples/sec                   batch loss = 1252.3371560573578 | accuracy = 0.5728260869565217


Epoch[1] Batch[465] Speed: 1.2529896753844807 samples/sec                   batch loss = 1265.3908128738403 | accuracy = 0.5731182795698925


Epoch[1] Batch[470] Speed: 1.251426904429496 samples/sec                   batch loss = 1279.276516199112 | accuracy = 0.5723404255319149


Epoch[1] Batch[475] Speed: 1.2580978932587008 samples/sec                   batch loss = 1293.2084860801697 | accuracy = 0.5721052631578948


Epoch[1] Batch[480] Speed: 1.255682738854509 samples/sec                   batch loss = 1307.8427398204803 | accuracy = 0.5713541666666667


Epoch[1] Batch[485] Speed: 1.2531236938237789 samples/sec                   batch loss = 1320.4015254974365 | accuracy = 0.5721649484536082


Epoch[1] Batch[490] Speed: 1.2543947363502121 samples/sec                   batch loss = 1334.551310300827 | accuracy = 0.5719387755102041


Epoch[1] Batch[495] Speed: 1.2547395029943367 samples/sec                   batch loss = 1347.4421775341034 | accuracy = 0.5732323232323232


Epoch[1] Batch[500] Speed: 1.2544020518783563 samples/sec                   batch loss = 1361.4184548854828 | accuracy = 0.5725


Epoch[1] Batch[505] Speed: 1.2581105353271271 samples/sec                   batch loss = 1374.822069644928 | accuracy = 0.5722772277227722


Epoch[1] Batch[510] Speed: 1.2602632915051295 samples/sec                   batch loss = 1389.0487387180328 | accuracy = 0.571078431372549


Epoch[1] Batch[515] Speed: 1.2544026146148248 samples/sec                   batch loss = 1402.6558330059052 | accuracy = 0.5703883495145631


Epoch[1] Batch[520] Speed: 1.2519388310726665 samples/sec                   batch loss = 1414.950507760048 | accuracy = 0.5721153846153846


Epoch[1] Batch[525] Speed: 1.2566425478094014 samples/sec                   batch loss = 1429.601058602333 | accuracy = 0.570952380952381


Epoch[1] Batch[530] Speed: 1.262114207038831 samples/sec                   batch loss = 1442.641222834587 | accuracy = 0.5712264150943396


Epoch[1] Batch[535] Speed: 1.252368436571345 samples/sec                   batch loss = 1456.5442126989365 | accuracy = 0.5714953271028037


Epoch[1] Batch[540] Speed: 1.261285681400497 samples/sec                   batch loss = 1470.0418628454208 | accuracy = 0.5712962962962963


Epoch[1] Batch[545] Speed: 1.2554060258713424 samples/sec                   batch loss = 1484.0197094678879 | accuracy = 0.5701834862385321


Epoch[1] Batch[550] Speed: 1.2577376067923702 samples/sec                   batch loss = 1497.8029843568802 | accuracy = 0.5686363636363636


Epoch[1] Batch[555] Speed: 1.2524772631682362 samples/sec                   batch loss = 1509.8997963666916 | accuracy = 0.5707207207207208


Epoch[1] Batch[560] Speed: 1.2581008178937323 samples/sec                   batch loss = 1523.2936581373215 | accuracy = 0.5705357142857143


Epoch[1] Batch[565] Speed: 1.256149057608334 samples/sec                   batch loss = 1536.2856577634811 | accuracy = 0.5716814159292035


Epoch[1] Batch[570] Speed: 1.2510996291119387 samples/sec                   batch loss = 1549.770166516304 | accuracy = 0.5714912280701754


Epoch[1] Batch[575] Speed: 1.2570337577063224 samples/sec                   batch loss = 1563.1073132753372 | accuracy = 0.5721739130434783


Epoch[1] Batch[580] Speed: 1.2538214831402903 samples/sec                   batch loss = 1576.1679795980453 | accuracy = 0.5728448275862069


Epoch[1] Batch[585] Speed: 1.2635707460589896 samples/sec                   batch loss = 1589.1863244771957 | accuracy = 0.5735042735042735


Epoch[1] Batch[590] Speed: 1.2600078346739598 samples/sec                   batch loss = 1603.1506179571152 | accuracy = 0.5745762711864407


Epoch[1] Batch[595] Speed: 1.2607405983529951 samples/sec                   batch loss = 1616.4562429189682 | accuracy = 0.5752100840336134


Epoch[1] Batch[600] Speed: 1.258980145410072 samples/sec                   batch loss = 1630.4296370744705 | accuracy = 0.575


Epoch[1] Batch[605] Speed: 1.2594100598972127 samples/sec                   batch loss = 1643.6592041254044 | accuracy = 0.5743801652892562


Epoch[1] Batch[610] Speed: 1.2613498788549824 samples/sec                   batch loss = 1657.0770860910416 | accuracy = 0.5741803278688524


Epoch[1] Batch[615] Speed: 1.2604678073213778 samples/sec                   batch loss = 1668.9076741933823 | accuracy = 0.5764227642276423


Epoch[1] Batch[620] Speed: 1.2598544589089757 samples/sec                   batch loss = 1681.142399430275 | accuracy = 0.5774193548387097


Epoch[1] Batch[625] Speed: 1.2581852608010407 samples/sec                   batch loss = 1693.8692685365677 | accuracy = 0.578


Epoch[1] Batch[630] Speed: 1.2592351856569277 samples/sec                   batch loss = 1705.9764646291733 | accuracy = 0.5781746031746032


Epoch[1] Batch[635] Speed: 1.257284900597684 samples/sec                   batch loss = 1719.2039033174515 | accuracy = 0.5791338582677166


Epoch[1] Batch[640] Speed: 1.2518375704769809 samples/sec                   batch loss = 1731.6810247898102 | accuracy = 0.57890625


Epoch[1] Batch[645] Speed: 1.2537916864301162 samples/sec                   batch loss = 1744.527492761612 | accuracy = 0.5794573643410853


Epoch[1] Batch[650] Speed: 1.254511326011943 samples/sec                   batch loss = 1756.9932000637054 | accuracy = 0.58


Epoch[1] Batch[655] Speed: 1.252602006944658 samples/sec                   batch loss = 1769.705540895462 | accuracy = 0.5809160305343511


Epoch[1] Batch[660] Speed: 1.2606958828443344 samples/sec                   batch loss = 1781.55313205719 | accuracy = 0.5821969696969697


Epoch[1] Batch[665] Speed: 1.2657746177620364 samples/sec                   batch loss = 1795.0021874904633 | accuracy = 0.5823308270676691


Epoch[1] Batch[670] Speed: 1.2632910216231044 samples/sec                   batch loss = 1808.6165755987167 | accuracy = 0.5824626865671642


Epoch[1] Batch[675] Speed: 1.260496028140022 samples/sec                   batch loss = 1823.5076516866684 | accuracy = 0.5811111111111111


Epoch[1] Batch[680] Speed: 1.2622823790974087 samples/sec                   batch loss = 1835.9557212591171 | accuracy = 0.5819852941176471


Epoch[1] Batch[685] Speed: 1.2540106034548417 samples/sec                   batch loss = 1848.1758221387863 | accuracy = 0.5828467153284671


Epoch[1] Batch[690] Speed: 1.2584987893753075 samples/sec                   batch loss = 1860.753295302391 | accuracy = 0.5826086956521739


Epoch[1] Batch[695] Speed: 1.2614340945825047 samples/sec                   batch loss = 1874.700728058815 | accuracy = 0.5816546762589928


Epoch[1] Batch[700] Speed: 1.258736163233703 samples/sec                   batch loss = 1886.9947546720505 | accuracy = 0.5825


Epoch[1] Batch[705] Speed: 1.2614869248266634 samples/sec                   batch loss = 1900.838483452797 | accuracy = 0.5836879432624114


Epoch[1] Batch[710] Speed: 1.2628370661579755 samples/sec                   batch loss = 1913.5301815271378 | accuracy = 0.5845070422535211


Epoch[1] Batch[715] Speed: 1.2585635531529022 samples/sec                   batch loss = 1925.8480867147446 | accuracy = 0.5849650349650349


Epoch[1] Batch[720] Speed: 1.262926614271858 samples/sec                   batch loss = 1937.7987884283066 | accuracy = 0.5850694444444444


Epoch[1] Batch[725] Speed: 1.260805782572908 samples/sec                   batch loss = 1952.0289672613144 | accuracy = 0.5851724137931035


Epoch[1] Batch[730] Speed: 1.2610578663794967 samples/sec                   batch loss = 1964.730095744133 | accuracy = 0.5856164383561644


Epoch[1] Batch[735] Speed: 1.2637174130138695 samples/sec                   batch loss = 1978.4731956720352 | accuracy = 0.5853741496598639


Epoch[1] Batch[740] Speed: 1.262498951191954 samples/sec                   batch loss = 1993.9628328084946 | accuracy = 0.5847972972972973


Epoch[1] Batch[745] Speed: 1.2594646116920982 samples/sec                   batch loss = 2007.6074129343033 | accuracy = 0.5852348993288591


Epoch[1] Batch[750] Speed: 1.260682146725372 samples/sec                   batch loss = 2020.1479350328445 | accuracy = 0.586


Epoch[1] Batch[755] Speed: 1.261801816559076 samples/sec                   batch loss = 2031.6026822328568 | accuracy = 0.5867549668874172


Epoch[1] Batch[760] Speed: 1.2607701578035273 samples/sec                   batch loss = 2044.3230897188187 | accuracy = 0.5875


Epoch[1] Batch[765] Speed: 1.260411653545342 samples/sec                   batch loss = 2057.782122731209 | accuracy = 0.5875816993464053


Epoch[1] Batch[770] Speed: 1.258059874240624 samples/sec                   batch loss = 2071.12031686306 | accuracy = 0.587987012987013


Epoch[1] Batch[775] Speed: 1.2609857373729052 samples/sec                   batch loss = 2084.9827497005463 | accuracy = 0.5883870967741935


Epoch[1] Batch[780] Speed: 1.257863966503777 samples/sec                   batch loss = 2096.740876555443 | accuracy = 0.5894230769230769


Epoch[1] Batch[785] Speed: 1.2559900378521491 samples/sec                   batch loss = 2109.7431432008743 | accuracy = 0.5898089171974522


[Epoch 1] training: accuracy=0.5901015228426396
[Epoch 1] time cost: 644.2185699939728
[Epoch 1] validation: validation accuracy=0.7077777777777777


Epoch[2] Batch[5] Speed: 1.2613057838887833 samples/sec                   batch loss = 14.634690761566162 | accuracy = 0.45


Epoch[2] Batch[10] Speed: 1.2628806978882305 samples/sec                   batch loss = 28.426279306411743 | accuracy = 0.5


Epoch[2] Batch[15] Speed: 1.2615588265947042 samples/sec                   batch loss = 41.01937758922577 | accuracy = 0.55


Epoch[2] Batch[20] Speed: 1.2603923370454717 samples/sec                   batch loss = 53.921212553977966 | accuracy = 0.55


Epoch[2] Batch[25] Speed: 1.2619590840778723 samples/sec                   batch loss = 64.73092067241669 | accuracy = 0.61


Epoch[2] Batch[30] Speed: 1.262881838627503 samples/sec                   batch loss = 74.73751378059387 | accuracy = 0.6416666666666667


Epoch[2] Batch[35] Speed: 1.2605053090880458 samples/sec                   batch loss = 87.13557457923889 | accuracy = 0.6285714285714286


Epoch[2] Batch[40] Speed: 1.2536647383328932 samples/sec                   batch loss = 98.4158284664154 | accuracy = 0.65


Epoch[2] Batch[45] Speed: 1.2640350384163157 samples/sec                   batch loss = 109.78392207622528 | accuracy = 0.6555555555555556


Epoch[2] Batch[50] Speed: 1.2564827443860052 samples/sec                   batch loss = 122.25332534313202 | accuracy = 0.66


Epoch[2] Batch[55] Speed: 1.2535009149161251 samples/sec                   batch loss = 135.20265543460846 | accuracy = 0.6545454545454545


Epoch[2] Batch[60] Speed: 1.2561459539398616 samples/sec                   batch loss = 148.51492547988892 | accuracy = 0.6625


Epoch[2] Batch[65] Speed: 1.2546226832741065 samples/sec                   batch loss = 160.6270785331726 | accuracy = 0.6653846153846154


Epoch[2] Batch[70] Speed: 1.2561292132046928 samples/sec                   batch loss = 173.91586756706238 | accuracy = 0.6607142857142857


Epoch[2] Batch[75] Speed: 1.2608106148110463 samples/sec                   batch loss = 187.7790927886963 | accuracy = 0.6533333333333333


Epoch[2] Batch[80] Speed: 1.257077554475032 samples/sec                   batch loss = 200.38798236846924 | accuracy = 0.653125


Epoch[2] Batch[85] Speed: 1.254797217148849 samples/sec                   batch loss = 214.49110984802246 | accuracy = 0.6470588235294118


Epoch[2] Batch[90] Speed: 1.2596403062590111 samples/sec                   batch loss = 226.85694408416748 | accuracy = 0.6527777777777778


Epoch[2] Batch[95] Speed: 1.2573866674660916 samples/sec                   batch loss = 240.41942131519318 | accuracy = 0.65


Epoch[2] Batch[100] Speed: 1.2589519924270252 samples/sec                   batch loss = 253.58145308494568 | accuracy = 0.65


Epoch[2] Batch[105] Speed: 1.2548070713156083 samples/sec                   batch loss = 263.7497205734253 | accuracy = 0.6571428571428571


Epoch[2] Batch[110] Speed: 1.263889535503034 samples/sec                   batch loss = 277.5665988922119 | accuracy = 0.6522727272727272


Epoch[2] Batch[115] Speed: 1.2571032688385293 samples/sec                   batch loss = 290.3277566432953 | accuracy = 0.6521739130434783


Epoch[2] Batch[120] Speed: 1.2599678076624612 samples/sec                   batch loss = 302.8681938648224 | accuracy = 0.65


Epoch[2] Batch[125] Speed: 1.2569725415309934 samples/sec                   batch loss = 317.14509630203247 | accuracy = 0.648


Epoch[2] Batch[130] Speed: 1.2548766179786492 samples/sec                   batch loss = 328.5540974140167 | accuracy = 0.65


Epoch[2] Batch[135] Speed: 1.2563207237608174 samples/sec                   batch loss = 340.12767589092255 | accuracy = 0.6518518518518519


Epoch[2] Batch[140] Speed: 1.2595214375641408 samples/sec                   batch loss = 353.17790591716766 | accuracy = 0.6482142857142857


Epoch[2] Batch[145] Speed: 1.260323787187787 samples/sec                   batch loss = 367.0002602338791 | accuracy = 0.6448275862068965


Epoch[2] Batch[150] Speed: 1.2563597667216246 samples/sec                   batch loss = 381.243505358696 | accuracy = 0.6366666666666667


Epoch[2] Batch[155] Speed: 1.2639573311218195 samples/sec                   batch loss = 392.33917224407196 | accuracy = 0.6419354838709678


Epoch[2] Batch[160] Speed: 1.257110427603113 samples/sec                   batch loss = 404.169300198555 | accuracy = 0.6421875


Epoch[2] Batch[165] Speed: 1.2602102797192134 samples/sec                   batch loss = 418.13198697566986 | accuracy = 0.6393939393939394


Epoch[2] Batch[170] Speed: 1.2591119521259933 samples/sec                   batch loss = 430.4321736097336 | accuracy = 0.6411764705882353


Epoch[2] Batch[175] Speed: 1.2581995087108027 samples/sec                   batch loss = 442.1251199245453 | accuracy = 0.6442857142857142


Epoch[2] Batch[180] Speed: 1.2591942626453128 samples/sec                   batch loss = 456.76354122161865 | accuracy = 0.6416666666666667


Epoch[2] Batch[185] Speed: 1.2577229921952011 samples/sec                   batch loss = 470.50954008102417 | accuracy = 0.6351351351351351


Epoch[2] Batch[190] Speed: 1.2554522458364898 samples/sec                   batch loss = 482.3822503089905 | accuracy = 0.6368421052631579


Epoch[2] Batch[195] Speed: 1.257079626655829 samples/sec                   batch loss = 493.33056831359863 | accuracy = 0.6371794871794871


Epoch[2] Batch[200] Speed: 1.2577513731106609 samples/sec                   batch loss = 506.72946190834045 | accuracy = 0.63375


Epoch[2] Batch[205] Speed: 1.2602871482638842 samples/sec                   batch loss = 518.8965004682541 | accuracy = 0.6341463414634146


Epoch[2] Batch[210] Speed: 1.2557112157361485 samples/sec                   batch loss = 533.140181183815 | accuracy = 0.6273809523809524


Epoch[2] Batch[215] Speed: 1.2636291803372282 samples/sec                   batch loss = 545.506040930748 | accuracy = 0.6267441860465116


Epoch[2] Batch[220] Speed: 1.2612237659700394 samples/sec                   batch loss = 558.6448802947998 | accuracy = 0.6261363636363636


Epoch[2] Batch[225] Speed: 1.2598565402548532 samples/sec                   batch loss = 571.3609299659729 | accuracy = 0.6266666666666667


Epoch[2] Batch[230] Speed: 1.2625249828405822 samples/sec                   batch loss = 583.0818400382996 | accuracy = 0.6282608695652174


Epoch[2] Batch[235] Speed: 1.2615582574192385 samples/sec                   batch loss = 594.2450126409531 | accuracy = 0.6297872340425532


Epoch[2] Batch[240] Speed: 1.2616649867995762 samples/sec                   batch loss = 608.4372812509537 | accuracy = 0.6260416666666667


Epoch[2] Batch[245] Speed: 1.2590861555214423 samples/sec                   batch loss = 617.8819189071655 | accuracy = 0.6326530612244898


Epoch[2] Batch[250] Speed: 1.2538679613394816 samples/sec                   batch loss = 632.0739183425903 | accuracy = 0.632


Epoch[2] Batch[255] Speed: 1.2533114801614484 samples/sec                   batch loss = 648.6731066703796 | accuracy = 0.6303921568627451


Epoch[2] Batch[260] Speed: 1.2538648689329364 samples/sec                   batch loss = 663.2221493721008 | accuracy = 0.6278846153846154


Epoch[2] Batch[265] Speed: 1.2541400591113443 samples/sec                   batch loss = 676.1668734550476 | accuracy = 0.6273584905660378


Epoch[2] Batch[270] Speed: 1.2570637087150325 samples/sec                   batch loss = 688.0539464950562 | accuracy = 0.6268518518518519


Epoch[2] Batch[275] Speed: 1.2645762069774884 samples/sec                   batch loss = 702.5127940177917 | accuracy = 0.6281818181818182


Epoch[2] Batch[280] Speed: 1.253675324165837 samples/sec                   batch loss = 715.2881844043732 | accuracy = 0.6285714285714286


Epoch[2] Batch[285] Speed: 1.253149995499713 samples/sec                   batch loss = 728.265748500824 | accuracy = 0.6298245614035087


Epoch[2] Batch[290] Speed: 1.2615750483113783 samples/sec                   batch loss = 741.7770887613297 | accuracy = 0.6293103448275862


Epoch[2] Batch[295] Speed: 1.2583209592906872 samples/sec                   batch loss = 756.245793223381 | accuracy = 0.6271186440677966


Epoch[2] Batch[300] Speed: 1.2547562067077473 samples/sec                   batch loss = 766.7088038921356 | accuracy = 0.63


Epoch[2] Batch[305] Speed: 1.2583931613058947 samples/sec                   batch loss = 778.3343210220337 | accuracy = 0.6303278688524591


Epoch[2] Batch[310] Speed: 1.2576625573706983 samples/sec                   batch loss = 789.5512230396271 | accuracy = 0.6314516129032258


Epoch[2] Batch[315] Speed: 1.2548190842234765 samples/sec                   batch loss = 802.0897108316422 | accuracy = 0.6325396825396825


Epoch[2] Batch[320] Speed: 1.2551443648057772 samples/sec                   batch loss = 812.4449155330658 | accuracy = 0.634375


Epoch[2] Batch[325] Speed: 1.2600231648412326 samples/sec                   batch loss = 824.8703019618988 | accuracy = 0.6346153846153846


Epoch[2] Batch[330] Speed: 1.2610646911017573 samples/sec                   batch loss = 836.308831691742 | accuracy = 0.6363636363636364


Epoch[2] Batch[335] Speed: 1.2592081553787469 samples/sec                   batch loss = 849.7355073690414 | accuracy = 0.6343283582089553


Epoch[2] Batch[340] Speed: 1.2541349966211974 samples/sec                   batch loss = 861.5143538713455 | accuracy = 0.6360294117647058


Epoch[2] Batch[345] Speed: 1.261932791003318 samples/sec                   batch loss = 874.6519697904587 | accuracy = 0.6355072463768116


Epoch[2] Batch[350] Speed: 1.2605765307069319 samples/sec                   batch loss = 887.6733889579773 | accuracy = 0.6342857142857142


Epoch[2] Batch[355] Speed: 1.262040628052898 samples/sec                   batch loss = 898.0796055793762 | accuracy = 0.6338028169014085


Epoch[2] Batch[360] Speed: 1.2540615951066438 samples/sec                   batch loss = 908.6577821969986 | accuracy = 0.6361111111111111


Epoch[2] Batch[365] Speed: 1.257504662271366 samples/sec                   batch loss = 921.1790093183517 | accuracy = 0.6363013698630137


Epoch[2] Batch[370] Speed: 1.253789625074872 samples/sec                   batch loss = 933.3997610807419 | accuracy = 0.6364864864864865


Epoch[2] Batch[375] Speed: 1.2579407377104739 samples/sec                   batch loss = 945.4868434667587 | accuracy = 0.6373333333333333


Epoch[2] Batch[380] Speed: 1.2575074898946104 samples/sec                   batch loss = 954.5941202640533 | accuracy = 0.6401315789473684


Epoch[2] Batch[385] Speed: 1.2585033207347098 samples/sec                   batch loss = 968.275686621666 | accuracy = 0.6396103896103896


Epoch[2] Batch[390] Speed: 1.2600251521124508 samples/sec                   batch loss = 978.6759201288223 | accuracy = 0.6416666666666667


Epoch[2] Batch[395] Speed: 1.2592346185762373 samples/sec                   batch loss = 991.3912619352341 | accuracy = 0.6417721518987342


Epoch[2] Batch[400] Speed: 1.2563125391586387 samples/sec                   batch loss = 1005.2460157871246 | accuracy = 0.641875


Epoch[2] Batch[405] Speed: 1.2584940692439621 samples/sec                   batch loss = 1015.3365758657455 | accuracy = 0.6438271604938272


Epoch[2] Batch[410] Speed: 1.2652476899553358 samples/sec                   batch loss = 1028.3883658647537 | accuracy = 0.6439024390243903


Epoch[2] Batch[415] Speed: 1.259677569641208 samples/sec                   batch loss = 1042.4133588075638 | accuracy = 0.6427710843373494


Epoch[2] Batch[420] Speed: 1.265190537036568 samples/sec                   batch loss = 1051.7340602874756 | accuracy = 0.6452380952380953


Epoch[2] Batch[425] Speed: 1.2566340766431308 samples/sec                   batch loss = 1061.8484959602356 | accuracy = 0.6458823529411765


Epoch[2] Batch[430] Speed: 1.2611842305063585 samples/sec                   batch loss = 1074.6210941076279 | accuracy = 0.6453488372093024


Epoch[2] Batch[435] Speed: 1.2584777378626948 samples/sec                   batch loss = 1089.5446521043777 | accuracy = 0.6459770114942529


Epoch[2] Batch[440] Speed: 1.259632740359301 samples/sec                   batch loss = 1098.837073802948 | accuracy = 0.65


Epoch[2] Batch[445] Speed: 1.2612060363308661 samples/sec                   batch loss = 1109.4443883895874 | accuracy = 0.6511235955056179


Epoch[2] Batch[450] Speed: 1.2638799190169028 samples/sec                   batch loss = 1123.0116121768951 | accuracy = 0.6516666666666666


Epoch[2] Batch[455] Speed: 1.2595323116515937 samples/sec                   batch loss = 1132.2212014198303 | accuracy = 0.6532967032967033


Epoch[2] Batch[460] Speed: 1.2589235573028332 samples/sec                   batch loss = 1147.1043070554733 | accuracy = 0.6516304347826087


Epoch[2] Batch[465] Speed: 1.2559958675520486 samples/sec                   batch loss = 1158.6157386302948 | accuracy = 0.6521505376344086


Epoch[2] Batch[470] Speed: 1.253280209737407 samples/sec                   batch loss = 1172.0295125246048 | accuracy = 0.6505319148936171


Epoch[2] Batch[475] Speed: 1.257752881766604 samples/sec                   batch loss = 1185.4947999715805 | accuracy = 0.6505263157894737


Epoch[2] Batch[480] Speed: 1.2568337441694175 samples/sec                   batch loss = 1199.2127009630203 | accuracy = 0.65


Epoch[2] Batch[485] Speed: 1.2641579992547916 samples/sec                   batch loss = 1210.8357292413712 | accuracy = 0.65


Epoch[2] Batch[490] Speed: 1.2584567814510195 samples/sec                   batch loss = 1222.9992208480835 | accuracy = 0.6494897959183673


Epoch[2] Batch[495] Speed: 1.2599337441238707 samples/sec                   batch loss = 1236.3538811206818 | accuracy = 0.65


Epoch[2] Batch[500] Speed: 1.260263480840929 samples/sec                   batch loss = 1249.3217545747757 | accuracy = 0.6495


Epoch[2] Batch[505] Speed: 1.2572734057462134 samples/sec                   batch loss = 1265.3577551841736 | accuracy = 0.6475247524752475


Epoch[2] Batch[510] Speed: 1.2632378499346666 samples/sec                   batch loss = 1276.6887969970703 | accuracy = 0.6480392156862745


Epoch[2] Batch[515] Speed: 1.2584979397490523 samples/sec                   batch loss = 1287.0451539754868 | accuracy = 0.6485436893203883


Epoch[2] Batch[520] Speed: 1.260901202443999 samples/sec                   batch loss = 1298.4642267227173 | accuracy = 0.6490384615384616


Epoch[2] Batch[525] Speed: 1.255725219692949 samples/sec                   batch loss = 1310.3659307956696 | accuracy = 0.6490476190476191


Epoch[2] Batch[530] Speed: 1.263519168499563 samples/sec                   batch loss = 1322.8942637443542 | accuracy = 0.6485849056603774


Epoch[2] Batch[535] Speed: 1.2590236055069135 samples/sec                   batch loss = 1333.8020057678223 | accuracy = 0.6485981308411215


Epoch[2] Batch[540] Speed: 1.253678040912853 samples/sec                   batch loss = 1349.1467323303223 | accuracy = 0.6476851851851851


Epoch[2] Batch[545] Speed: 1.2544825283198944 samples/sec                   batch loss = 1361.0462763309479 | accuracy = 0.6481651376146789


Epoch[2] Batch[550] Speed: 1.255024840973262 samples/sec                   batch loss = 1371.9442203044891 | accuracy = 0.649090909090909


Epoch[2] Batch[555] Speed: 1.253488927233652 samples/sec                   batch loss = 1385.2635984420776 | accuracy = 0.6481981981981982


Epoch[2] Batch[560] Speed: 1.2533051136013043 samples/sec                   batch loss = 1399.2460322380066 | accuracy = 0.6482142857142857


Epoch[2] Batch[565] Speed: 1.2512604924191462 samples/sec                   batch loss = 1414.1923458576202 | accuracy = 0.6460176991150443


Epoch[2] Batch[570] Speed: 1.2548563444710965 samples/sec                   batch loss = 1424.952586889267 | accuracy = 0.6478070175438596


Epoch[2] Batch[575] Speed: 1.2583592772368986 samples/sec                   batch loss = 1436.8209022283554 | accuracy = 0.6473913043478261


Epoch[2] Batch[580] Speed: 1.258769595384726 samples/sec                   batch loss = 1446.7633483409882 | accuracy = 0.6487068965517241


Epoch[2] Batch[585] Speed: 1.2578584023702044 samples/sec                   batch loss = 1464.4711565971375 | accuracy = 0.647008547008547


Epoch[2] Batch[590] Speed: 1.2602649008612388 samples/sec                   batch loss = 1476.8501093387604 | accuracy = 0.6466101694915254


Epoch[2] Batch[595] Speed: 1.259850485449579 samples/sec                   batch loss = 1490.7143886089325 | accuracy = 0.6474789915966387


Epoch[2] Batch[600] Speed: 1.2559049491988983 samples/sec                   batch loss = 1500.5822304487228 | accuracy = 0.6483333333333333


Epoch[2] Batch[605] Speed: 1.2587466460051908 samples/sec                   batch loss = 1510.9845210313797 | accuracy = 0.6491735537190083


Epoch[2] Batch[610] Speed: 1.2565537003708123 samples/sec                   batch loss = 1525.4851871728897 | accuracy = 0.6475409836065574


Epoch[2] Batch[615] Speed: 1.256078617515247 samples/sec                   batch loss = 1539.3635958433151 | accuracy = 0.6467479674796748


Epoch[2] Batch[620] Speed: 1.2563327656682381 samples/sec                   batch loss = 1551.7114320993423 | accuracy = 0.646774193548387


Epoch[2] Batch[625] Speed: 1.2665205120158614 samples/sec                   batch loss = 1562.9257091283798 | accuracy = 0.6472


Epoch[2] Batch[630] Speed: 1.2625198524307353 samples/sec                   batch loss = 1576.2899446487427 | accuracy = 0.6468253968253969


Epoch[2] Batch[635] Speed: 1.2554573189667624 samples/sec                   batch loss = 1586.6657215356827 | accuracy = 0.6480314960629922


Epoch[2] Batch[640] Speed: 1.2589940334191563 samples/sec                   batch loss = 1598.9850952625275 | accuracy = 0.64765625


Epoch[2] Batch[645] Speed: 1.2558194020663977 samples/sec                   batch loss = 1614.388525724411 | accuracy = 0.6476744186046511


Epoch[2] Batch[650] Speed: 1.2609048982409412 samples/sec                   batch loss = 1630.1146140098572 | accuracy = 0.6461538461538462


Epoch[2] Batch[655] Speed: 1.2622420175148215 samples/sec                   batch loss = 1644.1043176651 | accuracy = 0.6458015267175573


Epoch[2] Batch[660] Speed: 1.2602659422114997 samples/sec                   batch loss = 1655.6819175481796 | accuracy = 0.646969696969697


Epoch[2] Batch[665] Speed: 1.2618187088369568 samples/sec                   batch loss = 1667.1590799093246 | accuracy = 0.6473684210526316


Epoch[2] Batch[670] Speed: 1.2570229267180877 samples/sec                   batch loss = 1680.5084891319275 | accuracy = 0.6473880597014925


Epoch[2] Batch[675] Speed: 1.2587802676034707 samples/sec                   batch loss = 1692.7922279834747 | accuracy = 0.6477777777777778


Epoch[2] Batch[680] Speed: 1.253304364598481 samples/sec                   batch loss = 1705.5427281856537 | accuracy = 0.6477941176470589


Epoch[2] Batch[685] Speed: 1.2549868197037202 samples/sec                   batch loss = 1716.5142500400543 | accuracy = 0.6478102189781022


Epoch[2] Batch[690] Speed: 1.252233831896658 samples/sec                   batch loss = 1726.488585472107 | accuracy = 0.6492753623188405


Epoch[2] Batch[695] Speed: 1.2557124375454458 samples/sec                   batch loss = 1736.951224565506 | accuracy = 0.6503597122302158


Epoch[2] Batch[700] Speed: 1.2525876049701488 samples/sec                   batch loss = 1748.3522443771362 | accuracy = 0.6503571428571429


Epoch[2] Batch[705] Speed: 1.2588244695158988 samples/sec                   batch loss = 1761.331030368805 | accuracy = 0.650709219858156


Epoch[2] Batch[710] Speed: 1.2564092559856905 samples/sec                   batch loss = 1772.414065361023 | accuracy = 0.6517605633802817


Epoch[2] Batch[715] Speed: 1.253286014308955 samples/sec                   batch loss = 1784.1796556711197 | accuracy = 0.6524475524475525


Epoch[2] Batch[720] Speed: 1.2528143342158253 samples/sec                   batch loss = 1794.34537088871 | accuracy = 0.6538194444444444


Epoch[2] Batch[725] Speed: 1.2488943627032256 samples/sec                   batch loss = 1809.7665288448334 | accuracy = 0.6537931034482759


Epoch[2] Batch[730] Speed: 1.2498311554006507 samples/sec                   batch loss = 1821.783216714859 | accuracy = 0.6541095890410958


Epoch[2] Batch[735] Speed: 1.2522103725152878 samples/sec                   batch loss = 1832.6046528816223 | accuracy = 0.654421768707483


Epoch[2] Batch[740] Speed: 1.2550252165034625 samples/sec                   batch loss = 1846.8028309345245 | accuracy = 0.6537162162162162


Epoch[2] Batch[745] Speed: 1.2553109662350614 samples/sec                   batch loss = 1860.3522638082504 | accuracy = 0.6540268456375838


Epoch[2] Batch[750] Speed: 1.2536013207588759 samples/sec                   batch loss = 1872.1111596822739 | accuracy = 0.654


Epoch[2] Batch[755] Speed: 1.253496700470371 samples/sec                   batch loss = 1883.3493437767029 | accuracy = 0.6533112582781457


Epoch[2] Batch[760] Speed: 1.2559259147247988 samples/sec                   batch loss = 1894.8300358057022 | accuracy = 0.6536184210526316


Epoch[2] Batch[765] Speed: 1.2540770621442237 samples/sec                   batch loss = 1905.5555863380432 | accuracy = 0.6542483660130719


Epoch[2] Batch[770] Speed: 1.2523277716921732 samples/sec                   batch loss = 1916.892875790596 | accuracy = 0.6548701298701298


Epoch[2] Batch[775] Speed: 1.2522129894545124 samples/sec                   batch loss = 1929.8686275482178 | accuracy = 0.6551612903225806


Epoch[2] Batch[780] Speed: 1.2511725909849616 samples/sec                   batch loss = 1941.7272729873657 | accuracy = 0.6551282051282051


Epoch[2] Batch[785] Speed: 1.2544909705030551 samples/sec                   batch loss = 1955.7327597141266 | accuracy = 0.6538216560509554


[Epoch 2] training: accuracy=0.6541878172588832
[Epoch 2] time cost: 642.7391257286072
[Epoch 2] validation: validation accuracy=0.79
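
The per-batch lines in the log above follow a regular layout, so they can be turned into numbers for plotting or comparing runs. The following is a minimal sketch, not part of the original training script: the regular expression and the helper name `parse_batch_line` are hypothetical and assume the log format stays exactly as printed above.

```python
import re

# Hypothetical pattern matching the per-batch log lines printed above.
BATCH_RE = re.compile(
    r"Epoch\[(\d+)\] Batch\[(\d+)\] Speed: ([\d.]+) samples/sec\s+"
    r"batch loss = ([\d.]+) \| accuracy = ([\d.]+)"
)


def parse_batch_line(line):
    """Return (epoch, batch, speed, loss, accuracy), or None if the line does not match."""
    match = BATCH_RE.search(line)
    if match is None:
        return None
    epoch, batch, speed, loss, accuracy = match.groups()
    return int(epoch), int(batch), float(speed), float(loss), float(accuracy)


# A line copied from the log above.
sample = (
    "Epoch[2] Batch[785] Speed: 1.2544909705030551 samples/sec"
    "                   batch loss = 1955.7327597141266 | accuracy = 0.6538216560509554"
)
print(parse_batch_line(sample))
```

Epoch summary lines such as `[Epoch 2] validation: ...` use a different layout, so this helper returns `None` for them and they would need a separate pattern.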


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).