<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, you may be able to train or perform inference more quickly than with central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. Additionally, you will need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass `npx.gpu()` or, equivalently, `npx.gpu(0)` as the `ctx` argument, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[15:42:39] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, the `ctx` field in the output tells you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. The example code here copies `x` to the second GPU, `npx.gpu(1)`. Copying to `npx.gpu(1)` on a machine with only one GPU would fail, so the code falls back to `npx.cpu()` in that case, as in the output shown here:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[15:42:39] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU IDs

If you have multiple GPUs on your machine, MXNet can access each of them through 0-based indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
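The 0-based numbering described above can be sketched in plain Python. The helper below is hypothetical (`pick_gpu_ids` is not part of MXNet); it only shows how a request for some number of GPUs might be capped at what the machine actually has.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return the 0-indexed IDs of the GPUs to use.

    Hypothetical helper for illustration only; it is not part of MXNet.
    """
    if num_available < 0 or num_requested < 0:
        raise ValueError("GPU counts must be non-negative")
    # IDs run from 0 to num_available - 1, so cap the request at what exists.
    return list(range(min(num_available, num_requested)))


# On a machine with eight GPUs, asking for all of them yields IDs 0..7.
print(pick_gpu_ids(8, 8))  # [0, 1, 2, 3, 4, 5, 6, 7]
# Asking for two GPUs when only one is present yields just the first GPU.
print(pick_gpu_ids(1, 2))  # [0]
# With no GPUs the list is empty, so the caller should fall back to the CPU.
print(pick_gpu_ids(0, 2))  # []
```

With MXNet imported as in this tutorial, one way to turn these IDs into devices would be `[npx.gpu(i) for i in pick_gpu_ids(npx.num_gpus(), 2)] or [npx.cpu()]`, which falls back to the CPU when the list is empty.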

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the previously defined `LeafNetwork` from [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device of parameters that are already loaded.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[15:42:40] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 7.353148 , -7.5006824]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. To elaborate on what data parallelism is, assume there are *n* GPUs. You can then split each data batch into *n* parts and use one GPU per part to run the forward and backward passes on its separate chunk of the data.
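To make the idea concrete, here is a minimal plain-Python sketch of data parallelism on a toy one-parameter model. It is not MXNet code: `split_batch` and `grad_sum` are hypothetical helpers, the "devices" are just list chunks, and the loss is assumed to be a squared error. It shows that summing the gradients computed on each chunk gives the same result as computing the gradient on the whole batch.

```python
import math


def split_batch(batch, num_parts):
    """Split a list of samples into num_parts equal, contiguous chunks."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    step = len(batch) // num_parts
    return [batch[i * step:(i + 1) * step] for i in range(num_parts)]


def grad_sum(w, samples):
    """Sum of d/dw [0.5 * (w * x - y) ** 2] over the (x, y) samples."""
    return sum((w * x - y) * x for x, y in samples)


# A toy batch of (x, y) pairs and a one-parameter model y_hat = w * x.
batch = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.5)]
w, learning_rate = 0.5, 0.01

# Each "device" computes the gradient on its own chunk of the batch.
chunks = split_batch(batch, num_parts=2)
per_device_grads = [grad_sum(w, chunk) for chunk in chunks]

# Summing the per-device gradients recovers the full-batch gradient.
total_grad = sum(per_device_grads)
assert math.isclose(total_grad, grad_sum(w, batch))

# One SGD step, normalizing the summed gradient by the full batch size.
w_new = w - learning_rate * total_grad / len(batch)
print(per_device_grads, total_grad, w_new)
```

Because the per-chunk gradients add up to the full-batch gradient, the update is the same as in single-device training; only the work of computing it is spread out.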

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations to apply to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
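The `test` helper hands per-device lists of labels and outputs to the accuracy metric. As a rough illustration of what pooling accuracy over several chunks means, here is a plain-Python sketch. It assumes the predicted class is the index of the largest score and makes no claim about how `gluon.metric.Accuracy` is implemented; `argmax` and `accuracy_over_devices` are hypothetical names.

```python
def argmax(values):
    """Index of the largest value in a list of scores."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy_over_devices(label_list, output_list):
    """Fraction of correct predictions, pooled over all per-device chunks."""
    correct = total = 0
    for labels, outputs in zip(label_list, output_list):
        for label, scores in zip(labels, outputs):
            correct += int(argmax(scores) == label)
            total += 1
    return correct / total


# Two "devices", each holding two samples with two class scores per sample.
label_list = [[0, 1], [1, 1]]
output_list = [
    [[2.0, -1.0], [0.3, 0.9]],   # predictions: 0, 1 (both correct)
    [[1.5, 0.2], [-0.4, 0.1]],   # predictions: 0, 1 (first one wrong)
]
print(accuracy_over_devices(label_list, output_list))  # 0.75
```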

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use two GPUs for training.
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
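        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            # Note: the timer spans the last `log_interval` batches, so the printed
            # "Speed" is batch_size / (time for log_interval batches), and
            # "batch loss" is the loss accumulated so far in this epoch.
            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()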

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.769035209917965 samples/sec                   batch loss = 14.605167150497437 | accuracy = 0.45


Epoch[1] Batch[10] Speed: 1.2483867801731319 samples/sec                   batch loss = 28.533894538879395 | accuracy = 0.525


Epoch[1] Batch[15] Speed: 1.2483878948784164 samples/sec                   batch loss = 42.35579991340637 | accuracy = 0.55


Epoch[1] Batch[20] Speed: 1.244649345315603 samples/sec                   batch loss = 53.49289929866791 | accuracy = 0.625


Epoch[1] Batch[25] Speed: 1.243005065530499 samples/sec                   batch loss = 66.50286376476288 | accuracy = 0.62


Epoch[1] Batch[30] Speed: 1.24397537587748 samples/sec                   batch loss = 80.60588681697845 | accuracy = 0.6166666666666667


Epoch[1] Batch[35] Speed: 1.2460130805551901 samples/sec                   batch loss = 94.43511879444122 | accuracy = 0.6


Epoch[1] Batch[40] Speed: 1.2437617923995026 samples/sec                   batch loss = 108.01333725452423 | accuracy = 0.6


Epoch[1] Batch[45] Speed: 1.2430324177080458 samples/sec                   batch loss = 123.00306308269501 | accuracy = 0.5666666666666667


Epoch[1] Batch[50] Speed: 1.245562300624993 samples/sec                   batch loss = 136.89798367023468 | accuracy = 0.565


Epoch[1] Batch[55] Speed: 1.253473662001148 samples/sec                   batch loss = 150.35558354854584 | accuracy = 0.5636363636363636


Epoch[1] Batch[60] Speed: 1.2480532011813987 samples/sec                   batch loss = 164.73720514774323 | accuracy = 0.5625


Epoch[1] Batch[65] Speed: 1.2493054146420106 samples/sec                   batch loss = 179.6788798570633 | accuracy = 0.5423076923076923


Epoch[1] Batch[70] Speed: 1.2520931822362826 samples/sec                   batch loss = 192.96064484119415 | accuracy = 0.5464285714285714


Epoch[1] Batch[75] Speed: 1.250088556423742 samples/sec                   batch loss = 207.56511056423187 | accuracy = 0.5366666666666666


Epoch[1] Batch[80] Speed: 1.2527034847885137 samples/sec                   batch loss = 221.7867385149002 | accuracy = 0.53125


Epoch[1] Batch[85] Speed: 1.246835370131881 samples/sec                   batch loss = 236.11756813526154 | accuracy = 0.5294117647058824


Epoch[1] Batch[90] Speed: 1.2509744381573833 samples/sec                   batch loss = 250.03605377674103 | accuracy = 0.5277777777777778


Epoch[1] Batch[95] Speed: 1.2514173833115119 samples/sec                   batch loss = 263.0023316144943 | accuracy = 0.5368421052631579


Epoch[1] Batch[100] Speed: 1.2527160186767794 samples/sec                   batch loss = 277.057066321373 | accuracy = 0.5325


Epoch[1] Batch[105] Speed: 1.25057310052415 samples/sec                   batch loss = 291.08173620700836 | accuracy = 0.5333333333333333


Epoch[1] Batch[110] Speed: 1.2442886905426302 samples/sec                   batch loss = 305.07352244853973 | accuracy = 0.5318181818181819


Epoch[1] Batch[115] Speed: 1.247856406355187 samples/sec                   batch loss = 319.53122103214264 | accuracy = 0.5304347826086957


Epoch[1] Batch[120] Speed: 1.2443125000509896 samples/sec                   batch loss = 333.7999573945999 | accuracy = 0.5208333333333334


Epoch[1] Batch[125] Speed: 1.2499037831583053 samples/sec                   batch loss = 347.5207351446152 | accuracy = 0.522


Epoch[1] Batch[130] Speed: 1.2438778893450038 samples/sec                   batch loss = 361.1700986623764 | accuracy = 0.525


Epoch[1] Batch[135] Speed: 1.2449991224935875 samples/sec                   batch loss = 374.9000235795975 | accuracy = 0.5259259259259259


Epoch[1] Batch[140] Speed: 1.2522334580350825 samples/sec                   batch loss = 389.32921731472015 | accuracy = 0.5232142857142857


Epoch[1] Batch[145] Speed: 1.245657739208943 samples/sec                   batch loss = 402.54891550540924 | accuracy = 0.5258620689655172


Epoch[1] Batch[150] Speed: 1.245170711693811 samples/sec                   batch loss = 416.47353279590607 | accuracy = 0.5266666666666666


Epoch[1] Batch[155] Speed: 1.2490103970196853 samples/sec                   batch loss = 430.2901281118393 | accuracy = 0.5258064516129032


Epoch[1] Batch[160] Speed: 1.2441604301860862 samples/sec                   batch loss = 444.6221467256546 | accuracy = 0.521875


Epoch[1] Batch[165] Speed: 1.2406536031327635 samples/sec                   batch loss = 457.7718480825424 | accuracy = 0.5257575757575758


Epoch[1] Batch[170] Speed: 1.2466397002039908 samples/sec                   batch loss = 471.66570222377777 | accuracy = 0.525


Epoch[1] Batch[175] Speed: 1.2420265211482702 samples/sec                   batch loss = 485.63091218471527 | accuracy = 0.5257142857142857


Epoch[1] Batch[180] Speed: 1.2524683805636312 samples/sec                   batch loss = 498.7671653032303 | accuracy = 0.5305555555555556


Epoch[1] Batch[185] Speed: 1.2430264314383468 samples/sec                   batch loss = 512.31749355793 | accuracy = 0.5324324324324324


Epoch[1] Batch[190] Speed: 1.2485721272754686 samples/sec                   batch loss = 526.0139709711075 | accuracy = 0.5328947368421053


Epoch[1] Batch[195] Speed: 1.2436435968078867 samples/sec                   batch loss = 539.6321145296097 | accuracy = 0.5346153846153846


Epoch[1] Batch[200] Speed: 1.2437701831121655 samples/sec                   batch loss = 553.7512639760971 | accuracy = 0.535


Epoch[1] Batch[205] Speed: 1.2428569058010008 samples/sec                   batch loss = 567.3070198297501 | accuracy = 0.5390243902439025


Epoch[1] Batch[210] Speed: 1.2409417485314997 samples/sec                   batch loss = 581.0881389379501 | accuracy = 0.5369047619047619


Epoch[1] Batch[215] Speed: 1.2448510412480611 samples/sec                   batch loss = 594.0872999429703 | accuracy = 0.541860465116279


Epoch[1] Batch[220] Speed: 1.2457189681715737 samples/sec                   batch loss = 608.345423579216 | accuracy = 0.5397727272727273


Epoch[1] Batch[225] Speed: 1.2489206730819336 samples/sec                   batch loss = 623.2884942293167 | accuracy = 0.5344444444444445


Epoch[1] Batch[230] Speed: 1.2491776061681195 samples/sec                   batch loss = 637.4896544218063 | accuracy = 0.532608695652174


Epoch[1] Batch[235] Speed: 1.239931167227512 samples/sec                   batch loss = 651.845524430275 | accuracy = 0.5287234042553192


Epoch[1] Batch[240] Speed: 1.2518209443803052 samples/sec                   batch loss = 665.6034508943558 | accuracy = 0.5291666666666667


Epoch[1] Batch[245] Speed: 1.2416632493058521 samples/sec                   batch loss = 679.0281518697739 | accuracy = 0.5316326530612245


Epoch[1] Batch[250] Speed: 1.2421181999771969 samples/sec                   batch loss = 692.9490948915482 | accuracy = 0.531


Epoch[1] Batch[255] Speed: 1.243240960071708 samples/sec                   batch loss = 706.8215569257736 | accuracy = 0.5294117647058824


Epoch[1] Batch[260] Speed: 1.248173536517329 samples/sec                   batch loss = 720.5873483419418 | accuracy = 0.5288461538461539


Epoch[1] Batch[265] Speed: 1.2429491676859603 samples/sec                   batch loss = 733.7545152902603 | accuracy = 0.5349056603773585


Epoch[1] Batch[270] Speed: 1.2481311008066587 samples/sec                   batch loss = 747.9095004796982 | accuracy = 0.5351851851851852


Epoch[1] Batch[275] Speed: 1.248586715824593 samples/sec                   batch loss = 761.382640004158 | accuracy = 0.5363636363636364


Epoch[1] Batch[280] Speed: 1.2462739103489173 samples/sec                   batch loss = 774.841172337532 | accuracy = 0.5375


Epoch[1] Batch[285] Speed: 1.2448939931911251 samples/sec                   batch loss = 788.1106387376785 | accuracy = 0.5412280701754386


Epoch[1] Batch[290] Speed: 1.244044096001566 samples/sec                   batch loss = 801.6596764326096 | accuracy = 0.5431034482758621


Epoch[1] Batch[295] Speed: 1.249811882528284 samples/sec                   batch loss = 815.4006534814835 | accuracy = 0.5432203389830509


Epoch[1] Batch[300] Speed: 1.2439718708934424 samples/sec                   batch loss = 828.9918657541275 | accuracy = 0.5441666666666667


Epoch[1] Batch[305] Speed: 1.245945438139832 samples/sec                   batch loss = 842.7406760454178 | accuracy = 0.5434426229508197


Epoch[1] Batch[310] Speed: 1.245515141607688 samples/sec                   batch loss = 856.3178008794785 | accuracy = 0.5443548387096774


Epoch[1] Batch[315] Speed: 1.2482343629726835 samples/sec                   batch loss = 869.4650872945786 | accuracy = 0.5444444444444444


Epoch[1] Batch[320] Speed: 1.246240490639686 samples/sec                   batch loss = 883.0598462820053 | accuracy = 0.5453125


Epoch[1] Batch[325] Speed: 1.246279465044937 samples/sec                   batch loss = 896.4992765188217 | accuracy = 0.546923076923077


Epoch[1] Batch[330] Speed: 1.2471343671300141 samples/sec                   batch loss = 910.2516537904739 | accuracy = 0.5462121212121213


Epoch[1] Batch[335] Speed: 1.244399070981309 samples/sec                   batch loss = 923.7485994100571 | accuracy = 0.5462686567164179


Epoch[1] Batch[340] Speed: 1.2463102945078663 samples/sec                   batch loss = 937.5597537755966 | accuracy = 0.5470588235294118


Epoch[1] Batch[345] Speed: 1.2448363551346187 samples/sec                   batch loss = 950.8844839334488 | accuracy = 0.5478260869565217


Epoch[1] Batch[350] Speed: 1.242662574532315 samples/sec                   batch loss = 964.8031870126724 | accuracy = 0.5478571428571428


Epoch[1] Batch[355] Speed: 1.2473405787743388 samples/sec                   batch loss = 977.4973491430283 | accuracy = 0.5507042253521127


Epoch[1] Batch[360] Speed: 1.2472863303430666 samples/sec                   batch loss = 991.8706218004227 | accuracy = 0.5472222222222223


Epoch[1] Batch[365] Speed: 1.246549946058079 samples/sec                   batch loss = 1005.1894167661667 | accuracy = 0.5493150684931507


Epoch[1] Batch[370] Speed: 1.2454727940041097 samples/sec                   batch loss = 1018.1397293806076 | accuracy = 0.5506756756756757


Epoch[1] Batch[375] Speed: 1.246643776034416 samples/sec                   batch loss = 1032.4519375562668 | accuracy = 0.548


Epoch[1] Batch[380] Speed: 1.2475784002245716 samples/sec                   batch loss = 1045.9444972276688 | accuracy = 0.5473684210526316


Epoch[1] Batch[385] Speed: 1.245387275419154 samples/sec                   batch loss = 1059.9821039438248 | accuracy = 0.5454545454545454


Epoch[1] Batch[390] Speed: 1.2467372495503792 samples/sec                   batch loss = 1073.0805586576462 | accuracy = 0.5467948717948717


Epoch[1] Batch[395] Speed: 1.2485535436301967 samples/sec                   batch loss = 1086.052169919014 | accuracy = 0.549367088607595


Epoch[1] Batch[400] Speed: 1.2418634269822648 samples/sec                   batch loss = 1100.6137953996658 | accuracy = 0.549375


Epoch[1] Batch[405] Speed: 1.2507614285856998 samples/sec                   batch loss = 1113.2063347101212 | accuracy = 0.5512345679012346


Epoch[1] Batch[410] Speed: 1.2463121461758164 samples/sec                   batch loss = 1126.5939930677414 | accuracy = 0.550609756097561


Epoch[1] Batch[415] Speed: 1.24626178276801 samples/sec                   batch loss = 1139.3921612501144 | accuracy = 0.5524096385542169


Epoch[1] Batch[420] Speed: 1.2445245186792737 samples/sec                   batch loss = 1152.793826699257 | accuracy = 0.5535714285714286


Epoch[1] Batch[425] Speed: 1.2456622710557983 samples/sec                   batch loss = 1166.986398100853 | accuracy = 0.5529411764705883


Epoch[1] Batch[430] Speed: 1.2412472920893163 samples/sec                   batch loss = 1180.3591984510422 | accuracy = 0.5517441860465117


Epoch[1] Batch[435] Speed: 1.2422864203041353 samples/sec                   batch loss = 1193.8929289579391 | accuracy = 0.5528735632183908


Epoch[1] Batch[440] Speed: 1.2377520964308386 samples/sec                   batch loss = 1207.5361109972 | accuracy = 0.5528409090909091


Epoch[1] Batch[445] Speed: 1.2413752280425103 samples/sec                   batch loss = 1220.994537472725 | accuracy = 0.553932584269663


Epoch[1] Batch[450] Speed: 1.2388396216230666 samples/sec                   batch loss = 1235.5390578508377 | accuracy = 0.5527777777777778


Epoch[1] Batch[455] Speed: 1.2412422413185926 samples/sec                   batch loss = 1248.6091777086258 | accuracy = 0.5538461538461539


Epoch[1] Batch[460] Speed: 1.23991989586492 samples/sec                   batch loss = 1262.9303642511368 | accuracy = 0.5543478260869565


Epoch[1] Batch[465] Speed: 1.2470431513594469 samples/sec                   batch loss = 1276.3229757547379 | accuracy = 0.5559139784946237


Epoch[1] Batch[470] Speed: 1.2463788098866275 samples/sec                   batch loss = 1289.3710423707962 | accuracy = 0.5569148936170213


Epoch[1] Batch[475] Speed: 1.248482837916688 samples/sec                   batch loss = 1303.2477847337723 | accuracy = 0.5563157894736842


Epoch[1] Batch[480] Speed: 1.2482694685715732 samples/sec                   batch loss = 1316.3791435956955 | accuracy = 0.5572916666666666


Epoch[1] Batch[485] Speed: 1.2478259643974408 samples/sec                   batch loss = 1329.178523182869 | accuracy = 0.5592783505154639


Epoch[1] Batch[490] Speed: 1.2460642567835245 samples/sec                   batch loss = 1342.7576330900192 | accuracy = 0.5581632653061225


Epoch[1] Batch[495] Speed: 1.246347884445122 samples/sec                   batch loss = 1355.5039662122726 | accuracy = 0.5595959595959596


Epoch[1] Batch[500] Speed: 1.2436104101978744 samples/sec                   batch loss = 1368.133583664894 | accuracy = 0.5605


Epoch[1] Batch[505] Speed: 1.243489294100443 samples/sec                   batch loss = 1381.1253706216812 | accuracy = 0.560891089108911


Epoch[1] Batch[510] Speed: 1.249846238932779 samples/sec                   batch loss = 1393.6721040010452 | accuracy = 0.5632352941176471


Epoch[1] Batch[515] Speed: 1.246248822255428 samples/sec                   batch loss = 1408.9751611948013 | accuracy = 0.5616504854368932


Epoch[1] Batch[520] Speed: 1.2435349172578514 samples/sec                   batch loss = 1422.4067043066025 | accuracy = 0.5610576923076923


Epoch[1] Batch[525] Speed: 1.2463839951408218 samples/sec                   batch loss = 1436.4006139039993 | accuracy = 0.560952380952381


Epoch[1] Batch[530] Speed: 1.2497845104847727 samples/sec                   batch loss = 1449.1561304330826 | accuracy = 0.5608490566037736


Epoch[1] Batch[535] Speed: 1.250261085612383 samples/sec                   batch loss = 1460.9696477651596 | accuracy = 0.5616822429906542


Epoch[1] Batch[540] Speed: 1.2500892084422592 samples/sec                   batch loss = 1473.9495517015457 | accuracy = 0.562962962962963


Epoch[1] Batch[545] Speed: 1.255308242406276 samples/sec                   batch loss = 1486.1114073991776 | accuracy = 0.5646788990825689


Epoch[1] Batch[550] Speed: 1.2531933348332318 samples/sec                   batch loss = 1498.5124071836472 | accuracy = 0.5659090909090909


Epoch[1] Batch[555] Speed: 1.2487192363477193 samples/sec                   batch loss = 1512.6525398492813 | accuracy = 0.5653153153153153


Epoch[1] Batch[560] Speed: 1.2580485538851693 samples/sec                   batch loss = 1525.498724102974 | accuracy = 0.5669642857142857


Epoch[1] Batch[565] Speed: 1.2489564681332603 samples/sec                   batch loss = 1540.1734923124313 | accuracy = 0.565929203539823


Epoch[1] Batch[570] Speed: 1.247884807813561 samples/sec                   batch loss = 1555.8985673189163 | accuracy = 0.5640350877192982


Epoch[1] Batch[575] Speed: 1.2483101489574882 samples/sec                   batch loss = 1569.9010511636734 | accuracy = 0.5643478260869565


Epoch[1] Batch[580] Speed: 1.2531259401871426 samples/sec                   batch loss = 1583.3774114847183 | accuracy = 0.5633620689655172


Epoch[1] Batch[585] Speed: 1.2452032422813535 samples/sec                   batch loss = 1597.427216887474 | accuracy = 0.5628205128205128


Epoch[1] Batch[590] Speed: 1.2482787560973432 samples/sec                   batch loss = 1610.8996406793594 | accuracy = 0.563135593220339


Epoch[1] Batch[595] Speed: 1.2466705474921342 samples/sec                   batch loss = 1624.0520795583725 | accuracy = 0.5642857142857143


Epoch[1] Batch[600] Speed: 1.2484349000963568 samples/sec                   batch loss = 1636.5019670724869 | accuracy = 0.5645833333333333


Epoch[1] Batch[605] Speed: 1.2443760888243307 samples/sec                   batch loss = 1650.2507506608963 | accuracy = 0.5644628099173554


Epoch[1] Batch[610] Speed: 1.248575472390368 samples/sec                   batch loss = 1664.5845786333084 | accuracy = 0.5647540983606557


Epoch[1] Batch[615] Speed: 1.2440392991826685 samples/sec                   batch loss = 1677.465373158455 | accuracy = 0.5650406504065041


Epoch[1] Batch[620] Speed: 1.240039859336337 samples/sec                   batch loss = 1690.9474774599075 | accuracy = 0.5645161290322581


Epoch[1] Batch[625] Speed: 1.2417534039903166 samples/sec                   batch loss = 1703.684971690178 | accuracy = 0.5652


Epoch[1] Batch[630] Speed: 1.2371669455547365 samples/sec                   batch loss = 1717.3328009843826 | accuracy = 0.5650793650793651


Epoch[1] Batch[635] Speed: 1.2456471958561028 samples/sec                   batch loss = 1730.316544175148 | accuracy = 0.5653543307086614


Epoch[1] Batch[640] Speed: 1.2440673426479363 samples/sec                   batch loss = 1742.8926566839218 | accuracy = 0.567578125


Epoch[1] Batch[645] Speed: 1.243940234694401 samples/sec                   batch loss = 1756.0240787267685 | accuracy = 0.5670542635658915


Epoch[1] Batch[650] Speed: 1.2504187511058502 samples/sec                   batch loss = 1768.8993810415268 | accuracy = 0.5673076923076923


Epoch[1] Batch[655] Speed: 1.251491595861366 samples/sec                   batch loss = 1781.451737523079 | accuracy = 0.5679389312977099


Epoch[1] Batch[660] Speed: 1.2530549028456162 samples/sec                   batch loss = 1795.2304347753525 | accuracy = 0.5681818181818182


Epoch[1] Batch[665] Speed: 1.2479035571971244 samples/sec                   batch loss = 1808.8017731904984 | accuracy = 0.5684210526315789


Epoch[1] Batch[670] Speed: 1.2448442061584455 samples/sec                   batch loss = 1821.1893404722214 | accuracy = 0.5705223880597015


Epoch[1] Batch[675] Speed: 1.2504689850491986 samples/sec                   batch loss = 1834.1652156114578 | accuracy = 0.5711111111111111


Epoch[1] Batch[680] Speed: 1.2543731654178174 samples/sec                   batch loss = 1848.7803384065628 | accuracy = 0.5702205882352941


Epoch[1] Batch[685] Speed: 1.2448866033984218 samples/sec                   batch loss = 1861.9133669137955 | accuracy = 0.5711678832116789


Epoch[1] Batch[690] Speed: 1.2504558436834794 samples/sec                   batch loss = 1873.9744902849197 | accuracy = 0.5731884057971014


Epoch[1] Batch[695] Speed: 1.2504180055489613 samples/sec                   batch loss = 1885.6417798995972 | accuracy = 0.5748201438848921


Epoch[1] Batch[700] Speed: 1.2539732996891828 samples/sec                   batch loss = 1897.773915529251 | accuracy = 0.5767857142857142


Epoch[1] Batch[705] Speed: 1.2517220375746048 samples/sec                   batch loss = 1910.8747918605804 | accuracy = 0.577304964539007


Epoch[1] Batch[710] Speed: 1.249387099255702 samples/sec                   batch loss = 1926.7372255325317 | accuracy = 0.576056338028169


Epoch[1] Batch[715] Speed: 1.2518117908751356 samples/sec                   batch loss = 1939.151603460312 | accuracy = 0.5772727272727273


Epoch[1] Batch[720] Speed: 1.2500178630222476 samples/sec                   batch loss = 1951.9559073448181 | accuracy = 0.5777777777777777


Epoch[1] Batch[725] Speed: 1.2396574135539058 samples/sec                   batch loss = 1963.7743636369705 | accuracy = 0.5793103448275863


Epoch[1] Batch[730] Speed: 1.245691960092633 samples/sec                   batch loss = 1976.664560675621 | accuracy = 0.5794520547945206


Epoch[1] Batch[735] Speed: 1.2407893084491362 samples/sec                   batch loss = 1989.846186041832 | accuracy = 0.5789115646258504


Epoch[1] Batch[740] Speed: 1.2433217614078693 samples/sec                   batch loss = 2002.2604788541794 | accuracy = 0.5793918918918919


Epoch[1] Batch[745] Speed: 1.247442040642279 samples/sec                   batch loss = 2016.9252160787582 | accuracy = 0.5798657718120805


Epoch[1] Batch[750] Speed: 1.242562716873543 samples/sec                   batch loss = 2029.8424336910248 | accuracy = 0.58


Epoch[1] Batch[755] Speed: 1.2412168044253837 samples/sec                   batch loss = 2043.2421424388885 | accuracy = 0.5801324503311258


Epoch[1] Batch[760] Speed: 1.2416406437620702 samples/sec                   batch loss = 2056.6199073791504 | accuracy = 0.5796052631578947


Epoch[1] Batch[765] Speed: 1.2485162850878828 samples/sec                   batch loss = 2067.311002969742 | accuracy = 0.5816993464052288


Epoch[1] Batch[770] Speed: 1.2430288259393074 samples/sec                   batch loss = 2079.851457953453 | accuracy = 0.5818181818181818


Epoch[1] Batch[775] Speed: 1.2505455087009327 samples/sec                   batch loss = 2092.1976433992386 | accuracy = 0.5829032258064516


Epoch[1] Batch[780] Speed: 1.2488796739987373 samples/sec                   batch loss = 2104.5822664499283 | accuracy = 0.5833333333333334


Epoch[1] Batch[785] Speed: 1.2456785489817412 samples/sec                   batch loss = 2119.2538598775864 | accuracy = 0.5834394904458599


[Epoch 1] training: accuracy=0.5840736040609137
[Epoch 1] time cost: 650.196658372879
[Epoch 1] validation: validation accuracy=0.6466666666666666


Epoch[2] Batch[5] Speed: 1.2460649971576387 samples/sec                   batch loss = 12.504848003387451 | accuracy = 0.65


Epoch[2] Batch[10] Speed: 1.2508906805556963 samples/sec                   batch loss = 25.993945837020874 | accuracy = 0.6


Epoch[2] Batch[15] Speed: 1.248296217019943 samples/sec                   batch loss = 36.388587951660156 | accuracy = 0.6666666666666666


Epoch[2] Batch[20] Speed: 1.2498535015040215 samples/sec                   batch loss = 48.800145387649536 | accuracy = 0.6625


Epoch[2] Batch[25] Speed: 1.25045724168798 samples/sec                   batch loss = 60.664527893066406 | accuracy = 0.69


Epoch[2] Batch[30] Speed: 1.2476665396235185 samples/sec                   batch loss = 70.78456056118011 | accuracy = 0.7083333333333334


Epoch[2] Batch[35] Speed: 1.2505342299616153 samples/sec                   batch loss = 81.42616605758667 | accuracy = 0.7142857142857143


Epoch[2] Batch[40] Speed: 1.251651719579611 samples/sec                   batch loss = 91.67417454719543 | accuracy = 0.74375


Epoch[2] Batch[45] Speed: 1.2511364821913527 samples/sec                   batch loss = 103.68552565574646 | accuracy = 0.7388888888888889


Epoch[2] Batch[50] Speed: 1.2487856001101614 samples/sec                   batch loss = 120.06126117706299 | accuracy = 0.705


Epoch[2] Batch[55] Speed: 1.2492060678005552 samples/sec                   batch loss = 131.05219161510468 | accuracy = 0.7181818181818181


Epoch[2] Batch[60] Speed: 1.2478439695222665 samples/sec                   batch loss = 144.0154446363449 | accuracy = 0.7041666666666667


Epoch[2] Batch[65] Speed: 1.2514044086878884 samples/sec                   batch loss = 158.43694603443146 | accuracy = 0.6961538461538461


Epoch[2] Batch[70] Speed: 1.25067797920414 samples/sec                   batch loss = 169.18106865882874 | accuracy = 0.7035714285714286


Epoch[2] Batch[75] Speed: 1.2491234768383535 samples/sec                   batch loss = 180.26958620548248 | accuracy = 0.7066666666666667


Epoch[2] Batch[80] Speed: 1.2508770640100528 samples/sec                   batch loss = 192.37218487262726 | accuracy = 0.703125


Epoch[2] Batch[85] Speed: 1.2517839575695693 samples/sec                   batch loss = 204.31811702251434 | accuracy = 0.7058823529411765


Epoch[2] Batch[90] Speed: 1.2490209043965237 samples/sec                   batch loss = 215.24583327770233 | accuracy = 0.7055555555555556


Epoch[2] Batch[95] Speed: 1.2489259724796151 samples/sec                   batch loss = 228.1008950471878 | accuracy = 0.7


Epoch[2] Batch[100] Speed: 1.2495744187695474 samples/sec                   batch loss = 240.10155379772186 | accuracy = 0.7


Epoch[2] Batch[105] Speed: 1.2485445307614504 samples/sec                   batch loss = 252.2280696630478 | accuracy = 0.7


Epoch[2] Batch[110] Speed: 1.2511157695605222 samples/sec                   batch loss = 263.8103367090225 | accuracy = 0.6954545454545454


Epoch[2] Batch[115] Speed: 1.2470809709132906 samples/sec                   batch loss = 275.9335604906082 | accuracy = 0.6891304347826087


Epoch[2] Batch[120] Speed: 1.2464883575533718 samples/sec                   batch loss = 289.17554664611816 | accuracy = 0.6854166666666667


Epoch[2] Batch[125] Speed: 1.252668597002355 samples/sec                   batch loss = 304.41553688049316 | accuracy = 0.674


Epoch[2] Batch[130] Speed: 1.2492209502162412 samples/sec                   batch loss = 318.13749957084656 | accuracy = 0.6692307692307692


Epoch[2] Batch[135] Speed: 1.2554563794951001 samples/sec                   batch loss = 328.10713827610016 | accuracy = 0.6777777777777778


Epoch[2] Batch[140] Speed: 1.2494807985194398 samples/sec                   batch loss = 341.25061571598053 | accuracy = 0.6767857142857143


Epoch[2] Batch[145] Speed: 1.2449742704672337 samples/sec                   batch loss = 353.59905529022217 | accuracy = 0.6741379310344827


Epoch[2] Batch[150] Speed: 1.2458246994171944 samples/sec                   batch loss = 367.95717346668243 | accuracy = 0.6716666666666666


Epoch[2] Batch[155] Speed: 1.2496190003400154 samples/sec                   batch loss = 381.30612576007843 | accuracy = 0.6693548387096774


Epoch[2] Batch[160] Speed: 1.2440817338644077 samples/sec                   batch loss = 393.2901872396469 | accuracy = 0.6671875


Epoch[2] Batch[165] Speed: 1.2429092042697978 samples/sec                   batch loss = 407.0858141183853 | accuracy = 0.6651515151515152


Epoch[2] Batch[170] Speed: 1.2463177937970469 samples/sec                   batch loss = 419.7250329256058 | accuracy = 0.6647058823529411


Epoch[2] Batch[175] Speed: 1.2421257408643815 samples/sec                   batch loss = 435.2923482656479 | accuracy = 0.6614285714285715


Epoch[2] Batch[180] Speed: 1.2412503225714806 samples/sec                   batch loss = 446.4987692832947 | accuracy = 0.6625


Epoch[2] Batch[185] Speed: 1.2452757028810575 samples/sec                   batch loss = 458.04965233802795 | accuracy = 0.6635135135135135


Epoch[2] Batch[190] Speed: 1.2419311784892029 samples/sec                   batch loss = 472.09673368930817 | accuracy = 0.6605263157894737


Epoch[2] Batch[195] Speed: 1.2494621878153498 samples/sec                   batch loss = 483.58710539340973 | accuracy = 0.6602564102564102


Epoch[2] Batch[200] Speed: 1.2463985326013676 samples/sec                   batch loss = 497.81771528720856 | accuracy = 0.66


Epoch[2] Batch[205] Speed: 1.250408779356037 samples/sec                   batch loss = 511.91325199604034 | accuracy = 0.6585365853658537


Epoch[2] Batch[210] Speed: 1.2484244025766755 samples/sec                   batch loss = 522.2743189334869 | accuracy = 0.6619047619047619


Epoch[2] Batch[215] Speed: 1.2432796549541356 samples/sec                   batch loss = 534.7540826797485 | accuracy = 0.6627906976744186


Epoch[2] Batch[220] Speed: 1.2419051617716652 samples/sec                   batch loss = 545.7849543094635 | accuracy = 0.6659090909090909


Epoch[2] Batch[225] Speed: 1.2439614482944445 samples/sec                   batch loss = 559.513913154602 | accuracy = 0.6633333333333333


Epoch[2] Batch[230] Speed: 1.2406556215203837 samples/sec                   batch loss = 573.5910818576813 | accuracy = 0.6597826086956522


Epoch[2] Batch[235] Speed: 1.242583147272907 samples/sec                   batch loss = 584.4058892726898 | accuracy = 0.6627659574468086


Epoch[2] Batch[240] Speed: 1.2425929945972305 samples/sec                   batch loss = 594.909192442894 | accuracy = 0.6666666666666666


Epoch[2] Batch[245] Speed: 1.2445101172169977 samples/sec                   batch loss = 605.6433751583099 | accuracy = 0.6673469387755102


Epoch[2] Batch[250] Speed: 1.2470970078382424 samples/sec                   batch loss = 617.3210750818253 | accuracy = 0.666


Epoch[2] Batch[255] Speed: 1.2451700647959545 samples/sec                   batch loss = 630.8841632604599 | accuracy = 0.6647058823529411


Epoch[2] Batch[260] Speed: 1.2500712316094291 samples/sec                   batch loss = 644.6609354019165 | accuracy = 0.6625


Epoch[2] Batch[265] Speed: 1.24867471895319 samples/sec                   batch loss = 657.3529279232025 | accuracy = 0.659433962264151


Epoch[2] Batch[270] Speed: 1.2484096320293572 samples/sec                   batch loss = 668.6139608621597 | accuracy = 0.6601851851851852


Epoch[2] Batch[275] Speed: 1.2427144884481336 samples/sec                   batch loss = 682.6865713596344 | accuracy = 0.6554545454545454


Epoch[2] Batch[280] Speed: 1.2395867043421698 samples/sec                   batch loss = 695.0553996562958 | accuracy = 0.6544642857142857


Epoch[2] Batch[285] Speed: 1.2513217134525831 samples/sec                   batch loss = 706.3229175806046 | accuracy = 0.6543859649122807


Epoch[2] Batch[290] Speed: 1.247040463292201 samples/sec                   batch loss = 717.7086557149887 | accuracy = 0.656896551724138


Epoch[2] Batch[295] Speed: 1.2487998218341319 samples/sec                   batch loss = 729.9852007627487 | accuracy = 0.6576271186440678


Epoch[2] Batch[300] Speed: 1.2524779176809278 samples/sec                   batch loss = 746.477138876915 | accuracy = 0.6541666666666667


Epoch[2] Batch[305] Speed: 1.2533710293557128 samples/sec                   batch loss = 757.474905371666 | accuracy = 0.6549180327868852


Epoch[2] Batch[310] Speed: 1.2463211268434295 samples/sec                   batch loss = 769.3450258970261 | accuracy = 0.6556451612903226


Epoch[2] Batch[315] Speed: 1.2485427653694106 samples/sec                   batch loss = 780.1416105031967 | accuracy = 0.6579365079365079


Epoch[2] Batch[320] Speed: 1.2477584962059765 samples/sec                   batch loss = 793.7219142913818 | accuracy = 0.65546875


Epoch[2] Batch[325] Speed: 1.2479362307366224 samples/sec                   batch loss = 806.8425359725952 | accuracy = 0.6538461538461539


Epoch[2] Batch[330] Speed: 1.2465834749904112 samples/sec                   batch loss = 818.2089574337006 | accuracy = 0.6553030303030303


Epoch[2] Batch[335] Speed: 1.2488976165769168 samples/sec                   batch loss = 829.0124342441559 | accuracy = 0.6559701492537313


Epoch[2] Batch[340] Speed: 1.2458050873667397 samples/sec                   batch loss = 841.9288671016693 | accuracy = 0.6558823529411765


Epoch[2] Batch[345] Speed: 1.2438302122817255 samples/sec                   batch loss = 851.3352749347687 | accuracy = 0.6579710144927536


Epoch[2] Batch[350] Speed: 1.246359643339357 samples/sec                   batch loss = 863.1208267211914 | accuracy = 0.6592857142857143


Epoch[2] Batch[355] Speed: 1.2398001388392748 samples/sec                   batch loss = 875.4832743406296 | accuracy = 0.6584507042253521


Epoch[2] Batch[360] Speed: 1.2510252764698948 samples/sec                   batch loss = 887.3252217769623 | accuracy = 0.6597222222222222


Epoch[2] Batch[365] Speed: 1.2421201311712908 samples/sec                   batch loss = 898.6123824119568 | accuracy = 0.6595890410958904


Epoch[2] Batch[370] Speed: 1.2473075654917256 samples/sec                   batch loss = 909.8594306707382 | accuracy = 0.6594594594594595


Epoch[2] Batch[375] Speed: 1.2398380700839062 samples/sec                   batch loss = 920.6712280511856 | accuracy = 0.6613333333333333


Epoch[2] Batch[380] Speed: 1.2437476851927358 samples/sec                   batch loss = 933.5721037387848 | accuracy = 0.6625


Epoch[2] Batch[385] Speed: 1.241453490454317 samples/sec                   batch loss = 945.3045570850372 | accuracy = 0.6623376623376623


Epoch[2] Batch[390] Speed: 1.2392186338501392 samples/sec                   batch loss = 959.966596364975 | accuracy = 0.6602564102564102


Epoch[2] Batch[395] Speed: 1.2401108035861854 samples/sec                   batch loss = 972.2669285535812 | accuracy = 0.6582278481012658


Epoch[2] Batch[400] Speed: 1.2389372349052208 samples/sec                   batch loss = 982.4512870311737 | accuracy = 0.659375


Epoch[2] Batch[405] Speed: 1.2418777672543617 samples/sec                   batch loss = 994.1912050247192 | accuracy = 0.6598765432098765


Epoch[2] Batch[410] Speed: 1.2497327489453538 samples/sec                   batch loss = 1005.995127081871 | accuracy = 0.6591463414634147


Epoch[2] Batch[415] Speed: 1.2454942448320487 samples/sec                   batch loss = 1017.9995757341385 | accuracy = 0.6596385542168675


Epoch[2] Batch[420] Speed: 1.2414966676044803 samples/sec                   batch loss = 1031.5153585672379 | accuracy = 0.6601190476190476


Epoch[2] Batch[425] Speed: 1.248412604690033 samples/sec                   batch loss = 1042.940128326416 | accuracy = 0.6617647058823529


Epoch[2] Batch[430] Speed: 1.248029619660487 samples/sec                   batch loss = 1053.9896372556686 | accuracy = 0.6622093023255814


Epoch[2] Batch[435] Speed: 1.2426648755891971 samples/sec                   batch loss = 1065.8543075323105 | accuracy = 0.6620689655172414


Epoch[2] Batch[440] Speed: 1.2469472223394886 samples/sec                   batch loss = 1077.9722512960434 | accuracy = 0.6619318181818182


Epoch[2] Batch[445] Speed: 1.2425547105489145 samples/sec                   batch loss = 1087.4107874631882 | accuracy = 0.6646067415730337


Epoch[2] Batch[450] Speed: 1.2425325326481829 samples/sec                   batch loss = 1100.4315272569656 | accuracy = 0.6655555555555556


Epoch[2] Batch[455] Speed: 1.2409136622339665 samples/sec                   batch loss = 1112.677178144455 | accuracy = 0.6659340659340659


Epoch[2] Batch[460] Speed: 1.2499809826823574 samples/sec                   batch loss = 1124.5383307933807 | accuracy = 0.6652173913043479


Epoch[2] Batch[465] Speed: 1.24196556267859 samples/sec                   batch loss = 1135.4791355133057 | accuracy = 0.6655913978494624


Epoch[2] Batch[470] Speed: 1.243415935396269 samples/sec                   batch loss = 1144.902715563774 | accuracy = 0.6675531914893617


Epoch[2] Batch[475] Speed: 1.2473192497683563 samples/sec                   batch loss = 1159.1276251077652 | accuracy = 0.6663157894736842


Epoch[2] Batch[480] Speed: 1.2453815437826108 samples/sec                   batch loss = 1171.0557260513306 | accuracy = 0.6671875


Epoch[2] Batch[485] Speed: 1.2456891853563425 samples/sec                   batch loss = 1184.940022945404 | accuracy = 0.6670103092783505


Epoch[2] Batch[490] Speed: 1.243790007773124 samples/sec                   batch loss = 1196.5430324077606 | accuracy = 0.6678571428571428


Epoch[2] Batch[495] Speed: 1.2410519949485683 samples/sec                   batch loss = 1208.6161696910858 | accuracy = 0.6676767676767676


Epoch[2] Batch[500] Speed: 1.2420988883665318 samples/sec                   batch loss = 1219.6910535097122 | accuracy = 0.6685


Epoch[2] Batch[505] Speed: 1.2392778582966024 samples/sec                   batch loss = 1234.0221730470657 | accuracy = 0.6678217821782179


Epoch[2] Batch[510] Speed: 1.2443146226482964 samples/sec                   batch loss = 1245.1958309412003 | accuracy = 0.6696078431372549


Epoch[2] Batch[515] Speed: 1.2458524533739577 samples/sec                   batch loss = 1257.9117888212204 | accuracy = 0.6699029126213593


Epoch[2] Batch[520] Speed: 1.244905170419291 samples/sec                   batch loss = 1267.9075498580933 | accuracy = 0.6711538461538461


Epoch[2] Batch[525] Speed: 1.2426251145249059 samples/sec                   batch loss = 1277.48952627182 | accuracy = 0.6728571428571428


Epoch[2] Batch[530] Speed: 1.2427432086480243 samples/sec                   batch loss = 1290.3114562034607 | accuracy = 0.6726415094339623


Epoch[2] Batch[535] Speed: 1.2423896376100425 samples/sec                   batch loss = 1303.198983669281 | accuracy = 0.6719626168224299


Epoch[2] Batch[540] Speed: 1.243756721144541 samples/sec                   batch loss = 1315.4215233325958 | accuracy = 0.6717592592592593


Epoch[2] Batch[545] Speed: 1.239632499543373 samples/sec                   batch loss = 1328.1454094648361 | accuracy = 0.6706422018348623


Epoch[2] Batch[550] Speed: 1.2422732663895104 samples/sec                   batch loss = 1339.8882721662521 | accuracy = 0.6704545454545454


Epoch[2] Batch[555] Speed: 1.2456412768590495 samples/sec                   batch loss = 1349.4609042406082 | accuracy = 0.6711711711711712


Epoch[2] Batch[560] Speed: 1.2448461458384112 samples/sec                   batch loss = 1362.7801035642624 | accuracy = 0.6700892857142857


Epoch[2] Batch[565] Speed: 1.2427142122988086 samples/sec                   batch loss = 1374.0918179750443 | accuracy = 0.6707964601769911


Epoch[2] Batch[570] Speed: 1.245020464616628 samples/sec                   batch loss = 1385.923162817955 | accuracy = 0.6714912280701755


Epoch[2] Batch[575] Speed: 1.2448684987770735 samples/sec                   batch loss = 1396.6659138202667 | accuracy = 0.672608695652174


Epoch[2] Batch[580] Speed: 1.245796391655165 samples/sec                   batch loss = 1407.2959860563278 | accuracy = 0.6724137931034483


Epoch[2] Batch[585] Speed: 1.2493559313148295 samples/sec                   batch loss = 1419.4608035087585 | accuracy = 0.6735042735042736


Epoch[2] Batch[590] Speed: 1.2432140593347152 samples/sec                   batch loss = 1428.451811850071 | accuracy = 0.6745762711864407


Epoch[2] Batch[595] Speed: 1.2366897241672685 samples/sec                   batch loss = 1436.9909409880638 | accuracy = 0.6760504201680673


Epoch[2] Batch[600] Speed: 1.2393617158803725 samples/sec                   batch loss = 1452.5996126532555 | accuracy = 0.6745833333333333


Epoch[2] Batch[605] Speed: 1.2379200495604359 samples/sec                   batch loss = 1468.30815833807 | accuracy = 0.6735537190082644


Epoch[2] Batch[610] Speed: 1.243181908953603 samples/sec                   batch loss = 1481.1775876879692 | accuracy = 0.6721311475409836


Epoch[2] Batch[615] Speed: 1.2449938563679832 samples/sec                   batch loss = 1491.5309789776802 | accuracy = 0.6731707317073171


Epoch[2] Batch[620] Speed: 1.244075276226092 samples/sec                   batch loss = 1503.8144608139992 | accuracy = 0.6733870967741935


Epoch[2] Batch[625] Speed: 1.2430653892729513 samples/sec                   batch loss = 1515.022705256939 | accuracy = 0.6732


Epoch[2] Batch[630] Speed: 1.2500807322545884 samples/sec                   batch loss = 1528.3379328846931 | accuracy = 0.6718253968253968


Epoch[2] Batch[635] Speed: 1.2477725089439156 samples/sec                   batch loss = 1540.332268178463 | accuracy = 0.6716535433070866


Epoch[2] Batch[640] Speed: 1.242538514161698 samples/sec                   batch loss = 1551.8606197237968 | accuracy = 0.671875


Epoch[2] Batch[645] Speed: 1.2487335495037917 samples/sec                   batch loss = 1563.364767730236 | accuracy = 0.6713178294573643


Epoch[2] Batch[650] Speed: 1.2478275421453402 samples/sec                   batch loss = 1574.7854745984077 | accuracy = 0.6715384615384615


Epoch[2] Batch[655] Speed: 1.24101031748063 samples/sec                   batch loss = 1584.587423503399 | accuracy = 0.6725190839694657


Epoch[2] Batch[660] Speed: 1.2466563742242307 samples/sec                   batch loss = 1595.9883356690407 | accuracy = 0.6727272727272727


Epoch[2] Batch[665] Speed: 1.2464238119231084 samples/sec                   batch loss = 1608.6520591378212 | accuracy = 0.6721804511278195


Epoch[2] Batch[670] Speed: 1.2432449215799477 samples/sec                   batch loss = 1620.949361383915 | accuracy = 0.6716417910447762


Epoch[2] Batch[675] Speed: 1.2483812995677712 samples/sec                   batch loss = 1634.8488691449165 | accuracy = 0.6707407407407407


Epoch[2] Batch[680] Speed: 1.2491990917902682 samples/sec                   batch loss = 1647.0919187664986 | accuracy = 0.6702205882352941


Epoch[2] Batch[685] Speed: 1.2529338116381616 samples/sec                   batch loss = 1659.4737902283669 | accuracy = 0.6711678832116789


Epoch[2] Batch[690] Speed: 1.2436637861954727 samples/sec                   batch loss = 1670.8977257609367 | accuracy = 0.6710144927536232


Epoch[2] Batch[695] Speed: 1.245292063137432 samples/sec                   batch loss = 1684.1136538386345 | accuracy = 0.6712230215827338


Epoch[2] Batch[700] Speed: 1.2423295634197487 samples/sec                   batch loss = 1698.1759044528008 | accuracy = 0.6703571428571429


Epoch[2] Batch[705] Speed: 1.2513785535855297 samples/sec                   batch loss = 1708.0724464058876 | accuracy = 0.6719858156028369


Epoch[2] Batch[710] Speed: 1.2457619800298987 samples/sec                   batch loss = 1719.8308652043343 | accuracy = 0.6725352112676056


Epoch[2] Batch[715] Speed: 1.2444105162275287 samples/sec                   batch loss = 1728.8859785199165 | accuracy = 0.6730769230769231


Epoch[2] Batch[720] Speed: 1.2520476764542097 samples/sec                   batch loss = 1740.5517819523811 | accuracy = 0.6732638888888889


Epoch[2] Batch[725] Speed: 1.2465357755322377 samples/sec                   batch loss = 1753.4276463389397 | accuracy = 0.6737931034482758


Epoch[2] Batch[730] Speed: 1.2453235832613208 samples/sec                   batch loss = 1768.18182772398 | accuracy = 0.672945205479452


Epoch[2] Batch[735] Speed: 1.2417553340502367 samples/sec                   batch loss = 1777.01433801651 | accuracy = 0.673469387755102


Epoch[2] Batch[740] Speed: 1.2414712202823461 samples/sec                   batch loss = 1788.947779417038 | accuracy = 0.6739864864864865


Epoch[2] Batch[745] Speed: 1.2381743035017883 samples/sec                   batch loss = 1800.932224869728 | accuracy = 0.674496644295302


Epoch[2] Batch[750] Speed: 1.241366226660773 samples/sec                   batch loss = 1812.9415179491043 | accuracy = 0.6746666666666666


Epoch[2] Batch[755] Speed: 1.2451318990121008 samples/sec                   batch loss = 1826.607674241066 | accuracy = 0.6741721854304635


Epoch[2] Batch[760] Speed: 1.24947968186156 samples/sec                   batch loss = 1836.691110610962 | accuracy = 0.6746710526315789


Epoch[2] Batch[765] Speed: 1.2507302852794289 samples/sec                   batch loss = 1848.3331587314606 | accuracy = 0.6751633986928105


Epoch[2] Batch[770] Speed: 1.2425159687574616 samples/sec                   batch loss = 1860.3386253118515 | accuracy = 0.6746753246753247


Epoch[2] Batch[775] Speed: 1.249850335757462 samples/sec                   batch loss = 1872.4886779785156 | accuracy = 0.6741935483870968


Epoch[2] Batch[780] Speed: 1.2493031819551623 samples/sec                   batch loss = 1881.8073304891586 | accuracy = 0.675


Epoch[2] Batch[785] Speed: 1.2473001470170044 samples/sec                   batch loss = 1890.5515476465225 | accuracy = 0.6764331210191082


[Epoch 2] training: accuracy=0.6770304568527918
[Epoch 2] time cost: 648.5417671203613
[Epoch 2] validation: validation accuracy=0.7111111111111111
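
The log above prints a `batch loss` value that grows steadily within each epoch, which suggests it is a running sum rather than a per-batch value. The following is a minimal sketch, under that assumption, of how you might parse such lines and recover an average loss per batch. The names `LOG_PATTERN`, `parse_log_line`, and `mean_loss_per_batch` are hypothetical helpers introduced for illustration and are not part of MXNet.

```python
import re

# Matches progress lines of the form printed by the training loop above.
LOG_PATTERN = re.compile(
    r"Epoch\[(\d+)\] Batch\[(\d+)\] Speed: ([\d.]+) samples/sec\s+"
    r"batch loss = ([\d.]+) \| accuracy = ([\d.]+)"
)


def parse_log_line(line):
    """Return a dict of the fields in one progress line, or None if it does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    epoch, batch, speed, loss, accuracy = match.groups()
    return {
        "epoch": int(epoch),
        "batch": int(batch),
        "speed": float(speed),
        "running_loss": float(loss),  # assumed to be a running sum over the epoch
        "accuracy": float(accuracy),
    }


def mean_loss_per_batch(record):
    """Divide the (assumed) running loss by the number of batches seen so far."""
    return record["running_loss"] / record["batch"]


# The last progress line of each epoch, copied from the log above.
sample_lines = [
    "Epoch[1] Batch[785] Speed: 1.2456785489817412 samples/sec"
    "                   batch loss = 2119.2538598775864 | accuracy = 0.5834394904458599",
    "Epoch[2] Batch[785] Speed: 1.2473001470170044 samples/sec"
    "                   batch loss = 1890.5515476465225 | accuracy = 0.6764331210191082",
]

for line in sample_lines:
    record = parse_log_line(line)
    print(record["epoch"], round(mean_loss_per_batch(record), 3), record["accuracy"])
```

Comparing the per-batch average between epochs is easier than comparing the raw running sums, because it does not depend on how many batches have been processed when the line was logged.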


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations.
If you would like to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).