<!--- Licensed to the Apache Software Foundation (ASF) under one -->
<!--- or more contributor license agreements.  See the NOTICE file -->
<!--- distributed with this work for additional information -->
<!--- regarding copyright ownership.  The ASF licenses this file -->
<!--- to you under the Apache License, Version 2.0 (the -->
<!--- "License"); you may not use this file except in compliance -->
<!--- with the License.  You may obtain a copy of the License at -->

<!---   http://www.apache.org/licenses/LICENSE-2.0 -->

<!--- Unless required by applicable law or agreed to in writing, -->
<!--- software distributed under the License is distributed on an -->
<!--- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->
<!--- KIND, either express or implied.  See the License for the -->
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 7: Load and Run a NN using GPU

In this step, you will learn how to use graphics processing units (GPUs) with MXNet. If you use GPUs to train and deploy neural networks, training and inference may run faster than on central processing units (CPUs).

## Prerequisites

Before you start, make sure you have at least one NVIDIA GPU on your machine and that CUDA is properly installed. GPUs from AMD and Intel are not supported. You also need to install the GPU-enabled version of MXNet. You can find information about how to install the GPU version of MXNet for your system [here](https://mxnet.apache.org/versions/1.4.1/install/ubuntu_setup.html).

You can use the following command to view the number of GPUs that are available to MXNet.

In [1]:
from mxnet import np, npx, gluon, autograd
from mxnet.gluon import nn
import time
npx.set_np()

npx.num_gpus() #This command provides the number of GPUs MXNet can access

1

## Allocate data to a GPU

MXNet's ndarray is very similar to NumPy's. One major difference is that MXNet's ndarray has a `context` attribute specifying which device an array is on. By default, arrays are stored on `npx.cpu()`. To place an array on the first GPU instead, pass either `npx.gpu()` or `npx.gpu(0)` as the context, as in the following code.

In [2]:
gpu = npx.gpu() if npx.num_gpus() > 0 else npx.cpu()
x = np.ones((3,4), ctx=gpu)
x

[09:32:57] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for GPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], ctx=gpu(0))

If you're using a CPU, MXNet allocates data in main memory and tries to use as many CPU cores as possible. If there are multiple GPUs, MXNet will tell you which GPU the ndarray is allocated on.

Assuming there are at least two GPUs, you can create another ndarray and assign it to a different GPU. Requesting the second GPU on a machine that has only one raises an error, so the example code falls back to the CPU in that case. In the example code here, you will copy `x` to the second GPU, `npx.gpu(1)`:

In [3]:
gpu_1 = npx.gpu(1) if npx.num_gpus() > 1 else npx.cpu()
x.copyto(gpu_1)

[09:32:57] /work/mxnet/src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

MXNet requires that users explicitly move data between devices. However, several operations, such as `print` and `asnumpy`, implicitly move data to main memory.

## Choosing GPU Ids
If you have multiple GPUs on your machine, MXNet can access each of them through 0-indexing with `npx`. As you saw before, the first GPU was accessed using `npx.gpu(0)`, and the second using `npx.gpu(1)`. This extends to however many GPUs your machine has. So if your machine has eight GPUs, the last GPU is accessed using `npx.gpu(7)`. This allows you to select which GPUs to use for operations and training. You might find it particularly useful when you want to leverage multiple GPUs while training neural networks.
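
As a minimal sketch of the zero-based indexing described above, the helper below returns the GPU ids to use when you want at most a given number of devices. The name `pick_gpu_ids` is hypothetical and is not part of MXNet; the function works on plain integers so it can run without a GPU.

```python
def pick_gpu_ids(num_available, num_requested):
    """Return zero-based GPU ids, using at most num_requested devices.

    Hypothetical helper for illustration only; not an MXNet function.
    """
    if num_requested < 1:
        raise ValueError("num_requested must be at least 1")
    # Ids start at 0, so the last valid id is num_available - 1.
    return list(range(min(num_available, num_requested)))


print(pick_gpu_ids(num_available=8, num_requested=2))  # [0, 1]
print(pick_gpu_ids(num_available=1, num_requested=2))  # [0]
```

With MXNet, you could pass `npx.num_gpus()` as `num_available` and map each returned id `i` to `npx.gpu(i)`. An empty result means no GPU is available, in which case falling back to `npx.cpu()` is one reasonable choice.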

## Run an operation on a GPU

To perform an operation on a particular GPU, you only need to guarantee that the inputs of the operation are already on that GPU. The output is allocated on the same GPU as well. Almost all operators in the `np` and `npx` modules support running on a GPU.

In [4]:
y = np.random.uniform(size=(3,4), ctx=gpu)
x + y

array([[1.7402194, 1.9209938, 1.0390205, 1.9689629],
       [1.9251406, 1.4463501, 1.6673192, 1.1099306],
       [1.4702187, 1.5131936, 1.7761751, 1.2947657]], ctx=gpu(0))

Remember that if the inputs are not on the same GPU, you will get an error.

## Run a neural network on a GPU

To run a neural network on a GPU, you only need to copy or move the input data and parameters to the GPU. To demonstrate this, you can reuse the previously defined LeafNetwork in [Training Neural Networks](6-train-nn.md). The following code example shows this.

In [5]:
# The convolutional block has a convolution layer, a max pool layer and a batch normalization layer
def conv_block(filters, kernel_size=2, stride=2, batch_norm=True):
    conv_block = nn.HybridSequential()
    conv_block.add(nn.Conv2D(channels=filters, kernel_size=kernel_size, activation='relu'),
              nn.MaxPool2D(pool_size=4, strides=stride))
    if batch_norm:
        conv_block.add(nn.BatchNorm())
    return conv_block

# The dense block consists of a dense layer and a dropout layer
def dense_block(neurons, activation='relu', dropout=0.2):
    dense_block = nn.HybridSequential()
    dense_block.add(nn.Dense(neurons, activation=activation))
    if dropout:
        dense_block.add(nn.Dropout(dropout))
    return dense_block

# Create neural network blueprint using the blocks
class LeafNetwork(nn.HybridBlock):
    def __init__(self):
        super(LeafNetwork, self).__init__()
        self.conv1 = conv_block(32)
        self.conv2 = conv_block(64)
        self.conv3 = conv_block(128)
        self.flatten = nn.Flatten()
        self.dense1 = dense_block(100)
        self.dense2 = dense_block(10)
        self.dense3 = nn.Dense(2)

    def forward(self, batch):
        batch = self.conv1(batch)
        batch = self.conv2(batch)
        batch = self.conv3(batch)
        batch = self.flatten(batch)
        batch = self.dense1(batch)
        batch = self.dense2(batch)
        batch = self.dense3(batch)

        return batch

Load the saved parameters onto GPU 0 directly as shown below. Alternatively, you could use `net.collect_params().reset_ctx(gpu)` to change the device.

In [6]:
net = LeafNetwork()
net.load_parameters('leaf_models.params', ctx=gpu)

Use the following command to create input data on GPU 0. The forward function will then run on GPU 0.

In [7]:
x = np.random.uniform(size=(1, 3, 128, 128), ctx=gpu)
net(x)

[09:32:58] /work/mxnet/src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)


array([[ 9.732011, -9.782253]], ctx=gpu(0))

## Training with multiple GPUs

Finally, you will see how you can use multiple GPUs to jointly train a neural network through data parallelism. In data parallelism, assuming there are *n* GPUs, you split each data batch into *n* parts and assign one part to each GPU. Each GPU then runs the forward and backward passes on its own separate chunk of the data.
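
The idea can be sketched without any GPU at all. The example below is a toy illustration in plain Python, not MXNet's implementation: the helpers `split_batch` and `grad_sum` are hypothetical names, the "devices" are just list entries, and the model is a one-parameter least-squares fit. It shows that summing the gradients computed on each shard gives the same result as computing the gradient on the whole batch, which is why the shards can be processed independently.

```python
def split_batch(batch, num_parts):
    """Split a list of samples into num_parts equal, contiguous shards."""
    if len(batch) % num_parts != 0:
        raise ValueError("batch size must be divisible by num_parts")
    size = len(batch) // num_parts
    return [batch[i * size:(i + 1) * size] for i in range(num_parts)]


def grad_sum(w, shard):
    """Sum over (x, y) pairs of the gradient of (w * x - y) ** 2 w.r.t. w."""
    return sum(2 * (w * x - y) * x for x, y in shard)


batch = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.0)]
w = 1.5

shards = split_batch(batch, num_parts=2)       # one shard per simulated device
per_device = [grad_sum(w, s) for s in shards]  # independent backward passes
combined = sum(per_device)                     # aggregate across devices
full = grad_sum(w, batch)                      # single-device reference

print(per_device, combined, full)
```

In a real multi-GPU run the per-shard computations execute concurrently on different devices, and the aggregation step is handled by the framework rather than by a Python `sum`.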

First, copy the data definitions and the transform functions from the tutorial [Training Neural Networks](6-train-nn.md) with the following commands.

In [8]:
# Import transforms to compose a series of transformations applied to the images
from mxnet.gluon.data.vision import transforms

jitter_param = 0.05

# mean and std for normalizing image value in range (0,1)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

training_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(contrast=jitter_param),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

validation_transformer = transforms.Compose([
    transforms.Resize(size=224, keep_ratio=True),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

# Use ImageFolderDataset to create a Dataset object from directory structure
train_dataset = gluon.data.vision.ImageFolderDataset('./datasets/train')
val_dataset = gluon.data.vision.ImageFolderDataset('./datasets/validation')
test_dataset = gluon.data.vision.ImageFolderDataset('./datasets/test')

# Create data loaders
batch_size = 4
train_loader = gluon.data.DataLoader(train_dataset.transform_first(training_transformer), batch_size=batch_size, shuffle=True, try_nopython=True)
validation_loader = gluon.data.DataLoader(val_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
test_loader = gluon.data.DataLoader(test_dataset.transform_first(validation_transformer), batch_size=batch_size, try_nopython=True)
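
To make the normalization step concrete, here is a small worked example in plain Python for a single RGB pixel. It is a sketch of the arithmetic only, under the assumption that the tensor conversion scales 8-bit values by dividing by 255 and that normalization computes `(value - mean) / std` per channel. The helper names `to_unit_range` and `normalize` are hypothetical and do not come from MXNet.

```python
# Per-channel statistics, same values as in the transform definitions above
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]


def to_unit_range(pixel):
    """Scale 8-bit channel values to floats (assumed: divide by 255)."""
    return [v / 255.0 for v in pixel]


def normalize(pixel, mean, std):
    """Subtract the channel mean and divide by the channel std."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


raw_pixel = [255, 128, 0]           # hypothetical R, G, B values
scaled = to_unit_range(raw_pixel)   # each channel now lies between 0 and 1
normalized = normalize(scaled, mean, std)
print(normalized)
```

After normalization, channels brighter than the mean come out positive and darker channels come out negative, so the network sees inputs centered roughly around zero.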

### Define a helper function
This is the same test function defined previously in **Step 6**.

In [9]:
# Function to return the accuracy for the validation and test set
def test(val_data, devices):
    acc = gluon.metric.Accuracy()
    for batch in val_data:
        data, label = batch[0], batch[1]
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)
        outputs = [net(X) for X in data_list]
        acc.update(label_list, outputs)

    _, accuracy = acc.get()
    return accuracy
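
The helper above hands the metric one list of labels and one list of outputs per device. As a rough sketch of what accumulating accuracy over those shards amounts to, the plain-Python example below counts correct predictions across all shards and divides by the total number of samples. The function names and the sample scores are hypothetical, and this is not how `gluon.metric.Accuracy` is implemented internally.

```python
def argmax(scores):
    """Index of the largest score (the predicted class)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def accuracy_over_shards(label_shards, output_shards):
    """Fraction of correct predictions, pooled over all device shards."""
    correct = 0
    total = 0
    for labels, outputs in zip(label_shards, output_shards):
        for label, scores in zip(labels, outputs):
            correct += int(argmax(scores) == label)
            total += 1
    return correct / total


# Two simulated devices, two samples each, two classes
label_shards = [[0, 1], [1, 0]]
output_shards = [
    [[2.0, -1.0], [0.3, 0.9]],   # device 0: both predictions correct
    [[1.2, 0.4], [0.8, 0.1]],    # device 1: first prediction wrong
]
acc = accuracy_over_shards(label_shards, output_shards)
print(acc)  # 0.75
```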

The training loop is quite similar to that shown earlier. The major differences are highlighted in the following code.

In [10]:
# Diff 1: Use up to two GPUs for training (fewer if fewer are available).
available_gpus = [npx.gpu(i) for i in range(npx.num_gpus())]
num_gpus = 2
devices = available_gpus[:num_gpus]
print('Using {} GPUs'.format(len(devices)))

# Diff 2: reinitialize the parameters and place them on multiple GPUs
net.initialize(force_reinit=True, ctx=devices)

# Loss and trainer are the same as before
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = 'sgd'
optimizer_params = {'learning_rate': 0.001}
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)

epochs = 2
accuracy = gluon.metric.Accuracy()
log_interval = 5

for epoch in range(epochs):
    train_loss = 0.
    tic = time.time()
    btic = time.time()
    accuracy.reset()
    for idx, batch in enumerate(train_loader):
        data, label = batch[0], batch[1]

        # Diff 3: split batch and load into corresponding devices
        data_list = gluon.utils.split_and_load(data, devices)
        label_list = gluon.utils.split_and_load(label, devices)

        # Diff 4: run forward and backward on each device.
        # MXNet will automatically run them in parallel
        with autograd.record():
            outputs = [net(X)
                      for X in data_list]
            losses = [loss_fn(output, label)
                      for output, label in zip(outputs, label_list)]
        for l in losses:
            l.backward()
        trainer.step(batch_size)

        # Diff 5: sum losses over all devices. Here, the float
        # function will copy data into CPU.
        train_loss += sum([float(l.sum()) for l in losses])
        accuracy.update(label_list, outputs)
        if log_interval and (idx + 1) % log_interval == 0:
            _, acc = accuracy.get()

            print(f"""Epoch[{epoch + 1}] Batch[{idx + 1}] Speed: {batch_size / (time.time() - btic)} samples/sec \
                  batch loss = {train_loss} | accuracy = {acc}""")
            btic = time.time()

    _, acc = accuracy.get()

    acc_val = test(validation_loader, devices)
    print(f"[Epoch {epoch + 1}] training: accuracy={acc}")
    print(f"[Epoch {epoch + 1}] time cost: {time.time() - tic}")
    print(f"[Epoch {epoch + 1}] validation: validation accuracy={acc_val}")

Using 1 GPUs


Epoch[1] Batch[5] Speed: 0.5219391920826294 samples/sec                   batch loss = 14.777900457382202 | accuracy = 0.4


Epoch[1] Batch[10] Speed: 0.7654149143717631 samples/sec                   batch loss = 30.43872380256653 | accuracy = 0.35


Epoch[1] Batch[15] Speed: 1.1919859956059715 samples/sec                   batch loss = 45.42611837387085 | accuracy = 0.36666666666666664


Epoch[1] Batch[20] Speed: 1.2415250559204487 samples/sec                   batch loss = 59.1918158531189 | accuracy = 0.4125


Epoch[1] Batch[25] Speed: 1.019804696268725 samples/sec                   batch loss = 72.96912741661072 | accuracy = 0.42


Epoch[1] Batch[30] Speed: 0.8168624889567719 samples/sec                   batch loss = 86.91197323799133 | accuracy = 0.44166666666666665


Epoch[1] Batch[35] Speed: 0.7448007345184704 samples/sec                   batch loss = 101.88783693313599 | accuracy = 0.4357142857142857


Epoch[1] Batch[40] Speed: 1.0325178945103644 samples/sec                   batch loss = 115.85376334190369 | accuracy = 0.43125


Epoch[1] Batch[45] Speed: 1.2274485479829849 samples/sec                   batch loss = 129.94422721862793 | accuracy = 0.4444444444444444


Epoch[1] Batch[50] Speed: 1.2271309102228922 samples/sec                   batch loss = 144.31450152397156 | accuracy = 0.455


Epoch[1] Batch[55] Speed: 1.239052524382847 samples/sec                   batch loss = 158.0210633277893 | accuracy = 0.4681818181818182


Epoch[1] Batch[60] Speed: 1.2353507671380162 samples/sec                   batch loss = 172.3633725643158 | accuracy = 0.4625


Epoch[1] Batch[65] Speed: 1.2441183591671625 samples/sec                   batch loss = 186.51303601264954 | accuracy = 0.45


Epoch[1] Batch[70] Speed: 1.233991470806662 samples/sec                   batch loss = 200.54737424850464 | accuracy = 0.45


Epoch[1] Batch[75] Speed: 1.2419832151945662 samples/sec                   batch loss = 214.64220261573792 | accuracy = 0.44666666666666666


Epoch[1] Batch[80] Speed: 1.2327117833223744 samples/sec                   batch loss = 228.89490675926208 | accuracy = 0.459375


Epoch[1] Batch[85] Speed: 1.2435878258788948 samples/sec                   batch loss = 242.75409197807312 | accuracy = 0.4588235294117647


Epoch[1] Batch[90] Speed: 1.2373208693746858 samples/sec                   batch loss = 256.3795518875122 | accuracy = 0.4638888888888889


Epoch[1] Batch[95] Speed: 1.240589018196279 samples/sec                   batch loss = 270.46026396751404 | accuracy = 0.4631578947368421


Epoch[1] Batch[100] Speed: 1.2373620256219218 samples/sec                   batch loss = 284.6926038265228 | accuracy = 0.4625


Epoch[1] Batch[105] Speed: 1.2375883882848353 samples/sec                   batch loss = 298.6010043621063 | accuracy = 0.4642857142857143


Epoch[1] Batch[110] Speed: 1.2401482954925547 samples/sec                   batch loss = 312.1769554615021 | accuracy = 0.47045454545454546


Epoch[1] Batch[115] Speed: 1.2348705827834008 samples/sec                   batch loss = 326.32310223579407 | accuracy = 0.4673913043478261


Epoch[1] Batch[120] Speed: 1.238880513037763 samples/sec                   batch loss = 340.80574083328247 | accuracy = 0.46458333333333335


Epoch[1] Batch[125] Speed: 1.2426936855426225 samples/sec                   batch loss = 354.40168952941895 | accuracy = 0.468


Epoch[1] Batch[130] Speed: 1.237033034705198 samples/sec                   batch loss = 368.1341862678528 | accuracy = 0.47115384615384615


Epoch[1] Batch[135] Speed: 1.2308895973670146 samples/sec                   batch loss = 381.9735858440399 | accuracy = 0.4703703703703704


Epoch[1] Batch[140] Speed: 1.2367081386466354 samples/sec                   batch loss = 395.9652256965637 | accuracy = 0.4732142857142857


Epoch[1] Batch[145] Speed: 1.2402264949537765 samples/sec                   batch loss = 409.27531480789185 | accuracy = 0.4844827586206897


Epoch[1] Batch[150] Speed: 1.2411712593801474 samples/sec                   batch loss = 422.9939138889313 | accuracy = 0.49


Epoch[1] Batch[155] Speed: 1.2441207578768396 samples/sec                   batch loss = 436.388375043869 | accuracy = 0.4935483870967742


Epoch[1] Batch[160] Speed: 1.2356130666774685 samples/sec                   batch loss = 449.9759883880615 | accuracy = 0.4984375


Epoch[1] Batch[165] Speed: 1.2470802293313628 samples/sec                   batch loss = 464.20240688323975 | accuracy = 0.4954545454545455


Epoch[1] Batch[170] Speed: 1.236240835802111 samples/sec                   batch loss = 477.8028013706207 | accuracy = 0.4985294117647059


Epoch[1] Batch[175] Speed: 1.237218583776473 samples/sec                   batch loss = 491.07027101516724 | accuracy = 0.5071428571428571


Epoch[1] Batch[180] Speed: 1.241495473302493 samples/sec                   batch loss = 504.8883273601532 | accuracy = 0.5097222222222222


Epoch[1] Batch[185] Speed: 1.2372357366483577 samples/sec                   batch loss = 518.8012890815735 | accuracy = 0.5094594594594595


Epoch[1] Batch[190] Speed: 1.2329113499871103 samples/sec                   batch loss = 532.1396689414978 | accuracy = 0.5118421052631579


Epoch[1] Batch[195] Speed: 1.214946436810093 samples/sec                   batch loss = 545.7370357513428 | accuracy = 0.517948717948718


Epoch[1] Batch[200] Speed: 1.2345217491057614 samples/sec                   batch loss = 559.6490969657898 | accuracy = 0.5175


Epoch[1] Batch[205] Speed: 1.2479906285493763 samples/sec                   batch loss = 573.1607189178467 | accuracy = 0.5182926829268293


Epoch[1] Batch[210] Speed: 1.229889545215764 samples/sec                   batch loss = 587.1447176933289 | accuracy = 0.5154761904761904


Epoch[1] Batch[215] Speed: 1.237707718388196 samples/sec                   batch loss = 600.9138300418854 | accuracy = 0.5127906976744186


Epoch[1] Batch[220] Speed: 1.2464151075720322 samples/sec                   batch loss = 614.2836565971375 | accuracy = 0.5147727272727273


Epoch[1] Batch[225] Speed: 1.234106840190187 samples/sec                   batch loss = 628.2514967918396 | accuracy = 0.5122222222222222


Epoch[1] Batch[230] Speed: 1.2408731871591656 samples/sec                   batch loss = 642.2933888435364 | accuracy = 0.5119565217391304


Epoch[1] Batch[235] Speed: 1.226068047398682 samples/sec                   batch loss = 655.7820343971252 | accuracy = 0.5127659574468085


Epoch[1] Batch[240] Speed: 1.2279449054556077 samples/sec                   batch loss = 669.5412695407867 | accuracy = 0.5166666666666667


Epoch[1] Batch[245] Speed: 1.2365739625261811 samples/sec                   batch loss = 683.2731688022614 | accuracy = 0.5173469387755102


Epoch[1] Batch[250] Speed: 1.2394699418926065 samples/sec                   batch loss = 697.0099921226501 | accuracy = 0.521


Epoch[1] Batch[255] Speed: 1.2358442518918258 samples/sec                   batch loss = 710.1226937770844 | accuracy = 0.5235294117647059


Epoch[1] Batch[260] Speed: 1.2522085032796813 samples/sec                   batch loss = 723.6771867275238 | accuracy = 0.5230769230769231


Epoch[1] Batch[265] Speed: 1.2395477810814064 samples/sec                   batch loss = 736.4608972072601 | accuracy = 0.5245283018867924


Epoch[1] Batch[270] Speed: 1.2292733422645068 samples/sec                   batch loss = 750.3558852672577 | accuracy = 0.524074074074074


Epoch[1] Batch[275] Speed: 1.2257565864481155 samples/sec                   batch loss = 763.9199743270874 | accuracy = 0.5281818181818182


Epoch[1] Batch[280] Speed: 1.2396014500276076 samples/sec                   batch loss = 778.4487669467926 | accuracy = 0.5276785714285714


Epoch[1] Batch[285] Speed: 1.2315710649638216 samples/sec                   batch loss = 792.3200950622559 | accuracy = 0.5271929824561403


Epoch[1] Batch[290] Speed: 1.2363902468142745 samples/sec                   batch loss = 806.2159917354584 | accuracy = 0.5267241379310345


Epoch[1] Batch[295] Speed: 1.24604260122991 samples/sec                   batch loss = 819.9937086105347 | accuracy = 0.5262711864406779


Epoch[1] Batch[300] Speed: 1.2453788628739504 samples/sec                   batch loss = 833.6554138660431 | accuracy = 0.5266666666666666


Epoch[1] Batch[305] Speed: 1.24174108851116 samples/sec                   batch loss = 846.7553057670593 | accuracy = 0.530327868852459


Epoch[1] Batch[310] Speed: 1.236576332232244 samples/sec                   batch loss = 860.6865739822388 | accuracy = 0.5306451612903226


Epoch[1] Batch[315] Speed: 1.2472739975973726 samples/sec                   batch loss = 874.6344003677368 | accuracy = 0.5293650793650794


Epoch[1] Batch[320] Speed: 1.2354668457384455 samples/sec                   batch loss = 888.3481879234314 | accuracy = 0.52890625


Epoch[1] Batch[325] Speed: 1.2407246176281204 samples/sec                   batch loss = 901.9126260280609 | accuracy = 0.53


Epoch[1] Batch[330] Speed: 1.2358192178205534 samples/sec                   batch loss = 914.9845921993256 | accuracy = 0.531060606060606


Epoch[1] Batch[335] Speed: 1.2404324459886127 samples/sec                   batch loss = 928.0978994369507 | accuracy = 0.5343283582089552


Epoch[1] Batch[340] Speed: 1.2446606104893783 samples/sec                   batch loss = 941.8476734161377 | accuracy = 0.5345588235294118


Epoch[1] Batch[345] Speed: 1.2387551029599937 samples/sec                   batch loss = 954.7754273414612 | accuracy = 0.5369565217391304


Epoch[1] Batch[350] Speed: 1.247565783392739 samples/sec                   batch loss = 967.8139083385468 | accuracy = 0.5378571428571428


Epoch[1] Batch[355] Speed: 1.2384401804259435 samples/sec                   batch loss = 981.7235908508301 | accuracy = 0.5380281690140845


Epoch[1] Batch[360] Speed: 1.2432524761540102 samples/sec                   batch loss = 994.4460487365723 | accuracy = 0.5416666666666666


Epoch[1] Batch[365] Speed: 1.2392421581844144 samples/sec                   batch loss = 1008.5510058403015 | accuracy = 0.5431506849315069


Epoch[1] Batch[370] Speed: 1.2503813811613766 samples/sec                   batch loss = 1021.8929617404938 | accuracy = 0.5445945945945946


Epoch[1] Batch[375] Speed: 1.2457088862516925 samples/sec                   batch loss = 1035.7093563079834 | accuracy = 0.5433333333333333


Epoch[1] Batch[380] Speed: 1.2454503269033035 samples/sec                   batch loss = 1048.8155932426453 | accuracy = 0.5434210526315789


Epoch[1] Batch[385] Speed: 1.238757298104062 samples/sec                   batch loss = 1062.5087616443634 | accuracy = 0.5441558441558442


Epoch[1] Batch[390] Speed: 1.2374630571102834 samples/sec                   batch loss = 1076.8046379089355 | accuracy = 0.5435897435897435


Epoch[1] Batch[395] Speed: 1.2493647698266617 samples/sec                   batch loss = 1089.6364529132843 | accuracy = 0.5430379746835443


Epoch[1] Batch[400] Speed: 1.22855329953885 samples/sec                   batch loss = 1102.578319311142 | accuracy = 0.545625


Epoch[1] Batch[405] Speed: 1.2393604341288447 samples/sec                   batch loss = 1115.6754429340363 | accuracy = 0.5469135802469136


Epoch[1] Batch[410] Speed: 1.2391171324700863 samples/sec                   batch loss = 1129.2642307281494 | accuracy = 0.5457317073170732


Epoch[1] Batch[415] Speed: 1.2400481999058797 samples/sec                   batch loss = 1142.3974106311798 | accuracy = 0.5463855421686747


Epoch[1] Batch[420] Speed: 1.2497568603520282 samples/sec                   batch loss = 1155.3491270542145 | accuracy = 0.5482142857142858


Epoch[1] Batch[425] Speed: 1.2336328833159496 samples/sec                   batch loss = 1168.5982196331024 | accuracy = 0.5505882352941176


Epoch[1] Batch[430] Speed: 1.2494611642427094 samples/sec                   batch loss = 1182.8660733699799 | accuracy = 0.5505813953488372


Epoch[1] Batch[435] Speed: 1.2416332925430804 samples/sec                   batch loss = 1196.373104095459 | accuracy = 0.5494252873563218


Epoch[1] Batch[440] Speed: 1.2506115073860193 samples/sec                   batch loss = 1209.3184275627136 | accuracy = 0.5517045454545455


Epoch[1] Batch[445] Speed: 1.2456526524813427 samples/sec                   batch loss = 1222.6298174858093 | accuracy = 0.551123595505618


Epoch[1] Batch[450] Speed: 1.2456671728864581 samples/sec                   batch loss = 1236.3571755886078 | accuracy = 0.5505555555555556


Epoch[1] Batch[455] Speed: 1.2455539781859735 samples/sec                   batch loss = 1250.1766159534454 | accuracy = 0.5510989010989011


Epoch[1] Batch[460] Speed: 1.243525239338652 samples/sec                   batch loss = 1263.7026941776276 | accuracy = 0.5510869565217391


Epoch[1] Batch[465] Speed: 1.2433150352402687 samples/sec                   batch loss = 1277.357757806778 | accuracy = 0.5505376344086022


Epoch[1] Batch[470] Speed: 1.2392236681656277 samples/sec                   batch loss = 1291.962925195694 | accuracy = 0.548936170212766


Epoch[1] Batch[475] Speed: 1.2414780183818075 samples/sec                   batch loss = 1305.7886967658997 | accuracy = 0.5489473684210526


Epoch[1] Batch[480] Speed: 1.2261177775580436 samples/sec                   batch loss = 1319.1025590896606 | accuracy = 0.55


Epoch[1] Batch[485] Speed: 1.2502954667181772 samples/sec                   batch loss = 1331.0139071941376 | accuracy = 0.5525773195876289


Epoch[1] Batch[490] Speed: 1.2366159805847083 samples/sec                   batch loss = 1346.1898341178894 | accuracy = 0.551530612244898


Epoch[1] Batch[495] Speed: 1.2358829429060665 samples/sec                   batch loss = 1358.4114480018616 | accuracy = 0.553030303030303


Epoch[1] Batch[500] Speed: 1.2436269110964606 samples/sec                   batch loss = 1371.6794970035553 | accuracy = 0.5535


Epoch[1] Batch[505] Speed: 1.2390952601394458 samples/sec                   batch loss = 1385.3731362819672 | accuracy = 0.5534653465346535


Epoch[1] Batch[510] Speed: 1.2303494412897962 samples/sec                   batch loss = 1398.6458106040955 | accuracy = 0.5544117647058824


Epoch[1] Batch[515] Speed: 1.2364184020789426 samples/sec                   batch loss = 1411.5798771381378 | accuracy = 0.5558252427184466


Epoch[1] Batch[520] Speed: 1.2352013344155042 samples/sec                   batch loss = 1423.5491952896118 | accuracy = 0.5572115384615385


Epoch[1] Batch[525] Speed: 1.2347195396837594 samples/sec                   batch loss = 1436.2697010040283 | accuracy = 0.559047619047619


Epoch[1] Batch[530] Speed: 1.2315035351044088 samples/sec                   batch loss = 1449.1569163799286 | accuracy = 0.5599056603773584


Epoch[1] Batch[535] Speed: 1.2367984869336137 samples/sec                   batch loss = 1461.7961564064026 | accuracy = 0.5607476635514018


Epoch[1] Batch[540] Speed: 1.241116260847514 samples/sec                   batch loss = 1475.750146150589 | accuracy = 0.5606481481481481


Epoch[1] Batch[545] Speed: 1.2340960375780605 samples/sec                   batch loss = 1488.1092083454132 | accuracy = 0.5614678899082569


Epoch[1] Batch[550] Speed: 1.2365800690948736 samples/sec                   batch loss = 1501.3237781524658 | accuracy = 0.5622727272727273


Epoch[1] Batch[555] Speed: 1.2313468075889973 samples/sec                   batch loss = 1515.1930739879608 | accuracy = 0.5617117117117117


Epoch[1] Batch[560] Speed: 1.2407936214103923 samples/sec                   batch loss = 1528.341451883316 | accuracy = 0.5625


Epoch[1] Batch[565] Speed: 1.2318596192995561 samples/sec                   batch loss = 1539.663789987564 | accuracy = 0.5654867256637168


Epoch[1] Batch[570] Speed: 1.2435302165354258 samples/sec                   batch loss = 1552.2869937419891 | accuracy = 0.5671052631578948


Epoch[1] Batch[575] Speed: 1.2310330205662856 samples/sec                   batch loss = 1565.154390335083 | accuracy = 0.5669565217391305


Epoch[1] Batch[580] Speed: 1.2318949857739512 samples/sec                   batch loss = 1577.9244902133942 | accuracy = 0.5681034482758621


Epoch[1] Batch[585] Speed: 1.2378468898215353 samples/sec                   batch loss = 1591.0677330493927 | accuracy = 0.5679487179487179


Epoch[1] Batch[590] Speed: 1.2366992959816672 samples/sec                   batch loss = 1602.6124503612518 | accuracy = 0.5699152542372882


Epoch[1] Batch[595] Speed: 1.2431027837230542 samples/sec                   batch loss = 1614.9486927986145 | accuracy = 0.5718487394957983


Epoch[1] Batch[600] Speed: 1.236088182340147 samples/sec                   batch loss = 1628.5049765110016 | accuracy = 0.5720833333333334


Epoch[1] Batch[605] Speed: 1.2499608670505633 samples/sec                   batch loss = 1643.0925562381744 | accuracy = 0.5727272727272728


Epoch[1] Batch[610] Speed: 1.2443934407358301 samples/sec                   batch loss = 1656.911432504654 | accuracy = 0.5733606557377049


Epoch[1] Batch[615] Speed: 1.2449243846359446 samples/sec                   batch loss = 1669.5484416484833 | accuracy = 0.574390243902439


Epoch[1] Batch[620] Speed: 1.2488035399853155 samples/sec                   batch loss = 1682.4498816728592 | accuracy = 0.5754032258064516


Epoch[1] Batch[625] Speed: 1.2351958780313335 samples/sec                   batch loss = 1694.3094545602798 | accuracy = 0.5764


Epoch[1] Batch[630] Speed: 1.2474961171002843 samples/sec                   batch loss = 1707.136600136757 | accuracy = 0.5777777777777777


Epoch[1] Batch[635] Speed: 1.245494892066789 samples/sec                   batch loss = 1719.2449613809586 | accuracy = 0.5787401574803149


Epoch[1] Batch[640] Speed: 1.2462728919933512 samples/sec                   batch loss = 1732.995457291603 | accuracy = 0.578515625


Epoch[1] Batch[645] Speed: 1.2467868175078214 samples/sec                   batch loss = 1745.0748512744904 | accuracy = 0.57984496124031


Epoch[1] Batch[650] Speed: 1.2394595945954732 samples/sec                   batch loss = 1757.6299755573273 | accuracy = 0.5803846153846154


Epoch[1] Batch[655] Speed: 1.244322374804328 samples/sec                   batch loss = 1771.512927055359 | accuracy = 0.5801526717557252


Epoch[1] Batch[660] Speed: 1.2424576305704182 samples/sec                   batch loss = 1784.5801651477814 | accuracy = 0.5806818181818182


Epoch[1] Batch[665] Speed: 1.2478841580925477 samples/sec                   batch loss = 1796.8946416378021 | accuracy = 0.5827067669172933


Epoch[1] Batch[670] Speed: 1.2405164599042617 samples/sec                   batch loss = 1809.4315581321716 | accuracy = 0.5828358208955224


Epoch[1] Batch[675] Speed: 1.2489651150377574 samples/sec                   batch loss = 1822.5191789865494 | accuracy = 0.5837037037037037


Epoch[1] Batch[680] Speed: 1.2447069660552592 samples/sec                   batch loss = 1834.6006747484207 | accuracy = 0.5845588235294118


Epoch[1] Batch[685] Speed: 1.2464181633410085 samples/sec                   batch loss = 1847.6995383501053 | accuracy = 0.5843065693430657


Epoch[1] Batch[690] Speed: 1.2440883760766153 samples/sec                   batch loss = 1860.5415235757828 | accuracy = 0.5847826086956521


Epoch[1] Batch[695] Speed: 1.242303897964305 samples/sec                   batch loss = 1872.5734869241714 | accuracy = 0.5856115107913669


Epoch[1] Batch[700] Speed: 1.238286709170518 samples/sec                   batch loss = 1885.8994718790054 | accuracy = 0.5853571428571429


Epoch[1] Batch[705] Speed: 1.2425593118722988 samples/sec                   batch loss = 1897.2679628133774 | accuracy = 0.5868794326241135


Epoch[1] Batch[710] Speed: 1.2481204226905476 samples/sec                   batch loss = 1910.6316667795181 | accuracy = 0.5869718309859155


Epoch[1] Batch[715] Speed: 1.2369334413482718 samples/sec                   batch loss = 1922.032708287239 | accuracy = 0.5877622377622378


Epoch[1] Batch[720] Speed: 1.2450576071988442 samples/sec                   batch loss = 1934.302298426628 | accuracy = 0.5888888888888889


Epoch[1] Batch[725] Speed: 1.2390087850551632 samples/sec                   batch loss = 1948.0064874887466 | accuracy = 0.5889655172413794


Epoch[1] Batch[730] Speed: 1.2443524614676242 samples/sec                   batch loss = 1959.451321363449 | accuracy = 0.5897260273972603


Epoch[1] Batch[735] Speed: 1.2544305644947111 samples/sec                   batch loss = 1971.9604882001877 | accuracy = 0.5901360544217688


Epoch[1] Batch[740] Speed: 1.2366054074421209 samples/sec                   batch loss = 1984.976249575615 | accuracy = 0.5898648648648649


Epoch[1] Batch[745] Speed: 1.2516279085034998 samples/sec                   batch loss = 2000.2788738012314 | accuracy = 0.5895973154362416


Epoch[1] Batch[750] Speed: 1.2393785620039954 samples/sec                   batch loss = 2012.787720799446 | accuracy = 0.5906666666666667


Epoch[1] Batch[755] Speed: 1.2437226987034005 samples/sec                   batch loss = 2024.533232331276 | accuracy = 0.5920529801324503


Epoch[1] Batch[760] Speed: 1.2425852639743122 samples/sec                   batch loss = 2037.6560608148575 | accuracy = 0.5907894736842105


Epoch[1] Batch[765] Speed: 1.2416804337455045 samples/sec                   batch loss = 2049.3943178653717 | accuracy = 0.5911764705882353


Epoch[1] Batch[770] Speed: 1.2535009149161251 samples/sec                   batch loss = 2063.935633659363 | accuracy = 0.5902597402597403


Epoch[1] Batch[775] Speed: 1.2374936344801988 samples/sec                   batch loss = 2075.1589167118073 | accuracy = 0.5916129032258064


Epoch[1] Batch[780] Speed: 1.2516553613534802 samples/sec                   batch loss = 2086.321209549904 | accuracy = 0.5923076923076923


Epoch[1] Batch[785] Speed: 1.2511443195818606 samples/sec                   batch loss = 2097.4069234132767 | accuracy = 0.5933121019108281


[Epoch 1] training: accuracy=0.5939086294416244
[Epoch 1] time cost: 662.4308176040649
[Epoch 1] validation: validation accuracy=0.6966666666666667


Epoch[2] Batch[5] Speed: 1.245018154827743 samples/sec                   batch loss = 12.03242802619934 | accuracy = 0.55


Epoch[2] Batch[10] Speed: 1.2460488942191859 samples/sec                   batch loss = 23.25856626033783 | accuracy = 0.65


Epoch[2] Batch[15] Speed: 1.2494789374240822 samples/sec                   batch loss = 36.00518715381622 | accuracy = 0.65


Epoch[2] Batch[20] Speed: 1.2479031859167344 samples/sec                   batch loss = 50.11796152591705 | accuracy = 0.6375


Epoch[2] Batch[25] Speed: 1.2431550107719695 samples/sec                   batch loss = 61.34697902202606 | accuracy = 0.65


Epoch[2] Batch[30] Speed: 1.249458558792647 samples/sec                   batch loss = 74.73574817180634 | accuracy = 0.6416666666666667


Epoch[2] Batch[35] Speed: 1.249986291082051 samples/sec                   batch loss = 85.18869459629059 | accuracy = 0.6642857142857143


Epoch[2] Batch[40] Speed: 1.2499749292991336 samples/sec                   batch loss = 101.21400034427643 | accuracy = 0.6375


Epoch[2] Batch[45] Speed: 1.2519303297839173 samples/sec                   batch loss = 114.4253078699112 | accuracy = 0.6333333333333333


Epoch[2] Batch[50] Speed: 1.2586752532866592 samples/sec                   batch loss = 129.01619720458984 | accuracy = 0.635


Epoch[2] Batch[55] Speed: 1.2542520068371816 samples/sec                   batch loss = 141.95399129390717 | accuracy = 0.6409090909090909


Epoch[2] Batch[60] Speed: 1.2586918730905015 samples/sec                   batch loss = 154.97657752037048 | accuracy = 0.6416666666666667


Epoch[2] Batch[65] Speed: 1.2543022679275369 samples/sec                   batch loss = 168.029238820076 | accuracy = 0.6423076923076924


Epoch[2] Batch[70] Speed: 1.2550209918016624 samples/sec                   batch loss = 182.41496336460114 | accuracy = 0.6392857142857142


Epoch[2] Batch[75] Speed: 1.2478627176786556 samples/sec                   batch loss = 195.91305792331696 | accuracy = 0.6233333333333333


Epoch[2] Batch[80] Speed: 1.247188509629291 samples/sec                   batch loss = 207.3160721063614 | accuracy = 0.628125


Epoch[2] Batch[85] Speed: 1.242200878851175 samples/sec                   batch loss = 219.70415675640106 | accuracy = 0.6264705882352941


Epoch[2] Batch[90] Speed: 1.2409914074822748 samples/sec                   batch loss = 231.50602674484253 | accuracy = 0.625


Epoch[2] Batch[95] Speed: 1.2431363116945782 samples/sec                   batch loss = 242.43799674510956 | accuracy = 0.631578947368421


Epoch[2] Batch[100] Speed: 1.2504304005477167 samples/sec                   batch loss = 254.99769973754883 | accuracy = 0.63


Epoch[2] Batch[105] Speed: 1.2549108779408626 samples/sec                   batch loss = 267.17099833488464 | accuracy = 0.6333333333333333


Epoch[2] Batch[110] Speed: 1.2438084498855884 samples/sec                   batch loss = 279.2589622735977 | accuracy = 0.6318181818181818


Epoch[2] Batch[115] Speed: 1.2504811946506897 samples/sec                   batch loss = 291.38275277614594 | accuracy = 0.6347826086956522


Epoch[2] Batch[120] Speed: 1.2482209899305865 samples/sec                   batch loss = 304.6271654367447 | accuracy = 0.63125


Epoch[2] Batch[125] Speed: 1.2485832777281718 samples/sec                   batch loss = 316.19640016555786 | accuracy = 0.634


Epoch[2] Batch[130] Speed: 1.2500728150402605 samples/sec                   batch loss = 329.9292378425598 | accuracy = 0.6365384615384615


Epoch[2] Batch[135] Speed: 1.244467560997374 samples/sec                   batch loss = 344.5402600765228 | accuracy = 0.6314814814814815


Epoch[2] Batch[140] Speed: 1.2446481449402556 samples/sec                   batch loss = 357.5096844434738 | accuracy = 0.6303571428571428


Epoch[2] Batch[145] Speed: 1.2461696764033825 samples/sec                   batch loss = 369.1874690055847 | accuracy = 0.6362068965517241


Epoch[2] Batch[150] Speed: 1.2461044233500786 samples/sec                   batch loss = 380.96038949489594 | accuracy = 0.64


Epoch[2] Batch[155] Speed: 1.242889868055497 samples/sec                   batch loss = 391.365225315094 | accuracy = 0.6435483870967742


Epoch[2] Batch[160] Speed: 1.2447350396303933 samples/sec                   batch loss = 404.5656760931015 | accuracy = 0.6375


Epoch[2] Batch[165] Speed: 1.2439608026524638 samples/sec                   batch loss = 416.746569275856 | accuracy = 0.6393939393939394


Epoch[2] Batch[170] Speed: 1.2488193423748566 samples/sec                   batch loss = 428.22038424015045 | accuracy = 0.6426470588235295


Epoch[2] Batch[175] Speed: 1.2465204939536696 samples/sec                   batch loss = 441.72439181804657 | accuracy = 0.6414285714285715


Epoch[2] Batch[180] Speed: 1.2439140415082477 samples/sec                   batch loss = 454.532057762146 | accuracy = 0.6402777777777777


Epoch[2] Batch[185] Speed: 1.245017508088391 samples/sec                   batch loss = 466.70687305927277 | accuracy = 0.6405405405405405


Epoch[2] Batch[190] Speed: 1.2380119453295297 samples/sec                   batch loss = 477.44047462940216 | accuracy = 0.6460526315789473


Epoch[2] Batch[195] Speed: 1.2395218641246786 samples/sec                   batch loss = 491.6816908121109 | accuracy = 0.6410256410256411


Epoch[2] Batch[200] Speed: 1.2497986618677757 samples/sec                   batch loss = 505.43948769569397 | accuracy = 0.63875


Epoch[2] Batch[205] Speed: 1.2428105038025348 samples/sec                   batch loss = 518.8387796878815 | accuracy = 0.6402439024390244


Epoch[2] Batch[210] Speed: 1.2431388908442225 samples/sec                   batch loss = 532.2556765079498 | accuracy = 0.638095238095238


Epoch[2] Batch[215] Speed: 1.2437346847169088 samples/sec                   batch loss = 544.6645126342773 | accuracy = 0.6383720930232558


Epoch[2] Batch[220] Speed: 1.2451854056975868 samples/sec                   batch loss = 557.0115950107574 | accuracy = 0.6375


Epoch[2] Batch[225] Speed: 1.250474856812522 samples/sec                   batch loss = 569.0683805942535 | accuracy = 0.6366666666666667


Epoch[2] Batch[230] Speed: 1.2502399361154024 samples/sec                   batch loss = 580.6463671922684 | accuracy = 0.6402173913043478


Epoch[2] Batch[235] Speed: 1.2498030376892297 samples/sec                   batch loss = 592.2866245508194 | accuracy = 0.6404255319148936


Epoch[2] Batch[240] Speed: 1.2461680102829635 samples/sec                   batch loss = 608.5725486278534 | accuracy = 0.6375


Epoch[2] Batch[245] Speed: 1.2483682020447915 samples/sec                   batch loss = 618.7024590969086 | accuracy = 0.6408163265306123


Epoch[2] Batch[250] Speed: 1.2477925542022668 samples/sec                   batch loss = 630.0446105003357 | accuracy = 0.643


Epoch[2] Batch[255] Speed: 1.2394952156632588 samples/sec                   batch loss = 642.4676113128662 | accuracy = 0.6441176470588236


Epoch[2] Batch[260] Speed: 1.2444875925212726 samples/sec                   batch loss = 655.2606126070023 | accuracy = 0.6442307692307693


Epoch[2] Batch[265] Speed: 1.2453246925029553 samples/sec                   batch loss = 666.873352766037 | accuracy = 0.6443396226415095


Epoch[2] Batch[270] Speed: 1.2451569421559796 samples/sec                   batch loss = 678.4843037128448 | accuracy = 0.6453703703703704


Epoch[2] Batch[275] Speed: 1.2467488304959888 samples/sec                   batch loss = 692.7646791934967 | accuracy = 0.6463636363636364


Epoch[2] Batch[280] Speed: 1.2428173165773562 samples/sec                   batch loss = 705.6588492393494 | accuracy = 0.6446428571428572


Epoch[2] Batch[285] Speed: 1.245722667999618 samples/sec                   batch loss = 717.8081256151199 | accuracy = 0.6456140350877193


Epoch[2] Batch[290] Speed: 1.2448105859994483 samples/sec                   batch loss = 730.7890557050705 | accuracy = 0.6448275862068965


Epoch[2] Batch[295] Speed: 1.2489799917127817 samples/sec                   batch loss = 741.2308212518692 | accuracy = 0.6483050847457628


Epoch[2] Batch[300] Speed: 1.2448551977581765 samples/sec                   batch loss = 752.3477083444595 | accuracy = 0.6508333333333334


Epoch[2] Batch[305] Speed: 1.2441080263696784 samples/sec                   batch loss = 765.0572190284729 | accuracy = 0.65


Epoch[2] Batch[310] Speed: 1.2454841665493108 samples/sec                   batch loss = 775.0805134773254 | accuracy = 0.6516129032258065


Epoch[2] Batch[315] Speed: 1.244126385655082 samples/sec                   batch loss = 787.5227241516113 | accuracy = 0.6523809523809524


Epoch[2] Batch[320] Speed: 1.251247427730335 samples/sec                   batch loss = 802.0071303844452 | accuracy = 0.6515625


Epoch[2] Batch[325] Speed: 1.2433905936344736 samples/sec                   batch loss = 814.3803088665009 | accuracy = 0.6507692307692308


Epoch[2] Batch[330] Speed: 1.244300964322686 samples/sec                   batch loss = 827.5151653289795 | accuracy = 0.65


Epoch[2] Batch[335] Speed: 1.2433204714522785 samples/sec                   batch loss = 839.8092021942139 | accuracy = 0.6492537313432836


Epoch[2] Batch[340] Speed: 1.2469003290787504 samples/sec                   batch loss = 851.6929309368134 | accuracy = 0.6507352941176471


Epoch[2] Batch[345] Speed: 1.2409393620646385 samples/sec                   batch loss = 863.8599816560745 | accuracy = 0.6507246376811594


Epoch[2] Batch[350] Speed: 1.2411055187982862 samples/sec                   batch loss = 877.3988684415817 | accuracy = 0.6492857142857142


Epoch[2] Batch[355] Speed: 1.2461998524664553 samples/sec                   batch loss = 890.8985658884048 | accuracy = 0.6485915492957747


Epoch[2] Batch[360] Speed: 1.2485685963402982 samples/sec                   batch loss = 906.6480573415756 | accuracy = 0.6465277777777778


Epoch[2] Batch[365] Speed: 1.2442559308868004 samples/sec                   batch loss = 918.9130846261978 | accuracy = 0.6472602739726028


Epoch[2] Batch[370] Speed: 1.2461609756014225 samples/sec                   batch loss = 930.6563867330551 | accuracy = 0.6486486486486487


Epoch[2] Batch[375] Speed: 1.239402367089899 samples/sec                   batch loss = 941.9997161626816 | accuracy = 0.6493333333333333


Epoch[2] Batch[380] Speed: 1.2426780377983468 samples/sec                   batch loss = 954.9927303791046 | accuracy = 0.6486842105263158


Epoch[2] Batch[385] Speed: 1.2412105601488872 samples/sec                   batch loss = 967.7977340221405 | accuracy = 0.6480519480519481


Epoch[2] Batch[390] Speed: 1.2462562282296095 samples/sec                   batch loss = 977.7921277284622 | accuracy = 0.6506410256410257


Epoch[2] Batch[395] Speed: 1.2463188122259947 samples/sec                   batch loss = 988.5505913496017 | accuracy = 0.6512658227848102


Epoch[2] Batch[400] Speed: 1.2521003774985153 samples/sec                   batch loss = 999.9170541763306 | accuracy = 0.6525


Epoch[2] Batch[405] Speed: 1.2520784181908973 samples/sec                   batch loss = 1009.91146671772 | accuracy = 0.6549382716049382


Epoch[2] Batch[410] Speed: 1.242859391720131 samples/sec                   batch loss = 1026.5940905809402 | accuracy = 0.6542682926829269


Epoch[2] Batch[415] Speed: 1.2476890867542147 samples/sec                   batch loss = 1039.6822801828384 | accuracy = 0.653012048192771


Epoch[2] Batch[420] Speed: 1.2444936851863913 samples/sec                   batch loss = 1052.3731124401093 | accuracy = 0.6523809523809524


Epoch[2] Batch[425] Speed: 1.2461179361872832 samples/sec                   batch loss = 1064.3229615688324 | accuracy = 0.6523529411764706


Epoch[2] Batch[430] Speed: 1.2526915123533415 samples/sec                   batch loss = 1076.0064648389816 | accuracy = 0.6523255813953488


Epoch[2] Batch[435] Speed: 1.2465915333175663 samples/sec                   batch loss = 1087.7254354953766 | accuracy = 0.6511494252873563


Epoch[2] Batch[440] Speed: 1.2442241879645015 samples/sec                   batch loss = 1101.3103556632996 | accuracy = 0.6505681818181818


Epoch[2] Batch[445] Speed: 1.2491493318436357 samples/sec                   batch loss = 1114.4648574590683 | accuracy = 0.648314606741573


Epoch[2] Batch[450] Speed: 1.24794384243188 samples/sec                   batch loss = 1127.094718694687 | accuracy = 0.6477777777777778


Epoch[2] Batch[455] Speed: 1.2468600185216936 samples/sec                   batch loss = 1138.5741419792175 | accuracy = 0.6483516483516484


Epoch[2] Batch[460] Speed: 1.2415289146267272 samples/sec                   batch loss = 1151.0488138198853 | accuracy = 0.6489130434782608


Epoch[2] Batch[465] Speed: 1.2399964168652124 samples/sec                   batch loss = 1165.2080583572388 | accuracy = 0.6489247311827957


Epoch[2] Batch[470] Speed: 1.241667844029225 samples/sec                   batch loss = 1175.5496594905853 | accuracy = 0.65


Epoch[2] Batch[475] Speed: 1.24088264028498 samples/sec                   batch loss = 1186.1908601522446 | accuracy = 0.6505263157894737


Epoch[2] Batch[480] Speed: 1.2415180735605342 samples/sec                   batch loss = 1197.3970164060593 | accuracy = 0.6515625


Epoch[2] Batch[485] Speed: 1.2443960251044783 samples/sec                   batch loss = 1206.2009572982788 | accuracy = 0.6525773195876289


Epoch[2] Batch[490] Speed: 1.2419180320963983 samples/sec                   batch loss = 1216.7036343812943 | accuracy = 0.6540816326530612


Epoch[2] Batch[495] Speed: 1.2410499752710458 samples/sec                   batch loss = 1230.9192999601364 | accuracy = 0.6525252525252525


Epoch[2] Batch[500] Speed: 1.2525712394920168 samples/sec                   batch loss = 1244.306260228157 | accuracy = 0.6515


Epoch[2] Batch[505] Speed: 1.2542793873794669 samples/sec                   batch loss = 1257.0751569271088 | accuracy = 0.650990099009901


Epoch[2] Batch[510] Speed: 1.2490947399769199 samples/sec                   batch loss = 1269.9043924808502 | accuracy = 0.6509803921568628


Epoch[2] Batch[515] Speed: 1.2550059708700534 samples/sec                   batch loss = 1283.604129076004 | accuracy = 0.6495145631067961


Epoch[2] Batch[520] Speed: 1.2457084237822182 samples/sec                   batch loss = 1295.3742535114288 | accuracy = 0.6495192307692308


Epoch[2] Batch[525] Speed: 1.2499390758467153 samples/sec                   batch loss = 1308.7231364250183 | accuracy = 0.6480952380952381


Epoch[2] Batch[530] Speed: 1.2420887730005938 samples/sec                   batch loss = 1321.8804559707642 | accuracy = 0.6471698113207547


Epoch[2] Batch[535] Speed: 1.255330596936919 samples/sec                   batch loss = 1331.670572757721 | accuracy = 0.6490654205607477


Epoch[2] Batch[540] Speed: 1.2419746646944765 samples/sec                   batch loss = 1344.4594616889954 | accuracy = 0.6486111111111111


Epoch[2] Batch[545] Speed: 1.2446378033408565 samples/sec                   batch loss = 1358.8243045806885 | accuracy = 0.6477064220183486


Epoch[2] Batch[550] Speed: 1.2412088154358016 samples/sec                   batch loss = 1372.3541955947876 | accuracy = 0.6472727272727272


Epoch[2] Batch[555] Speed: 1.2454543025040903 samples/sec                   batch loss = 1382.3744084835052 | accuracy = 0.6486486486486487


Epoch[2] Batch[560] Speed: 1.2377295417604586 samples/sec                   batch loss = 1395.520363330841 | accuracy = 0.6486607142857143


Epoch[2] Batch[565] Speed: 1.2399362989726777 samples/sec                   batch loss = 1410.84097802639 | accuracy = 0.6473451327433628


Epoch[2] Batch[570] Speed: 1.237271868721813 samples/sec                   batch loss = 1424.6648317575455 | accuracy = 0.6464912280701754


Epoch[2] Batch[575] Speed: 1.2384620296357483 samples/sec                   batch loss = 1433.8801674842834 | accuracy = 0.648695652173913


Epoch[2] Batch[580] Speed: 1.2410619098244797 samples/sec                   batch loss = 1443.6141238212585 | accuracy = 0.6504310344827586


Epoch[2] Batch[585] Speed: 1.2453192387506113 samples/sec                   batch loss = 1455.415608882904 | accuracy = 0.6504273504273504


Epoch[2] Batch[590] Speed: 1.240195232321273 samples/sec                   batch loss = 1466.0931510925293 | accuracy = 0.6504237288135594


Epoch[2] Batch[595] Speed: 1.2433922523363794 samples/sec                   batch loss = 1478.090360045433 | accuracy = 0.6512605042016807


Epoch[2] Batch[600] Speed: 1.245576356552314 samples/sec                   batch loss = 1488.8932062387466 | accuracy = 0.6520833333333333


Epoch[2] Batch[605] Speed: 1.2462467856279593 samples/sec                   batch loss = 1500.0407938957214 | accuracy = 0.6524793388429752


Epoch[2] Batch[610] Speed: 1.2479450491725823 samples/sec                   batch loss = 1513.320796251297 | accuracy = 0.6520491803278688


Epoch[2] Batch[615] Speed: 1.2495803752044499 samples/sec                   batch loss = 1523.3943382501602 | accuracy = 0.6524390243902439


Epoch[2] Batch[620] Speed: 1.2484084243900013 samples/sec                   batch loss = 1538.29905295372 | accuracy = 0.652016129032258


Epoch[2] Batch[625] Speed: 1.2433868154966528 samples/sec                   batch loss = 1548.2312942743301 | accuracy = 0.6532


Epoch[2] Batch[630] Speed: 1.242405830116279 samples/sec                   batch loss = 1561.9386633634567 | accuracy = 0.6527777777777778


Epoch[2] Batch[635] Speed: 1.2413961705378314 samples/sec                   batch loss = 1571.5533351898193 | accuracy = 0.6543307086614173


Epoch[2] Batch[640] Speed: 1.236013781998767 samples/sec                   batch loss = 1584.0975149869919 | accuracy = 0.653515625


Epoch[2] Batch[645] Speed: 1.2403877838347694 samples/sec                   batch loss = 1594.0326933860779 | accuracy = 0.6542635658914728


Epoch[2] Batch[650] Speed: 1.2352162487781992 samples/sec                   batch loss = 1605.2818411588669 | accuracy = 0.6546153846153846


Epoch[2] Batch[655] Speed: 1.2384764743065393 samples/sec                   batch loss = 1614.4277929067612 | accuracy = 0.6561068702290076


Epoch[2] Batch[660] Speed: 1.2442635900321062 samples/sec                   batch loss = 1624.2080295085907 | accuracy = 0.656060606060606


Epoch[2] Batch[665] Speed: 1.2437363443369878 samples/sec                   batch loss = 1637.5743128061295 | accuracy = 0.6567669172932331


Epoch[2] Batch[670] Speed: 1.2409478983153297 samples/sec                   batch loss = 1646.2144476175308 | accuracy = 0.6578358208955224


Epoch[2] Batch[675] Speed: 1.245566739304612 samples/sec                   batch loss = 1658.2039059400558 | accuracy = 0.6581481481481481


Epoch[2] Batch[680] Speed: 1.2499803307767317 samples/sec                   batch loss = 1669.6512578725815 | accuracy = 0.6577205882352941


Epoch[2] Batch[685] Speed: 1.245463085898135 samples/sec                   batch loss = 1684.1332651376724 | accuracy = 0.6562043795620438


Epoch[2] Batch[690] Speed: 1.247273533965073 samples/sec                   batch loss = 1695.0685756206512 | accuracy = 0.6572463768115943


Epoch[2] Batch[695] Speed: 1.248957583856145 samples/sec                   batch loss = 1704.9179068803787 | accuracy = 0.6579136690647482


Epoch[2] Batch[700] Speed: 1.2394778169424838 samples/sec                   batch loss = 1715.4950193166733 | accuracy = 0.6585714285714286


Epoch[2] Batch[705] Speed: 1.2384203431082172 samples/sec                   batch loss = 1728.06278860569 | accuracy = 0.6592198581560283


Epoch[2] Batch[710] Speed: 1.237972024716029 samples/sec                   batch loss = 1741.9579240083694 | accuracy = 0.6591549295774648


Epoch[2] Batch[715] Speed: 1.2447481533790132 samples/sec                   batch loss = 1754.041021823883 | accuracy = 0.6587412587412588


Epoch[2] Batch[720] Speed: 1.2457212805615259 samples/sec                   batch loss = 1764.8110857009888 | accuracy = 0.6590277777777778


Epoch[2] Batch[725] Speed: 1.2458039772691412 samples/sec                   batch loss = 1775.673392534256 | accuracy = 0.6596551724137931


Epoch[2] Batch[730] Speed: 1.24407850503687 samples/sec                   batch loss = 1788.1560408473015 | accuracy = 0.6599315068493151


Epoch[2] Batch[735] Speed: 1.2487964755169967 samples/sec                   batch loss = 1800.5589177012444 | accuracy = 0.6598639455782312


Epoch[2] Batch[740] Speed: 1.2480809616727953 samples/sec                   batch loss = 1808.9525602459908 | accuracy = 0.660472972972973


Epoch[2] Batch[745] Speed: 1.2507712194664822 samples/sec                   batch loss = 1819.6591964364052 | accuracy = 0.6610738255033557


Epoch[2] Batch[750] Speed: 1.2476689520347852 samples/sec                   batch loss = 1829.1794238686562 | accuracy = 0.662


Epoch[2] Batch[755] Speed: 1.2490766057448446 samples/sec                   batch loss = 1839.8198999762535 | accuracy = 0.6625827814569536


Epoch[2] Batch[760] Speed: 1.2492372282644306 samples/sec                   batch loss = 1852.5097162127495 | accuracy = 0.6625


Epoch[2] Batch[765] Speed: 1.246123767159606 samples/sec                   batch loss = 1862.5872712731361 | accuracy = 0.6627450980392157


Epoch[2] Batch[770] Speed: 1.2518479386597183 samples/sec                   batch loss = 1874.6926352381706 | accuracy = 0.6623376623376623


Epoch[2] Batch[775] Speed: 1.2419902947299708 samples/sec                   batch loss = 1887.1125256419182 | accuracy = 0.6619354838709678


Epoch[2] Batch[780] Speed: 1.246839076594411 samples/sec                   batch loss = 1899.226878464222 | accuracy = 0.6618589743589743


Epoch[2] Batch[785] Speed: 1.2473508725990645 samples/sec                   batch loss = 1913.3467491269112 | accuracy = 0.6617834394904458


[Epoch 2] training: accuracy=0.6624365482233503
[Epoch 2] time cost: 648.7618913650513
[Epoch 2] validation: validation accuracy=0.7122222222222222


## Next steps

Now that you have completed training and predicting with a neural network on GPUs, you have reached the end of the crash course. Congratulations!
If you are keen to learn more, check out [D2L.ai](https://d2l.ai),
[GluonCV](https://cv.gluon.ai/tutorials/index.html), [GluonNLP](https://nlp.gluon.ai),
[GluonTS](https://ts.gluon.ai/), and [AutoGluon](https://auto.gluon.ai).