# GPUs

In [1]:
!pip install mxnet-cu100

Collecting mxnet-cu100
  Downloading https://files.pythonhosted.org/packages/56/d3/e939814957c2f09ecdd22daa166898889d54e5981e356832425d514edfb6/mxnet_cu100-1.5.0-py2.py3-none-manylinux1_x86_64.whl (540.1MB)
Collecting graphviz<0.9.0,>=0.8.1 (from mxnet-cu100)
  Downloading https://files.pythonhosted.org/packages/53/39/4ab213673844e0c004bed8a0781a0721a3f6bb23eb8854ee75c236428892/graphviz-0.8.4-py2.py3-none-any.whl
Installing collected packages: graphviz, mxnet-cu100
  Found existing installation: graphviz 0.10.1
    Uninstalling graphviz-0.10.1:
      Successfully uninstalled graphviz-0.10.1
Successfully installed graphviz-0.8.4 mxnet-cu100-1.5.0


In [2]:
!nvidia-smi

Sat Sep 21 10:35:57 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.40       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   54C    P8    11W /  70W |      0MiB / 15079MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|  No ru

## Computing Devices


In [9]:
import mxnet as mx
from mxnet import nd
from mxnet.gluon import nn

mx.cpu(), mx.gpu(), mx.gpu(0)

(cpu(0), gpu(0), gpu(0))
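Note that `mx.gpu()` only builds a context object; it does not check that a GPU is actually present. A minimal sketch of a fallback helper follows, assuming `mx.context.num_gpus()` is available in the installed MXNet version. The name `try_gpu` is a hypothetical choice for this example, and the guarded import exists only so the snippet loads on machines without MXNet.

```python
try:
    import mxnet as mx
except ImportError:  # assumption: allow the sketch to load without MXNet
    mx = None


def try_gpu(i=0):
    """Return mx.gpu(i) if that GPU is visible, otherwise mx.cpu().

    Returns None when MXNet itself is not installed.
    """
    if mx is None:
        return None
    try:
        # num_gpus() reports how many GPUs MXNet can see.
        has_gpu = mx.context.num_gpus() > i
    except mx.base.MXNetError:
        # Treat any failure to query the GPU count as "no GPU".
        has_gpu = False
    return mx.gpu(i) if has_gpu else mx.cpu()


ctx = try_gpu()
print(ctx)
```

On the single-GPU machine shown above, `try_gpu()` should give `gpu(0)`, while `try_gpu(1)` should fall back to `cpu(0)`.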

## NDArray and GPUs


In [4]:
x = nd.array([1, 2, 3])
x.context

cpu(0)

### Storage on the GPU


In [5]:
x = nd.ones((2, 3), ctx=mx.gpu())
x


[[1. 1. 1.]
 [1. 1. 1.]]
<NDArray 2x3 @gpu(0)>

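Create an array on a specific GPU. This machine has only one GPU, so the example uses `mx.gpu(0)`; with a second GPU available, pass `mx.gpu(1)` instead: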

In [10]:
y = nd.random.uniform(shape=(2, 3), ctx=mx.gpu(0))
y


[[0.6686509  0.17409194 0.3850025 ]
 [0.24678314 0.35134333 0.8404298 ]]
<NDArray 2x3 @gpu(0)>

### Copy with `copyto`

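All inputs to an operator must be on the same device. MXNet does not move data between devices implicitly, so an array stored elsewhere has to be copied to the target device first.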

![Copyto copies arrays to the target device](http://d2l.ai/_images/copyto.svg)

In [1]:
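# Requires x and y from the cells above; in a fresh kernel this raises NameError.
# Copy x to the device that holds y, then add the two arrays there.
z = x.copyto(mx.gpu(0))
y + z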

NameError: ignored

### Copy with `as_in_context`

In [0]:
z = x.as_in_context(mx.gpu(0))
z

### A Subtle Difference between `copyto` and `as_in_context`

In [0]:
# Returns the input itself if the target device is the same as the source device
# (y was created on gpu(0) above)
y.as_in_context(mx.gpu(0)) is y

In [0]:
# copyto always allocates new memory for the copy, even on the same device
y.copyto(mx.gpu()) is y
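The distinction can be illustrated without a GPU. The following is a minimal pure-Python sketch of the two copy semantics; `ToyArray` is a hypothetical stand-in written for this example, not MXNet's `NDArray`, and devices are represented by plain strings.

```python
class ToyArray:
    """Hypothetical stand-in for an array that lives on a named device."""

    def __init__(self, values, context):
        self.values = list(values)
        self.context = context

    def copyto(self, context):
        # Always allocate a fresh array, even if the device is unchanged.
        return ToyArray(self.values, context)

    def as_in_context(self, context):
        # Reuse the existing array when it already lives on the target device.
        if context == self.context:
            return self
        return self.copyto(context)


y = ToyArray([1, 2, 3], "gpu(0)")
print(y.as_in_context("gpu(0)") is y)  # True: same device, no copy made
print(y.copyto("gpu(0)") is y)         # False: a new array is always created
print(y.as_in_context("cpu(0)") is y)  # False: a different device forces a copy
```

In practice this suggests preferring `as_in_context` when the data may already be on the target device, since it avoids an unnecessary allocation, and using `copyto` when an independent copy is wanted.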

## Gluon and GPUs

In [0]:
net = nn.Sequential()
net.add(nn.Dense(1))
net.initialize(ctx=mx.gpu())

# When the input is an NDArray on the GPU, 
# Gluon will calculate the result on the same GPU.
print(net(x))
net[0].weight.data()