# GPUs

Check your CUDA driver and device. 

In [1]:
!nvidia-smi

/bin/sh: nvidia-smi: command not found


Query the number of available GPUs.

In [2]:
from mxnet import np, npx
from mxnet.gluon import nn
npx.set_np()

npx.num_gpus()

0

Computation devices

In [3]:
print(npx.cpu(), npx.gpu(), npx.gpu(1))

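def try_gpu(i=0):
    """Return gpu(i) if it exists, otherwise return cpu()."""
    return npx.gpu(i) if npx.num_gpus() >= i + 1 else npx.cpu()

def try_all_gpus():
    """Return all available GPUs, or [cpu()] if there is no GPU."""
    ctxes = [npx.gpu(i) for i in range(npx.num_gpus())]
    return ctxes if ctxes else [npx.cpu()]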

try_gpu(), try_gpu(3), try_all_gpus()

cpu(0) gpu(0) gpu(1)


(cpu(0), cpu(0), [cpu(0)])
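
The fallback rule used by the two helpers above can be tried without any GPU or MXNet installation. The following is a minimal sketch, not part of the notebook: `pick_device` and `pick_all_devices` are hypothetical names, devices are represented as plain strings, and the GPU count is passed in as an argument instead of being queried from the driver.

```python
def pick_device(i=0, num_gpus=0):
    """Return the name of GPU i if it exists, otherwise fall back to the CPU."""
    # Same condition as try_gpu: GPU i exists only if at least i + 1 GPUs do.
    return f"gpu({i})" if num_gpus >= i + 1 else "cpu(0)"

def pick_all_devices(num_gpus=0):
    """Return the names of all GPUs, or a one-element CPU list if there are none."""
    devices = [f"gpu({i})" for i in range(num_gpus)]
    return devices if devices else ["cpu(0)"]

# Hypothetical machine with no GPU, as in the outputs above.
print(pick_device(0, num_gpus=0), pick_device(3, num_gpus=0), pick_all_devices(0))
# Hypothetical machine with two GPUs: index 3 still falls back to the CPU.
print(pick_device(0, num_gpus=2), pick_device(3, num_gpus=2), pick_all_devices(2))
```

Passing the GPU count explicitly makes both branches of the condition easy to check on any machine.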

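Create an ndarray on the first GPU. No GPU is available on this machine, so try_gpu() falls back to the CPU and the printed context is cpu(0).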

In [4]:
x = np.ones((2, 3), ctx=try_gpu())
print(x.context)
x

cpu(0)


array([[1., 1., 1.],
       [1., 1., 1.]])

Create an ndarray on the second GPU.

In [5]:
y = np.random.uniform(size=(2, 3), ctx=try_gpu(1))
y

array([[0.5488135 , 0.5928446 , 0.71518934],
       [0.84426576, 0.60276335, 0.8579456 ]])

Copy an ndarray between devices.

In [6]:
z = x.copyto(try_gpu(1))
print(x)
print(z)

[[1. 1. 1.]
 [1. 1. 1.]]
[[1. 1. 1.]
 [1. 1. 1.]]


The inputs of an operator must be on the same device; the computation then runs on that device.

In [7]:
y + z

array([[1.5488136, 1.5928446, 1.7151893],
       [1.8442657, 1.6027634, 1.8579457]])
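
One way to satisfy the same-device rule is to move both operands to a chosen device before applying the operator. The helper below is a hedged sketch: `add_on_device` is a hypothetical name, and it assumes its arguments expose `as_in_ctx`, as MXNet 1.x ndarrays do (newer releases may name this method differently).

```python
def add_on_device(a, b, device):
    """Move both operands to `device`, then add them there.

    Assumes `a` and `b` provide an `as_in_ctx(device)` method,
    as MXNet 1.x ndarrays do.
    """
    # as_in_ctx returns the array itself when it already lives on `device`,
    # so operands that are already in place are not copied again.
    a_dev = a.as_in_ctx(device)
    b_dev = b.as_in_ctx(device)
    # Both operands now share a device, so the addition runs there.
    return a_dev + b_dev
```

With the arrays from this notebook, `add_on_device(x, y, try_gpu(1))` would compute the sum on the second GPU when one exists. Unlike `copyto`, which always allocates new memory on the target device, `as_in_ctx` skips the copy when the array is already there.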

Initialize parameters on the first GPU.

In [8]:
net = nn.Sequential()
net.add(nn.Dense(1))
net.initialize(ctx=try_gpu())

When the input is an ndarray on the GPU, Gluon will calculate the result on the same GPU.

In [9]:
net(x)

array([[0.04421056],
       [0.04421056]])

Let us confirm that the model parameters are stored on the same GPU.

In [10]:
net[0].weight.data()

array([[ 0.00628365,  0.04861524, -0.01068833]])