In [0]:
import torch

In [4]:
torch.cuda.is_available()

True

In [5]:
torch.cuda.current_device()

0

In [6]:
torch.cuda.device_count()

1

Check how much GPU memory (in bytes) is currently occupied by tensors

In [7]:
torch.cuda.memory_allocated()

0

Behind the scenes, PyTorch uses a **caching memory allocator** to speed up memory allocations. This allows fast memory deallocation without device synchronizations. The next call reports how much memory that allocator currently holds.

In [8]:
torch.cuda.memory_reserved()  # memory_cached() is the older, deprecated name for this call

0
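To make the difference between the two counters concrete, here is a minimal sketch that reads them before and after allocating one tensor. The helper name `memory_snapshot` is hypothetical, and the code assumes a PyTorch version that provides `torch.cuda.memory_reserved()` (the newer name for `memory_cached()`). On a machine without CUDA it simply reports zeros.

```python
import torch

def memory_snapshot():
    """Hypothetical helper: (allocated, reserved) bytes on the current GPU, or (0, 0) without CUDA."""
    if not torch.cuda.is_available():
        return 0, 0
    return torch.cuda.memory_allocated(), torch.cuda.memory_reserved()

before = memory_snapshot()
if torch.cuda.is_available():
    # 1024 float32 values need 4096 bytes; the allocator may reserve a larger block.
    t = torch.zeros(1024, device='cuda')
after = memory_snapshot()

print('before (allocated, reserved):', before)
print('after  (allocated, reserved):', after)
```

The reserved figure is expected to be at least as large as the allocated one, because the allocator keeps freed blocks around for reuse instead of returning them to the driver.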

Create a device object that refers to the current (default) CUDA device

In [11]:
cuda = torch.device('cuda')
cuda

device(type='cuda')

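Create device objects for specific GPUs; these can be used later with the CUDA context manager. Creating a device object does not check that the GPU actually exists.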

In [0]:
cuda0 = torch.device('cuda:0')
cuda1 = torch.device('cuda:1')
cuda2 = torch.device('cuda:2')

By default, tensors are created on the CPU

In [13]:
x = torch.tensor([10., 20.])
x

tensor([10., 20.])

Create a tensor on the default CUDA device

In [15]:
x_default = torch.tensor([10., 20.], device=cuda)
x_default

tensor([10., 20.], device='cuda:0')

Create a tensor on a specific CUDA device

In [17]:
x0 = torch.tensor([10., 20.], device=cuda0)
x0

tensor([10., 20.], device='cuda:0')

In [18]:
x1 = torch.tensor([10., 20.], device=cuda1)  # raises RuntimeError: this machine has only one GPU, so cuda:1 does not exist


RuntimeError: ignored
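One way to avoid this kind of error is to check `torch.cuda.device_count()` before asking for a particular GPU. The sketch below is only an illustration: `pick_device` is a hypothetical helper, not part of PyTorch, and its fallback order (requested GPU, then the default GPU, then the CPU) is an assumption.

```python
import torch

def pick_device(index=0):
    """Hypothetical helper: return cuda:<index> if that GPU exists, otherwise fall back."""
    if torch.cuda.is_available():
        if 0 <= index < torch.cuda.device_count():
            return torch.device('cuda', index)
        return torch.device('cuda')   # requested GPU is missing: use the default GPU
    return torch.device('cpu')        # no CUDA at all: stay on the CPU

device = pick_device(1)               # on a single-GPU machine this falls back to 'cuda'
x_safe = torch.tensor([10., 20.], device=device)
print(device, x_safe.device)
```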

The .cuda() method returns a copy of the tensor in CUDA memory. If the tensor is already in CUDA memory and on the correct device, no copy is performed and the original object is returned.

Here x was originally created on the CPU; the copy y is now on the default device in CUDA memory

In [22]:
y = x.cuda()

y


tensor([10., 20.], device='cuda:0')

If a tensor is on one GPU and you want a copy on another device, use cuda(). For example, a tensor x1 created on cuda:1 would be copied to the default device cuda:0.

This machine has only one GPU, so the cell below calls cuda() on x0 instead. x0 is already on cuda:0, so no copy is made.

In [25]:
y0 = x0.cuda()

y0

tensor([10., 20.], device='cuda:0')
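The no-copy behaviour described above can be checked directly. The sketch below uses `.to(device)` rather than `.cuda()` so that it also runs on a machine without a GPU; that substitution is my assumption for portability, and the variable names are illustrative.

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

src = torch.tensor([10., 20.])   # created on the CPU
moved = src.to(device)           # copies only if `device` differs from src.device
again = moved.to(device)         # already on `device`, so no new copy is needed

# Identity and storage address both show whether a second copy was made.
print(again is moved)
print(again.data_ptr() == moved.data_ptr())
```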

A "with" context lets you create a block in which the selected GPU is the current device

In [30]:
print('Outside with context: ', torch.cuda.current_device())

with torch.cuda.device(0): #Put (1) in the brackets for GPU1
    print('Inside with context: ', torch.cuda.current_device())
    
print('Outside with context again: ', torch.cuda.current_device())

Outside with context:  0
Inside with context:  0
Outside with context again:  0
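With only one GPU the three printed values are identical, so the effect of the context manager is hard to see. As a small sketch, the loop below enters a context for every GPU that `torch.cuda.device_count()` reports and records what `torch.cuda.current_device()` returns inside it. The list name `visited` is illustrative; on a CPU-only machine the loop body never runs.

```python
import torch

visited = []
for idx in range(torch.cuda.device_count()):   # zero iterations without CUDA
    with torch.cuda.device(idx):
        # Inside the block, idx is the current device.
        visited.append(torch.cuda.current_device())

print(visited)
```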


When we explicitly specify a device within a with context, the tensor is created on the specified device. A tensor created without a device argument still ends up on the CPU.

In [0]:
with torch.cuda.device(0):  # use torch.cuda.device(1) to select GPU1 instead
  
    a = torch.tensor([10., 20.])  # no device argument: created on the CPU, even inside the with block
    
    a0 = torch.tensor([10., 20.], device=cuda0)  # always created on GPU0
    
    a1 = torch.tensor([10., 20.], device=cuda)  # created on the current GPU, i.e. the one selected by the with block (GPU0 here)

In [39]:
a

tensor([10., 20.])

Operations cannot be performed on tensors that live on different devices. Here a is on the CPU and a0 is on cuda:0, so the addition fails.

In [40]:
sum_a = a + a0

RuntimeError: ignored
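The usual fix for a device mismatch like the one above is to move both operands to a common device before combining them. A minimal sketch, with illustrative variable names and a CPU fallback so it runs without a GPU:

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

a_cpu = torch.tensor([10., 20.])                  # lives on the CPU
a_dev = torch.tensor([10., 20.], device=device)   # lives on `device`

# Bring the CPU tensor to the other operand's device, then add.
total = a_cpu.to(a_dev.device) + a_dev
print(total)
```

The result lives on the same device as `a_dev`.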

In [41]:
torch.cuda.memory_allocated()

3072

In [44]:
torch.cuda.memory_reserved()  # memory_cached() is the older, deprecated name for this call

2097152

In [0]:
torch.cuda.empty_cache()

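Create a new tensor on the same device as an existing one. new_full returns a tensor with the same dtype and device as the tensor it is called on, so a CPU tensor yields a CPU tensor and a GPU tensor yields a GPU tensor.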

In [46]:
preserve_context = x.new_full([2, 2], fill_value=1.1)
preserve_context

tensor([[1.1000, 1.1000],
        [1.1000, 1.1000]])
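To confirm that `new_full` carries over both the dtype and the device of its source tensor, the sketch below starts from a float64 tensor on whichever device is available. The names `base` and `filled` are illustrative.

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

base = torch.tensor([10., 20.], dtype=torch.float64, device=device)
filled = base.new_full([2, 2], fill_value=1.1)   # inherits dtype and device from `base`

print(filled.dtype == base.dtype)
print(filled.device == base.device)
```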