# Demo: Creating Tensors on CUDA-enabled devices 

In [1]:
import torch

In [2]:
# Check whether CUDA is available
torch.cuda.is_available()

True
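Since `torch.cuda.is_available()` returns a plain boolean, a common pattern is to use it to choose a device once and reuse that choice. The following is a minimal sketch, not part of the original demo; the variable name `device` is just an illustrative choice.

```python
import torch

# Pick the GPU when one is usable, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# The same line then works on machines with or without a GPU.
x = torch.tensor([10., 20.], device=device)
print(x.device)
```

On the machine used in this demo this would place `x` on the GPU; on a CPU-only machine the tensor simply stays on the CPU.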

In [3]:
# To initialize PyTorch's CUDA state explicitly, you can call torch.cuda.init().
# You may need to do this when interacting with PyTorch through its C API.
# When you work from Python, the CUDA state is initialized on demand, so the call is not really needed here.

torch.cuda.init()

In [4]:
# At any point in time, torch.cuda keeps track of the currently selected GPU, and
# all CUDA tensors that you allocate are created on that device by default.

# GPUs are identified by index; this returns the index of the current device.
torch.cuda.current_device()

0

In [5]:
# torch.cuda.device_count() returns the number of CUDA-enabled devices available for PyTorch to use.

torch.cuda.device_count()

1

In [6]:
# To monitor how much GPU memory your tensors occupy (in bytes), you can call torch.cuda.memory_allocated().

torch.cuda.memory_allocated()

0

In [8]:
# Behind the scenes, PyTorch uses a caching memory allocator to speed up memory allocations for your tensors.
# This allows fast memory deallocation without device synchronizations.

# torch.cuda.memory_reserved() returns the bytes held by that allocator, including cached blocks not currently in use.
# The older name, torch.cuda.memory_cached(), raises a FutureWarning because it has been renamed to memory_reserved().
torch.cuda.memory_reserved()

0

In [9]:
# "cuda" refers to the default CUDA device used by PyTorch (the one on which tensors will be created).
# The default can be changed with the torch.cuda.device context manager.

cuda = torch.device("cuda")

cuda

device(type='cuda')
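The context manager mentioned above is `torch.cuda.device`. Below is a hedged sketch of how it changes the current device for the duration of a `with` block. The helper name `current_device_inside` is hypothetical, and it returns `None` when the requested GPU does not exist so that the snippet also runs on machines with fewer GPUs.

```python
import torch

def current_device_inside(index):
    """Return the current device index as seen inside `with torch.cuda.device(index)`.

    Hypothetical helper for illustration; returns None when that GPU is not present.
    """
    if not torch.cuda.is_available() or index >= torch.cuda.device_count():
        return None
    with torch.cuda.device(index):
        # Inside the block, tensors created with device="cuda" land on `index`.
        return torch.cuda.current_device()

print(current_device_inside(0))
```

With a single GPU, as in this demo, the only valid index is 0, so the context manager has no visible effect; it becomes useful once `torch.cuda.device_count()` is greater than 1.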

In [14]:
# To refer to a specific CUDA device, include its index in the device string.

# This machine has only one CUDA device (the laptop's dedicated GPU), at index 0, which is also the default.
# torch.device() does not check that the device exists, so cuda1 and cuda2 can be constructed here,
# but they do not refer to usable GPUs.
cuda0 = torch.device("cuda:0")
cuda1 = torch.device("cuda:1")
cuda2 = torch.device("cuda:2")

display(cuda0)
display(cuda1)
display(cuda2)

device(type='cuda', index=0)

device(type='cuda', index=1)

device(type='cuda', index=2)

In [15]:
# When you create a tensor without specifying a device, it is created on the CPU by default.

x = torch.tensor([10., 20.])

# x is a tensor created on the CPU, because it has no CUDA device associated with it.
x

tensor([10., 20.])
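The printed representation of a CPU tensor omits the device, so it can help to inspect it directly. A small sketch using the tensor's `device` and `is_cuda` attributes, assuming the default device has not been changed elsewhere:

```python
import torch

x = torch.tensor([10., 20.])

# No device argument was given, so the tensor lives on the CPU.
print(x.device)
print(x.is_cuda)
```

Here `x.device` is `device(type='cpu')` and `x.is_cuda` is `False`.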

In [16]:
# To create a tensor on the GPU, pass the device parameter explicitly. device=cuda (no index)
# selects the current CUDA device.
x_default = torch.tensor([10., 20.], device=cuda) # with no index, this resolves to the current device, here cuda:0

x_default

tensor([10., 20.], device='cuda:0')

In [17]:
x0 = torch.tensor([10.0, 20.0], device=cuda0)

x0

tensor([10., 20.], device='cuda:0')

In [18]:
torch.cuda.memory_allocated()

1024

In [19]:
torch.cuda.memory_reserved()

2097152
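The two numbers above are much larger than the data itself. The sketch below computes the actual payload of one of these tensors and, when a GPU is present, measures what a single allocation adds to `memory_allocated()`. My understanding is that the caching allocator rounds small allocations up to 512-byte blocks and reserves memory in 2 MiB segments, which would explain 1024 bytes for two tensors and 2097152 bytes reserved; treat those sizes as implementation details that may differ between PyTorch versions.

```python
import torch

x = torch.tensor([10., 20.])  # float32 by default: 4 bytes per element

# Bytes of real data held by the tensor.
payload = x.element_size() * x.nelement()
print(payload)

if torch.cuda.is_available():
    before = torch.cuda.memory_allocated()
    y = torch.tensor([10., 20.], device="cuda")
    # Growth in allocated bytes for one small tensor (block-rounded, so larger than payload).
    print(torch.cuda.memory_allocated() - before)
```

The payload is 8 bytes; the GPU-side figure depends on the allocator, so no value is asserted for it here.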

In [20]:
# Let's create another tensor, x1, explicitly on the device cuda1.

# This raises an error, because this machine has no second GPU.
x1 = torch.tensor([10.0, 20.0], device=cuda1)

x1

RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

In [21]:
# The same error occurs for cuda2, the third device we created a reference to.

x2 = torch.tensor([10.0, 20.0], device=cuda2)

RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
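Because `torch.device("cuda:1")` succeeds even when the GPU is missing, the error only surfaces at allocation time. One way to fail earlier with a clearer message is to compare the index against `torch.cuda.device_count()` first. This is a sketch only; `checked_device` is a hypothetical helper, not a PyTorch function.

```python
import torch

def checked_device(index):
    """Return torch.device for cuda:<index>, or raise ValueError if that GPU is not visible.

    Hypothetical helper for illustration.
    """
    count = torch.cuda.device_count()
    if not 0 <= index < count:
        raise ValueError(
            f"cuda:{index} requested, but only {count} CUDA device(s) are visible"
        )
    return torch.device(f"cuda:{index}")

try:
    cuda1 = checked_device(1)
except ValueError as err:
    print(err)
```

On the single-GPU machine used in this demo, `checked_device(0)` would return `device(type='cuda', index=0)`, while `checked_device(1)` would raise the `ValueError` instead of the later `RuntimeError`.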