### Tensor processing 2
PyTorch tensors can be created with constructors or by converting other array types. They can be created directly on a target device or copied from one device to another.

Experiment with PyTorch tensors as follows:

1. Prepare startup code that allows both CPU and GPU execution.
2. Create one 3D tensor of size 128^3 in main memory (CPU) and another in GPU memory. The tensor elements should be double-precision (64-bit) floating-point numbers sampled from the normal distribution with zero mean and unit variance. Determine how much memory each tensor reserves.
3. Perform matrix multiplication on the GPU with the tensors and serialise the result to storage.
4. If possible, examine the memory utilisation on the GPU while varying the tensor sizes. Does the information reflect the true utilisation of memory?

Hints: device, element_size, nelement, to, empty_cache.

In [1]:
import torch

Determine the device for tensor processing

In [2]:
cpu = torch.device('cpu')
if torch.cuda.is_available():
    print("GPU available")
    gpu = torch.device('cuda')
else:
    print("GPU not available")
    gpu = None

GPU available


Create a GPU tensor with elements sampled from the normal distribution

In [3]:
n = 128
# randn samples from N(0, 1); rand would sample uniformly from [0, 1)
A1 = torch.randn((n, n, n), dtype=torch.double, device=gpu)
(A1.device, A1.shape, A1.element_size() * A1.nelement())

(device(type='cuda', index=0), torch.Size([128, 128, 128]), 16777216)

Create a CPU tensor with elements sampled from the normal distribution

In [4]:
# randn samples from N(0, 1); rand would sample uniformly from [0, 1)
B1 = torch.randn((n, n, n), dtype=torch.double, device=cpu)
(B1.device, B1.shape, B1.element_size() * B1.nelement())

(device(type='cpu'), torch.Size([128, 128, 128]), 16777216)
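
The byte counts above follow directly from the shape and the element width: 128^3 elements of 8 bytes each. As a quick check in plain Python (the helper name `tensor_bytes` is hypothetical, not a PyTorch function):

```python
import math

def tensor_bytes(shape, element_size):
    """Bytes needed for the elements of a dense tensor (excludes allocator overhead)."""
    return math.prod(shape) * element_size

size = tensor_bytes((128, 128, 128), 8)  # 8 bytes per double-precision element
print(size, "bytes =", size / 2**20, "MiB")
```

The result matches `element_size() * nelement()` for both tensors, that is, 16 MiB each.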

Move the CPU tensor to the GPU

In [5]:
B1 = B1.to(gpu)
B1.device

device(type='cuda', index=0)

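Multiply the tensors on the GPU as a batch of 128 matrix products

In [6]:
# matmul treats the first dimension as a batch; the * operator would multiply elementwise
C1 = torch.matmul(A1, B1)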
(C1.device, C1.shape, C1.element_size() * C1.nelement())

(device(type='cuda', index=0), torch.Size([128, 128, 128]), 16777216)
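
Note that for 3D tensors the `*` operator multiplies element by element, whereas `torch.matmul` (or `@`) treats the first dimension as a batch and multiplies the trailing 2D matrices. For cubic inputs both produce the same shape, so the shape alone cannot tell them apart. A small CPU-only sketch with illustrative values:

```python
import torch

# Small illustrative tensors: a batch of two 2x2 matrices each
A = torch.arange(8, dtype=torch.double).reshape(2, 2, 2)
B = torch.ones((2, 2, 2), dtype=torch.double)

elementwise = A * B            # Hadamard (elementwise) product
batched = torch.matmul(A, B)   # two independent 2x2 matrix products

print(elementwise.shape == batched.shape)   # same shape
print(torch.equal(elementwise, batched))    # different values
```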

Serialise the result to a Pickle file

In [7]:
torch.save(C1, '/home/ltl/experiments/gpuc/s004a.t')

RuntimeError: Parent directory /home/ltl/experiments/gpuc does not exist.
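
The error above arises because `torch.save` does not create missing parent directories. One way around it, sketched here with a temporary directory and a hypothetical helper `save_tensor` (neither is part of the original notebook), is to create the directory first:

```python
import tempfile
from pathlib import Path

import torch

def save_tensor(tensor, path):
    """Create the parent directory if needed, then serialise a CPU copy of the tensor."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    torch.save(tensor.cpu(), path)
    return path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical target path; the nested directory does not exist yet
    target = Path(tmp) / "experiments" / "result.t"
    C = torch.ones((2, 2), dtype=torch.double)
    save_tensor(C, target)
    restored = torch.load(target)
    roundtrip_ok = torch.equal(C, restored)

print(roundtrip_ok)
```

Saving a CPU copy is a design choice: the file can then be loaded on machines without a GPU.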

Examine GPU memory consumption before and after running the following cell, for example with the nvidia-smi command-line tool.

In [8]:
torch.cuda.empty_cache()
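
`nvidia-smi` reports the memory held by the process, which includes blocks that PyTorch's caching allocator keeps after tensors are freed, so it can overstate what live tensors occupy. PyTorch's own counters `torch.cuda.memory_allocated()` and `torch.cuda.memory_reserved()` separate the two. A minimal sketch, assuming a CUDA device may or may not be present (the helper `gpu_memory_report` is hypothetical):

```python
import torch

def gpu_memory_report():
    """Return allocator counters in bytes, or None when no CUDA device is available."""
    if not torch.cuda.is_available():
        return None
    return {
        "allocated": torch.cuda.memory_allocated(),  # bytes occupied by live tensors
        "reserved": torch.cuda.memory_reserved(),    # bytes held by the caching allocator
    }

before = gpu_memory_report()
if before is not None:
    x = torch.randn((128, 128, 128), dtype=torch.double, device="cuda")
    during = gpu_memory_report()
    del x                      # frees the tensor, but the allocator may keep the block
    after_del = gpu_memory_report()
    torch.cuda.empty_cache()   # returns unused cached blocks to the driver
    after_empty = gpu_memory_report()
    for label, report in [("before", before), ("during", during),
                          ("after del", after_del), ("after empty_cache", after_empty)]:
        print(label, report)
else:
    print("No CUDA device available; nothing to report")
```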

A more thorough approach is to record memory snapshots and inspect them with a memory visualiser (or to use the profiler API).

In [9]:
torch.cuda.memory._record_memory_history()

# Code to be examined

torch.cuda.memory._dump_snapshot("snapshot.pickle")

# Examine the snapshot with https://pytorch.org/memory_viz

RuntimeError: record_context_cpp is not support on non-linux non-x86_64 platforms