# PyTorch Test Notebook

This notebook provides a basic smoke test for a Jupyter environment with PyTorch installed. It checks that the following PyTorch scripts run correctly:

1. PyTorch quickstart for beginners
2. PyTorch tests (basic operations on GPU)
3. PyTorch GPU operations

## 1. PyTorch quickstart for beginners
This test runs the basic Getting Started (quickstart) tutorial from the PyTorch website to check that the installation works properly. It covers:

- Working with data;
- Creating Models;
- Optimizing the Model Parameters;
- Saving Models;
- Loading Models;
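
The last two steps can be checked in isolation before running the full tutorial. The sketch below is a minimal, CPU-only illustration. It assumes a small stand-in network (`make_model` is a hypothetical helper, not part of the tutorial) and serializes the state dictionary to an in-memory buffer rather than to `model.pth`.

```python
import io

import torch
from torch import nn

torch.manual_seed(0)


def make_model():
    # Hypothetical stand-in for the tutorial's NeuralNetwork class.
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 16),
        nn.ReLU(),
        nn.Linear(16, 10),
    )


# Save the state dictionary to an in-memory buffer instead of a file.
model = make_model()
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)

# Re-create the structure, then load the saved parameters into it.
restored = make_model()
restored.load_state_dict(torch.load(buffer, weights_only=True))

model.eval()
restored.eval()

# A batch of 4 random "images" with the FashionMNIST shape [N, C, H, W].
x = torch.rand(4, 1, 28, 28)
with torch.no_grad():
    same = torch.equal(model(x), restored(x))

print(f"Restored model reproduces the original outputs: {same}")
```

If `same` is `False`, the save/load round trip itself is suspect, independent of the dataset download or the GPU.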

The expected output of this test is a set of the following objects:

- Dataset download;
- Dataloader objects;
- Model object;
- Training epochs;
- Model prediction;
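
As a rough guide to what the "Training epochs" output should look like, the sketch below works out how many batches and progress lines to expect per epoch. It assumes the FashionMNIST training split of 60,000 samples, a batch size of 64, and a progress line every 100 batches, as in the cell that follows; `expected_progress` is a hypothetical helper written for this illustration.

```python
import math


def expected_progress(num_samples, batch_size, log_every):
    """Return (number of batches, size of last batch, logged sample counters)."""
    num_batches = math.ceil(num_samples / batch_size)
    last_batch = num_samples - (num_batches - 1) * batch_size
    # The training loop logs (batch + 1) * len(X) whenever batch % log_every == 0.
    # This uses batch_size for len(X), which holds for every full batch.
    counters = [(b + 1) * batch_size for b in range(0, num_batches, log_every)]
    return num_batches, last_batch, counters


num_batches, last_batch, counters = expected_progress(60_000, 64, 100)
print(f"batches per epoch: {num_batches}")
print(f"samples in last batch: {last_batch}")
print(f"progress lines per epoch: {len(counters)}")
print(f"first and last counters: {counters[0]}, {counters[-1]}")
```

These counters correspond to the bracketed values in the training log, for example `[   64/60000]` and `[57664/60000]`.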

In [1]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor


# 1. Working with data
# ----------------------------------------------------------------------------------------------
# PyTorch offers domain-specific libraries such as TorchText, TorchVision,
# and TorchAudio, all of which include datasets. For this tutorial, we will
# be using a TorchVision dataset.
#
# The torchvision.datasets module contains Dataset objects for many real-world
# vision datasets such as CIFAR and COCO. In this tutorial, we use the
# FashionMNIST dataset. Every TorchVision Dataset accepts two arguments,
# transform and target_transform, to modify the samples and labels respectively.

# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)

# We pass the Dataset as an argument to DataLoader. This wraps an iterable over our dataset, and supports automatic batching, sampling,
# shuffling and multiprocess data loading. Here we define a batch size of 64, i.e. each element in the dataloader iterable will return
# a batch of 64 features and labels.

batch_size = 64

# Create data loaders.
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break


# 2. Creating Models
# ----------------------------------------------------------------------------------------------
# To define a neural network in PyTorch, we create a class that inherits
# from nn.Module. We define the layers of the network in the __init__ function
# and specify how data will pass through the network in the forward function.
# To accelerate operations in the neural network, we move it to an accelerator
# such as CUDA, MPS, MTIA, or XPU. If an accelerator is available, we use it;
# otherwise, we fall back to the CPU.
device = torch.accelerator.current_accelerator().type if torch.accelerator.is_available() else "cpu"
print(f"Using {device} device")

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)


# 3. Optimizing the Model Parameters
# ----------------------------------------------------------------------------------------------
# To train a model, we need a loss function and an optimizer.

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# In a single training loop, the model makes predictions on the training
# dataset (fed to it in batches), and backpropagates the prediction error
# to adjust the model’s parameters.

def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")

# We also check the model’s performance against the test dataset to ensure
# it is learning.

def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")

epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")

# 4. Saving Models
# ----------------------------------------------------------------------------------------------
# A common way to save a model is to serialize the internal state dictionary
# (containing the model parameters).

torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model State to model.pth")


# 5. Loading Models
# ----------------------------------------------------------------------------------------------
# The process for loading a model includes re-creating the model structure and
# loading the state dictionary into it.

model = NeuralNetwork().to(device)
model.load_state_dict(torch.load("model.pth", weights_only=True))

# This model can now be used to make predictions.

classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

model.eval()
x, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
    x = x.to(device)
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')

100%|██████████| 26.4M/26.4M [00:02<00:00, 11.9MB/s]
100%|██████████| 29.5k/29.5k [00:00<00:00, 195kB/s]
100%|██████████| 4.42M/4.42M [00:01<00:00, 3.58MB/s]
100%|██████████| 5.15k/5.15k [00:00<00:00, 15.0MB/s]


Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])
Shape of y: torch.Size([64]) torch.int64
Using cuda device
NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)
Epoch 1
-------------------------------
loss: 2.310729  [   64/60000]
loss: 2.293337  [ 6464/60000]
loss: 2.271836  [12864/60000]
loss: 2.256696  [19264/60000]
loss: 2.253461  [25664/60000]
loss: 2.219933  [32064/60000]
loss: 2.227298  [38464/60000]
loss: 2.192452  [44864/60000]
loss: 2.189365  [51264/60000]
loss: 2.152730  [57664/60000]
Test Error: 
 Accuracy: 42.2%, Avg loss: 2.151557 

Epoch 2
-------------------------------
loss: 2.167950  [   64/60000]
loss: 2.160891  [ 6464/60000]
loss: 2.096020  [12864/60000]
loss: 2.096901  [19264/60000]
loss: 2.065094  [

## 2. PyTorch tests (basic operations on GPU)
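This test exercises basic tensor commands such as `tensor`, `from_numpy`, `ones_like`, and `rand_like`, among others, to confirm that PyTorch is running correctly. Only the tensor from the attributes example is moved to the GPU (when one is available); the remaining operations run on CPU tensors.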
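
Because device placement is easy to lose when a variable is reassigned, the following sketch shows where tensors end up. It is an illustration only: it assumes the default device has not been changed with `torch.set_default_device`, and it falls back to the CPU when no GPU is visible.

```python
import torch

# Use the GPU when one is visible, otherwise stay on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

cpu_tensor = torch.rand(3, 4)      # factory functions default to the CPU
moved = cpu_tensor.to(device)      # tensor on `device`; cpu_tensor itself is unchanged

# A freshly created tensor does not inherit the device of earlier tensors.
fresh = torch.ones(4, 4)

# Pass device= explicitly to create a tensor next to an existing one.
on_device = torch.ones(4, 4, device=moved.device)

print(f"cpu_tensor: {cpu_tensor.device}")
print(f"moved:      {moved.device}")
print(f"fresh:      {fresh.device}")
print(f"on_device:  {on_device.device}")
```

In the cell below, `tensor = torch.ones(4, 4)` behaves like `fresh` here, which is why the printed tensors carry no `device='cuda:0'` suffix.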

In [2]:
import torch
import numpy as np

# Tensor Initialization
# ------------------------------------------------------------
# Tensors can be initialized in various ways. Take a look at the following
# examples:
print("\n===== Tensor initialization")

# - Directly from data
# Tensors can be created directly from data. The data type is automatically
# inferred.
print("\n----- Tensor initialization directly from data")
data = [[1, 2], [3, 4]]
x_data = torch.tensor(data)

# - From a NumPy array
# Tensors can be created from NumPy arrays (and vice versa - see Bridge
# with NumPy).
print("\n----- Tensor initialization from a NumPy array")
np_array = np.array(data)
x_np = torch.from_numpy(np_array)

# - From another tensor:
# The new tensor retains the properties (shape, datatype) of the
# argument tensor, unless explicitly overridden.
print("\n----- Tensor initialization from another tensor")

# retains the properties of x_data
x_ones = torch.ones_like(x_data)
print(f"Ones Tensor: \n {x_ones} \n")

# overrides the datatype of x_data
x_rand = torch.rand_like(x_data, dtype=torch.float)
print(f"Random Tensor: \n {x_rand} \n")

# - With random or constant values:
# shape is a tuple of tensor dimensions. In the functions below, it determines
# the dimensionality of the output tensor.
print("\n----- Tensor initialization with random or constant values")
shape = (2, 3,)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")


# Tensor Attributes
# ------------------------------------------------------------
# Tensor attributes describe their shape, datatype, and the device on which they are stored.

print("\n===== Tensor attributes")

tensor = torch.rand(3, 4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")


# Tensor Operations
# ------------------------------------------------------------
# Over 100 tensor operations, including transposing, indexing, slicing,
# mathematical operations, linear algebra, random sampling, and more are
# described in the PyTorch documentation.
#
# Each of them can be run on the GPU (at typically higher speeds than
# on a CPU).
print("\n===== Tensor operations")

# We move our tensor to the GPU if available
print("\n----- Move the tensor to the GPU (if available)")
if torch.cuda.is_available():
    tensor = tensor.to('cuda')
    print(f"Device tensor is stored on: {tensor.device}")


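# Standard numpy-like indexing and slicing.
# Note: this creates a new tensor on the CPU, replacing the one moved to the
# GPU above, so the remaining operations in this cell run on the CPU.
tensor = torch.ones(4, 4)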
tensor[:,1] = 0
print(tensor)

# Joining tensors
# You can use torch.cat to concatenate a sequence of tensors along a given
# dimension. See also torch.stack, another tensor joining op that is subtly
# different from torch.cat.
print("\n----- Joining tensors")
t1 = torch.cat([tensor, tensor, tensor], dim=1)
print(t1)

# Multiplying tensors
# This computes the element-wise product
print("\n----- Multiplying tensors")

print(f"tensor.mul(tensor) \n {tensor.mul(tensor)} \n")
# Alternative syntax:
print(f"tensor * tensor \n {tensor * tensor}")

# This computes the matrix multiplication between two tensors
print(f"tensor.matmul(tensor.T) \n {tensor.matmul(tensor.T)} \n")
# Alternative syntax:
print(f"tensor @ tensor.T \n {tensor @ tensor.T}")

# In-place operations
# Operations that have a _ suffix are in-place.
# For example: x.copy_(y), x.t_(), will change x.
print("\n----- In-place operations")
print(tensor, "\n")
tensor.add_(5)
print(tensor)


# Bridge with NumPy
# ------------------------------------------------------------
# Tensors on the CPU and NumPy arrays can share their underlying
# memory locations, and changing one will change the other.
print("\n===== Bridge with NumPy")

# Tensor to NumPy array
print("\n----- Tensor to NumPy array")

t = torch.ones(5)
print(f"t: {t}")
n = t.numpy()
print(f"n: {n}")

# A change in the tensor reflects in the NumPy array.
t.add_(1)
print(f"t: {t}")
print(f"n: {n}")

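# NumPy array to Tensor
# t and n still share memory from the conversion above, so a change in the
# NumPy array is reflected in the tensor.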
print("\n----- NumPy array to Tensor")

np.add(n, 1, out=n)
print(f"t: {t}")
print(f"n: {n}")


===== Tensor initialization

----- Tensor initialization directly from data

----- Tensor initialization from a NumPy array

----- Tensor initialization from another tensor
Ones Tensor: 
 tensor([[1, 1],
        [1, 1]]) 

Random Tensor: 
 tensor([[0.3678, 0.4588],
        [0.7819, 0.7682]]) 


----- Tensor initialization with random or constant values
Random Tensor: 
 tensor([[0.8846, 0.9072, 0.9605],
        [0.3528, 0.9535, 0.0923]]) 

Ones Tensor: 
 tensor([[1., 1., 1.],
        [1., 1., 1.]]) 

Zeros Tensor: 
 tensor([[0., 0., 0.],
        [0., 0., 0.]])

===== Tensor attributes
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu

===== Tensor operations

----- Move the tensor to the GPU (if available)
Device tensor is stored on: cuda:0
tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])

----- Joining tensors
tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1.

For more information on the methods tested here:
- [torch.Tensor](https://docs.pytorch.org/docs/stable/tensors.html)<br/>
  A `torch.Tensor` is a multi-dimensional matrix containing elements of a single data type.

- [torch.from_numpy](https://docs.pytorch.org/docs/stable/generated/torch.from_numpy.html)<br />
  Creates a `Tensor` from a `numpy.ndarray`<br />
  The returned tensor and `ndarray` share the same memory. Modifications to the tensor will be reflected in the `ndarray` and vice versa. The returned tensor is not resizable.

- [torch.ones_like](https://docs.pytorch.org/docs/stable/generated/torch.ones_like.html)<br />
  Returns a tensor filled with the scalar value 1, with the same size as `input`. `torch.ones_like(input)` is equivalent to `torch.ones(input.size(), dtype=input.dtype, layout=input.layout, device=input.device)`.

- [torch.rand_like](https://docs.pytorch.org/docs/stable/generated/torch.rand_like.html)<br />
  Returns a tensor with the same size as `input` that is filled with random numbers from a uniform distribution on the interval [0, 1). `torch.rand_like(input)` is equivalent to `torch.rand(input.size(), dtype=input.dtype, layout=input.layout, device=input.device)`.
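
The behaviours described in this list can be checked directly. The sketch below is a small CPU-only illustration of the memory sharing of `torch.from_numpy` and of the documented equivalences for `ones_like` and `rand_like`; the variable names are chosen for this example.

```python
import numpy as np
import torch

# from_numpy: the tensor and the ndarray share the same memory.
np_array = np.array([[1, 2], [3, 4]])
x_np = torch.from_numpy(np_array)
np_array[0, 0] = 100   # write through the array, read through the tensor
x_np[1, 1] = -1        # write through the tensor, read through the array
print(x_np)
print(np_array)

# ones_like: same size, dtype, layout and device as the input.
x_data = torch.tensor([[1, 2], [3, 4]])
x_ones = torch.ones_like(x_data)
explicit_ones = torch.ones(
    x_data.size(),
    dtype=x_data.dtype,
    layout=x_data.layout,
    device=x_data.device,
)
print(torch.equal(x_ones, explicit_ones))

# rand_like: the integer input needs a floating-point dtype override,
# and the values are drawn from the interval [0, 1).
x_rand = torch.rand_like(x_data, dtype=torch.float)
print(x_rand.dtype, x_rand.shape)
```

Note that `torch.tensor(data)` copies its input, so a tensor created that way does not track later changes to the original list or array.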

## 3. PyTorch GPU operations

The following test runs several operations directly on the GPU to confirm that the Jupyter notebook and the GPU work together correctly in this environment.
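
The cell below infers the backend from the device name. As a complementary check, the sketch here reads the build metadata in `torch.version` instead. It is a hedged alternative, not the method used by this notebook: `describe_backend` is a hypothetical helper, and it reports what the installed PyTorch build targets, which is separate from whether a GPU is actually visible.

```python
import torch


def describe_backend():
    """Return (build label, whether a GPU is visible to PyTorch)."""
    # ROCm builds set torch.version.hip; CUDA builds set torch.version.cuda.
    # getattr with a default guards against builds lacking either attribute.
    if getattr(torch.version, "hip", None):
        build = "AMD ROCm"
    elif getattr(torch.version, "cuda", None):
        build = "NVIDIA CUDA"
    else:
        build = "CPU-only build"
    return build, torch.cuda.is_available()


backend, gpu_available = describe_backend()
print(f"PyTorch build targets: {backend}")
print(f"GPU visible to PyTorch: {gpu_available}")
```

Unlike the cell below, this sketch does not raise when no GPU is present, so it can also be used to diagnose a CPU-only installation.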

In [3]:
import torch
import torch.nn as nn

print("Simple GPU Tensor Operations Test")
print("=" * 40)

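# Detect an available GPU device. ROCm builds of PyTorch also expose AMD GPUs
# through the torch.cuda API, so a single check covers both CUDA and ROCm.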
if torch.cuda.is_available():
    device = 'cuda'
    device_name = torch.cuda.get_device_name(0)

    # Try to determine if we're using NVIDIA CUDA or AMD ROCm
    if 'NVIDIA' in device_name or 'GeForce' in device_name or 'Tesla' in device_name or 'Quadro' in device_name:
        backend = "NVIDIA CUDA"
    elif 'AMD' in device_name or 'Radeon' in device_name or 'gfx' in device_name:
        backend = "AMD ROCm"
    else:
        backend = "CUDA-compatible GPU"

else:
    raise RuntimeError("No GPU device available (neither CUDA nor ROCm detected)")

print(f"Detected GPU backend: {backend}")
print(f"Target device: {device}")
print(f"Device name: {device_name}")
print()

# 1. Create tensors directly on GPU
print("1. Creating tensors directly on GPU:")
a = torch.randn(3, 4, device=device)
b = torch.ones(3, 4, device=device)
c = torch.zeros(2, 5, device=device)

print(f"   Tensor a device: {a.device}")
print(f"   Tensor b device: {b.device}")
print(f"   Tensor c device: {c.device}")
print(f"   Tensor a:\n{a}")
print()

# 2. Basic arithmetic operations
print("2. Basic arithmetic operations (all on GPU):")
result_add = a + b
result_mul = a * b
result_mm = torch.mm(a, b.T)  # Matrix multiplication

print(f"   Addition result device: {result_add.device}")
print(f"   Multiplication result device: {result_mul.device}")
print(f"   Matrix mult result device: {result_mm.device}")
print(f"   Matrix multiplication result shape: {result_mm.shape}")
print()

# 3. Create tensor on CPU and move to GPU
print("3. Moving tensor from CPU to GPU:")
cpu_tensor = torch.randn(2, 3)
print(f"   Original tensor device: {cpu_tensor.device}")

gpu_tensor = cpu_tensor.to(device)
print(f"   Moved tensor device: {gpu_tensor.device}")
print()

# 4. More complex operations
print("4. Complex tensor operations on GPU:")
x = torch.randn(100, 50, device=device)
y = torch.randn(50, 30, device=device)

# Chain of operations
result = torch.relu(torch.mm(x, y))
result = torch.softmax(result, dim=1)
mean_result = torch.mean(result, dim=0)

print(f"   Input x device: {x.device}")
print(f"   Input y device: {y.device}")
print(f"   Final result device: {mean_result.device}")
print(f"   Final result shape: {mean_result.shape}")
print(f"   Final result sample: {mean_result[:5]}")
print()

# 5. Neural network operations
print("5. Neural network operations on GPU:")

class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 20)
        self.fc2 = nn.Linear(20, 5)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Create model and move to GPU
model = SimpleNet().to(device)

# Check if model parameters are on GPU
print(f"   Model parameters device: {next(model.parameters()).device}")

# Create input data on GPU
input_data = torch.randn(32, 10, device=device)
print(f"   Input data device: {input_data.device}")

# Forward pass
with torch.no_grad():
    output = model(input_data)

print(f"   Model output device: {output.device}")
print(f"   Output shape: {output.shape}")
print()

# 6. Gradient computation on GPU
print("6. Gradient computation on GPU:")
x = torch.randn(5, 3, device=device, requires_grad=True)
y = torch.randn(5, 3, device=device)

print(f"   Input x device: {x.device}")
print(f"   Target y device: {y.device}")

# Compute loss
loss = torch.mean((x - y) ** 2)
print(f"   Loss device: {loss.device}")

# Backward pass
loss.backward()
print(f"   Gradient device: {x.grad.device}")
print(f"   Gradient shape: {x.grad.shape}")
print()

# 7. In-place operations on GPU
print("7. In-place operations on GPU:")
tensor = torch.ones(3, 3, device=device)
print(f"   Original tensor device: {tensor.device}")
print(f"   Original tensor:\n{tensor}")

# In-place operations
tensor.add_(2.0)  # Add 2 to all elements
tensor.mul_(3.0)  # Multiply all elements by 3

print(f"   Modified tensor device: {tensor.device}")
print(f"   Modified tensor:\n{tensor}")
print()

print("All operations completed successfully on GPU!")
print(f"All tensors maintained {device.upper()} device throughout operations!")
print(f"Backend used: {backend}")

Simple GPU Tensor Operations Test
========================================
Detected GPU backend: AMD ROCm
Target device: cuda
Device name: AMD Instinct MI300X VF

1. Creating tensors directly on GPU:
   Tensor a device: cuda:0
   Tensor b device: cuda:0
   Tensor c device: cuda:0
   Tensor a:
tensor([[-0.7475,  1.6155,  1.0797, -1.9877],
        [ 0.5451, -2.5021,  1.8317,  0.8946],
        [ 0.4863, -0.1370,  1.2617,  1.4519]], device='cuda:0')

2. Basic arithmetic operations (all on GPU):
   Addition result device: cuda:0
   Multiplication result device: cuda:0
   Matrix mult result device: cuda:0
   Matrix multiplication result shape: torch.Size([3, 3])

3. Moving tensor from CPU to GPU:
   Original tensor device: cpu
   Moved tensor device: cuda:0

4. Complex tensor operations on GPU:
   Input x device: cuda:0
   Input y device: cuda:0
   Final result device: cuda:0
   Final result shape: torch.Size([30])
   Final result sample: tensor([0.0291, 0.0143, 0.0169, 0.0223, 0.0712], device='cuda:0')

5. Neural network operation