[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/KedoKudo/DT_GNN_Tutorial/blob/tree/main/notebooks/02_pytorch_basics.ipynb)

# PyTorch Basics

PyTorch provides excellent introductory tutorials on its [website](https://pytorch.org/tutorials/).
We recommend that beginners work through them step by step.
This notebook covers only a few key points about using PyTorch, based on our experience.

## Table of Contents

0. Detect GPU
0. Numpy and PyTorch.Tensor
0. Basic training procedure
0. Save and load models

In [1]:
# uncomment the following line to install the required packages
# !pip install torch torchvision matplotlib numpy

## 0. Detect GPU

PyTorch can automatically detect whether a GPU is available.
Although PyTorch also runs on a CPU, we strongly recommend using a GPU for training.

- For x86_64 machines with a CUDA-capable GPU, use

```python
import torch
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
```

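- For Apple silicon, use

```python
import torch
# Fall back to the CPU if the MPS backend is not available
device = torch.device('mps' if torch.backends.mps.is_available() else 'cpu')
```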

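If there are multiple GPUs, you can select one by its index:

```python
import torch
# 'cuda:1' is the second GPU (indices start at 0); fall back to the CPU if it does not exist
device = torch.device('cuda:1' if torch.cuda.device_count() > 1 else 'cpu')
```
This selects the second GPU, because device indices start at 0.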
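
The checks above can be combined into a single helper. The following is a minimal sketch and not part of the original notebook: the function name `pick_device` and its `preferred_index` parameter are hypothetical, and the `getattr` guard only covers older PyTorch builds that do not ship `torch.backends.mps`.

```python
import torch


def pick_device(preferred_index: int = 0) -> torch.device:
    """Return a CUDA device if present, else MPS, else the CPU (hypothetical helper)."""
    if torch.cuda.is_available():
        # Use the requested GPU only if it exists; otherwise fall back to GPU 0
        index = preferred_index if preferred_index < torch.cuda.device_count() else 0
        return torch.device(f"cuda:{index}")

    # torch.backends.mps may be missing on older PyTorch builds
    mps_backend = getattr(torch.backends, "mps", None)
    if mps_backend is not None and mps_backend.is_available():
        return torch.device("mps")

    return torch.device("cpu")


device = pick_device()
print(device)
```

With a helper like this, the rest of the notebook can refer to `device` without caring which hardware is present.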

In [2]:
# This cell was run on a MacBook Pro with an M1 Pro chip; update the device name to match your hardware
import torch
device = torch.device('mps')
print(device)

mps


> Not all features are supported on Apple silicon yet, so we recommend using x86_64 for now if possible.

## 1. Numpy and PyTorch.Tensor

PyTorch offers an easy way to convert between Numpy arrays and its tensors:

In [3]:
import numpy as np

# Convert a Numpy array to a Torch tensor
numpy_array = np.array([1, 2, 3, 4, 5])
tensor_from_numpy = torch.from_numpy(numpy_array)
print(tensor_from_numpy)

# Convert a Torch tensor to a Numpy array
tensor = torch.tensor([1, 2, 3, 4, 5])
numpy_from_tensor = tensor.numpy()
print(numpy_from_tensor)

tensor([1, 2, 3, 4, 5])
[1 2 3 4 5]
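
One detail of the conversion above is worth knowing: `torch.from_numpy` returns a tensor that shares memory with the source array, whereas `torch.tensor` makes a copy. The short sketch below illustrates the difference; the variable names are chosen here for illustration only.

```python
import numpy as np
import torch

source = np.array([1, 2, 3])

shared = torch.from_numpy(source)  # shares memory with `source`
copied = torch.tensor(source)      # independent copy of the data

# Modify the NumPy array in place
source[0] = 100

print(shared)  # tensor([100,   2,   3]) -- the change is visible
print(copied)  # tensor([1, 2, 3])       -- the copy is unaffected
```

The same sharing applies in the other direction: for a CPU tensor, `tensor.numpy()` returns an array backed by the tensor's memory.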


#### DO NOT mix NumPy arrays and PyTorch tensors in the same operation.

In [4]:
# An error occurs if you combine a NumPy array and a Torch tensor in one operation
numpy_array * tensor_from_numpy

TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and 'Tensor'
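
To avoid this error, convert one operand explicitly so that both sides have the same type. A minimal sketch, with variable names chosen here for illustration:

```python
import numpy as np
import torch

numpy_array = np.array([1, 2, 3, 4, 5])
tensor = torch.tensor([1, 2, 3, 4, 5])

# Option 1: convert the array to a tensor and stay in PyTorch
product_tensor = torch.from_numpy(numpy_array) * tensor
print(product_tensor)  # tensor([ 1,  4,  9, 16, 25])

# Option 2: convert the (CPU) tensor to an array and stay in NumPy
product_array = numpy_array * tensor.numpy()
print(product_array)   # [ 1  4  9 16 25]
```

Option 1 is usually the better fit when the result feeds into a model, because the data stays in PyTorch.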

#### Until a tensor is explicitly sent to GPU, it is always on CPU.

In [5]:
tensor_from_numpy.device

device(type='cpu')

Make sure to assign the return value of `.to(device)` if you want to keep a handle to the data on the GPU: `.to()` returns a new tensor and does not move the original in place.

In [6]:
# The result of .to(device) is discarded, so the handle still points to the CPU tensor
tensor_from_numpy.to(device)
tensor_from_numpy.device

device(type='cpu')

In [7]:
# The handle now points to the GPU tensor
tensor_from_numpy = tensor_from_numpy.to(device)
tensor_from_numpy.device

device(type='mps', index=0)
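
The reverse direction needs the same care: calling `.numpy()` directly on a GPU tensor raises an error, so the tensor has to be copied back to the CPU first. The sketch below assumes a CUDA device when one is available and otherwise stays on the CPU; the variable names are illustrative.

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

gpu_tensor = torch.tensor([1.0, 2.0, 3.0]).to(device)

# detach() drops any autograd history, cpu() copies the data to host memory,
# and numpy() then exposes it as a NumPy array
numpy_copy = gpu_tensor.detach().cpu().numpy()
print(numpy_copy)  # [1. 2. 3.]
```

The `detach().cpu().numpy()` chain also works when the tensor is already on the CPU, so it is safe to use regardless of the device.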

## 2. Basic training procedure

PyTorch provides a very flexible way to train a model, along with many useful tools that simplify the training procedure.
However, we still recommend that beginners write out the basic training procedure themselves instead of relying on the bundled functions.

In [8]:
import torch.nn as nn

# Generate synthetic data
x = np.linspace(0, 10, 100)
y = 2 * x + 1 + np.random.randn(100) * 0.5  # y = 2x + 1 + noise

# Convert data to PyTorch tensors
x_tensor = torch.FloatTensor(x).view(-1, 1)
y_tensor = torch.FloatTensor(y).view(-1, 1)

# Model Definition
class LinearRegression(nn.Module):
    def __init__(self):
        super(LinearRegression, self).__init__()
        self.linear = nn.Linear(1, 1)  # Single input and single output

    def forward(self, x):
        return self.linear(x)

model = LinearRegression()

# Loss and optimizer
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Training loop
num_epochs = 200
for epoch in range(num_epochs):
    # Forward pass
    y_pred = model(x_tensor)

    # Compute loss
    loss = criterion(y_pred, y_tensor)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print loss every 20 epochs
    if (epoch+1) % 20 == 0:
        print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item()}")

# Get the learned parameters
learned_weight = model.linear.weight.item()
learned_bias = model.linear.bias.item()
print(f"Trained model: y = {learned_weight:.4f}x + {learned_bias:.4f}")


Epoch [20/200], Loss: 0.3957580327987671
Epoch [40/200], Loss: 0.3622252643108368
Epoch [60/200], Loss: 0.33474501967430115
Epoch [80/200], Loss: 0.312224805355072
Epoch [100/200], Loss: 0.293769508600235
Epoch [120/200], Loss: 0.27864524722099304
Epoch [140/200], Loss: 0.2662506103515625
Epoch [160/200], Loss: 0.2560933232307434
Epoch [180/200], Loss: 0.24776943027973175
Epoch [200/200], Loss: 0.24094779789447784
Trained model: y = 2.0640x + 0.6144
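
The `optimizer.zero_grad()` call in the loop above matters because PyTorch accumulates gradients across `backward()` calls instead of overwriting them. The following sketch shows this on a single scalar parameter; it resets the gradient with `w.grad.zero_()` as a stand-in for what the optimizer does for all model parameters.

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

# First backward pass: d(3w)/dw = 3
(3.0 * w).backward()
first = w.grad.item()
print(first)   # 3.0

# Second backward pass without resetting: the new gradient is added to the old one
(3.0 * w).backward()
second = w.grad.item()
print(second)  # 6.0

# Reset the gradient, then run backward again
w.grad.zero_()
(3.0 * w).backward()
third = w.grad.item()
print(third)   # 3.0
```

Without the reset, each update step would use the sum of all gradients computed so far rather than the gradient of the current loss.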


The example above runs on the CPU by default. To run it on a GPU, move both the data and the model to `device` with `.to(device)`, where `device` is set to `cuda`, `cuda:1`, etc., as in the following cell:

In [9]:
x = np.linspace(0, 10, 100)
y = 2 * x + 1 + np.random.randn(100) * 0.5  # y = 2x + 1 + noise

# Convert data to PyTorch tensors and move them to the specified device
x_tensor = torch.FloatTensor(x).view(-1, 1).to(device)
y_tensor = torch.FloatTensor(y).view(-1, 1).to(device)

# Model Definition
class LinearRegression(nn.Module):
    def __init__(self):
        super(LinearRegression, self).__init__()
        self.linear = nn.Linear(1, 1)  # Single input and single output

    def forward(self, x):
        return self.linear(x)

model = LinearRegression().to(device)

# Loss and optimizer
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Training loop
num_epochs = 200
for epoch in range(num_epochs):
    # Forward pass
    y_pred = model(x_tensor)

    # Compute loss
    loss = criterion(y_pred, y_tensor)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print loss every 20 epochs
    if (epoch+1) % 20 == 0:
        print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item()}")

# Get the learned parameters
learned_weight = model.linear.weight.item()
learned_bias = model.linear.bias.item()
print(f"Trained model: y = {learned_weight:.4f}x + {learned_bias:.4f}")

Epoch [20/200], Loss: 0.3625805974006653
Epoch [40/200], Loss: 0.34780141711235046
Epoch [60/200], Loss: 0.3356897532939911
Epoch [80/200], Loss: 0.32576417922973633
Epoch [100/200], Loss: 0.31763020157814026
Epoch [120/200], Loss: 0.3109643757343292
Epoch [140/200], Loss: 0.3055015206336975
Epoch [160/200], Loss: 0.301024854183197
Epoch [180/200], Loss: 0.2973560690879822
Epoch [200/200], Loss: 0.2943494915962219
Trained model: y = 2.0351x + 0.7803


## 3. Save and load models

Once a model is trained, we can save it to disk and load it later for inference.

In [10]:
# Save the model weights
model_path = 'linear_regression_weights.pth'
torch.save(model.state_dict(), model_path)
print(f"Model weights saved to {model_path}")

Model weights saved to linear_regression_weights.pth
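
If the goal is to resume training later rather than only run inference, it can help to save the optimizer state and the epoch counter together with the weights. The sketch below is one way to do this under stated assumptions: the dictionary keys, the file name, and the `nn.Linear` stand-in model are illustrative choices, not something prescribed by this notebook.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in model and optimizer (the notebook uses LinearRegression instead)
model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Hypothetical checkpoint location and layout
checkpoint_path = os.path.join(tempfile.gettempdir(), "linear_regression_checkpoint.pth")
torch.save(
    {
        "epoch": 200,
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
    },
    checkpoint_path,
)

# Later: rebuild the objects, then restore their state from the checkpoint
checkpoint = torch.load(checkpoint_path, map_location="cpu")
restored_model = nn.Linear(1, 1)
restored_model.load_state_dict(checkpoint["model_state_dict"])
restored_optimizer = torch.optim.SGD(restored_model.parameters(), lr=0.01)
restored_optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
start_epoch = checkpoint["epoch"]
print(f"Resuming from epoch {start_epoch}")
```

For plain SGD without momentum the optimizer carries little state, but optimizers such as Adam keep running statistics that are lost if only the weights are saved.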


The next time we want to use the model or continue training it, we can load the weights from disk and pick up from there:

In [11]:
# Load the model weights
loaded_model = LinearRegression().to(device)
loaded_model.load_state_dict(torch.load(model_path, map_location=device))  # map_location places the weights on `device`
loaded_model.eval()  # Set the model to evaluation mode

# Now, the model is ready to make predictions
x_new = np.array([[10.0], [20.0]])
x_new_tensor = torch.FloatTensor(x_new).to(device)
y_pred = loaded_model(x_new_tensor)
print(y_pred)

tensor([[21.1310],
        [41.4816]], device='mps:0', grad_fn=<LinearBackward0>)
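
The `grad_fn=<LinearBackward0>` in the output above shows that PyTorch still tracked the computation for autograd, which is unnecessary for inference. Wrapping the prediction in `torch.no_grad()` turns this tracking off. A minimal sketch, using an untrained `nn.Linear` as a stand-in for the loaded model:

```python
import torch
import torch.nn as nn

inference_model = nn.Linear(1, 1)  # stand-in for the loaded model
inference_model.eval()

x_new = torch.tensor([[10.0], [20.0]])

# No computation graph is built inside this block
with torch.no_grad():
    y_no_grad = inference_model(x_new)

print(y_no_grad.requires_grad)  # False
print(y_no_grad.grad_fn)        # None
```

Note that `eval()` and `torch.no_grad()` do different things: the first switches layers such as dropout and batch normalization to inference behavior, while the second disables gradient tracking.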
