# Simple tutorial extracted from ChatGPT about how to use PyTorch with a GPU

To run a PyTorch project on a GPU, you need a PyTorch build with CUDA support (CUDA is NVIDIA's parallel computing platform) and code that explicitly places the model and data on the GPU. Here’s a step-by-step guide to help you run your PyTorch project on a GPU:

## 1) Install PyTorch with GPU Support
> First, you need to install the correct version of PyTorch that supports CUDA. You can install it using `pip` or `conda`, depending on your environment.

> Using `pip`:

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```

> Here, `cu118` refers to CUDA 11.8. The pip wheels bundle their own CUDA runtime, so choose a version that your installed NVIDIA driver supports.

> Using `conda`:

```bash
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
```

## 2) Check if CUDA is Available
After installation, ensure that PyTorch can detect your GPU. You can do this by running the following code:

```python
import torch

if torch.cuda.is_available():
    print(f"CUDA is available! GPU name: {torch.cuda.get_device_name(0)}")
else:
    print("CUDA is not available. Running on CPU.")
```

Output:

```
CUDA is available! GPU name: NVIDIA GeForce RTX 4060 Ti
```


If CUDA is not available, check your NVIDIA driver, your CUDA Toolkit (if you rely on one), and that the installed PyTorch build has GPU support. A CPU-only build reports `torch.version.cuda` as `None`.
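When the check fails, a few more values can help narrow down the cause. The following is a minimal diagnostic sketch, not an exhaustive checklist; it only prints what PyTorch itself reports and runs on machines without a GPU as well.

```python
import torch

# Version of the installed PyTorch package
print("PyTorch version:", torch.__version__)

# CUDA version the build was compiled against; None for a CPU-only build
print("Built with CUDA:", torch.version.cuda)

# Number of GPUs PyTorch can see (0 when no usable driver or device is found)
gpu_count = torch.cuda.device_count()
print("Visible GPUs:", gpu_count)

for index in range(gpu_count):
    print(f"  [{index}] {torch.cuda.get_device_name(index)}")
```

If "Built with CUDA" prints `None`, the installed package is most likely a CPU-only build and reinstalling with the command from step 1 is the usual fix. If it prints a version but no GPUs are visible, the driver is the more likely suspect.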

## 3) Move Your Model and Tensors to GPU
To use the GPU, move your model and data to it with `.to('cuda')` or `.cuda()`.

Example:

```python
import torch
import torch.nn as nn

# Check if CUDA is available and set the device accordingly
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define a simple model
model = nn.Linear(10, 1)

# Move the model to GPU
model.to(device)

# Create a random input tensor and move it to GPU
input_tensor = torch.randn(10).to(device)

# Forward pass (on GPU)
output = model(input_tensor)

print(output)
```

Output:

```
tensor([0.0110], device='cuda:0', grad_fn=<ViewBackward0>)
```

```python
type(input_tensor)
```

Output:

```
torch.Tensor
```

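In this example:
* The model and tensor are moved to the GPU with `.to(device)`, where `device` is `'cuda'` when available and `'cpu'` otherwise.
* Operations on a model and tensors that live on the GPU run on the GPU, and their results stay there, as `device='cuda:0'` in the output shows. The operands of an operation generally have to be on the same device; mixing CPU and GPU tensors raises a `RuntimeError`.
* The Python type is `torch.Tensor` regardless of device. Use the `.device` attribute to see where a tensor lives.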
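As a complement to `.to(device)`, tensors can also be created on the target device directly, and results can be copied back to the CPU when needed. The sketch below assumes nothing beyond the `device` pattern used above; the tensor names and shapes are illustrative.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Allocate directly on the target device instead of creating on CPU and copying
x = torch.randn(4, 10, device=device)
w = torch.ones(10, 1, device=device)

# The result lives on the same device as its operands
y = x @ w
print(x.device, y.device)

# Copy the result back to the CPU, e.g. before handing it to CPU-only code
y_cpu = y.cpu()
print(y_cpu.device)
```

Passing `device=` at creation time avoids an extra host-to-device copy, which can matter when tensors are created inside a loop.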

## 4) Training a Model on the GPU
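When training a model, move the model and the data (inputs and labels) to the GPU. A loss function without parameters or buffers, such as `nn.MSELoss`, does not need to be moved; one that holds tensors (for example, class weights passed to `nn.CrossEntropyLoss`) should be moved with `.to(device)` as well.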

Example of a simple training loop:

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Set device to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define a simple model and move it to the GPU
model = nn.Linear(10, 1).to(device)

# Define a loss function and an optimizer
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Create some random data (100 samples, 10 features) and move to GPU
inputs = torch.randn(100, 10).to(device)
targets = torch.randn(100, 1).to(device)

# Training loop
for epoch in range(100):
    optimizer.zero_grad()  # Zero the gradients

    # Forward pass
    outputs = model(inputs)

    # Compute loss
    loss = criterion(outputs, targets)

    # Backward pass and optimization
    loss.backward()
    optimizer.step()

    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch+1}/100], Loss: {loss.item():.4f}')
```

Output:

```
Epoch [10/100], Loss: 1.1432
Epoch [20/100], Loss: 1.0414
Epoch [30/100], Loss: 0.9763
Epoch [40/100], Loss: 0.9343
Epoch [50/100], Loss: 0.9068
Epoch [60/100], Loss: 0.8885
Epoch [70/100], Loss: 0.8763
Epoch [80/100], Loss: 0.8679
Epoch [90/100], Loss: 0.8621
Epoch [100/100], Loss: 0.8581
```
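The loop above keeps the whole dataset on the GPU, which works for 100 samples but not for data that exceeds GPU memory. A common alternative is to keep the data on the CPU and move one mini-batch at a time. The sketch below assumes a `DataLoader` over synthetic data; the batch size and epoch count are arbitrary illustrative choices.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Synthetic data stays on the CPU; only the current batch is copied to the device
dataset = TensorDataset(torch.randn(100, 10), torch.randn(100, 1))
loader = DataLoader(dataset, batch_size=20, shuffle=True)

model = nn.Linear(10, 1).to(device)
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

for epoch in range(5):
    running_loss = 0.0
    for batch_inputs, batch_targets in loader:
        # Move just this mini-batch to the same device as the model
        batch_inputs = batch_inputs.to(device)
        batch_targets = batch_targets.to(device)

        optimizer.zero_grad()
        loss = criterion(model(batch_inputs), batch_targets)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
    print(f"Epoch [{epoch + 1}/5], mean batch loss: {running_loss / len(loader):.4f}")
```

Moving data per batch keeps GPU memory use proportional to the batch size rather than the dataset size, at the cost of a host-to-device copy on every step.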


## 5) Key Things to Remember:
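1. Move everything to the GPU: Ensure that the model, input tensors, and labels are moved with `.to(device)`, where `device` is `'cuda'` when available. Tensors computed from GPU tensors are already on the GPU, but any new tensor you create along the way needs `device=device` (or `.to(device)`).
2. Check CUDA availability: Always check `torch.cuda.is_available()` so that the same code falls back to the CPU on systems without a GPU.
3. Utilize multiple GPUs: If you have multiple GPUs, you can use `torch.nn.DataParallel` or `torch.nn.parallel.DistributedDataParallel`. Both replicate the model on each GPU and process a different slice of the data on each replica; neither splits the model itself. The PyTorch documentation recommends `DistributedDataParallel`, even on a single machine.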


Using Multiple GPUs:

```python
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
model.to(device)
```

Output:

```
Linear(in_features=10, out_features=1, bias=True)
```
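Because the wrapper is applied only when more than one GPU is present, later code may receive either a plain module or a `DataParallel` instance. The following is a small sketch of one way to handle both cases; `wrap_for_multi_gpu` and `unwrap` are hypothetical helper names introduced here for illustration, not PyTorch functions.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def wrap_for_multi_gpu(model: nn.Module) -> nn.Module:
    """Hypothetical helper: wrap in DataParallel only when several GPUs are visible."""
    if torch.cuda.device_count() > 1:
        model = nn.DataParallel(model)
    return model.to(device)


def unwrap(model: nn.Module) -> nn.Module:
    """Hypothetical helper: return the underlying module, wrapped or not."""
    return model.module if isinstance(model, nn.DataParallel) else model


model = wrap_for_multi_gpu(nn.Linear(10, 1))

# One batch of 8 samples; under DataParallel it is split along dimension 0
batch = torch.randn(8, 10, device=device)
output = model(batch)

print(output.shape)
print(type(unwrap(model)).__name__)
```

Unwrapping is mainly useful when reading attributes or saving weights, since `DataParallel` stores the original model under `.module` and prefixes its `state_dict` keys with `module.`.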

## 6) Useful PyTorch CUDA Commands:
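* Check the number of available GPUs:

```python
print(torch.cuda.device_count())
```

Output:

```
1
```

* Get the name of GPU 0:

```python
print(torch.cuda.get_device_name(0))
```

Output:

```
NVIDIA GeForce RTX 4060 Ti
```

* Release cached GPU memory (useful after debugging). This returns unused memory held by PyTorch's caching allocator to the driver; it does not free tensors that are still referenced:

```python
torch.cuda.empty_cache()
```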
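To see what `empty_cache()` actually changes, it can help to compare the memory occupied by live tensors with the memory PyTorch has reserved from the driver. The sketch below uses `torch.cuda.memory_allocated()` and `torch.cuda.memory_reserved()`; the `report` helper and the tensor size are illustrative assumptions, and no particular numbers are guaranteed.

```python
import torch


def report(label: str) -> None:
    """Illustrative helper: print allocated vs. reserved GPU memory in MiB."""
    allocated = torch.cuda.memory_allocated() / 2**20
    reserved = torch.cuda.memory_reserved() / 2**20
    print(f"{label}: allocated={allocated:.1f} MiB, reserved={reserved:.1f} MiB")


if torch.cuda.is_available():
    x = torch.randn(1024, 1024, device="cuda")  # about 4 MiB of float32 values
    report("after allocation")

    del x  # drop the last reference so the block returns to PyTorch's cache
    report("after del")

    torch.cuda.empty_cache()  # hand cached, unused blocks back to the driver
    report("after empty_cache")
else:
    print("CUDA is not available; nothing to report.")
```

Typically the allocated figure drops after `del`, while the reserved figure drops only after `empty_cache()`, which is why other processes and `nvidia-smi` may still see the memory as in use until the cache is released.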

## 7) Example Full Workflow:

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Set device to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define a simple model and move it to the GPU
model = nn.Sequential(
    nn.Linear(10, 128),
    nn.ReLU(),
    nn.Linear(128, 1)
).to(device)

# Loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Random data
inputs = torch.randn(100, 10).to(device)
targets = torch.randn(100, 1).to(device)

# Put the model in training mode once, before the loop
model.train()

# Training loop
for epoch in range(100):
    optimizer.zero_grad()

    outputs = model(inputs)
    loss = criterion(outputs, targets)

    loss.backward()
    optimizer.step()

    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch+1}/100], Loss: {loss.item():.4f}')

torch.cuda.empty_cache()
```

Output:

```
Epoch [10/100], Loss: 1.0222
Epoch [20/100], Loss: 0.9496
Epoch [30/100], Loss: 0.8930
Epoch [40/100], Loss: 0.8436
Epoch [50/100], Loss: 0.7959
Epoch [60/100], Loss: 0.7472
Epoch [70/100], Loss: 0.6965
Epoch [80/100], Loss: 0.6452
Epoch [90/100], Loss: 0.5932
Epoch [100/100], Loss: 0.5410
```
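After training, predictions are usually computed without gradient tracking and then copied back to the CPU. The following is a minimal sketch of that step; it uses a freshly initialized model with the same architecture as a stand-in for the trained one, and the input batch is random illustrative data.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the trained model; same architecture as the workflow above
model = nn.Sequential(
    nn.Linear(10, 128),
    nn.ReLU(),
    nn.Linear(128, 1)
).to(device)

# Switch layers such as dropout and batch norm to inference behavior
model.eval()

# New inputs must be on the same device as the model
new_inputs = torch.randn(5, 10, device=device)

# Skip gradient tracking during inference to save memory and time
with torch.no_grad():
    predictions = model(new_inputs)

# Copy the result to the CPU and convert it to plain Python lists
predictions_list = predictions.cpu().tolist()
print(len(predictions_list), "predictions computed on", predictions.device)
```

The explicit `.cpu()` call matters because code outside PyTorch generally cannot read GPU memory directly.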
