Introduction to PyTorch
--

This notebook introduces core PyTorch concepts from scratch: how to install the library, how PyTorch tensors compare with NumPy arrays, how to move data to a GPU, and how to compute basic statistics.

---

Installing PyTorch
--

You can install PyTorch locally or use it in Google Colab.

**CPU only (pip):**
```bash
pip install torch torchvision torchaudio
```

---

**GPU (CUDA):**

Check which CUDA version your system supports first (for example with `nvidia-smi`), then follow the installation instructions on the official site:

https://pytorch.org/get-started/locally/

Example for CUDA 11.8:

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```
---

**Google Colab:**

PyTorch is usually preinstalled, so you can import it directly and check the version:


In [None]:
import torch
print(torch.__version__)

2.8.0+cu126


Tensors (PyTorch) vs ndarrays (NumPy)
--

PyTorch's core data structure is the `torch.Tensor`. It is similar to `numpy.ndarray`, but a tensor can also be placed on a GPU and run its computations there.


In [10]:
import numpy as np
import torch

In [None]:
x = [1, 2, 3]

In [None]:
np.array(x)

array([1, 2, 3])

In [None]:
torch.tensor(x)

tensor([1, 2, 3])

In [None]:
# Create tensor from list
t = torch.tensor([[1, 2], [3, 4]])
print(t)

# Create NumPy array and convert to tensor
np_array = np.array([[1, 2], [3, 4]])
t_from_np = torch.from_numpy(np_array)
print(t_from_np)


tensor([[1, 2],
        [3, 4]])
tensor([[1, 2],
        [3, 4]])
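
One difference worth keeping in mind is the default floating-point dtype. The short sketch below uses only standard `torch` and `numpy` calls (the variable names are illustrative) to show that PyTorch infers `float32` from Python floats while NumPy infers `float64`, and that `torch.from_numpy` keeps whatever dtype the array already has.

```python
import numpy as np
import torch

floats = [1.0, 2.0, 3.0]

t_float = torch.tensor(floats)        # PyTorch default float dtype: float32
np_float = np.array(floats)           # NumPy default float dtype: float64
t_kept = torch.from_numpy(np_float)   # from_numpy preserves the array's dtype

print(t_float.dtype)   # torch.float32
print(np_float.dtype)  # float64
print(t_kept.dtype)    # torch.float64

# Cast explicitly when downstream code expects float32 inputs
t_cast = t_kept.to(torch.float32)
print(t_cast.dtype)    # torch.float32
```

Mixing `float32` and `float64` tensors is a common source of dtype errors, so an explicit cast right after conversion is often the simplest fix.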


## Basic Operations

Elementwise arithmetic and transposition use the same syntax in PyTorch and NumPy.


In [14]:
# NumPy
a = np.array([[1., 2.], [3., 4.]])
b = np.ones((2, 2))

In [16]:
a + b

array([[2., 3.],
       [4., 5.]])

In [17]:
a*b

array([[1., 2.],
       [3., 4.]])

In [18]:
a.T

array([[1., 3.],
       [2., 4.]])

In [19]:
# PyTorch
a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.ones((2, 2))


In [20]:
a + b

tensor([[2., 3.],
        [4., 5.]])

In [21]:
a*b

tensor([[1., 2.],
        [3., 4.]])

In [22]:
a.T

tensor([[1., 3.],
        [2., 4.]])
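
Because `b` is all ones, `a*b` above returns `a` unchanged, which can hide the fact that `*` multiplies element by element. As a small illustration (reusing the same `a` and `b`), the sketch below contrasts elementwise multiplication with matrix multiplication via the `@` operator, which behaves the same way in NumPy.

```python
import torch

a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.ones((2, 2))

elementwise = a * b   # equivalent to torch.mul(a, b)
matmul = a @ b        # equivalent to torch.matmul(a, b)

print(elementwise)
# tensor([[1., 2.],
#         [3., 4.]])
print(matmul)
# tensor([[3., 3.],
#         [7., 7.]])
```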

## Converting between NumPy and PyTorch

This is useful when working with datasets or integrating PyTorch models into existing NumPy-based code.


In [None]:
# Tensor → NumPy
t = torch.ones(3)
np_version = t.numpy()
np_version

array([1., 1., 1.], dtype=float32)

In [None]:
# NumPy → Tensor
arr = np.zeros(3)
torch_version = torch.from_numpy(arr)
torch_version

tensor([0., 0., 0.], dtype=torch.float64)
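
Both conversions shown above share memory with the original object when the tensor lives on the CPU, so an in-place change on one side is visible on the other. The sketch below demonstrates this and contrasts it with `torch.tensor`, which copies the data; the variable names are illustrative.

```python
import numpy as np
import torch

# NumPy -> Tensor: from_numpy shares memory, torch.tensor copies
arr = np.zeros(3)
shared = torch.from_numpy(arr)
copied = torch.tensor(arr)

arr[0] = 5.0          # modify the NumPy array in place
print(shared)         # tensor([5., 0., 0.], dtype=torch.float64)
print(copied)         # tensor([0., 0., 0.], dtype=torch.float64)

# Tensor -> NumPy: .numpy() on a CPU tensor also shares memory
t = torch.ones(3)
view = t.numpy()
t.add_(1)             # in-place add on the tensor
print(view)           # [2. 2. 2.]
```

If you need an independent copy, calling `.clone()` on the tensor or `.copy()` on the array before converting avoids surprises.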

## Using GPU

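PyTorch can run computations on a GPU when one is available. Tensors are created on the CPU by default and have to be moved to the GPU explicitly. The check below is run twice, first on a CPU-only runtime and then on a GPU runtime, to show both outcomes.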


In [None]:
print("CUDA available:", torch.cuda.is_available())

# Move tensor to GPU
if torch.cuda.is_available():
    t = torch.ones(3)
    t_gpu = t.to('cuda')
    print("Tensor on GPU:", t_gpu)
else:
    print("No GPU detected.")


CUDA available: False
No GPU detected.


In [None]:
print("CUDA available:", torch.cuda.is_available())

# Move tensor to GPU
if torch.cuda.is_available():
    t = torch.ones(3)
    t_gpu = t.to('cuda')
    print("Tensor on GPU:", t_gpu)
else:
    print("No GPU detected.")


CUDA available: True
Tensor on GPU: tensor([1., 1., 1.], device='cuda:0')


In [None]:
t_gpu = torch.ones(3, device='cuda')

In [None]:
t_gpu

tensor([1., 1., 1.], device='cuda:0')

In [None]:
# Default device is the CPU
t = torch.ones(3)
t

tensor([1., 1., 1.])

In [None]:
# From GPU to CPU
t_gpu = torch.ones(3, device='cuda')   # GPU
print(t_gpu.device)

t_cpu = t_gpu.to('cpu')
print(t_cpu.device)


cuda:0
cpu


In [None]:
# or
t_cpu = t_gpu.cpu()
print(t_cpu.device)

cpu
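
The cells above hard-code `'cuda'`, which fails on a machine without a GPU. A common pattern, sketched here with an illustrative `device` variable, is to pick the device once and pass it everywhere, so the same code runs on either kind of runtime.

```python
import torch

# Choose the GPU when present, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.ones(3, device=device)   # created directly on the chosen device
y = (x * 2).sum()                  # the result stays on the same device

# NumPy only works with CPU memory, so move the result back before converting
y_np = y.cpu().numpy()
print(device, y_np)
```

Calling `.cpu()` on a tensor that is already on the CPU is harmless, so the last step does not need its own device check.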


In [None]:
import time

# Setup: large tensor size for noticeable time difference
N = 10_000_000

# Create a large random tensor on CPU
x_cpu = torch.randn(N)

# Move same tensor to GPU (if available)
if torch.cuda.is_available():
    x_gpu = x_cpu.to('cuda')
else:
    raise RuntimeError("No GPU available for test.")

# CPU timing (CPU ops run synchronously, so no synchronize call is needed)
start_cpu = time.perf_counter()
y_cpu = x_cpu * 2.5
cpu_time = time.perf_counter() - start_cpu

# GPU timing
torch.cuda.synchronize()  # Make sure the copy to the GPU has finished
start_gpu = time.perf_counter()
y_gpu = x_gpu * 2.5
torch.cuda.synchronize()  # CUDA kernels run asynchronously; wait for completion
gpu_time = time.perf_counter() - start_gpu

print(f"CPU time: {cpu_time:.6f} seconds")
print(f"GPU time: {gpu_time:.6f} seconds")


CPU time: 0.019426 seconds
GPU time: 0.004059 seconds
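
A single measurement like the one above can be noisy and may include one-time start-up costs. The sketch below is one possible way to make the comparison steadier: it runs a few warm-up calls and averages over several repeats. `time_op` is a hypothetical helper written for this notebook, not a PyTorch function, and the tensor size is an arbitrary choice.

```python
import time
import torch

def time_op(fn, device, warmup=3, repeats=10):
    """Return the mean wall-clock seconds per call of fn() (hypothetical helper)."""
    use_cuda = torch.device(device).type == "cuda"
    for _ in range(warmup):          # warm-up runs are not timed
        fn()
    if use_cuda:
        torch.cuda.synchronize()     # finish pending GPU work before timing
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if use_cuda:
        torch.cuda.synchronize()     # wait for the timed GPU work to finish
    return (time.perf_counter() - start) / repeats

x_cpu = torch.randn(1_000_000)
cpu_time = time_op(lambda: x_cpu * 2.5, "cpu")
print(f"CPU: {cpu_time:.6f} s per call")

if torch.cuda.is_available():
    x_gpu = x_cpu.to("cuda")
    gpu_time = time_op(lambda: x_gpu * 2.5, "cuda")
    print(f"GPU: {gpu_time:.6f} s per call")
else:
    print("No GPU detected; skipping GPU timing.")
```

The measured numbers depend heavily on hardware and tensor size, so treat them as a rough comparison rather than a benchmark.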


## Statistical operations


PyTorch provides a variety of statistical operations similar to NumPy's. These include:

- `mean()` : mean of the elements.
- `std()` : standard deviation of the elements.
- `var()` : variance of the elements.
- `sum()`, `prod()` : total sum and product of the elements.
- `min()`, `max()` : smallest and largest values.
- `argmin()`, `argmax()` : indices of the smallest and largest values.
- `median()` : median value.
- `quantile()` : arbitrary quantile(s).
- `all()`, `any()` : boolean reductions.

By default, these operate on all elements. You can specify `dim` to perform
the operation along a particular axis.


In [None]:
data = torch.tensor([[2.0, 3.0, 7.0],
                     [1.0, 5.0, 4.0]])

In [60]:
data.mean()

tensor(3.6667)

In [61]:
data.std()

tensor(2.1602)
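
The value above will not match `np.std` on the same data, because the two libraries use different defaults: `Tensor.std()` divides by N - 1 (the sample estimate), whereas `np.std` divides by N. The sketch below shows both conventions on the same `data`; it assumes a PyTorch version that accepts the `correction` keyword, as recent releases do.

```python
import numpy as np
import torch

data = torch.tensor([[2.0, 3.0, 7.0],
                     [1.0, 5.0, 4.0]])

sample_std = data.std()                   # default: divides by N - 1
population_std = data.std(correction=0)   # divides by N
numpy_std = np.std(data.numpy())          # NumPy default (ddof=0): divides by N

print(sample_std)       # tensor(2.1602)
print(population_std)   # tensor(1.9720)
print(float(numpy_std))
```

The same default applies to `var()`, so keep the convention in mind when comparing results across the two libraries.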

In [None]:
data.sum()

tensor(22.)

In [None]:
data.prod()

tensor(840.)

In [None]:
(data.min(), data.max())

(tensor(1.), tensor(7.))

In [None]:
(data.argmin(), data.argmax())

(tensor(3), tensor(2))

In [None]:
# Find flattened index of min value
idx_flat = data.argmin()
print("Flattened index:", idx_flat)

# Convert to 2D coordinates (row, col)
rows, cols = data.shape
row_idx = idx_flat // cols
col_idx = idx_flat % cols
print(f"Row index: {row_idx.item()}, Col index: {col_idx.item()}")

print("Min value:", data[row_idx, col_idx])


Flattened index: tensor(3)
Row index: 1, Col index: 0
Min value: tensor(1.)
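
The manual division and modulo above work for 2D tensors. As an alternative sketch that also handles more dimensions, NumPy's `np.unravel_index` can turn the flattened index into coordinates; this example borrows the NumPy function rather than using a PyTorch one.

```python
import numpy as np
import torch

data = torch.tensor([[2.0, 3.0, 7.0],
                     [1.0, 5.0, 4.0]])

flat_idx = data.argmin().item()   # plain Python int: 3

# Convert the flat index into one coordinate per dimension
row, col = (int(i) for i in np.unravel_index(flat_idx, tuple(data.shape)))

print(row, col)          # 1 0
print(data[row, col])    # tensor(1.)
```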


In [None]:
# Median
data.median()

tensor(3.)
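
With an even number of elements, `Tensor.median()` returns the lower of the two middle values, while `np.median` returns their average. The sketch below compares the two on the same `data` and shows that `torch.quantile` at 0.5 gives the interpolated value.

```python
import numpy as np
import torch

data = torch.tensor([[2.0, 3.0, 7.0],
                     [1.0, 5.0, 4.0]])

torch_median = data.median()              # lower middle value of 1, 2, 3, 4, 5, 7
numpy_median = np.median(data.numpy())    # average of the two middle values
torch_q50 = torch.quantile(data, 0.5)     # linear interpolation, like NumPy

print(torch_median)          # tensor(3.)
print(float(numpy_median))   # 3.5
print(torch_q50)             # tensor(3.5000)
```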

In [None]:
# Quantile (e.g., 25% and 75%)
q_values = torch.tensor([0.25, 0.75])
torch.quantile(data, q_values)


tensor([2.2500, 4.7500])

In [None]:
# Variance
data.var()

tensor(4.6667)

In [None]:
# Operations along a specific dimension
data.mean(dim=0)  # reduce over rows: mean of each column

tensor([1.5000, 4.0000, 5.5000])
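
The `dim` argument names the dimension that gets collapsed. The sketch below, on the same `data`, reduces along each dimension in turn, uses `keepdim=True` to keep a broadcastable shape, and shows that `max` with `dim` returns both values and indices.

```python
import torch

data = torch.tensor([[2.0, 3.0, 7.0],
                     [1.0, 5.0, 4.0]])

col_sum = data.sum(dim=0)   # collapse rows: one value per column (3, 8, 11)
row_sum = data.sum(dim=1)   # collapse columns: one value per row (12, 10)

# keepdim=True keeps the reduced dimension with size 1, which helps broadcasting
row_mean = data.mean(dim=1, keepdim=True)   # shape (2, 1)
centered = data - row_mean                  # subtract each row's mean from that row

# With dim given, max returns the values and their positions along that dim
values, indices = data.max(dim=1)           # values (7, 5), indices (2, 1)

print(col_sum, row_sum)
print(row_mean.shape, centered.sum(dim=1))
print(values, indices)
```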