### PyTorch - Deep Learning Framework
- What is PyTorch?

`PyTorch` is a deep learning framework that enables building, training, and deploying neural networks. It's the backbone for training the U-Net segmentation model in VisionExtract.

- Why PyTorch for VisionExtract?

  - Dynamic computation graphs: The graph is built as the code runs, so models can use ordinary Python control flow and are easier to inspect than static graphs

  - Pythonic: Easy to debug and experiment with

  - Production-ready: Used by major companies (Tesla, Meta, etc.)

  - Ecosystem: Rich library of pre-trained models and utilities
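
The dynamic-graph point can be illustrated with a minimal sketch. The function name `scale_until_large` and its threshold are made up for this example. The idea is only that the number of loop iterations depends on the data, and autograd still records whichever operations actually ran.

```python
import torch

def scale_until_large(x: torch.Tensor, threshold: float = 10.0) -> torch.Tensor:
    """Double x until its norm reaches the threshold (data-dependent loop)."""
    while x.norm() < threshold:
        x = x * 2
    return x

x = torch.ones(3, requires_grad=True)
y = scale_until_large(x).sum()
y.backward()

# The loop ran three times here, so each element was scaled by 2**3 = 8
print(x.grad)  # tensor([8., 8., 8.])
```

A different input could take a different number of iterations, and the gradient would follow that path without any change to the code.
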
- #### Installation

In [2]:
%pip install torch torchvision torchaudio


#### 1. Tensors - PyTorch's Core Data Structure

- Tensors are multi-dimensional arrays, similar to NumPy arrays, that can also live on a GPU and track gradients for automatic differentiation.

In [4]:
import torch
import numpy as np

# Create tensors from scratch
tensor_zeros = torch.zeros(2, 3)
print(f"Zeros:\n{tensor_zeros}")

tensor_ones = torch.ones(2, 3)    # 2x3 ones
print(f"\nOnes:\n{tensor_ones}")

# Create a tensor from a Python list
tensor_from_list = torch.tensor([[4, 5, 6], [7, 8, 9]])
print(f"\nFrom list:\n{tensor_from_list}")

# NumPy ↔ PyTorch conversion (from_numpy shares memory with the array)
numpy_array = np.array([[1, 2], [3, 4]])
tensor_from_numpy = torch.from_numpy(numpy_array)
print(f"\nFrom NumPy:\n{tensor_from_numpy}")

# Back to NumPy
back_to_numpy = tensor_from_numpy.numpy()
print(f"\nBack to NumPy:\n{back_to_numpy}")

Zeros:
tensor([[0., 0., 0.],
        [0., 0., 0.]])

Ones:
tensor([[1., 1., 1.],
        [1., 1., 1.]])

From list:
tensor([[4, 5, 6],
        [7, 8, 9]])

From NumPy:
tensor([[1, 2],
        [3, 4]])

Back to NumPy:
[[1 2]
 [3 4]]
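
One detail of the NumPy conversion is easy to miss: `torch.from_numpy` shares memory with the source array, while `torch.tensor` copies the data. The sketch below shows the difference; the variable names are illustrative only.

```python
import numpy as np
import torch

array = np.array([[1, 2], [3, 4]])
shared = torch.from_numpy(array)  # shares memory with `array`
copied = torch.tensor(array)      # independent copy of the data

array[0, 0] = 100                 # modify the NumPy array in place

print(shared[0, 0].item())  # 100 (the change is visible through the tensor)
print(copied[0, 0].item())  # 1   (the copy is unaffected)
```

If an independent tensor is needed, `torch.tensor(array)` or `torch.from_numpy(array).clone()` avoids the aliasing.
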


- #### Key Properties:

In [5]:
tensor = torch.randn(2, 3, 224, 224)  # Batch of 2 images, 3 channels, 224x224

print(f"Shape: {tensor.shape}")        # torch.Size([2, 3, 224, 224])
print(f"Data type: {tensor.dtype}")    # torch.float32
print(f"Device: {tensor.device}")      # cpu or cuda
print(f"Requires grad: {tensor.requires_grad}")  # False by default

Shape: torch.Size([2, 3, 224, 224])
Data type: torch.float32
Device: cpu
Requires grad: False
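
These properties determine how much memory a tensor uses and whether autograd tracks it. The following is a small sketch using only standard tensor methods; the variable names are illustrative.

```python
import torch

images = torch.randn(2, 3, 224, 224)  # float32 by default: 4 bytes per element
print(images.nelement() * images.element_size())  # 1204224 bytes

# Converting to float16 halves the memory footprint
images_half = images.to(torch.float16)
print(images_half.nelement() * images_half.element_size())  # 602112 bytes

# Gradients are only tracked for tensors created with requires_grad=True
weights = torch.randn(3, requires_grad=True)
loss = (weights ** 2).sum()
loss.backward()
print(torch.allclose(weights.grad, 2 * weights.detach()))  # True: d/dw sum(w^2) = 2w
```
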


#### 2. Moving Tensors to GPU

  - GPUs can speed up neural network training dramatically, often by an order of magnitude or more. The actual gain depends on the model, the batch size, and the hardware.

In [7]:
# Check GPU Availability
print(f"CUDA Availability: {torch.cuda.is_available()}")
print(f"GPU count:{torch.cuda.device_count()}")
if torch.cuda.is_available():
  print(f"GPU name: {torch.cuda.get_device_name(0)}")

# Set device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

# Move tensor to GPU
tensor_cpu = torch.randn(1000, 1000)
tensor_gpu = tensor_cpu.to(device)
print(f"CPU tensor device: {tensor_cpu.device}")
print(f"GPU tensor device: {tensor_gpu.device}")

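# Compare CPU and GPU matrix multiplication
import time

# CPU operation
start = time.perf_counter()
result_cpu = torch.matmul(tensor_cpu, tensor_cpu)
cpu_time = time.perf_counter() - start

# GPU operation, timed from the host-to-device copy so transfer overhead is included
start = time.perf_counter()
tensor_gpu = tensor_cpu.to(device)
result_gpu = torch.matmul(tensor_gpu, tensor_gpu)
if device.type == 'cuda':
    torch.cuda.synchronize()  # CUDA kernels run asynchronously; wait before stopping the timer
gpu_time = time.perf_counter() - start

print(f"\nCPU time: {cpu_time:.4f}s")
print(f"GPU time: {gpu_time:.4f}s (including transfer overhead)")
print(f"Speedup: {cpu_time/gpu_time:.1f}x")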

CUDA Availability: False
GPU count:0
Using device: cpu
CPU tensor device: cpu
GPU tensor device: cpu

CPU time: 0.0230s
GPU time: 0.0226s (including transfer overhead)
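Speedup: 1.0x

- Note: No CUDA device was available for this run, so both timings were taken on the CPU and the speedup is about 1.0x.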
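
The same `device` pattern applies to models as well as tensors. The sketch below is a minimal device-agnostic example; the layer sizes and variable names are arbitrary choices for illustration, not part of VisionExtract.

```python
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

layer = nn.Linear(16, 4).to(device)         # move the layer's parameters to the device
batch = torch.randn(8, 16, device=device)   # create the input directly on the device

out = layer(batch)                          # inputs and parameters must be on the same device
print(out.shape, out.device)

# Bring the result back to the CPU before converting to NumPy
result = out.detach().cpu().numpy()
print(result.shape)  # (8, 4)
```

Creating tensors with `device=device` avoids an extra host-to-device copy, and the code runs unchanged on machines without a GPU.
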


#### 3. Building Neural Networks with nn.Module

 - PyTorch models are classes that inherit from `nn.Module`. Layers are declared in `__init__`, and `forward` defines how input flows through them. This is the standard way to define neural networks.

In [9]:
import torch
import torch.nn as nn

# Simple neural network with one hidden layer
class SimpleNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()

        # Define layers
        self.fc1 = nn.Linear(input_size, hidden_size)   # Input -> Hidden
        self.relu = nn.ReLU()                           # Activation
        self.fc2 = nn.Linear(hidden_size, num_classes)  # Hidden -> Output

    def forward(self, x):
        """Forward pass: defines how data flows through the network"""
        x = self.fc1(x)   # Linear layer
        x = self.relu(x)  # ReLU activation
        x = self.fc2(x)   # Output layer
        return x

# Create model instance
model = SimpleNet(input_size=784, hidden_size=128, num_classes=10)

# Test forward pass
dummy_input = torch.randn(42, 784)  # Batch of 42 samples, 784 features each (e.g. flattened 28x28 images)
output = model(dummy_input)
print(f"Input shape: {dummy_input.shape}")
print(f"Output shape: {output.shape}")

Input shape: torch.Size([42, 784])
Output shape: torch.Size([42, 10])
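
A quick way to sanity-check a model like this is to count its parameters. The sketch below rebuilds the same layer sizes with `nn.Sequential` so that it runs on its own; this is an equivalent stand-in for `SimpleNet`, and the name `net` is illustrative.

```python
import torch.nn as nn

# Same layer sizes as SimpleNet(784, 128, 10)
net = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

# Each Linear layer holds a weight matrix and a bias vector
for name, param in net.named_parameters():
    print(f"{name}: shape={tuple(param.shape)}, count={param.numel()}")

total = sum(p.numel() for p in net.parameters())
print(f"Total parameters: {total}")  # 784*128 + 128 + 128*10 + 10 = 101770
```
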


- Teaching Point: The architecture of `SimpleNet`, with shapes for a batch of 42:

Input (42, 784)

    ↓

Linear(784 → 128)

    ↓

ReLU

    ↓

Linear(128 → 10)

    ↓
    
Output (42, 10)
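
To make the diagram concrete, the sketch below reproduces the same flow with explicit matrix operations and checks it against the layer calls. It assumes the same layer sizes as above; `fc1`, `fc2`, and the other names are local to this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
fc1, fc2 = nn.Linear(784, 128), nn.Linear(128, 10)
x = torch.randn(42, 784)

# Using the layers directly
out_layers = fc2(torch.relu(fc1(x)))

# The same computation written out: Linear is x @ W^T + b, ReLU is max(0, x)
hidden = torch.clamp(x @ fc1.weight.T + fc1.bias, min=0)  # (42, 784) -> (42, 128)
out_manual = hidden @ fc2.weight.T + fc2.bias             # (42, 128) -> (42, 10)

print(out_manual.shape)  # torch.Size([42, 10])
print(torch.allclose(out_layers, out_manual, atol=1e-5))
```

The batch dimension (42) passes through unchanged. Only the feature dimension is transformed, from 784 to 128 to 10.
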
