# PyTorch Device Management

This notebook covers how to work with different computational devices in PyTorch.

## 📚 **What You'll Learn**

- **Device Detection**: How to check for available devices (CPU, CUDA, MPS)
- **Moving Tensors**: How to move tensors between devices
- **Device Compatibility**: Understanding device requirements for operations
- **Performance Optimization**: Best practices for device usage
- **Memory Management**: Efficient tensor creation and movement

## 🎯 **Learning Objectives**

By the end of this notebook, you'll understand:
- How to detect CUDA (NVIDIA GPU) and MPS (Apple Silicon) availability
- Different methods to move tensors between devices
- Why tensors must be on the same device for operations
- How to optimize performance with proper device usage

Let's master PyTorch device management! ⚡

In [None]:
import torch
import numpy as np

print(f"PyTorch version: {torch.__version__}")

## Checking Available Devices

Let's check what computational devices are available on your system:

In [None]:
# Check CPU (always available)
print("CPU is available: True (always)")

# Check CUDA (NVIDIA GPU)
cuda_available = torch.cuda.is_available()
print(f"CUDA is available: {cuda_available}")

if cuda_available:
    print(f"CUDA device count: {torch.cuda.device_count()}")
    print(f"Current CUDA device: {torch.cuda.current_device()}")
    print(f"CUDA device name: {torch.cuda.get_device_name(torch.cuda.current_device())}")

# Check MPS (Apple Silicon GPU)
# Metal Performance Shaders (MPS) for Apple Silicon
mps_available = torch.backends.mps.is_available()
print(f"MPS is available: {mps_available}")

# Check if MPS is built (compiled with MPS support)
mps_built = torch.backends.mps.is_built()
print(f"MPS is built: {mps_built}")

# Determine the best available device
if cuda_available:
    device = torch.device("cuda")
elif mps_available:
    device = torch.device("mps")
else:
    device = torch.device("cpu")
print(f"\nBest available device: {device}")

# Create a tensor and check its default device
sample_tensor = torch.randn(3, 3)
print(f"\nDefault tensor device: {sample_tensor.device}")
print(f"Default tensor device type: {sample_tensor.device.type}")
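
A `torch.device` is only a lightweight description of a target (a type plus an optional index), so it can be built and inspected even when the matching hardware is absent. The snippet below is a minimal sketch of that idea; the device strings are illustrative and do not need to exist on your machine.

```python
import torch

# Constructing a torch.device only parses the string; it does not
# check that the hardware is actually present.
dev = torch.device("cuda:1")
print(dev.type)   # device family
print(dev.index)  # ordinal within that family

# Without an explicit index, .index is None
cpu_dev = torch.device("cpu")
print(cpu_dev.type, cpu_dev.index)

# Device objects compare by value (type and index)
same_cpu = torch.device("cpu") == cpu_dev
bare_vs_indexed = torch.device("cuda") == torch.device("cuda:0")
print(same_cpu)
print(bare_vs_indexed)  # False: "cuda" has no index, "cuda:0" has index 0
```

Because of the last comparison, checking `tensor.device.type == "cuda"` is usually more robust than comparing a tensor's device against `torch.device("cuda")`.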

## Moving Tensors Between Devices

Now let's see how to move tensors between different devices:

In [None]:
# Create a tensor on CPU (default)
cpu_tensor = torch.randn(3, 3)
print("Original tensor device:", cpu_tensor.device)
print("Tensor values:\n", cpu_tensor)

# Method 1: Using .to() with device string
cpu_tensor_moved = cpu_tensor.to('cpu')  # Already on CPU, but showing the syntax
print(f"\nUsing .to('cpu'): {cpu_tensor_moved.device}")

# Method 2: Using .to() with device object
device_obj = torch.device('cpu')
cpu_tensor_moved2 = cpu_tensor.to(device_obj)
print(f"Using .to(device_obj): {cpu_tensor_moved2.device}")

# Method 3: Using specific device methods
cpu_tensor_moved3 = cpu_tensor.cpu()  # Explicit CPU method
print(f"Using .cpu(): {cpu_tensor_moved3.device}")

# Try moving to CUDA if available
if torch.cuda.is_available():
    print("\n--- CUDA Examples ---")
    cuda_tensor = cpu_tensor.to('cuda')
    print(f"Moved to CUDA: {cuda_tensor.device}")
    
    # You can also specify which GPU (if multiple)
    cuda_tensor_gpu0 = cpu_tensor.to('cuda:0')
    print(f"Moved to CUDA:0: {cuda_tensor_gpu0.device}")
    
    # Using .cuda() method
    cuda_tensor2 = cpu_tensor.cuda()
    print(f"Using .cuda(): {cuda_tensor2.device}")
    
    # Move back to CPU
    back_to_cpu = cuda_tensor.to('cpu')
    print(f"Back to CPU: {back_to_cpu.device}")
    
    # Perform computation on GPU
    gpu_result = cuda_tensor @ cuda_tensor.T  # Matrix multiplication on GPU
    print(f"GPU computation result device: {gpu_result.device}")
    print(f"Result shape: {gpu_result.shape}")

# Try moving to MPS if available
if torch.backends.mps.is_available():
    print("\n--- MPS Examples ---")
    mps_tensor = cpu_tensor.to('mps')
    print(f"Moved to MPS: {mps_tensor.device}")
    
    # Perform computation on MPS
    mps_result = mps_tensor @ mps_tensor.T
    print(f"MPS computation result device: {mps_result.device}")
    print(f"Result shape: {mps_result.shape}")
    
    # Move back to CPU
    back_to_cpu_from_mps = mps_tensor.to('cpu')
    print(f"Back to CPU from MPS: {back_to_cpu_from_mps.device}")

# IMPORTANT: Tensors must be on the same device for operations!
print("\n--- Device Compatibility ---")
tensor_a = torch.randn(2, 2)  # CPU
tensor_b = torch.randn(2, 2)  # CPU

print(f"tensor_a device: {tensor_a.device}")
print(f"tensor_b device: {tensor_b.device}")

# This works - both on CPU
result_cpu = tensor_a @ tensor_b
print("CPU @ CPU: Success")

# This would fail if tensors are on different devices:
# If tensor_a is on CPU and tensor_b is on CUDA, tensor_a @ tensor_b would raise an error
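
One detail the examples above rely on is that `.to()` is not an in-place operation: it returns a tensor and leaves the original where it was. The sketch below illustrates this on the CPU so it runs anywhere; the variable names are only for illustration.

```python
import torch

t = torch.randn(2, 2)  # created on the CPU by default

# The tensor is already on the target device, so .to() returns the same object
same = t.to("cpu")
print(same is t)

# copy=True forces a new tensor even when no move is needed
copied = t.to("cpu", copy=True)
print(copied is t)
print(torch.equal(copied, t))

# Typical pattern: reassign the result of the move
target = torch.device("cuda" if torch.cuda.is_available() else "cpu")
moved = t.to(target)
print(moved.device)
```

Forgetting the reassignment (writing `t.to(target)` on its own line) is a common source of device mismatch errors, since `t` itself stays on its original device.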

## Device Compatibility and Operations

**Important:** Tensors must be on the same device to perform operations between them.

In [None]:
# Create two tensors on the CPU (the default device)
cpu_tensor1 = torch.randn(3, 3)
cpu_tensor2 = torch.randn(3, 3)

print("Both tensors on CPU:")
print(f"Tensor 1 device: {cpu_tensor1.device}")
print(f"Tensor 2 device: {cpu_tensor2.device}")

# This works - both on CPU
result_cpu = cpu_tensor1 + cpu_tensor2
print(f"Addition result device: {result_cpu.device}")

# Demonstrate device mismatch error (only if GPU/MPS available)
if torch.cuda.is_available() or torch.backends.mps.is_available():
    # Determine available accelerated device
    if torch.cuda.is_available():
        accelerated_device = 'cuda'
    else:
        accelerated_device = 'mps'
    
    print(f"\n=== Device Mismatch Example (using {accelerated_device}) ===")
    
    # Move one tensor to accelerated device
    accelerated_tensor = cpu_tensor1.to(accelerated_device)
    print(f"CPU tensor device: {cpu_tensor2.device}")
    print(f"Accelerated tensor device: {accelerated_tensor.device}")
    
    # Mixing devices in one operation raises a RuntimeError
    try:
        _ = cpu_tensor2 + accelerated_tensor
    except RuntimeError as e:
        print("❌ Cannot perform operations between tensors on different devices")
        print(f"RuntimeError: {e}")
        print("💡 Solution: Move tensors to the same device")

    # Solution 1: Move accelerated tensor back to CPU
    result1 = cpu_tensor2 + accelerated_tensor.cpu()
    print(f"✅ Solution 1 - Move to CPU: {result1.device}")

    # Solution 2: Move CPU tensor to accelerated device
    result2 = cpu_tensor2.to(accelerated_device) + accelerated_tensor
    print(f"✅ Solution 2 - Move to {accelerated_device}: {result2.device}")
else:
    print("\n=== Device Mismatch Example ===")
    print("Only CPU available - device mismatch errors occur when tensors are on different devices")
    print("Example: CPU tensor + CUDA tensor would fail")
    print("Solution: Always ensure tensors are on the same device before operations")
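
When tensors come from different places (a data loader, a checkpoint, a helper function), it can be convenient to align them in one step before operating on them. The following is a rough sketch of one way to do that; `to_common_device` is a hypothetical helper written for this notebook, not a PyTorch function.

```python
import torch

def to_common_device(*tensors, device=None):
    """Hypothetical helper: return the given tensors moved to one device.

    If no device is given, the device of the first tensor is used.
    """
    if not tensors:
        return ()
    target = device if device is not None else tensors[0].device
    return tuple(t.to(target) for t in tensors)

a = torch.randn(3, 3)
b = torch.randn(3, 3)
if torch.cuda.is_available():
    b = b.to("cuda")  # deliberately create a mismatch when a GPU exists

a, b = to_common_device(a, b)
total = a + b  # safe: both tensors now share a device
print(a.device, b.device)
print(total.shape)
```

Defaulting to the first tensor's device is just one possible design choice; passing `device=` explicitly makes the intent clearer in larger programs.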

## Best Practices for Device Management

### 1. Automatic Device Selection

In [None]:
# Best practice: Automatic device selection
def get_device():
    """Select the best available device automatically"""
    if torch.cuda.is_available():
        return torch.device("cuda")
    elif torch.backends.mps.is_available():
        return torch.device("mps")
    else:
        return torch.device("cpu")

# Use the function
device = get_device()
print(f"Selected device: {device}")

# Create tensors directly on the selected device
tensor_on_device = torch.randn(3, 3, device=device)
print(f"Tensor created on: {tensor_on_device.device}")

# Alternative: create on CPU, then move (costs an extra allocation and copy)
cpu_tensor = torch.randn(3, 3)
tensor_moved = cpu_tensor.to(device)
print(f"Tensor moved to: {tensor_moved.device}")

print("\n=== Best Practices Summary ===")
practices = [
    "1. Use automatic device selection functions",
    "2. Create tensors directly on target device when possible",
    "3. Ensure all tensors in an operation are on the same device",
    "4. Move model and data to the same device before training",
    "5. Use .cpu() to move tensors back for NumPy conversion",
    "6. Consider memory limitations when using GPU devices"
]

for practice in practices:
    print(practice)
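
Practices 4 to 6 in the summary above can be sketched together as follows. This is a minimal illustration, assuming a toy `nn.Linear` model and random inputs that are not part of the original notebook; for brevity it only chooses between CUDA and CPU.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Practice 4: put the model and the data on the same device.
# nn.Module.to() moves the parameters of the module itself.
model = nn.Linear(4, 2).to(device)
inputs = torch.randn(8, 4, device=device)  # created directly on the device
outputs = model(inputs)
print(outputs.device)

# Practice 5: NumPy works with CPU memory only, so detach from the
# autograd graph and move to the CPU before converting.
outputs_np = outputs.detach().cpu().numpy()
print(type(outputs_np), outputs_np.shape)

# Practice 6: on CUDA, check how much memory tensors currently occupy.
if device.type == "cuda":
    print(f"Allocated bytes: {torch.cuda.memory_allocated(device)}")
```

Calling `.numpy()` directly on a tensor that requires gradients or lives on a GPU raises an error, which is why the `detach().cpu()` chain appears before the conversion.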