![](https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/se_01.png)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/CLDiego/uom_fse_dl_workshop/blob/main/SE01_CA_Intro_to_pytorch.ipynb)

## Workshop Overview
***
Welcome! This workshop introduces PyTorch with a focus on scientific computing applications. Whether you're analyzing experimental data, running simulations, or building models for physical systems, PyTorch provides powerful tools for numerical computation and GPU acceleration.

**Prerequisites**: Basic Python, familiarity with NumPy arrays and Pandas DataFrames

**Learning Objectives**:
- Understand PyTorch tensors as GPU-accelerated arrays
- Convert between NumPy, Pandas, and PyTorch
- Perform scientific computations efficiently
- Load and process real scientific datasets

# <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/write.svg" width="30"/> 1. Why PyTorch for Scientific Computing?
***
[PyTorch](https://pytorch.org/) is an open-source machine learning framework that excels at scientific computing tasks requiring:

- **GPU Acceleration**: Move computations from CPU to GPU with a single line of code
- **Automatic Differentiation**: Compute gradients automatically (essential for optimization and inverse problems)
- **NumPy Compatibility**: Seamless conversion between NumPy and PyTorch
- **Dynamic Computation**: Build and modify computational graphs on-the-fly

## PyTorch vs. Traditional Scientific Python

While NumPy and SciPy are excellent for CPU-based scientific computing, PyTorch offers:

1. **Hardware Flexibility**: Same code runs on CPU or GPU
2. **Autograd**: Built-in automatic differentiation for optimization
3. **Deep Learning Integration**: Easily transition from data processing to model building
4. **Production Ready**: Deploy research code to production environments
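
The first two points can be illustrated with a minimal sketch. The variable names (`device`, `x`, `y`, `grad`) are illustrative assumptions rather than part of the workshop code; the snippet selects a GPU only if one is present and then lets autograd differentiate a simple function.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor directly on the chosen device and ask PyTorch to track gradients
x = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)

# A simple scalar function: y = sum(x^2)
y = (x ** 2).sum()

# Autograd computes dy/dx = 2x without any hand-written derivative
y.backward()
grad = x.grad.cpu()

print(f"Device: {device}")
print(f"Gradient: {grad}")
```

The same lines run unchanged on a CPU-only machine, which is the point of hardware flexibility: only the `device` value differs.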

## Setting up the environment

We will use several Python modules throughout this workshop. You don't need to be familiar with all of them yet: some let us work with data and perform numerical operations, while others are used for visualization.

In [2]:
import plotly.express as px
import plotly.graph_objects as go
import pprint
import numpy as np
import pandas as pd
import torch

print("=" * 60)
print("PyTorch Setup Information")
print("=" * 60)
print(f"PyTorch Version: {torch.__version__}")
print(f"NumPy Version: {np.__version__}")
print(f"Pandas Version: {pd.__version__}")
print("-" * 60)

if torch.cuda.is_available():
    print(f"✓ GPU Available: {torch.cuda.get_device_name(0)}")
    print(f"  GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("✗ No GPU detected - using CPU")
    print("  (For GPU: Runtime > Change runtime type > Hardware accelerator > GPU)")

print("=" * 60)

PyTorch Setup Information
PyTorch Version: 2.10.0+cpu
NumPy Version: 2.0.2
Pandas Version: 2.2.2
------------------------------------------------------------
✗ No GPU detected - using CPU
  (For GPU: Runtime > Change runtime type > Hardware accelerator > GPU)


# <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/write.svg" width="30"/> 2. Scientific Python Refresher: NumPy & Pandas
***

Before diving into PyTorch, let's briefly review NumPy and Pandas. If you're already comfortable with these libraries, feel free to skim this section.

## 2.1 NumPy: Arrays for Numerical Data

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/docs.svg" width="20"/> **Definition**: NumPy arrays are n-dimensional containers for homogeneous data (all elements have the same type). They're ideal for representing grids, matrices, sensor readings, or any structured numerical data.

<img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/reminder.svg" width="20"/> **Common Use Cases**:
- Time series data (temperature, pressure, voltage readings)
- Spatial data (2D/3D grids from simulations)
- Experimental measurements
- Image data from microscopes or telescopes

Let's create some example scientific data using NumPy.

In [5]:
# Example 1: Temperature sensor readings over time (1D array)
time_hours = np.linspace(0, 24, 25)  # 0 to 24 hours
temperature_celsius = 20 + 5 * np.sin(2 * np.pi * time_hours / 24) + np.random.normal(0, 0.5, 25)

###################
# TODO: COMPLETE THE CODE BELOW
# Print the shape, data type, mean, std deviation, and first 5 readings
print("Temperature Sensor Data")
print("-" * 40)
print(f"Shape: {temperature_celsius.shape}")  # Get the shape attribute
print(f"Data type: {temperature_celsius.dtype}")  # Get the dtype attribute
print(f"Mean temperature: {temperature_celsius.mean():.2f} °C")  # Calculate mean
print(f"Std deviation: {temperature_celsius.std():.2f} °C")  # Calculate standard deviation
print(f"First 5 readings: {temperature_celsius[:5]}")  # Slice first 5 elements

Temperature Sensor Data
----------------------------------------
Shape: (25,)
Data type: float64
Mean temperature: 20.15 °C
Std deviation: 3.64 °C
First 5 readings: [20.56397249 22.24418794 22.35570644 22.84183848 24.56138094]


In [6]:
# Example 2: 2D grid (e.g., spatial temperature distribution)
grid_size = 50
x = np.linspace(-5, 5, grid_size)
y = np.linspace(-5, 5, grid_size)
X, Y = np.meshgrid(x, y)

# Simulate a Gaussian temperature distribution
temperature_2d = 100 * np.exp(-(X**2 + Y**2) / 10)

print("\n2D Temperature Field (Heat Source)")
print("-" * 40)
print(f"Shape: {temperature_2d.shape}")
print(f"Max temperature: {temperature_2d.max():.2f} °C")
print(f"Min temperature: {temperature_2d.min():.2f} °C")


2D Temperature Field (Heat Source)
----------------------------------------
Shape: (50, 50)
Max temperature: 99.79 °C
Min temperature: 0.67 °C


In [7]:

# Example 3: 3D volumetric data (e.g., MRI scan, simulation output)
volume_data = np.random.randn(10, 10, 10)  # Small 3D volume
print("\n3D Volumetric Data")
print("-" * 40)
print(f"Shape: {volume_data.shape} (depth × height × width)")
print(f"Total elements: {volume_data.size}")


3D Volumetric Data
----------------------------------------
Shape: (10, 10, 10) (depth × height × width)
Total elements: 1000


In [8]:
# Visualising previous data using Plotly
# 1D Line Plot for Temperature Sensor Readings
fig1 = px.line(x=time_hours, y=temperature_celsius, title="Temperature Sensor Readings Over Time",
               labels={"x": "Time (hours)", "y": "Temperature (°C)"})
fig1.show()

# 2D Heatmap for Temperature Field
fig2 = px.imshow(temperature_2d, title="2D Temperature Field (Heat Source)",
                 labels={"x": "X-axis", "y": "Y-axis", "color": "Temperature (°C)"})
fig2.show()

# 3D Surface Plot for Volumetric Data
fig3 = go.Figure(data=[go.Surface(z=volume_data[5, :, :])])
fig3.update_layout(title='3D Volumetric Data (Slice at depth=5)',
                   scene=dict(xaxis_title='X-axis', yaxis_title='Y-axis', zaxis_title='Value'))
fig3.show()

## 2.2 Pandas: DataFrames for Tabular Data

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/docs.svg" width="20"/> **Definition**: Pandas DataFrames are 2D labeled data structures with columns of potentially different types. They're like spreadsheets or SQL tables in Python.

<img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/reminder.svg" width="20"/>  **Common Use Cases**:
- Experimental measurements with metadata (e.g., sample ID, conditions, measurements)
- Time series with multiple sensors
- Results from parameter sweeps
- Loading CSV/Excel files from lab equipment

Let's create a sample experimental dataset.

In [9]:
# Create a sample experimental dataset
np.random.seed(42)
n_samples = 100

experimental_data = pd.DataFrame({
    'sample_id': [f'S{i:03d}' for i in range(n_samples)],
    'concentration_mM': np.random.uniform(0.1, 10.0, n_samples),
    'temperature_C': np.random.normal(25, 2, n_samples),
    'pressure_bar': np.random.normal(1.0, 0.05, n_samples),
    'reaction_rate': np.random.lognormal(0, 0.5, n_samples)
})

print("Experimental Dataset Summary")
print("=" * 60)
print(f"Number of samples: {len(experimental_data)}")
print(f"Columns: {list(experimental_data.columns)}")


Experimental Dataset Summary
Number of samples: 100
Columns: ['sample_id', 'concentration_mM', 'temperature_C', 'pressure_bar', 'reaction_rate']


In [10]:
# Display first few rows
experimental_data.head()

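| | sample_id | concentration_mM | temperature_C | pressure_bar | reaction_rate |
|---|-----------|------------------|---------------|--------------|---------------|
| 0 | S000 | 3.807947 | 25.174094 | 1.00065 | 1.104651 |
| 1 | S001 | 9.512072 | 24.401985 | 1.072677 | 0.740738 |
| 2 | S002 | 7.34674 | 25.183522 | 0.986767 | 1.035517 |
| 3 | S003 | 6.026719 | 21.024862 | 1.136008 | 0.824765 |
| 4 | S004 | 1.644585 | 24.560656 | 1.031283 | 1.0584 |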


In [12]:
from google.colab import sheets
sheet = sheets.InteractiveSheet(df=experimental_data)

https://docs.google.com/spreadsheets/d/1j02ZklO4elXfaJVuWT2Xq_Ih9xRz8x9_5CRwwyfPh78/edit#gid=0


In [13]:
# Statistical summary
experimental_data.describe()

| | concentration_mM | temperature_C | pressure_bar | reaction_rate |
|---|------------------|---------------|--------------|---------------|
| count | 100.0 | 100.0 | 100.0 | 100.0 |
| mean | 4.754789 | 24.997839 | 1.001844 | 1.148774 |
| std | 2.945145 | 1.825859 | 0.054868 | 0.500276 |
| min | 0.154669 | 21.024862 | 0.837937 | 0.376799 |
| 25% | 2.012688 | 23.589745 | 0.962951 | 0.780174 |
| 50% | 4.69501 | 25.15561 | 1.003062 | 1.044161 |
| 75% | 7.329011 | 25.967383 | 1.034063 | 1.369346 |
| max | 9.870181 | 29.926484 | 1.192637 | 2.934659 |


In [14]:
# Accessing data: convert to NumPy arrays
concentrations = experimental_data['concentration_mM'].values
print(f"\nExtracted concentrations shape: {concentrations.shape}")
print(f"Data type: {concentrations.dtype}")


Extracted concentrations shape: (100,)
Data type: float64


# <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/write.svg" width="30"/> 3. PyTorch Tensors: NumPy Arrays on Steroids
***

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/docs.svg" width="20"/>  **Definition**: A **tensor** is PyTorch's core data structure. Essentially it can be boiled down to a multidimensional array (like NumPy's `ndarray`) that can live on the GPU and track gradients for automatic differentiation.

**Think of tensors as**: NumPy arrays + GPU support + automatic differentiation

## Why Tensors Matter for Scientists

1. **Hardware Flexibility**: Move computations between CPU and GPU with one line
2. **Performance**: GPU acceleration for large-scale computations (matrix operations, FFTs, etc.)
3. **Gradient Computation**: Essential for optimization, inverse problems, and machine learning
4. **Seamless Integration**: Convert to/from NumPy and Pandas easily
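
As a rough illustration of the last point, the sketch below converts between the three libraries. The array and column names are made up for this example; `torch.from_numpy()`, `Tensor.numpy()`, and `DataFrame.to_numpy()` are the standard conversion calls.

```python
import numpy as np
import pandas as pd
import torch

# NumPy -> PyTorch: torch.from_numpy() shares memory with the array (CPU only)
readings = np.array([20.1, 20.5, 21.0])
readings_t = torch.from_numpy(readings)
readings[0] = 99.0                      # the change is visible through the tensor
print(readings_t)

# PyTorch -> NumPy: .numpy() works for CPU tensors that do not require gradients
back_to_numpy = readings_t.numpy()

# Pandas -> PyTorch: go through a NumPy array and choose the dtype explicitly
df = pd.DataFrame({"temperature_C": [25.1, 24.8], "pressure_bar": [1.01, 0.99]})
features = torch.tensor(df.to_numpy(), dtype=torch.float32)
print(features.shape, features.dtype)
```

Note that `torch.from_numpy()` keeps NumPy's `float64`, whereas `torch.tensor()` with an explicit `dtype` makes a copy in the requested precision.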

## Tensor Terminology

Similar to arrays, tensors have different **ranks** (number of dimensions):

- **Scalar** (rank 0): A single number, e.g., temperature = 273.15
- **Vector** (rank 1): 1D array, e.g., time series `[t₀, t₁, t₂, ...]`
- **Matrix** (rank 2): 2D array, e.g., grayscale image or grid data
- **3D Tensor** (rank 3): e.g., RGB image (height × width × channels) or volumetric data
- **nD Tensor** (rank n): Higher-dimensional data (e.g., batch of 3D volumes)
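
The ranks listed above can be checked with the `ndim` attribute. This is a small sketch with placeholder data; the dictionary keys are just labels chosen for the printout.

```python
import torch

# One example tensor per rank
examples = {
    "scalar": torch.tensor(273.15),
    "vector": torch.tensor([0.0, 0.5, 1.0]),
    "matrix": torch.zeros(4, 4),
    "3D tensor": torch.zeros(10, 10, 10),
    "4D tensor": torch.zeros(8, 10, 10, 10),   # e.g. a batch of 8 volumes
}

# ndim gives the rank; shape gives the size along each dimension
ranks = {name: t.ndim for name, t in examples.items()}
for name, t in examples.items():
    print(f"{name:>10}: rank {t.ndim}, shape {tuple(t.shape)}")
```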

<div align="center">
  <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/tensors.png" width="60%">
</div>

## 3.1 Creating Tensors
***

Let's create tensors using `torch.tensor()`.

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/docs.svg" width="20" style="filter: invert(50%) sepia(50%) saturate(2000%) hue-rotate(90deg) brightness(915%) contrast(100%);"/> **Documentation**: For details, see the [PyTorch Tensor Tutorial](https://pytorch.org/tutorials/beginner/basics/tensorqs_tutorial.html) and [torch.Tensor documentation](https://pytorch.org/docs/stable/tensors.html).

In [23]:
# Example: Creating scalar tensors (rank 0)
# Scalars represent single values like temperature, pressure, or energy

###################
# TODO: Create an integer scalar tensor with value 42
measurement_count =  torch.tensor(42) # COMPLETE: Pass the integer 42
print(f"Measurement count: {measurement_count}")
print(f"  Type: {type(measurement_count)}")
print(f"  Shape: {measurement_count.shape}")
print(f"  Data type: {measurement_count.dtype}")

Measurement count: 42
  Type: <class 'torch.Tensor'>
  Shape: torch.Size([])
  Data type: torch.int64


In [24]:
# Float scalar (common for physical measurements)
###################
# TODO: Create a float scalar tensor with value 273.15
temperature_kelvin = torch.tensor(273.15)  # COMPLETE: Pass the float 273.15
print(f"Temperature: {temperature_kelvin} K")
print(f"  Shape: {temperature_kelvin.shape}")
print(f"  Data type: {temperature_kelvin.dtype}")

Temperature: 273.1499938964844 K
  Shape: torch.Size([])
  Data type: torch.float32


In [27]:
# Extracting the Python value
###################
# TODO: Extract the Python value from the tensor using .item()
print(f"\nExtracted value: {temperature_kelvin.item()} (type: {type(temperature_kelvin.item())})")  # COMPLETE: Use .item() method


Extracted value: 273.1499938964844 (type: <class 'float'>)


In [28]:
# Practice: Create your own scientific measurements

# Create a scalar for pressure in atmospheres
pressure_atm = torch.tensor(1.013)

# Create a scalar for Avogadro's number (as float32 for efficiency)
avogadro_approx = torch.tensor(6.022e23, dtype=torch.float32)

print("Your Scientific Constants:")
print("=" * 60)
print(f"Pressure: {pressure_atm.item():.3f} atm")
print(f"  Shape: {pressure_atm.shape}, Dtype: {pressure_atm.dtype}")
print(f"\nAvogadro's number: {avogadro_approx.item():.3e}")
print(f"  Shape: {avogadro_approx.shape}, Dtype: {avogadro_approx.dtype}")

Your Scientific Constants:
Pressure: 1.013 atm
  Shape: torch.Size([]), Dtype: torch.float32

Avogadro's number: 6.022e+23
  Shape: torch.Size([]), Dtype: torch.float32


In the first example above, we created a scalar tensor with a single element. Looking at its attributes, we can see that the tensor has a shape of `torch.Size([])`, which means that it has no dimensions. Its data type is `torch.int64` because we passed a Python integer; the later examples, which pass Python floats, default to `torch.float32`.


> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/reminder.svg" width="20" style="filter: invert(100%) sepia(100%) saturate(1500%) hue-rotate(30deg) brightness(450%) contrast(70%);"/> **Note**: The data type of a tensor is determined by the data type of the elements that it contains. It is important to be aware of the data type of a tensor, as it can affect the results of operations that are performed on it. Good practice is to always specify the data type of a tensor when creating it.


As we can see, our single element is now stored inside a tensor container: we can perform tensor operations on it, but to retrieve the underlying Python number we use the `item()` method.

We can specify the data type of a tensor by passing the `dtype` argument to `torch.tensor()`. Alternatively, we can use the `torch.Tensor.type()` method to change the data type of an existing tensor.

In [29]:
###################
# TODO: Practice changing tensor data types
# Create a scalar tensor with a specific data type
scalar_tensor = torch.tensor(42, dtype=torch.float32)  # COMPLETE: Use torch.float32
print(scalar_tensor)

# Change the data type of a tensor
scalar_tensor = scalar_tensor.type(torch.int64)  # COMPLETE: Use torch.int64
print(scalar_tensor)

# Another way to change the data type of a tensor
scalar_tensor = scalar_tensor.int()  # COMPLETE: Use .int() method
print(scalar_tensor)

# .to() also converts data types, but it can be confusing
# because the same method is used to move tensors
# to different devices
scalar_tensor = scalar_tensor.to(torch.float64) # COMPLETE: Use torch.float64
print(scalar_tensor)


tensor(42.)
tensor(42)
tensor(42, dtype=torch.int32)
tensor(42., dtype=torch.float64)
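
To see why the data type matters in practice, here is a short sketch (the variable names are illustrative). It shows the rounding that produced `273.1499938964844` earlier, how mixed dtypes are promoted, and what is lost when casting floats to integers.

```python
import torch

# float32 cannot represent 273.15 exactly; float64 matches Python's own float
t32 = torch.tensor(273.15, dtype=torch.float32)
t64 = torch.tensor(273.15, dtype=torch.float64)
print(t32.item())
print(t64.item())

# Mixing dtypes in element-wise arithmetic promotes the result to a floating type
counts = torch.tensor([1, 2, 3])            # int64
weights = torch.tensor([0.5, 0.5, 0.5])     # float32
weighted = counts * weights
print(weighted.dtype)

# Casting float -> int truncates toward zero, so information is lost
truncated = torch.tensor([1.9, -1.9]).to(torch.int64)
print(truncated)
```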


## 3.2 Initializing tensors
***

PyTorch provides multiple ways to initialize tensors. Sometimes, we want to create a tensor with specific values, while other times we want to create a tensor with random values. PyTorch provides several functions for creating tensors with different initializations.

Below is a table summarizing some of the most commonly used tensor creation functions in PyTorch.

| Function | Description | Example | Output Shape |
|----------|-------------|---------|--------------|
| `torch.tensor()` | Creates tensor from data | `torch.tensor([1, 2, 3])` | `(3,)` |
| `torch.zeros()` | Creates tensor of zeros | `torch.zeros(2, 3)` | `(2, 3)` |
| `torch.ones()` | Creates tensor of ones | `torch.ones(2, 3)` | `(2, 3)` |
| `torch.rand()` | Uniform random on [0, 1) | `torch.rand(2, 3)` | `(2, 3)` |
| `torch.randn()` | Normal distribution μ=0, σ=1 | `torch.randn(2, 3)` | `(2, 3)` |
| `torch.arange()` | Sequence with a fixed step (end excluded) | `torch.arange(5)` | `(5,)` |
| `torch.linspace()` | Evenly spaced sequence | `torch.linspace(0, 1, 5)` | `(5,)` |
| `torch.eye()` | Identity matrix | `torch.eye(3)` | `(3, 3)` |
| `torch.randint()` | Random integers | `torch.randint(0, 10, (2, 3))` | `(2, 3)` |

In [30]:
# Common tensor creation methods for scientific computing

print("Tensor Initialization Examples")
print("=" * 60)

# 1. From a list (e.g., measurement data)
time_points = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0])
print(f"Time points: {time_points}")
print(f"  Shape: {time_points.shape}\n")

###################
# TODO: Complete the tensor initialization methods
# 2. Zeros (useful for initializing arrays)
grid_2d = torch.zeros(3, 3)  # COMPLETE: Create a 3x3 tensor of zeros
print(f"Zeros (3×3 grid):\n{grid_2d}\n")

# 3. Ones (useful for masks or initial conditions)
mask = torch.ones(2, 4)  # COMPLETE: Create a 2x4 tensor of ones
print(f"Ones (2×4 mask):\n{mask}\n")

# 4. Random values (for Monte Carlo simulations)
random_samples = torch.rand(3, 3)  # COMPLETE: Create 3x3 random uniform [0,1]
print(f"Random uniform [0,1] (3×3):\n{random_samples}\n")

Tensor Initialization Examples
Time points: tensor([0.0000, 0.5000, 1.0000, 1.5000, 2.0000])
  Shape: torch.Size([5])

Zeros (3×3 grid):
tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])

Ones (2×4 mask):
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.]])

Random uniform [0,1] (3×3):
tensor([[0.5847, 0.7231, 0.3447],
        [0.9495, 0.2361, 0.2601],
        [0.4497, 0.7400, 0.1180]])



In [31]:
###################
# TODO: Complete more tensor initialization methods
# 5. Random normal (Gaussian noise)
noise = torch.randn(5)  # COMPLETE: Create random normal distribution, Mean=0, Std=1
print(f"Gaussian noise: {noise}\n")

# 6. Sequences (like np.arange)
indices = torch.arange(0, 10, 2)  # COMPLETE: Create sequence from 0 up to (but not including) 10 with step 2
print(f"Sequence (0 to 10, step 2): {indices}\n")

# 7. Linearly spaced (like np.linspace)
wavelengths = torch.linspace(400, 700, 7)  # COMPLETE: Create 7 evenly spaced values from 400 to 700
print(f"Wavelengths (400-700 nm): {wavelengths}\n")

# 8. Identity matrix (for linear algebra)
identity_3x3 = torch.eye(3)  # COMPLETE: Create 3x3 identity matrix
print(f"Identity matrix (3×3):\n{identity_3x3}")

Gaussian noise: tensor([ 0.7061, -1.2565,  0.1613, -1.1486, -0.4624])

Sequence (0 to 10, step 2): tensor([0, 2, 4, 6, 8])

Wavelengths (400-700 nm): tensor([400., 450., 500., 550., 600., 650., 700.])

Identity matrix (3×3):
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
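
Random tensors change on every run unless the generator is seeded. The sketch below is a hypothetical Monte Carlo example (estimating π from uniform samples), not part of the workshop exercises; it uses `torch.manual_seed()` to make the result repeatable.

```python
import torch

torch.manual_seed(0)                 # fix the seed so the run is repeatable
n_samples = 100_000

# Draw points uniformly in the unit square [0, 1) x [0, 1)
points = torch.rand(n_samples, 2)

# The fraction of points inside the quarter circle of radius 1 approximates pi / 4
inside = (points ** 2).sum(dim=1) <= 1.0
pi_estimate = 4.0 * inside.float().mean().item()
print(f"Estimated pi: {pi_estimate:.4f}")

# Re-seeding reproduces exactly the same random numbers
torch.manual_seed(0)
points_again = torch.rand(n_samples, 2)
print(torch.equal(points, points_again))
```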


## 3.3 Indexing tensors
***
Indexing tensors is similar to indexing Python lists and NumPy arrays. We use square brackets `[]` to access elements in a tensor, which is useful for extracting specific elements or slices. The tips and table below summarize the different ways to index tensors in PyTorch.

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/reminder.svg" width="20" style="filter: invert(100%) sepia(100%) saturate(1500%) hue-rotate(30deg) brightness(450%) contrast(70%);"/>  **Tips**:
> - Use `:` to select all elements in a dimension
> - Use negative indices to count from the end: -1 is last element
> - Ellipsis (`...`) represents multiple full slices
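> - Step values must be positive: unlike NumPy, PyTorch does not support negative steps, so use `torch.flip()` for reverse order
> - Boolean masks must match the shape of the dimensions they index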

| Method | Syntax | Description | Example | Result |
|--------|--------|-------------|---------|---------|
| Basic Indexing | `tensor[ix,jx]` | Access single element | `t[0,1]` | Element at row 0, col 1 |
| Slicing | `tensor[start:end]` | Extract subset | `t[1:3]` | Elements from index 1 to 2 |
| Striding | `tensor[::step]` | Extract with step | `t[::2]` | Every second element |
| Negative Indexing | `tensor[-1]` | Count from end | `t[-1]` | Last element |
| Boolean Indexing | `tensor[mask]` | Filter with condition | `t[t > 0]` | Elements > 0 |
| Ellipsis | `tensor[...]` | All remaining dimensions | `t[...,0]` | Index 0 of the last dim, all other dims kept |
| Combined | `tensor[1:3,...,::2]` | Mix methods | `t[1:3,...,::2]` | Indices 1 to 2 of the first dim, every second element of the last dim |

***
> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/code.svg" width="20" style="filter: invert(100%) sepia(100%) saturate(2000%) hue-rotate(40deg) brightness(915%) contrast(100%);"/> **Snippet 1**: Indexing a tensor

```python
# Get the first and last columns of a matrix
first_last_cols = tensor[...,[0,-1]]  # First and last elements of last dimension

# Get last row of a matrix
last_row = tensor[-1,...]  # Last row of all columns

# Extract diagonal
diagonal = tensor.diagonal()  # More efficient than indexing
```
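
The ellipsis and boolean-mask rows of the table are easier to follow on a tensor with more than two dimensions. Here is a small sketch using a made-up `2 × 3 × 4` tensor; it also shows `torch.flip()`, which is the usual way to reverse a tensor since slices with negative steps are not supported.

```python
import torch

# A small 3D tensor: 2 blocks, 3 rows, 4 columns, filled with 0..23
t = torch.arange(24).reshape(2, 3, 4)

# Ellipsis keeps every leading dimension and indexes only the last one
first_col = t[..., 0]          # shape (2, 3)

# Ellipsis can also stand for the trailing dimensions
last_block = t[-1, ...]        # shape (3, 4)

# Boolean masks return a flattened 1D tensor of the selected values
evens = t[t % 2 == 0]

# Reverse a vector with torch.flip() instead of a negative step
v = torch.tensor([1, 2, 3, 4])
reversed_v = torch.flip(v, dims=[0])

print(first_col.shape, last_block.shape, evens.shape, reversed_v)
```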

In [32]:
# Create a sample 4×4 data matrix (e.g., sensor grid readings)
sensor_data = torch.tensor([
    [20.1, 20.5, 21.0, 21.5],
    [19.8, 20.2, 20.8, 21.2],
    [19.5, 19.9, 20.5, 21.0],
    [19.2, 19.6, 20.2, 20.8]
])

print("Sensor Grid Data (4×4): Temperature in °C")
print(sensor_data)
print("\nIndexing Examples:")
print("=" * 60)

###################
# TODO: Complete the indexing operations
# 1. Single element
value = sensor_data[2, 3]  # COMPLETE: Get element at row 2, column 3
print(f"Sensor at position (2,3): {value.item():.1f} °C\n")

# 2. Entire row (all sensors in one row)
second_row = sensor_data[1, :]  # COMPLETE: Get row 1, all columns (use :)
print(f"Second row (all columns): {second_row}\n")

# 3. Entire column (one sensor over all rows)
last_column = sensor_data[:, 3]  # COMPLETE: Get all rows, last column (column 3)
print(f"Last column (all rows): {last_column}\n")

# 4. Submatrix (bottom-right 2×2 region)
bottom_right = sensor_data[2:, 2:]  # COMPLETE: Slice from row 2 to end, column 2 to end
print(f"Bottom-right 2×2:\n{bottom_right}")

Sensor Grid Data (4×4): Temperature in °C
tensor([[20.1000, 20.5000, 21.0000, 21.5000],
        [19.8000, 20.2000, 20.8000, 21.2000],
        [19.5000, 19.9000, 20.5000, 21.0000],
        [19.2000, 19.6000, 20.2000, 20.8000]])

Indexing Examples:
Sensor at position (2,3): 21.0 °C

Second row (all columns): tensor([19.8000, 20.2000, 20.8000, 21.2000])

Last column (all rows): tensor([21.5000, 21.2000, 21.0000, 20.8000])

Bottom-right 2×2:
tensor([[20.5000, 21.0000],
        [20.2000, 20.8000]])


In [33]:
print("Sensor Grid Data (4×4): Temperature in °C")
print(sensor_data)
print("\nIndexing Examples:")
print("=" * 60)

###################
# TODO: Complete advanced indexing operations
# 5. Striding (every other element in first row)
every_other = sensor_data[0,::2]  # COMPLETE: Get every 2nd element (use ::2)
print(f"First row, every other element: {every_other}\n")

# 6. Boolean masking (find hot spots > 20.5°C)
hot_spots = sensor_data > 20.5  # COMPLETE: Create boolean mask (use > operator)
print(f"Hot spots (> 20.5°C):\n{hot_spots}")
print(f"\nValues at hot spots: {sensor_data[hot_spots]}")  # COMPLETE: Apply mask

# 7. Negative indexing (last element)
last_element = sensor_data[-1, -1]  # COMPLETE: Get last row, last column (use -1)
print(f"\nLast element (bottom-right): {last_element.item():.1f} °C")

Sensor Grid Data (4×4): Temperature in °C
tensor([[20.1000, 20.5000, 21.0000, 21.5000],
        [19.8000, 20.2000, 20.8000, 21.2000],
        [19.5000, 19.9000, 20.5000, 21.0000],
        [19.2000, 19.6000, 20.2000, 20.8000]])

Indexing Examples:
First row, every other element: tensor([20.1000, 21.0000])

Hot spots (> 20.5°C):
tensor([[False, False,  True,  True],
        [False, False,  True,  True],
        [False, False, False,  True],
        [False, False, False,  True]])

Values at hot spots: tensor([21.0000, 21.5000, 20.8000, 21.2000, 21.0000, 20.8000])

Last element (bottom-right): 20.8 °C


# <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/write.svg" width="30"/> 4. Tensor operations
***

PyTorch allows us to manipulate tensors in many ways. PyTorch is a separate library rather than a layer on top of NumPy, but its API closely mirrors NumPy's, so most operations you know from NumPy have a counterpart in the `torch` module or as a tensor method. Standard Python operators such as `+`, `*`, and `@` also work directly on tensors.

### Basic Operations Cheatsheet

| Category | Description | Methods | PyTorch Method | Example |
|----------|-------------|----------|----------------|---------|
| Arithmetic | Basic math operations | +, -, *, /, ** | `add(), sub(), mul(), div(), pow(), sqrt()` | `a + b` |
| Comparison | Compare values | >, <, ==, != | `gt(), lt(), eq(), ne()` | `a > 0` |
| Reduction | Reduce dimensions | `sum(), mean(), max()` | `sum(), mean(), max()` | `a.sum()` |
| Statistical | Statistical operations | `std(), var()` | `std(), var()` | `a.std()` |

***
> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/reminder.svg" width="20" style="filter: invert(100%) sepia(100%) saturate(1500%) hue-rotate(30deg) brightness(450%) contrast(70%);"/> **Tips**:
> 1. **Type Matching**: Ensure tensors have compatible data types
> 2. **Shape Broadcasting**: Understand how PyTorch broadcasts shapes
> 3. **GPU Memory**: Be careful with large tensor operations on GPU
> 4. **Inplace Operations**: Use `_` suffix for inplace operations

***
> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/list.svg" width="20" style="filter: invert(19%) sepia(75%) saturate(6158%) hue-rotate(312deg) brightness(87%) contrast(116%);"/> **Common Mistakes to Avoid**:
> - Mixing tensor types without conversion
> - Forgetting to handle device placement (CPU/GPU)
> - Not checking tensor shapes before operations
> - Unnecessary copying of large tensors

***
> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/code.svg" width="20" style="filter: invert(100%) sepia(100%) saturate(2000%) hue-rotate(40deg) brightness(915%) contrast(100%);"/> **Snippet 2**: Inplace operations
   ```python
   # Instead of: x = x + 1
   x.add_(1)  # Inplace addition
   y.add_(x)  # Inplace addition with another tensor
   ```
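
A minimal sketch of what "inplace" means in practice is given below (all names are illustrative). An in-place method changes the existing tensor, so every variable that refers to it sees the change, whereas the ordinary operator allocates a new tensor.

```python
import torch

x = torch.ones(3)
alias = x                  # a second name for the same tensor, not a copy

y = x + 1                  # out-of-place: allocates a new tensor, x is unchanged
x.add_(1)                  # in-place: modifies x, and therefore alias, directly

print(x, alias, y)
print(alias is x)                      # same object
print(y.data_ptr() == x.data_ptr())    # different memory

# Use .clone() when an independent copy is needed before an in-place update
snapshot = x.clone()
x.mul_(10)
print(snapshot, x)
```

In-place operations save memory on large tensors, but they can interfere with gradient tracking, so they are best used deliberately.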

In [36]:
# Tensor operations for scientific computing

# Example: Two sets of experimental measurements
experiment_A = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
experiment_B = torch.tensor([[5.0, 6.0], [7.0, 8.0]])

print("Tensor Operations Examples")
print("=" * 60)
print(f"Experiment A:\n{experiment_A}\n")
print(f"Experiment B:\n{experiment_B}\n")

###################
# TODO: Complete the tensor operations
# Element-wise addition
combined = experiment_A.add( experiment_B ) # COMPLETE: Add the two tensors
print(f"A + B (element-wise):\n{combined}\n")

# Element-wise multiplication (Hadamard product)
product = experiment_A.mul(experiment_B)  # COMPLETE: Multiply element-wise
print(f"A * B (element-wise):\n{product}\n")

# Matrix multiplication (linear transformations)
matmul_result = experiment_A @ experiment_B  # COMPLETE: Matrix multiply (use @ operator)
print(f"A @ B (matrix multiplication):\n{matmul_result}\n")

# Square root (useful for RMS calculations)
sqrt_A = torch.sqrt(experiment_A)  # COMPLETE: Use torch.sqrt()
print(f"√A:\n{sqrt_A}\n")


Tensor Operations Examples
Experiment A:
tensor([[1., 2.],
        [3., 4.]])

Experiment B:
tensor([[5., 6.],
        [7., 8.]])

A + B (element-wise):
tensor([[ 6.,  8.],
        [10., 12.]])

A * B (element-wise):
tensor([[ 5., 12.],
        [21., 32.]])

A @ B (matrix multiplication):
tensor([[19., 22.],
        [43., 50.]])

√A:
tensor([[1.0000, 1.4142],
        [1.7321, 2.0000]])



In [37]:
###################
# TODO: Complete the statistical operations
# Statistical operations
print(f"Mean of A: {experiment_A.mean().item():.2f}")  # COMPLETE: Calculate mean
print(f"Standard deviation of A: {experiment_A.std().item():.2f}")  # COMPLETE: Calculate std
print(f"Sum along rows: {experiment_A.sum(dim=1)}")  # COMPLETE: Sum along dimension 1
print(f"Max along columns: {experiment_A.max(dim=0).values}")  # COMPLETE: Max along dimension 0

Mean of A: 2.50
Standard deviation of A: 1.29
Sum along rows: tensor([3., 7.])
Max along columns: tensor([3., 4.])
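
One detail worth knowing when comparing against NumPy: `Tensor.std()` divides by `N - 1` by default (the sample standard deviation), while `np.std()` divides by `N`. The sketch below assumes a PyTorch version that accepts the `correction` keyword (PyTorch 2.x); older releases use `unbiased=False` instead.

```python
import numpy as np
import torch

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

sample_std = a.std()                   # divides by N - 1 (PyTorch default)
population_std = a.std(correction=0)   # divides by N, like np.std's default
numpy_std = np.std(a.numpy())

print(f"torch std (default):      {sample_std.item():.4f}")
print(f"torch std (correction=0): {population_std.item():.4f}")
print(f"numpy std (default):      {numpy_std:.4f}")
```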


## 4.1 Matrix operations
***
Matrix operations are central to linear algebra and are used in many machine learning algorithms. We can perform:

- **Matrix multiplication**: The standard matrix product, written with the `@` operator or `torch.matmul()`. For two 1D vectors it reduces to the dot product.
- **Element-wise multiplication**: Multiplies two matrices of the same (or broadcastable) shape entry by entry, written with the `*` operator. This operation is also known as the Hadamard product.
- **Matrix transpose**: Flips a matrix over its diagonal, available through the `.T` attribute.
- **Matrix inverse**: Finds the inverse of a square, non-singular matrix with the `torch.inverse()` function.

***
| Operation | Description | Method | Example |
|-----------|-------------|--------|---------|
| Matrix Multiplication | Standard matrix product | @ or matmul() | `a @ b` |
| Transpose | Flip matrix dimensions | .T or transpose() | `a.T` |
| Inverse | Matrix inverse | inverse() | `torch.inverse(a)` |
| Determinant | Matrix determinant | det() | `torch.det(a)` |
| Eigenvalues | Eigenvalues and vectors | linalg.eig() | `torch.linalg.eig(a)` |
| Singular Value Decomposition | SVD decomposition | linalg.svd() | `torch.linalg.svd(a)` |
| Cholesky Decomposition | Cholesky factorization | linalg.cholesky() | `torch.linalg.cholesky(a)` |

***

<div align="center">
  <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/matrix_mul.gif" width="40%">
</div>




In [38]:
# Matrix operations for linear algebra (common in scientific computing)

# Example: Covariance matrix from experimental data
covariance_matrix = torch.tensor([[4.0, 2.0], [2.0, 3.0]], dtype=torch.float32)

print("Matrix Operations for Linear Algebra")
print("=" * 60)
print(f"Covariance Matrix:\n{covariance_matrix}\n")

###################
# TODO: Complete the matrix operations
# Matrix transpose
transposed = covariance_matrix.T  # COMPLETE: Get transpose (use .T attribute)
print(f"Transposed:\n{transposed}\n")

# Matrix determinant (measure of volume/invertibility)
det = torch.det(covariance_matrix)  # COMPLETE: Calculate determinant
print(f"Determinant: {det.item():.4f}")

# Matrix inverse (for solving linear systems)
inverse = torch.inverse(covariance_matrix)  # COMPLETE: Calculate inverse
print(f"Inverse:\n{inverse}\n")

# Verify: A @ A^(-1) = I
identity_check = covariance_matrix @ inverse  # COMPLETE: Matrix multiply (use @)
print(f"A @ A^(-1) ≈ I:\n{identity_check}\n")

# Matrix-matrix multiplication (transformations)
transformation = covariance_matrix @ covariance_matrix
print(f"A @ A:\n{transformation}")

Matrix Operations for Linear Algebra
Covariance Matrix:
tensor([[4., 2.],
        [2., 3.]])

Transposed:
tensor([[4., 2.],
        [2., 3.]])

Determinant: 8.0000
Inverse:
tensor([[ 0.3750, -0.2500],
        [-0.2500,  0.5000]])

A @ A^(-1) ≈ I:
tensor([[1., 0.],
        [0., 1.]])

A @ A:
tensor([[20., 14.],
        [14., 13.]])
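
When the goal is to solve a linear system rather than to inspect the inverse itself, `torch.linalg.solve()` is usually the better tool. Here is a short sketch using the same covariance matrix; the right-hand side `b` is an arbitrary vector chosen for illustration. It also shows `torch.linalg.eigh()`, which is intended for symmetric matrices such as this one.

```python
import torch

A = torch.tensor([[4.0, 2.0], [2.0, 3.0]])
b = torch.tensor([1.0, 2.0])   # hypothetical right-hand side

# Solve A x = b directly; usually preferable to forming the inverse explicitly
x = torch.linalg.solve(A, b)
print(f"Solution x: {x}")
print(f"Residual: {(A @ x - b).abs().max().item():.2e}")

# For symmetric matrices, eigh returns real eigenvalues in ascending order
# together with orthonormal eigenvectors (stored as columns)
eigenvalues, eigenvectors = torch.linalg.eigh(A)
print(f"Eigenvalues: {eigenvalues}")
```

As a sanity check, the eigenvalues should sum to the trace (7) and multiply to the determinant (8) computed above.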


## 4.2 Tensor Broadcasting
***

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/docs.svg" width="20" style="filter: invert(50%) sepia(50%) saturate(2000%) hue-rotate(90deg) brightness(915%) contrast(100%);"/> **Documentation**: [Broadcasting](https://numpy.org/doc/stable/user/basics.broadcasting.html) is a powerful feature of NumPy and PyTorch that allows us to perform operations on arrays of different shapes without having to explicitly reshape them.

PyTorch follows the same broadcasting semantics as NumPy. Broadcasting is how both libraries handle tensors with different shapes during arithmetic operations: the smaller tensor is automatically expanded to match the shape of the larger one, so we don't have to reshape it explicitly.

For example, if we have a 1D tensor of shape `(2,)` and a 2D tensor of shape `(3, 2)`, we can add them together without reshaping the 1D tensor. Shapes are compared from the last dimension backwards, so the 1D tensor is treated as a row and added to each of the three rows of the 2D tensor.

***
> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/code.svg" width="20" style="filter: invert(100%) sepia(100%) saturate(2000%) hue-rotate(40deg) brightness(915%) contrast(100%);"/> **Snippet 3**: Broadcasting example

```python
# Create a 1D tensor of shape (2,)
a = torch.tensor([1, 2])
# Create a 2D tensor of shape (3, 2)
b = torch.tensor([[1, 2], [3, 4], [5, 6]])
# Add the two tensors together
c = a + b  # Broadcasting occurs here
print(c)  # Output: tensor([[2, 4], [4, 6], [6, 8]])
```
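
Not every pair of shapes can be broadcast. Dimensions are compared from the last one backwards, and each pair must either match or contain a 1. The sketch below (variable names are illustrative) shows a pair that fails, how adding a size-1 dimension fixes it, and how `torch.broadcast_shapes` reports the resulting shape.

```python
import torch

row = torch.tensor([1, 2, 3])                     # shape (3,)
matrix = torch.tensor([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)

# Trailing dimensions are 3 and 2: neither matches nor is 1, so this fails
try:
    row + matrix
    compatible = True
except RuntimeError:
    compatible = False
print(f"(3,) + (3, 2) broadcastable: {compatible}")

# A trailing size-1 dimension makes the shapes compatible: (3, 1) vs (3, 2)
column = row.unsqueeze(1)
result = column + matrix
print(result)

# Query the broadcast shape without allocating any data
print(torch.broadcast_shapes((3, 1), (3, 2)))
```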

In [39]:
# Broadcasting: Automatic shape alignment for operations

# Example: Normalizing sensor data
raw_data = torch.tensor([[10.0, 20.0], [30.0, 40.0]])
offset = torch.tensor([5.0])  # Scalar to subtract
scale = torch.tensor([1.0, 2.0])  # Per-column scaling

print("Broadcasting Examples")
print("=" * 60)
print(f"Raw sensor data:\n{raw_data}\n")

# Broadcast scalar to matrix (subtract offset from all elements)
centered = raw_data - offset
print(f"After subtracting offset {offset.item()}:\n{centered}\n")

# Broadcast row vector to matrix (per-column scaling)
scaled = raw_data * scale
print(f"After per-column scaling {scale}:\n{scaled}\n")


Broadcasting Examples
Raw sensor data:
tensor([[10., 20.],
        [30., 40.]])

After subtracting offset 5.0:
tensor([[ 5., 15.],
        [25., 35.]])

After per-column scaling tensor([1., 2.]):
tensor([[10., 40.],
        [30., 80.]])



In [42]:
print("Broadcasting Examples")
print("=" * 60)
print(f"Raw sensor data:\n{raw_data}\n")

###################
# TODO: Complete the normalization using broadcasting
# More complex: normalize each column (mean=0, std=1)
mean_per_column = raw_data.mean(dim=0)  # COMPLETE: Compute mean over rows (dim=0), one value per column
std_per_column = raw_data.std(dim=0)  # COMPLETE: Compute std over rows (dim=0), one value per column
normalized = (raw_data - mean_per_column) / std_per_column  # COMPLETE: Normalize (subtract mean, divide by std)

print(f"Column means: {mean_per_column}")
print(f"\nColumn stds: {std_per_column}")
print(f"\nNormalized data:\n{normalized}\n")

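# Verify normalization
# Note: std() above used the default sample estimator (divides by n - 1), while this
# check uses unbiased=False (divides by n), so it prints sqrt((n - 1) / n) ≈ 0.7071
# for n = 2 rather than 1. Use the same setting in both places to get exactly 1.
print(f"\nNormalized means: {normalized.mean(dim=0)}")
print(f"\nNormalized stds: {normalized.std(dim=0, unbiased=False)}")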


Broadcasting Examples
Raw sensor data:
tensor([[10., 20.],
        [30., 40.]])

Column means: tensor([20., 30.])

Column stds: tensor([14.1421, 14.1421])

Normalized data:
tensor([[-0.7071, -0.7071],
        [ 0.7071,  0.7071]])


Normalized means: tensor([0., 0.])

Normalized stds: tensor([0.7071, 0.7071])
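
The normalized standard deviation above is 0.7071 rather than 1 because two different estimators are mixed: `std()` divides by n - 1 by default, while the check passes `unbiased=False`, which divides by n. A minimal sketch of the difference, using the same data and the population estimator in both places:

```python
import torch

raw_data = torch.tensor([[10.0, 20.0], [30.0, 40.0]])
mean = raw_data.mean(dim=0)

# Sample estimator (divides by n - 1): PyTorch's default
std_sample = raw_data.std(dim=0)
# Population estimator (divides by n)
std_population = raw_data.std(dim=0, unbiased=False)

# Normalize and verify with the same estimator
normalized = (raw_data - mean) / std_population
check = normalized.std(dim=0, unbiased=False)

print(f"Sample std:     {std_sample}")
print(f"Population std: {std_population}")
print(f"Std after normalization: {check}")
```

Either estimator works for normalization as long as the same one is used when checking the result.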


## 4.3 Reshaping Methods
***
Sometimes, we need to change the shape of a tensor without changing its data. We do this in order to prepare the tensor for a specific operation or to match the shape of another tensor. PyTorch provides several methods for reshaping tensors. Below is a table summarizing some of the most commonly used reshaping methods in PyTorch.

| Method | Description | Example | Note |
|--------|-------------|---------|------|
| `reshape()` | New shape, maybe new memory | `x.reshape(2,3)` | May copy data |
| `view()` | New shape, same memory | `x.view(2,3)` | Must be contiguous |
| `squeeze()` | Remove single dims | `x.squeeze()` | Removes size 1 dims |
| `unsqueeze()` | Add single dim | `x.unsqueeze(0)` | Adds size 1 dim |
| `expand()` | Broadcast dimensions | `x.expand(2,3)` | No data copy |
***

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/code.svg" width="20" style="filter: invert(100%) sepia(100%) saturate(2000%) hue-rotate(40deg) brightness(915%) contrast(100%);"/> **Snippet 4**: Reshaping a tensor

```python
# Create a 1D tensor of shape (6,)
x = torch.tensor([1, 2, 3, 4, 5, 6])
# Reshape to (2, 3)
print(x.reshape(2, 3))  # Output: tensor([[1, 2, 3], [4, 5, 6]])
```

In [41]:
# Reshaping tensors: crucial for data preparation

# Example: 1D time series that needs to be reshaped for processing
time_series = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

print("Reshaping Examples")
print("=" * 60)
print(f"Original time series: {time_series}, shape: {time_series.shape}\n")

###################
# TODO: Complete the reshaping operations
# Reshape to 2D (e.g., for batch processing)
reshaped_2x3 = time_series.reshape(2, 3)  # COMPLETE: Reshape to 2x3
print(f"Reshaped to 2×3:\n{reshaped_2x3}\n")

# Reshape to 3D (e.g., adding batch dimension)
reshaped_3d = time_series.reshape(1, 2, 3)  # COMPLETE: Reshape to 1x2x3
print(f"Reshaped to 1×2×3:\n{reshaped_3d}\n")

# Use -1 to infer one dimension
auto_reshape = time_series.reshape(3, -1)  # COMPLETE: Use -1 to auto-infer size
print(f"Reshape with -1 (3, -1):\n{auto_reshape}\n")

# Unsqueeze: Add a dimension (useful for broadcasting)
vector = torch.tensor([1.0, 2.0, 3.0])
column_vector = vector.unsqueeze(1)  # COMPLETE: Add dimension at position 1
print(f"Original vector: {vector.shape}")
print(f"Column vector: {column_vector.shape}")
print(f"Column vector:\n{column_vector}\n")

# Squeeze: Remove dimensions of size 1
squeezed = column_vector.squeeze()  # COMPLETE: Remove size-1 dimensions
print(f"After squeeze: {squeezed.shape}, {squeezed}")

Reshaping Examples
Original time series: tensor([1., 2., 3., 4., 5., 6.]), shape: torch.Size([6])

Reshaped to 2×3:
tensor([[1., 2., 3.],
        [4., 5., 6.]])

Reshaped to 1×2×3:
tensor([[[1., 2., 3.],
         [4., 5., 6.]]])

Reshape with -1 (3, -1):
tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])

Original vector: torch.Size([3])
Column vector: torch.Size([3, 1])
Column vector:
tensor([[1.],
        [2.],
        [3.]])

After squeeze: torch.Size([3]), tensor([1., 2., 3.])
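
The table's notes on `view()`, `reshape()`, and `expand()` are easiest to see on a non-contiguous tensor. The following sketch (variable names are illustrative) transposes a small tensor so that its memory layout no longer matches its shape, then compares the three methods.

```python
import torch

x = torch.arange(6).reshape(2, 3)
t = x.T  # shape (3, 2); a non-contiguous view of x
print(f"Contiguous after transpose: {t.is_contiguous()}")

# view() needs a compatible memory layout, so it fails here
try:
    t.view(6)
    view_ok = True
except RuntimeError:
    view_ok = False
print(f"view(6) succeeded: {view_ok}")

# reshape() falls back to copying the data when a view is not possible
flat = t.reshape(6)
print(flat)

# expand() repeats a size-1 dimension without copying (the new stride is 0)
col = torch.tensor([[1.0], [2.0], [3.0]])  # shape (3, 1)
wide = col.expand(3, 4)
print(wide.shape, wide.stride())
```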


# <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/write.svg" width="30"/> 5. Automatic Differentiation (Autograd)
***

Automatic differentiation is one of the most powerful features of PyTorch. It allows us to compute gradients automatically, which is essential for training neural networks. PyTorch uses a technique called **reverse mode differentiation** to compute gradients efficiently. This technique is based on the chain rule of calculus and allows us to compute gradients for complex functions with many variables.

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/docs.svg" width="20" style="filter: invert(50%) sepia(50%) saturate(2000%) hue-rotate(90deg) brightness(915%) contrast(100%);"/> **Documentation**: [Autograd](https://pytorch.org/docs/stable/autograd.html) is the automatic differentiation engine in PyTorch. It provides a way to compute gradients automatically for tensors with `requires_grad=True`.

Take for instance the following function:

$$f(x, y) = x^2 + 42y^2 + 3$$

where $x$ and $y$ are tensors. The partial derivatives of this function with respect to $x$ and $y$ are:

$$\frac{\partial f}{\partial x} = 2x$$
$$\frac{\partial f}{\partial y} = 84y$$

PyTorch computes these gradients automatically when we call the `backward()` method, applying the chain rule of calculus to every operation that produced $f$.
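
As a quick check of the two partial derivatives, the sketch below evaluates the function at an arbitrarily chosen point. The values $x = 3$ and $y = 0.5$ are assumptions made for illustration only.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = torch.tensor(0.5, requires_grad=True)

f = x**2 + 42 * y**2 + 3
f.backward()

print(f"df/dx = {x.grad.item()}")  # analytical value: 2x  = 6
print(f"df/dy = {y.grad.item()}")  # analytical value: 84y = 42
```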

The chain rule is a fundamental concept in calculus that allows us to compute the derivative of a composite function. It states that if we have two functions $f(x)$ and $g(x)$, then the derivative of their composition $f(g(x))$ is given by:

$$\frac{d}{dx}f(g(x)) = f'(g(x)) \cdot g'(x)$$

where $f'(g(x))$ is the derivative of $f$ evaluated at $g(x)$, and $g'(x)$ is the derivative of $g$ with respect to $x$. This means that we can compute the derivative of a composite function by computing the derivatives of its constituent functions and multiplying them together.
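
The chain rule can be checked numerically against autograd. The sketch below picks $f(g) = \sin(g)$ and $g(x) = x^2$ as an illustrative composite (these functions are an assumption, not part of the workshop material) and compares the hand-derived result $\cos(x^2) \cdot 2x$ with the gradient PyTorch computes.

```python
import math
import torch

x = torch.tensor(1.5, requires_grad=True)

g = x**2          # inner function g(x) = x^2
f = torch.sin(g)  # outer function f(g) = sin(g)
f.backward()

# Chain rule by hand: f'(g(x)) * g'(x) = cos(x^2) * 2x
manual = math.cos(1.5**2) * 2 * 1.5

print(f"autograd: {x.grad.item():.6f}")
print(f"manual:   {manual:.6f}")
```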

## Autograd Concepts
***

In PyTorch, the autograd engine keeps track of all operations performed on tensors with `requires_grad=True`. It builds a computation graph dynamically as operations are performed. This graph is used to compute gradients when we call the `backward()` method.

| Concept | Description | Example |
|---------|-------------|---------|
| `requires_grad` | Flag to track gradients | `x = torch.tensor(1.0, requires_grad=True)` |
| `backward()` | Compute gradients | `y.backward()` |
| `grad` | Access gradients | `x.grad` |
***

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/code.svg" width="20" style="filter: invert(100%) sepia(100%) saturate(2000%) hue-rotate(40deg) brightness(915%) contrast(100%);"/> **Snippet 5**: Using autograd to compute gradients

```python
# Create tensor with gradient tracking
x = torch.tensor([1.0], requires_grad=True)

# Compute function
y = x * x

# Compute gradient
y.backward()

# Access gradient
print(x.grad)  # tensor([2.]), since dy/dx = 2x = 2 at x = 1
```

In [45]:
###################
# TODO: Complete the automatic differentiation example
# Automatic differentiation: Essential for optimization and inverse problems

# Example: Optimize a physical parameter
# Suppose we have y = 3x³ + 2x² - 5x + 1
# We want to find dy/dx at x = 2.0

x = torch.tensor([2.0], requires_grad=True)  # COMPLETE: Enable gradient tracking (use True)

# Compute function (e.g., energy as a function of position)
y = 3 * x**3 + 2 * x**2 - 5 * x + 1

print("Automatic Differentiation Example")
print("=" * 60)
print(f"Function: y = 3x³ + 2x² - 5x + 1")
print(f"At x = {x.item():.1f}:")
print(f"  y = {y.item():.2f}")

# Compute gradient automatically
y.backward()  # COMPLETE: Call backward() to compute gradients

print(f"  dy/dx = {x.grad.item():.2f}")  # COMPLETE: Access gradient with .grad
print("\nAnalytical verification:")
print(f"  dy/dx = 9x² + 4x - 5")
print(f"  At x = 2: 9(4) + 4(2) - 5 = 36 + 8 - 5 = 39 ✓")



Automatic Differentiation Example
Function: y = 3x³ + 2x² - 5x + 1
At x = 2.0:
  y = 23.00
  dy/dx = 39.00

Analytical verification:
  dy/dx = 9x² + 4x - 5
  At x = 2: 9(4) + 4(2) - 5 = 36 + 8 - 5 = 39 ✓
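
One behaviour of `backward()` that often surprises newcomers is that gradients accumulate in `.grad` across calls instead of being overwritten. The following minimal sketch demonstrates this with $y = x^2$, and also shows `torch.no_grad()` for computations that should not be tracked.

```python
import torch

x = torch.tensor([2.0], requires_grad=True)

# First backward pass: d(x^2)/dx = 2x = 4
(x**2).sum().backward()
first = x.grad.clone()

# Second pass without resetting: the new gradient is added to the old one
(x**2).sum().backward()
accumulated = x.grad.clone()

# Reset the stored gradient before the next pass
x.grad.zero_()
(x**2).sum().backward()
after_reset = x.grad.clone()

# Operations inside no_grad() are not recorded in the computation graph
with torch.no_grad():
    y = x * 3

print(f"First pass:  {first}")
print(f"Accumulated: {accumulated}")
print(f"After reset: {after_reset}")
print(f"y tracks gradients: {y.requires_grad}")
```

This is why training loops typically reset gradients once per iteration.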


# <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/write.svg" width="30"/> 6. Converting Between NumPy and PyTorch
***

One of PyTorch's greatest strengths is seamless interoperability with NumPy. This allows you to:
- Use existing NumPy-based code and data
- Leverage PyTorch's GPU acceleration
- Easily integrate with the scientific Python ecosystem (SciPy, Pandas, Matplotlib, etc.)

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/reminder.svg" width="20" style="filter: invert(100%) sepia(100%) saturate(1500%) hue-rotate(30deg) brightness(450%) contrast(70%);"/> **Important**: NumPy arrays and PyTorch tensors can **share memory** on CPU, meaning changes to one affect the other (unless you explicitly copy).

In [46]:
# Converting between NumPy and PyTorch

print("NumPy ↔ PyTorch Conversion")
print("=" * 60)

###################
# TODO: Complete the NumPy-PyTorch conversions
# 1. NumPy to PyTorch
# Use the temperature data from our earlier NumPy example
numpy_temps = np.array([20.5, 21.0, 19.8, 22.1, 20.3])
torch_temps = torch.tensor(numpy_temps)  # COMPLETE: Convert from NumPy to PyTorch (torch.tensor copies the data)

print("NumPy → PyTorch:")
print(f"  NumPy array: {numpy_temps}")
print(f"  PyTorch tensor: {torch_temps}")
print(f"  Tensor dtype: {torch_temps.dtype}\n")

# 2. PyTorch to NumPy
torch_data = torch.tensor([1.0, 2.0, 3.0, 4.0])
numpy_data = torch_data.numpy()  # COMPLETE: Convert from PyTorch to NumPy

print("PyTorch → NumPy:")
print(f"  PyTorch tensor: {torch_data}")
print(f"  NumPy array: {numpy_data}")
print(f"  Array dtype: {numpy_data.dtype}\n")



NumPy ↔ PyTorch Conversion
NumPy → PyTorch:
  NumPy array: [20.5 21.  19.8 22.1 20.3]
  PyTorch tensor: tensor([20.5000, 21.0000, 19.8000, 22.1000, 20.3000], dtype=torch.float64)
  Tensor dtype: torch.float64

PyTorch → NumPy:
  PyTorch tensor: tensor([1., 2., 3., 4.])
  NumPy array: [1. 2. 3. 4.]
  Array dtype: float32



In [47]:
# 3. Memory sharing demonstration
print("Memory Sharing:")
np_array = np.array([1.0, 2.0, 3.0])
torch_tensor = torch.from_numpy(np_array)

print(f"Original NumPy: {np_array}")
print(f"PyTorch tensor: {torch_tensor}")

# Modify NumPy array
np_array[0] = 999.0
print(f"\nAfter modifying NumPy[0] = 999:")
print(f"  NumPy: {np_array}")
print(f"  PyTorch: {torch_tensor} (changed too!)\n")

Memory Sharing:
Original NumPy: [1. 2. 3.]
PyTorch tensor: tensor([1., 2., 3.], dtype=torch.float64)

After modifying NumPy[0] = 999:
  NumPy: [999.   2.   3.]
  PyTorch: tensor([999.,   2.,   3.], dtype=torch.float64) (changed too!)
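
Whether memory is shared depends on which conversion function is used. The sketch below (variable names are illustrative) contrasts `torch.from_numpy`, which shares memory, with `torch.tensor`, which copies. It also shows two related conversions: casting NumPy's default `float64` to `float32`, and detaching a gradient-tracking tensor before calling `.numpy()`.

```python
import numpy as np
import torch

source = np.array([1.0, 2.0, 3.0])

shared = torch.from_numpy(source)  # shares memory with the NumPy array
copied = torch.tensor(source)      # always copies the data

source[0] = 999.0
print(f"shared[0] = {shared[0].item()}, copied[0] = {copied[0].item()}")

# NumPy defaults to float64; .float() returns a float32 copy
as_float32 = torch.from_numpy(source).float()
print(f"dtype after .float(): {as_float32.dtype}")

# A tensor that tracks gradients must be detached before conversion
w = torch.tensor([1.0, 2.0], requires_grad=True)
try:
    w.numpy()
    direct_ok = True
except RuntimeError:
    direct_ok = False
w_np = w.detach().cpu().numpy()
print(f"Direct .numpy() worked: {direct_ok}; after detach: {w_np}")
```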



# <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/write.svg" width="30"/> 7. Working with Scientific Datasets: Higgs Boson
***

Now let's apply what we've learned to a real scientific dataset. We'll use the [Higgs Boson Dataset](https://archive.ics.uci.edu/ml/datasets/HIGGS) from the UCI Machine Learning Repository.

## About the Dataset

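The dataset contains 11 million Monte Carlo-generated collision events from the CERN Large Hadron Collider. Each event has:
- **28 features**: 21 low-level kinematic properties and 7 high-level features derived by physicists
- **1 label**: Binary classification (signal or background)

The copy loaded in Section 7.1 is laid out differently: its feature columns are named (prefixed `DER_` and `PRI_`), it includes `EventId` and `Weight` columns, and its `Label` column holds the strings `s` (signal) and `b` (background).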

**Scientific Context**: The Higgs boson is a fundamental particle discovered at CERN in 2012. Identifying Higgs events among background noise is a classic classification problem in high-energy physics.

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/reminder.svg" width="20" style="filter: invert(100%) sepia(100%) saturate(1500%) hue-rotate(30deg) brightness(450%) contrast(70%);"/> **Note**: We'll load only the first 10,000 rows for this workshop to keep things fast.

## 7.1 Loading the Data with Pandas

In [48]:
# Load Higgs Boson dataset
# This may take a moment as we're downloading from the workshop's GitHub repository

url = "https://github.com/CLDiego/uom_fse_dl_workshop/raw/refs/heads/main/training_data/HIGGS.zip"

print("Loading Higgs Boson Dataset...")
print("=" * 60)

# Default column names (used if download fails)
column_names = ['Label'] + [f'feature_{i}' for i in range(1, 29)]

# Load first 10,000 rows using the first row as header
try:
    higgs_df = pd.read_csv(url, nrows=10000, compression='zip', header=0)
    column_names = list(higgs_df.columns)

    print("✓ Dataset loaded successfully!")
except Exception as e:
    print(f"Error loading dataset: {e}")
    print("Falling back to sample data...")
    # Create sample data if download fails
    higgs_df = pd.DataFrame(
        np.random.randn(10000, 29),
        columns=column_names
    )
    # Match the real file: a 'Label' column holding 'b' (background) or 's' (signal)
    higgs_df['Label'] = np.random.choice(['b', 's'], 10000)

print(f"\nDataset Shape: {higgs_df.shape}")
print(f"  Samples: {len(higgs_df)}")
print(f"  Features: {len(higgs_df.columns) - 1}")
print(f"  Labels: {higgs_df['Label'].nunique()} classes")
print(f"\nClass distribution:")
print(higgs_df['Label'].value_counts())

Loading Higgs Boson Dataset...
✓ Dataset loaded successfully!

Dataset Shape: (10000, 33)
  Samples: 10000
  Features: 32
  Labels: 2 classes

Class distribution:
Label
b    6628
s    3372
Name: count, dtype: int64


## Exploring the Dataset

Let's examine the structure and statistics of our particle physics data.

In [49]:
# Display first few rows
higgs_df.head()

Unnamed: 0,EventId,DER_mass_MMC,DER_mass_transverse_met_lep,DER_mass_vis,DER_pt_h,DER_deltaeta_jet_jet,DER_mass_jet_jet,DER_prodeta_jet_jet,DER_deltar_tau_lep,DER_pt_tot,...,PRI_jet_num,PRI_jet_leading_pt,PRI_jet_leading_eta,PRI_jet_leading_phi,PRI_jet_subleading_pt,PRI_jet_subleading_eta,PRI_jet_subleading_phi,PRI_jet_all_pt,Weight,Label
0,100000,138.47,51.655,97.827,27.98,0.91,124.711,2.666,3.064,41.928,...,2,67.435,2.15,0.444,46.062,1.24,-2.475,113.497,0.002653,s
1,100001,160.937,68.768,103.235,48.146,-999.0,-999.0,-999.0,3.473,2.078,...,1,46.226,0.725,1.158,-999.0,-999.0,-999.0,46.226,2.233584,b
2,100002,-999.0,162.172,125.953,35.635,-999.0,-999.0,-999.0,3.148,9.336,...,1,44.251,2.053,-2.028,-999.0,-999.0,-999.0,44.251,2.347389,b
3,100003,143.905,81.417,80.943,0.414,-999.0,-999.0,-999.0,3.31,0.414,...,0,-999.0,-999.0,-999.0,-999.0,-999.0,-999.0,-0.0,5.446378,b
4,100004,175.864,16.915,134.805,16.405,-999.0,-999.0,-999.0,3.891,16.405,...,0,-999.0,-999.0,-999.0,-999.0,-999.0,-999.0,0.0,6.245333,b


In [50]:
print("Statistical Summary (first 5 features):")
print("=" * 60)
print(higgs_df.iloc[:, :6].describe())

Statistical Summary (first 5 features):
            EventId  DER_mass_MMC  DER_mass_transverse_met_lep  DER_mass_vis  \
count   10000.00000  10000.000000                 10000.000000  10000.000000   
mean   104999.50000    -45.304318                    49.132840     80.703992   
std      2886.89568    401.854494                    34.525596     39.169303   
min    100000.00000   -999.000000                     0.005000      7.330000   
25%    102499.75000     78.232500                    19.930750     59.045750   
50%    104999.50000    104.787000                    46.725500     73.537500   
75%    107499.25000    130.504000                    73.176250     92.353000   
max    109999.00000    813.396000                   444.719000    651.561000   

           DER_pt_h  DER_deltaeta_jet_jet  
count  10000.000000          10000.000000  
mean      57.906264           -704.783037  
std       68.997076            456.171676  
min        0.000000           -999.000000  
25%       12.875250

## 7.2 Converting Data to Tensors

To use PyTorch, we need to convert our Pandas DataFrame to tensors. We'll separate:
- **Features (X)**: The kinematic measurements
- **Labels (y)**: The binary class (signal=1, background=0)

In [52]:
###################
# TODO: Convert the dataset to PyTorch tensors
# Separate features and labels
# Note: this keeps EventId and Weight as columns of X, which is why the
# maximum printed below is an event ID rather than a physical measurement
X = higgs_df.drop('Label', axis=1).values  # All features as NumPy array
y = higgs_df['Label'].values  # Labels as NumPy array

# Convert labels from b and s to 0 and 1
y = np.where(y == 's', 1, 0)  # COMPLETE: 's' marks signal events

# Convert to PyTorch tensors
X_tensor = torch.tensor(X).float()  # COMPLETE: Convert NumPy to PyTorch
y_tensor = torch.from_numpy(y).long()   # COMPLETE: Use .long() for classification labels

print("Dataset Conversion to Tensors")
print("=" * 60)
print(f"Features (X):")
print(f"  Shape: {X_tensor.shape} (samples × features)")
print(f"  Data type: {X_tensor.dtype}")
print(f"  Min: {X_tensor.min().item():.4f}, Max: {X_tensor.max().item():.4f}")

print(f"\nLabels (y):")
print(f"  Shape: {y_tensor.shape}")
print(f"  Data type: {y_tensor.dtype}")
print(f"  Unique values: {torch.unique(y_tensor)}")

# Show a single sample
print(f"\nExample collision event (first sample):")
print(f"  Features: {X_tensor[0, :5]} ... (showing first 5 of {X_tensor.shape[1]})")
print(f"  Label: {y_tensor[0].item()} ({'signal' if y_tensor[0] == 1 else 'background'})")

Dataset Conversion to Tensors
Features (X):
  Shape: torch.Size([10000, 32]) (samples × features)
  Data type: torch.float32
  Min: -999.0000, Max: 109999.0000

Labels (y):
  Shape: torch.Size([10000])
  Data type: torch.int64
  Unique values: tensor([0, 1])

Example collision event (first sample):
  Features: tensor([1.0000e+05, 1.3847e+02, 5.1655e+01, 9.7827e+01, 2.7980e+01]) ... (showing first 5 of 32)
  Label: 1 (signal)
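
The minimum of -999 and the maximum of 109999 in the output above suggest that some cleaning is needed before modelling. The sketch below works on a three-row stand-in whose values are copied from the first rows shown by `higgs_df.head()`. It rests on two assumptions that should be checked against the dataset's documentation: that `EventId` and `Weight` are bookkeeping columns rather than model inputs, and that `-999.0` is a placeholder for values that could not be computed. Filling with the column mean is just one simple option.

```python
import numpy as np
import pandas as pd
import torch

# Stand-in DataFrame with a subset of the real columns
toy_df = pd.DataFrame({
    "EventId": [100000, 100001, 100002],
    "DER_mass_MMC": [138.47, 160.937, -999.0],
    "DER_pt_h": [27.98, 48.146, 35.635],
    "Weight": [0.002653, 2.233584, 2.347389],
    "Label": ["s", "b", "b"],
})

# Assumption: EventId and Weight are not model inputs
excluded = ("EventId", "Weight", "Label")
feature_cols = [c for c in toy_df.columns if c not in excluded]
X = torch.tensor(toy_df[feature_cols].to_numpy(), dtype=torch.float32)
y = torch.tensor((toy_df["Label"] == "s").to_numpy().astype(np.int64))

# Assumption: -999.0 marks a missing value
valid = X != -999.0
counts = valid.sum(dim=0).clamp(min=1)  # avoid division by zero
col_mean = torch.where(valid, X, torch.zeros_like(X)).sum(dim=0) / counts

# Replace missing entries with the mean of the valid entries in that column
X_filled = torch.where(valid, X, col_mean)

print(f"Feature columns: {feature_cols}")
print(f"Labels: {y}")
print(f"Filled features:\n{X_filled}")
```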


# <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/write.svg" width="30"/> 8. Creating a Custom Dataset Class

For more complex data handling, PyTorch provides the `torch.utils.data.Dataset` class. By creating a custom dataset class, we can:
- Load data on-demand (useful for large datasets that don't fit in memory)
- Apply transformations consistently
- Integrate seamlessly with PyTorch's `DataLoader` for batching

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/docs.svg" width="20" style="filter: invert(50%) sepia(50%) saturate(2000%) hue-rotate(90deg) brightness(915%) contrast(100%);"/> **Documentation**: See the [PyTorch Data Tutorial](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html) for more details.

A custom Dataset class must implement:
1. `__init__`: Initialize the dataset (load file paths, read metadata, etc.)
2. `__len__`: Return the number of samples
3. `__getitem__`: Return a single sample given an index

In [None]:
from torch.utils.data import Dataset, DataLoader

class HiggsDataset(Dataset):
    """Custom Dataset for Higgs Boson particle collision data."""

    def __init__(self, dataframe, transform=None):
        """
        Args:
            dataframe: Pandas DataFrame with a 'Label' column ('s' or 'b') and feature columns
            transform: Optional transform to apply to features (e.g., normalization)
        """
        self.features = torch.from_numpy(
            dataframe.drop('Label', axis=1).values
        ).float()
        self.labels = torch.from_numpy(
            dataframe['Label'].apply(lambda x: 1 if x == 's' else 0).values
        ).long()
        self.transform = transform

    def __len__(self):
        """Return the total number of samples."""
        return len(self.labels)

    def __getitem__(self, idx):
        """
        Return a single sample (features, label) at index idx.

        Args:
            idx: Index of the sample to retrieve

        Returns:
            tuple: (features, label) where features is a tensor of shape (num_features,)
                   and label is a scalar tensor
        """
        features = self.features[idx]
        label = self.labels[idx]

        # Apply transformation if specified
        if self.transform:
            features = self.transform(features)

        return features, label

# Create dataset instance
higgs_dataset = HiggsDataset(higgs_df)

print("Custom HiggsDataset Created")
print("=" * 60)
print(f"Dataset size: {len(higgs_dataset)} samples")
print(f"\nAccessing samples:")

# Access individual samples
for i in range(3):
    features, label = higgs_dataset[i]
    print(f"\nSample {i}:")
    print(f"  Features shape: {features.shape}")
    print(f"  First 5 features: {features[:5]}")
    print(f"  Label: {label.item()} ({'signal' if label == 1 else 'background'})")
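
The `transform` argument is left unused in the cell above. The following sketch shows one way it could be used for standardization. `ToyDataset` and `standardize` are hypothetical names introduced here, and the data is made up; the class only mirrors the three-method pattern described earlier.

```python
import torch
from torch.utils.data import Dataset

class ToyDataset(Dataset):
    """Hypothetical stand-in that follows the same three-method pattern."""

    def __init__(self, features, labels, transform=None):
        self.features = features
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        x = self.features[idx]
        if self.transform is not None:
            x = self.transform(x)
        return x, self.labels[idx]

features = torch.tensor([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
labels = torch.tensor([0, 1, 0])

# Statistics are computed once and reused for every sample
mean = features.mean(dim=0)
std = features.std(dim=0)

def standardize(x):
    """Scale one sample to zero mean and unit (sample) standard deviation."""
    return (x - mean) / std

dataset = ToyDataset(features, labels, transform=standardize)
x0, y0 = dataset[0]
print(len(dataset), x0, y0)
```

In practice the statistics would be computed on the training split only and then applied unchanged to validation and test data.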

## 8.1 Using DataLoader for Batching

The `DataLoader` wraps a Dataset and provides:
- **Batching**: Combine multiple samples into batches
- **Shuffling**: Randomize sample order (important for training)
- **Parallel loading**: Use multiple workers for faster data loading
- **Collation**: Stacks the individual samples returned by the Dataset into batch tensors

This is essential for training neural networks, where we process data in mini-batches rather than one sample at a time.

In [None]:
# Create a DataLoader
batch_size = 64
dataloader = DataLoader(
    higgs_dataset,
    batch_size=batch_size,
    shuffle=True,  # Randomize order
    num_workers=0  # Set to 0 for compatibility; increase for parallel loading
)

print("DataLoader Configuration")
print("=" * 60)
print(f"Batch size: {batch_size}")
print(f"Number of batches: {len(dataloader)}")
print(f"Total samples: {len(higgs_dataset)}")

# Iterate through the first batch
for batch_idx, (features_batch, labels_batch) in enumerate(dataloader):
    if batch_idx == 0:  # Only show first batch
        print(f"\nFirst Batch:")
        print(f"  Features batch shape: {features_batch.shape} (batch × features)")
        print(f"  Labels batch shape: {labels_batch.shape}")
        print(f"\n  Sample features from batch:")
        print(features_batch[:2, :5])  # First 2 samples, first 5 features
        print(f"\n  Sample labels from batch:")
        print(f"  {labels_batch[:10]}")  # First 10 labels
        break

print("\n" + "=" * 60)
print("✓ Data pipeline ready for model training!")
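
Before training, data is usually split into training and validation sets, each with its own `DataLoader`. The sketch below uses made-up data and `torch.utils.data.random_split`; the 80/20 split, the seed, and the batch size are arbitrary choices for illustration. It also shows how the number of batches follows from the dataset size.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Made-up data: 100 samples with 4 features each
dataset = TensorDataset(torch.randn(100, 4), torch.randint(0, 2, (100,)))

# A seeded generator makes the split reproducible
generator = torch.Generator().manual_seed(0)
train_set, val_set = random_split(dataset, [80, 20], generator=generator)

train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=32, shuffle=False)

# 80 samples at 32 per batch: two full batches plus one with 16 samples
batch_sizes = [len(labels) for _, labels in train_loader]
print(len(train_loader), batch_sizes)

# drop_last=True discards the incomplete final batch
full_only_loader = DataLoader(train_set, batch_size=32, drop_last=True)
print(len(full_only_loader))
```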

# <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/write.svg" width="30"/> 9. GPU Acceleration
***

One of PyTorch's main advantages is seamless GPU support. GPUs (Graphics Processing Units) excel at parallel computations, making them ideal for:
- Large matrix operations
- Deep learning model training
- Monte Carlo simulations
- Image processing
- Any computation that can be parallelized

## Moving Tensors to GPU

To use the GPU, we need to move tensors from CPU to GPU memory using the `.to()` method.

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/reminder.svg" width="20" style="filter: invert(100%) sepia(100%) saturate(1500%) hue-rotate(30deg) brightness(450%) contrast(70%);"/> **Important**: All tensors in an operation must be on the same device (CPU or GPU). Mixing devices will cause errors.

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/idea.svg" width="20" style="filter: invert(100%) sepia(100%) saturate(2000%) hue-rotate(40deg) brightness(915%) contrast(100%);"/> **Tip**: For code that works on both CPU and GPU, use `device = torch.device("cuda" if torch.cuda.is_available() else "cpu")` and move all tensors to `device`.

In [53]:
# Check device availability and set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

print("GPU Acceleration Setup")
print("=" * 60)
print(f"Device: {device}")

if torch.cuda.is_available():
    print(f"GPU Name: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("No GPU available - computations will run on CPU")

# Example: Move a tensor to the device
data = torch.randn(1000, 1000)
print(f"\nOriginal tensor device: {data.device}")

# Move to GPU (if available)
data_gpu = data.to(device)
print(f"After .to(device): {data_gpu.device}")

# Perform computation on the device
result = data_gpu @ data_gpu.T  # Matrix multiplication
print(f"Result device: {result.device}")
print(f"Result shape: {result.shape}")

# Move back to CPU (needed for NumPy conversion)
result_cpu = result.cpu()
print(f"\nMoved back to CPU: {result_cpu.device}")

print("\n" + "=" * 60)
print("✓ GPU operations complete!")

GPU Acceleration Setup
Device: cpu
No GPU available - computations will run on CPU

Original tensor device: cpu
After .to(device): cpu
Result device: cpu
Result shape: torch.Size([1000, 1000])

Moved back to CPU: cpu

✓ GPU operations complete!
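
When comparing CPU and GPU speed, note that CUDA operations are launched asynchronously, so a naive timer can stop before the GPU has finished. The sketch below uses a hypothetical helper, `timed_matmul`, that calls `torch.cuda.synchronize()` around the timed region when running on a GPU. The matrix size is an arbitrary choice, and a single measurement like this is only a rough indication.

```python
import time
import torch

def timed_matmul(size, device):
    """Time one size x size matrix product on the given device."""
    a = torch.randn(size, size, device=device)
    b = torch.randn(size, size, device=device)
    if device.type == "cuda":
        torch.cuda.synchronize()  # finish pending GPU work before starting the clock
    start = time.perf_counter()
    c = a @ b
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for the asynchronous kernel to finish
    return c, time.perf_counter() - start

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
result, seconds = timed_matmul(256, device)
print(f"{device}: {seconds * 1e3:.2f} ms for a 256×256 product")
```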


## When to Use GPU vs CPU

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/idea.svg" width="20" style="filter: invert(100%) sepia(100%) saturate(2000%) hue-rotate(40deg) brightness(915%) contrast(100%);"/> **Decision Guide**:

**Use GPU for**:
- Training deep neural networks
- Large matrix operations (roughly 1000×1000 and larger; the break-even point depends on the hardware)
- Batch processing of images/signals
- Long-running simulations
- Operations repeated many times

**Use CPU for**:
- Small computations (overhead of GPU transfer isn't worth it)
- Operations not parallelizable
- Debugging (easier to inspect values)
- When GPU memory is limited

> <img src="https://github.com/CLDiego/uom_fse_dl_workshop/raw/main/figs/icons/reminder.svg" width="20" style="filter: invert(100%) sepia(100%) saturate(1500%) hue-rotate(30deg) brightness(450%) contrast(70%);"/> **Device Consistency Rule**: All tensors and models in an operation must be on the same device. Common pattern:

```python
# Good practice: move everything to device at the start
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = x.to(device)
y = y.to(device)
model = model.to(device)

# Now operations work
output = model(x)
loss = loss_fn(output, y)
```
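
The pattern above leaves `x`, `y`, `model`, and `loss_fn` undefined. A self-contained version might look like the following sketch, where a single linear layer, a mean-squared-error loss, and random data are hypothetical stand-ins chosen only so that the code runs on either device.

```python
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-ins for the names used in the pattern
model = nn.Linear(4, 1).to(device)
loss_fn = nn.MSELoss()
x = torch.randn(8, 4).to(device)
y = torch.randn(8, 1).to(device)

# Everything lives on the same device, so these operations work
output = model(x)
loss = loss_fn(output, y)
print(f"Output device: {output.device}, loss: {loss.item():.4f}")

# Detach and move to CPU before converting to NumPy
output_np = output.detach().cpu().numpy()
print(output_np.shape)
```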

---