# Day 3


Today we will cover the following topics:

- [More OOP](#More-OOP) (protected and private attributes, decorators)
- A selection of [design patterns](#Design-Patterns)
- Basics of [PyTorch](#PyTorch)
- A brief overview of [Deep Learning](#deep-learning)
- Building a [Neural Network](#neural-networks) with PyTorch
- A brief introduction to the [Python Debugger](#python-debugger)

**Credits:**

- Part of the material about OOP and design patterns builds upon and includes the material from *Scientific Programming with Python* (WiSe 2023/2024), which was kindly provided by **Andriy Sokolov**.

## Class Decorators

In Python, there are two ways one can use decorators on classes:

- **decorate the methods within the class**
- **decorate the whole class**

In [None]:
class Circle:
  abbrev = "circ"

  def __init__(self, radius):
    self.__radius = radius

  @property
  def radius(self):
    """Get value of radius"""
    return self.__radius

  @radius.setter
  def radius(self, value):
    """Set radius, raise error if negative"""
    if value >= 0:
      self.__radius = value
    else:
      raise ValueError("Radius must be >= 0!")

  @property
  def area(self):
    return self.pi() * self.radius**2

  def cylinder_volume(self, height):
    """Calculate volume of cylinder with circle as base"""
    return self.area * height

  @classmethod
  def unit_circle(cls):
    """Factory method creating a circle with radius 1"""
    print(cls.abbrev)
    return cls(1)

  @staticmethod
  def pi():
    return 3.1415926535

In [None]:
c = Circle(5)
print(c.radius)
print(c.area)
c.radius = 2
try:
    c.radius = -1
except ValueError:
    print("A radius needs to be positive!")
print(c.area)
try:
    c.area = 100
except AttributeError as e:
    print(e)
print(c.cylinder_volume(height=4))
c = Circle.unit_circle()
print(c.radius)
print(c.pi())
print(Circle.pi())

### Class Decorators (2)

There are also decorators for classes, e.g., `dataclass`. The `dataclasses` module makes it easier to create classes that **mostly consist of attributes** (see [https://docs.python.org/3/library/dataclasses.html](https://docs.python.org/3/library/dataclasses.html)). Note that the type annotations are not enforced at runtime.

In [None]:
from dataclasses import dataclass

@dataclass
class Location:
  name: str
  lon: float = 0.0
  lat: float = 0.0

  def __str__(self):
    return f"{self.name} is located at ({self.lon}, {self.lat})."

nowhere = Location("Nowhere")
wrong_location = Location("Here", "There")
print(wrong_location.lon)
oslo = Location("Oslo", 10.8, 59.9)
print(oslo)
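
As a minimal sketch of what `@dataclass` generates for you, consider the following example. The class `Point` is a hypothetical illustration and not part of the course material; it only relies on the standard `dataclasses` module.

```python
from dataclasses import dataclass


@dataclass
class Point:  # hypothetical example class
    x: float
    y: float = 0.0


p = Point(1.5)
print(p)                     # uses the generated __repr__
print(p == Point(1.5, 0.0))  # uses the generated __eq__ (compares field by field)

# Type hints are not enforced at runtime, so this does not raise an error
q = Point("not a number")
print(type(q.x))
```

The generated `__init__`, `__repr__`, and `__eq__` are derived from the annotated fields, which is why no constructor had to be written by hand.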

### Nested Decorators

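Decorators can also be nested (stacked). They are applied from the bottom up, i.e., the decorator closest to the function wraps it first. As a consequence, the wrapper of the topmost decorator is the outermost one and is executed first when the function is called.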

In [None]:
# NOTE: `timer` is not imported here; it is assumed to be defined in an earlier cell
from my_decorator import do_twice
@timer
@do_twice
def whee():
  j = 0
  for i in range(1000):
    j += i
  print("Whee!")
whee()
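
To make the execution order of stacked decorators visible, here is a minimal, self-contained sketch. The decorators `outer` and `inner` and the list `calls` are hypothetical names chosen for illustration; they are not part of `my_decorator`.

```python
import functools

calls = []  # records the order in which the wrappers run


def outer(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append("outer: before")
        result = func(*args, **kwargs)
        calls.append("outer: after")
        return result
    return wrapper


def inner(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append("inner: before")
        result = func(*args, **kwargs)
        calls.append("inner: after")
        return result
    return wrapper


@outer
@inner
def whee():
    calls.append("whee")


whee()  # equivalent to outer(inner(whee))()
print(calls)
```

The printed list shows that the wrapper of `outer` starts first and finishes last, while the wrapper of `inner` runs in between.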

### Decorators with Arguments

- Decorators can also take arguments, and they can even be written so that they work both with and without arguments. Most likely, you do not need this, but it is nice to have the flexibility.
- Such decorators are often found in **unit testing**.
- Note that the example below uses `functools.wraps`, a decorator that **copies the metadata** of the original function (e.g., its name and docstring) onto the wrapper.

In [None]:
import functools

def repeat(num_times):
    def decorator_repeat(func):
        @functools.wraps(func)
        def wrapper_repeat(*args, **kwargs):
            for _ in range(num_times):
                value = func(*args, **kwargs)
            return value
        return wrapper_repeat
    return decorator_repeat
      
@repeat(num_times=4)
def greet(name):
  print(f"Hello {name}")

greet("World")   
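
The `repeat` decorator above always has to be called with an argument. One common way to support both `@repeat` and `@repeat(num_times=...)` is sketched below. The keyword-only parameter, the default of `2`, and the helper functions `ping` and `pong` are assumptions made for this illustration.

```python
import functools


def repeat(_func=None, *, num_times=2):
    """Repeat the decorated function; usable with or without arguments."""
    def decorator_repeat(func):
        @functools.wraps(func)
        def wrapper_repeat(*args, **kwargs):
            value = None
            for _ in range(num_times):
                value = func(*args, **kwargs)
            return value
        return wrapper_repeat

    if _func is None:
        # Called as @repeat(num_times=...): return the actual decorator
        return decorator_repeat
    # Called as plain @repeat: decorate the function directly
    return decorator_repeat(_func)


@repeat
def ping(log):
    log.append("ping")


@repeat(num_times=3)
def pong(log):
    log.append("pong")


log = []
ping(log)
pong(log)
print(log)
```

The trick is that a plain `@repeat` passes the function as the first positional argument, whereas `@repeat(num_times=3)` leaves `_func` at `None`.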

___

## Exercise (Bank Account)

Create a class `BankAccount` that has some built-in safeguards and is robust against invalid input to some extent. Use the `dataclass` decorator.

The class should have the following instance attributes:

- **balance** should be a private attribute
- **owner_name** a protected attribute
- **account_type** (a `str`, either "Savings" or "Checkings") can be public

In addition to a standard constructor that initializes all three attributes, the class should have two methods:

- `deposit(amount)`: Adds money to the account balance.
- `withdraw(amount)`: Deducts money from the account balance if sufficient funds are available; otherwise raises an exception.

To protect both methods, write a decorator `validate_input` that takes the two inputs `min_value` and `max_value` (where the latter does not need to be set and is infinity by default). If an invalid `amount` is provided (e.g., negative deposit), it should raise a `ValueError`.

Test your code on the example code below.

In [26]:
import numpy as np
from dataclasses import dataclass, field
from functools import wraps

def validate_input(min_value, max_value=np.inf):
    # Decorator factory: checks that `amount` is a number within [min_value, max_value]
    def decorator(func):
        @wraps(func)
        def wrapper(self, amount, *args, **kwargs):
            # Type check
            if not isinstance(amount, (int, float)):
                raise ValueError("Amount must be a number.")

            # Range check
            if amount < min_value or amount > max_value:
                raise ValueError(
                    f"Amount must be between {min_value} and {max_value}."
                )

            return func(self, amount, *args, **kwargs)

        return wrapper
    return decorator


@dataclass
class BankAccount:
    owner_name: str
    initial_balance: float
    account_type: str = "Savings"
    __balance: float = field(init=False, repr=False)

    def __post_init__(self):
        self._owner_name = self.owner_name
        del self.owner_name

        if self.account_type not in {"Savings", "Checkings"}:
            raise ValueError("account_type must be 'Savings' or 'Checkings'")

        if self.initial_balance < 0:
            raise ValueError("Initial balance cannot be negative")

        self.__balance = float(self.initial_balance)

    @property
    def balance(self):
        return self.__balance

    @validate_input(0.01)
    def deposit(self, amount):
        self.__balance += amount

    
    @validate_input(min_value=0.01, max_value=500)
    def withdraw(self, amount):
        if amount > self.__balance:
            raise RuntimeError("Insufficient funds")
        self.__balance -= amount

    def __str__(self):
        return f"BankAccount(owner='{self._owner_name}', type='{self.account_type}', balance={self.__balance:.2f})"




In [27]:
# Execute the example code below

# Create an instance of BankAccount
alice_account = BankAccount(owner_name="Alice", initial_balance=1000)
print(alice_account) # return a nice string representation
print(f"Current balance: {alice_account.balance}")

# If correct, an AttributeError will be raised.
try:
    alice_account.balance = 1000000
except AttributeError:
    print("The balance cannot be set manually!")

# Valid deposit operation
print("\nDepositing $500...")
alice_account.deposit(500)

# Valid withdrawal operation
print("\nWithdrawing $200...")
alice_account.withdraw(200)

# Invalid withdrawal operation (negative number)
try:
    alice_account.withdraw(-50)
except ValueError as e:
    print(f"Error: {e}")

# Make sure the maximum withdrawal can only be 500
try:
    alice_account.withdraw(1000)
except ValueError as e:
    print(f"Error: {e}") 

alice_account.withdraw(500)
print(alice_account.balance) # should be 800

BankAccount(owner='Alice', type='Savings', balance=1000.00)
Current balance: 1000.0
The balance cannot be set manually!

Depositing $500...

Withdrawing $200...
Error: Amount must be between 0.01 and 500.
Error: Amount must be between 0.01 and 500.
800.0


## Design Patterns

### Template Pattern

The **template pattern** is a behavioral design pattern that allows you to define a **skeleton of an algorithm** in a base class and let **subclasses override** the steps **without changing the overall** algorithm’s **structure**.

In [None]:
from abc import ABC, abstractmethod
import math


# Define Abstract Base Class
class Shape(ABC):
    """
    Abstract base class defining a template for calculating area.
    """

    @property
    def area(self):
        """
        Template method: Defines the overall structure for calculating area.
        """
        self.validate_inputs()
        return self.compute_area()

    @abstractmethod
    def validate_inputs(self):
        """Abstract method to validate shape-specific inputs."""
        pass
    
    @abstractmethod
    def compute_area(self):
        """Abstract method to compute area of the shape."""
        pass

class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def validate_inputs(self):
        if self.width <= 0 or self.height <= 0:
            raise ValueError("Width and height must be positive numbers.")

    def compute_area(self):
        return self.width * self.height    

In [None]:
rectangle = Rectangle(width=5, height=10)
print(rectangle.area)

In [None]:
class FalseShape(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

# Raises a TypeError, because the abstract methods are not implemented
fs = FalseShape(2, 3)

**Miniexercise:** Create a subclass `Circle` and test it a bit.

In [None]:
class Circle(Shape):
    # TODO

___

## Exercise (Shape Factory)

Create another subclass of `Shape` for a triangle, which is constructed from the lengths of all three sides. Then create a `ShapeFactory` with the static method `create_shape(shape_type, *args)` and test your factory with a few examples.

Note: Make sure that the criterion $a^2 + b^2 = c^2$ is met, i.e., that the three sides form a right-angled triangle with hypotenuse $c$.

In [None]:
class Triangle(Shape):
    # TODO

class ShapeFactory:
    # TODO
        
shape = ShapeFactory.create_shape("circle", 12)
print(shape.area)
# Try out more

## PyTorch

### Tensors

**Tensors** are Torch's equivalent of **ndarrays** in NumPy. The main difference is that **tensors can run on GPUs** and other hardware accelerators. In addition, tensors are optimized for automatic differentiation with autograd.

#### Initializing a Tensor

In [None]:
import torch
import numpy as np

# Direct initialization from data
data = [[1, 2],[3, 4]]
print(data)
x_data = torch.tensor(data)
print(x_data)

# Initialization from a NumPy array
np_array = np.array(data)
x_np = torch.from_numpy(np_array)
print(type(x_np))

# Create a tensor of ones that retains the properties of x_data
x_ones = torch.ones_like(x_data)
print(f"Ones Tensor: \n {x_ones} \n")

# Create a tensor with random numbers, the shape of x_data and datatype float.
x_rand = torch.rand_like(x_data, dtype=torch.float)
print(f"Random Tensor: \n {x_rand} \n")

shape = (2, 3,)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(zeros_tensor)
print(rand_tensor)

#### Tensor Attributes

- One can access the **shape** and **data type** of a tensor directly, via `.shape` and `.dtype`, respectively.
- In addition, we can query **on which device the tensor is stored** with `.device`. This is important if we want to work with GPUs (more on devices later). By **default**, objects are stored on the **CPU**.

In [None]:
tensor = torch.rand(3, 4)

print(f"Shape: {tensor.shape}")
print(f"Datatype: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")

#### Basic Operations

In [None]:
t = torch.ones(2,4)
# Indexing and slicing as in NumPy
t[:,2] = 0
print(t)

# Concatenate along axis
t_cat = torch.cat([t,t], dim=1)
print(t_cat)

# Stack along axis
t_stack = torch.stack((t,t), dim=1)
print(t_stack)
print(f"{t_cat.shape} vs. {t_stack.shape}")

# Computing element-wise products
t_ep = t.mul(t)
print(t_ep)
# Alternatively use `*`
print(t_ep == (t * t))

# Transposing a matrix
print(t.T)
# Computing matrix products
t_mp = t.matmul(t.T)
print(t_mp)
# Alternatively use `@`
print((t_mp != (t @ t.T)).sum())

#### In-Place Operations

**In-place** operations have a `_` **suffix**. For example, `x.copy_(y)` and `x.t_()` will change **x**.

In [None]:
t = torch.rand(2, 4)
print(t.shape)
t_transposed = t.T
print(t.shape)
t.t_()
print(t.shape)
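
The difference between an out-of-place and an in-place operation can also be seen with `add` and `add_`. The following is a minimal sketch; the variable names are chosen for illustration only.

```python
import torch

x = torch.ones(3)

# Out-of-place: returns a new tensor, x itself is unchanged
y = x.add(1)
unchanged = torch.equal(x, torch.ones(3))
print(x, y, unchanged)

# In-place (note the trailing underscore): modifies x directly
x.add_(1)
changed = torch.equal(x, torch.full((3,), 2.0))
print(x, changed)
```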

#### Bridge to NumPy

We can convert Torch tensors to NumPy arrays and vice versa via `torch.from_numpy()` and `tensor.numpy()`, **but the corresponding objects still share the underlying memory!**

In [None]:
t = torch.ones(5)
print(f"Initial t: {t}")
n = t.numpy()
t.add_(1)
print(f"t + 1: {t}")
print(f"n: {n}")
n = np.ones(5)
t = torch.from_numpy(n)
np.add(n, 1, out=n)
print(f"t: {t}")
print(f"n: {n}")
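
If the shared memory is not wanted, the data has to be copied explicitly. The sketch below shows one possible way to do this for CPU tensors, using `torch.tensor()` (which copies its input) and NumPy's `.copy()`; the variable names are illustrative.

```python
import numpy as np
import torch

# NumPy -> Torch
n = np.ones(3)
t_shared = torch.from_numpy(n)  # shares memory with n
t_copy = torch.tensor(n)        # copies the data
n += 1                          # in-place change of the NumPy array
print(t_shared)                 # follows the change
print(t_copy)                   # keeps the old values

# Torch -> NumPy
t = torch.zeros(3)
n_shared = t.numpy()            # shares memory with t
n_copy = t.numpy().copy()       # independent copy
t.add_(5)                       # in-place change of the tensor
print(n_shared)                 # follows the change
print(n_copy)                   # keeps the old values
```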

#### GPU and CPU Devices

In [None]:
import torch

# Available CUDA devices
cuda_devices = [f'cuda:{i}' for i in range(torch.cuda.device_count())]

# CPU is always available
available_devices = ['cpu'] + cuda_devices

# Check if MPS is available (for Apple)
if torch.backends.mps.is_available():
  available_devices.append('mps')

print("Available devices:", available_devices)

___

## Exercise: Tensor Basics

1. Create a 3D tensor of shape `(2, 3, 4)` filled with random numbers between 0 and 1.
2. Reshape the tensor into two different shapes: `(6, 4)`, and `(4, 6)`
3. Perform a matrix product (dot product) between the reshaped tensors from Step 2.
4. Compute the following statistics on the resulting matrix: the sum, mean, maximum and minimum of all elements.
5. Normalize the resulting matrix so that its values lie between 0 and 1 using min-max normalization: $x_{\text{norm}} = \frac{x - x_{\text{min}}}{x_{\text{max}} - x_{\text{min}}}$
6. Verify if the normalized matrix has values within $[0, 1]$. Hint: You can use the functions `.le()`, `.ge()` and `.all()`.
7. Reshape the original tensor back into its original shape `(2, 3, 4)` after performing some transformation (e.g., adding a constant to each element), ensuring you understand how reshaping works.

Complete the code below.

In [None]:
import torch

torch.manual_seed(42)

# TODO

# Print results for verification
print("Original Tensor:\n", tensor)
print("\nSum of Elements:", sum_elements.item())
print("Mean of Elements:", mean_elements.item())
print("Max Value:", max_value.item())
print("Min Value:", min_value.item())
print("\nNormalized Matrix:\n", normalized_matrix)
print("Is Normalized Correctly? ", is_normalized_correctly)
print("\nTransformed Original Tensor (+10):\n", transformed_tensor)

___

### PyTorch Datasets

**Datasets**, or `torch.utils.data.Dataset`, provides a modular and principled framework to **disentangle data loading and preparation from model training**. With the dataset class, you can use preloaded datasets (well-known text, image and audio datasets) or manage your own data.

#### Loading Datasets

Here, we load the famous MNIST dataset which is available in PyTorch.

In [None]:
import torch
from torch.utils.data import Dataset
from torchvision import datasets
from torchvision.transforms import ToTensor

training_data = datasets.MNIST(
    root="../data",
    train=True,
    download=True,
    transform=ToTensor()
)

test_data = datasets.MNIST(
    root="../data",
    train=False,
    download=True,
    transform=ToTensor()
)

The dataset constructor takes the following parameters:

- `root`: path where the train/test data is stored
- `train`: training or test dataset
- `download`: whether the data should be downloaded if it is not available at `root`
- `transform` and `target_transform` specify the feature and label transformations (details later)

We can **access an element of a dataset by its index** and access the size of the dataset with `len()`.

In [None]:
import matplotlib.pyplot as plt

torch.manual_seed(123)
figure = plt.figure(figsize=(8, 8))
cols, rows = 3, 3
for i in range(1, cols * rows + 1):
    sample_idx = torch.randint(len(training_data), size=(1,)).item()
    img, label = training_data[sample_idx]
    figure.add_subplot(rows, cols, i)
    plt.title(f"Label = {label}")
    plt.axis("off")
    plt.imshow(img.squeeze(), cmap="gray")
plt.show()

## Custom Datasets

You can create your own custom dataset by inheriting from `torch.utils.data.Dataset`. This requires you to implement three methods for your dataset:

- `__init__(self, ...)`: For datasets, we typically initialize the directory containing the data, link/load the annotation files, define the transforms, etc.
- `__len__(self)`: Outputs the length of the dataset. Should be specific to the (sub-)dataset you initialized, e.g., train or test split.
- `__getitem__(self, idx)`: Loads and returns a sample from the dataset at the given index `idx`. For an image dataset with a CSV annotation file, for example, it identifies the image’s location on disk based on the index, converts the image to a tensor using `read_image`, retrieves the corresponding label from the CSV data in `self.img_labels`, calls the transform functions on them (if applicable), and returns the image tensor and the corresponding label as a tuple.

Let's have a look at the example below, where we create a new version of MNIST that randomly rotates the image whenever it is accessed.

In [None]:
from torchvision import transforms
import random

random.seed(0)

class RotatedMNIST(Dataset):
    def __init__(self, train=True):
        """
        Initialize the dataset.
        Args:
            train (bool): If True, load training data; otherwise, load test data.
        
        One could have potentially more parameters such as: path_to_annotations (str), path_to_image_folder (str), transform, target_transform, ... Here we define the transform in the class.
        """
        self.__train = train
        self.mnist = datasets.MNIST(
            root="../data",
            train=self.__train,
            download=True,
            transform=ToTensor()
        )

    def __len__(self):
        return len(self.mnist)

    def __getitem__(self, idx):
        """
        Retrieve an item from the dataset at index `idx`.
        """
        # Utilize __getitem__ function from MNIST
        image, label = self.mnist[idx]
        
        # Select a random rotation angle [0°, 90°, 180°, 270°]
        # and apply transformation
        angle = random.choice([0, 90, 180, 270])
        rotated_image = transforms.functional.rotate(image, angle)
        
        # We can also return more than just datapoint and label
        return rotated_image, label, angle

    @property
    def train(self):
        return self.__train

Next, we create an instance of our custom dataset and plot some examples.

In [None]:
training_data = RotatedMNIST(train=True)

torch.manual_seed(123)
figure = plt.figure(figsize=(8, 8))
cols, rows = 3, 3
for i in range(1, cols * rows + 1):
    sample_idx = torch.randint(len(training_data), size=(1,)).item()
    img, label, angle = training_data[sample_idx]
    figure.add_subplot(rows, cols, i)
    plt.title(f"Label = {label} ({angle}°)")
    plt.axis("off")
    plt.imshow(img.squeeze(), cmap="gray")
plt.show()

## DataLoader

In [None]:
from torch.utils.data import DataLoader
train_loader = DataLoader(
  training_data,
  batch_size=8,     # Typically chosen as a power of 2
  shuffle=True
)

In [None]:
# Dummy iteration through training data
for idx, batch in enumerate(train_loader):
    x, y, _ = batch
    print(f"Batch {idx+1} target values: {y}")

In [None]:
# Define Iterator
it = iter(train_loader)
x, y, a = next(it)
print(f"Feature batch shape: {x.size()}")
print(f"Labels batch shape: {y.size()}")
print(x[0].size())
print(x[0].squeeze().size())

In [None]:
img = x[0].squeeze()
label, angle = y[0], a[0]
print(f"Label: {label} ({angle}°)")
plt.imshow(img, cmap="gray")
plt.show()
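
The batching behavior of a `DataLoader` can also be inspected on a small toy dataset that needs no download. The following sketch uses `TensorDataset` from `torch.utils.data`; the data and the names `toy_data`, `features`, and `targets` are made up for illustration.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset with 10 samples: x and x^2
features = torch.arange(10, dtype=torch.float32).unsqueeze(1)
targets = features ** 2
toy_data = TensorDataset(features, targets)

batch_size = 4
loader = DataLoader(toy_data, batch_size=batch_size, shuffle=False)

# The last batch is smaller if the dataset size is not a multiple of batch_size
batch_sizes = [len(x) for x, y in loader]
print(batch_sizes)

# len(loader) is the number of batches, not the number of samples
num_batches = len(loader)
print(num_batches == math.ceil(len(toy_data) / batch_size))
```

With the default `drop_last=False`, the incomplete final batch is kept, so the number of batches is the dataset size divided by the batch size, rounded up.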

### .item()

If the result of your computation is just a scalar, e.g., after applying `.mean()` to a 1D tensor, it is still a tensor with all its properties: it has a `.device`, etc. If you just want to work with the scalar itself, apply `.item()` to the result, as shown in the example below.

In [None]:
import torch
t = torch.rand(5)
print(t.mean())
print(t.mean().item())
print(type(t.mean().item()))

___

## Exercise: QuadraticDataset

1. Create a custom PyTorch dataset class called `QuadraticDataset`. This dataset generates pairs of inputs (x) and outputs (y) based on a quadratic equation: $y = ax^2 + bx + c$.
At initialization, the dataset should create (x,y) pairs given the inputs:
    
    - $N$ the number of data points
    - $a, b, c$ the coefficients
    - `seed` the random seed with default value equal to $1$
    - `standardize` a boolean flag with default value `True` that indicates if the $x$ and $y$ values should be standardized to mean 0 and standard deviation 1.

    Remember that you need to implement the `__init__()`, `__len__()`, and `__getitem__()` methods. Please make sure that `__getitem__()` returns tensors.

2. Instantiate your dataset with $N=512, a=2, b=-3, c=5$ with `standardize=False`. Iterate through the dataset with batch size equal to 128, output the number of batches, and create a scatter plot based on the last batch.

3. Repeat Step 2 with `standardize=True`.



In [None]:
import torch
from torch.utils.data import Dataset

class QuadraticDataset(Dataset):
    # TODO

In [None]:
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt

# Solution for Step 2
ds1 = QuadraticDataset(N=512, a=2.0, b=-3.0, c=5.0, standardize=False)

# TODO

In [None]:
# Solution for Step 3

# TODO

## Deep Learning

**Deep learning** can be described as **ML with neural networks**.

## Neural Networks

- There exists a large spectrum of neural networks ranging from simple multi-layer perceptrons (MLPs) to transformer models such as the ones that are the basis for ChatGPT.
- Note that we will not discuss deep learning architectures in detail but **learn how to implement and train neural networks in PyTorch**.
- We will briefly sketch two fundamental types of networks: **MLPs, and convolutional neural networks (CNNs)** -- see slides.

## An MLP in PyTorch

The simplest way to create an MLP in PyTorch is via the `torch.nn.Sequential` module: just alternate linear layers and activation functions. **Note that the output dimension of the previous layer needs to match the input dimension of the next layer.**

- `Linear(in_dim, out_dim)` requires the dimensions of input and output, and optionally lets you specify whether a bias term should be included (the default is `True`).
- Activation functions like `Sigmoid`, `ReLU`, etc. have the same input and output dimension.

In [None]:
from torch import nn

in_dim = 1
out_dim = 1
h1 = 8
h2 = 32
activ_fun = nn.Sigmoid()

mlp = nn.Sequential(
    nn.Linear(in_dim, h1, bias=True),
    activ_fun,
    nn.Linear(h1, h2, bias=True),
    activ_fun,
    nn.Linear(h2, out_dim, bias=True)
)
print(mlp)
print(mlp[0])

### Pushing Data Through an MLP

In [None]:
x = torch.rand(32, 1)

# Pass through MLP
y_hat = mlp(x)

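# Network and data need to be on the same device!
# ("mps" is only available on Apple hardware, so we check for it first)
if torch.backends.mps.is_available():
    x = torch.rand(32, 1, device="mps")
    # The following will raise an error, because `mlp` is still on the CPU
    # y_hat = mlp(x)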
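
To avoid a device mismatch, both the network and the data can be moved to the same device with `.to(device)`. The following is a minimal sketch that falls back to the CPU if no accelerator is found; the small network `model` is a stand-in for the MLP defined above.

```python
import torch
from torch import nn

# Pick the first available device, falling back to the CPU
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# Stand-in network; .to(device) moves all its parameters
model = nn.Sequential(
    nn.Linear(1, 8),
    nn.Sigmoid(),
    nn.Linear(8, 1),
).to(device)

# Move the data to the same device before the forward pass
x = torch.rand(32, 1).to(device)
y_hat = model(x)

print(y_hat.shape)
print(y_hat.device)
```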

## Python Debugger

One way to use the Python debugger is to simply import it and **set breakpoints** with `pdb.set_trace()` or `breakpoint()`. The execution of your code will then stop at this line, and you are free to explore.

Some helpful commands that you can use while debugging:

- `h(elp)` prints the list of available commands. The notation means that you can type either `h` or `help`.
- `s(tep)` executes the current line, and stops at the first possible occasion (either in a function that is called or on the next line in the current function).
- `n(ext)` continues execution until the next line in the current function is reached, or it returns.
- `c(ont(inue))` continues the execution, and only stops when a breakpoint is encountered.
- `l(ist)` lists the source code around the current line (11 lines).
- `a(rgs)` prints the arguments of the current function and their current values.
- `q(uit)` terminates the code.

Additional remarks:

- There are many more commands that you can use (check out the official `pdb` documentation!).
- When using the debugger, you can manipulate objects, e.g., `x += 1`. However, it is trickier if your variable name coincides with a command, e.g., you cannot type `q += 1`. Instead, you can type `!q += 1` to indicate that you want to execute code.
- You can also invoke the debugger from the command line, but we will not go into details here.
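
As a small playground for the commands above, consider the following sketch. The function `running_mean` is a hypothetical example; the `breakpoint()` call is commented out so that the code runs without interruption until you enable it.

```python
def running_mean(values):
    """Return the running mean of a sequence of numbers."""
    total = 0.0
    means = []
    for count, value in enumerate(values, start=1):
        total += value
        # Uncomment the next line to pause here in every iteration:
        # breakpoint()
        means.append(total / count)
    return means


result = running_mean([1, 2, 3, 4])
print(result)
```

After enabling the breakpoint, `a` shows the arguments of `running_mean`, `p total` prints the current value of `total`, `n` moves to the next line, and `c` continues until the breakpoint is hit again.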

___

**Miniexercise:** Please go ahead and explore working with the debugger a bit!

___