# 🧪 LAB: Implement an autoencoder with `PyTorch`

In this lab, you will implement and explore autoencoders using the MNIST dataset. The goal is to understand how autoencoders learn compressed representations of data, and how these representations compare with classical methods like PCA.

**Collaboration Note**: This assignment is designed to support collaborative work. We encourage you to divide tasks among group members so that everyone can contribute meaningfully. Many components of the assignment can be approached in parallel or split logically across team members. Good coordination and thoughtful integration of your work will lead to a stronger final result.

---

In total, this lab assignment will be worth **100 points**.

--- 
**Submission notes**:

* Write down all group members' names, or at least the group name (if you have one and you previously provided it), in the first cell of the notebook.

* Verify that the notebook runs as expected and that all required outputs are included.


NAME(s) = ""

## 1. Theoretical Background Reflection (10 points)

Read *Michelucci (2022)*: [An Introduction to Autoencoders](https://arxiv.org/abs/2201.03898) — focus on the following sections:  
- Section 1 (Introduction)
- Sections 2.1, 2.2, and 2.3
- Skim Section 3 on applications  

Then answer briefly:

1. What is the main idea behind an autoencoder?  
2. What is the role of the bottleneck layer?  
3. Discuss, in your own words, the applications of autoencoders mentioned in the paper.  
4. What differences would you expect between an autoencoder’s representation and that of PCA if applied as a dimensionality reduction technique?


USE AS MANY MARKDOWN CELLS AS NEEDED

## 2. Prepare the Data (10 points)

Given the dataset loaded below:

- Normalize **each** image (i.e., separately) so its values are between 0 and 1.
- Visualize a few random digits to confirm everything worked as expected.
- Split the dataset into training, validation, and test sets (70%, 15%, 15%).
- Convert the splits into `PyTorch` `TensorDataset` objects and wrap them in `DataLoader`s with a batch size of 128.
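
The steps above can be sketched on synthetic data as follows. This is only an illustration under stated assumptions: the arrays `X_demo` and `y_demo` are random stand-ins for the MNIST arrays, the helper `make_loader` and the `*_demo` names are hypothetical, and splitting in two stages with `train_test_split` is one possible way to obtain a 70/15/15 split, not the required one.

```python
import numpy as np
import torch
from sklearn.model_selection import train_test_split
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for the MNIST arrays: 1000 fake "images" of 784 pixels in [0, 255].
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 256, size=(1000, 784)).astype(np.float32)
y_demo = rng.integers(0, 10, size=1000)

# Per-image min-max scaling: each row uses its own minimum and maximum.
row_min = X_demo.min(axis=1, keepdims=True)
row_max = X_demo.max(axis=1, keepdims=True)
# The small epsilon guards against division by zero for a constant image.
X_scaled = (X_demo - row_min) / (row_max - row_min + 1e-8)

# 70% train first, then split the remaining 30% in half (15% + 15%).
X_train, X_rest, y_train, y_rest = train_test_split(
    X_scaled, y_demo, test_size=0.30, random_state=42, stratify=y_demo
)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.50, random_state=42, stratify=y_rest
)


def make_loader(features, targets, shuffle):
    """Hypothetical helper: wrap NumPy arrays in a TensorDataset and DataLoader."""
    dataset = TensorDataset(
        torch.from_numpy(features).float(), torch.from_numpy(targets).long()
    )
    return DataLoader(dataset, batch_size=128, shuffle=shuffle)


train_loader_demo = make_loader(X_train, y_train, shuffle=True)
val_loader_demo = make_loader(X_val, y_val, shuffle=False)
test_loader_demo = make_loader(X_test, y_test, shuffle=False)
```

Only the training loader is shuffled here, so that validation and test batches come out in a fixed order.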

In [None]:
from sklearn.datasets import fetch_openml
mnist = fetch_openml("mnist_784", version=1, as_frame=False)
X, y = mnist.data, mnist.target.astype(int)

In [None]:
# USE AS MANY CELLS AS NEEDED

## 3. Autoencoder (50 points)

In this section, you will build and train an **autoencoder** to reconstruct the provided data.

Your model will include three layers in both the encoder and decoder:

- **Encoder:** 784 → 128 → 64 → 2 (the last layer is the **bottleneck**)  
- **Decoder:** 2 → 64 → 128 → 784  

### 3.1 Implementation (30 points)

Follow the skeleton provided below and fill in the missing parts.

In [None]:
import torch  # needed later for torch.no_grad()
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        
        # 🔹 ENCODING PART: 784 -> 128 -> 64 -> 2
        encoder_list = [
            nn.Linear(784, 128),
            nn.BatchNorm1d(128),
            nn.ReLU(),
            nn.Dropout(0.2),
            
            # TODO: Add steps to go from 128 to 64 
            # (Linear → BatchNorm → ReLU → Dropout(0.2))
            
            # TODO: Add steps to go from 64 to 2  
            # (only Linear → ReLU)
        ]
        # TODO: Replace the None value below with the module that stores the layers in 'encoder_list' for manual iteration in the forward pass
        # Hint: we have covered this module in class
        self.encoder_list = None
        
        # 🔹 DECODING PART: 2 -> 64 -> 128 -> 784
        decoder_list = [
            nn.Linear(2, 64),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.Dropout(0.2),
            
            # TODO: Add steps to go from 64 to 128  
            # (Linear → BatchNorm → ReLU → Dropout(0.2))
            
            # TODO: Add steps to go from 128 to 784  
            # (only Linear → Sigmoid)
        ]
        # TODO: Replace the None value below with the module that stores the layers in 'decoder_list' for manual iteration in the forward pass
        # Hint: we have covered this module in class
        self.decoder_list = None
        
    def forward(self, X):
        # TODO: Replace the None value below with the result of passing X through the encoder and then the decoder
        # Hint: You will probably need to iterate through the encoder_list and decoder_list
        output = None
        return output
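
As a reminder of how layers stored in a container can be iterated manually in the forward pass, here is a minimal sketch on a toy network. It assumes that `nn.ModuleList` is the kind of container the hints refer to; the class `TinyMLP` and its layer sizes are hypothetical and deliberately different from the autoencoder above.

```python
import torch
import torch.nn as nn


class TinyMLP(nn.Module):
    """Hypothetical toy network (10 -> 6 -> 3), unrelated to the lab architecture."""

    def __init__(self):
        super().__init__()
        layers = [nn.Linear(10, 6), nn.ReLU(), nn.Linear(6, 3)]
        # nn.ModuleList registers every layer as a submodule, so its weights
        # show up in self.parameters(); a plain Python list would not do that.
        self.layers = nn.ModuleList(layers)

    def forward(self, x):
        # Apply the stored layers one after another.
        for layer in self.layers:
            x = layer(x)
        return x


tiny = TinyMLP()
out = tiny(torch.randn(4, 10))
n_params = sum(p.numel() for p in tiny.parameters())
print(out.shape, n_params)
```

Checking the output shape and the parameter count on a random batch like this is a quick way to confirm that every layer was registered and wired in the intended order.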


### 3.2 Training (15 points)

Follow the skeleton provided below and fill in the missing parts to train your model. As you can see, training includes early stopping through the `EarlyStopping` class, which I have already provided for your convenience :-). After you have filled in the missing parts, uncomment the provided call to the training function (`train_autoencoder`) and execute it.

But, **BEFORE** that, discuss with your group:

- Why do you think the output layer of the autoencoder uses a Sigmoid activation?
- Based on this, decide and justify which cost function is most appropriate for this task.

Once you have discussed and agreed on the reasoning above, proceed to complete the skeleton and train the model.
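
To support the discussion, the sketch below evaluates two candidate loss functions from `torch.nn` on stand-in tensors. It does not settle which one is most appropriate; it only shows that both `nn.MSELoss` and `nn.BCELoss` accept Sigmoid outputs in (0, 1) together with continuous targets in [0, 1]. The tensors `target` and `recon` are random placeholders, not real data or model outputs.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder for a batch of normalized images with pixel values in [0, 1].
target = torch.rand(4, 784)
# Placeholder for reconstructions: a Sigmoid squashes any real value into (0, 1).
recon = torch.sigmoid(torch.randn(4, 784))

# Mean squared error: averages (recon - target)^2 over all pixels.
mse = nn.MSELoss()(recon, target)
# Binary cross-entropy: requires inputs in [0, 1] and also accepts
# non-binary targets in [0, 1].
bce = nn.BCELoss()(recon, target)

print(f"MSE: {mse.item():.4f}  BCE: {bce.item():.4f}")
```

The two values live on different scales, so they are not directly comparable; what matters for the justification is how each loss treats pixel-wise errors.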


In [None]:
# TODO: Replace None to instantiate the autoencoder
model = None  

# TODO: Replace None with the appropriate Loss function from 'torch.nn'
# Hint: we are reconstructing continuous pixel values in [0,1]
criterion = None  

# TODO: Replace None with the Adam optimizer.
# Use a learning rate of 0.001 and L2 regularization via a weight_decay of 1e-5
optimizer = None  

In [None]:
class EarlyStopping:
    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.counter = 0
        self.best_loss = None
        self.early_stop = False
        self.best_model_state = None

    def __call__(self, val_loss, model):
        if self.best_loss is None or val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.counter = 0
            self.best_model_state = {k: v.clone() for k, v in model.state_dict().items()}
        else:
            self.counter += 1
            if self.counter >= self.patience:
                self.early_stop = True

    def restore_best_weights(self, model):
        if self.best_model_state is not None:
            model.load_state_dict(self.best_model_state)


def train_autoencoder(model, train_loader, val_loader, n_epochs=100, patience=5):
    # Note: this function uses the global `criterion` and `optimizer` defined in the cell above
    # Initialize the early stopping object
    early_stopping = EarlyStopping(patience=patience)
    history = {"train": [], "val": []}

    for epoch in range(n_epochs):
        model.train()
        train_cost = 0.0

        for X_batch, _ in train_loader:
            # --- Implement the standard training steps ---
            # TODO: Set gradients equal to zero
            outputs = None  # TODO: Forward pass for this batch
            cost = None     # TODO: Compute the cost for this batch
            # TODO: Backward pass (compute gradients)
            # TODO: Update model parameters
            train_cost += cost.item() * X_batch.size(0)

        train_cost /= len(train_loader.dataset)

        # --- Validation phase ---
        model.eval()
        val_cost = 0.0
        with torch.no_grad():
            for X_batch, _ in val_loader:
                outputs = None  # TODO: Forward pass for validation data
                cost = None     # TODO: Compute the validation cost
                val_cost += cost.item() * X_batch.size(0)

        val_cost /= len(val_loader.dataset)

        # --- Save metrics ---
        history["train"].append(train_cost)
        history["val"].append(val_cost)

        print(f"Epoch [{epoch+1}/{n_epochs}]  Train Loss: {train_cost:.5f}  Val Loss: {val_cost:.5f}")

        # --- Early stopping check ---
        early_stopping(val_cost, model)
        if early_stopping.early_stop:
            print(f"Stopping early at epoch {epoch+1}")
            early_stopping.restore_best_weights(model)
            break

    return model, history


Once the missing parts have been filled in, uncomment the code below and execute it to train your model. Training may take a while to complete, which is totally normal.

In [None]:
# model, history = train_autoencoder(model, train_loader, val_loader, n_epochs=50, patience=5)

In [None]:
# USE AS MANY CELLS AS NEEDED FOR THE REST OF THE CODE

### 3.3 Reconstruct (5 points)

Once you have trained the model: 

1. Visualize several reconstructed digits from the test set.  
2. Compare them with the corresponding original input digits. Was the reconstruction good? 
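
One possible layout for this comparison is a two-row grid, with originals on top and reconstructions below. The sketch uses random placeholder tensors rather than real model outputs; the function name `plot_reconstructions` and the `fake_*` variables are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np
import torch


def plot_reconstructions(originals, reconstructions, n=8):
    """Show n originals (top row) above their reconstructions (bottom row)."""
    fig, axes = plt.subplots(2, n, figsize=(1.5 * n, 3), squeeze=False)
    for i in range(n):
        # Each flattened 784-pixel vector is reshaped back to a 28x28 image.
        axes[0, i].imshow(np.asarray(originals[i]).reshape(28, 28), cmap="gray")
        axes[1, i].imshow(np.asarray(reconstructions[i]).reshape(28, 28), cmap="gray")
        axes[0, i].axis("off")
        axes[1, i].axis("off")
    axes[0, 0].set_title("Original", fontsize=9)
    axes[1, 0].set_title("Reconstruction", fontsize=9)
    return fig


# Placeholder data: random "images" and a noisy copy standing in for model outputs.
fake_inputs = torch.rand(8, 784)
fake_outputs = (fake_inputs + 0.1 * torch.randn(8, 784)).clamp(0, 1)
fig = plot_reconstructions(fake_inputs, fake_outputs)
```

With a trained model, the second argument would instead be the model's output for a test batch, computed after `model.eval()` and inside `torch.no_grad()`.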

In [None]:
# USE AS MANY CELLS AS NEEDED

## 4. Visualize the Learned Representations (15 points)

Here you will explore the latent representations learned by your autoencoder.

1. Complete `extract_latent_features` to extract the latent (bottleneck) features and their corresponding labels, given a trained model.  
   Apply this function to the test set using the provided code. This should give you the transformed test set with the dimensions of the bottleneck (2).
2. Plot the results, coloring each point by its corresponding digit label.
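
A scatter plot colored by label could look like the sketch below. The arrays `demo_latents` and `demo_labels` are random placeholders standing in for the output of `extract_latent_features`, and the function name `plot_latent_space` is hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_latent_space(latents, labels, title="Latent space"):
    """Scatter 2D latent points, using one color per digit label."""
    fig, ax = plt.subplots(figsize=(7, 6))
    points = ax.scatter(
        latents[:, 0], latents[:, 1], c=labels, cmap="tab10", s=5, alpha=0.7
    )
    # The colorbar maps each of the ten colors back to its digit.
    fig.colorbar(points, ax=ax, ticks=list(range(10)), label="Digit")
    ax.set_xlabel("Latent dimension 1")
    ax.set_ylabel("Latent dimension 2")
    ax.set_title(title)
    return fig, ax


# Placeholder data: 500 random 2D points with random digit labels.
rng = np.random.default_rng(0)
demo_latents = rng.normal(size=(500, 2))
demo_labels = rng.integers(0, 10, size=500)
fig, ax = plot_latent_space(demo_latents, demo_labels)
```

Random points show no structure, of course; with real latent features, the interesting question is whether points of the same digit cluster together.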


In [None]:
import numpy as np  # needed for np.concatenate below

def extract_latent_features(model, dataloader):
    model.eval()
    latents = []
    labels = []

    with torch.no_grad():
        for X_batch, y_batch in dataloader:
            
            encoded = None  # TODO: Apply the encoder part of the model to get the latent features
            # .cpu() is a no-op on CPU tensors and keeps this working if the data lives on a GPU
            latents.append(encoded.cpu().numpy())
            labels.append(y_batch.cpu().numpy())

    # Concatenate all batches
    latents = np.concatenate(latents, axis=0)
    labels = np.concatenate(labels, axis=0)
    return latents, labels

Once the missing parts above have been filled in, uncomment the code below and execute it to extract the latent features.

In [None]:
# latents_autoencoder, labels = extract_latent_features(model, test_loader)

In [None]:
# USE AS MANY CELLS AS NEEDED FOR THE REST OF THE CODE

## 5. Compare with PCA (10 points)

Here you will compare the latent representations learned by your autoencoder with those obtained using PCA.

1. Apply PCA to the original input data (without using the autoencoder) to reduce the data to the same dimensionality as your bottleneck (2 dimensions).
2. Visualize the results.
3. Compare the structure of the two plots (Autoencoder vs. PCA) and briefly discuss the differences (if any) you observe.
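
The PCA step can be sketched with scikit-learn as follows. The arrays are random placeholders for the normalized training and test images, and fitting on the training split before transforming the test split is an assumption about the workflow, not a requirement stated above. All `*_demo` names are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA

# Placeholder data standing in for the normalized train and test images.
rng = np.random.default_rng(0)
X_train_demo = rng.random((700, 784))
X_test_demo = rng.random((150, 784))

# Keep two components to match the 2-dimensional bottleneck.
pca = PCA(n_components=2)
pca.fit(X_train_demo)                        # learn the principal axes
latents_pca_demo = pca.transform(X_test_demo)  # project the test images onto them

# Fraction of the training variance captured by the two components.
explained = pca.explained_variance_ratio_.sum()
print(latents_pca_demo.shape, f"{explained:.3f}")
```

The resulting two-column array can be plotted in the same way as the autoencoder's latent features, which makes the two figures directly comparable.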

In [None]:
# USE AS MANY CELLS AS NEEDED

## 6. Collaboration Reflection (5 points)

As a group, briefly reflect on the following (max 1–2 short paragraphs):

- How did the group dynamics work throughout the assignment?
- Were there any major disagreements or diverging approaches?
- How did you resolve conflicts or make final modeling decisions?
- What did you learn from each other during this project?

YOUR TEXT HERE