# Self-Supervised Learning – Denoising Autoencoder (CIFAR-10)

This notebook compares a standard CNN classifier with a classifier fine-tuned from a pre-trained Denoising Autoencoder (DAE).

## Task Requirements:
a) Train a CNN to discriminate between cats and dogs using CIFAR-10 data.
b) Pre-train a denoising autoencoder on images from CIFAR-10 (except for cats and dogs).
c) Fine-tune the pre-trained model to discriminate between cats and dogs.
d) Experiment with the amount of fine-tuning data and the number of training steps. Compare convergence and performance.
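
As a rough sketch of the data split behind tasks (a) to (c), the snippet below separates CIFAR-10 labels into a cat/dog subset and the remaining eight classes. It assumes the standard torchvision label order (cat = 3, dog = 5). The helper name `split_cat_dog` is hypothetical and is not taken from `src.experiments`, whose actual implementation is not shown here.

```python
import numpy as np

# Standard CIFAR-10 label order, as used by torchvision.datasets.CIFAR10
CIFAR10_CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck"]
CAT = CIFAR10_CLASSES.index("cat")
DOG = CIFAR10_CLASSES.index("dog")


def split_cat_dog(labels):
    """Hypothetical helper: split CIFAR-10 labels into cat/dog and other classes.

    Returns (cat_dog_idx, other_idx, binary_labels), where binary_labels maps
    cat -> 0 and dog -> 1 and is aligned with cat_dog_idx.
    """
    labels = np.asarray(labels)
    is_cat_dog = np.isin(labels, [CAT, DOG])
    cat_dog_idx = np.flatnonzero(is_cat_dog)   # used for tasks (a) and (c)
    other_idx = np.flatnonzero(~is_cat_dog)    # used for DAE pre-training, task (b)
    binary_labels = (labels[cat_dog_idx] == DOG).astype(np.int64)
    return cat_dog_idx, other_idx, binary_labels
```

The two index arrays could then be passed to `torch.utils.data.Subset` to build the classification and pre-training datasets.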

In [None]:
%load_ext autoreload
%autoreload 2

import os
import torch
import matplotlib.pyplot as plt
from src.experiments import run_baseline, run_pretraining, run_finetuning, plot_results

# Ensure results directory exists
os.makedirs('results', exist_ok=True)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

## Part (a): Baseline CNN
Train a standard CNN to discriminate between cats and dogs using CIFAR-10 data.
We run this baseline on several fractions of the training data (10%, 25%, 50%, 100%) so that each result can be compared with the DAE approach at the same data budget.

In [None]:
# Run Baseline (using varying data fractions)
# Reduce epochs for a quicker sanity check
baseline_results = run_baseline(epochs=10, fractions=[0.1, 0.25, 0.5, 1.0])
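
How `run_baseline` selects each data fraction is not shown in this notebook. One plausible approach, sketched here under that assumption, is a seeded, stratified subsample that keeps the same fraction of every class so the cat/dog ratio stays unchanged. The function name `subsample_indices` and its `seed` parameter are hypothetical.

```python
import numpy as np


def subsample_indices(labels, fraction, seed=0):
    """Hypothetical helper: keep `fraction` of the samples of each class.

    Sampling is stratified (per class) and seeded, so the same fraction
    always yields the same subset.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    chosen = []
    for cls in np.unique(labels):
        cls_idx = np.flatnonzero(labels == cls)
        n_keep = max(1, int(round(fraction * len(cls_idx))))
        chosen.append(rng.choice(cls_idx, size=n_keep, replace=False))
    return np.sort(np.concatenate(chosen))
```

Using one fixed seed for both the baseline and the fine-tuning runs would mean the two models see identical subsets at each fraction, which makes the comparison fairer.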

## Part (b): DAE Pre-training
Pre-train a denoising autoencoder on images from CIFAR-10 (except for cats and dogs).

In [None]:
# Pre-train DAE on non-cat/dog classes
# This saves the encoder weights to 'results/pretrained_encoder.pth'
dae_model, dae_history = run_pretraining(epochs=20)
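
For orientation, here is a minimal sketch of what a single denoising-autoencoder training step can look like. It is not the code behind `run_pretraining`: the `TinyDAE` architecture, the Gaussian noise model, the noise level of 0.1, and the MSE loss are all assumptions chosen for illustration, and images are assumed to be scaled to [0, 1].

```python
import torch
import torch.nn as nn


class TinyDAE(nn.Module):
    """Small convolutional autoencoder for 3x32x32 images (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),   # 32x32 -> 16x16
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),  # 16x16 -> 8x8
            nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1),  # 8x8 -> 16x16
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1),   # 16x16 -> 32x32
            nn.Sigmoid(),  # outputs in [0, 1], matching the assumed pixel range
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


def add_gaussian_noise(images, std=0.1):
    """Corrupt images with additive Gaussian noise and clamp back to [0, 1]."""
    return (images + std * torch.randn_like(images)).clamp(0.0, 1.0)


def dae_train_step(model, optimizer, clean_images, noise_std=0.1):
    """One optimization step: reconstruct the clean image from its noisy copy."""
    model.train()
    noisy = add_gaussian_noise(clean_images, noise_std)
    loss = nn.functional.mse_loss(model(noisy), clean_images)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The key point is that the loss compares the reconstruction with the clean image, not the noisy input, so the encoder has to learn features that capture image structure rather than copy pixels.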

## Parts (c) & (d): Fine-tuning & Experiments
Fine-tune the pre-trained model to discriminate between cats and dogs.
**Learning rate**: We fine-tune with a low learning rate (1e-4) to reduce the risk of catastrophic forgetting of the pre-trained features.

In [None]:
pretrained_path = os.path.join('results', 'pretrained_encoder.pth')

# Run fine-tuning experiments with different data fractions
# This returns a dictionary of results for each fraction
finetune_results = run_finetuning(
    pretrained_encoder_path=pretrained_path, 
    fractions=[0.1, 0.25, 0.5, 1.0],
    epochs=10,
    lr=0.0001
)
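
The sketch below illustrates the general fine-tuning pattern: restore the pre-trained encoder weights, attach a fresh two-class head, and optimize everything with the low learning rate. The names `make_encoder`, `CatDogClassifier`, and `build_finetune_model` are hypothetical, as are the encoder architecture and the choice of Adam. The real encoder must match whatever `run_pretraining` saved.

```python
import torch
import torch.nn as nn


def make_encoder():
    """Hypothetical encoder; it must mirror the architecture used in pre-training."""
    return nn.Sequential(
        nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    )


class CatDogClassifier(nn.Module):
    """Pre-trained encoder followed by a randomly initialized 2-class head."""

    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),  # (N, 64, H, W) -> (N, 64, 1, 1)
            nn.Flatten(),             # -> (N, 64)
            nn.Linear(64, 2),         # logits for cat / dog
        )

    def forward(self, x):
        return self.head(self.encoder(x))


def build_finetune_model(encoder_state, lr=1e-4):
    """Load encoder weights and return (model, optimizer) ready for fine-tuning."""
    encoder = make_encoder()
    encoder.load_state_dict(encoder_state)
    model = CatDogClassifier(encoder)
    # Assumption: a single low learning rate for encoder and head alike
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    return model, optimizer
```

With a saved checkpoint, the state dict could be obtained via `torch.load(pretrained_path, map_location="cpu")`, assuming the file holds a plain `state_dict` for the encoder.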

## Analysis & Plotting
Compare training time, convergence, and final performance. Does the pre-trained model need less data?

In [None]:
plot_results()
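
One simple way to quantify convergence, beyond reading it off the plots, is to count how many epochs each model needs to reach a target validation accuracy. The helper below is a sketch: the name `epochs_to_reach` is hypothetical, and it assumes each run exposes a per-epoch list of validation accuracies, which may not match the actual structure of `baseline_results` and `finetune_results`.

```python
def epochs_to_reach(val_accuracies, threshold):
    """Return the 1-based epoch at which accuracy first reaches `threshold`.

    Returns None if the threshold is never reached.
    """
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc >= threshold:
            return epoch
    return None
```

Calling this on the baseline and fine-tuned histories at the same data fraction and the same threshold gives a directly comparable convergence number for each.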