<a href="https://colab.research.google.com/github/eborin/SSL-course/blob/main/09_minerva_SimCLR-STL10-backbone_pretrain.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

[View Source Code](https://github.com/eborin/SSL-course/blob/main/09_minerva_SimCLR-STL10-backbone_pretrain.ipynb)

# Pretraining backbones with Minerva SimCLR

This notebook provides a demonstration of how to pretrain feature extraction backbones using the Minerva SimCLR model. 
In particular, it walks through the process of training a ResNet-18 backbone on the "unlabeled" split of the STL10 dataset.

## 1. Introduction

### 1.1 Objective

The main objective of this tutorial is to present how to employ Minerva SimCLR to pretrain a given backbone.

### 1.2 Before Running this Notebook

The training operation performed in this notebook is likely to take a considerable amount of time when executed on typical laptop or desktop hardware.
As of April 2025, it also remains time-consuming even when running on Google Colab.
If you have access to a system equipped with a powerful GPU, it is recommended that you run this notebook on that system to significantly reduce the training time.

#### 1.2.1 Running this notebook from a terminal as a Python script

You can convert this notebook into a Python script by executing the following command in your terminal:

```bash
jupyter nbconvert --to script 09_minerva_SimCLR-STL10-backbone_pretrain.ipynb
```

This will generate a Python script named `09_minerva_SimCLR-STL10-backbone_pretrain.py`, which you can then run directly from your terminal using:

```bash
python 09_minerva_SimCLR-STL10-backbone_pretrain.py
```

> **Note**: Before converting the notebook, you may want to adjust the main configuration variables found in the "Basic Setup" section to ensure they are appropriately set for your environment.
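
One way to adjust those variables without editing the generated script is to read overrides from environment variables. The sketch below is an assumption and not part of the original notebook: the helper `env_int` and the environment variable names are hypothetical, and the defaults simply mirror the values used in the "Basic Setup" section.

```python
import os

def env_int(name, default, environ=os.environ):
    """Return the integer stored in environment variable `name`, or `default` if it is unset."""
    raw = environ.get(name)
    return default if raw is None else int(raw)

# Hypothetical variable names; the defaults match the "Basic Setup" cell.
n_epochs = env_int("N_EPOCHS", 500)
DL_BATCH_SIZE = env_int("DL_BATCH_SIZE", 256)
DL_NUM_WORKERS = env_int("DL_NUM_WORKERS", 16)
```

With this in place, a shorter run could be launched as `N_EPOCHS=100 python 09_minerva_SimCLR-STL10-backbone_pretrain.py`.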

### 1.3 SimCLR

SimCLR (Simple Framework for Contrastive Learning of Visual Representations) is a self-supervised learning framework introduced by Google Research. 
It learns meaningful visual representations without the need for labeled data by maximizing agreement between different augmented views of the same image.

The method was presented in a paper published at the 37th International Conference on Machine Learning (ICML) in 2020. 
A [preprint](https://arxiv.org/pdf/2002.05709) of the paper is also available on the arXiv repository, as provided by the authors.
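
To make "maximizing agreement" concrete, here is a minimal sketch of the NT-Xent (normalized temperature-scaled cross-entropy) objective described in the SimCLR paper. It is an illustration written for this tutorial, not Minerva's implementation; the function name `nt_xent_loss` is hypothetical and the default temperature matches the value used later in this notebook.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.07):
    """NT-Xent loss for two batches of projections, each of shape (N, D).

    Row i of z1 and row i of z2 are assumed to come from the same image.
    """
    n = z1.shape[0]
    # Stack both views and normalize so dot products are cosine similarities.
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)          # (2N, D)
    sim = z @ z.T / temperature                                  # (2N, 2N)
    # A sample must not be treated as its own positive or negative.
    self_mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))
    # The positive for row i is row i + N, and vice versa.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```

The loss is small when each view is closest to its partner and large when some other sample in the batch is closer.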

### 1.4 What we're going to cover

In this tutorial, we’ll demonstrate how to use the SimCLR model from the [Minerva framework](https://github.com/discovery-unicamp/Minerva) to train a backbone.
Specifically, we will train a ResNet18-based backbone using the "unlabeled" split of the STL10 dataset.

| **Topic** | **Contents** |
| ----- | ----- |
| [**2. Basic Setup**](#sec_2) | Import useful modules (torch, torchvision, and lightning). |
| [**3. Setting up the Dataset**](#sec_3) | Set up the data transforms, the dataset and the data module for the training process. |
| [**4. Create the Model for the Pretext Task**](#sec_4) | Create the backbone, the projection head, and the model for the pretext task. |
| [**5. Training the Model**](#sec_5) | Create a trainer object and train the model. |
| [**6. Monitoring the Training Process with a Downstream Benchmark**](#sec_6) | Create a benchmark and attach it to the trainer object to track the performance of the backbone on the downstream task. |
| [**7. Exercises**](#sec_6) | Suggested Exercises. |

### 1.5 Where can you get help?

In addition to discussing with your colleagues or the course professor, you might also consider:

* Minerva: check the [Minerva docs](https://discovery-unicamp.github.io/Minerva/).

* Lightning: check the [Lightning documentation](https://lightning.ai/docs/overview/getting-started) and search for, or post, Lightning-related questions on the [PyTorch Lightning forum](https://lightning.ai/forums/).

* PyTorch: check the [PyTorch documentation](https://pytorch.org/docs/stable/index.html) and search for, or post, PyTorch-related questions on the [PyTorch developer forums](https://discuss.pytorch.org/).

## <a id="sec_2">2. Basic Setup</a>

### 2.1 Setup main variables

Several variables influence the execution of this notebook, particularly in terms of memory usage and training time. These include:

* **`n_epochs`**: Specifies the maximum number of training epochs. 
    Increasing this value generally improves backbone performance but also leads to longer training times. 
    Reducing the number of epochs can speed up training, but doing so excessively may compromise the quality of the learned representations. 
    Based on my experiments, training for at least 90 epochs typically yields backbones that perform noticeably better than randomly initialized backbones (you will be able to evaluate this in the next tutorials).

* **`checkpoint_every_n_epochs`**: Specifies how often, in terms of training epochs, a model checkpoint is saved. 
    For example, if set to 10, the model's state will be saved every 10 epochs during training.
    These checkpoints can be used to recover from interruptions, and they also allow you to evaluate the backbone's performance at various stages to monitor how the learned representations evolve over time (the upcoming tutorials include code for evaluating backbones across multiple checkpoints).

* **`DL_BATCH_SIZE`**: Determines the batch size used during training. 
    As highlighted in the SimCLR paper, this hyperparameter significantly impacts performance. 
    Larger batch sizes tend to produce better results; however, very large batches may exceed your GPU's memory capacity. 
    Adjust accordingly based on available resources.

* **`DL_NUM_WORKERS`**: Sets the number of worker processes used by the DataLoader for parallel data loading and preprocessing. 
    Increasing this value can help improve data throughput and reduce training bottlenecks, especially on multi-core systems.

* **`monitor_backbone_performance_with_downstream_benchmark`**: When set to `True`, a downstream task benchmark is used to monitor the backbone’s performance during training.

You can customize these parameters in the following code cell.

In [1]:
# Total number of epochs for training the model using the SimCLR pretext task.
n_epochs = 500

# Number of epochs between model checkpoints
checkpoint_every_n_epochs = 20

# Dataloaders/Datamodule parameters
DL_BATCH_SIZE=256
DL_NUM_WORKERS=16

# Monitor the backbone performance using a downstream benchmark
monitor_backbone_performance_with_downstream_benchmark = True
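
To see why the batch size matters for contrastive learning, the sketch below counts how many negative examples each anchor is compared against in one SimCLR batch. The helper `contrastive_pairs` is hypothetical; it only encodes the fact that a batch of N images yields 2N views, of which one is the anchor's positive.

```python
def contrastive_pairs(batch_size):
    """Return (views, positives per anchor, negatives per anchor) for one SimCLR batch."""
    n_views = 2 * batch_size       # two augmented views per image
    n_negatives = n_views - 2      # all views except the anchor itself and its positive
    return n_views, 1, n_negatives

for bs in (64, 256, 1024):
    views, pos, neg = contrastive_pairs(bs)
    print(f"batch_size={bs}: {views} views, {pos} positive, {neg} negatives per anchor")
```

Larger batches give each anchor more negatives to contrast against, which is one reason the SimCLR paper reports better results with large batches.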

### 2.2 Installing Lightning and Minerva modules

The code below attempts to import the Minerva module and will automatically install it if it is not already available.
> **Note**: Since Minerva depends on PyTorch Lightning, Lightning will also be installed automatically if it is not already present.

In [2]:
try:
    import minerva
except ImportError:
    try:
        # Try to install it and import again
        print("[INFO]: Could not import the minerva module. Trying to install it!")
        !pip install -q minerva-ml
        import minerva
        print("[INFO]: It looks like minerva was successfully imported!")
    except ImportError as err:
        raise ImportError("[ERROR] Couldn't find the minerva module ... \n" +
                          "Please, install it before running the notebook.\n"+
                          "You might want to install the modules listed at requirements.txt\n" +
                          "To do so, run: \"pip install -r requirements.txt\"") from err

### 2.3 Importing basic modules

Let's import the basic modules, such as lightning, torch, minerva, and other utility modules.

In [3]:
# Import PyTorch
import torch

# Import torchvision
import torchvision

# Import lightning
import lightning

# Import minerva
import minerva

# Check versions
# Note: your PyTorch version shouldn't be lower than 1.10.0 and torchvision version shouldn't be lower than 0.11
print(f"PyTorch version: {torch.__version__}")
print(f"torchvision version: {torchvision.__version__}")
print(f"Lightning version: {lightning.__version__}")
#print(f"Minerva version: {M.__version__}") ## TODO

# Import matplotlib for visualization
import matplotlib.pyplot as plt

PyTorch version: 2.6.0+cu124
torchvision version: 0.21.0+cu124
Lightning version: 2.5.1


## <a id="sec_3">3. Setting up the Dataset</a>

We will use the unlabeled split of the STL10 dataset to pretrain our backbone. 
To enable contrastive learning, we will apply a series of data transformations to generate randomly augmented views of each image.

For a detailed discussion of the data augmentation strategies used in the next code block, please refer to the tutorial:
`08_minerva_data_transforms.ipynb`.

In [4]:
# Torchvision transforms
from torchvision.transforms.v2 import Compose, ToImage, ToDtype, RandomHorizontalFlip, RandomResizedCrop, RandomApply, ColorJitter, RandomGrayscale, GaussianBlur, Normalize
# Minerva Contrastive transform
from minerva.transforms.transform import ContrastiveTransform

# STL10 statistics for the unlabeled split. 
# - Note: If you would like to compute these statistics for your own dataset, refer 
#         to the discussion in tutorial 05_pytorch_transfer_learning.ipynb.
stl10_unlabeled_mean  = torch.tensor([0.4406, 0.4273, 0.3858])
stl10_unlabeled_std = torch.tensor([0.2687, 0.2613, 0.2685])

transform_pipeline = Compose([
    ToImage(), 
    ToDtype(torch.float32, scale=True),
    RandomHorizontalFlip(),
    RandomResizedCrop(size=96),
    RandomApply([ColorJitter(brightness=0.5,contrast=0.5,saturation=0.5,hue=0.1)], p=0.8),
    RandomGrayscale(p=0.2),
    GaussianBlur(kernel_size=9),
    Normalize(mean=stl10_unlabeled_mean, std=stl10_unlabeled_std)
])

contrastive_transform = ContrastiveTransform(transform_pipeline)

contrastive_dataset = torchvision.datasets.STL10(root="data", split="unlabeled",  download=True,
                                                 transform=contrastive_transform)
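
The idea behind a contrastive transform can be illustrated without any image library: the same stochastic pipeline is applied twice to one input, producing two independently augmented views. The sketch below is only an illustration; `TwoViews` and `jitter` are hypothetical stand-ins, and Minerva's `ContrastiveTransform` may differ in its details and return type.

```python
import random

class TwoViews:
    """Apply one stochastic transform twice and return both results."""

    def __init__(self, transform):
        self.transform = transform

    def __call__(self, x):
        # Each call draws fresh random parameters, so the two views differ.
        return self.transform(x), self.transform(x)

def jitter(values, scale=0.1):
    """Toy augmentation: add independent uniform noise to every value."""
    return [v + random.uniform(-scale, scale) for v in values]

random.seed(0)
original = [0.2, 0.5, 0.8]
view_a, view_b = TwoViews(jitter)(original)
print(view_a)
print(view_b)
```

Both views stay close to the original input, yet they are not identical; the pretext task asks the model to recognize them as the same underlying sample.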

To monitor learning performance, we will split the dataset into 80% training and 20% validation subsets.
Additionally, we will use a `MinervaDataModule` to streamline data handling and simplify the training workflow.

In [5]:
from torch.utils.data import random_split
from minerva.data.data_modules.base import MinervaDataModule

torch.manual_seed(42)
train_size = int(0.8 * len(contrastive_dataset))
val_size = len(contrastive_dataset) - train_size
train_set, val_set = random_split(contrastive_dataset, [train_size, val_size])

SimCLR_datamodule = MinervaDataModule(name="Contrastive STL10",
                                      train_dataset=train_set, 
                                      val_dataset=val_set, 
                                      test_dataset=None,
                                      batch_size=DL_BATCH_SIZE, 
                                      num_workers=DL_NUM_WORKERS)
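
As a quick sanity check, the split sizes and the number of batches per epoch can be worked out by hand. The sketch below uses the 100,000 images of the STL10 "unlabeled" split and assumes that the data loaders keep the last, partially filled batch; that behavior is an assumption about the data module, not something verified here.

```python
import math

n_unlabeled = 100_000   # images in the STL10 "unlabeled" split
batch_size = 256        # DL_BATCH_SIZE from the setup cell

train_size = int(0.8 * n_unlabeled)
val_size = n_unlabeled - train_size

# Assumes the last partial batch is kept (no drop_last).
train_steps = math.ceil(train_size / batch_size)
val_steps = math.ceil(val_size / batch_size)

print(f"train: {train_size} samples, {train_steps} batches per epoch")
print(f"val:   {val_size} samples, {val_steps} batches per epoch")
```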

## <a id="sec_4">4. Create the Model for the Pretext Task</a>

### 4.1 Backbone and Projection Head Generation

We will use a modified version of the ResNet18 model as the backbone. 
Specifically, we replace its final fully connected (fc) layer with an identity layer—`torch.nn.Identity()`—which effectively removes any operation at that stage, allowing us to extract raw feature representations.

The `generate_backbone()` function handles this process: it instantiates a ResNet18 model, replaces its fully connected layer with an identity layer, and returns the modified model.

In the following code block, we instantiate the backbone and display its architecture using the `summary()` function from the `torchinfo` package.

In [6]:
from torchinfo import summary
from torchvision.models import resnet18

# Function to generate a ResNet18 based backbone.
def generate_backbone(weights=None):
    backbone = resnet18(weights=weights)
    backbone.fc = torch.nn.Identity()
    return backbone

# Generate the backbone and check its structure
backbone = generate_backbone()

summary(backbone,
        input_size=(32, 3, 96, 96), # input data shape (N x C x H x W)
        col_names=["input_size", "output_size", "num_params", "trainable"],
        col_width=20,
        row_settings=["var_names"]
)

Layer (type (var_name))                  Input Shape          Output Shape         Param #              Trainable
ResNet (ResNet)                          [32, 3, 96, 96]      [32, 512]            --                   True
├─Conv2d (conv1)                         [32, 3, 96, 96]      [32, 64, 48, 48]     9,408                True
├─BatchNorm2d (bn1)                      [32, 64, 48, 48]     [32, 64, 48, 48]     128                  True
├─ReLU (relu)                            [32, 64, 48, 48]     [32, 64, 48, 48]     --                   --
├─MaxPool2d (maxpool)                    [32, 64, 48, 48]     [32, 64, 24, 24]     --                   --
├─Sequential (layer1)                    [32, 64, 24, 24]     [32, 64, 24, 24]     --                   True
│    └─BasicBlock (0)                    [32, 64, 24, 24]     [32, 64, 24, 24]     --                   True
│    │    └─Conv2d (conv1)               [32, 64, 24, 24]     [32, 64, 24, 24]     36,864               True
│    │    └─BatchN

Note that the output of the backbone is a 512-dimensional feature vector, as indicated by the Identity (fc) layer in the model summary.

Next, we will define the projection head—the component that will be attached to the backbone to form the complete pretext model used during contrastive learning.

For this purpose, we’ll use a simple multi-layer perceptron (MLP), defined as follows:

In [7]:
import torch

# Function to generate a projection head
def generate_proj_head(backbone_out_dim=512, output_dim=128):
    return torch.nn.Sequential(
        torch.nn.Linear(backbone_out_dim, 4*output_dim), # ResNet output => hidden layer (4*output_dim)
        torch.nn.ReLU(inplace=True),
        torch.nn.Linear(4*output_dim, output_dim))
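
The size of this projection head can be checked with simple arithmetic. The sketch below counts the parameters of the two linear layers using the default dimensions of `generate_proj_head()`; the helper `linear_params` is hypothetical.

```python
def linear_params(in_features, out_features):
    """Parameters of a fully connected layer: weight matrix plus bias vector."""
    return in_features * out_features + out_features

backbone_out_dim, output_dim = 512, 128
hidden_dim = 4 * output_dim

total = (linear_params(backbone_out_dim, hidden_dim)
         + linear_params(hidden_dim, output_dim))
print(f"projection head parameters: {total:,}")
```

The result, 328,320, is consistent with the `328 K` that Lightning reports for the `projector` module in the training log of Section 6.2.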


### 4.2 Adjusting the SimCLR optimizer

By default, the Minerva SimCLR class uses the LARS optimizer, following the setup proposed in the original SimCLR paper. 
However, for our experiments, we will instead use PyTorch's standard `SGD` optimizer in combination with the `CosineAnnealingLR` learning rate scheduler.

To apply this configuration, we need to modify the `configure_optimizers()` method of the pretext model so that it returns our chosen optimizer and scheduler.

While one clean approach would be to extend the Minerva SimCLR class, we will opt for a quicker, more pragmatic solution: we will monkey-patch the `configure_optimizers` method directly on the model instance. This is handled by the `adjust_configure_optimizers()` function.

In [8]:
def adjust_configure_optimizers(model, 
                                lr, # SGD Optimizer learning rate parameter
                                momentum, # SGD Optimizer momentum parameter
                                weight_decay, # SGD Optimizer weight_decay parameter
                                lr_scheduler_max_epochs): # CosineAnnealingLR parameter
    # Redefine the optimizers
    def configure_optimizers(self):
        optim = torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum, weight_decay=weight_decay)
        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optim, lr_scheduler_max_epochs)
        return [optim], [scheduler]
    model.configure_optimizers = configure_optimizers.__get__(model)
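
The `__get__` call is what turns a plain function into a method bound to one specific instance. The toy example below, with hypothetical class and function names, shows the mechanism in isolation: only the patched instance changes, while other instances of the same class keep the original method.

```python
class Model:
    def configure_optimizers(self):
        return "default optimizer"

def custom_configure_optimizers(self):
    # `self` is the instance the function was bound to.
    return f"custom optimizer for {type(self).__name__}"

patched, untouched = Model(), Model()

# function.__get__(obj) returns a bound method whose `self` is `obj`.
patched.configure_optimizers = custom_configure_optimizers.__get__(patched)

print(patched.configure_optimizers())
print(untouched.configure_optimizers())
```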

### 4.3 Building and Configuring the Model for the Pretext Task

Now we build and configure the model for the pretext task.

In [9]:
from minerva.models.ssl.simclr import SimCLR

# Create the backbone and the projection head
backbone  = generate_backbone()
proj_head = generate_proj_head()

# Create the model for the pretext task
simclr_model = SimCLR(backbone=backbone, projection_head=proj_head, temperature=0.07)

# Set the optimizer parameters
lr=1e-2
momentum=0.9
weight_decay=5e-4
lr_scheduler_max_epochs=n_epochs*1000
opt_cfg_string=f"op_lr_{lr}_m_{momentum}_wd_{weight_decay}_lrep_{lr_scheduler_max_epochs}"

# Adjusting the model optimizers
adjust_configure_optimizers(simclr_model, lr, momentum, weight_decay, lr_scheduler_max_epochs)
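
To get a feel for how the `lr_scheduler_max_epochs` value shapes the schedule, the sketch below evaluates the closed-form cosine annealing formula for two choices of `T_max`. It assumes the scheduler is stepped once per epoch, which is an assumption about how the trainer drives it; the function `cosine_annealing_lr` is a hypothetical helper, not a PyTorch API.

```python
import math

def cosine_annealing_lr(epoch, base_lr, t_max, eta_min=0.0):
    """Closed-form cosine annealing: base_lr at epoch 0, eta_min at epoch t_max."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))

n_epochs, base_lr = 500, 1e-2
for t_max in (n_epochs, n_epochs * 1000):
    final_lr = cosine_annealing_lr(n_epochs, base_lr, t_max)
    print(f"T_max={t_max}: lr after {n_epochs} epochs = {final_lr:.10f}")
```

Under that assumption, `T_max = n_epochs` would decay the learning rate to zero by the end of training, whereas `T_max = n_epochs * 1000` keeps it essentially at its initial value for the whole run. The `LearningRateMonitor` callback used in the next section lets you confirm which behavior you actually get.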

## <a id="sec_5">5. Training the Model</a>

As before, we’ll use a PyTorch Lightning `Trainer` object to handle the training process.

This time, however, we will enhance the trainer by adding several callbacks:

* **`ModelCheckpoint`**: to save model weights at regular intervals and also store the best-performing weights (based on the lowest validation loss).
  - We will also set `save_weights_only=True` to save only the model parameters (i.e., the backbone and projection head), excluding additional training states such as optimizer values and scheduler status.

* **`LearningRateMonitor`**: to track and log the learning rate throughout training.

Additionally, we will configure a `TensorBoardLogger` to log training metrics and store model checkpoints under the `logs/09_minerva_SimCLR_STL10/Pretext/<opt_cfg_string>/SimCLR-Resnet18` directory, where `<opt_cfg_string>` is the value of the `opt_cfg_string` variable, which encodes the optimizer settings.

The following code sets up the trainer along with these callbacks and logging configuration.

In [10]:
from lightning import Trainer
from lightning.pytorch.callbacks import ModelCheckpoint, LearningRateMonitor
from lightning.pytorch.loggers import TensorBoardLogger

log_ckpt_dir=f"logs/09_minerva_SimCLR_STL10/Pretext/{opt_cfg_string}"
trainer = Trainer(max_epochs=n_epochs,
                  log_every_n_steps=16,
                  benchmark=True,
                  callbacks=[ModelCheckpoint(save_weights_only=True, mode='min', monitor='val_loss', save_last="link"), 
                             ModelCheckpoint(save_weights_only=True, every_n_epochs=checkpoint_every_n_epochs, save_top_k=-1), 
                             LearningRateMonitor('epoch')],
                  logger = TensorBoardLogger(save_dir=log_ckpt_dir, name=f"SimCLR-Resnet18"))

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Now, let's call the `trainer.fit()` method to begin training the backbone.

In [11]:
# If you plan to monitor the backbone's performance with a downstream benchmark,
# do not run the trainer yet — wait for the next section.
if not monitor_backbone_performance_with_downstream_benchmark:
    trainer.fit(simclr_model, SimCLR_datamodule)

Once the training process finishes, you will find all the model weights in the `${log_ckpt_dir}/SimCLR-Resnet18/version_N/checkpoints/` directory, where `log_ckpt_dir` corresponds to the directory set in the previous code blocks and `N` is the run number assigned by the logger (`0` for the first run).

> **Note**: If `monitor_backbone_performance_with_downstream_benchmark` is set to `True`, the training process in the previous block will be skipped. 
    Instead, training will occur in the following section, where the trainer is configured with a callback to evaluate the backbone's performance on a designated downstream task.
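
Because the exact subdirectory depends on the logger name and run version, a recursive search is a simple way to locate the saved checkpoints. The helper below is a hypothetical convenience function, not part of Minerva or Lightning.

```python
from pathlib import Path

def list_checkpoints(log_dir):
    """Return every .ckpt file found anywhere below log_dir, sorted by path."""
    return sorted(Path(log_dir).rglob("*.ckpt"))
```

For example, `for ckpt in list_checkpoints(log_ckpt_dir): print(ckpt)` prints every checkpoint written under the logging directory.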

## <a id="sec_6">6. Monitoring the Training Process with a Downstream Benchmark (Extra)</a>

In this section, we will create a benchmark and attach it to the trainer object to track the backbone's performance on the downstream task.
The training process will be similar to that described in the previous section; however, in addition to monitoring the training and validation losses, we will also monitor the performance of a KNN model trained and evaluated using the features produced by the backbone at the end of each epoch.

The goal is to assess whether the backbone's evolution is improving the feature representations for the downstream task, enabling a simple machine learning model to achieve better performance when trained on these features.

We will first create the benchmark and then configure the trainer to run it at the end of each training epoch.

### 6.1. Setting up a KNN benchmark

Our KNN benchmark will be implemented as a PyTorch Lightning Callback that is invoked by the trainer at the end of each training epoch.
To achieve this, we will define a class that extends the `lightning.pytorch.callbacks.Callback` class and overrides the `on_train_epoch_end()` method.
For example:

```python
from lightning.pytorch.callbacks import Callback

class KNN_Benchmark(Callback):
    def __init__(self, *args, **kwargs):
        super().__init__()
        # Initialization code (e.g., store the benchmark datasets)

    def on_train_epoch_end(self, trainer, model):
        # Benchmark evaluation code
        pass
```

After defining the class, we create a benchmark object and attach it to the trainer in the same way we attach other callbacks, such as `ModelCheckpoint` and `LearningRateMonitor`.

With this setup, the `on_train_epoch_end(self, trainer, model)` method will be automatically invoked by the trainer at the end of every epoch, allowing us to extract features from the benchmark dataset using the model's backbone and evaluate a KNN model on these features.

To simplify the process, we will use scikit-learn's `KNeighborsClassifier` class to implement the KNN model.
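
To make explicit what a KNN classifier computes, here is a dependency-free sketch on toy 2-D "features". It is only an illustration of the idea: the function `knn_predict` and the toy data are hypothetical, and the benchmark below relies on scikit-learn rather than on this code.

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    by_distance = sorted(zip(train_X, train_y),
                         key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Two well-separated clusters standing in for backbone features.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = [0, 0, 0, 1, 1, 1]
test_X, test_y = [(0.05, 0.1), (1.0, 0.95)], [0, 1]

preds = [knn_predict(train_X, train_y, q) for q in test_X]
accuracy = sum(p == t for p, t in zip(preds, test_y)) / len(test_y)
print(f"KNN accuracy: {accuracy:.2f}")
```

The better the backbone separates the classes in feature space, the higher this accuracy becomes, which is why it serves as a cheap proxy for representation quality.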

#### 6.1.1 Implementing the `KNN_Benchmark` class


In [12]:
import torch
from torch import Tensor
from torch.nn import functional as F
from lightning.pytorch.callbacks import Callback

from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

class KNN_Benchmark(Callback):
    """
    KNN Benchmark that can be attached to the Lightning trainer object to track the backbone's performance on the downstream task.
    It expects the PyTorch Lightning model to have a backbone attribute.
    """

    def __init__(self, train_dataset, test_dataset, K: int = 10) -> None:
        """        
        Args:
            train_dataset: downstream training dataset -- N samples with D features - D must be compatible with the backbone.
            test_dataset: downstream test dataset -- M samples with D features - D must be compatible with the backbone.
            K: hyperparameter for KNN

        The callback will be invoked at the end of every epoch to compute the accuracy using KNN on the features extracted by the backbone of the lightning model.
        """
        super().__init__()

        # Store the features and labels from the downstream train and test datasets
        self.train_X, self.train_y = self.dataset_to_tensors(train_dataset)
        self.test_X, self.test_y = self.dataset_to_tensors(test_dataset)

        # Set the KNN hyperparameter
        self.K = K

        # Create the KNN classifier
        self.skl_KNN = KNeighborsClassifier(n_neighbors=self.K)    

    def on_train_epoch_end(self, trainer, model):
        # Use the model backbone to compute the features.
        train_features, test_features = self.compute_features(model.backbone)
        # Train the KNN model with the train set
        self.skl_KNN.fit(train_features, self.train_y)
        # Predict and compute the accuracy with the test set
        y_pred = self.skl_KNN.predict(test_features)
        acc = accuracy_score(self.test_y, y_pred)
        # Log the result using the PyTorch Lightning model logger.
        model.log("KNN_acc", acc, on_step=False, on_epoch=True, sync_dist=True)

    # Organize the features and labels from the dataset samples into two tensors.
    def dataset_to_tensors(self, dataset):
        features_l = [ f for f,l in dataset ]
        labels_l = [ l for f,l in dataset ]
        return torch.stack(features_l), torch.tensor(labels_l)

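    # Extract the features using the backbone
    def compute_features(self, backbone):
        backbone_device = next(backbone.parameters()).device
        # Switch to eval mode so BatchNorm uses (and does not update) its running statistics
        was_training = backbone.training
        backbone.eval()
        with torch.no_grad():
            # Extract features from the train and test datasets
            train_features = backbone( self.train_X.to(backbone_device) ).flatten(start_dim=1)
            test_features = backbone( self.test_X.to(backbone_device) ).flatten(start_dim=1)
        # Restore the original mode before training resumes
        backbone.train(was_training)
        return train_features.to("cpu"), test_features.to("cpu")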
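
The callback above pushes each downstream subset through the backbone in a single forward pass, which may not fit in memory on smaller GPUs. A possible alternative, sketched below under that assumption, is to extract the features in chunks. The function `extract_features_in_batches` is a hypothetical helper and is not used elsewhere in this notebook.

```python
import torch

@torch.no_grad()
def extract_features_in_batches(backbone, inputs, batch_size=256):
    """Run `backbone` over `inputs` in chunks and return the features on the CPU."""
    device = next(backbone.parameters()).device
    was_training = backbone.training
    backbone.eval()  # keep BatchNorm statistics fixed while extracting features
    chunks = [backbone(chunk.to(device)).flatten(start_dim=1).cpu()
              for chunk in inputs.split(batch_size)]
    backbone.train(was_training)  # restore the original mode
    return torch.cat(chunks)
```

Smaller values of `batch_size` trade speed for a lower peak memory footprint; the resulting feature matrix is the same.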

#### 6.1.2 Setting Up the Downstream Dataset

We will download the STL10 training dataset and split it into separate training and testing subsets.
> Note: We will not use the STL10 test partition, as we want to avoid biasing our decisions based on the official test set.

In [13]:
# Torchvision transforms
from torchvision.transforms.v2 import Compose, ToImage, ToDtype, Normalize

# STL10 statistics for the train split
stl10_train_mean = torch.tensor([0.4467, 0.4398, 0.4066])
stl10_train_std  = torch.tensor([0.2603, 0.2566, 0.2713])

# Build the data transform pipeline to convert from PIL images to tensors and normalize the data. 
transform_pipeline = Compose([
    ToImage(), 
    ToDtype(torch.float32, scale=True),
    Normalize(mean=stl10_train_mean, std=stl10_train_std)
])

# Build the dataset object (This step will download the dataset if it hasn't been previously downloaded).
train_dataset = torchvision.datasets.STL10(root="data", 
                                           split="train",  
                                           download=True,
                                           transform=transform_pipeline)

The following code splits `train_dataset` into training and validation subsets.

In [14]:
from torch.utils.data import random_split

# Split the data
torch.manual_seed(42)
train_size = int(0.80 * len(train_dataset))
test_size = len(train_dataset) - train_size
train_set, val_set = random_split(train_dataset, [train_size, test_size])

#### 6.1.3 Create the Downstream Benchmark

In [15]:
downstream_benchmark = KNN_Benchmark(train_dataset=train_set, 
                                     test_dataset=val_set,
                                     K=10)

### 6.2. Training the model

The following code configures the trainer to use the Downstream Benchmark by passing the object referenced by the `downstream_benchmark` variable as a callback.

In [16]:
from lightning import Trainer
from lightning.pytorch.callbacks import ModelCheckpoint, LearningRateMonitor
from lightning.pytorch.loggers import TensorBoardLogger

log_ckpt_dir=f"logs/09_minerva_SimCLR_STL10/Pretext/{opt_cfg_string}"
trainer = Trainer(max_epochs=n_epochs,
                  log_every_n_steps=16,
                  benchmark=True,
                  callbacks=[ModelCheckpoint(save_weights_only=True, mode='min', monitor='val_loss', save_last="link"), 
                             ModelCheckpoint(save_weights_only=True, every_n_epochs=checkpoint_every_n_epochs, save_top_k=-1), 
                             LearningRateMonitor('epoch'), downstream_benchmark],
                  logger = TensorBoardLogger(save_dir=log_ckpt_dir, name=f"SimCLR-Resnet18"))

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Finally, we invoke the trainer object with the SimCLR (pretext) model and the SimCLR datamodule. 

In [17]:
if monitor_backbone_performance_with_downstream_benchmark:
    trainer.fit(simclr_model, SimCLR_datamodule)

You are using a CUDA device ('NVIDIA A100 80GB PCIe') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name      | Type       | Params | Mode 
-------------------------------------------------
0 | backbone  | ResNet     | 11.2 M | train
1 | projector | Sequential | 328 K  | train
2 | loss      | NTXentLoss | 0      | train
-------------------------------------------------
11.5 M    Trainable params
0         Non-trainable params
11.5 M    Total params
46.019    Total estimated model params size (MB)
74        Modules in train mode
0         Modules in eval mode


Sanity Checking: |          | 0/? [00:00<?, ?it/s]

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

`Trainer.fit` stopped: `max_epochs=500` reached.


In [25]:
!ls $log_ckpt_dir/SimCLR-Resnet18/version_*/checkpoints

'epoch=119-step=37560.ckpt'	'epoch=379-step=118940.ckpt'
'epoch=139-step=43820.ckpt'	'epoch=39-step=12520.ckpt'
'epoch=159-step=50080.ckpt'	'epoch=399-step=125200.ckpt'
'epoch=179-step=56340.ckpt'	'epoch=419-step=131460-v1.ckpt'
'epoch=19-step=6260-v1.ckpt'	'epoch=435-step=136468.ckpt'
'epoch=199-step=62600.ckpt'	'epoch=439-step=137720.ckpt'
'epoch=219-step=68860.ckpt'	'epoch=459-step=143980.ckpt'
'epoch=239-step=75120.ckpt'	'epoch=479-step=150240.ckpt'
'epoch=259-step=81380-v1.ckpt'	'epoch=499-step=156500.ckpt'
'epoch=279-step=87640.ckpt'	'epoch=59-step=18780.ckpt'
'epoch=299-step=93900.ckpt'	'epoch=79-step=25040-v1.ckpt'
'epoch=319-step=100160.ckpt'	'epoch=99-step=31300.ckpt'
'epoch=339-step=106420.ckpt'	 last.ckpt
'epoch=359-step=112680.ckpt'


Notice that we have several checkpoints, collected at different epochs and training steps.
The `last.ckpt` entry is a link that points to the checkpoint that achieved the best validation loss.
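
The `ls` output above is sorted lexicographically, so `epoch=19` appears after `epoch=179`. As a minimal sketch (not part of the original notebook), the checkpoint names can be parsed and ordered by epoch with the standard library. The `parse_checkpoint_name` helper and the hard-coded `names` list are illustrative assumptions; in practice the names would come from listing the checkpoint directory.

```python
import re

# Matches names such as "epoch=19-step=6260-v1.ckpt"; the "-v1" suffix is optional.
CKPT_PATTERN = re.compile(r"epoch=(\d+)-step=(\d+)(?:-v\d+)?\.ckpt$")


def parse_checkpoint_name(name):
    """Return (epoch, step) for a checkpoint file name, or None if it does not match."""
    match = CKPT_PATTERN.search(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# A few names copied from the listing above (hypothetical subset for illustration).
names = [
    "epoch=119-step=37560.ckpt",
    "epoch=19-step=6260-v1.ckpt",
    "epoch=435-step=136468.ckpt",
    "epoch=499-step=156500.ckpt",
    "last.ckpt",
]

# Keep only the names that carry epoch/step information and sort them numerically.
parsed = [(parse_checkpoint_name(name), name) for name in names]
ordered = sorted((info, name) for info, name in parsed if info is not None)

for (epoch, step), name in ordered:
    # Epochs are zero-based, so epoch N ends after N + 1 passes over the data.
    print(f"epoch={epoch:3d} step={step:6d} steps/epoch={step // (epoch + 1)} {name}")
```

For the names listed here, every checkpoint works out to 313 training steps per epoch, which is a quick sanity check that the files belong to the same run configuration.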

Now we can use these checkpoints to load the backbone weights and employ the pre-trained backbone on downstream tasks; this is the subject of another tutorial.
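
As a rough preview of what loading the backbone weights might involve, the sketch below filters a state dictionary down to the backbone entries. Lightning checkpoints store the model weights under the `state_dict` key, but the `backbone.` prefix used here is an assumption about how the SimCLR model names its submodules, and `extract_backbone_state_dict` is a hypothetical helper, not a Minerva function. A toy dictionary stands in for the real checkpoint so the example runs without PyTorch.

```python
from collections import OrderedDict


def extract_backbone_state_dict(state_dict, prefix="backbone."):
    """Keep only the entries under `prefix` and strip the prefix from their keys.

    The default prefix is an assumption; inspect the keys of the real
    checkpoint to find the one that applies.
    """
    return OrderedDict(
        (key[len(prefix):], value)
        for key, value in state_dict.items()
        if key.startswith(prefix)
    )


# Toy stand-in for checkpoint["state_dict"]; real values would be tensors.
toy_state_dict = {
    "backbone.conv1.weight": "w0",
    "backbone.bn1.bias": "b0",
    "projection_head.0.weight": "w1",
}

backbone_weights = extract_backbone_state_dict(toy_state_dict)
print(list(backbone_weights))

# With a real checkpoint (requires PyTorch), the helper could be used roughly as:
#   import torch
#   checkpoint = torch.load(path_to_ckpt, map_location="cpu")
#   weights = extract_backbone_state_dict(checkpoint["state_dict"])
#   backbone.load_state_dict(weights)
# where `path_to_ckpt` and `backbone` are placeholders for your own file path
# and an already constructed backbone module.
```

Stripping the prefix matters because a standalone backbone expects keys such as `conv1.weight`, while the checkpoint stores them relative to the full SimCLR model.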

## <a id="sec_7">7. Exercises</a>

1) Explore how different combinations of transforms affect the performance of the Downstream Benchmark.