# <img src="https://img.icons8.com/dusk/64/000000/artificial-intelligence.png" style="height:50px;display:inline"> EE 046202 - Technion - Unsupervised Learning & Data Analysis
---

#### <a href="https://lioritan.github.io">Lior Friedman</a>

## Tutorial 10 - Contrastive Learning Continues

### <img src="https://img.icons8.com/bubbles/50/000000/checklist.png" style="height:50px;display:inline"> Agenda
---
* [Bootstrap Your Own Latent (BYOL)](#-Bootstrap-Your-Own-Latent-(BYOL))
* [Barlow Twins](#-Barlow-Twins)
* [Lightly - implementing contrastive learning](#-Lightly---implementing-contrastive-learning)
* [Recommended Videos](#-Recommended-Videos)
* [Credits](#-Credits)

In [2]:
# imports for the tutorial
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

%matplotlib notebook

### <img src="https://img.icons8.com/dusk/64/000000/paper.png" style="height:50px;display:inline">  Reminder: Contrastive learning
---
* Similar things should be close, giving low loss, and dissimilar things should be far.
* Use augmentations to find positive samples (similar things).
* <img src="./assets/selfsup_contrast_augs.PNG" style="height:200px;">
* Contrastive loss: collect a batch of $N$ samples and use $N-1$ as negative samples each time.
* <img src="./assets/selfsup_infonce_loss.PNG" style="height:100px;">
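
The batch-wise contrastive loss above can be sketched roughly as follows. This is a minimal NumPy illustration of an InfoNCE-style loss, not the exact loss shown in the figure: the function name `info_nce_loss`, the temperature value and the random inputs are assumptions made for the example. Row $i$ of `z0` and row $i$ of `z1` are treated as the positive pair, and the other $N-1$ rows of `z1` act as negatives.

```python
import numpy as np

def info_nce_loss(z0, z1, temperature=0.5):
    """InfoNCE over a batch: (z0[i], z1[i]) is the positive pair for sample i."""
    # L2-normalize so that dot products are cosine similarities.
    z0 = z0 / np.linalg.norm(z0, axis=1, keepdims=True)
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    logits = z0 @ z1.T / temperature  # (N, N) similarity matrix
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal; the other N-1 entries per row are negatives.
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))                      # hypothetical embeddings
aligned = info_nce_loss(z, z + 0.01 * rng.normal(size=z.shape))
unrelated = info_nce_loss(z, rng.normal(size=z.shape))
print(f"aligned views: {aligned:.3f}, unrelated views: {unrelated:.3f}")
```

The loss should come out lower when the two views of each sample are close in embedding space than when they are unrelated.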

### <img src="https://img.icons8.com/dusk/64/000000/mountain.png" style="height:50px;display:inline"> Bootstrap Your Own Latent (BYOL)
---
* <a href="https://arxiv.org/abs/2006.07733">Bootstrap Your Own Latent (BYOL)</a> creates two views of the data and tries to predict one from the other.
* Unlike previous contrastive methods **there are no negative samples**.
* Create two augmented views of a sample $x$, $t(x),t'(x)$, feed them to two neural networks and predict one from the other.
* Each network has an encoder $f_\theta$, a projector $g_\theta$ and a predictor $q_\theta$.
<img src="./assets/selfsup_byol.PNG" style="height:300px;">
1. Sample $t,t'\sim \tau$, get latent variables $z=g_\theta(f_\theta(t(x))),\quad z_\xi=g_\xi(f_\xi(t'(x)))$.
2. Online network produces prediction $q_\theta(z)$.
3. Normalize $q_\theta(z),z_\xi$ ($L_2$-norm), $\quad\mathcal{L}_{BYOL}=\mathrm{MSE}(q_\theta(z),z_\xi)$.
4. Update the online network with SGD, and update the target network via Polyak averaging: $\xi\leftarrow\alpha \xi+(1-\alpha)\theta$.
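
Steps 3 and 4 above can be sketched in a few lines. This is a minimal NumPy illustration under stated assumptions, not the reference implementation: `byol_loss` and `ema_update` are hypothetical helper names, the arrays stand in for network outputs and weights, and the value of `alpha` is chosen only for the example. In an actual training loop the target output is treated as a constant (no gradient flows through it).

```python
import numpy as np

def byol_loss(prediction, target):
    """Squared error between L2-normalized vectors, averaged over the batch.

    For unit vectors this equals 2 - 2 * cosine_similarity.
    """
    p = prediction / np.linalg.norm(prediction, axis=1, keepdims=True)
    t = target / np.linalg.norm(target, axis=1, keepdims=True)
    return np.mean(np.sum((p - t) ** 2, axis=1))

def ema_update(target_params, online_params, alpha=0.99):
    """Polyak averaging: xi <- alpha * xi + (1 - alpha) * theta."""
    return [alpha * xi + (1.0 - alpha) * theta
            for xi, theta in zip(target_params, online_params)]

rng = np.random.default_rng(0)
q_z = rng.normal(size=(4, 8))    # stand-in for the online prediction q_theta(z)
z_xi = rng.normal(size=(4, 8))   # stand-in for the target projection z_xi
print("loss:", byol_loss(q_z, z_xi))

theta = [np.ones((2, 2))]        # toy online weights
xi = [np.zeros((2, 2))]          # toy target weights
xi = ema_update(xi, theta, alpha=0.9)
print("target weights after one update:\n", xi[0])
```

Because both vectors are normalized first, the loss depends only on the angle between prediction and target, not on their scale.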

* **The MSE loss does not require negative samples**.
* There is an *implicit* contrastive signal, which comes from using **Batch Normalization** in the projector and the predictor.
* Without this batch normalization, BYOL (usually) fails catastrophically.
* Why?
    * One purpose of negative examples in a contrastive loss function is to prevent mode collapse (i.e. what if you use all-zeros representation for every data point?).
    * BYOL has no negative samples, so we need some implicit dependency on negative samples.
    * This is exactly what batch normalization provides: no matter how similar the inputs in a batch are, their activations are re-distributed using the batch mean and standard deviation (followed by the learned scale and shift).
    * Mode collapse is prevented because all samples in the mini-batch **cannot take on the same value after batch normalization**.
* In other words, BYOL learns by asking **“how is this image different from the average image?“**, whereas contrastive methods ask **“what distinguishes these two specific images from each other?”**
* A <a href="https://arxiv.org/abs/2010.10241">follow-up paper</a> shows that batch statistics are not strictly required: Group Normalization combined with Weight Standardization, or a careful initialization, is enough to prevent mode collapse. BN simply makes avoiding mode collapse easy.
<img src="./assets/selfsup_compare.PNG" style="height:250px;">
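
The effect of batch normalization on a nearly collapsed batch can be checked with a small experiment. This is a minimal NumPy sketch, assuming training-mode normalization with batch statistics and omitting the learned scale and shift; the helper `batch_norm` and the toy data are made up for the illustration.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Training-mode batch normalization without the learned scale and shift."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
# A nearly collapsed batch: every row is the same vector plus small noise.
collapsed = np.ones((32, 4)) + 0.1 * rng.normal(size=(32, 4))
normalized = batch_norm(collapsed)

print("per-feature std before BN:", collapsed.std(axis=0))
print("per-feature std after BN: ", normalized.std(axis=0))
```

Even though the inputs are almost identical, each feature is spread back out to roughly unit standard deviation across the batch, so every sample ends up being described relative to the batch average.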

### <img src="https://img.icons8.com/bubbles/50/000000/yin-yang.png" style="height:50px;display:inline"> Barlow Twins
---
* <a href="https://arxiv.org/abs/2103.03230">Barlow Twins</a> does something similar to CCA (canonical-correlation analysis).
* Feed two distorted versions of the sample into the *same* network to extract features and learn to make the cross-correlation matrix between these two groups of output features **close to the identity matrix**. 
* In other words, the goal is to keep the representations of different versions of one sample similar, while minimizing the *redundancy* between these vectors (the idea comes from neuroscience).
<img src="./assets/selfsup_barlow.PNG" style="height:300px;">
1. Sample a batch of size $N$. For each sample, apply random augmentations $t,t'$ and encode: $z^A=f_\theta(t(x)),\quad z^B=f_\theta(t'(x))$.
2. Calculate the cross-correlation matrix $\mathcal{C}$. 
    * $\mathcal{C}$ is a square matrix whose size equals the dimensionality of the network's output.
    * Each entry $\mathcal{C}_{i,j}$ is the cosine similarity, taken along the batch dimension, between output dimension $i$ of $z^A$ and output dimension $j$ of $z^B$:
    * $$\mathcal{C}_{i,j}=\frac{\sum_{b=1}^{N}z^A_{i,b}z^B_{j,b}}{\sqrt{\sum_{b=1}^{N}(z^A_{i,b})^2}\sqrt{\sum_{b=1}^{N}(z^B_{j,b})^2}}$$
    * $\mathcal{C}_{i,j}$ is between -1 (i.e. perfect anti-correlation) and 1 (i.e. perfect correlation).
3. Minimize the loss: $$\mathcal{L}_\mathrm{BT} = \underbrace{\sum_i (1-\mathcal{C}_{ii})^2}_\text{invariance term} + \lambda \underbrace{\sum_i\sum_{j\neq i} \mathcal{C}_{ij}^2}_\text{redundancy reduction term}$$
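
The three steps above can be sketched as follows. This is a minimal NumPy illustration that follows the formula as written (the original paper additionally assumes the features are mean-centered along the batch); the function name `barlow_twins_loss`, the default `lam` value and the toy inputs are assumptions for the example. Note that the arrays here have shape $(N, D)$, i.e. one row per sample, so `z_a[b, i]` corresponds to $z^A_{i,b}$.

```python
import numpy as np

def barlow_twins_loss(z_a, z_b, lam=5e-3):
    """z_a, z_b: arrays of shape (N, D), one row per sample in the batch."""
    # Normalize each feature dimension along the batch so that the
    # entries of C are cosine similarities in [-1, 1].
    a = z_a / np.linalg.norm(z_a, axis=0, keepdims=True)
    b = z_b / np.linalg.norm(z_b, axis=0, keepdims=True)
    c = a.T @ b                                   # (D, D) cross-correlation matrix
    invariance = np.sum((1.0 - np.diag(c)) ** 2)  # pull the diagonal towards 1
    off_diag = c - np.diag(np.diag(c))
    redundancy = np.sum(off_diag ** 2)            # push off-diagonal entries towards 0
    return invariance + lam * redundancy, c

rng = np.random.default_rng(0)
z = rng.normal(size=(256, 8))                     # hypothetical embeddings
noisy = z + 0.05 * rng.normal(size=z.shape)       # a slightly distorted second view
loss_good, c_good = barlow_twins_loss(z, noisy)

# Fully redundant features: every output dimension is a copy of the first one.
redundant = np.tile(z[:, :1], (1, 8))
loss_bad, _ = barlow_twins_loss(redundant, redundant)
print(f"decorrelated features: {loss_good:.4f}, redundant features: {loss_bad:.4f}")
```

The redundant representation is perfectly invariant (the diagonal of $\mathcal{C}$ is 1), yet it is penalized by the redundancy reduction term because all off-diagonal entries are also 1.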

Notes:
* *Explicitly reduces redundancy*, so no need for batch normalization to avoid mode collapse in the representation.
* Pretty robust to batch size, but sensitive to the choice of augmentations.
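* Barlow Twins is closely related to VICReg without the variance term: both penalize off-diagonal (redundancy) entries, although VICReg applies the penalty to the covariance matrix of each branch rather than to the cross-correlation matrix between branches.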

### <img src="./assets/selfsup_lightly.png" style="height:50px;display:inline"> Lightly - implementing contrastive learning
---
* Lightly SSL is a computer vision framework for self-supervised learning. 
* Contains PyTorch-based implementations of many popular models and losses, including everything we talked about.
* <a href="https://docs.lightly.ai/self-supervised-learning/tutorials/package/tutorial_moco_memory_bank.html">Tutorials (+documentation) </a>
* Most methods can be implemented easily. For example, SimCLR training on CIFAR10:

In [1]:
import torch
import torchvision

from lightly import loss
from lightly import transforms
from lightly.data import LightlyDataset
from lightly.models.modules import heads
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")  # use gpu 0 if it is available, o.w. use the cpu

In [2]:
# Create a PyTorch module for the SimCLR model.
class SimCLR(torch.nn.Module):
    def __init__(self, backbone, device):
        super().__init__()
        self.backbone = backbone
        self.projection_head = heads.SimCLRProjectionHead(
            input_dim=512,  # Resnet18 features have 512 dimensions.
            hidden_dim=512,
            output_dim=128,
        ).to(device)
        self.device=device

    def forward(self, x):
        features = self.backbone(x).flatten(start_dim=1)
        z = self.projection_head(features)
        return z


# Use a resnet backbone.
backbone = torchvision.models.resnet18().to(device)
# Ignore the classification head as we only want the features.
backbone.fc = torch.nn.Identity()
# Build the SimCLR model.
model = SimCLR(backbone, device).to(device)

In [3]:
# Prepare transform that creates multiple random views for every image.
# Applies the following augmentations by default:
#    Random resized crop
#    Random horizontal flip
#    Color jitter
#    Random gray scale
#    Gaussian blur
#    ImageNet normalization
transform = transforms.SimCLRTransform(input_size=32, cj_prob=0.5)
# Load CIFAR10 and wrap it in a LightlyDataset together with the SimCLR transform.
cifar10_train_dataset = torchvision.datasets.CIFAR10(root='./datasets/',
                                                     train=True,
                                                     transform=torchvision.transforms.ToTensor(),
                                                     download=True)
dataset = LightlyDataset.from_torch_dataset(cifar10_train_dataset, transform=transform)

# Build a PyTorch dataloader.
dataloader = torch.utils.data.DataLoader(
    dataset,  # Pass the dataset to the dataloader.
    batch_size=128,  # A large batch size helps with the learning.
    shuffle=True,  # Shuffling is important!
)

Files already downloaded and verified


In [4]:
# Lightly exposes building blocks such as loss functions.
criterion = loss.NTXentLoss(temperature=0.5)

# Get a PyTorch optimizer.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-6)

# Train the model.
model.train()
for epoch in range(10):
    for (view0, view1), targets, filenames in dataloader:
        z0 = model(view0.to(device))
        z1 = model(view1.to(device))
        batch_loss = criterion(z0, z1)  # contrastive loss!
        # Same 3 steps (named batch_loss so the imported `loss` module is not shadowed):
        batch_loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        print(f"epoch: {epoch}, loss: {batch_loss.item():.5f}")

epoch: 0, loss: 5.53867
epoch: 0, loss: 5.52710
epoch: 0, loss: 5.46335
epoch: 0, loss: 5.48311
epoch: 0, loss: 5.45449
epoch: 0, loss: 5.45347
epoch: 0, loss: 5.33780
epoch: 0, loss: 5.35684
epoch: 0, loss: 5.37308
epoch: 0, loss: 5.33071
epoch: 0, loss: 5.44714
epoch: 0, loss: 5.22976
epoch: 0, loss: 5.32906
epoch: 0, loss: 5.27504
epoch: 0, loss: 5.28635
epoch: 0, loss: 5.30149
epoch: 0, loss: 5.34259
epoch: 0, loss: 5.33194
epoch: 0, loss: 5.34683
epoch: 0, loss: 5.29996
epoch: 0, loss: 5.24218
epoch: 0, loss: 5.22056
epoch: 0, loss: 5.22605
epoch: 0, loss: 5.22667
epoch: 0, loss: 5.19274
epoch: 0, loss: 5.21810
epoch: 0, loss: 5.22854
epoch: 0, loss: 5.22857
epoch: 0, loss: 5.21867
epoch: 0, loss: 5.25928
epoch: 0, loss: 5.11539
epoch: 0, loss: 5.20981
epoch: 0, loss: 5.17648
epoch: 0, loss: 5.15574
epoch: 0, loss: 5.22631
epoch: 0, loss: 5.21771
epoch: 0, loss: 5.12557
epoch: 0, loss: 5.12088
epoch: 0, loss: 5.21498
epoch: 0, loss: 5.19752
epoch: 0, loss: 5.16982
epoch: 0, loss: 

epoch: 0, loss: 4.81337
epoch: 0, loss: 4.85374
epoch: 0, loss: 4.87589
epoch: 0, loss: 4.81875
epoch: 0, loss: 5.00377
epoch: 0, loss: 4.79069
epoch: 0, loss: 4.96422
epoch: 0, loss: 4.84016
epoch: 0, loss: 4.85542
epoch: 0, loss: 4.78908
epoch: 0, loss: 4.86527
epoch: 0, loss: 4.90451
epoch: 0, loss: 4.91620
epoch: 0, loss: 4.93619
epoch: 0, loss: 4.92078
epoch: 0, loss: 4.95165
epoch: 0, loss: 4.96710
epoch: 0, loss: 4.98703
epoch: 0, loss: 4.78699
epoch: 0, loss: 4.87522
epoch: 0, loss: 5.01850
epoch: 0, loss: 4.77618
epoch: 0, loss: 4.79649
epoch: 0, loss: 4.81130
epoch: 0, loss: 4.85435
epoch: 0, loss: 5.04729
epoch: 0, loss: 5.04205
epoch: 0, loss: 4.87509
epoch: 0, loss: 4.90640
epoch: 0, loss: 4.71202
epoch: 0, loss: 4.91627
epoch: 0, loss: 4.87213
epoch: 0, loss: 4.83602
epoch: 0, loss: 4.89959
epoch: 0, loss: 5.00367
epoch: 0, loss: 4.83118
epoch: 0, loss: 5.01319
epoch: 0, loss: 4.89577
epoch: 0, loss: 5.03373
epoch: 0, loss: 4.89851
epoch: 0, loss: 4.88145
epoch: 0, loss: 

epoch: 1, loss: 4.97606
epoch: 1, loss: 4.85584
epoch: 1, loss: 4.82723
epoch: 1, loss: 4.83021
epoch: 1, loss: 4.81617
epoch: 1, loss: 4.87880
epoch: 1, loss: 4.82827
epoch: 1, loss: 4.84924
epoch: 1, loss: 4.73154
epoch: 1, loss: 4.82356
epoch: 1, loss: 4.86307
epoch: 1, loss: 4.76021
epoch: 1, loss: 4.82984
epoch: 1, loss: 4.92619
epoch: 1, loss: 4.77283
epoch: 1, loss: 4.74019
epoch: 1, loss: 4.82881
epoch: 1, loss: 4.85890
epoch: 1, loss: 4.79097
epoch: 1, loss: 4.83390
epoch: 1, loss: 4.84507
epoch: 1, loss: 4.83019
epoch: 1, loss: 4.68766
epoch: 1, loss: 4.76485
epoch: 1, loss: 4.79324
epoch: 1, loss: 4.74558
epoch: 1, loss: 4.79164
epoch: 1, loss: 4.79046
epoch: 1, loss: 4.83662
epoch: 1, loss: 4.83454
epoch: 1, loss: 4.79914
epoch: 1, loss: 4.75050
epoch: 1, loss: 4.90261
epoch: 1, loss: 4.77705
epoch: 1, loss: 4.74372
epoch: 1, loss: 4.84196
epoch: 1, loss: 4.86453
epoch: 1, loss: 4.87886
epoch: 1, loss: 4.78874
epoch: 1, loss: 4.84974
epoch: 1, loss: 4.75064
epoch: 1, loss: 

epoch: 2, loss: 4.79343
epoch: 2, loss: 4.74571
epoch: 2, loss: 4.72702
epoch: 2, loss: 4.64647
epoch: 2, loss: 4.60890
epoch: 2, loss: 4.64388
epoch: 2, loss: 4.75581
epoch: 2, loss: 4.81466
epoch: 2, loss: 4.72145
epoch: 2, loss: 4.69914
epoch: 2, loss: 4.78095
epoch: 2, loss: 4.81558
epoch: 2, loss: 4.82527
epoch: 2, loss: 4.75720
epoch: 2, loss: 4.74495
epoch: 2, loss: 4.77763
epoch: 2, loss: 4.77860
epoch: 2, loss: 4.80995
epoch: 2, loss: 4.80304
epoch: 2, loss: 4.83347
epoch: 2, loss: 4.69027
epoch: 2, loss: 4.75455
epoch: 2, loss: 4.71734
epoch: 2, loss: 4.73245
epoch: 2, loss: 4.76698
epoch: 2, loss: 4.81963
epoch: 2, loss: 4.78572
epoch: 2, loss: 4.70201
epoch: 2, loss: 4.68319
epoch: 2, loss: 4.84559
epoch: 2, loss: 4.88762
epoch: 2, loss: 4.70510
epoch: 2, loss: 4.77474
epoch: 2, loss: 4.85484
epoch: 2, loss: 4.77187
epoch: 2, loss: 4.79300
epoch: 2, loss: 4.77422
epoch: 2, loss: 4.75226
epoch: 2, loss: 4.80593
epoch: 2, loss: 4.75387
epoch: 2, loss: 4.80165
epoch: 2, loss: 

epoch: 3, loss: 4.66961
epoch: 3, loss: 4.74714
epoch: 3, loss: 4.76349
epoch: 3, loss: 4.65112
epoch: 3, loss: 4.73572
epoch: 3, loss: 4.75237
epoch: 3, loss: 4.74153
epoch: 3, loss: 4.81448
epoch: 3, loss: 4.72932
epoch: 3, loss: 4.80173
epoch: 3, loss: 4.81906
epoch: 3, loss: 4.67857
epoch: 3, loss: 4.77189
epoch: 3, loss: 4.70613
epoch: 3, loss: 4.72931
epoch: 3, loss: 4.68391
epoch: 3, loss: 4.74071
epoch: 3, loss: 4.75386
epoch: 3, loss: 4.69280
epoch: 3, loss: 4.72056
epoch: 3, loss: 4.78784
epoch: 3, loss: 4.71903
epoch: 3, loss: 4.75330
epoch: 3, loss: 4.69761
epoch: 3, loss: 4.81159
epoch: 3, loss: 4.70132
epoch: 3, loss: 4.80501
epoch: 3, loss: 4.68310
epoch: 3, loss: 4.74801
epoch: 3, loss: 4.89934
epoch: 3, loss: 4.76196
epoch: 3, loss: 4.72576
epoch: 3, loss: 4.68205
epoch: 3, loss: 4.79606
epoch: 3, loss: 4.72118
epoch: 3, loss: 4.86863
epoch: 3, loss: 4.70621
epoch: 3, loss: 4.74419
epoch: 3, loss: 4.71346
epoch: 3, loss: 4.70691
epoch: 3, loss: 4.74054
epoch: 3, loss: 

epoch: 4, loss: 4.69491
epoch: 4, loss: 4.74086
epoch: 4, loss: 4.71873
epoch: 4, loss: 4.62725
epoch: 4, loss: 4.76116
epoch: 4, loss: 4.51161
epoch: 4, loss: 4.66462
epoch: 4, loss: 4.71101
epoch: 4, loss: 4.79298
epoch: 4, loss: 4.67551
epoch: 4, loss: 4.64376
epoch: 4, loss: 4.66390
epoch: 4, loss: 4.73999
epoch: 4, loss: 4.72858
epoch: 4, loss: 4.69025
epoch: 4, loss: 4.65671
epoch: 4, loss: 4.73787
epoch: 4, loss: 4.62385
epoch: 4, loss: 4.61008
epoch: 4, loss: 4.67679
epoch: 4, loss: 4.70752
epoch: 4, loss: 4.73430
epoch: 4, loss: 4.64886
epoch: 4, loss: 4.69190
epoch: 4, loss: 4.73653
epoch: 4, loss: 4.68882
epoch: 4, loss: 4.72291
epoch: 4, loss: 4.71033
epoch: 4, loss: 4.77529
epoch: 4, loss: 4.74889
epoch: 4, loss: 4.71063
epoch: 4, loss: 4.79432
epoch: 4, loss: 4.79158
epoch: 4, loss: 4.64110
epoch: 4, loss: 4.79696
epoch: 4, loss: 4.69505
epoch: 4, loss: 4.72052
epoch: 4, loss: 4.59829
epoch: 4, loss: 4.70200
epoch: 4, loss: 4.73420
epoch: 4, loss: 4.71190
epoch: 4, loss: 

epoch: 5, loss: 4.72929
epoch: 5, loss: 4.72953
epoch: 5, loss: 4.61289
epoch: 5, loss: 4.68904
epoch: 5, loss: 4.66018
epoch: 5, loss: 4.69320
epoch: 5, loss: 4.75788
epoch: 5, loss: 4.76647
epoch: 5, loss: 4.68729
epoch: 5, loss: 4.63843
epoch: 5, loss: 4.67369
epoch: 5, loss: 4.70060
epoch: 5, loss: 4.58946
epoch: 5, loss: 4.61039
epoch: 5, loss: 4.70128
epoch: 5, loss: 4.57153
epoch: 5, loss: 4.67642
epoch: 5, loss: 4.72681
epoch: 5, loss: 4.69051
epoch: 5, loss: 4.76805
epoch: 5, loss: 4.69081
epoch: 5, loss: 4.69741
epoch: 5, loss: 4.61265
epoch: 5, loss: 4.73374
epoch: 5, loss: 4.66208
epoch: 5, loss: 4.69118
epoch: 5, loss: 4.57312
epoch: 5, loss: 4.68605
epoch: 5, loss: 4.79383
epoch: 5, loss: 4.71150
epoch: 5, loss: 4.68985
epoch: 5, loss: 4.75277
epoch: 5, loss: 4.63178
epoch: 5, loss: 4.66125
epoch: 5, loss: 4.69850
epoch: 5, loss: 4.64766
epoch: 5, loss: 4.72820
epoch: 5, loss: 4.61466
epoch: 5, loss: 4.61739
epoch: 5, loss: 4.70012
epoch: 5, loss: 4.69531
epoch: 5, loss: 

epoch: 6, loss: 4.59689
epoch: 6, loss: 4.62218
epoch: 6, loss: 4.71876
epoch: 6, loss: 4.66653
epoch: 6, loss: 4.71502
epoch: 6, loss: 4.69049
epoch: 6, loss: 4.70652
epoch: 6, loss: 4.64429
epoch: 6, loss: 4.58674
epoch: 6, loss: 4.56169
epoch: 6, loss: 4.66987
epoch: 6, loss: 4.75773
epoch: 6, loss: 4.71616
epoch: 6, loss: 4.73273
epoch: 6, loss: 4.60196
epoch: 6, loss: 4.63529
epoch: 6, loss: 4.63514
epoch: 6, loss: 4.65619
epoch: 6, loss: 4.63707
epoch: 6, loss: 4.73203
epoch: 6, loss: 4.77190
epoch: 6, loss: 4.77620
epoch: 6, loss: 4.71456
epoch: 6, loss: 4.59049
epoch: 6, loss: 4.69352
epoch: 6, loss: 4.61087
epoch: 6, loss: 4.59396
epoch: 6, loss: 4.69986
epoch: 6, loss: 4.61183
epoch: 6, loss: 4.70561
epoch: 6, loss: 4.62277
epoch: 6, loss: 4.63486
epoch: 6, loss: 4.63360
epoch: 6, loss: 4.52447
epoch: 6, loss: 4.57266
epoch: 6, loss: 4.62713
epoch: 6, loss: 4.67383
epoch: 6, loss: 4.70691
epoch: 6, loss: 4.61786
epoch: 6, loss: 4.62876
epoch: 6, loss: 4.62622
epoch: 6, loss: 

epoch: 6, loss: 4.10138
epoch: 7, loss: 4.75162
epoch: 7, loss: 4.74755
epoch: 7, loss: 4.58063
epoch: 7, loss: 4.62815
epoch: 7, loss: 4.53496
epoch: 7, loss: 4.63725
epoch: 7, loss: 4.79194
epoch: 7, loss: 4.54970
epoch: 7, loss: 4.67607
epoch: 7, loss: 4.59168
epoch: 7, loss: 4.56725
epoch: 7, loss: 4.65160
epoch: 7, loss: 4.70658
epoch: 7, loss: 4.53710
epoch: 7, loss: 4.55777
epoch: 7, loss: 4.71572
epoch: 7, loss: 4.65752
epoch: 7, loss: 4.72520
epoch: 7, loss: 4.64647
epoch: 7, loss: 4.65038
epoch: 7, loss: 4.59784
epoch: 7, loss: 4.63986
epoch: 7, loss: 4.55953
epoch: 7, loss: 4.60656
epoch: 7, loss: 4.67965
epoch: 7, loss: 4.53230
epoch: 7, loss: 4.74154
epoch: 7, loss: 4.58290
epoch: 7, loss: 4.64899
epoch: 7, loss: 4.73195
epoch: 7, loss: 4.61968
epoch: 7, loss: 4.68025
epoch: 7, loss: 4.60901
epoch: 7, loss: 4.65057
epoch: 7, loss: 4.56978
epoch: 7, loss: 4.66827
epoch: 7, loss: 4.69326
epoch: 7, loss: 4.65924
epoch: 7, loss: 4.57662
epoch: 7, loss: 4.67027
epoch: 7, loss: 

epoch: 7, loss: 4.65507
epoch: 7, loss: 4.57000
epoch: 7, loss: 4.71341
epoch: 7, loss: 4.64505
epoch: 7, loss: 4.68874
epoch: 7, loss: 4.63332
epoch: 7, loss: 4.68442
epoch: 7, loss: 4.73570
epoch: 7, loss: 4.62704
epoch: 7, loss: 4.66297
epoch: 7, loss: 4.56458
epoch: 7, loss: 4.51787
epoch: 7, loss: 4.56070
epoch: 7, loss: 4.62918
epoch: 7, loss: 4.66226
epoch: 7, loss: 4.69295
epoch: 7, loss: 4.62083
epoch: 7, loss: 4.68110
epoch: 7, loss: 4.66760
epoch: 7, loss: 4.62630
epoch: 7, loss: 4.64906
epoch: 7, loss: 4.75150
epoch: 7, loss: 4.61644
epoch: 7, loss: 4.60859
epoch: 7, loss: 4.67003
epoch: 7, loss: 4.58994
epoch: 7, loss: 4.67266
epoch: 7, loss: 4.56694
epoch: 7, loss: 4.55354
epoch: 7, loss: 4.52613
epoch: 7, loss: 4.63634
epoch: 7, loss: 4.56930
epoch: 7, loss: 4.65965
epoch: 7, loss: 4.65926
epoch: 7, loss: 4.62536
epoch: 7, loss: 4.69743
epoch: 7, loss: 4.61689
epoch: 7, loss: 4.65144
epoch: 7, loss: 4.65431
epoch: 7, loss: 4.61457
epoch: 7, loss: 4.57118
epoch: 7, loss: 

epoch: 8, loss: 4.68586
epoch: 8, loss: 4.69647
epoch: 8, loss: 4.67477
epoch: 8, loss: 4.55421
epoch: 8, loss: 4.60624
epoch: 8, loss: 4.59599
epoch: 8, loss: 4.69522
epoch: 8, loss: 4.58878
epoch: 8, loss: 4.60171
epoch: 8, loss: 4.52747
epoch: 8, loss: 4.58919
epoch: 8, loss: 4.51147
epoch: 8, loss: 4.51823
epoch: 8, loss: 4.72000
epoch: 8, loss: 4.68022
epoch: 8, loss: 4.54292
epoch: 8, loss: 4.63488
epoch: 8, loss: 4.66772
epoch: 8, loss: 4.58828
epoch: 8, loss: 4.62156
epoch: 8, loss: 4.61319
epoch: 8, loss: 4.52906
epoch: 8, loss: 4.75217
epoch: 8, loss: 4.73331
epoch: 8, loss: 4.55011
epoch: 8, loss: 4.58417
epoch: 8, loss: 4.60217
epoch: 8, loss: 4.55056
epoch: 8, loss: 4.65585
epoch: 8, loss: 4.61890
epoch: 8, loss: 4.60535
epoch: 8, loss: 4.59989
epoch: 8, loss: 4.53490
epoch: 8, loss: 4.54875
epoch: 8, loss: 4.49506
epoch: 8, loss: 4.65908
epoch: 8, loss: 4.60440
epoch: 8, loss: 4.68657
epoch: 8, loss: 4.58171
epoch: 8, loss: 4.70492
epoch: 8, loss: 4.70989
epoch: 8, loss: 

epoch: 9, loss: 4.64443
epoch: 9, loss: 4.56879
epoch: 9, loss: 4.56448
epoch: 9, loss: 4.58461
epoch: 9, loss: 4.62787
epoch: 9, loss: 4.67309
epoch: 9, loss: 4.60679
epoch: 9, loss: 4.67028
epoch: 9, loss: 4.59673
epoch: 9, loss: 4.61184
epoch: 9, loss: 4.52091
epoch: 9, loss: 4.55474
epoch: 9, loss: 4.61355
epoch: 9, loss: 4.60933
epoch: 9, loss: 4.65387
epoch: 9, loss: 4.67340
epoch: 9, loss: 4.49355
epoch: 9, loss: 4.59842
epoch: 9, loss: 4.62803
epoch: 9, loss: 4.52051
epoch: 9, loss: 4.74364
epoch: 9, loss: 4.59964
epoch: 9, loss: 4.52608
epoch: 9, loss: 4.58959
epoch: 9, loss: 4.54002
epoch: 9, loss: 4.62693
epoch: 9, loss: 4.70210
epoch: 9, loss: 4.64142
epoch: 9, loss: 4.54966
epoch: 9, loss: 4.61261
epoch: 9, loss: 4.52993
epoch: 9, loss: 4.61505
epoch: 9, loss: 4.56391
epoch: 9, loss: 4.56023
epoch: 9, loss: 4.64395
epoch: 9, loss: 4.55563
epoch: 9, loss: 4.69017
epoch: 9, loss: 4.56047
epoch: 9, loss: 4.57660
epoch: 9, loss: 4.48061
epoch: 9, loss: 4.61676
epoch: 9, loss: 

* The trained ```model.backbone``` can be used as an image embedding for downstream classification/clustering/t-SNE.
* ```model.projection_head``` is discarded after training.
* The loss function, augmentations and projection head(s) can be swapped easily.
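
Extracting embeddings for downstream use might look roughly like this. It is a minimal PyTorch sketch rather than Lightly's own API: the tiny CNN is only a stand-in for the trained ```model.backbone``` so that the snippet runs on its own, and the helper `embed` and the random image batch are hypothetical.

```python
import torch

# Stand-in for the trained `model.backbone` (assumption: any module that maps
# an image batch to per-image feature maps would work the same way).
backbone = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1),
)

@torch.no_grad()
def embed(backbone, images):
    """Return one flat feature vector per image, without tracking gradients."""
    backbone.eval()  # use running statistics in BatchNorm layers, disable dropout
    return backbone(images).flatten(start_dim=1)

images = torch.randn(16, 3, 32, 32)  # hypothetical batch of CIFAR10-sized images
embeddings = embed(backbone, images)
print(embeddings.shape)  # torch.Size([16, 8])
```

The resulting matrix of embeddings can then be passed to a classifier, a clustering algorithm or t-SNE; the projection head is not involved at this stage.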

### <img src="https://img.icons8.com/bubbles/50/000000/video-playlist.png" style="height:50px;display:inline"> Recommended Videos
---
#### <img src="https://img.icons8.com/cute-clipart/64/000000/warning-shield.png" style="height:30px;display:inline"> Warning!
* These videos do not replace the lectures and tutorials.
* Please use these to get a better understanding of the material, and not as an alternative to the written material.

#### Video By Subject

* BYOL - <a href="https://www.youtube.com/watch?v=YPfUiOMYOEE"> BYOL: Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning </a>

## <img src="https://img.icons8.com/dusk/64/000000/prize.png" style="height:50px;display:inline"> Credits
---
* <a href="https://github.com/taldatech/ee046211-deep-learning/blob/main/ee046211_tutorial_09_self_supervised_representation_learning.ipynb"> ee046211 - Deep Learning </a> @ Technion
* <a href="https://lilianweng.github.io/posts/2021-05-31-contrastive/"> Weng, Lilian. (May 2021). Contrastive representation learning. Lil’Log </a>
* <a href="https://imbue.com/research/2020-08-24-understanding-self-supervised-contrastive-learning/"> Understanding self-supervised and contrastive learning with BYOL </a>
* A Cookbook of Self-Supervised Learning, Balestriero et al. 2023
* <a href="https://paperswithcode.com/method/byol">Bootstrap Your Own Latent (BYOL)</a>
* <a href="https://paperswithcode.com/method/barlow-twins">Barlow Twins</a>
* <a href="https://github.com/lightly-ai/lightly">Lightly SSL</a>