# SwAV - Unsupervised Learning of Visual Features by Contrasting Cluster Assignments

## 0. Current Colab Setup

In [None]:
import torch
torch.__version__

'2.3.1+cu121'

In [None]:
!python --version

Python 3.10.12


In [None]:
!nvidia-smi

Mon Aug 12 06:47:25 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   42C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [None]:
!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0


## 1. Initial Setup



### 1.1. Cloning the repository

In [None]:
%cd /content
!git clone https://github.com/facebookresearch/swav.git

/content
Cloning into 'swav'...
remote: Enumerating objects: 163, done.
remote: Counting objects: 100% (96/96), done.
remote: Compressing objects: 100% (33/33), done.
remote: Total 163 (delta 77), reused 63 (delta 63), pack-reused 67
Receiving objects: 100% (163/163), 65.50 KiB | 482.00 KiB/s, done.
Resolving deltas: 100% (111/111), done.


### 1.2. Mounting Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


### 1.3. Set environment variables

In [None]:
%env EXPERIMENT_PATH=/content/drive/MyDrive/PLAEX/Selfsupervised_models/swav/experiments/swav_400ep_bs256_pretrain
%env DATASET_PATH=/content/drive/MyDrive/PLAEX/Dataset

env: EXPERIMENT_PATH=/content/drive/MyDrive/PLAEX/Selfsupervised_models/swav/experiments/swav_400ep_bs256_pretrain
env: DATASET_PATH=/content/drive/MyDrive/PLAEX/Dataset


### 1.4. Check that the dataset and experiment paths exist

In [None]:
import os
%cd /content

# For EXPERIMENT_PATH
dir_path = os.getenv('EXPERIMENT_PATH')  # experiment output directory set in section 1.3
if os.path.exists(dir_path):
    print(f"The directory '{dir_path}' exists.")
else:
    os.makedirs(dir_path, exist_ok=True)
    print(f"Directory '{dir_path}' created successfully.")

# For DATASET_PATH
dataset_path = os.getenv('DATASET_PATH') # '/content/drive/MyDrive/PLAEX/Dataset'
# Check if the directory exists
if os.path.exists(dataset_path):
    print(f"The directory '{dataset_path}' exists.")
else:
    print(f"The directory '{dataset_path}' does not exist.")

/content
The directory '/content/drive/MyDrive/PLAEX/Selfsupervised_models/swav/experiments/swav_400ep_bs256_pretrain' exists.
The directory '/content/drive/MyDrive/PLAEX/Dataset' exists.


### 1.5. Copy our modified code into the SwAV repository

In [None]:
!cp /content/drive/MyDrive/PLAEX/Selfsupervised_models/swav/code/main_swav_no_apex.py /content/swav/
!cp -f /content/drive/MyDrive/PLAEX/Selfsupervised_models/swav/code/utils.py /content/swav/src/
!cp /content/drive/MyDrive/PLAEX/Selfsupervised_models/swav/code/eval_linear_no_distribution.py /content/swav/


## 2. Training and Evaluating SwAV
Training is divided into two main steps:


*   Self-supervised training of the SwAV model: the model learns to cluster image features without using labels, so at this stage it does not predict classes such as paper or plastic.
*   Supervised linear classification on top of SwAV: once the features have been learned, a linear classifier is trained on them to predict the labels we want, such as paper or plastic.

This two-stage process is common for classification tasks that rely on self-supervised models.
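
The two stages can be illustrated with a toy, dependency-free sketch. Everything in it is hypothetical: `frozen_features` stands in for the pretrained SwAV backbone, and the sample points and the `paper`/`plastic` labels are made up. The point it shows is that only the linear layer is updated during the second stage, while the feature extractor stays fixed.

```python
import math

def frozen_features(x):
    """Stand-in for the pretrained backbone: a fixed mapping that is never updated."""
    return [x[0] + x[1], x[0] - x[1]]

def train_linear_probe(samples, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression layer on top of the frozen features with SGD."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            f = frozen_features(x)              # features come from the frozen stage
            z = w[0] * f[0] + w[1] * f[1] + b
            p = 1.0 / (1.0 + math.exp(-z))      # predicted probability of class 1
            g = p - y                           # gradient of the log loss w.r.t. z
            w = [w[0] - lr * g * f[0], w[1] - lr * g * f[1]]
            b -= lr * g
    return w, b

def predict(w, b, x):
    f = frozen_features(x)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0

# Hypothetical labelled data for the supervised stage (0 = paper, 1 = plastic).
samples = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 1.2)]
labels = [0, 0, 1, 1]
class_names = ["paper", "plastic"]

w, b = train_linear_probe(samples, labels)
predictions = [class_names[predict(w, b, x)] for x in samples]
print(predictions)
```

In the notebook, the same roles are played by the SwAV checkpoint (frozen) and the linear layer trained by the evaluation script.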



### 2.1. Self-supervised training of the SwAV model
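
SwAV computes its cluster assignments (the "codes") by applying a few Sinkhorn-Knopp normalization steps to the scores between image features and prototypes. The `--epsilon` and `--sinkhorn_iterations` flags in the training command of this section control that step. Below is a minimal single-process sketch in plain Python, written to mirror the procedure as we understand it from the original repository. The function name `sinkhorn` and the toy scores are illustrative and are not taken from the repository's code.

```python
import math

def sinkhorn(scores, epsilon=0.05, n_iters=3):
    """Turn a B x K score matrix into soft assignments that spread over all K prototypes."""
    B, K = len(scores), len(scores[0])
    # Q is stored as K x B (prototypes x samples).
    Q = [[math.exp(scores[b][k] / epsilon) for b in range(B)] for k in range(K)]
    total = sum(sum(row) for row in Q)
    Q = [[v / total for v in row] for row in Q]
    for _ in range(n_iters):
        # Row step: every prototype receives a total mass of 1/K.
        row_sums = [sum(row) for row in Q]
        Q = [[Q[k][b] / (row_sums[k] * K) for b in range(B)] for k in range(K)]
        # Column step: every sample receives a total mass of 1/B.
        col_sums = [sum(Q[k][b] for k in range(K)) for b in range(B)]
        Q = [[Q[k][b] / (col_sums[b] * B) for b in range(B)] for k in range(K)]
    # Rescale so each sample's assignment sums to 1 and return it as B x K.
    return [[Q[k][b] * B for k in range(K)] for b in range(B)]

# Toy scores for 4 samples and 2 prototypes (made-up values in [-1, 1]).
scores = [[0.9, 0.1], [0.8, 0.3], [0.7, 0.2], [0.2, 0.6]]
codes = sinkhorn(scores, epsilon=0.05, n_iters=3)
for row in codes:
    print([round(v, 3) for v in row])
```

A smaller `epsilon` makes the assignments sharper, while more iterations push prototype usage closer to uniform across the batch.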

In [None]:
%cd /content/swav
!python main_swav_no_apex.py \
--data_path $DATASET_PATH \
--epochs 200 \
--base_lr 0.6 \
--final_lr 0.0006 \
--warmup_epochs 0 \
--batch_size 32 \
--size_crops 224 96 \
--nmb_crops 2 6 \
--min_scale_crops 0.14 0.05 \
--max_scale_crops 1. 0.14 \
--use_fp16 true \
--freeze_prototypes_niters 5005 \
--queue_length 3840 \
--epoch_queue_starts 15 \
--crops_for_assign 0 1 \
--temperature 0.1 \
--epsilon 0.05 \
--sinkhorn_iterations 3 \
--feat_dim 128 \
--nmb_prototypes 3000 \
--wd 0.000001 \
--arch resnet50 \
--dump_path $EXPERIMENT_PATH

/content/swav
INFO - 08/12/24 07:14:58 - 0:00:01 - arch: resnet50
                                     base_lr: 0.6
                                     batch_size: 32
                                     checkpoint_freq: 25
                                     crops_for_assign: [0, 1]
                                     data_path: /content/drive/MyDrive/PLAEX/Dataset
                                     dump_checkpoints: /content/drive/MyDrive/PLAEX/Selfsupervised_models/swav/experiments/swav_400ep_bs256_pretrain/checkpoints
                                     dump_path: /content/drive/MyDrive/PLAEX/Selfsupervised_models/swav/experiments/swav_400ep_bs256_pretrain
                                     epoch_queue_starts: 15
                                     epochs: 200
                                     epsilon: 0.05
                                     feat_dim: 128
                                     final_lr: 0.0006
                                     freeze_prototypes_niter
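
The logged values can be sanity-checked with a few lines of arithmetic. The sketch below assumes that our modified script keeps the cosine learning-rate decay of the original `main_swav.py` (with `--warmup_epochs 0`) and that training runs on a single GPU. `iters_per_epoch` is a hypothetical value because the dataset size is not shown here, `cosine_lr` is an illustrative helper name, and the linear-scaling line is a common heuristic rather than something the SwAV scripts apply automatically.

```python
import math

base_lr, final_lr = 0.6, 0.0006   # --base_lr, --final_lr
epochs, batch_size = 200, 32      # --epochs, --batch_size
queue_length = 3840               # --queue_length
nmb_crops = [2, 6]                # --nmb_crops
iters_per_epoch = 100             # hypothetical: depends on the dataset size

def cosine_lr(step, total_steps):
    """Cosine decay from base_lr to final_lr without warmup."""
    cosine = 1 + math.cos(math.pi * step / total_steps)
    return final_lr + 0.5 * (base_lr - final_lr) * cosine

total_steps = epochs * iters_per_epoch
lr_start = cosine_lr(0, total_steps)
lr_mid = cosine_lr(total_steps // 2, total_steps)
lr_end = cosine_lr(total_steps, total_steps)

views_per_image = sum(nmb_crops)             # 2 large crops + 6 small crops
queue_batches = queue_length // batch_size   # batches of features kept in the queue

# Assumption: linear scaling of a learning rate tuned for a total batch of 256.
scaled_lr = base_lr * batch_size / 256

print(lr_start, lr_mid, lr_end)
print(views_per_image, queue_batches, scaled_lr)
```

The experiment name mentions `bs256`, but a single Colab GPU with `--batch_size 32` gives an effective batch of 32, so a learning rate of 0.6 may be too high here. The scaled value above is only a starting point to try.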

Copy the final modifications back to our Google Drive files

In [None]:
!cp -f /content/swav/main_swav_no_apex.py /content/drive/MyDrive/PLAEX/Selfsupervised_models/swav/code/
!cp -f /content/swav/src/utils.py /content/drive/MyDrive/PLAEX/Selfsupervised_models/swav/code/

### 2.2. Supervised training of a linear classifier on top of the SwAV model

In [None]:
import os
checkpoint = "swav_400ep_bs256_pretrain.pth.tar"
os.environ['CHECKPOINT'] = checkpoint

%cd /content/swav
!python eval_linear_no_distribution.py \
--data_path $DATASET_PATH \
--pretrained $EXPERIMENT_PATH/$CHECKPOINT

/content/swav
INFO - 08/12/24 07:24:05 - 0:00:00 - arch: resnet50
                                     batch_size: 32
                                     data_path: /content/drive/MyDrive/PLAEX/Dataset
                                     decay_epochs: [60, 80]
                                     dump_checkpoints: ./checkpoints
                                     dump_path: .
                                     epochs: 100
                                     final_lr: 0
                                     gamma: 0.1
                                     global_pooling: True
                                     lr: 0.3
                                     nesterov: False
                                     pretrained: /content/drive/MyDrive/PLAEX/Selfsupervised_models/swav/experiments/swav_400ep_bs256_pretrain/swav_400ep_bs256_pretrain.pth.tar
                                     scheduler_type: cosine
                                     seed: 31
                                 

Copy the final modifications back to our Google Drive files

In [None]:
!cp -f /content/swav/eval_linear_no_distribution.py /content/drive/MyDrive/PLAEX/Selfsupervised_models/swav/code/
!cp -f /content/swav/src/utils.py /content/drive/MyDrive/PLAEX/Selfsupervised_models/swav/code/


# **Given the time limit (<1h) in Colab, we are going to try training locally**
- Graphics card: GTX 1060 Ti
- RAM: 16GB
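
When moving between short Colab sessions and a local machine, it helps to know which checkpoint is the most recent. The sketch below assumes that our modified script keeps the upstream convention of saving periodic checkpoints as `ckp-<epoch>.pth` inside the `dump_checkpoints` directory shown in the training log. That file pattern is an assumption, so adjust it if your copy of the script differs. `latest_checkpoint` is a hypothetical helper name.

```python
import re
import tempfile
from pathlib import Path

def latest_checkpoint(checkpoint_dir):
    """Return the ckp-<epoch>.pth file with the highest epoch number, or None."""
    best_epoch, best_path = -1, None
    for path in Path(checkpoint_dir).glob("ckp-*.pth"):
        match = re.fullmatch(r"ckp-(\d+)\.pth", path.name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_path = int(match.group(1)), path
    return best_path

# Demonstration on a temporary directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    empty_result = latest_checkpoint(tmp)
    for epoch in (0, 25, 100):
        (Path(tmp) / f"ckp-{epoch}.pth").touch()
    latest_name = latest_checkpoint(tmp).name

print(empty_result, latest_name)
```

In practice the directory would be the `checkpoints` folder under `EXPERIMENT_PATH`, copied from Google Drive to the local machine.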