# Self-supervised training on 1 GPU with VISSL

## Generate custom dataset

Note: (1) Do not rename the "train" and "val" folders. (2) Put all your images into the "train/label1" and "train/label2" folders; they can be split between the two folders in any proportion.

```
path/to/your/dataset
├── train
│   ├── label1/
│   │   ├── images1.jpg
│   │   └── images2.jpg
│   └── label2/
│       ├── images1.jpg
│       └── images2.jpg
└── val (leave it empty)
    ├── label1/
    │   ├── images1.jpg
    │   └── images2.jpg
    └── label2/
        ├── images1.jpg
        └── images2.jpg
```
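Before training, it can help to confirm that the folders follow the layout above. The following is a minimal sketch, not part of VISSL: the helper name `summarize_split` and the set of image extensions are assumptions made for illustration.

```python
from pathlib import Path

# Extensions counted as images; this set is an assumption, adjust it to your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def summarize_split(dataset_root, split):
    """Return {label_folder_name: image_count} for one split ("train" or "val")."""
    split_dir = Path(dataset_root) / split
    if not split_dir.is_dir():
        raise FileNotFoundError(f"Missing required folder: {split_dir}")
    counts = {}
    # Every sub-folder of the split is treated as one label folder.
    for label_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[label_dir.name] = sum(
            1
            for f in label_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


# Example usage (replace the placeholder path with your own dataset root):
# print(summarize_split("path/to/your/dataset", "train"))
```

A label folder that reports zero images usually points to a wrong path or an unexpected file extension.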

## Load custom dataset

(1) Modify the custom dataset path in the **"tools/run_distributed_engines.py"** file;

(2) Add the project root path in the **"tools/run_distributed_engines.py"** file.

In [None]:
# (1) Modify the custom dataset path in the code below, which goes in the "tools/run_distributed_engines.py" file.

from vissl.data.dataset_catalog import VisslDatasetCatalog

train_path="/scratch/tjian/Data/GJO_SSL/images_tiles_224_pretrain/train"
val_path="/scratch/tjian/Data/GJO_SSL/images_tiles_224_pretrain/val"
VisslDatasetCatalog.register_data(name="GJO", data_dict={"train": train_path, "test": val_path})

In [None]:
# (2) Add the project root path with the code below, which also goes in the "tools/run_distributed_engines.py" file.

import sys

sys.path.append('/scratch/tjian/PythonProject/deep_plastic_SSL/')
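The two snippets above can be wrapped in small helpers that check the paths before registering anything. This is a sketch under stated assumptions: `build_data_dict`, `add_project_root`, and `register_dataset` are hypothetical names introduced here, and only `VisslDatasetCatalog.register_data` comes from the code above.

```python
import sys
from pathlib import Path


def build_data_dict(dataset_root):
    """Map the "train"/"test" keys used above to the train/ and val/ folders."""
    root = Path(dataset_root)
    train_dir = root / "train"
    if not train_dir.is_dir():
        raise FileNotFoundError(f"Train folder not found: {train_dir}")
    return {"train": str(train_dir), "test": str(root / "val")}


def add_project_root(project_root):
    """Append the project root to sys.path only once, even if called repeatedly."""
    if project_root not in sys.path:
        sys.path.append(project_root)


def register_dataset(name, dataset_root):
    """Register the dataset with VISSL; requires vissl to be installed."""
    # Imported lazily so the helpers above remain usable without vissl.
    from vissl.data.dataset_catalog import VisslDatasetCatalog

    VisslDatasetCatalog.register_data(name=name, data_dict=build_data_dict(dataset_root))
```

Failing early on a missing "train" folder avoids starting a run that would only stop once data loading begins.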

## SimCLR

Steps:

(a) Start from a ResNet50 or ResNet101 pretrained on the ImageNet-1k dataset (1k categories, 1.2 million images). The weights can be downloaded from:

https://dl.fbaipublicfiles.com/detectron2/ImageNetPretrained/MSRA/R-50.pkl

https://dl.fbaipublicfiles.com/detectron2/ImageNetPretrained/MSRA/R-101.pkl

(b) Modify the hyperparameters in **"pretrain/simclr/XXX.yaml"** file if needed, e.g., data augmentation;

(c) Modify the hyperparameters in the commands below, e.g., dataset name, train data path, batch size, epochs, checkpoint output path, pre-trained model weights path, and fine-tune strategy (here: fine-tune all layers of the backbone);

(d) Train the full model (all layers) on the custom dataset.
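The hyperparameters in step (c) are passed as `key=value` overrides on the command line. As a rough sketch of how such a command is put together, the hypothetical helper `build_vissl_command` below assembles the argument list from a dictionary; the keys and values are taken from the commands in this notebook, while the helper itself is an illustration and not part of VISSL.

```python
import shlex


def build_vissl_command(config_file, overrides):
    """Assemble the run_distributed_engines.py argument list from a dict of overrides."""
    args = ["python", "tools/run_distributed_engines.py", f"config={config_file}"]
    for key, value in overrides.items():
        # Booleans are written in lowercase, as in the commands in this notebook.
        if isinstance(value, bool):
            value = "true" if value else "false"
        args.append(f"{key}={value}")
    return args


simclr_overrides = {
    "config.DATA.TRAIN.DATASET_NAMES": "[GJO]",
    "config.DATA.TRAIN.DATA_SOURCES": "[disk_folder]",
    "config.DATA.TRAIN.BATCHSIZE_PER_REPLICA": 16,
    "config.OPTIMIZER.num_epochs": 100,
    "config.HOOKS.TENSORBOARD_SETUP.USE_TENSORBOARD": True,
}

command = build_vissl_command("pretrain/simclr/simclr_1_gpu_resnet50.yaml", simclr_overrides)
# Print a shell-quoted version of the command for inspection.
print(shlex.join(command))
```

Keeping the overrides in a dictionary makes it easier to switch between the ResNet50 and ResNet101 runs by changing only the config file and the output paths.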

#### (1) ResNet50

In [None]:
# SimCLR
# Backbone: resnet50
# Pretrained on ImageNet, and fine tune all layers (FTAL)

!python tools/run_distributed_engines.py \
  hydra.verbose=true \
  config=pretrain/simclr/simclr_1_gpu_resnet50.yaml \
  config.DATA.TRAIN.DATASET_NAMES=[GJO] \
  config.DATA.TRAIN.DATA_SOURCES=[disk_folder] \
  config.DATA.TRAIN.DATA_PATHS=["/scratch/tjian/Data/GJO_SSL/images_tiles_224_pretrain/train"] \
  config.DATA.TRAIN.BATCHSIZE_PER_REPLICA=16 \
  config.OPTIMIZER.num_epochs=100 \
  config.CHECKPOINT.DIR="/scratch/tjian/PythonProject/deep_plastic_SSL/checkpoints/train_weights/Self_train_tiles_224/RN50_SC/vissl" \
  config.HOOKS.TENSORBOARD_SETUP.USE_TENSORBOARD=true \
  config.WEIGHTS_INIT.PARAMS_FILE="/scratch/tjian/PythonProject/deep_plastic_SSL/checkpoints/pretrained_model/R-50.pkl" \
  config.WEIGHTS_INIT.APPEND_PREFIX="trunk._feature_blocks."

In [None]:
# SimCLR
# Backbone: resnet50
# Train from scratch

!python tools/run_distributed_engines.py \
  hydra.verbose=true \
  config=pretrain/simclr/simclr_1_gpu_RN50_scratch.yaml \
  config.DATA.TRAIN.DATASET_NAMES=[GJO] \
  config.DATA.TRAIN.DATA_SOURCES=[disk_folder] \
  config.DATA.TRAIN.DATA_PATHS=["/scratch/tjian/Data/GJO_SSL/images_tiles_224_pretrain/train"] \
  config.DATA.TRAIN.BATCHSIZE_PER_REPLICA=16 \
  config.OPTIMIZER.num_epochs=100 \
  config.CHECKPOINT.DIR="/scratch/tjian/PythonProject/deep_plastic_SSL/checkpoints/train_weights/Self_train_tiles_224/RN50_SC_100e_Scratch/vissl" \
  config.HOOKS.TENSORBOARD_SETUP.USE_TENSORBOARD=true
  

#### (2) ResNet101

In [None]:
# SimCLR
# Backbone: resnet101
# Pretrained on ImageNet, and fine tune all layers (FTAL)

!python tools/run_distributed_engines.py \
  hydra.verbose=true \
  config=pretrain/simclr/simclr_1_gpu_resnet101.yaml \
  config.DATA.TRAIN.DATASET_NAMES=[GJO] \
  config.DATA.TRAIN.DATA_SOURCES=[disk_folder] \
  config.DATA.TRAIN.DATA_PATHS=["/scratch/tjian/Data/GJO_SSL/images_tiles_224_pretrain/train"] \
  config.DATA.TRAIN.BATCHSIZE_PER_REPLICA=16 \
  config.OPTIMIZER.num_epochs=100 \
  config.CHECKPOINT.DIR="/scratch/tjian/PythonProject/deep_plastic_SSL/checkpoints/train_weights/Self_train_tiles_224/RN101_SC/vissl" \
  config.HOOKS.TENSORBOARD_SETUP.USE_TENSORBOARD=true \
  config.WEIGHTS_INIT.PARAMS_FILE="/scratch/tjian/PythonProject/deep_plastic_SSL/checkpoints/pretrained_model/R-101.pkl" \
  config.WEIGHTS_INIT.APPEND_PREFIX="trunk._feature_blocks."

In [None]:
# SimCLR
# Backbone: resnet101
# Train from scratch

!python tools/run_distributed_engines.py \
  hydra.verbose=true \
  config=pretrain/simclr/simclr_1_gpu_RN101_scratch.yaml \
  config.DATA.TRAIN.DATASET_NAMES=[GJO] \
  config.DATA.TRAIN.DATA_SOURCES=[disk_folder] \
  config.DATA.TRAIN.DATA_PATHS=["/scratch/tjian/Data/GJO_SSL/images_tiles_224_pretrain/train"] \
  config.DATA.TRAIN.BATCHSIZE_PER_REPLICA=16 \
  config.OPTIMIZER.num_epochs=100 \
  config.CHECKPOINT.DIR="/scratch/tjian/PythonProject/deep_plastic_SSL/checkpoints/train_weights/Self_train_tiles_224/RN101_SC_100e_Scratch/vissl" \
  config.HOOKS.TENSORBOARD_SETUP.USE_TENSORBOARD=true
  

## SwAV

Steps:

(a) Start from a ResNet50 or ResNet101 pretrained on the ImageNet dataset;

(b) Modify the hyperparameters in **"pretrain/swav/XXX.yaml"** file if needed, e.g., data augmentation;

(c) Modify the hyperparameters in the commands below, e.g., batch size, epochs, output path;

(d) Train the full model (all layers) on the custom dataset.

#### (1) ResNet50

In [None]:
# SwAV
# backbone: RN50
# Pretrained on ImageNet, and fine tune all layers (FTAL)

!python tools/run_distributed_engines.py \
  hydra.verbose=true \
  config=pretrain/swav/swav_1_gpu_resnet50/C5000_TP_0.1.yaml \
  config.DATA.TRAIN.DATASET_NAMES=[GJO] \
  config.DATA.TRAIN.DATA_SOURCES=[disk_folder] \
  config.DATA.TRAIN.DATA_PATHS=["/scratch/tjian/Data/GJO_SSL/images_tiles_224_pretrain/train"] \
  config.OPTIMIZER.num_epochs=100 \
  config.CHECKPOINT.DIR="/scratch/tjian/PythonProject/deep_plastic_SSL/checkpoints/train_weights/Paper_4/Vitenam/Self_train_GJO_Vit_90K/C5000_TP_0.1/vissl" \
  config.HOOKS.TENSORBOARD_SETUP.USE_TENSORBOARD=true \
  config.WEIGHTS_INIT.PARAMS_FILE="/scratch/tjian/PythonProject/deep_plastic_SSL/checkpoints/pretrained_model/R-50.pkl" \
  config.WEIGHTS_INIT.APPEND_PREFIX="trunk._feature_blocks."


In [None]:
# SwAV
# Backbone: resnet50
# Train from scratch

!python tools/run_distributed_engines.py \
  hydra.verbose=true \
  config=pretrain/swav/swav_1_gpu_RN50_scratch.yaml \
  config.DATA.TRAIN.DATASET_NAMES=[GJO] \
  config.DATA.TRAIN.DATA_SOURCES=[disk_folder] \
  config.DATA.TRAIN.DATA_PATHS=["/scratch/tjian/Data/GJO_SSL/images_tiles_224_pretrain/train"] \
  config.DATA.TRAIN.BATCHSIZE_PER_REPLICA=16 \
  config.OPTIMIZER.num_epochs=100 \
  config.CHECKPOINT.DIR="/scratch/tjian/PythonProject/deep_plastic_SSL/checkpoints/train_weights/Self_train_tiles_224/pretrain_50per/RN50_Sw_100e_Scratch/vissl" \
  config.HOOKS.TENSORBOARD_SETUP.USE_TENSORBOARD=true
  

#### (2) ResNet101

In [None]:
# SwAV
# backbone: RN101
# Pretrained on ImageNet, and fine tune all layers (FTAL)

!python tools/run_distributed_engines.py \
  hydra.verbose=true \
  config=pretrain/swav/swav_1_gpu_resnet101.yaml \
  config.DATA.TRAIN.DATASET_NAMES=[GJO] \
  config.DATA.TRAIN.DATA_SOURCES=[disk_folder] \
  config.DATA.TRAIN.DATA_PATHS=["/scratch/tjian/Data/GJO_SSL/images_tiles_224_pretrain/train"] \
  config.OPTIMIZER.num_epochs=100 \
  config.CHECKPOINT.DIR="/scratch/tjian/PythonProject/deep_plastic_SSL/checkpoints/train_weights/Self_train_tiles_224/RN101_Sw/vissl" \
  config.HOOKS.TENSORBOARD_SETUP.USE_TENSORBOARD=true \
  config.WEIGHTS_INIT.PARAMS_FILE="/scratch/tjian/PythonProject/deep_plastic_SSL/checkpoints/pretrained_model/R-101.pkl" \
  config.WEIGHTS_INIT.APPEND_PREFIX="trunk._feature_blocks."


In [None]:
# SwAV
# Backbone: resnet101
# Train from scratch

!python tools/run_distributed_engines.py \
  hydra.verbose=true \
  config=pretrain/swav/swav_1_gpu_RN101_scratch.yaml \
  config.DATA.TRAIN.DATASET_NAMES=[GJO] \
  config.DATA.TRAIN.DATA_SOURCES=[disk_folder] \
  config.DATA.TRAIN.DATA_PATHS=["/scratch/tjian/Data/GJO_SSL/images_tiles_224_pretrain/train"] \
  config.DATA.TRAIN.BATCHSIZE_PER_REPLICA=16 \
  config.OPTIMIZER.num_epochs=100 \
  config.CHECKPOINT.DIR="/scratch/tjian/PythonProject/deep_plastic_SSL/checkpoints/train_weights/Self_train_tiles_224/RN101_Sw_100e_Scratch/vissl" \
  config.HOOKS.TENSORBOARD_SETUP.USE_TENSORBOARD=true
  

## Training logs, checkpoints, metrics (optional)

VISSL dumps model checkpoints in the checkpoint directory specified by the user. In the examples above, this is the directory passed as `config.CHECKPOINT.DIR`.

We notice:
- model checkpoints, saved as `.torch` files after every epoch,
- the training log `log.txt`, which holds the full stdout saved to a file,
- `metrics.json`, which stores the metric values if your training calculated any metrics,
- `tb_logs`, which holds the TensorBoard events.
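To check quickly that a run produced the files listed above, a small inventory helper can be used. This is a minimal sketch: the function name `inventory_checkpoint_dir` is hypothetical, and it only looks for the file names mentioned in the list without reading their contents.

```python
from pathlib import Path


def inventory_checkpoint_dir(checkpoint_dir):
    """Summarize which of the expected training outputs exist in checkpoint_dir."""
    root = Path(checkpoint_dir)
    return {
        # Checkpoint file names, sorted alphabetically.
        "checkpoints": sorted(p.name for p in root.glob("*.torch")),
        "has_log": (root / "log.txt").is_file(),
        "has_metrics": (root / "metrics.json").is_file(),
        "has_tb_logs": (root / "tb_logs").is_dir(),
    }


# Example usage (pass the same directory you set in config.CHECKPOINT.DIR):
# print(inventory_checkpoint_dir("path/to/your/checkpoint/dir"))
```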

## Visualizing Tensorboard Logs (optional)

If you have enabled `config.HOOKS.TENSORBOARD_SETUP.USE_TENSORBOARD=true`, you will see the TensorBoard events dumped in the `tb_logs/` directory. You can use this to visualize the events in TensorBoard as follows:

In [None]:
# Look at training curves in tensorboard:
%reload_ext tensorboard
%tensorboard --logdir /scratch/tjian/PythonProject/deep_plastic_SSL/checkpoints/train_weights/Self_train_bbox/SimCLR_50_epochs/vissl/tb_logs/