<a href="https://colab.research.google.com/github/open-mmlab/mmselfsup/blob/master/demo/mmselfsup_colab_tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# MMSelfSup Tutorial
This tutorial covers the following topics:

- How to install MMSelfSup
- How to train algorithms in MMSelfSup
- How to train downstream tasks

If you have any other questions, feel free to open an issue.

## How to install MMSelfSup

Before using MMSelfSup, we need to prepare the environment with the following steps:

1. Install Python, CUDA, a C/C++ compiler, and git
2. Install PyTorch (CUDA version)
3. Install the dependent codebases (mmcv, mmcls)
4. Clone the mmselfsup source code from GitHub and install it

Because this tutorial is on Google Colab and all necessary packages have been installed, we can skip the first two steps.

In [1]:
!pwd

/content


In [2]:
# Check nvcc version
!nvcc -V

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0


In [3]:
# Check GCC version
!gcc --version

gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.



In [4]:
# Check PyTorch installation
import torch, torchvision
print(torch.__version__)
print(torch.cuda.is_available())

1.10.0+cu111
True


## Install MMCV

MMCV is the foundational library for all OpenMMLab packages. Pre-built wheels are available for Linux, so we can download and install them directly.

Make sure the wheel you choose matches your PyTorch and CUDA versions.

In the steps above, we checked the versions of PyTorch and CUDA. They are 1.10.0 and 11.1 respectively, so we need to choose the corresponding wheel.

We can also install the full version of mmcv (mmcv-full). It includes all features and various CUDA ops out of the box, but takes longer to build.

We recommend installing it with MIM: https://github.com/open-mmlab/mim

In [5]:
!pip install openmim

Collecting openmim
  Downloading openmim-0.1.5.tar.gz (35 kB)
Collecting colorama
  Downloading colorama-0.4.4-py2.py3-none-any.whl (16 kB)
Collecting model-index
  Downloading model_index-0.1.11-py3-none-any.whl (34 kB)
Collecting ordered-set
  Downloading ordered_set-4.1.0-py3-none-any.whl (7.6 kB)
Building wheels for collected packages: openmim
  Building wheel for openmim (setup.py) ... done
  Created wheel for openmim: filename=openmim-0.1.5-py2.py3-none-any.whl size=42503 sha256=6957857625db07e7d2c995d0fa19d98dd58f3daea577a3bc933b16edab669fb4
  Stored in directory: /root/.cache/pip/wheels/16/8b/e1/bdebbbc687aa50224a5ce46fe97a040a0c59f92b34bfc750b6
Successfully built openmim
Installing collected packages: ordered-set, model-index, colorama, openmim
Successfully installed colorama-0.4.4 model-index-0.1.11 openmim-0.1.5 ordered-set-4.1.0


In [6]:
!mim install mmcv-full

installing mmcv-full from wheel.
Looking in links: https://download.openmmlab.com/mmcv/dist/cu111/torch1.10.0/index.html
Collecting mmcv-full==1.4.7
  Downloading https://download.openmmlab.com/mmcv/dist/cu111/torch1.10.0/mmcv_full-1.4.7-cp37-cp37m-manylinux1_x86_64.whl (46.3 MB)
Collecting yapf
  Downloading yapf-0.32.0-py2.py3-none-any.whl (190 kB)
Collecting addict
  Downloading addict-2.4.0-py3-none-any.whl (3.8 kB)
Installing collected packages: yapf, addict, mmcv-full
Successfully installed addict-2.4.0 mmcv-full-1.4.7 yapf-0.32.0
Successfully installed mmcv-full.


Alternatively, you can install the packages with pip, but then you have to check the PyTorch and CUDA versions manually. An example command is provided below; modify it according to your PyTorch and CUDA versions.

In [None]:
# Install mmcv-full from the wheel index for CUDA 11.1 and PyTorch 1.10
!pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.10/index.html
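
To make the version matching concrete, here is a minimal sketch that derives the wheel index URL from a PyTorch version string. The helper name `mmcv_wheel_index` is hypothetical, and the URL pattern is inferred from the `Looking in links:` line in the MIM log above rather than from any official specification, so verify the result before relying on it.

```python
def mmcv_wheel_index(torch_version: str) -> str:
    """Return a wheel index URL for a version string like '1.10.0+cu111'.

    Hypothetical helper: the URL layout is assumed from the MIM log above.
    """
    # torch.__version__ has the form '<release>+<build tag>', e.g. '1.10.0+cu111'.
    release, sep, build_tag = torch_version.partition('+')
    if not sep or not build_tag:
        raise ValueError(
            f'no build tag in {torch_version!r}; choose the wheel index manually')
    return ('https://download.openmmlab.com/mmcv/dist/'
            f'{build_tag}/torch{release}/index.html')


print(mmcv_wheel_index('1.10.0+cu111'))
```

In a notebook, you could pass `torch.__version__` to this helper and use the result as the `-f` argument of `pip install`.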

## Clone and install mmselfsup

In [7]:
# Clone MMSelfSup repository
!git clone https://github.com/open-mmlab/mmselfsup.git
%cd mmselfsup/

# Install MMSelfSup from source
!pip install -e . 

Cloning into 'mmselfsup'...
remote: Enumerating objects: 3255, done.
remote: Counting objects: 100% (872/872), done.
remote: Compressing objects: 100% (524/524), done.
remote: Total 3255 (delta 534), reused 446 (delta 347), pack-reused 2383
Receiving objects: 100% (3255/3255), 1.96 MiB | 6.78 MiB/s, done.
Resolving deltas: 100% (1911/1911), done.
/content/mmselfsup
Obtaining file:///content/mmselfsup
Collecting mmcls<=0.20.1,>=0.19.0
  Downloading mmcls-0.20.1-py2.py3-none-any.whl (490 kB)
Collecting timm
  Downloading timm-0.5.4-py3-none-any.whl (431 kB)
Installing collected packages: timm, mmcls, mmselfsup
  Running setup.py develop for mmselfsup
Successfully installed mmcls-0.20.1 mmselfsup-0.7.1 timm-0.5.4


In [8]:
# Check MMSelfSup installation
import mmselfsup
print(mmselfsup.__version__)

0.7.1


## Example to start a self-supervised task

Before you start training, you need to prepare your dataset. Please read [prepare_data.md](https://github.com/open-mmlab/mmselfsup/blob/master/docs/en/prepare_data.md) carefully.

**Note**: Because our implementations follow the original algorithms, many of them are designed to run in distributed mode and are not officially supported for single-GPU training. You can check the details [here](https://github.com/open-mmlab/mmselfsup/blob/master/tools/train.py#L120).


In [1]:
!pwd

/home/renqin/Projects/2.0/mmselfsup/demo


Here we provide an example and download a small dataset for the demo.

In [2]:
!mkdir data
!wget https://download.openmmlab.com/mmselfsup/data/imagenet_examples.zip
!unzip -q imagenet_examples.zip -d ./data/

--2022-08-30 19:55:19--  https://download.openmmlab.com/mmselfsup/data/imagenet_examples.zip
Resolving download.openmmlab.com (download.openmmlab.com)... 101.133.111.186
Connecting to download.openmmlab.com (download.openmmlab.com)|101.133.111.186|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 155496559 (148M) [application/zip]
Saving to: 'imagenet_examples.zip'

imagenet_examples.z  11%[=>                  ]  17.10M  3.49MB/s    eta 27s     ^C
[imagenet_examples.zip]
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of imagenet_examples.zip or
        imagenet_examples.zip.zip, and cannot find imagenet_examples.zip.ZIP, period.
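
The log above shows what happens when the download is interrupted: `unzip` fails because the archive is incomplete. As a minimal sketch, the archive can be checked from Python before extraction. The helper name `extract_if_complete` is hypothetical, and it only detects a missing or truncated file, not corrupted contents.

```python
import zipfile
from pathlib import Path


def extract_if_complete(zip_path, dest_dir):
    """Extract zip_path into dest_dir; return False if it is missing or truncated."""
    zip_path = Path(zip_path)
    # is_zipfile looks for the end-of-central-directory record, which a
    # partially downloaded archive does not contain.
    if not zip_path.is_file() or not zipfile.is_zipfile(zip_path):
        return False
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    return True
```

If `extract_if_complete('imagenet_examples.zip', './data/')` returns `False`, delete the partial file and rerun the `wget` command.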


In [11]:
# Check data directory
!apt-get install tree
!tree -d ./data

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  tree
0 upgraded, 1 newly installed, 0 to remove and 39 not upgraded.
Need to get 40.7 kB of archives.
After this operation, 105 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic/universe amd64 tree amd64 1.7.0-5 [40.7 kB]
Fetched 40.7 kB in 1s (45.2 kB/s)
Selecting previously unselected package tree.
(Reading database ... 156210 files and directories currently installed.)
Preparing to unpack .../tree_1.7.0-5_amd64.deb ...
Unpacking tree (1.7.0-5) ...
Setting up tree (1.7.0-5) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
./data
└── imagenet
    ├── meta
    └── train
        └── n01440764

4 directories


### Create a new config file
To reuse the common parts of different config files, we support inheriting from multiple base config files. For example, to train the `relative_loc` algorithm, the new config file can build the model's basic structure by inheriting `configs/selfsup/_base_/models/relative-loc.py`.
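
To illustrate the idea behind config inheritance, here is a simplified sketch of how a child config overrides values from a base config. It is not the actual config-loading implementation (which has extra features such as the `_delete_` key); `merge_config` and the example dictionaries are hypothetical stand-ins.

```python
import copy


def merge_config(base: dict, child: dict) -> dict:
    """Recursively merge child into a copy of base; child values win."""
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested dicts are merged key by key instead of being replaced.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical stand-ins for a base runtime config and a child config.
base_runtime = dict(default_hooks=dict(
    logger=dict(type='LoggerHook', interval=50),
    checkpoint=dict(type='CheckpointHook', interval=10)))
child = dict(default_hooks=dict(logger=dict(interval=10)))

merged = merge_config(base_runtime, child)
print(merged['default_hooks']['logger'])      # {'type': 'LoggerHook', 'interval': 10}
print(merged['default_hooks']['checkpoint'])  # unchanged from the base
```

The child only states what differs from the base, which is why the config file written in this section can stay short.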

In [1]:
!pwd

/home/renqin/Projects/2.0/mmselfsup/demo


In [2]:
%cd ..

/home/renqin/Projects/2.0/mmselfsup


In [3]:
import torch
torch.cuda.empty_cache()



In [4]:
%%writefile configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k_colab.py
_base_ = [
    '../_base_/models/relative-loc.py',
    '../_base_/datasets/imagenet_relative-loc.py',
    '../_base_/schedules/sgd_steplr-200e_in1k.py',
    '../_base_/default_runtime.py',
]

# optimizer wrapper
optimizer = dict(type='SGD', lr=0.2, momentum=0.9, weight_decay=1e-4)
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=optimizer,
    paramwise_cfg=dict(custom_keys={
        'neck': dict(decay_mult=5.0),
        'head': dict(decay_mult=5.0)
    }))

# learning rate scheduler
param_scheduler = [dict(type='MultiStepLR', by_epoch=True, milestones=[1, 2])]

# runtime settings
# pre-train for 70 epochs
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=70)

# hooks: log every 10 iterations and save a checkpoint every epoch
# max_keep_ckpts caps the number of checkpoint files kept in the work_dir:
# with a value of 3, saving the 4th checkpoint removes the oldest one
default_hooks = dict(
    logger=dict(type='LoggerHook', interval=10),
    checkpoint=dict(type='CheckpointHook', interval=1, max_keep_ckpts=3))


Overwriting configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k_colab.py
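
The effect of `max_keep_ckpts` described in the config comments can be simulated in a few lines. This is only a sketch of the rotation rule, not the hook's implementation; the function name `checkpoints_kept` is hypothetical, and the `epoch_N.pth` naming is assumed from the checkpoint path used later in this tutorial.

```python
from collections import deque


def checkpoints_kept(num_epochs: int, interval: int = 1, max_keep_ckpts: int = 3):
    """Simulate which checkpoint files remain after training for num_epochs."""
    kept = deque()
    for epoch in range(1, num_epochs + 1):
        if epoch % interval != 0:
            continue  # no checkpoint is saved at this epoch
        kept.append(f'epoch_{epoch}.pth')
        # Once the limit is exceeded, the oldest checkpoint is removed.
        if len(kept) > max_keep_ckpts:
            kept.popleft()
    return list(kept)


print(checkpoints_kept(4))  # ['epoch_2.pth', 'epoch_3.pth', 'epoch_4.pth']
```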


### Read the config file and modify config

We can modify the loaded config file.

In [5]:
# Load the basic config file
from mmengine.config import Config
cfg = Config.fromfile('configs/selfsup/relative_loc/relative-loc_resnet50_8xb64-steplr-70e_in1k_colab.py')

# Specify the data settings
cfg.train_dataloader.batch_size = 8
cfg.train_dataloader.num_workers = 2

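# Specify the optimizer (it lives inside the optimizer wrapper, so it must
# be changed there for the new learning rate to take effect)
cfg.optim_wrapper.optimizer = dict(type='SGD', lr=0.005, momentum=0.9, weight_decay=0.0001)
cfg.optim_wrapper.clip_grad = None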

# Specify the learning rate scheduler
cfg.param_scheduler = [dict(type='MultiStepLR', by_epoch=True, milestones=[1, 2])]

# Modify runtime setting
cfg.train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=2)

# Specify the work directory
cfg.work_dir = './work_dirs/selfsup/relative-loc_resnet50_8xb64-steplr-70e_in1k_colab'

# Output logs for every 10 iterations
cfg.default_hooks.logger.interval = 10
# To make the results reproducible, set a fixed random seed and enable
# the deterministic option of cuDNN:
# cfg.randomness = dict(seed=0, deterministic=True)
# Here the seed is left unset, so a random one is chosen for each run.
cfg.randomness = dict(seed=None)

# cfg.gpu_ids = range(1)
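
To see what a multi-step schedule with `milestones=[1, 2]` does, the following sketch computes the learning rate for each epoch, using 0.005 as the base rate. It assumes the usual multi-step rule (multiply by `gamma` at every milestone) with `gamma=0.1`, which is taken here as the default; `multistep_lr` is a hypothetical helper, not a library function.

```python
from bisect import bisect_right


def multistep_lr(base_lr: float, milestones, epoch: int, gamma: float = 0.1) -> float:
    """Learning rate in effect during a 0-indexed epoch under a multi-step schedule."""
    # The rate is multiplied by gamma once for every milestone already reached.
    return base_lr * gamma ** bisect_right(sorted(milestones), epoch)


for epoch in range(3):
    print(epoch, multistep_lr(0.005, [1, 2], epoch))
```

Under these assumptions, a two-epoch run uses the base rate in the first epoch and a tenth of it in the second; the second milestone is only reached once training has finished.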

### Start self-supervised pre-train task

In [6]:
import os
import torch

if torch.cuda.is_available():
    # Make CUDA kernel launches synchronous so that errors are reported at
    # the call that caused them (useful for debugging, but slows training).
    os.environ['CUDA_LAUNCH_BLOCKING'] = '1'

    # Optional settings for the PyTorch distributed launcher on one GPU:
    # cfg.launcher = 'pytorch'
    # os.environ['MASTER_ADDR'] = '127.0.0.1'
    # os.environ['MASTER_PORT'] = '29600'
    # os.environ['RANK'] = '0'
    # os.environ['WORLD_SIZE'] = '1'
    # os.environ['LOCAL_RANK'] = '0'

In [7]:
from mmengine.config import Config, DictAction
from mmengine.runner import Runner

from mmselfsup.utils import register_all_modules

# register all modules in mmselfsup into the registries
# do not init the default scope here because it will be init in the runner
register_all_modules(init_default_scope=False)

# build the runner from config
runner = Runner.from_cfg(cfg)

# start training
runner.train()

08/30 20:52:24 - mmengine - INFO - 
------------------------------------------------------------
System environment:
    sys.platform: linux
    Python: 3.7.13 (default, Mar 29 2022, 02:18:16) [GCC 7.5.0]
    CUDA available: True
    numpy_random_seed: 1019958489
    GPU 0: NVIDIA GeForce RTX 3090
    CUDA_HOME: /usr/local/cuda-11.3
    NVCC: Cuda compilation tools, release 11.3, V11.3.58
    GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
    PyTorch: 1.11.0+cu113
    PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;a

RelativeLoc(
  (data_preprocessor): RelativeLocDataPreprocessor()
  (backbone): ResNet(
    (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (layer1): ResLayer(
      (0): Bottleneck(
        (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(in

## Example to start a downstream task


In [8]:
!pwd

/home/renqin/Projects/2.0/mmselfsup


### Extract backbone weights from pre-train model

In [9]:
!python tools/model_converters/extract_backbone_weights.py \
  work_dirs/selfsup/relative-loc_resnet50_8xb64-steplr-70e_in1k_colab/epoch_2.pth \
  work_dirs/selfsup/relative-loc_resnet50_8xb64-steplr-70e_in1k_colab/relative-loc_backbone-weights.pth
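
Conceptually, extracting backbone weights means keeping only the parameters whose names start with the backbone prefix and dropping that prefix, so the downstream model can load them directly. The sketch below illustrates this on a plain dictionary; it is an assumption about the general approach rather than a copy of the script, and the parameter names are hypothetical.

```python
def extract_backbone(state_dict: dict, prefix: str = 'backbone.') -> dict:
    """Keep only entries under prefix and strip the prefix from their keys."""
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }


# Hypothetical parameter names; plain strings stand in for tensors.
checkpoint = {
    'backbone.conv1.weight': 'w0',
    'backbone.layer1.0.conv1.weight': 'w1',
    'neck.fc.weight': 'w2',
    'head.fc_cls.weight': 'w3',
}
print(extract_backbone(checkpoint))
# {'conv1.weight': 'w0', 'layer1.0.conv1.weight': 'w1'}
```

The neck and head are specific to the pre-training task, so they are discarded before the downstream task.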

### Prepare config file

Here we create a new config for the demo dataset. Ready-made config files for various benchmarks are provided in the `configs/benchmarks` directory.

In [2]:
!pwd

/home/renqin/Projects/2.0/mmselfsup/demo


In [3]:
%cd ..

/home/renqin/Projects/2.0/mmselfsup


In [4]:
# Load the basic config file
from mmengine.config import Config
benchmark_cfg = Config.fromfile('configs/benchmarks/classification/imagenet/resnet50_linear-8xb32-steplr-100e_in1k.py')

# Modify the model
checkpoint_file = 'work_dirs/selfsup/relative-loc_resnet50_8xb64-steplr-70e_in1k_colab/relative-loc_backbone-weights.pth'
# Or directly use a pre-trained model provided by us
# checkpoint_file = 'https://download.openmmlab.com/mmselfsup/moco/mocov2_resnet50_8xb32-coslr-200e_in1k_20220225-89e03af4.pth'

benchmark_cfg.model.backbone.frozen_stages = 4
benchmark_cfg.model.backbone.init_cfg = dict(type='Pretrained', checkpoint=checkpoint_file)

# The imagenet_examples dataset has no validation split, so point the
# validation dataloader at the training images and annotation file instead
benchmark_cfg.val_dataloader.dataset.data_prefix = 'train'
benchmark_cfg.val_dataloader.dataset.ann_file = 'meta/train.txt'

# Specify the learning rate scheduler
benchmark_cfg.param_scheduler = [dict(type='MultiStepLR', by_epoch=True, milestones=[1, 2])]

# Output logs for every 10 iterations
benchmark_cfg.default_hooks.logger.interval = 10

# Modify runtime settings for demo
benchmark_cfg.train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=2)


# Specify the work directory
benchmark_cfg.work_dir = './work_dirs/benchmarks/classification/imagenet/resnet50_8xb32-steplr-100e_in1k_colab'

# To make the results reproducible, set a fixed random seed and enable
# the deterministic option of cuDNN:
# benchmark_cfg.randomness = dict(seed=0, deterministic=True)
# Here the seed is left unset, so a random one is chosen for each run.
benchmark_cfg.randomness = dict(seed=None)

# cfg.gpu_ids = range(1)



### Load extracted backbone weights to start a downstream task

In [5]:
from mmengine.config import Config, DictAction
from mmengine.runner import Runner

from mmselfsup.utils import register_all_modules

# register all modules in mmselfsup into the registries
# do not init the default scope here because it will be init in the runner
register_all_modules(init_default_scope=False)

# build the runner from config
runner = Runner.from_cfg(benchmark_cfg)

# start training
runner.train()

08/30 20:51:44 - mmengine - INFO - 
------------------------------------------------------------
System environment:
    sys.platform: linux
    Python: 3.7.13 (default, Mar 29 2022, 02:18:16) [GCC 7.5.0]
    CUDA available: True
    numpy_random_seed: 93394465
    GPU 0: NVIDIA GeForce RTX 3090
    CUDA_HOME: /usr/local/cuda-11.3
    NVCC: Cuda compilation tools, release 11.3, V11.3.58
    GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
    PyTorch: 1.11.0+cu113
    PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arc

ImageClassifier(
  (data_preprocessor): ClsDataPreprocessor()
  (backbone): ResNet(
    (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (layer1): ResLayer(
      (0): Bottleneck(
        (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplac

**Note: Because the demo dataset contains only one class, the model collapses, and the reported loss and accuracy should be ignored.**