MGD in Unsupervised Learning

This experiment extends the original paper. MGD works naturally with current unsupervised learning frameworks, e.g., Momentum Contrast (MoCo) and Simple Siamese Learning (SimSiam). In this repo, we initially investigate MoCo-v2 training with MGD; the remaining parts are work in progress.

Environments

  • PyTorch 1.8.1
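
PyTorch 1.8.1 pairs with torchvision 0.9.1; one way to install both (a sketch; pick the wheel that matches your CUDA setup):

pip install torch==1.8.1 torchvision==0.9.1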

Data Preparation

Prepare the ImageNet-1K dataset following the official PyTorch ImageNet training code.

Directory Structure
`-- path/to/${ImageNet-1K}/root/folder
    `-- train
    |   |-- n01440764
    |   |-- n01734418
    |   |-- ...
    |   |-- n15075141
    `-- val
    |   |-- n01440764
    |   |-- n01734418
    |   |-- ...
    |   |-- n15075141
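
For reference, the train and val folders laid out above are typically consumed with torchvision's ImageFolder; a minimal sketch (the actual training script applies the MoCo-v2 augmentation pipeline rather than this simplified transform):

import torchvision.datasets as datasets
import torchvision.transforms as transforms

# Minimal sketch: point this at the train/ folder from the layout above.
# The real training script uses the MoCo-v2 augmentations, not this transform.
train_dataset = datasets.ImageFolder(
    "path/to/ImageNet-1K/train",
    transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
    ]),
)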

Code Preparation

cp -r ../mgd/sampler.py mgd

Unsupervised Training with MGD

Please download the pre-trained ResNet-50 weights (md5: 59fd9945, epochs: 200) from the MoCo-v2 Models and load them via the --resume argument.
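
Before launching training, the downloaded checkpoint can be sanity-checked outside the script; a sketch assuming the key layout of the publicly released MoCo checkpoints (a state_dict whose query-encoder weights carry the module.encoder_q. prefix):

import torch

# Sketch: inspect the MoCo-v2 checkpoint that will be passed to --resume.
# Key layout assumed from the publicly released MoCo checkpoints.
ckpt = torch.load("moco_v2_200ep_pretrain.pth.tar", map_location="cpu")
state_dict = ckpt["state_dict"]
encoder_q_keys = [k for k in state_dict if k.startswith("module.encoder_q.")]
print(f"found {len(encoder_q_keys)} encoder_q tensors")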

To do unsupervised pre-training of a ResNet-18 model with MGD on ImageNet on an 8-GPU machine, run:

python main_moco_mgd.py \
  -a resnet18 \
  --lr 0.03 \
  --batch-size 256 \
  --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 \
  --mlp --moco-t 0.2 --aug-plus --cos \
  --resume moco_v2_200ep_pretrain.pth.tar \
  [your imagenet-folder with train and val folders]

| method | model | pre-train epochs | training logs |
| :---: | :---: | :---: | :---: |
| MGD | ResNet-50 distills ResNet-34 | 200 | Baidu Pan [ bkr5 ] |
| MGD | ResNet-50 distills ResNet-18 | 200 | Baidu Pan [ jbcv ] |

Note:

  • The MGD distiller is powered by AMP (absolute max pooling).
  • The teacher is ResNet-50 by default.
  • The hyper-parameters of MGD, such as the loss factors, are the same as in supervised training. We did not search hyper-parameters, but according to the training logs, we believe performance can be improved by tuning them, for example, by changing the factor from 1e4 to 1e2.
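
For intuition, the factor simply weights the MGD distillation term against the MoCo contrastive objective; a hypothetical one-line illustration (variable names are not the repo's identifiers):

# Hypothetical illustration only; names are not from this repo's code.
total_loss = moco_contrastive_loss + mgd_factor * mgd_distillation_loss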

Linear Classification

This follows the linear classification protocol of MoCo-v2 (a command sketch is given after the table). Linear classification results on ImageNet using this repo with 8 NVIDIA TITAN Xp GPUs:

| method | model | pre-train epochs | MoCo v2 top-1 acc. | MoCo v2 top-5 acc. |
| :---: | :---: | :---: | :---: | :---: |
| Teacher | ResNet-50 | 200 | 67.5 | - |
| Student | ResNet-34 | 200 | 57.2 | 81.5 |
| MGD | ResNet-34 | 200 | 58.5 | 82.7 |
| Student | ResNet-18 | 200 | 52.5 | 77.0 |
| MGD | ResNet-18 | 200 | 53.6 | 78.7 |
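
For reference, a typical MoCo-v2 linear evaluation launch looks like the following (a sketch based on the MoCo repository's main_lincls.py; swap -a for resnet18 or resnet34 and point --pretrained at the distilled checkpoint to reproduce the student rows):

python main_lincls.py \
  -a resnet50 \
  --lr 30.0 \
  --batch-size 256 \
  --pretrained [your checkpoint path] \
  --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 \
  [your imagenet-folder with train and val folders]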

Update Schedule

The schedule for updating the MGD matching matrix differs from that in the original paper. We scale it with a log function, i.e., we update the matching matrix at epochs [1, 2, 3, 6, 9, 15, 26, 43, 74, 126].
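
In code, this amounts to re-solving the matching at a fixed, log-spaced set of epochs; a minimal sketch using the list above (the function name is illustrative, not the repo's own):

# Epochs (from the list above) at which the matching matrix is re-solved.
MGD_UPDATE_EPOCHS = {1, 2, 3, 6, 9, 15, 26, 43, 74, 126}

def should_update_matching(epoch: int) -> bool:
    # True when the matching matrix should be updated at this epoch.
    return epoch in MGD_UPDATE_EPOCHS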