PyTorch Code for the paper:
"Extending Contrastive Learning to Unsupervised Coreset Selection"
Jeongwoo Ju, Heechul Jung, Yoonju Oh and Junmo Kim
This code builds on SVP from Stanford.
We implemented our unsupervised coreset selection (UCS) algorithm on top of the original code.
@article{ju2021extending,
title={Extending Contrastive Learning to Unsupervised Coreset Selection},
author={Ju, Jeongwoo and Jung, Heechul and Oh, Yoonju and Kim, Junmo},
journal={arXiv preprint arXiv:2103.03574},
year={2021}
}
- Linux or macOS (Windows support is experimental)
- Python 3.6+
- PyTorch 0.4.1
- TorchVision 0.2.1
- CUDA 9.1
File Name | Description |
---|---|
run_sim_core_svhn.sh | UCS w/ SimCLR on SVHN |
run_sim_core_qmnist.sh | UCS w/ SimCLR on QMNIST |
run_sim_core_cifar.sh | UCS w/ SimCLR on CIFAR10 |
run_moco_core_svhn.sh | UCS w/ MoCo on SVHN |
run_moco_core_qmnist.sh | UCS w/ MoCo on QMNIST |
run_moco_core_cifar.sh | UCS w/ MoCo on CIFAR10 |
Folder Name | Description |
---|---|
loss | coreset score for SimCLR and MoCo |
index | example indices for each dataset |
See the shell scripts in the main branch. For example, ./run_sim_core_svhn.sh is as follows:
for subsize in 22500 30000 37500 45000 52500
do
    for run in 1 2 3 4 5
    do
        CUDA_VISIBLE_DEVICES=2 python3 -m svp.svhn active \
            --run-dir ./run/svhn/resnet18/simclr \
            --dataset svhn \
            --datasets-dir './data' \
            --arch resnet18 \
            --num-workers 4 \
            --weighted-loss False \
            --coreset-path "./index/simclr_coreset_svhn_run${run}.index" \
            --coreset-loss-path "./loss/simclr_coreset_svhn_run${run}.loss" \
            --runs "$run" \
            --initial-subset "$subsize" \
            --eval-target-at "$subsize" 2>&1 | tee "./log_svhn_test_simclr_coreset_resnet18_subsize${subsize}_run${run}.txt"
    done
done
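The script above sweeps five subset sizes and five runs, which gives 25 invocations in total. As a rough illustration of that grid, the following Python sketch builds the same argument lists without launching anything. The names `SUBSET_SIZES`, `RUNS`, `build_command`, and `build_grid` are hypothetical helpers introduced here and are not part of this repository; the flags and paths are copied from the shell script.

```python
import shlex

# Values copied from run_sim_core_svhn.sh.
SUBSET_SIZES = (22500, 30000, 37500, 45000, 52500)
RUNS = (1, 2, 3, 4, 5)


def build_command(subsize, run):
    """Return the argument list for one (subset size, run) pair.

    Hypothetical helper: it only mirrors the flags used in the shell script.
    """
    return [
        "python3", "-m", "svp.svhn", "active",
        "--run-dir", "./run/svhn/resnet18/simclr",
        "--dataset", "svhn",
        "--datasets-dir", "./data",
        "--arch", "resnet18",
        "--num-workers", "4",
        "--weighted-loss", "False",
        "--coreset-path", f"./index/simclr_coreset_svhn_run{run}.index",
        "--coreset-loss-path", f"./loss/simclr_coreset_svhn_run{run}.loss",
        "--runs", str(run),
        "--initial-subset", str(subsize),
        "--eval-target-at", str(subsize),
    ]


def build_grid():
    """Enumerate commands in the same order as the nested shell loops."""
    return [build_command(s, r) for s in SUBSET_SIZES for r in RUNS]


# Dry run: print each command instead of executing it.
for argv in build_grid():
    print(" ".join(shlex.quote(a) for a in argv))
```

Printing the commands first is a cheap way to check that the index and loss files for each run exist before starting a long sweep.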
CIFAR10: https://www.cs.toronto.edu/~kriz/cifar.html
SVHN: download train_32x32.mat and test_32x32.mat from http://ufldl.stanford.edu/housenumbers/
QMNIST: download the dataset using torchvision.
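For the QMNIST step, a minimal sketch of the torchvision download might look like the following. It assumes a torchvision release that ships `torchvision.datasets.QMNIST`, and it assumes `./data` as the root because the example script passes `--datasets-dir './data'`; the exact subdirectory layout this code expects is not confirmed here. `download_qmnist` is a hypothetical helper, not a function from this repository.

```python
from pathlib import Path


def download_qmnist(root="./data", splits=("train", "test")):
    """Download the requested QMNIST splits and return them by name.

    Hypothetical helper. torchvision is imported lazily so that the
    function can be defined even where torchvision is not installed.
    """
    from torchvision import datasets

    root_path = Path(root)
    root_path.mkdir(parents=True, exist_ok=True)

    # `what` selects the split; `download=True` fetches missing files.
    return {
        split: datasets.QMNIST(str(root_path), what=split, download=True)
        for split in splits
    }
```

Calling `download_qmnist()` once before running the QMNIST scripts should populate the data directory; if the files end up in a different location than the training code looks for, adjust `root` accordingly.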
Figures: coreset selection performance on CIFAR10, SVHN, and QMNIST.