by Xudong Wang*, Long Lian*, and Stella X. Yu at UC Berkeley/ICSI. (*: co-first authors)
Arxiv Paper | ECCV Paper | Poster | Video | Citation
European Conference on Computer Vision (ECCV), 2022.
This work is also presented in CV in the Wild workshop in ECCV 2022.
This repository contains the code for USL on CIFAR. Other implementations are coming soon.
Note that even if you only work on proposing new SSL methods, you can try our unsupervised selected samples as the labeled data with your SSL methods to get a free boost with the same number of labeled samples. We have pre-extracted the labeled split and nothing is needed to run.
Our unsupervised selection (USL and USL-T) is available in a plug-and-play format for CIFAR and ImageNet here. We have shown that our method works off-the-shelf with SimCLR, SimCLR-CLD, FixMatch, MixMatch, CoMatch, etc and it should also work for other newer methods.
For further information regarding the paper, please contact Xudong Wang. For information regarding the code and implementation, please contact Long Lian.
- USL-T is fully supported in this implementation. Pretraining weights and trained models are available.
- Selected sample indices on USL-T are added for reference (note that USL-T is training-based and thus gives different results on different runs)
- Poster and video are added (see above)
- ImageNet scripts, intermediate results, final results, and FixMatch checkpoints are added
- Provided CLD pretrained model and reference selections
- Initial Implementation
- USL
- USL-T
- FixMatch
- SimCLRv2
- SimCLRv2-CLD
- CIFAR-10
- CIFAR-100
- ImageNet100
- ImageNet
Install the required packages:
pip install -r requirements.txt
You also need to install clip
if you want to use clip
models:
pip install ftfy regex tqdm
pip install git+https://github.com/openai/CLIP.git
For ImageNet, you need to change the data path to your data path in the config. For CIFAR, it will download the data automatically.
mkdir selective_labeling/pretrained
cd selective_labeling/pretrained
# CLD checkpoint on CIFAR-10
wget https://people.eecs.berkeley.edu/~longlian/files/cifar10_ckpt_epoch_200.pth
# CLD checkpoint on CIFAR-100
wget https://people.eecs.berkeley.edu/~longlian/files/cifar100_ckpt_epoch_200.pth
cd selective_labeling
# CIFAR-10
python usl-cifar.py --cfg configs/cifar10_usl.yaml
# CIFAR-100
python usl-cifar.py --cfg configs/cifar100_usl.yaml
You can also find the config for SimCLR in semisup-simclrv2-cld/configs
. The implementation for FixMatch is in semisup-fixmatch-cifar
.
cd semisup-simclrv2-cld
# CIFAR-10
python fine_tune.py --cfg semisup-simclrv2-cld/configs/cifar10_usl-t_finetune.yaml
# CIFAR-100
python fine_tune.py --cfg semisup-simclrv2-cld/configs/cifar100_usl-t_finetune.yaml
mkdir selective_labeling/pretrained
cd selective_labeling/pretrained
# MoCov2 checkpoint on ImageNet (with EMAN as normalization)
wget https://eman-cvpr.s3.amazonaws.com/models/res50_moco_eman_800ep.pth.tar
CLIP models will be downloaded at the first run with USL-CLIP config.
This step is optional but recommended to refrain from recomputing the feature from the dataset. Furthermore, it relieves the non-deterministic behavior from obtaining feature, kNN, and clustering, since GPU ops lead to non-deterministic behavior (even though seed is set), which is more prominent for large datasets where more compute is used.
We provide intermediate results after obtaining feature, kNN, k-Means, and the final selected indices in numpy and csv format. Intermediate results can be obtained by:
USL-MoCo experiments
mkdir -p selective_labeling/saved/imagenet_usl_moco_0.2
cd selective_labeling/saved/imagenet_usl_moco_0.2
wget https://people.eecs.berkeley.edu/~longlian/files/usl_imagenet_moco_0.2_intermediate.zip
unzip usl_imagenet_moco_0.2_intermediate.zip
cd ../../..
Please also download the precomputed MoCov2 feature here and unzip memory_feats_list.npy
into selective_labeling/saved/imagenet_usl_moco_0.2
.
USL-CLIP experiments
mkdir -p selective_labeling/saved/imagenet_usl_clip_0.2
cd selective_labeling/saved/imagenet_usl_clip_0.2
wget https://people.eecs.berkeley.edu/~longlian/files/usl_imagenet_clip_0.2_intermediate.zip
unzip usl_imagenet_clip_0.2_intermediate.zip
cd ../../..
Please also download the precomputed CLIP feature here and unzip memory_feats_list.npy
into selective_labeling/saved/imagenet_usl_clip_0.2
.
You can also obtain the intermediate and final results here.
If this step is skipped (i.e., compute everything from scratch), RECOMPUTE_ALL
and RECOMPUTE_NUM_DEP
need to be set to True in the config.
cd selective_labeling
python usl-imagenet.py
cd selective_labeling
python usl-imagenet.py --cfg configs/ImageNet_usl_clip_0.2.yaml
Use the script here. The output csv
file is compatible with the split of labeled dataset.
To make our results more comparable, we release our selected indices and trained models, see model zoo below.
mkdir -p usl-t_pretraining/pretrained
wget https://people.eecs.berkeley.edu/~longlian/files/pretrained_kNN_cifar10.tgz
tar zxvf pretrained_kNN_cifar10.tgz
wget https://people.eecs.berkeley.edu/~longlian/files/pretrained_kNN_cifar100.tgz
tar zxvf pretrained_kNN_cifar100.tgz
- USL-T Training (also called pretraining because it is prior to training in semi-supervised learning):
cd usl-t_pretraining
# CIFAR-10
python usl-t-cifar-pretrain.py --cfg configs/cifar10_usl-t_pretrain.yaml
# CIFAR-100
python usl-t-cifar-pretrain.py --cfg configs/cifar100_usl-t_pretrain.yaml
cd ..
- Sample selection:
cd ../selective_labeling
# CIFAR-10
python usl-t-cifar.py --cfg configs/cifar10_usl-t.yaml
# CIFAR-100
python usl-t-cifar.py --cfg configs/cifar100_usl-t.yaml
cd ..
# CIFAR-10
python fine_tune.py --cfg configs/cifar10_usl-t_finetune.yaml
# CIFAR-100
python fine_tune.py --cfg configs/cifar100_usl-t_finetune.yaml
mkdir -p usl-t_pretraining/pretrained
wget https://people.eecs.berkeley.edu/~longlian/files/pretrained_kNN_imagenet.tgz
tar zxvf pretrained_kNN_imagenet.tgz
wget https://people.eecs.berkeley.edu/~longlian/files/pretrained_kNN_imagenet100.tgz
tar zxvf pretrained_kNN_imagenet100.tgz
- Training:
cd usl-t_pretraining
# ImageNet100
python usl-t-imagenet-pretrain.py --cfg configs/ImageNet100_usl-t_pretrain.yaml
# ImageNet
python usl-t-imagenet-pretrain.py --cfg configs/ImageNet_usl-t_pretrain.yaml
cd ..
- Sample selection:
cd ../selective_labeling
# CIFAR-10
python usl-t-imagenet.py --cfg ImageNet100_usl-t_0.3.yaml
# CIFAR-100
python usl-t-imagenet.py --cfg ImageNet_usl-t_0.2.yaml
cd ..
- SimCLR weights
r50_1x_sk0.pth
can be downloaded at here and should be put under pretrained.
cd semisup-simclrv2
# ImageNet-100
python fine_tune.py --lr 0.16 --seed 0 --name-to-save "usl-t_0.3p_imagenet100" --trainindex_x "[path to train_0.3p_gen_imagenet100_usl-t_0.3_index.csv]" -j 16 --batch-size 256 --eval-freq 48 --epochs 48 --warmup-epochs 0 --weight-decay 0.0001 --dist-url 'tcp://localhost:10005' --multiprocessing-distributed --world-size 1 --rank 0 --pretrain pretrained/r50_1x_sk0.pth --roll-data 20 --amp-opt-level O1 --roll-data 40 --num-classes 100 --freeze-backbone [Path to ImageNet 100]
# ImageNet
python fine_tune.py --lr 0.16 --seed 0 --name-to-save "usl-t_0.2p_imagenet" --trainindex_x "[path to train_0.2p_gen_imagenet_usl-t_0.2_index.csv]" -j 16 --batch-size 512 --eval-freq 6 --epochs 12 --warmup-epochs 0 --weight-decay 0 --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 --pretrain pretrained/r50_1x_sk0.pth --multi-epochs-data-loader --amp-opt-level O1 --no-lr-decay [Path to ImageNet]
Note that both USL and USL-T have some randomness due to non-deterministic computations in GPU and could vary between each run or server, despite setting the seed. Therefore, we release samples selected by USL run on our end. The indices are in torchvision CIFAR dataset order (the order commonly used in most PyTorch SSL implementations).
These are the instance indices by the torch CIFAR-10 dataset.
Random indices on CIFAR-10
Seed 1 (class distribution [1, 6, 5, 3, 1, 3, 5, 4, 6, 6]):[26247, 35067, 34590, 16668, 12196, 2600, 9047, 2206, 25607,
11606, 3624, 43027, 15190, 25816, 26370, 1281, 29433, 36256,
34217, 39950, 6756, 26652, 3991, 40312, 24580, 4949, 18783,
39205, 23784, 39508, 19062, 48140, 11314, 766, 39319, 15299,
10298, 25573, 18750, 19298]
Seed 2 (class distribution [4, 2, 6, 5, 7, 1, 5, 2, 4, 4]):
[23656, 27442, 40162, 8459, 8051, 42404, 89, 1461, 13519,
42536, 20817, 45479, 3121, 36502, 40119, 35971, 8784, 14084,
4063, 18730, 17763, 29366, 43841, 10741, 3986, 40475, 8470,
35621, 30892, 27652, 35359, 24435, 47853, 8835, 6572, 36456,
8750, 21067, 4337, 24908]
Seed 5 (class distribution [6, 6, 2, 3, 5, 3, 5, 2, 2, 6]):
[24166, 42699, 15927, 7473, 5070, 33926, 21409, 9495, 16235,
35747, 46288, 13560, 29644, 28992, 35350, 43077, 35757, 24106,
26555, 22478, 1951, 29145, 33373, 10043, 21988, 37116, 15760,
48939, 29761, 3702, 3273, 4175, 30998, 31012, 8754, 33953,
22206, 28896, 31669, 19275]
Seed 3 and 4 are not selected because seed 3 and seed 4 do not lead to instances of 10 classes for random selection and thus the comparison would not bring us much insights. Note that seed 3 and 4 lead to instances of 10 classes for our selection.
Note that these can be obtained by selective_labeling/random-cifar.py
.
USL indices on CIFAR-10
Seed 1 (class distribution [5, 4, 5, 2, 2, 5, 5, 4, 3, 5]):[ 3301, 37673, 33436, 28722, 10113, 5286, 21957, 13485, 445,
48678, 43647, 27879, 39987, 14374, 32536, 14741, 38215, 22102,
23082, 16734, 7409, 881, 10912, 37632, 39363, 7119, 6203,
28474, 25040, 43960, 24780, 45742, 49642, 25728, 9297, 21092,
4689, 4712, 48444, 30117]
Seed 2 (class distribution [4, 4, 4, 3, 3, 5, 4, 5, 3, 5]):
[19957, 40843, 45218, 881, 4557, 6203, 11400, 14374, 27595,
21092, 41009, 38215, 35471, 49642, 25728, 28722, 17094, 48678,
43960, 39363, 43647, 3907, 16734, 48023, 3301, 22102, 37632,
21130, 3646, 14741, 7127, 9297, 11961, 39987, 4712, 45568,
39908, 23505, 48421, 33436]
Seed 5 (class distribution [4, 5, 4, 3, 3, 4, 4, 4, 4, 5]):
[38215, 43213, 39363, 27965, 445, 16734, 14374, 914, 17063,
45918, 3301, 5286, 32457, 19867, 48678, 10455, 43647, 10912,
28722, 4712, 29946, 1221, 3907, 10110, 20670, 13410, 4689,
49642, 10018, 41210, 43755, 46227, 11961, 15682, 45742, 21092,
9692, 48023, 14741, 2703]
Seed 3 and 4 are not selected because seed 3 and seed 4 do not lead to instances of 10 classes for random selection and thus the comparison would not bring us much insights. Note that seed 3 and 4 lead to instances of 10 classes for our selection.
Note that these can be obtained by selective_labeling/usl-cifar.py
.
USL-T indices on CIFAR-10
Class distribution [4, 4, 4, 4, 4, 4, 4, 4, 4, 4]:[7998, 45774, 27115, 8389, 28558, 8454, 12390, 42528, 28249,
12885, 25101, 39912, 19571, 7904, 43637, 3267, 6935, 21794,
24489, 13999, 24554, 19979, 1573, 36597, 5403, 44836, 29500,
16935, 9408, 47504, 35673, 20778, 44636, 37123, 49130, 8086,
39994, 8499, 48597, 7753]
Note that both USL and USL-T have some randomness due to non-deterministic computations in GPU and could vary between each run or server, despite setting the seed. Therefore, we release samples selected by USL run on our end. The indices are in torchvision CIFAR dataset order (the order commonly used in most PyTorch SSL implementations).
Random indices on CIFAR-100
Class distribution [ 2, 4, 4, 3, 3, 4, 8, 7, 6, 1, 2, 3, 4, 3, 4, 4, 2, 3, 2, 0, 1, 2, 3, 5, 4, 4, 6, 5, 7, 2, 5, 2, 2, 10, 7, 3, 2, 2, 2, 7, 3, 7, 4, 1, 5, 4, 4, 2, 4, 4, 13, 5, 4, 4, 4, 3, 2, 3, 7, 4, 4, 1, 4, 7, 2, 9, 3, 5, 3, 4, 5, 4, 1, 7, 6, 3, 2, 3, 6, 2, 4, 1, 2, 2, 3, 3, 5, 3, 7, 2, 6, 5, 6, 5, 4, 3, 5, 7, 5, 4]:[24166, 42699, 15927, 7473, 5070, 33926, 21409, 9495, 16235, 35747, 46288, 13560, 29644, 28992, 35350, 43077, 35757, 24106, 26555, 22478, 1951, 29145, 33373, 10043, 21988, 37116, 15760, 48939, 29761, 3702, 3273, 4175, 30998, 31012, 8754, 33953, 22206, 28896, 31669, 19275, 49843, 36020, 22750, 7287, 17858, 2307, 44346, 12850, 23481, 19587, 40928, 40633, 35091, 42855, 27790, 5834, 3866, 19436, 31884, 37267, 23120, 33086, 4440, 27294, 3067, 45307, 7740, 30408, 12046, 7469, 8633, 991, 36400, 33945, 43384, 619, 295, 6399, 399, 34346, 43381, 21838, 3031, 951, 15485, 41403, 18809, 37656, 31150, 42183, 25731, 27322, 26964, 12174, 10713, 21227, 11015, 39811, 40368, 28291, 29288, 16029, 21946, 9705, 21352, 33934, 37044, 24317, 36476, 25635, 7886, 4473, 41868, 26742, 29129, 38795, 1719, 4790, 31217, 14773, 16794, 47168, 27679, 42505, 10825, 31802, 49931, 5575, 28614, 6435, 44541, 5085, 17016, 42117, 38402, 29967, 25431, 25062, 33502, 15008, 10109, 8426, 28882, 9342, 6308, 35548, 12832, 24762, 33338, 42121, 46170, 17984, 42630, 29659, 7446, 28009, 23580, 38684, 43558, 888, 20080, 11419, 46386, 34533, 6104, 17931, 4785, 4495, 47383, 17126, 29605, 37074, 14999, 27624, 20763, 33869, 3823, 36053, 20029, 19489, 40526, 9001, 36766, 5168, 38401, 47320, 24840, 43386, 35017, 47749, 36024, 33553, 45709, 28068, 11612, 26724, 19599, 17505, 30267, 28034, 6525, 20387, 42058, 13573, 20016, 18810, 6579, 37825, 11653, 43824, 4535, 9211, 17387, 32223, 39727, 26713, 27986, 42511, 17841, 42043, 17476, 30838, 22163, 9398, 11837, 17402, 44474, 46779, 34172, 49624, 13008, 814, 14162, 37951, 17370, 47018, 23821, 40787, 4148, 49582, 1733, 14839, 15748, 23940, 2074, 17809, 10011, 28631, 26640, 6567, 1893, 17286, 41825, 4060, 29351, 42622, 34632, 24497, 41630, 8892, 43350, 21663, 36068, 17874, 14887, 6986, 574, 40534, 6410, 19994, 40707, 4914, 25078, 33826, 20188, 40485, 42960, 42533, 1500, 16193, 13313, 10466, 26284, 43916, 47801, 15414, 7746, 44384, 21855, 39092, 29991, 18908, 25548, 19031, 16423, 31446, 24607, 46107, 37059, 48936, 46619, 37714, 34645, 49173, 20050, 23766, 15251, 29339, 37963, 9887, 16646, 8353, 5072, 14002, 16087, 24325, 17034, 24853, 3059, 8437, 10524, 32249, 21469, 12733, 30641, 43425, 402, 20134, 33961, 31995, 14819, 12533, 43068, 49730, 18873, 49475, 49092, 49282, 26051, 9070, 3479, 9355, 8142, 27134, 29154, 44421, 134, 7139, 42677, 45885, 35052, 28516, 9571, 17065, 38029, 45069, 26723, 12765, 1041, 33472, 43872, 5377, 45825, 18063, 12945, 32732, 30400, 2353, 44644, 21439, 14658, 25817, 35673, 36260, 22863, 25272, 3251, 16628, 36201, 49088, 33481, 36333, 39695, 32444, 28474, 12797, 32118, 37912, 22011, 3196, 36365, 12075, 47069, 15728, 534, 44533, 3113, 12785, 43732, 25765]
USL indices on CIFAR-100
Class distribution [5, 4, 3, 4, 4, 4, 5, 4, 4, 2, 3, 8, 4, 2, 8, 2, 5, 3, 2, 2, 2, 2, 2, 2, 5, 4, 4, 6, 5, 2, 3, 3, 5, 5, 4, 3, 3, 4, 6, 3, 1, 5, 6, 6, 4, 5, 3, 3, 6, 5, 4, 3, 9, 2, 7, 2, 2, 3, 5, 1, 4, 3, 5, 3, 4, 7, 2, 6, 4, 3, 4, 4, 4, 3, 5, 3, 4, 4, 4, 5, 3, 6, 3, 2, 4, 2, 4, 5, 5, 7, 4, 4, 5, 4, 6, 4, 7, 2, 6, 3]:[22206, 15361, 24078, 22803, 10747, 21048, 27300, 1106, 31810, 41199, 8736, 47459, 5900, 23184, 210, 47195, 40234, 7629, 6458, 31732, 41228, 27974, 23980, 30146, 1368, 31891, 46592, 40815, 20847, 8155, 18011, 29160, 22693, 46498, 37427, 2006, 22506, 2302, 11743, 4500, 32338, 31112, 26553, 36430, 47351, 41152, 46279, 46028, 2149, 11125, 48700, 12325, 47166, 47410, 4290, 33275, 11410, 23998, 22278, 13267, 715, 35417, 8517, 20521, 7508, 9462, 27757, 20893, 49175, 29673, 6477, 8255, 36917, 21076, 30512, 7195, 17986, 9877, 31824, 4340, 40584, 20741, 51, 12714, 48609, 37284, 45509, 40413, 47958, 14536, 23990, 1871, 44686, 48318, 5245, 35253, 2519, 20553, 28140, 13514, 13276, 21972, 5423, 47901, 17726, 40743, 7859, 47350, 43054, 31207, 49312, 4780, 9741, 13351, 9022, 38961, 10975, 40218, 16656, 12985, 35080, 21969, 40436, 8160, 20180, 17159, 49497, 4600, 12640, 9306, 27679, 36206, 12959, 32090, 3049, 30026, 25417, 40116, 34101, 21479, 9083, 5768, 19417, 3441, 18566, 33689, 49152, 36668, 22747, 6087, 16483, 20472, 14240, 6790, 32485, 34823, 5591, 26852, 22707, 12774, 45823, 34676, 10430, 10683, 39252, 25088, 6682, 12444, 26152, 3818, 34555, 3232, 47836, 12584, 18753, 28289, 46985, 40834, 22674, 25657, 29516, 29945, 48491, 18271, 36346, 44892, 2599, 16124, 37785, 19595, 4331, 12955, 31965, 48730, 563, 26109, 2580, 27360, 8033, 32350, 29868, 12805, 25414, 32947, 612, 38133, 41872, 14239, 28040, 43421, 49358, 45928, 17874, 30130, 8343, 48325, 34903, 16680, 22661, 5929, 868, 28565, 25180, 27498, 25216, 30783, 17604, 41793, 14387, 36673, 24815, 9966, 44087, 27517, 34323, 19583, 13727, 28632, 13872, 9738, 42044, 12358, 2878, 44860, 30303, 30007, 15769, 2538, 27205, 33147, 5308, 41664, 14280, 8903, 38737, 30401, 7952, 19026, 22712, 14610, 23180, 43114, 1843, 30665, 16331, 19947, 24107, 3009, 13682, 16944, 30998, 32069, 3618, 2679, 13812, 11975, 1846, 34370, 32934, 36766, 47276, 38654, 12141, 22821, 38348, 30562, 6064, 4892, 39669, 35458, 43696, 10557, 5427, 1707, 45805, 7727, 44672, 47255, 47831, 12951, 16570, 36288, 8138, 49012, 14205, 34474, 41050, 10882, 34830, 1627, 3784, 25923, 11883, 18593, 14734, 37586, 8442, 46250, 30750, 41352, 10437, 39812, 8550, 49114, 25148, 6764, 16857, 16947, 17439, 27469, 34486, 5103, 27571, 46567, 433, 5576, 39624, 32810, 38208, 10233, 34721, 35500, 7198, 16577, 36927, 1364, 22608, 48528, 41758, 32909, 44279, 5844, 36946, 3974, 12468, 44529, 5998, 2176, 45598, 49136, 22725, 26274, 19532, 17104, 14411, 2376, 36061, 26912, 19367, 38488, 7345, 23549, 19293, 20523, 28316, 21374, 41267, 44467, 19224, 34958, 29115, 9603, 2237, 44402, 41499, 23403, 38756, 5096, 6143, 13455, 29615, 25857, 15402, 2151, 10046, 790, 21690, 7772, 46735, 5318]
USL-T indices on CIFAR-100
Class distribution [5, 5, 4, 4, 4, 5, 6, 2, 4, 3, 2, 5, 2, 4, 3, 3, 4, 5, 6, 3, 3, 6, 3, 3, 6, 2, 4, 3, 3, 4, 4, 5, 4, 4, 2, 3, 4, 3, 5, 6, 2, 6, 3, 6, 5, 4, 0, 7, 4, 4, 5, 3, 4, 6, 2, 1, 3, 3, 3, 2, 4, 6, 7, 5, 2, 4, 4, 6, 4, 3, 6, 5, 4, 1, 5, 6, 4, 3, 3, 2, 3, 5, 4, 2, 5, 5, 3, 4, 6, 3, 5, 4, 3, 2, 6, 7, 4, 5, 5, 6]:[21762, 16447, 49147, 10389, 18413, 35442, 33474, 46464, 8338, 14525, 34628, 47555, 29499, 33006, 15704, 39226, 22682, 20220, 39482, 28379, 11077, 5201, 46545, 48978, 46444, 23451, 6294, 20971, 42100, 43381, 6904, 9155, 8731, 25248, 11694, 22934, 15677, 41072, 36827, 19802, 4593, 4847, 8897, 21525, 12729, 8578, 28329, 20981, 49514, 2711, 30004, 38806, 1723, 2943, 26878, 7087, 19209, 49254, 6446, 31422, 20969, 26117, 23572, 6410, 26734, 40907, 8296, 31978, 25857, 31126, 15024, 35952, 19102, 34214, 37404, 33551, 8454, 21571, 37177, 37070, 22569, 6519, 47011, 43725, 32006, 12986, 34865, 4324, 9779, 46093, 22246, 41444, 46387, 13288, 24731, 10286, 46384, 7574, 24220, 21826, 6140, 19237, 616, 27663, 46045, 30240, 48781, 35184, 6081, 42246, 1458, 48719, 9891, 14490, 21687, 24246, 13089, 8257, 15239, 1873, 39691, 25481, 24495, 8455, 38323, 13807, 6036, 25472, 14880, 2797, 12212, 48310, 47899, 39238, 21455, 7343, 9462, 508, 38205, 9973, 29254, 1797, 34802, 11474, 44070, 17242, 38568, 4101, 44808, 20624, 19976, 43584, 17020, 10984, 10756, 35500, 26976, 25107, 30896, 26283, 40935, 35799, 41217, 12935, 24193, 43840, 7911, 28369, 35101, 1843, 45917, 14696, 2868, 49554, 28012, 35154, 2355, 24112, 8203, 31274, 6933, 29179, 42669, 31020, 12493, 45961, 1476, 17077, 15287, 41520, 21172, 39689, 33892, 20100, 11246, 33001, 29609, 3614, 29688, 34446, 13096, 7953, 12863, 38488, 23459, 38311, 45879, 12076, 3853, 14705, 25755, 35035, 18426, 42637, 4375, 29598, 24182, 27208, 47842, 46663, 49903, 24443, 1681, 22698, 19446, 16168, 22036, 7260, 45326, 30289, 40054, 25173, 12881, 39212, 16684, 46354, 17429, 30140, 38577, 43055, 6243, 41306, 23593, 16979, 31824, 48917, 2340, 49058, 44091, 25068, 20199, 31109, 6841, 31451, 4944, 18916, 15397, 39035, 16920, 34210, 21804, 13167, 13405, 8312, 5271, 40302, 36489, 18208, 23850, 16257, 5103, 33416, 26902, 19625, 7576, 30412, 27098, 7135, 46662, 46237, 2013, 13923, 39640, 13251, 2812, 16878, 48923, 24540, 30728, 26896, 20537, 37602, 48008, 7862, 30725, 3899, 10048, 29422, 44526, 38679, 13503, 22859, 13350, 10245, 27467, 21403, 4895, 39100, 44376, 15669, 37786, 1323, 807, 28568, 18199, 4054, 31353, 4424, 4367, 21359, 7092, 15131, 41729, 33263, 16688, 23562, 31033, 26816, 24209, 32544, 44294, 5649, 34927, 44546, 32031, 47751, 17975, 30215, 30740, 49699, 28670, 40664, 2570, 27057, 19430, 34764, 4413, 15967, 32042, 41441, 48957, 22367, 18955, 32866, 41457, 7634, 18922, 31547, 19411, 33710, 42915, 11569, 2808, 9905, 32939, 46386, 21291, 45922, 43343, 46521, 742, 36504, 35395, 4119, 22490, 49340, 40765, 15719, 35317, 38737, 30610, 4798, 283, 45332, 41408, 24516, 48768, 32476, 40236, 23567, 13674, 30089, 21409, 40468, 18040, 46783, 29182, 28345, 14026, 34255]
The random selection and USL selected samples on ImageNet (in csv
format) could be obtained here.
Update: USL-T selected samples on ImageNet is available at here.
Note that USL does not require training, so the weights are just pretraining weights (e.g., with MoCo/SimCLR/CLD).
Dataset | Models for Sample Selection |
---|---|
CIFAR-10 | Download (CLD) |
CIFAR-100 | Download (CLD) |
ImageNet | Download (MoCov2-EMAN) |
These weights can be directly parsed by selective_labeling/usl-{cifar,imagenet}.py
to get selections. The selection should match with the selections released (besides slight variations due to indeterministic operations in GPUs).
You can obtain the EMAN-FixMatch trained model for baselines/Stratified/USL-MoCo/USL-CLIP on ImageNet here.
Dataset | Models for Sample Selection | Results of this re-implementation | Results in the work |
---|---|---|---|
CIFAR-10 | Download | SimCLR-CLD: 76.1 | SimCLR-CLD: 76.1 |
CIFAR-100 | Download | SimCLR-CLD: 36.1 | SimCLR-CLD: 36.9 |
ImageNet | Download | SimCLR: 42.0 | SimCLR: 41.3 |
These weights can be directly parsed by selective_labeling/usl-t-{cifar,imagenet}.py
to get selections. The selection should mostly match with the selections released (besides slight variations due to indeterministic operations in GPUs).
If you find our work inspiring or use our codebase in your research, please consider giving a star ⭐ and a citation.
@inproceedings{wang2022unsupervised,
title={Unsupervised Selective Labeling for More Effective Semi-Supervised Learning},
author={Wang, Xudong and Lian, Long and Yu, Stella X},
booktitle={European Conference on Computer Vision},
pages={427--445},
year={2022},
organization={Springer}
}
This is a re-implementation of our original codebase. If you have any questions about the implementation in this repo, feel free to email us at longlian at berkeley.edu
.
I'm also happy to provide additional checkpoints for baselines and selected labels.
This project is licensed under the MIT license. See LICENSE for more details. The parts described below follow their original license.
This project uses code fragments from many projects. See credits in comments for the code fragments and adhere to their own LICENSE. The code that is written by the authors of this project is licensed under the MIT license.
We thank the authors of the following projects that we referenced in our implementation:
- Swin-Transformer for the overall configuration framework.
- SCAN for augmentations, auxiliary dataloader wrappers, and many utility functions in USL-T.
- PAWS for the use of sharpening function.
- MoCov2 for augmentations.
- pykeops for kNN and k-Means.
- FixMatch-pytorch for EMA.
- SimCLRv2-pytorch for extracting and using SimCLRv2 weights
- Other functions listed with their own credits.