How does FixMatch compare to supervised methods? #52

Closed
ElimsV opened this issue Jan 28, 2021 · 3 comments

@ElimsV

ElimsV commented Jan 28, 2021

Thanks for sharing the great work.

It might be interesting to compare FixMatch with supervised learning methods, given the same labeled data and model structure, to further demonstrate the power of SSL.

I would appreciate it if you could comment on the idea or release experimental results for such a comparison.

Regards,

@carlini
Collaborator

carlini commented Feb 2, 2021

I'm not quite sure I follow the question. Do you want to know how well our methods do in comparison to a fully supervised baseline that uses, say, just 40 or just 250 images? This is given in Table 9 and Table 10.

If instead you mean fully supervised on the entire dataset, then these numbers are reported in our ReMixMatch paper, Appendix Section 4.1:

To begin, we train a fully-supervised baseline to measure the highest accuracy we could hope to obtain with our training pipeline. The experiments we perform use the same model and training algorithm, so these baselines are valid for all discussed SSL techniques. On CIFAR-10, we obtain a fully-supervised error rate of 4.25% using weak flip + crop augmentation, which drops to 3.62% using AutoAugment and 3.91% using CTAugment. Similarly, on SVHN we obtain 2.70% error using weak (flip) augmentation and 2.31% and 2.16% using AutoAugment and CTAugment respectively. While AutoAugment performs slightly better on CIFAR-10 and slightly worse on SVHN compared to CTAugment, it is not our intent to design a better augmentation strategy; just one that can be used without pre-training or tuning of hyper-parameters.

@ElimsV
Author

ElimsV commented Feb 3, 2021

Actually I meant both. Thanks for the considerate response.

I implemented FixMatch with RandAugment in my own codebase, for a classification task on real-world data with an imbalanced distribution (4 classes, majority:minority ratio up to 50:1). The training set is composed of balanced/imbalanced labeled data plus imbalanced unlabeled data (sampled randomly from the entire dataset).

I tried a coarse grid search over hyperparameters (weight decay, tau, learning rate, lambda_u, augmentation strength), but had no luck: FixMatch did not really improve classification performance compared to a purely supervised model. A few observations:

  1. With too-strong augmentation, the model cannot learn much from pseudo labels. Even with a large number of training epochs, the unlabeled loss scaled by lambda_u, i.e. L_u * lambda_u, stays high and oscillates heavily, even though the labeled loss quickly converges. L_u * lambda_u and L_s start at the same order of magnitude but end up differing by 1-2 orders of magnitude. (A minimal sketch of how I compute this term is below the list.)

  2. With balanced labeled data in the training set, FixMatch achieves very similar precision but worse recall compared to naive supervised training. With imbalanced labeled data, FixMatch gives better precision but significantly worse recall.
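
For concreteness, this is roughly how I compute that term in my codebase (a minimal PyTorch sketch; `model`, the batch tensors, `tau`, and `lambda_u` are placeholders from my own setup, not the official FixMatch code):

```python
# Minimal sketch of the FixMatch unlabeled-loss term discussed in observation 1.
# All names here are placeholders from my setup, not the google-research/fixmatch code.
import torch
import torch.nn.functional as F

def fixmatch_unlabeled_loss(model, weak_batch, strong_batch, tau=0.95, lambda_u=1.0):
    # Pseudo-labels come from the weakly augmented view; no gradient flows through them.
    with torch.no_grad():
        probs = torch.softmax(model(weak_batch), dim=-1)
        max_probs, pseudo_labels = probs.max(dim=-1)
        mask = (max_probs >= tau).float()  # only confident predictions contribute

    # Consistency: the strongly augmented view must predict the pseudo-label.
    logits_strong = model(strong_batch)
    per_sample_ce = F.cross_entropy(logits_strong, pseudo_labels, reduction='none')
    L_u = (per_sample_ce * mask).mean()
    return lambda_u * L_u  # this is the L_u * lambda_u term that stays high and oscillates
```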

Here are my guesses, please correct me if I'm wrong.

First, FixMatch seems to need a good, or at least mediocre, pre-trained model at the start of training to keep low-quality pseudo labels from injecting too much noise into the learning process. Second, FixMatch seems to have trouble adapting to imbalanced classification tasks: the majority class is quickly overfitted, and the randomly sampled unlabeled data cannot provide much useful information via consistency regularization and pseudo labeling. Third, hyperparameter tuning (with augmentation strength perhaps the most important) is vital and difficult when adapting FixMatch to a new task.

I would appreciate any comments or suggestions for improvement. Thank you very much.

@carlini
Collaborator

carlini commented Feb 7, 2021

Yeah, unbalanced datasets are something that SSL still really struggles with. We've found that distribution matching from ReMixMatch can help here, but I don't have a definitive answer. My guess is that this is less an issue with FixMatch itself and more an open research question of how to better handle mismatched data distributions.
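
In case it helps, here is a rough sketch of the distribution-matching idea (the class/interface names below are just illustrative, not the released ReMixMatch code): keep a running average of the model's predictions on unlabeled data and rescale new predictions toward a target class prior before thresholding and pseudo-labeling.

```python
# Rough sketch of distribution matching/alignment applied to pseudo-label probabilities.
# target_dist is an assumed class prior (e.g. estimated from the labeled set); this is an
# illustration of the idea, not the released ReMixMatch implementation.
import torch

class DistributionAlignment:
    def __init__(self, target_dist, momentum=0.999):
        self.target_dist = target_dist                       # shape (num_classes,)
        self.running_pred = torch.full_like(target_dist, 1.0 / target_dist.numel())
        self.momentum = momentum

    def __call__(self, probs):
        # probs: model predictions on weakly augmented unlabeled data, shape (B, num_classes)
        batch_mean = probs.mean(dim=0).detach()
        self.running_pred = self.momentum * self.running_pred + (1 - self.momentum) * batch_mean
        # Rescale predictions toward the target prior, then renormalize to sum to 1.
        aligned = probs * (self.target_dist / (self.running_pred + 1e-6))
        return aligned / aligned.sum(dim=-1, keepdim=True)
```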

@ElimsV ElimsV closed this as completed Feb 26, 2021