Sungwon-Han/TwoStageUC
Mitigating Embedding and Class Assignment Mismatch in Unsupervised Image Classification

This repository contains the PyTorch code for "Mitigating Embedding and Class Assignment Mismatch in Unsupervised Image Classification", published at ECCV 2020.

Highlight

  • Our two-stage process starts with embedding learning as a pretraining step, which provides a strong initialization. The second stage then assigns a class to each data point by refining its pretrained embedding. Our model successfully optimizes both objectives without falling into a mismatched state.
  • The proposed method substantially outperforms existing baselines. On the CIFAR-10 dataset, we achieve an accuracy of 81.0%, whereas the best-performing alternative reaches 61.7%.
  • Extensive experiments and ablation studies confirm that both stages are critical to the overall performance gain. An in-depth comparison with current state-of-the-art (SOTA) methods reveals that much of our advantage comes from the embedding-learning initialization, which gathers similar images nearby even in a low-dimensional space.
  • Our model can also serve as a pretraining step for semi-supervised tasks with few labels; we show the potential gain in the experiment section.

Required packages

  • python == 3.6.10
  • pytorch == 1.1.0
  • scikit-learn == 0.21.2
  • scipy == 1.3.0
  • numpy == 1.18.5
  • pillow == 7.1.2
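Assuming a Python 3.6 environment, one way to install the listed packages is via pip. This command is an illustrative sketch, not the repository's documented install step; in particular, the pytorch 1.1.0 wheel may need to be fetched from the PyTorch download archive matching your CUDA version:

```shell
pip install torch==1.1.0 scikit-learn==0.21.2 scipy==1.3.0 numpy==1.18.5 pillow==7.1.2
```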

Two stage model architecture

(a) First stage: Unsupervised deep embedding


super_and.py

The encoder projects input images onto a lower-dimensional embedding sphere via deep embedding (Super-AND). It is trained to gather samples with similar semantic content nearby and to separate them otherwise.

usage: super_and.py [-h] [--dataset DATASET] [--low_dim LOW_DIM] [--npc_t T]
                    [--npc_m NPC_M] [--ANs_select_rate ANS_SELECT_RATE]
                    [--ANs_size ANS_SIZE] [--lr LR] [--momentum M]
                    [--weight_decay W] [--epochs EPOCHS] [--rounds ROUNDS]
                    [--batch_t T] [--batch_m N] [--batch_size B]
                    [--model_dir MODEL_DIR] [--resume RESUME] [--test_only]
                    [--seed SEED]
Example
python3 super_and.py --dataset cifar10
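To illustrate the first-stage idea, the minimal sketch below shows an encoder mapping images onto a low-dimensional unit sphere, with a temperature-scaled similarity loss that pulls each sample toward its own augmented view and pushes other samples away. This is a simplified stand-in for the full Super-AND training loop; names such as `TinyEncoder` and `instance_loss` are illustrative, not the repository's actual API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEncoder(nn.Module):
    """Toy encoder: images -> L2-normalized low-dimensional embeddings."""
    def __init__(self, low_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(16, low_dim)

    def forward(self, x):
        h = self.conv(x).flatten(1)
        z = self.fc(h)
        return F.normalize(z, dim=1)  # project onto the unit embedding sphere

def instance_loss(z1, z2, t=0.07):
    # z1, z2: embeddings of two augmented views of the same batch.
    # Cosine similarities scaled by temperature t; the positive pair for
    # each sample is the one at the same batch index.
    logits = z1 @ z2.t() / t
    targets = torch.arange(z1.size(0))
    return F.cross_entropy(logits, targets)

enc = TinyEncoder(low_dim=8)
x = torch.randn(4, 3, 32, 32)
loss = instance_loss(enc(x), enc(x + 0.01 * torch.randn_like(x)))
```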

(b) Second stage: Unsupervised class assignment with refining pretrained embeddings


main.py

A multi-head normalized fully-connected layer classifies images by jointly optimizing the clustering and embedding losses.

usage: main.py [-h] [--dataset DATASET] [--low_dim LOW_DIM] [--lr LR]
               [--momentum M] [--weight_decay W] [--epochs EPOCHS]
               [--batch_t T] [--batch_m N] [--batch_size B]
               [--model_dir MODEL_DIR] [--resume RESUME] [--test_only]
               [--seed SEED]
Example
python3 main.py --dataset cifar10 --resume [first stage pretrained model]
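A normalized fully-connected head can be sketched as follows: both the embeddings and the per-class weight rows are L2-normalized, so logits are cosine similarities, and several independent heads are kept in parallel. This is a simplified illustration of the second-stage classifier, assuming this layer shape; it is not the repository's actual class.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadNormalizedClassifier(nn.Module):
    """Several normalized linear heads over a shared embedding."""
    def __init__(self, low_dim=128, num_classes=10, num_heads=5):
        super().__init__()
        # One weight matrix per head; each row is a class "anchor" direction.
        self.heads = nn.ParameterList(
            [nn.Parameter(torch.randn(num_classes, low_dim))
             for _ in range(num_heads)]
        )

    def forward(self, z):
        z = F.normalize(z, dim=1)
        outputs = []
        for w in self.heads:
            w = F.normalize(w, dim=1)   # normalized FC: logits are cosines
            outputs.append(z @ w.t())
        return outputs                   # one logit tensor per head

clf = MultiHeadNormalizedClassifier(low_dim=8, num_classes=10, num_heads=3)
logits = clf(torch.randn(4, 8))
```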

Pretrained Model

We currently provide pretrained weights for our model and Super-AND on the CIFAR-10 dataset.

Result

Unsupervised Image Classification Result

  • We achieve new state-of-the-art unsupervised image classification results on multiple datasets (CIFAR-10, CIFAR-100-20, STL-10).

Additional experiments

  • We found that the performance of our algorithm is sensitive to initial randomness and to the pretraining results. When the model is trained from scratch, including pretraining, the resulting accuracy varies accordingly. The table below reports the best and average accuracy of the proposed model over six runs.

Supplementary Materials

This repository includes supplementary materials in the ECCV2020 directory, covering implementation details and additional analyses of our model.
