
Number of classes much greater than the number of unique predictions obtained #66

Closed
deshanadesai opened this issue Jun 3, 2021 · 6 comments


@deshanadesai

Hi! Hope you are doing well. I enjoyed reading this paper and the code. I was trying to adapt this method to my own task on a custom dataset, and I noticed something strange:

Even with a high number of classes given as input to the model, e.g. 1024, the predictions produced by the model had only 52 unique values:

```python
>>> torch.unique(data['predictions'])
tensor([ 13,  29,  54,  60,  79,  93, 105, 150, 188, 253, 260, 286, 293, 333,
        347, 363, 378, 406, 408, 414, 418, 447, 450, 492, 509, 529, 542, 580,
        602, 615, 628, 635, 643, 662, 670, 677, 685, 698, 711, 720, 734, 741,
        765, 799, 810, 847, 848, 859, 910, 919, 932, 933])
```
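
(As a minimal sketch of this check, assuming `data['predictions']` is a 1-D tensor of per-sample cluster assignments; the `num_classes` value is illustrative:)

```python
import torch

# Count how many of the overclustering head's outputs are ever predicted.
# Assumes data['predictions'] is a 1-D LongTensor of cluster assignments.
num_classes = 1024  # illustrative: the size of the clustering head
occupancy = torch.bincount(data['predictions'], minlength=num_classes)
print(f"{(occupancy > 0).sum().item()} of {num_classes} clusters are non-empty")
```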

Do you think this is expected when overclustering to this degree?

I would highly appreciate your input.

Thank you,
Deshana

@wvangansbeke
Owner

Hi @deshanadesai,

Thank you for your interest.

Hard to say, since I don't know which dataset you're using. To discover what the model has learned, I would take a look at a few clusters and a few neighbors. You can also compare it to KMeans clustering.

It might be that you don't have enough images in your dataset to find reliable clusters, or that the settings for the pretext task (e.g. duration of training, augmentation strategy, etc.) are not ideal for your particular problem.
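
(A minimal sketch of such a KMeans comparison, assuming you have dumped the pretext features to disk; the `features.pt` path and the (N, D) layout are illustrative, not part of this repo:)

```python
import numpy as np
import torch
from sklearn.cluster import KMeans

# Illustrative: an (N, D) matrix of pretext (e.g. SimCLR/MoCo) embeddings.
features = torch.load('features.pt').numpy()

kmeans = KMeans(n_clusters=1024, n_init=10, random_state=0).fit(features)

# If KMeans also leaves most of the 1024 clusters empty, the embeddings are
# probably not discriminative enough (a pretext problem), rather than the
# clustering head collapsing later on.
occupancy = np.bincount(kmeans.labels_, minlength=1024)
print((occupancy > 0).sum(), "non-empty clusters out of 1024")
```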

@anewusername77

anewusername77 commented Jun 8, 2021

> Even with a high number of classes given as input to the model, e.g. 1024, the predictions produced by the model had only 52 unique values […]

I'm having this problem too, and I've found that the neighbors obtained after running the pretext task are incorrect. I just don't know how to fix it. (I'm using moco.py; my image size is larger than 224×224.)
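
(For anyone debugging the same thing, a minimal sketch of how neighbor quality can be sanity-checked when ground-truth labels are available; `features` and `targets` are illustrative names for the pretext embeddings and labels:)

```python
import torch
import torch.nn.functional as F

# Illustrative inputs: features is (N, D), targets is (N,).
features = F.normalize(features, dim=1)  # L2-normalize for cosine similarity
sim = features @ features.t()            # (N, N) similarity matrix
_, idx = sim.topk(k=21, dim=1)           # each row: the sample itself + 20 neighbors
neighbors = idx[:, 1:]                   # drop the self-match

# Fraction of mined neighbors that share the query's ground-truth label.
hits = (targets[neighbors] == targets.unsqueeze(1)).float()
print("neighbor accuracy:", hits.mean().item())
```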

@arbab-ml

I am also experiencing the same issue. @wvangansbeke, could you please skim through my pretext config file?
Thank you.


```yaml
# Setup
setup: simclr

# Model
backbone: resnet50
model_kwargs:
   head: mlp
   features_dim: 128

# Dataset
train_db_name: batsnet
val_db_name: batsnet
num_classes: 5

# Loss
criterion: simclr
criterion_kwargs:
   temperature: 0.1

# Hyperparameters
epochs: 30
optimizer: sgd
optimizer_kwargs:
   nesterov: False
   weight_decay: 0.0001
   momentum: 0.9
   lr: 0.4
scheduler: cosine
scheduler_kwargs:
   lr_decay_rate: 0.03 # 0.1
batch_size: 50
num_workers: 8

# Transformations
augmentation_strategy: batsnet_strategy
augmentation_kwargs:
   crop_size: 128
   color_jitter_random_apply:
      p: 0.8
   color_jitter:
      brightness: 0.4
      contrast: 0.4
      saturation: 0.4
      hue: 0.1
   random_grayscale:
      p: 0.2
   normalize:
      mean: [0.4914, 0.4822, 0.4465]
      std: [0.2023, 0.1994, 0.2010]

transformation_kwargs:
   crop_size: 128
   normalize:
      mean: [0.4914, 0.4822, 0.4465]
      std: [0.2023, 0.1994, 0.2010]
```
@arbab-ml

And good luck with NeurIPS. :-)

@wvangansbeke
Owner

Hi @arbab97,

That looks okay. Since the number of classes in your dataset is only 5, I don't think this should really be a problem. What are the entropy loss and consistency loss? Are the losses going down? What kind of data are you using? What do you get with KMeans?
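
(As a rough way to check for cluster collapse, one can look at the entropy of the mean soft assignment over a batch; a sketch, with `logits` as an illustrative (B, C) tensor from the clustering head:)

```python
import math
import torch

# Mean soft assignment over a batch; if its entropy is far below log(C),
# probability mass is concentrated on a few clusters, i.e. partial collapse.
probs = torch.softmax(logits, dim=1)  # logits: (B, C), illustrative
mean_p = probs.mean(dim=0)
entropy = -(mean_p * (mean_p + 1e-8).log()).sum().item()
print(f"entropy = {entropy:.3f}, max possible = {math.log(probs.size(1)):.3f}")
```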

@wvangansbeke
Owner

If there are still issues, let me know. I'm closing this for now.
