swapped_prediction computation #4
Yes, you are right, it is a small typo. I changed the code multiple times and forgot to remove the loop. Anyway, the swapped assignment is performed in the same way, since I only index the view axis inside the loop.
Yeah... it should be preds = F.log_softmax(preds / 0.1, dim=2), applying log_softmax on the last dim.
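To see why the dim matters: for a predictions tensor of shape (num_heads, batch, num_classes), normalizing on dim=2 makes each sample's class scores a valid distribution, while normalizing on dim=1 would (incorrectly) normalize across the batch. A minimal numpy sketch, with illustrative shapes that are not taken from the repository:

```python
import numpy as np

def log_softmax(x, axis):
    # numerically stable log-softmax along the given axis
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

# hypothetical shape: (num_heads, batch, num_classes)
rng = np.random.default_rng(0)
preds = rng.normal(size=(4, 8, 20))
temperature = 0.1

# correct: normalize over the class axis (the last dim)
logp = log_softmax(preds / temperature, axis=2)
probs = np.exp(logp)
# probabilities over classes sum to 1 for every (head, sample) pair
assert np.allclose(probs.sum(axis=2), 1.0)

# the bug discussed above: normalizing on axis=1 sums to 1 across the
# batch instead of across classes
wrong = np.exp(log_softmax(preds / temperature, axis=1))
assert np.allclose(wrong.sum(axis=1), 1.0)
```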
Yes, you are right. For some reason, this still works. Let me look into it.
Ok.
Hi, I fixed it and ran CIFAR100-20. I got similar results on the test set, while performance is slightly worse on the training set. I am now doing some hyperparameter tuning and will upload the fix as soon as possible.
nice~ 👍 |
Hi, I have some good news. It seems that normalizing on the correct dimension improves performance quite significantly. I needed to tune the parameters a bit, but I just had one run hit 55% on the training set. I also went back to my logs and found that the ImageNet experiments were probably run without the bug, while all other datasets were affected. I will upload the new version as soon as I finish running experiments.
Okay.
I have just merged a pull request that fixes this bug. Closing. Thanks @DeepTecher |
UNO/main_discover.py
Line 180 in 50022c9
Thanks for your nice work~
However, I have a question: why do we compute swapped_prediction in a loop over num_heads? It seems swapped_prediction performs the same computation on every iteration.
Hoping for your reply.
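For readers following the discussion: the redundancy raised here is that a per-head loop only differs across iterations if the logits and targets are actually indexed by head inside the loop body. A minimal numpy sketch of a swapped-prediction loss over heads, with hypothetical names and shapes that are not the repository's actual code:

```python
import numpy as np

def log_softmax(x, axis=-1):
    # numerically stable log-softmax along the given axis
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def cross_entropy(logits, targets, temperature=0.1):
    # soft cross-entropy between target distributions and predictions,
    # with log_softmax taken over the last (class) axis
    logp = log_softmax(logits / temperature, axis=-1)
    return float(np.mean(-np.sum(targets * logp, axis=-1)))

rng = np.random.default_rng(1)
num_heads, batch, num_classes = 3, 8, 20

# two augmented views' logits per head (illustrative random data)
logits_v1 = rng.normal(size=(num_heads, batch, num_classes))
logits_v2 = rng.normal(size=(num_heads, batch, num_classes))
targets_v1 = np.exp(log_softmax(logits_v1, axis=-1))
targets_v2 = np.exp(log_softmax(logits_v2, axis=-1))

# indexing by [h] inside the loop gives each head its own swapped loss;
# without the [h] index, every iteration would compute the same value,
# which is exactly the redundancy pointed out in this issue
losses = []
for h in range(num_heads):
    loss = (cross_entropy(logits_v2[h], targets_v1[h])
            + cross_entropy(logits_v1[h], targets_v2[h])) / 2
    losses.append(loss)

total = sum(losses) / num_heads
```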