Hi. Thank you for open-sourcing this wonderful implementation! I have a small question about the code and think it might be a bug.
In these lines, you define slot_mu and slot_log_sigma using register_buffer. If I understand correctly, tensors created via register_buffer won't be updated during training (see here for reference). I also checked my trained checkpoints, and these two values are indeed unchanged throughout the training process.
Also, other Slot Attention implementations define them as trainable parameters (see the PyTorch one and the official one). So I just wonder whether this is a bug or intentional behavior? A minimal sketch contrasting the two definitions is below.
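For reference, here is a minimal sketch of what I mean (not the repository's actual code; the module name SlotInit and the learnable flag are made up for illustration). With register_buffer, slot_mu and slot_log_sigma are saved in the state_dict but never touched by the optimizer; with nn.Parameter, they receive gradients as in the official implementation.

```python
import torch
import torch.nn as nn

class SlotInit(nn.Module):
    """Hypothetical module sketching the two ways to define the slot init distribution."""

    def __init__(self, num_slots, slot_dim, learnable=True):
        super().__init__()
        self.num_slots = num_slots
        mu = torch.zeros(1, 1, slot_dim)
        log_sigma = torch.zeros(1, 1, slot_dim)
        nn.init.xavier_uniform_(mu)
        nn.init.xavier_uniform_(log_sigma)
        if learnable:
            # trainable: mu / log_sigma are updated by the optimizer,
            # as in the official and PyTorch Slot Attention implementations
            self.slot_mu = nn.Parameter(mu)
            self.slot_log_sigma = nn.Parameter(log_sigma)
        else:
            # buffer: stored in the state_dict, but stays fixed during training
            self.register_buffer("slot_mu", mu)
            self.register_buffer("slot_log_sigma", log_sigma)

    def forward(self, batch_size):
        # sample initial slots from N(mu, sigma^2)
        mu = self.slot_mu.expand(batch_size, self.num_slots, -1)
        sigma = self.slot_log_sigma.exp().expand(batch_size, self.num_slots, -1)
        return mu + sigma * torch.randn_like(mu)
```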
Update: I didn't observe much of a performance difference between trainable and fixed mu + sigma. That's very interesting.
Indeed, you're right. I actually ran experiments after fixing it, and the performance difference is very small (<5%). So I think the learned slot initialization distribution is not very important.