
How it works that the self.learnable_vector is learnable? #26

Closed
yupeng1111 opened this issue Mar 21, 2023 · 4 comments

Comments

@yupeng1111

Nice work!
I'm wondering how you optimize the unconditional vector in

self.params_with_white=params + list(self.learnable_vector)

I see that the intent is for the unconditional input self.learnable_vector to be optimized by passing self.params_with_white as the optimizer's params at line 927, but how does that work? I couldn't find anything about this usage of the params argument in the torch.optim.AdamW documentation.
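For context, torch.optim.AdamW (like every torch.optim optimizer) accepts an iterable of tensors as its first argument, and any tensor in that list with requires_grad=True gets updated on each step(). A minimal sketch of the mechanism (shapes and the loss are illustrative stand-ins, not the repository's actual code):

```python
import torch

# Stand-in for the unconditional "learnable vector": wrapping it in
# nn.Parameter gives it requires_grad=True.
learnable_vector = torch.nn.Parameter(torch.randn(1, 4))
other_params = [torch.nn.Parameter(torch.randn(4, 4))]

# Appending it to the optimizer's param list is what makes it learnable.
opt = torch.optim.AdamW(other_params + [learnable_vector], lr=1e-2)

before = learnable_vector.detach().clone()
loss = learnable_vector.sum()  # toy loss; gradient is 1 for every entry
loss.backward()
opt.step()

# The vector has now moved away from its initial value.
print(bool((learnable_vector.detach() != before).any()))
```

If the tensor is instead created without requires_grad (or detached before being added to the list), the optimizer silently leaves it unchanged, which would explain a vector that never moves.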

@yupeng1111
Author

Besides, I calculated the mean (0.0327) and std (0.9966) of learnable_vector, and it looks more like a random Gaussian sample, so I doubt whether its value actually changes during training.

@Fantasy-Studio
Owner

Fantasy-Studio commented Mar 23, 2023

We appreciate your interest in our work. I apologize for accidentally introducing this bug during code cleanup; it has now been fixed.
Regarding the learnable vector resembling Gaussian noise, I believe this is caused by two factors: 1) it is initialized from a Gaussian distribution, and 2) because the learning rate is small, it does not deviate far from its initial value.

We have also retrained the network and tracked the value and gradient of the learnable vector during training; the gradient of the learnable vector turns out to be quite small.
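The kind of tracking described above can be reproduced with a small loop. Everything here is a placeholder (the dimension, learning rate, step count, and the toy loss are assumptions, not the repository's training setup), but it shows why a small learning rate plus tiny gradients keeps the vector close to its Gaussian init:

```python
import torch

# Gaussian-initialized stand-in for the learnable unconditional vector.
learnable_vector = torch.nn.Parameter(torch.randn(1, 768))
opt = torch.optim.AdamW([learnable_vector], lr=1e-5)  # assumed small lr
init = learnable_vector.detach().clone()

for step in range(100):
    opt.zero_grad()
    # Placeholder loss standing in for the diffusion training objective.
    loss = 1e-3 * (learnable_vector ** 2).mean()
    loss.backward()
    grad_norm = learnable_vector.grad.norm().item()
    opt.step()

# Adam caps each per-coordinate update at roughly lr, so after N steps the
# drift is bounded by about N * lr (here 100 * 1e-5 = 1e-3).
drift = (learnable_vector.detach() - init).abs().max().item()
print(grad_norm, drift)
```

With a drift this small, the per-coordinate mean and std stay essentially at their N(0, 1) initialization, matching the statistics reported above.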

@yupeng1111
Author

Thanks!

@dorianzhang7

dorianzhang7 commented Aug 14, 2023

I have the same question about changes to the parameters of to_k and to_q in the cross-attention module: I trained the network for many iterations, but the parameters of these two parts never changed. My guess is that this is caused by only a one-dimensional class vector being selected as cond. https://github.com/Fantasy-Studio/Paint-by-Example/issues/
Did you retrain the code, and did you encounter the same problem?
Looking forward to hearing from you!
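A quick way to check whether specific weights such as to_k and to_q actually receive updates is to snapshot them before a training step and compare afterwards. This sketch uses plain nn.Linear layers as stand-ins for the cross-attention projections (the shapes and loss are illustrative assumptions):

```python
import torch
from torch import nn

# Stand-ins for the cross-attention key/query projections.
to_k = nn.Linear(8, 8, bias=False)
to_q = nn.Linear(8, 8, bias=False)
opt = torch.optim.AdamW(
    list(to_k.parameters()) + list(to_q.parameters()), lr=1e-4
)

# Snapshot the weights before the step.
snapshot = {"to_k": to_k.weight.detach().clone(),
            "to_q": to_q.weight.detach().clone()}

cond = torch.randn(4, 8)  # stand-in conditioning input
loss = (to_q(cond) * to_k(cond)).sum()  # toy attention-like interaction
loss.backward()
opt.step()

for name, weight in [("to_k", to_k.weight), ("to_q", to_q.weight)]:
    changed = not torch.equal(weight.detach(), snapshot[name])
    print(name, "changed:", changed)
```

If a comparison like this shows no change over many real training steps, the usual suspects are parameters missing from the optimizer's param list, frozen requires_grad flags, or gradients that are identically zero for the given conditioning input.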
