key-value updates #4
Comments
Hi, I am curious about one thing: since the unlabeled portion of the data is much larger than the labeled portion, how does the data loading work? Does it mean we need to repeat the labeled portion of the data multiple times? Thanks again for all your help!
In this implementation, yes. But one can also use annealing to decrease this portion during training.
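If it helps others reading this thread, here is a minimal sketch of what "repeating the labeled portion" could look like. All names and batch sizes here are illustrative placeholders, not taken from this repository: the labeled set is simply cycled so that every batch contains both labeled and unlabeled samples.

```python
import random

# Toy stand-ins for the two splits; the size imbalance is illustrative only.
labeled = list(range(10))        # small labeled set (10 samples)
unlabeled = list(range(1000))    # much larger unlabeled set (1000 samples)

def mixed_batches(labeled, unlabeled, batch_size_l=2, batch_size_u=8):
    """Yield batches pairing labeled and unlabeled samples.

    The labeled set is cycled (i.e. repeated) so every batch contains
    both kinds of data, as discussed above.
    """
    li = 0
    random.shuffle(unlabeled)
    for start in range(0, len(unlabeled), batch_size_u):
        batch_u = unlabeled[start:start + batch_size_u]
        batch_l = []
        for _ in range(batch_size_l):
            # Wrap around: the labeled data is reused many times per epoch.
            batch_l.append(labeled[li % len(labeled)])
            li += 1
        yield batch_l, batch_u

batches = list(mixed_batches(labeled, unlabeled))
```

With these toy sizes, one pass over the unlabeled data yields 125 batches and visits each labeled sample 25 times, which is the repetition the question refers to.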
Hi,
Thanks for the wonderful work. I found it to be a great read and very easy to understand.
(1) I am wondering what loss function you used for the key-value updates.
Based on eqn. (3), it seems that a mean squared error loss has been utilized, something like
loss_{k_j} = sum_{i=1}^{n_j} (k_j - x_i)^2
Is there any particular reason that 1/(n_j + 1) has been selected instead of 1/n_j?
(2) I am also wondering: if the MND and ME losses were defined only on the unlabeled portion of the data, how would the performance degrade?
(3) Lastly, what happens if, instead of updating the keys and values after every epoch, we simply average the intermediate representations and the softmax outputs of the labeled data to re-define the key-value pairs?
(4) Any plans to release the code in PyTorch? I am not very familiar with TensorFlow, but I would like to understand your method better by studying the code.
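To make question (1) concrete, here is how I currently read the update: the 1/(n_j + 1) factor looks like an incremental mean that treats the current key as one extra pseudo-sample. This is only my guess as a sketch, not the authors' code, and `update_key` is a name I made up:

```python
import numpy as np

def update_key(k_j, xs):
    """One possible reading of the key update in eqn. (3).

    Treat the current key k_j as one extra pseudo-sample, so the new key
    is the mean of {k_j, x_1, ..., x_{n_j}}, i.e. the old key is pulled
    toward the assigned samples with step size 1/(n_j + 1). With 1/n_j
    the old key would be discarded entirely in favor of the sample mean.
    """
    xs = np.asarray(xs, dtype=float)
    n_j = len(xs)
    return (k_j + xs.sum(axis=0)) / (n_j + 1)

k = np.array([0.0, 0.0])
xs = [[1.0, 1.0], [3.0, 1.0]]
k_new = update_key(k, xs)  # mean of {[0,0], [1,1], [3,1]} = [4/3, 2/3]
```

Is this roughly the intended update, or am I misreading the equation?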
Any clarifications would be helpful.
Thanks
Devraj