
Significance of line 174 in train_query.py code #24

Closed

Nishant3815 opened this issue Apr 3, 2022 · 4 comments

Comments

Nishant3815 commented Apr 3, 2022

Hi,

I was going through the code for query-side fine-tuning and I am not able to understand one condition in it:

[screenshot of train_query.py around line 174]

Is the highlighted line redundant, and if not, what is its significance? (I feel we could directly update the encoder.) I just wanted to make sure that I am not missing anything.

jhyuklee (Member) commented Apr 3, 2022

Hi @Nishant3815,

This line (v1.0.0) was written to sync the pre-trained encoder (used for retrieval) with the target encoder that is used for the query-side fine-tuning. As you mentioned, we recently updated the code (v1.1.0) to directly update the target encoder and use it for retrieval as well: https://github.com/princeton-nlp/DensePhrases/blob/v1.1.0/train_query.py

It gives slightly better accuracy overall.
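For anyone reading this later, here is a minimal sketch of the two schedules being contrasted. The names (`target_encoder`, `pretrained_encoder`) and the loss are placeholders, not the actual DensePhrases classes or training code:

```python
import copy

import torch
import torch.nn as nn

# Toy stand-ins for the query encoders (hypothetical names, not the real classes).
target_encoder = nn.Linear(8, 8)                    # encoder being query-side fine-tuned
pretrained_encoder = copy.deepcopy(target_encoder)  # frozen copy used for retrieval in v1.0.0
optimizer = torch.optim.SGD(target_encoder.parameters(), lr=0.1)
batches = [torch.randn(4, 8) for _ in range(3)]

for epoch in range(2):
    for batch in batches:
        # v1.0.0: phrases are retrieved with pretrained_encoder, which stays frozen
        # for the whole epoch. v1.1.0: retrieval uses target_encoder itself, so every
        # optimizer step is reflected in retrieval immediately.
        query_vec = target_encoder(batch)
        loss = query_vec.pow(2).mean()              # placeholder for the real fine-tuning loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # v1.0.0 behavior (roughly what lines 174-176 did): re-sync the retrieval encoder
    # at the epoch boundary so the next epoch retrieves with the updated weights.
    pretrained_encoder = copy.deepcopy(target_encoder)
```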

Nishant3815 (Author) commented

Thanks for confirming the switch to updating the target encoder directly. That makes sense; I had been following a previous version.

Regarding your comment, "this line (v1.0.0) was written to sync the pre-trained encoder (for the retrieval) with the target encoder that is used for the query-side fine-tuning": in a scenario where we would like to keep both the pretrained_encoder and target_encoder versions of the code, can we keep lines 175 and 176 and remove the "if" statement on line 174?

jhyuklee (Member) commented Apr 4, 2022

Yes, you can do that. It has no effect in v1.0.0 for now unless you use a higher divisor (currently 1) for the updating period.

Note that v1.0.0 with deepcopy syncs the pre-trained encoder with the target encoder after each epoch, so the pre-trained encoder stays frozen within each epoch. In contrast, v1.1.0 uses only the target encoder, so phrases are retrieved from the same target encoder that is being updated at every step.
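To make the guard and the updating period concrete, a rough sketch of the structure being discussed; the variable names and the exact modulo condition are assumptions, not a quote of train_query.py:

```python
import copy

import torch.nn as nn

# Hypothetical stand-ins; this only mirrors the structure of the discussion.
target_encoder = nn.Linear(8, 8)
pretrained_encoder = copy.deepcopy(target_encoder)

update_period = 1   # the "divisor" mentioned above; with 1, the guard fires every epoch
num_epochs = 4

for epoch in range(num_epochs):
    # ... one epoch of query-side fine-tuning that updates target_encoder ...
    if (epoch + 1) % update_period == 0:                    # the line-174-style condition
        pretrained_encoder = copy.deepcopy(target_encoder)  # the lines-175/176 sync
```

With `update_period = 1` the condition is always true, which is why removing the "if" changes nothing in v1.0.0 unless a larger divisor is used.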

Nishant3815 (Author) commented

Thanks, this helps.
