Are the transformers of bi-encoder trained separately? #4

Closed
kaisugi opened this issue Oct 24, 2020 · 2 comments

kaisugi (Contributor) commented Oct 24, 2020

(To be honest, I'm not used to deep-learning coding with PyTorch, Hugging Face, etc., so this might be a silly question. Please keep in mind that I'm a beginner.)

The original paper says that the context encoder and the candidate encoder are trained separately.

[Screenshots of the relevant passages from the original paper]

However, I found that in your code both inputs are encoded by the same transformer, called as self.bert().

https://github.com/chijames/Poly-Encoder/blob/master/encoder.py#L20-L27


Is this OK? I don't see how these two encoders could end up with different weights after training.

FYI: the official implementation of the BLINK paper (https://arxiv.org/pdf/1911.03814.pdf) defines two separate encoders: https://github.com/facebookresearch/BLINK/blob/master/blink/biencoder/biencoder.py#L37-L48
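
For readers who want to see the pattern concretely, here is a minimal sketch of the shared-encoder design being discussed; this is not the repository's actual code, and the class name, [CLS] pooling choice, and dot-product scoring are illustrative assumptions:

```python
import torch.nn as nn
from transformers import BertModel

class SharedBiEncoder(nn.Module):
    """Bi-encoder where a single BERT encodes both context and candidate,
    so the two 'encoders' share all weights by construction."""

    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)

    def encode(self, input_ids, attention_mask):
        # Use the [CLS] token's final hidden state as the sequence embedding.
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        return out.last_hidden_state[:, 0]

    def forward(self, ctx_ids, ctx_mask, cand_ids, cand_mask):
        ctx_vec = self.encode(ctx_ids, ctx_mask)     # (batch, hidden)
        cand_vec = self.encode(cand_ids, cand_mask)  # (batch, hidden)
        # In-batch dot-product scores between contexts and candidates.
        return ctx_vec @ cand_vec.t()                # (batch, batch)
```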

chijames (Owner) commented

You are right that the original paper used two encoders. However, doing so would consume much more GPU memory, which is why I just used one encoder. (I suspect it is very likely that separate encoders can further boost performance.) Anyway, let me add this to the readme.

If you really want two encoders, feel free to experiment with it. It should be fairly easy to modify the code. (And of course you are welcome to share the results. :) )
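
For anyone attempting that modification, here is a minimal hypothetical sketch of the two-encoder variant (not code from this repository); the class and attribute names and the [CLS] pooling are illustrative assumptions:

```python
import torch.nn as nn
from transformers import BertModel

class SeparateBiEncoder(nn.Module):
    """Bi-encoder with two independent BERTs, as in the original paper and
    BLINK. The two encoders can diverge during fine-tuning, but roughly
    double the memory used for parameters, gradients, and optimizer state."""

    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        # Two independently initialized, independently fine-tuned encoders.
        self.ctx_bert = BertModel.from_pretrained(model_name)
        self.cand_bert = BertModel.from_pretrained(model_name)

    def forward(self, ctx_ids, ctx_mask, cand_ids, cand_mask):
        ctx_vec = self.ctx_bert(
            input_ids=ctx_ids, attention_mask=ctx_mask
        ).last_hidden_state[:, 0]
        cand_vec = self.cand_bert(
            input_ids=cand_ids, attention_mask=cand_mask
        ).last_hidden_state[:, 0]
        return ctx_vec @ cand_vec.t()
```

The extra memory cost is exactly the trade-off mentioned above: keeping two full BERTs means storing a second copy of every parameter, plus its gradients and optimizer state during training.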

kaisugi (Contributor, Author) commented Oct 24, 2020

OK, got it!
It's also interesting that using the same encoder still performs comparably well.
