This repository has been archived by the owner on Mar 1, 2024. It is now read-only.

optimization and loss #103

Open
lshowway opened this issue Nov 30, 2021 · 1 comment

Comments

@lshowway

Hi, thanks for your code and paper. I am new to entity linking (EL) and I have a question:
Are the bi-encoder and cross-encoder optimized jointly or separately? Specifically, what is the relationship between the loss function in Eq. 4, the loss function in the paragraph following Eq. 6, and the loss function in Eq. 10?

@ledw-2

ledw-2 commented Feb 6, 2022

@lshowway Hi, sorry for the delay in responding. The bi-encoder and cross-encoder are optimized separately, and those equations are all independent of one another. To be more specific:

1. Use Eq. 4 to train a bi-encoder.
2. Use Eq. 6 to train a cross-encoder.
3. Use Eq. 10 to train a distillation model (teacher: cross-encoder, student: bi-encoder).

I hope that answers your question!
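For readers who want to see what "separate" objectives can look like in code, here is a minimal sketch in plain Python. It is not taken from the repository or the paper: the function names are hypothetical, the first two stages are assumed to use a standard softmax cross-entropy over candidate scores, and the distillation loss is assumed to be a generic weighted mix of a hard-label term and a soft teacher term (with illustrative `alpha` and `temperature` parameters), which may differ from the exact form of Eq. 10.

```python
import math


def log_sum_exp(scores):
    # Numerically stable log(sum(exp(s))) over a list of floats.
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))


def log_softmax(scores, temperature=1.0):
    # Log-probabilities of a temperature-scaled softmax.
    scaled = [s / temperature for s in scores]
    lse = log_sum_exp(scaled)
    return [s - lse for s in scaled]


def softmax_cross_entropy(scores, gold_index):
    # Assumed form for stages 1 and 2: -s_gold + log sum_j exp(s_j).
    # Stage 1 (bi-encoder): scores against in-batch candidate entities.
    # Stage 2 (cross-encoder): scores against retrieved candidates.
    return -scores[gold_index] + log_sum_exp(scores)


def distillation_loss(student_scores, teacher_scores, gold_index,
                      alpha=0.5, temperature=1.0):
    # Assumed form for stage 3: a generic distillation objective.
    # hard term: student cross-entropy against the gold entity.
    # soft term: cross-entropy between teacher and student distributions.
    hard = softmax_cross_entropy(student_scores, gold_index)
    teacher_probs = [math.exp(lp)
                     for lp in log_softmax(teacher_scores, temperature)]
    student_log_probs = log_softmax(student_scores, temperature)
    soft = -sum(t * lp for t, lp in zip(teacher_probs, student_log_probs))
    return alpha * hard + (1.0 - alpha) * soft


# Each stage would be run as its own training loop with its own loss;
# only stage 3 consumes scores from an already trained cross-encoder.
stage1 = softmax_cross_entropy([2.0, 0.5, -1.0], gold_index=0)
stage2 = softmax_cross_entropy([3.0, 1.0], gold_index=0)
stage3 = distillation_loss([2.0, 0.5], [3.0, 1.0], gold_index=0)
```

The point of the sketch is the structure rather than the exact formulas: none of the three losses shares a gradient step with another, and the teacher scores enter the third loss as fixed inputs.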
