Does the code support end-to-end fine-tuning, including the retriever? #4
Comments
As you saw in the paper, the evidence blocks are frozen during fine-tuning, which means that no index updates are performed at that stage. Therefore, for domain-specific QA, we would first have to pre-train REALM to get domain-specific evidence blocks (retriever), and then we could further fine-tune it on a given dataset.
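For concreteness, here is a minimal fine-tuning-side sketch, assuming the `RealmForOpenQA`/`RealmRetriever` API and the `google/realm-orqa-nq-openqa` checkpoint from the PyTorch port (adapted from the Transformers docs example; batch size must be 1 for `RealmForOpenQA`):

```python
import torch
from transformers import RealmForOpenQA, RealmRetriever, RealmTokenizer

# The evidence blocks and their pre-computed embeddings ship with the
# checkpoint; they stay frozen during fine-tuning (no index refresh).
retriever = RealmRetriever.from_pretrained("google/realm-orqa-nq-openqa")
tokenizer = RealmTokenizer.from_pretrained("google/realm-orqa-nq-openqa")
model = RealmForOpenQA.from_pretrained("google/realm-orqa-nq-openqa", retriever=retriever)

question = "Who is the pioneer in modern computer science?"
question_ids = tokenizer([question], return_tensors="pt")
answer_ids = tokenizer(
    ["alan mathison turing"],
    add_special_tokens=False,
    return_token_type_ids=False,
    return_attention_mask=False,
).input_ids

# The reader loss is differentiable w.r.t. the query encoder and reader,
# but not the frozen block embeddings, so only those parts are fine-tuned.
reader_output, predicted_answer_ids = model(**question_ids, answer_ids=answer_ids, return_dict=False)
loss = reader_output.loss
print(tokenizer.decode(predicted_answer_ids))
```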
So we have to pre-train REALM with the masked word prediction task, right?
Exactly, but the pre-training part has not been fully ported to PyTorch, especially the asynchronous MIPS index refreshes and the Inverse Cloze Task (ICT), which is used to warm-start retriever training. Thus, to pre-train REALM, we would have to use the original TensorFlow implementation, and could then fine-tune the result in PyTorch.
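To illustrate what the missing refresh step would have to do, here is a rough sketch, not the ported API: it assumes the `RealmEmbedder` checkpoint `google/realm-cc-news-pretrained-embedder` and uses exhaustive inner-product search over a toy corpus in place of the approximate MIPS index (in the TF pipeline this re-embedding runs asynchronously on separate workers while training continues):

```python
import torch
from transformers import RealmEmbedder, RealmTokenizer

tokenizer = RealmTokenizer.from_pretrained("google/realm-cc-news-pretrained-embedder")
embedder = RealmEmbedder.from_pretrained("google/realm-cc-news-pretrained-embedder")

# Toy evidence corpus standing in for the real multi-million-block Wikipedia dump.
evidence_blocks = [
    "REALM augments language model pre-training with a latent knowledge retriever.",
    "The Inverse Cloze Task warm-starts the retriever before MLM pre-training.",
]

@torch.no_grad()
def refresh_index(blocks):
    """Re-embed every evidence block with the current embedder weights."""
    inputs = tokenizer(blocks, padding=True, truncation=True, return_tensors="pt")
    return embedder(**inputs).projected_score  # (num_blocks, proj_dim)

block_emb = refresh_index(evidence_blocks)

# Retrieval is then an inner product between the query embedding and the
# refreshed block embeddings (brute-force here; approximate MIPS at scale).
query = tokenizer("What warm-starts the retriever?", return_tensors="pt")
with torch.no_grad():
    query_emb = embedder(**query).projected_score
scores = query_emb @ block_emb.T
print(evidence_blocks[scores.argmax().item()])
```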
Thanks a lot for your insight. In any case, this end-to-end fine-tuning will be very expensive.
@qqaatw is it part of the roadmap to port the pre-training part to PyTorch?
It was part of the roadmap, but now I'm wondering whether it's worth porting. You can see the configuration of their experiments, which leveraged an array of resources and is extremely expensive for ordinary users and researchers. I don't have such resources, and I don't think a regular deep-learning workstation would be able to reproduce results similar to theirs.
@qqaatw "It was part of the roadmap, but now I'm thinking whether this is worth port." Yeah, this seems a problem and I agree. |
The REALM paper highlights that the retriever's evidence blocks were kept frozen for downstream tasks. What about a task like domain-specific open-domain question answering? In that kind of scenario, can we train the entire REALM model with this code?
If yes, we might be able to compare results with RAG-end2end:
https://github.com/huggingface/transformers/tree/master/examples/research_projects/rag-end2end-retriever