About Extrinsic Evaluation #3
Hello. To clarify: in the extrinsic evaluation, we only use pre-trained LMs, without any fine-tuning. We simply collect some relevant assertions from a CSKB, concatenate them with the given question, and feed all of that to the LM (see Table 4 in our paper for an example, or play with our QA demo at https://ascent.mpi-inf.mpg.de/qa). All the LMs we used (i.e., RoBERTa, GPT-2 and ALBERT) can be found in the HuggingFace Transformers library (https://huggingface.co/models).

Yes, the LAMA project could be a good start for the mask-prediction evaluation. For the other QA settings (i.e., generation and span prediction), we simply took the outputs of the LMs and asked for human evaluation.

About converting assertions to natural language: we used an embarrassingly simple approach:
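A minimal sketch of the setup described above: concatenate CSKB assertions (as context) with a cloze-style question and hand the result to a masked LM such as RoBERTa. The function and variable names here are hypothetical, not from the authors' code.

```python
def build_masked_prompt(assertions, question, mask_token="<mask>"):
    """Join context assertions with a question whose answer slot is masked.

    `[ANSWER]` is an assumed placeholder convention for the answer slot;
    `<mask>` is RoBERTa's mask token.
    """
    context = " ".join(assertions)
    return f"{context} {question.replace('[ANSWER]', mask_token)}"


prompt = build_masked_prompt(
    ["Elephants have trunks.", "Elephants live in herds."],
    "An elephant uses its [ANSWER] to drink water.",
)
# The prompt could then be scored with, e.g., the HuggingFace fill-mask
# pipeline (not run here):
#   from transformers import pipeline
#   pipeline("fill-mask", model="roberta-base")(prompt)
```

For GPT-2 (generation) or ALBERT (span prediction), the same context-plus-question string would be passed to the respective pipeline instead.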
There are certainly better approaches for this. Our approach will obviously produce some grammatical mistakes, but I would not worry much about that, as large pre-trained LMs can deal with such mistakes reasonably well. For a canonical schema like ConceptNet's, you can look at Table 9 in this paper (https://arxiv.org/pdf/2010.05953.pdf) for translation templates.
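The exact template was not preserved in this thread, but an "embarrassingly simple" conversion of a (subject, predicate, object) assertion might look like the following hypothetical sketch, which just joins the three parts with spaces:

```python
def assertion_to_text(subject, predicate, obj):
    """Turn an (s, p, o) assertion into a rough natural-language sentence
    by plain concatenation. Illustrative only; not the authors' exact template.
    """
    return f"{subject} {predicate} {obj}."


sentence = assertion_to_text("elephant", "has", "trunk")  # "elephant has trunk."
```

Output such as "elephant has trunk." is ungrammatical (missing articles), which is exactly the kind of mistake the reply above says large pre-trained LMs tolerate reasonably well.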
Thank you a lot :)
Hello, I'm really interested in the extrinsic evaluation part of your paper. I read the papers you cited, but some details are still unclear: for example, how to convert assertions to natural language (which are used as context), and some other training details of this project. Could you provide the code or point to related projects? Maybe LAMA is a good start?