Format of the ranking candidates file for GrailQA #6

zluw1117 · 2022-04-22T14:36:40Z

Hi. I am interested in the ranker part of this project and am currently setting up the environment. However, it looks like the preceding steps could be time-consuming. Could I get some quick information on the format of the output files for:

python enumerate_candidates.py --split train # we use gt entity for training (so no need for prediction on training)
python enumerate_candidates.py --split dev --pred_file misc/grail_dev_entity_linking.json

Thanks!

xiye17 · 2022-04-22T15:32:26Z

Sure. It will be a JSON Lines file where each line is a JSON object holding the candidate information for one question.

{
  "qid": "2100278008000",
  "s_expression": "(AND cvg.game_version (JOIN cvg.game_version.producer m.0ds98f))",  # the ground truth logical form
  "candidates": [  # logical form candidates
    {
      "logical_form": "(AND cvg.game_version (JOIN cvg.game_version.publisher m.0ds98f))",
      "ex": false  # whether the logical form is equivalent to the ground truth
    },
    {
      "logical_form": "(COUNT (AND cvg.game_version (JOIN cvg.game_version.publisher m.0ds98f)))",
      "ex": false
    },
    ........
  ]
}
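
For readers who want to inspect such a file, here is a minimal sketch of a reader. It assumes only what is described above: one JSON object per line with `qid`, `s_expression`, and `candidates` fields. The function names `read_candidates` and `summarize` are hypothetical and not part of the repository, and the `#` annotations in the example above are presumably explanatory notes that do not appear in the real file.

```python
import json
from typing import Iterator


def read_candidates(path: str) -> Iterator[dict]:
    """Yield one record per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def summarize(record: dict) -> tuple:
    """Return (qid, number of candidates, number of candidates with ex == true)."""
    cands = record["candidates"]
    return record["qid"], len(cands), sum(1 for c in cands if c["ex"])
```

Calling `summarize` on each record gives a quick view of how many candidates each question has and how many of them are flagged as equivalent to the ground truth.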

zluw1117 · 2022-04-22T16:02:30Z

Thanks!
Another quick question: what is 'ex' here? How is it used in ranker training?

xiye17 · 2022-04-22T16:15:46Z

ex: true means that this logical form is equivalent to the ground truth logical form.

If ex is true, we will not use this logical form as a negative candidate (you don't want to penalize a logical form that is equivalent to the ground truth).
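
The filtering rule above can be sketched as follows. This is only an illustration under assumptions: the helper name `build_ranking_example`, the uniform random sampling of negatives, and placing the ground truth at index 0 are all hypothetical choices, and the repository's actual sampling code may differ.

```python
import random


def build_ranking_example(record, num_negatives, rng):
    """Return (candidate list, index of the correct candidate).

    Candidates with ex == true are never used as negatives, because they
    are equivalent to the ground truth and should not be penalized.
    """
    gold = record["s_expression"]
    negatives = [c["logical_form"] for c in record["candidates"] if not c["ex"]]
    # Assumption: sample negatives uniformly without replacement.
    sampled = rng.sample(negatives, min(num_negatives, len(negatives)))
    # Assumption: the ground truth goes first, so the label index is 0.
    return [gold] + sampled, 0
```

Passing a seeded `random.Random` instance as `rng` keeps the sampling reproducible.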

zluw1117 · 2022-04-22T16:17:03Z

Got it! Thank you for the detailed explanation!

zluw1117 · 2022-04-22T17:33:00Z

Sorry, one more question.
Where in the codebase can we see that the RoBERTa/BERT model uses a contrastive loss for the ranker? Thanks.

xiye17 · 2022-04-23T06:37:20Z

https://github.com/salesforce/rng-kbqa/blob/main/framework/models/BertRanker.py

Lines 76-82

alik-git · 2022-07-06T16:35:12Z

Hi there, I just want to follow up on this thread as I am doing something similar.

I am curious what the input batch looks like before preprocessing, when it is being passed through the RobertaRanker.py file. Specifically, I would like to know what the input text is, rather than the input_ids that are eventually passed to the model's forward function.

Thank you in advance for the information!

xiye17 · 2022-07-06T16:59:29Z

The code for processing the logical form can be found in

rng-kbqa/framework/components/rank_dataset.py

Line 69 in 80d56b5

def _vanilla_linearization_method(expr, entity_label_map):

Basically, we tokenize the expression, replace "_" in relations with " ", and replace each entity "m.xxxx" with its label.
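
As a rough illustration of those three steps, here is a minimal sketch. It is not the repository's `_vanilla_linearization_method`: the function name `linearize`, the treatment of parentheses as separate tokens, and the fallback of keeping an entity id when no label is known are all assumptions, and the label used in the example is a made-up placeholder.

```python
def linearize(expr, entity_label_map):
    """Turn an s-expression into plain text for a text encoder."""
    # Step 1: tokenize, treating each parenthesis as its own token.
    tokens = expr.replace("(", " ( ").replace(")", " ) ").split()
    out = []
    for tok in tokens:
        if tok.startswith("m."):
            # Step 3: replace an entity id with its label (keep the id if unknown).
            out.append(entity_label_map.get(tok, tok))
        else:
            # Step 2: replace "_" with " " in relations and classes.
            out.append(tok.replace("_", " "))
    return " ".join(out)


example = linearize(
    "(AND cvg.game_version (JOIN cvg.game_version.producer m.0ds98f))",
    {"m.0ds98f": "placeholder label"},  # hypothetical label, not the real one
)
# example == "( AND cvg.game version ( JOIN cvg.game version.producer placeholder label ) )"
```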

alik-git · 2022-07-11T17:38:29Z

Thank you for the answer. I am trying to understand the forward function of RobertaRanker.py, and I am confused about the tensor dimensions when you compute the loss. I'm referring to this part:

rng-kbqa/framework/models/RobertaRanker.py

Line 76 in 80d56b5

logits = logits.view((batch_size, sample_size))

Why do you reshape the logits to be [batch_size, sample_size] when computing the loss, but then call view(-1) on the labels? Won't this cause the labels to be in the shape [batch_size * sample_size] and cause a shape mismatch error?
Edit: Follow-up question: For re-ranking 5 predictions, is your label a one-hot vector in the form of [1,0,0,0,0] or the index of the correct classification such as [0]?

My intuition says that both the logits and labels should be in the shape [batch_size * sample_size] so maybe I am misunderstanding something. If my question is not clear please ask me and I can clarify. Thank you again for your answers.

xiye17 · 2022-07-11T23:47:29Z

logits: [batch_size, sample_size], labels: [batch_size].
The label vector holds the index of the correct sample for each question.
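
To see why these shapes fit together, here is a small pure-Python sketch, assuming the loss is a standard softmax cross-entropy over the candidates of each question. The helper names `reshape` and `cross_entropy` are hypothetical; in PyTorch, `torch.nn.functional.cross_entropy` follows the same convention of an `(N, C)` input with `N` class-index targets.

```python
import math


def reshape(flat_logits, batch_size, sample_size):
    """Group a flat list of scores into batch_size rows of sample_size scores."""
    assert len(flat_logits) == batch_size * sample_size
    return [flat_logits[i * sample_size:(i + 1) * sample_size]
            for i in range(batch_size)]


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy; labels[i] indexes the correct column of row i."""
    assert len(logits) == len(labels)  # one label per question, not per candidate
    total = 0.0
    for row, gold in zip(logits, labels):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[gold]
    return total / len(labels)


# Two questions with three candidates each: six scores, but only two labels.
logits = reshape([2.0, 0.0, 0.0, 0.0, 0.0, 2.0], batch_size=2, sample_size=3)
loss = cross_entropy(logits, labels=[0, 2])
```

Each row is treated as one classification problem over its candidates, so a one-hot vector per candidate is not needed: a single index per question is enough.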

zluw1117 closed this as completed Apr 22, 2022

zluw1117 reopened this Apr 22, 2022

zluw1117 closed this as completed Apr 25, 2022