verbalizers for trec dataset #11

Closed
dorost1234 opened this issue Nov 11, 2021 · 1 comment

dorost1234 commented Nov 11, 2021

Hi,
It seems to me that "Location" and "Number" need to be swapped in the verbalizers, since label 4 (zero-indexed) corresponds to "number" and label 5 corresponds to "location".

Here is the link to the dataset labels: https://huggingface.co/datasets/viewer/ (search for "trec").

Here are the current verbalizers: ["Description", "Entity", "Expression", "Human", "Location", "Number"]

Thanks

Owner

shmsw25 commented Nov 11, 2021

Hi @dorost1234,

We did not use the Hugging Face version of the data. Instead, we downloaded the data from here, and I believe the two versions of the data use different label orderings.
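To illustrate what a differing label ordering means in practice, here is a minimal sketch of remapping label ids from one ordering to the other by class name. The second ordering is an assumption taken only from the claim in this issue (label 4 = "Number", label 5 = "Location"); it has not been checked against either copy of the data, and `build_label_remap` is a hypothetical helper, not something from this repository.

```python
# Verbalizers as quoted in this issue (the ordering used by this repository).
REPO_VERBALIZERS = ["Description", "Entity", "Expression", "Human", "Location", "Number"]

# ASSUMPTION: ordering of the other copy of the data, based only on the claim
# in this issue that label 4 is "Number" and label 5 is "Location" there.
OTHER_VERBALIZERS = ["Description", "Entity", "Expression", "Human", "Number", "Location"]


def build_label_remap(src_names, dst_names):
    """Map each label id in the source ordering to its id in the target ordering."""
    if sorted(src_names) != sorted(dst_names):
        raise ValueError("both orderings must contain the same class names")
    dst_index = {name: i for i, name in enumerate(dst_names)}
    return {i: dst_index[name] for i, name in enumerate(src_names)}


# Remap ids from the assumed other ordering into this repository's ordering.
remap = build_label_remap(OTHER_VERBALIZERS, REPO_VERBALIZERS)
print(remap)
```

Matching on class names rather than raw integer ids avoids this kind of confusion whenever two copies of the same dataset are mixed.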

I manually checked the examples and double-checked that the current verbalizers are correct. For example, label "4" includes

  • What continent's name appears on the upper left corner of a Budweiser label?
  • What European city do Nicois live in?
  • Where is the Isle of Man?

(all location related)

label "5" includes

  • How many bails are there in a cricket wicket?
  • At what age did Rossini stop writing opera?
  • What was the first minimum wage?

(all number related)
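A spot check like the one above can be sketched in a few lines: group examples by label id and print a handful under the verbalizer assigned to that id. The `SAMPLES` list below only reuses questions quoted in this thread, and `group_by_label` and `spot_check` are hypothetical helper names; loading the actual data file is left out because its format is not described here.

```python
from collections import defaultdict

# Verbalizers as quoted in this issue, indexed by label id.
VERBALIZERS = ["Description", "Entity", "Expression", "Human", "Location", "Number"]

# (question, label id) pairs taken from the examples quoted in this thread.
SAMPLES = [
    ("What continent's name appears on the upper left corner of a Budweiser label?", 4),
    ("What European city do Nicois live in?", 4),
    ("Where is the Isle of Man?", 4),
    ("How many bails are there in a cricket wicket?", 5),
    ("At what age did Rossini stop writing opera?", 5),
]


def group_by_label(samples):
    """Collect question texts under their integer label id."""
    grouped = defaultdict(list)
    for text, label in samples:
        grouped[label].append(text)
    return dict(grouped)


def spot_check(samples, verbalizers, per_label=3):
    """Build a short report pairing each label's verbalizer with a few examples."""
    lines = []
    for label, texts in sorted(group_by_label(samples).items()):
        lines.append(f"label {label} -> {verbalizers[label]}")
        lines.extend(f"  - {text}" for text in texts[:per_label])
    return "\n".join(lines)


report = spot_check(SAMPLES, VERBALIZERS)
print(report)
```

Reading the printed groups by eye is enough to tell whether each verbalizer matches the questions listed under it.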

shmsw25 closed this as completed Nov 11, 2021