Add Indic xnli pair classification #581
Waiting until #582.
This is not really an issue. As @loicmagne already answered, you need to adjust your dataset to match the format expected by the PC task. Please check the existing examples of PC tasks. We'll think about updating this format, but for now it's not a priority. Note that such an update would require updating all mteb/*PC datasets.
# removing this indic xnli is 3-way (entailment, neutral, )
# for label in labels:
#     assert label == 0 or label == 1
Undo this change and filter out the neutral values just as it has been done here: XNLI
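The filtering suggested above could look roughly like the sketch below. It is not the code from the XNLI task. It assumes the common NLI label encoding (0 = entailment, 1 = neutral, 2 = contradiction) and rows with premise, hypothesis and label fields. The helper name to_binary_pairs and the output keys sentence1, sentence2 and labels are hypothetical and should be checked against the existing PC tasks.

```python
# Assumed label encoding (common for NLI datasets; verify for Indic XNLI).
ENTAILMENT, NEUTRAL, CONTRADICTION = 0, 1, 2


def to_binary_pairs(rows):
    """Drop neutral pairs and map the rest to binary labels.

    Hypothetical helper: entailment becomes 1, contradiction becomes 0.
    """
    kept = [row for row in rows if row["label"] != NEUTRAL]
    return {
        "sentence1": [row["premise"] for row in kept],
        "sentence2": [row["hypothesis"] for row in kept],
        "labels": [1 if row["label"] == ENTAILMENT else 0 for row in kept],
    }


# Toy rows for illustration only, not taken from the dataset.
rows = [
    {"premise": "A man is eating.", "hypothesis": "Someone eats.", "label": ENTAILMENT},
    {"premise": "A man is eating.", "hypothesis": "He likes rice.", "label": NEUTRAL},
    {"premise": "A man is eating.", "hypothesis": "Nobody eats.", "label": CONTRADICTION},
]

pairs = to_binary_pairs(rows)

# With neutral rows removed, the original binary-label check holds again.
for label in pairs["labels"]:
    assert label == 0 or label == 1
```

With the neutral rows filtered out this way, the assertion that was commented out in the diff can stay in place.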
From the checklist it seems like this is still a work in progress. Will close it for now, but feel free to re-open it.
Checklist for adding MMTEB dataset
Reason for dataset addition:
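- Tested that the dataset runs with the mteb package.
- Ran the following models on the task, using the mteb run -m {model_name} -t {task_name} command:
  - sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
  - intfloat/multilingual-e5-small
- If the dataset is too big, consider using self.stratified_subsampling() under dataset_transform().
- Run the tests locally using make test.
- Run the formatter using make lint.
- Added points for the submission, using the PR number as the filename (e.g. 438.jsonl).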