I want to know the training/test splits of the BUCC dataset.
The paper says "we evaluate representations on the test sets directly", but the training data are renamed to the test set here:
for f in $base_dir/*training*; do mv $f ${f/training/test}; done
So which split is used as the test set of BUCC in XTREME? The training set or the test set?
Thanks.
Hi! Thanks for the question and the close look at the code. Good catch! :)
We used the training set of the BUCC task for evaluation in XTREME as the BUCC test set is private. We might update the evaluation in the future if we get access to the original test data. This is not a problem in practice as no new parameters are learned for this task. @JunjieHu, do you have anything to add?
Thanks for the question! Indeed, as Sebastian said, the original test data is private. So we use the released training set as our evaluation test set, and the released sample set as our dev set, which participants can use, for example, to find a threshold for mining the bitext.
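To make the dev-set usage concrete, here is a minimal sketch of picking a mining threshold by maximizing F1 on the sample (dev) split. It is an illustration only, not the XTREME implementation: the function name `best_threshold`, its arguments, and the toy scores are all hypothetical, and the similarity scoring method itself (how `scores` are produced) is assumed to be given.

```python
def best_threshold(scores, labels, n_gold):
    """Pick the score threshold that maximizes F1 on a dev set.

    scores: similarity score of each candidate sentence pair (hypothetical input)
    labels: 1 if the candidate pair is a gold translation pair, else 0
    n_gold: total number of gold pairs in the dev set
    A pair is treated as mined when its score >= threshold.
    """
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    best_t, best_f1 = float("inf"), 0.0
    tp = 0
    for i, (score, is_gold) in enumerate(ranked, start=1):
        tp += is_gold
        # With tied scores, only evaluate once the whole tie group is included.
        if i < len(ranked) and ranked[i][0] == score:
            continue
        if tp == 0:
            continue
        precision = tp / i
        recall = tp / n_gold
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_t, best_f1 = score, f1
    return best_t, best_f1


# Toy dev-set candidates (made-up numbers for illustration).
dev_scores = [0.9, 0.8, 0.7, 0.4, 0.3]
dev_labels = [1, 1, 0, 1, 0]
threshold, dev_f1 = best_threshold(dev_scores, dev_labels, n_gold=3)

# The threshold found on the dev split is then applied unchanged to the
# evaluation split (here, the released BUCC training data).
eval_scores = [0.95, 0.35, 0.5]
mined = [s for s in eval_scores if s >= threshold]
```

Because the threshold is the only quantity fitted, and it is fitted on the sample set, evaluating on the released training set does not leak any learned parameters.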