Add machine translated multilingual STS benchmark dataset #2090
Hello dear maintainer, are there any comments or questions about this PR?
Really cool, thank you :)
The dataset script looks good! Good job.
The dummy data and the dataset_infos.json are also perfect :)
For the README, can you follow the README.md template? You can find it here:
https://github.com/huggingface/datasets/tree/master/templates
Ideally, it would be cool to fill in at least these sections:
- Dataset Summary
- Languages
- Data Instances
- Data Fields
- Data Splits
Let me know if you have questions about this!
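As a rough aid for the request above, here is a minimal sketch of a check that a dataset card contains the five listed sections. The helper name `missing_sections` and the sample text are hypothetical, and the check only looks for Markdown headings with matching titles; it is not part of the `datasets` tooling or its CI.

```python
import re

# Section titles taken from the review comment above.
REQUIRED_SECTIONS = [
    "Dataset Summary",
    "Languages",
    "Data Instances",
    "Data Fields",
    "Data Splits",
]


def missing_sections(readme_text, required=REQUIRED_SECTIONS):
    """Return the required section names that have no Markdown heading in readme_text.

    Hypothetical helper: matches ATX headings ("# Title" to "###### Title"),
    case-insensitively, and ignores optional trailing "#" characters.
    """
    headings = set()
    for line in readme_text.splitlines():
        match = re.match(r"^\s{0,3}#{1,6}\s+(.*?)\s*#*\s*$", line)
        if match:
            headings.add(match.group(1).strip().lower())
    return [name for name in required if name.lower() not in headings]


# Made-up README fragment that only covers two of the five sections.
sample = "# Dataset Card\n\n## Dataset Summary\nSome text.\n\n### Languages\nde, en\n"
print(missing_sections(sample))  # ['Data Instances', 'Data Fields', 'Data Splits']
```

To check a real card, read the file first, for example `missing_sections(open("README.md", encoding="utf-8").read())`; an empty list means every listed section has a heading.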
Force-pushed from 3b8a4ff to 550d715
@iamollas thanks for the feedback. I did not see the template.
Should be clean for merge IMO.
Thank you!
I just added the table of contents and the missing sections in the dataset card :)
@lhoestq CI is green. ;-)
Thanks again! This is awesome :)
Thanks for merging. :-)
Also see https://github.com/PhilipMay/stsb-multi-mt