Restore order of split names in dataset_info for canonical datasets #5258

Closed
albertvillanova opened this issue Nov 17, 2022 · 3 comments

albertvillanova (Member) commented Nov 17, 2022

After a bulk edit of canonical datasets to create the YAML dataset_info metadata, the split names were accidentally sorted alphabetically. See for example:

Note that the order stored in this metadata is the one shown in the dataset preview.

I'm making a bulk edit to align the order of the splits appearing in the metadata info with the order appearing in the loading script.
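
As a rough illustration of the realignment described above, here is a minimal sketch. It assumes each split entry in the metadata is a dict with a `name` key and that the split order from the loading script is already available as a list. `reorder_splits` is a hypothetical helper, not part of the `datasets` library or of the actual bulk-edit script, and the example counts are made up.

```python
def reorder_splits(metadata_splits, script_order):
    """Return the metadata split entries in the order used by the loading script.

    Splits that are not listed in script_order keep their relative order
    and are placed last.
    """
    rank = {name: i for i, name in enumerate(script_order)}
    # sorted() is stable, so unknown splits keep their original relative order.
    return sorted(metadata_splits, key=lambda s: rank.get(s["name"], len(rank)))


# Alphabetically sorted entries, as left by the first bulk edit (illustrative values).
metadata_splits = [
    {"name": "test", "num_examples": 100},
    {"name": "train", "num_examples": 800},
    {"name": "validation", "num_examples": 100},
]
script_order = ["train", "validation", "test"]  # assumed order from the loading script

fixed = reorder_splits(metadata_splits, script_order)
print([s["name"] for s in fixed])  # ['train', 'validation', 'test']
```

Sorting by rank rather than rebuilding the list from `script_order` means an entry is never dropped when the metadata contains a split the script order does not mention.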

Related to:

albertvillanova added the "dataset contribution" (Contribution to a dataset script) label Nov 17, 2022
albertvillanova self-assigned this Nov 17, 2022
albertvillanova (Member, Author) commented Nov 18, 2022

The bulk edit is running...

See for example:

albertvillanova (Member, Author) commented Nov 18, 2022

TODO: Add "dataset_info" YAML metadata to:

  • "chr_en" has neither a metadata JSON file nor a "dataset_info" YAML tag in its card
  • "conll2000" has no metadata JSON file, but it has a "dataset_info" YAML tag in its card
  • "crime_and_punish" has no metadata JSON file, but it has a "dataset_info" YAML tag in its card
  • "dart" has no metadata JSON file, but it has a "dataset_info" YAML tag in its card
  • "iwslt2017" has no metadata JSON file, but it has a "dataset_info" YAML tag in its card
  • "mc4" has neither a metadata JSON file nor a "dataset_info" YAML tag in its card
  • "the_pile" has neither a metadata JSON file nor a "dataset_info" YAML tag in its card
  • "timit_asr" has neither a metadata JSON file nor a "dataset_info" YAML tag in its card
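
A check like the one behind the list above could be sketched as follows. This is only an assumption about how such a scan might look: it takes the "metadata JSON file" to be `dataset_infos.json` and the card to be `README.md` in each dataset directory, and it detects the tag with a plain text scan of the YAML front matter instead of a real YAML parser. `has_dataset_info_tag` and `metadata_status` are hypothetical names.

```python
import tempfile
from pathlib import Path


def has_dataset_info_tag(readme_text):
    """Check whether the card's YAML front matter has a top-level dataset_info key."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return False  # no YAML front matter at all
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the front matter
        if line.startswith("dataset_info:"):
            return True
    return False


def metadata_status(dataset_dir):
    """Return (has_json_file, has_yaml_tag) for one dataset directory."""
    dataset_dir = Path(dataset_dir)
    has_json = (dataset_dir / "dataset_infos.json").is_file()  # assumed file name
    readme = dataset_dir / "README.md"  # assumed card file name
    has_tag = readme.is_file() and has_dataset_info_tag(readme.read_text(encoding="utf-8"))
    return has_json, has_tag


# Small demo on a throwaway directory that has a card but no JSON file.
with tempfile.TemporaryDirectory() as tmp:
    card = Path(tmp) / "README.md"
    card.write_text("---\ndataset_info:\n  splits: []\n---\n# Card\n", encoding="utf-8")
    print(metadata_status(tmp))  # (False, True)
```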

albertvillanova (Member, Author) commented

The bulk edit is finished.
