Defining a large dataset with several splits under it

A recurring pattern in our datasets is that a single dataset has multiple splits under it. The splits differ by language, subtask, train/dev/test partition, etc., but share the same file structure.
Our current implementation either assumes one dataset class per split (notably because the metadata specifies the dataset language, for example), or relies on an ad hoc approach that reuses the same dataset class and passes different split file names from the different assets.

I think we should find a more unified way to handle such cases (e.g., a parent dataset with subsets under it, where the subsets differ only in their metadata).
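
To make the parent/subset idea concrete, here is a minimal sketch of one possible shape. Everything in it is hypothetical and for illustration only: `SubsetInfo`, `ParentDataset`, the `"ar"`/`"en"` subsets, the file names, and the tab-separated format are assumptions, not our current classes or data layout.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class SubsetInfo:
    """Metadata that differs between subsets (hypothetical fields)."""

    language: str
    splits: Dict[str, str]  # split name -> file name


class ParentDataset:
    """Hypothetical parent dataset: one loader, many metadata-only subsets."""

    # Metadata shared by every subset (placeholder value).
    shared_metadata = {"task": "example-task"}

    # Per-subset metadata; the file structure is identical across subsets.
    subsets: Dict[str, SubsetInfo] = {
        "ar": SubsetInfo("ar", {"train": "ar/train.tsv", "test": "ar/test.tsv"}),
        "en": SubsetInfo("en", {"train": "en/train.tsv", "test": "en/test.tsv"}),
    }

    def __init__(self, subset: str):
        if subset not in self.subsets:
            raise ValueError(f"Unknown subset: {subset!r}")
        self.subset = subset

    def metadata(self) -> dict:
        """Merge the shared metadata with the selected subset's metadata."""
        info = self.subsets[self.subset]
        return {
            **self.shared_metadata,
            "language": info.language,
            "splits": sorted(info.splits),
        }

    def data_path(self, split: str) -> str:
        """Resolve a split name to its file name for the selected subset."""
        return self.subsets[self.subset].splits[split]

    @staticmethod
    def parse(lines: Iterable[str]) -> List[dict]:
        """Parsing logic written once, since all subsets share one format
        (assumed here to be two tab-separated columns: input, label)."""
        samples = []
        for line in lines:
            text, label = line.rstrip("\n").split("\t")
            samples.append({"input": text, "label": label})
        return samples
```

With something like this, an asset would name the parent dataset plus a subset key (e.g., `ParentDataset("ar").data_path("test")`) instead of needing a separate dataset class per split or passing raw file names around.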

Defining a large dataset with several splits under it #199
