Made dataset splits possible during export #1162

Closed
Arthemide opened this issue Apr 19, 2023 · 1 comment

@Arthemide

I am working on a project using Kili and exporting my datasets with different types of annotations.

I think it would be useful to add a new feature that allows me to split the datasets into train/validation/test sets directly during the export process.
Currently, I have to write a separate script to split the data after the Kili export, and I suspect most other data scientists have to do the same.

Is this feature being developed internally, and do you think it would be a valuable addition?

@Jonas1312
Contributor

Jonas1312 commented Apr 19, 2023

Hi,

If you want to create three separate exports, one per fold, you can use the asset_ids, external_ids, or asset_filter_kwargs parameters of the kili.export_labels() method.

For example, you can get the asset IDs with kili.assets(), split those IDs into three folds, and call kili.export_labels() three times, passing a different fold to the asset_ids parameter each time.
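
A minimal sketch of that approach might look like the following. It is an illustration under stated assumptions, not an official recipe: the helper names split_ids and export_folds, the 80/10/10 ratios, the seed, the output filenames, and the "kili" export format are all illustrative choices, and `kili` is assumed to be an already authenticated client whose kili.assets() call returns a list of dicts with an "id" key.

```python
import random
from typing import Dict, List, Sequence


def split_ids(
    ids: Sequence[str],
    ratios: Sequence[float] = (0.8, 0.1, 0.1),  # assumed train/val/test ratios
    seed: int = 42,  # assumed seed, fixed so the split is reproducible
) -> Dict[str, List[str]]:
    """Shuffle the ids deterministically and cut them into three folds."""
    if len(ratios) != 3 or abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must be three values that sum to 1")
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_val = int(len(shuffled) * ratios[1])
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        # The test fold takes whatever is left, so no id is dropped.
        "test": shuffled[n_train + n_val:],
    }


def export_folds(kili, project_id: str, fmt: str = "kili") -> Dict[str, List[str]]:
    """Export one archive per fold (hypothetical helper, not part of the SDK)."""
    # Only the asset ids are needed to build the folds.
    assets = kili.assets(project_id=project_id, fields=["id"])
    folds = split_ids([asset["id"] for asset in assets])
    for name, fold_ids in folds.items():
        if not fold_ids:
            continue  # skip empty folds on very small projects
        kili.export_labels(
            project_id=project_id,
            filename=f"{name}.zip",  # assumed output names
            fmt=fmt,
            asset_ids=fold_ids,
        )
    return folds
```

Fixing the seed keeps the folds stable across repeated exports of an unchanged project; assets added later will change the shuffle, so storing the fold assignment somewhere (for instance in asset metadata) is safer for long-running projects.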

You could also add metadata to your assets, as described in this tutorial.

Would this solution solve your issue?
