Use dataset.set_format in dataset collator #18

Open
cemilcengiz opened this issue Sep 28, 2021 · 0 comments
Labels
enhancement New feature or request

Comments

@cemilcengiz
Contributor

Instead of converting the columns to tensors manually in the collator, we can call something like
dataset.set_format(type='torch', columns=['input_ids', 'token_type_ids', 'attention_mask', 'label'])
as shown in the datasets docs. Since only the listed columns are returned once the format is set, this would also conveniently drop the unused columns (those the models do not require).

@cemilcengiz cemilcengiz added the enhancement New feature or request label Sep 28, 2021