You should supply an instance of transformers.BatchFeature or list of transformers.BatchFeature to this method that includes input_values, but you provided ['file', 'audio', 'label'] #25748
Comments
Hey @c1ekrt - thanks for the issue report. Unfortunately, I'm not able to reproduce the error you're facing with the given command. I launched training using the arguments you provided, and training was executed successfully. See logs at wandb. Could you confirm that you are using the latest version of the examples script without modifications? Thanks!
Thanks for replying! I will reinstall the package and rerun the example after this weekend.
I had modified two lines because this error message popped up:
So I changed the code at line 400 to:
After that, the transformers.BatchFeature error popped up. I have reinstalled the transformers package, but the issue remains.
Hey @c1ekrt - you can't pass the raw dataset to the Trainer directly; it needs the pre-processed dataset, for example:
# Initialize Trainer
trainer = Trainer(
model=model,
data_collator=data_collator,
args=training_args,
compute_metrics=compute_metrics,
train_dataset=vectorized_datasets["train"] if training_args.do_train else None,
eval_dataset=vectorized_datasets["eval"] if training_args.do_eval else None,
tokenizer=processor,
)
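The error in the title lists the raw columns (`file`, `audio`, `label`) where the model input `input_values` was expected. As a rough, framework-free sketch of what the pre-processing step has to produce before features reach the collator: `fake_feature_extractor`, `preprocess_batch`, and `missing_model_input` are hypothetical names made up for illustration, not functions from transformers or the example script, and the column names are taken from the error message.

```python
# Sketch only: every function here is a hypothetical stand-in, not part of
# transformers or the audio classification example script.

def fake_feature_extractor(waveforms):
    """Stand-in for a real feature extractor: returns the waveforms as floats."""
    return {"input_values": [[float(x) for x in wav] for wav in waveforms]}


def preprocess_batch(batch, model_input_name="input_values"):
    """Turn the raw columns (file/audio/label) into model inputs plus labels."""
    waveforms = [audio["array"] for audio in batch["audio"]]
    inputs = fake_feature_extractor(waveforms)
    return {
        model_input_name: inputs[model_input_name],
        "labels": [int(label) for label in batch["label"]],
    }


def missing_model_input(batch, model_input_name="input_values"):
    """True when a batch lacks the key the error message asks for."""
    return model_input_name not in batch


raw_batch = {
    "file": ["a.wav", "b.wav"],
    "audio": [{"array": [0.0, 0.1]}, {"array": [0.2, 0.3]}],
    "label": [1, 0],
}
processed = preprocess_batch(raw_batch)

print(sorted(raw_batch), missing_model_input(raw_batch))
print(sorted(processed), missing_model_input(processed))
```

If the raw batch reaches the collator unchanged, the check reports the missing key, which matches the situation the error message describes.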
I still can't get the example to work. The pre-processing part of the code, which is this section,
never runs despite set_transform being called.
All of this code is unmodified.
Indeed, the pre-processing function is defined here:
And the transformation is applied here:
Can you try running the script unchanged from the default script provided? As mentioned above, I can do a training run using the command you provided without any issue. It's worth trying updating the
And checking that your PyTorch version is up to date (maybe even try the nightly install?)
OK. It seems that the 'label' column of the superb dataset, which is passed into the cross-entropy calculation, had the wrong dtype. Hence the error
occurred. After changing the dtype to torch.int64, the code runs without any error.
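The thread does not show where the cast was applied, so the following is only a minimal sketch of the idea, assuming PyTorch is installed: class-index targets for `torch.nn.functional.cross_entropy` are expected to be `torch.int64`, and labels arriving in another integer dtype can be cast before the loss is computed. The tensor shapes and values are made up for illustration.

```python
import torch
import torch.nn.functional as F

# Illustrative values: 4 examples, 3 classes.
logits = torch.randn(4, 3)

# Labels that arrived as a narrower integer dtype (assumed for this sketch).
labels = torch.tensor([0, 2, 1, 2], dtype=torch.int32)

# Cast to int64 (torch.long) so they are valid class-index targets.
labels = labels.to(torch.int64)

loss = F.cross_entropy(logits, labels)
print(labels.dtype, loss.item())
```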
Interesting! I couldn't repro this on my side. Will leave as closed for now, but feel free to re-open if you see this phenomenon in the examples scripts again. Sorry we didn't find the complete fix this time!
I faced the exact same issue. For me, upgrading datasets (pip3 install --upgrade datasets) did the trick.
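Since the suggestions in this thread come down to package versions, a quick way to see what is installed before and after upgrading can help. This is a small sketch using only the standard library; `installed_versions` is a hypothetical helper name, and the package list is just the three libraries mentioned in the thread.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(["transformers", "datasets", "torch"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Including this output in a report makes it easier for maintainers to compare environments when they cannot reproduce an error.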
System Info
transformers version: 4.33.0.dev0
Who can help?
@sanchit-gandhi
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Run audio_classification_CMD.py with the following arguments:
However, it returns the error below:
Expected behavior
Expected training to start.