You should supply an instance of transformers.BatchFeature or list of transformers.BatchFeature to this method that includes input_values, but you provided ['file', 'audio', 'label'] #25748

Closed
2 of 4 tasks
c1ekrt opened this issue Aug 25, 2023 · 10 comments

Comments

@c1ekrt

c1ekrt commented Aug 25, 2023

System Info

  • transformers version: 4.33.0.dev0
  • Platform: Windows-10-10.0.22621-SP0
  • Python version: 3.10.12
  • Huggingface_hub version: 0.16.4
  • Safetensors version: 0.3.2
  • Accelerate version: 0.21.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.0.1+cu118 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

@sanchit-gandhi

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Run audio_classification_CMD.py with the following arguments:

audio_classification_CMD.py
run_audio_classification.py --output_dir .\output --overwrite_output_dir --model_name_or_path facebook/wav2vec2-base --dataset_name superb  --dataset_config_name ks --hub_model_id Audio_Classification --do_train --do_eval --fp16 --train_split_name train --remove_unused_columns False --load_best_model_at_end --metric_for_best_model accuracy --gradient_accumulation_steps 4 --push_to_hub --push_to_hub_model_id Audio_Classification --save_safetensors --save_step 200 --save_strategy epoch --evaluation_strategy epoch --logging_strategy steps --logging_steps 10 --max_length_seconds 1 --seed 0 --num_train_epochs 5 --save_total_limit 3 --learning_rate 3e-5 --per_device_train_batch_size 16 --per_device_eval_batch_size 3 --warmup_ratio 0.1

However, it returns the error below:

Traceback (most recent call last):
  File "D:\Jhou's Workshop\transformers-main\examples\pytorch\audio-classification\run_audio_classification.py", line 443, in <module>
    main()
  File "D:\Jhou's Workshop\transformers-main\examples\pytorch\audio-classification\run_audio_classification.py", line 417, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\transformers\trainer.py", line 1546, in train
    return inner_training_loop(
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\transformers\trainer.py", line 1815, in _inner_training_loop
    for step, inputs in enumerate(epoch_iterator):
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\accelerate\data_loader.py", line 384, in __iter__
    current_batch = next(dataloader_iter)
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\torch\utils\data\dataloader.py", line 633, in __next__
    data = self._next_data()
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\torch\utils\data\dataloader.py", line 677, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\torch\utils\data\_utils\fetch.py", line 54, in fetch
    return self.collate_fn(data)
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\transformers\data\data_collator.py", line 249, in __call__
    batch = self.tokenizer.pad(
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\transformers\feature_extraction_sequence_utils.py", line 132, in pad
    raise ValueError(
ValueError: You should supply an instance of `transformers.BatchFeature` or list of `transformers.BatchFeature` to this method that includes input_values, but you provided ['file', 'audio', 'label']
  0%|          | 0/5055 [00:01<?, ?it/s]

Expected behavior

Training should start.

@ArthurZucker
Collaborator

cc @sanchit-gandhi

@sanchit-gandhi
Contributor

Hey @c1ekrt - thanks for the issue report. Unfortunately, I'm not able to reproduce the error you're facing with the given command. I launched training using the arguments you provided, and training was executed successfully. See logs at wandb. Could you confirm that you are using the latest version of the examples script without modifications? Thanks!

@c1ekrt
Author

c1ekrt commented Aug 26, 2023

Thanks for replying! I will reinstall the package and rerun the example after this weekend.

@c1ekrt
Author

c1ekrt commented Aug 28, 2023

I had modified two lines because this error message appeared:

  File "D:\Jhou's Workshop\transformers-main\examples\pytorch\audio-classification\run_audio_classification.py", line 443, in <module>
    main()
  File "D:\Jhou's Workshop\transformers-main\examples\pytorch\audio-classification\run_audio_classification.py", line 417, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\transformers\trainer.py", line 1546, in train
    return inner_training_loop(
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\transformers\trainer.py", line 1837, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\transformers\trainer.py", line 2682, in training_step
    loss = self.compute_loss(model, inputs)
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\transformers\trainer.py", line 2707, in compute_loss
    outputs = model(**inputs)
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\accelerate\utils\operations.py", line 581, in forward
    return model_forward(*args, **kwargs)
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\accelerate\utils\operations.py", line 569, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\torch\amp\autocast_mode.py", line 14, in decorate_autocast
    return func(*args, **kwargs)
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\transformers\models\wav2vec2\modeling_wav2vec2.py", line 2136, in forward
    loss = loss_fct(logits.view(-1, self.config.num_labels), labels.view(-1))
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\torch\nn\modules\loss.py", line 1174, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "C:\Users\jim\.conda\envs\diffhug\lib\site-packages\torch\nn\functional.py", line 3029, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: "nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Int'

So I changed the code at line 400 to:

    # Initialize our trainer
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=raw_datasets["train"].with_format("torch") if training_args.do_train else None,
        eval_dataset=raw_datasets["eval"].with_format("torch") if training_args.do_eval else None,
        compute_metrics=compute_metrics,
        tokenizer=feature_extractor,
    )

After that change, the transformers.BatchFeature error appeared.

I have reinstalled the transformers package, but the issue remains.

@sanchit-gandhi
Contributor

sanchit-gandhi commented Aug 29, 2023

Hey @c1ekrt - you can't pass the raw dataset with {audio, text} to the trainer. You need to pass the pre-processed dataset with the features {normalised audio, token ids}:

# Initialize Trainer
trainer = Trainer(
    model=model,
    data_collator=data_collator,
    args=training_args,
    compute_metrics=compute_metrics,
    train_dataset=vectorized_datasets["train"] if training_args.do_train else None,
    eval_dataset=vectorized_datasets["eval"] if training_args.do_eval else None,
    tokenizer=processor,
)
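
For readers following along, the point about raw versus pre-processed features can be checked in isolation. The sketch below is illustrative only: it assumes a default-constructed Wav2Vec2FeatureExtractor (not the checkpoint used in this issue) and made-up rows. It shows that pad() rejects rows that lack the input_values key and accepts rows that have it.

```python
import numpy as np
from transformers import Wav2Vec2FeatureExtractor

# Default-constructed extractor (an assumption for this sketch);
# no checkpoint download is needed for this check.
feature_extractor = Wav2Vec2FeatureExtractor()

# Raw rows shaped like the columns named in the error message (values are made up).
raw_rows = [{"file": "a.wav", "audio": None, "label": 0}]

try:
    feature_extractor.pad(raw_rows)
    raw_error = None
except ValueError as err:
    # pad() looks for the model input name ("input_values") and fails without it.
    raw_error = str(err)

print(raw_error)

# Pre-processed rows carry the model input name, so padding succeeds.
processed_rows = [
    {"input_values": np.zeros(8, dtype=np.float32)},
    {"input_values": np.zeros(5, dtype=np.float32)},
]
batch = feature_extractor.pad(processed_rows, return_tensors="np")
print(batch["input_values"].shape)
```

If the first call raises, the rows reaching the data collator were never converted into model inputs, which points at the dataset transform rather than at the Trainer itself.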

@c1ekrt
Author

c1ekrt commented Aug 31, 2023

I still can't get the example to work. The pre-processing part of the code, which is this section (lines 317-329):

def train_transforms(batch):
    """Apply train_transforms across a batch."""
    subsampled_wavs = []
    for audio in batch[data_args.audio_column_name]:
        wav = random_subsample(
            audio["array"], max_length=data_args.max_length_seconds, sample_rate=feature_extractor.sampling_rate
        )
        subsampled_wavs.append(wav)
    inputs = feature_extractor(subsampled_wavs, sampling_rate=feature_extractor.sampling_rate)
    output_batch = {model_input_name: inputs.get(model_input_name)}
    output_batch["labels"] = list(batch[data_args.label_column_name])
    return output_batch

never runs, despite set_transform being called at line 390:

raw_datasets["train"].set_transform(train_transforms, output_all_columns=False)

All of this code is unmodified.

@sanchit-gandhi
Contributor

Indeed, the pre-processing function is defined here:

And the transformation is applied here:

raw_datasets["train"].set_transform(train_transforms, output_all_columns=False)

Can you try running the script unchanged from the default script provided? As mentioned above, I can do a training run using the command you provided without any issue.

It's also worth trying to update the accelerate package:

pip install --upgrade accelerate

And checking that your PyTorch version is up to date (maybe even try the nightly install?)
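
Before and after upgrading, it can help to record which versions are actually installed in the active environment. The sketch below is a minimal helper using only the standard library. The function name installed_versions and the package list are choices made for this example, not part of the example script.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or 'not installed'."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


# Package list chosen for this example; adjust as needed.
report = installed_versions(["transformers", "datasets", "accelerate", "torch"])
for name, version in report.items():
    print(f"{name}: {version}")
```

Running it inside the same conda environment that launches the training script avoids comparing against a different interpreter's packages.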

@c1ekrt
Author

c1ekrt commented Sep 1, 2023

OK. It seems that the 'label' column of the superb dataset was passed into the cross-entropy calculation with the wrong dtype, hence the error

"nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Int'

After changing the dtype to torch.int64, the code runs without any error.
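
The thread does not show where the cast was applied, so the following is only a sketch of the general idea, with made-up logits and labels. It converts 32-bit integer labels to torch.long (int64) before the cross-entropy call. The cast_labels helper is hypothetical and could be applied to a batch dict before it reaches the model.

```python
import torch
import torch.nn.functional as F

# Made-up logits for a batch of 4 examples and 12 classes.
logits = torch.randn(4, 12)

# Labels that arrived as 32-bit integers, the 'Int' dtype named in the error.
labels_int32 = torch.tensor([0, 3, 7, 11], dtype=torch.int32)

# Cast to int64 (torch.long) before computing the loss.
labels = labels_int32.to(torch.long)
loss = F.cross_entropy(logits, labels)


def cast_labels(batch):
    """Hypothetical helper: return a copy of a batch dict with int64 labels."""
    out = dict(batch)
    if "labels" in out:
        out["labels"] = out["labels"].to(torch.long)
    return out


fixed = cast_labels({"labels": labels_int32})
print(labels.dtype, fixed["labels"].dtype)
```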

@c1ekrt c1ekrt closed this as completed Sep 1, 2023
@sanchit-gandhi
Contributor

Interesting! I couldn't repro this on my side. Will leave as closed for now, but feel free to re-open if you see this phenomenon in the examples scripts again. Sorry we didn't find the complete fix this time!

@Emmekea

Emmekea commented Nov 30, 2023

I faced the exact same issue. For me, upgrading datasets (pip3 install --upgrade datasets) did the trick.


4 participants