Getting error during training #26

Closed
arikhalperin opened this issue May 1, 2022 · 2 comments

Comments

@arikhalperin

RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

This means the input data was not moved to the GPU, while the model weights were.

My code:

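import torch
from transformers import Wav2Vec2Processor
from huggingsound import SpeechRecognitionModel, TokenSet, TrainingArguments

torch.device("cuda")  # no effect on its own: the returned device object is discarded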

model = SpeechRecognitionModel("jonatasgrosman/wav2vec2-large-xlsr-53-spanish", device="cuda")
processor_ref = Wav2Vec2Processor.from_pretrained("jonatasgrosman/wav2vec2-large-xlsr-53-spanish")
token_list = list(processor_ref.tokenizer.encoder.keys())
token_set = TokenSet(token_list)

train_set = []
eval_set = []

train_set, eval_set = add_sealed_data_set(train_set, eval_set, config[environment][SAMPLES_DIR])

training_arguments = TrainingArguments()
training_arguments.overwrite_output_dir = True
training_arguments.per_device_train_batch_size = 128
training_arguments.per_device_eval_batch_size = 128

model.finetune(
    config[environment][MODEL_OUTPUT_DIR],
    train_data=train_set,
    eval_data=eval_set,  # the eval_data is optional
    token_set=token_set,
    training_args=training_arguments
)

I'm working around this by moving my dataset to CUDA inside the huggingsound code. If I can make it work, I'll create a PR.
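The kind of workaround described above can be sketched roughly as follows. This is only an illustration, not the actual huggingsound change: the helper name `move_to_device` is hypothetical, and where it would be called inside the library is an assumption. It relies on the fact that PyTorch tensors expose a `.to(device)` method.

```python
def move_to_device(obj, device):
    """Recursively move anything exposing .to() (e.g. torch tensors) to `device`.

    Hypothetical helper for illustration; not part of huggingsound.
    """
    if callable(getattr(obj, "to", None)):
        # torch.Tensor.to() returns a copy on the target device
        return obj.to(device)
    if isinstance(obj, dict):
        return {key: move_to_device(value, device) for key, value in obj.items()}
    if isinstance(obj, list):
        return [move_to_device(item, device) for item in obj]
    if isinstance(obj, tuple):
        return tuple(move_to_device(item, device) for item in obj)
    # Strings, numbers, None, etc. are returned unchanged
    return obj
```

With PyTorch installed, a batch could then be aligned with the model's weights via something like `batch = move_to_device(batch, next(torch_model.parameters()).device)`, where `torch_model` stands for the underlying `torch.nn.Module` (an assumed name).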

@jonatasgrosman
Owner

jonatasgrosman commented May 11, 2022

Hi @arikhalperin, did you manage to solve this issue? I couldn't reproduce your error on my machine, so it may be related to your environment. Please send me more info about your CUDA version, PyTorch version, etc., so that I can help you figure out what's going on :)

Another good option is to try to reproduce this issue on a Colab and send me the link.
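One way to gather the environment details requested above is a short script like this. It is a minimal sketch: the function name `collect_env_info` is made up for illustration, and it reports only a few fields rather than everything a maintainer might want.

```python
import importlib.util
import platform


def collect_env_info():
    """Return a dict with basic Python / PyTorch / CUDA version details."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "torch": None,
    }
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment
        return info

    import torch

    info["torch"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds)
    info["torch_cuda_build"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in collect_env_info().items():
        print(f"{key}: {value}")
```

PyTorch also ships a more thorough report that can be run with `python -m torch.utils.collect_env`.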

@arikhalperin
Author

Sorry about the delay. I updated CUDA and PyTorch to matching versions and the issue was resolved.
