RuntimeError: Expected tensor for argument #1 'indices' to have scalar type Long; but got torch.FloatTensor instead (while checking arguments for embedding)
#7026
Closed
2 of 4 tasks
GeetDsa opened this issue
Sep 9, 2020
· 1 comment
· Fixed by #7039
Model I am using (GPT2-large) for fine-tuning on custom data:
The problem arises when using:
the official example scripts: (give details below)
my own modified scripts: (give details below)
Trace:
File "gpt_language_generation.py", line 209, in <module>
main()
File "gpt_language_generation.py", line 136, in main
trainer.train(model_path=None)
File "<conda_env>/lib/python3.6/site-packages/transformers/trainer.py", line 708, in train
tr_loss += self.training_step(model, inputs)
File "<conda_env>/lib/python3.6/site-packages/transformers/trainer.py", line 995, in training_step
outputs = model(**inputs)
File "<conda_env>/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "<conda_env>/lib/python3.6/site-packages/transformers/modeling_gpt2.py", line 731, in forward
return_dict=return_dict,
File "<conda_env>/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "<conda_env>/lib/python3.6/site-packages/transformers/modeling_gpt2.py", line 593, in forward
inputs_embeds = self.wte(input_ids)
File "<conda_env>/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "<conda_env>/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 114, in forward
self.norm_type, self.scale_grad_by_freq, self.sparse)
File "<conda_env>/lib/python3.6/site-packages/torch/nn/functional.py", line 1484, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
@patrickvonplaten, I found that the problem arises from the data collator (DataCollatorForLanguageModeling). The tensors it returns (indices into the vocabulary) are not of type Long, which causes the error in the embedding lookup.
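For anyone hitting the same error, here is a minimal sketch of the failure and a possible workaround. It does not use the actual collator or the fix from the linked PR: the small `nn.Embedding` stands in for GPT-2's `wte` layer, and the float-valued batch is a hypothetical example of what a collator might hand to the model.

```python
import torch
import torch.nn as nn

# Toy stand-in for the model's token embedding (GPT-2's `wte`); sizes are arbitrary.
embedding = nn.Embedding(num_embeddings=10, embedding_dim=4)

# Hypothetical batch: token ids that ended up stored as floats.
float_ids = torch.tensor([[1.0, 2.0, 3.0]])

# Embedding lookups need integer indices, so a float tensor is rejected.
try:
    embedding(float_ids)
    float_ids_rejected = False
except RuntimeError as err:
    float_ids_rejected = True
    print("lookup failed:", err)

# Workaround sketch: cast the indices to Long before the forward pass.
long_ids = float_ids.long()
output = embedding(long_ids)
print(long_ids.dtype, tuple(output.shape))
```

Casting in the training script only hides the symptom; it is still worth checking why the collator produced a non-integer tensor in the first place.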
Environment info
transformers version: 3.1.0
Who can help
Information
The tasks I am working on is:
To reproduce
Steps to reproduce the behavior:
Expected behavior
Expect the training to continue without an error.