Loss calculation error #55

Closed

jwang-lp opened this issue Nov 25, 2018 · 3 comments

@jwang-lp
https://github.com/huggingface/pytorch-pretrained-BERT/blob/982339d82984466fde3b1466f657a03200aa2ffb/pytorch_pretrained_bert/modeling.py#L744

Got ValueError: Expected target size (1, 30522), got torch.Size([1, 11]) at line 744 of modeling.py. I think the line should be changed to masked_lm_loss = loss_fct(prediction_scores.view([-1, self.config.vocab_size]), masked_lm_labels.view([-1])).

@thomwolf (Member) commented Nov 25, 2018

Hi Jian, can you give me a small (self-contained) example showing how to get this error?

@jwang-lp (Author)
Hi Thomas! I modified the code in your README.md for an example:

from pytorch_pretrained_bert.modeling import BertForMaskedLM
from pytorch_pretrained_bert import BertTokenizer
import torch

model = BertForMaskedLM.from_pretrained('bert-base-uncased')

# Tokenized input
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
text = "Who was Jim Henson ? Jim Henson was a puppeteer"
tokenized_text = tokenizer.tokenize(text)

# Convert the original (unmasked) tokens to vocabulary indices to use as labels
indexed_truths = tokenizer.convert_tokens_to_ids(tokenized_text)

# Mask a token that we will try to predict back with `BertForMaskedLM`
masked_index = 6
tokenized_text[masked_index] = '[MASK]'

# Convert the masked tokens to vocabulary indices
indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)

# Convert inputs to PyTorch tensors
tokens_tensor = torch.tensor([indexed_tokens])
indexed_truths_tensor = torch.tensor([indexed_truths])

# Evaluate loss (with masked_lm_labels given, the model returns the masked LM loss)
model.eval()
masked_lm_loss = model(tokens_tensor, masked_lm_labels=indexed_truths_tensor)
print(masked_lm_loss)

@thomwolf (Member)
Thank you, you are right, I fixed that on master. It will be in the next release.

xloem pushed a commit to xloem/transformers that referenced this issue Apr 9, 2023
jameshennessytempus pushed a commit to jameshennessytempus/transformers that referenced this issue Jun 1, 2023
jonb377 pushed a commit to jonb377/hf-transformers that referenced this issue Apr 5, 2024