Loss calculation error #55

Closed

jwang-lp opened this issue Nov 25, 2018 · 3 comments

@jwang-lp
https://github.com/huggingface/pytorch-pretrained-BERT/blob/982339d82984466fde3b1466f657a03200aa2ffb/pytorch_pretrained_bert/modeling.py#L744

Got ValueError: Expected target size (1, 30522), got torch.Size([1, 11]) at line 744 of modeling.py. I think the line should be changed to masked_lm_loss = loss_fct(prediction_scores.view([-1, self.config.vocab_size]), masked_lm_labels.view([-1])).

@thomwolf (Member) commented Nov 25, 2018

Hi Jian, can you give me a small (self-contained) example showing how to get this error?

@jwang-lp (Author)
Hi Thomas! I modified the code in your README.md for an example:

from pytorch_pretrained_bert.modeling import BertForMaskedLM
from pytorch_pretrained_bert import BertTokenizer
import torch

model = BertForMaskedLM.from_pretrained('bert-base-uncased')

# Tokenized input
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
text = "Who was Jim Henson ? Jim Henson was a puppeteer"
tokenized_text = tokenizer.tokenize(text)

# Convert the original (unmasked) tokens to vocabulary indices to use as labels
indexed_truths = tokenizer.convert_tokens_to_ids(tokenized_text)

# Mask a token that we will try to predict back with `BertForMaskedLM`
masked_index = 6
tokenized_text[masked_index] = '[MASK]'

# Convert the masked tokens to vocabulary indices
indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)

# Convert inputs to PyTorch tensors
tokens_tensor = torch.tensor([indexed_tokens])
indexed_truths_tensor = torch.tensor([indexed_truths])

# Evaluate loss (with masked_lm_labels given, the model returns the masked LM loss)
model.eval()
masked_lm_loss = model(tokens_tensor, masked_lm_labels=indexed_truths_tensor)
print(masked_lm_loss)

@thomwolf (Member)
Thank you, you are right, I fixed that on master. It will be in the next release.

xloem pushed a commit to xloem/transformers that referenced this issue Apr 9, 2023
jameshennessytempus pushed a commit to jameshennessytempus/transformers that referenced this issue Jun 1, 2023
jonb377 pushed a commit to jonb377/hf-transformers that referenced this issue Apr 5, 2024