Remove outdated BERT tips #6217

Merged
merged 4 commits into master from JetRunner-patch-1 on Aug 3, 2020

Conversation

Contributor

@JetRunner JetRunner commented Aug 3, 2020

Why remove the tips:

  • BERT is a model with absolute position embeddings so it's usually advised to pad the inputs on
    the right rather than the left.

Yes, but since we don't provide an option to pad from the left, I think this tip isn't necessary.
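
For context, a minimal sketch of how padding side shows up in the tokenizer API (assuming `bert-base-uncased` as the checkpoint; recent versions expose a `padding_side` attribute):

```python
from transformers import BertTokenizer

# BERT uses absolute position embeddings, and the tokenizer pads on the
# right by default, so the tip only restates the default behaviour.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
print(tokenizer.padding_side)  # "right"

# Padding a batch to the longest sequence therefore appends [PAD] ids
# at the end of the shorter input.
batch = tokenizer(
    ["a short sentence", "a somewhat longer example sentence"],
    padding=True,
)
print(batch["input_ids"])
```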

  • BERT was trained with a masked language modeling (MLM) objective. It is therefore efficient at predicting masked
    tokens and at NLU in general, but is not optimal for text generation. Models trained with a causal language
    modeling (CLM) objective are better in that regard.

No. T5 & BART proved it wrong.

  • Alongside MLM, BERT was trained using a next sentence prediction (NSP) objective using the [CLS] token as a sequence
    approximate. The user may use this token (the first token in a sequence built with special tokens) to get a sequence
    prediction rather than a token prediction. However, averaging over the sequence may yield better results than using
    the [CLS] token.

No. [CLS] can do learnable self-attention pooling, which is much better than parameter-free average pooling, especially when fine-tuned (cf. Sentence-BERT).
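
To make the two pooling strategies concrete, a minimal sketch (assuming a recent transformers version where model outputs expose `pooler_output` and `last_hidden_state`):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("An example sentence.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# [CLS]-based pooling: a learned dense layer + tanh applied to the
# first token's hidden state (what the pooler head produces).
cls_pooled = outputs.pooler_output

# Parameter-free mean pooling over all token hidden states (in practice,
# padded positions should be masked out before averaging).
mean_pooled = outputs.last_hidden_state.mean(dim=1)

print(cls_pooled.shape, mean_pooled.shape)  # both torch.Size([1, 768])
```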

@JetRunner JetRunner changed the title from "Remove out-dated BERT tips" to "Remove outdated BERT tips" on Aug 3, 2020

codecov bot commented Aug 3, 2020

Codecov Report

Merging #6217 into master will decrease coverage by 0.64%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #6217      +/-   ##
==========================================
- Coverage   79.53%   78.88%   -0.65%     
==========================================
  Files         146      146              
  Lines       26586    26586              
==========================================
- Hits        21145    20973     -172     
- Misses       5441     5613     +172     
Impacted Files Coverage Δ
src/transformers/modeling_outputs.py 100.00% <ø> (ø)
src/transformers/tokenization_xlm.py 16.26% <0.00%> (-66.67%) ⬇️
src/transformers/tokenization_marian.py 68.14% <0.00%> (-25.67%) ⬇️
src/transformers/generation_tf_utils.py 86.71% <0.00%> (+0.25%) ⬆️
src/transformers/file_utils.py 80.30% <0.00%> (+0.25%) ⬆️
src/transformers/tokenization_dpr.py 57.65% <0.00%> (+4.50%) ⬆️
src/transformers/tokenization_ctrl.py 96.11% <0.00%> (+17.47%) ⬆️

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b6b2f22...8a7c591.

Collaborator

sgugger commented Aug 3, 2020

I'd personally keep the first two tips (users may have their own custom scripts for padding, and BERT is not good at language generation). For the third, as discussed, I agree.
The docstrings of BaseModelOutputWithPooling should be changed accordingly, but I can take care of that in a separate PR.

Contributor Author

@sgugger I've restored tip no. 1 and updated tip no. 2. I also took care of BaseModelOutputWithPooling.

Collaborator

@sgugger sgugger left a comment


Thanks a lot! This is good to go IMO.

@JetRunner JetRunner merged commit 3c289fb into master Aug 3, 2020
@JetRunner JetRunner deleted the JetRunner-patch-1 branch August 3, 2020 17:17
Mehrad0711 pushed a commit to Mehrad0711/transformers that referenced this pull request Aug 3, 2020
* Remove out-dated BERT tips

* Update modeling_outputs.py

* Update bert.rst

* Update bert.rst