v4.13.0: Perceiver, ImageGPT, mLUKE, Vision-Text dual encoders, QDQBert, new documentation frontend
New Model additions
Perceiver
Eight new models are released as part of the Perceiver implementation: `PerceiverModel`, `PerceiverForMaskedLM`, `PerceiverForSequenceClassification`, `PerceiverForImageClassificationLearned`, `PerceiverForImageClassificationFourier`, `PerceiverForImageClassificationConvProcessing`, `PerceiverForOpticalFlow`, and `PerceiverForMultimodalAutoencoding`, in PyTorch.
The Perceiver IO model was proposed in Perceiver IO: A General Architecture for Structured Inputs & Outputs by Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch,
Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M.
Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira.
- Add Perceiver IO by @NielsRogge in #14487
Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=perceiver
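As a quick sketch of the new API (the checkpoint name is one of the hub checkpoints linked above; the example assumes network access):

```python
import requests
import torch
from PIL import Image
from transformers import PerceiverFeatureExtractor, PerceiverForImageClassificationLearned

# Load an example image
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# "deepmind/vision-perceiver-learned" is an ImageNet-trained checkpoint from the hub
feature_extractor = PerceiverFeatureExtractor.from_pretrained("deepmind/vision-perceiver-learned")
model = PerceiverForImageClassificationLearned.from_pretrained("deepmind/vision-perceiver-learned")

encoding = feature_extractor(images=image, return_tensors="pt")
with torch.no_grad():
    # Perceiver models take a generic `inputs` tensor rather than `pixel_values`
    logits = model(inputs=encoding.pixel_values).logits

predicted_label = model.config.id2label[logits.argmax(-1).item()]
```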
mLUKE
The mLUKE tokenizer is added. The tokenizer can be used for the multilingual variant of LUKE.
The mLUKE model was proposed in mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models by Ryokan Ri, Ikuya Yamada, and Yoshimasa Tsuruoka. It's a multilingual extension
of the LUKE model trained on the basis of XLM-RoBERTa.
- Add mLUKE by @Ryou0634 in #14640
Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=luke
ImageGPT
Three new models are released as part of the ImageGPT integration: `ImageGPTModel`, `ImageGPTForCausalImageModeling`, and `ImageGPTForImageClassification`, in PyTorch.
The ImageGPT model was proposed in Generative Pretraining from Pixels by Mark
Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever. ImageGPT (iGPT) is a GPT-2-like
model trained to predict the next pixel value, allowing for both unconditional and conditional image generation.
- Add ImageGPT by @NielsRogge in #14240
Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=imagegpt
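A minimal sketch of extracting features with the base model (the checkpoint name is one of the hub checkpoints linked above; the example assumes network access):

```python
import requests
import torch
from PIL import Image
from transformers import ImageGPTFeatureExtractor, ImageGPTModel

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# "openai/imagegpt-small" is the smallest of the released checkpoints
feature_extractor = ImageGPTFeatureExtractor.from_pretrained("openai/imagegpt-small")
model = ImageGPTModel.from_pretrained("openai/imagegpt-small")

# The feature extractor downsizes the image and maps each pixel to a color-cluster index
inputs = feature_extractor(images=image, return_tensors="pt")
with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state  # one hidden state per pixel position
```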
QDQBert
Eight new models are released as part of the QDQBert implementation: `QDQBertModel`, `QDQBertLMHeadModel`, `QDQBertForMaskedLM`, `QDQBertForSequenceClassification`, `QDQBertForNextSentencePrediction`, `QDQBertForMultipleChoice`, `QDQBertForTokenClassification`, and `QDQBertForQuestionAnswering`, in PyTorch.
The QDQBERT model can be referenced in Integer Quantization for Deep Learning Inference: Principles and Empirical
Evaluation by Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius
Micikevicius.
- Add QDQBert model and quantization examples of SQUAD task by @shangz-ai in #14066
Semantic Segmentation models
The semantic segmentation models' API is unstable and may change between this version and the next.
The first semantic segmentation models are added. In semantic segmentation, the goal is to predict a class label for every pixel of an image. The models that are added are SegFormer (by NVIDIA) and BEiT (by Microsoft Research). BEiT was already available in the library, but this release includes the model with a semantic segmentation head.
The SegFormer model was proposed in SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers by Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo. The model consists of a hierarchical Transformer encoder and a lightweight all-MLP decode head to achieve great results on image segmentation benchmarks such as ADE20K and Cityscapes.
The BEiT model was proposed in BEiT: BERT Pre-Training of Image Transformers by Hangbo Bao, Li Dong, Furu Wei. Rather than pre-training the model to predict the class of an image (as done in the original ViT paper), BEiT models are pre-trained to predict visual tokens from the codebook of OpenAI’s DALL-E model given masked patches.
- Add SegFormer by @NielsRogge in #14019
- Add BeitForSemanticSegmentation by @NielsRogge in #14096
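As a sketch of per-pixel prediction with the new SegFormer head (the ADE20K-fine-tuned checkpoint below is an example; the example assumes network access):

```python
import requests
import torch
from PIL import Image
from transformers import SegformerFeatureExtractor, SegformerForSemanticSegmentation

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# An ADE20K-fine-tuned checkpoint from NVIDIA (150 semantic classes)
checkpoint = "nvidia/segformer-b0-finetuned-ade-512-512"
feature_extractor = SegformerFeatureExtractor.from_pretrained(checkpoint)
model = SegformerForSemanticSegmentation.from_pretrained(checkpoint)

inputs = feature_extractor(images=image, return_tensors="pt")
with torch.no_grad():
    # logits have shape (batch, num_labels, height / 4, width / 4)
    logits = model(**inputs).logits
```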
Vision-text dual encoder
Adds the `VisionTextDualEncoderModel` in PyTorch and Flax, which makes it possible to load any pre-trained vision model (ViT, DeiT, BEiT, CLIP's vision model) and text model (BERT, RoBERTa) in the library for vision-text tasks like CLIP.
This model pairs a vision encoder and a text encoder and adds projection layers to project the embeddings into a shared embedding space, which can then be used to align the two modalities.
- VisionTextDualEncoder by @patil-suraj in #13511
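For instance, pairing a ViT vision encoder with a BERT text encoder might look like this (the two checkpoints are example choices, not prescribed by the release notes):

```python
from transformers import (
    AutoFeatureExtractor,
    AutoTokenizer,
    VisionTextDualEncoderModel,
    VisionTextDualEncoderProcessor,
)

# Pair any pre-trained vision and text encoders from the library
model = VisionTextDualEncoderModel.from_vision_text_pretrained(
    "google/vit-base-patch16-224", "bert-base-uncased"
)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
feature_extractor = AutoFeatureExtractor.from_pretrained("google/vit-base-patch16-224")
processor = VisionTextDualEncoderProcessor(feature_extractor, tokenizer)
# Note: the projection layers are newly initialized, so the model needs
# CLIP-style contrastive fine-tuning before its similarity scores are meaningful.
```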
CodeParrot
CodeParrot, a model trained to generate code, has been open-sourced in the research projects by @lvwerra.
Language model support for ASR
- Add language model support for CTC models by @patrickvonplaten in #14339
Language model boosted decoding is added for all CTC models via https://github.com/kensho-technologies/pyctcdecode and https://github.com/kpu/kenlm.
See https://huggingface.co/patrickvonplaten/wav2vec2-xlsr-53-es-kenlm for more information.
Flax-specific additions
This release adds Flax versions of the vision encoder-decoder model and of GPT-J.
- Add FlaxVisionEncoderDecoderModel by @ydshieh in #13359
- FlaxGPTJ by @patil-suraj in #14396
TensorFlow-specific additions
Vision transformers are here! Convnets are so 2012, now that ML is converging on self-attention as a universal model.
Want to handle real-world tables, where text and data are positioned in a 2D grid? TAPAS is now here for both TensorFlow and PyTorch.
- Tapas tf by @kamalkraj in #13393
Automatic checkpointing and cloud saves to the HuggingFace Hub during training are now live, allowing you to resume training when it's interrupted, even if your initial instance is terminated. This is an area of very active development - watch this space for future developments, including automatic model card creation and more.
- Add model checkpointing to push_to_hub and PushToHubCallback by @Rocketknight1 in #14492
Auto-processors
A new class to automatically select processors is added: `AutoProcessor`. It can be used for all models that require a processor, in both computer vision and audio.
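For illustration, `AutoProcessor` infers the right processor class from the checkpoint's configuration (the two checkpoints below are example choices):

```python
from transformers import AutoProcessor

# A vision-text checkpoint resolves to a CLIP processor...
clip_processor = AutoProcessor.from_pretrained("openai/clip-vit-base-patch32")
# ...and an audio checkpoint resolves to a Wav2Vec2 processor
speech_processor = AutoProcessor.from_pretrained("facebook/wav2vec2-base-960h")
```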
New documentation frontend
A new documentation frontend is out for the `transformers` library! The goal of this documentation is to be better aligned with the rest of our website, and it contains tools to improve readability. The documentation can now be written in Markdown rather than RST.
LayoutLM Improvements
The LayoutLMv2 feature extractor now supports non-English languages, and LayoutXLM gets its own processor.
- LayoutLMv2FeatureExtractor now supports non-English languages when applying Tesseract OCR. by @Xargonus in #14514
- Add LayoutXLMProcessor (and LayoutXLMTokenizer, LayoutXLMTokenizerFast) by @NielsRogge in #14115
Trainer Improvements
You can now take advantage of Ampere hardware with the Trainer:
- `--bf16`: do training or eval in mixed precision of bfloat16
- `--bf16_full_eval`: do eval in full bfloat16
- `--tf32`: control having TF32 mode on/off
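For instance, a hypothetical invocation of a Trainer-based example script using the new flags (script name, task, and paths are placeholders):

```shell
# run_glue.py stands in for any Trainer-based example script
python run_glue.py \
  --model_name_or_path bert-base-cased \
  --task_name mrpc \
  --do_train --do_eval \
  --bf16 --bf16_full_eval \
  --tf32 true \
  --output_dir /tmp/mrpc
```

Note that `--bf16` and `--tf32` require Ampere (or newer) GPUs, and `--tf32` additionally requires a recent PyTorch.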
Improvements and bugfixes
- Replace assertions with RuntimeError exceptions by @ddrm86 in #14186
- Adding `batch_size` support for (almost) all pipelines by @Narsil in #13724
- Remove n_ctx from configs by @thomasw21 in #14165
- Add `BlenderbotTokenizerFast` by @stancld in #13720
- Adding `handle_long_generation` parameters for `text-generation` pipeline. by @Narsil in #14118
- Fix pipeline tests env and fetch by @sgugger in #14209
- Generalize problem_type to all sequence classification models by @sgugger in #14180
- Fixing image segmentation with inference mode. by @Narsil in #14204
- Add a condition for checking labels by @hrxorxm in #14211
- Torch 1.10 by @LysandreJik in #14169
- Add more missing models to models/init.py by @ydshieh in #14177
- Clarify QA examples by @NielsRogge in #14172
- Fixing `image-segmentation` tests. by @Narsil in #14223
- Tensor location is already handled by @Narsil in #14224
- Raising exceptions instead of using assertions for few models by @pdcoded in #14219
- Fix the write problem in trainer.py comment by @wmathor in #14202
- [GPTJ] enable common tests and few fixes by @patil-suraj in #14190
- improving efficiency of mlflow metric logging by @wamartin-aml in #14232
- Fix generation docstring by @qqaatw in #14216
- Fix test_configuration_tie in FlaxEncoderDecoderModelTest by @ydshieh in #14076
- [Tests] Fix DistilHubert path by @anton-l in #14245
- Add PushToHubCallback in main init by @sgugger in #14246
- Fixes Beit training for PyTorch 1.10+ by @sgugger in #14249
- Added Beit model ouput class by @lumliolum in #14133
- Update Transformers to huggingface_hub >= 0.1.0 by @sgugger in #14251
- Add cross attentions to TFGPT2Model by @ydshieh in #14038
- [Wav2Vec2] Adapt conversion script by @patrickvonplaten in #14258
- Put `load_image` function in `image_utils.py` & fix image rotation issue by @mishig25 in #14062
- minimal fixes to run DataCollatorForWholeWordMask with return_tensors="np" and return_tensors="tf" by @dwyatte in #13891
- Adding support for `truncation` parameter on `feature-extraction` pipeline. by @Narsil in #14193
- Fix of issue #13327: Wrong weight initialization for TF t5 model by @dshirron in #14241
- Fixing typo in error message. by @Narsil in #14226
- Pin Keras cause they messed their release by @sgugger in #14262
- Quality explain by @sgugger in #14264
- Add more instructions to the release guide by @sgugger in #14263
- Fixing slow pipeline tests by @Narsil in #14260
- Fixing mishandling of `ignore_labels`. by @Narsil in #14274
- improve rewrite state_dict missing _metadata by @changwangss in #14276
- Removing Keras version pinning by @Rocketknight1 in #14280
- Pin TF until tests are fixed by @sgugger in #14283
- [Hubert Docs] Make sure example uses a fine-tuned model by @patrickvonplaten in #14291
- Add new LFS prune API by @sgugger in #14294
- Remove `DPRPretrainedModel` from docs by @xhlulu in #14300
- Handle long answer needs to be updated. by @Narsil in #14279
- [tests] Fix SegFormer and BEiT tests by @NielsRogge in #14289
- Fix typo on PPLM example README by @Beomi in #14287
- [Marian Conversion] Fix eos_token_id conversion in conversion script by @patrickvonplaten in #14320
- [Tests] Update audio classification tests to support torch 1.10 by @anton-l in #14318
- [TFWav2Vec2Model] Fix input shapes in TFWav2Vec2WeightNormConv1D by @anton-l in #14319
- Fixing tests on master. by @Narsil in #14317
- Fixing mutable default argument in `pipeline`. by @Narsil in #14316
- Changed relative imports to absolute to allow convert_graph_to_onnx.py to run as a script. by @nbertagnolli in #14325
- Expand dynamic supported objects to configs and tokenizers by @sgugger in #14296
- [deepspeed] Enable multiple test runs on single box, defer to DS_TEST_PORT if set by @jeffra in #14331
- Small change to Wav2Vec2 model to support Tensor-Parallelism with DeepSpeed by @RezaYazdaniAminabadi in #14298
- Correct order of overflowing tokens for LayoutLmV2 tokenizer by @Apoorvgarg-creator in #13495
- Update Seq2Seq QA example script to use SQuAD metric. by @karthikrangasai in #14335
- remove an irrelevant test from test_modeling_tf_layoutlm by @ydshieh in #14341
- bump flax version by @patil-suraj in #14343
- Rewrite guides for fine-tuning with Datasets by @stevhliu in #13923
- [Bert2Bert] allow bert2bert + relative embeddings by @patrickvonplaten in #14324
- Support for TF >= 2.7 by @sgugger in #14345
- `BatchFeature`: Convert `List[np.ndarray]` to `np.ndarray` before converting to pytorch tensors by @eladsegal in #14306
- Adding some quality of life for `pipeline` function. by @Narsil in #14322
- Fix fast tokenization problems by @qqaatw in #13930
- Add notebook INC quantization for text classification tasks by @echarlaix in #14293
- enhance rewrite state_dict missing _metadata by @changwangss in #14348
- Fix list index out of range when padding nested empty lists by @qqaatw in #13876
- [testing] solve the port conflict by @stas00 in #14362
- Fix Flax params dtype by @patil-suraj in #13098
- [flax generate] allow passing params to encode by @patil-suraj in #14370
- Experimenting with adding proper get_config() and from_config() methods by @Rocketknight1 in #14361
- Fixing requirements for TF LM models and use correct model mappings by @Rocketknight1 in #14372
- fix loading flax bf16 weights in pt by @patil-suraj in #14369
- [wav2vec2] fix --gradient_checkpointing by @stas00 in #13964
- Adding support for raw python `generator` in addition to `Dataset` for pipelines by @Narsil in #14352
- minor doc fix by @patil-suraj in #14377
- [Wav2Vec2 Example] Improve fine-tuning script by @patrickvonplaten in #14373
- Use `AlbertConverter` for FNet instead of using FNet's own converter by @qqaatw in #14365
- Add support for WMT21 tokenizer in M2M100Tokenizer by @patil-suraj in #14376
- [M2M100Tokenizer] fix _build_translation_inputs by @patil-suraj in #14382
- Raise exceptions instead of using asserts in modeling_openai #12789 by @nbertagnolli in #14386
- [doc] performance and parallelism updates by @stas00 in #14391
- Quick fix to TF summarization example by @Rocketknight1 in #14401
- [Speech2Text2] Enable tokenizers by @patrickvonplaten in #14390
- Fix TFViT by @NielsRogge in #14399
- Fix weight loading issue by @ydshieh in #14016
- Replace BertLayerNorm with LayerNorm by @eldarkurtic in #14385
- [Wav2Vec2] Make sure that gradient checkpointing is only run if needed by @patrickvonplaten in #14407
- Allow per-version configurations by @LysandreJik in #14344
- Fix gradient_checkpointing backward compatibility by @sgugger in #14408
- Add forward method to dummy models by @sgugger in #14419
- Avoid looping when data exhausted by @valentindey in #14413
- Debug doc by @sgugger in #14424
- [Wav2Vec2] Add New Wav2Vec2 Translation by @patrickvonplaten in #14392
- Improve semantic segmentation models by @NielsRogge in #14355
- [Gradient checkpoining] Update Wav2Vec scripts by @falcaopetri in #14036
- [Bart] Fix docs by @patrickvonplaten in #14434
- [WIP] Ensure TF model configs can be converted to proper JSON by @Zahlii in #14415
- Recover Deleted XNLI Instructions by @Helw150 in #14437
- Fix EncoderDecoderModel code example by @NielsRogge in #14441
- Add a post init method to all models by @sgugger in #14431
- Fix finite IterableDataset test on multiple GPUs by @sgugger in #14445
- [Bert, et al] fix early device assignment by @stas00 in #14447
- Add GitPython to quality tools by @LysandreJik in #14459
- [ImageGPT] Small fixes by @NielsRogge in #14460
- [Generation] Allow `inputs_embeds` as an input by @patrickvonplaten in #14443
- Adding support for `hidden_states` and `attentions` in unbatching support. by @Narsil in #14420
- add Tuple as possible type hint for EvalPredictions label_ids by @ameasure in #14473
- Fix dummy objects for quantization by @sgugger in #14478
- Moving pipeline tests from `Narsil` to `hf-internal-testing`. by @Narsil in #14463
- Improve `add-new-pipeline` docs a bit by @stancld in #14485
- [test] add test for --config_overrides by @stas00 in #14466
- Support for Training with BF16 by @JamesDeAntonis in #13207
- fixes some key names for in LayoutLMv2 / LayoutXLM tokenizers by @valentindey in #14493
- Switch from using sum for flattening lists of lists in group_texts by @nbroad1881 in #14472
- [deepspeed] zero inference by @stas00 in #14253
- add cache_dir for tokenizer verification loading by @vmaryasin in #14508
- Fix feature extraction utils import by @LysandreJik in #14515
- [Tests] Improve vision tests by @NielsRogge in #14458
- [CI] clear `~/.cache/torch_extensions` between builds by @stas00 in #14520
- Fix a slow test. by @Narsil in #14527
- added save_directories for _psave_pretrained_pt and _tf, changed model to tf_model and pt_model, enable the notebook to run cleanly from top to bottom without error by @cfregly in #14529
- Quicktour updates by @LysandreJik in #14533
- Fixes by @LysandreJik in #14534
- [flax] unfreeze initial cache in gpt models by @patil-suraj in #14535
- Tokenizers docs: Specify which class contains `__call__` method by @xhlulu in #14379
- Rename ImageGPT by @NielsRogge in #14526
- [Generate] Fix generate with inputs_embeds on GPU by @patrickvonplaten in #14564
- [Flax] token-classification model steps enumerate start from 1 by @kamalkraj in #14547
- Fix sentinel token IDs in data collator for Flax T5 pretraining script by @rahuln in #14477
- Fix backend regex by @sgugger in #14566
- [Flax] Add FlaxBlenderbot by @stancld in #13633
- Add documentation for multi-label classification by @gsnidero in #14168
- use functional interface for softmax in attention by @t-vi in #14198
- Fix mask token handling by @qqaatw in #14364
- [doc] bf16/tf32 guide by @stas00 in #14579
- Rename toctree.yml -> _toctree.yml by @mishig25 in #14594
- Update doc img links by @mishig25 in #14593
- Adds a git pull instruction to the documentation builder by @LysandreJik in #14597
- [Flax] Add FlaxBlenderbotSmall by @stancld in #14576
- Python 3.6 -> Python 3.7 for TF runs by @LysandreJik in #14598
- change tf.math.divide with int(/) in distilbert model by @yis11178 in #14600
- fix #14524 (IndexError when mask prob is too low) by @nikvaessen in #14525
- Improve tokenizer tests by @qqaatw in #13594
- [CI] move env print to util, add pt, nccl versions by @stas00 in #14607
- 2022 is the year of multi-modality by @LysandreJik in #14610
- Fix doc builder by @LysandreJik in #14616
- [trainer] add tf32-mode control by @stas00 in #14606
- Make DefaultDataCollator importable from root by @Rocketknight1 in #14588
- fix a typo by @yuchenlin in #14626
- updated pytorch token-classification readme by @kamalkraj in #14624
- Add Flax example tests by @patil-suraj in #14599
- fix typo by @patil-suraj in #14635
- add flax example tests in CI workflow by @patil-suraj in #14637
- [urls to hub] Replace outdated model tags with their now-canonical pipeline types by @julien-c in #14617
- Update the example of exporting Bart + BeamSearch to ONNX module to resolve comments. by @fatcat-z in #14310
- Add GPTJForQuestionAnswering by @tucan9389 in #14503
- doc: mismatch between pooler/d_output by @guhur in #14641
- fix flax example tests by @patil-suraj in #14643
- Auto processor fix by @LysandreJik in #14623
- Fix syntax for class references by @sgugger in #14644
- Add a job to test the documentation build by @sgugger in #14645
- fix flax examples tests by @patil-suraj in #14646
- Use cross_attention_hidden_size in Encoder-Decoder models by @ydshieh in #14378
- [deepspeed] fix --load_best_model_at_end by @stas00 in #14652
- quick fix SummarizationPipeline error messages by @NouamaneTazi in #14618
- Fix a Bug, trainer_seq2seq.py, in the else branch at Line 172, generation_inputs should be a dict by @TranSirius in #14546
- [trainer] conditional ctx managers into one wrapper by @stas00 in #14663
- Fixing Dataset for TQA + token-classification. by @Narsil in #14658
- fix deprecated tf method by @zoheth in #14671
- Fix doc builder by @LysandreJik in #14676
- [AutoProcessor] Add Wav2Vec2WithLM & small fix #14675 (@patrickvonplaten)
- Added support for other features for already supported models #14358 (@michaelbenayoun)
- Revert "Added support for other features for already supported models" #14679 (@lewtun)
- Convert tutorials #14665 (@sgugger)
- fix: verify jsonlines file in run_translation (#14660) #14661 (@GaurangTandon)
- Improvements to Comet Integration #14680 (@DN6)
- Fixes in init #14681 (@sgugger)
- Revert open-in-colab and add perceiver #14683 (@sgugger)
- Fix wrong checkpoint paths in doc examples #14685 (@ydshieh)
- [bf16 support] tweaks #14580 (@stas00)
- [trainer] support UserDict inputs (torch-nightly) #14688 (@stas00)
- Move pyctcdecode #14686 (@sgugger)
- Make MLuke tokenizer tests slow #14690 (@sgugger)
- Fix doc examples: name '...' is not defined #14687 (@ydshieh)
- Add a job to test doc building (for realsies this time) #14662 (@sgugger)
- Fix Perceiver tests #14703 (@NielsRogge)
- add str hub token to repository when provided else fallback to default #14682 (@philschmid)
- Fix typo in toctree #14704 (@mishig25)
New Contributors
- @hrxorxm made their first contribution in #14211
- @pdcoded made their first contribution in #14219
- @wmathor made their first contribution in #14202
- @wamartin-aml made their first contribution in #14232
- @lumliolum made their first contribution in #14133
- @dwyatte made their first contribution in #13891
- @dshirron made their first contribution in #14241
- @changwangss made their first contribution in #14276
- @xhlulu made their first contribution in #14300
- @Beomi made their first contribution in #14287
- @nbertagnolli made their first contribution in #14325
- @jeffra made their first contribution in #14331
- @RezaYazdaniAminabadi made their first contribution in #14298
- @echarlaix made their first contribution in #14293
- @valentindey made their first contribution in #14413
- @Zahlii made their first contribution in #14415
- @Helw150 made their first contribution in #14437
- @shangz-ai made their first contribution in #14066
- @vmaryasin made their first contribution in #14508
- @cfregly made their first contribution in #14529
- @Xargonus made their first contribution in #14514
- @rahuln made their first contribution in #14477
- @gsnidero made their first contribution in #14168
- @t-vi made their first contribution in #14198
- @JamesDeAntonis made their first contribution in #13207
- @yis11178 made their first contribution in #14600
- @nikvaessen made their first contribution in #14525
- @yuchenlin made their first contribution in #14626
- @Ryou0634 made their first contribution in #14640
- @NouamaneTazi made their first contribution in #14618
- @TranSirius made their first contribution in #14546
- @zoheth made their first contribution in #14671
Full Changelog: v4.12.0...v4.13.0