fix(hugging-face): update dependency transformers to v4.32.0 #10581
This PR contains the following updates:

| Package | Change |
| --- | --- |
| transformers | `4.31.0` -> `4.32.0` |
⚠ Dependency Lookup Warnings ⚠
Warnings were logged while processing this repo. Please check the Dependency Dashboard for more information.
Release Notes

huggingface/transformers (transformers)

v4.32.0: IDEFICS, GPTQ Quantization

Compare Source
IDEFICS
The IDEFICS model was proposed in OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents by Hugo Laurençon, Lucile Saulnier, Léo Tronchon, Stas Bekman, Amanpreet Singh, Anton Lozhkov, Thomas Wang, Siddharth Karamcheti, Alexander M. Rush, Douwe Kiela, Matthieu Cord, and Victor Sanh.
IDEFICS is the first open state-of-the-art visual language model at the 80B scale!
The model accepts arbitrary sequences of images and text and produces text, similarly to a multimodal ChatGPT.
Blogpost: hf.co/blog/idefics
Playground: HuggingFaceM4/idefics_playground
MPT
MPT has been added and is now officially supported within Transformers. The repositories from MosaicML have been updated to work best with the model integration within Transformers.
- [`MPT`] Add MosaicML's `MPT` model to transformers by @ArthurZucker & @younesbelkada in #24629

GPTQ Integration
GPTQ quantization is now supported in Transformers, through the `optimum` library. The backend relies on the auto_gptq library, from which we use the `GPTQ` and `QuantLinear` classes.

See below for an example of the API, quantizing a model using the new `GPTQConfig` configuration utility.
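A minimal sketch of that API (the `facebook/opt-125m` checkpoint and the `"c4"` calibration dataset are illustrative choices, not part of the release notes):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "facebook/opt-125m"  # small model chosen purely for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Quantize to 4 bits, calibrating on the "c4" dataset.
quantization_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quantization_config,
)
```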
Most models under the TheBloke namespace with the suffix `GPTQ` should be supported. For example, to load a GPTQ-quantized model from `TheBloke/Llama-2-13B-chat-GPTQ`, simply run (after installing the latest optimum and auto-gptq libraries) the snippet shown below.

For more information about this feature, we recommend taking a look at the following announcement blogpost: https://huggingface.co/blog/gptq-integration
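A sketch of that one-liner (assuming `optimum` and `auto-gptq` are installed):

```python
from transformers import AutoModelForCausalLM

# The quantization config is picked up from the checkpoint itself.
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-13B-chat-GPTQ",
    device_map="auto",
)
```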
Pipelines

A new pipeline, dedicated to text-to-audio and text-to-speech models, has been added to Transformers. It currently supports the 3 text-to-audio models integrated into `transformers`: `SpeechT5ForTextToSpeech`, `MusicGen` and `Bark`.

See below for an example:
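A minimal sketch using Bark (the `suno/bark-small` checkpoint and the prompt are illustrative choices):

```python
from transformers import pipeline

# The new "text-to-audio" pipeline, here backed by a Bark checkpoint.
pipe = pipeline("text-to-audio", model="suno/bark-small")

output = pipe("Hey, it's HuggingFace on the phone!")
audio = output["audio"]                  # numpy array with the waveform
sampling_rate = output["sampling_rate"]  # sampling rate of the waveform
```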
Classifier-Free Guidance sampling
Classifier-Free Guidance sampling is a generation technique developed by EleutherAI, announced in this paper. With this technique, you can increase prompt adherence in generation. You can also set it up with negative prompts, ensuring your generation doesn't go in specific directions. See its docs for usage instructions.
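As a rough sketch of the API (the `gpt2` checkpoint and both prompts are illustrative; `guidance_scale > 1` enables the technique and `negative_prompt_ids` supplies the negative prompt):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative checkpoint
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer(["Today, a dragon flew over Paris,"], return_tensors="pt")
negative = tokenizer(["A sad story:"], return_tensors="pt")

# guidance_scale > 1 turns on classifier-free guidance; the negative
# prompt steers generation away from that direction.
outputs = model.generate(
    **inputs,
    guidance_scale=1.5,
    negative_prompt_ids=negative["input_ids"],
    max_new_tokens=24,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```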
Task guides
A new task guide going into Visual Question Answering has been added to Transformers.
Model deprecation
We continue the deprecation of models that was introduced in https://github.com/huggingface/transformers/pull/24787.
By deprecating, we indicate that we will stop maintaining such models, but there is no intention of actually removing those models and breaking support for them (they might one day move into a separate repo or onto the Hub, but we would still add the necessary imports to make sure backward compatibility stays). The main point is that we stop testing those models. This choice is driven by how much the models are used, and aims to ease the burden on our CI so that it can focus on the more critical aspects of the library.
Translation Efforts
There are ongoing efforts to translate the transformers documentation into other languages. These efforts are driven by groups independent of Hugging Face, and their work is greatly appreciated, as it further lowers the barrier of entry to ML and Transformers.
If you'd like to kickstart such an effort or help out on an existing one, please feel free to reach out by opening an issue.
- `tasks/document_question_answering.md` to Korean by @jungnerd in #24588
- `quicktour.md` by @wonhyeongseo in #24664
- `serialization.md` by @wonhyeongseo in #24686
- `testing.md` to Korean by @Sunmin0520 in #24900
- `perf_train_cpu.md` to Korean by @seank021 in #24911
- `<tf_xla>.md` to Korean by @54data in #24904
- `perf_hardware.md` to Korean by @augustinLib in #24966
- `hpo_train.md` to Korean by @harheem in #24968
- `perf_infer_cpu.md` to Korean by @junejae in #24920
- `transformers_agents.md` to Korean by @sim-so in #24881
- `perf_infer_gpu_many.md` to Korean by @heuristicwave in #24943
- `perf_infer_gpu_one.md` to Korean by @eenzeenee in #24978
- `add_tensorflow_model.md` to Korean by @keonju2 in #25017
- `perf_train_cpu_many.md` to Korean by @nuatmochoi in #24923
- `add_new_model.md` to Korean by @mjk0618 in #24957
- `model_summary.md` to Korean by @0525hhgus in #24625
- `philosophy.md` to Korean by @TaeYupNoh in #25010
- `perf_train_tpu_tf.md` to Korean by @0525hhgus in #25433

Explicit input data format for image processing
Addition of an `input_data_format` argument to image transforms and ImageProcessor methods, allowing the user to explicitly set the data format of the images being processed. This enables processing of images with a non-standard number of channels, e.g. 4, and removes errors that occurred when the data format was inferred but the channel dimension was ambiguous. A short sketch follows below.
Documentation clarification about efficient inference through `torch.scaled_dot_product_attention` & Flash Attention

Users are not aware that it is possible to force dispatch of the `torch.scaled_dot_product_attention` method from `torch` to use Flash Attention kernels. This leads to considerable speedup and memory saving, and is also compatible with quantized models. We decided to make this explicit to users in the documentation.

In a nutshell, one can just run:
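A sketch of the pattern (assuming a causal LM already loaded as `model` with tokenized `inputs` prepared, and `optimum` installed for `to_bettertransformer`):

```python
import torch

# Route attention through torch.scaled_dot_product_attention
# (BetterTransformer conversion; requires the optimum library).
model = model.to_bettertransformer()

# Force the SDPA dispatcher to use only the Flash Attention kernel.
with torch.backends.cuda.sdp_kernel(
    enable_flash=True, enable_math=False, enable_mem_efficient=False
):
    outputs = model.generate(**inputs)
```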
This enables Flash Attention in the model. Note, however, that this feature does not support padding yet.
FSDP and DeepSpeed Changes

Users will no longer encounter CPU RAM OOM when using FSDP to train very large models in multi-GPU or multi-node multi-GPU settings.

Users no longer have to pass `fsdp_transformer_layer_cls_to_wrap`, as the code now uses `_no_split_modules` by default, which is available for most of the popular models (see the sketch below). DeepSpeed Z3 init now works properly with Accelerate Launcher + Trainer.
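A hedged sketch of the simplification (argument values and the layer-class name in the comment are illustrative, not something the release requires):

```python
from transformers import TrainingArguments

# Before v4.32, an FSDP run typically also needed something like
#   fsdp_config={"fsdp_transformer_layer_cls_to_wrap": "LlamaDecoderLayer"}.
# Now the model's _no_split_modules is used by default.
args = TrainingArguments(
    output_dir="out",
    fsdp="full_shard auto_wrap",
)
```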
Breaking changes

Default optimizer in the `Trainer` class

The default optimizer in the `Trainer` class has been updated to `adamw_torch` rather than our own `adamw_hf`, as the official Torch optimizer is more robust and fixes some issues. In order to keep the old behavior, ensure that you pass `"adamw_hf"` as the `optim` value in your `TrainingArguments`, as in the sketch below.
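A minimal sketch (only the relevant argument shown; `output_dir` is a placeholder):

```python
from transformers import TrainingArguments

# Explicitly keep the pre-4.32 optimizer.
args = TrainingArguments(output_dir="out", optim="adamw_hf")
```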
- `adamw_hf` to `adamw_torch` 🚨🚨🚨 by @muellerzr in #25109

ViVit and EfficientNet rescale bugfix
There was an issue with the definition of the rescaling of values in ViVit and EfficientNet. These have been fixed, but this will result in different model outputs for both of these models. To understand the change and see what needs to be done to obtain previous results, please take a look at the following PR.
Removing softmax for the image classification EfficientNet class
The `EfficientNetForImageClassification` model class did not follow conventions and added a softmax to the model logits. This was removed so that it respects the convention set by other models.

In order to obtain previous results, pass the model logits through a softmax, as sketched below.
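A minimal sketch (the `google/efficientnet-b0` checkpoint and the blank test image are illustrative):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, EfficientNetForImageClassification

checkpoint = "google/efficientnet-b0"  # illustrative checkpoint
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = EfficientNetForImageClassification.from_pretrained(checkpoint)

image = Image.new("RGB", (224, 224))  # stand-in for a real image
inputs = processor(images=image, return_tensors="pt")

logits = model(**inputs).logits
# The model no longer applies softmax internally; do it manually
# to recover the previous (probability) outputs.
probs = torch.softmax(logits, dim=-1)
```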
Bug fixes with SPM models
Some SPM models had issues with their management of added tokens. Namely the `Llama` and `T5` models, among others, were behaving incorrectly. These have been updated in https://github.com/huggingface/transformers/pull/25224.

An option to obtain the previous behavior was added through the `legacy` flag, as explained in the PR linked above and sketched below.
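A minimal sketch of opting back into the old behavior (`t5-small` is an illustrative checkpoint):

```python
from transformers import AutoTokenizer

# legacy=True restores the pre-4.32 SentencePiece behavior.
tokenizer = AutoTokenizer.from_pretrained("t5-small", legacy=True)
```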
- [`SPM`] Finish fix spm models 🚨🚨🚨 by @ArthurZucker in #25224

Bugfixes and improvements
- `use_cache=True` by @ydshieh in #24893
- `test_model_parallelism` for `FalconModel` by @ydshieh in #24914
- [`Llama2`] replace `self.pretraining_tp` with `self.config.pretraining_tp` by @younesbelkada in #24906
- `image_processing_vilt.py` wrong default documented by @stas00 in #24931
- `main_input_name` in `src/transformers/keras_callbacks.py` by @ydshieh in #24916
- `LogitsProcessor` class by @shauray8 in #24848
- [`RWKV`] Add Gradient Checkpointing support for RWKV by @younesbelkada in #24955
- `Parameter.ds_numel` by @apoorvkh in #24942
- [`LlamaConfig`] Nit: pad token should be None by default by @ArthurZucker in #24958
- `llama` tokenization doctest by @ydshieh in #24990
- [`bnb`] Add simple check for bnb import by @younesbelkada in #24995
- [`Llama`] remove persistent `inv_freq` tensor by @ArthurZucker in #24998
- [`logging.py`] set default `stderr` path if `None` by @ArthurZucker in #25033
- `TrainingArgs` to `wandb.config` without sanitization by @parambharat in #25035
- [`8bit`] Fix 8bit corner case with Blip2 8bit by @younesbelkada in #25047
- [`RWKV`] Add note in doc on `RwkvStoppingCriteria` by @ArthurZucker in #25055
- `TF32` flag for PyTorch cuDNN backend by @XuehaiPan in #25075
- `per_gpu_eval_batch_size` with `per_device_eval_batch_size` in readme of multiple-choice task by @statelesshz in #25078
- [`generate`] Only warn users if the `generation_config`'s `max_length` is set to the default value by @ArthurZucker in #25030
- [`ForSequenceClassification`] Support `left` padding by @ArthurZucker in #24979
- [`TF`] Also apply patch to support left padding by @ArthurZucker in #25085
- `test_model_is_small` by @connor-henderson in #25087
- [`PreTrainedTokenizerFast`] Keep properties from fast tokenizer by @ArthurZucker in #25053
- `MusicgenForConditionalGeneration` tests by @ydshieh in #25091
- [`T5`, `MT5`, `UMT5`] Add [T5, MT5, UMT5]ForSequenceClassification by @sjrl in #24726
- `PvtModelIntegrationTest::test_inference_fp16` by @ydshieh in #25106
- `use_auth_token` -> `token` by @ydshieh in #25083
- [`T5/LlamaTokenizer`] default legacy to `None` to not always warn by @ArthurZucker in #25131
- [`MptConfig`] support from pretrained args by @ArthurZucker in #25116
- `token` things by @ydshieh in #25146
- `.push_to_hub` and cleanup `get_full_repo_name` usage by @Wauplin in #25120
- `use_auth_token` -> `token` in example scripts by @ydshieh in #25167
- [`Mpt`] Fix mpt slow test by @younesbelkada in #25170
- [`InstructBlip`] Fix instructblip slow test by @younesbelkada in #25171
- `_prepare_output_docstrings` by @ydshieh in #25202
- [`PreTrainedModel`] Wrap `cuda` and `to` method correctly by @younesbelkada in #25206
- `all_model_classes` in `FlaxBloomGenerationTest` by @ydshieh in #25211
- [`pipeline`] revisit device check for pipeline by @younesbelkada in #25207
- [`Pix2Struct`] Fix pix2struct cross attention by @younesbelkada in #25200
- [`Docs` / `quantization`] Clearer explanation on how things work under the hood + remove outdated info by @younesbelkada in #25216
- [`MPT`] Add `require_bitsandbytes` on MPT integration tests by @younesbelkada in #25201
- [`Detr`] Fix detr BatchNorm replacement issue by @younesbelkada in #25230
- `token` argument in example scripts by @ydshieh in #25172
- `pytest_options={"rA": None}` in CI by @ydshieh in #25263
- `num_hidden_layers=2` 🚀🚀🚀 by @ydshieh in #25266
- `pytest_num_workers=8` for torch/tf jobs by @ydshieh in #25274
- `report_to` logging integrations in docstring by @tomaarsen in #25281
- `bark` could have tiny model by @ydshieh in #25290
- `trust_remote_code` in example scripts by @Jackmin801 in #25248
- `Repository` to `upload_folder` by @sgugger in #25095
- `NoRepeatNGramLogitsProcessor` Example for `LogitsProcessor` class by @Rishab26 in #25186
- `torch.compile()` for vision models by @merveenoyan in #24748
- `test_model_parallelism` by @ydshieh in #25359
- `token` in example template by @ydshieh in #25351
- `torch_job` worker(s) crashing by @ydshieh in #25374
- `token` by @ydshieh in #25382
- `OneFormerModelTest.test_model_with_labels` by @ydshieh in #25383
- `TopPLogitsWarper` by @chiral-carbon in #25361
- `device_map` is passed by @gante in #25413
- `torch.compile()` docs by @merveenoyan in #25432
- `examples` to tests to run when `setup.py` is modified by @ydshieh in #25437
- `main` on PRs/branches if `setup.py` is not modified by @ydshieh in #25445
- `main` on PRs/branches by @ydshieh in #25466
- `auxiliary_head` is `None` in `UperNetPreTrainedModel` by @mmurray in #25514
- `MaskFormerModelIntegrationTest`
Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR has been generated by Mend Renovate. View repository job log here.