fix(deps): update dependency transformers to v4.34.0 #977
This PR contains the following updates:

| Package | Change |
| --- | --- |
| transformers | `4.33.2` -> `4.34.0` |
Release Notes
huggingface/transformers (transformers)
v4.34.0: Mistral, Persimmon, Prompt templating, Flash Attention 2, Tokenizer refactor (Compare Source)
New models
Mistral
Mistral-7B-v0.1 is a decoder-based LM with the following architectural choices:
- Sliding Window Attention - trained with an 8k context length and a fixed cache size, with a theoretical attention span of 128K tokens
- GQA (Grouped Query Attention) - allowing faster inference and a smaller cache
- Byte-fallback BPE tokenizer - ensures that characters are never mapped to out-of-vocabulary tokens
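As a quick orientation, here is a minimal generation sketch; the `mistralai/Mistral-7B-v0.1` Hub id, the prompt, and the dtype/device choices are illustrative assumptions, not prescribed by the release notes:

```python
# Minimal sketch: load Mistral-7B and generate a short completion.
# Assumes transformers >= 4.34.0 and enough GPU memory for an fp16 7B model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("My favourite condiment is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```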
Persimmon
The authors introduced Persimmon-8B, a decoder model based on the classic transformers architecture, with query and key normalization. Persimmon-8B is a fully permissively licensed model with approximately 8 billion parameters, released under the Apache license. Some of the key attributes of Persimmon-8B are long context size (16K), performance, and capabilities for multimodal extensions.
- [`Persimmon`] Add support for persimmon by @ArthurZucker in #26042

BROS
BROS stands for BERT Relying On Spatiality. It is an encoder-only Transformer model that takes a sequence of tokens and their bounding boxes as inputs and outputs a sequence of hidden states. BROS encodes relative spatial information instead of using absolute spatial information.
ViTMatte
ViTMatte leverages plain Vision Transformers for the task of image matting, which is the process of accurately estimating the foreground object in images and videos.
Nougat
Nougat uses the same architecture as Donut, meaning an image Transformer encoder and an autoregressive text Transformer decoder to translate scientific PDFs to markdown, enabling easier access to them.
Prompt templating
We've added a new template feature for chat models. This allows the formatting that a chat model was trained with to be saved with the model, ensuring that users can exactly reproduce that formatting when they want to fine-tune the model or use it for inference. For more information, see our template documentation.
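A minimal sketch of applying a saved chat template, assuming a checkpoint that ships a template in its tokenizer config (the `mistralai/Mistral-7B-Instruct-v0.1` id is an illustrative assumption):

```python
# Minimal sketch: format a conversation with the template stored on the tokenizer.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")  # assumed id

chat = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "Doing great, thanks!"},
    {"role": "user", "content": "Can you summarize our chat?"},
]

# tokenize=False returns the formatted string rather than token ids;
# add_generation_prompt=True appends the tokens that cue the model to answer.
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
print(prompt)
```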
🚨🚨 Tokenizer refactor
- [`Tokenizer`] attempt to fix add_token issues by @ArthurZucker in #23909

🚨 Workflow Changes 🚨:
These are not breaking changes per se but rather bugfixes. However, we understand that this may result in some workflow changes so we highlight them below.
➕ Most visible features:
- `tokenizer.added_tokens_decoder` now holds the added tokens for both fast and slow tokenizers. Moreover, additional tokens that were already part of the initial vocab are also found there.
- Faster `from_pretrained`, faster `add_tokens`, because special and non-special tokens can be mixed together and the trie is not always rebuilt.
- `added_tokens_decoder/encoder` are saved in `tokenizer_config.json`.
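A minimal sketch of the new bookkeeping (the `bert-base-uncased` checkpoint id is only an illustrative choice):

```python
# Minimal sketch: add a token and inspect the unified added-token registry.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed id
tokenizer.add_tokens(["<custom_token>"])

# added_tokens_decoder maps token ids to AddedToken objects, for slow
# and fast tokenizers alike, including tokens from the initial vocab.
for token_id, token in tokenizer.added_tokens_decoder.items():
    print(token_id, repr(token))
```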
For any issues relating to this, make sure to open a new issue and ping @ArthurZucker.
Flash Attention 2
FA2 support has been added to transformers for the most popular architectures (llama, mistral, falcon); more architectures are actively being contributed in this issue (https://github.com/huggingface/transformers/issues/26350). Simply pass `use_flash_attention_2=True` when calling `from_pretrained`.
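A minimal sketch of opting in at load time (the checkpoint id is an illustrative assumption; FA2 also requires a supported GPU, a half-precision dtype, and the `flash-attn` package):

```python
# Minimal sketch: enable Flash Attention 2 when loading a supported model.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # assumed Hub id
    torch_dtype=torch.bfloat16,   # FA2 requires fp16 or bf16 weights
    use_flash_attention_2=True,
)
```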
In the future, PyTorch will support Flash Attention 2 through `torch.scaled_dot_product_attention`; users will then be able to benefit from both implementations of Flash Attention 2 (transformers core, and transformers + SDPA) with simple changes (`model.to_bettertransformer()` and force-dispatching the SDPA kernel to FA-2 in the case of SDPA).

- [`core`] Integrate Flash attention 2 in most used models by @younesbelkada in #25598

For our future plans regarding integrating F.sdpa from PyTorch in core transformers, see here: https://github.com/huggingface/transformers/issues/26557
Lazy import structure
Support for lazy loading integration libraries has been added. This will drastically speed up importing `transformers` and related objects from the library.
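A minimal sketch for measuring the import-time improvement yourself, using only the standard library (absolute numbers depend on the environment and on which integration libraries are installed):

```python
# Minimal sketch: time a cold `import transformers` in a fresh interpreter,
# so previously cached modules don't skew the measurement.
import subprocess
import sys
import time

start = time.perf_counter()
subprocess.run([sys.executable, "-c", "import transformers"], check=True)
print(f"import transformers: {time.perf_counter() - start:.2f}s")
```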
Bugfixes and improvements
- Fix `test_load_img_url_timeout` by @ydshieh in #25976
- Add `Pop2Piano` space demo by @susnato in #25975
- `generation_config` by @gante in #25987
- [`CI`] Fix red CI and ERROR failed should show by @ArthurZucker in #25995
- [`VITS`] tokenizer integration test: fix revision did not exist by @ArthurZucker in #25996
- Translated `llm_tutorial.md` to Korean by @harheem in #25791
- `tgs` speed metrics by @CokeDong in #25858
- Add `activation_dropout` and fix DropOut docs for SEW-D by @gau-nernst in #26031
- Translated `llama.md` to Korean by @harheem in #26044
- [`CodeLlamaTokenizerFast`] Fix `set_infilling_processor` to properly reset by @ArthurZucker in #26041
- [`CITests`] skip failing tests until #26054 is merged by @ArthurZucker in #26063
- [`core`] Import tensorflow inside relevant methods in `trainer_utils` by @younesbelkada in #26106
- `generation_config` is untouched by @gante in #25962
- Translated `llama2.md` to Korean by @mjk0618 in #26047
- Translated `contributing.md` to Korean by @mjk0618 in #25877
- Fix `MarianTokenizer` to remove metaspace character in `decode` by @tanaymeh in #26091
- [`core`] fix 4bit `num_parameters` by @younesbelkada in #26132
- [`RWKV`] Final fix RWKV 4bit by @younesbelkada in #26134
- `generation_config.max_length` is set to `None` by @gante in #26147
- Fix `test_finetune_bert2bert` by @ydshieh in #25984
- Fix `beam_scores` shape when token scores shape changes after `logits_processor` by @BakerBunker in #25980
- `accelerate` > 0.20.3 by @sam-scale in #26060
- [`PEFT`] Fix PEFT + gradient checkpointing by @younesbelkada in #25846
- `convert_bros_to_pytorch.py` by @ydshieh in #26212
- `utils/documentation_tests.txt` by @ydshieh in #26213
- Moving `ctrl` to `Salesforce/ctrl` by @julien-c in #26183
- Translated `whisper.md` to Korean by @nuatmochoi in #26002
- Fix `Error` not captured in PR doctesting by @ydshieh in #26215
- [`Trainer`] Refactor trainer + bnb logic by @younesbelkada in #26248
- `ALL_LAYERNORM_LAYERS` by @shijie-wu in #26227
- `model._keep_in_fp32_modules` is set even when `accelerate` is not installed by @fxmarty in #26225
- `store_test_results` by @ydshieh in #26223
- Translated `audio_classification.mdx` to Korean by @gabrielwithappy in #26200
- `RMSProp` optimizer by @natolambert in #26425
- `transformers` is installed without `tokenizers` by @urialon in #26236
- [`FA` / `tests`] Add use_cache tests for FA models by @younesbelkada in #26415
- [`PEFT`] Fix PEFT multi adapters support by @younesbelkada in #26407
- `runs-on` in workflow files by @ydshieh in #26435
- Translated `debugging.md` to Korean by @wonhyeongseo in #26246
- Translated `perf_train_gpu_many.md` to Korean by @wonhyeongseo in #26244
- Fix `cos_sin` device issue in Falcon model by @ydshieh in #26448
- [`PEFT`] introducing `adapter_kwargs` for loading adapters from a different Hub location (`subfolder`, `revision`) than the base model by @younesbelkada in #26270
- [`PEFT`] Pass token when calling `find_adapter_config` by @younesbelkada in #26488
- [`core` / `auto`] Fix bnb test with code revision + bug with code revision by @younesbelkada in #26431
- [`PEFT`] Protect `adapter_kwargs` check by @younesbelkada in #26537
- Translated `tokenizer_summary.md` to Korean by @wonhyeongseo in #26243
- `configuration_encoder_decoder.py` by @SrijanSahaySrivastava in #26519

Significant community contributions
The following contributors have made significant changes to the library over the last release:
@wonhyeongseo
- Translated `debugging.md` to Korean (#26246)
- Translated `perf_train_gpu_many.md` to Korean (#26244)
- Translated `tokenizer_summary.md` to Korean (#26243)

v4.33.3: Patch release (Compare Source)
A patch release was made for the following three commits:
Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Enabled.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR has been generated by Mend Renovate. View repository job log here.