
fix(deps): update dependency transformers to v4.34.0 #977

Merged
merged 1 commit into main from renovate/transformers-4.x-lockfile on Oct 13, 2023

Conversation

@renovate renovate bot commented Oct 13, 2023

Mend Renovate

This PR contains the following updates:

Package      | Change
transformers | 4.33.2 -> 4.34.0

Release Notes

huggingface/transformers (transformers)

v4.34.0: Mistral, Persimmon, Prompt templating, Flash Attention 2, Tokenizer refactor


New models

Mistral

Mistral-7B-v0.1 is a decoder-based LM with the following architectural choices:

  • Sliding Window Attention - trained with an 8k context length and a fixed cache size, with a theoretical attention span of 128K tokens
  • GQA (Grouped Query Attention) - allows faster inference and a smaller cache size.
  • Byte-fallback BPE tokenizer - ensures that characters are never mapped to out-of-vocabulary tokens.
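
A minimal loading sketch, not taken from the release notes: the Hub id, dtype, and device placement below are assumptions, used only to illustrate that Mistral works through the standard Auto classes.

# Hedged sketch: loading Mistral-7B-v0.1 via the Auto classes supported in v4.34.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit the 7B weights on a GPU
    device_map="auto",          # requires the accelerate package
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))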
Persimmon

The authors introduced Persimmon-8B, a decoder model based on the classic transformers architecture, with query and key normalization. Persimmon-8B is a fully permissively licensed model with approximately 8 billion parameters, released under the Apache license. Some of the key attributes of Persimmon-8B are long context size (16K), performance, and capabilities for multimodal extensions.

BROS

BROS stands for BERT Relying On Spatiality. It is an encoder-only Transformer model that takes a sequence of tokens and their bounding boxes as inputs and outputs a sequence of hidden states. BROS encodes relative spatial information instead of absolute spatial information.

ViTMatte

ViTMatte leverages plain Vision Transformers for the task of image matting, which is the process of accurately estimating the foreground object in images and videos.

Nougat

Nougat uses the same architecture as Donut, meaning an image Transformer encoder and an autoregressive text Transformer decoder to translate scientific PDFs to markdown, enabling easier access to them.

Prompt templating

We've added a new template feature for chat models. This allows the formatting that a chat model was trained with to be saved with the model, ensuring that users can exactly reproduce that formatting when they want to fine-tune the model or use it for inference. For more information, see our template documentation.
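
A rough sketch of how the feature is used; the checkpoint id below is an assumption and not part of the release notes.

# Hedged sketch of the new chat templating API.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")  # assumed id
chat = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "Doing well, thanks!"},
    {"role": "user", "content": "Summarise this release for me."},
]
# Render the conversation exactly as the model was trained to see it.
prompt = tokenizer.apply_chat_template(chat, tokenize=False)
print(prompt)

# The template is a tokenizer attribute and is persisted to tokenizer_config.json on save.
tokenizer.save_pretrained("./my-chat-tokenizer")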

🚨🚨 Tokenizer refactor

🚨Workflow Changes 🚨:

These are not breaking changes per se but rather bugfixes. However, we understand that this may result in some workflow changes so we highlight them below.

  • unique_no_split_tokens attribute removed and no longer used in the internal logic
  • sanitize_special_tokens() follows a deprecation cycle and does nothing
  • All attributes in SPECIAL_TOKENS_ATTRIBUTES are stored as AddedTokens, not strings.
  • Loading a slow tokenizer from a fast one (or a fast from a slow) will no longer raise an error if the added tokens don't have the correct index, because they are always added following the order of added_tokens; any mistakes in the saved vocabulary are corrected (and there are a lot in old-format tokenizers).
  • The length of a tokenizer is now max(set(self.get_vocab().keys())), accounting for holes in the vocab. vocab_size no longer takes the added vocab into account for most tokenizers (as it should not). Mostly breaking for T5.
  • Adding a token using tokenizer.add_tokens([AddedToken("hey", rstrip=False, normalized=True)]) now takes the rstrip, lstrip, and normalized information into account (see the sketch after this list).
  • added_tokens_decoder holds AddedToken objects, not strings.
  • add_tokens() for both fast and slow tokenizers will always update the token if it is already part of the vocab, allowing for custom stripping.
  • Initializing a tokenizer from scratch will now add missing special tokens to the vocab.
  • Stripping is not always done for special tokens! 🚨 Only if the AddedToken has lstrip=True and rstrip=True.
  • fairseq_ids_to_tokens attribute removed for Barthez (was not used)
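
A small sketch of the AddedToken handling described above; the base checkpoint is an assumption used only for illustration.

# Hedged sketch: rstrip/lstrip/normalized on AddedToken are now honoured, and
# added_tokens_decoder maps ids to AddedToken objects rather than plain strings.
from transformers import AddedToken, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed checkpoint
tokenizer.add_tokens([AddedToken("hey", rstrip=False, lstrip=False, normalized=True)])
print(tokenizer.added_tokens_decoder)  # {id: AddedToken(...), ...}
print(len(tokenizer))                  # length now accounts for holes in the vocab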

➕ Most visible features:

  • Printing a tokenizer now shows tokenizer.added_tokens_decoder for both fast and slow tokenizers. Moreover, additional tokens that were already part of the initial vocab are also found there.
  • Faster from_pretrained and faster add_tokens, because special and non-special tokens can be mixed together and the trie is not always rebuilt.
  • Faster encode/decode thanks to a caching mechanism for added_tokens_decoder/encoder.
  • Information is fully saved in tokenizer_config.json.

For any issues relating to this, make sure to open a new issue and ping @ArthurZucker.

Flash Attention 2

FA2 support has been added to transformers for the most popular architectures (llama, mistral, falcon); further architectures are actively being contributed in this issue (https://github.com/huggingface/transformers/issues/26350). Simply pass use_flash_attention_2=True when calling from_pretrained.
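
A minimal opt-in sketch; the checkpoint id and dtype are assumptions, and the flash-attn package plus a supported GPU are required.

# Hedged sketch: enabling Flash Attention 2 at load time with the new flag.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",   # assumed checkpoint id
    torch_dtype=torch.float16,      # FA2 kernels require fp16/bf16
    use_flash_attention_2=True,     # the flag introduced in this release
    device_map="auto",
)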

In the future, PyTorch will support Flash Attention 2 through torch.scaled_dot_product_attention, and users will be able to benefit from both implementations of Flash Attention 2 (transformers core and transformers + SDPA) with simple changes (model.to_bettertransformer() and force-dispatching the SDPA kernel to FA-2 in the SDPA case).

For our future plans regarding integrating F.sdpa from PyTorch in core transformers, see here: https://github.com/huggingface/transformers/issues/26557

Lazy import structure

Support for lazy loading of integration libraries has been added. This will drastically speed up importing transformers and related objects from the library.

Example before this change:

2023-09-11 11:07:52.010179: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
python3 -c "from transformers import CLIPTextModel"  3.31s user 3.06s system 220% cpu 2.893 total

After this change:

python3 -c "from transformers import CLIPTextModel"  1.70s user 1.49s system 220% cpu 1.447 total

Bugfixes and improvements

Significant community contributions

The following contributors have made significant changes to the library over the last release:

v4.33.3: Patch release: v4.33.3


A patch release was made for the following three commits:

  • DeepSpeed ZeRO-3 handling when resizing embedding layers (#26259)
  • [doc] Always call it Agents for consistency (#25958)
  • deepspeed resume from ckpt fixes and adding support for deepspeed optimizer and HF scheduler (#25863)

Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Enabled.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR has been generated by Mend Renovate. View repository job log here.

@renovate renovate bot added the dependencies Pull requests that update a dependency file label Oct 13, 2023
codecov bot commented Oct 13, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (3ae20b7) 19.25% compared to head (261ae7c) 19.25%.
Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #977   +/-   ##
=======================================
  Coverage   19.25%   19.25%           
=======================================
  Files          39       39           
  Lines        3495     3495           
  Branches      497      497           
=======================================
  Hits          673      673           
  Misses       2803     2803           
  Partials       19       19           
Files Coverage Δ
src/so_vits_svc_fork/__init__.py 100.00% <100.00%> (ø)

☔ View full report in Codecov by Sentry.

@renovate renovate bot force-pushed the renovate/transformers-4.x-lockfile branch from 6368ac4 to 261ae7c on October 13, 2023 06:38
@34j 34j merged commit 6bb2555 into main Oct 13, 2023
9 checks passed
@34j 34j deleted the renovate/transformers-4.x-lockfile branch October 13, 2023 07:43