Update transformers #539
Conversation
This reverts commit f65728f.
…dio into max/update_transformers
Overall, looks good, thx.
But let's discuss the proper setting for the config.
@@ -677,17 +677,18 @@ def create_nlp_backbone(cfg, model_class=AutoModel) -> Any:
    try:
        import flash_attn  # noqa: F401

        use_flash_attention_2 = cfg.training.use_flash_attention_2
        # see https://github.com/fxmarty/transformers/
Is this the correct way to split a URL for the char limit?
I did it manually, happy to split it differently. Not sure if there's a commonly used convention for this.
        # see https://github.com/fxmarty/transformers/
        # blob/3f06a3a0aec8cc1ec3ad6bf66ebe277392c5ab37/
        # src/transformers/configuration_utils.py#L380
        config._attn_implementation_internal = "flash_attention_2"
Shouldn't we rather set attn_implementation here?
This is also used there: https://github.com/huggingface/transformers/blob/v4.36.1/src/transformers/models/llama/modeling_llama.py#L756
Where exactly is _internal used then?
https://github.com/huggingface/transformers/blob/main/src/transformers/modeling_utils.py#L1277
_attn_implementation_internal is an internal variable to store the implementation to use:

    # Attention implementation to use, if relevant.
    self._attn_implementation_internal = kwargs.pop("attn_implementation", None)

As you noticed, _attn_implementation is used within the model classes. _attn_implementation itself is actually a property that cannot be set. I guess it was designed that way to ensure backwards compatibility (although it could also have been unified in the init method).
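To illustrate the pattern, here is a minimal hypothetical sketch (DummyConfig is made up, not the actual transformers source; the attribute names just mirror the linked configuration_utils.py):

    class DummyConfig:
        def __init__(self, **kwargs):
            # Attention implementation to use, if relevant.
            self._attn_implementation_internal = kwargs.pop("attn_implementation", None)

        @property
        def _attn_implementation(self):
            # Read-only view; falls back to "eager" when nothing was set.
            return self._attn_implementation_internal or "eager"

    cfg = DummyConfig()
    print(cfg._attn_implementation)  # "eager"
    cfg._attn_implementation_internal = "flash_attention_2"  # what the diff above does
    print(cfg._attn_implementation)  # "flash_attention_2"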
Got it, thx.
It seems one can pass it via the model constructor though:
https://huggingface.co/docs/transformers/perf_infer_gpu_one
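For reference, the linked docs page passes the implementation at load time via attn_implementation; a minimal sketch (the model id is only a placeholder, and flash-attn plus a supported GPU are required):

    import torch
    from transformers import AutoModelForCausalLM

    # Placeholder model id; any flash-attention-capable architecture works.
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf",
        torch_dtype=torch.bfloat16,
        attn_implementation="flash_attention_2",
    )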
@maxjeblick if you are confident this _internal setting is sufficient, let's merge
thx!!!
This PR updates the transformers version to 4.36.
With this, flash attention is supported (and enabled by default) natively; see https://twitter.com/efxmarty/status/1734931075367850385
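A condensed, hypothetical sketch of how the changed backbone creation could tie together; only the lines shown in the diff above come from the PR, while the config loading, the ImportError fallback, and the cfg field names are assumptions:

    from typing import Any

    from transformers import AutoConfig, AutoModel

    def create_nlp_backbone(cfg, model_class=AutoModel) -> Any:
        # Loading the config from the configured backbone name is an assumption.
        config = AutoConfig.from_pretrained(cfg.llm_backbone)
        try:
            import flash_attn  # noqa: F401

            if cfg.training.use_flash_attention_2:
                # see https://github.com/fxmarty/transformers/blob/3f06a3a0aec8cc1ec3ad6bf66ebe277392c5ab37/src/transformers/configuration_utils.py#L380
                config._attn_implementation_internal = "flash_attention_2"
        except ImportError:
            # Without flash-attn installed, keep the default attention implementation.
            pass
        return model_class.from_config(config)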