Better SDPA unmasking implementation #29318

Merged
merged 9 commits into huggingface:main on Feb 28, 2024

Conversation

fxmarty (Contributor) commented Feb 27, 2024

As @ArthurZucker improved the unmasking for SDPA's mem-efficient code path, let's do the same for all architectures using SDPA (#27931).
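
Context, not part of this PR's diff: the unmasking trick from #27931 zeroes out rows of the expanded additive mask that are fully masked (e.g. left-padding positions), because SDPA's mem-efficient kernel would otherwise produce NaN for them; those rows' outputs are discarded downstream anyway. A minimal sketch of that idea, with function and parameter names of my own choosing:

import torch

# Hedged sketch of the "unmask unattended rows" idea; not the PR's exact code.
# expanded_mask: float additive mask of shape [batch, heads, q_len, kv_len],
# with masked positions set to min_dtype (e.g. torch.finfo(dtype).min) and
# attended positions set to 0.
def unmask_fully_masked_rows(expanded_mask: torch.Tensor, min_dtype: float) -> torch.Tensor:
    # A row is "unattended" when every key position is masked; multiplying such
    # rows by 0 turns them into all-zero (fully attended) rows, which prevents
    # the mem-efficient SDPA kernel from returning NaN for them.
    fully_masked = torch.all(expanded_mask == min_dtype, dim=-1, keepdim=True)
    return expanded_mask.mul(~fully_masked)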

@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

fxmarty (Contributor, Author) commented Feb 27, 2024

RUN_SLOW=1 CUDA_VISIBLE_DEVICES=3 pytest tests/ -k "test_eager_matches_sdpa_inference" -s -vvvvv

passes, except for qwen2 (the failure is unrelated, see #28436 (comment)).

Comment on lines 204 to 205

if expanded_mask.dtype == torch.bool:
    raise ValueError("AttentionMaskConverter._unmask_unattended expects a float `expanded_mask`, got a BoolTensor.")

fxmarty (Contributor, Author):

Some models (GPT-BigCode) use bool tensors, but Arthur's implementation can't work with that dtype.

Collaborator:

can't we cast it and replace the min with 0?

fxmarty (Contributor, Author):

For now I expect the cast to be done explicitly in the modeling file.

Collaborator:

fine by me
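
Illustration of the explicit cast discussed above; this is my own sketch, not code from the PR. A modeling file that carries a bool mask (True meaning "attend") could convert it to an additive float mask before calling the float-only helper:

import torch

# Hypothetical helper, for illustration only; the PR itself adds the dtype
# check and leaves the cast to the individual modeling files.
def bool_to_additive_mask(bool_mask: torch.Tensor, dtype: torch.dtype = torch.float32) -> torch.Tensor:
    # True means "attend", False means "masked out"; masked positions receive
    # the dtype's most negative value so softmax drives their weight to ~0.
    min_value = torch.finfo(dtype).min
    additive = torch.zeros(bool_mask.shape, dtype=dtype, device=bool_mask.device)
    return additive.masked_fill(~bool_mask, min_value)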

ArthurZucker (Collaborator) left a comment:

LGTM, thanks for propagating the changes.

src/transformers/modeling_attn_mask_utils.py (two resolved review threads)
@fxmarty merged commit 49204c1 into huggingface:main on Feb 28, 2024. 20 checks passed.
itazap pushed a commit that referenced this pull request on May 14, 2024:

* better unmask imple
* comment
* typo
* bug report pytorch
* cleanup
* fix import
* add back example
* retrigger ci
* come on