
Support Value Zeroing for non-eager attention types #267

Merged 1 commit into main on Apr 25, 2024
Conversation

gsarti (Member) commented on Apr 25, 2024

Description

This PR addresses the bug with SDPA attention described in #266 and enables out-of-the-box use of Value Zeroing (VZ) for all attention implementations.
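For context, a minimal sketch of why non-eager attention needs special handling here: PyTorch's fused `scaled_dot_product_attention` (the SDPA path) returns only the attention output and never materializes the attention weight matrix, so a method like Value Zeroing that needs per-head weights must either force eager attention or recompute the weights manually. This is an illustrative example, not the PR's actual code.

```python
import math
import torch
import torch.nn.functional as F

torch.manual_seed(0)
q = torch.randn(1, 2, 4, 8)  # (batch, heads, seq_len, head_dim)
k = torch.randn(1, 2, 4, 8)
v = torch.randn(1, 2, 4, 8)

# Fused SDPA path: a single output tensor, attention weights are not exposed.
fused_out = F.scaled_dot_product_attention(q, k, v)

# Eager-style recomputation: the weights are materialized explicitly,
# which is what weight-based methods like Value Zeroing rely on.
weights = torch.softmax(q @ k.transpose(-2, -1) / math.sqrt(q.size(-1)), dim=-1)
manual_out = weights @ v

# Both paths compute the same attention output.
assert torch.allclose(fused_out, manual_out, atol=1e-5)
```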

@gsarti gsarti enabled auto-merge (squash) April 25, 2024 19:55
@gsarti gsarti disabled auto-merge April 25, 2024 19:56
@gsarti gsarti linked an issue Apr 25, 2024 that may be closed by this pull request
@gsarti gsarti merged commit f2b9d92 into main Apr 25, 2024
3 checks passed
@gsarti gsarti deleted the eager-attn-req branch April 25, 2024 19:56
LuukSuurmeijer pushed a commit to LuukSuurmeijer/inseq that referenced this pull request May 10, 2024
gsarti added a commit that referenced this pull request Jul 23, 2024
* Save attribution tensors with a lower precision (#202)

Adds functionality for saving feature attribution objects and tensors in float16 or float8 format, depending on the `scores_precision` parameter.
Tensors are saved in the Hugging Face safetensors format and quantized using
zero-point quantization. Because safetensors are bytes objects, they are
base64-encoded to be saved in the output JSON and decoded upon
reloading.
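A minimal sketch of the two pieces described above, zero-point quantization and base64-embedding raw tensor bytes in JSON. This is a generic illustration with hypothetical helper names (`quantize_zeropoint`, `encode_for_json`), not the actual inseq implementation, and it uses plain uint8 arrays rather than the safetensors serialization the PR uses.

```python
import base64
import json
import numpy as np

def quantize_zeropoint(x: np.ndarray):
    """Asymmetric (zero-point) quantization of a float array to uint8."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0 or 1.0  # guard against constant arrays
    zero_point = round(-lo / scale)
    q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize_zeropoint(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    return (q.astype(np.float32) - zero_point) * scale

def encode_for_json(q: np.ndarray, scale: float, zero_point: int) -> str:
    # Raw bytes are not valid JSON, so the byte payload is base64-encoded,
    # mirroring the approach described in the commit message above.
    return json.dumps({
        "shape": q.shape,
        "scale": scale,
        "zero_point": zero_point,
        "data": base64.b64encode(q.tobytes()).decode("ascii"),
    })

def decode_from_json(s: str) -> np.ndarray:
    p = json.loads(s)
    q = np.frombuffer(base64.b64decode(p["data"]), dtype=np.uint8).reshape(p["shape"])
    return dequantize_zeropoint(q, p["scale"], p["zero_point"])

# Round trip: quantization error is bounded by roughly one quantization step.
scores = np.linspace(-1.0, 1.0, 16, dtype=np.float32)
restored = decode_from_json(encode_for_json(*quantize_zeropoint(scores)))
assert np.max(np.abs(restored - scores)) < 0.01
```

The same idea applies to float8 targets; only the number of representable levels (and therefore the scale) changes.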

* Add support for `device_map` use (#264)

* Add device_map support

* Fix device setter in HF model

* Support Value Zeroing for non-eager attention types (#267)

* Fix nans in attn, add pad if missing (#269)

* Add transformers v4.40 models to config, update changelog

* run tests

* Minor fixes

* fixed json decode error

* Switched to torch-native float8_e4m3fn format

* Fix style

* Fix reqs

* Fix safety

* Add changelog

---------

Co-authored-by: Gabriele Sarti <gabriele.sarti996@gmail.com>
Successfully merging this pull request may close these issues.

Issue with the implementation of Value Zeroing