
Flash Attention Implementation & Fuller Config Options #139

Merged 9 commits into main on Apr 9, 2024

Conversation

benjaminye (Contributor)

Addresses #130

@benjaminye added the enhancement (New feature or request) and dependencies (Pull requests that update a dependency file) labels on Apr 8, 2024
@benjaminye linked an issue on Apr 8, 2024 that may be closed by this pull request
Comment on lines +61 to +65
**pip**

```
pip install flash-attn --no-build-isolation
```
Contributor

Is it possible to install it with llm-toolkit, so users do not have to do this?

benjaminye (Contributor, Author)

Not possible as the package doesn't support PEP 517. See python-poetry/poetry#8427, Dao-AILab/flash-attention#493 (comment).

benjaminye (Contributor, Author)

^^ Also that's the first thing I tried 🙈
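For context on the thread above: once flash-attn has been installed separately, the usual way to turn it on with Hugging Face transformers is the `attn_implementation` argument to `from_pretrained`, with a fallback to PyTorch's SDPA kernel when the package is missing. A minimal sketch under those assumptions (placeholder model id; not the toolkit's actual code):

```python
# Sketch only: illustrates the typical optional flash-attn pattern,
# not the llmtune implementation. The model id is a placeholder.
import importlib.util

import torch
from transformers import AutoModelForCausalLM


def flash_attn_available() -> bool:
    """Return True if the optional flash-attn package is importable."""
    return importlib.util.find_spec("flash_attn") is not None


# Fall back to PyTorch's scaled-dot-product attention when flash-attn is absent.
attn_implementation = "flash_attention_2" if flash_attn_available() else "sdpa"

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",           # placeholder model id
    attn_implementation=attn_implementation,
    torch_dtype=torch.float16,             # flash-attn requires fp16/bf16 weights
)
```

Because flash-attn cannot be declared as a Poetry-managed dependency here, a runtime availability check like this keeps the toolkit usable whether or not the user has run the manual `pip install flash-attn --no-build-isolation` step.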

llmtune/inference/lora.py: review thread resolved (outdated)
@benjaminye merged commit fa532c0 into main on Apr 9, 2024
3 checks passed
@benjaminye deleted the flash-attn branch on April 10, 2024
Successfully merging this pull request may close these issues:

Add flash_attention_1/flash_attention_2 support & examples