Update Flash Attention forward for Llama 2: #3595

Merged
2 commits merged into LAION-AI:main on Jul 23, 2023

Conversation

@jordiclive (Collaborator) commented Jul 23, 2023

GQA (grouped-query attention) for the 34B and 70B models and tensor parallelism (TP) support have been added.

The current flash-attention patch's forward was written for the old transformers attention forward, so it needs to be updated for the new Llama 2 attention.
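For reference, a minimal sketch of what an updated flash-attention monkey patch for the GQA-aware LlamaAttention forward could look like. This is an illustration, not the code from this PR: it assumes transformers >= 4.31 (which adds `num_key_value_heads` / `repeat_kv`) and flash-attn 2.x, runs in fp16/bf16 on GPU, and ignores padding masks and the KV cache.

```python
# Sketch only: not the actual patch merged in this PR.
import torch
from flash_attn import flash_attn_func
from transformers.models.llama.modeling_llama import (
    LlamaAttention,
    apply_rotary_pos_emb,
    repeat_kv,
)


def flash_forward(self, hidden_states, attention_mask=None, position_ids=None,
                  past_key_value=None, output_attentions=False, use_cache=False):
    bsz, q_len, _ = hidden_states.size()

    # Project and reshape to (batch, heads, seq, head_dim). K/V use the
    # (possibly smaller) number of key/value heads introduced for GQA.
    q = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
    k = self.k_proj(hidden_states).view(bsz, q_len, self.num_key_value_heads, self.head_dim).transpose(1, 2)
    v = self.v_proj(hidden_states).view(bsz, q_len, self.num_key_value_heads, self.head_dim).transpose(1, 2)

    # Rotary position embeddings (assumes position_ids is provided by the model).
    cos, sin = self.rotary_emb(v, seq_len=q_len)
    q, k = apply_rotary_pos_emb(q, k, cos, sin, position_ids)

    # Expand K/V heads so every query head has a matching key/value head.
    k = repeat_kv(k, self.num_key_value_groups)
    v = repeat_kv(v, self.num_key_value_groups)

    # flash_attn_func expects (batch, seq, heads, head_dim); causal mask is built in.
    out = flash_attn_func(
        q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2), causal=True
    )
    out = out.reshape(bsz, q_len, self.hidden_size)
    return self.o_proj(out), None, None


# Apply the patch before the model is instantiated.
LlamaAttention.forward = flash_forward
```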

@andreaskoepf (Collaborator) left a comment

Very cool thanks a lot!

@andreaskoepf merged commit 3c8f93e into LAION-AI:main on Jul 23, 2023
1 check passed