
[Feature] Support Llama-2 with GQA #147

Merged: 8 commits merged into InternLM:main on Jul 21, 2023
Conversation

lzhangzz
Collaborator

@lzhangzz lzhangzz commented Jul 19, 2023

Support Llama-2 70B, which uses grouped-query attention (GQA).

Tested on 8 x A100 GPUs

TODO:

  • disable FMHA & set n_kv_heads for GQA models
  • compatibility fix for models in llama format
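In GQA, several query heads share a single key/value head (Llama-2 70B uses 64 query heads over 8 KV heads, so `n_kv_heads=8`), which shrinks the KV cache by the group factor. A minimal NumPy sketch of the idea, not TurboMind's actual kernel (function name and layout are illustrative):

```python
import numpy as np

def grouped_query_attention(q, k, v, n_heads, n_kv_heads):
    """GQA sketch: each group of query heads attends with one shared KV head.

    q: (n_heads, seq, d); k, v: (n_kv_heads, seq, d).
    """
    group = n_heads // n_kv_heads
    # Broadcast each KV head to the query heads in its group.
    k = np.repeat(k, group, axis=0)  # -> (n_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v  # (n_heads, seq, d)
```

With `n_kv_heads == n_heads` this reduces to standard multi-head attention, and with `n_kv_heads == 1` to multi-query attention; the KV cache only ever stores the `n_kv_heads` copies.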

@lvhan028
Collaborator

Please resolve the linting error and update the 'News' section in the README.

@lzhangzz
Collaborator Author

> Please resolve the linting error and update the 'News' section in the README.

@lvhan028 done

Collaborator

@grimoire grimoire left a comment


LGTM

@lvhan028
Collaborator

Please update the Llama-2 7B/13B/70B serving methods in docs/en/serving.md and docs/zh_cn/serving.md.

@grimoire
Collaborator

internlm-7b is broken again.

@lvhan028 lvhan028 merged commit f07b697 into InternLM:main Jul 21, 2023
1 check passed

3 participants