
Support qwen1.5 in turbomind engine #1406

Merged: 12 commits merged into InternLM:main on Apr 9, 2024

Conversation

@lvhan028 (Collaborator) commented Apr 7, 2024

Note: window attention is not supported.
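(For context, a minimal sketch of loading a Qwen1.5 chat model on the turbomind backend through lmdeploy's pipeline API; the model path and config values below are illustrative, not taken from this PR.)

from lmdeploy import pipeline, TurbomindEngineConfig

# Illustrative model path; tp/session_len are placeholder values, not from this PR.
backend_config = TurbomindEngineConfig(tp=1, session_len=8192)
pipe = pipeline('Qwen/Qwen1.5-7B-Chat', backend_config=backend_config)
print(pipe(['Hello, who are you?']))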

@zhyncs (Contributor) commented Apr 8, 2024

Hi @lvhan028, does this PR support Qwen1.5-0.5B and Qwen1.5-1.8B? Thanks.

@lvhan028 (Collaborator, Author) commented Apr 8, 2024

0.5B: no, since its head_dim is 64 while the turbomind engine hardcodes head_dim to 128.
1.8B: yes.
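(A quick way to check which Qwen1.5 variants fit this constraint is to read head_dim off the HuggingFace config; the helper below is a sketch for illustration, not part of this PR.)

from transformers import AutoConfig

def head_dim(model_id: str) -> int:
    # For Qwen2-style configs, head_dim = hidden_size / num_attention_heads.
    cfg = AutoConfig.from_pretrained(model_id)
    return cfg.hidden_size // cfg.num_attention_heads

# At this point the turbomind engine only handles head_dim == 128.
for m in ('Qwen/Qwen1.5-0.5B-Chat', 'Qwen/Qwen1.5-1.8B-Chat'):
    d = head_dim(m)
    print(m, d, 'ok for turbomind' if d == 128 else 'not supported')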

@zhyncs (Contributor) commented Apr 8, 2024

0.5B: no, since its head_dim is 64 while the turbomind engine hardcodes head_dim to 128. 1.8B: yes.

Thanks for your reply. Maybe we could also update the supported-models table. Thanks.

@lvhan028 (Collaborator, Author) commented Apr 8, 2024

0.5B: no, since its head_dim is 64 while the turbomind engine hardcodes head_dim to 128. 1.8B: yes.

Thanks for your reply. Maybe we could also update the supported-models table. Thanks.

sure. updated.

@lvhan028 added the "enhancement" (New feature or request) label on Apr 8, 2024
@RunningLeon (Collaborator) commented

OpenCompass results look OK:

Model               mmlu   gsm8k
qwen1.5-7b-chat-HF  61.48  55.65
qwen1.5-7b-chat-TB  61.47  54.74

@RunningLeon (Collaborator) left a comment

LGTM

@irexyc (Collaborator) commented Apr 8, 2024

Have you tried the lmdeploy convert command to convert the model?

@lvhan028 (Collaborator, Author) commented Apr 8, 2024

Have you tried the lmdeploy convert command to convert the model?

Not yet. I'll test it ASAP.

@lvhan028 (Collaborator, Author) commented Apr 8, 2024

lmdeploy convert qwen <qwen1.5-model-path>

@irexyc (Collaborator) commented Apr 9, 2024

Traceback (most recent call last):
  File "/home/chenxin/miniconda3/envs/38/bin/lmdeploy", line 33, in <module>
    sys.exit(load_entry_point('lmdeploy', 'console_scripts', 'lmdeploy')())
  File "/home/chenxin/ws3/vl/lmdeploy/cli/entrypoint.py", line 26, in run
    args.run(args)
  File "/home/chenxin/ws3/vl/lmdeploy/cli/cli.py", line 151, in convert
    main(**kwargs)
  File "/home/chenxin/ws3/vl/lmdeploy/turbomind/deploy/converter.py", line 265, in main
    tokenizer_path = get_tokenizer_path(model_path, tokenizer_path)
  File "/home/chenxin/ws3/vl/lmdeploy/turbomind/deploy/converter.py", line 45, in get_tokenizer_path
    assert tokenizer_path, 'please supply tokenizer path by --tokenizer-path'
AssertionError: please supply tokenizer path by --tokenizer-path

@lvhan028 (Collaborator, Author) commented Apr 9, 2024

Traceback (most recent call last):
  File "/home/chenxin/miniconda3/envs/38/bin/lmdeploy", line 33, in <module>
    sys.exit(load_entry_point('lmdeploy', 'console_scripts', 'lmdeploy')())
  File "/home/chenxin/ws3/vl/lmdeploy/cli/entrypoint.py", line 26, in run
    args.run(args)
  File "/home/chenxin/ws3/vl/lmdeploy/cli/cli.py", line 151, in convert
    main(**kwargs)
  File "/home/chenxin/ws3/vl/lmdeploy/turbomind/deploy/converter.py", line 265, in main
    tokenizer_path = get_tokenizer_path(model_path, tokenizer_path)
  File "/home/chenxin/ws3/vl/lmdeploy/turbomind/deploy/converter.py", line 45, in get_tokenizer_path
    assert tokenizer_path, 'please supply tokenizer path by --tokenizer-path'
AssertionError: please supply tokenizer path by --tokenizer-path

What's the command?

@irexyc (Collaborator) commented Apr 9, 2024

What's the command?

lmdeploy convert qwen /mnt/140/Qwen/Qwen1.5-7B-Chat/
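(For readers who hit the same assertion: the error message itself points at a possible workaround, passing the tokenizer location explicitly, e.g. lmdeploy convert qwen /mnt/140/Qwen/Qwen1.5-7B-Chat/ --tokenizer-path <path-to-tokenizer-file>; whether that is the intended fix or the converter needed a code change is not settled in this exchange.)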

@lvhan028 lvhan028 merged commit edca3d3 into InternLM:main Apr 9, 2024
3 of 5 checks passed
@xiaoxiaoyuwen commented Apr 10, 2024

@lvhan028 does this support qwen1.5-14b-chat with int8 kv cache?
Running calibration raises an error:

RuntimeError: Currently, quantification and calibration of Qwen2ForCausalLM are not supported.

@lvhan028 (Collaborator, Author) commented

Hi @xiaoxiaoyuwen,
We are going to remove offline kv int8 and adopt online kv int8 in PR #1377.
PR #1412 provides the guide.
This feature will be released in the next version. Stay tuned.
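(Once the online kv int8 path lands, enabling it should roughly look like the sketch below; the quant_policy value and its availability are assumptions based on the referenced PRs, not something confirmed in this thread.)

from lmdeploy import pipeline, TurbomindEngineConfig

# Assumption: quant_policy=8 selects the online int8 kv cache once #1377 lands;
# verify against the guide in #1412 before relying on it.
engine_config = TurbomindEngineConfig(quant_policy=8)
pipe = pipeline('Qwen/Qwen1.5-14B-Chat', backend_config=engine_config)
print(pipe(['hello']))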
