
fix benchmark generation #1349

Merged — 2 commits merged into InternLM:main on Mar 28, 2024
Conversation

grimoire (Collaborator)

zhulinJulia24 (Collaborator) commented Mar 27, 2024

The issue from the bug report has been fixed and `--tp 2` now runs, but at the very end it fails with `assert output_seqlen <= n_token <= output_seqlen + 1`. Should we merge this first, or fix that together here? @grimoire

```
profiling ... concurrency: 1, n_prompt_token: 128, n_completion_token: 2048, test_round: 3, warmup_round: 1
2024-03-27 00:02:30,096 - lmdeploy - INFO - Checking environment for PyTorch Engine.
2024-03-27 00:02:31,235 - lmdeploy - INFO - Checking model.
Loading checkpoint shards: 100%|██████████| 21/21 [00:01<00:00, 14.54it/s]
2024-03-27 00:02:37,233 - lmdeploy - INFO - distribute model parameters.
2024-03-27 00:02:55,797 - lmdeploy - INFO - build CacheEngine with config: CacheConfig(block_size=64, num_cpu_blocks=682, num_gpu_blocks=7401, window_size=-1, cache_max_entry_count=0.8, max_prefill_token_num=4096)
start to warmup ...
end warmup, elapsed time: 91.2s
 33%|███▎      | 1/3 [01:33<03:07, 93.66s/it]
Exception in thread Thread-4:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/__w/lmdeploy/lmdeploy/benchmark/profile_generation.py", line 64, in infer
    assert output_seqlen <= n_token <= output_seqlen + 1,
AssertionError: Error. session_id(1) request 2048 tokens, but generate 2047 tokens
Traceback (most recent call last):
  File "benchmark/profile_generation.py", line 472, in <module>
    main()
  File "benchmark/profile_generation.py", line 425, in main
    output = _process_map(profile_target, (args.model_path, ))
  File "benchmark/profile_generation.py", line 372, in _process_map
    raise ret
ValueError: need at least one array to stack
/usr/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 3 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
```
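The two errors in the log are related: the per-session assertion fires because the engine generated one token fewer than requested, the benchmark thread dies, and the main process then tries to stack an empty result list, which is what produces the secondary `ValueError`. A minimal sketch reproducing both failure modes (only the assertion expression and its values come from the traceback; `check_generation` is a hypothetical stand-in for the check in `profile_generation.py`):

```python
import numpy as np


def check_generation(session_id: int, output_seqlen: int, n_token: int):
    # The +1 slack presumably allows for a trailing EOS token; anything
    # below the requested length is treated as a failed generation.
    assert output_seqlen <= n_token <= output_seqlen + 1, (
        f'Error. session_id({session_id}) request {output_seqlen} tokens, '
        f'but generate {n_token} tokens')


# The run above requested 2048 completion tokens but got 2047,
# so the assertion fires inside the worker thread:
try:
    check_generation(1, 2048, 2047)
except AssertionError as e:
    print(e)

# Because the thread died before recording any results, the main process
# ends up stacking an empty list, which raises the secondary error:
try:
    np.stack([])
except ValueError as e:
    print(e)  # need at least one array to stack
```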

zhulinJulia24 (Collaborator) left a comment:

LGTM

RunningLeon (Collaborator) left a comment:

LGTM

@lvhan028 lvhan028 merged commit 385e9bb into InternLM:main Mar 28, 2024
5 checks passed
4 participants