Releases · InternLM/lmdeploy
v0.9.0
What's Changed
🚀 Features
- LMDeploy Distserve by @JimyMa in #3304
- allow api server terminated through requests from clients by @RunningLeon in #3533
- support update params for pytorch backend from api server by @irexyc in #3535
- support eplb for Qwen3-MoE by @zhaochaoxing in #3582
- support update params for turbomind backend by @irexyc in #3566
- Quantize Qwen3 MoE bf16 model to fp8 model at runtime by @grimoire in #3631
- [Feat]: Support internvl3-8b-hf by @RunningLeon in #3633
- Add FP8 MoE for turbomind by @lzhangzz in #3601
💥 Improvements
- reduce ray memory usage by @grimoire in #3487
- use dlblas by @zhaochaoxing in #3489
- internlm3 dense fp8 by @CUHKSZzxy in #3527
- random pad input ids by @grimoire in #3530
- ray nsys profile support by @grimoire in #3448
- update blockedfp8 scale name by @CUHKSZzxy in #3532
- start engine loop on server startup event by @grimoire in #3523
- update two microbatch by @SHshenhao in #3540
- [ascend]set transdata dynamic shape true by @JackWeiw in #3531
- ray safe exit by @grimoire in #3545
- support update params with dp=1 for pytorch engine by @irexyc in #3562
- Skip dp dummy input forward by @grimoire in #3552
- Unlock mutual exclusivity of arguments `tool-call-parser` and `reasoning-parser` by @jingyibo123 in #3550
- perform torch.cuda.empty_cache() after conversion by @bltcn in #3570
- pipeline warmup by @irexyc in #3548
- Launch multiple api servers for dp > 1 by @RunningLeon in #3414
- support awq for Qwen2.5-VL by @RunningLeon in #3559
- support qwen3 /think & /no_think & enable_thinking parameter by @BUJIDAOVS in #3564
- Eplb by @zhaochaoxing in #3572
- Update benchmark by @lvhan028 in #3578
- block output when prefetch next forward inputs. by @grimoire in #3573
- support both eplb and microbatch simultaneously by @zhaochaoxing in #3591
- Add log_file and set loglevel in launch_servers by @RunningLeon in #3596
- sampling on the tokenizer's vocab by @grimoire in #3604
- update deepgemm version by @grimoire in #3606
- [Ascend] set default distributed backend as ray for ascend device by @JackWeiw in #3603
- Blocked fp8 tma by @grimoire in #3470
- [PDDisaggregation] Async migration by @JimyMa in #3610
- move dp loop to model agent by @grimoire in #3598
- update some logs of proxy_server and pt engine by @lvhan028 in #3621
- improve loading model performance by shuffling the weight files by @irexyc in #3625
- add benchmark scripts about pipeline api and inference engines according to the config file by @lvhan028 in #3622
🐞 Bug fixes
- [ascend] fix recompile on different rank by @jinminxi104 in #3513
- fix attention sm86 by @grimoire in #3519
- fix stopwords kv cache by @grimoire in #3494
- [bug fix] fix PD Disaggregation in DSV3 by @JimyMa in #3547
- fix proxy server heart beat by @irexyc in #3543
- fix dp>1 tp=1 ep=1 by @grimoire in #3555
- fix mixtral on new transformers by @grimoire in #3580
- [Fix]: reset step after eviction by @RunningLeon in #3589
- fix parsing dynamic rope param failed by @lvhan028 in #3575
- Fix batch infer for gemma3vl by @RunningLeon in #3592
- Fix symbol error when dlBLAS is not imported by @zhaochaoxing in #3597
- read distributed envs by @grimoire in #3600
- fix side-effect caused by PR 3590 by @lvhan028 in #3608
- fix bug in qwen2 by @LKJacky in #3614
- fix awq kernel by @grimoire in #3618
- fix flash mla interface by @grimoire in #3617
- add sampling_vocab_size by @irexyc in #3607
- fix for default quant by @grimoire in #3640
- Fix log file env in ray worker by @RunningLeon in #3624
- fix qwen3 chat template by @lvhan028 in #3641
- fix vlm runtime quant by @grimoire in #3644
- Fix 'Namespace' object has no attribute 'num_tokens_per_iter' when serving by gradio by @lvhan028 in #3647
- Synchronize weight processing by @lzhangzz in #3649
- Fix zero scale in fp8 quantization by @lzhangzz in #3652
🌐 Other
- update doc for ascend 300I Duo docker image by @jinminxi104 in #3526
- simulate EPLB for benchmark only by @lvhan028 in #3490
- [ci] add test workflow for 3090 machine by @zhulinJulia24 in #3561
- [ci] fix transformers version in prtest by @zhulinJulia24 in #3584
- [Misc] minor api_server and tm loader, and upgrade docformatter to resolve lint error by @lvhan028 in #3590
- [ci] add qwen3 models into testcase by @zhulinJulia24 in #3593
- update Dockerfile by @CUHKSZzxy in #3634
- check in lmdeploy-builder on cuda 12.4 and 12.8 platform by @lvhan028 in #3630
- fix blocked fp8 overflow by @grimoire in #3650
- Bump version to v0.9.0 by @lvhan028 in #3609
New Contributors
- @JimyMa made their first contribution in #3304
- @jingyibo123 made their first contribution in #3550
- @bltcn made their first contribution in #3570
- @BUJIDAOVS made their first contribution in #3564
- @LKJacky made their first contribution in #3614
Full Changelog: v0.8.0...v0.9.0
v0.8.0
What's Changed
🚀 Features
- Torch dp support by @grimoire in #3207
- Add deep gemm with tma pre allocated by @AllentDan in #3287
- Add mixed DP + TP by @lzhangzz in #3229
- Add Qwen3 and Qwen3MoE by @lzhangzz in #3305
- [ascend] support multi nodes on ascend device by @tangzhiyi11 in #3260
- [Feature] support qwen3 and qwen3-moe for pytorch engine by @CUHKSZzxy in #3315
- [ascend]support deepseekv2 by @yao-fengchen in #3206
- add deepep by @zhaochaoxing in #3313
- support ascend w8a8 graph_mode by @yao-fengchen in #3267
- support all2all ep by @zhaochaoxing in #3370
- optimize ep in decoding stage by @zhaochaoxing in #3383
- Warmup deepgemm by @grimoire in #3387
- support Llama4 by @grimoire in #3408
- add twomicrobatch support by @SHshenhao in #3381
- Support phi4 mini by @RunningLeon in #3467
- [Dlinfer][Ascend] support 310P by @JackWeiw in #3484
- support qwen3 fp8 by @CUHKSZzxy in #3505
💥 Improvements
- Add spaces_between_special_tokens to /v1/interactive and make compatible with empty text by @AllentDan in #3283
- add env var to control timeout by @CUHKSZzxy in #3291
- refactor attn param by @irexyc in #3164
- Verbose log by @grimoire in #3329
- optimize mla, remove load `v` by @grimoire in #3334
- support dp decoding with cudagraph by @grimoire in #3311
- optimize quant-fp8 kernel by @grimoire in #3345
- refactor dlinfer rope by @yao-fengchen in #3326
- enable qwenvl2.5 graph mode on ascend by @jinminxi104 in #3367
- Add AIOHTTP_TIMEOUT env var for proxy server by @AllentDan in #3355
- disable sync batch on dp eager mode by @grimoire in #3382
- fix for deepgemm update by @grimoire in #3380
- Add string before hash tokens in blocktrie by @RunningLeon in #3386
- optimize moe get sorted idx by @grimoire in #3356
- use half/bf16 lm_head output by @irexyc in #3213
- remove ep eager check by @grimoire in #3392
- Optimize ascend moe by @yao-fengchen in #3364
- optimize fp8 moe kernel by @grimoire in #3419
- ray async forward execute by @grimoire in #3443
- map internvl3 chat template to builtin chat template internvl2_5 by @lvhan028 in #3450
- Refactor turbomind (low-level abstractions) by @lzhangzz in #3423
- remove barely used code to improve maintenance by @lvhan028 in #3462
- optimize sm80 long context by @grimoire in #3465
- move partial_json_parser from 'serve.txt' to 'runtime.txt' by @lvhan028 in #3493
- support qwen3-dense models awq quantization by @lvhan028 in #3503
- Optimize MoE gate for Qwen3 by @lzhangzz in #3500
- Pass num_tokens_per_iter and max_prefill_iters params through in `lmdeploy serve api_server` by @josephrocca in #3504
- [Dlinfer][Ascend] Optimize performance of 310P device by @JackWeiw in #3486
- optimize longcontext decoding by @grimoire in #3510
- Support min_p in openai completions_v1 by @josephrocca in #3506
🐞 Bug fixes
- fix activation grid oversize by @grimoire in #3282
- Set ensure_ascii=False for tool calling by @AllentDan in #3295
- fix sliding window multi chat by @grimoire in #3302
- add `v` check by @grimoire in #3307
- Fix Qwen3MoE config parsing by @lzhangzz in #3336
- Fix finish reasons by @AllentDan in #3338
- remove think_end_token_id in streaming content by @AllentDan in #3327
- Fix the finish_reason by @AllentDan in #3350
- set cmake policy minimum version as 3.5 by @lvhan028 in #3376
- fix dp cudagraph by @grimoire in #3372
- fix flashmla eagermode by @grimoire in #3375
- close engine after each benchmark-generation iter by @grimoire in #3269
- [Fix] fix `image_token_id` error of qwen2-vl and deepseek by @ao-zz in #3358
- fix stopping criteria by @grimoire in #3384
- support List[dict] prompt input without do_preprocess by @irexyc in #3385
- add rayexecutor release timeout by @grimoire in #3403
- fix tensor dispatch in dynamo by @wanfengcxz in #3417
- fix linting error by upgrade to ubuntu-latest by @lvhan028 in #3442
- fix awq tp for pytorch engine by @RunningLeon in #3435
- fix mllm testcase fail by @caikun-pjlab in #3458
- remove paged attention autotune by @grimoire in #3452
- Remove empty prompts in benchmark scripts by @lvhan028 in #3460
- failed to end session properly by @lvhan028 in #3471
- fix qwen2.5-vl chat template by @CUHKSZzxy in #3475
- Align forward arguments of deepgemm blockedf8 by @RunningLeon in #3474
- fix turbomind lib missing to link nccl by exporting nccl path by @lvhan028 in #3479
- fix dsvl2 no attr config error by @CUHKSZzxy in #3477
- fix flash attention crash on triton3.1.0 by @grimoire in #3478
- Fix disorder of ray execution by @RunningLeon in #3481
- update dockerfile by @CUHKSZzxy in #3482
- fix output logprobs by @irexyc in #3488
- Fix Qwen2MoE shared expert gate by @lzhangzz in #3491
- fix replicate kv for qwen3-moe by @grimoire in #3499
- fix sampling if data overflow after temperature penalty by @irexyc in #3508
📚 Documentations
- update qwen2.5-vl-32b docs by @CUHKSZzxy in #3446
🌐 Other
- bump version to v0.7.2.post1 by @lvhan028 in #3298
- [ci] add think function testcase by @zhulinJulia24 in #3299
- merge dev into main by @lvhan028 in #3348
- [ci] add vl models into pipeline interface testcase by @zhulinJulia24 in #3374
- merge dev to main branch by @lvhan028 in #3378
- opt experts memory and permute by @zhaochaoxing in #3390
- Revert "opt experts memory and permute" by @zhaochaoxing in #3406
- merge dev to main by @lvhan028 in #3400
- add Hopper GPU dockerfile by @CUHKSZzxy in #3415
- optimize internvit by @caikun-pjlab in #3433
- fix stop/bad words by @irexyc in #3492
- [ci] testcase bugfix and add more models into testcase by @zhulinJulia24 in #3463
- bump version to v0.8.0 by @lvhan028 in #3432
New Contributors
- @zhaochaoxing made their first contribution in #3313
- @ao-zz made their first contribution in #3358
- @wanfengcxz made their first contribution in #34...
v0.7.3
What's Changed
🚀 Features
- Add Qwen3 and Qwen3MoE by @lzhangzz in #3305
- [Feature] support qwen3 and qwen3-moe for pytorch engine by @CUHKSZzxy in #3315
- [ascend]support deepseekv2 by @yao-fengchen in #3206
- support ascend w8a8 graph_mode by @yao-fengchen in #3267
- support Llama4 by @grimoire in #3408
💥 Improvements
- Add spaces_between_special_tokens to /v1/interactive and make compatible with empty text by @AllentDan in #3283
- add env var to control timeout by @CUHKSZzxy in #3291
- optimize mla, remove load `v` by @grimoire in #3334
- refactor dlinfer rope by @yao-fengchen in #3326
- enable qwenvl2.5 graph mode on ascend by @jinminxi104 in #3367
- Optimize ascend moe by @yao-fengchen in #3364
- find port by @grimoire in #3429
🐞 Bug fixes
- fix activation grid oversize by @grimoire in #3282
- Set ensure_ascii=False for tool calling by @AllentDan in #3295
- add `v` check by @grimoire in #3307
- Fix Qwen3MoE config parsing by @lzhangzz in #3336
- Fix finish reasons by @AllentDan in #3338
- remove think_end_token_id in streaming content by @AllentDan in #3327
- Fix the finish_reason by @AllentDan in #3350
- support List[dict] prompt input without do_preprocess by @irexyc in #3385
- fix tensor dispatch in dynamo by @wanfengcxz in #3417
📚 Documentations
- update ascend doc by @yao-fengchen in #3420
🌐 Other
- bump version to v0.7.2.post1 by @lvhan028 in #3298
- Optimize internvit by @caikun-pjlab in #3316
- bump version to v0.7.3 by @lvhan028 in #3416
New Contributors
- @wanfengcxz made their first contribution in #3417
- @caikun-pjlab made their first contribution in #3316
Full Changelog: v0.7.2...v0.7.3
v0.7.2.post1
What's Changed
💥 Improvements
- Add spaces_between_special_tokens to /v1/interactive and make compatible with empty text by @AllentDan in #3283
- add env var to control timeout by @CUHKSZzxy in #3291
🐞 Bug fixes
- fix activation grid oversize by @grimoire in #3282
- Set ensure_ascii=False for tool calling by @AllentDan in #3295
🌐 Other
Full Changelog: v0.7.2...v0.7.2.post1
v0.7.2
What's Changed
🚀 Features
- [Feature] support qwen2.5-vl for pytorch engine by @CUHKSZzxy in #3194
- Support reward models by @lvhan028 in #3192
- Add collective communication kernels by @lzhangzz in #3163
- PytorchEngine multi-node support v2 by @grimoire in #3147
- Add flash mla by @AllentDan in #3218
- Add gemma3 implementation by @AllentDan in #3272
💥 Improvements
- remove update badwords by @grimoire in #3183
- default executor ray by @grimoire in #3210
- change ascend&camb default_batch_size to 256 by @jinminxi104 in #3251
- Tool reasoning parsers and streaming function call by @AllentDan in #3198
- remove torchelastic flag by @grimoire in #3242
- disable flashmla warning on sm<90 by @grimoire in #3271
🐞 Bug fixes
- Fix missing cli chat option by @lzhangzz in #3209
- [ascend] fix multi-card distributed inference failures by @tangzhiyi11 in #3215
- fix for small cache-max-entry-count by @grimoire in #3221
- [dlinfer] fix glm-4v graph mode on ascend by @jinminxi104 in #3235
- fix qwen2.5 pytorch engine dtype error on NPU by @tcye in #3247
- [Fix] failed to update the tokenizer's eos_token_id into stop_word list by @lvhan028 in #3257
- fix dsv3 gate scaling by @grimoire in #3263
- Fix the bug for reading dict error by @GxjGit in #3196
- Fix get ppl by @lvhan028 in #3268
📚 Documentations
- Specify lmdeploy version in benchmark guide by @lyj0309 in #3216
- [ascend] add Ascend docker image by @jinminxi104 in #3239
🌐 Other
- [ci] testcase refactoring by @zhulinJulia24 in #3151
- [ci] add testcase for native communicator by @zhulinJulia24 in #3217
- [ci] add volc evaluation testcase by @zhulinJulia24 in #3240
- [ci] remove v100 testconfig by @zhulinJulia24 in #3253
- add rdma dependencies into docker file by @CUHKSZzxy in #3262
- docs: update ascend docs for docker running by @CyCle1024 in #3266
- bump version to v0.7.2 by @lvhan028 in #3252
New Contributors
Full Changelog: v0.7.1...v0.7.2
v0.7.1
What's Changed
🚀 Features
- support release pipeline by @irexyc in #3069
- [feature] add dlinfer w8a8 support. by @Reinerzhou in #2988
- [maca] support deepseekv2 for maca backend. by @Reinerzhou in #2918
- [Feature] support deepseek-vl2 for pytorch engine by @CUHKSZzxy in #3149
💥 Improvements
- use weights iterator while loading by @RunningLeon in #2886
- Add deepseek-r1 chat template by @AllentDan in #3072
- Update tokenizer by @lvhan028 in #3061
- Set max concurrent requests by @AllentDan in #2961
- remove logitswarper by @grimoire in #3109
- Update benchmark script and user guide by @lvhan028 in #3110
- support eos_token list in turbomind by @irexyc in #3044
- Use aiohttp inside proxy server && add --disable-cache-status argument by @AllentDan in #3020
- Update runtime package dependencies by @zgjja in #3142
- Make turbomind support embedding inputs on GPU by @chengyuma in #3177
🐞 Bug fixes
- [dlinfer] fix ascend qwen2_vl graph_mode by @yao-fengchen in #3045
- fix error in interactive api by @lvhan028 in #3074
- fix sliding window mgr by @grimoire in #3068
- More arguments in api_client, update docstrings by @AllentDan in #3077
- Add system role to deepseek chat template by @AllentDan in #3031
- Fix xcomposer2d5 by @irexyc in #3087
- fix user guide about cogvlm deployment by @lvhan028 in #3088
- fix positional argument by @lvhan028 in #3086
- Fix UT of deepseek chat template by @lvhan028 in #3125
- Fix internvl2.5 error after eviction by @grimoire in #3122
- Fix cogvlm and phi3vision by @RunningLeon in #3137
- [fix] fix vl gradio, use pipeline api and remove interactive chat by @irexyc in #3136
- fix the issue that stop_token may be less than defined in model.py by @irexyc in #3148
- fix typing by @lz1998 in #3153
- fix min length penalty by @irexyc in #3150
- fix default temperature value by @irexyc in #3166
- Use pad_token_id as image_token_id for vl models by @RunningLeon in #3158
- Fix tool call prompt for InternLM and Qwen by @AllentDan in #3156
- Update qwen2.py by @GxjGit in #3174
- fix temperature=0 by @grimoire in #3176
- fix blocked fp8 moe by @grimoire in #3181
- fix deepseekv2 has no attribute use_mla error by @CUHKSZzxy in #3188
- fix unstoppable chat by @lvhan028 in #3189
🌐 Other
- [ci] add internlm3 into testcase by @zhulinJulia24 in #3038
- add internlm3 to supported models by @lvhan028 in #3041
- update pre-commit config by @lvhan028 in #2683
- [maca] add cudagraph support on maca backend. by @Reinerzhou in #2834
- bump version to v0.7.0.post1 by @lvhan028 in #3076
- bump version to v0.7.0.post2 by @lvhan028 in #3094
- [Fix] fix the URL judgment problem in Windows by @Lychee-acaca in #3103
- bump version to v0.7.0.post3 by @lvhan028 in #3115
- [ci] fix some fail in daily testcase by @zhulinJulia24 in #3134
- Bump version to v0.7.1 by @lvhan028 in #3178
New Contributors
- @Lychee-acaca made their first contribution in #3103
- @lz1998 made their first contribution in #3153
- @GxjGit made their first contribution in #3174
- @chengyuma made their first contribution in #3177
- @CUHKSZzxy made their first contribution in #3149
Full Changelog: v0.7.0...v0.7.1
v0.7.0.post3
What's Changed
💥 Improvements
- Set max concurrent requests by @AllentDan in #2961
- remove logitswarper by @grimoire in #3109
🐞 Bug fixes
- fix user guide about cogvlm deployment by @lvhan028 in #3088
- fix positional argument by @lvhan028 in #3086
🌐 Other
- [Fix] fix the URL judgment problem in Windows by @Lychee-acaca in #3103
- bump version to v0.7.0.post3 by @lvhan028 in #3115
New Contributors
- @Lychee-acaca made their first contribution in #3103
Full Changelog: v0.7.0.post2...v0.7.0.post3
LMDeploy Release V0.7.0.post2
What's Changed
💥 Improvements
- Add deepseek-r1 chat template by @AllentDan in #3072
- Update tokenizer by @lvhan028 in #3061
🐞 Bug fixes
- Add system role to deepseek chat template by @AllentDan in #3031
- Fix xcomposer2d5 by @irexyc in #3087
🌐 Other
Full Changelog: v0.7.0.post1...v0.7.0.post2
LMDeploy Release V0.7.0.post1
What's Changed
💥 Improvements
- use weights iterator while loading by @RunningLeon in #2886
🐞 Bug fixes
- [dlinfer] fix ascend qwen2_vl graph_mode by @yao-fengchen in #3045
- fix error in interactive api by @lvhan028 in #3074
- fix sliding window mgr by @grimoire in #3068
- More arguments in api_client, update docstrings by @AllentDan in #3077
🌐 Other
- [ci] add internlm3 into testcase by @zhulinJulia24 in #3038
- add internlm3 to supported models by @lvhan028 in #3041
- update pre-commit config by @lvhan028 in #2683
- [maca] add cudagraph support on maca backend. by @Reinerzhou in #2834
- bump version to v0.7.0.post1 by @lvhan028 in #3076
Full Changelog: v0.7.0...v0.7.0.post1
LMDeploy Release v0.7.0
What's Changed
🚀 Features
- Support moe w8a8 in pytorch engine by @grimoire in #2894
- Support DeepseekV3 fp8 by @grimoire in #2967
- support new backend cambricon by @JackWeiw in #3002
- support-moe-fp8 by @RunningLeon in #3007
- add internlm3-dense(turbomind) & chat template by @irexyc in #3024
- support internlm3 on pt by @RunningLeon in #3026
- Support internlm3 quantization by @AllentDan in #3027
💥 Improvements
- Optimize awq kernel in pytorch engine by @grimoire in #2965
- Support fp8 w8a8 for pt backend by @RunningLeon in #2959
- Optimize lora kernel by @grimoire in #2975
- Remove threadsafe by @grimoire in #2907
- Refactor async engine & turbomind IO by @lzhangzz in #2968
- [dlinfer]rope refine by @JackWeiw in #2984
- Expose spaces_between_special_tokens by @AllentDan in #2991
- [dlinfer]change llm op interface of paged_prefill_attention. by @JackWeiw in #2977
- Update request logger by @lvhan028 in #2981
- remove decoding by @grimoire in #3016
🐞 Bug fixes
- Fix build crash in nvcr.io/nvidia/pytorch:24.06-py3 image by @zgjja in #2964
- add tool role in BaseChatTemplate as tool response in messages by @AllentDan in #2979
- Fix ascend dockerfile by @jinminxi104 in #2989
- fix internvl2 qk norm by @grimoire in #2987
- fix xcomposer2 when transformers is upgraded greater than 4.46 by @irexyc in #3001
- Fix get_ppl & get_logits by @lvhan028 in #3008
- Fix typo in w4a16 guide by @Yan-Xiangjun in #3018
- fix blocked fp8 moe kernel by @grimoire in #3009
- Fix async engine by @lzhangzz in #3029
- [hotfix] Fix get_ppl by @lvhan028 in #3023
- Fix MoE gating for DeepSeek V2 by @lzhangzz in #3030
- Fix empty response for pipeline by @lzhangzz in #3034
- Fix potential hang during TP model initialization by @lzhangzz in #3033
🌐 Other
- [ci] add w8a8 and internvl2.5 models into testcase by @zhulinJulia24 in #2949
- bump version to v0.7.0 by @lvhan028 in #3010
New Contributors
- @zgjja made their first contribution in #2964
- @Yan-Xiangjun made their first contribution in #3018
Full Changelog: 0.6.5...v0.7.0