[WIP] Support InternLM on 3rd-party inference toolboxes #136

wangruohui · 2023-07-18T13:19:56Z

This issue tracks progress on supporting InternLM in 3rd-party inference toolboxes.

vLLM

Inference with single GPU
- There seems to be a bug; I am not sure whether it comes from my implementation or from upstream
Tensor parallel

InternLM-7B is supported in DeepSpeed inference, and the change has been merged into the main branch: microsoft/DeepSpeed#4137

Meta tensor for faster model loading: watching microsoft/DeepSpeed#3608

wangruohui mentioned this issue Jul 19, 2023

[Fix] Support DeepSpeed on autoTP and kernel injection #138

Merged

lvhan028 closed this as completed Sep 1, 2023