
support OpenGVLab/InternVL-Chat-V1-5 #1490

Merged: 6 commits into InternLM:main on Apr 29, 2024
Conversation

irexyc (Collaborator) commented Apr 24, 2024

lvhan028 added the enhancement (New feature or request) label on Apr 25, 2024
lvhan028 requested a review from AllentDan on Apr 26, 2024
Review comment on lmdeploy/vl/model/internvl.py, lines +101 to +102 (outdated):
MEAN = (123.675, 116.28, 103.53)
STD = (58.395, 57.12, 57.375)
A collaborator commented:

Can these two constants be inferred from the InternVL code?

A collaborator replied:

Let's try not to infer anything from the upstream repo's code. We'd better keep them as independent as possible.

irexyc (Collaborator, Author) replied on Apr 28, 2024:

The mean and std are not in the repo itself but in the example code.
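
For reference, these constants are the standard ImageNet normalization statistics scaled to the 0-255 pixel range: (0.485, 0.456, 0.406) and (0.229, 0.224, 0.225) multiplied by 255. A minimal sketch of how such constants are typically applied with torchvision, not the PR's exact code:

import torchvision.transforms as T

MEAN = (123.675, 116.28, 103.53)  # (0.485, 0.456, 0.406) * 255
STD = (58.395, 57.12, 57.375)     # (0.229, 0.224, 0.225) * 255

transform = T.Compose([
    T.ToTensor(),  # HWC uint8 -> CHW float in [0, 1]
    # ToTensor already rescales to [0, 1], so divide the constants by 255
    T.Normalize(mean=[m / 255 for m in MEAN], std=[s / 255 for s in STD]),
])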

LRHstudy commented Apr 26, 2024

Running this code raises an error when the input is a 4-channel image:

File "/opt/py38/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
  return func(*args, **kwargs)
File "/opt/lmdeploy/lmdeploy/vl/model/internvl.py", line 151, in forward
  return self._forward_func(images)
File "/opt/lmdeploy/lmdeploy/vl/model/internvl.py", line 132, in _forward_v1_5
  outputs = self.transform(outputs)
File "/opt/py38/lib/python3.8/site-packages/torchvision/transforms/transforms.py", line 95, in __call__
  img = t(img)
File "/opt/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
  return self._call_impl(*args, **kwargs)
File "/opt/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
  return forward_call(*args, **kwargs)
File "/opt/py38/lib/python3.8/site-packages/torchvision/transforms/transforms.py", line 277, in forward
  return F.normalize(tensor, self.mean, self.std, self.inplace)
File "/opt/py38/lib/python3.8/site-packages/torchvision/transforms/functional.py", line 363, in normalize
  return F_t.normalize(tensor, mean=mean, std=std, inplace=inplace)
File "/opt/py38/lib/python3.8/site-packages/torchvision/transforms/functional_tensor.py", line 928, in normalize
  return tensor.sub_(mean).div_(std)
File "/opt/py38/lib/python3.8/site-packages/torch/utils/_device.py", line 77, in __torch_function__
  return func(*args, **kwargs)
RuntimeError: The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 1

The missing code:
image = image.convert('RGB') if image.mode != 'RGB' else image
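
A minimal sketch of where such a conversion would sit in an image-loading path (the helper name is hypothetical, not the actual patch):

from PIL import Image

def load_rgb(path: str) -> Image.Image:
    # Hypothetical helper: Normalize uses length-3 mean/std, so the tensor
    # must have exactly 3 channels; force RGBA/L/P images down to RGB.
    image = Image.open(path)
    return image.convert('RGB') if image.mode != 'RGB' else image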

irexyc (Collaborator, Author) commented Apr 26, 2024

@LRHstudy Fixed.

lvhan028 (Collaborator) commented Apr 28, 2024

  • vl pipeline (tp 1, 2)
  • api server
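
For context, exercising the vl pipeline with tensor parallelism looks roughly like this (a minimal sketch; the image path is a placeholder, and defaults may differ across lmdeploy versions):

from lmdeploy import pipeline, TurbomindEngineConfig
from lmdeploy.vl import load_image

pipe = pipeline('OpenGVLab/InternVL-Chat-V1-5',
                backend_config=TurbomindEngineConfig(tp=2))
image = load_image('path/to/image.jpg')  # placeholder path
response = pipe(('describe this image', image))
print(response.text)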

AllentDan (Collaborator) left a comment:

LGTM

lvhan028 merged commit b22366b into InternLM:main on Apr 29, 2024
5 checks passed
lijing1996 commented:

Why is the memory on each H800 card still fully occupied when I use tp? And why can the batch size only be the same as with a single card?

irexyc (Collaborator, Author) commented Apr 30, 2024

@lijing1996

With multi-GPU tp, it is Tensor Parallel: every card computes a part of the model, regardless of your batch size.

To control memory usage, some ways to reduce it are mentioned here:
#1173 (comment)

With tp > 1, the LLM model uses the same amount of memory on every card. The vision model currently lives only on card 0, which makes memory utilization on the other cards lower. We are working on this and will later distribute the vision model evenly across all the cards.
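
The main knob discussed in threads like #1173 is the kv-cache ratio. A minimal sketch of lowering it (the 0.5 value is only illustrative, and the default has varied across lmdeploy versions):

from lmdeploy import pipeline, TurbomindEngineConfig

# cache_max_entry_count caps the fraction of free GPU memory
# reserved for the kv cache after the weights are loaded
engine_config = TurbomindEngineConfig(tp=8, cache_max_entry_count=0.5)
pipe = pipeline('OpenGVLab/InternVL-Chat-V1-5', backend_config=engine_config)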

lijing1996 replied, quoting the comment above:

Here is what I see: with tp=8 and tp=1, for the same batch size, card 0's memory usage is identical, and cards 1-7 also occupy a lot of memory, though less than card 0. Is something not configured correctly?

irexyc (Collaborator, Author) commented Apr 30, 2024

What you are seeing is fine; it matches the logic.

The current memory-allocation logic is: first build the vision model on card 0, then load the LLM weights onto the cards, compute the smallest remaining free memory across all cards, and allocate kv-cache memory as a percentage of that.

Whether tp is 8 or 1 does not affect how much free memory card 0 has left, so the memory usage there is the same in both cases. But with tp=8, since the vision model currently sits only on card 0, cards 1-7 use somewhat less memory.
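
That order can be sketched as pseudocode (all helper names here are hypothetical, not lmdeploy internals):

import torch

def allocate_memory(build_vision_model, load_llm_shards, cache_ratio: float):
    n = torch.cuda.device_count()
    # 1. Build the vision model on card 0 only.
    vision = build_vision_model(device='cuda:0')
    # 2. Load the tensor-parallel LLM weight shards on every card.
    llm = load_llm_shards(devices=[f'cuda:{i}' for i in range(n)])
    # 3. Find the smallest remaining free memory across all cards.
    free_min = min(torch.cuda.mem_get_info(i)[0] for i in range(n))
    # 4. Reserve the same fraction of that minimum on each card, so every
    #    card ends up with an identically sized kv cache.
    kv_cache_bytes = int(free_min * cache_ratio)
    return vision, llm, kv_cache_bytes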

Labels: enhancement (New feature or request)
5 participants