
Abnormal answers when loading a locally registered ChatGLM3-6B model #1463

Open
LASTLINEK opened this issue May 9, 2024 · 0 comments
Labels
gpu question Further information is requested
Comments

@LASTLINEK

I load the ChatGLM3-6B model with no quantization and no LoRA.

The answers look like this:

[screenshots of the abnormal answers]

These answers are clearly problematic:

  • The model claims to be an "emotional-support assistant", even though no prompt told it to play that role.
  • The output contains garbled characters.
  • GPU memory usage is 22 GB (this is a 6B model, on a card with 24 GB of VRAM in total). As I understand it, this may be caused by the gpu_memory_utilization=0.9 parameter.
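A quick sanity check on the numbers (this assumes the model is being served by the vLLM backend, whose documented behavior is to pre-allocate a fixed fraction of total VRAM, set by gpu_memory_utilization, for the weights plus the KV-cache pool, regardless of model size):

```python
# vLLM reserves gpu_memory_utilization * total VRAM up front,
# so high memory usage alone does not indicate a leak.
total_vram_gb = 24            # the card described in this issue
gpu_memory_utilization = 0.9  # vLLM's default

reserved_gb = total_vram_gb * gpu_memory_utilization
print(reserved_gb)  # 21.6 -- consistent with the ~22 GB observed
```

So the 22 GB figure by itself is expected if vLLM is in use; whether vLLM is also responsible for the garbled answers is a separate question.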

But!!

  • When the model is loaded with 8-bit quantization, memory usage is about 7 GB, and the answers are normal.
  • When a LoRA module is added and quantization is set to none, memory usage is about 13 GB, and the answers are normal.

A normal answer looks like this:

[screenshot of a normal answer]

Why

I would like to know: does the gpu_memory_utilization=0.9 parameter only take effect when no LoRA module is added and no quantization is used? Is this parameter related to vLLM? And what is causing the abnormal answers? Here is part of the log:

[screenshot of the partial log]

Thanks to everyone who answers!
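For anyone who wants to test whether the pre-allocation is the memory culprit: gpu_memory_utilization is a real vLLM engine argument (default 0.9) and can be lowered. A minimal sketch, with the engine construction commented out since it requires a GPU and an installed vLLM; the model name is illustrative:

```python
# Engine kwargs mirroring vllm.LLM's signature; gpu_memory_utilization
# controls the fraction of VRAM vLLM pre-allocates for weights + KV cache.
engine_kwargs = {
    "model": "THUDM/chatglm3-6b",
    "trust_remote_code": True,
    "gpu_memory_utilization": 0.5,  # reserve 50% of VRAM instead of the 0.9 default
}
# from vllm import LLM
# llm = LLM(**engine_kwargs)  # requires a GPU; shown for illustration only
print(engine_kwargs["gpu_memory_utilization"])
```

Note that lowering this only changes how much memory is reserved; it would not by itself explain the garbled output in the non-quantized, non-LoRA path.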

@LASTLINEK LASTLINEK added the question Further information is requested label May 9, 2024
@XprobeBot XprobeBot added the gpu label May 9, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.0, v0.11.1 May 9, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.1, v0.11.2 May 17, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.2, v0.11.3 May 24, 2024