
Abnormal answers when loading a locally registered ChatGLM3-6B model #1463

Open
LASTLINEK opened this issue May 9, 2024 · 0 comments
Labels
gpu question Further information is requested
Comments

@LASTLINEK

I load the ChatGLM3-6B model with no quantization and no LoRA.

The answers look like this:

[screenshots of the abnormal answers]

These answers are clearly problematic:

  • The model claims to be an "emotional-support assistant", even though no prompt told it to play that role.
  • The output contains garbled characters.
  • GPU memory usage is 22 GB (this is a 6B model, on a card with 24 GB of VRAM in total). As I understand it, this may be caused by the gpu_memory_utilization=0.9 parameter.
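A quick sanity check on the numbers (this assumes the model is being served by the vLLM backend, whose documented behavior is to pre-allocate a fixed fraction of total VRAM, set by gpu_memory_utilization, for the weights plus the KV-cache pool, regardless of model size):

```python
# vLLM reserves gpu_memory_utilization * total VRAM up front,
# so high memory usage alone does not indicate a leak.
total_vram_gb = 24            # the card described in this issue
gpu_memory_utilization = 0.9  # vLLM's default

reserved_gb = total_vram_gb * gpu_memory_utilization
print(reserved_gb)  # 21.6 -- consistent with the ~22 GB observed
```

So the 22 GB figure by itself is expected if vLLM is in use; whether vLLM is also responsible for the garbled answers is a separate question.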

But!!

  • When the model is loaded with 8-bit quantization, memory usage is about 7 GB, and the answers are normal.
  • When a LoRA module is added and quantization is set to none, memory usage is about 13 GB, and the answers are normal.

A normal answer looks like this:

[screenshot of a normal answer]

Why

I would like to know: does the gpu_memory_utilization=0.9 parameter only take effect when no LoRA module is added and no quantization is used? Is this parameter related to vLLM? And what is causing the abnormal answers? Here is part of the log:

[screenshot of the partial log]

Thanks to everyone who answers!
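For anyone who wants to test whether the pre-allocation is the memory culprit: gpu_memory_utilization is a real vLLM engine argument (default 0.9) and can be lowered. A minimal sketch, with the engine construction commented out since it requires a GPU and an installed vLLM; the model name is illustrative:

```python
# Engine kwargs mirroring vllm.LLM's signature; gpu_memory_utilization
# controls the fraction of VRAM vLLM pre-allocates for weights + KV cache.
engine_kwargs = {
    "model": "THUDM/chatglm3-6b",
    "trust_remote_code": True,
    "gpu_memory_utilization": 0.5,  # reserve 50% of VRAM instead of the 0.9 default
}
# from vllm import LLM
# llm = LLM(**engine_kwargs)  # requires a GPU; shown for illustration only
print(engine_kwargs["gpu_memory_utilization"])
```

Note that lowering this only changes how much memory is reserved; it would not by itself explain the garbled output in the non-quantized, non-LoRA path.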

@LASTLINEK LASTLINEK added the question Further information is requested label May 9, 2024
@XprobeBot XprobeBot added the gpu label May 9, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.0, v0.11.1 May 9, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.1, v0.11.2 May 17, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.2, v0.11.3 May 24, 2024