
[QA] Running the InternLM2-chat-7B-4bits quantization with bitsandbytes on Windows 11 drives the model incoherent #680

Closed
Sawtone opened this issue Jan 31, 2024 · 19 comments
Labels: question (Further information is requested), Stale

Comments

Sawtone commented Jan 31, 2024

Describe the question.

[screenshots]
As shown in the screenshots, the model ignores the user's input and instead outputs what looks like Q&A content from a knowledge base; after printing that Q&A once, it keeps repeating its earlier output until generation terminates on its own.

Sawtone added the question label on Jan 31, 2024
Sawtone (Author) commented Jan 31, 2024

Python: 3.10.13
CUDA: 11.7
OS: Windows 11

Sawtone (Author) commented Jan 31, 2024

[screenshot]

Sawtone (Author) commented Jan 31, 2024

[screenshot]

ZwwWayne (Collaborator) commented:

It looks like generation isn't stopping. What is your transformers version?

Sawtone (Author) commented Jan 31, 2024

> It looks like generation isn't stopping. What is your transformers version?

4.37.2. Thanks for your help!

ZwwWayne (Collaborator) commented Feb 1, 2024

The cause is likely that the minimum value of the max_length setting in your script is 32, while the model's ordinary replies never reach a length of 32, so the model never stops. You can lower that minimum in the web_demo script: https://github.com/InternLM/InternLM/blob/main/chat/web_demo.py#L194
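
For reference, a hedged sketch of the kind of edit being suggested; the slider name, bounds, and defaults below are assumptions, so check the actual line at chat/web_demo.py#L194 in your checkout:

import streamlit as st

# Assumed shape of the demo's max-length slider. Lowering min_value lets
# short replies terminate naturally instead of being forced toward 32 tokens.
max_length = st.slider('Max Length', min_value=8, max_value=32768, value=2048)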

With the same transformers version and load_in_4bit, the CLI script does not hit this problem. Example code below:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "internlm/internlm2-chat-7b/"
print('initializing model')
# Load the checkpoint in 4-bit via bitsandbytes.
model = AutoModelForCausalLM.from_pretrained(
    model_path, device_map='auto', trust_remote_code=True, load_in_4bit=True)
print('initialized model')
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
print('initialized tokenizer')
model = model.eval()

# Two-turn chat; each reply prints in full and stops normally.
response, history = model.chat(tokenizer, "你是谁", history=[])
print(response, flush=True, end="\n")
response, history = model.chat(tokenizer, "你好", history=history)
print(response, flush=True, end="\n")
print(history)

The output is as follows:
[screenshot]

Sawtone (Author) commented Feb 1, 2024

Understood, but shouldn't max_length be the maximum length of the model's reply? I looked through the interface.py in transformers and it doesn't seem to have a min_length option (screenshot 1). I also tried questions whose answers run longer than 32: the model answers them correctly, but at the point where it should stop it keeps answering and never stops. I suspect this is very similar to the non-stopping problem others have reported in the issues (screenshot 2). Thanks for your help!
[screenshot 1]

You can see the imend and bot markers:
[screenshot 2]
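
For context, a minimal sketch of how stopping is normally controlled in plain transformers generation, reusing the model and tokenizer from the snippet above; this is not InternLM's interface.py, and the chat template handling is omitted:

# Hedged sketch: in vanilla transformers, stopping is driven by
# eos_token_id and max_new_tokens rather than a floor on max_length.
inputs = tokenizer("你是谁", return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=512,                   # hard cap on the reply length
    eos_token_id=tokenizer.eos_token_id,  # generation halts at this token
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))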

ZwwWayne (Collaborator) commented Feb 1, 2024

It may also be related to some of the chat hyperparameters; the odd part is that the earlier case did not reproduce on the command line.
You could also check that the HF code has been fully updated.
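
As an illustration of the hyperparameters in question, a hedged sketch of an explicit model.chat call; the keyword names below are assumptions based on common chat() implementations and should be verified against the model's remote modeling code:

# Hypothetical keyword arguments; verify against internlm2's remote code.
response, history = model.chat(
    tokenizer,
    "你好",
    history=[],
    do_sample=True,    # toggle sampling
    temperature=0.8,   # lower values reduce rambling/looping
    top_p=0.8,
)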

ZwwWayne (Collaborator) commented Feb 1, 2024

[screenshot] Using https://github.com/InternLM/InternLM/blob/main/chat/web_demo.py I could not reproduce it; please check whether the latest script in this project still shows the problem.

Sawtone (Author) commented Feb 1, 2024

It was probably a parameter issue; after switching to the new demo the problem no longer appears, thanks! One note: the avatar path in the new demo differs from the old one, which may be worth mentioning if anyone else asks. I'd also like to ask whether Lagent + InternLM2 can read the content of a link. I'm working on a project that needs the model to browse page content (e.g. CSDN); if that's feasible I'll keep building on it. (Eventually I'd also like image understanding. I know the dual-GPU XComposer can do that, but my machine can't run it; once link reading works I'll try XComposer on InternLM Studio's cloud compute.)

ZwwWayne (Collaborator) commented Feb 2, 2024

We haven't tried direct web browsing yet; for reading images, try xcomposer.

Sawtone (Author) commented Feb 2, 2024

> We haven't tried direct web browsing yet; for reading images, try xcomposer.

OK, I'll give it a try with Lagent later. Thanks!

XARKUR commented Feb 3, 2024

I run into a similar problem with langchain-chatchat: the answers to the first one or two questions are normal, and then it starts looping. I don't know how to fix it.

This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 7 days if the stale label is not removed or if there is no further response.

github-actions bot added the Stale label on Feb 11, 2024
Sawtone (Author) commented Feb 15, 2024

I'm back! I dropped streamlit in favor of Flask and plan to connect it to a Node.js backend over POST. Testing with Postman, the repeated-output behavior is back. The code is based on the new web_demo.py that worked last time; I didn't dare change the parts I don't understand. Details attached below:

Sawtone (Author) commented Feb 15, 2024

The code is here, uploaded to my profile for easier viewing: https://github.com/Scoodtone/InternLM_Try/blob/main/web_demo_razor.py
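
For readers who cannot open the link, a minimal sketch of the pattern described above, with a hypothetical route and JSON field names; the actual web_demo_razor.py may differ:

# Hypothetical minimal Flask wrapper around model.chat (model and tokenizer
# loaded as in the earlier snippet). Route name and JSON schema are invented.
from flask import Flask, request, jsonify

app = Flask(__name__)
history = []  # naive global history; suitable for a single-user demo only

@app.route('/chat', methods=['POST'])
def chat():
    global history
    query = request.get_json()['query']
    response, history = model.chat(tokenizer, query, history=history)
    return jsonify({'response': response})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)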

Sawtone (Author) commented Feb 15, 2024

The request sent from Postman and the reply received:
[screenshot]

github-actions bot removed the Stale label on Feb 16, 2024

This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 7 days if the stale label is not removed or if there is no further response.

github-actions bot added the Stale label on Feb 24, 2024

github-actions bot commented Mar 3, 2024

This issue is closed because it has been stale for 7 days. Please open a new issue if you have similar issues or you have any new updates now.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) on Mar 3, 2024