When using AutoModelForCausalLM, THUDM/cogagent-vqa-hf and load_in_8bit I get this error : self and mat2 must have the same dtype, but got Half and Char #28856

FurkanGozukara · 2024-02-04T12:18:41Z

System Info

Microsoft Windows [Version 10.0.19045.3996]
(c) Microsoft Corporation. All rights reserved.

G:\temp Local install\CogVLM\venv\Scripts>activate

(venv) G:\temp Local install\CogVLM\venv\Scripts>pip freeze
accelerate==0.26.1
aiofiles==23.2.1
aiohttp==3.9.3
aiosignal==1.3.1
altair==5.2.0
annotated-types==0.6.0
anyio==4.2.0
anykeystore==0.2
apex==0.9.10.dev0
async-timeout==4.0.3
attrs==23.2.0
bitsandbytes @ https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.1-py3-none-win_amd64.whl
blinker==1.7.0
blis==0.7.11
boto3==1.34.34
botocore==1.34.34
braceexpand==0.1.7
cachetools==5.3.2
catalogue==2.0.10
certifi==2022.12.7
charset-normalizer==2.1.1
click==8.1.7
cloudpathlib==0.16.0
colorama==0.4.6
confection==0.1.4
contourpy==1.2.0
cpm-kernels==1.0.11
cryptacular==1.6.2
cycler==0.12.1
cymem==2.0.8
datasets==2.16.1
deepspeed @ https://huggingface.co/MonsterMMORPG/SECourses/resolve/main/deepspeed-0.11.2_cuda121-cp310-cp310-win_amd64.whl
defusedxml==0.7.1
dill==0.3.7
einops==0.7.0
en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl
exceptiongroup==1.2.0
fastapi==0.109.1
ffmpy==0.3.1
filelock==3.9.0
fonttools==4.47.2
frozenlist==1.4.1
fsspec==2023.10.0
gitdb==4.0.11
GitPython==3.1.41
gradio==4.16.0
gradio_client==0.8.1
greenlet==3.0.3
h11==0.14.0
hjson==3.1.0
httpcore==1.0.2
httpx==0.26.0
huggingface-hub==0.20.3
hupper==1.12.1
idna==3.4
importlib-metadata==7.0.1
importlib-resources==6.1.1
Jinja2==3.1.2
jmespath==1.0.1
jsonlines==4.0.0
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
kiwisolver==1.4.5
langcodes==3.3.0
loguru==0.7.2
markdown-it-py==3.0.0
MarkupSafe==2.1.3
matplotlib==3.8.2
mdurl==0.1.2
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.15
murmurhash==1.0.10
networkx==3.2.1
ninja==1.11.1.1
numpy==1.26.3
oauthlib==3.2.2
orjson==3.9.13
packaging==23.2
pandas==2.2.0
PasteDeploy==3.1.0
pbkdf2==1.3
pillow==10.2.0
plaster==1.1.2
plaster-pastedeploy==1.0.1
preshed==3.0.9
protobuf==4.25.2
psutil==5.9.8
py-cpuinfo==9.0.0
pyarrow==15.0.0
pyarrow-hotfix==0.6
pydantic==2.6.0
pydantic_core==2.16.1
pydeck==0.8.1b0
pydub==0.25.1
Pygments==2.17.2
pynvml==11.5.0
pyparsing==3.1.1
pyramid==2.0.2
pyramid-mailer==0.15.1
python-dateutil==2.8.2
python-multipart==0.0.7
python3-openid==3.2.0
pytz==2024.1
PyYAML==6.0.1
referencing==0.33.0
regex==2023.12.25
repoze.sendmail==4.4.1
requests==2.28.1
requests-oauthlib==1.3.1
rich==13.7.0
rpds-py==0.17.1
ruff==0.2.0
s3transfer==0.10.0
safetensors==0.4.2
scipy==1.12.0
seaborn==0.13.2
semantic-version==2.10.0
sentencepiece==0.1.99
shellingham==1.5.4
six==1.16.0
smart-open==6.4.0
smmap==5.0.1
sniffio==1.3.0
spacy==3.7.2
spacy-legacy==3.0.12
spacy-loggers==1.0.5
SQLAlchemy==2.0.25
srsly==2.4.8
starlette==0.35.1
streamlit==1.31.0
SwissArmyTransformer==0.4.11
sympy==1.12
tenacity==8.2.3
tensorboardX==2.6.2.2
thinc==8.2.2
timm==0.9.12
tokenizers==0.15.1
toml==0.10.2
tomlkit==0.12.0
toolz==0.12.1
torch==2.2.0+cu121
torchaudio==2.2.0+cu121
torchvision==0.17.0+cu121
tornado==6.4
tqdm==4.66.1
transaction==4.0
transformers==4.37.2
translationstring==1.4
triton @ https://huggingface.co/MonsterMMORPG/SECourses/resolve/main/triton-2.1.0-cp310-cp310-win_amd64.whl
typer==0.9.0
typing_extensions==4.8.0
tzdata==2023.4
tzlocal==5.2
urllib3==1.26.13
uvicorn==0.27.0.post1
validators==0.22.0
velruse==1.1.1
venusian==3.1.0
wasabi==1.1.2
watchdog==3.0.0
weasel==0.3.4
webdataset==0.2.86
WebOb==1.8.7
websockets==11.0.3
win32-setctime==1.1.0
WTForms==3.1.2
wtforms-recaptcha==0.3.2
xformers==0.0.24
xxhash==3.4.1
yarl==1.9.4
zipp==3.17.0
zope.deprecation==5.0
zope.interface==6.1
zope.sqlalchemy==3.1

(venv) G:\temp Local install\CogVLM\venv\Scripts>

Who can help?

@ArthurZucker
@amyeroberts
@pacman100
@SunMarc
@younesbelkada

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

Here is the full code; the pip freeze output is listed above.

The error: self and mat2 must have the same dtype, but got Half and Char

No errors are visible in the CMD window; the error message is returned as the response.

The same code works when loading in 4-bit.

import torch
from transformers import AutoModelForCausalLM, LlamaTokenizer

DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
MODEL_PATH = "THUDM/cogagent-vqa-hf"
tokenizer = LlamaTokenizer.from_pretrained('lmsys/vicuna-7b-v1.5')
torch_type = torch.float16

# 8-bit loading via bitsandbytes; this is the configuration that triggers the error.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    low_cpu_mem_usage=True,
    load_in_8bit=True,
    trust_remote_code=True
).eval()

def process_image(image, input_text, temperature, top_p, top_k, do_sample):
    with torch.no_grad():
        # Build the model inputs with the helper shipped in the model's remote code.
        input_by_model = model.build_conversation_input_ids(tokenizer, query=input_text, history=[], images=[image], template_version='base')
        inputs = {
            'input_ids': input_by_model['input_ids'].unsqueeze(0).to(DEVICE),
            'token_type_ids': input_by_model['token_type_ids'].unsqueeze(0).to(DEVICE),
            'attention_mask': input_by_model['attention_mask'].unsqueeze(0).to(DEVICE),
            'images': [[input_by_model['images'][0].to(DEVICE).to(torch_type)]],
        }
        if 'cross_images' in input_by_model and input_by_model['cross_images']:
            inputs['cross_images'] = [[input_by_model['cross_images'][0].to(DEVICE).to(torch_type)]]

        gen_kwargs = {
            "max_length": 2048,
            "temperature": temperature,
            "do_sample": do_sample,
            "top_p": top_p,
            "top_k": top_k
        }
        outputs = model.generate(**inputs, **gen_kwargs)
        # Keep only the newly generated tokens, dropping the prompt.
        outputs = outputs[:, inputs['input_ids'].shape[1]:]
        response = tokenizer.decode(outputs[0])
        return response.split("</s>")[0]

Expected behavior

no error


younesbelkada · 2024-02-04T16:09:36Z

Hi @FurkanGozukara
This issue is a duplicate of TimDettmers/bitsandbytes#1029 - can you share the full traceback of the error so that I can fix the issue on the Hub?
My gut feeling is that the model is not compatible with bnb-8bit; the model code authors will need to make a slight change to make it work.
You can also post the same issue on the model repo: https://huggingface.co/THUDM/cogagent-vqa-hf/discussions with the full traceback of the issue
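Since the error only shows up as the returned response and nothing is printed in the console, one way to obtain the full traceback is to wrap the failing call and print the traceback explicitly. The following is a minimal sketch using only the standard library; the helper name `call_with_traceback` is hypothetical and not part of transformers or bitsandbytes.

```python
import traceback


def call_with_traceback(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and capture the full traceback on failure.

    Returns a (result, traceback_text) pair; traceback_text is None on success.
    This helper is illustrative and not part of any library.
    """
    try:
        return fn(*args, **kwargs), None
    except Exception:
        tb_text = traceback.format_exc()
        # Print to the console so the traceback is visible even if a UI layer
        # only displays the exception message.
        print(tb_text)
        return None, tb_text
```

Assuming the `process_image` function from the reproduction above, it could be called as `response, tb = call_with_traceback(process_image, image, input_text, temperature, top_p, top_k, do_sample)`, and the contents of `tb` pasted into the issue.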

younesbelkada · 2024-02-04T16:10:22Z

Also, does the issue happen with 4-bit as well?

FurkanGozukara · 2024-02-04T16:14:44Z

> Also does the issue happens with 4-bit as well?

The thing is, 4-bit works perfectly fine.

You may be right that the model does not support 8-bit.

Therefore I am testing cogvlm-chat-hf right now.

There aren't any errors on CMD, so I have no other information.

FurkanGozukara · 2024-02-04T16:35:45Z

cogvlm-chat-hf worked with 8-bit.

So it is probably related to the model itself. I messaged the developers. Thank you.
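In PyTorch error messages, "Half" refers to torch.float16 and "Char" to torch.int8, so the message suggests that a float16 tensor is being multiplied directly against an int8 weight somewhere in the model code. One way to narrow this down is to list which parameters ended up as int8 after loading. The sketch below is an assumption about how one might inspect this, not a confirmed diagnosis; the helper name `summarize_param_dtypes` is hypothetical, and it accepts any iterable of (name, object with a `.dtype` attribute) pairs such as `model.named_parameters()`.

```python
from collections import Counter


def summarize_param_dtypes(named_params):
    """Count parameters per dtype and collect the names of int8 parameters.

    named_params: iterable of (name, param) pairs where param has a .dtype
    attribute, for example model.named_parameters(). Illustrative helper only.
    """
    counts = Counter()
    int8_names = []
    for name, param in named_params:
        dtype = str(param.dtype)  # e.g. "torch.float16" or "torch.int8"
        counts[dtype] += 1
        if dtype == "torch.int8":
            int8_names.append(name)
    return counts, int8_names
```

With the 8-bit model loaded, `counts, int8_names = summarize_param_dtypes(model.named_parameters())` would show which weights were quantized; comparing that list against the module named in the traceback may help the model authors locate the incompatible operation.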

younesbelkada mentioned this issue Feb 4, 2024

When using AutoModelForCausalLM, THUDM/cogagent-vqa-hf and load_in_8bit I get this error : self and mat2 must have the same dtype, but got Half and Char TimDettmers/bitsandbytes#1029

Closed

FurkanGozukara closed this as completed Feb 4, 2024
