
Failed to load 4-bits weights from HuggingFace #51

Closed
PierreOreistein opened this issue Nov 7, 2023 · 3 comments

@PierreOreistein

PierreOreistein commented Nov 7, 2023

Description

Unable to load the quantized weights (4 bits) from HuggingFace

Code

The code is a direct copy from the file examples/example_chat_4bit_en.py

import torch
from transformers import AutoModel, AutoTokenizer

import auto_gptq
from auto_gptq.modeling import BaseGPTQForCausalLM

auto_gptq.modeling._base.SUPPORTED_MODELS = ["InternLMXComposer"]

torch.set_grad_enabled(False)


class InternLMXComposerQForCausalLM(BaseGPTQForCausalLM):
    layers_block_name = "internlm_model.model.layers"
    outside_layer_modules = [
        "query_tokens",
        "flag_image_start",
        "flag_image_end",
        "visual_encoder",
        "Qformer",
        "internlm_model.model.embed_tokens",
        "internlm_model.model.norm",
        "internlm_proj",
        "internlm_model.lm_head",
    ]
    inside_layer_modules = [
        ["self_attn.k_proj", "self_attn.v_proj", "self_attn.q_proj"],
        ["self_attn.o_proj"],
        ["mlp.gate_proj"],
        ["mlp.up_proj"],
        ["mlp.down_proj"],
    ]


# init model and tokenizer
model = InternLMXComposerQForCausalLM.from_quantized(
    "internlm/internlm-xcomposer-7b-4bit", trust_remote_code=True, device="cuda:0"
)
model = model.eval()
tokenizer = AutoTokenizer.from_pretrained(
    "internlm/internlm-xcomposer-7b-4bit", trust_remote_code=True
)
model.model.tokenizer = tokenizer

# example image
image = "examples/images/aiyinsitan.jpg"

# Multi-Turn Text-Image Dialogue
# 1st turn
text = 'Describe this image in detail.'
image = "examples/images/aiyinsitan.jpg"
response, history = model.chat(text, image)
print(f"User: {text}")
print(f"Bot: {response}") 
# The image features a black and white portrait of Albert Einstein, the famous physicist and mathematician. 
# Einstein is seated in the center of the frame, looking directly at the camera with a serious expression on his face. 
# He is dressed in a suit, which adds a touch of professionalism to his appearance. 

Error

Traceback (most recent call last):
  File "/mnt/bd/dev-pierre-oreistein-st/sandbox/test_internlm_vl/test_internlm_vl_4bits", line 35, in <module>
    model = InternLMXComposerQForCausalLM.from_quantized(
  File "/home/pierre/.pyenv/versions/dev3.9/lib/python3.9/site-packages/auto_gptq/modeling/_base.py", line 847, in from_quantized
    raise FileNotFoundError(f"Could not find a model in {model_name_or_path} with a name in {', '.join(searched_files)}. Please specify the argument model_basename to use a custom file name.")
FileNotFoundError: Could not find a model in internlm/internlm-xcomposer-7b-4bit with a name in gptq_model-4bit-128g.safetensors, model.safetensors. Please specify the argument model_basename to use a custom file name.

Ideas

According to this similar issue, I need to specify the model file name with model_basename. However, I was unable to find the weight file on HuggingFace. Could you help me with this?
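For reference, this is how I understand model_basename would be passed (a sketch only; the basename below is taken from the error message and is not a file I could actually find on the Hub):

model = InternLMXComposerQForCausalLM.from_quantized(
    "internlm/internlm-xcomposer-7b-4bit",
    trust_remote_code=True,
    device="cuda:0",
    model_basename="gptq_model-4bit-128g",  # hypothetical basename, copied from the error message
)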

Thanks in advance for your help!

@LightDXY
Collaborator

LightDXY commented Nov 7, 2023

Hi, try setting use_safetensors=False when loading the model with from_quantized. This may be caused by a difference in the default value across GPTQ versions.
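For example (a minimal sketch of the same loading call from your snippet, with only that flag added):

model = InternLMXComposerQForCausalLM.from_quantized(
    "internlm/internlm-xcomposer-7b-4bit",
    trust_remote_code=True,
    device="cuda:0",
    use_safetensors=False,  # look for the .bin / .pt weight files instead of safetensors
)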

@PierreOreistein
Author

PierreOreistein commented Nov 7, 2023

Thanks for the fast reply. I still get a similar error, though: it was able to download the .bin file but apparently could not load it. Any idea what the problem is? Or any recommendation on which GPTQ version to use?

Environment

Python 3.9.18

accelerate==0.24.1
aiofiles==23.2.1
aiohttp==3.8.6
aiosignal==1.3.1
altair==5.1.2
annotated-types==0.6.0
anyio==3.7.1
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens==2.4.1
async-lru==2.0.4
async-timeout==4.0.3
attrs==23.1.0
auto-gptq==0.5.0
Babel==2.13.1
beautifulsoup4==4.12.2
bleach==6.1.0
certifi==2023.7.22
cffi==1.16.0
charset-normalizer==3.3.2
click==8.1.7
comm==0.1.4
contourpy==1.2.0
cycler==0.12.1
datasets==2.14.6
debugpy==1.8.0
decorator==5.1.1
defusedxml==0.7.1
dill==0.3.7
einops==0.7.0
exceptiongroup==1.1.3
executing==2.0.1
fastapi==0.104.1
fastjsonschema==2.18.1
ffmpy==0.3.1
filelock==3.13.1
fonttools==4.44.0
fqdn==1.5.1
frozenlist==1.4.0
fsspec==2023.10.0
gekko==1.0.6
gradio==3.44.4
gradio_client==0.5.1
h11==0.14.0
httpcore==1.0.1
httpx==0.25.1
huggingface-hub==0.17.3
idna==3.4
importlib-metadata==6.8.0
importlib-resources==6.1.0
ipykernel==6.26.0
ipython==8.17.2
ipywidgets==8.1.1
isoduration==20.11.0
jedi==0.19.1
Jinja2==3.1.2
json5==0.9.14
jsonpointer==2.4
jsonschema==4.19.2
jsonschema-specifications==2023.7.1
jupyter==1.0.0
jupyter-console==6.6.3
jupyter-events==0.8.0
jupyter-lsp==2.2.0
jupyter_client==8.5.0
jupyter_core==5.5.0
jupyter_server==2.9.1
jupyter_server_terminals==0.4.4
jupyterlab==4.0.8
jupyterlab-pygments==0.2.2
jupyterlab-widgets==3.0.9
jupyterlab_server==2.25.0
kiwisolver==1.4.5
markdown2==2.4.10
MarkupSafe==2.1.3
matplotlib==3.8.1
matplotlib-inline==0.1.6
mistune==3.0.2
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.15
nbclient==0.8.0
nbconvert==7.10.0
nbformat==5.9.2
nest-asyncio==1.5.8
networkx==3.2.1
notebook==7.0.6
notebook_shim==0.2.3
numpy==1.26.1
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.18.1
nvidia-nvjitlink-cu12==12.3.52
nvidia-nvtx-cu12==12.1.105
orjson==3.9.10
overrides==7.4.0
packaging==23.2
pandas==2.1.2
pandocfilters==1.5.0
parso==0.8.3
peft==0.6.0
pexpect==4.8.0
Pillow==10.1.0
platformdirs==3.11.0
polars==0.19.12
prometheus-client==0.18.0
prompt-toolkit==3.0.39
psutil==5.9.6
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==14.0.0
pycparser==2.21
pydantic==2.4.2
pydantic_core==2.10.1
pydub==0.25.1
Pygments==2.16.1
pyparsing==3.1.1
python-dateutil==2.8.2
python-json-logger==2.0.7
python-multipart==0.0.6
pytz==2023.3.post1
PyYAML==6.0.1
pyzmq==25.1.1
qtconsole==5.5.0
QtPy==2.4.1
referencing==0.30.2
regex==2023.10.3
requests==2.31.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rouge==1.0.1
rpds-py==0.12.0
safetensors==0.4.0
semantic-version==2.10.0
Send2Trash==1.8.2
sentencepiece==0.1.99
six==1.16.0
sniffio==1.3.0
soupsieve==2.5
stack-data==0.6.3
starlette==0.27.0
sympy==1.12
terminado==0.17.1
timm==0.4.12
tinycss2==1.2.1
tokenizers==0.13.3
tomli==2.0.1
toolz==0.12.0
torch==2.1.0
torchvision==0.16.0
tornado==6.3.3
tqdm==4.66.1
traitlets==5.13.0
transformers==4.33.1
triton==2.1.0
types-python-dateutil==2.8.19.14
typing_extensions==4.8.0
tzdata==2023.3
uri-template==1.3.0
urllib3==2.0.7
uvicorn==0.24.0.post1
wcwidth==0.2.9
webcolors==1.13
webencodings==0.5.1
websocket-client==1.6.4
websockets==11.0.3
widgetsnbextension==4.0.9
XlsxWriter==3.1.2
xxhash==3.4.1
yarl==1.9.2
zipp==3.17.0

Code

I added use_safetensors=False to the from_quantized call:

import torch
from transformers import AutoModel, AutoTokenizer

import auto_gptq
from auto_gptq.modeling import BaseGPTQForCausalLM

auto_gptq.modeling._base.SUPPORTED_MODELS = ["InternLMXComposer"]

torch.set_grad_enabled(False)


class InternLMXComposerQForCausalLM(BaseGPTQForCausalLM):
    layers_block_name = "internlm_model.model.layers"
    outside_layer_modules = [
        "query_tokens",
        "flag_image_start",
        "flag_image_end",
        "visual_encoder",
        "Qformer",
        "internlm_model.model.embed_tokens",
        "internlm_model.model.norm",
        "internlm_proj",
        "internlm_model.lm_head",
    ]
    inside_layer_modules = [
        ["self_attn.k_proj", "self_attn.v_proj", "self_attn.q_proj"],
        ["self_attn.o_proj"],
        ["mlp.gate_proj"],
        ["mlp.up_proj"],
        ["mlp.down_proj"],
    ]


# init model and tokenizer
model = InternLMXComposerQForCausalLM.from_quantized(
    "internlm/internlm-xcomposer-7b-4bit",
    trust_remote_code=True,
    device="cuda:0",
    use_safetensors=False
)
model = model.eval()
tokenizer = AutoTokenizer.from_pretrained(
    "internlm/internlm-xcomposer-7b-4bit", trust_remote_code=True
)
model.model.tokenizer = tokenizer

# example image
image = "examples/images/aiyinsitan.jpg"

# Multi-Turn Text-Image Dialogue
# 1st turn
text = 'Describe this image in detail.'
image = "./norway.png"
response, history = model.chat(text, image)
print(f"User: {text}")
print(f"Bot: {response}") 
# The image features a black and white portrait of Albert Einstein, the famous physicist and mathematician. 
# Einstein is seated in the center of the frame, looking directly at the camera with a serious expression on his face. 
# He is dressed in a suit, which adds a touch of professionalism to his appearance. 

Error

Downloading (…)_model-4bit-128g.bin: 100%|█████████████████████████████████████████████████████| 7.25G/7.25G [03:20<00:00, 36.1MB/s]
Traceback (most recent call last):
  File "/mnt/bd/dev-pierre-oreistein-st/sandbox/test_internlm_vl/test_internlm_vl_4bits", line 35, in <module>
    
  File "/home/pierre/.pyenv/versions/dev3.9/lib/python3.9/site-packages/auto_gptq/modeling/_base.py", line 847, in from_quantized
    raise FileNotFoundError(f"Could not find a model in {model_name_or_path} with a name in {', '.join(searched_files)}. Please specify the argument model_basename to use a custom file name.")
FileNotFoundError: Could not find a model in internlm/internlm-xcomposer-7b-4bit with a name in gptq_model-4bit-128g.bin, gptq_model-4bit-128g.pt, model.pt. Please specify the argument model_basename to use a custom file name.

@myownskyW7
Collaborator

It seems the downloaded files are not found. Please try downloading https://huggingface.co/internlm/internlm-xcomposer-7b-4bit/tree/main to a local path, and change "internlm/internlm-xcomposer-7b-4bit" in the example code to that local path.
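For example, something along these lines (a sketch using huggingface_hub.snapshot_download; the local directory name is only an example):

from huggingface_hub import snapshot_download

# Download the full repository to a local directory (directory name is just an example)
local_path = snapshot_download(
    repo_id="internlm/internlm-xcomposer-7b-4bit",
    local_dir="./internlm-xcomposer-7b-4bit",
)

# Then point from_quantized at the local copy instead of the Hub repo id
model = InternLMXComposerQForCausalLM.from_quantized(
    local_path,
    trust_remote_code=True,
    device="cuda:0",
    use_safetensors=False,
)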
