Bitsandbytes integration in ORTModelForCausalLM.from_pretrained() #1664

Open
pradeepdev-1995 opened this issue Jan 23, 2024 · 0 comments
Labels
bug Something isn't working

Comments


System Info

optimum==1.17.0.dev0

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction (minimal, reproducible, runnable)

The following code:

import torch
from transformers import BitsAndBytesConfig
from optimum.onnxruntime import ORTModelForCausalLM

finetuned_model_name = "path"

# 4-bit NF4 quantization config, computing in float16
compute_dtype = torch.float16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=False,
)

# Export the fine-tuned model to ONNX, trying to apply the config
ort_model = ORTModelForCausalLM.from_pretrained(
    finetuned_model_name,
    use_io_binding=True,
    quantization_config=bnb_config,
    export=True,
    use_cache=True,
    from_transformers=True,
)

raises the error:

TypeError: _from_transformers() got an unexpected keyword argument 'quantization_config'

So how can quantization be applied while loading a model with ORTModelForCausalLM?

Expected behavior

ORTModelForCausalLM.from_pretrained() should either accept the quantization_config argument when exporting, or Optimum should document a supported way to quantize a model while loading it with ORTModelForCausalLM, instead of raising:

TypeError: _from_transformers() got an unexpected keyword argument 'quantization_config'
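For context: BitsAndBytesConfig is applied by PyTorch/transformers at load time and, as far as I know, is not carried over by the ONNX export path, which is presumably why _from_transformers() rejects the keyword. The quantization route Optimum does support for ONNX Runtime models goes through ORTQuantizer. Below is a minimal sketch (assuming optimum 1.17, dynamic INT8 quantization, and that the export produces a single model.onnx; the onnx_dir and quantized_dir directory names are placeholders):

from optimum.onnxruntime import ORTModelForCausalLM, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

finetuned_model_name = "path"   # placeholder path, as in the report above
onnx_dir = "onnx_model"         # hypothetical output directories
quantized_dir = "onnx_model_int8"

# 1. Export the fine-tuned model to ONNX (no quantization_config here)
model = ORTModelForCausalLM.from_pretrained(
    finetuned_model_name,
    export=True,
    use_cache=True,
)
model.save_pretrained(onnx_dir)

# 2. Quantize the exported graph with ONNX Runtime (dynamic INT8).
#    If the export produced several .onnx files, ORTQuantizer.from_pretrained
#    may additionally need a file_name=... argument.
quantizer = ORTQuantizer.from_pretrained(onnx_dir)
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
quantizer.quantize(save_dir=quantized_dir, quantization_config=qconfig)

# 3. Load the quantized model for inference
ort_model = ORTModelForCausalLM.from_pretrained(
    quantized_dir,
    file_name="model_quantized.onnx",
)

The avx512_vnni preset targets recent Intel CPUs; AutoQuantizationConfig also provides presets such as avx2 and arm64 for other hardware.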
