I was running 52_Build_RAG_pipelines_with_txtai.ipynb. Because AutoAWQ v0.1.5 was compiled against CUDA 11.8, I started with the nvidia/cuda:11.8.0-devel-ubuntu22.04 docker image (which has CUDA 11.8 installed). Because the latest version of torch on PyPI is compiled against CUDA 12.1, I installed PyTorch 2.2.0 compiled against CUDA 11.8 from the pytorch.org registry, then installed txtai as suggested in the notebook's Install dependencies section. However, when I went to create the LLM with LLM("TheBloke/Mistral-7B-OpenOrca-AWQ"), I got the following undefined symbol error.
It looks like AutoAWQ v0.1.5 was compiled against torch 2.0.1 (and CUDA 11.8), not torch 2.2.0. Installing torch 2.0.1 fixes the problem. This is really an AutoAWQ problem, but I'm posting the issue here in case anyone else runs into it. There is also an open issue about this in the AutoAWQ repository, but no resolution yet.
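For reference, here is a minimal sketch of the call that triggers the error (the txtai.pipeline import path is an assumption based on the traceback below; the notebook builds the same pipeline):

    # Minimal repro sketch. The import path is assumed from the traceback
    # (txtai/pipeline/llm/llm.py); the notebook creates the same LLM pipeline.
    from txtai.pipeline import LLM

    # Loading an AWQ-quantized model forces transformers to import autoawq's
    # compiled extension (awq_inference_engine), which is where the
    # undefined symbol error is raised.
    llm = LLM("TheBloke/Mistral-7B-OpenOrca-AWQ")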
Here is the backtrace
ImportError                               Traceback (most recent call last)
Cell In[2], line 2
      1 # Create LLM
----> 2 llm = LLM("TheBloke/Mistral-7B-OpenOrca-AWQ")
      3 # /root/.cache/huggingface/hub/models--TheBloke--Mistral-7B-OpenOrca-AWQ

File /venv/lib/python3.10/site-packages/txtai/pipeline/llm/llm.py:34, in LLM.__init__(self, path, method, **kwargs)
     31 path = path if path else "google/flan-t5-base"
     33 # Generation instance
---> 34 self.generator = GenerationFactory.create(path, method, **kwargs)

File /venv/lib/python3.10/site-packages/txtai/pipeline/llm/factory.py:41, in GenerationFactory.create(path, method, **kwargs)
     39 # Hugging Face Transformers generation
     40 if method == "transformers":
---> 41     return HFGeneration(path, **kwargs)
     43 # Resolve custom method
     44 return GenerationFactory.resolve(path, method, **kwargs)

File /venv/lib/python3.10/site-packages/txtai/pipeline/llm/huggingface.py:22, in HFGeneration.__init__(self, path, template, **kwargs)
     19 super().__init__(path, template, **kwargs)
     21 # Create HuggingFace LLM pipeline
---> 22 self.llm = HFLLM(path, **kwargs)

File /venv/lib/python3.10/site-packages/txtai/pipeline/llm/huggingface.py:35, in HFLLM.__init__(self, path, quantize, gpu, model, task, **kwargs)
     34 def __init__(self, path=None, quantize=False, gpu=True, model=None, task=None, **kwargs):
---> 35     super().__init__(self.task(path, task, **kwargs), path, quantize, gpu, model, **kwargs)
     37     # Load tokenizer, if necessary
     38     self.pipeline.tokenizer = self.pipeline.tokenizer if self.pipeline.tokenizer else Models.tokenizer(path, **kwargs)

File /venv/lib/python3.10/site-packages/txtai/pipeline/hfpipeline.py:56, in HFPipeline.__init__(self, task, path, quantize, gpu, model, **kwargs)
     54     self.pipeline = pipeline(task, model=model, tokenizer=path[1], device=device, model_kwargs=modelargs, **kwargs)
     55 else:
---> 56     self.pipeline = pipeline(task, model=path, device=device, model_kwargs=modelargs, **kwargs)
     58 # Model quantization. Compresses model to int8 precision, improves runtime performance. Only supported on CPU.
     59 if deviceid == -1 and quantize:
     60     # pylint: disable=E1101

File /venv/lib/python3.10/site-packages/transformers/pipelines/__init__.py:870, in pipeline(task, model, config, tokenizer, feature_extractor, image_processor, framework, revision, use_fast, token, device, device_map, torch_dtype, trust_remote_code, model_kwargs, pipeline_class, **kwargs)
    868 if isinstance(model, str) or framework is None:
    869     model_classes = {"tf": targeted_task["tf"], "pt": targeted_task["pt"]}
--> 870     framework, model = infer_framework_load_model(
    871         model,
    872         model_classes=model_classes,
    873         config=config,
    874         framework=framework,
    875         task=task,
    876         **hub_kwargs,
    877         **model_kwargs,
    878     )
    880 model_config = model.config
    881 hub_kwargs["_commit_hash"] = model.config._commit_hash

File /venv/lib/python3.10/site-packages/transformers/pipelines/base.py:278, in infer_framework_load_model(model, config, model_classes, task, framework, **model_kwargs)
    272     logger.warning(
    273         "Model might be a PyTorch model (ending with `.bin`) but PyTorch is not available. "
    274         "Trying to load the model with Tensorflow."
    275     )
    277 try:
--> 278     model = model_class.from_pretrained(model, **kwargs)
    279     if hasattr(model, "eval"):
    280         model = model.eval()

File /venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:566, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    564 elif type(config) in cls._model_mapping.keys():
    565     model_class = _get_model_class(config, cls._model_mapping)
--> 566     return model_class.from_pretrained(
    567         pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    568     )
    569 raise ValueError(
    570     f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
    571     f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
    572 )

File /venv/lib/python3.10/site-packages/transformers/modeling_utils.py:3689, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
   3686 if quantization_config.modules_to_not_convert is not None:
   3687     modules_to_not_convert.extend(quantization_config.modules_to_not_convert)
-> 3689 model, has_been_replaced = replace_with_awq_linear(
   3690     model, quantization_config=quantization_config, modules_to_not_convert=modules_to_not_convert
   3691 )
   3692 model._is_quantized_training_enabled = False
   3694 if not has_been_replaced:

File /venv/lib/python3.10/site-packages/transformers/integrations/awq.py:94, in replace_with_awq_linear(model, modules_to_not_convert, quantization_config, current_key_name, has_been_replaced)
     89     raise ValueError(
     90         "AWQ (either `autoawq` or `llmawq`) is not available. Please install it with `pip install autoawq` or check out the installation guide in https://github.com/mit-han-lab/llm-awq"
     91     )
     93 if backend == AwqBackendPackingMethod.AUTOAWQ:
---> 94     from awq.modules.linear import WQLinear_GEMM, WQLinear_GEMV
     95 elif backend == AwqBackendPackingMethod.LLMAWQ:
     96     from awq.quantize.qmodule import WQLinear

File /venv/lib/python3.10/site-packages/awq/__init__.py:2
      1 __version__ = "0.1.5"
----> 2 from awq.models.auto import AutoAWQForCausalLM

File /venv/lib/python3.10/site-packages/awq/models/__init__.py:1
----> 1 from .mpt import MptAWQForCausalLM
      2 from .llama import LlamaAWQForCausalLM
      3 from .opt import OptAWQForCausalLM

File /venv/lib/python3.10/site-packages/awq/models/mpt.py:1
----> 1 from .base import BaseAWQForCausalLM
      2 from typing import Dict
      3 from transformers.models.mpt.modeling_mpt import MptBlock as OldMptBlock, MptForCausalLM

File /venv/lib/python3.10/site-packages/awq/models/base.py:11
      9 from awq.modules.act import ScaledActivation
     10 from huggingface_hub import snapshot_download
---> 11 from awq.quantize.quantizer import AwqQuantizer
     12 from awq.utils.utils import simple_dispatch_model
     13 from transformers.modeling_utils import shard_checkpoint

File /venv/lib/python3.10/site-packages/awq/quantize/quantizer.py:11
      9 from awq.utils.calib_data import get_calib_dataset
     10 from awq.quantize.scale import apply_scale, apply_clip
---> 11 from awq.modules.linear import WQLinear_GEMM, WQLinear_GEMV
     12 from awq.utils.module import append_str_prefix, get_op_name, get_named_linears, set_op_by_name
     15 class AwqQuantizer:

File /venv/lib/python3.10/site-packages/awq/modules/linear.py:4
      2 import torch
      3 import torch.nn as nn
----> 4 import awq_inference_engine  # with CUDA kernels
      7 def make_divisible(c, divisor):
      8     return (c + divisor - 1) // divisor

ImportError: /venv/lib/python3.10/site-packages/awq_inference_engine.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops15sum_dim_IntList4callERKNS_6TensorEN3c1016OptionalArrayRefIlEEbNS5_8optionalINS5_10ScalarTypeEEE
Ya, I did (I posted a few comments in that linked issue). The v0.1.8 wheel hosted on PyPI is compiled against CUDA 12.1 (or so the README says), while the wheels posted on the Releases page are compiled against CUDA 11.8 (cu118). I tried both (with torch v2.2.0 compiled against either CUDA 11.8 or 12.1, as appropriate) and couldn't get either to work, although, interestingly, I got different undefined symbol errors in the two cases.
What did work was AutoAWQ (any version) with torch 2.0.1, both compiled against CUDA 11.8. EDIT: torch 2.1.x also works.
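If it helps anyone debugging a similar setup, a quick environment probe like the sketch below (not from the notebook; just an illustration) surfaces the mismatch before any model download: the compiled awq_inference_engine extension only imports cleanly when torch's version and CUDA build match what the autoawq wheel was built against.

    # Sanity-check sketch: confirm the torch/CUDA combination before loading
    # an AWQ model. The working combination reported above was torch 2.0.1 or
    # 2.1.x built for CUDA 11.8.
    import torch
    from importlib.metadata import version

    print("torch:", torch.__version__)        # e.g. 2.0.1+cu118
    print("torch CUDA:", torch.version.cuda)  # expect 11.8 for cu118 wheels
    print("autoawq:", version("autoawq"))

    # Importing the CUDA kernels is the step that actually fails when the
    # torch ABI does not match, so trigger it explicitly to fail fast.
    try:
        import awq_inference_engine  # noqa: F401
    except ImportError as exc:
        raise SystemExit(f"autoawq extension does not match this torch build: {exc}")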