undefined symbol when importing autoawq #660

Closed
scott-vsi opened this issue Feb 7, 2024 · 3 comments

scott-vsi commented Feb 7, 2024

I was running 52_Build_RAG_pipelines_with_txtai.ipynb. Because autoAWQ v0.1.5 was compiled against CUDA 11.8, I started from the nvidia/cuda:11.8.0-devel-ubuntu22.04 Docker image (which ships with CUDA 11.8). Because the latest torch wheel on PyPI is compiled against CUDA 12.1, I installed PyTorch 2.2.0 built against CUDA 11.8 from the pytorch.org index:

pip install torch==2.2.0 torchvision==0.17 --index-url https://download.pytorch.org/whl/cu118
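
(To confirm what actually got installed, torch.version.cuda reports the CUDA toolkit the wheel was built against; with the pin above it should print something like 2.2.0+cu118 11.8.)

python -c "import torch; print(torch.__version__, torch.version.cuda)"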

I then installed txtai with

pip install git+https://github.com/neuml/txtai#egg=txtai[pipeline] autoawq==0.1.5

as suggested in the notebook's Install dependencies section. However, when I went to create the LLM with LLM("TheBloke/Mistral-7B-OpenOrca-AWQ"), I got the following undefined symbol error:

ImportError: /venv/lib/python3.10/site-packages/awq_inference_engine.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops15sum_dim_IntList4callERKNS_6TensorEN3c1016OptionalArrayRefIlEEbNS5_8optionalINS5_10ScalarTypeEEE
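
For what it's worth, the missing symbol demangles to a libtorch operator (c++filt ships with binutils), which is what points to a torch ABI mismatch rather than a CUDA one:

echo _ZN2at4_ops15sum_dim_IntList4callERKNS_6TensorEN3c1016OptionalArrayRefIlEEbNS5_8optionalINS5_10ScalarTypeEEE | c++filt
# prints roughly: at::_ops::sum_dim_IntList::call(at::Tensor const&, c10::OptionalArrayRef<long>, bool, c10::optional<c10::ScalarType>)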

It looks like autoAWQ v0.1.5 was compiled against torch 2.0.1 (and CUDA 11.8), not torch 2.2.0. Installing torch 2.0.1 fixes the problem. This is really an autoAWQ problem, but I'm posting the issue here in case anyone else runs into it. There is also an open issue about this in the autoAWQ repo, but no resolution yet.
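
For reference, this is roughly the combination that works (the torchvision pin is my guess at the release that pairs with torch 2.0.1; adjust as needed):

pip install torch==2.0.1 torchvision==0.15.2 --index-url https://download.pytorch.org/whl/cu118
pip install git+https://github.com/neuml/txtai#egg=txtai[pipeline] autoawq==0.1.5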

Here is the full traceback:
ImportError                               Traceback (most recent call last)
Cell In[2], line 2
      1 # Create LLM
----> 2 llm = LLM("TheBloke/Mistral-7B-OpenOrca-AWQ")
      3 # /root/.cache/huggingface/hub/models--TheBloke--Mistral-7B-OpenOrca-AWQ

File /venv/lib/python3.10/site-packages/txtai/pipeline/llm/llm.py:34, in LLM.__init__(self, path, method, **kwargs)
     31 path = path if path else "google/flan-t5-base"
     33 # Generation instance
---> 34 self.generator = GenerationFactory.create(path, method, **kwargs)

File /venv/lib/python3.10/site-packages/txtai/pipeline/llm/factory.py:41, in GenerationFactory.create(path, method, **kwargs)
     39 # Hugging Face Transformers generation
     40 if method == "transformers":
---> 41     return HFGeneration(path, **kwargs)
     43 # Resolve custom method
     44 return GenerationFactory.resolve(path, method, **kwargs)

File /venv/lib/python3.10/site-packages/txtai/pipeline/llm/huggingface.py:22, in HFGeneration.__init__(self, path, template, **kwargs)
     19 super().__init__(path, template, **kwargs)
     21 # Create HuggingFace LLM pipeline
---> 22 self.llm = HFLLM(path, **kwargs)

File /venv/lib/python3.10/site-packages/txtai/pipeline/llm/huggingface.py:35, in HFLLM.__init__(self, path, quantize, gpu, model, task, **kwargs)
     34 def __init__(self, path=None, quantize=False, gpu=True, model=None, task=None, **kwargs):
---> 35     super().__init__(self.task(path, task, **kwargs), path, quantize, gpu, model, **kwargs)
     37     # Load tokenizer, if necessary
     38     self.pipeline.tokenizer = self.pipeline.tokenizer if self.pipeline.tokenizer else Models.tokenizer(path, **kwargs)

File /venv/lib/python3.10/site-packages/txtai/pipeline/hfpipeline.py:56, in HFPipeline.__init__(self, task, path, quantize, gpu, model, **kwargs)
     54     self.pipeline = pipeline(task, model=model, tokenizer=path[1], device=device, model_kwargs=modelargs, **kwargs)
     55 else:
---> 56     self.pipeline = pipeline(task, model=path, device=device, model_kwargs=modelargs, **kwargs)
     58 # Model quantization. Compresses model to int8 precision, improves runtime performance. Only supported on CPU.
     59 if deviceid == -1 and quantize:
     60     # pylint: disable=E1101

File /venv/lib/python3.10/site-packages/transformers/pipelines/__init__.py:870, in pipeline(task, model, config, tokenizer, feature_extractor, image_processor, framework, revision, use_fast, token, device, device_map, torch_dtype, trust_remote_code, model_kwargs, pipeline_class, **kwargs)
    868 if isinstance(model, str) or framework is None:
    869     model_classes = {"tf": targeted_task["tf"], "pt": targeted_task["pt"]}
--> 870     framework, model = infer_framework_load_model(
    871         model,
    872         model_classes=model_classes,
    873         config=config,
    874         framework=framework,
    875         task=task,
    876         **hub_kwargs,
    877         **model_kwargs,
    878     )
    880 model_config = model.config
    881 hub_kwargs["_commit_hash"] = model.config._commit_hash

File /venv/lib/python3.10/site-packages/transformers/pipelines/base.py:278, in infer_framework_load_model(model, config, model_classes, task, framework, **model_kwargs)
    272     logger.warning(
    273         "Model might be a PyTorch model (ending with `.bin`) but PyTorch is not available. "
    274         "Trying to load the model with Tensorflow."
    275     )
    277 try:
--> 278     model = model_class.from_pretrained(model, **kwargs)
    279     if hasattr(model, "eval"):
    280         model = model.eval()

File /venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:566, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    564 elif type(config) in cls._model_mapping.keys():
    565     model_class = _get_model_class(config, cls._model_mapping)
--> 566     return model_class.from_pretrained(
    567         pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    568     )
    569 raise ValueError(
    570     f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
    571     f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
    572 )

File /venv/lib/python3.10/site-packages/transformers/modeling_utils.py:3689, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
   3686 if quantization_config.modules_to_not_convert is not None:
   3687     modules_to_not_convert.extend(quantization_config.modules_to_not_convert)
-> 3689 model, has_been_replaced = replace_with_awq_linear(
   3690     model, quantization_config=quantization_config, modules_to_not_convert=modules_to_not_convert
   3691 )
   3692 model._is_quantized_training_enabled = False
   3694 if not has_been_replaced:

File /venv/lib/python3.10/site-packages/transformers/integrations/awq.py:94, in replace_with_awq_linear(model, modules_to_not_convert, quantization_config, current_key_name, has_been_replaced)
     89     raise ValueError(
     90         "AWQ (either `autoawq` or `llmawq`) is not available. Please install it with `pip install autoawq` or check out the installation guide in https://github.com/mit-han-lab/llm-awq"
     91     )
     93 if backend == AwqBackendPackingMethod.AUTOAWQ:
---> 94     from awq.modules.linear import WQLinear_GEMM, WQLinear_GEMV
     95 elif backend == AwqBackendPackingMethod.LLMAWQ:
     96     from awq.quantize.qmodule import WQLinear

File /venv/lib/python3.10/site-packages/awq/__init__.py:2
      1 __version__ = "0.1.5"
----> 2 from awq.models.auto import AutoAWQForCausalLM

File /venv/lib/python3.10/site-packages/awq/models/__init__.py:1
----> 1 from .mpt import MptAWQForCausalLM
      2 from .llama import LlamaAWQForCausalLM
      3 from .opt import OptAWQForCausalLM

File /venv/lib/python3.10/site-packages/awq/models/mpt.py:1
----> 1 from .base import BaseAWQForCausalLM
      2 from typing import Dict
      3 from transformers.models.mpt.modeling_mpt import MptBlock as OldMptBlock, MptForCausalLM

File /venv/lib/python3.10/site-packages/awq/models/base.py:11
      9 from awq.modules.act import ScaledActivation
     10 from huggingface_hub import snapshot_download
---> 11 from awq.quantize.quantizer import AwqQuantizer
     12 from awq.utils.utils import simple_dispatch_model
     13 from transformers.modeling_utils import shard_checkpoint

File /venv/lib/python3.10/site-packages/awq/quantize/quantizer.py:11
      9 from awq.utils.calib_data import get_calib_dataset
     10 from awq.quantize.scale import apply_scale, apply_clip
---> 11 from awq.modules.linear import WQLinear_GEMM, WQLinear_GEMV
     12 from awq.utils.module import append_str_prefix, get_op_name, get_named_linears, set_op_by_name
     15 class AwqQuantizer:

File /venv/lib/python3.10/site-packages/awq/modules/linear.py:4
      2 import torch
      3 import torch.nn as nn
----> 4 import awq_inference_engine  # with CUDA kernels
      7 def make_divisible(c, divisor):
      8     return (c + divisor - 1) // divisor

ImportError: /venv/lib/python3.10/site-packages/awq_inference_engine.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops15sum_dim_IntList4callERKNS_6TensorEN3c1016OptionalArrayRefIlEEbNS5_8optionalINS5_10ScalarTypeEEE
@davidmezzetti
Member

Hello, thank you for sharing.

Did you try using the latest version of autoawq?

@scott-vsi
Author

scott-vsi commented Feb 8, 2024

Yes, I did (I posted a few comments in that linked issue). The v0.1.8 wheel hosted on PyPI is compiled against CUDA 12.1 (or so the README says); however, the build posted on the Releases page is compiled against CUDA 11.8 (cu118). I tried both (with torch v2.2.0 compiled against CUDA 11.8 or 12.1, as appropriate) and couldn't get either to work, although interestingly, I got different undefined symbol errors in the two cases.

What did work was autoAWQ (any version) with torch 2.0.1, both compiled against CUDA 11.8. EDIT: torch 2.1.x also works.
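
For example, with torch 2.1.x the pins would be roughly (torchvision 0.16.2 is the release that pairs with torch 2.1.2; exact patch versions are approximate):

pip install torch==2.1.2 torchvision==0.16.2 --index-url https://download.pytorch.org/whl/cu118
pip install autoawq==0.1.5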

@davidmezzetti
Member

Sounds good. I appreciate you documenting this; it should now come up in a search for "txtai autoawq errors" and the like.
