mac m1: trainer.train: ImportError: incompatible architecture (have 'x86_64', need 'arm64') #145

yiershi · 2024-02-19T06:07:33Z

System Info

Apple M1 Pro 32 GB
macOS version: 13.6 (22G120)

Reproduction

The code that raised the error, from examples/02-basic_training.ipynb:

trainer.train(batch_size=32,
              nbits=4, # How many bits will the trained model use when compressing indexes
              maxsteps=500000, # Maximum steps hard stop
              use_ib_negatives=True, # Use in-batch negative to calculate loss
              dim=128, # How many dimensions per embedding. 128 is the default and works well.
              learning_rate=5e-6, # Learning rate, small values ([3e-6,3e-5] work best if the base model is BERT-like, 5e-6 is often the sweet spot)
              doc_maxlen=256, # Maximum document length. Because of how ColBERT works, smaller chunks (128-256) work very well.
              use_relu=False, # Disable ReLU -- doesn't improve performance
              warmup_steps="auto", # Defaults to 10%
             )

After executing the cell above, the following error is reported:

[Feb 19, 13:28:20] Loading segmented_maxsim_cpp extension (set COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True for more info)...
Process Process-1:
Traceback (most recent call last):
  File "/Users/hao/miniforge3/envs/colbert/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/Users/hao/miniforge3/envs/colbert/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/hao/miniforge3/envs/colbert/lib/python3.9/site-packages/colbert/infra/launcher.py", line 134, in setup_new_process
    return_val = callee(config, *args)
  File "/Users/hao/miniforge3/envs/colbert/lib/python3.9/site-packages/colbert/training/training.py", line 48, in train
    colbert = ColBERT(name=config.checkpoint, colbert_config=config)
  File "/Users/hao/miniforge3/envs/colbert/lib/python3.9/site-packages/colbert/modeling/colbert.py", line 24, in __init__
    ColBERT.try_load_torch_extensions(self.use_gpu)
  File "/Users/hao/miniforge3/envs/colbert/lib/python3.9/site-packages/colbert/modeling/colbert.py", line 39, in try_load_torch_extensions
    segmented_maxsim_cpp = load(
  File "/Users/hao/miniforge3/envs/colbert/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1306, in load
    return _jit_compile(
  File "/Users/hao/miniforge3/envs/colbert/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1736, in _jit_compile
    return _import_module_from_library(name, build_directory, is_python_module)
  File "/Users/hao/miniforge3/envs/colbert/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 2132, in _import_module_from_library
    module = importlib.util.module_from_spec(spec)
  File "<frozen importlib._bootstrap>", line 565, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1173, in create_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
ImportError: dlopen(/Users/hao/Library/Caches/torch_extensions/py39_cpu/segmented_maxsim_cpp/segmented_maxsim_cpp.so, 0x0002): tried: '/Users/hao/Library/Caches/torch_extensions/py39_cpu/segmented_maxsim_cpp/segmented_maxsim_cpp.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64')), '/System/Volumes/Preboot/Cryptexes/OS/Users/hao/Library/Caches/torch_extensions/py39_cpu/segmented_maxsim_cpp/segmented_maxsim_cpp.so' (no such file), '/Users/hao/Library/Caches/torch_extensions/py39_cpu/segmented_maxsim_cpp/segmented_maxsim_cpp.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))

Solutions Tried But Not Working

Referenced solution: #102

I renamed /Users/hao/Library/Caches/torch_extensions/py39_cpu/segmented_maxsim_cpp (e.g. to segmented_maxsim_cpp_20240219), and separately tried removing the ~/Library/Caches/torch_extensions directory entirely. I reran trainer.train() after each attempt, but the problem remains.

# solution1
cd ~/Library/Caches/torch_extensions/py39_cpu/
mv segmented_maxsim_cpp segmented_maxsim_cpp_20240218

# solution2
rm -r ~/Library/Caches/torch_extensions
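One way to narrow this down is to compare the architecture of the running Python interpreter with the architecture recorded in the rebuilt .so file. The following is a minimal diagnostic sketch, not something taken from this thread: the helper name macho_architectures is hypothetical, the path is copied from the traceback above, and only thin 64-bit and universal Mach-O headers are handled.

```python
import os
import platform
import struct

# Mach-O CPU type constants (values from <mach/machine.h>).
CPU_TYPES = {0x01000007: "x86_64", 0x0100000C: "arm64"}

MH_MAGIC_64 = 0xFEEDFACF  # thin 64-bit Mach-O (little-endian on disk)
FAT_MAGIC = 0xCAFEBABE    # universal binary; its header is big-endian


def macho_architectures(data):
    """Return the architectures named in the leading bytes of a Mach-O file.

    Hypothetical helper for illustration; returns [] for unrecognised input.
    """
    if len(data) < 8:
        return []
    if struct.unpack("<I", data[:4])[0] == MH_MAGIC_64:
        cputype = struct.unpack("<I", data[4:8])[0]
        return [CPU_TYPES.get(cputype, hex(cputype))]
    if struct.unpack(">I", data[:4])[0] == FAT_MAGIC:
        count = struct.unpack(">I", data[4:8])[0]
        archs = []
        for i in range(count):
            start = 8 + 20 * i  # each fat_arch entry is five 32-bit fields
            if start + 4 > len(data):
                break
            cputype = struct.unpack(">I", data[start:start + 4])[0]
            archs.append(CPU_TYPES.get(cputype, hex(cputype)))
        return archs
    return []


if __name__ == "__main__":
    # Path taken from the traceback; adjust if your cache lives elsewhere.
    so_path = os.path.expanduser(
        "~/Library/Caches/torch_extensions/py39_cpu/"
        "segmented_maxsim_cpp/segmented_maxsim_cpp.so"
    )
    print("interpreter architecture:", platform.machine())
    if os.path.exists(so_path):
        with open(so_path, "rb") as f:
            print("extension architecture(s):", macho_architectures(f.read(4096)))
    else:
        print("extension not built yet:", so_path)
```

If the interpreter reports arm64 while the freshly rebuilt extension still reports x86_64, that would suggest the mismatch is introduced at compile time rather than left over from a stale cache. That is only a hypothesis to test, not a confirmed cause.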

Python Environment

$ python --version
Python 3.9.18

$ pip list
Package               Version
--------------------- -----------
aiohttp               3.9.1
aiosignal             1.3.1
annotated-types       0.6.0
anyio                 4.2.0
appnope               0.1.4
asttokens             2.4.1
async-timeout         4.0.3
attrs                 23.2.0
bitarray              2.9.2
blinker               1.7.0
catalogue             2.0.10
certifi               2024.2.2
charset-normalizer    3.3.2
click                 8.1.7
colbert-ai            0.2.19
comm                  0.2.1
dataclasses-json      0.6.4
datasets              2.17.0
debugpy               1.8.1
decorator             5.1.1
Deprecated            1.2.14
dill                  0.3.8
dirtyjson             1.0.8
distro                1.9.0
exceptiongroup        1.2.0
executing             2.0.1
faiss-cpu             1.7.4
filelock              3.13.1
Flask                 3.0.2
frozenlist            1.4.1
fsspec                2023.10.0
git-python            1.0.3
gitdb                 4.0.11
GitPython             3.1.42
greenlet              3.0.3
h11                   0.14.0
httpcore              1.0.3
httpx                 0.26.0
huggingface-hub       0.20.3
idna                  3.6
importlib-metadata    7.0.1
ipykernel             6.29.2
ipython               8.18.1
itsdangerous          2.1.2
jedi                  0.19.1
Jinja2                3.1.3
joblib                1.3.2
jsonpatch             1.33
jsonpointer           2.4
jupyter_client        8.6.0
jupyter_core          5.7.1
langchain             0.1.7
langchain-community   0.0.20
langchain-core        0.1.23
langsmith             0.0.87
llama-index           0.9.48
MarkupSafe            2.1.5
marshmallow           3.20.2
matplotlib-inline     0.1.6
mpmath                1.3.0
multidict             6.0.5
multiprocess          0.70.16
mypy-extensions       1.0.0
nest-asyncio          1.6.0
networkx              3.2.1
ninja                 1.11.1.1
nltk                  3.8.1
numpy                 1.26.4
onnx                  1.15.0
openai                1.12.0
packaging             23.2
pandas                2.2.0
parso                 0.8.3
pexpect               4.9.0
pillow                10.2.0
pip                   24.0
platformdirs          4.2.0
prompt-toolkit        3.0.43
protobuf              4.25.3
psutil                5.9.8
ptyprocess            0.7.0
pure-eval             0.2.2
pyarrow               15.0.0
pyarrow-hotfix        0.6
pydantic              2.6.1
pydantic_core         2.16.2
Pygments              2.17.2
pyspark               2.4.7
python-dateutil       2.8.2
python-dotenv         1.0.1
pytz                  2024.1
PyYAML                6.0.1
pyzmq                 25.1.2
RAGatouille           0.0.7.post3
regex                 2023.12.25
requests              2.31.0
ruff                  0.1.15
safetensors           0.4.2
scikit-learn          1.4.1.post1
scipy                 1.12.0
sentence-transformers 2.3.1
sentencepiece         0.1.99
setuptools            69.1.0
six                   1.16.0
smmap                 5.0.1
sniffio               1.3.0
SQLAlchemy            2.0.27
srsly                 2.4.8
stack-data            0.6.3
sympy                 1.12
tenacity              8.2.3
threadpoolctl         3.3.0
tiktoken              0.6.0
tokenizers            0.15.2
torch                 2.2.0
tornado               6.4
tqdm                  4.66.2
traitlets             5.14.1
transformers          4.37.2
typing_extensions     4.9.0
typing-inspect        0.9.0
tzdata                2024.1
ujson                 5.9.0
urllib3               2.2.1
voyager               2.0.2
wcwidth               0.2.13
Werkzeug              3.0.1
wheel                 0.42.0
wrapt                 1.16.0
xxhash                3.4.1
yarl                  1.9.4
zipp                  3.17.0

bclavie · 2024-02-20T15:32:50Z

Hi there!

Thank you for reporting this. Training is currently GPU/CUDA-only (this should be documented better) (cc @Anmol6 again), but this isn't quite the issue that I'd expect to pop up!

Do you have any issues loading the custom cpp code when running indexing?
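Since training is described above as GPU/CUDA-only, a quick pre-flight check before calling trainer.train() can save a failed run. This is a minimal sketch that assumes torch.cuda.is_available() is a reasonable proxy for that requirement; the helper name cuda_available is hypothetical and not part of RAGatouille.

```python
import importlib.util


def cuda_available():
    """Return True only if PyTorch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False  # PyTorch is not installed in this environment
    import torch

    return torch.cuda.is_available()


if __name__ == "__main__":
    if cuda_available():
        print("CUDA device found; training should be able to start.")
    else:
        print("No CUDA device found; training is expected to fail on this machine.")
```

On an Apple Silicon Mac this check is expected to return False, because CUDA is not available there.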

yiershi · 2024-02-22T03:43:28Z

Thanks for the reply. I now have no issues loading the custom cpp code when running indexing.

yiershi closed this as completed Feb 22, 2024
