Error "module compiled against ABI version" when using a device on MacBook Pro M2 with MacOS Sonoma 14.4 #122056

Closed
trevorstr opened this issue Mar 17, 2024 · 19 comments
Labels: has workaround, module: binaries, module: numpy, triaged

@trevorstr

trevorstr commented Mar 17, 2024

🐛 Describe the bug

pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
import torch
mps = torch.device("mps")

When I run the code, I get this error:


A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.0b1 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled against NumPy 2.0.

If you are a user of the module, the easiest solution will be to
either downgrade NumPy or update the failing module (if available).

Traceback (most recent call last):
  File "/Users/trevor.sullivan/git/pytorch-audio-classifier/main.py", line 7, in <module>
    mps = torch.device("cpu")
/Users/trevor.sullivan/git/pytorch-audio-classifier/main.py:7: DeprecationWarning: numpy.core._multiarray_umath is deprecated and has been renamed to numpy._core._multiarray_umath. The numpy._core namespace contains private NumPy internals and its use is discouraged, as NumPy internals can change without warning in any release. In practice, most real-world usage of numpy.core is to access functionality in the public NumPy API. If that is the case, use the public NumPy API. If not, you are using NumPy internals. If you would still like to access an internal attribute, use numpy._core._multiarray_umath._ARRAY_API.
  mps = torch.device("cpu")
/Users/trevor.sullivan/git/pytorch-audio-classifier/main.py:7: UserWarning: Failed to initialize NumPy: module compiled against ABI version 0x1000009 but this version of numpy is 0x2000000 (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
  mps = torch.device("cpu")
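
The two hex values in the warning can be read side by side. The following is a minimal sketch, assuming the top byte of each value encodes the major ABI series and the low bits a minor revision. That layout is an assumption inferred from the two values shown, not something stated in this thread, and `split_abi_version` is a hypothetical helper:

```python
def split_abi_version(value: int) -> tuple[int, int]:
    """Split a packed ABI version into (major, minor).

    Assumption: the top byte is the major series and the low 24 bits
    are a minor revision counter.
    """
    return value >> 24, value & 0xFFFFFF


compiled_against = 0x1000009  # value the extension module was built with
running_numpy = 0x2000000     # value reported by the installed NumPy

print(split_abi_version(compiled_against))  # (1, 9)
print(split_abi_version(running_numpy))     # (2, 0)
```

Read this way, the module was built against the 1.x series while the installed NumPy reports the 2.x series, which matches the text of the first message.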

How can I go about resolving this?

Versions

PyTorch version: 2.4.0.dev20240317
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 14.4 (arm64)
GCC version: Could not collect
Clang version: 15.0.0 (clang-1500.3.9.4)
CMake version: Could not collect
Libc version: N/A

Python version: 3.12.2 (main, Feb  6 2024, 20:19:44) [Clang 15.0.0 (clang-1500.1.0.2.5)] (64-bit runtime)
Python platform: macOS-14.4-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M2 Pro

Versions of relevant libraries:
[pip3] numpy==2.0.0b1
[pip3] torch==2.4.0.dev20240317
[pip3] torchaudio==2.2.0.dev20240317
[pip3] torchvision==0.18.0.dev20240317
[conda] Could not collect

cc @seemethere @malfet @osalpekar @atalman @mruberry @rgommers @kulinseth @albanD @DenisVieriu97 @razarmehr

@bdhirsh bdhirsh added triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module module: mps Related to Apple Metal Performance Shaders framework labels Mar 18, 2024
@malfet malfet added module: binaries Anything related to official binaries that we release to users module: numpy Related to numpy support, and also numpy compatibility of our operators and removed module: mps Related to Apple Metal Performance Shaders framework labels Mar 18, 2024
@malfet malfet self-assigned this Mar 18, 2024
@malfet malfet added this to the 2.3.0 milestone Mar 18, 2024
@malfet
Contributor

malfet commented Mar 18, 2024

[Edit] Indeed, PyTorch built against the NumPy 2.0 release candidate is backward compatible with NumPy 1.x

Let's test it throughout our config permutations: #122157

@malfet
Contributor

malfet commented Mar 18, 2024

@trevorstr as the message says, your workaround is to uninstall numpy (or install numpy 1.x)

@atalman
Contributor

atalman commented Mar 18, 2024

@trevorstr We are currently working on enabling numpy 2.x on our CI/CD, tracked here: #107302

@trevorstr
Author

@trevorstr as the message says, your workaround is to uninstall numpy (or install numpy 1.x)

I'm confused though, because I didn't install numpy 2.x specifically. It must have been automatically installed when I ran the pip command to install torch and torchvision.

Does that mean that the PyTorch package is referencing the wrong version of numpy in its dependencies?

@malfet
Contributor

malfet commented Mar 18, 2024

I'm confused though, because I didn't install numpy 2.x specifically. It must have been automatically installed when I ran the pip command to install torch and torchvision.

torchvision depends on NumPy and does not constrain it to 1.x only. Although NumPy 2.x is not released yet, installing with the --pre flag lets pip pick the 2.0.0b1 pre-release.
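
The effect of the --pre flag can be illustrated with a small sketch. This is not pip's resolver; `is_prerelease`, `release_key`, and `pick_version` are hypothetical helpers, and the list of available versions is made up for illustration. It only shows the idea that pre-releases are skipped unless explicitly allowed:

```python
import re

# Matches pre-release markers such as 2.0.0b1, 2.0.0rc1, 2.4.0.dev20240317.
_PRE_RELEASE = re.compile(r"(a|b|rc|\.dev)\d+$")


def is_prerelease(version: str) -> bool:
    return _PRE_RELEASE.search(version) is not None


def release_key(version: str) -> tuple[int, ...]:
    """Numeric release segment only, e.g. '2.0.0b1' -> (2, 0, 0).

    Simplification: a real resolver also orders a pre-release below the
    final release of the same version; this sketch ignores that.
    """
    numbers = re.match(r"\d+(\.\d+)*", version).group(0)
    return tuple(int(part) for part in numbers.split("."))


def pick_version(available: list[str], allow_pre: bool) -> str:
    """Pick the highest version, skipping pre-releases unless allowed."""
    candidates = [v for v in available if allow_pre or not is_prerelease(v)]
    return max(candidates, key=release_key)


available = ["1.26.3", "1.26.4", "2.0.0b1"]  # illustrative list
print(pick_version(available, allow_pre=False))  # 1.26.4
print(pick_version(available, allow_pre=True))   # 2.0.0b1
```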

@trevorstr
Author

That makes perfect sense, thank you for helping me understand why that occurred.

I can confirm that after uninstalling numpy 2.x and explicitly installing the latest 1.x version, the code now works.

(venv) trevor.sullivan@computer pytorch-audio-classifier % pip3 uninstall numpy
Found existing installation: numpy 2.0.0b1
Uninstalling numpy-2.0.0b1:
  Would remove:
    /Users/trevor.sullivan/git/pytorch-audio-classifier/venv/bin/f2py
    /Users/trevor.sullivan/git/pytorch-audio-classifier/venv/bin/numpy-config
    /Users/trevor.sullivan/git/pytorch-audio-classifier/venv/lib/python3.12/site-packages/numpy-2.0.0b1.dist-info/*
    /Users/trevor.sullivan/git/pytorch-audio-classifier/venv/lib/python3.12/site-packages/numpy/*
Proceed (Y/n)? y
  Successfully uninstalled numpy-2.0.0b1
(venv) trevor.sullivan@computer pytorch-audio-classifier % pip3 install numpy==1.26.4
Collecting numpy==1.26.4
  Downloading numpy-1.26.4-cp312-cp312-macosx_11_0_arm64.whl.metadata (61 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.1/61.1 kB 514.8 kB/s eta 0:00:00
Downloading numpy-1.26.4-cp312-cp312-macosx_11_0_arm64.whl (13.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.7/13.7 MB 12.1 MB/s eta 0:00:00
Installing collected packages: numpy
Successfully installed numpy-1.26.4
(venv) trevor.sullivan@computer pytorch-audio-classifier % python3 main.py
(venv) trevor.sullivan@computer pytorch-audio-classifier %

Feel free to close this ticket if you want, or leave it open since you're working on integrating numpy 2.x. Thank you

@malfet malfet changed the title [MPS] Error "module compiled against ABI version" when using a device on MacBook Pro M2 with MacOS Sonoma 14.4 Error "module compiled against ABI version" when using a device on MacBook Pro M2 with MacOS Sonoma 14.4 Mar 18, 2024
@malfet malfet removed their assignment Mar 19, 2024
@malfet malfet removed this from the 2.3.0 milestone Mar 19, 2024
@malfet
Contributor

malfet commented Mar 19, 2024

Actually, I'm not sure there is anything we want to fix here, as neither of those is an error but rather a warning, albeit an annoying one

@malfet malfet self-assigned this Mar 19, 2024
@malfet malfet added this to the 2.3.0 milestone Mar 19, 2024
@rgommers
Collaborator

I'll note that the verbose warning is due to pybind11 digging into private internals of numpy (which now mostly works, but may give incorrect behavior for some pybind11 functionality in combination with numpy 2.0), and it'll likely turn into a hard error very soon.

The likely next things to happen are:

@malfet
Contributor

malfet commented Mar 19, 2024

@rgommers but is it safe to assume that if one builds using the current version of pybind11 and the numpy-2.0.0b1 API, the result will be both forward compatible with the upcoming 2.0.0 and backward compatible with 1.x? Or is it better to constrain PyTorch-2.3 to NumPy-1.x?

@rgommers
Collaborator

I think not. There are two parts to that:

  1. Is building against numpy 2.0.0b1 going to be ABI compatible with 2.0 and 1.x? Yes (we shouldn't be changing ABI anymore, and the official freeze in 2.0.0rc1 is days away)
  2. Is using pybind11 2.11.1 going to result in binaries that are runtime-compatible with numpy 2.0? Likely not

For (2), Pybind11 accesses API (e.g., https://github.com/pybind/pybind11/blob/v2.11/include/pybind11/numpy.h#L266) that may disappear or start raising.

Or is it better to constrain PyTorch-2.3 to NumPy-1.x?

As of right now, yes, it'd be safer to have a runtime dependency numpy<2. However, there is a month left to the 2.3 release date. If a pybind11 2.12.0 release and numpy 2.0.0rc1 will be available within a week or so, it'd be really useful to update pybind11 in the 2.3 release branch and not have an upper bound on numpy.

I understand that there is some risk to doing something like that so late in the release cycle, since pybind11 is an important dependency. But it's only a risk, and if something breaks somewhere even though all CI is green, surely it can be fixed. If we have PyTorch incompatible with the latest NumPy until 2.4.0 in the second half of the year though, that is definitely going to result in annoying problems (e.g., a user doing pip install torch numpy>=2.0 is going to then get latest numpy and a torch downgraded to 2.2.2, because 2.2.2 doesn't have the upper bound on numpy).
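
The downgrade scenario described above can be sketched with a toy model. The metadata below is hypothetical and exists only to illustrate the reasoning (a newer release with an upper bound on numpy, an older one without); it is not the actual dependency metadata of any torch release, and `newest_compatible_torch` is a made-up helper:

```python
# Hypothetical metadata for illustration only: each torch release maps to the
# highest NumPy major version it accepts (None means no upper bound).
TORCH_RELEASES = {
    (2, 2, 2): None,  # no upper bound declared
    (2, 3, 0): 1,     # hypothetical "numpy<2" constraint
}


def newest_compatible_torch(numpy_major: int) -> tuple[int, int, int]:
    """Return the newest torch release that accepts the given NumPy major."""
    compatible = [
        version
        for version, max_numpy_major in TORCH_RELEASES.items()
        if max_numpy_major is None or numpy_major <= max_numpy_major
    ]
    return max(compatible)


print(newest_compatible_torch(1))  # (2, 3, 0)
print(newest_compatible_torch(2))  # (2, 2, 2)
```

Under these assumptions, a request for NumPy 2 leaves only the older, unconstrained release as a candidate, which is the downgrade described in the comment.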

@malfet
Contributor

malfet commented Mar 20, 2024

  2. Is using pybind11 2.11.1 going to result in binaries that are runtime-compatible with numpy 2.0? Likely not

For (2), Pybind11 accesses API (e.g., https://github.com/pybind/pybind11/blob/v2.11/include/pybind11/numpy.h#L266) that may disappear or start raising.

But we are not querying any NumPy APIs thru PyBind, so that shouldn't be a problem.

As of right now, yes, it'd be safer to have a runtime dependency numpy<2

Tricky part is: PyTorch does not depend on NumPy, but say TorchVision does. So we can put a constraint there, but as you've said it just results in us installing older versions of Torch, which is not ideal

@rgommers
Collaborator

But we are not querying any NumPy APIs thru PyBind, so that shouldn't be a problem.

Ah okay, then it's perhaps safer - but hard to be sure without trying. I'd say that if you see the warning in the issue description (A module that was compiled using NumPy 1.x ...), there is likely to be a problem. It's coming from

bool is_numpy_available() {
  static bool available = []() {
    if (_import_array() >= 0) {
      return true;
    }
    // Try to get exception message, print warning and return false
    std::string message = "Failed to initialize NumPy";
    // NOLINTNEXTLINE(cppcoreguidelines-init-variables)
    PyObject *type, *value, *traceback;
    PyErr_Fetch(&type, &value, &traceback);
    if (auto str = value ? PyObject_Str(value) : nullptr) {
      if (auto enc_str = PyUnicode_AsEncodedString(str, "utf-8", "strict")) {
        if (auto byte_str = PyBytes_AS_STRING(enc_str)) {
          message += ": " + std::string(byte_str);
        }
        Py_XDECREF(enc_str);
      }
      Py_XDECREF(str);
    }
    PyErr_Clear();
    TORCH_WARN(message);
    return false;
  }();
  return available;
}

so the end result is that is_numpy_available returns false, and numpy interop will not work.
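
The control flow of that function (try the import once, warn on failure, cache the outcome) can be sketched in Python. This is an analogy to the C++ above, not PyTorch code; `make_availability_check` and `failing_import` are hypothetical names, and the failing import is a stand-in for the ABI mismatch:

```python
import warnings
from typing import Callable


def make_availability_check(import_array: Callable[[], None]) -> Callable[[], bool]:
    """Return a function that runs import_array once and caches the outcome."""
    cached: list[bool] = []

    def is_available() -> bool:
        if not cached:
            try:
                import_array()
                cached.append(True)
            except ImportError as exc:
                # Mirror the C++ path: report the message, then answer False.
                warnings.warn(f"Failed to initialize NumPy: {exc}")
                cached.append(False)
        return cached[0]

    return is_available


def failing_import() -> None:
    # Stand-in for an ABI mismatch detected during initialization.
    raise ImportError("module compiled against ABI version 0x1000009")


check = make_availability_check(failing_import)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    first, second = check(), check()

print(first, second, len(caught))  # False False 1
```

The warning appears once, and every later call silently reports that interop is unavailable, which is why the message in the issue description matters even though it is not an error.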

@rgommers
Collaborator

Tricky part is: PyTorch does not depend on NumPy, but say TorchVision does. So we can put a constraint there, but as you've said it just results in us installing older versions of Torch, which is not ideal

Agreed. That won't work well in practice. Maybe we should give it a few days, and see if the needed pybind11 fixes are in by Monday?

@atalman
Contributor

atalman commented Mar 20, 2024

@rgommers Nightly change for wheels just landed: pytorch/builder@65f8e7d We can test these tomorrow morning. Is there a particular smoke test or anything similar we can run to check that it works correctly?

@rgommers
Collaborator

Great! ABI compat should be fine; if not a very basic test like this should catch it:

import torch
import numpy as np  # with numpy 1.26 for example

x = np.arange(5)
t = torch.tensor(x)

test/test_numpy_interop.py is a useful small subset to run, with numpy 1.26 and 2.0.0b1

That's ABI compat; doesn't mean any regular Python API usage is 100% fine (it probably is), but a regular test suite run in an env with the nightly and numpy 2.0.0b1 will turn that up soon enough.
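
A slightly extended version of the basic test above, as a sketch: it round-trips an array through `torch.from_numpy` and `Tensor.numpy()` and reports the outcome instead of raising. `numpy_interop_ok` is a hypothetical helper, and treating `RuntimeError` and `TypeError` as the failure signal is an assumption about how a broken interop surfaces:

```python
def numpy_interop_ok():
    """Return True/False for the round trip, or None if a package is missing."""
    try:
        import numpy as np
        import torch
    except ImportError:
        return None

    x = np.arange(5)
    try:
        t = torch.from_numpy(x)  # NumPy -> torch
        back = t.numpy()         # torch -> NumPy
    except (RuntimeError, TypeError):
        # Assumed failure mode when NumPy support failed to initialize.
        return False
    return bool((back == x).all())


print(numpy_interop_ok())
```

Running it once in an environment with numpy 1.26 and once with numpy 2.0.0b1 covers the two combinations mentioned above.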

@rgommers
Collaborator

Maybe we should give it a few days, and see if the needed pybind11 fixes are in by Monday?

Quick update: Pybind11 master is now in good shape, pybind/pybind11#5050 just got merged.

@rgommers
Collaborator

Pybind11 2.12.0 is out now: https://pypi.org/project/pybind11/2.12.0/. Upgrading to that will fix this issue.

@rgommers
Collaborator

@trevorstr I'll reopen this, the upgrade actually does need to happen in the pybind11 vendored by PyTorch.

@rgommers rgommers reopened this Mar 28, 2024
@trevorstr
Author

Cool, thank you! I appreciate you taking ownership and driving the resolution.

pytorchbot pushed a commit that referenced this issue Apr 4, 2024
To fix #122056

Building with NP 2.0 allows me to run locally with both NP 2.0 and 1.26.
Any other test we should run @rgommers  ?

FYI @Skylion007 @atalman
Pull Request resolved: #122899
Approved by: https://github.com/Skylion007

(cherry picked from commit 6c2f36c)
atalman pushed a commit that referenced this issue Apr 4, 2024
To fix #122056

Building with NP 2.0 allows me to run locally with both NP 2.0 and 1.26.
Any other test we should run @rgommers  ?

FYI @Skylion007 @atalman
Pull Request resolved: #122899
Approved by: https://github.com/Skylion007

(cherry picked from commit 6c2f36c)

Co-authored-by: albanD <desmaison.alban@gmail.com>
sanketpurandare pushed a commit to sanketpurandare/pytorch that referenced this issue Apr 22, 2024
To fix pytorch#122056

Building with NP 2.0 allows me to run locally with both NP 2.0 and 1.26.
Any other test we should run @rgommers  ?

FYI @Skylion007 @atalman
Pull Request resolved: pytorch#122899
Approved by: https://github.com/Skylion007