
Python linker errors when running cibuildwheel on the new Windows 11 ARM runners #2364


Open
burgholzer opened this issue Apr 15, 2025 · 17 comments

Comments

@burgholzer
Contributor

Description

Just yesterday, GitHub announced that the new Windows 11 ARM runners are now available as public preview.
We tried to switch one of our projects (which is using scikit-build-core together with pybind11) over at munich-quantum-toolkit/core#926, but noticed that our Python testing CI would fail on the new runners.
The corresponding run log can be found here: https://github.com/munich-quantum-toolkit/core/actions/runs/14478932552/job/40611370656?pr=926
Specifically, running cibuildwheel on the new runners would fail in the "Building wheel..." step while trying to link the compiled extension with >100 error messages of the sort

error LNK2001: unresolved external symbol __imp_Py...

I am not 100% sure if this is a cibuildwheel error in itself. This could also have something to do with scikit-build-core. But I saw that both cibuildwheel and scikit-build-core added CI tests under Windows 11 ARM today. So I am not sure.
Any help is appreciated.

Build log

https://github.com/munich-quantum-toolkit/core/actions/runs/14478932552/job/40611370656?pr=926

CI config

https://github.com/munich-quantum-toolkit/core/blob/windows-arm/.github/workflows/ci.yml, which uses the reusable workflow from https://github.com/munich-quantum-toolkit/workflows/blob/main/.github/workflows/reusable-python-packaging.yml

@henryiii
Contributor

I've started producing boost-histogram wheels (scikit-hep/boost-histogram#1001) without a problem, and that stack is also scikit-build-core, pybind11, and cibuildwheel.

A few thoughts: why are you requesting the SABI component? pybind11 doesn't support the stable ABI. And why are you using Ninja on Windows? I've found Ninja to be a bit fragile there. I was looking for differences vs. boost-histogram as a starting point.

@burgholzer
Contributor Author

Interesting. I've removed the SABI component now. It was a relic from when we were trying to switch to nanobind, and optionally searching for it hasn't really caused any issues so far.
Ninja could be a good guess here. We basically use it to get automatic build parallelization, and it hasn't failed us in the past, but I also remember a recent comment of yours in some thread where it was a problem. I'll see whether dropping Ninja fixes the problem once the current run without the SABI component finishes.

@burgholzer
Contributor Author

Ah, and I now remember that we also switched to Ninja at some point because we try to use ccache (via https://github.com/Chocobo1/setup-ccache-action) as much as we can to save CI time, and that only ever really worked (when it worked...) with Ninja as the generator.
However, that caching setup is fairly brittle anyway, so dropping it would not be the end of the world.

Unfortunately, neither of the proposed changes made a difference. The latest workflow run is at https://github.com/munich-quantum-toolkit/core/actions/runs/14480016920/job/40614815482?pr=926, and while the MSVC generator formats the messages slightly differently, it is still the same underlying error.

One thing that might also differ from boost-histogram is that we set BUILD_SHARED_LIBS=ON in the scikit-build-core config.
But I am not really seeing how that should make a difference, especially since the build works on every other combination of operating system and architecture that GitHub Actions has to offer.

@zooba
Contributor

zooba commented Apr 16, 2025

Those unresolved import errors typically mean that you're referencing a libs directory from an x64 install of Python, not from an ARM64 one (these files are platform specific). cibuildwheel should get this right for setuptools, but it doesn't have much logic for other build backends, and most build backends aren't updated to do the detection on their own.

It's also possible that there's another issue with the 3.9 releases - they were quite experimental, and were never actually released outside of NuGet. You might want to start by restricting your ARM64 builds to 3.13 and later.
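One hedged way to express that restriction in cibuildwheel configuration (a sketch using cibuildwheel's documented skip selectors; adjust the patterns to the versions you actually build):

```toml
[tool.cibuildwheel]
# Skip pre-3.13 CPython wheels on Windows ARM64 only; other platforms
# keep their full build matrix.
skip = "cp39-win_arm64 cp310-win_arm64 cp311-win_arm64 cp312-win_arm64"
```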

@burgholzer
Contributor Author

Those unresolved import errors typically mean that you're referencing a libs directory from an x64 install of Python, not from an ARM64 one (these files are platform specific). cibuildwheel should get this right for setuptools, but it doesn't have much logic for other build backends, and most build backends aren't updated to do the detection on their own.

It's also possible that there's another issue with the 3.9 releases - they were quite experimental, and were never actually released outside of NuGet. You might want to start by restricting your ARM64 builds to 3.13 and later.

Thanks for the additional tips! Unfortunately, also the Python 3.13 build runs into the exact same issues: https://github.com/munich-quantum-toolkit/core/actions/runs/14497741943/job/40669686715?pr=926

Some excerpts from the build log:

uv venv 'C:\Users\runneradmin\AppData\Local\Temp\cibw-run-b53_8r26\cp313-win_arm64\build\venv' --python 'C:\Users\runneradmin\AppData\Local\pypa\cibuildwheel\Cache\nuget-cpython\pythonarm64.3.13.2\tools\python.exe'

So this is definitely setting up a virtual environment with the right (arm64) Python.

-- Found Python: C:\Users\runneradmin\AppData\Local\Temp\build-env-w0ekc1vh\Scripts\python.exe (found suitable version "3.13.2", minimum required is "3.9") found components: Interpreter Development.Module

Which is the Python from the isolated virtual environment created as part of the build. So this also checks out.
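For anyone debugging a similar mismatch, here is a quick hypothetical check one could run inside the build environment (not from the log above; the comments describe the expected Windows values, which pointer size alone cannot distinguish):

```python
import platform
import struct
import sysconfig

# Pointer size: 64 on both x64 and ARM64 Windows, so this alone is
# ambiguous for architecture detection.
print(struct.calcsize("P") * 8)

# These do distinguish architectures: on a Windows ARM64 interpreter,
# platform.machine() reports "ARM64" and sysconfig.get_platform()
# reports "win-arm64" (vs. "AMD64" / "win-amd64" on x64).
print(platform.machine())
print(sysconfig.get_platform())
```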

If the NuGet distributions are so fragile, I'd hope that the python-build-standalone distributions that uv also uses could serve as a substitute. ARM64 Windows distributions are currently in the works at astral-sh/python-build-standalone#387.

@zooba
Contributor

zooba commented Apr 16, 2025

If the NuGet distributions are so fragile, I'd hope that the python-build-standalone distributions that uv also uses could be used as a substitute

There's nothing fragile about the distributions - uv does virtually nothing different for Windows - but there is fragility in build backends that assume Windows doesn't support cross-compilation. I haven't looked into which backend the project is using, but chances are that if it isn't setuptools (or my own pymsbuild), it'll need updating to properly handle the ARM64 platform (or possibly just additional configuration).

@henryiii
Contributor

This is not cross-compiled; this issue is on the native runner, right? The cibuildwheel-specific part you are referring to (and you added) is for cross-compiling, I believe.

Scikit-build-core should be reading the same libs location that setuptools is reading when cross-compiling: https://github.com/scikit-build/scikit-build-core/blob/765af303efdd47820a25ad11d05ffff778c8cd71/src/scikit_build_core/builder/sysconfig.py#L49-L59
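The idea can be sketched as follows (an illustrative simplification, not scikit-build-core's actual code; see the linked sysconfig.py for the real logic):

```python
import os
import sys

def windows_libs_dir() -> str:
    """Guess the directory holding pythonXY.lib on a Windows install.

    On Windows, the import library lives in <prefix>\\libs next to
    python.exe. This directory is architecture-specific, which is why
    feeding an x64 libs directory to an ARM64 link (or vice versa)
    produces LNK2001 unresolved-symbol errors.
    """
    return os.path.join(sys.base_prefix, "libs")

print(windows_libs_dir())
```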

Could you also print out Python_LIBRARIES and Python_LIBRARY_DIRS?

@zooba
Contributor

zooba commented Apr 16, 2025

Scikit-build-core should be reading the same libs location that setuptools is reading when cross-compiling

Ah thanks, I was just digging through the repo looking for this.

But it looks like the problem is likely CMake, which shows this line:

-- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.43.34808/bin/Hostarm64/x64/cl.exe - skipped

So it's correctly identified the host platform, but has still selected x64 target instead of ARM64. That would be consistent with the linker errors, if it's the x64 compiler getting the ARM64 libs.

Under "Setting up build environment" I see this output, which could well be at fault:

  Python 3.13.2
  + python -c '"import struct; print(struct.calcsize('"'"'P'"'"') * 8)"'

Getting "64" here doesn't tell you that it's ARM64, so whatever is using the result is quite likely selecting the target inappropriately. Currently the best property to read is the suffix of sys.winver (e.g. via sys.winver.rpartition('-')): there is no suffix on x64, and it is either "32" or "arm64" otherwise.
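That suffix logic can be sketched as a small helper (a hypothetical function, not part of any library; the sample winver strings are illustrative):

```python
def winver_arch_suffix(winver: str) -> str:
    """Return the architecture suffix of a sys.winver-style string.

    sys.winver looks like "3.13" on x64, "3.13-32" on 32-bit Windows,
    and "3.13-arm64" on Windows ARM64.
    """
    head, sep, tail = winver.rpartition("-")
    # rpartition puts the whole string in `tail` when "-" is absent,
    # so treat a missing separator as "no suffix" (i.e. x64).
    return tail if sep else ""

print(winver_arch_suffix("3.13"))        # -> "" (x64)
print(winver_arch_suffix("3.13-32"))     # -> "32"
print(winver_arch_suffix("3.13-arm64"))  # -> "arm64"
```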

@henryiii
Contributor

Could you enable verbose logging for scikit-build-core? https://scikit-build-core.readthedocs.io/en/latest/configuration/index.html#verbosity

I think we are supposed to be passing the correct -A flag for the MSVC generators; I'd like to verify that's true.

@burgholzer
Contributor Author

This is not cross-compiled; this issue is on the native runner, right? The cibuildwheel-specific part you are referring to (and you added) is for cross-compiling, I believe.

Yes, this is a native runner.

Could you also print out Python_LIBRARIES and Python_LIBRARY_DIRS?

Started a new run, which prints the respective variables and additionally enables verbose build output as well as DEBUG-level logging in scikit-build-core (figured that you'd also want to have the additional output 😌). Results are here: https://github.com/munich-quantum-toolkit/core/actions/runs/14500231680/job/40677905680?pr=926

Excerpt for the Python variables:

-- Python executable: C:\Users\runneradmin\AppData\Local\Temp\build-env-9e4gle8z\Scripts\python.exe
-- Python libraries: C:/Users/runneradmin/AppData/Local/pypa/cibuildwheel/Cache/nuget-cpython/pythonarm64.3.9.10/tools/libs/python39.lib
-- Python include directory: C:/Users/runneradmin/AppData/Local/pypa/cibuildwheel/Cache/nuget-cpython/pythonarm64.3.9.10/tools/Include
-- Python library directories: C:/Users/runneradmin/AppData/Local/pypa/cibuildwheel/Cache/nuget-cpython/pythonarm64.3.9.10/tools/libs

@henryiii
Contributor

DEBUG - Selecting win-amd64 or win-arm64 due to VSCMD_ARG_TARGET_ARCH

Haha, interesting. It looks like we rely on the CMake default if nothing is selected, rather than passing -G "Visual Studio 17 2022" -A arm64. Though I don't know why the CMake default would not build for ARM on ARM. I also noticed that PLATFORM='x64' is set in the environment.

@burgholzer
Contributor Author

DEBUG - Selecting win-amd64 or win-arm64 due to VSCMD_ARG_TARGET_ARCH

Haha, interesting. It looks like we rely on the CMake default if nothing is selected, rather than passing -G "Visual Studio 17 2022" -A arm64. Though I don't know why the CMake default would not build for ARM on ARM. I also noticed that PLATFORM='x64' is set in the environment.

Yeah. That seems to be it. If I set the architecture manually via

[[tool.cibuildwheel.overrides]]
select = "cp*-win_arm64"
environment = { CMAKE_ARGS = "-A arm64" }

the build works. See https://github.com/munich-quantum-toolkit/core/actions/runs/14500807070/job/40679759195?pr=926
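A hedged alternative that avoids splicing flags into CMAKE_ARGS: CMake also reads the CMAKE_GENERATOR_PLATFORM environment variable for generators that support -A (Ninja ignores it), so the same override could plausibly be written as:

```toml
[[tool.cibuildwheel.overrides]]
select = "cp*-win_arm64"
environment = { CMAKE_GENERATOR_PLATFORM = "arm64" }
```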

IIRC, the architecture flag only works for the MSVC generator though and is not valid for Ninja. Would there be an alternative way to force the choice of the right architecture?

@henryiii
Contributor

henryiii commented Apr 17, 2025

Have you tried without https://github.com/Chocobo1/setup-ccache-action? This, specifically, worries me. There's no mention of Windows ARM in the action and at the very least it might be mixing up caching.

Ahh, wait, you use https://github.com/ilammy/msvc-dev-cmd. I see no mention of ARM in the "native support" section of the readme. That's also suspect. Could you try removing both of those? If you use the MSVC generator instead of Ninja, you shouldn't need the MSVC setup step. I've asked at ilammy/msvc-dev-cmd#90.

@burgholzer
Contributor Author

Good points. Before trying that, I quickly re-enabled our other Windows CI runs in the PR to make sure that any change works for all use cases. One strange thing stood out:
The regular C++ testing build using windows-11-arm with MSVC runs through successfully despite having:

Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.43.34808/bin/Hostx64/x64/cl.exe - skipped

which is clearly the wrong architecture. Maybe this somehow works like Apple's Rosetta translation/emulation, but it is still very strange to me.

Anyway, it seems the msvc-dev-cmd action was at fault. After removing it and fiddling with a couple of settings (removing Ninja as a generator for Windows everywhere, since without msvc-dev-cmd the Ninja generator would, for whatever reason, suddenly choose GCC as the compiler), all CI runs across the board are green: https://github.com/munich-quantum-toolkit/core/actions/runs/14510398287/job/40707627311 🥳
It's a bit unfortunate that the ccache setup does not work with the MSVC generator (it finds no cacheable calls) and, hence, the builds now all take longer, but I don't see any quick fix for that. Any best practices on your end for build caching on Windows?

Thank you for the amazing help in debugging this! Really appreciated, and I hope I did not take too much of your time. Feel free to close this as completed. 😌

@zooba
Contributor

zooba commented Apr 17, 2025

Maybe this somehow works like Apple's Rosetta translation/emulation, but it is still very strange to me.

Yes, Windows ARM64 can emulate both x86 and x64 binaries (quite efficiently, for the most part). Which is why bin\Hostx86\arm64\cl.exe is a perfectly good compiler no matter what architecture you're running on (x64 can also emulate x86, but not ARM). (And also why I say that you're always cross-compiling - none of the MSVC compilers know that they're "native", they only know which platform they're generating code for. It's only the wrappers and build backends that assume the target is the current platform.)

However, with x64 in the second part of it, things are being compiled for the wrong architecture. Windows as a platform is consistent across architectures, so detecting header files and compiler flags will be fine, but hardware detection (which largely isn't used - because binary distribution is the default for Windows, compilers tend to add runtime detection rather than compile-time detection) and the final binary will be wrong.

Any best practices on your end for build caching on Windows?

MSVC builds will put all the intermediate files into a single directory. For a CMake build, it's going to be under whichever build directory they use. You should just be able to cache that entire directory between each build (provided both cached files and source file timestamps are preserved... which may not be true at all...) and the toolchain will figure things out.

I'm not sure what ccache is intending to do with MSVC here, from reading their page it strikes me that the features that aren't available (precompiled headers, multi-file compilation) probably regress build performance more than caching can restore it.

For context, the biggest MSVC projects I'm aware of (e.g. Windows itself, Visual Studio) do binary caching, so if an entire component is up to date then it just copies the final result, otherwise it builds the whole component. For CPython, we also have some components precompiled (e.g. OpenSSL, which takes 10-20 minutes), but mostly we just do multi-file compilation and build the whole thing in 3-5 minutes. Individual source file caching between independent builds never really figures into it.

@HinTak

HinTak commented Apr 18, 2025

It seems that skia-python is failing at the numpy part of "pip install pybind11 numpy" in the build-wheel requirements step.

@HinTak

HinTak commented Apr 18, 2025

Argh, I have CIBW_BEFORE_BUILD: pip install pybind11 numpy which is failing at the moment.
