Building for macOS arm64 - Apple Silicon fails with FileExistsError for cffi #813

Closed
viblo opened this issue Aug 31, 2021 · 30 comments

@viblo

viblo commented Aug 31, 2021

I'm trying to add macOS arm64 builds for a package I maintain, Pymunk. However, the build fails with a FileExistsError on a dependency. Since this error does not happen for users building "normally" on Macs, I think it's something in cibuildwheel (to my knowledge; unfortunately I don't have an M1 to try on myself).

I'm trying to build on GitHub Actions, using joerick/cibuildwheel@v2.1.1 with a very basic workflow file:

    strategy:
      matrix:
        os: [macos-10.15]
...
      - name: Build wheels
        uses: joerick/cibuildwheel@v2.1.1
        env:
          CIBW_BUILD: "cp39-*"
          CIBW_ARCHS_MACOS: "x86_64 universal2 arm64"
          CIBW_TEST_COMMAND: "python -m pymunk.tests"
          CIBW_BUILD_VERBOSITY: 3

Full file here: https://github.com/viblo/pymunk/blob/cd5b5d5c6faa92f9a4b71e86bb257387125a7af9/.github/workflows/wheels.yml

The main part of the error log:

      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/setuptools/wheel.py", line 95, in install_as_egg
        self._install_as_egg(destination_eggdir, zf)
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/setuptools/wheel.py", line 103, in _install_as_egg
        self._convert_metadata(zf, destination_eggdir, dist_info, egg_info)
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/setuptools/wheel.py", line 124, in _convert_metadata
        os.mkdir(destination_eggdir)
    FileExistsError: [Errno 17] File exists: '/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/pip-req-build-cyf6wxz8/.eggs/cffi-1.14.6-py3.9-macosx-11.0-arm64.egg'
    Building wheel for pymunk (setup.py): finished with status 'error'
    ERROR: Failed building wheel for pymunk

Full build log is here: https://github.com/viblo/pymunk/runs/3470607783?check_suite_focus=true

The library, Pymunk, depends on cffi. cffi does not provide wheels for macOS arm64, so I guess it gets built from source first and then something breaks.
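
The bottom frame of the traceback is a plain `os.mkdir` on a path that already exists. A minimal sketch of that failure mode, using a throwaway temporary directory rather than the real `.eggs` path (the egg directory name is copied from the log; `mkdir_twice` and everything else is illustrative):

```python
import errno
import os
import tempfile


def mkdir_twice(parent: str, name: str) -> int:
    """Create the same directory twice and return the errno of the second attempt."""
    target = os.path.join(parent, name)
    os.mkdir(target)  # first creation succeeds
    try:
        os.mkdir(target)  # second creation hits the existing directory
    except FileExistsError as exc:
        return exc.errno
    return 0


with tempfile.TemporaryDirectory() as tmp:
    # Directory name taken from the error log; the temp parent is illustrative.
    code = mkdir_twice(tmp, "cffi-1.14.6-py3.9-macosx-11.0-arm64.egg")
    print(code == errno.EEXIST)
```

This only shows why the exception type is `FileExistsError`; it says nothing about which tool created the directory the first time.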

P.S. Let me take this opportunity to say that I think cibuildwheel works great in general; it has made it very easy to build wheels. Thanks!

@nitzmahone
Contributor

CFFI maintainer here- I actually just tripped over this while looking for an existing issue on cibuildwheel. Not sure what's going on there, but I suspect it might be a weird corner case in pip or wheel with cached intermediate builds to a shared tempdir when the same dep appears in setup_requires + install_requires (as you have with cffi).
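
To check whether a project is exposed to the suspected corner case, one could look for requirement names that appear in both lists. A rough sketch: `overlapping_requirements` is a hypothetical helper, the name parsing is deliberately naive, and the requirement strings are illustrative rather than copied from Pymunk's `setup.py`:

```python
import re


def requirement_name(spec: str) -> str:
    """Return the normalized project name at the start of a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", spec)
    if match is None:
        raise ValueError(f"cannot parse requirement: {spec!r}")
    # PEP 503 normalization: lowercase, runs of -, _ and . become a single dash.
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def overlapping_requirements(setup_requires, install_requires):
    """Names listed in both setup_requires and install_requires."""
    return sorted(
        {requirement_name(s) for s in setup_requires}
        & {requirement_name(s) for s in install_requires}
    )


# Illustrative inputs, not the real Pymunk metadata.
print(overlapping_requirements(["cffi >= 1.15.0"], ["cffi >= 1.15.0", "numpy"]))
```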

CFFI is actually adding wheels for Python 3.10 and MacOS arm64 this week as part of cffi's 1.15.0rc1, and I'm hoping to get some validation on them before we push them live. I'll post a notification at https://groups.google.com/g/python-cffi when they're ready- would be great if you could try a test build against cffi == 1.15.0rc1 at that point.

@viblo
Author

viblo commented Sep 21, 2021

Nice, will try it out when available 👍

@viblo
Author

viblo commented Oct 3, 2021

I saw the release of the rc and tried to update the build pipeline to use it. However, it didn't really work out: it seems cibuildwheel targets a different macOS version than the cffi wheel does. I tried a couple of different ways to force it, but couldn't get it to use the correct one so that the prebuilt wheel could be installed.

@Czaki
Contributor

Czaki commented Oct 3, 2021

@viblo Did you try the macOS 11 image? Your old workflow file uses macOS 10.15, while cffi builds its arm wheels for macOS 11.0.

@viblo
Author

viblo commented Oct 3, 2021

Yes, I tried with os: [macos-11] in the matrix, using the latest cibuildwheel, 2.1.2. That should be enough? I will try some more tomorrow; maybe there's something else I missed.

@Czaki
Contributor

Czaki commented Oct 3, 2021

I don't have much experience with this. It may help, but won't it still be an x86 machine? Maybe @henryiii has more experience?
It might be better if cffi provided universal2 wheels.

@nitzmahone
Contributor

I did most of my initial cffi build testing for Apple Silicon with universal2 wheels, but there are a lot of difficulties with that (which I'll spare you the details of), so I ended up just making 11.0_arm64 wheels in addition to the existing 10.9_x86_64 wheels. Modern pip will still happily auto-install the 10.9 x86_64 wheels on 11.0+ x86_64, so if you're not getting the new wheels when your requirement lists cffi==1.15.0rc1, there's probably something else going on. Unless you're trying to build a fully universal2 environment that's totally agnostic to x86/arm64 (good luck, it's really hard), the wheels that are there should work for anything.

@henryiii
Contributor

henryiii commented Oct 4, 2021

You can actually merge the two into a universal2 wheel, by the way. Haven't done it, but it can be done.

Having the separate wheels does create issues when you are cross-compiling: you'll get the x86 wheels when you are trying to build arm64 code. The end user will get the right wheel (if it's a dep), but you'll get the wrong one when building. Not sure if there's a way to fix that.
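
A simplified model of the situation described above: an installer running on an x86_64 host accepts a wheel only if the wheel's arch matches the host and its minimum macOS version is not newer than the host's. This is a sketch, not pip's actual tag-matching logic (it ignores universal2 entirely); `compatible_wheels` is a hypothetical helper, and the arm64 filename is assumed from the "11.0_arm64" tag mentioned earlier in the thread.

```python
import re

# Platform tag at the end of a wheel filename, e.g. macosx_10_9_x86_64
_MAC_TAG = re.compile(r"macosx_(\d+)_(\d+)_(\w+)\.whl$")


def parse_mac_tag(filename):
    """Return ((major, minor), arch) from a macOS wheel filename."""
    match = _MAC_TAG.search(filename)
    if match is None:
        raise ValueError(f"not a macOS wheel: {filename!r}")
    major, minor, arch = match.groups()
    return (int(major), int(minor)), arch


def compatible_wheels(filenames, host_version, host_arch):
    """Wheels that this simplified model would accept on the host."""
    accepted = []
    for name in filenames:
        min_version, arch = parse_mac_tag(name)
        if arch == host_arch and min_version <= host_version:
            accepted.append(name)
    return accepted


wheels = [
    "cffi-1.15.0rc1-cp39-cp39-macosx_10_9_x86_64.whl",
    "cffi-1.15.0rc1-cp39-cp39-macosx_11_0_arm64.whl",  # assumed filename
]
# An x86_64 macOS 11 runner only picks the x86_64 wheel, even when the build targets arm64.
print(compatible_wheels(wheels, (11, 0), "x86_64"))
```

In this model the host arch, not the build target, decides which wheel is chosen, which is the mismatch described for cross-compiling.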

@nitzmahone
Contributor

nitzmahone commented Oct 4, 2021

Yeah, I did a bunch of that- I got really good at building fat binaries for libyaml and libffi- the really hard part is getting a clean Python extension build that behaves properly. I was able to get things that "worked", but didn't fit well into the existing wheel build toolchains. It's a little easier with multibuild than cibuildwheel just by its nature of script hooks, but it still resulted in a lot more "artisanal wheel building code" than I'd prefer to maintain. 😆

I also spent a good bit of time trying to get Mac cross-compilation to work reliably on 10.x workers with new Xcode/SDKs, but ultimately gave up on it as too much hassle for things of any complexity. For MacOS, it's much easier if you've got an 11.x worker, but those have only started to become widely available in the last few weeks- I ended up just rolling my own arm64 GHA workers that I run privately (using some custom stuff with the Monterey beta to host the runners in ephemeral arm64 Monterey VMs), since I'm ultimately not going to release code that hasn't actually been tested on the target arch. I was willing to do it manually if necessary for a while, but building/testing in an arm64 runner is ultimately the Right Thing, and hopefully I can retire my custom runner hosting whenever Mac arm64 workers become available.

@nitzmahone
Contributor

nitzmahone commented Oct 4, 2021

Having the separate wheels does create issues when you are cross compiling

Thinking about this a bit more- if Mac cross-compile support assumes that all required deps are universal2, that seems like a bug. I ran into all sorts of nasty corner cases with real-world cross-compile support anyway that were generating unusable wheels, but most were related to the distutils/setuptools extension build behavior around inheriting the build Python's compile flags. Sometimes I was able to munge it enough to get things to behave properly, but just as often I wasn't and had to drop back to manually building extensions, which I wasn't comfortable shipping or maintaining.

@joerick
Contributor

joerick commented Oct 5, 2021

I'd like to understand this better, I think. Apologies for asking the stupid questions! Why is cffi required as a compile-time dependency here? I thought that FFI was a run-time linking tool. Is it doing some kind of codegen too?

I'm trying to understand why the cross-compile step (running on x86_64, targeting arm64) requires the arm64 wheel to do the build.

@nitzmahone
Contributor

nitzmahone commented Oct 5, 2021

I'm trying to understand why the cross-compile step (running on x86_64, targeting arm64) requires the arm64 wheel to do the build.

Ditto. Though yes, cffi does have its own setuptools/distutils extension builder (so you can pre-build and ship CFFI glue extensions instead of building them at runtime). As I try to keep my involvement with that project limited to packaging and release management, I don't know the specifics of how that stuff would be (ab)used in a cross-compile scenario. But I suspect it's still probably relying on a lot of the same distutils/sysconfig compile flag inheritance from the hosting Python, so there could be all sorts of weird corner cases that won't behave properly in a mixed arch environment.

@viblo
Author

viblo commented Oct 5, 2021

So, I managed to build it when building only the arm64 wheels. If I try to build for x86_64 or universal2 at the same time as arm64, it fails. Testing some more, it turns out that building for x86_64 fails with os set to macos-11, while it works on macos-10.15.

Looking closer, I now see that cffi 1.15.0rc1 only provides arm64 wheels for macOS 11 and x86_64 wheels only for 10.x, so it makes sense that the build fails the same way as when I opened this issue.

@henryiii
Contributor

henryiii commented Oct 5, 2021

x86_64 10.x wheels should work on 11 - as long as you have an up-to-date pip, that is. Reasonably recent versions of cibuildwheel should be fine, etc.

@viblo
Author

viblo commented Oct 6, 2021

OK, strange. The error I get is the same as the one in the first post of this thread, except that it now complains about the x86_64 version:

File exists: '/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/pip-req-build-zaxt6fl8/.eggs/cffi-1.15.0rc1-py3.9-macosx-10.9-x86_64.egg'

The strange thing is that a prebuilt wheel does exist on PyPI: cffi-1.15.0rc1-cp39-cp39-macosx_10_9_x86_64.whl

So the issue does not seem to be whether there are wheels for cffi or not, but that the x86_64 cffi build only works on the macos-10.15 image, while arm64 only works on macos-11.

@viblo
Author

viblo commented Oct 6, 2021

I should add that I just tried the very newly released cffi-1.15.0rc2 as well, with no change in the result.

@nitzmahone
Contributor

nitzmahone commented Oct 7, 2021

I don't know how the project you're building uses cffi, but it seems unlikely that what you're trying to do will work properly without actually running your build on M1 hardware. You only have x86_64 runners available publicly in GitHub. cibuildwheel itself, running with a Python fat binary, can (in simple cases) trick a pure-Python setuptools Extension builder into generating the right compiler args to cross-compile an arm64/universal2 wheel on an x86_64 runner. But if your build actually requires the CFFI runtime to do the build, you can't do that without the ability to actually execute arm64 code (since CFFI is decidedly not pure Python; it has to be able to execute C code and libffi under the arch it's building for).

@nitzmahone
Contributor

nitzmahone commented Oct 7, 2021

Also, just for grins, I forked your project to my org where I have self-hosted M1 GHA runners and reproduced it on M1 hardware as well (https://github.com/rolpdog/pymunk/actions/runs/1317961546) . It looks like the original issue is a setuptools/wheel problem (the same one I hit that led me to this issue when googling the traceback). But that said, even if this bug were fixed or worked around, since your project is using CFFI's extension builder, you'd likely need to do a lot of manual intervention to get a functional universal2 wheel, since cffi itself has no concept of fat binaries or multi-arch builds (and probably never will). The way it works for really simple things where all the input source is multi-arch capable and the libs to link are fat binaries, it just passes two arches to clang and poops out a fat binary. But for anything where that's not the case (which likely includes your cffi extension), you'll have to manually build the two arch extensions separately, lipo them together, then do some surgery on the wheel build (or manually do a bdist_wheel on a hacked intermediate build dir) to get something that's properly tagged and usable. Or you could just ship separate x86_64 and arm64 wheels, as I've chosen to do, because very little of the tooling is capable of doing "easy" universal2 wheels for things of any complexity, and universal2 wheels are overrated. 😆
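
The "build each arch separately, then lipo them together" step could be scripted roughly as below. This sketch only assembles the `lipo -create ... -output ...` command line; `lipo_create_command` and the file names are hypothetical, and the retagging of the wheel afterwards is not covered.

```python
import shlex


def lipo_create_command(thin_binaries, output):
    """Build the argv for merging single-arch binaries into one fat binary."""
    if len(thin_binaries) < 2:
        raise ValueError("need at least two single-arch binaries to merge")
    return ["lipo", "-create", *thin_binaries, "-output", output]


cmd = lipo_create_command(
    ["build/x86_64/_ext.so", "build/arm64/_ext.so"],  # hypothetical paths
    "build/universal2/_ext.so",
)
print(shlex.join(cmd))
# On a Mac, subprocess.run(cmd, check=True) would perform the merge.
```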

@nitzmahone
Contributor

PPS, switching your project to a PEP517 build that specifies cffi worked around the setuptools bug and generated an arm64 wheel that passes tests: https://github.com/rolpdog/pymunk/actions/runs/1318077097 (via https://github.com/rolpdog/pymunk).
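
For reference, a PEP 517 setup declares build-time requirements in the `[build-system]` table of `pyproject.toml`. The sketch below renders such a table from Python; the requirement list is an assumption (the thread does not quote the actual file), and `render_build_system` is a hypothetical helper.

```python
def render_build_system(requires, backend="setuptools.build_meta"):
    """Render a minimal [build-system] table for pyproject.toml."""
    quoted = ", ".join(f'"{req}"' for req in requires)
    return (
        "[build-system]\n"
        f"requires = [{quoted}]\n"
        f'build-backend = "{backend}"\n'
    )


# Assumed requirement list; pin versions as the project needs.
print(render_build_system(["setuptools", "wheel", "cffi"]))
```

With the build requirement declared this way, pip installs cffi into an isolated build environment before invoking the backend, so the `.eggs` path from the traceback should not be involved.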

@nitzmahone
Contributor

Long and short @joerick @henryiii - at the end of the day, I don't think this is a cibuildwheel bug- it's a combination of a setuptools bug (which changing to a PEP517 build can neatly avoid) and limitations of universal2 wheel builds of any complexity.

@joerick
Contributor

joerick commented Oct 8, 2021

Thank you so much @nitzmahone for doing the research and reporting back! If running native cffi code is indeed an important part of building a cffi wheel, then it does seem like this isn't going to be easily worked around.

I think we should document this as a known issue of cibuildwheel's universal2/arm64 cross-compile support.

Let's hope Arm64 CI runners are not too far away! actions/runner-images#2187

@viblo
Author

viblo commented Oct 9, 2021

Great find with PEP 517! I updated and tried it myself, and it seems to work fine.

Regarding universal2, I just followed the documentation. If they won't work, no big deal for now, but it would be great if the documentation could give some hint about the issue.

Just a final question @nitzmahone on your comment about cross compiling: Do you mean that cross compiling will work with arm64 as the target, or not at all? I tried updating the pipeline with pyproject.toml, and made a run here: https://github.com/viblo/pymunk/actions/runs/1323411442 Checking the output wheels, it seems like the arm64 one is the same as the one you built, but I want to make sure since I can't test anything on arm64 myself.

Btw, big thanks for all the help here 👍

@nitzmahone
Contributor

nitzmahone commented Oct 12, 2021

Do you mean that cross compiling will work with arm64 as the target, or not at all?

It's really going to depend on what your extension does and how it works- as I said, my involvement with cffi is mainly limited to its packaging and release management, so I'm happily ignorant of the potential practical pitfalls. You'd have to ask @arigo (the creator/maintainer of the project) to be sure, but at least from a quick skim of the codebase, cffi itself (and its extension builder) does not appear to have been built with cross-compilation in mind. It assumes that the runtime Python arch matches the extension build arch and does its codegen accordingly. Might that work for simple cases? Probably. Would I trust it any further than I could throw it, knowing how complex it can be and not being tested for that? Probably not...
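
One way to make the "runtime Python arch matches the extension build arch" assumption explicit is to compare the interpreter's arch with the requested targets before building. A sketch, assuming the targets arrive in an `ARCHFLAGS`-style string such as `-arch arm64`; `parse_archflags` and `is_cross_build` are hypothetical helpers, not part of cffi or cibuildwheel:

```python
import platform


def parse_archflags(archflags):
    """Extract arch names from a string like '-arch x86_64 -arch arm64'."""
    tokens = archflags.split()
    return [tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok == "-arch"]


def is_cross_build(target_archs, host_arch=None):
    """True if any target arch differs from the arch running the build."""
    host = host_arch or platform.machine()
    return any(arch != host for arch in target_archs)


targets = parse_archflags("-arch arm64")
# host_arch is passed explicitly here to model an x86_64 CI runner.
print(is_cross_build(targets, host_arch="x86_64"))
```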

So yeah, to do this right, you're probably going to need some M1 hardware, at least temporarily. I'd be remiss if I didn't mention MacStadium- they're providing the M1 hardware cffi tests against (I have one on my desk as well, but wasn't too comfortable putting it on the internet); they can hook you up with a remotely-accessible 8G M1 Mac Mini for a pretty reasonable month-to-month rate.

@henryiii
Contributor

Do you want me to test something on an AS machine? Happy to if I know what to run.

@henryiii
Contributor

Honestly, I'd rather expect it to work. I didn't think CFFI produces any code that wouldn't be compilable on an M1 - I'm pretty sure it doesn't compile or write assembly itself. The generated files should compile correctly to M1 machine code, but I also know very little about CFFI other than how to use it.

@nitzmahone
Contributor

I know at runtime there are some places where it (or libffi on its behalf) does arch-specific assembly trampoline generation that would most certainly not work, but I've not looked at the intermediate codegen for the extension builder to see if any of that ends up in the generated code or not. Ignorance is bliss, walking away from this one now, lalala 😆

@viblo
Author

viblo commented Oct 12, 2021

I did check the hash of your wheel (or actually the .so file inside) and the one I built, and they match, so it seems like it worked in this case. But given your hesitation, I'm not sure I will add it to the normal build pipeline and upload to PyPI unless I'm able to run the test suite for future releases. I imagine that if there are errors they might be very tricky to find and debug.
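
Comparing the compiled extension inside two wheels can be done with the standard library alone, since a wheel is a zip archive. A sketch: SHA-256 is an assumption (the thread does not say which hash was used), and `so_digests` and `same_extensions` are hypothetical helpers.

```python
import hashlib
import zipfile


def so_digests(wheel_path):
    """Map each .so member of a wheel to its SHA-256 hex digest."""
    digests = {}
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            if name.endswith(".so"):
                digests[name] = hashlib.sha256(wheel.read(name)).hexdigest()
    return digests


def same_extensions(wheel_a, wheel_b):
    """True if both wheels contain identical .so files under the same names."""
    return so_digests(wheel_a) == so_digests(wheel_b)
```

Matching digests show the two builds produced the same bytes; they do not show that those bytes behave correctly on arm64.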

@nitzmahone
Contributor

Yeah, when cffi breaks, it's pretty much always a segfault- people tend not to like those, and debugging them is super fun 😆

@henryiii
Contributor

I just tested clang-format on M1, and it worked fine - that's the whole LLVM build cross-compiled. If you know it works, I think it's fine to produce wheels. I ran your wheel on AS, Python 3.10, and it loaded fine and the simple example in the readme worked correctly.

@viblo
Author

viblo commented Oct 31, 2021

Since the conclusion was that this is most likely not a problem within cibuildwheel itself but in setuptools, and there is a workaround (use pyproject.toml), I will close this issue. Thanks for the help :)

@viblo viblo closed this as completed Oct 31, 2021
gmishkin added a commit to gmishkin/rules_poetry that referenced this issue Feb 10, 2022
danxmoran added a commit to danxmoran/cairocffi that referenced this issue Jan 5, 2023
Using `setup_requires` in `setup.cfg` is deprecated in newer versions of
`setuptools`, in favor of PEP517-style `[build-system].requires` in
`pyproject.toml`.

More concretely, I believe moving to the newer style will fix an issue
my team is hitting in our monorepo, where we've occasionally been seeing
failures to install `cairocffi` because of "file already exists" errors
when building the underlying `cffi`. I found a discussion in another
project where the `cffi` maintainer said the issue arises when the same
dependency is listed in both `setup_requires` and `install_requires`
(pypa/cibuildwheel#813 (comment))
and then confirmed that switching to a PEP517-style build fixed things
(pypa/cibuildwheel#813 (comment)).