New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
graphite2: Fix universal build #18070
Conversation
Notifying maintainers: |
Thanks. Does this fix https://trac.macports.org/ticket/64532? If so, the commit message should say so. The commit message should begin with Let's see what upstream says about your PR before we accept this. |
62b0523
to
5b559b2
Compare
I hope that it fixes the universal Intel/PowerPC build as well, but I could test only the Intel/ARM64 build. I also understood at first the ticket is about cross-building with Rosetta, not specifically about universal. Added "Probably fixes ..."
Thanks! Updated.
The upstream seems quite dormant. graphite2 is a dependency of harfbuzz, and harfbuzz a dependency of a lot of great packages we can't build as universal at the moment. |
You could have tagged me, that was my ticket. I will test this in Rosetta, as well as ppc+ppc64. P. S. I am not really sure about MATCHES syntax, but possibly |
@ryandesign Yes, it does, but it seems that the patch here is applied to a wrong spot. Obviously, it should be in Darwin chunk and not Linux.
|
5b559b2
to
953ad05
Compare
@barracuda156 Updated the broken patch. It works for me with arm64, so I don't think there's a need to add ppc64. |
@tuffnatty Cannot verify atm, but I am fine with that: if anything, adding ppc64 will be trivial. |
953ad05
to
9b6aa0c
Compare
Updated the patch with proper CMake quoting. |
Updated the patch to also use muniversal PortGroup avoiding any performance degradation. |
dbe182c
to
b49e91e
Compare
@tuffnatty Ah, this one is still pending. If not in a hurry to merge, I can test this for ppc+ppc64 in a day or two (machine with 10.5 is in office). |
@barracuda156 Thanks! |
@tuffnatty Will be done no later than tomorrow morning (like, in a few hours). Sorry for a bit of a delay. UPD. Yes, it works fine on 10.5.8 for |
b49e91e
to
c305ea5
Compare
@barracuda156 Thanks! Rebased the patch against master. |
@ryandesign upstream hasn't commented on the PR mentioned here where the same fix was suggested. Are you comfortable with merging this? If so, please do... if not let's close the PR. |
@reneeotten I have a strong suspicion that @ryandesign does not get notifications about being mentioned in PR threads. I'm unsure why. |
There are several ports that share this same issue -- conflicting SIMD flags for different arch builds. graphene is another one that comes up https://trac.macports.org/ticket/66888. Just turning all SIMD flags off in universal builds can work, but it is a somewhat costly approach to take as a whole, long-term, for MacPorts, performance-wise. It does get it fixed quickly, though. Of interest, when building universal, clang will use neon SIMD flags on arm64 but not on x86_86, and use SSE/etc flags on x86_64, but not on arm64, etc -- in other words -- it will do the right thing if all the SIMD flags are used. It is designed to do this, specifically for this problem. gcc used to be able to do that too, when Apple was involved in it ... but that was last as of gcc-4.2, so ancient times now. It's merging the pkgconfig flags that are the hiccup. So one thing to do rather than turn all the SIMD flags off, is turn them all ON. Another approach is to have different pkgconfig flags that will be used on different architectures. That way they don't have to be merged. This is actually possible, but would require some tooling in macports to support. This issue has come up before, in other projects, and I put some thoughts about the the approach here last year when I ran across this many times building universal: https://trac.macports.org/ticket/66894 I think ... turning the SIMD flags all ON and letting clang sort them out is probably the most expedient thing to do. Until gcc is updated, they might have to be turned all off there, but it would be unfortunate to disable all the acceleration for all newer systems all the time to support an ancient architecture virtually nobody builds any longer. Sorting out arch-specific pkgconfig flags as per https://trac.macports.org/ticket/66894 has a tantalizing elegance about it.... but not sure anyone is going to actually take the bull by the horns and do that project. There are only a few folks around MacPorts with the skill set for a project like that -- Marcus, Josh, Ryan, etc -- and they are all pretty busy right now. |
@kencu As long as turning all SIMD flags on is clang-only, and for gcc builds they are either set correctly per-arch or at least turned off for all universal builds, all should be good. |
So can we finish this? |
If this is rushed, the temptation would be to just disable all SIMD when building universal, and that (IMHO) would be the wrong thing to do. Imagine sorting out build or functional issues when the library you're linking against is built differently depending on whether it was installed universal or not. Nightmare. It needs to be sorted out properly, and what is done here will be done elsewhere. |
@kencu wrote:
This is not a very accurate description of what my patch does. My patch does not break or slow down any build that was working well before the patch. It only disables passing wrong flags in a cross-platform build situation. Pure PowerPC and ARM builds did never use any SIMD flags. Only when building on Intel for a non-native target, Intel SIMD flags were being passed erroneously. That is, the SIMD flags were ever used by upstream only on Linux+Intel, Darwin+Intel graphite2 builds, ignoring other OSes and CPUs. By using muniversal, the Intel part of the fat binary is built with SIMD flags, and the ARM/PowerPC part is built without, but SIMD was never supported by upstream for PowerPC anyway, and for ARM64, SIMD support is on by default. Probably if someone could recommend proper PowerPC flags to me (-faltivec on G4+?) and help with PowerPC testing, it would be possible to add this feature [to upstream as well]. But as for now, it seems that my patch and/or its context has been misunderstood. |
@tuffnatty Perhaps fixing it in upstream first will be a better approach, since a) that benefits everyone and not only Macports users, and b) once something is merged to upstream, it will be merged here without any argument. AFAIK modern GCC can use |
OK, reopened as misunderstood, for @ryandesign to deal with as he sees fit as port maintainer. Apologies for responding to @pmetzger 's request to keep the queue clean. I will leave the queue full. This PR has been sitting here for about a year now, though. |
currently, graphite2 does not use the muniversal portgroup, by the way. https://github.com/macports/macports-ports/blob/master/graphics/graphite2/Portfile |
Ah, sorry, I missed that your PR here adds muniversal. My bad. |
I am also sorry about that. The upstream maintainers seem to not have reviewed any PRs since December 2022. |
@tuffnatty if you wanted to move this along, you might prove/demonstrate that the universal build with your enhancements here works, first of all. Then show that it builds with all SIMD enabled, ideally both on an Intel system building universal, and on an arm system building universal, if you have the hardware available. With that info in hand and confirmed, committing this would be much easier for someone. |
Is the benefit of a universal build worth the effort here? Is it going to save enough people enough time over the long run? |
@kencu Thank you for your suggestion. What kind of proof would be enough? https://github.com/tuffnatty/sqlitestudio/actions/runs/7202979501/job/19622098015 - this is a GitHub Actions workflow run that uses this patch to build qt5 +universal on Intel GitHub runners. It was not created in order to prove anything, but to have a MacOS universal Qt5 SDK that could be used on GitHub runners. So it does only prove that graphite2 builds +universal with this patch, allowing to build +universal further dependencies. You can also see from the log that the Intel compiler invocations include
@pmetzger Thank you, it's probably a valid question. How many saved people should I bring? |
So that is a universal build built on an Intel system? Look like. And the Intel SSE is enabled. What happens when the universal build is done on an arm system? |
That's a valid question as well. On a M1/darwin21 system, I see no SSE flags passed in the universal build, which still results in a binary using xmmN registers on x86_64 and vN registers on arm64:
The symbol names are mangled, but it's probably due to a different XCode version being used. |
On an M1 darwin system, does the universal build generate SSE code in the intel portion? Because that is what I am afraid it won't, and that is what we need to fix if it doesn't. All SIMD enabled, basically, is what we need. Same build on either Intel or arm. |
And yes, this stuff is tricky to get right, I know that. This kind of detail is what slows down commits that we didn't vet ourselves. |
|
I am having trouble understanding you. Can you be more clear what your problem is? |
In the comment you were seemingly replying to, I have addressed exactly this question. |
so no SSE flags are passed in the logs you can see but I can’t, yet you find SSE is still used nonethless? OK, that is your submission then. Leave it for Ryan now. |
Right. Regarding the logs I can see but you can't: I can paste the logs here, but would it give you more trust (I am not saying they have any evidence of SSE2 switched on; I am saying, on the opposite, they are missing the flags)? I can tell you how to easily reproduce my findings:
|
OK, I got you. No SSE flags used, but you find SSE used anyway. I don’t understand it, but I hear you. Ryan can decide what he wants to do.. Thanks for your submission. |
It's very simple. https://godbolt.org/z/xezf5cTsd Here is an example of auto-vectorizable loop. On godbolt.org, I see that passing |
If I change the ints in the example to floats, the loop is auto-vectorized no matter which flags I pass, even with the oldest versions of Clang and GCC available on https://godbolt.org/z/1eM9j7vzs . |
More info. graphite2's documentation:
So, I am interpreting this as: the So, if a universal arm/x86-32 build was a thing, it could be broken by not passing BTW, the initial bug I was fixing (using ${CMAKE_SYSTEM_PROCESSOR} as the host processor while it seems to mean the build processor in this case) was added to upstream by @ryandesign in an attempt to fix the PowerPC build: silnrsi/graphite@4f78a29 . |
if the build is using SSE code when it’s instructed not to (ie SSE is not enabled) that is presumably a build error that needs fixing. I’m sorry I don’t have more time to deep dive on this in Godbolt, etc. You’ve laid out a good case, thank you. Ryan will be along to assess his port in due course. |
Sorry for not responding here for so long. I do get notifications but I don't click on most of them because it has been my experience that often clicking on a notification results in me spending an hour reading lots of comments, having to think about a lot of things I haven't though about in awhile, and then still often not arriving at a solution.
It was added by me after being suggested by @barracuda156 in https://trac.macports.org/ticket/64532. I see that @barracuda156 already commented on that commit suggesting the use of Have you already read my comment at https://trac.macports.org/ticket/64532#comment:4 where I said |
I'm wrong, it was mentioned by @barracuda156 there but it was added in 88eb463 . |
clang should just do the right thing here, but it is not doing so. perhaps there is a reason it will not smoothly handle the multi-arch build when using those flags.... I think it should... but there may be subtleties going on that I don't know about. I opened an issue with llvm upstream to see what they think / would recommend for a general fix for this. This issue does arise in other projects too. |
It seems to me that although people are interested in this, it's not being worked on actively enough to leave in the queue. |
I'm closing this for now. It can be re-opened if someone wishes to work on it. |
Description
Fix universal build for graphics/graphite2 which used to wrongly specify SSE flags if the build machine is Intel. There is a pull request in upstream: silnrsi/graphite#84
Type(s)
Tested on
macOS 11.7.4
Xcode 13 / Command Line Tools 13.2.0.0.1.1638488800
Verification
Have you
sudo port -vst install
?