Not an Issue, but close: M1 compilation and speed #1498
Comments
Apparently we have no problem with alignment. I mean, no error is reported. |
We currently align on 16 bytes if Eigen is included in the build. This is what is recommended, and the same applies on M1. |
FOREACH on M1 takes exactly the same time whatever the size of the array: 600, 6000, 6000000...! |
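For readers who want to verify the 16-byte alignment claim on their own machine, here is a minimal C++ sketch. It is not GDL or Eigen code: `is_aligned`, `AlignedBlock` and `stack_block_is_aligned` are hypothetical names introduced only for illustration, and the sketch merely checks that an object declared with `alignas(16)` lands on a 16-byte boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of GDL or Eigen): true when `p` sits on an
// `alignment`-byte boundary.
bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);  // a zero alignment would divide by zero below
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical stand-in for a data buffer; alignas(16) requests the 16-byte
// alignment discussed in the comment above.
struct alignas(16) AlignedBlock {
    double values[4];
};

// The request is visible at compile time through alignof.
static_assert(alignof(AlignedBlock) == 16, "AlignedBlock must be 16-byte aligned");

// Runtime check on an actual stack instance.
bool stack_block_is_aligned() {
    AlignedBlock block{};
    return is_aligned(&block, 16);
}
```

Calling `stack_block_is_aligned()` from a small test program built with the same compiler as GDL would show whether that toolchain honours the request. It says nothing about how GDL itself allocates its variables.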
In fact it is the deletion of an object (here a loop variable) that seems to be the culprit. |
eh eh eh GDL's FOREACH code is too good. See #1500 |
OK folks, the only reasonable way to get GDL compiled with OpenMP on a M1 (here a mac mini with
Note: INTERACTIVE_GRAPHICS=OFF because MacPorts' plplot is not compiled with DYNAMIC_DRIVERS; this will be solved easily. With OpenMP on, I have test_all_basic_functions down to 5 seconds, now twice as fast as my old BUT: time_test4 chokes on
This is related to the strange slowness of deleting objects. |
I stopped looking for a way to use OpenMP on Mac three years ago: every release they change some details inside... A total mess and a waste of time. One point: did you try a compiler other than Clang? It is quite easy to switch to a true GCC... |
indeed, I tried with
Of course, it may well be that the simplest solution is to use Brew and not MacPorts, I'll do tests. |
I was able to recompile on M1 using the version of |
Ah, and there's a problem when using the PROJ library (crash in test_map.pro) |
The PROJ library problem may just arise from an incompatibility with the installed versions. Is it tricky, or does it just need a debugger? |
Proj library problem above was just a version problem |
now on #1675 |
I gained intermittent access to an M2 running OSX. The good news: yes, the script works fine and OpenMP is activated. The very bad news: performance is horrible in some places, as already reported here by @GillesDuvert.
On my old laptop (4 cores, i5)
|
@alaingdl Having a fast GDL on M1 and M2 seems an important goal. The slowness of creation/destruction of C++ objects is quite possibly the culprit. This in turn may be due to the compiler + the platform, as the same code is fast everywhere else (including Apple x86). It would be interesting to test GDL on an M1 running Linux rather than OSX to confirm that. Perhaps simpler, or rather better, C++ code, such as adding 'const' or 'final' wherever possible, would make faster code on OSX+M1. Using the Apple compiler may also be an asset. These are things I am not in a position to do, having no Apple machine or license. |
I have a meeting on Monday with a colleague who removed OSX from an Apple laptop and installed an ARM-based Linux! I hope it will clarify this serious question. Is it a Clang problem? Later I will try to play with the Clang options... |
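The hypothesis about slow creation/destruction of C++ objects can be probed outside GDL with a small micro-benchmark. The sketch below is only an illustration under stated assumptions: `Base`, `Scalar`, `ChurnResult` and `time_object_churn` are hypothetical names, not GDL classes, and the loop simply allocates and deletes one small polymorphic object per iteration, loosely mimicking a loop variable being recreated.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical stand-in for a small heap-allocated interpreter variable.
struct Base {
    virtual ~Base() = default;
    virtual long value() const = 0;
};

struct Scalar final : Base {
    explicit Scalar(long v) : v_(v) {}
    long value() const override { return v_; }
    long v_;
};

struct ChurnResult {
    long checksum;   // sum of all values read back from the objects
    double seconds;  // wall-clock time spent in the loop
};

// Allocates, reads and deletes one object per iteration and times the loop.
ChurnResult time_object_churn(std::size_t iterations) {
    using clock = std::chrono::steady_clock;
    long checksum = 0;
    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        Base* obj = new Scalar(static_cast<long>(i));
        checksum += obj->value();
        delete obj;
    }
    const std::chrono::duration<double> elapsed = clock::now() - start;
    // Sanity check: every object was constructed and read exactly once.
    const long n = static_cast<long>(iterations);
    assert(checksum == n * (n - 1) / 2);
    return {checksum, elapsed.count()};
}
```

Running `time_object_churn` with the same iteration count on OSX+M1 and on Linux, with the same compiler and optimisation level, would give a rough comparison. Note that an optimizing compiler may remove the allocation entirely, so the numbers are only indicative.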
I wonder if @opoplawski would confirm GDL compiles on linux M1... If they have such a machine in their huge rebuild system. The other way around is to check with Apple clang, as I did here, including the openmp bypass described in the same comment. |
So, we (Fedora) don't necessarily build on M1/M2 - but we do build for Linux on aarch64 in general. I started poking at time_test4 - which seems to be part of IDL. I tried running the version from IDL 8.7 with gdl 1.0.2 and got:
I'm not finding any other source for time_test4. What am I missing? |
I think I figured out how to run time_test4 from IDL 8.7 - here is the output from an aarch64 Fedora builder:
This appears to be an 8 core VM. cpuinfo reports:
|
Thank you so much! Clear, good results :) Since those results are in line with the ones on x86, I would say that Clang or OSX does have a problem with the default usage we make of it (do we have to look at the options and flags?) |
Many thanks, @opoplawski !! |
Thanks to an Apple M2 running an Ubuntu Linux ARM VM with 8 cores, I was able to replicate the results from @opoplawski. On the other side, on my old laptop (x86), I installed clang-15 and the numbers are as good as with gcc. Could someone remind me how to switch off most options when compiling with Clang? Thanks! |
@alaingdl nice results, too. In GDL's cmake or in build_gdl.sh I see no specific clang options/flags (???). |
solved see #1755 |
I have a login on a remote mac mini with M1.
The system prefers to use MacPorts, which is not a problem per se. The main defect is that, as with Homebrew, plplot is compiled without dynamic drivers, so unless we patch the plplot MacPort (something build_gdl.sh already does for Homebrew's plplot), there will be no 3D support and no wxWidgets PLOT windows. Otherwise, all the components for GDL are here.
The main issue was to get OpenMP. With the current state of CMakeLists.txt, it is impossible to get it with Apple clang. I succeeded only with
sudo port install clang-14
-DCMAKE_CXX_COMPILER="/opt/local/bin/clang++-mp-14" -DCMAKE_C_COMPILER="/opt/local/bin/clang-mp-14"
Results:
OMP is here, because
test_all_basic_functions,size=1000000 gives
% Time elapsed ALL TESTS: 3.9622629 seconds.
That's 2 times faster than on my (old) Intel(R) Core(TM) i7-4710MQ CPU @ 2.50GHz.
BUT: time_test4 shows an incredible loss of time, in particular a bottleneck in FOREACH (44.5 s where my computer does it in 0.13 s).
Interestingly, compiling without Eigen3 roughly halves the time spent in the previous tests (but, as expected, produces terrible results for test_all_basic_functions: 30 seconds).
As Eigen is related to the alignment of our variables in memory, there may be something to investigate here.
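Since getting OpenMP was the main issue described above, it can help to confirm independently that a given compiler really enables it. The following C++ sketch is not taken from GDL: `openmp_enabled`, `max_threads` and `parallel_sum` are hypothetical helper names. It relies only on the standard `_OPENMP` macro and `omp_get_max_threads()`, and it still compiles (serially) when OpenMP is off.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// True when the translation unit was compiled with OpenMP enabled
// (for example with -fopenmp on GCC or a MacPorts clang).
bool openmp_enabled() {
#ifdef _OPENMP
    return true;
#else
    return false;
#endif
}

// Number of threads OpenMP would use; 1 when OpenMP is not available.
int max_threads() {
#ifdef _OPENMP
    const int n = omp_get_max_threads();
    assert(n >= 1);
    return n;
#else
    return 1;
#endif
}

// Sums a vector, in parallel when OpenMP is on and serially otherwise.
double parallel_sum(const std::vector<double>& data) {
    double sum = 0.0;
    const long n = static_cast<long>(data.size());
#ifdef _OPENMP
#pragma omp parallel for reduction(+ : sum)
#endif
    for (long i = 0; i < n; ++i) {
        sum += data[static_cast<std::size_t>(i)];
    }
    return sum;
}
```

If `openmp_enabled()` returns false with the compiler and flags used for the GDL build, the build itself is most likely running without OpenMP as well.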