OSX: performance issues within Clang + duplicate symbols with g++-13 #1755

Closed
alaingdl opened this issue Feb 22, 2024 · 7 comments

Comments

@alaingdl
Contributor

OK, the performance issue within FOR loops, first detected on Mac M2/M3, is in fact also present on x86_64.

  • Linux U22 my old laptop / gcc
    time_test4 : 1.06796=Total Time

The OSX versions here were compiled with the script, and OpenMP is declared as ON
(tests 4, 5, 16 and 25 are all bad, and test 2 has also regressed since clang 17 :( )

  • OSX gdl-1.0.2git230313 : clang 15.0.7_1
    time_test4 : 21.8063=Total Time

  • OSX gdl-1.0.2git230420 : clang 16.0.1
    time_test4 : 21.9463=Total Time
    (case 2 : 0.206096 Foreach, 6000000 elements)

  • OSX gdl-1.0.3git231123CMake: clang 17.0.4
    time_test4 : 70.3176=Total Time
    (case 2 : 49.0947 Foreach, 6000000 elements)

  • OSX gdl-1.0.4git240222CMake: clang 17.0.6_1
    time_test4 : 69.5246=Total Time

Unfortunately I cannot finish the compilation with GCC 13 because of duplicate symbols:

CC=/usr/local/bin/gcc-13 CXX=/usr/local/bin/g++-13 cmake .. -DREADLINE=no -DHDF=OFF -DHDF5=OFF -DPYTHON=off -DGRAPHICSMAGICK=off -DMAGICK=OFF -DWXWIDGETS=off -DQHULL=off

[...]  // the first ones 

[ 15%] Linking CXX executable gdl
duplicate symbol '__ZTS5Data_I10SpDComplexE' in:
    CMakeFiles/gdl.dir/datatypes.cpp.o
    CMakeFiles/gdl.dir/basic_op.cpp.o
duplicate symbol '__ZTI5Data_I10SpDComplexE' in:
    CMakeFiles/gdl.dir/datatypes.cpp.o
    CMakeFiles/gdl.dir/basic_op.cpp.o


[...] // the last ones

duplicate symbol '__ZTS5Data_I9SpDLong64E' in:
    CMakeFiles/gdl.dir/datatypes.cpp.o
    CMakeFiles/gdl.dir/ofmt.cpp.o
duplicate symbol '__ZTI5Data_I9SpDLong64E' in:
    CMakeFiles/gdl.dir/datatypes.cpp.o
    CMakeFiles/gdl.dir/ofmt.cpp.o
ld: 252 duplicate symbols for architecture x86_64
collect2: error: ld returned 1 exit status

datatypes.cpp.o is always involved ...
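
To read the linker log more easily, the mangled names can be demangled. The sketch below is only an illustration, not part of GDL: the helper name `demangle_macho` is hypothetical, and it relies on `abi::__cxa_demangle` from `<cxxabi.h>` (available with GCC and Clang). The macOS linker prints symbols with one extra leading underscore, which is dropped first. Under these assumptions, `__ZTS5Data_I10SpDComplexE` reads as the typeinfo name for `Data_<SpDComplex>`, and the `__ZTI...` variant as the typeinfo object itself.

```cpp
#include <cassert>
#include <cstdlib>    // std::free
#include <cxxabi.h>   // abi::__cxa_demangle (GCC and Clang)
#include <string>

// Hypothetical helper: demangle a symbol as printed by the macOS linker.
// Mach-O adds one leading '_' to every symbol, so "__ZTS..." is the
// Itanium-ABI name "_ZTS..." once that prefix is removed.
std::string demangle_macho(const std::string& sym) {
    std::string name = sym;
    if (name.rfind("__Z", 0) == 0) name.erase(0, 1);   // drop the Mach-O prefix
    if (name.rfind("_Z", 0) != 0) return sym;          // not a mangled C++ name
    int status = 0;
    char* out = abi::__cxa_demangle(name.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) return sym;     // leave unknown input as-is
    std::string result(out);
    std::free(out);                                    // buffer is malloc'ed by the ABI
    return result;
}

// Self-check on two symbols quoted in the linker output above.
void check_examples() {
    assert(demangle_macho("__ZTS5Data_I10SpDComplexE").find("Data_<SpDComplex>")
           != std::string::npos);
    assert(demangle_macho("__ZTI5Data_I9SpDLong64E").find("Data_<SpDLong64>")
           != std::string::npos);
}
```

Both kinds of duplicates are therefore type-information symbols for specialisations of the `Data_` template, which fits the observation that datatypes.cpp.o appears in every pair.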

@GillesDuvert
Contributor

I confirm that -fsanitize=address makes gdl 100 times faster for code related to memory transfer (copy from variable to variable) on a Mac mini with M1.
The code to be tested is simple:

GDL> tic & for i=1L,600000 do a=1 & toc
% Time elapsed : 4.5299740 seconds.

which takes 0.057317972 seconds on my Intel Linux laptop (gcc compiler, no Eigen).
As

GDL> tic & for i=1L,600000 do a=a & toc
% Time elapsed : 0.019397974 seconds.

is internally optimised to do nothing (a=a !!!), the 0.019397974 seconds measures the speed of the empty loop, which is OK.

This narrows the problem down to a very small number of code lines, essentially what happens in "a=1".

@GillesDuvert
Contributor

@alaingdl the multiply-defined symbols have already been encountered (#677, #734), and should indeed be avoided. However, there have always been compiler options to circumvent that problem, which arises only on a limited number of platforms.

@GillesDuvert
Contributor

CULPRIT FOUND!!!

On OSX, for obscure historical reasons, and given that the system defines HAVE_MALLOC_ZONE_STATISTICS and HAVE_MALLOC_MALLOC_H, the innermost code for destruction of variables would call the obscure UpdateCurrent() function to report precise memory usage. The loss of time is tremendous, and would have shown up in a profiler as an enormous number of calls to strange functions like malloc_zone_statistics() etc.

Making UpdateCurrent() just return solves the speed problem: time_test4 drops to 1 sec.
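
The pattern described above can be sketched as follows. This is not GDL's actual code, which is not shown in this thread: the struct `MemTrackerSketch`, its fields, and `run_loop` are hypothetical names for illustration. Only `UpdateCurrent()` and `malloc_zone_statistics()` come from the discussion. On macOS the expensive call queries the allocator; elsewhere the sketch only counts how often the expensive path would be taken.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__APPLE__)
#include <malloc/malloc.h>   // malloc_zone_statistics, malloc_statistics_t
#endif

// Hypothetical stand-in for a memory-usage tracker.
struct MemTrackerSketch {
    std::size_t current = 0;           // last sampled number of bytes in use
    unsigned long samples = 0;         // how many times the allocator was queried
    bool sample_on_every_call = false; // false models the "just return" fix

    // Expensive path: asks the allocator for statistics.
    void Sample() {
        ++samples;
#if defined(__APPLE__)
        malloc_statistics_t stats;
        malloc_zone_statistics(nullptr, &stats);  // nullptr: sum over all zones
        current = stats.size_in_use;
#endif
    }

    // Called from the hot path (variable destruction, per the comment above).
    void UpdateCurrent() {
        if (!sample_on_every_call) return;  // early return keeps the loop cheap
        Sample();
    }
};

// Models a tight loop such as "for i=1L,600000 do a=1": one call per iteration.
unsigned long run_loop(MemTrackerSketch& tracker, unsigned long iterations) {
    tracker.samples = 0;
    for (unsigned long i = 0; i < iterations; ++i) tracker.UpdateCurrent();
    return tracker.samples;
}

// Self-check: the early return removes every allocator query from the loop.
void demo() {
    MemTrackerSketch tracker;
    assert(run_loop(tracker, 600000) == 0);
    tracker.sample_on_every_call = true;
    assert(run_loop(tracker, 1000) == 1000);
}
```

With per-call sampling enabled, every loop iteration pays for an allocator query, which is the cost the early return removes. If precise figures are still wanted, calling `Sample()` only when the memory report is actually requested would be one possible design; whether GDL does this is not stated here.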

@GillesDuvert
Contributor

Just committed the one-liner that is supposed to do wonders.

@alaingdl
Contributor Author

alaingdl commented Mar 2, 2024

@GillesDuvert: brilliant! Thanks

Tested on an Intel OSX machine, using the script ...

GDL> time_test4
[...]
      1.10098=Total Time,      0.021701576=Geometric mean,      25 tests.

GDL> TEST_LOOPS
% Time elapsed : 0.0098431110 seconds.
% Time elapsed : 0.010197878 seconds.
% Time elapsed : 0.0053970814 seconds.
% Time elapsed : 0.0092120171 seconds.

@brandy125

brandy125 commented Mar 2, 2024 via email

@GillesDuvert
Contributor

#1776
