Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compiling perl-5.37.3 from source is slow with GCC 12.2.0 MSVCRT #114

Closed
sisyphus opened this issue Aug 24, 2022 · 18 comments
Closed

Compiling perl-5.37.3 from source is slow with GCC 12.2.0 MSVCRT #114

sisyphus opened this issue Aug 24, 2022 · 18 comments

Comments

@sisyphus
Copy link

With gcc-12.1.0 (x86_64, msvcrt) building perl-5.37.3 from source took 11 minutes 20 seconds.
With gcc-12.2.0(x86_64, msvcrt) the same task took 23 minutes.

I ran both builds concurrently in 2 separate cmd.exe shells, side by side on my Windows 7 system - same perl source, same build options, same everything except for the winlibs toolchain.

When building perl, it's not all just compilation of C code, but it's the compilations of the various C files that stand out as being about twice as slow.
I certainly didn't need to run the comparison to realize that gcc-12.2.0 was compiling the C code much more slowly.

AFAICS, there's no difference in the final product produced by gcc-12.1.0 and gcc-12.2.0.
Both perl.exe files are 25600 bytes, both perl537.dll files are 3127808 bytes and the performance of both perls is comparable.
I ran the perl test suite for both perls concurrently, and they ran at the same speed.

Thank you, @brechtsanders, for the work you do in providing these builds.

Cheers,
Rob

@brechtsanders
Copy link
Owner

I had the impression it wasn't much faster, but hadn't noticed yet it was actually slower.

Have you reported this to GCC development?

@sisyphus
Copy link
Author

I've just tried building gmp-6.2.1 in the MSYS2 shell, and found that the speed disparity between 12.1.0 and 12.2.0 is not as great, but still quite significant.
The 'make' step took 3 min 20 sec with 12.1.0, and 4 min 20 sec with gcc-12.2.0.
Same results with both static and dynamic builds of gmp-6.2.1.

Have you reported this to GCC development?

No. I'll do that tomorrow.
(I'll wait until then in case there's any further comments about what I've found so far.)

@brechtsanders
Copy link
Owner

brechtsanders commented Aug 24, 2022

I just tried building gmp-6.2.1 (static+shared) in my build environment with gcc 12.1.0 and 12.2.0 at the same time on the same system and both build processes ran absolutely parallel and finished about the same time too.

So I couldn't really reproduce it.

Note that I was using a RAM drive (using ImDisk) both to build on and as TMP location.

@brechtsanders
Copy link
Owner

brechtsanders commented Aug 24, 2022

Tried building gmp-6.2.1 (static+shared) again, this time building on local disk (NVMe disk, TMP still pointed to RAM drive).

  • GCC 12.1.0 (ucrt): 12 minutes 40 seconds
  • GCC 12.2.0 (ucrt): 13 minutes 0 seconds

So it was 20 seconds slower (or between 2% and 3%).

Just in case, I will now also try without pointing TMP to my RAM drive....

@brechtsanders
Copy link
Owner

New results, completely using disk GCC 12.2.0 (ucrt) build 1 second faster.

@brechtsanders
Copy link
Owner

brechtsanders commented Aug 24, 2022

Final try: build gmp-6.2.1 (static+shared) on RAM disk (TMP also on RAM disk), antivirus disabled:

  • GCC 12.1.0 (ucrt): 5 minutes 18 seconds
  • GCC 12.2.0 (ucrt): 5 minutes 15 seconds

Timing includes configure and make twice (once static and once shared).
RAM drive (32 GB) via ImDisk
CPU: AMD Ryzen 9 5950X

So in this case the newer compiler was (slightly) faster.

Conclusion: I'm not able to reproduce your complaint.
Maybe you should figure out what is different on your system when using GCC 12.2.0 compared to 12.1.0, maybe there is something else at play...

@brechtsanders
Copy link
Owner

Can you share how you build Perl, so I can give that a try too?

@sisyphus
Copy link
Author

Yes, I find that the UCRT versions of gcc-12.1.0 and gcc-12.2.0 are quite consistent with each other wrt building gmp-6.2.1.
But I'm using the MSVCRT versions. (Sorry, I should have mentioned that more emphatically.)
Can you try the same comparison with gmp, but using the MSVCRT versions ?

AFAIK, perl cannot be built with the UCRT compilers.
To test with the perl source, just grab https://cpan.metacpan.org/authors/id/N/NE/NEILB/perl-5.37.3.tar.gz, unpack it and cd in cmd.exe shell to the win32folder.
Then using x86_64 builds of gcc (with msvcrt) run:

gmake CCHOME=C:\path_to_winlibs\mingw64 CCTYPE=GCC OPTIMIZE=-Os USE_MINGW_ANSI_STDIO=define

The CCHOME argument needs to identify the location of gcc's "bin" folder - which means that it will terminate with "\mingw64", and without a trailing backslash.
The CCHOME\bin folder needs to be in your path, too - and ahead of any other folder that contains a "gcc.exe".

I think that should be enough to provide a successful running of 'gmake', and to observe the issue.
(If you want to actually install the perl that you build, you probably want to provide an INST_TOP argument and perhaps some other arguments, too. In that case, see the documentation inside the win32/GNUmakefile.)

If you're unable to reproduce the issue I'm seeing, then I probably have to start wondering what it is about this Windows 7 system of mine that binds up this MSVCRT gcc-12.2.0 compiler.

In the meantime, I'll try the same experiment on a Windows 10 laptop that I have and see if that makes any difference

BTW, here's the (rough) timings for the gmp 'make' phase to complete on the Windows 7 machine using the 4 compilers that I've now tried:
12.1.0 UCRT: 4 min 25 sec
12.2.0 UCRT: 4 min 25 sec
12.1.0 MSVCRT: 3 min 20 sec
12.2.0 MSVCRT: 4 min 20 sec.
Interestingly, it's the MSVCRT 12.1.0 compiler that seems to be "the odd one out".

@brechtsanders
Copy link
Owner

brechtsanders commented Aug 24, 2022

I'm on Windows 11.

I just tried building gmp-6.2.1 (static+shared) again (on RAM drive, antivirus disabled) with MSVCRT GCC/MinGW-w64 (64-bit) versions. Timings for only the make install-strip steps (static+shared):

  • 12.1.0 MSVCRT: 1 min 46 sec
  • 12.2.0 MSVCRT: 1 min 50 sec

So I'm still not seeing the big difference you are seeing.

@sisyphus
Copy link
Author

On Windows 10, I'm still seeing a difference. It's a much slower system than Windows 7, and gmp's make stage took 11 minutes 40 with gcc-12.1.0 and 13 minutes 10 seconds with gcc-12.2.0.
In percentage terms, those timings are closer than they were on Windows 7.

As regards, the 'make' stage of the perl-5.37.3 build on Windows 10, gcc-12.2.0 is taking twice as long as gcc-12.1.0.
That is very consistent with what happens on Windows 7.
I've just repeated the experiment on Windows 7, this time running one build of perl followed by the other (instead of doing both builds simultaneously) and gcc-12.2.0 is still twice as slow as 12.1.0.

If you could run that test against perl-5.37.3, that would be helpful.
I acknowledge that if you're not seeing any difference in the gmp build times, then you're probably not going to see any difference in the perl build times either.
But confirmation of that would rule out there being something weird in the perl source.
(When I ran the test on Windows 10, I used the same trimmed back 'gmake' command as I recommended that you could use in my previous post. It worked fine.)

Other than that, I think I'll let this Issue sit for a day or 2 (if that's ok with you) and then close it as a "just me" issue if nothing enlightening has turned up.

I only installed gcc-12.2.0 last night.
Maybe it's still jetlagged after the long trip over ;-)

@sisyphus
Copy link
Author

On the Windows 7, while running the 'gmake' step of the perl-5.37.3 build, I've subsequently done a comparison of the time taken to process the first 2 files (toke.c and regcomp.c) that are compiled.

Compiling toke.c takes 5 seconds with gcc-12.1 and 19 seconds with gcc-12.2.
Compiling regcomp.c takes 9 seconds with gcc-12.1 and 34 seconds with gcc-12.2.
So the slowdown is actually closer to 4-fold.
(Presumably, non-compilation processes that are run during 'gmake' suffer little or no slowdown under gcc-12.2 - so 'gmake' takes only twice as long.)

Anyway, as this is a "just me" thing at this stage, I'll close this Issue.
I've started to "watch" the Issues that arise here, and I'll continue to do so in case someone else provides a report that confirms that it's not "just me".

@kenneth-olwing
Copy link

For the record, I've tested building Perl as well, and in summary I confirm @sisyphus results - 12.2.0 is clearly taking a much longer time than 12.2.1.

Basics:
I'm using Windows 11 (a fact that appears to be somewhat significant regarding optimization settings, see Perl/perl5#20136).
I have three toolchain observations:

  1. gcc 8.3.0 (as part of Strawberry Perl 5.32.1.1)
  2. winlibs-x86_64-posix-seh-gcc-12.1.0-llvm-14.0.6-mingw-w64msvcrt-10.0.0-r3
  3. winlibs-x86_64-posix-seh-gcc-12.2.0-llvm-14.0.6-mingw-w64msvcrt-10.0.0-r1

In the builds I have run like this (example for 12.1.0):

gmake -f GNUmakefile CCHOME=c:\ken1\apps\winlibs-x86_64-posix-seh-gcc-12.1.0-llvm-14.0.6-mingw-w64msvcrt-10.0.0-r3\mingw64 OPTIMIZE=-Os

I've done this from blead just now as 5.37.3 are missing some important build fixes, although not yet the optimization fix. Whether the optimization stuff is required for gcc 12 I have not yet ascertained.

For the specific examples made by @sisyphus I get the following:

gcc830: toke.c=4s, regcomp.c=7s, full build 7m36s
gcc1210: toke.c=5s, regcomp.c=8s, full build 9m17s
gcc1220: toke.c=15s, regcomp.c=26s, full build 17m48s

It's evident that there are clear differences; gcc1210 is just slightly slower than gcc830, but gcc1220 takes a giant leap to something like threefold for the individual files.

A very uneducated guess would be that maybe the optimization step(s) are so vastly improved that it does that better, but to a large cost in running time. Well...yeah...probably not, just tossing it out there...:-)

@sisyphus
Copy link
Author

sisyphus commented Aug 27, 2022

@kenneth-olwing, thank you for checking and confirming this.
I've just realized that, if we are interested only in seeing how long it takes for the first few files to compile we can just run gmake without any arguments at all, and that will use the default optimization level of -O2.
If we want to test with a different optimization level we can just run (eg) gmake OPTIMIZATION=-O0.
I find that, irrespective of the optimization level being used in the building of perl, gcc-12.2 is still around 3 times slower than gcc-12.1 (for the same optimization level).

Whether the optimization stuff is required for gcc 12 I have not yet ascertained.

The problems I experience building perl with x64 gcc-12.1 and -O2 are still present with gcc-12.2. (We haven't discussed that in this thread - I think it's probably a separate issue.)

gcc1210 is just slightly slower than gcc830

Yes, I had also noticed the same.

I've also just timed the compiling of toke.c and regcomp.c with 32-bit versions of gcc-12.1.0 and gcc-12.2.0.
Again the 3x slowdown with gcc-12.2 is evident.

Note: I claimed earlier that it was a 4x (or thereabouts) slowdown ... but I now think 3x is closer to the actual mark.

Before I report this to gcc (as @brechtsanders suggested earlier), I'll try to find out if linux folk are also seeing the same sort of thing when building perl with gcc-12.1 and gcc-12.2.

@brechtsanders
Copy link
Owner

ANother thing to check if it's really gcc slowing down or part of binutils.

I noticed that binutils added a bunch of Windows specific code related to opening files. In #112 this actually caused issues as that code is not perfect yet. Those issues were when gcc calls as during compilation.

I suspect the same added code may also have some performance overhead, but I haven't measured it yet.

@sisyphus
Copy link
Author

Another thing to check if it's really gcc slowing down or part of binutils.

I think it's gcc.
AFAICT, both @kenneth-olwing and myself have been timing commands like:

gcc -c  -I.\include -I. -I.. -DWIN32 -DWIN64 -DPERLDLL -DPERL_CORE -Os  -D__USE_MINGW_ANSI_STDIO -fwrapv -fno-strict-aliasing -DPERL_EXTERNAL_GLOB -DPERL_IS_MINIPERL -omini\toke.o  ..\toke.c
and
gcc -c  -I.\include -I. -I.. -DWIN32 -DWIN64 -DPERLDLL -DPERL_CORE -Os  -D__USE_MINGW_ANSI_STDIO -fwrapv -fno-strict-aliasing -DPERL_EXTERNAL_GLOB -DPERL_IS_MINIPERL -omini\regcomp.o  ..\regcomp.c

and there is no mention in either of those commands of any of the "binutils" commands listed at https://en.wikipedia.org/wiki/GNU_Binutils .

But I'm a bit out of my depth, here. Is it possible that binutils commands are involved (but unannounced) in the running of those compilations ?

@brechtsanders
Copy link
Owner

If you add -v to gcc it will tell you more about steps taken. Usually to compile .c into .o assembler code is generated by gcc and passed internally to as which actually generated the .o file.

@kenneth-olwing
Copy link

Not directly relevant here perhaps, but since I stated:

Whether the optimization stuff is required for gcc 12 I have not yet ascertained.

I now have - so FWIW, apparently -O2 doesn't work for 12.2.0 either, I get a couple of test fails for a Perl build unless I shift to -Os (however, the failing tests are different from the ones with 8.3.0)

@sisyphus
Copy link
Author

f you add -v to gcc it will tell you more about steps taken. Usually to compile .c into .o assembler code is generated by gcc and passed internally to as which actually generated the .o file.

I've tried that, and it's a rather weird result.
According to the output I see, it appears that all of the time is spent on doing nothing.

I've manually re-run the compiling of toke.c using the -v option:

gcc -v -c  -I.\include -I. -I.. -DWIN32 -DWIN64 -DPERLDLL -DPERL_CORE -Os  -D__USE_MINGW_ANSI_STDIO -fwrapv -fno-strict-aliasing -DPERL_EXTERNAL_GLOB -DPERL_IS_MINIPERL -omini\toke.o  ..\toke.c

Instantly, there's quite a few lines of output flash by (including an as command),
That output terminates with a line that announces:

Compiler executable checksum: 11e0db502eb2edd262aeb87d2525f2c4

Then the cmd.exe cursor just moves to the next line and blinks and blinks and blinks (without any announcement of what is being done) until the compilation completes instantly with some more lines of output (including another as command).
Task Manager shows that cpu usage remains at the same level during that long hiatus ... so something is being done, though there's no indication from the output of what that "something" actually is.

Could it be that the output is being buffered ? It's being sent to STDERR, and I have this notion that it's therefore not subject to buffering.

Here is the entire output of the command:

D:\comp-1220\perl-5.37\win32>gcc -v -c  -I.\include -I. -I.. -DWIN32 -DWIN64 -DPERLDLL -DPERL_CORE -Os  -D__USE_MINGW_ANSI_STDIO -fwrapv -fno-strict-aliasing -DPERL_EXTERNAL_GLOB -DPERL_IS_MINIPERL -omini\toke.o  ..\toke.c 
Using built-in specs.
COLLECT_GCC=gcc
OFFLOAD_TARGET_NAMES=nvptx-none
Target: x86_64-w64-mingw32
Configured with: ../configure --prefix=/r/winlibs64_stage/17520.gcc/inst_gcc-12.2.0/share/gcc --build=x86_64-w64-mingw32 --host=x86_64-w64-mingw32 --enable-offload-targets=nvptx-none --with-pkgversion='MinGW-W64 x86_64-msvcrt-posix-seh, built by Brecht Sanders' --with-tune=generic --enable-checking=release --enable-threads=posix --disable-sjlj-exceptions --disable-libunwind-exceptions --disable-serial-configure --disable-bootstrap --enable-host-shared --enable-plugin --disable-default-ssp --disable-rpath --disable-libstdcxx-debug --disable-version-specific-runtime-libs --with-stabs --disable-symvers --enable-languages=c,c++,fortran,lto,objc,obj-c++,jit --disable-gold --disable-nls --disable-stage1-checking --disable-win32-registry --disable-multilib --enable-ld --enable-libquadmath --enable-libada --enable-libssp --enable-libstdcxx --enable-lto --enable-fully-dynamic-string --enable-libgomp --enable-graphite --enable-mingw-wildcard --enable-libstdcxx-time --disable-libstdcxx-pch --with-mpc=/d/Prog/winlibs64_stage/custombuilt --with-mpfr=/d/Prog/winlibs64_stage/custombuilt --with-gmp=/d/Prog/winlibs64_stage/custombuilt --with-isl=/d/Prog/winlibs64_stage/custombuilt --enable-libstdcxx-backtrace --enable-install-libiberty --enable-__cxa_atexit --without-included-gettext --with-diagnostics-color=auto --enable-clocale=generic --with-libiconv --with-system-zlib --with-build-sysroot=/r/winlibs64_stage/17520.gcc/gcc-12.2.0/build_mingw/mingw-w64 CFLAGS='-I/d/Prog/winlibs64_stage/custombuilt/include/libdl-win32 -D__USE_MINGW_ACCESS' CXXFLAGS=-D__USE_MINGW_ACCESS LDFLAGS='-Wl,--disable-nxcompat -Wl,--disable-high-entropy-va -Wl,--disable-dynamicbase'
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.2.0 (MinGW-W64 x86_64-msvcrt-posix-seh, built by Brecht Sanders) 
COLLECT_GCC_OPTIONS='-v' '-c' '-I' '.\include' '-I' '.' '-I' '..' '-D' 'WIN32' '-D' 'WIN64' '-D' 'PERLDLL' '-D' 'PERL_CORE' '-Os' '-D' '__USE_MINGW_ANSI_STDIO' '-fwrapv' '-fno-strict-aliasing' '-D' 'PERL_EXTERNAL_GLOB' '-D' 'PERL_IS_MINIPERL' '-o' 'mini\toke.o' '-mtune=generic' '-march=x86-64' '-dumpdir' 'mini\'
 c:/winlibs-gcc-1220/mingw64/bin/../libexec/gcc/x86_64-w64-mingw32/12.2.0/cc1.exe -quiet -v -I .\include -I . -I .. -iprefix c:\winlibs-gcc-1220\mingw64\bin\../lib/gcc/x86_64-w64-mingw32/12.2.0/ -D_REENTRANT -D WIN32 -D WIN64 -D PERLDLL -D PERL_CORE -D __USE_MINGW_ANSI_STDIO -D PERL_EXTERNAL_GLOB -D PERL_IS_MINIPERL ..\toke.c -quiet -dumpdir mini\ -dumpbase toke.c -dumpbase-ext .c -mtune=generic -march=x86-64 -Os -version -fwrapv -fno-strict-aliasing -o C:\Users\Owner\AppData\Local\Temp\ccfxIfRn.s
GNU C17 (MinGW-W64 x86_64-msvcrt-posix-seh, built by Brecht Sanders) version 12.2.0 (x86_64-w64-mingw32)
	compiled by GNU C version 12.2.0, GMP version 6.2.1, MPFR version 4.1.0, MPC version 1.2.1, isl version UNKNOWN-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring duplicate directory "c:/winlibs-gcc-1220/mingw64/lib/gcc/../../lib/gcc/x86_64-w64-mingw32/12.2.0/include"
ignoring nonexistent directory "R:/winlibs64_stage/17520.gcc/inst_gcc-12.2.0/share/gcc/include"
ignoring nonexistent directory "/r/winlibs64_stage/17520.gcc/inst_gcc-12.2.0/share/gcc/include"
ignoring duplicate directory "c:/winlibs-gcc-1220/mingw64/lib/gcc/../../lib/gcc/x86_64-w64-mingw32/12.2.0/include-fixed"
ignoring duplicate directory "c:/winlibs-gcc-1220/mingw64/lib/gcc/../../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../x86_64-w64-mingw32/include"
ignoring nonexistent directory "/mingw/include"
#include "..." search starts here:
#include <...> search starts here:
 .\include
 .
 ..
 D:\_64\msys_1220\1.0\local\include
 c:\winlibs-gcc-1220\mingw64\bin\../lib/gcc/x86_64-w64-mingw32/12.2.0/include
 c:\winlibs-gcc-1220\mingw64\bin\../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../include
 c:\winlibs-gcc-1220\mingw64\bin\../lib/gcc/x86_64-w64-mingw32/12.2.0/include-fixed
 c:\winlibs-gcc-1220\mingw64\bin\../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../x86_64-w64-mingw32/include
End of search list.
GNU C17 (MinGW-W64 x86_64-msvcrt-posix-seh, built by Brecht Sanders) version 12.2.0 (x86_64-w64-mingw32)
	compiled by GNU C version 12.2.0, GMP version 6.2.1, MPFR version 4.1.0, MPC version 1.2.1, isl version UNKNOWN-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 11e0db502eb2edd262aeb87d2525f2c4
COLLECT_GCC_OPTIONS='-v' '-c' '-I' '.\include' '-I' '.' '-I' '..' '-D' 'WIN32' '-D' 'WIN64' '-D' 'PERLDLL' '-D' 'PERL_CORE' '-Os' '-D' '__USE_MINGW_ANSI_STDIO' '-fwrapv' '-fno-strict-aliasing' '-D' 'PERL_EXTERNAL_GLOB' '-D' 'PERL_IS_MINIPERL' '-o' 'mini\toke.o' '-mtune=generic' '-march=x86-64' '-dumpdir' 'mini\'
 c:/winlibs-gcc-1220/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../x86_64-w64-mingw32/bin/as.exe -v -I .\include -I . -I .. -o mini\toke.o C:\Users\Owner\AppData\Local\Temp\ccfxIfRn.s
GNU assembler version 2.39 (x86_64-w64-mingw32) using BFD version (Binutils for MinGW-W64 x86_64, built by Brecht Sanders) 2.39
COMPILER_PATH=c:/winlibs-gcc-1220/mingw64/bin/../libexec/gcc/x86_64-w64-mingw32/12.2.0/;c:/winlibs-gcc-1220/mingw64/bin/../libexec/gcc/;c:/winlibs-gcc-1220/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../x86_64-w64-mingw32/bin/
LIBRARY_PATH=c:/winlibs-gcc-1220/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/;c:/winlibs-gcc-1220/mingw64/bin/../lib/gcc/;D:/_64/msys_1220/1.0/local/lib/../lib/;c:/winlibs-gcc-1220/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../x86_64-w64-mingw32/lib/../lib/;c:/winlibs-gcc-1220/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../lib/;D:/_64/msys_1220/1.0/local/lib/;c:/winlibs-gcc-1220/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../x86_64-w64-mingw32/lib/;c:/winlibs-gcc-1220/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../
COLLECT_GCC_OPTIONS='-v' '-c' '-I' '.\include' '-I' '.' '-I' '..' '-D' 'WIN32' '-D' 'WIN64' '-D' 'PERLDLL' '-D' 'PERL_CORE' '-Os' '-D' '__USE_MINGW_ANSI_STDIO' '-fwrapv' '-fno-strict-aliasing' '-D' 'PERL_EXTERNAL_GLOB' '-D' 'PERL_IS_MINIPERL' '-o' 'mini\toke.o' '-mtune=generic' '-march=x86-64' '-dumpdir' 'mini\toke.'

Cheers,
Rob

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants