[MSWindows] 32-bit quadmath builds fail multiple tests. #21313

Closed
sisyphus opened this issue Jul 31, 2023 · 8 comments · Fixed by #21324

Comments

@sisyphus
Contributor

This issue pertains to MSWin32-x86-multi-thread-quadmath and MSWin32-x86-multi-thread-64int-quadmath builds of perl, going back over at least the last couple of years.
All other configurations are currently fine, irrespective of whether gcc's threads model is POSIX or MCF, and irrespective of whether gcc's runtime is msvcrt or ucrt.

For the record, at time of writing, current stable release is 5.38.0, and current devel release is 5.39.0.

Over the last year or two, I have played around with a number of mingw-w64 ports of gcc provided by nixman (https://github.com/niXman/mingw-builds-binaries/releases/), by lhmouse (https://gcc-mcf.lhmouse.com/) and by Brecht Sanders (https://winlibs.com).
Most of the ones that I've tried (and there have been quite a few) have been from the winlibs website.

The 32-bit quadmath builds always fail multiple tests (irrespective of IVSIZE) - except when using lhmouse's compilers, where all tests always pass (again, irrespective of IVSIZE).
The only 3 compilers from lhmouse that I've actually tried have been a 13.0.1 snapshot, the 13.1.0 (dated 2023-04-26), and the 13.2.1 snapshot (dated 2023-07-28). All three have successfully built these 32-bit quadmath perls 5.37.10 to 5.39.1, with all tests passing.
And all 3 were built with MCF-threads && msvc runtime.

From a discussion at
https://sourceforge.net/p/mingw-w64/mailman/mingw-w64-public/thread/CADZSBj2PDJj1E64zWeWUeZ%3DrvRj7_ETbHxb6p1pBKg-rf9BJ5Q%40mail.gmail.com/#msg37836780
I gather that lhmouse's gcc builds have Intel as the default assembly syntax, whereas AFAIK the other builders are still specifying AT&T as the default.
I don't know if the default of Intel could account for the success that I have in building 32-bit quadmath perls with lhmouse's compilers.
I can well envisage that there might be other differences, too - but the default assembly syntax is the only difference I know of.

When using Brecht Sanders' releases of gcc-13.1.0 built with MCF-threads && msvc runtime, the perls they build suffer multiple test failures.
Here is the 'gmake test' report for almost-current (commit 942fa8d) blead, using Brecht Sanders' gcc-13.1.0 built with MCF-threads && msvc runtime.

Test Summary Report
-------------------
op/fork.t                                                          (Wstat: 0 Tests: 28 Failed: 1)
  Failed test:  9
op/index_thr.t                                                     (Wstat: 1280 (exited 5) Tests: 58 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 415 tests but ran 58.
op/threads.t                                                       (Wstat: 1280 (exited 5) Tests: 3 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 30 tests but ran 3.
re/pat_advanced_thr.t                                              (Wstat: 1280 (exited 5) Tests: 1 Failed: 0)
  Non-zero exit status: 5
  Parse errors: No plan found in TAP output
re/pat_psycho_thr.t                                                (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: No plan found in TAP output
re/pat_re_eval_thr.t                                               (Wstat: 1280 (exited 5) Tests: 2 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 527 tests but ran 2.
re/pat_rt_report_thr.t                                             (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 2514 tests but ran 0.
re/pat_special_cc_thr.t                                            (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 9 tests but ran 0.
re/reg_email_thr.t                                                 (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: No plan found in TAP output
re/regexp_unicode_prop_thr.t                                       (Wstat: 1280 (exited 5) Tests: 2 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 1110 tests but ran 2.
re/speed_thr.t                                                     (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 59 tests but ran 0.
re/user_prop_race_thr.t                                            (Wstat: 0 Tests: 3 Failed: 1)
  Failed test:  1
win32/popen.t                                                      (Wstat: 0 Tests: 1 Failed: 1)
  Failed test:  1
win32/signal.t                                                     (Wstat: 1280 (exited 5) Tests: 2 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 4 tests but ran 2.
../cpan/File-Temp/t/fork.t                                         (Wstat: 1280 (exited 5) Tests: 2 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 8 tests but ran 2.
../cpan/Test-Simple/t/Legacy/Builder/fork_with_new_stdout.t        (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 2 tests but ran 0.
../cpan/Test-Simple/t/Legacy/Regression/683_thread_todo.t          (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: No plan found in TAP output
../cpan/Test-Simple/t/Legacy/subtest/fork.t                        (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 1 tests but ran 0.
../cpan/Test-Simple/t/Legacy/threads.t                             (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 6 tests but ran 0.
../cpan/Test-Simple/t/regression/fork_first.t                      (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: No plan found in TAP output
../cpan/Test-Simple/t/Test2/acceptance/try_it_threads.t            (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 6 tests but ran 0.
../cpan/Test-Simple/t/Test2/modules/Hub.t                          (Wstat: 1280 (exited 5) Tests: 2 Failed: 0)
  Non-zero exit status: 5
  Parse errors: No plan found in TAP output
../cpan/Test-Simple/t/Test2/regression/746-forking-subtest.t       (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: No plan found in TAP output
../dist/IO/t/io_multihomed.t                                       (Wstat: 1280 (exited 5) Tests: 5 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 8 tests but ran 5.
../dist/IO/t/io_sock.t                                             (Wstat: 1280 (exited 5) Tests: 17 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 26 tests but ran 17.
../dist/Storable/t/threads.t                                       (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 2 tests but ran 0.
../dist/Thread-Queue/t/01_basic.t                                  (Wstat: 1280 (exited 5) Tests: 2 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 81 tests but ran 2.
../dist/Thread-Queue/t/02_refs.t                                   (Wstat: 1280 (exited 5) Tests: 6 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 46 tests but ran 6.
../dist/Thread-Queue/t/03_peek.t                                   (Wstat: 1280 (exited 5) Tests: 1 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 19 tests but ran 1.
../dist/Thread-Queue/t/05_extract.t                                (Wstat: 1280 (exited 5) Tests: 8 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 20 tests but ran 8.
../dist/Thread-Queue/t/07_lock.t                                   (Wstat: 1280 (exited 5) Tests: 1 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 3 tests but ran 1.
../dist/Thread-Queue/t/09_ended.t                                  (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 60 tests but ran 0.
../dist/Thread-Queue/t/10_timed.t                                  (Wstat: 1280 (exited 5) Tests: 2 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 19 tests but ran 2.
../dist/Thread-Queue/t/11_limit.t                                  (Wstat: 1280 (exited 5) Tests: 1 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 13 tests but ran 1.
../dist/Thread-Semaphore/t/01_basic.t                              (Wstat: 1280 (exited 5) Tests: 3 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 10 tests but ran 3.
../dist/Thread-Semaphore/t/05_force.t                              (Wstat: 1280 (exited 5) Tests: 2 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 8 tests but ran 2.
../dist/Thread-Semaphore/t/06_timed.t                              (Wstat: 1280 (exited 5) Tests: 3 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 10 tests but ran 3.
../dist/threads-shared/t/object2.t                                 (Wstat: 1280 (exited 5) Tests: 131 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 133 tests but ran 131.
../dist/threads-shared/t/stress.t                                  (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 1 tests but ran 0.
../dist/threads-shared/t/wait.t                                    (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 91 tests but ran 0.
../dist/threads/t/free.t                                           (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 29 tests but ran 0.
../dist/threads/t/free2.t                                          (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 78 tests but ran 0.
../dist/threads/t/libc.t                                           (Wstat: 1280 (exited 5) Tests: 1 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 11 tests but ran 1.
../dist/threads/t/stack.t                                          (Wstat: 1280 (exited 5) Tests: 15 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 18 tests but ran 15.
../dist/threads/t/stress_cv.t                                      (Wstat: 1280 (exited 5) Tests: 30 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 61 tests but ran 30.
../dist/threads/t/stress_string.t                                  (Wstat: 1280 (exited 5) Tests: 31 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 61 tests but ran 31.
../dist/threads/t/thread.t                                         (Wstat: 1280 (exited 5) Tests: 14 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 35 tests but ran 14.
../ext/B/t/b.t                                                     (Wstat: 1280 (exited 5) Tests: 86 Failed: 0)
  Non-zero exit status: 5
  Parse errors: No plan found in TAP output
../ext/IPC-Open3/t/IPC-Open3.t                                     (Wstat: 0 Tests: 45 Failed: 0)
  TODO passed:   25
../ext/PerlIO-encoding/t/threads.t                                 (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 4 tests but ran 0.
../ext/XS-APItest/t/keyword_plugin_threads.t                       (Wstat: 0 Tests: 1 Failed: 1)
  Failed test:  1
../ext/XS-APItest/t/my_cxt.t                                       (Wstat: 1280 (exited 5) Tests: 7 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 16 tests but ran 7.
../dist/threads-shared/t/waithires.t                               (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 63 tests but ran 0.
Files=2771, Tests=1116289, 1157 wallclock secs ( 7.99 usr +  0.45 sys =  8.44 CPU)
Result: FAIL
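
A note on reading the report above: each Wstat value is the raw wait status, and the "exited N" figure next to it is the exit code carried in its high byte. A minimal sketch of that decoding, assuming the usual POSIX-style layout (the helper name wstat_exit_code is hypothetical and only for illustration):

```c
#include <assert.h>

/* Hypothetical helper: extract the exit code from a harness Wstat value.
 * Assumes the POSIX-style layout, with the exit code in bits 8..15 and
 * the low byte reserved for signal information. */
static int wstat_exit_code(int wstat)
{
    assert(wstat >= 0);
    return (wstat >> 8) & 0xff;
}
```

With the values that appear in these reports, 1280 decodes to 5, 2304 to 9 and 256 to 1, matching the "exited" figures the harness prints.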

There are far fewer failures when the same perl source is built using Brecht Sanders' gcc-13.1.0 that is built with POSIX-threads && msvc runtime:

Test Summary Report
-------------------
op/index_thr.t                                                     (Wstat: 0 Tests: 412 Failed: 0)
  Parse errors: Bad plan.  You planned 415 tests but ran 412.
op/substr_thr.t                                                    (Wstat: 256 (exited 1) Tests: 153 Failed: 0)
  Non-zero exit status: 1
  Parse errors: Bad plan.  You planned 400 tests but ran 153.
op/threads.t                                                       (Wstat: 2304 (exited 9) Tests: 9 Failed: 0)
  Non-zero exit status: 9
  Parse errors: Bad plan.  You planned 30 tests but ran 9.
re/pat_psycho_thr.t                                                (Wstat: 256 (exited 1) Tests: 13 Failed: 0)
  Non-zero exit status: 1
  Parse errors: Bad plan.  You planned 15 tests but ran 13.
re/speed_thr.t                                                     (Wstat: 256 (exited 1) Tests: 25 Failed: 0)
  Non-zero exit status: 1
  Parse errors: Bad plan.  You planned 59 tests but ran 25.
../dist/threads/t/thread.t                                         (Wstat: 0 Tests: 35 Failed: 2)
  Failed tests:  15, 20
../ext/IPC-Open3/t/IPC-Open3.t                                     (Wstat: 0 Tests: 45 Failed: 0)
  TODO passed:   25
../ext/XS-APItest/t/keyword_plugin_threads.t                       (Wstat: 0 Tests: 1 Failed: 1)
  Failed test:  1
Files=2771, Tests=1123102, 2441 wallclock secs ( 7.25 usr +  0.30 sys =  7.55 CPU)
Result: FAIL

So we see that the number of failures increases markedly when Brecht Sanders' MCF-threads version of gcc-13.1.0 is used. (Though it's lhmouse's MCF-threads version that succeeds.)

Things look pretty much the same with Brecht Sanders' 13.2.0 compilers.
lhmouse hasn't provided a gcc-13.2.0, but the gcc-13.2.1 (snapshot) works fine.
(For the 32-bit compilers lhmouse provides only MCF-threads with msvc runtime.)

Perl configuration

> ..\perl -I..\lib -V
Summary of my perl5 (revision 5 version 39 subversion 2) configuration:
  Derived from:
  Platform:
    osname=MSWin32
    osvers=10.0.22621.1992
    archname=MSWin32-x86-multi-thread-quadmath
    uname=''
    config_args='undef'
    hint=recommended
    useposix=true
    d_sigaction=undef
    useithreads=define
    usemultiplicity=define
    use64bitint=undef
    use64bitall=undef
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='gcc'
    ccflags =' -DWIN32 -fdiagnostics-color=never -DPERL_TEXTMODE_SCRIPTS -DMULTIPLICITY -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D__USE_MINGW_ANSI_STDIO -fwrapv -fno-strict-aliasing -mms-bitfields'
    optimize='-O2'
    cppflags='-DWIN32'
    ccversion=''
    gccversion='13.1.0'
    gccosandvers=''
    intsize=4
    longsize=4
    ptrsize=4
    doublesize=8
    byteorder=1234
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=12
    longdblkind=3
    ivtype='long'
    ivsize=4
    nvtype='__float128'
    nvsize=16
    Off_t='long long'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='g++'
    ldflags ='-s -L"D:\perl-5.39.2-1310\lib\MSWin32-x86-multi-thread-quadmath\CORE" -L"C:\winpos-gcc-1310\mingw32\lib" -L"C:\winpos-gcc-1310\mingw32\i686-w64-mingw32\lib" -L"C:\winpos-gcc-1310\mingw32\lib\gcc\i686-w64-mingw32\13.1.0"'
    libpth=C:\winpos-gcc-1310\mingw32\lib C:\winpos-gcc-1310\mingw32\i686-w64-mingw32\lib C:\winpos-gcc-1310\mingw32\lib\gcc\i686-w64-mingw32\13.1.0 D:\_32\msys_1310\1.0\local\lib
    libs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32 -lquadmath
    perllibs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32 -lquadmath
    libc=
    so=dll
    useshrplib=true
    libperl=libperl539.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_win32.xs
    dlext=dll
    d_dlsymun=undef
    ccdlflags=' '
    cccdlflags=' '
    lddlflags='-shared -s -L"D:\perl-5.39.2-1310\lib\MSWin32-x86-multi-thread-quadmath\CORE" -L"C:\winpos-gcc-1310\mingw32\lib" -L"C:\winpos-gcc-1310\mingw32\i686-w64-mingw32\lib" -L"C:\winpos-gcc-1310\mingw32\lib\gcc\i686-w64-mingw32\13.1.0"'


Characteristics of this binary (from libperl):
  Compile-time options:
    HAS_LONG_DOUBLE
    HAS_STRTOLD
    HAS_TIMES
    HAVE_INTERP_INTERN
    MULTIPLICITY
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_HASH_FUNC_ZAPHOD32
    PERL_HASH_USE_SBOX32
    PERL_IMPLICIT_SYS
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_USE_SAFE_PUTENV
    USE_ITHREADS
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
    USE_QUADMATH
  Locally applied patches:
    uncommitted-changes
  Built under MSWin32
  Compiled at Jul 31 2023 11:46:25
  @INC:
    ..\lib
    D:/comp-1310/perl-5.39.2/lib

Cheers,
Rob

@sisyphus
Contributor Author

> This issue pertains to MSWin32-x86-multi-thread-quadmath and MSWin32-x86-multi-thread-64int-quadmath builds

Just to clarify - the issue does NOT pertain to the unthreaded MSWin32-x86-perlio-quadmath and MSWin32-x86-perlio-64int-quadmath builds.
They build fine, irrespective of whose compiler is used.
It's only when threads and quadmath get together on 32-bit builds that the issue can arise.

As an experiment I set EXTRACFLAGS to -masm=intel in the GNUmakefile and rebuilt perl using Sanders' gcc-13.1.0 compilers.
I don't think it made any difference re the MCF-threads compiler, but it did reduce the number of failing tests with the POSIX-threads compiler:

Test Summary Report
-------------------
op/threads.t                                                       (Wstat: 2304 (exited 9) Tests: 9 Failed: 0)
  Non-zero exit status: 9
  Parse errors: Bad plan.  You planned 30 tests but ran 9.
../dist/threads/t/thread.t                                         (Wstat: 0 Tests: 35 Failed: 4)
  Failed tests:  15-16, 19-20
../ext/IPC-Open3/t/IPC-Open3.t                                     (Wstat: 0 Tests: 45 Failed: 0)
  TODO passed:   25
../ext/XS-APItest/t/keyword_plugin_threads.t                       (Wstat: 0 Tests: 1 Failed: 1)
  Failed test:  1
Files=2771, Tests=1123506, 1295 wallclock secs ( 7.72 usr +  0.25 sys =  7.97 CPU)
Result: FAIL

That is, the op/index_thr.t, op/substr_thr.t, re/pat_psycho_thr.t and re/speed_thr.t failures were avoided.
I don't know what to make of that.

Cheers,
Rob

@tonycoz
Contributor

tonycoz commented Aug 2, 2023

The problem is stack alignment.

The point that at least one failing test crashes at is:

Dump of assembler code for function __letf2:
   0x6342bab0 <+0>:     push   %ebp
   0x6342bab1 <+1>:     push   %edi
   0x6342bab2 <+2>:     push   %esi
   0x6342bab3 <+3>:     push   %ebx
   0x6342bab4 <+4>:     sub    $0x6c,%esp
   0x6342bab7 <+7>:     fnstcw 0x4e(%esp)
=> 0x6342babb <+11>:    movdqa 0x80(%esp),%xmm0
   0x6342bac4 <+20>:    mov    0x88(%esp),%edi

(gdb) info register esp
esp            0x2c4fc98           0x2c4fc98

movdqa is "Move Aligned Double Quadword" which requires that the address be 16-byte aligned.
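
To make the misalignment concrete: the faulting instruction reads from 0x80(%esp), and with the esp value gdb reports, that address sits 8 bytes past a 16-byte boundary. A small sketch of the arithmetic (the helper names are hypothetical; the addresses are the ones from the gdb output above):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes by which addr lies past the
 * previous align-byte boundary. align must be a power of two. */
static unsigned misalignment(uintptr_t addr, unsigned align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (unsigned)(addr & (uintptr_t)(align - 1));
}

/* Operand address of "movdqa 0x80(%esp),%xmm0" for a given esp. */
static uintptr_t movdqa_operand(uintptr_t esp)
{
    return esp + 0x80;
}
```

For esp = 0x2c4fc98, misalignment(movdqa_operand(esp), 16) is 8 while misalignment(movdqa_operand(esp), 4) is 0, so the operand is 4-byte aligned but not 16-byte aligned, which is the case movdqa faults on.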

If I look at the similar code for lhmouse (note change to Intel syntax):

Dump of assembler code from 0x68c6bac0 to 0x68c6bae0:
   0x68c6bac0:  push   ebp
=> 0x68c6bac1:  push   edi
   0x68c6bac2:  push   esi
   0x68c6bac3:  push   ebx
   0x68c6bac4:  sub    esp,0x6c
   0x68c6bac7:  fnstcw WORD PTR [esp+0x4e]
   0x68c6bacb:  movdqu xmm0,XMMWORD PTR [esp+0x80]
   0x68c6bad4:  mov    edi,DWORD PTR [esp+0x88]
   0x68c6badb:  mov    eax,DWORD PTR [esp+0x80]

This uses the movdqu instruction, which allows for unaligned addresses.

__letf2 (aka __lttf2) is a function used to implement less-than for __float128 values; see https://github.com/gcc-mirror/gcc/blob/master/libgcc/soft-fp/letf2.c. In my case it's being called from pp_chr:

        if (!IN_BYTES /* under bytes, chr(-1) eq chr(0xff), etc. */
            && ((SvIOKp(top) && !SvIsUV(top) && SvIV_nomg(top) < 0)
                ||
                ((SvNOKp(top) || (SvOK(top) && !SvIsUV(top)))
                 && SvNV_nomg(top) < 0.0))) <==== This

If I build with -mstackrealign then the test I used, op/substr_thr.t, passes fully:

gmake -j6 CCTYPE=GCC CCHOME=c:\mingw-i686-ucrt USE_QUADMATH=define I_QUADMATH=define CFG=DebugSymbols OPTIMIZE="-g -O1 -mstackrealign" test-prep

AFAIK, 32-bit Windows only requires 4-byte stack alignment, so SSE code should not assume that esp-relative accesses are 16-byte aligned.

The backtraces I checked for the failing cases were in non-main threads, so I suspect CreateThread() is giving us an unaligned stack.

So I took a look at threads.xs and tried:

diff --git a/dist/threads/threads.xs b/dist/threads/threads.xs
index 92c5fd8fe4..da055dab25 100644
--- a/dist/threads/threads.xs
+++ b/dist/threads/threads.xs
@@ -535,11 +535,11 @@ S_jmpenv_run(pTHX_ int action, ithread *thread,
     return jmp_rc;
 }

-
 /* Starts executing the thread.
  * Passed as the C level function to run in the new thread.
  */
 #ifdef WIN32
+__attribute__((force_align_arg_pointer))
 STATIC THREAD_RET_TYPE
 S_ithread_run(LPVOID arg)
 #else

(without -mstackrealign) and this also passed my otherwise failing test.
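
For anyone wanting to try the same idea outside the perl tree, here is a minimal sketch of wrapping the attribute in a macro so that it only takes effect on compilers and targets where it is known to exist. The macro name STACK_REALIGN and the function thread_entry are made up for illustration and are not taken from the perl source; the double comparison merely stands in for the __float128 comparison that crashes above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: expands to the GCC x86 attribute where available,
 * and to nothing elsewhere, so the code still compiles on other targets. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#  define STACK_REALIGN __attribute__((force_align_arg_pointer))
#else
#  define STACK_REALIGN
#endif

/* Stand-in for a callback that the OS calls directly (a thread start
 * routine, a signal handler). The attribute makes the prologue realign
 * the stack before any callee can assume 16-byte alignment. */
STACK_REALIGN
static int thread_entry(void *arg)
{
    const double *value = arg;
    assert(value != NULL);
    return *value < 0.0;
}
```

The attribute only needs to go on functions that are entered from outside gcc-compiled code; everything they call inherits the realigned stack.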

I don't know why lhmouse is generating movdqu vs movdqa, the code here is in libgcc_s_dw2-1.dll.

Note: the issue has nothing to do with Intel vs AT&T syntax.

@tonycoz
Contributor

tonycoz commented Aug 2, 2023

It didn't fix every failure, but it fixed most of them:

op/fork.t                                                          (Wstat: 0 Tests: 28 Failed: 1)
  Failed test:  9
porting/cmp_version.t                                              (Wstat: 0 Tests: 46 Failed: 1)
  Failed test:  41
win32/popen.t                                                      (Wstat: 0 Tests: 1 Failed: 1)
  Failed test:  1
win32/signal.t                                                     (Wstat: 1280 (exited 5) Tests: 2 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 4 tests but ran 2.
../cpan/File-Temp/t/fork.t                                         (Wstat: 1280 (exited 5) Tests: 6 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 8 tests but ran 6.
../cpan/Test-Simple/t/Legacy/subtest/fork.t                        (Wstat: 1280 (exited 5) Tests: 0 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 1 tests but ran 0.
../cpan/Win32/t/Privileges.t                                       (Wstat: 0 Tests: 7 Failed: 1)
  Failed test:  7
../dist/IO/t/io_multihomed.t                                       (Wstat: 1280 (exited 5) Tests: 5 Failed: 0)
  Non-zero exit status: 5
  Parse errors: Bad plan.  You planned 8 tests but ran 5.
../ext/IPC-Open3/t/IPC-Open3.t                                     (Wstat: 0 Tests: 45 Failed: 0)
  TODO passed:   25
Files=2771, Tests=1188722, 2030 wallclock secs (73.05 usr + 11.97 sys = 85.02 CPU)
Result: FAIL
Finished test run at Wed Aug  2 16:02:37 2023.

I suspect some of the other failures that involve callbacks (fork, signals) have similar fixes.

@tonycoz
Contributor

tonycoz commented Aug 2, 2023

Diff as follows:

diff --git a/dist/threads/threads.xs b/dist/threads/threads.xs
index 92c5fd8fe4..da055dab25 100644
--- a/dist/threads/threads.xs
+++ b/dist/threads/threads.xs
@@ -535,11 +535,11 @@ S_jmpenv_run(pTHX_ int action, ithread *thread,
     return jmp_rc;
 }

-
 /* Starts executing the thread.
  * Passed as the C level function to run in the new thread.
  */
 #ifdef WIN32
+__attribute__((force_align_arg_pointer))
 STATIC THREAD_RET_TYPE
 S_ithread_run(LPVOID arg)
 #else
diff --git a/mg.c b/mg.c
index 899cc4a2d2..d7e5eb4d90 100644
--- a/mg.c
+++ b/mg.c
@@ -1526,6 +1526,7 @@ Perl_magic_clearsig(pTHX_ SV *sv, MAGIC *mg)
 }


+__attribute__((force_align_arg_pointer))
 #ifdef PERL_USE_3ARG_SIGHANDLER
 Signal_t
 Perl_csighandler(int sig, Siginfo_t *sip, void *uap)
diff --git a/win32/perlhost.h b/win32/perlhost.h
index e6ef46f809..6d01820474 100644
--- a/win32/perlhost.h
+++ b/win32/perlhost.h
@@ -1692,6 +1692,7 @@ PerlProcGetTimeOfDay(struct IPerlProc* piPerl, struct timeval *t, void *z)
 }

 #ifdef USE_ITHREADS
+__attribute__((force_align_arg_pointer))
 static THREAD_RET_TYPE
 win32_start_child(LPVOID arg)
 {

helped a lot:

    cd ..\t && perl.exe harness   op/fork.t porting/cmp_version.t  win32/popen.t win32/signal.t ../cpan/File-Temp/t/fork.t ../cpan/Test-Simple/t/Legacy/subtest/fork.t ../cpan/Win32/t/Privileges.t ../dist/IO/t/io_multihomed.t
op/fork.t .................................... ok
porting/cmp_version.t ........................ 24/46 # not ok 41 - dist/threads/lib/threads.pm version 2.37
porting/cmp_version.t ........................ Failed 1/46 subtests
win32/popen.t ................................ ok
win32/signal.t ............................... ok
../cpan/File-Temp/t/fork.t ................... ok
../cpan/Test-Simple/t/Legacy/subtest/fork.t .. ok
../cpan/Win32/t/Privileges.t ................. 1/7 # Failed test 7 in t/Privileges.t at line 54
../cpan/Win32/t/Privileges.t ................. Failed 1/7 subtests
../dist/IO/t/io_multihomed.t ................. ok

I also have a simple reproducer:

C:\Users\Tony\dev\perl\git>type stackquad.c
#include <quadmath.h>
#include <windows.h>
#include <stdio.h>
#include <stdbool.h>

__float128 x = 1.0;

bool y;

DWORD WINAPI ThreadProc(
  LPVOID lpParameter) {
  y = x < 0;
  return 0;
}

int main() {
  HANDLE h = CreateThread(NULL, 0, ThreadProc, NULL, 0, NULL);
  if (WaitForSingleObject(h, 10000) != WAIT_OBJECT_0) {
    printf("Failed wait\n");
    return 1;
  }
  return 0;
}

C:\Users\Tony\dev\perl\git>gcc -ostackquad.exe -g stackquad.c -lquadmath

C:\Users\Tony\dev\perl\git>gdb .\stackquad.exe
GNU gdb (GDB for MinGW-W64 i686, built by Brecht Sanders) 13.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "i686-w64-mingw32".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from .\stackquad.exe...
(gdb) r
Starting program: C:\Users\Tony\dev\perl\git\stackquad.exe
[New Thread 22248.0x8cf8]
[New Thread 22248.0x284c]

Thread 3 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 22248.0x284c]
__letf2 (a=1, b=0) at ../../../libgcc/soft-fp/letf2.c:41
41      ../../../libgcc/soft-fp/letf2.c: No such file or directory.
(gdb) bt
#0  __letf2 (a=1, b=0) at ../../../libgcc/soft-fp/letf2.c:41
#1  0x007015fc in ThreadProc@4 (lpParameter=0x0) at stackquad.c:12
#2  0x76ce00c9 in KERNEL32!BaseThreadInitThunk () from C:\WINDOWS\SysWOW64\kernel32.dll
#3  0x779b7b1e in ntdll!RtlGetAppContainerNamedObjectPath () from C:\WINDOWS\SysWOW64\ntdll.dll
#4  0x779b7aee in ntdll!RtlGetAppContainerNamedObjectPath () from C:\WINDOWS\SysWOW64\ntdll.dll
#5  0x00000000 in ?? ()
(gdb) info register esp
esp            0xe8feb8            0xe8feb8
(gdb)

@tonycoz
Contributor

tonycoz commented Aug 2, 2023

I'll look at cleaning this up tomorrow.

@xenu
Member

xenu commented Aug 2, 2023

> I don't know why lhmouse is generating movdqu vs movdqa, the code here is in libgcc_s_dw2-1.dll.

I couldn't find the source of the winlibs build, but lhmouse's build appears to include this patch, which seems related.

There's a comment about it in the PKGBUILD file:

  # workaround for AVX misalignment issue for pass-by-value arguments
  #   cf. https://github.com/msys2/MSYS2-packages/issues/1209
  #   cf. https://sourceforge.net/p/mingw-w64/discussion/723797/thread/bc936130/ 
  #  Issue is longstanding upstream at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412
  #  Potential alternative: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=939559
  # https://github.com/msys2/MINGW-packages/pull/8317#issuecomment-824548411
  apply_patch_with_msg \
    0200-add-m-no-align-vector-insn-option-for-i386.patch

@xenu
Member

xenu commented Aug 2, 2023

In the bugzilla ticket it's mentioned that it's possible to force gas to always generate unaligned moves with this:

-Wa,-muse-unaligned-vector-move

I didn't test it, but it could be the solution.

@tonycoz
Contributor

tonycoz commented Aug 2, 2023

Unfortunately neither is a solution we can implement: the code doing the unaligned accesses here is in libgcc_s_dw2-1.dll, which is prebuilt.

Also, I expect any new mingw packagers who don't notice the issue will be broken until they notice it or gcc itself is fixed.
