Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

t/Image/Transform.t aborts on memmove() in img_2d_transform() if compiled with -D_FORTIFY_SOURCE=3 and glibc-2.36.9000 #78

Closed
ppisar opened this issue Jan 11, 2023 · 11 comments · Fixed by #79

Comments

@ppisar
Copy link
Contributor

ppisar commented Jan 11, 2023

When building Prima with -D_FORTIFY_SOURCE=3:

$ perl Makefile.PL OPTIMIZE='-Wp,-D_FORTIFY_SOURCE=3 -g -O2'

t/Image/Transform.t test aborts:

$ perl -Iblib/{lib,arch} t/Image/Transform.t
ok 1 - rotate(90) width ok
ok 2 - rotate(90) height ok
ok 3 - rotate(90) data ok
ok 4 - rotate(270) width ok
ok 5 - rotate(270) height ok
ok 6 - rotate(270) data ok
ok 7 - rotate(180) width ok
ok 8 - rotate(180) height ok
ok 9 - rotate(180) data ok
ok 10 - short: rotate(90) data ok
ok 11 - short: rotate(270) data ok
ok 12 - short: rotate(180) data ok
ok 13 - byte: vertical mirroring ok
ok 14 - byte: horizontal mirroring ok
ok 15 - short: vertical mirroring ok
ok 16 - short: horizontal mirroring ok
ok 17 - rotation 360 seems performing
ok 18 - rotation 360 is correct
ok 19 - rotation 360 by transform2d seems performing
ok 20 - rotation 360 transform2d is correct
ok 21 - xshear(2) integral data
ok 22 - xshear(2) integral mask
*** buffer overflow detected ***: terminated
Aborted (core dumped)

This can be reduced to:

$ cat /tmp/test.t 
#!/usr/bin/perl
use Prima::sys::Test qw(noX11);
my $p = Prima::Image->create(
        width    => 2,
        height   => 2,
        type     => im::Byte,
        data     => "\1\4..\3\6");
$p->shear(0,0.5);
$ perl -Iblib/{lib,arch} /tmp/test.t 
*** buffer overflow detected ***: terminated
Aborted (core dumped)

Valgrind reports:

$ valgrind -- perl -Iblib/{lib,arch} /tmp/test.t 
==50828== Memcheck, a memory error detector
==50828== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==50828== Using Valgrind-3.20.0 and LibVEX; rerun with -h for copyright info
==50828== Command: perl -Iblib/lib -Iblib/arch /tmp/test.t
==50828== 
==50828== Source and destination overlap in memcpy_chk(0x1ffeffd574, 0x1ffeffd580, 24)
==50828==    at 0x484C332: __memcpy_chk (vg_replace_strmem.c:1741)
==50828==    by 0x572FF3D: UnknownInlinedFun (string_fortified.h:36)
==50828==    by 0x572FF3D: img_2d_transform (rotate.c:1171)
==50828==    by 0x56DC6F1: Image_matrix_transform (conv.c:943)
==50828==    by 0x56D9BC4: Image_transform (conv.c:992)
==50828==    by 0x566FDA0: template_xs_Bool_Handle_HVPtr (thunks.tinc:3788)
==50828==    by 0x497BECF: Perl_pp_entersub (pp_hot.c:5352)
==50828==    by 0x496D84F: Perl_runops_standard (run.c:41)
==50828==    by 0x48DDFA0: UnknownInlinedFun (perl.c:2721)
==50828==    by 0x48DDFA0: perl_run (perl.c:2644)
==50828==    by 0x109349: main (perlmain.c:110)
==50828== 
**50828** *** memcpy_chk: buffer overflow detected ***: program terminated
==50828==    at 0x48470AC: VALGRIND_PRINTF_BACKTRACE (valgrind.h:6815)
==50828==    by 0x484C3BC: __memcpy_chk (vg_replace_strmem.c:1741)
==50828==    by 0x572FF3D: UnknownInlinedFun (string_fortified.h:36)
==50828==    by 0x572FF3D: img_2d_transform (rotate.c:1171)
==50828==    by 0x56DC6F1: Image_matrix_transform (conv.c:943)
==50828==    by 0x56D9BC4: Image_transform (conv.c:992)
==50828==    by 0x566FDA0: template_xs_Bool_Handle_HVPtr (thunks.tinc:3788)
==50828==    by 0x497BECF: Perl_pp_entersub (pp_hot.c:5352)
==50828==    by 0x496D84F: Perl_runops_standard (run.c:41)
==50828==    by 0x48DDFA0: UnknownInlinedFun (perl.c:2721)
==50828==    by 0x48DDFA0: perl_run (perl.c:2644)
==50828==    by 0x109349: main (perlmain.c:110)
==50828== 

It is this code:

    SKIP:
        if ( step < iop.n_steps - 1)
→           memmove( io, io + 1, (iop.n_steps - step - 1) * sizeof(ImgOp) );
        iop.n_steps--;
        step--;
        io--; 
    }

This happens both with Prima-1.67 as well as with current head (8a0c8bc).

memove() is requires that source and target memory areas cannot overlap. Modern glibc scrutinizes the calls if a program is compiled with -D_FORTIFY_SOURCE=3 -O2 cflags and aborts the program when it detects overlapping memory areas.

Please note that I have not yet verified that this is a genuine bug in Prima and not a bug in glibc.

@dk
Copy link
Owner

dk commented Jan 11, 2023

Thank you for great anaysis! One thing bothers me though: for decades I knew that if one suspects that memory areas overlap, memmove is the safer approach than memcpy. Now you are saying the reverse. I shall check the bug later no matter what, but could you corroborate the statement about glibc.memmove's area that cannot overlap?

@ppisar
Copy link
Contributor Author

ppisar commented Jan 12, 2023

You are right, memove() can overlap. I got confused by the valgrind output. Maybe glibc fortification does not play well with valgrind instrumentation, or valgrind prints a misnomer.

Here is a backtrace from GDB when Prima built with -D_FORTIFY_SOURCE=3 and glibc aborts the program:

Program received signal SIGABRT, Aborted.
0x00007ffff7ab02fc in __pthread_kill_implementation () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff7ab02fc in __pthread_kill_implementation () at /lib64/libc.so.6
#1  0x00007ffff7a5e056 in raise () at /lib64/libc.so.6
#2  0x00007ffff7a4787c in abort () at /lib64/libc.so.6
#3  0x00007ffff7a485b3 in _IO_peekc_locked.cold () at /lib64/libc.so.6
#4  0x00007ffff7b3fceb in  () at /lib64/libc.so.6
#5  0x00007ffff7b3e516 in  () at /lib64/libc.so.6
#6  0x00007ffff753bf3e in memmove
    (__len=<optimized out>, __src=0x7fffffffb5f0, __dest=0x7fffffffb5e4, __dest=<optimized out>, __src=<optimized out>, __len=<optimized out>) at /usr/include/bits/string_fortified.h:36
#7  img_2d_transform
    (self=93824992510816, matrix=<optimized out>, fill=0x7fffffffe100 "", output=0x7fffffffdab0)
    at img/rotate.c:1171
#8  0x00007ffff74e86f2 in Image_matrix_transform
    (self=93824992510816, matrix=0x7fffffffe0d0, fill=0x7fffffffe100 "") at class/Image/conv.c:943
#9  0x00007ffff74e5bc5 in Image_transform
    (self=self@entry=93824992510816, profile=profile@entry=0x555555596358) at class/Image/conv.c:992
#10 0x00007ffff747bda1 in template_xs_Bool_Handle_HVPtr
    (cv=<optimized out>, subName=<optimized out>, func=0x7ffff74e5a10 <Image_transform>)
    at include/generic/thunks.tinc:3788
#11 0x00007ffff7d1eed0 in Perl_pp_entersub () at /lib64/libperl.so.5.36
#12 0x00007ffff7d10850 in Perl_runops_standard () at /lib64/libperl.so.5.36
#13 0x00007ffff7c80fa1 in perl_run () at /lib64/libperl.so.5.36
#14 0x000055555555534a in main ()

It points to the same memove() call at img/rotate.c:1171. There are lot of local variables. Interesting ones from frame #7:

(gdb) p step
$1 = <optimized out>
(gdb) p iop
$2 = {n_steps = 2, steps = {{cmd = 2, p1 = 1, p2 = 1}, {cmd = 1, p1 = 0.5, p2 = 0}, {cmd = 1, 
      p1 = 0.5, p2 = 0}, {cmd = 0, p1 = 0, p2 = 0}, {cmd = 0, p1 = 0, p2 = 0}}}
(gdb) p sizeof(ImgOp)
$3 = 12

Frame #6 does not show anything interesting because all arguments were optimized out. Problem with _FORTIFY_SOURCE is that it requires enabled optimizations.

@ppisar
Copy link
Contributor Author

ppisar commented Jan 12, 2023

If I recompile without fortification and optimizations, valgrind does not catch anything.

@dk
Copy link
Owner

dk commented Jan 12, 2023

This is all quite mystical I must say... if I compile with these flags, nothing is happening, neither coredumps nor valgrind alerts. Is it a specific glibc version and/or architecture? I tried this on Ubuntu GLIBC 2.35-0ubuntu3.1 .

One thing though, my compiler warns during the compilation:

cc -c  -Iinclude -Iinclude/generic -I/usr/include/fribidi -I/usr/include/uuid -I/usr/include/freetype2 -I/usr/include/libpng16 -I/usr/include/harfbuzz -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/gtk-3.0 -I/usr/include/at-spi2-atk/2.0 -I/usr/include/at-spi-2.0 -I/usr/include/dbus-1.0 -I/usr/lib/x86_64-linux-gnu/dbus-1.0/include -I/usr/include/gtk-3.0 -I/usr/include/gio-unix-2.0 -I/usr/include/cairo -I/usr/include/pango-1.0 -I/usr/include/pango-1.0 -I/usr/include/atk-1.0 -I/usr/include/cairo -I/usr/include/pixman-1 -I/usr/include/gdk-pixbuf-2.0 -I/usr/include/x86_64-linux-gnu -I/usr/include/libmount -I/usr/include/blkid  -fopenmp -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -Wp,-D_FORTIFY_SOURCE=3 -g -O3   -DVERSION=\"1.67\" -DXS_VERSION=\"1.67\" -fPIC "-I/home/dk/perl5/perlbrew/perls/perl-5.36.0/lib/5.36.0/x86_64-linux/CORE"   img/rotate.c -o img/rotate.o
<command-line>: warning: "_FORTIFY_SOURCE" redefined

I wonder if _FORTIFY_SOURCE=3 has any effect here?

I'd hate to call it a glibc bug but it looks like one. There are no direct google hits, but there is some vague info on google that memcpy and memmove were one until some version, and I see that in the stack trace you've got there is this line

at 0x484C332: __memcpy_chk (vg_replace_strmem.c:1741)

hinting that valgrind uses its own memcpy_chk to test, well, memcpy, and not necessarily memmove.

Anyway more details needed - I'd be curious too to recompile glibc version to dig down to the real problem.

@dk
Copy link
Owner

dk commented Jan 12, 2023

( btw I just tried to play with the latest glibc 5.36 but that didn't work for me, caused coredumps for every command...)

@ppisar
Copy link
Contributor Author

ppisar commented Jan 12, 2023

I observe this on current Fedora 38. It has glibc-2.36.9000, a snapshot of development glibc newer than 2.36 release.

I added few debugging printf()s and this what it prints:

$ perl -Iblib/{lib,arch} /tmp/test 
sizeof(oip)=64
sizeof(oip.steps)=60
Initial iop.steps=0x7ffce4065984 past(iop.steps)=0x7ffce40659c0 iop.n_steps=3
step=0, iop.n_steps=3, io=0x7ffce4065984
Before memove condition: step=0 iop.n_steps=3
Calling memmove(io=0x7ffce4065984, io+1=0x7ffce4065990, (iop.n_steps - step - 1) * sizeof(ImgOp)=24)
memmove() finished
Decreasing iop.n_steps, step, io
step=0, iop.n_steps=2, io=0x7ffce4065984
Before memove condition: step=0 iop.n_steps=2
Calling memmove(io=0x7ffce4065984, io+1=0x7ffce4065990, (iop.n_steps - step - 1) * sizeof(ImgOp)=12)
*** buffer overflow detected ***: terminated
Aborted (core dumped)

If did the math correctly, than none of the two memmove() calls reads or writes out of the iop structure memory. It's ridiculous that first memmove() of size 24 passes fortification, but subsequent memmove() of size 12 fails the fortification check.

The corresponding initial iop content is:

(gdb) p iop
$5 = {n_steps = 3, steps = {{cmd = 0, p1 = 0, p2 = 0}, {cmd = 2, p1 = 1, p2 = 1}, {cmd = 1, p1 = 0.5,
      p2 = 0}, {cmd = 0, p1 = 0, p2 = 0}, {cmd = 0, p1 = 0, p2 = 0}}}

I conclude it's a bug in glibc or gcc. I will try to minimize the code and report to glibc developers.

@dk
Copy link
Owner

dk commented Jan 12, 2023

This is gets more and more interesting ... let's though keep this bug open for the reference

@ppisar ppisar changed the title t/Image/Transform.t aborts because of overlapping memmove() in img_2d_transform t/Image/Transform.t aborts on memmove() in img_2d_transform() if compiled with -D_FORTIFY_SOURCE=3 and glibc-2.36.9000 Jan 13, 2023
@ppisar
Copy link
Contributor Author

ppisar commented Jan 13, 2023

I reported it to glibc together with a minimal reproducer https://bugzilla.redhat.com/show_bug.cgi?id=2160682.

@siddhesh
Copy link
Contributor

You are right, memove() can overlap. I got confused by the valgrind output. Maybe glibc fortification does not play well with valgrind instrumentation, or valgrind prints a misnomer.

Yeah, valgrind can report spurious warnings with fortified functions: https://bugs.kde.org/show_bug.cgi?id=453084

I'm trying to figure out out to fix this in gcc, the io-- is throwing it off due to a potential underflow in the first iteration. It's not even actual undefined behaviour AFAICT, so I want to try and fix it in the compiler.

In the meantime, I've suggested a workaround in the fedora bug Petr filed for the reproducer, hopefully that helps work around this in actual code too. Basically the intent is to ensure that io never crosses object boundaries, even in unreachable code, regardless of whether it is actually dereferenced.

@dk
Copy link
Owner

dk commented Jan 13, 2023

I don't see the patch you mention but i"m ok to change the code. I cannot reproduce it though so i depend here on whoever can. Replace io + 1 to iop[steps + 1] or something similar, I would guess?

@siddhesh
Copy link
Contributor

I've sent a PR that ought to resolve this.

@dk dk closed this as completed in #79 Jan 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants