lsof crash on glibc 2.33 armv7a #160

Closed
10ne1 opened this issue Jun 19, 2021 · 8 comments · Fixed by #161

Comments

@10ne1

10ne1 commented Jun 19, 2021

Hi. Starting with glibc v2.33, the {f}stat{at} APIs were reworked, which exposed a bug in lsof that makes it always crash on 32-bit ARM systems due to a buffer alignment problem in this code location.

The glibc 2.33 commit causing the crash is this: aa03f722f3 linux: Add {f}stat{at} y2038 support

See the discussion in the glibc bugtracker which uncovered the issue. Here's a stack trace:

#0  0xf396e1aa in __cp_stat64_t64_stat64 (st64_t64=0xffa4bc78, st64=<optimized out>)
    at ../sysdeps/unix/sysv/linux/stat_t64_cp.c:40
#1  0xf39671a8 in __GI___stat64 (file=<optimized out>, buf=0xffa4bd2b)
    at ../sysdeps/unix/sysv/linux/stat64.c:38
#2  0x066cbf26 in doinchild (fn=<optimized out>, fp=<optimized out>, rbuf=<optimized out>, 
    rbln=<optimized out>) at misc.c:364
#3  0x066cc09e in Readlink (arg=0x6b59378 "/proc") at misc.c:1109
#4  0x066bbe52 in readmnt () at dmnt.c:512
#5  0x066c6454 in ck_file_arg (i=1, ac=2, av=0xffa52224, fv=0, rs=0, sbp=0x0) at arg.c:189
#6  0x066ca506 in main (argc=<optimized out>, argv=<optimized out>) at main.c:1251

In a nutshell lsof needs to ensure the rbuf in doinchild() is properly aligned.

@masatake
Contributor

Thank you for reporting.

Could you rebuild the lsof command with 'make clean; make DEBUG='-g -O0'' and capture the stack trace again?

@10ne1
Author

10ne1 commented Jun 19, 2021

@masatake Here you go. glibc itself is still built with optimizations, but I assume that's irrelevant.

#0  0xf6f151aa in __cp_stat64_t64_stat64 (st64_t64=0xff8b8c70, st64=<optimized out>)
    at ../sysdeps/unix/sysv/linux/stat_t64_cp.c:40
#1  0xf6f0e1a8 in __GI___stat64 (file=<optimized out>, buf=0xff8b8d7a)
    at ../sysdeps/unix/sysv/linux/stat64.c:38
#2  0x0cc83d30 in dostat (path=0xff8b9d7b "/", rbuf=0xff8b8d7a "", rbln=104) at misc.c:474
#3  0x0cc830c2 in doinchild (fn=0xcc83961 <doreadlink>, fp=0xff8badf1 "/home", rbuf=0xff8bbdf2 "", 
    rbln=4096) at misc.c:364
#4  0x0cc8352c in Readlink (arg=0xff8bee33 "/home/chronos/u-*") at misc.c:1109
#5  0x0cc7a8fe in ck_file_arg (i=3, ac=4, av=0xff8be294, fv=0, rs=0, sbp=0x0) at arg.c:160
#6  0x0cc807cc in main (argc=4, argv=0xff8be294) at main.c:1251

@masatake
Contributor

Looks strange.

doreadlink is passed to doinchild. However, dostat is called. I wonder why.

How about the following change?

diff --git a/misc.c b/misc.c
index 3bebdc5..4954e90 100644
--- a/misc.c
+++ b/misc.c
@@ -293,7 +293,11 @@ doinchild(fn, fp, rbuf, rbln)
                 */
 
                    int r_al, r_rbln;
-                   char r_arg[MAXPATHLEN+1], r_rbuf[MAXPATHLEN+1];
+                   char r_arg[MAXPATHLEN+1];
+                   union {
+                     char r_rbuf[MAXPATHLEN+1];
+                     struct stat r_stat;
+                   } r;
                    int (*r_fn)();
                /*
                 * Close sufficient open file descriptors except Pipes[0] and
@@ -358,16 +362,16 @@ doinchild(fn, fp, rbuf, rbln)
                        ||  read(Pipes[0], r_arg, r_al) != r_al
                        ||  read(Pipes[0], (char *)&r_rbln, sizeof(r_rbln))
                            != (int)sizeof(r_rbln)
-                       ||  r_rbln < 1 || r_rbln > (int)sizeof(r_rbuf))
+                       ||  r_rbln < 1 || r_rbln > (int)sizeof(r.r_rbuf))
                            break;
-                       zeromem (r_rbuf, r_rbln);
-                       rv = r_fn(r_arg, r_rbuf, r_rbln);
+                       zeromem (r.r_rbuf, r_rbln);
+                       rv = r_fn(r_arg, r.r_rbuf, r_rbln);
                        en = errno;
                        if (write(Pipes[3], (char *)&rv, sizeof(rv))
                            != sizeof(rv)
                        ||  write(Pipes[3], (char *)&en, sizeof(en))
                            != sizeof(en)
-                       ||  write(Pipes[3], r_rbuf, r_rbln) != r_rbln)
+                       ||  write(Pipes[3], r.r_rbuf, r_rbln) != r_rbln)
                            break;
                    }

@10ne1
Author

10ne1 commented Jun 19, 2021

@masatake

The above fix works; the union with the buffer correctly fixes the alignment. Can you please apply this fix to main or the relevant branches? Thank you!

@masatake
Contributor

@10ne1, thank you for testing.
I will make a pull request and merge it.

@masatake
Contributor

BTW, what command-line options did you specify?
I guess you specified -O. Am I correct?

@10ne1
Author

10ne1 commented Jun 20, 2021

Typically it is run in ChromiumOS at startup as lsof -n -Fn /home/chronos/u-* with different paths, or I run it manually without any options, like lsof /var/log/messages. I did not explicitly test -O, but I'd expect that to crash too, because in my testing it crashed regardless of the arguments when built against glibc 2.33 on arm.

Once you merge a fix, I'll push it to the lsof package in Gentoo and then also to ChromiumOS.

Thank you very much!

@masatake
Contributor

Now I understand what happened. I had assumed a child process was created each time stat or readlink was called.
In fact, the child process acts like a daemon; it serves all the system calls requested by the parent process.

masatake added a commit to masatake/lsof that referenced this issue Jun 20, 2021
Close lsof-org#160.

The original code passes char[] buffer to stat().
This can cause a SIGBUS.

lsof-org#160 reported an actual crash on armv7a + glibc-2.33 platform.
See also https://sourceware.org/bugzilla/show_bug.cgi?id=27993.

The issue is reported by @10ne1.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
10ne1 added a commit to 10ne1/gentoo that referenced this issue Jun 21, 2021
This backports an upstream fix for a crash which happens on
armv7a + glibc 2.33 due to a buffer misalignment.

Upstream issue: lsof-org/lsof#160
Upstream commit: 21cb1dad1243f4c0a427d893babab12e48b60f0e

Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
10ne1 added a commit to 10ne1/gentoo that referenced this issue Jun 21, 2021
This backports an upstream fix for a crash which happens on
armv7a + glibc 2.33 due to a buffer misalignment.

Upstream issue: lsof-org/lsof#160
Upstream commit: 21cb1dad1243f4c0a427d893babab12e48b60f0e

Bug: https://bugs.gentoo.org/797358

Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
gentoo-bot pushed a commit to gentoo/gentoo that referenced this issue Jun 21, 2021
This backports an upstream fix for a crash which happens on
armv7a + glibc 2.33 due to a buffer misalignment.

Upstream issue: lsof-org/lsof#160
Upstream commit: 21cb1dad1243f4c0a427d893babab12e48b60f0e
Bug: https://bugs.gentoo.org/797358
Closes: #21354
Acked-by: David Seifert <soap@gentoo.org>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Sam James <sam@gentoo.org>
axelgenus pushed a commit to axelgenus/gentoo that referenced this issue Jun 22, 2021
This backports an upstream fix for a crash which happens on
armv7a + glibc 2.33 due to a buffer misalignment.

Upstream issue: lsof-org/lsof#160
Upstream commit: 21cb1dad1243f4c0a427d893babab12e48b60f0e
Bug: https://bugs.gentoo.org/797358
Closes: gentoo#21354
Acked-by: David Seifert <soap@gentoo.org>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Sam James <sam@gentoo.org>