Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gpo_child process terminates with SIGSEGV. #6790

Closed
dpslavov opened this issue Jun 19, 2023 · 16 comments
Closed

gpo_child process terminates with SIGSEGV. #6790

dpslavov opened this issue Jun 19, 2023 · 16 comments
Assignees
Labels
Closed: Fixed Issue was closed as fixed. Work in Progress

Comments

@dpslavov
Copy link

This is caused by type mismatch for variable chain_id, it's declared as uint64_t (

uint64_t chain_id;
) while is parsed as long (
{"chain-id", 0, POPT_ARG_LONG, &chain_id,
). krb5_child suffers from same mismatch but don't terminates abnormally because chain_id is initialized to zero (
uint64_t chain_id = 0;
).

@alexey-tikhonov
Copy link
Member

Could you please attach a backtrace as well?

@dpslavov
Copy link
Author

Here it's:

(gdb) bt
#0  __GI_strlen () at ../sysdeps/arm/armv6t2/strlen.S:126
#1  0xb6e1dd28 in __vfprintf_internal (s=s@entry=0xbec39698, format=format@entry=0x422fc8 "[RID#%lu] %s", ap=..., ap@entry=..., mode_flags=mode_flags@entry=2) at vfprintf-process-arg.c:397
#2  0xb6e30f6e in __vsnprintf_internal (string=0x10875b1 "[RID#152] ", maxlen=<optimized out>, maxlen@entry=102335, format=0x422fc8 "[RID#%lu] %s", args=args@entry=..., mode_flags=2) at vsnprintf.c:114
#3  0xb6e9a81e in ___vsnprintf_chk (s=<optimized out>, maxlen=maxlen@entry=102335, flag=flag@entry=1, slen=slen@entry=4294967295, format=<optimized out>, format@entry=0x422fc8 "[RID#%lu] %s", ap=...) at vsnprintf_chk.c:34
#4  0xb6f8e996 in vsnprintf (__ap=..., __fmt=0x422fc8 "[RID#%lu] %s", __n=102335, __s=<optimized out>) at /usr/include/arm-linux-gnueabihf/bits/stdio2.h:68
#5  _backtrace_vprintf (format=format@entry=0x422fc8 "[RID#%lu] %s", ap=..., ap@entry=...) at ../src/util/debug_backtrace.c:256
#6  0xb6f8eb08 in sss_debug_backtrace_vprintf (level=level@entry=1024, format=0x422fc8 "[RID#%lu] %s", format@entry=0x2d5 <error: Cannot access memory at address 0x2d5>, ap=..., ap@entry=...) at ../src/util/debug_backtrace.c:119
#7  0xb6f8eb5e in sss_debug_backtrace_printf (level=level@entry=1024, format=0x422fc8 "[RID#%lu] %s") at ../src/util/debug_backtrace.c:128
#8  0xb6f8e310 in sss_vdebug_fn (file=file@entry=0x422a68 "../src/providers/ad/ad_gpo_child.c", line=line@entry=725, function=function@entry=0x4234f4 <__FUNCTION__.8> "main", level=level@entry=1024, flags=flags@entry=0,
    format=format@entry=0x422fd8 "gpo_child started.\n", ap=ap@entry=...) at ../src/util/debug.c:356
#9  0xb6f8e558 in sss_debug_fn (file=file@entry=0x422a68 "../src/providers/ad/ad_gpo_child.c", line=line@entry=725, function=function@entry=0x4234f4 <__FUNCTION__.8> "main", level=level@entry=1024,
    format=0x422fd8 "gpo_child started.\n") at ../src/util/debug.c:376
#10 0x0041f27a in main (argc=7, argv=0xbec39c94) at ../src/providers/ad/ad_gpo_child.c:725

@alexey-tikhonov
Copy link
Member

alexey-tikhonov commented Nov 10, 2023

Hi @dpslavov,

sorry for the delay.

This is caused by type mismatch for variable chain_id, it's declared as uint64_t (

uint64_t chain_id;

) while is parsed as long (

{"chain-id", 0, POPT_ARG_LONG, &chain_id,

).

What platform do you use? ARM, but what arm exactly?
This could only be the reason for a memory corruption, if 'popt' would expect POPT_ARG_LONG to be larger than uint64_t (8 bytes).

krb5_child suffers from same mismatch but don't terminates abnormally because chain_id is initialized to zero (

uint64_t chain_id = 0;

).

If it's other way around and uint64_t is larger than long then zero initialization would prevent from garbage in variable.
At the moment I don't understand how garbage in chain_id can result in a crash...

#3  0xb6e9a81e in ___vsnprintf_chk (s=<optimized out>, 
                                    maxlen=maxlen@entry=102335, 
                                    flag=flag@entry=1, 
                                    slen=slen@entry=4 294 967 295, 
                                    format=<optimized out>, 
                                    format@entry=0x422fc8 "[RID#%lu] %s", ap=...) at vsnprintf_chk.c:34

#4  0xb6f8e996 in vsnprintf (__ap=..., 
                                                   __fmt=0x422fc8 "[RID#%lu] %s",
                                                   __n=102335,
                                                   __s=<optimized out>) at /usr/include/arm-linux-gnueabihf/bits/stdio2.h:68

#5   _backtrace_vprintf (format=format@entry=0x422fc8 "[RID#%lu] %s", ap=..., ap@entry=...) at ../src/util/debug_backtrace.c:256

Two things bothers me here ^^:

  1. why does vsnprintf have reverse args order ('__ap' goes first)?
  2. why __glibc_objsize (__s) == 4 294 967 295? IIUC, this is size of allocated buffer for internal backtrace, and it should be 100kb (__n=102335 is remaining free space in the bufer)

How SSSD was built on this platform?

@dpslavov
Copy link
Author

Hi @alexey-tikhonov,

It's an armv7l machine running Debian and using stock SSSD package. Unfortunately since I reported this bug I've upgraded it to the newer release "bookworm", but seems that the issue is still there. While it crash at another place in another process (sssd_be) the stack back trace is same: it crashes while trying to format debugging message, more specifically when try to format unsigned long long to unsigned long. Probably it has something to do with the internals of implementation of variadic functions. The problem can be easy reproduced:

cat > test.c << EOF
#include <stdio.h>

int main()
{
        printf("%lu %s\n", 1ULL, "foo");
}
EOF
gcc -o test -g test.c
./test
Segmentation fault

@alexey-tikhonov
Copy link
Member

@dpslavov, do I understand correctly that sizeof(unsigned long) on your platform is 4?

@dpslavov
Copy link
Author

Yes, that's correct. It's a 32-bit platform.

alexey-tikhonov added a commit to alexey-tikhonov/sssd that referenced this issue Nov 13, 2023
@alexey-tikhonov
Copy link
Member

@dpslavov, I think #7031 should fix this.
Any chance you could try this patch?

@dpslavov
Copy link
Author

I tested it and while the sssd don't crash crash anymore and works correctly the RID in debug messages from gpo_child looks bogus:

Nov 14 12:12:37 host.domain sssd_be[12395]: [RID#5] All data has been sent!
Nov 14 12:12:37 host.domain gpo_child[12790]: [RID#13183383835691188229] gpo_child started.

I suppose it's related to missing initialization of chain_id to zero.

@alexey-tikhonov
Copy link
Member

I suppose it's related to missing initialization of chain_id to zero.

No, this part is due to mismatch of sizes of uint64_t chain_id and POPT_ARG_LONG (as you mentioned in original report).

Zero init will hide this issue, but isn't a proper solution. I'll add another patch.

alexey-tikhonov added a commit to alexey-tikhonov/sssd that referenced this issue Nov 14, 2023
Sizes might not match on some platforms.

Resolves: SSSD#6790
@alexey-tikhonov
Copy link
Member

@dpslavov, I added a 2nd patch. Could you please test this?

I expect this to resolve the issue. Once confirmed, I'll also add zero init but this is for case "--chain-id" arg is totally missing.

@dpslavov
Copy link
Author

As far as I can see this second patch should be applied alone, without applying the previous one. Correct?

@alexey-tikhonov
Copy link
Member

As far as I can see this second patch should be applied alone, without applying the previous one. Correct?

No, both patches from https://github.com/SSSD/sssd/pull/7031/commits are required.

First fixes size mismatch within logger implementation, second fixes size mismatch during arguments parsing.

@dpslavov
Copy link
Author

No issues were observed after applying the patches and changing debug_level to 6. CID/RID are correct now.

@dpslavov
Copy link
Author

dpslavov commented Nov 14, 2023

Just to mention that only following modules were tested: gpo_child, krb5_child, ldap_child, sssd_be, sssd_nss, sssd_pac, sssd_pam and sssd_sudo.

@alexey-tikhonov
Copy link
Member

alexey-tikhonov commented Nov 14, 2023

Thank you for testing.

alexey-tikhonov added a commit to alexey-tikhonov/sssd that referenced this issue Nov 14, 2023
Sizes might not match on some platforms.

Resolves: SSSD#6790
alexey-tikhonov added a commit that referenced this issue Nov 17, 2023
Sizes might not match on some platforms.

Resolves: #6790

Reviewed-by: Justin Stephenson <jstephen@redhat.com>
Reviewed-by: Tomáš Halman <thalman@redhat.com>
alexey-tikhonov added a commit that referenced this issue Nov 17, 2023
Resolves: #6790

Reviewed-by: Justin Stephenson <jstephen@redhat.com>
Reviewed-by: Tomáš Halman <thalman@redhat.com>
(cherry picked from commit 2617dcf)
alexey-tikhonov added a commit that referenced this issue Nov 17, 2023
Sizes might not match on some platforms.

Resolves: #6790

Reviewed-by: Justin Stephenson <jstephen@redhat.com>
Reviewed-by: Tomáš Halman <thalman@redhat.com>
(cherry picked from commit 098bf64)
@alexey-tikhonov
Copy link
Member

Pushed PR: #7031

  • master
    • 098bf64 - Don't provide 'uint64_t' as POPT_ARG_LONG.
    • 2617dcf - UTIL: use proper specifier for 'DEBUG_CHAIN_ID_FMT_*'
  • sssd-2-9
    • 1e2af0d - Don't provide 'uint64_t' as POPT_ARG_LONG.
    • 4b4564c - UTIL: use proper specifier for 'DEBUG_CHAIN_ID_FMT_*'

@alexey-tikhonov alexey-tikhonov added the Closed: Fixed Issue was closed as fixed. label Nov 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Closed: Fixed Issue was closed as fixed. Work in Progress
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants