Coredump occurs when restarting sssd-ifp.service while sssd.service is inactive #6324
Comments
Sorry for the late update. The patch: src/monitor/monitor.c | 4 ++++ diff --git a/src/monitor/monitor.c b/src/monitor/monitor.c
Some analysis: in fact, all services should have the above settings, but I have only reproduced the coredump after an ifp reconnection; other responders have not been tested. To keep the modification minimal, an `if` check is added here so that only the ifp case is handled.
The steps to reproduce are: The sssd version is 2.6.1.
Hi @huangzq6, what's in the |
hi, @alexey-tikhonov , |
Hi,
The patch proposed in #6324 (comment) doesn't take into account that 'ifp' can be configured in sssd.conf to be run by the sssd service (and not by sssd-ifp).
Ah... Do you mean:
|
Hi @huangzq6, could you please check if following patch resolves an issue for you:
? |
hi, thanks for your patch. |
Hi @alexey-tikhonov, I also tried your modification and found it worked. This is the PR submitted according to your suggestion |
When a socket-activated service connects for the first time, it is added to `mt_ctx->svc_list` by `socket_activated_service_not_found()` with a proper `socket_activated = true`. But when it reconnects, `get_service_in_the_list()` finds it in `mt_ctx->svc_list` and overwrites `socket_activated = false` unconditionally. This patch moves `socket_activated = false` to `start_service()`. Resolves: SSSD#6324
Thank you for the confirmations. |
When socket activated service connects for the first time, it is added to `mt_ctx->svc_list` by `socket_activated_service_not_found()` with a proper `socket_activated = true`. But when it reconnects again, `get_service_in_the_list()` finds it in `mt_ctx->svc_list` and overwrites `socket_activated = false` unconditionally. This patch moves `socket_activated = false` to `start_service()`. Resolves: #6324 Reviewed-by: Iker Pedrosa <ipedrosa@redhat.com> Reviewed-by: Pavel Březina <pbrezina@redhat.com> (cherry picked from commit d4f7ed6)
Hi, a coredump occurs when I restart sssd-ifp.service while sssd.service is inactive. The probability of occurrence is very high. Alternatively, restart sssd-ifp.service and sssd.service at the same time, and it will crash.
I am investigating the cause; it may be a double free.
The system runs a 5.10 kernel.
Some version information:
sssd: 2.6.1-1
glibc: 2.34-70
The coredump log is as follows (partial):
localhost:/home/hzq # gdb /usr/sbin/sssd core.sssd.0.b3ab7388b13f4a979fa9fe86fcaae558.57906.1661343892000000
GNU gdb (GDB) openEuler 11.1-1.h2.openEuler
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-openEuler-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/sssd...
Reading symbols from /usr/lib/debug//usr/sbin/sssd-2.6.1-1.h7.openEuler.aarch64.debug...
[New LWP 57906]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/sssd -i --logger=files'.
Program terminated with signal SIGABRT, Aborted.
#0 __pthread_kill_implementation (threadid=281473457532960, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
Missing separate debuginfos, use: dnf debuginfo-install ...
(gdb) bt
#0 __pthread_kill_implementation (threadid=281473457532960, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x0000ffffa4ac7d14 in __pthread_kill_internal (signo=<optimized out>, threadid=<optimized out>) at pthread_kill.c:78
#2 0x0000ffffa4a83cbc in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x0000ffffa4a71d2c in __GI_abort () at abort.c:79
#4 0x0000ffffa4dfbdf8 in _talloc_free () from /usr/lib64/libtalloc.so.2
#5 0x0000ffffa4e7f190 in sbus_connection_destructor (conn=0xaaaafa94b6c0) at src/sbus/connection/sbus_connection.c:78
#6 0x0000ffffa4dfaef8 in ?? () from /usr/lib64/libtalloc.so.2
#7 0x0000ffffa4dfab58 in ?? () from /usr/lib64/libtalloc.so.2
#8 0x0000ffffa4dfab58 in ?? () from /usr/lib64/libtalloc.so.2
#9 0x0000ffffa4dfab58 in ?? () from /usr/lib64/libtalloc.so.2
#10 0x0000ffffa4dfab58 in ?? () from /usr/lib64/libtalloc.so.2
#11 0x0000ffffa55a00f0 in server_atexit () at src/util/server.c:45
#12 0x0000ffffa4a864d0 in __run_exit_handlers (status=0, listp=0xffffa4be7698 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true,
run_dtors=run_dtors@entry=true) at exit.c:113
#13 0x0000ffffa4a8662c in __GI_exit (status=<optimized out>) at exit.c:143
#14 0x0000aaaad64b6f58 in monitor_quit (mt_ctx=<optimized out>, ret=0) at src/monitor/monitor.c:1438
#15 0x0000aaaad64b76d0 in monitor_quit_signal (ev=<optimized out>, se=<optimized out>, signum=15, count=<optimized out>, siginfo=<optimized out>,
private_data=<optimized out>) at src/monitor/monitor.c:1455
#16 0x0000ffffa4e24248 in tevent_common_invoke_signal_handler () from /usr/lib64/libtevent.so.0
#17 0x0000ffffa4e243f8 in tevent_common_check_signal () from /usr/lib64/libtevent.so.0
#18 0x0000ffffa4e228c4 in ?? () from /usr/lib64/libtevent.so.0
#19 0x0000ffffa4e1f4d4 in _tevent_loop_once () from /usr/lib64/libtevent.so.0
#20 0x0000ffffa4e1f7b0 in tevent_common_loop_wait () from /usr/lib64/libtevent.so.0
#21 0x0000ffffa55a1510 in server_loop (main_ctx=0xaaaafa93ee60) at src/util/server.c:733
#22 0x0000aaaad64b5974 in main (argc=<optimized out>, argv=<optimized out>) at src/monitor/monitor.c:2582