Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SECRETS: Don't remove a container when it has children #11

Closed
wants to merge 2 commits into from

Conversation

Projects
None yet
4 participants
@fidencio
Copy link
Contributor

commented Sep 1, 2016

Let's return and log an error in case the container to be removed has
children.

The approach taken introduced at least one new search in every delete
operation. As far as I understand searching in the BASE scope is quite
cheap and that's the reason I decided to just do the search in the
ONELEVEL scope when the deleted dn is for sure a container.

While the first patch is not needed for solving this issue, it's needed for adding something to a container.

The easiest way to test would be:

  • Create a container: curl -H "Content-Type: application/json" --unix-socket /var/run/secrets.socket -XPOST http://localhost/secrets/bar/
  • Create a container inside the firstly created container: curl -H "Content-Type: application/json" --unix-socket /var/run/secrets.socket -XPOST http://localhost/secrets/bar/foo/
  • Add a secret to the container: curl -H "Content-Type: application/json" --unix-socket /var/run/secrets.socket -XPUT http://localhost/secrets/bar/foo/test -d'{"type":"simple","value":"foosecret"}'
  • Try to remove the container bar: curl -H "Content-Type: application/json" --unix-socket /var/run/secrets.socket -XDELETE http://localhost/secrets/bar/

With the patch you won't be able to remove unless it has no children. Without, you will be able to remove the container.

@fidencio fidencio force-pushed the fidencio:wip/#3167 branch 2 times, most recently from e5478ec to 616695e Sep 1, 2016

talloc_free(res);
res = NULL;

ret = ldb_search(lctx->ldb, mem_ctx, &res, dn, LDB_SCOPE_SUBTREE,

This comment has been minimized.

Copy link
@simo5

simo5 Sep 2, 2016

Contributor

Please use LDB_SCOPE_ONE, if you have any immediate children that's enough.

@fidencio fidencio force-pushed the fidencio:wip/#3167 branch from 616695e to 1763c37 Sep 2, 2016

}
}

ret = sysdb_error_to_errno(ldb_delete(lctx->ldb, dn));

This comment has been minimized.

Copy link
@simo5

simo5 Sep 8, 2016

Contributor

please split in two lines
'''
ret = ldb_delete(lctx->ldb, dn);
if (ret) {
ret = sysdb_error_to_errno(ret);
}
'''

if (ret != EOK) return ret;

if (res->count == 1) {
talloc_free(res);

This comment has been minimized.

Copy link
@simo5

simo5 Sep 8, 2016

Contributor

You should use a tmp_ctx in this function, and allocate the dn above, and all searches on it, then just talloc_free() the tmp_ctx at the end.

@fidencio fidencio force-pushed the fidencio:wip/#3167 branch from 1763c37 to 5d2e880 Sep 8, 2016

if (ret != EOK) goto done;

if (res->count > 0) {
ret = EINVAL;

This comment has been minimized.

Copy link
@simo5

simo5 Sep 9, 2016

Contributor

maybe this should be something like EACCESS or EPERM ?

This comment has been minimized.

Copy link
@simo5

simo5 Sep 9, 2016

Contributor

Actually it needs to be EMEDIUMTYPE I think.
According to https://github.com/latchset/custodia/blob/master/API.md#deleting-keys
We need to return:

  • 406 not acceptable, type unknown/not permitted

This is done in sec_errno_to_http_status():

    case EMEDIUMTYPE:
        return STATUS_406;

This comment has been minimized.

Copy link
@lslebodn

lslebodn Sep 9, 2016

Contributor

On (09/09/16 06:09), Simo Sorce wrote:

 ret = local_db_dn(mem_ctx, lctx->ldb, req_path, &dn);
  • if (ret != EOK) return ret;
  • if (ret != EOK) goto done;
  • ret = ldb_search(lctx->ldb, tmp_ctx, &res, dn, LDB_SCOPE_BASE,
  •                attrs, LOCAL_CONTAINER_FILTER);
    
  • if (ret != EOK) goto done;
  • if (res->count == 1) {
  •    ret = ldb_search(lctx->ldb, tmp_ctx, &res, dn, LDB_SCOPE_ONELEVEL,
    
  •                     attrs, NULL);
    
  •    if (ret != EOK) goto done;
    
  •    if (res->count > 0) {
    
  •        ret = EINVAL;
    

Actually it needs to be EMEDIUMTYPE I think.
Or maybe it would be better to replace EMEDIUMTYPE
with internal error code src/util/util_errors.c

strerror for EMEDIUMTYPE says "Wrong medium type"

LS

This comment has been minimized.

Copy link
@simo5

simo5 Sep 9, 2016

Contributor

open a ticket for that, it is a separte change

This comment has been minimized.

Copy link
@lslebodn

lslebodn Sep 12, 2016

Contributor

On (09/09/16 13:12), Simo Sorce wrote:

 ret = local_db_dn(mem_ctx, lctx->ldb, req_path, &dn);
  • if (ret != EOK) return ret;
  • if (ret != EOK) goto done;
  • ret = ldb_search(lctx->ldb, tmp_ctx, &res, dn, LDB_SCOPE_BASE,
  •                attrs, LOCAL_CONTAINER_FILTER);
    
  • if (ret != EOK) goto done;
  • if (res->count == 1) {
  •    ret = ldb_search(lctx->ldb, tmp_ctx, &res, dn, LDB_SCOPE_ONELEVEL,
    
  •                     attrs, NULL);
    
  •    if (ret != EOK) goto done;
    
  •    if (res->count > 0) {
    
  •        ret = EINVAL;
    

open a ticket for that, it is a separte change

We(sssd) do not require ticket for any trivial change.
I let Fabiano decide wheter he wants to do it in this patch set
or in separate PR. Maybe he does not want to do it at all.

LS

This comment has been minimized.

Copy link
@fidencio

fidencio Sep 13, 2016

Author Contributor

Simo,

On Fri, Sep 9, 2016 at 3:09 PM, Simo Sorce notifications@github.com wrote:

In src/responder/secrets/local.c:

 ret = local_db_dn(mem_ctx, lctx->ldb, req_path, &dn);
  • if (ret != EOK) return ret;
  • if (ret != EOK) goto done;
  • ret = ldb_search(lctx->ldb, tmp_ctx, &res, dn, LDB_SCOPE_BASE,
  •                attrs, LOCAL_CONTAINER_FILTER);
    
  • if (ret != EOK) goto done;
  • if (res->count == 1) {
  •    ret = ldb_search(lctx->ldb, tmp_ctx, &res, dn, LDB_SCOPE_ONELEVEL,
    
  •                     attrs, NULL);
    
  •    if (ret != EOK) goto done;
    
  •    if (res->count > 0) {
    
  •        ret = EINVAL;
    

Actually it needs to be EMEDIUMTYPE I think.
According to https://github.com/latchset/custodia/blob/master/API.md#deleting-keys
We need to return:

406 not acceptable, type unknown/not permitted

This is done in sec_errno_to_http_status():

case EMEDIUMTYPE:
    return STATUS_406;


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.

Checking the link you mentioned (thanks, btw) I do believe the proper
error to return is EEXIST, which will be translated to "409: if the
container is not empty."
Let me change this and re-submit the patch.

This comment has been minimized.

Copy link
@simo5

simo5 Sep 13, 2016

Contributor

Ooh you are right, thanks for double checking.

fidencio added some commits Aug 30, 2016

SECRETS: Search by the right type when checking containers
We've been searching for the wrong type ("simple") in
local_db_check_containers(), which always gives us a NULL result.

Let's introduce the new LOCAL_CONTAINER_FILTER and do the search for the
right type ("container") from now on.

Resolves:
https://fedorahosted.org/sssd/ticket/3137

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
SECRETS: Don't remove a container when it has children
Let's return and log an error in case the container to be removed has
children.

The approach taken introduced at least one new search in every delete
operation. As far as I understand searching in the BASE scope is quite
cheap and that's the reason I decided to just do the search in the
ONELEVEL scope when the requested to be deleted dn is for sure a
container.

Resolves:
https://fedorahosted.org/sssd/ticket/3167

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

@fidencio fidencio force-pushed the fidencio:wip/#3167 branch from 5d2e880 to 9ab6cc5 Sep 13, 2016

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Sep 14, 2016

The first patch was already pushed.

The second looks good to me visually and I wrote a simple patch to verify the fix:
jhrozek@374eff8

@jhrozek jhrozek added Accepted Pushed and removed Accepted labels Sep 14, 2016

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Sep 16, 2016

@jhrozek jhrozek closed this Sep 16, 2016

fidencio added a commit to fidencio/sssd that referenced this pull request Dec 12, 2016

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 some deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

The removal of those functions we lose some debug messages, which is the
least of problems now. Also instead of re-setting the watchdog timeout
in case of time-shift (please, see b8ceaeb for more info), let's just
re-set the timer's time.

Along with this commit, the function to teardown the watchdog has been
removed as it's not used anymore.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Dec 13, 2016

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 some deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

The removal of those functions we lose some debug messages, which is the
least of problems now. Also instead of re-setting the watchdog timeout
in case of time-shift (please, see b8ceaeb for more info), let's just
exit the let the monitor restart the process.

A proper fix for the clock screw situation should be implemented on
samba's side, by having tevent using monotonic (or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Dec 20, 2016

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and a single byte is written to this pipe (which
is an async-signal-safe operation) and the logic currently done by the
timer handler will be done inside the fd handler in a safe way.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Dec 21, 2016

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by _exit(). Personally I don't
think is worth the trouble to try to log those messages properly in this
specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Jan 3, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by _exit(). Personally I don't
think is worth the trouble to try to log those messages properly in this
specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Jan 3, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by _exit(). Personally I don't
think is worth the trouble to try to log those messages properly in this
specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Jan 3, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by kill(-getpgrp(), SIGTERM).
Personally I don't think is worth the trouble to try to log those messages
properly in this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Jan 5, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by _exit(1). Personally I don't
think is worth the trouble to try to log those messages properly in
this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Jan 9, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by kill(-getpgrp(), SIGTERM).
Personally I don't think is worth the trouble to try to log those messages
properly in this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Jan 9, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by kill(-getpgrp(), SIGTERM).
Personally I don't think is worth the trouble to try to log those messages
properly in this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Jan 10, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by kill(-getpgrp(), SIGTERM) (or
_exit(1) considering the cases where setting up the process group during
the server_setup() has failed).
Personally I don't think is worth the trouble to try to log those messages
properly in this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

jhrozek pushed a commit that referenced this pull request Jan 25, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by kill(-getpgrp(), SIGTERM) (or
_exit(1) considering the cases where setting up the process group during
the server_setup() has failed).
Personally I don't think is worth the trouble to try to log those messages
properly in this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    #1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    #2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    #3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    #4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    #5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    #6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    #7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    #8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    #9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    #10 <signal handler called>
    #11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    #12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    #13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    #14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    #15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    #16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    #17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    #18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    #19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    #20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    #21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    #22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    #23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    #24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    #25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    #26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    #27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    #28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    #29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    #30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    #31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    #32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    #33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    #34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    #35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    #36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    #37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    #38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    #39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    #40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    #41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    #42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    #43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    #44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    #45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    #46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    #47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    #48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    #49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    #50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    #51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    #52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    #53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    #54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    #55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    #56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    #57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    #58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    #59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    #60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Pavel Březina <pbrezina@redhat.com>
Reviewed-by: Jakub Hrozek <jhrozek@redhat.com>
Reviewed-by: Simo Sorce <simo@redhat.com>

jhrozek pushed a commit that referenced this pull request Jan 25, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by kill(-getpgrp(), SIGTERM) (or
_exit(1) considering the cases where setting up the process group during
the server_setup() has failed).
Personally I don't think is worth the trouble to try to log those messages
properly in this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    #1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    #2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    #3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    #4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    #5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    #6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    #7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    #8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    #9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    #10 <signal handler called>
    #11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    #12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    #13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    #14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    #15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    #16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    #17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    #18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    #19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    #20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    #21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    #22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    #23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    #24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    #25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    #26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    #27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    #28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    #29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    #30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    #31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    #32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    #33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    #34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    #35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    #36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    #37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    #38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    #39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    #40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    #41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    #42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    #43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    #44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    #45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    #46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    #47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    #48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    #49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    #50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    #51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    #52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    #53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    #54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    #55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    #56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    #57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    #58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    #59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    #60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Pavel Březina <pbrezina@redhat.com>
Reviewed-by: Jakub Hrozek <jhrozek@redhat.com>
Reviewed-by: Simo Sorce <simo@redhat.com>
(cherry picked from commit e6a5f8c)

jhrozek added a commit to jhrozek/sssd that referenced this pull request Jul 20, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by kill(-getpgrp(), SIGTERM) (or
_exit(1) considering the cases where setting up the process group during
the server_setup() has failed).
Personally I don't think is worth the trouble to try to log those messages
properly in this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    #1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    #2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    #3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    #4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    #5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    #6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Pavel Březina <pbrezina@redhat.com>
Reviewed-by: Jakub Hrozek <jhrozek@redhat.com>
Reviewed-by: Simo Sorce <simo@redhat.com>
(cherry picked from commit e6a5f8c)
(cherry picked from commit 0606a71)

jhrozek added a commit that referenced this pull request May 24, 2018

selinux_child: Allow to query sssd
The function getpwnam_r is indirectly used ins selinux_child
on few places. (in libselinux and libsemanage)

There is not any reason why we should block nss calls with sssd.
It is a child process and loop cannot be created.
(BTW it is also allowed in krb_child and proxy_child)

  #0  _nss_sss_getpwnam_r (name=0x55c0e6471a50 "user4_2", result=0x7ffe9ab0d05,
          buffer=0x55c0e64741a0 "\200é\256\177\177", buflen=1024,
          errnop=0x7f7fafbcdb08)
          at src/sss_client/nss_passwd.c:132
  #1  0x00007f7fae7ad48f in __getpwnam_r (name=name@entry=0x55c0e6471a50 "user4_2",
          resbuf=resbuf@entry=0x7ffe9ab0d050, buffer=buffer@entry=0x55c0e64741a0 "\200é\256\177\177",
          buflen=buflen@entry=1024, result=result@entry=0x7ffe9ab0d048)
          at ../nss/getXXbyYY_r.c:316
  #2  0x00007f7faeabc9e2 in get_default_gid (name=0x55c0e6471a50 "user4_2")
          at seusers.c:105
  #3  getseuserbyname (name=0x55c0e6471a50 "user4_2", r_seuser=0x7ffe9ab0d0f0,
          r_level=0x7ffe9ab0d0f8) at seusers.c:186
  #4  0x000055c0e5126d02 in seuser_needs_update (ibuf=0x55c0e64718e0)
          at src/providers/ipa/selinux_child.c:175
  #5  main (argc=<optimized out>, argv=<optimized out>)
          at src/providers/ipa/selinux_child.c:332

  #0  _nss_sss_getpwnam_r (name=0x55c0e647dda0 "user3_1", result=0x7ffe9ab0cce0,
          buffer=0x55c0e6482180 "\240AG\346\300U", buflen=1024,
          errnop=0x7f7fafbcdb08) at src/sss_client/nss_passwd.c:132
  #1  0x00007f7fae7ad48f in __getpwnam_r (name=name@entry=0x55c0e647dda0 "user3_1",
          resbuf=resbuf@entry=0x7ffe9ab0cce0, buffer=buffer@entry=0x55c0e6482180 "\240AG\346\300U",
          buflen=buflen@entry=1024, result=result@entry=0x7ffe9ab0ccd8)
          at ../nss/getXXbyYY_r.c:316
  #2  0x00007f7faece29b3 in add_user (head=head@entry=0x7ffe9ab0ce28,
          user=user@entry=0x55c0e64b5930, name=name@entry=0x55c0e647dda0 "user3_1",
          sename=sename@entry=0x55c0e647bdc0 "staff_u",
          selogin=selogin@entry=0x55c0e647dda0 "user3_1",
          s=<optimized out>) at genhomedircon.c:999
  #3  0x00007f7faece334c in get_users (errors=<synthetic pointer>,
          s=0x7ffe9ab0ce70) at genhomedircon.c:1167
  #4  write_gen_home_dir_context (homedir_context_tpl=0x55c0e647d3d0,
          user_context_tpl=0x55c0e647a870, username_context_tpl=0x0,
          out=0x55c0e646fa80, s=0x7ffe9ab0ce70) at genhomedircon.c:1205
  #5  write_context_file (out=<optimized out>, s=0x7ffe9ab0ce70)
          at genhomedircon.c:1317
  #6  semanage_genhomedircon (sh=sh@entry=0x55c0e6476380, policydb=<optimized out>,
          usepasswd=<optimized out>, ignoredirs=<optimized out>)
          at genhomedircon.c:1382
  #7  0x00007f7faecdfb95 in semanage_direct_commit (sh=0x55c0e6476380)
          at direct_api.c:1575
  #8  0x00007f7faece4d6d in semanage_commit (sh=0x55c0e6476380) at handle.c:426
  #9  0x000055c0e5127cf8 in sss_set_seuser (login_name=0x55c0e6471a5 "user4_2",
          seuser_name=0x55c0e6471960 "staff_u", mls=<optimized out>)
          at src/util/sss_semanage.c:335
  #10 0x000055c0e5126eea in sc_set_seuser (mls=0x55c0e64719d0 "s0-s0:c0.c1023",
          seuser_name=0x55c0e6471960 "staff_u",
          login_name=0x55c0e6471a50 "user4_2")
          at src/providers/ipa/selinux_child.c:162
  #11 main (argc=<optimized out>, argv=<optimized out>)
          at src/providers/ipa/selinux_child.c:334

Merges: https://pagure.io/SSSD/sssd/pull-request/3732

Reviewed-by: Jakub Hrozek <jhrozek@redhat.com>

lslebodn pushed a commit to lslebodn/sssd that referenced this pull request Jan 23, 2019

Lukas Slebodnik
pam-srv-tests: Do not depend on system configuration
Few unit tests failed because of system configuration
e.g.

[ RUN      ] test_pam_preauth_cert_nomatch
(Wed Jan 23 21:09:40:460969 2019) [sssd] [child_sig_handler] (0x0020): child [6367] failed with status [1].
[  ERROR   ] --- sss_dp_get_account_recv() has remaining non-returned values.
../sssd/src/tests/cmocka/common_mock_resp_dp.c:122: note: remaining item was declared here
../sssd/src/tests/cmocka/common_mock_resp_dp.c:123: note: remaining item was declared here
../sssd/src/tests/cmocka/common_mock_resp_dp.c:124: note: remaining item was declared here
../sssd/src/tests/cmocka/common_mock_resp_dp.c:126: note: remaining item was declared here

[  FAILED  ] test_pam_preauth_cert_nomatch

This patch create custom user configuration for p11-kit ($HOME/.config/pkcs11/pkcs11.conf)
user configuration just disable default/global configuration for p11-kit
and thus no extra modules are loaded by p11_child.

sh$ P11_KIT_DEBUG=all ./p11_child --logger=stderr --debug-microseconds=0 --debug-timestamps=0 --debug-level=0xf7f0 --pre --verify no_ocsp --nssdb sql:tp_pam_srv_tests-test_pam_srv
[[sssd[p11_child[7894]]]] [main] (0x0400): p11_child started.
[[sssd[p11_child[7894]]]] [main] (0x2000): Running in [pre-auth] mode.
[[sssd[p11_child[7894]]]] [main] (0x2000): Running with effective IDs: [1000][1000].
[[sssd[p11_child[7894]]]] [main] (0x2000): Running with real IDs [1000][1000].
[[sssd[p11_child[7894]]]] [parse_cert_verify_opts] (0x4000): Found 'no_ocsp' option, disabling OCSP.
(p11-kit:7894) p11_library_init_impl: initializing library
(p11-kit:7894) _p11_conf_parse_file: reading config file: /etc/pkcs11/pkcs11.conf
(p11-kit:7894) _p11_conf_parse_file: config file does not exist
(p11-kit:7894) _p11_conf_parse_file: reading config file: /home/alcik/.config/pkcs11/pkcs11.conf
(p11-kit:7894) _p11_conf_parse_file: config file does not exist
(p11-kit:7894) load_configs_from_directory: loading module configs in: /home/alcik/.config/pkcs11/modules
(p11-kit:7894) load_configs_from_directory: module configs do not exist
(p11-kit:7894) load_configs_from_directory: loading module configs in: /etc/pkcs11/modules
(p11-kit:7894) load_configs_from_directory: loading module configs in: /usr/share/p11-kit/modules
(p11-kit:7894) _p11_conf_parse_file: reading config file: /usr/share/p11-kit/modules/p11-kit-trust.module
(p11-kit:7894) _p11_conf_parse_file: config value: module: p11-kit-trust.so
(p11-kit:7894) _p11_conf_parse_file: config value: priority: 1
(p11-kit:7894) _p11_conf_parse_file: config value: trust-policy: yes
(p11-kit:7894) _p11_conf_parse_file: config value: x-trust-lookup: pkcs11:library-description=PKCS%2311%20Kit%20Trust%20Module
(p11-kit:7894) _p11_conf_parse_file: config value: disable-in: p11-kit-proxy
(p11-kit:7894) _p11_conf_parse_file: reading config file: /usr/share/p11-kit/modules/softhsm2.module
(p11-kit:7894) _p11_conf_parse_file: config value: module: /usr/lib64/pkcs11/libsofthsm2.so
(p11-kit:7894) _p11_conf_parse_file: reading config file: /usr/share/p11-kit/modules/opensc.module
(p11-kit:7894) _p11_conf_parse_file: config value: module: opensc-pkcs11.so
(p11-kit:7894) load_module_from_file_inlock: loading module softhsm2 from path: /usr/lib64/pkcs11/libsofthsm2.so
(p11-kit:7894) dlopen_and_get_function_list: opened module: /usr/lib64/pkcs11/libsofthsm2.so
(p11-kit:7894) is_module_enabled_unlocked: enabled module 'p11-kit-trust' running in 'lt-p11_child'
(p11-kit:7894) load_module_from_file_inlock: module path is relative, loading from: /usr/lib64/pkcs11
(p11-kit:7894) load_module_from_file_inlock: loading module p11-kit-trust from path: /usr/lib64/pkcs11/p11-kit-trust.so
(p11-kit:7894) p11_library_init_impl: initializing library
(p11-kit:7894) dlopen_and_get_function_list: opened module: /usr/lib64/pkcs11/p11-kit-trust.so
(p11-kit:7894) load_module_from_file_inlock: module path is relative, loading from: /usr/lib64/pkcs11
(p11-kit:7894) load_module_from_file_inlock: loading module opensc from path: /usr/lib64/pkcs11/opensc-pkcs11.so
(p11-kit:7894) dlopen_and_get_function_list: opened module: /usr/lib64/pkcs11/opensc-pkcs11.so
(p11-kit:7894) is_module_enabled_unlocked: disabled module 'p11-kit-trust' running in 'lt-p11_child'
(p11-kit:7894) proxy_C_Initialize: in
(p11-kit:7894) managed_C_Initialize: in
(p11-kit:7894) initialize_module_inlock_reentrant: C_Initialize: calling
(p11-kit:7894) initialize_module_inlock_reentrant: C_Initialize: result: 0
(p11-kit:7894) managed_C_Initialize: out: 0
(p11-kit:7894) managed_C_Initialize: in
(p11-kit:7894) initialize_module_inlock_reentrant: C_Initialize: calling
(p11-kit:7894) initialize_module_inlock_reentrant: C_Initialize: result: 5
(p11-kit:7894) managed_C_Initialize: out: 5
(p11-kit:7894) message: softhsm2: module failed to initialize, skipping: Internal error
(p11-kit:7894) proxy_C_Initialize: out: 0
[[sssd[p11_child[7894]]]] [do_card] (0x4000): Default Module List:
[[sssd[p11_child[7894]]]] [do_card] (0x4000): common name: [NSS Internal PKCS SSSD#11 Module].
[[sssd[p11_child[7894]]]] [do_card] (0x4000): dll name: [(null)].
[[sssd[p11_child[7894]]]] [do_card] (0x4000): common name: [p11-kit-proxy].
[[sssd[p11_child[7894]]]] [do_card] (0x4000): dll name: [p11-kit-proxy.so].
[[sssd[p11_child[7894]]]] [do_card] (0x4000): Dead Module List:
[[sssd[p11_child[7894]]]] [do_card] (0x4000): DB Module List:
[[sssd[p11_child[7894]]]] [do_card] (0x4000): common name: [NSS Internal Module].
[[sssd[p11_child[7894]]]] [do_card] (0x4000): dll name: [(null)].
[[sssd[p11_child[7894]]]] [do_card] (0x4000): common name: [Policy File].
[[sssd[p11_child[7894]]]] [do_card] (0x4000): dll name: [(null)].
[[sssd[p11_child[7894]]]] [do_card] (0x4000): Description [NSS Internal Cryptographic Services                             Mozilla Foundation                   ] Manufacturer [Mozilla Foundation                 ] flags [9] removable [false] token present [true].
[[sssd[p11_child[7894]]]] [do_card] (0x4000): Description [NSS User Private Key and Certificate Services                   Mozilla Foundation                   ] Manufacturer [Mozilla Foundation                 ] flags [9] removable [false] token present [true].
[[sssd[p11_child[7894]]]] [do_card] (0x4000): Description [Alcor Micro AU9560 00 00                                        Generic                         ] Manufacturer [Generic                         ] flags [6] removable [true] token present [false].
[[sssd[p11_child[7894]]]] [do_card] (0x4000): Token not present.
[[sssd[p11_child[7894]]]] [main] (0x0040): do_work failed.
[[sssd[p11_child[7894]]]] [main] (0x0020): p11_child failed!
(p11-kit:7894) p11_kit_modules_release: in
(p11-kit:7894) p11_kit_modules_release: out
(p11-kit:7894) uninit_common: uninitializing library
(p11-kit:7894) uninit_common: uninitializing library

lslebodn pushed a commit to lslebodn/sssd that referenced this pull request Jun 28, 2019

Lukas Slebodnik
PROXY: Return data in output parameter if everything is OK
The function remove_duplicate_group_members might return EOK also in the middle
of function but return parameter was not set with right data.
Processing continued in the function save_group but there was a
dereference of NULL pointer.

Introduced in: https://pagure.io/SSSD/sssd/issue/3931

Crash:
  (gdb) bt
  #0  0x00007fb4ce4a9ac5 in save_group (sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, grp=grp@entry=0x55c9a0f370f0, real_name=0x55c9a0f47340 "nobody@ldap",
      alias=alias@entry=0x0) at src/providers/proxy/proxy_id.c:748
  #1  0x00007fb4ce4aa600 in get_gr_gid (mem_ctx=mem_ctx@entry=0x55c9a0f38be0, sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, gid=99, now=<optimized out>,
      ctx=<optimized out>) at src/providers/proxy/proxy_id.c:1160
  SSSD#2  0x00007fb4ce4ac9e5 in get_initgr_groups_process (pwd=0x55c9a0f384a0, pwd=0x55c9a0f384a0, dom=0x55c9a0efb420, sysdb=0x55c9a0efb230, ctx=0x55c9a0f048e0, memctx=0x55c9a0f38be0)
      at src/providers/proxy/proxy_id.c:1553
  SSSD#3  get_initgr (i_name=<optimized out>, dom=0x55c9a0efb420, sysdb=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1461
  SSSD#4  proxy_account_info (domain=0x55c9a0efb420, be_ctx=<optimized out>, data=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1659
  SSSD#5  proxy_account_info_handler_send (mem_ctx=<optimized out>, id_ctx=0x55c9a0f048e0, data=<optimized out>, params=0x55c9a0f39790) at src/providers/proxy/proxy_id.c:1758
  SSSD#6  0x000055c99fc67677 in file_dp_request (_dp_req=<synthetic pointer>, req=0x55c9a0f39470, request_data=<optimized out>, dp_flags=1, method=DPM_ACCOUNT_HANDLER, target=DPT_ID,
      name=<optimized out>, domainname=0x55c9a0f39190 "LDAP", provider=0x55c9a0efe0e0, mem_ctx=<optimized out>) at src/providers/data_provider/dp_request.c:250
  SSSD#7  dp_req_send (mem_ctx=0x55c9a0f37b60, provider=provider@entry=0x55c9a0efe0e0, domain=domain@entry=0x55c9a0f39190 "LDAP", name=<optimized out>, target=target@entry=DPT_ID,
      method=method@entry=DPM_ACCOUNT_HANDLER, dp_flags=dp_flags@entry=1, request_data=0x55c9a0f37c00, _request_name=0x55c9a0f37b60) at src/providers/data_provider/dp_request.c:295
  SSSD#8  0x000055c99fc6a132 in dp_get_account_info_send (mem_ctx=<optimized out>, ev=0x55c9a0eddbc0, sbus_req=<optimized out>, provider=0x55c9a0efe0e0, dp_flags=1,
      entry_type=<optimized out>, filter=0x55c9a0f358d0 "name=nobody@ldap", domain=0x55c9a0f39190 "LDAP", extra=0x55c9a0f354a0 "") at src/providers/data_provider/dp_target_id.c:528
  SSSD#9  0x00007fb4da35265b in _sbus_sss_invoke_in_uusss_out_qus_step (ev=0x55c9a0eddbc0, te=<optimized out>, tv=..., private_data=<optimized out>) at src/sss_iface/sbus_sss_invokers.c:2847
  SSSD#10 0x00007fb4d9cfb1cf in tevent_common_invoke_timer_handler () from /lib64/libtevent.so.0
  SSSD#11 0x00007fb4d9cfb339 in tevent_common_loop_timer_delay () from /lib64/libtevent.so.0
  SSSD#12 0x00007fb4d9cfc2f9 in epoll_event_loop_once () from /lib64/libtevent.so.0
  SSSD#13 0x00007fb4d9cfa7b7 in std_event_loop_once () from /lib64/libtevent.so.0
  SSSD#14 0x00007fb4d9cf5b5d in _tevent_loop_once () from /lib64/libtevent.so.0
  SSSD#15 0x00007fb4d9cf5d8b in tevent_common_loop_wait () from /lib64/libtevent.so.0
  SSSD#16 0x00007fb4d9cfa757 in std_event_loop_wait () from /lib64/libtevent.so.0
  SSSD#17 0x00007fb4dd955ac3 in server_loop (main_ctx=0x55c9a0edf090) at src/util/server.c:724
  SSSD#18 0x000055c99fc59760 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:747
  (gdb) l
  (gdb) bt
  #0  0x00007fb4ce4a9ac5 in save_group (sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, grp=grp@entry=0x55c9a0f370f0, real_name=0x55c9a0f47340 "nobody@ldap",
      alias=alias@entry=0x0) at src/providers/proxy/proxy_id.c:748
  #1  0x00007fb4ce4aa600 in get_gr_gid (mem_ctx=mem_ctx@entry=0x55c9a0f38be0, sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, gid=99, now=<optimized out>,
      ctx=<optimized out>) at src/providers/proxy/proxy_id.c:1160
  SSSD#2  0x00007fb4ce4ac9e5 in get_initgr_groups_process (pwd=0x55c9a0f384a0, pwd=0x55c9a0f384a0, dom=0x55c9a0efb420, sysdb=0x55c9a0efb230, ctx=0x55c9a0f048e0, memctx=0x55c9a0f38be0)
      at src/providers/proxy/proxy_id.c:1553
  SSSD#3  get_initgr (i_name=<optimized out>, dom=0x55c9a0efb420, sysdb=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1461
  SSSD#4  proxy_account_info (domain=0x55c9a0efb420, be_ctx=<optimized out>, data=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1659
  SSSD#5  proxy_account_info_handler_send (mem_ctx=<optimized out>, id_ctx=0x55c9a0f048e0, data=<optimized out>, params=0x55c9a0f39790) at src/providers/proxy/proxy_id.c:1758
  SSSD#6  0x000055c99fc67677 in file_dp_request (_dp_req=<synthetic pointer>, req=0x55c9a0f39470, request_data=<optimized out>, dp_flags=1, method=DPM_ACCOUNT_HANDLER, target=DPT_ID,
      name=<optimized out>, domainname=0x55c9a0f39190 "LDAP", provider=0x55c9a0efe0e0, mem_ctx=<optimized out>) at src/providers/data_provider/dp_request.c:250
  SSSD#7  dp_req_send (mem_ctx=0x55c9a0f37b60, provider=provider@entry=0x55c9a0efe0e0, domain=domain@entry=0x55c9a0f39190 "LDAP", name=<optimized out>, target=target@entry=DPT_ID,
      method=method@entry=DPM_ACCOUNT_HANDLER, dp_flags=dp_flags@entry=1, request_data=0x55c9a0f37c00, _request_name=0x55c9a0f37b60) at src/providers/data_provider/dp_request.c:295
  SSSD#8  0x000055c99fc6a132 in dp_get_account_info_send (mem_ctx=<optimized out>, ev=0x55c9a0eddbc0, sbus_req=<optimized out>, provider=0x55c9a0efe0e0, dp_flags=1,
      entry_type=<optimized out>, filter=0x55c9a0f358d0 "name=nobody@ldap", domain=0x55c9a0f39190 "LDAP", extra=0x55c9a0f354a0 "") at src/providers/data_provider/dp_target_id.c:528
  SSSD#9  0x00007fb4da35265b in _sbus_sss_invoke_in_uusss_out_qus_step (ev=0x55c9a0eddbc0, te=<optimized out>, tv=..., private_data=<optimized out>) at src/sss_iface/sbus_sss_invokers.c:2847
  SSSD#10 0x00007fb4d9cfb1cf in tevent_common_invoke_timer_handler () from /lib64/libtevent.so.0
  SSSD#11 0x00007fb4d9cfb339 in tevent_common_loop_timer_delay () from /lib64/libtevent.so.0
  SSSD#12 0x00007fb4d9cfc2f9 in epoll_event_loop_once () from /lib64/libtevent.so.0
  SSSD#13 0x00007fb4d9cfa7b7 in std_event_loop_once () from /lib64/libtevent.so.0
  SSSD#14 0x00007fb4d9cf5b5d in _tevent_loop_once () from /lib64/libtevent.so.0
  SSSD#15 0x00007fb4d9cf5d8b in tevent_common_loop_wait () from /lib64/libtevent.so.0
  SSSD#16 0x00007fb4d9cfa757 in std_event_loop_wait () from /lib64/libtevent.so.0
  SSSD#17 0x00007fb4dd955ac3 in server_loop (main_ctx=0x55c9a0edf090) at src/util/server.c:724
  SSSD#18 0x000055c99fc59760 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:747
  (gdb) l
  733         ret = remove_duplicate_group_members(tmp_ctx, grp, &ngroup);
  734         if (ret != EOK) {
  735             DEBUG(SSSDBG_CRIT_FAILURE, "Failed to remove duplicate group member     s\n");
  736             goto done;
  737         }
  738
  739         DEBUG_GR_MEM(SSSDBG_TRACE_LIBS, ngroup);
  740
  741         ret = sysdb_transaction_start(sysdb);
  742         if (ret != EOK) {
  743             DEBUG(SSSDBG_CRIT_FAILURE, "Failed to start transaction\n");
  744             goto done;
  745         }
  746         in_transaction = true;
  747
  748         if (ngroup->gr_mem && ngroup->gr_mem[0]) {
  749             attrs = sysdb_new_attrs(tmp_ctx);
  750             if (!attrs) {
  751                 DEBUG(SSSDBG_CRIT_FAILURE, "Allocation error?!\n");
  752                 ret = ENOMEM;
  (gdb) p ngroup
  $1 = (struct group *) 0x0
  743             DEBUG(SSSDBG_CRIT_FAILURE, "Failed to start transaction\n");
  744             goto done;
  745         }
  746         in_transaction = true;
  747
  748         if (ngroup->gr_mem && ngroup->gr_mem[0]) {
  749             attrs = sysdb_new_attrs(tmp_ctx);
  750             if (!attrs) {
  751                 DEBUG(SSSDBG_CRIT_FAILURE, "Allocation error?!\n");
  752                 ret = ENOMEM;
  (gdb) p ngroup
  $1 = (struct group *) 0x0

Merges: https://pagure.io/SSSD/sssd/pull-request/4036

Resolves:
https://pagure.io/SSSD/sssd/issue/XXXX

jhrozek added a commit that referenced this pull request Jun 28, 2019

PROXY: Return data in output parameter if everything is OK
The function remove_duplicate_group_members might return EOK also in the middle
of function but return parameter was not set with right data.
Processing continued in the function save_group but there was a
dereference of NULL pointer.

Introduced in: https://pagure.io/SSSD/sssd/issue/3931

Crash:
  (gdb) bt
  #0  0x00007fb4ce4a9ac5 in save_group (sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, grp=grp@entry=0x55c9a0f370f0, real_name=0x55c9a0f47340 "nobody@ldap",
      alias=alias@entry=0x0) at src/providers/proxy/proxy_id.c:748
  #1  0x00007fb4ce4aa600 in get_gr_gid (mem_ctx=mem_ctx@entry=0x55c9a0f38be0, sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, gid=99, now=<optimized out>,
      ctx=<optimized out>) at src/providers/proxy/proxy_id.c:1160
  #2  0x00007fb4ce4ac9e5 in get_initgr_groups_process (pwd=0x55c9a0f384a0, pwd=0x55c9a0f384a0, dom=0x55c9a0efb420, sysdb=0x55c9a0efb230, ctx=0x55c9a0f048e0, memctx=0x55c9a0f38be0)
      at src/providers/proxy/proxy_id.c:1553
  #3  get_initgr (i_name=<optimized out>, dom=0x55c9a0efb420, sysdb=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1461
  #4  proxy_account_info (domain=0x55c9a0efb420, be_ctx=<optimized out>, data=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1659
  #5  proxy_account_info_handler_send (mem_ctx=<optimized out>, id_ctx=0x55c9a0f048e0, data=<optimized out>, params=0x55c9a0f39790) at src/providers/proxy/proxy_id.c:1758
  #6  0x000055c99fc67677 in file_dp_request (_dp_req=<synthetic pointer>, req=0x55c9a0f39470, request_data=<optimized out>, dp_flags=1, method=DPM_ACCOUNT_HANDLER, target=DPT_ID,
      name=<optimized out>, domainname=0x55c9a0f39190 "LDAP", provider=0x55c9a0efe0e0, mem_ctx=<optimized out>) at src/providers/data_provider/dp_request.c:250
  #7  dp_req_send (mem_ctx=0x55c9a0f37b60, provider=provider@entry=0x55c9a0efe0e0, domain=domain@entry=0x55c9a0f39190 "LDAP", name=<optimized out>, target=target@entry=DPT_ID,
      method=method@entry=DPM_ACCOUNT_HANDLER, dp_flags=dp_flags@entry=1, request_data=0x55c9a0f37c00, _request_name=0x55c9a0f37b60) at src/providers/data_provider/dp_request.c:295
  #8  0x000055c99fc6a132 in dp_get_account_info_send (mem_ctx=<optimized out>, ev=0x55c9a0eddbc0, sbus_req=<optimized out>, provider=0x55c9a0efe0e0, dp_flags=1,
      entry_type=<optimized out>, filter=0x55c9a0f358d0 "name=nobody@ldap", domain=0x55c9a0f39190 "LDAP", extra=0x55c9a0f354a0 "") at src/providers/data_provider/dp_target_id.c:528
  #9  0x00007fb4da35265b in _sbus_sss_invoke_in_uusss_out_qus_step (ev=0x55c9a0eddbc0, te=<optimized out>, tv=..., private_data=<optimized out>) at src/sss_iface/sbus_sss_invokers.c:2847
  #10 0x00007fb4d9cfb1cf in tevent_common_invoke_timer_handler () from /lib64/libtevent.so.0
  #11 0x00007fb4d9cfb339 in tevent_common_loop_timer_delay () from /lib64/libtevent.so.0
  #12 0x00007fb4d9cfc2f9 in epoll_event_loop_once () from /lib64/libtevent.so.0
  #13 0x00007fb4d9cfa7b7 in std_event_loop_once () from /lib64/libtevent.so.0
  #14 0x00007fb4d9cf5b5d in _tevent_loop_once () from /lib64/libtevent.so.0
  #15 0x00007fb4d9cf5d8b in tevent_common_loop_wait () from /lib64/libtevent.so.0
  #16 0x00007fb4d9cfa757 in std_event_loop_wait () from /lib64/libtevent.so.0
  #17 0x00007fb4dd955ac3 in server_loop (main_ctx=0x55c9a0edf090) at src/util/server.c:724
  #18 0x000055c99fc59760 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:747
  (gdb) l
  (gdb) bt
  #0  0x00007fb4ce4a9ac5 in save_group (sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, grp=grp@entry=0x55c9a0f370f0, real_name=0x55c9a0f47340 "nobody@ldap",
      alias=alias@entry=0x0) at src/providers/proxy/proxy_id.c:748
  #1  0x00007fb4ce4aa600 in get_gr_gid (mem_ctx=mem_ctx@entry=0x55c9a0f38be0, sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, gid=99, now=<optimized out>,
      ctx=<optimized out>) at src/providers/proxy/proxy_id.c:1160
  #2  0x00007fb4ce4ac9e5 in get_initgr_groups_process (pwd=0x55c9a0f384a0, pwd=0x55c9a0f384a0, dom=0x55c9a0efb420, sysdb=0x55c9a0efb230, ctx=0x55c9a0f048e0, memctx=0x55c9a0f38be0)
      at src/providers/proxy/proxy_id.c:1553
  #3  get_initgr (i_name=<optimized out>, dom=0x55c9a0efb420, sysdb=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1461
  #4  proxy_account_info (domain=0x55c9a0efb420, be_ctx=<optimized out>, data=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1659
  #5  proxy_account_info_handler_send (mem_ctx=<optimized out>, id_ctx=0x55c9a0f048e0, data=<optimized out>, params=0x55c9a0f39790) at src/providers/proxy/proxy_id.c:1758
  #6  0x000055c99fc67677 in file_dp_request (_dp_req=<synthetic pointer>, req=0x55c9a0f39470, request_data=<optimized out>, dp_flags=1, method=DPM_ACCOUNT_HANDLER, target=DPT_ID,
      name=<optimized out>, domainname=0x55c9a0f39190 "LDAP", provider=0x55c9a0efe0e0, mem_ctx=<optimized out>) at src/providers/data_provider/dp_request.c:250
  #7  dp_req_send (mem_ctx=0x55c9a0f37b60, provider=provider@entry=0x55c9a0efe0e0, domain=domain@entry=0x55c9a0f39190 "LDAP", name=<optimized out>, target=target@entry=DPT_ID,
      method=method@entry=DPM_ACCOUNT_HANDLER, dp_flags=dp_flags@entry=1, request_data=0x55c9a0f37c00, _request_name=0x55c9a0f37b60) at src/providers/data_provider/dp_request.c:295
  #8  0x000055c99fc6a132 in dp_get_account_info_send (mem_ctx=<optimized out>, ev=0x55c9a0eddbc0, sbus_req=<optimized out>, provider=0x55c9a0efe0e0, dp_flags=1,
      entry_type=<optimized out>, filter=0x55c9a0f358d0 "name=nobody@ldap", domain=0x55c9a0f39190 "LDAP", extra=0x55c9a0f354a0 "") at src/providers/data_provider/dp_target_id.c:528
  #9  0x00007fb4da35265b in _sbus_sss_invoke_in_uusss_out_qus_step (ev=0x55c9a0eddbc0, te=<optimized out>, tv=..., private_data=<optimized out>) at src/sss_iface/sbus_sss_invokers.c:2847
  #10 0x00007fb4d9cfb1cf in tevent_common_invoke_timer_handler () from /lib64/libtevent.so.0
  #11 0x00007fb4d9cfb339 in tevent_common_loop_timer_delay () from /lib64/libtevent.so.0
  #12 0x00007fb4d9cfc2f9 in epoll_event_loop_once () from /lib64/libtevent.so.0
  #13 0x00007fb4d9cfa7b7 in std_event_loop_once () from /lib64/libtevent.so.0
  #14 0x00007fb4d9cf5b5d in _tevent_loop_once () from /lib64/libtevent.so.0
  #15 0x00007fb4d9cf5d8b in tevent_common_loop_wait () from /lib64/libtevent.so.0
  #16 0x00007fb4d9cfa757 in std_event_loop_wait () from /lib64/libtevent.so.0
  #17 0x00007fb4dd955ac3 in server_loop (main_ctx=0x55c9a0edf090) at src/util/server.c:724
  #18 0x000055c99fc59760 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:747
  (gdb) l
  733         ret = remove_duplicate_group_members(tmp_ctx, grp, &ngroup);
  734         if (ret != EOK) {
  735             DEBUG(SSSDBG_CRIT_FAILURE, "Failed to remove duplicate group member     s\n");
  736             goto done;
  737         }
  738
  739         DEBUG_GR_MEM(SSSDBG_TRACE_LIBS, ngroup);
  740
  741         ret = sysdb_transaction_start(sysdb);
  742         if (ret != EOK) {
  743             DEBUG(SSSDBG_CRIT_FAILURE, "Failed to start transaction\n");
  744             goto done;
  745         }
  746         in_transaction = true;
  747
  748         if (ngroup->gr_mem && ngroup->gr_mem[0]) {
  749             attrs = sysdb_new_attrs(tmp_ctx);
  750             if (!attrs) {
  751                 DEBUG(SSSDBG_CRIT_FAILURE, "Allocation error?!\n");
  752                 ret = ENOMEM;
  (gdb) p ngroup
  $1 = (struct group *) 0x0
  743             DEBUG(SSSDBG_CRIT_FAILURE, "Failed to start transaction\n");
  744             goto done;
  745         }
  746         in_transaction = true;
  747
  748         if (ngroup->gr_mem && ngroup->gr_mem[0]) {
  749             attrs = sysdb_new_attrs(tmp_ctx);
  750             if (!attrs) {
  751                 DEBUG(SSSDBG_CRIT_FAILURE, "Allocation error?!\n");
  752                 ret = ENOMEM;
  (gdb) p ngroup
  $1 = (struct group *) 0x0

Merges: https://pagure.io/SSSD/sssd/pull-request/4036

Resolves:
https://pagure.io/SSSD/sssd/issue/4037

Reviewed-by: Jakub Hrozek <jhrozek@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.