Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UTIL: Remove support to libini older than 1.0.0 #10

Closed
wants to merge 1 commit into from

Conversation

Projects
None yet
3 participants
@fidencio
Copy link
Contributor

commented Sep 1, 2016

libini 1.0.0 is part of ding-libs 0.3.0 and has been around since 2013.
Even the old systems that we have to support already have a newer
version of the library. RHEL6, for instance, has ding-libs 0.4.0 which
provides libini 1.1.0.

By removing this code we also can stop depending on libcollection.

Signed-off-by: Fabiano Fidêncio fidencio@redhat.com

UTIL: Remove support to libini older than 1.0.0
libini 1.0.0 is part of ding-libs 0.3.0 and has been around since 2013.
Even the old systems that we have to support already have a newer
version of the library. RHEL6, for instance, has ding-libs 0.4.0 which
provides libini 1.1.0.

By removing this code we also can stop depending on libcollection.

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

@fidencio fidencio force-pushed the fidencio:wip/cleanup_sss_ini branch from dc5ac90 to 331a48c Sep 1, 2016

@lslebodn

This comment has been minimized.

Copy link
Contributor

commented Sep 1, 2016

On (31/08/16 23:18), fidencio wrote:

libini 1.0.0 is part of ding-libs 0.3.0 and has been around since 2013.
Even the old systems that we have to support already have a newer
version of the library. RHEL6, for instance, has ding-libs 0.4.0 which
provides libinit 1.1.0.

By removing this code we also can stop depending on libcollection.

Signed-off-by: Fabiano Fidêncio fidencio@redhat.com
You can view, comment on, or merge this pull request online at:

#10

-- Commit Summary --

  • UTIL: Remove support to libini older than 1.0.0

-- File Changes --

M configure.ac (1)
M contrib/ci/deps.sh (1)
M contrib/sssd.spec.in (1)
D src/external/libcollection.m4 (9)
M src/util/sss_ini.c (97)

-- Patch Links --

https://github.com/SSSD/sssd/pull/10.patch
https://github.com/SSSD/sssd/pull/10.diff

OpenSUSE LEAP has just ding-libs 0.3.0.1 in official repositories.
http://software.opensuse.org/package/ding-libs?search_term=ding-libs

We try to be a nice upstream.

Older version of ding-libs (<0.4) are supported for limited functionality.
e.g.
sssd without IPA and AD provider. (configure time option --without-samba)

If we want to replace augeas with libini_config-1.2 then ding-libs (<0.4)
will not be supported with infopipe responder (--with-infopipe)

LS

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Sep 1, 2016

On Thu, Sep 1, 2016 at 11:54 AM, lslebodn notifications@github.com wrote:

On (31/08/16 23:18), fidencio wrote:

libini 1.0.0 is part of ding-libs 0.3.0 and has been around since 2013.
Even the old systems that we have to support already have a newer
version of the library. RHEL6, for instance, has ding-libs 0.4.0 which
provides libinit 1.1.0.

By removing this code we also can stop depending on libcollection.

Signed-off-by: Fabiano Fidêncio fidencio@redhat.com
You can view, comment on, or merge this pull request online at:

#10

-- Commit Summary --

  • UTIL: Remove support to libini older than 1.0.0

-- File Changes --

M configure.ac (1)
M contrib/ci/deps.sh (1)
M contrib/sssd.spec.in (1)
D src/external/libcollection.m4 (9)
M src/util/sss_ini.c (97)

-- Patch Links --

https://github.com/SSSD/sssd/pull/10.patch
https://github.com/SSSD/sssd/pull/10.diff

OpenSUSE LEAP has just ding-libs 0.3.0.1 in official repositories.
http://software.opensuse.org/package/ding-libs?search_term=ding-libs

Yep. OpenSUSE LEAP has 0.3.0.1, Debian Stable (Jessie) has 0.4.0,
latest Ubuntu LTS has 0.5.0.
And all of them would be able to build SSSD with my patch without any issues.

We try to be a nice upstream.

And we are, no? :-)

Older version of ding-libs (<0.4) are supported for limited functionality.
e.g.
sssd without IPA and AD provider. (configure time option --without-samba)

If we want to replace augeas with libini_config-1.2 then ding-libs (<0.4)
will not be supported with infopipe responder (--with-infopipe)

This is not exactly related to this patch. Anyways, I don't see how
having this code ding-libs (<0.3.0) would solve the augeas
replacement.

Best Regards,

Fabiano Fidêncio

@lslebodn

This comment has been minimized.

Copy link
Contributor

commented Sep 1, 2016

On (01/09/16 03:21), fidencio wrote:

On Thu, Sep 1, 2016 at 11:54 AM, lslebodn notifications@github.com wrote:

On (31/08/16 23:18), fidencio wrote:

libini 1.0.0 is part of ding-libs 0.3.0 and has been around since 2013.
Even the old systems that we have to support already have a newer
version of the library. RHEL6, for instance, has ding-libs 0.4.0 which
provides libinit 1.1.0.

By removing this code we also can stop depending on libcollection.

Signed-off-by: Fabiano Fidêncio fidencio@redhat.com
You can view, comment on, or merge this pull request online at:

#10

-- Commit Summary --

  • UTIL: Remove support to libini older than 1.0.0

-- File Changes --

M configure.ac (1)
M contrib/ci/deps.sh (1)
M contrib/sssd.spec.in (1)
D src/external/libcollection.m4 (9)
M src/util/sss_ini.c (97)

-- Patch Links --

https://github.com/SSSD/sssd/pull/10.patch
https://github.com/SSSD/sssd/pull/10.diff

OpenSUSE LEAP has just ding-libs 0.3.0.1 in official repositories.
http://software.opensuse.org/package/ding-libs?search_term=ding-libs

Yep. OpenSUSE LEAP has 0.3.0.1, Debian Stable (Jessie) has 0.4.0,
latest Ubuntu LTS has 0.5.0.
And all of them would be able to build SSSD with my patch without any issues.

OpenSUSE LEAP was just an example because.
It is quite new distribution and has very old version of libini_config (1.0)
and we optionaly use some features from libini_config-1.1 1.2 and 1.3
Debian old_stable does not have libini_config.so.5 (libini-config-1.x)

I would prefer if limited version of sssd (ldap + krb5 provider)
could be compiled almost anywhere. (even old distributions)

This is a reason why we have abstraction for different versions of
libini_config in src/util/sss_ini.c

If we decide to drop support for libini_config < 1.0
then we can completely remove sssd abstraction in src/util/sss_ini.c

LS

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Sep 1, 2016

On Thu, Sep 1, 2016 at 12:39 PM, lslebodn notifications@github.com wrote:

On (01/09/16 03:21), fidencio wrote:

On Thu, Sep 1, 2016 at 11:54 AM, lslebodn notifications@github.com
wrote:

On (31/08/16 23:18), fidencio wrote:

libini 1.0.0 is part of ding-libs 0.3.0 and has been around since 2013.
Even the old systems that we have to support already have a newer
version of the library. RHEL6, for instance, has ding-libs 0.4.0 which
provides libinit 1.1.0.

By removing this code we also can stop depending on libcollection.

Signed-off-by: Fabiano Fidêncio fidencio@redhat.com
You can view, comment on, or merge this pull request online at:

#10

-- Commit Summary --

  • UTIL: Remove support to libini older than 1.0.0

-- File Changes --

M configure.ac (1)
M contrib/ci/deps.sh (1)
M contrib/sssd.spec.in (1)
D src/external/libcollection.m4 (9)
M src/util/sss_ini.c (97)

-- Patch Links --

https://github.com/SSSD/sssd/pull/10.patch
https://github.com/SSSD/sssd/pull/10.diff

OpenSUSE LEAP has just ding-libs 0.3.0.1 in official repositories.
http://software.opensuse.org/package/ding-libs?search_term=ding-libs

Yep. OpenSUSE LEAP has 0.3.0.1, Debian Stable (Jessie) has 0.4.0,
latest Ubuntu LTS has 0.5.0.
And all of them would be able to build SSSD with my patch without any
issues.

OpenSUSE LEAP was just an example because.
It is quite new distribution and has very old version of libini_config
(1.0)
and we optionaly use some features from libini_config-1.1 1.2 and 1.3
Debian old_stable does not have libini_config.so.5 (libini-config-1.x)

I would prefer if limited version of sssd (ldap + krb5 provider)
could be compiled almost anywhere. (even old distributions)

Okay,

This is a reason why we have abstraction for different versions of
libini_config in src/util/sss_ini.c

If we decide to drop support for libini_config < 1.0
then we can completely remove sssd abstraction in src/util/sss_ini.c

No, we can't. In src/utils/sss_ini.c we are, currently, keeping code for
libini < 1.0, libini < 1.3 and the newer versions (>= 1.3).
I case my patch was accepted we would still keep the abstraction for libini
< 1.3 and the newer versions (>= 1.3)

Feel free to nack the patch.

Best Regards,

Fabiano Fidêncio

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Sep 1, 2016

On Thu, Sep 01, 2016 at 03:21:06AM -0700, fidencio wrote:

On Thu, Sep 1, 2016 at 11:54 AM, lslebodn notifications@github.com wrote:

On (31/08/16 23:18), fidencio wrote:

libini 1.0.0 is part of ding-libs 0.3.0 and has been around since 2013.
Even the old systems that we have to support already have a newer
version of the library. RHEL6, for instance, has ding-libs 0.4.0 which
provides libinit 1.1.0.

By removing this code we also can stop depending on libcollection.

Signed-off-by: Fabiano Fidêncio fidencio@redhat.com
You can view, comment on, or merge this pull request online at:

#10

-- Commit Summary --

  • UTIL: Remove support to libini older than 1.0.0

-- File Changes --

M configure.ac (1)
M contrib/ci/deps.sh (1)
M contrib/sssd.spec.in (1)
D src/external/libcollection.m4 (9)
M src/util/sss_ini.c (97)

-- Patch Links --

https://github.com/SSSD/sssd/pull/10.patch
https://github.com/SSSD/sssd/pull/10.diff

OpenSUSE LEAP has just ding-libs 0.3.0.1 in official repositories.
http://software.opensuse.org/package/ding-libs?search_term=ding-libs

Yep. OpenSUSE LEAP has 0.3.0.1, Debian Stable (Jessie) has 0.4.0,
latest Ubuntu LTS has 0.5.0.
And all of them would be able to build SSSD with my patch without any issues.

Two questions:
1) how long until the distributions with too old ding-libs go out of
support?
2) since we will be (likely) supporting sssd-1-13 for the lifetime
of RHEL-6, can we say that the old distributions just use sssd-1-13?

@lslebodn

This comment has been minimized.

Copy link
Contributor

commented Sep 1, 2016

On (01/09/16 04:32), Jakub Hrozek wrote:

On Thu, Sep 01, 2016 at 03:21:06AM -0700, fidencio wrote:

On Thu, Sep 1, 2016 at 11:54 AM, lslebodn notifications@github.com wrote:

On (31/08/16 23:18), fidencio wrote:

libini 1.0.0 is part of ding-libs 0.3.0 and has been around since 2013.
Even the old systems that we have to support already have a newer
version of the library. RHEL6, for instance, has ding-libs 0.4.0 which
provides libinit 1.1.0.

By removing this code we also can stop depending on libcollection.

Signed-off-by: Fabiano Fidêncio fidencio@redhat.com
You can view, comment on, or merge this pull request online at:

#10

-- Commit Summary --

  • UTIL: Remove support to libini older than 1.0.0

-- File Changes --

M configure.ac (1)
M contrib/ci/deps.sh (1)
M contrib/sssd.spec.in (1)
D src/external/libcollection.m4 (9)
M src/util/sss_ini.c (97)

-- Patch Links --

https://github.com/SSSD/sssd/pull/10.patch
https://github.com/SSSD/sssd/pull/10.diff

OpenSUSE LEAP has just ding-libs 0.3.0.1 in official repositories.
http://software.opensuse.org/package/ding-libs?search_term=ding-libs

Yep. OpenSUSE LEAP has 0.3.0.1, Debian Stable (Jessie) has 0.4.0,
latest Ubuntu LTS has 0.5.0.
And all of them would be able to build SSSD with my patch without any issues.

Two questions:

  1. how long until the distributions with too old ding-libs go out of
    support?
  2. since we will be (likely) supporting sssd-1-13 for the lifetime
    of RHEL-6, can we say that the old distributions just use sssd-1-13?

libini_config-1.0 does not provide any new functionality
which is not in libini_config < 1.0

They have just a different API (and moreover libini_config-1.0
still provides old API) (at least from sssd POV)

New features were added in 1.1, 1.2 and 1.3
So dropping libini_config < 1.0 does not make a sense.

If we decided to drop support for libini_config < 1.2
then it woudl be a different case.
But libini_config-1.3 is not in many distributions.
and there are still some distributions which does not have 1.2

LS

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Sep 1, 2016

On Thu, Sep 1, 2016 at 2:10 PM, lslebodn notifications@github.com wrote:

On (01/09/16 04:32), Jakub Hrozek wrote:

On Thu, Sep 01, 2016 at 03:21:06AM -0700, fidencio wrote:

On Thu, Sep 1, 2016 at 11:54 AM, lslebodn notifications@github.com
wrote:

On (31/08/16 23:18), fidencio wrote:

libini 1.0.0 is part of ding-libs 0.3.0 and has been around since
2013.
Even the old systems that we have to support already have a newer
version of the library. RHEL6, for instance, has ding-libs 0.4.0
which
provides libinit 1.1.0.

By removing this code we also can stop depending on libcollection.

Signed-off-by: Fabiano Fidêncio fidencio@redhat.com
You can view, comment on, or merge this pull request online at:

#10

-- Commit Summary --

  • UTIL: Remove support to libini older than 1.0.0

-- File Changes --

M configure.ac (1)
M contrib/ci/deps.sh (1)
M contrib/sssd.spec.in (1)
D src/external/libcollection.m4 (9)
M src/util/sss_ini.c (97)

-- Patch Links --

https://github.com/SSSD/sssd/pull/10.patch
https://github.com/SSSD/sssd/pull/10.diff

OpenSUSE LEAP has just ding-libs 0.3.0.1 in official repositories.
http://software.opensuse.org/package/ding-libs?search_term=ding-libs

Yep. OpenSUSE LEAP has 0.3.0.1, Debian Stable (Jessie) has 0.4.0,
latest Ubuntu LTS has 0.5.0.
And all of them would be able to build SSSD with my patch without any
issues.

Two questions:

  1. how long until the distributions with too old ding-libs go out of
    support?

Hmm. Unfortunately I don't have an answer for you.
What are the major distributions that we want to support? Debian, Ubuntu
LTS, RHEL, SLES ...?

  1. since we will be (likely) supporting sssd-1-13 for the lifetime
    of RHEL-6, can we say that the old distributions just use sssd-1-13?

libini_config-1.0 does not provide any new functionality
which is not in libini_config < 1.0

They have just a different API (and moreover libini_config-1.0
still provides old API) (at least from sssd POV)

There's no new functionality, true. But there's a quite good API
simplification and we can stop depending on libcollection.

New features were added in 1.1, 1.2 and 1.3
So dropping libini_config < 1.0 does not make a sense.

If we decided to drop support for libini_config < 1.2
then it woudl be a different case.
But libini_config-1.3 is not in many distributions.
and there are still some distributions which does not have 1.2

LS


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#10 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAG4esCkRqBTzpeBZ7ywzRj9F51ymMfZks5qlsC3gaJpZM4JyWgP
.

Best Regards,

Fabiano Fidêncio

@lslebodn

This comment has been minimized.

Copy link
Contributor

commented Sep 1, 2016

On (01/09/16 08:35), fidencio wrote:

On Thu, Sep 1, 2016 at 2:10 PM, lslebodn notifications@github.com wrote:

On (01/09/16 04:32), Jakub Hrozek wrote:

On Thu, Sep 01, 2016 at 03:21:06AM -0700, fidencio wrote:

On Thu, Sep 1, 2016 at 11:54 AM, lslebodn notifications@github.com
wrote:

On (31/08/16 23:18), fidencio wrote:

libini 1.0.0 is part of ding-libs 0.3.0 and has been around since
2013.
Even the old systems that we have to support already have a newer
version of the library. RHEL6, for instance, has ding-libs 0.4.0
which
provides libinit 1.1.0.

By removing this code we also can stop depending on libcollection.

Signed-off-by: Fabiano Fidêncio fidencio@redhat.com
You can view, comment on, or merge this pull request online at:

#10

-- Commit Summary --

  • UTIL: Remove support to libini older than 1.0.0

-- File Changes --

M configure.ac (1)
M contrib/ci/deps.sh (1)
M contrib/sssd.spec.in (1)
D src/external/libcollection.m4 (9)
M src/util/sss_ini.c (97)

-- Patch Links --

https://github.com/SSSD/sssd/pull/10.patch
https://github.com/SSSD/sssd/pull/10.diff

OpenSUSE LEAP has just ding-libs 0.3.0.1 in official repositories.
http://software.opensuse.org/package/ding-libs?search_term=ding-libs

Yep. OpenSUSE LEAP has 0.3.0.1, Debian Stable (Jessie) has 0.4.0,
latest Ubuntu LTS has 0.5.0.
And all of them would be able to build SSSD with my patch without any
issues.

Two questions:

  1. how long until the distributions with too old ding-libs go out of
    support?

Hmm. Unfortunately I don't have an answer for you.
What are the major distributions that we want to support? Debian, Ubuntu
LTS, RHEL, SLES ...?

I would prefer if limited version of sssd (ldap + krb5 provider)
could be compiled almost anywhere. (even old distributions)
It is not only about major distributions.

  1. since we will be (likely) supporting sssd-1-13 for the lifetime
    of RHEL-6, can we say that the old distributions just use sssd-1-13?

libini_config-1.0 does not provide any new functionality
which is not in libini_config < 1.0

They have just a different API (and moreover libini_config-1.0
still provides old API) (at least from sssd POV)

There's no new functionality, true. But there's a quite good API
simplification
I agree that API is better.
But feature wise are the same. So if we wanted to drop
support for libini_config < 1.0 then we could
drop support for libini_config-1.0.
It would simplify more things in sssd

However, OpenSUSE LEAP has just a libini_config-1.0

and we can stop depending on libcollection.

libini_config-devel depends on libcollection-devel :-)
and it's a very tiny dependency.

sh$ rpm -q --requires libini_config-devel
/usr/bin/pkg-config
libbasicobjects-devel = 0.1.1-29.fc25
libcollection-devel = 0.7.0-29.fc25
libini_config = 1.3.0-29.fc25
libini_config.so.5()(64bit)
libref_array-devel = 0.1.5-29.fc25
pkgconfig(basicobjects)
pkgconfig(collection)
pkgconfig(ref_array)
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
rpmlib(PayloadIsXz) <= 5.2-1

We would not reduce our dependency chain.

LS

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Sep 6, 2016

On Thu, Sep 01, 2016 at 09:01:54AM -0700, lslebodn wrote:

On (01/09/16 08:35), fidencio wrote:

On Thu, Sep 1, 2016 at 2:10 PM, lslebodn notifications@github.com wrote:

On (01/09/16 04:32), Jakub Hrozek wrote:

On Thu, Sep 01, 2016 at 03:21:06AM -0700, fidencio wrote:

On Thu, Sep 1, 2016 at 11:54 AM, lslebodn notifications@github.com
wrote:

On (31/08/16 23:18), fidencio wrote:

libini 1.0.0 is part of ding-libs 0.3.0 and has been around since
2013.
Even the old systems that we have to support already have a newer
version of the library. RHEL6, for instance, has ding-libs 0.4.0
which
provides libinit 1.1.0.

By removing this code we also can stop depending on libcollection.

Signed-off-by: Fabiano Fidêncio fidencio@redhat.com
You can view, comment on, or merge this pull request online at:

#10

-- Commit Summary --

  • UTIL: Remove support to libini older than 1.0.0

-- File Changes --

M configure.ac (1)
M contrib/ci/deps.sh (1)
M contrib/sssd.spec.in (1)
D src/external/libcollection.m4 (9)
M src/util/sss_ini.c (97)

-- Patch Links --

https://github.com/SSSD/sssd/pull/10.patch
https://github.com/SSSD/sssd/pull/10.diff

OpenSUSE LEAP has just ding-libs 0.3.0.1 in official repositories.
http://software.opensuse.org/package/ding-libs?search_term=ding-libs

Yep. OpenSUSE LEAP has 0.3.0.1, Debian Stable (Jessie) has 0.4.0,
latest Ubuntu LTS has 0.5.0.
And all of them would be able to build SSSD with my patch without any
issues.

Two questions:

  1. how long until the distributions with too old ding-libs go out of
    support?

Hmm. Unfortunately I don't have an answer for you.
What are the major distributions that we want to support? Debian, Ubuntu
LTS, RHEL, SLES ...?

I would prefer if limited version of sssd (ldap + krb5 provider)
could be compiled almost anywhere. (even old distributions)
It is not only about major distributions.

  1. since we will be (likely) supporting sssd-1-13 for the lifetime
    of RHEL-6, can we say that the old distributions just use sssd-1-13?

libini_config-1.0 does not provide any new functionality
which is not in libini_config < 1.0

They have just a different API (and moreover libini_config-1.0
still provides old API) (at least from sssd POV)

There's no new functionality, true. But there's a quite good API
simplification
I agree that API is better.
But feature wise are the same. So if we wanted to drop
support for libini_config < 1.0 then we could
drop support for libini_config-1.0.
It would simplify more things in sssd

However, OpenSUSE LEAP has just a libini_config-1.0

This PR has stalled somehow. Can we either move it forward or reject?

I admit I got a bit lost in the comments, which distributions would we
break with this patch? Are any of them supported/will be supported in th
near future?

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Sep 6, 2016

The distributions that would break with this patch are:

  • RHEL/CentOS 5 and older
  • Debian Wheezy (from 2013) and older
  • Ubuntu 12.04 LTS and older

I was not able to find what's the version of the package on SLES

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Sep 6, 2016

On Tue, Sep 06, 2016 at 05:16:00AM -0700, fidencio wrote:

The distributions that would break with this patch are:

  • RHEL/CentOS 5 and older

I don't think we care about RHEL-5 with master and I'm not sure sssd
master even builds there.

  • Debian Wheezy (from 2013) and older

OK, this one I think we care about for the basic functionality:
7.0 Wheezy May 4th 2013 April 26th 2016 (full) / May 2018 (LTS)
But LTS means bug fixes and for those I think sssd-1-13 would be OK.

  • Ubuntu 12.04 LTS and older

12.04 is often used (for example travis-ci still offers only this
distro) and ends its life in 2017.

On one hand, it's unlikely that users of LTS distributions will run
master, they will probably only run sssd-1-13 or some PPAs, on the other
hand, I don't see too much value except for cleaner code.

So all in all I suggest we nack this patchset for now and push it when
Ubuntu 12.04 goes EOL.

Does that sound like a reasonable compromise?

@lslebodn

This comment has been minimized.

Copy link
Contributor

commented Sep 6, 2016

On (06/09/16 05:16), fidencio wrote:

The distributions that would break with this patch are:

  • RHEL/CentOS 5 and older
  • Debian Wheezy (from 2013) and older
  • Ubuntu 12.04 LTS and older

I was not able to find what's the version of the package on SLES

  • libini_config-0.7.0 is feature wise the same as libini_config-1.0
    sssd uses it for basic parsing of ini files
  • libini_config-1.1 is an optional feature used in ad provider
    for parsing gpo files
  • libini_config-1.2 provides optional feature for config snippets in
    /etc/sssd/conf.d
  • libini_config-1.3 provides optional feature for validation of ini files

After this patch we still have 3 optional features provided by different
versions of libini_config.

If we want to drop support for older version of libini_config and siplify the
wrapper then I would drop support for libini_config < 1.3
and remove wrappers in src/util/sss_ini.c

However, libini_config-1.3 is only in fedora.

LS

@jhrozek jhrozek added the Deferred label Sep 6, 2016

@lslebodn

This comment has been minimized.

Copy link
Contributor

commented Sep 6, 2016

IMHO, it might be better to close this PR.
If we decide to dor support for libini_config < 1.1 or 1.2
then it will be a different patch anyway. @see my previous comment

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Sep 7, 2016

On Tue, Sep 06, 2016 at 11:49:09AM -0700, lslebodn wrote:

IMHO, it might be better to close this PR.
If we decide to dor support for libini_config < 1.1 or 1.2
then it will be a different patch anyway. @see my previous comment

The only reason I suggest deferred over closing is that if we close this
PR, we will never remember to ressurect it. If it's going to be
deferred, it will keep coming up in the PR list.

@lslebodn

This comment has been minimized.

Copy link
Contributor

commented Oct 19, 2016

On (07/09/16 00:56), Jakub Hrozek wrote:

On Tue, Sep 06, 2016 at 11:49:09AM -0700, lslebodn wrote:

IMHO, it might be better to close this PR.
If we decide to dor support for libini_config < 1.1 or 1.2
then it will be a different patch anyway. @see my previous comment

The only reason I suggest deferred over closing is that if we close this
PR, we will never remember to ressurect it. If it's going to be
deferred, it will keep coming up in the PR list.

If we decide dro support for older version of libini_config
it will be higher version that 1.0 and therfore the patch
would be different.

Moreover, pull request are not tracking system.

LS

@lslebodn lslebodn added Rejected and removed Deferred labels Oct 19, 2016

@lslebodn lslebodn closed this Oct 19, 2016

fidencio added a commit to fidencio/sssd that referenced this pull request Dec 12, 2016

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 some deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

The removal of those functions we lose some debug messages, which is the
least of problems now. Also instead of re-setting the watchdog timeout
in case of time-shift (please, see b8ceaeb for more info), let's just
re-set the timer's time.

Along with this commit, the function to teardown the watchdog has been
removed as it's not used anymore.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Dec 13, 2016

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 some deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

The removal of those functions we lose some debug messages, which is the
least of problems now. Also instead of re-setting the watchdog timeout
in case of time-shift (please, see b8ceaeb for more info), let's just
exit the let the monitor restart the process.

A proper fix for the clock screw situation should be implemented on
samba's side, by having tevent using monotonic (or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Dec 20, 2016

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and a single byte is written to this pipe (which
is an async-signal-safe operation) and the logic currently done by the
timer handler will be done inside the fd handler in a safe way.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Dec 21, 2016

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by _exit(). Personally I don't
think is worth the trouble to try to log those messages properly in this
specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Jan 3, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by _exit(). Personally I don't
think is worth the trouble to try to log those messages properly in this
specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Jan 3, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by _exit(). Personally I don't
think is worth the trouble to try to log those messages properly in this
specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Jan 3, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by kill(-getpgrp(), SIGTERM).
Personally I don't think is worth the trouble to try to log those messages
properly in this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Jan 5, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by _exit(1). Personally I don't
think is worth the trouble to try to log those messages properly in
this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Jan 9, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by kill(-getpgrp(), SIGTERM).
Personally I don't think is worth the trouble to try to log those messages
properly in this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Jan 9, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by kill(-getpgrp(), SIGTERM).
Personally I don't think is worth the trouble to try to log those messages
properly in this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

fidencio added a commit to fidencio/sssd that referenced this pull request Jan 10, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by kill(-getpgrp(), SIGTERM) (or
_exit(1) considering the cases where setting up the process group during
the server_setup() has failed).
Personally I don't think is worth the trouble to try to log those messages
properly in this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    SSSD#1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    SSSD#2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    SSSD#3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    SSSD#4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    SSSD#5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    SSSD#6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>

jhrozek pushed a commit that referenced this pull request Jan 25, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by kill(-getpgrp(), SIGTERM) (or
_exit(1) considering the cases where setting up the process group during
the server_setup() has failed).
Personally I don't think is worth the trouble to try to log those messages
properly in this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    #1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    #2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    #3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    #4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    #5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    #6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    #7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    #8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    #9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    #10 <signal handler called>
    #11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    #12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    #13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    #14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    #15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    #16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    #17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    #18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    #19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    #20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    #21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    #22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    #23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    #24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    #25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    #26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    #27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    #28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    #29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    #30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    #31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    #32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    #33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    #34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    #35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    #36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    #37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    #38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    #39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    #40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    #41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    #42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    #43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    #44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    #45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    #46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    #47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    #48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    #49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    #50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    #51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    #52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    #53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    #54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    #55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    #56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    #57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    #58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    #59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    #60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Pavel Březina <pbrezina@redhat.com>
Reviewed-by: Jakub Hrozek <jhrozek@redhat.com>
Reviewed-by: Simo Sorce <simo@redhat.com>

jhrozek pushed a commit that referenced this pull request Jan 25, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by kill(-getpgrp(), SIGTERM) (or
_exit(1) considering the cases where setting up the process group during
the server_setup() has failed).
Personally I don't think is worth the trouble to try to log those messages
properly in this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    #1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    #2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    #3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    #4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    #5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    #6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    #7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    #8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    #9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    #10 <signal handler called>
    #11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    #12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    #13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    #14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    #15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    #16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    #17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    #18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    #19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    #20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    #21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    #22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    #23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    #24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    #25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    #26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    #27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    #28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    #29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    #30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    #31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    #32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    #33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    #34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    #35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    #36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    #37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    #38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    #39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    #40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    #41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    #42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    #43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    #44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    #45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    #46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    #47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    #48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    #49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    #50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    #51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    #52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    #53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    #54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    #55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    #56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    #57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    #58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    #59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    #60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Pavel Březina <pbrezina@redhat.com>
Reviewed-by: Jakub Hrozek <jhrozek@redhat.com>
Reviewed-by: Simo Sorce <simo@redhat.com>
(cherry picked from commit e6a5f8c)

jhrozek added a commit to jhrozek/sssd that referenced this pull request Jul 20, 2017

WATCHDOG: Avoid non async-signal-safe from the signal_handler
While debugging rhbz#1396912 a deadlock on sssd_be was noticed[0] and
it's been caused by the use of non async-signal-safe functions from the
signal_handler (please, see man 7 signal for more info about which are
the async-signal-safe functions that can be used).

In order to work this situation around a pipe has been added to the
watchdog_ctx structure and, in case of clock screw,  a single byte is
written to this pipe (which is an async-signal-safe operation) and the
logic currently done by the timer handler to reset the watchdog will be
done inside the fd handler in a safe way.

With this patch we ended up losing some debug messages as
orderly_shutdown() has been replaced by kill(-getpgrp(), SIGTERM) (or
_exit(1) considering the cases where setting up the process group during
the server_setup() has failed).
Personally I don't think is worth the trouble to try to log those messages
properly in this specific case.

It's really worth to mention that a proper fix the clock screw situation
should be implemented on samba's side, by having tevent using monotonic
(or boottime) clock.

[0]:
  [root@dusan ~]# pstack 17922
    #0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
    #1  0x00007fe707d04f93 in _L_lock_14932 () from /lib64/libc.so.6
    #2  0x00007fe707d02013 in __GI___libc_malloc (bytes=140630248638304, bytes@entry=15) at malloc.c:2891
    #3  0x00007fe707d0888a in __GI___strdup (s=0x7fe707dff4f7 "/etc/localtime") at strdup.c:42
    #4  0x00007fe707d31b61 in tzset_internal (always=<optimized out>, explicit=explicit@entry=1) at tzset.c:438
    #5  0x00007fe707d32523 in __tz_convert (timer=timer@entry=0x7ffcd5d2b090, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7fe708041d40 <_tmbuf>) at tzset.c:621
    #6  0x00007fe707d30521 in __GI_localtime (t=t@entry=0x7ffcd5d2b090) at localtime.c:42
    SSSD#7  0x00007fe70886c7b0 in sss_vdebug_fn (file=<optimized out>, line=<optimized out>, function=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=16, flags=flags@entry=0, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n", ap=ap@entry=0x7ffcd5d2b130) at src/util/debug.c:248
    SSSD#8  0x00007fe70886c995 in sss_debug_fn (file=file@entry=0x7fe70bff263b "src/util/util_watchdog.c", line=line@entry=82, function=function@entry=0x7fe70bff27f0 <__FUNCTION__.9379> "watchdog_handler", level=level@entry=16, format=format@entry=0x7fe70bff2760 "Watchdog timer overflow, killing process!\n") at src/util/debug.c:284
    SSSD#9  0x00007fe70bfdb409 in watchdog_handler (sig=<optimized out>) at src/util/util_watchdog.c:81
    SSSD#10 <signal handler called>
    SSSD#11 0x00007fe707cff664 in _int_malloc (av=av@entry=0x7fe70803c760 <main_arena>, bytes=bytes@entry=151) at malloc.c:3494
    SSSD#12 0x00007fe707d01fbc in __GI___libc_malloc (bytes=bytes@entry=151) at malloc.c:2893
    SSSD#13 0x00007fe708450749 in __talloc_with_prefix (prefix_len=0, size=55, context=0x7fe718373210) at ../talloc.c:668
    SSSD#14 __talloc (size=55, context=0x7fe718373210) at ../talloc.c:708
    SSSD#15 _talloc_named_const (name=0x7fe70bb7015d "../common/ldb_pack.c:425", size=55, context=0x7fe718373210) at ../talloc.c:865
    SSSD#16 talloc_named_const (context=<optimized out>, size=size@entry=55, name=name@entry=0x7fe70bb7015d "../common/ldb_pack.c:425") at ../talloc.c:1606
    SSSD#17 0x00007fe70bb61803 in ldb_unpack_data_only_attr_list (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=0x7fe7184aa1e0, list=list@entry=0x0, list_size=list_size@entry=0, nb_elements_in_db=nb_elements_in_db@entry=0x0) at ../common/ldb_pack.c:425
    SSSD#18 0x00007fe70bb61a7d in ldb_unpack_data (ldb=ldb@entry=0x7fe70e4d52c0, data=data@entry=0x7ffcd5d2b990, message=<optimized out>) at ../common/ldb_pack.c:470
    SSSD#19 0x00007fe6fdc29b46 in ltdb_parse_data_unpack (key=..., data=..., private_data=0x7ffcd5d2ba70) at ../ldb_tdb/ldb_search.c:249
    SSSD#20 0x00007fe70a5e0a24 in tdb_parse_data (tdb=tdb@entry=0x7fe70e4eaa10, key=..., offset=15619748, len=414772, parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/io.c:637
    SSSD#21 0x00007fe70a5dc1fc in tdb_parse_record (tdb=0x7fe70e4eaa10, key=..., parser=parser@entry=0x7fe6fdc29b10 <ltdb_parse_data_unpack>, private_data=private_data@entry=0x7ffcd5d2ba70) at ../common/tdb.c:253
    SSSD#22 0x00007fe6fdc29e7b in ltdb_search_dn1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, msg=msg@entry=0x7fe7184aa1e0) at ../ldb_tdb/ldb_search.c:287
    SSSD#23 0x00007fe6fdc2acbb in ltdb_dn_list_load (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183c4940, list=list@entry=0x7fe7183c3a30) at ../ldb_tdb/ldb_index.c:181
    SSSD#24 0x00007fe6fdc2bbbb in ltdb_index_add1 (module=module@entry=0x7fe70e4eab50, dn=dn@entry=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", v_idx=v_idx@entry=0, el=<optimized out>, el=<optimized out>) at ../ldb_tdb/ldb_index.c:1134
    SSSD#25 0x00007fe6fdc2c62c in ltdb_index_add_el (el=0x7fe7184aa3e0, dn=0x7fe7183bf3e0 "name=testuser7045@domain.com,cn=users,cn=DOMAIN.COM,cn=sysdb", module=0x7fe70e4eab50) at ../ldb_tdb/ldb_index.c:1180
    SSSD#26 ltdb_index_add_element (module=module@entry=0x7fe70e4eab50, dn=<optimized out>, el=el@entry=0x7fe7184aa3e0) at ../ldb_tdb/ldb_index.c:1290
    SSSD#27 0x00007fe6fdc290bb in ltdb_modify_internal (module=module@entry=0x7fe70e4eab50, msg=0x7fe7183bf0c0, req=req@entry=0x7fe7183bdc10) at ../ldb_tdb/ldb_tdb.c:903
    SSSD#28 0x00007fe6fdc2958a in ltdb_modify (ctx=0x7fe7183c2950, ctx=0x7fe7183c2950) at ../ldb_tdb/ldb_tdb.c:998
    SSSD#29 ltdb_callback (ev=<optimized out>, te=<optimized out>, t=..., private_data=<optimized out>) at ../ldb_tdb/ldb_tdb.c:1380
    SSSD#30 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4d2890) at ../tevent_timed.c:341
    SSSD#31 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4d2890, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#32 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4d2890, location=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent_standard.c:114
    SSSD#33 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4d2890, location=location@entry=0x7fe70bb72ec5 "../common/ldb.c:631") at ../tevent.c:533
    SSSD#34 0x00007fe70bb6bc4f in ldb_wait (handle=0x7fe7183c4530, type=<optimized out>) at ../common/ldb.c:631
    SSSD#35 0x00007fe70bb6c793 in ldb_autotransaction_request (ldb=0x7fe70e4d52c0, req=0x7fe7183bdc10) at ../common/ldb.c:573
    SSSD#36 0x00007fe70bb6d263 in ldb_modify (ldb=ldb@entry=0x7fe70e4d52c0, message=<optimized out>) at ../common/ldb.c:1655
    SSSD#37 0x00007fe70bfa2ab5 in sysdb_set_cache_entry_attr (ldb=0x7fe70e4d52c0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bf680, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1159
    SSSD#38 0x00007fe70bfa304d in sysdb_rep_ts_entry_attr (sysdb=0x7fe70e4eadd0, attrs=0x7fe7183bf680, entry_dn=0x7fe7183c4760) at src/db/sysdb_ops.c:1218
    SSSD#39 sysdb_set_ts_entry_attr (sysdb=sysdb@entry=0x7fe70e4eadd0, entry_dn=entry_dn@entry=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1248
    SSSD#40 0x00007fe70bfa4aa9 in sysdb_set_entry_attr (sysdb=0x7fe70e4eadd0, entry_dn=0x7fe7183c4760, attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1199
    SSSD#41 0x00007fe70bfa4b5f in sysdb_set_user_attr (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", attrs=attrs@entry=0x7fe7183bb840, mod_op=mod_op@entry=2) at src/db/sysdb_ops.c:1285
    SSSD#42 0x00007fe70bfa58c3 in sysdb_add_user (domain=domain@entry=0x7fe70e4d62f0, name=name@entry=0x7fe7183c01f0 "testuser7045@domain.com", uid=uid@entry=1415408147, gid=<optimized out>, gid@entry=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:1928
    SSSD#43 0x00007fe70bfab271 in sysdb_store_new_user (now=1481105315, cache_timeout=5400, attrs=0x7fe7183bb840, orig_dn=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", shell=0x0, homedir=0x0, gecos=0x7fe710465d00 "Test User7045", gid=1415400513, uid=1415408147, name=0x7fe7183c01f0 "testuser7045@domain.com", domain=0x7fe70e4d62f0) at src/db/sysdb_ops.c:2549
    SSSD#44 sysdb_store_user (domain=domain@entry=0x7fe70e4d62f0, name=0x7fe7183c01f0 "testuser7045@domain.com", pwd=pwd@entry=0x0, uid=1415408147, gid=1415400513, gecos=gecos@entry=0x7fe710465d00 "Test User7045", homedir=homedir@entry=0x0, shell=shell@entry=0x0, orig_dn=orig_dn@entry=0x7fe710465940 "CN=Test User7045,OU=Sales,DC=DOMAIN,DC=COM", attrs=attrs@entry=0x7fe7183bb840, remove_attrs=0x7fe7183c08a0, cache_timeout=cache_timeout@entry=5400, now=now@entry=1481105315) at src/db/sysdb_ops.c:2499
    SSSD#45 0x00007fe6fba0d9f9 in sdap_save_user (memctx=memctx@entry=0x7fe70e544ee0, opts=opts@entry=0x7fe70e518400, dom=dom@entry=0x7fe70e4d62f0, attrs=<optimized out>, _usn_value=_usn_value@entry=0x7ffcd5d2c260, now=now@entry=1481105315) at src/providers/ldap/sdap_async_users.c:509
    SSSD#46 0x00007fe6fba0df9a in sdap_save_users (memctx=memctx@entry=0x7fe70e544e40, sysdb=0x7fe70e4eadd0, dom=0x7fe70e4d62f0, opts=0x7fe70e518400, users=<optimized out>, num_users=10006, _usn_value=_usn_value@entry=0x7fe70e544e60) at src/providers/ldap/sdap_async_users.c:572
    SSSD#47 0x00007fe6fba0e460 in sdap_get_users_done (subreq=<optimized out>) at src/providers/ldap/sdap_async_users.c:938
    SSSD#48 0x00007fe6fba0c9d5 in sdap_search_user_process (subreq=0x0) at src/providers/ldap/sdap_async_users.c:814
    SSSD#49 0x00007fe6fba07379 in generic_ext_search_handler (subreq=0x0, opts=<optimized out>) at src/providers/ldap/sdap_async.c:1689
    SSSD#50 0x00007fe6fba0991b in sdap_get_generic_op_finished (op=<optimized out>, reply=<optimized out>, error=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:1621
    SSSD#51 0x00007fe6fba083cd in sdap_process_message (ev=<optimized out>, sh=<optimized out>, msg=0x7fe70e5f9ce0) at src/providers/ldap/sdap_async.c:353
    SSSD#52 sdap_process_result (ev=<optimized out>, pvt=<optimized out>) at src/providers/ldap/sdap_async.c:197
    SSSD#53 0x00007fe708664b4f in tevent_common_loop_timer_delay (ev=ev@entry=0x7fe70e4cbc30) at ../tevent_timed.c:341
    SSSD#54 0x00007fe708665b5a in epoll_event_loop_once (ev=0x7fe70e4cbc30, location=<optimized out>) at ../tevent_epoll.c:911
    SSSD#55 0x00007fe708664257 in std_event_loop_once (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:114
    SSSD#56 0x00007fe70866040d in _tevent_loop_once (ev=ev@entry=0x7fe70e4cbc30, location=location@entry=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:533
    SSSD#57 0x00007fe7086605ab in tevent_common_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent.c:637
    SSSD#58 0x00007fe7086641f7 in std_event_loop_wait (ev=0x7fe70e4cbc30, location=0x7fe70bfee8e7 "src/util/server.c:702") at ../tevent_standard.c:140
    SSSD#59 0x00007fe70bfd1993 in server_loop (main_ctx=0x7fe70e4cd080) at src/util/server.c:702
    SSSD#60 0x00007fe70c84cb82 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:587

Resolves:
https://fedorahosted.org/sssd/ticket/3266

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Pavel Březina <pbrezina@redhat.com>
Reviewed-by: Jakub Hrozek <jhrozek@redhat.com>
Reviewed-by: Simo Sorce <simo@redhat.com>
(cherry picked from commit e6a5f8c)
(cherry picked from commit 0606a71)

jhrozek added a commit that referenced this pull request May 24, 2018

selinux_child: Allow to query sssd
The function getpwnam_r is indirectly used ins selinux_child
on few places. (in libselinux and libsemanage)

There is not any reason why we should block nss calls with sssd.
It is a child process and loop cannot be created.
(BTW it is also allowed in krb_child and proxy_child)

  #0  _nss_sss_getpwnam_r (name=0x55c0e6471a50 "user4_2", result=0x7ffe9ab0d05,
          buffer=0x55c0e64741a0 "\200é\256\177\177", buflen=1024,
          errnop=0x7f7fafbcdb08)
          at src/sss_client/nss_passwd.c:132
  #1  0x00007f7fae7ad48f in __getpwnam_r (name=name@entry=0x55c0e6471a50 "user4_2",
          resbuf=resbuf@entry=0x7ffe9ab0d050, buffer=buffer@entry=0x55c0e64741a0 "\200é\256\177\177",
          buflen=buflen@entry=1024, result=result@entry=0x7ffe9ab0d048)
          at ../nss/getXXbyYY_r.c:316
  #2  0x00007f7faeabc9e2 in get_default_gid (name=0x55c0e6471a50 "user4_2")
          at seusers.c:105
  #3  getseuserbyname (name=0x55c0e6471a50 "user4_2", r_seuser=0x7ffe9ab0d0f0,
          r_level=0x7ffe9ab0d0f8) at seusers.c:186
  #4  0x000055c0e5126d02 in seuser_needs_update (ibuf=0x55c0e64718e0)
          at src/providers/ipa/selinux_child.c:175
  #5  main (argc=<optimized out>, argv=<optimized out>)
          at src/providers/ipa/selinux_child.c:332

  #0  _nss_sss_getpwnam_r (name=0x55c0e647dda0 "user3_1", result=0x7ffe9ab0cce0,
          buffer=0x55c0e6482180 "\240AG\346\300U", buflen=1024,
          errnop=0x7f7fafbcdb08) at src/sss_client/nss_passwd.c:132
  #1  0x00007f7fae7ad48f in __getpwnam_r (name=name@entry=0x55c0e647dda0 "user3_1",
          resbuf=resbuf@entry=0x7ffe9ab0cce0, buffer=buffer@entry=0x55c0e6482180 "\240AG\346\300U",
          buflen=buflen@entry=1024, result=result@entry=0x7ffe9ab0ccd8)
          at ../nss/getXXbyYY_r.c:316
  #2  0x00007f7faece29b3 in add_user (head=head@entry=0x7ffe9ab0ce28,
          user=user@entry=0x55c0e64b5930, name=name@entry=0x55c0e647dda0 "user3_1",
          sename=sename@entry=0x55c0e647bdc0 "staff_u",
          selogin=selogin@entry=0x55c0e647dda0 "user3_1",
          s=<optimized out>) at genhomedircon.c:999
  #3  0x00007f7faece334c in get_users (errors=<synthetic pointer>,
          s=0x7ffe9ab0ce70) at genhomedircon.c:1167
  #4  write_gen_home_dir_context (homedir_context_tpl=0x55c0e647d3d0,
          user_context_tpl=0x55c0e647a870, username_context_tpl=0x0,
          out=0x55c0e646fa80, s=0x7ffe9ab0ce70) at genhomedircon.c:1205
  #5  write_context_file (out=<optimized out>, s=0x7ffe9ab0ce70)
          at genhomedircon.c:1317
  #6  semanage_genhomedircon (sh=sh@entry=0x55c0e6476380, policydb=<optimized out>,
          usepasswd=<optimized out>, ignoredirs=<optimized out>)
          at genhomedircon.c:1382
  #7  0x00007f7faecdfb95 in semanage_direct_commit (sh=0x55c0e6476380)
          at direct_api.c:1575
  #8  0x00007f7faece4d6d in semanage_commit (sh=0x55c0e6476380) at handle.c:426
  #9  0x000055c0e5127cf8 in sss_set_seuser (login_name=0x55c0e6471a5 "user4_2",
          seuser_name=0x55c0e6471960 "staff_u", mls=<optimized out>)
          at src/util/sss_semanage.c:335
  #10 0x000055c0e5126eea in sc_set_seuser (mls=0x55c0e64719d0 "s0-s0:c0.c1023",
          seuser_name=0x55c0e6471960 "staff_u",
          login_name=0x55c0e6471a50 "user4_2")
          at src/providers/ipa/selinux_child.c:162
  #11 main (argc=<optimized out>, argv=<optimized out>)
          at src/providers/ipa/selinux_child.c:334

Merges: https://pagure.io/SSSD/sssd/pull-request/3732

Reviewed-by: Jakub Hrozek <jhrozek@redhat.com>

lslebodn pushed a commit to lslebodn/sssd that referenced this pull request Jun 28, 2019

Lukas Slebodnik
PROXY: Return data in output parameter if everything is OK
The function remove_duplicate_group_members might return EOK also in the middle
of function but return parameter was not set with right data.
Processing continued in the function save_group but there was a
dereference of NULL pointer.

Introduced in: https://pagure.io/SSSD/sssd/issue/3931

Crash:
  (gdb) bt
  #0  0x00007fb4ce4a9ac5 in save_group (sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, grp=grp@entry=0x55c9a0f370f0, real_name=0x55c9a0f47340 "nobody@ldap",
      alias=alias@entry=0x0) at src/providers/proxy/proxy_id.c:748
  #1  0x00007fb4ce4aa600 in get_gr_gid (mem_ctx=mem_ctx@entry=0x55c9a0f38be0, sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, gid=99, now=<optimized out>,
      ctx=<optimized out>) at src/providers/proxy/proxy_id.c:1160
  SSSD#2  0x00007fb4ce4ac9e5 in get_initgr_groups_process (pwd=0x55c9a0f384a0, pwd=0x55c9a0f384a0, dom=0x55c9a0efb420, sysdb=0x55c9a0efb230, ctx=0x55c9a0f048e0, memctx=0x55c9a0f38be0)
      at src/providers/proxy/proxy_id.c:1553
  SSSD#3  get_initgr (i_name=<optimized out>, dom=0x55c9a0efb420, sysdb=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1461
  SSSD#4  proxy_account_info (domain=0x55c9a0efb420, be_ctx=<optimized out>, data=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1659
  SSSD#5  proxy_account_info_handler_send (mem_ctx=<optimized out>, id_ctx=0x55c9a0f048e0, data=<optimized out>, params=0x55c9a0f39790) at src/providers/proxy/proxy_id.c:1758
  SSSD#6  0x000055c99fc67677 in file_dp_request (_dp_req=<synthetic pointer>, req=0x55c9a0f39470, request_data=<optimized out>, dp_flags=1, method=DPM_ACCOUNT_HANDLER, target=DPT_ID,
      name=<optimized out>, domainname=0x55c9a0f39190 "LDAP", provider=0x55c9a0efe0e0, mem_ctx=<optimized out>) at src/providers/data_provider/dp_request.c:250
  SSSD#7  dp_req_send (mem_ctx=0x55c9a0f37b60, provider=provider@entry=0x55c9a0efe0e0, domain=domain@entry=0x55c9a0f39190 "LDAP", name=<optimized out>, target=target@entry=DPT_ID,
      method=method@entry=DPM_ACCOUNT_HANDLER, dp_flags=dp_flags@entry=1, request_data=0x55c9a0f37c00, _request_name=0x55c9a0f37b60) at src/providers/data_provider/dp_request.c:295
  SSSD#8  0x000055c99fc6a132 in dp_get_account_info_send (mem_ctx=<optimized out>, ev=0x55c9a0eddbc0, sbus_req=<optimized out>, provider=0x55c9a0efe0e0, dp_flags=1,
      entry_type=<optimized out>, filter=0x55c9a0f358d0 "name=nobody@ldap", domain=0x55c9a0f39190 "LDAP", extra=0x55c9a0f354a0 "") at src/providers/data_provider/dp_target_id.c:528
  SSSD#9  0x00007fb4da35265b in _sbus_sss_invoke_in_uusss_out_qus_step (ev=0x55c9a0eddbc0, te=<optimized out>, tv=..., private_data=<optimized out>) at src/sss_iface/sbus_sss_invokers.c:2847
  SSSD#10 0x00007fb4d9cfb1cf in tevent_common_invoke_timer_handler () from /lib64/libtevent.so.0
  SSSD#11 0x00007fb4d9cfb339 in tevent_common_loop_timer_delay () from /lib64/libtevent.so.0
  SSSD#12 0x00007fb4d9cfc2f9 in epoll_event_loop_once () from /lib64/libtevent.so.0
  SSSD#13 0x00007fb4d9cfa7b7 in std_event_loop_once () from /lib64/libtevent.so.0
  SSSD#14 0x00007fb4d9cf5b5d in _tevent_loop_once () from /lib64/libtevent.so.0
  SSSD#15 0x00007fb4d9cf5d8b in tevent_common_loop_wait () from /lib64/libtevent.so.0
  SSSD#16 0x00007fb4d9cfa757 in std_event_loop_wait () from /lib64/libtevent.so.0
  SSSD#17 0x00007fb4dd955ac3 in server_loop (main_ctx=0x55c9a0edf090) at src/util/server.c:724
  SSSD#18 0x000055c99fc59760 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:747
  (gdb) l
  (gdb) bt
  #0  0x00007fb4ce4a9ac5 in save_group (sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, grp=grp@entry=0x55c9a0f370f0, real_name=0x55c9a0f47340 "nobody@ldap",
      alias=alias@entry=0x0) at src/providers/proxy/proxy_id.c:748
  #1  0x00007fb4ce4aa600 in get_gr_gid (mem_ctx=mem_ctx@entry=0x55c9a0f38be0, sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, gid=99, now=<optimized out>,
      ctx=<optimized out>) at src/providers/proxy/proxy_id.c:1160
  SSSD#2  0x00007fb4ce4ac9e5 in get_initgr_groups_process (pwd=0x55c9a0f384a0, pwd=0x55c9a0f384a0, dom=0x55c9a0efb420, sysdb=0x55c9a0efb230, ctx=0x55c9a0f048e0, memctx=0x55c9a0f38be0)
      at src/providers/proxy/proxy_id.c:1553
  SSSD#3  get_initgr (i_name=<optimized out>, dom=0x55c9a0efb420, sysdb=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1461
  SSSD#4  proxy_account_info (domain=0x55c9a0efb420, be_ctx=<optimized out>, data=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1659
  SSSD#5  proxy_account_info_handler_send (mem_ctx=<optimized out>, id_ctx=0x55c9a0f048e0, data=<optimized out>, params=0x55c9a0f39790) at src/providers/proxy/proxy_id.c:1758
  SSSD#6  0x000055c99fc67677 in file_dp_request (_dp_req=<synthetic pointer>, req=0x55c9a0f39470, request_data=<optimized out>, dp_flags=1, method=DPM_ACCOUNT_HANDLER, target=DPT_ID,
      name=<optimized out>, domainname=0x55c9a0f39190 "LDAP", provider=0x55c9a0efe0e0, mem_ctx=<optimized out>) at src/providers/data_provider/dp_request.c:250
  SSSD#7  dp_req_send (mem_ctx=0x55c9a0f37b60, provider=provider@entry=0x55c9a0efe0e0, domain=domain@entry=0x55c9a0f39190 "LDAP", name=<optimized out>, target=target@entry=DPT_ID,
      method=method@entry=DPM_ACCOUNT_HANDLER, dp_flags=dp_flags@entry=1, request_data=0x55c9a0f37c00, _request_name=0x55c9a0f37b60) at src/providers/data_provider/dp_request.c:295
  SSSD#8  0x000055c99fc6a132 in dp_get_account_info_send (mem_ctx=<optimized out>, ev=0x55c9a0eddbc0, sbus_req=<optimized out>, provider=0x55c9a0efe0e0, dp_flags=1,
      entry_type=<optimized out>, filter=0x55c9a0f358d0 "name=nobody@ldap", domain=0x55c9a0f39190 "LDAP", extra=0x55c9a0f354a0 "") at src/providers/data_provider/dp_target_id.c:528
  SSSD#9  0x00007fb4da35265b in _sbus_sss_invoke_in_uusss_out_qus_step (ev=0x55c9a0eddbc0, te=<optimized out>, tv=..., private_data=<optimized out>) at src/sss_iface/sbus_sss_invokers.c:2847
  SSSD#10 0x00007fb4d9cfb1cf in tevent_common_invoke_timer_handler () from /lib64/libtevent.so.0
  SSSD#11 0x00007fb4d9cfb339 in tevent_common_loop_timer_delay () from /lib64/libtevent.so.0
  SSSD#12 0x00007fb4d9cfc2f9 in epoll_event_loop_once () from /lib64/libtevent.so.0
  SSSD#13 0x00007fb4d9cfa7b7 in std_event_loop_once () from /lib64/libtevent.so.0
  SSSD#14 0x00007fb4d9cf5b5d in _tevent_loop_once () from /lib64/libtevent.so.0
  SSSD#15 0x00007fb4d9cf5d8b in tevent_common_loop_wait () from /lib64/libtevent.so.0
  SSSD#16 0x00007fb4d9cfa757 in std_event_loop_wait () from /lib64/libtevent.so.0
  SSSD#17 0x00007fb4dd955ac3 in server_loop (main_ctx=0x55c9a0edf090) at src/util/server.c:724
  SSSD#18 0x000055c99fc59760 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:747
  (gdb) l
  733         ret = remove_duplicate_group_members(tmp_ctx, grp, &ngroup);
  734         if (ret != EOK) {
  735             DEBUG(SSSDBG_CRIT_FAILURE, "Failed to remove duplicate group member     s\n");
  736             goto done;
  737         }
  738
  739         DEBUG_GR_MEM(SSSDBG_TRACE_LIBS, ngroup);
  740
  741         ret = sysdb_transaction_start(sysdb);
  742         if (ret != EOK) {
  743             DEBUG(SSSDBG_CRIT_FAILURE, "Failed to start transaction\n");
  744             goto done;
  745         }
  746         in_transaction = true;
  747
  748         if (ngroup->gr_mem && ngroup->gr_mem[0]) {
  749             attrs = sysdb_new_attrs(tmp_ctx);
  750             if (!attrs) {
  751                 DEBUG(SSSDBG_CRIT_FAILURE, "Allocation error?!\n");
  752                 ret = ENOMEM;
  (gdb) p ngroup
  $1 = (struct group *) 0x0
  743             DEBUG(SSSDBG_CRIT_FAILURE, "Failed to start transaction\n");
  744             goto done;
  745         }
  746         in_transaction = true;
  747
  748         if (ngroup->gr_mem && ngroup->gr_mem[0]) {
  749             attrs = sysdb_new_attrs(tmp_ctx);
  750             if (!attrs) {
  751                 DEBUG(SSSDBG_CRIT_FAILURE, "Allocation error?!\n");
  752                 ret = ENOMEM;
  (gdb) p ngroup
  $1 = (struct group *) 0x0

Merges: https://pagure.io/SSSD/sssd/pull-request/4036

Resolves:
https://pagure.io/SSSD/sssd/issue/XXXX

jhrozek added a commit that referenced this pull request Jun 28, 2019

PROXY: Return data in output parameter if everything is OK
The function remove_duplicate_group_members might return EOK also in the middle
of function but return parameter was not set with right data.
Processing continued in the function save_group but there was a
dereference of NULL pointer.

Introduced in: https://pagure.io/SSSD/sssd/issue/3931

Crash:
  (gdb) bt
  #0  0x00007fb4ce4a9ac5 in save_group (sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, grp=grp@entry=0x55c9a0f370f0, real_name=0x55c9a0f47340 "nobody@ldap",
      alias=alias@entry=0x0) at src/providers/proxy/proxy_id.c:748
  #1  0x00007fb4ce4aa600 in get_gr_gid (mem_ctx=mem_ctx@entry=0x55c9a0f38be0, sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, gid=99, now=<optimized out>,
      ctx=<optimized out>) at src/providers/proxy/proxy_id.c:1160
  #2  0x00007fb4ce4ac9e5 in get_initgr_groups_process (pwd=0x55c9a0f384a0, pwd=0x55c9a0f384a0, dom=0x55c9a0efb420, sysdb=0x55c9a0efb230, ctx=0x55c9a0f048e0, memctx=0x55c9a0f38be0)
      at src/providers/proxy/proxy_id.c:1553
  #3  get_initgr (i_name=<optimized out>, dom=0x55c9a0efb420, sysdb=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1461
  #4  proxy_account_info (domain=0x55c9a0efb420, be_ctx=<optimized out>, data=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1659
  #5  proxy_account_info_handler_send (mem_ctx=<optimized out>, id_ctx=0x55c9a0f048e0, data=<optimized out>, params=0x55c9a0f39790) at src/providers/proxy/proxy_id.c:1758
  #6  0x000055c99fc67677 in file_dp_request (_dp_req=<synthetic pointer>, req=0x55c9a0f39470, request_data=<optimized out>, dp_flags=1, method=DPM_ACCOUNT_HANDLER, target=DPT_ID,
      name=<optimized out>, domainname=0x55c9a0f39190 "LDAP", provider=0x55c9a0efe0e0, mem_ctx=<optimized out>) at src/providers/data_provider/dp_request.c:250
  #7  dp_req_send (mem_ctx=0x55c9a0f37b60, provider=provider@entry=0x55c9a0efe0e0, domain=domain@entry=0x55c9a0f39190 "LDAP", name=<optimized out>, target=target@entry=DPT_ID,
      method=method@entry=DPM_ACCOUNT_HANDLER, dp_flags=dp_flags@entry=1, request_data=0x55c9a0f37c00, _request_name=0x55c9a0f37b60) at src/providers/data_provider/dp_request.c:295
  #8  0x000055c99fc6a132 in dp_get_account_info_send (mem_ctx=<optimized out>, ev=0x55c9a0eddbc0, sbus_req=<optimized out>, provider=0x55c9a0efe0e0, dp_flags=1,
      entry_type=<optimized out>, filter=0x55c9a0f358d0 "name=nobody@ldap", domain=0x55c9a0f39190 "LDAP", extra=0x55c9a0f354a0 "") at src/providers/data_provider/dp_target_id.c:528
  #9  0x00007fb4da35265b in _sbus_sss_invoke_in_uusss_out_qus_step (ev=0x55c9a0eddbc0, te=<optimized out>, tv=..., private_data=<optimized out>) at src/sss_iface/sbus_sss_invokers.c:2847
  #10 0x00007fb4d9cfb1cf in tevent_common_invoke_timer_handler () from /lib64/libtevent.so.0
  #11 0x00007fb4d9cfb339 in tevent_common_loop_timer_delay () from /lib64/libtevent.so.0
  #12 0x00007fb4d9cfc2f9 in epoll_event_loop_once () from /lib64/libtevent.so.0
  #13 0x00007fb4d9cfa7b7 in std_event_loop_once () from /lib64/libtevent.so.0
  #14 0x00007fb4d9cf5b5d in _tevent_loop_once () from /lib64/libtevent.so.0
  #15 0x00007fb4d9cf5d8b in tevent_common_loop_wait () from /lib64/libtevent.so.0
  #16 0x00007fb4d9cfa757 in std_event_loop_wait () from /lib64/libtevent.so.0
  #17 0x00007fb4dd955ac3 in server_loop (main_ctx=0x55c9a0edf090) at src/util/server.c:724
  #18 0x000055c99fc59760 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:747
  (gdb) l
  (gdb) bt
  #0  0x00007fb4ce4a9ac5 in save_group (sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, grp=grp@entry=0x55c9a0f370f0, real_name=0x55c9a0f47340 "nobody@ldap",
      alias=alias@entry=0x0) at src/providers/proxy/proxy_id.c:748
  #1  0x00007fb4ce4aa600 in get_gr_gid (mem_ctx=mem_ctx@entry=0x55c9a0f38be0, sysdb=sysdb@entry=0x55c9a0efb230, dom=dom@entry=0x55c9a0efb420, gid=99, now=<optimized out>,
      ctx=<optimized out>) at src/providers/proxy/proxy_id.c:1160
  #2  0x00007fb4ce4ac9e5 in get_initgr_groups_process (pwd=0x55c9a0f384a0, pwd=0x55c9a0f384a0, dom=0x55c9a0efb420, sysdb=0x55c9a0efb230, ctx=0x55c9a0f048e0, memctx=0x55c9a0f38be0)
      at src/providers/proxy/proxy_id.c:1553
  #3  get_initgr (i_name=<optimized out>, dom=0x55c9a0efb420, sysdb=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1461
  #4  proxy_account_info (domain=0x55c9a0efb420, be_ctx=<optimized out>, data=<optimized out>, ctx=0x55c9a0f048e0, mem_ctx=0x55c9a0f38b70) at src/providers/proxy/proxy_id.c:1659
  #5  proxy_account_info_handler_send (mem_ctx=<optimized out>, id_ctx=0x55c9a0f048e0, data=<optimized out>, params=0x55c9a0f39790) at src/providers/proxy/proxy_id.c:1758
  #6  0x000055c99fc67677 in file_dp_request (_dp_req=<synthetic pointer>, req=0x55c9a0f39470, request_data=<optimized out>, dp_flags=1, method=DPM_ACCOUNT_HANDLER, target=DPT_ID,
      name=<optimized out>, domainname=0x55c9a0f39190 "LDAP", provider=0x55c9a0efe0e0, mem_ctx=<optimized out>) at src/providers/data_provider/dp_request.c:250
  #7  dp_req_send (mem_ctx=0x55c9a0f37b60, provider=provider@entry=0x55c9a0efe0e0, domain=domain@entry=0x55c9a0f39190 "LDAP", name=<optimized out>, target=target@entry=DPT_ID,
      method=method@entry=DPM_ACCOUNT_HANDLER, dp_flags=dp_flags@entry=1, request_data=0x55c9a0f37c00, _request_name=0x55c9a0f37b60) at src/providers/data_provider/dp_request.c:295
  #8  0x000055c99fc6a132 in dp_get_account_info_send (mem_ctx=<optimized out>, ev=0x55c9a0eddbc0, sbus_req=<optimized out>, provider=0x55c9a0efe0e0, dp_flags=1,
      entry_type=<optimized out>, filter=0x55c9a0f358d0 "name=nobody@ldap", domain=0x55c9a0f39190 "LDAP", extra=0x55c9a0f354a0 "") at src/providers/data_provider/dp_target_id.c:528
  #9  0x00007fb4da35265b in _sbus_sss_invoke_in_uusss_out_qus_step (ev=0x55c9a0eddbc0, te=<optimized out>, tv=..., private_data=<optimized out>) at src/sss_iface/sbus_sss_invokers.c:2847
  #10 0x00007fb4d9cfb1cf in tevent_common_invoke_timer_handler () from /lib64/libtevent.so.0
  #11 0x00007fb4d9cfb339 in tevent_common_loop_timer_delay () from /lib64/libtevent.so.0
  #12 0x00007fb4d9cfc2f9 in epoll_event_loop_once () from /lib64/libtevent.so.0
  #13 0x00007fb4d9cfa7b7 in std_event_loop_once () from /lib64/libtevent.so.0
  #14 0x00007fb4d9cf5b5d in _tevent_loop_once () from /lib64/libtevent.so.0
  #15 0x00007fb4d9cf5d8b in tevent_common_loop_wait () from /lib64/libtevent.so.0
  #16 0x00007fb4d9cfa757 in std_event_loop_wait () from /lib64/libtevent.so.0
  #17 0x00007fb4dd955ac3 in server_loop (main_ctx=0x55c9a0edf090) at src/util/server.c:724
  #18 0x000055c99fc59760 in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:747
  (gdb) l
  733         ret = remove_duplicate_group_members(tmp_ctx, grp, &ngroup);
  734         if (ret != EOK) {
  735             DEBUG(SSSDBG_CRIT_FAILURE, "Failed to remove duplicate group member     s\n");
  736             goto done;
  737         }
  738
  739         DEBUG_GR_MEM(SSSDBG_TRACE_LIBS, ngroup);
  740
  741         ret = sysdb_transaction_start(sysdb);
  742         if (ret != EOK) {
  743             DEBUG(SSSDBG_CRIT_FAILURE, "Failed to start transaction\n");
  744             goto done;
  745         }
  746         in_transaction = true;
  747
  748         if (ngroup->gr_mem && ngroup->gr_mem[0]) {
  749             attrs = sysdb_new_attrs(tmp_ctx);
  750             if (!attrs) {
  751                 DEBUG(SSSDBG_CRIT_FAILURE, "Allocation error?!\n");
  752                 ret = ENOMEM;
  (gdb) p ngroup
  $1 = (struct group *) 0x0
  743             DEBUG(SSSDBG_CRIT_FAILURE, "Failed to start transaction\n");
  744             goto done;
  745         }
  746         in_transaction = true;
  747
  748         if (ngroup->gr_mem && ngroup->gr_mem[0]) {
  749             attrs = sysdb_new_attrs(tmp_ctx);
  750             if (!attrs) {
  751                 DEBUG(SSSDBG_CRIT_FAILURE, "Allocation error?!\n");
  752                 ret = ENOMEM;
  (gdb) p ngroup
  $1 = (struct group *) 0x0

Merges: https://pagure.io/SSSD/sssd/pull-request/4036

Resolves:
https://pagure.io/SSSD/sssd/issue/4037

Reviewed-by: Jakub Hrozek <jhrozek@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.