Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pthread_once: g_lock may lead deadlock #10752

Merged
merged 1 commit into from Sep 26, 2023

Conversation

hujun260
Copy link
Contributor

@hujun260 hujun260 commented Sep 21, 2023

Summary

For programs with the dependencies logic in pthread_once callback , using global locks may cause deadlock:

task A
pthread_once()
|
|-> nxrmutex_lock(&g_lock);
 -> init_routine(); // callback to wait task B
                                                  task B
                                                  pthread_once()
                                                  |
                                                   ->nxrmutex_lock(&g_lock); // Deadlock
                                                   ->init_routine(); // hold resource to wake up task A

Impact

pthread_once

Testing

internal

include/pthread.h Outdated Show resolved Hide resolved
libs/libc/pthread/pthread_once.c Outdated Show resolved Hide resolved
Copy link
Contributor

@jerpelea jerpelea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add commit message describing the change

@@ -136,7 +136,7 @@

/* Used to initialize a pthread_once_t */

#define PTHREAD_ONCE_INIT (false)
#define PTHREAD_ONCE_INIT {false, PTHREAD_MUTEX_INITIALIZER}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, reviewing it from mobile. Is this a recursive mutex?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rmutex is not necessary. If a user uses the same once_control in init_routine, it can cause a deadlock, which we believe is a user code design issue.

If rmutex is used, it can cause stack overflow and make it difficult to troubleshoot

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why rmutex can cause stack overflow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why rmutex can cause stack overflow?

Too many levels of recursive calls

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think taking into account once state value I do not expect more than one level of nesting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I misunderstood. It should not cause recursive calls, but using rmutex cannot guarantee that init_route execution is complete when pthread_once returns

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The POSIX specifies:

If init_routine recursively calls pthread_once() with the same once_control, the recursive call will not call the specified init_routine, and thus the specified init_routine will not complete, and thus the recursive call to pthread_once() will not return.

Not sure how that should be treated. Especially the last sentence the recursive call to pthread_once() will not return.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also POSIX contains:

For dynamic library initialization in a multi-threaded process, if an initialization flag is used the flag needs to be protected against modification by multiple threads simultaneously calling into the library. This can be done by using a mutex (initialized by assigning PTHREAD_MUTEX_INITIALIZER). However, the better solution is to use pthread_once() which is designed for exactly this purpose

So in this case I do not see a difference from PTHREAD_MUTEX_INITIALIZER described case in POSIX

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's normally a program error to call an initialization routine recursively, it's good to catch this error through the non-recursive mutex(dead lock).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pkarashchenko do you have more question?

libs/libc/pthread/pthread_once.c Outdated Show resolved Hide resolved
@@ -136,7 +136,7 @@

/* Used to initialize a pthread_once_t */

#define PTHREAD_ONCE_INIT (false)
#define PTHREAD_ONCE_INIT {false, PTHREAD_MUTEX_INITIALIZER}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also POSIX contains:

For dynamic library initialization in a multi-threaded process, if an initialization flag is used the flag needs to be protected against modification by multiple threads simultaneously calling into the library. This can be done by using a mutex (initialized by assigning PTHREAD_MUTEX_INITIALIZER). However, the better solution is to use pthread_once() which is designed for exactly this purpose

So in this case I do not see a difference from PTHREAD_MUTEX_INITIALIZER described case in POSIX

For programs with the dependencies logic in pthread_once callback , using global locks may cause deadlock:

task A
pthread_once()
|
|-> nxrmutex_lock(&g_lock);
 -> init_routine(); // callback to wait task B
                                                  task B
                                                  pthread_once()
                                                  |
                                                   ->nxrmutex_lock(&g_lock); // Deadlock
                                                   ->init_routine(); // hold resource to wake up task A

Signed-off-by: hujun5 <hujun5@xiaomi.com>
@xiaoxiang781216 xiaoxiang781216 merged commit b669306 into apache:master Sep 26, 2023
26 checks passed
@jerpelea jerpelea added this to To-Add in Release Notes - 12.3.0 Sep 26, 2023
@jerpelea jerpelea moved this from To-Add to done in Release Notes - 12.3.0 Oct 3, 2023
@hujun260 hujun260 deleted the apache_2 branch October 11, 2023 02:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

4 participants