Skip to content
This repository has been archived by the owner on Oct 12, 2022. It is now read-only.

fix Issue 13416 - dead-lock in FreeBSD suspend handler #1061

Merged
merged 4 commits into from
Dec 15, 2014

Conversation

MartinNowak
Copy link
Member

  • use pthread internal THR_IN_CRITICAL to retry suspend

Issue 13416

- use pthread internal THR_IN_CRITICAL to retry suspend
@MartinNowak
Copy link
Member Author

This is for you @braddr :)

@schveiguy
Copy link
Member

I'd love to merge (having just deprecated one of those freebsd tests that hung), but I'm afraid I am not qualified to review :) The inter-thread communication looks simple enough, and correct to me.

The major concern I have on this is that you are relying on pthread implementation details. Is there any way to static assert or assert(0) at runtime when those change?

@MartinNowak
Copy link
Member Author

The major concern I have on this is that you are relying on pthread implementation details. Is there any way to static assert or assert(0) at runtime when those change?

None that I know of other than to explicitly check for FreeBSD versions.
The good news is, that those structs change extremely rarely, because it breaks binary compatibility.
The start of the struct was touched last in 2006 to actually add the critical_count.
freebsd/freebsd-src@d6c88c0#diff-d06c43019bc03b70015bb933d5e0d6f0R350
And there doesn't seem to be a different way to do this (even tried to override SIGCANCEL to use pthread_suspend_np and still get the top of the stack in the signal handler).
Maybe they will implement pthread_stackseg_np from OpenBSD at some point, but it's rather unlikely.

@schveiguy
Copy link
Member

The good news is, that those structs change extremely rarely, because it breaks binary compatibility.

pthread_t is an opaque object, it can be changed without breaking binary compatibility.

Any idea if they would accept a request to add the THR_IN_CRITICAL function to the library itself? What about compiling it in C and #include the pthread_t structure? Where is it defined?

@MartinNowak
Copy link
Member Author

pthread_t is an opaque object, it can be changed without breaking binary compatibility.

It does change the internal ABI. It's not a kernel struct AFAIK, but changing it would still require any binary code that uses thr_private to be changed. And they wouldn't have a good reason to change the beginning of that struct.

Any idea if they would accept a request to add the THR_IN_CRITICAL function to the library itself?

THR_IN_CRITICAL a primitive from their library.
https://github.com/freebsd/freebsd/blob/fda27c9937b724b3c3e24266d58c4be4d1839e7f/lib/libthr/thread/thr_private.h#L543
Even if we could convince them to add it (which I don't we could), it would take ages to land in a relevant release, because FBSD development is very conservative.
They also ignored my last patch to fix timezone code.

What about compiling it in C and #include the pthread_t structure? Where is it defined?

Sounds nice, we could even add C code to fetch the values, I think I tried that and it wasn't possible to include that header, not sure right now. It would only partly solve our problem, we'd then have to make 2 releases compiled against different versions of FBSD.
If possible i'll add a C based offsetof unittest.

@schveiguy
Copy link
Member

If possible i'll add a C based offsetof unittest.

I like this plan.

@MartinNowak
Copy link
Member Author

I like this plan.

It's not possible, the header doesn't exist outside of libthr.
I think I've even tried before.
So be pragmatic and solve the dead-lock?

@schveiguy
Copy link
Member

OK, then I guess we do what we have to. I think you are probably correct in that it won't change.

  1. The likelihood of this breaking code is low
  2. The code only reads the struct, does not write to it, so it can't mess up operation even if the struct layout changes.
  3. The breakage will be quite catastrophic (and so won't go unnoticed), or will simply revert to the original behavior.

Can you add a warning in the comment for THR_IN_CRITICAL noting the potential issue of the pthread struct changing, and where to find this function, so future generations that may have to deal with this know where to look if it breaks?

Other than that, LGTM.

@MartinNowak
Copy link
Member Author

future generations that may have to deal with this know where to look if it breaks?

Done, can't hurt to add a few lines, even though there already is the git history and I used the literal name of that macro.

@schveiguy
Copy link
Member

Nice, thanks. LGTM.

@MartinNowak
Copy link
Member Author

Nice, thanks. LGTM.

Added one more commit. Can you please merge it afterwards?

@schveiguy
Copy link
Member

Auto-merge toggled on

schveiguy added a commit that referenced this pull request Dec 15, 2014
fix Issue 13416 - dead-lock in FreeBSD suspend handler
@schveiguy schveiguy merged commit 513ba19 into dlang:master Dec 15, 2014
@schveiguy
Copy link
Member

Might as well, I think the best way to get this going is to merge and see if anything bad happens :)
It can't be worse than getting a hung test every so often on freebsd.

@MartinNowak MartinNowak deleted the fix13416 branch December 15, 2014 19:49
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants