Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use an exit flag as well as pthread_cancel to tell worker threads to terminate #8

Merged
merged 5 commits into from
Nov 27, 2016

Conversation

chutz
Copy link

@chutz chutz commented Sep 8, 2016

It turns out that OSX's pthread_cancel implementation is not very reliable, so this switches to a flag variable rather than using pthread_cancel to tell threads that they should exit.

This also removes the volatile keyword from the qstate variable. It's not necessary since the variable is always protected by a mutex, and it causes the compiler to not be able to optimize properly.

@vapier
Copy link

vapier commented Sep 8, 2016

pedantically, you want "its"

wrt pthread_cancel changing, imo the commit message needs more details/references. you should also leave comments in the code as to why you're switching to a manual polling method. by not using canceling, you possibly allow threads to stay hung up in loops/syscalls when a cancel request would kick them out automatically (look at all the "cancellation endpoints").

@chutz chutz force-pushed the use-flag-rather-than-pthread-cancel branch from ba49a49 to 290d1ee Compare September 9, 2016 01:12
@chutz
Copy link
Author

chutz commented Sep 9, 2016

That's a good point.

I have updated the patch to take both approaches, so we will get all the advantages of pthread_cancel, and the backup of an exit flag for OSX , so if the cancel doesn't propagate for some reason we will still exit.

@chutz chutz force-pushed the use-flag-rather-than-pthread-cancel branch from 290d1ee to a1851d4 Compare September 9, 2016 01:15
@chutz chutz changed the title Use flag rather than pthread cancel Use an exit flag as well as pthread_cancel to tell worker threads to terminate Sep 9, 2016
@chutz chutz force-pushed the use-flag-rather-than-pthread-cancel branch from a1851d4 to ac462a9 Compare September 9, 2016 01:32
Patrick McLean added 3 commits September 8, 2016 18:41
Since this variable is protected by locks everywhere it is used,
volatile is useless, and just makes things slower.
@chutz chutz force-pushed the use-flag-rather-than-pthread-cancel branch 2 times, most recently from 13b407d to 21c72dd Compare September 9, 2016 01:47
@vapier
Copy link

vapier commented Sep 9, 2016

i think their preferred term now is "macOS" :)

i'm still confused by what you mean by "misbehaves". does it randomly crash ? randomly hang ? randomly make chicken soup ? or do the other threads just randomly fail to notice they've been canceled ?

@chutz
Copy link
Author

chutz commented Sep 9, 2016

Sorry about that, I hadn't realized that I didn't properly articulate that problem. It seems that the threads occasionally fail to notice that they have been cancelled, and never terminate. When the main thread joins them, it ends up hanging forever.

@vapier
Copy link

vapier commented Sep 9, 2016

OK -- throw that in the comment. guess there isn't a whole lot we could do if calling pthread_cancel() goes unnoticed but the threads keep checking.

can you do some measurements on perf impact ? maybe we should ifdef the damage away to __APPLE__.

@chutz chutz force-pushed the use-flag-rather-than-pthread-cancel branch from 21c72dd to 8b1a78d Compare September 9, 2016 05:10
@chutz
Copy link
Author

chutz commented Sep 9, 2016

OK, I just pushed over some changes that refactor so all the thread stopping/cancellation code is in one place, then gated the damage away to __APPLE__

@chutz chutz force-pushed the use-flag-rather-than-pthread-cancel branch 2 times, most recently from 216c62b to 72a53a2 Compare September 9, 2016 05:31
#ifdef __APPLE__
/* Check if we should exit, we are doing both cancel and exit condition
* since on OSX threads seem to occasionally fail to notice when they have
* been cancelled. We want to havea backup to make sure that we won't hang
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"have a"

@vapier
Copy link

vapier commented Sep 9, 2016

overall seems fine to me now

@chutz chutz force-pushed the use-flag-rather-than-pthread-cancel branch 2 times, most recently from 020749e to d648cef Compare September 9, 2016 06:36
@chutz
Copy link
Author

chutz commented Sep 9, 2016

OK, final changes made.

…tion

This code is pretty repetitive, so let's split it out to its own function so we
won't have so much repeated code.
@chutz chutz force-pushed the use-flag-rather-than-pthread-cancel branch from d648cef to efaeffa Compare September 9, 2016 06:39
@vapier
Copy link

vapier commented Sep 9, 2016

thanks! let's see if @rapier1 has anything to add.

It turns out that apparently on OSX, pthread_cancel is not completely
reliable. It seems that occasionally a thread will fail to notice that it has
been cancelled. Repeatedly running the openssh integrity test suite will
eventually deadlock on exit on OSX (I think it took about an hour for
the person I had testing this).

This changes the code to use an exit flag as well as pthread_cancel
to tell the worker threads to exit. This should give us the best of
both worlds.
@chutz chutz force-pushed the use-flag-rather-than-pthread-cancel branch from efaeffa to 526b442 Compare September 9, 2016 17:13
@chutz
Copy link
Author

chutz commented Sep 9, 2016

One final minor fix I though of this morning. After stopping the threads in ssh_aes_ctr_init, make sure to reset the exit flag on __APPLE__ so the new threads don't immediately terminate.

@vapier vapier merged commit b616cf2 into rapier1:master Nov 27, 2016
@vapier
Copy link

vapier commented Nov 27, 2016

merged!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants