Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coroutine stackoverflow detection for linux/posix; Issue #2408 #2740

Merged
merged 12 commits into from Jul 15, 2017

Conversation

ct-clmsn
Copy link
Contributor

@ct-clmsn ct-clmsn commented Jul 5, 2017

Initial implementation of stackoverflow detection for coroutines issue tracker #2408 . The source was derived from posix/linux system man pages, the rethinkdb link provided in the issue conversation thread, and a blog post about coroutine implementations.

@ct-clmsn ct-clmsn changed the title PR for Stackoverflow detection for linux, e.g. based on libsigsegv #2408 PR for Stackoverflow detection for linux/posix #2408 Jul 5, 2017
@ct-clmsn ct-clmsn changed the title PR for Stackoverflow detection for linux/posix #2408 PR for stackoverflow detection for linux/posix #2408 Jul 5, 2017
@ct-clmsn ct-clmsn changed the title PR for stackoverflow detection for linux/posix #2408 Coroutine stackoverflow detection for linux/posix; Issue #2408 Jul 5, 2017
@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented Jul 5, 2017

@hkaiser mixed up my unit test with an existing unit test. will rollback and create a new unit test. on that note, should the coroutine errors be printed using std::cerr or should they be printed out using the hpx compliment streaming output sink?

@@ -140,7 +151,19 @@ namespace hpx { namespace threads { namespace coroutines

x86_linux_context_impl()
: m_stack(nullptr)
{}
{
segv_stack.ss_sp = valloc(SEGV_STACK_SIZE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we have that depending on a define such that is configurable through cmake?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sithhell sure thing. How would you want to distinguish or designate the cmake variables? example: HPX_COROUTINE_STACKOVERFLOW_DETECTION

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sithhell is streaming to std::cerr acceptable or would ya'll prefer hpx::cerr?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually name our cmake config variable HPX_WITH_<FOO> and the corresponding preprocessor constant HPX_HAVE_<FOO>.

Also, hpx::cerr might not really work as it would send things to the console which might not be operational under stack-overflow conditions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sithhell just placed in the cmake modification you suggested - hope that's sufficient. thanks! @hkaiser thank you for the guidance!

@ABresting
Copy link
Contributor

@ct-clmsn I can see the code you have written and am a bit afraid as the code is somewhat similar to that of libsigsegv, it's a GPL'ed library so please be a bit careful about it before opening any PR.

I will get back to you about why it's not working but kindly take care of the GPL issue.

@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented Jul 5, 2017

@ABresting I'm aware of the GPL concerns mentioned in issue #2408. I did not read and have not read the libsegsegv implementation source files. At the top of this PR's commentary thread, I noted the reference materials used to implement this PR. I'd strongly encourage reviewing the links provided at the top of this PR commentary thread if you have concerns regarding the origin of this PR's implementation.

segv_stack.ss_flags = 0;
segv_stack.ss_size = SEGV_STACK_SIZE;

bzero(&action, sizeof(action));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I suggest to use std::memset(&action), '\0', sizeof()); instead of bzero (also #include <cstring> for memset to be declared)? bzero is not portable (not that we would need for it to be portable) and not part of the std C++ library.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hkaiser sure thing! modifications made and pushed.

@ABresting
Copy link
Contributor

@ct-clmsn rethink-db's method is under Apache licensing, so even I am not so sure how safe it will be.

@biddisco
Copy link
Contributor

biddisco commented Jul 6, 2017

The apache license allows redistribution under a new license, providing the original code keeps the apache license, so it would be compatible with HPX if any apache licensed code has been used directly.

from https://www.whitesourcesoftware.com/whitesource-blog/top-10-apache-license-questions-answered/

"You can choose to release the modified or derived products under different licenses, the unmodified parts of the software, however, must retain the Apache License, and you cannot name your modified version in any way that suggests that the final product is endorsed or created by the ASF.

Additionally, if you want to add a copyright statement about all the modifications that you’ve done to any Apache licensed software; you are free to do so. Since the Apache License doesn’t require you to release the modified code under the same license, you can choose to add specific license terms and conditions that govern how others use, reproduce, or distribute your modified code."

@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented Jul 6, 2017

@ABresting the implementation of this PR used tutorial information provided in the two blog posts (blogs provided examples and explanations - not licensed program source code) combined with my interpretation of posix/linux manual pages. @biddisco thanks!

@biddisco
Copy link
Contributor

biddisco commented Jul 6, 2017

So to finish my thought on the subject - if we were to use the code, add the boost licence, add a note to say that the code was inspired by the blog posts #link1, #link2, &etc then since the author has not looked at the other library, it qualifies as not being a derivative work of the GPL'd code. It could be construed as a derivative of some apache stuff and we therefore note that, include the apache license and refer again to the articles, then I think we're ok. If anyone ever raises an objection, the code can be pulled until the issue is resolved - at that time.

@hkaiser
Copy link
Member

hkaiser commented Jul 6, 2017

@biddisco: sounds like a plan. Better beg for forgiveness afterwards than to ask for permission upfront ;-)

@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented Jul 7, 2017

@biddisco @hkaiser web-link-resources have been added to the source in this PR in a comment; pretty sure all of the extra EOL character spaces have been removed.

@hkaiser
Copy link
Member

hkaiser commented Jul 7, 2017

Thanks Chris. Code-wise this looks good now. I will try it out asap (I think @sithhell wanted to try it out as well).

@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented Jul 7, 2017

Great! Thank you! Should the error message include a suggestion about problem decomposition?

hpx_option(HPX_WITH_THREAD_STACKOVERFLOW_DETECTION
BOOL
"Enable stackoverflow detection for HPX threads/coroutines. (default: ON)"
ON ADVANCED)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make this OFF by default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest leaving it ON by default at least in Debug mode.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, would it make sense to make enable/disable a runtime property?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about the overheads this detection adds. I am leaning towards only enabling it on request (If posix/linux based platform), given that prerequiste, we can also safely enable it when the build type is debug.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sithhell @hkaiser just to clarify: disable by default and enable for debug builds by default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ct-clmsn Enable for debug builds by default, Disable by default for the remaining build types, I'd say.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sithhell: we were not able to resolve a similar configuration issue a while back (see: #2365). How would you suggest to do this (if you had a new idea we could go back and fix #2365 as well...)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue we had back then was to supporting multi configuration builds. As those don't exist in any of the *nix ecosystems (or does WSL change the picture?), I would consider this a non problem for this particular feature.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh right, I withdraw my comment, then...

@@ -14,6 +14,7 @@ set(tests
thread_launching
thread_mf
thread_stacksize
thread_stacksize_overflow
thread_suspension_executor
thread_yield
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CMakeLists.txt here need to flag the added test as "will fail". I suggest something like this:

set_tests_properties(tests.unit.threads.thread_stacksize_overflow PROPERTIES
  PASS_REGULAR_EXPRESSION "Stack overflow in coroutine at address 0x[0-9a-fA-F]*")

This will mark the test as passed when it fails with that output. That would require to add this test only when the overflow detection is activated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this should be enabled on Linux-based systems only.

Copy link
Contributor Author

@ct-clmsn ct-clmsn Jul 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sithhell thank you for the suggestion and assistance. @hkaiser will be sure to provide a platform check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ct-clmsn I will do that. Let's go ahead with this as is.

Copy link
Member

@sithhell sithhell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the comments are resolved, this PR is ready to be merged. This is a long needed addition.

<< "hpx.stacks.large_size, "
<< "or hpx.stacks.huge_size runtime "
<< std::endl
<< "flags to configure coroutine heap sizes."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should read "stack sizes"

<< "hpx.stacks.large_size, "
<< "or hpx.stacks.huge_size runtime "
<< std::endl
<< "flags to configure coroutine heap sizes."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@hkaiser hkaiser merged commit 0c10ac3 into STEllAR-GROUP:master Jul 15, 2017
@ct-clmsn
Copy link
Contributor Author

ct-clmsn commented Jul 16, 2017

@hkaiser thank you for helping to finalize the merge!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants