Infinite recursion in TLS tramp guard #98
Comments
I'll work on getting an easy reproducer for this scenario, but I wanted to get this problem description up now so others can start thinking about it. Another troublesome aspect is that the thread's SIGSEGV leads to the mutator stalling too. I haven't characterized that issue yet though.
Two questions to think about: is it preferable to pay the costs for safety at mutator design time, at instrumentation time, or at run time, and do we need a solution to scale to an arbitrary number of threads?
It would be an extremely high burden to design our mutator without any potential recursion so that tramp guards could be disabled, especially since systemtap gives the user a lot of freedom in what to do in their instrumentation. The only other option I see from the mutator design is to forbid instrumenting anywhere in the
I can't really evaluate the costs at instrumentation time or run time until we have proposals for what those fixes would entail. I have one hack written to insert a global atomic tramp counter before TLS is accessed, to limit how deep it can go, but this has its own drawbacks. I'm also exploring whether there are any ways to reconfigure TLS itself. I'm curious to hear what ideas you have.
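A minimal sketch of what such a global atomic counter could look like, written with C11 atomics. This is not the actual hack: the names tramp_enter, tramp_leave, and TRAMP_DEPTH_LIMIT are hypothetical, and the limit of 4 is an arbitrary assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical cap on nested tramp entries; the value is an assumption. */
#define TRAMP_DEPTH_LIMIT 4

/* Process-wide counter: it lives in ordinary static storage, so reading it
 * never goes through __tls_get_addr(). */
static atomic_int tramp_depth;

/* Called before the TLS-based guard is touched. Returns false when the
 * instrumentation should be skipped because nesting is already too deep. */
bool tramp_enter(void)
{
    int prev = atomic_fetch_add(&tramp_depth, 1);
    if (prev >= TRAMP_DEPTH_LIMIT) {
        atomic_fetch_sub(&tramp_depth, 1);  /* undo our increment */
        return false;
    }
    return true;
}

/* Called when the instrumentation finishes; pairs with a successful enter. */
void tramp_leave(void)
{
    int prev = atomic_fetch_sub(&tramp_depth, 1);
    assert(prev > 0);  /* catches an unbalanced leave */
}
```

Because the counter is shared by every thread, several threads running instrumentation concurrently also count toward the limit, so legitimate non-recursive entries could be skipped. That may be one of the drawbacks alluded to above.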
It sure would be nice. I know this is why we switched to a TLS tramp guard in the first place.
Bart and I had a talk about this; all solutions we have come up with thus far are kind of gross. Recall that we do have the ability to instrument or execute RPCs at thread creation time. So here's what I can figure as our options, in two orthogonal sets:
TLS CREATION TIME
ENSURING SAFETY
With respect to DYNINST_lock_tramp_guard executing unmodified __tls_get_addr, there are multiple approaches: redirecting that particular call to an unmodified copy is probably the simplest, but I'm not positive I'm seeing all the angles on it.
There's a similar sort of problem with signal handlers that use TLS variables. The general advice there is to touch TLS early in each thread before a signal can get to it. The safe way in this case seems like it would require freezing all threads, removing or disabling ld/libc/libpthread instrumentation, touching TLS in the new thread, then re-instrumenting and resuming. A rather heavy solution.
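The "touch TLS early" advice for signal handlers can be sketched as a thread start wrapper. This only illustrates the general idea, not the freeze-and-reinstrument procedure described above. All names here (tls_guard, start_args, start_touching_tls, example_worker) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread-local variable standing in for a TLS tramp guard. */
static __thread volatile int tls_guard;

/* Bundles the real start routine and its argument for the wrapper. */
struct start_args {
    void *(*fn)(void *);
    void *arg;
};

/* Start routine intended to be passed to pthread_create() in place of the
 * real one. Its first action is a TLS access, so any lazy per-thread TLS
 * setup happens before the thread's own code runs. */
void *start_touching_tls(void *p)
{
    struct start_args *a = p;
    assert(a != NULL && a->fn != NULL);
    tls_guard = 0;             /* first touch of this thread's TLS */
    return a->fn(a->arg);
}

/* Trivial worker used only to demonstrate the wrapper. */
void *example_worker(void *arg)
{
    return arg;
}
```

A caller would fill in a start_args and hand start_touching_tls to pthread_create. As noted above, this alone does not help if the code reached by that first touch is itself instrumented.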
That sounds great, but I don't think this call chain can be statically determined. It looks like
I spoke with Carlos O'Donell about this issue, and he let me in on the secret that is Static TLS, via the tls-model "initial-exec". This can be chosen either by
Briefly reviewing the MSDN articles on TLS and thread, it sounds like Windows has a pretty similar static model for
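A minimal sketch of a guard variable placed in static TLS, using the tls_model attribute that GCC and Clang provide (GCC also has a -ftls-model= option). The names guard_held, lock_tramp_guard, and unlock_tramp_guard are hypothetical and are not Dyninst's actual implementation.

```c
#include <assert.h>

/* With the initial-exec model, the variable is addressed at a fixed offset
 * from the thread pointer, so accesses are not expected to call
 * __tls_get_addr(). Zero-initialized: the guard starts out free. */
static __thread int guard_held __attribute__((tls_model("initial-exec")));

/* Returns 1 if the caller acquired the guard, 0 if this thread is already
 * inside instrumentation and should skip it. */
int lock_tramp_guard(void)
{
    if (guard_held)
        return 0;
    guard_held = 1;
    return 1;
}

/* Releases the guard taken by a successful lock_tramp_guard(). */
void unlock_tramp_guard(void)
{
    assert(guard_held);
    guard_held = 0;
}
```

One caveat worth checking: a shared library that uses initial-exec TLS and is loaded with dlopen depends on the loader having spare static TLS space, so this choice may not suit every runtime library.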
Using systemtap, I have tried to instrument the mutex_entry static probe point in libpthread.so. When any new thread tries to take a lock, it recurses on itself until the stack overflows. The sequence looks something like:
1. The thread reaches the mutex_entry address and starts the instrumentation.
2. The instrumentation calls DYNINST_lock_tramp_guard() to make sure we aren't recursing.
3. That looks up DYNINST_tls_tramp_guard as an offset from __tls_get_addr().
4. That calls tls_get_addr_tail().
5. That calls __pthread_mutex_lock().
6. That reaches mutex_entry ... repeat until the stack runs out and we hit SIGSEGV.

I don't think there's anything actually special about the exact mutex_entry point here -- it's just a conveniently low-level place to probe. Probably any point within __tls_get_addr() or below will have similar hazards.
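The cycle can be modeled in a few lines of plain C. This is a toy simulation, not Dyninst or glibc code: every function below is a hypothetical stand-in for the real one it is named after, and MAX_DEPTH is an artificial cap so the model terminates instead of overflowing the stack.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_DEPTH 64           /* artificial cap so the model terminates */

static bool tls_block_ready;   /* models "this thread's TLS block exists" */
static int  depth;             /* current nesting of the modeled tramp */
static int  max_depth_seen;    /* deepest nesting reached */

static void model_mutex_lock(void);

/* Stand-in for __tls_get_addr(): until the block exists, the slow path
 * takes a lock, which re-enters the probed function. */
static void model_tls_get_addr(void)
{
    if (!tls_block_ready) {
        model_mutex_lock();
        tls_block_ready = true;  /* only reached after the lock call returns */
    }
}

/* Stand-in for the tramp guard check: the guard lives in TLS, so TLS has
 * to be resolved before the guard can even be read. */
static void model_lock_tramp_guard(void)
{
    model_tls_get_addr();
}

/* Stand-in for the probed lock function: instrumentation runs on entry. */
static void model_mutex_lock(void)
{
    if (++depth > max_depth_seen)
        max_depth_seen = depth;
    if (depth < MAX_DEPTH)       /* the real code has no such cap */
        model_lock_tramp_guard();
    --depth;
}

/* Runs the model for one fresh thread and reports the deepest nesting. */
int run_model(void)
{
    depth = 0;
    max_depth_seen = 0;
    tls_block_ready = false;
    model_mutex_lock();
    assert(depth == 0);          /* everything unwound */
    return max_depth_seen;
}
```

In the model the nesting always runs up to the cap, because the flag that would stop the recursion is only set after the nested lock call returns. That mirrors why the guard cannot protect its own first access.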