-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make shutdown sequence more robust under cooperative scheduling #788
Conversation
dc88755
to
c6de3e7
Compare
c6de3e7
to
76728df
Compare
46a205f
to
f8553fa
Compare
/* Set termination flag to notify lthread scheduler to bail out. */ | ||
lthread_notify_completion(); | ||
SGXLKL_VERBOSE("calling vio_terminate()\n"); | ||
vio_terminate(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this go before the lkl_sys_halt
call? Or will VirtIO interrupts be silently ignored if they happen after the kernel has exited? I vaguely remember some shutdown-gate code for this, but it would be good to have a comment explaining this assumption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we move it earlier, there may still be background I/O by the kernel, no? (I tried this, and I saw failures.)
pthread_mutex_unlock(&terminating_ethread_exited_mtx); | ||
|
||
// Only try to destroy enclave if all ethreads successfully exited | ||
if (oe_enclave && exited_ethread_count == econf->ethreads) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? Don't we want to kill the enclave as long as one thread has delivered to us a return value? The other threads are not doing anything we care about...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but the problem is that the OE terminate API call segfaults in these cases: #788 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably need to file that as an OE bug and link to that issue here. If you can’t tear down an uncooperative enclave, that’s a problem. That said, what happens if we send a signal to all of the host pthreads that back our ethreads first? Does that not let us exit them all from the enclave and then tear it down?
2d05ec0
to
f6024e2
Compare
147866e
to
2778aba
Compare
@@ -138,10 +138,22 @@ extern int sgxlkl_trace_redirect_syscall; | |||
} \ | |||
} while (0) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/** | |
* Pointer to the thread context. This is needed because of limitations in the C address space syntax. | |
*/ | |
typedef struct schedctx* threadctx_ptr; | |
/** | |
* Pointer to the per-ethread context. | |
*/ | |
static const __seg_gs threadctx_ptr *ethread_schedctx = ((__seg_gs threadctx_ptr *)48); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this have the value of the first ethread which evaluates this or it indeed would act as ethread local variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Accesses to this translate to %gs:48
.
struct schedctx* _sgxlkl_verbose_self; \ | ||
__asm__ __volatile__("mov %%gs:48,%0" : "=r"(_sgxlkl_verbose_self)); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
struct schedctx* _sgxlkl_verbose_self; \ | |
__asm__ __volatile__("mov %%gs:48,%0" : "=r"(_sgxlkl_verbose_self)); \ | |
struct schedctx* _sgxlkl_verbose_self = *ethread_schedctx; \ |
All of the assembly for accessing this with hard-coded constants all over the place makes me nervous. Let's add a global for this and clean up code to use that. GCC and clang have good support for gs-relative addresses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I couldn't get this to work, as I presume that this needs gnu99 support? For this, we would also need to change musl to be compiled with gnu99...?
This helps identify cases when a thread exit for tid=1 causes problems.
The test itself is broken. It tries to unmount a file system from a non-existent mount point.
By creating a separate thread for the execution of the application's main function, it can join in the regular way after termination.
104d2ec
to
7c158a5
Compare
@vtikoo, since you'll probably end up merging this one, please will you do the following sequence:
Thank you! |
This PR changes the shutdown sequence to make it more robust under cooperative scheduling. The main idea is to use the idle host task for the shutdown and only allow a single ethread in the enclave during shutdown. In addition, it introduces a separate lthread to run the application's main function and makes this lthread exit in a regular fashion.
The PR also fixes a memory corruption issue that potentially affected shutdown, and ensures that userspace threads cannot make SGX-LKL shutdown hang.
The problem, illustrated by a new exit spin test in this PR, is that an application that has a thread with a busy compute loop can make SGX-LKL shutdown hang. There is no clean way under cooperative scheduling to make the associated ethread exit the enclave. In the current shutdown design, we need all ethreads to exit the enclave, which is brittle.
In this design, a single ethread is designated the "terminating ethread" and executes the shutdown code after the other ethreads have left the enclave. Only the terminating ethread needs to exit the enclave successfully. The launcher process will then quit, with ethreads potentially still executing in the enclave.
Fixes #733 #795 #796 #797 #802
Known limitations
Even when the clone test passes, I see garbage characters being printed (scroll to the right below to see them):
The same corruption happens in current runs of the clone test (without this PR), so I believe that the above is not a new regression introduced by this PR (cc: @davidchisnall @vtikoo @SeanTAllen). This may also be related to #740.
oe_terminate_enclave()
call results in a segfault in such cases, which may be an OE bug:This PR avoids calling
oe_terminate_enclave()
if not all ethreads have exited, which shouldn't be a problem as I presume that the enclave will be terminated correctly on process termination.