Make shutdown sequence more robust under cooperative scheduling #788

prp · 2020-08-14T10:31:19Z

This PR changes the shutdown sequence to make it more robust under cooperative scheduling. The main idea is to use the idle host task for the shutdown and only allow a single ethread in the enclave during shutdown. In addition, it introduces a separate lthread to run the application's main function and makes this lthread exit in a regular fashion.

The PR also fixes a memory corruption issue that potentially affected shutdown, and ensures that userspace threads cannot make SGX-LKL shutdown hang.

The problem, illustrated by a new exit spin test in this PR, is that an application that has a thread with a busy compute loop can make SGX-LKL shutdown hang. There is no clean way under cooperative scheduling to make the associated ethread exit the enclave. In the current shutdown design, we need all ethreads to exit the enclave, which is brittle.

In this design, a single ethread is designated the "terminating ethread" and executes the shutdown code after the other ethreads have left the enclave. Only the terminating ethread needs to exit the enclave successfully. The launcher process will then quit, with ethreads potentially still executing in the enclave.

Fixes #733 #795 #796 #797 #802

Known limitations

The clone CI test sometimes fails:

[[  SGX-LKL ]] [7f0042b46f78] [  44] app_main_thread(): Invoking dynamic loader (stage 3)
Clone syscall number: 56
lkl_syscall: 0
fn: 0x7f003daddce7 
ctid: 0x7f003de69dd8 
Stack: 0x7f003dadb000-0x7f003dadd000 
Clone returned 43, ctid: 43 ptid: 43
New thread created.
Arg: 0x42.
Stack: 0x7f003dadcfe7.
After futex call, ctid is 0
Other thread should have terminated by now.
Thread stacks: 0x7f003dadb000, 0x7f003dce20a0, 0x7f003dce40a0
Thread 0 started
Thread 1 started
Created two threads: 44, 45, waking them now

If this test hangs after waking up one thread, check you have at least 2 ethreads
Thread 0 started
Thread 1 started
Created two threads: 44, 45, waking them now
Thread This test is checking that LKL is able to wake up two cloned threads and leaving them running in parallel

1 woke up
tarted
Thread 0 woke up
Thread 0 finished
Waiting for for Thread 1 to finish
Segmentation fault
[[  SGX-LKL ]] [7f0042b46f78] [  41] lkl_terminate(): terminating LKL (exit_status=139)

Even when the clone test passes, I see garbage characters being printed (scroll to the right below to see them):

[[  SGX-LKL ]] [7f504171ef78] [  44] app_main_thread(): Invoking dynamic loader (stage 3)
Clone syscall number: 56
lkl_syscall: 0
fn: 0x7f503daddce7
ctid: 0x7f503de69dd8
Stack: 0x7f503dadb000-0x7f503dadd000
Clone returned 43, ctid: 43 ptid: 43
New thread created.
Arg: 0x42.
Stack: 0x7f503dadcfe7.
After futex call, ctid is 0
Other thread should have terminated by now.
Thread stacks: 0x7f503dadb000, 0x7f503dce20a0, 0x7f503dce40a0
Thread 0 started                                                                                                                                                                                           Thread 0 woke up                                                                                                                                                                                           >P                                                                                                                                                                                                          Pd                                                                                                                                                                                                        Created two threads: 4                                                                                                                                                                                     If this test hangs after waking up one thread, check you have at least 2 ethreads
This test is checking that LKL is able to wake up two cloned threads and leaving them running in parallel

Thread 1 woke up
Waiting for for Thread 0 to finish
Waiting for for Th
TEST_PASSED

The same corruption happens in current runs of the clone test (without this PR), so I believe that the above is not a new regression introduced by this PR (cc: @davidchisnall @vtikoo @SeanTAllen). This may also be related to #740.

The enclave cannot be shut down with an OE API call if there are still ethreads executing. The oe_terminate_enclave() call results in a segfault in such cases, which may be an OE bug:

[[  SGX-LKL ]] [7f1041b26f78] [  42] lkl_termination_thread(): done
[   SGX-LKL  ] ethread exited (ethread_id=3)
[   SGX-LKL  ] oe_terminate_enclave... 2020-08-15T19:11:30.000000Z [(H)VERBOSE] tid(0x7f28bed85300) | /home/v-pepiet/devel/sgx-lkl/build_musl/libsgxlkl.so.signed 0x0 OE_ECALL: DESTRUCTOR
2020-08-15T19:11:30.000000Z [(H)VERBOSE] tid(0x7f28bed85300) | _do_eenter(tcs=0x7f1041b21000 aep=0x4008a120 codeIn=1, funcIn=0 argIn=0)
 [/home/v-pepiet/devel/sgx-lkl/openenclave/host/sgx/calls.c:_do_eenter:178]
2020-08-15T19:11:30.000000Z [(H)VERBOSE] tid(0x7f28bed85300) | /home/v-pepiet/devel/sgx-lkl/build_musl/libsgxlkl.so.signed 0x0 EDL_OCALL: CALL_HOST_FUNCTION
2020-08-15T19:11:30.000000Z [(E)INFO] tid(0x7f28bed85300) | libsgxlkl.so.signed:attesters is NULL [/home/v-pepiet/devel/sgx-lkl/openenclave/enclave/sgx/attester.c:oe_attester_shutdown:512]
2020-08-15T19:11:30.000000Z [(H)VERBOSE] tid(0x7f28bed85300) | /home/v-pepiet/devel/sgx-lkl/build_musl/libsgxlkl.so.signed 0x0 EDL_OCALL: CALL_HOST_FUNCTION
2020-08-15T19:11:30.000000Z [(E)INFO] tid(0x7f28bed85300) | libsgxlkl.so.signed:verifiers is NULL [/home/v-pepiet/devel/sgx-lkl/openenclave/common/sgx/verifier.c:oe_verifier_shutdown:847]
2020-08-15T19:11:30.000000Z [(H)VERBOSE] tid(0x7f28aeffd700) | /home/v-pepiet/devel/sgx-lkl/build_musl/libsgxlkl.so.signed 0x0 EDL_OCALL: CALL_HOST_FUNCTION
2020-08-15T19:11:30.000000Z [(H)VERBOSE] tid(0x7f28b4f56700) | /home/v-pepiet/devel/sgx-lkl/build_musl/libsgxlkl.so.signed 0x0 EDL_OCALL: CALL_HOST_FUNCTION
2020-08-15T19:11:30.000000Z [(H)ERROR] tid(0x7f28adffb700) | tcs=0x41f29000
 [/home/v-pepiet/devel/sgx-lkl/openenclave/host/sgx/enclavemanager.c:oe_query_enclave_instance:204]
Makefile:56: recipe for target 'run-spin-hw' failed
make: *** [run-spin-hw] Aborted (core dumped)

This PR avoids calling oe_terminate_enclave() if not all ethreads have exited, which shouldn't be a problem as I presume that the enclave will be terminated correctly on process termination.

There are rare cases when the idle thread is seen to be doing too much work by the kernel, and there is a failure. A better solution would be to have a separate kernel host task to execute the termination code, but this would require larger changes.

src/lkl/setup.c

src/main-oe/sgxlkl_run_oe.c

src/sched/lthread.c

src/enclave/enclave_oe.c

davidchisnall · 2020-08-17T10:16:16Z

src/lkl/setup.c

-    /* Set termination flag to notify lthread scheduler to bail out. */
-    lthread_notify_completion();
+    SGXLKL_VERBOSE("calling vio_terminate()\n");
+    vio_terminate();


Should this go before the lkl_sys_halt call? Or will VirtIO interrupts be silently ignored if they happen after the kernel has exited? I vaguely remember some shutdown-gate code for this, but it would be good to have a comment explaining this assumption.

If we move it earlier, there may still be background I/O by the kernel, no? (I tried this, and I saw failures.)

src/lkl/setup.c

src/main-oe/sgxlkl_run_oe.c

davidchisnall · 2020-08-17T10:25:51Z

src/main-oe/sgxlkl_run_oe.c

+    pthread_mutex_unlock(&terminating_ethread_exited_mtx);
+
+    // Only try to destroy enclave if all ethreads successfully exited
+    if (oe_enclave && exited_ethread_count == econf->ethreads)


Why? Don't we want to kill the enclave as long as one thread has delivered to us a return value? The other threads are not doing anything we care about...

Right, but the problem is that the OE terminate API call segfaults in these cases: #788 (comment)

We probably need to file that as an OE bug and link to that issue here. If you can’t tear down an uncooperative enclave, that’s a problem. That said, what happens if we send a signal to all of the host pthreads that back our ethreads first? Does that not let us exit them all from the enclave and then tear it down?

src/sched/futex.c

src/sched/lthread.c

src/enclave/enclave_oe.c

tests/ltp/ltp-batch1/ltp_disabled_tests.txt

tests/ltp/run_ltp_test.sh

src/sched/lthread.c

src/enclave/mpmc_queue.c

src/enclave/enclave_init.c

davidchisnall · 2020-08-21T08:27:15Z

src/include/enclave/enclave_util.h

@@ -138,10 +138,22 @@ extern int sgxlkl_trace_redirect_syscall;
        }                                                     \
    } while (0)



Suggested change

/**

* Pointer to the thread context. This is needed because of limitations in the C address space syntax.

*/

typedef struct schedctx* threadctx_ptr;

/**

* Pointer to the per-ethread context.

*/

static const __seg_gs threadctx_ptr *ethread_schedctx = ((__seg_gs threadctx_ptr *)48);

Will this have the value of the first ethread which evaluates this or it indeed would act as ethread local variable?

Accesses to this translate to %gs:48.

davidchisnall · 2020-08-21T08:28:08Z

src/include/enclave/enclave_util.h

+        struct schedctx* _sgxlkl_verbose_self;                               \
+        __asm__ __volatile__("mov %%gs:48,%0" : "=r"(_sgxlkl_verbose_self)); \


Suggested change

struct schedctx* _sgxlkl_verbose_self; \

__asm__ __volatile__("mov %%gs:48,%0" : "=r"(_sgxlkl_verbose_self)); \

struct schedctx* _sgxlkl_verbose_self = *ethread_schedctx; \

All of the assembly for accessing this with hard-coded constants all over the place makes me nervous. Let's add a global for this and clean up code to use that. GCC and clang have good support for gs-relative addresses.

Ok, I couldn't get this to work, as I presume that this needs gnu99 support? For this, we would also need to change musl to be compiled with gnu99...?

src/include/enclave/lthread.h

This helps identify cases when a thread exit for tid=1 causes problems.

The test itself is broken. It tries to unmount a file system from a non-existent mount point.

By creating a separate thread for the execution of the application's main function, it can join in the regular way after termination.

davidchisnall · 2020-08-24T08:43:37Z

@vtikoo, since you'll probably end up merging this one, please will you do the following sequence:

Squash and merge the LKL and Musl PRs.
Add a commit to this branch updating the LKL and musl submodules to be the post-merge commits.
Squash and merge this, cleaning up the commit messages.

Thank you!

prp force-pushed the prp/robust_shutdown branch from dc88755 to c6de3e7 Compare August 14, 2020 13:30

prp requested review from davidchisnall, vtikoo, SeanTAllen and paulcallen August 14, 2020 14:47

SeanTAllen reviewed Aug 14, 2020

View reviewed changes

src/lkl/setup.c Outdated Show resolved Hide resolved

SeanTAllen reviewed Aug 14, 2020

View reviewed changes

src/main-oe/sgxlkl_run_oe.c Outdated Show resolved Hide resolved

prp force-pushed the prp/robust_shutdown branch from c6de3e7 to 76728df Compare August 14, 2020 15:29

davidchisnall reviewed Aug 14, 2020

View reviewed changes

src/sched/lthread.c Outdated Show resolved Hide resolved

src/enclave/enclave_oe.c Outdated Show resolved Hide resolved

prp force-pushed the prp/robust_shutdown branch 2 times, most recently from 46a205f to f8553fa Compare August 15, 2020 07:38

davidchisnall marked this pull request as draft August 17, 2020 10:13

davidchisnall reviewed Aug 17, 2020

View reviewed changes

SeanTAllen mentioned this pull request Aug 17, 2020

[Test] Fix and enable 10 tests disabled with PR 789 #794

Open

10 tasks

prp force-pushed the prp/robust_shutdown branch from 2d05ec0 to f6024e2 Compare August 18, 2020 20:34

prp marked this pull request as ready for review August 19, 2020 07:17

prp requested a review from davidchisnall August 19, 2020 07:18

This was referenced Aug 19, 2020

Make shutdown sequence more robust under cooperative scheduling lsds/lkl#8

Merged

Remove custom futex.h changes. lsds/sgx-lkl-musl#35

Merged

prp marked this pull request as draft August 20, 2020 08:05

prp force-pushed the prp/robust_shutdown branch 2 times, most recently from 147866e to 2778aba Compare August 20, 2020 18:16

prp marked this pull request as ready for review August 20, 2020 21:12

hukoyu mentioned this pull request Aug 20, 2020

[Test] Build only enabled LTP tests #812

Merged

hukoyu reviewed Aug 20, 2020

View reviewed changes

tests/ltp/ltp-batch1/ltp_disabled_tests.txt Outdated Show resolved Hide resolved

hukoyu reviewed Aug 20, 2020

View reviewed changes

tests/ltp/run_ltp_test.sh Outdated Show resolved Hide resolved

vtikoo requested a review from mikbras August 21, 2020 05:30

vtikoo reviewed Aug 21, 2020

View reviewed changes

src/sched/lthread.c Show resolved Hide resolved

davidchisnall reviewed Aug 21, 2020

View reviewed changes

Add new exit test that has a thread with a busy loop

7aa2f38

prp added 23 commits August 22, 2020 08:19

Add LKL halt call and ignore userspace threads

270f205

Add futex_init prototype to header

8f2676d

Test CI without lthread deallocation

76e227c

Don't make the shutdown thread detached

8c1d60b

Fix correct allocation of OE heap page count

efc785b

Make enclave single-threaded during shutdown

2c6f2a3

Add vio shutdown

11f34cc

Fix vio shutdown sequence

f989af1

Change shutdown sequence to run in LKL idle host task

ab4e349

Avoid LKL CPU/thread shutdown

4cc0cfb

Reactivate spin test

36147bd

Fix bug in status code reporting

b9e4f85

Address review comments

e8ada07

More review comments

b85c163

Bump LKL

9b4dad9

Add verbose output for thread_exit

f9700fa

This helps identify cases when a thread exit for tid=1 causes problems.

Deactivate LTP test fdatasync03

6f442a0

The test itself is broken. It tries to unmount a file system from a non-existent mount point.

Create new thread to execute app main

71eed01

By creating a separate thread for the execution of the application's main function, it can join in the regular way after termination.

Fix bug in initialisation sequence

50020ff

Do not print stack trace for every exception

45f9274

Revert stack traces for signal handling

e4dd516

Change mpmc pause implementation

1efba83

Address review comments

7c158a5

prp force-pushed the prp/robust_shutdown branch from 104d2ec to 7c158a5 Compare August 22, 2020 08:23

davidchisnall approved these changes Aug 24, 2020

View reviewed changes

vtikoo approved these changes Aug 24, 2020

View reviewed changes

Update lkl and sgx-lkl-musl submodules

827796e

vtikoo merged commit 2bfcc27 into oe_port Aug 25, 2020

vtikoo deleted the prp/robust_shutdown branch August 25, 2020 05:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make shutdown sequence more robust under cooperative scheduling #788

Make shutdown sequence more robust under cooperative scheduling #788

prp commented Aug 14, 2020 •

edited

Loading

davidchisnall Aug 17, 2020

prp Aug 19, 2020

davidchisnall Aug 17, 2020

prp Aug 19, 2020

davidchisnall Aug 19, 2020

davidchisnall Aug 21, 2020

vtikoo Aug 21, 2020

davidchisnall Aug 21, 2020

davidchisnall Aug 21, 2020

prp Aug 21, 2020

davidchisnall commented Aug 24, 2020

		@@ -138,10 +138,22 @@ extern int sgxlkl_trace_redirect_syscall;
		} \
		} while (0)

+/**
+ * Pointer to the thread context.  This is needed because of limitations in the C address space syntax.
+ */
+typedef struct schedctx* threadctx_ptr;
+/**
+ * Pointer to the per-ethread context.
+ */
+static const __seg_gs threadctx_ptr *ethread_schedctx = ((__seg_gs threadctx_ptr *)48);

		struct schedctx* _sgxlkl_verbose_self; \
		__asm__ __volatile__("mov %%gs:48,%0" : "=r"(_sgxlkl_verbose_self)); \

	struct schedctx* _sgxlkl_verbose_self; \
	__asm__ __volatile__("mov %%gs:48,%0" : "=r"(_sgxlkl_verbose_self)); \
	struct schedctx* _sgxlkl_verbose_self = *ethread_schedctx; \

Make shutdown sequence more robust under cooperative scheduling #788

Make shutdown sequence more robust under cooperative scheduling #788

Conversation

prp commented Aug 14, 2020 • edited Loading

Known limitations

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

davidchisnall commented Aug 24, 2020

prp commented Aug 14, 2020 •

edited

Loading