SIGSEGV: running basic/clone looping with 2 ethreads and OE debug allocator #740

Closed
paulcallen opened this issue Aug 6, 2020 · 9 comments
Labels
p0 Blocking priority

@paulcallen
Member

I built OE with the OE debug allocator enabled (-DUSE_DEBUG_MALLOC=ON).
I am running on a 4-core ACC VM with 2 ETHREADS.
I am running the test tests/basic/clone with the command sudo make DEBUG=1 run-hw-gdb-clone-loop
The crash is here:

Thread 6 "ENCLAVE" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffef172700 (LWP 13152)]
0x00007fe0005b80ba in _lthread_free (lt=0x7fe000b15ef0) at sched/lthread.c:361
361             void (*f)(void*) = lt->cancelbuf->__f;
(gdb) bt
#0  0x00007fe0005b80ba in _lthread_free (lt=0x7fe000b15ef0) at sched/lthread.c:361
#1  0x00007fe0005b866c in _lthread_resume (lt=0x7fe000b15ef0) at sched/lthread.c:530
#2  0x00007fe0005b7d1b in lthread_run () at sched/lthread.c:248
#3  0x00007fe0005a7d11 in __libc_init_enclave (argc=1, argv=0x7fe000b04840) at enclave/enclave_init.c:169
#4  0x00007fe0005a93e5 in __sgx_init_enclave () at enclave/enclave_oe.c:159
#5  0x00007fe0005a9c2d in sgxlkl_enclave_init (shared_memory=0x7fe000b03160) at enclave/enclave_oe.c:382
#6  0x00007fe0005d32e9 in ecall_sgxlkl_enclave_init (input_buffer=0x7fe000b03150 "", input_buffer_size=112, output_buffer=0x7fe000b031c0 "", output_buffer_size=16, output_bytes_written=0x7fe040f03c68) at en
#7  0x00007fe0004f7fdc in oe_handle_call_enclave_function (arg_in=140737204657448) at /home/azureuser/sgx-lkl/openenclave/enclave/core/sgx/calls.c:289
#8  0x00007fe0004fa06e in _handle_ecall (td=0x7fe040f0a000, func=2, arg_in=140737204657448, output_arg1=0x7fe040f03fc8, output_arg2=0x7fe040f03fc0) at /home/azureuser/sgx-lkl/openenclave/enclave/core/sgx/ca
#9  0x00007fe0004f99af in __oe_handle_main (arg1=281483566645248, arg2=140737204657448, cssa=0, tcs=0x7fe040f05000, output_arg1=0x7fe040f03fc8, output_arg2=0x7fe040f03fc0)
    at /home/azureuser/sgx-lkl/openenclave/enclave/core/sgx/calls.c:996
#10 0x00007fe000520704 in oe_enter () at /home/azureuser/sgx-lkl/openenclave/enclave/core/sgx/enter.S:197
#11 0x0000000040085278 in __morestack (tcs=0x7fe040f05000, aep=1074286688, arg1=281483566645248, arg2=140737204657448, arg3=0x7fffef171b88, arg4=0x7fffef171b80, enclave=0x402a23a0)
    at /home/azureuser/sgx-lkl/openenclave/host/sgx/enter.c:182
#12 0x000000004007508f in _do_eenter (enclave=0x402a23a0, tcs=0x7fe040f05000, aep=1074286688, code_in=OE_CODE_ECALL, func_in=2, arg_in=140737204657448, code_out=0x7fffef171b7c, func_out=0x7fffef171b7a, resu
    arg_out=0x7fffef171b70) at /home/azureuser/sgx-lkl/openenclave/host/sgx/calls.c:193
#13 oe_ecall (enclave=0x402a23a0, func=2, arg=140737204657448, arg_out_ptr=0x7fffef171d20) at /home/azureuser/sgx-lkl/openenclave/host/sgx/calls.c:627
#14 0x00000000400574bc in oe_call_enclave_function_by_table_id (enclave=0x402a23a0, table_id=18446744073709551615, function_id=0, input_buffer=0x7fffe8000b20, input_buffer_size=112, output_buffer=0x7fffe800
    output_buffer_size=16, output_bytes_written=0x7fffef171e60) at /home/azureuser/sgx-lkl/openenclave/host/calls.c:86
#15 0x00000000400579e5 in oe_call_enclave_function (enclave=0x402a23a0, function_id=0, input_buffer=0x7fffe8000b20, input_buffer_size=112, output_buffer=0x7fffe8000b90, output_buffer_size=16, output_bytes_w
    at /home/azureuser/sgx-lkl/openenclave/host/calls.c:124
#16 0x000000004001ccd3 in sgxlkl_enclave_init (enclave=0x402a23a0, _retval=0x7fffef171ed8, shared_memory=0x402a1608 <sgxlkl_host_state+1384>) at main-oe/sgxlkl_u.c:142
#17 0x000000004000f766 in enclave_init (args=0x7fffffffcf30) at main-oe/sgxlkl_run_oe.c:1320
#18 0x00007ffff72e66db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#19 0x00007ffff700fa3f in clone () from /lib/x86_64-linux-gnu/libc.so.6

All threads:

(gdb) thread apply all bt

Thread 7 (Thread 0x7fffee971700 (LWP 13153)):
#0  0x00007ffff6ffe297 in write () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff6f7922d in _IO_file_write () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff6f7afc1 in _IO_do_write () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007ffff6f79a5d in _IO_file_xsputn () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x00007ffff6f4b07b in vfprintf () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x00007ffff6f52ed4 in fprintf () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x0000000040057d16 in oe_write_ocall (device=0, str=0x7fffe0000bb0 "[[    LKL   ]] lkl_syscall(): switching to host task (no=93 task=host_clone4649 current=idle_host_task)\n",
    maxlen=18446744073709551615) at /home/azureuser/sgx-lkl/openenclave/host/ocalls.c:46
#7  0x000000004001efe7 in ocall_oe_write_ocall (input_buffer=0x7fffe0000b90 "", input_buffer_size=144, output_buffer=0x7fffe0000c20 "NG=0)\n", output_buffer_size=32, output_bytes_written=0x7fffee9705c0)
    at main-oe/sgxlkl_u.c:1522
#8  0x0000000040073ac1 in oe_handle_call_host_function (arg=140737196262800, enclave=0x402a23a0) at /home/azureuser/sgx-lkl/openenclave/host/sgx/calls.c:276
#9  0x000000004007417e in _handle_ocall (enclave=0x402a23a0, tcs=0x7fe04130d000, func=32768, arg_in=140737196262800, arg_out=0x7fffee970500) at /home/azureuser/sgx-lkl/openenclave/host/sgx/calls.c:370
#10 0x0000000040073cd5 in __oe_dispatch_ocall (arg1=985162418487296, arg2=140737196262800, arg1_out=0x7fffee970608, arg2_out=0x7fffee970600, tcs_=0x7fe04130d000, enclave=0x402a23a0)
    at /home/azureuser/sgx-lkl/openenclave/host/sgx/calls.c:451
#11 0x00000000400850dc in __oe_host_stack_bridge (arg1=1, arg2=140737085705200, arg1_out=0x68, arg2_out=0x7ffff6ffe297 <write+71>, tcs=0x0, enclave=0x68, ecall_context=0x7fffee970590)
    at /home/azureuser/sgx-lkl/openenclave/host/sgx/enter.c:86
#12 0x00007fe0004f8266 in __morestack (arg1=985162418487296, arg2=140737196262800) at /home/azureuser/sgx-lkl/openenclave/enclave/core/sgx/calls.c:627
#13 0x00007fe0004f8952 in _handle_exit (code=OE_CODE_OCALL, func=32768, arg=140737196262800) at /home/azureuser/sgx-lkl/openenclave/enclave/core/sgx/calls.c:332
#14 0x00007fe0004f86b4 in oe_ocall (func=32768, arg_in=140737196262800, arg_out=0x0) at /home/azureuser/sgx-lkl/openenclave/enclave/core/sgx/calls.c:687
#15 0x00007fe0004f92be in oe_call_host_function_by_table_id (table_id=18446744073709551615, function_id=10, input_buffer=0x7fffe0000b90, input_buffer_size=144, output_buffer=0x7fffe0000c20,
    output_buffer_size=32, output_bytes_written=0x7fe03dcfe9c0 <child_stack1+6464>, switchless=false) at /home/azureuser/sgx-lkl/openenclave/enclave/core/sgx/calls.c:796
#16 0x00007fe0004f962c in oe_call_host_function (function_id=10, input_buffer=0x7fffe0000b90, input_buffer_size=144, output_buffer=0x7fffe0000c20, output_buffer_size=32,
    output_bytes_written=0x7fe03dcfe9c0 <child_stack1+6464>) at /home/azureuser/sgx-lkl/openenclave/enclave/core/sgx/calls.c:827
#17 0x00007fe0005d5a91 in oe_write_ocall (device=0, str=0x7fe03dcfeb80 <child_stack1+6912> "[[    LKL   ]] lkl_syscall(): switching to host task (no=93 task=host_clone4649 current=idle_host_task)\n",
    maxlen=18446744073709551615) at enclave/sgxlkl_t.c:1750
#18 0x00007fe0004ef18c in oe_host_write (device=0, str=0x7fe03dcfeb80 <child_stack1+6912> "[[    LKL   ]] lkl_syscall(): switching to host task (no=93 task=host_clone4649 current=idle_host_task)\n",
    len=18446744073709551615) at /home/azureuser/sgx-lkl/openenclave/enclave/core/hostcalls.c:194
#19 0x00007fe0004ef090 in oe_host_vfprintf (device=0, fmt=0x7fe0006dd6c8 "%.*s", ap_=0x7fe03dcfee60 <child_stack1+7648>) at /home/azureuser/sgx-lkl/openenclave/enclave/core/hostcalls.c:163
#20 0x00007fe0004ef3ca in oe_host_printf (fmt=0x7fe0006dd6c8 "%.*s") at /home/azureuser/sgx-lkl/openenclave/enclave/core/hostcalls.c:174
#21 0x00007fe0005ca7f2 in print (str=0x7fe000b164c0 "[[    LKL   ]] lkl_syscall(): switching to host task (no=93 task=host_clone4649 current=idle_host_task)\n", len=104) at lkl/posix-host.c:101
#22 0x00007fe0000826b7 in lkl_vprintf (fmt=0x0, args=args@entry=0x7fe03dcfef10 <child_stack1+7824>) at lib/utils.c:181
#23 0x00007fe0000827b9 in lkl_printf (fmt=<optimized out>) at lib/utils.c:193
#24 0x00007fe0000892ce in lkl_syscall (no=93, params=0x7fe03dcff050 <child_stack1+8144>) at arch/lkl/kernel/syscalls.c:184
#25 0x0000000000000000 in ?? ()

Thread 6 (Thread 0x7fffef172700 (LWP 13152)):
#0  0x00007fe0005b80ba in _lthread_free (lt=0x7fe000b15ef0) at sched/lthread.c:361
#1  0x00007fe0005b866c in _lthread_resume (lt=0x7fe000b15ef0) at sched/lthread.c:530
#2  0x00007fe0005b7d1b in lthread_run () at sched/lthread.c:248
#3  0x00007fe0005a7d11 in __libc_init_enclave (argc=1, argv=0x7fe000b04840) at enclave/enclave_init.c:169
#4  0x00007fe0005a93e5 in __sgx_init_enclave () at enclave/enclave_oe.c:159
#5  0x00007fe0005a9c2d in sgxlkl_enclave_init (shared_memory=0x7fe000b03160) at enclave/enclave_oe.c:382
#6  0x00007fe0005d32e9 in ecall_sgxlkl_enclave_init (input_buffer=0x7fe000b03150 "", input_buffer_size=112, output_buffer=0x7fe000b031c0 "", output_buffer_size=16, output_bytes_written=0x7fe040f03c68)
    at enclave/sgxlkl_t.c:124
#7  0x00007fe0004f7fdc in oe_handle_call_enclave_function (arg_in=140737204657448) at /home/azureuser/sgx-lkl/openenclave/enclave/core/sgx/calls.c:289
#8  0x00007fe0004fa06e in _handle_ecall (td=0x7fe040f0a000, func=2, arg_in=140737204657448, output_arg1=0x7fe040f03fc8, output_arg2=0x7fe040f03fc0)
    at /home/azureuser/sgx-lkl/openenclave/enclave/core/sgx/calls.c:417
#9  0x00007fe0004f99af in __oe_handle_main (arg1=281483566645248, arg2=140737204657448, cssa=0, tcs=0x7fe040f05000, output_arg1=0x7fe040f03fc8, output_arg2=0x7fe040f03fc0)
    at /home/azureuser/sgx-lkl/openenclave/enclave/core/sgx/calls.c:996
#10 0x00007fe000520704 in oe_enter () at /home/azureuser/sgx-lkl/openenclave/enclave/core/sgx/enter.S:197
#11 0x0000000040085278 in __morestack (tcs=0x7fe040f05000, aep=1074286688, arg1=281483566645248, arg2=140737204657448, arg3=0x7fffef171b88, arg4=0x7fffef171b80, enclave=0x402a23a0)
    at /home/azureuser/sgx-lkl/openenclave/host/sgx/enter.c:182
#12 0x000000004007508f in _do_eenter (enclave=0x402a23a0, tcs=0x7fe040f05000, aep=1074286688, code_in=OE_CODE_ECALL, func_in=2, arg_in=140737204657448, code_out=0x7fffef171b7c, func_out=0x7fffef171b7a,
    result_out=0x7fffef171b78, arg_out=0x7fffef171b70) at /home/azureuser/sgx-lkl/openenclave/host/sgx/calls.c:193
#13 oe_ecall (enclave=0x402a23a0, func=2, arg=140737204657448, arg_out_ptr=0x7fffef171d20) at /home/azureuser/sgx-lkl/openenclave/host/sgx/calls.c:627
#14 0x00000000400574bc in oe_call_enclave_function_by_table_id (enclave=0x402a23a0, table_id=18446744073709551615, function_id=0, input_buffer=0x7fffe8000b20, input_buffer_size=112,
    output_buffer=0x7fffe8000b90, output_buffer_size=16, output_bytes_written=0x7fffef171e60) at /home/azureuser/sgx-lkl/openenclave/host/calls.c:86
#15 0x00000000400579e5 in oe_call_enclave_function (enclave=0x402a23a0, function_id=0, input_buffer=0x7fffe8000b20, input_buffer_size=112, output_buffer=0x7fffe8000b90, output_buffer_size=16,
    output_bytes_written=0x7fffef171e60) at /home/azureuser/sgx-lkl/openenclave/host/calls.c:124
#16 0x000000004001ccd3 in sgxlkl_enclave_init (enclave=0x402a23a0, _retval=0x7fffef171ed8, shared_memory=0x402a1608 <sgxlkl_host_state+1384>) at main-oe/sgxlkl_u.c:142
#17 0x000000004000f766 in enclave_init (args=0x7fffffffcf30) at main-oe/sgxlkl_run_oe.c:1320
#18 0x00007ffff72e66db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#19 0x00007ffff700fa3f in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 5 (Thread 0x7fffef973700 (LWP 13151)):
#0  0x00007ffff701f173 in clock_nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x0000000040016ca7 in timerdev_task (timer_dev_mem=0x402d68e0) at host_interface/timer_dev.c:70
#2  0x00007ffff72e66db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#3  0x00007ffff700fa3f in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 4 (Thread 0x7ffff0174700 (LWP 13150)):
#0  0x00007ffff72ecf85 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x0000000040016522 in _vio_host_wait_for_enclave_event (cfg=0x402d9e60, evt_chn=0x402d91e8, val=129, timeout_ms=10) at host_interface/host_event_channel.c:56
#2  0x00000000400166e4 in vio_host_process_enclave_event (dev_id=1 '\001', timeout_ms=10) at host_interface/host_event_channel.c:115
#3  0x00000000400163f4 in console_task (arg=0x0) at host_interface/virtio_console.c:321
#4  0x00007ffff72e66db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007ffff700fa3f in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 3 (Thread 0x7ffff19ec700 (LWP 13149)):
#0  0x00007ffff7002cf9 in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x0000000040015d2a in poll_console_for_input (poll_fd=0) at host_interface/virtio_console.c:69
#2  0x0000000040015dd6 in monitor_console_input (cons_dev=0x7ffff7fed000) at host_interface/virtio_console.c:97
#3  0x00007ffff72e66db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#4  0x00007ffff700fa3f in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 2 (Thread 0x7ffff21ed700 (LWP 13148)):
#0  0x00007ffff72eced9 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x0000000040016522 in _vio_host_wait_for_enclave_event (cfg=0x402d9df0, evt_chn=0x402d91d0, val=33, timeout_ms=10) at host_interface/host_event_channel.c:56
#2  0x00000000400166e4 in vio_host_process_enclave_event (dev_id=0 '\000', timeout_ms=10) at host_interface/host_event_channel.c:115
#3  0x0000000040018a41 in blkdevice_thread (arg=0x402d9df0) at host_interface/virtio_blkdev.c:186
#4  0x00007ffff72e66db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007ffff700fa3f in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 1 (Thread 0x7ffff7fe9300 (LWP 13112)):
#0  0x00007ffff72e7d2d in __pthread_timedjoin_ex () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x0000000040011434 in main (argc=1, argv=0x7fffffffe260, envp=0x7fffffffe270) at main-oe/sgxlkl_run_oe.c:2009

I cannot include the full log because it is too big.

@github-actions github-actions bot added the needs-triage Bug does not yet have a priority assigned label Aug 6, 2020
@paulcallen paulcallen added this to Needs triage in Issue triage via automation Aug 6, 2020
@paulcallen paulcallen moved this from Needs triage to Proposed p0 in Issue triage Aug 6, 2020
@paulcallen paulcallen added p0 Blocking priority and removed needs-triage Bug does not yet have a priority assigned labels Aug 6, 2020
@paulcallen paulcallen removed this from Proposed p0 in Issue triage Aug 6, 2020
@davidchisnall
Contributor

The cancelbuf field is gone from the lthread structure in the cleaned-up layering. This may be a use-after-free (UAF) of something allocated with the OE malloc allocator (so something aliases the lthread), or it may be a bug in code that is about to be deleted.
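
To illustrate the UAF hypothesis, here is a minimal sketch in C. Every name in it (toy_lthread, toy_cancelbuf, toy_poison, and so on) is hypothetical and is not the real sgx-lkl or OE code, and the 0xDD fill byte is an assumption about how a poisoning debug allocator might mark freed memory. It shows why a plain NULL check on cancelbuf would not protect the loop around sched/lthread.c:361 if the lthread had already been freed: the poisoned field is non-NULL but does not point at a valid cleanup record.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins; NOT the real sgx-lkl structure definitions. */
struct toy_cancelbuf {
    void (*f)(void*);           /* cleanup handler */
    void* x;                    /* handler argument */
    struct toy_cancelbuf* next; /* next handler in the chain */
};

struct toy_lthread {
    int id;
    struct toy_cancelbuf* cancelbuf;
};

/* Assumed fill byte for freed memory; real debug allocators may differ. */
#define TOY_FREED_FILL 0xDD

/* Simulate what a poisoning allocator does on free. The memory is not
 * actually released, so the sketch stays well-defined. */
static void toy_poison(struct toy_lthread* lt) {
    assert(lt != NULL);
    memset(lt, TOY_FREED_FILL, sizeof(*lt));
}

/* Returns 1 if every byte of the cancelbuf field equals the fill byte. */
static int toy_looks_poisoned(const struct toy_lthread* lt) {
    unsigned char bytes[sizeof(lt->cancelbuf)];
    memcpy(bytes, &lt->cancelbuf, sizeof(bytes));
    for (size_t i = 0; i < sizeof(bytes); i++) {
        if (bytes[i] != TOY_FREED_FILL) {
            return 0;
        }
    }
    return 1;
}

/* Example cleanup handler: increments the int that x points to. */
static void toy_count(void* x) {
    ++*(int*)x;
}

/* Runs pending cleanup handlers. Returns how many ran, or -1 if the
 * thread structure looks like freed (poisoned) memory. */
static int toy_run_cleanup(struct toy_lthread* lt) {
    int ran = 0;
    for (;;) {
        if (toy_looks_poisoned(lt)) {
            return -1; /* dereferencing cancelbuf here would fault */
        }
        if (lt->cancelbuf == NULL) {
            return ran;
        }
        struct toy_cancelbuf* cb = lt->cancelbuf;
        lt->cancelbuf = cb->next;
        cb->f(cb->x);
        ran++;
    }
}
```

With a live toy_lthread whose chain holds one toy_count handler, toy_run_cleanup runs that handler and returns 1. After toy_poison, it returns -1 instead of following the stale pointer. A check like this only helps diagnose the problem; it does not fix the underlying lifetime bug.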

@paulcallen
Member Author

I am currently hitting this on every run of the clone-loop test.

@davidchisnall
Contributor

Are you running with lthread UAF checks enabled?

@paulcallen
Member Author

paulcallen commented Aug 11, 2020

I am. I have UAF checks, the OE debug heap, DEBUG and LKL_DEBUG all enabled. I am running on a 4-core machine with 2 ETHREADS, in the debugger.

@SeanTAllen
Contributor

@paulcallen are you actively working on this? Should I assign you to it?

@paulcallen
Member Author

@SeanTAllen I am not sure I understand this code deeply enough yet to be able to fix this.

@SeanTAllen
Contributor

@paulcallen can you post your complete build command for sgx-lkl so I can try to reproduce this on my machine?

@prp
Member

prp commented Sep 4, 2020

@paulcallen is this issue now outdated, and can it be closed? The code in question above has been removed as part of the relayering work.

@prp
Member

prp commented Sep 16, 2020

Since we haven't had further reports of this (and the clone test is part of the CI and passes), I am closing this issue.

@prp prp closed this as completed Sep 16, 2020