
Maintain libc.threads_minus_1 correctly. #1273

Merged: 1 commit merged into deislabs:main on Mar 25, 2022

Conversation

@anakrish (Collaborator, Author)

The thread created for the child of a fork is not created via pthread_create.
Such a thread also does not call pthread_exit.

Typically, libc.threads_minus_1 is maintained by pthread_create and pthread_exit.

Since these functions are not invoked for a forked child, maintain the counter
manually: increment it just before cloning to create the thread, and decrement
it just before the thread invokes SYS_exit_group.

Signed-off-by: Anand Krishnamoorthi <anakrish@microsoft.com>
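
A minimal sketch of that manual maintenance, assuming musl's internal libc.h is visible from this code; the function names are illustrative rather than the PR's actual symbols, and the review below revisits whether bare atomics are appropriate here:

#include "libc.h" /* musl-internal header declaring `struct __libc libc` */

/* Called just before cloning to create the forked child's main thread,
   which never passes through pthread_create. */
static void child_thread_count_incr(void)
{
    __atomic_add_fetch(&libc.threads_minus_1, 1, __ATOMIC_SEQ_CST);
}

/* Called just before that thread invokes SYS_exit_group, since it never
   calls pthread_exit. */
static void child_thread_count_decr(void)
{
    __atomic_sub_fetch(&libc.threads_minus_1, 1, __ATOMIC_SEQ_CST);
}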

@vtikoo (Collaborator) left a comment

This is a great find! We should probably also decrement the counter when there is an execve, since the caller will have a new CRT by the end of the syscall.

crt/exit.c (outdated)

// Decrement MUSL's number of threads. Typically this is done by
// MUSL's pthread_exit, which is not called for the child process's
// main thread.
__atomic_sub_fetch(&libc.threads_minus_1, 1, __ATOMIC_SEQ_CST);
Review comment from a Member:

crt protects this with locks, so a bare atomic here is not thread safe. You will need to patch musl to make the variable atomic.

@anakrish (Collaborator, Author)

Good point. We need to use the same lock that MUSL uses or patch musl to make this atomic.

@anakrish (Collaborator, Author)

Fixed.

@anakrish (Collaborator, Author)

Used locks to be consistent with MUSL. I considered using atomics, but felt it was simpler to just follow what MUSL is doing.
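
For reference, roughly what the lock-consistent version looks like. This is a sketch only: __tl_lock/__tl_unlock guard musl's thread list (and this counter) in recent musl versions, and whether those internal symbols are reachable from this code is an assumption:

#include "libc.h"

/* musl-internal thread-list lock; availability here is an assumption */
extern void __tl_lock(void);
extern void __tl_unlock(void);

static void child_thread_count_decr(void)
{
    __tl_lock();
    libc.threads_minus_1--; /* plain access is safe while the lock is held */
    __tl_unlock();
}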

@anakrish (Collaborator, Author)

> This is a great find! We should probably also decrement the counter when there is an execve, since the caller will have a new CRT by the end of the syscall.

Great point. We need to keep the counter in sync when the thread becomes owned by the child process. If the child process ever hands the ownership back to the parent process, then at that point the count ought to be incremented.

But I'm not sure where exactly that happens in code...

@vtikoo (Collaborator) commented Mar 17, 2022

> We need to keep the counter in sync when the thread becomes owned by the child process. If the child process ever hands the ownership back to the parent process, then at that point the count ought to be incremented.
>
> But I'm not sure where exactly that happens in code...

src/process/execve.c under musl seems like a good place. We could either modify musl or rewrite execve under crt.
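
For the musl-side option, the change could be roughly this small; musl's src/process/execve.c is a thin syscall wrapper, and the locking helpers assumed below are the same musl internals discussed above:

#include <unistd.h>
#include "syscall.h" /* musl-internal header providing the syscall() macro */
#include "libc.h"

extern void __tl_lock(void);
extern void __tl_unlock(void);

int execve(const char *path, char *const argv[], char *const envp[])
{
    /* On success this thread belongs to the new process image, so drop
       it from the current CRT's count before the syscall. */
    __tl_lock();
    libc.threads_minus_1--;
    __tl_unlock();

    int r = syscall(SYS_execve, path, argv, envp);

    /* execve only returns on failure; restore the count so the
       still-running image stays accurate. */
    __tl_lock();
    libc.threads_minus_1++;
    __tl_unlock();
    return r;
}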

@jxyang (Contributor) commented Mar 17, 2022

Started a nightly pipeline for this PR, since it has broad impact: https://oe-jenkins-dev.westeurope.cloudapp.azure.com/job/Mystikos/job/Nightly-Pipeline-Manual/34/

Please wait for the outcome.

@anakrish (Collaborator, Author)

Unfortunately, the samples/go failure doesn't reproduce easily in GDB, even after running for hours. I'm trying to see what other approach we can use to understand the SEGV.

When run in a loop outside GDB, it eventually hits a SEGV:

fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1c pc=0x13ee1a36d]

goroutine 3 [running]:
runtime.throw({0x13f5efc69?, 0x1e?})
/usr/local/go/src/runtime/panic.go:992 +0x71 fp=0x114337688 sp=0x114337658 pc=0x13ee5cfb1
runtime.sigpanic()
/usr/local/go/src/runtime/signal_unix.go:802 +0x3c9 fp=0x1143376d8 sp=0x114337688 pc=0x13ee796e9
runtime.chansend(0x0, 0x100a4b0b8, 0x1, 0x100a4b0b8?)
/usr/local/go/src/runtime/chan.go:203 +0xcd fp=0x114337760 sp=0x1143376d8 pc=0x13ee1a36d
runtime: unexpected return pc for runtime.chansend1 called from 0x1007bc196
stack: frame={sp:0x114337760, fp:0x114337790} stack=[0x114337000,0x114337800)
0x0000000114337660: 0x000000013ee5cfe0 <runtime.throw.func1+0x0000000000000000> 0x000000013f5efc69
0x0000000114337670: 0x000000000000002a 0x00000001143376c8
0x0000000114337680: 0x000000013ee796e9 <runtime.sigpanic+0x00000000000003c9> 0x000000013f5efc69
0x0000000114337690: 0x000000000000001e 0x0000000000000000
0x00000001143376a0: 0x0000000000000000 0x0000000200000000
0x00000001143376b0: 0x0000000000000004 0x0000000000000000
0x00000001143376c0: 0x0000000000000003 0x0000000114337750
0x00000001143376d0: 0x000000013ee1a36d <runtime.chansend+0x00000000000000cd> 0x000000011435e058
0x00000001143376e0: 0x0000000000000000 0x0000000000000000
0x00000001143376f0: 0x0000000000000000 0x0000000000000000
0x0000000114337700: 0x0000000000000000 0x0000000000000000
0x0000000114337710: 0x0000000000000000 0x000000011435e058
0x0000000114337720: 0x0000000000000000 0x0000000000000000
0x0000000114337730: 0x0000000000000000 0x0000000000000000
0x0000000114337740: 0x0000000000000000 0x0000000000000000
0x0000000114337750: 0x0000000114337780 0x000000013ee1a27d <runtime.chansend1+0x000000000000001d>
0x0000000114337760: <0x0000000000000000 0x0000000100a4b0b8
0x0000000114337770: 0x0000000000000001 0x0000000100a4b0b8
0x0000000114337780: 0x00000001143377b0 !0x00000001007bc196
0x0000000114337790: >0x000000000000f000 0x0000000100a5ebb8
0x00000001143377a0: 0x0000000000000000 0x0000000100a5ebb8
0x00000001143377b0: 0x00000001143377d0 0x00000001007e516e
0x00000001143377c0: 0x0000000118804000 0x0000000118804000
0x00000001143377d0: 0x0000000114337850 0x00000001007d7d0a
0x00000001143377e0: 0x00000001143378b0 0x00000000000000ca
0x00000001143377f0: 0x0000000000000000 0x0000000118804000
runtime.chansend1(0xf000?, 0x100a5ebb8?)
/usr/local/go/src/runtime/chan.go:144 +0x1d fp=0x114337790 sp=0x114337760 pc=0x13ee1a27d
created by runtime.gcenable
/usr/local/go/src/runtime/mgc.go:177 +0x6f

goroutine 1 [chan receive, locked to thread]:
runtime.gopark(0x114362000?, 0x114336720?, 0xe5?, 0x80?, 0x0?)
/usr/local/go/src/runtime/proc.go:361 +0xdc
runtime.chanrecv(0x11435e000, 0x0, 0x1)
/usr/local/go/src/runtime/chan.go:577 +0x590
runtime.chanrecv1(0x0?, 0x2?)
/usr/local/go/src/runtime/chan.go:440 +0x18
runtime.gcenable()
/usr/local/go/src/runtime/mgc.go:180 +0xc8
runtime.main()
/usr/local/go/src/runtime/proc.go:209 +0x165
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1571 +0x1

@jxyang (Contributor) commented Mar 22, 2022

@anakrish The go sample failures are root-caused to the invalid RDTSCP instruction in the Coffee Lake SGX environment. Note that the above failures happen on Coffee Lake only. Do you think we can merge the PR now?

@vtikoo (Collaborator) left a comment

Can you please address the execve concern? Thanks.

@anakrish (Collaborator, Author)

> @anakrish The go sample failures are root-caused to the invalid RDTSCP instruction in the Coffee Lake SGX environment. Note that the above failures happen on Coffee Lake only. Do you think we can merge the PR now?

@jxyang There is a SEGV failure in the go samples on Ice Lake machines, unrelated to the RDTSCP failure on Coffee Lake machines. However, the SEGV failures can be observed on Mystikos' main branch itself, so this PR no longer has to be blocked on the go sample failures.

@anakrish (Collaborator, Author)

> Can you please address the execve concern? Thanks.

Sure. Let me see if that can be addressed easily.

Updated commit message:

The thread created for the child of a fork is not created via pthread_create.
Such a thread also does not call pthread_exit.

Typically, libc.threads_minus_1 is maintained by pthread_create and pthread_exit.

Since these functions are not invoked for a forked child, maintain the counter
manually: increment it just before cloning to create the thread, and decrement
it just before the thread invokes SYS_exit_group.

Additionally, just before the execve syscall is dispatched, the thread count of
the current CRT is decremented, since execve creates a new process and the
current thread becomes owned by that process.

Signed-off-by: Anand Krishnamoorthi <anakrish@microsoft.com>

@anakrish (Collaborator, Author)

@vtikoo I have handled SYS_execve and SYS_execveat in myst_syscall. This is consistent with how fork is handled: we get control whether the execve function is called or whether SYS_execve is invoked via the libc syscall function. A sketch of the dispatcher-side handling follows below.

Note that the current fork implementation has the drawback that if fork is invoked via the syscall instruction, myst_fork won't be called. Similarly, the custom thread-count management won't be invoked if the syscall instruction is used for execve. We'd have to address these in the future if we encounter a library that uses the syscall instruction for fork/execve.
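
A paraphrased sketch of that dispatcher-side handling; myst_syscall's real signature and the helper name below are assumptions, not Mystikos' actual code:

#include <sys/syscall.h>

/* Hypothetical helper: takes musl's lock and decrements
   libc.threads_minus_1, as in the earlier sketches. */
void myst_decrement_crt_thread_count(void);

long myst_syscall(long n, long params[6])
{
    switch (n)
    {
        case SYS_execve:
        case SYS_execveat:
            /* The calling thread will be owned by the new process image,
               so remove it from the current CRT's count before dispatch. */
            myst_decrement_crt_thread_count();
            break;
    }

    /* ... normal dispatch of syscall n would continue here;
       the return value below is a placeholder ... */
    (void)params;
    return 0;
}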

@anakrish merged commit 384a522 into deislabs:main on Mar 25, 2022.
@anakrish deleted the anakrish-libc-num-threads branch on Mar 25, 2022 at 15:51.