Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: thread exit improvement for 10.14 and later #76

Merged
merged 4 commits into from Apr 1, 2023

Conversation

LouisBrunner
Copy link
Owner

@paulfloyd Tested with your dummy program and no crashes anymore. Is that the kind of things you had in mind?

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@paulfloyd
Copy link
Contributor

paulfloyd commented Mar 9, 2023

@paulfloyd Tested with your dummy program and no crashes anymore. Is that the kind of things you had in mind?

I had two ideas, either similar to this (trying to get the right arguments for bsdthread_terminate, plus other platforms make the calls in inline asm).

Or alternatively see what happens if we just restart the thread by calling VG_(scheduler)(tid)

I'll try to explain.

At the point that we're trying to kill the thread we're in this line of code

	_pthread_exit(self, (self->fun)(self->arg));

(line 888 here https://github.com/apple-oss-distributions/libpthread/blob/libpthread-514/src/pthread.c)

The client thread function is pointed to by self->fun, at least conceptually. To get that to work, hijack_thread_state et al will have previously replaced the original self->fun with a pointer to pthread_hijack which when executed calls start_thread_NORETURN which will run the original thread function in the Valgrind scheduler.

If we can continue running the scheduler then it should execute _pthread_exit and clean up and kill the thread.

It may be necessary to store the result of the guest thread function in RAX.

I also suspect that the changes made to the stack in _call_on_new_stack_0_1 mean that that the return address (in the scheduler) will no longer take us back to the call to _pthread_exit

@LouisBrunner
Copy link
Owner Author

I had two ideas, either similar to this (trying to get the right arguments for bsdthread_terminate, plus other platforms make the calls in inline asm).

This would require reading the internal pthread struct right? (https://github.com/apple-oss-distributions/libpthread/blob/libpthread-514/src/pthread.c#L733) I couldn't find any place where Valgrind currently does that, it might also be a bit inconsistent between macOS (and thus libpthread) versions?

Is there any point in doing the call in assembly?

Or alternatively see what happens if we just restart the thread by calling VG_(scheduler)(tid)

Interesting, I will have a look.

@paulfloyd
Copy link
Contributor

I had two ideas, either similar to this (trying to get the right arguments for bsdthread_terminate, plus other platforms make the calls in inline asm).

This would require reading the internal pthread struct right? (https://github.com/apple-oss-distributions/libpthread/blob/libpthread-514/src/pthread.c#L733) I couldn't find any place where Valgrind currently does that, it might also be a bit inconsistent between macOS (and thus libpthread) versions?

I really don't know what the right thing to do is. Not setting these values could cause a per-thread leak. But then we're creating the thread stack in pthread_hijack so maybe they should be 0 after all.

The call in pthread,c is

__bsdthread_terminate((void *)freeaddr, freesize, kport, custom_stack_sema);

We can get the first 3 args from 'self' and 'kport' passed to pthread_hijack.
However I don't see any way to access custom_stack_sema.

Is there any point in doing the call in assembly?

I don't see why it was done, but there must have been some reason, possibly a long time ago and only on Linux copied by all the other ports.

Or alternatively see what happens if we just restart the thread by calling VG_(scheduler)(tid)

Interesting, I will have a look.

asm(
".globl _pthread_hijack_asm\n"
"_pthread_hijack_asm:\n"
"   movq %rsp,%rbp\n"
"   push $0\n"    // alignment pad
"   push %rbp\n"  // original sp
                  // other values stay where they are in registers
"   push $0\n"    // fake return address
"   jmp _pthread_hijack\n"
);

The above looks one-way. A call instead of push/jmp might be needed.

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@LouisBrunner
Copy link
Owner Author

I really don't know what the right thing to do is. Not setting these values could cause a per-thread leak. But then we're creating the thread stack in pthread_hijack so maybe they should be 0 after all.

The call in pthread,c is

__bsdthread_terminate((void *)freeaddr, freesize, kport, custom_stack_sema);

We can get the first 3 args from 'self' and 'kport' passed to pthread_hijack. However I don't see any way to access custom_stack_sema.

I dug into pthread a bit to understand the exact flow with the hijack (been a long time since I looked at it). Turns out, we already execute all the _pthread_exit code, including bsdthread_terminate. The syscall is caught by Valgrind and all clean up is already performed in PRE(bsdthread_terminate). Check the logs with --trace-syscalls=yes:

...
Hello from thread_function, and goodbye
SYSCALL[7427,2](unix:397) ... [async] --> Success(0x0:0x28)
SYSCALL[7427,2](unix:331) __disable_threadsignal(1, 40, 0) --> [pre-success] Success(0x0:0x0)
SYSCALL[7427,2](unix:361) bsdthread_terminate( 0x7000034c3000, 81000, mach_thread_self(), NULL ) --> [pre-success] Success(0x0:0x0)
...

So, all that we have left is to actually call bsdthread_terminate as you suggested but no arguments are required.

Is there any point in doing the call in assembly?

I don't see why it was done, but there must have been some reason, possibly a long time ago and only on Linux copied by all the other ports.

Ok, I looked at other ports, no harm to do it in assembly if it helps avoiding messing with the stack.

Or alternatively see what happens if we just restart the thread by calling VG_(scheduler)(tid)

Interesting, I will have a look.

asm(
".globl _pthread_hijack_asm\n"
"_pthread_hijack_asm:\n"
"   movq %rsp,%rbp\n"
"   push $0\n"    // alignment pad
"   push %rbp\n"  // original sp
                  // other values stay where they are in registers
"   push $0\n"    // fake return address
"   jmp _pthread_hijack\n"
);

The above looks one-way. A call instead of push/jmp might be needed.

As explained above, that code is already running.

In conclusion, this PR should be enough to solve the issue. Do you have any way to test on x86? It's been a long time since I last wrote assembly and I don't have any way to test outside amd64/arm64 macOS 13.

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as off-topic.

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as off-topic.

@github-actions

This comment was marked as off-topic.

@github-actions

This comment was marked as outdated.

@LouisBrunner LouisBrunner merged commit 1ab162e into main Apr 1, 2023
4 of 12 checks passed
@LouisBrunner LouisBrunner deleted the feature/fix-thread-exit branch April 1, 2023 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants