Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement async termination of blocking thread #2516

Merged
merged 40 commits into from
Sep 20, 2023

Conversation

yamt
Copy link
Collaborator

@yamt yamt commented Aug 30, 2023

send a signal whose handler is no-op to a blocking thread to wake up the blocking syscall
with either EINTR equivalent or partial success.

unlike the approach taken in the dev/interrupt_block_insn branch (that is, signal + longjmp similarly to OS_ENABLE_HW_BOUND_CHECK),
this PR does not use longjmp because:

  • longjmp from signal handler doesn't work on nuttx (signal handler is called within "kernel" apache/nuttx#10326)
  • i'm afraid that the singal+longjmp approach is too difficult for average programmers who might implement host functions to deal with. especially when the longjmp can rewind a lot of functions, some of which are potentially user-supplied.

todo/known issues/limitations/open questions:

  • make the initialization independent from OS_ENABLE_HW_BOUND_CHECK
  • provide an api (or at least a build-time config) to specifiy a signal to use, instead of hardcoding SIGUSR1.
  • adapt operations other than fd_read
    • fd_close
    • fd_pread
    • fd_pwrite
    • fd_read
    • fd_write
    • path_open
      • i guess it can block on eg. fifo. i'm not sure if i want to fix it in this PR.
    • poll_oneoff
    • sock_accept
    • sock_close (?) just ENOSYS
    • sock_connect
    • sock_recv just a wrapper of sock_recv_from
    • sock_recv_from
    • sock_send just a wrapper of sock_send_to
    • sock_send_to
    • sock_shutdown usually not block
    • sock_addr_resolve
      • this one is not a simple syscall. see the comment in blocking_op.c
    • memory.atomic.wait32/64
  • what to do for platforms other than posix?
  • what to do for other wasi implementation? (uvwasi)
  • document/comment
  • test
    • the relevant tests in wasi-threads repo passed on macOS and nuttx/esp32-devkitc
  • add an api to terminate an instance Add an API to terminate instance #2538
  • add a sample Add an API to terminate instance #2538
  • for simple blocking system call like readv, it's easy to wrap it with wasm_runtime_begin_blocking_op/wasm_runtime_end_blocking_op. but how about something more complex, like a call which possibly blocks deep inside an external library?

relevant issue: #1910

@yamt yamt changed the title Implement termination of blocking thread Implement async termination of blocking thread Aug 30, 2023
@yamt yamt force-pushed the blocking-op branch 3 times, most recently from 38f769a to 455061a Compare August 31, 2023 22:59
@yamt
Copy link
Collaborator Author

yamt commented Aug 31, 2023

the last push passes all test cases in wasi-threads repo on macOS.

@yamt yamt force-pushed the blocking-op branch 2 times, most recently from 70a66cb to 8b14c2f Compare September 4, 2023 05:30
@yamt yamt force-pushed the blocking-op branch 4 times, most recently from 348a15d to 1b28087 Compare September 7, 2023 08:41
@yamt yamt force-pushed the blocking-op branch 2 times, most recently from c3a7e6f to 4877ae8 Compare September 13, 2023 11:45
@yamt yamt marked this pull request as ready for review September 13, 2023 11:46
@yamt
Copy link
Collaborator Author

yamt commented Sep 13, 2023

the ci failure looks unrelated

  Cannot initiate the connection to ppa.launchpadcontent.net:443 (2620:2d:4000:1::3e). - connect (101: Network is unreachable) Could not connect to ppa.launchpadcontent.net:443 (185.125.190.52), connection timed out [IP: 185.125.190.52 443]

core/iwasm/include/wasm_export.h Outdated Show resolved Hide resolved
core/iwasm/include/wasm_export.h Outdated Show resolved Hide resolved
core/iwasm/include/wasm_export.h Outdated Show resolved Hide resolved
@wenyongh
Copy link
Contributor

@eloparco Could you help review the PR? It implements feature like dev/interrupt_block_insn in different manner.

I was assuming pthread_sigmask was an alias of sigprocmask.
It's true for platforms I'm familiar with.

However, after reading the descriptions in the relevant standards,
I changed my mind.
@eloparco
Copy link
Contributor

@eloparco Could you help review the PR? It implements feature like dev/interrupt_block_insn in different manner.

Sure, I'll take a look in the next couple of days.

Copy link
Contributor

@wenyongh wenyongh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@eloparco
Copy link
Contributor

Took a look. This doesn't cover user-defined functions that could have been achieved with longjmp I think. But, as you said, the longjmp approach is probably too complicated.

We should be able to add these tests now, probably in a separate PR.

@yamt
Copy link
Collaborator Author

yamt commented Sep 20, 2023

Took a look. This doesn't cover user-defined functions that could have been achieved with longjmp I think. But, as you said, the longjmp approach is probably too complicated.

user-defined functions should take care of it by itself, either using the api introduced by this PR or other approaches like #1951. i guess in many cases it's simpler than preparing for signal+longjmp.

We should be able to add these tests now, probably in a separate PR.

yes. they all pass with this PR: #2516 (comment)

@wenyongh
Copy link
Contributor

Took a look. This doesn't cover user-defined functions that could have been achieved with longjmp I think. But, as you said, the longjmp approach is probably too complicated.

We should be able to add these tests now, probably in a separate PR.

@eloparco so the PR is good to you, right? Shall we merge the PR?

@eloparco
Copy link
Contributor

@eloparco so the PR is good to you, right? Shall we merge the PR?

LGTM

@wenyongh wenyongh merged commit 444b159 into bytecodealliance:main Sep 20, 2023
368 checks passed
yamt added a commit to yamt/wasm-micro-runtime that referenced this pull request Sep 20, 2023
On posix-like platforms, the rest of wasi-threads tests
should pass after the recent changes including the following PRs:

bytecodealliance#2516
bytecodealliance#2524
bytecodealliance#2529
wenyongh added a commit to wenyongh/wasm-micro-runtime that referenced this pull request Sep 21, 2023
Implement async termination of blocking thread (bytecodealliance#2516)
yamt added a commit to yamt/wasm-micro-runtime that referenced this pull request Sep 21, 2023
On posix-like platforms, the rest of wasi-threads tests
should pass after the recent changes including the following PRs:

bytecodealliance#2516
bytecodealliance#2524
bytecodealliance#2529
yamt added a commit to yamt/wasm-micro-runtime that referenced this pull request Sep 22, 2023
On posix-like platforms, the rest of wasi-threads tests
should pass after the recent changes including the following PRs:

bytecodealliance#2516
bytecodealliance#2524
bytecodealliance#2529
yamt added a commit to yamt/wasm-micro-runtime that referenced this pull request Sep 23, 2023
On posix-like platforms, the rest of wasi-threads tests
should pass after the recent changes including the following PRs:

bytecodealliance#2516
bytecodealliance#2524
bytecodealliance#2529
wenyongh pushed a commit that referenced this pull request Sep 26, 2023
On posix-like platforms, the rest of wasi-threads tests
should pass after the recent changes including the following PRs:
#2516, #2524, #2529, #2571, #2576 and #2582.
victoryang00 pushed a commit to victoryang00/wamr-aot-gc-checkpoint-restore that referenced this pull request May 27, 2024
Send a signal whose handler is no-op to a blocking thread to wake up
the blocking syscall with either EINTR equivalent or partial success.

Unlike the approach taken in the `dev/interrupt_block_insn` branch (that is,
signal + longjmp similarly to `OS_ENABLE_HW_BOUND_CHECK`), this PR
does not use longjmp because:
* longjmp from signal handler doesn't work on nuttx
  refer to apache/nuttx#10326
* the singal+longjmp approach may be too difficult for average programmers
  who might implement host functions to deal with

See also bytecodealliance#1910
victoryang00 pushed a commit to victoryang00/wamr-aot-gc-checkpoint-restore that referenced this pull request May 27, 2024
On posix-like platforms, the rest of wasi-threads tests
should pass after the recent changes including the following PRs:
bytecodealliance#2516, bytecodealliance#2524, bytecodealliance#2529, bytecodealliance#2571, bytecodealliance#2576 and bytecodealliance#2582.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants