
Fix SMP races with thread swap and abort #21903

Merged · 7 commits · Jan 21, 2020

Conversation

@andyross (Contributor) commented Jan 14, 2020

Fixes for two races under SMP:

  1. A thread aborting another thread that just self-aborted will return from k_thread_abort() before the target thread is actually finished running on its stack, leading to corruption or a crash if the same thread object is then re-used immediately (see the sketch after this list).

  2. A runnable thread that enters z_swap() can be seen by other CPUs as part of the run queue before its own CPU has finished saving its state inside of arch_switch().
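For illustration, a minimal sketch of how race (1) bites in practice, following the pattern the test suite uses (all names here are illustrative, not taken from the PR):

```c
#include <zephyr.h>

#define STACK_SIZE 1024
K_THREAD_STACK_DEFINE(worker_stack, STACK_SIZE);
static struct k_thread worker;

static void worker_fn(void *p1, void *p2, void *p3)
{
	/* ... does some work, then returns, i.e. self-aborts ... */
}

void reuse_pattern(void)
{
	k_thread_create(&worker, worker_stack,
			K_THREAD_STACK_SIZEOF(worker_stack), worker_fn,
			NULL, NULL, NULL, 0, 0, K_NO_WAIT);

	/* If worker_fn is returning right now, this call can come back
	 * while the old context is still executing on worker_stack... */
	k_thread_abort(&worker);

	/* ...so reusing the same object and stack here clobbers a stack
	 * that is still live on another CPU. */
	k_thread_create(&worker, worker_stack,
			K_THREAD_STACK_SIZEOF(worker_stack), worker_fn,
			NULL, NULL, NULL, 0, 0, K_NO_WAIT);
}
```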

@andyross (Contributor, Author):

Think I have this now. It was a little subtle. Actually one of the hardest parts was writing a clear explanation.

@zephyrbot added the "area: API" and "area: Kernel" labels on Jan 14, 2020
@andrewboie (Contributor) left a comment:

I can still reproduce #21317; patching ztest to add the sleep call makes this easier to repro.

```c
 * complete. Otherwise the thread still runs a bit
 * until it can swap, requiring a delay.
 */
if (IS_ENABLED(CONFIG_SMP) && arch_is_in_isr()) {
```
@andrewboie (Contributor) commented on this hunk:

I'm not sure I understand this. z_thread_single_abort() can be called with a thread object running on another CPU, but we're checking whether the current CPU is in an ISR. Do we need to check the state of the CPU that the thread is running on?

```c
 * delay in that circumstance.
 */
if ((thread->base.thread_state & _THREAD_ABORTED_IN_ISR) == 0U) {
	k_busy_wait(THREAD_ABORT_DELAY_US);
```
@andrewboie (Contributor) commented on this hunk:

I feel uneasy about this... is there some state in the other thread we can poll in a loop, instead of just an arbitrary delay?

Also, I'm assuming that this logic happens with interrupts unlocked?

@andyross (Contributor, Author) replied:

There is not, absent changing the way swap works internally. The thread stack is in use right up to the swap call, but the swapped-to context may have been interrupted; there is nowhere at the OS level to place code to run "after swap", it just swaps back to whatever it was doing. I tried to find a way to make this work such that the scheduler lock is held and synchronously released at swap time, but even then "synchronously" isn't synchronous enough, because the CPU is still running C code at the point where that spinlock gets released.

An architectural fix would have every call to z_swap set a value in the swapped-from thread struct after the last memory access to its data had occurred, then we could spin waiting for that.

And yes, this delay happens outside the lock.
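To make that concrete, a heavily simplified sketch of the swap path being described (the helper name here is illustrative, not Zephyr's):

```c
/* Illustrative only: the outgoing thread stays live on its own stack
 * until deep inside arch_switch(), and no code runs for it afterward. */
static void swap_sketch(struct k_thread *old, struct k_thread *new)
{
	release_sched_lock();	/* bookkeeping done, spinlock dropped... */
	/* ...but this C code is STILL RUNNING on old's stack. */
	arch_switch(new->switch_handle, &old->switch_handle);
	/* Nothing here executes on behalf of 'old'; control resumes
	 * wherever 'new' last swapped out. There is no "after swap"
	 * hook where a completion flag could be set. */
}
```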

@andrewboie (Contributor) replied:

Thinking out loud here: if the race condition is two different contexts both trying to complete k_thread_abort() on the same thread object, could we introduce a per-thread object spinlock to synchronize this?

@andyross (Contributor, Author):

> Thinking out loud here: if the race condition is two different contexts both trying to complete k_thread_abort() on the same thread object, could we introduce a per-thread object spinlock to synchronize this?

The synchronization is already there. The problem is that AFTER you've "completed" all the bookkeeping for thread abort and released the lock, you're still running until you get to the swap call. So another CPU can't be allowed to think that its own call to abort is complete until you're definitely not running any more. (Which is why this is SMP-only: on UP, even if the newly-dead-but-still-running thread is interrupted before it swaps, it will never be resumed, so no race can happen. With SMP there is another processor that can run it.)

Basically, what happens now is that the other CPU thinks "Oh, OK, you're finished aborting" and then launches another thread in your stack area while you're still running, causing one or both to crash weirdly.

So we need to "lock" not only the metadata needed for thread abort but the whole thread: its stack and its tracking object. And the only way to do that is deep inside swap.
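A simplified sketch of the mitigation, assembled from the hunks quoted above (in the real patch these pieces live at different points along the abort path, and run on different CPUs):

```c
/* Condensed illustration; not the literal patch. */
static void thread_abort_sync(struct k_thread *thread)
{
	/* On the CPU where the dying thread got interrupted: we're in
	 * an ISR (a different stack), so the dead thread can never be
	 * resumed and the delay below may be skipped. */
	if (IS_ENABLED(CONFIG_SMP) && arch_is_in_isr()) {
		thread->base.thread_state |= _THREAD_ABORTED_IN_ISR;
	}

	/* On the externally aborting CPU, outside the sched lock:
	 * unless the race was resolved in an ISR as above, the target
	 * may still be running on its own stack until its final swap. */
	if ((thread->base.thread_state & _THREAD_ABORTED_IN_ISR) == 0U) {
		k_busy_wait(THREAD_ABORT_DELAY_US);
	}
}
```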

@andyross (Contributor, Author):

Update to correct the too-clever handling of the ABORTED_IN_ISR flag.

This fixes the thread end/spawn races for me, though there are still some spurious failures in there that need attention. FWIW: applying this patch on top of the last commit before x86_64 userspace results in a tree that passes basically everything (though there's a semi-routine failure in the shell test that persists).

The ones remaining look odd -- RIP is getting set to bad values, sometimes nulls, sometimes pointers to rodata areas, and it's happening in interrupt contexts where a few dozen bytes have been pushed on the stack already. It has the look of double-fault kind of things where it's actually logging the second exception and not the first, maybe? Still looking.

@andrewboie (Contributor) commented Jan 14, 2020

> It has the look of double-fault kind of things where it's actually logging the second exception and not the first, maybe? Still looking.

The exception code reports the vector that caused the fault, so you should see "Double Fault" if that was the case. Unless it faulted in the exception / irq code path...

> FWIW: applying this patch on top of the last commit before x86_64 userspace results in a tree that passes basically everything (though there's a semi-routine failure in the shell test that persists).

I'm not seeing this. I checked out my tree at revision d3c525b, which is immediately before the userspace series. I then applied this patch, and then added the sleep to z_ztest_run_test_suite(). I get several failures each run in a variety of tests. Here's the output of four runs in a row:

```
$ scr64
Deleting output directory /dev/shm/sanity-out
INFO    - JOBS: 48
INFO    - Building initial testcase list...
INFO    - 162 test configurations selected, 575 configurations discarded due to filters.
INFO    - Adding tasks to the queue...
INFO    - Total complete:   94/ 162  58%  skipped:   27, failed:    0
ERROR   - qemu_x86_64               tests/kernel/common/kernel.common.misra            FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/common/kernel.common.misra/handler.log
INFO    - Total complete:  119/ 162  73%  skipped:   27, failed:    1
ERROR   - qemu_x86_64               tests/kernel/queue/kernel.queue.poll               FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/queue/kernel.queue.poll/handler.log
INFO    - Total complete:  122/ 162  75%  skipped:   27, failed:    2
ERROR   - qemu_x86_64               tests/kernel/queue/kernel.queue                    FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/queue/kernel.queue/handler.log
INFO    - Total complete:  137/ 162  84%  skipped:   27, failed:    3
ERROR   - qemu_x86_64               tests/lib/json/libraries.encoding.json             FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/lib/json/libraries.encoding.json/handler.log
INFO    - Total complete:  162/ 162  100%  skipped:   27, failed:    4
INFO    - 131 of 162 tests passed (97.04%), 4 failed, 27 skipped with 0 warnings in 102.18 seconds
INFO    - In total 748 test cases were executed on 1 out of total 218 platforms (0.46%)
apboie@shodan:~/projects/zephyr/zephyr (1) (HEAD detached from d3c525b986) /home/apboie/projects/zephyr/zephyr
$ scr64
Deleting output directory /dev/shm/sanity-out
INFO    - JOBS: 48
INFO    - Building initial testcase list...
INFO    - 162 test configurations selected, 575 configurations discarded due to filters.
INFO    - Adding tasks to the queue...
INFO    - Total complete:   75/ 162  46%  skipped:   23, failed:    0
ERROR   - qemu_x86_64               tests/kernel/common/kernel.common.misra            FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/common/kernel.common.misra/handler.log
INFO    - Total complete:   77/ 162  47%  skipped:   24, failed:    1
ERROR   - qemu_x86_64               tests/shell/shell.core                             FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/shell/shell.core/handler.log
INFO    - Total complete:   88/ 162  54%  skipped:   25, failed:    2
ERROR   - qemu_x86_64               tests/kernel/mem_protect/sys_sem/kernel.memory_protection.sys_sem FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/mem_protect/sys_sem/kernel.memory_protection.sys_sem/handler.log
INFO    - Total complete:   92/ 162  56%  skipped:   27, failed:    3
ERROR   - qemu_x86_64               tests/kernel/common/kernel.common                  FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/common/kernel.common/handler.log
INFO    - Total complete:   98/ 162  60%  skipped:   27, failed:    4
ERROR   - qemu_x86_64               tests/kernel/fifo/fifo_api/kernel.fifo             FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/fifo/fifo_api/kernel.fifo/handler.log
INFO    - Total complete:  116/ 162  71%  skipped:   27, failed:    5
ERROR   - qemu_x86_64               tests/lib/sprintf/libraries.libc.sprintf           FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/lib/sprintf/libraries.libc.sprintf/handler.log
INFO    - Total complete:  162/ 162  100%  skipped:   27, failed:    6
INFO    - 129 of 162 tests passed (95.56%), 6 failed, 27 skipped with 0 warnings in 47.67 seconds
INFO    - In total 748 test cases were executed on 1 out of total 218 platforms (0.46%)
apboie@shodan:~/projects/zephyr/zephyr (1) (HEAD detached from d3c525b986) /home/apboie/projects/zephyr/zephyr
$ scr64
Deleting output directory /dev/shm/sanity-out
INFO    - JOBS: 48
INFO    - Building initial testcase list...
INFO    - 162 test configurations selected, 575 configurations discarded due to filters.
INFO    - Adding tasks to the queue...
INFO    - Total complete:   50/ 162  30%  skipped:   23, failed:    0
ERROR   - qemu_x86_64               tests/subsys/logging/log_msg/logging.log_msg       FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/subsys/logging/log_msg/logging.log_msg/handler.log
INFO    - Total complete:   76/ 162  46%  skipped:   23, failed:    1
ERROR   - qemu_x86_64               tests/posix/fs/portability.posix.fs                FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/posix/fs/portability.posix.fs/handler.log
INFO    - Total complete:   86/ 162  53%  skipped:   25, failed:    2
ERROR   - qemu_x86_64               tests/kernel/common/kernel.common.misra            FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/common/kernel.common.misra/handler.log
INFO    - Total complete:   88/ 162  54%  skipped:   25, failed:    3
ERROR   - qemu_x86_64               tests/kernel/common/kernel.common                  FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/common/kernel.common/handler.log
INFO    - Total complete:   97/ 162  59%  skipped:   27, failed:    4
ERROR   - qemu_x86_64               tests/kernel/fifo/fifo_api/kernel.fifo.poll        FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/fifo/fifo_api/kernel.fifo.poll/handler.log
INFO    - Total complete:  106/ 162  65%  skipped:   27, failed:    5
ERROR   - qemu_x86_64               tests/kernel/mem_heap/mheap_api_concept/kernel.memory_heap FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/mem_heap/mheap_api_concept/kernel.memory_heap/handler.log
INFO    - Total complete:  162/ 162  100%  skipped:   27, failed:    6
INFO    - 129 of 162 tests passed (95.56%), 6 failed, 27 skipped with 0 warnings in 47.46 seconds
INFO    - In total 748 test cases were executed on 1 out of total 218 platforms (0.46%)
apboie@shodan:~/projects/zephyr/zephyr (1) (HEAD detached from d3c525b986) /home/apboie/projects/zephyr/zephyr
$ scr64
Deleting output directory /dev/shm/sanity-out
INFO    - JOBS: 48
INFO    - Building initial testcase list...
INFO    - 162 test configurations selected, 575 configurations discarded due to filters.
INFO    - Adding tasks to the queue...
INFO    - Total complete:   58/ 162  35%  skipped:   23, failed:    0
ERROR   - qemu_x86_64               tests/subsys/shell/shell_history/shell.history     FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/subsys/shell/shell_history/shell.history/handler.log
INFO    - Total complete:   78/ 162  48%  skipped:   24, failed:    1
ERROR   - qemu_x86_64               tests/kernel/common/kernel.common                  FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/common/kernel.common/handler.log
INFO    - Total complete:   80/ 162  49%  skipped:   24, failed:    2
ERROR   - qemu_x86_64               tests/kernel/common/kernel.common.misra            FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/common/kernel.common.misra/handler.log
INFO    - Total complete:  110/ 162  67%  skipped:   27, failed:    3
ERROR   - qemu_x86_64               tests/kernel/mem_pool/sys_mem_pool/kernel.memory_pool.sys FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/mem_pool/sys_mem_pool/kernel.memory_pool.sys/handler.log
INFO    - Total complete:  111/ 162  68%  skipped:   27, failed:    4
ERROR   - qemu_x86_64               tests/kernel/mem_heap/mheap_api_concept/kernel.memory_heap FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/mem_heap/mheap_api_concept/kernel.memory_heap/handler.log
INFO    - Total complete:  118/ 162  72%  skipped:   27, failed:    5
ERROR   - qemu_x86_64               tests/kernel/mem_pool/mem_pool_api/kernel.memory_pool.api FAILED: Exited with 2
ERROR   - see: /dev/shm/sanity-out/qemu_x86_64/tests/kernel/mem_pool/mem_pool_api/kernel.memory_pool.api/handler.log
INFO    - Total complete:  162/ 162  100%  skipped:   27, failed:    6
INFO    - 129 of 162 tests passed (95.56%), 6 failed, 27 skipped with 0 warnings in 47.93 seconds
INFO    - In total 748 test cases were executed on 1 out of total 218 platforms (0.46%)
apboie@shodan:~/projects/zephyr/zephyr (1) (HEAD detached from d3c525b986) /home/apboie/projects/zephyr/zephyr
$
```

@andrewboie (Contributor):

It may also be useful in testing to clone this branch:

https://github.com/andrewboie/zephyr/tree/smp-issues

It shows similar behavior with the old x86-64 port. If you backport your fix there (it doesn't apply cleanly) and run the tests against the old x86_64 target, it will help identify whether the behavior is related to the arch port or not.

@andyross (Contributor, Author):

I can't tell if you're saying you can still see problems (where I agree) or whether you're saying this patch doesn't fix anything for you (seems like it's a clear bug and it absolutely improves things a ton for me)?

I genuinely don't think the stuff you're seeing is related to this issue. It's a separate bug.

@andrewboie (Contributor):

> I can't tell if you're saying you can still see problems (where I agree) or whether you're saying this patch doesn't fix anything for you (seems like it's a clear bug and it absolutely improves things a ton for me)?

I honestly cannot tell if this patch improves the situation. I have no objections to merging it.

What I'm trying to make clear is that this doesn't fix the behavior reported in #21317 and I'm not seeing a significant reduction in the tests that are failing.

@andrewboie (Contributor):

> I have no objections to merging it.

Actually that's not quite true, which is why I haven't +1'd this. I'm very uneasy with the solution of adding a 50us busy-wait to correct this race. I need some kind of guarantee that this could never fail to work, in all conceivable scheduling and interrupt delivery scenarios. Think hundreds or thousands of devices in a functional safety scenario: any failure would be unacceptable. I understand a more robust solution requires much more invasive changes, but I'm not convinced yet that those changes don't need to be made.

If you can prove that the delay is always guaranteed to work then I'll add a +1. I'm not at the point where I'd leave a -1, but this doesn't sit well enough with me yet to +1 it.

@andyross (Contributor, Author):

OK, try this. It now includes a fix for a second race in z_swap(), and IMHO a pretty clever synchronization trick that avoids the need to change the architecture API. This is now down to well under one failure per run on my rig. Will continue to look at the remaining spurious failures (samples/userspace/shared_mem and samples/userspace/prod_consumer are the most affected, with a few tests/kernel/common failures too, it seems).

Note that the original abort fix (with the delay) remains, unmodified. You might ask: can the tight spin loop used for swap be used for the same purpose here? And I think it can, but it's going to require some rework to the way that IPI works currently.

@andyross changed the title from "kernel/sched: Address thread abort termination delay issue on SMP" to "Fix SMP races with thread swap and abort" on Jan 17, 2020
@zephyrbot added the "area: Boards" and "area: X86" labels on Jan 17, 2020
@zephyrbot (Collaborator) commented Jan 17, 2020

All checks are passing now.

Tip: The bot edits this comment instead of posting a new one, so you can check the comment's history to see earlier messages.

@andrewboie (Contributor):

I'm getting really good results with this PR. In several runs with the delay placed in ztest, most of the failures I see are just timing-related stuff that we already know about.

I concur that just a few tests still have problems:

samples/userspace/shared_mem - This test seems to run and then suddenly exit the emulator. QEMU does this when it triple-faults, and in a lot of cases dumps no debug information. I think this one is probably on my end.

tests/kernel/common - Some of these failures I see are due to timing slop. But sometimes it crashes, with a trace that looks like it was trying to read/write a member of a struct through a NULL pointer. Not sure what it is about this test that elicits behavior like this.

I haven't seen any crashes in prod_consumer, got any logs?

> Note that the original abort fix (with the delay) remains, unmodified. You might ask: can the tight spin loop used for swap be used for the same purpose here? And I think it can, but it's going to require some rework to the way that IPI works currently.

How's this sound for a deal -- I've voiced some doubts about using a delay, but I'll shut up about it if we can file a GH enhancement for implementing a more robust solution in the fullness of time?

@andyross (Contributor, Author):

One more easy fix in shared_mem seems to have resolved that test. Down to what I'm measuring as 4 total failures across 7 runs so far, with no repetitions.

@zephyrbot added the "area: Samples" label on Jan 17, 2020
Andy Ross added 7 commits January 17, 2020 14:01
Races seem fixed, at least the ones measurable by existing tests.

This reverts commit 3f3c72a.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The original intent was that the output handle be written through the
pointer in the second argument, though not all architectures used that
scheme.  As it turns out, that write is becoming a synchronization
signal, so it's no longer optional.

Clarify the documentation in arch_switch() about this requirement, and
add an instruction to the x86_64 context switch to implement it as
originally envisioned.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
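A sketch of the contract this commit documents; the signature matches Zephyr's arch_switch(), but the body below is explanatory comments only, not the real (assembly) implementation:

```c
/* Assumed rendering of the documented contract. */
static inline void arch_switch(void *switch_to, void **switched_from)
{
	/* An implementation must:
	 * 1. save all state of the outgoing thread;
	 * 2. only then store the outgoing thread's handle through
	 *    switched_from: that store is now a synchronization signal,
	 *    not an optional convenience;
	 * 3. restore and resume the switch_to context.
	 */
}
```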
These two spots were calling z_sched_ipi() (the IPI handler run under
the ISR, which is a noop here because obviously the current thread
isn't DEAD) and not arch_sched_ipi() (which triggers an IPI on other
CPUs to inform them of scheduling state changes), presumably because
of a typo.

Apparently we don't have tests for k_wakeup() and
k_thread_priority_set() that are sensitive to latency in SMP
contexts...

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
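For context, the distinction between the two calls, with a paraphrased (not quoted) call site:

```c
/* arch_sched_ipi(): sends the IPI to the other CPUs.
 * z_sched_ipi():    the handler-side work, run on a CPU receiving the
 *                   IPI; invoking it locally is a no-op in this case. */
void k_wakeup(k_tid_t thread)	/* paraphrased call site, per the commit */
{
	/* ... make the thread runnable ... */
#ifdef CONFIG_SMP
	arch_sched_ipi();	/* was: z_sched_ipi(), a typo */
#endif
}
```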
It's possible for a thread to abort itself simultaneously with an
external abort from another thread.  In fact in our test suite this is
a common thing, as ztest will abort its own spawned threads at the end
of a test, as they tend to be exiting on their own.

When that happens, the thread marks itself DEAD and does all its
scheduler bookkeeping, but it is STILL RUNNING on its own stack until
it makes its way to its final swap.  The external context would see
that "dead" metadata and return from k_thread_abort(), allowing the
next test to reuse and spawn the same thread struct while the old
context was still running.  Obviously that's bad.

Unfortunately, this is impossible to address completely without
modifying every SMP architecture to add an API-visible hook to every
swap that signals completion.  In practice the best we can do is add a
delay.  But note the optimization: almost always, the scheduler IPI
catches the running thread and kills it from interrupt context
(i.e. on a different stack).  When that happens, we know that the
interrupted thread will never be resumed (because it's dead) and can
elide the delay.  We only pay the cost when we actually detect a race.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
On SMP, there is an inherent race when swapping: the old thread adds
itself back to the run queue before calling into the arch layer to do
the context switch.  The former is properly synchronized under the
scheduler lock, and the latter operates with interrupts locally
disabled.  But until somewhere in the middle of arch_switch(), the old
thread (that is in the run queue!) does not have complete saved state
that can be restored.

So it's possible for another CPU to grab a thread before it is saved
and try to restore its unsaved register contents (which are garbage --
typically whatever state it had at the last interrupt).

Fix this by leveraging the "swapped_from" pointer already passed to
arch_switch() as a synchronization primitive.  When the switch
implementation writes the new handle value, we know the switch is
complete.  Then we can wait for that in z_swap() and at interrupt
exit.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
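A minimal sketch of that wait, modeled on the wait_for_switch() helper referenced later in this review (the field and config names are Zephyr's; treat the body as an approximation of the patch, not a verbatim copy):

```c
/* Spin until the outgoing CPU publishes the thread's switch handle,
 * which arch_switch() now does only after all state has been saved. */
static inline void wait_for_switch(struct k_thread *thread)
{
#ifdef CONFIG_SMP
	volatile void **shp = (void *)&thread->switch_handle;

	while (*shp == NULL) {
		k_busy_wait(1);
	}
#endif
}
```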
Calling z_ready_thread() means the thread is now ready and can wake up
at any moment on another CPU.  But we weren't finished setting the
return value!  So the other side could wake up with a spurious "error"
condition if it ran too soon.  Note that on systems with a working
IPI, that wakeup can happen much faster than you might think.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
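The ordering constraint as a hedged sketch; z_thread_return_value_set_with_data() is the kernel-internal helper assumed here, and the surrounding function is invented for illustration:

```c
static void wake_waiter(struct k_thread *thread, void *data)
{
	/* Publish the swap return value and data BEFORE the thread
	 * becomes runnable; once z_ready_thread() runs, another CPU
	 * (especially with a working IPI) may resume it immediately. */
	z_thread_return_value_set_with_data(thread, 0, data);
	z_ready_thread(thread);
}
```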
There was some unlocked initialization code in the "enc" thread that
would race with the "pt" and "ct" threads if another CPU was available
to run them (it was safe on UP because "enc" entered the queue first
and was cooperative; the others wouldn't run until it blocked).

Move it to main() and remove the enc_state guard variable which is no
longer doing anything.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
@andyross (Contributor, Author):

Fix commit message warning; I'd left a bare revert in.

```c
@@ -104,6 +103,15 @@ void main(void)
	struct k_mem_partition *dom0_parts[] = {&part0, &part1};
	k_tid_t tPT, tENC, tCT;

	fBUFIN = 0; /* clear flags */
```
@andrewboie (Contributor) commented on this hunk:

Change looks good, but I'm still worried about how this was crashing by suddenly exiting the emulator. Even with a race it should at least produce an exception that can be handled; if the whole thing triple-faults, that tells me we might have a bug in arch/x86 somewhere that I should try to investigate more, unless this race somehow got the page tables corrupted. What do you think?

@andrewboie (Contributor) left a comment:

I need some questions I asked earlier answered before this can go in.

  1. Are there any scenarios of interrupt delivery/scheduling where the k_busy_wait(THREAD_ABORT_DELAY_US) will not work, no matter how obscure? If the answer is yes, then we need at minimum an issue in the backlog to address it with a solution like what was done with wait_for_switch(). Or ideally, just do it like that now.

  2. Was the samples/userspace/shared_mem test triple-faulting and exiting the emulator, or was it simply getting stuck and timing out? If it was triple-faulting and exiting, there may be a serious bug somewhere that needs to be looked at; no user thread should be able to get the kernel to explode like that.

@andyross (Contributor, Author):

> Change looks good, but I'm still worried about how this was crashing by suddenly exiting the emulator.

This was just a vanilla failure. The triple-fault behavior was AFAICT exclusively related to the "two CPUs running on the same kernel stack" symptom. I don't have a good explanation for exactly how the process got that far out of control either, but it hasn't reappeared with the two main fixes here.

Regarding the delay: we don't currently spin on anything non-deterministic on the abort paths (i.e. from the point where the thread has self-aborted to the next swap or return from ISR). But obviously it's not possible to prove that will always be so; new APIs will always be able to invent new ways to k_thread_abort(_current) and then do something crazy like spin on an external resource.

If it bugs you we can just wait: I have a handle on how to rework IPI to be synchronous (there's a cost there too: you have to wait for all CPUs to have handled the IPI and not just for one to have seen this particular thread, but that's not awful given that we're talking about thread abort and not a performance path).
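A hypothetical shape for that synchronous-IPI rework (this is entirely an assumption sketched from the comment above; none of it is in the PR):

```c
/* Hypothetical only: count acknowledgements from the IPI handler, then
 * spin until every other CPU has processed the IPI. */
static atomic_t ipi_acks;

void z_sched_ipi(void)			/* runs on each receiving CPU */
{
	/* ... existing per-CPU scheduling work ... */
	atomic_inc(&ipi_acks);
}

static void sched_ipi_sync(void)	/* invented name */
{
	atomic_set(&ipi_acks, 0);
	arch_sched_ipi();
	while (atomic_get(&ipi_acks) < CONFIG_MP_NUM_CPUS - 1) {
		/* spin: every other CPU must observe the dead thread */
	}
}
```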

@andrewboie (Contributor):

> This was just a vanilla failure. The triple-fault behavior was AFAICT exclusively related to the "two CPUs running on the same kernel stack" symptom. I don't have a good explanation for exactly how the process got that far out of control either, but it hasn't reappeared with the two main fixes here.

Gotcha, thanks for the additional details.

> Regarding the delay: we don't currently spin on anything non-deterministic on the abort paths (i.e. from the point where the thread has self-aborted to the next swap or return from ISR). But obviously it's not possible to prove that will always be so; new APIs will always be able to invent new ways to k_thread_abort(_current) and then do something crazy like spin on an external resource.

> If it bugs you we can just wait: I have a handle on how to rework IPI to be synchronous (there's a cost there too: you have to wait for all CPUs to have handled the IPI and not just for one to have seen this particular thread, but that's not awful given that we're talking about thread abort and not a performance path).

Let's get this merged now then, but I would like to leave #21317 open until we can do the IPI rework. It really has been bugging me, mostly because of our FuSa positioning.

@andrewboie merged commit 86ac7dc into zephyrproject-rtos:master on Jan 21, 2020
Labels: area: API (Changes to public APIs) · area: Boards · area: Kernel · area: Samples · area: X86 (x86 Architecture, 32-bit)