
lib/ukschedcoop: Add early check for runnable thread in idle thread #941

Member

@mogasergiu mogasergiu commented Jun 9, 2023

lib/ukschedcoop: Add early check for runnable thread in idle thread

Since we enable IRQ's before context switching, there is a chance that
IRQ's may fire in the short time frame between the enabling of IRQ's
in the scheduler before context switching and disabling of IRQ's before
entering the idle thread's halting state.

To fix this, make sure we disable IRQ's at the beginning of the idle
thread's halting loop and do a quick early check for runnable threads
alongside the one for exited threads.

Co-authored-by: Stefan Jumarea <stefanjumarea02@gmail.com>
Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Signed-off-by: Stefan Jumarea <stefanjumarea02@gmail.com>

Prerequisite checklist

  • Read the contribution guidelines regarding submitting new changes to the project;
  • Tested your changes against relevant architectures and platforms;
  • Ran the checkpatch.uk on your commit series before opening this PR;
  • Updated relevant documentation.

Base target

  • Architecture(s): [e.g. x86_64 or N/A]
  • Platform(s): [e.g. kvm, xen or N/A]
  • Application(s): [e.g. app-python3 or N/A]

Additional configuration

Description of changes

@mogasergiu mogasergiu requested a review from a team as a code owner June 9, 2023 10:15
@unikraft-bot unikraft-bot added area/lib Internal Unikraft Microlibrary lang/c Issues or PRs to do with C/C++ lib/ukschedcoop labels Jun 9, 2023
@mogasergiu mogasergiu force-pushed the smoga/stuck_hlt_no_timer_irq_issue branch from 17b567d to 3f28479 Compare June 9, 2023 12:05
@razvand razvand assigned andreittr and unassigned nderjung Jun 9, 2023
@razvand razvand requested review from StefanJum and dinhngtu and removed request for a team and andraprs June 9, 2023 12:48
@razvand razvand added this to the v0.14.0 (Prometheus) milestone Jun 9, 2023
Member

@StefanJum StefanJum left a comment

All good on my side 🚀
I won't add the review tag, since I'm added as a co-author.

Member

@dinhngtu dinhngtu left a comment

The patch looks fine to me.

A side note: sti has a delayed effect that won't kick in until the next instruction, meaning IRQs cannot fire in the middle of a sti; hlt sequence with the exception of NMIs on some CPUs (which Linux handles in this fashion). There might be an alternative way to solve the issue without having to set the clock every idle tick.

Reviewed-by: Tu Dinh Ngoc <dinhngoc.tu@irit.fr>
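
To illustrate the side note above, here is a minimal toy model of the delayed effect of sti. It is only a sketch and not how any real CPU or Unikraft is implemented: the `toy_*` names, the three-instruction "ISA" and the single `shadow` flag are all assumptions made for illustration. The model only takes interrupts at instruction boundaries and suppresses delivery for the one instruction that follows `TOY_STI`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction set: just enough to model an idle sequence. */
enum toy_insn { TOY_STI, TOY_NOP, TOY_HLT };

struct toy_cpu {
	bool irqs_enabled; /* models the interrupt flag */
	bool shadow;       /* set by STI; blocks delivery for one instruction */
	bool irq_pending;  /* a wake-up interrupt is waiting to be delivered */
	bool halted;       /* CPU is sleeping in HLT */
};

/*
 * Runs `prog` on a toy CPU that starts with IRQs disabled and returns
 * whether the CPU is still halted at the end. If an IRQ was pending at
 * the start, ending up halted means the wake-up was lost.
 */
static bool toy_ends_halted(const enum toy_insn *prog, size_t len,
			    bool irq_pending_at_start)
{
	struct toy_cpu cpu = { .irq_pending = irq_pending_at_start };
	size_t i;

	assert(prog || len == 0);

	for (i = 0; i < len; i++) {
		switch (prog[i]) {
		case TOY_STI:
			cpu.irqs_enabled = true;
			cpu.shadow = true;
			break;
		case TOY_NOP:
			cpu.shadow = false;
			break;
		case TOY_HLT:
			cpu.shadow = false;
			cpu.halted = true;
			break;
		}

		/* Interrupts are only taken at instruction boundaries. */
		if (cpu.irq_pending && cpu.irqs_enabled && !cpu.shadow) {
			cpu.irq_pending = false; /* handler runs, event consumed */
			cpu.halted = false;      /* wakes the CPU if it was halted */
		}
	}

	return cpu.halted;
}
```

In this model, a back-to-back `TOY_STI, TOY_HLT` sequence with a pending IRQ always wakes up again, while putting any instruction between the two lets the IRQ be consumed before the halt, so the CPU goes to sleep with nothing left to wake it.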

@mogasergiu
Member Author

The patch looks fine to me.

A side note: sti has a delayed effect that won't kick in until the hlt instruction, meaning IRQs cannot fire in the middle of a sti; hlt sequence with the exception of NMIs on some CPUs (which Linux handles in this fashion). This might be an alternative way to solve the issue without having to set the clock every idle tick.

Reviewed-by: Tu Dinh Ngoc <dinhngoc.tu@irit.fr>

Hey, thanks for the feedback! Interesting, I did not know that. This is definitely valuable information, and I should indeed have checked how Linux handles this beforehand. One thing, though: with this change, the same bug is also fixed on ARM.
Now that you mention this, QEMU seems to have something related to this as well[1].
The troublesome IRQ, I believe, fires somewhere after schedcoop_schedule's end, since if it were to have fired anywhere before that, this branch would have been taken and things would not break. Therefore, the actual fix, I guess, would have been trying to make sure that whatever possible IRQ's may have been fired after that point do not fire until that hlt is hit, or the actual thread switch has been made. To confirm this, I removed this restoration of IRQ's and, indeed, this bug does not occur anymore. However, I am not sure this would be a proper fix... unless during a context switch we would also restore the interrupt flags of that context I guess, so the thread does not remain with disabled IRQ's.

In any case, the commit message must definitely be updated to at least change the sti; hlt x86 example :).

What is your opinion on this, @andreittr ? It would be great to avoid having to re-configure the timer at every idle tick.

[1] https://github.com/qemu/qemu/blob/master/target/i386/cpu.c#L7424
https://github.com/qemu/qemu/blob/master/target/i386/tcg/sysemu/misc_helper.c#L484
https://github.com/qemu/qemu/blob/master/target/i386/tcg/translate.c#L5564

@mschlumpp
Member

Yes, the correct way to solve this is to use the "interrupt shadow" mechanism as @dinhngtu noted. The x86 ukplat_lcpu_halt_irq also takes care of this by using a bare sti; hlt under the hood.
The problem is that the idle thread is running with enabled interrupts and therefore any interrupt after the switch to it can fail to correctly unblock a thread. I wrote a workaround using the existing sched_have_pending_events variable some time ago.

@mogasergiu
Member Author

Yes, the correct way to solve this is to use the "interrupt shadow" mechanism as @dinhngtu noted. The x86 ukplat_lcpu_halt_irq also takes care of this by using a bare sti; hlt under the hood. The problem is that the idle thread is running with enabled interrupts and therefore any interrupt after the switch to it can fail to correctly unblock a thread. I wrote a workaround using the existing sched_have_pending_events variable some time ago.

Ok, but, as I said, I would like the fix to apply to ARM as well. Besides, can IRQ's not fire before the idle thread is even switched to?

@mschlumpp
Member

mschlumpp commented Jun 14, 2023

Besides, can IRQ's not fire before the idle thread is even switched to?

True.

Ok, but, as I said, I would like the fix to apply to ARM as well.

I mean, there has to be some way on ARM to properly halt and wait for interrupts. Otherwise, timers would be basically unusable.

The current approach of deciding to wait for interrupts (in the scheduler), enable the interrupts, do some stuff, and then doing the actual "wait-for-interrupts" is just very wrong.
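
The ordering criticised above can be sketched as a small deterministic simulation. This is not the schedcoop code: the `race_*` names and the `irq_in_window` switch are hypothetical, and the wake-up interrupt is injected by hand at the point where the real one could fire.

```c
#include <assert.h>
#include <stdbool.h>

struct race_cpu {
	bool irqs_enabled;
	int runnable;          /* threads woken up and waiting to run */
	bool halted_with_work; /* true if we went to sleep despite runnable > 0 */
};

/* Wake-up interrupt: only delivered while IRQs are enabled. */
static void race_irq(struct race_cpu *c)
{
	if (c->irqs_enabled)
		c->runnable++;
}

/*
 * Hypothetical "decide first, halt later" idle step. `irq_in_window`
 * injects an interrupt between the decision and the halt.
 */
static void race_decide_then_halt(struct race_cpu *c, bool irq_in_window)
{
	bool should_wait;

	assert(c);

	should_wait = (c->runnable == 0); /* 1. decide to wait */
	c->irqs_enabled = true;           /* 2. IRQs get enabled */
	if (irq_in_window)
		race_irq(c);              /* 3. wake-up lands here */

	if (should_wait) {
		c->irqs_enabled = false;  /* 4. disable IRQs for the halt */
		/* The halt now acts on a decision that may be stale. */
		c->halted_with_work = (c->runnable > 0);
	}
}
```

With `irq_in_window` set, the step ends with `halted_with_work` true: a thread is runnable, yet the CPU halts and has to wait for some later, unrelated interrupt.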

@mogasergiu
Member Author

mogasergiu commented Jun 14, 2023

Besides, can IRQ's not fire before the idle thread is even switched to?

True.

Ok, but, as I said, I would like the fix to apply to ARM as well.

I mean, there has to be some way on ARM to properly halt and wait for interrupts. Otherwise, timers would be basically unusable.

Agreed. I'd have to check how others do it on ARM. However, at this point, as both of you pointed out, the problem turns out not to be in the halting instruction sequence but before it, because IRQ's can fire between the point right before switching and our if statement. Placing the scheduler call a bit earlier might be an idea, but even then IRQ's can fire after we reach it, meaning that we would also need IRQ's disabled at that point. I propose something similar to this:

diff --git a/lib/ukschedcoop/schedcoop.c b/lib/ukschedcoop/schedcoop.c
index 4655c4bc8..f18c2b479 100644
--- a/lib/ukschedcoop/schedcoop.c
+++ b/lib/ukschedcoop/schedcoop.c
@@ -188,9 +188,16 @@ static __noreturn void idle_thread_fn(void *argp)
 {
 	struct schedcoop *c = (struct schedcoop *) argp;
 	__nsec now, wake_up_time;
+	unsigned long flags;
 
 	UK_ASSERT(c);
 
+	flags = ukplat_lcpu_save_irqf();
+
+	/* Quick early check to see if, in the meantime, we got a runnable thread */
+	if (UK_TAILQ_FIRST(&c->run_queue)) {
+		ukplat_lcpu_restore_irqf(flags);
+		schedcoop_schedule(&c->sched);
+	}
+
 	for (;;) {
 		/*
 		 * FIXME: We assume that `uk_sched_thread_gc()` is non-blocking
@@ -215,12 +222,12 @@ static __noreturn void idle_thread_fn(void *argp)
 
 		if (!wake_up_time || wake_up_time > now) {
 			if (wake_up_time) {
-				ukplat_lcpu_halt_to(wake_up_time);
+				time_block_until(wake_up_time);
 			} else {
 				ukplat_lcpu_halt_irq();
-				ukplat_lcpu_enable_irq();
 			}
 
+			ukplat_lcpu_restore_irqf(flags);
 			/* handle pending events if any */
 			ukplat_lcpu_irqs_handle_pending();
 		}

This works on my side, though I hope it does not work only because I am missing something. AFAICT the flags might not be needed; simply enabling/disabling IRQ's might suffice. I am looking forward to your opinions.

@mschlumpp
Member

@mogasergiu The "is-work-available" check has to be within the loop. Otherwise, it will only work the first time. So the loop should be: if-work-available -> yes: enable-irq-again + schedule + disable-irq, no: block

Also I would rather modify ukplat_lcpu_halt_to to expect an IRQ-disabled context (also with an UK_ASSERT) to match the semantics of ukplat_lcpu_halt_irq (using it in a context with enabled IRQs is almost guaranteed to be broken, and it has the hard assumption that IRQs are disabled at the call site anyway).

Using save/restore is fine and IMHO usually less bugprone.
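
The loop shape described above can be sketched against a simulated CPU. This is only an illustration under stated assumptions: every `demo_*` function is a hypothetical stand-in (loosely mirroring `ukplat_lcpu_save_irqf`, `ukplat_lcpu_restore_irqf`, `ukplat_lcpu_halt_irq` and `schedcoop_schedule`), and the halt is modelled as returning with IRQs still disabled.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_cpu {
	bool irqs_enabled;
	bool irq_latched;   /* IRQ raised while disabled, delivered later */
	int run_queue_len;  /* threads ready to run */
	int schedule_calls;
	int halt_calls;
};

/* The simulated IRQ handler wakes exactly one thread. */
static void demo_irq_handler(struct demo_cpu *c)
{
	c->run_queue_len++;
}

static void demo_raise_irq(struct demo_cpu *c)
{
	if (c->irqs_enabled)
		demo_irq_handler(c);
	else
		c->irq_latched = true;
}

/* Stand-in for save_irqf: returns the old state and disables IRQs. */
static bool demo_save_irqf(struct demo_cpu *c)
{
	bool flags = c->irqs_enabled;

	c->irqs_enabled = false;
	return flags;
}

/* Stand-in for restore_irqf: a latched IRQ fires once IRQs are back on. */
static void demo_restore_irqf(struct demo_cpu *c, bool flags)
{
	c->irqs_enabled = flags;
	if (flags && c->irq_latched) {
		c->irq_latched = false;
		demo_irq_handler(c);
	}
}

/* Stand-in for halt_irq: must be entered with IRQs disabled. */
static void demo_halt_irq(struct demo_cpu *c)
{
	assert(!c->irqs_enabled);
	c->halt_calls++;
	if (c->irq_latched) { /* a latched IRQ wakes us immediately */
		c->irq_latched = false;
		demo_irq_handler(c);
	}
}

static void demo_schedule(struct demo_cpu *c)
{
	c->schedule_calls++;
	c->run_queue_len--;
}

/* One pass of the idle loop: the work check runs with IRQs disabled. */
static void demo_idle_iteration(struct demo_cpu *c)
{
	bool flags = demo_save_irqf(c);

	if (c->run_queue_len > 0) {
		demo_restore_irqf(c, flags); /* work available: IRQs back on */
		demo_schedule(c);
		return; /* the next pass disables IRQs again */
	}

	demo_halt_irq(c); /* no work: block */
	demo_restore_irqf(c, flags);
}
```

Because the check and the halt both happen with IRQs disabled, an interrupt that made a thread runnable before the pass is seen by the check, and one raised during the pass is latched and wakes the halt instead of being lost.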

@mogasergiu
Member Author

@mogasergiu The "is-work-available" check has to be within the loop.

Oops, yes, you are correct. So I was, indeed, missing something 😆. Thank you!

Otherwise, it will only work the first time. So the loop should be: if-work-available -> yes: enable-irq-again + schedule + disable-irq, no: block

Ah, yes, I forgot to disable IRQ's after exiting the scheduler. For whatever reason, I thought it was a __noreturn when writing the diff, which is also why I did not place the if statement in the loop. Obviously, it wouldn't have made sense for it to be __noreturn, otherwise there wouldn't be any loop there in the first place.

Also I would rather modify ukplat_lcpu_halt_to to expect an IRQ-disabled context (also with an UK_ASSERT) to match the semantics of ukplat_lcpu_halt_irq (using it in a context with enabled IRQs is almost guaranteed to be broken, and it has the hard assumption that IRQs are disabled at the call site anyway).

Indeed, that would be a good idea. Would help avoid future mistakes, or at least in being more cautious.

Using save/restore is fine and IMHO usually less bugprone.

Thanks a lot for the valuable input ✈️. I will try to replace the current commit with the following two commits soon:

  • one to add the UK_ASSERT in ukplat_lcpu_halt_to (will add your Suggested-by: tag)
  • the actual fix commit with the suggested diff, updated with the if statement in the loop and the disable_irq as well

@mogasergiu
Member Author

@mschlumpp Actually, one thing I have just noticed is that ukplat_lcpu_halt_to is also used in __spin_wait (the only other place it is used). So that function was apparently moved during a revisit. Therefore, I am going to take back what I said about the UK_ASSERT commit. Unless the context is clear enough, I would avoid changing anything other than the fix itself. If the need arises in the future, fine, but I do not think that now is the time for this change. Instead, the additional change that I would suggest is moving _time.h's contents (it is not clear to me why time_block_until had to have its own header) to time.h and renaming the function to ukplat_time_block_until to align it with the others. This way libraries can also have easy access to it. Thoughts?

@mschlumpp
Member

mschlumpp commented Jun 14, 2023

We can also just add a save/restore pair to __spin_wait. Also if something breaks because of that assertion then it means it was likely already used in a wrong way, which is a good thing.

Is there any benefit of exposing the raw platform time_block_until?

@mogasergiu
Member Author

We can also just add a save/restore pair to __spin_wait. Also if something breaks because of that assertion then it means it was likely already used in a wrong way, which is a good thing.

Wouldn't ukplat_lcpu_halt_to just become an alias for time_block_until in that case?

Is there any benefit of exposing the raw platform time_block_until?

Every platform seems to have its own implementation for it. The main reason I want to do this is because libraries cannot include headers from uk/plat/common/ unless you modify their Makefile.uk.

@mschlumpp
Member

mschlumpp commented Jun 14, 2023

Wouldn't ukplat_lcpu_halt_to just become an alias for time_block_until in that case?

It was already used for the common code. In that case we still have at least one sanity check there (the irq-disabled assertion).

Every platform seems to have its own implementation for it. The main reason I want to do this is because libraries cannot include headers from uk/plat/common/ unless you modify their Makefile.uk.

That would increase the risk of using "unchecked" version accidentally. I also do not see a reason why libraries would want to opt-in the potential risk of running into the bug that this PR is about.

The documentation of ukplat_lcpu_halt_to explicitly mentions that it will wait for the timeout or an interrupt:

Halts the current logical CPU. Execution is resumed when an interrupt/signal arrives or the specified deadline expires

If you use this function you must disable IRQs or it will not do what you want.

@mogasergiu
Member Author

Wouldn't ukplat_lcpu_halt_to just become an alias for time_block_until in that case?

It was already used for the common code. In that case we still have at least one sanity check there (the irq-disabled assertion).

So you want ukplat_lcpu_halt_to to be just time_block_until but with a UK_ASSERT?

Every platform seems to have its own implementation for it. The main reason I want to do this is because libraries cannot include headers from uk/plat/common/ unless you modify their Makefile.uk.

That would increase the risk of using "unchecked" version accidentally.
I also do not see a reason why libraries would want to opt-in the potential risk of running into bug that this PR is about.

They have different signatures, I am not sure how likely that is to happen. The ukplat_time_block_until would be used by someone that just wants to wait, without caring about IRQ's. But, yeah, the same, or better, could be achieved with nanosleep, so yeah I guess no point in exposing it after all as there is a better alternative.

The documentation of ukplat_lcpu_halt_to explicitly mentions that it will wait for the timeout or an interrupt:

Halts the current logical CPU. Execution is resumed when an interrupt/signal arrives or the specified deadline expires

If you use this function you must disable IRQs or it will not do what you want.

Ok, this I agree with, the use of this function would be weird without IRQ's disabled, sure.

Right then, the second commit shall be just making ukplat_lcpu_halt_to execute time_block_until with a UK_ASSERT I guess. Although that would mean explicitly re-enabling IRQ's after returning from it. It would be the exact same as ukplat_lcpu_halt_irq but with an actual timed wait to wake us up. So maybe it would make sense to rename it to ukplat_lcpu_halt_irq_to as well 😆, if you agree. I shall also add an explicit warning in the function definition's comment.

@mschlumpp
Member

Instead, the additional change that I would suggest is moving _time.h's contents (it is not clear to me why time_block_until had to have its own header) to time.h and rename the function to ukplat_time_block_until [...]

My point is that the function would still wait for interrupts, but in a very unreliable way if you don't disable interrupts. Otherwise, you would have to explicitly check that the wake-interrupt was the correct timer interrupt. Or at least mention that there are "spurious" wakeups in the documentation.

So you want ukplat_lcpu_halt_to to be just time_block_until but with a UK_ASSERT?

Yes.

They have different signatures, I am not sure how likely that is to happen.

Okay, it is possible I misunderstood you. I thought you wanted to rename time_block_until to ukplat_time_block_until, which (currently) has essentially the same signature as ukplat_lcpu_halt_to.

So maybe it would make sense to rename it to ukplat_lcpu_halt_irq_to

Sounds good, or _until?

Although that would mean explicitly re-enabling IRQ's after returning from it.

Huh? You mean at the call-site?

@mogasergiu
Member Author

Instead, the additional change that I would suggest is moving _time.h's contents (it is not clear to me why time_block_until had to have its own header) to time.h and rename the function to ukplat_time_block_until [...]

My point is that the function would still wait for interrupts, but in a very unreliable way if you don't disable interrupts. Otherwise, you would have to explicitly check that the wake-interrupt was the correct timer interrupt. Or at least mention that there are "spurious" wakeups in the documentation.

So you want ukplat_lcpu_halt_to to be just time_block_until but with a UK_ASSERT?

Yes.

They have different signatures, I am not sure how likely that is to happen.

Okay, it is possible I misunderstood you. I thought you wanted to rename time_block_until to ukplat_time_block_until, which (currently) has essentially the same signature as ukplat_lcpu_halt_to.

So maybe it would make sense to rename it to ukplat_lcpu_halt_irq_to

Sounds good, or _until?

Good idea! _until is even better as it would call time_block_until anyway.

Although that would mean explicitly re-enabling IRQ's after returning from it.

Huh? You mean at the call-site?

Yes.

We can also just add a save/restore pair to __spin_wait.

Did I misunderstand your idea?
I don't think you can just enable_irqf afterwards (if you mean to just remove save_irqf) without saving irqf before the wait, all in ukplat_lcpu_halt_to. Or you could, but then you would have to call local_save_flags and, in the linuxu case, just set it to irq_enabled. This would look very ugly, ending up with an #ifdef, which I would like to avoid (although this would probably be easier to solve by writing a linuxu local_save_flags). Besides, this would mean that the caller would only have to disable IRQ's before the call. It would look more explicit if the call were surrounded by save/restore.

This is what I thought we agreed upon.

diff --git a/plat/common/lcpu.c b/plat/common/lcpu.c
index 2af4b5621..e02b3983c 100644
--- a/plat/common/lcpu.c
+++ b/plat/common/lcpu.c
@@ -189,11 +189,9 @@ void __noreturn ukplat_lcpu_halt(void)
 
-void ukplat_lcpu_halt_to(__nsec until)
+void ukplat_lcpu_halt_until(__nsec until)
 {
-	unsigned long flags;
+	UK_ASSERT(ukplat_lcpu_irqs_disabled());
 
-	flags = ukplat_lcpu_save_irqf();
 	time_block_until(until);
-	ukplat_lcpu_restore_irqf(flags);
 }

We can also just add a save/restore pair to __spin_wait.

Is this not what you meant?

diff --git a/lib/posix-time/time.c b/lib/posix-time/time.c
index b6fc71636..97ec47131 100644
--- a/lib/posix-time/time.c
+++ b/lib/posix-time/time.c
@@ -56,9 +56,12 @@
 static void __spin_wait(__nsec nsec)
 {
 	__nsec until = ukplat_monotonic_clock() + nsec;
+	unsigned long flags;
 
+	flags = ukplat_lcpu_save_irqf();
 	while (until > ukplat_monotonic_clock())
-		ukplat_lcpu_halt_to(until);
+		ukplat_lcpu_halt_until(until);
+	ukplat_lcpu_restore_irqf(flags);
 }
 #endif
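
The wait loop in the diff above re-checks the clock because, as noted earlier in the thread, the halt resumes on any interrupt and not only on the deadline. A minimal sketch with a fake clock shows why the loop matters; the `demo_*` names, the fake clock and the single scheduled "unrelated" IRQ are all assumptions made for illustration, not Unikraft APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t demo_nsec;

struct demo_plat {
	bool irqs_enabled;
	demo_nsec now;         /* fake monotonic clock */
	demo_nsec next_irq_at; /* time of one unrelated IRQ, 0 = none */
	int halts;
};

/*
 * Stand-in for the renamed halt function: it expects IRQs to be disabled
 * and returns on the deadline or on any earlier interrupt.
 */
static void demo_halt_until(struct demo_plat *p, demo_nsec until)
{
	assert(!p->irqs_enabled); /* mirrors the proposed UK_ASSERT */
	p->halts++;

	if (p->next_irq_at > p->now && p->next_irq_at < until) {
		p->now = p->next_irq_at; /* woken early by the unrelated IRQ */
		p->next_irq_at = 0;
	} else {
		p->now = until;          /* woken by the deadline */
	}
}

/* Caller-side pattern: save/restore around a loop that re-checks time. */
static void demo_spin_wait(struct demo_plat *p, demo_nsec nsec)
{
	demo_nsec until = p->now + nsec;
	bool flags = p->irqs_enabled; /* save */

	p->irqs_enabled = false;      /* disable */
	while (until > p->now)
		demo_halt_until(p, until);
	p->irqs_enabled = flags;      /* restore */
}
```

Starting at time 100 with an unrelated IRQ at 150, a 100 ns wait halts twice: once until the early wake-up and once more until the real deadline at 200. A single call without the loop would have returned 50 ns too early.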

@mogasergiu mogasergiu force-pushed the smoga/stuck_hlt_no_timer_irq_issue branch from 3f28479 to 49d0ed1 Compare June 14, 2023 14:23
@mogasergiu mogasergiu requested review from a team as code owners June 14, 2023 14:23
@mogasergiu mogasergiu changed the title lib/ukschedcoop: Ensure Timer IRQ can kick us out of halt states lib/ukschedcoop: Add early check for runnable thread in idle thread Jun 14, 2023
@unikraft-bot
Member

Checkpatch passed

Beep boop! I ran Unikraft's checkpatch.pl support script on your pull request and it all looks good!

SHA commit checkpatch
7ca6ea5 plat/common: Ensure `IRQ`'s disabled in `ukplat_lcpu_halt_to`

@unikraft-bot unikraft-bot added area/include Part of include/uk area/plat Unikraft Patform plat/common Common to all platforms labels Jun 14, 2023
@razvand razvand removed request for a team June 15, 2023 15:49
@razvand razvand assigned mschlumpp and unassigned andreittr Jun 15, 2023
@razvand
Contributor

razvand commented Jun 15, 2023

@mschlumpp, I added you as the assignee for this PR. Approve it if / when it's all good.

Member

@mschlumpp mschlumpp left a comment

EDIT: putting the review on hold for now

@mogasergiu mogasergiu force-pushed the smoga/stuck_hlt_no_timer_irq_issue branch from 49d0ed1 to defcd91 Compare June 21, 2023 16:15
mogasergiu and others added 2 commits June 21, 2023 19:16
Make sure that `IRQ`'s are disabled when calling `ukplat_lcpu_halt_to`.
Furthermore, do not surround the call to `time_block_until` with `IRQ`
`save`/`restore` and inform the function user through a `NOTE` to take
care of that himself.

Therefore, with this commit, change every other reference to this
function accordingly. Furthermore, change the name of the function to
reflect this.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Suggested-by: Marco Schlumpp <marco@unikraft.io>

Since we enable IRQ's before context switching, there is a chance that
IRQ's may fire in the short time frame between the enabling of IRQ's
in the scheduler before context switching and disabling of IRQ's before
entering the idle thread's halting state.

To fix this, make sure we disable IRQ's at the beginning of the idle
thread's halting loop and do a quick early check for runnable threads
alongside the one for exited threads.

Co-authored-by: Stefan Jumarea <stefanjumarea02@gmail.com>
Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Signed-off-by: Stefan Jumarea <stefanjumarea02@gmail.com>
@mogasergiu mogasergiu force-pushed the smoga/stuck_hlt_no_timer_irq_issue branch from defcd91 to 0365bc1 Compare June 21, 2023 16:16
Member

@mschlumpp mschlumpp left a comment

Okay, now everything seems to work fine. Thanks again!

Approved-by: Marco Schlumpp <marco@unikraft.io>

unikraft-bot pushed a commit that referenced this pull request Jul 7, 2023
Since we enable IRQ's before context switching, there is a chance that
IRQ's may fire in the short time frame between the enabling of IRQ's
in the scheduler before context switching and disabling of IRQ's before
entering the idle thread's halting state.

To fix this, make sure we disable IRQ's at the beginning of the idle
thread's halting loop and do a quick early check for runnable threads
alongside the one for exited threads.

Co-authored-by: Stefan Jumarea <stefanjumarea02@gmail.com>
Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Signed-off-by: Stefan Jumarea <stefanjumarea02@gmail.com>
Reviewed-by: Tu Dinh Ngoc <dinhngoc.tu@irit.fr>
Approved-by: Marco Schlumpp <marco@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: #941
@unikraft-bot unikraft-bot added the ci/merged Merged by CI label Jul 7, 2023
i-Pear pushed a commit to i-Pear/unikraft that referenced this pull request Jul 31, 2023
Make sure that `IRQ`'s are disabled when calling `ukplat_lcpu_halt_to`.
Furthermore, do not surround the call to `time_block_until` with `IRQ`
`save`/`restore` and inform the function user through a `NOTE` to take
care of that himself.

Therefore, with this commit, change every other reference to this
function accordingly. Furthermore, change the name of the function to
reflect this.

Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Suggested-by: Marco Schlumpp <marco@unikraft.io>
Reviewed-by: Tu Dinh Ngoc <dinhngoc.tu@irit.fr>
Approved-by: Marco Schlumpp <marco@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: unikraft#941
i-Pear pushed a commit to i-Pear/unikraft that referenced this pull request Jul 31, 2023
Since we enable IRQ's before context switching, there is a chance that
IRQ's may fire in the short time frame between the enabling of IRQ's
in the scheduler before context switching and disabling of IRQ's before
entering the idle thread's halting state.

To fix this, make sure we disable IRQ's at the beginning of the idle
thread's halting loop and do a quick early check for runnable threads
alongside the one for exited threads.

Co-authored-by: Stefan Jumarea <stefanjumarea02@gmail.com>
Signed-off-by: Sergiu Moga <sergiu@unikraft.io>
Signed-off-by: Stefan Jumarea <stefanjumarea02@gmail.com>
Reviewed-by: Tu Dinh Ngoc <dinhngoc.tu@irit.fr>
Approved-by: Marco Schlumpp <marco@unikraft.io>
Tested-by: Unikraft CI <monkey@unikraft.io>
GitHub-Closes: unikraft#941
Labels
area/include Part of include/uk area/lib Internal Unikraft Microlibrary area/plat Unikraft Patform ci/merged Merged by CI lang/c Issues or PRs to do with C/C++ lib/ukschedcoop plat/common Common to all platforms
Projects
Status: Done!

8 participants