Kernel work increment/decrement prevents MCU going to sleep #2098

Merged
merged 3 commits into tock:master on Sep 10, 2020

Conversation

alexandruradovici
Contributor

Pull Request Overview

This pull request (hopefully) fixes an issue that seems to prevent the kernel from putting the MCU to sleep. The issue seems to come from the kernel work counter.

While running the following user space app, it seems that the kernel does not sleep the MCU after the process's last yield system call.

// libtock-c headers: console.h declares putnstr_async(); stdlib.h provides NULL
#include <stdlib.h>

#include <console.h>

char hello[] = "Hello World!\r\n";
char run[] = "system run\r\n";
char texte[] = "texte\r\n";

static void nop(
  int a __attribute__((unused)),
  int b __attribute__((unused)),
  int c __attribute__((unused)),
  void* d __attribute__((unused))) {}


int main(void) {
  putnstr_async(hello, sizeof(hello), nop, NULL);
  putnstr_async(run, sizeof(run), nop, NULL);
  putnstr_async(texte, sizeof(texte), nop, NULL);
  return 0;
}

I did some research and found out that the issue pops up only when there are several subscribe calls for the same callback slot (driver_id, subscribe_number). In the kernel:

  1. I found that when a callback is overwritten:
    a. the kernel work is incremented
    b. the previous callback is removed, but the kernel work is not decremented accordingly

    This sends the scheduler into an infinite loop (see the sketch after this list).

  2. I also noticed that enqueue_task increments the kernel work regardless of whether a callback is actually scheduled. We also ran into an interesting issue when trying to implement a real-time external interrupt counter: it seems that if the kernel drops callbacks, it starts behaving strangely and the apps jam. I think it might be related to the kernel work counter. We still need to do some more research.
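
To make point 1 concrete, here is a minimal, self-contained Rust sketch of the invariant being described. It is not Tock's actual process.rs code; the names PendingTask, ProcessState, schedule_callback, and kernel_work are illustrative assumptions. The point is that when a pending callback for the same (driver_id, subscribe_number) slot is replaced, its unit of kernel work has to be given back, otherwise the counter never drains to zero and the scheduler never concludes that the chip can sleep.

use std::cell::Cell;
use std::collections::VecDeque;

struct PendingTask {
    driver_id: usize,
    subscribe_number: usize,
}

struct ProcessState {
    tasks: VecDeque<PendingTask>,
    // How many pending tasks the scheduler believes exist for this process.
    kernel_work: Cell<usize>,
}

impl ProcessState {
    fn schedule_callback(&mut self, task: PendingTask) {
        // If an older callback for the same subscription is still pending,
        // drop it and give back its unit of work before counting the new one.
        if let Some(pos) = self.tasks.iter().position(|t| {
            t.driver_id == task.driver_id && t.subscribe_number == task.subscribe_number
        }) {
            let _replaced = self.tasks.remove(pos);
            self.kernel_work.set(self.kernel_work.get() - 1);
        }
        self.tasks.push_back(task);
        self.kernel_work.set(self.kernel_work.get() + 1);
    }

    fn has_work(&self) -> bool {
        self.kernel_work.get() > 0
    }
}

fn main() {
    let mut p = ProcessState { tasks: VecDeque::new(), kernel_work: Cell::new(0) };
    // Two callbacks scheduled for the same (driver, subscribe) slot, as can
    // happen when an app issues several putnstr_async calls back to back.
    p.schedule_callback(PendingTask { driver_id: 1, subscribe_number: 0 });
    p.schedule_callback(PendingTask { driver_id: 1, subscribe_number: 0 });
    // Only one task is actually pending, and the counter agrees. Without the
    // decrement above the counter would read 2 and, after the single pending
    // task runs, would get stuck at 1, so the kernel would never see zero work.
    assert_eq!(p.tasks.len(), 1);
    assert_eq!(p.kernel_work.get(), 1);
    assert!(p.has_work());
}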

Testing Strategy

TODO or Help Wanted

N/A

Documentation Updated

  • Updated the relevant files in /docs, or no updates are required.

Formatting

  • Ran make prepush.

@hudson-ayers
Contributor

  1. I found that when a callback is overwritten:
    a. the kernel work is incremented
    b. the previous callback is removed, but the kernel work is not decremented accordingly
    This sends the scheduler into an infinite loop.

Good catch!

  2. I also noticed that enqueue_task increments the kernel work regardless of whether a callback is actually scheduled. We also ran into an interesting issue when trying to implement a real-time external interrupt counter: it seems that if the kernel drops callbacks, it starts behaving strangely and the apps jam. I think it might be related to the kernel work counter. We still need to do some more research.

Another good catch! It also seems incorrect that the kernel work is incremented even in the case that enqueue_task() fails; I think we should fix that in this PR as well.

kernel/src/process.rs (review thread, outdated and resolved)
@alexandruradovici
Contributor Author

  1. I found that when a callback is overwritten:
    a. the kernel work is incremented
    b. the previous callback is removed, but the kernel work is not decremented accordingly
    This sends the scheduler into an infinite loop.

Good catch!

  2. I also noticed that enqueue_task increments the kernel work regardless of whether a callback is actually scheduled. We also ran into an interesting issue when trying to implement a real-time external interrupt counter: it seems that if the kernel drops callbacks, it starts behaving strangely and the apps jam. I think it might be related to the kernel work counter. We still need to do some more research.

Another good catch! It also seems incorrect that the kernel work is incremented even in the case that enqueue_task() fails; I think we should fix that in this PR as well.

I think I fixed this already by moving the increment_work function call. Am I missing something else here?
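
(For reference, a minimal Rust sketch of the ordering this refers to; the names and the return type are illustrative assumptions, not Tock's actual enqueue_task signature. The counter is bumped only after the task has actually been placed in the queue, so a failed enqueue leaves it untouched.)

use std::cell::Cell;
use std::collections::VecDeque;

struct Process {
    tasks: VecDeque<u32>,  // stand-in for the real task type
    capacity: usize,       // stand-in for the fixed-size task queue
    kernel_work: Cell<usize>,
}

impl Process {
    fn enqueue_task(&mut self, task: u32) -> bool {
        if self.tasks.len() >= self.capacity {
            // Queue full: the task is dropped and no work is counted.
            // (Before the fix, the counter was incremented even on this path.)
            return false;
        }
        self.tasks.push_back(task);
        // Count the work only once the enqueue has actually succeeded.
        self.kernel_work.set(self.kernel_work.get() + 1);
        true
    }
}

fn main() {
    let mut p = Process { tasks: VecDeque::new(), capacity: 1, kernel_work: Cell::new(0) };
    assert!(p.enqueue_task(1));
    assert!(!p.enqueue_task(2)); // full: dropped, counter unchanged
    assert_eq!(p.kernel_work.get(), 1);
}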

@hudson-ayers
Contributor

I think I fixed this already by moving the increment_work function call. Am I missing something else here?

Oh, I must have been looking at a single commit or something. You did fix it.

@ppannuto added the last-call label (Final review period for a pull request.) on Sep 9, 2020
@bradjc
Contributor

bradjc commented Sep 10, 2020

bors r+

bors bot merged commit bf429ce into tock:master on Sep 10, 2020