Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack overflow from CoroutineScheduler::list() after suspend() -> resume() #19

Closed
EkardNT opened this issue Nov 7, 2020 · 5 comments
Closed

Comments

@EkardNT
Copy link

EkardNT commented Nov 7, 2020

The following code causes a stack overflow within CoroutineScheduler::list() using AceRoutine 1.1.0 and AceCommon 1.1.1 when running on a ESP8266 (NodeMCU 12e module).

#include <AceRoutine.h>

using namespace ace_routine;

COROUTINE(hello) {
  static int i = 0;
  COROUTINE_LOOP() {
    Serial.printf("Hello coroutine [%d]\r\n", i);
    i += 1;
    COROUTINE_DELAY(1000);
  }
}

void setup() {
  Serial.begin(115200);
  CoroutineScheduler::setup();
  hello.suspend();
  hello.resume();
  CoroutineScheduler::list(Serial);
}

void loop() {
  CoroutineScheduler::loop();
}

I took a quick look into the loop() code and based on what its doing its possible that the combination of suspend() -> resume() is creating a circular loop inside the CoroutineScheduler's linked list.

@EkardNT
Copy link
Author

EkardNT commented Nov 7, 2020

Seems like this code inside Coroutine::resume may cause the coroutine's next pointer to point to itself if the coroutine is already the root?

  // insert at the head of the linked list
  Coroutine** p = getRoot();
  mNext = *p;
  *p = this;

@bxparks
Copy link
Owner

bxparks commented Nov 8, 2020

Hmm, let me take a closer look at this later today..

@bxparks
Copy link
Owner

bxparks commented Nov 9, 2020

You kinda hit a design flaw in my library. The current implementation of suspend() defers the update of the singly-linked list to the CoroutineScheduler::loop() method, which happens in the future. The resume() method was implemented to assume that the loop() has made this change. If the resume() is called immeidately after the suspend(), a cycle in the list is created as you point out correctly. This is a terrible bug, and I'm a bit embarrassed by it.

I can think of 2 solutions:

  1. Change the singly-linked list to a doubly-linked list, which allows the suspend() method to take effect immediately. This increases the memory size of each Coroutine instance.
  2. Give up on the optimization of removing Suspended, and Terminated coroutines from the singly-linked list. If I stop manipulating the linked list dynamically, there is no possibility of creating a cycle in the list.

I'm leaning towards (2) because getting rid of that (somewhat dubious) optimization makes the code simpler and fixes this bug. Initially I thought that the library should be able to handle large numbers of coroutines (dozens, if not hundreds). But in real life, I've never really needed more than about 5-10 coroutines. So my current design feels like premature optimization that has come back to bite me. If this library need to scale, I can implement solution (1) to get multiple pools of coroutines, and I can do that with no change to the external API of the library.

I got most of the fix done. Just need to clean it up, double check things, and update the docs. Give me a day or two.

bxparks added a commit that referenced this issue Nov 10, 2020
bxparks added a commit that referenced this issue Nov 10, 2020
…end() removes coroutine from the internal linked list; causes an immediate resume() to create cycle in linked list (see #19)
bxparks added a commit that referenced this issue Nov 10, 2020
…(simpler, faster) instead of insertSorted(); add setupCoroutineOrderedByName() for backwards compatibility for unit tests (see #19)
bxparks added a commit that referenced this issue Nov 10, 2020
…e(); explicitly make the properties of internal linked list an unspecified implementation detail (see #19)
bxparks added a commit that referenced this issue Nov 10, 2020
@bxparks
Copy link
Owner

bxparks commented Nov 10, 2020

I'm in the process of refactoring the README.md. When I'm done with that I will push a new release. Probably today or tomorrow.

@bxparks
Copy link
Owner

bxparks commented Nov 11, 2020

v1.2 released, please give it a spin

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants