Lockup of two SM instances #43

Closed
domokos opened this issue Dec 29, 2015 · 4 comments

domokos commented Dec 29, 2015

It took me quite some time to figure out that FM instances are not independent: they share a common queue/event source. This causes a lockup when two independent threads (foo, bar) use two seemingly independent SM instances (A & B) and the following happens:

  1. foo, from inside a callback of SM A, signals bar to exit and waits for it to exit
  2. bar, on receiving the signal, tries to trigger a state change in SM B and then exit

In this case the two threads end up in a deadlock.

Attached is a code snippet that reproduces the issue. You can set WORKAROUND_ACTIVE to activate a workaround that comes with a trade-off. Send the TTIN signal to the app while it is locked up to see the issue.

This may be a design limitation but then it should be noted in the - otherwise excellent - documentation. I found this out the 'hard way' by getting an unexpected and seemingly inexplicable thread lockup.

code.zip
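
For readers without the attachment, the shape of the lockup described above can be sketched in plain Ruby. This is not the attached reproduction and does not use the library's API: `SHARED_LOCK` and `bar_exits_while_foo_waits?` are hypothetical names, and a single mutex stands in for whatever the two instances share. The wait is bounded with a timeout so the sketch returns instead of hanging.

```ruby
require "thread"

# Hypothetical stand-in for a lock shared by every machine instance.
SHARED_LOCK = Mutex.new

# Returns true if bar exited while foo was waiting, false if bar was blocked.
def bar_exits_while_foo_waits?(lock_a, lock_b, timeout: 0.5)
  stop = Queue.new

  bar = Thread.new do
    stop.pop                 # wait for foo's signal
    lock_b.synchronize { }   # stands in for "trigger a state change in SM B"
  end

  exited = lock_a.synchronize do   # foo is inside a callback of SM A
    stop << :exit                  # step 1: signal bar to exit...
    !bar.join(timeout).nil?        # ...and wait (bounded, so the demo returns)
  end

  bar.join  # the lock has been released by now, so bar can always finish
  exited
end

puts bar_exits_while_foo_waits?(SHARED_LOCK, SHARED_LOCK)  # one shared lock
puts bar_exits_while_foo_waits?(Mutex.new, Mutex.new)      # independent locks
```

With one shared lock, bar cannot proceed until foo stops waiting, so the first call should report `false`; with independent locks the second should report `true`. Without the timeout, the first case would wait forever.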

@piotrmurach
Owner

Thank you for reporting this, I will definitely take a look. I didn't intend this to be a feature; if anything, I wanted to ensure that instances of FM are threadsafe but independent of each other. You should be able to create a system of state machines that cooperate to solve a task. This deadlock is a bug, not a feature. Let's fix it!

piotrmurach added a commit that referenced this issue Dec 30, 2015
@piotrmurach
Owner

@domokos Thank you for the code, it was super helpful in zeroing in on the problem! I have added a simpler version of your code as an integration test case to keep this bug at bay in the future.

It turns out the issue was in how state machines trigger and emit callback events, more precisely in how threads acquire an exclusive lock in order to issue and observe events. My theory is that the mutex had become a global mutex for the whole state machine: once a thread triggered an event, it also held the lock the observer needs to emit callbacks, which blocked any other thread from responding to events from any callbacks. I think that is why triggering an event from a different thread helped: a new mutex was created, so a new lock could be acquired. Long story short, this should be fixed now, as the observer and event triggering should have separate locks. I will let you know once it is released; everything is in master if you want to check it out.
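
As a rough illustration of the separate-locks idea (not the actual change in master), here is a miniature machine in plain Ruby. `TinyMachine`, `on_transition`, `trigger`, and `cooperating_machines_finish?` are hypothetical names, and giving each instance its own pair of locks is an assumption made for this sketch.

```ruby
require "thread"

# Hypothetical miniature machine; it does not mirror the library's internals.
class TinyMachine
  attr_reader :state

  def initialize
    @state = :idle
    @callbacks = []
    @trigger_lock = Mutex.new    # guards state changes
    @observer_lock = Mutex.new   # guards callback emission
  end

  def on_transition(&block)
    @callbacks << block
  end

  def trigger(new_state)
    @trigger_lock.synchronize { @state = new_state }
    # Callbacks run under a different lock that belongs to this instance only.
    @observer_lock.synchronize { @callbacks.each { |cb| cb.call(new_state) } }
  end
end

# Replays the scenario from the report: foo (the calling thread) waits for bar
# inside a callback of SM A while bar triggers a state change in SM B.
def cooperating_machines_finish?(timeout: 1)
  sm_a = TinyMachine.new
  sm_b = TinyMachine.new
  stop = Queue.new

  bar = Thread.new do
    stop.pop                  # wait for foo's signal
    sm_b.trigger(:stopped)    # state change in SM B, then exit
  end

  joined = nil
  sm_a.on_transition do |_state|
    stop << :exit                 # signal bar to exit
    joined = bar.join(timeout)    # wait for it; nil would mean it was blocked
  end

  sm_a.trigger(:stopping)
  !joined.nil? && sm_b.state == :stopped
end

puts cooperating_machines_finish?
```

Because the callback of A never holds a lock that B's `trigger` needs, bar can finish while foo is still waiting on it.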

@piotrmurach
Owner

I have released v0.11.2, which includes the fix. Would you mind trying it out?

@domokos
Author

domokos commented Dec 30, 2015

@peter-murach Thank you for resolving this so fast.

I installed version 0.11.2 and can confirm that this version does not have this bug any more. It works as expected.

This issue can be closed.
