@jarvte jarvte commented Jun 4, 2019

Description

In some cases it's important to know whether EventQueue::cancel succeeded. The cancel method now returns a boolean value indicating whether the cancel succeeded.

Pull request type

[ ] Fix
[ ] Refactor
[ ] Target update
[x] Functionality change
[ ] Docs update
[ ] Test update
[ ] Breaking change

Reviewers

@kjbracey-arm
@kivaisan

Release Notes

EventQueue::cancel now returns a boolean value indicating whether the cancel was successful.
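
A minimal usage sketch of the new return value (illustrative only; `blink` and the delay are hypothetical, not from this PR):

```cpp
#include "mbed_events.h"

events::EventQueue queue;

void blink()
{
    // some periodic work
}

int main()
{
    int id = queue.call_in(1000, blink); // schedule; returns a unique event id

    if (!queue.cancel(id)) {
        // cancel failed: the id was invalid, or the dispatch loop has
        // already started (or finished) executing the event
    }

    queue.dispatch_forever();
}
```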

@ciarmcom ciarmcom requested review from kivaisan, kjbracey and a team June 4, 2019 07:00

ciarmcom commented Jun 4, 2019

@jarvte, thank you for your changes.
@kjbracey-arm @kivaisan @ARMmbed/mbed-os-test @ARMmbed/mbed-os-core @ARMmbed/mbed-os-maintainers please review.

 //
 // If called while the event queue's dispatch loop is active, equeue_cancel
-// does not guarantee that the event will not not execute after it returns as
+// does not guarantee that the event will not execute after it returns as

Contributor

This comment needs a bit of a rewrite now that we do return something. A return of true guarantees that it will not execute.

Contributor Author

updated comment

* returns, as the event may have already begun executing.
*
* @param id Unique id of the event
* @return true if event was successfully cancelled

Contributor

It may be worth adjusting the final paragraph above - "If called while the event queue's dispatch loop is active in another thread, ...". A call made from the same thread as the dispatch loop can always succeed.

Contributor Author

updated comment

kjbracey commented Jun 4, 2019

@geky, please take a look. This is required to fully fix #10684

-void EventQueue::cancel(int id)
+bool EventQueue::cancel(int id)
 {
 }

Contributor

bool should return true or false

Contributor Author

fixed to return true

 {
     struct equeue_chain_context *c = (struct equeue_chain_context *)p;
-    equeue_cancel(c->target, c->id);
+    (void)equeue_cancel(c->target, c->id);

Contributor

What happens if the cancellation would fail in this case? Is it possible, and should we stop then?

Contributor Author

I'm not sure, so I left it as it is; maybe @geky could help with this one?

Contributor

Failure is ok in this case. If we don't cancel the old chain event the worst thing that happens is our chained equeue gets a spurious wakeup. But equeue handles spurious wakeups by always checking the time and going back to sleep if it's too early.
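
For illustration, a minimal sketch of that pattern, using std::condition_variable in place of equeue's actual semaphore (not the real equeue code):

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

using Clock = std::chrono::steady_clock;

std::mutex m;
std::condition_variable cv;
Clock::time_point next_deadline = Clock::now() + std::chrono::seconds(1);

void dispatch_loop()
{
    std::unique_lock<std::mutex> lk(m);
    for (;;) {
        // may return early: a spurious wakeup, or a stale signal from a
        // cancelled chain event
        cv.wait_until(lk, next_deadline);
        if (Clock::now() < next_deadline) {
            continue; // woke too early: just go back to sleep
        }
        // ... deadline reached: dispatch the due event here ...
        next_deadline = Clock::now() + std::chrono::seconds(1);
    }
}
```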

jarvte commented Jun 5, 2019

Force-pushed review fixes

@jarvte jarvte force-pushed the eventqueue_cancel_return branch from de555af to f329c0b Compare June 5, 2019 06:32

geky commented Jun 6, 2019

Thanks for the PR @jarvte, sorry about the delay.

I think it would be better to use the event's destructor:
#10684 (comment)

Part of the reason for not returning an error is that it's deceptive and tricky. With multiple threads you can't rely on an error to mean that the event is still running.

Though the real reason is that I haven't seen a use case for cancel errors that isn't handled better by relying on the event's destructor. (Though IMO the destructor hook is a bit cleaner in C). But I'd be happy to be wrong.

kjbracey commented Jun 6, 2019

With multiple threads you can't rely on an error to mean that the event is still running.

Can't you? Assuming the event has some synchronisation mechanism with the canceller (here a mutex), the canceller can know it has been queued but hasn't reached its synchronised section yet. So it either hasn't been dispatched yet, or is currently being dispatched and will soon reach its synchronised section.

If cancel returns an error, that means it wasn't found in the event queue, so we can conclude it is in the process of being dispatched, which in turn means it must be on another thread, so we can block this thread and wait for it to run.
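
Sketching that reasoning (illustrative names; std::mutex and std::condition_variable standing in for whatever synchronisation the canceller and the event share):

```cpp
#include <condition_variable>
#include <mutex>
#include "mbed_events.h"

std::mutex mtx;
std::condition_variable done_cv;
bool event_done = false;

void event_routine()
{
    std::lock_guard<std::mutex> lk(mtx); // the synchronised section
    // ... touch the shared state ...
    event_done = true;
    done_cv.notify_all();
}

void cancel_or_wait(events::EventQueue &queue, int id)
{
    if (queue.cancel(id)) {
        return; // found and removed: the event will never run
    }
    // not found: it is being dispatched on another thread right now,
    // so block until its synchronised section has completed
    std::unique_lock<std::mutex> lk(mtx);
    done_cv.wait(lk, [] { return event_done; });
}
```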

Is there a flaw in that reasoning?

geky commented Jun 6, 2019

Assuming the event has some synchronisation mechanism with the canceller (here a mutex)

With additional synchronization you can.

But otherwise you can't be sure that the event hasn't already expired.

kjbracey commented Jun 7, 2019

With additional synchronization you can.
But otherwise you can't be sure that the event hasn't already expired.

I think it's a fair assumption that anyone trying to cancel their events should have some synchronisation mechanism with the event itself. They must be sharing some sort of state, or they wouldn't be needing/wanting to cancel it, and caring whether it had expired or not; hence they should have some way of synchronising with it. (So probably a mutex, an atomic, or knowledge that they're in the same event queue.)

In the example we're aiming for here, there is a mutex, and the routine clears the mutex-protected "event_queued" flag.

If we were to also clear in the destructor, the destructor would also need to take the mutex. But then we've got the problem that we've got a new state to worry about - between the routine and the destructor. If the routine doesn't clear the flag, we can have "event_queued" true, but the routine's already run. If the routine does clear the flag, then someone can queue another routine, but the destructor for the first clears it.

This is making my head hurt a bit. I would like to think about improving Event+EventQueue to forward better, to maybe make more complex objects with destructors usable as the functors, but I'm not sure borrowing the destructor is the answer for the routine synchronisation, at least in this example where we have to synchronise an immediate delete. If the whole thing was reference counted with shared pointers, it might be the answer.

bulislaw commented Jul 2, 2019

That is a breaking change, isn't it? So it should target a major release.

kjbracey commented Jul 2, 2019

Adding a return value to a void member function is only very slightly breaking.

Only failure I can see is breaking anyone passing it as a member function pointer somewhere. Would be pretty weird thing to do for this function, but I guess it's possible.
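
For concreteness, the kind of code that would break (hypothetical; nothing in Mbed OS is known to do this):

```cpp
#include "mbed_events.h"

// before this PR:
// void (events::EventQueue::*cancel_fn)(int) = &events::EventQueue::cancel;

// after this PR, the pointer type must change to match:
bool (events::EventQueue::*cancel_fn)(int) = &events::EventQueue::cancel;
```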

geky commented Jul 3, 2019

Hi, sorry for my terribly bad response times. If this is blocking something don't let me stop it.

I still believe using the equeue destructor is the correct solution. Cleaning up an event's state is what the destructor was designed for. It is less risky than relying on knowledge of when the event is alive.

If you need a temporary solution that doesn't create an API change, consider using the SharedPtr.

The unfortunate fact is this is the C++03 solution to lifetime problems until we adopt C++11. Just look at the STL functions before emplace_back and friends.

If we were to also clear in the destructor, the destructor would also need to take the mutex. But then we've got the problem that we've got a new state to worry about - between the routine and the destructor. If the routine doesn't clear the flag, we can have "event_queued" true, but the routine's already run. If the routine does clear the flag, then someone can queue another routine, but the destructor for the first clears it.

Isn't the sharing of state what encapsulation solves?

Keep in mind the destructor runs if either 1. the event is canceled, or 2. the routine runs. So I don't believe you will run into the overlapping issue.

For #10684, I think you just need the destructor to set the _event_id to 0 in any situation.

kjbracey commented Jul 4, 2019

I still believe using the equeue destructor is the correct solution. Cleaning up an event's state is what the destructor was designed for. It is less risky than relying on knowledge of when the event is alive.

It's pretty heavy though. There's a possible optimisation in Callback to disable storage of non-trivial types, which I'm planning to expose as a config option; that saves 800 bytes of ROM on a client build (on top of another 400 bytes gained by case-by-case detection of triviality with type traits). We can currently enable it for low memory because nobody is storing anything other than (member) function pointers.

The unfortunate fact is this is the C++03 solution to lifetime problems until we adopt C++11. Just look at the STL functions before emplace_back and friends.

Ah, well the handy thing there is we have adopted C++11. So if you can cook up a C++11 answer, I'm all for it.

Isn't the sharing of state what encapsulation solves?

The overlap I'm considering is for where the event is not encapsulating information, but is just a trigger to start work. (Most common case?)

For #10684, I think you just need the destructor to set the _event_id to 0 in any situation.

I can't figure out how to make that work. Failure scenario:

  • Thread A: Queue event. _event_id = 1
  • Thread B: start to dispatch event 1,
  • Thread B: event 1 runs, clears event_id
  • Thread A: Queue event, event_id = 2
  • Thread B: destroys event 1, sets event_id = 0
  • Thread A: Destruction - sees event_id = 0, does not cancel event 2
  • Thread B: Event 2 runs, crashes claiming deleted mutex.

That's assuming the event running clears the event ID inside its synchronisation logic. If it doesn't, then you get this failure scenario:

  • Thread A: Queue event. _event_id = 1
  • Thread B: start to dispatch event 1,
  • Thread B: event 1 runs,
  • Thread A: Has new work, sees event_id set, decides it doesn't need to queue
  • Thread B: destroys event 1, sets event_id = 0
  • New work never processed.

I'm sure there's some trick to make that work by putting logic into the destructor to retrigger a copy of itself, but this is getting incredibly complicated.

kjbracey commented Jul 8, 2019

Another way of looking at it - we're not at all interested in the lifetime of the event object itself. It could linger on forever as far as we care. It's only a trigger. The only thing we care about is whether the event routine's synchronised section is due to run. And the event object itself can't determine that - only the event queue has that information available.

It seems like it would make sense to use the destructor of the event object to deal with any lifetime issues it has itself - eg destroying a shared pointer inside it. But we need synchronisation with the event queue and in the event routine.

Certainly you could make this work by using a shared (or weak) pointer in the event object, but that's forcing even more dynamic allocation into the design, which I want to avoid. In this scenario, ideally both the network interface and the event itself would be entirely static. At the minute we can at least have the interface be entirely static, and there's still potential to permit static events.

geky commented Jul 9, 2019

Oh ok, so if I understand correctly you're dealing with a chain of multiple events, where the thing you need to clean up persists across the chain?

This does actually seem like a case where using reference counting would protect everything quite nicely, but I'll give you that it does get more complicated.

Let's go with adding the boolean return value to cancel if it makes life easier. I don't have that strong of an argument against it. As long as it comes in through the normal API-change runway in Mbed OS's release schedule (minor release?), that's fine; public API changes shouldn't be easy.


Ah, well the handy thing there is we have adopted C++11. So if you can cook up a C++11 answer, I'm all for it.

Then you just need to modify EventQueue to do perfect forwarding :)

kjbracey commented Jul 9, 2019

Oh ok, so if I understand correctly you're dealing with a chain of multiple events, where the thing you need to clean up persists across the chain?

Well, it's even simpler than that, really. The event is just a singleton trigger from "arbitrary thread context" to start "work on event queue". It contains no information, it's just the queued request to "start work".

The model (which is very typical in various Mbed components) is

* arbitrary context (eg `NetworkSocket::sigio`) - queue event if "event queued flag" clear
* event queue - clear "event queued flag", process work

Notionally, this model would settle for a totally static handler routine and a squashable "event flag" type trigger - we don't really need to queue an object in the event queue.

We need to cancel in the destructor of the subsystem - the destructor has to (synchronously) cancel any pending events. (Or otherwise ensure they'd be no-ops not touching our destroyed object).
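
A hedged sketch of this model under those assumptions (`Subsystem`, `trigger` and `process` are illustrative names, not Mbed API):

```cpp
#include <mutex>
#include "mbed_events.h"

class Subsystem {
public:
    explicit Subsystem(events::EventQueue &queue) : _queue(queue) {}

    // arbitrary context (e.g. a sigio callback): queue only if no
    // trigger event is already pending
    void trigger()
    {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_event_id == 0) {
            _event_id = _queue.call(this, &Subsystem::process);
        }
    }

    ~Subsystem()
    {
        _mutex.lock();
        int id = _event_id;
        _mutex.unlock();
        // the destructor must synchronously ensure no pending event
        // will touch this object once it is gone
        if (id != 0 && !_queue.cancel(id)) {
            // the event is already dispatching on another thread; a
            // real implementation would block here until process()
            // has left its synchronised section
        }
    }

private:
    // runs on the event queue
    void process()
    {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _event_id = 0; // clear the flag in the synchronised section
        }
        // ... do the actual work ...
    }

    events::EventQueue &_queue;
    std::mutex _mutex;
    int _event_id = 0;
};
```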

This does actually seem like a case where using reference counting would protect everything quite nicely, but I'll give you that it does get more complicated.

Yes, if the "event queued flag" and a "cancel flag" were actually accessed via a shared pointer, rather than being embedded in the subsystem object, and that shared pointer was passed in the event object (as weak?), then it would work, but I'd rather avoid the dynamic allocation (and the need to explain the mechanism).

Then you just need to modify EventQueue to do perfect forwarding :)

I don't think perfect forwarding is what we're thinking about here - we have to capture.

But we could upgrade to move capture rather than copy. (But I note std::function itself always copy-captures its FunctionObject - it doesn't offer move capture)

@geky geky mentioned this pull request Jul 21, 2019

geky commented Jul 23, 2019

We need to cancel in the destructor of the subsystem - the destructor has to (synchronously) cancel any pending events. (Or otherwise ensure they'd be no-ops not touching our destroyed object).

But if the events are not chained, does the class need to know whether or not canceling the events succeeded? (well, it knows no events will run after cancel).

Yes, if the "event queued flag" and a "cancel flag" were actually accessed via a shared pointer, rather than being embedded in the subsystem object, and that shared pointer was passed in the event object (as weak?), then it would work, but I'd rather avoid the dynamic allocation (and the need to explain the mechanism).

I'm not sure how dynamic memory got involved. If the flags are static/class allocated, then in theory so can the reference count + pointer.

I don't think perfect forwarding is what we're thinking about here - we have to capture.

Both work, from what I understand. With perfect forwarding you can construct the object directly in the event queue without needing a copy.
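
A standard-library illustration of the distinction (nothing EventQueue-specific):

```cpp
#include <utility>
#include <vector>

struct Big {
    // imagine something expensive to copy
};

int main()
{
    std::vector<Big> v;
    Big b;

    v.push_back(b);            // copy-capture style: copies b into storage
    v.push_back(std::move(b)); // C++11 move capture: no copy
    v.emplace_back();          // perfect forwarding: constructed in place
}
```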


I think this PR is good to go 👍

0xc0170 commented Aug 20, 2019

@jarvte Please rebase. Shall we move this forward now, as we are getting close to 5.14?

@kjbracey-arm Can you approve or not?

@jarvte jarvte force-pushed the eventqueue_cancel_return branch from f329c0b to 719117e Compare August 20, 2019 12:57

jarvte commented Aug 20, 2019

@0xc0170 rebase done

0xc0170 commented Aug 21, 2019

CI started

mbed-ci commented Aug 21, 2019

Test run: SUCCESS

Summary: 11 of 11 test jobs passed
Build number: 1
Build artifacts

0xc0170 commented Aug 26, 2019

Adding a return value to a void member function is only very slightly breaking.

Only failure I can see is breaking anyone passing it as a member function pointer somewhere. Would be pretty weird thing to do for this function, but I guess it's possible.

@bulislaw Based on the description above, is this going in for 5.14?

@bulislaw
Member

Yeah, I think it's ok to go as is to minor.

@0xc0170 0xc0170 merged commit 6add979 into ARMmbed:master Aug 27, 2019
@jarvte jarvte deleted the eventqueue_cancel_return branch January 17, 2020 12:21