Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mbed_events.h: Add ability to request a shared event queue #4406

Merged
merged 2 commits into from
Sep 4, 2017

Conversation

kjbracey
Copy link
Contributor

First pass at a couple of shared event queues.

To allow components with a simple need to schedule a few events to not
have to create their own threads, with all the associated memory
overhead, add 2 central calls to get shared normal and an
interrupt-deferral event queues, each dispatched on their own shared
threads.

One initial in-tree user in this PR - the Nanomesh/mbed-client timer.

First out-of-tree users would be 6LoWPAN radio drivers like atmel-rf-driver.

@kjbracey
Copy link
Contributor Author

First pass for comment.

I'm not particularly attached to any of the fine details here, but this is the basic functionality I'm looking for.

The "run normal event queue from main" option is one of those "why not" things - we may have a main() thread with nothing else to do after boot, so it's an obvious memory saving option.

@geky, @sg-, @MarceloSalazar, @SeppoTakalo

@kjbracey
Copy link
Contributor Author

FYI @karsev

@adbridge
Copy link
Contributor

@kjbracey-arm Could you please update the PR title so that it is suitably informed for inclusion in a release note. Thanks

@kjbracey kjbracey changed the title Shared event queues mbed_events.h: Add ability to request a shared event queue May 31, 2017
@kjbracey kjbracey force-pushed the shared_equeues branch 2 times, most recently from 4ba6a53 to 961af4b Compare June 2, 2017 08:29
@0xc0170
Copy link
Contributor

0xc0170 commented Jun 5, 2017

Please rebase

@tommikas
Copy link
Contributor

jenkins/pr-head is failing in all Nanostack builds due to:

06:21:38.620 | D1 <-- DutThread: mbed assertation failed: status == osOK, file: ./mbed-os/events/mbed_events.cpp, line 42

With 6lowpan and Thread it happens immediately after reset and with nanostack ethernet on network connect.

@SeppoTakalo
Copy link
Contributor

@tommikas Please ignore the build failures for now, this PR yes now open for a discussion and the API might change so it cannot be considered as a ready-to-be-merged.

@kjbracey
Copy link
Contributor Author

Actually, I am interested in investigating the build failures - I would expect this to work.

But I confess I have not run it myself - just checked compilation.

@tommikas
Copy link
Contributor

@SeppoTakalo Sure, the comment was just meant as FYI. Happened to notice that as I was looking through some failed builds.

Copy link
Contributor

@sg- sg- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like where this is going. Few questions about the possibilities:

  • Can the normal queue be bound to the main thread and on task change trampoline into queue.dispatch before returning execution into main? This would save a thread and do what I think the most common use case would be, signal the super loop
  • Naming for the getters might need work.
  • shared-normal config, not sure what this does but IMO simple use cases first and then we can expand functionality / options

@kjbracey
Copy link
Contributor Author

  • Main context hook sounds a bit clever. Not sure the OS offers such a thing, but I can check. The "dispatch from main" option was intended to be the dumb equivalent of that, for the not-uncommon case where the application isn't doing anything else from main(), so can promise to call dispatch (either forever at the end, or periodically in some way).

  • I wasn't happy with the getter naming either - suggestions welcome.

  • At a minimum we absolutely need the ability to set the basic memory sizes from the application so they can tune to squeeze into a small platform. (We still lack this for the main stack, I believe?) Ideally, libraries would be able to specify minimum requirements via JSON, to make this more automated, but this will do for now. Does mean that if, say, a library was using mbed TLS from the normal event loop, the application would have to know to set the normal event loop stack size to 6-7K so it fitted.

@kjbracey
Copy link
Contributor Author

One background note - I currently have no users waiting for the normal event loop, but do have a few lined up for the high-priority one.

Thread deferral from interrupt is a very common driver requirement, and the high-priority one can also be a replacement for the deprecated RtosTimer.

@theotherjimmy
Copy link
Contributor

@kjbracey-arm What is that temporary debugging commit all about?

@adbridge
Copy link
Contributor

@kjbracey-arm bump

@kjbracey
Copy link
Contributor Author

I was just adding trace to get more explanation of the CI failure that @tommikas highlighted.

@tommikas
Copy link
Contributor

@kjbracey-arm It's still failing the same way. The error print isn't visible for some reason.

Copy link
Contributor

@c1728p9 c1728p9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it!

#ifdef MBED_CONF_RTOS_PRESENT
EventQueue *get_shared_interrupt_deferral_event_queue()
{
return do_shared_event_queue_with_thread<osPriorityRealtime, MBED_CONF_EVENTS_SHARED_IRQDEFERRAL_EVENTSIZE, MBED_CONF_EVENTS_SHARED_IRQDEFERRAL_STACKSIZE>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the case where the RTOS is not present, can this just call into get_shared_normal_event_queue()? You wouldn't get the latency benefits, but code for 5 which uses interrupt deferral but not the rtos would be able to run on 2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting thought. I think that could make some 5 code run on 2, but code that actually needs the low latency (eg the 802.15.4 radio drivers) would find it a non-starter and still have to use raw interrupt.

So I'm currently leaning to no, for this reason: if any code using this would still work on the normal event queue, it should be always be using the normal event queue. The "Realtime" RTOS priority should be reserved for people who actually need it.

There's a difference between "wanting" a latency benefit and needing one. If people who don't need the Realtime priority use it, then the people who do need it are in danger of failing.

Copy link
Contributor Author

@kjbracey kjbracey Jul 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My last comment actually suggests maybe this should be renamed to emphasise the realtime nature.

There are cases where deferring an interrupt to a normal priority thread+event would be fine, and I wouldn't want users to automatically assume they should use this "interrupt deferral" event queue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After responding below, I started thinking, rather than re-naming it (or in addition to renaming it), why not assert the latency instead? That would enforce the realtime nature of this queue. In addition, for mbed 2 builds you could share the event queue, but with caveat that adding long running events to the normal event queue while using a driver which requires the interrupt queue will trigger timing asserts. That way you could use components like nanostack on 2 as long as the application and any libraries it pulls in don't need to post any long-running tasks to the queue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's then effectively saying that it would have no normal event queue though, isn't it?

And it would further mean that there could be no real "normal" processing or even slightly time-taking blocking calls anywhere in the application, as the processor would be required to be giving it's full attention to achieving realtime-class latency to its event loop in the main thread. Feels a bit unrealistic.

Copy link
Contributor

@c1728p9 c1728p9 Jul 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily. Assuming you have 3 types of drivers:

  1. Drivers which require real-time latency and have short running events
  2. Drivers which don't require real-time and have short running events
  3. Drivers which don't require real-time and have long running events

The current implementation allows an mbed 2 application to use type 2 and 3 drivers. Trying to use a type 1 driver will cause a compiler error.

If you have both normal and interrupt queue present then applications will be able to use type 2 and 3 drivers or type 1 and 2 drivers, but not type 1 and 3 in the same application. The biggest downside to this approach is you get a runtime assert error for mixing type 1 and 3 drivers, rather than getting an upfront compiler error.

I'm happy with keeping the interrupt queue out of mbed 2, as this is more conservative and doesn't risk setting any false expectations. I just wanted to make sure all the options were explored.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it's not just about "long-running" events on the event queue and drivers - it's the entire system, including any application. For the event loop to have that 100us latency, nothing in the entire system could take more than 100us. You're requiring everything the system ever does to be written as if it was in an interrupt handler.

Trying to use the normal event loop on mbed OS 2 is already a stretch - it's effectively turning it into back into mbed OS 3, and forcing the complete system to be event-driven with "reasonable" event run times. (10s of ms, basically what Nanostack/mbed-client copes with). So no blocking network code, no excessive block crypto operations, etc.

But trying to get your type-1 driver to run in mbed OS 2 via event loops totally rules out almost all application code. You couldn't run Nanostack, you couldn't run mbed-Client, you couldn't use the SD card. All of them occasionally do things that process for milliseconds. My basic argument is that the "type 1 + type 2" system does not currently exist, and I don't see how it ever could, unless it was something as simple as "blinky".

I'll happily believe we can create useful non-RTOS systems using a normal-class 10-100ms event loop, mbed OS 3/Nanomesh style (with realtime stuff running under interrupt directly). But I don't believe we can realistically break the entire work of the system into 100us chunks to allow a non-RTOS device to defer its interrupt work into the foreground.

On the other hand, with some more serious work, we could create the equivalent of the Linux "softirq" or RISC OS "transient callback" - queue up things to be run on exit from the supervisor or interrupt context before returning to the application. That would be a very powerful tool - it did all the heavy background lifting in RISC OS, a non-threaded system, giving you this extra third "context" between interrupts and the application.

If we were serious about supporting non-RTOS devices, that would be the most useful thing we could do - it's been on my mind for ages, but it needs integration into the OS, so is decidedly non-trivial.

On the other hand is there some syntactical sugar we could use here? Some way to make it easy to write code that defers to the realtime queue if available, else puts it directly in the interrupt handler? Actually, maybe it's not that hard - my PR over in atmel-rf-driver is quite "ifdeff"y on the RTOS flag, but if I used bound callback parameters I could cut down a bit and get rid of the little thunk functions?

https://github.com/ARMmbed/atmel-rf-driver/pull/55/files

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In mbed 2, could we just run the realtime queue in interrupt context? Event queue could be dispatched from a ticker using the background function.

ARM supports interrupts with different priorities, so there is the possibility of even having this queue in low-priority interrupt context, but that might require more work.

Copy link
Contributor Author

@kjbracey kjbracey Jul 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, but you want the event to be run immediately after queuing, so don't see how a regular ticker provides the desired latency.

Could be done using a low priority software/generated interrupt (or 0-delay timer). Do we have such a thing on all platforms? Guess there's something built in to the Cortex-M, but might not be there on Cortex-A?

Either of those might be sufficient in this case, and I think give you the Linux softirq equivalent. The remaining catch is that you are still in interrupt context, so can't call any libraries or facilities that aren't interrupt re-entrant.

Portable protection against this "background" context, whether it be thread or interrupt, is one of the things in the Nanostack HAL - platform_enter_critical(). That is a "disable IRQs" if the background context is interrupts, or a mutex claim if the background context is a realtime thread. Maybe that could be exposed via this API - giving application code in the normal event queue the ability to be safe against changes running in the realtime event queue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most tickers have handling for already expired events (example). Although we would have to check that this corner case is handled correctly across targets.

I'd be fine adopting something similar to platform_enter_critical in mbed OS, although it may need a name change since we already have the bombastic name core_util_critical_section_enter. Maybe something core_util_realtime_section_enter...

Although it does sound like this work should be grouped into a second pr. Simply ifdefing away the realtime queue seems fine to me for now. We can always expose it to the mbed 2 world later.

*
* @return pointer to normal event queue
*/
events::EventQueue *get_shared_normal_event_queue();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are your thoughts on having get_shared_normal_event_queue and get_shared_interrupt_deferral_event_queue take event size and stack size as optional parameters to provide hints/warnings? Drivers could pass in the size they need and there could be a warning or error if the defines are not set correctly.

Another more radical idea is to have this grow the sizes at runtime, so config isn't needed. Maybe (probably) a crazy idea, but I figured I would mention it, as this API would allow it to be lazily added in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, makes sense, particularly for stack size. In the absence of any other current negotiation mechanism such as JSON, it at least avoids hard-to-debug overflows due to misconfiguration, and potentially opens up the door to some sort of compile-time (or even run-time) magic.

What form - MBED_ASSERT? mbed_error?

For event size, it's a bit less useful than stack size, or at least trickier, as the requirement is (sort of) a sum of the users, rather than the maximum. Maybe it could add up the total from calls, if individual components didn't repeatedly call it with event size? On the other hand, event overflow is easier to debug than stack - you can have an immediate nice assert with backtrace.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MBED_ASSERT might be a good way to do it. That way in release builds the assert can be compiled out.

I'm not sure how best the event size option should be handled. To safely size the queue you need to know the size of each event, the rate of posting events, and the latency of the queue. Maybe this optional parameter could be specified later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another optional parameter that might be good to have, at least for the interrupt event queue is required latency. That would allow timing violations to be caught deterministically, rather than allowing them to manifest as driver errors that are hard to reproduce.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ability to diagnose this sort of problem is certainly good, but an ordinary assert that would trigger in any debug build is a bit strong - we're not generally a hard real-time system, so occasional spikes aren't the end of the world. I'd be inclined to have an option you can turn on when required to hunt for culprits if you see problems.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are using a debugger you would definitely want the asserts off. On the flip side, it would be great if CI could fail changes cause timing violations. Since this could vary from target to target, it might not be feasible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, CI alerts are useful, so a couple of tests with the alert flagging on would be good.

What could be useful if we could get some sort of reliable raw latency number out would be having the CI flag if it ever sees a significant shift since the last test on any given platform. That's something that's proven invaluable for raw speed benchmarks (where either a speed increase or decrease could indicate a bug), but it might be harder to get a reliable number on latency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To not slow down this pr, could we move the latency asserts (and stack asserts unless someone has a patch ready) to another pr? We can always add additional parameters to this function in a backwards compatible manner.

This changeset is already very valuable as is.

@theotherjimmy
Copy link
Contributor

@kjbracey-arm Have you been able to track down what's going wrong here?

* limitations under the License.
*/

#include "events/mbed_events.h"
Copy link
Contributor

@geky geky Jul 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this feature be moved into its own file? I realize it's small, but mbed_events.h was just the entry point to including the other mbed-events headers (EventQueue.h and Event.h).

Perhaps mbed_shared_queues.h and mbed_shared_queues.cpp?

They should still be included from mbed_events.h, so this wouldn't change the user experience, but it would help keep the logical chunks of this module organized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, no problem with that. I was aware it was somewhat over-shadowing the file's original core contents.

Copy link
Contributor

@geky geky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about being late,

This looks great! Using static function variables is a solid approach that removes a lot of concerns with the gc, and this looks like a very clean implementation.

* @return pointer to interrupt deferral event queue
*/

events::EventQueue *get_shared_interrupt_deferral_event_queue();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the question of names came up earlier

Thoughts on get_shared_queue and get_shared_irq_queue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite happy to drop "normal", but isn't it specifically an "event queue"?

IRQ queue doesn't feel quite right, but not sure I've got anything better.

This is very much an analogue to Nanostack's way of working, and it has no formal name for this context, but I've tended to call it the "background" context (as opposed to the main foreground application running in Nanostack event loop). Don't really like that either.

Maybe go for "soft_irq_event_queue", akin to Linux?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, did you have a view on my previous thought that it could be named from the latency point of view - "high-priority / low-latency / realtime" or something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name "realtime" seems to capture what we're going for the best to me (although it just seems like there's not a perfect solution 😆).

How about get_shared_event_queue and get_shared_realtime_event_queue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realtime has meaning which we should stay away from. mbed_highprio_event_queue and mbed_lowprio_event_queue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not like to specify the normal priority queue to be a lowprio_event_queue. It would steer some users towards highprio queues. How about just event_queue and highprio_event_queue instead because the "other" queue is not in lower priority than rest of the threads.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree about not calling it "low" - it's "normal", and I'm fine with that being implied without being in the name.

Copy link
Contributor

@geky geky Jul 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity why not realtime? It sounds to me like it captures what we're going for with this queue (latency guarantees). It's even running on a osPriorityRealtime thread. If it's a scary name that's a good thing.

Actually if we go with mbed_highprio_event_queue, we should go with osPriorityHigh to match. That that would still accomplish the same requirement right?

Sounds like the current lead are mbed_highprio_event_queue and mbed_event_queue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity why not realtime?

From wikipedia, Real-time programs must guarantee response within specified time constraints, often referred to as "deadlines" and I dont think we can or want to commit to this. There is a significant amount to make this commitment something we're not ready for.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All embedded threading kernels are referred as a RTOS (Realtime Operating System) even when they don't make any guarantees about delays.

I don't think using word "realtime" here to refer the thread priority osPriorityRealtime would be any more misleading that industry adopted way of calling all small kernels as a realtime kernels.
As long as we don't refer our selfs as a hard realtime system, this sound OK to me.

Also, deadlines can be added to this queue later.

@theotherjimmy
Copy link
Contributor

@kjbracey-arm Any news?

1 similar comment
@theotherjimmy
Copy link
Contributor

@kjbracey-arm Any news?

@kjbracey
Copy link
Contributor Author

Hi, just back from 4 weeks' holiday - will start looking at updating this shortly.

@kjbracey
Copy link
Contributor Author

Following comments, a couple more commits:

  • Implementation moved to separate files.
  • Rename to mbed_event_queue and mbed_highprio_event_queue, including changing priority to high rather than realtime.

Will squash together before finalising.

@tommikas
Copy link
Contributor

6lowpan and Thread seem to work now. Nanostack ethernet driver still fails.

Copy link
Contributor

@geky geky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great 👍

Lets see if we can't get this in for 5.6

I believe the CI is down this week, so we'll run it when we can @studavekar

@kjbracey kjbracey force-pushed the shared_equeues branch 2 times, most recently from dfd691c to d490162 Compare September 1, 2017 08:30
@kjbracey
Copy link
Contributor Author

kjbracey commented Sep 1, 2017

It wasn't working reliably because I wasn't aligning the stack block.

Corrected, and took the opportunity to adjust Thread and EventQueue prototypes so I didn't have to put in pointless casts to unsigned char * from my uint64_t arrays.

If you don't fancy that change, I can drop them and put in the casts.

To allow components with a simple need to schedule a few events to not
have to create their own threads, with all the associated memory
overhead, add 2 central calls to get shared normal and an
interrupt-deferral event queues, each dispatched on their own shared
threads.

For non-RTOS systems, just the normal event queue is provided, and the
application would have to dispatch this itself. This
application-dispatch is also available via a config option, to
potentially save memory by reusing the main thread.

Possible future improvement: the ability for separate components to
request a minimum stack size, and have the JSON combine these requests.
(Analogous tooling has already been mooted for mbed TLS config options
like key size).
Don't create our own timer thread - use the shared event queue.
@kjbracey
Copy link
Contributor Author

kjbracey commented Sep 1, 2017

Okay, binary compatibility grief made me change my mind on those void * changes to EventQueue and Thread. Taken out, and put in the casts.

@studavekar
Copy link
Contributor

/morph test-nightly

@mbed-bot
Copy link

mbed-bot commented Sep 3, 2017

Result: SUCCESS

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 1182

All builds and test passed!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet