
Consider migrating away from nb #47

Open
chrysn opened this issue Feb 2, 2021 · 18 comments

@chrysn (Contributor) commented Feb 2, 2021

Given embedded-nal's similarities to embedded-hal, it's probably worth following embedded-hal's migration from nb to Rust's now-natively-available async features.

Issue at embedded-hal is rust-embedded/embedded-hal#172.

There's probably no immediate action needed in terms of code change in embedded-nal, but participating in the discussion there might help in getting a consistent and usable outcome across the ecosystem.

@ryan-summers (Member)

I haven't looked deeply into async yet, but does it require some sort of external application-level executor? If so, would using async impose that requirement on all users of embedded-nal? That's a design choice I'd like to avoid, but I'm somewhat ignorant of Rust's async semantics. Perhaps an example impl would be a good way to see what this looks like.

@chrysn (Contributor, author) commented Feb 2, 2021 via email

@ryan-summers (Member)

@chrysn (Contributor, author) commented Feb 2, 2021

Just to verify my understanding: this offers a poll-based API that is supposed to be called by the top-level application at regular intervals. The application picks a sleeping interval that's long enough that it can sleep at least a bit, but short enough that the accumulated latency of waiting for the next tick is not too bad. Is this the benchmark against which an async variant would compete?

@ryan-summers (Member) commented Feb 2, 2021

Actually, if my understanding is correct, recv() would return a Future that we could then poll() to determine whether or not it is complete, so it wouldn't require any kind of async executor in the application, correct?

P.S. Your analysis of the usage is correct. I'm not particularly interested in performance, but rather in what this imposes on applications.

@chrysn (Contributor, author) commented Feb 2, 2021

(On the P.S.: My users are building applications that run on battery, so while per-iteration performance may be negligible, anything that produces frequent ticks is not an option -- but I'll leave that as a side note because I hope we'll eventually cater for both scenarios anyway.)

Yes, it'd return a Future that needs to be polled. An application that likes to drive its one task in a poll-every-few-milliseconds way could still do that, using a very simple executor: that executor would have a zero-sized no-op context whose waker does nothing, because the futures will be polled anyway (and it can thus be optimized out at build time).

How complex the future is depends on the function being desugared. A function that escalates nb::WouldBlock typically has only one point at which it can suspend (more than one only if all but the last operation are idempotent; recovering that case's efficiency may be a bit tricky, but it is a corner case). The desugared future then becomes a small enum, essentially a single variant holding something like the function's arguments (possibly plus a variant for the exhausted state), so about the same state is kept, just enclosed in the future rather than passed in again.

The trouble, as I understand it (and I'm also just getting into this from the embedded side), is not so much code complexity (whether block!() calls an executor or loops around is invisible to the user, and async fn will, if it isn't already, be the more familiar idiom compared to an extra return state) but type expressivity. The standard library (or crates like async-std) have it easy because they just provide a struct whose desugared methods can return an impl Future. We here (embedded-nal and -hal, targeting systems where the OS is just something that implements traits) have it harder because we're describing traits with those same properties. This is where the complexity comes in, and it manifests in the two currently proposed approaches: one depends on GATs being stabilized, and the other needs to split all traits in two in order to provide default implementations.

@chrysn (Contributor, author) commented Feb 3, 2021

I've assembled an example based on the CoAP server I'm building; the commit on the towards-coreasync-nal branch shows a self-contained example.

It's not attempting to do anything at the embedded-nal layer yet, just converting the CoAP server that uses embedded-nal. This is similar in that, likewise, a core-async-based embedded-nal implementation might build on embedded-hal while the latter is still nb-based -- although we probably don't need to concern ourselves with that, as migrating embedded-nal only makes sense once embedded-hal has made the step; I'm primarily exploring here. This conveniently sidesteps all the GAT / trait questions, because it's about a concrete server, not a trait description of a server that later needs to be implemented. (I trust that whatever works for embedded-hal will work for embedded-nal there too.)

The "user-facing" changes in src/lib.rs are really slim (all that nb_to_future does is bridge the gap between the ecosystems), and other than that the server code required naught more than the added async and await keywords.

The top-level application changes look more frightening, but that's primarily because what would be the executor is rolled into there.

There is one issue around where resources are allocated: with the changes introduced by the function being continuable, the whole Future constructed from it holds the buffer. (I didn't measure whether the compiler got it right and holds the buffer only once, as it necessarily should, or twice if it doesn't optimize all the way.) This is partly resolved once #12/#15 are implemented (as far as the receive buffer is concerned), and partly up for further exploration as to whether I can make the async function loop back or return early if sending fails (hoping for input on the Matrix channel). In the changes to the example server, that topic is reflected in the lib.rs comment change on leveraging a protocol optimization that used to be implicit in the nb version (although that needs to be used very carefully with nb if it's not explicit, or data might get lost subtly).

@ryan-summers (Member)

> The top-level application changes look more frightening, but that's primarily because what would be the executor is rolled into there.

This is my original concern: if we convert this NAL to use async/await, does that force all users of embedded-nal to have an executor for the futures? If so, that's definitely not the direction I would like to take this crate; forcing embedded applications to use a Future executor is not the right approach in my opinion.

@ryan-summers (Member)

It seems to me that the approach of implementing a "dummy" executor has the trade-off of forcing boilerplate overhead onto non-async applications, handing them the async machinery whether they want it or not.

@chrysn (Contributor, author) commented Feb 3, 2021

> non-async applications

Are they even a thing any more with embedded-nal, now that the blocking modes have been removed in 0.2?

> dummy executor

From how I understand the available example, users already use a dummy executor, they just always roll their own (here in the form of sleeping for some milliseconds, and I'm not convinced that a much better executor is even possible with nb). An executor that does this is admittedly a bit more verbose than the existing ones, but not by much, and it should be providable as a general component.

@ryan-summers (Member) commented Feb 3, 2021

> non-async applications
>
> Are they even a thing any more with embedded-nal, now that the blocking modes have been removed in 0.2?

Non-async embedded applications make up the vast majority of the embedded application space. I have not once written an async embedded application. That's why async is only now becoming a thing in embedded-hal: no one has needed or wanted it yet.

There's an enormous number of trade-offs you make when you use the async approach in embedded, since you forfeit a lot of real-time control.

@ryan-summers (Member)

> From how I understand the available example, users already use a dummy executor, they just always roll their own (here in the form of sleeping for some milliseconds, and I'm not convinced that a much better executor is even possible with nb). An executor that does this is admittedly a bit more verbose than the existing ones, but not by much, and it should be providable as a general component.

A traditional embedded application does not have any executor, but rather a super-loop in main() that runs forever. When tasks do not need to execute (e.g. on nb::WouldBlock), the processor goes to sleep to save power.

@chrysn (Contributor, author) commented Feb 3, 2021

But how are they supposed to do that with embedded-nal when all the blocking interaction patterns were removed? (Not that I terribly miss them, but a blocking receive in RIOT-OS was comparatively straightforward because it supports some form of threads.) Are they expected to manually set up a hook into the network stack for a "something has come in" event (bypassing the stack abstractions), and then call block!()ing functions?

If that is the case, that should still be supportable by an async loop if the even-simpler busy-looping executor is called from the hooking point.

> When tasks do not need to execute (e.g. nb::WouldBlock), the processor goes to sleep to save power.

... and rely on the network stack's interrupt handler to get the main loop out of its wait state? That, too, should still be possible; sketching up an example...

@chrysn (Contributor, author) commented Feb 3, 2021

... and it's now in a somewhat cleaner shape that hopefully better reflects practical use. Most of all, the code I'm now using to bridge the parts is more precise about what's happening.

The overall changes to the CoAP server's main loop now use the (still locally provided) async_tools::execute executor. It doesn't do anything too magical, but it's something one probably doesn't want to redo for each project anew. (Posting this I noticed that the error handling is somewhat different, in that socket errors are now silently ignored; an .expect() after the await would do the equivalent.)

All the adaptation between nb and futures is now in a single file. Mind you, I'm not suggesting we'd actually use this (nb and futures at the same time), but it illustrates what happens during the transition from nb to async, if it is to happen. While this is all not perfectly efficient, it is doable. The other way around would be trickier: if embedded-hal goes to async and embedded-nal sticks with nb, the context used with the Future to get an nb::Result would need to know out of thin air how to wake whatever kind of main loop the nb user is running.

In a future-only ecosystem, an executor like this should still be viable (with, say, a WFI as the sleep command), because each event that can cause a state change (i.e. each interrupt) also causes the WFI to end and the loop to come around. Users who need a full-blown executor that can start separate tasks can still use one, but a simple one should cover whatever can be done with nb. There are downsides to consider when such a simple executor does not compile away to nothing (zero-sized wakers and all that), but that's probably more a topic for the embedded-hal issue.

@MathiasKoch (Collaborator)

I finally found some time to weigh in on this, sorry for the absence!

Personally I am in the camp where I don't think a proper executor for embedded Rust can get here fast enough! I have not even experimented with async on embedded yet, but I can see so many places in my code where async would free up resources on my processor. All of those places could probably be solved by returning WouldBlock, but the state-keeping of where to resume is just too much work for the time won. Async would solve this for me without (almost) writing any code. My application is not timing-critical, but extremely IO-bound (a typical IoT device, with the exception that I have mains power, so no battery to consider).

So that kinda sets the scene for what I am biased towards.

On the other hand, I see what @ryan-summers is saying about the super-loop and timing-critical applications.

I think we need to realize that async is coming to embedded applications as well, whether one wants to use it or not. With that in mind, I think it would be easy to provide a compat layer between nb and async (actually, I see @chrysn already made one), making it possible for people who do not want to use async to continue as-is, with the only difference being nb::block!() replaced by async_tools::block() everywhere. This is possible because going from async -> nb is as simple as ignoring any existing waker.

The other way around is, as I understand it, quite difficult. This is because nb -> async would require that waker information to be useful.

Perhaps we could offer both, with the nb version as an auto-implementation using @chrysn's async-tools?

@chrysn (Contributor, author) commented Feb 5, 2021

The drafted compat layer would primarily cater for "call / poll this in a tight loop until it doesn't WouldBlock any more" cases. I think it best fits "around" whatever is composed of async parts, so it wouldn't so much be "providing an automatic nb version of an async-converted embedded-nal" as having an async-converted embedded-nal with a note that, if you want to use it that way, you can take the block function from over there.

But I'd like to understand the timing-critical applications better:

My impression of the timing-critical parts is that they are cases where a fast response is required -- for example (and I happen to be about to build something like that), something comes in on an interrupt and the response needs to be ready basically right inside the interrupt, for immediate return within however long that interrupt may take (we really want that bounded, but AFAIK there are no widely available tools to ensure that in Rust so far, even for nb style). In my understanding, running the future in a tight loop inside the interrupt is possible just as it would be to run an nb function that way.

Sure, with full async it's more tempting to have some generic big implementation in there that does things, possibly things you shouldn't do in an interrupt, or that even stalls waiting for a lower-priority interrupt to come in first, but as I understand it the same can happen with nb (e.g. some HAL might, rather than polling a register, need a bit more preprocessing and set up a low-priority interrupt handler that we're now starving). So no qualitative changes here, right?

(There are probably quantitative changes as long as we can't do certain optimizations, but from the optimizations I've seen around coroutine generation, maybe not much at all.)

@chrysn (Contributor, author) commented Aug 11, 2021

To get this unstuck, could any of you help me verify the above "My impression"?

@ryan-summers (Member) commented Aug 11, 2021

I believe this is primarily stuck because async cannot be done on stable no-std Rust at the moment; GATs are blocking most implementations. See rust-embedded/embedded-hal#285

After talking this through with a colleague, I think my "timing critical" concerns are less important: you can use async executors in environments with interrupts just fine, which gives you the best of both worlds.

@eldruin eldruin added the async label Aug 15, 2022

4 participants