[WIP, RFC] doc/memos: Added RDM on high level timer API requirements and common features #12970
As discussed offline with @kaspar030, a function for setting an absolute timeout time is needed for most time sensitive MAC layers (LoRaWAN, TSCH).
Something like:
Usually the slot times for these protocols are known beforehand.
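As a rough illustration of what an absolute timeout involves, the sketch below converts an absolute target on a free-running 32-bit tick counter into a relative interval that a relative-timeout API could consume. The names `target_in_past()` and `abs_to_rel()` are hypothetical and are not part of xtimer or ztimer; treating the upper half of the counter range as "already passed" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if `target` lies behind `now` on a free-running
 * 32-bit counter. Differences in the upper half of the range are treated
 * as "already passed" (an assumption of this sketch). */
bool target_in_past(uint32_t now, uint32_t target)
{
    return (uint32_t)(target - now) > UINT32_MAX / 2;
}

/* Hypothetical helper: convert an absolute target into a relative interval.
 * Unsigned subtraction wraps modulo 2^32, so counter overflow between
 * `now` and `target` is handled implicitly. */
uint32_t abs_to_rel(uint32_t now, uint32_t target)
{
    assert(!target_in_past(now, target));
    return target - now;
}
```

A MAC layer that knows its slot start in ticks could compute the interval this way right before arming the timer, so that time spent between computing the slot and arming the timer does not accumulate as drift.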
Done
+1 here
With LoRaWAN there are some commands to get the absolute timestamp from the network server (with < 10ms of accuracy). Implementing such an API for LoRaWAN would be straightforward.
this is IMO out of scope
... or should maybe go to the utility functionality...
Yep. It is already in the feature requirements below utilities. To me, this is an important thing to improve real time capabilities in a convenient way. (Obviously, one could just write their own handlers to call `event_post()`; but I think having a utility function for that can be much easier to use.)
I'm not sure if it makes sense to have this written twice. To me this is both a feature requirement and a realtime requirement; so I just added it twice.
O(n) is an unbounded function, so this justification is inconsistent. For real-time requirements ask for O(1), I suppose.
For every concrete application, there is an upper bound for the number n of software timers being active at the same time. So for every concrete application, O(n) has an upper bound.
This is true for any finite set of numbers. So we could also go for an O(exp(n))?
This is intentionally written to not be O(1), because of the reasonable trade-offs as described. We know how to do O(n) by simply using linked lists and sorting on insert, and we understand the runtime implications well. We don't know how to do this in O(1) without limiting the number of allowed timers.
We do multiplexing here, so one hardware timer (or more, but a small number) needs to be used to schedule N timeouts.
So when adding one of these timeouts (using a set(interval) function), that new timeout needs to be added into the list of already queued timers. Using linked lists, this is an O(n) operation. Using an array and binary search, finding the position is O(log n). We need to be careful what to require here, because if we break O(n log n) for sorting n timers, we might attract a lot of attention.
It is possible to add an allowed timers limit (that is lower than available memory, e.g., 16 or 128) and enforce this.
We'd suddenly have O(1) on paper (meeting the requirements), but wouldn't have gained anything, other than that now a multitude of functions have an error case (timer_sleep() might fail with "maximum number of timers reached") that needs to be checked everywhere, which is unreasonable.
TL;DR let's not require something we don't know how to implement. This is phrased as "yadda yadda reasonable trade-offs".
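The "linked list, sort on insert" approach mentioned above can be sketched roughly as follows. This is a simplified illustration, not the xtimer or ztimer implementation: the type `sw_timer_t` and the function `sw_timer_insert()` are hypothetical, and counter wraparound is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software timer entry, kept in a singly linked list that is
 * sorted by ascending expiry time. */
typedef struct sw_timer {
    struct sw_timer *next;
    uint32_t expiry;    /* absolute expiry in ticks, wraparound ignored */
} sw_timer_t;

/* Insert `t` so that the list stays sorted. Walking the list is O(n) in the
 * number of queued timers and needs no extra memory.
 * Returns 1 if `t` became the new head (the hardware timer would then have
 * to be re-armed), 0 otherwise. */
int sw_timer_insert(sw_timer_t **head, sw_timer_t *t)
{
    assert(head != NULL && t != NULL);

    sw_timer_t **pos = head;
    while (*pos != NULL && (*pos)->expiry <= t->expiry) {
        pos = &(*pos)->next;
    }
    t->next = *pos;
    *pos = t;
    return pos == head;
}
```

Because the list is sorted, the next timer to fire is always the head, so determining the next deadline is constant time; only insertion pays the O(n) walk.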
The complete timer management can be done in O(log(n)), which is almost constant in the number range we consider realistically. Further optimizations can apply, if algorithms account for the deadline of the next firing timer, which can be determined in constant time.
Current preliminary HiL-tests of @pokgak show a rather poor behavior of xtimer and a somewhat unclear signature of ztimer. We will publish once ready.
In any case, it is important that timer firing remains within predictable bounds independent of the number of instantiated timers - otherwise real-time gets out of control.
log10(25) ~ 1.40, exp(25) ~ 7.2 x 10**10.
Both bounded, which is exactly what a realtime system needs.
Really?
Do you want to dive into the numerical analysis of the numbers you provided or can we agree here that both numbers are constants?
A realtime system does not require the system to be fast. It does however require the system to respond within the specified deadlines. This deadline can be on the order of microseconds or on the order of years, depending on the requirements.
Furthermore, a complex O(1) algorithm taking 10 seconds performs worse than a simple O(n) algorithm taking `10 ms + n * 1 ms` when n is known to be bounded by 100.
As a solution here, why don't we rephrase a bit and specify that it must be trivial to determine the maximum number of active timers for an application in order to ensure bounded execution time?
@maribu @kaspar030 Here, and wherever else applicable in this section, what about giving some ballpark figures that we aim for, at the minimum?
For that, a (low) hanging fruit could be to use the output of some xtimer benchmarks.
In my mind, such figures would not aim to be "exhaustive", but rather to be indicative.
I think it could be helpful. What do you think?
#12942 might provide some, but they're very dependent on both platform and configuration...
yup. That's why I was suggesting we do not claim these numbers capture exhaustively the requirements, but rather are put here as indicators, to give a rough idea?
Most MCUs have two kinds of timers
Now a timer API could combine both to always set the longest possible sleep with the low power timer and use the high precision timer for the rest.
e.g. if you have a low power 32 kHz timer and a fast 1 MHz timer, you could sleep for 325 ticks on the low power timer and then use the high precision timer for the remaining 82 ticks.
Ideally you could also specify a minimum and a maximum wake up time, so wake-up events can be bundled to reduce the total number of wake-ups.
For precise timings you'd have `min == max`.
But the user shouldn't have to worry about which timer to use, that should be selected automatically based on the wait time.
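The idea of bundling wake-ups through a minimum and maximum wake-up time can be illustrated with a small planning routine. This is only a sketch under assumptions: `wake_window_t` and `plan_wakeups()` are hypothetical names, the windows are assumed to be pre-sorted by ascending `max`, and a greedy "wake as late as allowed" strategy is one possible choice among several.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request: wake up anywhere within [min, max], in ticks.
 * A precise timer sets min == max. */
typedef struct {
    uint32_t min;
    uint32_t max;
} wake_window_t;

/* Greedy bundling: `w` must be sorted by ascending `max`. Whenever a window
 * is not yet served by the last planned wake-up, plan a new wake-up at that
 * window's latest allowed time. Later windows whose `min` is not beyond that
 * time are served by the same wake-up.
 * Writes the planned wake-up times to `out` (room for n entries) and returns
 * how many wake-ups are needed. */
size_t plan_wakeups(const wake_window_t *w, size_t n, uint32_t *out)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        assert(w[i].min <= w[i].max);
        if (count == 0 || w[i].min > out[count - 1]) {
            out[count++] = w[i].max;
        }
    }
    return count;
}
```

With four requests `[0, 10]`, `[5, 12]`, `[11, 20]` and the precise request `[20, 20]`, this plan needs only two wake-ups instead of four.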
I'm not sure if there really is a use case for that. E.g. when you want to perform 3 measurements with a delay of exactly 10 ms and additionally want to repeat those three measurements every 30 minutes, this would be a good example for using both timers. But to me it makes more sense to have one 30 minutes timer, and then a 10 ms timer that is used for the second and third measurement of each row.
If you would transparently use both low frequency and high frequency timers (e.g. for a sleep of 30 minutes and 1337 µs, transparently set one 30 minute low frequency timer and afterwards a 1337 µs high frequency timer), there would be some issues:
IMO mixing clocks should not be done transparently, as this might (depending on the availability of low frequency clocks) reduce the accuracy of timers in unexpected ways. Having this explicitly makes sure users are aware of power consumption, accuracy and precision of the used clock.
Sounds like this adds quite some complexity. I'm not sure if it is better to let the application developers optimize their use of timers. IMO core infrastructure like software timers should come with low ROM / RAM requirements and robust implementations, as one usually cannot opt out of using them.
Yeah, this is something that seems useful in theory. In practice, in a non-idle system, it is difficult enough to get a 1 MHz timer to actually have better than the 30.5 us accuracy that a 32 kHz timer promises. See #14931. We tried debugging this: uncorrected, the (quite fast) nrf52840 has a 30 us offset on a simple ztimer_sleep(). There's a context switch, expensive timer reads, and ztimer's overhead. The time is spread more or less evenly across them. Yeah, that can be mostly corrected, but this is the simplest use-case.
A simple combination of timers won't gain anything, as setting a low-freq timer already quantizes on that timer's frequency, so the result is off by half that timer's period length, on average, anyhow. So using a HF timer "for the last microseconds", but being based on a, from the HF timer's perspective, arbitrary time in between a LF period, is useless.
There are ways to double sync, but that's expensive...
IMO, with all these theoretical timing needs, people need to realize that in most cases, 1 kHz or 32768 Hz is mightily sufficient. And those are easy to do with RTTs.
Also, as said in previous discussions, a sane abstraction of timers that uses only a single backend (e.g., doesn't do these complex combinations of timers) is a nice base to put the complex stuff on top. At least for prototyping.
The user needs to choose the precision (the time base). There cannot be an API where the user says "sleep 100". It'll always be "sleep 100ms" or "sleep 100us".
IMO it is OK to simplify our lives (and reduce complexity) by asking the user to always choose the time base that is just enough to express the needs. E.g., if the user asks for "sleep 100ms", it is acceptable to expect them to call "sleep_ms(100)", and not do "sleep_us(100000)".
I think this is re-hashing the mailing list discussions from a year ago :)
I'd be happy to see a prototype of what is proposed here.
I agree that for "sleep_ms(100)" vs "sleep_us(100000)", there are ways to maximize power efficiency. A simple `if us > 1000: sleep_ms(us/1000), sleep_us(rest)` might do it for many cases.
Otherwise, I see split between domains, clock drift boundaries, high precision LF, mentioned. Implement! :)
Well, how do you do that without explicit synchronization? Assuming LF is being used as it is active during sleep, and HF is not active during sleep, they need to be synchronized at least on every transition between sleep and wake. I call that double sync.
I call that expensive because on some platforms (e.g., sam0), just reading the RTT "now" register needs to synchronize the clocks, in some cases taking a whole period (30us), sometimes, depending on what else is going on in the system. That dwarfs a lot of clock drift for reasonable time scales.
Yup. I'm strongly in favor of not making things complex, and teaching users to use ms or 32kHz, unless there's a reeeeally good reason not to. Accurate ms or 32kHz timing goes very far. :)
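The `if us > 1000` split mentioned above could look roughly like this in C. The names `sleep_split_t` and `split_sleep()` are hypothetical; the sketch only computes how a microsecond interval would be divided between a millisecond clock and a microsecond clock, and does not address the synchronization and quantization concerns raised in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result of splitting one sleep across two clocks. */
typedef struct {
    uint32_t ms;    /* part to sleep on the millisecond (low power) clock */
    uint32_t us;    /* remainder to sleep on the microsecond clock */
} sleep_split_t;

/* Sleeps longer than 1000 us are split into whole milliseconds plus a
 * sub-millisecond remainder; shorter sleeps stay on the microsecond clock. */
sleep_split_t split_sleep(uint32_t us)
{
    sleep_split_t s = { 0, us };
    if (us > 1000) {
        s.ms = us / 1000;
        s.us = us % 1000;
    }
    /* the two parts always add up to the requested interval */
    assert(s.ms * 1000u + s.us == us);
    return s;
}
```

A caller would then sleep `s.ms` on the coarse clock followed by `s.us` on the fine clock. As noted in the discussion, the coarse sleep is quantized to its own tick, so the combined accuracy is not automatically that of the fine clock.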
Sure!
I know, and to be honest I don't want to bore people with lengthy discussions again. There is no point in these lengthy "abstract" discussions unless more of the developers jump in. Yet, the arguments are no less valid than before. It appears to me that we all hold single pieces of the big picture without an explicit agreement on what the big picture looks like. This document will eventually help with that I think.
On it (at least for some of these problems) just from a slightly different direction.. I'll explain during the summit.
Not. At least not without any synchronization. Of course they need to be synchronized.
But why double sync? And who says that you need to do this explicitly? We may very well use an "opportunistic" approach where you sync based on events that happen anyway. I.e., you don't care how long reading 'now' takes for the slow timer if you read the fast timer close enough to a known event of the slow timer... I don't say this solves all cases everywhere. I'm just trying to think towards a solution instead of searching for reasons against it.
Yes reading LF timer can indeed be expensive. Apart from reading not being necessarily required, some would argue transitioning to a low power mode for arbitrarily small periods is "expensive" because of the delay when leaving lpm. Again, it all depends on the perspective ;)
Fair enough. I get your point.
BTW: why do I get these check-labels mail-notification after adding a comment?
In the end this will have to be a trade-off between:
`sleep_ms(1)` is just an alias for `sleep_us(1000)`
`USEMODULE += ztimer_policy_powersafe` sleeps could be moved to the low frequency clock where possible, etc. And one would not have to replace `ztimer_sleep(ZTIMER_USEC, x)` by `ztimer_sleep(ZTIMER_MSEC, (x + 500) / 1000)`)
We just won't be able to get those all.
IMO we should first assemble a list of design goals and vote on them to get some agreement on their prioritization. If overhead, memory, and maintainability get a higher priority than user friendliness and timer policy support, an explicit clock selection via the API makes more sense. Otherwise, adding heuristics and policies for clock selection makes more sense. (This might still be technically better implemented on top of a "low-level high-level timer API", but that would be an implementation detail.)
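To make the "heuristics and policies" option a bit more concrete, here is a minimal sketch of a clock selection heuristic built around the `(x + 500) / 1000` rounding quoted above. Everything in it is hypothetical: `clock_choice_t`, `pick_clock()` and the `threshold_us` parameter are illustration only and do not exist in ztimer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical clock identifiers for this sketch. */
typedef enum {
    CLOCK_USEC,
    CLOCK_MSEC
} clock_choice_t;

/* Hypothetical policy: sleeps of at least `threshold_us` are moved to the
 * millisecond clock, rounded to the nearest millisecond; shorter sleeps stay
 * on the microsecond clock. The value to pass to the chosen clock is written
 * to `*value`. */
clock_choice_t pick_clock(uint32_t us, uint32_t threshold_us, uint32_t *value)
{
    assert(value != NULL);
    if (us >= threshold_us) {
        /* widen before adding 500 so that values near UINT32_MAX
         * do not overflow */
        *value = (uint32_t)(((uint64_t)us + 500u) / 1000u);
        return CLOCK_MSEC;
    }
    *value = us;
    return CLOCK_USEC;
}
```

The threshold is where the trade-off discussed here becomes visible: a low threshold saves more power but silently trades away up to half a millisecond of accuracy, which is the kind of implicit behaviour an explicit clock selection in the API would avoid.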