Reduce RX task minimum update rate to 22Hz, to improve 25Hz link stability #13435
base: master
Do you want to test this code? You can flash it directly from Betaflight Configurator:
WARNING: It may be unstable. Use only for testing!
dcf81f0 to 99c97c6
@ctzsnooze: Last image is with this PR applied?
Yes, with commit [Handle NULL input as before, log frame time]
At least with ELRS/CRSF, there is less than 20uS difference in the interval measurements from
What I don't know about is if commands are received over MSP or SPI, exactly what would happen then
Observing the
Also, the final frame delta value (red line in traces) does do a pretty good job of tracking frame delta, I think, at the moment. If OK for SPI and MSP then I'm happy. I'll make another log at higher speed.
My great fear is accidentally powering up the microwave after I put the Tx in it :-) It is a perfect test situation, closing the door reliably wrecks the Rx link, but there is an automatic
Some (random) comments
It would be best to clean up this code a bit: document the expected result of important functions, then try to implement that specification. Also, hidden state variables should be removed where possible.
With current PR code, this is looking closely at The value derived from Rx time stamps is more stable, and probably better suited for the correction of the derivative of stick travel in feedforward. Finally I have some data to support using
@ledvinap Thank you, sincerely, for your refactoring suggestions via PR21 - much nicer code! I'm very grateful. Tested... first by comparing the 'noise' in the estimate of interval with a historical log of the same code but without PR21. Both logs are 250Hz; the value in red is the output value for Frame Delta in both cases. Comparing the two in low LQ tests, the responses appear identical in time and form, in relation to signal lost. Behaviour when powering the transmitter down, to failsafe, is exactly the same in both cases.
Update - tested with ELRS SPI, no logging, but in the Sensors tab it appeared to work well. The link was extremely precise, reporting 40.0ms at 250Hz nearly every packet. The only thing of note was that with the same LQ drop test as before, there were no gradual interruptions; the link just died abruptly, and recovered equally quickly. And the link would only work at 250 and 500Hz, anything else failed.
Petr: Would it be OK for me to add your refactoring changes from PR21 into this PR?
Is this different from old code? I don't see why it should be related, but I may have overlooked something.
Sure, you are welcome!
Thanks, @ledvinap! Really appreciate your direct proposal, I could not have done that. It is definitely much clearer now. I've committed and pushed your changes to this PR, re-named one constant as suggested, and logged the time stamp (because I quite like seeing it ramp up, and when it goes flat, I know what that means). Anything else we should be doing within the context of this PR?
One thing I noticed is that we calculate
I wondered how best to just do the maths once, perhaps where we calculate
The feedforward and rcSmoothing applications can use the constrained RxRate value (between Min and Max). The CLI would report a limited frequency range of 15-1000Hz, and as a result would not report zero, even if no packets were received, which is perhaps not ideal; maybe we should check there if the interval is valid with
ELRS devs say they have identified a reason for the 'more frequent than expected' dropouts, with perhaps a solution at their end, and will look into their expectations of the SPI code also. I can go back and flash older versions down to 4.4 and test ELRS 3.x over SPI, but that will take time. I have no clue about it to be honest, I never tried it at 50Hz before. Very unlikely to be something we have done.
@ledvinap In the most recent commit, I have tried to refactor out the unnecessary repetitions of the rxRate calculation, and to avoid unnecessary translations into seconds. I'm not sure about the best way to set
cliPrint("# Detected Rx frequency: ");
if (getCurrentRxIntervalUs() == 0) {
cliPrintLine("NO SIGNAL");
This check could not have worked since currentRxIntervalUs was constrained to a min value that was >0.
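A minimal sketch of why a zero test on a clamped interval can never fire. The limits and the names `constrainRxIntervalUs`, `clampedIntervalIsZero` and `rxSignalMissing` are hypothetical, chosen only to match the 15-1000Hz range mentioned earlier in this thread; they are not taken from the Betaflight source.

```c
#include <stdbool.h>
#include <stdint.h>

// Assumed limits for illustration only (1000Hz and 15Hz expressed as intervals).
#define RX_INTERVAL_MIN_US 1000u
#define RX_INTERVAL_MAX_US 66666u

// Hypothetical clamp, standing in for whatever constrains the stored interval.
uint32_t constrainRxIntervalUs(uint32_t rawIntervalUs)
{
    if (rawIntervalUs < RX_INTERVAL_MIN_US) {
        return RX_INTERVAL_MIN_US;
    }
    if (rawIntervalUs > RX_INTERVAL_MAX_US) {
        return RX_INTERVAL_MAX_US;
    }
    return rawIntervalUs;
}

// Old style of check: compares the clamped value with zero, so it is never true.
bool clampedIntervalIsZero(uint32_t rawIntervalUs)
{
    return constrainRxIntervalUs(rawIntervalUs) == 0;
}

// One possible alternative: test the unclamped value before it is constrained.
bool rxSignalMissing(uint32_t rawIntervalUs)
{
    return rawIntervalUs == 0;
}
```

With these assumed limits, a raw interval of 0 is clamped up to 1000us, so only a test on the unclamped value (or a separate validity flag) can report "NO SIGNAL".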
3af28c5 to eaaf220
- prepare for direct use of lastRcFrameTimeUs
- refactor rx_spi callback
bd6588c to cff7f8c
I don't see any obvious problem, and it is a code improvement overall. Let's finish it.
const uint16_t smoothedRxRateHz = lrintf(rcSmoothingData->smoothedRxRateHz);
cliPrintLinef("%dHz", smoothedRxRateHz);
- const uint16_t smoothedRxRateHz = lrintf(rcSmoothingData->smoothedRxRateHz);
- cliPrintLinef("%dHz", smoothedRxRateHz);
+ cliPrintLinef("%dHz", lrintf(rcSmoothingData->smoothedRxRateHz));
Users had noted a high incidence of RXLOSS warnings when using ELRS 25Hz, despite 100% LQ being reported by the ELRS code.
ELRS code uses, effectively, a 100-sample moving average to return LQ (with telemetry packets counted as 'good' despite no data being received); the internal Betaflight LQ method is not used. It seems that single packet loss, and telemetry loss, isn't counted.
Previously we would announce RXLOSS if there were no good frames over a continuous 100ms period. For 25Hz users this meant that dropping just two frames in a row or one frame either side of a telemetry packet would result in an RXLOSS warning. Logging showed single and double-frame random drops were not all that uncommon at range, but rarely interfere significantly with flight. By increasing the 'no data' window to 150ms we can reduce 'false alarms' quite a lot, not only at 25Hz but also at 50Hz.
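The arithmetic behind those numbers can be sketched as follows. This is an illustrative model only, not Betaflight code: the helper name `maxConsecutiveDropsTolerated` is hypothetical, and it assumes the 'no data' timer effectively runs from the last good frame, so that k dropped frames in a row leave a gap of (k + 1) frame intervals.

```c
#include <stdint.h>

// Hypothetical helper: the largest number of consecutive dropped frames for
// which the gap between good frames stays strictly below the 'no data' window.
uint32_t maxConsecutiveDropsTolerated(uint32_t frameIntervalUs, uint32_t windowUs)
{
    if (frameIntervalUs == 0 || windowUs <= frameIntervalUs) {
        return 0;
    }
    // Largest n such that n * frameIntervalUs < windowUs.
    const uint32_t intervalsBelowWindow = (windowUs - 1) / frameIntervalUs;
    // A gap of n intervals corresponds to n - 1 dropped frames.
    return intervalsBelowWindow - 1;
}
```

Under that assumption, a 25Hz link (40000us interval) tolerates one dropped frame with a 100000us window and two with a 150000us window; a 50Hz link (20000us interval) tolerates three and six respectively.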
Currently the minimum Rx task update rate is 33Hz, i.e. intervals of ~30ms. This could cause a detection error when the intervals are 40ms long, as in a 25Hz link.
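As a rough illustration of the timing mismatch, the sketch below converts rates to intervals and checks whether the task's longest permitted gap between runs is shorter than the link's frame interval. The function names are hypothetical, and how the Rx task actually reacts to such a run is not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

// Convert a rate in Hz to an interval in microseconds (0 for an invalid rate).
uint32_t rateHzToIntervalUs(uint32_t rateHz)
{
    return rateHz ? 1000000u / rateHz : 0;
}

// Hypothetical check: true when the task can be forced to run between two
// consecutive on-time frames, i.e. before the next frame could have arrived.
bool forcedRunCanPrecedeNextFrame(uint32_t minTaskRateHz, uint32_t linkRateHz)
{
    return rateHzToIntervalUs(minTaskRateHz) < rateHzToIntervalUs(linkRateHz);
}
```

With a 33Hz minimum (about 30303us) and a 25Hz link (40000us) the check is true; with a 22Hz minimum (about 45454us) it is false, because the forced run then comes later than an on-time 25Hz frame.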
This PR makes quite a few small changes to reduce the tendency for false RXLOSS warnings at low Rx rates:
RXLOSS
FrameAge parameter, which was used in a manner that seemed quite confusing
rxRate is calculated independently, with one single calculation to derive rxRateHz in rc.c, and a get function to return it where needed in cli.c via getCurrentRxRateHz.
Overall these changes should reduce inappropriate RXLOSS indications for 25Hz links, and help a bit with 50Hz links also. Most pilots won't notice any significant difference. RXLOSS should appear whenever a sustained loss of data for more than 150ms occurs.
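The "calculate once, read via a getter" pattern described above might look roughly like this. It is a sketch under stated assumptions, not the code in this PR: only the name `getCurrentRxRateHz` comes from the description, while its body, its return type, `updateRxRateFromIntervalUs` and the 1000Hz cap are hypothetical.

```c
#include <stdint.h>

// Assumed upper bound, for illustration only.
#define RX_RATE_MAX_HZ 1000u

// Single stored result; 0 means no valid interval has been seen.
static uint16_t currentRxRateHz = 0;

// Hypothetical update hook: derive the rate once, when a new interval is known.
void updateRxRateFromIntervalUs(uint32_t intervalUs)
{
    if (intervalUs == 0) {
        currentRxRateHz = 0;
        return;
    }
    // Integer division rounded to the nearest Hz.
    uint32_t rateHz = (1000000u + intervalUs / 2) / intervalUs;
    if (rateHz > RX_RATE_MAX_HZ) {
        rateHz = RX_RATE_MAX_HZ;
    }
    currentRxRateHz = (uint16_t)rateHz;
}

// Callers read the stored value instead of repeating the division.
uint16_t getCurrentRxRateHz(void)
{
    return currentRxRateHz;
}
```

Keeping zero as an explicit "no valid interval" value is one way to let a caller such as the CLI distinguish no signal from a low but valid rate.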
The refactoring should reduce CPU load a fraction.
Please test with the RX_TIMING debug, to which LQ and signalReceived values are logged; please compare Master to this PR.
This is the debug content of RX_TIMING debug in this PR:
vs master:
Testing appears OK, has been rebased to master.
Here is a graphic showing simulated link loss (put the quad into the microwave oven and shut the door). Axes are not labeled properly. The spikes coming up from the bottom indicate the duration of the dropout; the log is done at 50Hz so each discrete step up is 20ms of lag from one lost packet.
The very top trace is the SignalReceived indicator that triggers RXLOSS warning when it goes false, which happens on the left on four occasions when the LQ falls to around 50%. On the less severe link problem to the right, where LQ drops to around 70, there is no RXLOSS warning.
I do appreciate that quite a few long-range pilots felt that the RXLOSS warning was too intrusive. This perhaps shifts the balance back a bit away from it being a false alarm. When it appears, you have had no packets at all for 150ms, and that's cause for concern regardless of link speed.