Speed up serial GPS message decoding #13469
Do you want to test this code? You can flash it directly from Betaflight Configurator:
WARNING: It may be unstable. Use only for testing!
Unit tests need some changes:
The problem is the cli test, which pulls in io/gps.c, which in turn now requires drivers/system.c. Pulling in drivers/system.c directly doesn't work. Creating mocks of the required functions would be easy, but I'm not sure where to put them.
Unit tests should build now.
Another hack to avoid fixing the real problem ..
  while (serialRxBytesWaiting(gpsPort)) {
      wait = 0;
      if (!isFast) {
          rescheduleTask(TASK_SELF, TASK_PERIOD_HZ(TASK_GPS_RATE_FAST));
          isFast = true;
      }
-     if (cmpTimeUs(micros(), currentTimeUs) > GPS_RECV_TIME_MAX) {
+     if ((getCycleCounter() - initialCycleCount) > cpuCycleLimit()) {
Use cmp32() here. The subtraction result is unsigned and it will fail on wraparound. And in this case, it may happen every few seconds.
Is there some kind of Betaflight- or MCU-specific weirdness going on with unsigned int arithmetic? As far as I can tell, the current code should handle wraparound as expected.
I did a small test to be sure:
unsigned int val1 = 2U;
unsigned int val2 = 4294967295U; // 2^32 - 1
printf("value: %u \n", val1 - val2); // the difference is an unsigned int, so print it with %u
which resulted in
value: 3
as expected.
The cmpTimeUs macro handles wrap. See cmpTimeCycles for cycle comparison. Used in the scheduler.
I must be missing something here. Why should the result of the subtraction be converted to a signed value if it is known which value is older? As far as I can tell, the result will be the same in most cases, and if the difference is large it will result in a negative value, which we do not want here (then again, if the difference ever becomes that large, there are bigger problems elsewhere).
@tbolin : Sorry, I was wrong. The wrap does work as intended.
Why should the result of the subtraction be converted to a signed value if it is known which value is older?
For ms/us, 31 bits is considerably longer than any intended delay. A signed value works independently of value ordering, so there is less chance of getting it wrong (cmp32(a, b) > tout / cmp32(a, b+tout) > 0). Also, cmpTimeUs / cmp32 marks calculations with values that are expected to overflow (so cmpTimeCyclesU(a, b) instead of a - b).
For the tick counter, the situation is more complicated due to its range. Personally, I'd use signed conversion and limit the maximum delay to some small value (<1s). This should be enforced (compile-time asserts / runtime saturation). Without this, code may behave differently depending on clock speed, leading to a support nightmare. We will still run into problems with 2GHz chips, but that will take some time.
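To make the two comparison styles discussed above concrete, here is a minimal host-side sketch. The names cmp32_sketch, elapsedUnsigned and deadlinePassed are hypothetical stand-ins written for this illustration, not Betaflight's actual helpers. It assumes a 32-bit wrapping counter and a two's complement target, where converting the unsigned difference to int32_t yields the signed distance.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cmp32()-style helper: signed distance a - b.
 * The unsigned-to-signed conversion is implementation-defined in C, but
 * behaves as two's complement wraparound on the usual embedded targets. */
static inline int32_t cmp32_sketch(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b);
}

/* Unsigned style: valid when `start` is known to be the older value.
 * The modulo-2^32 difference is the elapsed tick count, even across a wrap. */
static inline bool elapsedUnsigned(uint32_t now, uint32_t start, uint32_t limit)
{
    return (uint32_t)(now - start) > limit;
}

/* Signed style: works whichever value is older, as long as the two values
 * are less than 2^31 ticks apart. */
static inline bool deadlinePassed(uint32_t now, uint32_t deadline)
{
    return cmp32_sketch(now, deadline) > 0;
}
```

Both forms agree while the interval is short. They differ only once the distance exceeds 2^31 ticks, which is why the signed form needs the maximum delay to be bounded.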
BTW: It may be a good idea to use
uint32_t getCycleCounter(void)
{
return DWT->CYCCNT | 1;
}
This way, you can use if (timeout && cmp32(getCycleCounter(), timeout) < 0) safely and resolution loss is negligible.
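One possible reading of this suggestion is that 0 is reserved to mean "no timeout armed". The sketch below illustrates that idea on a host, under assumptions: fakeCycleCounter simulates DWT->CYCCNT, and getCycleCounterOdd, armTimeout and timeoutExpired are hypothetical names. As a variation on the comment, the delay is also rounded down to an even number so that the stored deadline stays odd and can never be 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated cycle counter; on target this would be DWT->CYCCNT. */
static uint32_t fakeCycleCounter;

/* Counter read with the low bit forced on, so the result is never 0. */
static uint32_t getCycleCounterOdd(void)
{
    return fakeCycleCounter | 1u;
}

/* Arm a deadline `delayCycles` from now. The delay is rounded down to an
 * even value (an assumption of this sketch) so odd + even stays odd. */
static uint32_t armTimeout(uint32_t delayCycles)
{
    return getCycleCounterOdd() + (delayCycles & ~1u);
}

/* A deadline of 0 means "not armed"; otherwise compare with wraparound. */
static bool timeoutExpired(uint32_t timeout)
{
    return timeout && (int32_t)(getCycleCounterOdd() - timeout) >= 0;
}
```

The cost is one cycle of resolution, which is negligible next to delays measured in microseconds.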
This PR improves the performance of the serial GPS message decoding. The execution time of the GPS task now usually stays below 20 us on an F722. Usually only one more fast cycle is required to process a nav-pvt message compared to current master. Previously the processing time was just below 30 us.
Limit gps processing time with clock cycles instead of micros
Replacing the repeated calls to micros with checking the number of CPU cycles speeds up the message processing by about 25-30%. The max processing time is lowered to 15 us (from 25) to take advantage of the faster processing.
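A minimal host-side sketch of the cycle-budget idea, not the code in this PR: the clock rate SKETCH_CPU_HZ (216 MHz), the simulated counter simCycles, and the helpers usToCycles and drainWithBudget are all assumptions made for illustration. The budget is converted to cycles once, so the loop only needs a counter read and a subtraction per byte.

```c
#include <stdint.h>

#define SKETCH_CPU_HZ 216000000u /* assumed core clock, for illustration only */

/* Simulated cycle counter; on target this would be a hardware register read. */
static uint32_t simCycles;

static uint32_t readCycles(void)
{
    return simCycles;
}

/* Convert a microsecond budget to CPU cycles. */
static uint32_t usToCycles(uint32_t us)
{
    return us * (SKETCH_CPU_HZ / 1000000u);
}

/* Process waiting bytes until none remain or the cycle budget is spent.
 * Returns the number of bytes processed. */
static unsigned drainWithBudget(unsigned bytesWaiting, uint32_t budgetUs,
                                uint32_t costPerByteCycles)
{
    const uint32_t start = readCycles();
    const uint32_t limit = usToCycles(budgetUs); /* computed once, outside the loop */
    unsigned processed = 0;

    while (bytesWaiting > 0) {
        /* The unsigned difference stays correct across counter wraparound. */
        if ((uint32_t)(readCycles() - start) > limit) {
            break;
        }
        simCycles += costPerByteCycles; /* stand-in for parsing one byte */
        bytesWaiting--;
        processed++;
    }
    return processed;
}
```

Because the check runs before each byte, the last byte handled can overshoot the budget by up to one byte's processing cost.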
Defer ublox message processing if low on time
Processing a ublox message after the last byte is received takes about 4-5 us. The task will now use a lower time limit if the ublox parser is on the last byte to prevent the task time from spiking at the end of a message.
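The deferral rule can be sketched as follows. This is an illustration under assumptions rather than the PR's implementation: the constants and the helpers timeLimitUs and shouldProcessNextByte are hypothetical, with 15 us taken from the limit quoted above and 5 us from the upper end of the quoted message processing cost.

```c
#include <stdbool.h>
#include <stdint.h>

#define LIMIT_NORMAL_US    15u /* regular per-invocation budget */
#define FINAL_BYTE_COST_US 5u  /* assumed cost of handling a completed message */

/* Use a tighter limit when the next byte would complete a message, leaving
 * headroom for the full-message processing that follows it. */
static uint32_t timeLimitUs(bool nextByteCompletesMessage)
{
    return nextByteCompletesMessage ? LIMIT_NORMAL_US - FINAL_BYTE_COST_US
                                    : LIMIT_NORMAL_US;
}

/* Decide whether to consume one more byte in this task invocation. */
static bool shouldProcessNextByte(uint32_t elapsedUs, bool nextByteCompletesMessage)
{
    return elapsedUs <= timeLimitUs(nextByteCompletesMessage);
}
```

With these numbers, a task that has already run for 12 us keeps consuming ordinary bytes but leaves the final byte of a message for the next invocation.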
Logs
gps_processing_logs.zip
The logs were recorded at 115200 baud to create a worst-case scenario.