
Optimize interrupt time #5483

Open
night-ghost opened this issue Jan 2, 2017 · 3 comments


@night-ghost
Contributor

night-ghost commented Jan 2, 2017

Issue details

The execution time of AP_InertialSensor_Invensense::_read_fifo() can be up to 1286 µs, which makes it hard to schedule at 1 kHz. So I propose the following optimizations, which should reduce this time by almost half:

  • exclude "accel *= _accel_scale;" and "gyro *= GYRO_SCALE;" from AP_InertialSensor_Invensense::_accumulate() and AP_InertialSensor_Invensense::_accumulate_fast_sampling(). Instead, publish those scales to the backend and take them into account in _rotate_and_correct_accel() and _rotate_and_correct_gyro(). If the scales default to 1, other sensors are unaffected.

  • in AP_InertialSensor_Backend::_rotate_and_correct_accel() and AP_InertialSensor_Backend::_rotate_and_correct_gyro(), pre-calculate all rotations and scales at init() into one transform matrix, so instead of two vector*scalar multiplications and two rotations we need only one matrix multiplication. Offsets can be rotated back at init() into the real accel/gyro orientation and thus applied before the other calculations.
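The second bullet could be sketched roughly as below. Vec3/Mat3 here are minimal stand-ins for ArduPilot's Vector3f/Matrix3f classes, and make_combined()/correct_sample() are hypothetical names for illustration only, not the actual backend code:

```cpp
#include <cassert>
#include <cmath>

// Minimal stand-ins for ArduPilot's Vector3f / Matrix3f (hypothetical).
struct Vec3 { float x, y, z; };

struct Mat3 {
    float m[3][3];
    Vec3 operator*(const Vec3 &v) const {
        return {
            m[0][0]*v.x + m[0][1]*v.y + m[0][2]*v.z,
            m[1][0]*v.x + m[1][1]*v.y + m[1][2]*v.z,
            m[2][0]*v.x + m[2][1]*v.y + m[2][2]*v.z,
        };
    }
};

// At init(): fold the sensor scale factor into the rotation matrix once,
// so the per-sample path needs a single matrix*vector multiply.
Mat3 make_combined(const Mat3 &rotation, float scale) {
    Mat3 out = rotation;
    for (int r = 0; r < 3; r++) {
        for (int c = 0; c < 3; c++) {
            out.m[r][c] *= scale;
        }
    }
    return out;
}

// Per-sample path: offsets (rotated back into the sensor frame at init)
// are subtracted first, then one combined transform is applied.
Vec3 correct_sample(const Vec3 &raw, const Vec3 &sensor_frame_offset,
                    const Mat3 &combined) {
    Vec3 v = { raw.x - sensor_frame_offset.x,
               raw.y - sensor_frame_offset.y,
               raw.z - sensor_frame_offset.z };
    return combined * v;
}
```

With an identity rotation and scale 2, a raw sample {2,3,4} with offset {1,0,0} comes out as {2,6,8}: one subtraction and one matrix multiply instead of separate scale and rotate steps.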

Version

git

Platform

[X] All
[ ] AntennaTracker
[ ] Copter
[ ] Plane
[ ] Rover

@tridge
Contributor

tridge commented Jan 20, 2017

I completely agree with the idea of reducing the amount of computation done in _read_fifo(). While we are in _read_fifo() we hold the lock on the bus, so we're preventing other bus activity.
I'm surprised that scaling the accel and gyro vectors is a significant cost, though. Which microcontroller is this on? And with which sensor? Is fast sampling enabled?

@night-ghost
Contributor Author

What microcontroller is this on? And with which sensor? Is fast sampling enabled?

All data is for a RevoMini, so an STM32F405 (hardware float) and an MPU6000. Fast sampling is off in this configuration.

Long story :)
To better debug the scheduler in my HAL I added some statistics to it. The scheduler runs at 8 kHz, so I was very confused when I saw a maximum cycle time of ~4800 µs! Then I added per-task statistics and got the following table:
Scheduler stats:
% of full time: 24.56 Efficiency 0.957
delay times: in main 86.09 including in semaphore 0.00 in timer 4.89 in isr 0.00
Task times:
task 0x808FFB1200074C4 tim 0.0 int 0.000% tot 0.0000% mean time 0.0 max 1
task 0x8090BB1200074F0 tim 293.6 int 2.321% tot 0.5701% mean time 8.3 max 18
task 0x808B07D200073E8 tim 1.8 int 0.014% tot 0.0035% mean time 0.9 max 3
task 0x804206720009438 tim 1711.7 int 13.532% tot 3.3244% mean time 442.5 max 445
task 0x804403720009900 tim 1845.0 int 14.586% tot 3.5834% mean time 632.1 max 867
task 0x804AEDD200099F0 tim 8797.0 int 69.547% tot 17.0855% mean time 265.9 max 3474

See: the last task uses ~70% of all interrupt time and, unlike all the others, has highly variable execution times with a huge maximum of 3474 µs, which makes a 500 Hz loop time impossible without jitter. By its address I identified the task: yes, this is _read_fifo(). (Task 3 is the baro and task 4 is the compass; tasks 0-2 are HAL internal.)

You know better than I do that this task not only reads data but also does significant work: it normalizes and rotates the vectors.

IMHO the best way is to move all calculations out of interrupt time into the main loop, but if that is impossible, those calculations can be optimized by some pre-calculation.
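Moving the work out of interrupt time could be sketched as below, assuming raw samples are accumulated as integers in the interrupt path and all float math is deferred to the main loop. The names RawAccum/accumulate_raw/consume are hypothetical, not ArduPilot API:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: running sums of raw integer samples, filled from
// the interrupt path and drained from the main loop.
struct RawAccum {
    int32_t sum[3] = {0, 0, 0};
    uint32_t count = 0;
};

// Interrupt-time work: three integer adds, no scaling or rotation.
inline void accumulate_raw(RawAccum &a, const int16_t s[3]) {
    a.sum[0] += s[0];
    a.sum[1] += s[1];
    a.sum[2] += s[2];
    a.count++;
}

// Main-loop work: average the accumulated samples and apply the scale
// once per update, then reset the accumulator.
void consume(RawAccum &a, float scale, float out[3]) {
    for (int i = 0; i < 3; i++) {
        out[i] = (a.count ? (float)a.sum[i] / (float)a.count : 0.0f) * scale;
        a.sum[i] = 0;
    }
    a.count = 0;
}
```

In a real driver the accumulator would need to be protected against concurrent access (e.g. by briefly disabling the interrupt while draining), which this sketch omits.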

PS: GCC at the -Os optimization level doesn't like to inline functions, so all the vector mathematics is translated into individual function calls with loads before and stores after. For example, gyro *= GYRO_SCALE; translates to

mov r0, r4
vldr s0, [pc, #116]
bl 0x804d724 <Vector3::operator*=(float)>

Vector3<float>::operator*=(float):

vldr s13, [r0]
vldr s14, [r0, #4]
vldr s15, [r0, #8]
vmul.f32 s13, s13, s0
vmul.f32 s14, s14, s0
vmul.f32 s0, s15, s0
vstr s13, [r0]
vstr s14, [r0, #4]
vstr s0, [r0, #8]
bx lr

which more than doubles the execution time.
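One way around the -Os inlining heuristic, if it is acceptable in the codebase, is GCC's always_inline attribute, which forces the small vector helpers to be inlined regardless of the optimization level and so avoids the call plus load/store overhead shown above. A minimal sketch (V3 and scale_inplace are illustrative names, not ArduPilot code):

```cpp
#include <cassert>

struct V3 { float x, y, z; };

// GCC/Clang: force inlining even at -Os, so the three multiplies are
// emitted at the call site without a function call, loads, or stores.
__attribute__((always_inline)) inline V3 &scale_inplace(V3 &v, float s) {
    v.x *= s;
    v.y *= s;
    v.z *= s;
    return v;
}
```

The attribute applies per function, so hot-path helpers can be forced inline while the rest of the image still benefits from -Os size savings.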

@amilcarlucas
Contributor

amilcarlucas commented Feb 28, 2018

How much of this has changed in recent commits?
