Optimize beam sensor model runtime performance #200

Merged
merged 1 commit into from
May 30, 2023

glpuga
Collaborator

@glpuga glpuga commented May 28, 2023

Proposed changes

Minor optimizations to the tightest execution loops in the beam sensor model.

While these changes do improve performance a bit, they barely narrow the performance gap we have against Nav2 AMCL. We are still missing something much bigger than these optimizations.

Type of change

  • 🐛 Bugfix (change which fixes an issue)
  • 🚀 Feature (change which adds functionality)
  • 📚 Documentation (change which fixes or extends documentation)

Checklist

Put an x in the boxes that apply. This is simply a reminder of what we will require before merging your code.

  • Lint and unit tests (if any) pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • All commits have been signed for DCO

Additional comments

These changes shift the performance profile, as seen through perf, from this:

baseline_flamegraph

to this:

commit6_flamegraph

Note: These flamegraphs assume the changes in #199 have already been merged, since there are grid optimizations that are common to both Likelihood and Beam.

Notice that, relative to the beluga::Bresenham2i::Line block (which was left unchanged, both in cost per execution and in total number of executions), the overall time spent in the importance_weight function seems to have dropped significantly.

Notice also the removal of the second call-stack tower on the right, which appears to be related to queuing in the MessageFilter.

However, these changes barely changed the CPU usage profile:

beam_vs_beluga_par

beam_vs_beluga_seq

And the change is barely noticeable in the difference against Nav2 AMCL.

beam_vs_beluga_vs_nav2_seq

While it's still possible that our Bresenham implementation is less performant than Nav2's, even if we somehow reduced the tracing runtime cost to zero with some magic implementation, that would only bring us to roughly the same level as Nav2 AMCL. To me, that indicates the problem is somewhere else.
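The reasoning above is essentially an Amdahl's-law bound: if a stage accounts for some fraction of the total runtime, removing it entirely caps the overall gain. A minimal sketch of that arithmetic follows. The helper names are hypothetical (not part of beluga), and the fraction is a value to be read off a profile, not a number measured in this PR.

```cpp
#include <stdexcept>

// Hypothetical helper (not part of beluga): overall speedup obtained when a
// stage that takes `fraction` of the total runtime becomes `stage_speedup`
// times faster, with everything else left unchanged (Amdahl's law).
double overall_speedup(double fraction, double stage_speedup) {
  if (fraction < 0.0 || fraction > 1.0 || stage_speedup <= 0.0) {
    throw std::invalid_argument("fraction must be in [0, 1], stage_speedup > 0");
  }
  return 1.0 / ((1.0 - fraction) + fraction / stage_speedup);
}

// Limit of the expression above as stage_speedup grows without bound, i.e.
// the stage becomes free. This is the best case for optimizing that stage.
double speedup_if_stage_free(double fraction) {
  if (fraction < 0.0 || fraction >= 1.0) {
    throw std::invalid_argument("fraction must be in [0, 1)");
  }
  return 1.0 / (1.0 - fraction);
}
```

As an illustration with a placeholder value, `speedup_if_stage_free(0.5)` evaluates to 2.0: a stage taking half of the runtime can at most double overall throughput, no matter how well it is optimized.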

I suspect we are actually doing more work than Nav2, but I haven't been able to find any proof of that.

Further work is still needed.

@glpuga glpuga added the enhancement (New feature or request) and cpp (Related to C++ code) labels May 28, 2023
Collaborator

@hidmic hidmic left a comment

LGTM

beluga/include/beluga/sensor/beam_model.hpp (resolved)
beluga/include/beluga/sensor/beam_model.hpp (resolved)
Member

@nahueespinosa nahueespinosa left a comment

@glpuga Left two super minor comments. It is also worth noting that RelWithDebInfo reduces the optimization level to -O2 instead of -O3, so flamegraphs might not be as good at comparing micro-optimization changes like these.

Microbenchmarks using googlebenchmark compiled in Release mode may be better at detecting true progress.
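To illustrate the kind of measurement suggested above, here is a minimal sketch that times a tight loop in isolation. It uses a plain `std::chrono` harness as a stand-in for googlebenchmark so that it has no external dependencies, and the `trace_line` function is a textbook integer Bresenham written for this example; it is an assumption, not beluga's `Bresenham2i` implementation.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

using Cell = std::pair<int, int>;

// Textbook integer Bresenham (all octants). Returns every grid cell visited
// from (x0, y0) to (x1, y1), both endpoints included.
std::vector<Cell> trace_line(int x0, int y0, int x1, int y1) {
  std::vector<Cell> cells;
  const int dx = std::abs(x1 - x0);
  const int dy = -std::abs(y1 - y0);
  const int sx = x0 < x1 ? 1 : -1;
  const int sy = y0 < y1 ? 1 : -1;
  int err = dx + dy;
  while (true) {
    cells.emplace_back(x0, y0);
    if (x0 == x1 && y0 == y1) break;
    const int e2 = 2 * err;
    if (e2 >= dy) { err += dy; x0 += sx; }  // step along x
    if (e2 <= dx) { err += dx; y0 += sy; }  // step along y
  }
  return cells;
}

// Hypothetical timing helper: mean wall-clock nanoseconds per call of `fn`.
// `fn` must return a std::size_t; the result is accumulated into a volatile
// sink so the optimizer cannot discard the work being measured.
template <class F>
double mean_ns_per_call(F&& fn, int iterations) {
  using clock = std::chrono::steady_clock;
  volatile std::size_t sink = 0;
  const auto start = clock::now();
  for (int i = 0; i < iterations; ++i) {
    sink = sink + fn();
  }
  const auto stop = clock::now();
  const std::chrono::duration<double, std::nano> elapsed = stop - start;
  return elapsed.count() / iterations;
}
```

A call such as `mean_ns_per_call([] { return trace_line(0, 0, 200, 75).size(); }, 100000)`, built once with `-O2` and once with `-O3`, would show whether the optimization level alone changes the cost of the loop. A real googlebenchmark setup would additionally handle warm-up and iteration count selection, which this sketch does not.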

beluga/include/beluga/algorithm/raycasting.hpp (outdated, resolved)
beluga/include/beluga/sensor/data/regular_grid.hpp (outdated, resolved)
Signed-off-by: Gerardo Puga <glpuga@ekumenlabs.com>
@nahueespinosa nahueespinosa changed the title from "Optimize Beam sensor model runtime performance" to "Optimize beam sensor model runtime performance" May 30, 2023
@glpuga glpuga merged commit 74ed2f5 into main May 30, 2023
5 checks passed
@glpuga glpuga deleted the glpuga/speed_up_beam branch May 30, 2023 14:22
glpuga added a commit that referenced this pull request Jun 4, 2023
Adds an updated report including the following changes from the last:

- Includes the changes merged in #195 #199 #200 #207 
- Measured using the 1x replay speed to prevent distortions to the CPU
results
- Fixes typos in configuration files, found during review

Signed-off-by: Gerardo Puga <glpuga@ekumenlabs.com>
3 participants