
[mppi] Investigate speed up potential of preallocated noise. #4232

Closed
aosmw opened this issue Apr 2, 2024 · 3 comments

Comments

@aosmw
Contributor

aosmw commented Apr 2, 2024

Feature request

Feature description

Add a noise_benchmark to help explore potential speed up of using an xt::adapt(ed) vector of pre-generated noise rather than lazily evaluating it with xt::random::randn.

Implementation considerations

  • Trade off memory usage for speed of operation?
  • How much noise is enough?
  • Acceptable startup delay to pre-generate noise?
  • Option to reuse pre-generated noise during optimiser reset?
  • Does it translate to an actual improvement?
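The xt::adapt idea can be sketched without pulling in xtensor: sample a large pool of Gaussian noise once at startup, then hand out reused windows of it each iteration instead of calling the RNG. This is a hypothetical illustration only — the `NoisePool` class and its API are invented for this sketch and are not the linked benchmark code:

```cpp
// Hypothetical sketch (not the linked benchmark code): trade startup time
// and memory for per-iteration speed by sampling a large Gaussian pool once,
// then serving reused windows of it instead of calling the RNG each cycle.
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

class NoisePool {
public:
  NoisePool(std::size_t pool_size, float stddev, unsigned seed = 42)
  : pool_(pool_size), offset_(0)
  {
    std::mt19937 gen(seed);
    std::normal_distribution<float> dist(0.0f, stddev);
    for (auto & v : pool_) {
      v = dist(gen);  // one-time startup cost ("acceptable startup delay?")
    }
  }

  // Return a pointer to n reusable samples; advance the offset so
  // consecutive calls yield different (though eventually repeating) noise.
  const float * next(std::size_t n)
  {
    assert(n <= pool_.size());
    if (offset_ + n > pool_.size()) {
      offset_ = 0;  // wrap around: "how much noise is enough?"
    }
    const float * view = pool_.data() + offset_;
    offset_ += n;
    return view;
  }

private:
  std::vector<float> pool_;
  std::size_t offset_;
};
```

In the actual benchmark such a buffer would presumably be wrapped with `xt::adapt` to obtain an xtensor view over the pre-generated samples without copying.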

Preliminary work

Code - https://github.com/aosmw/navigation2/blob/feature/mw/pregenerate_noise/nav2_mppi_controller/benchmark/noise_benchmark.cpp
Results - noise_benchmark.txt

2024-04-02T17:07:32+11:00
Running ./build/nav2_mppi_controller/benchmark/noise_benchmark
Run on (24 X 4600 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x12)
  L1 Instruction 32 KiB (x12)
  L2 Unified 1024 KiB (x12)
  L3 Unified 19712 KiB (x1)
Load Average: 0.38, 0.51, 0.45
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
------------------------------------------------------------------------------------
Benchmark                                          Time             CPU   Iterations
------------------------------------------------------------------------------------
BM_Noise_xt_random                            896127 ns       896116 ns          631
BM_Noise_xt_random_noalias                    885876 ns       885859 ns          762
BM_Noise_adapt_vector_1k                       81336 ns        81331 ns         8505
BM_Noise_adapt_vector_3k                       81357 ns        81355 ns         8602
BM_Noise_adapt_vector_shape_1k                 95690 ns        95684 ns         7114
BM_Noise_adapt_vector_shape_3k                 94969 ns        94966 ns         7315
BM_Noise_adapt_vector_shape_3k_adopt_once      81218 ns        81211 ns         8577
@SteveMacenski
Member

We are not lazily evaluating it with xt::random::randn. If you look at the data structures involved, it's immediately evaluated into the xtensor<float, 2>, not kept as an unevaluated expression. But I'm happy for any contributions you might be able to provide to accelerate that process :-) Perhaps some of the patterns here could be applied to other bottlenecks like trajectory rollouts.

Note that we have the regenerate_noises parameter that lets you select whether to regenerate noises each iteration or use one static set of noises for all iterations. After speaking with some folks about this, it's not ideal for small sample counts, but as the number of trajectories being evaluated increases, it becomes increasingly less important that the noise is resampled each time. Several companies that use MPPI in automotive applications told me they did not regenerate noises each iteration in their certified implementations, so I took my inspiration from them: "good enough for them, good enough for me".
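The two modes the regenerate_noises parameter selects can be illustrated with a minimal stand-in. This is a simplified sketch, not the real nav2_mppi_controller code; the `NoiseSource` struct is invented for illustration:

```cpp
// Hedged sketch of the two behaviours: either resample the noise buffer
// every optimizer iteration, or sample it once at construction and reuse
// the same static set for all iterations.
#include <cstddef>
#include <random>
#include <vector>

struct NoiseSource {
  bool regenerate_noises;  // named after the ROS parameter in the discussion
  std::vector<float> noises;
  std::mt19937 gen{123};
  std::normal_distribution<float> dist{0.0f, 1.0f};

  NoiseSource(bool regen, std::size_t n)
  : regenerate_noises(regen), noises(n)
  {
    resample();  // always sample at least once at startup
  }

  void resample()
  {
    for (auto & v : noises) {
      v = dist(gen);
    }
  }

  // Called once per optimizer iteration.
  const std::vector<float> & get()
  {
    if (regenerate_noises) {
      resample();  // fresh samples every iteration
    }
    return noises;  // otherwise reuse the static set
  }
};
```

With large sample counts the static set is the cheaper choice, which matches the rationale above.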

But your vector 1k/3k results being 10x faster seem really compelling. Can you open a PR to introduce that? I can test it quickly on my side and merge it. I would mention that your numbers don't align with my experience of the computational time of these operations (they're a bit low), so I wonder whether, in the full controller context, there are cache effects that account for some of that slowdown. But hey, saving 0.7ms is a worthwhile activity!

@SteveMacenski
Member

@aosmw any insights from your work?

@SteveMacenski
Member

Closing without update. The other threads with Eigen seem more relevant.
