
optimization slowdown since 4.1.0 with python #1165

Closed
shteren1 opened this issue Apr 10, 2022 · 13 comments · Fixed by #1168

Comments

@shteren1

shteren1 commented Apr 10, 2022

Description

Since gtsam 4.1.0 I have noticed a 2-3x slowdown when running LM optimization on a standard visual SLAM problem.

Steps to reproduce

  1. Create two new Python virtual environments: one with the gtsam==4.0.3 pip package, and one with any of the packages >=4.1.0.
  2. Copy the following file somewhere locally: https://github.com/shteren1/gtsam/blob/python_timing/python/gtsam_unstable/tests/python_timing.py
  3. Run the script and compare timings; I got ~9 seconds using gtsam==4.0.3 and ~22 seconds using 4.2a5.

This script generates 20 poses spaced uniformly on a circle of radius 20 m, all facing the center. It then generates 1000 landmarks around the center of the circle and creates measurements for them in each of the cameras represented by the poses.
It uses the generated landmark positions plus random noise, and the pose positions plus random noise, as the initial guess, and then optimizes.
The script is compatible with gtsam versions both before and after 4.1.
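The geometry described above (20 cameras on a 20 m circle, each looking at the origin, with landmarks scattered near the center) can be sketched with plain numpy. This is an illustrative reconstruction, not the original script; the function names here are made up for the example:

```python
import numpy as np


def camera_poses_on_circle(n_poses=20, radius=20.0):
    """Generate camera centers uniformly on a circle, each facing the origin."""
    poses = []
    for k in range(n_poses):
        theta = 2.0 * np.pi * k / n_poses
        center = np.array([radius * np.cos(theta), radius * np.sin(theta), 0.0])
        # Optical axis (third column, z) points from the camera toward the origin.
        z = -center / np.linalg.norm(center)
        x = np.array([-np.sin(theta), np.cos(theta), 0.0])  # tangent to the circle
        y = np.cross(z, x)
        R = np.column_stack([x, y, z])  # world-from-camera rotation
        poses.append((R, center))
    return poses


def landmarks_near_center(n_landmarks=1000, spread=5.0, seed=0):
    """Scatter landmarks in a box around the circle's center."""
    rng = np.random.default_rng(seed)
    return rng.uniform(-spread, spread, size=(n_landmarks, 3))
```

Adding noise to these ground-truth poses and points, projecting the landmarks into each camera, and feeding everything to a LevenbergMarquardtOptimizer would then complete a benchmark along the lines the author describes.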

Expected behavior

Runtime should be nearly the same, or ideally even better, with newer versions.

Environment

Tested on Ubuntu 18.04 using clean Python virtual environments.
EDIT:
Added priors on the first pose and landmark that were missing; the timing difference is now about 4x instead of the 2.5x it was before. With the priors it is also possible to use the GN and DL optimizers; without them they were throwing an indeterminate linear system error.

@shteren1 shteren1 changed the title optimization slowdown since 4.1.0 optimization slowdown since 4.1.0 with python Apr 10, 2022
@dellaert
Member

Could you share the timings for just the optimizer call on line 76? That will disambiguate whether this is a graph-construction issue in the Python wrapper or a true slowdown in optimization.

@shteren1
Author

@dellaert here are the results from those lines of code at the bottom; the timer samples only the optimizer.optimize() call:

from time import time

t1 = time()
result = optimizer.optimize()
print(f"optimizer ran: {optimizer.iterations()} iterations, time took: {time() - t1} seconds, initial error: {error0}, final error: {graph.error(result)}")

gtsam=4.2a5:
optimizer ran: 63 iterations, time took: 18.369078636169434 seconds, initial error: 7505.995184644437, final error: 5742.6712427961265

gtsam=4.0.3:
optimizer ran: 63 iterations, time took: 4.867237329483032 seconds, initial error: 7505.995184644437, final error: 5742.671242796135

@dellaert
Member

Wow! Well, that's certainly annoying. @ProfFan ?

@ProfFan
Collaborator

ProfFan commented Apr 12, 2022

Could be a problem with the switch away from the boost pool allocator. I already have some machinery for micro-profiling, but that will need to wait until after end-of-semester stuff....
I fixed it!

@ProfFan
Collaborator

ProfFan commented Apr 12, 2022

OK, most time is wasted throwing bad_casts....
[profiler screenshot: time spent in bad_cast throws]

@shteren1 Changing L69 to values.insert_point3 immediately gives me a 4-second execution time.
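The mechanism behind this finding can be illustrated in pure Python: dispatching by trying each concrete type in turn and throwing on failure (as the C++ bad_cast path does) is far costlier than a direct typed access. This is a hypothetical analogue for illustration only, not GTSAM code:

```python
import time


class PoseValue:
    """Wrapper holding a pose-like payload."""
    def __init__(self, v):
        self._pose = v


class PointValue:
    """Wrapper holding a point-like payload."""
    def __init__(self, v):
        self._point = v


def get_generic(value):
    """Try each concrete type in turn; a failed 'cast' raises, like bad_cast."""
    try:
        return value._pose
    except AttributeError:
        pass
    try:
        return value._point
    except AttributeError:
        raise TypeError("unknown value type")


def get_typed(value):
    """The insert_point3-style path: type known up front, no exceptions."""
    return value._point


values = [PointValue(i) for i in range(100_000)]

t0 = time.perf_counter()
for v in values:
    get_generic(v)
generic_s = time.perf_counter() - t0

t0 = time.perf_counter()
for v in values:
    get_typed(v)
typed_s = time.perf_counter() - t0
```

On a typical run the exception-raising path is noticeably slower than the typed path, which mirrors why avoiding the trial-and-error casts sped up the benchmark.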

@shteren1
Author

@ProfFan Thanks! I wasn't aware of the values.insert_point3 option at all; it didn't exist in 4.0.3 and I didn't see it in any example.
After the pull request this won't be necessary, right?
I also tried using insert_pose3 for the poses, but it doesn't seem to make any difference to the runtime. Why is that different from point3? Is it because the poses are still a gtsam class and not an Eigen vector like point3?

@ProfFan
Collaborator

ProfFan commented Apr 12, 2022

@shteren1 You don't need to change any code, the PR will fix the slowdown :)

@ProfFan
Collaborator

ProfFan commented Apr 12, 2022

Yeah, the reason is that we need to handle arbitrarily sized matrices and vectors, but not arbitrarily sized poses, so the pose code is all fixed-size and fine....

@ProfFan
Collaborator

ProfFan commented Apr 12, 2022

So the final verdict is: if you use insert_point3 (which is guaranteed to be a static 3x1 vector) you get 4.50 s of runtime.
If you use insert after the fix PR you get 4.80 s of runtime. The difference is caused by 2 extra dynamic_casts.

So if your code is not too performance-sensitive, just using insert for everything is fine.

@dellaert
Member

I think we should not point people to insert_point3, which is an accidental name generated by pybind11. I would support adding explicit insertPoint2 and insertPoint3 methods and properly documenting their reason for existence. @ProfFan, would you be willing to add that in the same PR?

@ProfFan
Collaborator

ProfFan commented Apr 14, 2022

@shteren1 I think you can now use insertPoint3 if you build from develop

@shteren1
Author

shteren1 commented Apr 14, 2022

@ProfFan thanks! I'll try it out ASAP.

BTW, how did you run that profiling? It looks useful.

Edit:
Compiled the new source and all seems to work: using the regular insert still results in reasonable runtime, and using insertPoint3 works and shaves off 4-5% of the runtime.

@ProfFan
Collaborator

ProfFan commented Apr 15, 2022

@shteren1 That one is done using Apple's Instruments app. Just compile, then use Instruments to run the resulting binary.
