Add evaluate method to graph #151
Conversation
```cpp
    const ceres::Problem::EvaluateOptions& options) const
{
  ceres::Problem problem(problem_options_);
  createProblem(problem);
```
Side comments:
- I wonder if there's any plan to `createProblem` only once and re-use the problem between optimization cycles, and also when retrieving the covariance, so the problem can be built only once for `optimize` and `getCovariance`.
- Similarly, that should allow incrementally removing and adding constraints (residual blocks) and variables (parameter blocks) to the problem as new measurements come in and old ones are marginalized out.
- The `fuse_graphs::HashGraph` constraints are stored in an `std::unordered_map`. I suspect that means every time the `ceres::Problem` is created, the constraints could have a different order, which could give different sparsity patterns and therefore different convergence and performance.
- In addition to the last comment, with an `std::unordered_map` I suspect the order in which the constraints are inserted is ignored, but in some cases that order reflects some structure in the problem that could give better sparsity, and therefore better convergence and performance. I believe the same applies to the variables, which are also stored in an `std::unordered_map`.
1./2. When I originally started the fuse project, that was my plan. However, there was a Google Groups Ceres Solver question asking about this, and the suggestion was to generate the problem from scratch every time. Basically, the solver time far exceeds the problem creation time, so why bother. I, of course, cannot find that post right now, but I will keep looking.
If we do want to incrementally update a Problem object, there are some problem settings that would need to be enforced...which is fine.
- `Problem::Options::enable_fast_removal`
- http://ceres-solver.org/nnls_modeling.html#_CPPv2N5ceres7Problem20RemoveParameterBlockEPd
I haven't done any profiling recently, so I don't know how much time is consumed recreating the problem every iteration. I also recall reading somewhere that enabling the fast removal flag turns on additional indexing inside of Ceres, so there is some sort of performance hit. These things would all need to be tested, but I am definitely not opposed to the idea. Basically the Graph object could own a Problem object and keep it in sync with its own internal structures. The "keeping in sync" is the only tricky part, particularly in the face of exceptions.
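The "keeping in sync" bookkeeping described above could look roughly like the following sketch. This is a hypothetical illustration, not fuse code: the actual `ceres::Problem` calls (`AddResidualBlock`/`RemoveResidualBlock`, with `Problem::Options::enable_fast_removal = true`) are replaced by a stub so the pattern stays self-contained, and all names here are made up.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

// Stub standing in for a long-lived ceres::Problem with fast removal enabled.
struct StubProblem
{
  int next_id = 0;
  std::unordered_map<int, std::string> blocks;  // residual block id -> constraint uuid

  int addResidualBlock(const std::string& constraint_uuid)
  {
    blocks[next_id] = constraint_uuid;
    return next_id++;
  }
  void removeResidualBlock(int id) { blocks.erase(id); }
};

// The Graph would own the problem plus a map from constraint UUID to the
// residual block id, updating both together on insert and marginalization.
class IncrementalGraph
{
public:
  void addConstraint(const std::string& uuid)
  {
    block_ids_[uuid] = problem_.addResidualBlock(uuid);
  }
  void removeConstraint(const std::string& uuid)
  {
    auto iter = block_ids_.find(uuid);
    if (iter != block_ids_.end())
    {
      problem_.removeResidualBlock(iter->second);
      block_ids_.erase(iter);  // erase the index entry only after the removal succeeds
    }
  }
  std::size_t size() const { return problem_.blocks.size(); }

private:
  StubProblem problem_;
  std::unordered_map<std::string, int> block_ids_;
};
```

The ordering of operations in `removeConstraint` hints at the exception-safety concern: if the problem-side removal throws, the index map is left untouched and the two structures stay consistent.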
3./4. My original vision of fuse is having multiple sensor model plugins running asynchronously, each one sending cost functions and variables to the optimizer on its own schedule. In such a scheme, arrival order is nearly meaningless. Minute differences in timing could cause CostB to be inserted before CostA.
On top of that, Ceres will compute an "optimal" variable elimination order and reorder the variables as it sees fit. There may be cases where the optimal order has a tie between multiple variables and the variable insertion order may play some role in deciding the relative order of these equivalent variables. I honestly have no idea how deterministic the "tie" cases are.
If the problem is well-constrained, these small changes in ordering should not result in a significant change in the output solution. If the problem is not well constrained, then many subtle things can have a noticeable impact on the output.
So my stance is that the reordering that might be caused by future insertions into the `unordered_map`s should not be a concern, and that the speed improvement gained by using the `unordered_map`s is well worth that trade-off.
That said, HashGraph is only one possible implementation of the Graph interface. This was done intentionally to allow multiple data storage designs to be possible. For example, I suspected that a flatmap Graph implementation could outperform the unordered maps in a fixed-lag smoother context. If you are interested in rolling a new derived Graph type that maintains a fixed variable order, I'm all for it.
If you do want that, let me know. Having lived with the Graph interface for a while now, I feel it is overly cumbersome. There should be a way to refactor the Graph class to implement most of the interface in the base class, leaving just a few lookup/insert functions for the derived class to handle.
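The refactor hinted at above could be sketched like this: the base class implements the bulk of the interface in terms of a couple of small virtual lookup/insert hooks, so a derived type only chooses the storage. This is a hypothetical sketch, not the actual fuse `Graph` API; all class and method names here are invented for illustration.

```cpp
#include <map>
#include <string>
#include <unordered_map>

// Base class implements shared logic once, on top of two storage hooks.
class GraphBase
{
public:
  virtual ~GraphBase() = default;

  // The only methods a derived class must supply:
  virtual void insertVariable(const std::string& uuid, double value) = 0;
  virtual const double* findVariable(const std::string& uuid) const = 0;

  // Shared logic implemented in the base class in terms of the hooks:
  bool variableExists(const std::string& uuid) const
  {
    return findVariable(uuid) != nullptr;
  }
  double variableValue(const std::string& uuid) const
  {
    const double* value = findVariable(uuid);
    return value ? *value : 0.0;
  }
};

// HashGraph-like derived type: unordered storage, fast average-case lookup.
class HashGraphLike : public GraphBase
{
public:
  void insertVariable(const std::string& uuid, double value) override { variables_[uuid] = value; }
  const double* findVariable(const std::string& uuid) const override
  {
    auto iter = variables_.find(uuid);
    return iter == variables_.end() ? nullptr : &iter->second;
  }

private:
  std::unordered_map<std::string, double> variables_;
};

// Ordered derived type: maintains a fixed, deterministic variable order.
class OrderedGraphLike : public GraphBase
{
public:
  void insertVariable(const std::string& uuid, double value) override { variables_[uuid] = value; }
  const double* findVariable(const std::string& uuid) const override
  {
    auto iter = variables_.find(uuid);
    return iter == variables_.end() ? nullptr : &iter->second;
  }

private:
  std::map<std::string, double> variables_;
};
```

With this shape, a derived Graph that maintains a fixed variable order only swaps the container; everything else stays in the base class.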
1./2. FYI I did some profiling some time ago and the results agree with your comment. The most expensive parts are the optimization and the covariance computation. Creating the problem is actually relatively fast compared to those two.
I'm not sure, though, whether the covariance computation could be faster if it were performed right after the optimization, without having to re-create the problem. 🤔
As you said, it's something that we'll have to test. I might give it a try in the future, but I don't really have time atm, and it doesn't look like a trivial amount of changes.
3./4. Thanks for the clarification regarding the fact that Ceres computes an "optimal" variable elimination ordering. I wasn't aware of that. Using an `unordered_map` sounds like a good choice then.
Re. the flatmap Graph implementation, it seems like we'll have to experiment with it to see if it's really faster. I've tried to find a benchmark that reflects the fixed_lag_smoother use case, which I believe is as follows:
- Size (number of entries): ~1k, although this depends on the number of sensor models, their frequency and the number of constraints they generate, as well as the lag duration, and likely other factors.
- Key type: `std::string`, although we actually use `fuse_core::UUID`, but I think it'd be equivalent. 🤔
- Insertions: ~10, although this also depends on the number of sensor models and the number of constraints they generate.
- Deletions: same as insertions.
- Lookup queries: not sure, probably `N * k` where `N` is the number of constraints and `k` is the average number of variables a constraint has. I'm thinking of the worst case, where we iterate over all constraints and query for the constraint variables.
These are some benchmarks I found:
- https://tessil.github.io/2016/08/29/benchmark-hopscotch-map.html
- http://incise.org/hash-table-benchmarks.html
But I don't think they reflect the use case I described above. So we'll likely have to experiment and benchmark things ourselves.
I wonder what your first flatmap implementation candidate to try would be. I was thinking of the `absl::flat_hash_map` from https://abseil.io/docs/cpp/guides/container
Either way, from my profiling results this didn't show up as a hotspot, so the improvement would be marginal, since the computation time is dominated by the Ceres optimization inside `optimize` and the covariance computation inside `getCovariance`.
It doesn't look like something too hard to implement, though, but I'm afraid I won't have time to do it in the near future. 😃
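A homegrown benchmark along the lines of the workload described above could start from a harness like this. It is only a sketch: `absl::flat_hash_map` would pull in Abseil as a dependency, so `std::map` stands in here as the ordered alternative, and the function names are invented; the harness shape is the point, not these particular containers.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <unordered_map>
#include <utility>

// Time num_keys successful lookups in any map supporting .at(); returns nanoseconds.
template <typename Map>
long long lookupAllNanos(const Map& map, int num_keys)
{
  const auto start = std::chrono::steady_clock::now();
  long long sum = 0;
  for (int i = 0; i < num_keys; ++i)
  {
    sum += map.at("key" + std::to_string(i));  // every key is known to exist
  }
  const auto stop = std::chrono::steady_clock::now();
  volatile long long sink = sum;  // keep the lookup loop from being optimized away
  (void)sink;
  return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Build both containers with identical contents (~1k entries, as in the use
// case above) and time the same lookup-heavy pass over each.
std::pair<long long, long long> compareLookups(int num_keys)
{
  std::unordered_map<std::string, int> unordered;
  std::map<std::string, int> ordered;
  for (int i = 0; i < num_keys; ++i)
  {
    unordered["key" + std::to_string(i)] = i;
    ordered["key" + std::to_string(i)] = i;
  }
  return { lookupAllNanos(unordered, num_keys), lookupAllNanos(ordered, num_keys) };
}
```

Calling `compareLookups(1000)` and printing both timings would give a first rough comparison; interleaving the ~10 insertions/deletions per cycle, as described above, would make it more representative.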
Re. the Graph interface, I was indeed wondering why `createProblem` wasn't in the base class, but then I saw the `variables_on_hold_` are only in the child `HashGraph` class. Not sure what would be the right approach to handle those. 🤔
Why are the checks failing? Locally they pass, but I had to remove the old
A whole bunch of this:
I can paste the whole error output if you want, but it is this same error triggered from several different places in the code.
Force-pushed e500a2b to 07e5c11
Thanks. It should pass now. BTW, for some reason I couldn't add you as reviewers in this PR, @svwilliams @ayrton04
```cpp
 * See https://ceres-solver.googlesource.com/ceres-solver/+/master/include/ceres/problem.h#401
 * @return True if the problem evaluation was successful; False, otherwise.
 */
virtual bool evaluate(double* cost, std::vector<double>* residuals = nullptr, std::vector<double>* gradient = nullptr,
```
I'd love to have `std::optional`s here, but that is a C++17 feature. Maybe I'll update this one day when C++17 is a minimum requirement.
This allows evaluating the problem for the current set of variable values and obtaining the cost of the graph/problem, as well as the individual residuals. The motivation is to analyse the influence of different residuals on the overall cost, with or without the influence of loss functions, if any, and per source and/or constraint/residual type.
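The intended call pattern could look like the toy sketch below. This is not the PR's implementation: the real method builds a `ceres::Problem` and forwards to `ceres::Problem::Evaluate()`, while here a stub graph with fixed residual values stands in, using the Ceres convention that the cost is half the squared residual norm. `StubGraph` and its contents are invented for illustration.

```cpp
#include <utility>
#include <vector>

// Stub graph illustrating the new evaluate() call pattern.
class StubGraph
{
public:
  explicit StubGraph(std::vector<double> residuals) : residuals_(std::move(residuals)) {}

  // Mirrors the PR's shape: output arguments are optional (pass nullptr to skip).
  bool evaluate(double* cost, std::vector<double>* residuals = nullptr) const
  {
    if (cost)
    {
      double sum = 0.0;
      for (double r : residuals_)
      {
        sum += r * r;
      }
      *cost = 0.5 * sum;  // Ceres convention: cost = 0.5 * sum(r_i^2)
    }
    if (residuals)
    {
      *residuals = residuals_;
    }
    return true;
  }

private:
  std::vector<double> residuals_;
};
```

For example, for residuals {3.0, 4.0} the returned cost is 0.5 * (9 + 16) = 12.5, and requesting the residuals vector lets a caller attribute that cost to individual constraint/residual types, as described above.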