[RST-2202] Catch potential errors when computing the covariances #18

svwilliams · 2019-07-08T15:49:12Z

Several possible errors can cause the covariance computation to throw. One source is numerical stability/accuracy issues, which are a runtime phenomenon and cannot be solved by code fixes elsewhere. We should catch those errors and display a message rather than letting the exception propagate and take down the whole state estimation stack.

It is an open question for me whether we should publish the state estimate without the covariance information in this case, or issue the warning and not publish. Some things (probably most things) only care about the state estimate itself and ignore the covariance anyway, so not publishing due to a missing covariance may be overly conservative. On the other hand, failure to compute a covariance indicates some sort of issue with the graph (improperly constructed, numerical stability issues, etc.), and the state estimate may be garbage even if it exists. Interested in hearing opinions.

efernandez · 2019-07-08T16:41:45Z

src/odometry_2d/publisher.cpp

+      ROS_FATAL_STREAM("An error occurred computing the covariance information for " << latest_stamp_ << ":\n" <<
+                       e.what());
+    }
+    // TODO(swilliams): Should we always publish the odometry message, even if the covariance information is missing?


This will publish the last good covariance. If we're going to publish when the covariance computation fails, we should set the odom_output_ covariances to all zeros (or any invalid covariance that can be detected on the consumer side).

I lean towards publishing even if the covariance couldn't be computed. In that case, I'd do:

1. When the covariance computation fails, set the odom_output_.pose.covariance or odom_output_.twist.covariance to all zeros. Any consumer should check the covariance is valid, so that way it's possible to detect the covariance couldn't be computed and either use the pose or disregard it.

2. Request the pose and twist covariance in two separate try-catch blocks. I guess there are cases where only one of them can't be computed.
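
A minimal sketch of the "clear on failure" idea, assuming one combined covariance computation rather than two separate try-catch blocks. The types and names here (Covariance6x6, OdomOutput, CovarianceSolver, updateCovariances) are hypothetical stand-ins so the example compiles without ROS; they are not the actual publisher code.

```cpp
#include <array>
#include <functional>
#include <iostream>
#include <stdexcept>

// Hypothetical stand-in for the covariance fields of an odometry message:
// a row-major 6x6 matrix stored as 36 doubles.
using Covariance6x6 = std::array<double, 36>;

// Hypothetical stand-in for the odom_output_ member discussed above.
struct OdomOutput
{
  Covariance6x6 pose_covariance{};
  Covariance6x6 twist_covariance{};
};

// Hypothetical callback that fills both covariances in one call and throws
// on failure (for example, on a rank-deficient Jacobian).
using CovarianceSolver = std::function<void(Covariance6x6& pose, Covariance6x6& twist)>;

// Returns true if the covariances were computed. On failure, logs the error
// and zeroes both covariances so a stale "last good" value is never reused.
bool updateCovariances(const CovarianceSolver& solver, OdomOutput& output)
{
  try
  {
    solver(output.pose_covariance, output.twist_covariance);
    return true;
  }
  catch (const std::exception& e)
  {
    std::cerr << "An error occurred computing the covariance information:\n"
              << e.what() << std::endl;
    // The solver may have partially written its outputs before throwing,
    // so reset both blocks unconditionally.
    output.pose_covariance.fill(0.0);
    output.twist_covariance.fill(0.0);
    return false;
  }
}
```

The caller can still publish the message after a false return; the all-zero covariance is what tells the consumer that the computation failed.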

In theory, as you're saying, if the covariance couldn't be computed, it means that something is wrong with the problem we're trying to optimize and the solution found is likely wrong. I think it's better to still publish and explicitly tell the consumer the problem couldn't be solved (or at least the covariance couldn't be computed), as opposed to not publishing, which could also mean that something else happened (e.g. it took too long to publish, a configuration issue, ...).

That being said, if I had to implement a consumer of this output, TBH I wouldn't use the pose or twist if their covariance couldn't be computed. In the best case, the covariance is just rank deficient, so some components of the pose or twist are likely close to the optimum while others aren't. Even in that case, I don't see myself using only some pose or twist components.
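
On the consumer side, the check could look roughly like the sketch below. It assumes the "all zeros means not computed" convention proposed above; isCovarianceUsable and Covariance6x6 are hypothetical names, not part of any existing API. It deliberately does not require every diagonal entry to be positive, since a 2D estimator may legitimately leave unused dimensions at zero.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

// Hypothetical stand-in for a 6x6 row-major covariance stored as 36 doubles.
using Covariance6x6 = std::array<double, 36>;

// Returns false for the all-zero sentinel and for any matrix containing
// NaN or infinity; returns true otherwise.
bool isCovarianceUsable(const Covariance6x6& covariance)
{
  const bool all_finite = std::all_of(covariance.begin(), covariance.end(),
                                      [](double v) { return std::isfinite(v); });
  const bool any_nonzero = std::any_of(covariance.begin(), covariance.end(),
                                       [](double v) { return v != 0.0; });
  return all_finite && any_nonzero;
}
```

A consumer that follows the conservative policy described above would drop the whole pose or twist whenever this returns false.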

Clearing the covariance is definitely a good call.

I'm somewhat hesitant to separate the covariance computation into two operations. This is a heavy computation, and is much more efficient to perform as a single operation rather than two. I'm also unsure if the current "rank deficient" error is affected by which covariance blocks are requested. I'll do some digging in the Ceres code. If the "rank deficient" error will occur regardless of what covariance blocks are requested, I am going to leave the computation as a single operation. If the covariance block request might affect the "rank deficient" error, we can discuss whether breaking them into two operations is worth the performance hit or not.

Ceres forms a (usually sparse) representation of the entire Jacobian, then performs a QR decomposition on the full Jacobian to determine the rank. This occurs regardless of which specific covariance blocks were requested. As such, I'm going to leave the position and velocity covariance requests in a single call to getCovariance().
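
For readers following along, a rough sketch of why the requested blocks should not affect the rank check, based on a reading of the Ceres covariance documentation. The symbols J, Q, R, and the selection matrices S_a, S_b are notation introduced here for illustration; they do not appear in the PR.

```latex
% J: Jacobian of all residuals with respect to all parameters at the solution.
% Thin QR factorization: J = Q R, with Q^\top Q = I.
\Sigma = \left(J^\top J\right)^{-1} = \left(R^\top R\right)^{-1}

% \Sigma exists only if J (equivalently R) has full column rank.
% A requested covariance block is a sub-block of the full \Sigma,
% picked out by selection matrices S_a and S_b:
\Sigma_{ab} = S_a \, \Sigma \, S_b^\top
```

Under this view, the rank test applies to the whole of J before any block is extracted, which is consistent with keeping a single covariance request.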

ayrton04

Would it be valuable to add a debug parameter to optionally serialize/save the graph when this happens?

svwilliams · 2019-07-09T14:29:01Z

In an ideal world, maybe. And while I hacked together a serialize function to debug this specific problem, it is not general purpose. I need to revive the graph serialization effort at some point.

Catch potential errors when computing the covariances

864fcf1

svwilliams requested review from matthewbwhitaker and ayrton04 July 8, 2019 15:49

efernandez suggested changes Jul 8, 2019

Clear the covariance on error

2c0032d

ayrton04 approved these changes Jul 9, 2019

matthewbwhitaker approved these changes Jul 9, 2019

svwilliams merged commit f7d23ff into devel Jul 9, 2019

svwilliams deleted the RST-2202-underconstrained-error branch July 9, 2019 15:16
