
Filter out transactions older than the lag window #173

Merged 8 commits on Apr 25, 2020

Conversation

svwilliams (Contributor):

There is a bug where old sensor data is getting added to the fixed-lag smoother. This PR fixes that by correctly filtering out old sensor transactions before they are inserted into the graph. For thread-synchronization reasons, this is done inside the optimization thread rather than in the ROS callback thread.

Review comment on fuse_optimizers/src/fixed_lag_smoother.cpp (outdated):
@@ -338,7 +342,20 @@ void FixedLagSmoother::processQueue(fuse_core::Transaction& transaction)
  while (transaction_riter != pending_transactions_.rend())
  {
    auto& element = *transaction_riter;
    if (std::find(sensor_blacklist.begin(), sensor_blacklist.end(), element.sensor_name) != sensor_blacklist.end())
    auto min_stamp = element.stamp();
    if (!element.transaction->involvedStamps().empty() && *element.transaction->involvedStamps().begin() < min_stamp)
efernandez (Collaborator):

I wonder what the relation is between the transaction stamp() and the involvedStamps(). IMHO it's probably better to make the transaction stamp() return *involved_stamps_.begin(), or ros::Time(0, 0) if involved_stamps_.empty(). I'm not sure there's a valid case where stamp() could or should differ from either of those two. 🤔

Why not check only the involved stamps? Would it matter if there's no involved stamp? I believe that case is already handled in:

const auto& stamps = transaction.involvedStamps();
if (stamps.empty())
{
return;
}

efernandez (Collaborator):

Maybe add this method to Transaction:

const ros::Time& Transaction::oldestStamp() const
{
  if (involved_stamps_.empty())
  {
    return stamp_;
  }
  else
  {
    return std::min(*involved_stamps_.begin(), stamp_);
  }
}

and this to the TransactionQueueElement:

const ros::Time& oldestStamp() const { return transaction->oldestStamp(); }

So you can do this here: const auto& min_stamp = element.oldestStamp();

If we get rid of stamp_, though, that could be further simplified, but I'm still not sure about that part. The question is: should the transaction stamp always be the oldest involved stamp? Or zero if there are no involved stamps, which is the current behaviour? 🤔

svwilliams (Contributor, Author):

I like the concept of the transaction having an "oldestStamp()" method. That cleans up this code nicely, and insulates it from future changes.

As for the Transaction stamp vs. involved_stamps, I'm honestly not sure what the correct answer is. In my mind, I want the transactions applied in the order they were measured/computed/etc. So in the case of relative poses, the pose1->pose2 measurement will occur at the pose2 timestamp. So the transaction stamp == pose2 stamp, but the involved stamps will contain {pose1 stamp, pose2 stamp}. But I'm honestly not sure the application order matters at all; it just felt like the thing to do at the time.

I guess one small difference will be transaction timeout. If we always use the oldest stamp, a transaction could timeout instantly. We could add a newestStamp() method to the transaction as well, and use the newest stamp for timeout purposes.
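
A sketch of what newestStamp() might look like, mirroring the oldestStamp() suggestion above (this assumes involved_stamps_ is an ordered set, as that suggestion does; it is not code from this PR):

const ros::Time& Transaction::newestStamp() const
{
  if (involved_stamps_.empty())
  {
    return stamp_;
  }
  else
  {
    // involved_stamps_ is ordered, so rbegin() points at the newest involved stamp
    return std::max(*involved_stamps_.rbegin(), stamp_);
  }
}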

efernandez (Collaborator):

> I guess one small difference will be transaction timeout. If we always use the oldest stamp, a transaction could timeout instantly. We could add a newestStamp() method to the transaction as well, and use the newest stamp for timeout purposes.

Yes, I added newestStamp() in #176, but the transaction timeout logic I implemented actually compares against the oldest stamp of each transaction.

I think you're right. The timeout logic should use the newestStamp() of all transactions. That would require more changes, since we can't rely on the pending_transactions_ ordering if it's based on the oldestStamp(), as I did in #176.

efernandez (Collaborator):

This should be good for now, for this PR. I think this can be merged as is, and we can continue the discussion and fix the remaining issues in #176.

efernandez (Collaborator) commented Apr 13, 2020:

After some recent PRs got merged, I've noticed that sometimes, after setting a pose, this WARN message is printed:

[INFO]: Received a set_pose request (stamp: 173.570000000, x: 1.4, y: 0.525, yaw: -0.0350735)
[WARN ]: Unable to locate a state in this history with stamp <= 173.540000000. Variables will all be initialized to 0.

I just tried with #172 and #173, but I still see that message. Not sure if these PRs are supposed to get rid of that or not though. 🤔

I'm also getting:

[ERROR]: The queued transaction with timestamp 2589.220000000 from sensor laser_odometry is not an ignition sensor transaction. This transaction will not be processed individually.

I'm starting to suspect there's something wrong in the ignition logic I changed recently. 🤔

Update

@svwilliams The issue is that for differential/relative constraints, the transaction stamp() is that of the newest variable:

transaction->stamp(msg->header.stamp);

Since the pending_transactions_ are ordered by transaction stamp() instead of the minimum/oldest involvedStamps():

// Add the new transaction to the pending set
// The pending set is arranged "smallest stamp last" to make popping off the back more efficient
auto comparator = [](const ros::Time& value, const TransactionQueueElement& element)
{
  return value >= element.stamp();
};
auto position = std::upper_bound(
  pending_transactions_.begin(),
  pending_transactions_.end(),
  transaction->stamp(),
  comparator);
pending_transactions_.insert(position, {sensor_name, std::move(transaction)});  // NOLINT

When an ignition transaction is received, the transactions with differential/relative constraints don't get purged, because the minimum/oldest involvedStamps() is not considered in that check.

I'm still not sure if we should use the minimum/oldest involvedStamps() always, for the transaction stamp(). We probably want the newest stamp in the pending_transactions_, but not when purging. Thoughts? 🤔
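
For illustration, ordering the queue by the oldest stamp instead would only mean swapping the key in the insertion code above (oldestStamp() here is the helper proposed earlier in this thread, not an existing Transaction method):

auto comparator = [](const ros::Time& value, const TransactionQueueElement& element)
{
  // Order by the oldest involved stamp rather than the transaction stamp
  return value >= element.oldestStamp();
};
auto position = std::upper_bound(
  pending_transactions_.begin(),
  pending_transactions_.end(),
  transaction->oldestStamp(),
  comparator);
pending_transactions_.insert(position, {sensor_name, std::move(transaction)});  // NOLINT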

This is definitely an issue, but I believe it's not related to any recent changes. It's probably always been an issue, but no warn/error messages were being printed in relation to it.

I've created #176 to address these issues.

@@ -204,9 +207,10 @@ void FixedLagSmoother::optimizationLoop()
    // Optimization is complete. Notify all the things about the graph changes.
    notify(std::move(new_transaction), graph_->clone());
    // Compute a transaction that marginalizes out those variables.
    lag_expiration = computeLagExpirationTime();
efernandez (Collaborator):

Why is this done here, and not before processQueue?

It seems to me that the lag_expiration should be computed before, so it's updated after a reset:

timestamp_tracking_.clear();

Maybe lag_expiration needs to be an attribute. 🤔

svwilliams (Contributor, Author):

So.....

A more straightforward processing regime would be:

  • Compute lag expiration
  • Compute marginal (This transaction will remove variables before the lag expiration)
  • Add the marginal transaction to the graph (This actually removes those old variables)
  • Add new transactions to the graph (As long as they don't involve old variables)
  • Optimize the graph
  • Notify publishers

However, that puts additional marginalization processing at the top of the loop. In an attempt to reduce publication latency, I shifted the above sequence by half a loop.
On loop 1:

  • Add new transactions to the graph
  • Optimize the graph
  • Notify publishers
  • Compute lag expiration
  • Compute marginal (Note that we do not apply the marginal here)

Then on loop 2+:

  • Add the marginal transaction to the graph (removing the old variables computed during the previous iteration)
  • Add new transactions to the graph (as long as they don't involve those old variables)
  • Optimize the graph
  • Notify publishers
  • Compute lag expiration
  • Compute the next marginal

In the above scheme, the marginalization processing happens after the publishers are notified about the new graph values, under the theory that we have some spare time before the next optimization loop is triggered.
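
In code, the shifted loop body looks roughly like this (a condensed sketch of optimizationLoop(), not the verbatim source; synchronization, error handling, and the reset logic are omitted):

while (ros::ok() && optimization_running_)
{
  // ... wait until the transaction callback signals that an optimization cycle is due ...
  auto new_transaction = fuse_core::Transaction::make_shared();
  processQueue(*new_transaction);
  // Merge in the marginal transaction computed at the end of the previous cycle
  new_transaction->merge(marginal_transaction_);
  graph_->update(*new_transaction);
  graph_->optimize();
  // Notify the publishers first...
  notify(std::move(new_transaction), graph_->clone());
  // ...then use the spare time before the next cycle to prepare the next marginal
  lag_expiration = computeLagExpirationTime();
  marginal_transaction_ = fuse_constraints::marginalizeVariables(
    ros::this_node::getName(),
    computeVariablesToMarginalize(lag_expiration),
    *graph_);
}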

I could apply the marginal transaction at the end of the loop without optimizing the graph. That would clean up the code a bit, but it would leave the graph object in a different state than what was sent to the publishers, etc. I think that would be fine. But that does not change where in the loop we compute the lag expiration time.

The question you raise is, can we do this instead:

  • Compute lag expiration
  • Add new transactions to the graph
  • Optimize the graph
  • Notify publishers
  • Compute marginal
  • Add the marginal transaction to the graph

Existing sequence, iteration 1:

  • Add t1, t2, t3 to the graph
  • Optimize and publish (graph contains t1, t2, t3)
  • expiration = t1
  • Compute marginal for t1

Existing sequence, iteration 2:

  • Apply marginal for t1
  • Add new after t1 (t3, t4, t5)
  • Optimize and publish (graph contains t2, t3, t4, t5)
  • expiration = t3
  • Compute marginal for t2, t3

Existing sequence, iteration 3:

  • Apply marginal for t2, t3
  • Add new after t3 (t6, t7, t8)
  • Optimize and publish (graph contains t4, t5, t6, t7, t8)
  • expiration = t6
  • Compute marginal for t4, t5, t6

Proposed sequence, iteration 1:

  • expiration = t0 (The graph is empty)
  • Add t1, t2, t3 to the graph
  • Optimize and publish (graph contains t1, t2, t3)
  • Compute marginal for t0 (which is empty)
  • Apply marginal for t0 (which is empty)

Proposed sequence, iteration 2:

  • expiration = t1 (Graph contains t1, t2, t3)
  • Add new after t1 (t3, t4, t5)
  • Optimize and publish (graph contains t1, t2, t3, t4, t5)
  • Compute marginal for t1
  • Apply marginal for t1 (Graph contains t2, t3, t4, t5)

Proposed sequence, iteration 3:

  • expiration = t3 (Graph contains t2, t3, t4, t5)
  • Add new after t3 (t6, t7, t8)
  • Optimize and publish (graph contains t2, t3, t4, t5, t6, t7, t8)
  • Compute marginal for t2, t3
  • Apply marginal for t2, t3 (Graph contains t4, t5, t6, t7, t8)

Assuming I did the above correctly, the proposed sequence ends each iteration with the same graph state, but results in more variables than needed during each optimization call. Unless there is a compelling reason to move that computation, I feel like the current sequence is correct.

But I agree there is a bug related to the reset function.

efernandez (Collaborator):

> Assuming I did the above correctly, the proposed sequence ends each iteration with the same graph state, but results in more variables than needed during each optimization call. Unless there is a compelling reason to move that computation, I feel like the current sequence is correct.

Thanks for the detailed explanation. LGTM.

> But I agree there is a bug related to the reset function.

Actually, it looks like even if the lag_expiration is not reset, the logic should still work, since the transactions' timestamps would still be newer than it. At least, in my tests with #176 it looks like everything was already fine and no change was required in this regard.

svwilliams (Contributor, Author):

Something bad could probably still happen if you reset a bagfile or something and time moves backwards. I'll put in a fix regardless; it should be reset for consistency.
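
A minimal sketch of that fix, assuming lag_expiration is promoted to a member variable as suggested above (the onReset() name and the lag_expiration_ member are illustrative, not the actual fuse API):

void FixedLagSmoother::onReset()
{
  // Clear the motion model history and the cached expiration time, so a
  // backwards time jump (e.g. a restarted bagfile) cannot leave a stale
  // lag_expiration_ that is newer than every incoming transaction
  timestamp_tracking_.clear();
  lag_expiration_ = ros::Time(0, 0);
}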

efernandez (Collaborator):

Makes sense. 👍

svwilliams (Contributor, Author):

> [INFO]: Received a set_pose request (stamp: 173.570000000, x: 1.4, y: 0.525, yaw: -0.0350735)
> [WARN ]: Unable to locate a state in this history with stamp <= 173.540000000. Variables will all be initialized to 0.

I suspect what is going on is that the ignition transaction is added with stamp 173.57. That is now the current stamp, making the lag_expiration stamp = 173.57 - lag_duration, which let's say is 172.57. The older involved stamp of the relative pose constraint is 173.54, which is before the start_time but after the lag_expiration. The lag_expiration should not be allowed to be before the start_time. Let me go try to fix that.
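
Something along these lines should do it (a sketch of the clamped computation; getStartTime() and timestamp_tracking_.currentStamp() are assumed helpers here):

ros::Time FixedLagSmoother::computeLagExpirationTime() const
{
  // Find the most recent variable timestamp, but never let the lag window
  // extend to before the start time
  auto start_time = getStartTime();
  auto now = timestamp_tracking_.currentStamp();
  return (start_time + params_.lag_duration < now) ? now - params_.lag_duration : start_time;
}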

efernandez (Collaborator):

> The lag_expiration should not be allowed to be before the start_time. Let me go try to fix that.

That makes sense, and it should get rid of some corner cases.

However, it won't cover the cases where:

  • Ignition transaction was stamp 10.0
  • A transaction with a differential constraint has stamp 11.0 with involved stamps: 9.0 and 11.0

In this case, with the current implementation, the transaction with the differential constraint is still processed after the ignition transaction, because its stamp (11.0) is greater than the ignition transaction stamp (10.0). But the first/oldest involved stamp (9.0) would produce a WARN message like this:

[WARN ]: Unable to locate a state in this history with stamp <= 10.0. Variables will all be initialized to 0.

I think that transaction should have been purged.

Just saying: I think you're already aware of this other corner case, which is handled in #176.

efernandez (Collaborator) left a review:

LGTM after the last changes.


svwilliams (Contributor, Author):

> However, it won't cover the cases where:
>
>   • Ignition transaction was stamp 10.0
>   • A transaction with a differential constraint has stamp 11.0 with involved stamps: 9.0 and 11.0

With this PR, that differential constraint should get rejected inside the processQueue() method here:

if (min_stamp < lag_expiration)

When we iterate over the pending transactions, the min transaction time is checked against the lag_expiration time. So, once I get the lag_expiration time to respect the start time (testing that change now), I believe the case you describe should be handled. At least, it was my intent to handle the case you describe.
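
As a self-contained illustration of that check (a toy example, not the fuse code), using the stamps from your scenario:

#include <algorithm>
#include <vector>

struct QueuedTransaction
{
  double stamp;      // transaction stamp, in seconds
  double min_stamp;  // oldest involved stamp, in seconds
};

// Drop queued transactions whose oldest involved stamp falls before the lag expiration
void purgeExpired(std::vector<QueuedTransaction>& pending, double lag_expiration)
{
  pending.erase(
      std::remove_if(pending.begin(), pending.end(),
                     [lag_expiration](const QueuedTransaction& t)
                     { return t.min_stamp < lag_expiration; }),
      pending.end());
}

// With lag_expiration = 10.0 (the ignition stamp), a differential transaction with
// {stamp: 11.0, min_stamp: 9.0} is purged even though its stamp is newer than 10.0.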

Perhaps it is less than ideal to have transaction filtering in two different places (some in the transaction callback, and some in the processQueue method). Given the system's asynchronous nature, the processQueue method is the definitive place to filter messages. Perhaps all of the filtering should be moved there, with the transaction callback (a) watching for ignition transactions, and (b) keeping the queue size reasonable while waiting for an ignition transaction. I'll take a look at that... but in a different PR.

efernandez (Collaborator):

> When we iterate over the pending transactions, the min transaction time is checked against the lag_expiration time. So, once I get the lag_expiration time to respect the start time (testing that change now), I believe the case you describe should be handled. At least, it was my intent to handle the case you describe.

That makes sense. If that's the case, it'd simplify most of the code I did in #176, so probably most of it isn't needed. 😄

efernandez (Collaborator):

@svwilliams I wonder if you've made any progress on this recently. I'm not sure what the current status is, or whether I can help with this somehow. 😃

Indeed, I'm not sure if #176 is useful as is right now. Maybe I can split some functionality from it, or test this PR again. 🤔

svwilliams (Contributor, Author):

Yeah. Got side-tracked for a bit. I'll be testing this today and should get it merged in. And I think I'll steal the Transaction::minStamp() code from the other PR.

efernandez (Collaborator):

> I'll be testing this today and should get it merged in. And I think I'll steal the Transaction::minStamp() code from the other PR.

Sounds good. No problem, take whatever you need. 👍

svwilliams force-pushed the filter-transactions-on-lag-boundary branch from 519a183 to a86dd29 on April 24, 2020 at 15:18.
efernandez (Collaborator) left a review:

It's hard to tell if the usage of min and max stamp is doing the right thing, but it LGTM.

I can give this branch a try locally if it's ready for that. 🤔

(current_time - element.transaction->stamp()) << " seconds, which is greater " <<
"than the 'transaction_timeout' value of " << params_.transaction_timeout <<
". Ignoring this transaction.");
ROS_ERROR_STREAM("The queued transaction with timestamp " << element.stamp() << "and maximum "
efernandez (Collaborator):

A whitespace is missing before "and":

Suggested change:
- ROS_ERROR_STREAM("The queued transaction with timestamp " << element.stamp() << "and maximum "
+ ROS_ERROR_STREAM("The queued transaction with timestamp " << element.stamp() << " and maximum "

efernandez (Collaborator) commented Apr 24, 2020:

I just did a quick test using the branch from this PR and all looks fine. I also don't see any of the WARN or ERROR messages I mentioned in #176, so IMO this PR is good to merge, and #176 is good to close.

Of all the minor commits in #176, I think you brought over everything except this one: f6e71a3. Do you mind including it as well?
