Getting transform from TransformListener with old timestamp always gives latest transform #388
Comments
@baumanta FYI
This is quite surprising. Is there any sort of reproduction you can provide? tf lookups are accurate enough to render point clouds from scanning lasers. My first guess is that one of the several point cloud conversions that appear to happen inside transformPointCloud is failing to copy the header information properly. For debugging, I'd suggest adding a debug output of the lookupTransform next to the call, with the relevant frames and times, to verify that the lookup itself is functioning.
Also check the return code to make sure it's succeeding at all; it might not be transforming anything. transformPointCloud catches the exceptions internally and signals failure through its return value.
A minimal reproduction is a bit of a pain since we noticed it running on real hardware, but it was reproducible running the Local Planner with MAVROS + PX4 SITL. To debug, we inlined the transformPointCloud you linked above and examined the timestamp coming back on the transform, which was indeed correct for the timestamp we had requested. However, when we also retrieved the latest transform (by passing `ros::Time(0)`), its values were identical to the one returned for our requested timestamp.

The other logic around when the data was in the buffer seemed correct: asking for data which hadn't arrived yet produced the expected exceptions, as did asking for data which was too old. I'm not sure if maybe it is something specific to our transform chain? It shouldn't be anything crazy: we have a dynamic TF coming in over MAVROS for FCU->local, and a static TF between the FCU and the camera.

We now have a working fix which is basically a cache of just the final transform results, with lerp/slerp for in-between timestamps. The fact that doing this fixed our issue, and that the TF2 buffer works for other people, leads me to believe there must be some edge case in our setup which breaks the TF2 implementation.

Edit: this is the last checkout which doesn't use our new tf buffering: https://github.com/PX4/avoidance/tree/39717991297b00dd786682b325b222a506d8e3c9
The behavior of the transform always looking up with the same value at all times sounds a lot like the behavior of a static transform. I have a sneaking suspicion that the extra static transform publisher I noticed here, mavlink/mavros#1278, is initializing the fcu frame as a static frame, and currently tf does not support changing the frame type after initialization. So when the dynamic frame is published later it gets reparented, but the storage is not reallocated. You should be able to verify this by removing the extra static transform publisher and checking whether lookups at old timestamps start returning time-varying values.
This is confirmed to be a MAVROS issue, closing. |
We had quite a long investigation into why our camera data wasn't synchronized with the robot body frame, and eventually came all the way back to the TransformListener (which just wraps TF2 these days, hence filing this bug here). We have rapid orientation shifts, so we rely on the TF2 caching buffer to transform our point cloud data at the correct time, since there is a small delay (~100 ms) between when the light enters the camera and when the data is available in ROS.
Code is here:
https://github.com/PX4/avoidance/blob/e83c7abf117907f74a90eab5c57779b38db4ef9a/local_planner/src/nodes/local_planner_node.cpp#L676
We've debugged it in depth, and it seems that no matter how far in the past the timestamp for pcl_cloud is (up to the 10 s buffer length), it always transforms it with the latest transform (as if the pcl_cloud timestamp were 0). We've checked the received transforms, and although the timestamps are correct for what we ask, the actual values are bitwise-identical to the latest transform.
We'll end up writing our own (simple) cache to solve this since we can't wait for upstream fixes to all permeate out into the repos, but I thought this bug report would still be valuable.
I know several other projects also don't use TF-based caching due to similar issues (e.g. ethz-asl/voxblox#201), so I don't think we're crazy here; something is actually wrong.
This is all tested on Kinetic. Happy to answer questions and do high-level debugging, but I mostly wanted to report this. Hey, maybe we're doing something wrong 🙂