Residual Block Error #233

Closed
TrophyBuck opened this issue Apr 6, 2017 · 19 comments

@TrophyBuck

TrophyBuck commented Apr 6, 2017

I started using the cartographer_turtlebot package in simulation and have been running into the error below. It occurs at a random time after starting the cartographer_node, anywhere from a few seconds to several minutes. The only changes I've made are remapping a few topic names (mainly odometry and the laser scanner), removing the line that launches turtlebot's minimal bringup and the line that launches the urg_node (since I'm using my own version of turtlebot), and disabling the IMU (the changes are spread across this file and this file). The file I've been launching from is here. The error I've been getting is shown below:

[ WARN] [1491490574.176173648, 624.161000000]: W0406 10:56:14.000000  8798 residual_block.cc:131] 

Error in evaluating the ResidualBlock.

There are two possible reasons. Either the CostFunction did not evaluate and fill all    
residual and jacobians that were requested or there was a non-finite value (nan/infinite)
generated during the or jacobian computation. 

Residual Block size: 1 parameter blocks x 184 residuals

For each parameter block, the value of the parameters are printed in the first column   
and the value of the jacobian under the corresponding residual. If a ParameterBlock was 
held constant then the corresponding jacobian is printed as 'Not Computed'. If an entry 
of the Jacobian/residual array was requested but was not written to by user code, it is 
indicated by 'Uninitialized'. This is an error. Residuals or Jacobian values evaluating 
to Inf or NaN is also an error.  

Residuals:              nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          
nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan 

Parameter Block 0, size: 3

         nan |         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         
-nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan 
         nan |         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         
-nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan 
  -0.0744971 |         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         
-nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan 


CHOLMOD error: invalid xtype
F0406 10:56:14.176298  8798 covariance_impl.cc:652] Check failed: 'permutation' Must be non NULL 
[FATAL] [1491490574.176547018, 624.161000000]: F0406 10:56:14.000000  8798 covariance_impl.cc:652] Check failed: 'permutation' Must be non NULL 
*** Check failure stack trace: ***
    @     0x7ffff7bb35cd  google::LogMessage::Fail()
    @     0x7ffff7bb5433  google::LogMessage::SendToLog()
    @     0x7ffff7bb315b  google::LogMessage::Flush()
    @     0x7ffff7bb5e1e  google::LogMessageFatal::~LogMessageFatal()
    @           0x6546e1  ceres::internal::CovarianceImpl::ComputeCovarianceValuesUsingSuiteSparseQR()
    @           0x65b0a5  ceres::internal::CovarianceImpl::ComputeCovarianceValues()
    @           0x65b22f  ceres::internal::CovarianceImpl::Compute()
    @           0x5a6918  cartographer::mapping_2d::scan_matching::CeresScanMatcher::Match()
    @           0x5ae390  cartographer::mapping_2d::LocalTrajectoryBuilder::ScanMatch()
    @           0x5ae94e  cartographer::mapping_2d::LocalTrajectoryBuilder::AddHorizontalRangeData()
    @           0x5acfbc  cartographer::mapping_2d::GlobalTrajectoryBuilder::AddRangefinderData()
    @           0x5377f9  cartographer::mapping::CollatedTrajectoryBuilder::HandleCollatedSensorData()
    @           0x53822e  _ZNSt17_Function_handlerIFvRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt10unique_ptrIN12cartographer6sensor4DataESt14default_deleteISB_EEEZNS9_7mapping25CollatedTrajectoryBuilderC4EPNSA_8CollatorEiRKSt13unordered_setIS5_St4hashIS5_ESt8equal_toIS5_ESaIS5_EES8_INSG_32GlobalTrajectoryBuilderInterfaceESC_IST_EEEUlS7_SE_E_E9_M_invokeERKSt9_Any_dataS7_OSE_
    @           0x61327a  _ZNSt17_Function_handlerIFvSt10unique_ptrIN12cartograp9_Any_dataOS6_
    @           0x6171f9  cartographer::sensor::OrderedMultiQueue::Dispatch()
    @           0x61879f  cartographer::sensor::OrderedMultiQueue::Add()
    @           0x613574  cartographer::sensor::Collator::AddSensorData()
    @           0x536948  cartographer::mapping::CollatedTrajectoryBuilder::AddS
    @           0x509f5b  cartographer_ros::SensorBridge::HandleRangefinder()
    @           0x50ab9a  cartographer_ros::SensorBridge::HandleLaserScanMessage
    @           0x4efb2b  _ZN5boost6detail8function26void_function_obj_invoker1I
    @           0x4f6c4e  boost::detail::function::void_function_obj_invoker1<>:
    @           0x500e39  ros::SubscriptionCallbackHelperT<>::call()
    @     0x7ffff4d1c5cd  ros::SubscriptionQueue::call()
    @     0x7ffff4cc6cf0  ros::CallbackQueue::callOneCB()
    @     0x7ffff4cc80f3  ros::CallbackQueue::callAvailable()
    @     0x7ffff4d20691  ros::SingleThreadedSpinner::spin()
    @     0x7ffff4d0572b  ros::spin()
    @           0x4f0f6e  cartographer_ros::(anonymous namespace)::Run()
    @           0x4ed6a4  main
    @     0x7ffff371f830  __libc_start_main
    @           0x4ef129  _start

Thread 1 "cartographer_no" received signal SIGABRT, Aborted.
0x00007ffff3734428 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/unix/sysv/linux/raise.c:54
54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.

It's also worth noting that this error occurs whether the robot stands still or moves, and that while moving, Cartographer does correctly generate a map. I've tested running the 2D Lidar Demo (listed here) with the IMU disabled and no other changes, to make sure the disabled IMU wasn't causing the issue. It ran without error under those circumstances. I'm using Ubuntu 16.04 and ROS Kinetic.

@SirVer
Contributor

SirVer commented Apr 7, 2017

This seems to me like bogus data from your simulation.

Please provide your full configuration in a fork of cartographer_ros and a bag file with sample data so that we can have a look.

@TrophyBuck
Author

TrophyBuck commented Apr 7, 2017

I actually forked cartographer_turtlebot since that's what I've changed, link here. I've updated the bug_demo branch in it to showcase the bug. The bag file was too large for GitHub to accept, so I've put it here on Google Drive. Once it is downloaded and set up, run the following to play the bag file alongside the cartographer_turtlebot launch.

roslaunch cartographer_turtlebot turtlebot_lidar_2d_demo.launch bag_filename:=${HOME}/Downloads/failure_demo.bag

The error should occur at about time 1005.49 with this setup. It's consistent here, though if I were to relaunch all the nodes manually instead of using the bag file, it would probably happen at a different time.

@SirVer SirVer added the bug label Apr 10, 2017
@SirVer SirVer self-assigned this Apr 10, 2017
@BrannonKing

I see this error often when using odometry data. I've verified that my odometry data never contains NaN. I just have the two inputs: odometry and a 2D laser.

@BrannonKing

BrannonKing commented Apr 18, 2017

This is a bug in SuiteSparse v4.4.6. Modify the Ceres CMakeLists.txt file to use EigenSparse instead of SuiteSparse and recompile.

@SirVer
Contributor

SirVer commented Apr 19, 2017

@BrannonKing Thanks for finding the root cause!

@SirVer SirVer closed this as completed Apr 19, 2017
@BrannonKing

I should add that if you have a NaN in one of ROS's broadcasted transforms, you may get this error, but you will get a more informative error using EigenSparse. I would like to know if anyone is able to use SuiteSparse v4.5.5 to make this error go away.
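To rule out a NaN in a broadcast transform before it reaches the optimizer, a check along the following lines could be applied to each transform. This is only a minimal sketch: `transform_is_finite` is a hypothetical helper, the transform is assumed to be a plain `(x, y, z)` translation plus an `(qx, qy, qz, qw)` quaternion, and the sample values are illustrative rather than taken from any bag file.

```python
import math


def transform_is_finite(translation, rotation):
    """Return True only if every component of the transform is finite.

    translation: (x, y, z); rotation: quaternion (qx, qy, qz, qw).
    Both NaN and +/-inf count as invalid.
    """
    return all(math.isfinite(v) for v in (*translation, *rotation))


# Illustrative values only.
good = ((0.1, 0.2, 0.0), (0.0, 0.0, 0.0, 1.0))
bad = ((math.nan, math.nan, 0.0), (0.0, 0.0, 0.0, 1.0))

print(transform_is_finite(*good))
print(transform_is_finite(*bad))
```

Logging the first transform that fails such a check, together with its timestamp, should make it easier to tell whether the NaN originates upstream of Cartographer or inside it.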

@TrophyBuck
Author

TrophyBuck commented Apr 19, 2017

Thanks for the replies. I tried setting the CMakeLists.txt to use EigenSparse, but when I did I got the following error at runtime:

E0419 13:57:37.598997  8287 covariance_impl.cc:548] SuiteSparse is required to use the SUITE_SPARSE_QR algorithm.
[ERROR] [1492624657.599212015, 26.529000000]: E0419 13:57:37.000000  8287 covariance_impl.cc:548] SuiteSparse is required to use the SUITE_SPARSE_QR algorithm.
F0419 13:57:37.599295  8287 ceres_scan_matcher.cc:108] Check failed: covariance_computer.Compute(covariance_blocks, &problem) 
[FATAL] [1492624657.599386618, 26.529000000]: F0419 13:57:37.000000  8287 ceres_scan_matcher.cc:108] Check failed: covariance_computer.Compute(covariance_blocks, &problem)

The code I added to the CMake file is below. I added it after the options setup.

update_cache_variable(SUITESPARSE OFF)
update_cache_variable(EIGENSPARSE ON)

I'm guessing I'm setting it up incorrectly. It's worth noting that during the build I do get the confirmations below that Eigen is enabled and SuiteSparse is disabled.

-- Found Eigen version 3.2.92: /usr/include/eigen3

   ===============================================================
   Enabling the use of Eigen as a sparse linear algebra library 
   for solving the nonlinear least squares problems. Enabling 
   this option results in an LGPL licensed version of 
   Ceres Solver as the Simplicial Cholesky factorization in Eigen
   is licensed under the LGPL. 
   ===============================================================
-- Building without SuiteSparse.

@BrannonKing

When I did it, I didn't add any rows to CMakeLists.txt; I modified the two existing lines for those variables. Also, I had to clear out my build folder as the change wasn't detected sufficiently to cause a rerun of CMake.

@TrophyBuck
Author

TrophyBuck commented Apr 19, 2017

Do you mean the option lines when you say you modified the existing lines? I tried changing those, but when I double checked the values by adding a message to the CMake it showed that Eigen was still disabled and Suite was still enabled.

message("EIGEN ${EIGENSPARSE} SUITE ${SUITESPARSE}") #added

I just tried deleting the old build folder and rebuilding but got some different errors relating back to Residual Blocks, so I think I'll try changing whatever existing lines you changed and rebuild.

To clarify, I mean these when I say 'option lines':

option(SUITESPARSE "Enable SuiteSparse." OFF)
OPTION(EIGENSPARSE "Enable Eigen as a sparse linear algebra library, WARNING: results in an LGPL licensed Ceres." ON)

@TrophyBuck
Author

TrophyBuck commented Apr 19, 2017

After changing the options lines, cartographer doesn't crash but it does stop producing a submap, showing the error below that looks similar to the error this issue started with. This error is repeated once it shows up, and the submap seems to stop publishing.

Error:   TF_NAN_INPUT: Ignoring transform for child_frame_id "turtlebot_tf/odom" from authority "unknown_publisher" because of a nan value in the transform (nan nan nan) (0.000000 0.000000 0.001341 0.999999)
         at line 240 in /tmp/binarydeb/ros-kinetic-tf2-0.5.13/src/buffer_core.cpp
[ WARN] [1492628146.895639256, 259.038000000]: W0419 14:55:46.000000 29344 residual_block.cc:131] 

Error in evaluating the ResidualBlock.

There are two possible reasons. Either the CostFunction did not evaluate and fill all    
residual and jacobians that were requested or there was a non-finite value (nan/infinite)
generated during the or jacobian computation. 

Residual Block size: 1 parameter blocks x 202 residuals

For each parameter block, the value of the parameters are printed in the first column   
and the value of the jacobian under the corresponding residual. If a ParameterBlock was 
held constant then the corresponding jacobian is printed as 'Not Computed'. If an entry 
of the Jacobian/residual array was requested but was not written to by user code, it is 
indicated by 'Uninitialized'. This is an error. Residuals or Jacobian values evaluating 
to Inf or NaN is also an error.  

This error is followed by the same Residual and Parameter Block matrices that are full of "nan"s.

@BrannonKing

I too have seen the (nan, nan, nan) in the transform today. I was too hasty in declaring it the fault of SuiteSparse, as I didn't see it for a day after I ditched that. Let's reopen this bug. At this point I'm not sure if the bug lies with Cartographer or Cartographer_ros. It definitely seems related to the odometry. The higher the quality of the odometry, the less likely one is to see this.

@SirVer SirVer reopened this Apr 20, 2017
@ojura
Contributor

ojura commented Apr 21, 2017

> This is a bug in SuiteSparse v4.4.6. Modify the Ceres CMakeLists.txt file to use EigenSparse instead of SuiteSparse and recompile.

You can do that without changing CMakeLists.txt by just calling CMake/catkin with additional CMake arguments, e.g. -DEIGENSPARSE=True -DSUITESPARSE=False.

But, IIRC, it did not work well for me, and I haven't seen the nice loop-closing realtime submap alignments when using EigenSparse.

The background of this is that earlier this year, I was wondering why I could not get loop closing to work (cartographer-project/cartographer_ros#247). It turned out that I was missing SuiteSparse, and Ceres was built without any sparse linear algebra library. To resolve this, in #189, I added that Cartographer requires Ceres built with any sparse linear algebra library (either EigenSparse or SuiteSparse or CXSparse). IIRC, one of the things I did try was -DEIGENSPARSE, and I think that it didn't work well. Only when I installed SuiteSparse did Cartographer start closing loops properly.

If it seems that Cartographer really requires SuiteSparse to work properly, maybe the requirement in Cartographer's CMakeLists should be changed from Ceres with SparseLinearAlgebraLibrary to Ceres with exactly SuiteSparse? @SirVer

@TrophyBuck
Author

TrophyBuck commented Apr 25, 2017

I don't think that it's a requirement to use SuiteSparse. The issue we're having seems to happen with both SuiteSparse and EigenSparse. I tried forcing the use of SuiteSparse in Cartographer's CMake like @ojura mentioned, and still got the original error mentioned in this issue.

@TrophyBuck
Author

@SirVer Is there any other information or testing we can provide to help solve this issue?

@ltecot

ltecot commented Apr 25, 2017

I am having the same issue. I've linked my fork and rosbag below.
I tried building with the arguments ojura described above. The only difference is that instead of crashing right away without producing any maps, it produces a handful of messages, stops publishing, and then repeatedly prints the nan residual block warning plus a TF_NAN_INPUT error, both of which I've copied below. Let me know if you want additional data.
I am using ROS Indigo on Ubuntu 14.04.

Error:   TF_NAN_INPUT: Ignoring transform for child_frame_id "base_link" from authority "unknown_publisher" because of a nan value in the transform (-nan -nan 0.000000) (0.000000 0.000000 0.002974 0.999996)


[ WARN] [1493150351.421769307, 1492802830.018635481]: W0425 12:59:11.000000  9558 residual_block.cc:131] 

Error in evaluating the ResidualBlock.

There are two possible reasons. Either the CostFunction did not evaluate and fill all    
residual and jacobians that were requested or there was a non-finite value (nan/infinite)
generated during the or jacobian computation. 

Residual Block size: 1 parameter blocks x 61 residuals

For each parameter block, the value of the parameters are printed in the first column   
and the value of the jacobian under the corresponding residual. If a ParameterBlock was 
held constant then the corresponding jacobian is printed as 'Not Computed'. If an entry 
of the Jacobian/residual array was requested but was not written to by user code, it is 
indicated by 'Uninitialized'. This is an error. Residuals or Jacobian values evaluating 
to Inf or NaN is also an error.  

Residuals:             -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan         -nan 

Parameter Block 0, size: 3

        -nan |          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan 
        -nan |          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan 
  0.00594778 |          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan          nan 

Build with instructions specified in the Cartographer ROS Docs
Run with: roslaunch cartographer_ros scanse_baxter_bag.launch bag_filename:=${BAG_DIR}
https://github.com/Cranapple/cartographer_ros
https://drive.google.com/file/d/0BwfwZjTRXXFdbFJhbzVROXNfQ00/view?usp=sharing

SirVer added a commit to SirVer/cartographer that referenced this issue Apr 26, 2017
If mapping_2d::LocalTrajectoryBuilder::AddHorizontalRangeData is called
twice in a row with the same `time`, the `velocity_estimate_` becomes
`inf` which led to `inf`s in the optimization problem, which led to
failures inside Ceres.

Fixes cartographer-project#233.
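The failure mode described in this commit message can be reproduced in a few lines. The sketch below is not the actual patch: it is a one-dimensional simplification, all function names are hypothetical, and `ieee_divide` exists only because Python raises on float division by zero whereas C++ quietly returns `inf` or `nan`. It shows that a zero time delta turns the velocity into `inf`, and that multiplying that `inf` by a zero time delta in a later prediction yields `nan`. The guard shown is one possible way to avoid this, assuming it is acceptable to keep the previous estimate when time has not advanced.

```python
import math


def ieee_divide(numerator, denominator):
    """Mimic C++ floating-point division, which yields inf/nan instead of raising."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        if numerator == 0.0 or math.isnan(numerator):
            return math.nan
        # Sign of the result follows the signs of both operands (handles -0.0).
        return math.copysign(math.inf, numerator) * math.copysign(1.0, denominator)


def naive_velocity(prev_pos, pos, prev_time, time):
    """Unguarded estimate: blows up when both calls carry the same time."""
    return ieee_divide(pos - prev_pos, time - prev_time)


def guarded_velocity(prev_pos, pos, prev_time, time, last_velocity):
    """Keep the previous estimate when time has not advanced (hypothetical guard)."""
    dt = time - prev_time
    if dt <= 0.0:
        return last_velocity
    return (pos - prev_pos) / dt


# Two updates with the same time stamp (illustrative numbers).
v = naive_velocity(1.0, 1.5, 10.0, 10.0)
predicted = 1.5 + v * 0.0  # inf * 0 is nan, which then poisons the pose
print(v, predicted)
print(guarded_velocity(1.0, 1.5, 10.0, 10.0, last_velocity=0.25))
```

A `nan` pose of this kind would be consistent with the `nan` entries in the parameter blocks printed in the warnings earlier in this thread.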
@SirVer
Contributor

SirVer commented Apr 26, 2017

@TrophyBuck I looked into your turtlebot example and was able to repro the crash.

Things I found:

  • Your scan topic contains NaNs. It is not entirely clear to me by reading the documentation if this is actually valid. However Cartographer does the sane thing and filters them out already. This was not the issue at hand.
  • Your odom topic seems wonky - it sends data rather irregularly, but when it sends something it always sends two things very close in time. Not sure if this is an issue, since I disabled the topic for my investigations to focus on the crash.
  • Directly before the crash you have 2 LaserScan messages with exactly the same time stamp. This in itself is bogus, so you might want to investigate where that comes from. However, this triggered a bug in Cartographer where the velocity prediction went to inf which led to the crash. I fixed this in a PR.
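For anyone wanting to check their own data for the duplicate time stamps mentioned in the last point, a scan over the message stamps is enough. The sketch below assumes the stamps have already been extracted from the bag into a list of `(secs, nsecs)` tuples, in arrival order; `find_non_increasing_stamps` is a hypothetical helper and the sample stamps are made up.

```python
def find_non_increasing_stamps(stamps):
    """Return the indices i where stamps[i] does not advance past stamps[i - 1].

    stamps: list of (secs, nsecs) tuples in arrival order. Tuples compare
    lexicographically, so no floating-point conversion is needed.
    """
    return [i for i in range(1, len(stamps)) if stamps[i] <= stamps[i - 1]]


# Illustrative stamps: the third message repeats the second one's time.
stamps = [(10, 0), (10, 100000000), (10, 100000000), (10, 200000000)]
print(find_non_increasing_stamps(stamps))  # prints [2]
```

Comparing integer `(secs, nsecs)` pairs rather than floating-point seconds avoids rounding making two distinct stamps look equal, or two equal stamps look distinct.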

@Cranapple I did not look into your case. Could you check if my fix helps you too and open a new bug report otherwise?

SirVer added a commit that referenced this issue Apr 26, 2017
If mapping_2d::LocalTrajectoryBuilder::AddHorizontalRangeData is called
twice in a row with the same `time`, the `velocity_estimate_` becomes
`inf` which led to `inf`s in the optimization problem, which led to
failures inside Ceres.

Fixes #233.
@BrannonKing

Thank you very much for tracking this down, and for the fix! In my system, it is not uncommon to send data with the same timestamp as the previous batch. On devices that don't have a system clock, the clock signal must be broadcast from a different device. Previous designs have simply published a clock signal periodically with all receivers using the most-recently-published time value. I'm working with the ROS2 design to remedy this; there is no need to use the most recent value as every platform has good timers, even if they don't have a clock chip. See https://discourse.ros.org/t/of-clocks-and-simulation-betimes-and-otherwise/1587
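One reading of the idea above is that a receiver can combine the last broadcast clock value with its own local timer, so that two messages stamped between clock broadcasts no longer share a time. The following is a rough sketch of that reading only, not the ROS 2 design; `ExtrapolatedClock` and its methods are hypothetical names, and the local timer is injectable purely so the behaviour can be tested.

```python
import time


class ExtrapolatedClock:
    """Estimate the current time as last broadcast value plus local elapsed time."""

    def __init__(self, monotonic=time.monotonic):
        self._monotonic = monotonic      # local timer; no wall clock required
        self._last_broadcast = None      # most recent broadcast clock value
        self._received_at = None         # local timer reading at reception

    def on_clock_message(self, broadcast_time):
        """Record a broadcast clock value and when it arrived locally."""
        self._last_broadcast = broadcast_time
        self._received_at = self._monotonic()

    def now(self):
        """Return the broadcast time advanced by the locally measured delay."""
        if self._last_broadcast is None:
            raise RuntimeError("no clock message received yet")
        return self._last_broadcast + (self._monotonic() - self._received_at)


# Usage with the real local timer: successive calls do not go backwards.
clock = ExtrapolatedClock()
clock.on_clock_message(5.0)
first, second = clock.now(), clock.now()
print(first <= second)
```

Whether successive stamps are strictly increasing still depends on the resolution of the local timer, so a consumer that divides by the time difference would want its own guard regardless.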

@SirVer
Contributor

SirVer commented Apr 26, 2017

> In my system, it is not uncommon to send data with the same timestamp as the previous batch.

This sounds dangerous to me. In our experience better timing translates directly to better SLAM quality. Yolo timing means that you need to be lenient in your SLAM expectations, i.e. increase the resolution, expect more drift and so on. Of course, timing is also a very hard problem.

@TrophyBuck
Author

@SirVer Thanks, that seems to have fixed it! I'm not sure how I ended up sending two LaserScans with the same time stamp, but in my testing with the new fix Cartographer maps the environment very well.
