
Localization failure | Generation of deb files #10

Open
abylikhsanov opened this issue Sep 22, 2019 · 27 comments

@abylikhsanov

abylikhsanov commented Sep 22, 2019

Hi,

I decided to try out the default configuration of the guardian localization example. I am only using the front lidar and therefore edited the launch file a bit (just the lidar section).

First, the localization was not able to find the correspondence points, and I thought this was due to the global localization problem. After I provided the initial pose, I get:

[ INFO] [1569164427.212248087]: Received initial pose at time [1569164427.180871518]: 
	Translation -> [ x: -1.34668 | y: -0.0939072 | z: 0 ]
	Rotation -> [ qx: 0 | qy: 0 | qz: 0.055193 | qw: 0.998476 ]
[pcl::RandomSampleConsensus::computeModel] No samples could be selected!
[pcl::SampleConsensusModel::getSamples] Can not select 0 unique points out of 2!
[pcl::RandomSampleConsensus::computeModel] No samples could be selected!
[pcl::IterativeClosestPoint::computeTransformation] Not enough correspondences found. Relax your threshold parameters.
[ WARN] [1569164428.699073609]: Discarded cloud because localization couldn't be calculated
[ WARN] [1569164428.699200539]: Failed point cloud processing with error [FailedInitialPoseEstimation]
[ WARN] [1569164429.367911121]: Pose to TF publisher reached timeout for last valid pose
	TF translation -> [ x: -1.34668 | y: -0.0939072 | z: 0 ]
	TF orientation -> [ qx: 0 | qy: 0 | qz: 0.055193 | qw: 0.998476 ]
	Current time: 1569164429.367611740
	Last pose time: 1569164427.180871518
	Last pose arrival time: 1569164427.268565237
[ WARN] [1569164430.468016846]: Pose to TF publisher reached timeout for last valid pose
	TF translation -> [ x: -1.34668 | y: -0.0939072 | z: 0 ]
	TF orientation -> [ qx: 0 | qy: 0 | qz: 0.055193 | qw: 0.998476 ]
	Current time: 1569164430.467624702
	Last pose time: 1569164427.180871518
	Last pose arrival time: 1569164427.268565237
[ WARN] [1569164430.611402094]: The /scan observation buffer has not been updated for 1.31 seconds, and it should be updated every 1.20 seconds.
[pcl::IterativeClosestPoint2D::computeTransformation] Not enough correspondences found. Relax your threshold parameters.
[ WARN] [1569164430.830316015]: EuclideanTransformationValidator rejected new pose at time 1569164430.830252524 due to root_mean_square_error -> 
	 correction translation: 0 | (max_transformation_distance: 0.5)
	 correction rotation: 0 | (max_transformation_angle_: 0.7)
	 new pose diff translation: 0 | (max_new_pose_diff_distance_: 0.9)
	 new pose diff rotation: 0 | (max_new_pose_diff_angle_: 1.1)
	 root_mean_square_error: 1.79769e+308 | (max_root_mean_square_error_: 0.1)
	 root_mean_square_error_reference_pointcloud: 0 | (max_root_mean_square_error_reference_pointcloud_: -1)
	 outliers_percentage: 1 | (max_outliers_percentage_: 0.85)
	 outliers_percentage_reference_pointcloud: 0 | (max_outliers_percentage_reference_pointcloud_: -1)
	 inliers_angular_distribution: 2 | (min_inliers_angular_distribution_: 0.07)
	 outliers_angular_distribution: -2 | (max_outliers_angular_distribution_: 0.875)
[ WARN] [1569164430.830735712]: Discarded cloud because localization couldn't be calculated
[ WARN] [1569164430.830845610]: Failed point cloud processing with error [FailedPoseEstimation]

Is that really the parameter choice?

Also, I have noticed that the point cloud produced by laserscan_to_pointcloud (/guardian/lasers) works fine, but no node subscribes to it, so I am not sure of its purpose. The other point cloud, produced by the drl node (ambient_pointcloud), works fine.

Here is another warning:

[ WARN] [1569224041.915014225]: EuclideanTransformationValidator rejected new pose at time 1569224041.914963026 due to max_new_pose_diff_angle -> 
	 correction translation: 0.0208703 | (max_transformation_distance: 0.5)
	 correction rotation: 0.00947034 | (max_transformation_angle_: 0.7)
	 new pose diff translation: 0.0208703 | (max_new_pose_diff_distance_: 0.9)
	 new pose diff rotation: 0.00947034 | (max_new_pose_diff_angle_: 1.1)
	 root_mean_square_error: 0.0307908 | (max_root_mean_square_error_: 0.1)
	 root_mean_square_error_reference_pointcloud: 0 | (max_root_mean_square_error_reference_pointcloud_: -1)
	 outliers_percentage: 0.348315 | (max_outliers_percentage_: 0.85)
	 outliers_percentage_reference_pointcloud: 0 | (max_outliers_percentage_reference_pointcloud_: -1)
	 inliers_angular_distribution: 2 | (min_inliers_angular_distribution_: 0.07)
	 outliers_angular_distribution: -2 | (max_outliers_angular_distribution_: 0.875)

I am not sure why this warning appears, as max_new_pose_diff_angle (1.1) is higher than the new pose diff rotation (0.00947034).

@carlosmccosta
Owner

carlosmccosta commented Sep 30, 2019

Good night :)

I "recently" upgraded the EuclideanTransformationValidator class and tested it only for the 3D perception use case. By mistake, I introduced a bug in the comparison against max_new_pose_diff_angle (it was <, but should have been >).
Thank you for spotting the bug 👍

This bug would negatively affect the 3 DoF pose estimation you were giving in rviz (in the map frame, published to the /initialpose topic).
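For illustration, the corrected comparison amounts to the following (a minimal Python sketch with simplified names, not the actual C++ validator code):

```python
def rejects_new_pose_diff_angle(new_pose_diff_rotation, max_new_pose_diff_angle):
    # Corrected logic: reject only when the measured rotation EXCEEDS the
    # threshold. The bug inverted this comparison ('<'), so small rotations
    # well under the limit were being rejected, as in the warning above.
    return new_pose_diff_rotation > max_new_pose_diff_angle
```

With the values from your log, a rotation of 0.00947034 rad against the 1.1 rad limit is no longer rejected by the corrected check.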

Please do a git pull in drl, compile, and try again.

The topic /guardian/lasers was only used in slam mode and was being subscribed to by octomap.
The main reason for using laserscan_to_pointcloud instead of giving both lasers directly to octomap is the distortion correction that I implemented in the laserscan_to_pointcloud package using spherical linear interpolation.

I also tested giving octomap the fused point cloud (with both front and back lasers), but this has the drawback of not doing correct ray tracing when marking voxels as occupied or free, because octomap expects one sensor origin for each LaserScan / PointCloud (which isn't the case if the sensor data is merged from 2 sensors).
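To see why a single sensor origin matters: occupancy mapping traces a ray from the sensor origin to each measured point, marking the traversed cells free and the endpoint occupied. If clouds from two lasers are merged under one assumed origin, the rays for the second laser start from the wrong place and can incorrectly clear occupied space. A toy 2D sketch of that ray marching (not octomap's actual implementation):

```python
import math

def raycast_cells(origin, endpoint, step=0.5):
    # March from the sensor origin to the measured hit point on an integer
    # grid; every traversed cell except the last is marked free, and the
    # last one (the hit) is marked occupied.
    ox, oy = origin
    ex, ey = endpoint
    n = max(1, int(math.hypot(ex - ox, ey - oy) / step))
    cells = []
    for i in range(n + 1):
        t = i / n
        cell = (int(ox + t * (ex - ox)), int(oy + t * (ey - oy)))
        if cell not in cells:
            cells.append(cell)
    return cells[:-1], cells[-1]  # (free cells, occupied cell)
```

A ray traced from the wrong origin returns a different set of "free" cells, which is exactly the distortion a merged cloud introduces.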

In relation to the automatic global localization, depending on your environment you may need to fine-tune the initial alignment algorithms' parameterization, namely:

  • scale and contrast of the sift detector
  • radius of the fpfh descriptor
  • correspondence distance of the ransac initial pose estimator

P.S.
Sorry for the delay in answering your question; I was out of the country, at the PSA engine factory in Metz, preparing demonstrations for the 3rd test sprint of the Scalable4.0 project (such as this one, among others). I only had time to look at this issue today.

@abylikhsanov
Author

Hi, thanks, I will try.
Also, when I was compiling the package on my other machine, I got this error:
error: 'PointCloud' does not name a type; did you mean 'PointDEM'? in /usr/include/pcl-1.8/pcl/point_types_conversion.h:246:27

I did not compile your version of PCL, but simply copied all the pcl files of your version from my host pc to that machine, so I think I have the correct version of PCL. I checked the two files (host pc and machine) in that specific location and they are identical.

@carlosmccosta
Owner

Good night :)

The header mentioned in the error is on a path (/usr/include/pcl-1.8) associated with the official pcl version.
If you copied the pcl libraries compiled from my fork, you need to make sure that the catkin workspace in which you are compiling drl extends the workspace that has my fork of pcl.
Otherwise catkin will use the official pcl libraries instead of my fork, resulting in those errors.

Look at this page for more information about workspace chaining in catkin tools.

You can use this script to initialize the workspace with the proper chaining.
Be aware that the path you specify in the extend_ws parameter should point either to /opt/ros/ros_version or to the devel folder of the parent workspace.

Alternatively, to make the installation easier, you can try to create .deb files for pcl, drl and its dependencies using this tutorial.

I recently added the cmake install configurations to my packages and I will soon try to allocate some time to create the .deb installation files for my fork of pcl, drl and its dependencies.

@carlosmccosta
Owner

Good afternoon :)

I updated my pcl fork and packages with releases containing the debian installation files for Ubuntu 16 and ROS Kinetic (I will upgrade to Ubuntu 18 soon, after I finish a couple of research projects).

Please have a look at this section of the updated readme.

Now you can install pcl and drl in a new pc in less than a minute :)

If your OS or ROS version is different, you can follow this tutorial for generating the deb files for my fork of pcl, drl and its 2 dependencies.

P.S.
You will need to do git pull in these 4 repositories:

@abylikhsanov
Author

Hi, thank you, that helps :)

@abylikhsanov
Author

abylikhsanov commented Oct 3, 2019

Hi, I am generating the deb files myself and I am a bit uncertain about the pcl package, as it is not a ROS package. I have read about the other ROS packages and it seems quite easy (creating deb files for laserscan_to_pointcloud and pose_to_tf_publisher, and then adding them as dependencies when building the dynamic_robot_localization package).

@carlosmccosta
Owner

Good morning :)

PCL is a CMake package, and catkin tools can compile CMake packages without a problem.

For compiling PCL with catkin I just needed to add the package.xml with the appropriate configurations and add a CMake install command to include the package.xml in the deb file.

The main advantage of packaging PCL as a CMake "ROS" package is that it easily overlays / overrides the official pcl installation without requiring that you either:

  • configure CMAKE_PREFIX_PATH, LD_LIBRARY_PATH and PATH (which is also easy, it's just appending values to environment variables)
  • uninstall / overwrite the official pcl installation

@abylikhsanov
Author

Oh, so I can generate a deb using bloom the same way as I generated the other ROS packages?

@abylikhsanov
Author

I can create the deb files for bionic x86_64 and for ARMv8 and send them to you if you want.

@carlosmccosta
Owner

With the addition of the appropriate package.xml and install commands in the CMakeLists, you can generate deb files for pure CMake packages with bloom.

@carlosmccosta
Owner

If you could send me the deb files that would be awesome :)
I could create a shared folder with the different deb files for each OS/ROS version.

@abylikhsanov
Author

Ok, question: when you created a deb for the first package (without any dependencies) and tried to create one for another (which depends on the previous), did you get this error?

CMake Error at /opt/ros/melodic/share/laserscan_to_pointcloud/cmake/laserscan_to_pointcloudConfig.cmake:150 (message):
  Project 'pose_to_tf_publisher' tried to find library
  'tf_rosmsg_eigen_conversions'.  The library is neither a target nor
  built/installed properly.  Did you compile project
  'laserscan_to_pointcloud'? Did you find_package() it before the
  subdirectory containing its code is included?

@carlosmccosta
Owner

I did not get the error you mention.
For making sure the packages I was compiling were getting the libraries contained in the deb files, I did the following for each package (in the proper order, following the compilation dependencies):

  • source /opt/ros/$(rosversion -d)/setup.bash
  • Compile dependency and create .deb
  • Install dependency using sudo dpkg -i path_to_pkg.deb
  • rospack profile

Try doing the same and please report back if it solves your problem.

@carlosmccosta
Owner

Before compiling, please make sure you git pull in each repository and then check out each of the 4 repositories to its associated release tag:

roscd pcl && git pull && git checkout pcl-1.8.1-cc-master-all-pr
roscd laserscan_to_pointcloud && git pull && git checkout v1.0.0
roscd pose_to_tf_publisher && git pull && git checkout v1.0.0
roscd dynamic_robot_localization && git pull && git checkout v1.2.0

This way, when you send the .deb files, I can add them to the associated release page in the github of each repository.
For the end users, this ensures that they know which commit was used to generate the deb file.
Thanks :)

@carlosmccosta
Owner

I updated the .deb generation tutorial with some notes regarding this.

@carlosmccosta carlosmccosta changed the title Localization failure Localization failure | Generation of deb files Oct 3, 2019
@abylikhsanov
Author

Yes, it is working now. thanks!

@abylikhsanov
Author

abylikhsanov commented Oct 5, 2019

Hi Carlos,
After installing all the dependencies for the dynamic_robot_localization package, I got this linking error:
/opt/ros/melodic/lib/libpcl_filters.so: undefined reference to `pcl::PCLBase<pcl::GASDSignature512>::PCLBase()'

It looks like the linking error has nothing to do with the official PCL but has to do with the pcl-ros package. Did you also install pcl-ros after installing your fork of pcl?

I have looked at the CMakeLists of your package and it tries to find the non-ROS PCL. As I have installed your PCL as ros-melodic-pcl, maybe I should edit the list to include that instead of finding PCL?

@abylikhsanov
Author

After some attempts, I looked closely at the CMake messages and found this warning:

runtime library [libpcl_common.so] in /usr/lib/x86_64-linux-gnu may be hidden by files in:
      /opt/ros/melodic/lib

I might be wrong, but it looks like CMake is looking at the wrong PCL; I might have another version of it. I deleted those files in /usr/lib/x86_64-linux-gnu and got an error that the files do not exist, which makes me think that CMake definitely found the wrong PCL.

I have decided to set PATHS in the CMakeLists for CMake to get the right directory of PCL:

set(PCL_DIR "/opt/ros/melodic/share/pcl-1.8")
find_package(PCL REQUIRED PATHS ${PCL_DIR})

But it still tries to look at /usr/lib ...

@abylikhsanov
Author

Sorry, I now see the problem: the drl_node was not linked against the PCL library... Fixed it now.

@carlosmccosta
Owner

Good afternoon,

This error was a bit tricky to find.
After some digging in cmake and in the output of the deb generator, it seems that in Ubuntu 18, during the link stage, this was being appended (to the /usr/bin/c++ command):

-lpcl_common -lpcl_octree -lpcl_io

This did not happen in Ubuntu 16, so I was not getting the error you mentioned.
The linking of drl_node to the pcl libraries was not required, because these dependencies were transitive through the drl_localization library (which links to the pcl libraries).

If you executed the last command that previously gave you the error and removed the -lpcl_common -lpcl_octree -lpcl_io part, the linking would succeed.

To avoid this manual step, I added to the CMakeLists the explicit linking of drl_node with the pcl libraries.
This way, during the linking stage they are listed before the -lpcl_*, and since they have absolute paths to the libs of my fork (${PCL_LIBRARIES}), this prevents the linker from picking up the system installation of pcl.

You can inspect these variables by adding after catkin_package:

message("############################ PCL_INCLUDE_DIRS:" ${PCL_INCLUDE_DIRS})
message("############################ PCL_LIBRARIES:" ${PCL_LIBRARIES})

Please change all my git repositories to the melodic-devel branch.

I added the debs for Ubuntu 18 / ROS melodic / x64 in the release tabs of each repository.

P.S.
I removed the dependency on pcl_ros recently, since I no longer needed it.
The CMake warning about the .so files of my fork hiding the .so files of the official pcl is there to inform you that you should be careful with LD_LIBRARY_PATH; otherwise the dynamic linker may not load the .so files you were expecting.
There are also other hints given to the dynamic linker, such as the library run path, which can be embedded in the .so files and binaries.
Soon I will add a tutorial explaining how to install my fork of pcl into a custom folder, to avoid conflicts with other ros packages that were compiled with the official pcl version.
Please check this section.

@abylikhsanov
Author

abylikhsanov commented Oct 6, 2019

Thanks. Also, going back to the localization issue, I think my x86 is not powerful enough; even after setting the wait limit to 0.3 seconds, it looks like the computation is too heavy:
[DefaultConvergenceCriteriaWithTime::hasConverged] Convergence time limit of 0.3 seconds exceeded (elapsed time: 0.312735) | Iteration: 5 | CorrespondencesCurrentMSE: 0.000440362

So, I have tried to use the correspondence table to ease the computational burden:

tracking_matchers:
    ignore_height_corrections: false
    correspondence_estimation_approach: 'CorrespondenceEstimationLookupTable'

But the node just dies. Here is a Valgrind output:


==4599== Conditional jump or move depends on uninitialised value(s)
==4599==    at 0x7392D90: ros::ROSOutAppender::log(ros::console::levels::Level, char const*, char const*, char const*, int) (in /opt/ros/melodic/lib/libroscpp.so)
==4599==    by 0x166D34A9: ros::console::impl::Log4cxxAppender::append(log4cxx::helpers::ObjectPtrT<log4cxx::spi::LoggingEvent> const&, log4cxx::helpers::Pool&) (in /opt/ros/melodic/lib/librosconsole_log4cxx.so)
==4599==    by 0x16BAA664: log4cxx::AppenderSkeleton::doAppend(log4cxx::helpers::ObjectPtrT<log4cxx::spi::LoggingEvent> const&, log4cxx::helpers::Pool&) (in /usr/lib/x86_64-linux-gnu/liblog4cxx.so.10.0.0)
==4599==    by 0x16BA860E: log4cxx::helpers::AppenderAttachableImpl::appendLoopOnAppenders(log4cxx::helpers::ObjectPtrT<log4cxx::spi::LoggingEvent> const&, log4cxx::helpers::Pool&) (in /usr/lib/x86_64-linux-gnu/liblog4cxx.so.10.0.0)
==4599==    by 0x16BEDADC: log4cxx::Logger::callAppenders(log4cxx::helpers::ObjectPtrT<log4cxx::spi::LoggingEvent> const&, log4cxx::helpers::Pool&) const (in /usr/lib/x86_64-linux-gnu/liblog4cxx.so.10.0.0)
==4599==    by 0x16BEDD0D: log4cxx::Logger::forcedLog(log4cxx::helpers::ObjectPtrT<log4cxx::Level> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, log4cxx::spi::LocationInfo const&) const (in /usr/lib/x86_64-linux-gnu/liblog4cxx.so.10.0.0)
==4599==    by 0x166CFEAA: ros::console::impl::print(void*, ros::console::levels::Level, char const*, char const*, char const*, int) (in /opt/ros/melodic/lib/librosconsole_log4cxx.so)
==4599==    by 0x761D53F: ros::console::print(ros::console::FilterBase*, void*, ros::console::levels::Level, char const*, int, char const*, char const*, ...) (in /opt/ros/melodic/lib/librosconsole.so)
==4599==    by 0x116E3E: main (in /home/a/master-wheel/devel/lib/dynamic_robot_localization/drl_localization_node)
==4599== 

And I also thought to increase the tf timeout to something like 0.5 using this yaml config:

message_management:
    tf_buffer_duration: 600                          
    tf_timeout: 0.5

But even after that I am still getting:
Timed out waiting for transform from base_footprint to map to become available before running costmap, tf error: canTransform: target_frame map does not exist.. canTransform returned after 0.100464 timeout was 0.1.

EDIT:

Sorry, my bad, that tf timeout is not from your package... I mean this warning:

Pose to TF publisher reached timeout for last valid pose
	TF translation -> [ x: -0.100946 | y: -0.130493 | z: 0 ]
	TF orientation -> [ qx: 0 | qy: 0 | qz: 0.163041 | qw: 0.986619 ]
	Current time: 1570373753.393665601
	Last pose time: 1570373751.236291000
	Last pose arrival time: 1570373751.393646733

It looks like it found a valid pose but rejects it due to the timeout. Any idea how to overcome that?

@carlosmccosta
Owner

carlosmccosta commented Oct 6, 2019

Good night :)

I updated a yaml in drl showing the usage of the lookup tables.
I tested it in Ubuntu 16 and 18 with some of my bag files, and the lut-icp worked as expected, with CPU usage and run time similar to the ones shown in this article.

On my laptop with an Intel i7-6700HQ, drl uses around 1% CPU and takes around 30 ms to update the robot pose in this test with the guardian robot.

Before implementing the lookup tables, I did some tests with drl running on a very weak CPU (Intel Atom N2800), and drl managed to localize the robot without problems and without using too much of the CPU (around 30% to 50%, depending on the test). These results were using kd-trees; with lookup tables the CPU usage should be much lower.
Please check this article.

A few notes if you are using guardian_config without modifications:

  • Be aware that in my test environment I had defined a region of interest using a crop box in this file. This ROI will discard all data that is not within its limits. You will most likely need to comment out the 4_crop_box section, since your map is different.
  • When generating the map, be careful about its cell resolution. Also, if there were dynamic objects creating geometry that the robot will no longer see, then a manual clean-up of the map (in Gimp, for example) may improve the localization of the robot.
  • You can reduce the computational requirements of drl by adjusting the preprocessing filters (such as voxel grid and random sample) applied to the reference and ambient point clouds. The fewer points to align, the lower the CPU usage. Depending on your pose estimation accuracy requirements, you can lower what is currently in guardian_config.
  • You can try different algorithms (several variants of ICP and NDT)
  • You can fine tune the convergence criteria of each algorithm for making them terminate sooner (what you define as good enough alignment depends on your use case)
    • Look carefully at the parameters below this line
  • What you define as outlier depends on your hardware and use case requirements
    • You may need to adjust the outlier detection module
    • An outlier is a point considered as either incorrectly registered or not having a correspondence in the map. This is defined by a distance threshold.
  • What you define as correct registration also depends on your hardware and especially the environment and the map. Having an empty room with a perfect map and no people is different than having a bad or outdated map with a room full of people moving around or a lot of unknown regions that are not on the map.
  • For the automatic initial pose alignment using features to work you need to add to the curvature estimators this:
    • colorize_pointcloud_with_curvatures: true
    • Check this commit
    • Please be aware that the guardian_config repository was used and configured for this demonstration and has not been updated in a while (the drl package has been evolving for other use cases, so you may need to update the yamls of guardian_config to keep up with the drl developments; the yaml layout is documented in drl_configs.yaml)
  • Even though drl can be used without odometry, a good, calibrated wheel encoder feeding an odometry package can make the CPU usage of drl much lower, because the odometry will give a good estimate of the robot motion and icp will only need to correct small offsets. If no odometry is given, then ICP will need to correct much larger offsets, taking more iterations and, as a result, requiring more time and CPU resources.
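The outlier definition mentioned above (a point whose closest correspondence in the map is farther than a distance threshold) can be sketched as follows (illustrative Python with hypothetical names, not drl's actual code):

```python
def outlier_percentage(correspondence_distances, max_inlier_distance):
    # A point is an outlier when its closest correspondence in the map is
    # farther than the distance threshold, or when it has no correspondence
    # at all (represented here as None).
    outliers = sum(1 for d in correspondence_distances
                   if d is None or d > max_inlier_distance)
    return outliers / len(correspondence_distances)
```

A pose would then be rejected when this percentage exceeds max_outliers_percentage (0.85 in the warnings above).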

@carlosmccosta
Owner

carlosmccosta commented Oct 6, 2019

You can adjust the logging level of drl and pcl using these parameters:

pcl_verbosity_level: 'ERROR'      # VERBOSE | DEBUG | INFO | WARN | ERROR | ALWAYS || ALWAYS -> no console output
ros_verbosity_level: 'DEBUG'      #           DEBUG | INFO | WARN | ERROR | FATAL

This can be very useful for you to inspect what drl and pcl are doing "under the hood".

Check these examples:

@abylikhsanov
Author

abylikhsanov commented Oct 7, 2019

Hi Carlos, I did some testing, and the filters were the main cause of the 100% CPU usage on my Intel Atom x5-Z8350, specifically this:

5_random_sample:
            number_of_random_samples: 500

I changed 500 to 50 and the usage dropped to 30%.
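Conceptually, the random sample filter is just uniform subsampling; fewer points per cloud means fewer correspondences per ICP iteration, hence the CPU drop. A minimal sketch (not the PCL RandomSample class):

```python
import random

def random_sample_filter(points, number_of_random_samples, seed=None):
    # Keep at most number_of_random_samples points, chosen uniformly at
    # random from the input cloud.
    rng = random.Random(seed)
    if len(points) <= number_of_random_samples:
        return list(points)
    return rng.sample(points, number_of_random_samples)
```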

Also, I noticed that when drl cannot localize due to a completely different environment (I did that on purpose), it starts to consume a lot of CPU during the initial pose estimation, and I figured that I can tweak something there:

point_matchers:
#        iterative_closest_point:
        iterative_closest_point_2d:
#        iterative_closest_point_non_linear:
#        iterative_closest_point_generalized:
            convergence_time_limit_seconds: 5.0
            max_correspondence_distance: 1.5
            transformation_epsilon: 1e-6
            euclidean_fitness_epsilon: 1e-6
            max_number_of_registration_iterations: 15 #150
            max_number_of_ransac_iterations: 15 #150

After that, the CPU consumption dropped to 4 percent. Of course, with so few iterations, it won't be able to localize correctly, so I put 50 instead of 150.

Still, I notice this warning:

Pose to TF publisher reached timeout for last valid pose
	TF translation -> [ x: 2.94319 | y: 5.73195 | z: 0 ]
	TF orientation -> [ qx: -0.307899 | qy: 0.951419 | qz: 0 | qw: 0 ]
	Current time: 1570435190.569443416
	Last pose time: 1570435188.094532000
	Last pose arrival time: 1570435188.469512578

And after browsing the source code, it looks like drl gets the pose from the pose_to_tf_publisher package and it does not allow any extra time. Do you think it would be reasonable to allow some extra time before declaring the timeout? An extra 200-400 ms won't make the robot crash at lower speeds.

Also, you mentioned that with wheel odometry the drl computation can be reduced. Do you mean that if odometry can more or less localize the robot locally, drl won't need to estimate its initial position (which consumes a lot of CPU) but only apply corrections, so the CPU consumption of drl will be more or less constant?

@carlosmccosta
Owner

Good morning :)

The initial pose estimation uses feature matching for performing an approximate alignment and then relies on ICP for accurate pose estimation.

drl should be in tracking mode most of the time. It should only use the initial alignment algorithms when the robot starts up and does not know where it is on the map.
The tracking algorithms are several orders of magnitude faster than the initial alignment algorithms (a few milliseconds vs 1 or 2 seconds).
As such, the initial alignment algorithms should only be activated for recovery purposes (if you want, you can disable them by commenting them out in the yaml, but then, when the system starts, somebody needs to go to rviz and tell drl where the robot is on the map).
If the initial alignment algorithms are being activated a lot of times, then you need to look at the console debug output of drl and see why the transformation validators are discarding poses:

  • In parallel look at rviz, at the overlap between the map and the aligned_pointcloud topic
  • If the aligned point cloud is overlapping the map and drl is discarding poses, then you need to look carefully at the thresholds of the transformation validators, especially to the max_outliers_percentage (I also added other thresholds recently, look here)
    • These thresholds change depending on your use case
      • For example, if you expect the sensor to see a lot of geometry that is not on the map, such as dynamic objects / people, then max_outliers_percentage needs to be high
      • On the other hand, be aware that if these thresholds are too high, you may allow drl to accept poses that have high localization error

The publishing of tf can be done from within drl or with the pose_to_tf_publisher package.
In the guardian_config, the pose_to_tf_publisher publishes the associated TF (map -> odom) when it receives from drl a PoseStamped on the localization_pose topic.
If the pose_to_tf_publisher does not receive a new pose within this timeout after receiving the last pose, it stops publishing the TF of the last pose and gives that warning.
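The watchdog behaves roughly like this (a hedged sketch; the function and parameter names are illustrative, the real logic is in pose_to_tf_publisher):

```python
def should_publish_tf(current_time, last_pose_arrival_time, pose_timeout):
    # Keep republishing the last map -> odom TF only while the last received
    # pose is still fresh; otherwise stop and warn, which corresponds to the
    # "reached timeout for last valid pose" message in the logs above.
    return (current_time - last_pose_arrival_time) <= pose_timeout
```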

A good wheel encoder can give reliable motion estimation for short periods of time.
But like every odometry method, it has the drawback of accumulating drift over time.
On the other hand, odometry plays an important role for motion controllers (which control the movement of the robot), since it can provide motion estimates at a high rate (100 Hz or more).
As such, most mobile robots use odometry to estimate their motion and then rely on sensors to correct the drift. The drift will be very small most of the time, because most sensors operate at 10 Hz or more, and most robots don't move very fast or with high accelerations (which cause wheel slippage, which in turn causes bad odometry estimation). This way, the localization system only needs to correct the drift that happened since the last scan it processed. This is the reason why in ros navigation there is this convention of tfs:

  • [map -> odom] (drift correction from localization system)
  • [odom -> base_footprint or base_link] (motion estimation from the odometry system)
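The tf chain above composes as map -> base = (map -> odom) followed by (odom -> base). A minimal 2D sketch of that composition (poses as x, y, yaw; illustrative only):

```python
import math

def compose(parent_to_child, child_to_grandchild):
    # Chain two 2D transforms: apply the second in the frame of the first.
    ax, ay, ayaw = parent_to_child
    bx, by, byaw = child_to_grandchild
    return (ax + bx * math.cos(ayaw) - by * math.sin(ayaw),
            ay + bx * math.sin(ayaw) + by * math.cos(ayaw),
            ayaw + byaw)

map_to_odom = (0.1, -0.05, 0.02)  # small drift correction from localization
odom_to_base = (2.0, 1.0, 0.5)    # motion estimate from wheel odometry
map_to_base = compose(map_to_odom, odom_to_base)
```

drl only has to keep the small map -> odom correction up to date, while odometry provides the high-rate odom -> base estimate.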

@abylikhsanov
Author

abylikhsanov commented Oct 8, 2019

Hi Carlos, thanks, I had tested that. It looks like the aligned_pointcloud does not cover the whole lidar field of view but only some range of it? So instead of 360 degrees, it only covers around 90 degrees, and those 90 degrees are aligned correctly. I then changed the box values:

ambient_pointcloud_map_frame:
        4_crop_box:
            box_min_x: -10 #-0.25
            box_min_y: -10 #-0.25
            box_min_z: -1.0
            box_max_x: 10 #2.35
            box_max_y: 10 #2.25

After that, the aligned_pointcloud covered almost the whole lidar range, but instead of getting a dense point cloud, I only see small dots. Am I filtering them out?
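For reference, a crop box is just an axis-aligned pass-through: points outside the box are dropped before alignment. A minimal sketch (not the PCL CropBox filter):

```python
def crop_box_filter(points, box_min, box_max):
    # Keep only the points inside the axis-aligned box; in guardian_config
    # this is applied in the map frame (the 4_crop_box section above).
    return [(x, y, z) for (x, y, z) in points
            if box_min[0] <= x <= box_max[0]
            and box_min[1] <= y <= box_max[1]
            and box_min[2] <= z <= box_max[2]]
```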

@carlosmccosta
Owner

carlosmccosta commented Oct 8, 2019

Good afternoon,

The aligned point cloud displays the sensor data after filtering and alignment.
You said previously that you had changed the random sample filter to 50 points. As such, the aligned point cloud will only have 50 points.

To avoid discarding sensor data, you should probably remove the crop box filter (comment it out or remove it from the yaml).
I added it because I knew exactly where the robot would be moving on the map, and with the crop box filter I was able to remove a zone in which people watching the demonstration were expected to be moving around (this allowed drl to use only good sensor data, avoiding the processing of data that would degrade the system performance, such as dynamic objects that are not on the map).

The laserscan_to_pointcloud package also has a built-in filter with which you can limit the min and max range of the point cloud that it generates from the laser scan data.
These min and max thresholds are scaling factors applied to the values that come in the laserscan data.
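Interpreted that way, the built-in range filter would look like this (a hedged sketch under that assumption; min_range_scale / max_range_scale are hypothetical names for the scaling thresholds):

```python
def filter_ranges(ranges, range_min, range_max,
                  min_range_scale=1.05, max_range_scale=0.95):
    # The thresholds scale the sensor-reported range_min / range_max;
    # readings outside the scaled window are dropped before the laser scan
    # is converted into a point cloud.
    lower = range_min * min_range_scale
    upper = range_max * max_range_scale
    return [r for r in ranges if lower <= r <= upper]
```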
