Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[registration] Add multi-threaded NDT #4135

Closed
wants to merge 0 commits into from

Conversation

koide3
Copy link
Contributor

@koide3 koide3 commented May 22, 2020

This PR is created according to the suggestion on koide3/ndt_omp#19. It adds a multi-threaded and SSE-optimized NDT implementation that is derived from pcl::NormalDistributionsTransform and is considerably faster than the original one. A benchmark result is available at the bottom of this post.

[Added] NormalDistributionsTransformOMP

  • The following two methods are added to the original NDT
  • setNumThreads() specifies the number of threads to be used
  • setNeighborhoodSearchMethod() offers a trade-off between registration stability and speed (see here for brief explanation)

[Updated] VoxelGridCovariance

  • const qualifer is added to some methods to clarify that they are thread-safe
  • getNeighborhoodAtPoint() is generalized for different neighboring voxel search patterns

[Added] Unit test for NormalDistributionsTransformOMP

Benchmark result from https://github.com/koide3/ndt_omp

--- original NDT ---
single : 282.222[msec]
10times: 2921.92[msec]
fitness: 0.213937

--- multi-threaded NDT (KDTREE, 1 threads) ---
single : 207.697[msec]
10times: 2059.19[msec]
fitness: 0.213937

--- multi-threaded NDT (DIRECT7, 1 threads) ---
single : 139.433[msec]
10times: 1356.79[msec]
fitness: 0.214205

--- multi-threaded NDT (DIRECT1, 1 threads) ---
single : 34.6418[msec]
10times: 317.03[msec]
fitness: 0.208511

--- multi-threaded NDT (KDTREE, 8 threads) ---
single : 54.9903[msec]
10times: 500.51[msec]
fitness: 0.213937

--- multi-threaded NDT (DIRECT7, 8 threads) ---
single : 63.1442[msec]
10times: 343.336[msec]
fitness: 0.214205

--- multi-threaded NDT (DIRECT1, 8 threads) ---
single : 17.2353[msec]
10times: 100.025[msec]
fitness: 0.208511

@kunaltyagi kunaltyagi added changelog: new feature Meta-information for changelog generation needs: code review Specify why not closed/merged yet module: registration labels May 22, 2020
@kunaltyagi
Copy link
Member

Looking at the time taken, any reason why DIRECT1 should not be the default case for NDT implementation?

@koide3
Copy link
Contributor Author

koide3 commented May 22, 2020

I observed that registration results with DIRECT1 can be unreliable when the displacement is larger than the voxel resolution while DIRECT7 can deal with such cases by smoothing gradients over neighboring voxels.

As DIRECT7 is reasonably fast and reliable, I personally think it should be the default case. But, I consider that it depends on the policy of PCL on the default case choice (speed vs reliability).

Copy link
Member

@kunaltyagi kunaltyagi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-through review!! Haven't seen any source material.

#include <pcl/registration/ndt_omp.h>
#include <pcl/registration/impl/ndt_omp.hpp>

template class PCL_EXPORTS pcl::NormalDistributionsTransformOMP<pcl::PointXYZ, pcl::PointXYZ>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicit instantiations should only be done in case PCL_NO_PRECOMPILE is not provided.

/** \brief The voxel grid generated from target cloud containing point means and covariances. */
TargetGrid target_cells_;

// double fitness_epsilon_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unneeded code/comment?

Eigen::Matrix<double, 6, 6>& hessian,
PointCloudSource& trans_cloud);

/** \brief Update interval of possible step lengths for More-Thuente method, \f$ I \f$ in More-Thuente (1994)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I appreciate the effort in documentation, it'd be better to create a struct, and provide comments in one place, not multiple places.

int num_threads_;

/** \brief Neighboring voxel search method */
NeighborSearchMethod search_method;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

search_method_ instead

h_ang_f2_, h_ang_f3_;

/** \brief Matrix composed of h_ang_*_ (for SSE optimization) */
Eigen::Matrix<float, 16, 4> h_ang;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems no reason to keep h_ang_*_ and h_ang separate

inline float
getResolution() const
{
return (resolution_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In new code, no need to parenthesis protect the value.

template <typename PointSource, typename PointTarget>
void
pcl::NormalDistributionsTransformOMP<PointSource, PointTarget>::computePointDerivatives(Eigen::Vector3d& x, Eigen::Matrix<double, 3, 6>& point_gradient, Eigen::Matrix<double, 18, 6>& point_hessian, bool compute_hessian) const
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to zero the point_{gradient,hessian} (Though point_jacobian is better name due to symmetry with hessian)


/** \brief Precompute anglular components of derivatives.
* \note Equation 6.19 and 6.21 [Magnusson 2009].
* \param[in] p the current transform vector
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p isn't a nice name for a transform vector in implementation, although it may be true to its literary roots

sz = sin(p(5));
}

// Precomputed angular gradiant components. Letters correspond to Equation 6.19 [Magnusson 2009]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gradient, or better yet, jacobian

j_ang_h_ << (sx * cz + cx * sy * sz), (cx * sy * cz - sx * sz), 0;

j_ang.setZero();
j_ang.row(0).noalias() = Eigen::Vector4f((-sx * sz + cx * sy * cz), (-sx * cz - cx * sy * sz), (-cx * cy), 0.0f);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like you're duplication calculation for some unknown reason

@kunaltyagi
Copy link
Member

I observed that registration results with DIRECT1 can be unreliable when the displacement is larger than the voxel resolution while DIRECT7 can deal with such cases by smoothing gradients over neighboring voxels

Adding this observation in detail docstring for enum would be helpful 😄

@kunaltyagi kunaltyagi added needs: more work Specify why not closed/merged yet and removed needs: code review Specify why not closed/merged yet labels May 22, 2020
@koide3
Copy link
Contributor Author

koide3 commented May 22, 2020

Thanks for reviewing. I'll fix them soon. Please note that some of the issues came from the original NDT. If you like, I can refactor the original one as well. What do you think?

@kunaltyagi
Copy link
Member

I can refactor the original one as well. What do you think?

Is it possible to have 3 PR instead?

  1. VoxelGridCovariance
  2. Original NDT
  3. NDT OMP

That'd make it easier to digest changes, specially since this is pushing 1700 lines already

@koide3
Copy link
Contributor Author

koide3 commented May 22, 2020

OK, I'll open 3 separated PRs.

@stale
Copy link

stale bot commented Jun 21, 2020

Marking this as stale due to 30 days of inactivity. Commenting or adding a new commit to the pull request will revert this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog: new feature Meta-information for changelog generation module: registration needs: more work Specify why not closed/merged yet status: stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants