FastVGICPCuda not working while FastVGICP does #69

Open
Dysl3xik opened this issue Sep 7, 2021 · 8 comments

Comments

@Dysl3xik

Dysl3xik commented Sep 7, 2021

I am trying to test both variants of VGICP using the same data set, and the CUDA variant seems to have a bug: it is simply not returning real results.

When I use the CUDA variant, it seems to take some time on the covariance calculations, but the LM optimization just returns all zeros and/or NaN. Switching to Gauss-Newton results in it immediately returning the input transform unchanged (identity in this case).

If I test using VGICP on the same data and parameters, I get the expected results.

I am using CUDA 11.4, if that makes any difference.

Any ideas what may be wrong here?

@Dysl3xik
Author

Dysl3xik commented Sep 7, 2021

I actually tracked this down to the RegularizationMethod parameter. When it's set to Planar (the default), the code does not work on either my sample set or your provided data.

If I change it to Frobenius, it seems to work fine.
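
For readers unfamiliar with the Frobenius option, here is a minimal standalone sketch of that style of covariance regularization. It assumes the method adds a small `lambda` to the diagonal and then rescales the matrix so that its inverse has unit Frobenius norm. This is an illustration under that assumption, not a copy of the library's code; the names `Mat3`, `det3`, `inverse3`, `frobenius_norm`, and `regularize_frobenius` are hypothetical, and the default `lambda = 1e-3` is also an assumption.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

// Determinant of a 3x3 matrix (cofactor expansion along the first row).
double det3(const Mat3& m) {
  return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
       - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
       + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

// Inverse via the adjugate. The matrix must be non-singular.
Mat3 inverse3(const Mat3& m) {
  const double d = det3(m);
  assert(d != 0.0);
  Mat3 r;
  r[0][0] = (m[1][1] * m[2][2] - m[1][2] * m[2][1]) / d;
  r[0][1] = (m[0][2] * m[2][1] - m[0][1] * m[2][2]) / d;
  r[0][2] = (m[0][1] * m[1][2] - m[0][2] * m[1][1]) / d;
  r[1][0] = (m[1][2] * m[2][0] - m[1][0] * m[2][2]) / d;
  r[1][1] = (m[0][0] * m[2][2] - m[0][2] * m[2][0]) / d;
  r[1][2] = (m[0][2] * m[1][0] - m[0][0] * m[1][2]) / d;
  r[2][0] = (m[1][0] * m[2][1] - m[1][1] * m[2][0]) / d;
  r[2][1] = (m[0][1] * m[2][0] - m[0][0] * m[2][1]) / d;
  r[2][2] = (m[0][0] * m[1][1] - m[0][1] * m[1][0]) / d;
  return r;
}

// Square root of the sum of squared entries.
double frobenius_norm(const Mat3& m) {
  double sum = 0.0;
  for (const auto& row : m) {
    for (double v : row) sum += v * v;
  }
  return std::sqrt(sum);
}

// Assumed form of Frobenius regularization (hypothetical helper):
//   C = cov + lambda * I,  result = inverse(C^-1 / ||C^-1||_F)
Mat3 regularize_frobenius(const Mat3& cov, double lambda = 1e-3) {
  Mat3 c = cov;
  for (int i = 0; i < 3; i++) c[i][i] += lambda;  // keeps C invertible for PSD input
  Mat3 c_inv = inverse3(c);
  const double n = frobenius_norm(c_inv);
  for (auto& row : c_inv) {
    for (double& v : row) v /= n;
  }
  return inverse3(c_inv);
}
```

Because this path needs only a 3x3 inverse and a norm, it never calls an eigenvalue solver, which may be part of why it behaves differently from the plane regularization here.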

@koide3
Owner

koide3 commented Sep 10, 2021

I'm not really sure, but I guess it is a problem with Eigen::SelfAdjointEigenSolver, which is used in the PLANE regularization and may misbehave on some GPUs. What GPU are you using? Can you insert the following test code just below eig.computeDirect(cov); to see if the eigenvalue decomposition is working properly?

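    // --- test code ---
    // Reconstruct cov from its eigendecomposition: C_ = V * diag(values) * V^-1
    Eigen::Vector3f values = eig.eigenvalues();
    Eigen::Matrix3f v_diag = values.asDiagonal();
    Eigen::Matrix3f v_inv = eig.eigenvectors().inverse();

    Eigen::Matrix3f C_ = eig.eigenvectors() * v_diag * v_inv;

    // A large reconstruction error means the decomposition itself is wrong
    if((cov - C_).array().abs().maxCoeff() > 1e-3) {
      printf("wrong eigendecomposition result\n");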
      printf("--- C ---\n");
      for(int i=0; i<3; i++) {
        for(int j=0; j<3; j++) {
          printf("%.6f ", cov(i, j));
        }
        printf("\n");
      }

      printf("--- C_ ---\n");
      for(int i=0; i<3; i++) {
        for(int j=0; j<3; j++) {
          printf("%.6f ", C_(i, j));
        }
        printf("\n");
      }
    }
    // ---

@Dysl3xik
Author

The GPU is an RTX 8000.

The test code did not trigger on anything. I also went through all the covariance code looking for NaN or Inf values, and I am not seeing any show up anywhere...

@Dysl3xik
Author

I also tried reverting the plane code to the commented-out section that uses covariance_regularization_svd(), and I still get the same result. Maybe that provides some more useful information...

@rzhao88

rzhao88 commented Jan 2, 2022

I found a fix for this. rzhao88@2159f39

Comment is wrong, I was using CUDA 11.5

@cdb0y511
Contributor

cdb0y511 commented Jan 16, 2022

> I found a fix for this. rzhao88@2159f39
>
> Comment is wrong, I was using CUDA 11.5

Hi @rzhao88,
I have tested it on CUDA 11.5. It fixes the issue where the covariance computation sometimes returns NaN or Inf in the CUDA version.
I think you could make a PR; the change to src/fast_gicp/cuda/covariance_regularization.cu fixes it.
Thanks, @rzhao88.
@koide3, I think you may be interested in this.

@rzhao88

rzhao88 commented Feb 3, 2022

I don't currently have the time (due to job constraints) to do a clean fix for this. Feel free to take it and make a PR. @koide3 @cdb0y511

cdb0y511 added a commit to cdb0y511/fast_gicp that referenced this issue Feb 8, 2022
Take from rzhao88@2159f39.
Works on cuda 11.5 and 11.6.
This issue may be caused by the mixed device and host memory usage.
koide3 pushed a commit that referenced this issue Feb 11, 2022
Take from rzhao88@2159f39.
Works on cuda 11.5 and 11.6.
This issue may be caused by the mixed device and host memory usage.
@JACKLiuDay

I have the same problem. Do you guys have any ideas?
