
Confusion with alignment matrix #31

Closed
laurelkeys opened this issue May 11, 2021 · 4 comments

@laurelkeys

Hi, while I was going through the code in python_toolbox/evaluation/ to better understand how the evaluation metrics are computed I got a little confused by the way alignment / transformation matrices are applied.

From what I understand, the adopted convention is that the matrices align the reconstructed pose to the ground truth (as mentioned in #12 (comment) and in section 3-1 of the tutorial), i.e., using Open3D's parameter names: "source = reconstructed / estimate" and "target = ground-truth".

Hence, in run_evaluation() the transformation matrix gt_trans should align the reconstruction to the ground-truth (right?).

However, in trajectory_alignment() the transformation is applied to the ground-truth trajectory:
https://github.com/intel-isl/TanksAndTemples/blob/90cd206d6991acec775cf8a2788517d7ecc30c2f/python_toolbox/evaluation/registration.py#L65-L69

Does it make sense to apply a "reference to ground-truth" transform to data in the ground-truth coordinate frame? Shouldn't this use the inverse transform, effectively taking "ground-truth to reference" (i.e. traj_pcd_col.transform(np.linalg.inv(gt_trans)))? Or instead, apply the transformation to the reference data (traj_to_register_pcd in this case)?
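For concreteness, here is a toy numpy sketch of the equivalence I mean (the transform `T` and the points are made up): applying `T` to a source-frame point, or `inv(T)` to the corresponding target-frame point, both land the pair in a common frame.

```python
import numpy as np

# Hypothetical 4x4 "source -> target" transform: rotation about z plus translation.
theta = np.deg2rad(30.0)
T = np.array([
    [np.cos(theta), -np.sin(theta), 0.0, 1.0],
    [np.sin(theta),  np.cos(theta), 0.0, 2.0],
    [0.0,            0.0,           1.0, 0.5],
    [0.0,            0.0,           0.0, 1.0],
])

# A point expressed in the source frame, and its counterpart in the target frame.
p_src = np.array([1.0, 0.0, 0.0, 1.0])
p_tgt = T @ p_src

# Option A: map the source point into the target frame with T.
residual_a = (T @ p_src) - p_tgt

# Option B: map the target point into the source frame with inv(T).
residual_b = (np.linalg.inv(T) @ p_tgt) - p_src

# Both residuals vanish: the two options only differ in which frame you compare in.
```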

Thank you.

@arknapit
Contributor

Hi Tiago,

The variable names are a bit ambiguous here, and I see how this can be confusing. The function trajectory_alignment() aligns the "source" camera trajectory with our known "target" trajectory and, in addition, puts it in real-world (LiDAR) coordinates. This target trajectory (gt_traj_col, e.g. Ignatius_COLMAP_SfM.log) still lives in an arbitrary reference frame output by COLMAP, so we first transform it to the LiDAR reference frame using gt_trans, and then run ICP between the camera positions to get the final trajectory alignment (it's a precursor for the final refinement with the dense point cloud later). The "GT" in the name gt_traj_col just means that it is the camera trajectory we want traj_to_register aligned to; it's basically the camera trajectory for which we know how it aligns to the GT reconstruction.
So to summarize: for this automatic alignment procedure, all we usually have is the "source" camera poses in an arbitrary reference frame, so we need three additional things to calculate this "pre-alignment" step (see here):

  1. map_file = for temporal alignment, in case you used different frames from the video than what we provide as image samples (see tutorial A, case 2 for more details)
  2. gt_traj_col = the camera trajectory of the target reconstruction (it's not in real-world coordinates; it's just another reconstruction)
  3. gt_trans = the transformation to get gt_traj_col into the actual real-world (LiDAR) coordinates
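The recipe can be sketched end to end with synthetic data. Everything below is made up for illustration: `kabsch()` is a stand-in for the ICP-based registration the toolbox actually runs, and the "trajectories" are just random camera centres.

```python
import numpy as np

def apply_transform(points, T):
    """Apply a 4x4 homogeneous transform to an (N, 3) array of positions."""
    homog = np.hstack([points, np.ones((len(points), 1))])
    return (T @ homog.T).T[:, :3]

def kabsch(src, tgt):
    """Least-squares rigid transform (4x4) taking src points onto tgt points."""
    mu_s, mu_t = src.mean(axis=0), tgt.mean(axis=0)
    U, _, Vt = np.linalg.svd((src - mu_s).T @ (tgt - mu_t))
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = mu_t - R @ mu_s
    return T

rng = np.random.default_rng(0)

# Toy stand-in for gt_traj_col: camera centres in the arbitrary COLMAP frame.
gt_traj_col_pts = rng.uniform(-1.0, 1.0, size=(10, 3))

# Toy stand-in for gt_trans: COLMAP frame -> real-world (LiDAR) frame.
gt_trans = np.eye(4)
gt_trans[:3, 3] = [5.0, 0.0, 2.0]

# Step 1: move the target trajectory into real-world coordinates.
gt_traj_world = apply_transform(gt_traj_col_pts, gt_trans)

# Toy "source" trajectory: the same cameras expressed in yet another frame.
T_true = np.eye(4)
T_true[:3, 3] = [0.0, -3.0, 1.0]  # the unknown source -> world transform
traj_to_register_pts = apply_transform(gt_traj_world, np.linalg.inv(T_true))

# Step 2: estimate source -> world from the camera positions (stand-in for ICP).
trajectory_transform = kabsch(traj_to_register_pts, gt_traj_world)
```

With noiseless data the estimated `trajectory_transform` recovers `T_true` exactly; in the real pipeline this rough alignment is then refined against the dense point cloud.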

Let me know if there are still questions.

@laurelkeys
Author

Hi Arno,

Thank you for the detailed reply!

Does this mean that – besides gt_traj_col, which is in COLMAP's (arbitrary) reference frame – all other data is either in the estimated reconstruction's reference frame or in known real-world coordinates? That is:

  • COLMAP (arbitrary reference frame):
    • gt_traj_col = read_trajectory(colmap_ref_logfile)
  • Estimated reconstruction ("source"):
    • traj_to_register = read_trajectory(traj_path)
    • pcd = o3d.io.read_point_cloud(ply_path)
  • Real-world ("target" ground-truth):
    • vol = o3d.visualization.read_selection_polygon_volume(cropfile)
    • gt_pcd = o3d.io.read_point_cloud(gt_filen)

And so, gt_trans = np.loadtxt(alignment) is a transformation matrix from COLMAP to "target" (i.e. real-world), while trajectory_transform is a transformation from "source" to "target" (and so are the three r*.transformation)?
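Put differently, the frame bookkeeping I have in mind looks like this (a minimal numpy sketch; `rigid()` and both transforms are hypothetical stand-ins, not the real matrices):

```python
import numpy as np

def rigid(angle_deg, t):
    """Build a 4x4 rigid transform: rotation about z by angle_deg, then translation t."""
    a = np.deg2rad(angle_deg)
    T = np.eye(4)
    T[:2, :2] = [[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]]
    T[:3, 3] = t
    return T

gt_trans = rigid(40.0, [2.0, 0.0, 1.0])               # COLMAP frame -> real world
trajectory_transform = rigid(-15.0, [0.0, 3.0, 0.0])  # source frame -> real world

# Composing the two gives a COLMAP -> source transform, if one were needed:
colmap_to_source = np.linalg.inv(trajectory_transform) @ gt_trans

p_colmap = np.array([1.0, 2.0, 3.0, 1.0])  # a point stored in the COLMAP frame
p_world = gt_trans @ p_colmap              # its real-world coordinates
p_source = colmap_to_source @ p_colmap     # the same point in the source frame

# Mapping the source-frame point back to the world agrees with the direct route.
```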

@arknapit
Contributor

Exactly:
trajectory_transform is the rough pre-alignment using the camera positions, and the registration refinement is done using the dense point clouds.
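That refinement is essentially point-to-point ICP on the dense clouds (the toolbox itself calls into Open3D's registration pipeline). A self-contained numpy sketch of the idea, on synthetic data with a small made-up offset:

```python
import numpy as np

def icp_point_to_point(src, tgt, iters=20):
    """Minimal point-to-point ICP: brute-force correspondences + SVD update."""
    T = np.eye(4)
    cur = src.copy()
    for _ in range(iters):
        # Nearest neighbour in tgt for every current source point.
        d = np.linalg.norm(cur[:, None, :] - tgt[None, :, :], axis=2)
        matched = tgt[np.argmin(d, axis=1)]
        # Best rigid transform for the current correspondences (Kabsch).
        mu_s, mu_t = cur.mean(axis=0), matched.mean(axis=0)
        U, _, Vt = np.linalg.svd((cur - mu_s).T @ (matched - mu_t))
        sign = np.sign(np.linalg.det(Vt.T @ U.T))
        R = Vt.T @ np.diag([1.0, 1.0, sign]) @ U.T
        t = mu_t - R @ mu_s
        step = np.eye(4)
        step[:3, :3] = R
        step[:3, 3] = t
        T = step @ T
        cur = cur @ R.T + t
    return T

rng = np.random.default_rng(1)
tgt = rng.uniform(-1.0, 1.0, size=(50, 3))

# A small translational misalignment, well within the convergence basin.
T_true = np.eye(4)
T_true[:3, 3] = [0.05, -0.03, 0.02]
src = tgt - T_true[:3, 3]

T_est = icp_point_to_point(src, tgt)
```

With this small, noise-free offset the estimated transform matches `T_true`; the real evaluation refines the coarse trajectory alignment the same way, just against the cropped LiDAR ground-truth cloud.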

@laurelkeys
Author

Awesome, thank you for the replies!
