Hi,

I have a dataset of images with known camera intrinsics and extrinsics (poses), so I am trying to generate the transforms_train.json file without running COLMAP and the rest of the preprocessing. To do this, I am trying to work out how transforms_train.json is derived from the COLMAP sparse reconstruction for the ScanNet scenes. I figured out the relation between the rotation matrices, but I cannot figure out how the translation is scaled: I found a different scaling factor for each of the 5 scenes. I also tried to read the C++ code that generates transforms_train.json, but I am not familiar with C++ and could not work it out.
Can you please tell me how to compute the scaling factor for the camera translation?
PS: I also noticed that, unlike the original NeRF, which scales the translation so that the nearest scene depth becomes unity, you do not apply such a scaling here.
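For reference, this is roughly how I am building the file at the moment. It is only a sketch: I am assuming the standard NeRF-style keys (camera_angle_x, file_path, transform_matrix) and that my input poses are OpenCV/COLMAP-convention camera-to-world matrices; the exact keys this repository expects may differ.

```python
import json
import math
import numpy as np

def cv_to_nerf(c2w_cv: np.ndarray) -> np.ndarray:
    """Convert an OpenCV/COLMAP camera-to-world pose (x right, y down, z forward)
    to the NeRF/OpenGL convention (x right, y up, z backward) by flipping
    the y and z camera axes."""
    c2w = c2w_cv.copy()
    c2w[:3, 1:3] *= -1.0
    return c2w

def write_transforms(poses, image_paths, fx, width, out_path="transforms_train.json"):
    frames = [
        {"file_path": p, "transform_matrix": cv_to_nerf(c2w).tolist()}
        for p, c2w in zip(image_paths, poses)
    ]
    data = {
        # horizontal field of view derived from the focal length in pixels
        "camera_angle_x": 2.0 * math.atan(0.5 * width / fx),
        "frames": frames,
    }
    with open(out_path, "w") as f:
        json.dump(data, f, indent=2)
```

What I am missing is the scale factor to apply to the translation column of each transform_matrix.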
Here is an explanation of the content of transforms_train.json and the definition of the transformation matrix. Does this answer your question?
In the preprocessing of the ScanNet SfM reconstructions, we apply a scaling to the reconstruction (sparse depth and poses), because an SfM reconstruction is only determined up to a scale factor. This scaling makes it possible to compute depth metrics against the ScanNet sensor depth (see the sketch after the list below for one way to estimate such a factor).
Scaling of the sparse depth and poses is needed only when:
you have ground-truth depth to evaluate against whose scale differs from that of the camera poses and sparse depth, or
the depth prior network was trained on a different range of sparse depth.
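For illustration, here is a minimal sketch of one common way to estimate such a factor: take the median ratio between the sensor depth and the SfM depth over all sparse points that have a valid sensor measurement. This is not the repository's C++ implementation; the function name and data layout are hypothetical.

```python
import numpy as np

def estimate_depth_scale(sfm_depths: np.ndarray, sensor_depths: np.ndarray) -> float:
    """Estimate one global scale factor aligning SfM depth to sensor depth.

    sfm_depths:    1D array of SfM depths at sampled pixel locations.
    sensor_depths: 1D array of sensor depths at the same pixel locations.
    Returns the median of sensor/SfM ratios over valid samples, which is
    robust to outliers in the sparse reconstruction.
    """
    valid = (sfm_depths > 0) & (sensor_depths > 0)  # skip missing measurements
    return float(np.median(sensor_depths[valid] / sfm_depths[valid]))

# The same scale is then applied to the sparse depth values and to the
# camera translations, so the whole reconstruction becomes metric:
#   t_scaled     = scale * t
#   depth_scaled = scale * depth
```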