
Configuration parameters #25

Open
ming82871 opened this issue Jul 24, 2023 · 2 comments

Comments

@ming82871

Hi!
Thanks for your excellent work! We intend to evaluate other datasets with the code you have provided, so we need to construct a config file following your examples. We found that these parameters, especially the bounding box, learning rate, and loss weights, significantly affect rendering and reconstruction in the tracking and mapping threads, even when the GT pose is given. Could you share some suggestions or experience on configuring these parameters for a new indoor dataset such as ETH3D (i.e., how to fine-tune the parameters on a new indoor dataset)?
Thanks!

@HengyiWang
Owner

Hi @ming82871, thanks for using our code. I will try to give you some suggestions for tuning the parameters.

  1. Bounding box: The requirement for the bounding box is that it covers the target object/scene. We use a pre-defined voxel size to construct the feature grid, so the bounding box size usually does not affect the results much. However, if you make the bounding box too large, you may need to increase the hash table size and the dimension of the one-blob encoding as well. Another thing to note is that on the TUM dataset the background wall is sometimes quite far from the target table, and the depth measurements there are quite noisy. Setting a smaller bounding box can help in that case, although we did not do so in our experiments. (We did not put effort into outlier handling, e.g., rejecting pixels with large re-rendering errors; however, this is quite important if you want to apply Co-SLAM to challenging scenes.) You can use the Jupyter notebook at https://github.com/HengyiWang/Co-SLAM/blob/main/vis_bound.ipynb to help set the bounding box if you have GT poses. Otherwise, you need to estimate the bounding box on your own.
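If you have GT poses and depth, the bounding box can be estimated by back-projecting the depth maps into world space and taking the min/max with a margin, which is roughly what the vis_bound notebook does. A minimal sketch of that idea (function names, the margin value, and the toy data are illustrative, not the notebook's actual code):

```python
# Sketch: estimate a scene bounding box from GT poses and depth maps.
# All names and the margin are illustrative placeholders.
import numpy as np

def backproject(depth, K, c2w):
    """Back-project a depth map (H, W) into world-space points (N, 3)."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    z = depth.ravel()
    valid = z > 0  # skip invalid (zero) depth readings
    x = (u.ravel() - K[0, 2]) * z / K[0, 0]
    y = (v.ravel() - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=0)[:, valid]
    return (c2w @ pts_cam)[:3].T  # homogeneous transform to world frame

def estimate_bound(frames, K, margin=0.2):
    """Union of per-frame point clouds, padded by `margin` metres per axis."""
    pts = np.concatenate([backproject(d, K, p) for d, p in frames])
    return np.stack([pts.min(0) - margin, pts.max(0) + margin], axis=1)

# Toy example: one constant-depth frame seen from the identity pose.
K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
depth = np.full((480, 640), 2.0)
bound = estimate_bound([(depth, np.eye(4))], K)  # shape (3, 2): per-axis [min, max]
```

Running this over all (or a subsampled set of) frames gives a per-axis `[min, max]` you can paste into the config's bound field.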

  2. Learning rate and loss weights are usually tuned per camera setting and scene. For example, the TUM dataset has a rolling-shutter effect, so we had to down-weight the color loss and increase the weight of the SDF loss. The TUM dataset also tends to focus on smaller objects rather than indoor scene reconstruction, so a smaller truncation distance is needed. As for learning rates, there are separate rates for the parametric encoding, the decoder, and the pose parameters. We set lr_embed=lr_decoder=0.01 for all experiments. (An interesting detail: if you set lr_embed=0.001, the coordinate encoding will dominate and give you smoother results, and vice versa.) For the pose parameters, we observe that 0.001 is usually the most generalizable for room-scale scene reconstruction.
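To make the knobs above concrete, here is a hedged sketch of the tunable fields as a Python dict. The lr_embed, lr_decoder, and pose learning rates are the values quoted in this comment; the key names, loss weights, truncation, and bound values are illustrative placeholders, not the repo's exact config schema:

```python
# Sketch of the tunable parameters discussed above. Key names, loss
# weights, trunc, and bound are placeholders to adapt per dataset.
config = {
    "mapping": {
        "bound": [[-4.0, 4.0], [-4.0, 4.0], [-2.0, 2.0]],  # scene-specific
        "lr_embed": 0.01,    # parametric (hash) encoding lr, as quoted above
        "lr_decoder": 0.01,  # decoder lr, as quoted above
        "lr_pose": 0.001,    # pose lr; 0.001 quoted as most generalizable
    },
    "training": {
        "rgb_weight": 5.0,     # placeholder: lower for rolling-shutter data
        "sdf_weight": 1000.0,  # placeholder: raise when color is unreliable
        "trunc": 0.1,          # placeholder: smaller for object-centric scenes
    },
}
```

The direction of each adjustment follows the comment: noisy color (rolling shutter) means shifting weight from the color loss to the SDF loss, and object-centric scenes mean a smaller truncation distance.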

I would always suggest trying out the config file of ScanNet first as the setting there will be mostly generalizable to other room-scale scenes.

Please feel free to ask if you need more clarification (also, it would be helpful if you could describe your problems in more detail).

@HengyiWang
Owner

Also, please note that there is documentation detailing each hyper-parameter here.
