Hi, first of all, thank you for sharing the code and resources with the community! I’ve been experimenting with the four pretrained models provided in the repository to extract depth maps. While testing, I adjusted the network size parameters (net_h, net_w) and observed that increasing these values seemed to improve the detail in the depth estimation, especially in more complex regions of the images.
However, I am concerned that increasing these values too far might introduce a trade-off, where the model focuses on local features at the cost of global geometric consistency across the image. I would like to know your thoughts on this hypothesis: could increasing the network input size reduce global geometric coherence?
Additionally, for images with a resolution of 1920x1080, I aim to obtain a dense depth map without geometric inconsistencies. Could you recommend which of the four pretrained models would be best suited for this task? And, based on your experience, what values of net_h and net_w would best balance detail and global consistency?
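To make the question concrete, here is a minimal sketch of how candidate net_h and net_w values could be derived for a 1920x1080 input while roughly keeping the 16:9 aspect ratio. The helper name `candidate_net_sizes`, the short-side values, and the requirement that both dimensions be multiples of 32 are my own assumptions for illustration, not something taken from the repository.

```python
def candidate_net_sizes(img_w, img_h, short_sides, multiple=32):
    """Return (net_h, net_w) pairs that roughly keep the image aspect ratio.

    Assumption: the network expects both dimensions to be a multiple of
    `multiple` (32 here is a guess, not confirmed by the repository).
    """
    sizes = []
    for short in short_sides:
        # Scale so that the shorter image side maps to the requested size.
        scale = short / min(img_w, img_h)
        # Snap each dimension to the nearest allowed multiple.
        net_h = max(multiple, round(img_h * scale / multiple) * multiple)
        net_w = max(multiple, round(img_w * scale / multiple) * multiple)
        sizes.append((net_h, net_w))
    return sizes


# Illustrative short-side values only; these are not recommended settings.
for net_h, net_w in candidate_net_sizes(1920, 1080, [384, 512, 768]):
    print(f"net_h={net_h}, net_w={net_w}")
```

Snapping to a multiple changes the aspect ratio slightly, so part of my question is whether that small distortion matters compared with the choice of overall size.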
Thanks again for your help and for providing this fantastic tool!