Question about model generalization #3

puyiwen · 2023-07-28T06:06:40Z

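Hello, I noticed that your paper uses a large number of RGB-D datasets. Is the model's good generalization mainly due to mixing so many datasets, or does it also benefit from normalizing the different cameras? I previously worked on single-modality RGB depth estimation for indoor scenes. I mixed several public RGB-D datasets with RGB-D data I collected myself in a few experimental scenes (plain mixing, without this kind of camera normalization). Generalization is still poor, though: as soon as an object appears in a different scene, the error of my predicted depth becomes very large. If I use your camera normalization method but still only mix indoor data, would generalization improve?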

YvanYin · 2023-07-28T06:51:40Z

Thanks for your attention to our work.

I believe the generalization problem has two parts.

The first is scene generalization. Previous methods, such as DIW, MiDaS, DPT, and our DiverseDepth/LeReS, mainly focus on this part. They merge large-scale, diverse data during training. Because they use a ranking loss or a scale-shift invariant loss, the camera variation issue is decoupled from training. As a result, they produce a strong and robust depth model, i.e. one that generalizes to diverse scenes. If you only mix indoor datasets, the model will not work on in-the-wild scenes.
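To make the "scale-shift invariant" idea concrete, here is a minimal NumPy sketch. It assumes the common formulation in which a scale and a shift are fitted by least squares before the error is measured. The function names are hypothetical, and the methods named above each use their own variants (for example, some operate in inverse-depth space), so treat this as an illustration rather than any paper's exact loss.

```python
import numpy as np


def align_scale_shift(pred, target, mask=None):
    """Fit scale s and shift b so that s * pred + b best matches target (least squares)."""
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    if mask is None:
        mask = np.ones(pred.shape, dtype=bool)
    else:
        mask = np.asarray(mask, dtype=bool).ravel()
    p, t = pred[mask], target[mask]
    # Design matrix [pred, 1] so the solution vector is (scale, shift).
    design = np.stack([p, np.ones_like(p)], axis=1)
    solution, *_ = np.linalg.lstsq(design, t, rcond=None)
    return float(solution[0]), float(solution[1])


def scale_shift_invariant_l1(pred, target, mask=None):
    """Mean absolute error after the best scale/shift alignment of pred to target."""
    s, b = align_scale_shift(pred, target, mask)
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    if mask is None:
        mask = np.ones(pred.shape, dtype=bool)
    else:
        mask = np.asarray(mask, dtype=bool).ravel()
    return float(np.mean(np.abs(s * pred[mask] + b - target[mask])))
```

Because the alignment absorbs any global scale and offset, two cameras that yield differently scaled depth for the same scene give the same loss. This is why such a loss sidesteps the camera issue, and also why it cannot recover metric depth on its own.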

The second is the camera problem, which is related to the metric problem. If you do not need absolute depth, you can ignore this and follow previous methods to train a robust relative depth model. However, if your evaluation focuses on metric depth, you can follow our method to preprocess the data. This helps the model converge and achieve both strong generalization and metric recovery ability. Also note that indoor data alone cannot ensure strong generalization.
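As a rough illustration of what such camera-normalizing preprocessing could look like, the sketch below rescales ground-truth depth by the ratio of a canonical focal length to the real focal length, then inverts that scaling at inference. This is an assumption about one common way to formulate the normalization, not code from the authors. The function names are hypothetical, and the default of 1000 is simply the value mentioned in this thread.

```python
import numpy as np

# Assumed canonical focal length in pixels; treat it as a tunable hyperparameter.
CANONICAL_FOCAL = 1000.0


def to_canonical_depth(depth, focal, canonical_focal=CANONICAL_FOCAL):
    """Scale metric depth labels as if the image had been taken with the canonical focal length."""
    return np.asarray(depth, dtype=np.float64) * (canonical_focal / focal)


def from_canonical_depth(canonical_depth, focal, canonical_focal=CANONICAL_FOCAL):
    """Undo the scaling to recover metric depth for the real camera at inference time."""
    return np.asarray(canonical_depth, dtype=np.float64) * (focal / canonical_focal)
```

With this convention, every training sample needs a known focal length, and the model always regresses depth in the same canonical space regardless of which dataset the sample came from.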

Hope this can help you.

puyiwen · 2023-07-28T06:58:19Z

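Thank you very much for your reply. I have a few more questions.

1. My application is currently indoor only. Do I still need to add outdoor datasets to improve generalization on indoor scenes?
2. For the canonical camera focal length, can I just pick an arbitrary value, or is there a specific value to use?
3. When you release the code, will you also release the camera intrinsics for all the datasets you used?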

JUGGHM · 2023-07-28T10:29:26Z

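My understanding:
1. Probably not needed.
2. The paper has an ablation study that discusses this.
3. The paper lists all the datasets used; the specific intrinsics should be included with each dataset.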

kwea123 · 2023-07-29T05:52:13Z

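The paper mentions that the chosen canonical focal length affects the results, and that the best value in the paper is 1000. My feeling is that this is only because most of the training data has a focal length of around 1000. If the mean focal length of your data is not around 1000, you may need to switch to a different optimal focal length.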
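One way to examine this hypothesis on your own data is to summarize the focal lengths per dataset before choosing a canonical value. The helper below is a hypothetical sketch using only the Python standard library. It assumes you have already collected each dataset's focal lengths in pixels into a dictionary.

```python
from statistics import mean, median


def focal_summary(focals_by_dataset):
    """Return mean, median, min, and max focal length (in pixels) for each dataset."""
    summary = {}
    for name, focals in focals_by_dataset.items():
        values = list(focals)
        summary[name] = {
            "mean": mean(values),
            "median": median(values),
            "min": min(values),
            "max": max(values),
        }
    return summary
```

Comparing these statistics against a candidate canonical focal length shows how far each dataset's depth labels would be rescaled.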

YvanYin · 2023-07-30T11:20:31Z

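In training, most of the data is not at 1000 either. For example, most of Taskonomy is around 500-700, and test datasets such as NYU are not around 500 either. So it is not that the mean is at 1000. The ablation on this point could be explored further.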

YvanYin closed this as completed Jul 30, 2023

mwdotzom mentioned this issue Sep 27, 2023

defective prediction for large focal length #19

Closed
