Hi, thank you for open-sourcing such amazing work!
I have a question regarding the training details described in the paper. In the second stage (real-data training), are the depth loss and the depth adjustment module still used? My understanding is that no ground-truth depth is available at this stage, so is this stage purely RGB self-supervised?