Thanks for your interest. The main motivation of this paper is indeed to guide the slot masks with motion cues (motion segments).
In both our paper and the most recent update, we use both ground-truth segmentation masks and estimated motion segments (which are computed from self-supervised flow and a model pre-trained on the toy FlyingThings dataset).
We agree that even the lightweight estimated motion still needs some minimal supervision from the toy dataset, but that model never had access to any ground truth from the target dataset. On the KITTI dataset, we did not use any ground-truth segments and still achieve reasonable results.
Thanks a lot for pointing out this shortcoming of our model. We are aiming to reduce the supervision further until we achieve fully self-supervised object discovery.
In the paper, you use segmentation masks for supervision, so why is it called an unsupervised method?