Hello DINO-V2 Community,

I have a dataset for monocular depth estimation, and I am planning to fine-tune a pre-trained ViT backbone + DPT head on it. During fine-tuning the backbone will be frozen; only the DPT head will be optimized. I loaded the weights for the ViT backbone, but when I simply tried to see what the model would generate as output, I hit the following error:

"Inference tensors cannot be saved for backward. To work around you can make a clone to get a normal tensor and use it in autograd."

My full code is included below. As far as I understand, the forward pass can only be done under with torch.inference_mode():. In that case, how can I fine-tune this model on my depth dataset?
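For context, a frozen-backbone / trainable-head setup does not require torch.inference_mode() at all. Below is a minimal sketch of that training pattern; the two nn.Linear modules are placeholders standing in for the ViT backbone and DPT head, not the actual DINOv2 modules:

```python
import torch
import torch.nn as nn

# Placeholder modules: stand-ins for the ViT backbone and DPT head.
backbone = nn.Linear(16, 16)
head = nn.Linear(16, 1)

# Freeze the backbone; only the head's parameters will be optimized.
for p in backbone.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)

x = torch.randn(4, 16)
target = torch.randn(4, 1)

# torch.no_grad() (not inference_mode) is enough to skip graph
# construction for the frozen backbone; the resulting features are
# ordinary tensors, so the head's forward/backward still work.
with torch.no_grad():
    feats = backbone(x)

pred = head(feats)
loss = nn.functional.mse_loss(pred, target)
loss.backward()
optimizer.step()
```

The key difference: tensors produced under torch.no_grad() can still feed into autograd downstream, whereas tensors produced under torch.inference_mode() can never be saved for backward, which is exactly the error above.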
Edit: I have found what causes this problem. There is a forward_pre_hook registered on the ViT backbone inside the function _make_dinov2_dpt_depther. The hook is CenterPadding, defined in dinov2.hub.utils, and its forward is decorated with @torch.inference_mode(). As far as I understand, this center padding handles the case where the image size is not a multiple of the patch size. What confuses me is why it is decorated with @torch.inference_mode() at all; the padding seems just as useful during training. Is there a specific reason it is restricted to inference?

Thanks in advance.
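To illustrate the problem: a pre-hook whose forward runs under @torch.inference_mode() returns inference tensors, and autograd refuses to save those for backward. A padding module without the decorator works in both training and inference. The sketch below models the behavior described above (pad H and W up to the next multiple of the patch size, split evenly on both sides); it is illustrative, not the actual dinov2.hub.utils.CenterPadding code:

```python
import math
import torch
import torch.nn.functional as F

class CenterPadding(torch.nn.Module):
    """Pad the last two dims so each becomes a multiple of `multiple`."""

    def __init__(self, multiple: int):
        super().__init__()
        self.multiple = multiple

    def _pad(self, size: int):
        new_size = math.ceil(size / self.multiple) * self.multiple
        total = new_size - size
        left = total // 2
        return left, total - left

    # No @torch.inference_mode() here: the forward runs in whatever
    # grad mode the caller is in, so its output can participate in
    # autograd during training.
    def forward(self, x):
        # F.pad takes pads for the last dim first, i.e. (W_left, W_right, H_top, H_bottom).
        pads = [p for size in x.shape[-1:-3:-1] for p in self._pad(size)]
        return F.pad(x, pads)

pad = CenterPadding(14)
x = torch.randn(1, 3, 518, 520)
y = pad(x)  # 518 is already a multiple of 14; 520 is padded up to 532
```

A quicker workaround, if you do not want to touch the hook, is the one the error message itself suggests: clone the hook's output outside inference mode to get a normal tensor before it reaches autograd.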