We propose PureCLIP-Depth, a fully prompt-free, decoder-free monocular depth estimation (MDE) model that operates entirely within the Contrastive Language-Image Pre-training (CLIP) embedding space. Unlike recent models that rely heavily on geometric features, we explore MDE driven by conceptual information, performing all computations directly in the CLIP space. The core of our method is a learned direct mapping from the RGB domain to the depth domain strictly inside this embedding space. Our approach achieves state-of-the-art performance among CLIP-embedding-based models on both indoor and outdoor datasets.
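To make the idea concrete, below is a minimal, hypothetical sketch of a decoder-free depth head that stays inside the CLIP embedding space: a learned mapping takes a CLIP image embedding to the depth domain, and depth is read out as a similarity-weighted combination over learned depth-bin anchor embeddings. All module names, dimensions, and the bin layout are illustrative assumptions, not the actual PureCLIP-Depth implementation.

```python
# Hypothetical sketch of computing depth inside the CLIP embedding space.
# Names, dimensions, and the bin layout are assumptions for illustration.
import math

import torch
import torch.nn as nn


class EmbeddingSpaceDepthHead(nn.Module):
    def __init__(self, embed_dim=512, num_bins=64, min_depth=0.1, max_depth=10.0):
        super().__init__()
        # Learned mapping from the RGB (image) embedding to the depth domain,
        # staying inside the embedding space (no convolutional decoder).
        self.rgb_to_depth = nn.Sequential(
            nn.Linear(embed_dim, embed_dim), nn.GELU(),
            nn.Linear(embed_dim, embed_dim),
        )
        # Learnable depth "anchor" embeddings, one per discretized depth bin.
        self.anchors = nn.Parameter(torch.randn(num_bins, embed_dim))
        # Bin centers spaced log-uniformly between min and max depth.
        self.register_buffer(
            "bin_centers",
            torch.exp(torch.linspace(math.log(min_depth), math.log(max_depth), num_bins)),
        )

    def forward(self, image_embed):
        # image_embed: (B, embed_dim) CLIP image embeddings.
        depth_embed = self.rgb_to_depth(image_embed)
        depth_embed = depth_embed / depth_embed.norm(dim=-1, keepdim=True)
        anchors = self.anchors / self.anchors.norm(dim=-1, keepdim=True)
        # Cosine similarity to each depth anchor -> soft bin assignment.
        weights = (depth_embed @ anchors.t()).softmax(dim=-1)  # (B, num_bins)
        # Expected depth under the soft assignment.
        return weights @ self.bin_centers  # (B,)


head = EmbeddingSpaceDepthHead()
depth = head(torch.randn(2, 512))
print(depth.shape)  # torch.Size([2])
```

Because the output is a convex combination of the bin centers, predicted depths always fall within the chosen depth range; per-pixel prediction would apply the same readout to dense patch embeddings instead of the pooled image embedding.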
Prediction on NYU Depth V2 dataset

Prediction on KITTI dataset (video: KITTI2.mp4)
## Installation

```shell
conda create -n PureCLIP-Depth -y python=3.12
conda activate PureCLIP-Depth
pip install -r requirement.txt
```

## Weight for NYU Depth V2 dataset

## Training

```shell
python main_train_nyu.py
```

## License

This repository is released under the MIT License.
| Project | Link | Description |
|---|---|---|
| CLIP | openai/CLIP | Official implementation of CLIP |
| PyTorch | pytorch/pytorch | Tensors and Dynamic neural networks |
