ryutaroLF/PureCLIP-Depth

Paper

PureCLIP-Depth: Prompt-Free and Decoder-Free Monocular Depth Estimation within CLIP Embedding Space

We propose PureCLIP-Depth, a completely prompt-free, decoder-free Monocular Depth Estimation (MDE) model that operates entirely within the Contrastive Language-Image Pre-training (CLIP) embedding space. Unlike recent models that rely heavily on geometric features, we explore a novel approach to MDE driven by conceptual information, performing computations directly within the conceptual CLIP space. The core of our method lies in learning a direct mapping from the RGB domain to the depth domain strictly inside this embedding space. Our approach achieves state-of-the-art performance among CLIP embedding-based models on both indoor and outdoor datasets.
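The core idea of mapping RGB-domain embeddings to the depth domain inside the CLIP space can be sketched roughly as follows. This is a hypothetical illustration, not the authors' code: the linear map `W`, the number of depth bins, and the bin-anchor readout are all assumptions made for the sketch (in the real model, `W` and the anchors would be learned).

```python
import numpy as np

# Hypothetical sketch: project a CLIP image embedding into a depth-domain
# embedding with a learned linear map, then read out a metric depth value
# as a similarity-weighted average over fixed depth-bin anchor embeddings.
rng = np.random.default_rng(0)
EMBED_DIM = 512   # CLIP ViT-B/32 embedding size
N_BINS = 64       # hypothetical number of depth bins

W = rng.standard_normal((EMBED_DIM, EMBED_DIM)) / np.sqrt(EMBED_DIM)  # "learned" map (random here)
bin_anchors = rng.standard_normal((N_BINS, EMBED_DIM))                # depth-bin embeddings
bin_depths = np.linspace(0.5, 10.0, N_BINS)                           # metric depth per bin (m)

def predict_depth(img_embed):
    """Map into the depth domain, then decode depth by soft attention over bins."""
    z = img_embed @ W
    z /= np.linalg.norm(z)                        # stay on the CLIP unit hypersphere
    logits = bin_anchors @ z                      # similarity to each depth-bin anchor
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                      # softmax over depth bins
    return float(weights @ bin_depths)            # expected (soft-binned) depth

img_embed = rng.standard_normal(EMBED_DIM)        # stand-in for a CLIP image embedding
depth = predict_depth(img_embed)
```

Since the readout is a convex combination of the bin depths, the prediction is always bounded by the bin range, which is one reason soft-binned decoders are common in MDE.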

Prediction on NYU Depth V2 dataset

Prediction on KITTI dataset

Prediction on KITTI dataset (movie)

KITTI2.mp4

Getting Started

Installation

conda create -n PureCLIP-Depth -y python=3.12
conda activate PureCLIP-Depth

pip install -r requirement.txt

Download pre-trained weights

Weights for the KITTI dataset

Weights for the NYU Depth V2 dataset

Train the model

python main_train_nyu.py
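To judge a trained model, MDE papers conventionally report metrics such as delta accuracy and RMSE on NYU Depth V2 and KITTI. The snippet below is a sketch of those standard metrics, not code taken from this repository:

```python
import numpy as np

# Standard monocular-depth metrics: delta-1 accuracy and RMSE.
def depth_metrics(pred, gt):
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    ratio = np.maximum(pred / gt, gt / pred)          # per-pixel max ratio
    delta1 = float((ratio < 1.25).mean())             # fraction within 25% of GT
    rmse = float(np.sqrt(((pred - gt) ** 2).mean()))  # root mean squared error
    return delta1, rmse

d1, rmse = depth_metrics([1.0, 2.0, 3.0], [1.0, 2.2, 3.5])
```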

License

This repository is released under the MIT License.

References

CLIP (openai/CLIP): Official implementation of CLIP
PyTorch (pytorch/pytorch): Tensors and dynamic neural networks
