
RavenKiller/TAC


Learning Depth Representation from RGB-D Videos by Time-Aware Contrastive Pre-training

[paper] [model card]

Setup environments

  1. Use Anaconda to create a Python 3.8 environment:
conda create -n py38 python=3.8
conda activate py38
  2. Install the requirements:
pip install -r requirements.txt

UniRGBD dataset

A unified and universal RGB-D database for depth representation pre-training.


The script for unifying various RGB-D frames to generate UniRGBD is scripts/rgbd_data.ipynb. You can download our pre-processed version (split into several parts due to its large size): [HM3D][SceneNet][SUN3D][TUM, DIODE, NYUv2][Evaluation data (with ScanNet)][Outdoor data (from RGBD1K and DIML)]. The access code is tacp.
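For reference, below is a minimal sketch of the kind of per-frame conversion such a unification step performs: resizing an RGB-D pair to a common resolution and writing it into the unified folder layout. The target size, paths, and function name are illustrative assumptions; the authoritative code is scripts/rgbd_data.ipynb.

# Illustrative sketch only; see scripts/rgbd_data.ipynb for the actual conversion.
# TARGET_SIZE, the paths, and convert_pair are assumed names, not taken from the repo.
from PIL import Image

TARGET_SIZE = (256, 256)  # assumed unified resolution

def convert_pair(rgb_path, depth_path, out_rgb_path, out_depth_path):
    rgb = Image.open(rgb_path).convert("RGB").resize(TARGET_SIZE, Image.BILINEAR)
    # Depth stays single-channel; nearest-neighbor resizing avoids blending depth values.
    depth = Image.open(depth_path).resize(TARGET_SIZE, Image.NEAREST)
    rgb.save(out_rgb_path)
    depth.save(out_depth_path)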

Important: HM3D is free for academic, non-commercial research, but requires access from Matterport. After obtaining access and the 3D scenes, you can run scripts/hm3d_data.mp.py to generate RGB-D frames, or download the pre-processed version.

After decompression, the folder structure will look like this (there may be a few redundant folders):

data/rgbd_data/
├── diode_clean_resize
│   └── train
│       ├── indoors
│       └── outdoor
├── hm3d_rgbd
│   └── train
│       ├── 0
│       ├── 1
│       └── ...
├── nyuv2_resize
│   ├── all
│   ├── train
│   └── val
├── pretrain_val
│   ├── diode_val
│   ├── hm3d_val
│   ├── nyuv2_val
│   ├── scannet_val
│   ├── scenenet_val500
│   ├── sun3d_val
│   └── tum_val
├── scenenet_resize
│   └── train
│       ├── 0
│       ├── 1
│       └── ...
├── sun3d
│   └── train
└── tumrgbd_clean_resize
    └── train

Note that all path variables in the scripts are absolute, so remember to change them as needed. You can add new datasets by appending the new folder to _C.DATA.RGBD.data_path in config/default.py, as sketched below.
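A rough sketch of that last step follows. The list contents here are illustrative placeholders; keep whatever entries config/default.py already defines and append your own absolute path.

# Sketch: register an extra dataset folder for pre-training (paths are placeholders).
_C.DATA.RGBD.data_path = [
    "/abs/path/to/data/rgbd_data/hm3d_rgbd/train",
    "/abs/path/to/data/rgbd_data/scenenet_resize/train",
    # ... other existing folders ...
    "/abs/path/to/data/rgbd_data/my_new_dataset/train",  # newly added data
]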

Original data source links: HM3D, SceneNet, SUN3D, TUM, DIODE, NYUv2, ScanNet, RGBD1K and DIML.

Run pre-training

train

train.sh is used for training on a single GPU; multi_proc.sh is used for training on multiple GPUs. The pre-trained weights will be stored in data/checkpoints. All configuration files are in the config folder.

evaluate

eval.sh supplies the standard evaluation procedure, including the non-shuffle, block-shuffle, shuffle, and out-of-domain settings. The metric calculation can be found in trainers/dist_trainer.py. The evaluation results will be stored in data/checkpoints/{}/evals.
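As an illustration of what a retrieval-style metric measures, here is a minimal Top-1 accuracy sketch over paired RGB and depth embeddings. It is not the repo's metric code (that lives in trainers/dist_trainer.py); the function and tensor names are assumptions.

# Illustrative Top-1 cross-modal retrieval accuracy; not the repo's implementation.
import torch

def top1_accuracy(rgb_emb, depth_emb):
    # rgb_emb, depth_emb: [N, D] L2-normalized embeddings of paired frames.
    sim = rgb_emb @ depth_emb.t()        # [N, N] cosine similarity matrix
    pred = sim.argmax(dim=1)             # nearest depth embedding for each RGB frame
    target = torch.arange(sim.size(0))   # the paired index is the ground truth
    return (pred == target).float().mean().item()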

check evaluation order

For a fair comparison, we supply the standard evaluation order files here. Run generate_eval_order.sh to check whether your evaluation orders are the same as ours.

Evaluation performance

       Shuffle Top-1    Block-shuffle Near-1    Non-shuffle Near-1    Out-domain Top-1
TAC    0.974            0.642                   0.603                 0.850

Customize usage

scripts/demo.ipynb gives a simple demonstration of encoding a depth image. You can also separate the depth encoder from the whole model as needed.
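Below is an illustrative sketch of the call pattern for a separated depth encoder. The encoder here is a torchvision ResNet-50 stand-in with a single-channel first layer, not the repo's actual encoder class; see scripts/demo.ipynb and the checkpoint below for the real loading code.

# Illustrative stand-in only: a 1-channel ResNet-50 plays the role of the separated
# depth encoder. The real encoder class and checkpoint loading are in scripts/demo.ipynb.
import torch
import torch.nn as nn
from torchvision.models import resnet50

depth_encoder = resnet50()
depth_encoder.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)  # accept single-channel depth
depth_encoder.eval()

depth = torch.rand(1, 1, 224, 224)       # stand-in for a normalized depth map [B, 1, H, W]
with torch.no_grad():
    feature = depth_encoder(depth)       # depth-only representation
print(feature.shape)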

Pre-trained weights

[Checkpoint]

Extended experiments

  1. scripts/uncertainty.ipynb: Conduct the MC Dropout uncertainty analysis (a generic sketch follows this list).
  2. scripts/zero_shot.ipynb: Conduct zero-shot room classification by depth images.
  3. scripts/mae and config/v2/v2_mae.yaml: Train cross-modal masked autoencoder model.
  4. config/v2/v2_edge.yaml: RGBD alignment by Canny edge detection.
  5. config/v2/v2_tac_outdoortune.yaml: Fine-tune model with a few outdoor frames.
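For item 1, the sketch below shows the generic MC Dropout pattern: keep dropout layers active at test time and treat the spread of repeated forward passes as an uncertainty estimate. This is a generic illustration, not the notebook's code; mc_dropout_predict and n_samples are assumed names.

# Generic MC Dropout sketch; the actual analysis is in scripts/uncertainty.ipynb.
import torch
import torch.nn as nn

def mc_dropout_predict(model, x, n_samples=20):
    model.eval()
    # Re-enable only the dropout layers so BatchNorm and friends stay in eval mode.
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.train()
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.std(dim=0)  # mean prediction and per-output uncertainty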

Embodied experiments

Experiment code is stored here.

Visualization

PointNav


VLN


EQA


Rearrange


Citation

@ARTICLE{10288539,
  author={He, Zongtao and Wang, Liuyi and Dang, Ronghao and Li, Shu and Yan, Qingqing and Liu, Chengju and Chen, Qijun},
  journal={IEEE Transactions on Circuits and Systems for Video Technology}, 
  title={Learning Depth Representation from RGB-D Videos by Time-Aware Contrastive Pre-training}, 
  year={2023},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/TCSVT.2023.3326373}}
