weiguangzhao/Diff-OP3D

Open-Pose 3D Zero-Shot Learning: Benchmark and Challenges

  • Release environment setting
  • Release open-pose benchmark dataset McGill
  • Release datasets ModelNet40 and ModelNet10
  • Release our CLIP-based baseline evaluation code
  • Release our Diffusion-based baseline evaluation code

Open-Pose Benchmark

Datasets   | Total Classes | Seen/Unseen Classes | Train/Valid/Test Samples | Download
ModelNet40 | 40            | 30/-                | 5852/1560/-              | Google Drive
ModelNet10 | 10            | -/10                | -/-/908                  | Google Drive
McGill     | 19            | -/14                | -/-/115                  | Google Drive
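
The table above can also be written down as data, which makes the zero-shot setup explicit: only ModelNet40 provides training and validation samples (seen classes), while ModelNet10 and McGill are test-only (unseen classes). The following is a minimal sketch; the `Split` type and the helper names are hypothetical and not part of this repository, and a "-" in the table is represented as `None`.

```python
from typing import Dict, List, NamedTuple, Optional


class Split(NamedTuple):
    """One row of the benchmark table; None stands for '-' in the table."""
    total_classes: int
    seen_classes: Optional[int]
    unseen_classes: Optional[int]
    train: Optional[int]
    valid: Optional[int]
    test: Optional[int]


# Values copied from the benchmark table in this README.
BENCHMARK: Dict[str, Split] = {
    "ModelNet40": Split(40, 30, None, 5852, 1560, None),
    "ModelNet10": Split(10, None, 10, None, None, 908),
    "McGill": Split(19, None, 14, None, None, 115),
}


def training_sets() -> List[str]:
    """Datasets that contribute training samples (seen classes)."""
    return [name for name, s in BENCHMARK.items() if s.train is not None]


def test_sets() -> List[str]:
    """Datasets that are used for testing only (unseen classes)."""
    return [name for name, s in BENCHMARK.items() if s.test is not None]
```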

Our Baseline Method

[Figure: overview of our baseline method]

Environment

Our baselines (Diffusion-based and CLIP-based) can each be run on a single RTX 3090 or RTX 4090 GPU.

conda env create -f op3dzsl.yaml
conda activate op3dzsl
pip install git+https://github.com/openai/CLIP.git

Download the pretrained diffusion model from Google Drive or the official website. Rename the downloaded file to "model.ckpt" and place it in the directory "models/ldm/stable-diffusion-v1/".
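
The rename-and-place step can be scripted. Below is a minimal sketch that assumes the checkpoint has already been downloaded manually; the function `install_checkpoint` and its `repo_root` parameter are hypothetical helpers, not part of this repository. Only the target directory and file name come from the instructions above.

```python
import shutil
from pathlib import Path

# Target location and file name as stated in this README.
CKPT_DIR = Path("models/ldm/stable-diffusion-v1")
CKPT_NAME = "model.ckpt"


def install_checkpoint(downloaded: Path, repo_root: Path = Path(".")) -> Path:
    """Move a downloaded checkpoint to <repo_root>/models/ldm/stable-diffusion-v1/model.ckpt."""
    if not downloaded.is_file():
        raise FileNotFoundError(f"checkpoint not found: {downloaded}")
    target_dir = repo_root / CKPT_DIR
    # Create the directory tree if the repository does not ship it.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / CKPT_NAME
    shutil.move(str(downloaded), str(target))
    return target
```

Called from the repository root, for example as `install_checkpoint(Path("downloaded.ckpt"))`, where `downloaded.ckpt` is a placeholder for whatever name the downloaded file has, this leaves the checkpoint at the path the instructions above specify.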

Baseline Evaluation

For our CLIP-Based baseline

python baseline_eval/clip_eval.py

For our Diffusion-Based baseline

python baseline_eval/diffusion_eval.py
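
This README does not describe what the evaluation scripts do internally. As general background for the CLIP-based baseline, CLIP-style zero-shot classification embeds an image and a text prompt for each class into the same space and predicts the class whose text embedding is most similar to the image embedding. The sketch below illustrates only that scoring step, using plain Python lists as stand-in embeddings. The function names and the averaging over several views of one object are illustrative assumptions, not taken from `baseline_eval/clip_eval.py`.

```python
import math
from typing import Dict, List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def zero_shot_predict(
    view_embeddings: Sequence[Sequence[float]],
    class_embeddings: Dict[str, Sequence[float]],
) -> str:
    """Return the class whose text embedding is most similar on average.

    view_embeddings: stand-in image embeddings, one per view of the object
        (multi-view averaging is an assumption made for illustration).
    class_embeddings: stand-in text embeddings keyed by class name.
    """
    if not view_embeddings:
        raise ValueError("at least one view embedding is required")
    scores = {}
    for name, text_emb in class_embeddings.items():
        sims = [cosine(view, text_emb) for view in view_embeddings]
        scores[name] = sum(sims) / len(sims)
    return max(scores, key=scores.get)
```

Because the prediction depends only on similarity to text embeddings, classes never seen during training can be scored simply by adding their names to `class_embeddings`.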

Citation

If you find this work useful in your research, please cite:

@article{zhao2023diff,
  title={Open-Pose 3D Zero-Shot Learning: Benchmark and Challenges},
  author={Zhao, Weiguang and Yang, Guanyu and Zhang, Rui and Jiang, Chenru and Yang, Chaolong and Yan, Yuyao and Hussain, Amir and Huang, Kaizhu},
  journal={arXiv preprint arXiv:2312.07039},
  year={2023}
}

If you use our open-pose datasets, please also cite the earlier works from which they were developed: ModelNet40 and McGill.

@inproceedings{ModelNet,
  title={3d shapenets: A deep representation for volumetric shapes},
  author={Wu, Zhirong and Song, Shuran and Khosla, Aditya and Yu, Fisher and Zhang, Linguang and Tang, Xiaoou and Xiao, Jianxiong},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages={1912--1920},
  year={2015}
}
@article{McGill,
  title={Retrieving articulated 3-D models using medial surfaces},
  author={Siddiqi, Kaleem and Zhang, Juan and Macrini, Diego and Shokoufandeh, Ali and Bouix, Sylvain and Dickinson, Sven},
  journal={Machine Vision and Applications},
  volume={19},
  pages={261--275},
  year={2008}
}

Acknowledgement

This project would not be possible without several great open-source codebases. Notable examples include TZSL, PointCLIP, PointCLIPv2, ReConCLIP, CLIP2Point, ULIP, DiffCLIP, and Stable-Diffusion.
