
HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video


[ Project Page ] [ Paper ] [ arXiv ] [ Video ]

News

🚀 Register a HOLD account here for news on the code release, data/model downloads, and other future updates!

  • 2024.04.04: HOLD is awarded a CVPR highlight!
  • 2024.02.27: HOLD is accepted to CVPR'24! We are working on the code release!

Overview

This is a repository for HOLD, a method that jointly reconstructs hands and objects from monocular videos without assuming a pre-scanned object template.


HOLD can reconstruct 3D geometries of novel objects and hands:

[example reconstruction figures]

Abstract

Since humans interact with diverse objects every day, the holistic 3D capture of these interactions is important to understand and model human behaviour. However, most existing methods for hand-object reconstruction from RGB either assume pre-scanned object templates or heavily rely on limited 3D hand-object data, restricting their ability to scale and generalize to more unconstrained interaction settings. To this end, we introduce HOLD -- the first category-agnostic method that reconstructs an articulated hand and object jointly from a monocular interaction video. We develop a compositional articulated implicit model that can reconstruct disentangled 3D hand and object from 2D images. We also further incorporate hand-object constraints to improve hand-object poses and consequently the reconstruction quality. Our method does not rely on 3D hand-object annotations while outperforming fully-supervised baselines in both in-the-lab and challenging in-the-wild settings. Moreover, we qualitatively show its robustness in reconstructing from in-the-wild videos.

More results

See more results on our project page!

Citation

@inproceedings{fan2024hold,
  title = {{HOLD}: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video},
  author = {Fan, Zicong and Parelli, Maria and Kadoglou, Maria Eleni and Kocabas, Muhammed and Chen, Xu and Black, Michael J. and Hilliges, Otmar},
  booktitle = {Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2024}
}

Star History

Star History Chart
