Implementation of paper "SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model"

IDEA-Research/SceneMaker

SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model

Yukai Shi1,3, Weiyu Li2,4, Zihao Wang4, Hongyang Li3, Xingyu Chen3, Ping Tan2,4, Lei Zhang3

1 Tsinghua University    2 HKUST    3 IDEA Research    4 LightIllusions

Paper Datasets Code

[Figure: example results, each shown as scene image, normal map, and 3D scene. Examples: Livingroom, Building, Printer, Scene, Lounge, Street.]

Abstract

We propose SceneMaker, a decoupled 3D scene generation framework. Because sufficient open-set de-occlusion and pose estimation priors are lacking, existing methods struggle to produce both high-quality geometry and accurate poses under severe occlusion and in open-set settings. To address this, we first decouple the de-occlusion model from 3D object generation and enhance it with image datasets and collected de-occlusion datasets, which provide far more diverse open-set occlusion patterns. We then propose a unified pose estimation model that integrates global and local mechanisms in both self-attention and cross-attention to improve accuracy. In addition, we construct an open-set 3D scene dataset to further extend the generalization of the pose estimation model. Comprehensive experiments demonstrate the superiority of our decoupled framework on both indoor and open-set scenes. Our code and datasets will be released.

Framework

Our framework consists of three main components:

  1. Scene Perception: Understanding the input scene structure
  2. 3D Object Generation under Occlusion: Decoupled de-occlusion model for robust object generation
  3. Pose Estimation: Unified pose estimation model with global and local attention mechanisms

In short, the de-occlusion model is decoupled from 3D object generation, and a single unified pose estimation model combines global and local attention mechanisms.
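Since the inference code is not yet released, the following is only a minimal sketch of how the three stages could be chained, not the authors' implementation. Every name here (`Instance`, `perceive`, `de_occlude`, `generate_3d`, `estimate_poses`, `run_pipeline`) is hypothetical, and each stage is a stub that returns placeholder strings. It also assumes that de-occlusion is applied to every object and that poses are estimated in one call over all objects. The point it illustrates is the decoupling: de-occlusion runs as a separate step, so the 3D generator only receives completed objects.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Instance:
    """One object found in the input image (hypothetical container)."""
    name: str
    occluded: bool
    completed: bool = False   # set once de-occlusion has run
    geometry: str = ""        # stand-in for a generated mesh
    pose: str = ""            # stand-in for an estimated pose


def perceive(scene: List[dict]) -> List[Instance]:
    # Stage 1 (stub): a real system would segment the image and estimate depth.
    return [Instance(o["name"], o["occluded"]) for o in scene]


def de_occlude(inst: Instance) -> Instance:
    # Stage 2a (stub): complete the object's appearance before any 3D
    # generation happens. Assumption: applied to every object.
    inst.completed = True
    return inst


def generate_3d(inst: Instance) -> Instance:
    # Stage 2b (stub): the generator only accepts de-occluded inputs.
    assert inst.completed, "generation expects a de-occluded input"
    inst.geometry = f"mesh({inst.name})"
    return inst


def estimate_poses(insts: List[Instance]) -> List[Instance]:
    # Stage 3 (stub): assumption: one call handles all objects in the scene.
    for i, inst in enumerate(insts):
        inst.pose = f"pose_{i}"
    return insts


def run_pipeline(scene: List[dict]) -> List[Instance]:
    insts = perceive(scene)
    insts = [generate_3d(de_occlude(i)) for i in insts]
    return estimate_poses(insts)


result = run_pipeline([
    {"name": "sofa", "occluded": True},
    {"name": "lamp", "occluded": False},
])
```

Because each stage is a separate function, a stub can be swapped for a real model (for example, one of the projects listed in the Acknowledgement section) without touching the other stages.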

Open Source Progress

  • 🔄 Dataset: Uploading
  • Inference Code: Coming soon
  • Training Code: Coming soon

Citation

If you find our work useful in your research, please consider citing:

@article{
}

Acknowledgement

We would like to thank the authors of the following projects for their excellent work and open-source contributions:

  • MoGe - Monocular depth estimation
  • SAM - Segment Anything Model for image segmentation
  • DINO-X - Grounded segmentation
  • CraftsMan - 3D object generation
  • Step1x-3D - 3D object generation
  • Hunyuan3D - 3D object generation
  • MIDI3D - Multi-instance 3D scene generation
  • InstPIFu - Indoor 3D scene generation

Their contributions have been invaluable to the development of SceneMaker.

License

See LICENSE file for details.
