Jiayin Zhu1,
Guoji Fu1,
Xiaolu Liu2,1,
Qiyuan He1,
Yicong Li1,
Angela Yao1
1 National University of Singapore
2 Zhejiang University
Image-to-3D generation faces inherent semantic ambiguity under occlusion: a partial observation alone is often insufficient to determine the object category. For instance, a visible wooden backboard could plausibly belong to a sofa, a bed, or a dressing table. Existing feedforward models, such as SAM3D, often collapse to an "observation-overfitted" shape through uncontrolled hallucination.
We formalize text-driven amodal 3D generation. Our task allows users to explicitly steer the completion of unseen regions using text prompts, while strictly preserving the visual evidence of the input observation.
These dual objectives demand distinct control granularities: rigid control for the visible observation versus relaxed structural control for the text prompt. To solve this, we propose RelaxFlow, a training-free dual-branch framework:
- Observation Branch: Provides strict adherence to ensure visual fidelity for the observed pixels.
- Multi-Prior Consensus: Converts the text prompt into visual proxy reference images. Cross-attention across these priors naturally amplifies structural consensus while suppressing inconsistent, instance-specific textures.
- Visibility-Aware Fusion: A spatial blending mechanism ensuring the semantic guide only steers genuinely occluded regions, while the observation strictly governs the visible pixels.
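As a rough illustration of the fusion idea (hypothetical names and a toy 1D field; the actual RelaxFlow blending operates inside the generation backbone), a visibility mask gates which branch controls each spatial location:

```python
import numpy as np

def visibility_aware_fusion(obs_field, sem_field, vis_mask):
    """Blend two generative fields: the observation branch strictly governs
    visible regions, while the semantic (prompt-driven) branch steers only
    the genuinely occluded remainder.

    obs_field, sem_field: arrays of identical shape (e.g., latent grids)
    vis_mask: float array in [0, 1]; 1 where the input image observes the object
    """
    return vis_mask * obs_field + (1.0 - vis_mask) * sem_field

# Toy example: the left half is visible, the right half is occluded.
obs = np.array([1.0, 1.0, 1.0, 1.0])
sem = np.array([5.0, 5.0, 5.0, 5.0])
mask = np.array([1.0, 1.0, 0.0, 0.0])
print(visibility_aware_fusion(obs, sem, mask))  # [1. 1. 5. 5.]
```

Visible positions keep the observation values exactly; occluded positions take the semantic guide, matching the "rigid vs. relaxed" split of the two branches.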
A core challenge is preventing the text prompt's high-frequency details from clashing with the input image. We introduce a Relaxation Mechanism that smooths cross-attention logits within the generation backbone.
Theoretically, we prove this smoothing is equivalent to applying a low-pass filter on the generative vector field. This mathematically suppresses high-frequency instance details and exposes a "coarse semantic corridor," enforcing only the low-frequency global geometry needed to accommodate the observation (e.g., the general shape of a "sofa").
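One simple way to realize such smoothing, sketched here with temperature scaling (an assumption for illustration; the paper's exact relaxation operator may differ), is to divide the cross-attention logits by a temperature tau > 1 before the softmax, which flattens the attention distribution and damps sharp, instance-specific responses:

```python
import numpy as np

def relaxed_attention(logits, tau=2.0):
    """Softmax over cross-attention logits with temperature tau >= 1.

    tau = 1 recovers standard attention; larger tau flattens the weights
    toward uniform, so the attended features average over the priors and
    act like a low-pass filter on instance-specific detail.
    """
    z = logits / tau
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    w = np.exp(z)
    return w / w.sum(axis=-1, keepdims=True)

logits = np.array([4.0, 1.0, 0.0])
sharp = relaxed_attention(logits, tau=1.0)  # peaked: dominated by one key
soft = relaxed_attention(logits, tau=4.0)   # relaxed: closer to uniform
print(sharp.round(3), soft.round(3))
```

With larger tau the maximum weight shrinks and the minimum grows, so no single prior's high-frequency details dominate the completion.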
To facilitate systematic evaluation, we introduce two new diagnostic benchmarks:
- ExtremeOcc-3D: Targets extreme occlusion in natural indoor scenes where visible evidence cannot identify the object category.
- AmbiSem-3D: Targets semantic branching, where the same visual evidence admits multiple plausible interpretations, paired with distinct text prompts.
Extensive experiments demonstrate that RelaxFlow successfully steers the generation of unseen regions to match the prompt intent. It avoids the observation-overfitted collapse of existing models and produces high-quality 3D assets without compromising visual fidelity.
Follow the setup steps of SAM 3D Objects before running the following. Based on our testing, the minimum requirement is a single GPU with 24GB of memory (e.g., NVIDIA RTX A5000).
For a quick start, run `demo_relaxflow.py` on the provided test data:
```bash
FOLDER="test_data/A_bike_with_a_blue_front_wheel_and_a_red_rear_wheel"
OUTNAME=$(basename $FOLDER)
IMG=${FOLDER}/image.png
MSK=${FOLDER}/mask.png
# PRI="${FOLDER}/prior1.png ${FOLDER}/prior2.png ${FOLDER}/prior3.png ${FOLDER}/prior4.png"
PRI=${FOLDER}/prior.png
python demo_relaxflow.py --image $IMG --mask $MSK --prior-images $PRI --output-name $OUTNAME
```

Another case:
```bash
FOLDER="test_data/dressing_table"
OUTNAME=$(basename $FOLDER)
IMG=${FOLDER}/input.png
PRI="${FOLDER}/prior1.png ${FOLDER}/prior2.png ${FOLDER}/prior3.png"
python demo_relaxflow.py --image $IMG --prior-images $PRI --output-name $OUTNAME
```

Results will be saved into `outputs/`.
For batch processing:

```bash
python demo_relaxflow_batch.py ...
# TODO: publish the datasets and manifest files
```

This repository is built upon the SAM 3D Objects model as a backbone; both the original SAM 3D Objects code and the modifications in this repository are licensed under the SAM License.
If you find our work useful, please use the following BibTeX entry.
< TODO: update bibtex here >
```bibtex
@article{zhu2026relaxflow,
  title={RelaxFlow: Text-Driven Amodal 3D Generation},
  author={Zhu, Jiayin and Fu, Guoji and Liu, Xiaolu and He, Qiyuan and Li, Yicong and Yao, Angela},
  journal={arXiv preprint},
  year={2026}
}
```


