Feedforward 3D Editing Learns from Semantic-Part Transformation

Jiawei Weng¹*, Saining Zhang¹*†, Zhenxin Diao²*, Peishuo Li¹, Henghaofan Zhang², Junhao Chen², Hao Zhao²†

¹Nanyang Technological University, Singapore ²Tsinghua University, China

*Equal contribution. †Corresponding author.

PartFlow is a feedforward 3D editing network that edits an existing 3D asset to match a target edit image — no per-asset optimisation, no 3D mask at inference. We train it on Pxform, a high-quality 3D editing dataset with 100K+ consistent before/after pairs across seven edit types, grounding edits in semantic 3D parts.

Highlights

Feedforward — one forward pass per edit
Semantic-part grounded — trained on Pxform's part-level pairs
Mask-free at inference — only needs the source asset + a target image
Two-stage flow — sparse-structure edit ➜ structured-latent edit

Method

PartFlow architecture — two-stage controlled flow

PartFlow edits in two stages, conditioning a pretrained 3D generative prior (TRELLIS) on the source asset's latent and a target edit image. Each stage is a controlled flow model with a zero-linear gated reference branch and a mask-aware training loss:

Stage 1: Sparse-structure flow. Takes the source sparse-structure (SS) latent and the edit condition, and predicts the edited 16³ voxel structure.
Stage 2: Structured-latent (SLAT) flow. Takes the source SLAT mapped to the edited coordinates, plus the edit condition, and predicts the edited SLAT, which the TRELLIS decoders turn into a textured edit.glb.
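The "zero-linear gated reference branch" can be pictured with a small NumPy sketch. This is an illustration under assumptions, not the PartFlow implementation: it assumes the gate is a linear layer initialised to zero, so that at the start of training the reference branch adds nothing and the block reproduces the pretrained backbone's features. The names `ZeroLinear` and `inject_reference` are hypothetical.

```python
import numpy as np


class ZeroLinear:
    """Linear layer whose weight and bias start at zero (hypothetical name)."""

    def __init__(self, dim_in, dim_out):
        self.weight = np.zeros((dim_in, dim_out), dtype=np.float32)
        self.bias = np.zeros(dim_out, dtype=np.float32)

    def __call__(self, x):
        return x @ self.weight + self.bias


def inject_reference(backbone_feats, reference_feats, gate):
    """Add the gated reference features to the backbone features.

    Assumption: the reference branch enters as a residual. With a
    zero-initialised gate the output equals `backbone_feats` exactly.
    """
    return backbone_feats + gate(reference_feats)


# Toy features: 4 tokens with 8 channels each.
rng = np.random.default_rng(0)
backbone = rng.standard_normal((4, 8)).astype(np.float32)
reference = rng.standard_normal((4, 8)).astype(np.float32)

gate = ZeroLinear(8, 8)
out = inject_reference(backbone, reference, gate)
print(np.array_equal(out, backbone))
```

Under this reading, the gate's weights move away from zero during training, so the source asset's influence is learned gradually instead of disturbing the pretrained prior from the first step.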

Installation

PartFlow reuses the TRELLIS runtime (same CUDA extensions, same frozen DINOv2 / SS / SLAT decoders). Set up TRELLIS first, then add PartFlow on top. Tested with Python 3.10, PyTorch 2.5.0, CUDA 12.4.

1. Set up the TRELLIS environment. Follow the official TRELLIS installation guide to create the conda env and build the CUDA extensions (spconv, flash-attn, kaolin, diff_gaussian_rasterization, nvdiffrast, diffoctreerast). For convenience, an equivalent one-liner is bundled here:

. ./setup.sh --new-env --basic --flash-attn --diffoctreerast --spconv \
             --mipgaussian --kaolin --nvdiffrast

2. Install PartFlow's extra Python dependencies into the same env:

pip install -r requirements.txt

Weights

python download_weights.py          # -> ./weights/{stage1_ss,stage2_slat}/

Downloads the two trained stage models from the ART-3D/PartFlow_models repository on Hugging Face.

Data layout

Inference reads pre-encoded inputs. Each case is a directory:

<case_dir>/
    ori_ss_latents.npz   # key `mean`: float32 [8, 16, 16, 16]   — source sparse-structure latent
    ori_latents.npz      # `coords` [N,3] int, `feats` [N,8] f32 — source structured latent (SLAT)
    edit_img.png         # the target edit image (RGB or RGBA)
    case_meta.json       # optional metadata (prompt, edit type, ...)

ori_ss_latents.npz / ori_latents.npz are the TRELLIS latents of the source asset; produce them with the standard TRELLIS image-to-3D encoder. Ground-truth edit_* files, if present, are ignored by inference.
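Before running inference it can help to confirm that a case directory matches the layout above. The helper below is a minimal sketch, not part of the repository: `check_case` is a hypothetical name, and the checks only cover the file names, keys, and shapes listed in this section.

```python
from pathlib import Path

import numpy as np


def check_case(case_dir):
    """Return a list of problems found in one case directory (empty if none)."""
    case_dir = Path(case_dir)
    problems = []

    ss_path = case_dir / "ori_ss_latents.npz"
    slat_path = case_dir / "ori_latents.npz"
    img_path = case_dir / "edit_img.png"

    # The three required inputs; case_meta.json is optional and not checked.
    for path in (ss_path, slat_path, img_path):
        if not path.is_file():
            problems.append(f"missing {path.name}")

    if ss_path.is_file():
        with np.load(ss_path) as ss:
            if "mean" not in ss.files:
                problems.append("ori_ss_latents.npz: no `mean` key")
            elif ss["mean"].shape != (8, 16, 16, 16):
                problems.append(f"ori_ss_latents.npz: bad shape {ss['mean'].shape}")

    if slat_path.is_file():
        with np.load(slat_path) as slat:
            missing = sorted({"coords", "feats"} - set(slat.files))
            if missing:
                problems.append(f"ori_latents.npz: missing keys {missing}")
            else:
                coords, feats = slat["coords"], slat["feats"]
                if coords.ndim != 2 or coords.shape[1] != 3:
                    problems.append(f"ori_latents.npz: coords shape {coords.shape}")
                if feats.ndim != 2 or feats.shape[1] != 8:
                    problems.append(f"ori_latents.npz: feats shape {feats.shape}")
                if len(coords) != len(feats):
                    problems.append("ori_latents.npz: coords and feats differ in length")

    return problems
```

Calling `check_case("examples/mod_glass_disc_table")` should return an empty list if the bundled example follows the layout described here.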

Run inference

# single case
python inference.py --input examples/mod_glass_disc_table --output_dir outputs

# a whole directory of cases
python inference.py --input /path/to/pxform/cases --output_dir outputs

# useful flags
#   --steps 50           flow-sampling steps
#   --cfg_strength 0.0   classifier-free guidance (0 = condition only)
#   --manifest ids.json  restrict to a list of case ids
#   --skip_existing      resume a partial run

For each case, inference writes outputs/<edit_id>/edit.glb and outputs/<edit_id>/pred_slat.npz.
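For large batches, a manifest of unfinished cases can be built from the output layout. The sketch below rests on assumptions that are not confirmed by this README: that each case's `<edit_id>` equals its directory name, and that `--manifest` accepts a JSON list of id strings. `pending_case_ids` and `write_manifest` are hypothetical helpers, not part of the repository.

```python
import json
from pathlib import Path


def pending_case_ids(cases_root, output_dir):
    """List case ids under `cases_root` that have no edit.glb in `output_dir`.

    Assumption: the output sub-directory is named after the case directory.
    """
    cases_root, output_dir = Path(cases_root), Path(output_dir)
    pending = []
    for case_dir in sorted(p for p in cases_root.iterdir() if p.is_dir()):
        if not (output_dir / case_dir.name / "edit.glb").is_file():
            pending.append(case_dir.name)
    return pending


def write_manifest(case_ids, manifest_path):
    """Write the ids as a JSON list (assumed manifest format)."""
    Path(manifest_path).write_text(json.dumps(list(case_ids), indent=2))
```

A typical use would be `write_manifest(pending_case_ids("/path/to/pxform/cases", "outputs"), "ids.json")`, followed by a run with `--manifest ids.json`. The built-in `--skip_existing` flag covers the simple resume case without any of this.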

Repository layout

PartFlow/
├── inference.py        two-stage inference pipeline + CLI
├── dataset.py          PxformDataset (per-case loader)
├── download_weights.py fetch weights from Hugging Face
├── configs/            Stage 1 / Stage 2 model configs
├── examples/           one ready-to-run example case
├── trellis/            TRELLIS backbone + PartFlow stage models
├── assets/             README figures
├── setup.sh            CUDA-extension installer
└── requirements.txt    pure-pip dependencies

Results Comparison

PartFlow vs. baselines — appearance edits

Citation

@article{weng2026partflow,
  title   = {Feedforward 3D Editing Learns from Semantic-Part Transformation},
  author  = {Weng, Jiawei and Zhang, Saining and Diao, Zhenxin and Li, Peishuo and Zhang, Henghaofan and Chen, Junhao and Zhao, Hao},
  journal = {arXiv preprint arXiv:2605.27351},
  year    = {2026}
}

Acknowledgements

Built on TRELLIS.
