[Task] image-to-3d model support #291

@DingmaomaoBJTU

Overview

Image-to-3D models generate a 3D representation (mesh, NeRF, or structured latent) from a single input image. TRELLIS encodes 3D assets in a structured latent representation (SLAT) and generates that latent with rectified flow transformers. It supports both image and text conditioning, and the same latent can be decoded into meshes, radiance fields, or 3D Gaussian splats.

Agent Scenarios

  • E-commerce product agent: convert a product photo into a 3D model for AR try-on or 360° viewing directly on Windows
  • Game asset creation agent: generate 3D props or characters from reference images for use in a game engine
  • Interior design agent: lift a photo of a furniture piece into a 3D object to place and visualize in a room layout
  • Digital twin agent: reconstruct physical objects from photos for simulation or inspection workflows

ModelKit Integration

Models must pass the full wmk pipeline on all execution providers (EPs):

wmk config → wmk build (ONNX export) → wmk perf → wmk eval
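The pass rule above can be sketched as a small check: a model counts as accepted only when every stage passes on every execution provider. This is a minimal illustration, not part of wmk. It does not invoke the CLI, because the issue gives only the subcommand names and no flags. The EP labels ("cpu", "gpu") and all function and class names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Sequence, Tuple

# Stage order taken from the issue text; only the subcommand names are known.
STAGES: Tuple[str, ...] = ("config", "build", "perf", "eval")


@dataclass(frozen=True)
class StageResult:
    ep: str       # execution provider label (hypothetical names in this sketch)
    stage: str    # one of STAGES
    passed: bool


def plan_pipeline(eps: Sequence[str]) -> List[Tuple[str, str]]:
    """Return (ep, stage) pairs in the order they would be run."""
    return [(ep, stage) for ep in eps for stage in STAGES]


def first_failures(results: Iterable[StageResult]) -> Dict[str, str]:
    """Map each failing EP to the earliest stage (in pipeline order) that failed."""
    failures: Dict[str, str] = {}
    for r in sorted(results, key=lambda r: STAGES.index(r.stage)):
        if not r.passed and r.ep not in failures:
            failures[r.ep] = r.stage
    return failures


def model_accepted(eps: Sequence[str], results: Iterable[StageResult]) -> bool:
    """Accept only if every stage passed on every EP; a missing result counts as a failure."""
    passed = {(r.ep, r.stage) for r in results if r.passed}
    return all(pair in passed for pair in plan_pipeline(eps))
```

Treating a missing result as a failure keeps the check strict: skipping a stage on one EP cannot make a model look accepted.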

Acceptance Criteria

  • microsoft/TRELLIS-image-large
