rheallyc/MoTok

MoTok: Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer

Abstract:
Prior work on motion generation largely follows two paradigms: continuous diffusion models, which excel at kinematic control, and discrete token-based generators, which are effective for semantic conditioning. To combine their strengths, we propose a three-stage framework comprising condition feature extraction (Perception), discrete token generation (Planning), and diffusion-based motion synthesis (Control).
Central to this framework is MoTok, a diffusion-based discrete motion tokenizer that decouples semantic abstraction from fine-grained reconstruction by delegating motion recovery to a diffusion decoder, enabling compact single-layer tokens while preserving motion fidelity.
For kinematic conditions, coarse constraints guide token generation during planning, while fine-grained constraints are enforced during control through diffusion-based optimization. This design prevents kinematic details from disrupting semantic token planning.
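The split between coarse and fine-grained constraints can be pictured with a toy data-flow sketch. This is not the MoTok implementation: the function names (`perceive`, `plan`, `control`), the constants, and the 1-D "trajectory" are all hypothetical, and the refinement loop is a simple stand-in for guided diffusion sampling rather than an actual diffusion model. It only illustrates where each kind of constraint enters the pipeline.

```python
import random
from dataclasses import dataclass

CODEBOOK_SIZE = 8   # hypothetical codebook size
DOWNSAMPLE = 4      # hypothetical number of frames covered by one token


@dataclass
class Conditions:
    semantic: int   # toy stand-in for a text feature
    coarse: list    # one coarse kinematic value per token window


def perceive(text, trajectory):
    """Perception: extract condition features (toy versions).

    Assumes len(trajectory) is a multiple of DOWNSAMPLE.
    """
    semantic = sum(ord(ch) for ch in text) % CODEBOOK_SIZE
    coarse = [
        sum(trajectory[i:i + DOWNSAMPLE]) / DOWNSAMPLE
        for i in range(0, len(trajectory), DOWNSAMPLE)
    ]
    return Conditions(semantic, coarse)


def plan(cond):
    """Planning: emit one discrete token per window (single layer).

    Only the coarse constraint is visible at this stage.
    """
    return [(cond.semantic + round(c)) % CODEBOOK_SIZE for c in cond.coarse]


def control(tokens, trajectory, guidance=0.8, steps=20, seed=0):
    """Control: iterative refinement from noise, conditioned on tokens.

    Stand-in for diffusion sampling. With guidance > 0, every step also
    nudges each frame toward the fine-grained trajectory constraint.
    """
    rng = random.Random(seed)
    motion = [rng.gauss(0.0, 1.0) for _ in trajectory]
    for _ in range(steps):
        for t, target in enumerate(trajectory):
            decoded = float(tokens[t // DOWNSAMPLE])      # toy token "decoding"
            motion[t] += 0.5 * (decoded - motion[t])      # move toward tokens
            motion[t] += guidance * (target - motion[t])  # enforce constraint
    return motion


def max_error(motion, trajectory):
    """Largest per-frame deviation from the fine-grained constraint."""
    return max(abs(m - c) for m, c in zip(motion, trajectory))


trajectory = [0.1 * i for i in range(8)]
tokens = plan(perceive("walk forward", trajectory))
guided = control(tokens, trajectory, guidance=0.8)
unguided = control(tokens, trajectory, guidance=0.0)
print(tokens, max_error(guided, trajectory), max_error(unguided, trajectory))
```

In this toy setup the planner never sees per-frame targets, so the token sequence stays short, while the guided refinement ends closer to the per-frame trajectory than the unguided one.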
On HumanML3D, our method improves both controllability and fidelity over MaskControl while using only one-sixth of the tokens: trajectory error drops from 0.72 cm to 0.08 cm and FID from 0.083 to 0.029. Whereas prior methods degrade under stronger kinematic constraints, our method's fidelity improves, with FID falling from 0.033 to 0.014.
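To put the reported numbers on a common scale, the short snippet below converts each before/after pair from the abstract into a relative reduction. The helper name `relative_reduction` and the dictionary labels are illustrative only; the values are the ones quoted above.

```python
def relative_reduction(before, after):
    """Return the fractional drop from `before` to `after`."""
    return (before - after) / before


# (before, after) pairs quoted in the abstract
results = {
    "trajectory error (cm) vs. MaskControl": (0.72, 0.08),
    "FID vs. MaskControl": (0.083, 0.029),
    "FID as kinematic constraints strengthen": (0.033, 0.014),
}

for name, (before, after) in results.items():
    drop = relative_reduction(before, after)
    print(f"{name}: {before} -> {after} ({drop:.0%} lower)")
```

This works out to roughly 89%, 65%, and 58% reductions, respectively.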

Changelog🔥

  • [2026/03/20] Repository created.

Citation

If you find our work useful for your research, please consider citing the paper:

@article{gu2026motok,
      title   = {Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer}, 
      author  = {Chenyang Gu and 
                 Mingyuan Zhang and 
                 Haozhe Xie and 
                 Zhongang Cai and 
                 Lei Yang and 
                 Ziwei Liu},
      journal = {arXiv},
      volume  = {2603.19227},
      year    = {2026}
}
