Motion 3-to-4 reconstructs 3D motion from videos for 4D synthesis in a feedforward manner within seconds.
For users who want to quickly try inference:
git clone https://github.com/Inception3D/Motion324.git
cd Motion324
# 1. Set up the environment
conda create -n Motion324 python=3.11
conda activate Motion324
pip install -r requirements.txt
# Install Hunyuan3D-2.0 components (optional)
cd scripts/hy3dgen/texgen/custom_rasterizer
python3 setup.py install
cd ../../../..
cd scripts/hy3dgen/texgen/differentiable_renderer
python3 setup.py install
cd ../../../..
# 2. Download pre-trained checkpoints and place in experiments/checkpoints/
# 3. Run inference
chmod +x ./scripts/4D_from_existing.sh
./scripts/4D_from_existing.sh ./examples/chili.glb ./examples/chili.mp4 ./examples/output
# Requires the Hunyuan3D-2.0 components installed above
chmod +x ./scripts/4D_from_video.sh
./scripts/4D_from_video.sh ./examples/tiger.mp4
Download: Please download the pre-trained checkpoint from here and place it in experiments/checkpoints/.
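A minimal sketch of placing the checkpoint once downloaded (the filename motion324.ckpt is a placeholder; use the actual name of the released checkpoint file):
# Placeholder filename; replace with the actual checkpoint file you downloaded
mkdir -p experiments/checkpoints
mv ~/Downloads/motion324.ckpt experiments/checkpoints/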
conda create -n Motion324 python=3.11
conda activate Motion324
pip install -r requirements.txt
The code has been tested with Python 3.11 + PyTorch 2.4.1 + CUDA 12.4.
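To sanity-check that your environment matches the tested versions, a quick (unofficial) check:
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"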
# Install custom rasterizer
cd scripts/hy3dgen/texgen/custom_rasterizer
python3 setup.py install
cd ../../../..
# Install differentiable renderer
cd scripts/hy3dgen/texgen/differentiable_renderer
python3 setup.py install
cd ../../../..
Download and install Blender for 4D asset rendering.
Our results are rendered with blender-4.0.0-linux-x64, using scripts modified from bpy-renderer.
Installation steps:
# Download Blender
wget https://download.blender.org/release/Blender4.0/blender-4.0.0-linux-x64.tar.xz
tar -xf blender-4.0.0-linux-x64.tar.xz
# Add Blender to PATH (optional, or use full path in scripts)
export PATH=$PATH:$(pwd)/blender-4.0.0-linux-x64
Note: Since we use xformers memory_efficient_attention with flash_attn, the GPU compute capability must be 8.0 or higher; otherwise an error will be raised. Check your GPU's compute capability on the CUDA GPUs page.
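A quick way to query the compute capability of the visible GPU from the same environment (an informal check, not part of the official setup):
python -c "import torch; print(torch.cuda.get_device_capability())"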
- TODO: Dataset will be provided. Please check back for updates.
Update the dataset path in configs/dyscene.yaml:
training:
  dataset_path: /path/to/your/dataset
  train_lst: /path/to/name_list
Before training, you need to follow the instructions here to generate the Wandb key file for logging and save it in the configs folder as api_keys.yaml.
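A rough sketch of creating that key file (the field name wandb and the file format are assumptions; follow the linked instructions for the exact schema, and copy your key from https://wandb.ai/authorize):
# Assumed format -- check the repository's instructions for the actual schema
cat > configs/api_keys.yaml << 'EOF'
wandb: <your_wandb_api_key>
EOF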
The default training uses configs/dyscene.yaml:
torchrun --nproc_per_node 8 --nnodes 1 --master_port 12344 train.py --config configs/dyscene.yaml
Key training parameters can be adjusted in configs/dyscene.yaml.
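For a quick single-GPU smoke test (a sketch, not an officially documented configuration; you may need a smaller training.batch_size_per_gpu to fit into memory):
# Single-GPU debug run; reduce the batch size if you hit out-of-memory errors
torchrun --nproc_per_node 1 --nnodes 1 --master_port 12344 train.py --config configs/dyscene.yaml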
You can override any config parameter via command line:
torchrun --nproc_per_node 8 --nnodes 1 --master_port 12346 train.py --config configs/dyscene.yaml \
training.batch_size_per_gpu=32
Input: Video file (.mp4/.avi/.mov) or image directory (use ./scripts/images2video.py to convert images to video first)
Output:
- Processed frames and mesh files in {video_name}_processed/
- Animation output in {video_name}_processed/animation/ (FBX format)
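If your input is an image directory, you can convert it with ./scripts/images2video.py as noted above, or with a plain ffmpeg command like this sketch (the frame naming pattern and frame rate are assumptions):
# Assumes frames named frame_0001.png, frame_0002.png, ... rendered at 30 fps
ffmpeg -framerate 30 -i ./examples/frames/frame_%04d.png -c:v libx264 -pix_fmt yuv420p ./examples/input.mp4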
Example:
chmod +x ./scripts/4D_from_video.sh
./scripts/4D_from_video.sh ./examples/tiger.mp4
Inputs:
- data_dir: Mesh file (.glb or .fbx) - FBX files will be automatically converted to GLB
- video_path: Video file (.mp4/.avi/.mov) or image directory
- output_dir: Output directory for results
Output:
- Animated mesh files (GLB format) in the specified output directory
- Segmented videos if segmentation is enabled
Example:
chmod +x ./scripts/4D_from_existing.sh
./scripts/4D_from_existing.sh ./examples/chili.glb ./examples/chili.mp4 ./examples/output
You can evaluate video results with the provided evaluation.py script.
Example:
python ./evaluation/evaluation.py \
--gt_paths /paths/to/gt_videos.mp4 \
--result_paths /paths/to/results_videos.mp4
This compares the generated result video(s) to the ground truth and outputs metrics such as FVD, LPIPS, DreamSim, and CLIP Loss.
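To evaluate several result/ground-truth pairs in one go, a simple shell loop works (the directory layout gt_videos/ and result_videos/ with matching filenames is hypothetical):
# Hypothetical layout: gt_videos/<name>.mp4 paired with result_videos/<name>.mp4
for gt in gt_videos/*.mp4; do
  name=$(basename "$gt" .mp4)
  python ./evaluation/evaluation.py \
    --gt_paths "$gt" \
    --result_paths "result_videos/${name}.mp4"
done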
You can evaluate mesh geometry with the provided evaluation_pcd.py script.
Example:
python ./evaluation/evaluation_pcd.py \
--gt_path /paths/to/name_pointclouds \
--result_path /paths/to/mesh.fbx
This compares your mesh result with the ground-truth point cloud and evaluates the geometric error between them. It outputs metrics such as Chamfer Distance and F-score.
If you find this work useful in your research, please consider citing:
@article{chen2026motion3to4,
title={Motion 3-to-4: 3D Motion Reconstruction for 4D Synthesis},
author={Chen, Hongyuan and Chen, Xingyu and Zhang, Youjia and Xu, Zexiang and Chen, Anpei},
journal={arXiv preprint arXiv:2601.14253},
year={2026}
}
- LVSM (for code architecture reference)
- V2M4, AnimateAnyMesh (for code reference)
- bpy-renderer (for rendering results)
- Hunyuan3D-2 (for 3D generation)
This project is licensed under the CC BY-NC-SA 4.0 License - see the LICENSE.md file for details.