nv-tlabs/TokenGS

TokenGS

Teaser: TokenGS results and exploration

TokenGS: Decoupling 3D Gaussian Prediction from Pixels with Learnable Tokens
Jiawei Ren*, Michal Tyszkiewicz*, Jiahui Huang, Zan Gojcic
* indicates equal contribution

Paper · Project Page · HuggingFace

TokenGS predicts 3D Gaussians with a self-supervised rendering objective. An encoder–decoder stacks learnable Gaussian tokens so the number of primitives is not tied to image resolution or view count.

Installation

Install the package in editable mode (dependencies include PyTorch, gsplat, and fused-ssim via pyproject.toml):

uv pip install -e .

Environment: Python 3.11, CUDA 12.6+ (see pyproject.toml for pinned versions).

Data: DL3DV layout, symlinks, and dataset_kwargs are described in data/DATA.md.

Evaluation

Place weights under checkpoints/ (or pass any path to --resume). Metrics are written to <workspace>/metrics.txt; the workspace directory is created automatically.

Example (6-view preset):

accelerate launch --config_file acc_configs/gpu1.yaml \
    -m tokengs.evaluate eval_dl3dv_6view \
    --workspace results/dl3dv_eval/6view \
    --resume checkpoints/dl3dv_6v.safetensors \
    --use_ttt_for_eval \
    --eval_n_media_dumps 20

Presets eval_dl3dv_2view and eval_dl3dv_4view select the matching evaluation JSONs. Remove --use_ttt_for_eval to turn off test-time token tuning.
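The preset names and flags above can also be assembled programmatically, which may help when scripting several evaluation runs. The following is a minimal sketch, not part of the repository: `build_eval_command` is a hypothetical helper, and the 4-view checkpoint path in the example is a placeholder, since only the 6-view file name appears in this README.

```python
import shlex

# View counts that have an eval_dl3dv_<N>view preset according to this README.
VIEW_PRESETS = (2, 4, 6)


def build_eval_command(n_views, checkpoint, workspace, use_ttt=True, n_media_dumps=0):
    """Return the argv list for one evaluation run (hypothetical helper)."""
    if n_views not in VIEW_PRESETS:
        raise ValueError(f"no eval preset for {n_views} views")
    cmd = [
        "accelerate", "launch", "--config_file", "acc_configs/gpu1.yaml",
        "-m", "tokengs.evaluate", f"eval_dl3dv_{n_views}view",
        "--workspace", str(workspace),
        "--resume", str(checkpoint),
    ]
    if use_ttt:
        # Leaving this flag out turns off test-time token tuning.
        cmd.append("--use_ttt_for_eval")
    if n_media_dumps > 0:
        cmd += ["--eval_n_media_dumps", str(n_media_dumps)]
    return cmd


# Placeholder checkpoint path (assumption); substitute your own weights.
example = build_eval_command(
    4, "checkpoints/my_4view.safetensors", "results/dl3dv_eval/4view", use_ttt=False
)
print(shlex.join(example))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` or printed and pasted into a shell.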

Media dumps: --eval_n_media_dumps N writes PNGs, MP4s, depth visualizations, and PLY files for the first N dataloader batches under <workspace>/{images,videos,depths,gaussians}/ (the default, 0, writes metrics only).
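To check quickly what an evaluation run produced, something like the sketch below may be useful. It relies only on the folder names and `metrics.txt` location stated above; `summarize_workspace` is a hypothetical helper, and it makes no assumption about individual file names or the format of the metrics file.

```python
from pathlib import Path

# Media sub-folders named in this README.
MEDIA_DIRS = ("images", "videos", "depths", "gaussians")


def summarize_workspace(workspace):
    """Count dumped files per media folder and report whether metrics.txt exists."""
    root = Path(workspace)
    counts = {}
    for name in MEDIA_DIRS:
        folder = root / name
        if folder.is_dir():
            # Count regular files at any depth below the folder.
            counts[name] = sum(1 for p in folder.rglob("*") if p.is_file())
        else:
            counts[name] = 0
    return {"has_metrics": (root / "metrics.txt").is_file(), "media": counts}


# Safe to call on a workspace that does not exist yet: all counts are then 0.
print(summarize_workspace("results/dl3dv_eval/6view"))
```

With the default `--eval_n_media_dumps 0`, all four counts should be zero and only `has_metrics` should be true.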

Training

1. Base run (train_dl3dv_base preset):

accelerate launch --config_file acc_configs/gpu8.yaml \
    -m tokengs.train train_dl3dv_base \
    --workspace workspace/dl3dv_base \
    --experiment_name dl3dv_base

2. Finetune from a checkpoint (presets finetune_dl3dv_2view, finetune_dl3dv_4view, finetune_dl3dv_6view):

accelerate launch --config_file acc_configs/gpu8.yaml \
    -m tokengs.train finetune_dl3dv_2view \
    --workspace workspace/dl3dv_2view \
    --experiment_name dl3dv_2view \
    --resume workspace/dl3dv_base/model.safetensors

Swap the subcommand for 4- or 6-view finetune presets as needed.
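As an illustration of swapping the subcommand, the sketch below generates the finetune command for each of the three presets from a single base checkpoint. `build_finetune_command` is a hypothetical helper; the workspace and experiment names for the 4- and 6-view runs are an assumption that extends the `dl3dv_2view` naming shown above.

```python
import shlex

# Checkpoint path produced by the base run in the example above.
BASE_CHECKPOINT = "workspace/dl3dv_base/model.safetensors"


def build_finetune_command(n_views, base_checkpoint=BASE_CHECKPOINT):
    """Return the argv list for one finetune run (hypothetical helper)."""
    if n_views not in (2, 4, 6):
        raise ValueError(f"no finetune preset for {n_views} views")
    # Assumed naming scheme, mirroring the 2-view example in this README.
    name = f"dl3dv_{n_views}view"
    return [
        "accelerate", "launch", "--config_file", "acc_configs/gpu8.yaml",
        "-m", "tokengs.train", f"finetune_{name}",
        "--workspace", f"workspace/{name}",
        "--experiment_name", name,
        "--resume", str(base_checkpoint),
    ]


# Print one shell-ready command per preset.
for n in (2, 4, 6):
    print(shlex.join(build_finetune_command(n)))
```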

License

TokenGS is released under the Apache License 2.0. See CONTRIBUTING.md for contribution guidelines.

Citation

If you use TokenGS in your research, please cite:

@inproceedings{tokengs2026,
  title={TokenGS: Decoupling 3D Gaussian Prediction from Pixels with Learnable Tokens},
  author={Jiawei Ren and Michal Tyszkiewicz and Jiahui Huang and Zan Gojcic},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2026}
}

About

[CVPR'26] TokenGS: Decoupling 3D Gaussian Prediction from Pixels with Learnable Tokens
