Skip to content

NVIDIA/cosmos-framework

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NVIDIA Cosmos

NVIDIA Cosmos | 🤗 Cosmos 3

Part of the NVIDIA Cosmos project family — the training and serving framework repository.

Cosmos-Framework

Cosmos-Framework is an end-to-end framework for training and serving world models, including the Cosmos3 model family. Everything lives in a single top-level cosmos_framework/ Python package:

  • Training — distributed FSDP / TP / CP / PP trainer, native DCP checkpoints with HuggingFace safetensors import/export, JSONL / WebDataset / LeRobot dataset adapters. Entry point: cosmos_framework.scripts.train. See docs/training.md.
  • Inference — Diffusers / Transformers / vLLM backends with offline batch generation and online serving (Ray + Gradio). Entry point: cosmos_framework.scripts.inference. Ecosystem-facing shim libraries (lightweight standalone wrappers for downstream projects) live under packages/.

Cosmos 3

Cosmos 3 is our newest model family [Report] [Website]. It is a suite of omnimodal world models designed to jointly process and generate language, images, video, audio, and action sequences within a unified Mixture-of-Transformers architecture. By supporting highly flexible input-output configurations, it seamlessly unifies critical modalities for Physical AI — effectively subsuming vision-language models, video generators, world simulators, and world-action models into a single framework. For a guided experience to test out Cosmos3, please visit [Cosmos].

Framework Documentation

Setup

For more details and alternative installation methods, see Setup. Before installing, make sure your machine meets the System Requirements. If you want a curated PyTorch + CUDA environment, start from the recommended NVIDIA NGC base image.

Install system dependencies:

sudo apt-get install -y --no-install-recommends curl ffmpeg git-lfs libx11-dev tree wget

Install the package with uv (pick the dependency group that matches your CUDA toolkit — see CUDA Variants):

# CUDA 13.0 (recommended)
uv sync --all-extras --group=cu130-train
# Or, for CUDA 12.8:
# uv sync --all-extras --group=cu128-train
source .venv/bin/activate && export LD_LIBRARY_PATH=

If you are starting from the recommended NGC image (nvcr.io/nvidia/pytorch:25.09-py3), see the one-shot quickstart.

Training

For the full guide (data preparation, base-checkpoint conversion, parallelism strategies, mixed precision, resuming), see Training. The number of GPUs required depends on the recipe; the shipped recipes under examples/ are 8-GPU configurations (tested on 8× H100 80 GB) launched via their paired launch shells, e.g.:

bash examples/launch_sft_vision_nano.sh

Users may adjust the GPU count to match their model and underlying hardware architecture — tune NPROC_PER_NODE and the parallelism degrees (DP/CP/FSDP shard) in the recipe accordingly.

Inference

See Inference for the full guide — launch commands, supported modes, parallelism presets, and troubleshooting.

Quick single-GPU launch:

python -m cosmos_framework.scripts.inference \
    --parallelism-preset=latency \
    -i "inputs/omni/t2v.json" \
    -o outputs/omni_nano \
    --checkpoint-path Cosmos3-Nano \
    --seed=0

Reference

Topic What it covers
Setup Hardware/software prerequisites, uv install paths, CUDA variants, Docker base image, and base-checkpoint downloading.
Code Structure Repository layout and a per-subpackage tour of cosmos_framework/ — where each concern lives and where to add new code.
Training Launching multi-GPU and multi-node runs; parallelism strategies; mixed precision; resuming.
Inference (from a trained checkpoint) Loading a trained checkpoint into one of the inference backends.
FAQ Troubleshooting (OOM, NCCL hangs, slow training), environment variables, and common pitfalls.

About

Our inference and training framework to run on the Cosmos Models

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages