![Cosmos-Transfer1-7B](cosmos-transfer1_banner.png)

**Cosmos-Transfer1** is a multimodal world-to-world (W2W) generation model from the Cosmos WFM series. It allows controllable visual generation using inputs like segmentation, depth, canny edge, and blur, with adaptive spatiotemporal control. This notebook showcases how to use Cosmos-Transfer1-7B for flexible and coherent visual transformations.

The following steps are based on [GitHub: Cosmos-Transfer1-7B](https://github.com/nvidia-cosmos/cosmos-transfer1/blob/main/examples/inference_cosmos_transfer1_7b.md).
- Tested Spec:
    - GitHub Commit: ed9ab808fb1c4fab04a14ecd7fbccb3e757bd92e
    - GPU: Crusoe L40S
    - VRAM: 48GiB
    - GPU Driver: 535.183.06 (CUDA 12.2)

### Setup Environment and Dependencies
---
Execute the following commands in a terminal. To open a terminal: Launcher tab -> Other -> Terminal

```bash
# Install uv package manager
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env

# Log in to your Hugging Face account to download checkpoints later
# Get your access token here: https://huggingface.co/settings/tokens
uv tool install -U "huggingface_hub[cli]"
hf auth login

# Create a python virtual environment and install dependencies to pull data
uv venv
source .venv/bin/activate
uv pip install loguru
uv pip install torch
uv pip install huggingface_hub
uv pip install ipykernel
uv pip install ipywidgets

# Create a Python kernel for the notebook
python -m ipykernel install --user --name=transfer1 --display-name "Python (.venv) Transfer1"
```

### Switch to the Custom Python Kernel
---
1. Go back to the notebook: *cosmos-transfer1.ipynb*
2. Click **Python 3 (ipykernel)** in the upper-right corner.
3. Pick **Python (.venv) Transfer1** in the *Start python Kernel* section, then click the Select button. (If you don't see the option, try restarting the notebook.)
4. The upper-right kernel button should now read *Python (.venv) Transfer1*.

### Create Workspace
---
Make sure you have at least 360 GB of free disk space to store data. The following code targets a Crusoe instance; if you're not using Crusoe, you can simply create a `/workspace` directory instead.

In [None]:
%%bash

# Ensure the target directory exists
mkdir -p /ephemeral/workspace
# Create the symlink only if it doesn't already exist
[ -L ~/workspace ] || ln -s /ephemeral/workspace ~/workspace
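
The free-space requirement mentioned above can be checked from the notebook before starting the download. The following is a minimal sketch using only the Python standard library; the helper names `free_space_gb` and `has_enough_space` are illustrative and not part of the Cosmos-Transfer1 repository, and "GB" is taken here as 10^9 bytes.

```python
import shutil

REQUIRED_GB = 360  # free space this notebook asks for


def free_space_gb(path: str) -> float:
    """Return free space on the filesystem containing `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path: str, required_gb: float = REQUIRED_GB) -> bool:
    """True if the filesystem containing `path` has at least `required_gb` free."""
    return free_space_gb(path) >= required_gb


# "." is only a placeholder; point this at the workspace directory instead.
print(f"{free_space_gb('.'):.1f} GB free, enough: {has_enough_space('.')}")
```

On a Crusoe instance, passing the expanded `~/workspace` path (for example via `os.path.expanduser`) should report the space available on the ephemeral disk.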

### Clone GitHub Repository Which Contains Sample Scripts and Dataset
---

In [None]:
%%bash

cd ~/workspace
# Clone the repository
git clone https://github.com/nvidia-cosmos/cosmos-transfer1.git
# Switch to the tested commit
cd cosmos-transfer1
git fetch
git checkout ed9ab808fb1c4fab04a14ecd7fbccb3e757bd92e

### Download Model Weights
---
Downloading the 360+ GB of model weights from Hugging Face takes a while.

In [None]:
# Pull model weights from Hugging Face
import os
import sys

project_root = os.path.abspath("workspace/cosmos-transfer1")
download_script = "workspace/cosmos-transfer1/scripts/download_checkpoints.py"
checkpoint_dir = "workspace/checkpoints/"

!PYTHONPATH={project_root} {sys.executable} {download_script} --output_dir {checkpoint_dir}

In [None]:
# You should see /ephemeral using about 363 GB, 82% of the disk space
!df -h
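
As a complement to `df -h`, the size of the checkpoint directory itself can be summed in Python. This is a rough sketch, not part of the repository's tooling: `dir_size_gb` is a hypothetical helper, symlinks are skipped to avoid double counting, and the result is in GB (10^9 bytes), so it will not match `df -h` exactly.

```python
import os


def dir_size_gb(root: str) -> float:
    """Sum the sizes of regular files under `root`, in GB (10**9 bytes)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # skip symlinks to avoid double counting
                total += os.path.getsize(path)
    return total / 1e9


# A missing directory simply yields 0.0, so this is safe to run before the download.
print(f"checkpoints: {dir_size_gb('workspace/checkpoints'):.1f} GB")
```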

### Pull Docker Image
---
You should see the following message after the Docker image is downloaded.
```
Status: Downloaded newer image for ...
```

In [None]:
%%bash

# Set up Docker credentials; REPLACE <your Key> with your NGC key
export NGC_CLI_API_KEY=<your Key>
echo "$NGC_CLI_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin

# Note: This is temporary, there should be a public Docker registry in the future.
# For now, you can use either of them:
# 1. nvcr.io/nvidian/cosmos-transfer1:pytorch-25-04_v2@sha256:b5dd417cf4b5be0377e9e4ebc78564540ce7b27d2a4cf67c0dfe3d81088bdaf8
# 2. nvcr.io/0589085444718644/cosmos/cosmos-transfer1:0.1 (contact samwu@nvidia.com for this one)
docker pull <docker image register>

### Spin Up the Cosmos-Transfer1 Container
---

In [None]:
!docker run --gpus all -d --name cosmos-transfer1 \
    -v ./workspace/cosmos-transfer1:/workspace \
    -v ./workspace/cosmos-transfer1/assets:/workspace/datasets \
    -v ./workspace/checkpoints:/workspace/checkpoints \
    nvcr.io/0589085444718644/cosmos/cosmos-transfer1:0.1 tail -f /dev/null

In [None]:
!docker ps -a

In [None]:
# Sanity check if the environment setup is successful
!docker exec cosmos-transfer1 python scripts/test_environment.py
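
If the later cells need to confirm that the container is still up, the check done by eye with `docker ps -a` can be scripted. The sketch below assumes the `docker` CLI is on the PATH and uses `docker inspect` with a Go template to read the container state; `parse_running` and `container_is_running` are illustrative names.

```python
import subprocess


def parse_running(inspect_output: str) -> bool:
    """Interpret the output of `docker inspect -f '{{.State.Running}}'`."""
    return inspect_output.strip().lower() == "true"


def container_is_running(name: str) -> bool:
    """Return True if a container called `name` exists and is running."""
    try:
        result = subprocess.run(
            ["docker", "inspect", "-f", "{{.State.Running}}", name],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # The docker CLI is not installed or not on PATH.
        return False
    return result.returncode == 0 and parse_running(result.stdout)
```

Calling `container_is_running("cosmos-transfer1")` should return `True` once the `docker run` cell above has succeeded.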

### Use Case #1: Single Control (Edge)
---
- VRAM Used: ~30 GB
- Inference Time:
    - ~20 minutes (A100 x 1)
    - ~18 minutes (L40S x 1)
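
Before launching, it can help to confirm that the GPU has roughly the VRAM listed above free. The following sketch queries `nvidia-smi` for `memory.free`; it assumes `nvidia-smi` is available on the host, treats the ~30 GB figure as 30 GiB, and the helper names are hypothetical.

```python
import subprocess

REQUIRED_MIB = 30 * 1024  # ~30 GB from the list above, treated as GiB here


def parse_free_mib(csv_text: str) -> list[int]:
    """Parse `memory.free` values in MiB, one GPU per line."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def query_free_mib() -> list[int]:
    """Ask nvidia-smi for the free memory of each visible GPU, in MiB."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.free", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_free_mib(out)


def gpus_with_enough_vram(free_mib: list[int], required_mib: int = REQUIRED_MIB) -> list[int]:
    """Return the indices of GPUs whose free memory meets the requirement."""
    return [i for i, free in enumerate(free_mib) if free >= required_mib]
```

For example, `gpus_with_enough_vram(query_free_mib())` lists the GPU indices that could be passed through `CUDA_VISIBLE_DEVICES`.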

You should see this message at the end of the log:
```
[08-05 12:47:18|INFO|cosmos_transfer1/diffusion/inference/transfer.py:396:demo] Saved video to outputs/example1_single_control_edge/output.mp4
[08-05 12:47:18|INFO|cosmos_transfer1/diffusion/inference/transfer.py:397:demo] Saved prompt to outputs/example1_single_control_edge/output.txt
```
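
The two log lines above can also be picked out programmatically, for instance when the cell output is long. This is a small sketch that assumes the log text has been captured or pasted into a string; the regular expression is written against the sample lines shown here and may need adjusting if the log format differs.

```python
import re

# Matches the tail of lines such as "... Saved video to outputs/.../output.mp4"
SAVED_RE = re.compile(r"Saved (video|prompt) to (\S+)\s*$")


def saved_paths(log_text: str) -> dict[str, str]:
    """Map 'video' / 'prompt' to the path reported in the log, if present."""
    paths = {}
    for line in log_text.splitlines():
        match = SAVED_RE.search(line)
        if match:
            paths[match.group(1)] = match.group(2)
    return paths
```

The reported paths are relative to the container's working directory; given the volume mounts used in this notebook, they would presumably appear on the host under `workspace/cosmos-transfer1/`.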

In [None]:
%%bash

docker exec cosmos-transfer1 bash -c "
export CUDA_VISIBLE_DEVICES=\${CUDA_VISIBLE_DEVICES:=0}
export CHECKPOINT_DIR=\${CHECKPOINT_DIR:=./checkpoints}
export NUM_GPU=\${NUM_GPU:=1}
export PYTHONPATH=/workspace
torchrun --nproc_per_node=\$NUM_GPU --nnodes=1 --node_rank=0 /workspace/cosmos_transfer1/diffusion/inference/transfer.py \
    --checkpoint_dir \$CHECKPOINT_DIR \
    --video_save_folder outputs/example1_single_control_edge \
    --controlnet_specs assets/inference_cosmos_transfer1_single_control_edge.json \
    --offload_text_encoder_model \
    --offload_guardrail_models \
    --num_gpus \$NUM_GPU
"

### Use Case #1.1: Prompt Upsampler
---
You can use our prompt upsampler to convert a short prompt into a longer, more detailed prompt for video generation by passing the `--upsample_prompt` argument.

In [None]:
%%bash

docker exec cosmos-transfer1 bash -c "
export CUDA_VISIBLE_DEVICES=\${CUDA_VISIBLE_DEVICES:=0}
export CHECKPOINT_DIR=\${CHECKPOINT_DIR:=./checkpoints}
export NUM_GPU=\${NUM_GPU:=1}
export PYTHONPATH=/workspace
torchrun --nproc_per_node=\$NUM_GPU --nnodes=1 --node_rank=0 /workspace/cosmos_transfer1/diffusion/inference/transfer.py \
    --checkpoint_dir \$CHECKPOINT_DIR \
    --video_save_folder outputs/example1_single_control_edge_upsampled_prompt \
    --controlnet_specs assets/inference_cosmos_transfer1_single_control_edge_short_prompt.json \
    --offload_text_encoder_model \
    --upsample_prompt \
    --offload_prompt_upsampler \
    --offload_guardrail_models \
    --num_gpus \$NUM_GPU
"

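To see what the upsampler did, the short prompt from the spec file can be compared with the prompt saved next to the generated video. The sketch below rests on two assumptions that should be verified against the actual files: that the spec JSON stores the prompt under a `"prompt"` key, and that the `output.txt` written by the run holds the upsampled prompt. `load_prompts` is a hypothetical helper.

```python
import json
from pathlib import Path


def load_prompts(spec_path: str, output_txt_path: str) -> tuple[str, str]:
    """Return (short prompt from the spec, prompt text saved by the run)."""
    spec = json.loads(Path(spec_path).read_text(encoding="utf-8"))
    short_prompt = spec.get("prompt", "")  # the "prompt" key is an assumption
    saved_prompt = Path(output_txt_path).read_text(encoding="utf-8").strip()
    return short_prompt, saved_prompt
```

Under the volume mounts used in this notebook, the two files would presumably be `workspace/cosmos-transfer1/assets/inference_cosmos_transfer1_single_control_edge_short_prompt.json` and `workspace/cosmos-transfer1/outputs/example1_single_control_edge_upsampled_prompt/output.txt` on the host.
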
### Stop the Container
---

In [None]:
%%bash

docker stop cosmos-transfer1
docker rm cosmos-transfer1