ACG: Action Coherence Guidance for Flow-based VLA Models

Links: arXiv · GitHub Code · Project Page · Hugging Face · YouTube Demo

Minho Park*, Kinam Kim*, Junha Hyung, Hyojin Jang, Hoiyeong Jin, Jooyeol Yun, Hojoon Lee, and Jaegul Choo
DAVIAN Robotics, KAIST AI
arXiv 2025. (* indicates equal contribution)

🌐 Overview

Action Coherence Guidance (ACG) is a training-free, test-time guidance algorithm that improves temporal and spatial action consistency in Vision-Language-Action (VLA) models. It mitigates motion jitter, unintended pauses, and trajectory drift caused by noisy demonstrations, resulting in stable and precise robotic manipulation.
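Conceptually, ACG applies a classifier-free-guidance-style correction at each denoising step of the flow policy. The sketch below is illustrative only, assuming the common recipe of extrapolating from a weakened prediction (e.g., a forward pass with selected self-attention blocks skipped, cf. the algo.guidance.skip_blocks option in the Quick Start) toward the full one; consult the paper and source for the exact rule.

# Illustrative sketch of a guided flow-matching rollout (NOT the paper's code).
# `model` and `weak_model` are assumed callables returning velocity predictions;
# `weak_model` stands in for a pass with some self-attention blocks skipped.
import torch

def guided_velocity(model, weak_model, x_t, t, obs, scale=3.0):
    # CFG-style extrapolation from the weakened prediction toward the full one.
    v_full = model(x_t, t, obs)
    v_weak = weak_model(x_t, t, obs)
    return v_weak + scale * (v_full - v_weak)

def rollout_action_chunk(model, weak_model, obs, shape, num_steps=10, scale=3.0):
    # Euler integration of the guided velocity field from noise (t=0) to t=1.
    x = torch.randn(shape)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = torch.full((shape[0],), i * dt)
        x = x + dt * guided_velocity(model, weak_model, x, t, obs, scale)
    return x

Under this sketched rule, scale = 1.0 recovers the vanilla policy, so the scale knob trades off coherence strength against fidelity to the unguided model.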

Demo video: Vanilla_GR00T-N1_vs_Ours.mp4 (vanilla GR00T-N1 vs. ACG)

🔑 Key Features

  • Training-Free Guidance: Enhances action coherence during inference without retraining or fine-tuning.
  • Plug-and-Play Integration: Seamlessly compatible with existing diffusion and flow-matching VLA policies across multiple benchmarks.
  • Proven Performance: Demonstrates consistent improvements in success rate on RoboCasa, DexMimicGen, and real-world SO-101 manipulation tasks.

⚙️ Getting Started

📦 Installation Guide

# Create conda environment
conda create -n acg python=3.10 -y
conda activate acg

# Install PyTorch (adjust CUDA version if needed)
pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --index-url https://download.pytorch.org/whl/cu128

# Install core dependencies
pip install -e libs/Isaac-GR00T-N1
pip install --no-build-isolation flash-attn==2.8.3

# Robosuite and RoboCasa
pip install libs/robosuite/
pip install -e libs/robocasa/

# Download RoboCasa assets (~5GB)
python libs/robocasa/robocasa/scripts/download_kitchen_assets.py

# Robomimic and DexMimicGen
pip install -e libs/robomimic/ --no-dependencies
pip install -e libs/dexmimicgen/ --no-dependencies

# Remaining requirements
pip install -r requirements.txt
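
Once everything is installed, a quick import check can catch a broken environment early. The module names below are assumed from the packages installed above; adjust if your local layout differs.

# Quick sanity check inside the `acg` conda environment.
import torch
import robosuite   # from libs/robosuite
import robocasa    # from libs/robocasa
import robomimic   # from libs/robomimic
import gr00t       # module name assumed from libs/Isaac-GR00T-N1

print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())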

🧩 Repository Features

To the best of our knowledge, this is the first public repository offering post-trained GR00T-N1-2B models and rollout scripts on both RoboCasa and DexMimicGen benchmarks.

  • RoboCasa: Reproduced score (32.6) closely matches the reported result (32.1).
  • DexMimicGen: Reproduced score (40.6) is lower than the reported (58.5); the cause is under investigation.

⚠️ This is an unofficial reproduction. Contributions and issue reports are highly welcome.

Repository Highlights

  • Self-contained Finetuning & Inference for GR00T-N1-2B.

  • Training script: gr00t_finetune_robocasa.py

    • Utilizes a modified Robomimic DataLoader to finetune GR00T-N1-2B on RoboCasa and DexMimicGen datasets.
    • Supports multiple embodiments — SinglePandaGripper for RoboCasa, and BimanualPandaGripper, BimanualPandaHand, GR1 for DexMimicGen.
    • Note: the official GR00T-N1-2B post-training code only supports humanoid (GR1) embodiments.
  • Rollout script: rollout_with_robomimic.py

    • Adapted from the Robomimic rollout framework.
    • You can directly perform rollouts using our finetuned models from Hugging Face (see the download sketch below).
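
For example, a checkpoint can be cached ahead of time with the standard huggingface_hub client (repo id taken from the Quick Start below):

# Pre-download a released checkpoint so the rollout scripts can load it locally.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    "DAVIAN-Robotics/GR00T-N1-2B-tuned-RoboCasa-MG100-FrankaPandaGripper"
)
print("checkpoint cached at:", local_path)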

🚀 Quick Start

We provide RoboCasa examples below. For DexMimicGen and more detailed scripts, see ./scripts/run_gr00t.md.

🔧 Training (Post-Training Phase)

Before starting, make sure to download each dataset from its official repository: RoboCasa, DexMimicGen.
Then, update the dataset paths in the Robomimic config files accordingly.
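
In Robomimic-style configs the dataset path conventionally lives under the train.data field; the helper below is a hedged sketch under that assumption (inspect the JSON files in libs/Isaac-GR00T-N1/robomimic_configs/ to confirm the actual key layout):

# Hedged helper: point a Robomimic-style config at your local dataset copy.
# The "train" -> "data" key layout is an assumption from Robomimic conventions.
import json

cfg_path = "libs/Isaac-GR00T-N1/robomimic_configs/robocasa_mg100.json"
with open(cfg_path) as f:
    cfg = json.load(f)

cfg["train"]["data"] = "/path/to/your/robocasa/dataset.hdf5"  # hypothetical path

with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=4)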

n_mg="100"
ngpu="1"
bs="64"
ga="2"
steps="60000"
training_seed="42"
exp_name="MG${n_mg}/LR=1e-4_Bs=${ngpu}x${bs}x${ga}_Steps=${steps}_Seed=${training_seed}${note}"

export WANDB_ENTITY="your-entity"
export WANDB_PROJECT="Your Robot Project"

python libs/Isaac-GR00T-N1/scripts/gr00t_finetune_robocasa.py \
  --num-gpus ${ngpu} \
  --output-dir checkpoints/robocasa/${exp_name} \
  --data-configs robocasa_single_panda_gripper \
  --video-backend decord \
  --embodiment_tag single_panda_gripper \
  --exp_name ${exp_name} \
  --batch_size ${bs} \
  --robomimic_config_json libs/Isaac-GR00T-N1/robomimic_configs/robocasa_mg${n_mg}.json \
  --gradient_accumulation_steps ${ga} \
  --no-save-only-model \
  --dataloader_num_workers 16 \
  --pin_memory \
  --max-steps ${steps} \
  --save_steps 1000 \
  --save_total_limit 3 \
  --training_seed ${training_seed}

🤖 Inference (Rollout)

Without ACG:

note=""
n_rollouts="24"
num_batch_envs="8"
export MAX_NUM_EMBODIMENTS="32"
dataset_name="robocasa_mg100"
config_path="libs/Isaac-GR00T-N1/robomimic_configs/${dataset_name}.json"
model_path="DAVIAN-Robotics/GR00T-N1-2B-tuned-RoboCasa-MG100-FrankaPandaGripper"
seed="123"

bash scripts/base_rollout.sh ${config_path} ${model_path} ${seed} ${n_rollouts} ${num_batch_envs} "${note}"

With ACG enabled:

note=""
n_rollouts="24"
num_batch_envs="8"
export MAX_NUM_EMBODIMENTS="32"
dataset_name="robocasa_mg100"
config_path="libs/Isaac-GR00T-N1/robomimic_configs/${dataset_name}.json"
model_path="DAVIAN-Robotics/GR00T-N1-2B-tuned-RoboCasa-MG100-FrankaPandaGripper"

acg_options="
algo.guidance.name=acg
algo.guidance.scale=3.0
algo.guidance.skip_blocks=7,9,11
"  # Corresponding to 4th–6th self-attention layers.

note="${note}_acg"
seed="123"

bash scripts/base_rollout.sh ${config_path} ${model_path} ${seed} ${n_rollouts} ${num_batch_envs} "${note}" ${acg_options}
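
The skip_blocks comment above assumes an interleaved DiT layout in which self-attention layers occupy the odd block indices (1, 3, 5, ...); under that assumption block 2k - 1 is the k-th self-attention layer, which is how blocks 7, 9, 11 map to the 4th–6th layers. A one-liner to sanity-check the mapping (verify against the actual model definition):

# Map transformer block indices to self-attention layer ordinals, assuming
# attention layers sit at odd block indices (1, 3, 5, ...). Illustrative only.
skip_blocks = [7, 9, 11]
print([(b + 1) // 2 for b in skip_blocks])  # -> [4, 5, 6]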

🧾 Citation

@article{park2025acg,
  title={ACG: Action Coherence Guidance for Flow-based VLA Models},
  author={Park, Minho and Kim, Kinam and Hyung, Junha and Jang, Hyojin and Jin, Hoiyeong and Yun, Jooyeol and Lee, Hojoon and Choo, Jaegul},
  journal={arXiv preprint arXiv:2510.22201},
  year={2025}
}

🙏 Acknowledgement

This repository builds upon the incredible open-source efforts of Isaac-GR00T, Robosuite, Robomimic, RoboCasa, DexMimicGen, and Diffusers.
We sincerely appreciate their outstanding contributions to the robotics and AI community.
