ActionMap: Robot Policy Learning via Voxel Action Heatmap

Pei Yang1,*, Hai Ci1,*,†, Yanzhe Chen1,*, Qi Lv1, Han Cai2, and Mike Zheng Shou1,✉
1 Show Lab, National University of Singapore    2 NVIDIA
* Equal contribution    † Project lead    ✉ Corresponding author



ActionMap replaces the single-point action decoder of vision-language-action models with a voxel action heatmap, improving success rate, data efficiency, and convergence across LIBERO simulation and real-world Franka manipulation.
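To make the contrast with a single-point decoder concrete, the sketch below shows one generic way a normalized translation in [-1, 1]^3 can be mapped to a cell of a voxel grid and back to that cell's center. It is an illustration only, not the released ActionMap code: the helper names `to_voxel` and `voxel_center`, the uniform binning, and the choice of a (32, 32, 16) grid (the same size as `trans_grid` in the Pre-Release snippet) are all assumptions.

```python
from typing import Sequence, Tuple


def to_voxel(xyz: Sequence[float], grid: Sequence[int]) -> Tuple[int, ...]:
    """Map a point in [-1, 1]^3 to integer voxel indices (hypothetical uniform bins)."""
    idx = []
    for value, n in zip(xyz, grid):
        clipped = min(max(value, -1.0), 1.0)
        # Scale [-1, 1] to [0, n) and clamp so that +1.0 lands in the last bin.
        idx.append(min(int((clipped + 1.0) / 2.0 * n), n - 1))
    return tuple(idx)


def voxel_center(idx: Sequence[int], grid: Sequence[int]) -> Tuple[float, ...]:
    """Return the center of a voxel, expressed back in [-1, 1]^3."""
    return tuple((i + 0.5) / n * 2.0 - 1.0 for i, n in zip(idx, grid))


grid = (32, 32, 16)            # assumed translation grid
target = (0.10, -0.50, 0.90)   # normalized translation

cell = to_voxel(target, grid)          # (17, 8, 15)
recovered = voxel_center(cell, grid)

# With uniform bins, the round-trip error per axis is at most half a voxel width (1 / n).
max_err = max(abs(a - b) for a, b in zip(target, recovered))
print(cell, recovered, max_err)
```

A heatmap head would score every cell of such a grid rather than regress one point, so the target becomes a distribution over voxels instead of a single coordinate.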

🧩 Pre-Release

The full code release is coming soon. As a preview, we release the core implementation of our action head, which can replace a VLA's native action decoder (e.g., OpenVLA-OFT's L1 regression head). The example below shows how to plug it in.

import torch
from heatmap_action_head import HeatmapActionHead

head = HeatmapActionHead(
    input_dim=4096,           # VLA backbone hidden size
    num_actions_chunk=8,      # actions (timesteps) per chunk
    action_dim=7,             # [x, y, z, r, p, w, grip]
    trans_grid=(32, 32, 16),  # translation voxel grid
    rot_grid=(16, 16, 16),    # rotation voxel grid
)

# Run your VLA backbone and keep the last hidden layer.
outputs = backbone(input_ids=input_ids, attention_mask=attention_mask, output_hidden_states=True)
hidden = outputs.hidden_states[-1]                       # (B, seq_len, llm_dim)

# Gather the hidden states at the action-token positions:
#   (B, num_actions_chunk * action_dim, llm_dim)
actions_hidden = hidden[:, action_token_indices]

# Training (ground-truth actions are normalized to [-1, 1]):
pred_actions, loss = head.predict_action_with_loss(actions_hidden, gt_actions)
loss.backward()

# Inference:
pred_actions = head.predict_action(actions_hidden)       # (B, num_actions_chunk, 7)
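The snippet above leaves `action_token_indices` and the tensor shapes implicit. The following sketch checks those shapes with random tensors standing in for the backbone output. It assumes, purely for illustration, that the action tokens occupy the last `num_actions_chunk * action_dim` positions of the sequence, that one index vector is shared across the batch, and that `gt_actions` has shape `(B, num_actions_chunk, action_dim)`; the actual token layout depends on your VLA and is not specified here. A small `llm_dim` keeps the example light.

```python
import torch

B, seq_len, llm_dim = 2, 128, 64                      # toy sizes; a real backbone might use llm_dim=4096
num_actions_chunk, action_dim = 8, 7
num_action_tokens = num_actions_chunk * action_dim    # 56 hidden states per chunk

# Stand-in for outputs.hidden_states[-1].
hidden = torch.randn(B, seq_len, llm_dim)

# ASSUMPTION: the action tokens are the final 56 positions of every sequence.
action_token_indices = torch.arange(seq_len - num_action_tokens, seq_len)

# Indexing dim 1 with a 1-D LongTensor selects the same positions for each batch element.
actions_hidden = hidden[:, action_token_indices]
print(actions_hidden.shape)   # torch.Size([2, 56, 64])

# ASSUMPTION: one 7-D target per step in the chunk, normalized to [-1, 1].
gt_actions = torch.rand(B, num_actions_chunk, action_dim) * 2 - 1
print(gt_actions.shape)       # torch.Size([2, 8, 7])
```

If the action positions differ from sample to sample, a per-sample selection (for example with `torch.gather`) would be needed instead of a shared index vector.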

📄 Citation

@article{actionmap,
    title={ActionMap: Robot Policy Learning via Voxel Action Heatmap}, 
    author={Pei Yang and Hai Ci and Yanzhe Chen and Qi Lv and Han Cai and Mike Zheng Shou},
    year={2026},
    archivePrefix={arXiv},
}
