AlphaBrain is an all-in-one, open-source community framework for embodied intelligence, built to be ready out of the box. It unifies multiple VLA architectures, world model backbones, biologically inspired learning algorithms, and reinforcement learning paradigms under a single, extensible framework, bringing embodied AI within everyone’s reach.
Quick Start & Documentation · Key Features · Community · Citation
| 🧠 | Brain-Inspired VLA (NeuroVLA) — The first open-source biologically-inspired VLA model, achieving SOTA on brain-inspired control tasks. Integrates spiking neural networks (SNN) with STDP learning rules, advancing embodied intelligence toward biological brain learning mechanisms. |
| 🔄 | Cross-Architecture Continual Learning — The first open-source continual learning algorithm designed for cross-architecture VLA, breaking architecture compatibility bottlenecks and supporting universal adaptation and knowledge accumulation across different VLA models. |
| 🎯 | RLActionToken Training Paradigm — The first open-source VLA training architecture based on RL Token, a novel architecture that compresses VLA hidden states through an information bottleneck, followed by off-policy Actor-Critic reinforcement learning. |
| 🌍 | Native World Model Integration — The first open-source VLA to natively integrate Cosmos Policy original weights, supporting flexible world model switching across Cosmos 2 / 2.5, Wan 2.2, and V-JEPA 2.1. |
| 📊 | Comprehensive Benchmark Suite — Full adaptation to the latest embodied benchmarks with open-source support for long-horizon task execution and memory: LIBERO, LIBERO-plus, RoboCasa, RoboCasa365 and more to come. |
Full setup, training, evaluation, and deployment instructions live in our documentation site. Step-by-step guides, configuration references, and troubleshooting notes are all maintained there.
AlphaBrain delivers five core capabilities on a single stack: the VLA framework family as the base, with NeuroVLA / RLActionToken / Continual Learning / World Model as composable capability modules. All capabilities share the same trainer, config system, and inference interface.
| Framework | Action Decoding | Typical Use |
|---|---|---|
| OFT | MLP action head, parallel continuous decoding | Fast prototyping, baseline alignment |
| GR00T | System1 + Flow-Matching DiT System2 | High-precision manipulation, long-horizon planning |
| PI | Flow-Matching action prediction | Diffusion-style policies |
| Adapter | Lightweight Adapter decoding | Parameter-efficient fine-tuning |
| NeuroVLA | Bio-inspired spiking + STDP | Brain-inspired control |
| CosmosPolicy | Latent-space video diffusion | World-model-native policy |
NeuroVLA integrates spiking neural networks with biological learning rules into the VLA pipeline:
- QFormer Feature Extraction — extracts layer-wise action-relevant features from VLM hidden states;
- SNN Action Head — Leaky Integrate-and-Fire (LIF) neurons for spike-based action prediction;
- R-STDP Training — Reward-Modulated Spike-Timing-Dependent Plasticity, supporting both hybrid (backprop + STDP) and pure STDP modes;
- Online STDP — Test-time adaptation with zero backpropagation, using self-supervised reward signals from environment interaction.
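The pipeline above can be sketched with a toy LIF action head and a reward-modulated Hebbian update. This is an illustrative assumption, not the actual AlphaBrain API: the class name, shapes, and constants are invented, and the R-STDP rule is simplified to reward-gated pre/post spike correlation.

```python
import numpy as np

# Toy sketch of a spiking action head (hypothetical, not AlphaBrain's real code):
# LIF neurons integrate VLM-derived features and fire spikes; a reward signal
# then gates a Hebbian weight update (simplified R-STDP).
class LIFActionHead:
    def __init__(self, in_dim, out_dim, tau=0.9, threshold=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(0.0, 0.1, size=(in_dim, out_dim))
        self.tau = tau              # membrane decay factor
        self.threshold = threshold  # spike threshold
        self.v = np.zeros(out_dim)  # membrane potential

    def step(self, features):
        """One timestep: integrate input current, fire, reset fired neurons."""
        self.v = self.tau * self.v + features @ self.w
        spikes = (self.v >= self.threshold).astype(float)
        self.v = np.where(spikes > 0, 0.0, self.v)
        return spikes

    def rstdp_update(self, pre, post, reward, lr=1e-3):
        """Reward gates the pre/post spike correlation (simplified R-STDP)."""
        self.w += lr * reward * np.outer(pre, post)

head = LIFActionHead(in_dim=8, out_dim=4)
pre = np.ones(8)       # stand-in for QFormer-extracted features
post = head.step(pre)  # spike-based action prediction
head.rstdp_update(pre, post, reward=1.0)
```

Because the update needs only local spike statistics and a scalar reward, the same rule can run at test time without backpropagation, which is the idea behind the Online STDP mode.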
A novel architecture that compresses VLA hidden states through an information bottleneck, followed by off-policy Actor-Critic reinforcement learning:
- Encoder-Decoder: Extracts a compact action token from the VLA's internal features to serve as the state representation for RL.
- Two-Phase Training: an initial adaptation stage exposes the action token, followed by RL fine-tuning with the VLA frozen.
- Low Resource Requirements: the reinforcement learning gradient updates touch only a lightweight set of parameters, keeping memory and compute costs low.
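The shape of this setup can be sketched with plain NumPy. Everything here is an assumption for illustration (dimensions, names, and the linear bottleneck stand in for the real encoder-decoder); the point is only that the RL state is a small compressed token and the trainable heads are tiny relative to the frozen VLA.

```python
import numpy as np

# Hypothetical sketch of the RLActionToken idea (not AlphaBrain's real code):
# VLA hidden states are compressed through a bottleneck into a compact
# "action token", which serves as the state for a small actor-critic.
rng = np.random.default_rng(0)
hidden_dim, token_dim, action_dim = 1024, 32, 7

# Phase 1 (adaptation): train an encoder/decoder so the token keeps
# action-relevant information; here just a random linear bottleneck.
encode = rng.normal(0, 0.02, (hidden_dim, token_dim))
actor  = rng.normal(0, 0.02, (token_dim, action_dim))  # lightweight policy head
critic = rng.normal(0, 0.02, (token_dim, 1))           # lightweight value head

# Phase 2 (RL fine-tuning): the VLA is frozen; only the small
# actor/critic operating on the compressed token receives gradients.
vla_hidden = rng.normal(size=hidden_dim)  # frozen VLA features
action_token = vla_hidden @ encode        # compact state representation
action = action_token @ actor
value  = (action_token @ critic).item()
```

With these toy dimensions the RL-phase trainable heads hold only `token_dim * (action_dim + 1)` = 256 weights, which is why the gradient-update phase is so cheap compared to the multi-billion-parameter VLA.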
Experience-replay-based continual learning for sequential task acquisition:
- Incremental design — all changes are additive, no modification to base training code;
- LoRA integration — parameter-efficient fine-tuning (~6% trainable params, ~3× memory savings);
- Replay buffer with configurable per-task capacity;
- Cross-architecture adaptation — the same CL algorithm drops directly onto different VLA frameworks.
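A minimal sketch of the replay component, under stated assumptions: the class name, the `(obs, action)` sample format, and the sampling strategy are all illustrative, not the framework's actual implementation.

```python
import random
from collections import deque

# Illustrative experience-replay buffer with configurable per-task
# capacity (hypothetical; not AlphaBrain's real class).
class ReplayBuffer:
    def __init__(self, per_task_capacity=100):
        self.per_task_capacity = per_task_capacity
        self.tasks = {}  # task_id -> deque of (obs, action) samples

    def add(self, task_id, sample):
        # Oldest samples for a task are evicted once capacity is hit.
        buf = self.tasks.setdefault(
            task_id, deque(maxlen=self.per_task_capacity))
        buf.append(sample)

    def sample(self, batch_size):
        # Mix current-task data with rehearsal from earlier tasks.
        pool = [s for buf in self.tasks.values() for s in buf]
        return random.sample(pool, min(batch_size, len(pool)))

buffer = ReplayBuffer(per_task_capacity=2)
for t in ("task_a", "task_b"):
    for i in range(3):
        buffer.add(t, (f"obs_{t}_{i}", f"act_{t}_{i}"))
batch = buffer.sample(4)  # draws from both tasks' retained samples
```

Because the buffer sits entirely outside the training loop and only contributes extra batches, it stays additive: the base training code of whichever VLA framework is in use does not need to change.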
Native support for 4 world model backbones plus full CosmosPolicy finetuning:
| Backbone | Params | Mode Name | Text Encoder |
|---|---|---|---|
| V-JEPA 2.1 | ~1.8B | world_model_vjepa | T5-small |
| Cosmos Predict 2.5 | ~2.1B | world_model_cosmos | Reason1-7B |
| Cosmos Predict 2 | ~2.1B | world_model_cosmos2 | T5-XXL |
| Wan 2.2 | ~5B | world_model_wan | UMT5-XXL |
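Switching backbones by mode name might look like the following registry lookup. The `WORLD_MODEL_REGISTRY` dict and `resolve_world_model()` entry point are hypothetical names for illustration; only the mode strings and backbone/encoder pairings come from the table above.

```python
# Hypothetical config registry (names assumed, not the real AlphaBrain API);
# the mode-name -> backbone/text-encoder pairings mirror the table above.
WORLD_MODEL_REGISTRY = {
    "world_model_vjepa":   {"backbone": "V-JEPA 2.1",         "text_encoder": "T5-small"},
    "world_model_cosmos":  {"backbone": "Cosmos Predict 2.5", "text_encoder": "Reason1-7B"},
    "world_model_cosmos2": {"backbone": "Cosmos Predict 2",   "text_encoder": "T5-XXL"},
    "world_model_wan":     {"backbone": "Wan 2.2",            "text_encoder": "UMT5-XXL"},
}

def resolve_world_model(mode: str) -> dict:
    """Look up the backbone and its paired text encoder for a mode name."""
    if mode not in WORLD_MODEL_REGISTRY:
        raise KeyError(f"Unknown world model mode: {mode!r}")
    return WORLD_MODEL_REGISTRY[mode]

cfg = resolve_world_model("world_model_wan")
```

Keeping the backbone choice behind a single mode string is what lets the same trainer and inference interface swap world models without code changes.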
| Benchmark | Tasks | Highlights | Path |
|---|---|---|---|
| LIBERO | Spatial / Object / Goal / Long-horizon | Core evaluation suite, 4 task suites | benchmarks/LIBERO/ |
| LIBERO-plus | Robustness (Camera, Robot, Language, Light, etc.) | Zero-shot generalization testing | benchmarks/LIBERO-plus/ |
| RoboCasa | Tabletop & kitchen manipulation | Real-world scene diversity | benchmarks/Robocasa_tabletop/ |
| RoboCasa365 | 365-day kitchen task collection | Large-scale daily tasks | benchmarks/Robocasa365/ |
| ... | | | |
We welcome contributions from the community — including new frameworks, benchmark adapters, bug fixes, and improvements that achieve stronger benchmark performance. Outstanding contributors may be invited to join the community as core members. Every contribution matters.
| Channel | Description |
|---|---|
| GitHub Issues | Report bugs & request features |
| HuggingFace | Models |
| WeChat Group | Scan the QR code to join |
AlphaBrain is mainly forked from starVLA and stands on the shoulders of an incredible open-source ecosystem. We are deeply grateful to the authors and maintainers of the following projects, whose code, models, datasets, and ideas directly enabled this work:
- starVLA/starVLA
- openvla/openvla
- moojink/openvla-oft
- Physical-Intelligence/openpi
- NVIDIA/Isaac-GR00T
- QwenLM/Qwen3-VL
- nvidia-cosmos/cosmos-predict2.5
- Wan-Video/Wan2.2
- Lifelong-Robot-Learning/LIBERO
- robocasa/robocasa
- guoweiyu/NeuroVLA
```bibtex
@software{AlphaBrain2026,
  title   = {AlphaBrain: A Modular Open-Source Framework for Embodied Intelligence Research},
  author  = {AlphaBrain Community},
  year    = {2026},
  url     = {https://github.com/AlphaBrainGroup/AlphaBrain},
  license = {MIT},
  doi     = {}
}
```

This project is licensed under the MIT License.
