Dual-timescale hybrid SSM+Attention architecture for efficient robotics and IoT.
⚠️ Alpha release: the architecture is functional and tested, and pretrained weights are coming soon. The API may change.
ArcMind is a neural architecture purpose-built for robotics and IoT edge deployment. It combines a fast State Space Model (SSM) path for continuous sensor stream processing with a slow exact attention path for episodic memory recall — all at 245K to 10.3M parameters.
Current approaches to "LMs for robotics" fall into two camps:
- Massive VLA models (RT-2, OpenVLA, Pi-Zero) — 7B–55B parameters, cloud-dependent, text-centric tokenization. Can't run on edge hardware.
- Shoehorned text SLMs (quantized Phi, TinyLlama on Jetson) — general-purpose text models forced onto constrained devices. Not designed for sensor streams.
Neither is purpose-built for the sensor-stream → reasoning → action loop that robotics and IoT actually need.
ArcMind introduces a dual-timescale hybrid that mirrors how robotic control actually works:
- Fast path (SSM): Processes every sensor frame at hardware rate (100–1000 Hz). O(n) time, O(1) decode memory, no KV cache. Produces smooth, physically plausible control signals.
- Slow path (Attention): Tiny exact attention (1–2 layers, 2–4 heads) runs at decision rate (1–10 Hz) over an episodic memory buffer for precise spatial/temporal recall.
- Sensor-native tokenization: Raw sensor frames projected directly into model dimension via learned linear layers. No vocabulary table — eliminates the ~40% parameter overhead of embedding tables in small text LMs.
- Episodic memory: Fixed-size ring buffer of compressed environment state snapshots. Enables landmark recall and obstacle memory without a growing KV cache.
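The episodic memory described above can be pictured as a fixed-capacity ring buffer. The sketch below is a minimal pure-Python illustration, not ArcMind's actual `EpisodicMemory` implementation: the class name `RingBufferSketch` and its `write`/`read` methods are hypothetical, and snapshots are plain Python objects rather than compressed state tensors.

```python
class RingBufferSketch:
    """Fixed-size ring buffer: once full, each write overwrites the oldest slot.

    Hypothetical illustration only; not the ArcMind EpisodicMemory API.
    """

    def __init__(self, num_slots: int) -> None:
        if num_slots <= 0:
            raise ValueError("num_slots must be positive")
        self.num_slots = num_slots
        self._slots = [None] * num_slots  # allocated once, never grows
        self._next = 0   # index of the slot the next write will use
        self._count = 0  # number of slots currently holding a snapshot

    def write(self, snapshot) -> None:
        """Store a snapshot, overwriting the oldest one if the buffer is full."""
        self._slots[self._next] = snapshot
        self._next = (self._next + 1) % self.num_slots
        self._count = min(self._count + 1, self.num_slots)

    def read(self) -> list:
        """Return stored snapshots ordered from oldest to newest."""
        if self._count < self.num_slots:
            return self._slots[: self._count]
        return self._slots[self._next:] + self._slots[: self._next]


memory = RingBufferSketch(num_slots=3)
for step in range(5):
    memory.write(f"snapshot-{step}")
print(memory.read())
```

Because storage is allocated once, memory use stays constant however long the sensor stream runs, which is the property the bullet above contrasts with a growing KV cache.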
Requirements: Python ≥ 3.10, PyTorch ≥ 2.1
pip install arcmind

For development:
git clone https://github.com/jemsbhai/arcmind.git
cd arcmind
pip install -e ".[dev]"
pytest tests/ -v  # 44 tests, all passing

import torch
from arcmind import ArcMindConfig, ArcMindModel
# Create a model from a preset
config = ArcMindConfig.robotics_small()
model = ArcMindModel(config)
# Simulate a sensor stream (batch=1, 100 timesteps, 12 channels)
sensor_data = torch.randn(1, 100, config.num_sensor_channels)
# Forward pass
model.reset_memory(batch_size=1)
actions = model(sensor_data)
print(actions.shape) # (1, 100, 6)
# Inspect parameter breakdown
for component, count in model.count_parameters().items():
    print(f" {component}: {count:,}")

# Build a custom config for your specific robot/sensor setup
config = ArcMindConfig(
    num_sensor_channels=9,  # e.g., 3-axis accel + 3-axis gyro + 3-axis mag
    d_model=96,
    num_ssm_layers=6,
    ssm_state_dim=12,
    num_attn_layers=1,
    num_attn_heads=3,
    num_memory_slots=32,
    action_dim=4,  # e.g., 4 motor commands
    sensor_freq_hz=200.0,
    decision_freq_hz=20.0,
)
model = ArcMindModel(config)

All parameter counts independently verified via test suite.
| Preset | Params | SSM | Attention | Tokenizer | Target Hardware |
|---|---|---|---|---|---|
| `iot_tiny` | 245K | 73.5% | 20.3% | 0.2% | Cortex-M7, ESP32-S3 |
| `robotics_small` | 1.7M | 84.7% | 11.7% | 0.1% | Jetson Orin Nano, RPi 5 |
| `robotics_medium` | 10.3M | 82.4% | 15.3% | 0.1% | Desktop GPU, Jetson AGX |
Key property: the sensor tokenizer accounts for only 0.1–0.2% of total parameters in every preset, so the embedding-table-free design avoids the parameter overhead that dominates sub-100M text LMs.
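A rough back-of-the-envelope comparison shows why a linear sensor projection is so much smaller than a vocabulary embedding table. This is a sketch under stated assumptions: it treats the tokenizer as a single linear layer with bias (the real `SensorTokenizer` may differ), and the 32,000-entry vocabulary is an illustrative figure, not something ArcMind uses. Both helper functions are hypothetical.

```python
def linear_projection_params(num_sensor_channels: int, d_model: int) -> int:
    """Weights plus bias of one linear layer mapping a sensor frame to d_model."""
    return num_sensor_channels * d_model + d_model


def embedding_table_params(vocab_size: int, d_model: int) -> int:
    """One d_model-sized row per vocabulary entry."""
    return vocab_size * d_model


# 9 channels and d_model=96 match the custom config example in this README.
sensor_params = linear_projection_params(num_sensor_channels=9, d_model=96)

# vocab_size=32_000 is an illustrative assumption, not an ArcMind setting.
table_params = embedding_table_params(vocab_size=32_000, d_model=96)

print(f"sensor projection: {sensor_params:,} parameters")
print(f"embedding table:   {table_params:,} parameters")
```

Under these assumptions the projection needs 960 parameters against 3,072,000 for the table, because its size scales with the number of sensor channels rather than with a vocabulary.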
Sensor Stream → SensorTokenizer → SSMCore (fast, 100-1000 Hz)
↓ periodic snapshot
EpisodicMemory (ring buffer)
↓ read
SSM output → SlowAttention (slow, 1-10 Hz) ← memory slots
↓ gated fusion
ActionHead → action output
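The two rates in the diagram above imply that the slow path runs only on a subset of sensor steps. How ArcMind actually schedules it is not stated here, so the following is a hedged sketch that assumes a fixed integer stride derived from the two frequencies, with the slow path also firing on step 0. The function `slow_path_steps` is hypothetical and not part of the `arcmind` package.

```python
def slow_path_steps(
    num_steps: int, sensor_freq_hz: float, decision_freq_hz: float
) -> list[int]:
    """Return the sensor-step indices at which the slow path would run.

    Assumes a fixed stride of sensor_freq_hz / decision_freq_hz steps,
    rounded to the nearest integer and never smaller than 1.
    """
    if sensor_freq_hz <= 0 or decision_freq_hz <= 0:
        raise ValueError("frequencies must be positive")
    if decision_freq_hz > sensor_freq_hz:
        raise ValueError("decision rate cannot exceed sensor rate")
    stride = max(1, round(sensor_freq_hz / decision_freq_hz))
    return [t for t in range(num_steps) if t % stride == 0]


# Frequencies taken from the custom config example: 200 Hz sensors, 20 Hz decisions.
steps = slow_path_steps(num_steps=30, sensor_freq_hz=200.0, decision_freq_hz=20.0)
print(steps)
```

With these frequencies the stride is 10, so the fast path handles all 30 frames while the slow path runs on three of them.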
Design rationale:
- SSM:Attention parameter ratio of ~5:1 to ~8:1 extends the Granite 4 (9:1) and Jamba (7:1) ratios, justified by sensor streams being temporally smoother than text.
- Only 2–4 attention heads for recall, based on retrieval-aware distillation research showing 2–3 heads suffice for recall in SSM hybrids.
- Episodic ring buffer adapted from Expansion Span's reserved attention context for distant retrieval, applied here to compressed environment state snapshots.
Status and roadmap:
- Core architecture (SSM + Attention + Memory + Gating)
- Sensor-native tokenizer
- Three validated presets (IoT, Robotics-S, Robotics-M)
- Full test suite (44 tests)
- PyPI package
- Training pipeline
- Benchmark evaluation (MuJoCo, sensor datasets)
- Pretrained weights on HuggingFace
- Research paper
Paper forthcoming. For now:
@software{arcmind2026,
title={ArcMind: Dual-Timescale Hybrid SSM+Attention for Efficient Robotics and IoT},
author={Syed, Muntaser},
year={2026},
url={https://github.com/jemsbhai/arcmind},
}

MIT License. See LICENSE for details.