Official repository for SpaceTools and the Toolshed system.
Code coming soon: we are preparing a full release including the Toolshed infrastructure, DIRL training pipeline, and evaluation scripts.
SpaceTools is a framework that empowers VLMs with vision tools and robotic tools for spatial reasoning and real-world manipulation.
It introduces Double Interactive Reinforcement Learning (DIRL), a two-phase training pipeline that enables effective multi-tool coordination.
The code release will include both:
Toolshed, a scalable infrastructure for deploying compute-heavy tools during both training and inference:
- Isolated environments for each tool
- Decoupled resource scaling
- Async parallel workers per tool
- Support for heavy tools (segmentation, pointing, depth, 3D box, grasp prediction)
SpaceTools training and evaluation code:
- DIRL training pipeline
- SFT + RL dataset
- Tool-augmented inference
- Spatial benchmark evaluation
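Until the code is released, the Toolshed properties listed above (async parallel workers per tool, decoupled resource scaling) can be illustrated with a minimal sketch. It is not the Toolshed API: `ToolPool`, `fake_depth`, `fake_segmentation`, and `run_demo` are hypothetical names invented for this example, the tools are stand-ins that only sleep, and real environment isolation (separate processes, containers, or GPUs per tool) is not modeled here.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical type for an async tool backend; not part of any released API.
ToolFn = Callable[[Any], Awaitable[Any]]


class ToolPool:
    """Illustrative per-tool pool: one request queue served by N async workers."""

    def __init__(self, name: str, fn: ToolFn, num_workers: int) -> None:
        self.name = name
        self.fn = fn
        self.num_workers = num_workers  # scaled independently for each tool
        self.queue: asyncio.Queue = asyncio.Queue()
        self.workers: list = []

    def start(self) -> None:
        # Must be called from inside a running event loop.
        self.workers = [
            asyncio.create_task(self._worker()) for _ in range(self.num_workers)
        ]

    async def _worker(self) -> None:
        while True:
            payload, future = await self.queue.get()
            try:
                future.set_result(await self.fn(payload))
            except Exception as exc:  # hand tool errors back to the caller
                future.set_exception(exc)
            finally:
                self.queue.task_done()

    async def call(self, payload: Any) -> Any:
        # Enqueue a request and wait until some worker of this tool answers it.
        future = asyncio.get_running_loop().create_future()
        await self.queue.put((payload, future))
        return await future

    async def stop(self) -> None:
        for worker in self.workers:
            worker.cancel()
        await asyncio.gather(*self.workers, return_exceptions=True)


async def fake_depth(image_id: str) -> dict:
    await asyncio.sleep(0.01)  # stand-in for a heavy depth model
    return {"tool": "depth", "image": image_id}


async def fake_segmentation(image_id: str) -> dict:
    await asyncio.sleep(0.02)  # stand-in for a heavy segmentation model
    return {"tool": "segmentation", "image": image_id}


async def run_demo() -> list:
    # Each tool gets its own pool with its own worker count.
    pools = {
        "depth": ToolPool("depth", fake_depth, num_workers=2),
        "segmentation": ToolPool("segmentation", fake_segmentation, num_workers=4),
    }
    for pool in pools.values():
        pool.start()
    requests = [("depth", "img0"), ("segmentation", "img0"), ("depth", "img1")]
    # Requests to different tools run concurrently; gather keeps request order.
    results = await asyncio.gather(*(pools[t].call(x) for t, x in requests))
    for pool in pools.values():
        await pool.stop()
    return list(results)


if __name__ == "__main__":
    print(asyncio.run(run_demo()))
```

Giving each tool its own queue and worker count is one simple way to keep a slow tool from blocking calls to a fast one. The released infrastructure may be organized quite differently.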
For project details and demos:
Project Page: https://spacetools.github.io/
Paper: https://arxiv.org/pdf/2512.04069
We will provide detailed setup instructions, including:
- Recommended environment (conda / pip)
- CUDA / PyTorch version requirements
- Setup for Toolshed workers and servers
- Integration of Toolshed for interactive RL
- Integration of Toolshed for zero-shot frontier model reasoning
- Supervised fine-tuning for tool use
- Dependencies for VLMs, RL, SFT, and each tool backend
# Placeholder: installation instructions coming soon
git clone https://github.com/spacetools/spacetools.git
cd spacetools