Skip to content

nimit/domin

Repository files navigation

Domin

A high-throughput orchestrator for synthetic robotics datasets; named after Harry Domin of R.U.R., the visionary who sought to free humanity from labor through the mass production of robots.

domin_demo.mp4

domin makes the generation of synthetic datasets for Visual-Language-Action (VLA) models easy, performant, and scalable.

While most current simulation in robotics is geared towards Reinforcement Learning (RL), this package leverages the same massively parallel, multi-environment paradigm to generate training data. This allows you to produce large-scale datasets using NVIDIA Isaac Sim and Isaac Lab with a single configuration file.

Features

This package generates datasets in the popular LeRobot format, compatible with Hugging Face's LeRobot library. The core dataset builder is a modified and extended version of the LeRobot dataset builder, optimized for simulation environments.

Key features inherited from our extended dataset_builder:

  • Ease of Use: A single command to install the package and a single configuration file can generate a complete dataset.
  • Simultaneous Episode Recording: Record multiple episodes (environments) in parallel for high throughput, significantly speeding up data generation.
  • Scheduled Re-recording: Robust handling of failed episodes. If an episode fails, it is automatically cleared and scheduled for a retry in the next batch, ensuring dataset completeness without manual intervention.
  • Metadata & Custom Metrics: Easily save arbitrary metadata (e.g., success rates, simulation parameters) and automatically compute episode statistics in the dataset's info.json.
  • LeRobot Format Compatibility: Produces datasets in the standard LeRobot format (Parquet files with embedded or external images), ready for training.

Installation

You can install this package via pip. Note that you must specify the NVIDIA PyPI index for isaacsim and related dependencies.

pip install domin --extra-index-url https://pypi.nvidia.com

For advanced installation instructions, including setting up Isaac Sim and creating a development environment, please see CONTRIBUTING.md.

Usage

To generate a dataset, use the domin-gen command with a dataset configuration file.

The dataset configuration file should inherit domin.base_dataset_config.BaseDatasetConfig (see example).

Example

domin-gen examples/dexterous_dataset_config.py --num_envs 10 --num_episodes 100

Arguments

  • config_path: Path to the Python file containing the dataset configuration (e.g., examples/dexterous_dataset_config.py).
  • --num_envs: (Optional) Number of parallel environments to simulate (default: 1).
  • --num_episodes: (Required) Total number of episodes to record.

Acknowledgements

This project builds upon the excellent work of the Hugging Face LeRobot team. The domin.dataset_builder module is a modified adaptation of their dataset building tools, tailored for the specific needs of massive parallel simulation in Isaac Lab. I gratefully acknowledge their contributions to the open-source robotics community.

About

domin makes the generation of synthetic datasets for Visual-Language-Action (VLA) models easy, performant, and scalable.

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages