In-hand cube reorientation on the Wuji Hand: PPO policies trained in mjlab (GPU-batched physics via mujoco-warp), covering the full SO(3) goal space, with a sim2real bridge for closed-loop deployment on the physical hand.
| Robot | Task ID | Pretrained checkpoint | Demo |
|---|---|---|---|
| Wuji Hand | WujiHand_Reorient | Latest release assets | sim + real GIFs above |
Pull the checkpoint and CAD bundle from the latest release:
# Requires gh CLI (https://cli.github.com); the glob keeps this command
# working across future release tags. See docs/sim2real/setup.md §3 for
# the manual fallback if you don't have gh installed.
gh release download --repo wuji-technology/wuji-mjlab --pattern '*-assets.zip'
unzip wuji-mjlab-*-assets.zip
mv wuji-mjlab-*-assets release-assets

wuji-mjlab/
├── src/
│ ├── wuji_mjlab/ # task package (tasks/reorient/, assets/, utils/, rl/)
│ └── wuji_rl_libs/ # vendored rsl-rl (5.0.1+wuji1, min_std clamp)
├── deploy/reorient/ # sim2real bridge (vision, ZMQ, hand driver)
├── scripts/ # train / play / tools entry points
├── docs/ # architecture + sim2real setup
├── pixi.toml # canonical install + task runner
└── pyproject.toml # package metadata
- Linux x86_64
- NVIDIA GPU, CUDA 12.8 (Blackwell sm_120 / RTX 50-series supported)
- pixi ≥ 0.66 (the version CI uses) — the only supported installer
- For sim2real: Wuji Hand hardware + Hikrobot USB-3 camera + Hikvision MVS SDK + 3D-printed ArUco-tagged cube + wrist AprilTag; see docs/sim2real/setup.md
⚠️ CAUTION: this repo is pixi-only. `conda` + `pip install -e .` is not tested and not supported.
# 1. install pixi (one-time)
curl -fsSL https://pixi.sh/install.sh | bash
# 2. clone + resolve environment
git clone https://github.com/wuji-technology/wuji-mjlab
cd wuji-mjlab
pixi install

This produces a default environment for training/eval and an optional deploy environment (pixi install -e deploy) for the sim2real bridge.
Verify the environment: pixi run list-envs (lists registered tasks and confirms the mjlab + tyro stack imports cleanly).
pixi run train --task WujiHand_Reorient --agent.upload-model False

`--agent.upload-model False` keeps checkpoints local-only. Drop it (and set WANDB_API_KEY) to also push the final-iteration checkpoint to W&B as a model artifact. Local .pt files are still written on every save_interval boundary either way.
If `pixi run train` runs out of GPU memory (OOM), switch to the lower-VRAM variant: `pixi run train --task WujiHand_Reorient_Light`.
`WujiHand_Reorient` approximately reproduces the released checkpoint at `num_envs=8192, max_iterations=5000` and needs ~20 GB of GPU memory. `WujiHand_Reorient_Light` uses `num_envs=4096, max_iterations=7500`; it fits comfortably under ~12 GB but converges to a visibly weaker policy (occasional cube drops, finger-jam behavior on harder reorientations).
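
One way to compare the two training budgets is to multiply the quoted settings. The sketch below uses only the `num_envs` and `max_iterations` values stated above; the rollout length per iteration is not given in this README, so only the ratio between the two variants is meaningful, and "env-iterations" is a label introduced here for illustration.

```python
# Rough training-budget comparison using only the two settings quoted above.
# "env-iterations" = num_envs * max_iterations. The number of simulation steps
# per iteration is not stated here, so absolute sample counts are unknown.
VARIANTS = {
    "WujiHand_Reorient": {"num_envs": 8192, "max_iterations": 5000},
    "WujiHand_Reorient_Light": {"num_envs": 4096, "max_iterations": 7500},
}


def env_iterations(cfg: dict) -> int:
    """Number of (environment, iteration) pairs processed over a full run."""
    return cfg["num_envs"] * cfg["max_iterations"]


budgets = {name: env_iterations(cfg) for name, cfg in VARIANTS.items()}
ratio = budgets["WujiHand_Reorient_Light"] / budgets["WujiHand_Reorient"]

for name, value in budgets.items():
    print(f"{name}: {value:,} env-iterations")
print(f"Light / full ratio: {ratio:.2f}")
```

Under these numbers the Light variant runs 50% more iterations on half as many environments, which works out to three quarters of the full variant's env-iterations.
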
Checkpoints and W&B logs land under logs/rsl_rl/<run_name>/. Task MDP, reward shaping, and the contact-parameter domain randomisation split into two anatomical groups (palm + thumb compliance zone vs fingers 2-5) are documented in the Architecture section below.
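
As a rough illustration of the two-group split mentioned above, the sketch below draws one friction value per anatomical group and assigns it to every geom in that group. The geom-name prefixes, the friction ranges, and the choice of one shared value per group are all hypothetical placeholders; the actual randomisation terms are the ones defined in the task package.

```python
import random

# Hypothetical geom-name prefixes and friction ranges, for illustration only.
DR_GROUPS = {
    "palm_thumb": {"prefixes": ("palm", "thumb"), "friction": (0.8, 1.4)},
    "fingers_2_5": {
        "prefixes": ("finger2", "finger3", "finger4", "finger5"),
        "friction": (0.5, 1.0),
    },
}


def group_of(geom_name: str) -> str:
    """Return the DR group whose prefix list matches this geom name."""
    for group, spec in DR_GROUPS.items():
        if geom_name.startswith(spec["prefixes"]):
            return group
    raise KeyError(f"geom {geom_name!r} is not covered by any DR group")


def sample_friction(geom_names, rng: random.Random) -> dict:
    """Draw one friction value per group and assign it to each geom in that group."""
    per_group = {
        group: rng.uniform(*spec["friction"]) for group, spec in DR_GROUPS.items()
    }
    return {name: per_group[group_of(name)] for name in geom_names}


frictions = sample_friction(["palm_pad", "thumb_tip", "finger3_tip"], random.Random(0))
print(frictions)
```

Keeping the groups in a single table makes it easy to give the palm and thumb a different range from the remaining fingers without touching the sampling code.
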
# Interactive viewer with a trained checkpoint
pixi run play --task WujiHand_Reorient --checkpoint-file <path-to-ckpt.pt>
# Success-rate eval over N trials (consumes ONNX)
pixi run python -m wuji_mjlab.tasks.reorient.scripts.eval_success_rate <path-to-policy.onnx>
# Export PPO checkpoint → ONNX (sidecar JSON with action_scale / ema_alpha / ctrl_dt)
pixi run python -m wuji_mjlab.tasks.reorient.scripts.export_onnx <path-to-ckpt.pt>

Additional dev utilities:
pixi run list-envs # list registered tasks
pixi run python -m wuji_mjlab.tasks.reorient.scripts.view_task WujiHand_Reorient # view task with a dummy policy

The deploy bridge runs the exported ONNX policy on the real Wuji Hand. A vision module tracks an ArUco-tagged cube (anchored to a wrist AprilTag world frame) via a USB camera and publishes the pose over ZMQ; play_real subscribes to that pose, runs ONNX inference, and closes the loop by sending commands to the hand driver.
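
The ONNX export step writes a sidecar JSON with `action_scale`, `ema_alpha`, and `ctrl_dt`. A minimal sketch of how a control loop might apply the first two is shown below. Only the three key names come from this README; the numeric values, the function name, and the convention itself (EMA-filter the raw policy output, then offset a default joint pose by the scaled result) are assumptions, and the real logic lives in deploy/reorient/.

```python
import json

# Hypothetical sidecar contents; only the three key names come from this README.
SIDECAR = json.loads('{"action_scale": 0.5, "ema_alpha": 0.2, "ctrl_dt": 0.05}')


def smooth_and_scale(raw_action, prev_smoothed, default_pose, cfg):
    """Assumed convention: EMA-filter the policy output, then offset the default pose.

    Returns the new filter state and the joint targets to send to the driver.
    """
    alpha = cfg["ema_alpha"]
    smoothed = [
        alpha * a + (1.0 - alpha) * p for a, p in zip(raw_action, prev_smoothed)
    ]
    targets = [
        q0 + cfg["action_scale"] * s for q0, s in zip(default_pose, smoothed)
    ]
    return smoothed, targets


# One control tick with a two-joint toy action.
smoothed, targets = smooth_and_scale(
    raw_action=[1.0, -1.0],
    prev_smoothed=[0.0, 0.0],
    default_pose=[0.1, 0.2],
    cfg=SIDECAR,
)
print(smoothed, targets)
```

In a loop of this shape, `ctrl_dt` would set the period between ticks, and `smoothed` would be fed back in as `prev_smoothed` on the next tick.
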
No training needed to deploy. Download the pre-trained `policy.onnx` + `policy_config.json` from Releases and pass the `policy.onnx` path as `--ckpt` below. The released policy is what produces the demo GIF above.
# After the hardware setup in docs/sim2real/setup.md is complete:
pixi run -e deploy home # reset hand to home pose
pixi run -e deploy vision # launch cube observer (OpenCV preview)
pixi run -e deploy play-real --ckpt <path-to.onnx> # closed-loop control + mirror viewer

- Software pipeline & configuration: deploy/reorient/README.md
- Hardware setup, 3D-printed cube, camera mounting, calibration: docs/sim2real/setup.md
Three-layer stack: this repo (tasks + deploy) → mjlab → MuJoCo + mujoco-warp. PPO via the vendored rsl-rl backend under src/wuji_rl_libs/rsl_rl/.
Deep dive — three-layer diagram, MDP spec, domain randomisation, adding a new task
+--------------------------------------------------------+
| wuji-mjlab (this repo) |
| +----------------------+ +-------------------------+ |
| | tasks/reorient/ | | deploy/reorient/ | |
| | - env cfg + MDP | | - real-hand env | |
| | - 2-group DR | | - vision pipeline | |
| | - eval + export | | - closed-loop control | |
| +----------------------+ +-------------------------+ |
| +----------------------+ |
| | utils/ | <- shared building blocks |
| +----------------------+ |
| +----------------------+ |
| | rl/ | <- thin RL backend adapter |
| +----------------------+ |
| |
| src/wuji_rl_libs/rsl_rl/ <- vendored PPO backend |
+--------------------------------------------------------+
| |
v v
+-----------------------+ +---------------------------+
| mjlab (pip / pixi) | | torch + onnxruntime |
| + mujoco-warp | | (training + inference) |
| + mujoco | | |
+-----------------------+ +---------------------------+
Full SO(3) in-hand reorientation with the Wuji Hand. Files:
| File | Role |
|---|---|
| reorient_env_cfg.py | Top-level ManagerBasedRlEnvCfg factory |
| reorient_terms.py | All event / termination / reward / DR terms (the task design lives here, not in robot bindings) |
| reorient_constants.py | Initial pose constants (palm-up R_y(-90°), cube above palm) |
| config/wuji_hand/ | Robot-binding layer: thin wiring of the task design onto Wuji Hand (20-DoF dexterous hand) |
| mdp/ | Observations, commands, actions specific to reorientation |
| tooling/ | Eval entrypoints + ONNX export |
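
For readers unfamiliar with the R_y(-90°) notation in the table, the sketch below computes the corresponding unit quaternion in MuJoCo's (w, x, y, z) ordering and applies it to a test vector. It is a generic rotation calculation, not code from reorient_constants.py, and it makes no claim about which body axis corresponds to the palm normal.

```python
import math


def quat_about_y(angle_deg: float):
    """Unit quaternion (w, x, y, z) for a rotation of angle_deg about +Y."""
    half = math.radians(angle_deg) / 2.0
    return (math.cos(half), 0.0, math.sin(half), 0.0)


def rotate(q, v):
    """Rotate vector v by unit quaternion q (expanded sandwich product)."""
    w, x, y, z = q
    vx, vy, vz = v
    # t = 2 * (q_vec x v)
    tx = 2.0 * (y * vz - z * vy)
    ty = 2.0 * (z * vx - x * vz)
    tz = 2.0 * (x * vy - y * vx)
    # v' = v + w * t + q_vec x t
    return (
        vx + w * tx + (y * tz - z * ty),
        vy + w * ty + (z * tx - x * tz),
        vz + w * tz + (x * ty - y * tx),
    )


q_init = quat_about_y(-90.0)
x_axis_rotated = rotate(q_init, (1.0, 0.0, 0.0))
print(q_init)          # approximately (0.7071, 0.0, -0.7071, 0.0)
print(x_axis_rotated)  # approximately (0.0, 0.0, 1.0)
```

A rotation of -90° about Y therefore maps the +X axis onto +Z and leaves +Y unchanged.
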
Task design (MDP terms, reward shaping, anatomically-split contact-parameter DR groups) lives in reorient_terms.py. See src/wuji_mjlab/tasks/reorient/README.md for the architecture invariants and deploy/reorient/README.md for the sim2real bridge — RealHandEnv reuses the sim observation + action managers verbatim, no parallel pipelines.
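
To make "full SO(3)" concrete, the sketch below samples a goal orientation uniformly over all rotations (Shoemake's method for uniform unit quaternions) and measures the rotation angle between two orientations, the kind of quantity a success check could threshold. The function names and the 0.4 rad tolerance are hypothetical; the repo's own command and success terms may be defined differently.

```python
import math
import random


def random_unit_quat(rng: random.Random):
    """Uniform sample over SO(3) as a unit quaternion (w, x, y, z)."""
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    a, b = math.sqrt(1.0 - u1), math.sqrt(u1)
    return (
        b * math.cos(2.0 * math.pi * u3),
        a * math.sin(2.0 * math.pi * u2),
        a * math.cos(2.0 * math.pi * u2),
        b * math.sin(2.0 * math.pi * u3),
    )


def rotation_angle(q1, q2) -> float:
    """Smallest rotation angle in radians taking orientation q1 to q2.

    The absolute value handles the double cover: q and -q are the same rotation.
    """
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    return 2.0 * math.acos(min(1.0, dot))


rng = random.Random(0)
goal = random_unit_quat(rng)
current = (1.0, 0.0, 0.0, 0.0)   # identity orientation
SUCCESS_TOL_RAD = 0.4            # hypothetical tolerance, for illustration only
error = rotation_angle(current, goal)
print(f"angle to goal: {error:.3f} rad, success: {error < SUCCESS_TOL_RAD}")
```

Sampling Euler angles uniformly would over-represent some orientations, which is why a uniform quaternion sampler is used in this sketch.
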
- Create `src/wuji_mjlab/tasks/<your_task>/` with an env cfg factory.
- Put all MDP design (events, rewards, terminations) in `<your_task>_terms.py`. The robot-specific config layer in `config/<robot>/` should be a thin binding only.
- Register via `register_mjlab_task()` in `config/<robot>/__init__.py`.
- Add a quick `pixi run train --task <your_task_id>` smoke run before committing (canonical training entrypoint is `scripts/train/train_rsl_rl.py`, exposed via the `train` pixi task).
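
The split between task design and a thin robot binding described in the steps above can be sketched with a toy registry. Everything here is a stand-in: `EnvCfg`, `make_task_cfg`, `register_task`, `TASK_REGISTRY`, the reward names, and the task ID are hypothetical, and this is not the signature of mjlab's `register_mjlab_task()`, which this README does not show.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class EnvCfg:
    """Toy stand-in for an environment config object."""
    robot: str
    reward_weights: Dict[str, float] = field(default_factory=dict)


# --- <your_task>_terms.py: all task design lives here (hypothetical terms) ---
def make_task_cfg(robot: str) -> EnvCfg:
    return EnvCfg(
        robot=robot,
        reward_weights={"orientation_error": -1.0, "success_bonus": 10.0},
    )


# --- config/<robot>/__init__.py: thin binding, no reward or event logic ---
TASK_REGISTRY: Dict[str, Callable[[], EnvCfg]] = {}


def register_task(task_id: str, factory: Callable[[], EnvCfg]) -> None:
    """Toy registry; rejects duplicate IDs so bindings cannot shadow each other."""
    if task_id in TASK_REGISTRY:
        raise ValueError(f"task {task_id!r} is already registered")
    TASK_REGISTRY[task_id] = factory


register_task("MyRobot_MyTask", lambda: make_task_cfg("my_robot"))

cfg = TASK_REGISTRY["MyRobot_MyTask"]()
print(cfg)
```

Because the binding only names the robot and forwards to the task-level factory, pointing the same task at a different robot would not require touching the reward or event definitions.
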
After cloning, install the pre-commit hooks:
pixi run pre-commit install

Every git commit then runs ruff, codespell, and the YAML/TOML/large-file checks defined in .pre-commit-config.yaml locally, before CI sees the change. Manual full-tree run: pixi run pre-commit run --all-files.
⚠️ Don't `pip install` packages into the pixi env: pip deps aren't tracked by `pixi.toml` / `pixi.lock` and disappear on the next resolve. Edit `pixi.toml` and run `pixi install`.
- wujihandpy — Wuji Hand SDK (C++ core with Python bindings)
- wuji-retargeting — Hand pose retargeting (Vision Pro / glove / video → robot joints)
- wujihandros2 — ROS 2 driver for Wuji Hand
- docs.wuji.tech — Official Wuji documentation portal
This project builds on the following open-source projects:
- mjlab — manager-based RL framework
- mujoco-warp — GPU-batched MuJoCo physics
- MuJoCo — the underlying physics engine
- rsl_rl — PPO implementation (vendored under src/wuji_rl_libs/)
- pupil-apriltags — AprilTag detector for the deploy vision module
If you find this project useful, please consider citing:
@software{wuji2026mjlab,
title={Wuji-MJLab: RL Training for Wuji Hand Dexterous Manipulation},
author={{Wuji Technology}},
year={2026},
url={https://github.com/wuji-technology/wuji-mjlab}
}

Apache 2.0. See LICENSE and NOTICE for third-party attribution.


