wuji-mjlab

Chinese version (中文版)

In-hand cube reorientation on the Wuji Hand: PPO policies trained in mjlab (GPU-batched physics via mujoco-warp), covering the full SO(3) goal space, with a sim2real bridge for closed-loop deployment on the physical hand.

[Demo GIFs: sim reorient demo, real-hand reorient demo]

Tasks

| Robot | Task ID | Pretrained checkpoint | Demo |
|---|---|---|---|
| Wuji Hand | WujiHand_Reorient | Latest release assets | sim + real GIFs above |

Pull the checkpoint and CAD bundle from the latest release:

# Requires gh CLI (https://cli.github.com); the glob keeps this command
# working across future release tags. See docs/sim2real/setup.md §3 for
# the manual fallback if you don't have gh installed.
gh release download --repo wuji-technology/wuji-mjlab --pattern '*-assets.zip'
unzip wuji-mjlab-*-assets.zip
mv wuji-mjlab-*-assets release-assets

Repository layout

wuji-mjlab/
├── src/
│   ├── wuji_mjlab/        # task package (tasks/reorient/, assets/, utils/, rl/)
│   └── wuji_rl_libs/      # vendored rsl-rl (5.0.1+wuji1, min_std clamp)
├── deploy/reorient/       # sim2real bridge (vision, ZMQ, hand driver)
├── scripts/               # train / play / tools entry points
├── docs/                  # architecture + sim2real setup
├── pixi.toml              # canonical install + task runner
└── pyproject.toml         # package metadata

Requirements

  • Linux x86_64
  • NVIDIA GPU, CUDA 12.8 (Blackwell sm_120 / RTX 50-series supported)
  • pixi ≥ 0.66 (the version CI uses) — the only supported installer
  • For sim2real: Wuji Hand hardware + Hikrobot USB-3 camera + Hikvision MVS SDK + 3D-printed ArUco-tagged cube + wrist AprilTag — see docs/sim2real/setup.md

⚠️ CAUTION: this repo is pixi-only. conda + pip install -e . is not tested and not supported.

Installation

# 1. install pixi (one-time)
curl -fsSL https://pixi.sh/install.sh | bash

# 2. clone + resolve environment
git clone https://github.com/wuji-technology/wuji-mjlab
cd wuji-mjlab
pixi install

This produces a default environment for training/eval and an optional deploy environment (pixi install -e deploy) for the sim2real bridge.

Verify the environment: pixi run list-envs (lists registered tasks and confirms the mjlab + tyro stack imports cleanly).

Train

pixi run train --task WujiHand_Reorient --agent.upload-model False

--agent.upload-model False keeps checkpoints local-only. Drop it (and set WANDB_API_KEY) to also push the final-iteration checkpoint to W&B as a model artifact — local .pt files are still written on every save_interval boundary either way.

If pixi run train OOMs, swap to the lower-VRAM variant:

pixi run train --task WujiHand_Reorient_Light

WujiHand_Reorient approximately reproduces the released checkpoint at num_envs=8192, max_iterations=5000 and needs ~20 GB of GPU memory. WujiHand_Reorient_Light uses num_envs=4096, max_iterations=7500 — fits comfortably under ~12 GB but converges to a visibly weaker policy (occasional cube drops, finger-jam behavior on harder reorientations).
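
As a rough worked comparison of the two presets, the sketch below counts how many simulated transitions each one consumes. The rollout length per PPO iteration (`NUM_STEPS_PER_ENV`) is not stated in this README, so the value used is an assumption; the ratio between the presets does not depend on it.

```python
# Compare the two training presets by the number of transitions they consume.
PRESETS = {
    "WujiHand_Reorient": {"num_envs": 8192, "max_iterations": 5000},
    "WujiHand_Reorient_Light": {"num_envs": 4096, "max_iterations": 7500},
}

# Assumption: rollout length per PPO iteration (not given in this README).
NUM_STEPS_PER_ENV = 24


def total_env_steps(preset: dict, num_steps_per_env: int = NUM_STEPS_PER_ENV) -> int:
    """Total transitions = envs * iterations * steps collected per iteration."""
    return preset["num_envs"] * preset["max_iterations"] * num_steps_per_env


full = total_env_steps(PRESETS["WujiHand_Reorient"])
light = total_env_steps(PRESETS["WujiHand_Reorient_Light"])

# The rollout length cancels out of the ratio: (4096 * 7500) / (8192 * 5000).
print(f"light/full sample ratio: {light / full:.2f}")  # prints 0.75
```

So the Light preset trains on about three quarters of the samples, with half the batch of environments per iteration. Whether that alone accounts for the weaker policy is not established here.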

Checkpoints and W&B logs land under logs/rsl_rl/<run_name>/. Task MDP, reward shaping, and the contact-parameter domain randomisation split into two anatomical groups (palm + thumb compliance zone vs fingers 2-5) are documented in the Architecture section below.
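
To make the two-group split concrete, here is a minimal sketch of per-group contact-parameter sampling. The body names, the friction ranges, and the choice to share one draw across a group are all hypothetical; the real terms live in reorient_terms.py and may differ.

```python
import random

# Hypothetical groups and ranges, for illustration only.
DR_GROUPS = {
    "palm_thumb": {"bodies": ["palm", "thumb"], "friction": (0.6, 1.2)},
    "fingers_2_5": {
        "bodies": ["finger_2", "finger_3", "finger_4", "finger_5"],
        "friction": (0.8, 1.0),
    },
}


def sample_contact_params(rng: random.Random) -> dict:
    """Draw one friction value per group and assign it to every body in it."""
    params = {}
    for group in DR_GROUPS.values():
        low, high = group["friction"]
        value = rng.uniform(low, high)  # assumption: one shared draw per group
        for body in group["bodies"]:
            params[body] = value
    return params


# One sample per environment reset; a seeded RNG keeps runs reproducible.
print(sample_contact_params(random.Random(0)))
```

Keeping the groups in one table means each anatomical zone can get its own range without touching the sampling code.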

Play and evaluate

# Interactive viewer with a trained checkpoint
pixi run play --task WujiHand_Reorient --checkpoint-file <path-to-ckpt.pt>

# Success-rate eval over N trials (consumes ONNX)
pixi run python -m wuji_mjlab.tasks.reorient.scripts.eval_success_rate <path-to-policy.onnx>

# Export PPO checkpoint → ONNX (sidecar JSON with action_scale / ema_alpha / ctrl_dt)
pixi run python -m wuji_mjlab.tasks.reorient.scripts.export_onnx <path-to-ckpt.pt>
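
The sidecar JSON carries action_scale, ema_alpha, and ctrl_dt alongside the ONNX file. The sketch below shows one plausible way a consumer could apply the first two to raw policy outputs. Only the three key names come from this README; the values, the EMA convention (alpha weights the newest action), and the "default pose plus scaled action" target are assumptions.

```python
import json

# Hypothetical sidecar contents; only the key names are taken from the README.
SIDECAR_JSON = '{"action_scale": 0.5, "ema_alpha": 0.5, "ctrl_dt": 0.02}'


class ActionPostprocessor:
    """Smooth raw policy actions with an EMA, then scale into joint targets."""

    def __init__(self, cfg: dict, default_pos: list):
        self.scale = cfg["action_scale"]
        self.alpha = cfg["ema_alpha"]
        self.default_pos = list(default_pos)
        self.smoothed = [0.0] * len(default_pos)

    def step(self, raw_action: list) -> list:
        # Assumed convention: alpha weights the new action, (1 - alpha) the history.
        self.smoothed = [
            self.alpha * a + (1.0 - self.alpha) * s
            for a, s in zip(raw_action, self.smoothed)
        ]
        # Assumed mapping: joint target = default pose + scale * smoothed action.
        return [d + self.scale * s for d, s in zip(self.default_pos, self.smoothed)]


cfg = json.loads(SIDECAR_JSON)
post = ActionPostprocessor(cfg, default_pos=[0.0, 0.25])
targets = post.step([1.0, -1.0])
print(targets)  # [0.25, 0.0]
```

Under these assumptions ctrl_dt would set how often `step` is called (every 0.02 s in this made-up config).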

Additional dev utilities:

pixi run list-envs                                                              # list registered tasks
pixi run python -m wuji_mjlab.tasks.reorient.scripts.view_task WujiHand_Reorient  # view task with a dummy policy

Sim-to-real

sim2real deploy rig: camera over the Wuji Hand + jig, MuJoCo mirror viewer on the right

The deploy bridge runs the exported ONNX policy on the real Wuji Hand. A vision module tracks an ArUco-tagged cube (anchored to a wrist AprilTag world frame) via a USB camera and publishes the pose over ZMQ; play_real subscribes to that pose, runs ONNX inference, and closes the loop by sending commands to the hand driver.
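
The pose hand-off between the vision module and the control loop can be pictured with the sketch below. The wire format (JSON with position, quaternion, and timestamp) and the staleness threshold are assumptions for illustration, not the format used in deploy/reorient/. With pyzmq, the encoded bytes would go through a PUB socket's `send()` and arrive via a SUB socket's `recv()`.

```python
import json
import time

# Hypothetical limit on pose age before the controller ignores a message.
MAX_POSE_AGE_S = 0.1


def encode_pose(position, quat_wxyz, stamp: float) -> bytes:
    """Serialise a cube pose (world frame) into a message payload."""
    msg = {"pos": list(position), "quat": list(quat_wxyz), "stamp": stamp}
    return json.dumps(msg).encode("utf-8")


def decode_pose(payload: bytes, now: float):
    """Return (pos, quat), or None if the pose is older than MAX_POSE_AGE_S."""
    msg = json.loads(payload.decode("utf-8"))
    if now - msg["stamp"] > MAX_POSE_AGE_S:
        return None  # stale observation: do not act on it
    return msg["pos"], msg["quat"]


# Publisher side (vision module), then subscriber side (control loop).
t0 = time.time()
payload = encode_pose([0.0, 0.0, 0.1], [1.0, 0.0, 0.0, 0.0], stamp=t0)
print(decode_pose(payload, now=t0 + 0.01))  # fresh pose is returned
print(decode_pose(payload, now=t0 + 1.0))   # stale pose yields None
```

Stamping each pose lets the control loop reject observations that arrive late, for example when the cube is briefly occluded.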

No training needed to deploy. Download the pre-trained policy.onnx + policy_config.json from Releases and pass the policy.onnx path as --ckpt below. The released policy is what produces the demo GIF above.

# After the hardware setup in docs/sim2real/setup.md is complete:
pixi run -e deploy home                              # reset hand to home pose
pixi run -e deploy vision                            # launch cube observer (OpenCV preview)
pixi run -e deploy play-real --ckpt <path-to.onnx>   # closed-loop control + mirror viewer

Architecture

Three-layer stack: this repo (tasks + deploy) → mjlab → MuJoCo + mujoco-warp. PPO runs on the vendored rsl-rl backend under src/wuji_rl_libs/rsl_rl/.

Deep dive — three-layer diagram, MDP spec, domain randomisation, adding a new task
  +--------------------------------------------------------+
  | wuji-mjlab (this repo)                                 |
  |  +----------------------+  +-------------------------+ |
  |  | tasks/reorient/      |  | deploy/reorient/        | |
  |  |   - env cfg + MDP    |  |   - real-hand env       | |
  |  |   - 2-group DR       |  |   - vision pipeline     | |
  |  |   - eval + export    |  |   - closed-loop control | |
  |  +----------------------+  +-------------------------+ |
  |  +----------------------+                              |
  |  | utils/               |  <- shared building blocks   |
  |  +----------------------+                              |
  |  +----------------------+                              |
  |  | rl/                  |  <- thin RL backend adapter  |
  |  +----------------------+                              |
  |                                                        |
  |  src/wuji_rl_libs/rsl_rl/ <- vendored PPO backend      |
  +--------------------------------------------------------+
              |                            |
              v                            v
  +-----------------------+  +---------------------------+
  | mjlab (pip / pixi)    |  | torch + onnxruntime       |
  | + mujoco-warp         |  | (training + inference)    |
  | + mujoco              |  |                           |
  +-----------------------+  +---------------------------+

Reorient task (src/wuji_mjlab/tasks/reorient/)

Full SO(3) in-hand reorientation with the Wuji Hand. Files:

| File | Role |
|---|---|
| reorient_env_cfg.py | Top-level ManagerBasedRlEnvCfg factory |
| reorient_terms.py | All event / termination / reward / DR terms (the task design lives here, not in robot bindings) |
| reorient_constants.py | Initial pose constants (palm-up R_y(-90°), cube above palm) |
| config/wuji_hand/ | Robot-binding layer: thin wiring of the task design onto Wuji Hand (20-DoF dexterous hand) |
| mdp/ | Observations, commands, actions specific to reorientation |
| tooling/ | Eval entrypoints + ONNX export |

Task design (MDP terms, reward shaping, anatomically-split contact-parameter DR groups) lives in reorient_terms.py. See src/wuji_mjlab/tasks/reorient/README.md for the architecture invariants and deploy/reorient/README.md for the sim2real bridge — RealHandEnv reuses the sim observation + action managers verbatim, no parallel pipelines.
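
For readers unfamiliar with what covering the full SO(3) goal space involves, the sketch below draws uniformly distributed goal orientations and measures the remaining rotation to a goal. It uses the standard technique of normalising a 4-D Gaussian sample to get a uniform unit quaternion. It is an illustration only; the task's actual goal sampler and success threshold are not described in this README.

```python
import math
import random


def random_unit_quat(rng: random.Random) -> list:
    """Uniform random rotation as (w, x, y, z): normalise a 4-D Gaussian."""
    while True:
        q = [rng.gauss(0.0, 1.0) for _ in range(4)]
        norm = math.sqrt(sum(c * c for c in q))
        if norm > 1e-9:  # guard against a degenerate near-zero sample
            return [c / norm for c in q]


def rotation_angle(q_a: list, q_b: list) -> float:
    """Geodesic angle in radians between two unit quaternions, in [0, pi]."""
    # abs() handles the double cover: q and -q encode the same rotation.
    dot = abs(sum(a * b for a, b in zip(q_a, q_b)))
    return 2.0 * math.acos(min(1.0, dot))


rng = random.Random(0)
goal = random_unit_quat(rng)
cube = [1.0, 0.0, 0.0, 0.0]  # identity orientation
print(f"remaining rotation: {math.degrees(rotation_angle(cube, goal)):.1f} deg")
```

A goal would typically count as reached once this angle drops below some tolerance; the tolerance itself is a task-design choice.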

Adding a new task

  1. Create src/wuji_mjlab/tasks/<your_task>/ with an env cfg factory.
  2. Put all MDP design (events, rewards, terminations) in <your_task>_terms.py. The robot-specific config layer in config/<robot>/ should be a thin binding only.
  3. Register via register_mjlab_task() in config/<robot>/__init__.py.
  4. Add a quick pixi run train --task <your_task_id> smoke run before committing (canonical training entrypoint is scripts/train/train_rsl_rl.py, exposed via the train pixi task).
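
The separation in steps 2 and 3 can be pictured with the toy sketch below. It is not mjlab's API: `EnvCfg`, `TASK_REGISTRY`, `register_task`, and the task ID are hypothetical stand-ins, and the real signature of register_mjlab_task() is not shown in this README. The point is only that the task design is robot-agnostic and the robot config is a one-line binding.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class EnvCfg:
    """Toy stand-in for an environment config."""
    robot: str
    rewards: dict = field(default_factory=dict)


# Hypothetical registry mapping task IDs to zero-argument config factories.
TASK_REGISTRY: dict[str, Callable[[], EnvCfg]] = {}


def register_task(task_id: str, factory: Callable[[], EnvCfg]) -> None:
    """Register a factory under a unique task ID."""
    if task_id in TASK_REGISTRY:
        raise ValueError(f"task {task_id!r} is already registered")
    TASK_REGISTRY[task_id] = factory


# <your_task>_terms.py: all MDP design lives here, independent of the robot.
def make_task_cfg(robot: str) -> EnvCfg:
    return EnvCfg(robot=robot, rewards={"orientation_error": -1.0})


# config/<robot>/__init__.py: thin binding that only names the robot.
register_task("MyRobot_MyTask", lambda: make_task_cfg("my_robot"))

cfg = TASK_REGISTRY["MyRobot_MyTask"]()
print(cfg.robot, cfg.rewards)
```

With this layout, supporting a second robot means adding another binding line rather than copying the reward and termination logic.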

Development

After cloning, install the pre-commit hooks:

pixi run pre-commit install

Every git commit then runs ruff, codespell, and the YAML/TOML/large-file checks defined in .pre-commit-config.yaml — locally, before CI sees the change. Manual full-tree run: pixi run pre-commit run --all-files.

⚠️ Don't pip install packages into the pixi env — pip deps aren't tracked by pixi.toml / pixi.lock and disappear on the next resolve. Edit pixi.toml and run pixi install.

Acknowledgements

This project builds on the following open-source projects:

  • mjlab — manager-based RL framework
  • mujoco-warp — GPU-batched MuJoCo physics
  • MuJoCo — the underlying physics engine
  • rsl_rl — PPO implementation (vendored under src/wuji_rl_libs/)
  • pupil-apriltags — AprilTag detector for the deploy vision module

Citation

If you find this project useful, please consider citing:

@software{wuji2026mjlab,
  title={Wuji-MJLab: RL Training for Wuji Hand Dexterous Manipulation},
  author={{Wuji Technology}},
  year={2026},
  url={https://github.com/wuji-technology/wuji-mjlab}
}

License

Apache 2.0. See LICENSE and NOTICE for third-party attribution.
