mjlab v1.3.0
TLDR: A packed release with a viewer rebuilt on mjviser, a preset-based terrain system, simplified actuator configuration, and new MDP primitives like RecorderManager and termination_curriculum.
Physics engine bump
mjlab 1.3.0 uses mujoco-warp 3.7.0.1 and mujoco 3.7.0.
Viewer: Rebuilt on mjviser
The Viser viewer internals have been replaced with the standalone mjviser package. Scene creation, mesh conversion, and overlay rendering (contacts, forces, inertia, tendons, joints, frames) now live in mjviser, while mjlab keeps debug visualization and warp tensor conversion in its MjlabViserScene subclass. The viewer exposes a new Visualization tab for overlay controls and a Groups tab for geom and site visibility.
Video: mjviser_teaser.mp4
New panels and tabs:
- Reward bar panel showing horizontal bars for each reward term with a running mean over ~1 second
- W&B run tab for browsing recent runs and pulling checkpoints
- Checkpoints tab in play for hot-swapping checkpoints without restarting, with support for local directories and W&B runs
- Motion reference scrubber for tracking tasks
- Per-pixel segmentation camera data type for geom ID output alongside RGB and depth, with a new Mjlab-Multi-Cube-Seg-Yam task that uses it
Terrain System, Revamped
Terrain configuration moves to a preset-based system with a new @terrain_preset decorator for composing reusable configurations. Curriculum mode now assigns exactly one column per terrain type, with proportion controlling robot spawning distribution rather than column counts. A new STAIRS_TERRAINS_CFG preset provides a progressive stair curriculum out of the box.
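The role of proportion in curriculum mode can be pictured with a small sketch. It assumes each terrain type owns exactly one column and that robots are assigned to columns with probability proportional to each type's proportion. The terrain names, the proportion values, and the `spawn_columns` helper are hypothetical and are not part of mjlab's API.

```python
import random

def spawn_columns(proportions, num_envs, seed=0):
    """Pick one terrain column per environment.

    Hypothetical helper: each terrain type owns exactly one column, and
    `proportions` only weights how often robots spawn in that column.
    """
    names = list(proportions)
    total = sum(proportions.values())
    weights = [proportions[name] / total for name in names]
    rng = random.Random(seed)
    # Column index i corresponds to terrain type names[i].
    return rng.choices(range(len(names)), weights=weights, k=num_envs)

# Made-up terrain types and proportions, for illustration only.
proportions = {"flat": 0.2, "stairs_up": 0.4, "stairs_down": 0.4}
columns = spawn_columns(proportions, num_envs=1000)
counts = [columns.count(i) for i in range(len(proportions))]
```

In this picture, changing a proportion changes how many robots land in a column, while the number of columns stays fixed at one per terrain type.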
Video: mjlab_go1_stair_terrain.mp4. A Go1 velocity policy trained on the new stair terrain.
TerrainHeightSensor, a RayCastSensor subclass, computes per-frame vertical clearance above terrain. The velocity task configs now use it for feet_clearance, feet_swing_height, and foot_height, replacing the previous world-Z proxy that was incorrect on rough terrain. A new terrain-aware upright reward penalizes body pitch and roll relative to the local terrain normal.
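A toy calculation shows why a world-Z proxy breaks down on rough terrain. The sketch below uses an analytic ramp in place of raycasting, so it only illustrates the idea of vertical clearance. The function names and numbers are hypothetical and do not reflect how TerrainHeightSensor is implemented.

```python
def terrain_height(x, slope=0.5):
    """Toy terrain: a ramp rising `slope` meters per meter along x."""
    return slope * x

def vertical_clearance(foot_x, foot_z, slope=0.5):
    """Height of the foot above the terrain directly beneath it."""
    return foot_z - terrain_height(foot_x, slope)

# A foot two meters up the ramp, hovering just above the surface.
foot_x, foot_z = 2.0, 1.1
world_z_proxy = foot_z                           # reads as a high step
clearance = vertical_clearance(foot_x, foot_z)   # only about 0.1 m of real clearance
```

The world-Z value suggests the foot is more than a meter up, while the clearance above the local surface is roughly ten centimeters.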
RayCastSensor itself was generalized: pass a tuple of ObjRef to frame for multi-frame, per-site raycasting, and use the new RingPatternCfg for concentric ring sampling around each frame.
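Concentric ring sampling can be sketched in a few lines. The example below generates planar offsets around a frame origin. The `ring_pattern` function and its parameters (`radii`, `points_per_ring`, `include_center`) are assumptions made for illustration, since the actual fields of RingPatternCfg are not listed in these notes.

```python
import math

def ring_pattern(radii, points_per_ring, include_center=True):
    """Return (x, y) offsets on concentric rings around a frame origin."""
    offsets = [(0.0, 0.0)] if include_center else []
    for radius in radii:
        for k in range(points_per_ring):
            # Evenly spaced angles around the full circle.
            angle = 2.0 * math.pi * k / points_per_ring
            offsets.append((radius * math.cos(angle), radius * math.sin(angle)))
    return offsets

# Two rings of eight points each, plus the center point.
offsets = ring_pattern(radii=[0.05, 0.10], points_per_ring=8)
```

Each offset would serve as the origin of one ray in the frame's local plane, so a pattern like this samples the terrain in a small disc around each foot or site.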
Actuator Configuration, Simplified
Actuator delay is now configured inline on any ActuatorCfg subclass:
# Before
DelayedActuatorCfg(BuiltinPositionActuatorCfg(...), delay_min_lag=2, delay_max_lag=5)
# After
BuiltinPositionActuatorCfg(..., delay_min_lag=2, delay_max_lag=5)

DelayedActuator, DelayedActuatorCfg, and DelayedBuiltinActuatorGroup are removed. Delay always applies to the actuator's command_field automatically, so delay_target is no longer needed.
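To make the effect of a minimum and maximum lag concrete, here is a standalone sketch of a command delay buffer. It is not mjlab's implementation. The `DelayBuffer` class is hypothetical, and resampling the lag on reset is an assumption about when a new lag would be drawn.

```python
import random
from collections import deque

class DelayBuffer:
    """Toy command delay: returns the command issued `lag` steps ago."""

    def __init__(self, min_lag, max_lag, seed=0):
        if not 0 <= min_lag <= max_lag:
            raise ValueError("expected 0 <= min_lag <= max_lag")
        self._rng = random.Random(seed)
        self._min_lag, self._max_lag = min_lag, max_lag
        # Keep just enough history to serve the largest possible lag.
        self._history = deque(maxlen=max_lag + 1)
        self.lag = min_lag
        self.reset()

    def reset(self):
        """Clear history and resample the lag (assumed to happen per episode)."""
        self._history.clear()
        self.lag = self._rng.randint(self._min_lag, self._max_lag)

    def step(self, command):
        """Store the newest command and return the delayed one."""
        self._history.append(command)
        # Until enough history exists, fall back to the oldest stored command.
        index = max(len(self._history) - 1 - self.lag, 0)
        return self._history[index]

buf = DelayBuffer(min_lag=2, max_lag=5)
outputs = [buf.step(float(t)) for t in range(10)]
```

With commands 0.0, 1.0, 2.0, and so on, the output at step t is the command from step t - lag once enough history has accumulated, and the first command before that.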
The four XML actuator config classes (XmlPositionActuatorCfg, XmlVelocityActuatorCfg, XmlMotorActuatorCfg, XmlMuscleActuatorCfg) collapse into a single XmlActuatorCfg that auto-detects the actuator type from the XML. Pass command_field=... to override.
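As a rough illustration of what detecting the actuator type from XML can look like, the sketch below reads the element tag of each entry in an MJCF actuator section using only the standard library. How XmlActuatorCfg actually performs detection is not described in these notes. The model snippet and the `detect_actuator_types` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Minimal made-up MJCF fragment with three actuator shortcuts.
MJCF = """
<mujoco>
  <actuator>
    <position name="hip" joint="hip_joint" kp="20"/>
    <velocity name="wheel" joint="wheel_joint" kv="1"/>
    <motor name="knee" joint="knee_joint"/>
  </actuator>
</mujoco>
"""

def detect_actuator_types(xml_text):
    """Map each actuator name to its MJCF element tag."""
    root = ET.fromstring(xml_text)
    actuator_section = root.find("actuator")
    if actuator_section is None:
        return {}
    return {elem.get("name"): elem.tag for elem in actuator_section}

types = detect_actuator_types(MJCF)
```

Because the element tag already encodes the actuator kind, a single config class can read it from the model instead of requiring one class per kind.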
Two new behaviors on ActuatorCfg:
- viscous_damping for passive velocity-proportional damping (f = -b·v), distinct from the PD derivative gain damping used by position and velocity actuators. Maps to <joint damping> for JOINT transmission and <tendon damping> for TENDON transmission.
- armature and frictionloss now default to None instead of 0.0, preserving the XML values instead of silently overwriting them. Pass armature=0.0 or frictionloss=0.0 explicitly to restore the old behavior.
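Both behaviors reduce to simple rules, sketched here in plain Python. The helper names are hypothetical and the values are made up. The sketch only mirrors the semantics described above: None leaves the XML value untouched, and viscous damping applies f = -b·v.

```python
def resolve_joint_param(xml_value, cfg_value=None):
    """Return the value that ends up on the joint.

    None means "leave the XML value alone"; any number overrides it.
    """
    return xml_value if cfg_value is None else cfg_value

def viscous_force(b, velocity):
    """Passive damping force f = -b * v, opposing the direction of motion."""
    return -b * velocity

kept = resolve_joint_param(xml_value=0.01)                   # XML value preserved
zeroed = resolve_joint_param(xml_value=0.01, cfg_value=0.0)  # explicit override
force = viscous_force(b=0.5, velocity=2.0)
```

Under the old 0.0 default, the second call's behavior applied to every joint, which is why an armature or frictionloss value set in the XML could be silently lost.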
New MDP Primitives
Several additions to the manager and MDP APIs:
- RecorderManager for logging observations, actions, or arbitrary environment data during rollouts. Implement a RecorderTerm subclass and register it in the recorders dict on ManagerBasedRlEnvCfg. The manager provides record_pre_reset, record_post_reset, and record_post_step lifecycle hooks with no opinion on how data is stored.
- termination_curriculum for scheduling changes to termination term parameters during training, matching the existing reward_curriculum pattern. Both now share a single internal engine with init-time validation of stage ordering, field existence, and param keys.
- MetricsTermCfg.reduce field with a "last" option that reports the final step of the episode rather than the episode mean. Useful for binary success metrics.
- RelativeJointPositionAction for joint position control relative to the current configuration. The target is current_pos + action * scale, so a zero action holds the current configuration rather than commanding the default pose.
- dr.pair_friction for randomizing geom-pair friction overrides, with an isotropic=True option that mirrors the symmetric tangent and roll axes so single-axis randomization does not leave the paired axis stale.
- ActionTermCfg.clip for clamping processed actions after scale and offset.
- qfrc_actuator and qfrc_external generalized force accessors on EntityData. qfrc_actuator gives actuator forces in joint space (projected through the transmission); qfrc_external recovers the generalized force from body external wrenches (xfrc_applied).
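Three of these primitives come down to short formulas, sketched below as standalone functions. None of this is mjlab code. The function names are hypothetical, and the "mean" option name is an assumption standing in for the episode-mean behavior the notes contrast with "last".

```python
def process_action(raw, scale=1.0, offset=0.0, clip=None):
    """Scale and offset a raw action, then clamp it to clip = (low, high)."""
    value = raw * scale + offset
    if clip is not None:
        low, high = clip
        value = min(max(value, low), high)
    return value

def relative_joint_target(current_pos, action, scale):
    """Joint target relative to the current configuration."""
    return current_pos + action * scale

def reduce_metric(values, reduce="mean"):
    """Collapse per-step metric values into one episode-level number."""
    if reduce == "last":
        return values[-1]
    return sum(values) / len(values)

# A zero action holds the current joint position.
hold = relative_joint_target(current_pos=0.3, action=0.0, scale=0.5)
# 4.0 * 0.5 + 0.1 = 2.1, clamped to the upper bound.
clipped = process_action(raw=4.0, scale=0.5, offset=0.1, clip=(-1.0, 1.0))
# A binary success flag that only fires on the final step.
success = [0.0, 0.0, 0.0, 1.0]
mean_value = reduce_metric(success, reduce="mean")
last_value = reduce_metric(success, reduce="last")
```

For the success flag, the episode mean dilutes a successful episode to 0.25, while "last" reports 1.0, which is why the final-step reduction suits binary success metrics.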
Cartpole Tutorial
A new cartpole tutorial walks through building an environment from scratch, using the Mjlab-Cartpole-Balance and Mjlab-Cartpole-Swingup tasks as running examples. It covers scene setup, action and observation terms, rewards, terminations, and training.
Also In This Release
- Mujoco 3.7.0 and mujoco-warp 3.7.0.1 are now the minimum supported versions
- Motion imitation documentation with preprocessing instructions, replacing the BeyondMimic link that produced incompatible NPZ files
- margin, gap, solmix fields on CollisionCfg for per-geom contact parameter configuration
- ManagerBasedRlEnvCfg.auto_reset flag for custom training loops that need access to the true terminal state
- Top-level --help for train and play that points users at list-envs and <TASK> --help
- list_envs renamed to list-envs for consistency with other hyphenated entry points
- RNN model support in RslRlModelCfg
- NaN guard now captures mocap body poses, enabling full state reconstruction in the dump viewer for fixed-base entities
- Per-substep metric evaluation via MetricsTermCfg.per_substep for metrics evaluated inside the decimation loop
- dr.pseudo_inertia no longer loads cuSOLVER, eliminating ~4 GB of persistent GPU memory overhead
- CUDA graph capture no longer triggers GC, avoiding capture failures on long runs
- Onnxruntime roundtrip tests for ONNX policy export
Breaking Changes
- DelayedActuator, DelayedActuatorCfg, DelayedBuiltinActuatorGroup are removed. Configure delay inline on any ActuatorCfg subclass.
- delay_target is removed. Delay always applies to the actuator's command_field.
- XmlPositionActuatorCfg, XmlVelocityActuatorCfg, XmlMotorActuatorCfg, XmlMuscleActuatorCfg are replaced by a single XmlActuatorCfg.
- TerrainImporter and TerrainImporterCfg aliases are removed. Use TerrainEntity and TerrainEntityCfg.
- EntityData.generalized_force is removed. Use qfrc_actuator or qfrc_external.
- ActuatorCfg.armature and .frictionloss default to None instead of 0.0.
- Entity.clear_state() is deprecated. Use Entity.reset().
- out_of_terrain_bounds is replaced with terrain_edge_reached.
Bug Fixes
- SceneEntityCfg names and IDs ordering mismatch with preserve_order=False (#876)
- ONNX export path resolution when a parent directory name contains "model" (#867)
- export-scene writing assets to the wrong location and allowing path traversal in asset keys (#858)
- electrical_power_cost now uses joint-space forces, correct for actuators with non-unit gear ratios (#776)
- create_velocity_actuator no longer sets ctrllimited=True with inheritrange=1.0, which crashed on continuous joints such as wheels (#787)
- write_root_com_velocity_to_sim with tensor env_ids on floating-base entities (#793)
- dr.pseudo_inertia 4 GB GPU memory overhead (#753)
- Contact force visualization for non-builtin actuators (#786)
- BoxSteppingStonesTerrainCfg large gap around the platform (#785)
- TerrainHeightSensor reporting box thickness during penetration and max_distance during ground contact (#835, #841)
- Sensor name prefix duplicated on deepcopy (#851)
- get_wandb_checkpoint_path stale cache on repeated calls (#869)
- Ghost geom filtering now uses visual alpha instead of collision flags (#888)
- Native viewer syncs qpos0 when domain randomized, fixing incorrect body positions after dr.joint_default_pos (#760)
- command_manager.compute() is now called during reset() so derived command state is populated before the first observation (#761)
- RayCastSensor frame offset alignment for sites and geoms with a local offset from their parent body (#775)
New Contributors
Thank you to @cjyyx, @omarrayyann, @gokulp01, @cmjang, @jsw7460, and @lzyang2000 for their first contributions to mjlab!
Full Changelog: v1.2.0...v1.3.0