
mjlab v1.3.0


@kevinzakka kevinzakka released this 14 Apr 22:05

TLDR: A packed release with a viewer rebuilt on mjviser, a preset-based terrain system, simplified actuator configuration, and new MDP primitives like RecorderManager and termination_curriculum.

Physics engine bump

mjlab 1.3.0 uses mujoco-warp 3.7.0.1 and mujoco 3.7.0.

Viewer: Rebuilt on mjviser

The Viser viewer internals have been replaced with the standalone mjviser package. Scene creation, mesh conversion, and overlay rendering (contacts, forces, inertia, tendons, joints, frames) now live in mjviser, while mjlab keeps debug visualization and warp tensor conversion in its MjlabViserScene subclass. The viewer exposes a new Visualization tab for overlay controls and a Groups tab for geom and site visibility.

Video: mjviser_teaser.mp4

New panels and tabs:

  • Reward bar panel showing horizontal bars for each reward term with a running mean over ~1 second
  • W&B run tab for browsing recent runs and pulling checkpoints
  • Checkpoints tab in play for hot-swapping checkpoints without restarting, with support for local directories and W&B runs
  • Motion reference scrubber for tracking tasks
  • Per-pixel segmentation camera data type for geom ID output alongside RGB and depth, with a new Mjlab-Multi-Cube-Seg-Yam task that uses it
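The reward bar's running mean over roughly one second can be pictured as a fixed-length window over the most recent steps. The sketch below is not taken from mjviser or mjlab; the class name RunningMean and the 50 Hz step rate are assumptions chosen only for illustration.

```python
from collections import deque


class RunningMean:
    """Mean over the most recent `window` samples (hypothetical helper)."""

    def __init__(self, window: int) -> None:
        if window <= 0:
            raise ValueError("window must be positive")
        self._buf = deque(maxlen=window)
        self._sum = 0.0

    def update(self, value: float) -> float:
        # Subtract the sample that is about to be evicted once the buffer is full.
        if len(self._buf) == self._buf.maxlen:
            self._sum -= self._buf[0]
        self._buf.append(value)
        self._sum += value
        return self._sum / len(self._buf)


# Assumed 50 Hz environment step rate, so about 1 second is 50 samples.
reward_mean = RunningMean(window=50)
for step_reward in (1.0, 0.0, 1.0):
    smoothed = reward_mean.update(step_reward)
```

A windowed mean like this keeps each bar readable while still responding within about a second to changes in a reward term.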

Terrain System, Revamped

Terrain configuration moves to a preset-based system with a new @terrain_preset decorator for composing reusable configurations. Curriculum mode now assigns exactly one column per terrain type, with proportion controlling robot spawning distribution rather than column counts. A new STAIRS_TERRAINS_CFG preset provides a progressive stair curriculum out of the box.
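One way to read the new curriculum semantics: every terrain type owns exactly one column, and proportion only weights which column a robot is spawned in. The following is a minimal sketch under that reading, not mjlab's implementation; spawn_columns and the terrain names and proportions are hypothetical.

```python
import random


def spawn_columns(proportions: dict[str, float], num_envs: int, seed: int = 0) -> list[int]:
    """Pick a terrain column for each environment, weighted by proportion."""
    names = list(proportions)  # one column per terrain type, in dict order
    weights = [proportions[name] for name in names]
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("proportions must be non-negative with a positive sum")
    rng = random.Random(seed)
    return rng.choices(range(len(names)), weights=weights, k=num_envs)


# Hypothetical terrain types and proportions.
proportions = {"flat": 0.2, "stairs_up": 0.5, "stairs_down": 0.3}
columns = spawn_columns(proportions, num_envs=1000)
counts = [columns.count(i) for i in range(len(proportions))]
```

Under this scheme, changing a proportion shifts how many robots train on a terrain type without changing the number of columns in the grid.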

Video (mjlab_go1_stair_terrain.mp4): A Go1 velocity policy trained on the new stair terrain

TerrainHeightSensor, a RayCastSensor subclass, computes per-frame vertical clearance above terrain. The velocity task configs now use it for feet_clearance, feet_swing_height, and foot_height, replacing the previous world-Z proxy that was incorrect on rough terrain. A new terrain-aware upright reward penalizes body pitch and roll relative to the local terrain normal.
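The difference between the old world-Z proxy and terrain-relative clearance shows up as soon as the ground is not at z = 0. The sketch below uses hypothetical helper functions and illustrative numbers; it does not call the TerrainHeightSensor API.

```python
def world_z_proxy(foot_z: float) -> float:
    """Old-style proxy: treats the ground as the plane z = 0."""
    return foot_z


def terrain_clearance(foot_z: float, terrain_z_below: float) -> float:
    """Vertical clearance above the terrain directly under the foot."""
    return foot_z - terrain_z_below


# A foot resting on a 0.30 m high stair tread (illustrative numbers).
foot_z, tread_z = 0.30, 0.30
proxy = world_z_proxy(foot_z)                    # reads as a 0.30 m swing
clearance = terrain_clearance(foot_z, tread_z)   # zero: the foot is on the tread
```

With the proxy, a foot standing on a stair would be rewarded or penalized as if it were mid-swing, which is the error the terrain-relative measurement avoids.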

RayCastSensor itself was generalized: pass a tuple of ObjRef to frame for multi-frame, per-site raycasting, and use the new RingPatternCfg for concentric ring sampling around each frame.
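Concentric ring sampling can be sketched as a set of planar offsets around each frame origin from which rays would be cast. The function name, its parameters, and the optional center point below are assumptions for illustration; the actual RingPatternCfg fields are not described in these notes.

```python
import math


def ring_pattern(
    radii: list[float], points_per_ring: int, include_center: bool = True
) -> list[tuple[float, float]]:
    """Return (x, y) offsets on concentric rings around a frame origin."""
    offsets = [(0.0, 0.0)] if include_center else []
    for radius in radii:
        for k in range(points_per_ring):
            # Evenly spaced angles around the ring.
            angle = 2.0 * math.pi * k / points_per_ring
            offsets.append((radius * math.cos(angle), radius * math.sin(angle)))
    return offsets


# Two rings of 8 points plus the center: 17 ray origins per frame.
offsets = ring_pattern(radii=[0.05, 0.10], points_per_ring=8)
```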

Actuator Configuration, Simplified

Actuator delay is now configured inline on any ActuatorCfg subclass:

# Before
DelayedActuatorCfg(BuiltinPositionActuatorCfg(...), delay_min_lag=2, delay_max_lag=5)

# After
BuiltinPositionActuatorCfg(..., delay_min_lag=2, delay_max_lag=5)

DelayedActuator, DelayedActuatorCfg, and DelayedBuiltinActuatorGroup are removed. Delay always applies to the actuator's command_field automatically, so delay_target is no longer needed.
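Conceptually, a lag of k steps means the actuator sees the command issued k steps earlier. The sketch below illustrates that idea with a small history buffer. It is not mjlab's implementation: the DelayBuffer class is hypothetical, and sampling a single lag once per buffer is an assumption, since these notes do not say when lags are resampled.

```python
import random
from collections import deque


class DelayBuffer:
    """Return the command issued `lag` steps ago (hypothetical sketch)."""

    def __init__(self, min_lag: int, max_lag: int, seed: int = 0) -> None:
        if not 0 <= min_lag <= max_lag:
            raise ValueError("require 0 <= min_lag <= max_lag")
        self._history = deque(maxlen=max_lag + 1)
        # Assumption: one lag is sampled per buffer and then held fixed.
        self.lag = random.Random(seed).randint(min_lag, max_lag)

    def step(self, command: float) -> float:
        self._history.append(command)
        # Until enough history exists, fall back to the oldest stored command.
        index = max(len(self._history) - 1 - self.lag, 0)
        return self._history[index]


buffer = DelayBuffer(min_lag=2, max_lag=2)
delayed = [buffer.step(float(t)) for t in range(5)]
# delayed == [0.0, 0.0, 0.0, 1.0, 2.0]
```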

The four XML actuator config classes (XmlPositionActuatorCfg, XmlVelocityActuatorCfg, XmlMotorActuatorCfg, XmlMuscleActuatorCfg) collapse into a single XmlActuatorCfg that auto-detects the actuator type from the XML. Pass command_field=... to override the detected type.

Two new behaviors on ActuatorCfg:

  • viscous_damping for passive, velocity-proportional damping (f = -b·v), distinct from the PD derivative-gain damping used by position and velocity actuators. Maps to <joint damping> for JOINT transmission and <tendon damping> for TENDON transmission.
  • armature and frictionloss now default to None instead of 0.0, preserving the XML values instead of silently overwriting them. Pass armature=0.0 or frictionloss=0.0 explicitly to restore the old behavior.
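Both behaviors come down to a few lines of arithmetic. The sketch below uses hypothetical helpers (viscous_force, resolve) and illustrative values; it mirrors the semantics described above rather than mjlab's internals.

```python
from typing import Optional


def viscous_force(b: float, v: float) -> float:
    """Passive damping force f = -b * v, opposing the velocity."""
    return -b * v


def resolve(cfg_value: Optional[float], xml_value: float) -> float:
    """None keeps the XML value; any number (including 0.0) overrides it."""
    return xml_value if cfg_value is None else cfg_value


force = viscous_force(b=0.5, v=2.0)               # -1.0
kept = resolve(cfg_value=None, xml_value=0.01)    # 0.01, XML armature preserved
zeroed = resolve(cfg_value=0.0, xml_value=0.01)   # 0.0, explicit override
```

The None default matters because 0.0 is itself a meaningful value: with the old default there was no way to tell "leave the XML alone" apart from "set this to zero".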

New MDP Primitives

Several additions to the manager and MDP APIs:

  • RecorderManager for logging observations, actions, or arbitrary environment data during rollouts. Implement a RecorderTerm subclass and register it in the recorders dict on ManagerBasedRlEnvCfg. The manager provides record_pre_reset, record_post_reset, and record_post_step lifecycle hooks with no opinion on how data is stored.
  • termination_curriculum for scheduling changes to termination term parameters during training, matching the existing reward_curriculum pattern. Both now share a single internal engine with init-time validation of stage ordering, field existence, and param keys.
  • MetricsTermCfg.reduce field with a "last" option that reports the final step of the episode rather than the episode mean. Useful for binary success metrics.
  • RelativeJointPositionAction for joint position control relative to the current configuration. The target is current_pos + action * scale, so a zero action holds the current configuration rather than commanding the default pose.
  • dr.pair_friction for randomizing geom-pair friction overrides, with an isotropic=True option that mirrors the symmetric tangent and roll axes so single-axis randomization does not leave the paired axis stale.
  • ActionTermCfg.clip for clamping processed actions after scale and offset.
  • qfrc_actuator and qfrc_external generalized force accessors on EntityData. qfrc_actuator gives actuator forces in joint space (projected through the transmission); qfrc_external recovers the generalized force from body external wrenches (xfrc_applied).
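The arithmetic behind RelativeJointPositionAction and ActionTermCfg.clip can be sketched in plain Python. The function names and the (low, high) clip tuple are assumptions made for this example, not mjlab's signatures.

```python
def process_action(action, scale, offset, clip=None):
    """Scale and offset raw actions, then clamp each one to (low, high)."""
    processed = [a * scale + offset for a in action]
    if clip is not None:
        low, high = clip
        processed = [min(max(p, low), high) for p in processed]
    return processed


def relative_joint_targets(current_pos, action, scale):
    """Target = current position + action * scale, per joint."""
    return [q + a * scale for q, a in zip(current_pos, action)]


current_pos = [0.25, -0.5]
hold = relative_joint_targets(current_pos, action=[0.0, 0.0], scale=0.5)
moved = relative_joint_targets(current_pos, action=[1.0, -1.0], scale=0.5)
clipped = process_action([2.0, -4.0], scale=0.5, offset=0.0, clip=(-1.0, 1.0))
```

Here hold equals current_pos, showing that a zero action keeps the joints where they are, while clipped is clamped only after scale and offset have been applied.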

Cartpole Tutorial

A new cartpole tutorial walks through building an environment from scratch, using the Mjlab-Cartpole-Balance and Mjlab-Cartpole-Swingup tasks as running examples. It covers scene setup, action and observation terms, rewards, terminations, and training.

Also In This Release

  • MuJoCo 3.7.0 and mujoco-warp 3.7.0.1 are now the minimum supported versions
  • Motion imitation documentation with preprocessing instructions, replacing the BeyondMimic link that produced incompatible NPZ files
  • margin, gap, and solmix fields on CollisionCfg for per-geom contact parameter configuration
  • ManagerBasedRlEnvCfg.auto_reset flag for custom training loops that need access to the true terminal state
  • Top-level --help for train and play that points users at list-envs and <TASK> --help
  • list_envs renamed to list-envs for consistency with other hyphenated entry points
  • RNN model support in RslRlModelCfg
  • NaN guard now captures mocap body poses, enabling full state reconstruction in the dump viewer for fixed-base entities
  • Per-substep metric evaluation via MetricsTermCfg.per_substep for metrics evaluated inside the decimation loop
  • dr.pseudo_inertia no longer loads cuSOLVER, eliminating ~4 GB of persistent GPU memory overhead
  • CUDA graph capture no longer triggers GC, avoiding capture failures on long runs
  • ONNX Runtime round-trip tests for ONNX policy export

Breaking Changes

  • DelayedActuator, DelayedActuatorCfg, DelayedBuiltinActuatorGroup are removed. Configure delay inline on any ActuatorCfg subclass.
  • delay_target is removed. Delay always applies to the actuator's command_field.
  • XmlPositionActuatorCfg, XmlVelocityActuatorCfg, XmlMotorActuatorCfg, XmlMuscleActuatorCfg are replaced by a single XmlActuatorCfg.
  • TerrainImporter and TerrainImporterCfg aliases are removed. Use TerrainEntity and TerrainEntityCfg.
  • EntityData.generalized_force is removed. Use qfrc_actuator or qfrc_external.
  • ActuatorCfg.armature and .frictionloss default to None instead of 0.0.
  • Entity.clear_state() is deprecated. Use Entity.reset().
  • out_of_terrain_bounds is replaced with terrain_edge_reached.

Bug Fixes

  • SceneEntityCfg names and IDs ordering mismatch with preserve_order=False (#876)
  • ONNX export path resolution when a parent directory name contains "model" (#867)
  • export-scene writing assets to the wrong location and allowing path traversal in asset keys (#858)
  • electrical_power_cost now uses joint-space forces, which is correct for actuators with non-unit gear ratios (#776)
  • create_velocity_actuator no longer sets ctrllimited=True with inheritrange=1.0, which crashed on continuous joints such as wheels (#787)
  • write_root_com_velocity_to_sim with tensor env_ids on floating base entities (#793)
  • dr.pseudo_inertia 4 GB GPU memory overhead (#753)
  • Contact force visualization for non-builtin actuators (#786)
  • BoxSteppingStonesTerrainCfg large gap around the platform (#785)
  • TerrainHeightSensor reporting box thickness during penetration and max_distance during ground contact (#835, #841)
  • Sensor name prefix duplicated on deepcopy (#851)
  • get_wandb_checkpoint_path stale cache on repeated calls (#869)
  • Ghost geom filtering now uses visual alpha instead of collision flags (#888)
  • Native viewer syncs qpos0 when domain randomized, fixing incorrect body positions after dr.joint_default_pos (#760)
  • command_manager.compute() is now called during reset() so derived command state is populated before the first observation (#761)
  • RayCastSensor frame offset alignment for sites and geoms with a local offset from their parent body (#775)

New Contributors

Thank you to @cjyyx, @omarrayyann, @gokulp01, @cmjang, @jsw7460, and @lzyang2000 for their first contributions to mjlab!

Full Changelog: v1.2.0...v1.3.0