[mujoco] Complete dm_control suite tasks #387

Merged: Trinkle23897 merged 15 commits into main from jiayi/dmc-complete-suite on Apr 8, 2026

@Trinkle23897
Collaborator

Summary

  • Problem: EnvPool's dm_control coverage was missing tasks from dm_control==1.0.38, including dog, lqr, quadruped, stacker, and HumanoidCMU walk.
  • Scope: Adds native EnvPool implementations and registration for the missing DMC suite tasks, plus tests/docs/assets needed to exercise them. This does not change non-DMC APIs.
  • Outcome: EnvPool now registers the full suite.ALL_TASKS set for dm_control==1.0.38, including normal and pixel variants for the new domains.

This diff completes the dm_control suite task set in EnvPool and tightens alignment coverage for both existing and newly added Mujoco DMC tasks.
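
As a rough illustration of what "registers the full suite.ALL_TASKS set" means, the sketch below checks a list of (domain, task) pairs against a list of registered task IDs. The helper names (`camel`, `expected_task_id`, `missing_tasks`) are hypothetical, and the CamelCase `-v1` naming is an assumption inferred from the `QuadrupedEscape-v1` ID mentioned in the review comments; pixel variants are not covered here because their naming is not shown in this PR.

```python
def camel(name):
    """Turn snake_case such as "humanoid_CMU" into "HumanoidCMU"."""
    return "".join(part[:1].upper() + part[1:] for part in name.split("_"))


def expected_task_id(domain, task, version="v1"):
    """Assumed ID scheme: CamelCase domain + CamelCase task + version suffix."""
    return f"{camel(domain)}{camel(task)}-{version}"


def missing_tasks(all_tasks, registered_ids):
    """Return the (domain, task) pairs whose assumed ID is not registered."""
    registered = set(registered_ids)
    return [
        (domain, task)
        for domain, task in all_tasks
        if expected_task_id(domain, task) not in registered
    ]


# Hand-written stand-ins; a real check would pass the suite's task list and
# the registry's ID list instead.
tasks = [("quadruped", "escape"), ("dog", "fetch")]
registered = ["QuadrupedEscape-v1"]
print(missing_tasks(tasks, registered))
```

In a real environment the two inputs would presumably come from `dm_control.suite.ALL_TASKS` and `envpool.list_all_envs()`; an empty result would then indicate full coverage under the assumed naming.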

Technical Details

  • Approach: Implemented new C++ task headers for dog, lqr, quadruped, and stacker by mirroring dm_control 1.0.38 task semantics, then wired them through pybind registration, Python env wrappers, task registration, XML generation, and the docs.
  • Code pointers:
    • envpool/mujoco/dmc/dog.h: dog task initialization, observations, rewards, fetch ball state, and contact sensor handling.
    • envpool/mujoco/dmc/lqr.h: LQR XML generation, action bounds, randomized stiffness alignment, and rewards.
    • envpool/mujoco/dmc/quadruped.h: quadruped terrain/fetch/run/walk logic, including fetch ball body-state alignment.
    • envpool/mujoco/dmc/stacker.h: stacker initialization, target placement, box observations, and box velocity ordering.
    • envpool/mujoco/dmc/utils.cc: XML transforms for dog, lqr, quadruped, and stacker task variants.
    • envpool/mujoco/dmc/mujoco_dmc_suite_ext_align_test.py: alignment coverage for the newly added domains, with per-key dog contact tolerances only for foot_forces/touch_sensors.
    • envpool/mujoco/dmc/mujoco_dmc_suite_align_test.py: removes an unused pre-step action sample so align tests do not advance NumPy RNG without using the action.
  • Notes: Dog contact-force observations still need a small contact-specific tolerance under MuJoCo v3, because foot_forces/touch_sensors diverge by roughly 1e-4 to 1e-3 in absolute terms despite a synced reset state; all other dog/ext observations stay on the stricter rtol=1e-7 path.
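
The per-key tolerance idea from the notes above can be sketched as follows. This is a minimal stdlib illustration, not the test's actual code: the table name `PER_KEY_TOL`, the helper `mismatched_keys`, and the `atol=1e-3` value are assumptions (the value is picked from the upper end of the divergence range quoted above). It uses `math.isclose`, whose formula differs slightly from NumPy's `assert_allclose`.

```python
import math

# Strict default for every key that has no explicit override.
DEFAULT_TOL = {"rtol": 1e-7, "atol": 0.0}

# Assumed overrides: only the dog contact keys get a looser absolute tolerance.
PER_KEY_TOL = {
    "foot_forces": {"rtol": 1e-7, "atol": 1e-3},
    "touch_sensors": {"rtol": 1e-7, "atol": 1e-3},
}


def mismatched_keys(expected, actual):
    """Return the sorted observation keys whose values differ beyond tolerance."""
    bad = []
    for key, want in expected.items():
        tol = PER_KEY_TOL.get(key, DEFAULT_TOL)
        got = actual[key]
        ok = len(want) == len(got) and all(
            math.isclose(w, g, rel_tol=tol["rtol"], abs_tol=tol["atol"])
            for w, g in zip(want, got)
        )
        if not ok:
            bad.append(key)
    return sorted(bad)


# Contact forces differ by about 5e-4 and pass; joint angles differ by 1e-5
# and fail, because they stay on the strict path.
expected = {"foot_forces": [10.0, 0.0], "joint_angles": [0.5, -0.25]}
actual = {"foot_forces": [10.0005, 0.0002], "joint_angles": [0.5, -0.25001]}
print(mismatched_keys(expected, actual))  # ['joint_angles']
```

Keeping the override table small and explicit makes it visible in review which keys have been loosened, so a regression in any other observation still trips the strict check.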

Test Plan

Automated

  • bazel test --config=macos --config=test //envpool/mujoco:mujoco_dmc_suite_ext_align_test --test_output=errors: passed.
  • bazel test --config=macos --config=test //envpool/mujoco:mujoco_dmc_suite_align_test --test_output=errors: passed.
  • bazel test --config=macos --config=test //envpool/mujoco:mujoco_dmc_suite_ext_deterministic_test //envpool/mujoco:mujoco_dmc_render_test //envpool/mujoco:mujoco_dmc_pixel_observation_test --test_output=errors: passed.
  • bazel test --config=macos --config=test //envpool/mujoco:mujoco_dmc_suite_deterministic_test --test_output=errors: passed.
  • python -m compileall -q envpool/mujoco/dmc: passed.
  • git diff --check: passed.

Suggested Manual

  • bazel test --config=macos --config=test //envpool:make_test --test_filter=_MakeTest.test_make_mujoco_dmc: run in an environment with cmake available; local attempt was blocked while building the unrelated SDL2/VizDoom dependency because cmake was missing.

  DmcCheetahPixelGymnasiumEnvPool,
) = py_env(_DmcCheetahPixelEnvSpec, _DmcCheetahPixelEnvPool)
(
  DmcDogEnvSpec,
  DmcDogDMEnvPool,
  DmcDogGymEnvPool,
  DmcDogGymnasiumEnvPool,
) = py_env(_DmcDogEnvSpec, _DmcDogEnvPool)
(
  DmcDogPixelEnvSpec,
  DmcDogPixelDMEnvPool,
  DmcDogPixelGymEnvPool,
  DmcDogPixelGymnasiumEnvPool,
...
  DmcHumanoidCMUPixelGymnasiumEnvPool,
) = py_env(_DmcHumanoidCMUPixelEnvSpec, _DmcHumanoidCMUPixelEnvPool)
(
  DmcLqrEnvSpec,
  DmcLqrDMEnvPool,

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b4d8ebc7ce

Comment thread envpool/mujoco/dmc/mujoco_env.cc Outdated
mj_addBufferVFS(vfs.get(), asset_name.c_str(), content.data(),
content.size());
}
AddDirectoryAssetsToVFS(vfs.get(), base_path, "dog_assets");
[P2] Load dog mesh assets only for dog domains

This call makes every MujocoEnv instance (including non-dog tasks) recursively read and register the entire dog_assets directory before model load. Because MujocoEnv is shared by all DMC domains, environment startup now pays unnecessary filesystem I/O and VFS population cost for hundreds of dog mesh files, which can noticeably slow initialization when creating many envs. Gate this asset loading to dog tasks (or detect references from the XML) so other domains do not regress.
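
One way to "detect references from the XML" is sketched below in Python for brevity; the real change would live in the C++ loader. The helper `needs_asset_dir` and the sample XML strings are hypothetical, and how the dog model actually points at `dog_assets` (a compiler directory attribute versus path-prefixed `file` attributes) is an assumption, so the sketch checks both.

```python
import xml.etree.ElementTree as ET

# MJCF <compiler> attributes that can name an asset directory.
DIR_ATTRS = ("assetdir", "meshdir", "texturedir")


def needs_asset_dir(xml_text, asset_dir):
    """Return True if the model XML appears to reference files in asset_dir."""
    root = ET.fromstring(xml_text)
    prefix = asset_dir + "/"
    for el in root.iter():
        # Case 1: the compiler element points a directory attribute at it.
        if el.tag == "compiler" and any(el.get(a) == asset_dir for a in DIR_ATTRS):
            return True
        # Case 2: some element names a file under that directory directly.
        if el.get("file", "").startswith(prefix):
            return True
    return False


# Hypothetical, heavily trimmed model strings for illustration only.
dog_like = '<mujoco><compiler meshdir="dog_assets"/><asset><mesh file="torso.stl"/></asset></mujoco>'
other = '<mujoco><asset><texture name="grid" builtin="checker"/></asset></mujoco>'
print(needs_asset_dir(dog_like, "dog_assets"), needs_asset_dir(other, "dog_assets"))
```

With a check like this in front of the directory walk, only models that reference the directory would pay the filesystem and VFS cost. Gating on the domain name instead would be simpler, at the price of hard-coding the dog special case.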

Comment on lines +365 to +366
mjtNum bump = RandUniform(0.15, 1.0)(gen_);
model_->hfield_data[start + row * res + col] = bowl * bump;
[P2] Smooth escape terrain instead of per-cell random noise

The escape terrain currently multiplies the bowl profile by an independent random value at every heightfield cell, producing high-frequency noise. The dm_control escape task uses a coarse random bump map that is upsampled/smoothed, yielding spatially coherent terrain; without that smoothing, QuadrupedEscape-v1 dynamics and rewards deviate materially from upstream behavior. This should generate coarse bumps and interpolate them rather than sampling each cell independently.
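
The suggested fix (sample a coarse bump grid, then interpolate it up to heightfield resolution) can be sketched as below. This is an illustration in pure Python, not the upstream algorithm: it uses plain bilinear interpolation as a stand-in for whatever smoothing dm_control applies, and the names `upsample_bilinear` and `smooth_bumps` plus the `bump_res` and `seed` parameters are hypothetical. The 0.15 to 1.0 range mirrors the `RandUniform(0.15, 1.0)` call quoted above.

```python
import random


def upsample_bilinear(coarse, res):
    """Bilinearly resample a square n x n grid to res x res."""
    n = len(coarse)
    if n < 2 or res < 2:
        raise ValueError("need at least a 2x2 coarse grid and res >= 2")
    scale = (n - 1) / (res - 1)  # fine index -> coarse coordinate
    fine = []
    for row in range(res):
        y = row * scale
        y0 = min(int(y), n - 2)  # clamp so y0 + 1 stays in range
        ty = y - y0
        line = []
        for col in range(res):
            x = col * scale
            x0 = min(int(x), n - 2)
            tx = x - x0
            top = coarse[y0][x0] * (1 - tx) + coarse[y0][x0 + 1] * tx
            bottom = coarse[y0 + 1][x0] * (1 - tx) + coarse[y0 + 1][x0 + 1] * tx
            line.append(top * (1 - ty) + bottom * ty)
        fine.append(line)
    return fine


def smooth_bumps(res, bump_res, low=0.15, high=1.0, seed=0):
    """Draw one random value per coarse cell, then interpolate to res x res."""
    rng = random.Random(seed)
    coarse = [[rng.uniform(low, high) for _ in range(bump_res)]
              for _ in range(bump_res)]
    return upsample_bilinear(coarse, res)


# Each heightfield cell would then be bowl[row][col] * bumps[row][col].
bumps = smooth_bumps(res=9, bump_res=3)
```

Because neighbouring fine cells share the same coarse corners, the resulting multiplier varies gradually across the terrain instead of jumping independently at every cell.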

@Trinkle23897 Trinkle23897 merged commit fd58f83 into main Apr 8, 2026
16 of 18 checks passed
@Trinkle23897 Trinkle23897 deleted the jiayi/dmc-complete-suite branch April 8, 2026 04:38