[Configurator] Add GymnasiumAdapter for CloudAI envs#894
Conversation
📝 WalkthroughWalkthroughAdds GymnasiumAdapter wrapping BaseGym to expose Gymnasium-style action/observation spaces and APIs; makes CloudAIGym observation size metric-driven; exports the adapter; adds optional gymnasium dev/rl extras and a comprehensive pytest suite for the adapter. ChangesGymnasium adapter integration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cloudai/configurator/gymnasium_adapter.py`:
- Around line 92-112: decode_action and step_raw currently allow partial or
unexpected parameter sets which can silently produce incorrect runs; update
decode_action, step, and step_raw to validate that the final parameter keys
exactly match the expected set (the union of self._tunable_params.keys() and
self._fixed_params.keys()) before calling _step_with_params, and raise a clear
exception (e.g., ValueError) listing missing or unexpected keys if validation
fails so the caller fails fast; enforce the same check when step builds params
from decode_action and when step_raw is passed params directly, and reference
the symbols decode_action, step, step_raw, self._tunable_params,
self._fixed_params, and _step_with_params to locate the changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 9474faf6-0f97-42f8-82a0-1e23932471bc
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (5)
pyproject.tomlsrc/cloudai/configurator/__init__.pysrc/cloudai/configurator/cloudai_gym.pysrc/cloudai/configurator/gymnasium_adapter.pytests/test_gymnasium_adapter.py
- GymnasiumAdapter wraps a CloudAI BaseGym as a gymnasium-compatible env: Dict of Discrete actions over the tunable params, float32 Box observation derived from define_observation_space(), gymnasium 5-tuple from step/step_raw. - Fixed (single-value) params are injected on every step so agents only ever see the tunable subset; step_raw bypasses index decoding and fixed injection for callers that already have a fully-formed parameter dict. - The adapter mirrors handle_dse_job's 1-based test_run.step so artifact paths match whether the env is driven by the DSE loop or an external training loop. - CloudAIGymEnv.define_observation_space() now returns one slot per agent metric (at least one), giving adapters the correct Box shape. - gymnasium is an optional dependency: lazy-imported inside the adapter and exposed via a new rl extra (also added to dev) pinned to ~=1.2.
decode_action used to accept partial action dicts and step_raw accepted arbitrary key sets, both silently propagating incomplete params to the underlying env. Validate up front: - decode_action requires keys exactly == self._tunable_params, and each discrete index must be in range. - step_raw requires keys exactly == self._tunable_params | self._fixed_params. A new _assert_keys helper raises ValueError with sorted missing / extra lists so callers see exactly what was wrong. Adds tests for missing key, unknown key, out-of-range index, missing fixed param, and unknown raw key paths.
9937ef3 to
78b1bb3
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 79-82: The constraint-failure branch returns a hardcoded
single-element observation that violates the declared observation shape; update
the branch to return the same-width list as the normal case by using the
metric-derived width (use max(len(self.test_run.test.agent_metrics), 1)) instead
of a single 0.0 so that both the regular return (currently using [0.0] *
max(len(self.test_run.test.agent_metrics), 1)) and the constraint-failure branch
produce identically shaped observations.
In `@tests/test_gymnasium_adapter.py`:
- Around line 209-215: The tests assert on ValueError messages using exact list
formatting (e.g., match=r"missing=\['param_b'\]"), which is brittle; update the
assertions in test_decode_action_missing_key and
test_decode_action_rejects_unknown_keys to use more focused regexes that only
check for the presence of the key names (e.g., look for "param_b" and "bogus")
or words like "missing" / "extra" rather than exact list punctuation; adjust the
two places mentioned (the current block around adapter.decode_action({"param_a":
0}) and the block for adapter.decode_action({"param_a": 0, "param_b": 1,
"bogus": 0}) and the similar assertions at lines 226-234) to use these looser
key-focused regexes so tests validate behavior without depending on formatting.
- Around line 131-136: Remove the test's assertion on the private attribute
adapter._fixed_params and instead verify public behavior: keep the assertion
that adapter.action_space.spaces equals {"param_a","param_b"}, then call
GymnasiumAdapter.step (using the _FixedParamGym instance which records received
actions) with an action containing only "param_a"/"param_b" and assert the
environment received/processed an action that includes the fixed parameter (e.g.
the recorded last_action or returned observation/info shows "fixed_param": 42);
this exercises action-space exclusion plus the merged action behavior without
relying on the private _fixed_params field.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: bf1b3854-aa01-4f65-8aeb-7699901ac890
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (5)
pyproject.tomlsrc/cloudai/configurator/__init__.pysrc/cloudai/configurator/cloudai_gym.pysrc/cloudai/configurator/gymnasium_adapter.pytests/test_gymnasium_adapter.py
| list: One float slot per agent metric (at least one), giving the correct shape | ||
| for adapters that derive ``gymnasium.spaces.Box`` from this output. | ||
| """ | ||
| return [0.0] | ||
| return [0.0] * max(len(self.test_run.test.agent_metrics), 1) |
There was a problem hiding this comment.
Return constraint-failure observations with the declared shape.
Line 82 makes observation width metric-driven, but the constraint-failure branch still returns a hardcoded single-element observation (Line 143). That breaks the fixed observation-shape contract expected by Gymnasium consumers.
Suggested fix
- return [-1.0], self.rewards.constraint_failure, True, {}
+ fallback_obs = [-1.0] * len(self.define_observation_space())
+ return fallback_obs, self.rewards.constraint_failure, True, {}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/cloudai/configurator/cloudai_gym.py` around lines 79 - 82, The
constraint-failure branch returns a hardcoded single-element observation that
violates the declared observation shape; update the branch to return the
same-width list as the normal case by using the metric-derived width (use
max(len(self.test_run.test.agent_metrics), 1)) instead of a single 0.0 so that
both the regular return (currently using [0.0] *
max(len(self.test_run.test.agent_metrics), 1)) and the constraint-failure branch
produce identically shaped observations.
| def test_single_value_params_are_excluded_from_action_space() -> None: | ||
| adapter = GymnasiumAdapter(_FixedParamGym()) | ||
|
|
||
| assert set(adapter.action_space.spaces) == {"param_a", "param_b"} | ||
| assert adapter._fixed_params == {"fixed_param": 42} | ||
|
|
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Avoid asserting private adapter internals in tests.
Line 135 couples the test to adapter._fixed_params, which is implementation detail. Assert only public behavior (action-space exclusion + merged action on step) to keep tests stable across refactors.
Suggested diff
def test_single_value_params_are_excluded_from_action_space() -> None:
adapter = GymnasiumAdapter(_FixedParamGym())
assert set(adapter.action_space.spaces) == {"param_a", "param_b"}
- assert adapter._fixed_params == {"fixed_param": 42}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def test_single_value_params_are_excluded_from_action_space() -> None: | |
| adapter = GymnasiumAdapter(_FixedParamGym()) | |
| assert set(adapter.action_space.spaces) == {"param_a", "param_b"} | |
| assert adapter._fixed_params == {"fixed_param": 42} | |
| def test_single_value_params_are_excluded_from_action_space() -> None: | |
| adapter = GymnasiumAdapter(_FixedParamGym()) | |
| assert set(adapter.action_space.spaces) == {"param_a", "param_b"} | |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/test_gymnasium_adapter.py` around lines 131 - 136, Remove the test's
assertion on the private attribute adapter._fixed_params and instead verify
public behavior: keep the assertion that adapter.action_space.spaces equals
{"param_a","param_b"}, then call GymnasiumAdapter.step (using the _FixedParamGym
instance which records received actions) with an action containing only
"param_a"/"param_b" and assert the environment received/processed an action that
includes the fixed parameter (e.g. the recorded last_action or returned
observation/info shows "fixed_param": 42); this exercises action-space exclusion
plus the merged action behavior without relying on the private _fixed_params
field.
| with pytest.raises(ValueError, match=r"missing=\['param_b'\]"): | ||
| adapter.decode_action({"param_a": 0}) | ||
|
|
||
|
|
||
| def test_decode_action_rejects_unknown_keys(adapter: GymnasiumAdapter) -> None: | ||
| with pytest.raises(ValueError, match=r"extra=\['bogus'\]"): | ||
| adapter.decode_action({"param_a": 0, "param_b": 1, "bogus": 0}) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Relax exception-message matching to avoid brittle formatting dependencies.
These assertions currently depend on exact list representation/order in error strings. Prefer key-focused regex so tests validate behavior, not message formatting details.
Suggested diff
def test_decode_action_rejects_missing_keys(adapter: GymnasiumAdapter) -> None:
- with pytest.raises(ValueError, match=r"missing=\['param_b'\]"):
+ with pytest.raises(ValueError, match=r"missing=.*param_b"):
adapter.decode_action({"param_a": 0})
def test_decode_action_rejects_unknown_keys(adapter: GymnasiumAdapter) -> None:
- with pytest.raises(ValueError, match=r"extra=\['bogus'\]"):
+ with pytest.raises(ValueError, match=r"extra=.*bogus"):
adapter.decode_action({"param_a": 0, "param_b": 1, "bogus": 0})
@@
def test_step_raw_rejects_missing_fixed_param() -> None:
@@
- with pytest.raises(ValueError, match=r"missing=\['fixed_param'\]"):
+ with pytest.raises(ValueError, match=r"missing=.*fixed_param"):
adapter.step_raw({"param_a": 1, "param_b": 10})
@@
def test_step_raw_rejects_unknown_keys() -> None:
@@
- with pytest.raises(ValueError, match=r"extra=\['bogus'\]"):
+ with pytest.raises(ValueError, match=r"extra=.*bogus"):
adapter.step_raw({"param_a": 1, "param_b": 10, "fixed_param": 42, "bogus": 0})Also applies to: 226-234
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/test_gymnasium_adapter.py` around lines 209 - 215, The tests assert on
ValueError messages using exact list formatting (e.g.,
match=r"missing=\['param_b'\]"), which is brittle; update the assertions in
test_decode_action_missing_key and test_decode_action_rejects_unknown_keys to
use more focused regexes that only check for the presence of the key names
(e.g., look for "param_b" and "bogus") or words like "missing" / "extra" rather
than exact list punctuation; adjust the two places mentioned (the current block
around adapter.decode_action({"param_a": 0}) and the block for
adapter.decode_action({"param_a": 0, "param_b": 1, "bogus": 0}) and the similar
assertions at lines 226-234) to use these looser key-focused regexes so tests
validate behavior without depending on formatting.
GymnasiumAdapterwraps a CloudAIBaseGymas agymnasium-compatible env:DictofDiscreteactions over the tunable params,float32Boxobservation derived fromdefine_observation_space(), gymnasium 5-tuple fromstep/step_raw.step_rawbypasses index decoding and fixed injection for callers that already have a fully-formed parameter dict.handle_dse_job's 1-basedtest_run.stepso artifact paths match whether the env is driven by the DSE loop or by an external training loop wrapping the adapter.CloudAIGymEnv.define_observation_space()now returns one slot per agent metric (at least one), giving adapters the correctBoxshape.gymnasiumis an optional dependency: lazy-imported inside the adapter and exposed via a newrlextra (also added todev) pinned to~=1.2.