[RLlib] New API stack EnvRunners were not passing EnvContext to environments (but plain config dicts). #43818
Conversation
LGTM. Some general questions about our design for creating environments.
rllib/env/multi_agent_env_runner.py (outdated)

```python
if not self.config.is_multi_agent():
    raise ValueError(
        f"Cannot use this EnvRunner class ({type(self).__name__}), if your "
        "setup is nt multi-agent! Try adding multi-agent information to your "
```
Tiny typo: "nt" -> "not"
```python
)

# Perform actual gym.make call.
self.env = gym.make("rllib-multi-agent-env-v0")
```
Btw., `gym.make` has an `autoreset` flag with which we can switch off the auto-resetting at the end of an episode if needed.
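The behavior under discussion can be sketched with a toy stand-in (this is not gymnasium's actual wrapper code; `ToyEnv` and `AutoResetSketch` are made up for illustration): with auto-reset on, the wrapper resets the episode internally, so the caller sees the first observation of the next episode without ever calling `reset()` itself.

```python
# Toy sketch of what an auto-reset wrapper does (not gymnasium's actual
# implementation).
class ToyEnv:
    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        terminated = self.t >= 3  # episode ends after 3 steps
        return self.t, 1.0, terminated


class AutoResetSketch:
    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, rew, terminated = self.env.step(action)
        if terminated:
            # Reset internally -- the caller never has to call `reset()`.
            obs = self.env.reset()
        return obs, rew, terminated


env = AutoResetSketch(ToyEnv())
env.reset()
for _ in range(4):
    obs, rew, terminated = env.step(0)
# After step 3 the episode terminated and was auto-reset,
# so step 4 runs in the next episode and yields t=1.
```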
`gym.register` as well.
Ah, ok, maybe that's useful for the callback we fixed yesterday. For multi-agent, this doesn't matter right now b/c we are only using a single gym.Env here (no gym.vector.Env).
Would this disabling of auto-reset also work for gym.vector.Env (for our SingleAgentEnvRunners)? I couldn't find anything in the gymnasium docs about this, only that it's always on, with no mention of an off-switch.
Yeah, `gym.vector.make()` does not have this option; it always auto-resets. And `gym.vector.register()` does not exist :(
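The always-on behavior being described can be mimicked with a toy vector env (hypothetical `SubEnv`/`ToySyncVectorEnv`, not gymnasium code): the constructor accepts an `autoreset` flag but ignores it and auto-resets terminated sub-envs unconditionally.

```python
# Toy sketch mirroring the observed behavior: a sync vector env that
# auto-resets terminated sub-envs no matter what flag was passed.
class SubEnv:
    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, self.t >= 2  # obs, done (episode ends after 2 steps)


class ToySyncVectorEnv:
    def __init__(self, num_envs, autoreset=False):
        # The flag is accepted but ignored, like in the scenario above.
        self.envs = [SubEnv() for _ in range(num_envs)]

    def reset(self):
        return [e.reset() for e in self.envs]

    def step(self, actions):
        obs, dones = [], []
        for e, a in zip(self.envs, actions):
            o, done = e.step(a)
            if done:
                o = e.reset()  # forced auto-reset, no off-switch
            obs.append(o)
            dones.append(done)
        return obs, dones


venv = ToySyncVectorEnv(2, autoreset=False)
venv.reset()
obs, dones = venv.step([0, 0])  # t=1, not done
obs, dones = venv.step([0, 0])  # t=2 -> done -> auto-reset back to t=0
```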
Can't we use `gym.register()` with the `vector_entry_point` and `autoreset`?
That could work! Lemme check ...
Doesn't work. I think the vector Env overrides this setting again somehow :(
```python
import gymnasium as gym
from functools import partial

from ray.rllib.env.utils import _gym_env_creator

entry_point = partial(
    _gym_env_creator,
    env_context={},
    env_descriptor="CartPole-v1",
)
gym.register(
    "rllib-single-agent-env-runner-v0", entry_point=entry_point, autoreset=False
)

# Wrap into `VectorListInfo` wrapper to get infos as lists.
env = gym.wrappers.VectorListInfo(
    gym.vector.make(
        "rllib-single-agent-env-runner-v0",
        num_envs=2,
        asynchronous=False,
    )
)

obs, infos = env.reset()
while True:
    actions = env.action_space.sample()
    obs, rewards, terminateds, truncateds, infos = env.step(actions)
    for i in range(2):
        # This is where we would manually reset the sub-env.
        if terminateds[i] or truncateds[i]:
            # If auto-reset were off, continued stepping should produce a warning:
            #   "You are calling 'step()' even though this environment has already ..."
            # but we don't see one, so the vector env DOES seem to auto-reset,
            # no matter what :(
            print(f"sub-env {i} done -> needs to be reset?")
```
I checked the source code: in `SyncVectorEnv` and `AsyncVectorEnv`, the `autoreset` flag is set to `False` explicitly. No chance then.
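Mechanically, why a registration-time flag can be a no-op can be sketched with a toy (hypothetical `ToySpec`/`ToyVectorEnv`, not gymnasium source): if the vector-env constructor pins the flag itself, whatever value was supplied at registration time is silently discarded.

```python
# Toy illustration (not gymnasium code): a vector env whose constructor
# explicitly overwrites the sub-env specs' `autoreset` flag, so any
# user-supplied registration-time value has no effect.
class ToySpec:
    def __init__(self, autoreset):
        self.autoreset = autoreset


class ToyVectorEnv:
    def __init__(self, sub_specs):
        for spec in sub_specs:
            # Explicitly pinned; the user's registration-time value is lost.
            spec.autoreset = False
        self.sub_specs = sub_specs


venv = ToyVectorEnv([ToySpec(autoreset=True), ToySpec(autoreset=True)])
```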
"""Creates a MultiAgentEnv (is-a gymnasium env).""" | ||
env_ctx = self.config.env_config | ||
if not isinstance(env_ctx, EnvContext): | ||
env_ctx = EnvContext( |
Do we still need the `EnvContext` to start the gymnasium environments? `gymnasium` uses the `kwargs` directly.
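The contrast being drawn can be sketched like this (hypothetical `MyEnv*` classes, not RLlib or gymnasium code): the gymnasium style forwards keyword arguments straight to the env constructor, while the RLlib style bundles everything into one config dict.

```python
# Toy comparison of the two env-construction styles (made-up classes).
class MyEnvKwargs:
    """gymnasium style: constructor takes plain keyword arguments."""

    def __init__(self, grid_size=4, slippery=False):
        self.grid_size = grid_size
        self.slippery = slippery


class MyEnvConfigDict:
    """RLlib style: constructor takes a single config dict."""

    def __init__(self, config=None):
        config = config or {}
        self.grid_size = config.get("grid_size", 4)
        self.slippery = config.get("slippery", False)


a = MyEnvKwargs(grid_size=8)           # gymnasium style
b = MyEnvConfigDict({"grid_size": 8})  # RLlib style
```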
Gym Envs don't need it, but RLlib users are so used to having these properties in the env config dict that it would break things for a lot of them:

```python
config.worker_index = ...
config.num_workers = ...
config["some_setting_for_env_ctor"] = ...
```
I see. It is sometimes a bit unintuitive to use a dictionary instead of passing variables directly, like in gymnasium.
Yeah, I agree. It's historical and I don't want to break this API right now. We should discuss this in more detail. For now I just wanted to make the new stack behave just like the old one in this regard.
Understandable.
Signed-off-by: Sven Mika <sven@anyscale.io>
Fix: New API stack EnvRunners were not passing EnvContext to environments (but plain config dicts).

Why are these changes needed?

Related issue number
Checks

- I've signed off every commit (`git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- If adding a new method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.