believe fixed a bug #825
Codecov Report
```diff
@@            Coverage Diff             @@
##           master     #825      +/-   ##
==========================================
- Coverage   91.12%   91.11%   -0.02%
==========================================
  Files          73       73
  Lines        5083     5085       +2
==========================================
+ Hits         4632     4633       +1
- Misses        451      452       +1
```
- [ ] I have marked all applicable categories:
    + [x] exception-raising fix
    + [ ] algorithm implementation fix
    + [ ] documentation modification
    + [ ] new feature
- [ ] I have reformatted the code using `make format` (**required**)
- [ ] I have checked the code using `make commit-checks` (**required**)
- [ ] If applicable, I have mentioned the relevant/related issue(s)
- [ ] If applicable, I have listed every item in this Pull Request below

I'm developing a new PettingZoo environment. It is a two-player, turn-based board game.

```python
obs_space = dict(
    board=gym.spaces.MultiBinary([8, 8]),
    player=gym.spaces.Tuple([gym.spaces.Discrete(8)] * 2),
    other_player=gym.spaces.Tuple([gym.spaces.Discrete(8)] * 2),
)
self._observation_space = gym.spaces.Dict(spaces=obs_space)
self._action_space = gym.spaces.Tuple([gym.spaces.Discrete(8)] * 2)

...

# This cache ensures that the same space object is returned for the same
# agent, which allows action-space seeding to work as expected.
@functools.lru_cache(maxsize=None)
def observation_space(self, agent):
    # Gymnasium spaces are defined and documented here:
    # https://gymnasium.farama.org/api/spaces/
    return self._observation_space

@functools.lru_cache(maxsize=None)
def action_space(self, agent):
    return self._action_space
```

My test is:

```python
def test_with_tianshou():
    action = None
    # env = gym.make('qwertyenv/CollectCoins-v0', pieces=['rock', 'rock'])
    env = CollectCoinsEnv(pieces=['rock', 'rock'], with_mask=True)

    def another_action_taken(action_taken):
        nonlocal action
        action = action_taken

    # Wrap the original environment to make sure a valid action is taken.
    env = EnsureValidAction(
        env,
        env.check_action_valid,
        env.provide_alternative_valid_action,
        another_action_taken,
    )
    env = PettingZooEnv(env)
    policies = MultiAgentPolicyManager([RandomPolicy(), RandomPolicy()], env)
    env = DummyVectorEnv([lambda: env])
    collector = Collector(policies, env)
    result = collector.collect(n_step=200, render=0.1)
```

I also have a wrapper that may be redundant given Tianshou's action-mask support, but it is still part of the code:

```python
from typing import TypeVar, Callable

import gymnasium as gym
from pettingzoo.utils.wrappers import BaseWrapper

Action = TypeVar("Action")


class ActionWrapper(BaseWrapper):
    def __init__(self, env: gym.Env):
        super().__init__(env)

    def step(self, action):
        action = self.action(action)
        self.env.step(action)

    def action(self, action):
        pass

    def render(self, *args, **kwargs):
        return self.env.render(*args, **kwargs)


class EnsureValidAction(ActionWrapper):
    """
    A gym environment wrapper to help with the case that the agent wants to
    take invalid actions. For example, consider a Chess game where the
    action_space lets any piece move to any square on the board. When an
    invalid move is taken, instead of returning a big negative reward, the
    wrapper simply takes another, valid action. To make sure the learning
    algorithm is aware of the action actually taken, a callback should be
    provided.
    """

    def __init__(self, env: gym.Env,
                 check_action_valid: Callable[[Action], bool],
                 provide_alternative_valid_action: Callable[[Action], Action],
                 alternative_action_cb: Callable[[Action], None]):
        super().__init__(env)
        self.check_action_valid = check_action_valid
        self.provide_alternative_valid_action = provide_alternative_valid_action
        self.alternative_action_cb = alternative_action_cb

    def action(self, action: Action) -> Action:
        if self.check_action_valid(action):
            return action
        alternative_action = self.provide_alternative_valid_action(action)
        self.alternative_action_cb(alternative_action)
        return alternative_action
```

To make the above work I had to patch PettingZoo a bit (I opened a pull request there), plus a small patch here (this PR). Maybe I'm doing something wrong, but I fail to see it. With both of my fixes, to PettingZoo and to Tianshou, I have two tests: one of the environment by itself, and the other as shown above.
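As a side note, the `lru_cache` trick on the space accessors can be illustrated in isolation: caching the method means repeated calls for the same agent return the identical space object, so a seed set on it is not lost. A minimal sketch with a stand-in object instead of a real Gymnasium space (`TwoPlayerEnv` and `_make_space` are illustrative names, not part of the PR):

```python
import functools


class TwoPlayerEnv:
    """Stand-in for the PettingZoo env; spaces are plain lists here."""

    @functools.lru_cache(maxsize=None)
    def action_space(self, agent):
        # Builds the space on the first call for each agent, then returns
        # the cached object. Without the cache, every call would build a
        # fresh space, and a seed set on one copy would be silently lost.
        return self._make_space()

    def _make_space(self):
        # Stand-in for gym.spaces.Tuple([gym.spaces.Discrete(8)] * 2).
        return ["discrete-8", "discrete-8"]


env = TwoPlayerEnv()
# Same agent -> the very same object, so seeding it would persist:
print(env.action_space("p0") is env.action_space("p0"))   # True
# A direct (uncached) build yields a different object each time:
print(env.action_space("p0") is env._make_space())        # False
```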
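The fallback logic in `EnsureValidAction.action` can also be exercised on its own, without gym or PettingZoo. Below is a minimal sketch of the same decision flow; the 8×8 board bounds and the clamping helper are illustrative assumptions, not part of the original code:

```python
from typing import Callable, TypeVar

Action = TypeVar("Action")


def ensure_valid_action(
    action: Action,
    check_action_valid: Callable[[Action], bool],
    provide_alternative_valid_action: Callable[[Action], Action],
    alternative_action_cb: Callable[[Action], None],
) -> Action:
    """Pass valid actions through unchanged; otherwise substitute a valid
    one and notify the callback so the learner knows what was taken."""
    if check_action_valid(action):
        return action
    alternative = provide_alternative_valid_action(action)
    alternative_action_cb(alternative)
    return alternative


# Illustrative 8x8 board: an action is a (row, col) pair.
taken = []
on_board = lambda a: all(0 <= c < 8 for c in a)
clamp = lambda a: tuple(min(max(c, 0), 7) for c in a)

print(ensure_valid_action((2, 3), on_board, clamp, taken.append))  # (2, 3)
print(ensure_valid_action((9, 3), on_board, clamp, taken.append))  # (7, 3)
print(taken)  # [(7, 3)] -- the callback only fired for the invalid action
```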