-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added support for new PettingZoo API #751
Conversation
Codecov Report
@@ Coverage Diff @@
## master #751 +/- ##
==========================================
+ Coverage 89.42% 91.60% +2.18%
==========================================
Files 70 70
Lines 4925 4934 +9
==========================================
+ Hits 4404 4520 +116
+ Misses 521 414 -107
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. I contend your choice to change the syntax, but I'm open to being convinced.
@@ -57,7 +57,8 @@ def __init__(self, env: BaseWrapper): | |||
|
|||
def reset(self, *args: Any, **kwargs: Any) -> Union[dict, Tuple[dict, dict]]: | |||
self.env.reset(*args, **kwargs) | |||
observation, _, _, info = self.env.last(self) | |||
last_return = self.env.last(self) | |||
observation, info = last_return[0], last_return[-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have you done this rather than observation, _, _, _, info = self.env.last(self)
? This would keep the style of Tianshou, Gym and PZ.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment below
Yes, I completely agree with your remarks. The code was more readable before, due to the explicit labeling of the return values. I chose to do it this way to keep backward compatibility with older PZ versions (which use the old step API) at the expense of readability. We have a similar dilemma for Gym, but I don't believe that we should impose gym>=0.26.0 since this would probably break a lot of user code and (afaik) envpool hasn't caught up with the API changes. |
Very good points. Indeed, some users may wish to use a new Tianshou feature with an old PZ env. However, I believe we do have wrappers to convert old PZ envs to new PZ envs pretty easily. But the question is will the rest of Tianshou's update preserve backwards compat? Are you saying it will? If it will, fab, but perhaps there should be warnings making users aware that: |
@WillDudley my changes to tianshou should be backward compatible with respect to Gym and PettingZoo in the sense that you should be able to use old-style environments without any issues. However, it is not entirely compatible with respect to existing user code since my changes may (slightly) break existing code that directly interacts with the replay buffer. I would expect very few users to actually be affected, given the abstraction offered by collectors and the So for all intents and purposes, I think one can call the changes backward compatible. I agree that deprecation warnings may be reasonable. What do you think @Trinkle23897 @pseudo-rnd-thoughts? |
Would personally vote for a depreciation warning encouraging users to upgrade their envs with a simple wrapper (for maintenance reasons), as well as code comments explaining that the unconventional code logic is due to backwards compatibility with PettingZoo<1.21. |
LGTM! |
Maybe someone who really knows PZ should do a sanity check here. I simply changed
step
to be compatible with both the old and the new step API. Removal of seeding was seemingly already anticipated in the existing code.