Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MPE Continuous Action support #419

Merged
merged 21 commits into from Jul 16, 2021
Merged

MPE Continuous Action support #419

merged 21 commits into from Jul 16, 2021

Conversation

Rohan138
Copy link
Contributor

@Rohan138 Rohan138 commented Jul 8, 2021

I've added support for continuous actions in MPE as discussed in #249 through the continuous_actions=False argument in the environment config.

I've tested my changes on all environments using RLLib MADDPG, and run ./release_test.sh. The latter needed some updates as well.

All environments are working except simple_world_comm_v2, which seems to have a supersuit-related bug.

@jkterry1
Copy link
Member

jkterry1 commented Jul 8, 2021

Hey I haven't been tracking your conversation on discord, but you understand that our MPE environments are rather substantially fixed upband can't be directly compared to the ones in the original repo right?

@jkterry1
Copy link
Member

jkterry1 commented Jul 8, 2021

And does Ben have the needed information to look into the "supersuit related bug"?

@Rohan138
Copy link
Contributor Author

Rohan138 commented Jul 8, 2021

Yes, I understand-however, I think having the option of using continuous action spaces makes it much easier to use PettingZoo MPE with most papers using MPE, just because they've been using it as well. I've still kept the default actions to discrete, so the original behavior isn't affected.
For example, I used the RLLib MADDPG written by you, and got similar results to the original MPE + MADDPG, although not identical.

About simple_world_comm_v2 - I'm not sure what the bug is. If you run the script above with --env-name=simple_world_comm_v2, it crashes with different errors with SuperSuit 2.6.6 and 3.0.1, but it's about incorrect observation shapes for one of the agents with observation shape Discrete(34).

@benblack769
Copy link
Contributor

benblack769 commented Jul 8, 2021

So I enabled CI tests for your PR, and it looks like simple_world_comm_v2 is failing due to the observation space issue. You can run the respective test with

pytest ./test/pytest_runner.py::test_module[mpe/simple_world_comm_v2-pettingzoo.mpe.simple_world_comm_v2]

Let me know if you have any questions.

@benblack769
Copy link
Contributor

Note that supersuit is not in those tests, so it is not a supersuit issue.

@benblack769
Copy link
Contributor

Ok, so I am reviewing the PR, and it generally looks good, really good job figuring out the tests and documentation.

One issue is that the continuous action space for environments with both communication and movement (like simple_world_comm_v2) is not what it should be (this might be what is causing the test to fail).

In particular, if you have 5 movement options and 4 communication options, then in a continuous space, the action should be of size 9 (the actions are concatenated), and in a discrete space, it should be of size 20 (the actions are a cross product of the two subspaces). The continuous space should break the action down into the movement and communication components differently in the continuous and discrete cases here: https://github.com/PettingZoo-Team/PettingZoo/blob/master/pettingzoo/mpe/_mpe_utils/simple_env.py#L94

The documentation should reflect this.

Thanks again for working on this!

@benblack769
Copy link
Contributor

Ok, so the action space and action handling looks good now. I don't like the way that the rendering misrepresents the communication (it renders the argmax of the actions), but that is inherited from the original MPE, and I don't know if its worth fixing for now.

So I am happy with this PR. @jkterry1 Do you want to look over the documentation? I think its fine, but I have low standards, as you know.

@jkterry1
Copy link
Member

"I don't like the way that the rendering misrepresents the communication (it renders the argmax of the actions), but that is inherited from the original MPE, and I don't know if its worth fixing for now."

@Rohan138 would you be willing to work on this?

@jkterry1
Copy link
Member

@benblack769 the docs look fine..? what body am I supposed to be looking for under the rug here

@benblack769
Copy link
Contributor

@jkterry1 I don't see any dead bodies either, I'm just checking that you don't.

@Rohan138
Copy link
Contributor Author

Rohan138 commented Jul 11, 2021

I added something-just printing out the entire comm, rounded to 2dp - that's the best thing I could think of, although I agree that the argmax is unsatisfying. Perhaps a new PR?
I also added the mpe_maddpg script to tutorials.
Feel free to reject either of these two commits.
I hereby attest that there are no dead bodies in this PR.

@benblack769
Copy link
Contributor

Can you check the "Allow edits from maintainers" box? There are a couple minor things I want to change.
bisq-network/style#4

@Rohan138
Copy link
Contributor Author

It's already checked, are you unable to edit?

@jkterry1
Copy link
Member

jkterry1 commented Jul 12, 2021 via email

@Rohan138
Copy link
Contributor Author

Hi, any updates on this?

@jkterry1
Copy link
Member

I'll try to meet with Ben today and deal with this

@benblack769
Copy link
Contributor

Sorry for ignoring this. I just fiddled around a bit with the rendering, no real complains.

@benblack769
Copy link
Contributor

So I am happy with this being merged

@benblack769
Copy link
Contributor

@jkterry1

@jkterry1 jkterry1 merged commit 383c152 into Farama-Foundation:master Jul 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants