[Draft] PettingZoo Support #45
Conversation
This PR also adds basic CI.
That is something I was already thinking about!
@jkterry1 pytest is failing because of:
I guess you should add PIL as a PettingZoo dependency.
Thanks a ton! A few things: -The PIL issue will be fixed in the next PettingZoo release (in a few weeks); can you please just add it as a dependency for now so the tests function properly?
Done!
I made all traffic signal agents act synchronously every 'delta' SUMO time-steps. Then, if an agent can't change phase because 'min_green' seconds have not yet passed, it simply keeps the same phase regardless of the action. This solves the problem of agents popping in and out of existence, and thus addresses your comment about the expected behavior of rewards and observations.
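A minimal sketch of the rule described above. Note that names like `TrafficSignal.time_since_phase_change` and `apply_action` are illustrative stand-ins, not the actual sumo-rl attributes:

```python
class TrafficSignal:
    """Toy stand-in for a traffic signal agent (illustrative only)."""

    def __init__(self, min_green=5):
        self.min_green = min_green
        self.current_phase = 0
        # Seconds since the last phase change; starts at 0 ("just changed").
        self.time_since_phase_change = 0

    def apply_action(self, new_phase):
        # If the minimum green time has not elapsed, the action is ignored
        # and the current phase is kept, whatever the action was.
        if self.time_since_phase_change < self.min_green:
            return self.current_phase
        if new_phase != self.current_phase:
            self.current_phase = new_phase
            self.time_since_phase_change = 0
        return self.current_phase
```

With this rule, every agent acts on every synchronous step; an agent that is still inside its `min_green` window just sees its action have no effect.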
That sounds great, but just to confirm: what's the observation and reward for a light that can't change phase? I'll run a last round of tests in the morning.
The observation is whatever it is at the moment, and so is the reward. You can think of it as a state where all actions have the same effect. It does not affect learning.
I don't believe that giving the agent whatever reward it can get at the moment, when it was not able to act, is expected behavior. This is generally not how reward works in reinforcement learning.
Sorry if I was not clear. It is not giving whatever reward the agent can get at the moment; it receives the actual reward as a consequence of its action. An analogy is a grid world where the agent is facing the upper-right corner. If it moves UP or RIGHT, both actions have the same effect, and it receives the reward of the current cell.
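The corner analogy can be made concrete with a toy grid world (a hypothetical sketch; the grid size and step reward are assumptions, not from this PR):

```python
def grid_step(pos, action, size=4, step_reward=-1):
    """Toy grid world: moves that would leave the grid are clamped,
    so at a corner two actions collapse to the same outcome."""
    moves = {"UP": (0, 1), "RIGHT": (1, 0), "DOWN": (0, -1), "LEFT": (-1, 0)}
    dx, dy = moves[action]
    # Clamp the next position to stay inside the grid.
    x = min(max(pos[0] + dx, 0), size - 1)
    y = min(max(pos[1] + dy, 0), size - 1)
    return (x, y), step_reward

# At the upper-right corner (3, 3), UP and RIGHT are indistinguishable:
# same next state, same reward. The agent still "acted" and the reward
# is a genuine consequence of that action.
```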
Hey, so I just sat down to run tests, and I added the specific learning file that I'll be using with "sb3.py" in experiments. I had a few questions and issues that came up in the process: -I know you said that you'd document the different environment xml files you created eventually, but in the meantime, how many agents are in the 4x4 environment I'm using for testing, so I can properly space evaluations?
There are 16 agents in the 4x4 grid.
100k steps should be enough.
To use the RLlib API I'm using the RLlib wrapper; check the file experiments/a3c_4x4grid.py in this PR. But I agree, I will update the imports.
I already did; it now says "The main class SumoEnvironment behaves like a MultiAgentEnv from RLlib." I will improve the README when I add the new environments.
This is because the method .parallel_env() does not exist. Where should it come from?
@jkterry1 I also noticed that the RLlib PettingZooEnv wrapper does not work together with the OrderEnforcingWrapper:
So, on the parallel_env() issue: in PettingZoo there are modules with env() and parallel_env() functions, see https://github.com/PettingZoo-Team/PettingZoo/blob/master/test/example_envs/generated_agents_parallel_v0.py#L12. Environments don't spawn parallel environments; modules do. This isn't a requirement of the API, just a convention. If you want to turn an existing environment into a parallel environment, then you can just use
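As an illustration of that convention, here is a hypothetical, self-contained sketch of a module exposing both entry points. The class names and conversion logic are toy stand-ins, not PettingZoo's actual implementation:

```python
class ToyAECEnv:
    """Stand-in for an AEC environment: agents act one at a time."""

    def __init__(self):
        self.agents = ["ts_0", "ts_1"]
        self.last_actions = {}

    def step(self, agent, action):
        self.last_actions[agent] = action


class ToyParallelWrapper:
    """Stand-in converter: accepts one action per agent and steps
    the underlying AEC environment once for each of them."""

    def __init__(self, aec_env):
        self.aec_env = aec_env
        self.agents = aec_env.agents

    def step(self, actions):
        for agent in self.agents:
            self.aec_env.step(agent, actions[agent])
        return dict(self.aec_env.last_actions)


# Module-level factories: this is the convention being described.
def env():
    """Return the AEC (turn-based) environment."""
    return ToyAECEnv()


def parallel_env():
    """Return a parallel view built on top of env()."""
    return ToyParallelWrapper(env())
```

The point is that parallel_env() lives next to env() in the module, rather than being a method on the environment object itself.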
As for the RLlib issue, what version of RLlib are you using? That appears to be a fairly old version.
@LucasAlegre Could you please add the parallel extension that Ben described, along with the other changes?
Updating solved the issue, thanks!
@jkterry1 @benblack769
I believe we could merge this PR, as there are already many changes. Next I can:
What do you think?
If you want to merge now and open a new PR for future changes, that's fine with me. I'll reply tonight about the render and environment-duplication problems.
This is a nonworking draft adding support for the PettingZoo API for multi-agent RL. It will make the environment usable with many other multi-agent libraries (SB3 and similar via SuperSuit, the ALL, Tianshou, etc.); RLlib also has native PettingZoo support. Feel free to take a look and hopefully finish this in the next week.