Multi-agent vectorized environment support for gym wrapper. #4120
Comments
Hi, the ray-rllib link is broken.
@Hsgngr I updated the link!!
As of now I have zeroed in on this snippet of code, which raises an exception whenever the number of agents is greater than one: ml-agents/gym-unity/gym_unity/envs/__init__.py, lines 267 to 272, at commit 3c2fa4d.
Is there any reason why this is the norm? Is it just to make sure that the environments are compatible with standard RL libraries like OpenAI's Baselines and Dopamine?
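For context, a guard of roughly this shape is what produces the exception. This is a hedged sketch only: the exception class, function name, and message below are illustrative stand-ins, not the actual ml-agents wrapper code.

```python
# Illustrative sketch of the kind of single-agent guard discussed
# above. The exception class and message are stand-ins, not the
# actual ml-agents gym wrapper code.
class UnityGymException(Exception):
    """Stand-in for the wrapper's own exception type."""

def check_single_agent(n_agents: int) -> None:
    # Gym's step() contract returns one (obs, reward, done, info)
    # tuple, which is why gym wrappers typically reject scenes
    # reporting more than one agent.
    if n_agents > 1:
        raise UnityGymException(
            f"Expected one agent, but the scene reports {n_agents}."
        )
```

The underlying constraint is Gym's single-agent `step()` contract: one observation, one reward, one done flag per call, which is exactly what Baselines and Dopamine assume.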
Hi @xiaomaogy,
using ml-agents/ml-agents-envs/mlagents_envs/environment.py, lines 338 to 345, at commit 20527d1
The issue that I am facing now is that when the episode ends, I only get the observations from the agent that is responsible for termination. My use case, however, is quite different: I want the episode to be agent-dependent, so that even if some agent "dies", the rest of the agents continue, and the dead agent respawns somewhere else in the map. Is this achievable? And could you give me some pointers on possible pitfalls I should look out for? Thanks in advance!!
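The per-agent episode semantics described here (one agent "dies" and respawns while the others keep going) can be sketched with RLlib-style dict returns, where each agent gets its own done flag and a separate `"__all__"` key controls the shared episode. The toy environment below is an illustration of that pattern, not ml-agents code; all names and dynamics are made up.

```python
from typing import Dict, Tuple

class MultiAgentRespawnEnv:
    """Toy environment: agents move on a number line; falling below
    zero 'kills' and respawns that agent without ending the shared
    episode. Illustrative only, not an ml-agents API."""

    def __init__(self, agent_ids):
        # Every agent starts at position 5.
        self.positions = {aid: 5 for aid in agent_ids}

    def step(self, actions: Dict[str, int]) -> Tuple[dict, dict, dict]:
        obs, rewards, dones = {}, {}, {}
        for aid, act in actions.items():
            self.positions[aid] += act
            died = self.positions[aid] < 0
            if died:
                self.positions[aid] = 5   # respawn elsewhere on the map
            obs[aid] = self.positions[aid]
            rewards[aid] = -1.0 if died else 0.0
            dones[aid] = died             # per-agent terminal flag
        dones["__all__"] = False          # shared episode keeps running
        return obs, rewards, dones
```

With this shape, `dones[aid]` closes only that agent's trajectory, so a trainer can finish that agent's episode while the environment keeps stepping for everyone else.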
This snippet is actually responsible for sending the observations when some agent reaches its terminal condition: ml-agents/gym-unity/gym_unity/envs/__init__.py, lines 175 to 180, at commit 3c2fa4d.
In a multi-agent/vectorized setting, would it be okay to return the observations/rewards/dones considering both?
P.S.: One workaround that I can think of for my particular use case is to not call the
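One way to picture "returning the observations/rewards/dones for all agents" is to batch the per-agent results into stacked arrays, the way vectorized wrappers usually do. This is a sketch under that assumption, not the wrapper's actual return format.

```python
import numpy as np

def batch_step_results(per_agent):
    """Stack per-agent (obs, reward, done) tuples into batched arrays.

    per_agent: list of (obs, reward, done) tuples, one entry per agent.
    Returns (obs, rewards, dones) with a leading agent dimension.
    Illustrative helper, not part of the ml-agents gym wrapper.
    """
    obs = np.stack([o for o, _, _ in per_agent])
    rewards = np.array([r for _, r, _ in per_agent], dtype=np.float32)
    dones = np.array([d for _, _, d in per_agent], dtype=bool)
    return obs, rewards, dones
```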
A question similar to the original poster's: is there any reason why multi-agent isn't supported? The code changes to make the gym-like API support it aren't difficult, so I'm trying to figure out whether I'm missing something or whether this is purely a conceptual difficulty.
Since I need this for my own purposes, I've added my own wrapper (based on Unity's wrapper), which can be found here: https://github.com/laszukdawid/ai-traineree/blob/master/ai_traineree/tasks.py#L101 (or with the associated commit laszukdawid/ai-traineree@39dcf31). I'd appreciate any reply from Unity's team. I'm planning on adding more support for multi-agent use cases and wouldn't mind contributing a bit.
@laszukdawid can you provide a simple Colab showing how to use your wrapper? I am in the dark with the Python API, and the Gym wrapper's documentation is outdated.
For log continuity:
Are there any updates on this issue? It would be great to see support for Ray's RLlib in ML-Agents, particularly for multi-agent reinforcement learning.
Sorry, we are not currently supporting multi-agent vectorized environments in the gym wrapper.
Understood, and thanks for the update, @xcao65! |
Is your feature request related to a problem? Please describe.
I recently installed ml-agents to use for a research project. My use case is a multi-agent scenario involving coordination among the agents. The gym wrapper, however, currently only allows single agents to be used.
It would also be awesome to have a bunch of environments running in parallel, handled by the gym wrapper!
Describe the solution you'd like
Currently the low-level Python API's UnityEnvironment class provides access to multiple agents. Exposing this through the gym wrapper would be really helpful. One implementation of a multi-agent environment that I personally like is in ray-rllib (here).
For multiple environments, a vectorized approach could work like OpenAI's VecEnv (link).
As I am going to be developing workarounds for my project anyway, I would like to contribute towards this goal. For now, I am going to develop the solution to stay as close as possible to ray-rllib's implementation. Inputs and critiques are welcome!
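The VecEnv-style idea mentioned above can be sketched as a small wrapper that steps several independent sub-environments in lockstep and auto-resets any that finish. The class name, structure, and auto-reset policy below are illustrative assumptions, not OpenAI's actual VecEnv implementation.

```python
import numpy as np

class SimpleVecEnv:
    """Minimal VecEnv-style sketch: steps N sub-environments in
    lockstep and auto-resets finished ones. Illustrative only."""

    def __init__(self, env_fns):
        # env_fns: callables that each construct one sub-environment.
        self.envs = [fn() for fn in env_fns]

    def reset(self):
        # Stack per-env observations into one batched array.
        return np.stack([env.reset() for env in self.envs])

    def step(self, actions):
        obs, rewards, dones = [], [], []
        for env, act in zip(self.envs, actions):
            o, r, d = env.step(act)
            if d:
                o = env.reset()  # auto-reset a finished sub-env
            obs.append(o)
            rewards.append(r)
            dones.append(d)
        return np.stack(obs), np.array(rewards), np.array(dones)
```

A design note: returning the post-reset observation for finished sub-envs (as Baselines' VecEnv does) keeps the batch shape constant, which is what makes vectorized rollouts simple for the trainer.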