[Question] action_space as gym.spaces.Dict is supported? Is it going to be supported? #731

Closed
2 tasks done
EloyAnguiano opened this issue Jan 19, 2022 · 4 comments
Labels
duplicate This issue or pull request already exists question Further information is requested

Comments

@EloyAnguiano

Question

I have seen that the algorithms are gradually being adapted to accept dictionary-type observations, as specified here, but I have not seen any reference to algorithms that support the same approach for the action space. Is it possible to do this with SB3, or do you plan to work in this direction?

Additional context

I am assessing whether SB3 is a good option for the grid2op environments of the L2RPN competition. These environments can be converted to a Gym environment, but both the action space and the observation space are of dictionary type. More information can be found here.

Checklist

  • I have read the documentation (required)
  • I have checked that there is no similar issue in the repo (required)
@EloyAnguiano EloyAnguiano added the question Further information is requested label Jan 19, 2022
@araffin araffin added the duplicate This issue or pull request already exists label Jan 19, 2022
@araffin
Member

araffin commented Jan 19, 2022

Hello,

> Is it possible to do this with SB3 or do you plan to address work in this direction?

It is not planned for now but you can have a look at projects that implement it.

Duplicate of #527

You can probably take a look at https://github.com/minerllabs/basalt_competition_baseline_submissions/blob/master/basalt_utils/src/basalt_utils/sb3_compat/distributions.py#L9

@araffin araffin closed this as completed Jan 19, 2022
@edbeeching

Hello Antonin.

A number of environments implemented in Godot RL Agents include independent discrete and continuous action spaces. These are represented as a Gym Dict space with nested Box and Discrete spaces.

I was curious if you have any ideas about how to handle this use case?

Thanks

@araffin
Member

araffin commented Mar 10, 2022

> I was curious if you have any ideas about how to handle this use case?

As I wrote, you can take a look at https://github.com/minerllabs/basalt_competition_baseline_submissions/blob/master/basalt_utils/src/basalt_utils/sb3_compat/distributions.py#L9 to combine different distributions.
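A minimal sketch of what combining distributions can look like, assuming the sub-actions are sampled independently given the observation, so that the joint log-probability (and entropy) is the sum over the dictionary keys. This is not the code from the linked file and not an SB3 API: `Categorical`, `DiagGaussian`, and `DictDistribution` are hypothetical names, written with the standard library only.

```python
import math
import random


class Categorical:
    """Discrete distribution defined by unnormalised logits."""

    def __init__(self, logits):
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        self.probs = [e / total for e in exps]

    def sample(self, rng):
        return rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def log_prob(self, action):
        return math.log(self.probs[action])

    def entropy(self):
        return -sum(p * math.log(p) for p in self.probs if p > 0.0)


class DiagGaussian:
    """Gaussian with independent dimensions (diagonal covariance)."""

    def __init__(self, mean, std):
        self.mean, self.std = list(mean), list(std)

    def sample(self, rng):
        return [rng.gauss(m, s) for m, s in zip(self.mean, self.std)]

    def log_prob(self, action):
        return sum(
            -0.5 * ((a - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
            for a, m, s in zip(action, self.mean, self.std)
        )

    def entropy(self):
        return sum(0.5 * math.log(2 * math.pi * math.e) + math.log(s) for s in self.std)


class DictDistribution:
    """Joint distribution over a dict of sub-actions, assumed independent."""

    def __init__(self, parts):
        self.parts = parts  # maps action key -> sub-distribution

    def sample(self, rng):
        return {key: dist.sample(rng) for key, dist in self.parts.items()}

    def log_prob(self, action):
        # Independence assumption: log p(a) = sum over keys of log p_k(a_k).
        return sum(dist.log_prob(action[key]) for key, dist in self.parts.items())

    def entropy(self):
        return sum(dist.entropy() for dist in self.parts.values())


# Usage: one discrete "button" and one continuous "stick" sub-action (hypothetical keys).
dist = DictDistribution({
    "button": Categorical([0.0, 0.0]),
    "stick": DiagGaussian([0.0], [1.0]),
})
action = dist.sample(random.Random(0))
joint_log_prob = dist.log_prob(action)
```

In a policy-gradient setting, the summed log-probability would take the place of the single-distribution log-probability in the loss; how that is wired into a given library depends on that library's policy classes.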

If you have only discrete and continuous, you could also hack it quickly by transforming the discrete actions to continuous ones (or the other way around, discretize the continuous actions).
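A rough sketch of that quick hack, under stated assumptions: the agent outputs a flat continuous vector in [-1, 1], the first entry is binned into a discrete choice, and the remaining entries are rescaled to the continuous bounds. The keys `"choice"` and `"control"`, the sizes, and the bounds are made up for illustration. In a Gym setup, this mapping would typically live in the `action` method of a `gym.ActionWrapper` subclass.

```python
# Hypothetical layout of the original Dict action: one Discrete entry, one Box entry.
N_CHOICES = 3            # size of the discrete sub-action (assumed)
BOX_LOW = [0.0, -2.0]    # lower bounds of the continuous sub-action (assumed)
BOX_HIGH = [1.0, 2.0]    # upper bounds of the continuous sub-action (assumed)


def to_dict_action(flat):
    """Map a flat vector in [-1, 1] to the dict action the environment expects."""
    clipped = [min(1.0, max(-1.0, x)) for x in flat]
    # Discrete part: split [-1, 1] into N_CHOICES equal-width bins.
    index = min(N_CHOICES - 1, int((clipped[0] + 1.0) / 2.0 * N_CHOICES))
    # Continuous part: rescale each entry from [-1, 1] to [low, high].
    control = [
        low + (x + 1.0) / 2.0 * (high - low)
        for x, low, high in zip(clipped[1:], BOX_LOW, BOX_HIGH)
    ]
    return {"choice": index, "control": control}


def bin_to_value(index, low, high, n_bins):
    """The other direction: map a discrete bin index to the centre of its interval."""
    return low + (index + 0.5) * (high - low) / n_bins


example = to_dict_action([0.0, 0.0, 0.5])
```

The trade-off is that binning makes the discrete choice piecewise constant in the agent's output, and discretizing a continuous action limits its resolution to the number of bins.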

@EloyAnguiano
Author

I think this should be taken into account at some point. Sometimes you have to choose an entity (Discrete) and then an action to perform on it (Binary, Discrete, Continuous, ...). The clearest example is OpenAI's agent for Dota 2. Of course, you can flatten the combinations of some of those actions into a MultiDiscrete action space, but I now have an environment where enumerating the combinations yields a Discrete space of 145k actions instead of a Dict action space. RLlib has this implemented and, as you said, there are some implementations available. Is there any strong reason for this not being on the roadmap?

I think that the library could be much more versatile with this.
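To make the size argument concrete, here is a small arithmetic sketch. The sub-action names and sizes are invented for illustration and are not those of the environment mentioned above: enumerating every combination multiplies the sizes, while a factored (Dict or MultiDiscrete) representation only needs one output per option of each sub-action.

```python
import math

# Hypothetical sub-action sizes, for illustration only.
sub_action_sizes = {"entity": 50, "action_type": 10, "target": 30}

# One Discrete space enumerating every combination: sizes multiply.
flat_size = math.prod(sub_action_sizes.values())       # 50 * 10 * 30 = 15000

# Factored representation: one policy head per sub-action, sizes add.
factored_outputs = sum(sub_action_sizes.values())      # 50 + 10 + 30 = 90
```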
