[Proposal] Add a flag to MultiBinary to allow one-hot encoding #508
Comments
Hey, I think this is a good idea. |
Hey @pseudo-rnd-thoughts , absolutely. How do we proceed from here? |
I'm not sure about this tbh. If we introduce this, we'd also have to adjust the The code in both
    def sample(self):
        if self.onehot:
            ...
        else:
            ...
at which point the functionality should probably just be split into two classes. So then we can consider whether we want to create a new |
I am not sure I completely agree that this would convey the same information of I agree you can always go from one to the other, it is just a matter of how common one practice is with respect to others. The
    def sample(self):
        # Nested helpers close over self, so they take no arguments.
        def sample_onehot():
            # code for sampling with onehot
            return sample
        def sample_simple():
            # old code
            return sample
        if self.onehot:
            return sample_onehot()
        else:
            return sample_simple()
The same structure would be used for contains. Maybe I could post here my |
These two statements are in contradiction. To be clear with what I mean, you have a direct correspondence between
To clarify my other point about the Overall my argument against adding this goes roughly as follows:
I might be open to a different approach for adding this functionality, but I don't know how important it even is for Gymnasium to handle. Converting between integers and one-hot arrays is fairly trivial. If you want to use one-hot in your code, you can call It miiiiiiiight make sense to have back-and-forth conversion as a method on
    space = Discrete(5)
    space.to_onehot(space.sample())  # e.g. [0, 1, 0, 0, 0]
    space.from_onehot([0, 1, 0, 0, 0])  # 1
but I'm not sure how useful it really is since DL/RL libraries usually have their own utilities to do that (and it's a one-liner anyways, unless we want to optimize for high dimensionality) |
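For reference, the back-and-forth conversion discussed above can be sketched in a few lines. This is only an illustration on plain Python lists: `to_onehot` and `from_onehot` here are free functions written for this example, not existing Gymnasium methods, and the sketch assumes a zero-based `Discrete(n)` (no `start` offset).

```python
def to_onehot(index, n):
    """Return a length-n list with a single 1 at position `index`."""
    if not 0 <= index < n:
        raise ValueError(f"index {index} out of range for n={n}")
    return [1 if i == index else 0 for i in range(n)]


def from_onehot(vector):
    """Return the position of the single 1 in a one-hot list."""
    # A valid one-hot vector sorts to all zeros followed by exactly one 1.
    if sorted(vector) != [0] * (len(vector) - 1) + [1]:
        raise ValueError("expected exactly one 1 and zeros elsewhere")
    return vector.index(1)


print(to_onehot(1, 5))               # [0, 1, 0, 0, 0]
print(from_onehot([0, 1, 0, 0, 0]))  # 1
```

The round trip `from_onehot(to_onehot(i, n)) == i` holds for every valid `i`, which is the "direct correspondence" between the two representations.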
Thanks for the very precise answer. I clearly am a novice to these topics, hence my appreciation for such a detailed answer. I agree on most of the things you said. I am still convinced that, for good design purposes in custom environments, having a Ultimately, I am convinced that having a OneHot already available would make custom-env development easier, but I am more than happy to accept your feedback on this |
|
Very similar to MultiBinary ;) |
This conversion reminds me of #102 which pointed out that To me, this is part of the legacy / technical debt that Gym / Gymnasium has and is difficult to change / fix due to its size and aim of backward compatibility. |
I am not sure I completely understand this... Can you please point out which conversion you are referring to? So why not add a |
Hey there, any update? |
Hey, read #102 for the whole discussion. In short, Gymnasium allows users to flatten a space which for One of the proposals in #102 was to introduce a Thank you for opening the issue. I am going to close it, but we can continue the discussion here, and if a particularly strong argument comes up that we haven't considered yet, we can reopen it. |
Hi, I have a fairly simple motivating example and ran across this and #102. I'm trying to write a new gym env (hopefully written The Right Way from the start), and I'd like to run stable-baselines3 on it. stable-baselines3 does not support The advice in #102 to sample before flattening does not seem to apply here as the sampling isn't on my end. The only fix I can think of would be to do some truly horrible one-hotifying in the FlattenAction wrapper before the invalid values can be passed to But overall the fundamental problem seems to me to be that there does not exist a space that is both 1. flat 2. sample() behaves like a one-hot. Discrete violates 1, MultiBinary violates 2. Without such a space, there doesn't appear to be a 'clean' env setup for both random sampling and actually feeding values to/from a NN. To me, this would make the addition of the Discrete-lookalike |
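One way to read the "one-hotifying" workaround mentioned above is an argmax projection: take whatever flat vector arrives and keep only its largest entry. The sketch below is a guess at that idea on plain lists; `project_to_onehot` is a hypothetical helper, not part of Gymnasium or stable-baselines3, and ties are resolved in favour of the first maximal entry.

```python
def project_to_onehot(values):
    """Map an arbitrary flat vector to a one-hot list via argmax."""
    if not values:
        raise ValueError("cannot project an empty vector")
    # max() returns the first index holding the largest value, so ties go left.
    winner = max(range(len(values)), key=lambda i: values[i])
    return [1 if i == winner else 0 for i in range(len(values))]


print(project_to_onehot([1, 0, 1, 1]))      # [1, 0, 0, 0]
print(project_to_onehot([0.2, 0.9, 0.1]))   # [0, 1, 0]
```

Note that an all-zero input still maps to the first option, so this silently biases invalid samples toward index 0, which is part of why it feels like a hack rather than a clean fix.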
Hi, thanks for your comment. In short, yes, I agree with your pain. However, I still believe this is not generally possible due to the current implementation of |
Proposal
Currently, sampling from spaces.MultiBinary produces an array in which more than one element can be 1. When dealing with flat arrays (that is, arrays of shape (n,)) this might not be problematic. However, for matrices and tensors, MultiBinary is often used to produce a one-hot-like representation of either observations or actions. I am proposing we support this by adding a flag onehot: bool = False to MultiBinary.__init__(...) so that, when sampling, exactly one bit per row is set to 1.
Motivation
No existing space can represent one-hot-encoded observations/actions.
Pitch
Bring one-hot encoding to gymnasium.spaces.
Alternatives
I have considered implementing my own custom space but thought this might be interesting for the community too.
Additional context
Which clearly is non-desirable when dealing with one-hot encodings as, at an action level, this sample represents choosing 3 options for the choice represented in the first row, for instance.
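To make the difference concrete, here is a minimal sketch contrasting the two sampling behaviours on plain nested lists. It assumes each bit of a MultiBinary-style sample is drawn independently; the function names are made up for this illustration and are not Gymnasium API.

```python
import random


def sample_multibinary(rows, cols, rng):
    """Draw every bit independently, so a row may hold any number of 1s."""
    return [[rng.randint(0, 1) for _ in range(cols)] for _ in range(rows)]


def sample_onehot_rows(rows, cols, rng):
    """Draw one position per row and set only that bit to 1."""
    sample = []
    for _ in range(rows):
        hot = rng.randrange(cols)
        sample.append([1 if c == hot else 0 for c in range(cols)])
    return sample


def is_onehot_rows(sample):
    """Membership check: every row is binary and sums to exactly 1."""
    return all(set(row) <= {0, 1} and sum(row) == 1 for row in sample)


rng = random.Random(0)
print(sample_multibinary(3, 4, rng))   # rows may contain several 1s
onehot = sample_onehot_rows(3, 4, rng)
print(onehot, is_onehot_rows(onehot))  # every row has exactly one 1
```

A `contains` check for the proposed flag would need the same per-row test as `is_onehot_rows`, in addition to the usual shape and dtype checks.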
Checklist