[feature request] Support of gym.spaces.Tuple #100

RGring · 2018-11-27T08:54:41Z

Thanks for the nice and clean drl library!
Is the support of gym.spaces.Tuple coming in the near future? :) Would be helpful for more complex problems...

araffin · 2018-11-27T09:46:05Z

Hello,

> Thanks for the nice and clean drl library!

You're welcome =)

> Is the support of gym.spaces.Tuple coming in the near future?

It is not planned for now, but PRs are welcome ;) Do you have a concrete use case where a tuple space is needed?
Also, I don't know how easy it will be to integrate it.

RGring · 2018-11-27T10:05:01Z

I want to define an action space for a mobile robot that covers the translational and rotational velocity. They have different limits (low/high). It's not possible to put them in one gym.spaces.Box.

araffin · 2018-11-27T10:13:14Z

> They have different limits (low/high). It's not possible to put them in one gym.spaces.Box.

Are you sure? I would do something like (if my limits are [-1, 1] and [-5, 5] for instance):

spaces.Box(low=np.array([-1, -5]), high=np.array([1, 5]), dtype=np.float32)

EDIT: tuple spaces are normally useful when you mix for instance discrete and continuous spaces
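As an illustration of the point above (not part of the original reply): the `low` and `high` arrays are applied per dimension, so each component keeps its own range. The sketch below mimics that element-wise behaviour with plain NumPy instead of gym itself; the helper names `in_bounds` and `clip_action` are hypothetical and only reuse the example limits [-1, 1] and [-5, 5].

```python
import numpy as np

# Same example limits as in the Box call above:
# translational velocity in [-1, 1], rotational velocity in [-5, 5].
LOW = np.array([-1.0, -5.0], dtype=np.float32)
HIGH = np.array([1.0, 5.0], dtype=np.float32)


def in_bounds(action):
    """Return True if every component lies within its own [low, high] range."""
    action = np.asarray(action, dtype=np.float32)
    return bool(
        action.shape == LOW.shape
        and np.all(action >= LOW)
        and np.all(action <= HIGH)
    )


def clip_action(action):
    """Clip each component to its own limits (element-wise)."""
    return np.clip(np.asarray(action, dtype=np.float32), LOW, HIGH)
```

Here `in_bounds([0.5, -4.0])` holds because -4.0 is within the second dimension's range, even though it would be out of range for the first.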

hill-a · 2018-11-27T10:18:14Z

It should not be too hard to implement for the action_space if it is really important (only actor-critic models). You just need to imitate MultiCategoricalProbabilityDistribution (https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/distributions.py#L333)

The hard part might be the observation_space however.

RGring · 2018-11-27T10:29:04Z

@araffin okay, true - that solves the problem. Thanks a lot! And sorry for requesting the "wrong" issue.

VXU1230 · 2019-10-25T15:06:17Z

> It should not be too hard to implement for the action_space if it is really important (only actor-critic models). You just need to imitate MultiCategoricalProbabilityDistribution (https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/distributions.py#L333)
>
> The hard part might be the observation_space however.

@hill-a Is it possible to implement other models (DQN etc) with tuple action space?

denyHell · 2020-01-16T17:12:45Z

Hi,

How can one define an action space where an action consists of a categorical variable (e.g. 0 or 1) and a continuous variable (any real number in the interval (0,10))?

If I were to use spaces.Tuple in my action space definition, and have implemented a corresponding probability distribution (e.g. TupleProbabilityDistribution), what may go wrong if I would like to try a model, say PPO2, on my environment?

Thank you very much!

Miffyli · 2020-01-16T17:14:25Z

@denyHell

The issue lies in how observations are handled under the hood: they are concatenated into numpy arrays of the right shape, and thus they won't work when the observation is a Tuple/Dict. This would require a major rework all around the code, and it is currently planned for v3.1 (the next update after migrating to TF2).

denyHell · 2020-01-16T17:18:59Z

> @denyHell
>
> The issue lies in how observations are handled under the hood: They are being concatenated into numpy arrays of right shape and whatnot, and thus they won't work when observation is a Tuple/Dict. This would require major rework all around the code and it is currently planned for v3.1 (the next update after migrating to TF2).

Thank you for the quick reply! I am not using Tuple/Dict for observation space, but action space. Would that also cause a lot of issues?

Miffyli · 2020-01-16T17:20:32Z

Ah, sorry for the misunderstanding! The issue is still the same, though: actions are stacked into arrays of the right shape, and thus Tuple/Dict spaces won't work even if you have the necessary distributions available. Support for this too is planned for v3.1.

denyHell · 2020-01-16T17:22:54Z

For my first question: to design a customized environment, how can one define an observation/action space where an obs/action consists of categorical variables (e.g. 0 or 1) and continuous variables (e.g. any real number in the interval (0,10))?

Do you have any suggestion? Thanks a lot!

Miffyli · 2020-01-16T17:26:40Z

As mentioned, the correct way to do this (Tuple/Dict) is not supported in stable-baselines as of writing. You could try some trickery around this, e.g. by defining a single action space made of a bunch of continuous actions (Box), then slicing off the variables you want to be discrete (categorical) and thresholding them (i.e. if above 0.5, set to 1, otherwise 0). There are zero guarantees this will work, though. Other than that I do not have tips to give :/
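A minimal sketch of that workaround, assuming the environment exposes a flat Box action whose first entries stand in for the binary variables (for instance low=[0, 0], high=[1, 10] for one binary and one continuous variable). The function name `split_mixed_action` and its parameters are hypothetical; the environment's `step` would call something like it on the raw action. As noted above, nothing guarantees the agent learns well with this encoding.

```python
import numpy as np


def split_mixed_action(raw_action, n_discrete=1, threshold=0.5):
    """Split a flat continuous action into (binary part, continuous part).

    The first `n_discrete` entries are thresholded: strictly above
    `threshold` becomes 1, otherwise 0. The remaining entries are
    returned unchanged as the continuous part.
    """
    raw_action = np.asarray(raw_action, dtype=np.float32)
    discrete = (raw_action[:n_discrete] > threshold).astype(np.int64)
    continuous = raw_action[n_discrete:]
    return discrete, continuous
```

With one binary and one continuous variable, a raw action of `[0.7, 3.2]` would be interpreted as the category 1 together with the continuous value 3.2.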

BarisYazici · 2020-04-09T10:55:11Z

Any recommendation on how to combine a 3-dimensional image observation (RGB) with a 1-dimensional continuous observation? I was thinking about using Tuple, but just noticed that it's not supported. The warning from the check_env helper recommends flattening the Tuple by unpacking it. I can flatten the whole image observation, but then I don't know how it affects the optimizer when the dimensionality of the image observation is gone. Then I wouldn't be able to use CNN policies, I guess.

Miffyli · 2020-04-09T11:09:49Z

@BarisYazici

With a bit of trickery, you can do this. See this example. TL;DR: Put the 1D stuff on the (new) last channel of the RGB image, and extract accordingly in the network.
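The packing idea can be sketched with plain NumPy as follows. This is not the linked example, only an illustration under stated assumptions: the image is laid out as (H, W, C), everything is cast to float32 (a uint8 image would need its own scaling), and the 1D vector fits into H*W entries. The names `pack_observation` and `unpack_observation` are hypothetical; in practice the unpacking step would be done with the equivalent slicing inside the policy network.

```python
import numpy as np


def pack_observation(image, vector):
    """Append `vector` as an extra last channel of an (H, W, C) image.

    The vector fills the first entries of the new channel in row-major
    order; the rest of that channel is zero padding.
    """
    image = np.asarray(image, dtype=np.float32)
    vector = np.asarray(vector, dtype=np.float32).ravel()
    height, width, _ = image.shape
    if vector.size > height * width:
        raise ValueError("vector does not fit into one image channel")
    extra = np.zeros((height, width), dtype=np.float32)
    extra.flat[:vector.size] = vector
    return np.concatenate([image, extra[..., None]], axis=-1)


def unpack_observation(packed, vector_size):
    """Recover (image, vector) from an array built by pack_observation."""
    image = packed[..., :-1]
    vector = packed[..., -1].reshape(-1)[:vector_size]
    return image, vector
```

The packed array keeps its spatial layout, so a CNN policy can still consume it; only the channel count grows by one.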

GraderYuval · 2022-04-12T12:45:48Z

Hi @Miffyli,
Is there any update regarding support for mixed continuous and discrete action spaces (e.g. support for gym spaces such as Tuple and Dict)?

araffin · 2022-04-12T12:47:50Z

> Hi @Miffyli, Is there any update regarding the support of continuous and discrete action space (e.g support of gym spaces such as tuple, dict)?

See DLR-RM/stable-baselines3#731 and DLR-RM/stable-baselines3#527

araffin added the enhancement New feature or request label Nov 27, 2018

araffin closed this as completed Nov 27, 2018

araffin mentioned this issue Dec 1, 2018

Tuple action space with stable baselines PPO2 [question] #107

Closed

Miffyli mentioned this issue Jul 15, 2020

Feature request for adding Dict type observation space #947

Closed

Miffyli mentioned this issue Nov 23, 2020

Support for input space of Dict format #1045

Closed

araffin mentioned this issue May 5, 2023

[Question] Does it make sense to make small-sized discrete actions continious? DLR-RM/stable-baselines3#1482

Closed

nrigol mentioned this issue Jun 13, 2023

[Question] Observation space must be a 1D vector or it can be a Dict, Tuple ? DLR-RM/stable-baselines3#1549

Closed
