
[Question] Action Space showing floats when dtype=int #3107

Closed
windowshopr opened this issue Oct 1, 2022 · 6 comments


windowshopr commented Oct 1, 2022

Question

This may not be a bug, and I didn't want to submit it as one in case it really isn't, but when I set an action space as a Box with dtype=int or dtype=np.int32, the actions show up as floats the first time the environment is run. Any idea why?

I've provided a minimal reproducible example below:

```python
from random import randint
from numpy import inf, float32, array, int32, int64
import gym
from stable_baselines3 import A2C, DQN, PPO

"""Class of environment"""
class Custom_Environment(gym.Env):

    metadata = {'render.modes': ['human', 'text']}

    """Initialize the environment"""
    def __init__(self):
        super(Custom_Environment, self).__init__()
        # Spaces
        self.action_space = gym.spaces.Box(low=0, high=1000, shape=(37,), dtype=int)
        self.observation_space = gym.spaces.Box(low=0, high=1000, shape=(37,), dtype=int)

    """Reset the Environment"""
    def reset(self):
        self.done = False
        self.current_state = self.observation_space.sample()
        return self.current_state

    """Step Through the Environment"""
    def step(self, action):
        # Inspect the action; note the values are not integers
        for i in range(len(action)):
            print(action[i])
        # Throw an error here to stop the code
        stop()
        step_reward = 0
        return self.current_state, step_reward, self.done, {}

env = Custom_Environment()

model = PPO('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=10000)
```

Output:

```
Using cpu device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.
0.0
0.0
0.0
0.6412628293037415
1.6904927492141724
1.3439135551452637
0.0
1.1811418533325195
0.0
1.066895842552185
0.0
0.0
0.0
0.6355175375938416
0.0
0.0
0.1874774545431137
0.4175698757171631
0.0
0.8219074010848999
0.0
0.0
0.0
0.0
0.0
0.0
0.10093195736408234
0.7056602835655212
0.0
0.2542327344417572
0.17636002600193024
0.0
1.3428149223327637
0.25007113814353943
0.0
0.0
0.6961379051208496
```

The code stops at my stop() call, which intentionally doesn't exist, but you can see the floats printed in the output above. Any ideas?

windowshopr (Author) commented

Someone here has asked the same question, but there's no answer

balisujohn (Contributor) commented Oct 2, 2022

This looks like a bug to me, I'll take a look.

Edit: this is a bug in stable-baselines3. The action is produced by an SB3 policy, and that policy is ignoring the fact that the action space is a Box with an integer dtype. To look into this, we would need to know the versions of gym and stable-baselines3 you're using.
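For what it's worth, a quick sanity check (a sketch, assuming gym is installed; no SB3 involved) confirms the space itself samples integers, which points at the policy output rather than at gym:

```python
import numpy as np
import gym

# A Box declared with an integer dtype samples integer values directly,
# so the floats must be introduced by the SB3 policy, not by gym itself.
space = gym.spaces.Box(low=0, high=1000, shape=(37,), dtype=int)
sample = space.sample()
assert np.issubdtype(sample.dtype, np.integer)
print(sample.dtype)
```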

windowshopr (Author) commented

```
gym==0.21.0
stable_baselines3==1.6.1
```

It's worth noting that I am running this in a Google Colab notebook...

windowshopr (Author) commented

I also just tried upgrading gym in the notebook to 0.26.1, same issue.

balisujohn (Contributor) commented Oct 2, 2022

So a Box action space with integer values is equivalent, in some sense, to a MultiDiscrete action space, which stable-baselines3 does currently support. From looking at their code, I don't think stable-baselines3 supports integer-valued Box action spaces, so if you really need integer-valued Box support and a MultiDiscrete space won't work, I'd recommend making a feature request on stable-baselines3.

To be clear, I think you will most likely be able to get the desired behavior with a MultiDiscrete action space.
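As a stopgap, one could also coerce the float actions to integers inside `step()` until integer Box support exists. This is only a sketch with NumPy, and the `to_int_action` helper is hypothetical, not part of gym or SB3:

```python
import numpy as np

def to_int_action(action, low=0, high=1000):
    # Hypothetical helper: round the policy's float action to the nearest
    # integer and clip it into the Box bounds before using it in step().
    return np.clip(np.rint(action), low, high).astype(np.int64)

print(to_int_action(np.array([0.6412628, 1.6904927, 999.7])))
```

Note this changes the effective action distribution slightly (many floats map to the same integer), so a MultiDiscrete space is still the cleaner fix.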

windowshopr (Author) commented

This did fix the issue. Changing the action space to:

```python
self.action_space = gym.spaces.MultiDiscrete([1000 for _ in range(37)], dtype=int)
```

...worked in turning the actions into ints, so I can go ahead with that, but maybe the Box int thing could be figured out in the future. Thanks!
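One detail worth double-checking (a sketch, assuming gym is installed): `MultiDiscrete([1000, ...])` samples integers in {0, ..., 999}, since each entry of `nvec` is an exclusive upper bound, whereas `Box(low=0, high=1000)` has an inclusive `high`. Use `[1001 for _ in range(37)]` if 1000 itself must remain a valid action:

```python
import numpy as np
import gym

# MultiDiscrete's nvec entries are exclusive upper bounds: each component
# is sampled from {0, ..., 999}, one less than Box(low=0, high=1000).
space = gym.spaces.MultiDiscrete([1000 for _ in range(37)])
sample = space.sample()
assert np.issubdtype(sample.dtype, np.integer)
print(sample.max() <= 999)
```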
