
[rllib] _get_torch_exploration_action doesn't support tuple action dist #10228

Closed
1 of 2 tasks
ThomasLecat opened this issue Aug 20, 2020 · 2 comments · Fixed by #10443
Labels: bug (Something that is supposed to be working; but isn't), P2 (Important issue, but not time-critical)

Comments

ThomasLecat (Contributor) commented Aug 20, 2020

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Mac OS 10.15.4
  • Ray installed from (source or binary): binary (via pip)
  • Ray version: 0.8.6 (nothing relevant seems to have changed on master)
  • Python version: 3.7

What is the problem?

When using tuple action distributions (as advised in #6372) with exploration disabled, the line:

logp = torch.zeros((action.size()[0], ), dtype=torch.float32)

from _get_torch_exploration_action raises the following exception:

AttributeError: 'tuple' object has no attribute 'size'
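The failure can be reproduced in isolation: a sample drawn from a Tuple action space arrives as a plain Python tuple, not a torch.Tensor, so it has no .size() method (the values below are illustrative):

```python
# A sample from Tuple([Discrete(2), Discrete(4)]) is a plain Python
# tuple of per-component actions, not a torch.Tensor.
action = (1, 3)  # illustrative sample; batch dimension omitted

try:
    action.size()  # what the torch.zeros((action.size()[0],), ...) call ends up doing
except AttributeError as exc:
    print(exc)  # 'tuple' object has no attribute 'size'
```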

A simple fix that supports any type of distribution would be:

logp = torch.zeros_like(action_dist.sampled_action_logp())

I can submit a PR if it helps.

Reproduction (REQUIRED)

Exact command to reproduce: python rllib_cartpole.py, with the script below saved as rllib_cartpole.py:

import gym.envs.classic_control
from gym.spaces import Tuple, Discrete

import ray
from ray import tune


class CustomCartpole(gym.envs.classic_control.CartPoleEnv):
    """Add a dimension to the cartpole action space that is ignored."""

    def __init__(self, env_config):
        super().__init__()
        # if override_actions is false this is just the Cartpole environment
        self.override_actions = env_config['override_actions']
        if self.override_actions:
            # 2 is the environment's normal action space
            # 4 is just a dummy number to give it an extra dimension
            self.original_action_space = self.action_space
            self.action_space = Tuple([Discrete(2), Discrete(4)])
            self.tuple_action_space = self.action_space

    def step(self, action):
        # call the cartpole environment with the original action
        if self.override_actions:
            self.action_space = self.original_action_space
            return super().step(action[0])
        else:
            return super().step(action)


def main():
    ray.init()
    tune.run(
        "PPO",
        stop={"episode_reward_mean": 50},
        config={
            "env": CustomCartpole,
            "env_config": {'override_actions': True},
            "num_gpus": 0,
            "num_workers": 1,
            "eager": False,
            "evaluation_interval": 1,
            "evaluation_config": {
                "explore": False,
            },
            "framework": "torch",
        },
    )


if __name__ == '__main__':
    main()
  • I have verified my script runs in a clean environment and reproduces the issue.
  • I have verified the issue also occurs with the latest wheels.
@ThomasLecat ThomasLecat added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Aug 20, 2020
@ericl ericl added rllib P2 Important issue, but not time-critical and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Aug 20, 2020
ericl (Contributor) commented Aug 20, 2020

The proposed fix makes sense to me. We could alternatively try to get the batch dimension of the tuple, but I don't see an existing helper method for that, so your proposal is probably simpler.

And yeah, a PR would be great!
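For reference, the alternative mentioned above (reading the batch dimension off the tuple itself) could look roughly like the sketch below. first_leaf is a hypothetical helper, not an existing RLlib utility, and plain Python lists stand in for torch tensors:

```python
def first_leaf(action):
    """Descend into nested tuples until a non-tuple leaf is reached."""
    while isinstance(action, tuple):
        action = action[0]
    return action

# A batch of 3 samples from Tuple([Discrete(2), Discrete(4)]):
# each component holds a length-3 batch (lists standing in for tensors).
tuple_action = ([1, 0, 1], [3, 2, 0])
print(len(first_leaf(tuple_action)))  # 3 -- the batch dimension
```

The zeros_like proposal avoids this traversal entirely, which is why it is the simpler route.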

ThomasLecat (Contributor, Author) commented
Thanks for your answer! I just got back from holidays and have opened PR #10443.
