[rllib] Custom environment observations always have the dtype float32 #7946

Closed
internetcoffeephone opened this issue Apr 9, 2020 · 8 comments
Labels
bug (Something that is supposed to be working; but isn't), rllib (RLlib related issues)

Comments

@internetcoffeephone
Contributor

ray[rllib] 0.8.3
Python 3.6
tensorflow-gpu 2.1.0

What is the problem?

When a custom environment declares an observation space with a dtype other than float32, e.g. uint8, the observations are converted to float32. Models that expect the original dtype then fail.

This happens due to a hardcoded float32 in dynamic_tf_policy.py.

When using a Dict observation space (a set of simpler gym spaces), it also happens in preprocessors.py.
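To illustrate the effect described above without depending on RLlib, here is a minimal NumPy-only sketch. The function `flatten_to_float32` is hypothetical; it only assumes that a Dict observation is flattened into a single float32 vector, which is the behaviour this issue reports, and it is not RLlib's actual code.

```python
import numpy as np

# Hypothetical observation matching the uint8 Box used in the reproduction script.
obs = {"curr_obs": np.array([[255]], dtype=np.uint8)}


def flatten_to_float32(obs_dict):
    """Assumed behaviour: cast every entry to float32 and concatenate into one vector."""
    parts = [np.asarray(v, dtype=np.float32).ravel() for v in obs_dict.values()]
    return np.concatenate(parts)


flat = flatten_to_float32(obs)

# The original entry keeps its declared dtype and takes 1 byte per element...
print(obs["curr_obs"].dtype, obs["curr_obs"].nbytes)
# ...while the flattened copy is float32 and takes 4 bytes per element.
print(flat.dtype, flat.nbytes)
```

A model that checks for uint8 input would reject `flat`, and image-like observations would take four times the memory.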

Reproduction

import numpy as np
import ray
from gym.spaces import Box, Dict, Discrete
from ray import tune
from ray.rllib.env import BaseEnv
from ray.rllib.utils import try_import_tf
from ray.tune.registry import register_env

tf = try_import_tf()


class TestEnv(BaseEnv):
    def poll(self):
        return {}, [], [], [], []

    def send_actions(self, action_dict):
        pass

    observation_space = Dict({"curr_obs": Box(low=0, high=255, shape=(1, 1), dtype=np.uint8)})
    action_space = Discrete(1)


def env_creator(_):
    return TestEnv()


env_name = "test_env"
register_env(env_name, env_creator)

ray.init(local_mode=True)
tune.run("PPO", config={"env": env_name})

Note: You will have to set breakpoints on the affected lines yourself. I wanted to provide a simple custom model that accepts only uint8 input, but couldn't get it to work.
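As an alternative to breakpoints, a dtype assertion could make the cast visible. This is only a sketch: `check_obs_dtype` is a hypothetical helper, not part of RLlib, and where it would be called (for example at the start of a custom model's forward pass) is an assumption.

```python
import numpy as np


def check_obs_dtype(obs, expected_dtype):
    """Hypothetical helper: raise if an observation's dtype differs from the expected one."""
    actual = np.asarray(obs).dtype
    if actual != np.dtype(expected_dtype):
        raise TypeError(f"expected {np.dtype(expected_dtype)}, got {actual}")
    return True


# Passes: the observation still has the dtype declared by the space.
check_obs_dtype(np.zeros((1, 1), dtype=np.uint8), np.uint8)

# Raises TypeError: the observation was cast to float32 somewhere upstream.
try:
    check_obs_dtype(np.zeros((1, 1), dtype=np.float32), np.uint8)
except TypeError as err:
    print(err)
```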

@internetcoffeephone internetcoffeephone added the bug Something that is supposed to be working; but isn't label Apr 9, 2020
internetcoffeephone added a commit to internetcoffeephone/sequential_social_dilemma_games that referenced this issue Apr 10, 2020
… for both observations and actions. This saves memory.

Fix bug where actions of other environments are always included.
NB: This version requires a ray bug to be patched: ray-project/ray#7946
internetcoffeephone added a commit to internetcoffeephone/sequential_social_dilemma_games that referenced this issue Apr 14, 2020
@internetcoffeephone
Contributor Author

@stale

stale bot commented Nov 12, 2020

Hi, I'm a bot from the Ray team :)

To help human contributors to focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months.

If there is no further activity in the next 14 days, the issue will be closed!

  • If you'd like to keep the issue open, just leave any comment, and the stale label will be removed!
  • If you'd like to get more attention to the issue, please tag one of Ray's contributors.

You can always ask for help on our discussion forum or Ray's public slack channel.

@stale stale bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Nov 12, 2020
@internetcoffeephone
Contributor Author

Comment to remove stale label.

@stale stale bot removed the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Nov 13, 2020
@stale

stale bot commented Mar 13, 2021

@stale stale bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Mar 13, 2021
@internetcoffeephone
Contributor Author

This comment is to remove the stale label.

@stale stale bot removed the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Mar 13, 2021
@rkooo567 rkooo567 added the rllib RLlib related issues label Jul 2, 2021
@stale

stale bot commented Oct 30, 2021

@stale stale bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Oct 30, 2021
@internetcoffeephone
Contributor Author

This comment is to remove the stale label.

@stale stale bot removed the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Oct 31, 2021
@avnishn
Member

avnishn commented Feb 7, 2022

closing because after running the repro script, it looks like we chased this bug away.

@avnishn avnishn closed this as completed Feb 7, 2022