
Error running example "self_play_train.py" #10

Closed
kinalmehta opened this issue Dec 16, 2021 · 7 comments

Comments

@kinalmehta

I'm getting the following error when I try to run the given example code:

(marl) ➜  rllib git:(main) ✗ python self_play_train.py 
2021-12-16 20:46:53,589	INFO services.py:1338 -- View the Ray dashboard at http://127.0.0.1:8265
2021-12-16 20:46:54,490	INFO trainer.py:722 -- Your framework setting is 'tf', meaning you are using static-graph mode. Set framework='tf2' to enable eager execution with tf2.x. You may also want to then set `eager_tracing=True` in order to reach similar execution speed as with static-graph mode.
2021-12-16 20:46:54,491	WARNING ppo.py:143 -- `train_batch_size` (200) cannot be achieved with your other settings (num_workers=1 num_envs_per_worker=1 rollout_fragment_length=30)! Auto-adjusting `rollout_fragment_length` to 200.
2021-12-16 20:46:54,491	INFO ppo.py:166 -- In multi-agent mode, policies will be optimized sequentially by the multi-GPU optimizer. Consider setting simple_optimizer=True if this doesn't work for you.
Traceback (most recent call last):
  File "/home/kinal/Desktop/marl/meltingpot/examples/rllib/self_play_train.py", line 95, in <module>
    main()
  File "/home/kinal/Desktop/marl/meltingpot/examples/rllib/self_play_train.py", line 88, in main
    trainer = get_trainer_class(agent_algorithm)(env="meltingpot", config=config)
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 102, in __init__
    Trainer.__init__(self, config, env, logger_creator,
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 661, in __init__
    super().__init__(config, logger_creator, remote_checkpoint_dir,
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/tune/trainable.py", line 121, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 113, in setup
    super().setup(config)
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 764, in setup
    self._init(self.config, self.env_creator)
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 136, in _init
    self.workers = self._make_workers(
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 1727, in _make_workers
    return WorkerSet(
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 87, in __init__
    remote_spaces = ray.get(self.remote_workers(
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/worker.py", line 1715, in get
    raise value
ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=69363, ip=10.2.40.108)
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 587, in __init__
    self._build_policy_map(
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1543, in _build_policy_map
    preprocessor = ModelCatalog.get_preprocessor_for_space(
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/models/catalog.py", line 703, in get_preprocessor_for_space
    prep = cls(observation_space, options)
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/models/preprocessors.py", line 40, in __init__
    self.shape = self._init_shape(obs_space, self._options)
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/models/preprocessors.py", line 265, in _init_shape
    preprocessor = preprocessor_class(space, self._options)
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/models/preprocessors.py", line 43, in __init__
    self._obs_for_type_matching = self._obs_space.sample()
  File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/gym/spaces/box.py", line 132, in sample
    sample[bounded] = self.np_random.uniform(
  File "mtrand.pyx", line 1130, in numpy.random.mtrand.RandomState.uniform
OverflowError: Range exceeds valid bounds
(RolloutWorker pid=69363) 2021-12-16 20:47:04,439	INFO rollout_worker.py:1705 -- Validating sub-env at vector index=0 ... (ok)
(RolloutWorker pid=69363) 2021-12-16 20:47:04,463	DEBUG rollout_worker.py:1534 -- Creating policy for av
(RolloutWorker pid=69363) 2021-12-16 20:47:04,470	DEBUG preprocessors.py:262 -- Creating sub-preprocessor for Box([-2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648
(RolloutWorker pid=69363)  -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648
(RolloutWorker pid=69363)  -2147483648 -2147483648 -2147483648 -2147483648], [2147483647 2147483647 2147483647 2147483647 2147483647 2147483647
(RolloutWorker pid=69363)  2147483647 2147483647 2147483647 2147483647 2147483647 2147483647
(RolloutWorker pid=69363)  2147483647 2147483647 2147483647 2147483647], (16,), int32)
(RolloutWorker pid=69363) 2021-12-16 20:47:04,471	DEBUG preprocessors.py:262 -- Creating sub-preprocessor for Box([-2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648
(RolloutWorker pid=69363)  -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648
(RolloutWorker pid=69363)  -2147483648 -2147483648 -2147483648 -2147483648], [2147483647 2147483647 2147483647 2147483647 2147483647 2147483647
(RolloutWorker pid=69363)  2147483647 2147483647 2147483647 2147483647 2147483647 2147483647
(RolloutWorker pid=69363)  2147483647 2147483647 2147483647 2147483647], (16,), int32)
(RolloutWorker pid=69363) 2021-12-16 20:47:04,471	DEBUG preprocessors.py:262 -- Creating sub-preprocessor for Box(-1.7976931348623157e+308, 1.7976931348623157e+308, (), float64)
(RolloutWorker pid=69363) /home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/gym/spaces/box.py:132: RuntimeWarning: overflow encountered in subtract
(RolloutWorker pid=69363)   sample[bounded] = self.np_random.uniform(
(RolloutWorker pid=69363) 2021-12-16 20:47:04,472	ERROR worker.py:431 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=69363, ip=10.2.40.108)
(RolloutWorker pid=69363)   File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 587, in __init__
(RolloutWorker pid=69363)     self._build_policy_map(
(RolloutWorker pid=69363)   File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1543, in _build_policy_map
(RolloutWorker pid=69363)     preprocessor = ModelCatalog.get_preprocessor_for_space(
(RolloutWorker pid=69363)   File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/models/catalog.py", line 703, in get_preprocessor_for_space
(RolloutWorker pid=69363)     prep = cls(observation_space, options)
(RolloutWorker pid=69363)   File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/models/preprocessors.py", line 40, in __init__
(RolloutWorker pid=69363)     self.shape = self._init_shape(obs_space, self._options)
(RolloutWorker pid=69363)   File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/models/preprocessors.py", line 265, in _init_shape
(RolloutWorker pid=69363)     preprocessor = preprocessor_class(space, self._options)
(RolloutWorker pid=69363)   File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/ray/rllib/models/preprocessors.py", line 43, in __init__
(RolloutWorker pid=69363)     self._obs_for_type_matching = self._obs_space.sample()
(RolloutWorker pid=69363)   File "/home/kinal/miniconda3/envs/marl/lib/python3.9/site-packages/gym/spaces/box.py", line 132, in sample
(RolloutWorker pid=69363)     sample[bounded] = self.np_random.uniform(
(RolloutWorker pid=69363)   File "mtrand.pyx", line 1130, in numpy.random.mtrand.RandomState.uniform
(RolloutWorker pid=69363) OverflowError: Range exceeds valid bounds

Environment details:

gym            0.21.0    pypi_0    pypi
dm-meltingpot  1.0.1     /home/kinal/Desktop/marl/meltingpot
numpy          1.21.4
ray            1.9.0     pypi_0    pypi
tensorflow     2.7.0     pypi_0    pypi
@kinalmehta
Author

UPDATE:

I tried narrowing down the issue and found that, when converting the dm_env environment to a gym environment, the observation dict has issues for the following keys.

Env name: "allelopathic_harvest"
Keys with issue:

  • COLOR_ID
  • MOST_TASTY_BERRY_ID
  • READY_TO_SHOOT
    All of these have dtype Box(-1.7976931348623157e+308, 1.7976931348623157e+308, (), float64)

So basically this only happens for dtype np.float64: NumPy raises an overflow error when uniformly sampling over the full min-to-max range of np.float64.
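
A minimal reproduction of the overflow (using np.random.RandomState, which is what the traceback shows gym 0.21 sampling from):

import numpy as np

# The requested range high - low overflows float64 (it is inf), so
# RandomState.uniform raises "OverflowError: Range exceeds valid bounds".
finfo = np.finfo(np.float64)
np.random.RandomState().uniform(finfo.min, finfo.max)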

I solved it by updating line 62 of ./examples/rllib/multiagent_wrapper.py as follows:

return spaces.Box(info.min / 10, info.max / 10, spec.shape, spec.dtype)  # shrink the float64 bounds so uniform sampling no longer overflows

Not sure if this is the right thing to do or whether it affects the environment in any way.

After solving this, I ran into another error:

ValueError: No default configuration for obs shape [88, 88, 3], you must specify `conv_filters` manually as a model option. Default configurations are only available for inputs of shape [42, 42, K] and [84, 84, K]. You may alternatively want to use a custom model or preprocessor.

This seems to be because rllib has no built-in model configuration for this environment's observation shape. Writing a custom model or updating the model configuration to accommodate the (88, 88, 3) shape should solve this; see the sketch below.
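
For example, something along these lines might work (a hedged sketch with hypothetical filter sizes mirroring RLlib's built-in 84x84 stack; not necessarily the configuration that was eventually committed):

# Each entry is [out_channels, kernel, stride]. With RLlib's same-padding
# on all but the last layer, 88x88 -> 22x22 -> 11x11, and the final
# 11x11 valid convolution reduces the output to 1x1.
config["model"]["conv_filters"] = [
    [16, [8, 8], 4],
    [32, [4, 4], 2],
    [256, [11, 11], 1],
]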

copybara-service bot pushed a commit that referenced this issue Jan 11, 2022
Previously we used the limits of precision, but this caused issues (#10).
Since the bounds were finite, spaces.Box would attempt uniform sampling.
By passing np.inf, spaces.Box will instead sample from a normal/exponential distribution, as desired.

PiperOrigin-RevId: 421009922
Change-Id: Ie24571a8ab72fc2564b302ff0e9dbad9e5856a9e
@jagapiou
Member

Heya, thanks for raising these issues.

Regarding your first point on spaces.Box bounds, it looks like the fix is to pass in (-np.inf, np.inf) as the bounds for floating-point numbers. 5b36a2a should fix this.

Note that these specs aren't precise. For example, READY_TO_SHOOT is actually either 0. or 1., so sampling from N(0, 1) will produce many values that are never seen during actual behavior.
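
(For illustration, a minimal sketch of the gym 0.21 behaviour the fix relies on:)

import numpy as np
from gym import spaces

# With infinite bounds, Box.sample() draws unbounded dimensions from a
# standard normal rather than a uniform, so no overflow can occur.
space = spaces.Box(-np.inf, np.inf, (), np.float64)
print(space.sample())  # e.g. 0.47, drawn from N(0, 1)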

I'll leave this open so your second issue can be dealt with.

@GoingMyWay

Hi, how did you set conv_filters? I tried [88, 88, 3] but it doesn't seem right:

config['model']['conv_filters'] = [88, 88, 3]

@jzleibo
Collaborator

jzleibo commented Feb 11, 2022

Hi both! I just submitted a fix for this: 6a3a5c2. The example should work now. However, please note that, in order to please RLlib, I had to pick a different conv-net configuration from the one we used to run the baselines in the paper. See the explanation here. You might want to experiment with different sizes.

@GoingMyWay

GoingMyWay commented Feb 11, 2022

> Hi both! I just submitted a fix for this: 6a3a5c2. The example should work now. However, please note that, in order to please RLlib, I had to pick a different conv-net configuration from the one we used to run the baselines in the paper. See the explanation here. You might want to experiment with different sizes.

Dear Prof Leibo, thanks for your quick response. Could you please also specify the versions of Python, Ray and TensorFlow used in the repo? My laptop is a MacBook, and the versions of Python, Ray and TensorFlow are 3.9, 1.10.0 and 2.8, respectively. However, when running the example code, it returns the following error:

Traceback (most recent call last):
  File "/Users/goingmyway/Projects/deepmind/meltingpot/examples/rllib/self_play_train.py", line 115, in <module>
    main()
  File "/Users/goingmyway/Projects/deepmind/meltingpot/examples/rllib/self_play_train.py", line 108, in main
    trainer = get_trainer_class(agent_algorithm)(env="meltingpot", config=config)
  File "/usr/local/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 728, in __init__
    super().__init__(config, logger_creator, remote_checkpoint_dir,
  File "/usr/local/lib/python3.9/site-packages/ray/tune/trainable.py", line 122, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/usr/local/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 826, in setup
    self.workers = self._make_workers(
  File "/usr/local/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 1925, in _make_workers
    return WorkerSet(
  File "/usr/local/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 123, in __init__
    self._local_worker = self._make_worker(
  File "/usr/local/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 497, in _make_worker
    worker = cls(
  File "/usr/local/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 588, in __init__
    self._build_policy_map(
  File "/usr/local/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1555, in _build_policy_map
    self.policy_map.create_policy(name, orig_cls, obs_space, act_space,
  File "/usr/local/lib/python3.9/site-packages/ray/rllib/policy/policy_map.py", line 133, in create_policy
    self[policy_id] = class_(
  File "/usr/local/lib/python3.9/site-packages/ray/rllib/policy/tf_policy_template.py", line 238, in __init__
    DynamicTFPolicy.__init__(
  File "/usr/local/lib/python3.9/site-packages/ray/rllib/policy/dynamic_tf_policy.py", line 333, in __init__
    dist_inputs, self._state_out = self.model(
  File "/usr/local/lib/python3.9/site-packages/ray/rllib/models/modelv2.py", line 231, in __call__
    restored["obs"] = restore_original_dimensions(
  File "/usr/local/lib/python3.9/site-packages/ray/rllib/models/modelv2.py", line 394, in restore_original_dimensions
    return _unpack_obs(obs, original_space, tensorlib=tensorlib)
  File "/usr/local/lib/python3.9/site-packages/ray/rllib/models/modelv2.py", line 433, in _unpack_obs
    batch_dims = [
  File "/usr/local/lib/python3.9/site-packages/ray/rllib/models/modelv2.py", line 434, in <listcomp>
    v if isinstance(v, int) else v.value for v in obs.shape[:-1]
AttributeError: 'NoneType' object has no attribute 'value'

I guess there might be a difference in the versions, as Ray keeps changing between releases. I would appreciate it if you could specify the versions of all the dependencies.

@jzleibo
Collaborator

jzleibo commented Feb 11, 2022

Hmm, I've never seen the error you got there. I tested it on a virtual machine with Python 3.7.12 and Ray 1.10.0. It's not clear to me what version of tensorflow I had there. It would be whatever version came along when I installed Ray.

@GoingMyWay

GoingMyWay commented Feb 11, 2022

> Hmm, I've never seen the error you got there. I tested it on a virtual machine with Python 3.7.12 and Ray 1.10.0. It's not clear to me what version of tensorflow I had there. It would be whatever version came along when I installed Ray.

Dear Prof Leibo, thank you for the clarification and the response. I guess the issue came from TF, which in turn caused the error in Ray. After changing line 434 of ray/rllib/models/modelv2.py to the following code, the example works now:

# here the workaround is simply adding `or v is None`
batch_dims = [v if (isinstance(v, int) or v is None) else v.value
              for v in obs.shape[:-1]]
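
(A likely explanation, as an aside: under TF 2.x an unknown batch dimension appears as a plain None rather than a tf.Dimension object with a .value attribute, which is what the original line assumed. A small illustration, assuming TF 2.x:)

import tensorflow as tf

x = tf.keras.Input(shape=(88, 88, 3))
print(x.shape[:-1])  # (None, 88, 88): the batch dim is None, not a Dimension,
                     # so calling `.value` on it raises AttributeError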

Thank you.

@jzleibo jzleibo closed this as completed Feb 11, 2022