
[rllib] Cannot run the custom environment sample code in Python #8754

Closed
SarahBelieve opened this issue Jun 2, 2020 · 4 comments
Labels: question (Just a question :)), stale (The issue is stale. It will be closed within 7 days unless there is further conversation.)

Comments

SarahBelieve commented Jun 2, 2020

What is the problem?

I tried the custom environment example with PyTorch, but it fails with this error: "RuntimeError: Expected object of scalar type Float but got scalar type Long for argument #2 'mat1' in call to _th_addmm".

When I switch the framework to TensorFlow, it runs properly. I can't find any documentation describing what changes are needed to run a custom environment with PyTorch.

System environment:

Ray: latest pip version
Python: 3.7.7
PyTorch: 1.5.0
OS: Ubuntu 18

Reproduction

import gym
from gym.spaces import Discrete, Box
from ray import tune

class SimpleCorridor(gym.Env):
    def __init__(self, config):
        self.end_pos = config["corridor_length"]
        self.cur_pos = 0
        self.action_space = Discrete(2)
        self.observation_space = Box(0.0, self.end_pos, shape=(1, ))

    def reset(self):
        self.cur_pos = 0
        return [self.cur_pos]

    def step(self, action):
        if action == 0 and self.cur_pos > 0:
            self.cur_pos -= 1
        elif action == 1:
            self.cur_pos += 1
        done = self.cur_pos >= self.end_pos
        return [self.cur_pos], 1 if done else 0, done, {}

tune.run(
    "PPO",
    config={
        "env": SimpleCorridor,
        "num_workers": 1,
        "env_config": {"corridor_length": 5},
        "use_pytorch": True,
    })
SarahBelieve added the question label Jun 2, 2020
sven1977 (Contributor) commented Jun 3, 2020

Could you upgrade to the latest Ray wheel and then run this script with the --torch command-line flag? It runs fine for me. Could you also post the entire stack trace?
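
As pasted above, the reproduction script does not parse any command-line arguments, so --torch only takes effect when running RLlib's bundled example script. A hypothetical, minimal way to honor the flag in the standalone script (replacing the tune.run call above) would be:

import argparse

# Hypothetical sketch: map a --torch flag onto the "use_pytorch" config key.
parser = argparse.ArgumentParser()
parser.add_argument("--torch", action="store_true")
args = parser.parse_args()

tune.run(
    "PPO",
    config={
        "env": SimpleCorridor,
        "num_workers": 1,
        "env_config": {"corridor_length": 5},
        "use_pytorch": args.torch,
    })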

SarahBelieve (Author)

@sven1977 Thanks for your quick reply!
I upgraded Ray with pip install ray[rllib] --upgrade, then ran the script with python sample_corridor.py --torch.
Here is what I got:

2020-06-03 14:05:41,553	INFO resource_spec.py:212 -- Starting Ray with 3.32 GiB memory available for workers and up to 1.67 GiB for objects. You can adjust these settings with ray.init(memory=<bytes>, object_store_memory=<bytes>).
2020-06-03 14:05:41,907	INFO services.py:1170 -- View the Ray dashboard at localhost:8265
== Status ==
Memory usage on this node: 11.8/15.5 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/4 CPUs, 0/0 GPUs, 0.0/3.32 GiB heap, 0.0/1.12 GiB objects
Result logdir: /home/hcr/ray_results/PPO
Number of trials: 1 (1 RUNNING)
+--------------------------+----------+-------+
| Trial name               | status   | loc   |
|--------------------------+----------+-------|
| PPO_SimpleCorridor_00000 | RUNNING  |       |
+--------------------------+----------+-------+


(pid=16576) 2020-06-03 14:05:44,568	INFO trainer.py:580 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
(pid=16576) 2020-06-03 14:05:44,598	INFO trainable.py:217 -- Getting current IP.
(pid=16576) 2020-06-03 14:05:44,599	WARNING util.py:37 -- Install gputil for GPU system monitoring.
2020-06-03 14:05:45,504	ERROR trial_runner.py:519 -- Trial PPO_SimpleCorridor_00000: Error processing event.
Traceback (most recent call last):
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 467, in _process_trial
    result = self.trial_executor.fetch_result(trial)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 431, in fetch_result
    result = ray.get(trial_future[0], DEFAULT_GET_TIMEOUT)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/worker.py", line 1515, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(RuntimeError): ray::PPO.train() (pid=16576, ip=192.168.1.27)
  File "python/ray/_raylet.pyx", line 463, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 417, in ray._raylet.execute_task.function_executor
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 495, in train
    raise e
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 484, in train
    result = Trainable.train(self)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/tune/trainable.py", line 261, in train
    result = self._train()
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 151, in _train
    fetches = self.optimizer.step()
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/rllib/optimizers/sync_samples_optimizer.py", line 59, in step
    for e in self.workers.remote_workers()
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/rllib/utils/memory.py", line 32, in ray_get_and_free
    return ray.get(object_ids)
ray.exceptions.RayTaskError(RuntimeError): ray::RolloutWorker.sample() (pid=16575, ip=192.168.1.27)
  File "python/ray/_raylet.pyx", line 463, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 417, in ray._raylet.execute_task.function_executor
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 510, in sample
    batches = [self.input_reader.next()]
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 54, in next
    batches = [self.get_data()]
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 98, in get_data
    item = next(self.rollout_provider)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 358, in _env_runner
    active_episodes)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 610, in _do_policy_eval
    timestep=policy.global_timestep)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/rllib/policy/torch_policy.py", line 151, in compute_actions
    input_dict, state_batches, seq_lens)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/rllib/models/modelv2.py", line 164, in __call__
    res = self.forward(restored, state or [], seq_lens)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/rllib/models/torch/fcnet.py", line 92, in forward
    features = self._hidden_layers(obs.reshape(obs.shape[0], -1))
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/rllib/models/torch/misc.py", line 108, in forward
    return self._model(x)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/torch/nn/functional.py", line 1610, in linear
    ret = torch.addmm(bias, input, weight.t())
RuntimeError: Expected object of scalar type Float but got scalar type Long for argument #2 'mat1' in call to _th_addmm
== Status ==
Memory usage on this node: 11.9/15.5 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/4 CPUs, 0/0 GPUs, 0.0/3.32 GiB heap, 0.0/1.12 GiB objects
Result logdir: /home/hcr/ray_results/PPO
Number of trials: 1 (1 ERROR)
+--------------------------+----------+-------+
| Trial name               | status   | loc   |
|--------------------------+----------+-------|
| PPO_SimpleCorridor_00000 | ERROR    |       |
+--------------------------+----------+-------+
Number of errored trials: 1
+--------------------------+--------------+--------------------------------------------------------------------------------------+
| Trial name               |   # failures | error file                                                                           |
|--------------------------+--------------+--------------------------------------------------------------------------------------|
| PPO_SimpleCorridor_00000 |            1 | /home/hcr/ray_results/PPO/PPO_SimpleCorridor_0_2020-06-03_14-05-436wiesl7d/error.txt |
+--------------------------+--------------+--------------------------------------------------------------------------------------+

Traceback (most recent call last):
  File "sample_corridor.py", line 30, in <module>
    "use_pytorch":True})
  File "/home/hcr/python-envs/jh-warehouse/lib/python3.7/site-packages/ray/tune/tune.py", line 347, in run
    raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [PPO_SimpleCorridor_00000])
(pid=16575) /pytorch/torch/csrc/utils/tensor_numpy.cpp:141: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program.
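
The failing call at the bottom of the trace is torch.addmm inside nn.Linear, which rejects integer input. A standalone snippet that reproduces the same dtype mismatch outside of RLlib, and the cast that avoids it, might look like this (the exact error wording varies between PyTorch versions):

import torch
import torch.nn as nn

layer = nn.Linear(1, 4)        # weights and bias are float32
obs = torch.tensor([[3]])      # int64 (Long) tensor, like the batched corridor observation
try:
    layer(obs)                 # raises the Float-vs-Long RuntimeError
except RuntimeError as err:
    print(err)
layer(obs.float())             # casting the input to float32 makes the forward pass work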

stale bot commented Nov 11, 2020

Hi, I'm a bot from the Ray team :)

To help human contributors focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months.

If there is no further activity in the next 14 days, the issue will be closed!

  • If you'd like to keep the issue open, just leave any comment, and the stale label will be removed!
  • If you'd like to draw more attention to the issue, please tag one of Ray's contributors.

You can always ask for help on our discussion forum or Ray's public Slack channel.

stale bot added the stale label Nov 11, 2020
stale bot commented Nov 25, 2020

Hi again! This issue will be closed because there has been no further activity in the 14 days since the last message.

Please feel free to reopen or open a new issue if you'd still like it to be addressed.

Again, you can always ask for help on our discussion forum or Ray's public slack channel.

Thanks again for opening the issue!

stale bot closed this as completed Nov 25, 2020