ValueError: The parameter loc has invalid values #103

Closed
wanglinghao1 opened this issue May 10, 2023 · 5 comments
Labels: bug (Something isn't working)

Comments

@wanglinghao1

With local_mode=True the code runs without error, but with local_mode=False the following error is raised every time training reaches the 36th iteration:
Failure # 1 (occurred at 2023-05-09_21-08-13)
Traceback (most recent call last):
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\tune\trial_runner.py", line 890, in _process_trial
results = self.trial_executor.fetch_result(trial)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\tune\ray_trial_executor.py", line 788, in fetch_result
result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\_private\client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\worker.py", line 1625, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray::VDA2CTrainer.train() (pid=23324, ip=127.0.0.1, repr=VDA2CTrainer)
File "E:\Linghao\MARLlib-sy_dev_0\marllib\marl\algos\core\VD\vda2c.py", line 65, in value_mix_actor_critic_loss
dist = dist_class(logits, model)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\models\torch\torch_action_dist.py", line 186, in __init__
self.dist = torch.distributions.normal.Normal(mean, torch.exp(log_std))
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\torch\distributions\normal.py", line 50, in __init__
super(Normal, self).__init__(batch_shape, validate_args=validate_args)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\torch\distributions\distribution.py", line 53, in __init__
raise ValueError("The parameter {} has invalid values".format(param))
ValueError: The parameter loc has invalid values

The above exception was the direct cause of the following exception:

ray::VDA2CTrainer.train() (pid=23324, ip=127.0.0.1, repr=VDA2CTrainer)
File "python\ray\_raylet.pyx", line 558, in ray._raylet.execute_task
File "python\ray\_raylet.pyx", line 596, in ray._raylet.execute_task
File "python\ray\_raylet.pyx", line 565, in ray._raylet.execute_task
File "python\ray\_raylet.pyx", line 569, in ray._raylet.execute_task
File "python\ray\_raylet.pyx", line 519, in ray._raylet.execute_task.function_executor
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\_private\function_manager.py", line 576, in actor_method_executor
return method(__ray_actor, *args, **kwargs)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\util\tracing\tracing_helper.py", line 451, in _resume_span
return method(self, *_args, **_kwargs)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\agents\trainer.py", line 682, in train
raise e
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\agents\trainer.py", line 668, in train
result = Trainable.train(self)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\tune\trainable.py", line 283, in train
result = self.step()
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\util\tracing\tracing_helper.py", line 451, in _resume_span
return method(self, *_args, **_kwargs)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\agents\trainer_template.py", line 206, in step
step_results = next(self.train_exec_impl)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\util\iter.py", line 756, in next
return next(self.built_iterator)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\util\iter.py", line 783, in apply_foreach
for item in it:
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\util\iter.py", line 843, in apply_filter
for item in it:
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\util\iter.py", line 843, in apply_filter
for item in it:
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\util\iter.py", line 791, in apply_foreach
result = fn(item)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\execution\train_ops.py", line 230, in __call__
results = policy.learn_on_loaded_batch(
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\policy\torch_policy.py", line 632, in learn_on_loaded_batch
return self.learn_on_batch(batch)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\utils\threading.py", line 21, in wrapper
return func(self, *a, **k)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\policy\torch_policy.py", line 529, in learn_on_batch
grads, fetches = self.compute_gradients(postprocessed_batch)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\policy\policy_template.py", line 336, in compute_gradients
return parent_cls.compute_gradients(self, batch)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\utils\threading.py", line 21, in wrapper
return func(self, *a, **k)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\policy\torch_policy.py", line 709, in compute_gradients
tower_outputs = self._multi_gpu_parallel_grad_calc(
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\policy\torch_policy.py", line 1083, in _multi_gpu_parallel_grad_calc
raise last_result[0] from last_result[1]
ValueError: The parameter loc has invalid values
In tower 0 on device cpu

As far as I can tell, the environment currently has no conflicting packages.
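The final `ValueError` comes from argument validation in `torch.distributions`: the `loc` (mean) passed to `Normal` must be a finite real number, so a NaN or infinite entry in the policy output fails the check. One possible source, which is an assumption and not something confirmed in this thread, is a non-finite observation or reward coming out of the custom environment. A minimal sketch of a check that could be run on each observation or reward before it is returned (the helper names `iter_non_finite` and `assert_finite` are hypothetical, and NumPy arrays would need `.tolist()` first):

```python
import math
from typing import Any, Iterator, Tuple


def iter_non_finite(value: Any, path: str = "obs") -> Iterator[Tuple[str, float]]:
    """Yield (path, number) for every NaN or infinite number nested in value."""
    if isinstance(value, dict):
        for key, item in value.items():
            yield from iter_non_finite(item, f"{path}[{key!r}]")
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            yield from iter_non_finite(item, f"{path}[{index}]")
    elif isinstance(value, (int, float)) and not math.isfinite(value):
        yield path, value


def assert_finite(value: Any, name: str = "obs") -> None:
    """Raise ValueError naming every non-finite entry found in value."""
    bad = list(iter_non_finite(value, name))
    if bad:
        raise ValueError(f"non-finite values in {name}: {bad}")
```

Calling `assert_finite(obs)` and `assert_finite(rewards, "reward")` inside the environment's `step` would make the failure surface at the point where the bad value is produced, rather than later inside the loss function.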

@Theohhhu
Collaborator

I've encountered this problem previously, and it seems to be related to RLlib. Could you please provide me with the script you're currently running?

@Theohhhu added the bug label on May 10, 2023
@wanglinghao1
Author

I'm currently running this in an environment I wrote myself. Which part of the script do I need to provide?

@Theohhhu
Collaborator

Just the API part please.

@wanglinghao1
Author

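# Imports assumed from MARLlib's custom-environment example.
# RllibSUMO is my own environment class (its import is not shown here).
from marllib import marl
from marllib.envs.base_env import ENV_REGISTRY

# Register the custom SUMO environment and build it with continuous actions.
ENV_REGISTRY["sumo"] = RllibSUMO
env = marl.make_env(environment_name="sumo", map_name="simple_liushi_so_3OD", continuous_actions=True)
vda2c = marl.algos.vda2c(hyperparam_source="common")
model = marl.build_model(env, vda2c, {"core_arch": "lstm"})

vda2c.fit(env, model, stop={'episode_reward_mean': 20000, 'timesteps_total': 10000000}, local_mode=False, num_gpus=0,
          num_workers=20, share_policy='individual', checkpoint_freq=1, checkpoint_end=True)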

Also, the same code does not seem to raise this error on another machine. Is this device-related?

@Theohhhu
Collaborator

Device-related problems are common in Ray-based projects. If possible, switching to a different machine is usually the simplest workaround.
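Before switching machines, it may help to rule out a plain version mismatch between the machine that fails and the one that works. The sketch below collects the Python version, platform string, and installed versions of a few packages using only the standard library (Python 3.8+), so the two reports can be compared. The package list, including the distribution name `marllib`, and the helper names are assumptions made for illustration.

```python
import platform
import sys
from importlib import metadata

# Distribution names to inspect (an assumed list; adjust as needed).
PACKAGES = ("ray", "torch", "numpy", "gym", "marllib")


def environment_report(packages=PACKAGES):
    """Return a dict with Python, platform, and installed package versions."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


def diff_reports(first, second):
    """Return {key: (first_value, second_value)} for every differing entry."""
    keys = sorted(set(first) | set(second))
    return {k: (first.get(k), second.get(k)) for k in keys if first.get(k) != second.get(k)}


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running the script on both machines and passing the two resulting dicts to `diff_reports` lists only the entries that differ. An empty result would suggest the difference lies elsewhere, for example in hardware or operating system behavior.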
