[BUG] NaN error related to model-based RL algos #296

familyld · 2024-01-03T12:59:16Z

Required prerequisites

I have read the documentation https://omnisafe.readthedocs.io.
I have searched the Issue Tracker and Discussions that this hasn't already been reported. (+1 or comment there if it has.)
Consider asking first in a Discussion.

What version of OmniSafe are you using?

0.4.0

System information

python: 3.8
safety-gymnasium: 1.0.0

The installation process is as follows:

$ conda create -n omnisafe python=3.8
$ conda activate omnisafe
$ pip install omnisafe

Problem description

I tried to run the model-based benchmark experiments following the instructions at https://github.com/PKU-Alignment/omnisafe/blob/main/benchmarks/model-based/README.md, but the run fails.

I also noticed that the NaN issue was mentioned in #33, but the cause of the error is not explained there.

Additionally, it would be great if you could provide a simple script for running a single model-based algorithm, similar to https://github.com/PKU-Alignment/omnisafe/blob/main/examples/train_policy.py. Thank you!

Reproducible example code

Set the main block of examples/benchmarks/run_experiment_grid.py as follows:

if __name__ == '__main__':
    eg = ExperimentGrid(exp_name='Model-Based-Benchmarks')

    # set up the algorithms.
    model_based_base_policy = ['PETS',]
    model_based_safe_policy = []
    eg.add('algo', model_based_base_policy + model_based_safe_policy)

    # you can use wandb to monitor the experiment.
    eg.add('logger_cfgs:use_wandb', [False])
    # you can use tensorboard to monitor the experiment.
    eg.add('logger_cfgs:use_tensorboard', [True])
    eg.add('train_cfgs:total_steps', [10000])

    # set up the environment.
    eg.add('env_id', [
        'SafetyPointGoal1-v0-modelbased',
        ])
    eg.add('seed', [0, ])

    # the total number of experiments must be divisible by num_pool;
    # users should choose this value according to their machine
    eg.run(train, num_pool=1)

After that, run the following command to run the benchmark:

cd examples/benchmarks
python run_experiment_grid.py

Traceback

Traceback (most recent call last):
  File "C:\Users\xxx\anaconda3\envs\omnisafe\lib\concurrent\futures\process.py", line 239, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "C:\Users\xxx\anaconda3\envs\omnisafe\lib\site-packages\omnisafe\utils\exp_grid_tools.py", line 100, in train
    reward, cost, ep_len = agent.learn()
  File "C:\Users\xxx\anaconda3\envs\omnisafe\lib\site-packages\omnisafe\algorithms\algo_wrapper.py", line 172, in learn
    ep_ret, ep_cost, ep_len = self.agent.learn()
  File "C:\Users\xxx\anaconda3\envs\omnisafe\lib\site-packages\omnisafe\algorithms\model_based\base\pets.py", line 222, in learn
    ep_len = int(self._logger.get_stats('Metrics/EpLen')[0])
ValueError: cannot convert float NaN to integer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "run_experiment_grid.py", line 97, in <module>
    eg.run(train, num_pool=1)
  File "C:\Users\xxx\anaconda3\envs\omnisafe\lib\site-packages\omnisafe\common\experiment_grid.py", line 467, in run
    self.save_results(exp_names, variants, results)
  File "C:\Users\xxx\anaconda3\envs\omnisafe\lib\site-packages\omnisafe\common\experiment_grid.py", line 489, in save_results
    reward, cost, ep_len = results[idx].result()
  File "C:\Users\xxx\anaconda3\envs\omnisafe\lib\concurrent\futures\_base.py", line 437, in result
    return self.__get_result()
  File "C:\Users\xxx\anaconda3\envs\omnisafe\lib\concurrent\futures\_base.py", line 389, in __get_result
    raise self._exception
ValueError: cannot convert float NaN to integer
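
The final ValueError is the message CPython raises when int() is given a NaN float. Below is a minimal sketch of that failure, assuming (the traceback does not show this) that the logged 'Metrics/EpLen' statistic is an average over finished episodes and that no episode had been recorded. mean_or_nan and ep_len_or_none are hypothetical helpers for illustration, not OmniSafe functions.

```python
import math


def mean_or_nan(values):
    """Hypothetical stand-in for a logger statistic: NaN when nothing was logged."""
    return sum(values) / len(values) if values else float("nan")


def ep_len_or_none(values):
    """Convert the mean episode length to int, returning None instead of raising on NaN."""
    mean = mean_or_nan(values)
    return None if math.isnan(mean) else int(mean)


# Reproduce the error message from the traceback with an empty statistic.
try:
    int(mean_or_nan([]))
    message = ""
except ValueError as exc:
    message = str(exc)

print(message)
print(ep_len_or_none([]), ep_len_or_none([1000, 1000]))
```

Guarding the conversion this way only hides the symptom; it does not say why no episode statistics were available.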

Expected behavior

No response

Additional context

No response


familyld · 2024-01-03T13:13:46Z

This turned out to be a mistake caused by the reduced "total_steps" in my script. The default "steps_per_epoch" is 20000 in https://github.com/PKU-Alignment/omnisafe/tree/main/omnisafe/configs/model-based, and "total_steps" must not be smaller than "steps_per_epoch". Maybe adding a simple assertion after parsing the config parameters would be helpful 😉
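
The kind of check suggested here could look roughly like the sketch below. It is an illustration only: check_step_budget is a hypothetical helper, not part of OmniSafe, and it assumes the two values have already been read from the parsed config as plain integers.

```python
def check_step_budget(total_steps: int, steps_per_epoch: int) -> int:
    """Return the number of full epochs, or raise if not even one epoch fits.

    Hypothetical helper: the parameter names mirror the config keys quoted
    in this thread, but the function itself is not an OmniSafe API.
    """
    if steps_per_epoch <= 0:
        raise ValueError(f"steps_per_epoch must be positive, got {steps_per_epoch}")
    if total_steps < steps_per_epoch:
        raise ValueError(
            f"total_steps ({total_steps}) must not be smaller than "
            f"steps_per_epoch ({steps_per_epoch})"
        )
    return total_steps // steps_per_epoch


# The settings from this issue: 10000 total steps against the default 20000.
try:
    check_step_budget(total_steps=10000, steps_per_epoch=20000)
    error = ""
except ValueError as exc:
    error = str(exc)

print(error)
print(check_step_budget(total_steps=40000, steps_per_epoch=20000))
```

Raising a ValueError with both numbers in the message, rather than a bare assert, keeps the check active under python -O and tells the user which setting to change.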

Gaiejj · 2024-01-03T13:55:53Z

Thanks for your considerate suggestions! Reopen this issue if you encounter any further problems. 😊

familyld added the "bug" (Something isn't working) label Jan 3, 2024

familyld assigned zmsn-2077 Jan 3, 2024

familyld closed this as completed Jan 3, 2024

familyld reopened this Jan 3, 2024

familyld closed this as completed Jan 3, 2024
