Couldn't reproduce the result on Mujoco suite. #6

Closed
sweetice opened this issue Jul 20, 2020 · 1 comment


sweetice commented Jul 20, 2020

Couldn't reproduce the result on the Mujoco suite.
Setting: We run the BEAR with the recommend settings: ** mmd_sigma = 20.0 , kernel_type = gaussian , num_samples_match = 5 , version = 0 or 2 , lagrange_thresh = 10.0 , `mode = auto**
The batch dataset is produced by training a DDPG agent for 1 million time steps. For reproducing, we use the DDPG code in BCQ repository.

We use the final buffer setting from the BCQ paper.
Here are the full results.
Note that "behavioral" denotes the evaluation of the DDPG agent during training.
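For context on what `mmd_sigma`, `kernel_type = gaussian`, and `num_samples_match` control, here is a minimal NumPy sketch of a Gaussian-kernel MMD estimate between two small action batches. This is illustrative only: the exact bandwidth convention in the BEAR code may differ (e.g. `exp(-d²/(2σ))` vs. `exp(-d²/(2σ²))`), and the variable names below are our own, not from the repository.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=20.0):
    # Pairwise Gaussian kernel between sample sets x (n, d) and y (m, d).
    # Bandwidth convention assumed here: exp(-||x - y||^2 / (2 * sigma)).
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma))

def mmd_squared(x, y, sigma=20.0):
    # Biased (V-statistic) estimate of squared MMD between the two sample sets.
    return (gaussian_kernel(x, x, sigma).mean()
            + gaussian_kernel(y, y, sigma).mean()
            - 2.0 * gaussian_kernel(x, y, sigma).mean())

rng = np.random.default_rng(0)
# num_samples_match = 5 actions each, 6-dimensional (hypothetical action dim).
a = rng.normal(size=(5, 6))
b = rng.normal(size=(5, 6))
print(mmd_squared(a, b, sigma=20.0))
```

BEAR constrains the learned policy by penalizing this kind of MMD distance between sampled policy actions and sampled dataset actions, so a mismatch between the behavior data (here, a DDPG final buffer) and the assumed kernel bandwidth could plausibly affect results.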

Update (07/20): uploaded PNG files for easier reading.
[Result plots for Ant-v2, HalfCheetah-v2, Hopper-v2, InvertedDoublePendulum-v2, InvertedPendulum-v2, Reacher-v2, Swimmer-v2, and Walker2d-v2]

For clearer reading, the same plots as PDFs:
Ant-v2.pdf
HalfCheetah-v2.pdf
Hopper-v2.pdf
InvertedDoublePendulum-v2.pdf
InvertedPendulum-v2.pdf
Reacher-v2.pdf
Swimmer-v2.pdf
Walker2d-v2.pdf

aviralkumar2907 (Owner) commented:

Hi! In the BEAR paper, we didn't test the final buffer setting, so I am not sure the hyperparameters are optimal for it. I would recommend trying the new cleaned-up implementation of BEAR, where we have searched over hyperparameters: https://github.com/rail-berkeley/d4rl_evaluations, and would also recommend using the D4RL datasets (https://github.com/rail-berkeley/d4rl).
