Performance check against SB3 #1

zhihanyang2022 · 2021-06-06T13:43:58Z

Following the style here, we will run

DDPG
TD3
SAC

on 4 PyBullet environments

HalfCheetahBulletEnv-v0
AntBulletEnv-v0
HopperBulletEnv-v0
Walker2DBulletEnv-v0

for 3 seeds, and report the mean and std of final performance.

In total, this would be 3 * 4 * 3 = 36 runs.
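As a quick sanity check on the run count, the grid can be enumerated explicitly. This is only a sketch: the lowercase algorithm names and the run IDs 1, 2, 3 are assumed to match the `--algo` and `--run_id` values used in the launch commands in this thread.

```python
from itertools import product

# Algorithms and environments listed in this issue.
ALGOS = ["ddpg", "td3", "sac"]
ENVS = [
    "HalfCheetahBulletEnv-v0",
    "AntBulletEnv-v0",
    "HopperBulletEnv-v0",
    "Walker2DBulletEnv-v0",
]
# Assumption: three seeds identified by run IDs 1, 2, 3.
RUN_IDS = [1, 2, 3]

# One tuple per (algorithm, environment, run ID) combination.
runs = list(product(ALGOS, ENVS, RUN_IDS))
print(len(runs))  # 3 * 4 * 3 = 36
```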


zhihanyang2022 · 2021-06-08T12:37:12Z

Commit hash: 35dc700

For DDPG:

python launch.py --env HalfCheetahBulletEnv-v0 --algo ddpg --config configs/reproduce_sb3/ddpg_halfcheetah.gin --run_id 1 2 3
python launch.py --env AntBulletEnv-v0 --algo ddpg --config configs/reproduce_sb3/ddpg_ant.gin --run_id 1 2 3
python launch.py --env HopperBulletEnv-v0 --algo ddpg --config configs/reproduce_sb3/ddpg_hopper.gin --run_id 1 2 3
python launch.py --env Walker2DBulletEnv-v0 --algo ddpg --config configs/reproduce_sb3/ddpg_walker2d.gin --run_id 1 2 3

For TD3:

python launch.py --env HalfCheetahBulletEnv-v0 --algo td3 --config configs/reproduce_sb3/td3_all.gin --run_id 1 2 3
python launch.py --env AntBulletEnv-v0 --algo td3 --config configs/reproduce_sb3/td3_all.gin --run_id 1 2 3
python launch.py --env HopperBulletEnv-v0 --algo td3 --config configs/reproduce_sb3/td3_all.gin --run_id 1 2 3
python launch.py --env Walker2DBulletEnv-v0 --algo td3 --config configs/reproduce_sb3/td3_all.gin --run_id 1 2 3

For SAC:

python launch.py --env HalfCheetahBulletEnv-v0 --algo sac --config configs/reproduce_sb3/sac_halfcheetah_ant.gin --run_id 1 2 3
python launch.py --env AntBulletEnv-v0 --algo sac --config configs/reproduce_sb3/sac_halfcheetah_ant.gin --run_id 1 2 3
python launch.py --env HopperBulletEnv-v0 --algo sac --config configs/reproduce_sb3/sac_hopper_walker2d.gin --run_id 1 2 3
python launch.py --env Walker2DBulletEnv-v0 --algo sac --config configs/reproduce_sb3/sac_hopper_walker2d.gin --run_id 1 2 3
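The twelve commands above follow one pattern, so they could also be generated from a small lookup table. The sketch below only builds the command strings; the `CONFIGS` mapping and `build_command` helper are hypothetical names for illustration, while the flags and config paths are copied from the commands above.

```python
# Config file used for each (algorithm, environment) pair, copied from the commands above.
CONFIGS = {
    ("ddpg", "HalfCheetahBulletEnv-v0"): "ddpg_halfcheetah.gin",
    ("ddpg", "AntBulletEnv-v0"): "ddpg_ant.gin",
    ("ddpg", "HopperBulletEnv-v0"): "ddpg_hopper.gin",
    ("ddpg", "Walker2DBulletEnv-v0"): "ddpg_walker2d.gin",
    ("td3", "HalfCheetahBulletEnv-v0"): "td3_all.gin",
    ("td3", "AntBulletEnv-v0"): "td3_all.gin",
    ("td3", "HopperBulletEnv-v0"): "td3_all.gin",
    ("td3", "Walker2DBulletEnv-v0"): "td3_all.gin",
    ("sac", "HalfCheetahBulletEnv-v0"): "sac_halfcheetah_ant.gin",
    ("sac", "AntBulletEnv-v0"): "sac_halfcheetah_ant.gin",
    ("sac", "HopperBulletEnv-v0"): "sac_hopper_walker2d.gin",
    ("sac", "Walker2DBulletEnv-v0"): "sac_hopper_walker2d.gin",
}


def build_command(algo, env, run_ids=(1, 2, 3)):
    """Return the launch command string for one (algorithm, environment) pair."""
    config = CONFIGS[(algo, env)]
    ids = " ".join(str(i) for i in run_ids)
    return (
        f"python launch.py --env {env} --algo {algo} "
        f"--config configs/reproduce_sb3/{config} --run_id {ids}"
    )


# Dicts preserve insertion order, so this reproduces the order used above.
commands = [build_command(algo, env) for (algo, env) in CONFIGS]
for command in commands:
    print(command)
```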

zhihanyang2022 · 2021-06-09T02:14:42Z

As mentioned earlier, we used the same hyper-parameters as SB3 (available in rl-baselines3-zoo).

For the training curves, both SB3 and our visualization report the un-smoothed mean and standard error.

Potential causes for minor differences (e.g., our error bars seem to be wider in some cases):

Stochasticity (e.g., our TD3 on Walker2D seems a little weak, but we believe this can be corrected with more runs).
SB3 used numpy and matplotlib for plotting, while we relied on Weights & Biases.
For each trial, SB3 used 10 evaluation episodes every 10000 steps, while we used 5 evaluation episodes every 10000 steps.
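For reference, the mean and standard error across seeds at a single evaluation point can be computed as in the sketch below. It assumes the sample standard deviation (n - 1 in the denominator); whether SB3's plotting script or Weights & Biases uses that convention is not confirmed here, and with only 3 seeds the choice changes the error bars noticeably. The input values are made-up placeholders, not results from this issue.

```python
import math
import statistics


def mean_and_stderr(values):
    """Return (mean, standard error) of per-seed evaluation returns."""
    n = len(values)
    mean = statistics.fmean(values)
    # Assumption: sample standard deviation (divides by n - 1).
    stderr = statistics.stdev(values) / math.sqrt(n)
    return mean, stderr


# Placeholder per-seed returns at one evaluation step (hypothetical numbers).
per_seed_returns = [2700.0, 2800.0, 2750.0]
mean, stderr = mean_and_stderr(per_seed_returns)
print(f"{mean:.1f} +/- {stderr:.1f}")
```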

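DDPG

[Training-curve plots, one row per environment (HalfCheetah, Ant, Hopper, Walker2D); columns: DDPG (SB3; 6 seeds), DDPG (ours; 3 seeds)]

TD3

[Training-curve plots, one row per environment (HalfCheetah, Ant, Hopper, Walker2D); columns: TD3 (SB3; 3 seeds), TD3 (ours; 3 seeds)]

SAC

[Training-curve plots, one row per environment (HalfCheetah, Ant, Hopper, Walker2D); columns: SAC (SB3; 3 seeds), SAC (ours; 3 seeds)]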

zhihanyang2022 · 2021-10-11T01:05:23Z

Comparison of final performance for SAC:

Environment (3 seeds; mean +/- stderr)	SAC (SB3)	SAC (ours)
HalfCheetahBulletEnv	2725 +/- 129	2757 +/- 53
AntBulletEnv	3493 +/- 23	3146 +/- 35
HopperBulletEnv	2546 +/- 196	2422 +/- 168
Walker2DBulletEnv	2367 +/- 83	2184 +/- 54

where SB3 stats are obtained from DLR-RM/stable-baselines3#48.
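One rough way to read the table is to express each gap (ours minus SB3) in units of the combined standard error, sqrt(se_sb3^2 + se_ours^2). This is only a heuristic sketch, not a significance test: it assumes the two sets of runs are independent, and with 3 seeds per side the standard errors themselves are very noisy. The numbers are copied from the table above.

```python
import math

# (mean, stderr) pairs from the SAC table above: SB3 first, ours second.
SAC_RESULTS = {
    "HalfCheetahBulletEnv": ((2725, 129), (2757, 53)),
    "AntBulletEnv": ((3493, 23), (3146, 35)),
    "HopperBulletEnv": ((2546, 196), (2422, 168)),
    "Walker2DBulletEnv": ((2367, 83), (2184, 54)),
}


def gap_in_stderr_units(sb3, ours):
    """Return (ours - SB3) divided by the combined standard error."""
    (sb3_mean, sb3_se), (ours_mean, ours_se) = sb3, ours
    combined_se = math.sqrt(sb3_se**2 + ours_se**2)
    return (ours_mean - sb3_mean) / combined_se


gaps = {env: gap_in_stderr_units(sb3, ours) for env, (sb3, ours) in SAC_RESULTS.items()}
for env, gap in gaps.items():
    print(f"{env}: {gap:+.2f} combined standard errors")
```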

zhihanyang2022 self-assigned this Jun 8, 2021

zhihanyang2022 added reproducibility performance-check and removed reproducibility labels Jun 8, 2021

zhihanyang2022 closed this as completed Jun 10, 2021
