fix: ddpg action bias #299

sdpkjc · 2022-10-22T14:03:43Z

Description

Fixes the first part of #297
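
The description does not spell out the change itself. As a rough illustration of what an "action bias" usually means for a tanh-squashed deterministic actor, here is a minimal sketch. It assumes the environment's action bounds are `[low, high]` and possibly asymmetric; `rescale_action` and the bounds below are hypothetical names and values, not taken from this PR's diff.

```python
import math


def rescale_action(raw, low, high):
    """Map an unbounded actor output into [low, high] via tanh.

    Hypothetical helper for illustration only; not the PR's actual code.
    """
    scale = (high - low) / 2.0  # half-width of the action range
    bias = (high + low) / 2.0   # midpoint of the action range
    return math.tanh(raw) * scale + bias


# Hypothetical asymmetric bounds, chosen only for illustration.
low, high = 0.0, 4.0
print(rescale_action(0.0, low, high))  # midpoint of the range
```

Multiplying by `scale` alone would center the output on zero, so with asymmetric bounds part of the valid range would be unreachable; adding `bias` shifts the output to the midpoint of the range.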

Types of changes

Bug fix
New feature
New algorithm
Documentation

Checklist:

I've read the CONTRIBUTION guide (required).
I have ensured pre-commit run --all-files passes (required).
I have updated the documentation and previewed the changes via mkdocs serve.
I have updated the tests accordingly (if applicable).

If you are adding new algorithms or your change could result in a performance difference, you may need to (re-)run tracked experiments. See #137 as an example PR.

vercel · 2022-10-22T14:03:48Z

The latest updates on your projects.

Name	Status	Preview	Updated
cleanrl	✅ Ready (Inspect)	Visit Preview	Nov 3, 2022 at 9:12PM (UTC)

vwxyzjn · 2022-11-01T00:22:51Z

Thanks for the PR. Running some benchmark experiments now.

vwxyzjn · 2022-11-01T20:05:27Z

Using the following snippet from #307

python rlops.py --exp-name ddpg_continuous_action \
    --wandb-project-name cleanrl \
    --wandb-entity openrlbenchmark \
    --tags  pr-299 rlops-pilot \
    --env-ids Hopper-v2 Walker2d-v2 HalfCheetah-v2 \
    --output-filename compare.png \
    --report

we generate the following image

Discussion

The matplotlib plot subsamples the wandb runs, which sometimes results in slightly inaccurate curves.
This PR improves the performance in HalfCheetah-v2.
Training is slightly faster, probably because I am now using --worker 1 instead of --worker 3.

What remains is to update the documentation and optionally run more experiments in more envs.

vwxyzjn · 2022-11-03T21:14:00Z

The experiments are done and the docs are updated. The following command from #307 generated the figure and table below:

python -m cleanrl_utils.rlops --exp-name ddpg_continuous_action \
    --wandb-project-name cleanrl \
    --wandb-entity openrlbenchmark \
    --tags 'pr-299' 'rlops-pilot' \
    --env-ids HalfCheetah-v2 Walker2d-v2 Hopper-v2 InvertedPendulum-v2 Humanoid-v2 Pusher-v2 \
    --output-filename compare.png \
    --scan-history \
    --metric-last-n-average-window 100 \
    --report

                    CleanRL's ddpg_continuous_action (pr-299) CleanRL's ddpg_continuous_action (rlops-pilot)
HalfCheetah-v2                              10210.57 ± 196.22                              9205.65 ± 1093.88
Walker2d-v2                                  1661.14 ± 250.01                               1447.09 ± 260.24
Hopper-v2                                    1007.44 ± 148.29                               1126.37 ± 278.02
InvertedPendulum-v2                            684.61 ± 94.41                                 544.77 ± 50.98
Humanoid-v2                                    910.61 ± 97.58                                 849.05 ± 40.64
Pusher-v2                                       -39.39 ± 9.54                                  -32.52 ± 2.03
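
For readers wondering how a "mean ± std" entry like the ones above could be produced, here is a minimal sketch. It assumes each run (seed) contributes the average of its last `n` episodic returns, matching the spirit of `--metric-last-n-average-window 100`, and that the reported spread is the standard deviation across runs. Whether rlops uses the population or sample standard deviation is not stated here, so the choice of `pstdev` below is an assumption; `last_n_average` and `summarize` are hypothetical names.

```python
import statistics


def last_n_average(returns, n=100):
    """Average the final n episodic returns of one run."""
    window = returns[-n:]
    return sum(window) / len(window)


def summarize(runs, n=100):
    """Return (mean, std) of the per-run last-n averages across runs.

    Uses the population standard deviation; this is an assumption.
    """
    per_run = [last_n_average(r, n) for r in runs]
    return statistics.mean(per_run), statistics.pstdev(per_run)


# Toy data, not real benchmark results.
runs = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]]
mean, std = summarize(runs, n=2)
print(f"{mean:.2f} ± {std:.2f}")
```

With only a handful of seeds per environment, differences smaller than the reported spread (for example Hopper-v2 above) are hard to interpret either way.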

vwxyzjn · 2022-11-03T22:45:58Z

Thanks @sdpkjc for this PR and for raising the issue.

fix: ddpg action bias

6081d30

vercel bot deployed to Preview October 22, 2022 14:04 View deployment

Merge branch 'master' into fix-ddpg-bias

578d012

vercel bot deployed to Preview November 2, 2022 15:17 View deployment

update docs

cde4ff1

vercel bot deployed to Preview November 3, 2022 21:12 View deployment

update docs

5d234a2

vercel bot deployed to Preview November 3, 2022 21:13 View deployment

vwxyzjn merged commit 023eaea into vwxyzjn:master Nov 3, 2022
