feat(off-policy): support off-policy lag #204
Co-authored-by: borong <borongzh@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ruiyang sun <rockmagma02@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: ruiyang sun <rockmagma02@gmail.com>
Co-authored-by: zmsn-2077 <73586554+zmsn-2077@users.noreply.github.com> Co-authored-by: borong <borongzh@gmail.com> Co-authored-by: friedmainfunction <73703265+friedmainfunction@users.noreply.github.com>
…nment#147) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…thms (PKU-Alignment#148) Co-authored-by: borong <borongzh@gmail.com> Co-authored-by: Gaiejj <524339208@qq.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: zmsn-2077 <jiamg.ji@gmail.com>
Co-authored-by: borong <borongzh@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: ruiyang sun <rockmagma02@gmail.com> Co-authored-by: zmsn-2077 <73586554+zmsn-2077@users.noreply.github.com> Co-authored-by: friedmainfunction <73703265+friedmainfunction@users.noreply.github.com> Co-authored-by: Gaiejj <524339208@qq.com> Co-authored-by: zmsn-2077 <jiamg.ji@gmail.com> Co-authored-by: Ruiyang Sun <rockmagma02@gmail.com> Co-authored-by: Jiayi Zhou <108712610+Gaiejj@users.noreply.github.com> Co-authored-by: 1Asan <99461435+1Asan@users.noreply.github.com> fix(algo): fix no return in algo_wrapper::learn (PKU-Alignment#122) fix(logger, wrapper): support csv file and velocity tasks (PKU-Alignment#131) fix typo. (PKU-Alignment#134) fix(ppo): fix entropy loss (PKU-Alignment#135) fix bugs (PKU-Alignment#136) fix: support new config for exp_grid (PKU-Alignment#142) fix(rollout, exp_grid): fix logdir path conflict (PKU-Alignment#145) fix(on-policy): fix the second order algorithms performance (PKU-Alignment#147)
…lignment#162) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: borong <borongzh@gmail.com>
@@ -0,0 +1,74 @@
# Copyright 2022-2023 OmniSafe Team. All Rights Reserved.
2023
@@ -0,0 +1,75 @@
# Copyright 2022-2023 OmniSafe Team. All Rights Reserved.
2022
@@ -0,0 +1,63 @@
# Copyright 2022-2023 OmniSafe Team. All Rights Reserved.
2023
LGTM.
"""Initialize the off-policy adapter.

Args:
    env_id: The environment id.
How about adding the parameter types to the docstring?
omnisafe/adapter/online_adapter.py
Outdated
Args:
    env_id: The environment id.
    num_envs: The number of environments.
How about adding the parameter types to the docstring?
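One possible way to address this comment, as a sketch only: annotate the constructor signature and repeat the types in the `Args` section. The class name `AdapterSketch` is a hypothetical stand-in rather than the adapter in this PR, and the `str` and `int` types are assumptions inferred from the parameter descriptions above.

```python
class AdapterSketch:
    """Hypothetical stand-in for the adapter under review (not OmniSafe code)."""

    def __init__(self, env_id: str, num_envs: int) -> None:
        """Initialize the adapter.

        Args:
            env_id (str): The environment id.
            num_envs (int): The number of environments.
        """
        # The types here are assumed from the docstring wording, not from the PR.
        self.env_id = env_id
        self.num_envs = num_envs
```

Keeping the types in both the signature and the docstring lets static checkers and rendered documentation show the same information.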
LGTM.
Description
feat(off-policy): support off-policy lag
Types of changes
What types of changes does your code introduce? Put an x in all the boxes that apply:
Checklist
Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!
make format. (required)
make lint. (required)
make test pass. (required)