Refine RL workflow & tune RL models under GYM #577

lihuoran · 2023-02-06T05:58:28Z

Description

Refine RL workflow
- Refine rollout workflow. Now we can:
  - Run specific number of steps in rollout.
  - Use num_eval_episodes to control the number of episodes during evaluation.
- Add AbsEnvSampler.metrics. Let env samplers use this attribute to manage metrics during rollout.
- Add Callback, a general interface for adding customized operations in each phase of the workflow. Add two Callback instances: Checkpoint & MetricsRecorder.
- Other minor code refinements.
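
The workflow changes listed above can be pictured with a small toy loop. This is only a sketch under stated assumptions: the hook names (`on_rollout_end`, `on_evaluation_end`), the `run_workflow` function, and the random "rewards" are hypothetical stand-ins, not MARO's actual `Callback`, `Checkpoint`, or `MetricsRecorder` API. It shows a step-based rollout, an evaluation averaged over `num_eval_episodes`, and callbacks invoked after each phase.

```python
import random
from typing import Dict, List


class Callback:
    """Hypothetical base class: every hook is a no-op by default."""

    def on_rollout_end(self, step_count: int, metrics: Dict[str, float]) -> None:
        pass

    def on_evaluation_end(self, step_count: int, metrics: Dict[str, float]) -> None:
        pass


class MetricsRecorder(Callback):
    """Keeps every evaluation result in memory (a real one might write a file)."""

    def __init__(self) -> None:
        self.history: List[Dict[str, float]] = []

    def on_evaluation_end(self, step_count: int, metrics: Dict[str, float]) -> None:
        self.history.append({"step": step_count, **metrics})


class Checkpoint(Callback):
    """Records the step counts at which a checkpoint would be saved."""

    def __init__(self, interval: int) -> None:
        self.interval = interval
        self.saved_at: List[int] = []
        self._rollouts = 0

    def on_rollout_end(self, step_count: int, metrics: Dict[str, float]) -> None:
        self._rollouts += 1
        if self._rollouts % self.interval == 0:
            self.saved_at.append(step_count)


def run_workflow(
    num_rollouts: int,
    steps_per_rollout: int,
    num_eval_episodes: int,
    callbacks: List[Callback],
    seed: int = 0,
) -> int:
    """Toy loop: random numbers stand in for environment rewards and returns."""
    rng = random.Random(seed)
    total_steps = 0
    for _ in range(num_rollouts):
        # Step-based rollout: collect a fixed number of environment steps.
        rewards = [rng.random() for _ in range(steps_per_rollout)]
        total_steps += steps_per_rollout
        for cb in callbacks:
            cb.on_rollout_end(total_steps, {"mean_reward": sum(rewards) / len(rewards)})
        # Evaluation: average the return over num_eval_episodes episodes.
        returns = [rng.random() for _ in range(num_eval_episodes)]
        for cb in callbacks:
            cb.on_evaluation_end(total_steps, {"avg_return": sum(returns) / num_eval_episodes})
    return total_steps


recorder = MetricsRecorder()
checkpoint = Checkpoint(interval=2)
total = run_workflow(4, 100, 5, [recorder, checkpoint])
```

Keeping the hooks as no-ops on the base class means a new callback only overrides the phases it cares about, which is one plausible reason to prefer an interface like this over hard-coded checkpoint and logging calls.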
Tune RL models under GYM
- Add DDPG. Optimize the performance of PPO, SAC, and DDPG.
- Fix several RL algorithm bugs.
- Re-organize RL jobs' output paths.
- Re-organize RL test's file structure.

Linked issue(s)/Pull request(s)

issue_number

Type of Change

Related Component

Simulation toolkit
RL toolkit
Distributed toolkit

Has Been Tested

OS:
- Windows
- Mac OS
- Linux
Python version:
- 3.7
- 3.8
- 3.9
Key information snapshot(s):

Needs Follow Up Actions

New release package
New docker image

Checklist

Add/update the related comments
Add/update the related tests
Add/update the related documentations
Update the dependent downstream modules usage

Co-authored-by: Jinyu Wang <wang.jinyu@microsoft.com>

codecov · 2023-02-06T06:01:35Z

Codecov Report

Merging #577 (9371949) into v0.3 (214383f) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             v0.3     #577   +/-   ##
=======================================
  Coverage   79.30%   79.30%           
=======================================
  Files          86       86           
  Lines        5464     5464           
=======================================
  Hits         4333     4333           
  Misses       1131     1131

| Flag | Coverage Δ |
| --- | --- |
| unittests | `79.30% <ø> (ø)` |
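
As a quick sanity check, the percentage in the report can be recomputed from the hit and line counts shown in the diff. The variable names below are illustrative only.

```python
# Counts taken from the Codecov diff above.
lines = 5464
hits = 4333

misses = lines - hits                 # lines not exercised by any test
coverage_pct = 100 * hits / lines     # share of lines that were hit

summary = f"{coverage_pct:.2f}%"
```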


Jinyu-W · 2023-02-08T04:01:04Z

examples/rl/cim_distributed.yml

@@ -10,7 +10,7 @@

 job: cim_rl_workflow


TODO: runtime error

maro/rl/workflows/callback.py

maro/rl/rollout/env_sampler.py

* fix sac to_device issue; update sac gym test parameters
* add rl test performance plot func
* update sac eval interval config
* update sac checkpoint interval config
* fix callback issue
* update plot func
* update plot func
* update plot func
* update performance doc; upload performance images
* Minor fix in callbacks; refine plot.py format.
* Add n_interactions. Use n_interactions to plot curves.
* pre-commit

---------

Co-authored-by: Huoran Li <huo53926@126.com>
Co-authored-by: Huoran Li <huoranli@microsoft.com>

lihuoran and others added 30 commits April 24, 2022 11:58

PPO, SAC, DDPG passed

6bc7f0b

Explore in SAC

b3f5aef

Test GYM on server

5dab711

Sync server changes

211c06f

Merge branch 'v0.3' into rl_benchmark_debug

f92f7f1

pre-commit

514250a

Ready to try on server

fc0c02d

.

9fcdf42

.

01b5a94

.

dd27eed

.

1c8f258

.

1aa1085

Performance OK

148af38

Move to tests

99ff7b9

Remove old versions

65ba1a1

PPO done

f4a85b8

Start to test AC

2349191

Start to test SAC

f6f7dae

SAC test passed

110fec4

Multiple round in evaluation

2a1ccd5

Modify config.yml

c371220

Add Callbacks

a65d902

[wip] SAC performance not good

aa484f8

[wip] still not good

84ec6e6

update for some PR comments; Add a MARKDOWN file (#576)

0ceaac4

Co-authored-by: Jinyu Wang <wang.jinyu@microsoft.com>

Use FullyConnected to replace mlp

aad41d9

Update action bound

8884231

Merge branch 'rl_benchmark_debug' into rl_workflow_refine

0a01fb1

???

0bd25ca

Change gym env wrapper metrics logic

8781dd6

lihuoran and others added 13 commits January 31, 2023 14:40

Change gym env wrapper metrics logic

7b9b698

refine env_sampler.sample under step mode

52b4d1d

Add DDPG. Performance not good...

a3fea0d

Add DDPG. Performance not good...

23f39d1

wip

9da8b90

Sounds like sac works

fb11c31

Refactor file structure

d7d3282

Refactor file structure

ea26275

Refactor file structure

8881a1c

Pre-commit

b4db842

Merge branch 'rl_benchmark_debug' into rl_workflow_refine

8874a65

Merge branch 'v0.3' into rl_workflow_refine

2a7334b

Pre commit

eb7ae9b

lihuoran requested a review from Jinyu-W February 6, 2023 05:58

lihuoran and others added 2 commits February 8, 2023 13:53

Minor refinement of CIM RL

627b7d1

Jinyu/rl workflow refine (#578)

8386312

* remove useless files; add device mapping; update pdoc
* add default checkpoint path; fix distributed worker log path issue; update example log path
* update performance doc
* remove tests/rl/algorithms folder

Jinyu-W reviewed Feb 9, 2023

lihuoran and others added 4 commits February 9, 2023 15:20

Resolve PR comments

b05c849

Compare PPO with spinning up (#579)

ab5e675

* [wip] compare PPO
* PPO matching
* Revert unnecessary changes
* Minor
* Minor

Episode truncation & early stopping (#581)

9371949

* Add truncated logic
* (To be tested) early stop
* Early stop test passed
* Test passed
* Random action. To be tested.
* Warmup OK
* Pre-commit
* random seed
* Revert pre-commit config
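
The commit above does not show its truncation logic here, so the following is only a sketch of one common convention, not the code from this PR. It assumes a hypothetical helper `td_target` and treats "terminated" (the environment reached a true end state) differently from "truncated" (the episode was cut off, for example by a step limit): only termination zeroes the bootstrap term, while either flag ends the episode.

```python
from typing import Tuple


def td_target(
    reward: float,
    next_value: float,
    terminated: bool,
    truncated: bool,
    gamma: float = 0.99,
) -> Tuple[float, bool]:
    """Return the one-step TD target and whether the episode is over.

    Hypothetical helper for illustration; the names are assumptions.
    """
    # A truncated episode did not really end, so keep bootstrapping from
    # the value estimate of the next state. A terminated one has no future.
    bootstrap = 0.0 if terminated else next_value
    episode_over = terminated or truncated
    return reward + gamma * bootstrap, episode_over


# True terminal state: no bootstrap.
target_term, over_term = td_target(1.0, 10.0, terminated=True, truncated=False, gamma=0.5)
# Cut off by a step limit: still bootstraps from next_value.
target_trunc, over_trunc = td_target(1.0, 10.0, terminated=False, truncated=True, gamma=0.5)
```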

Jinyu-W approved these changes Feb 17, 2023

Jinyu-W merged commit b8a955e into v0.3 Feb 17, 2023

lihuoran deleted the rl_workflow_refine branch February 22, 2023 04:53

Jinyu-W mentioned this pull request Mar 20, 2023

V0.3: Upgrade RL Workflow; Add RL Benchmarks; Update Package Version #588

Merged

21 tasks
