MARO v0.3: a new design of RL Toolkit, CLI refactorization, and corresponding updates. #539

Merged
merged 579 commits into from
Jun 1, 2022
Changes from 250 commits
Commits
3456b43
refined proxy coding style
May 27, 2021
fe96364
merge v0.2_rl_refinement to get rl toolkit updates
Jinyu-W May 28, 2021
fed8467
updated images and refined doc
May 28, 2021
03bcb41
updated images
May 28, 2021
6ef102f
Merge branch 'master' into v0.2
Jinyu-W May 31, 2021
12ba200
Merge branch 'v0.2' into v0.2_rl_refinement
Jinyu-W May 31, 2021
fda6ab5
merge v0.2_rl_refinment
Jinyu-W May 31, 2021
f044471
updated CIM-AC example
May 31, 2021
1692871
refined proxy retry logic
Jun 1, 2021
e3cbe14
Merge branch 'v0.2_rl_refinement' of github.com:microsoft/maro into v…
Jun 1, 2021
2a3e90d
call policy update only for AbsCorePolicy
Jinyu-W Jun 1, 2021
2e925ab
add limitation of AbsCorePolicy in Actor.collect()
Jinyu-W Jun 2, 2021
e3710e3
modify the supply_chain example to use the new rl toolkit architecture
Jinyu-W Jun 2, 2021
03c26b8
refined actor to return only experiences for policies that received n…
Jun 2, 2021
aff0f44
fix MsgKey issue in rollout_manager
Jinyu-W Jun 2, 2021
fba2ccf
Merge branch 'v0.2_rl_refinement' into v0.2_sc_0506_updated
Jinyu-W Jun 2, 2021
8c05562
fix typo in learner
Jinyu-W Jun 2, 2021
ce155d4
Merge branch 'v0.2_rl_refinement' into v0.2_sc_0506_updated
Jinyu-W Jun 2, 2021
cc0c555
call exit function for parallel rollout manager
Jinyu-W Jun 3, 2021
74ede26
Merge branch 'v0.2_rl_refinement' into v0.2_sc_0506_updated
Jinyu-W Jun 3, 2021
034e5bb
update supply chain example distributed training scripts
Jinyu-W Jun 3, 2021
13d7d9b
1. moved exploration scheduling to rollout manager; 2. fixed bug in l…
Jun 3, 2021
5337aa9
fixed merge conflicts
Jun 3, 2021
6dbfd36
reformat render
Jinyu-W Jun 3, 2021
146eeb6
fix supply chain business engine action type problem
Jinyu-W Jun 3, 2021
a80f6c3
reset supply chain example render figsize from 4 to 3
Jinyu-W Jun 3, 2021
731bc59
Add render to all modes of supply chain example
Jinyu-W Jun 3, 2021
ebe5065
fix or policy typos
Jinyu-W Jun 3, 2021
0549a14
1. added parallel policy manager prototype; 2. used training ep for e…
Jun 4, 2021
023a9d3
refined parallel policy manager
Jun 9, 2021
5a57e01
updated rl/__init__/py
Jun 9, 2021
1f62f3a
fixed lint issues and CIM local learner bugs
Jun 9, 2021
6208d86
deleted unwanted supply_chain test files
Jun 9, 2021
11ca4be
revised default config for cim-dqn
Jun 9, 2021
36d4178
removed test_store.py as it is no longer needed
Jun 10, 2021
0fab08b
1. changed Actor class to rollout_worker function; 2. renamed algorit…
Jun 11, 2021
3b5faeb
updated figures
Jun 11, 2021
7911162
removed unwanted import
Jun 11, 2021
4f2182f
refactored CIM-DQN example
Jun 15, 2021
2b1541b
added MultiProcessRolloutManager and MultiProcessTrainingManager
Jun 16, 2021
6392fcf
updated doc
Jun 17, 2021
5089f7c
lint issue fix
Jun 18, 2021
41a7b27
lint issue fix
Jun 18, 2021
35cf25a
fixed import formatting
Jun 18, 2021
ceadf4f
[Feature] Prioritized Experience Replay (#355)
ysqyang Jun 18, 2021
248d1e4
rm AbsDecisionGenerator
Jun 18, 2021
721d91b
Merge branch 'v0.2_rl_refinement' of github.com:microsoft/maro into v…
Jun 18, 2021
85e304a
small fixes
Jun 18, 2021
2601970
bug fix
Jun 18, 2021
f72e884
reorganized training folder structure
Jun 20, 2021
4f4d5bb
fixed lint issues
Jun 20, 2021
96b9cce
fixed lint issues
Jun 20, 2021
78c225a
policy manager refined
Jun 21, 2021
9acae80
lint fix
Jun 21, 2021
424cabb
restructured CIM-dqn sync code
Jun 21, 2021
18f73f2
added policy version index and used it as a measure of experience sta…
Jun 22, 2021
49d93c2
lint issue fix
Jun 22, 2021
bc96c5e
lint issue fix
Jun 22, 2021
1bb4b56
switched log_dir and proxy_kwargs order
Jun 22, 2021
20c6385
cim example refinement
Jun 23, 2021
42c24ab
eval schedule sorted only when it's a list
Jun 28, 2021
8db90d5
eval schedule sorted only when it's a list
Jun 28, 2021
81f574a
update sc env wrapper
Jinyu-W Jun 28, 2021
5ad21e4
added docker scripts for cim-dqn
Jun 28, 2021
bb25e71
Merge branch 'master' into v0.2
Jinyu-W Jun 29, 2021
a56d4c2
refactored example folder structure and added workflow templates
Jun 29, 2021
2525327
fixed merge conflicts
Jun 29, 2021
f427b07
fixed lint issues
Jun 30, 2021
b8dc7e4
fixed lint issues
Jun 30, 2021
92a51da
fixed template bugs
Jun 30, 2021
31b68f3
removed unused imports
Jun 30, 2021
bab8128
refactoring sc in progress
Jun 30, 2021
f964924
simplified cim meta
Jun 30, 2021
f9ccf2a
updated sc code
Jun 30, 2021
5ad3e54
fixed build.sh path bug
Jun 30, 2021
916d8ad
refined sc and template code
Jun 30, 2021
06c1cd3
template refinement
Jun 30, 2021
c17557c
fixed merge conflicts
Jun 30, 2021
ff76caa
deleted obsolete svgs
Jul 1, 2021
4842d16
merged with remote
Jul 1, 2021
35e55a7
updated learner logs
Jul 1, 2021
ae1e93f
minor edits
Jul 1, 2021
04c53e6
refactored templates for easy merge with async PR
Jul 1, 2021
1315f04
added component names for rollout manager and policy manager
Jul 1, 2021
de40647
fixed incorrect position to add last episode to eval schedule
Jul 1, 2021
360240f
added max_lag option in templates
Jul 1, 2021
315e85f
formatting edit in docker_compose_yml script
Jul 1, 2021
953c873
moved local learner and early stopper outside sync_tools
Jul 1, 2021
ed9d44a
refactored rl toolkit folder structure
Jul 1, 2021
d2b433e
refactored rl toolkit folder structure
Jul 2, 2021
9f799d4
moved env_wrapper and agent_wrapper inside rl/learner
Jul 2, 2021
f8cccca
refined scripts
Jul 2, 2021
a4491a7
modified sc imports according to changes in rl toolkit folder structure
Jul 2, 2021
0906577
fixed typo in script
Jul 2, 2021
a13322b
changes needed for running sc
Jul 2, 2021
8ec0282
removed unwanted imports
Jul 2, 2021
56af26a
Merge branch 'v0.2_rl_refinement' into v0.2_rl_refinement_sc
Jul 2, 2021
894c376
config change for testing sc scenario
Jul 2, 2021
743e9f3
changes for perf testing
Jul 4, 2021
8e97adc
Asynchronous Training (#364)
ysqyang Jul 5, 2021
fac6006
renamed sync to synchronous and async to asynchronous to avoid confli…
Jul 5, 2021
0004dfe
fixed merge conflicts
Jul 5, 2021
60a7423
added missing policy version increment in LocalPolicyManager
Jul 5, 2021
c004697
Merge remote-tracking branch 'origin/v0.2_rl_refinement' into v0.2_rl…
Jul 5, 2021
a163554
refined rollout manager recv logic
Jul 5, 2021
803faad
removed a debugging print
Jul 5, 2021
0c10f36
moved supply_chain inside examples/rl
Jul 5, 2021
34b47a5
added sleep in distributed launcher to avoid hanging
Jul 6, 2021
c41ca35
updated api doc and rl toolkit doc
Jul 7, 2021
a2244b5
refined dynamic imports using importlib
Jul 7, 2021
edf9df4
Merge branch 'master' into v0.2
Jul 8, 2021
740efa7
1. moved policy update triggers to policy manager; 2. added version c…
Jul 8, 2021
c278693
fixed a few bugs and updated cim RL example
Jul 8, 2021
455751a
fixed a few more bugs
Jul 8, 2021
ef50957
resolved merge conflicts
Jul 8, 2021
9a04a99
Merge remote-tracking branch 'origin/v0.2' into v0.2_rl_refinement
Jul 9, 2021
746f0f9
added agent wrapper instantiation to workflows
Jul 9, 2021
18cd676
added agent wrapper instantiation to workflows
Jul 9, 2021
c5cf9df
removed abs_block and added max_prob option for DiscretePolicyNet and…
Jul 9, 2021
1f3b590
fixed incorrect get_ac_policy signature for CIM
Jul 9, 2021
dd017d3
moved exploration inside core policy
Jul 9, 2021
98d0961
added state to exploration call to support context-dependent exploration
Jul 11, 2021
17f1655
updated sc example according to RL toolkit changes
Jul 11, 2021
bbb6ba7
separated non_rl_policy_index and rl_policy_index in workflows
Jul 11, 2021
c70105d
Merge branch 'v0.2_rl_refinement' into v0.2_rl_refinement_sc
Jul 11, 2021
f004fba
modified sc example code according to workflow changes
Jul 11, 2021
2be9114
modified sc example code according to workflow changes
Jul 11, 2021
9b04ad5
added replay_agent_ids parameter to get_env_func for RL examples
Jul 12, 2021
c004323
Merge branch 'v0.2_rl_refinement' into v0.2_rl_refinement_sc
Jul 12, 2021
700b149
fixed a few bugs
Jul 12, 2021
b9afaef
added maro/simulator/scenarios/supply_chain as bind mount
Jul 12, 2021
87066c9
added post-step, post-collect, post-eval and post-update callbacks
Jul 14, 2021
f0a29ef
fixed lint issues
Jul 14, 2021
56fd2d6
fixed lint issues
Jul 14, 2021
cb533fa
fixed some bugs
Jul 15, 2021
d2d66cd
moved instantiation of policy manager inside simple learner
Jul 15, 2021
513ca40
Merge branch 'v0.2_rl_refinement' into v0.2_rl_refinement_sc
Jul 15, 2021
07fba7a
fixed env_wrapper get_reward signature
Jul 15, 2021
a9e6b11
minor edits
Jul 15, 2021
1d5c242
Merge branch 'v0.2_rl_refinement' into v0.2_rl_refinement_sc
Jul 15, 2021
8a84b11
removed get_eperience kwargs from env_wrapper
Jul 15, 2021
2cc0f7b
Merge branch 'v0.2_rl_refinement' into v0.2_rl_refinement_sc
Jul 15, 2021
ec338fb
1. renamed step_callback to post_step in env_wrapper; 2. added get_ev…
Jul 15, 2021
8f00dc7
Merge branch 'v0.2_rl_refinement' into v0.2_rl_refinement_sc
Jul 15, 2021
1c94b62
added rollout exp disribution option in RL examples
Jul 15, 2021
2b04cc0
fixed merge conflicts
Jul 15, 2021
36092c2
Merge branch 'v0.2_sc_0506_updated' into v0.2_sc
lihuoran Jul 16, 2021
b4f5afa
Merge branch 'v0.2_sc' into v0.2_rl_refinement_sc
lihuoran Jul 16, 2021
4252a0c
removed unwanted files
Jul 16, 2021
c2c1c62
1. made logger internal in learner; 2 removed logger creation in abs …
Jul 16, 2021
be82de3
fixed merge conflicts
Jul 16, 2021
0a05184
checked out supply chain test files from v0.2_sc
Jul 16, 2021
c7bca77
1. added missing model.eval() to choose_action; 2.added entropy featu…
Jul 19, 2021
81245dd
fixed a bug in ac entropy
Jul 19, 2021
e56e9b8
abbreviated coefficient to coeff
Jul 19, 2021
072d9de
removed -dqn from job name in rl example config
Jul 22, 2021
103eb40
added tmp patch to dev.df
Jul 22, 2021
9c5e135
renamed image name for running rl examples
Jul 22, 2021
d96aa44
added get_loss interface for core policies
Jul 28, 2021
1f37369
added policy manager in rl_toolkit.rst
Jul 30, 2021
12ac058
1. env_wrapper bug fix; 2. policy manager update logic refinement
Jul 30, 2021
fc14e66
refactored policy and algorithms
Aug 3, 2021
7702eba
policy interface redesigned
Aug 5, 2021
704c17f
refined policy interfaces
Aug 8, 2021
56a54cb
fixed typo
Aug 8, 2021
0b57d70
fixed bugs in refactored policy interface
Aug 9, 2021
cad2872
fixed some bugs
Aug 9, 2021
3ba96d4
refactoring in progress
Aug 11, 2021
5f6c47c
policy interface and policy manager redesigned
Aug 17, 2021
cb8a355
1. fixed bugs in ac and pg; 2. fixed bugs rl workflow scripts
Aug 18, 2021
f0222a7
fixed bug in distributed policy manager
Aug 18, 2021
c0a8480
fixed lint issues
Aug 18, 2021
3a10544
fixed lint issues
Aug 18, 2021
026bcd3
added scipy in setup
Aug 18, 2021
00df5d8
1. trimmed rollout manager code; 2. added option to docker scripts
Aug 19, 2021
8619408
updated api doc for policy manager
Aug 20, 2021
ca7b0d9
1. simplified rl/learning code structure; 2. fixed bugs in rl example…
Aug 23, 2021
aefd3b5
1. simplified rl example structure; 2. fixed lint issues
Aug 23, 2021
db99ce2
further rl toolkit code simplifications
Aug 25, 2021
b3a244d
more numpy-based optimization in RL toolkit
Aug 26, 2021
505cf4e
moved replay buffer inside policy
Aug 27, 2021
af1eed6
bug fixes
Aug 27, 2021
e924495
numpy optimization and associated refactoring
Aug 29, 2021
7c407a4
extracted shaping logic out of env_sampler
Aug 31, 2021
07a051b
fixed bug in CIM shaping and lint issues
Aug 31, 2021
6a027fa
preliminary implemetation of parallel batch inference
Sep 1, 2021
fde7895
fixed bug in ddpg transition recording
Sep 2, 2021
b9010ef
put get_state, get_env_actions, get_reward back in EnvSampler
Sep 2, 2021
aa69409
simplified exploration and core model interfaces
Sep 5, 2021
2dbf3c3
bug fixes and doc update
Sep 6, 2021
f136e3c
added improve() interface for RLPolicy for single-thread support
Sep 6, 2021
92561f6
fixed simple policy manager bug
Sep 7, 2021
013d0fb
updated doc, rst, notebook
Sep 11, 2021
8f652b4
updated notebook
Sep 11, 2021
8dd708f
fixed lint issues
Sep 11, 2021
971fd04
fixed entropy bugs in ac.py
Sep 12, 2021
bf3cadb
reverted to simple policy manager as default
Sep 12, 2021
e89b6db
1. unified single-thread and distributed mode in learning_loop.py; 2.…
Sep 14, 2021
3738bd1
fixed lint issues and updated rl toolkit images
Sep 14, 2021
69c5a56
removed obsolete images
Sep 15, 2021
372d44c
Merge branch 'v0.2_rl_refinement' of github.com:microsoft/maro into v…
Sep 15, 2021
9030200
added back agent2policy for general workflow use
Sep 15, 2021
f2dd5c0
V0.2 rl refinement dist (#377)
buptchan Sep 16, 2021
1a70410
Merge branch 'v0.2_rl_refinement' of github.com:microsoft/maro into v…
Sep 17, 2021
5f70b65
added checkpointing for simple and multi-process policy managers
Sep 17, 2021
7b76dce
1. bug fixes in checkpointing; 2. removed version and max_lag in roll…
Sep 17, 2021
c1d9871
added missing set_state and get_state for CIM policies
Sep 17, 2021
fc59379
removed blank line
Sep 17, 2021
f23fdc6
updated RL workflow README
Sep 22, 2021
78a2cb8
Integrate `data_parallel` arguments into `worker_allocator` (#402)
buptchan Sep 22, 2021
f8f2e6a
1. simplified workflow config; 2. added comments to CIM shaping
Sep 22, 2021
0b5fcd1
lint issue fix
Sep 22, 2021
190802f
1. added algorithm type setting in CIM config; 2. added try-except cl…
Sep 22, 2021
6e941a4
1. moved post_step callback inside env sampler; 2. updated README for…
Sep 24, 2021
1edd4c4
refined READEME for CIM
Sep 24, 2021
2b4d4eb
VM scheduling with RL (#375)
ysqyang Sep 26, 2021
3a928b9
SC refinement (#397)
lihuoran Sep 26, 2021
ea7fdde
refined workflow scripts
Oct 9, 2021
c1f8faf
fixed bug in ParallelAgentWrapper
Oct 9, 2021
cf1430a
1. fixed lint issues; 2. refined main script in workflows
Oct 10, 2021
485ffd7
lint issue fix
Oct 10, 2021
4e1d37c
restored default config for rl example
Oct 10, 2021
5b21e67
Update rollout.py
ysqyang Oct 10, 2021
868bd53
refined env var processing in policy manager workflow
Oct 11, 2021
12ffd98
added hasattr check in agent wrapper
Oct 12, 2021
c0bae0b
updated docker_compose_yml.py
Oct 12, 2021
a5ddfd5
Minor refinement
lihuoran Oct 13, 2021
0f2f83e
Merge branch 'v0.2_rl_refinement' into v0.3
lihuoran Oct 14, 2021
6a1179c
Minor PR. Prepare to merge latest master branch into v0.3 branch. (#412)
lihuoran Dec 6, 2021
ff0f706
Merge latest master into v0.3 (#426)
lihuoran Dec 8, 2021
8a25f9e
Change `Env.set_seed()` logic (#456)
lihuoran Jan 24, 2022
526627c
Remove all SC related files (#473)
lihuoran Mar 4, 2022
696f5b5
RL Toolkit V3 (#471)
lihuoran Mar 7, 2022
7b3d78a
RL renaming v2 (#476)
lihuoran Mar 9, 2022
00fbcee
Cherry pick latest RL (#498)
lihuoran Mar 31, 2022
0e11ae9
Cherry pick RL changes from `sc_refinement` (latest commit: `2a4869`)…
lihuoran Apr 22, 2022
1219513
RL incremental refactor (#501)
lihuoran Apr 24, 2022
333986f
RL component bundle (#513)
lihuoran May 10, 2022
ae83ac0
Add method to get mapping of available tick to frame index (#415)
chaosddp May 16, 2022
10b9c02
Cherry pick from sc_refinement (#527)
lihuoran May 18, 2022
0d132cc
Refine `terminal` / `next_agent_state` logic (#531)
lihuoran May 25, 2022
a3dade7
Merge master into v0.3 (#536)
lihuoran May 31, 2022
cdadbc9
Merge branch 'v0.3' into merge_v0.3_into_master
lihuoran Jun 1, 2022
e0209d6
Remove random_config.py
lihuoran Jun 1, 2022
c135e73
Remove test_trajectory_utils.py
lihuoran Jun 1, 2022
fb77545
Pass tests
lihuoran Jun 1, 2022
c144c3b
Update rl docs
lihuoran Jun 1, 2022
0d6ac99
Remove python 3.6 in test
lihuoran Jun 1, 2022
5ce9e4c
Merge branch 'master' into merge_v0.3_into_master
Jinyu-W Jun 1, 2022
2e4b436
Update docs
lihuoran Jun 1, 2022
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -14,7 +14,7 @@ jobs:
     strategy:
       matrix:
         os: [ubuntu-18.04, windows-latest, macos-latest]
-        python-version: [3.6, 3.7, 3.8, 3.9]
+        python-version: [3.7, 3.8, 3.9]

     steps:
       - uses: actions/checkout@v2
12 changes: 8 additions & 4 deletions .gitignore
@@ -3,6 +3,7 @@
*.pyd
*.log
*.csv
*.parquet
*.c
*.cpp
*.DS_Store
@@ -12,15 +13,18 @@
.vs/
build/
log/
logs/
checkpoint/
checkpoints/
streamit/
dist/
*.egg-info/
tools/schedule
docs/_build
test/
data/
.eggs/
maro_venv/
pyvenv.cfg
htmlcov/
.coverage
.coveragerc
.coverage
.coveragerc
.tmp/
36 changes: 36 additions & 0 deletions docker_files/dev.df
@@ -0,0 +1,36 @@
FROM python:3.7-buster
WORKDIR /maro

# Install Apt packages
RUN apt-get update --fix-missing
RUN apt-get install -y apt-utils
RUN apt-get install -y sudo
RUN apt-get install -y gcc
RUN apt-get install -y libcurl4 libcurl4-openssl-dev libssl-dev curl
RUN apt-get install -y libzmq3-dev
RUN apt-get install -y python3-pip
RUN apt-get install -y python3-dev libpython3.7-dev python-numpy
RUN rm -rf /var/lib/apt/lists/*

# Install Python packages
RUN pip install --upgrade pip
RUN pip install --no-cache-dir Cython==0.29.14
RUN pip install --no-cache-dir pyaml==20.4.0
RUN pip install --no-cache-dir pyzmq==19.0.2
RUN pip install --no-cache-dir numpy==1.19.1
RUN pip install --no-cache-dir matplotlib
RUN pip install --no-cache-dir torch==1.6.0
RUN pip install --no-cache-dir scipy
RUN pip install --no-cache-dir matplotlib
RUN pip install --no-cache-dir redis
RUN pip install --no-cache-dir networkx

COPY maro /maro/maro
COPY scripts /maro/scripts/
COPY setup.py /maro/
RUN bash /maro/scripts/install_maro.sh
RUN pip cache purge

ENV PYTHONPATH=/maro

CMD ["/bin/bash"]