Skip to content

V0.2 update#262

Merged
Jinyu-W merged 72 commits into
masterfrom
v0.2
Jan 25, 2021
Merged

V0.2 update#262
Jinyu-W merged 72 commits into
masterfrom
v0.2

Conversation

@Jinyu-W
Copy link
Copy Markdown
Contributor

@Jinyu-W Jinyu-W commented Jan 25, 2021

Description

Linked issue(s)/Pull request(s)

Type of Change

  • Non-breaking bug fix
  • Breaking bug fix
  • New feature
  • Test
  • Doc update
  • Docker update

Related Component

  • Simulation toolkit
  • RL toolkit
  • Distributed toolkit

Has Been Tested

  • OS:
    • Windows
    • Mac OS
    • Linux
  • Python version:
    • 3.6
    • 3.7
  • Key information snapshot(s):

Needs Follow Up Actions

  • New release package
  • New docker image

Checklist

  • Add/update the related comments
  • Add/update the related tests
  • Add/update the related documentations
  • Update the dependent downstream modules usage

Arthur Jiang and others added 30 commits October 1, 2020 15:08
* feat: refine data push/pull

* test: add cli provision testing

* fix: style fix

* fix: add necessary comments

* fix: from code review
* fix deployment issue in multi envs

* fix typo

* fix ~/.maro not exist issue in build

* skip deploy when build

* update for comments

* temporarily disable weather info

* replace ecr with cim in setup.py

* replace ecr in manifest

* remove weather check when read data

* fix station id issue

* fix format

* add TODO in comments

* add noaa weather source

* fix weather reset and weather comment

* add comment for weather data url

* some format update

* add fall back function in weather download

* update comment

* update for comments

* update comment

* add period

* fix for pylint

* update for pylint check
* added example docs

* added citibike greedy example doc

* modified citibike doc

* fixed PR comments

* fixed more PR comments

* fixed small formatting issue

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
* switch the key and value of handler_dict in decorator

* add dist decorator UT and fixed multithreading conflict in maro test suite

* pr comments update.

* resolved comments about decorator UT

* rename handler_fun in dist decorator

* change self.attr into class_name.attr

* update UT tests comments
* refine the annotation of simulator core

* remove reward from env(be)

* format refined

* white spaces test

* left-padding spaces refined

* format modifed

* update the left-padding spaces of docstrings

* code format updated

* update according to comments

* update according to PR comments

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
* key_list of events added for env.summary

* code refined according to lint

* 2 kinds of Payload added for CIM scenario; citi bike summary refined according to comments

* code format refined

* try trigger the git tests

* update github workflow

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
* key_list of events added for env.summary

* code refined according to lint

* 2 kinds of Payload added for CIM scenario; citi bike summary refined according to comments

* code format refined

* try trigger the git tests

* update github workflow

* online LP example added for citi bike

* infeasible solution

* infeasible solution fixed: call snapshot before any env.step()

* experiment results of toy topos added

* experiment results of toy topos added

* experiment result update: better than naive baseline

* PuLP version added

* greedy experiment results update

* citibike result update

* modified according to PR comments

* update experiment results and forecasting comparison

* citi bike lp README updated

* README updated

* modified according to PR comments

* update according to PR comments

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu Wang <jinywan@microsoft.com>
* refined rl abstractions

* fixed formattin issues

* checked out error-code related code from v0.2_pg

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* renamed save_models to dump_models

* 1. set default batch_norm_enabled to True; 2. used state_dict in dqn model saving

* renamed dump_experience_store to dump_experience_pool

* fixed a bug in the dump_experience_pool method

* fixed some PR comments

* fixed more PR comments

* 1.fixed some PR comments; 2.added early_stopping_checker; 3.revised explorer class

* fixed cim example according to rl toolkit changes

* fixed some more PR comments

* rewrote multi_process_launcher to eliminate the distributed section in config

* 1. fixed a typo; 2. added logging before early stopping

* fixed a bug

* fixed a bug

* fixed a bug

* added early stopping feature to CIM exmaple

* fixed a typo

* fixed some issues with early stopping

* changed early stopping metric func

* fixed a bug

* fixed a bug

* added early stopping to dist mode cim

* added experience collecting func

* edited notebook according to changes in CIM example

* fixed bugs in nb

* fixed lint formatting issues

* fixed a typo

* fixed some PR comments

* fixed more PR comments

* revised docs

* removed nb output

* fixed a bug in simple_learner

* fixed a typo in nb

* fixed a bug

* fixed a bug

* fixed a bug

* removed unused import

* fixed a bug

* 1. changed early stopping default config; 2. renamed param in early stopping checker and added typing

* fixed some doc issues

* added output to nb

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
* 1. added logical operator overloading for early stopping checker; 2. added mean value checker

* fixed PR comments

* removed learner.exit() in single_process_launcher

* added another early stopping checker in example

* fixed PR comments and lint issues

* lint issue fix

* fixed lint issues

* fixed a bug

* fixed a bug

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
* replaced IdentityLayers with nn.Identity

* 1. added skip connection option in FC_net; 2. generalized learning model

* added skip_connection option in config

* removed type casting in fc_net

* fixed lint formatting issues

* refined docstring

* added multi-head functionality to LearningModel

* refined learning model docstring

* added head_key param in learningModel forward

* fixed PR comments

* added top layer logic and is_top option in fc_net

* fixed a bug

* fixed a bug

* reverted some changes in learning model

* reverted some changes in learning model

* added members to learning model to fix the mode issue

* fixed a bug

* fixed mode setting issue in learning model

* removed learner.exit() in single_process_launcher

* fixed PR comments

* fixed rl/__init__

* fixed issues in example

* fixed a bug

* fixed a bug

* fixed lint formatting issues

* moved reward type casting to exp shaper

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
* added dueling action value model

* renamed params in dueling_action_value_model

* renamed shared_features to features

* replaced IdentityLayers with nn.Identity

* 1. added skip connection option in FC_net; 2. generalized learning model

* added skip_connection option in config

* removed type casting in fc_net

* fixed lint formatting issues

* refined docstring

* mv dueling_actiovalue_model and fixed some bugs

* added multi-head functionality to LearningModel

* refined learning model docstring

* added head_key param in learningModel forward

* added double DQN and dueling features to DQN

* fixed a bug

* added DuelingQModelHead enum

* fixed a bug

* removed unwanted file

* fixed PR comments

* added top layer logic and is_top option in fc_net

* fixed a bug

* fixed a bug

* reverted some changes in learning model

* reverted some changes in learning model

* added members to learning model to fix the mode issue

* fixed a bug

* fixed mode setting issue in learning model

* fixed PR comments

* revised cim example according to DQN changes

* renamed eval_model to q_value_model in cim example

* more fixes

* fixed a bug

* fixed a bug

* added doc per PR comments

* removed learner.exit() in single_process_launcher

* removed learner.exit() in single_process_launcher

* fixed PR comments

* fixed rl/__init__

* fixed issues in example

* fixed a bug

* fixed a bug

* fixed lint formatting issues

* double DQN feature

* fixed a bug

* fixed a bug

* fixed PR comments

* fixed lint issue

* 1. fixed PR comments related to load/dump; 2. removed abstract load/dump methods from AbsAlgorithm

* added load_models in simple_learner

* minor docstring edits

* minor docstring edits

* set is_double to true in DQN config

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Arthur Jiang <ArthurSJiang@gmail.com>
* feat: support predefined image provision

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* fix: error scripts invocation after using relative import

* fix: missing init.py

* fixed a bug in learner's test()

* feat: add distributed_config for dqn example

* test: update test for grass

* test: update test for k8s

* feat: add promptings for steps

* fix: change relative imports to absolute imports

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Arthur Jiang <ArthurSJiang@gmail.com>
kyu-kuanwei and others added 21 commits December 29, 2020 15:28
* Fix typo

* Refining vm scheduling docs
* updated docs and images for rl toolkit

* 1. fixed import formats for maro/rl; 2. changed decorators to hypers in store

* fixed lint issues

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
vm scenario: fix the event type bug of the postpone event
* updated docs and images for rl toolkit

* updated cim example doc

* updated cim exmaple docs

* updated cim example rst

* updated rl_toolkit and cim example docs

* replaced q_module with q_net in example rst

* refined doc

* refined doc

* updated figures

* updated figures

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
* Implemented dump snapshots and convert to CSV.

* Let BE supports params when dump snapshot.

* Refactor dump code to core.py

* Implemented decision event dump.

* replace is not '' with !=''

* Fixed issues that code review mentioned.

* removed path from hello.py

* Changed import sort.

* Fix  import sorting in citi_bike/business_engine

* visualization 0.1

* Updated lint configurations.

* Fixed formatting error that caused lint errors.

* render html title function

* Try to fix lint errors.

* flake-8 style fix

* remove space around 18,35

* dump_csv_converter.py re-formatting.

* files re-formatting.

* style fixed

* tab delete

* white space fix

* white space fix-2

* vis redundant function delete

* refine

* re-formatting after merged upstream.

* Updated import section.

* Updated import section.

* pr refine

* isort fix

* white space

* lint error

* \n error

* test continuation

* indent

* continuation of indent

* indent 0.3

* comment update

* comment update 0.2

* f-string update

* f-string 0.2

* lint 0.3

* lint 0.4

* lint 0.4

* lint 0.5

* lint 0.6

* docstring update

* data version deploy update

* condition update

* add whitespace

* V0.2 vis dump feature enhancement. (#190)

* Dumps added manifest file.
* Code updated format by flake8
* Changed manifest file format for easy reading.

* deploy info update; docs update

* weird white space

* Update dashboard_visualization.md

* new endline?

* delete dependency

* delete irrelevant file

* change scenario to enum, divide file path into a separated class

* doc refine

* doc update

* params type

* data structure update

* doc&enum, formula refine

* refine

* add ut, refine doc

* style refine

* isort

* strong type fix

* os._exit delete

* revert datalib

* import new line

* change test case

* change file name & doc

* change deploy path

* delete params

* revert file

* delete duplicate file

* delete single process

* update naming

* manually change import order

* delete blank

* edit error

* requirement txt

* style fix & refine

* comments&docstring refine

* add parameter name

* test & dump

* comments update

* Added manifest file. (#201)

Only a few changes that need to meet requirements of manifest file format.

* comments fix

* delete toolkit change

* doc update

* citi bike update

* deploy path

* datalib update

* revert datalib

* revert

* maro file format

* comments update

* doc update

* update param name

* doc update

* new link

* image update

* V0.2 visualization-0.1 (#181)

* visualization 0.1

* render html title function

* flake-8 style fix

* style fixed

* tab delete

* white space fix

* white space fix-2

* vis redundant function delete

* refine

* pr refine

* isort fix

* white space

* lint error

* \n error

* test continuation

* indent

* continuation of indent

* indent 0.3

* comment update

* comment update 0.2

* f-string update

* f-string 0.2

* lint 0.3

* lint 0.4

* lint 0.4

* lint 0.5

* lint 0.6

* docstring update

* data version deploy update

* condition update

* add whitespace

* deploy info update; docs update

* weird white space

* Update dashboard_visualization.md

* new endline?

* delete dependency

* delete irrelevant file

* change scenario to enum, divide file path into a separated class

* fix the visualization of docs/key_components/distributed_toolkit

* doc refine

* doc update

* params type

* add examples into isort ignore

* data structure update

* doc&enum, formula refine

* refine

* add ut, refine doc

* style refine

* isort

* strong type fix

* os._exit delete

* revert datalib

* import new line

* change test case

* change file name & doc

* change deploy path

* delete params

* revert file

* delete duplicate file

* delete single process

* update naming

* manually change import order

* delete blank

* edit error

* requirement txt

* style fix & refine

* comments&docstring refine

* add parameter name

* test & dump

* comments update

* comments fix

* delete toolkit change

* doc update

* citi bike update

* deploy path

* datalib update

* revert datalib

* revert

* maro file format

* comments update

* doc update

* update param name

* doc update

* new link

* image update

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Miaoran Chen (Wicresoft) <v-miaorc@microsoft.com>

* image change

* add reset snapshot

* delete dump

* add new line

* add next steps

* import change

* relative import

* add init file

* import change

* change utils file

* change cliexpcetion to clierror

* dashboard test

* change result

* change assertation

* move not

* unit test change

* core change

* unit test delete name_mapping_file

* update cim business engine

* doc update

* change relative path

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* duc update

* duc update

* duc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* change import sequence

* comments update

* doc add pic

* add dependency

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* Update dashboard_visualization.rst

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* delete white space

* doc update

* doc update

* update doc

* update doc

* update doc

Co-authored-by: Michael Li <mic_lee2000@hotmail.com>
Co-authored-by: Miaoran Chen (Wicresoft) <v-miaorc@microsoft.com>
Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>
* Update process mode docs and fixed on premises

* Update orchestration docs

* Update process mode docs add JOB_NAME as env variable

* fixed bugs

* fixed isort issue

* update docs index

Co-authored-by: kaiqli <v-kaiqli@microsoft.com>
* moved optimizer options to LearningModel

* typo fix

* fixed lint issues

* updated notebook

* misc edits

* 1. renamed CIMAgent to DQNAgent; 2. moved create_dqn_agents to Agent section in notebook

* renamed single_host_cim_learner ot cim_learner in notebook

* updated notebook output

* typo fix

* removed dimension check in absence of shared stack

* fixed a typo

* fixed lint issues

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>
* update readme

* update version

* refine reademe format

* add vis gif

* add citation

* update citation

* update badge

Co-authored-by: Arthur Jiang <sjian@microsoft.com>
* Fix typo

* fix typo
* syntax fix

* syntax fix

* syntax fix

* rm unwanted import

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
* Remove topology

* Update pipeline

* Update pipeline

* Update pipeline

* Modify metafile

* Add two attributes of VM

* Update pipeline

* Add vm category

* Add todo

* Add oversub config

* Add oversubscription feature

* Lint fix

* Update based on PR comment.

* Update pipeline

* Update pipeline

* Update config.

* Update based on PR comment

* Update

* Add pm sku feature

* Add sku setting

* Add sku feature

* Lint fix

* Lint style

* Update sku, overloading

* Lint fix

* Lint style

* Fix bug

* Modify config

* Remove sky and replaced it by pm stype

* Add and refactor vm category

* Comment out cofig

* Unify the enum format

* Fix lint style

* Fix import order

* Update based on PR comment

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
* Fix data preparation bug

* Add frame index
* fixed a bug

* fixed lint issues

* added load/dump functions to LearningModel

* fixed a bug

* fixed a bug

* fixed lint issues

* merged with v0.2_embedded_optims

* refined DQN docstrings

* removed load/dump functions from DQN

* added task validator

* fixed decorator use

* fixed a typo

* fixed a bug

* revised

* fixed lint issues

* changed LearningModel's step() to take a single loss

* revised learning model design

* revised example

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* added decorator utils to algorithm

* fixed a bug

* renamed core_model to model

* fixed a bug

* 1. fixed lint formatting issues; 2. refined learning model docstrings

* rm trailing whitespaces

* added decorator for choose_action

* fixed a bug

* fixed a bug

* fixed version-related issues

* renamed add_zeroth_dim decorator to expand_dim

* overhauled exploration abstraction

* fixed a bug

* fixed a bug

* fixed a bug

* added exploration related methods to abs_agent

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* separated learning with exploration schedule and without

* small fixes

* moved explorer logic to actor side

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* removed unwanted param from simple agent manager

* small fixes

* revised code based on revised abstractions

* fixed some bugs

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* added shared_module property to LearningModel

* added shared_module property to LearningModel

* fixed a bug with k-step return in AC

* fixed a bug

* fixed a bug

* merged pg, ac and ppo examples

* fixed a bug

* fixed a bug

* fixed naming for ppo

* renamed some variables in PPO

* added ActionWithLogProbability return type for PO-type algorithms

* fixed a bug

* fixed a bug

* fixed lint issues

* revised __getstate__ for LearningModel

* fixed a bug

* added soft_update function to learningModel

* fixed a bug

* revised learningModel

* rm __getstate__ and __setstate__ from LearningModel

* added noise explorer

* formatting

* fixed formatting

* removed unnecessary comma

* removed unnecessary comma

* removed unnecessary comma

* fixed PR comments

* removed unwanted exception and imports

* removed unwanted exception and imports

* fixed a bug

* fixed PR comments

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issue

* fixed a bug

* fixed lint issue

* fixed naming

* combined exploration param generation and early stopping in scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* fixed lint issue

* moved logger inside scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* fixed lint issue

* removed epsilon parameter from choose_action

* removed epsilon parameter from choose_action

* changed agent manager's train parameter to experience_by_agent

* fixed some PR comments

* renamed zero_grad to zero_gradients in LearningModule

* fixed some PR comments

* bug fix

* bug fix

* bug fix

* removed explorer abstraction from agent

* added DEVICE env variable as first choice for torch device

* refined dqn example

* fixed lint issues

* removed unwanted import in cim example

* updated cim-dqn notebook

* simplified scheduler

* edited notebook according to merged scheduler changes

* refined dimension check for learning module manager and removed num_actions from DQNConfig

* bug fix for cim example

* added notebook output

* updated cim PO example code according to changes in maro/rl

* removed early stopping from CIM dqn example

* combined ac and ppo and simplified example code and config

* removed early stopping from cim example config

* moved decorator logic inside algorithms

* renamed early_stopping_callback to early_stopping_checker

* put PG and AC under PolicyOptimization class and refined examples accordingly

* fixed lint issues

* removed action_dim from noise explorer classes and added some shape checks

* modified NoiseExplorer's __call__ logic to batch processing

* made NoiseExplorer's __call__ return type np array

* renamed update to set_parameters in explorer

* fixed old naming in test_grass

* moved optimizer options to LearningModel

* typo fix

* fixed lint issues

* updated notebook

* updated cim example for policy optimization

* typo fix

* typo fix

* typo fix

* typo fix

* misc edits

* minor edits to rl_toolkit.rst

* checked out docs from master

* fixed typo in k-step shaper

* fixed lint issues

* bug fix in store

* lint issue fix

* changed default max_ep to 100 for policy_optimization algos

* vis doc update to master (#244)

* refine readme

* feat: refine data push/pull (#138)

* feat: refine data push/pull

* test: add cli provision testing

* fix: style fix

* fix: add necessary comments

* fix: from code review

* add fall back function in weather download (#112)

* fix deployment issue in multi envs

* fix typo

* fix ~/.maro not exist issue in build

* skip deploy when build

* update for comments

* temporarily disable weather info

* replace ecr with cim in setup.py

* replace ecr in manifest

* remove weather check when read data

* fix station id issue

* fix format

* add TODO in comments

* add noaa weather source

* fix weather reset and weather comment

* add comment for weather data url

* some format update

* add fall back function in weather download

* update comment

* update for comments

* update comment

* add period

* fix for pylint

* update for pylint check

* added example docs (#136)

* added example docs

* added citibike greedy example doc

* modified citibike doc

* fixed PR comments

* fixed more PR comments

* fixed small formatting issue

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* switch the key and value of handler_dict in decorator (#144)

* switch the key and value of handler_dict in decorator

* add dist decorator UT and fixed multithreading conflict in maro test suite

* pr comments update.

* resolved comments about decorator UT

* rename handler_fun in dist decorator

* change self.attr into class_name.attr

* update UT tests comments

* V0.1 annotation (#147)

* refine the annotation of simulator core

* remove reward from env(be)

* format refined

* white spaces test

* left-padding spaces refined

* format modifed

* update the left-padding spaces of docstrings

* code format updated

* update according to comments

* update according to PR comments

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* Event payload details for env.summary (#156)

* key_list of events added for env.summary

* code refined according to lint

* 2 kinds of Payload added for CIM scenario; citi bike summary refined according to comments

* code format refined

* try trigger the git tests

* update github workflow

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* Implemented dump snapshots and convert to CSV.

* Let BE supports params when dump snapshot.

* Refactor dump code to core.py

* Implemented decision event dump.

* V0.2 online lp for citi bike (#159)

* key_list of events added for env.summary

* code refined according to lint

* 2 kinds of Payload added for CIM scenario; citi bike summary refined according to comments

* code format refined

* try trigger the git tests

* update github workflow

* online LP example added for citi bike

* infeasible solution

* infeasible solution fixed: call snapshot before any env.step()

* experiment results of toy topos added

* experiment results of toy topos added

* experiment result update: better than naive baseline

* PuLP version added

* greedy experiment results update

* citibike result update

* modified according to PR comments

* update experiment results and forecasting comparison

* citi bike lp README updated

* README updated

* modified according to PR comments

* update according to PR comments

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu Wang <jinywan@microsoft.com>

* V0.2 rl toolkit refinement (#165)

* refined rl abstractions

* fixed formattin issues

* checked out error-code related code from v0.2_pg

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* renamed save_models to dump_models

* 1. set default batch_norm_enabled to True; 2. used state_dict in dqn model saving

* renamed dump_experience_store to dump_experience_pool

* fixed a bug in the dump_experience_pool method

* fixed some PR comments

* fixed more PR comments

* 1.fixed some PR comments; 2.added early_stopping_checker; 3.revised explorer class

* fixed cim example according to rl toolkit changes

* fixed some more PR comments

* rewrote multi_process_launcher to eliminate the distributed section in config

* 1. fixed a typo; 2. added logging before early stopping

* fixed a bug

* fixed a bug

* fixed a bug

* added early stopping feature to CIM exmaple

* fixed a typo

* fixed some issues with early stopping

* changed early stopping metric func

* fixed a bug

* fixed a bug

* added early stopping to dist mode cim

* added experience collecting func

* edited notebook according to changes in CIM example

* fixed bugs in nb

* fixed lint formatting issues

* fixed a typo

* fixed some PR comments

* fixed more PR comments

* revised docs

* removed nb output

* fixed a bug in simple_learner

* fixed a typo in nb

* fixed a bug

* fixed a bug

* fixed a bug

* removed unused import

* fixed a bug

* 1. changed early stopping default config; 2. renamed param in early stopping checker and added typing

* fixed some doc issues

* added output to nb

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* replace is not '' with !=''

* Fixed issues that code review mentioned.

* removed path from hello.py

* Changed import sort.

* Fix  import sorting in citi_bike/business_engine

* visualization 0.1

* Updated lint configurations.

* Fixed formatting error that caused lint errors.

* render html title function

* Try to fix lint errors.

* flake-8 style fix

* remove space around 18,35

* dump_csv_converter.py re-formatting.

* files re-formatting.

* style fixed

* tab delete

* white space fix

* white space fix-2

* vis redundant function delete

* refine

* update according to flake8

* re-formatting after merged upstream.

* Updated import section.

* Updated import section.

* V0.2 Logical operator overloading for EarlyStoppingChecker (#178)

* 1. added logical operator overloading for early stopping checker; 2. added mean value checker

* fixed PR comments

* removed learner.exit() in single_process_launcher

* added another early stopping checker in example

* fixed PR comments and lint issues

* lint issue fix

* fixed lint issues

* fixed a bug

* fixed a bug

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 skip connection (#176)

* replaced IdentityLayers with nn.Identity

* 1. added skip connection option in FC_net; 2. generalized learning model

* added skip_connection option in config

* removed type casting in fc_net

* fixed lint formatting issues

* refined docstring

* added multi-head functionality to LearningModel

* refined learning model docstring

* added head_key param in learningModel forward

* fixed PR comments

* added top layer logic and is_top option in fc_net

* fixed a bug

* fixed a bug

* reverted some changes in learning model

* reverted some changes in learning model

* added members to learning model to fix the mode issue

* fixed a bug

* fixed mode setting issue in learning model

* removed learner.exit() in single_process_launcher

* fixed PR comments

* fixed rl/__init__

* fixed issues in example

* fixed a bug

* fixed a bug

* fixed lint formatting issues

* moved reward type casting to exp shaper

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* pr refine

* isort fix

* white space

* lint error

* \n error

* test continuation

* indent

* continuation of indent

* indent 0.3

* comment update

* comment update 0.2

* f-string update

* f-string 0.2

* lint 0.3

* lint 0.4

* lint 0.4

* lint 0.5

* lint 0.6

* docstring update

* data version deploy update

* condition update

* add whitespace

* V0.2 vis dump feature enhancement. (#190)

* Dumps added manifest file.
* Code updated format by flake8
* Changed manifest file format for easy reading.

* deploy info update; docs update

* weird white space

* Update dashboard_visualization.md

* new endline?

* delete dependency

* delete irrelevant file

* change scenario to enum, divide file path into a separated class

* fixed a bug in learner's test() (#193)

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 double dqn (#188)

* added dueling action value model

* renamed params in dueling_action_value_model

* renamed shared_features to features

* replaced IdentityLayers with nn.Identity

* 1. added skip connection option in FC_net; 2. generalized learning model

* added skip_connection option in config

* removed type casting in fc_net

* fixed lint formatting issues

* refined docstring

* mv dueling_actiovalue_model and fixed some bugs

* added multi-head functionality to LearningModel

* refined learning model docstring

* added head_key param in learningModel forward

* added double DQN and dueling features to DQN

* fixed a bug

* added DuelingQModelHead enum

* fixed a bug

* removed unwanted file

* fixed PR comments

* added top layer logic and is_top option in fc_net

* fixed a bug

* fixed a bug

* reverted some changes in learning model

* reverted some changes in learning model

* added members to learning model to fix the mode issue

* fixed a bug

* fixed mode setting issue in learning model

* fixed PR comments

* revised cim example according to DQN changes

* renamed eval_model to q_value_model in cim example

* more fixes

* fixed a bug

* fixed a bug

* added doc per PR comments

* removed learner.exit() in single_process_launcher

* removed learner.exit() in single_process_launcher

* fixed PR comments

* fixed rl/__init__

* fixed issues in example

* fixed a bug

* fixed a bug

* fixed lint formatting issues

* double DQN feature

* fixed a bug

* fixed a bug

* fixed PR comments

* fixed lint issue

* 1. fixed PR comments related to load/dump; 2. removed abstract load/dump methods from AbsAlgorithm

* added load_models in simple_learner

* minor docstring edits

* minor docstring edits

* set is_double to true in DQN config

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Arthur Jiang <ArthurSJiang@gmail.com>

* V0.2 feature predefined image (#183)

* feat: support predefined image provision

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* fix: error scripts invocation after using relative import

* fix: missing init.py

* fixed a bug in learner's test()

* feat: add distributed_config for dqn example

* test: update test for grass

* test: update test for k8s

* feat: add promptings for steps

* fix: change relative imports to absolute imports

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Arthur Jiang <ArthurSJiang@gmail.com>

* doc refine

* doc update

* params type

* data structure update

* doc&enum, formula refine

* refine

* add ut, refine doc

* style refine

* isort

* strong type fix

* os._exit delete

* revert datalib

* import new line

* change test case

* change file name & doc

* change deploy path

* delete params

* revert file

* delete duplicate file

* delete single process

* update naming

* manually change import order

* delete blank

* edit error

* requirement txt

* style fix & refine

* comments&docstring refine

* add parameter name

* test & dump

* comments update

* V0.2 feature proxy rejoin (#158)

* update dist decorator

* replace proxy.get_peers by proxy.peers

* update proxy rejoin (draft, not runable for proxy rejoin)

* fix bugs in proxy

* add message cache, and redesign rejoin parameter

* feat: add checkpoint with test

* update proxy.rejoin

* fixed rejoin bug, rename func

* add test example(temp)

* feat: add FaultToleranceAgent, refine other MasterAgents and NodeAgents.

* capital env vari name

* rm json.dumps; change retries to 10; temp add warning level for rejoin

* fix: unable to load FaultToleranceAgent, missing params

* fix: delete mapping in StopJob if FaultTolerance is activated, add exception handler for FaultToleranceAgent

* feat: add node_id to node_details

* fix: add a new dependency for tests

* style: meet linting requirements

* style: remaining linting problems

* lint fixed; rm temp test folder.

* fixed lint f-string without placeholder

* fix: add a flag for "remove_container", refine restart logic and Redis keys naming

* proxy rejoin update.

* variable rename.

* fixed lint issues

* fixed lint issues

* add exit code for different error

* feat: add special errors handler

* add max rejoin times

* remove unused import

* add rejoin UT; resolve rejoin comments

* lint fixed

* fixed UT import problem

* rm MessageCache in proxy

* fix: refine key naming

* update proxy rejoin; add topic for broadcast

* feat: support predefined image provision

* update UT for communication

* add docstring for rejoin

* fixed isort and zmq driver import

* fixed isort and UT test

* fix isort issue

* proxy rejoin update (comments v2)

* fixed isort error

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* feat: add exists method for checkpoint

* fix: error scripts invocation after using relative import

* fix: missing init.py

* fixed a bug in learner's test()

* add driver close and socket SUB disconnect for rejoin

* feat: add distributed_config for dqn example

* test: update test for grass

* test: update test for k8s

* feat: add promptings for steps

* fix: change relative imports to absolute imports

* fixed comments and update logger level

* mv driver in proxy.__init__ for issue temp fixed.

* Update docstring and comments

* style: fix code reviews problems

* fix code format

Co-authored-by: Lyuchun Huang <romic.kid@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 feature cli windows (#203)

* fix: change local mkdir to os.makedirs

* fix: add utf8 encoding for logger

* fix: add powershell.exe prefix to subprocess functions

* feat: add debug_green

* fix: use fsutil to create fix-size files in Windows

* fix: use universal_newlines=True to handle encoding problem in different operating systems

* fix: use temp file to do copy when the operating system is not Linux

* fix: linting error

* fix: use fsutil in test_k8s.py

* feat: dynamic init ABS_PATH in GlobalParams

* fix: use -Command to execute Powershell command

* fix: refine code style in k8s_azure_executor.py, add Windows support for k8s mode

* fix: problems in code review

* EventBuffer refine (#197)

* merge uniform event changes back

* 1st step: move executing events into stack for better removing performance

* flush event pool

* typo

* add option for env to enable event pool

* refine stack functions

* fix comment issues, add typings

* lint fixing

* lint fix

* add missing fix

* linting

* lint

* use linked list instead original event list and execute stack

* add missing file

* linting, and fixes

* add missing file

* linting fix

* fixing comments

* add missing file

* rename event_list to event_linked_list

* correct import path

* change enable_event_pool to disable_finished_events

* add missing file

* V0.2 merge master (#214)

* fix the visualization of docs/key_components/distributed_toolkit

* add examples into isort ignore

* refine import path for examples (#195)

* refine import path for examples

* refine indents

* fixed formatting issues

* update code style

* add editorconfig-checker, add editorconfig path into lint, change super-linter version

* change path for code saving in cim.gnn

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Wenlei Shi <Wenlei.Shi@microsoft.com>

* fix issue that sometimes there is conflict between distutils and setuptools  (#208)

* fix issue that cython and setuptools conflict

* follow the accepted temp workaround

* update comment, it should be conflict between setuptools and distutils

* fixed bugs related to proxy interface changes

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Wenlei Shi <Wenlei.Shi@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>

* typo fix

* Bug fix: event buffer issue that cause Actions cannot be passed into business engine (#215)

* bug fix

* clear the reference after extract sub events, update ut to cover this issue

Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* fix flake8 style problem

* V0.2 feature refine mode namings (#212)

* feat: refine cli exception

* feat: refine mode namings

* EventBuffer refine (#197)

* merge uniform event changes back

* 1st step: move executing events into stack for better removing performance

* flush event pool

* typo

* add option for env to enable event pool

* refine stack functions

* fix comment issues, add typings

* lint fixing

* lint fix

* add missing fix

* linting

* lint

* use linked list instead original event list and execute stack

* add missing file

* linting, and fixes

* add missing file

* linting fix

* fixing comments

* add missing file

* rename event_list to event_linked_list

* correct import path

* change enable_event_pool to disable_finished_events

* add missing file

* fixed bugs in dist rl

* feat: rename files

* tests: set longer gracefully wait time

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* fix: rm redundant variables

* fix: refine error message

Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 vis new (#210)

Co-authored-by: Wenlei Shi <Wenlei.Shi@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>

* V0.2 local host process (#221)

* Update local process (not ready)

* update cli process mode

* add setup/clear/template for maro process

* fix process stop

* add logger and rename parameters

* add logger for setup/clear

* fixed close not exist pid when given pid list.

* Fixed comments and rename setup/clear with create/delete

* update ProcessInternalError

* comments fix

* delete toolkit change

* doc update

* citi bike update

* deploy path

* datalib update

* revert datalib

* revert

* maro file format

* comments update

* doc update

* V0.2 grass on premises (#220)

* feat: refine cli exception
* commit on v0.2_grass_on_premises

Co-authored-by: Lyuchun Huang <romic.kid@gmail.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 vm scheduling scenario (#189)

* Initialize

* Data center scenario init

* Code style modification

* V0.2 event buffer subevents expand (#180)

* V0.2 rl toolkit refinement (#165)

* refined rl abstractions

* fixed formattin issues

* checked out error-code related code from v0.2_pg

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* renamed save_models to dump_models

* 1. set default batch_norm_enabled to True; 2. used state_dict in dqn model saving

* renamed dump_experience_store to dump_experience_pool

* fixed a bug in the dump_experience_pool method

* fixed some PR comments

* fixed more PR comments

* 1.fixed some PR comments; 2.added early_stopping_checker; 3.revised explorer class

* fixed cim example according to rl toolkit changes

* fixed some more PR comments

* rewrote multi_process_launcher to eliminate the distributed section in config

* 1. fixed a typo; 2. added logging before early stopping

* fixed a bug

* fixed a bug

* fixed a bug

* added early stopping feature to CIM exmaple

* fixed a typo

* fixed some issues with early stopping

* changed early stopping metric func

* fixed a bug

* fixed a bug

* added early stopping to dist mode cim

* added experience collecting func

* edited notebook according to changes in CIM example

* fixed bugs in nb

* fixed lint formatting issues

* fixed a typo

* fixed some PR comments

* fixed more PR comments

* revised docs

* removed nb output

* fixed a bug in simple_learner

* fixed a typo in nb

* fixed a bug

* fixed a bug

* fixed a bug

* removed unused import

* fixed a bug

* 1. changed early stopping default config; 2. renamed param in early stopping checker and added typing

* fixed some doc issues

* added output to nb

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* unfold sub-events, insert after parent

* remove event category, use different class instead, add helper functions to gen decision and action event

* add a method to support add immediate event to cascade event with tick validation

* fix ut issue

* add action as 1st sub event to ensure the executing order

Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Data center scenario update

* Code style update

* Data scenario business engine update

* Isort update

* Fix lint code check

* Fix based on PR comments.

* Update based on PR comments.

* Add decision payload

* Add config file

* Update utilization series logic

* Update based on PR comment

* Update based on PR

* Update

* Update

* Add the ValidPm class

* Update docs string and naming

* Add energy consumption

* Lint code fixed

* Refining postpone function

* Lint style update

* Init data pipeline

* Update based on PR comment

* Add data pipeline download

* Lint style update

* Code style fix

* Temp update

* Data pipeline update

* Add aria2p download function

* Update based on PR comment

* Update based on PR comment

* Update based on PR comment

* Update naming of variables

* Rename topology

* Renaming

* Fix valid pm list

* Pylint fix

* Update comment

* Update docstring and comment

* Fix init import

* Update tick issue

* fix merge problem

* update style

* V0.2 datacenter data pipeline (#199)

* Data pipeline update

* Data pipeline update

* Lint update

* Update pipeline

* Add vmid mapping

* Update lint style

* Add VM data analytics

* Update notebook

* Add binary converter

* Modift vmtable yaml

* Update binary meta file

* Add cpu reader

* random example added for data center

* Fix bugs

* Fix pylint

* Add launcher

* Fix pylint

* best fit policy added

* Add reset

* Add config

* Add config

* Modify action object

* Modify config

* Fix naming

* Modify config

* Add snapshot list

* Modify a spelling typo

* Update based on PR comments.

* Rename scenario to vm scheduling

* Rename scenario

* Update print messages

* Lint fix

* Lint fix

* Rename scenario

* Modify the calculation of cpu utilization

* Add comment

* Modify data pipeline path

* Fix typo

* Modify naming

* Add unittest

* Add comment

* Unify naming

* Fix data path typo

* Update comments

* Update snapshot features

* Add take snapshot

* Add summary keys

* Update cpu reader

* Update naming

* Add unit test

* Rename snapshot node

* Add processed data pipeline

* Modify config

* Add comment

* Lint style fix

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* Add package used in vm_scheduling

* add aria2p to test requirement

* best fit example: update the usage of snapshot

* Add aria2p to test requriement

* Remove finish event

* Fix unittest

* Add test dataset

* Update based on PR comment

* Refine cpu reader and unittest

* Lint update

* Refine based on PR comment

* Add agent index

* Add node maping

* Refine based on PR comments

* Renaming postpone_step

* Renaming and refine based on PR comments

* Rename config

* Update

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* Resolve none action problem (#224)

* V0.2 vm_scheduling notebook (#223)

* Initialize

* Data center scenario init

* Code style modification

* V0.2 event buffer subevents expand (#180)

* V0.2 rl toolkit refinement (#165)

* refined rl abstractions

* fixed formattin issues

* checked out error-code related code from v0.2_pg

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* renamed save_models to dump_models

* 1. set default batch_norm_enabled to True; 2. used state_dict in dqn model saving

* renamed dump_experience_store to dump_experience_pool

* fixed a bug in the dump_experience_pool method

* fixed some PR comments

* fixed more PR comments

* 1.fixed some PR comments; 2.added early_stopping_checker; 3.revised explorer class

* fixed cim example according to rl toolkit changes

* fixed some more PR comments

* rewrote multi_process_launcher to eliminate the distributed section in config

* 1. fixed a typo; 2. added logging before early stopping

* fixed a bug

* fixed a bug

* fixed a bug

* added early stopping feature to CIM exmaple

* fixed a typo

* fixed some issues with early stopping

* changed early stopping metric func

* fixed a bug

* fixed a bug

* added early stopping to dist mode cim

* added experience collecting func

* edited notebook according to changes in CIM example

* fixed bugs in nb

* fixed lint formatting issues

* fixed a typo

* fixed some PR comments

* fixed more PR comments

* revised docs

* removed nb output

* fixed a bug in simple_learner

* fixed a typo in nb

* fixed a bug

* fixed a bug

* fixed a bug

* removed unused import

* fixed a bug

* 1. changed early stopping default config; 2. renamed param in early stopping checker and added typing

* fixed some doc issues

* added output to nb

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* unfold sub-events, insert after parent

* remove event category, use different class instead, add helper functions to gen decision and action event

* add a method to support add immediate event to cascade event with tick validation

* fix ut issue

* add action as 1st sub event to ensure the executing order

Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Data center scenario update

* Code style update

* Data scenario business engine update

* Isort update

* Fix lint code check

* Fix based on PR comments.

* Update based on PR comments.

* Add decision payload

* Add config file

* Update utilization series logic

* Update based on PR comment

* Update based on PR

* Update

* Update

* Add the ValidPm class

* Update docs string and naming

* Add energy consumption

* Lint code fixed

* Refining postpone function

* Lint style update

* Init data pipeline

* Update based on PR comment

* Add data pipeline download

* Lint style update

* Code style fix

* Temp update

* Data pipeline update

* Add aria2p download function

* Update based on PR comment

* Update based on PR comment

* Update based on PR comment

* Update naming of variables

* Rename topology

* Renaming

* Fix valid pm list

* Pylint fix

* Update comment

* Update docstring and comment

* Fix init import

* Update tick issue

* fix merge problem

* update style

* V0.2 datacenter data pipeline (#199)

* Data pipeline update

* Data pipeline update

* Lint update

* Update pipeline

* Add vmid mapping

* Update lint style

* Add VM data analytics

* Update notebook

* Add binary converter

* Modift vmtable yaml

* Update binary meta file

* Add cpu reader

* random example added for data center

* Fix bugs

* Fix pylint

* Add launcher

* Fix pylint

* best fit policy added

* Add reset

* Add config

* Add config

* Modify action object

* Modify config

* Fix naming

* Modify config

* Add snapshot list

* Modify a spelling typo

* Update based on PR comments.

* Rename scenario to vm scheduling

* Rename scenario

* Update print messages

* Lint fix

* Lint fix

* Rename scenario

* Modify the calculation of cpu utilization

* Add comment

* Modify data pipeline path

* Fix typo

* Modify naming

* Add unittest

* Add comment

* Unify naming

* Fix data path typo

* Update comments

* Update snapshot features

* Add take snapshot

* Add summary keys

* Update cpu reader

* Update naming

* Add unit test

* Rename snapshot node

* Add processed data pipeline

* Modify config

* Add comment

* Lint style fix

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* Add package used in vm_scheduling

* add aria2p to test requirement

* best fit example: update the usage of snapshot

* Add aria2p to test requriement

* Remove finish event

* Fix unittest

* Add test dataset

* Update based on PR comment

* Refine cpu reader and unittest

* Lint update

* Refine based on PR comment

* Add agent index

* Add node maping

* Init vm shceduling notebook

* Add notebook

* Refine based on PR comments

* Renaming postpone_step

* Renaming and refine based on PR comments

* Rename config

* Update based on the v0.2_datacenter

* Update notebook

* Update

* update filepath

* notebook updated

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* Update process mode docs and fixed on premises (#226)

* V0.2 Add github workflow integration (#222)

* test: add github workflow integration

* fix: split procedures && bug fixed

* test: add training only restriction

* fix: add 'approved' restriction

* fix: change default ssh port to 22

* style: in one line

* feat: add timeout for Subprocess.run

* test: change default node_size to Standard_D2s_v3

* style: refine style

* fix: add ssh_port param to on-premises mode

* fix: add missing init.py

* update param name

* V0.2 explorer (#198)

* overhauled exploration abstraction

* fixed a bug

* fixed a bug

* fixed a bug

* added exploration related methods to abs_agent

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* separated learning with exploration schedule and without

* small fixes

* moved explorer logic to actor side

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* removed unwanted param from simple agent manager

* added noise explorer

* fixed formatting

* removed unnecessary comma

* fixed PR comments

* removed unwanted exception and imports

* fixed a bug

* fixed PR comments

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issue

* fixed a bug

* fixed lint issue

* fixed naming

* combined exploration param generation and early stopping in scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* fixed lint issue

* moved logger inside scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* removed epsilon parameter from choose_action

* fixed some PR comments

* fixed some PR comments

* bug fix

* bug fix

* bug fix

* removed explorer abstraction from agent

* refined dqn example

* fixed lint issues

* simplified scheduler

* removed early stopping from CIM dqn example

* removed early stopping from cim example config

* renamed early_stopping_callback to early_stopping_checker

* removed action_dim from noise explorer classes and added some shape checks

* modified NoiseExplorer's __call__ logic to batch processing

* made NoiseExplorer's __call__ return type np array

* renamed update to set_parameters in explorer

* fixed old naming in test_grass

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 embedded optim (#191)

* added dueling action value model

* renamed params in dueling_action_value_model

* renamed shared_features to features

* replaced IdentityLayers with nn.Identity

* 1. added skip connection option in FC_net; 2. generalized learning model

* added skip_connection option in config

* removed type casting in fc_net

* fixed lint formatting issues

* refined docstring

* mv dueling_actiovalue_model and fixed some bugs

* added multi-head functionality to LearningModel

* refined learning model docstring

* added head_key param in learningModel forward

* added double DQN and dueling features to DQN

* fixed a bug

* added DuelingQModelHead enum

* fixed a bug

* removed unwanted file

* fixed PR comments

* added top layer logic and is_top option in fc_net

* fixed a bug

* fixed a bug

* reverted some changes in learning model

* reverted some changes in learning model

* added members to learning model to fix the mode issue

* fixed a bug

* fixed mode setting issue in learning model

* fixed PR comments

* revised cim example according to DQN changes

* renamed eval_model to q_value_model in cim example

* more fixes

* fixed a bug

* fixed a bug

* added doc per PR comments

* removed learner.exit() in single_process_launcher

* removed learner.exit() in single_process_launcher

* fixed PR comments

* fixed rl/__init__

* fixed issues in example

* fixed a bug

* fixed a bug

* fixed lint formatting issues

* double DQN feature

* fixed a bug

* fixed a bug

* fixed PR comments

* fixed lint issue

* embedded optimizer into SingleHeadLearningModel

* 1. fixed PR comments related to load/dump; 2. removed abstract load/dump methods from AbsAlgorithm

* added load_models in simple_learner

* minor docstring edits

* minor docstring edits

* minor docstring edits

* mv optimizer options inside LearningMode

* modified example accordingly

* fixed a bug

* fixed a bug

* fixed a bug

* added dueling DQN feature

* revised and refined docstrings

* fixed a bug

* fixed lint issues

* added load/dump functions to LearningModel

* fixed a bug

* fixed a bug

* fixed lint issues

* refined DQN docstrings

* removed load/dump functions from DQN

* added task validator

* fixed decorator use

* fixed a typo

* fixed a bug

* fixed lint issues

* changed LearningModel's step() to take a single loss

* revised learning model design

* revised example

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* added decorator utils to algorithm

* fixed a bug

* renamed core_model to model

* fixed a bug

* 1. fixed lint formatting issues; 2. refined learning model docstrings

* rm trailing whitespaces

* added decorator for choose_action

* fixed a bug

* fixed a bug

* fixed version-related issues

* renamed add_zeroth_dim decorator to expand_dim

* overhauled exploration abstraction

* fixed a bug

* fixed a bug

* fixed a bug

* added exploration related methods to abs_agent

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* separated learning with exploration schedule and without

* small fixes

* moved explorer logic to actor side

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* removed unwanted param from simple agent manager

* small fixes

* added shared_module property to LearningModel

* added shared_module property to LearningModel

* revised __getstate__ for LearningModel

* fixed a bug

* added soft_update function to learningModel

* fixed a bug

* revised learningModel

* rm __getstate__ and __setstate__ from LearningModel

* added noise explorer

* fixed formatting

* removed unnecessary comma

* removed unnecessary comma

* fixed PR comments

* removed unwanted exception and imports

* removed unwanted exception and imports

* fixed a bug

* fixed PR comments

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issue

* fixed a bug

* fixed lint issue

* fixed naming

* combined exploration param generation and early stopping in scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* fixed lint issue

* moved logger inside scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* fixed lint issue

* removed epsilon parameter from choose_action

* removed epsilon parameter from choose_action

* changed agent manager's train parameter to experience_by_agent

* fixed some PR comments

* renamed zero_grad to zero_gradients in LearningModule

* fixed some PR comments

* bug fix

* bug fix

* bug fix

* removed explorer abstraction from agent

* added DEVICE env variable as first choice for torch device

* refined dqn example

* fixed lint issues

* removed unwanted import in cim example

* updated cim-dqn notebook

* simplified scheduler

* edited notebook according to merged scheduler changes

* refined dimension check for learning module manager and removed num_actions from DQNConfig

* bug fix for cim example

* added notebook output

* removed early stopping from CIM dqn example

* removed early stopping from cim example config

* moved decorator logic inside algorithms

* renamed early_stopping_callback to early_stopping_checker

* removed action_dim from noise explorer classes and added some shape checks

* modified NoiseExplorer's __call__ logic to batch processing

* made NoiseExplorer's __call__ return type np array

* renamed update to set_parameters in explorer

* fixed old naming in test_grass

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 VM scheduling docs (#228)

* Initialize

* Data center scenario init

* Code style modification

* V0.2 event buffer subevents expand (#180)

* V0.2 rl toolkit refinement (#165)

* refined rl abstractions

* fixed formattin issues

* checked out error-code related code from v0.2_pg

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* renamed save_models to dump_models

* 1. set default batch_norm_enabled to True; 2. used state_dict in dqn model saving

* renamed dump_experience_store to dump_experience_pool

* fixed a bug in the dump_experience_pool method

* fixed some PR comments

* fixed more PR comments

* 1.fixed some PR comments; 2.added early_stopping_checker; 3.revised explorer class

* fixed cim example according to rl toolkit changes

* fixed some more PR comments

* rewrote multi_process_launcher to eliminate the distributed section in config

* 1. fixed a typo; 2. added logging before early stopping

* fixed a bug

* fixed a bug

* fixed a bug

* added early stopping feature to CIM exmaple

* fixed a typo

* fixed some issues with early stopping

* changed early stopping metric func

* fixed a bug

* fixed a bug

* added early stopping to dist mode cim

* added experience collecting func

* edited notebook according to changes in CIM example

* fixed bugs in nb

* fixed lint formatting issues

* fixed a typo

* fixed some PR comments

* fixed more PR comments

* revised docs

* removed nb output

* fixed a bug in simple_learner

* fixed a typo in nb

* fixed a bug

* fixed a bug

* fixed a bug

* removed unused import

* fixed a bug

* 1. changed early stopping default config; 2. renamed param in early stopping checker and added typing

* fixed some doc issues

* added output to nb

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* unfold sub-events, insert after parent

* remove event category, use different class instead, add helper functions to gen decision and action event

* add a method to support add immediate event to cascade event with tick validation

* fix ut issue

* add action as 1st sub event to ensure the executing order

Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Data center scenario update

* Code style update

* Data scenario business engine update

* Isort update

* Fix lint code check

* Fix based on PR comments.

* Update based on PR comments.

* Add decision payload

* Add config file

* Update utilization series logic

* Update based on PR comment

* Update based on PR

* Update

* Update

* Add the ValidPm class

* Update docs string and naming

* Add energy consumption

* Lint code fixed

* Refining postpone function

* Lint style update

* Init data pipeline

* Update based on PR comment

* Add data pipeline download

* Lint style update

* Code style fix

* Temp update

* Data pipeline update

* Add aria2p download function

* Update based on PR comment

* Update based on PR comment

* Update based on PR comment

* Update naming of variables

* Rename topology

* Renaming

* Fix valid pm list

* Pylint fix

* Update comment

* Update docstring and comment

* Fix init import

* Update tick issue

* fix merge problem

* update style

* V0.2 datacenter data pipeline (#199)

* Data pipeline update

* Data pipeline update

* Lint update

* Update pipeline

* Add vmid mapping

* Update lint style

* Add VM data analytics

* Update notebook

* Add binary converter

* Modift vmtable yaml

* Update binary meta file

* Add cpu reader

* random example added for data center

* Fix bugs

* Fix pylint

* Add launcher

* Fix pylint

* best fit policy added

* Add reset

* Add config

* Add config

* Modify action object

* Modify config

* Fix naming

* Modify config

* Add snapshot list

* Modify a spelling typo

* Update based on PR comments.

* Rename scenario to vm scheduling

* Rename scenario

* Update print messages

* Lint fix

* Lint fix

* Rename scenario

* Modify the calculation of cpu utilization

* Add comment

* Modify data pipeline path

* Fix typo

* Modify naming

* Add unittest

* Add comment

* Unify naming

* Fix data path typo

* Update comments

* Update snapshot features

* Add take snapshot

* Add summary keys

* Update cpu reader

* Update naming

* Add unit test

* Rename snapshot node

* Add processed data pipeline

* Modify config

* Add comment

* Lint style fix

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* Add package used in vm_scheduling

* add aria2p to test requirement

* best fit example: update the usage of snapshot

* Add aria2p to test requriement

* Remove finish event

* Fix unittest

* Add test dataset

* Update based on PR comment

* vm doc init

* Update docs

* Update docs

* Update docs

* Update docs

* Remove old notebook

* Update docs

* Update docs

* Add figure

* Update docs

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* doc update

* new link

* image update

* v0.2 VM Scheduling docs refinement (#231)

* Fix typo

* Refining vm scheduling docs

* image change

* V0.2 store refinement (#234)

* updated docs and images for rl toolkit

* 1. fixed import formats for maro/rl; 2. changed decorators to hypers in store

* fixed lint issues

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Fix bug (#237)

vm scenario: fix the event type bug of the postpone event

* V0.2 rl toolkit doc (#235)

* updated docs and images for rl toolkit

* updated cim example doc

* updated cim exmaple docs

* updated cim example rst

* updated rl_toolkit and cim example docs

* replaced q_module with q_net in example rst

* refined doc

* refined doc

* updated figures

* updated figures

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Merge V0.2 vis into V0.2 (#233)

* Implemented dump snapshots and convert to CSV.

* Let BE supports params when dump snapshot.

* Refactor dump code to core.py

* Implemented decision event dump.

* replace is not '' with !=''

* Fixed issues that code review mentioned.

* removed path from hello.py

* Changed import sort.

* Fix  import sorting in citi_bike/business_engine

* visualization 0.1

* Updated lint configurations.

* Fixed formatting error that caused lint errors.

* render html title function

* Try to fix lint errors.

* flake-8 style fix

* remove space around 18,35

* dump_csv_converter.py re-formatting.

* files re-formatting.

* style fixed

* tab delete

* white space fix

* white space fix-2

* vis redundant function delete

* refine

* re-formatting after merged upstream.

* Updated import section.

* Updated import section.

* pr refine

* isort fix

* white space

* lint error

* \n error

* test continuation

* indent

* continuation of indent

* indent 0.3

* comment update

* comment update 0.2

* f-string update

* f-string 0.2

* lint 0.3

* lint 0.4

* lint 0.4

* lint 0.5

* lint 0.6

* docstring update

* data version deploy update

* condition update

* add whitespace

* V0.2 vis dump feature enhancement. (#190)

* Dumps added manifest file.
* Code updated format by flake8
* Changed manifest file format for easy reading.

* deploy info update; docs update

* weird white space

* Update dashboard_visualization.md

* new endline?

* delete dependency

* delete irrelevant file

* change scenario to enum, divide file path into a separated class

* doc refine

* doc update

* params type

* data structure update

* doc&enum, formula refine

* refine

* add ut, refine doc

* style refine

* isort

* strong type fix

* os._exit delete

* revert datalib

* import new line

* change test case

* change file name & doc

* change deploy path

* delete params

* revert file

* delete duplicate file

* delete single process

* update naming

* manually change import order

* delete blank

* edit error

* requirement txt

* style fix & refine

* comments&docstring refine

* add parameter name

* test & dump

* comments update

* Added manifest file. (#201)

Only a few changes that need to meet requirements of manifest file format.

* comments fix

* delete toolkit change

* doc update

* citi bike update

* deploy path

* datalib update

* revert datalib

* revert

* maro file format

* comments update

* doc update

* update param name

* doc update

* new link

* image update

* V0.2 visualization-0.1 (#181)

* visualization 0.1

* render html title function

* flake-8 style fix

* style fixed

* tab delete

* white space fix

* white space fix-2

* vis redundant function delete

* refine

* pr refine

* isort fix

* white space

* lint error

* \n error

* test continuation

* indent

* continuation of indent

* indent 0.3

* comment update

* comment update 0.2

* f-string update

* f-string 0.2

* lint 0.3

* lint 0.4

* lint 0.4

* lint 0.5

* lint 0.6

* docstring update

* data version deploy update

* condition update

* add whitespace

* deploy info update; docs update

* weird white space

* Update dashboard_visualization.md

* new endline?

* delete dependency

* delete irrelevant file

* change scenario to enum, divide file path into a separated class

* fix the visualization of docs/key_components/distributed_toolkit

* doc refine

* doc update

* params type

* add examples into isort ignore

* data structure update

* doc&enum, formula refine

* refine

* add ut, refine doc

* style refine

* isort

* strong type fix

* os._exit delete

* revert datalib

* import new line

* change test case

* change file name & doc

* change deploy path

* delete params

* revert file

* delete duplicate file

* delete single process

* update naming

* manually change import order

* delete blank

* edit error

* requirement txt

* style fix & refine

* comments&docstring refine

* add parameter name

* test & dump

* comments update

* comments fix

* delete toolkit change

* doc update

* citi bike update

* deploy path

* datalib update

* revert datalib

* revert

* maro file format

* comments update

* doc update

* update param name

* doc update

* new link

* image update

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Miaoran Chen (Wicresoft) <v-miaorc@microsoft.com>

* image change

* add reset snapshot

* delete dump

* add new line

* add next steps

* import change

* relative import

* add init file

* import change

* change utils file

* change cliexpcetion to clierror

* dashboard test

* change result

* change assertation

* move not

* unit test change

* core change

* unit test delete name_mapping_file

* update cim business engine

* doc update

* change relative path

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* duc update

* duc update

* duc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* change import sequence

* comments update

* doc add pic

* add dependency

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* Update dashboard_visualization.rst

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* delete white space

* doc update

* doc update

* update doc

* update doc

* update doc

Co-authored-by: Michael Li <mic_lee2000@hotmail.com>
Co-authored-by: Miaoran Chen (Wicresoft) <v-miaorc@microsoft.com>
Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* V0.2 docs process mode (#230)

* Update process mode docs and fixed on premises

* Update orchestration docs

* Update process mode docs add JOB_NAME as env variable

* fixed bugs

* fixed isort issue

* update docs index

Co-authored-by: kaiqli <v-kaiqli@microsoft.com>

* V0.2 learning model refinement (#236)

* moved optimizer options to LearningModel

* typo fix

* fixed lint issues

* updated notebook

* misc edits

* 1. renamed CIMAgent to DQNAgent; 2. moved create_dqn_agents to Agent section in notebook

* renamed single_host_cim_learner ot cim_learner in notebook

* updated notebook output

* typo fix

* removed dimension check in absence of shared stack

* fixed a typo

* fixed lint issues

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Update vm docs (#241)

Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* V0.2 info update (#240)

* update readme

* update version

* refine reademe format

* add vis gif

* add citation

* update citation

* update badge

Co-authored-by: Arthur Jiang <sjian@microsoft.com>

* Fix typo (#242)

* Fix typo

* fix typo

* fix

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

Co-authored-by: Arthur Jiang <sjian@microsoft.com>
Co-authored-by: Arthur Jiang <ArthurSJiang@gmail.com>
Co-authored-by: Romic Huang <romic.kid@gmail.com>
Co-authored-by: zhanyu wang <pocket_2001@163.com>
Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: kaiqli <59279714+kaiqli@users.noreply.github.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>
Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu Wang <jinywan@microsoft.com>
Co-authored-by: Michael Li <mic_lee2000@hotmail.com>
Co-authored-by: Miaoran Chen (Wicresoft) <v-miaorc@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: Wenlei Shi <Wenlei.Shi@microsoft.com>
Co-authored-by: kyu-kuanwei <72911362+kyu-kuanwei@users.noreply.github.com>
Co-authored-by: kaiqli <v-kaiqli@microsoft.com>

* bug fix related to np array divide (#245)

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Master.simple bike (#250)

* notebook for simple bike repositioning added

* add simple rule-based algorithms

* unify input

* add policy based on statistics

* update be for simple bike scenario to fit latest event buffer changes (#247)

* change rendered graph

* figures updated

* change notebook

* matplot updated

* figures updated

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: wesley <Wenlei.Shi@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>

* simple bike repositioning article: formula updated

* checked out docs/source from v0.2

* aligned with v0.2

* rm unwanted import

* added references in policy_optimization.py

* fixed lint issues

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Meroy Chen <39452768+Meroy9819@users.noreply.github.com>
Co-authored-by: Arthur Jiang <sjian@microsoft.com>
Co-authored-by: Arthur Jiang <ArthurSJiang@gmail.com>
Co-authored-by: Romic Huang <romic.kid@gmail.com>
Co-authored-by: zhanyu wang <pocket_2001@163.com>
Co-authored-by: kaiqli <59279714+kaiqli@users.noreply.github.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>
Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu Wang <jinywan@microsoft.com>
Co-authored-by: Michael Li <mic_lee2000@hotmail.com>
Co-authored-by: Miaoran Chen (Wicresoft) <v-miaorc@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: Wenlei Shi <Wenlei.Shi@microsoft.com>
Co-authored-by: kyu-kuanwei <72911362+kyu-kuanwei@users.noreply.github.com>
Co-authored-by: kaiqli <v-kaiqli@microsoft.com>
* update lint workflow

* fix workflow issue

* Update lint.yml

* Create tox.ini

* Update lint.yml

* Update lint.yml

* Update tox.ini

* Update lint.yml

* Delete tox.ini from root folder, move it to .github/linters

* Update CONTRIBUTING.md

* add more comments

* update lint conf to ignore cli banner issue

* change extension implementation from c to cpp

* update script to gen cpp files

* backend base interface redefine

* interface revamp for np backend

* 1st step for revamp

* bug fix

* draft

* implementation of attribute

* implementation of backend

* remove  backend switching

* draft raw backend wrapper

* correct function parameter type

* 1st runable version

* bug fix for types

* ut passed

* change CRLF to LF

* fix get_node_info interface

* add raw test in frame ut

* return np.array for all query result

* use ticks from backend

* set init value

* snapshot ut passed

* support set default backend by environemnt variable

* env ut with different backend

* fix take snapshot index bug

* test under both backends

* ignore generated cpp file

* fix lint isues

* more lint fix

* use ordered map to store ticks to keep the order

* remove test code

* refine dup code

* refine code to avoid too much if/else

* handle and raise exception for attr getter

* change the way to handle cpp exception, use cython runtimeerror instead

* add missing function, and fix bug in np impl

* fix lint issue

* specify c++11 flag for compilers

* use normal field assignment instead initializer list, as linux gcc will complain it

* add np ignore macro

* try to refine token pasting operator to avoid error on linux

* more pasting operator issue fix

* remove un-used options

* update workflow files to fit new backend

* 1st version of dynamic backend structure

* setup ut for cpp using lest

* bitset complete

* attributestore and ut

* arrange

* copy_to

* current frame

* ut for frame

* bug fix and ut correct

* fix issue that value not correct after arrange

* fix bug in test case

* frame update

* change the way to add nodes, support add node from middle

* frame in backend

* snapshotlist code complete

* add size method for snapshotlist, add ut template

* make sure snapshot max size not be 0

* add max size

* fix query parameters

* fix attribute store extend error

* add function to retrieve attribute from snapshotlist

* return nan for invalid index

* add function to check if nan for float attribute only

* fix bug that not update _last_tick for snapshot list, that cause take snapshot for same tick crash

* add functions to expose internal state under debug mode, make it easy to do unit test

* fix issue that cause overlap logic skiped

* ut passed for all implemented functions

* remove query in ut, as it not completed yet

* refine querying interfaces, use 2 functions for 1 querying

* snapshot query,

* use pointer instead weak_ptr

* backend impl

* set default parameters value

* query bug fix,

* bug fix: new_attr should return attr id not node id

* use macro to create attribute getters

* add reset support

* change the way to reset, avoid allocation time

* test reset for attributestore

* use Bitset instead vector<bool> to make it easy to reset

* refine backend interfaces to make it compact with old one

* correct quering interface, cython compile passed

* bug fix: get_ticks not set correct index

* correct cpp backend binding, add type for frame

* correct ut for snapshot

* bug fix: query cause crash after snapshot reset

* fix env test

* bug fix: is_nan should check data type first

* fix cim ut issues with raw backend

* fix citibike ut issues for raw backend

* add interfaces to support dynamic nodes, not tested

* bug fix: access cpp object without cdef

* bug fix: missing impl for dynamic methods

* ut for append nodes

* return node number dynamiclly

* remove unused parameters for snapshot

* remove unused code

* allow get attribute for deleted node

* ut for delete and resume node

* function to set attribute slot

* bug fix: set attribute will cause crash

* bug fix: remove append node when reset cause exception

* bug fix: frame.backend_type return incorrect name

* backends performance comparison

* correct internal type

* correct warnings

* missing ;

* formating

* fix lint issue

* simple the way to copy mapping

* add dump interfaces

* frame dump

* ignore if dump path is not exist

* bug fix: use max slots instead of current slots for padding in snapshot querying

* use max slot number in history instead of current for padding

* dump for snapshot

* close file at the end

* refine snapshot dump function

* fix lint issue

* avoid too much allocate operation

* use pointer instead reference for furthure changes

* avoid 2 times map copy

* add comments for missing functions

* performance optimize

* use emplace instead push

* use emplace instead  push

* remove cpp files

* add missing lisence

* ignore .vs folder

* add lest lisence for cpp unittest

* Delete CMakeLists.txt

* add error msg for exception, make it easy to identify error at python side

* remove old codes

* replace with new code

* change IDENTIER to NODE_TYPE and ATTR_TYPE

* build pass

* fix attr type not correct bug

* reomve unused comment

* make frame ut pass

* correct the max snapshots checking

* fix test case

* add missing file

* correct performance test

* refine attribute code

* refine bitset code

* update FrameBase doc about switch backend

* correct the exception name

* refine frame code

* refine node code

* refine snapshot list code

* add is_const and is_list when adding attribute

* support query const attribute without tick exist

* add operations for list attribute

* remove cache as we have list attribute

* add remove and insert for list attribute

* add for-loop support for list attribute

* fix bug that not update list attribute slot number after operations

* test for dynamic features

* frame dump

* dump for snapshot list

* fix issue on gcc compiler

* add missing file

* fix lint issues

* refine the exception, more comments

* fix lint issue

* fix lint issue

* use simulate enum instead of str

* Use new type instead old in tests

* using mapping instead if-else

* remove generated code

* use mapping to reduce too much if-else

* add default attribute type int if not provided or invalid provided

* remove generated code

* update workflow with code gen

* more frame test

* add missing files

* test: cover maro.simulator.utils.common

* update test with new scenario

* comments

* tests

* update doc

* fix lint and comments

* CRLF to LF

* fix lint issue

Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>
* Remove topology

* Update pipeline

* Update pipeline

* Update pipeline

* Modify metafile

* Add two attributes of VM

* Update pipeline

* Add vm category

* Add todo

* Add oversub config

* Add oversubscription feature

* Lint fix

* Update based on PR comment.

* Update pipeline

* Update pipeline

* Update config.

* Update based on PR comment

* Update

* Add pm sku feature

* Add sku setting

* Add sku feature

* Lint fix

* Lint style

* Update sku, overloading

* Lint fix

* Lint style

* Fix bug

* Modify config

* Remove sky and replaced it by pm stype

* Add and refactor vm category

* Comment out cofig

* Unify the enum format

* Fix lint style

* Fix import order

* Update based on PR comment

* Update overload to the VM docs

* Update docs

* Update vm docs

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>
@Jinyu-W Jinyu-W changed the title V0.2 V0.2 update Jan 25, 2021
@codecov-io
Copy link
Copy Markdown

Codecov Report

Merging #262 (4f7529c) into master (9bf589a) will decrease coverage by 0.56%.
The diff coverage is 39.92%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #262      +/-   ##
==========================================
- Coverage   64.94%   64.37%   -0.57%     
==========================================
  Files         111      113       +2     
  Lines        5335     5553     +218     
==========================================
+ Hits         3465     3575     +110     
- Misses       1870     1978     +108     
Flag Coverage Δ
unittests 64.37% <39.92%> (-0.57%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
maro/data_lib/dump_csv_converter.py 15.33% <ø> (-0.57%) ⬇️
maro/rl/algorithms/dqn.py 22.22% <ø> (ø)
maro/rl/storage/column_based_store.py 67.47% <0.00%> (ø)
maro/utils/exception/error_code.py 100.00% <ø> (ø)
...mulator/scenarios/vm_scheduling/business_engine.py 17.94% <10.86%> (-2.46%) ⬇️
maro/rl/agent/simple_agent_manager.py 35.08% <14.28%> (-1.46%) ⬇️
maro/rl/algorithms/policy_optimization.py 30.76% <30.76%> (ø)
...ulator/scenarios/vm_scheduling/physical_machine.py 41.81% <40.00%> (-1.67%) ⬇️
...mulator/scenarios/vm_scheduling/virtual_machine.py 34.37% <40.00%> (-1.34%) ⬇️
maro/simulator/scenarios/vm_scheduling/common.py 51.42% <50.00%> (-6.47%) ⬇️
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9bf589a...4f7529c. Read the comment docs.

Copy link
Copy Markdown
Contributor

@kyu-kuanwei kyu-kuanwei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VM related description ok.

@Jinyu-W Jinyu-W merged commit fa092f3 into master Jan 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.