OA: Anomaly Prediction #1012

Draft
wants to merge 21 commits into main

Commits (21)
4ae6499
OA: Templates for Anomaly Prediction #1011
detlefarend Jun 4, 2024
3aea39f
OA: Templates for Anomaly Prediction #1011
detlefarend Jun 4, 2024
d3a7ace
OA: Templates for Anomaly Prediction #1011
detlefarend Jun 4, 2024
94e9c5e
Merge remote-tracking branch 'origin/main' into oa/streams/ap
detlefarend Jun 5, 2024
525ec17
Merge remote-tracking branch 'origin/main' into oa/streams/ap
detlefarend Jun 10, 2024
2c9728b
Merge remote-tracking branch 'origin/main' into oa/streams/ap
Devindi97 Jun 14, 2024
451bb59
Merge remote-tracking branch 'origin/main' into oa/streams/ap
detlefarend Jun 25, 2024
5f7e885
Merge remote-tracking branch 'origin/main' into oa/streams/ap
detlefarend Jun 26, 2024
bebe902
RTD cleaning #980
steveyuwono Jul 3, 2024
b0a262e
RTD cleaning #980
steveyuwono Jul 3, 2024
e7a5837
RTD cleaning #980
steveyuwono Jul 3, 2024
5d58292
RTD cleaning #980
steveyuwono Jul 3, 2024
2e2e2fd
RTD cleaning #980
steveyuwono Jul 3, 2024
a9b8222
Merge branch 'main' of https://github.com/fhswf/MLPro into rtd_v2.0.0
steveyuwono Jul 11, 2024
ce35047
RTD 2.0.0
steveyuwono Jul 11, 2024
1f33c4e
RTD 2.0.0
steveyuwono Jul 11, 2024
37fd846
RTD 2.0.0
steveyuwono Jul 11, 2024
82a5fdc
RL: Collision Avoidance Problem Environment
steveyuwono Jul 12, 2024
1de60f2
Merge remote-tracking branch 'origin/main' into oa/streams/ap
Devindi97 Jul 12, 2024
ecc9a1d
Merge remote-tracking branch 'origin/rtd_v2.0.0' into oa/streams/ap
Devindi97 Jul 12, 2024
0a30d3c
OA-Streams: Anomaly Prediction #886
detlefarend Jul 12, 2024
@@ -21,7 +21,7 @@ Reusing RL Environments

Alternatively, if your environment follows the Gym or PettingZoo interface, you can apply our
wrappers for integrating third-party packages with MLPro.
For more information about the available third-party packages, please click :ref:`here<target-package-third>`.
For more information about the available third-party packages, please click :ref:`here<target_extension_hub>`.
Then, you need to transfer the wrapped RL environment to a GT Game Board.


@@ -55,8 +55,8 @@ After following the below step-by-step guideline, we expect the user understands
After following the previous steps, we hope that you can practice MLPro-GT and start using this subpackage for your GT-related activities.
For more advanced features, we highly recommend checking out the following howto files:

(a) :ref:`Howto RL-HT-001: Hyperopt <Howto RL HT 001>`
(a) `Howto RL-HT-001: Hyperparameter Tuning using Hyperopt <https://mlpro-int-hyperopt.readthedocs.io/en/latest/content/01_examples_pool/howto.rl.ht.001.html>`_

(b) :ref:`Howto RL-HT-002: Optuna <Howto RL HT 002>`
(b) `Howto RL-HT-002: Hyperparameter Tuning using Optuna <https://mlpro-int-optuna.readthedocs.io/en/latest/content/01_examples_pool/howto.rl.ht.002.html>`_

(c) :ref:`Howto RL-ATT-001: Stagnation Detection <Howto RL ATT 001>`
(c) `Howto RL-ATT-001: Train and Reload Single Agent using Stagnation Detection (Gymnasium) <https://mlpro-int-sb3.readthedocs.io/en/latest/content/01_example_pool/03_howtos_att/howto_rl_att_001_train_and_reload_single_agent_gym_sd.html>`_
28 changes: 12 additions & 16 deletions doc/rtd/content/03_machine_learning/mlpro_rl/sub/02_getstarted.rst
@@ -44,17 +44,17 @@ After following the below step-by-step guideline, we expect the user understands

(a) :ref:`Howto RL-001: Reward <Howto RL 001>`

(b) :ref:`Howto RL-AGENT-001: Run an Agent with Own Policy <Howto Agent RL 001>`
(b) `Howto RL-AGENT-001: Run an Agent with Own Policy <https://mlpro-int-gymnasium.readthedocs.io/en/latest/content/01_example_pool/01_howtos_rl/howto_rl_agent_001_run_agent_with_own_policy_on_gym_environment.html>`_

**5. Understanding Agent in MLPro-RL**
In reinforcement learning, there are two types of agents: single-agent RL and multi-agent RL. Both types are covered by MLPro-RL.
To understand the different possibilities of an agent in MLPro, you can visit :ref:`this page <target_agents_RL>`.

Then, you need to understand how to set up a single-agent and a multi-agent RL in MLPro-RL by following these examples:

(a) :ref:`Howto RL-AGENT-001: Run an Agent with Own Policy <Howto Agent RL 001>`
(a) `Howto RL-AGENT-001: Run an Agent with Own Policy <https://mlpro-int-gymnasium.readthedocs.io/en/latest/content/01_example_pool/01_howtos_rl/howto_rl_agent_001_run_agent_with_own_policy_on_gym_environment.html>`_

(b) :ref:`Howto RL-AGENT-003: Run Multi-Agent with Own Policy <Howto Agent RL 003>`
(b) `Howto RL-AGENT-003: Run Multi-Agent with Own Policy <https://mlpro-int-gymnasium.readthedocs.io/en/latest/content/01_example_pool/01_howtos_rl/howto_rl_agent_003_run_multiagent_with_own_policy_on_multicartpole_environment.html>`_
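For orientation, the following is a minimal sketch of how a custom policy plugs into a single agent. Class names, import paths and signatures are assumptions based on the MLPro-RL templates; the howtos above remain the authoritative reference.

.. code-block:: python

    import random
    from mlpro.rl import Policy, Action   # import paths assumed

    class MyRandomPolicy(Policy):
        """Hypothetical policy illustrating the MLPro-RL policy interface (sketch only)."""

        C_NAME = 'MyRandomPolicy'

        def compute_action(self, p_state):
            # one random value per action dimension (illustration only)
            num_dim = self.get_action_space().get_num_dim()
            values  = [random.uniform(-1, 1) for _ in range(num_dim)]
            return Action(self.get_id(), self.get_action_space(), values)

        def _adapt(self, p_sars_elem) -> bool:
            # no learning in this sketch
            return False

    # A single agent wraps the policy; a MultiAgent aggregates several such agents.
    # agent = Agent(p_policy=MyRandomPolicy(p_observation_space=obs_space,
    #                                       p_action_space=act_space,
    #                                       p_ada=True))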

**6. Selecting between Model-Free and Model-Based RL**
In this section, you need to select the direction of your RL training: model-free RL or model-based RL.
@@ -66,36 +66,32 @@ After following the below step-by-step guideline, we expect the user understands

(a) `A sample application video of MLPro-RL on a UR5 robot <https://ars.els-cdn.com/content/image/1-s2.0-S2665963822001051-mmc2.mp4>`_

(b) :ref:`Howto RL-AGENT-002: Train an Agent with Own Policy <Howto Agent RL 002>`
(b) `Howto RL-AGENT-002: Train an Agent with Own Policy <https://mlpro-int-gymnasium.readthedocs.io/en/latest/content/01_example_pool/01_howtos_rl/howto_rl_agent_002_train_agent_with_own_policy_on_gym_environment.html>`_

(c) :ref:`Howto RL-AGENT-004: Train Multi-Agent with Own Policy <Howto Agent RL 004>`
(c) `Howto RL-AGENT-004: Train Multi-Agent with Own Policy <https://mlpro-int-gymnasium.readthedocs.io/en/latest/content/01_example_pool/01_howtos_rl/howto_rl_agent_004_train_multiagent_with_own_policy_on_multicartpole_environment.html>`_

* Model-Based Reinforcement Learning

Model-based RL comprises two learning paradigms: learning the environment (model-based learning) and utilizing the model (e.g. as an action planner).
To practice model-based RL in the MLPro-RL package, the following howto files can be followed:

(a) :ref:`Howto RL-MB-001: Train and Reload Model Based Agent (Gym) <Howto MB RL 001>`
(a) `Howto RL-MB-001: Train and Reload Model Based Agent (Gymnasium) <https://mlpro-int-sb3.readthedocs.io/en/latest/content/01_example_pool/04_howtos_mb/howto_rl_mb_001_train_and_reload_model_based_agent_gym%20copy.html>`_

(b) :ref:`Howto RL-MB-002: MBRL with MPC on Grid World Environment <Howto MB RL 002>`
(b) :ref:`Howto RL-MB-001: MBRL with MPC on Grid World Environment <Howto MB RL 001>`

For a more advanced MBRL technique, e.g. applying a native MBRL network, here is an example that can be used as a reference:

(c) :ref:`Howto RL-MB-003: MBRL on RobotHTM Environment <Howto MB RL 003>`
(c) `Howto RL-MB-002: MBRL on RobotHTM Environment <https://mlpro-int-sb3.readthedocs.io/en/latest/content/01_example_pool/04_howtos_mb/howto_rl_mb_002_robothtm_environment.html>`_


**7. Additional Guidance**
After following the previous steps, we hope that you can practice MLPro-RL and start using this subpackage for your RL-related activities.
For more advanced features, we highly recommend checking out the following howto files:

(a) :ref:`Howto RL-AGENT-011: Train and Reload Single Agent (Gym) <Howto Agent RL 011>`
(a) `Howto RL-AGENT-001: Train and Reload Single Agent (Gymnasium) <https://mlpro-int-sb3.readthedocs.io/en/latest/content/01_example_pool/01_howtos_agent/howto_rl_agent_001_train_and_reload_single_agent_gym.html>`_

(b) :ref:`Howto RL-AGENT-021: Train and Reload Single Agent (MuJoCo) <Howto Agent RL 021>`
(b) `Howto RL-HT-001: Hyperparameter Tuning using Hyperopt <https://mlpro-int-hyperopt.readthedocs.io/en/latest/content/01_examples_pool/howto.rl.ht.001.html>`_

(c) :ref:`Howto RL-HT-001: Hyperopt <Howto RL HT 001>`
(c) `Howto RL-HT-002: Hyperparameter Tuning using Optuna <https://mlpro-int-optuna.readthedocs.io/en/latest/content/01_examples_pool/howto.rl.ht.002.html>`_

(d) :ref:`Howto RL-HT-002: Optuna <Howto RL HT 002>`

(e) :ref:`Howto RL-ATT-001: Stagnation Detection <Howto RL ATT 001>`

(f) :ref:`Howto RL-ATT-002: SB3 Policy with Stagnation Detection <Howto RL ATT 002>`
(d) `Howto RL-ATT-001: Train and Reload Single Agent using Stagnation Detection (Gymnasium) <https://mlpro-int-sb3.readthedocs.io/en/latest/content/01_example_pool/03_howtos_att/howto_rl_att_001_train_and_reload_single_agent_gym_sd.html>`_
12 changes: 6 additions & 6 deletions doc/rtd/content/03_machine_learning/mlpro_rl/sub/03_env.rst
@@ -33,17 +33,17 @@ There are two main possibilities to set up an environment in MLPro, such as,
env/customenv
env/pool

Alternatively, you can also :ref:`reuse available environments from 3rd-party packages via wrapper classes <target-package-third>` (currently available: OpenAI Gym or PettingZoo).
Alternatively, you can also :ref:`reuse available environments from 3rd-party packages via wrapper classes <target_extension_hub>` (currently available: Gymnasium or PettingZoo).

For reusing the 3rd-party packages, we developed a wrapper technology that transforms an environment from a 3rd-party package into an MLPro-compatible environment.
Additionally, we also provide wrappers for the other direction, from an MLPro environment to the 3rd-party package.
At the moment, there are two ready-to-use wrapper classes. The first wrapper class is intended for OpenAI Gym and the second wrapper is intended for PettingZoo.
At the moment, there are two ready-to-use wrapper classes. The first wrapper class is intended for Gymnasium and the second wrapper is intended for PettingZoo.
The use of the wrapper classes is explained step by step in our how-to files, as follows:

(1) :ref:`OpenAI Gym to MLPro <Howto WP RL 004>`,
(1) `Gymnasium to MLPro <https://mlpro-int-gymnasium.readthedocs.io/en/latest/content/01_example_pool/01_howtos_rl/howto_rl_wp_002_gymnasium_environment_to_mlpro_environment.html>`_,

(2) :ref:`MLPro to OpenAI Gym <Howto WP RL 001>`,
(2) `MLPro to Gymnasium <https://mlpro-int-gymnasium.readthedocs.io/en/latest/content/01_example_pool/01_howtos_rl/howto_rl_wp_001_mlpro_environment_to_gymnasium_environment.html>`_,

(3) :ref:`PettingZoo to MLPro <Howto WP RL 003>`, and
(3) `PettingZoo to MLPro <https://mlpro-int-pettingzoo.readthedocs.io/en/latest/content/01_example_pool/01_howtos_rl/howto_rl_wp_002_run_multiagent_with_own_policy_on_petting_zoo_environment.html>`_, and

(4) :ref:`MLPro to PettingZoo <Howto WP RL 002>`.
(4) `MLPro to PettingZoo <https://mlpro-int-pettingzoo.readthedocs.io/en/latest/content/01_example_pool/01_howtos_rl/howto_rl_wp_001_mlpro_environment_to_petting_zoo_environment.html>`_.
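To illustrate direction (1), here is a minimal sketch of wrapping a Gymnasium environment for MLPro. The wrapper class name, its import path and parameters are assumptions based on the mlpro-int-gymnasium extension; the howto linked above shows the verified version.

.. code-block:: python

    import gymnasium as gym
    from mlpro_int_gymnasium.wrappers import WrEnvGYM2MLPro   # import path assumed

    # native Gymnasium environment
    gym_env = gym.make('CartPole-v1')

    # wrapped as an MLPro-compatible environment
    mlpro_env = WrEnvGYM2MLPro(p_gym_env=gym_env, p_visualize=False, p_logging=False)

    # mlpro_env can now be plugged into an MLPro-RL scenario like any native environment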
@@ -26,5 +26,5 @@ Moreover, the users can create either a single-agent scenario or a multi-agent s
**Cross Reference**

- :ref:`MLPro-RL: Training <target_training_RL>`
- :ref:`Howto RL-AGENT-001: Run an Agent with Own Policy <Howto Agent RL 001>`
- :ref:`Howto RL-AGENT-003: Run Multi-Agent with Own Policy <Howto Agent RL 003>`
- `Howto RL-AGENT-001: Run an Agent with Own Policy <https://mlpro-int-gymnasium.readthedocs.io/en/latest/content/01_example_pool/01_howtos_rl/howto_rl_agent_001_run_agent_with_own_policy_on_gym_environment.html>`_
- `Howto RL-AGENT-003: Run Multi-Agent with Own Policy <https://mlpro-int-gymnasium.readthedocs.io/en/latest/content/01_example_pool/01_howtos_rl/howto_rl_agent_003_run_multiagent_with_own_policy_on_multicartpole_environment.html>`_
16 changes: 7 additions & 9 deletions doc/rtd/content/03_machine_learning/mlpro_rl/sub/06_train.rst
@@ -23,7 +23,7 @@ In this RL training, we always start with a defined random initial state of the
(3) **Event Timeout**: The maximum number of training cycles for an episode has been reached and the current episode ends.

If none of the events is triggered, the training continues. The goal of the training is to maximize the score of the repeated evaluations.
In this case, a :ref:`stagnation detection functionality <Howto RL ATT 001>` can be incorporated to avoid a long training time without any more improvements.
In this case, a `stagnation detection functionality <https://mlpro-int-sb3.readthedocs.io/en/latest/content/01_example_pool/03_howtos_att/howto_rl_att_001_train_and_reload_single_agent_gym_sd.html>`_ can be incorporated to avoid a long training time without any more improvements.
The training can be ended once stagnation is detected. For more information, see `Section 4.3 of the MLPro 1.0 paper <https://doi.org/10.1016/j.mlwa.2022.100341>`_.

In MLPro-RL, we simplify the process of setting up an RL scenario and training for both single-agent and multi-agent RL, as shown below:
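A condensed sketch of such a setup follows; class and parameter names are assumptions based on the MLPro-RL API, and the full template is collapsed in the diff below.

.. code-block:: python

    from mlpro.rl import RLScenario, RLTraining   # import paths assumed

    class MyScenario(RLScenario):
        C_NAME = 'MyScenario'

        def _setup(self, p_mode, p_ada, p_visualize, p_logging):
            # instantiate the environment, store it in self._env, build the
            # (single or multi) agent and return it as the scenario's model
            ...

    training = RLTraining(p_scenario_cls=MyScenario,
                          p_cycle_limit=10000,     # overall number of training cycles
                          p_stagnation_limit=5,    # optional: stop after repeated stagnation
                          p_visualize=False,
                          p_logging=False)
    training.run()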
@@ -125,12 +125,10 @@ In MLPro-RL, we simplify the process of setting up an RL scenario and training f
**Cross Reference**

- `A sample application video of MLPro-RL on a UR5 robot <https://ars.els-cdn.com/content/image/1-s2.0-S2665963822001051-mmc2.mp4>`_
- :ref:`Howto RL-AGENT-002: Train an Agent with Own Policy <Howto Agent RL 002>`
- :ref:`Howto RL-AGENT-004: Train Multi-Agent with Own Policy <Howto Agent RL 004>`
- :ref:`Howto RL-AGENT-011: Train and Reload Single Agent (Gym) <Howto Agent RL 011>`
- :ref:`Howto RL-AGENT-021: Train and Reload Single Agent (MuJoCo) <Howto Agent RL 021>`
- :ref:`Howto RL-ATT-001: Train and Reload Single Agent using Stagnation Detection (Gym) <Howto RL ATT 001>`
- :ref:`Howto RL-ATT-002: Train and Reload Single Agent using Stagnation Detection (MuJoCo) <Howto RL ATT 002>`
- :ref:`Howto RL-MB-001: Train and Reload Model Based Agent (Gym) <Howto MB RL 001>`
- :ref:`Howto RL-MB-002: MBRL with MPC on Grid World Environment <Howto MB RL 002>`
- `Howto RL-AGENT-002: Train an Agent with Own Policy <https://mlpro-int-gymnasium.readthedocs.io/en/latest/content/01_example_pool/01_howtos_rl/howto_rl_agent_002_train_agent_with_own_policy_on_gym_environment.html>`_
- `Howto RL-AGENT-004: Train Multi-Agent with Own Policy <https://mlpro-int-gymnasium.readthedocs.io/en/latest/content/01_example_pool/01_howtos_rl/howto_rl_agent_004_train_multiagent_with_own_policy_on_multicartpole_environment.html>`_
- `Howto RL-AGENT-001: Train and Reload Single Agent (Gymnasium) <https://mlpro-int-sb3.readthedocs.io/en/latest/content/01_example_pool/01_howtos_agent/howto_rl_agent_001_train_and_reload_single_agent_gym.html>`_
- `Howto RL-ATT-001: Train and Reload Single Agent using Stagnation Detection (Gymnasium) <https://mlpro-int-sb3.readthedocs.io/en/latest/content/01_example_pool/03_howtos_att/howto_rl_att_001_train_and_reload_single_agent_gym_sd.html>`_
- `Howto RL-MB-001: Train and Reload Model Based Agent (Gymnasium) <https://mlpro-int-sb3.readthedocs.io/en/latest/content/01_example_pool/04_howtos_mb/howto_rl_mb_001_train_and_reload_model_based_agent_gym%20copy.html>`_
- :ref:`Howto RL-MB-001: MBRL with MPC on Grid World Environment <Howto MB RL 001>`
- :ref:`MLPro-BF-ML: Training and Tuning <target_bf_ml_train_and_tune>`
@@ -59,7 +59,7 @@ Custom Policies
- **Policy from Third Party Packages**

Alternatively, the user can also apply algorithms from Stable Baselines 3 by using the developed relevant wrapper for the integration between third-party packages and MLPro.
For more information, please click :ref:`here<target-package-third>`.
For more information, please click :ref:`here<target_extension_hub>`.
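The following is a minimal sketch of reusing a Stable Baselines 3 algorithm as an MLPro policy. The wrapper class name, import path and parameters are assumptions based on the mlpro-int-sb3 extension; consult its documentation for the verified version.

.. code-block:: python

    from stable_baselines3 import PPO
    from mlpro_int_sb3.wrappers import WrPolicySB32MLPro   # import path assumed

    # SB3 algorithm, set up without its own environment handling
    sb3_algo = PPO(policy='MlpPolicy', env=None, _init_setup_model=False)

    # wrapped so that an MLPro Agent can use it as its policy
    # (obs_space / act_space: hypothetical space objects taken from the MLPro environment)
    policy = WrPolicySB32MLPro(p_sb3_policy=sb3_algo,
                               p_cycle_limit=10000,
                               p_observation_space=obs_space,
                               p_action_space=act_space,
                               p_ada=True,
                               p_logging=False)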

- **Algorithm Checker**

@@ -3,7 +3,7 @@ Model-Based Agents
==================

Model-Based Agents have a different learning target than Model-Free Agents: learning an environment model is not required in model-free RL.
An environment model can be incorporated into a single agent, see :ref:`EnvModel <customEnvModel>` for an overview.
An environment model can be incorporated into a single agent, as **EnvModel**.
Then, this model learns the behaviour and dynamics of the environment.
After learning the environment, the model is optimized to be able to accurately predict the output states, rewards, or status of the environment with respect to the calculated actions.
As a result, if the predictions of the subsequent state and reward diverge too far from the actual values of the environment, the environment model itself is incorporated into the agent's adaptation process and is always retrained.
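Structurally, the environment model is handed to the agent alongside its policy. The following sketch uses hypothetical placeholders (my_policy, my_envmodel, my_planner), and the parameter names are assumptions based on the MLPro-RL agent template.

.. code-block:: python

    from mlpro.rl import Agent   # import path assumed

    # my_policy, my_envmodel and my_planner are hypothetical placeholders for a
    # policy, a learnable EnvModel and an optional action planner (e.g. MPC)
    agent = Agent(p_policy=my_policy,
                  p_envmodel=my_envmodel,       # learned model of the environment
                  p_em_acc_thsld=0.9,           # accuracy threshold below which the model is retrained
                  p_action_planner=my_planner)  # optional action planner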
@@ -112,8 +112,8 @@ the original environment module.

**Cross Reference**

- :ref:`Howto RL-MB-001: Train and Reload Model Based Agent (Gym) <Howto MB RL 001>`
- :ref:`Howto RL-MB-002: MBRL with MPC on Grid World Environment <Howto MB RL 002>`
- :ref:`Howto RL-MB-003: MBRL on RobotHTM Environment <Howto MB RL 003>`
- `Howto RL-AGENT-001: Train and Reload Single Agent (Gymnasium) <https://mlpro-int-sb3.readthedocs.io/en/latest/content/01_example_pool/01_howtos_agent/howto_rl_agent_001_train_and_reload_single_agent_gym.html>`_
- :ref:`Howto RL-MB-001: MBRL with MPC on Grid World Environment <Howto MB RL 001>`
- `Howto RL-MB-002: MBRL on RobotHTM Environment <https://mlpro-int-sb3.readthedocs.io/en/latest/content/01_example_pool/04_howtos_mb/howto_rl_mb_002_robothtm_environment.html>`_
- :ref:`MLPro-SL <target_bf_sl_afct>`

@@ -10,9 +10,9 @@ It is compatible with single-agent but does not have its own policy.
Instead, it is utilized to combine and control any number of single agents that together handle the action computation.
Every single agent in this setting interacts with a separate portion of the surrounding multi-agent observation and action space.
Multi-agent interactions take place in environments that support the reward type of one scalar reward per agent.
These are native applications that incorporate the MLPro environment template or PettingZoo environments that may be incorporated using the corresponding :ref:`wrapper class<target-package-third>` offered by MLPro.
These are native applications that incorporate the MLPro environment template or PettingZoo environments that may be incorporated using the corresponding :ref:`wrapper class<target_extension_hub>` offered by MLPro.
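A minimal sketch of assembling a multi-agent follows; method and parameter names are assumptions based on the MLPro-RL templates, and agent_1/agent_2 are hypothetical Agent instances.

.. code-block:: python

    from mlpro.rl import MultiAgent   # import path assumed

    multi_agent = MultiAgent(p_name='My multi-agent', p_ada=True, p_logging=False)

    # each single agent controls its own portion of the joint observation and action space
    multi_agent.add_agent(p_agent=agent_1, p_weight=1.0)
    multi_agent.add_agent(p_agent=agent_2, p_weight=1.0)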


**Cross Reference**
- :ref:`Howto RL-AGENT-004: Train Multi-Agent with Own Policy <Howto Agent RL 004>`
- `Howto RL-AGENT-004: Train Multi-Agent with Own Policy <https://mlpro-int-gymnasium.readthedocs.io/en/latest/content/01_example_pool/01_howtos_rl/howto_rl_agent_004_train_multiagent_with_own_policy_on_multicartpole_environment.html>`_
- :ref:`MLPro-RL: Training <target_training_RL>`
@@ -41,5 +41,5 @@ Depending on the number of planning horizon, but we believe that this reduces th

**Citation**

If you apply this policy in your research or work, please :ref:`cite <target_publications>` us and the `original paper <https://ieeexplore.ieee.org/document/7989202>`_.
If you apply this policy in your research or work, please :ref:`cite <target_publications>` us and the `original paper <https://ieeexplore.ieee.org/document/10185716>`_.

@@ -196,7 +196,7 @@ Developing Custom Environments

Alternatively, if your environment follows the Gym or PettingZoo interface, you can apply our
wrappers for integrating third-party packages with MLPro. For more
information, please click :ref:`here<target-package-third>`.
information, please click :ref:`here<target_extension_hub>`.

- **Environment Checker**

@@ -9,4 +9,5 @@ Reusing Environment from the Pool
pool/multicartpole
pool/gridworld
pool/robotmanipulator
pool/doublependulum
pool/doublependulum
pool/2Dcollisiondetection