A library for applying reinforcement learning to inspection and maintenance planning of deteriorating engineering systems. This library was developed primarily as a pedagogical exercise and for research use.
Example rollout of a DDQN agent in a 5-out-of-5 system:
```shell
conda create --name imprl_env -y python==3.9
conda activate imprl_env
pip install poetry==1.8  # or: conda install -c conda-forge poetry==1.8
poetry install
```
Following best practices, `poetry install` installs the dependencies pinned in the `poetry.lock` file. This file rigorously specifies all the dependencies required to build the library, ensuring that the project does not break because of unexpected changes in (transitive) dependencies (more info).
Installing additional packages
You can add them via `poetry add` (official docs) on the command line.
For example, to install Jupyter Notebook:

```shell
# Allow >=7.1.2, <8.0.0 versions
poetry add notebook@^7.1.2
```
This resolves the package dependencies (adjusting versions of transitive dependencies if necessary) and installs the package. If the dependency cannot be resolved, relax the version constraint and try again.
For logging, the library relies on wandb. You can log in to wandb using your private API key:

```shell
wandb login
# <enter wandb API key>
```
The following (multiagent) reinforcement learning algorithms are implemented:
- Double Deep Q-Network (DDQN)
- Joint Actor Critic (JAC)
- Deep Centralized Multiagent Actor Critic (DCMAC)
- Deep Decentralized Multiagent Actor Critic (DDMAC)
- Independent Actor Centralized Critic (IACC)
- Independent Actor Centralized Critic with Parameter Sharing (IACC-PS)
- Independent Actor Critic (IAC)
- Independent Actor Critic with Parameter Sharing (IAC-PS)
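To illustrate the value-based baseline in the list above: Double DQN decouples action *selection* (online network) from action *evaluation* (target network) when forming the bootstrap target, which reduces Q-value overestimation. A minimal sketch in plain Python; the function name and signature are illustrative, not the library's API:

```python
def ddqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN bootstrap target for a single transition.

    next_q_online / next_q_target: Q-values over actions at the next
    state, from the online and target networks respectively.
    (Illustrative sketch; not imprl's actual interface.)
    """
    if done:
        return reward
    # Select the greedy action with the ONLINE network ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... but evaluate it with the TARGET network.
    return reward + gamma * next_q_target[best_action]
```

For example, with `next_q_online = [0.2, 0.8]` the online network picks action 1, which the target network values at `next_q_target[1]`; a vanilla DQN would instead take `max(next_q_target)` directly.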
The base actor-critic algorithm is ACER, from "Sample Efficient Actor-Critic with Experience Replay" by Wang et al., an off-policy algorithm that replays experience with importance-weighted corrections.
| Paradigm | Mathematical Framework | Algorithm | Observation | Action | Critic | Actor |
|----------|------------------------|-----------|-------------|--------|--------|-------|
| CTCE | POMDP | JAC | Joint | Joint | Centralized | Shared |
| | MPOMDP | DCMAC | Joint | Factored | Centralized | Shared |
| | | DDMAC | Joint | Factored | Centralized | Independent |
| CTDE | Dec-POMDP | IACC | Independent | Independent | Centralized | Independent |
| | | IACC-PS | Independent | Independent | Centralized | Shared |
| DTDE | Dec-POMDP | IAC | Independent | Independent | Decentralized | Independent |
| | | IAC-PS | Independent | Independent | Decentralized | Shared |
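The paradigms above differ chiefly in what the critic conditions on during training: CTCE and CTDE critics see the joint observation, while a fully decentralized (DTDE) critic sees only the agent's local observation. A toy sketch of assembling the critic input from per-component observations; the function and paradigm strings are purely illustrative, not the library's interface:

```python
def critic_input(observations, paradigm, agent=0):
    """Assemble the critic input for one agent.

    observations: list of per-component observation vectors.
    CTCE/CTDE: centralized critic conditions on the joint observation.
    DTDE: decentralized critic sees only the agent's local observation.
    (Illustrative sketch; not imprl's actual interface.)
    """
    if paradigm in ("CTCE", "CTDE"):
        # Concatenate all components' observations into a joint vector.
        return [x for obs in observations for x in obs]
    elif paradigm == "DTDE":
        return list(observations[agent])
    raise ValueError(f"unknown paradigm: {paradigm}")
```

Note the distinction only matters at training time for CTDE: at execution, both CTDE and DTDE actors act on local observations.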
This project utilizes the clever abstractions in EPyMARL, and the author would like to acknowledge the insights shared in Reinforcement Learning Implementation Tips and Tricks, which informed the development of this library.
- IMP-MARL: a platform for benchmarking the scalability of cooperative MARL methods in real-world engineering applications.