Logic Based Reward Shaping for Multi-agent Reinforcement Learning (MARL)

This repository contains the implementation of the project described in this document.

This repository also includes the implementation of the learning-based synthesis algorithm described in this article. That implementation was developed by Alper Kamil Bozkurt and is taken from this repository.

The video rendering and recording are based on this gridworld repository.

Dependencies

Python: (>=3.5)
Rabinizer 4: ltl2ldba must be in PATH (ltl2ldra is optional)
NumPy: (>=1.15)
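
Because the LTL translation depends on external Rabinizer executables, it can be useful to confirm they are visible on PATH before running anything. The following is a minimal sketch using only the Python standard library; the helper name `find_rabinizer_tools` is hypothetical and is not part of this repository.

```python
import shutil

def find_rabinizer_tools(tools=("ltl2ldba", "ltl2ldra")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in tools}

if __name__ == "__main__":
    found = find_rabinizer_tools()
    for name, path in found.items():
        print(f"{name}: {path or 'not found on PATH'}")
    if found["ltl2ldba"] is None:
        # ltl2ldba is the required translator; ltl2ldra is optional.
        print("ltl2ldba is required; add the directory with the Rabinizer 4 executables to PATH.")
```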

The examples in this repository also require the following optional libraries for visualization:

Matplotlib: (>=3.03)
JupyterLab: (>=1.0)
ipywidgets: (>=7.5)
Spot Library

Installation

To install the current release together with the CSRL codebase:

git clone https://github.com/IngyN/macsrl.git
cd macsrl
pip3 install .

Basic Usage of this repo

The main class of this repo is MultiControlSynthesis. It takes a set of ControlSynthesis objects (based on the number of agents), a GridMDP object, and an OmegaAutomaton object representing the shared automaton. Set sharedoa=True to use our method.

Graphing is done by saving the episode returns and then loading them in the graphing notebook (graphing.ipynb). For video rendering, we use the Annotation and Plotter classes in annotation.py and plotter.py.
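
As a rough illustration of the graphing step, the sketch below reads saved episode returns from CSV text and smooths them with a moving average, which is a common preprocessing step before plotting. The CSV layout (a header row, one row per episode, one column per agent), the helper names `load_returns` and `moving_average`, and the numbers are all assumptions made for this example. The returns files in this repository may be structured differently.

```python
import csv
import io

def load_returns(csv_file):
    """Read a CSV with a header row into {column_name: [float, ...]}."""
    reader = csv.DictReader(csv_file)
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(float(value))
    return columns

def moving_average(values, window):
    """Return the mean of each full sliding window over `values`."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

# Made-up data standing in for a saved returns file (the layout is assumed).
demo = io.StringIO("agent_0,agent_1\n0.0,1.0\n2.0,1.0\n4.0,4.0\n6.0,4.0\n")
returns = load_returns(demo)
smoothed = {name: moving_average(vals, window=2) for name, vals in returns.items()}
print(smoothed)
```

To use this with a file on disk, pass an open file handle instead of the in-memory buffer, for example `with open(path, newline="") as f: load_returns(f)`.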

Basic Usage of CSRL

The package consists of three main classes: GridMDP, OmegaAutomaton, and ControlSynthesis. The class GridMDP constructs a grid-world MDP using the parameters shape, structure, and label. The class OmegaAutomaton takes an LTL formula ltl and translates it into an LDBA. The class ControlSynthesis can then be used to compose a product MDP of the given GridMDP and OmegaAutomaton objects, and its method q_learning can be used to learn a control policy for the given objective.
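
To make the idea of a product MDP concrete, here is a small standalone sketch in plain Python. It does not use the csrl package or its API, and every name in it is hypothetical. The automaton is written by hand for the task "visit a, then visit b" rather than translated from an LTL formula by Rabinizer, and the reward is simplified to 1 on reaching the accepting automaton state, which is not necessarily how the package assigns rewards. Roughly, `SHAPE` and `LABEL` play the role of the grid-world MDP, `DELTA` plays the role of the automaton, and `step` and `q_learning` stand in for the product construction and learning.

```python
import random

ACTIONS = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}
ACTION_NAMES = list(ACTIONS)
SHAPE = (3, 3)
LABEL = {(0, 2): "a", (2, 0): "b"}   # atomic proposition that holds in each labelled cell
# Hand-written automaton for "visit a, then visit b"; state 2 is accepting.
DELTA = {(0, "a"): 1, (1, "b"): 2}
ACCEPTING = 2

def step(state, action):
    """One product transition: move in the grid, then advance the automaton."""
    (r, c), q = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < SHAPE[0] and 0 <= nc < SHAPE[1]):
        nr, nc = r, c                      # bumping into a wall leaves the agent in place
    nq = DELTA.get((q, LABEL.get((nr, nc))), q)
    reward = 1.0 if nq == ACCEPTING else 0.0
    return ((nr, nc), nq), reward

def q_learning(episodes=2000, horizon=100, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy (off-policy)."""
    rng = random.Random(seed)
    Q = {}
    for _ in range(episodes):
        state = ((0, 0), 0)
        for _ in range(horizon):
            action = rng.choice(ACTION_NAMES)
            nxt, reward = step(state, action)
            done = nxt[1] == ACCEPTING
            best_next = 0.0 if done else max(Q.get((nxt, a), 0.0) for a in ACTION_NAMES)
            old = Q.get((state, action), 0.0)
            Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            if done:
                break
            state = nxt
    return Q

def greedy_rollout(Q, max_steps=20):
    """Follow the greedy policy from the start; return visited cells and success flag."""
    state, path = ((0, 0), 0), [(0, 0)]
    for _ in range(max_steps):
        action = max(ACTION_NAMES, key=lambda a: Q.get((state, a), 0.0))
        state, _ = step(state, action)
        path.append(state[0])
        if state[1] == ACCEPTING:
            break
    return path, state[1] == ACCEPTING

Q = q_learning()
path, accepted = greedy_rollout(Q)
print(path, accepted)
```

The key design point is that the learner's state is the pair (grid cell, automaton state), so the same cell can carry different values depending on how much of the objective has already been satisfied.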

Examples

The repository contains a couple of example IPython notebooks:

Animations of the case studies:

HTML representation of the Automatons:

