Multi-agent Path Finding using Reinforcement Learning

Description

Multi-agent pathfinding in partially observable environments. Search-based vs. RL-based algorithms.

The main goal of this repository is to provide a DHC [1] model implementation alongside with some benchmarks and charts. We also aim to compare the performance of the DHC model with the basic M* algorithm.

Requirements

In order for models.dhc.train to be successfully run, you have to have a machine equipped with 1 GPU and several CPUs. Consider having num_cpus - 2 actors configured through the dhc.train.num_actors in config.yaml

Attention: We do not guarantee the desired performance on a non-GPU machine.

While we aim at supporting MacOS, Linux and Windows platforms, the successful training is not guaranteed on a Windows-based machine. The benchmarking script should work there, though. Please report it here if it doesn't.

Setting up

Install Poetry
Run poetry install to install the dependencies

If you see Failed to create the collection: Prompt dismissed.. this error when trying to run poetry install, consider executing this line first:

export PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring

Repository description & Usage

models dir contains the weights of the trained models
config.yaml - training & model params, environmental settings etc.
pathfinding/models provides one with the implementation of different models

Models

DHC

Distributed, Heuristic and Communication [1]

To guide RL algorithm on long-horizon goal-oriented tasks, we embed the potential choices of shortest paths from single source as heuristic guidance instead of using a specific path as in most existing works. Our method treats each agent independently and trains the model from a single agent’s perspective. The final trained policy is applied to each agent for decentralized execution. The whole system is distributed during training and is trained under a curriculum learning strategy.

M*

Multi-dimensional A*

Details:

Generate test cases

To generate test cases, run

poetry run python3 pathfinding/utils.py generate_test_suits TESTS_DESCR REPEAT_FOR

where

TESTS_DESCR is a string of the format '[(map_length, num_agents, density), ...]'
REPEAT_FOR is the amount of test cases of each configuration

For example, by running

poetry run python3 pathfinding/utils.py generate_test_suits '[(40, 16, 0.3), (80, 4, 0.1)]' 10

you will create 20 test cases in total:

10 cases with a 40x40 map with a density of 30% + 16 agents in it
10 cases with a 80x80 map with a density of 10% + 4 agents in it

DHC

Benchmarking

To run the generated test suite, run

poetry run python3 pathfinding/models/dhc/evaluate.py test_model TESTS_DESCR MODEL_ID

where

TESTS_DESCR is a string of the format '[(map_length, num_agents, density), ...]' (you may want to copy this line from the generation command)
MODEL_ID is the name of the file from the models dir For example, by running

poetry run python3 pathfinding/models/dhc/evaluate.py test_model '[(40, 16, 0.3), (80, 4, 0.1)]' 60000

you will benchmark the models/60000.pth on the provided test cases

Attention: the test cases must be generated first!

Training

Set the desired actors amount by setting the appropriate value for dhc.train.num_actors in config.yaml

It is recommended to use the amount of CPU cores on you machine minus 2

To initialize training, run

poetry run python3 pathfinding/models/dhc/train.py

The models dir will be created afterwards where the weights of the intermediate models will be saved.

Visualizing

To visualize the results, run

poetry run python3 pathfinding/models/dhc/visualize.py MODEL_ID TEST_NAME TEST_ID

where

MODEL_ID is the name of the file from the models dir
TEST_NAME is the name of the file with tests, for example 80length_32agents_0.3density.pkl
TEST_ID [optional], id of the test from the provided test suite

M*

Benchmarking

To run the generated test suite, run

poetry run python3 pathfinding/models/mstar/evaluate.py test_model TESTS_DESCR

where

TESTS_DESCR is a string of the format '[(map_length, num_agents, density), ...]' (you may want to copy this line from the generation command) For example, by running

poetry run python3 pathfinding/models/mstar/evaluate.py test_model '[(10, 4, 0.1), (20, 4, 0.1)]'

you will benchmark the Multi-dimensional A* algorithm on the provided test cases

Attention: the test cases must be generated first!

While we aim at supporting different platforms, the current implementation requires the call to be successful on your platform.

Please ensure this is the case before running any benchmark for M*

g++ -O2 -std=c++17 -o 'pathfinding/models/mstar/main' 'pathfinding/models/mstar/main.cpp'

Our setup

The DHC network was trained on a single NVIDIA TESLA T4 GPU for 7 hours.

We used 20 CPU cores, 18 were used for the actors, additionally, 2 cores were used for the Learner and GlobalBuffer all together.

DHC Results

Our trained model outperforms PRIMAL benchmarks

Contributing

See the detailed contribution guide

Install black, you can likely run

pip3 install black

Use black to ensure that the codestyle remains great

poetry run black dir

Make sure tests are OK

poetry run pytest

Create a PR with new features

References

[1] Ma, Ziyuan and Luo, Yudong and Ma, Hang, 2021. Distributed Heuristic Multi-Agent Path Finding with Communication.

[2] Sartoretti, G., Kerr, J., Shi, Y., Wagner, G., Kumar, T.S., Koenig, S. and Choset, H., 2019. Primal: Pathfinding via reinforcement and imitation multi-agent learning. IEEE Robotics and Automation Letters, 4(3), pp.2378-2385.

Name		Name	Last commit message	Last commit date
Latest commit History 36 Commits
models		models
pathfinding		pathfinding
static		static
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
config.yaml		config.yaml
poetry.lock		poetry.lock
pyproject.toml		pyproject.toml

License

acforvs/multi-agent-pathfinding

Folders and files

Latest commit

History

Repository files navigation

Multi-agent Path Finding using Reinforcement Learning

Description

Requirements

Setting up

Repository description & Usage

Models

DHC

M*

Benchmarking

Training

Visualizing

Benchmarking

Our setup

DHC Results

Contributing

References

License

About

Topics

Resources

License

Stars

Watchers

Forks

Languages