Skip to content

raaslab/rl_multi_agent

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

46 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

rl_maxm-visibility

Using Reinforcement learning to solve the persistent monitoring problem using a Multi-Agent setup. A decentralized system using Proximal Policy Optimization (PPO) is trained with various scenarios. There is no cooperation introduced, just a local view of the agent (25x25 sq. units area around the agent) and compressed minimap (50x50 sq. units environment compressed to 25x25 sq. units) is provided as an input to the agent which will then decide to execute one of 4 descrete motions (stay, move up, left, down and right).

The environment is made of descrete element that accumulate a penalty value based on a pre-defined decay rate until the agent observes the element in it visibility region. The sum of all the penalty values of all the elements of the map is used to train the agent. The agent must uncover the right behavior to keep observing every descrete element in the map to achieve high reward (less penalty), hence Persistent Monitoring Problem.

Link to Logs

Logs to Discussions and work on the project

Training Models

  1. A single agent was trained on an environment with 2 compartments/ rooms. The final behavior can be seen bellow 1Agent_2Room

  2. Two agents were trained on an environment with 2 compartments/ rooms. The final behavior can be seen bellow 2Agent_2Room

  3. Two agents were trained on an environment with 2 compartments/ rooms. The final behavior can be seen bellow 2Agent_4Room

This system is further trained to find the maximum number of agents it can handle before a cooperation based methodology is used.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%