scalable_maddpg

Scalable multi-agent reinforcement learning. Details can be found in the report.

To-do list

tune L2; do the LSTM parameters need an L2 regularizer?
fix environments
fix rewards
decrease the frequency of summaries
rearrange main.py
prey boundary problem
modify the initial position of the agents and prey
add another network for prey
add summary for rewards of each episode

Alternative to Gym

An alternative to the Gym environment is provided in env.py. The environment is rendered with Matplotlib, which makes it much easier to use. However, you need to implement the prey policy yourself.
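Since the prey policy is left to the user, a minimal random-walk policy might look like the sketch below. The function name `random_prey_step`, the 2-D `(x, y)` position format, the step size, and the square arena bounds are all assumptions made for illustration; they are not taken from env.py and would need to be adapted to its actual interface.

```python
import random


def random_prey_step(position, step_size=0.1, low=-1.0, high=1.0, rng=random):
    """Move the prey by one random step (hypothetical helper, not part of env.py).

    position  -- current (x, y) of the prey
    step_size -- maximum displacement per axis in one step (assumed value)
    low, high -- arena bounds on both axes (assumed square arena)
    rng       -- object with a uniform(a, b) method, e.g. random.Random(seed)
    """
    x, y = position
    # Draw an independent displacement for each axis.
    dx = rng.uniform(-step_size, step_size)
    dy = rng.uniform(-step_size, step_size)
    # Clip to the arena so the prey cannot walk out of bounds.
    new_x = min(max(x + dx, low), high)
    new_y = min(max(y + dy, low), high)
    return (new_x, new_y)
```

Passing a seeded generator, for example `random_prey_step(pos, rng=random.Random(0))`, makes the prey trajectory reproducible across runs.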

Results

We performed two independent runs. In each run, three agents were in the game from episode 1 to episode 3x10^4. At episode 3x10^4, we added three more agents to the game. Here we show the mean Q value over all agents in our experiments.
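The agent schedule and the reported metric can be sketched as follows. The names `num_agents_at` and `mean_q` are hypothetical and not taken from the repository, and the sketch assumes that each agent reports one scalar Q estimate and that the new agents are already active in episode 3x10^4 itself.

```python
ADD_AGENTS_EPISODE = 30_000  # 3x10^4, as stated in the text
INITIAL_AGENTS = 3
ADDED_AGENTS = 3


def num_agents_at(episode):
    """Return how many agents are in the game at a 1-indexed episode.

    Assumption: the three extra agents are present from episode 3x10^4 onward.
    """
    if episode < ADD_AGENTS_EPISODE:
        return INITIAL_AGENTS
    return INITIAL_AGENTS + ADDED_AGENTS


def mean_q(q_values):
    """Average one scalar Q estimate per agent (hypothetical helper)."""
    if not q_values:
        raise ValueError("q_values must contain at least one entry")
    return sum(q_values) / len(q_values)
```

With this schedule, the mean is taken over three values before the switch and over six values afterwards, so a jump in the curve at that episode can come from the new agents alone.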

Demo results

In this demo, the prey walks randomly and the agents learn to catch it.
