
Deep Reinforcement Learning - Collaboration and Competition Project

This notebook implements the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) reinforcement learning algorithm for the "Collaboration and Competition" project of the Udacity Deep Reinforcement Learning Nanodegree program.

By Sebastian Castro, 2020


Project Introduction

This project uses a version of the Tennis environment in Unity ML-Agents.

This environment consists of two tennis players, or agents, each of which has its own local set of observations, actions, and rewards. The specifics are discussed below, but the environment is structured such that a "good" game consists of an indefinite volley in which both players keep hitting the ball back to each other without letting it drop.

*Trained agents playing tennis (animation)*

The reinforcement learning specifics for each agent are:

  • State: 24 variables (8 observations stacked over 3 consecutive time steps) corresponding to the position and velocity of the ball and racket.
  • Actions: A vector with 2 elements -- one for moving towards/away from the net and another for jumping. Both are continuous variables between -1.0 and 1.0.
  • Reward: The agent receives +0.1 each time it hits the ball over the net, and -0.01 if it lets the ball hit the ground or hits it out of bounds. This incentivizes the agents to keep the ball in play indefinitely rather than score points, unlike a typical game of tennis.
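In the actual project, these observations and actions flow through the Unity ML-Agents Python API; the shapes implied by the list above can be sketched with plain NumPy (all names here are illustrative, not the environment's API):

```python
import numpy as np

NUM_AGENTS = 2       # two tennis players
OBS_PER_STEP = 8     # raw observations per time step
STACK = 3            # consecutive time steps stacked together
ACT_SIZE = 2         # move towards/away from net, jump

# Each agent's state is the 8 raw observations stacked over 3 steps -> 24 values.
state_size = OBS_PER_STEP * STACK

# A joint action for both agents: 2 agents x 2 continuous values, clipped to [-1, 1]
# as the environment expects.
actions = np.clip(np.random.randn(NUM_AGENTS, ACT_SIZE), -1.0, 1.0)

# A joint observation returned to the agents each step: shape (2, 24).
states = np.zeros((NUM_AGENTS, state_size))
```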

As per the project specification, the agents are considered to have "solved" the environment when the per-episode maximum of the 2 agents' returns, averaged over a 100-episode window, exceeds +0.5.
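The solve criterion above can be written out directly: take the maximum of the two agents' returns each episode, then check a 100-episode moving average against the target. A small sketch (function name and signature are illustrative):

```python
from collections import deque

import numpy as np


def is_solved(episode_scores, window=100, target=0.5):
    """Check the Tennis solve criterion.

    episode_scores: sequence of (agent_0_return, agent_1_return) pairs,
    one pair per episode. The environment counts as solved once the
    moving average (over `window` episodes) of the per-episode maximum
    exceeds `target`.
    """
    recent = deque(maxlen=window)
    for scores in episode_scores:
        recent.append(max(scores))  # per-episode score = max over the 2 agents
        if len(recent) == window and np.mean(recent) > target:
            return True
    return False
```

For example, 100 straight episodes with returns (0.6, 0.7) would satisfy the criterion, while 100 episodes of (0.1, 0.2) would not.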

To see more details about the MADDPG agent implementation, network and training hyperparameters, and results, refer to the Report included in this repository.
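The core idea of MADDPG is that each agent keeps a decentralized actor that acts on its own 24-value local observation, while training uses a centralized critic that sees both agents' observations and actions. A minimal PyTorch sketch of those two networks, with illustrative layer sizes (the layer widths and other hyperparameters used in this project are in the Report, not reproduced here):

```python
import torch
import torch.nn as nn

OBS_SIZE, ACT_SIZE, NUM_AGENTS = 24, 2, 2


class Actor(nn.Module):
    """Decentralized actor: maps one agent's local observation to its action."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_SIZE, 128), nn.ReLU(),
            nn.Linear(128, ACT_SIZE), nn.Tanh(),  # both actions lie in [-1, 1]
        )

    def forward(self, obs):
        return self.net(obs)


class Critic(nn.Module):
    """Centralized critic: scores the joint observations and actions of all agents."""

    def __init__(self):
        super().__init__()
        joint_size = NUM_AGENTS * (OBS_SIZE + ACT_SIZE)  # 2 * (24 + 2) = 52
        self.net = nn.Sequential(
            nn.Linear(joint_size, 128), nn.ReLU(),
            nn.Linear(128, 1),  # scalar Q-value for the joint state-action
        )

    def forward(self, all_obs, all_actions):
        return self.net(torch.cat([all_obs, all_actions], dim=-1))


actor, critic = Actor(), Critic()
obs = torch.randn(1, OBS_SIZE)                        # one agent's observation
action = actor(obs)                                   # shape (1, 2)
all_obs = torch.randn(1, NUM_AGENTS * OBS_SIZE)       # both agents' observations
all_actions = torch.randn(1, NUM_AGENTS * ACT_SIZE)   # both agents' actions
q_value = critic(all_obs, all_actions)                # shape (1, 1)
```

At execution time only the actors are needed, which is what lets the trained agents act on purely local observations.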


Getting Started

To get started with this project, first perform the setup steps in the Udacity Deep Reinforcement Learning Nanodegree Program GitHub repository. Namely, you should:

  1. Install Conda and create a Python 3.6 virtual environment
  2. Install OpenAI Gym
  3. Clone the Udacity repo and install the Python requirements included
  4. Download the Tennis Unity files appropriate for your operating system and architecture (Linux, Mac OSX, Win32, Win64)

Once you have performed this setup, you should be ready to run the tennis_maddpg.ipynb Jupyter Notebook in this repo. This notebook contains all the steps needed to define and train MADDPG agents to solve this environment.

