DRLND Project 1: Navigation

[Video of the trained agent](https://www.youtube.com/watch?v=hucCBvA1qT8)

The goal of the agent is to move around the world, gathering as many yellow bananas as possible during each episode while avoiding blue bananas.

The agent receives information about its velocity and about which objects can be found in 36 directions around it, i.e., it is dealing with a state space that has 37 dimensions. It can take any of four discrete actions: move forward, move backward, turn left, or turn right.

The agent receives a reward of +1 for every yellow banana it collects, and -1 for each blue banana. The task is considered solved when the agent achieves an average score of at least +13 over 100 consecutive episodes.
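For reference, a single episode of interaction with the environment looks roughly like this, using the unityagents API from the DRLND course. This is a minimal sketch with a random action policy, not the trained agent:

```python
import numpy as np
from unityagents import UnityEnvironment

# Load the Banana environment (the file name varies by operating system).
env = UnityEnvironment(file_name="Banana.app")
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

# Reset in training mode and read the initial 37-dimensional state.
env_info = env.reset(train_mode=True)[brain_name]
state = env_info.vector_observations[0]
action_size = brain.vector_action_space_size  # 4 discrete actions

score = 0
done = False
while not done:
    action = np.random.randint(action_size)  # random policy, for illustration only
    env_info = env.step(action)[brain_name]
    state = env_info.vector_observations[0]
    reward = env_info.rewards[0]             # +1 yellow, -1 blue, 0 otherwise
    done = env_info.local_done[0]
    score += reward

print(f"Episode score: {score}")
env.close()
```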

Installation

Clone this repository and install the requirements as per the instructions below.

Python Requirements

Follow the instructions in the Udacity Deep Reinforcement Learning repository on how to set up the drlnd environment, and then also install the Click package (used for handling command-line arguments):

pip install click

Alternatively, on some systems it might be enough to install the required packages from the provided requirements.txt file:

pip install -r requirements.txt

Unity environment

Download the Unity environment appropriate for your operating system using the links below and unzip it into the project folder.

Training and Running the Agent

To train the agent, use the train.py program, which takes the path to the Unity environment along with optional arguments for experimenting with various parameters.

(drlnd) $ python train.py --help
Usage: train.py [OPTIONS]

Options:
  --environment PATH     Path to Unity environment  [required]
  --layer1 INTEGER       Number of units in input layer
  --layer2 INTEGER       Number of units in hidden layer
  --eps-decay FLOAT      Epsilon decay factor
  --eps-min FLOAT        Minimum value of epsilon
  --plot-output PATH     Output file for score plot
  --weights-output PATH  File to save weights to after success
  --seed INTEGER         Random seed
  --help                 Show this message and exit.

The default values are:

| Option         | Value             |
|----------------|-------------------|
| layer1         | 32                |
| layer2         | 32                |
| eps-decay      | 0.999             |
| eps-min        | 0.01              |
| plot-output    | score.png         |
| weights-output | weights.pth       |
| seed           | None (do not set) |
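The eps-decay and eps-min options presumably control an epsilon-greedy exploration schedule. Assuming the conventional per-episode multiplicative decay (the initial epsilon of 1.0 is an assumption), the exploration rate would evolve like this:

```python
eps, eps_min, eps_decay = 1.0, 0.01, 0.999   # assumed initial value; repo defaults
for episode in range(2000):
    # ... run one epsilon-greedy episode here ...
    eps = max(eps_min, eps * eps_decay)      # decay per episode, floored at eps-min
```

At these defaults epsilon decays slowly, reaching the 0.01 floor only after about ln(0.01)/ln(0.999) ≈ 4600 episodes, so some exploration persists throughout a typical training run.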

For example:

(drlnd) $ python train.py --environment=Banana.app --seed=20190415 
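Click (installed earlier) is what parses these options. As a hypothetical sketch of the pattern, not this repository's actual source:

```python
import click

@click.command()
@click.option('--environment', type=click.Path(), required=True,
              help='Path to Unity environment')
@click.option('--layer1', type=int, default=32,
              help='Number of units in input layer')
@click.option('--layer2', type=int, default=32,
              help='Number of units in hidden layer')
@click.option('--eps-decay', type=float, default=0.999,
              help='Epsilon decay factor')
@click.option('--eps-min', type=float, default=0.01,
              help='Minimum value of epsilon')
def train(environment, layer1, layer2, eps_decay, eps_min):
    """Train the agent in the given Unity environment."""
    ...

if __name__ == '__main__':
    train()
```

Click generates the --help text shown above from the help strings attached to each option.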

After successfully training the agent, use the run.py program to load the saved weights and run the simulation; it takes parameters similar to those of the training program:

(drlnd) $ python run.py --help
Usage: run.py [OPTIONS]

Options:
  --environment PATH    Path to Unity environment  [required]
  --layer1 INTEGER      Number of units in input layer
  --layer2 INTEGER      Number of units in hidden layer
  --n-episodes INTEGER  Number of episodes to run
  --weights-input PATH  Network weights
  --help                Show this message and exit.

Default values for running the agent are:

| Option        | Value       |
|---------------|-------------|
| layer1        | 32          |
| layer2        | 32          |
| n-episodes    | 3           |
| weights-input | weights.pth |

For example:

(drlnd) $ python run.py --environment=Banana.app --weights-input=weights.pth
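The .pth extension suggests the weights are saved with PyTorch. A sketch of what loading them might look like, assuming a hypothetical QNetwork module whose architecture matches the --layer1/--layer2 sizes used during training (the repository's actual class name and state-dict keys may differ, so this is illustrative only):

```python
import torch
import torch.nn as nn

# Hypothetical two-hidden-layer network matching the default --layer1/--layer2
# sizes; the repository's actual architecture may differ.
class QNetwork(nn.Module):
    def __init__(self, state_size=37, action_size=4, layer1=32, layer2=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_size, layer1),
            nn.ReLU(),
            nn.Linear(layer1, layer2),
            nn.ReLU(),
            nn.Linear(layer2, action_size),
        )

    def forward(self, state):
        return self.net(state)

model = QNetwork()
model.load_state_dict(torch.load('weights.pth', map_location='cpu'))
model.eval()  # inference only: no gradient-dependent layers should update
```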
