DRLND Project 1: Navigation

The goal of the agent is to move around in the world and gather as many yellow bananas as possible during each episode, while avoiding any blue bananas.

The agent receives information about its velocity and what objects can be found in 36 directions around it, i.e., it is dealing with a state space that has 37 dimensions. It can take any of four possible discrete actions: move forward, move backward, turn left, or turn right.
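The eps-decay and eps-min training options suggest that the agent explores with an epsilon-greedy policy over these four actions. The following is a minimal sketch of that idea, not the code in agent.py: the function name select_action and the list q_values (one estimated value per action) are hypothetical.

```python
import random

N_ACTIONS = 4  # forward, backward, turn left, turn right


def select_action(q_values, eps, rng=random):
    """Return a random action with probability eps, else the greedy action.

    q_values is a hypothetical list holding one value estimate per action.
    """
    if rng.random() < eps:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With eps=0.0 the choice is always greedy, and with eps=1.0 it is always random, so decaying epsilon during training shifts the agent gradually from exploration to exploitation.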

The agent receives a reward of +1 for every yellow banana it collects, and -1 for each blue banana. The task is considered solved when the agent achieves an average score of at least +13 over 100 consecutive episodes.
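The success criterion can be checked with a sliding window over the episode scores. This is a sketch under the assumption that the average is taken over the 100 most recent episodes; the function name first_solved_episode is hypothetical and not taken from train.py.

```python
from collections import deque

SOLVED_SCORE = 13.0  # required average score
WINDOW = 100         # number of episodes averaged over


def first_solved_episode(scores, target=SOLVED_SCORE, window=WINDOW):
    """Return the 1-based episode at which the task counts as solved, or None."""
    recent = deque(maxlen=window)  # keeps only the latest `window` scores
    for episode, score in enumerate(scores, start=1):
        recent.append(score)
        if len(recent) == window and sum(recent) / window >= target:
            return episode
    return None
```

Note that the check cannot succeed before episode 100, because the window has to be full before the average is meaningful.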

Installation

Clone this repository and install the requirements as described below.

Python Requirements

Follow the instructions in the Udacity Deep Reinforcement Learning repository on how to set up the drlnd environment, and then also install the Click package (used for handling command line arguments):

pip install click

Alternatively, on some systems it might be enough to install the required packages from the provided requirements.txt file:

pip install -r requirements.txt

Unity environment

Download the Unity environment appropriate for your operating system using the links below and unzip it into the project folder.

Linux: click here
Mac OSX: click here
Windows (32-bit): click here
Windows (64-bit): click here

Training and Running the Agent

To train the agent, use the train.py program, which takes the path to the Unity environment and optional arguments for experimenting with various parameters.

(drlnd) $ python train.py --help
Usage: train.py [OPTIONS]

Options:
  --environment PATH     Path to Unity environment  [required]
  --layer1 INTEGER       Number of units in input layer
  --layer2 INTEGER       Number of units in hidden layer
  --eps-decay FLOAT      Epsilon decay factor
  --eps-min FLOAT        Minimum value of epsilon
  --plot-output PATH     Output file for score plot
  --weights-output PATH  File to save weights to after success
  --seed INTEGER         Random seed
  --help                 Show this message and exit.

The default values are:

Option	Value
layer1	32
layer2	32
eps-decay	0.999
eps-min	0.01
plot-output	score.png
weights-output	weights.pth
seed	None — do not set

For example:

(drlnd) $ python train.py --environment=Banana.app --seed=20190415
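To get a feel for the eps-decay and eps-min options, the sketch below computes an epsilon schedule. It assumes that epsilon starts at 1.0 and is multiplied by the decay factor once per episode until it reaches the minimum. Both the starting value and the per-episode update are assumptions, and epsilon_schedule is a hypothetical helper rather than part of train.py.

```python
def epsilon_schedule(n_episodes, eps_start=1.0, eps_decay=0.999, eps_min=0.01):
    """Return the epsilon used in each episode under multiplicative decay.

    eps_start and the once-per-episode update are assumptions; eps_decay
    and eps_min default to the values listed in the table above.
    """
    eps = eps_start
    schedule = []
    for _ in range(n_episodes):
        schedule.append(eps)
        # Shrink epsilon, but never let it drop below the floor.
        eps = max(eps_min, eps * eps_decay)
    return schedule
```

Under these assumptions, the default factor of 0.999 is slow: epsilon needs roughly 4,600 episodes to reach the floor of 0.01. A smaller value such as --eps-decay=0.99 would reduce exploration much sooner.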

After successfully training the agent, use the run.py program to load the weights and run the simulation. It takes parameters similar to those of the training program:

(drlnd) $ python run.py --help
Usage: run.py [OPTIONS]

Options:
  --environment PATH    Path to Unity environment  [required]
  --layer1 INTEGER      Number of units in input layer
  --layer2 INTEGER      Number of units in hidden layer
  --n-episodes INTEGER  Number of episodes to run
  --weights-input PATH  Network weights
  --help                Show this message and exit.

Default values for running the agent are:

Option	Value
layer1	32
layer2	32
n-episodes	3
weights-input	weights.pth

For example:

(drlnd) $ python run.py --environment=Banana.app --weights-input=weights.pth
