
Project 1: Navigation

Introduction

For this project, you will train an agent to navigate (and collect bananas!) in a large, square world.

(Animated GIF: a trained agent navigating the environment and collecting bananas.)

A reward of +1 is provided for collecting a yellow banana, and a reward of -1 is provided for collecting a blue banana. Thus, the goal of your agent is to collect as many yellow bananas as possible while avoiding blue bananas.

The state space has 37 dimensions and contains the agent's velocity, along with ray-based perception of objects around the agent's forward direction. Given this information, the agent has to learn how to best select actions. Four discrete actions are available, corresponding to:

  • 0 - move forward.
  • 1 - move backward.
  • 2 - turn left.
  • 3 - turn right.

The task is episodic, and in order to solve the environment, your agent must get an average score of +13 over 100 consecutive episodes.
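
For concreteness, the solve check is a 100-episode moving average of episode scores; a minimal sketch (variable names here are illustrative, not taken from this repository):

    from collections import deque
    import numpy as np

    scores_window = deque(maxlen=100)   # holds the most recent 100 episode scores

    # after each training episode:
    #     scores_window.append(score)
    if len(scores_window) == 100 and np.mean(scores_window) >= 13.0:
        print("Environment solved with average score {:.2f}".format(np.mean(scores_window)))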

Getting Started

  1. Download the environment from one of the links below. You need only select the environment that matches your operating system:

    • Linux: click here

    • Mac OSX: click here

    • Windows (32-bit): click here

    • Windows (64-bit): click here

    • This project uses Unity's rich environments to design, train, and evaluate deep reinforcement learning algorithms. **To run this project you'll need to install Unity ML-Agents.** You can read more about ML-Agents and how to install it by perusing the GitHub repository.

      Note: The Unity ML-Agents team frequently releases updated versions of their environment. We are using the v0.4 interface. To avoid any confusion, please use the workspace we provide here or work with v0.4 locally.

    (For Windows users) Check out this link if you need help with determining if your computer is running a 32-bit version or 64-bit version of the Windows operating system.

    (For AWS) If you'd like to train the agent on AWS (and have not enabled a virtual screen), then please use this link to obtain the environment.

  2. Place the file in the DRLND GitHub repository, in the p1_navigation/ folder, and unzip (or decompress) the file.
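
Once the file is unzipped, the environment can be loaded with the v0.4 unityagents Python package. The sketch below is a minimal smoke test that runs a random policy for one episode; the file_name path is an assumption and depends on your operating system and where you unzipped the build:

    import numpy as np
    from unityagents import UnityEnvironment

    # Path is an assumption -- adjust for your OS (this example uses the Linux build).
    env = UnityEnvironment(file_name="p1_navigation/Banana_Linux/Banana.x86_64")
    brain_name = env.brain_names[0]

    env_info = env.reset(train_mode=False)[brain_name]
    state = env_info.vector_observations[0]   # 37-dimensional state vector
    score = 0
    while True:
        action = np.random.randint(4)         # one of the 4 discrete actions
        env_info = env.step(action)[brain_name]
        score += env_info.rewards[0]          # +1 yellow banana, -1 blue banana
        if env_info.local_done[0]:            # episode finished
            break
    print("Score:", score)
    env.close()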

Code Description

  1. Navigation.ipynb - Main module containing 1) loading of the helper modules, 2) loading of the DQN agent helper module, 3) training of the DQN agent, 4) plotting of results, and 5) checkpointing of the model parameters.
  2. model.py - Loads the PyTorch module and derives a custom neural-network model for this problem.
  3. dqn_agent.py - Helper module that 1) loads the helper model.py module, 2) uses the NN model to train a DQN agent, and 3) contains the (optional) prioritized experience replay buffer from which the DQN draws samples.
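
For orientation, here is a minimal sketch of the kind of Q-network model.py defines, assuming fully connected hidden layers whose sizes come from a list such as the dqn_fc_layer hyperparameter described below; the actual architecture in this repository may differ:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class QNetwork(nn.Module):
        """Maps a 37-dimensional state to Q-values for the 4 actions."""
        def __init__(self, state_size=37, action_size=4, hidden=(64, 64)):
            super().__init__()
            sizes = (state_size,) + tuple(hidden)
            # one fully connected layer per consecutive pair of sizes
            self.hidden_layers = nn.ModuleList(
                nn.Linear(n_in, n_out) for n_in, n_out in zip(sizes[:-1], sizes[1:]))
            self.output = nn.Linear(sizes[-1], action_size)

        def forward(self, state):
            x = state
            for layer in self.hidden_layers:
                x = F.relu(layer(x))
            return self.output(x)   # raw Q-values, no final activation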

Important Hyperparameters

  1. Navigation.ipynb - Main module; contains most of the hyperparameters for training the DQN agent:

     a. n_episodes - maximum number of episodes for which training will proceed.
     b. max_t - maximum number of steps per episode during training.
     c. eps_start, eps_end, eps_decay - exploration uses an epsilon-greedy policy. The policy starts at eps_start in episode 1 and decays by a factor of eps_decay each episode until it hits the eps_end floor (see the epsilon-decay sketch after this list).
     d. random_replay - if True, experiences are sampled from the replay buffer uniformly at random; otherwise prioritized sampling is used.
     e. dqn_fc_layer - architecture of the hidden layers of the Q-network, e.g. [64, 64, 32, 256] means there are 4 hidden layers of 64, 64, 32, and 256 units, in that order.
     f. checkpoint_score - if the score exceeds this threshold, the network is checkpointed every 100 episodes. This can be set as a score target for a reasonably good agent.
     g. breakpoint_score - if the score exceeds this threshold, the network is checkpointed and training stops. This can be set as a score target for an exceptionally good agent.

  2. dqn_agent.py -

     BUFFER_SIZE = int(1e5)  # replay buffer size
     BATCH_SIZE = 64         # minibatch size
     GAMMA = 0.99            # discount factor
     TAU = 1e-3              # for soft update of target parameters
     LR = 5e-4               # learning rate
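
TAU drives the soft update of the target network toward the local network, theta_target <- tau * theta_local + (1 - tau) * theta_target. A sketch of that standard update (function name is illustrative):

    def soft_update(local_model, target_model, tau=1e-3):
        """Blend the local network's weights into the target network."""
        for target_param, local_param in zip(target_model.parameters(),
                                             local_model.parameters()):
            target_param.data.copy_(
                tau * local_param.data + (1.0 - tau) * target_param.data)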
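
The epsilon schedule from item 1c above is a simple multiplicative decay with a floor; a minimal sketch, assuming common DRLND default values:

    n_episodes = 2000                                  # assumed default
    eps_start, eps_end, eps_decay = 1.0, 0.01, 0.995   # assumed defaults

    eps = eps_start
    for i_episode in range(1, n_episodes + 1):
        # ... run one episode, selecting actions epsilon-greedily with eps ...
        eps = max(eps_end, eps_decay * eps)   # decay, but never below the floor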
