Udacity's Deep Reinforcement Learning Nanodegree Project - Navigation

muhyun/dqn_banana_navigation

Udacity's Deep Reinforcement Learning Nanodegree

Project 1: Navigation

Introduction

For this project, you will train an agent to navigate (and collect bananas!) in a large, square world.

(Animation: trained agent collecting bananas)

A reward of +1 is provided for collecting a yellow banana, and a reward of -1 is provided for collecting a blue banana. Thus, the goal of your agent is to collect as many yellow bananas as possible while avoiding blue bananas.

The state space has 37 dimensions and contains the agent's velocity, along with ray-based perception of objects around the agent's forward direction. Given this information, the agent has to learn how best to select actions. Four discrete actions are available, corresponding to:

  • 0 - move forward.
  • 1 - move backward.
  • 2 - turn left.
  • 3 - turn right.

The task is episodic, and in order to solve the environment, your agent must get an average score of +13 over 100 consecutive episodes.
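As a minimal sketch of this solve criterion, the rolling 100-episode average can be tracked with a bounded deque (the function name `solved` and the stand-in scores are illustrative, not from the repository):

```python
from collections import deque

def solved(scores_window, target=13.0):
    """The environment counts as solved once the average score over the
    most recent 100 episodes reaches the target (+13 here)."""
    return len(scores_window) == 100 and sum(scores_window) / 100 >= target

# Rolling window: only the latest 100 episode scores are kept.
scores_window = deque(maxlen=100)
for episode_score in [14.0] * 120:   # stand-in scores for illustration
    scores_window.append(episode_score)

print(solved(scores_window))  # True: 100-episode average is 14.0 >= 13
```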

Project Detail

  • PyTorch is used as the deep learning framework for defining the DQN. If you are curious about the theory behind the DQN used in this project, refer to the Report.
  • Unity is used as the RL environment; its details are as below.
    • One agent is configured with the brain name "BananaBrain", and the size of its action space is 4.
    • The state has 37 dimensions and is not pixel-based.
INFO:unityagents:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :
Unity brain name: BananaBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 37
        Number of stacked Vector Observation: 1
        Vector Action space type: discrete
        Vector Action space size (per agent): 4
        Vector Action descriptions: , , , 
  • An episode runs up to 300 steps.
  • The goal of this project is to achieve an average cumulative score of 13 or higher over 100 episodes. In this project, I raised the target average score from 13 to 15 so that the agent also plays well in a single episode.
  • If you are curious about the algorithm details and the performance of the trained agent, refer to the Report (PDF version).
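The agent-environment interaction described above (37-dim vector state, 4 discrete actions, episodes of up to 300 steps) can be sketched with a loop; `DummyEnv` below is a stand-in I invented to mirror the reset/step pattern of the actual Unity environment, and is not part of the project code:

```python
import random

class DummyEnv:
    """Tiny stand-in mirroring the reset/step interaction used in the notebook:
    reset() returns a 37-dim state; step(action) returns state, reward, done."""
    def __init__(self, max_steps=300):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0] * 37                   # 37-dim vector observation

    def step(self, action):
        self.t += 1
        reward = random.choice([-1, 0, 1])  # yellow: +1, blue: -1, else 0
        done = self.t >= self.max_steps     # an episode runs up to 300 steps
        return [0.0] * 37, reward, done

def run_episode(env, policy):
    """One episode: pick an action from the policy, step, accumulate reward."""
    state, score, done = env.reset(), 0, False
    while not done:
        action = policy(state)              # one of the 4 discrete actions
        state, reward, done = env.step(action)
        score += reward
    return score

random_policy = lambda s: random.randrange(4)
score = run_episode(DummyEnv(), random_policy)
```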

Codes in this project

  • Navigation.ipynb
    • A Jupyter notebook where all code execution happens: RL environment creation, training, and testing.
  • dqn_agent.py
    • A module which defines the Agent class and the ReplayBuffer class for experience replay.
    • Agent class:
      • chooses an action using the policy
      • updates the replay buffer and triggers DQN training
      • executes DQN training and gradually updates the target network
    • ReplayBuffer class:
      • provides an interface for adding an experience tuple to the buffer
      • samples a mini-batch from the buffer for mini-batch SGD training
  • model.py
    • The QNetwork class defines the DQN model in PyTorch
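The role of the ReplayBuffer described above can be sketched in pure Python (the repository's version in dqn_agent.py additionally converts sampled batches to PyTorch tensors; the constructor defaults here are illustrative, not the project's hyperparameters):

```python
import random
from collections import deque, namedtuple

Experience = namedtuple("Experience",
                        ["state", "action", "reward", "next_state", "done"])

class ReplayBuffer:
    """Minimal experience replay: store tuples, sample uniform mini-batches."""
    def __init__(self, buffer_size=100_000, batch_size=64):
        self.memory = deque(maxlen=buffer_size)  # oldest experiences drop out
        self.batch_size = batch_size

    def add(self, state, action, reward, next_state, done):
        self.memory.append(Experience(state, action, reward, next_state, done))

    def sample(self):
        # Uniform sampling without replacement breaks the correlation
        # between consecutive experiences before SGD training.
        return random.sample(self.memory, self.batch_size)

    def __len__(self):
        return len(self.memory)
```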

Getting Started - Install Unity environment

In order to run this reinforcement learning example, you need to install the Unity environment as well as Python and PyTorch. The steps below guide you through installing the environment for your OS.

  1. Download the environment from one of the links below. You need only select the environment that matches your operating system:

    (For Windows users) Check out this link if you need help with determining if your computer is running a 32-bit version or 64-bit version of the Windows operating system.

    (For AWS) If you'd like to train the agent on AWS (and have not enabled a virtual screen), then please use this link to obtain the environment.

  2. Place the file in the cloned repository folder, and unzip (or decompress) it.

Instructions

You might first want to set up a virtual environment to avoid conflicts with libraries you already have installed. Refer to https://docs.python-guide.org/dev/virtualenvs/ to learn how to create one.

Then, follow the instructions in Navigation.ipynb to get started with training your own agent! Alternatively, you can load model.pt, which I have already trained.
