Udacity's Deep Reinforcement Learning Nanodegree Project - Navigation

muhyun/dqn_banana_navigation

Udacity's Deep Reinforcement Learning Nanodegree

Project 1: Navigation

Introduction

For this project, you will train an agent to navigate (and collect bananas!) in a large, square world.

(Animation: trained agent collecting bananas)

A reward of +1 is provided for collecting a yellow banana, and a reward of -1 is provided for collecting a blue banana. Thus, the goal of your agent is to collect as many yellow bananas as possible while avoiding blue bananas.

The state space has 37 dimensions and contains the agent's velocity, along with ray-based perception of objects around the agent's forward direction. Given this information, the agent has to learn how best to select actions. Four discrete actions are available, corresponding to:

  • 0 - move forward.
  • 1 - move backward.
  • 2 - turn left.
  • 3 - turn right.

The task is episodic, and in order to solve the environment, your agent must get an average score of +13 over 100 consecutive episodes.
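As a minimal sketch of this solve criterion, the rolling 100-episode average can be tracked with a bounded deque (the function name `solved` and the stand-in scores are illustrative, not from the repository):

```python
from collections import deque

def solved(scores_window, target=13.0):
    """The environment counts as solved once the average score over the
    most recent 100 episodes reaches the target (+13 here)."""
    return len(scores_window) == 100 and sum(scores_window) / 100 >= target

# Rolling window: only the latest 100 episode scores are kept.
scores_window = deque(maxlen=100)
for episode_score in [14.0] * 120:   # stand-in scores for illustration
    scores_window.append(episode_score)

print(solved(scores_window))  # True: 100-episode average is 14.0 >= 13
```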

Project Detail

  • PyTorch is used as the deep learning framework for defining the DQN. If you are curious about the theory behind the DQN used in this project, refer to the Report.
  • Unity is used as the RL environment; its details are as below.
    • One agent is configured with the brain name "BananaBrain", and the size of its action space is 4.
    • The state has 37 dimensions and is not pixel-based.
INFO:unityagents:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :
Unity brain name: BananaBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 37
        Number of stacked Vector Observation: 1
        Vector Action space type: discrete
        Vector Action space size (per agent): 4
        Vector Action descriptions: , , , 
  • An episode runs up to 300 steps.
  • The goal of this project is to achieve an average cumulative score of 13 or higher over 100 episodes. In this project, I raised the target average score from 13 to 15 so that the agent also plays well in a single episode.
  • If you are curious about the algorithm details and the performance of the trained agent, refer to the Report (PDF version).
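The agent-environment interaction described above (37-dim vector state, 4 discrete actions, episodes of up to 300 steps) can be sketched with a loop; `DummyEnv` below is a stand-in I invented to mirror the reset/step pattern of the actual Unity environment, and is not part of the project code:

```python
import random

class DummyEnv:
    """Tiny stand-in mirroring the reset/step interaction used in the notebook:
    reset() returns a 37-dim state; step(action) returns state, reward, done."""
    def __init__(self, max_steps=300):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0] * 37                   # 37-dim vector observation

    def step(self, action):
        self.t += 1
        reward = random.choice([-1, 0, 1])  # yellow: +1, blue: -1, else 0
        done = self.t >= self.max_steps     # an episode runs up to 300 steps
        return [0.0] * 37, reward, done

def run_episode(env, policy):
    """One episode: pick an action from the policy, step, accumulate reward."""
    state, score, done = env.reset(), 0, False
    while not done:
        action = policy(state)              # one of the 4 discrete actions
        state, reward, done = env.step(action)
        score += reward
    return score

random_policy = lambda s: random.randrange(4)
score = run_episode(DummyEnv(), random_policy)
```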

Codes in this project

  • Navigation.ipynb
    • A Jupyter notebook where all code execution happens: RL environment creation, training, and testing.
  • dqn_agent.py
    • A module which defines the Agent class and the ReplayBuffer class for experience replay.
    • Agent class:
      • chooses an action using the policy
      • updates the replay buffer and triggers DQN training
      • executes DQN training and gradually updates the target network
    • ReplayBuffer class:
      • provides an interface for adding an experience tuple to the buffer
      • samples a mini-batch from the buffer for mini-batch SGD training
  • model.py
    • The QNetwork class defines the DQN model in PyTorch
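The role of the ReplayBuffer described above can be sketched in pure Python (the repository's version in dqn_agent.py additionally converts sampled batches to PyTorch tensors; the constructor defaults here are illustrative, not the project's hyperparameters):

```python
import random
from collections import deque, namedtuple

Experience = namedtuple("Experience",
                        ["state", "action", "reward", "next_state", "done"])

class ReplayBuffer:
    """Minimal experience replay: store tuples, sample uniform mini-batches."""
    def __init__(self, buffer_size=100_000, batch_size=64):
        self.memory = deque(maxlen=buffer_size)  # oldest experiences drop out
        self.batch_size = batch_size

    def add(self, state, action, reward, next_state, done):
        self.memory.append(Experience(state, action, reward, next_state, done))

    def sample(self):
        # Uniform sampling without replacement breaks the correlation
        # between consecutive experiences before SGD training.
        return random.sample(self.memory, self.batch_size)

    def __len__(self):
        return len(self.memory)
```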

Getting Started - Install Unity environment

In order to run this reinforcement learning example, you need to install the Unity environment as well as Python and PyTorch. The steps below guide you through installing the environment for your OS.

  1. Download the environment from one of the links below. You need only select the environment that matches your operating system:

    (For Windows users) Check out this link if you need help with determining if your computer is running a 32-bit version or 64-bit version of the Windows operating system.

    (For AWS) If you'd like to train the agent on AWS (and have not enabled a virtual screen), then please use this link to obtain the environment.

  2. Place the file in the cloned repository folder, and unzip (or decompress) it.

Instructions

You might first want to set up a virtual environment to avoid conflicts with libraries you already have installed. Refer to https://docs.python-guide.org/dev/virtualenvs/ to learn how to create one.

Then, follow the instructions in Navigation.ipynb to get started with training your own agent! Alternatively, you can load model.pt, which I have already trained.
