Learning to plan using dynamic VIN

A variation of the Value Iteration Network (Tamar et al., NIPS 2016) [arxiv].

The main idea, building upon the original VIN, is to feed a generated step-wise reward map into the value-iteration loop so the network learns to plan in a dynamic scene. This work can be combined with video-prediction techniques and is still in progress; currently it is trained using the ground-truth state from the simulator.

We use A3C + curriculum learning as the RL training scheme, similar to [Wu et al., ICLR 2017]. Because of how pygame rendering works, we use multiple processes to generate experience from the simulator instead of multiple threads.

Results

(Result animations: map1, map2)

About the code

a3c.py defines the policy/value network with a shared A3C structure, embedded with a VI module, as shown below.

(VIN architecture diagram)
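For orientation, here is a minimal sketch of what a VI module of this kind does: it runs k rounds of value iteration over a step-wise reward map by convolving the stacked reward/value channels and max-pooling over actions. The tensor shapes, kernel size, iteration count, and variable names below are illustrative assumptions, not the repository's actual code.

```python
import tensorflow as tf

def vi_module(reward_map, k=20, num_actions=8):
    # reward_map: [batch, H, W, 1] step-wise reward map produced upstream
    # (shapes, kernel size and iteration count are illustrative assumptions).
    w_q = tf.get_variable('w_q', shape=[3, 3, 2, num_actions])  # conv over [reward, value] channels
    v = tf.zeros_like(reward_map)                                # initial value map V_0
    for _ in range(k):
        rv = tf.concat([reward_map, v], axis=3)                  # stack reward and value channels
        q = tf.nn.conv2d(rv, w_q, strides=[1, 1, 1, 1], padding='SAME')
        v = tf.reduce_max(q, axis=3, keep_dims=True)             # V_{t+1} = max_a Q_t(a)
    return q, v
```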

agent.py defines a single agent and its interaction with the environment during the reinforcement-learning stage, including synchronization with the global model and the training methods; a sketch of the sync step follows.
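The sync step in A3C typically copies the global network's weights into the worker's local copy before each rollout. The helper below is a hypothetical illustration of that pattern, not the repository's actual function; the scope names are assumptions.

```python
import tensorflow as tf

def make_sync_op(local_scope, global_scope):
    # Hypothetical helper: copy the global network's weights into a worker's
    # local copy, as done at the start of each A3C rollout.
    local_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=local_scope)
    global_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=global_scope)
    return tf.group(*[l.assign(g) for l, g in zip(local_vars, global_vars)])
```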

thread.py contains the high-level distributed training setup with tf.train.ClusterSpec, plus the curriculum settings; a sketch of that setup is shown below.
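As a rough sketch of a tf.train.ClusterSpec setup of this kind, each training process is launched with its own job name and task index, so experience generation runs in separate processes rather than threads. The host/port layout, worker count, and task index below are assumptions; the real values come from the repository's own scripts and constants.

```python
import tensorflow as tf

# Hypothetical host/port layout; the real addresses and worker count are
# configured by the repository's own scripts, not by this sketch.
cluster = tf.train.ClusterSpec({
    'ps':     ['localhost:2222'],
    'worker': ['localhost:2223', 'localhost:2224'],
})

# Each training process is started with its own job_name / task_index.
server = tf.train.Server(cluster, job_name='worker', task_index=0)

with tf.device(tf.train.replica_device_setter(
        worker_device='/job:worker/task:0', cluster=cluster)):
    global_step = tf.Variable(0, trainable=False, name='global_step')
    # ... build the shared A3C/VIN graph here and create a session on server.target ...
```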

constants.py defines all the hyper-parameters.
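For orientation, the kind of settings collected there might look like the following; the names and values are purely illustrative assumptions, not the repository's actual defaults.

```python
# Hypothetical names and values for illustration; see constants.py for the real ones.
LEARNING_RATE = 1e-4   # optimizer learning rate
GAMMA = 0.99           # discount factor
VI_ITERATIONS = 20     # value-iteration steps inside the VI module
NUM_WORKERS = 8        # parallel worker processes
```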

How to use

  1. Start training: bash train_scipt.sh
  2. Open tmux for monitoring: tmux a -t a3c (you can monitor each worker by switching tmux windows: ctrl + b, w)
  3. Open TensorBoard in a browser: **.**.**.**:15000
  4. Check the curriculum log: less Curriculum log
  5. Stop training: ctrl + c

Requirements

  • TensorFlow 1.1
  • pygame
  • NumPy

MISC

I wrote this code during an internship at Horizon Robotics. Many thanks to my mentor Penghong Lin, and to Lisen Mu for helpful discussions.

Useful Resources
