Random Network Distillation

This is an implementation of Random Network Distillation (RND) on Montezuma's Revenge using PyTorch.
Paper link: https://arxiv.org/abs/1810.12894
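
The core idea of the paper can be sketched in a few lines: a fixed, randomly initialized target network maps each observation to a feature vector, a predictor is trained to match it, and the prediction error serves as the intrinsic reward. The sketch below is a toy illustration only and is not taken from this repository. It uses plain Python and small linear maps instead of PyTorch networks, and every function name in it is hypothetical.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Random weight matrix: out_dim rows of in_dim weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def forward(weights, obs):
    """Apply a linear map to an observation vector."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


def intrinsic_reward(target, predictor, obs):
    """Mean squared error between predictor features and fixed target features."""
    t = forward(target, obs)
    p = forward(predictor, obs)
    return sum((pi - ti) ** 2 for pi, ti in zip(p, t)) / len(t)


def update_predictor(target, predictor, obs, lr=0.1):
    """One gradient-descent step on the predictor only; the target never changes."""
    t = forward(target, obs)
    p = forward(predictor, obs)
    for i, row in enumerate(predictor):
        grad_scale = 2.0 * (p[i] - t[i]) / len(t)
        for j, x in enumerate(obs):
            row[j] -= lr * grad_scale * x


rng = random.Random(0)
target = make_linear(4, 3, rng)      # fixed random network
predictor = make_linear(4, 3, rng)   # trained to imitate the target
obs = [0.5, 0.1, 0.9, 0.3]           # toy observation (assumption)

reward_novel = intrinsic_reward(target, predictor, obs)
for _ in range(50):
    update_predictor(target, predictor, obs)
reward_familiar = intrinsic_reward(target, predictor, obs)

print(reward_novel, reward_familiar)
```

An observation the predictor has already been trained on yields a much smaller reward than it did when first seen, which is what pushes the agent toward states it has not visited.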

Setup

To run the program, first install the required packages by executing:

$ pip3 install -r requirements.txt

Play

Run the program with the pretrained model to watch the agent play:

$ python3 main.py --play --path models/pretrained_model.pth

Entropy diagram (x-axis: train steps, y-axis: entropy).
This diagram shows how the policy entropy decreases during training. The agent starts with completely random movements and learns a stochastic policy as training progresses.
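
To make the entropy axis concrete, here is a small worked example of the entropy of a discrete action distribution. It is only a sketch: the action count of 18 and the "peaked" distribution are assumptions chosen for illustration, not values read from this repository.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


n_actions = 18  # assumption: size of the discrete action set

# Untrained agent: every action is equally likely, so entropy is at its maximum.
uniform = [1.0 / n_actions] * n_actions

# Trained agent (illustrative): one action is preferred, the rest stay possible.
peaked = [0.9] + [0.1 / (n_actions - 1)] * (n_actions - 1)

print(entropy(uniform), math.log(n_actions))
print(entropy(peaked))
```

The uniform policy attains the upper bound ln(n_actions). Any policy that prefers some actions has lower entropy while remaining stochastic, which matches the downward trend in the diagram.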

The pretrained model models/pretrained_model.pth was obtained by training with the following settings:

variable	value
environment type	"MR"
number of train steps	11400
normalization steps parameter	1000
number of environments	64
number of epochs	4
agent steps (rollout)	128
number of mini batches	2
learning rate	0.0001
discount factor	0.999
intrinsic discount factor	0.99
lambda (related to the generalized advantage estimation algorithm)	0.95
clip (related to the PPO algorithm)	0.1
value loss coefficient	0.5
entropy coefficient	0.001
the predictor's update proportion	0.25
intrinsic advantages coefficient	1
extrinsic advantages coefficient	2
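
The following sketch works through some arithmetic implied by these settings. It is not code from this repository. It assumes that each rollout is split evenly into the listed number of mini batches, that the two advantage streams are combined as a weighted sum as in the RND paper, and that "clip" is the usual PPO ratio-clipping range. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    # Values copied from the settings table; field names are hypothetical.
    num_env: int = 64
    rollout_steps: int = 128
    num_epoch: int = 4
    mini_batch: int = 2
    clip: float = 0.1
    int_adv_coef: float = 1.0
    ext_adv_coef: float = 2.0


def rollout_sizes(s):
    """Samples per rollout, samples per mini batch, gradient updates per train step."""
    samples = s.num_env * s.rollout_steps
    per_batch = samples // s.mini_batch      # assumes an even split
    updates = s.num_epoch * s.mini_batch
    return samples, per_batch, updates


def combined_advantage(s, ext_adv, int_adv):
    """Weighted sum of extrinsic and intrinsic advantages."""
    return s.ext_adv_coef * ext_adv + s.int_adv_coef * int_adv


def clipped_surrogate(s, ratio, advantage):
    """PPO clipped objective for one sample (to be maximized)."""
    clipped_ratio = min(max(ratio, 1.0 - s.clip), 1.0 + s.clip)
    return min(ratio * advantage, clipped_ratio * advantage)


s = Settings()
print(rollout_sizes(s))                  # (8192, 4096, 8)
adv = combined_advantage(s, ext_adv=0.5, int_adv=0.2)
print(adv)
print(clipped_surrogate(s, ratio=1.5, advantage=adv))
```

Under these assumptions, one train step collects 8192 transitions and performs 8 gradient updates on mini batches of 4096. A probability ratio of 1.5 is clipped to 1.1 before it multiplies a positive advantage.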

Train

You can train a model from scratch using the following command. Note that if you don't specify the variables, they take the default values listed in the table above. The save_int variable sets the interval at which a model checkpoint is saved.

Some useful diagrams are stored in TensorBoard format during training.

$ python3 main.py --train --num_env 64 --train_steps 12000 --predictor_update_p 0.25 --num_pre_norm_steps 10 --game_steps 128 --num_epoch 4 --mini_batch 2 --save_int 100

Train from a checkpoint:

$ python3 main.py --train --path logs/desired_checkpoint --num_env 64 --train_steps 12000 --predictor_update_p 0.25 --num_pre_norm_steps 10 --game_steps 128 --num_epoch 4 --mini_batch 2 --save_int 100
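
As a rough illustration of how a checkpoint interval such as save_int interacts with train_steps, the sketch below lists the steps at which a checkpoint would be written. It assumes a 1-based step counter and a save on every multiple of the interval. The actual behavior of train.py may differ, and the function name is hypothetical.

```python
def checkpoint_steps(train_steps, save_int):
    """Steps at which a checkpoint would be saved (assumed: every multiple of save_int)."""
    return [step for step in range(1, train_steps + 1) if step % save_int == 0]


steps = checkpoint_steps(train_steps=12000, save_int=100)
print(len(steps), steps[0], steps[-1])
```

With the values used in the commands above, this scheme would produce 120 checkpoints, the first at step 100 and the last at step 12000.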

