Generalization and Regularization in DQN

Code and LaTeX source for the paper Generalization and Regularization in DQN .

Reproducing key results

The code used in all experiments can be located at JesseFarebro/dqn-ale. You can checkout the code by cloning this repository with recursive submodules and follow the instructions at the provided repository.

To run the baseline experiments simply run:

python3 main.py --rom {freeway.bin,hero.bin,space_invaders.bin,breakout.bin} \
                --random_seed {0,1,2,3,4,5,6,7,8,9}

To run baseline experiments to obtain a regularized representation run:

python3 main.py --rom {freeway.bin,hero.bin,space_invaders.bin,breakout.bin} \
                --random_seed {0,1,2,3,4,5,6,7,8,9} \
                --use_dropout \
                --use_l2

To evaluate learned policies from checkpoints run:

python3 main.py --rom {freeway.bin,hero.bin,space_invaders.bin,breakout.bin} \
                --random_seed {0,1,2,3,4,5,6,7,8,9} \
                --evaluate \
                --max_episode_count 100 \
                --restore_dir {logdir,logdir_with_reg}

To fine tune the entire network (representation + state value layers) on a new mode:

python3 main.py --rom {freeway.bin,hero.bin,space_invaders.bin,breakout.bin} \
                --random_seed {0,1,2,3,4,5,6,7,8,9} \
                --restore_dir {logdir,logdir_with_reg} \
                --mode M --difficulty D

To re-learn co-adaptations you could run:

python3 main.py --rom {freeway.bin,hero.bin,space_invaders.bin,breakout.bin} \
                --random_seed {0,1,2,3,4,5,6,7,8,9} \
                --restore_dir {logdir, logdir_with_reg} \
                --mode M --difficulty D \
                --load_scope q/conv/

To freeze certain layers while fine-tuning others run:

python3 main.py --rom {freeway.bin,hero.bin,space_invaders.bin,breakout.bin} \
                --random_seed {0,1,2,3,4,5,6,7,8,9} \
                --restore_dir {logdir, logdir_with_reg} \
                --mode M --difficulty D \
                --optimize_scope q/conv/

Reference

@article{Farebrother2018a,
  author    = {Jesse Farebrother and
               Marlos C. Machado and
               Michael Bowling},
  title     = {Generalization and Regularization in {DQN}},
  journal   = {CoRR},
  volume    = {abs/1810.00123},
  year      = {2018},
}

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
code @ 16705e1		code @ 16705e1
paper		paper
poster		poster
.gitignore		.gitignore
.gitmodules		.gitmodules
LICENSE.md		LICENSE.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Generalization and Regularization in DQN

Reproducing key results

To run the baseline experiments simply run:

To run baseline experiments to obtain a regularized representation run:

To evaluate learned policies from checkpoints run:

To fine tune the entire network (representation + state value layers) on a new mode:

To re-learn co-adaptations you could run:

To freeze certain layers while fine-tuning others run:

Reference

About

Releases

Packages

Languages

License

JesseFarebro/dqn-generalization

Folders and files

Latest commit

History

Repository files navigation

Generalization and Regularization in DQN

Reproducing key results

To run the baseline experiments simply run:

To run baseline experiments to obtain a regularized representation run:

To evaluate learned policies from checkpoints run:

To fine tune the entire network (representation + state value layers) on a new mode:

To re-learn co-adaptations you could run:

To freeze certain layers while fine-tuning others run:

Reference

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages