[ECML2019] Stochastic Actor Critic Methods

Stochastic Activation Actor Critic Methods

This repository demonstrates our proposed stochastic activation actor critic methods, published at ECML-PKDD 2019. We use Qbert, BeamRider, and Seaquest to showcase SA3C, fully stochastic A3C (FSA3C), and hierarchical prior SA3C (HPA3C), respectively. We also provide baseline A3C and NoisyNet training code for comparison.

If you find our work and code useful, please cite our paper [pdf][appendix]:

  title={Stochastic Activation Actor Critic Methods},
  author={Shang, Wenling and van Hoof, Herk and Welling, Max},


conda create -n py36 python=3.6 anaconda
source activate py36
conda install pytorch torchvision cudatoolkit=9.0 -c pytorch
conda install -c menpo opencv
pip install gym
pip install "gym[atari]"
pip3 install logger
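
After running the installation steps above, it can help to confirm that the packages are importable from the active environment. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the import names (`torch`, `torchvision`, `cv2`, `gym`) are the usual ones for these packages. The `logger` package is left out because its import name is not confirmed here.

```python
import importlib.util

# Import names assumed for the packages installed above:
# pytorch -> torch, torchvision -> torchvision, opencv -> cv2, gym -> gym.
REQUIRED_MODULES = ("torch", "torchvision", "cv2", "gym")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names in `names` that cannot be located."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Run it inside the `py36` environment. An empty result only means the modules can be found, not that the installed versions match those used for the paper.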

Baseline A3C

for Qbert

python --model_type baseline --save_best --game Qbert-v4 

for BeamRider

python --model_type baseline --save_best --game BeamRider-v4

for Seaquest

python --model_type baseline --save_best --game Seaquest-v4

NoisyNet A3C

for Qbert

python --model_type nn --save_best --game Qbert-v4

for BeamRider

python --model_type nn --save_best --game BeamRider-v4 

for Seaquest

python --model_type nn --save_best --game Seaquest-v4 

Stochastic Activation A3C

SA3C for Qbert

python --model_type sa3c --save_best --game Qbert-v4 --sig 4

FSA3C for BeamRider

python --model_type fsa3c --save_best --game BeamRider-v4 --sig 4

HPA3C for Seaquest

python --model_type hpa3c --save_best --game Seaquest-v4 --crelu
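
The commands above differ only in the model type, the game, and a few extra flags, so they can be generated programmatically. The sketch below is an illustration, not part of the repository: `EXTRA_FLAGS`, `build_args`, and `build_command` are hypothetical names, and tying each extra flag to the model type (rather than to the game) is an assumption drawn only from the three stochastic examples shown. The path to the training script is not given here, so the caller must supply it.

```python
import sys

# Extra flags copied from the example commands above. Mapping them by
# model type is an assumption; other model/game pairings are untested here.
EXTRA_FLAGS = {
    "baseline": [],
    "nn": [],
    "sa3c": ["--sig", "4"],
    "fsa3c": ["--sig", "4"],
    "hpa3c": ["--crelu"],
}


def build_args(model_type, game):
    """Return the command-line flags for one training run."""
    if model_type not in EXTRA_FLAGS:
        raise ValueError("unknown model type: " + model_type)
    base = ["--model_type", model_type, "--save_best", "--game", game]
    return base + EXTRA_FLAGS[model_type]


def build_command(script_path, model_type, game):
    """Return a full argument list, given the path to the training script."""
    # script_path must be provided by the caller; it is not specified here.
    return [sys.executable, script_path] + build_args(model_type, game)
```

For example, `build_args("sa3c", "Qbert-v4")` yields the same flags as the SA3C command for Qbert, and the list returned by `build_command` can be passed to `subprocess.run` to start a run.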


We are grateful to the development teams of PyTorch, Gym, and ALE. Our implementation also takes inspiration from the following excellent repositories:
