Viper

Read the accompanying blog post here (tbd).

Verifiability via Iterative Policy ExtRaction (2019) paper

In this paper the authors distill a deep reinforcement learning policy, such as a Deep Q-Network (DQN), into a decision tree policy that can then be automatically checked for correctness, robustness, and stability.
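As a rough illustration of the distillation idea, the sketch below shows the state re-weighting step commonly described for VIPER: states visited during rollouts are resampled with probability proportional to V(s) - min_a Q(s, a), so that states where a wrong action is costly dominate the tree's training set. It assumes a greedy Q-value oracle, so V(s) is taken as max_a Q(s, a). The function names `viper_weights` and `resample` are hypothetical and are not taken from this repository.

```python
import random


def viper_weights(q_values):
    """Weight each state by V(s) - min_a Q(s, a), assuming V(s) = max_a Q(s, a)."""
    return [max(q) - min(q) for q in q_values]


def resample(states, actions, q_values, n_samples, rng=None):
    """Draw a weighted training set for the decision tree (hypothetical helper).

    states:   observations collected from rollouts
    actions:  the oracle's action for each state
    q_values: the oracle's Q-values for each state, one list per state
    """
    rng = rng or random.Random(0)
    weights = viper_weights(q_values)
    if sum(weights) == 0:
        # Every action is equally good everywhere: fall back to uniform sampling.
        weights = [1.0] * len(states)
    idx = rng.choices(range(len(states)), weights=weights, k=n_samples)
    return [states[i] for i in idx], [actions[i] for i in idx]
```

The resampled states and actions would then be used to fit a decision tree classifier. In the full algorithm this is repeated over several iterations, with new rollouts from the current tree relabeled by the oracle.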

This repository implements and tests the viper algorithm on the following environments:

CartPole
Atari Pong
ToyPong (tbd)

Usage

The entire project can be run using the main.py script, which accepts more options than the ones shown below. For the full list of options, run python main.py --help.
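For orientation, a command line of this shape could be wired up with argparse subcommands roughly as follows. This is a sketch under assumptions, not the actual contents of main.py: only the subcommand and flag names come from the commands in this README, while the defaults and the `build_parser` helper are hypothetical.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in this README."""
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    oracle = sub.add_parser("train-oracle")
    viper = sub.add_parser("train-viper")
    for p in (oracle, viper):
        p.add_argument("--env-name", required=True)
        p.add_argument("--n-env", type=int, default=1)  # default is an assumption
        # int() accepts underscore separators, so "15_000_000" parses fine.
        p.add_argument("--total-timesteps", type=int, default=None)

    oracle.add_argument("--resume", action="store_true")
    viper.add_argument("--max-leaves", type=int, default=None)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Writing large step counts as 15_000_000 works because Python's int() conversion accepts underscores between digits.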

Training the oracle

The commands below reflect configurations that achieved a perfect reward averaged over 50 rollouts, except where noted.

Atari Pong (TODO: only achieves 20.12 +/- 1.66 out of 21):

python main.py train-oracle --env-name PongNoFrameskip-v4 --n-env 64 --total-timesteps 15_000_000

Toy Pong:

python main.py train-oracle --env-name ToyPong-v0 --n-env 1 --total-timesteps 1_000_000

Cart pole:

python main.py train-oracle --env-name CartPole-v1 --n-env 8 --total-timesteps 100_000

You can always resume training a stored model by adding the --resume flag to the same command.
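The "reward averaged over 50 rollouts" figures quoted above (for example 20.12 +/- 1.66 for Pong) can be computed along the following lines. This is a minimal sketch: `run_episode` is a hypothetical callable that plays one full episode and returns its total reward, and the +/- term is assumed here to be a standard deviation, which the README does not state.

```python
import statistics


def evaluate(run_episode, n_rollouts=50):
    """Return (mean, std) of the episode return over n_rollouts episodes.

    run_episode is a hypothetical zero-argument callable that runs one
    episode with the trained policy and returns its total reward.
    """
    returns = [run_episode() for _ in range(n_rollouts)]
    return statistics.mean(returns), statistics.pstdev(returns)
```

A call such as `mean, std = evaluate(my_rollout_fn)` would then give the two numbers to report as `mean +/- std`.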

Running viper

Once the oracle policies are trained you can run viper on the same environment:

Cart pole:

python main.py train-viper --env-name CartPole-v1 --n-env 1

Toy Pong:

python main.py train-viper --env-name ToyPong-v0 --n-env 4 --max-leaves 61 --total-timesteps 1_000_000
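To make the result of this step more concrete, the sketch below shows what a decision tree policy looks like as a data structure, and how its leaves can be counted. The --max-leaves option presumably bounds that count. The tree here is hand-written for illustration and is not an extracted policy. It uses the standard CartPole observation layout (cart position, cart velocity, pole angle, pole angular velocity) and actions (0 = push left, 1 = push right). The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    action: int


@dataclass
class Split:
    feature: int      # index into the observation vector
    threshold: float
    left: "Node"      # followed when obs[feature] <= threshold
    right: "Node"     # followed otherwise


Node = Union[Leaf, Split]


def predict(node, obs):
    """Walk from the root to a leaf and return that leaf's action."""
    while isinstance(node, Split):
        node = node.left if obs[node.feature] <= node.threshold else node.right
    return node.action


def count_leaves(node):
    """Number of leaves, i.e. the number of distinct decision regions."""
    if isinstance(node, Leaf):
        return 1
    return count_leaves(node.left) + count_leaves(node.right)


# Hand-written example: push toward the side the pole is leaning.
tree = Split(feature=2, threshold=0.0, left=Leaf(0), right=Leaf(1))
```

Because every leaf corresponds to a box of observations bounded by simple threshold tests, properties of such a policy can be checked region by region, which is what makes the tree representation amenable to verification.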
