Transfer of Value Functions via Variational Methods

This repository contains the code used for the empirical evaluation of the Variational Transfer algorithms presented in our paper "Transfer of Value Functions via Variational Methods" [1] (NIPS 2018), together with instructions on how to reproduce our results.

Abstract

We consider the problem of transferring value functions in reinforcement learning. We propose an approach that uses the given source tasks to learn a prior distribution over optimal value functions and provide an efficient variational approximation of the corresponding posterior in a new target task. We show our approach to be general, in the sense that it can be combined with complex parametric function approximators and distribution models, while providing two practical algorithms based on Gaussians and Gaussian mixtures. We theoretically analyze them by deriving a finite-sample analysis and provide a comprehensive empirical evaluation in four different domains.
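To make the idea above concrete, the snippet below is a minimal sketch of the Gaussian variant (not the repository's implementation): a Gaussian prior is fitted to the optimal Q-function weights of the source tasks, and the target-task objective combines the expected Bellman error under the variational posterior with a KL term to that prior. The function names (fit_gaussian_prior, gaussian_kl, variational_objective), the weight-space parameterization, and the exact weighting of the two terms are illustrative assumptions.

```
import numpy as np

def fit_gaussian_prior(source_weights, reg=1e-6):
    """Fit a Gaussian prior N(mu_p, Sigma_p) over the weights of the optimal
    Q-functions learned on the source tasks (one weight vector per task)."""
    W = np.asarray(source_weights)                     # shape (n_tasks, d)
    mu_p = W.mean(axis=0)
    Sigma_p = np.cov(W, rowvar=False) + reg * np.eye(W.shape[1])
    return mu_p, Sigma_p

def gaussian_kl(mu_q, Sigma_q, mu_p, Sigma_p):
    """Closed-form KL( N(mu_q, Sigma_q) || N(mu_p, Sigma_p) )."""
    d = mu_q.shape[0]
    Sigma_p_inv = np.linalg.inv(Sigma_p)
    diff = mu_p - mu_q
    _, logdet_p = np.linalg.slogdet(Sigma_p)
    _, logdet_q = np.linalg.slogdet(Sigma_q)
    return 0.5 * (np.trace(Sigma_p_inv @ Sigma_q)
                  + diff @ Sigma_p_inv @ diff
                  - d + logdet_p - logdet_q)

def variational_objective(mu_q, Sigma_q, mu_p, Sigma_p, bellman_error, n_samples=10):
    """Monte-Carlo estimate of a transfer objective on the target task:
    expected Bellman error under the posterior plus KL to the prior.
    `bellman_error(w)` is any user-supplied function returning the (squared)
    Bellman error of the Q-function with weights w on the target-task data."""
    ws = np.random.multivariate_normal(mu_q, Sigma_q, size=n_samples)
    expected_error = np.mean([bellman_error(w) for w in ws])
    return expected_error + gaussian_kl(mu_q, Sigma_q, mu_p, Sigma_p)
```

The mixture-of-Gaussians variant follows the same structure, with the prior and posterior replaced by Gaussian mixtures.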

Requirements

Python 3.5
NumPy 1.14.2
PyTorch 0.4.1
joblib
matplotlib
OpenAI Gym

Repository Structure

  • algorithms: contains the implementations of the proposed algorithms, together with a PyTorch implementation of DDQN [2].
  • approximators: contains implementations of different function approximators, such as a linear regressor and a feedforward neural network.
  • envs: contains the implementations of the evaluation environments (rooms, cartpole, mountain car, and maze).
  • experiments: contains the main scripts, organized by environment, to run the experiments presented in the empirical evaluation of our paper.
  • features: contains implementations of some feature functions.
  • misc: contains various auxiliary functions and data structures.
  • operators: contains implementations of different Bellman operators.

How to reproduce our results

The folder experiments/ contains one sub-folder for each experimental environment. To reproduce our results, it is enough to run the scripts below with Python 3.5 (example invocations are shown after the list). By default, each script loads sources.pkl, which stores the data from the source tasks used for transfer, and produces a pkl file with the data needed to plot the performance.

  • run_ft.py runs the fine-tuning experiment.
  • run_gvt.py runs the Gaussian Variational Transfer (GVT) experiment.
  • run_mgvt.py runs the Mixture of Gaussian Variational Transfer (MGVT) experiment. By default, it uses 1 component for the posterior representation; to run with 3 components, add the command-line argument --post_components 3.
  • run_nt.py runs a no-transfer algorithm. By default, it runs our algorithm based on the minimization of the mellow Bellman error. To run DDQN instead, add the command-line argument --dqn 1.
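For example, assuming the scripts are launched from the rooms experiment folder (the same pattern applies to the other environments, whose default arguments and output file names may differ):

```
cd experiments/rooms
python3 run_nt.py                        # no-transfer baseline (mellow Bellman error)
python3 run_nt.py --dqn 1                # no-transfer DDQN baseline
python3 run_ft.py                        # fine-tuning
python3 run_gvt.py                       # Gaussian Variational Transfer (GVT)
python3 run_mgvt.py --post_components 3  # MGVT with a 3-component posterior
```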

In particular, for the rooms environment, we provide further scripts corresponding to the additional experiments (example invocations are shown after the list). Here we use * as a placeholder for the algorithm name (gvt or mgvt).

  • run_*_likelihood runs GVT or MGVT (depending on the script) using source tasks whose door positions are governed by a Gaussian distribution. By default, it runs the 2-rooms environment.
  • run_*_sequential runs GVT or MGVT (depending on the script), evaluating the performance for different numbers of source tasks given to the algorithm.
  • To run the experiment with distribution shift (i.e., with the task distribution of the sources restricted), it is enough to run python3 run_*.py --source_file path/to/experiments/rooms/sources_gen, where path/to/experiments/rooms/sources_gen is the absolute or relative path (from the working directory) to the file sources_gen.pkl containing the sources sampled from the restricted distribution of tasks.
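For example, assuming the additional scripts carry a .py extension and are launched from experiments/rooms/ (substitute mgvt for gvt as needed):

```
python3 run_gvt_likelihood.py     # Gaussian-distributed door positions
python3 run_gvt_sequential.py     # increasing number of source tasks
python3 run_gvt.py --source_file path/to/experiments/rooms/sources_gen   # distribution shift
```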

References

[1] Andrea Tirinzoni, Rafael Rodriguez Sanchez, and Marcello Restelli. Transfer of Value Functions via Variational Methods. NeurIPS 2018.

[2] Hado van Hasselt, Arthur Guez, and David Silver. Deep Reinforcement Learning with Double Q-learning. AAAI 2016.
