Final project for Bayesian theory and computation @ PKU (2021 Spring).
This project trains an RL agent to drive in the CarRacing-v0
environment using features extracted by a Q-Consistency regularized VAE (QC-VAE).
The overall pipeline is illustrated in the figure below.
You can find our thesis here.
This project depends on the following python packages:
- pytorch
- cudatoolkit=10.2
- tensorflow=1.15.0
- tqdm
- Pillow
- gym
- pybox2d
- pyvirtualdisplay
as well as the following Linux libraries:
- xvfb
For your convenience, you can run the setup script to configure the runtime environment:
sudo bash setup.sh
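If you work on a headless server, CarRacing-v0 still needs a display to render into; that is what the xvfb dependency above is for. The snippet below is a minimal smoke test (not part of this repo) showing how pyvirtualdisplay can wrap xvfb around a gym rollout; it assumes the pre-0.26 gym API that matches the package versions listed above:

```python
import gym
from pyvirtualdisplay import Display

# Start an xvfb-backed virtual display so CarRacing-v0 can render off-screen.
display = Display(visible=False, size=(1400, 900))
display.start()

env = gym.make("CarRacing-v0")
obs = env.reset()                                  # 96x96x3 RGB frame
for _ in range(10):
    obs, reward, done, info = env.step(env.action_space.sample())
    if done:
        obs = env.reset()
env.close()
display.stop()
```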
Sub-modules:
- OpenAI-GYM-CarRacing-DQN: contains a pre-trained expert DQN agent whose actions are used when training our Q-network
- TD3: forked from this repo and modified by Zihan Mao; contains the DDPG (and CNN-DDPG) components
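These pieces fit together roughly as follows (see train_agent.py below): once the QC-VAE is trained, it is frozen and its latent code stands in for raw pixels as the DDPG agent's state. The sketch below illustrates that feature-extraction step only; the `QCVAE`-style `encode()` interface, the use of the latent mean, and the `select_action` call are assumptions for illustration, not necessarily the exact names used in the repo:

```python
import torch

@torch.no_grad()
def extract_features(vae, frame):
    """Encode a preprocessed frame tensor of shape (1, C, H, W) into a latent code.

    Assumes the QC-VAE exposes encode(x) -> (mu, logvar); the latent mean is
    used as a deterministic, low-dimensional state for the DDPG agent.
    """
    vae.eval()
    mu, _ = vae.encode(frame)
    return mu.squeeze(0)

# Illustrative control loop: the agent never sees pixels, only latent features.
# state = extract_features(vae, preprocess(obs))
# action = agent.select_action(state.cpu().numpy())
# obs, reward, done, info = env.step(action)
```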
In this repo:
VAE.py
: implement a VAE with convolutional layers
train_critic.py
: train the Q-network using actions performed by the expert DQN agent
train_vae.py
: train the QC-VAE with the usual VAE loss plus our Q-consistency loss computed by the pre-trained Q-network (see the sketch after this list)
train_agent.py
: train the DDPG agent on features extracted by the QC-VAE
baseline.py
: train the baseline agent (DDPG with convolutional layers) directly on the raw image inputs
util.py
: helper functions for processing image inputs
model/
: model folder
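For orientation, the objective optimized in train_vae.py combines the usual VAE terms with the Q-consistency regularizer. The sketch below is a minimal, hypothetical PyTorch rendering of that idea: the exact form of the consistency term (here, matching the frozen Q-network's outputs on the reconstruction to its outputs on the original frame), the weights `beta`/`lam`, and the `vae.encode`/`vae.decode` interface are assumptions, not the repo's exact code.

```python
import torch
import torch.nn.functional as F

def qcvae_loss(vae, q_net, obs, beta=1.0, lam=1.0):
    """Hypothetical QC-VAE objective: standard VAE terms plus a Q-consistency
    penalty computed with a frozen, pre-trained Q-network.

    Assumes `vae` exposes encode(x) -> (mu, logvar) and decode(z), and that
    `q_net` maps image observations to per-action Q-values.
    """
    mu, logvar = vae.encode(obs)
    std = torch.exp(0.5 * logvar)
    z = mu + std * torch.randn_like(std)          # reparameterization trick
    recon = vae.decode(z)

    # Usual VAE loss: reconstruction error + KL divergence to the unit Gaussian prior.
    recon_loss = F.mse_loss(recon, obs, reduction="mean")
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())

    # Q-consistency term (one plausible form): the frozen Q-network should rate
    # the reconstruction the same way it rates the original frame, so the latent
    # code is pushed to retain decision-relevant information.
    with torch.no_grad():
        q_target = q_net(obs)
    q_consistency = F.mse_loss(q_net(recon), q_target)

    return recon_loss + beta * kl + lam * q_consistency
```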
Code Author: Zihan Mao -- Personal Email, Educational Email
Project Link: https://github.com/Mzhhh/VAEFRL
Thanks to the related repositories on GitHub and various questions on Stack Overflow, without which I couldn't have written a single line of bug-free code.