Predictive Coding for Locally-Linear Control

This is a pytorch implementation of the paper "Predictive Coding for Locally-Linear Control". We propose PC3 - an information-theoretic representation learning framework for optimal control from high-dimensional observations. Experiments show that our proposed method outperforms the existing reconstruction-based approaches significantly.

Details of the model architecture and experimental results can be found in our following paper:

@InProceedings{pmlr-v119-shu20a,
  title = 	 {Predictive Coding for Locally-Linear Control},
  author =       {Shu, Rui and Nguyen, Tung and Chow, Yinlam and Pham, Tuan and Than, Khoat and Ghavamzadeh, Mohammad and Ermon, Stefano and Bui, Hung},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  year = 	 {2020},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  publisher =    {PMLR},
}

Please CITE our paper whenever this repository is used to help produce published results or incorporated into other software.

Installing

First, clone the repository:

https://github.com/VinAIResearch/PC3-pytorch.git

Then install the dependencies as listed in pc3.yml and activate the environment:

conda env create -f pc3.yml
conda activate pc3

Training

The code currently supports training for planar, pendulum, cartpole and 3-link environment. Run train_pc3.py with your own settings. For example:

python train_pc3.py \
    --env=planar \
    --armotized=False \
    --log_dir=planar_1 \
    --seed=1 \
    --data_size=5000 \
    --noise=0 \
    --batch_size=256 \
    --latent_noise=0.1 \
    --lam_nce=1.0 \
    --lam_c=1.0 \
    --lam_cur=7.0 \
    --norm_coeff=0.1 \
    --lr=0.0005 \
    --decay=0.001 \
    --num_iter=2000 \
    --iter_save=1000 \
    --save_map=False

First, data is sampled according to the given data size and noise level, then the PC3 model will be trained using the specified settings.

If the argument save_map is set to True, the latent map will be drawn every 10 epochs (for planar only), then the gif file will be saved at the same directory as the trained model.

You can also visualize the training process by running tensorboard --logdir={path_to_log_dir}, where path_to_log_dir has the form logs/{env}/{log_dir}. The trained model will be saved at result/{env}/{log_dir}.

Latent maps visualization

You can visualize the latent map for planar and pendulum, to do that simply run:

python latent_map_planar.py --log_path={log_to_trained_model} --epoch={epoch}
or 
python latent_map_pendulum.py --log_path={log_to_trained_model} --epoch={epoch}

Data visualization

You can generate training images for visualization purpose by simply running:

cd data
python sample_{env_name}_data.py --sample_size={sample_size} --noise={noise}

Currently the code supports simulating 4 environments: planar, pendulum, cartpole and 3-link.

The raw data (images) is saved in data/{env_name}/raw_{noise}_noise

Running iLQR on latent space

The configuration file for running iLQR for each task is in ilqr_config folder, you can modify with your own settings. Run:

python ilqr.py --task={task} --setting_path={setting_path} --noise={noise} --epoch={epoch}

where task is in {plane, swing, balance, cartpole, 3-link}, setting_path is the path to the model of your 10 trained models (e.g., result/pendulum/).

The code will run iLQR for all trained models for that specific task and compute some statistics. The result is saved in iLQR_result.

Result

Quantitative result

We compare PC3 with two state-of-the-art LCE baselines: PCC (Levine et al., 2020) and SOLAR (Zhang et al., 2019). Specifically, we report the percentage of time spent in the goal region in the underlying system.

Below are videos showing learned policy in 5 tasks.

Qualitative result

We also compare the quality of learned latent maps between PCC and PC3 in planar and pendulum.

Name		Name	Last commit message	Last commit date
Latest commit History 157 Commits
data		data
ilqr_config		ilqr_config
mdp		mdp
sample_results		sample_results
.gitignore		.gitignore
.gitmodules		.gitmodules
LICENSE		LICENSE
README.md		README.md
control_stats.py		control_stats.py
datasets.py		datasets.py
ilqr.py		ilqr.py
ilqr_utils.py		ilqr_utils.py
latent_map_pendulum.py		latent_map_pendulum.py
latent_map_planar.py		latent_map_planar.py
losses.py		losses.py
map_scale_stat.py		map_scale_stat.py
networks.py		networks.py
pc3.yml		pc3.yml
pc3_model.py		pc3_model.py
retrain_dynamics.py		retrain_dynamics.py
train_pc3.py		train_pc3.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Predictive Coding for Locally-Linear Control

Installing

Training

Latent maps visualization

Data visualization

Running iLQR on latent space

Result

Quantitative result

Qualitative result

About

Releases

Packages

Contributors 3

Languages

License

VinAIResearch/PC3-pytorch

Folders and files

Latest commit

History

Repository files navigation

Predictive Coding for Locally-Linear Control

Installing

Training

Latent maps visualization

Data visualization

Running iLQR on latent space

Result

Quantitative result

Qualitative result

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages