bo-pang/O-LSPI

O-LSPI

A version of optimistic least-squares policy iteration (O-LSPI) for the classic discrete-time linear quadratic regulation (LQR) problem, published in the paper:

Bo Pang and Zhong-Ping Jiang. "Robust reinforcement learning: A case study in linear quadratic regulation." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 35. No. 10. 2021.
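For orientation, the textbook discrete-time LQR problem and the model-based policy iteration that LSPI-style methods approximate from data can be written as below. This is the standard noise-free formulation, not necessarily the exact setup of the paper; the symbols A, B, Q, R, K and P are the usual ones and are assumptions here, not names taken from the code.

```latex
% Dynamics and cost (standard noise-free LQR; the paper's setting may add noise terms)
x_{t+1} = A x_t + B u_t, \qquad
J = \sum_{t=0}^{\infty} \left( x_t^\top Q x_t + u_t^\top R u_t \right),
\qquad Q \succeq 0,\; R \succ 0.

% Policy evaluation for a stabilizing gain K_i (control law u_t = -K_i x_t)
P_i = Q + K_i^\top R K_i + (A - B K_i)^\top P_i (A - B K_i).

% Policy improvement
K_{i+1} = \left( R + B^\top P_i B \right)^{-1} B^\top P_i A.
```

In a least-squares variant, the evaluation step is replaced by a regression on measured state and input data, so A and B do not need to be known.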

O-LSPI.m

Implements the main O-LSPI algorithm.

func_data_collect.m

Implements the data collection step for the learning algorithm.
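As a rough illustration of what a data collection step for this kind of algorithm usually involves, the Python sketch below rolls out a linear system under a feedback gain plus exploration noise and records the transitions. It is not a port of `func_data_collect.m`: the function name `collect_data`, its parameters, and the Gaussian noise model are all hypothetical.

```python
import random


def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def collect_data(A, B, K, x0, steps, explore_std=1.0, noise_std=0.1, seed=0):
    """Roll out x_{t+1} = A x_t + B u_t + w_t with u_t = -K x_t + e_t.

    e_t is Gaussian exploration noise and w_t is Gaussian process noise
    (both assumptions of this sketch). Returns a list of (x_t, u_t, x_{t+1}).
    """
    rng = random.Random(seed)
    x = list(x0)
    data = []
    for _ in range(steps):
        # Feedback control plus exploration noise on every input channel.
        u = [-k + rng.gauss(0.0, explore_std) for k in mat_vec(K, x)]
        ax, bu = mat_vec(A, x), mat_vec(B, u)
        # Next state with additive process noise on every state component.
        x_next = [a + b + rng.gauss(0.0, noise_std) for a, b in zip(ax, bu)]
        data.append((x, u, x_next))
        x = x_next
    return data
```

For example, `collect_data([[1.0]], [[1.0]], [[0.5]], [1.0], steps=100)` produces 100 transitions of a scalar system that a least-squares step could then consume.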

data_collect_Noise_Mag.m & Noise_Mag_exp.m

Collects the data for the experiment in the paper.

draw_picture.m

Draws Fig. 1 of the paper.

kronv.m, vec2sm.m & sm2vec.m

Auxiliary functions for vector/matrix conversions.
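Conversions of this kind matter because a quadratic value function x'Px is linear in the distinct entries of the symmetric matrix P, so P can be fitted by ordinary least squares once it is written as a vector. The Python sketch below shows one common convention (upper triangle stacked row by row, off-diagonal monomials doubled). The function names are hypothetical, and the ordering and scaling used by `kronv.m`, `vec2sm.m` and `sm2vec.m` may differ.

```python
import math


def sym_to_vec(P):
    """Stack the upper triangle of a symmetric matrix row by row."""
    n = len(P)
    return [P[i][j] for i in range(n) for j in range(i, n)]


def vec_to_sym(v):
    """Inverse of sym_to_vec: rebuild the symmetric matrix from its upper triangle."""
    n = (math.isqrt(8 * len(v) + 1) - 1) // 2
    if n * (n + 1) // 2 != len(v):
        raise ValueError("length must be n*(n+1)/2 for some integer n")
    P = [[0.0] * n for _ in range(n)]
    entries = iter(v)
    for i in range(n):
        for j in range(i, n):
            P[i][j] = P[j][i] = next(entries)
    return P


def quad_monomials(x):
    """Monomials x_i*x_j for i <= j, with off-diagonal terms doubled,
    so that x'Px equals the dot product of sym_to_vec(P) and quad_monomials(x)."""
    n = len(x)
    return [(1 if i == j else 2) * x[i] * x[j]
            for i in range(n) for j in range(i, n)]
```

With P = [[2, 1], [1, 3]] and x = [1, 2], the dot product of `sym_to_vec(P)` and `quad_monomials(x)` is 18, which matches x'Px computed directly.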

About

A version of optimistic LSPI for the LQR problem, published at the AAAI-21 conference.
