
A Joint Imitation-Reinforcement Learning Framework for Reduced Baseline Regret

Official repository for "A Joint Imitation-Reinforcement Learning (JIRL) Framework for Reduced Baseline Regret", accepted at IROS 2021.

Technical Report

The report contains a detailed description of the experimental settings and hyperparameters used to obtain the results reported in our paper.

Objectives

  1. Leveraging a baseline’s online demonstrations to minimize the regret w.r.t. the baseline policy during training
  2. Eventually surpassing the baseline’s performance

Assumptions

  1. Access to a baseline policy at every time step
  2. The learning agent uses an off-policy RL algorithm (a minimal sketch of the resulting loop is given below)
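
To show how these assumptions fit together, here is a minimal, self-contained sketch of a generic joint imitation + off-policy RL loop: the baseline is queried at every step, its actions are executed while the learner is not yet trusted (keeping regret w.r.t. the baseline small), and the learner is trained off-policy from stored transitions that also record the baseline's demonstrated actions. All names here (`BaselinePolicy`, `OffPolicyLearner`, `trust_in_learner`, the toy environment) are illustrative placeholders, not the actual JIRL implementation or its intervention rule.

```python
import random

class BaselinePolicy:
    """Stand-in for the baseline controller that is available at every time step."""
    def act(self, obs):
        return random.choice([0, 1])  # e.g. a hand-coded or pre-trained controller

class OffPolicyLearner:
    """Stand-in for any off-policy RL agent trained with an added imitation term."""
    def __init__(self):
        self.buffer = []  # replay buffer of (obs, action, baseline_action, reward, next_obs)

    def act(self, obs):
        return random.choice([0, 1])

    def store(self, transition):
        self.buffer.append(transition)

    def update(self):
        # A real agent would minimize an off-policy RL loss plus an imitation loss
        # that keeps the learner close to the baseline's demonstrated actions.
        pass

def trust_in_learner(step, total_steps):
    """Illustrative schedule: rely on the baseline early, hand control over later."""
    return min(1.0, step / (0.5 * total_steps))

def toy_env_step(obs, action):
    """Trivial stand-in environment returning (reward, next_obs, done)."""
    return random.random(), random.random(), random.random() < 0.01

def train(total_steps=1000):
    baseline, learner = BaselinePolicy(), OffPolicyLearner()
    obs = random.random()
    for step in range(total_steps):
        baseline_action = baseline.act(obs)   # assumption 1: baseline queried at every step
        learner_action = learner.act(obs)
        # Execute the baseline's action unless the learner is trusted; this is what
        # keeps the regret w.r.t. the baseline small during training.
        if random.random() < trust_in_learner(step, total_steps):
            action = learner_action
        else:
            action = baseline_action
        reward, next_obs, done = toy_env_step(obs, action)
        learner.store((obs, action, baseline_action, reward, next_obs))
        learner.update()                      # assumption 2: off-policy update from the buffer
        obs = random.random() if done else next_obs

if __name__ == "__main__":
    train()
```

How the executed action is chosen between learner and baseline, and the exact form of the imitation term, are the parts of JIRL described in the paper and the technical report.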

Framework

Experiment Domains

Inverted pendulum
Lunar lander
Walker-2D
Lane following (CARLA)
Lane following (JetRacer)

Results

Performance

Inverted pendulum
Lunar lander
Lane following (CARLA)
Walker-2D
Lane following (JetRacer)

Baseline Regret

Lunar lander (JIRL vs TRPO)
