This repository implements and showcases experiments based on the paper "Algorithms for Inverse Reinforcement Learning" by Ng & Russell (2000).
In the first experiment, a 5 × 5 grid world is used. The agent starts in the lower-left square and navigates to the absorbing upper-right square. The actions are the four compass directions, but with a 30% chance of moving in a random direction instead. The objective is to recover the reward structure given the policy and the problem dynamics; a minimal sketch of the corresponding linear program is shown after the results below.
- Recovered a reward function by observing the policy of a trained agent; the result closely approximates the true reward.
- Also derived a reward function directly from a given policy.
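For reference, here is a minimal sketch of the linear-programming formulation from Ng & Russell (2000) for finite state spaces, on which this experiment is based. It assumes the optimal policy and the full transition matrices are known; the function name `lp_irl` and its arguments are illustrative, not identifiers taken from this repository.

```python
# Minimal sketch of LP-based IRL for finite state spaces (Ng & Russell, 2000).
# Assumes the optimal policy and transition dynamics are known.
import numpy as np
from scipy.optimize import linprog


def lp_irl(P, policy, gamma=0.9, l1=1.0, r_max=1.0):
    """Recover a reward vector from a known optimal policy.

    P      : (n_actions, n_states, n_states) transition matrices.
    policy : (n_states,) index of the optimal action in each state.
    """
    n_actions, n_states, _ = P.shape
    # Transition matrix of the optimal policy (one row per state).
    P_pi = P[np.asarray(policy), np.arange(n_states), :]
    inv = np.linalg.inv(np.eye(n_states) - gamma * P_pi)

    # Decision variables: [R (n), t (n), u (n)];
    # maximise sum(t) - l1 * sum(u)  ->  minimise -sum(t) + l1 * sum(u).
    n = n_states
    c = np.concatenate([np.zeros(n), -np.ones(n), l1 * np.ones(n)])

    A_ub, b_ub = [], []
    for a in range(n_actions):
        # Row i: (P_{a1}(i) - P_a(i)) (I - gamma * P_{a1})^{-1}
        diff = (P_pi - P[a]) @ inv
        for i in range(n_states):
            if a == policy[i]:
                continue
            # Optimality of the observed policy: diff[i] @ R >= 0.
            row = np.zeros(3 * n)
            row[:n] = -diff[i]
            A_ub.append(row); b_ub.append(0.0)
            # Epigraph of the min over non-optimal actions: t_i <= diff[i] @ R.
            row = np.zeros(3 * n)
            row[:n] = -diff[i]
            row[n + i] = 1.0
            A_ub.append(row); b_ub.append(0.0)
    # L1 penalty: -u_i <= R_i <= u_i.
    for i in range(n_states):
        row = np.zeros(3 * n); row[i] = 1.0; row[2 * n + i] = -1.0
        A_ub.append(row); b_ub.append(0.0)
        row = np.zeros(3 * n); row[i] = -1.0; row[2 * n + i] = -1.0
        A_ub.append(row); b_ub.append(0.0)

    bounds = [(-r_max, r_max)] * n + [(None, None)] * n + [(0, None)] * n
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=bounds, method="highs")
    return res.x[:n]
```

The `t` variables encode the minimum advantage of the observed action over the alternatives, and the `u` variables implement the L1 penalty that favours simple reward functions.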
The second experiment involves the "mountain-car" task, where the goal is to reach the top of the hill. The true, undiscounted reward is -1 per step until the goal is reached. The state is the car's position and velocity, so the state space is continuous.
- Using a reward function expressed as a linear combination of 26 Gaussian-shaped basis functions of the car's position, the algorithm produces a reward that captures the structure of the true reward.
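As a rough illustration of how such a parameterisation might look, the snippet below builds 26 evenly spaced Gaussian basis functions of the car's position and evaluates a linear reward on top of them. The centres, width, and position bounds are assumptions for illustration, not values taken from this repository.

```python
# Hypothetical sketch: reward as a linear combination of 26 Gaussian
# basis functions of the car's position (mountain-car experiment).
import numpy as np

N_BASIS = 26
X_MIN, X_MAX = -1.2, 0.6            # standard mountain-car position bounds (assumed)
centres = np.linspace(X_MIN, X_MAX, N_BASIS)
width = (X_MAX - X_MIN) / N_BASIS   # assumed Gaussian width


def phi(x):
    """Evaluate the 26 Gaussian basis functions at position x."""
    return np.exp(-0.5 * ((x - centres) / width) ** 2)


def reward(x, alpha):
    """Linear reward R(x) = alpha^T phi(x); alpha is what the IRL step fits."""
    return alpha @ phi(x)
```

The IRL linear program then fits the weight vector `alpha` subject to the optimality constraints induced by the observed policy.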
The final experiment applies the sample-based algorithm to a continuous version of the earlier grid world. The state space is [0, 1] × [0, 1], and actions move the agent 0.2 in the intended direction with added noise. The true reward is 1 in the non-absorbing square [0.8, 1] × [0.8, 1] and 0 everywhere else.
- Using linear combinations of two-dimensional Gaussian basis functions, the algorithm produces reasonable solutions.
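In the sample-based setting, the reward weights can be fit with a small linear program once empirical, per-basis-function discounted returns have been estimated from Monte Carlo rollouts. The sketch below is one hedged take on that LP step (the full algorithm iterates, repeatedly adding the optimal policy under the current reward to the candidate set); `mu_expert`, `mu_candidates`, and `fit_alpha` are illustrative names, not identifiers from this repository.

```python
# Rough sketch of the LP step of the sample-based algorithm, assuming
# per-basis-function empirical returns have already been estimated:
# mu_expert[j] is the estimated discounted sum of basis function j under
# the observed policy; mu_candidates[i, j] is the same under candidate i.
import numpy as np
from scipy.optimize import linprog


def fit_alpha(mu_expert, mu_candidates):
    """Find basis weights alpha that make the expert's return beat every
    candidate's, using the penalty p(x) = x for x >= 0 and 2x otherwise."""
    n_cand, n_basis = mu_candidates.shape
    diffs = mu_expert[None, :] - mu_candidates     # (n_cand, n_basis)

    # Variables: [alpha (n_basis), t (n_cand)].  p(x) = min(x, 2x) is concave
    # piecewise-linear, so t_i <= p(diffs[i] @ alpha) becomes two constraints:
    # t_i <= diffs[i] @ alpha and t_i <= 2 * diffs[i] @ alpha.
    c = np.concatenate([np.zeros(n_basis), -np.ones(n_cand)])
    A_ub, b_ub = [], []
    for i in range(n_cand):
        for scale in (1.0, 2.0):
            row = np.zeros(n_basis + n_cand)
            row[:n_basis] = -scale * diffs[i]
            row[n_basis + i] = 1.0
            A_ub.append(row); b_ub.append(0.0)

    bounds = [(-1.0, 1.0)] * n_basis + [(None, None)] * n_cand
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=bounds, method="highs")
    return res.x[:n_basis]
```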
Feel free to explore my introductory presentation on Inverse Reinforcement Learning (IRL) for an overview of the experiments conducted.
- Ng, A. Y., & Russell, S. J. (2000). Algorithms for Inverse Reinforcement Learning. In Proceedings of the Seventeenth International Conference on Machine Learning (ICML).
- ShivinDass. (n.d.). inverse_rl [GitHub repository]. GitHub.
- Neka-Nat. (n.d.). inv_rl: Inverse reinforcement learning algorithms [GitHub repository]. GitHub.