irl-hierarchal-maxent-safe-exploration

Implementation of an updated softmax Maximum Entropy algorithm for inverse reinforcement learning in a 2D capture-the-flag (CTP) setup.
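For readers new to the method, the core of softmax Maximum Entropy IRL can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the code in this repository: it assumes a small deterministic MDP, one-hot state features (so the reward is one weight per state), expert trajectories of equal length, and the common simplification of pairing discounted soft value iteration with a finite-horizon visitation count. The names `soft_value_iteration`, `expected_visitation` and `maxent_irl` are hypothetical.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def soft_value_iteration(reward, next_state, gamma=0.9, iters=200):
    """Softmax Bellman backups; returns pi[s][a] proportional to exp(Q[s][a]).

    reward[s]        : reward of state s (assumed state-based)
    next_state[s][a] : successor of state s under action a (assumed deterministic)
    """
    n = len(reward)
    v = [0.0] * n
    for _ in range(iters):
        q = [[reward[s] + gamma * v[s2] for s2 in next_state[s]] for s in range(n)]
        v = [logsumexp(row) for row in q]  # soft maximum instead of a hard max
    return [[math.exp(qa - v[s]) for qa in q[s]] for s in range(n)]

def expected_visitation(policy, next_state, start, horizon):
    """Expected number of visits to each state over `horizon` steps."""
    n = len(policy)
    d = [0.0] * n
    d[start] = 1.0
    total = list(d)
    for _ in range(horizon - 1):
        nd = [0.0] * n
        for s in range(n):
            for a, p in enumerate(policy[s]):
                nd[next_state[s][a]] += d[s] * p
        d = nd
        total = [t + x for t, x in zip(total, d)]
    return total

def maxent_irl(trajs, next_state, start, lr=0.1, epochs=100):
    """Gradient ascent on per-state reward weights theta."""
    n = len(next_state)
    horizon = len(trajs[0])  # assumes all trajectories have the same length
    empirical = [0.0] * n
    for traj in trajs:
        for s in traj:
            empirical[s] += 1.0 / len(trajs)
    theta = [0.0] * n
    for _ in range(epochs):
        pi = soft_value_iteration(theta, next_state)
        expected = expected_visitation(pi, next_state, start, horizon)
        # MaxEnt gradient: expert visitation minus the learner's expected visitation
        theta = [t + lr * (e - x) for t, e, x in zip(theta, empirical, expected)]
    return theta

# Usage on a hypothetical 4-state corridor (action 0 = left, action 1 = right)
corridor = [[0, 1], [0, 2], [1, 3], [2, 3]]
theta = maxent_irl([[0, 1, 2, 3, 3, 3]], corridor, start=0)
```

The soft maximum (`logsumexp`) is what makes the recovered policy stochastic: actions are chosen in proportion to the exponentiated value rather than greedily.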

The agent successfully learned to solve hierarchical tasks, that is, it learned all of its subgoals, while exploring safely (avoiding risky states in the state space).
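One simple way to picture "avoiding risky states" is action masking: before sampling, drop every action whose successor is known to be risky and renormalize the rest. The sketch below is only an illustration of that idea and is not necessarily how this repository handles risk. The function names, the `risky_states` set, and the fallback behaviour when every action is risky are all assumptions.

```python
import random

def safe_action_probs(probs, successors, risky_states):
    """Zero out actions that lead into a risky state, then renormalize.

    probs[a]      : policy probability of action a in the current state
    successors[a] : state reached by action a (assumed known and deterministic)
    risky_states  : hypothetical set of states the agent should not enter
    """
    masked = [0.0 if s2 in risky_states else p for p, s2 in zip(probs, successors)]
    total = sum(masked)
    if total == 0.0:
        # Every action is risky (or has zero probability): fall back to the
        # unmasked policy. This fallback is an assumption of the sketch.
        return list(probs)
    return [p / total for p in masked]

def sample_safe_action(probs, successors, risky_states, rng=random):
    """Sample an action index from the masked, renormalized policy."""
    safe = safe_action_probs(probs, successors, risky_states)
    return rng.choices(range(len(safe)), weights=safe, k=1)[0]

# Usage: action 2 leads to risky state 3, so it is never sampled
action = sample_safe_action([0.25, 0.25, 0.5], [1, 2, 3], {3})
```

Masking keeps exploration stochastic among the safe actions; a softer alternative would be to subtract a risk penalty from the reward instead of forbidding the transition outright.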

We used the interface of the Minimalistic Gridworld Environment for OpenAI Gym (https://github.com/maximecb/gym-minigrid) but modified it to support three main CTP scenarios.
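To give a feel for the interface, here is a dependency-free toy environment that follows the classic Gym convention of `reset()` returning an observation and `step(action)` returning `(observation, reward, done, info)`. It is a hypothetical stand-in, not one of the three scenarios in this repository and not a gym-minigrid class: the class name `TinyCaptureTheFlag`, the grid layout, the observation format and the reward are all assumptions made for illustration.

```python
class TinyCaptureTheFlag:
    """Hypothetical 2D grid: reach the flag, then carry it back to base."""

    # Action index -> (row delta, column delta): up, down, left, right
    MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}

    def __init__(self, size=4, base=(0, 0), flag=(3, 3)):
        self.size, self.base, self.flag = size, base, flag
        self.reset()

    def reset(self):
        """Put the agent back on its base without the flag."""
        self.pos, self.has_flag = self.base, False
        return (self.pos, self.has_flag)

    def step(self, action):
        """Move one cell (clipped at the walls) and check both subgoals."""
        dr, dc = self.MOVES[action]
        row = min(max(self.pos[0] + dr, 0), self.size - 1)
        col = min(max(self.pos[1] + dc, 0), self.size - 1)
        self.pos = (row, col)
        if self.pos == self.flag:
            self.has_flag = True  # first subgoal: pick up the flag
        done = self.has_flag and self.pos == self.base  # second subgoal: return
        reward = 1.0 if done else 0.0
        return (self.pos, self.has_flag), reward, done, {}

# Usage: a 2x2 grid solved in four moves (down, right, up, left)
env = TinyCaptureTheFlag(size=2, flag=(1, 1))
obs = env.reset()
for a in (1, 3, 0, 2):
    obs, reward, done, info = env.step(a)
```

Including `has_flag` in the observation is what lets a learner distinguish the two subgoals, since the same cell calls for different behaviour before and after the flag is picked up.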

An example configuration for testing:


annahedstroem/irl-hierarchal-maxent-safe-exploration

Folders and files

Latest commit

History

Repository files navigation

irl-hierarchal-maxent-safe-exploration

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages