RLHPS

RLHPS is based on the implementation of Deep Reinforcement Learning from Human Preferences [Christiano et al., 2017].

The system allows you to teach a reinforcement learning agent novel behaviors, even when both of the following are true:

- The behavior does not have a pre-defined reward function.
- A human can recognize the desired behavior, but cannot demonstrate it.
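
Both conditions can be met because the human only compares pairs of short clips, and a reward predictor is fit to those comparisons. The sketch below illustrates the preference model described in Christiano et al., 2017: the probability that one clip is preferred is a softmax over the summed predicted rewards of the two clips, trained with a cross-entropy loss. It is a minimal illustration, not code from this repository; the function names `preference_probability` and `preference_loss` are hypothetical, and the per-step rewards are plain lists of floats standing in for a learned predictor's outputs.

```python
import math


def preference_probability(rewards_a, rewards_b):
    """Probability that clip A is preferred over clip B.

    rewards_a / rewards_b: predicted per-step rewards for each clip
    (assumed here to be plain lists of floats).
    """
    sum_a = sum(rewards_a)
    sum_b = sum(rewards_b)
    # Subtract the larger sum before exponentiating for numerical stability.
    shift = max(sum_a, sum_b)
    exp_a = math.exp(sum_a - shift)
    exp_b = math.exp(sum_b - shift)
    return exp_a / (exp_a + exp_b)


def preference_loss(rewards_a, rewards_b, label):
    """Cross-entropy loss for one human comparison.

    label: 1.0 if the human preferred clip A, 0.0 if clip B,
    0.5 if the clips were judged equally good.
    """
    p_a = preference_probability(rewards_a, rewards_b)
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(p_a + eps)
             + (1.0 - label) * math.log(1.0 - p_a + eps))
```

For example, `preference_loss([1.0, 1.0], [0.0, 0.0], 1.0)` is smaller than the same call with label `0.0`, since the predictor already ranks clip A higher. Minimizing this loss over many comparisons pushes the predicted rewards to agree with the human's judgments.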

It's also just a lot of fun to train simulated robots to do whatever you want! For example, in the MuJoCo "Walker" environment, the agent is usually rewarded for moving forwards, but you might want to teach it to do ballet instead.


kaichiuwong/rlhps
