Commit
Showing 2 changed files with 6 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# Reverse Curriculum Generation for Reinforcement Learning | ||
|
||
Carlos Florensa, David Held, Markus Wulfmeier, Pieter Abbeel | ||
|
||
Many relevant tasks require an agent to reach a certain state, or to manipulate objects into a desired configuration. For example, we might want a robot to align and assemble a gear onto an axle or insert and turn a key in a lock. These tasks present considerable difficulties for reinforcement learning approaches, since the natural reward function for such goal-oriented tasks is sparse and prohibitive amounts of exploration are required to reach the goal and receive a learning signal. Past approaches tackle these problems by manually designing a task-specific reward shaping function to help guide the learning. Instead, we propose a method to learn these tasks without requiring any prior task knowledge other than obtaining a single state in which the task is achieved. The robot is trained in "reverse", gradually learning to reach the goal from a set of starting positions increasingly far from the goal. Our method automatically generates a curriculum of starting positions that adapts to the agent's performance, leading to efficient training on such tasks. We demonstrate our approach on difficult simulated fine-grained manipulation problems, not solvable by state-of-the-art reinforcement learning methods. |
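
The abstract describes training "in reverse": begin with start states at the goal and move them outward as the agent improves. The following is a minimal sketch of that idea on a one-dimensional toy problem, not the authors' implementation. The abstract gives no algorithmic details, so everything concrete here is an assumption made for illustration: the random-step expansion of start states, the success-rate thresholds `r_min` and `r_max`, and the hypothetical `ToyAgent` class, which stands in for a real policy and learner.

```python
import random


class ToyAgent:
    """Hypothetical stand-in for a policy: it succeeds from starts within `reach` of the goal."""

    def __init__(self, goal):
        self.goal = goal
        self.reach = 0.0

    def success_rate(self, s):
        # 1.0 inside the mastered region, falling linearly to 0.0 one unit beyond it.
        gap = abs(s - self.goal) - self.reach
        return min(1.0, max(0.0, 1.0 - gap))

    def train(self, starts):
        # Toy "learning": the agent masters the farthest start it practised on.
        self.reach = max([self.reach] + [abs(s - self.goal) for s in starts])


def expand_starts(starts, step, n_new, rng, low, high):
    """Propose new start states by taking short random steps from current ones."""
    proposals = []
    for _ in range(n_new):
        s = rng.choice(starts) + rng.uniform(-step, step)
        proposals.append(min(high, max(low, s)))
    return proposals


def select_starts(candidates, success_rate, r_min=0.1, r_max=0.9):
    """Keep starts that are neither already mastered nor currently hopeless."""
    return [s for s in candidates if r_min <= success_rate(s) <= r_max]


def reverse_curriculum(agent, iterations, step=0.5, n_new=50, low=0.0, high=10.0, seed=0):
    rng = random.Random(seed)
    starts = [agent.goal]  # the only prior knowledge: one state where the task is achieved
    for _ in range(iterations):
        candidates = expand_starts(starts, step, n_new, rng, low, high)
        # Fall back to the previous starts if no candidate has intermediate difficulty.
        selected = select_starts(candidates, agent.success_rate) or starts
        agent.train(selected)
        starts = selected
    return starts


agent = ToyAgent(goal=0.0)
final_starts = reverse_curriculum(agent, iterations=20)
print(f"mastered distance from goal: {agent.reach:.2f}")
```

The filter in `select_starts` is what makes the curriculum adapt to the agent's performance: starts the agent already solves reliably, or cannot solve at all, are dropped, so training concentrates on the frontier of its current ability. In a real setting, `success_rate` would be estimated from rollouts and `train` would be a reinforcement learning update.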