We trained an agent in the Gazebo environment using Deep Deterministic Policy Gradient (DDPG) to control a TurtleBot in a continuous action space so that it reaches the target (green circle) without any collision. The results show that the agent can reach a random target in a 4 * 4 square while avoiding unseen obstacles of different shapes, taking a 10-dimensional sparse range reading and the target position as input.
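As an illustration of the input described above, here is a minimal sketch of how such an observation could be assembled. It is not the code used in this project: the function name `build_observation`, the 3.5 m range cap, the normalization, and the encoding of the target as distance plus heading are all assumptions made for the example.

```python
import math


def build_observation(scan, robot_xy, robot_yaw, target_xy,
                      n_beams=10, max_range=3.5):
    """Return n_beams normalized ranges followed by distance and heading to the target.

    All names and defaults here are hypothetical; max_range=3.5 is an assumed cap.
    """
    step = len(scan) / n_beams
    beams = []
    for i in range(n_beams):
        r = scan[int(i * step)]
        # Treat missing readings (inf or NaN) as "nothing within range".
        if math.isinf(r) or math.isnan(r):
            r = max_range
        # Clip to the assumed cap and scale to [0, 1].
        beams.append(min(r, max_range) / max_range)

    dx = target_xy[0] - robot_xy[0]
    dy = target_xy[1] - robot_xy[1]
    distance = math.hypot(dx, dy)
    # Angle to the target relative to the robot's heading, wrapped to [-pi, pi].
    heading = math.atan2(dy, dx) - robot_yaw
    heading = math.atan2(math.sin(heading), math.cos(heading))
    return beams + [distance, heading]
```

With a 360-ray scan this yields 12 values: 10 sparse range readings plus 2 values describing the target. The actual state layout may differ; see the report for the definition used in the experiments.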
The report is available at https://github.com/lyuheng/650project/blob/master/demo/650report.pdf
roslaunch turtlebot3_gazebo turtlebot3_stage_1.launch
roslaunch project ddpg_stage_1.launch
- pretrain in an environment without any obstacles (using DDPG)
- fine-tune in an environment with square obstacles at the four corners (using DDPG)
- evaluate the agent in an unseen virtual environment with obstacles of different shapes at random positions
- evaluate the agent in a more complex unseen environment, although this takes longer
- evaluate the agent with slow-moving obstacles
- Substitute DDPG with TD3
- Add an LSTM to the actor
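
For the TD3 item in the list above, the main change relative to DDPG is how the critic target is computed: TD3 adds clipped noise to the target action (target policy smoothing) and takes the minimum of two target critics (clipped double Q-learning). It also updates the actor less often than the critics, which is not shown here. The following is a minimal sketch in plain Python, not code from this repository. The function name `td3_target`, the callables passed in, and the default hyperparameters are assumptions chosen for illustration.

```python
import random


def td3_target(reward, done, next_state, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Compute the TD3 critic target for a single transition.

    target_actor(state) -> list of floats (hypothetical target policy)
    target_q1 / target_q2(state, action) -> float (hypothetical target critics)
    """
    action = target_actor(next_state)
    smoothed = []
    for a in action:
        # Target policy smoothing: add Gaussian noise, clipped to +/- noise_clip.
        noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
        # Keep the perturbed action inside the valid action range.
        smoothed.append(max(action_low, min(action_high, a + noise)))
    # Clipped double Q-learning: use the smaller of the two critic estimates.
    q_min = min(target_q1(next_state, smoothed), target_q2(next_state, smoothed))
    # No bootstrapping from terminal states.
    return reward + gamma * (1.0 - float(done)) * q_min
```

A quick check with constant stand-in networks:

```python
actor = lambda s: [0.0, 0.0]
q1 = lambda s, a: 2.0
q2 = lambda s, a: 1.0
y = td3_target(1.0, False, None, actor, q1, q2)  # uses the smaller critic value, 1.0
```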