- Implementing the PPO Algorithm
- Selecting and conducting experiments with two environments
- Improving the performance of the agents in your environments
- Complete the ... parts in the skeleton code
- Train the PPO Agent with your code
- Compress your source code, evaluation plots, game video, and presentation slides into a zip file and submit it to Blackboard
Goal: Make the humanoid stand up and then keep it standing
- MLP: state_dim -> 128 -> 64 -> action_dim (sketched in code after this list)
- Timesteps: 5M
- ent_coef: 0.0001
- rpo_coef: 0.5
- sym_action_coef: 0.02
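A minimal sketch of this actor network in PyTorch, in the style of CleanRL's continuous-action PPO. The tanh activations and the state-independent log standard deviation are assumptions, not stated on the slide:

```python
import torch
import torch.nn as nn
from torch.distributions import Normal

class Actor(nn.Module):
    """Gaussian policy MLP: state_dim -> 128 -> 64 -> action_dim (Humanoid config)."""
    def __init__(self, state_dim: int, action_dim: int, hidden=(128, 64)):
        super().__init__()
        self.mean_net = nn.Sequential(
            nn.Linear(state_dim, hidden[0]), nn.Tanh(),
            nn.Linear(hidden[0], hidden[1]), nn.Tanh(),
            nn.Linear(hidden[1], action_dim),
        )
        # State-independent log std, one entry per action dimension (assumed).
        self.log_std = nn.Parameter(torch.zeros(action_dim))

    def forward(self, state: torch.Tensor) -> Normal:
        return Normal(self.mean_net(state), self.log_std.exp())
```

The HalfCheetah agent described later uses the same structure with hidden=(64, 64).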
Robust Policy Optimization (CleanRL)
- Modified from PPO
- RPO perturbs the action distribution by adding uniform noise to its mean during training (see the sketch below)
- Improved performance compared to PPO
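In CleanRL's RPO implementation, the perturbation is applied when the log-probabilities of stored actions are recomputed during the policy update: uniform noise in [-alpha, alpha] is added to the predicted mean before the Gaussian is built, which keeps the policy from collapsing onto a deterministic mean. A sketch of that step, reusing the Actor above and the slide's rpo_coef = 0.5 as the noise range (function and variable names are illustrative):

```python
import torch
from torch.distributions import Normal

def evaluate_action(actor, state, action, rpo_coef: float = 0.5):
    """Recompute the log-prob of a stored action with an RPO-perturbed mean."""
    mean = actor.mean_net(state)
    std = actor.log_std.exp()
    # RPO trick: perturb the mean with uniform noise in [-rpo_coef, rpo_coef]
    # before computing the log-probability that goes into the PPO ratio.
    z = torch.empty_like(mean).uniform_(-rpo_coef, rpo_coef)
    dist = Normal(mean + z, std)
    return dist.log_prob(action).sum(-1), dist.entropy().sum(-1)
```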
- Why does the Humanoid use only one arm or leg? -> Add an additional loss
- A Symmetric Action Loss guided the agent to use both arms and legs and improved performance! (see the sketch below)
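The slide does not give the exact form of the Symmetric Action Loss, so the following is only a plausible sketch: penalize the squared difference between actions of mirrored joints, weighted by sym_action_coef = 0.02. The left/right index pairs are hypothetical placeholders and would have to be read off the Humanoid action-space layout:

```python
import torch

# HYPOTHETICAL mirrored joint index pairs (left, right); the real pairs
# depend on the ordering of the Humanoid action space.
MIRROR_PAIRS = [(5, 9), (6, 10), (7, 11), (8, 12)]

def symmetric_action_loss(actions: torch.Tensor, coef: float = 0.02) -> torch.Tensor:
    """Penalize asymmetric use of mirrored joints to encourage using both limbs."""
    left = actions[:, [l for l, _ in MIRROR_PAIRS]]
    right = actions[:, [r for _, r in MIRROR_PAIRS]]
    return coef * ((left - right) ** 2).mean()
```

Added to the PPO objective, this term pushes gradients away from one-sided limb use.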
Goal: Make the cheetah run forward as fast as possible
- MLP: state_dim -> 64 -> 64 -> action_dim
- Timesteps: 1M
- ent_coef: 0.5 (see where this enters the PPO loss in the sketch below)
- RPO not used (it lowered performance in this environment)
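For reference on where ent_coef enters, here is a sketch of the standard clipped PPO loss; the clip range and value coefficient are common defaults, not values from the slide:

```python
import torch

def ppo_loss(ratio, advantages, values, returns, entropy,
             clip_eps: float = 0.2, vf_coef: float = 0.5, ent_coef: float = 0.5):
    """Clipped surrogate + value loss - entropy bonus (ent_coef as listed above)."""
    pg_loss = -torch.min(
        ratio * advantages,
        torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages,
    ).mean()
    v_loss = ((values - returns) ** 2).mean()
    return pg_loss + vf_coef * v_loss - ent_coef * entropy.mean()
```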