[RL-baseline] Model v2, original action set, LR 1e-3 #28

Open · wants to merge 21 commits into RL-with-baseline
Conversation

ziritrion
Collaborator

We found that lowering the learning rate from our original 1e-2 to 1e-3 improved training dramatically.

With a learning rate of 1e-2 (commit befd201), we could barely pass a Running Reward of 30, as the images below show:
[TensorBoard screenshots]

By simply changing the learning rate to 1e-3 we reached a Running Reward just below 700 at some points, but noticed a dramatic drop near the 10k-episode mark (commit 970f84a):
[TensorBoard screenshots]

Here's an example run with the latest model. Notice that the Running Reward achieved by the model at the end of the 10k episodes is only about 250, so the behaviour isn't particularly good.
https://user-images.githubusercontent.com/1465235/111752396-f06b4a80-8895-11eb-97c5-4e0a29091cd1.mp4
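Since the PR itself carries no code snippet here, a minimal sketch may help illustrate why the learning rate matters so much for REINFORCE with a baseline. This is a toy two-armed bandit with a running-reward baseline, written from scratch for illustration; the environment, function names, and hyperparameters are our own assumptions, not taken from this repo's model.

```python
import math
import random

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train(lr=1e-3, episodes=10_000, seed=0):
    """REINFORCE with a running-reward baseline on a toy 2-armed bandit.

    Arm 1 always pays 1.0, arm 0 pays 0.0; the policy should learn to
    prefer arm 1. `lr` plays the same role as the 1e-2 vs 1e-3 setting
    discussed above: too large and the noisy advantage estimates make
    the logits oscillate, too small and learning stalls.
    """
    rng = random.Random(seed)
    logits = [0.0, 0.0]   # one logit per action
    baseline = 0.0        # exponential moving average of the reward
    for _ in range(episodes):
        probs = softmax(logits)
        action = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if action == 1 else 0.0
        advantage = reward - baseline
        baseline = 0.99 * baseline + 0.01 * reward
        # Policy-gradient update: d log pi(a) / d logit_i = 1[i == a] - probs[i]
        for i in range(2):
            grad_log_pi = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad_log_pi
    return softmax(logits)

probs = train(lr=1e-3)
```

With `lr=1e-3` the final policy clearly favours the rewarding arm; cranking `lr` up an order of magnitude makes the updates proportionally larger and, in our real model, appears to be what capped the Running Reward around 30.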
