[RL-baseline] Model v2, original action set, LR 1e-3 #28

ziritrion · 2021-03-19T08:33:26Z

We found that lowering the learning rate from our original 1e-2 to 1e-3 improved training dramatically.

With a learning rate of 1e-2 (commit befd201), the Running Reward barely exceeded 30, as the images below show:

Simply changing the learning rate to 1e-3 brought the Running Reward to just below 700 at some points, but it dropped sharply near the 10k-episode mark (commit 970f84a).
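For context on the metric: the PR doesn't show how Running Reward is computed, so the sketch below assumes it is an exponential moving average of per-episode reward, as in common REINFORCE reference examples. The function names and the 0.05 smoothing factor are illustrative assumptions, not taken from this branch.

```python
def update_running_reward(running_reward, episode_reward, alpha=0.05):
    """One smoothing step (assumed formula, not confirmed by this PR).

    alpha is a hypothetical smoothing factor: higher values react faster
    to the latest episode, lower values give a smoother curve.
    """
    return alpha * episode_reward + (1.0 - alpha) * running_reward


def running_reward_curve(episode_rewards, initial=0.0, alpha=0.05):
    """Return the smoothed reward after each episode in the sequence."""
    curve = []
    running = initial
    for reward in episode_rewards:
        running = update_running_reward(running, reward, alpha)
        curve.append(running)
    return curve


if __name__ == "__main__":
    # Toy rewards only, to show how a drop in episode reward
    # pulls the smoothed value down gradually rather than at once.
    toy_rewards = [700.0] * 5 + [200.0] * 5
    for episode, value in enumerate(running_reward_curve(toy_rewards, alpha=0.5), 1):
        print(episode, round(value, 1))
```

Because a smoothed metric like this lags behind individual episodes, a sustained fall such as the one near the 10k mark would reflect many consecutive weaker episodes rather than a single bad one.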

Here's an example run with the latest model. Note that the Running Reward at the end of the 10k episodes is only about 250, so the behaviour isn't particularly good.
https://user-images.githubusercontent.com/1465235/111752396-f06b4a80-8895-11eb-97c5-4e0a29091cd1.mp4


xeviknal and others added 20 commits March 11, 2021 18:30

Add metrics and logsoftmax

2106b4b

Updating the model

0f2f195

Add new model to baseline

39b3ad3

The line that fixes all

907af7a

Add mean entropy - to reduce tensorboard runs

8e4ee6c

Add action prob mean: mean of prob of actions taken in the episode

3970702

Added simple directory check to params folder

957a3b4

Added additional param save conditions (end of log_interval, last episode, successful training)

0105564

Merge branch 'RL-baseline-new-model' of github.com:xeviknal/aidl-2021-wo-rl into RL-baseline-new-model Trying to solve conflicts

b5a5184

Removing old runs; they don't apply to this branch

c6954ec

RL-baseline-NM-save-optim

a7c907c

Load optimizer params

d5b676c

8k runs

bd7f6c0

Fresh start with latest checkpoint load-save changes. Also, small gitignore change

5f246c0

bugfix

e8aa5e4

10k runs

c52a4f2

Almost 20k runs. Reward is starting to improve little by little

192bece

Fixed runner.py for generating videos

11b46c4

25k runs. Slight improvement but far from desirable

befd201

10k episodes. Learning rate 1e-3. Original actions. RR of almost 700 but drops to 2xx

970f84a
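Two of the commits above add per-episode metrics: mean entropy and "action prob mean" (the mean probability of the actions taken in the episode). The code from those commits isn't shown here, so the following is only a minimal sketch of how such metrics could be computed from per-step action probabilities. The function names and the plain-list inputs are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one categorical action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def episode_metrics(step_probs, actions):
    """Return (mean entropy, mean probability of the taken actions).

    step_probs: one probability list per step of the episode.
    actions: index of the action actually taken at each step.
    Both inputs are illustrative stand-ins for what the policy would output.
    """
    mean_entropy = sum(entropy(p) for p in step_probs) / len(step_probs)
    action_prob_mean = sum(p[a] for p, a in zip(step_probs, actions)) / len(actions)
    return mean_entropy, action_prob_mean


if __name__ == "__main__":
    # Two-step toy episode: an undecided policy, then a fully confident one.
    probs = [[0.5, 0.5], [1.0, 0.0]]
    taken = [0, 0]
    print(episode_metrics(probs, taken))
```

Logged together, a falling mean entropy and a rising action prob mean would indicate the policy becoming more deterministic, which can help when interpreting a reward drop like the one described in this PR.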

ziritrion requested review from xeviknal and jaimepedretp March 19, 2021 08:33

Add 4 different set of actions (#26)

094a5b4
