[RL-baseline] Model v2, experiment #2 #31

ziritrion · 2021-03-20T19:00:04Z

Third of the new experiments with the new action sets (5k episodes), using a learning rate of 1e-3.

The model managed to get a max Running Reward of ~364. Last episode managed 254.
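For readers unfamiliar with the metric, here is a minimal sketch of how a running reward and its maximum could be tracked across episodes. It assumes an exponential moving average with a smoothing factor of 0.05 and a starting value of 0; this PR does not state the exact formula used, and the names `update_running_reward` and `track_running_reward` are hypothetical.

```python
def update_running_reward(running_reward, episode_reward, alpha=0.05):
    """Exponential moving average of episode rewards (assumed formula, not confirmed by the PR)."""
    return alpha * episode_reward + (1.0 - alpha) * running_reward


def track_running_reward(episode_rewards, alpha=0.05):
    """Return (final running reward, maximum running reward seen) over a list of episode rewards."""
    running = 0.0  # assumption: the average starts from zero
    best = float("-inf")
    for reward in episode_rewards:
        running = update_running_reward(running, reward, alpha)
        best = max(best, running)
    return running, best
```

Under this assumption, the final running reward can sit well below the maximum when late episodes score poorly, which would match the gap between the two figures reported above.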

Action set #2 copied below:
[0.0, 0.8, 0.0], # throttle
[0.0, 0.0, 0.6], # brake
[-0.8, 0.0, 0.0], # left
[0.8, 0.0, 0.0], # right
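As an illustration of how a discrete action set like this one can drive a continuous-control environment, the sketch below maps a policy's action index to one of the four vectors. It assumes each row is ordered `[steering, gas, brake]`, which is consistent with the comments above; the helper names `to_env_action` and `sample_action` are hypothetical and not taken from this branch.

```python
import numpy as np

# Action set #2 from this PR; each row is assumed to be [steering, gas, brake].
ACTION_SET = np.array(
    [
        [0.0, 0.8, 0.0],   # throttle
        [0.0, 0.0, 0.6],   # brake
        [-0.8, 0.0, 0.0],  # left
        [0.8, 0.0, 0.0],   # right
    ],
    dtype=np.float32,
)


def to_env_action(action_index):
    """Translate a discrete action index into a continuous action vector (hypothetical helper)."""
    return ACTION_SET[action_index].copy()


def sample_action(probs, rng):
    """Sample an index from the policy's action probabilities and return (index, action vector)."""
    index = int(rng.choice(len(ACTION_SET), p=probs))
    return index, to_env_action(index)
```

With this layout the policy network only needs four outputs, one per row, and the index is what gets stored for the policy-gradient update.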

Results are copied below:

Sample video below (somehow it managed to get an extraordinarily good result).
https://user-images.githubusercontent.com/1465235/111882632-d8d7b300-89b6-11eb-97bc-1114db39d397.mp4


ziritrion · 2021-03-25T20:35:15Z

Updated with 20k episodes. Running reward is now 240 with an achieved max of 565.

Sample video below. In this sample, the car skids out of the circuit and attempts to go back, but starts going backwards.

openaigym.video.0.187057.video000000.mp4

xeviknal and others added 27 commits March 11, 2021 18:30

Add metrics and logsoftmax

2106b4b

Updating the model

0f2f195

Add new model to baseline

39b3ad3

The line that fixes all

907af7a

Add mean entropy - to reduce tensorboard runs

8e4ee6c

Add action prob mean: mean of prob of actions taken in the episode

3970702

Added simple directory check to params folder

957a3b4

Added additional param save conditions (end of log_interval, last episode, successful training)

0105564

Merge branch 'RL-baseline-new-model' of github.com:xeviknal/aidl-2021-wo-rl into RL-baseline-new-model Trying to solve conflicts

b5a5184

Removing old runs; they don't apply to this branch

c6954ec

RL-baseline-NM-save-optim

a7c907c

Load optimizer params

d5b676c

8k runs

bd7f6c0

Fresh start with latest checkpoint load-save changes. Also, small gitignore change

5f246c0

bugfix

e8aa5e4

10k runs

c52a4f2

Almost 20k runs. Reward is starting to improve little by little

192bece

Fixed runner.py for generating videos

11b46c4

25k runs. Slight improvement but far from desirable

befd201

10k episodes. Learning rate 1e-3. Original actions. RR of almost 700 but drops to 2xx

970f84a

Added new action set and removed previous runs for fresh start

1038b29

5k runs, reward around 450

1204087

Fresh start

2b1be44

5k runs. Running Reward 257

a6964e8

Fresh start

b322ca7

Forgot to set new action set

a6f982e

5k runs. Running reward 265

b1d2bf0

ziritrion changed the base branch from main to RL-with-baseline March 20, 2021 19:00

ziritrion requested review from xeviknal and jaimepedretp March 20, 2021 19:00

20k episodes, running reward 241

e5b9dd8

jaimepedretp approved these changes Mar 30, 2021
