[Final experiments - RL-Baseline] experiment #1 with seed 7081960 #57

ziritrion · 2021-04-15T07:57:06Z

Final experiment with our REINFORCE with Baseline algorithm implementation, showing a final reward of 553 after 20,000 episodes with a preset seed of 7081960.
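For context, here is a minimal sketch of the REINFORCE-with-baseline policy loss for a single episode, in plain Python rather than the project's training code. The discount factor `GAMMA`, the function names, and the list-based inputs are assumptions for illustration; only the seed value comes from this PR.

```python
import random

SEED = 7081960  # seed quoted in this PR
GAMMA = 0.99    # assumed discount factor; not stated in the PR


def discounted_returns(rewards, gamma=GAMMA):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    g = 0.0
    # Walk the episode backwards so each return reuses the next one.
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_baseline_loss(log_probs, baselines, rewards, gamma=GAMMA):
    """Policy loss: -sum_t log_pi(a_t | s_t) * (G_t - b_t).

    `baselines` holds the baseline estimate b_t for each visited state
    (hypothetical input; how the project computes it is not shown here).
    """
    returns = discounted_returns(rewards, gamma)
    advantages = [g - b for g, b in zip(returns, baselines)]
    return -sum(lp * adv for lp, adv in zip(log_probs, advantages))


# Fixing the seed makes any sampling done through `random` repeatable.
random.seed(SEED)
```

In a gradient-based training loop the advantage is typically treated as a constant when differentiating the policy loss, and the baseline is fitted with its own regression loss.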

The Learning Rate for this experiment was 1*10e-3.

The action set chosen for all experiments is the following:
[0.0, 0.3, 0.0], # throttle
[0.0, 0.1, 0.0], # throttle
[0.0, 0.0, 0.0], # throttle
[0.0, 0.0, 0.7], # brake
[0.0, 0.0, 0.5], # brake
[0.0, 0.0, 0.2], # brake
[-1.0, 0.0, 0.05], # left
[-0.5, 0.0, 0.05], # left
[-0.2, 0.0, 0.05], # left
[1.0, 0.0, 0.05], # right
[0.5, 0.0, 0.05], # right
[0.2, 0.0, 0.05], # right
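The list above can be used as a lookup table that turns a discrete policy output into a continuous control vector. The sketch below assumes each entry is ordered `[steering, throttle, brake]`, which is inferred from the labels and not stated in the PR; `sample_action` and its `probs` argument are hypothetical names for illustration.

```python
import random

# Assumption: each action is [steering, throttle, brake], inferred from the labels.
ACTIONS = [
    [0.0, 0.3, 0.0],    # throttle
    [0.0, 0.1, 0.0],    # throttle
    [0.0, 0.0, 0.0],    # throttle (zero input)
    [0.0, 0.0, 0.7],    # brake
    [0.0, 0.0, 0.5],    # brake
    [0.0, 0.0, 0.2],    # brake
    [-1.0, 0.0, 0.05],  # left
    [-0.5, 0.0, 0.05],  # left
    [-0.2, 0.0, 0.05],  # left
    [1.0, 0.0, 0.05],   # right
    [0.5, 0.0, 0.05],   # right
    [0.2, 0.0, 0.05],   # right
]


def sample_action(probs, rng):
    """Sample an action index from `probs` and return (index, control vector).

    `probs` is a hypothetical list of 12 policy probabilities, one per action.
    """
    if len(probs) != len(ACTIONS):
        raise ValueError("expected one probability per action")
    index = rng.choices(range(len(ACTIONS)), weights=probs, k=1)[0]
    return index, ACTIONS[index]


# Example: a seeded generator and a policy that always picks action 9.
rng = random.Random(7081960)
one_hot = [0.0] * len(ACTIONS)
one_hot[9] = 1.0
index, control = sample_action(one_hot, rng)
```

Keeping the table as plain data means the policy network only needs one output per row, and the action set can be swapped without touching the sampling code.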

Tensorboard screenshots below:


* Using GPU
* Actually using gpu
* Adding the running reward to tensorboard
* Substituted a hardcoded path for the proper param
* Added a getitem into Actions class...
* Removing the actions class: pretty unnecessary
* Updating .gitignore and install.sh script

Co-authored-by: ziri <ziritrion@gmail.com>


jaimepedretp and others added 30 commits February 5, 2021 18:39

log_interval 5

358087d

Merge pull request #2 from xeviknal/with-baseline

3635b91

log_interval 5

Merge branch 'main' of https://github.com/xeviknal/aidl-2021-wo-rl into main

2ddf6bd

RL-with baseline. JuanJo's proposal added

a7ac921

Fixed trainer.py for GPU execution and modified main.py to run in Ubuntu

472e1f7

Add visualization skills: evaluation mode (#12)

caf0744

Co-authored-by: ziritrion <ziritrion@gmail.com>

Finished 30k runs; reward was not significantly improved (#13)

4d7130e

Co-authored-by: ziritrion <ziritrion@gmail.com>

Add metrics and logsoftmax

2106b4b

Updating the model

0f2f195

Add new model to baseline

39b3ad3

The line that fixes all

907af7a

Add mean entropy - to reduce tensorboard runs

8e4ee6c

Add action prob mean: mean of prob of actions taken in the episode

3970702

Added simple directory check to params folder

957a3b4

Added additional param save conditions (end of log_interval, last episode, successful training)

0105564

Merge branch 'RL-baseline-new-model' of github.com:xeviknal/aidl-2021-wo-rl into RL-baseline-new-model

b5a5184

Trying to solve conflicts

Removing old runs; they don't apply to this branch

c6954ec

RL-baseline-NM-save-optim

a7c907c

Load optimizer params

d5b676c

8k runs

bd7f6c0

Fresh start with latest checkpoint load-save changes. Also, small gitignore change

5f246c0

bugfix

e8aa5e4

10k runs

c52a4f2

Almost 20k runs. Reward is starting to improve little by little

192bece

Fixed runner.py for generating videos

11b46c4

25k runs. Slight improvement but far from desirable

befd201

10k episodes. Learning rate 1e-3. Original actions. RR of almost 700 but drops to 2xx

970f84a

Added new action set and removed previous runs for fresh start

1038b29

5k runs, reward around 450

1204087

ziritrion and others added 6 commits March 23, 2021 17:28

20k episodes. Running reward 513

b44f017

Run up to 33.7k episodes using actions set 3 - max/final av. reward 881/825

d3dda5a

added video eval mode - using model max/final av. reward 881/825

447dd0c

Adding seeds

2bb3373

Fresh start for RL-Baseline final experiment #1 with seed 7081960

c709898

20k episodes, running reward 553

9830004
