Hi
This is to discuss how the episode length may affect the learning process.
Case 1: The default as in the repo
game.set_episode_timeout(300)
The smoothed steady-state reward is around 0.55 (see figure below).
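For context, here is a minimal sketch of where that call sits in a typical ViZDoom setup; the `basic.cfg` path is a placeholder, not necessarily the scenario config the repo uses:

```python
from vizdoom import DoomGame

game = DoomGame()
game.load_config("basic.cfg")   # placeholder scenario config (assumption)
game.set_episode_timeout(300)   # Case 1: episode is cut off after 300 tics
game.init()
```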
Case 2: Shorter episode
game.set_episode_timeout(150)
Very similar to Case 1
Case 3: Very short episode
game.set_episode_timeout(70)
The agent should find the policy quickly, because it has a very limited time window to explore.
Convergence is delayed (after 500 episodes).
However, the reward is around 0.65, higher than Case 1's 0.55 (see figure below). Why? Convergence is delayed, but on the other hand we get better rewards. I mean that the agent usually accomplishes the task in less time than in Case 1; the agent seems more efficient and focused than in Case 1. What do you think?!
Case 4: Longer episode
game.set_episode_timeout(450)
Smoothed reward is around 0.62.
Smoothed episode length is around 33.
Convergence is delayed compared to Case 1.
Case 5: Each worker has its own episode length
Is it even a valid idea?!
Where: episode length = 75 + (worker_number * 25), as in the sketch after this list:
worker_0: episode length = 75
worker_1: episode length = 100
worker_2: episode length = 125
...
worker_7: episode length = 250
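A minimal sketch of how I assign these timeouts, assuming the repo creates one DoomGame per A3C worker and `number` is the worker index (`make_env` and the `basic.cfg` path are my own placeholders):

```python
from vizdoom import DoomGame

def make_env(number, base=75, step=25):
    game = DoomGame()
    game.load_config("basic.cfg")                  # placeholder scenario config
    game.set_episode_timeout(base + number * step) # per-worker timeout
    game.init()
    return game

# Case 5: worker_0 -> 75 tics, worker_1 -> 100, ..., worker_7 -> 250
envs = [make_env(number) for number in range(8)]
```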
It seems that worker_7 with episode length 250 converged faster than worker_0 with episode length 75.
The following figure includes all workers:
The following figure includes only worker_0 (episode length = 75) and worker_7 (episode length = 250)
However, all workers share the same global network. Do you think having different episode lengths could affect or even enhance the learning? What do you think?
Again: Is it even a valid idea?!
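To make the question concrete, here is a toy sketch of the asynchronous pattern I mean. This is not the repo's code: the parameter vector stands in for the global network, each thread for a worker, and the noise scale encodes only my assumption that longer episodes average over more steps per update:

```python
import threading
import numpy as np

TARGET = np.array([1.0, -2.0, 0.5, 3.0])  # hypothetical optimum
global_params = np.zeros(4)                # stand-in for the global network
lock = threading.Lock()

def worker(episode_timeout, episodes=200):
    rng = np.random.default_rng(episode_timeout)
    for _ in range(episodes):
        with lock:
            local = global_params.copy()   # sync local <- global
        # One "episode": a noisy gradient estimate; the assumption is that a
        # longer episode averages over more steps, hence less noise.
        noise = rng.normal(scale=1.0 / np.sqrt(episode_timeout), size=4)
        grad = 2.0 * (local - TARGET) + noise
        with lock:
            global_params[:] -= 0.05 * grad  # apply update to the global net

threads = [threading.Thread(target=worker, args=(75 + n * 25,)) for n in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("distance to optimum:", np.linalg.norm(global_params - TARGET))
```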
Case 6: Each worker has its own episode length, with a larger range
Where: episode length = 100 + (worker_number * 50), as in the one-liner after this list:
worker_0: episode length = 100
worker_1: episode length = 150
worker_2: episode length = 200
...
worker_7: episode length = 450
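Under the same assumptions as the Case 5 sketch, only the two constants change:

```python
# Case 6: worker_0 -> 100 tics, worker_7 -> 450 (reusing make_env from above)
envs = [make_env(number, base=100, step=50) for number in range(8)]
```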
The following figure includes all workers:
The following figure includes only worker_0 (episode length = 100), worker_4 (episode length = 300), and worker_7 (episode length = 450).
The longer the episode, the faster the learning.