Commit 1a28a9d

feat(a3c): set update frequency so multiple updates happen per episode
- usually anyway. I don't know what a good update frequency is, if you do, @ me.
1 parent 0241070 commit 1a28a9d

1 file changed: +5 -13 lines changed
libraries/mathy_python/mathy/agents/a3c/config.py

Lines changed: 5 additions & 13 deletions
@@ -2,14 +2,11 @@
 
 
 class A3CConfig(BaseConfig):
-    # Update frequencey for the Worker to sync with the Main model. This has different
-    # meaning for different agents:
+    # Update frequencey for the Worker to sync with the Main model.
     #
-    #  - for A3C agents this value indicates the maximum number of steps to take in an
-    #    episode before syncing the replay buffer and gradients.
-    #  - for R2D2 agents this value indicates the number of episodes to run between
-    #    syncing the latest model from the learner process.
-    update_gradients_every: int = 64
+    # Indicates the maximum number of steps to take in an episode before
+    # syncing the replay buffer and gradients.
+    update_gradients_every: int = 12
 
     normalization_style: str = "layer"
 
@@ -34,14 +31,9 @@ class A3CConfig(BaseConfig):
     # MCTS provides higher quality observations at extra computational cost.
     mcts_sims: int = 200
 
-    # Whether to use the grouping change aux task
-    use_grouping_control = True
-    # Clip signal at 0.0 so it does not optimize into the negatives
-    clip_grouping_control = False
-
     main_worker_use_epsilon = False
     e_greedy_min = 0.01
-    e_greedy_max = 0.1
+    e_greedy_max = 0.3
     # Worker's sleep this long between steps to allow
     # other threads time to process. This is useful for
     # running more threads than you have processors to

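To make the effect of lowering update_gradients_every from 64 to 12 concrete, here is a minimal sketch that counts how many gradient syncs would occur in one episode. It is not the mathy worker loop: SketchConfig, count_updates, and the 40-step episode length are hypothetical. The rule that a final sync fires at episode end for any leftover steps is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class SketchConfig:
    # Hypothetical stand-in for A3CConfig; only the field changed in this commit.
    update_gradients_every: int = 12


def count_updates(episode_steps: int, config: SketchConfig) -> int:
    """Count gradient syncs in one episode of `episode_steps` steps.

    Assumption: a sync fires whenever the number of buffered steps reaches
    `update_gradients_every`, plus one final sync at episode end if any
    steps remain buffered.
    """
    updates = 0
    buffered = 0
    for _ in range(episode_steps):
        buffered += 1
        if buffered >= config.update_gradients_every:
            updates += 1
            buffered = 0
    if buffered > 0:
        updates += 1
    return updates


if __name__ == "__main__":
    # Hypothetical 40-step episode under the old and new settings.
    print(count_updates(40, SketchConfig(update_gradients_every=64)))
    print(count_updates(40, SketchConfig(update_gradients_every=12)))
```

Under these assumptions, a 40-step episode syncs once with the old value of 64 and four times with the new value of 12, which matches the commit's stated goal of multiple updates per episode.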