
Questions about multiwalker #314

Closed
jkterry1 opened this issue Sep 14, 2021 · 7 comments
Labels: question (Further information is requested)

Comments

jkterry1 commented Sep 14, 2021

Hey, a quick question: how many timesteps did you train multiwalker for with MAD4PG a few months ago, when you were able to learn it so effectively that the environment broke and you created an issue with us?

jkterry1 (Author):
The code you referenced is here, but unless I'm losing my mind, I'm not seeing where you defined timesteps in it: https://github.com/instadeepai/Mava/blob/develop/examples/petting_zoo/sisl/multiwalker/feedforward/decentralised/run_mad4pg.py

Also this is the GitHub issue I was referring to by the way: Farama-Foundation/PettingZoo#376

@arnupretorius arnupretorius added the question Further information is requested label Sep 15, 2021
KaleabTessera (Contributor):

Hi @jkterry1 👋

It took ~1,000,000 (1e6) executor steps (logged as `evaluator/ExecutorSteps`), roughly 1 hour, to reach a return (logged as `evaluator/RawEpisodeReturn` or `evaluator/MeanEpisodeReturn`) of ~40. I think the previous PettingZoo multiwalker env broke around then.

We used this config:

| Config Param | Value |
| --- | --- |
| batch_size | 1024 |
| critic_optimizer | Adam |
| critic_optimizer_lr | 0.0001 |
| discount | 0.99 |
| executor_variable_update_period | 1000 |
| max_gradient_norm | None |
| policy_optimizer | Adam |
| policy_optimizer_lr | 0.0001 |
| shared_weights | True |
| sigma | 0.3 |
| target_averaging | False |
| target_update_period | 100 |
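For reference, the hyperparameters above can be collected into a plain Python dict. This is just a sketch of the values in the table; the actual Mava system builder may use different argument names:

```python
# Hyperparameters reported for the MAD4PG multiwalker run, as a plain dict.
# Key names mirror the table above; Mava's actual constructor may differ.
mad4pg_config = {
    "batch_size": 1024,
    "critic_optimizer": "Adam",
    "critic_optimizer_lr": 1e-4,
    "discount": 0.99,
    "executor_variable_update_period": 1000,
    "max_gradient_norm": None,   # no gradient clipping
    "policy_optimizer": "Adam",
    "policy_optimizer_lr": 1e-4,
    "shared_weights": True,      # agents share network weights
    "sigma": 0.3,                # exploration noise scale
    "target_averaging": False,
    "target_update_period": 100,
}
```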

Relating to timesteps, each system has a `max_executor_steps` parameter:

```python
max_executor_steps: int = None,
```

which limits how many steps/timesteps our executors (more on executors here) run for. If this is `None`, we let the experiment run until it is manually cancelled. Not sure if this answers your question?
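As an illustration (not Mava's actual implementation), a `max_executor_steps` cutoff typically behaves like this hypothetical loop, where `env_step` is a stand-in for one executor/environment step:

```python
# Illustrative sketch of a max_executor_steps cutoff: run until the step
# budget is exhausted, or indefinitely when the budget is None. The
# run_executor/env_step names are made up for this example.
def run_executor(env_step, max_executor_steps=None):
    """env_step() advances the environment once and returns True while
    the experiment should keep going (e.g. until manually cancelled)."""
    steps = 0
    while max_executor_steps is None or steps < max_executor_steps:
        if not env_step():
            break  # experiment was cancelled externally
        steps += 1
    return steps
```

With a budget of 5, the loop stops after exactly 5 steps even if `env_step` would happily keep going.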

jkterry1 commented Sep 16, 2021

That answers most of it, thank you. One follow-up question: is 1 million executor steps a step for each individual agent in the env, or a step for all agents at once?

KaleabTessera (Contributor):

@jkterry1 It is a step for all agents at once (parallel) since our executors are a collection of agents.
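To make that counting convention concrete, here is a toy illustration (not Mava or PettingZoo code; `ToyParallelEnv` and the agent names are made up): one call to `step()` with a joint action dict for all agents counts as a single executor step, no matter how many agents act.

```python
# Toy illustration of "a step for all agents at once": one step() call with
# a joint action dict counts as ONE executor step, regardless of agent count.
class ToyParallelEnv:
    def __init__(self, agents):
        self.agents = list(agents)
        self.executor_steps = 0

    def step(self, actions):
        # actions maps every agent to its action; all agents act in parallel
        assert set(actions) == set(self.agents)
        self.executor_steps += 1  # one joint step, not one per agent
        return {a: 0.0 for a in self.agents}  # dummy per-agent rewards

env = ToyParallelEnv(["walker_0", "walker_1", "walker_2"])
for _ in range(10):
    env.step({a: None for a in env.agents})
# 10 joint steps with 3 agents -> executor_steps is 10, not 30
```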

jkterry1 (Author):

Thanks a ton!

jkterry1 (Author):

One additional question- what preprocessing did you do and why?

jkterry1 commented Oct 1, 2021

And do you know how much average total reward you were getting when multiwalker broke?
