Question about generate_starts function for reverse curriculum #7

tldoan · 2018-09-27T03:55:30Z

Hi,

According to your paper, you apply Brownian motion to generate new seed states (normal with mean 0 and variance 1), but according to this line
https://github.com/florensacc/rllab-curriculum/blob/master/curriculum/envs/start_env.py#L233

It seems that you are applying a random uniform action with range env.action_space.bounds for the AntMaze Environment.

Can you explain why the action is not sampled from N(0, I)?

Thank you very much.

florensacc · 2018-10-01T21:47:40Z

Hi,
you are right, we apply random uniform actions. Given that the action bounds are finite and known, the maximum-entropy policy is uniform over that interval. We also tried applying N(0, I) actions, and the results were essentially the same. It is unfortunate that the published code has a random walk that is not exactly "Brownian". Anyone interested in verifying that the results are the same under Brownian motion can change the line of code you point out. Thanks for spotting this detail!
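
For readers who want to try the comparison, here is a minimal sketch of the two sampling choices discussed in this thread. It is not the code from `start_env.py`: the names `sample_action`, `random_walk`, and `step_fn` are hypothetical, the bounds are passed in as plain arrays rather than read from `env.action_space.bounds`, and clipping the Gaussian sample to the bounds is an assumption, since the thread does not say how out-of-range samples were handled.

```python
import numpy as np


def sample_action(low, high, mode="uniform", rng=None):
    """Sample one action inside the box [low, high].

    mode="uniform":  uniform over the box (the behavior described in the reply).
    mode="gaussian": N(0, I) sample clipped to the box (clipping is an
                     assumption made for this sketch).
    """
    rng = np.random.default_rng() if rng is None else rng
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    if mode == "uniform":
        return rng.uniform(low, high)
    if mode == "gaussian":
        return np.clip(rng.standard_normal(low.shape), low, high)
    raise ValueError(f"unknown mode: {mode!r}")


def random_walk(step_fn, state, low, high, horizon=50, mode="uniform", rng=None):
    """Roll out `horizon` random actions from `state` and return visited states.

    `step_fn(state, action) -> next_state` is a hypothetical stand-in for
    stepping the real environment.
    """
    rng = np.random.default_rng() if rng is None else rng
    states = [state]
    for _ in range(horizon):
        state = step_fn(state, sample_action(low, high, mode, rng))
        states.append(state)
    return states


if __name__ == "__main__":
    # Toy point-mass dynamics, used only to make the sketch runnable.
    toy_step = lambda s, a: s + 0.1 * a
    low, high = [-1.0, -1.0], [1.0, 1.0]
    for mode in ("uniform", "gaussian"):
        visited = random_walk(toy_step, np.zeros(2), low, high, mode=mode)
        print(mode, "final state:", visited[-1])
```

Swapping `mode` between `"uniform"` and `"gaussian"` is the kind of one-line change the reply suggests; whether the resulting start-state distributions match in practice would still need to be checked in the real environment.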

florensacc closed this as completed Oct 1, 2018
