baseline problem in the vanilla policy gradient example #2

flyers · 2016-04-30T12:23:51Z

Hi everyone,
While running the rllab/examples/vpg_2.py example, I found that the baseline is actually a zero vector. It seems that we need to explicitly call baseline.fit before calling baseline.predict. @dementrock
Thanks.

dementrock · 2016-04-30T13:47:34Z

Ahh yes. Your interpretation is correct & thanks for reporting the issue! I will push a fix to the master branch shortly.
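For anyone reading this later, here is a minimal sketch of the ordering problem described above. `ToyLinearBaseline` is a hypothetical stand-in written only for illustration, not rllab's actual baseline class. It assumes a baseline whose predict returns zeros until fit has been called, which matches the behavior reported in this issue.

```python
import numpy as np


class ToyLinearBaseline:
    """Hypothetical fit/predict baseline for illustration; not rllab's class."""

    def __init__(self):
        self.coeffs = None  # stays None until fit() is called

    def _features(self, observations):
        # Use the raw observations plus a constant bias column as features.
        obs = np.asarray(observations, dtype=float)
        return np.concatenate([obs, np.ones((len(obs), 1))], axis=1)

    def fit(self, observations, returns):
        # Least-squares regression of returns on the features.
        feats = self._features(observations)
        targets = np.asarray(returns, dtype=float)
        self.coeffs, *_ = np.linalg.lstsq(feats, targets, rcond=None)

    def predict(self, observations):
        # Assumed behavior: an unfitted baseline predicts all zeros.
        if self.coeffs is None:
            return np.zeros(len(observations))
        return self._features(observations) @ self.coeffs


# Synthetic data: returns are an exact linear function of the observations.
rng = np.random.default_rng(0)
observations = rng.normal(size=(50, 3))
returns = observations @ np.array([1.0, -2.0, 0.5]) + 3.0

baseline = ToyLinearBaseline()
before_fit = baseline.predict(observations)  # zero vector: fit() not called yet

baseline.fit(observations, returns)          # fit first ...
after_fit = baseline.predict(observations)   # ... then predict
advantages = returns - after_fit
```

With an unfitted baseline, the advantages are just the raw returns, so the baseline gives no variance reduction. Calling fit before predict is what makes the subtraction meaningful.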

flyers closed this as completed Apr 30, 2016

dementrock added a commit that referenced this issue May 1, 2016

Fix #2

782a16a

alexbeloi pushed a commit to alexbeloi/rllab that referenced this issue Jun 23, 2016

Fix rll#2

d4c50eb
