
# TODO: historical accident #6

Closed
camigord opened this issue Jul 11, 2017 · 2 comments
@camigord

Hi @pathak22 ,

first of all thanks for releasing the code!

I have been taking a look at it and I have a question concerning the few lines of code where you wrote:
`# TODO: historical accident ...`

Why are you multiplying the loss by 20 and 288 in those lines?
```python
grads = tf.gradients(self.loss * 20.0, pi.var_list)
self.forwardloss = self.forwardloss * 288.0
```

I understand this is related to the batch size (or rollout steps) and to the number of features representing a state, but I cannot really see the point of multiplying by these particular values. Could you please give me a hint?

Thanks,


pathak22 commented Jul 11, 2017

Hi @camigord,

Yes. This value is nothing special. The universe starter agent originally used `tf.reduce_sum` instead of `tf.reduce_mean` to compute the loss, and the hyperparameters in my A3C+ICM code were tuned to that setup. However, summing across the batch and channel dimensions is bad practice: as soon as you change the environment, the batch size, or the network architecture, the other hyperparameters stop making sense. Hence, I replaced `tf.reduce_sum` with `tf.reduce_mean` and pulled the constant factor out (e.g. 288, 20, etc.). This makes the code generalizable across different network architectures, input sizes and environments.
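To make the equivalence concrete, here is a minimal numpy sketch (not the repository's code): the array of hypothetical per-element errors has 288 entries, standing in for the number of feature dimensions the original `tf.reduce_sum` summed over, so multiplying the mean by 288 recovers the old sum and the tuned hyperparameters keep their meaning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-element forward-model errors; 288 stands in for the
# number of feature dimensions the original tf.reduce_sum summed over.
errors = rng.random(288)

sum_loss = errors.sum()    # old universe-starter-agent style: scale grows with 288
mean_loss = errors.mean()  # scale-independent replacement

# Pulling the constant factor out of the mean recovers the original sum:
assert np.isclose(mean_loss * 288.0, sum_loss)
```

The same reasoning applies to the factor of 20.0 on `self.loss`: it compensates for switching the policy loss from a sum to a mean, so the effective gradient scale matches the one the hyperparameters were tuned against.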

Moreover, to help users understand the code better, I deliberately added the comment `# TODO: historical accident ...` wherever the constants were factored out. Hope this answers your question.

@pathak22

Alright, I have updated the comments in the code to make them more informative, so I am closing this issue now.
