The purpose of the logscale_factor=3. in the actnorm function #55

Closed
kmkolasinski opened this issue Sep 17, 2018 · 3 comments

@kmkolasinski

Hello, I would like to ask what is the purpose of the logscale_factor in the actnorm function here?
I couldn't find any reference in the paper that explains the reason for this variance modification. As far as I understand this implementation, we recover the paper's description by setting logscale_factor=1. It is also clear to me that it only affects the initialization step, but I would be interested to know whether this is some kind of trick that helped you, or something else. Thanks for any feedback.

@kmkolasinski
Author

Closing this issue, since I have realized that logscale_factor actually reduces out and does nothing in this code. Sorry for bothering you with this stupid question.
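
One possible reading of "reduces out" is sketched below. It rests on an assumption that is not confirmed in this thread: that the stored variable is initialized to the target log-scale divided by logscale_factor, and is multiplied by logscale_factor again when it is used. The helper names `init_logs` and `effective_logs` and the target value are hypothetical and do not come from the repository.

```python
import math


def init_logs(target_logs, logscale_factor):
    # Hypothetical initializer: store the target log-scale divided by the factor.
    return target_logs / logscale_factor


def effective_logs(stored, logscale_factor):
    # Hypothetical use site: multiply the stored variable by the factor again.
    return stored * logscale_factor


target = math.log(0.5)  # illustrative target log-scale, chosen arbitrarily
values = [effective_logs(init_logs(target, f), f) for f in (1.0, 3.0)]
print(values)
```

Under this assumption the effective value at initialization is the same for any factor, up to floating-point rounding.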

@shehryar-malik

Why are you saying that logscale_factor reduces out? Do you mean that the neural network will learn to adjust to this factor, i.e., instead of outputting `logs` it will output a scaled-down version of it?

@guzy0324

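The logscale_factor can accelerate the update of the parameter. For example, let $\theta$ be the parameter when no logscale_factor is used, let $\beta$ be the trainable variable that gets multiplied by logscale_factor, and let logscale_factor equal 3. With learning rate $\alpha$ and plain gradient descent, a step on $\beta$ moves $3\beta$ nine times as far as the same step would move $\theta$: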

$$
\begin{align*}
\theta &= 3\beta \\
\frac{\partial\mathcal{L}(3\beta)}{\partial\beta} &= 3\frac{\partial\mathcal{L}(3\beta)}{\partial(3\beta)} = 3\frac{\partial\mathcal{L}(\theta)}{\partial\theta} \\
\theta' &= \theta - \alpha\frac{\partial\mathcal{L}(\theta)}{\partial\theta} \\
\beta' &= \beta - 3\alpha\frac{\partial\mathcal{L}(\theta)}{\partial\theta} \\
3\beta' &= 3\beta - 9\alpha\frac{\partial\mathcal{L}(\theta)}{\partial\theta}
\end{align*}
$$
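
The derivation above can be checked numerically with a minimal sketch. It assumes plain gradient descent and a toy quadratic loss. The function names, the loss, and the starting values are illustrative and are not taken from the repository.

```python
def grad_loss(theta):
    # Gradient of the toy loss 0.5 * (theta - 1)^2 (an assumption for illustration).
    return theta - 1.0


def sgd_step_theta(theta, lr):
    # Direct parameterization: gradient descent updates theta itself.
    return theta - lr * grad_loss(theta)


def sgd_step_beta(beta, lr, factor=3.0):
    # Scaled parameterization: theta = factor * beta, so by the chain rule
    # the gradient with respect to beta is factor * dL/dtheta.
    theta = factor * beta
    return beta - lr * factor * grad_loss(theta)


lr, factor, theta0 = 0.01, 3.0, 4.0

# Change in theta after one step of each parameterization, from the same start.
delta_direct = sgd_step_theta(theta0, lr) - theta0
delta_scaled = factor * sgd_step_beta(theta0 / factor, lr, factor) - theta0

ratio = delta_scaled / delta_direct  # approximately factor ** 2
print(ratio)
```

The ratio of the two steps comes out at roughly the square of the factor. This is specific to plain gradient descent as used in the derivation.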
