
A question about first_layer_sine_init #43

Open
lingtengqiu opened this issue Apr 27, 2021 · 1 comment

Comments

@lingtengqiu

I am confused about first_layer_sine_init, where you set W ~ Uniform(-1/n, 1/n).

As we know, the input is x ~ Uniform(-1, 1), so Var[x] = 2^2/12 = 1/3. And after the FC layer, shouldn't we have Var[sin(30Wx + b)] = 30^2 * n * (1/3) * (c^2/3) = 1 (writing W ~ Uniform(-c, c), so Var[W] = c^2/3)?
So why do you initialize the first-layer weights with Uniform(-1/n, 1/n)?
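(For reference, a minimal NumPy sketch of the initialization scheme under discussion — first layer W ~ Uniform(-1/n, 1/n), hidden layers W ~ Uniform(-sqrt(6/n)/w0, sqrt(6/n)/w0), as in the SIREN paper; the function names and the NumPy form are illustrative, the real repo applies this in-place to torch.nn.Linear modules:)

```python
import numpy as np

def first_layer_sine_init(n_in, n_out, rng=None):
    """First-layer init: W ~ Uniform(-1/n_in, 1/n_in).

    Sketch of the scheme discussed in this issue (NumPy stand-in
    for the in-place torch initialization in the repo).
    """
    rng = np.random.default_rng() if rng is None else rng
    return rng.uniform(-1.0 / n_in, 1.0 / n_in, size=(n_out, n_in))

def sine_init(n_in, n_out, w0=30.0, rng=None):
    """Hidden-layer init: W ~ Uniform(-sqrt(6/n_in)/w0, sqrt(6/n_in)/w0)."""
    rng = np.random.default_rng() if rng is None else rng
    bound = np.sqrt(6.0 / n_in) / w0
    return rng.uniform(-bound, bound, size=(n_out, n_in))

# For a 1-D input (n_in = 1) the first-layer bound is 1, i.e. W ~ U(-1, 1).
W = first_layer_sine_init(n_in=1, n_out=256, rng=np.random.default_rng(0))
print(W.shape, float(np.abs(W).max()))  # all entries lie inside (-1, 1)
```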

@pswpswpsw

pswpswpsw commented Jun 9, 2021

The complete logic is:

  1. If the input is only x, so 1-D, then the first-layer weights should be W ~ Uniform(-1, 1).
  2. However, if you do that, you will find that after the first layer (after the sine activation) you DON'T get the U-shaped beta distribution.
  3. What happened? The product of two i.i.d. Uniform(-1, 1) variables is too concentrated near zero, so it does not drive the sine far enough to produce the complete U-shaped beta distribution.
  4. The easy remedy is to introduce w_0 = 30, so that the pre-activations vary over many periods and the sine fully activates them.
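A quick Monte-Carlo check of points 2–4 (illustrative numbers, not from the repo: 1-D input x ~ U(-1, 1) and weight W ~ U(-1, 1)). Without w_0 the post-activation values stay concentrated near 0; with w_0 = 30 the pre-activations sweep many periods of the sine and the output piles up near ±1, approaching the arcsine distribution with variance 1/2:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000
x = rng.uniform(-1, 1, N)  # 1-D input samples
W = rng.uniform(-1, 1, N)  # one first-layer weight per sample

no_w0 = np.sin(W * x)         # point 2: |Wx| <= 1, sine is nearly linear, no U-shape
with_w0 = np.sin(30 * W * x)  # point 4: 30*Wx spans many periods

print(no_w0.var())    # close to Var[Wx] = 1/9, since sin(u) ~ u here
print(with_w0.var())  # approaches the arcsine variance 1/2

# U-shape check: with w0 = 30 the histogram edges dominate the middle bins
hist, _ = np.histogram(with_w0, bins=10, range=(-1, 1), density=True)
print(hist[0] > hist[4], hist[-1] > hist[5])
```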

As for your derivation, I don't see how you get "Var[sin(30Wx + b)]". Note that there is a sine in there: normally you cannot compute that variance directly unless you know the output follows an arcsine distribution, and the factor of 30 is exactly what makes it arcsine as well.
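To make the arcsine point concrete: once a pre-activation z spreads over many periods of the sine, sin(z) follows the arcsine distribution on (-1, 1) with density 1/(pi * sqrt(1 - y^2)), whose mean is 0 and variance exactly 1/2. A quick empirical check (the range of z is illustrative, anything spanning many periods works):

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.uniform(-100, 100, 500_000)  # pre-activations spanning ~32 periods
y = np.sin(z)                        # approximately arcsine-distributed on (-1, 1)

# Moments of the arcsine law: E[y] = 0, Var[y] = 1/2
print(y.mean())  # ~0.0
print(y.var())   # ~0.5
```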
