Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate initialiser keyword arguments #671

Closed
MikeInnes opened this issue Mar 8, 2019 · 5 comments · Fixed by #1440
Closed

Deprecate initialiser keyword arguments #671

MikeInnes opened this issue Mar 8, 2019 · 5 comments · Fixed by #1440

Comments

@MikeInnes
Copy link
Member

These don't really work, they aren't flexible enough and are no more convenient that just supplying your own weights e.g.

Dense(randn(2, 5), rand(2))

This is partly a docs issue as people don't realise that this is a supported option. Although like most things, it will of course get more convenient with #669.

@shreyas-kowshik
Copy link
Contributor

shreyas-kowshik commented Mar 9, 2019

I am willing to work on this.
But I need to clarify a few things beforehand :

using Flux
using Distributions:Normal

dist = Normal(0.0,1.0)
function initW(out,in)
    W = rand(dist,out,in)
end

Dense(2,3,sigmoid,initW=initW,initb=zeros)

This piece of code does work right?...As in, we can define our own functions for weight initialization right?...So why is there a need to depreciate it?...
I apologize if I'm being silly here. In case if this is not what is relevant to this issue, then can you please explain what needs to be done exactly?

@MikeInnes
Copy link
Member Author

The original idea was that you could choose initilialisers for a large model with the init kwarg (e.g. ResNet50(init = xorot_uniform). But we ended up using different initialisers on a per-parameters basis (initW and initb) which defeats the purpose, since now you have to deal with every weight separately anyway. On top of that, this extra API must be explicitly added to each new layer, which makes it harder to achieve consistency (most newer layers don't have it at all).

So given those issues and the fact that you can just as easily write this without initW, we can simplify things a fair bit by just removing it.

@shreyas-kowshik
Copy link
Contributor

shreyas-kowshik commented Mar 26, 2019

So all that needs to be done is to add deprecation warnings to Dense(...)...?
Also for the warning, is this message fine :

Dense(in,out,act,initW,initb) is deprecated; use Dense(in,out) instead

OR is this required...? :

Dense(in,out,act,initW,initb) is deprecated; use Dense(in,out,init) instead

Accordingly I shall make changes to the API then.

Can you please clarify it a bit here @MikeInnes.
Thank You.

@johnnychen94
Copy link
Contributor

johnnychen94 commented Mar 30, 2019

struct Dense{F,S,T}
W::S
b::T
σ::F
end
Dense(W, b) = Dense(W, b, identity)

This is my understanding:
Dense(in,out,act,initW,initb) is deprecated; use Dense(in,out,σ) or Dense(W, b, σ) instead

@shreyas-kowshik
Copy link
Contributor

@MikeInnes added the deprecation warning in #722 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants