
Adding a random seed parameter for tensorflow.layer.dense #9744

Closed
rasbt opened this issue May 8, 2017 · 8 comments

@rasbt
Contributor

rasbt commented May 8, 2017

Hi,
I noticed that the tensorflow.layer.dense wrapper does not have a seed parameter, and I was wondering what you think about adding one. E.g., this seed would be used to initialize the random weights and bias units (if they are not initialized to zero).

Related to the points above, it is currently also not clear how the weights are initialized. I.e., what distribution the random numbers are drawn from (if not zero). Does someone have some info on this? -- maybe this could be added to the API docs.

What do you think? (I am happy to contribute a random seed implementation and a doc update if that's desirable.)

@concretevitamin
Contributor

Short answer to the seed request: it is unlikely this will be accepted, since TensorFlow's universal convention has been to use "initializers" instead of a simple seed. For example, see init_ops.py.

That said, we should perhaps document the default initializers in the dense() docstring. @fchollet - mind taking a look?

@concretevitamin concretevitamin added the stat:awaiting tensorflower (Status - Awaiting response from tensorflower) and type:feature (Feature requests) labels May 9, 2017
@fchollet
Member

fchollet commented May 10, 2017

The way to get deterministic weight initializations in dense is to pass a kernel_initializer argument set to an initializer (instance of a class from init_ops) that has a fixed seed.

If kernel_initializer is not set, we use the global default weight initializer, which is glorot_uniform. But I think it would be good to have it be explicitly present in the dense signature instead of relying on an implicit global default.
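
For example, a minimal sketch of this approach using the TF 1.x-era API discussed in this thread (the exact import path is taken from the snippet below and may differ between versions):

import tensorflow as tf
from tensorflow.python.ops.init_ops import glorot_uniform_initializer

x = tf.placeholder(tf.float32, shape=(None, 10))

# Fixing the seed on the kernel initializer makes the randomly drawn weights
# reproducible; the bias defaults to zeros_initializer(), so it needs no seed.
out = tf.layers.dense(
    x, units=5,
    kernel_initializer=glorot_uniform_initializer(seed=123))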

@rasbt
Contributor Author

rasbt commented May 10, 2017

Thanks for the clarification. So the following would be the current default behavior based on the global default weight initializer?

import tensorflow
from tensorflow.python.ops.init_ops import glorot_uniform_initializer

tensorflow.layers.dense(..., kernel_initializer=glorot_uniform_initializer())

I am happy to put together a PR to add a note to the docstrings of Dense and dense. If I understand correctly, you are suggesting to use glorot_uniform_initializer directly in Dense, i.e., via something like

if kernel_initializer is None:
    self.kernel_initializer = glorot_uniform_initializer()
else:
    self.kernel_initializer = kernel_initializer

Not sure what the current TensorFlow convention is; would it be okay to do these checks & assignments in the class __init__s directly?

@fchollet
Member

This code snippet in __init__ would change neither the behavior of the layer nor its docstring and signature. For the time being I would simply mention the default initializer in the docstring.

@rasbt
Contributor Author

rasbt commented May 11, 2017

Okay, will open a PR tomorrow.

This code snippet in __init__ would change neither the behavior of the layer nor its docstring

I saw that in Dense's __init__ and the dense function, the bias initializer is set manually as well:

def dense(
    inputs, units,
    activation=None,
    use_bias=True,
    kernel_initializer=None,
    bias_initializer=init_ops.zeros_initializer(),
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    trainable=True,
    name=None,
    reuse=None):

I am curious why setting kernel_initializer=init_ops.glorot_uniform_initializer() there wouldn't have an effect? I couldn't find anything in Dense and dense that would override this setting ...

@fchollet
Member

I am curious why setting kernel_initializer=init_ops.glorot_uniform_initializer() there wouldn't have an effect?

We should avoid instantiating anything in a class constructor signature, because a default argument is evaluated only once, so the resulting instance becomes global to all instances of the class. There are no immediate side effects in this case, but it is bad practice and potentially dangerous for the user. For the same reason bias_initializer=init_ops.zeros_initializer() should be avoided, but it is there for legacy reasons.

We should instantiate the initializers in the __init__ of the class (so as to create new objects for every layer instance rather than a shared object everywhere). Keras achieves this by using default initializer argument values that are serializable, i.e. simply the names of the initializers, which are then instantiated in __init__.
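
A minimal, self-contained sketch of the difference in plain Python (not the actual TensorFlow/Keras code; the class and default names below are made up for illustration):

class ZerosInitializer(object):
    pass

class SharedDefault(object):
    # The default value is evaluated once, at function definition time,
    # so every instance ends up holding the same initializer object.
    def __init__(self, bias_initializer=ZerosInitializer()):
        self.bias_initializer = bias_initializer

class PerInstanceDefault(object):
    # Keras-style: the default is a serializable name; the object itself
    # is created inside __init__, so each layer instance gets its own.
    def __init__(self, bias_initializer='zeros'):
        if bias_initializer == 'zeros':
            self.bias_initializer = ZerosInitializer()
        else:
            self.bias_initializer = bias_initializer

a, b = SharedDefault(), SharedDefault()
print(a.bias_initializer is b.bias_initializer)   # True: one shared object

c, d = PerInstanceDefault(), PerInstanceDefault()
print(c.bias_initializer is d.bias_initializer)   # False: a fresh object per instance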

@rasbt
Contributor Author

rasbt commented May 11, 2017

Ah thanks, makes sense now!

This code snippet in __init__ would change neither the behavior of the layer nor its docstring and signature. For the time being I would simply mention the default initializer in the docstring.

Yeah, I think I misinterpreted this earlier. I thought you meant that setting something in __init__ wouldn't have an effect, but if I understand correctly now, you meant that glorot_uniform is already used, so setting it explicitly in __init__ wouldn't change anything compared to the current behavior. The only reason I thought it might be useful is that it may not be clear to someone looking at the code what the default weight initializer is, but leaving it as is and documenting it in the docstring should be fine.

@girving
Contributor

girving commented Jun 16, 2017

Closing since specifying the initializer is the right solution.

@girving girving closed this as completed Jun 16, 2017