
How to set initial values for weights? #44

Closed
Hvass-Labs opened this issue Jan 10, 2017 · 3 comments

Comments

@Hvass-Labs

Is it possible to scale the initial random values for the weights in PrettyTensor?

I have a network of a few convolutional layers followed by a few fully-connected layers. They all use relu activations except for the last layer, which has a linear output. I would like the initial output values of the network to be random and close to zero. I think the best way to do that would be to initialize the random weights of the output layer close to zero.

I see that the fully_connected() method takes a weights parameter, but it is not clear to me how to use it.

Could you give an example in this code? Thanks.

with pt.defaults_scope(activation_fn=tf.nn.relu):
    self.q_values = x_pretty.\
        conv2d(kernel=8, depth=16, stride=4, name='layer_conv1').\
        conv2d(kernel=4, depth=32, stride=2, name='layer_conv2').\
        flatten().\
        fully_connected(size=256, name='layer_fc1').\
        fully_connected(size=num_actions, name='layer_fc2', activation_fn=None)
@Hvass-Labs
Author

I have made a temporary fix, but it is really ugly. The following is added to the above code:

# Ugly workaround: multiply the network's output by a small variable
# so the initial Q-values are close to zero.
scaler = tf.Variable(initial_value=0.001)
self.q_values = scaler * self.q_values
self.q_values = pt.wrap(self.q_values)  # wrap back into a PrettyTensor

Surely there must be a Pretty way of scaling the initial weights of the layers?

@eiderman
Contributor

I see that while the documentation mentions that weights can take an initializer function, it doesn't specify that this is a standard TensorFlow initializer (e.g. tf.constant_initializer, tf.random_uniform_initializer, etc.). Basically it is any function with the signature init(shape, dtype=tf.float32, partition_info=None). They are listed here (towards the bottom, not in any particular order): https://www.tensorflow.org/api_docs/python/state_ops/sharing_variables

with pt.defaults_scope(activation_fn=tf.nn.relu):
    self.q_values = x_pretty.\
        conv2d(kernel=8, depth=16, stride=4, name='layer_conv1').\
        conv2d(kernel=4, depth=32, stride=2, name='layer_conv2').\
        flatten().\
        fully_connected(size=256, name='layer_fc1').\
        fully_connected(size=num_actions, name='layer_fc2', activation_fn=None,
                        weights=tf.random_uniform_initializer(-0.001, 0.001))
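
Any callable with that signature will also work. As a sketch (tiny_uniform is a hypothetical helper, not a library function), a hand-rolled initializer could look like:

import tensorflow as tf

def tiny_uniform(shape, dtype=tf.float32, partition_info=None):
    # Hypothetical custom initializer: weights drawn uniformly
    # from [-0.001, 0.001] for the requested variable shape.
    return tf.random_uniform(shape, minval=-0.001, maxval=0.001, dtype=dtype)

You would then pass it as weights=tiny_uniform in the same way as a built-in initializer.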

@Hvass-Labs
Author

Thanks. I think I got a little confused because I first found the PrettyTensor initializers, which require a shape parameter that I obviously cannot provide. But tf.random_normal_initializer() and tf.truncated_normal_initializer() work fine. It might be a good idea to mention this in the docs.
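
The last layer then becomes something like this (stddev chosen arbitrarily small so the initial outputs are near zero):

fully_connected(size=num_actions, name='layer_fc2', activation_fn=None,
                weights=tf.truncated_normal_initializer(stddev=0.001))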
