
Transformed Variable not trainable in Keras model #946

keyonvafa opened this issue May 23, 2020 · 6 comments
@keyonvafa (Contributor) commented May 23, 2020

Hi,
I am trying to train a positive variable using tfp.util.TransformedVariable as an attribute of a tf.keras.Model object. However, the model does not recognize it as a trainable variable, and it does not receive gradients. This behavior holds for tensorflow_probability==0.10.0 and tensorflow==2.2.0, as well as for the nightly builds of both.

Here is a colab notebook illustrating this behavior: https://colab.research.google.com/drive/1XGCcm8l0OGRiy35lr3XcHAyZMuBNpsIB?usp=sharing

In this example, we are trying to train both an unconstrained variable (loc) and a constrained variable (scale). Only the loc variable updates.

import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

class Model(tf.keras.Model):
  def __init__(self):
    super(Model, self).__init__()
    self.loc = tf.Variable(tf.ones(shape=[5]), name="loc")
    self.scale = tfp.util.TransformedVariable(
        tf.ones([5]),
        bijector=tfp.bijectors.Softplus(),
        name="scale") 
    self.distribution = tfp.distributions.Normal(loc=self.loc, scale=self.scale)

  def call(self, inputs):
    samples = self.distribution.sample()
    assigned_means = tf.gather(samples, inputs)
    return tfp.distributions.Normal(loc=assigned_means, scale=1.)

model = Model()
print(model.trainable_weights)  # only 'loc' shows up

optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)
loss = lambda x, rv: -tf.reduce_sum(rv.log_prob(x))
inputs = np.array([0, 1, 2, 3, 4]).astype(np.int32)
outputs = np.array([0., 1., 2., 3., 4.]).astype(np.float32)
dataset = tf.data.Dataset.from_tensor_slices((inputs, outputs))
dataset = dataset.batch(5)
model.compile(optimizer=optimizer, loss=loss)
model.fit(dataset, epochs=100, verbose=0)

# Check if the location parameters have moved from their original values.
assert(not (np.isclose(model.loc.numpy(), np.ones(5))).all())  # Passes

# Check if the scale parameters have moved from their original values.
assert(not (np.isclose(model.scale.numpy(), np.ones(5))).all())  # Fails

Thanks!

@brianwa84 (Contributor) commented May 23, 2020 via email

@keyonvafa (Contributor, Author) commented
Thank you. That works for me. Hopefully it will eventually be possible for Keras to recognize variables inside a tf.Module.
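
(For context, here is a minimal sketch of this kind of attribute-assignment workaround, i.e. the "foo trick" mentioned later in this thread: the TransformedVariable's underlying tf.Variables are assigned to an extra attribute so Keras' automatic tracking picks them up. The attribute name _tracked_scale_vars is purely illustrative, and this may not be exactly what was suggested over email.)

import tensorflow as tf
import tensorflow_probability as tfp

class Model(tf.keras.Model):
  def __init__(self):
    super().__init__()
    self.loc = tf.Variable(tf.ones(shape=[5]), name="loc")
    self.scale = tfp.util.TransformedVariable(
        tf.ones([5]),
        bijector=tfp.bijectors.Softplus(),
        name="scale")
    # Hypothetical workaround: expose the TransformedVariable's underlying
    # tf.Variable(s) on an attribute so Keras tracks them directly.
    self._tracked_scale_vars = list(self.scale.trainable_variables)
    self.distribution = tfp.distributions.Normal(loc=self.loc, scale=self.scale)

With a change like this, model.trainable_weights should also include the pre-transformed scale variable, at the cost of it appearing under a somewhat arbitrary attribute.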

@krzysztofrusek commented
Hello,

I have the same problem with tfp.experimental.nn.util.RandomVariable (tf.__version__, tfp.__version__ == ('2.3.0', '0.11.0')).

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions
tfn = tfp.experimental.nn

def random_variable_scope(next_creator, **kwargs):
  # Wrap every variable created inside the scope in a RandomVariable whose
  # Normal distribution's loc is an ordinary tf.Variable.
  iv = kwargs['initial_value']
  if callable(iv):
    return tfn.util.RandomVariable(tfd.Normal(tf.Variable(iv()), 1.))
  return next_creator(**kwargs)


with tf.variable_creator_scope(random_variable_scope):
  d = tf.keras.layers.Dense(2)
  d(tf.zeros([3, 4]))


[type(v) for v in d.variables]

gives

[tensorflow_probability.python.experimental.nn.util.random_variable.RandomVariable,
 tensorflow_probability.python.experimental.nn.util.random_variable.RandomVariable]

The trick of assigning the underlying variables to an extra attribute (foo) kind of works, but it pollutes the variables list:

d.foo = [ v.variables for v in d.variables]
[type(v) for v in d.variables]
[tensorflow_probability.python.experimental.nn.util.random_variable.RandomVariable,
 tensorflow_probability.python.experimental.nn.util.random_variable.RandomVariable,
 tensorflow.python.ops.resource_variable_ops.ResourceVariable,
 tensorflow.python.ops.resource_variable_ops.ResourceVariable]

@st-- (Contributor) commented Feb 19, 2021

@keyonvafa this is a core TensorFlow issue (see tensorflow/tensorflow#47264). You can work around it with the TrackableLayer code I posted in tensorflow/tensorflow#47264 (comment); you only need to change the first few lines of your Model class definition to

class Model(tf.keras.Model, TrackableLayer):
  def __init__(self):
    super().__init__()  # this is the recommended style in Python3 anyways

With your original Model definition, model.trainable_variables is missing the scale Variable:

[<tf.Variable 'loc:0' shape=(5,) dtype=float32, numpy=array([1., 1., 1., 1., 1.], dtype=float32)>]

When inheriting from TrackableLayer, model.trainable_variables now returns

[<tf.Variable 'loc:0' shape=(5,) dtype=float32, numpy=array([1., 1., 1., 1., 1.], dtype=float32)>,
 <tf.Variable 'scale:0' shape=(5,) dtype=float32, numpy=
 array([0.54132485, 0.54132485, 0.54132485, 0.54132485, 0.54132485],
       dtype=float32)>]

as expected.

@krzysztofrusek you can apply a similar workaround to your issue by using a TrackableDense instead, defined by

class TrackableDense(tf.keras.layers.Dense, TrackableLayer):
    pass

though I'm not sure it resolves your duplicated-variable issue. 🤔
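
A minimal usage sketch, assuming the random_variable_scope creator from the earlier comment and the TrackableLayer mixin from the linked comment are both in scope:

class TrackableDense(tf.keras.layers.Dense, TrackableLayer):
    pass

with tf.variable_creator_scope(random_variable_scope):
  d = TrackableDense(2)
  d(tf.zeros([3, 4]))

# Inspect which variables Keras now tracks (the RandomVariable wrappers
# and/or their underlying tf.Variables).
print([type(v) for v in d.variables])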

@keyonvafa (Contributor, Author) commented

Thank you @st-- ! I'll give that a try.

@st-- (Contributor) commented Feb 23, 2021

Also, it looks like this issue will finally be fixed by TensorFlow 2.5: tensorflow/tensorflow#47264 (comment)
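
If that lands, the original repro should work unchanged on TF >= 2.5; a quick check (assuming the Model class from the first comment):

model = Model()
# Once the tf.Module tracking fix is in place, both 'loc' and the
# pre-transformed 'scale' variable should show up here.
print(model.trainable_weights)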
