Update parameter to be savable #1518
Conversation
Hey @awav, how would you like to proceed with this?
Codecov Report

    @@           Coverage Diff            @@
    ##           develop    #1518   +/-   ##
    ==========================================
      Coverage    95.31%   95.32%
    ==========================================
      Files           85       85
      Lines         3823     3766      -57
    ==========================================
    - Hits          3644     3590      -54
    + Misses         179      176       -3

Continue to review full report at Codecov.
These types can be removed if they're not in use any more: https://github.com/GPflow/GPflow/pull/1518/files#diff-e2281e00a6a807545fc8c393630c5c36L17-L19 (ok, the link doesn't work...)
Really nice work, thanks. The only thing missing is a proper unit test showing that a model is actually savable.
    Parameter._OverloadAllOperators()
    tf.register_tensor_conversion_function(Parameter, lambda x, *args, **kwds: x.read_value())
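For readers unfamiliar with this TF hook: `tf.register_tensor_conversion_function` teaches TensorFlow ops how to turn a custom wrapper into a tensor (the other half, Python operator overloading, is what `_OverloadAllOperators` provides). A minimal, self-contained sketch with a hypothetical `Boxed` wrapper standing in for `Parameter`:

```python
import tensorflow as tf

class Boxed:
    """Toy wrapper around a tf.Variable (hypothetical; stands in for Parameter)."""
    def __init__(self, value):
        self.variable = tf.Variable(value)

# Registering a conversion function lets TF ops accept Boxed instances directly,
# mirroring the Parameter registration in the diff above.
tf.register_tensor_conversion_function(
    Boxed,
    lambda value, dtype=None, name=None, as_ref=False: tf.convert_to_tensor(
        value.variable, dtype=dtype
    ),
)

b = Boxed([1.0, 2.0])
result = tf.add(b, 1.0)  # tf.add converts Boxed via the registered function
print(result.numpy())    # [2. 3.]
```

Note this only covers conversion: expressions like `b + 1.0` would additionally need the operator overloads that `_OverloadAllOperators` installs.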
    def _cast_to_dtype(
Didn't TF 2.2 resolve the issue with cast? Can we simply use tf.cast now?
We still support TF 2.1.
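For context, the cast issue referenced here is presumably the float32/float64 precision problem exercised by `test_cast_to_dtype_precision_issue` further down: a plain `tf.cast` upcast keeps the float32 rounding error. A quick demonstration:

```python
import numpy as np
import tensorflow as tf

v32 = tf.constant(0.2, dtype=tf.float32)   # 0.2 is not exactly representable in float32
upcast = tf.cast(v32, tf.float64).numpy()  # the upcast carries the float32 rounding error

print(upcast == np.float64(0.2))  # False
print(upcast)                     # 0.20000000298023224
```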
-    unconstrained_value = self.validate_unconstrained_value(value, self.dtype)
-    return self._unconstrained.assign(
+    unconstrained_value = _validate_unconstrained_value(value, self.transform, self.dtype)
+    return self.unconstrained_variable.assign(
The super class already has this implemented. Making use of it makes it clearer what is happening:
- validate that the unconstrained value exists
- assign it
-    return self.unconstrained_variable.assign(
+    return super().assign(value, use_locking, name, read_value)
I think your comment is right but the suggestion is wrong (I guess you know that though). If you're going to do that, you can simply remove the method; it'll inherit it from the parent.
I suppose what you're suggesting is
    unconstrained_value = super().assign(value, use_locking, name, read_value)
    message = (
        "gpflow.Parameter: the value to be assigned is incompatible with this parameter's "
        "transform (the corresponding unconstrained value has NaN or Inf) and hence cannot be "
        "assigned."
    )
    return tf.debugging.assert_all_finite(unconstrained_value, message=message)
Which makes sense to me.
In that case, the transformation would be performed twice. We redefine the method so that we can do extra steps before assigning the value without sacrificing performance.
Not sure I understand what you mean. The snippet @sam-willis posted seems correct to me?
It will only be performed twice if you call _validate_unconstrained_value, which is why I suggested just using tf.debugging.assert_all_finite. You could make the message a constant if you don't want code duplication.
I think you can remove _validate_unconstrained_value, _unconstrained_value and _constrained_value, and just check that after either assignment or initialisation the unconstrained value is finite.
I can't apply Sam's suggestion because it changes the value first and then checks correctness, whereas it should be the other way round.
Also, I don't understand what you mean by "I think you can remove _validate_unconstrained_value, _unconstrained_value and _constrained_value".
Ah, I do see your point; maybe it's better not to change the value if it's invalid. Then again, an exception is thrown either way, so I'm not sure how important it is. I guess if you catch the exception, your current code makes recovering easier.
_constrained_value is unused, and technically you should always have the identity for the transform now, so you don't need the if in _unconstrained_value, but I think those are very minor.
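To make the order being debated concrete (validate the unconstrained value before mutating the variable, instead of assigning and then asserting), here is a minimal sketch. The positivity transform is hand-rolled and `assign_positive` is a hypothetical helper, not GPflow's actual API:

```python
import tensorflow as tf

def inverse_softplus(y):
    # softplus(x) = log(1 + exp(x)); its inverse is log(expm1(y)), which is NaN for y <= 0
    return tf.math.log(tf.math.expm1(y))

def assign_positive(variable, value):
    """Validate first, then assign: the variable is untouched if `value` is invalid."""
    unconstrained = inverse_softplus(tf.convert_to_tensor(value, variable.dtype))
    unconstrained = tf.debugging.assert_all_finite(
        unconstrained, message="value incompatible with the positivity transform"
    )
    return variable.assign(unconstrained)

v = tf.Variable(inverse_softplus(tf.constant(1.0)))
assign_positive(v, 2.0)             # ok: v now holds the unconstrained image of 2.0
try:
    assign_positive(v, -1.0)        # inverse of a negative value is NaN, so this raises...
except tf.errors.InvalidArgumentError:
    print(tf.math.softplus(v))      # ...and v still maps back to (approximately) 2.0
```

With Sam's assign-then-assert variant, `v` would already contain the NaN by the time the assertion fires, which is the recoverability point made above.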
     @property
     def unconstrained_variable(self) -> tf.Variable:
-        return self._unconstrained
+        return self._pretransformed_input
     @property
     def transform(self) -> Optional[Transform]:
I understand this is mainly because of legacy, but why are we still using the term transform? We could make the interface cleaner if we followed TFP's naming and used the word bijector instead. Having two words for the same object makes the code harder to read and understand. This PR seems like the right place and time to update this?
I don't like the Bijector name, because not every bijector that TFP has is actually a bijection. The Transform name has been used since the start of GPflow, and I think keeping it reflects the meaning better. Another reason for transform: it keeps the commit diff minimal.
@@ -190,17 +190,22 @@ def leaf_components(input: tf.Module):
 def _merge_leaf_components(
     input: Dict[str, Union[tf.Variable, tf.Tensor, Parameter]]
 ) -> Dict[str, Union[tf.Variable, tf.Tensor, Parameter]]:
-    input_values = set([value.experimental_ref() for value in input.values()])
+    ref_fn = lambda x: (x if isinstance(x, Parameter) else x.ref())
Why not add a ref / deref method to Parameter? Seems like this is just missing from TFP's TransformedVariable API.

    def ref(self):
        return self.unconstrained.ref()
The parameter is not a tensor object and not a trait of the tensor class; it is simply a wrapper around a tensor. You cannot use the same reference for two different objects. That was one of the reasons why saving didn't work.
Would it not be cleaner to have

    def ref(self):
        return self

in order to get rid of ref_fn and deref_fn? It seems to me that this referencing functionality could be handy elsewhere as well at some point.
ref() belongs to the tensor API, and ref() returns a different type; there must be a distinction between the two for clarity. I would prefer not to introduce this method on the parameter class, also because of a printing issue.
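For background on the `ref()` API in question: `tf.Variable` is unhashable in TF2, so putting variables in a set or dict requires the hashable reference object that `.ref()` returns (and `.deref()` undoes). A small illustration, independent of GPflow:

```python
import tensorflow as tf

a = tf.Variable(1.0)
b = a                 # alias of the same variable
c = tf.Variable(1.0)  # distinct variable with an equal value

# {a, b, c} would raise TypeError: tf.Variable is unhashable in TF2.
unique = {v.ref() for v in [a, b, c]}
print(len(unique))  # 2: the alias is deduplicated, the equal-valued variable is not

r = next(iter(unique))
print(r.deref())    # .deref() recovers the original tf.Variable
```

This is why `_merge_leaf_components` needs a `ref_fn` that special-cases `Parameter`: a Parameter is not a tensor, so it has no `.ref()` of its own.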
I was going to add more comments, but I see @vdutor has already addressed them all.
I think it would be nice to have a test for parameter saving. Otherwise LGTM
@vdutor, @sam-willis I added a test.
@@ -55,3 +56,24 @@ def test_cast_to_dtype_precision_issue():
     assert actual_value.dtype == np.float64
     expected_value = np.float64(0.2)
     assert actual_value == expected_value

 def test_parameter_saved():
Can you also check that the number of trainable parameters is the same in the original and the reloaded model? I've had issues with that before.
The exec returns the same result. If it were a different parameter, wouldn't the result be different? Or do you mean something else?
If you run model.variables on both models, do you get the same number of parameters back? There used to be an issue where certain variables / parameters were dropped even though the execution of exec gave the same result.
As far as I understand, tf.saved_model saves models for deployment purposes only (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/saved_model/README.md#builder). m1 can be a different object that wraps the graph computation and doesn't support the interface of tf.Module.
Keras uses the tf.saved_model format in a different way. Standard tf.saved_model.load does not return the same kind of object that you passed in at saving time:

    > m1
    <tensorflow.python.saved_model.load.Loader._recreate_base_user_object.<locals>._UserObject object at 0x13f553fd0>
    > m1.variables
    AttributeError: '_UserObject' object has no attribute 'variables'
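The two observations above can be reproduced with a minimal `tf.Module` round-trip. This is a sketch, not the test added in this PR; `Scaler` is a made-up example class:

```python
import tempfile
import tensorflow as tf

class Scaler(tf.Module):
    """Made-up module: multiplies its input by a trainable scale."""
    def __init__(self):
        super().__init__()
        self.w = tf.Variable(3.0)

    @tf.function(input_signature=[tf.TensorSpec([], tf.float32)])
    def __call__(self, x):
        return self.w * x

m0 = Scaler()
path = tempfile.mkdtemp()
tf.saved_model.save(m0, path)
m1 = tf.saved_model.load(path)

print(m1(tf.constant(2.0)).numpy())  # 6.0: the traced computation round-trips
print(m1.w.numpy())                  # 3.0: directly tracked variables are restored
print(isinstance(m1, Scaler))        # False: m1 is a generic loaded object, not a Scaler
```

So comparing execution results and comparing the restored variables are genuinely different checks, which is why counting variables on both sides (where the loaded object exposes them) was suggested.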
* Update pull request template (#1510): clarify template to make it easier for contributors to fill in relevant information.
* Temporary workaround for tensorflow_probability dependency issue (#1522): pin cloudpickle==1.3.0 as temporary workaround for tensorflow/probability#991 to unblock our build (to be reverted once fixed upstream).
* Update readme with new project using GPflow (#1530)
* Fix bug in varying_noise notebook (#1526)
* Fix formatting in docs (intro.md) and restore link removed by #1498 (#1520)
* Pin tensorflow<2.3 tensorflow-probability<0.11 (#1537)
* Quadrature Refactoring (#1505): WIP quadrature refactoring; removing old ndiagquad code; deleted test code; formatting and type-hints; merged modules; fixed failing tests; adapted tests for new syntax, keeping numerical behavior; changed code for compiled tf compatibility; renamings; reshape kwargs; quadrature along axis=-2 with simplified broadcasting; docs, docstrings and typing; helper function; added new and old quadrature equivalence tests; fixed reshape_Z_dZ return docstring; quad_old computed with ndiagquad_old; more readable implementation; tf.ensure_shape added; removed ndiagquad (reverted in commits 7bb0e9f and 8e23524); fixed shape checking of dZ. (Co-authored-by: Vincent Dutordoir <dutordoirv@gmail.com>, Gustavo Carvalho <gustavo.carvalho@delfosim.com>, ST John <st@prowler.io>)
* Add base_conditional_with_lm function (#1528): accepts Lm instead of Kmm. (Co-authored-by: Neil Ferguson <neil@prowler.io>, Vincent Dutordoir <dutordoirv@gmail.com>, st-- <st--@users.noreply.github.com>)
* Fixed separate_independent_conditional to correctly handle q_sqrt=None (#1533). (Co-authored-by: Aidan Scannell <scannell.aidan@gmail.com>, st-- <st--@users.noreply.github.com>)
* Bump version numbers to 2.1.0 (#1544)
* Re-introduce pytest-xdist (#1541): enables pytest-xdist for locally running tests (`make test`) on multiple cores in parallel.
* Check dependency versions are valid on CI (#1536)
* Update to not use custom image (#1545): add test requirements.
* Update parameter to be savable (#1518)
* Fix for quadrature failure mode when autograph was set to False (#1548): fix and test.
* Change shape of quadrature tensors for better broadcasting (#1542): use the first dimension to hold the quadrature summation; adapt the ndiagquad wrapper; changed bf for bX in docstrings. (Co-authored-by: Gustavo Carvalho <gustavo.carvalho@delfosim.com>, st-- <st--@users.noreply.github.com>, Vincent Dutordoir <dutordoirv@gmail.com>)
* Update min TFP supported version to 0.10 (#1551)
* Broadcasting constant and zero mean function (#1550): use rank instead of ndim. (Co-authored-by: st-- <st--@users.noreply.github.com>)

Co-authored-by: joelberkeley-pio <joel.berkeley@prowler.io>, gustavocmv <47801305+gustavocmv@users.noreply.github.com>, Gustavo Carvalho <gustavo.carvalho@delfosim.com>, ST John <st@prowler.io>, Neil Ferguson <nfergu@users.noreply.github.com>, Neil Ferguson <neil@prowler.io>, Aidan Scannell <as12528@my.bristol.ac.uk>, Aidan Scannell <scannell.aidan@gmail.com>, Sandeep Tailor <s.tailor@insysion.net>, Artem Artemev <art.art.v@gmail.com>
An attempt to make the custom tensor Parameter savable (loadable) with tf.saved_model.