Types of changes
Motivation and Context / Related issue
`AdaClipOptimizer` fails when it tries to generate noise for `self.unclipped_num`. PyTorch fails after the first step, complaining that `torch.normal` is not defined for LongTensors. Converting it with `.float` just before the noise addition seems to be the shortest possible change that fixes the issue. Otherwise, AdaClip doesn't work with the current version of PyTorch.

Btw, there is a general issue with how `unclipped_num_std` is handled. Initially it starts as a float, then gets converted to a tensor almost unintentionally (most importantly, the tensor is a LongTensor here). Then comes the place with the fix, where it can only be a tensor, because it serves as the reference for the `_generate_noise` function. Immediately afterwards it is converted back into a vanilla float.
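The failure mode and the cast described above can be illustrated with a minimal sketch. It does not reproduce the optimizer's code: `sample_noise_like` is a hypothetical stand-in for a noise helper that samples in the dtype of its reference tensor, and drawing the sample with `Tensor.normal_` is an assumption made for illustration.

```python
import torch


def sample_noise_like(reference: torch.Tensor, std: float) -> torch.Tensor:
    """Hypothetical stand-in for a noise helper.

    Draws Gaussian noise with the shape and dtype of `reference`, which is
    what makes an integer reference tensor a problem.
    """
    return torch.empty_like(reference).normal_(mean=0.0, std=std)


# A counter that ended up as an int64 (Long) tensor.
unclipped_num = torch.tensor(7)

try:
    sample_noise_like(unclipped_num, std=1.0)
    long_sampling_failed = False
except RuntimeError:
    # Gaussian sampling is only implemented for floating-point dtypes.
    long_sampling_failed = True

# The shortest fix: cast to float just before the noise is added.
unclipped_num = unclipped_num.float()
noisy = unclipped_num + sample_noise_like(unclipped_num, std=1.0)

print(long_sampling_failed, noisy.dtype)
```

Casting at the point of use keeps the change local, at the cost of leaving the value's type unstable elsewhere in the class.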
With your permission, I can look into it more deeply and stabilize its type as float. For the sake of generality, it can either be made to work with `unclipped_num_std=0.0` (although that violates the privacy guarantees), or a check with a clearer exception can be added instead of a cryptic internal PyTorch failure.

How Has This Been Tested (if it applies)
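Regarding the clearer exception suggested in the motivation above, a check of that kind might look like the following sketch. It is not part of this change: `validate_noise_std` is a hypothetical helper name, and rejecting zero (rather than special-casing it) is only one of the two options mentioned.

```python
import math


def validate_noise_std(std) -> float:
    """Hypothetical check: return `std` as a plain float or raise a readable error."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(std, bool) or not isinstance(std, (int, float)):
        raise TypeError(
            f"unclipped_num_std must be a number, got {type(std).__name__}"
        )
    std = float(std)
    if math.isnan(std) or std <= 0.0:
        raise ValueError(
            f"unclipped_num_std must be positive, got {std}; "
            "a zero value adds no noise and breaks the privacy guarantee"
        )
    return std


print(validate_noise_std(1))
```

Calling such a helper once at construction time would surface a bad value immediately instead of after the first optimizer step.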
Checklist