Conversation

@evgri243 (Contributor) commented Aug 8, 2025

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Docs change / refactoring / dependency upgrade

Motivation and Context / Related issue

AdaClipOptimizer fails when attempting to generate noise for self.unclipped_num.
PyTorch fails after the first step, complaining that torch.normal is not defined for LongTensors. Converting it with .float() just before the noise addition seems like the smallest possible change to fix the issue. Otherwise, AdaClip does not work with the current version of PyTorch:

unclipped_num_noise = _generate_noise(
    std=self.unclipped_num_std,
    reference=self.unclipped_num.float(),  # <-- the fix: cast the LongTensor to float
    generator=self.generator,
)
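
For context, a minimal standalone repro of the failure mode (a sketch independent of Opacus; the value 5 and the std are illustrative, and it assumes torch.normal rejects integer-dtype references):

import torch

# unclipped_num ends up as a 0-dim LongTensor after the accumulation shown below
unclipped_num = torch.tensor(5)  # dtype=torch.int64

try:
    torch.normal(mean=unclipped_num, std=1.0)  # raises: normal is not implemented for Long tensors
except RuntimeError as err:
    print(err)

# Casting to float first, as in the fix above, works:
noise = torch.normal(mean=unclipped_num.float(), std=1.0)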

By the way, there is a general issue with how unclipped_num itself is handled. Initially it starts as a plain Python int:

self.unclipped_num = 0

then it gets converted to a tensor almost unintentionally (importantly, the result here is an integer LongTensor):

self.unclipped_num += (
    len(per_sample_clip_factor) - (per_sample_clip_factor < 1).sum()
)
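
To illustrate the promotion (a standalone sketch; the random clip factors are just stand-ins for the real per-sample values):

import torch

unclipped_num = 0                                # starts as a plain Python int
per_sample_clip_factor = torch.rand(8)           # stand-in for the real per-sample clip factors
unclipped_num += len(per_sample_clip_factor) - (per_sample_clip_factor < 1).sum()
print(type(unclipped_num), unclipped_num.dtype)  # <class 'torch.Tensor'> torch.int64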

then comes the place with the fix, where it must be a tensor in order to serve as the reference for the _generate_noise function:

unclipped_num_noise = _generate_noise(
    std=self.unclipped_num_std,
    reference=self.unclipped_num,
    generator=self.generator,
)

Immediately afterwards it is converted back into a plain Python float:

self.unclipped_num = float(self.unclipped_num)

With your permission, I can look into this more deeply and stabilize its type to float. For the sake of generality, it could either be made to work with unclipped_num_std=0.0 (although that violates the privacy guarantees), or a check with a clearer exception could be added instead of a cryptic internal PyTorch failure; a sketch of both options follows below.
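
Two hypothetical directions, sketched in standalone form (neither is part of this PR; the random clip factors and the error message are illustrative):

import torch

# Hypothetical option 1: keep the counter floating-point from the start, so it is
# always a valid reference for _generate_noise and never needs re-casting.
unclipped_num = torch.tensor(0.0)
per_sample_clip_factor = torch.rand(8)  # stand-in values
unclipped_num += (
    len(per_sample_clip_factor) - (per_sample_clip_factor < 1).sum()
).float()
assert unclipped_num.is_floating_point()

# Hypothetical option 2: fail fast with a clear message instead of the cryptic
# internal PyTorch error when the reference is not floating-point.
if not unclipped_num.is_floating_point():
    raise TypeError(
        "unclipped_num must be a floating-point tensor to serve as a noise reference"
    )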

How Has This Been Tested (if it applies)

Checklist

  • The documentation is up-to-date with the changes I made.
  • I have read the CONTRIBUTING document and completed the CLA (see CONTRIBUTING).
  • All tests passed, and additional code has been covered with new tests.

@meta-cla bot added the CLA Signed label Aug 8, 2025
@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this in D79881145. (Because this pull request was imported automatically, there will not be any future comments.)

@iden-kalemaj self-assigned this Aug 14, 2025

@iden-kalemaj (Contributor) commented:
Thank you for this fix! The current solution makes sense to me, working on having this landed.

@facebook-github-bot (Contributor) commented:
This pull request has been merged in 566c1e0.
