Adafactor compile support tracker #133268

### 🚀 The feature, motivation and pitch

The single-tensor version of Adafactor has already landed in https://github.com/pytorch/pytorch/pull/129905, and foreach Adafactor is in progress in #132336.

This issue is to track torch.compile support:
- [ ] Ensure single tensor Adafactor can run and is tested
- [ ] Ensure foreach Adafactor can run and is tested
    - [ ] Adafactor foreach relies on the grouping logic returning a non-None dtype in order to calculate eps1 in the default case. Historically, torch.compile has skipped the grouping logic, since it's already handled in Inductor, and returned None for both device and dtype. This is the most visible blocker for compile support in the foreach case today.
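
To make the dtype dependency in the last item concrete, here is a minimal sketch, not the actual optimizer code. `resolve_eps1` is a hypothetical helper, and it assumes the default eps1 is the machine epsilon of the parameter dtype (`torch.finfo(dtype).eps`); the `None` dtype stands in for what the skipped grouping logic returns under torch.compile.

```python
from typing import Optional

import torch


def resolve_eps1(eps1: Optional[float], dtype: Optional[torch.dtype]) -> float:
    """Hypothetical helper: pick eps1, falling back to the dtype's machine epsilon."""
    if eps1 is not None:
        # The user supplied a value, so the dtype is not needed.
        return eps1
    if dtype is None:
        # The situation described above: grouping was skipped, so there is
        # no dtype from which to derive the default.
        raise ValueError("dtype is required to compute the default eps1")
    # Assumption: the default eps1 is the machine epsilon of the dtype.
    return torch.finfo(dtype).eps


# Eager-style grouping: each group has a known dtype, so the default resolves.
params = [
    torch.zeros(2, 2, dtype=torch.float32),
    torch.zeros(3, dtype=torch.bfloat16),
]
for p in params:
    print(p.dtype, resolve_eps1(None, p.dtype))

# Compile-style grouping, as described above: dtype comes back as None.
try:
    resolve_eps1(None, None)
except ValueError as err:
    print("cannot resolve default eps1:", err)
```

Under these assumptions, passing an explicit eps1 sidesteps the problem, while the default case needs either a real dtype from the grouping step or some other way to recover it from the parameters.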

### Alternatives

_No response_

### Additional context

Adafactor is our first param-wise (not pointwise, not global) optimizer. The eager foreach implementation still leaves a lot to be desired and could be improved by supporting more foreach ops. Compile support would be pretty cool though.

cc @vincentqb @jbschlosser @albanD @crcrpar @ezyang @chauhang @penguinwu
