[ZeRo] Parameter group support in constructor #71347

### 🚀 The feature, motivation and pitch

Motivated by: https://discuss.pytorch.org/t/adamw-zeroredundancyoptimizer-weight-decay-dictionary/141516

A user might create an optimizer with per-parameter-group options, such as:

```
from torch import optim

optimizer = optim.AdamW(
    [
        # No weight decay for gain and bias parameters.
        {"params": gain_or_bias_params, "weight_decay": 0.},
        {"params": rest_params, "weight_decay": args.wd},
    ],
    lr=args.lr,
    betas=(args.beta1, args.beta2),
    eps=args.eps,
)
```

where the first argument is a list of dicts, each specifying a parameter group. However,
```
from torch import optim
from torch.distributed.optim import ZeroRedundancyOptimizer

optimizer = ZeroRedundancyOptimizer(
    [
        {"params": gain_or_bias_params, "weight_decay": 0.},
        {"params": rest_params, "weight_decay": args.wd},
    ],
    optim.AdamW,
    lr=args.lr,
    betas=(args.beta1, args.beta2),
    eps=args.eps,
)
```

does not work as expected. A workaround is to construct the optimizer with one set of parameters and then add each remaining group afterwards:

```
optimizer.add_param_group({"params": gain_or_bias_params, "weight_decay": 0.})
```

but for a better developer experience, we should support passing parameter groups directly to the constructor.
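
Until that is supported, the workaround can be wrapped in a small helper. The following is only a sketch: `build_with_param_groups` and its `make_optimizer` argument are hypothetical names, not part of PyTorch. It assumes `make_optimizer(params, **options)` returns an object exposing `add_param_group`, as torch optimizers do.

```python
from typing import Any, Callable, Dict, List


def build_with_param_groups(
    make_optimizer: Callable[..., Any],
    param_groups: List[Dict[str, Any]],
    **defaults: Any,
) -> Any:
    """Hypothetical helper: emulate param-group support in a constructor.

    make_optimizer(params, **options) must return an object that exposes
    add_param_group(dict).
    """
    if not param_groups:
        raise ValueError("expected at least one parameter group")

    # Fill in the shared defaults explicitly, so that no group depends on
    # the options the optimizer happened to be constructed with.
    resolved = [{**defaults, **group} for group in param_groups]

    first, rest = resolved[0], resolved[1:]
    first_options = {k: v for k, v in first.items() if k != "params"}

    # Construct from the first group's parameters and options ...
    optimizer = make_optimizer(first["params"], **first_options)
    # ... then register every remaining group through add_param_group.
    for group in rest:
        optimizer.add_param_group(group)
    return optimizer
```

With the example above, `make_optimizer` could be `lambda params, **kw: ZeroRedundancyOptimizer(params, optim.AdamW, **kw)`, with the two dicts passed as `param_groups` and `lr`, `betas`, and `eps` passed as keyword defaults. Merging the defaults into every group up front keeps later groups from silently inheriting the first group's `weight_decay`.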

### Alternatives

_No response_

### Additional context

_No response_

cc @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @SciPioneer @H-Huang
