[optim] Merge the pyi files into py files of optimizer #125452
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125452. Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 3a87b3b with merge base 3759676. This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot drci
Started reviewing, but it’d be easier to review if you could split the lr_scheduler changes from the optims
torch/optim/adam.py (Outdated)
```python
lr_dict = (
    {lr.device: lr} if isinstance(lr, Tensor) and str(lr.device) != "cpu" else None
)
lr = torch.tensor(lr)
```
Why is this necessary? We don't want to wrap `lr` into a Tensor normally.
I think my modification here is incorrect; I'll fix it.
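For context, a minimal illustration of the distinction (hypothetical snippet, not the PR's code):

```python
import torch

lr = 1e-3
wrapped = torch.tensor(lr)
print(type(lr).__name__, type(wrapped).__name__)  # float Tensor

# A Tensor lr is routed through special code paths downstream (e.g. the
# capturable/tensor-lr handling), so a plain float default should stay a float.
```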
torch/optim/adamw.py (Outdated)
```diff
@@ -27,10 +29,10 @@ class AdamW(Optimizer):
     def __init__(
         self,
         params: ParamsT,
-        lr: Union[float, Tensor] = 1e-3,
+        lr=1e-3,
```
why the removal of types here?
Reverted; I accidentally removed it.
torch/optim/asgd.py
```diff
@@ -376,6 +377,7 @@ def _multi_tensor_asgd(
     torch._foreach_add_(grouped_state_steps, 1)

     # intermediate = grad + param * lambd
+    intermediate: Union[Tuple[Tensor, ...], List[Tensor]]
```
what’s the diff between defining this here vs the first time it’s instantiated?
Without this, mypy raises `Incompatible types in assignment (expression has type "tuple[Tensor, ...]", variable has type "list[Tensor]")` on lines 386, 392, and 411, because `intermediate` was assigned as `list[Tensor]` on line 384. I think it is more concise to declare a union type here than to add three type-ignores.
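A minimal standalone sketch of the pattern (hypothetical function and names, not the asgd code):

```python
from typing import List, Tuple, Union

def double(values: List[float], as_tuple: bool) -> Union[Tuple[float, ...], List[float]]:
    # Declaring the union up front lets mypy accept both assignments below;
    # without it, mypy infers list[float] from the first assignment and flags
    # the later tuple assignment as incompatible.
    result: Union[Tuple[float, ...], List[float]]
    result = [v * 2 for v in values]
    if as_tuple:
        result = tuple(result)
    return result
```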
Almost there!
torch/optim/adam.py (Outdated)
```diff
@@ -27,10 +29,10 @@ class Adam(Optimizer):
     def __init__(
         self,
         params: ParamsT,
-        lr: Union[float, Tensor] = 1e-3,
+        lr=1e-3,
```
Looks like there are more places where you accidentally removed types.
Fixed, sorry about that.
torch/optim/adamw.py (Outdated)
```python
lr_dict = (
    {lr.device: lr} if isinstance(lr, Tensor) and str(lr.device) != "cpu" else None
)
lr = torch.tensor(lr)
```
remove
fixed
```python
weight_decay: float,
eps: float,
maximize: bool,
capturable: bool,  # Needed for consistency.
```
why delete the comment?
I thought this was unnecessary because `has_complex` didn't have this comment. But I've added it back and added the same comment for `has_complex`.
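For reference, a sketch of the resulting style (the enclosing function here is hypothetical; only the parameter names and comments come from this thread):

```python
def _single_tensor_impl(
    *,
    weight_decay: float,
    eps: float,
    maximize: bool,
    capturable: bool,  # Needed for consistency.
    has_complex: bool,  # Needed for consistency.
) -> None:
    ...
```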
torch/optim/swa_utils.py (Outdated)
```diff
-            Tuple[List[List[Tensor]], Indices],
-        ],
-        grouped_tensors,
+    grouped_tensors = Optimizer._group_tensors_by_device_and_dtype(
```
Why the switch to use the optimizer one here?
The one in `Optimizer` internally calls the original function and also supports compiling. It also avoids a lot of type casting, because typing is ignored inside the `Optimizer` version. Is it not suitable here?
Eh... the one in `Optimizer` is a special case where it gets skipped in the compile world. I'd rather not use it here if not necessary yet.
Okay, I'll roll it back. Thanks for explaining this.
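For intuition, a standalone sketch of what grouping by (device, dtype) does conceptually (a hypothetical helper; the real `Optimizer._group_tensors_by_device_and_dtype` also tracks per-tensorlist indices and special-cases compile):

```python
import torch
from collections import defaultdict

def group_by_device_and_dtype(tensors):
    # Bucket tensors so that foreach ops can run over homogeneous groups.
    groups = defaultdict(list)
    for i, t in enumerate(tensors):
        groups[(t.device, t.dtype)].append((i, t))
    return groups

params = [torch.zeros(2), torch.zeros(2, dtype=torch.float64), torch.zeros(3)]
for (device, dtype), items in group_by_device_and_dtype(params).items():
    print(device, dtype, [i for i, _ in items])
```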
torch/optim/adam.py (Outdated)
```python
eps=1e-8,
weight_decay=0,
```
here too
though I see other places where this wasn't necessary--do you know the difference between defining vs not here?
If it's fine to not have a type, I am happy with not having one.
Sorry, could you explain the difference to me? I think mypy interprets `weight_decay` as an `int`, so should I annotate it as `float` here, or change `0` to `0.0`? (Though there is no mypy error when an int `weight_decay` is used as a float.)
It does look like there's no error and it will allow an int or a float or anything (it might just infer it as `Any`). From my lil endeavor, it still feels best to include the type. For example:

```python
class Foo:
    def __init__(self, arg, kwarg=0):
        self.arg = arg
        self.kwarg = kwarg

a = Foo(1, "l")
print(5 + a.kwarg)
```

mypy will not flag this; the bad argument only surfaces as a `TypeError` at runtime, on the addition.
But the intention is closer to:

```python
class Foo:
    def __init__(self, arg, kwarg: int = 0):
        self.arg = arg
        self.kwarg = kwarg

a = Foo(1, "l")
print(5 + a.kwarg)
```

And here, with the type, mypy will error properly at the instantiation of `a`.
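For the annotated version, mypy reports something along these lines at the call site (exact wording varies by mypy version):

```
error: Argument 2 to "Foo" has incompatible type "str"; expected "int"
```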
Okay, understood completely. I'll go ahead and add the types.
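A sketch of where the annotated defaults land, pieced together from the hunks quoted above (abbreviated; the remaining kwargs are annotated the same way):

```python
from typing import Union

from torch import Tensor
from torch.optim.optimizer import Optimizer, ParamsT

class Adam(Optimizer):  # abbreviated sketch, not the real class body
    def __init__(
        self,
        params: ParamsT,
        lr: Union[float, Tensor] = 1e-3,
        eps: float = 1e-8,
        weight_decay: float = 0,
    ) -> None:
        ...
```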
torch/optim/adamw.py (Outdated)
```python
eps=1e-8,
weight_decay=1e-2,
```
here
@pytorchbot merge
Pull workflow has not been scheduled for the PR yet. It could be because the author doesn't have permissions to run those or skip-checks keywords were added to PR/commits, aborting merge. Please get/give approval for the workflows and/or remove skip ci decorators before the next merge attempt. If you think this is a mistake, please contact PyTorch Dev Infra.
torch/optim/adamw.py (Outdated)
```diff
@@ -318,92 +318,6 @@ def step(self, closure=None):
         )


-def adamw(
```
wait is this deletion intended?
I merged the main branch into this one because GitHub said there were merge conflicts, and 0f02e0a moved the function to the end of the file.
@pytorchbot merge
Pull workflow has not been scheduled for the PR yet. It could be because the author doesn't have permissions to run those or skip-checks keywords were added to PR/commits, aborting merge. Please get/give approval for the workflows and/or remove skip ci decorators before the next merge attempt. If you think this is a mistake, please contact PyTorch Dev Infra.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 mandatory check(s) failed. Dig deeper by viewing the failures on hud.
@pytorchbot rebase -b main
@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here.
Rebase failed due to Command
Raised by https://github.com/pytorch/pytorch/actions/runs/9040977794
Yes, we should keep the name. `sgd` is publicly exposed and the argument is sadly not positional-only.
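A hypothetical illustration of the constraint: because the parameter is not positional-only, existing callers may pass it by keyword, so renaming it is a breaking change.

```python
def sgd_old(params, lr=1e-2):
    return lr

def sgd_renamed(params, learning_rate=1e-2):  # hypothetical rename
    return learning_rate

sgd_old(params=[], lr=0.1)  # fine
try:
    sgd_renamed(params=[], lr=0.1)
except TypeError as e:
    print(e)  # got an unexpected keyword argument 'lr' -- callers break
```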
Okay, I will fix it.
The removal in optimizer
done.
looks like there are merge conflicts now :/
fixed
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 mandatory check(s) failed. Dig deeper by viewing the failures on hud.
Please run lintrunner locally to ensure lint passes for the next commit (and usually before pushing). Getting lintrunner locally is pretty simple: just
ok! fixed
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
There was a regression in the public interface for `torch.optim` introduced in #125452 when `torch/optim/__init__.pyi` was merged into `torch/optim/__init__.py`. [The import aliases were not preserved, so now `pyright` thinks that these classes are not publicly exported from `torch/optim/__init__.py`.](https://github.com/pytorch/pytorch/pull/125452/files#diff-941595c1e1aa06bec94578499dd3654532a5183d0bc1bcd94d1f33b47e0d0adfL1-L15)

```
error: "SGD" is not exported from module "torch.optim"
```

Adding these classes/modules to `__all__` fixes this. Pull Request resolved: #131959. Approved by: https://github.com/ezyang
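For reference, a sketch of the fix's shape (assumed module contents; the actual `__all__` in #131959 lists many more names):

```python
# torch/optim/__init__.py (sketch)
from torch.optim.adam import Adam
from torch.optim.sgd import SGD

# The old stub used `from .sgd import SGD as SGD` aliases, which type
# checkers treat as explicit re-exports. With the aliases gone, listing
# the names in __all__ marks them public again.
__all__ = ["Adam", "SGD"]
```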
When stub files (`*.pyi`) were removed from `optim` (#125556, #125452), some types that existed are no longer available. This pull request adds them back. Just for reference, these types are used in `pytorch-lightning`'s `LightningCLI`: command line interfaces are created automatically, and having type hints makes them nicer. Pull Request resolved: #136185. Approved by: https://github.com/janeyx99
Continue the work of #125153