Fix generator exhaustion in SparseAdam #47724
Conversation
💊 CI failures summary and remediations — as of commit 2a6b375 (more details on the Dr. CI page): 💚 Looks good so far! There are no failures yet. 💚 (This comment was automatically generated by Dr. CI and has been revised 7 times.)
@heitorschueroff Can you please check this PR and merge it if everything is fine?
torch/optim/sparse_adam.py
Outdated
@@ -32,8 +33,12 @@ def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8):
         if not 0.0 <= betas[1] < 1.0:
             raise ValueError("Invalid beta parameter at index 1: {}".format(betas[1]))
 
+        # if params are in the form of a generator, the next for-loop exhausts it,
+        # so the copy is passed to the loop
+        params, params_copy = itertools.tee(params)
Is there an advantage to doing this copy as opposed to changing params to a list? Like params = list(params)
I think we usually simply do list(params)
yes.
There is no difference in behavior, but using list instead of itertools.tee seems more readable and easier to understand IMO.
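For readers following along, here is a minimal sketch of the failure mode and the `list(params)` fix being discussed. The helper name and the validation check below are simplified stand-ins, not the actual SparseAdam code:

```python
import torch

def validate_params(params):
    # Stand-in for the kind of pre-check SparseAdam's __init__ performs.
    # Without list(), a generator passed as `params` would be exhausted by
    # this loop and look empty to any later consumer (e.g. the parent
    # Optimizer.__init__ building param_groups).
    params = list(params)
    for p in params:
        if not torch.is_tensor(p):
            raise TypeError(f"expected a Tensor, got {type(p).__name__}")
    return params

model = torch.nn.Linear(4, 2)
gen = (p for p in model.parameters())   # a generator, as in the original bug report
checked = validate_params(gen)
print(len(checked))                     # 2 -- the list survives repeated iteration
```

`itertools.tee(params)` achieves the same thing by producing two independent iterators, but, as noted above, materializing the list once reads more clearly.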
e391732 to 2a6b375 (Compare)
Codecov Report
@@ Coverage Diff @@
## master #47724 +/- ##
==========================================
+ Coverage 74.34% 80.83% +6.48%
==========================================
Files 1856 1860 +4
Lines 200606 200737 +131
==========================================
+ Hits 149143 162258 +13115
+ Misses 51463 38479 -12984
Thanks for the update!
@albanD has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
…sparse params" Context: #47724 fixed the problem that SparseAdam could not handle generators by using the `list(...)` construct. However, this meant that SparseAdam deviated from other optimizers in that it could _accept_ a raw Tensors/Parameter vs requiring a container of them. This is not really a big deal. So why this PR? I do think this PR is cleaner. It uses the fact that the Optimizer parent class already containerizes parameters into parameter groups, so we could reuse that here by calling `super().__init__` first and then filter the param_groups after. This change would also make SparseAdam consistent with the rest of our optimizers in that only containerized params are accepted, which technically is BC breaking but would be minor, I believe. [ghstack-poisoned]
Context: #47724 fixed the problem that SparseAdam could not handle generators by using the `list(...)` construct. However, this meant that SparseAdam deviated from other optimizers in that it could _accept_ a raw Tensors/Parameter vs requiring a container of them. This is not really a big deal. So why this PR? I do think this PR is cleaner. It uses the fact that the Optimizer parent class already containerizes parameters into parameter groups, so we could reuse that here by calling `super().__init__` first and then filter the param_groups after. This change would also make SparseAdam consistent with the rest of our optimizers in that only containerized params are accepted, which technically is BC breaking but would be minor, I believe. [ghstack-poisoned]
…sparse params" Context: #47724 fixed the problem that SparseAdam could not handle generators by using the `list(...)` construct. However, this meant that SparseAdam deviated from other optimizers in that it could _accept_ a raw Tensors/Parameter vs requiring a container of them. This is not really a big deal. So why this PR? I do think this PR is cleaner. It uses the fact that the Optimizer parent class already containerizes parameters into parameter groups, so we could reuse that here by calling `super().__init__` first and then filter the param_groups after. This change would also make SparseAdam consistent with the rest of our optimizers in that only containerized params are accepted, which technically is BC breaking but would be minor, I believe. [ghstack-poisoned]
Context: #47724 fixed the problem that SparseAdam could not handle generators by using the `list(...)` construct. However, this meant that SparseAdam deviated from other optimizers in that it could _accept_ a raw Tensors/Parameter vs requiring a container of them. This is not really a big deal. So why this PR? I do think this PR is cleaner. It uses the fact that the Optimizer parent class already containerizes parameters into parameter groups, so we could reuse that here by calling `super().__init__` first and then filter the param_groups after. This change would also make SparseAdam consistent with the rest of our optimizers in that only containerized params are accepted, which technically is BC breaking but would be minor, I believe. [ghstack-poisoned]
…sparse params" Context: #47724 fixed the problem that SparseAdam could not handle generators by using the `list(...)` construct. However, this meant that SparseAdam deviated from other optimizers in that it could _accept_ a raw Tensors/Parameter vs requiring a container of them. This is not really a big deal. So why this PR? I do think this PR is cleaner. It uses the fact that the Optimizer parent class already containerizes parameters into parameter groups, so we could reuse that here by calling `super().__init__` first and then filter the param_groups after. This change would also make SparseAdam consistent with the rest of our optimizers in that only containerized params are accepted, which technically is BC breaking SO I've added a deprecation warning that we should remove in May 2024. (But is it really BC breaking when we've said in the docs that params should be an iterable this whole time? Maybe this is just a bug fix....😛) [ghstack-poisoned]
Context: #47724 fixed the problem that SparseAdam could not handle generators by using the `list(...)` construct. However, this meant that SparseAdam deviated from other optimizers in that it could _accept_ a raw Tensors/Parameter vs requiring a container of them. This is not really a big deal. So why this PR? I do think this PR is cleaner. It uses the fact that the Optimizer parent class already containerizes parameters into parameter groups, so we could reuse that here by calling `super().__init__` first and then filter the param_groups after. This change would also make SparseAdam consistent with the rest of our optimizers in that only containerized params are accepted, which technically is BC breaking SO I've added a deprecation warning that we should remove in May 2024. (But is it really BC breaking when we've said in the docs that params should be an iterable this whole time? Maybe this is just a bug fix....😛) [ghstack-poisoned]
Context: #47724 fixed the problem that SparseAdam could not handle generators by using the `list(...)` construct. However, this meant that SparseAdam deviated from other optimizers in that it could accept a raw Tensor/Parameter instead of requiring a container of them. This is not really a big deal, so why this PR? I do think this PR is cleaner. It uses the fact that the Optimizer parent class already containerizes parameters into parameter groups, so we can reuse that here by calling `super().__init__` first and then filtering the param_groups afterwards. This change also makes SparseAdam consistent with the rest of our optimizers in that only containerized params are accepted, which is technically BC breaking, so I've added a deprecation warning that we should remove in May 2024. (But is it really BC breaking when we've said in the docs that params should be an iterable this whole time? Maybe this is just a bug fix....😛) Pull Request resolved: #114425 Approved by: https://github.com/drisspg
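A rough sketch of the refactor that commit message describes (the class name and the exact error message below are hypothetical; the actual change lives in #114425):

```python
import torch
from torch.optim import Optimizer

class SparseAdamLike(Optimizer):
    """Hypothetical sketch of the approach described above, not the real SparseAdam."""

    def __init__(self, params, lr=1e-3):
        defaults = dict(lr=lr)
        # Let the parent class containerize params (generators, lists, dicts of
        # param groups) into self.param_groups first ...
        super().__init__(params, defaults)

        # ... then filter the already-built groups for unsupported sparse-layout
        # parameter tensors instead of iterating the raw input ourselves.
        sparse_params = []
        for group in self.param_groups:
            for i, p in enumerate(group["params"]):
                if p.is_sparse:
                    sparse_params.append(i)
        if sparse_params:
            raise ValueError(
                f"sparse parameter tensors at indices {sparse_params} are not supported"
            )

model = torch.nn.Linear(4, 2)
opt = SparseAdamLike(model.parameters())  # a generator input works without tee/list
```

Because the parent constructor does the containerizing, there is no risk of exhausting a generator before it is stored, and the validation operates on the same param_groups the optimizer will later use.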
Fixes #47594