@vkuzo vkuzo commented Jul 24, 2025

Summary:

These workarounds are no longer needed after #2356 and the corresponding improvements in PyTorch core.

Test Plan:

torchtitan benchmark on Llama 3 8B on 8 H100 GPUs:

before

rowwise
Median Tokens/Second (excluding step 1): 7013.0
Max Memory Usage: 37.19 GiB

gw_hp
Median Tokens/Second (excluding step 1): 7232.0
Max Memory Usage: 37.13 GiB

after

rowwise
Median Tokens/Second (excluding step 1): 6984.5
Max Memory Usage: 37.19 GiB

gw_hp
Median Tokens/Second (excluding step 1): 7319.5
Max Memory Usage: 37.13 GiB
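
For a quick read of the numbers above, here is a minimal sketch that computes the relative throughput change per recipe. The values are copied from the results above; the helper name `relative_change_pct` is only illustrative and is not part of torchao or torchtitan.

```python
# Median tokens/second copied from the benchmark results above.
before = {"rowwise": 7013.0, "gw_hp": 7232.0}
after = {"rowwise": 6984.5, "gw_hp": 7319.5}


def relative_change_pct(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


# Relative change in median throughput for each recipe.
changes = {name: relative_change_pct(before[name], after[name]) for name in before}

for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
# rowwise: -0.41%
# gw_hp: +1.21%
```

Both changes are small (about -0.4% and +1.2%), and max memory usage is identical before and after for both recipes.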


vkuzo added 2 commits July 24, 2025 06:14
[ghstack-poisoned]
[ghstack-poisoned]
vkuzo commented Jul 24, 2025

Stack from ghstack (oldest at bottom):

@vkuzo vkuzo mentioned this pull request Jul 24, 2025

pytorch-bot bot commented Jul 24, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2595


facebook-github-bot added the CLA Signed label Jul 24, 2025
vkuzo added a commit that referenced this pull request Jul 24, 2025
ghstack-source-id: ae11ea7
ghstack-comment-id: 3113561383
Pull Request resolved: #2595
@vkuzo vkuzo requested review from danielvegamyhre and drisspg July 24, 2025 13:52
vkuzo added the topic: not user facing label Jul 24, 2025
[ghstack-poisoned]
@vkuzo vkuzo changed the base branch from gh/vkuzo/94/head to main July 24, 2025 16:27
@vkuzo vkuzo merged commit f5b5567 into main Jul 24, 2025
21 of 24 checks passed
liangel-02 pushed a commit that referenced this pull request Aug 25, 2025
* Update

[ghstack-poisoned]

* Update

[ghstack-poisoned]