
[Fix] extra ops is called in _fsdp_state of pytorch2.8 #1237

Merged
CyCle1024 merged 3 commits into InternLM:main from CyCle1024:fix_pt28_extra_ops_fsdp_state on Nov 15, 2025

Conversation

@CyCle1024 (Collaborator) commented Nov 10, 2025

This PR fixes the extra ops in SequenceContext.__init__ being called from _fsdp_state's _pre_forward under PyTorch 2.8.
The extra ops are introduced in https://github.com/pytorch/pytorch/blob/v2.8.0/torch/distributed/fsdp/_fully_shard/_fsdp_state.py#L249 and, further down the call chain, in https://github.com/pytorch/pytorch/blob/v2.8.0/torch/distributed/utils.py#L227: dataclasses.replace(x) calls x.__init__ when x is a dataclass instance.

Note: this is a workaround that forces SequenceContext to no longer be a dataclass. We should find a way to avoid re-running a dataclass object's __init__ method when it is passed into an FSDPModule forward method as a parameter.
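A minimal sketch of the mechanism, using a hypothetical Ctx class as a stand-in for the real SequenceContext (the counter and field names here are illustrative, not from the actual codebase):

```python
import dataclasses

INIT_CALLS = 0  # counts how many times the "extra ops" in __init__ run


@dataclasses.dataclass
class Ctx:
    # hypothetical stand-in for SequenceContext
    seq_len: int

    def __post_init__(self):
        # stands in for the expensive extra ops done at construction time
        global INIT_CALLS
        INIT_CALLS += 1


ctx = Ctx(seq_len=4)
assert INIT_CALLS == 1

# PyTorch 2.8's FSDP _pre_forward routes forward args through a utility
# whose dataclass branch does dataclasses.replace(x). replace() builds a
# brand-new instance, so __init__/__post_init__ run again on every forward:
copy = dataclasses.replace(ctx)
assert INIT_CALLS == 2  # one extra __init__ call per forward

# A plain (non-dataclass) object does not hit that branch, which is why
# forcing SequenceContext off dataclass sidesteps the extra ops.
```

This is why the workaround helps: dataclasses.replace only applies to dataclass instances, so a plain class with an identical interface passes through FSDP's input handling without re-running its constructor.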

@CyCle1024 requested a review from HAOCHENYE on November 10, 2025 11:55
@CyCle1024 force-pushed the fix_pt28_extra_ops_fsdp_state branch 3 times, most recently from 9a1e068 to 10b6689, on November 13, 2025 14:16
@CyCle1024 force-pushed the fix_pt28_extra_ops_fsdp_state branch from 10b6689 to fd6adfc on November 14, 2025 08:54
@CyCle1024 merged commit d5580f3 into InternLM:main on Nov 15, 2025
4 checks passed
